Hi,
As suggested, I tried the code, but result.txt contains only the header.
Nothing else is printed.
After debugging, I found that parsing produces no value.
The problem is in the lines marked with asterisks below (bold in the
original mail). When I added a System.out.println there, no value was printed.
String xmlContent = value.toString();
InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
DocumentBuilder builder;
try {
    builder = factory.newDocumentBuilder();
    *Document doc = builder.parse(is);*
    *String ed = doc.getDocumentElement().getNodeName();*
    out.write(ed.getBytes());
    DTMNodeList list = (DTMNodeList) getNode("/Company/Employee",
            doc, XPathConstants.NODESET);
When I print
out.write(xmlContent.getBytes()), the whole XML is printed.
When I added a System.out.println for list, nothing was printed.
out.write(ed.getBytes()) also prints nothing.
Please suggest where I am going wrong and help me fix this.
Thanks in advance.
I have attached my code. Please review.
Mapper class:
public class XmlTextMapper extends Mapper<LongWritable, Text, Text, Text> {
    private static final XPathFactory xpathFactory = XPathFactory.newInstance();

    @Override
    public void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String resultFileName = "/user/task/Sales/result.txt";
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(resultFileName), conf);
        FSDataOutputStream out = fs.create(new Path(resultFileName));
        InputStream resultIS = new ByteArrayInputStream(new byte[0]);
        String header = "id,name\n";
        out.write(header.getBytes());
        String xmlContent = value.toString();
        InputStream is = new ByteArrayInputStream(xmlContent.getBytes());
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder;
        try {
            builder = factory.newDocumentBuilder();
            Document doc = builder.parse(is);
            String ed = doc.getDocumentElement().getNodeName();
            out.write(ed.getBytes());
            DTMNodeList list = (DTMNodeList) getNode("/Company/Employee",
                    doc, XPathConstants.NODESET);
            int size = list.getLength();
            for (int i = 0; i < size; i++) {
                Node node = list.item(i);
                String line = "";
                NodeList nodeList = node.getChildNodes();
                int childNumber = nodeList.getLength();
                for (int j = 0; j < childNumber; j++) {
                    line += nodeList.item(j).getTextContent() + ",";
                }
                if (line.endsWith(","))
                    line = line.substring(0, line.length() - 1);
                line += "\n";
                out.write(line.getBytes());
            }
        } catch (ParserConfigurationException e) {
            e.printStackTrace();
        } catch (SAXException e) {
            e.printStackTrace();
        } catch (XPathExpressionException e) {
            e.printStackTrace();
        }
        IOUtils.copyBytes(resultIS, out, 4096, true);
        out.close();
    }

    public static Object getNode(String xpathStr, Node node, QName retunType)
            throws XPathExpressionException {
        XPath xpath = xpathFactory.newXPath();
        return xpath.evaluate(xpathStr, node, retunType);
    }
}
Main class:
public class MainXml {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: XMLtoText <input path> <output path>");
            System.exit(-1);
        }
        String output = "/user/task/Sales/";
        Job job = new Job(conf, "XML to Text");
        job.setJarByClass(MainXml.class);
        // job.setJobName("XML to Text");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        // FileOutputFormat.setOutputPath(job, new Path(args[1]));
        Path outPath = new Path(output);
        FileOutputFormat.setOutputPath(job, outPath);
        FileSystem dfs = FileSystem.get(outPath.toUri(), conf);
        if (dfs.exists(outPath)) {
            dfs.delete(outPath, true);
        }
        job.setMapperClass(XmlTextMapper.class);
        job.setNumReduceTasks(0);
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
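The driver does not set an input format, so I assume the default TextInputFormat applies, in which case each map() call might receive one line of the file rather than the whole document. I have not confirmed that this is the cause. A minimal standalone sketch (plain JDK, no Hadoop; the class and method names are hypothetical) that could be used to check how individual lines behave when parsed on their own:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

public class SingleLineParseCheck {

    // Hypothetical helper: true if the string is a well-formed XML document by itself.
    static boolean parses(String fragment) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.parse(new ByteArrayInputStream(
                    fragment.getBytes(StandardCharsets.UTF_8)));
            return true;
        } catch (SAXException | IOException e) {
            // A lone opening or closing tag is not a complete document.
            return false;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Lines as they would arrive if the input were split line by line (assumption).
        System.out.println(parses("<Company>"));     // false
        System.out.println(parses("</Company>"));    // false
        // Parses, but its root element is "id", so /Company/Employee would match nothing.
        System.out.println(parses("<id>100</id>"));  // true
        System.out.println(parses(
                "<Company><Employee><id>100</id></Employee></Company>")); // true
    }
}
```

If single lines are what the mapper sees, the SAXException branch in my mapper would be taken for lines like `<Company>`, and the stack trace would appear in the task logs rather than in result.txt.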
My XML file:
<Company>
  <Employee>
    <id>100</id>
    <ename>ranjini</ename>
    <dept>IT1</dept>
    <sal>123456</sal>
    <location>nextlevel1</location>
    <Address>
      <Home>Chennai1</Home>
      <Office>Navallur1</Office>
    </Address>
  </Employee>
  <Employee>
    <id>1001</id>
    <ename>ranjinikumar</ename>
    <dept>IT</dept>
    <sal>1234516</sal>
    <location>nextlevel</location>
    <Address>
      <Home>Chennai</Home>
      <Office>Navallur</Office>
    </Address>
  </Employee>
</Company>
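For comparison, this is roughly what I expect the parsing step to produce. It is a minimal standalone sketch (plain JDK, no Hadoop), assuming the complete document reaches the parser as a single string. The class and method names are hypothetical, and it casts the XPath result to the standard org.w3c.dom.NodeList interface instead of DTMNodeList, and skips whitespace text nodes, both of which differ from my mapper:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class XmlToCsvSketch {

    // Hypothetical helper: one comma-separated line per /Company/Employee element.
    static String toCsv(String xmlContent) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new ByteArrayInputStream(
                    xmlContent.getBytes(StandardCharsets.UTF_8)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList employees = (NodeList) xpath.evaluate(
                    "/Company/Employee", doc, XPathConstants.NODESET);

            StringBuilder csv = new StringBuilder();
            for (int i = 0; i < employees.getLength(); i++) {
                NodeList children = employees.item(i).getChildNodes();
                StringBuilder line = new StringBuilder();
                for (int j = 0; j < children.getLength(); j++) {
                    Node child = children.item(j);
                    // Skip whitespace-only text nodes between child elements.
                    if (child.getNodeType() != Node.ELEMENT_NODE) {
                        continue;
                    }
                    if (line.length() > 0) {
                        line.append(',');
                    }
                    line.append(child.getTextContent().trim());
                }
                csv.append(line).append('\n');
            }
            return csv.toString();
        } catch (Exception e) {
            // Wrap checked parser/XPath exceptions so callers see the failure.
            throw new IllegalStateException("could not convert XML to CSV", e);
        }
    }

    public static void main(String[] args) {
        // Shortened version of the sample file, on one line.
        String xml = "<Company><Employee><id>100</id><ename>ranjini</ename>"
                + "</Employee></Company>";
        System.out.print(toCsv(xml)); // prints "100,ranjini" followed by a newline
    }
}
```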
Thanks in advance
Ranjini. R