Does Nutch 1.4 include tika-app? Also, maybe I am not using the
DocumentFragment object properly. Below is a condensed version of my code:

import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public ParseResult filter(Content content, ParseResult parseResult,
           HTMLMetaTags metaTags, DocumentFragment doc) {

   // Only the top-level children of the fragment are visited here.
   NodeList children = doc.getChildNodes();
   for (int x = 0; x < children.getLength(); x++) {
     Node node = children.item(x);
     System.out.println("xml node name: " + node.getNodeName());
     System.out.println("xml node value: " + node.getNodeValue());
     System.out.println("xml text content: " + node.getTextContent());
   }

   // Rest of the filter omitted; hand the parse result back unchanged.
   return parseResult;
}
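
One thing I may be tripping over: the loop above only visits the fragment's
direct children, and getNodeValue() returns null for element nodes, so if the
fragment has only one or two top-level nodes the output will be sparse. For
comparison, here is a minimal standalone sketch of a recursive walk using only
the JDK DOM API. The class name FragmentWalk and the hand-built sample fragment
are made up for illustration; this is not Nutch code, and I have not checked
what the fragment Nutch passes in actually contains.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentFragment;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class FragmentWalk {

    /** Appends one line per node in document order, indented by depth. */
    static void walk(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.getNodeName());
        // getNodeValue() is null for element and fragment nodes,
        // so the value is printed for text nodes only.
        if (node.getNodeType() == Node.TEXT_NODE) {
            out.append(" = \"").append(node.getNodeValue()).append('"');
        }
        out.append('\n');
        // Recurse so that nested nodes are visited, not just the top level.
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, out);
        }
    }

    /** Hypothetical sample: <html><body><p>hello</p></body></html> in a fragment. */
    static DocumentFragment sampleFragment() {
        Document document;
        try {
            document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        DocumentFragment fragment = document.createDocumentFragment();
        Element html = document.createElement("html");
        Element body = document.createElement("body");
        Element p = document.createElement("p");
        p.appendChild(document.createTextNode("hello"));
        body.appendChild(p);
        html.appendChild(body);
        fragment.appendChild(html);
        return fragment;
    }

    public static void main(String[] args) {
        StringBuilder out = new StringBuilder();
        walk(sampleFragment(), 0, out);
        System.out.print(out);
    }
}
```

In this sample the fragment has exactly one direct child (html), so a
top-level-only loop would print a single node whose value is null; the text
"hello" only shows up once the walk descends to the #text node.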



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Cached-page-like-google-with-hits-highlighted-tp4001374p4001440.html
Sent from the Nutch - User mailing list archive at Nabble.com.
