I am using Nutch 2.2.1 with HBase.

I am trying to create a plug-in that parses and indexes specific elements.
I store the parsed values in the WebPage metadata field with a ParseFilter
(I verified that this works by retrieving the values immediately afterward).

I then try to read those values back from the metadata in an IndexingFilter.
Getting the metadata from the page does return the metadata Map, but the Map
is always empty, so I cannot retrieve the values.  However, if I look at the
actual table in HBase afterward, the metadata is fully populated with all
the values from the ParseFilter; it is not empty.
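
The round trip described above can be sketched as follows.  This is only an
illustration using plain JDK types: FakePage, storeParsedValue,
readParsedValue and the key "myElement" are hypothetical stand-ins, not Nutch
classes or the actual plug-in code, and the String-to-bytes layout of the
metadata map is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class MetadataRoundTrip {

    // Hypothetical stand-in for a page with a metadata map (key -> raw bytes).
    public static final class FakePage {
        final Map<String, ByteBuffer> metadata = new HashMap<>();
    }

    // ParseFilter side: store a parsed value in the page metadata as UTF-8 bytes.
    public static void storeParsedValue(FakePage page, String key, String value) {
        page.metadata.put(key, ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8)));
    }

    // IndexingFilter side: read the value back, or null if the key is absent.
    public static String readParsedValue(FakePage page, String key) {
        ByteBuffer raw = page.metadata.get(key);
        if (raw == null) {
            return null;
        }
        // duplicate() so decoding does not move the stored buffer's position
        return StandardCharsets.UTF_8.decode(raw.duplicate()).toString();
    }

    public static void main(String[] args) {
        FakePage page = new FakePage();
        storeParsedValue(page, "myElement", "some parsed text");

        // Same page object: the value is found (the check done right after parsing).
        System.out.println(readParsedValue(page, "myElement"));

        // A page whose metadata map was never filled: the lookup yields null,
        // which matches the symptom seen in the IndexingFilter.
        System.out.println(readParsedValue(new FakePage(), "myElement"));
    }
}
```

The second lookup mirrors the symptom: the map exists but holds no entries,
even though the stored row contains them.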

Why can't I retrieve the values in the IndexingFilter?  Why does it see the
metadata field as empty?  Are the pages passed to the indexing filter not
the actual pages as stored in HBase?




