I have three URLs: url1, url2, and url3.
Say I want to extract some data from these URLs in my ParseFilter and then index it using my IndexingFilter. The data is:

    url1 => data1, data2, data3
    url2 => data1, data2
    url3 => data1, data2, data3, data4, data5

In my ParseFilter I call webPage.getBaseUrl(), and if it is url1 I extract data1, data2, and data3 and add them to the page metadata:

    webPage.putToMetadata(key1, data1)
    webPage.putToMetadata(key2, data2)
    webPage.putToMetadata(key3, data3)

I do the same for url2 and url3.

I expected that when Nutch runs my IndexingFilter and I call webPage.getFromMetadata(key1) for url1, it would return url1's value for key1, i.e. data1, and so on. Instead the values get mixed up. In Solr, the document for url1 contains mixed results: data1 is from url1, but data2 is from url3, data3 is from url2, etc.

How can I make sure that when I am in my IndexingFilter and look up a key (which is unique at the URL level, not at the level of the current crawl), I get consistent data for that particular URL only?

Thanks,
Tony.
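
P.S. To make the expectation concrete, here is a minimal sketch of the behaviour I am after. It deliberately does not use the Nutch classes: `Page`, `parseStep`, and `indexStep` are hypothetical stand-ins for WebPage, my ParseFilter, and my IndexingFilter, and the values such as "url1-data1" are made-up placeholders. The only point is that each page carries its own metadata map, so a key read back for url1 should only ever return what was stored for url1.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class PerUrlMetadataSketch {

    /** Hypothetical stand-in for a stored page: a base URL plus its own metadata map. */
    static final class Page {
        final String baseUrl;
        final Map<String, ByteBuffer> metadata = new HashMap<>();

        Page(String baseUrl) {
            this.baseUrl = baseUrl;
        }
    }

    /** Stand-in for the parse step: store each extracted value on this page as key1, key2, ... */
    static void parseStep(Page page, List<String> extracted) {
        for (int i = 0; i < extracted.size(); i++) {
            // A fresh byte array per value, so no buffer is shared between pages.
            byte[] bytes = extracted.get(i).getBytes(StandardCharsets.UTF_8);
            page.metadata.put("key" + (i + 1), ByteBuffer.wrap(bytes));
        }
    }

    /** Stand-in for the index step: read one key back from the same page, or null if absent. */
    static String indexStep(Page page, String key) {
        ByteBuffer value = page.metadata.get(key);
        if (value == null) {
            return null;
        }
        // duplicate() keeps the stored buffer's position unchanged for later readers.
        return StandardCharsets.UTF_8.decode(value.duplicate()).toString();
    }

    public static void main(String[] args) {
        Page url1 = new Page("url1");
        Page url3 = new Page("url3");
        parseStep(url1, List.of("url1-data1", "url1-data2", "url1-data3"));
        parseStep(url3, List.of("url3-data1", "url3-data2", "url3-data3",
                                "url3-data4", "url3-data5"));

        // What I expect to see at indexing time: only url1's own values for url1.
        for (String key : List.of("key1", "key2", "key3")) {
            System.out.println(url1.baseUrl + " " + key + " = " + indexStep(url1, key));
        }
    }
}
```

In this sketch key2 on url1 can never come back with url3's value, because nothing is shared between the two `Page` objects. That is the per-URL isolation I assumed I would get from the page metadata in Nutch.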