Hi,
I have been reading data from Nutch segments and came across
pages/records with empty parse text. So I looked more into this and
manually fetched data for these URLs. Many of them are redirect pages, but
they are stored in the Nutch segment as pages (with metadata but empty parse
text). My question is: does Nutch fetch the target page, i.e. the page that
the original page redirects to? Does it get all the information about it
(text, metadata...)? And why does Nutch store these empty redirect pages?

Thanks,
    Tomislav
