You can calculate these statistics from the segment data, e.g. the
parsed text.
Reading the Nutch file format is easy with the Nutch readers,
e.g. the SequenceFile reader.
Just take a look at the io package.
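As a rough sketch of the counting step only: the snippet below assumes you have already pulled one page's parsed text out of a segment as a plain String (that part is not shown and depends on the reader you use). The class name WordStats, the method countWords and the sample text are made up for illustration and are not part of Nutch. It is written for a current JDK, so it would need adapting to an older one.

```java
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Nutch: counts words in one page's parsed text.
final class WordStats {

    // Returns each distinct word (lower-cased) with its number of occurrences.
    static Map<String, Integer> countWords(String parsedText) {
        Map<String, Integer> counts = new TreeMap<>();
        // Split on runs of characters that are neither letters nor digits.
        String[] tokens = parsedText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (String token : tokens) {
            // split() can yield an empty token at the start of the text; skip it.
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Sample text standing in for the parsed text of one fetched link.
        String parsedText = "Nutch is a crawler. Nutch builds an index.";
        countWords(parsedText).forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running this once per link gives both the word list (the map keys) and the occurrence counts (the map values) for that link.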
HTH
Stefan
On 24.01.2006 at 08:18, Wong Ting Kiong wrote:
Hi all,
I'm now using Nutch 0.7.1, and I would like to retrieve content from
the index file.
How can I retrieve it? The information I want to retrieve is:
- the list of words from each link
- the number of occurrences of each word in each link
Can I retrieve this information in raw data format?
Thanks for your attention,
Kiong
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general