I want to dump the stored segments to an XML file, but after reading
SegmentReader.java I found that it is not a simple thing:
it is a Hadoop job that dumps a text file. I just want to dump the
parts of the segments' content that I am interested in to XML.
So can someone tell me how to do this? Any reply
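A rough sketch of the XML-writing half of such a dump, using only the JDK's StAX API. It assumes the records of interest have already been read out of the segment into URL/text pairs; reading the segment itself is not shown. The class name `SegmentXmlDump` and the element names `segment` and `record` are made up for illustration and are not part of Nutch.

```java
import java.io.StringWriter;
import java.io.Writer;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamWriter;

/** Hypothetical helper, not part of Nutch: writes url/text pairs as XML. */
public final class SegmentXmlDump {

    private SegmentXmlDump() {}

    /** Writes one <record url="..."> element per map entry inside a <segment> root. */
    public static void write(Map<String, String> records, Writer out)
            throws XMLStreamException {
        XMLStreamWriter xml = XMLOutputFactory.newInstance().createXMLStreamWriter(out);
        xml.writeStartDocument();
        xml.writeStartElement("segment");
        for (Map.Entry<String, String> e : records.entrySet()) {
            xml.writeStartElement("record");
            xml.writeAttribute("url", e.getKey());   // the writer escapes & and quotes
            xml.writeCharacters(e.getValue());       // the writer escapes < and &
            xml.writeEndElement();
        }
        xml.writeEndElement();
        xml.writeEndDocument();
        xml.close();
    }

    /** Convenience wrapper that returns the XML as a string. */
    public static String toXml(Map<String, String> records) {
        StringWriter out = new StringWriter();
        try {
            write(records, out);
        } catch (XMLStreamException e) {
            throw new IllegalStateException("could not write XML", e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Stand-in data; in practice these pairs would come from the segment.
        Map<String, String> records = new LinkedHashMap<>();
        records.put("http://example.com/", "Example page text");
        System.out.println(toXml(records));
    }
}
```

Letting the stream writer do the escaping avoids hand-building tags around page text that may itself contain markup characters.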
Sami Siren wrote:
Stefan Groschupf wrote:
See:
http://www.find23.net/nutch_guiToHadoop.pdf
Section required hadoop changes.
I guess you refer to these:
• LocalJobRunner:
  • Run as a kind of singleton
  • Have a kind of job queue
  • Implement the JobSubmissionProtocol status-report methods
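The three points above can be sketched in plain Java. This is only an illustration of the idea (one shared runner, a FIFO queue of jobs, and a way to ask for a job's status); it is not Hadoop's LocalJobRunner and does not implement JobSubmissionProtocol. The class `QueuedLocalRunner`, its `State` enum, and all method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; not Hadoop's LocalJobRunner. */
public final class QueuedLocalRunner {

    public enum State { QUEUED, RUNNING, SUCCEEDED, FAILED }

    // Singleton: every caller in the JVM shares the same runner.
    private static final QueuedLocalRunner INSTANCE = new QueuedLocalRunner();

    public static QueuedLocalRunner getInstance() { return INSTANCE; }

    // Job queue: a single daemon worker thread runs jobs one at a time, in order.
    private final ExecutorService queue = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "local-job-runner");
        t.setDaemon(true);
        return t;
    });
    private final Map<String, State> status = new ConcurrentHashMap<>();
    private final Map<String, Future<?>> futures = new ConcurrentHashMap<>();
    private final AtomicInteger nextId = new AtomicInteger();

    private QueuedLocalRunner() {}

    /** Queues a job and returns an id that can be used to ask for its status. */
    public String submit(Runnable job) {
        String id = "job_local_" + nextId.incrementAndGet();
        status.put(id, State.QUEUED);
        futures.put(id, queue.submit(() -> {
            status.put(id, State.RUNNING);
            try {
                job.run();
                status.put(id, State.SUCCEEDED);
            } catch (RuntimeException e) {
                status.put(id, State.FAILED);
            }
        }));
        return id;
    }

    /** Status report: the current state of a job, or null for an unknown id. */
    public State getStatus(String id) { return status.get(id); }

    /** Waits up to the timeout for the job to finish, then reports its state. */
    public State waitFor(String id, long timeoutMillis) {
        Future<?> f = futures.get(id);
        if (f == null) return null;
        try {
            f.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } catch (ExecutionException | TimeoutException e) {
            // Fall through and report whatever state has been recorded so far.
        }
        return status.get(id);
    }

    public static void main(String[] args) {
        QueuedLocalRunner runner = QueuedLocalRunner.getInstance();
        String id = runner.submit(() -> System.out.println("running a job"));
        System.out.println(id + " -> " + runner.waitFor(id, 5000));
    }
}
```

A single-threaded executor gives the queue behaviour for free: a GUI could submit several jobs in a row and poll `getStatus` for each while they run one after another.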
The wiki would be a good place for this.
Doug
Peter Landolt wrote:
Hello,
We tried to introduce Nutch at a telecommunication company in Switzerland
as the search engine of their future main search solution. As they were also
evaluating commercial products, we needed to offer them a brochure to make
[ http://issues.apache.org/jira/browse/NUTCH-413?page=comments#action_12456967 ]
Dogacan Güney commented on NUTCH-413:
-
About command-line options: that is not what I meant (I am not a native
speaker). I meant that I also set fetcher.parse t
Hi,
I only have experience with Lucene and Hadoop, not Nutch. I want to
know the basic idea behind the distributed searching provided by Nutch,
i.e. the `mapper` and `reducer` functions involved in both indexing and
searching.
Can anyone give me some hints? Thanks.
[ http://issues.apache.org/jira/browse/NUTCH-413?page=comments#action_12456870 ]
Jonathan Amir commented on NUTCH-413:
-
I didn't check out the trunk, I checked out the 0.8.1 tag, because I wanted
stability. If it is fixed in the trunk, then
[ http://issues.apache.org/jira/browse/NUTCH-413?page=comments#action_12456832 ]
Dogacan Güney commented on NUTCH-413:
-
Are you sure about this? Running the fetcher (latest trunk) with -noParsing
option does not create any parse segments, wh