I am starting to dive into JMS (particularly OpenJMS)
for another project and am just wondering if there have
been thoughts of doing the distributed work via
JMS. It could be that there is JMS in there
somewhere, but in searching the mail archive I found
no references.
I finished some
I noticed that HTML-Parser Plugin has references to xercesImpl.jar which
is placed in
src/plugin/parse-rss/lib/xercesImpl.jar
Where do you find references to xercesImpl.jar in the HTML-Parser plugin?
(If so, I don't understand how it can compile since the build scripts never
import any lib
Paul,
I am thinking about the mapred branch and the case of a mapred
multiprocess run over one or more machines. In this case,
multiple tasktracker processes are created.
I'm not sure what you mean.
As far as I understand the code, there is only one tasktracker per machine.
why are the
Hi,
Nutch comes with its own RPC implementation that is very lightweight
and fast - much faster than JMS.
Besides that, the distribution of tasks is done via MapReduce, so there
is no need for JMS.
However, I heard that the helix people plan to use JMS.
Greetings.
Stefan
On 27.09.2005 at 09:53
[ http://issues.apache.org/jira/browse/NUTCH-36?page=comments#action_12330588 ]
Kerang Lv commented on NUTCH-36:
Code of a kind can be used to perform third-party CJK word
segmentation in NutchAnalysis.jj. CJKTokenizer, a kind of bi-gram segmentation,
was
Thank you
Shu (Steve) Chen
Techmate Corporation
Tel: 1-425-818-0568
Cell: 425-785-9971
Fax: 1-425-641-8908
-Original Message-
From: Jérôme Charron [mailto:[EMAIL PROTECTED]
Sent: Tuesday, September 27, 2005 2:10 AM
To: nutch-dev@lucene.apache.org
Subject: Re: Classpath for HTML
Chris Mattmann wrote:
I just noticed after checking out the latest SVN of Nutch that I am
currently failing the TestSegmentMergeTool JUnit test when I type ant test
for Nutch.
I'm on the mapred branch, not the trunk, and all tests pass.
One thing I have noticed is that it is best to start
You know what the crazy thing is:
Seemingly, all tests pass now. And I didn't change a thing. Honest. I swear.
Very strange, indeed, but I'm happy because at least the tests are passing!
:-)
Cheers,
Chris
On 9/27/05 12:29 PM, Paul Baclace [EMAIL PROTECTED] wrote:
Chris Mattmann wrote:
Stefan Groschupf wrote:
As far as I understand the code, there is only one tasktracker per machine.
That is true, but only for the most apparent use case. I'm working on
testing that needs to emulate a multi-machine deployment.
As you can see in the tasktracker code, the ports are cleanly closed