page/sec when I do this. The same crawl list with the old fetch flies by
comparison, at ~10 pages/sec. The box is only running Nutch. Has anyone else
seen this or found a workaround?
--
rp
)
at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1487)
at java.lang.Thread.run(Thread.java:595)
Thanks,
rp
- at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1487)
2007-02-15 20:05:15,307 INFO NutchController - at java.lang.Thread.run(Thread.java:595)
RP wrote:
Hi all,
Pulled down the WEB2 stuff via SVN to finally look at the keymatch and
spellchecker stuff. Did the ANT thing per
to 128MB and see what happens
rp
Sean Dean wrote:
It looks like you don't have enough RAM to maintain the quick speeds you were
seeing when the index was only around 3000 pages.
Nutch scales very well, but the hardware behind it must scale too. Using quick calculations and common sense, if your
?
Tweakability is good, but real-world performance is better, and we just
want to be sure our results are based on something more than my playing
around with the values.
--
rp
startup in 5967 ms
Andrzej Bialecki wrote:
RP wrote:
No changes to the logging configuration, which worked fine under 0.8, but
under 0.9 I get this once I run a query (the query returns just fine):
INFO: Server startup in 1947 ms
log4j:ERROR setFile(null,true) call failed.
java.io.FileNotFoundException
That was it - the log4j.properties in the original nutch.war under 0.8
is NOT the same as the log4j.properties in the 0.8 conf directory (which
is the same as the 0.9 one). Thanks for pointing me in the right
direction.
rp
Sean Dean wrote:
I think it might be getting logged into a file
$Http11ConnectionHandler.process(Http11AprProtocol.java:706)
at org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1487)
at java.lang.Thread.run(Thread.java:595)
log4j:ERROR Either File or DatePattern options are not set for appender [DRFA].
--
rp
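For anyone hitting the same DRFA message, here is a rough sketch of a check you could run against whichever log4j.properties actually ended up inside the deployed webapp. It is not part of Nutch or log4j. The helper names (parse_properties, check_appender) and the sample file contents, including the hadoop.log.dir and hadoop.log.file placeholders, are assumptions made up for illustration, and the parsing is far simpler than what log4j really does. It only reports whether the File and DatePattern options of an appender are missing or refer to properties nobody defined.

```python
import re

# Matches ${name} placeholders inside a property value.
PLACEHOLDER = re.compile(r"\$\{([^}]+)\}")


def parse_properties(text):
    """Parse simple key=value lines, skipping blanks and comments.

    Simplified on purpose: no line continuations, no escapes.
    """
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_appender(props, name, system_props=None):
    """Return a list of problems with an appender's File and DatePattern."""
    # Placeholders may be satisfied by the file itself or by system properties.
    known = dict(props)
    known.update(system_props or {})
    problems = []
    for option in ("File", "DatePattern"):
        key = "log4j.appender.%s.%s" % (name, option)
        value = props.get(key)
        if not value:
            problems.append("%s is not set" % key)
            continue
        missing = [m for m in PLACEHOLDER.findall(value) if m not in known]
        if missing:
            problems.append("%s uses undefined %s" % (key, ", ".join(missing)))
    return problems


# Hypothetical file contents, for illustration only.
SAMPLE = """
log4j.rootLogger=INFO,DRFA
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
"""

if __name__ == "__main__":
    for problem in check_appender(parse_properties(SAMPLE), "DRFA"):
        print(problem)
```

Running the same check on the log4j.properties from the war and on the one in the conf directory is a quick way to see where the two differ.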
times seem shorter than doing it in one step, as I'm also CPU
AND bandwidth throttled. As always, your mileage may vary, so give some
things a try and you might get a nice surprise in improved speed.
--
rp
easy or best-practice ways of doing this. Any help/pointers
would be appreciated.
--
rp
up
to the searchers to help offset the cost of the service, and serve up or
flag links that rank first because of payment, followed by normal search
link results.
rp
Sean Dean wrote:
I might be totally off base with what you're asking to do, but take a look at
this open source project: http
-keymatch-onebox.googlecode.com/svn/trunk/Keymatch.java
2006/12/19, RP [EMAIL PROTECTED]:
Let me qualify this - ad banner rotation is dealt with - I'm looking for
something that will use our Nutch engine to serve up relevant links from
people who pay for that privilege. We do not want to serve up
)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:130)
Is this a bug or my instance's misconfiguration?
Running on a single box, java-1.5.0.09
Thanks.
--
rp
Fixed - re-ran the merge and all is well. A trip through the indexes
with Luke showed the original segments were still there, and the re-run took
care of it.
rp
RP wrote:
Upgrade proceeding on 0.9x - was able to parse and index just fine
after first halt, but now I get an error in the logs
was the addition of the mapred.speculation=false which cured
the parse error and allowed me to continue and index. Only other thing
may be the linkdb not being put where it was told, but I moved it over
(it built fine) and the index operation ran without issue.
--
rp
(PhasedFileSystem.java:211)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:315)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:137)
--
rp
Pardon my ignorance, but where and how do I do this..?? Nothing in conf
files that I can see as a switch
rp
Andrzej Bialecki wrote:
RP wrote:
2006-12-15 00:52:43,895 WARN mapred.LocalJobRunner - job_dokmpz
java.lang.NullPointerException
Andrzej - thanks, that conf switch seemed to take care of it! Any
idea if the Hadoop native stuff will give any performance boost?
rp
Andrzej Bialecki wrote:
RP wrote:
Pardon my ignorance, but where and how do I do this..?? Nothing in
conf files that I can see as a switch
Ah-ha