Dear Michael,
I wrote a tool, OptimizeIndex.java; it is faster, and there are no
questions about what it does.
After you optimize the index with IndexOptimizer, is the number of results
when searching for 'http' the same?
Regards,
Ferenc
Michael Nebel wrote:
Hi,
I fixed the problem with the following
try:
http://wiki.media-style.com/display/nutchDocu/Home
Stefan
On 04.08.2005 at 19:54, Nishant Chandra wrote:
Hi,
I am new to Nutch. Are there any articles/tutorials that explain the
internal workings of the crawler (crawl strategy), etc.?
Nishant
Hi Doug,
The slides from my talk yesterday at OSCON give some hints on how
to get started. We need a MapReduce tutorial.
http://wiki.apache.org/nutch/Presentations
Can you explain what this means: Page 20:
- scheduling is bottleneck, not disk, network or CPU?
Thanks.
Stefan
Stefan Groschupf wrote:
http://wiki.apache.org/nutch/Presentations
Can you explain what this means: Page 20:
- scheduling is bottleneck, not disk, network or CPU?
I mean that neither the CPUs, disks, nor network are at 100% of capacity.
Disks are running around 50% busy, CPUs a bit higher, and
Hello,
I think it is a good idea to release ASAP. I wanted to contribute my code
for fault-tolerant searching; it is taking more time than I expected
because, as some of you know, in the meantime I became a father. But I hope I
will be able to send something for comments early next week. I will look
at
Doug, I also ran into this when I was testing NDFS: the system would have to
wait for the namenode to tell the datanodes what data to receive and which
data to replicate. I'm currently setting up Lustre to see how it works; it
operates at the kernel level. Do you think if the namenode was
Jay Pound wrote:
Doug, I also ran into this when I was testing NDFS: the system would have to
wait for the namenode to tell the datanodes what data to receive and which
data to replicate
When did you test this? Which version of Nutch? How many nodes? My
benchmark results from just a few days
Doug Cutting wrote:
Andrzej Bialecki wrote:
So, I would propose a deadline of Aug 8 for the last commits, and then
perhaps Aug 15 for the release?
Sounds good to me. Thanks for helping with this!
Unfortunately, the patches related to detecting the unmodified content
will have to wait
[ http://issues.apache.org/jira/browse/NUTCH-65?page=all ]
Andrzej Bialecki closed NUTCH-65:
--
Resolution: Fixed
Patches applied. Thanks!
index-more plugin can't parse large set of modification-date