[EMAIL PROTECTED] wrote:
Dear Piotr,
Thanks for the link; I read your great documentation. ;)
It is not mine; it was written by Michael Cafarella. I did only some
updates.
I have only a few questions after reading it:
1. Is NDFS slower than the 'bin/nutch server' alternative for searches?
2. I think that if I migrate from the 'server' structure to an NDFS
structure (with 8 million pages), I will need the following:
- A web server with 1 GByte RAM.
- A fetcher with 1 GByte RAM.
- A namenode server with 4 GByte RAM.
- 4 datanode servers with 4 GByte RAM each; each will hold 2 million
pages plus replicas.
Is this correct?
3. When I want to remove old segments (I would like to refresh after 30
days), how do I do it? How do I remove entire segment directories from
NDFS (rm removes only one file)?
I am not using NDFS in production. I have only played with it a bit, but
I do not think NDFS can be treated as an alternative to "bin/nutch
server". I do not have enough experience with it, but it was written on
this list some time ago that it is not ready for production use yet.
The work going on in the mapreduce branch is also connected with NDFS,
so we will probably see a more advanced NDFS version in the near future
(just my guess). So if you need it now, in a production environment, I
would stay with your current approach.
Regards
Piotr
Piotr Kosiorowski wrote:
Hello Ferenc,
Some documentation on running NDFS can be found on the wiki:
http://wiki.apache.org/nutch/NutchDistributedFileSystem
Regards,
Piotr
[EMAIL PROTECTED] wrote:
Is the NDFS usage documentation available anywhere?
Regards,
Ferenc