Well, I would like to agree with Piotr here, but in current development (version 0.8 and onwards) a single-machine Nutch install is not optimal. There are various Hadoop-related issues, for example http://issues.apache.org/jira/browse/HADOOP-206, that are important for a single-machine install.

I don't think "one size fits all" is the catchphrase for Nutch either. That's why, Anthony, I would suggest you look at Solr or Lucene for your installation.

The problem of 0.8 being slow on a single machine is nothing new; just search the mailing list and you will find many examples of it. 0.8 was released earlier this year and the problem is still not solved, so I am sorry to be negative, but I am just stating facts.

On 11/13/06, Piotr Kosiorowski <[EMAIL PROTECTED]> wrote:
> Anthony,
> I do not think nutch can forget about small implementations. It was
> one of its strong points
> and I do think we will want to support them. For any issues please
> report them in JIRA and I am sure they would be taken care of.
> Regards
> Piotr
>
> On 11/12/06, Anthony May <[EMAIL PROTECTED]> wrote:
> > Greetings all,
> >
> > I have just been handed the administration of our nutch implementation,
> > we are currently using nutch 0.7 and it very badly needs updating.
> > However we are evaluating several options, and I wanted to know about
> > where nutch is going as a project. I have not been able to find anything
> > in the wiki or in the mailing list archives with this information
> > (forgive me if I have missed it).
> >
> > The central issue is that our needs are for our crawling our own
> > website with about 200,000 pages and documents with a single machine
> > containing nutch, not for crawling the web with a massively scalar
> > architecture. I have heard nutch is moving towards the latter and that
> > the former usage is becoming very slow in 0.8 compared to 0.7, is this
> > correct?
> >
> > Thank you for helping me out.
> >
> > Regards,
> >
> >
> > Anthony May
> > Web Developer
> > NZQA
> >
> > ********************************************************************************
> > This email may contain legally privileged information and is intended only
> > for the addressee.
> > It is not necessarily the official view or
> > communication of the New Zealand Qualifications Authority. If you are not
> > the intended recipient you must not use, disclose, copy or distribute this
> > email or
> > information in it. If you have received this email in error, please contact
> > the sender immediately. NZQA does not accept any liability for changes made
> > to this email or attachments after sending by NZQA.
> >
> > All emails have been scanned for viruses and content by MailMarshal.
> > NZQA reserves the right to monitor all email communications through its
> > network.
> >
> > ********************************************************************************
> >
>

_______________________________________________
Nutch-general mailing list
Nutch-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-general