>don't expect polish.

You shouldn't need polish to be able to learn the command required to resume an aborted crawl, or to index what you have already crawled. Things like this shouldn't require an Easter egg hunt. They are going to happen to everyone doing more than a simple crawl.

>If you find a bug, please file a bug report, so that other folks are aware of it.

I have reported 2 so far. I have a third one (and a patch) that I am still in the process of developing and documenting, which relates to parsing PDFs.

>Better yet, if you have a solution or improvement, please construct a patch file (even for documentation) and attach it to a bug report. On the wiki, anyone can make themselves an account and update documentation. We don't boss folks around here, or complain. We pitch in and help.

In the email I sent you I volunteered to help by offering to polish the documentation myself. I do need some answers first. Many of the questions that get asked on this list unfortunately go unanswered by the experts. If they go unanswered, it is impossible for those who would otherwise share their solutions on the wiki to do so, because there is no solution to share. If I went and posted my knowledge about indexing and restarting crawls, it wouldn't be any better than what is already up there, which is incomplete and incorrect.

I know there are those of you who know Nutch inside and out. Right now that's just a few guys. I know I want to know more about it; that's why I am spending my free time trying to learn. Everything I am doing is part of an open source search project, not a commercial endeavour. I always contribute my knowledge back by posting answers to things I know about.

Documentation, whether we like it or not, is key to the use of the product. The onus is on the developers to document the project, and to provide support when the documentation is clearly lacking. Once the developers share more of their knowledge, there will be more knowledgeable users, and the developers won't need to spend as much time on support and documentation.

I would agree that if you have 1 URL to crawl, and you crawl it with depth = 3-6, Nutch is easy to use. I tried with depth=10, and I hit a snag. This has been very hard to get through, given the lack of documentation.

I have Nutch up and running fine here: http://24.75.221.234:8080 But this is a simple crawl and doesn't reflect all of the pages needed to make a good search engine.

I told you I was more than willing to help, and I think many users feel the same way, but I for one feel that there is a lack of documentation and support. This isn't meant to offend anyone; if you are offended, you need to toughen up your skin a little bit.

-----Original Message-----
From: sudhendra seshachala [mailto:[EMAIL PROTECTED]
Sent: Saturday, March 04, 2006 1:26 AM
To: nutch-user@lucene.apache.org
Subject: Re: project vitality?

I could not agree with Doug more. This is one of the best. I am trying UIMA too, though UIMA also uses Lucene. As of today, it is still a framework and community in its early stages. In fact, the nightly builds have good improvements over 0.71. Any serious user or adopter should be trying a snapshot of a nightly build.

Doug, it would be better if there were an official 0.8 release, or at least an RC, before the major 1.0 release. I am a newbie, so let me know about ideas on releasing 0.8.

Thanks
Sudhi

Doug Cutting <[EMAIL PROTECTED]> wrote:

Richard Braman wrote:
> I think it is still very much at proof of concept stage. I think it is
> close, but as you have mentioned, the website is severely out of date
> and the information and documentation on it lacks luster.

It stands to reason that if the documentation lacks "luster" the project must be dead!

Seriously, this is an active project. It is not yet 1.0, so don't expect polish. If it doesn't look easily usable to you then perhaps it is not. It's still for early adopters.

The commit list shows a fair amount of activity:

http://www.mail-archive.com/nutch-commits%40lucene.apache.org/maillist.html

Lots of public sites are using Nutch. Some are listed at http://wiki.apache.org/nutch/PublicServers, but many are not, like http://search.bittorrent.com/.

> I have tried
> to get the tutorial and faqs updated, but I haven't heard back.

This is an all-volunteer project. If you find a bug, please file a bug report, so that other folks are aware of it. Better yet, if you have a solution or improvement, please construct a patch file (even for documentation) and attach it to a bug report. On the wiki, anyone can make themselves an account and update documentation. We don't boss folks around here, or complain. We pitch in and help.

Doug

Sudhi Seshachala
http://sudhilogs.blogspot.com/