>don't expect polish.
You shouldn't need polish to be able to learn the command required to
resume an aborted crawl, or to index what you have already crawled.
Things like this shouldn't require an Easter egg hunt.  They are going
to happen to everyone doing anything more than a simple crawl.

>If you find a bug, please file a bug report, so that other folks are
aware of it.  
I have reported 2 so far.  I have a third one (and a patch) that I am
still in the process of developing and documenting, which relates to
parsing PDFs.

>Better yet, if you have a 
>solution or improvement, please construct a patch file (even for 
>documentation) and attach it to a bug report. On the wiki, anyone can 
>make themselves an account and update documentation. We don't boss 
>folks around here, or complain. We pitch in and help.

In the email I sent you I volunteered to help by offering to polish the
documentation myself.  I do need some answers first.  Many of the
questions that get asked on this list unfortunately go unanswered by the
experts.  If they go unanswered, it is impossible for those who would
otherwise share their solutions on the wiki to do so, because there is
no solution to share.

If I went and posted my knowledge about indexing and restarting crawls,
it wouldn't be any better than what is already up there, which is
incomplete and incorrect.  I know there are those of you who know Nutch
inside and out.  Right now that's just a few guys.  I want to know
more about it; that's why I am spending my free time trying to learn.
Everything I am doing is part of an open source search project, not a
commercial endeavour.  I always contribute my knowledge back by posting
answers to things I know about.

Documentation, whether we like it or not, is key to the use of the
product.  The onus is on the developers to document the project, and to
provide support when the documentation is clearly lacking.  Once the
developers share more of their knowledge, there will be more
knowledgeable users, and the developers won't need to spend as much time
on support and documentation.

I would agree that if you have 1 URL to crawl, and you crawl it with
depth = 3-6, Nutch is easy to use.  I tried with depth=10, and I hit a
snag.  This has been very hard to get through, given the lack of
documentation.  I have Nutch up and running fine here:
http://24.75.221.234:8080
But this is a simple crawl and doesn't reflect all of the pages needed
to make a good search engine.

I told you I was more than willing to help, and I think many users feel
the same way, but I for one feel that there is a lack of documentation
and support.  This isn't meant to offend anyone; if you are offended, you
need to toughen up your skin a little bit.






-----Original Message-----
From: sudhendra seshachala [mailto:[EMAIL PROTECTED] 
Sent: Saturday, March 04, 2006 1:26 AM
To: nutch-user@lucene.apache.org
Subject: Re: project vitality?


I could not agree with Doug more. This is one of the best.. am trying
UIMA too... though UIMA also uses Lucene...as of today, it is still a
framework and community in early stages..
   
  In fact, the nightly builds have good improvements over 0.71.
  Any serious user or adopter should be trying a snapshot of the
nightly build.
   
  Doug, 
  It would be better if there were an official 0.8 release, or at least
  an RC, before the major 1.0 release. I am a newbie, so let me know your
ideas on releasing 0.8.
   
  Thanks
  Sudhi
  

Doug Cutting <[EMAIL PROTECTED]> wrote:
  Richard Braman wrote:
> I think it is still very much at proof of concept stage. I think it is
> close, but as you have mentioned, the website is severely out of date
> and the information and documentation on it lacks luster.

It stands to reason that if the documentation lacks "luster" the project
must be dead! Seriously, this is an active project. It is not yet 1.0,
so don't expect polish. If it doesn't look easily usable to you then
perhaps it is not. It's still for early adopters.

The commit list shows a fair amount of activity:

http://www.mail-archive.com/nutch-commits%40lucene.apache.org/maillist.html

Lots of public sites are using Nutch. Some are listed at 
http://wiki.apache.org/nutch/PublicServers, but many are not, like 
http://search.bittorrent.com/.

> I have tried
> to get the tutorial and faqs updated, but I haven't heard back.

This is an all-volunteer project. If you find a bug, please file a bug 
report, so that other folks are aware of it. Better yet, if you have a 
solution or improvement, please construct a patch file (even for 
documentation) and attach it to a bug report. On the wiki, anyone can 
make themselves an account and update documentation. We don't boss 
folks around here, or complain. We pitch in and help.

Doug



  Sudhi Seshachala
  http://sudhilogs.blogspot.com/
   


                