RE: need volunteer to develop search for apache.org

2006-01-26 Thread Fuad Efendi
Hope to join! +1 -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 25, 2006 4:24 PM To: nutch-dev@lucene.apache.org Subject: need volunteer to develop search for apache.org Would someone volunteer to develop Nutch-based site-search engine for all

[jira] Commented: (NUTCH-59) meta data support in webdb

2006-01-26 Thread James Jonas (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-59?page=comments#action_12364165 ] James Jonas commented on NUTCH-59: -- Stefan, Spot on. Use of HashMaps - very fast Use of separate file instead of extending WebDB - good Background Initially this will help l

[jira] Closed: (NUTCH-190) ParseUtil drops reason for failed parse

2006-01-26 Thread Andrzej Bialecki (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-190?page=all ] Andrzej Bialecki closed NUTCH-190: --- Fix Version: 0.8-dev Resolution: Fixed Assign To: Andrzej Bialecki Fixed. Thanks! > ParseUtil drops reason for failed parse > ---

[jira] Commented: (NUTCH-190) ParseUtil drops reason for failed parse

2006-01-26 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-190?page=comments#action_12364151 ] Chris A. Mattmann commented on NUTCH-190: - +1, I think that this is a needed patch. > ParseUtil drops reason for failed parse > ---

[jira] Commented: (NUTCH-190) ParseUtil drops reason for failed parse

2006-01-26 Thread [EMAIL PROTECTED] (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-190?page=comments#action_12364145 ] [EMAIL PROTECTED] commented on NUTCH-190: - Here's an example of failure output after patch is applied: 060126 141413 task_m_bx2ifn Error parsing: http://techreports.j

[jira] Updated: (NUTCH-190) ParseUtil drops reason for failed parse

2006-01-26 Thread [EMAIL PROTECTED] (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-190?page=all ] [EMAIL PROTECTED] updated NUTCH-190: Attachment: ParseUtil_drops_failure_reason.patch Attached is a suggested patch against revision 369598. > ParseUtil drops reason for failed parse > --

[jira] Created: (NUTCH-190) ParseUtil drops reason for failed parse

2006-01-26 Thread [EMAIL PROTECTED] (JIRA)
ParseUtil drops reason for failed parse --- Key: NUTCH-190 URL: http://issues.apache.org/jira/browse/NUTCH-190 Project: Nutch Type: Bug Components: fetcher Versions: 0.8-dev Environment: linux Reporter: [EMAIL PROT

Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Doug Cutting
Andrzej Bialecki wrote: Erhm.. please bear with me. I'd rather see these two classes in a separate package altogether, org.apache.nutch.metadata. The reason is that most likely these two classes will be used elsewhere too, not just in the protocol and parse/fetch related context. I'm specifical

Re: [jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Andrzej Bialecki
Doug Cutting (JIRA) wrote: [ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364125 ] My apologies for commenting here - JIRA produces broken HTML for me, I can't use it... Doug Cutting commented on NUTCH-139: I think we're

Re: need volunteer to develop search for apache.org

2006-01-26 Thread Stefan Groschupf
John, if you need any kind of support, let me know. I can help out especially with UI-related stuff, but I can also help with all other issues. Stefan On 26.01.2006 at 22:27, John X wrote: On Thu, Jan 26, 2006 at 12:19:38PM -0800, Doug Cutting wrote: John X wrote: Please count me in

[jira] Commented: (NUTCH-59) meta data support in webdb

2006-01-26 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-59?page=comments#action_12364136 ] Stefan Groschupf commented on NUTCH-59: --- Nutch 0.8 is very different to 0.7 in the way it stores page data and linkgraph. Therefore a reimplementation of meta data support

Re: need volunteer to develop search for apache.org

2006-01-26 Thread John X
On Thu, Jan 26, 2006 at 12:19:38PM -0800, Doug Cutting wrote: > John X wrote: > >Please count me in. > > Thanks, John. My pleasure. > > I forgot to mention that I'd prefer a committer for this, and you're a > committer, so that works well! > > >Is there a timetable for it? > > No, whenever y

Re: need volunteer to develop search for apache.org

2006-01-26 Thread Doug Cutting
John X wrote: Please count me in. Thanks, John. I forgot to mention that I'd prefer a committer for this, and you're a committer, so that works well! Is there a timetable for it? No, whenever you can get to it. I'll make you an account and send you the details. Doug

[jira] Commented: (NUTCH-59) meta data support in webdb

2006-01-26 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-59?page=comments#action_12364127 ] Doug Cutting commented on NUTCH-59: --- This patch is to the 0.7 release and will not work in the current trunk. Please see: http://www.mail-archive.com/nutch-dev@lucene.apache.

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364125 ] Doug Cutting commented on NUTCH-139: I think we're near agreement here. Here are the changes I think this patch still needs: MetadataNames belongs in the protocol package,

[jira] Commented: (NUTCH-59) meta data support in webdb

2006-01-26 Thread James Jonas (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-59?page=comments#action_12364122 ] James Jonas commented on NUTCH-59: -- I would like to offer my vote for NUTCH-59 (+1). I do have some comments with regard to the metadata infrastructure in Nutch. Here are som

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Chris A. Mattmann (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364116 ] Chris A. Mattmann commented on NUTCH-139: - Just to add to Jerome's last comment, I think the key here is simplicity. As a software developer, and ultimately as an end u

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-26 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364112 ] Jerome Charron commented on NUTCH-139: -- In fact, the more I look at this, the more I agree with Doug's last comment. There is no real need (for now) for such a complicated m

Re: A Nutch config editor...

2006-01-26 Thread Stefan Groschupf
Cool. There is already some work going on to get a web GUI realized for Nutch. I hope I will find the time to write something together soon; however, getting Nutch working without the static Nutch conf is the first and most important step. Stefan On 26.01.2006 at 18:57, Dominik Friedrich wrote:

A Nutch config editor...

2006-01-26 Thread Dominik Friedrich
... is available here: http://home.halifax.rwth-aachen.de/~positron/nutchclient.rar. To use it just start the nutchclient.exe, create a new Nutch project and double-click on the nutch-default.xml. The editor will never overwrite the nutch-default.xml but it will create a mapred-default.xml and

[jira] Created: (NUTCH-189) Injection infinite loop

2006-01-26 Thread Andy Liu (JIRA)
Injection infinite loop --- Key: NUTCH-189 URL: http://issues.apache.org/jira/browse/NUTCH-189 Project: Nutch Type: Bug Environment: Linux Reporter: Andy Liu Priority: Minor If you inject the crawldb with a URL file that doesn't end with

[jira] Updated: (NUTCH-188) Add searchable mailing list links to http://lucene.apache.org/nutch/mailing_lists.html

2006-01-26 Thread Andy Liu (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-188?page=all ] Andy Liu updated NUTCH-188: --- Attachment: mailing_list.patch > Add searchable mailing list links to > http://lucene.apache.org/nutch/mailing_lists.html > --

[jira] Created: (NUTCH-188) Add searchable mailing list links to http://lucene.apache.org/nutch/mailing_lists.html

2006-01-26 Thread Andy Liu (JIRA)
Add searchable mailing list links to http://lucene.apache.org/nutch/mailing_lists.html -- Key: NUTCH-188 URL: http://issues.apache.org/jira/browse/NUTCH-188 Project: Nutch Type: Impr

[jira] Commented: (NUTCH-185) XMLParser is configurable plugin. It use XPath and namespaces to do the mapping between the XML elements and Lucene fields.

2006-01-26 Thread Philippe EUGENE (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-185?page=comments#action_12364087 ] Philippe EUGENE commented on NUTCH-185: --- Great plugin. Thanks! I successfully tested this plugin on a 0.7.1 version of Nutch. I just have a problem with some structures like

Re: need volunteer to develop search for apache.org

2006-01-26 Thread Zaheed Haque
Sounds very interesting! When are you guys planning to start? Cheers Zaheed On 1/25/06, Doug Cutting <[EMAIL PROTECTED]> wrote: > Would someone volunteer to develop Nutch-based site-search engine for > all apache.org domains? We now have a Solaris zone to host this. > > Thanks, > > Doug >