Re: [Nutch-cvs] svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Doug Cutting
Andrzej Bialecki wrote: Namely? I didn't notice any ... I think it's better to avoid bash-isms, if we easily can. Not all the world looks like Linux. ;-) IFS, at least. I tried running this on Solaris, where /bin/sh is not bash, and it didn't work. It complained about unsetting IFS. Doug

Re: [Nutch-cvs] svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Andrzej Bialecki
[EMAIL PROTECTED] wrote: Author: cutting Date: Fri Jan 27 02:45:35 2006 New Revision: 372810 URL: http://svn.apache.org/viewcvs?rev=372810&view=rev Log: Explicitly specify bash, since this script requires some bash-specific features. Namely? I didn't notice any ... I think it's better to

Re: [Nutch-cvs] svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Andrzej Bialecki
Doug Cutting wrote: Andrzej Bialecki wrote: Namely? I didn't notice any ... I think it's better to avoid bash-isms, if we easily can. Not all the world looks like Linux. ;-) IFS, at least. I tried running this on Solaris, where /bin/sh is not bash, and it didn't work. It complained about

Re: svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Rod Taylor
Please don't do that. bash-2.05b$ ls /bin/bash ls: /bin/bash: No such file or directory bash-2.05b$ uname -a FreeBSD home 6.0-RELEASE FreeBSD 6.0-RELEASE #13: Sat Nov 5 00:19:49 EST 2005 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/HOME amd64

Re: need volunteer to develop search for apache.org

2006-01-27 Thread Rida Benjelloun
Hi Doug, I will be interested by this development. I have a lot of experience with lucene. Best regards On 1/27/06, Fuad Efendi [EMAIL PROTECTED] wrote: Hope to join! +1 -Original Message- From: Doug Cutting [mailto:[EMAIL PROTECTED] Sent: Wednesday, January 25, 2006 4:24 PM

Re: [jira] Commented: (NUTCH-185) XMLParser is configurable plugin. It use XPath and namespaces to do the mapping between the XML elements and Lucene fields.

2006-01-27 Thread Rida Benjelloun
Hi Philippe, Thanks for your comments. I have already added multi-values for a field in Lucene. I will try it with the Nutch plugin. Best regards. On 1/26/06, Philippe EUGENE (JIRA) [EMAIL PROTECTED] wrote: [ http://issues.apache.org/jira/browse/NUTCH-185?page=comments#action_12364087]

[jira] Commented: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-27 Thread Jerome Charron (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=comments#action_12364218 ] Jerome Charron commented on NUTCH-139: -- I think we're near agreement here. I really hope ... ;-) We should add an add() method to Metadata, and change set() to

Re: need volunteer to develop search for apache.org

2006-01-27 Thread Sami Siren
Will the resources (scripts, modifications, documentation etc) of this setup be publicly available? I mean, could this installation be something like an RI for implementing Nutch-based search for a public web site? I see this as a great opportunity to produce some generally usable stuff that would

Re: [Nutch-cvs] svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Doug Cutting
Andrzej Bialecki wrote: Right, Solaris /bin/sh doesn't allow that... Hmm. Does this IFS setting/unsetting work for you? I mean, I just tried it on Linux, using the real Bash. I put the nutch distrib in a path containing spaces, and I'm not able to run anything... I initially added it to make

Re: svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Doug Cutting
Rod Taylor wrote: Please don't do that. bash-2.05b$ ls /bin/bash ls: /bin/bash: No such file or directory bash-2.05b$ uname -a FreeBSD home 6.0-RELEASE FreeBSD 6.0-RELEASE #13: Sat Nov 5 00:19:49 EST 2005 [EMAIL

[jira] Updated: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-27 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=all ] Doug Cutting updated NUTCH-139: --- Attachment: (was: NUTCH-139.jc.review.patch.txt) Standard metadata property names in the ParseData metadata

[jira] Updated: (NUTCH-139) Standard metadata property names in the ParseData metadata

2006-01-27 Thread Doug Cutting (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-139?page=all ] Doug Cutting updated NUTCH-139: --- Attachment: (was: NUTCH-139.Mattmann.patch.txt) Standard metadata property names in the ParseData metadata

Re: A Nutch config editor...

2006-01-27 Thread Dominik Friedrich
I've found a way to make remote calls from the client to a jobtracker and namenode without patching any nutch code. The new package includes a small wrapper that starts a namenode and a jobtracker and makes an XML-RPC interface to those two services available. See

Re: svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Doug Cutting
Andrzej Bialecki wrote: #!/usr/bin/env bash +1 This works on Solaris, Linux & cygwin. Does it work on FreeBSD? Doug

Re: svn commit: r372810 - /lucene/nutch/trunk/bin/nutch

2006-01-27 Thread Rod Taylor
On Fri, 2006-01-27 at 13:37 -0800, Doug Cutting wrote: Andrzej Bialecki wrote: #!/usr/bin/env bash +1 This works on Solaris, Linux & cygwin. Does it work on FreeBSD? Yes. It will fail on some older and obscure systems but I don't imagine those will have a JVM anyway. -- Rod Taylor
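
For readers following the r372810 thread, here is a minimal sketch of the two portability points raised in it. It is not the actual bin/nutch script. The `#!/usr/bin/env bash` line looks bash up on the PATH instead of assuming /bin/bash, which does not exist on the FreeBSD system shown above. The function `split_on_colon` and its variables are hypothetical; it saves and restores IFS rather than unsetting it (unsetting IFS is what Solaris /bin/sh was reported to complain about), which is a substitute technique and not necessarily what the script itself does.

```shell
#!/usr/bin/env bash
# Hypothetical illustration only; not taken from bin/nutch.

# Print each colon-separated field of $1 on its own line,
# keeping spaces inside a field intact.
split_on_colon() {
  local saved_ifs=$IFS   # remember the caller's IFS instead of unsetting it later
  IFS=:
  set -f                 # turn off globbing so '*' in a field is not expanded
  local parts=($1)       # unquoted on purpose: split on ':' only
  set +f
  IFS=$saved_ifs         # restore the original value
  printf '%s\n' "${parts[@]}"
}
```

Calling `split_on_colon "/opt/my nutch/lib:/opt/other"` prints the two paths on separate lines, with the space in the first path preserved.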

Re: older Nutch list archives (@sf.net)?

2006-01-27 Thread Doug Cutting
The Sourceforge archives are still there, just hard to find, e.g.: http://sourceforge.net/mailarchive/forum.php?forum=nutch-developers These lists are also archived at mail-archive.com: http://www.mail-archive.com/nutch-developers%40lists.sourceforge.net/ Doug Gordon Mohr (archive.org)

[jira] Updated: (NUTCH-189) Injection infinite loop

2006-01-27 Thread Bryan Pendleton (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-189?page=all ] Bryan Pendleton updated NUTCH-189: -- Attachment: textinputformat.patch.txt Injection infinite loop --- Key: NUTCH-189 URL:

Re: older Nutch list archives (@sf.net)?

2006-01-27 Thread Doug Cutting
Gordon Mohr (archive.org) wrote: Doug Cutting wrote: The Sourceforge archives are still there, just hard to find, e.g.: http://sourceforge.net/mailarchive/forum.php?forum=nutch-developers When I visit that URL, I get: # Permission Denied # # Access to this page is restricted (either to

Re: need volunteer to develop search for apache.org

2006-01-27 Thread John X
Hi, Stefan, On Thu, Jan 26, 2006 at 10:17:52PM +0100, Stefan Groschupf wrote: John, if you need any kind of support let me know. Especially I can help out with UI related stuff, however I also can help with all other issues. Really appreciated. With all the support from the community,

Re: older Nutch list archives (@sf.net)?

2006-01-27 Thread Gordon Mohr (archive.org)
Access works now, thanks! (Search at SF.net seems flaky, though -- simple searches that bring expected results at mail-archive.com give nothing or abbreviated results at SF.net.) Also, I misinterpreted the mail-archive.com robots.txt... it is crawlable, though neither G nor Y go very deep.

Nutch - New Features (?)

2006-01-27 Thread Fuad Efendi
Since we have such strange plugin structure (DI? IoC?), and many utility classes with a single UNIX shell script to run everything... 1. Separate concerns. Clearly. - Crawl - Parse - Generate URL List - Crawl - ... (Interfaces of WebDB should be more clear, so we can use databases, etc,...) 1a.