Re: lang identifier and nutch analyzer in trunk

2006-01-23 Thread Jérôme Charron
Any plan to implement this? I mean move the LanguageIdentifier class into nutch core. As I already suggested on this list, I would really like to move the LanguageIdentifier class (and profiles) to an independent Lucene sub-project (and the MimeType repository too). I don't remember why but

Re: lang identifier and nutch analyzer in trunk

2006-01-23 Thread Andrzej Bialecki
Jérôme Charron wrote: Any plan to implement this? I mean move the LanguageIdentifier class into nutch core. As I already suggested on this list, I would really like to move the LanguageIdentifier class (and profiles) to an independent Lucene sub-project (and the MimeType repository too).

Re: lang identifier and nutch analyzer in trunk

2006-01-23 Thread Jérôme Charron
+1. Other local modifications which I use frequently: * exporting a list of supported languages, * exporting an NGramProfile of the analyzed text, * allow processing of chunks of input (i.e. LanguageIdentifier.identify(char[] buf, int start, int len) ) - this is very useful if the text to
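
To illustrate the chunk-oriented signature mentioned above, here is a minimal sketch that counts character n-grams over a sub-range of a char[] buffer without copying the whole input. The class name ChunkNGrams and its count method are hypothetical and are not part of Nutch's LanguageIdentifier or NGramProfile; the real profile building may normalize and weight n-grams differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration only; not Nutch code.
final class ChunkNGrams {

    // Counts every n-gram of length n that lies fully inside buf[start .. start+len).
    static Map<String, Integer> count(char[] buf, int start, int len, int n) {
        if (n <= 0 || start < 0 || len < 0 || start > buf.length - len) {
            throw new IllegalArgumentException("invalid range or n-gram size");
        }
        Map<String, Integer> counts = new HashMap<>();
        int end = start + len;
        for (int i = start; i + n <= end; i++) {
            // Only the n characters of the current window are copied.
            String gram = new String(buf, i, n);
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }
}
```

A caller that reads a large document in blocks could pass each block's offset and length here and merge the resulting maps, which is one way the "chunks of input" idea might be used.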

protocol-httpclient; maximum total connections

2006-01-23 Thread orkunt . sabuncu
Hi, Protocol-httpclient sets the maximum number of total connections for the underlying commons-httpclient to the value of the fetcher.threads.fetch configuration parameter. However, if the -threads argument is used with the fetcher, it doesn't change fetcher.threads.fetch. Giving whatever number of threads to -threads
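
The mismatch described above can be sketched as follows. This is only an illustration of the idea that the command-line thread count should be written back to the configuration before the connection limit is read from it. The class FetcherThreads and its methods are hypothetical, and a plain Map stands in for the Nutch configuration object; only the key name fetcher.threads.fetch comes from the message.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not Nutch code: a Map stands in for the configuration.
final class FetcherThreads {
    static final String KEY = "fetcher.threads.fetch";

    // The thread count the fetcher would use: the -threads value if given,
    // otherwise the configured value, otherwise the supplied default.
    static int effectiveThreads(Integer cliThreads, Map<String, String> conf, int defaultThreads) {
        if (cliThreads != null) {
            return cliThreads;
        }
        String configured = conf.get(KEY);
        return configured != null ? Integer.parseInt(configured.trim()) : defaultThreads;
    }

    // Writes the effective value back so that code reading the configuration
    // later (for example, to size a connection pool) sees the same number.
    static int applyOverride(Integer cliThreads, Map<String, String> conf, int defaultThreads) {
        int threads = effectiveThreads(cliThreads, conf, defaultThreads);
        conf.put(KEY, Integer.toString(threads));
        return threads;
    }
}
```

With an override of this kind applied before the protocol plugin is initialized, the connection limit and the number of fetcher threads would be derived from one value instead of two.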

[jira] Resolved: (NUTCH-127) uncorrect values using -du, or ls does not return items

2006-01-23 Thread Stefan Groschupf (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-127?page=all ] Stefan Groschupf resolved NUTCH-127: Resolution: Fixed I guess it is solved, thanks. If I am able to reproduce it again, I will reopen this issue or file a new report. Thanks! uncorrect values

Re: protocol-httpclient; maximum total connections

2006-01-23 Thread Stefan Groschupf
Thanks for finding this bug. Please open a bug report in jira, and if you like, patches are always welcome. :-) On 23.01.2006 at 15:00, [EMAIL PROTECTED] wrote: Hi, Protocol-httpclient sets the maximum number of total connections to fetcher.threads.fetch configuration parameter for

xml-parser plugin contribution

2006-01-23 Thread Rida Benjelloun
Hi, I have developed an xml parser plugin. I have tested it with nutch 0.7.2. The parser uses namespaces and xpath to do the mapping between XML nodes and lucene fields. I'm trying to send the source of the plugin in a zip file but my message is always rejected (it is considered spam). How can I
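
The mapping described above, from namespace-qualified XML nodes to named fields via XPath, might look roughly like the sketch below. It uses only the standard javax.xml APIs and returns a plain Map of field names to values. The class XPathFieldMapper, its extract method, and the example namespace are hypothetical; this is not the contributed plugin's code, and the step of adding the values to a Lucene document is left out.

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical sketch, not the contributed plugin.
final class XPathFieldMapper {

    // Evaluates one XPath expression per field name against the given XML text.
    static Map<String, String> extract(String xml,
                                       Map<String, String> fieldToXPath,
                                       final Map<String, String> prefixToUri) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for prefixed XPath steps to match
            Document doc = dbf.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    String uri = prefixToUri.get(prefix);
                    return uri != null ? uri : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) { return null; }
                public Iterator<String> getPrefixes(String uri) {
                    return Collections.emptyIterator();
                }
            });

            Map<String, String> fields = new LinkedHashMap<>();
            for (Map.Entry<String, String> e : fieldToXPath.entrySet()) {
                // evaluate(String, Object) returns the string value of the result.
                fields.put(e.getKey(), xpath.evaluate(e.getValue(), doc));
            }
            return fields;
        } catch (Exception ex) {
            throw new IllegalStateException("XML parsing or XPath evaluation failed", ex);
        }
    }
}
```

Keeping the field-to-XPath table as data means new document types could be supported by configuration rather than code, which is presumably the appeal of the approach described in the message.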

patch for nutch and nutch-daemon.sh

2006-01-23 Thread Zaheed Haque
Hi: Due to a bug in the if statement, it's not possible to use symlinks for the shell scripts. Below you will find the patch. Thanks Zaheed --- $ svn diff nutch Index: nutch === --- nutch (revision 371849)