Nutch, tomcat6, UTF-8 and query filter = crash

2010-04-01 Thread Hannu Väisänen
I have a query filter that works when I search from the command line: $ bin/nutch org.apache.nutch.searcher.NutchBean word. The query filter crashes when it calls native code when I search through tomcat6 for a word that contains non-ASCII letters. The filter assumes that its input is in

linux crawl problem

2010-04-01 Thread hari2303
Hello all, thanks to Nutch. I am using nutch-1.0 and doing a local file crawl, which crawls fine and gives me search results on Windows. But when I crawl on a Linux machine, I don't know whether it crawls or not. I don't see any exception. My crawl-urlfilter.txt

Re: Nutch, tomcat6, UTF-8 and query filter = crash

2010-04-01 Thread MilleBii
I use tomcat6 in UTF-8, no problem. Don't forget a Java String is Unicode not UTF-8 so in the query filter you should think Unicode and not UTF-8 2010/4/1, Hannu Väisänen hvais...@joyx.joensuu.fi: I have a query filter that works when I search from the command line $ bin/nutch
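A minimal sketch of the String-versus-UTF-8 point above, not taken from the thread: inside the JVM a query term is a Unicode String, and UTF-8 only exists once the term is explicitly encoded to bytes, for example just before handing it to native code. The class name Utf8QueryBytes and the helper toUtf8 are hypothetical; how the original filter actually passes data to its native library is not stated in the thread.

```java
import java.nio.charset.StandardCharsets;

public class Utf8QueryBytes {

    // Hypothetical helper: encode a Unicode query term as UTF-8 bytes.
    // An explicit charset avoids depending on the platform default,
    // which can differ between a shell and a servlet container.
    static byte[] toUtf8(String term) {
        return term.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "Väisänen" written with escapes so the source file encoding does not matter.
        String term = "V\u00e4is\u00e4nen";
        byte[] utf8 = toUtf8(term);

        // The String has 8 chars; each 'ä' takes two bytes in UTF-8.
        System.out.println(term.length() + " chars, " + utf8.length + " UTF-8 bytes");
    }
}
```

Code that assumes one byte per character would see 10 bytes here instead of 8, which is the kind of mismatch to look for when a filter only fails on non-ASCII input.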

Re: Nutch, tomcat6, UTF-8 and query filter = crash

2010-04-01 Thread MilleBii
Also I don't think the bean.LOG works in UTF-8 but only Iso-latin-1 2010/4/1, MilleBii mille...@gmail.com: I use tomcat6 in UTF-8, no problem. Don't forget a Java String is Unicode not UTF-8 so in the query filter you should think Unicode and not UTF-8 2010/4/1, Hannu Väisänen

Nutch with Hadoop in windows;;

2010-04-01 Thread Ahmad Al-Amri
Hello ; I'm trying to use Nutch and Hadoop shown here: http://wiki.apache.org/nutch/NutchHadoopTutorial but I am using Windows 2003 ... I am trying to run this command in cygwin bin/hadoop namenode -format and I get the following exceptions: Exception in thread main

problem: crawl pdfs from a website and index these to solr

2010-04-01 Thread toocrazymail
Hi list, I am new to using Nutch (Solr, and Nutch integrated in Solr, too) and my English is not good :/ In my company I have the task of replacing the MediaWiki search function with a Solr (including Nutch) search function. Furthermore, I have to write a small Java program which reads ONE

Re: Nutch with Hadoop in windows;;

2010-04-01 Thread Ahmad Al-Amri
solved by specifying the config directory by adding --config /conf . bin/hadoop --config /conf namenode -format bin/hadoop --config /conf version Regards; From: Ahmad Al-Amri amri...@yahoo.com To: Nutch User nutch-user@lucene.apache.org Sent: Thu, April 1,

Re: description and keywords

2010-04-01 Thread toocrazymail
Hi, maybe the index-more and query-more plugins, as part of the value of the property plugin.includes in nutch-default.xml or nutch-site.xml? (see http://wiki.apache.org/nutch/PluginCentral) Regards, Marcel :) On 01.04.2010 at 14:57, ramires wrote: Is there any plugin for fetcher and indexer to

Re: description and keywords

2010-04-01 Thread Julien Nioche
Description and keywords are not provided by 'index-more' - you'll need to write a custom plugin for this. I have written one which does exactly that and will probably put it in JIRA if there is interest for it. It is quite straightforward to do and it would be a nice way of learning about Nutch

[VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Andrzej Bialecki
Hi all, According to an earlier [DISCUSS] thread on the nutch-dev list I'm calling for a vote on the proposal to make Nutch a top-level project. To quickly recap the reasons and consequences of such move: the ASF board is concerned about the size and diversity of goals across various subprojects

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Sudhi Seshachala
[ ] +1 - yes, I vote for the proposal This is awesome --- On Thu, 4/1/10, Andrzej Bialecki a...@getopt.org wrote: From: Andrzej Bialecki a...@getopt.org Subject: [VOTE] Nutch to become a top-level project (TLP) To: nutch-user@lucene.apache.org Date: Thursday, April 1, 2010, 12:23 PM Hi all,

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Mattmann, Chris A (388J)
Hi Andrzej, +1 from me. Cheers, Chris On 4/1/10 10:23 AM, Andrzej Bialecki a...@getopt.org wrote: Hi all, According to an earlier [DISCUSS] thread on the nutch-dev list I'm calling for a vote on the proposal to make Nutch a top-level project. To quickly recap the reasons and consequences

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Julien Nioche
Hi guys, +1 from me too Julien -- DigitalPebble Ltd http://www.digitalpebble.com On 1 April 2010 18:23, Andrzej Bialecki a...@getopt.org wrote: Hi all, According to an earlier [DISCUSS] thread on the nutch-dev list I'm calling for a vote on the proposal to make Nutch a top-level project.

RE: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Robert Hohman
+1 yes, and I also vote we try to somehow make nutch easier to work with maven-based projects. I've had a heck of a time integrating it (although more or less gotten it to work) -Original Message- From: Sudhi Seshachala [mailto:sudhi_...@yahoo.com] Sent: Thursday, April 01, 2010 10:37

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Adilson Oliveira Cruz
+ 1 ! On Thu, Apr 1, 2010 at 2:40 PM, Robert Hohman rob...@glassdoor.com wrote: +1 yes, and I also vote we try to somehow make nutch easier to work with maven-based projects. I've had a heck of a time integrating it (although more or less gotten it to work) -Original Message-

RE: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Eduard Kotysh
[x] +1 - yes, I vote for the proposal -Original Message- From: Andrzej Bialecki [mailto:a...@getopt.org] Sent: Thursday, April 01, 2010 10:24 AM To: nutch-user@lucene.apache.org Subject: [VOTE] Nutch to become a top-level project (TLP) Hi all, According to an earlier [DISCUSS] thread

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Andrzej Bialecki
On 2010-04-01 19:40, Robert Hohman wrote: +1 yes, and I also vote we try to somehow make nutch easier to work with maven-based projects. I've had a heck of a time integrating it (although more or less gotten it to work) Patches are welcome - I realize this could be beneficial, but I'm not

RE: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Robert Hohman
ok, once we get it sorted out on our end, we will start to submit... thanks for all the hard work Andrzej! -Original Message- From: Andrzej Bialecki [mailto:a...@getopt.org] Sent: Thursday, April 01, 2010 11:13 AM To: nutch-user@lucene.apache.org Subject: Re: [VOTE] Nutch to become a

Re: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Ashumeet Singh
+1 :) it is the best. Thanks Ashumeet Singh Be kinder than necessary because everyone you meet is fighting some kind of battle On Apr 1, 2010, at 2:16 PM, Robert Hohman wrote: ok, once we get it sorted out on our end, we will start to submit... thanks for all the hard work Andrzej!

Can't open a nutch 1.0 index with luke

2010-04-01 Thread Magnús Skúlason
Hi, I am getting the following exception when I try to open a nutch 1.0 (I am using the official release) index with Luke (0.9.9.1) java.io.IOException: read past EOF at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput. java:151) at

Re: Can't open a nutch 1.0 index with luke

2010-04-01 Thread Andrzej Bialecki
On 2010-04-01 21:09, Magnús Skúlason wrote: Hi, I am getting the following exception when I try to open a nutch 1.0 (I am using the official release) index with Luke (0.9.9.1) java.io.IOException: read past EOF at

Re: Can't open a nutch 1.0 index with luke

2010-04-01 Thread Magnús Skúlason
Hi, I found the problem: I could open the index on my server (Linux) but not on my desktop (Windows), so something must have been corrupted in transferring the files (FTP). The same thing used to work just fine with nutch-0.9. I tried to zip it on the server and then unzip it on Windows, and then I can
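One way to check for this kind of transfer damage is to compare checksums of the index files on both machines. The sketch below is an illustration only, not part of the thread; the class name IndexFileDigest is hypothetical, and the thread does not say what exactly went wrong in the FTP transfer (text-mode transfer rewriting line endings is one common cause, but that is an assumption here).

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class IndexFileDigest {

    // Return the SHA-256 digest of the given bytes as lowercase hex.
    static String sha256Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest(data)) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    public static void main(String[] args) throws Exception {
        // Print one digest per file named on the command line.
        // Reads each file fully into memory, which is fine for a sketch
        // but not ideal for very large index segments.
        for (String name : args) {
            byte[] content = Files.readAllBytes(Paths.get(name));
            System.out.println(sha256Hex(content) + "  " + name);
        }
    }
}
```

Running it on the same index files on the server and on the desktop and comparing the printed digests shows whether the copies are byte-identical. A single changed byte, such as a line ending rewritten in transit, produces a different digest.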

RE: [VOTE] Nutch to become a top-level project (TLP)

2010-04-01 Thread Arkadi.Kosmynin
[x] +1 - yes, I vote for the proposal -Original Message- From: Andrzej Bialecki [mailto:a...@getopt.org] Sent: Friday, 2 April 2010 4:24 AM To: nutch-user@lucene.apache.org Subject: [VOTE] Nutch to become a top-level project (TLP) Hi all, According to an earlier [DISCUSS]