including code between plugins

2009-11-02 Thread Eran Zinman
Hi, I've written my own plugin that's doing some custom parsing. I've needed language parsing in that plugin and the language-identifier plugin is working great for my needs. However, I can't use the language identifier plugin as it is, since I want to parse only a small portion of the webpage.

could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Zanzico Gioele
Zanzico Gioele Senior Web Analyst VitecGroup - Division / Unit Tel +39 0424 07 Fax +39 0424 808999 www.vitecgroup.it http://www.vitecgroup.it/ P Respect the environment: don't print this e-mail, if not necessary. -- This message has been scanned for viruses and dangerous

Re: could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Nico Sabbi
On Mon, 02/11/2009 at 09.48 +0100, Zanzico Gioele wrote: Zanzico Gioele Senior Web Analyst VitecGroup - Division / Unit Tel +39 0424 07 Fax +39 0424 808999 www.vitecgroup.it http://www.vitecgroup.it/ P Respect the environment: don't print this e-mail, if not

Re: could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Heiko Dietze
Hello, there is no Administrator. But you can unsubscribe yourself. On the Nutch Mailing-List information site http://lucene.apache.org/nutch/mailing_lists.html you can find the following E-Mail address: nutch-user-unsubscr...@lucene.apache.org Then your unsubscribe requests should

Re: updatedb is talking long long time

2009-11-02 Thread Andrzej Bialecki
Kalaimathan Mahenthiran wrote: I forgot to add the detail... The segment i'm trying to do updatedb on has 1.3 millions urls fetched and 1.08 million urls parsed.. Any help related to this would be appreciated... On Sun, Nov 1, 2009 at 11:53 PM, Kalaimathan Mahenthiran matha...@gmail.com

Re: including code between plugins

2009-11-02 Thread Andrzej Bialecki
Eran Zinman wrote: Hi, I've written my own plugin that's doing some custom parsing. I've needed language parsing in that plugin and the language-identifier plugin is working great for my needs. However, I can't use the language identifier plugin as it is, since I want to parse only a small

Re: could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Nico Sabbi
On Mon, 02/11/2009 at 10.04 +0100, Heiko Dietze wrote: Hello, there is no Administrator. But you can unsubscribe yourself. On the Nutch Mailing-List information site http://lucene.apache.org/nutch/mailing_lists.html you can find the following E-Mail address:

Re: including code between plugins

2009-11-02 Thread Eran Zinman
Hi Andrzej, thank you so much! that worked like a charm! I've spent so much time trying to figure this out and you helped me solve it in 5 min! Thanks! Eran On Mon, Nov 2, 2009 at 11:13 AM, Andrzej Bialecki a...@getopt.org wrote: Eran Zinman wrote: Hi, I've written my own plugin that's

Re: could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Andrzej Bialecki
Nico Sabbi wrote: On Mon, 02/11/2009 at 10.04 +0100, Heiko Dietze wrote: Hello, there is no Administrator. But you can unsubscribe yourself. On the Nutch Mailing-List information site http://lucene.apache.org/nutch/mailing_lists.html you can find the following E-Mail

Re: could you unsubscribe me from this mailing list pls. tks

2009-11-02 Thread Nico Sabbi
On Mon, 02/11/2009 at 10.47 +0100, Andrzej Bialecki wrote: Nico Sabbi wrote: On Mon, 02/11/2009 at 10.04 +0100, Heiko Dietze wrote: Hello, there is no Administrator. But you can unsubscribe yourself. On the Nutch Mailing-List information site

Unsubscribe step-by-step (Re: could you unsubscribe me from this mailing list pls. tks)

2009-11-02 Thread Andrzej Bialecki
Andrzej Bialecki wrote: doesn't work, as reported by me and others last week. Thanks, Did you get the message with the subject of confirm unsubscribe from nutch-user@lucene.apache.org and did you respond to it from the same email account that you were subscribed from? .. I just verified

Re: No search results

2009-11-02 Thread Webmaster
Hi, thanks for responding. Tomcat is working just like it should, and so is the crawl. Not connecting to my search DB seems to be the problem, because I get 0 results out of 0, and that is impossible because there should be crawl data. I was thinking I did something wrong with the nutch-site.xml

Re: Unsubscribe step-by-step (Re: could you unsubscribe me from this mailing list pls. tks)

2009-11-02 Thread Nico Sabbi
On Mon, 02/11/2009 at 11.00 +0100, Andrzej Bialecki wrote: Andrzej Bialecki wrote: doesn't work, as reported by me and others last week. Thanks, Did you get the message with the subject of confirm unsubscribe from nutch-user@lucene.apache.org and did you respond to it

Re: updatedb is talking long long time

2009-11-02 Thread Kalaimathan Mahenthiran
Thanks for all the replies... Okay, I think there seems to be some issue too... I'm running nutch out of the box.. using nutch release 1.0... I'm running this in local mode.. The number of reduce tasks.. is the default configured by nutch... The db size is approximately 860 MB.. I know the

Asking again - WebSphere question

2009-11-02 Thread Joshua J Pavel
I'm very new at this, so forgive my novice questions. I'm trying to install nutch in WebSphere 6.1. While I can see that others have done this before, I've been unsuccessful. I keep getting this error: Error 500: java.lang.Error: java.lang.NoClassDefFoundError: org.apache.jsp._search (wrong

Re: Unsubscribe step-by-step (Re: could you unsubscribe me from this mailing list pls. tks)

2009-11-02 Thread Ryan McKinley
not having received a response from mailman I can't proceed to step 3 Have you checked the junk mail filters and stuff like that? Perhaps the message is getting deleted/removed/hidden before you get it...

Re: updatedb is talking long long time

2009-11-02 Thread Julien Nioche
Hi again i know the process is not stuck.. and the process is running because i turned on the hadoop logs and i can see logs being written to it... I'm not sure how to check if the task is completely stuck or not... run jps to identify the process id then *jstack id* several times to see if

Re: updatedb is talking long long time

2009-11-02 Thread Kalaimathan Mahenthiran
I have a lot of space left on /tmp. I don't have a separate partition for /tmp... I have a folder called /tmp... There is a lot of space left.. close to 1.3 terabytes... 1.4T 55G 1.3T 5% / tmpfs 3.8G 0 3.8G 0% /lib/init/rw varrun 3.8G

Why is nutch writing files in /tmp?

2009-11-02 Thread Paul Tomblin
Why is nutch writing /tmp/hadoop-[userid] files, and how can I stop it doing that? -- http://www.linkedin.com/in/paultomblin http://careers.stackoverflow.com/ptomblin

EOFException while trying to read 65557 bytes

2009-11-02 Thread bhavin pandya
Hi, I got the following exception in my datanode log file while updating the db. 2009-11-03 04:39:24,273 ERROR datanode.DataNode - DatanodeRegistration(192.168.101.152:50010, storageID=DS-1706374374-192.168.101.152-50010-1255721446274, infoPort=50075, ipcPort=50020):DataXceiver java.io.EOFException:

Re: EOFException while trying to read 65557 bytes

2009-11-02 Thread bhavin pandya
Hello everyone, Here is more info about the exception. I got the exception on both slave nodes. 2009-11-03 04:39:24,273 ERROR datanode.DataNode - DatanodeRegistration(192.168.101.152:50010, storageID=DS-1706374374-192.168.101.152-50010-1255721446274, infoPort=50075, ipcPort=50020):DataXceiver

How to make nutch crawl within a sub category of an URL?

2009-11-02 Thread saravan.krish
Hi, Can anyone please let me know how to make nutch crawl within a sub category of a URL? For example, if I want to crawl within Computers Internet category of answers.yahoo.com. How do I do it with Nutch? URL:
