Administration GUI on nutch 0.81

2007-09-24 Thread djames
Hello, I am posting to ask whether anyone has managed to set up the administration GUI from http://issues.apache.org/jira/browse/NUTCH-251 (the Nutch GUI) with Nutch 0.81, because I have a big problem and I cannot get it to work. The problem may be with the Hadoop patch, because some files are missing in

Prune synatx

2007-08-29 Thread djames
Hello, could someone please give me a link to documentation on the prune syntax, covering all the commands I can put in the prune.txt file? Thanks for your help. -- View this message in context: http://www.nabble.com/Prune-synatx-tf4346788.html#a12383982 Sent from the Nutch - User mailing

Link analysis tool

2007-08-09 Thread djames
Hello, I have a question about link analysis in Nutch. Is link analysis part of the default configuration of Nutch 0.81, and if not, how can I set it up? And what is the minimum depth for effective link analysis? -- View this message in context:

Re: manually Rank result

2007-08-08 Thread djames
OK, thanks a lot for your help, you saved my work! I'm setting up the patch for multi-index search in Nutch (NUTCH-480), but I don't know how to create a Nutch index or how to make Nutch search in a field like url or keyword... -- View this message in context:

manually Rank result

2007-08-06 Thread djames
Hello, I need to know whether it is possible to return, on the first page of results, some pages that perfectly match a search. For example, if the search is mercedes, can I manually put www.mercedes.com as the first result and then let Nutch choose the following ones? Thanks for your response -- View

Re: manually Rank result

2007-08-06 Thread djames
For this case it is fine, it works, but if the query is car I can't have mercedes and ferrari on the first page. Is it possible to modify search.jsp? -- View this message in context: http://www.nabble.com/manually-Rank-result-tf4223136.html#a12020402 Sent from the Nutch - User mailing list
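
A minimal sketch of the manual ranking asked about in this thread, assuming the pinned pages are applied to the result list after the search engine has ranked it (for instance from the JSP that renders the results). It does not use any Nutch API: the class PinnedResults, the method merge and the pin table are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Nutch: puts hand-picked URLs ahead of
// the URLs returned by the search engine for a given query.
final class PinnedResults {

    // pins maps a lower-case query to the URLs that must come first.
    // ranked is the engine's own ordering for the same query.
    static List<String> merge(Map<String, List<String>> pins,
                              String query,
                              List<String> ranked) {
        // LinkedHashSet keeps insertion order and drops duplicates, so a
        // pinned URL that the engine also returned is listed only once.
        LinkedHashSet<String> merged = new LinkedHashSet<>();
        List<String> pinned = pins.get(query.trim().toLowerCase(Locale.ROOT));
        if (pinned != null) {
            merged.addAll(pinned);
        }
        merged.addAll(ranked);
        return new ArrayList<>(merged);
    }
}
```

With a pin table that maps car to www.mercedes.com and www.ferrari.com, both URLs would come first for the query car, followed by whatever the engine returned.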

Re: stackoverflow error

2007-06-06 Thread djames
Thanks a lot for your help. I'll give you feedback. -- View this message in context: http://www.nabble.com/stackoverflow-error-tf3879034.html#a10993864 Sent from the Nutch - User mailing list archive at Nabble.com.

Nutch Admin GUI

2007-04-16 Thread djames
Hello, I'm trying to set up the Nutch admin GUI from JIRA issue 251, but I get this exception during the build of the .job file: [echo] Compiling plugin: admin-management [javac] Compiling 9 source files to C:\Documents and Settings\jamel\workspace\test\build\admin-management\classes [javac]

web app 0.8 and 0.9 index

2007-04-06 Thread djames
Hello, I am just posting to ask whether an index generated with Nutch 0.9 can be read by the web interface of Nutch 0.8. Congratulations on this new release -- View this message in context: http://www.nabble.com/web-app-0.8-and-0.9-index-tf3537024.html#a9872903 Sent from the Nutch - User mailing

Re: Nutch conf reading

2007-03-15 Thread djames
Thanks for your help, but where I call this method it cannot be resolved. Is there an import I must add? -- View this message in context: http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9491194 Sent from the Nutch - User mailing list archive at Nabble.com.

Re: Nutch conf reading

2007-03-15 Thread djames
Thank you very much for your help, it works fine -- View this message in context: http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9495938 Sent from the Nutch - User mailing list archive at Nabble.com.

Re: [SOLVED] external host link logging

2007-03-14 Thread djames
I finally found the solution; if anyone is interested, contact me -- View this message in context: http://www.nabble.com/external-host-link-logging-tf3369106.html#a9471697 Sent from the Nutch - User mailing list archive at Nabble.com.

Nutch conf reading

2007-03-14 Thread djames
Hello, I need to add a parameter to the Nutch conf file. What is the method for reading the XML file in Nutch? Thanks -- View this message in context: http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9471867 Sent from the Nutch - User mailing list archive at Nabble.com.
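
A minimal sketch of reading a property from a Nutch-style configuration file, assuming the usual layout of a configuration root element containing property elements that each have a name and a value. Inside Nutch itself the configuration is, to my understanding, normally obtained with NutchConfiguration.create() (package org.apache.nutch.util) and read through the Hadoop Configuration object's get method, so the standalone parser below only illustrates the file format. The class ConfXmlReader and the property my.custom.param are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical standalone reader, not the Nutch or Hadoop implementation.
final class ConfXmlReader {

    // Parses <configuration><property><name/><value/></property>... text
    // and returns the name -> value pairs it contains.
    static Map<String, String> read(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> props = new HashMap<>();
            NodeList nodes = doc.getElementsByTagName("property");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element property = (Element) nodes.item(i);
                String name = childText(property, "name");
                String value = childText(property, "value");
                // Skip incomplete entries instead of failing on them.
                if (name != null && value != null) {
                    props.put(name.trim(), value.trim());
                }
            }
            return props;
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse configuration", e);
        }
    }

    // Returns the text of the first child element with the given tag,
    // or null when the element is absent.
    private static String childText(Element parent, String tag) {
        NodeList list = parent.getElementsByTagName(tag);
        return list.getLength() == 0 ? null : list.item(0).getTextContent();
    }

    public static void main(String[] args) {
        String xml = "<configuration><property>"
                + "<name>my.custom.param</name><value>42</value>"
                + "</property></configuration>";
        System.out.println(read(xml).get("my.custom.param"));
    }
}
```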

Re: [SOLVED] external host link logging

2007-03-13 Thread djames
Hello, could someone help me, please? Thank you -- View this message in context: http://www.nabble.com/external-host-link-logging-tf3369106.html#a9450252 Sent from the Nutch - User mailing list archive at Nabble.com.

Re: [SOLVED] external host link logging

2007-03-12 Thread djames
Hello, I've tried the solution you gave me, but it logs all the links that the parser finds. In the conf file there is a parameter named db.ignore.external.links; do you know where this parameter is handled in the code? I think I just have to add an if condition to log the outlinks to a file.

Re: [SOLVED] external host link logging

2007-03-09 Thread djames
Hi, for information, I run Nutch on a cluster of 5 PCs. When I look in /nutch/logs, none of the files contain external host links; they contain only normal system output. Sorry if I'm not looking in the right logs directory. -- View this message in context:

Re: [SOLVED] external host link logging

2007-03-09 Thread djames
Thanks, I'm going to try that. I need a log of all the external host links the fetcher finds, but not the normal links. For example, if I'm on the www.nabble.com web site and it contains a link to www.forecast.com, I want to log it, but I don't want to log a link to www.nabble.com/forecast -- View this message in context:
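
The rule described in this message can be sketched as a host comparison between the page being parsed and each outlink. This is only an illustration using the standard java.net.URI class, not the code Nutch uses for db.ignore.external.links; the class ExternalLinkCheck and its methods are hypothetical names, and the sketch assumes both URLs are absolute and include a scheme.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: decides whether an outlink
// points to a different host than the page it was found on.
final class ExternalLinkCheck {

    // Returns the lower-case host of an absolute URL, or null when the
    // URL cannot be parsed or carries no host.
    static String hostOf(String url) {
        try {
            String host = new URI(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (URISyntaxException e) {
            return null;
        }
    }

    // True only when both hosts are known and differ; unparseable URLs
    // are treated as "not external" so they are never logged by mistake.
    static boolean isExternal(String fromUrl, String toUrl) {
        String from = hostOf(fromUrl);
        String to = hostOf(toUrl);
        return from != null && to != null && !from.equals(to);
    }

    public static void main(String[] args) {
        String page = "http://www.nabble.com/";
        for (String link : new String[] {
                "http://www.forecast.com/",
                "http://www.nabble.com/forecast"}) {
            if (isExternal(page, link)) {
                // A real setup would append to a log file instead.
                System.out.println(page + " -> " + link);
            }
        }
    }
}
```

Comparing full host names means that a subdomain of the same site would count as external; whether that is wanted depends on the crawl.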

Re: [SOLVED] Newbie questions about followed links

2007-03-08 Thread djames
Hi, with your Nutch configuration, the crawl doesn't follow links with dynamic parameters. You must edit your regex filter at this line: # skip URLs containing certain characters as probable queries, etc. [EMAIL PROTECTED] -- View this message in context:

external host link logging

2007-03-08 Thread djames
Hello, I have been working with Nutch for 2 months now, and I'm very happy to see that this project is so powerful! I need to crawl only a set of given websites, so I set the parameter db.ignore.external.links to false and it works perfectly. But now I need to create a log file with the list of

Null pointer exception in search gui

2007-02-19 Thread djames
Hello, I just created a 3 GB index, but when I run a search with the web GUI I get this error message: 2007-02-19 10:53:31,765 INFO NutchBean - searching for 20 raw hits 2007-02-19 10:53:34,828 ERROR [jsp] - Servlet.service() pour la servlet jsp a généré une exception java.lang.RuntimeException:

Lease expired exception

2007-01-28 Thread djames
Hello, during the parse of a fetch of 600 000 pages on a cluster of 5 boxes, the job failed with this error message on 2 boxes: org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.LeaseExpiredException: No lease on /user/nutch/crawl/segments/20070127060350/crawl_parse/part-1 at

Re: Lease expired exception

2007-01-28 Thread djames
Thanks a lot for your response. I'm using Nutch 0.8.1. I will rebuild Hadoop with the patch... but I noticed something: I'm running the tasktrackers on different VMware machines, and their clocks are not exactly the same, differing by three to five minutes. Could that be the cause of the bug? -- View this message in

Nutch Common administration's Task

2006-12-27 Thread djames
Hello dear experts, I'm new at my company, and they were looking for a solution to crawl a selection of 1 000 000 URLs. Naturally I chose Nutch for its scalability and its Java code. I began working with Nutch three weeks ago and appreciate many things, but I have some questions I can't