Hello,
I am posting to ask whether anyone has managed to set up the administration
GUI from http://issues.apache.org/jira/browse/NUTCH-251 (Nutch GUI) with
Nutch 0.8.1. I have a big problem and cannot get it to work. The problem may
be with the Hadoop patch, because some files are missing in
Hello,
Could someone please give me a link to documentation on the prune syntax,
covering all the commands I can put in the prune.txt file?
Thanks for your help.
--
View this message in context:
http://www.nabble.com/Prune-synatx-tf4346788.html#a12383982
Sent from the Nutch - User mailing
Hello,
I have a question about link analysis in Nutch.
Is link analysis enabled in the default configuration of Nutch 0.8.1, and if
not, how can I set it up?
And what is the minimum crawl depth for effective link analysis?
--
OK, thanks a lot for your help, you saved my work!
I'm setting up the patch for multi-index search in Nutch (NUTCH-480),
but I don't know how to create a Nutch index or how to make Nutch search in
a field such as url or keyword...
--
Hello,
I need to know whether it is possible to return, on the first page of
results, some pages that perfectly match a search.
For example, if the search is "mercedes", can I manually put
www.mercedes.com as the first result and then let Nutch choose the
following ones?
Thanks for your response.
--
For this case it is fine, it works, but if the query is "car" I can't have
mercedes and ferrari on the first page.
Is it possible to modify search.jsp?
--
View this message in context:
http://www.nabble.com/manually-Rank-result-tf4223136.html#a12020402
Sent from the Nutch - User mailing list
Thanks a lot for your help.
I'll give you feedback.
--
View this message in context:
http://www.nabble.com/stackoverflow-error-tf3879034.html#a10993864
Sent from the Nutch - User mailing list archive at Nabble.com.
Hello,
I'm trying to set up the Nutch admin GUI from JIRA issue NUTCH-251, but I get
this exception while building the .job file:
[echo] Compiling plugin: admin-management
[javac] Compiling 9 source files to C:\Documents and
Settings\jamel\workspace\test\build\admin-management\classes
[javac]
Hello,
I am just posting to ask whether an index generated with Nutch 0.9 can be
read by the web interface of Nutch 0.8.
Congratulations on this new release.
--
View this message in context:
http://www.nabble.com/web-app-0.8-and-0.9-index-tf3537024.html#a9872903
Thanks for your help, but where I call this method, it cannot be resolved.
Is there an import I need to add?
--
View this message in context:
http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9491194
Thank you very much for your help, it works fine.
--
View this message in context:
http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9495938
I finally found the solution; if anyone is interested, contact me.
--
View this message in context:
http://www.nabble.com/external-host-link-logging-tf3369106.html#a9471697
Hello,
I need to add a parameter to the Nutch conf file.
What is the method for reading the XML file in Nutch?
Thanks.
--
View this message in context:
http://www.nabble.com/Nutch-conf-reading-tf3401343.html#a9471867
Hello,
Could someone help me, please?
Thank you.
--
View this message in context:
http://www.nabble.com/external-host-link-logging-tf3369106.html#a9450252
Hello,
I've tried the solution you gave me, but it logs all the links that the
parser finds.
In the conf file there is a parameter named db.ignore.external.links; do you
know where this parameter is handled in the code? I think I just have to
add an if condition to log the outlinks to a file.
Hi,
For information, I run Nutch on a cluster of 5 PCs.
When I look in /nutch/logs, none of the files contain external host links;
they contain only normal system output.
Sorry if I'm not looking in the right logs directory.
--
Thanks, I'm going to try that.
I need a log of all the external host links the fetcher finds, but not the
normal links.
For example, if I'm on the www.nabble.com web site and it contains a link to
www.forecast.com, I want to log it, but not a link to
www.nabble.com/forecast.
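
The rule described above (log a link only when its host differs from the host of the page it was found on) could be sketched roughly as follows. This is a standalone illustration using only the JDK, not Nutch code: the class ExternalLinkLogger and its methods are hypothetical names, and where such a check would be hooked into the fetcher or parser is left open.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Nutch: decides which outlinks point to another host.
final class ExternalLinkLogger {

    /** Returns the lower-cased host of a URL, or null if it cannot be determined. */
    static String hostOf(String url) {
        try {
            String host = URI.create(url).getHost();
            return host == null ? null : host.toLowerCase(Locale.ROOT);
        } catch (IllegalArgumentException e) {
            return null; // malformed URL: treat the host as unknown
        }
    }

    /** True only when both hosts are known and they differ. */
    static boolean isExternal(String pageUrl, String linkUrl) {
        String from = hostOf(pageUrl);
        String to = hostOf(linkUrl);
        return from != null && to != null && !from.equals(to);
    }

    /** Keeps the outlinks whose host differs from the host of the page. */
    static List<String> externalLinks(String pageUrl, List<String> outlinks) {
        List<String> result = new ArrayList<>();
        for (String link : outlinks) {
            if (isExternal(pageUrl, link)) {
                result.add(link);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> outlinks = Arrays.asList(
                "http://www.forecast.com/",
                "http://www.nabble.com/forecast");
        // Prints only http://www.forecast.com/
        for (String link : externalLinks("http://www.nabble.com/", outlinks)) {
            System.out.println(link);
        }
    }
}
```

Comparing full host names means that www.nabble.com and forum.nabble.com would count as different hosts; whether that is wanted depends on what "external" should mean for the crawl.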
--
Hi,
With your Nutch configuration, the crawl does not follow links with dynamic
parameters.
You must edit your regex filter at this line:
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
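
To see why such a skip rule drops links with dynamic parameters, here is a small standalone sketch in plain Java. It assumes the rule uses the character class [?*!@=], as in the stock regex-urlfilter.txt, and that a "-" rule rejects a URL when the pattern is found anywhere in it. The class QueryCharFilter is a hypothetical name and this is not the Nutch filter implementation.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of a "-" (exclude) rule; not the Nutch filter itself.
final class QueryCharFilter {

    // Assumed character class of the skip rule: ? * ! @ =
    private static final Pattern SKIP = Pattern.compile("[?*!@=]");

    /** A URL is accepted only if none of the listed characters occurs in it. */
    static boolean accepts(String url) {
        return !SKIP.matcher(url).find();
    }

    public static void main(String[] args) {
        System.out.println(accepts("http://www.example.com/page.html"));     // true
        System.out.println(accepts("http://www.example.com/page.php?id=3")); // false
    }
}
```

Under these assumptions, removing ? and = from the character class, or commenting the rule out, would let URLs with query strings through, at the cost of possibly crawling many near-duplicate dynamic pages.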
--
Hello,
I have been working with Nutch for 2 months now, and I'm very happy to see
that this project is so powerful!
I need to crawl only a given set of websites, so I set the parameter
db.ignore.external.links to false and it works perfectly.
But now I need to create a log file with the list of
Hello,
I just created a 3 GB index, but when I search with the web GUI I get
this error message:
2007-02-19 10:53:31,765 INFO NutchBean - searching for 20 raw hits
2007-02-19 10:53:34,828 ERROR [jsp] - Servlet.service() pour la servlet
jsp a généré une exception
java.lang.RuntimeException:
Hello,
While parsing a fetch of 600,000 pages on a cluster of 5 machines, the job
failed with this error message on 2 machines:
org.apache.hadoop.ipc.RemoteException:
org.apache.hadoop.dfs.LeaseExpiredException: No lease on
/user/nutch/crawl/segments/20070127060350/crawl_parse/part-1 at
Thanks a lot for your response.
I'm using Nutch 0.8.1.
I will rebuild Hadoop with the patch.
But I noticed something: I'm running the tasktrackers on different VMware
machines, and their clocks are not exactly the same, differing by three to
five minutes. Could that be the cause of the bug?
--
Hello dear experts,
I'm new at my company, and they were looking for a solution to crawl a
selection of 1,000,000 URLs.
Naturally I chose Nutch for its scalability and Java code.
I began working with Nutch three weeks ago and appreciate many things about
it, but I have some questions I can't