Download nutch-0.8-dev

2006-03-13 Thread Alexander E Genaud
Hello, I have been reading many references to nutch-0.8-dev on this mailing list and in the docs. From where can it be downloaded? Are the nightly builds essentially 0.8-dev? http://cvs.apache.org/dist/lucene/nutch/nightly/ Alex -- Those who can make you believe absurdities can make you commit

try to parse pdf

2006-03-13 Thread Peter Swoboda
Hi, I tried to crawl with the pdf plugin included, but it doesn't seem to work. Does anyone know what the problem could be? nutch-site.xml is .. <property> <name>plugin.includes</name>

Intranet Crawling: Increasing Index Size, Updating the Index

2006-03-13 Thread Douglas Brunner
I'm planning to launch a vortal, and using the intranet crawl seems like the best choice for it. To test things at first, I'd like to create a relatively small index and increase it progressively. I'm not sure of the best way to do this (Please note, I don't hold a degree in computer sciences, so

Problems

2006-03-13 Thread Laurent Michenaud
Hi, I have integrated Nutch into my application. In the admin part, I can launch indexing. In the client part, I can launch searches. After a search, I can't run indexing again because the index files are in use by the NutchBean. How can I do that, please? Thanks

RE: try to parse pdf

2006-03-13 Thread Richard Braman
That error is actually not from the http content limit, but I would recommend setting the content limit to -1. For some reason this error seems to happen sometimes even after you add the pdf parsing plugin like you did. I think nutch must cache the plugin properties in nutch-default. It will

Re: try to parse pdf

2006-03-13 Thread Andrzej Bialecki
Richard Braman wrote:
> That error is actually not from the http content limit, but I would recommend setting the content limit to -1. For some reason this error
I would recommend against it - you may inadvertently fetch gigabyte-sized files if you skip content limits... but you can set it

Re: nutch-user Digest 6 Mar 2006 17:20:57 -0000 Issue 238

2006-03-13 Thread Alexander E Genaud
Hello, I am attempting to precompile the nutch JSPs on Tomcat-5.5 but have been unsuccessful. I am referencing: http://tomcat.apache.org/tomcat-5.5-doc/jasper-howto.html http://tomcat.apache.org/tomcat-5.5-doc/jasper-howto.html#Web%20Application%20Compilation I have checked out the 0.7.1

Buggy fetchlist' urls

2006-03-13 Thread Florent Gluck
Hi, I'm using nutch revision 385671 from the trunk. I'm running it on a single machine using the local filesystem. I just started with a seed of one single url: http://www.osnews.com Then I ran a crawl cycle of depth 2 (generate/fetch/updatedb) and dumped the crawl db. Here is where I got quite

reload ROOT in tomcat

2006-03-13 Thread Michael Ji
Hi there, I made a change in search.jsp, compiled it into ROOT, and FTPed the ROOT directory to tomcat/webapps/. But somehow, when I open the search page in a web browser, no change is shown. I think Tomcat is still using the old ROOT files (maybe in its cache?). How can I force Tomcat to reload ROOT

Language Profiling Problem

2006-03-13 Thread Tolga Erkal
I am trying to use NGramProfile to create a profile and am getting the following error. It is probably related to the classpath setting, but I could not figure out how to make it work. Any help? Tolga /cygdrive/c/nutch/trunk $ java org.apache.nutch.analysis.lang.NGramProfile -create Exception in

Re: Language Profiling Problem

2006-03-13 Thread Jack Tang
Please put hadoop-0.1-dev.jar into your classpath. On 3/14/06, Tolga Erkal [EMAIL PROTECTED] wrote: I am trying to use NGramProfile to create a profile and am getting the following error. It is probably related to the classpath setting, but I could not figure out how to make it work. Any help?

RE: Language Profiling Problem

2006-03-13 Thread Tolga Erkal
Hi, I put C:\nutch\trunk\lib into my classpath. Now I am getting the following error. Tolga /cygdrive/c/nutch/trunk $ java org.apache.nutch.analysis.lang.NGramProfile -create Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/nutch/analysis/lang/NGramProfile Any ideas?

Re: Language Profiling Problem

2006-03-13 Thread Jack Tang
I don't think java -classpath C:/nutch/trunk/lib xxx.xxx.A will put all the jars under the C:/nutch/trunk/lib dir into the classpath. You should write a shell/bat script. Or you can run the class in an IDE (say Eclipse or NetBeans). On 3/14/06, Tolga Erkal [EMAIL PROTECTED] wrote: Hi, I put
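The point above is that a bare directory on the classpath exposes only loose .class files in it, not the jars it contains, so each jar has to be listed individually. A minimal sketch of such a helper script follows, written in Python rather than shell/bat. The function name `build_classpath` and the `build/classes` entry in the usage note are hypothetical and do not come from the thread.

```python
import os


def build_classpath(lib_dir, extra_entries=()):
    """Return a classpath string naming every .jar in lib_dir.

    extra_entries is an optional sequence of additional classpath
    entries (for example a directory of compiled classes) that are
    placed ahead of the jars. Both names are illustrative only.
    """
    # List the jars explicitly: a plain directory entry would not
    # make the JVM look inside the jar files stored there.
    jars = sorted(
        os.path.join(lib_dir, name)
        for name in os.listdir(lib_dir)
        if name.lower().endswith(".jar")
    )
    # os.pathsep is ';' on Windows and ':' on Unix-like systems,
    # which matches the separator the java launcher expects.
    return os.pathsep.join(list(extra_entries) + jars)
```

A possible use, assuming the compiled Nutch classes live in a directory such as `build/classes` (an assumption, not stated in the thread): pass the result of `build_classpath("lib", ["build/classes"])` to `java -classpath` before the class name.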

Re: reload ROOT in tomcat

2006-03-13 Thread tonykingzhao
You can shut down Tomcat after deleting the tomcat\work\Catalina directory and its files. tonykingzhao 2006-03-14 From: Michael Ji Sent: 2006-03-14 10:34:08 To: nutch-user@lucene.apache.org Cc: Subject: reload ROOT in tomcat Hi there, I made a change in search.jsp, compiled it into ROOT, and