Re: urls with ? and symbols

2009-03-01 Thread Bartosz Gadzimski
alx...@aim.com wrote: Hello, I use nutch-0.9 and try to index urls with ? and symbols. I have commented out this line: -[?*!@=] in the conf/crawl-urlfilter.txt, conf/automaton-urlfilter and conf/regex-urlfilter.txt files. However, nutch still ignores these urls. Does anyone know how this can be
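For readers following this thread, here is a minimal Python sketch of how a first-match rule list of this kind treats such a URL. It is not Nutch code. It assumes the stock rule in question is `-[?*!@=]`, that each rule is a `+` or `-` sign followed by a regex, that lines starting with `#` are comments, and that the first rule found anywhere in the URL decides. The names `parse_rules`, `accepts`, `STOCK`, and `EDITED`, and the example.com URL, are made up for illustration.

```python
import re


def parse_rules(text):
    """Parse '+regex' / '-regex' lines; blank lines and '#' comments are skipped."""
    rules = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        sign, pattern = line[0], line[1:]
        if sign not in "+-":
            raise ValueError("rule must start with + or -: " + line)
        rules.append((sign == "+", re.compile(pattern)))
    return rules


def accepts(rules, url):
    """Return the verdict of the first rule whose regex is found in the URL."""
    for allow, pattern in rules:
        if pattern.search(url):
            return allow
    return False  # assumption: a URL that matches no rule is rejected


# Hypothetical filter file contents, modelled on the rule quoted in the post.
STOCK = """
# skip URLs containing certain characters as probable queries, etc.
-[?*!@=]
# accept anything else
+.
"""

# The same file with the exclusion rule commented out.
EDITED = STOCK.replace("-[?*!@=]", "#-[?*!@=]")

url = "http://example.com/page?id=1&lang=en"
print(accepts(parse_rules(STOCK), url))   # False: the '?' hits the '-' rule
print(accepts(parse_rules(EDITED), url))  # True: only '+.' is left
```

If URLs are still dropped after the rule is commented out, a sketch like this at least helps rule out the regex itself, so the next thing to check is which filter file the running command actually loads.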

Stack OverFlow using parse xml plugin

2009-03-01 Thread Nicolas MARTIN
Hi, I made a JUnit test for the parse xml plugin and I have the following statement: Protocol protocol = new ProtocolFactory(conf).getProtocol(urlString); When running this line, the log is very long and it seems to be in an infinite loop, which leads to a stack overflow even running with

Re: Stack OverFlow using parse xml plugin

2009-03-01 Thread Nicolas MARTIN
To be more precise, I followed the debug trace up to the following instruction in the PluginRepository.get(Configuration conf) method: PluginRepository result = CACHE.get(conf); Cheers, 2009/3/1 Nicolas MARTIN nico.a...@gmail.com Hi, I made a JUnit test for the parse xml plugin and I have the

Exception when crawling

2009-03-01 Thread Tony Wang
I just installed the nightly build (March 1, 2009) on my dedicated server and tried to crawl a single site, but it throws the exception below: Exception in thread "main" java.io.IOException: Job failed! at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1232) at

How do you setup your svn for your nutch code?

2009-03-01 Thread dealmaker
Hi, I am modifying Nutch 0.9 code for my project. Currently, I put all my 0.9 code in my local main trunk. But I know that 1.0 will be out soon, and I want to use the 1.0 code instead in the near future. What is the best way to set up svn to do that? Should I just sync the main trunk from apache server

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread Tony Wang
from my understanding, Nutch 1.0 is already in the latest nightly build. On Sun, Mar 1, 2009 at 5:22 PM, dealmaker vin...@gmail.com wrote: Hi, I am modifying Nutch 0.9 code for my project. Currently, I put all my 0.9 code in my local main trunk. But I know that 1.0 will be out soon, and

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread dealmaker
No, it's not the official 1.0. Even so, there may be a 1.1 in the future. I just want to know how to set up svn for future versions so that it needs minimum maintenance. Thanks. Tony Wang-3 wrote: from my understanding, Nutch 1.0 is already in the latest nightly build. On Sun, Mar 1, 2009 at 5:22

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread Dingding Ye
I have used git-svn to clone the nutch project. I then use a git repo to manage my personal version and do periodic merges with the git version of nutch. On Mon, Mar 2, 2009 at 9:27 AM, dealmaker vin...@gmail.com wrote: no, it's not the official 1.0. Even so, there may be 1.1 in future. I

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread dealmaker
Is there a reason why u need to clone the nutch project and not just do a merge directly from the personal version and the nutch project version online? Dingding Ye wrote: I have used git-svn to clone the nutch project. And then use a git repo to manage personal version and do periodical

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread dealmaker
And also, do u clone the main trunk or, for example, just 0.9? Dingding Ye wrote: I have used git-svn to clone the nutch project. And then use a git repo to manage personal version and do periodical merge with the git version of nutch. On Mon, Mar 2, 2009 at 9:27 AM, dealmaker

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread Dingding Ye
Just personal choice, and I think the branch/merge feature of git is more powerful than svn's. It helps merges go smoothly. What I did before was to clone the main trunk. It should work for 0.9 also. However, if you make rapid changes to the sources, I think none are helpful and you have to solve the

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread dealmaker
need more detail. Do u clone main trunk to your local main trunk, and then create a local branch for personal project, then do merge periodically for your local main trunk which u cloned? Dingding Ye wrote: Just personal choice and i think the branch/merge feature of git is powerful than

Re: How do you setup your svn for your nutch code?

2009-03-01 Thread Dingding Ye
Similar. 1. git-svn clone nutch-trunk. Then create a git project, which is my working project. After that, add the nutch-git repo as a remote repo of this git project: 2. git remote add. Now when you want to update nutch, update nutch-git first. Then update the branch of your

Could not find the main class: admin.

2009-03-01 Thread nutchuser
Dear all, when I try the command bin/nutch admin db -create I get an error message: Exception in thread "main" java.lang.NoClassDefFoundError: admin Caused by: java.lang.ClassNotFoundException: admin at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at

Re: Could not find the main class: admin.

2009-03-01 Thread Alexander Aristov
What do you want to run? Execute the command bin/nutch and you will get a list of all supported commands. Alexander 2009/3/2 nutchu...@sycona.com Dear all, when I try the command bin/nutch admin db -create I get an error message: Exception in thread "main" java.lang.NoClassDefFoundError: admin

Re: Could not find the main class: admin.

2009-03-01 Thread nutchuser
Hi Alexander, I downloaded the nutch tutorial, and it says bin/nutch admin db -create should be used to generate a new, empty db. But I guess that is wrong? I will get another tutorial. Joerg Quoting Alexander Aristov alexander.aris...@gmail.com: What do you want to run? Execute the command

Re: Could not find the main class: admin.

2009-03-01 Thread Bartosz Gadzimski
Hi, the admin command is not valid in 0.9 or in the trunk versions of nutch. You can use this tutorial: http://peterpuwang.googlepages.com/NutchGuideForDummies.htm and many more on the nutch wiki: http://wiki.apache.org/nutch/ nutchu...@sycona.com wrote: Hi Alexander i downloaded the nutch tutorial and