Hi guys,

It is the same problem. In the new Nutch, the folder can contain many
flat files with URLs in them. In the nutch-0.7.1 version, it was a
single file.

We could have a separate tutorial for each of these.

Also, the latest Nutch uses the new crawl implementation, whereas
nutch-0.7.1 uses org.apache.nutch.tools.CrawlTool for crawling.

Hope this helps,
Raghavendra Prabhu

On 1/4/06, Lukas Vlcek <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I found the solution to the original problem at the beginning of this
> mail thread. I am not sure if anybody is still interested in it, but
> here it comes anyway:
>
> The problem is very simple. The current nutch-trunk version requires
> the initial URL list to be stored in a folder. In other words, when
> using the crawl command (as in the following example: "bin/nutch crawl
> urls -dir some_dir -depth n"), urls MUST be a folder and not a flat
> file.
>
> This is one of the (minor) changes made to Nutch when it matured from
> nutch-0.7.x to nutch-0.8. I didn't originally notice this. Again, this
> is a simple issue, but in case anybody is still interested....
>
> Regards,
> Lukas
>
> On 12/23/05, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > > I was struggling with the same problem two days ago. I posted the
> > > problem to the nutch-dev mailing list under the following subject:
> > > "nutch-0.8-dev *mapred.input.subdir* problem"
> >
> > As soon as I find some time over the next days I will try to
> > reproduce the problem.
> > Meanwhile it would be good to know if you guys see that problem with
> > the nightly build and if it also occurs when using a build from the
> > latest sources in Subversion.
> >
> > Stefan
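
For anyone hitting the same error, here is a minimal sketch of the setup described in this thread: the seed URLs go into a flat file inside a directory, and that directory is what gets passed to the crawl command. The file name seeds.txt and the example URL are placeholders, not taken from the thread; only the urls directory name and the crawl command itself come from Lukas's message.

```shell
# Sketch only: prepare a seed *directory* for the nutch-0.8-style crawl command.
# seeds.txt and the URL below are hypothetical placeholders.

SEED_DIR=urls                      # directory name used in the thread's example
SEED_FILE="$SEED_DIR/seeds.txt"    # hypothetical name for a flat file of URLs

# Create the directory and put one URL per line into a flat file inside it.
mkdir -p "$SEED_DIR"
printf '%s\n' 'http://www.example.com/' > "$SEED_FILE"

# Then pass the directory (not the file) to the crawl command from the thread:
#   bin/nutch crawl urls -dir some_dir -depth n
# where some_dir and n are whatever output directory and depth you use.
```

With nutch-0.7.x the same urls argument would have been the flat file itself, which is why an existing 0.7 setup can fail after moving to trunk.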
