Hi guys

It is the same problem.
In the new Nutch, the folder can contain many flat files with URLs in
them.

In the nutch-0.7.1 version, it was a single file.
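
As a rough sketch of the newer layout: the crawl command is pointed at a folder, and the URLs live in one or more flat files inside it. The folder name "urls", the file name "seed.txt", the output folder "crawl_out", the depth of 3, and the example URL are all assumptions made up for illustration. Only the general form "bin/nutch crawl urls -dir some_dir -depth n" comes from this thread.

```shell
# Create the seed folder (hypothetical name "urls").
mkdir -p urls

# Put one URL per line into a flat file inside that folder.
# "seed.txt" and the URL are placeholders, not required names.
printf '%s\n' 'http://lucene.apache.org/nutch/' > urls/seed.txt

# Run the crawl only if a Nutch checkout is present in the current directory.
# "crawl_out" and the depth value are illustrative choices.
if [ -x bin/nutch ]; then
  bin/nutch crawl urls -dir crawl_out -depth 3
fi
```

With nutch-0.7.1 the same "urls" argument would instead name the single flat file itself.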

We could have a separate tutorial for each version.

The latest Nutch also uses a new crawl implementation,

whereas nutch-0.7.1 uses org.apache.nutch.tools.CrawlTool for crawling.

Hope this helps

Raghavendra Prabhu

On 1/4/06, Lukas Vlcek <[EMAIL PROTECTED]> wrote:
>
> Hi,
>
> I found the solution to the original problem at the beginning of this
> mail thread. I am not sure if anybody is still interested in it,
> anyway, here it comes:
>
> The problem is very simple. The current nutch-trunk version requires
> the initial URL list to be stored in a folder. In other words, when using
> the crawl command (as in the following example: "bin/nutch crawl urls -dir
> some_dir -depth n"), "urls" MUST refer to a folder and not to a
> flat file.
>
> This is one of the (minor) changes made to Nutch when it matured from
> nutch-0.7.x to nutch-0.8. I didn't originally notice this. Again, this
> is a simple issue, but in case anybody is still interested....
>
> Regards,
> Lukas
>
> On 12/23/05, Stefan Groschupf <[EMAIL PROTECTED]> wrote:
> > Hi
> >
> > > I was struggling with the same problem two days ago. I posted the problem
> > > to the nutch-dev mailing list under the following subject: "nutch-0.8-dev
> > > *mapred.input.subdir* problem"
> >
> > As soon as I find some time over the next few days, I will try to
> > reproduce the problem.
> > Meanwhile, it would be good to know whether you see the problem with
> > the nightly build, and whether it also occurs when using a build from
> > the latest sources in Subversion.
> >
> > Stefan
> >
>
