Well, my main question is this:

I need the files with the HTML elements removed.

That is what matters to me, more than anything else.

Is this possible?
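
To make concrete what "HTML elements removed" means here, this is a minimal sketch of the kind of output I am after. It assumes plain shell with sed. The helper name strip_tags is hypothetical and is not part of Nutch. It only deletes tags that open and close on the same line, so it is an illustration of the goal, not a real HTML parser.

```shell
# Hypothetical helper (not a Nutch command): remove HTML tags from stdin.
# Assumption: every tag opens and closes on the same line.
strip_tags() {
  # Delete each run of text from '<' up to the next '>'.
  sed -e 's/<[^>]*>//g'
}

# Example: only the text between the tags is kept.
printf '%s\n' '<p>Hello <b>Nutch</b></p>' | strip_tags
```

On a saved page this would be used as `strip_tags < page.html > page.txt`, where both file names are placeholders. Entities such as `&amp;` and the contents of script or style blocks are left in place, which is why the parsed text that the crawler itself produces would be preferable if it can be exported.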

On 21 April 2010 16:38, nachonieto3 <jinietosanc...@gmail.com> wrote:

>
> Thank you very much! I'm working on that now, but I have some more doubts. I'm not
> able to run the readseg command. I've been consulting some help forums, and
> the basic syntax is
> readseg <path of the file with the segments>
> I have the segments in this path:
> D:\nutch-0.9\crawl-20100420112025\segments
> The folder named crawl-20100420112025 is the one where the segments are
> stored. So I'm trying to execute the command in these ways, but none is
> working:
> readseg d/nutch-0.9/crawl-20100420112025/segments
> readseg crawl-20100420112025/segments
> readseg crawl-20100420112025
>
> What am I doing wrong? When I try to execute it I get bash: readseg: command not
> found.
> Any ideas? Thank you in advance.
> --
> View this message in context:
> http://n3.nabble.com/how-to-parse-html-files-while-crawling-tp706816p739953.html
> Sent from the Nutch - User mailing list archive at Nabble.com.
>
