Re: Nutch 2.x parse MajorCode, MinorCode

Julien Nioche Tue, 30 Oct 2012 01:37:41 -0700

Hi

Look at the code for the class ParseStatusCodes. The status simply indicates
that the parsing failed; it is not the cause of the failure itself. Do you get
the entire text of the document, or just what the parser managed to process
before it failed? Did you set the content limit to -1?
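
As a rough illustration of how the two codes from the question could be turned
into readable names, here is a minimal standalone sketch. The class
ParseCodeNames and its describe method are hypothetical helpers, not part of
Nutch, and the numeric values are assumptions written from memory of
ParseStatusCodes; please verify them against the source of your Nutch version.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Nutch: maps parse status codes to names.
public class ParseCodeNames {
    // ASSUMPTION: values recalled from ParseStatusCodes; check your Nutch version.
    private static final Map<Integer, String> MAJOR = new HashMap<>();
    private static final Map<Integer, String> MINOR = new HashMap<>();

    static {
        MAJOR.put(0, "NOTPARSED");
        MAJOR.put(1, "SUCCESS");
        MAJOR.put(2, "FAILED");

        MINOR.put(0, "SUCCESS_OK");
        MINOR.put(100, "SUCCESS_REDIRECT");
        MINOR.put(200, "FAILED_EXCEPTION");
        MINOR.put(202, "FAILED_TRUNCATED");
        MINOR.put(203, "FAILED_INVALID_FORMAT");
        MINOR.put(204, "FAILED_MISSING_PARTS");
        MINOR.put(205, "FAILED_MISSING_CONTENT");
    }

    // Returns "MAJOR / MINOR" names, or UNKNOWN(n) for codes not in the tables.
    public static String describe(int major, int minor) {
        String majorName = MAJOR.getOrDefault(major, "UNKNOWN(" + major + ")");
        String minorName = MINOR.getOrDefault(minor, "UNKNOWN(" + minor + ")");
        return majorName + " / " + minorName;
    }

    public static void main(String[] args) {
        // The two combinations reported in the original question.
        System.out.println(describe(2, 200));
        System.out.println(describe(1, 0));
    }
}
```

If those assumed values hold, a minor code of 200 would only say that an
exception was thrown during parsing; the exception itself should appear in the
logs or in the debugger.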


Thanks

Julien


On 29 October 2012 19:17, kiran chitturi <[email protected]> wrote:

> Hi!
>
> I am debugging Nutch with Eclipse and have found that some PDF files
> which are not successfully parsed have majorCode 2 and minorCode 200,
> while files which are successfully parsed have majorCode 1 and minorCode 0.
>
> Can someone please explain, or point me to, what these codes mean?
>
> Actually, the title, text and everything else are parsed in the failed
> parses, but somehow, because of the codes, the fields are not saved and
> the parse is returned as failed.
>
> Thanks for your help.
>
> Regards,
> --
> Kiran Chitturi
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
http://twitter.com/digitalpebble
