Re: 2.0

John Hewson Tue, 14 Oct 2014 12:06:58 -0700

Unless somebody provides us with a list of those files, I think this is an 
unreasonable request. As long as we continue to leave the old parser in PDFBox, 
we won’t get the bug reports which we need to fix the new parser, and the 
situation will never resolve itself. Falling back to the old parser is just as 
bad - we won’t get bug reports.


-- John

On 14 Oct 2014, at 07:39, Tilman Hausherr <[email protected]> wrote:

> I prefer that the "old" parser not be removed, because there are many files 
> that can only be parsed by the old parser. This came out in a large-scale 
> test with TIKA.
> 
> The best idea (in my current opinion) is to use the nonSeq parser first, and 
> the old parser if there is an exception.
> 
> Tilman
> 
> On 14.10.2014 at 09:45, Timo Boehme wrote:
>> Hi,
>> 
>> On 14.10.2014 at 07:22, John Hewson wrote:
>>> Hi,
>>>> 
>>>>> On 10 October 2014 at 20:05, John Hewson <[email protected]> wrote:
>>>>> 
>>>>> 
>>>>>        - Parsing (Andreas?)
>>>> I guess we won't get a completely new parser in 2.0, but I'll try to
>>>> improve the XRef and the COSStream stuff.
>>> 
>>> It would be great if we could get rid of the old parser and switch to the
>>> non-sequential parser, WDYT?
>> 
>> I would also propose to completely remove the old parser. That way we are 
>> more flexible in parsing streams etc. since parts of the non-sequential 
>> parser are a compromise to work side-by-side with the old parser.
>> Possibly there are a small number of functions for which the old parser is 
>> still needed - e.g. signing?
>> 
>> 
>> Best,
>> Timo
>> 
>> 
>
