On 13 November 2013 08:16, Adrian Mcmenamin <acm...@york.ac.uk> wrote:

>
>
>
> On 10 November 2013 09:36, Adrian Mcmenamin <acm...@york.ac.uk> wrote:
>
>> That was obviously meant to state 2^32, sorry
>>
>>
>> On 10 November 2013 09:34, Adrian Mcmenamin <acm...@york.ac.uk> wrote:
>>
>>> I am parsing a very large file and the parsing seems to fail on a
>>> (bogus) fatal error once I get over about 2^16 lines (on line 4,295,025,275
>>> to be precise). Is there a hard limit on file sizes that can be parsed?
>>>
>>
>>
> I deleted approximately 1 billion lines from the file, while ensuring it
> was still well-formed XML, and this time the parse failed on line
> 4,295,015,171 - so I am very confident there is some sort of overflow bug
> in xerces-c's handling of very large XML files.
>


For what it's worth, the parse fails in exactly the same way (in C++) when
the default handler is used, i.e. when nothing at all is being done with the
parsed content, whereas the Java parser handles the whole file without
complaint. So all the evidence points to a bug somewhere in the C++
implementation.
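
To illustrate the arithmetic behind the overflow suspicion: both failing line
numbers sit just above 2^32 = 4,294,967,296. The sketch below shows what a
32-bit unsigned counter would hold at those points. This is only an
assumption about the kind of bug involved; I have not confirmed that
xerces-c actually keeps its line count in a 32-bit type, and
wrapped_line_count is a made-up helper, not part of any library.

```cpp
#include <cassert>
#include <cstdint>

// 2^32, the boundary both failures sit just above.
constexpr std::uint64_t kTwoPow32 = std::uint64_t{1} << 32;

// Hypothetical helper (not a xerces-c function): the value a 32-bit
// unsigned line counter would hold after `lines` increments. Unsigned
// conversion keeps only the low 32 bits, i.e. the count modulo 2^32.
constexpr std::uint32_t wrapped_line_count(std::uint64_t lines) {
    return static_cast<std::uint32_t>(lines);
}

// Line numbers at which the two parses failed, as reported in this thread.
constexpr std::uint64_t kFirstFailure  = 4295025275ULL;
constexpr std::uint64_t kSecondFailure = 4295015171ULL;

static_assert(kTwoPow32 == 4294967296ULL, "2^32");
static_assert(kFirstFailure > kTwoPow32, "first failure is past 2^32");
static_assert(kSecondFailure > kTwoPow32, "second failure is past 2^32");
```

With a counter like that, the two failures would correspond to wrapped values
of 57,979 and 47,875, so neither falls exactly on the 2^32 boundary. That is
consistent with an overflow somewhere in the parser, but it does not by
itself say which counter or offset is overflowing.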
