On 13 November 2013 08:16, Adrian Mcmenamin <acm...@york.ac.uk> wrote:
> On 10 November 2013 09:36, Adrian Mcmenamin <acm...@york.ac.uk> wrote:
>
>> That was obviously meant to state 2^32, sorry
>>
>> On 10 November 2013 09:34, Adrian Mcmenamin <acm...@york.ac.uk> wrote:
>>
>>> I am parsing a very large file and the parsing seems to fail on a
>>> (bogus) fatal error once I get over about 2^16 lines (on line 4,295,025,275
>>> to be precise). Is there a hard limit on file sizes that can be parsed?
>
> I deleted approximately 1 billion lines from the file, while ensuring it
> was still well-formed XML, and this time the parse failed on line
> 4,295,015,171 - so I am very confident there is some sort of overflow bug
> in xerces-c's handling of very large XML files.

For what it's worth, the parse fails in exactly the same way (in C++) when the default handler is used, i.e. when nothing at all is being done, but the Java parser can happily handle the whole file. So all the evidence points to a bug somewhere in the C++ implementation.