Yes, Saxon loads all documents into memory and then processes them.

VXQuery loads enough documents to fill a frame and then passes them on to
the next operator.
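
To make the difference concrete, here is a rough Python sketch of the two strategies. It is not actual Saxon or VXQuery code: the function names are made up for illustration, and measuring a frame in characters is an assumption to keep the example small.

```python
from typing import Callable, Iterable, Iterator, List


def load_all_then_process(docs: Iterable[str],
                          process: Callable[[str], int]) -> List[int]:
    # Load-everything style: the whole collection is held in memory
    # before any document is processed.
    loaded = list(docs)
    return [process(d) for d in loaded]


def frames(docs: Iterable[str], frame_size: int) -> Iterator[List[str]]:
    # Group documents into frames of at most frame_size characters.
    # A single document larger than frame_size gets a frame of its own.
    frame: List[str] = []
    used = 0
    for d in docs:
        if frame and used + len(d) > frame_size:
            yield frame
            frame, used = [], 0
        frame.append(d)
        used += len(d)
    if frame:
        yield frame


def process_by_frame(docs: Iterable[str],
                     process: Callable[[str], int],
                     frame_size: int) -> List[int]:
    # Frame-at-a-time style: only one frame of documents is held at once;
    # each full frame is handed to the next step before more are read.
    results: List[int] = []
    for frame in frames(docs, frame_size):
        results.extend(process(d) for d in frame)
    return results
```

Both functions return the same results; the difference is that the second one never holds more than one frame of documents at a time.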


On Thu, Feb 13, 2014 at 10:27 AM, Till Westmann <[email protected]> wrote:

> One more question about this: We're querying a collection of documents,
> right? So if Saxon runs out of memory, does that mean that it first loads
> all the documents in the collection into memory and keeps them there?
>
> Thanks,
> Till
>
> On Feb 12, 2014, at 9:31 PM, Till Westmann <[email protected]> wrote:
>
> > Right, I forgot about that.
> >
> > Thanks,
> > Till
> >
> > On Feb 12, 2014, at 12:23 PM, Eldon Carman <[email protected]> wrote:
> >
> >> They have a version that supports streams to handle larger files. It's just
> >> not the free version.
> >>
> >>
> >> On Tue, Feb 11, 2014 at 11:59 PM, Till Westmann <[email protected]> wrote:
> >>
> >>> Hi Preston,
> >>>
> >>> do you have indications that this is a limitation of just the free version?
> >>> I think that it wouldn't be completely surprising to see a big memory blow-up.
> >>> Assuming that the XML file is in single-byte UTF-8 (which I think it is)
> >>> and that the text is stored in 2-byte UTF-16 characters in the JVM, we
> >>> already have a factor of 2. And then there are probably a number of objects
> >>> and references that take up additional memory. So it might be that all
> >>> versions of Saxon take up a lot of space in memory. But of course it is
> >>> also possible that the commercial version uses a more memory-efficient
> >>> representation.
> >>>
> >>> Cheers,
> >>> Till
> >>>
> >>> On Feb 11, 2014, at 8:07 PM, Eldon Carman <[email protected]> wrote:
> >>>
> >>>> In testing larger dataset sizes, Saxon has run into a memory limitation. A
> >>>> dataset of 2.21 GB could not be queried by Saxon. Even with the Java heap
> >>>> size set larger than the dataset, the application throws an error:
> >>>> "Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
> >>>> exceeded". Just to confirm, I used the following settings:
> >>>> JAVA_OPTS="-Xmx12g -Xms12g"
> >>>>
> >>>> Several internet posts suggest allocating 5 times as much memory as the
> >>>> XML data size as a rule of thumb, though that is not guaranteed to work.
> >>>> Some of my tests have worked with datasets up to 460 MB (which happens to
> >>>> be my tiny dataset size). I guess we have now confirmed the memory
> >>>> limitation of the free version of Saxon.
> >>>
> >>>
> >
>
>
