Then to pass the XQuery test suite you probably use CHOP=OFF.
Are there other settings needed to be compliant?
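
For anyone reading along, here is a minimal sketch of what that setting might look like in the .basex configuration file. CHOP is the option named in this thread. Placing it under a "# Local Options" comment is an assumption about the file layout, so it is worth checking against the Configuration page of the BaseX documentation.

```
# Local Options
# Assumption: local options such as CHOP are listed below this comment
CHOP = false
```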

On Wednesday, 17 February 2021 00:04:38 CET Christian Grün wrote:
> Yes, you are certainly right. I think it was around 2007 when we began to chop
> whitespace by default, although we knew it didn't comply with the
> specification. One reason was that we rarely worked with mixed-content data
> at that time, and the whitespace indentations increased the size of
> databases and led to worse rendering results in the built-in visualizations
> (our first users were confused about that).
> 
> Maybe we’ll switch the default in a future version of BaseX.
> 
> 
> 
> 
> Jos van den Oever <j...@vandenoever.info> wrote on Tue, 16 Feb 2021, 23:36:
> > Thanks for the context.
> > 
> > Still, it does not explain the difference in behavior between doc() and
> > parse-xml().
> > 
> > As far as I understand the XDM specification, whitespace may be ignored by
> > the parser if there is a DTD or XML Schema that says that an element is not
> > PCDATA (DTD) or mixed (XML Schema). In the absence of (support for) schemas,
> > all whitespace should be left in. Wendell Piez has written about this in
> > detail.
> > 
> > Whitespace in XML is tricky. E.g. indenting XML cannot be done well without
> > knowing which elements are PCDATA/mixed.
> > 
> > Now that I know about the CHOP option, I can use BaseX predictably. And the
> > legacy reasons for keeping it set are understandable.
> > 
> > Best regards,
> > Jos
> > 
> > On Tuesday, 16 February 2021 23:10:05 CET Christian Grün wrote:
> > > There is an old (and still open) issue on GitHub [1] that might give you
> > > some more insight into the history of whitespace chopping in BaseX.
> > > 
> > > Hope this helps
> > > Christian
> > > 
> > > [1] https://github.com/BaseXdb/basex/issues/913
> > > 
> > > 
> > > 
> > > 
> > > Jos van den Oever <j...@vandenoever.info> wrote on Tue, 16 Feb 2021, 22:41:
> > > > Hi Christian,
> > > > 
> > > > Yes, writing 'CHOP=OFF' in .basex stops the vanishing of whitespace.
> > > > 
> > > > But where in the XQuery or XDM spec does it say that whitespace handling
> > > > when parsing is implementation-dependent?
> > > > 
> > > > Cheers,
> > > > Jos
> > > > 
> > > > On Tuesday, 16 February 2021 22:10:30 CET Christian Grün wrote:
> > > > > Hi Jos,
> > > > > 
> > > > > Whitespace will be preserved if the CHOP option is disabled. You can
> > > > > make this the default by adding CHOP=false to your .basex
> > > > > configuration file [1,2].
> > > > 
> > > > > Hope this helps,
> > > > > Christian
> > > > > 
> > > > > [1] https://docs.basex.org/wiki/Full-Text#Mixed_Content
> > > > > [2] https://docs.basex.org/wiki/Configuration
> > > > > 
> > > > > 
> > > > > 
> > > > > 
> > > > > Jos van den Oever <j...@vandenoever.info> wrote on Tue, 16 Feb 2021, 22:00:
> > > > > > Dear all,
> > > > > > 
> > > > > > First off: BaseX is great to work with. I use it for a few
> > > > > > statically generated websites.
> > > > > > 
> > > > > > But I recently found what might be a bug.
> > > > > > 
> > > > > > Some whitespace vanishes when loading XML files. E.g. this XML file:
> > > > > > ```test.xml
> > > > > > <a> a b <a> c </a> d e </a>
> > > > > > ```
> > > > > > 
> > > > > > run like this:
> > > > > > 
> > > > > > doc('test.xml')
> > > > > > 
> > > > > > gives:
> > > > > > 
> > > > > > <a>a b<a>c</a>d e</a>
> > > > > > 
> > > > > > But running this:
> > > > > > 
> > > > > > ```
> > > > > > parse-xml('<a> a b <a> c </a> d e </a>')
> > > > > > ```
> > > > > > 
> > > > > > retains the whitespace.
> > > > > > 
> > > > > > I've tested this with BaseX 7.0, 8.0, 9.0 and 9.4.6.
> > > > > > 
> > > > > > Running this in saxon-he-10.3.jar retains the whitespace.
> > > > > > 
> > > > > > I can work around this issue by placing xml:space="preserve" in
> > > > > > the document element.
> > > > > > 
> > > > > > I cannot come up with a scenario in which discarding whitespace
> > > > > > during parsing is ok when no DTD or XML Schema is provided.
> > > > > > 
> > > > > > Best regards,
> > > > > > Jos
