Re: [basex-talk] Passing through entities unchanged when serializing
Ha ha, awesome Liam! Thank you for clarifying!

Best,
Bridger

On Mon, Sep 9, 2019 at 9:37 PM Liam R. E. Quin wrote:
> On Tue, 2019-09-10 at 02:59 +0200, Andreas Mixich wrote:
> > I wonder why the serialization behaves that way. It does not make
> > sense to me. If a user has the need to escape XML, it should be
> > thorough, shouldn't it?
>
> XML entities are expanded by the XML parser, so by the time XQuery (or
> XSLT) sees the document they are gone.
>
> Consider an entity like
> "blackgreySteven on>">
>
> It'd be really complex to have that visible to XPath and to have to
> write, e.g.
> /students/entity(*)/person
>
> If it's an external parsed entity it's visible in that the base-uri
> property changes, but that's all.
>
> Character entities like (ŗ) are just special cases of
> general entities, and XML does not distinguish them. I wish it did, but
> we never got back to that work after publishing XML 1.0.
>
> Liam
Re: [basex-talk] Passing through entities unchanged when serializing
On Tue, Sep 10, 2019 at 3:37 AM Liam R. E. Quin wrote:
> XML entities are expanded by the XML parser, so by the time XQuery (or
> XSLT) sees the document they are gone.

Ah, yes, I totally forgot about that! Thanks for the clarification!
Re: [basex-talk] Passing through entities unchanged when serializing
On Tue, 2019-09-10 at 02:59 +0200, Andreas Mixich wrote:
> I wonder why the serialization behaves that way. It does not make
> sense to me. If a user has the need to escape XML, it should be
> thorough, shouldn't it?

XML entities are expanded by the XML parser, so by the time XQuery (or
XSLT) sees the document they are gone.

Consider an entity like
blackgreySteven">

It'd be really complex to have that visible to XPath and to have to
write, e.g.
/students/entity(*)/person

If it's an external parsed entity it's visible in that the base-uri
property changes, but that's all.

Character entities like (ŗ) are just special cases of general entities,
and XML does not distinguish them. I wish it did, but we never got back
to that work after publishing XML 1.0.

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Web slave for vintage clipart http://www.fromoldbooks.org/
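To make the point about parse-time expansion concrete, here is a minimal sketch. It is written in Python with the standard library's `xml.etree.ElementTree` rather than XQuery, purely as an illustration; the entity name `who`, its replacement text, and the `students`/`person` document are made up for the example and are not from the thread. It only shows that, after parsing, the data model holds the expanded text and no trace of the entity reference.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the entity "who" and the element names are
# invented for this sketch.
SOURCE = """<!DOCTYPE students [
  <!ENTITY who "Steven">
]>
<students><person>&who; &#343;</person></students>"""


def parsed_text(xml_source: str) -> str:
    """Parse the document and return the text of the first <person>."""
    root = ET.fromstring(xml_source)
    # The parser has already replaced &who; and &#343; with their
    # expansions; the tree only contains plain characters.
    return root.find("person").text


if __name__ == "__main__":
    print(parsed_text(SOURCE))
```

Both the general entity and the numeric character reference arrive in the tree as ordinary characters, so a query language working on that tree has nothing left to "pass through".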
Re: [basex-talk] Passing through entities unchanged when serializing
Hi Andreas -

I'm not sure (way outside of my wheelhouse :), but I think it's because
arbitrary serialization can generate invalid XML, so requiring a
character map makes the possible invalidity explicit? Now that I've
typed that, I'm not sure if that captures the rationale or not. :)

In any case, here's what the specifications have to say[1].

Best,
Bridger

[1] https://www.w3.org/TR/xslt-xquery-serialization-31/#character-maps

On Mon, Sep 9, 2019 at 9:00 PM Andreas Mixich wrote:
> I wonder why the serialization behaves that way. It does not make
> sense to me. If a user has the need to escape XML, it should be
> thorough, shouldn't it?
>
> On Mon, Sep 9, 2019 at 10:47 PM Liam R. E. Quin wrote:
>> On Mon, 2019-09-09 at 15:04 +0200, Andreas Mixich wrote:
>> > when serializing a string, that contains literal XML with entities,
>> > how do I pass through those entities unchanged?
>>
>> One way is to use a character map, as Bridger Dyson-Smith described.
>>
>> Sometimes another way can be to have a version of the DTD in which the
>> replacement text of the entity marks the presence of the entity, e.g.
>>
>> but this will affect full-text searching of course.
>>
>> Liam
Re: [basex-talk] Passing through entities unchanged when serializing
I wonder why the serialization behaves that way. It does not make sense
to me. If a user has the need to escape XML, it should be thorough,
shouldn't it?

On Mon, Sep 9, 2019 at 10:47 PM Liam R. E. Quin wrote:
> On Mon, 2019-09-09 at 15:04 +0200, Andreas Mixich wrote:
> > when serializing a string, that contains literal XML with entities,
> > how do I pass through those entities unchanged?
>
> One way is to use a character map, as Bridger Dyson-Smith described.
>
> Sometimes another way can be to have a version of the DTD in which the
> replacement text of the entity marks the presence of the entity, e.g.
>
> but this will affect full-text searching of course.
>
> Liam

-- 
Minden jót, all the best, Alles Gute,
Andreas Mixich
Re: [basex-talk] Passing through entities unchanged when serializing
On Mon, 2019-09-09 at 15:04 +0200, Andreas Mixich wrote:
> when serializing a string, that contains literal XML with entities,
> how do I pass through those entities unchanged?

One way is to use a character map, as Bridger Dyson-Smith described.

Sometimes another way can be to have a version of the DTD in which the
replacement text of the entity marks the presence of the entity, e.g.

but this will affect full-text searching of course.

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Webslave for old illustrations http://www.fromoldbooks.org/
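The DTD example in the message above did not survive in the archive, so the following is only a guess at the general shape of the idea, sketched in Python for illustration. Everything specific here is an assumption: the `[[entity:NAME]]` marker syntax, the `nbsp` entity, and the helper name `restore_entities` are all hypothetical. The alternate DTD gives each entity a replacement text that records which entity was used, and a post-processing step turns the markers back into references on output.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical "marking" DTD: the replacement text of nbsp is a marker
# string instead of the real character. The marker syntax is invented.
MARKED = """<!DOCTYPE p [
  <!ENTITY nbsp "[[entity:nbsp]]">
]>
<p>Lorem&nbsp;ipsum</p>"""

MARKER = re.compile(r"\[\[entity:(\w+)\]\]")


def restore_entities(xml_source: str) -> str:
    """Parse with the marking DTD, then turn markers back into references."""
    text = ET.fromstring(xml_source).text
    # After parsing, the text contains the marker, not the entity.
    return MARKER.sub(r"&\1;", text)


if __name__ == "__main__":
    print(restore_entities(MARKED))
```

The marker is ordinary text inside the parsed document, which is why this approach interferes with full-text searching: a search for "Lorem ipsum" no longer matches the stored string.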
Re: [basex-talk] Passing through entities unchanged when serializing
Hi Andreas -

Have you tried using different serialization options? I.e.,

serialize.xq:
```
declare option output:method "xml";
declare option output:parameter-document "map.xml";
declare variable $input := "Lorem ipsum, dolor sit amet.";
serialize($input)
```

map.xml:
```
http://www.w3.org/2010/xslt-xquery-serialization;>
```

When run in the BaseX GUI, I get:
`lt;pgt;Lorem ipsum, dolor sit amet.lt;/pgt;`, which might be closer?

I think you might have been experiencing the default 'basex'
serialization option (see [1] for more).

Hope that helps.

Best,
Bridger

[1] http://docs.basex.org/wiki/Serialization

On Mon, Sep 9, 2019 at 9:05 AM Andreas Mixich wrote:
> Hi,
>
> when serializing a string, that contains literal XML with entities, how do
> I pass through those entities unchanged?
> Example:
>
> let $input := "Lorem ipsum dolor sit amet "
> return serialize($input)
>
> results in:
>
> pLorem ipsum dolor sit amet, ' consectetur adipisicing
> elit./p
>
> but I want:
>
> pLorem ipsum dolor sit amet, consectetur adipisicing
> elit./p
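Since the contents of map.xml were lost in the archive, here is a small sketch of what a character map does conceptually, written in Python for illustration only. It is not BaseX's implementation, and the mapping from U+00A0 to `&nbsp;` is a hypothetical choice, not the one used in the message above. The idea it shows: a character listed in the map is replaced by its string, and that string is written verbatim, while every other character goes through normal XML escaping.

```python
from xml.sax.saxutils import escape

# Hypothetical character map: non-breaking space is written as &nbsp;.
CHARACTER_MAP = {"\u00a0": "&nbsp;"}


def serialize_text(text: str, character_map: dict) -> str:
    """Escape text for XML output, except for characters in the map."""
    out = []
    for ch in text:
        if ch in character_map:
            # Mapped replacement is emitted as-is, with no further escaping.
            out.append(character_map[ch])
        else:
            # Everything else gets the usual &, <, > escaping.
            out.append(escape(ch))
    return "".join(out)


if __name__ == "__main__":
    print(serialize_text("<p>a\u00a0b & c</p>", CHARACTER_MAP))
```

Because the replacement string bypasses escaping, the output can contain a reference such as `&nbsp;` that no DTD declares, which is one way to read the remark elsewhere in this thread that a character map makes possible invalidity explicit.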
[basex-talk] Passing through entities unchanged when serializing
Hi,

when serializing a string, that contains literal XML with entities, how
do I pass through those entities unchanged?

Example:

let $input := "Lorem ipsum dolor sit amet "
return serialize($input)

results in:

pLorem ipsum dolor sit amet, ' consectetur adipisicing elit./p

but I want:

pLorem ipsum dolor sit amet, consectetur adipisicing elit./p

-- 
Minden jót, all the best, Alles Gute,
Andreas Mixich