Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Christian Grün
Hi there,

To begin with: I forgot to mention that you can change the
serialization method by launching the command "set serializer
method=adaptive" in the command input text field on top of the BaseX
window.
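
A per-query alternative, as a minimal sketch: standard XQuery 3.1 lets a query declare its serialization method in the prolog, so the setting travels with the query rather than with the session. The sample sequence is purely illustrative, and whether a given front end honors the declaration for its result display is an assumption worth checking.

```xquery
xquery version "3.1";

(: Bind the standard serialization namespace so that output:* options can be declared. :)
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Request the adaptive method for the result of this query only. :)
declare option output:method "adaptive";

(: Illustrative items: a string, an integer, and a map. :)
('a string', 42, map { 'key': 'value' })
```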

I agree this is not very convenient, so I have now added a new
interaction component for changing all serialization parameters in the
GUI. I have decided to move the input components to the Preferences
dialog (Ctrl-Shift-P, Visualization panel), as I’d like to keep the
main interface clean. If more people ask for it, I might add a
dropdown menu for the serialization method on top of the Result View,
let’s see.

The new component for adjusting the serialization parameters is also
available now in the Export Database dialog. A new stable snapshot is
available [1], and the next minor release will be available around the
end of August.

Looking forward to your feedback,
Christian

[1] http://files.basex.org/releases/latest/





Re: [basex-talk] Shouldn't CHOP = false make xml:space="preserve" the default behavior?

2017-08-10 Thread Christian Grün
I agree that it might be reasonable to introduce different defaults
for WebDAV communication. Problems could arise if documents are opened
with WebDAV that have been stored via REST or another API… But we
could give it a try.




Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Christian Grün
Hi Joe,

> Have you considered adding a preference or toggle for selecting the default
> serialization method used in the GUI's results?

Sounds like an enticing idea! Something similar is embedded in our
Database Export dialog (see the menu items 'Database', 'Export…'). I
haven’t touched it for years, and it could surely be revised as well.
I will definitely think about adding something like this to our
Result View [1].

> From my perspective in teaching XQuery, showing an
> xs:string item in quotes (and integers sans quotes) helps reinforce the
> concept of data types.

This is a good thought indeed.

> Besides string handling, though, are
> there other aspects of "adaptive" that you dislike compared to the default
> "basex" method?

I would say that both methods (now) serve different purposes:

• Our 'basex' method was included because BaseX is used in many
different contexts, and we were looking for a single serialization
method that can be used for as many use cases as possible at the same
time. If BaseX is used on the command line, it can be convenient if the
textual output (usually XML, strings, numbers) can directly be passed
on to other commands, or saved in text files. If the GUI is used, text
from the result view can be copied and pasted to other tools (such as
CSV output, which can be pasted in Excel, etc.).

• The 'adaptive' method simplifies the reuse of results in other XQuery
expressions. I agree it also helps users to understand the differences
between data types. I find it a bit confusing, however, that some
items will be output with a constructor function, whereas others will
simply be output as strings. Some examples:

  1,
  xs:double(1),
  'a"b',
  xs:anyURI('a"b'),
  xs:QName('xml:x'),
  /@a

…will be serialized as…

  1
  1
  "a""b"
  xs:anyURI("a""b")
  Q{http://www.w3.org/XML/1998/namespace}x
  a="b"

It would probably have been more consistent to create output that can
always be reused, and that always contains the datatype:

  xs:integer(1)
  xs:double(1)
  xs:string("a""b")
  xs:anyURI("a""b")
  xs:QName("xml:x")
  attribute a { "b" }

Well, maybe the type could have been omitted for xs:integer and
xs:string, but as constructors are added for many types, I believe
that any ambiguities should have been avoided.

There are surely many things that would need to be considered (for
example, a namespace of a prefix might not be declared; anonymous
functions could only be re-used if the full function body was
serialized as well; etc).
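
To compare the two methods side by side, a small sketch along these lines may help. It assumes a BaseX processor, since 'basex' is a BaseX-specific method name that other processors will reject; the attribute item from the examples is left out because it needs a context document. No output is shown here, as it can differ between versions.

```xquery
xquery version "3.1";

(: Items taken from the examples above, minus the attribute node. :)
let $items := (1, xs:double(1), 'a"b', xs:anyURI('a"b'), xs:QName('xml:x'))
(: 'adaptive' is defined by the W3C spec; 'basex' is the custom BaseX method. :)
for $method in ('adaptive', 'basex')
return (
  "METHOD: " || $method,
  serialize($items, map { 'method': $method })
)
```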

Just my two cents,
Christian

[1] https://github.com/BaseXdb/basex/issues/1484


Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Joe Wicentowski
Hi Christian,

I actually quite like the adaptive serialization method and have made it
the default in eXide.  From my perspective in teaching XQuery, showing an
xs:string item in quotes (and integers sans quotes) helps reinforce the
concept of data types.  It feels to me like a datatype-sensitive view of
results, rather than a debug method.  Besides string handling, though, are
there other aspects of "adaptive" that you dislike compared to the default
"basex" method?

Have you considered adding a preference or toggle for selecting the default
serialization method used in the GUI's results?  For
comparison/inspiration, you might see the screenshots of the serialization
method dropdown and indent checkbox I added to eXide:
https://github.com/wolfgangmm/eXide/pull/168#issuecomment-307592370 - which
in my mind makes eXide quite a powerful tool for quickly experimenting with
different serializations of results - a particularly time-saving feature
given how verbose the boilerplate is for specifying serialization methods.

Joe


Re: [basex-talk] Shouldn't CHOP = false make xml:space="preserve" the default behavior?

2017-08-10 Thread Andy Bunce
It seems globally setting `indent=no` gets applied to WebDAV (and
everywhere else where serialization is not explicitly specified). This would
be my preference for WebDAV, as it means documents can be round-tripped
without any changes being introduced. The only side effect from this
setting I have seen is view-source on generated html source is harder to
read, but this is not a real issue.

I have not tried setting them in web.xml yet. I wondered if you would
expect it to work :-).
I will try...

Cheers
/Andy



Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Christian Grün
Hi Joe,

Thanks for the link. So I noticed that you were quoting exactly the same
phrase of the spec as I did. ;)

I just checked what Saxon does: It seems to ignore the value of the indent
parameter when serializing arrays with the adaptive method.

So I guess that every implementation of XQuery 3.1 serializes arrays
slightly differently, and the spec is probably too fuzzy to give a more
precise answer.

In general, I would have been happy if the adaptive method had been renamed
to 'debug', and if another method had been added to the spec that is
similar to our custom 'basex' method (which allows users to serialize all
items – including maps, arrays and attributes – in a flavor that does not
look like debugging output). In fact the initial version of the 'adaptive'
method was more similar to ours (for example, strings were output without
quotes). It changed a lot over time, and we eventually decided to
include a custom method.

Well, it’s easy to ask for new features, and much more demanding to write
specifications that satisfy everyone.

Christian




On 10 Aug 2017 at 9:52 PM, "Joe Wicentowski" wrote:

Hi Christian,

Thanks for your reply.  I agree that the spec is not entirely clear here,
but my understanding of the spec was based on the interpretations advanced
by Michael Kay and Liam Quin on this xquery-talk thread about the question
of indentation under the adaptive method:

  http://markmail.org/message/dixi7e7qq2ttde74

Joe


Re: [basex-talk] Shouldn't CHOP = false make xml:space="preserve" the default behavior?

2017-08-10 Thread Christian Grün
Hi Andy,

> Can the WebDAV serialization be set independently of the default, in
> web.xml?

The defaults for whitespace chopping and serialization can only be
assigned globally for all features of BaseX. Did you try to set both
'org.basex.chop' and 'org.basex.serializer' in web.xml / does it
introduce other unwanted side effects?
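
For reference, a sketch of what such entries might look like in web.xml, assuming the usual servlet context-param syntax with the 'org.basex.' prefix. The values shown (no chopping, no indentation) are only the ones discussed in this thread, not recommended defaults.

```xml
<!-- Sketch only: place these entries inside the <web-app> element of web.xml. -->
<context-param>
  <!-- Keep whitespace-only text nodes when documents are parsed (CHOP = false). -->
  <param-name>org.basex.chop</param-name>
  <param-value>false</param-value>
</context-param>
<context-param>
  <!-- Global serialization parameters; here indentation is switched off. -->
  <param-name>org.basex.serializer</param-name>
  <param-value>indent=no</param-value>
</context-param>
```

As noted above, both settings apply globally, so REST, RESTXQ and WebDAV would all be affected.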

Cheers,
Christian




Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Christian Grün
Dear Joe,

Thanks for the kind feedback. I am glad to hear BaseX was useful in
your DH 2017 workshops.

> the serialization spec notes that the adaptive method delegates the handling
> of the "indent" parameter to JSON.

Could you possibly point me to this rule in the spec? I remember there
was a lot of discussion about the adaptive serialization method in the
W3C Working Group. As it was difficult to define rules that cover the
requirements of all members, the initial version differs quite a lot
from the final proposal, and various details were left to the
implementation (because it was assumed that the method will mostly be
used for debugging). I looked up the final version of the serialization
spec [1], which states in 10.1.4 that:

  “The indent and suppress-indentation parameters are
  not directly applicable to the Adaptive output method.”

In BaseX, the parameter is indeed considered when serializing maps and
arrays (and other data types as well), but there are various
differences between the two serialization methods. Consider the
following example (which should also work with other XQuery
processors):

  xquery version "3.1";
  for $method in ('adaptive', 'json')
  return (
    "METHOD: " || $method,
    "OUTPUT: " || (
      try {
        serialize(
          map { 'functions': [ false#0, true#0 ]},
          map { 'method': $method }
        )
      } catch * {
        $err:description
      }
    )
  )

The adaptive method can be used to serialize items of any type, whereas the
json method is restricted to types that can be represented in JSON.

Does this help?
Christian

[1] https://www.w3.org/TR/xslt-xquery-serialization-31/#ADAPTIVE_INDENT



On Thu, Aug 10, 2017 at 4:35 PM, Joe Wicentowski  wrote:
> Hi all,
>
> First, I'm just back from DH2017, where Clifford Anderson and I taught two
> workshops on XQuery using BaseX, along with eXist and Saxon.  BaseX
> performed like a champ.  We were able to configure the GUI window to show
> just the query and results windows—perfect when you're projecting the screen
> in a large room and want everyone to see.  Many thanks for such a great
> teaching tool!  (Our materials are at
> https://github.com/CliffordAnderson/XQuery4Humanists.)
>
> Back to the topic of this post, though, I noticed a slight difference
> between BaseX's serialization of arrays when using JSON vs. adaptive
> methods: with JSON, the array's items are separated by newlines, whereas
> with adaptive, the items are separated by spaces.  This is interesting since
> the serialization spec notes that the adaptive method delegates the handling
> of the "indent" parameter to JSON.  Some code to reproduce this is below.
>
> I'm curious to know - is there a particular reason for this difference?
>
> Thanks,
> Joe
>
>
> serialization-test.xq
> ```xquery
> xquery version "3.1";
>
> declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";
> let $array := ["Cheapside","London","Dean Prior","Devon"]
> for $method in ("json", "adaptive")
> let $serialization-parameters :=
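>   <output:serialization-parameters>
>     <output:method>{$method}</output:method>
>     <output:indent>yes</output:indent>
>   </output:serialization-parameters>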
> return
>   fn:serialize($array, $serialization-parameters)
> ```
>
> serialization-test_results.txt
> ```txt
> [
>   "Cheapside",
>   "London",
>   "Dean Prior",
>   "Devon"
> ]
> ["Cheapside", "London", "Dean Prior", "Devon"]
> ```


Re: [basex-talk] Shouldn't CHOP = false make xml:space="preserve" the default behavior?

2017-08-10 Thread Andy Bunce
>But it may definitely cause undesirable output if a document contains no
superfluous whitespaces,

One situation where the default serialization indentation can be
problematic is WebDAV.
Can the WebDAV serialization be set independently of the default, in
web.xml?

/Andy

On 7 August 2017 at 09:57, Christian Grün  wrote:

> Dear Ottid,
>
> Thanks for providing us with the helpful example, which helped me to
> understand the problem.
>
> >> replace /a foo bar
> > "a.xml" (Line 1): Open quote is expected for attribute "xml:space"
> > associated with an  element type  "root".
>
> Just a side note: Command-line parsing is restrictive when it comes to
> replacing XML. The reason is that it is possible to send multiple
> commands in a single line, as shown in the following example:
>
>   create db db; replace /a ; xquery .
>
>
> >> xquery /root
> > foo bar
>
> You may be surprised to hear that whitespaces in your document were
> actually chopped, and that the whitespaces are added by the
> serializer, because the "indent" serialization parameter is by default
> set to "yes".
>
> I was surprised to see that no one else had pointed this out so far, and
> that it was not mentioned in our documentation, so I have just added some
> explanatory lines [1,2].
>
> Some more technical background:
>
> If you call the BaseX "info storage" command, you will see which XML
> nodes are stored in the document:
>
> > set chop on;create db db  ; info storage
> CHOP: true
> Database 'db' created in 11.0 ms.
> PRE  DIS  SIZ  ATS  ID  NS  KIND  CONTENT
> -----------------------------------------
>   0    1    2    1   0   0  DOC   db.xml
>   1    1    1    1   1   0  ELEM  a
>
> > set chop off;create db db  ; info storage
> CHOP: false
> Database 'db' created in 20.12 ms.
> PRE  DIS  SIZ  ATS  ID  NS  KIND  CONTENT
> -----------------------------------------
>   0    1    3    1   0   0  DOC   db.xml
>   1    1    2    1   1   0  ELEM  a
>   2    1    1    1   2   0  TEXT
>
> Serialization indentation was chosen as the default because it goes hand
> in hand with the CHOP option. It even works fine if CHOP is disabled
> and a document has whitespaces included (in which case no whitespaces
> will be added by the serializer). But it may definitely cause
> undesirable output if a document contains no superfluous whitespaces,
> such as in your case.
>
> Hope this helps,
> Christian
>
> [1] http://docs.basex.org/wiki/Options#CHOP
> [2] http://docs.basex.org/wiki/Full-Text#Mixed_Content
>


Re: [basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Giuseppe Celano
Hi Joe,

I am happy to hear you are also spreading the word! XQuery has a very clean
data model, and BaseX has implemented and extended the language so efficiently
and elegantly.

Best,
Giuseppe 

Universität Leipzig
Institute of Computer Science, Digital Humanities
Augustusplatz 10
04109 Leipzig
Deutschland
E-mail: cel...@informatik.uni-leipzig.de
E-mail: giuseppegacel...@gmail.com
Web site 1: http://www.dh.uni-leipzig.de/wo/team/
Web site 2: https://sites.google.com/site/giuseppegacelano/

> On 10 Aug 2017, at 16:35, Joe Wicentowski  wrote:
> 
> Hi all,
> 
> First, I'm just back from DH2017, where Clifford Anderson and I taught two 
> workshops on XQuery using BaseX, along with eXist and Saxon.  BaseX performed 
> like a champ.  We were able to configure the GUI window to show just the 
> query and results windows—perfect when you're projecting the screen in a 
> large room and want everyone to see.  Many thanks for such a great teaching 
> tool!  (Our materials are at 
> https://github.com/CliffordAnderson/XQuery4Humanists.)
> 
> Back to the topic of this post, though, I noticed a slight difference between 
> BaseX's serialization of arrays when using JSON vs. adaptive methods: with 
> JSON, the array's items are separated by newlines, whereas with adaptive, the 
> items are separated by spaces.  This is interesting since the serialization 
> spec notes that the adaptive method delegates the handling of the "indent" 
> parameter to JSON.  Some code to reproduce this is below.
> 
> I'm curious to know - is there a particular reason for this difference?
> 
> Thanks,
> Joe
> 
> 
> serialization-test.xq
> ```xquery
> xquery version "3.1";
> 
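> declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";
> let $array := ["Cheapside","London","Dean Prior","Devon"]
> for $method in ("json", "adaptive")
> let $serialization-parameters := 
>   <output:serialization-parameters>
>     <output:method>{$method}</output:method>
>     <output:indent>yes</output:indent>
>   </output:serialization-parameters>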
> return
>   fn:serialize($array, $serialization-parameters)
> ```
> 
> serialization-test_results.txt
> ```txt
> [
>   "Cheapside",
>   "London",
>   "Dean Prior",
>   "Devon"
> ]
> ["Cheapside", "London", "Dean Prior", "Devon"]
> ```



[basex-talk] Differences in serialization of arrays with JSON vs. adaptive methods

2017-08-10 Thread Joe Wicentowski
Hi all,

First, I'm just back from DH2017, where Clifford Anderson and I taught two
workshops on XQuery using BaseX, along with eXist and Saxon.  BaseX
performed like a champ.  We were able to configure the GUI window to show
just the query and results windows—perfect when you're projecting the
screen in a large room and want everyone to see.  Many thanks for such a
great teaching tool!  (Our materials are at
https://github.com/CliffordAnderson/XQuery4Humanists.)

Back to the topic of this post, though, I noticed a slight difference
between BaseX's serialization of arrays when using JSON vs. adaptive
methods: with JSON, the array's items are separated by newlines, whereas
with adaptive, the items are separated by spaces.  This is interesting
since the serialization spec notes that the adaptive method delegates the
handling of the "indent" parameter to JSON.  Some code to reproduce this is
below.

I'm curious to know - is there a particular reason for this difference?

Thanks,
Joe


serialization-test.xq
```xquery
xquery version "3.1";

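declare namespace output="http://www.w3.org/2010/xslt-xquery-serialization";
let $array := ["Cheapside","London","Dean Prior","Devon"]
for $method in ("json", "adaptive")
let $serialization-parameters :=
  <output:serialization-parameters>
    <output:method>{$method}</output:method>
    <output:indent>yes</output:indent>
  </output:serialization-parameters>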
return
  fn:serialize($array, $serialization-parameters)
```

serialization-test_results.txt
```txt
[
  "Cheapside",
  "London",
  "Dean Prior",
  "Devon"
]
["Cheapside", "London", "Dean Prior", "Devon"]
```


Re: [basex-talk] Single quotes with json::parse

2017-08-10 Thread Thomas Daly
Hello,

Yes, I did escape all of them but got the same error.

I have tried Christian's solution which solves the problem, and good to know
about  too.

Many thanks,
Thomas

-Original Message-
From: basex-talk-boun...@mailman.uni-konstanz.de
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of Sebastian
Albert
Sent: 10 August 2017 13:47
To: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Single quotes with json::parse

Just to make sure: Have you also escaped Merchant Taylors' School at the
same time?

I've successfully tried  to escape the apostrophe.

On 10.08.2017 14:41, Thomas Daly wrote:
> Regarding the JSON module json:parse function, how do you cope with 
> single quotes within the JSON data?
> 
>  
> 
> I have tried escaping them with a backslash, but to no avail.  In this 
> example I think it's the single quote in "St. John's College" that is 
> causing the problem:
> 
>  
> 
> XQUERY let $database := 'lk'
> 
> let $options := map { 'format': 'direct',
> 'escape': 'yes' }
> 
> let $j :=
> json:parse('{"roles":["poet","cleric"],"institutions":["St John's 
> College","Cambridge 
> University"],"characterId":"595051204300","homes":["Cheapside","London","Dean 
> Prior","Devon"],"image":"robert_herrick.jpg","DOB":"1591-08-24","attributes":["People 
> educated at Merchant Taylors' School","Northwood","Alumni of St John's 
> College","Cambridge","Alumni of Trinity Hall","Cambridge","English 
> poets","People from the City of London","People educated at Westminster 
> School","London","English male 
> poets"],"DOD":"1674-10-08","nationality":"British","givenName":"Robert","familyName":"Herrick"}',
> $options)
> 
> return db:add($database, $j,
> 'characters/c_595051204300_Robert_Herrick.xml')
> 
>  
> 
> Stopped at ., 3/78:
> 
> [XPST0003] Expecting closing bracket: json:parse. at 
> /usr/local/lib/basex-api/src/main/perl/BaseXClient.pm line 58.
> 
>  
> 
> Many thanks,
> 
> Thomas
> 
>  
> 




Re: [basex-talk] Single quotes with json::parse

2017-08-10 Thread Sebastian Albert
Just to make sure: Have you also escaped Merchant Taylors' School at the
same time?

I've successfully tried  to escape the apostrophe.

On 10.08.2017 14:41, Thomas Daly wrote:
> Regarding the JSON module json:parse function, how do you cope with
> single quotes within the JSON data?
> 
>  
> 
> I have tried escaping them with a backslash, but to no avail.  In this
> example I think it’s the single quote in “St. John’s College” that is
> causing the problem:
> 
>  
> 
> XQUERY let $database := 'lk'
> 
> let $options := map { 'format': 'direct',
> 'escape': 'yes' }
> 
> let $j :=
> json:parse('{"roles":["poet","cleric"],"institutions":["St John's
> College","Cambridge
> University"],"characterId":"595051204300","homes":["Cheapside","London","Dean
> Prior","Devon"],"image":"robert_herrick.jpg","DOB":"1591-08-24","attributes":["People
> educated at Merchant Taylors' School","Northwood","Alumni of St John's
> College","Cambridge","Alumni of Trinity Hall","Cambridge","English
> poets","People from the City of London","People educated at Westminster
> School","London","English male
> poets"],"DOD":"1674-10-08","nationality":"British","givenName":"Robert","familyName":"Herrick"}',
> $options)
> 
> return db:add($database, $j,
> 'characters/c_595051204300_Robert_Herrick.xml')
> 
>  
> 
> Stopped at ., 3/78:
> 
> [XPST0003] Expecting closing bracket: json:parse. at
> /usr/local/lib/basex-api/src/main/perl/BaseXClient.pm line 58.
> 
>  
> 
> Many thanks,
> 
> Thomas
> 
>  
> 





Re: [basex-talk] Single quotes with json::parse

2017-08-10 Thread Christian Grün
Hi Thomas,

This problem is a general one when working with quotes; see:

  'It is called "St. John's College"!'

One solution is to go with the XQuery 3.1 string constructor [1]

  ``[It is called "St. John's College"!]``
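
Applied to json:parse, a minimal sketch could look as follows. The JSON
is shortened for illustration, and the second variant (doubling the
apostrophe inside a plain string literal) is the standard XQuery
alternative rather than something taken from the original query:

```xquery
(: String constructor: quotes and apostrophes need no escaping. :)
let $json1 := ``[{"institutions":["St John's College","Cambridge University"]}]``
(: Plain string literal: an apostrophe is escaped by doubling it. :)
let $json2 := '{"institutions":["St John''s College","Cambridge University"]}'
for $json in ($json1, $json2)
return json:parse($json, map { 'format': 'direct' })
```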

Cheers,
Christian


[1] http://docs.basex.org/wiki/XQuery_3.1#String_Constructor


On Thu, Aug 10, 2017 at 2:41 PM, Thomas Daly  wrote:
> Regarding the JSON module json:parse function, how do you cope with single
> quotes within the JSON data?
>
>
>
> I have tried escaping them with a backslash, but to no avail.  In this
> example I think it’s the single quote in “St. John’s College” that is
> causing the problem:
>
>
>
> XQUERY let $database := 'lk'
>
> let $options := map { 'format': 'direct', 'escape':
> 'yes' }
>
> let $j :=
> json:parse('{"roles":["poet","cleric"],"institutions":["St John's
> College","Cambridge
> University"],"characterId":"595051204300","homes":["Cheapside","London","Dean
> Prior","Devon"],"image":"robert_herrick.jpg","DOB":"1591-08-24","attributes":["People
> educated at Merchant Taylors' School","Northwood","Alumni of St John's
> College","Cambridge","Alumni of Trinity Hall","Cambridge","English
> poets","People from the City of London","People educated at Westminster
> School","London","English male
> poets"],"DOD":"1674-10-08","nationality":"British","givenName":"Robert","familyName":"Herrick"}',
> $options)
>
> return db:add($database, $j,
> 'characters/c_595051204300_Robert_Herrick.xml')
>
>
>
> Stopped at ., 3/78:
>
> [XPST0003] Expecting closing bracket: json:parse. at
> /usr/local/lib/basex-api/src/main/perl/BaseXClient.pm line 58.
>
>
>
> Many thanks,
>
> Thomas
>
>


[basex-talk] Single quotes with json::parse

2017-08-10 Thread Thomas Daly
Regarding the JSON module json:parse function, how do you cope with single
quotes within the JSON data?

 

I have tried escaping them with a backslash, but to no avail.  In this
example I think it's the single quote in "St. John's College" that is
causing the problem:

 

XQUERY let $database := 'lk'

let $options := map { 'format': 'direct', 'escape':
'yes' }

let $j :=
json:parse('{"roles":["poet","cleric"],"institutions":["St John's
College","Cambridge
University"],"characterId":"595051204300","homes":["Cheapside","London","Dean
Prior","Devon"],"image":"robert_herrick.jpg","DOB":"1591-08-24","attributes":["People
educated at Merchant Taylors' School","Northwood","Alumni of St John's
College","Cambridge","Alumni of Trinity Hall","Cambridge","English
poets","People from the City of London","People educated at Westminster
School","London","English male
poets"],"DOD":"1674-10-08","nationality":"British","givenName":"Robert","familyName":"Herrick"}',
$options)

return db:add($database, $j,
'characters/c_595051204300_Robert_Herrick.xml')

 

Stopped at ., 3/78:

[XPST0003] Expecting closing bracket: json:parse. at
/usr/local/lib/basex-api/src/main/perl/BaseXClient.pm line 58.

 

Many thanks,

Thomas