Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-17 Thread Christian Grün
Hi Wouter,

> What I meant is that the cost of immutability can be avoided in some cases.
> That's primarily a matter of low level short-circuiting, not so much of
> writing something new. I don't know if this can be done in Java, I know the
> Clojure compiler does this, and I'm not aware of XQ engines that do
> transient operations on persistent collections.

I see, thanks.

In most cases, I would doubt that it’s the immutability of XQuery maps
that slows down queries. The performance of hash tries (which I assume
are used in most implementations of XQuery) is close to mutable hash
maps. As Michael indicated, if data is repeatedly copied, immutable
maps can outperform classical hash maps, so it can even be a wise
choice to use them deliberately.

I have no idea about transient operations in Clojure. Do you know if
they are they based on a general concept, or are they closely related
to the semantics of Clojure? If they are intransparent for the user,
as it seems, I would assume they are comparable to other internal
optimizations that can be applied on a map (such as e.g. resorting to
intermediary mutable data structures if it is known in advance that a
bulk operation takes place).

It would be interesting to have a closer look at some real queries in
which maps seem to be insufficient, and find out if the bottleneck is
really the immutability of the maps, the usage of maps, or the usage
of XQuery in general.

Christian


> Op 16 jul. 2017 23:58 schreef "Michael Kay" :
>
> ce between C and Assembly is a lot smaller than between C and XQ, and I
> think memory management should not be taken lightly (I'm not saying you do).
> In OP's case, I believe keys could be discarded while memory layout remains
> intact.
>
>
> As a general rule of thumb, a lower-level language performs better provided
> that you have the time and skills to write the code efficiently.
>
> And as a general rule of thumb, you don't.
>
> Even when you are quite convinced that you do.
>
> It's a long time since I wrote in anything as low-level as C or assembler,
> but if you compare XQuery and Java, the level of abstraction of the API for
> maps is very similar, so there is no intrinsic reason to believe one should
> perform better than the other. The main difference is that maps in XQuery
> are immutable, which means you pay a little more for some operations (like
> adding a new entry), and you pay a lot less for other operations (llike bulk
> copying).
>
> Michael Kay
> Saxonica
>
>

___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-17 Thread Christian Grün
Hi Ghislain,

I completely agree with your assessment. I would expect the two writings…

  $map?*
  map:for-each($map, function($key, $value) { $value })

…to be faster than…

  $key ! map:get($map, $key)
  $key ! $map($key)

…because the latter ones require one additional lookup per key.
However, from a practical perspective, this is mostly up to the
implementation, which may choose different evaluation strategies for
each of the alternatives. For example, the evaluation of $map?* or
$map($key) could possibly be sped if it can statically be detected
that $map will always be of type map(*).

All the best,
Christian


On Mon, Jul 17, 2017 at 8:31 AM, Ghislain  Fourny  wrote:
> Hi,
>
> My first (naive) thought reading this thread is that, from a theoretical 
> complexity perspective -- unless I missed something -- the first version in 
> the original message was already asymptotically optimal (linear). The time it 
> takes to compute an output always has, as a lower bound, the time it takes to 
> output it ("print it out"), and the output is made of two "passes of the 
> input". The improvements suggested should change the constant, but 
> asymptotically on larger inputs, I do not expect a significant difference.
>
> Having said this, of course, in practice, the constant may make a big 
> difference.
>
> Just my two cents!
>
> Kind regards,
> Ghislain
>
>
>> On 16 Jul 2017, at 23:58, Michael Kay  wrote:
>>
>> ce between C and Assembly is a lot smaller than between C and XQ, and I 
>> think memory management should not be taken lightly (I'm not saying you do). 
>> In OP's case, I believe keys could be discarded while memory layout remains 
>> intact.
>>
>>
>> As a general rule of thumb, a lower-level language performs better provided 
>> that you have the time and skills to write the code efficiently.
>>
>> And as a general rule of thumb, you don't.
>>
>> Even when you are quite convinced that you do.
>>
>> It's a long time since I wrote in anything as low-level as C or assembler, 
>> but if you compare XQuery and Java, the level of abstraction of the API for 
>> maps is very similar, so there is no intrinsic reason to believe one should 
>> perform better than the other. The main difference is that maps in XQuery 
>> are immutable, which means you pay a little more for some operations (like 
>> adding a new entry), and you pay a lot less for other operations (llike bulk 
>> copying).
>>
>> Michael Kay
>> Saxonica
>> ___
>> talk@x-query.com
>> http://x-query.com/mailman/listinfo/talk
>
>
> ___
> talk@x-query.com
> http://x-query.com/mailman/listinfo/talk

___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-17 Thread W.S. Hager
What I meant is that the cost of immutability can be avoided in some cases.
That's primarily a matter of low level short-circuiting, not so much of
writing something new. I don't know if this can be done in Java, I know the
Clojure compiler does this, and I'm not aware of XQ engines that do
transient operations on persistent collections.

Op 16 jul. 2017 23:58 schreef "Michael Kay" :

ce between C and Assembly is a lot smaller than between C and XQ, and I
think memory management should not be taken lightly (I'm not saying you
do). In OP's case, I believe keys could be discarded while memory layout
remains intact.


As a general rule of thumb, a lower-level language performs better provided
that you have the time and skills to write the code efficiently.

And as a general rule of thumb, you don't.

Even when you are quite convinced that you do.

It's a long time since I wrote in anything as low-level as C or assembler,
but if you compare XQuery and Java, the level of abstraction of the API for
maps is very similar, so there is no intrinsic reason to believe one should
perform better than the other. The main difference is that maps in XQuery
are immutable, which means you pay a little more for some operations (like
adding a new entry), and you pay a lot less for other operations (llike
bulk copying).

Michael Kay
Saxonica
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-16 Thread Ghislain Fourny
Hi,

My first (naive) thought reading this thread is that, from a theoretical 
complexity perspective -- unless I missed something -- the first version in the 
original message was already asymptotically optimal (linear). The time it takes 
to compute an output always has, as a lower bound, the time it takes to output 
it ("print it out"), and the output is made of two "passes of the input". The 
improvements suggested should change the constant, but asymptotically on larger 
inputs, I do not expect a significant difference.

Having said this, of course, in practice, the constant may make a big 
difference.

Just my two cents!

Kind regards,
Ghislain


> On 16 Jul 2017, at 23:58, Michael Kay  wrote:
> 
> ce between C and Assembly is a lot smaller than between C and XQ, and I think 
> memory management should not be taken lightly (I'm not saying you do). In 
> OP's case, I believe keys could be discarded while memory layout remains 
> intact.
> 
> 
> As a general rule of thumb, a lower-level language performs better provided 
> that you have the time and skills to write the code efficiently.
> 
> And as a general rule of thumb, you don't. 
> 
> Even when you are quite convinced that you do.
> 
> It's a long time since I wrote in anything as low-level as C or assembler, 
> but if you compare XQuery and Java, the level of abstraction of the API for 
> maps is very similar, so there is no intrinsic reason to believe one should 
> perform better than the other. The main difference is that maps in XQuery are 
> immutable, which means you pay a little more for some operations (like adding 
> a new entry), and you pay a lot less for other operations (llike bulk 
> copying).
> 
> Michael Kay
> Saxonica
> ___
> talk@x-query.com
> http://x-query.com/mailman/listinfo/talk


___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk


Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-16 Thread Michael Kay
ce between C and Assembly is a lot smaller than between C and XQ, and I think 
memory management should not be taken lightly (I'm not saying you do). In OP's 
case, I believe keys could be discarded while memory layout remains intact.


As a general rule of thumb, a lower-level language performs better provided 
that you have the time and skills to write the code efficiently.

And as a general rule of thumb, you don't. 

Even when you are quite convinced that you do.

It's a long time since I wrote in anything as low-level as C or assembler, but 
if you compare XQuery and Java, the level of abstraction of the API for maps is 
very similar, so there is no intrinsic reason to believe one should perform 
better than the other. The main difference is that maps in XQuery are 
immutable, which means you pay a little more for some operations (like adding a 
new entry), and you pay a lot less for other operations (llike bulk copying).

Michael Kay
Saxonica
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk


Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-16 Thread W.S. Hager
Hi Christian,

The difference between C and Assembly is a lot smaller than between C and
XQ, and I think memory management should not be taken lightly (I'm not
saying you do). In OP's case, I believe keys could be discarded while
memory layout remains intact.

Cheers,
Wouter

Op 16 jul. 2017 12:52 schreef "Christian Grün" :

Hi Wouter,

In my experience, XQuery maps can be surprisingly efficient, even if
millions of items need to be processed. Obviously, no programming
language solves all problems best, though (even today, an assembler
language can a better choice than C).

Did you come across particular use cases in which it turned out that
XQuery maps and arrays were not the best choice? What do you believe
was the critical factor (memory consumption, runtime, …)?

Cheers,
Christian




On Sun, Jul 16, 2017 at 12:08 PM, W.S. Hager  wrote:
> Fact of the matter is that if a low level optimization can be made, it
must
> be made outside of XQuery. I assume it happens all the time. But sure, it
> depends on the use case.
>
> Op 15 jul. 2017 22:47 schreef "Michael Kay" :
>
>>
>> Obviously this operation could be implemented in the host language to
make
>> it truly efficient.
>
> Don't assume that other languages are inevitably more efficient than
XQuery.
>
> And don't assume that the cost of moving/converting data from one
> programming language to another is negligible.
>
> Michael Kay
> Saxonica
>
>
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-16 Thread Christian Grün
Hi Wouter,

In my experience, XQuery maps can be surprisingly efficient, even if
millions of items need to be processed. Obviously, no programming
language solves all problems best, though (even today, an assembler
language can a better choice than C).

Did you come across particular use cases in which it turned out that
XQuery maps and arrays were not the best choice? What do you believe
was the critical factor (memory consumption, runtime, …)?

Cheers,
Christian




On Sun, Jul 16, 2017 at 12:08 PM, W.S. Hager  wrote:
> Fact of the matter is that if a low level optimization can be made, it must
> be made outside of XQuery. I assume it happens all the time. But sure, it
> depends on the use case.
>
> Op 15 jul. 2017 22:47 schreef "Michael Kay" :
>
>>
>> Obviously this operation could be implemented in the host language to make
>> it truly efficient.
>
> Don't assume that other languages are inevitably more efficient than XQuery.
>
> And don't assume that the cost of moving/converting data from one
> programming language to another is negligible.
>
> Michael Kay
> Saxonica
>
>

___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-16 Thread W.S. Hager
Fact of the matter is that if a low level optimization can be made, it must
be made outside of XQuery. I assume it happens all the time. But sure, it
depends on the use case.

Op 15 jul. 2017 22:47 schreef "Michael Kay" :

>
> Obviously this operation could be implemented in the host language to
make it truly efficient.

Don't assume that other languages are inevitably more efficient than XQuery.

And don't assume that the cost of moving/converting data from one
programming language to another is negligible.

Michael Kay
Saxonica
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread Michael Kay
> 
> Obviously this operation could be implemented in the host language to make it 
> truly efficient.

Don't assume that other languages are inevitably more efficient than XQuery.

And don't assume that the cost of moving/converting data from one programming 
language to another is negligible.

Michael Kay
Saxonica


___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk


Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread Michael Kay
Another possibility:

declare namespace map = "http://www.w3.org/2005/xpath-functions/map";;
let $map := map { "a": 1, "b": 2, "c": 3 }
let $entries := map:for-each($map, function($k, $v){ [$k, $v] })
return (array{$entries!?1}, array{$entries!?2})

> 
> Here is yet another alternative that can be efficient (but it’s not
> the most compact one):
> 
>  let $map := map { "a": 1, "b": 2, "c": 3 }
>  return (
>array { map:keys($map) },
>array { map:for-each($map, function($key, $value) { $value }) }
>  )
> 

A disadvantage with this, and some other proposed solutions, is that the spec 
offers no guarantee that map:keys() and map:for-each() will iterate the entries 
of the map in the same order.

Note also that map:for-each($map, function($key, $value) { $value }) can be 
written $map?*

Michael Kay
Saxonica


___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread W.S. Hager
Hi Joe,

In addition to other comments: also the size of the data matters, and not
to mention the purpose of what you're trying to achieve... Small chunks are
usually processed faster, so unless you just need to convert all data,
perhaps limiting the set could be useful. Especially when you need yet
another result, and the array would just be an intermediary result,
performance may vary across implementations.

Obviously this operation could be implemented in the host language to make
it truly efficient.

Cheers,
Wouter

Op 15 jul. 2017 21:12 schreef "Christian Grün" :

> Hi Joe, hi Emmanuel,
>
> Here is yet another alternative that can be efficient (but it’s not
> the most compact one):
>
>   let $map := map { "a": 1, "b": 2, "c": 3 }
>   return (
> array { map:keys($map) },
> array { map:for-each($map, function($key, $value) { $value }) }
>   )
>
> Last but not least, map:get($map, ...) can be replaced with $map(...):
>
>   let $map := map { "a": 1, "b": 2, "c": 3 }
>   let $keys := map:keys($map)
>   return (
> array { $keys },
> array { $keys ! $map(.) }
>   )
>
> Depending on the XQuery processor, and even the specific version, one
> or the other solution may be most efficient.
>
> Cheers,
> Christian
>
>
>
> On Sat, Jul 15, 2017 at 8:16 PM, Joe Wicentowski  wrote:
> > Great, thank you, Emmanuel!  That is much better.
> >
> > I might just offer one enhancement - switching from a FLWOR to the simple
> > map operator:
> >
> > ```
> > xquery version "3.1";
> >
> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> > let $keys := map:keys($letters-numbers)
> > return (
> > array { $keys },
> > array { $keys ! map:get($letters-numbers, .) }
> > )
> > ```
> >
> > Joe
> >
> > On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau  >
> > wrote:
> >>
> >> Hi,
> >> I would say
> >>
> >> ```
> >> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> >> let $keys := map:keys($letters-numbers)
> >> return (
> >>   array { $keys },
> >>   array { for $key in $keys return map:get($letters-numbers, $key) }
> >> )
> >> ```
> >> > Le 15 juil. 2017 à 11:41, Joe Wicentowski  a écrit
> :
> >> >
> >> > Hi all,
> >> >
> >> > Is there an efficient way to transform entries in a map into two
> arrays,
> >> > one containing the keys and one containing the values?  Here's my
> attempt:
> >> >
> >> > ```
> >> > xquery version "3.1";
> >> >
> >> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> >> > return
> >> > (
> >> > array { map:keys($letters-numbers) },
> >> > array { $letters-numbers?* }
> >> > )
> >> > ```
> >> >
> >> > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns
> >> > (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map
> twice
> >> > strikes me as somewhat inefficient.  Is there a better way to reduce
> this to
> >> > a single pass through the map, or is this actually the best approach?
> >> >
> >> > Joe
> >> > ___
> >> > talk@x-query.com
> >> > http://x-query.com/mailman/listinfo/talk
> >>
> >
> > ___
> > talk@x-query.com
> > http://x-query.com/mailman/listinfo/talk
>
> ___
> talk@x-query.com
> http://x-query.com/mailman/listinfo/talk
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread Christian Grün
Hi Joe, hi Emmanuel,

Here is yet another alternative that can be efficient (but it’s not
the most compact one):

  let $map := map { "a": 1, "b": 2, "c": 3 }
  return (
array { map:keys($map) },
array { map:for-each($map, function($key, $value) { $value }) }
  )

Last but not least, map:get($map, ...) can be replaced with $map(...):

  let $map := map { "a": 1, "b": 2, "c": 3 }
  let $keys := map:keys($map)
  return (
array { $keys },
array { $keys ! $map(.) }
  )

Depending on the XQuery processor, and even the specific version, one
or the other solution may be most efficient.

Cheers,
Christian



On Sat, Jul 15, 2017 at 8:16 PM, Joe Wicentowski  wrote:
> Great, thank you, Emmanuel!  That is much better.
>
> I might just offer one enhancement - switching from a FLWOR to the simple
> map operator:
>
> ```
> xquery version "3.1";
>
> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> let $keys := map:keys($letters-numbers)
> return (
> array { $keys },
> array { $keys ! map:get($letters-numbers, .) }
> )
> ```
>
> Joe
>
> On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau 
> wrote:
>>
>> Hi,
>> I would say
>>
>> ```
>> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
>> let $keys := map:keys($letters-numbers)
>> return (
>>   array { $keys },
>>   array { for $key in $keys return map:get($letters-numbers, $key) }
>> )
>> ```
>> > Le 15 juil. 2017 à 11:41, Joe Wicentowski  a écrit :
>> >
>> > Hi all,
>> >
>> > Is there an efficient way to transform entries in a map into two arrays,
>> > one containing the keys and one containing the values?  Here's my attempt:
>> >
>> > ```
>> > xquery version "3.1";
>> >
>> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
>> > return
>> > (
>> > array { map:keys($letters-numbers) },
>> > array { $letters-numbers?* }
>> > )
>> > ```
>> >
>> > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns
>> > (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map twice
>> > strikes me as somewhat inefficient.  Is there a better way to reduce this 
>> > to
>> > a single pass through the map, or is this actually the best approach?
>> >
>> > Joe
>> > ___
>> > talk@x-query.com
>> > http://x-query.com/mailman/listinfo/talk
>>
>
> ___
> talk@x-query.com
> http://x-query.com/mailman/listinfo/talk

___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread Joe Wicentowski
Great, thank you, Emmanuel!  That is much better.

I might just offer one enhancement - switching from a FLWOR to the simple
map operator:

```
xquery version "3.1";

let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
let $keys := map:keys($letters-numbers)
return (
array { $keys },
array { $keys ! map:get($letters-numbers, .) }
)
```

Joe

On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau 
wrote:

> Hi,
> I would say
>
> ```
> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> let $keys := map:keys($letters-numbers)
> return (
>   array { $keys },
>   array { for $key in $keys return map:get($letters-numbers, $key) }
> )
> ```
> > Le 15 juil. 2017 à 11:41, Joe Wicentowski  a écrit :
> >
> > Hi all,
> >
> > Is there an efficient way to transform entries in a map into two arrays,
> one containing the keys and one containing the values?  Here's my attempt:
> >
> > ```
> > xquery version "3.1";
> >
> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> > return
> > (
> > array { map:keys($letters-numbers) },
> > array { $letters-numbers?* }
> > )
> > ```
> >
> > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns
> (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map twice
> strikes me as somewhat inefficient.  Is there a better way to reduce this
> to a single pass through the map, or is this actually the best approach?
> >
> > Joe
> > ___
> > talk@x-query.com
> > http://x-query.com/mailman/listinfo/talk
>
>
___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk

Re: [xquery-talk] From map entry pairs to a pair of arrays

2017-07-15 Thread Emmanuel Chateau
Hi,
I would say

```
let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
let $keys := map:keys($letters-numbers)
return (
  array { $keys },
  array { for $key in $keys return map:get($letters-numbers, $key) }
)
```
> Le 15 juil. 2017 à 11:41, Joe Wicentowski  a écrit :
> 
> Hi all,
> 
> Is there an efficient way to transform entries in a map into two arrays, one 
> containing the keys and one containing the values?  Here's my attempt:
> 
> ```
> xquery version "3.1";
> 
> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 }
> return 
> (
> array { map:keys($letters-numbers) },
> array { $letters-numbers?* }
> )
> ```
> 
> This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns (["a", 
> "b", "c"], [1, 2, 3]), but the way it iterates through the map twice strikes 
> me as somewhat inefficient.  Is there a better way to reduce this to a single 
> pass through the map, or is this actually the best approach?
> 
> Joe
> ___
> talk@x-query.com
> http://x-query.com/mailman/listinfo/talk


___
talk@x-query.com
http://x-query.com/mailman/listinfo/talk