Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Wouter, > What I meant is that the cost of immutability can be avoided in some cases. > That's primarily a matter of low level short-circuiting, not so much of > writing something new. I don't know if this can be done in Java, I know the > Clojure compiler does this, and I'm not aware of XQ engines that do > transient operations on persistent collections. I see, thanks. In most cases, I would doubt that it’s the immutability of XQuery maps that slows down queries. The performance of hash tries (which I assume are used in most implementations of XQuery) is close to mutable hash maps. As Michael indicated, if data is repeatedly copied, immutable maps can outperform classical hash maps, so it can even be a wise choice to use them deliberately. I have no idea about transient operations in Clojure. Do you know if they are they based on a general concept, or are they closely related to the semantics of Clojure? If they are intransparent for the user, as it seems, I would assume they are comparable to other internal optimizations that can be applied on a map (such as e.g. resorting to intermediary mutable data structures if it is known in advance that a bulk operation takes place). It would be interesting to have a closer look at some real queries in which maps seem to be insufficient, and find out if the bottleneck is really the immutability of the maps, the usage of maps, or the usage of XQuery in general. Christian > Op 16 jul. 2017 23:58 schreef "Michael Kay" : > > ce between C and Assembly is a lot smaller than between C and XQ, and I > think memory management should not be taken lightly (I'm not saying you do). > In OP's case, I believe keys could be discarded while memory layout remains > intact. > > > As a general rule of thumb, a lower-level language performs better provided > that you have the time and skills to write the code efficiently. > > And as a general rule of thumb, you don't. > > Even when you are quite convinced that you do. > > It's a long time since I wrote in anything as low-level as C or assembler, > but if you compare XQuery and Java, the level of abstraction of the API for > maps is very similar, so there is no intrinsic reason to believe one should > perform better than the other. The main difference is that maps in XQuery > are immutable, which means you pay a little more for some operations (like > adding a new entry), and you pay a lot less for other operations (llike bulk > copying). > > Michael Kay > Saxonica > > ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Ghislain, I completely agree with your assessment. I would expect the two writings… $map?* map:for-each($map, function($key, $value) { $value }) …to be faster than… $key ! map:get($map, $key) $key ! $map($key) …because the latter ones require one additional lookup per key. However, from a practical perspective, this is mostly up to the implementation, which may choose different evaluation strategies for each of the alternatives. For example, the evaluation of $map?* or $map($key) could possibly be sped if it can statically be detected that $map will always be of type map(*). All the best, Christian On Mon, Jul 17, 2017 at 8:31 AM, Ghislain Fourny wrote: > Hi, > > My first (naive) thought reading this thread is that, from a theoretical > complexity perspective -- unless I missed something -- the first version in > the original message was already asymptotically optimal (linear). The time it > takes to compute an output always has, as a lower bound, the time it takes to > output it ("print it out"), and the output is made of two "passes of the > input". The improvements suggested should change the constant, but > asymptotically on larger inputs, I do not expect a significant difference. > > Having said this, of course, in practice, the constant may make a big > difference. > > Just my two cents! > > Kind regards, > Ghislain > > >> On 16 Jul 2017, at 23:58, Michael Kay wrote: >> >> ce between C and Assembly is a lot smaller than between C and XQ, and I >> think memory management should not be taken lightly (I'm not saying you do). >> In OP's case, I believe keys could be discarded while memory layout remains >> intact. >> >> >> As a general rule of thumb, a lower-level language performs better provided >> that you have the time and skills to write the code efficiently. >> >> And as a general rule of thumb, you don't. >> >> Even when you are quite convinced that you do. >> >> It's a long time since I wrote in anything as low-level as C or assembler, >> but if you compare XQuery and Java, the level of abstraction of the API for >> maps is very similar, so there is no intrinsic reason to believe one should >> perform better than the other. The main difference is that maps in XQuery >> are immutable, which means you pay a little more for some operations (like >> adding a new entry), and you pay a lot less for other operations (llike bulk >> copying). >> >> Michael Kay >> Saxonica >> ___ >> talk@x-query.com >> http://x-query.com/mailman/listinfo/talk > > > ___ > talk@x-query.com > http://x-query.com/mailman/listinfo/talk ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
What I meant is that the cost of immutability can be avoided in some cases. That's primarily a matter of low level short-circuiting, not so much of writing something new. I don't know if this can be done in Java, I know the Clojure compiler does this, and I'm not aware of XQ engines that do transient operations on persistent collections. Op 16 jul. 2017 23:58 schreef "Michael Kay" : ce between C and Assembly is a lot smaller than between C and XQ, and I think memory management should not be taken lightly (I'm not saying you do). In OP's case, I believe keys could be discarded while memory layout remains intact. As a general rule of thumb, a lower-level language performs better provided that you have the time and skills to write the code efficiently. And as a general rule of thumb, you don't. Even when you are quite convinced that you do. It's a long time since I wrote in anything as low-level as C or assembler, but if you compare XQuery and Java, the level of abstraction of the API for maps is very similar, so there is no intrinsic reason to believe one should perform better than the other. The main difference is that maps in XQuery are immutable, which means you pay a little more for some operations (like adding a new entry), and you pay a lot less for other operations (llike bulk copying). Michael Kay Saxonica ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi, My first (naive) thought reading this thread is that, from a theoretical complexity perspective -- unless I missed something -- the first version in the original message was already asymptotically optimal (linear). The time it takes to compute an output always has, as a lower bound, the time it takes to output it ("print it out"), and the output is made of two "passes of the input". The improvements suggested should change the constant, but asymptotically on larger inputs, I do not expect a significant difference. Having said this, of course, in practice, the constant may make a big difference. Just my two cents! Kind regards, Ghislain > On 16 Jul 2017, at 23:58, Michael Kay wrote: > > ce between C and Assembly is a lot smaller than between C and XQ, and I think > memory management should not be taken lightly (I'm not saying you do). In > OP's case, I believe keys could be discarded while memory layout remains > intact. > > > As a general rule of thumb, a lower-level language performs better provided > that you have the time and skills to write the code efficiently. > > And as a general rule of thumb, you don't. > > Even when you are quite convinced that you do. > > It's a long time since I wrote in anything as low-level as C or assembler, > but if you compare XQuery and Java, the level of abstraction of the API for > maps is very similar, so there is no intrinsic reason to believe one should > perform better than the other. The main difference is that maps in XQuery are > immutable, which means you pay a little more for some operations (like adding > a new entry), and you pay a lot less for other operations (llike bulk > copying). > > Michael Kay > Saxonica > ___ > talk@x-query.com > http://x-query.com/mailman/listinfo/talk ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
ce between C and Assembly is a lot smaller than between C and XQ, and I think memory management should not be taken lightly (I'm not saying you do). In OP's case, I believe keys could be discarded while memory layout remains intact. As a general rule of thumb, a lower-level language performs better provided that you have the time and skills to write the code efficiently. And as a general rule of thumb, you don't. Even when you are quite convinced that you do. It's a long time since I wrote in anything as low-level as C or assembler, but if you compare XQuery and Java, the level of abstraction of the API for maps is very similar, so there is no intrinsic reason to believe one should perform better than the other. The main difference is that maps in XQuery are immutable, which means you pay a little more for some operations (like adding a new entry), and you pay a lot less for other operations (llike bulk copying). Michael Kay Saxonica ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Christian, The difference between C and Assembly is a lot smaller than between C and XQ, and I think memory management should not be taken lightly (I'm not saying you do). In OP's case, I believe keys could be discarded while memory layout remains intact. Cheers, Wouter Op 16 jul. 2017 12:52 schreef "Christian Grün" : Hi Wouter, In my experience, XQuery maps can be surprisingly efficient, even if millions of items need to be processed. Obviously, no programming language solves all problems best, though (even today, an assembler language can a better choice than C). Did you come across particular use cases in which it turned out that XQuery maps and arrays were not the best choice? What do you believe was the critical factor (memory consumption, runtime, …)? Cheers, Christian On Sun, Jul 16, 2017 at 12:08 PM, W.S. Hager wrote: > Fact of the matter is that if a low level optimization can be made, it must > be made outside of XQuery. I assume it happens all the time. But sure, it > depends on the use case. > > Op 15 jul. 2017 22:47 schreef "Michael Kay" : > >> >> Obviously this operation could be implemented in the host language to make >> it truly efficient. > > Don't assume that other languages are inevitably more efficient than XQuery. > > And don't assume that the cost of moving/converting data from one > programming language to another is negligible. > > Michael Kay > Saxonica > > ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Wouter, In my experience, XQuery maps can be surprisingly efficient, even if millions of items need to be processed. Obviously, no programming language solves all problems best, though (even today, an assembler language can a better choice than C). Did you come across particular use cases in which it turned out that XQuery maps and arrays were not the best choice? What do you believe was the critical factor (memory consumption, runtime, …)? Cheers, Christian On Sun, Jul 16, 2017 at 12:08 PM, W.S. Hager wrote: > Fact of the matter is that if a low level optimization can be made, it must > be made outside of XQuery. I assume it happens all the time. But sure, it > depends on the use case. > > Op 15 jul. 2017 22:47 schreef "Michael Kay" : > >> >> Obviously this operation could be implemented in the host language to make >> it truly efficient. > > Don't assume that other languages are inevitably more efficient than XQuery. > > And don't assume that the cost of moving/converting data from one > programming language to another is negligible. > > Michael Kay > Saxonica > > ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Fact of the matter is that if a low level optimization can be made, it must be made outside of XQuery. I assume it happens all the time. But sure, it depends on the use case. Op 15 jul. 2017 22:47 schreef "Michael Kay" : > > Obviously this operation could be implemented in the host language to make it truly efficient. Don't assume that other languages are inevitably more efficient than XQuery. And don't assume that the cost of moving/converting data from one programming language to another is negligible. Michael Kay Saxonica ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
> > Obviously this operation could be implemented in the host language to make it > truly efficient. Don't assume that other languages are inevitably more efficient than XQuery. And don't assume that the cost of moving/converting data from one programming language to another is negligible. Michael Kay Saxonica ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Another possibility: declare namespace map = "http://www.w3.org/2005/xpath-functions/map";; let $map := map { "a": 1, "b": 2, "c": 3 } let $entries := map:for-each($map, function($k, $v){ [$k, $v] }) return (array{$entries!?1}, array{$entries!?2}) > > Here is yet another alternative that can be efficient (but it’s not > the most compact one): > > let $map := map { "a": 1, "b": 2, "c": 3 } > return ( >array { map:keys($map) }, >array { map:for-each($map, function($key, $value) { $value }) } > ) > A disadvantage with this, and some other proposed solutions, is that the spec offers no guarantee that map:keys() and map:for-each() will iterate the entries of the map in the same order. Note also that map:for-each($map, function($key, $value) { $value }) can be written $map?* Michael Kay Saxonica ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Joe, In addition to other comments: also the size of the data matters, and not to mention the purpose of what you're trying to achieve... Small chunks are usually processed faster, so unless you just need to convert all data, perhaps limiting the set could be useful. Especially when you need yet another result, and the array would just be an intermediary result, performance may vary across implementations. Obviously this operation could be implemented in the host language to make it truly efficient. Cheers, Wouter Op 15 jul. 2017 21:12 schreef "Christian Grün" : > Hi Joe, hi Emmanuel, > > Here is yet another alternative that can be efficient (but it’s not > the most compact one): > > let $map := map { "a": 1, "b": 2, "c": 3 } > return ( > array { map:keys($map) }, > array { map:for-each($map, function($key, $value) { $value }) } > ) > > Last but not least, map:get($map, ...) can be replaced with $map(...): > > let $map := map { "a": 1, "b": 2, "c": 3 } > let $keys := map:keys($map) > return ( > array { $keys }, > array { $keys ! $map(.) } > ) > > Depending on the XQuery processor, and even the specific version, one > or the other solution may be most efficient. > > Cheers, > Christian > > > > On Sat, Jul 15, 2017 at 8:16 PM, Joe Wicentowski wrote: > > Great, thank you, Emmanuel! That is much better. > > > > I might just offer one enhancement - switching from a FLWOR to the simple > > map operator: > > > > ``` > > xquery version "3.1"; > > > > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > > let $keys := map:keys($letters-numbers) > > return ( > > array { $keys }, > > array { $keys ! map:get($letters-numbers, .) } > > ) > > ``` > > > > Joe > > > > On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau > > > wrote: > >> > >> Hi, > >> I would say > >> > >> ``` > >> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > >> let $keys := map:keys($letters-numbers) > >> return ( > >> array { $keys }, > >> array { for $key in $keys return map:get($letters-numbers, $key) } > >> ) > >> ``` > >> > Le 15 juil. 2017 à 11:41, Joe Wicentowski a écrit > : > >> > > >> > Hi all, > >> > > >> > Is there an efficient way to transform entries in a map into two > arrays, > >> > one containing the keys and one containing the values? Here's my > attempt: > >> > > >> > ``` > >> > xquery version "3.1"; > >> > > >> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > >> > return > >> > ( > >> > array { map:keys($letters-numbers) }, > >> > array { $letters-numbers?* } > >> > ) > >> > ``` > >> > > >> > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns > >> > (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map > twice > >> > strikes me as somewhat inefficient. Is there a better way to reduce > this to > >> > a single pass through the map, or is this actually the best approach? > >> > > >> > Joe > >> > ___ > >> > talk@x-query.com > >> > http://x-query.com/mailman/listinfo/talk > >> > > > > ___ > > talk@x-query.com > > http://x-query.com/mailman/listinfo/talk > > ___ > talk@x-query.com > http://x-query.com/mailman/listinfo/talk ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi Joe, hi Emmanuel, Here is yet another alternative that can be efficient (but it’s not the most compact one): let $map := map { "a": 1, "b": 2, "c": 3 } return ( array { map:keys($map) }, array { map:for-each($map, function($key, $value) { $value }) } ) Last but not least, map:get($map, ...) can be replaced with $map(...): let $map := map { "a": 1, "b": 2, "c": 3 } let $keys := map:keys($map) return ( array { $keys }, array { $keys ! $map(.) } ) Depending on the XQuery processor, and even the specific version, one or the other solution may be most efficient. Cheers, Christian On Sat, Jul 15, 2017 at 8:16 PM, Joe Wicentowski wrote: > Great, thank you, Emmanuel! That is much better. > > I might just offer one enhancement - switching from a FLWOR to the simple > map operator: > > ``` > xquery version "3.1"; > > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > let $keys := map:keys($letters-numbers) > return ( > array { $keys }, > array { $keys ! map:get($letters-numbers, .) } > ) > ``` > > Joe > > On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau > wrote: >> >> Hi, >> I would say >> >> ``` >> let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } >> let $keys := map:keys($letters-numbers) >> return ( >> array { $keys }, >> array { for $key in $keys return map:get($letters-numbers, $key) } >> ) >> ``` >> > Le 15 juil. 2017 à 11:41, Joe Wicentowski a écrit : >> > >> > Hi all, >> > >> > Is there an efficient way to transform entries in a map into two arrays, >> > one containing the keys and one containing the values? Here's my attempt: >> > >> > ``` >> > xquery version "3.1"; >> > >> > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } >> > return >> > ( >> > array { map:keys($letters-numbers) }, >> > array { $letters-numbers?* } >> > ) >> > ``` >> > >> > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns >> > (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map twice >> > strikes me as somewhat inefficient. Is there a better way to reduce this >> > to >> > a single pass through the map, or is this actually the best approach? >> > >> > Joe >> > ___ >> > talk@x-query.com >> > http://x-query.com/mailman/listinfo/talk >> > > ___ > talk@x-query.com > http://x-query.com/mailman/listinfo/talk ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Great, thank you, Emmanuel! That is much better. I might just offer one enhancement - switching from a FLWOR to the simple map operator: ``` xquery version "3.1"; let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } let $keys := map:keys($letters-numbers) return ( array { $keys }, array { $keys ! map:get($letters-numbers, .) } ) ``` Joe On Sat, Jul 15, 2017 at 12:12 PM Emmanuel Chateau wrote: > Hi, > I would say > > ``` > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > let $keys := map:keys($letters-numbers) > return ( > array { $keys }, > array { for $key in $keys return map:get($letters-numbers, $key) } > ) > ``` > > Le 15 juil. 2017 à 11:41, Joe Wicentowski a écrit : > > > > Hi all, > > > > Is there an efficient way to transform entries in a map into two arrays, > one containing the keys and one containing the values? Here's my attempt: > > > > ``` > > xquery version "3.1"; > > > > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > > return > > ( > > array { map:keys($letters-numbers) }, > > array { $letters-numbers?* } > > ) > > ``` > > > > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns > (["a", "b", "c"], [1, 2, 3]), but the way it iterates through the map twice > strikes me as somewhat inefficient. Is there a better way to reduce this > to a single pass through the map, or is this actually the best approach? > > > > Joe > > ___ > > talk@x-query.com > > http://x-query.com/mailman/listinfo/talk > > ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk
Re: [xquery-talk] From map entry pairs to a pair of arrays
Hi, I would say ``` let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } let $keys := map:keys($letters-numbers) return ( array { $keys }, array { for $key in $keys return map:get($letters-numbers, $key) } ) ``` > Le 15 juil. 2017 à 11:41, Joe Wicentowski a écrit : > > Hi all, > > Is there an efficient way to transform entries in a map into two arrays, one > containing the keys and one containing the values? Here's my attempt: > > ``` > xquery version "3.1"; > > let $letters-numbers := map { "a": 1, "b": 2, "c": 3 } > return > ( > array { map:keys($letters-numbers) }, > array { $letters-numbers?* } > ) > ``` > > This successfully takes `map { "a": 1, "b": 2, "c": 3 }` and returns (["a", > "b", "c"], [1, 2, 3]), but the way it iterates through the map twice strikes > me as somewhat inefficient. Is there a better way to reduce this to a single > pass through the map, or is this actually the best approach? > > Joe > ___ > talk@x-query.com > http://x-query.com/mailman/listinfo/talk ___ talk@x-query.com http://x-query.com/mailman/listinfo/talk