@jean-marc
I am not disagreeing with your results, with pure XQuery (without vendor 
extensions) I believe you will not come close to code written in C or even Java
for things like maps and vectors.    But you yourself say that the map from 
BaseX is 100X faster than one developed in pure XQuery.
Yet you *can use this map in XQuery* ... That's the whole point of vendor 
extensions is to overcome limitations in languages where appropriate.
This is not unique to XQuery.   I have not come across a language yet that 
doesn't have either vendor extensions, or libraries written in "native" code 
(lower level code then the language itself).    Even in C many standard library 
functions are actually optimized either by hand written assembly or compiler 
insertion to improve efficiencies of the pure language.      Furthermore, you 
have not tried all vendors, I discourage you from extrapolating from some 
vendors to all vendor implementations.   I am not suggesting you try all 
vendors (that would take a long time) only that generalizing statements about 
performance cannot be entirely accurate without doing so.

But I do agree if your goal is to avoid any vendor extensions and provide a 
"100% Pure XQuery"  module for functions that need to make great use of maps 
and vectors, XQuery is likey to not be as performant as languages which have 
the data types you need built in.
( which in essence is saying that those languages native types are the 
equivalent to XQuery vendor extensions except they are defined by the language 
spec instead of the vendor spec. ).


Also as a tiny sub-diversion, I do disagree with this generalized statement
" This is true for any implementation, any language : if you use a map instead 
of a vector, the overhead for each insertion or reading is log(n), where n is 
the size of the vector"

This is entirely dependent on the implementation of the map and vector.  I 
assert you cannot make such generalized assentation's.
For example, a map based on a hash instead of a tree can be linear time until 
it reaches some internal sizing.
Performance in real life is never as simple as on paper.    In your case you 
created a map based on a tree.   That will be roughly log(n), true.
But that is not the only possible implementation of maps.

In the end though ... for the cases you are trying for, I suggest 100% pure 
XQuery will not provide the performance you need.
So your choice is to use a different language, or to find an implementation of 
XQuery which has vendor extensions customized for
the data types and operations not in "100% pure XQuery".





From: jean-marc Mercier [mailto:[email protected]]
Sent: Wednesday, January 01, 2014 4:51 AM
To: David Lee
Cc: Michael Sokolov; xquery-discuss; ihe onwuka; Andrew Welch
Subject: Re: [xquery-talk] Matrix Multiplication

@David

Maybe my statement was not clear. Please let me do it again : I am just saying 
that maps are slow in insertion, compared to vectors. This is true for any 
implementation, any language : if you use a map instead of a vector, the 
overhead for each insertion or reading is log(n), where n is the size of the 
vector, that is huge when you look primarily for performances. Furthermore, 
please check this thread: 3 of us, including me, have tried to develop an 
external modulus implementing a map. The best result, that is the one of Leo 
from BaseX, consists in a 100X overhead factor compared to a vendor solution, 
the map from BaseX. So far we tried, so far we failed: this failure is just 
telling us that we can't use XQUERY to develop basic types as pairs, maps, 
vectors through external XQUERY modulus. In this context, I would not try, and 
will discourage anyone else, to develop an external linear Algebra modulus with 
this version of XQUERY, because it will be too slow compared to imperative 
language already existing linear algebra modulus.

David, I really try to have a scientific approach, I am not bound to any 
organization, have no particular private interest in the xml industry. I am 
just a xquery user. And I wish I were wrong. Please do not hesitate to point me 
out any failure in my reasoning.

Meanwhile, I wish you a happy new year !!!



2013/12/31 David Lee <[email protected]<mailto:[email protected]>>
One tiny correction
Your statement " You can't use Marklogic map, or any other vendor map to store 
vectors for performance issues (a map is really slow)."

So far on this thread this statement has not been substantiated.
MarkLogic maps and arrays are quite efficient, they are not the same thing as 
XQuery rb trees or anything else.
They are highly efficient C++ implementations of maps and arrays, implemented 
under the hood with the same data structures as you would in C++ stdlib.

Not suggesting your other line of reasoning is incorrect, only that so far on 
this thread I have not seen any evidence of testing vendor extensions for 
XQuery types.   I suspect you don't want to do that for good reasons (for 
example its not portable, or its not a free product),
but I do suggest you cannot extrapolate performance from one vendors 
implementation to another.
As is usual in performance "it depends".



From: jean-marc Mercier 
[mailto:[email protected]<mailto:[email protected]>]
Sent: Tuesday, December 31, 2013 12:02 PM
To: David Lee
Cc: Michael Sokolov; xquery-discuss; ihe onwuka; Andrew Welch

Subject: Re: [xquery-talk] Matrix Multiplication

@David pairs are also basically needed to write a linear algebra modulus, the 
topic of this thread. And XQUERY don't provide any efficient pair. You can't 
use Marklogic map, or any other vendor map to store vectors for performance 
issues (a map is really slow).

Note that there are a lot of workaround : I am using direct JAVA binding or C++ 
binding from XQUERY for linear algebra not to pay a too heavy tribute to these 
issue performances.

The point is simply to notice that XQUERY could be really good to write an 
efficient linear algebra modulus. But, due to these performance issues, nobody 
can write it. I just hope that the next XQUERY version will give the necessary 
container to write it. Meanwhile, nobody can contribute to XQUERY through 
external modulus using heavy algorithms, and that's just too bad.





2013/12/31 David Lee <[email protected]<mailto:[email protected]>>
Indeed  ... pair( (1,2,3) ) is just a function that returns a function as per 
my example.
But to the point,  you can either use vendor extensions (such as marklogic's  
json:array and map:map types) which have good support for efficient operations, 
or look to another language (<sigh>)  You may find Scala more amenable to this 
type of programming.



From: [email protected]<mailto:[email protected]> 
[mailto:[email protected]<mailto:[email protected]>] On Behalf Of 
jean-marc Mercier
Sent: Tuesday, December 31, 2013 11:49 AM
To: Michael Sokolov
Cc: xquery-discuss; ihe onwuka; Andrew Welch

Subject: Re: [xquery-talk] Matrix Multiplication

@David, there are tricks to overcome sequence concatenations now. See for 
instance definition of a pair by Leo, John Snelson, or me : you can write pair( 
( 1 , 2 , 3 ) , (4  5 , 6) ) to avoid sequence concatenation. I ve written also 
constant sized vectors using this trick : for instance NUPLE(1,(),<toto>,5), 
withh associated getters.
The bad new is that these operations takes too much time with all the 
interpreters I have tried, and can't be used in heavy algorithms.




2013/12/31 jean-marc Mercier 
<[email protected]<mailto:[email protected]>>
Hi

@Michael Concerning your remark over the discussions I quoted, maps are the 
basic block for sparse linear algebra. Without performent "maps of nodes" 
(equivalent to std::map in C++) you will not be able to write any performant 
linear algebra modulus for sparse matrix.

However, before even thinking to sparse matrix, operations on sequences as 
concatenations are the first "show-stop" to write a linear algebra modulus, 
since sequences are vectors.
Another one is the lack of constant sized vectors (needed for basic dense 
matrix operations).

@Ryan thx for these links, they are very interesting.

Well, I am going to party ! Have a happy new year !




2013/12/31 Michael Sokolov 
<[email protected]<mailto:[email protected]>>
I would love to see some tests of pure XQuery implementations of both sparse 
and dense operations.  I suspect performance of matrix multiply, inversion, 
etc, will be poorer than in C++ or Java, but I would expect performance 
comparable to Perl or Python (w/o its numpy extension) - just a wild guess.  
I'd also expect it might be easier to get good sparse performance.  I don't 
know why, just a hunch.

But the more interesting question for me is whether language changes are really 
needed to support this.  I would have thought that proper optimization of 
operations on sequences would be enough?  So for example, an extension module 
using sequences as matrix datatypes could possibly optimize performance using a 
lower-level implementation.  Does anyone see any reason why that wouldn't be 
possible?

-Mike

PS:
I reviewed the discussion you referred to, jean-marc, but it seems to have more 
to do with using functions as map keys, and I don't see any direct connection 
to linear algebra?


On 12/31/2013 9:55 AM, jean-marc Mercier wrote:
It is not due to the spec. It is rather due to the common usage of XQUERY, 
forcing vendor solutions (as BaseX) to focus primarily on XML Data Base 
requests more than algorithmic performances.

There are actually some threads that are discussing these performance issues in 
the context of maps (maps are for instance used for sparse matrix 
representations) : look for instance to ""map module for XQUERY ?" that I 
initiated or "Higher-order XQuery Modules" by Leo from BaseX, on 
[email protected]<http://x-query.com> mailing list.
Anyhow, to write a serious linear algebra modulus, the basic need is to have a 
vector containers. Unfortunately, XQUERY does not provide any performant vector 
containers at present time, and it is not possible to code them in pure XQUERY 
: I have tried, and more experienced xquery developpers than me have also 
tried, without success.

We have to wait for the XQUERY version that will give us these containers, a 
decision to be taken by the W3C.


2013/12/31 Andrew Welch 
<[email protected]<mailto:[email protected]>>

Are you saying the spec as it stands somehow forces all implementations to be 
1000x slower, or just what you have observed in some particular implementation?

On 31 Dec 2013 14:27, "jean-marc Mercier" 
<[email protected]<mailto:[email protected]>> wrote:
>
> As far as I understand, you want to write a linear algebra module using 
> XQUERY ?
> If so, I opened a thread some months ago about this idea. My opinion today is 
> that this is a false good idea at present time.
>
> 1) XQUERY would be really good for writing concise, efficient linear algebra 
> modulus.
> 2) However, I strongly recommend to wait a little bit for starting coding : 
> the current version of XQUERY (3.0) suffers from performance issues. A linear 
> algebra modulus written in XQUERY is expected to have performances 
> performances 1000 X slower than its corresponding C++ or JAVA (you can 
> measure it precisely). Any mathematician linear algebra modulus would 
> probably trashed your modulus after the first test.
>
> Hope this helps
>
>
>
> 2013/12/31 Ihe Onwuka <[email protected]<mailto:[email protected]>>
>>
>> Assuming a sparse representation it is about 4 lines of SQL. This is known 
>> not least because you can read enough articles and papers that discuss it 
>> and it optimises well. The obvious google search does not reveal any 
>> corresponding XQuery discussion, neither does it appear to have surfaced on 
>> this or the eXist mailing list (allowing for my deficient search skills). 
>> For something so "trivial" I thought that was rather strange. Hence I 
>> thought it would be prudent to ask  before naively embarking on a  600k X 
>> 40k matrix multiplication.
>>
>>
>>
>> On Tue, Dec 31, 2013 at 11:31 AM, Andrew Welch 
>> <[email protected]<mailto:[email protected]>> wrote:
>>>
>>>
>>> It should be pretty trivial...
>>>
>>> On 31 Dec 2013 11:07, "Ihe Onwuka" 
>>> <[email protected]<mailto:[email protected]>> wrote:
>>> >
>>> > Has anybody tried this in XQuery or if I am so foolish (not yet but give 
>>> > me time) would I be the courageous  
>>> > <culturalReference>http://www.youtube.com/watch?v=ik8JT2S-kBE</culturalReference>
>>> >  early adopter.
>>> >
>>> >
>>> > _______________________________________________
>>> > [email protected]<mailto:[email protected]>
>>> > http://x-query.com/mailman/listinfo/talk
>>
>>
>>
>> _______________________________________________
>> [email protected]<mailto:[email protected]>
>> http://x-query.com/mailman/listinfo/talk
>
>



_______________________________________________

[email protected]<mailto:[email protected]>

http://x-query.com/mailman/listinfo/talk





_______________________________________________
[email protected]
http://x-query.com/mailman/listinfo/talk

Reply via email to