Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Andy Bunce
Hi Erol, I am not volunteering :-) but if somebody wants to take this route, this code might give some pointers [1]. It uses Apache Spark to run Saxon-HE; there is also an XQuery example [2] and more info [3]. /Andy [1] https://github.com/elsevierlabs/spark-xml-utils [2] https://github.com/elsevierlabs/spark

Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Goetz Heller
OK. Let me do my stuff first. Then I will see if I'm able to dive deep enough into the BaseX code to come up with some meaningful contribution! Kind regards, Goetz -Original Message- From: Christian Grün [mailto:christian.gr...@gmail.com] Sent: Wednesday, 22 April 2015 11:15

Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Christian Grün
Hi Götz (cc @ basex-talk), > OK, I think I understand. However, I think there should be some possibilities to allow the user to give hints. In my opinion, FOR-loops would be first-class candidates to use parallel streams, in particular in the use case I described in my previous posting: > > FOR $

Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Christian Grün
Any volunteers out there? ;) On Wed, Apr 22, 2015 at 11:05 AM, Erol Akarsu wrote: > Christian, > > I think we should be able to attach BaseX to Apache Spark. But integration > code needs to be written. > Everybody is able to read from Hadoop, Solr, Elasticsearch etc. to Spark and > process there.

Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Erol Akarsu
Christian, I think we should be able to attach BaseX to Apache Spark. But integration code needs to be written. Everybody is able to read from Hadoop, Solr, Elasticsearch etc. to Spark and process there. Why not for BaseX? Erol Akarsu On Wed, Apr 22, 2015 at 4:28 AM, Christian Grün wrote: > Hi G

Re: [basex-talk] Distributing queries to several on several processors

2015-04-22 Thread Christian Grün
Hi Götz, > it would > make perfect sense to parallelize the query. Is there a way to achieve this > using XQuery? Our initial attempts to integrate low-level support for parallelization in XQuery turned out not to be as successful as we hoped they would be. One reason for that is that you can bas

[basex-talk] Distributing queries to several on several processors

2015-04-21 Thread Goetz Heller
So far I have not found any information on how BaseX can be told how to use computing resources. The use case here is as follows: I get several megabytes of XML files each day, usually between 50 and 100 MB. These are organized in one database per day. Since most queries run on a daily basis this w