Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Christian Grün
> So by creating such an XQuery wrapper module, we would ensure that there is 
> only a single import of the Java class (which represents module m1) - 
> regardless of how many times the XQuery wrapper module is imported? If so, 
> then indeed that sounds like an option.

Exactly: The XQuery specification ensures that XQuery modules that are
referenced multiple times are only parsed once. As a consequence, the
Java object that will be created in that module won’t be created
multiple times. Just run the example code that I provided in my
earlier mail; it should make clear how this works.






Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Johannes Echterhoff
Again, thank you, Christian.

So by creating such an XQuery wrapper module, we would ensure that there is 
only a single import of the Java class (which represents module m1) - 
regardless of how many times the XQuery wrapper module is imported? If so, then 
indeed that sounds like an option.

Best regards,
Johannes




Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Christian Grün
Hi Johannes,

> m1 is a Java module, with de.interactive_instruments.module.MyQueryModule 
> being the Java class that extends QueryModule.
> …
> We end up having multiple instances, i.e. Java objects, of class 
> MyQueryModule, …

So the solution which I would recommend (and which I would generally
recommend when importing Java classes) is to write an XQuery wrapper
module for m1, and place the Java imports in that module (see [1] for
an example). This module will only exist once in your query context.
This approach has various other advantages: For example, you can work
with XQuery data types in all other modules, and only the wrapper
needs to ensure that the XQuery parameters will be correctly converted
into and back from Java types.

Does that sound like an option?
Christian

[1] http://docs.basex.org/wiki/Repository#Combined





Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Johannes Echterhoff
Thank you for your patience and support, Christian.

Then my second question has been answered as well. So basically, when the 
XQuery execution involves multiple imports of a Java module, regardless of 
packaging, BaseX may create multiple instances of the corresponding Java class.

I slightly updated your queries as follows (they are not executable, but they 
show our current setup):

testquery.xq:
-
import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';
import module namespace m2 = 'm2' at 'm2.xqm';
import module namespace m3 = 'm3' at 'm3.xqm';

m1:init(... some data ...),
m1:add('m1'),

m1:functionMno(),
m2:do(),
m3:do(),

m1:values()
-

m2.xqm:
-
module namespace m2 = 'm2';

import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';

declare function m2:do() {
  ... some xquery that uses a function from m1, e.g. m1:functionAbc() ...
};
-

m3.xqm:
-
module namespace m3 = 'm3';

import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';

declare function m3:do() {
  ... some xquery that uses a function from m1, e.g. m1:functionXyz() ...
};
-

m1 is a Java module, with de.interactive_instruments.module.MyQueryModule being 
the Java class that extends QueryModule.
m1 is initialised, i.e. fed with some information, using methods defined by 
class MyQueryModule.
Once that is done, expressions from testquery.xq call additional methods of 
MyQueryModule, either directly or indirectly (by calling functions from the 
XQuery modules m2 and m3).
We end up having multiple instances, i.e. Java objects, of class MyQueryModule, 
presumably due to the multiple imports of that module, and only one of them 
has been initialised with the necessary information. What we would have wanted 
is a single instance/object of MyQueryModule, which has the information given 
to it by calling initialisation functions in testquery.xq, and which would 
therefore have this information regardless of where one of the m1 functions is 
called afterwards (be it in testquery.xq, or the modules m2 and m3). 
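
The effect described above can be reproduced in plain Java, without BaseX. This is only a sketch under the assumption that each import yields its own object; PerImportModule and PerImportDemo are hypothetical names, and the real MyQueryModule would extend QueryModule. Because the data lives in an instance field, initialising one object leaves the other untouched:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for MyQueryModule (the real class extends QueryModule).
class PerImportModule {
  // Instance field: every object gets its own copy; null until init() is called.
  private List<String> values;

  public void init(String data) {
    values = new ArrayList<>();
    values.add(data);
  }

  public boolean isInitialised() {
    return values != null;
  }
}

class PerImportDemo {
  public static void main(String[] args) {
    // Assumption: one object per import of the Java module.
    PerImportModule fromMainQuery = new PerImportModule(); // import in testquery.xq
    PerImportModule fromM2 = new PerImportModule();        // import in m2.xqm

    // Only the object that belongs to the main query is initialised.
    fromMainQuery.init("some data");

    System.out.println("main query initialised: " + fromMainQuery.isInitialised());
    System.out.println("m2 initialised: " + fromM2.isInitialised());
  }
}
```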

So some kind of QueryModule singleton construct would be handy, but I doubt 
that would work with BaseX as-is (assuming that a default constructor is used 
to create an object of MyQueryModule when one of its methods is invoked). I 
guess we could also explicitly create the MyQueryModule object (as outlined in 
the FileWriter example on 
http://docs.basex.org/wiki/Java_Bindings#Namespace_Declarations) and pass it 
around as a function parameter, but that would increase the level of complexity 
in our XQuery code quite a bit, so I doubt that this would be desirable for us. 
Maybe having a QueryModule implementation that acts as a façade to a singleton 
which does the actual work ... but I digress.
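
A minimal sketch of the façade idea, again in plain Java and without any BaseX dependency. All class names are hypothetical, and the String parameter types are assumptions; the method names init, add and values are borrowed from the queries above. In the real setup the façade would be the class that extends QueryModule. Note that a static singleton exists once per JVM rather than once per query, so concurrent queries would share its state; that is a limitation of this sketch, not a statement about how BaseX handles modules.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical holder for the state that must exist only once.
final class SharedModuleState {
  private static final SharedModuleState INSTANCE = new SharedModuleState();

  private final List<String> values = new ArrayList<>();
  private boolean initialised;

  private SharedModuleState() { }

  static SharedModuleState get() {
    return INSTANCE;
  }

  // Resets the state and stores the initial data.
  synchronized void init(String data) {
    values.clear();
    values.add(data);
    initialised = true;
  }

  synchronized void add(String value) {
    if (!initialised) throw new IllegalStateException("init() has not been called");
    values.add(value);
  }

  // Returns a copy so that callers cannot modify the shared list.
  synchronized List<String> values() {
    return new ArrayList<>(values);
  }
}

// Hypothetical façade: any number of instances may exist, all delegate to the singleton.
class MyQueryModuleFacade {
  public void init(String data) { SharedModuleState.get().init(data); }
  public void add(String value) { SharedModuleState.get().add(value); }
  public List<String> values() { return SharedModuleState.get().values(); }
}

class FacadeDemo {
  public static void main(String[] args) {
    MyQueryModuleFacade a = new MyQueryModuleFacade(); // e.g. created for testquery.xq
    MyQueryModuleFacade b = new MyQueryModuleFacade(); // e.g. created for m2.xqm
    a.init("some data");
    b.add("m1");
    System.out.println(a.values());
  }
}
```

With this arrangement it no longer matters which façade object receives the init call, since every object reads and writes the same state.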

From what I've learned so far in this conversation, we'd better just ensure 
that there is only a single import of module m1.

Best regards,
Johannes




Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Christian Grün
> It may also answer my second question, where I was referring to expath 
> packaging and if that would make any difference - when compared to having a 
> pure JAR with the Java classes (and required libraries). It sounds like 
> expath packaging does not make a difference. Please correct me if I got this 
> wrong.

Right: EXPath is just another way of packaging the code. The XQuery
parser will handle all modules equally, no matter if they have
initially been packaged as XAR or via our own packaging mechanisms.

After having read your initial mail for a second time, I noticed I may
have got your setup a little wrong. I think it will be easier to find
a solution if we manage to construct a little example that shows the
behavior you reported. Otherwise, there may be too many open questions
to solve (how do you import and initialize your Java Code? is
initialization identical to creating a class instance? is the
initialization code embedded in global variables? etc).

I have attached a basic set of files that (I believe) simulates your
setup. Could you extend it for me, or comment back what I may have
wrongly understood?


Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Johannes Echterhoff
Hi Christian,

Thank you. If I understand you correctly, that answers my first question. 
It may also answer my second question, where I was referring to expath 
packaging and if that would make any difference - when compared to having a 
pure JAR with the Java classes (and required libraries). It sounds like expath 
packaging does not make a difference. Please correct me if I got this wrong.

You asked: Or should there only ever be a single such instance, regardless of 
how many times the module is imported? 
The answer is yes. That is because we call some specific functions first, to 
initialize the module, and we need the information conveyed by this 
initialization process throughout the whole execution of an xquery, regardless 
of how many times the module is imported (by that xquery, or indirectly via 
modules that the xquery imports).

Before we attempted to modularize all our custom xquery functions into 
a set of modules, the Java module worked fine, presumably because it was only 
imported once. Now the Java module is imported by the xquery and some xquery 
modules (that are imported by the xquery). And we end up with multiple 
instances of the Java module. I was wondering if any packaging approach would 
ensure that only a single instance of the Java module is created, regardless of 
how many times the module is imported. From what I heard so far, it looks like 
that is not the case. Please confirm, or correct me if I got it wrong.

If there is no guarantee that a Java module is instantiated only once when 
it is imported more than once, regardless of how the Java module is packaged, 
then we would simply go back to the original structure of our xquery (before 
the modularization attempt). I just need to understand what behavior to expect 
(from BaseX) when it processes imports of a Java module (provided as pure Java 
module, combined module, or using EXPath packaging). 

Best regards,
Johannes





Re: [basex-talk] How many QueryModule instances can be created?

2019-12-18 Thread Christian Grün
Hi Johannes,

> · Is it expected behavior that multiple instances of a Java 
> QueryModule (M1, in my scenario) may be created and used during an execution 
> of a query scenario like the one described above – or in general?

Yes, this is currently expected behavior; the import mechanisms of
XQuery and Java modules differ in various aspects. The customary way
to proceed is to organize the Java calls in a single XQuery module.
Both the Java class and the XQuery module can then optionally be
bundled as JAR file (see [1]).

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Repository#Combined



> Or should there only ever be a single such instance, regardless of
> how many times the module is imported?
>
> o   Note: The functions used to initialize M1 are non-deterministic and 
> declared as such. Not entirely sure if that makes a difference regarding how 
> many times M1 would be created.
>
> · I have the same questions for the case that the Java QueryModule M1 
> was packaged in a XAR (as described in 
> http://docs.basex.org/wiki/Repository#EXPath_Packaging). Would such a 
> packaging approach actually make any difference?
>
> Apologies that I do not have a small, self-contained example project to 
> demonstrate this. I hope that I explained the issue with sufficient detail 
> and clarity. If not, just let me know.
>
> Best regards,
>
> Johannes
>
> P.S.: If you have suggestions for a better approach of handling such a 
> scenario, where a QueryModule must be initialized before it can be used, and 
> there shall only be a single instance of this module within the execution of 
> an XQuery, let me know.
>
>


Re: [basex-talk] Huge No of XML files.

2019-12-18 Thread Liam R. E. Quin
On Wed, 2019-12-18 at 11:10 +0530, Sreenivasulu Yadavalli wrote:
> > 
> What exactly do you mean by moving collections around?.
> 
> A: moving the collections in the same system. 

So, you use the Linux "mv" command to do this? Or what?

What exactly do you mean by collections? I for one would find it easier
if you would stop talking in riddles, as my telepathy skills are weak.

> And every day we have to
> update the existing collection with call data. So finding the
> collection is
> taking more time

How do you look for the collection? Isn't it a separate BaseX database?

> 
> Are you taking a database with 100 million documents and renaming
> 50,000 of them?
> 
> What operations exactly are slow?
> 
> A: finding the existing collection.

find / -name collection.db ?

This is a little frustrating in that you are asking for people's help
but not explaining the problem. Are you saying that fn:collection() is
slow in BaseX? What arguments are you passing it exactly? What is the
size, in gigabytes, of the database, on disk? How many documents are in
it?

Can you give step-by-step EXACT AND PRECISE instructions so someone
else could reproduce the problem you are having? Complete and exact
instructions, with sample files if needed, so they can reproduce the
problem on their own computer?

A database with 80,000 files is easy to "find" here, and opens quickly,
in a small fraction of a second. It doesn't take hours.

Is something else running on your computer that makes it slow??

Note: please remember to copy the list in your replies, as the BaseX
people are far more knowledgeable about BaseX than I am :) My goal as
an analyst is to get you to explain the problem you are having clearly
enough that you can get an answer :)

Liam

-- 
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations:  http://www.fromoldbooks.org