Re: [basex-talk] How many QueryModule instances can be created?
> So by creating such an XQuery wrapper module, we would ensure that there is
> only a single import of the Java class (which represents module m1) -
> regardless of how many times the XQuery wrapper module is imported? If so,
> then indeed that sounds like an option.

Exactly: The XQuery specification ensures that XQuery modules that are referenced multiple times are only parsed once. As a consequence, the Java object that is created in that module won’t be created multiple times.

Just run the example code that I provided in my earlier mail; it should make clear how this works.
Re: [basex-talk] How many QueryModule instances can be created?
Again, thank you, Christian.

So by creating such an XQuery wrapper module, we would ensure that there is only a single import of the Java class (which represents module m1) - regardless of how many times the XQuery wrapper module is imported? If so, then indeed that sounds like an option.

Best regards,
Johannes

-----Original Message-----
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: Wednesday, 18 December 2019 17:48
To: Johannes Echterhoff
Cc: BaseX
Subject: Re: [basex-talk] How many QueryModule instances can be created?

Hi Johannes,

> m1 is a Java module, with de.interactive_instruments.module.MyQueryModule
> being the Java class that extends QueryModule.
> …
> We end up having multiple instances, i.e. Java objects, of class
> MyQueryModule, …

So the solution which I would recommend (and which I would generally recommend when importing Java classes) is to write an XQuery wrapper module for m1, and place the Java imports in that module (see [1] for an example). This module will only exist once in your query context.

This approach has various other advantages: For example, you can work with XQuery data types in all other modules, and only the wrapper needs to ensure that the XQuery parameters will be correctly converted into and back from Java types.

Does that sound like an option?
Christian

[1] http://docs.basex.org/wiki/Repository#Combined
Re: [basex-talk] How many QueryModule instances can be created?
Hi Johannes,

> m1 is a Java module, with de.interactive_instruments.module.MyQueryModule
> being the Java class that extends QueryModule.
> …
> We end up having multiple instances, i.e. Java objects, of class
> MyQueryModule, …

So the solution which I would recommend (and which I would generally recommend when importing Java classes) is to write an XQuery wrapper module for m1, and place the Java imports in that module (see [1] for an example). This module will only exist once in your query context.

This approach has various other advantages: For example, you can work with XQuery data types in all other modules, and only the wrapper needs to ensure that the XQuery parameters will be correctly converted into and back from Java types.

Does that sound like an option?
Christian

[1] http://docs.basex.org/wiki/Repository#Combined
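A minimal sketch of what such a wrapper module could look like, assuming the Java function names mentioned elsewhere in this thread (m1:init, m1:add, m1:functionAbc, m1:values). The module URI 'wrapper', the file name wrapper.xqm, the w: function names and the parameter types are hypothetical, and this is not the example from [1]:

```xquery
(: wrapper.xqm: hypothetical wrapper around the Java module from this thread. :)
module namespace w = 'wrapper';

(: The Java class is imported here and in no other module. :)
import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';

(: Forwards the initialisation data; the parameter type is an assumption. :)
declare function w:init($data as item()*) as item()* {
  m1:init($data)
};

(: Forwards a single value to the Java module. :)
declare function w:add($value as xs:string) as item()* {
  m1:add($value)
};

(: Forwards one of the functions used by other modules. :)
declare function w:function-abc() as item()* {
  m1:functionAbc()
};

(: Returns whatever the Java module has collected so far. :)
declare function w:values() as item()* {
  m1:values()
};
```

The main query and any other XQuery modules would then use `import module namespace w = 'wrapper' at 'wrapper.xqm';` and call the w: functions, rather than importing the Java class themselves. Running it requires the Java class to be available to BaseX, for example via the repository.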
Re: [basex-talk] How many QueryModule instances can be created?
Thank you for your patience and support, Christian. Then my second question has been answered as well. So basically, when the XQuery execution involves multiple imports of a Java module, regardless of packaging, BaseX may create multiple instances of the corresponding Java class.

I slightly updated your queries as follows (it is not executable, but it shows our current setup):

testquery.xq:
-----
import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';
import module namespace m2 = 'm2' at 'm2.xqm';
import module namespace m3 = 'm3' at 'm3.xqm';

m1:init(... some data ...),
m1:add('m1'),
m1:functionMno(),
m2:do(),
m3:do(),
m1:values()
-----

m2.xqm:
-----
module namespace m2 = 'm2';
import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';
declare function m2:do() {
  ... some xquery that uses a function from m1, e.g. m1:functionAbc() ...
};
-----

m3.xqm:
-----
module namespace m3 = 'm3';
import module namespace m1 = 'de.interactive_instruments.module.MyQueryModule';
declare function m3:do() {
  ... some xquery that uses a function from m1, e.g. m1:functionXyz() ...
};
-----

m1 is a Java module, with de.interactive_instruments.module.MyQueryModule being the Java class that extends QueryModule. m1 is initialised, i.e. fed with some information, using methods defined by class MyQueryModule. Once that is done, expressions from testquery.xq call additional methods of MyQueryModule, either directly or indirectly (by calling functions from the XQuery modules m2 and m3). We end up having multiple instances, i.e. Java objects, of class MyQueryModule, presumably due to the multiple imports of that module, and only one of them has been initialised with the necessary information.

What we would have wanted is a single instance/object of MyQueryModule, which has the information given to it by calling initialisation functions in testquery.xq, and which would therefore have this information regardless of where one of the m1 functions is called afterwards (be it in testquery.xq, or the modules m2 and m3).

So some kind of QueryModule singleton construct would be handy, but I doubt that would work with BaseX as-is (assuming that a default constructor is used to create an object of MyQueryModule when one of its methods is invoked). I guess we could also explicitly create the MyQueryModule object (as outlined in the FileWriter example on http://docs.basex.org/wiki/Java_Bindings#Namespace_Declarations) and pass it around as a function parameter, but that would increase the level of complexity in our XQuery code quite a bit, so I doubt that this would be desirable for us. Maybe having a QueryModule implementation that acts as a façade to a singleton which does the actual work ... but I digress.

From what I've learned so far in this conversation, we'd better just ensure that there is only a single import of module m1.

Best regards,
Johannes

-----Original Message-----
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: Wednesday, 18 December 2019 15:59
To: Johannes Echterhoff
Cc: BaseX
Subject: Re: [basex-talk] How many QueryModule instances can be created?

> It may also answer my second question, where I was referring to expath
> packaging and if that would make any difference - when compared to having a
> pure JAR with the Java classes (and required libraries). It sounds like
> expath packaging does not make a difference. Please correct me if I got this
> wrong.

Right: EXPath is just another way of packaging the code. The XQuery parser will handle all modules equally, no matter if they have initially been packaged as XAR or via our own packaging mechanisms.

After having read your initial mail for a second time, I noticed I may have got your setup a little wrong. I think it will be easier to find a solution if we manage to construct a little example that shows the behavior you reported. Otherwise, there may be too many open questions to solve (how do you import and initialize your Java Code? is initialization identical to creating a class instance? is the initialization code embedded in global variables? etc).

I have attached a basic set of files that (I believe) simulates your setup. Could you extend it for me, or comment back what I may have wrongly understood?
Re: [basex-talk] How many QueryModule instances can be created?
> It may also answer my second question, where I was referring to expath
> packaging and if that would make any difference - when compared to having a
> pure JAR with the Java classes (and required libraries). It sounds like
> expath packaging does not make a difference. Please correct me if I got this
> wrong.

Right: EXPath is just another way of packaging the code. The XQuery parser will handle all modules equally, no matter if they have initially been packaged as XAR or via our own packaging mechanisms.

After having read your initial mail for a second time, I noticed I may have got your setup a little wrong. I think it will be easier to find a solution if we manage to construct a little example that shows the behavior you reported. Otherwise, there may be too many open questions to solve (how do you import and initialize your Java Code? is initialization identical to creating a class instance? is the initialization code embedded in global variables? etc).

I have attached a basic set of files that (I believe) simulates your setup. Could you extend it for me, or comment back what I may have wrongly understood?
Re: [basex-talk] How many QueryModule instances can be created?
Hi Christian,

Thank you. If I understand you correctly, that answers my first question. It may also answer my second question, where I was referring to expath packaging and if that would make any difference - when compared to having a pure JAR with the Java classes (and required libraries). It sounds like expath packaging does not make a difference. Please correct me if I got this wrong.

You asked: Or should there only ever be a single such instance, regardless of how many times the module is imported?

The answer is yes. That is because we call some specific functions first, to initialize the module, and we need the information conveyed by this initialization process throughout the whole execution of an XQuery, regardless of how many times the module is imported (by that XQuery, or indirectly via modules that the XQuery imports).

Before we attempted to modularize all our custom XQuery functions into a set of modules, the Java module worked fine, presumably because it was only imported once. Now the Java module is imported by the XQuery and by some XQuery modules (that are imported by the XQuery), and we end up with multiple instances of the Java module.

I was wondering if any packaging approach would ensure that only a single instance of the Java module is created, regardless of how many times the module is imported. From what I have heard so far, it looks like that is not the case. Please confirm, or correct me if I got it wrong.

If there is no guarantee that a Java module is instantiated only once when it is imported more than once, regardless of how the Java module is packaged, then we would simply go back to the original structure of our XQuery (before the modularization attempt). I just need to understand what behavior to expect (from BaseX) when it processes imports of a Java module (provided as pure Java module, combined module, or using EXPath packaging).

Best regards,
Johannes

-----Original Message-----
From: Christian Grün [mailto:christian.gr...@gmail.com]
Sent: Wednesday, 18 December 2019 11:07
To: Johannes Echterhoff
Cc: BaseX
Subject: Re: [basex-talk] How many QueryModule instances can be created?

Hi Johannes,

> · Is it expected behavior that multiple instances of a Java
> QueryModule (M1, in my scenario) may be created and used during an execution
> of a query scenario like the one described above – or in general?

Yes, this is currently expected behavior; the import mechanisms of XQuery and Java modules differ in various aspects. The customary way to proceed is to organize the Java calls in a single XQuery module. Both the Java class and the XQuery module can then optionally be bundled as a JAR file (see [1]).

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Repository#Combined

> Or should there only ever be a single such instance, regardless of how many
> times the module is imported?
Re: [basex-talk] How many QueryModule instances can be created?
Hi Johannes,

> · Is it expected behavior that multiple instances of a Java
> QueryModule (M1, in my scenario) may be created and used during an execution
> of a query scenario like the one described above – or in general?

Yes, this is currently expected behavior; the import mechanisms of XQuery and Java modules differ in various aspects. The customary way to proceed is to organize the Java calls in a single XQuery module. Both the Java class and the XQuery module can then optionally be bundled as a JAR file (see [1]).

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Repository#Combined

> Or should there only ever be a single such instance, regardless of how many
> times the module is imported?
>
> o Note: The functions used to initialize M1 are non-deterministic and
> declared as such. Not entirely sure if that makes a difference regarding how
> many times M1 would be created.
>
> · I have the same questions for the case that the Java QueryModule M1
> was packaged in a XAR (as described in
> http://docs.basex.org/wiki/Repository#EXPath_Packaging). Would such a
> packaging approach actually make any difference?
>
> Apologies that I do not have a small, self-contained example project to
> demonstrate this. I hope that I explained the issue with sufficient detail
> and clarity. If not, just let me know.
>
> Best regards,
>
> Johannes
>
> P.S.: If you have suggestions for a better approach of handling such a
> scenario, where a QueryModule must be initialized before it can be used, and
> there shall only be a single instance of this module within the execution of
> an XQuery, let me know.
Re: [basex-talk] Huge No of XML files.
On Wed, 2019-12-18 at 11:10 +0530, Sreenivasulu Yadavalli wrote:
> > What exactly do you mean by moving collections around?
> A: moving the collections in the same system.

So, you use the Linux "mv" command to do this? Or what? What exactly do you mean by collections?

I for one would find it easier if you would stop talking in riddles, as my telepathy skills are weak.

> And every day we have to update the existing collection with call data.
> So finding the collection is taking more time

How do you look for the collection? Isn't it a separate BaseX database?

> > Are you taking a database with 100 million documents and renaming
> > 50,000 of them?
> > What operations exactly are slow?
> A: finding the existing collection.

find / -name collection.db ?

This is a little frustrating in that you are asking for people's help but not explaining the problem.

Are you saying that fn:collection() is slow in BaseX? What arguments are you passing it exactly? What is the size, in gigabytes, of the database on disk? How many documents are in it?

Can you give step-by-step EXACT AND PRECISE instructions so that someone else could reproduce the problem you are having? Complete and exact instructions, with sample files if needed, so they can reproduce the problem on their own computer?

A database with 80,000 files is easy to "find" here, and opens quickly, in a small fraction of a second. It doesn't take hours. Is something else running on your computer that makes it slow?

Note: please remember to copy the list in your replies, as the BaseX people are far more knowledgeable about BaseX than I am :)

My goal as an analyst is to get you to explain the problem you are having clearly enough that you can get an answer :)

Liam

--
Liam Quin, https://www.delightfulcomputing.com/
Available for XML/Document/Information Architecture/XSLT/XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org