Re: [basex-talk] Converting a Java Set to a BaseX sequence

2021-07-23 Thread Reece Dunn
On Fri, 23 Jul 2021 at 13:34, Christian Grün 
wrote:

> Hi Reece,
>
> That was a helpful hint. Some caching was going on indeed; it was
> introduced in a much older version of BaseX, and I noticed it did
> strange things in more recent versions. A new snapshot is available.
>

Thanks for the fast turnaround. I can confirm that the snapshot fixes that
issue.

Kind regards,
Reece


> Best,
> Christian
>
>
> On Fri, Jul 23, 2021 at 1:18 PM Reece Dunn  wrote:
> >
> > On Fri, 9 Jul 2021 at 13:01, Christian Grün 
> wrote:
> >>
> >> Hi all, hi Reece,
> >>
> >> I have remastered the conversion of Java values: Objects of unknown
> >> type are now returned as function item, and the conversion of the
> >> contained value can be enforced by invoking the function item:
> >
> >
> > Thanks. I've ported my code over to the latest 9.6 dev release, which is
> working aside from a strange caching issue.
> >
> > The following produces the correct output in the BaseX GUI:
> > ---
> > declare namespace String = "java:java.lang.String";
> >
> > declare function local:tokenize($text as xs:string) {
> >   String:split($text, " ")
> > };
> >
> > local:tokenize("Lorem ipsum dolor"),
> > local:tokenize("sed emit consecutor")
> > ---
> >
> > Note: I'm using `String:split($text, " ")` here as a demonstration of
> the issue.
> >
> > However, if I take my https://github.com/rhdunn/document-viewer code
> running on the BaseX HTTP server (via bin/basexhttp on AdoptOpenJDK
> 11.0.7+10), and in src/modules/html.xqy add:
> > ---
> > declare namespace String = "java:java.lang.String";
> >
> > declare function local:tokenize($text as xs:string) {
> >   String:split($text, " ")
> > };
> > ---
> >
> > and then modify the text() case in html:simplify from:
> > ---
> > if (contains($node, "margin-bottom: ")) then
> >   ()
> > else
> >   $node
> > ---
> > to
> > ---
> > if (contains($node, "margin-bottom: ")) then
> >   ()
> > else
> >   text { html:tokenize($node) }
> > ---
> > I just see whitespace (as if it is caching the first $node value).
> Changing it to:
> > ---
> > if (contains($node, "margin-bottom: ")) then
> >   ()
> > else if (normalize-space($node) eq "") then
> >   $node
> > else
> >   text { html:tokenize($node) }
> > ---
> > then I see the first non-whitespace text node repeated.
> >
> > If I then replace the `String:split($text, " ")` call with
> `tokenize($text)` I don't see the issue, so it seems to be related with the
> Java interop being cached.
> >
> > Kind regards,
> > Reece
>


Re: [basex-talk] Converting a Java Set to a BaseX sequence

2021-07-23 Thread Reece Dunn
On Fri, 9 Jul 2021 at 13:01, Christian Grün 
wrote:

> Hi all, hi Reece,
>
> I have remastered the conversion of Java values: Objects of unknown
> type are now returned as function item, and the conversion of the
> contained value can be enforced by invoking the function item:
>

Thanks. I've ported my code over to the latest 9.6 dev release, which is
working aside from a strange caching issue.

The following produces the correct output in the BaseX GUI:
---
declare namespace String = "java:java.lang.String";

declare function local:tokenize($text as xs:string) {
  String:split($text, " ")
};

local:tokenize("Lorem ipsum dolor"),
local:tokenize("sed emit consecutor")
---

Note: I'm using `String:split($text, " ")` here as a demonstration of the
issue.

However, if I take my https://github.com/rhdunn/document-viewer code
running on the BaseX HTTP server (via bin/basexhttp on AdoptOpenJDK
11.0.7+10), and in src/modules/html.xqy add:
---
declare namespace String = "java:java.lang.String";

declare function local:tokenize($text as xs:string) {
  String:split($text, " ")
};
---

and then modify the text() case in html:simplify from:
---
if (contains($node, "margin-bottom: ")) then
  ()
else
  $node
---
to
---
if (contains($node, "margin-bottom: ")) then
  ()
else
  text { html:tokenize($node) }
---
I just see whitespace (as if it is caching the first $node value). Changing
it to:
---
if (contains($node, "margin-bottom: ")) then
  ()
else if (normalize-space($node) eq "") then
  $node
else
  text { html:tokenize($node) }
---
then I see the first non-whitespace text node repeated.

If I then replace the `String:split($text, " ")` call with
`tokenize($text)` I don't see the issue, so it seems to be related to the
Java interop being cached.
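
For reference, this is the plain Java behaviour each call should reproduce:
every call returns the tokens of its own argument, independent of any
earlier call. A minimal sketch; SplitDemo is a hypothetical class name used
only for this illustration and is not part of the document-viewer code.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of BaseX or the document-viewer project.
final class SplitDemo {
    private SplitDemo() {}

    // Same operation as String:split($text, " ") in the XQuery above.
    static String[] tokenize(String text) {
        return text.split(" ");
    }

    public static void main(String[] args) {
        // Two calls with different arguments must give different results.
        System.out.println(Arrays.toString(tokenize("Lorem ipsum dolor")));
        System.out.println(Arrays.toString(tokenize("sed emit consecutor")));
    }
}
```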

Kind regards,
Reece


Re: [basex-talk] Converting a Java Set to a BaseX sequence

2021-06-30 Thread Reece Dunn
On Tue, 29 Jun 2021 at 15:49, Christian Grün 
wrote:

> Hi Reece,
>
> I implemented an initial version of convert:from-java [1, 2].
>

Great, thanks.


> Looking forward to your feedback and further suggestions,
>

Trying to use convert:from-java on a list of a custom Java object, I get:

[convert:java] Java object cannot be converted: "Word(text=test,
normalized=test)".

It should just marshal the Java object, as is done with the Java interop
in this case.

My initial testing on other cases (set-to-sequence) indicates that it is
slightly faster than the XQuery code I had -- I haven't measured them in
isolation, just on a test example I have.

Kind regards,
Reece

> Christian
>
> [1] https://github.com/BaseXdb/basex/issues/2017
> [2] https://files.basex.org/releases/latest/
>
>
> On Tue, Jun 29, 2021 at 1:56 PM Reece Dunn  wrote:
> >
> > On Tue, 29 Jun 2021 at 12:31, Christian Grün 
> wrote:
> >>
> >> Out of interest, and as you seem to have worked with both the Saxon
> >> and BaseX Java mapping: Did you encounter other mapping details that
> >> you believe are handled better in one of the processors?
> >
> >
> > I've not actually used Saxon's Java bindings, so I can't go into more
> details other than what the documentation says. This is currently the only
> project I'm using Java bindings for.
> >
> > I'm only aware of the Saxon logic as I plan at some point to have Java
> integration in my XQuery plugin so you can navigate to the Java
> class/method/etc., have it auto-complete methods, and perform some static
> analysis like checking the number of arguments.
> >
> > Kind regards,
> > Reece
>


Re: [basex-talk] Converting a Java Set to a BaseX sequence

2021-06-29 Thread Reece Dunn
On Tue, 29 Jun 2021 at 12:31, Christian Grün 
wrote:

> Out of interest, and as you seem to have worked with both the Saxon
> and BaseX Java mapping: Did you encounter other mapping details that
> you believe are handled better in one of the processors?
>

I've not actually used Saxon's Java bindings, so I can't go into more
details other than what the documentation says. This is currently the only
project I'm using Java bindings for.

I'm only aware of the Saxon logic as I plan at some point to have Java
integration in my XQuery plugin so you can navigate to the Java
class/method/etc., have it auto-complete methods, and perform some static
analysis like checking the number of arguments.

Kind regards,
Reece


Re: [basex-talk] Converting a Java Set to a BaseX sequence

2021-06-29 Thread Reece Dunn
On Tue, 29 Jun 2021 at 10:36, Christian Grün 
wrote:

> Hi Reece,
>
> Interesting thoughts. All I can say is that your iterator approach for
> sets looks pretty similar to something that I tried in the past.
>
> > More generally, it would be helpful for BaseX to have adapters for Java
> arrays, Lists, Sets, Maps, and Iterables/Iterators to XQuery (XDM) types
> and functions to construct them in XQuery (like my util:list-to-sequence
> function above).
>
> One way would be to add new built-in functions to BaseX (in the
> Conversion Module, or in a new Java Module) that provide custom
> conversion functions for data structures in Java. I guess it might be
> cleaner to convert lists and sets to arrays, as those data structures
> can also contain null references.
>

It would be useful to have Java Collection to sequence, Java Collection to
array(*) and Java Map to map(*) converters. Either the conversion module or
a Java helper module would be useful. Saxon does the Collection to sequence
automatically in its Java bindings -
https://www.saxonica.com/documentation10/index.html#!extensibility/functions/function-result
.

My rationale for not converting them to arrays is to avoid a performance
overhead when dealing with a large number of items, but I can see how null
values could be complicated to manage if the BaseX sequence interface
doesn't do flattening itself (otherwise, you could map null to the empty
sequence instance like with the general Java mapping).

Additionally, I'm working in Kotlin and have the list values as
non-nullable types, so that won't be an issue for my particular use case.
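
To illustrate the null case: if null is mapped to the empty sequence and the
result is flattened, the null entries simply disappear and later items shift
position. A sketch in Java of what that flattening amounts to.
NullFlattening is a hypothetical name, and this says nothing about how BaseX
implements its sequences.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical helper, only to show the effect of flattening nulls away.
final class NullFlattening {
    private NullFlattening() {}

    // Copies the values in iteration order, dropping null entries,
    // the way an empty sequence vanishes inside a larger sequence.
    static List<Object> flatten(Collection<?> values) {
        List<Object> result = new ArrayList<>();
        for (Object value : values) {
            if (value != null) {
                result.add(value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(flatten(Arrays.asList("a", null, "b")));
    }
}
```

An array(*) result would keep the positions, at the cost of the extra
wrapper per collection.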


> The main reason why we didn’t push this any further was that we didn’t
> want to give users additional incentives to resort to Java code. Many
> things can also be done in XQuery, and as the XQuery-Java mapping for
> data types can never be perfect, we experienced that users often
> stumbled upon these things in the beginning. However, quite obviously,
> there are always use cases in which a direct data exchange between
> XQuery and Java is helpful, and less cumbersome than writing custom
> Java functions with custom entry points for XQuery function calls (as
> e.g. documented in [2]).
>

Yeah. I'm experimenting with NLP and am passing the text through a
tokenization, stemming/lemmatization, part of speech, etc. pipeline which
looks something like this:

let $tokens := nlp:tokenize($node) => nlp:lemmatize() => nlp:pos-tag()
  => util:list-to-sequence()
for $token in $tokens
let $text := Token:get-text($token)
let $part-of-speech :=
  util:set-to-sequence(Token:get-part-of-speech($token))
return {$text}

I'm using Java (Kotlin more accurately) to do the logic that needs state to
implement (and possibly share with other projects), and tying it together
in XQuery.

> Maybe it would be good indeed to realize the set of additional
> functions as an XQuery module. We still haven’t defined a canonical
> way to promote and document external BaseX XQuery Modules – some users
> may remember that we have assembled existing modules on our server
> some time ago [1]; other modules, such as Leo’s algorithms and data
> structures, can be found on private repositories [3] – so ideas on how
> to get this better organized are welcome.
>

There is http://cxan.org/ but I don't know how active it currently is.

Kind regards,
Reece


> Cheers,
> Christian
>
> [1] https://files.basex.org/modules/
> [2] https://docs.basex.org/wiki/Repository#Combined
> [3] https://github.com/LeoWoerteler/xq-modules
>
>
> On Mon, Jun 28, 2021 at 6:04 PM Reece Dunn  wrote:
> >
> > Hi,
> >
> > I'm making use of the Java bindings in BaseX, with some of the functions
> returning List and Set types.
> >
> > For List I can adapt that to a sequence using:
> >
> > declare namespace List = "java:java.util.List";
> >
> > declare function util:list-to-sequence($list) {
> >   for $n in 0 to List:size($list) - 1
> >   return List:get($list, $n cast as xs:int)
> > };
> >
> > however, I'm not sure how to do the equivalent for Set (or more
> generally, any Iterator) without converting it to a list or array first,
> as Set only has size() and iterator() methods. Has anyone done this before?
> >
> > The best I can come up with is the following, which relies on the size
> of the set and the number of next calls in the iterator to be the same
> (where it should be checking hasNext):
> >
> > declare namespace Set = "java:java.util.Set";
> > declare namespace Iterator = "java:java.util.Iterator";
> >
> > declare function util:set-to-sequence($set) {
> >   let $iterator := Set:iterator($set)
> >   

[basex-talk] Converting a Java Set to a BaseX sequence

2021-06-28 Thread Reece Dunn
Hi,

I'm making use of the Java bindings in BaseX, with some of the functions
returning List and Set types.

For List I can adapt that to a sequence using:

declare namespace List = "java:java.util.List";

declare function util:list-to-sequence($list) {
  for $n in 0 to List:size($list) - 1
  return List:get($list, $n cast as xs:int)
};

however, I'm not sure how to do the equivalent for Set (or more
generally, any Iterator) without converting it to a list or array first,
as Set only has size() and iterator() methods. Has anyone done this before?

The best I can come up with is the following, which relies on the size of
the set and the number of next calls in the iterator to be the same (where
it should be checking hasNext):

declare namespace Set = "java:java.util.Set";
declare namespace Iterator = "java:java.util.Iterator";

declare function util:set-to-sequence($set) {
  let $iterator := Set:iterator($set)
  for $n in 0 to Set:size($set) - 1
  return Iterator:next($iterator)
};

More generally, it would be helpful for BaseX to have adapters for Java
arrays, Lists, Sets, Maps, and Iterables/Iterators to XQuery (XDM) types
and functions to construct them in XQuery (like my util:list-to-sequence
function above).
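
For comparison, the hasNext-driven loop is easy to express on the Java side.
A minimal sketch, assuming a small helper class on the classpath
(IteratorAdapter and toList are hypothetical names, not part of BaseX). Note
that it copies the items into a list, which is the intermediate step the
XQuery version above tries to avoid.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper class; the name is not part of BaseX.
final class IteratorAdapter {
    private IteratorAdapter() {}

    // Drains the iterator, checking hasNext() before every next() call
    // instead of relying on a separately obtained size.
    static <T> List<T> toList(Iterator<T> iterator) {
        List<T> items = new ArrayList<>();
        while (iterator.hasNext()) {
            items.add(iterator.next());
        }
        return items;
    }

    public static void main(String[] args) {
        Set<String> words = new LinkedHashSet<>(Arrays.asList("lorem", "ipsum", "dolor"));
        System.out.println(toList(words.iterator()));
    }
}
```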

Kind regards,
Reece


Re: [basex-talk] Reloading jars on a running http server.

2021-04-30 Thread Reece Dunn
Hi Christian,

I'm not seeing any exceptions in the console window, even when enabling the
debug setting. I'm using the AdoptOpenJDK 1.8. I also have AdoptOpenJDK 11,
but I assume that will have the issue you described.

It's a custom-built jar using Kotlin, built via gradle.

One thing that it could be is that I'm using Kotlin objects (not classes),
e.g.:

package test
object Test { fun f(): String = "test" }

and using it like:

declare namespace Test = "java:test.Test";

declare function test:f() as xs:string {
  Test:f(Test:INSTANCE())
};
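
For context, a Kotlin object declaration compiles to a final class with a
public static INSTANCE field and ordinary instance methods, which is why the
instance has to be passed as the first argument in the XQuery call above.
The sketch below is a rough Java equivalent plus a reflective call (read the
static field, then invoke f() on it). KotlinObjectDemo and
callViaReflection are hypothetical names, and the reflection is only a guess
at what a binding layer might do, not BaseX's actual code.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

// Hypothetical demo wrapper; only the nested Test class mirrors the Kotlin code.
final class KotlinObjectDemo {
    private KotlinObjectDemo() {}

    // Rough Java equivalent of: object Test { fun f(): String = "test" }
    static final class Test {
        public static final Test INSTANCE = new Test();

        private Test() {}

        public String f() {
            return "test";
        }
    }

    // Reads the static INSTANCE field, then invokes the instance method f() on it.
    static String callViaReflection() {
        try {
            Field instanceField = Test.class.getField("INSTANCE");
            Object receiver = instanceField.get(null); // static field, so no receiver
            Method f = Test.class.getMethod("f");
            return (String) f.invoke(receiver);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("reflective call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callViaReflection());
    }
}
```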

The build.gradle file is simple. It looks something like this (removing
things like the junit configuration):

-
buildscript {
    ext.kotlin_version = "1.4.32"
    ext.kotlin_stdlib = "kotlin-stdlib"
    ext.java_version = "1.8"

    repositories { mavenCentral() }
    dependencies {
        classpath "org.jetbrains.kotlin:kotlin-gradle-plugin:$kotlin_version"
    }
}

apply plugin: 'kotlin'

repositories {
    mavenCentral()
    mavenLocal()
}

compileKotlin { kotlinOptions { jvmTarget = java_version } }
compileTestKotlin { kotlinOptions { jvmTarget = java_version } }

dependencies {
    implementation "org.jetbrains.kotlin:$kotlin_stdlib:$kotlin_version"
}
-

You'll need to copy the kotlin-stdlib-1.4.32.jar file in addition to the
test.jar file to BaseX's lib directory.

Kind regards,
Reece

On Fri, 30 Apr 2021 at 07:48, Christian Grün 
wrote:

> Hi Reece,
>
> I’m sorry to hear that. Did you build a custom JAR file, or do you
> encounter problems with the JDK?
>
> Cheers,
> Christian
>
>
>
> On Thu, Apr 29, 2021 at 9:48 PM Reece Dunn  wrote:
> >
> > Hi Christian,
> >
> > Thanks for the response. Unfortunately, I've not been able to get the
> reloading working.
> >
> > Kind regards,
> > Reece
> >
> > On Wed, 21 Apr 2021 at 18:49, Christian Grün 
> wrote:
> >>
> >> Hi Reece,
> >>
> >> If you install your Java code as JAR file in the repository [1], the
> >> code will be loaded and unloaded every time when your query is
> >> executed. If you get an error message…
> >>
> >>   java.lang.reflect.InaccessibleObjectException: Unable to make field
> >> private final jdk.internal.loader.URLClassPath
> >> java.net.URLClassLoader.ucp accessible: module java.base does not
> >> "opens java.net" to unnamed module @79e2c065
> >>
> >> …unloading fails [2], as you’re probably using a more recent version
> >> of the JDK, which restricts reflective access to internal variables.
> >> You can get around this by adding Java flags at startup time:
> >>
> >>  --add-opens java.base/java.net=ALL-UNNAMED
> >>  --add-opens java.base/jdk.internal.loader=ALL-UNNAMED
> >>
> >> Maybe there are better solutions to unload JAR files today.
> >> Suggestions are welcome!
> >>
> >> Hope this helps,
> >> Christian
> >>
> >> [1] https://docs.basex.org/wiki/Repository#Java
> >> [2]
> https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/JarLoader.java#L34
> >>
> >>
> >>
> >> On Tue, Apr 20, 2021 at 6:44 PM Reece Dunn 
> wrote:
> >> >
> >> > Hi all,
> >> >
> >> > I'm working on a Java class that I'm importing into an XQuery, so I
> can do additional processing on the data that isn't easily expressible in
> XQuery (or XSLT). In order to get BaseX to pick up a modified version of
> the jar file I'm building, I'm restarting the BaseX http server.
> >> >
> >> > This makes it slower to turn around testing the changes. Is there a
> better way of doing this?
> >> >
> >> > Kind regards,
> >> > Reece
>


[basex-talk] Getting profile information in server responses.

2021-04-29 Thread Reece Dunn
Hi all,

In BaseX, is there a way to get the profile timings (compile, run, print,
etc.) in the response of a HTTP request via the Server-Timing header (
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Server-Timing)?
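
To make the request concrete, the header value would look something like
what the sketch below builds. The metric names and numbers are made up, and
nothing here is an existing BaseX option; it only shows the Server-Timing
syntax (name;dur=milliseconds, entries separated by commas).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical formatter; ServerTiming is not a BaseX class.
final class ServerTiming {
    private ServerTiming() {}

    // Formats each timing (in milliseconds) as "name;dur=value" and joins
    // the entries with ", " in the map's iteration order.
    static String headerValue(Map<String, Double> timingsMs) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, Double> entry : timingsMs.entrySet()) {
            joiner.add(String.format(Locale.ROOT, "%s;dur=%.2f", entry.getKey(), entry.getValue()));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        Map<String, Double> timings = new LinkedHashMap<>();
        timings.put("compile", 1.5);
        timings.put("run", 20.25);
        System.out.println("Server-Timing: " + headerValue(timings));
    }
}
```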

Kind regards,
Reece


Re: [basex-talk] Reloading jars on a running http server.

2021-04-29 Thread Reece Dunn
Hi Christian,

Thanks for the response. Unfortunately, I've not been able to get the
reloading working.

Kind regards,
Reece

On Wed, 21 Apr 2021 at 18:49, Christian Grün 
wrote:

> Hi Reece,
>
> If you install your Java code as JAR file in the repository [1], the
> code will be loaded and unloaded every time when your query is
> executed. If you get an error message…
>
>   java.lang.reflect.InaccessibleObjectException: Unable to make field
> private final jdk.internal.loader.URLClassPath
> java.net.URLClassLoader.ucp accessible: module java.base does not
> "opens java.net" to unnamed module @79e2c065
>
> …unloading fails [2], as you’re probably using a more recent version
> of the JDK, which restricts reflective access to internal variables.
> You can get around this by adding Java flags at startup time:
>
>  --add-opens java.base/java.net=ALL-UNNAMED
>  --add-opens java.base/jdk.internal.loader=ALL-UNNAMED
>
> Maybe there are better solutions to unload JAR files today.
> Suggestions are welcome!
>
> Hope this helps,
> Christian
>
> [1] https://docs.basex.org/wiki/Repository#Java
> [2]
> https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/util/JarLoader.java#L34
>
>
>
> On Tue, Apr 20, 2021 at 6:44 PM Reece Dunn  wrote:
> >
> > Hi all,
> >
> > I'm working on a Java class that I'm importing into an XQuery, so I can
> do additional processing on the data that isn't easily expressible in
> XQuery (or XSLT). In order to get BaseX to pick up a modified version of
> the jar file I'm building, I'm restarting the BaseX http server.
> >
> > This makes it slower to turn around testing the changes. Is there a
> better way of doing this?
> >
> > Kind regards,
> > Reece
>


[basex-talk] Reloading jars on a running http server.

2021-04-20 Thread Reece Dunn
Hi all,

I'm working on a Java class that I'm importing into an XQuery, so I can do
additional processing on the data that isn't easily expressible in XQuery
(or XSLT). In order to get BaseX to pick up a modified version of the jar
file I'm building, I'm restarting the BaseX http server.

This makes it slower to turn around testing the changes. Is there a better
way of doing this?

Kind regards,
Reece


Re: [basex-talk] GUI Stack Trace

2020-06-07 Thread Reece Dunn
Hi Christian,

The following is the relevant part of the stack trace:

>
org.basex.gui.view.project.ProjectCellRenderer.getTreeCellRendererComponent(ProjectCellRenderer.java:29)

It is difficult to spot among the rest of the Java function calls.
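
One way to pick such frames out programmatically is to filter the trace by
package prefix. A small sketch, assuming you have the Throwable at hand;
StackTraceFilter and framesFrom are hypothetical names, not something BaseX
provides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for narrowing a stack trace down to one code base.
final class StackTraceFilter {
    private StackTraceFilter() {}

    // Returns the frames whose declaring class name starts with the given
    // package prefix (for example "org.basex."), in their original order.
    static List<StackTraceElement> framesFrom(Throwable error, String packagePrefix) {
        List<StackTraceElement> frames = new ArrayList<>();
        for (StackTraceElement frame : error.getStackTrace()) {
            if (frame.getClassName().startsWith(packagePrefix)) {
                frames.add(frame);
            }
        }
        return frames;
    }

    public static void main(String[] args) {
        // Prints the frames of this demo class found in a fresh exception.
        RuntimeException error = new RuntimeException("demo");
        System.out.println(framesFrom(error, "StackTraceFilter"));
    }
}
```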

Kind regards,
Reece

On Sun, 7 Jun 2020 at 07:34, Christian Grün 
wrote:

> Yes, this problem seems to be caused by the combination of your Linux
> environment and the Java version (the stack trace only contains paths to
> standard Java classes). Which JDK do you use?
>
>
>
>
> Ben Pracht  schrieb am So., 7. Juni 2020, 06:09:
>
>> Hi Folks,
>>
>> I got this error trying to specify a directory location when creating a
>> database using the GUI by dialog.  Creating the database via command
>> works.  I feel like this is somehow not the fault of BaseX because I don't
>> see this on my other Linux machine.  Nonetheless, I'd like recommendations
>> on how the Linux side of the basex community works.
>>
>> My setup:
>> cat /etc/*release
>> Fedora release 30 (Thirty)
>> NAME=Fedora
>> VERSION="30 (Workstation Edition)"
>> ID=fedora
>> VERSION_ID=30
>> VERSION_CODENAME=""
>> PLATFORM_ID="platform:f30"
>> PRETTY_NAME="Fedora 30 (Workstation Edition)"
>> ANSI_COLOR="0;34"
>> LOGO=fedora-logo-icon
>> CPE_NAME="cpe:/o:fedoraproject:fedora:30"
>> HOME_URL="https://fedoraproject.org/"
>> DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora/f30/system-administrators-guide/"
>> SUPPORT_URL="https://fedoraproject.org/wiki/Communicating_and_getting_help"
>> BUG_REPORT_URL="https://bugzilla.redhat.com/"
>> REDHAT_BUGZILLA_PRODUCT="Fedora"
>> REDHAT_BUGZILLA_PRODUCT_VERSION=30
>> REDHAT_SUPPORT_PRODUCT="Fedora"
>> REDHAT_SUPPORT_PRODUCT_VERSION=30
>> PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
>> VARIANT="Workstation Edition"
>> VARIANT_ID=workstation
>> Fedora release 30 (Thirty)
>> Fedora release 30 (Thirty)
>>
>> export |grep DESKTOP
>> declare -x DESKTOP_SESSION="/usr/share/xsessions/cinnamon2d"
>> declare -x GNOME_DESKTOP_SESSION_ID="this-is-deprecated"
>> declare -x IMSETTINGS_INTEGRATE_DESKTOP="yes"
>> declare -x XDG_CURRENT_DESKTOP="X-Cinnamon"
>> declare -x XDG_SESSION_DESKTOP=""
>>
>>
>> [bpracht2@pracht-office-closet basex-rent-world]$ java -cp
>> /home/bpracht/rent-world/basex-rent-world/BaseX.jar:/home/bpracht/rent-world/basex-rent-world/lib/custom/*:/home/bpracht/rent-world/basex-rent-world/lib/*:
>> -Xms2g -Xmx4g org.basex.BaseXGUI
>>
>> Exception in thread "AWT-EventQueue-1" java.lang.NullPointerException
>> at javax.swing.JLabel.setIcon(JLabel.java:406)
>> at
>> org.basex.gui.view.project.ProjectCellRenderer.getTreeCellRendererComponent(ProjectCellRenderer.java:29)
>> at
>> javax.swing.plaf.basic.BasicTreeUI$NodeDimensionsHandler.getNodeDimensions(BasicTreeUI.java:2807)
>> at
>> javax.swing.tree.AbstractLayoutCache.getNodeDimensions(AbstractLayoutCache.java:492)
>> at
>> javax.swing.tree.FixedHeightLayoutCache.getBounds(FixedHeightLayoutCache.java:553)
>> at
>> javax.swing.tree.FixedHeightLayoutCache.getBounds(FixedHeightLayoutCache.java:199)
>> at
>> javax.swing.tree.AbstractLayoutCache.getPreferredHeight(AbstractLayoutCache.java:190)
>> at
>> javax.swing.plaf.basic.BasicTreeUI.updateCachedPreferredSize(BasicTreeUI.java:1902)
>> at
>> javax.swing.plaf.basic.BasicTreeUI.getPreferredSize(BasicTreeUI.java:2003)
>> at
>> javax.swing.plaf.basic.BasicTreeUI.getPreferredSize(BasicTreeUI.java:1991)
>> at javax.swing.JComponent.getPreferredSize(JComponent.java:1662)
>> at
>> javax.swing.ScrollPaneLayout.layoutContainer(ScrollPaneLayout.java:791)
>> at java.awt.Container.layout(Container.java:1513)
>> at java.awt.Container.doLayout(Container.java:1502)
>> at java.awt.Container.validateTree(Container.java:1698)
>> at java.awt.Container.validate(Container.java:1633)
>> at javax.swing.RepaintManager$3.run(RepaintManager.java:711)
>> at javax.swing.RepaintManager$3.run(RepaintManager.java:709)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at
>> java.security.ProtectionDomain$JavaSecurityAccessImpl.doIntersectionPrivilege(ProtectionDomain.java:74)
>> at
>> javax.swing.RepaintManager.validateInvalidComponents(RepaintManager.java:708)
>> at
>> javax.swing.RepaintManager$ProcessingRunnable.run(RepaintManager.java:1731)
>> at
>> java.awt.event.InvocationEvent.dispatch(InvocationEvent.java:311)
>> at java.awt.EventQueue.dispatchEventImpl(EventQueue.java:758)
>> at java.awt.EventQueue.access$500(EventQueue.java:97)
>> at java.awt.EventQueue$3.run(EventQueue.java:709)
>> at java.awt.EventQueue$3.run(EventQueue.java:703)
>> Best Regards,
>> Ben Pracht
>> ben.pra...@gmail.com
>>
>


Re: [basex-talk] I have been given control of xqDoc from Darin McBeath

2019-11-05 Thread Reece Dunn
On Tue, 5 Nov 2019 at 19:56, Loren Cahlander 
wrote:

>
> Hello folks,
>
> June of 2018, I updated the xqDoc codebase to be able to handle the
> additions to the language for XQuery 3.1 and most of the vendor specific
> language extensions.  I updated the parser to Antlr 4.
>
> Darin McBeath, the creator of xqDoc, asked me if I wanted to take over the
> domain.  I accepted.  I will be updating the website http://xqDoc.org and
> would like to get the community’s input.
>
> What changes would people like to see in the website?
>

Not specifically related to the website, but the xqDoc specification, here
are some of my thoughts:

1.  It would be nice to use Markdown in addition to HTML for the comments
(maybe via a `@format [html|markdown]` file-level annotation).
2.  Support for grouping signatures, like for
https://www.w3.org/TR/xpath-functions-31/#func-data to avoid repeating
documentation.
3.  Support for grouping related APIs like the accessors, trigonometric
functions, etc. in the W3C spec.
4.  Providing a way to support the documentation style from the W3C specs
(arguments described in place, custom sections -- summary, signatures,
properties, rules, error conditions, notes, examples).
5.  Reference a parameter in the documentation text.
6.  More annotation options for more complex documentation (maybe aligning
with some of the Doxygen functionality? --
http://www.doxygen.nl/manual/commands.html).

Kind regards,
Reece


> xqDoc’s GitHub is officially at http://github.com/xqdoc
>
> Thank you,
>
> Loren Cahlander


Re: [basex-talk] importing namespace declarations into the main module

2019-08-11 Thread Reece Dunn
Hi Christian,

Would that allow just prolog header statements (import statements,
declarations), or also allow prolog body statements (functions, variables,
options, etc.)?

Would that include imported schema types from the included modules' schema
import statements?

If it is just a simple string substitution, how do you deal with things
like the included modules having their own VersionDecl or ModuleDecl
statements, and breaking the order of header and body statements?

How do you resolve potential namespace redeclarations and multiple
conflicting/overlapping import statements being pulled in?

Both Java and Python allow you to specify which statements (classes) to
import, so maybe something like:

import prolog namespaces a, b, c from "path/to/module.xqm"

NOTE: This would be something worth raising on the xpath-ng project.

Kind regards,
Reece

On Sun, 11 Aug 2019 at 19:19, Christian Grün 
wrote:

> Hi Graydon,
>
> We could possibly introduce an include statement:
>
> include 'path/to/module.xqm';
>
> In contrast to 'import module', the contents of the addressed file would
> simply be inserted as string into the original module. The inserted string
> could contain any other declarations that are allowed in the prolog of a
> query.
>
> More suggestions are welcome; I would be particularly interested to know
> whether other users would benefit from such an extension as well.
>
> Christian
>
>
>
>
> Graydon Saunders  schrieb am So., 11. Aug. 2019,
> 20:07:
>
>> Hi Christian --
>>
>> Appreciate the confirmation!
>>
>> Any chance of some syntactic sugar for this in a future BaseX release?
>>
>> The use case is writing a bunch of distinct queries to pull stuff out of
>> complex formats like OOXML or Opendocument; there are many namespaces
>> involved, it's important to have them all defined, and it'd be nice to be
>> able to abstract groups of definitions for re-use across queries.  So some
>> way to specify "get that bunch of definitions in the current context" would
>> be nice.  (But, admittedly, by no means necessary.)
>>
>> Thanks!
>> Graydon
>>
>> On Sun, Aug 11, 2019 at 8:36 AM Christian Grün 
>> wrote:
>>
>>> Hi Graydon,
>>>
>>> Your assumptions were correct: If namespaces are declared in another
>>> module, they will be only valid in the scope of that module, and not
>>> in the importing module.
>>>
>>> If your local element names are unique, and if you prefer short path
>>> expressions, you can always use a wildcard prefix (*:...); but that
>>> answer is actually not part of your question anymore ;)
>>>
>>> Best
>>> Christian
>>>
>>>
>>>
>>> On Fri, Aug 9, 2019 at 9:14 PM Graydon Saunders 
>>> wrote:
>>> >
>>> > Hi --
>>> >
>>> > I'm pretty sure this isn't a thing, but I thought I'd ask.
>>> >
>>> > I have a raft of namespace declarations because I'm pulling
>>> information out of Open Document documents.  I'd like to put all thirty-odd
>>> of these declarations in their own file and import that, but I'm pretty
>>> sure I can't because that imported module would need its own namespace and
>>> this would keep its internal namespace declarations from being in scope for
>>> the main module, where I actually want them declared.
>>> >
>>> > Is there a way to do this?  It's not critical, it's an outbreak of
>>> neatness, but it would be a nice neatness.
>>> >
>>> > Thanks!
>>> > Graydon
>>>
>>


Re: [basex-talk] Query profiling and debugging support in BaseX

2019-04-22 Thread Reece Dunn
On Mon, 22 Apr 2019 at 11:21, Christian Grün 
wrote:

> That’s good to know. Once we decide to stabilize our profiling and
> debugging features and make them public, we’ll have a closer look into the
> tracing API of Saxon.
>

Great.


> Thanks
> Christian
>
> PS: Our team member Sabine extended our Wiki article on the integration of
> IntelliJ. She might soon give you some feedback on her experiences with
> BaseX and your plugin.
>

That would be useful.

NOTE: In version 1.4 it is now possible to select "Active editor file" as
the query to run, and specify a context item (currently strings only). I
have also changed the output to put the item information into a separate
table (see https://plugins.jetbrains.com/files/8612/screenshot_19185.png).

Some of the things I am thinking about for the run configurations are:
1.  Supporting variables/parameters.
2.  Supporting specifying the type of the context item and variables.
3.  Highlighting single result items according to the resulting mimetype
(text, xml, json, html, rdf, etc.).
4.  Support for viewing csv, json, etc. results in a tabular form in
addition to the text output.
5.  Supporting saving the output to a file, with an option to open the
result in IntelliJ, a web browser, or some other application.
6.  Providing the ability to edit BaseX configuration files from within the
plugin with a UI specific to the BaseX configuration structure.
7.  Support navigating from the query error stack trace to the file that
caused the error.

Some other things I am thinking about are:
1.  An IntelliJ inspection to validate XQuery files using the query
processor (e.g. using BaseX's xquery:parse API).
2.  Log viewer integration for admin:logs.
3.  Unit test integration (IntelliJ test view for the unit module).
4.  Documentation integration, xqDoc syntax validation, and support for
BaseX xqDoc generation via the inspection module.
5.  Database navigation (list the database contents, view/add/edit/delete
files, view file properties).

Kind regards,
Reece


Re: [basex-talk] Query profiling and debugging support in BaseX

2019-04-22 Thread Reece Dunn
On Mon, 8 Apr 2019 at 15:40, Christian Grün 
wrote:

> Hi Reece,
>
> > I'm wondering how you would go about profiling a query in BaseX to
> determine how long various query statements take, so you can look at the
> places where you need to modify the query to be faster. I have added
> MarkLogic's profiling support into my XQuery plugin that does this and am
> wondering how this can be done in BaseX.
>
> Have you already embedded support for the Saxon debugger in IntelliJ?
>

The current development version of my plugin adds support for the Saxon
tracing API, and I am using that to generate a profile report similar to the
one for MarkLogic. This is working well. I am still investigating debugging
support for both Saxon and MarkLogic.

A similar tracing API could be useful for BaseX. Specifically, the ability
to add trace objects into the evaluation pipeline that call events on a
listener class before and after evaluation. I could make use of that API in
my plugin to provide profiling and debugging functionality.

I'm still planning on using the timing information and query plan from
BaseX.

Kind regards,
Reece


Re: [basex-talk] Func def & performance: element()* vs item()*

2019-04-11 Thread Reece Dunn
Hi Christian,

On Thu, 11 Apr 2019 at 13:37, Christian Grün 
wrote:

> Hi Chuck,
>
> Martin already suggested that map construction via map:merge is
> preferable and faster (my personal experience is that there are just
> few cases in which map:put is a better choice).
>
> Your query was an interesting one, though. In various cases, we drop
> type information at runtime, as it can be expensive to decorate all
> newly generated sequences with the correct type. As a result, the type
> of your function arguments is verified every time the function is
> called, and this takes additional time.
>
> But as it’s always recommendable to declare types, and as this is not
> the first time that this is chasing me, I had some more thoughts, and
> I have found a good answer on how to generally improve typing at
> runtime! You can already be sure that your query will benefit from the
> upcoming optimizations, i.e., with BaseX 9.2.
>

You may be interested in my
https://github.com/rhdunn/xquery-intellij-plugin/blob/master/docs/XQuery%20IntelliJ%20Plugin%20Data%20Model.md
document. It is the result of previous investigations in supporting static
type analysis in my XQuery plugin. Specifically:
1.  3.2.1 Item Type Union -- computing the best matching union type of two
item types.
2.  3.2.2 Sequence Type Union -- computing the union of two sequences for
use in disjoint expressions such as the if and else branches of an IfExpr.
3.  3.2.3 Sequence Type Addition -- computing the resulting type that best
matches an Expr.

The advantage of this is that the type information can be computed at
compile time.

I was able to get a basic prototype implementation working for some
expressions, and have tested the logic for the rules in that document. I
haven't worked on this recently, as I have been adding other features to my
plugin.

Kind regards,
Reece

> Due to this, and due to some other minor optimizations that are still
> in progress, we decided to delay the release until beginning of next
> week.
>
> Cheers
> Christian
>
>
>
> On Thu, Apr 11, 2019 at 12:10 AM Chuck Bearden 
> wrote:
> >
> > BaseX is a great tool for analyzing & characterizing large amounts of
> > XML data. I have used it both at work and on personal projects. I hope
> > the following observation is useful.
> >
> > When I define a function that recurs over a sequence of elements in
> > order to build a map of element name counts, I find that when I
> > specify the type of the element sequence as 'element()*', the function
> > runs so slowly that I give up after 5 minutes or so. But when I
> > specify the type as 'item()*', it finishes in 40 seconds or less.
> > Here's an example:
> >
> > -begin code snippet-
> > declare namespace local="w00fw00f";
> > declare function local:count($elems as element()*, $elem_counts as map(*))
> > as map(*) {
> >   let $elem := head($elems),
> >       $elem_name := $elem/name(),
> >       $elems_new := tail($elems),
> >       $elem_name_count := if (map:contains($elem_counts, $elem_name))
> >                           then map:get($elem_counts, $elem_name) + 1
> >                           else 1,
> >       $elem_counts_new := map:put($elem_counts, $elem_name, $elem_name_count)
> >   return if (count($elems_new) = 0)
> >          then $elem_counts_new
> >          else local:count($elems_new, $elem_counts_new)
> > };
> >
> > let $coll := collection('pure_20190402'),
> > $elems := $coll/result/items/*,
> > $elem_names_map := local:count($elems, map {})
> > return json:serialize($elem_names_map, map {'format' : 'xquery'})
> > -end code snippet-
> >
> > In the function declaration, changing "$elems as element()*" to
> > "$elems as item()*" makes the difference in performance. Replacing the
> > JSON serialization with a standard XML one does not change the
> > performance. I am running BaseX 9.1.2 under Ubuntu 16.04.6.
> >
> > All the best,
> > Chuck Bearden
>
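
For comparison, the map:merge approach recommended above might look roughly
like the following. This is only a sketch: it reuses the collection name and
path from Chuck's query, and it has not been timed against the recursive
version.

```xquery
(: Count element names by grouping, then build the map in one step
   with map:merge instead of recursing with map:put. :)
let $elems := collection('pure_20190402')/result/items/*
let $elem_names_map := map:merge(
  for $elem in $elems
  group by $name := name($elem)
  (: After grouping, $elem holds every element that shares $name. :)
  return map { $name : count($elem) }
)
return json:serialize($elem_names_map, map { 'format' : 'xquery' })
```

Because there is no user-defined recursive function, there are no declared
argument types to re-check on every call.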


Re: [basex-talk] Query profiling and debugging support in BaseX

2019-04-10 Thread Reece Dunn
On Wed, 10 Apr 2019 at 16:30, Christian Grün 
wrote:

> A little update:
>
> • I have added a FULLPLAN option [1]. If it’s enabled, information on
> the original query string will be attached to the query plan.
> • The output of the session info and the internal XQuery command info
> has been aligned (the query plan will now be returned to the database
> client as well).
>

Thanks for the quick update. I'll see what I can do for the next update to
my plugin.

Kind regards,
Reece


> The release of BaseX 9.2 is scheduled for tomorrow.
>
> [1] http://docs.basex.org/wiki/Options#FULLPLAN
>
>
>
> On Tue, Apr 9, 2019 at 2:16 PM Christian Grün 
> wrote:
> >
> > > The MarkLogic profile integration generates results like in
> https://plugins.jetbrains.com/files/8612/screenshot_19187.png. It could
> be useful to have something similar in BaseX. I would envision that this
> would apply the optimization steps first (function call rewriting, constant
> count evaluation, index access, etc.) and then profile the resulting query.
> It could even be useful in figuring out what queries to rewrite in the
> BaseX optimization pass.
> >
> > Thanks for the example, I’ll take this into consideration.
> >
> > > Something that might be interesting/useful is being able to relate the
> query plan to the original source. Likewise for the information on applying
> indices. That would help when working on more advanced integration, such as
> annotating the types that BaseX evaluates variables, expressions, etc. to.
> >
> > I have slightly extended the code for generating the query plan. With
> > the latest snapshot, the query…
> >
> >   for $x in 1 to 5 return $x * $x
> >
> > …will yield the following query plan:
> >
> > [query plan XML not preserved in the archive]
> >
> > Another example (that involves some rewritings):
> >
> >   let $gauss := function($x) { sum(1 to $x) }
> >   for $i in 1 to 2
> >   return <x>{ $gauss(1) }</x>
> >
> > The optimized query looks as follows:
> >
> >   for $i_1 in util:replicate("", 2)
> >   return element x { (50005000) }
> >
> > The corresponding query plan (the COMPPLAN option can be turned off to
> > get the plan of the unoptimized AST):
> >
> > [query plan XML not preserved in the archive; surviving text: x_, 50005000]
> >
> > I’m still hesitant to include the line and column numbers in the
> > official release, as it would make sense to also include the base URI
> > of the module for each expression if other modules are imported (and
> > this would bloat the plan noticeably). Maybe we could add yet another
> > option for generating a more comprehensive version of the query plan.
> >
> >
> > > Q: Is there any documentation on the format that the query errors can
> take?
> > > Q: How are query errors with function call stacks formatted, including
> from other modules?
> >
> > Here you can find the construction of the error message and the stack
> trace:
> >
> >
> https://github.com/BaseXdb/basex/blob/c813a426267b542b4500ecad0ceca84e2824fc74/basex-core/src/main/java/org/basex/query/QueryException.java#L239-L248
> >
> > Hope this helps,
> > Christian
>


Re: [basex-talk] Query profiling and debugging support in BaseX

2019-04-08 Thread Reece Dunn
Hi Christian,

On Mon, 8 Apr 2019 at 15:40, Christian Grün 
wrote:

> Hi Reece,
>
> First of all, thanks for your marvellous and always up-to-date
> XQuery/BaseX plugin for IntelliJ. We have included it in our Wiki just
> recently [1].
>

Thanks :).


> > I could display the timings, like in the BaseX GUI. Is there a way to
> get this information using the org.basex.api.client interfaces?
> > Is there a way to get the query plan, pre-evaluation mapping, and
> optimized query for the given query?
>
> I have attached a little Java example (QueryInfoExample.java) that
> uses the session API (the API can be used locally or with a running
> server instance). It demonstrates how the GUI timings can be requested
> via the client interface. The second code snippet
> (QueryPlanExample.java) goes one level deeper; it illustrates how the
> internal query plan can be accessed.


Thanks for providing the examples.


> While the format of the query
> plan has been modified in the past quite frequently, the structure of
> the textual output hasn’t changed for a longer time now, so it may be
> safer to work with the output of the first example.
>

I may just end up outputting the query plan in the XML format that BaseX
provides, as is done in the BaseX GUI.


> > I'm wondering how you would go about profiling a query in BaseX to
> determine how long various query statements take, so you can look at the
> places where you need to modify the query to be faster. I have added
> MarkLogic's profiling support into my XQuery plugin that does this and am
> wondering how this can be done in BaseX.
>
> The current version of BaseX has no built-in features for debugging
> and profiling code in a programmatical way. We had developed some
> prototypical code in the past; so far, it hasn’t been included in the
> official version due to performance concerns, and the fact that
> profiled code cannot be rewritten that aggressively (profiled code
> will often yield other timings than fully optimized code).
>

That makes sense.

The MarkLogic profile integration generates results like in
https://plugins.jetbrains.com/files/8612/screenshot_19187.png. It could be
useful to have something similar in BaseX. I would envision that this would
apply the optimization steps first (function call rewriting, constant count
evaluation, index access, etc.) and then profile the resulting query. It
could even be useful in figuring out what queries to rewrite in the BaseX
optimization pass.

>
> Have you already embedded support for the Saxon debugger in IntelliJ?
>

I've had a brief look at them, but nothing in-depth yet. I'm experimenting
with the APIs for integrating debuggers with IntelliJ. I see it can be used
for profiling, so I will investigate using those APIs for profiling Saxon
XPath, XQuery, and XSLT.


> Further questions are welcome. For example, we could think about
> adding dedicated functions to our API for requesting query information
> more elegantly.
>

The current API should be sufficient.

Something that might be interesting/useful is being able to relate the
query plan to the original source. Likewise for the information on applying
indices. That would help when working on more advanced integration, such as
annotating the types that BaseX evaluates variables, expressions, etc. to.

I do have some questions about exceptions. I'm currently using a simple
regex to extract the error information (
https://raw.githubusercontent.com/rhdunn/xquery-intellij-plugin/master/src/plugin-basex/main/uk/co/reecedunn/intellij/plugin/basex/query/session/BaseXQueryError.kt)
so I can standardize the formatting of the errors across the different
vendors and do things like link to the source files.

Q: Is there any documentation on the format that the query errors can take?
Q: How are query errors with function call stacks formatted, including from
other modules?
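
To illustrate the kind of extraction involved, here is a small XQuery sketch
(the plugin itself does this in Kotlin). The message layout is an assumption
based on a single BaseX 9.0 error report, and the pattern is not meant to
cover stack traces or other variants:

```xquery
(: Assumed layout: "Stopped at FILE, LINE/COLUMN:" then "[CODE] text". :)
let $message :=
  "Stopped at basex-9.0/file, 1/28:&#10;" ||
  "[XPST0003] Decimal-format property '' is invalid."
let $pattern :=
  "^Stopped at (.+), ([0-9]+)/([0-9]+):\s+\[([^\]]+)\]\s+(.*)$"
(: The "s" flag lets "." match the newline inside the message. :)
let $match := analyze-string($message, $pattern, "s")/fn:match
return map {
  'file'        : string($match/fn:group[@nr = 1]),
  'line'        : xs:integer($match/fn:group[@nr = 2]),
  'column'      : xs:integer($match/fn:group[@nr = 3]),
  'code'        : string($match/fn:group[@nr = 4]),
  'description' : string($match/fn:group[@nr = 5])
}
```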

Kind regards,
Reece


> Christian
>
> [1] http://docs.basex.org/wiki/Integrating_IntelliJ_IDEA
>


[basex-talk] Query profiling and debugging support in BaseX

2019-04-05 Thread Reece Dunn
Hi,

I'm wondering how you would go about profiling a query in BaseX to
determine how long various query statements take, so you can look at the
places where you need to modify the query to be faster. I have added
MarkLogic's profiling support into my XQuery plugin that does this and am
wondering how this can be done in BaseX.

I'm aware of the prof functions, but these are more useful for manually
instrumenting sections of a query than as an automated way of getting the
timing information for every query and each part of a query, which is what
I need for integrating into an IDE.
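
A minimal sketch of that manual instrumentation, assuming the one-argument
form of prof:time, which reports the evaluation time of its argument as a
side effect (standard error, or the Info View in the GUI) and returns the
value unchanged:

```xquery
(: Only the wrapped expression is timed; every other part of the query
   would need its own prof:time call. :)
let $multiples := prof:time(count((1 to 1000000)[. mod 7 = 0]))
return $multiples
```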

I could display the timings, like in the BaseX GUI. Is there a way to get
this information using the org.basex.api.client interfaces?

Is there a way to get the query plan, pre-evaluation mapping, and optimized
query for the given query? It would be helpful to link these back to the
source file, so things like optimisation hints can link back to the file
being edited.

Finally, I am looking into debugging support and am wondering if this is
possible to do for BaseX queries. What approach would be best for this,
especially in terms of integrating it into an IDE?

Kind regards,
Reece


[basex-talk] XQuery IntelliJ Plugin (was: Re: Programmatically extracting function signatures)

2018-11-06 Thread Reece Dunn
On Wed, 7 Nov 2018 at 01:27, Bridger Dyson-Smith 
wrote:

> Hi Christian -
>
> thank you for your work on the xqdoc files (and the link to the xquery
> used for generating them)!
>
> The current state of XQuery plugins for the JetBrains/IntelliJ IDEs has
> changed a bit since I added the IDEA documentation to the wiki. Grzegorz
> Ligas' XQuery Support plugin has been forked by the talented developers at
> OverStory, with lots of MarkLogic-specific features added. Reece Dunn also
> has been working on an XQuery plugin that takes a slightly different
> approach: rather than focusing on one implementation, Reece is working on a
> plugin that provides broader language support.
>

To add some background... My plugin started as an attempt to address
various issues in the XQuery parser in Grzegorz' plugin. [1] was the point
where I decided to attempt creating my own plugin, after reporting various
issues to Grzegorz. Specifically, my plugin aimed to:
1.  Address the keyword vs identifier issue in various places [2].
2.  Provide a robust lexer and parser with error recovery.
3.  Provide full support for the MarkLogic syntax in addition to XQuery 3
support -- this later evolved into adding support for other XQuery
extensions (updating, full text, scripting), XQuery 3.1 support, and BaseX
and Saxon vendor extension support.

Version 1.2 of my plugin has support for the BaseX update, fuzzy, and
non-deterministic extensions. Version 1.3 (in development) adds support for
the BaseX 9.1 ternary if, elvis operator, and if without else syntax
extensions. My plan is to publish 1.3 when the IntelliJ 2018.3 release
candidate is released, so the plugin is available for the full release.
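
For reference, the three BaseX 9.1 syntax extensions look roughly like this.
This is a sketch based on my understanding of the BaseX 9.1 documentation;
none of it is portable to other processors:

```xquery
let $items := ()
return (
  (: ternary if: test ?? then-branch !! else-branch :)
  exists($items) ?? 'some' !! 'none',
  (: elvis operator: the second operand is used if the first is empty :)
  $items ?: 'default',
  (: if without else: yields the empty sequence when the test is false :)
  if (exists($items)) then 'has items'
)
```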

My plugin does not currently support running queries, debugging, code
reformatting, auto-completion, and various other features that Grzegorz'
plugin supports. These are planned (I am currently working on running
queries), but these things take time. I've also been improving function
lookup in 1.3 to be standards conformant -- this is complex, and is not
100% complete. The technical challenges in implementing XQuery support in
an IDE are different from those of implementing it in a query processor, as
it needs to support the functionality and capabilities of both the IDE and
XQuery.

My long-term plans are to implement as many of the XPath and XQuery static
errors from the error condition list as possible to spot all statically
determinable errors in the IDE, as well as various other inspections.

I'm also considering adding specific support for the XPath subset of XQuery
and integrating that with XSLT for better XPath support in IntelliJ.

> The very minor work that I've done has been for Reece - I'm (slowly) adding
> implementation builtin function signatures that will be used by his plugin
> to provide improved static analysis.
>

My plans have always been to describe the built-in functions and static
context as XQuery files. I completed this for the XQuery and MarkLogic
functions, and Bridger has added BaseX, EXPath/EXQuery, and Saxon built-in
function definitions. In the future, I want to add complete API
descriptions using xqdoc comments so I can integrate them into the IDE when
my plugin supports displaying the xqdoc information in place. I also want
to use the information in the function annotations to report functions that
require a different version of the XQuery processor.

[1] https://github.com/ligasgr/intellij-xquery/issues/199
[2] I have solved this by making the NCName and keyword tokens implement a
common interface, then check for that interface when looking for NCNames. I
then remove the keyword styling on keywords in NCName positions after the
code has been parsed. I also use this to support the reserved function name
functionality.
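
A small query showing why this matters: XQuery keywords are not reserved, so
a parser has to accept them as variable, element, and step names. The names
below are made up for illustration:

```xquery
(: "for", "let", and "return" each appear both as a keyword
   and as an ordinary name in this query. :)
let $for := <let><return>1</return></let>
return $for/return
```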

Kind regards,
Reece

> I hope that provides a bit of information regarding news with
> XQuery/JetBrains plugins.
> Best,
> Bridger
>
>
>
> On Tue, Oct 30, 2018 at 9:13 AM Christian Grün 
> wrote:
>
>> Hi Bridger,
>>
>> I am glad to report I have created new stub files for the BaseX XQuery
>> Modules [1]. They’ll now be included in the official releases again
>> [2]. I have also uploaded the script that I wrote for generating the
>> xqdoc output [3]. It’s far from perfect, but definitely more complete
>> than the old version. If you encounter any errors, please don’t
>> hesitate to tell me.
>>
>> Could you give us a little update on your contribution to the IntelliJ
>> XQuery plugin?
>>
>> Cheers,
>> Christian
>>
>> [1] https://github.com/BaseXdb/basex/issues/1623
>> [2] http://files.basex.org/releases/latest/
>> [3] https://github.com/BaseXdb/basex-dist/blob/master/wiki2xqdoc.xq
>>
>>
>>
>> On Mon, Sep 17, 2018 at 5:38 PM Bridger Dyson-Smith
>>  wrote:
>> >
>> > Hi all -

[basex-talk] XPST0003 for decimal formats without any properties

2018-07-01 Thread Reece Dunn
Hi All,

Given the query:

declare decimal-format test;
()

BaseX 9.0 reports:

Error:
Stopped at basex-9.0/file, 1/28:
[XPST0003] Decimal-format property '' is invalid.

This is a valid XQuery 3.1 file, as
https://www.w3.org/TR/xquery-31/#doc-xquery31-DecimalFormatDecl denotes the
property production as "*", not "+".
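
A possible workaround, as a sketch that I have not verified against BaseX
9.0: give the declaration one property set to its default value, so the
format behaves like the empty declaration.

```xquery
(: "NaN" is the default value of the NaN property, so this declaration
   should be equivalent to one with no properties. :)
declare decimal-format test NaN = "NaN";
format-number(1234.5, "#,##0.00", "test")
```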

Kind regards,
Reece


[basex-talk] "Infinity" is incorrectly cast to xs:float.

2018-06-23 Thread Reece Dunn
Hi,

I have found the following behaviour in BaseX 9.0:

"Infinity" cast as xs:float (: INF :)
"-Infinity" cast as xs:float (: -INF :)
"infinity" cast as xs:float (: [FORG0001] Cannot convert xs:string to
xs:float: "infinity". :)
"Infinity" cast as xs:double (: [FORG0001] Cannot convert xs:string to
xs:double: "Infinity". :)
"-Infinity" cast as xs:double (: [FORG0001] Cannot convert xs:string to
xs:double: "-Infinity". :)

From my reading of Functions and Operators 3.1, and XML Schema 1.1 Part 2
Datatypes, "Infinity" and "-Infinity" are both invalid values for xs:float
(like with xs:double) -- they should only be "INF" and "-INF".
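
For comparison, a short sketch of the lexical forms I would expect a
conforming processor to accept and reject:

```xquery
(: Special values as spelled in the XML Schema lexical space. :)
"INF"  cast as xs:float,
"-INF" cast as xs:double,
"NaN"  cast as xs:double,
(: Per my reading of the specifications, both of these should be false. :)
"Infinity"  castable as xs:float,
"-Infinity" castable as xs:float
```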

Kind regards,
Reece H. Dunn