Re: [basex-talk] How do I prevent BaseX from insisting on adding these supposed HTML attributes?

2012-05-09 Thread Imsieke, Gerrit, le-tex
It isn’t the fault of BaseX. The parser (tagsoup, if you choose HTML 
parsing) inserts the default values for attributes. You should be able 
to suppress it by adding nodefaults=true to HTMLOPT.




On 2012-05-09 17:07, jida...@jidanni.org wrote:

>>>>> "AH" == Alexander Holupirek <alexander.holupi...@uni-konstanz.de> writes:

AH> Please post a small snippet or example, so that we are able to test the
AH> problem.

Taking the example from the Debian basex man page, we add an innocent
<br> and <a>:

cat > bad.html <<\EOF
<html>
  <ul>
    <li>A<a href="o">z</a>
    <li>B<br>
  </ul>
</html>
EOF
basex -c 'set parser html; set htmlopt method=html,nons=true; create db htmldb bad.html'
basex -q "doc('htmldb')"

<html>
  <body>
    <ul>
      <li>A<a shape="rect" href="o">z</a>  HORRIBLE
      </li>
      <li>B<br clear="none"/>  TERRIBLE
      </li>
    </ul>
  </body>
</html>

How can I stop basex from insisting on adding such atrocious junk?
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsi...@le-tex.de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer: Gerrit Imsieke, Svea Jelonek,
Thomas Schmidt, Dr. Reinhard Vöckler


Re: [basex-talk] using xslt 2 --- Unknown function 'current-grouping-key(...)'.

2012-09-13 Thread Imsieke, Gerrit, le-tex
Aren’t the -jar and -cp options mutually exclusive? That is, if you use 
-jar, -cp is ignored. Maybe you should put all jars (BaseX, Saxon, 
...) in the -cp and then invoke org.basex.BaseXGUI or the like. Or 
adapt the startup scripts to include the Saxon jar in the classpath.
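
To make the suggestion concrete, here is a minimal Python sketch that assembles such a command line. The helper name build_basex_command is made up for illustration; the jar names are the ones from Alex's message and org.basex.BaseXGUI is the main class mentioned above. It only builds the argument list and does not start anything.

```python
import os


def build_basex_command(jars, main_class="org.basex.BaseXGUI", sep=os.pathsep):
    """Return a java command line that puts every jar on the classpath.

    Hypothetical helper: the main class is passed explicitly instead of
    using -jar, because java ignores -cp/-classpath when -jar is given.
    """
    classpath = sep.join(jars)
    return ["java", "-cp", classpath, main_class]


if __name__ == "__main__":
    libs = "/mnt/xslt_volume/i4EnrichV7/resources/libs"
    command = build_basex_command([f"{libs}/BaseX73.jar", f"{libs}/saxon9he.jar"])
    print(" ".join(command))
```

The separator defaults to os.pathsep, so the same helper should yield ":" on Linux and ";" on Windows.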


Gerrit


On 2012-09-13 15:53, Alex Muir wrote:

Java is being used

On Thu, Sep 13, 2012 at 9:51 AM, Christian Grün
christian.gr...@gmail.com wrote:

Dear Alex,

as Max stated, it would be interesting to know what xslt:processor()
gives you as result. This way, we will know if Saxon or Java is used
as processor.

Thanks,
Christian
___

On Thu, Sep 13, 2012 at 3:47 PM, Alex Muir alex.g.m...@gmail.com wrote:

As I said in the first line of my post I'm looking at that module
page. The page states that

XSLT 2.0 is used instead if Version 9.x of the Saxon XSLT Processor
(saxon9he.jar, saxon9pe.jar, saxon9ee.jar) is found in the classpath.

I'm launching basex as follows

java -classpath
/mnt/xslt_volume/i4EnrichV7/resources/libs/saxon9he.jar -jar
/mnt/xslt_volume/i4EnrichV7/resources/libs/BaseX73.jar

Is there anything else I need to do?


On Thu, Sep 13, 2012 at 8:57 AM, Maximilian Gärber gaer...@axxepta.de wrote:

Hi,

did you check  http://docs.basex.org/wiki/XSLT_Module  ?

using xslt:processor() you can output if you're really using Saxon.

Regards,
Max

2012/9/13 Alex Muir alex.g.m...@gmail.com:

Hi,

Following the http://docs.basex.org/wiki/XSLT_Module

I'm launching basex as follows

java -classpath
/mnt/xslt_volume/i4EnrichV7/resources/libs/saxon9he.jar -jar
/mnt/xslt_volume/i4EnrichV7/resources/libs/BaseX73.jar

I get an error, though, when executing, which makes me think I have not
convinced basex that saxon is there to use...

Error: Stopped at line 14, column 48 in
/mnt/xslt_volume/i4EnrichV7/analysis/xquery/AggregateSectionsTitleNormalized.xq:
[XPST0017] Unknown function 'current-grouping-key(...)'.

Query: let $in := .//section
let $sections :=
<sections>
$in
</sections>
let $style :=
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:fn="http://www.w3.org/2005/xpath-functions"
xmlns:mh="http://www.metaheuristica.com" version="2.0">
<xsl:output method="xml" indent="no"/>

<xsl:template match="sections">
   <groups>
      <xsl:for-each-group select="section" group-by="@titleNormalized">
         <group name="{current-grouping-key()}"
                count="{count(current-group())}"/>
      </xsl:for-each-group>
   </groups>
</xsl:template>

</xsl:stylesheet>

return xslt:transform($sections, $style)

What am I doing wrong?
Regards

--
-

Alex G. Muir
Software Engineering Consultant
Linkedin Profile : http://ca.linkedin.com/pub/alex-muir/36/ab7/125




--

Maximilian Gärber

axxepta solutions GmbH
Postfach 51 02 38
13362 Berlin

Tel +49 (0)30 499 147 66
Fax +49 (0)30 499 147 67
Mail gaer...@axxepta.de






[basex-talk] 256 namespaces limit

2013-01-13 Thread Imsieke, Gerrit, le-tex

Dear Team,

Do you plan on increasing the 256 namespaces limit any time soon? I know 
that there is the STRIPNS option, but this does not fit my use case 
(which is to index all my XSLT and XProc files) too well.


Gerrit


[basex-talk] dynamically evaluate XPath

2013-11-11 Thread Imsieke, Gerrit, le-tex
I have a RESTXQ path/function that is supposed to retrieve a document 
fragment, restricted to an XPath expression that is given as a query 
parameter, i.e., as a string. The list of possible fragment XPaths has 
been calculated using path() by another function, and the user of a Web 
application may choose to retrieve any of the fragments.


An example for such a path would be 
'/Q{http://www.tei-c.org/ns/1.0}TEI[1]/Q{http://www.tei-c.org/ns/1.0}text[1]/Q{http://www.tei-c.org/ns/1.0}front[1]/Q{http://www.tei-c.org/ns/1.0}div[3]'.


Is there a better solution than the following, whose performance is of 
course quite poor (around 2 seconds execution time for the given 
documents)? I’m thinking of something like saxon:evaluate() or the XSLT 
3 instruction xsl:evaluate.


Maybe I’m just unaware of the obvious solution based upon XQuery 3 or a 
BaseX extension.


Gerrit

declare
  %rest:path("/content/fragment/{$db}/{$doc}")
  %rest:query-param("xpath", "{$xpath}")
  %rest:GET
  function page:get-frags(
    $db as xs:string,
    $doc as xs:string,
    $xpath as xs:string
  )
  as item()*
{
  <response>
    { for $doc in db:open($db, $doc)
      return $doc//*[path(.) eq $xpath] }
  </response>
};


Re: [basex-talk] dynamically evaluate XPath

2013-11-11 Thread Imsieke, Gerrit, le-tex

Thanks Christian, that’s it!

On 11.11.2013 13:49, Christian Grün wrote:

Hi Gerrit,

you are probably looking for the xquery:eval function [1]:

   xquery:eval("db:open('" || $db || "', '" || $doc || "')" || $xpath)
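
For readers outside BaseX, the idea of resolving a path()-style string against a document can be sketched in Python. This is not xquery:eval and does not evaluate arbitrary XPath; it is a hand-written walker that only understands steps of the form Q{uri}name[n], as in the example path from the original question. The names resolve_path and _tag are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# One step of a path()-style string: /Q{namespace-uri}local-name[position]
STEP = re.compile(r"/Q\{([^}]*)\}([^/\[]+)\[(\d+)\]")


def _tag(uri, name):
    # ElementTree spells qualified names as {uri}local; no-namespace names are bare.
    return f"{{{uri}}}{name}" if uri else name


def resolve_path(root, path):
    """Return the element addressed by `path`, or None if it does not exist."""
    steps = STEP.findall(path)
    # Reject anything that is not made up entirely of supported steps.
    if not steps or "".join(f"/Q{{{u}}}{n}[{p}]" for u, n, p in steps) != path:
        raise ValueError(f"unsupported path: {path!r}")
    # The first step names the root element itself.
    uri, name, pos = steps[0]
    if root.tag != _tag(uri, name) or pos != "1":
        return None
    node = root
    for uri, name, pos in steps[1:]:
        same_name = [child for child in node if child.tag == _tag(uri, name)]
        if not 1 <= int(pos) <= len(same_name):
            return None
        node = same_name[int(pos) - 1]
    return node
```

Walking the steps by hand avoids evaluating user-supplied strings as code, at the price of supporting only this one path shape.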



[basex-talk] db:open(), xquery:eval(), and path()

2013-11-16 Thread Imsieke, Gerrit, le-tex
I think that under certain conditions, the path() function does not 
return the proper paths.


Here’s an example that works ok:

for $doc in <doc><a/><b><c/></b></doc>
let $nodes as element(*)* := xquery:eval("$doc//*", map{"doc":=$doc})
return
  for $node in $nodes
  return <result path="{path($node)}" name="{name($node)}"/>

⇒
<result path="Q{http://www.w3.org/2005/xpath-functions}root()/Q{}a[1]" name="a"/>
<result path="Q{http://www.w3.org/2005/xpath-functions}root()/Q{}b[1]" name="b"/>
<result path="Q{http://www.w3.org/2005/xpath-functions}root()/Q{}b[1]/Q{}c[1]" name="c"/>


Now I create a database 'doc' with the document 'doc.xml' and invoke the 
slightly modified query:


for $doc in db:open('doc', 'doc.xml')/*
let $nodes as element(*)* := xquery:eval("$doc//*", map{"doc":=$doc})
return
  for $node in $nodes
  return <result path="{path($node)}" name="{name($node)}"/>

⇒
<result path="Q{http://www.w3.org/2005/xpath-functions}root()" name="a"/>
<result path="Q{http://www.w3.org/2005/xpath-functions}root()" name="b"/>
<result path="Q{http://www.w3.org/2005/xpath-functions}root()" name="c"/>

The element names are still known, but not their paths. Is it a bug or 
am I missing something?


Gerrit



Re: [basex-talk] db:open(), xquery:eval(), and path()

2013-11-16 Thread Imsieke, Gerrit, le-tex

Probably related:

db:node-id(xquery:eval("db:open('doc', 'doc.xml')/*"))
⇒ 0

db:node-id(db:open('doc', 'doc.xml')/*)
⇒ 1





[basex-talk] transform before update query

2014-02-16 Thread Imsieke, Gerrit, le-tex
This XQuery Update question might not be the most BaseX-specific 
question that has ever been asked on the list, but anyway:


I’m currently improving the performance of sxedit (that I presented at 
XML Prague yesterday). One nasty workaround that I have to do in the 
browser is escaping the names of the TEI elements body and head when I 
generate TEI XML from HTML despite the fact that these elements are in a 
different namespace (because browsers…). So I have TEI XML with the 
elements _head and _body. I serialize this XML and use regex 
replacement on the string when I submit the TEI XML for download or for 
storage in BaseX. (Yes, it seems silly to serialize the XML before 
POSTing it to RESTXQ and then using parse-xml(), but don’t mind that for 
the moment.)


The function that does the storage is at 
https://github.com/gimsieke/sxedit/blob/master/lib/basex/restxq/sxedit.xqm#L221


Now I thought that I might do without the performance-killing string 
replacement in the browser if I let BaseX transform the posted data 
prior to replacing the stored subtree with what was posted.


So I tried the following for the function body:

copy $doc := parse-xml($wrapper)
modify (
  for $n in $doc/descendant-or-self::*[starts-with(local-name(), '_')]
  let $repl := replace(local-name($n), '^_', ''),
      $uri := namespace-uri($n)
  return rename node $n as QName($uri, $repl)
)
return
  replace node db:open($doc/*:frag/@db, $doc/*:frag/@doc)//*[path() eq $doc/*:frag/@xpath]
  with $doc/*:frag/*

upon which I get:
[XUST0001] Transform expression: no updating expression allowed.

How would I achieve the transform and then update operation?

Gerrit




Re: [basex-talk] transform before update query

2014-02-16 Thread Imsieke, Gerrit, le-tex

Christian, thanks for your XQuery support. Will pay back in XSLT support ;)

I made this change, and I also made the xquery:eval statements 
compatible with 7.8: 
https://github.com/gimsieke/sxedit/blob/master/lib/basex/restxq/sxedit.xqm


Gerrit

On 16.02.2014 15:40, Christian Grün wrote:

Hi Gerrit,

the return clause of the transform (copy) expression does not allow
update operations. What you can do is to wrap your code in a FLWOR
expression:

let $doc :=
  copy $doc := parse-xml($wrapper)
  modify (
    for $n in $doc/descendant-or-self::*[starts-with(local-name(), '_')]
    let $repl := replace(local-name($n), '^_', ''),
        $uri := namespace-uri($n)
    return rename node $n as QName($uri, $repl)
  )
  return $doc
return
  replace node db:open($doc/*:frag/@db, $doc/*:frag/@doc)//*[path() eq $doc/*:frag/@xpath]
  with $doc/*:frag/*


You can also use the (for now BaseX-specific) update keyword, e. g.
as follows:

let $doc := parse-xml($wrapper) update (
  for $n in descendant-or-self::*[starts-with(local-name(), '_')]
  let $repl := replace(local-name($n), '^_', ''),
      $uri := namespace-uri($n)
  return rename node $n as QName($uri, $repl)
)
return
  replace node db:open($doc/*:frag/@db, $doc/*:frag/@doc)//*[path() eq $doc/*:frag/@xpath]
  with $doc/*:frag/*

Hope this helps,
Christian
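
The renaming step on its own (work on a copy, strip one leading underscore from element names, keep the namespace) can also be sketched in Python with ElementTree. This only mirrors the logic of the modify clause; it is not the sxedit code, and the function name strip_underscore_prefix is invented for the example.

```python
import copy
import re
import xml.etree.ElementTree as ET


def strip_underscore_prefix(root):
    """Return a copy of `root` in which every element named _x is renamed to x."""
    result = copy.deepcopy(root)  # like copy/modify, the input tree stays untouched
    for elem in result.iter():
        if not isinstance(elem.tag, str):
            continue  # skip comments and processing instructions
        # ElementTree tags look like "{uri}local" or just "local".
        match = re.fullmatch(r"(\{[^}]*\})?_(.+)", elem.tag)
        if match:
            elem.tag = (match.group(1) or "") + match.group(2)
    return result
```

As with the XQuery version, the namespace URI is carried over unchanged and only the local name is rewritten.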





[basex-talk] restriction on number of namespaces has apparently been lifted – thank you

2014-03-20 Thread Imsieke, Gerrit, le-tex
Today I noticed that I could actually build an index of all XSLT, XProc, 
Relax NG and Schematron files on my hard disk (3316 files). I couldn’t 
do that 2 years ago because the maximum number of distinct namespaces in 
a DB was limited to 256 or so.


Thanks, BaseX team, for lifting this restriction!

This has already proved really useful: I knew that I wrote an XProc step 
that conditionally invoked a step whose local name I remembered. The 
simple XPath expression

collection('home')//*:declare-step[*:choose//*:paths]
helped me identify the two relevant files.
Since we do a lot of development in XML-syntax languages, an XML 
database is really really good for structured searches on these files. I 
bet you XQuery devs still use grep to query your code ;)
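
For comparison, the same kind of structured search can be approximated without a database. The following Python sketch mimics the expression above on a single parsed document by comparing local names only; it is an illustration, not a replacement for an indexed collection, and the function names are made up.

```python
import xml.etree.ElementTree as ET


def local_name(elem):
    """Return the element name without its namespace part."""
    return elem.tag.rsplit("}", 1)[-1] if isinstance(elem.tag, str) else ""


def steps_with_conditional_paths(root):
    """Mimic //*:declare-step[*:choose//*:paths] on one parsed document."""
    hits = []
    for step in root.iter():
        if local_name(step) != "declare-step":
            continue
        for child in step:
            # A direct *:choose child with a *:paths element somewhere below it.
            if local_name(child) == "choose" and any(
                local_name(d) == "paths" for d in child.iter() if d is not child
            ):
                hits.append(step)
                break
    return hits
```

Running this over thousands of files means parsing each one on every search, which is exactly the work a database index saves.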


Gerrit


Re: [basex-talk] restriction on number of namespaces has apparently been lifted – thank you

2014-03-21 Thread Imsieke, Gerrit, le-tex
As Christian has pointed out in private, this limitation hasn’t been 
lifted yet. I ran into it when I tried to index all the XML files on my 
hard disk.
It worked for my code files yesterday because I switched to a new 
computer 1 year ago, and the number of code files / namespaces in them 
has not reached the critical limit yet.


So I made an issue out of it: https://github.com/BaseXdb/basex/issues/902

Gerrit



Re: [basex-talk] Accessing DOCTYPE information after DB creation?

2014-03-28 Thread Imsieke, Gerrit, le-tex
You can preprocess your documents with Andrew Welch’s LexEv parser: 
http://andrewjwelch.com/lexev/
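
If a full lexical-event parser is more than is needed, a much cruder pre-processing pass can be sketched in Python: pull the DOCTYPE name and identifiers out of each file's text with a regular expression and count them. This is not how LexEv works, and a regular expression will not cover every legal DOCTYPE declaration (for instance one that follows a comment containing the string "<!DOCTYPE"), so treat it as a rough survey tool. All names below are invented for the example.

```python
import re
from collections import Counter

# Matches <!DOCTYPE name PUBLIC "pubid" "sysid" or <!DOCTYPE name SYSTEM "sysid".
DOCTYPE = re.compile(
    r"""<!DOCTYPE\s+(?P<name>[^\s\[>]+)
        (?:\s+PUBLIC\s+(?P<q1>["'])(?P<public>.*?)(?P=q1)\s+(?P<q2>["'])(?P<system>.*?)(?P=q2)
          |\s+SYSTEM\s+(?P<q3>["'])(?P<system_only>.*?)(?P=q3))?""",
    re.DOTALL | re.VERBOSE,
)


def _norm(value):
    # Collapse line breaks and runs of spaces inside an identifier.
    return " ".join(value.split()) if value is not None else None


def doctype_of(text):
    """Return (root name, public id, system id), or None if there is no DOCTYPE."""
    m = DOCTYPE.search(text)
    if m is None:
        return None
    system = m.group("system") or m.group("system_only")
    return (m.group("name"), _norm(m.group("public")), _norm(system))


def count_doctypes(documents):
    """Count how often each DOCTYPE triple occurs in an iterable of XML strings."""
    return Counter(doctype_of(doc) for doc in documents)
```

The resulting triples could be written into a small XML or CSV report, or added to the documents as an attribute before they are imported.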


On 28.03.2014 12:25, Christian Grün wrote:

Hi Constantine,

unfortunately no, because this information is already consumed by the
XML parser (i. e., we don’t get to see it at all when the database is
being built).

Suggestions from other users with similar problems are welcome.
Christian



Hi all,

I would really like to be able to query a large corpus of documents to get
names and counts of the DTDs which are declared in the (somewhat
old-fashioned now) DOCTYPE declaration:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE converted-article PUBLIC "-//ES//DTD journal article DTD version
4.5.2//EN//XML" "art452.dtd" [
]>
<converted-article> <!-- etc -->

Is there any way to get BaseX to preserve this information? Can I rewrite
the doctype declaration into some sort of element node as the DB is being
created so that this info can be queried?

Thanks for any tips,
Constantine.




Re: [basex-talk] 7.8.2 Updating function items

2014-05-04 Thread Imsieke, Gerrit, le-tex

acclaim

On 04.05.2014 21:57, Christian Grün wrote:

Et voilà...

   https://github.com/BaseXdb/basex/issues/939
   http://files.basex.org/releases/latest/

Your feedback is welcome,
Christian


Re: [basex-talk] Opened by another process

2014-11-18 Thread Imsieke, Gerrit, le-tex



On 18.11.2014 11:31, Maximilian Gärber wrote:

wow, this has to be nicest tech list on the planet

users thank themselves, receive thanks from the owners ;-)


But there are also nasty subscribers such as myself who pedantically 
insist that said users thank each other (rather than themselves, 
reflexively). So curb your enthusiasm, Max!


Gerrit




2014-11-17 23:51 GMT+01:00 Christian Grün christian.gr...@gmail.com:

Ah, that did the trick!
I changed ownership to tomcat:tomcat and now the error has disappeared.
Thanks very much for the hint Paul :-)


Thanks, too ;)
Christian




Paul


Hi Fabrice,

How would I know?

A thought that just crossed my mind: the files of the database are not
owned by root or tomcat.
Could that be an issue?
(I'm not very familiar with unix, so I don't know exactly how ownership of
a file affects processes).

Paul


Hi Paul,
Is there any basexhttp instance that could have opened the db ?

Best regards,
Fabrice
Questel/Orbit

-----Original Message-----
From: basex-talk-boun...@mailman.uni-konstanz.de
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Paul
Swennenhuis
Sent: Monday, 17 November 2014 22:44
To: basex-talk@mailman.uni-konstanz.de
Subject: [basex-talk] Opened by another process

Why would I get a bxerr:BXDB0007 error "Database 'profiles' cannot be
updated, as it is opened by another process"
when executing these commands from a BaseX client:

open profiles;
xquery insert node <profile>abc</profile> into /profiles

Where profiles is an existing database, and /profiles an existing root
element, and I am quite positive that the database is NOT being used in
another process?

When I issue these commands on localhost it is working fine.

Paul









Re: [basex-talk] BaseX request from Saxon XSL transformation

2015-11-18 Thread Imsieke, Gerrit, le-tex
A more convenient way (at least for Java amateurs like myself) might be
to set up a RESTXQ service in BaseX and to query it using plain
fn:doc(''). At least this works for GET requests. If you need to post
something, you’ll probably need the EXPath HTTP client library [1] that
doesn’t ship with Saxon yet, apart from Florent Georges’ patch [2].

Gerrit

[1] http://expath.org/modules/http-client/
[2] https://groups.google.com/forum/#!topic/expath/PKl27uQndng


On 18.11.2015 11:38, cmarch...@oxiane.com wrote:
> Thanks, I was reading the same articles... It was just to check if
> something already exists...
>
> Best regards,
> Christophe
>
> On 2015-11-18 11:35, Dirk Kirsten wrote:
> 
>> Hello Christophe,
>>
>> I've never done this, but I'd say that extension functions are the way
>> to go. If I read the saxon documentation correctly
>> (http://www.saxonica.com/html/documentation/extensibility/functions/),
>> you can use it to call Java functions from within XSLT. As BaseX is
>> written in Java you should be able to put the BaseX jar file into the
>> library path and query BaseX using Java. Many Java Examples of how to
>> query BaseX can be found at http://docs.basex.org/wiki/Java_Examples
>>
>> Cheers
>> Dirk
>>
>> On 11/18/2015 09:18 AM, cmarch...@oxiane.com wrote:
>>> Hello,
>>>
>>> I have to query a BaseX database from an XSL transformation. Has
>>> anyone ever done this? I have no idea where to look...
>>>
>>> I use SaxonEE, so I can write an extension function, if needed...
>>>
>>> Best regards,
>>> Christophe
>>
>> -- 
>> Dirk Kirsten, BaseX GmbH, http://basexgmbh.de
>> |-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
>> |-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
>> |   Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
>> `-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22



Re: [basex-talk] XML embedded in JSON

2016-01-12 Thread Imsieke, Gerrit, le-tex
Actual XML elements in actual JSON documents won’t be feasible I guess.
But after you’ve transformed the JSON to XML, you can parse the escaped
strings into XML elements proper, using
http://www.w3.org/TR/xpath-functions-3/#func-parse-xml
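
The two-step idea (decode the JSON first, then parse the escaped string as XML) is not specific to XQuery. A minimal Python sketch, assuming a hypothetical JSON object whose "payload" key holds the XML string:

```python
import json
import xml.etree.ElementTree as ET


def parse_embedded_xml(json_text, key):
    """Decode the JSON, then parse the string stored under `key` as XML."""
    data = json.loads(json_text)  # step 1: JSON unescaping yields the raw XML text
    return ET.fromstring(data[key])  # step 2: the string becomes a proper element


if __name__ == "__main__":
    sample = '{"id": 1, "payload": "<note lang=\\"en\\"><to>Wray</to></note>"}'
    print(parse_embedded_xml(sample, "payload").find("to").text)
```

In XQuery, fn:parse-xml plays the role of the second step.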

On 13.01.2016 02:06, E. Wray Johnson wrote:
> We have JSON that has string/text values which are XML.  Is there a
> way to have the XML as sub-element values rather than as an
> encoded/escaped string?
> 
> Wray Johnson
> 


Re: [basex-talk] Catalog Resolution Under Windows

2016-03-13 Thread Imsieke, Gerrit, le-tex

Hi Eliot,

I didn’t recently try it on Windows myself, but just two observations.

On 13.03.2016 01:13, Eliot Kimber wrote:

CATFILE = C:/workspace/DITA-OT2.x/catalog-dita.xml"


There is a trailing quote sign here, is this intentional? Don’t know the 
effects of unbalanced quotes here.


In any case, it might be necessary to give the location as a file: URI, 
as in file:///C:/workspace/DITA-OT2.x/catalog-dita.xml

Did you already try that?
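
If the file: URI form is to be generated rather than typed by hand, Python's pathlib can do the conversion. This sketch only shows the path-to-URI step; whether the catalog resolver then accepts the URI still has to be tested. The function name is made up.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_uri(path):
    """Turn an absolute Windows path into a file: URI.

    PureWindowsPath does not touch the file system, so this also runs on
    non-Windows machines. Relative paths raise ValueError.
    """
    return PureWindowsPath(path).as_uri()


if __name__ == "__main__":
    print(windows_path_to_file_uri("C:/workspace/DITA-OT2.x/catalog-dita.xml"))
```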

Gerrit


Re: [basex-talk] Could not reserve enough space for object heap

2017-08-16 Thread Imsieke, Gerrit, le-tex
It wasn’t clear to me from the OP whether the issue of *slowness* 
persisted after raising max heap to 2 GB, or whether the issue of *not 
being able to allocate this amount of space at all* persisted.


If Bram set max heap to more than 1.5 GB and immediately received the 
message “Could not reserve enough space …”, this could be an indication 
that he is running a 32-bit Java on his 64-bit machine.


Bram, what is the output of 'java -version'? Mine is:

java version "1.8.0_144"
Java(TM) SE Runtime Environment (build 1.8.0_144-b01)
Java HotSpot(TM) 64-Bit Server VM (build 25.144-b01, mixed mode)

If it does not contain '64-Bit', it’s 32 bit, and you need to install a 
different binary from Oracle’s site.


Gerrit

On 16.08.2017 07:53, Kirsten, Dirk wrote:

Hi Bram

How did you set the java heap space? Usual way would be by using the 
command line and setting e.g.


   java -Xmx2g

However, the object which throws the error is already nearly 1.5 GB in 
size, so if other stuff has to be allocated as well, it could simply be 
that it is still not enough. Try increasing it to at least 4 GB.


Regarding the memory leak: Well, it could certainly be one, but I 
personally doubt it. First of all, having millions of databases is a 
lot, so some performance impact is expected. And given the scale of your 
databases I wouldn't call 600ms "terribly slow", you just seem to have a 
lot of data. When you create a new database there is also stuff which 
has to be checked (I would guess, for example, that BaseX checks that the 
database does not already exist, and the runtime of that check obviously 
depends on the number of databases you already have), so I would guess the 
performance is somewhat expected. Of course, it might be possible to 
optimize this and try to make BaseX more performant for your given scenario.


Cheers
Dirk

**
Senacor Technologies Aktiengesellschaft - Sitz: Eschborn - 
Amtsgericht Frankfurt am Main - Reg.-Nr.: HRB 105546
Vorstand: Matthias Tomann, Marcus Purzer - Aufsichtsratsvorsitzender: 
Daniel Grözinger



-----Original Message-----
From: basex-talk-boun...@mailman.uni-konstanz.de 
[mailto:basex-talk-boun...@mailman.uni-konstanz.de] On behalf of Bram 
Vanroy

Sent: Tuesday, 15 August 2017 22:22
To: basex-talk@mailman.uni-konstanz.de
Subject: [basex-talk] Could not reserve enough space for object heap

Hi all

I'm running into an issue with many databases. I.e. one server instance 
with millions of databases. When creating all of these, I found that the 
more databases are included on the instance, the slower further database 
generation got. For instance, I could see in the logs that in the first 
< 10.000 databases the creation happened smoothly with around 50ms per 
file of 1-4kB. However, when having more and more databases for this 
server instance, things got very slow: for an XML file of 1-4kB the logs 
show ~600ms. This is terribly slow, as you can imagine.


At first I thought something was wrong with my hardware, but I checked 
on another system and the same issue arises. Then I thought maybe Java 
is doing something strange, so I figured I'd reboot and see if that 
cleared some stuff up. But now when I try to launch 'basex' or 
'basexserver', I get the following message:


  Could not reserve enough space for 1433600KB object heap

I googled the issue, and it was suggested that I added a JAVA option to 
my system's variable (I'm on Windows 10 64 bit, BaseX 8.6.4) indicating 
the memory it could use. I set that to 2048MB. But still the same issue 
persists.


I have contacted the list before about issues with generating millions of 
databases with the same server instance, and this seems to be another one 
related to that problem. I am no expert AT ALL, but isn't it possible 
that there is some sort of micro memory leak that only becomes apparent 
when creating a number of databases of this magnitude? If not, other ideas 
are welcome as well. At least on how to get rid of the Java error 
mentioned above.



Kind regards

Bram Vanroy





[basex-talk] CG 40

2017-07-08 Thread Imsieke, Gerrit, le-tex
“You’ll become smart at the age of forty,” a Swabian saying goes („Mit 
40 wird man g’scheid.“, see also [1]).


If this is true, and if the (excellent) current system is the product of 
Christian’s pre-40 dabbling, how much more sophisticated will BaseX 
become from now on?


Happy birthday, Christian! Enjoy your holidays.

—Gerrit


[1] https://de.wikipedia.org/wiki/Schwabenalter


Re: [basex-talk] Validate XML against RNG schema

2017-06-21 Thread Imsieke, Gerrit, le-tex
jing/trang is a Java tool for converting Relax NG schemas and for 
validating with these schemas. If you are using oXygen XML, it is 
integrated with the product and you can use it to convert rnc to rng.


It is maintained on Github 
(https://github.com/relaxng/jing-trang/releases), although I don’t think 
there is a binary distribution.


There is a page about trang, 
http://www.thaiopensource.com/relaxng/trang.html, although some of the 
links are broken.


The download page seems to be functional though: 
https://code.google.com/archive/p/jing-trang/downloads


Invocation is described on 
http://www.thaiopensource.com/relaxng/trang-manual.html
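
As a small illustration of that invocation, the following Python sketch builds the command for converting a compact-syntax schema to XML syntax. The location of trang.jar is an assumption, as is the helper name; only the argument list is built, nothing is executed.

```python
def trang_command(rnc_path, rng_path, trang_jar="trang.jar"):
    """Build the command that converts compact syntax (.rnc) to XML syntax (.rng)."""
    # -I and -O state the input and output formats explicitly; Trang can also
    # infer them from the file extensions.
    return ["java", "-jar", trang_jar, "-I", "rnc", "-O", "rng", rnc_path, rng_path]


if __name__ == "__main__":
    print(" ".join(trang_command("publishers-51cr.rnc", "publishers-51cr.rng")))
```

The list can be passed to subprocess.run(..., check=True) once Java and the jar are in place; the resulting .rng file is ordinary XML and can be added to the database like any other document.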


Gerrit

On 6/21/17 9:50 AM, Dharmendra Singh wrote:

Hi Gerrit,

Thanks for your response. Can you please explain what trang is: is it a 
function or something else? Can you please provide an example or 
sample of converting an rnc file to rng using trang?

Regards
Dharmendra Kumar Singh




Re: [basex-talk] Validate XML against RNG schema

2017-06-21 Thread Imsieke, Gerrit, le-tex

Hi Dharmendra,

The function validate:rng() seems to only accept a Relax NG *XML syntax* 
document as its 2nd argument. You can convert the rnc file to rng using 
trang and store it in the DB as a regular XML file.
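
A minimal sketch of that workflow, assuming trang has already produced publishers-51cr.rng (typically with `java -jar trang.jar publishers-51cr.rnc publishers-51cr.rng`) and that the result was added to the 'onix' database under relaxng/ as a regular XML document. The .rng path and the filter on relaxng/ are assumptions for illustration, not taken from the original setup:

```xquery
(: Assumption: the trang output was added to 'onix' as
   relaxng/publishers-51cr.rng (XML syntax, not the compact .rnc). :)
let $schema := db:open('onix', 'relaxng/publishers-51cr.rng')
(: Validate every document in the database except the schema itself. :)
for $doc in db:open('onix')
where not(starts-with(db:path($doc), 'relaxng/'))
(: validate:rng returns nothing on success and raises an error otherwise. :)
return validate:rng($doc, $schema)
```

Validating document by document also avoids passing the whole database, schema included, as a single input.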


Gerrit

On 6/21/17 8:20 AM, Dharmendra Singh wrote:

Hi all,

I have loaded the RNG schema using the function db:store and also loaded 
the XML that has to be validated into the DB, but I am getting an error. 
Below is my code:

  let $binary := db:retrieve('onix','/relaxng/publishers-51cr.rnc')
  let $schema := bin:decode-string($binary)
  return
    let $input := db:open('onix')
    return validate:rng($input, $schema)

When I run this code, I get the error "invalid XML character (20)".

So what am I doing wrong here?



Re: [basex-talk] How to export a database containing xi:include to multiple files

2018-06-27 Thread Imsieke, Gerrit, le-tex

Hi Marco,

It didn’t go unanswered. Here’s Alex’s reply: 
https://mailman.uni-konstanz.de/pipermail/basex-talk/2018-May/013147.html
His answer doesn’t address the issue of recreating xi:include elements 
from elements with an xml:base attribute, though.


Gerrit

On 27.06.2018 17:34, Marco Randazzo wrote:

Hi,
I’m posting again the same question I asked some time ago, which 
unfortunately went unanswered :(


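I used the BaseX GUI (9.0.1) to create a database from a test file similar 
to the following:

[XML sample lost in the archive; only the XInclude namespace URI 
http://www.w3.org/2001/XInclude on the root element survives]

The result displayed in the BaseX GUI is correct:

[XML sample lost in the archive; again only the XInclude namespace URI on 
the root element survives]
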

Now, after editing the values, I would like to export the contents of 
the database to MULTIPLE xml files, recreating the same files and the 
original folders (I see this information is not lost, it is stored in 
the value of xml:base)


However, using the export function available in the baseX GUI, I am able 
to obtain only a SINGLE file containing the Result shown above, where 
the two files have been merged in a single file.


What can I do?

Thank you,
Cheers,

Marco Randazzo




Re: [basex-talk] BaseX File Module : access network folder on windows server

2018-01-22 Thread Imsieke, Gerrit, le-tex

It works with a file: URI with five forward slashes, like
file:list('file://///SERVERNAME_or_IP/Freigabename/')

The results, if folders, may contain backslashes though.
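
A small sketch of how this might be used, with SERVERNAME_or_IP and Freigabename as placeholders for your own host and share name. The translate() call is just one way to normalize the backslashes mentioned above:

```xquery
(: Placeholders: replace host and share name with your own values. :)
let $share := 'file://///SERVERNAME_or_IP/Freigabename/'
for $entry in file:list($share)
(: Folder entries may come back with backslashes on Windows;
   turn them into forward slashes for further path handling. :)
return translate($entry, '\', '/')
```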


On 22/01/2018 19:05, Christian Grün wrote:

Hi Dieter,

I haven’t tried Windows networking myself, but AFAIK access to a
server is not possible via the File Module. It is based on Java’s
default file access, and an additional library (such as [1]) would
need to be embedded. JCIFS uses the CIFS/SMB networking protocol to
access network drives.

Sorry for that,
Christian

[1] https://jcifs.samba.org/



On Mon, Jan 22, 2018 at 1:49 PM, Dieter Zanzinger wrote:

I have been working with the BaseX File Module for a while and can access a
local folder, for example: file:list('C:/').

Now I want to do the same in a Windows network.
If I create a folder on the desktop of the local machine (localhost) and
share it under the share name ("Freigabename") netzlaufwerktestF, it is
possible to list its contents with:
file:list('//WIN7PROPARPC/netzlaufwerktestF/').

Now I am trying the same with a folder on a server, which I can access with
Windows Explorer. I expected the following to work:
file:list('//SERVERNAME_or_IP/Freigabename/')

But I only get errors.
Does anybody know the right way to access a network folder with the File
Module of BaseX under Windows?
(I didn’t find anything in the wiki or via Google.)

Thanks in advance

Dieter Zanzinger



Re: [basex-talk] xml element beginning and end space loss

2018-01-26 Thread Imsieke, Gerrit, le-tex

Hi Stefania,

You can avoid it if you create the DB with chopping switched off in the 
first place.


Or you can supply the chopping option as you go:

db:replace('MyDB', 'doc.xml', ' stefy ', map {'chop':false()})

Whitespace chopping by default is maybe the most-detested design 
decision in BaseX, at least among users of mixed content.


Luckily you can switch it off.
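
For illustration, a sketch of the per-call variant. The element name <name> is hypothetical, since the markup of the original example did not survive in the archive:

```xquery
(: Hypothetical element <name>; the point is only the options map.
   'chop': false() keeps leading and trailing whitespace in text nodes. :)
db:replace(
  'MyDB',
  'doc.xml',
  '<name>  stefy  </name>',
  map { 'chop': false() }
)
(: For a whole database, the equivalent would be to run the command
   SET CHOP false before CREATE DB. :)
```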

Gerrit


On 27/01/2018 00:53, Stefania Axo wrote:

Hi all!

is there a way to preserve the beginning and ending spaces in the 
Database xml elements?



In other words, if I execute this XQuery

            db:replace("MyDB", "doc.xml", "  stefy  ")


the resulting document will be

           stefy

As you can see, I lost my leading and trailing spaces.


thanks
Stefania



Re: [basex-talk] diacritics sensitive not working

2018-08-03 Thread Imsieke, Gerrit, le-tex

Hi Ron,

You can add an extra element (or attribute) to the content when 
importing or modifying it. (Or another document in another database if 
you like – you can create and later find such an index document by 
giving it the same db:path as the original document.)


In this extra database, document, element and/or attribute, you can 
recreate the original text, except that you normalize the characters 
with diacritical marks to a canonical decomposition form and then strip 
away the diacritical marks like this:


replace(normalize-unicode($input, 'NFKD'), '\p{Mn}', '')

The full updating statement is beyond my cursory XQuery capabilities – 
I’d probably do it in XSLT. Also I don’t know how to trigger an event 
that would cause an update of the auxiliary fields when the underlying 
data changes.
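
That said, a rough sketch of what such an update could look like. The attribute name plain is made up for this example, and the database and element names are borrowed from Ron's fragment, so treat it as an assumption-laden starting point rather than a tested solution:

```xquery
(: Decompose characters (NFKD) and drop the combining marks (\p{Mn}),
   so that e.g. 'Äpfel' becomes 'Apfel'. :)
declare function local:strip-diacritics($input as xs:string?) as xs:string {
  replace(normalize-unicode($input, 'NFKD'), '\p{Mn}', '')
};

(: Hypothetical auxiliary attribute 'plain' on each intervention name.
   The guard skips elements that already carry the attribute. :)
for $name in db:open('CTGov')/clinical_study/intervention/intervention_name
where empty($name/@plain)
return insert node attribute plain { local:strip-diacritics($name) } into $name
```

Queries could then match against @plain with diacritics-insensitive semantics already baked in, instead of paying for the full-text option on every comparison.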


Gerrit


On 03.08.2018 14:39, Ron Katriel wrote:

Christian,

Adding diacritics sensitive slows execution by a factor of 3. My script 
(fragment below), which joins two large databases, namely CT.gov and 
DrugBank, takes 2 hours without the diacritics sensitive constraint but 
6 hours with it. Given the combinatorics involved, I am wondering if 
there is a better way to do this in BaseX.


Thanks,
Ron


for $drug in db:open('DrugBank')/drugbank/drug
  let $drug_name := $drug/name/text()
  let $drug_synonyms := functx:value-union(
    normalize-space(lower-case($drug/name)),
    local:drug-synonyms($drug_name))
  for $synonym_name in $drug_synonyms
  ...
  for $study in db:open('CTGov')/clinical_study[
    intervention/intervention_name contains text { $synonym_name }
      using case insensitive using diacritics sensitive]
  ...


Ron Katriel, Ph.D. | Principal Data Scientist | Medidata Solutions
350 Hudson Street, 7th Floor, New York, NY 10014
rkatr...@mdsol.com | direct: +1 201 337 3622 | mobile: +1 201 675 5598 | main: +1 212 918 1800


On August 1, 2018 at 12:41:26 PM, Ron Katriel (rkatr...@mdsol.com) wrote:


Thanks, Christian. Strange, prior to contacting you and on a hunch, I 
tried adding the missing “using” keyword but still got the syntax 
error. Anyway, everything is good now!


Best,
Ron

On August 1, 2018 at 3:57:51 AM, Christian Grün 
(christian.gr...@gmail.com) wrote:



I have fixed the example in the doc.
Best, Christian


On Wed, Aug 1, 2018 at 5:08 AM Ron Katriel wrote:

>
> Hi,
>
> The following from your website (docs.basex.org/wiki/Full-Text) appears
> to be syntactically incorrect:
>
> "'Äpfel' will not be found..." contains text "Apfel" diacritics sensitive
>
> In the BaseX GUI the keyword diacritics is underlined in red and the
> following error is reported:
>
> Unexpected end of query: 'diacritic sens...'.
>
> This happens in version 8.6.4 and also the latest (9.0.2).
>
> Thanks,
> Ron




Re: [basex-talk] Can't get `cdata-section-elements` to work at all for XSLT output

2018-08-01 Thread Imsieke, Gerrit, le-tex

Hi Hugh,

The second version where you specify the serialization options in XQuery 
works for me (BaseX GUI 8.6.5 with Saxon PE 9.6.0.7):



http://backend.userland.com/rss2; 
xmlns:content="http://purl.org/rss/1.0/modules/content/; version="2.0">

  
  
  


The first version cannot generate CDATA sections since the XSLT 
processor is not serializing anything; it’s the XQuery processor that 
serializes the result.
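
To illustrate the point with a stylesheet-free sketch (the <item> wrapper and the sample text are made up; only the namespace and the serialization option come from your query): the option is applied when the XQuery processor serializes whatever node the query returns, no matter whether that node was built by an element constructor or came back from xslt:transform().

```xquery
declare namespace content = "http://purl.org/rss/1.0/modules/content/";
(: Serialization parameters belong to the XQuery processor. :)
declare option output:cdata-section-elements "content:encoded";

(: Hypothetical result node; text inside content:encoded should be
   written as a CDATA section when the result is serialized. :)
<item>
  <content:encoded>&lt;p&gt;hello&lt;/p&gt;</content:encoded>
</item>
```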


The error that you are seeing, XPST0081, would be generated if there 
were no namespace declaration for the prefix 'content', maybe caused by 
an indistinguishable look-alike non-ASCII character in 'content'. 
Doesn’t seem to be the case. Maybe this is a bug that is specific to 
BaseX 9?


Gerrit


On 01.08.2018 21:17, Hugh Guiney wrote:

Hello,

First off, loving BaseX so far! Using it as the backend for an API I’m
building. However, I’m running into an issue. I’m trying to transform
my database XML into an RSS 2.0 feed. It’s mostly working fine, but I
can’t output CDATA content at all, which I need to do for
`content:encoded` elements.

Specs:

- BaseX 9.0.2 (started via basexserver script)
- Saxon-HE 9.8.0.12J from Saxonica
- java version "1.8.0_112"
- basex 0.9.0 (NodeJS)
- macOS Sierra 10.12.6

### First Attempt

I set `cdata-section-elements` in the XSLT.

rss.xq:
```
xquery version "3.0";
declare option output:omit-xml-declaration "no";

let $in :=
   
 hello
   
let $style := doc( 'rss.xslt' )
return xslt:transform( $in, $style )
```

rss.xslt:
```

http://backend.userland.com/rss2;
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform;
   xmlns:content="http://purl.org/rss/1.0/modules/content/;



   
   
 
   hi
   howdy
   
 
   

```

Result:
```

http://backend.userland.com/rss2;
xmlns:content="http://purl.org/rss/1.0/modules/content/;
version="2.0">
   hi
   howdy
   hello

```

No CDATA sections.

### Second Attempt

I set `cdata-section-elements` in the XQuery.

rss.xq:
```
xquery version "3.0";
declare namespace content = "http://purl.org/rss/1.0/modules/content/";
declare option output:omit-xml-declaration "no";
declare option output:cdata-section-elements "content:encoded";

let $in :=
   
 hello
   
let $style := doc( 'rss.xslt' )
return xslt:transform( $in, $style )
```

rss.xslt:
[Unchanged]

Result:
[XPST0081] No namespace declared for 'content:encoded'.

Clearly I declared the namespace two lines up.

This looks like a bug to me, but any help appreciated if I’ve missed a
step here.

Thanks,
Hugh



Re: [basex-talk] Can't get `cdata-section-elements` to work at all for XSLT output

2018-08-02 Thread Imsieke, Gerrit, le-tex

Hi Hugh,

Did you see Christian’s reply, archived at 
https://mailman.uni-konstanz.de/pipermail/basex-talk/2018-August/013490.html ?


He essentially said that, for your second approach, he saw the same 
behavior on BaseX 9.x as I saw on 8.6.5. So there doesn’t seem to be a 
regression.


I agree with you that xslt:transform-text() is not a solution.

Gerrit


On 01.08.2018 23:47, Hugh Guiney wrote:

Thanks for testing Gerrit, that's good to know. Sounds like a
regression then. Shall I go ahead and file this on Github or does it
need further confirmation?

Christian, your suggestion seems to work around the issue; the CDATA
sections do come in that way. Except, all the elements get sent back
entity-escaped for some reason. I have to manually reverse it back
into XML using `result.replace( /&gt;/gi, '>' ).replace( /&lt;/gi, '<'
)` in JavaScript. Not sure if that is a separate issue or expected
behavior.


Re: [basex-talk] Re Add line-number function

2018-07-16 Thread Imsieke, Gerrit, le-tex

Hi Pavel,

What kind of editor are your users using? If they use an XML editor 
proper, there will probably be a means to jump to a location specified 
by an XPath expression.


If they are using an ordinary text editor, how do you prevent them from 
messing up the XML in the first place? Ordinary, non-IT users tend to 
render XML moot if left without appropriate tooling.


If they are using a visual editor that hides the tags, it should be easier 
to insert error feedback by XPath location than by line number.


Maybe you want to create an HTML rendering of the input where you 
highlight the errors (using their XPath locations). Then the users have 
enough context to locate the erroneous piece in the original XML input.
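
As a sketch of what such a report could carry (the database name 'herd' and the cows/@quantity markup are invented for this example), each error can include the XPath location together with a copy of the offending node as context:

```xquery
(: Hypothetical business rule on hypothetical markup:
   every cows element must have a quantity greater than 0. :)
for $cows in db:open('herd')//cows[number(@quantity) le 0]
return
  <error path="{ path($cows) }" pre="{ db:node-pre($cows) }">
    <message>The quantity of cows must be greater than 0</message>
    <!-- copy of the erroneous node, so users can find it in their file -->
    <context>{ $cows }</context>
  </error>
```

An HTML rendering could then be generated from these error elements, for example with a small XSLT step.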


I’m just thinking of workarounds since I assume that the notion of line 
numbers is not something that can easily be added to the BaseX storage 
layout. As someone who often deals with non-indented XML files that 
consist of a single line or with XML that is formatted with 
varying line lengths, I have come to avoid relying on line number 
information altogether.


Gerrit


On 16.07.2018 08:45, Павел Павлов wrote:

Thanks for the detailed answer.

Our software is developed in .NET, and we use BaseX as an XQuery processor.
We use fn:path and return the path to the erroneous element to the user, 
but that is not enough.
Our users are ordinary people, not IT staff, and they want to see which 
lines of the XML files contain mistakes.
Currently, after BaseX has executed the XQuery, our application loads the 
XML file (in memory, into an XDocument object from .NET XML) with the 
SetLineInfo flag set. Then we evaluate the returned XPath to select the node 
in the loaded XML file, get the line number of the selected node, and return 
that line number to the user.
That is, we have to load the file in .NET only to get the line number. 
If BaseX could do this itself, we wouldn't have to load the XML file at all. 
That would be a great benefit for us.


Is it possible to add some _mode_ to BaseX that stores line numbers, even 
at the cost of additional memory and with undefined line numbers for changed 
or updated XML nodes?


Friday, 6 July 2018, 15:24 +07:00, from Christian Grün:

Hi Symantis,

The original line numbers are not stored in XML databases (they may
change after updates, and would consume additional memory), so you
won’t be able to retrieve them with XQuery.

As far as I know, this does not work in eXist-db either; the eXist
link you referenced gives you the line of the util:line-number
expression in your XQuery module. As Fabrice pointed out (thanks!),
this could also be realized with $err:line-number.

With Saxon, it works indeed. However, you’ll need to use the -l
command line option (otherwise, due to performance considerations,
line numbers will be discarded as well).

On query/database level, there are two ways to get a direct reference:
• With fn:path, you get an XPath expression that points to your node.
• With db:node-pre [1], you get a direct reference to the node in a
database.

Best,
Christian

[1] http://docs.basex.org/wiki/Database_Module#db:node-id





Re: [basex-talk] Re Re Add line-number function

2018-07-16 Thread Imsieke, Gerrit, le-tex
Still, I suggest that you do not only provide users with a list of 
errors but also with an HTML rendering of the whole document (or only of 
the erroneous bits, plus some context maybe).


Alternatively, pay BaseX GmbH in the order of (I’m just guessing) 20 
kEUR so that they enhance the storage layout to optionally include line 
numbers.


(Disclaimer: I’m making this statement as an ordinary list member. I 
don’t know about the actual cost or feasibility of adding line number 
support.)


Gerrit


On 16.07.2018 12:42, Павел Павлов wrote:

Hi!

Users send us XML files that conform to a published XSD schema.
They can use any software: any XML editor, or special editors developed 
by third parties, with text boxes for filling in business objects.

So our application does not have a graphical interface.
Our software's main goal is to validate XML files against business logic 
rules written in XQuery. The result of such a validation is an XML file 
stating either that everything is OK or that there are errors. Each error is 
described by the XPath of the erroneous node and a text description, for 
example "Line 100: The quantity of cows must be greater than 0". The result 
is sent back to the users as the answer.


And users want to see line numbers...


Re: [basex-talk] Schematron package error

2018-07-22 Thread Imsieke, Gerrit, le-tex
I ran it successfully on Windows with BaseX 8.6.5 and schematron-basex 
1.2, both when I used proper file: URIs and Windows file system paths 
with backslashes.


I used both the version of /docbook-mods.sch.xml that you sent me and a 
version with expanded entities (using xmllint --noent --dropdtd first).


When I gave it too little stack size (-Xss108k) on startup, I could only 
provoke a "Stack Overflow: Try tail recursion?" BaseX error, not the 
"Too many nested template calls" Saxon error that you reported.
Calling Saxon directly for applying iso_dsdl_include.xsl on the schema 
was also successful if heap and stack sizes were large enough.


What happens when you apply your Saxon EE standalone, transforming the 
schema with iso_dsdl_include.xsl?


Which version of Saxon 9 EE are you using?

Gerrit

On 22.07.2018 20:36, DK Singh wrote:
I am using Saxon 9 EE on the classpath, and I have also put it into the 
BaseX lib directory.


On Mon 23 Jul, 2018, 12:04 AM Imsieke, Gerrit, le-tex wrote:


Is saxon9he.jar or another Saxon version on the classpath as per
https://github.com/Schematron/schematron-basex#user-content-using-xpath-20-and-above ?
The Schematron schema says queryBinding="xslt2" and this requires an
XSLT 2 processor.

On 22.07.2018 20:26, DK Singh wrote:
 > PFA
 >
 > On Sun 22 Jul, 2018, 11:47 PM Imsieke, Gerrit, le-tex wrote:
 >
 >     I didn’t get a message with a Schematron from you.
 >
 >     On 22.07.2018 20:05, DK Singh wrote:
 >      > Hi,
 >      > I have already attached the Schematron in the mail chain.
 >      >
 >      > On Sun 22 Jul, 2018, 11:33 PM Imsieke, Gerrit, le-tex wrote:
 >      >
 >      >     I’m afraid we cannot help you without seeing the actual
 >      >     Schematron file that you are trying to compile/apply.

Re: [basex-talk] Schematron package error

2018-07-22 Thread Imsieke, Gerrit, le-tex
I’m afraid we cannot help you without seeing the actual Schematron file 
that you are trying to compile/apply.


On 22.07.2018 19:59, DK Singh wrote:
Thank you, Andy, for your response. I have tried that, but it gives a 
Schematron compilation error, whereas the XML files validate when I use the 
Oxygen editor. Following the GitHub documentation, I first compile the 
Schematron and then validate the XML. So how can I resolve this error?


On Sun 22 Jul, 2018, 9:39 PM James Ball wrote:


Hello Dharmendra,

Have you made sure that the Schematron is compiled before using it
to validate? I think this is the error you get if you try to use an
un-compiled Schematron file to perform a validation.

The necessary steps are outlined with the module documentation on
GitHub and I had it working successfully just a week ago.
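
Roughly, the two steps look like this. This is a sketch along the lines of
the module documentation; rules.sch and document.xml are placeholder file
names, and the schematron-basex package is assumed to be installed in the
BaseX repository:

```xquery
import module namespace schematron = "http://github.com/Schematron/schematron-basex";

(: Step 1: compile the Schematron schema into an XSLT stylesheet. :)
let $sch := schematron:compile(doc('rules.sch'))
(: Step 2: validate the document with the compiled schema; the result is SVRL. :)
let $svrl := schematron:validate(doc('document.xml'), $sch)
(: true() if no failed assertions were reported. :)
return schematron:is-valid($svrl)
```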

Regards, James

 > Date: Fri, 20 Jul 2018 17:07:30 +0530
 > From: DK Singh
 > To: BaseX
 > Subject: [basex-talk] Schematron package error
 >
 > Hi All,
 >
 > I am running a Schematron validation against an XML document, but I am
 > getting these errors:
 >
 > Bad name element: XPath error. No XPath. Bad name element: XPath error. No
 > XPath. Error on line 1409 column 61 of iso_dsdl_include.xsl: Too many
 > nested template or function calls. The stylesheet may be looping. at
 > xsl:call-template name="sch-check:strip-strings"
 > (file:///C:/Program%20Files%20(x86)/BaseX/repo/http-github.com-Schematron-schematron-basex-1.2/content/iso-schematron/iso_dsdl_include.xsl#1409)
 >
 > It looks like the BaseX Schematron package throws the error when running
 > the Schematron validation.
 > Can anyone suggest how I can resolve these errors?
 >
 > Regards
 > Dharmendra Kumar Singh




Re: [basex-talk] many distinct namespaces

2018-10-31 Thread Imsieke, Gerrit, le-tex

Hi Sergei,

The corresponding issue will turn 5 next March: 
https://github.com/BaseXdb/basex/issues/902


If you are an XML developer who wants to index all the XML, XSLT, XProc, 
RNG, XSD, Schematron, etc. files on your hard disk in an XML database, 
chances are that you’ll need more than 256 namespaces.


I’m willing to shell out up to 1,200 Euros (plus VAT) out of my own 
pocket for this feature. Any other funders?
Christian, how much do we need to raise collectively for you to 
prioritize storage layout redesign?


Gerrit

On 31.10.2018 22:47, Сергей Чесноков wrote:


Hi all,

Is it possible to bypass the following restriction (I cannot change 
"ep_ins_med_q.xsd" (Central Bank xbrl scheme))?:


BaseX 9.1

Command:
CREATE DB bfo 
D:/portal/xbrl/CBR/final_1_3_1/Taxonomy_1_3_1/www.cbr.ru/xbrl/bfo/

Error:
"D:/portal/xbrl/CBR/final_1_3_1/Taxonomy_1_3_1/www.cbr.ru/xbrl/bfo/rep/2018-03-31/ep/ep_ins_med_q.xsd" 
(Line 21): Too many distinct namespaces (limit: 256).


Best regards, Sergei.




Re: [basex-talk] many distinct namespaces

2018-11-02 Thread Imsieke, Gerrit, le-tex
One approach to avoid migration and backwards compatibility issues would 
be to support a standard storage and an extended storage side by side.
The storage and query functions would know beforehand which layout 
variant the current database is in, and they could use the appropriate 
optimized functions.
If dynamic lookup of these layout-specific functions would be too 
costly, maybe providing two separate binaries (classic and extended 
storage) might be an option. I cannot fathom how many pieces of code 
need to be modified in order to be able to maintain a common codebase 
for both layouts.
I’m certainly naïve in this regard because back in the day, I thought: 
How hard can it be to move from a 16-bit architecture to a 32-bit 
architecture?


Gerrit


On 02.11.2018 17:25, Christian Grün wrote:

Hi Gerrit,

thanks for your generous offer to sponsor the requested feature. I am
ashamed to confirm it’s been a long time since this issue has been
opened and not closed yet. You are asking how much money will be
required to get this fixed. I’m not sure after all. Maybe I would
rather ask for 3, 4 weeks of “leisure time”, or – even better – get a
proposal into my hands how this could be resolved without compromising
backward conformance.

Some more details: The current storage layout per node has been fixed
to 16 bytes. One byte (8 bits) is reserved for the namespace
reference. The other 15 bytes (minus a few unused bits) are reserved
for other references and flags. We could extend the storage to 24 or
32 bits. As a result, the central database main table would get
larger, so this would affect both old databases (that need to be
imported) and the overall performance of the system. If we decide to
go this step, we could indeed overcome various of the current
limitations.
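
The 256 limit follows directly from the one-byte reference. As a small worked example (a sketch that assumes the reference is a plain unsigned index, which the message does not spell out), this is how the ceiling grows with the width of the field:

```python
def max_distinct_ids(bits: int) -> int:
    """Number of distinct values an unsigned field of the given width can hold."""
    return 2 ** bits

# Widths chosen for illustration only.
limits = {bits: max_distinct_ids(bits) for bits in (8, 16, 24, 32)}
for bits, count in limits.items():
    print(f"{bits:>2} bits -> {count:,} distinct namespace references")
```

With 8 bits this gives exactly the 256 reported in the error message.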

Any volunteers out there who are ready for the challenge?
Christian





Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Imsieke, Gerrit, le-tex
If you are allowed to share some snippets of the actual documents, it 
will be easier to see how the query needs to be phrased.


Have you verified that $biblFull and $biblStruct actually contain 
strings? If not, do you need to declare a default namespace? The 
vocabulary looks like TEI, so


declare default element namespace "http://www.tei-c.org/ns/1.0";

may be necessary. And if it is TEI, the ID attributes are probably 
called @xml:id rather than @id.


Gerrit

On 26.03.2019 00:18, Chris Yocum wrote:

Hi Gerrit,


Are you sure that the @target attributes are supposed to be identical to the
IDs?


Yes, they should be.  If they are not, I need to find them so I can
fix them to be identical.


Don’t you prepend a pound sign to @target attributes when they point to
IDs within the same document?


They are not in the same document.  The @target attributes live spread
out in the other documents while the IDs all live in the same
document.


So you probably need to say

where not(substring($title/@target,2) = $biblStruct) and
not(substring($title/@target,2) = $biblFull)



I will give this a shot tomorrow when I am not as tired.


And maybe you need to restrict the titles that you search to those with a
@target attribute, like so:

for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]



This is the other half of the problem, which I did not state here. I
need to find all titles that do not have target attributes and then give
them a target attribute based on some rules.  I have done so in a few
files (and I am explicitly testing one of them in the query in my
previous email) and I will roll out the fix in all other files once I
have everything else tested and working.

I will give your suggestion a try tomorrow. Thanks!

All the best,
Chris





Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Imsieke, Gerrit, le-tex
Are you sure that the @target attributes are supposed to be identical to 
the IDs? Don’t you prepend a pound sign to @target attributes when they 
point to IDs within the same document?

So you probably need to say

where not(substring($title/@target,2) = $biblStruct) and 
not(substring($title/@target,2) = $biblFull)


And maybe you need to restrict the titles that you search to those with 
a @target attribute, like so:


for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]

Otherwise also non-@target-bearing titles will match the where clause, 
which may be unintended.
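
The logic of that where clause can be sketched outside XQuery as follows. This is a hedged Python illustration, not the query itself: the function name and sample values are made up, and unlike substring(..., 2), which drops the first character unconditionally, this version strips the leading '#' only when it is present.

```python
def unresolved_targets(targets, known_ids):
    """Return the targets that do not point to any known ID."""
    ids = set(known_ids)
    missing = []
    for target in targets:
        if target is None:
            # title without a @target attribute: skipped, like title[@target]
            continue
        ref = target[1:] if target.startswith("#") else target
        if ref not in ids:
            missing.append(target)
    return missing

# Hypothetical data standing in for the @target values and the two ID lists.
missing = unresolved_targets(
    ["#smith1990", "jones2001", "#lost", None],
    ["smith1990", "jones2001"],
)
print(missing)
```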


Gerrit

On 25.03.2019 23:32, Chris Yocum wrote:

Hi Markus,


try

for $title in collection('edil_target/eDIL-A.xml')//entry//title
where not($title/@target = $biblStruct) and not($title/@target = $biblFull)
return $title


Thank you for the quick reply at such a late hour!  However, I am
sadly getting the same results.  For these results, I can open the files
and find the returned targets in one of the two lists.

All the best,
Chris


Re: [basex-talk] supporting XML Catalog files in xslt:transform() (patch)

2019-03-14 Thread Imsieke, Gerrit, le-tex




On 14.03.2019 10:56, Christian Grün wrote:

Maybe we could port Gerrit’s code to XQuery… Volunteers are welcome ;)


You probably can’t instruct Saxon to use the XSLT-based resolver (or an 
XQuery-based resolver) when reading XML files using doc() or xsl:import. 
I think it needs Java classes that implement certain interfaces. I’m not 
sure whether it makes sense to provide a Java class that executes XQuery 
when you can use a resolver that is written directly in Java.


Background for our XSLT-based resolver: We are using it in order to 
resolve canonical URIs of fonts or other resources that we need to read 
from the file system from within XSLT stylesheets or XProc pipelines, 
but that cannot be read by doc() (since they are not XML) or the EXPath 
file module methods (since Saxon won’t use the catalog resolver for 
file:read-binary()). However, we still want to be able to refer to these 
resources by a canonical URI such as 
http://transpect.io/fontlib/dejavu-sans/condensed-regular/DejaVuSansCondensed.ttf 
(which refers to a local copy of 
https://subversion.le-tex.de/common/fontlib/dejavu-sans/condensed-regular/DejaVuSansCondensed.ttf, 
using https://subversion.le-tex.de/common/fontlib/xmlcatalog/catalog.xml 
for the resolution).


A detail: We usually rely on the XML catalog resolver to resolve the URI 
to the XML catalog that we supply to the XSLT-based resolver. In a 
typical transpect project, the canonical catalog URI is at 
http://this.transpect.io/xmlcatalog/catalog.xml which resolves to 
{local_project_base_uri}/xmlcatalog/catalog.xml. Then we use this 
catalog to resolve URIs of non-XML resources.




Re: [basex-talk] supporting XML Catalog files in xslt:transform() (patch)

2019-03-13 Thread Imsieke, Gerrit, le-tex




On 13.03.2019 19:55, Liam R. E. Quin wrote:

Yes, they are a bit of a nightmare. Actually i’ve thought about having
the ability to write a URI Resolver in XQuery,
 db:resolve-identifier($system, $public, $purpose, $types) as
xs:anyURI?

but maybe it is too scary!


I’ve already written a catalog resolver in XSLT…
https://github.com/transpect/xslt-util/blob/master/xslt-based-catalog-resolver/xsl/resolve-uri-by-catalog.xsl


Re: [basex-talk] BaseX GUI just spins?

2019-08-21 Thread Imsieke, Gerrit, le-tex




On 21.08.2019 13:24, Buddy Kresge wrote:
Thanks for these ideas and will try these.  As far as #4, what is 
‘SSCCE’ – sorry in advance for the not recognizing (ha ha).


LMGTFY…

http://letmegooglethat.com/?q=SSCCE

SCNR

– Gerrit


Re: [basex-talk] catalog.xml - xsd - urn

2019-10-21 Thread Imsieke, Gerrit, le-tex

tl;dr

– Don’t worry that the namespace is a URN.
– Don’t confuse namespaces with schema locations.
– Apparently BaseX cannot use a catalog resolver for resolving schema 
locations.
– Use other, more or less portable ways of accessing the schemas, for 
example by storing them in a database or putting the base file system 
path into an external variable.
– There is no default catalog location; you specify the catalog to use 
with the CATFILE option.



Hi,

You can store an XML catalog file anywhere you like. Then you set the 
option CATFILE to this file location, or you can do it in the XQuery 
file like this:


declare option db:catfile "path/relative/to/cwd/catalog.xml";

Of course you can also supply an absolute path.

On 21.10.2019 22:05, SW-Service wrote:

Good day,
Where is the catalog.xml file stored?
I want to validate xml files against XSD, but the xsd is referenced via 
an urn.

thank you very much

xmlns:xdomea="urn:xoev-de:xdomea:schema:2.3.0">




This line only contains a namespace declaration and no schema association.

Does your document contain an xsi:schemaLocation attribute? Then this 
can be taken into account for validation, see [1].


Suppose the schema location had the same URN as the namespace, I’d 
expect a document like this:


http://www.w3.org/2001/XMLSchema-instance;
  xsi:schemaLocation="urn:xoev-de:xdomea:schema:2.3.0 
urn:xoev-de:xdomea:schema:2.3.0">



The xsi:schemaLocation attribute contains first the namespace, then the 
schema location (in this example, both happen to be the same URI). Only 
the latter is subject to catalog resolution.
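
To make the pairing explicit, here is a small Python sketch that splits an xsi:schemaLocation value into (namespace, location) pairs. The function name is made up for illustration; the only assumption is the usual whitespace-separated pair layout of the attribute.

```python
def schema_location_pairs(value):
    """Split an xsi:schemaLocation value into (namespace, location) pairs."""
    tokens = value.split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must contain an even number of URIs")
    # even positions are namespaces, odd positions are schema locations
    return list(zip(tokens[0::2], tokens[1::2]))

pairs = schema_location_pairs(
    "urn:xoev-de:xdomea:schema:2.3.0\n urn:xoev-de:xdomea:schema:2.3.0"
)
for namespace, location in pairs:
    print("namespace:", namespace, "| location:", location)
```

Only the second member of each pair would be handed to a catalog resolver.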


Suppose you have a catalog


  


and the XSD foobar.xsd (in the same directory as the catalog):

http://www.w3.org/2001/XMLSchema;
  targetNamespace="urn:xoev-de:xdomea:schema:2.3.0">
  


Then oXygen, for example, will honor the catalog’s mapping and validate 
against foobar.xsd.
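
What such a mapping amounts to can be sketched in a few lines of Python. This is an illustration only: the catalog text below is a hypothetical stand-in (the original catalog is not preserved in this archive), the base URI is invented, and only plain uri entries are handled, not the rewrite or suffix entries that a real catalog resolver also supports.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"

# Hypothetical catalog, assumed for this sketch.
HYPOTHETICAL_CATALOG = """\
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <uri name="urn:xoev-de:xdomea:schema:2.3.0" uri="foobar.xsd"/>
</catalog>
"""

def resolve(catalog_xml, catalog_base, identifier):
    """Map identifier through the catalog's uri entries; unmapped URIs pass through."""
    root = ET.fromstring(catalog_xml)
    for entry in root.iter(f"{{{CATALOG_NS}}}uri"):
        if entry.get("name") == identifier:
            # relative targets are resolved against the catalog's own location
            return urljoin(catalog_base, entry.get("uri"))
    return identifier

resolved = resolve(
    HYPOTHETICAL_CATALOG,
    "file:///C:/xoev/catalog.xml",  # assumed location of the catalog
    "urn:xoev-de:xdomea:schema:2.3.0",
)
print(resolved)
```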


But I just found out that BaseX’s XSD validator will not use the catalog 
in order to resolve the schema location’s URI.


I think BaseX uses catalogs only for two things: When importing files 
into a database and, recently, for the XSLT processor [2].


It doesn’t seem to use it for doc(), either, so reading 
doc('urn:xoev-de:xdomea:schema:2.3.0'), which should resolve to 
foobar.xsd, doesn’t work.


This means if you use BaseX, you need to access the schema in another 
way. You can put all the XOEV schemas in a database and validate like this:


validate:xsd-report('C:/…/mydoc.xml',
db:open('xoev-schema', 'foobar.xsd'))

assuming that xoev-schema is the name of the database and foobar.xsd the 
relative path of the schema, relative to where you imported it from.


You can also refer to an actual location in the file system, with the 
base path optionally declared by an external variable, in order not to 
make it too dependent on the given directory structure.


declare namespace xs = 'http://www.w3.org/2001/XMLSchema';
declare variable $basedir-uri external := 
file:path-to-uri(Q{org.basex.util.Prop}HOMEDIR()) || 'xoev/xsd/';


validate:xsd-report('C:/cygwin/home/gerrit/XML/basex/2019-10-21_xoev/Untitled14.xml',
doc($basedir-uri || 'foobar.xsd'))

This lets users supply a value for $basedir-uri (using the $x button in 
the GUI) if they don’t have the schemas in the default location, which 
is the user’s home directory plus 'xoev/xsd/' in this example.


Please let me know where I lost you on the way.

Gerrit


[1] http://docs.basex.org/wiki/Validation_Module#XML_Schema_Validation
[2] https://github.com/BaseXdb/basex/issues/1719



Re: [basex-talk] catalog.xml - xsd - urn

2019-10-22 Thread Imsieke, Gerrit, le-tex




On 21.10.2019 22:05, SW-Service wrote:
xmlns:xdomea="urn:xoev-de:xdomea:schema:2.3.0">


Somewhat unrelated to your initial question, and it was probably someone 
else’s idea to put a minor version number in a namespace – but this is 
considered bad practice.


All processing applications, for example XSLT stylesheets or editing 
environments, need to be adapted when the namespace changes. This can 
probably be avoided by matching only local names in XSLT, but why use 
namespaces at all when you end up ignoring them?


If a schema is likely to undergo significant changes between major 
versions, it might be acceptable to include the major version in the 
namespace. In general, it is better to use a version attribute in the 
document instead so that transformation rules, editor configurations or 
schema co-occurrence constraints can be customized to reflect specific 
differences between the versions.
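
A rough Python sketch of that idea: the namespace stays stable, so namespace-qualified lookups keep working, while a version attribute lets the consumer branch where needed. The namespace, the attribute name "version" and the two layout labels are all hypothetical.

```python
import xml.etree.ElementTree as ET

NS = "urn:x-example:records"  # hypothetical namespace without a version number

def describe(xml_text):
    """Return the title and the handling chosen from the version attribute."""
    root = ET.fromstring(xml_text)
    major = int(root.get("version", "1.0").split(".")[0])
    # the same qualified name works for every version of the vocabulary
    title = root.findtext(f"{{{NS}}}title")
    layout = "legacy" if major < 3 else "current"
    return title, layout

old_doc = f'<r:record xmlns:r="{NS}" version="2.3"><r:title>A</r:title></r:record>'
new_doc = f'<r:record xmlns:r="{NS}" version="3.0"><r:title>B</r:title></r:record>'
results = [describe(old_doc), describe(new_doc)]
print(results)
```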


Gerrit


Re: [basex-talk] basex OOM on 30GB database upon running /dba/db-optimize/

2019-10-03 Thread Imsieke, Gerrit, le-tex

Hi,

just saying that 16 GB of DDR3 RAM cost about 40 € now.

Gerrit

On 03.10.2019 08:53, first name last name wrote:

I tried again, using SPLITSIZE = 12 in the .basex config file.
The batch (console) script I used is attached: mass-import.xq
This time I didn't do the optimize or index creation post-import; 
instead, I did it as part of the import, similar to what is described 
in [4].
This time I got a different error, namely 
"org.basex.core.BaseXException: Out of Main Memory."
So right now I'm a bit out of ideas. Would AUTOOPTIMIZE make any 
difference here?


Thanks

[4] http://docs.basex.org/wiki/Indexes#Performance


On Wed, Oct 2, 2019 at 11:06 AM first name last name 
<randomcod...@gmail.com> wrote:


Hey Christian,

Thank you for your answer :)
I tried setting SPLITSIZE = 24000 in .basex, but I've seen the same
OOM behavior. It looks like the memory consumption is moderate until
it reaches about 30GB (the size of the db before optimize), and then
memory consumption spikes and OOM occurs. Now I'm trying with
SPLITSIZE = 1000 and will report back if I get OOM again.
Regarding what you said, it might be that the merge step is where
the OOM occurs (I wonder if there's any way to control how much
memory is being used inside the merge step).

To quote the statistics page from the wiki:
"Databases in BaseX are light-weight. If a database limit is reached,
you can distribute your documents across multiple database instances
and access all of them with a single XQuery expression."
This sounds like sharding to me. I would probably be able to split
the documents into chunks and upload them under databases with the
same prefix but varying suffixes, which seems a lot like shards. By
doing this I think I can avoid OOM, but if BaseX provides other,
better, maybe native mechanisms for avoiding OOM, I would try them.
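
The prefix-plus-suffix idea can be sketched as a simple planning step. This Python example is only an illustration under stated assumptions: the function name, the "posts-" prefix, the file names and the byte budget are all invented, and it merely decides which document goes into which database; the actual import would still be done with BaseX commands.

```python
def plan_shards(doc_sizes, max_bytes, prefix="posts-"):
    """Group (name, size) pairs into databases that stay under max_bytes each."""
    shards, current, used = [], [], 0
    for name, size in doc_sizes:
        # start a new shard when the next document would exceed the budget
        if current and used + size > max_bytes:
            shards.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        shards.append(current)
    return {f"{prefix}{i:03d}": docs for i, docs in enumerate(shards, 1)}

# Hypothetical documents with sizes in arbitrary units.
plan = plan_shards(
    [("a.xml", 40), ("b.xml", 70), ("c.xml", 30), ("d.xml", 90)],
    max_bytes=100,
)
for database, docs in plan.items():
    print(database, docs)
```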

Best regards,
Stefan


On Tue, Oct 1, 2019 at 4:22 PM Christian Grün
<christian.gr...@gmail.com> wrote:

Hi first name,

If you optimize your database, the indexes will be rebuilt. In this
step, the builder tries to guess how much free memory is still
available. If memory is exhausted, parts of the index will be split
(i. e., partially written to disk) and merged in a final step.
However, you can circumvent the heuristics by manually assigning a
static split value; see [1] for more information. If you use the DBA,
you’ll need to assign this value to your .basex or the web.xml file
[2]. In order to find the best value for your setup, it may be easier
to play around with the BaseX GUI.
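
The general split-and-merge technique described here can be sketched as follows. This is not BaseX's implementation: the function names are invented, split_size is only loosely analogous to a static split value, and the sorted runs are kept in lists where a real builder would write them to disk.

```python
import heapq

def build_runs(tokens, split_size):
    """Sort the input in bounded chunks; each chunk stands in for a partial index."""
    runs, buffer = [], []
    for token in tokens:
        buffer.append(token)
        if len(buffer) >= split_size:
            runs.append(sorted(buffer))  # a real builder would flush this to disk
            buffer = []
    if buffer:
        runs.append(sorted(buffer))
    return runs

def merge_runs(runs):
    """Final step: merge the sorted partial runs into one sorted sequence."""
    return list(heapq.merge(*runs))

tokens = ["pear", "apple", "fig", "kiwi", "date", "plum", "lime"]
runs = build_runs(tokens, split_size=3)
merged = merge_runs(runs)
print(len(runs), "runs ->", merged)
```

A smaller split size means more, smaller runs and lower peak memory during the build, at the cost of more work in the merge step.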

As you have already seen in our statistics, an XML document has
various properties that may represent a limit for a single database.
Accordingly, these properties make it difficult to decide for the
system when the memory will be exhausted during an import or index
rebuild.

In general, you’ll get best performance (and your memory consumption
will be lower) if you create your database and specify the data to be
imported in a single run. This is currently not possible via the DBA;
use the GUI (Create Database) or console mode (CREATE DB command)
instead.

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/Options#SPLITSIZE
[2] http://docs.basex.org/wiki/Configuration



On Mon, Sep 30, 2019 at 7:09 AM first name last name
<randomcod...@gmail.com> wrote:
 >
 > Hi,
 >
 > Let's say there's a 30GB dataset [3] containing most threads/posts
 > from [1].
 > After importing all of it, when I try to run /dba/db-optimize/ on it
 > (which must have some corresponding command) I get the OOM error in
 > the stacktrace attached. I am using -Xmx2g so BaseX is limited to 2GB
 > of memory (the machine I'm running this on doesn't have a lot of
 > memory).
 > I was looking at [2] for some estimates of peak memory usage for this
 > "db-optimize" operation, but couldn't find any.
 > Actually it would be nice to know peak memory usage because, of
 > course, for any database (including BaseX) a common operation is to
 > do server sizing, to know what kind of server would be needed.
 > In this case, it seems like 2GB memory is enough to import 340k
 > documents, weighing in at 30GB total, but it's not enough to run
 > "dba-optimize".
 > Is there any info about peak memory usage on [2]? And are there
 > guidelines for large-scale collection imports like I'm trying to do?
 >
 > Thanks,
 > Stefan
 >
 > [1] https://www.linuxquestions.org/
 > [2] 

Re: [basex-talk] How to escape/encode a search term using BaseX REST XQ

2020-01-24 Thread Imsieke, Gerrit, le-tex
While moving the URI parameter to the query string seems like an 
acceptable workaround, I, too, suggest that if *reserved* URI characters 
such as '/' appear percent-encoded, they should not be converted to 
their decoded character prior to analyzing the URI, in line with Sect. 
2.2 of RFC 3986 [1].


If I enter an escaped colon (%3A) in a path segment, it will be kept as 
%3A by BaseX, rather than converted to the reserved character ':'.


The RESTXQ specification [2] doesn’t seem to contain detailed 
instructions on how to decode the submitted URI before extracting path 
parameters, therefore I think RFC 3986 should prevail.
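
The difference between the two orders of operation can be shown with Python's standard urllib. This is a sketch of the principle only, with invented function names, and not a description of how BaseX or any RESTXQ implementation processes paths.

```python
from urllib.parse import quote, unquote

def split_then_decode(raw_path):
    """Split on literal slashes first, then decode each segment."""
    return [unquote(segment) for segment in raw_path.strip("/").split("/")]

def decode_then_split(raw_path):
    """Decode first: an escaped %2F becomes indistinguishable from a separator."""
    return unquote(raw_path).strip("/").split("/")

raw = "/search/" + quote("tea/time", safe="")
print(raw)
print(split_then_decode(raw))
print(decode_then_split(raw))
```

Only the first variant keeps the escaped slash inside a single path segment, so that it can be bound to one path parameter.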


So I agree, BaseX should not interpret escaped slashes as if they were 
regular slashes, thereby disallowing them as part of RESTXQ path pa


Gerrit

[1] https://tools.ietf.org/html/rfc3986#section-2.2
[2] 
http://exquery.github.io/exquery/exquery-restxq-specification/restxq-1.0-specification.html


On 24.01.2020 13:54, Ivan Kanakarakis wrote:

Hi Christian,

thanks for the quick reply. It definitely helps, but it still keeps
this behaviour in the "weird" domain.
I do not see a reason to be decoding the URI before it gets to match a
route. What is the reason for this?

What you propose works, but if I have a route like
"/search/{$query=.+}/page/{$page}", then the query will match
everything including "/page/...". If the path was not decoded, I do
not think I would need the regex, neither any other special operation
on the route. It should work with "/search/{$query}/page/{$page}" and
it should return "tea%2Ftime". Why do I have to make workarounds to
try to guess how a part of the URL was encoded, when the URL I hit has
that part encoded?
I don't think it makes sense, and I don't see a use case for this.

When the framework receives the payload, it is responsible for matching a
route. By matching the route, it will provide me with the bound parts of
the route that I requested.
Then, *I* am responsible for decoding those parts as I see fit and
handling the request as I need.

If the framework decodes the URL before matching a route, that is a
problem for me - I do not have the control I need.
If the framework decodes the URL parts before binding the route
variables, this is fine - it saves me an operation.

Although I have now refactored the endpoint handlers to work with query
params, and this is no longer a problem for me, it is a problem in
general.


Cheers,



On Mon, 20 Jan 2020 at 19:36, Christian Grün  wrote:


Hi Ivan,

A more common approach is to supply search terms as query parameters
(URL?query=...); in that case, your path won’t have new segments. If
you prefer paths, you can use a regular expression in your RESTXQ path
pattern [1]:

   "search/{$query=.+}"

In both cases, encodeURIComponent should be the appropriate function
to encode special characters.
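
For readers who are not calling from JavaScript, the query-parameter variant can be sketched with Python's standard library. The host and the parameter name "query" are placeholders, not part of any BaseX API.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

term = "tea/time"
# urlencode percent-encodes reserved characters such as "/" in the value
url = "https://example.com/search?" + urlencode({"query": term})
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(url)
print(decoded)

# For a path segment, quote with safe="" leaves no character unescaped by default
segment = quote("tea time", safe="")
print(segment)
```

In the query string the slash needs no special treatment on the server side, because it is never interpreted as a path separator there.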

Hope this helps,
Christian

[1] http://docs.basex.org/wiki/RESTXQ#Paths





On Mon, Jan 20, 2020 at 10:54 AM Ivan Kanakarakis
 wrote:


Hello everyone,

I am using BaseX 8.44 and the REST XQ interface (ie,
http://docs.basex.org/wiki/RESTXQ). I have an endpoint that, when
invoked with GET, it does a full text search (using "$db-nodes[text()
contains text { $term } all]"), gets the results, constructs a JSON
response and sends it back.

That's all fine and works great. However, I am not sure how I should
be doing the queries I describe below.

_Note: the query is initiated by a SPA javascript client, thus when I
say encode/uri-escape, what I mean is that I invoke the
encodeURIComponent function
(https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/encodeURIComponent).
_Note 2: for the sake of conversation let's consider the example
endpoint declared as:

 %rest:GET
 %rest:path("/search/{$term}")


1. I want to search for "tea". That is the basic query. A single term,
no problem.

 curl -s "https://example.com/search/tea"


2. I want to search for "tea time". Now, this query has a space in
between the two words. What I expect to get back, is any node that
contains both words (thus I have used "contains text" with "all"),
even if they may be a few words apart.
- Should I be sending an encoded/uri-escape version of this, ie, "tea%20time"?
- Or, should I be replacing the space with "+", ie "tea+time"?
- Or, some other advice?

 curl -s "https://example.com/search/tea%20time"
 curl -s "https://example.com/search/tea+time"


3. I want to search for "tea/time". This is even trickier. What I
expect to get back, is any node that contains "tea/time", ie a search
result for a single term. How do I do this?
- If I do not do anything, the slash is treated as part of the URL,
thus not matching a route.
- If I encoded/uri-escape this term, I get "tea%2Ftime". But, when I
invoke the endpoint I get the same as if there was a slash.
- I am not sure how I should deal with the slash. How should I
escape/encode this?

 curl -s "https://example.com/search/tea/time"
 curl -s 

Re: [basex-talk] How to escape/encode a search term using BaseX REST XQ

2020-01-24 Thread Imsieke, Gerrit, le-tex




On 24.01.2020 14:36, Imsieke, Gerrit, le-tex wrote:
So I agree, BaseX should not interpret escaped slashes as if they were 
regular slashes, thereby disallowing them as part of RESTXQ path pa


…rameters.


Re: [basex-talk] Performance loss between version 9.2.4 and 9.3.2 when executing specific xQuery

2020-05-08 Thread Imsieke, Gerrit, le-tex
Just saying that I find it sooo interesting to learn at which places and 
for which purposes BaseX is being employed. Have a nice weekend!


On 08.05.2020 13:31, BIRKNER Michael wrote:

Hi Christian,


thank you for your answers. As you can guess the queries I sent in my 
original email are just simplified  examples.



The real XML structure is like the following (it's library data in 
"MarcXML" format; you can see an example here: 
https://www.loc.gov/standards/marcxml/Sandburg/sandburg.xml)




… … …



Mag. Michael Birkner
AK Wien - Bibliothek
1040, Prinz Eugen Straße 20-22
T: +43 1 501 65 12455
F: +43 1 501 65 142455
M: +43 664 88957669



Re: [basex-talk] BaseX XSLT fails after returning first result

2020-05-21 Thread Imsieke, Gerrit, le-tex
Alternatively, apply your stylesheet using xslt:transform-text() instead 
of xslt:transform().


On 21.05.2020 18:33, Imsieke, Gerrit, le-tex wrote:

Hi Tom,

The problem is most probably that your XSLT doesn’t create a *primary* 
output. It just writes something to another result-document. However, 
the interface for invoking an XSLT expects some result document. So if 
you just create a  element next to  and 
then discard it, it should just work.


Gerrit



Re: [basex-talk] BaseX XSLT fails after returning first result

2020-05-21 Thread Imsieke, Gerrit, le-tex

Hi Tom,

The problem is most probably that your XSLT doesn’t create a *primary* 
output. It just writes something to another result-document. However, 
the interface for invoking an XSLT expects some result document. So if 
you just create a  element next to  and 
then discard it, it should just work.


Gerrit

On 21.05.2020 17:22, Furst, Thomas wrote:
I have a large XML file stored in BaseX that I need to split up into 
smaller, modular documents. I have created an XSL file to do so:


http://www.w3.org/1999/XSL/Transform;
     xmlns:xs="http://www.w3.org/2001/XMLSchema"
     exclude-result-prefixes="xs"
     version="2.0">
     method="xml">
     systemDiffCode="{$sysDiff}">
     select="$sys"/>
     select="$subsys"/>
     select="$subsubsys"/>
     select="$assy"/>
     select="$disassy"/>
     name="disassyCodeVariant" select="$disassyv"/>
     select="$info"/>
     select="$infov"/>
     select="$itemloc"/>
     countryIsoCode="US"/>



My issue is that when using the xslt:transform() function in BaseX it 
only returns (creates) the first document. This is the XQuery I have for 
this:


let $style := doc('file:///C:/base.xsl')
for $d in doc('file:///C:/Users/tfurst/Documents/Book1-test.xml')//title
for $newD in //*[title]
where $newD/title/@id eq $d/@id
let $schema := $d/schemaName
let $model := $d/modelic
let $sdc := $d/sdc
let $sys := $d/systemCode
let $subsys := $d/subsys
let $subsubsys := $d/subsubsys
let $assy := $d/assy
let $disassy := $d/disassy
let $disassyv := $d/disassyv
let $info := $d/infoCode
let $infov := $d/infov
let $itemloc := $d/itemloc
let $tname := $d/tname
let $iname := $d/iname
return xslt:transform($newD, $style, map {
  "outputDir": "file:///G:/LMA-Conv/Flight/test-conv-out",
  "model": $model, "sysDiff": $sdc, "sys": $sys, "subsys": $subsys,
  "subsubsys": $subsubsys, "assy": $assy, "disassy": $disassy,
  "disassyv": $disassyv, "info": $info, "infov": $infov,
  "itemloc": $itemloc, "tname": $tname, "iname": $iname
})


The document named Book1-test.xml is essentially a map of existing IDs 
of elements to new output file names. After it creates the first XML 
output file, BaseX returns ERROR [FODC0002] "" (Line 1): Premature end of 
file. When I looked up the error code in the BaseX documentation, this 
error is defined as "The specified document resource cannot be 
retrieved." Is there some limitation to the use of the xslt:transform 
function in a loop? I do not understand why the document could be 
retrieved the first time, but not after that. I have tried moving the 
XSL to different file locations, with no luck. Am I missing something 
ridiculously obvious here? This XSL stylesheet is a very simplified 
version of what it will ultimately become; I wanted to get the process 
down before fully developing the XSL.


Thank you,

Tom Furst



Re: [basex-talk] xslt:transform function not working with XML Catalog

2020-07-09 Thread Imsieke, Gerrit, le-tex

Is the catalog schemas/catalog.xml residing in file:current-dir()?

On 09.07.2020 16:02, Lizzi, Vincent wrote:

Hi Gerrit,

Thank you for the hint! Removing quotes from the pragma did not work in 
this case.


   (# db:catfile schemas/catalog.xml #)

The catalog file is also configured at the beginning of the query:

declare option db:catfile 'schemas/catalog.xml';

This detail about not needing quotes in a pragma is worth remembering 
though!


Vincent



Re: [basex-talk] xslt:transform function not working with XML Catalog

2020-07-10 Thread Imsieke, Gerrit, le-tex

Hi Vincent,

can you show us your catalog?

Since you mention that it chokes on finding the DTD, it might be that 
you need rewriteSystem instead of rewriteURI for the DTD locations.


Also, if you don't resolve by public ID and refer to the DTD by relative 
path, this path will be made absolute before being catalog-resolved, so 
instead of  you can use  in order to only match 
the tail of the path.


On the other hand, since you say it’s running with standalone Saxon, the 
same DTD resolution issues should be expected there.


It may well be the case that our efforts to make DTD resolution 
available to xslt:transform() only focused on supporting xsl:import and 
xsl:include while not passing the resolver to the doc() function.


Maybe Liam can investigate this in more depth. In that case, I suggest 
asking for a budget at Taylor & Francis. We paid Liam to explore and 
enable the use of catalogs for xsl:import and xsl:include, and he 
successfully dug through the mess of different interfaces etc. for this 
limited but important use case. So he is *the* expert in this field, and 
I’d like to warmly recommend paying him so that he can explore and fix 
the issue.


Gerrit

On 10.07.2020 07:55, Lizzi, Vincent wrote:

Hi Liam,

Thanks for the helpful suggestions. After trying everything you 
suggested and then also trying a few of Saxon’s configuration options, 
unfortunately I’m still having the same problem. Trying a shell script 
that contains the following:


MAIN="$( cd -P "$(dirname "$FILE")/../basex" && pwd )"

CP=$MAIN/BaseX.jar:$MAIN/lib/custom/*:$MAIN/lib/*:$CLASSPATH

echo 1 Saxon

java -cp "$CP" net.sf.saxon.Transform -s:input1.xml -xsl:transform.xsl \
  -catalog:schemas/catalog.xml


echo 2 BaseX transform

java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) 
(# db:intparse false #) (# db:dtd true #) (# db:chop false #) { 
xslt:transform('input1.xml', 'transform.xsl') }"


echo 3 BaseX transform with Saxon features configured

java \
  -Dhttp://saxon.sf.net/feature/entityResolverClass=org.apache.xml.resolver.tools.CatalogResolver \
  -Dhttp://saxon.sf.net/feature/uriResolverClass=org.apache.xml.resolver.tools.CatalogResolver \
  -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) (# 
db:intparse false #) (# db:dtd true #) (# db:chop false #) { 
xslt:transform('input1.xml', 'transform.xsl') }"


echo 4 BaseX doc to show XML Catalog is configured correctly to parse XML

java -cp "$CP" org.basex.BaseX -q"(# db:catfile schemas/catalog.xml #) 
(# db:intparse false #) (# db:dtd true #) (# db:chop false #) { 
doc('input1.xml') }"


The classpath includes BaseX 9.3.3, Saxon HE 9.9, xml-resolver-1.2.jar, 
and CatalogManager.properties


 1. The transformation works in Saxon and uses the catalog file to
locate the DTD when parsing the XML input1.xml.
 2. The BaseX xslt:transform should work the same as #1, but fails
because the DTD cannot be read
 3. Adding Saxon configuration for Entity Resolver Class and URI Resolver
Class did not help
 4. Simply parsing the XML using doc() in BaseX with the same
configuration shows that the XML catalog is configured correctly
within BaseX

Using strace -f, the log shows that BaseX xslt:transform is reading the 
catalog.xml file from disk, and then is trying (and failing) to read the 
DTD from the non-working URI.


This might be a bug in xslt:transform, so the workaround of using a 
regular expression replace on the DOCTYPE system URI is probably the 
practical solution.
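
The message does not show how that workaround is implemented, so the following is only a sketch of the idea in shell with sed: rewrite the system identifier in the DOCTYPE to a local path before the file is handed to xslt:transform. The function name and the example paths are hypothetical, and it assumes the DOCTYPE declaration sits on one line and uses double quotes.

```shell
# Hypothetical helper: print the XML file $1 with the DOCTYPE system
# identifier replaced by $2. Handles both the SYSTEM "uri" form and the
# PUBLIC "public-id" "uri" form.
# Limitations: DOCTYPE must be on a single line and use double quotes;
# $2 must not contain '#' or '&' (sed delimiter / back-reference).
rewrite_doctype_system_id() {
  sed -E "s#(<!DOCTYPE[[:space:]]+[^[:space:]]+[[:space:]]+(SYSTEM|PUBLIC[[:space:]]+\"[^\"]*\")[[:space:]]+)\"[^\"]*\"#\\1\"$2\"#" "$1"
}

# Example use (file names are placeholders):
#   rewrite_doctype_system_id input1.xml schemas/article.dtd > input1-local.xml
```

The public identifier, if any, is left untouched, so only the location the parser tries to fetch changes.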


Many thanks,

Vincent

_

*Vincent M. Lizzi*

Head of Information Standards | Taylor & Francis Group

vincent.li...@taylorandfrancis.com 



Information Classification: General

*From:* Liam R. E. Quin 
*Sent:* Thursday, July 9, 2020 12:55 PM
*To:* Lizzi, Vincent ; BaseX 

*Subject:* Re: [basex-talk] xslt:transform function not working with XML 
Catalog


On Thu, 2020-07-09 at 04:32 +, Lizzi, Vincent wrote:
 > Hi Liam,
 >
 > Thanks for the reply and suggestions. Based on your suggestion I
 > tried pragmas and strace, and had another go at
 > CatalogManager.properties, but they've not had any effect.

use, strace -f java >& hugelogfile.txt
and after, grep -i catalogmanager.properties hugelogfile.txt
and you should see where it's looking. If it doesn't look for that
file, check to see if it opened the jar file containing the resolver.
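
The strace-and-grep step above can be wrapped in a small filter once the log exists. This is a sketch only: the function name is made up, and the pattern simply assumes the usual open(...)/openat(...) lines that strace prints.

```shell
# Hypothetical helper: from an strace log ($1), print every open()/openat()
# call whose path mentions a catalog, CatalogManager.properties, or a
# resolver jar (case-insensitive). Failed lookups show up with ENOENT.
catalog_opens() {
  grep -iE 'open(at)?\(.*(catalog|resolver)' "$1"
}

# Example use (strace -o writes the trace to a file):
#   strace -f -o hugelogfile.txt java -cp "$CP" org.basex.BaseX ...
#   catalog_opens hugelogfile.txt
```

If nothing is printed, neither the properties file nor the catalog was ever looked for, which points at the classpath rather than at the catalog contents.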

If you're running BaseX from Oxygen, Oxygen needs to have it in its
classpath too i think.

Also, of course, see if the catalog file is actually being opened!

I actually wrote some of the code in BaseX that makes XML catalogs work
with transform(), or provided a rough draft that Christian improved :),
and debugging it was... interesting.

I'd also try an absolute path for the catalog file - if you are using
the BaseX server, relative paths will be relative to the directory
(folder) where the server itself is running. (and of course the server
needs 

Re: [basex-talk] xslt:transform function not working with XML Catalog

2020-07-10 Thread Imsieke, Gerrit, le-tex

On 10.07.2020 08:23, Imsieke, Gerrit, le-tex wrote:
It may well be the case that our efforts to make DTD resolution 
available to xslt:transform() only focused on supporting xsl:import and 
xsl:include while not passing the resolver to the doc() function.


s/DTD resolution/catalog resolution/


Re: [basex-talk] xslt:transform function not working with XML Catalog

2020-07-08 Thread Imsieke, Gerrit, le-tex

Hi Vincent,

I feel your pain. Maybe this comment helps: 
https://github.com/BaseXdb/basex/issues/1793#issuecomment-579134499 
(omit the quotes in the pragma).


I documented it here, too: 
https://docs.basex.org/wiki/Catalog_Resolver#Additional_Notes
“The catalog location in the pragma can be given relative to the current 
working directory (the directory that is returned by file:current-dir()) 
or as an absolute operating system path. The catalog location in the 
pragma is not an XQuery expression; no concatenation or other operations 
may occur in the pragma, and the location string must not be surrounded 
by quotes.”


Gerrit

On 09.07.2020 06:32, Lizzi, Vincent wrote:

Hi Liam,

Thanks for the reply and suggestions. Based on your suggestion I tried 
pragmas and strace, and had another go at CatalogManager.properties, but 
they’ve not had any effect. (I’m using Windows 10 but was able to run 
strace in Ubuntu via WSL). This query:


try {

   (# db:catfile 'schemas/catalog.xml' #)

   (# db:intparse false #)

   (# db:dtd true #)

   (# db:chop false #)

   { xslt:transform('file.xml', 'stylesheet.xsl')//inlinegraphic }

} catch * { $err:description }

Produces the same error again due to the DTD not being available at the 
system literal URI.


I did try setting verbosity 99 in a CatalogManager.properties file on 
the classpath, but this did not produce any additional messages. I also 
tried setting the same properties directly when launching BaseX; this did 
not work either. Specifically, I set the following system properties 
when launching BaseX, and then used proc:property() in a query to 
confirm that these system properties were in fact set.


'xml.catalog.verbosity': '99'

'xml.catalog.ignoreMissing': 'no'

'xml.catalog.catalog-class-name': 'org.apache.xml.resolver.Resolver'

'xml.catalog.files': 'schemas/catalog.xml'

xml-resolver-1.2.jar and Saxon are definitely on the classpath.

Thanks,

Vincent


*From:* Liam R. E. Quin 
*Sent:* Wednesday, July 8, 2020 10:28 PM
*To:* Lizzi, Vincent ; BaseX 

*Subject:* Re: [basex-talk] xslt:transform function not working with XML 
Catalog


On Wed, 2020-07-08 at 22:46 +, Lizzi, Vincent wrote:
 > I've encountered a problem using xslt:transform to transform some
 > old XML that contains a DTD DOCTYPE system literal pointing to a non-
 > working URI and also uses ENTITYREF attributes to refer to image
 > files. I have the XML Catalog configured correctly using CATFILE.


If this is on Linux, using strace can help check which catalog file is
being used; you can also turn on debugging in a
CatalogManager.properties file containing the line
verbosity = 999
(the file needs to be in your Java classpath).

There's also a BaseX pragma, (# db:catfile path/to/catalog.xml #) {
transform(...)
}

You need to turn off the BaseX internal parser.

Make sure that the resolver library and of course saxon are in your
class path.

You may need to add,
declare option db:catfile "path/relative/to/cwd/catalog.xml";
to your query.

Liam

--
Liam Quin, https://www.delightfulcomputing.com/ 


Available for XML/Document/Information Architecture/XSLT/
XSL/XQuery/Web/Text Processing/A11Y training, work & consulting.
Barefoot Web-slave, antique illustrations: http://www.fromoldbooks.org



--
Gerrit Imsieke
Geschäftsführer / Managing Director
le-tex publishing services GmbH
Weissenfelser Str. 84, 04229 Leipzig, Germany
Phone +49 341 355356 110, Fax +49 341 355356 510
gerrit.imsi...@le-tex.de, http://www.le-tex.de

Registergericht / Commercial Register: Amtsgericht Leipzig
Registernummer / Registration Number: HRB 24930

Geschäftsführer / Managing Directors:
Gerrit Imsieke, Svea Jelonek, Thomas Schmidt


[basex-talk] Autocomplete with RESTXQ

2020-06-25 Thread Imsieke, Gerrit, le-tex

Hi List,

Can anyone recommend a lightweight vanilla Javascript autocomplete 
library that can easily be used together with BaseX RESTXQ? Maybe even a 
readily cloneable/modifiable example?
I don’t have a preference for a server response format. The RESTXQ 
service may be configured to return XML, JSON, or HTML.
No CORS restrictions need to be considered, the query host is the same 
that delivers the HTML pages.


Gerrit


Re: [basex-talk] Autocomplete with RESTXQ

2020-06-27 Thread Imsieke, Gerrit, le-tex

Thanks Michael!

This looks quite simple but you probably saved me two to four hours of 
figuring out which lib to use, how to invoke the completion and how to 
shape the server response. Will try to use it in my app tomorrow.


And since there has been a question on this list recently which sites 
are built on BaseX, I will add to the list the site that I’m currently 
working on (Hogrefe Publishing’s Clinical Handbook of Psychotropic 
Drugs) probably in September, when it will have been relaunched. 
Accessing the site requires a subscription though, but maybe the 
publisher will decide that at least the search be free, enticing more 
mental health practitioners into subscribing when they want to access 
the search results…


Gerrit

On 26.06.2020 16:34, Michael Seiferle wrote:

Hi Gerrit,

I came up with the following example — 
https://git.basex.io/basex-public/mailinglist-autocomplete

Hope this helps — feel free to ask for more. I simply chose the first autocomplete 
library that showed up when asking google for "autocomplete lightweight" ;-)

Probably the most relevant changes compared to the default BaseX Config files 
are here:

Changed the Servlet mapping, so RESTXQ is output at `/api` (and not /)

https://git.basex.io/basex-public/mailinglist-autocomplete/-/blob/main/webapp/WEB-INF/web.xml#L83


And static-files are output at: `/` instead of the default `/static`

https://git.basex.io/basex-public/mailinglist-autocomplete/-/blob/main/webapp/WEB-INF/web.xml#L140



If you simply clone my repo, the relevant parts should be „correct“ — according 
to my personal taste ;-)

Feel free to ask for more help.

Michael


On 25.06.2020 at 18:31, Imsieke, Gerrit, le-tex wrote:

Hi List,

Can anyone recommend a lightweight vanilla Javascript autocomplete library that 
can easily be used together with BaseX RESTXQ? Maybe even a readily 
cloneable/modifiable example?
I don’t have a preference for a server response format. The RESTXQ service may 
be configured to return XML, JSON, or HTML
No CORS restrictions need to be considered, the query host is the same that 
delivers the HTML pages.

Gerrit






Re: [basex-talk] Autocomplete with RESTXQ

2020-08-31 Thread Imsieke, Gerrit, le-tex

Hi Michael,

Just to let you know that I finally and successfully used the full text 
index for autocompletion, based on your prototype code.


My endpoint for the lookup is like this:

declare
  %rest:path("/chpd/{$work}/complete/{$what}")
  %rest:single (: only run 1 query per client :)
  %output:method("json")
  %rest:GET
function chpd:complete($work, $what) {
  element json {
    attribute type {"array"},
    let $tokens := ft:tokens('CHPD_' || $work || '_hobots_FT')[starts-with(., $what)]
    return
      for $t in $tokens
      let $c := number($t/@count)
      order by $c descending
      count $rank
      where $rank le 10
      return element _ {
        attribute type {"object"},
        element label {string($t)},
        element value {string($t)}
      }
  }
};
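
For anyone who wants to poke at an endpoint of this shape from the command line, a minimal sketch follows. The base URL (localhost, port 8984, RESTXQ mapped at the root), the function names and the sample arguments are assumptions, not details from the application; the prefix is passed through as-is, so it has to be URL-safe.

```shell
# Assumed base URL; adjust host, port, and servlet mapping to your setup.
BASE_URL="${BASE_URL:-http://localhost:8984}"

# Build the lookup URL for a work ($1) and the prefix typed so far ($2).
complete_url() {
  printf '%s/chpd/%s/complete/%s\n' "$BASE_URL" "$1" "$2"
}

# Fetch the suggestions; requires curl and a running server.
complete() {
  curl -s "$(complete_url "$1" "$2")"
}

# Example use (work name and prefix are placeholders):
#   complete somework lith
```

Given the element json / element _ structure in the function above, the response should be a JSON array of objects with label and value keys, at most ten of them, ordered by descending token count.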

As I previously wrote, the site will (re-)launch later in September, and 
I will post a link then. Although it is behind a paywall, I will look 
into making the full text search and navigation lists available publicly 
in order to lure people into subscriptions.


The full text search is satisfyingly fast, at least when I’m the only 
user on the server.


Most other pages will be cached as HTML (with some placeholders for 
login/logout & user name) by the access control application, written in 
Ruby on Rails. This is because I render them dynamically from BITS with 
Saxon, and although the rendering is fast, it would be a waste of 
resources and a worse user experience if each page serving took 
additional ~ 200 ms of XSLT rendering time.


Among other reasons, I do render them dynamically (instead of serving 
statically rendered HTML) because there is a drug comparison 
functionality that takes 2 or more drug description pages (CHPD = 
Clinical Handbook of Psychotropic Drugs) and presents a side-by-side 
diff. The possible drug combinations for diffing are too numerous to 
pre-render them as HTML. Diffing the HTML instead of the BITS XML was 
not an option due to accidental changes in the output (the document 
structure of the BITS sources is more stable than the HTML output). 
Since all other pages use the same rendering XSLT as the comparison 
pages, I thought it would be too complicated to serve specific pages as 
pre-rendered HTML while dynamically rendering other pages. Therefore 
this HTML rendering happens dynamically, and it is cached also for the 
comparison pages.


Some more details: The BITS→HTML rendering is powered by our jats2html 
library 
(https://github.com/transpect/jats2html/blob/master/xsl/jats2html.xsl). 
In the XSLT you see imports like <xsl:import href="http://transpect.io/xslt-util/lengths/xsl/lengths.xsl"/>. These 
URIs don’t resolve immediately. They need a catalog resolver in order to 
resolve to local resources. A shoutout to Liam Quin (and to Christian) 
for making catalog resolution available to xsl:import and xsl:include, 
https://github.com/BaseXdb/basex/issues/1719, which has been quite a 
hairy issue.


If there is an XML Prague next year and if it features a BaseX user 
meeting (nudge nudge), I will be happy to present the application in 
greater detail.


Gerrit


On 28.06.2020 22:56, Michael Seiferle wrote:

You’re welcome.
Glad I could help save some time, I agree it looks simple, yet wrapping one's 
head around those small details can be a real showstopper sometimes :-)
Feel free to ask for more details anytime.

Looking forward to seeing said search portal!


Best from Konstanz
Michael
Sent from my iPhone


On 27.06.2020 at 14:13, Imsieke, Gerrit, le-tex wrote:

This looks quite simple but you probably saved me two to four hours of figuring 
out which lib to use, how to invoke the completion and how to shape the server 
response. Will try to use it in my app tomorrow.