Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Imsieke, Gerrit, le-tex
If you are allowed to share some snippets of the actual documents, it 
will be easier to see how the query needs to be phrased.


Have you verified that $biblFull and $biblStruct actually contain 
strings? If not, do you need to declare a default namespace? The 
vocabulary looks like TEI, so


declare default element namespace "http://www.tei-c.org/ns/1.0";

may be necessary. And if it is TEI, the ID attributes are probably 
called @xml:id rather than @id.
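
A minimal sketch of the namespace and attribute point, assuming the data really is TEI. The inline document and its IDs are made up for illustration:

```xquery
(: Sketch only: the sample document below is hypothetical. :)
declare default element namespace "http://www.tei-c.org/ns/1.0";

let $doc :=
  <TEI>
    <biblStruct xml:id="b1"/>
    <biblFull xml:id="b2"/>
  </TEI>
return (
  (: there is no plain @id attribute in this sample, so this counts 0 :)
  count($doc//biblStruct/@id),
  (: the IDs live in @xml:id :)
  data($doc//(biblStruct | biblFull)/@xml:id)
)
```

Without the default element namespace declaration, a path such as //biblStruct would look for elements in no namespace and select nothing from a TEI document.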


Gerrit


Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Chris Yocum
Hi Gerrit,

> Are you sure that the @target attributes are supposed to be identical to the
> IDs?

Yes, they should be.  If they are not, I need to find them so I can
fix them to be identical.

> Don’t you prepend a pound sign to @target attributes when they point to
> IDs within the same document?

They are not in the same document.  The @target attributes live spread
out in the other documents while the IDs all live in the same
document.

> So you probably need to say
> 
> where not(substring($title/@target,2) = $biblStruct) and
> not(substring($title/@target,2) = $biblFull)
>

I will give this a shot tomorrow when I am not as tired.

> And maybe you need to restrict the titles that you search to those with a
> @target attribute, like so:
> 
> for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]
>

This is the other half of the problem, which I did not state here. I
need to find all titles that do not have target attributes and then give
them a target attribute based on some rules.  I have done so in a few
files (and I am explicitly testing one of them in the query in my
previous email), and I will roll out the fix to all other files once I
have everything else tested and working.

I will give your suggestion a try tomorrow. Thanks!

All the best,
Chris


Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Imsieke, Gerrit, le-tex
Are you sure that the @target attributes are supposed to be identical to 
the IDs? Don’t you prepend a pound sign to @target attributes when they 
point to IDs within the same document?

So you probably need to say

where not(substring($title/@target,2) = $biblStruct) and 
not(substring($title/@target,2) = $biblFull)


And maybe you need to restrict the titles that you search to those with 
a @target attribute, like so:


for $title in collection('edil_target/eDIL-A.xml')//entry//title[@target]

Otherwise, titles without a @target attribute will also match the where
clause (comparing an empty sequence with = yields false, so not(...) is
true), which may be unintended.
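
Put together, a small self-contained sketch, assuming the targets carry a leading '#'. The IDs and the inline document are invented for illustration:

```xquery
(: Sketch only: sample IDs and entries are hypothetical. :)
let $ids := ('smith2001', 'jones1999')
let $doc :=
  <entries>
    <entry><title target="#smith2001">A</title></entry>
    <entry><title target="#missing">B</title></entry>
    <entry><title>C (no target)</title></entry>
  </entries>
(: only titles that actually have a @target are checked :)
for $title in $doc//entry//title[@target]
(: drop the leading '#' before comparing against the ID list :)
where not(substring($title/@target, 2) = $ids)
return $title
```

Only title B is returned here: A points at an existing ID, and C is skipped by the [@target] predicate.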


Gerrit


Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Chris Yocum
Hi Markus,

> try
> 
> for $title in collection('edil_target/eDIL-A.xml')//entry//title
> where not($title/@target = $biblStruct) and not($title/@target = $biblFull)
> return $title

Thank you for the quick reply at such a late hour!  Sadly, I am
getting the same results.  For these results, I can open the files and
find the targets that are being returned in one of the two lists.

All the best,
Chris


Re: [basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Markus Wittenberg

Hi Chris,

try

for $title in collection('edil_target/eDIL-A.xml')//entry//title
where not($title/@target = $biblStruct) and not($title/@target = $biblFull)
return $title
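
The difference between the two forms comes from how general comparisons treat sequences: != is true as soon as any pair of values differs, while not(... = ...) is true only if no pair matches. A small sketch with made-up values:

```xquery
(: Sketch only: the IDs and targets are hypothetical sample values. :)
let $ids     := ('a1', 'a2', 'a3')
let $targets := ('a1', 'zz')
for $target in $targets
return
  <check target="{ $target }"
         ne="{ $target != $ids }"
         not-in="{ not($target = $ids) }"/>
(: 'a1': ne is true (it differs from 'a2'), not-in is false :)
(: 'zz': ne is true, not-in is true :)
```

With more than one ID in the list, the != test is true for every target, which is why the original query returns all titles.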
 


Best regards,
  Markus



[basex-talk] Compare List Membership in XQuery

2019-03-25 Thread Chris Yocum
Hello,

I have a question that is more about XQuery than about BaseX
specifically, but I thought I would pose it here to see if anyone knows.

The basic problem statement is: given a list of tags that have an id
attribute and a list of target attributes which should correspond to
those ids, are there any target attributes which reference ids which do
not exist?

Basically, I am trying to link tags called biblFull or biblStruct,
which have an id attribute, with tags named title, whose target
attribute should match the id of one of those two tags.  I have
formulated the query below, which I believe *should* give me all the
title tags whose targets do not exist in either the biblFull or the
biblStruct id list.  I am not getting the results that I am expecting.

let $biblFull := distinct-values(
  collection('edil_target/Prologue Merged 2013.xml')//biblFull/@id)
let $biblStruct := distinct-values(
  collection('edil_target/Prologue Merged 2013.xml')//biblStruct/@id)
for $title in collection('edil_target/eDIL-A.xml')//entry//title
where $title/@target != $biblStruct or $title/@target != $biblFull
return $title

I am basically getting all the title tags back, which is not what I am
expecting at all.  Can anyone shed any light on this?

Thank you very much in advance!

All the best,
Chris


Re: [basex-talk] RESTXQ, multipart/form-data, out of memory saving to file

2019-03-25 Thread Christian Grün
Hi James,

Thanks for your persistence. Your observation (4 GB assigned, failing
with 350 MB) made me think, and indeed the difference between handling
raw post and map data was caused by an internal info output generation
of map structures that does not contribute to the eventual result.
With the latest snapshot [1], you should be able to upload and save
your file as requested!
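
For reference, a sketch of saving every uploaded form file, following the pattern of the quoted function. The module namespace URI, the /upload path and the use of the temporary directory are assumptions for illustration, not details from this thread:

```xquery
(: Sketch only: namespace URI, path and target directory are hypothetical. :)
module namespace _ = 'http://example.org/upload';

declare
  %rest:POST
  %rest:path("/upload")
  %rest:form-param("zip", "{$files}")
function _:upload($files) {
  (: $files maps each uploaded file name to its binary content :)
  for $name in map:keys($files)
  let $path := file:temp-dir() || $name
  return (
    file:write-binary($path, $files($name)),
    <saved name="{ $name }" size="{ file:size($path) }"/>
  )
};
```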

Cheers,
Christian

[1] http://files.basex.org/releases/latest/




On Sat, Mar 23, 2019 at 1:02 PM James Ball wrote:
>
> Hi Christian,
>
> Thank you for your ideas - lots for me to consider.
>
> > increase the memory that’s assigned to the JVM.
>
> I’ve currently assigned it 4GB and the file that’s failing is 350MB so I 
> don’t think that’s going to be an easy fix.
>
> > You could try to interpret the incoming POST data with XQuery, but
> > then you might also struggle with memory constraints.
>
> What’s puzzling me is that the ‘upload’ part works - the data gets added to 
> the variables and the RESTXQ function is called.
>
> The following does NOT cause a memory issue:
>
> declare
>   %rest:POST("{$data}")
>   %rest:path("/test2.htm")
>   %rest:form-param("zip", "{$files}")
> function _:test($data, $files) {
>   file:write-binary("the path", $data)
> };
>
> But the file I get isn’t valid.
>
> The following DOES cause the issue with larger files:
>
> declare
>   %rest:POST("{$data}")
>   %rest:path("/test2.htm")
>   %rest:form-param("zip", "{$files}")
> function _:test($data, $files) {
>   file:write-binary("the path", $files(map:keys($files)[1]))
> };
>
> But the file I get is valid.
>
> Perhaps the issue isn’t anything to do with RESTXQ and is a limitation on the 
> size of xs:base64Binary from a map that can be written to a file?
>
> > You could try to interpret the incoming POST data with XQuery
>
>
> I was trying to read the source code to understand what processes happen to 
> convert the POST data to a map.
>
> I don’t understand what format or encoding the $data variable has that makes 
> the file different to the one I get from $files.
>
> Because if I can get $data -> run decoding -> save to file that would work 
> perfectly for me.
>
> I shall keep looking but any pointers in the right direction are much 
> appreciated.
>
> Thank you for your continued help.
>
> Regards, James
>


Re: [basex-talk] Options for ft:search

2019-03-25 Thread Christian Grün
Hi Sebastian,

Sorry for keeping you waiting (lots to do).

> > I'd like to request a feature concerning the ft:search method: An
> > `lserror` option that works exactly like the global option `LSERROR`.
> > It would be nice if we could set the maximum Levenshtein distance
> > specifically for each fuzzy search without having to adjust the global
> > option.

Good idea. As fuzzy searching is a non-standard feature, there is
currently no syntactical construct to define LSERROR via "contains
text". I think it would make sense to first extend the standard
syntax, and provide an ft:search option for lserror just after that.

We could extend the syntax as follows:

  "A" contains text "B" using fuzzy 3 errors

I’ve added an issue for that [1].

> > Another question on ft:search: Is there a reason why it doesn't have a
> > `case` option just like ft:contains has one?

The reason is that all your data will be indexed with static options
for case, diacritics, etc. If case has been considered while building
the index store, and if you search for "A", it won’t return hits for
"a".

Things are different for ft:contains, as it is not based on the index,
and will always tokenize the given input on the fly.
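
Because ft:contains works on the input directly, it can be tried without any database. A minimal sketch with a made-up input string:

```xquery
(: Sketch only: the input string is a hypothetical sample. :)
let $text := 'BaseX Full-Text'
return (
  (: default options: matching is not case-sensitive :)
  ft:contains($text, 'basex'),
  (: same search term, but with case-sensitive matching requested :)
  ft:contains($text, 'basex', map { 'case': 'sensitive' })
)
```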

If you decide to ignore case in the index, you can post-process your
index results as follows:

  let $query := 'search-term'
  for $result in ft:search('db', $query)
  where ft:contains($result, $query, map {
    'case': 'sensitive'
  })
  return $result

As you can guess, this check might take some time, so just be careful
if your query might generate lots of hits.

Best,
Christian

[1] https://github.com/BaseXdb/basex/issues/1673


Re: [basex-talk] fn:serialize#2 serialization different in BaseX and Saxon

2019-03-25 Thread Christian Grün
Hi Andreas,

  let $asElement := 
  return serialize($asElement,map{"method":"text"})

> * why is the result to the very last function empty?

It is empty because the text output method will only output the string
value of your element, which is an empty string (i.e., the same you
get when wrapping your element with fn:string; see [1] for more
details).

> * how can I convert an XML node to an entity escaped string, _without_
> passing it in as string, but as element() or node()?

In Saxon and BaseX, the "xml" output method is used as default for
XQuery functions (fn:serialize, file:write, etc). In Saxon, it’s also
used as default for serializing the final result of a query. Early
versions of BaseX used the same default. When the "adaptive" method
was introduced, we wanted to switch over to this method (because it
seemed to be more appropriate for the use cases of most BaseX users).
As this method turned out to be mostly suitable for debugging data, we
eventually introduced our own serialization method, called "basex",
which serializes XML as XML, strings as strings, binary data as a byte
stream, and so on [2].

You can enforce an identical serialization behavior by defining the
xml output method in the query prolog…

declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
declare option output:method 'xml';
serialize()

…or (if you want it globally) by setting the BaseX SERIALIZER option
to "method=xml". In both cases,

1. The fn:serialize function call will convert your element to a string.
2. The output option will ensure that the result of your query will be
output according to the rules of the "xml" output method.

> what Saxon does (though I still wonder, how the Saxon people
> then get an unescaped string of XML, as BaseX does).

A serialize($element) function call combined with the "text" output
method will be the appropriate choice.
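
A minimal sketch of that combination. The element is a made-up stand-in, since the original one is not shown here:

```xquery
(: Sketch only: the sample element is hypothetical. :)
declare namespace output = 'http://www.w3.org/2010/xslt-xquery-serialization';
declare option output:method 'text';

let $element := <p>1 &lt; 2</p>
(: fn:serialize turns the element into a string of XML markup; :)
(: the "text" output method then writes that string without escaping it again :)
return serialize($element)
```

With the "xml" output method in the prolog instead, the angle brackets and the ampersand of the serialized string would be escaped a second time in the final output.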

Hope this helps,
Christian

[1] https://www.w3.org/TR/xslt-xquery-serialization-31/#text-output
[2] http://docs.basex.org/wiki/XQuery_Extensions#Serialization