[basex-talk] Error from function-name(request:query#0): only from RESTXQ, not REST context.

2019-08-06 Thread Majewski, Steven Dennis (sdm7g)

I’ve been learning how to use the BaseX webapp, and below is one of the first 
test scripts I wrote for the REST interface.
This script runs without problems. 


import module namespace request = "http://exquery.org/ns/request";

declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";
declare option output:omit-xml-declaration "no";
declare option output:method "xhtml";
(: http://docs.basex.org/wiki/XQuery_3.0#Serialization  :)

(: 
declare variable $contains := 
try { request:parameter( 'contains' ) }
catch basex:http { '' }; 

 Simpler to use implicit assignment: 
http://docs.basex.org/wiki/REST#Assigning_Variables
"All query parameters that have not been processed before will be treated as 
variable assignments" 
:)

declare variable $contains as xs:string external := "";
declare function local:f0line( $f ) as node() {
  (: build a list item with function name and function values :)I
  {fn:function-name($f)}:{$f()}
};

declare variable $CTX := try { . }  catch err:XPDY0002 { () }; 
declare variable $DB := try { db:name( $CTX[1] ) } catch err:XPTY0004 { "" };
(: Could be, if ():  [XPTY0004] node() expected, empty sequence found. 
   but not sure if that's the only possible case.  :)

test

Database: { $DB }
Context documents (.)

{ for $doc in $CTX return {base-uri($doc)} }





HTTP Request

  { for $f in ( request:method#0, request:scheme#0, request:hostname#0, 
request:port#0,
   request:path#0, request:query#0, request:uri#0, 
request:context-path#0, request:address#0, 
   request:remote-hostname#0, request:remote-address#0, 
request:remote-port#0 ) 
  return local:f0line($f) } 



Request Parameters:

{ for $param in request:parameter-names() return 
{$param}:{request:parameter($param)} }



Headers

{ for $header in request:header-names() return 
  {$header}:{request:header($header)} }



Cookies

 { for $cookie in request:cookie-names() return 
  {$cookie}:{request:cookie($cookie)} }



XSLT

xslt:processor:  { xslt:processor()},  xslt:version: { 
xslt:version()}





I’m now trying to learn RESTXQ programming, so I tried to do the same 
sort of thing from a RESTXQ module. 

I was getting an error from the line where (: request:query#0, :) is now 
commented out, so I added the line:

{ request:query( ), function-name(request:query#0)}

What I see is that it is the latter expression, 
function-name(request:query#0), that generates the error, but only when there 
is no query string in my request. An empty query string, e.g. 
“http://localhost/basex/test?”, works. 
But “http://localhost/basex/test” returns:

Stopped at /usr/local/tomcat/webapps/basex/test.xqm, 33/66:
[XPTY0004] Cannot convert empty-sequence() to xs:string: ().

I assume this has something to do with how RESTXQ processes form and request 
parameters, and I probably don’t actually need to do it this way, but I was 
wondering what exactly is the source of the problem with doing something like 
this, or if there is an obvious workaround. Maybe writing another local 
function to wrap around request:query and using that in the loop. 

(One reason I chose that test to convert to RESTXQ was that it wasn’t obvious 
to me from the docs whether I could mix those different access methods, so 
what was surprising was that it mostly worked: only that one function failed, 
so far.) 


module namespace test = 'http://localhost/test';

import module namespace request = "http://exquery.org/ns/request";


declare function test:f0line( $f ) as node() {
  (: build a list item with function name and function values :)
  {fn:function-name($f)}:{$f()}
};

(:~
 : Generates a welcome page.
 : @return HTML page
 :)
declare
  %rest:path("test")
  %output:method("xhtml")
  %output:omit-xml-declaration("no")
  %output:doctype-public("-//W3C//DTD XHTML 1.0 Transitional//EN")
  
  %output:doctype-system("http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd")
function test:test(
) as element(Q{http://www.w3.org/1999/xhtml}html) {
  http://www.w3.org/1999/xhtml;>

  BaseX HTTP Services
  


  
  TEST

  
{ request:query( ), function-name(request:query#0)}
  


HTTP Request

  { for $f in ( request:method#0, request:scheme#0, 
request:hostname#0, request:port#0,
   request:path#0, (: request:query#0, :) request:uri#0, 
request:context-path#0, request:address#0,
   request:remote-hostname#0, request:remote-address#0, 
request:remote-port#0 )
  return test:f0line($f) }



  
};








Re: [basex-talk] Coding help

2019-08-06 Thread Majewski, Steven Dennis (sdm7g)

Creating a bunch of temporary databases that you’re going to delete doesn’t 
sound like the most efficient way to process this data. 
But it is hard to tell what alternative to recommend without more info about 
what your desired end result is. 

Is this something that you’re going to do once, or do repeatedly for different 
strings ? 

Taking your description literally, I would use ‘grep -l’ to generate a list of 
files containing the specific string, and either feed that list into ‘cat’ or 
else use it to build a database of a subset of files for further investigation. 

There are also other ways to filter the data through a stream to select a 
subset if that is what you want to do. 

But if you’re going to do this repeatedly for different subsets, then it might 
make more sense to try to get everything parsed and indexed into the database 
once. If it really is too large for a single database after adjusting the Java 
memory parameters in the BaseX scripts, you could try sharding the data into 
several databases, repeating the search on each one, and concatenating the 
results. 
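
As a rough sketch of the ‘grep -l’ approach, assuming the documents are *.xml 
files under one directory tree and the search term is a fixed string rather 
than a regular expression. The function name list_matching_files and its 
arguments are made up for illustration. It goes through find with 
-exec ... {} + so that grep is invoked in batches, since a plain shell glob 
over millions of files may exceed the argument-length limit.

```shell
#!/bin/sh
# Hypothetical helper (not part of BaseX): list_matching_files DIR STRING
# Prints the path of every *.xml file under DIR that contains STRING.
list_matching_files() {
  dir=$1
  needle=$2
  # grep -l prints only the names of matching files;
  # -F treats the needle as a fixed string, not a regular expression;
  # "--" keeps a needle that starts with "-" from being read as an option.
  # find batches the file names so the argument list never gets too long.
  find "$dir" -type f -name '*.xml' -exec grep -lF -- "$needle" {} +
}
```

For example, list_matching_files /path/to/xml 'search string' > hits.txt would 
write one matching path per line to hits.txt (the directory and the string 
here are placeholders). That list can then be used to select the subset of 
files to load.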




> On Aug 5, 2019, at 2:41 AM, Greg Kawchuk  wrote:
> 
> Hi everyone,
> I'm wondering if someone could provide what I think is a brief script for a 
> scientific project to do the following. 
> The goal is to identify XML documents from a very large collection that would 
> be too big to load into a database all at once.
> 
> Here is how I see the functions provided by the code. 
> 1. In the script, the user could enter the path of the target folder (with 
> millions of XML documents).
> 2. In the script, the user would enter the number of documents to load into a 
> database at a given time (i =. 1,000) depending on memory limitations.
> 3. The code would then create a temporary database from the first (i) xml 
> files in the target folder.
> 4. The code would then search the 1000 xml documents in the database for a 
> pre-defined text string.
> 5. If hits exist for the text string, the code would write those documents to 
> a unique XML file.
> 6. Clear the database.
> 7. Read in the next 1000 files (or remaining files in the folder).
> 8. Return to #4.
> 
> There would be no need to append XML files in step 5. The resulting XML files 
> could be concatenated afterwards. 
> Thank you in advance. If you have any questions, please feel free to email me 
> here. 
> Greg
> 
> ***
> Greg Kawchuk BSC, DC, MSc, PhD.
> Professor, Faculty of Rehabilitation Medicine
> University of Alberta
> greg.kawc...@ualberta.ca 
> 780-492-6891





Re: [basex-talk] Database corrupted on updating with RestXQ

2019-08-06 Thread Christian Grün
Hi Marco,

I think that the bug fix (which is still on my todo list) will be made
available with BaseX 9.3; so, for now, it’s probably better to choose
the workaround.

Cheers,
Christian



On Sat, Aug 3, 2019 at 10:02 AM Marco Lettere  wrote:
>
> Hi Christian,
> I'm currently preparing a deployment based on Docker for one of our
> customers. Here in Italy it's holiday time in August so I have a bit of
> time and I can cope with your suggested workaround for the moment.
> Just a question to be prepared ... are intermediate bug-fix releases
> also available in the form of docker containers?
> Thanks for your support as usual.
> Regards,
> Marco.
>
> On 02/08/19 15:04, Christian Grün wrote:
> > Hi Marco,
> >
> > A little update on the status quo:
> >
> > • I noticed that the bug you observed was basically exposed by the
> > previous bug fix (it existed before, but it didn’t occur in your
> > particular setting).
> > • It happens only in certain cases (i.e., for specific document sizes)
> > and if the added/replaced document has namespaces.
> > • Another good thing: It happens only if your database is empty.
> >
> > It’s not exactly an elegant proposal, but as long as we haven’t fixed
> > the bug, you could add an initial  document to your database.
> >
> > We are doing our best, though.
> > Christian
> >
> >
> >
> > On Wed, Jul 31, 2019 at 10:20 AM Christian Grün
> >  wrote:
> >> Marco, thanks for the attachment. I have created a script that
> >> triggers the error [1]. Most probably, the bug is related to the
> >> previous bug issue. We’ll have a look at this soon. – Christian
> >>
> >> [1] https://github.com/BaseXdb/basex/issues/1711
> >>
> >>
> >>
> >> On Sat, Jul 27, 2019 at 4:02 PM Marco Lettere  wrote:
> >>> Thanks Christian,
> >>> I haven't been able to reproduce the bug with your SSCE.
> >>> Nevertheless I spent some time in tracing the operations with the actual
> >>> data I'm currently storing into the DB and a sequence of
> >>> db:add/db:replace close enough to actual app flow.
> >>> I attach the query script which should be rather self explaining.
> >>> You may run the script changing always $f with $f + 1 starting (from 0).
> >>> By uncommenting the different configurations of the variable $input you
> >>> can see how the misbehaviour depends somehow on the data because only
> >>> $d8 is causing the crash.
> >>> I stared at the diffs from $d8 to the other but there really isn't any
> >>> significant difference so now it's very hard for me to understand.
> >>>
> >>> Thank you in advance for all support that you can provide as usual.
> >>> Regards,
> >>> Marco.
> >>>
> >>> On 26/07/19 18:41, Christian Grün wrote:
> >>>> Hi Marco,
> >>>>
> >>>> Back in February, a user reported a database inconsistency [1] – which
> >>>> happened, too, if a database was nearly empty – and we managed to
> >>>> build a little test case to get this reproduced [2]. After that, 9.2
> >>>> was released. Maybe this fix introduced another irregularity for this
> >>>> special case?
> >>>>
> >>>> Maybe we can get this reproduced by taking the script from issue 1662
> >>>> and modifying it a little? That would be great.
> >>>>
> >>>> Sorry for the inconvenience,
> >>>> Christian
> >>>>
> >>>> PS: You can safely ignore the "Creating database..." output. It may
> >>>> just indicate that an XML document is parsed, and that an internal
> >>>> main-memory database instance is created, possibly for your users.xml
> >>>> file in the data directory.
> >>>>
> >>>> [1] https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg11459.html
> >>>> [2] https://github.com/BaseXdb/basex/issues/1662
> >>>>
> >>>>
> >>>>
> >>>> On Fri, Jul 26, 2019 at 4:29 PM Marco Lettere wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> starting with 9.2.1 we experience a strange error with a RESTXQ API of
> >>>>> ours that we have been using for years.
> >>>>>
> >>>>> The typical pattern is to look up a document, update it, and store it
> >>>>> back.
> >>>>>
> >>>>> We have done this millions of times and also all the tests work neatly.
> >>>>>
> >>>>> But when using it from inside the web application, at around the tenth
> >>>>> operation we get the DB corrupted with the stack trace [1]. This seems
> >>>>> to happen only when the database is nearly empty and, for sure, it
> >>>>> doesn't appear on 8.x series.
> >>>>>
> >>>>> It feels like some concurrency flaw is introduced somewhere but it's
> >>>>> hard to say because the API is eight years in age and it never changed
> >>>>> significantly.
> >>>>>
> >>>>> I know it's hard without an SSCE... but while working to isolate this
> >>>>> byzantine behaviour (I've been trying for days now) maybe someone in the
> >>>>> list has a clue or is able to interpret the stack trace?
> >>>>>
> >>>>> Thanks and have a nice weekend,
> >>>>>
> >>>>> Marco.
> >>>>>
> >>>>>
> >>>>> [1]
> >>>>>
> >>>>> Improper use? Potential bug? Your feedback is welcome:
> >>>>> Contact: basex-talk@mailman.uni-konstanz.de
> >>>>> Version: