Re: [basex-talk] DBA - display the number of results from a query
Hi James,

> I’m finding the web DBA functionality of BaseX more and more useful but one thing I really miss from the GUI is the display of the number of results in the top right corner. Many times when I’m running a quick query I want to know how many results there are and see at least an example of the results (as often the full set is too long to be displayed). I know I could add count() round the query and run it again but I’m lazy. :)

Yes, this sounds like a useful extension! And I also agree it's not quite obvious how best to implement this with the current architecture.

In fact, the cleanest solution would be to run all queries twice: after having evaluated the query, a second call could be sent to the server that computes the number of results. This could be done by introducing a new util:query() function, which wraps the xquery:eval() call with fn:count() and does not use fn:serialize(). Obviously, this function only makes sense with non-updating queries, but as the editor forces the user to choose between read-only and updating queries, we don't need to put much effort into that.

In various cases, the count may take much longer than the actual query evaluation, which is stopped after a maximum number of bytes has been serialized. A simple (artificial) example query for that is:

  count((1 to 1) ! <a/>)

However, if the count is always triggered after the original query, it simply means that the number of results will be displayed some time after the query result, or (if it really takes too long) the count will be interrupted after the timeout. And if a new query is triggered in between, the result of the counting query could be ignored.

> What’s the best way of getting the count though? The naive solution is to run each query twice, once to get the count and once to get the results but that seems less than ideal. I think there’s something with caching of queries that may help. Is that how it’s done in the GUI?

The GUI code for computing and visualizing the results is pretty complex, as it includes all kinds of optimizations to speed up processing. The query results will first be stored in a value and only serialized in a second step inside the Result View (if currently visible). We could indeed try something similar, but I assume it would take quite some time to get it robust and fast enough.

> Any pointers gratefully received and of course I’ll share any code back to the project.

Always looking forward, thanks,
Christian
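The "run it a second time, wrapped in a count" idea can be sketched in plain XQuery. This is only an illustration under assumptions: local:count-results is a hypothetical helper (it is not the proposed util:query() function), it relies on the existing xquery:eval() function of BaseX, and the simple string wrapping only works for non-updating queries that have no prolog.

```xquery
(: Hypothetical helper: evaluates the query string a second time,
   wrapped in count(), so the result itself is never serialized. :)
declare function local:count-results($query as xs:string) as xs:integer {
  xquery:eval('count((' || $query || '))')
};

(: Counts the ten elements created by the inner query. :)
local:count-results('(1 to 10) ! <a/>')
```

A real implementation would still have to deal with timeouts and with discarding a count that arrives after the user has already started a new query, as described above.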
Re: [basex-talk] Behaviour of doc() changed?
Hi Marco,

> when using doc() we used to pass into it a relative (to the static base uri) filesystem path to an xml file. This was parsed without any problems. Now (basex 8.2.2) the parsing works but we get a message on the standard output stating that the database (named as the first path element) does not exist.

Could you write down the single steps for reproducing this? What I did was:

* creating a folder 'dir' on my desktop,
* creating a well-formed XML file 'file.xml' in the 'dir' folder, and
* running the following query from the desktop:

  basex "doc('dir/file.xml')"

I got the contents of file.xml as result without any error message.

> I understand that the doc function first looks up the database server and then accesses the filesystem. Is it like that?

Exactly. It's described somewhere in the last paragraph of [1] in the documentation (...not easy to find).

Cheers,
Christian

[1] http://docs.basex.org/wiki/Databases#XML_Documents
Re: [basex-talk] Reporting function for a subset
Hi Menashè,

> Is it possible?

Yes, it should be, because you can do nearly everything in XQuery. However, I must confess I have no idea how to help you right now. What about the reporting function, does it already exist? What is a subset: is it a sequence of XML nodes resulting from a path expression? Could you possibly provide us with some code you have written so far?

Christian

On Mon, Jul 13, 2015 at 5:33 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:
> Hello,
> I want to call a reporting function with a subset of documents created inside a loop, which will return an XML report. I couldn't find information on whether and how it can be done. The idea is that there is a main query which results in a subset. Then I want to do different processing only on this subset, without querying the whole collection again with the filters, as I've already done that in the main query. The all-in-one query will actually return, inside my own XML, the results of the different functions.
> --
> With kind regards,
> Menashè
Re: [basex-talk] Reporting function for a subset
Hi,

The initial part of the code should be modified, so here is only the essence of one of the pivoting reports:

  for $singleDataType in $dataType
  for $singleDevice in $device
  for $singleAvailability in $availability
  for $singleCountry in $country
  for $singleParameter in $parameter
  group by $singleDataType, $singleDevice, $singleAvailability, $singleCountry, $singleParameter
  order by $singleDataType, $singleDevice, $singleAvailability, $singleCountry, $singleParameter
  return <Row DatasetType="{$singleDataType}" Instrument="{$singleDevice}" Availability="{$singleAvailability}" Country="{$singleCountry}" Parameter="{$singleParameter}" NumberOfRecords="{count($current-pre)}"/>

Another report will count all records with one condition fewer:

  group by $singleDataType, $singleDevice, $singleAvailability, $singleCountry
  order by $singleDataType, $singleDevice, $singleAvailability, $singleCountry
  return <Row DatasetType="{$singleDataType}" Instrument="{$singleDevice}" Availability="{$singleAvailability}" Country="{$singleCountry}" Parameter="Any" NumberOfRecords="{count($current-pre)}"/>

I hope it's clear.

With kind regards,
Menashè
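For readers who want to try the pivoting pattern without the original data, here is a minimal, self-contained sketch. The record elements and their attributes are invented for illustration; only the group by / order by / count() structure mirrors the report above.

```xquery
(: Made-up sample records; the real data comes from the CDI database. :)
let $records := (
  <record type="CTD"    device="probe"  country="IT"/>,
  <record type="CTD"    device="probe"  country="IT"/>,
  <record type="CTD"    device="bottle" country="FR"/>,
  <record type="Bottle" device="bottle" country="IT"/>
)
for $r in $records
(: Grouping keys must be single atomic values, hence string(). :)
let $type    := string($r/@type)
let $device  := string($r/@device)
let $country := string($r/@country)
group by $type, $device, $country
order by $type, $device, $country
(: After grouping, $r holds all records of the current group. :)
return <Row DatasetType="{$type}" Instrument="{$device}" Country="{$country}" NumberOfRecords="{count($r)}"/>
```

Dropping a key from the group by and order by clauses gives the second kind of report, in which that dimension is reported as "Any".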
Re: [basex-talk] Slow query
> Should geo:within of http://docs.basex.org/wiki/Geo_Module help?

The functions of the Geo Module don't use any index structures, so I am afraid they won't speed up the query.

One more idea: you could convert all latitudes and longitudes to strings with a fixed number of digits:

  (:~ Allowed range. :)
  declare variable $RANGE := 999999;
  (:~ Minimum latitude. :)
  declare variable $LAT-MIN := -90;
  (:~ Maximum latitude. :)
  declare variable $LAT-MAX := 90;

  (:~
   : Converts a double value to a normalized string value
   : with a fixed size of digits.
   : @param $num number to be converted
   : @param $min minimum allowed value
   : @param $max maximum allowed value
   : @return resulting value
   :)
  declare function local:normalize(
    $num as xs:double,
    $min as xs:integer,
    $max as xs:integer
  ) {
    let $norm := $RANGE * ($num - $min) div ($max - $min)
    return format-number($norm, '000000')
  };

  (: Run code for various latitude values :)
  for $latitude in (-90, -89., -13.345, 0, 89.9)
  return local:normalize($latitude, $LAT-MIN, $LAT-MAX)

Next, you could do string comparisons on these values:

  for $doc in db:open("CDI")
  let $lat := $doc//latitude
  let $lon := $doc//longitude
  where $lat >= "883387" and $lat <= "893463"
    and $lon >= "173467" and $lon <= "178745"
  return db:node-pre($doc)

It should be fast enough if the maximum value is not much bigger than the minimum value.
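The reason fixed-width strings work here is that, once every value has the same number of digits, string order coincides with numeric order. The following sketch illustrates this; the six-digit width is an assumption taken from the six-digit values in the comparison above, and the sample latitudes are made up.

```xquery
(: Assumed scale: six digits, i.e. values from 000000 to 999999. :)
declare variable $RANGE := 999999;

declare function local:normalize(
  $num as xs:double,
  $min as xs:integer,
  $max as xs:integer
) as xs:string {
  format-number($RANGE * ($num - $min) div ($max - $min), '000000')
};

(: Bounding values, normalized once. :)
let $south := local:normalize(-67.81, -90, 90)
let $north := local:normalize(46.733, -90, 90)
for $lat in (-80, -13.345, 0, 60)
let $key := local:normalize($lat, -90, 90)
(: 'ge' and 'le' compare the normalized strings, not numbers. :)
return $lat || ': ' || ($key ge $south and $key le $north)
```

Only the two middle latitudes fall inside the range. In a database, the normalized strings would be stored in the documents, so that the comparison operates on stored text values rather than on numbers computed at query time.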
Re: [basex-talk] Reporting function for a subset
I'm sorry, but it's not clear how $nodes can include the result of my main query:

  xquery version "3.0";
  declare option output:item-separator ",";
  let $db := db:open("CDI")
  for $x in $db
  let $beginPosition := $x//startTime
  let $lon := xs:float($x//longitudine)
  let $lat := xs:float($x//latitudine)
  where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
    and $lat <= 46.733 and $lat >= -67.81
    and $lon <= 72.7006667 and $lon >= -79.967
  return $x

With kind regards,
Menashè

On 07/14/2015 12:51 PM, Christian Grün wrote:
> E.g. like that:
>
>   let $count := function($nodes) { count($nodes) }
>   let $nodes := (<a/>, <b/>)
>   return $count($nodes)

On Tue, Jul 14, 2015 at 12:41 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:
> Thank you, but would you please show me how to pass (only once) for each function the xml sequence which results from my main query, instead of simple numbers as in your example?
> With kind regards,
> Menashè

On 07/14/2015 12:30 PM, Christian Grün wrote:
> > I hope it's clear.
>
> Sorry, I'm still confuzzled. What is the problem? I guess you want to define different, exchangeable reporting functions for more or less the same input (dataType, device, ...)?
>
> Here is one way to define functions and call them in a second step:
>
>   let $add := function($a, $b) { $a + $b }
>   let $multiply := function($a, $b) { $a * $b }
>   for $function in ($add, $multiply)
>   return $function(3, 5)
>
> Instead of $add and $multiply, you could have $report-pivoting and $report-count.

On Tue, Jul 14, 2015 at 11:40 AM, Menashè Eliezer melie...@ogs.trieste.it wrote:
> Hi, The initial part of the code should be modified, so here is only the essence of one of the pivoting reports: [...]
Re: [basex-talk] Reporting function for a subset
Thank you, but would you please show me how to pass (only once) for each function the XML sequence which results from my main query, instead of simple numbers as in your example?

With kind regards,
Menashè
Re: [basex-talk] Behaviour of doc() changed?
Hi Christian,

you're right, the usual mistake of sending emails in the evening. :-D

This morning I set up an example that reproduces the strange issue. I added this function to restxq.xqm:

  declare
    %rest:path("/doc/{$dir}")
    %rest:GET
    function page:getdoc($dir as xs:string) {
      doc($dir || "/doc.xml")
    };

Then I created the file doc.xml containing <a/> in the folder webapp/tmp of BaseX. I invoked the function from my browser and correctly got <a/> back. But the log reports the exception attached at the end. If I use the absolute path name, the exception disappears.

It's minor, but it caused a lot of headache because we were chasing a bug and thought that this exception was the issue (while it apparently was not, at the end of the day).

Hope it helps.
Ciao,
Marco.

The exception:

org.basex.core.BaseXException: Database 'tmp' was not found.
  at org.basex.core.cmd.Open.open(Open.java:92)
  at org.basex.query.QueryResources.open(QueryResources.java:357)
  at org.basex.query.QueryResources.doc(QueryResources.java:173)
  at org.basex.query.func.fn.Docs.doc(Docs.java:54)
  at org.basex.query.func.fn.FnDoc.item(FnDoc.java:16)
  at org.basex.query.func.StandardFunc.optimize(StandardFunc.java:81)
  at org.basex.query.expr.Arr.inline(Arr.java:64)
  at org.basex.query.expr.gflwor.GFLWOR.inline(GFLWOR.java:727)
  at org.basex.query.expr.gflwor.GFLWOR.inlineLets(GFLWOR.java:379)
  at org.basex.query.expr.gflwor.GFLWOR.optimize(GFLWOR.java:148)
  at org.basex.query.func.StaticFunc.inlineExpr(StaticFunc.java:283)
  at org.basex.query.func.StaticFuncCall.compile(StaticFuncCall.java:71)
  at org.basex.query.MainModule.compile(MainModule.java:74)
  at org.basex.query.QueryCompiler.compile(QueryCompiler.java:113)
  at org.basex.query.QueryCompiler.compile(QueryCompiler.java:104)
  at org.basex.query.QueryContext.analyze(QueryContext.java:330)
  at org.basex.query.QueryContext.compile(QueryContext.java:319)
  at org.basex.query.QueryContext.iter(QueryContext.java:345)
  at org.basex.http.restxq.RestXqResponse.create(RestXqResponse.java:55)
  at org.basex.http.restxq.RestXqModule.process(RestXqModule.java:101)
  at org.basex.http.restxq.RestXqFunction.process(RestXqFunction.java:109)
  at org.basex.http.restxq.RestXqServlet.run(RestXqServlet.java:44)
  at org.basex.http.BaseXServlet.service(BaseXServlet.java:64)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
  at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1086)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:429)
  at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
  at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1020)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
  at org.eclipse.jetty.server.Server.handle(Server.java:370)
  at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:494)
  at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:971)
  at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1033)
  at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
  at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
  at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
  at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:696)
  at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:53)
  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
  at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
  at java.lang.Thread.run(Thread.java:745)

From: Christian Grün christian.gr...@gmail.com
Sent: Tuesday, July 14, 2015 9:25 AM
To: Marco Lettere
Cc: BaseX
Subject: Re: [basex-talk] Behaviour of doc() changed?

> Hi Marco, when using doc() we used to pass into it a relative (to the static base uri) filesystem path to an xml file. This was parsed without any problems. Now (basex 8.2.2)
Re: [basex-talk] Slow query
> oops, I'm sorry. It's attached. There are text and attribute indexes.

It may be slightly faster if you remove the explicit string() conversion:

  for $x in db:open("CDI")
  let $beginPosition := $x//startTime
  where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
  return db:node-pre($x)

But please note that BaseX provides no native range index, which would be a good fit for your longitude/latitude filter.
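The date filter can stay a plain string comparison because ISO 8601 dates (YYYY-MM-DD) sort in the same order as the dates they denote. A small sketch with made-up documents (the doc elements are invented; only the startTime name is taken from the query above):

```xquery
(: Made-up sample documents with ISO 8601 start times. :)
let $docs := (
  <doc><startTime>1850-03-01</startTime></doc>,
  <doc><startTime>1990-12-24</startTime></doc>,
  <doc><startTime>2015-07-10</startTime></doc>
)
for $x in $docs
let $beginPosition := $x//startTime
(: Untyped node values are compared with the literals as strings. :)
where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
return string($beginPosition)
```

The first document is filtered out; the other two start times are returned. No cast to xs:date and no explicit string() call is needed for the comparison itself.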
Re: [basex-talk] Reporting function for a subset
E.g. like that:

  let $count := function($nodes) { count($nodes) }
  let $nodes := (<a/>, <b/>)
  return $count($nodes)

On Tue, Jul 14, 2015 at 12:41 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:
> Thank you, but would you please show me how to pass (only once) for each function the xml sequence which results from my main query, instead of simple numbers as in your example?
Re: [basex-talk] Reporting function for a subset
What about this?

  let $nodes :=
    let $db := db:open("CDI")
    for $x in $db
    let $beginPosition := $x//startTime
    let $lon := xs:float($x//longitudine)
    let $lat := xs:float($x//latitudine)
    where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
      and $lat <= 46.733 and $lat >= -67.81
      and $lon <= 72.7006667 and $lon >= -79.967
    return $x
  return ...

On Tue, Jul 14, 2015 at 12:55 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:
> I'm sorry, but it's not clear how $nodes can include the result of my main query: [...]
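Putting the two suggestions together, the main query is evaluated once, bound to $nodes, and then handed to each reporting function. The sketch below uses made-up elements instead of db:open("CDI"), and the two report functions are hypothetical stand-ins for $report-pivoting and $report-count.

```xquery
(: The filter runs once; its result is reused by every report. :)
let $nodes := (
  for $x in (<r lat="10"/>, <r lat="50"/>, <r lat="-20"/>)
  let $lat := xs:float($x/@lat)
  where $lat <= 46.733 and $lat >= -67.81
  return $x
)
(: Hypothetical reports, defined as function items. :)
let $report-count := function($n as element()*) { count($n) }
let $report-lats  := function($n as element()*) { string-join($n/@lat, ", ") }
for $report in ($report-count, $report-lats)
return $report($nodes)
```

Each function receives the same, already filtered sequence, so the filter conditions are not repeated for every report.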
Re: [basex-talk] Behaviour of doc() changed?
Hi Marco,

thanks for the details. Do you possibly have debug set to true? In that case, there is no need to worry.

Best,
Christian

On Tue, Jul 14, 2015 at 10:22 AM, Marco Lettere marco.lett...@dedalus.eu wrote:
> But the log reports the exception attached at the end. If I put the absolute path name the exception disappears.
Re: [basex-talk] Reporting function for a subset
Hi Christian,

On 07/14/2015 09:30 AM, Christian Grün wrote:
> What about the reporting function, does it already exist? What is a subset: Is it a sequence of XML nodes resulting from a path expression? Could you possibly provide us with some code you have written so far?
> Christian

I mean a sequence of XML nodes resulting from a 'for' loop. Then there are queries that, for the same filters (or with a single extra condition), return different XML results. For example: group by and a time distribution table per year and month.

With kind regards,
Menashè
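A time distribution table per year and month can be produced with the same group by mechanism. This is only a sketch: the record elements are made up, and it assumes that startTime holds a plain xs:date value.

```xquery
(: Made-up sample records; startTime is assumed to be an ISO date. :)
let $records := (
  <record><startTime>2014-11-03</startTime></record>,
  <record><startTime>2014-11-20</startTime></record>,
  <record><startTime>2015-07-10</startTime></record>
)
for $r in $records
let $date  := xs:date($r/startTime)
let $year  := year-from-date($date)
let $month := month-from-date($date)
group by $year, $month
order by $year, $month
(: One row per year and month, with the number of records in it. :)
return <Row Year="{$year}" Month="{$month}" NumberOfRecords="{count($r)}"/>
```

If the values also carry a time component, xs:dateTime() with year-from-dateTime() and month-from-dateTime() would be the corresponding functions.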
[basex-talk] BaseX 8.2.3
Dear all, Just one week passed, and a new version of BaseX is available. It provides a fix for a minor problem with the DBA code and some namespace issues. All the best, Christian
Re: [basex-talk] Slow query
Hi,

It sounds like a great idea, and I can also apply it to the date comparisons, but unfortunately the new query is much slower. Please see the attached log.

With kind regards,
Menashè

On 07/14/2015 12:50 PM, Christian Grün wrote:
> One more idea: you could convert all latitudes and longitudes to strings with a fixed number of digits [...]
> Next, you could do string comparisons on these values [...]
> It should be fast enough if the maximum value is not much bigger than the minimum value.

Compiling: - inlining $norm_3 - simplifying flwor expression - pre-evaluating -90 - pre-evaluating -180 - pre-evaluating db:open(CDI) - inlining local:normalize#3 - removing redundant $num_13 as xs:double cast. - removing redundant $min_14 as xs:integer cast. - removing redundant $max_15 as xs:integer cast.
- inlining $num_13 - inlining $min_14 - pre-evaluating (46.733 - -90) - pre-evaluating (99 * 136.733) - inlining $max_15 - pre-evaluating (90 - -90) - pre-evaluating (1.36732863267E8 div 180) - pre-evaluating format-number(759627.018149, 00) - simplifying flwor expression - pre-evaluating -67.81 - inlining local:normalize#3 - removing redundant $num_16 as xs:double cast. - removing redundant $min_17 as xs:integer cast. - removing redundant $max_18 as xs:integer cast. - inlining $num_16 - inlining $min_17 - pre-evaluating (-67.81 - -90) - pre-evaluating (99 * 22.188) - inlining $max_18 - pre-evaluating (90 - -90) - pre-evaluating (2.218997781E7 div 180) - pre-evaluating format-number(123277.6544999, 00) - simplifying flwor expression - inlining local:normalize#3 - removing redundant $num_19 as xs:double cast. - removing redundant $min_20 as xs:integer cast. - removing redundant $max_21 as xs:integer cast. - inlining $num_19 - inlining $min_20 - pre-evaluating (72.7006667 - -180) - pre-evaluating (99 * 252.7006667) - inlining $max_21 - pre-evaluating (180 - -180) - pre-evaluating (2.52700413999E8 div 360) - pre-evaluating format-number(701945.5944425925, 00) - simplifying flwor expression - pre-evaluating -79.967 - inlining local:normalize#3 - removing redundant $num_22 as xs:double cast. - removing redundant $min_23 as xs:integer cast. - removing redundant $max_24 as xs:integer cast. - inlining $num_22 - inlining $min_23 - pre-evaluating (-79.967 - -180) - pre-evaluating (99 * 100.033) - inlining $max_24 - pre-evaluating (180 - -180) - pre-evaluating (1.00033233267E8 div 360) - pre-evaluating format-number(277870.0924074075, 00) - simplifying flwor expression - rewriting descendant-or-self step(s) - rewriting descendant-or-self step(s) - inlining local:normalize#3 - removing redundant $min_26 as xs:integer cast. - removing redundant $max_27 as xs:integer cast. 
- inlining $num_25 as xs:double - inlining $min_26 - inlining $max_27 - pre-evaluating (180 - -180) - simplifying flwor expression - rewriting descendant-or-self step(s) - inlining local:normalize#3 - removing redundant $min_29 as xs:integer cast. - removing redundant $max_30 as xs:integer cast. - inlining $num_28 as xs:double - inlining $min_29 - inlining $max_30 - pre-evaluating (90 - -90) - simplifying flwor expression - rewriting ($beginPosition_10 = 1889-01-01) - rewriting ($beginPosition_10 = 2015-07-10) - atomic evaluation of ($lat_12 = $north_5) - atomic evaluation of ($lat_12 = $south_6) - atomic evaluation of ($lon_11 = $east_7) - atomic evaluation of ($lon_11 = $west_8) - rewriting (1889-01-01 = $beginPosition_10 and $beginPosition_10 = 2015-07-10 and ($lat_12 = $north_5) and ($lat_12 = $south_6) and ($lon_11 = $east_7) and ($lon_11 = $west_8)) - inlining $db_4 - inlining $north_5 - rewriting ($lat_12 = 759627) - inlining $south_6 - rewriting ($lat_12 = 123278) -
Re: [basex-talk] Reporting function for a subset
Thank you for the helpful ideas!

With kind regards,
Menashè

On 07/14/2015 12:56 PM, Christian Grün wrote:

What about this?

    let $nodes :=
      let $db := db:open("CDI")
      for $x in $db
      let $beginPosition := $x//startTime
      let $lon := xs:float($x//longitudine)
      let $lat := xs:float($x//latitudine)
      where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
        and $lat <= 46.733 and $lat >= -67.81
        and $lon <= 72.7006667 and $lon >= -79.967
      return $x
    return ...

On Tue, Jul 14, 2015 at 12:55 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:

I'm sorry, but it's not clear how $nodes can include the result of my main query:

    xquery version "3.0";
    declare option output:item-separator ",";
    let $db := db:open("CDI")
    for $x in $db
    let $beginPosition := $x//startTime
    let $lon := xs:float($x//longitudine)
    let $lat := xs:float($x//latitudine)
    where $beginPosition >= "1889-01-01" and $beginPosition <= "2015-07-10"
      and $lat <= 46.733 and $lat >= -67.81
      and $lon <= 72.7006667 and $lon >= -79.967
    return $x

With kind regards,
Menashè

On 07/14/2015 12:51 PM, Christian Grün wrote:

E.g. like that:

    let $count := function($nodes) { count($nodes) }
    let $nodes := (<a/>, <b/>)
    return $count($nodes)

On Tue, Jul 14, 2015 at 12:41 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:

Thank you, but would you please show me how to pass (only once) to each function the XML sequence that results from my main query, instead of simple numbers as in your example?

With kind regards,
Menashè

On 07/14/2015 12:30 PM, Christian Grün wrote:

> I hope it's clear.

Sorry, I'm still confuzzled. What is the problem? I guess you want to define different, exchangeable reporting functions for more or less the same input (dataType, device, ...)?

Here is one way to define functions and call them in a second step:

    let $add := function($a, $b) { $a + $b }
    let $multiply := function($a, $b) { $a * $b }
    for $function in ($add, $multiply)
    return $function(3, 5)

Instead of $add and $multiply, you could have $report-pivoting and $report-count.

On Tue, Jul 14, 2015 at 11:40 AM, Menashè Eliezer melie...@ogs.trieste.it wrote:

Hi,

The beginning of the code still needs to be modified, so here is only the essence of one of the pivoting reports:

    for $singleDataType in $dataType
    for $singleDevice in $device
    for $singleAvailability in $availability
    for $singleCountry in $country
    for $singleParameter in $parameter
    group by $singleDataType, $singleDevice, $singleAvailability, $singleCountry, $singleParameter
    order by $singleDataType, $singleDevice, $singleAvailability, $singleCountry, $singleParameter
    return <Row DatasetType="{$singleDataType}" Instrument="{$singleDevice}"
                Availability="{$singleAvailability}" Country="{$singleCountry}"
                Parameter="{$singleParameter}" NumberOfRecords="{count($current-pre)}"/>

Another report will count all records with one condition fewer:

    group by $singleDataType, $singleDevice, $singleAvailability, $singleCountry
    order by $singleDataType, $singleDevice, $singleAvailability, $singleCountry
    return <Row DatasetType="{$singleDataType}" Instrument="{$singleDevice}"
                Availability="{$singleAvailability}" Country="{$singleCountry}"
                Parameter="Any" NumberOfRecords="{count($current-pre)}"/>

I hope it's clear.

With kind regards,
Menashè
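
To make the idea from this thread concrete, here is a minimal sketch of how one node sequence can be bound once and then handed to several reporting functions. The sample records and the element names (record, dataType, country) are invented for illustration; in the real query, $nodes would be bound to the result of the db:open() FLWOR expression instead.

```xquery
(: Hypothetical sample records standing in for the result of the main query. :)
let $nodes := (
  <record><dataType>CTD</dataType><country>IT</country></record>,
  <record><dataType>CTD</dataType><country>FR</country></record>,
  <record><dataType>XBT</dataType><country>IT</country></record>
)
(: Each report is a function item that receives the same node sequence. :)
let $report-count := function($records) {
  <Total NumberOfRecords="{count($records)}"/>
}
let $report-by-type := function($records) {
  for $r in $records
  group by $type := string($r/dataType)
  order by $type
  return <Row DatasetType="{$type}" NumberOfRecords="{count($r)}"/>
}
(: The main result is computed once and passed to every report. :)
for $report in ($report-count, $report-by-type)
return $report($nodes)
```

With the sample data, this should yield one Total element for three records, followed by one Row per data type. Adding a further report only means adding another function item to the sequence in the final loop.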
Re: [basex-talk] Slow query
...it only makes sense if you store the data in its normalized representation.

On Tue, Jul 14, 2015 at 2:42 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:

Hi,

It sounds like a great idea, and I can also apply it to the date comparisons, but unfortunately the new query is much slower. Please see the attached log.

With kind regards,
Menashè

On 07/14/2015 12:50 PM, Christian Grün wrote:

> Should geo:within of http://docs.basex.org/wiki/Geo_Module help?

The functions of the Geo Module don't use any index structures, so I am afraid they won't speed up the query.

One more idea: you could convert all latitudes and longitudes to strings with a fixed number of digits:

    (:~ Allowed range. :)
    declare variable $RANGE := 99;
    (:~ Minimum latitude. :)
    declare variable $LAT-MIN := -90;
    (:~ Maximum latitude. :)
    declare variable $LAT-MAX := 90;

    (:~
     : Converts a double value to a normalized string value
     : with a fixed number of digits.
     : @param $num number to be converted
     : @param $min minimum allowed value
     : @param $max maximum allowed value
     : @return resulting value
     :)
    declare function local:normalize(
      $num as xs:double,
      $min as xs:integer,
      $max as xs:integer
    ) {
      let $norm := $RANGE * ($num - $min) div ($max - $min)
      return format-number($norm, '00')
    };

    (: Run code for various latitude values :)
    for $latitude in (-90, -89., -13.345, 0, 89.9)
    return local:normalize($latitude, $LAT-MIN, $LAT-MAX)

Next, you could do string comparisons on these values:

    for $doc in db:open("CDI")
    let $lat := $doc//latitude
    let $lon := $doc//longitude
    where $lat >= "883387" and $lat <= "893463"
      and $lon >= "173467" and $lon <= "178745"
    return db:node-pre($doc)

It should be fast enough if the maximum value is not much bigger than the minimum value.
Re: [basex-talk] Slow query
:) I've thought to do it as a second step, but I should do it earlier. Thank you.

With kind regards,
Menashè

On 07/14/2015 03:22 PM, Christian Grün wrote:

...it only makes sense if you store the data in its normalized representation.
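
The point about storing the normalized representation can be sketched as follows. This is only an illustration under assumptions: the record elements and id attributes are invented, the values are normalized on in-memory copies rather than in a database, and a six-digit range (999999 with the picture '000000') is assumed because the six-digit comparison values and the compilation log in this archive suggest it.

```xquery
(: Assumption: six digits, matching the six-digit values compared in this thread. :)
declare variable $RANGE := 999999;

(: Same formula as in the thread, with an explicit string return type. :)
declare function local:normalize(
  $num as xs:double,
  $min as xs:integer,
  $max as xs:integer
) as xs:string {
  format-number($RANGE * ($num - $min) div ($max - $min), '000000')
};

(: Hypothetical records; in practice these would live in the database. :)
let $records := (
  <record id="a"><latitude>46.733</latitude></record>,
  <record id="b"><latitude>-67.81</latitude></record>,
  <record id="c"><latitude>80</latitude></record>
)
(: Step 1: normalize once, at storage time (here: on in-memory copies). :)
let $stored :=
  for $record in $records
  return
    copy $c := $record
    modify replace value of node $c/latitude
           with local:normalize(xs:double($c/latitude), -90, 90)
    return $c
(: Step 2: queries only compare fixed-width strings, no casts per document. :)
let $south := local:normalize(-70, -90, 90)
let $north := local:normalize(50, -90, 90)
return $stored[latitude >= $south and latitude <= $north]/@id/string()
```

With these sample values, records a and b fall inside the range and c does not. Because every stored value has the same number of digits, string order and numeric order agree, which is what makes the plain string comparison valid.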
Re: [basex-talk] HTTP module and cookies
In my experience the case that causes the most problems is the authentication redirect. I have never tried this with BaseX, but I have been very grateful in the past that XMLCalabash implements this:

    The exception arises in the case of redirection. If a redirect response includes cookies, those cookies are forwarded as appropriate to the redirected location when the redirection is followed. [1]

/Andy

[1] http://xprocbook.com/book/refentry-19.html#cookies

On 10 July 2015 at 10:36, Florent Georges fgeor...@fgeorges.org wrote:

Hi,

Correct me if I am wrong, but I believe the HTTP Client in BaseX is the EXPath HTTP Client? It was indeed designed to provide access to low-level, raw HTTP. It does not contain a lot of higher-level features based on HTTP itself. Indeed, you have to handle cookies yourself, for instance.

The difficulty here, if I am right, is the side effects required to pass information somehow (in a hidden way) between 2 different HTTP requests.

Any suggestion to improve the API is welcome (at least on the EXPath mailing list; I don't want to speak for the BaseX developers, but I am pretty sure here as well :-)...)

Regards,

-- 
Florent Georges
http://fgeorges.org/
http://h2oconsulting.be/

On 10 July 2015 at 11:13, Christian Grün wrote:

Hi Vincent,

So far, I'm not aware of a standard solution to handle and cache client-side cookies with BaseX. Could you show us your solution? It might help us to discuss alternative solutions.

Best,
Christian

On Thu, Jul 9, 2015 at 8:30 PM, Lizzi, Vincent vincent.li...@taylorandfrancis.com wrote:

I am using BaseX to scrape data from a web site. This web site, probably like many other websites, relies on cookies, and if it does not receive the expected cookies it delivers a page instructing you to enable cookies in your browser. I was able to get this working by parsing the http:header response to get the cookies to use in subsequent requests. This is the second time I've done this, and even though this works it seems a bit hacky.

Is there a standard way of handling cookies using the HTTP Module or the Fetch module? Or are there any well-written code examples available?

In other environments you typically define a cookie jar in some way, and the cookie jar is used (and updated) automatically in all subsequent HTTP requests. I'm hoping to find something similar in BaseX.

Thanks,
Vincent
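
As an illustration of the cookie jar idea mentioned above, here is a minimal sketch that keeps the jar as an XQuery map and threads it explicitly from one response to the next, since a pure function cannot update hidden state. The helper names local:update-jar and local:cookie-value are hypothetical and not part of the HTTP Module, and the Set-Cookie values are made-up samples; cookie attributes such as Path or Expires are simply ignored.

```xquery
(: Hypothetical helper: folds Set-Cookie values into a jar (map of name to value). :)
declare function local:update-jar(
  $jar as map(*),
  $set-cookie as xs:string*
) as map(*) {
  fold-left($set-cookie, $jar, function($acc, $value) {
    (: Keep only the leading name=value pair; attributes are dropped. :)
    let $pair := normalize-space(tokenize($value, ';')[1])
    return
      if (contains($pair, '=')) then
        map:put($acc, substring-before($pair, '='), substring-after($pair, '='))
      else
        $acc
  })
};

(: Hypothetical helper: serializes the jar as the value of a Cookie header. :)
declare function local:cookie-value($jar as map(*)) as xs:string {
  string-join(
    for $name in map:keys($jar)
    order by $name
    return $name || '=' || $jar($name),
    '; '
  )
};

(: Two simulated responses; the second one replaces the sid cookie. :)
let $jar := local:update-jar(map {}, ('sid=abc; Path=/', 'lang=en'))
let $jar := local:update-jar($jar, 'sid=xyz; HttpOnly')
return local:cookie-value($jar)
(: returns "lang=en; sid=xyz" :)
```

In a real scraper, the Set-Cookie values would come from the http:response element of each request, and the jar returned by one step would be passed on to the next request, for example inside another fold-left over the URLs.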
Re: [basex-talk] Slow query
Hi Menashè,

The attached log file is empty. Maybe it's sufficient if you provide us with the query and give us information on the query compilation (are any indexes used?).

C.

On Mon, Jul 13, 2015 at 3:32 PM, Menashè Eliezer melie...@ogs.trieste.it wrote:

Hello,

Creating a database of partial XML documents had almost no effect. Therefore I've created a database with a very simple XML structure. I'm attaching an example (demo.xml).

BaseX version: 8.2.2
Number of documents: 374739

However, the attached query takes 4 seconds (see the attached simple_query.log). I don't know whether this is considered normal performance, but my real query is different: I'm copying all the documents that match my query to a newly created temporary collection, to get faster processing of this subset (reporting, etc.).

Adding to db (remote Java client): 12 sec.
Optimising the db (remote Java client): 23 sec.

Both the Java client and the BaseX server are installed on powerful servers. Are these numbers normal? The attached results are based on a local client (using the BaseX GUI). In the future, I will have many more documents to handle...

Any ideas? I can also change the schema of my new XML. As for the idea of creating a new temporary db, I'm checking an alternative: returning everything I need, including the reports, in one query and one XML document.

With kind regards,
Menashè
Re: [basex-talk] HTTP module and cookies
The EXPath HTTP Client does seem to provide low-level HTTP access. I am hoping to find an XQuery library that implements some common things such as cookies and authentication on top of the HTTP Client, but haven't come across such a library yet. There are a few OATH implementations for authentication though. I'll have a look at XML Calabash's HTTP cookie handling. Fortunately, authentication is not needed in my current project.

Here is the code that I currently have working. A query can fetch URL(s) by calling local:httpGet(), which sends one request to get the cookies that the web site requires, and then sends request(s) to return the web page for each URL provided.

    declare function local:httpResponseCookies($response as element(http:response))
      as element(http:header)
    {
      let $setCookies := $response/http:header[@name = 'Set-Cookie']/@value/data()
      let $cookies := string-join(
        for $cookie in $setCookies
        return substring-before($cookie, '; '), '; ')
      return <http:header name="Cookie" value="{$cookies}"/>
    };

    declare function local:httpGet($urls as xs:string+) as element(page)*
    {
      let $response := http:send-request(<http:request method='get'/>, $urls[1])
      for $url in $urls
      let $response := http:send-request(
        <http:request method='get'>{
          local:httpResponseCookies($response[self::http:response])
        }</http:request>, $url)
      return element page { attribute url { $url }, $response[2] }
    };

Thanks,
Vincent

From: basex-talk-boun...@mailman.uni-konstanz.de [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of Andy Bunce
Sent: Tuesday, July 14, 2015 12:11 PM
To: Florent Georges
Cc: BaseX
Subject: Re: [basex-talk] HTTP module and cookies

In my experience the case that causes the most problems is the authentication redirect. I have never tried this with BaseX, but I have been very grateful in the past that XMLCalabash implements this:

    The exception arises in the case of redirection. If a redirect response includes cookies, those cookies are forwarded as appropriate to the redirected location when the redirection is followed. [1]

/Andy

[1] http://xprocbook.com/book/refentry-19.html#cookies
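
One edge case in the approach above may be worth a sketch: substring-before($cookie, '; ') returns an empty string when a Set-Cookie value carries no attributes at all, so such a cookie would be lost. The variant below splits on the semicolon instead. The function names local:cookie-pairs and local:request-with-cookies are hypothetical, the cookie strings are invented samples, and the code only builds the request element without sending anything.

```xquery
declare namespace http = "http://expath.org/ns/http-client";

(: Hypothetical helper: extracts the leading name=value pair of each
   Set-Cookie value, with or without trailing attributes. :)
declare function local:cookie-pairs($set-cookie as xs:string*) as xs:string* {
  for $value in $set-cookie
  let $pair := normalize-space(tokenize($value, ';')[1])
  where contains($pair, '=')
  return $pair
};

(: Builds a GET request that carries the collected cookies; nothing is sent. :)
declare function local:request-with-cookies(
  $set-cookie as xs:string*
) as element(http:request) {
  <http:request method="get">
    <http:header name="Cookie"
                 value="{string-join(local:cookie-pairs($set-cookie), '; ')}"/>
  </http:request>
};

(: The second sample value has no attributes and is still kept. :)
local:request-with-cookies(('sid=abc123; Path=/; HttpOnly', 'lang=en'))
```

The resulting element could then be passed as the first argument of http:send-request, in the same way as the request elements in the code above.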
[basex-talk] Whitespace
Hi, When I use the file:write function, the whitespaces before an element are deleted (and also the initial whitespace of a string in an element). This is a problem for elements containing text and elements. Is there a way to avoid this? Thanks. J.
Re: [basex-talk] Whitespace
Hi,

You can use the serialization parameters and switch off indentation (the indent option set to no).

Marc

On July 14, 2015 8:13:09 PM CEST, meumapple meumap...@gmail.com wrote:

Hi,

When I use the file:write function, the whitespace before an element is deleted (and also the initial whitespace of a string in an element). This is a problem for elements containing text and elements. Is there a way to avoid this?

Thanks.
J.

-- 
Sent from my Android phone with K-9 Mail. Please excuse my brevity.
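
A minimal sketch of this suggestion, assuming the standard serialization parameter element from the XQuery serialization namespace: the sample paragraph is invented, and fn:serialize is used here so that the effect can be checked without writing a file.

```xquery
declare namespace output = "http://www.w3.org/2010/xslt-xquery-serialization";

(: Hypothetical mixed content; the enclosed ' ' keeps the space between
   the two child elements from being treated as boundary whitespace. :)
let $node := <p>Some <b>bold</b>{' '}<i>text</i></p>
(: Serialization parameters with indentation switched off. :)
let $params :=
  <output:serialization-parameters>
    <output:indent value="no"/>
  </output:serialization-parameters>
return serialize($node, $params)
```

According to the File Module documentation, file:write accepts serialization parameters as its third argument, so the same $params element should be usable as file:write($path, $node, $params), where $path is a placeholder for the target file. It is worth checking this against the documentation of the BaseX version in use.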