Hi Lars, I think I can confirm the observed behavior: in certain circumstances, the index properties (stemming etc.) won't be applied to the optimized full-text query when using RESTXQ.
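
Until that is fixed, one way to narrow the problem down might be to write the match options into the query itself, so they no longer depend on properties taken over from the index. The following is a minimal sketch, not a confirmed fix. It assumes the database and element names from the script quoted further down; the path /hyphens-explicit/{$word}, the function name page:show-hyphens-explicit and the language code "no" are illustrative assumptions. This thread does not establish whether the optimizer still rewrites such a query for index access under RESTXQ.

```xquery
(: Sketch only: a reduced variant of the RESTXQ function from this thread.
   The full-text match options are stated explicitly in the query.
   Assumptions (not from the original script):
     - the path /hyphens-explicit/{$word} and the function name are hypothetical
     - "no" is assumed to be the language code accepted for Norwegian
     - the options are meant to mirror those chosen when the index was built :)
module namespace page = 'http://basex.org/modules/web-page';

declare
  %rest:path("/hyphens-explicit/{$word}")
  %output:method("html")
function page:show-hyphens-explicit($word) {
  (: same database as in the original script :)
  let $db := db:open('hyphen-data')
  (: standard XQuery Full Text match options, written out in the predicate :)
  let $hits := $db/hyphens/hyphens[
    full contains text { $word }
      using stemming
      using language "no"
      using diacritics sensitive
  ]
  (: report only the number of matches, to keep the comparison simple :)
  return <p>{ $word }: { count($hits) }</p>
};
```

Requesting /hyphens-explicit/mannen and comparing the count with the same predicate run via REST or in the GUI would show whether the explicit options change the RESTXQ result.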
I'll check out how this can be fixed.

Thanks,
Christian

On Mon, May 18, 2015 at 6:46 PM, Lars Johnsen <yoon...@gmail.com> wrote:
> A last update, which may illuminate a little. After reindexing the database
> using Norwegian (snowball), stemming, and keeping diacritics, RESTXQ
> processes neither the special characters (treats them as closest ASCII), nor
> inflected forms.
>
> The words "mannen" (=the man, definite) and "spaserer" (=walks, present
> tense), result in no output, while using the naked stems "mann" and "spaser"
> the full result is displayed. In contrast to REST which behaves as expected.
>
> Cheers
> Lars
>
> 2015-05-18 15:28 GMT+02:00 Lars Johnsen <yoon...@gmail.com>:
>>
>> As an update, after rebuilding database with
>>
>> text index,
>> full text index (no language, no stemming, keep diacritics)
>>
>> restarting server:
>> BaseX 8.1.1 [Server]
>> Server was started (port: 29084)
>> [main] INFO org.eclipse.jetty.server.AbstractConnector - Started
>> SelectChannelConnector@0.0.0.0:8984
>> HTTP Server was started (port: 8984)
>>
>> RESTXQ: Norwegian characters are converted using full text index, changing
>> to text index takes forever.
>> REST: Full-text works as expected, and text index works as expected (same
>> as running in GUI for both).
>>
>> It looks as if the index structure is treated differently.
>>
>> 2015-05-18 15:07 GMT+02:00 Lars Johnsen <yoon...@gmail.com>:
>>>
>>> The full text query is blisteringly fast for both, the text index query
>>> is fast only for REST queries and seems not to be used with queries in
>>> RESTXQ. I am rebuilding the whole database now to see how it goes, and will
>>> restart everything for a new assessment.
>>>
>>> 2015-05-18 15:00 GMT+02:00 Christian Grün <christian.gr...@gmail.com>:
>>>>
>>>> > However, when using text index instead of full text the results are
>>>> > the same for both, except that RESTXQ takes almost forever
>>>>
>>>> What about the original query: Has it been slow as well, or do you
>>>> think this is a new problem?
>>>>
>>>> > 2015-05-18 14:28 GMT+02:00 Christian Grün <christian.gr...@gmail.com>:
>>>> >>
>>>> >> It could be that your URL is decoded in the wrong way. What happens if
>>>> >> you run the following function with REST and RESTXQ and "føre" as
>>>> >> word?
>>>> >>
>>>> >> declare
>>>> >>   %rest:path("/test/encoding/{$word}")
>>>> >> function page:test-encoding($word) {
>>>> >>   string-to-codepoints($word)
>>>> >> };
>>>> >>
>>>> >> Thanks,
>>>> >> Christian
>>>> >>
>>>> >> string-to-codepoints()
>>>> >> > REST output (2 first lines):
>>>> >> > føre
>>>> >> > fø - re 219
>>>> >> >
>>>> >> > RESTXQ
>>>> >> > føre
>>>> >> > fo - re 123
>>>> >> >
>>>> >> > The first word quoted is "føre" in both cases and is what the
>>>> >> > scripts see, so the full text is given the same in both cases.
>>>> >> > Could it be that within RESTXQ the full text index is treated
>>>> >> > differently?
>>>> >> >
>>>> >> > I will work closer on a self contained example, but thought this
>>>> >> > might point to something.
>>>> >> >
>>>> >> > Cheers
>>>> >> > Lars
>>>> >> >
>>>> >> > 2015-05-18 13:44 GMT+02:00 Lars Johnsen <yoon...@gmail.com>:
>>>> >> >>
>>>> >> >> Hi Christian - and thanks for fast response. Latest version 8.1.1
>>>> >> >> is in use (same behaviour as previous). Let me see if I can make
>>>> >> >> a self contained example.
>>>> >> >>
>>>> >> >> best,
>>>> >> >> Lars
>>>> >> >>
>>>> >> >> 2015-05-18 13:40 GMT+02:00 Christian Grün
>>>> >> >> <christian.gr...@gmail.com>:
>>>> >> >>>
>>>> >> >>> Hi Lars,
>>>> >> >>>
>>>> >> >>> hm, that's difficult to tell. All I can say is that this sounds
>>>> >> >>> unusual, so I'm coming up with my standard questions: Do you
>>>> >> >>> think you could build us a little example that allows us to
>>>> >> >>> reproduce the problem? Have you tried the latest version of BaseX?
>>>> >> >>>
>>>> >> >>> Best,
>>>> >> >>> Christian
>>>> >> >>>
>>>> >> >>> On Mon, May 18, 2015 at 1:35 PM, Lars Johnsen <yoon...@gmail.com>
>>>> >> >>> wrote:
>>>> >> >>> >
>>>> >> >>> > I am running a web script in two identical versions (identical
>>>> >> >>> > as in "cut and paste"), one via RESTXQ and one via REST. The
>>>> >> >>> > response is different, and I wondered what may be the trouble.
>>>> >> >>> >
>>>> >> >>> > For example the output (the URLs only work locally) for
>>>> >> >>> > http://ljohnsen:8984/hyphens/mellom
>>>> >> >>> > is the same as
>>>> >> >>> > http://ljohnsen:8984/rest?run=hyphen-show.xq&word=mellom
>>>> >> >>> >
>>>> >> >>> > which is a set of hyphenation data:
>>>> >> >>> > mellom
>>>> >> >>> > mel - lom 17005
>>>> >> >>> > Mel - lom 144
>>>> >> >>> > mel - lom. 50
>>>> >> >>> >
>>>> >> >>> > but if "mellom" is exchanged with "nasjonalbiblioteket" only
>>>> >> >>> > the REST version shows any result, which then is the same as I
>>>> >> >>> > get experimenting in the GUI.
>>>> >> >>> >
>>>> >> >>> > The actual script is added below; it runs in both versions
>>>> >> >>> > (identical apart from the rest and restxq interfaces). It uses
>>>> >> >>> > full text search, but results differ when run under the
>>>> >> >>> > REST-regime.
>>>> >> >>> >
>>>> >> >>> > All the best
>>>> >> >>> > Lars G Johnsen
>>>> >> >>> > National Library of Norway
>>>> >> >>> >
>>>> >> >>> > module namespace page = 'http://basex.org/modules/web-page';
>>>> >> >>> >
>>>> >> >>> > declare
>>>> >> >>> >   %rest:path("/hyphens/{$word}")
>>>> >> >>> >   %output:method("html")
>>>> >> >>> > function page:show-hyphens($word) {
>>>> >> >>> >   let $db := db:open('hyphen-data')
>>>> >> >>> >   let $hyphens :=
>>>> >> >>> >     for $hyp in $db/hyphens/hyphens[full contains text {$word}]
>>>> >> >>> >     group by $first := $hyp/first, $second := $hyp/second
>>>> >> >>> >     let $count := count($hyp)
>>>> >> >>> >     order by xs:int($count) descending
>>>> >> >>> >     return element p {
>>>> >> >>> >       attribute freq {$count},
>>>> >> >>> >       $first, " - ", $second, $count
>>>> >> >>> >     }
>>>> >> >>> >
>>>> >> >>> >   let $total := sum($hyphens//@freq)
>>>> >> >>> >   let $div := element div {
>>>> >> >>> >     element p {$word},
>>>> >> >>> >     for $hyp in $hyphens
>>>> >> >>> >     return element div {
>>>> >> >>> >       attribute class {"hyph"},
>>>> >> >>> >       attribute style {"font-size:", 1 + round(xs:int($hyp//@freq/data()) div $total, 1) || "em"},
>>>> >> >>> >       $hyp
>>>> >> >>> >     }
>>>> >> >>> >   }
>>>> >> >>> >   return
>>>> >> >>> >     <html encoding="UTF-8">
>>>> >> >>> >       <head>
>>>> >> >>> >         <meta http-equiv="Content-Type" content="text/html" charset="UTF-8" />
>>>> >> >>> >         <title>Orddelinger</title>
>>>> >> >>> >       </head>
>>>> >> >>> >       <body>{$div}
>>>> >> >>> >       </body>
>>>> >> >>> >     </html>
>>>> >> >>> >
>>>> >> >>> > };