Re: [basex-talk] rest vs. restxq - strange difference

2015-05-20 Thread Lars Johnsen
Thanks for that. New version works nicely with full text indexing - and
very fast too!

Noticed that the text index seems to work differently between RESTXQ (not
utilized?) and REST - judging from the response time.

Thanks again for the efforts

Lars


Re: [basex-talk] rest vs. restxq - strange difference

2015-05-19 Thread Christian Grün
Hi Lars,

I think I can confirm the observed behavior: in certain circumstances,
the index properties (stemming etc.) won't be applied to the optimized
full-text query when using RESTXQ.

I'll check out how this can be fixed.

Thanks,
Christian


On Mon, May 18, 2015 at 6:46 PM, Lars Johnsen yoon...@gmail.com wrote:
 A last update, which may illuminate a little. After reindexing the database
 using Norwegian (snowball), stemming, and keeping diacritics, RESTXQ
 handles neither the special characters (it treats them as the closest ASCII
 character) nor inflected forms.

 The words "mannen" (= the man, definite) and "spaserer" (= walks, present
 tense) result in no output, while the bare stems "mann" and "spaser"
 produce the full result. REST, in contrast, behaves as expected.
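
 One way to probe this behaviour is to spell out the match options in the
 query itself instead of relying on the index properties. The following is
 only a sketch: the database and element names are taken from the script in
 this thread, the external variable and the language code "no" are
 assumptions, and it is not confirmed here that this changes the RESTXQ
 result.

```xquery
(: Sketch only, not part of the original script.
   Database and element names come from the script in this thread;
   the language code "no" is an assumption. :)
declare variable $word external := "mannen";

for $hyp in db:open('hyphen-data')/hyphens/hyphens[
  full contains text { $word }
    using stemming
    using language "no"
    using diacritics sensitive
]
return $hyp
```

 If the explicit options return the inflected forms under REST but not under
 RESTXQ, that would point at the query optimization rather than at the data.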


 Cheers
 Lars


Re: [basex-talk] rest vs. restxq - strange difference

2015-05-19 Thread Christian Grün
 I'll check out how this can be fixed.

So I checked out how to fix it, and I fixed it [1]. Feel free to try
the latest snapshot [2]!
Christian

[1] https://github.com/BaseXdb/basex/issues/1144
[2] http://files.basex.org/releases/latest



Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
As an update, after rebuilding database with

text index,
full text index (no language, no stemming, keep diacritics)

restarting server:
BaseX 8.1.1 [Server]
Server was started (port: 29084)
[main] INFO org.eclipse.jetty.server.AbstractConnector - Started
SelectChannelConnector@0.0.0.0:8984
HTTP Server was started (port: 8984)

RESTXQ: Norwegian characters are converted to ASCII when the full-text index
is used; switching to the text index takes forever.
REST: Full-text works as expected, and the text index works as expected (same
as running in the GUI for both).

It looks as if the index structure is treated differently.



Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Christian Grün
 However, when using text index instead of full text the results are the same
 for both, except that RESTXQ takes almost forever

What about the original query: Has it been slow as well, or do you
think this is a new problem?



Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
The full text query is blisteringly fast for both, the text index query is
fast only for REST queries and seems not to be used with queries in RESTXQ.
I am rebuilding the whole database now to see how it goes, and will restart
everything for a new assessment.




Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Christian Grün
It could be that your URL is decoded in the wrong way. What happens if
you run the following function with REST and RESTXQ and "føre" as the
word?

  declare
    %rest:path("/test/encoding/{$word}")
  function page:test-encoding($word) {
    string-to-codepoints($word)
  };

Thanks,
Christian







Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
Tried to make a small example, but then things worked the same, so I reindexed
the database (no language and no stemming) and found this. It seems that it
has to do with character encoding.

RESTXQ finds hits for "føre" as "fore", while REST treats it as "føre", so
the outputs look like this:

REST output (2 first lines):
   føre
   fø - re 219

RESTXQ
   føre
   fo - re 123

The first word quoted is "føre" in both cases and is what the scripts see,
so the full-text query receives the same input in both cases. Could it be that
the full-text index is treated differently within RESTXQ?

I will work on a self-contained example, but thought this might
point to something.

Cheers
Lars


2015-05-18 13:44 GMT+02:00 Lars Johnsen yoon...@gmail.com:

 Hi Christian - and thanks for the fast response. The latest version, 8.1.1,
 is in use (same behaviour as the previous one). Let me see if I can make a
 self-contained example.

 best,
 Lars

 2015-05-18 13:40 GMT+02:00 Christian Grün christian.gr...@gmail.com:

 Hi Lars,

 hm, that's difficult to tell. All I can say is that this sounds
 unusual, so I'm coming up with my standard questions: Do you think you
 could build us a little example that allows us to reproduce the
 problem? Have you tried the latest version of BaseX?

 Best,
 Christian


 On Mon, May 18, 2015 at 1:35 PM, Lars Johnsen yoon...@gmail.com wrote:
 
  I am running a web script in two identical versions (identical as in cut
  and paste), one via RESTXQ and one via REST. The responses differ,
  and I wondered what the trouble may be.
 
  For example, the output (the URLs only work locally) for
  http://ljohnsen:8984/hyphens/mellom
  is the same as for
  http://ljohnsen:8984/rest?run=hyphen-show.xq&word=mellom
 
  which is a set of hyphenation data:
  mellom
  mel - lom 17005
  Mel - lom 144
  mel - lom. 50
 
  but if "mellom" is replaced with "nasjonalbiblioteket", only the REST
  version shows any result, which is then the same as what I get when
  experimenting in the GUI.
 
  The actual script is added below, and which runs in both versions
  (identical apart form the rest and restxq interfaces), it uses full text
  search, but results differ when run under the REST-regime.
 
  All the best
  Lars G Johnsen
  National Library of Norway
 
  module namespace page = 'http://basex.org/modules/web-page';
 
  declare
    %rest:path("/hyphens/{$word}")
    %output:method("html")
  function page:show-hyphens($word) {
    let $db := db:open('hyphen-data')
    (: group the matching entries by their two hyphenation parts :)
    let $hyphens :=
      for $hyp in $db/hyphens/hyphens[full contains text {$word}]
      group by $first := $hyp/first, $second := $hyp/second
      let $count := count($hyp)
      order by xs:int($count) descending
      return element p {
        attribute freq {$count},
        $first, " - ", $second, $count
      }
 
    let $total := sum($hyphens//@freq)
    (: scale the font size of each entry by its relative frequency :)
    let $div := element div {
      element p {$word},
      for $hyp in $hyphens
      return element div {
        attribute class {"hyph"},
        attribute style {"font-size:", 1
          + round(xs:int($hyp//@freq/data())
          div $total, 1) || "em"},
        $hyp
      }
    }
    return
      <html encoding="UTF-8">
        <head>
          <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
          <title>Orddelinger</title>
        </head>
        <body>{$div}</body>
      </html>
  };





Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
The codepoints for "føre" are identical for both:

102 248 114 101

and the same as in the GUI.

However, when using the text index instead of full text, the results are
the same for both, except that RESTXQ takes almost forever (as if there was
no text index), while REST gives an immediate result. So it looks as if
RESTXQ accesses the index structure in a different way - could that be so,
or is there something strange in my own configuration?
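
To test whether the indexes are reachable at all from both interfaces, a sketch like the following calls the index functions directly instead of leaving the choice to the query optimizer. It assumes the database name hyphen-data from the script and the BaseX functions db:text and ft:search; 'føre' is just a sample term.

```xquery
(: Sketch: explicit index access in BaseX, bypassing the optimizer. :)
let $db := 'hyphen-data'
let $term := 'føre'
return (
  (: text index lookup: text nodes whose value equals the term :)
  count(db:text($db, $term)),
  (: full-text index lookup: text nodes containing the term :)
  count(ft:search($db, $term))
)
```

If these calls are fast under RESTXQ while the ordinary path expression on $word is slow, that would suggest the index rewrite is not being applied there, rather than the index being missing.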



2015-05-18 14:28 GMT+02:00 Christian Grün christian.gr...@gmail.com:

 It could be that your URL is decoded in a wrong way. What happens if
 you run the following function with REST and RESTXQ and "føre" as
 word?

   declare
     %rest:path("/test/encoding/{$word}")
   function page:test-encoding($word) {
     string-to-codepoints($word)
   };

 Thanks,
 Christian


 
 
 



Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
A last update, which may illuminate a little. After reindexing the database
using Norwegian (snowball), stemming, and keeping diacritics, RESTXQ
processes neither the special characters (it treats them as the closest
ASCII) nor inflected forms.

The words "mannen" (= the man, definite) and "spaserer" (= walks, present
tense) result in no output, while with the naked stems "mann" and "spaser"
the full result is displayed. REST, in contrast, behaves as expected.
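
As a rough sketch of how the stemming behaviour could be checked independently of the index settings, the query below requests stemming and a language explicitly in the full-text expression. It reuses the database and element names from the script; the language code 'no' for Norwegian is an assumption and may need adjusting to whatever the installation accepts.

```xquery
(: Sketch: request stemming explicitly in the query, so the inflected
   forms should be reduced to their stems before matching. :)
for $word in ('mannen', 'spaserer')
return db:open('hyphen-data')/hyphens/hyphens[
  full contains text {$word} using stemming using language 'no'
]
```

Running the same query through the GUI, REST and RESTXQ would show whether the match options are honoured differently by the three.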


Cheers
Lars

2015-05-18 15:28 GMT+02:00 Lars Johnsen yoon...@gmail.com:

 As an update, after rebuilding database with

 text index,
 full text index (no language, no stemming, keep diacritics)

 restarting server:
 BaseX 8.1.1 [Server]
 Server was started (port: 29084)
 [main] INFO org.eclipse.jetty.server.AbstractConnector - Started
 SelectChannelConnector@0.0.0.0:8984
 HTTP Server was started (port: 8984)

 RESTXQ: Norwegian characters are converted when using the full-text index;
 changing to the text index takes forever.
 REST: Full-text works as expected, and the text index works as expected
 (same as running in the GUI for both).

 It looks as if the index structure is treated differently.


 2015-05-18 15:07 GMT+02:00 Lars Johnsen yoon...@gmail.com:

 The full text query is blisteringly fast for both, the text index query
 is fast only for REST queries and seems not to be used with queries in
 RESTXQ. I am rebuilding the whole database now to see how it goes, and will
 restart everything for a new assessment.



 2015-05-18 15:00 GMT+02:00 Christian Grün christian.gr...@gmail.com:

  However, when using text index instead of full text the results are
  the same for both, except that RESTXQ takes almost forever

 What about the original query: Has it been slow as well, or do you
 think this is a new problem?


  2015-05-18 14:28 GMT+02:00 Christian Grün christian.gr...@gmail.com:
 
  It could be that your URL is decoded in a wrong way. What happens if
  you run the following function with REST and RESTXQ and "føre" as
  word?
 
    declare
      %rest:path("/test/encoding/{$word}")
    function page:test-encoding($word) {
      string-to-codepoints($word)
    };
 
  Thanks,
  Christian
 
 

Re: [basex-talk] rest vs. restxq - strange difference

2015-05-18 Thread Lars Johnsen
Hi Christian - and thanks for fast response. Latest version 8.1.1 is in use
(same behaviour as previous). Let me see if I can make a self-contained
example.

best,
Lars

2015-05-18 13:40 GMT+02:00 Christian Grün christian.gr...@gmail.com:

 Hi Lars,

 hm, that's difficult to tell. All I can say is that this sounds
 unusual, so I'm coming up with my standard questions: Do you think you
 could build us a little example that allows us to reproduce the
 problem? Have you tried the latest version of BaseX?

 Best,
 Christian

