[basex-talk] server time out

2013-05-31 Thread Cerstin Elisabeth Mahlow
Hi,

I run several web applications that use CGI scripts to read and update 
BaseX databases via various XQuery queries.  I use the basexserver -S command to 
start the server on a CentOS server.  Maybe this is not the very best idea.

I already split the data into various databases to avoid concurrent reading and 
writing.  And I have a simple cron job running, that tests whether the BaseX 
server is still running and restarts it if it isn't running any more.

However, my users get (or rather create) a timeout from time to time.  If the 
BaseX server then shuts down, the cron job restarts it and everything is 
fine again.  In the last few weeks, however, I discovered that after such a 
timeout the BaseX server sometimes still seems to be running, but no longer 
answers any XQuery.  So I have to kill the process by hand and restart it.

Could you recommend a better solution to keep the BaseX server running 
smoothly even if a timeout occurs from time to time?

Thanks in advance and best regards

Cerstin
-- 
Dr. phil. Cerstin Mahlow

Universität Basel
Departement Sprach- und Literaturwissenschaften
Fachbereich Deutsche Sprach- und Literaturwissenschaft
Nadelberg 4
4051 Basel
Schweiz

Tel:  +41 61 267 07 65
Fax: +41 61 267 34 40
Mail: cerstin.mah...@unibas.ch
Web: http://www.oldphras.net

___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk


Re: [basex-talk] whitespace around comments

2013-04-12 Thread Cerstin Elisabeth Mahlow
Hi Christian,

On 12.04.2013 at 10:49, Christian Grün <christian.gr...@gmail.com> wrote:

 our CHOP flag is subject to frequent discussions, which is why we will
 eventually change the default to FALSE.

I really second that!

 For now, we are still a little
 bit resistant, as such a change will change the behavior of existing
 BaseX applications out there, so we’ll probably combine the switch
 with the next major release.
 
 For now, you can preserve whitespace by, e.g.:
 
 -- adding the line CHOP=false in your .basex configuration file
 -- using the basex command-line flag -w
 -- using set chop false as first command, or setting the options in
 any other way described in our Wiki [1].


The problem is that you only become aware of this AFTER you have created a DB 
and worked with it.  Unfortunately, users are not informed when creating a DB 
that they should think about whitespace.  And there is no reason a user should 
assume that creating a DB would semantically change their data. 

In the Digital Humanities, it is all about mixed content (another major issue, 
I think) as in TEI-annotated data and of course this involves whitespace.  The 
worst thing at the moment is that you cannot get back your whitespace once you 
figure out that you should have preserved it actively.  I had to recreate the 
DB and recode node-IDs in dependent DBs and so on.

So, yes please, make preserving whitespace the default behavior!

Best regards

Cerstin


[basex-talk] xquery help

2013-03-26 Thread Cerstin Elisabeth Mahlow
Hi,

I got stuck with an XQuery, could you please help?

I have data like this:

<collection>
<entry>
  <q>[text() contains text ('Johann' ftand 'Ballhorn') distance at most 5 words 
ordered][self::*:p or self::*:q]</q>
  <id>12345</id>
  <p>bey welchen die Verbesserungen durch Johann Ballhorn öfter vorkommen als 
man glauben sollte.</p>
</entry>
<entry>…
</collection>

All I want to get back is the p node with the content of the q node applied 
to p using ft:mark, i.e., the node with highlighting like this:

  <p>bey welchen die Verbesserungen durch <mark>Johann</mark> 
<mark>Ballhorn</mark> öfter vorkommen als man glauben sollte.</p>

But I don't manage to call ft:mark with the correct parameters.

I tried variants of this:

for $i at $p in //entry
let $q := $i/q
return ft:mark($i $q)

But this gives

Expecting closing bracket for 'ft:mark(…'

I think I have to use concat() in some way, but how?
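
For reference, a minimal sketch of what a working ft:mark call looks like when the full-text predicate is written directly into the query rather than read from the q element. The attempt above fails to parse because $i and $q are not separated by a comma; and as far as I know, a second argument to ft:mark would be taken as the name of the marker element, not as a query. The predicate below is copied from the sample data; running it assumes the entries live in an opened database.

```xquery
(: Sketch: the full-text predicate is written out statically here instead of
   being read from the q element. //entry must come from an opened database,
   since ft:mark expects database nodes. :)
for $entry in //entry
return ft:mark(
  $entry/p[text() contains text ('Johann' ftand 'Ballhorn')
           distance at most 5 words ordered]
)
```

Applying a predicate that is stored as a string still requires building and evaluating a query string at run time.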

Thanks in advance

Cerstin


Re: [basex-talk] xquery help

2013-03-26 Thread Cerstin Elisabeth Mahlow
Hi Alex,

thanks for your answer!

On 26.03.2013 at 18:43, Alexander Holupirek 
<alexander.holupi...@uni-konstanz.de> wrote:

 
 if I got you right you want to evaluate a query, such as
 
 ft:mark(<p>bey welchen die Verbesserungen durch Johann Ballhorn öfter 
 vorkommen als man glauben sollte.</p>[./text() contains text ('Johann' ftand 
 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q])
 
 which you construct from your collection?
 
 <p>bey welchen ...</p> is a database node and
 <q>[text() contains text ('Johann' ... ]</q> holds a query predicate as string
 
 So you dynamically construct a query string that you like to evaluate?

I already have the complete query stored in the very same collection

 ==
 
 Since ft:mark() operates on database nodes, simply typing
 
 ft:mark(<p>bey welchen die Verbesserungen durch Johann Ballhorn öfter 
 vorkommen als man glauben sollte.</p>[./text() contains text ('Johann' ftand 
 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q])
 
 in the BaseX GUI results in 
 
 [BXDB0001] ft:mark(element p { (bey welchen die Verbesserungen durch Johann 
 Ballhorn öfter vorkommen als man glauben sollte.) }[text() contains text 
 (Johann ftand Ballhorn) ordered distance(0-5 word)][(self::*:p or 
 self::*:q)]): database node expected.

Yes, I tried this :)

[…]

 let $in := doc('cm')
 let $p := $in//p
 let $q := $in//q
 let $qs := concat('ft:mark($p', $q, ')')
 return
  xquery:eval($qs)
 
 evaluates the query string ... aehh .. tries to evaluate the query string
 
 [XPST0008] Undefined variable $p.

And also this.

 
 http://docs.basex.org/wiki/XQuery_Module#xquery:eval shows how to pass a 
 binding into the query to be evaluated
 
 let $in := doc('cm')
 let $p := $in//p
 let $q := $in//q
 let $qs := concat('ft:mark($binding', $q, ')')
 let $bm := map{ '$binding' := $p }
 return
  xquery:eval($qs, $bm)
 
 constructs a query string
 
 ft:mark($binding[text() contains text ('Johann' ftand 'Ballhorn') distance at 
 most 5 words ordered][self::*:p or self::*:q])
 and a binding for $binding, and
 
 results in
 
 <p>bey welchen die Verbesserungen durch <mark>Johann</mark> 
 <mark>Ballhorn</mark> öfter vorkommen als man glauben sollte.</p>
 
 Which, I hope, is the result you wanted to achieve.


OK, I now have this:

for $i at $p in //entry
let $q := $i/q
let $text := if ($i/p) then $i/p else $i/l
let $ft := concat("ft:mark($binding", $q, ")")
let $bm := map{'$binding' := $text}
return xquery:eval($ft, $bm)

The if clause allows me to skip the extension [self::*:p or self::*:q] of the 
query.


Thanks for your help!

Cerstin


[basex-talk] replace value vs. insert node

2013-03-22 Thread Cerstin Elisabeth Mahlow
Hi,

in terms of performance, is it cheaper to update the value of an existing node 
or to insert a node with that value?  

I'm not sure whether I should create empty nodes (or nodes with a default 
value) up front and update them when users specify values, or insert nodes 
with these values only when users specify them in the web interface.  Values 
are short texts like "yes" or "no".

The value of the one node I have to update in any case is a longer text with 
some markup in it (for highlighting purposes of single tokens).

UPDINDEX for the DB is set and I will have to optimize the index after each 
user interaction, because I need the fulltext index in the next interaction. So 
the options are actually these:

- insert 6 nodes, replace value of 1 node, optimize
- replace values of 7 nodes, optimize
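
To make the two alternatives concrete, here is a small sketch of both update forms in XQuery Update syntax. The element names id and comment and the values are hypothetical and only serve as illustration; selected is the element name used elsewhere in this thread. The sketch shows how the two operations are written, not which one is cheaper.

```xquery
(: Hypothetical data model: entry elements with an id child; the selected
   element already exists, the comment element does not. :)
for $e in //entry[id = '12345']
return (
  (: update in place: the target must exist, otherwise the query fails :)
  replace value of node $e/selected with 'yes',
  (: add a new element as the last child of the entry :)
  insert node <comment>checked</comment> as last into $e
)
```

Both operations are collected and applied together at the end of the query, so they can be combined in one request.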

Thanks in advance and best regards

Cerstin


Re: [basex-talk] store results of a query in a new database

2013-03-21 Thread Cerstin Elisabeth Mahlow
Hi Christian,

On 18.03.2013 at 23:53, Christian Grün <christian.gr...@gmail.com> wrote:

 And my assumption isn't true: The document has to exist, I cannot specify a 
 non-existing document, i.e., a document I maybe would like to produce later 
 as an export of the DB.
 
 This sounds surprising to me, as I don’t get any errors when running
 e.g. the following command..
 
 basex "db:create('db', <root/>, 'doesnotexist.xml')"
 
 What does your command call look like?

It's in the cgi and it looks like this:

$session->execute("xquery declare option db:ftindex 'on'; " .
  "declare option db:updindex 'on'; " .
  "db:create('annotate-$phraseme', <root/>, 'annotate-$phraseme.xml')");

$phraseme holds the ID to be used.


But it's funny, it works today without error, but it didn't work two days ago. 
And I did not install any updates.

So it fixed itself, thanks!

Cerstin



Re: [basex-talk] store results of a query in a new database

2013-03-18 Thread Cerstin Elisabeth Mahlow
Hi Christian,

On 18.03.2013 at 15:54, Christian Grün <christian.gr...@gmail.com> wrote:

 So I have the DB 'collect' open and then do:
 db:create('annotate-abcdef')
 
 as you guessed, you’ll have to specify a document path, in which you
 can then add new nodes. Every XML document has a root node, and vice
 versa, so it’s conceptually not possible to only create a root node
 without a document. Please note, however, that your second argument is
 simply the name and path of your document, and does not read any
 input.

Ah, so I would better use 

db:create('annotate-$name', <root/>, 'annotate-$name.xml')

Somehow it is not clear from the Wiki, I thought I had to use an existing 
document with some data in it.  So this document does not have to exist before?

 How do I set UPDINDEX ON for the new DB?
 
 The following lines should do what you need:
 
  declare option db:ftindex "on";
  declare option db:updindex "on";
  db:create('annotate', <root/>, 'root.xml')
 
 If you activate the FTINDEX option before creating a database, the
 OPTIMIZE call will always create/update your full-text index.

Ah, I will try this.  

 To be able to use ft:mark, I would have to optimize the DB with 
 db:optimize() after having added nodes, is this correct?
 
 ft:mark() also works without full-text index, but the index solution
 is usually faster.


OK, since some of the DBs will have more than 400 entries, it is safe to always 
optimize the index after updating.

Thanks and best regards

Cerstin


Re: [basex-talk] re-sort database

2013-03-14 Thread Cerstin Elisabeth Mahlow
Hi,

On 14.03.2013 at 00:02, Liam R E Quin <l...@w3.org> wrote:

 On Wed, 2013-03-13 at 22:29 +0100, Christian Grün wrote:
 
 You could try to export your data and create a new
 database without updatable index structures; this could also speed up
 your updates. Maybe it even allows you to update all nodes in a single
 run.
 
 I already set VM=-Xmx1024m and I use BaseX 7.6.1 Beta from February 14 on a 
 MacBook Air with a 2 GHz processor and 8 GB RAM.
 
 I'd try using VM=-Xmx6000m if you have 8G of RAM.

OK, after combining both tips (using a database without updatable index and 
setting VM=-Xmx6000m) it worked in a single run. Thanks!

After 5'729'855 ms (95 minutes) it updated 35'344 nodes within the 165'000 
entries in the database.

I don't know if this is slow and could be improved, but I'm happy having fixed 
the database :)
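
To put the numbers above in perspective, a quick back-of-the-envelope calculation, written as XQuery so it can be pasted into the GUI. It only uses the two figures reported above; the variable names are made up.

```xquery
let $total-ms := 5729855   (: reported run time in milliseconds :)
let $updated  := 35344     (: reported number of updated nodes :)
return (
  round($total-ms div 60000),     (: run time in minutes, about 95 :)
  round($total-ms div $updated)   (: milliseconds per updated node, about 162 :)
)
```

So each updated node cost roughly a sixth of a second on average.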

Best regards and thanks again

Cerstin



[basex-talk] store results of a query in a new database

2013-03-14 Thread Cerstin Elisabeth Mahlow
Hi,

I would like to store the results of an xquery extracting some nodes from an 
existing database into a new database.

So I have the DB 'collect' open and then do:

db:create('annotate-abcdef')

and then

for $i at $p in //entry[phraseme[text() = "abcdef"] and selected[text() = "yes"]]
let $query := $i/query
let $nodeid := $i/node
let $node := db:open-id('TG-DTA-GerManC-stemming-ws', $nodeid)
let $nodename := name($node)
let $nodecontent := string($node)
return insert node <div>{$nodeid} {$query} {element {$nodename} {$nodecontent}}</div>
  as last into db:open('annotate-abcdef')

However, the error message is 

[XUDY0027] Insert target must not be empty.

How would I add a root element to the new DB, as I don't want to link it to an 
existing document?  All nodes added to the DB are only results from queries 
over an existing DB.  All I can see in the Wiki is how to use a document as 
initial data for the DB.

How do I set UPDINDEX ON for the new DB?  I will later update the information 
added in the first place by other queries.  To be able to use ft:mark, I would 
have to optimize the DB with db:optimize() after having added nodes, is this 
correct?
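
A sketch of how the insert could look once the target database contains a root element. It assumes that 'annotate-abcdef' has already been created in an earlier, separate query, for example with db:create('annotate-abcdef', <root/>, 'annotate-abcdef.xml'); since db:create is itself an update that only takes effect when its query finishes, creation and insertion cannot share one query. Function names are those used in this thread (BaseX 7.x); newer releases may name them differently.

```xquery
(: Assumption: 'annotate-abcdef' already exists and holds one document whose
   root element is <root/>. The context database is 'collect'. :)
for $i in //entry[phraseme = 'abcdef' and selected = 'yes']
let $node := db:open-id('TG-DTA-GerManC-stemming-ws', $i/node)
return insert node
  <div>{
    $i/node,
    $i/query,
    (: node-name() keeps the namespace of the original element :)
    element { node-name($node) } { string($node) }
  }</div>
  as last into db:open('annotate-abcdef')/root
```

The XUDY0027 error above arises because db:open('annotate-abcdef') returns nothing for a database without documents, so the insert has no target node.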

Thanks in advance and best regards

Cerstin


Re: [basex-talk] text() vs string()

2013-01-29 Thread Cerstin Elisabeth Mahlow
Hi Wendell,

On 28.01.2013 at 22:27, Wendell Piez wrote:

1. Unless I learn better, I'm going to prefer [B] or [C], because in
my world, mixed content is common; is there any reason (performance or
otherwise) to prefer [A] in cases where I know it will be robust? Is
there any reason to prefer [B] or prefer [C]?

My world is a world of mixed content, too.  So with queries like [A], you 
miss a lot of things you want to retrieve.  However, [A] is the only 
possibility of making use of the index.  So with [B] or [C] you would in 
principle get all the hits you are interested in, but in practice you never 
get them because of performance issues.
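
A tiny illustration of why text() and the string value behave differently on mixed content. The element hi is a made-up inline element, and the fragment is built in the query itself, so this only shows the semantics and says nothing about index use or about which of [A], [B] and [C] it corresponds to.

```xquery
let $p := <p>durch <hi>Johann</hi> Ballhorn öfter</p>
return (
  count($p/text()),                            (: 2 text nodes around <hi> :)
  $p/text() = 'durch Johann Ballhorn öfter',   (: false: no single text node matches :)
  string($p) = 'durch Johann Ballhorn öfter'   (: true: string value spans all descendants :)
)
```

A comparison against text() only ever sees the fragments between inline elements, which is why hits get lost as soon as a phrase crosses markup.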

Flattening the structure in the first place, i.e., getting rid of all 
non-structural information not really relevant for your particular query, and 
then applying [A] would be a bad idea when your user scenario involves 
inspecting the hits in the original context, i.e., including all formatting, 
and annotating hits back into the original text.

As I see it, the handling of mixed content is the biggest obstacle when working 
with BaseX in the Humanities.

For some reason, eXist seems capable of handling mixed content AND using the 
index.  But when I experimented with it, it wasn't that stable, so I came back 
to BaseX, and my users know that it is very likely some hits will be missed 
when querying the corpus.  However, for every query they are interested in, 
they formulate several XQueries with different search terms; this way they 
get hold of almost everything eXist was able to find.  I can show some 
examples at the BaseX user meeting in Prague.

Best regards

Cerstin


Re: [basex-talk] Upcoming: XMLPrague 2013

2012-12-14 Thread Cerstin Elisabeth Mahlow
Hi Christian,

On 21.11.2012 at 02:46, Christian Grün wrote:

The XMLPrague Conference is the most important European event for XML
specialists, developers and users. Next year, it will take place from
Feb 8-10:

 http://www.xmlprague.cz/

Once again, our team will be present there, and it will be a pleasure
for us to see you and talk to you live!

I will be there, just registered :)

What may be even more interesting: we are planning yet another BaseX
User Meeting on the first day (Friday). We have just started to
compile our agenda. As many of you are using BaseX for both exciting
projects and in productive environments, it would be great to have
some of you talk about your experiences with BaseX.

Would you be interested to contribute?

Yes, I can show some of my awkward Perl scripts and the overall structure of 
our project.  It would be great to get some input on improving performance and 
on how to deal with digital humanities data and users.

Best regards

Cerstin
--
Dr. phil. Cerstin Mahlow

Universität Basel
Deutsches Seminar
Nadelberg 4
4051 Basel
Schweiz

Tel:  +41 61 267 07 65
Fax: +41 61 267 34 40
Mail: cerstin.mah...@unibas.ch
Web: http://www.oldphras.net



Re: [basex-talk] slow query

2012-11-16 Thread Cerstin Elisabeth Mahlow
Hi Christian,

On 15.11.2012 at 20:00, Christian Grün wrote:

for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() = "yes"]]

It’s often beneficial to avoid nested predicates. Does the following
version give you better results?

  //entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]

It saves one or two seconds. But I will use this for other queries as well, thanks!

Beside that, feel free to send us the query info (the output of the
Info View), as it often indicates potential for additional
optimizations.

OK, here is the query info. Most time is used for evaluation, also printing 
takes some time, but parsing and compiling looks pretty fast, I think.



Query:
for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() = "yes"]]
let $query := $i/query
let $node := $i/node
let $prefix := fn:in-scope-prefixes($i)
let $title := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt[1]//*:title[1]
let $author := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:sourceDesc[1]//*:bibl[1]//*:author[1]
let $note := db:open-id('TG-DTA-GerManC-stemming-ws', $node)
  /ancestor::*:TEI[1]//*:notesStmt//*:note
let $expr := concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")")
let $time := data($i/@time)
return
<div>
  <hit count="{$p}">
    <p><input type="checkbox" name="NODE" value="{$node}"/><b class="hitno">{$p} ({
      if($prefix = "dta") then "DTA" else "TG"})</b>Knoten: {$i/node}</p>
    {xquery:eval($expr)}
  </hit>
  <bib>
    <p class="bibl"><b>{$time}</b><br/><b>Bibliographie</b> {data($author)}: {data($title)}
      <br/><b>Anmerkung</b>: {data($note)}<br/> <b>Korpus</b>: {
      if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek"}</p>
  </bib>
  <p></p>
</div>

Compiling:
- rewriting And expression to predicate(s)
- rewriting fn:boolean(phraseme[text() = Ad0194])
- rewriting fn:boolean(selected[text() = yes])
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)

Result: for $i at $p as xs:integer in document-node { collect.xml 
}/descendant::entry[phraseme[text() = Ad0194]][selected[text() = yes]] let 
$query := $i/query let $node := $i/node let $prefix := fn:in-scope-prefixes($i) 
let $title := db:open-id(TG-DTA-GerManC-stemming-ws, 
$node)/ancestor::*:TEI[1]/descendant-or-self::node()/*:fileDesc[1]/descendant-or-self::node()/*:titleStmt[1]/descendant-or-self::node()/*:title[1]
 let $author := db:open-id(TG-DTA-GerManC-stemming-ws, 
$node)/ancestor::*:TEI[1]/descendant-or-self::node()/*:sourceDesc[1]/descendant-or-self::node()/*:bibl[1]/descendant-or-self::node()/*:author[1]
 let $note := db:open-id(TG-DTA-GerManC-stemming-ws, 
$node)/ancestor::*:TEI[1]/descendant::*:notesStmt/descendant::*:note let $expr 
:= fn:concat(ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', , $node, ) , 
$query, )) let $time := fn:data($i/@time) return element div { element hit { 
attribute count { $p }, element p { element input { attribute type { checkbox 
}, attribute name { NODE }, attribute value { $node } }, element b { 
attribute class { hitno }, $p,  (, if($prefix = dta) then DTA else 
TG, ) }, Knoten: , $i/node }, xquery:eval($expr) }, element bib { element 
p { attribute class { bibl }, element b { $time }, element br { () }, element 
b { Bibliographie }, fn:data($author), : , fn:data($title), element br { () 
}, element b { Anmerkung }, : , fn:data($note), element br { () }, element 
b { Korpus }, : , if($prefix = dta) then Deutsches Textarchiv else 
TextGrid Digitale Bibliothek } }, element p { () } }

Timing:
 - Parsing:  14.63 ms
 - Compiling:  33.34 ms
 - Evaluating:  12216.87 ms
 - Printing:  449.52 ms
 - Total Time:  12714.37 ms

Result:
- Hit(s): 676 Items
- Updated: 0 Items
- Printed: 2048 KB

Query plan:
QueryPlan
  FLWR
For var=$i pos=$p as xs:integer
  IterPath
DBNode name=collect-ws pre=0/
IterStep axis=descendant test=entry
  AxisPath
IterStep axis=child test=phraseme
  CmpG op==
AxisPath
  IterStep axis=child test=text()/
/AxisPath
Str value=Ad0194 type=xs:string/
  /CmpG
/IterStep
  /AxisPath
  AxisPath
IterStep axis=child test=selected
  CmpG op==
AxisPath
  IterStep axis=child test=text()/
/AxisPath
Str value=yes type=xs:string/
  /CmpG
/IterStep
  /AxisPath
/IterStep
  /IterPath
/For
Let var=$query
  AxisPath
VarRef
  Var name=$i id=0/
/VarRef
IterStep axis=child test=query/
  /AxisPath
/Let
Let var=$node
  AxisPath
VarRef
  Var name=$i id=0/
/VarRef
IterStep axis=child test=node/
  /AxisPath
/Let
Let var=$prefix
  FNQName name=in-scope-prefixes(elem)
VarRef
  Var name=$i id=0/
/VarRef
  /FNQName
  

[basex-talk] index:facets()

2012-10-05 Thread Cerstin Elisabeth Mahlow
Hi,

after Andreas recommended using index:facets(), my application got faster.

However, I don't think that this is the best solution.  The database I apply 
this function to is changing constantly.  As the function uses the index, I 
would have to re-create the index first, is this correct?  Otherwise the 
function would give wrong results most of the time.

And I found another strange thing: I had to delete some nodes and re-created 
all indexes.  However, index:facets() still gives the information from the 
status *before* the deletion, i.e., it counts nodes that aren't there anymore.  
Therefore I don't use it.

count(//entry/selected[text() = "yes"])

gives the correct result,

index:facets("collect", "flat")//element[@name = "selected"]/entry[text() = 
"yes"]/@count/data()

still gives the wrong result, i.e., the result that was correct some days ago.

Is this a bug or a feature?

Best regards

Cerstin


Re: [basex-talk] index:facets()

2012-10-05 Thread Cerstin Elisabeth Mahlow
Hi Andreas,

I created all indexes as such and I clicked optimize.  I also closed the GUI, 
reopened it, dropped all indexes and ran the optimization again.

However, your comment implicitly confirms that I can use index:facets() only 
when the database is somewhat stable and not in constant flux as in our 
scenario.

Best regards

Cerstin

From: Andreas Weiler [andreas.wei...@uni-konstanz.de]
Sent: Friday, 5 October 2012 18:51
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] index:facets()

Hi Cerstin,

 re-created all indexes

which indexes did you re-create?
The information returned by index:facets is stored in the path index, and 
therefore only "optimize all" updates this index structure.

-- Andreas
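
For readers who want to trigger this from a query rather than from the GUI, a minimal sketch. It assumes the database is called 'collect', as in the queries above, and that passing true() as the second argument of db:optimize corresponds to "optimize all"; please check this against the documentation of the BaseX version in use.

```xquery
(: Run as its own query once the pending updates have been written.
   The second argument requests a complete rebuild of the database
   structures instead of an incremental optimization. :)
db:optimize('collect', true())
```

On a database that changes constantly this has to be repeated after every batch of updates, which is the cost weighed in the reply above.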



Re: [basex-talk] bug: cancel = shut down

2012-10-04 Thread Cerstin Elisabeth Mahlow
Hi Michael,

yes, I use Mac OS X. 

Cerstin

From: Michael Seiferle [m...@basex.org]
Sent: Thursday, 4 October 2012 10:41
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] bug: cancel = shut down

Hi Cerstin,

True, this is an annoying bug (you are running Mac OS X, I assume) that will 
eventually be fixed.
I thought we had this on GitHub already, but it looks like it's missing.
Well:
 On 24.04.2012 at 22:27, Christian Grün <christian.gr...@gmail.com> wrote:

 you stumbled upon one of the issues that are specific to Mac OSX (the
 Cancel button works on all Windows and Linux distributions we are
 aware of). […]
 As I'm one of the few in our group who hasn't switched to Mac.. I'll
 have to pass this on to the others.. Anyone interested in having a
 look at this issue?
I'll add it to our issues on GitHub now. This won't fix it yet, but I hope it 
will at least increase the chance of having it fixed sooner rather than later.

Best from Konstanz
Michael
On 03.10.2012 at 19:22, Cerstin Elisabeth Mahlow 
<cerstin.mah...@unibas.ch> wrote:

 Hi,

 when closing BaseX GUI, you will be asked to save edited files. When you 
 click cancel, it shuts down immediately.  Shouldn't it cancel the closing 
 process but not the application?

 Best regards

 Cerstin


[basex-talk] bug: cancel = shut down

2012-10-03 Thread Cerstin Elisabeth Mahlow
Hi,

when closing BaseX GUI, you will be asked to save edited files. When you click 
cancel, it shuts down immediately.  Shouldn't it cancel the closing process 
but not the application?

Best regards

Cerstin


Re: [basex-talk] slow processing

2012-10-02 Thread Cerstin Elisabeth Mahlow
Hi Andreas,

Using only one session saves a few seconds.

When I run the queries in the GUI on the same server, I get this:

(total number of positive hits)
index:facets("collect", "flat")//element[@name = "selected"]/entry[text() = 
"yes"]/@count/data()

takes 2 to 4 ms

(number of phrasemes searched)
count(distinct-values(//entry/phraseme/text()))

takes 370 to 500 ms

(create info table)
for $phraseme in distinct-values(//entry/phraseme)
let $nodes  := //phraseme[text() = $phraseme]
let $count  := count($nodes[../selected[text() = "yes"]])
let $person := distinct-values($nodes/../person)
order by $phraseme
return <tr><td>({$phraseme})</td> <td>{$person}</td> <td>{$count}</td><td><a 
href="basex-show-phraseme.pl?phraseme={$phraseme}">anzeigen/aussortieren</a></td></tr>

takes 1000 to 1400 ms

(last timestamp)
let $i := //entry/@time order by $i/@time ascending 
return <p>Letzte Bearbeitung: {data($i[last()])}</p>

takes 250 to 380 ms
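
A side note on the last-timestamp query above: because $i is bound with let, the FLWOR expression has a single tuple, so the order by clause has nothing to sort and $i[last()] is simply the last @time attribute in document order. If the latest timestamp is wanted, a variant along the following lines might be closer to the intent. It assumes that the time values sort correctly as strings (for example ISO-style timestamps), which the thread does not state.

```xquery
(: Sort the attributes themselves, then take the last one.
   Assumption: @time values are strings whose lexical order is
   also their chronological order. :)
let $times :=
  for $t in //entry/@time
  order by string($t) ascending
  return $t
return <p>Letzte Bearbeitung: { data($times[last()]) }</p>
```

The timing reported above refers to the original query, not to this variant.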


However, I just switched to using count() for the number of phrasemes accessed. 
Before, I took the distinct values, split them into an array, and then used 
the number of indices, which probably took a lot of time.  Using count() and 
dropping the splitting results in the page showing up in 2 to 3 seconds. 
Perfect!

Thanks for helping!  I will probably soon ask for help with another slow 
process :-)

Best regards

Cerstin


From: Andreas Weiler [andreas.wei...@uni-konstanz.de]
Sent: Tuesday, 2 October 2012 10:34
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] slow processing

Hi Cerstin,

can you check each single query contained in the script with the GUI and see 
how much time each one takes?

Why are you creating a new session for each query? You should be able to take 
the same session for all queries.

-- Andreas
