[basex-talk] server time out
Hi,

I run several web applications that use CGI scripts to read and update BaseX databases via XQuery. I start the server on a CentOS machine with the basexserver -S command. Maybe this is not the best idea. I have already split the data into several databases to avoid concurrent reading and writing, and a simple cron job tests whether the BaseX server is still running and restarts it if it is not.

However, my users get (or rather cause) a timeout from time to time. If the BaseX server then shuts down, the cron job restarts it and everything is fine again. In the last few weeks, however, I have noticed that after such a timeout the BaseX server still seems to be running but no longer answers any XQuery. I then have to kill the process by hand and restart it.

Could you recommend a better solution to keep the BaseX server running smoothly even if a timeout occurs now and then?

Thanks in advance and best regards
Cerstin

--
Dr. phil. Cerstin Mahlow
Universität Basel
Departement Sprach- und Literaturwissenschaften
Fachbereich Deutsche Sprach- und Literaturwissenschaft
Nadelberg 4
4051 Basel
Schweiz
Tel: +41 61 267 07 65
Fax: +41 61 267 34 40
Mail: cerstin.mah...@unibas.ch
Web: http://www.oldphras.net
___
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
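The failure described in this message (the process is alive but no longer answers) is something a "is the process running?" cron check cannot see. One possible refinement, sketched here in Python, is to probe the server port instead. The sketch assumes the default BaseX port 1984 and assumes that the server sends a few greeting bytes right after a client connects; both should be checked against the installed version. The function name `server_responds` is made up for this example, and the restart step is deliberately left out.

```python
import socket


def server_responds(host="localhost", port=1984, timeout=5.0):
    """Return True if host:port accepts a connection AND sends at least
    one byte within `timeout` seconds.

    Assumption: the server greets every new client with some bytes.
    A process that is still alive but hung accepts the connection (or
    not) and then stays silent, so the read times out and we get False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            return len(conn.recv(1)) == 1
    except OSError:
        # Covers refused connections, unreachable hosts and timeouts.
        return False


if __name__ == "__main__":
    # A cron job could call this and restart the server on failure.
    print("ok" if server_responds() else "not responding")
```

A stricter check would log in and run a trivial query through one of the BaseX client libraries, since a greeting alone does not prove that queries are still being processed.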
Re: [basex-talk] whitespace around comments
Hi Christian,

On 12.04.2013 at 10:49, Christian Grün <christian.gr...@gmail.com> wrote:

> our CHOP flag is subject to frequent discussions, which is why we will
> eventually change the default to FALSE.

I really second that!

> For now, we are still a little bit resistant, as such a change will
> change the behavior of existing BaseX applications out there, so we’ll
> probably combine the switch with the next major release.
>
> For now, you can preserve whitespaces by e.g.
>
> -- adding the line CHOP=false in your .basex configuration file
> -- using the basex command-line flag -w
> -- using set chop false as first command,
>
> or setting the options in any other way described in our Wiki [1].

The problem is that you become aware of this only AFTER you have created a DB and worked with it. Unfortunately, users are not told when creating a DB that they should think about whitespace. And there is no reason for a user to assume that creating a DB would semantically change their data.

In the Digital Humanities, it is all about mixed content (another major issue, I think), as in TEI-annotated data, and of course this involves whitespace.

The worst thing at the moment is that you cannot get your whitespace back once you figure out that you should have preserved it explicitly. I had to recreate the DB, recode the node IDs in dependent DBs, and so on.

So, yes please, make preserving whitespace the default behavior!

Best regards
Cerstin
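To illustrate why chopping whitespace changes the meaning of mixed content, here is a small Python sketch. It is only an emulation under a stated assumption (leading and trailing whitespace is stripped from every text node, and text nodes that become empty are dropped); it is not BaseX code, and the `hi` element and the `chop` helper are made up for the example.

```python
import xml.etree.ElementTree as ET


def chop(elem):
    """Emulate whitespace chopping in place: strip every text node and
    drop the ones that end up empty. Simplified model, not BaseX itself."""
    for node in elem.iter():
        node.text = (node.text or "").strip() or None
        node.tail = (node.tail or "").strip() or None
    return elem


# Mixed content: the only thing separating the two names is a
# whitespace-only text node between two inline elements.
SAMPLE = "<p>durch <hi>Johann</hi> <hi>Ballhorn</hi> öfter</p>"

original = "".join(ET.fromstring(SAMPLE).itertext())
chopped = "".join(chop(ET.fromstring(SAMPLE)).itertext())

print(original)  # durch Johann Ballhorn öfter
print(chopped)   # durchJohannBallhornöfter
```

Once the separating whitespace is gone, the word boundaries cannot be reconstructed from the stored data, which matches the experience described above of having to recreate the database.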
[basex-talk] xquery help
Hi,

I got stuck with an XQuery, could you please help?

I have data like this:

    <collection>
      <entry>
        <q>[text() contains text ('Johann' ftand 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q]</q>
        <id>12345</id>
        <p>bey welchen die Verbesserungen durch Johann Ballhorn öfter vorkommen als man glauben sollte.</p>
      </entry>
      <entry>…
    </collection>

All I want to get back is the p node with the content of the q node applied to it using ft:mark, i.e., the node with highlighting like this:

    <p>bey welchen die Verbesserungen durch <mark>Johann</mark> <mark>Ballhorn</mark> öfter vorkommen als man glauben sollte.</p>

But I cannot manage to call ft:mark with the correct parameters. I tried variants of this:

    for $i at $p in //entry
    let $q := $i/q
    return ft:mark($i $q)

But this gives

    Expecting closing bracket for 'ft:mark(…'

I think I have to use concat() in some way, but how?

Thanks in advance
Cerstin
Re: [basex-talk] xquery help
Hi Alex,

thanks for your answer!

On 26.03.2013 at 18:43, Alexander Holupirek <alexander.holupi...@uni-konstanz.de> wrote:

> if I got you right you want to evaluate a query, such as
>
>     ft:mark(<p>bey welchen die Verbesserungen durch Johann Ballhorn öfter vorkommen als man glauben sollte.</p>[./text() contains text ('Johann' ftand 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q])
>
> which you construct from your collection?
>
> <p>bey welchen ...</p> is a database node and
> <q>[text() contains text ('Johann' ... ]</q> holds a query predicate as string
>
> So you dynamically construct a query string that you like to evaluate?

I already have the complete query stored in the very same collection.

> Since ft:mark() operates on database nodes, simply typing
>
>     ft:mark(<p>bey welchen die Verbesserungen durch Johann Ballhorn öfter vorkommen als man glauben sollte.</p>[./text() contains text ('Johann' ftand 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q])
>
> in BaseXGUi results in
>
>     [BXDB0001] ft:mark(element p { ("bey welchen die Verbesserungen durch Johann Ballhorn öfter vorkommen als man glauben sollte.") }[text() contains text ("Johann" ftand "Ballhorn") ordered distance(0-5 word)][(self::*:p or self::*:q)]): database node expected.

Yes, I tried this :)

> […]
>
>     let $in := doc('cm')
>     let $p := $in//p
>     let $q := $in//q
>     let $qs := concat('ft:mark($p', $q, ')')
>     return xquery:eval($qs)
>
> evaluates the query string ... aehh .. tries to evaluate the query string
>
>     [XPST0008] Undefined variable $p.

And also this.

> http://docs.basex.org/wiki/XQuery_Module#xquery:eval
> shows how to pass a binding into the query to be evaluated
>
>     let $in := doc('cm')
>     let $p := $in//p
>     let $q := $in//q
>     let $qs := concat('ft:mark($binding', $q, ')')
>     let $bm := map{ '$binding' := $p }
>     return xquery:eval($qs, $bm)
>
> constructs a query string
>
>     ft:mark($binding[text() contains text ('Johann' ftand 'Ballhorn') distance at most 5 words ordered][self::*:p or self::*:q])
>
> and a binding of $binding and results in
>
>     <p>bey welchen die Verbesserungen durch <mark>Johann</mark> <mark>Ballhorn</mark> öfter vorkommen als man glauben sollte.</p>
>
> Which, I hope, is the result you wanted to achieve.

OK, I now have this:

    for $i at $p in //entry
    let $q := $i/q
    let $text := if ($i/p) then $i/p else $i/l
    let $ft := concat('ft:mark($binding', $q, ')')
    let $bm := map{'$binding' := $text}
    return xquery:eval($ft, $bm)

The if clause picks the p or l node directly, so I can drop the [self::*:p or self::*:q] extension from the stored query.

Thanks for your help!
Cerstin
[basex-talk] replace value vs. insert node
Hi,

in terms of performance, is it cheaper to update the value of an existing node or to insert a new node with that value?

I am not sure which approach to take: either I create empty nodes (or nodes with a default value) up front and update them when users specify values, or I insert nodes with these values only when users specify them in the web interface.

The values are short texts like "yes" or "no". The one node I have to update in any case holds a longer text with some markup in it (for highlighting single tokens).

UPDINDEX is set for the DB, and I will have to optimize the index after each user interaction, because I need the full-text index in the next interaction.

So the options are actually these:

- insert 6 nodes, replace value of 1 node, optimize
- replace values of 7 nodes, optimize

Thanks in advance and best regards
Cerstin
Re: [basex-talk] store results of a query in a new database
Hi Christian,

On 18.03.2013 at 23:53, Christian Grün <christian.gr...@gmail.com> wrote:

>> And my assumption isn't true: The document has to exist, I cannot
>> specify a non-existing document, i.e., a document I maybe would like
>> to produce later as an export of the DB.
>
> This sounds surprising to me, as I don’t get any errors when running
> e.g. the following command:
>
>     basex "db:create('db', <root/>, 'doesnotexist.xml')"
>
> How does your command call look like?

It is in the CGI script and looks like this:

    $session->execute("xquery declare option db:ftindex 'on'; declare option db:updindex 'on'; db:create('annotate-$phraseme', <root/>, 'annotate-$phraseme.xml')");

$phraseme holds the ID to be used.

Funnily enough, it works today without error, although it did not work two days ago, and I did not install any updates in between. So it fixed itself, thanks!

Cerstin
Re: [basex-talk] store results of a query in a new database
Hi Christian,

On 18.03.2013 at 15:54, Christian Grün <christian.gr...@gmail.com> wrote:

>> So I have the DB 'collect' open and then do:
>>
>>     db:create('annotate-abcdef')
>
> as you guessed, you’ll have to specify a document path, in which you
> can then add new nodes. Every XML document has a root node, and vice
> versa, so it’s conceptually not possible to only create a root node
> without a document. Please note, however, that your second argument is
> simply the name and path of your document, and does not read any input.

Ah, so I had better use

    db:create('annotate-$name', <root/>, 'annotate-$name.xml')

Somehow this is not clear from the Wiki; I thought I had to use an existing document with some data in it. So this document does not have to exist beforehand?

>> How do I set UPDINDEX ON for the new DB?
>
> The following lines should do what you need:
>
>     declare option db:ftindex "on";
>     declare option db:updindex "on";
>     db:create('annotate', <root/>, 'root.xml')
>
> If you activate the FTINDEX option before creating a database, the
> OPTIMIZE call will always create/update your full-text index.

Ah, I will try this.

>> To be able to use ft:mark, I would have to optimize the DB with
>> db:optimize() after having added nodes, is this correct?
>
> ft:mark() also works without full-text index, but the index solution is
> usually faster.

OK, since some of the DBs will have more than 400 entries, it is safer to always optimize the index after updating.

Thanks and best regards
Cerstin
Re: [basex-talk] re-sort database
Hi,

On 14.03.2013 at 00:02, Liam R E Quin <l...@w3.org> wrote:

> On Wed, 2013-03-13 at 22:29 +0100, Christian Grün wrote:
>> You could try to export your data and create a new database without
>> updatable index structures; this could also speed up your updates.
>> Maybe it even allows you to update all nodes in a single run.
>
>> I already set VM=-Xmx1024m and I use BaseX 7.6.1 Beta from February 14
>> on a MacBook Air with a 2 GHz processor and 8 GB RAM.
>
> I'd try using VM=-Xmx6000m if you have 8G of RAM.

OK, after combining both tips (using a database without updatable index and setting VM=-Xmx6000m), it worked in a single run. Thanks!

It took 5'729'855 ms (about 95 minutes) to update 35'344 nodes within the 165'000 entries in the database. I don't know whether this is slow and could be improved, but I am happy to have fixed the database :)

Best regards and thanks again
Cerstin
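To put the run reported above into perspective, the figures can be turned into a per-node cost. This is plain arithmetic on the two numbers from the message, shown as a short Python sketch; it says nothing about whether BaseX could do better.

```python
elapsed_ms = 5_729_855   # total run time reported in the message
updated_nodes = 35_344   # nodes updated in that run

minutes = elapsed_ms / 60_000
ms_per_node = elapsed_ms / updated_nodes

print(f"{minutes:.1f} min total, {ms_per_node:.0f} ms per updated node")
# 95.5 min total, 162 ms per updated node
```

Roughly 162 ms per updated node is a useful baseline when comparing against a later run with different index or memory settings.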
[basex-talk] store results of a query in a new database
Hi,

I would like to store the results of an XQuery that extracts some nodes from an existing database in a new database.

So I have the DB 'collect' open and then do:

    db:create('annotate-abcdef')

and then

    for $i at $p in //entry[phraseme[text() = "abcdef"] and selected[text() = "yes"]]
    let $query := $i/query
    let $nodeid := $i/node
    let $node := db:open-id('TG-DTA-GerManC-stemming-ws', $nodeid)
    let $nodename := name($node)
    let $nodecontent := string($node)
    return insert node <div>{$nodeid} {$query} {element {$nodename} {$nodecontent}} </div> as last into db:open('annotate-abcdef')

However, the error message is

    [XUDY0027] Insert target must not be empty.

How would I add a root element to the new DB, given that I don't want to link it to an existing document? All nodes added to the DB are only results of queries over an existing DB. All I can find in the Wiki is how to use a document as initial data for the DB.

How do I set UPDINDEX ON for the new DB? I will later update the information added in the first place by means of other queries.

To be able to use ft:mark, I would have to optimize the DB with db:optimize() after having added nodes, is this correct?

Thanks in advance and best regards
Cerstin
Re: [basex-talk] text() vs string()
Hi Wendell,

On 28.01.2013 at 22:27, Wendell Piez wrote:

> 1. Unless I learn better, I'm going to prefer [B] or [C], because in my
> world, mixed content is common; is there any reason (performance or
> otherwise) to prefer [A] in cases where I know it will be robust? Is
> there any reason to prefer [B] or prefer [C]?

My world is a world of mixed content, too. So with queries like [A], you miss a lot of things you want to retrieve. However, [A] is the only way to make use of the index. So with [B] or [C] you would in principle get all the hits you are interested in, but in practice you never get them because of performance issues.

Flattening the structure in the first place, i.e., getting rid of all non-structural information that is not really relevant for your particular query, and then applying [A] would be a bad idea when your user scenario involves inspecting the hits in the original context, i.e., including all formatting, and annotating hits back into the original text.

As I see it, the handling of mixed content is the biggest obstacle when working with BaseX in the Humanities. For some reason, eXist seems capable of handling mixed content AND using the index. But when I experimented with it, it wasn't that stable, so I came back to BaseX, and my users know that some hits will very likely be missed when querying the corpus. However, for every query they are interested in, they formulate various XQueries with different search terms; this way they get hold of almost everything eXist was able to find.

I can show some examples at the BaseX user meeting in Prague.

Best regards
Cerstin
Re: [basex-talk] Upcoming: XMLPrague 2013
Hi Christian,

On 21.11.2012 at 02:46, Christian Grün wrote:

> The XMLPrague Conference is the most important European event for XML
> specialists, developers and users. Next year, it will take place from
> Feb 8-10:
>
> http://www.xmlprague.cz/
>
> Once again, our team will be present there, and it will be a pleasure
> for us to see you and talk to you live!

I will be there, just registered :)

> What may be even more interesting: we are planning yet another BaseX
> User Meeting on the first day (Friday). We have just started to compile
> our agenda. As many of you are using BaseX for both exciting projects
> and in productive environments, it would be great to have some of you
> talk about your experiences with BaseX. Would you be interested to
> contribute?

Yes, I can show some of my awkward Perl scripts and the overall structure of our project. It would be great to get some input on improving performance and on how to deal with digital humanities data and users.

Best regards
Cerstin
Re: [basex-talk] slow query
Hi Christian,

On 15.11.2012 at 20:00, Christian Grün wrote:

>> for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() = "yes"]]
>
> It’s often beneficial to avoid nested predicates. Does the following
> version give you better results?
>
>     //entry[phraseme/text() = "Ad0194" and selected/text() = "yes"]

It saves one or two seconds. But I will use this for other queries as well, thanks!

> Beside that, feel free to send us the query info (the output of the
> Info View), as it often indicates potential for additional
> optimizations.

OK, here is the query info. Most of the time is spent on evaluation; printing also takes some time, but parsing and compiling look pretty fast, I think.

Query:

    for $i at $p in //entry[phraseme[text() = "Ad0194"] and selected[text() = "yes"]]
    let $query := $i/query
    let $node := $i/node
    let $prefix := fn:in-scope-prefixes($i)
    let $title := db:open-id('TG-DTA-GerManC-stemming-ws', $node)/ancestor::*:TEI[1]//*:fileDesc[1]//*:titleStmt[1]//*:title[1]
    let $author := db:open-id('TG-DTA-GerManC-stemming-ws', $node)/ancestor::*:TEI[1]//*:sourceDesc[1]//*:bibl[1]//*:author[1]
    let $note := db:open-id('TG-DTA-GerManC-stemming-ws', $node)/ancestor::*:TEI[1]//*:notesStmt//*:note
    let $expr := concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")")
    let $time := data($i/@time)
    return
    <div>
      <hit count="{ $p}">
        <p><input type="checkbox" name="NODE" value="{$node}"/><b class="hitno">{$p} ({ if($prefix = "dta") then "DTA" else "TG"})</b>Knoten: {$i/node}</p>
        {xquery:eval($expr)}
      </hit>
      <bib>
        <p class="bibl"><b>{$time}</b><br/><b>Bibliographie</b> { data($author)}: { data($title)} <br/><b>Anmerkung</b>: { data ($note) }<br/> <b>Korpus</b>: { if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek"}</p>
      </bib>
      <p></p>
    </div>

Compiling:
- rewriting And expression to predicate(s)
- rewriting fn:boolean(phraseme[text() = "Ad0194"])
- rewriting fn:boolean(selected[text() = "yes"])
- simplifying descendant-or-self step(s)
- simplifying descendant-or-self step(s)

Result:

    for $i at $p as xs:integer in document-node { "collect.xml" }/descendant::entry[phraseme[text() = "Ad0194"]][selected[text() = "yes"]]
    let $query := $i/query
    let $node := $i/node
    let $prefix := fn:in-scope-prefixes($i)
    let $title := db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI[1]/descendant-or-self::node()/*:fileDesc[1]/descendant-or-self::node()/*:titleStmt[1]/descendant-or-self::node()/*:title[1]
    let $author := db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI[1]/descendant-or-self::node()/*:sourceDesc[1]/descendant-or-self::node()/*:bibl[1]/descendant-or-self::node()/*:author[1]
    let $note := db:open-id("TG-DTA-GerManC-stemming-ws", $node)/ancestor::*:TEI[1]/descendant::*:notesStmt/descendant::*:note
    let $expr := fn:concat("ft:mark(db:open-id('TG-DTA-GerManC-stemming-ws', ", $node, ") ", $query, ")")
    let $time := fn:data($i/@time)
    return element div {
      element hit {
        attribute count { $p },
        element p {
          element input { attribute type { "checkbox" }, attribute name { "NODE" }, attribute value { $node } },
          element b { attribute class { "hitno" }, $p, " (", if($prefix = "dta") then "DTA" else "TG", ")" },
          "Knoten: ", $i/node
        },
        xquery:eval($expr)
      },
      element bib {
        element p {
          attribute class { "bibl" },
          element b { $time }, element br { () },
          element b { "Bibliographie" }, fn:data($author), ": ", fn:data($title), element br { () },
          element b { "Anmerkung" }, ": ", fn:data($note), element br { () },
          element b { "Korpus" }, ": ", if($prefix = "dta") then "Deutsches Textarchiv" else "TextGrid Digitale Bibliothek"
        }
      },
      element p { () }
    }

Timing:
- Parsing: 14.63 ms
- Compiling: 33.34 ms
- Evaluating: 12216.87 ms
- Printing: 449.52 ms
- Total Time: 12714.37 ms

Result:
- Hit(s): 676 Items
- Updated: 0 Items
- Printed: 2048 KB

Query plan:

    <QueryPlan>
      <FLWR>
        <For var="$i" pos="$p as xs:integer">
          <IterPath>
            <DBNode name="collect-ws" pre="0"/>
            <IterStep axis="descendant" test="entry">
              <AxisPath>
                <IterStep axis="child" test="phraseme">
                  <CmpG op="=">
                    <AxisPath>
                      <IterStep axis="child" test="text()"/>
                    </AxisPath>
                    <Str value="Ad0194" type="xs:string"/>
                  </CmpG>
                </IterStep>
              </AxisPath>
              <AxisPath>
                <IterStep axis="child" test="selected">
                  <CmpG op="=">
                    <AxisPath>
                      <IterStep axis="child" test="text()"/>
                    </AxisPath>
                    <Str value="yes" type="xs:string"/>
                  </CmpG>
                </IterStep>
              </AxisPath>
            </IterStep>
          </IterPath>
        </For>
        <Let var="$query">
          <AxisPath>
            <VarRef>
              <Var name="$i" id="0"/>
            </VarRef>
            <IterStep axis="child" test="query"/>
          </AxisPath>
        </Let>
        <Let var="$node">
          <AxisPath>
            <VarRef>
              <Var name="$i" id="0"/>
            </VarRef>
            <IterStep axis="child" test="node"/>
          </AxisPath>
        </Let>
        <Let var="$prefix">
          <FNQName name="in-scope-prefixes(elem)">
            <VarRef>
              <Var name="$i" id="0"/>
            </VarRef>
          </FNQName>
[basex-talk] index:facets()
Hi,

after Andreas recommended using index:facets(), my application has become faster. However, I don't think this is the best solution. The database I apply this function to is changing constantly. As the function uses the index, I would have to re-create the index first, is this correct? Otherwise the function would give wrong results most of the time.

And I found another strange thing: I had to delete some nodes and re-created all indexes. However, index:facets() still returns the information from *before* the deletion, i.e., it counts nodes that are no longer there. Therefore I don't use it.

    count(//entry/selected[text() = "yes"])

gives the correct result,

    index:facets("collect", "flat")//element[@name = "selected"]/entry[text() = "yes"]/@count/data()

still gives the wrong result, i.e., the result that was correct some days ago.

Is this a bug or a feature?

Best regards
Cerstin
Re: [basex-talk] index:facets()
Hi Andreas,

I created all indexes as such and clicked Optimize. I also closed the GUI, opened it again, dropped all indexes, and ran the optimization again.

However, your comment implicitly confirms that I can use index:facets() only when the database is reasonably stable and not in constant flux as in our scenario.

Best regards
Cerstin

From: Andreas Weiler [andreas.wei...@uni-konstanz.de]
Sent: Friday, 5 October 2012 18:51
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] index:facets()

Hi Cerstin,

> re-created all indexes

which indexes did you re-create? The information of index:facets is stored in the path index, and therefore only optimize all updates this index structure.

-- Andreas
Re: [basex-talk] bug: cancel = shut down
Hi Michael,

yes, I use Mac OS X.

Cerstin

From: Michael Seiferle [m...@basex.org]
Sent: Thursday, 4 October 2012 10:41
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] bug: cancel = shut down

Hi Cerstin,

true, this is an annoying bug (you are running Mac OS X, I assume) that will eventually be fixed. I thought we had this on GitHub already, looks like it's missing… well:

On 24.04.2012 at 22:27, Christian Grün <christian.gr...@gmail.com> wrote:

> you stumbled upon one of the issues that are specific to Mac OSX (the
> Cancel button works on all Windows and Linux distributions we are
> aware of). […] As I'm one of the few in our group who hasn't switched
> to Mac.. I'll have to pass this on to the others.. Anyone interested
> in having a look at this issue?

I'll add it to our issues on GH now. This won't fix it yet, but I hope it will at least increase the chance of having it fixed sooner rather than later.

Best from Konstanz
Michael
[basex-talk] bug: cancel = shut down
Hi,

when closing the BaseX GUI, you are asked whether to save edited files. When you click Cancel, the GUI shuts down immediately. Shouldn't Cancel abort the closing process rather than close the application?

Best regards
Cerstin
Re: [basex-talk] slow processing
Hi Andreas,

using only one session saves a few seconds.

When I run the queries in the GUI on the same server, I get this:

(total number of positive hits)

    index:facets("collect", "flat")//element[@name = "selected"]/entry[text() = "yes"]/@count/data()

takes 2 to 4 ms

(number of phrasemes searched)

    count(distinct-values(//entry/phraseme/text()))

takes 370 to 500 ms

(create info table)

    for $phraseme in distinct-values(//entry/phraseme)
    let $nodes := //phraseme[text() = $phraseme]
    let $count := count($nodes[../selected[text() = "yes"]])
    let $person := distinct-values($nodes/../person)
    order by $phraseme
    return <tr><td>({$phraseme})</td> <td>{$person}</td> <td>{$count}</td><td><a href="basex-show-phraseme.pl?phraseme={$phraseme}">anzeigen/aussortieren</a></td></tr>

takes 1000 to 1400 ms

(last timestamp)

    let $i := //entry/@time
    order by $i/@time ascending
    return <p>Letzte Bearbeitung: {data($i[last()])}</p>

takes 250 to 380 ms

However, I have just switched to using count() for the number of phrasemes accessed. Before, I fetched the distinct values, split them into an array in the script, and then used the number of indices. This probably took a lot of time. Using count() and dropping the split, the page now shows up in 2 to 3 seconds. Perfect!

Thanks for helping! I will probably ask for help with another slow process soon :-)

Best regards
Cerstin

From: Andreas Weiler [andreas.wei...@uni-konstanz.de]
Sent: Tuesday, 2 October 2012 10:34
To: Cerstin Elisabeth Mahlow
Cc: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] slow processing

Hi Cerstin,

can you check each single query contained in the script with the GUI and see how much time each one takes?

Why are you creating a new session for each query? You should be able to use the same session for all queries.

-- Andreas