[basex-talk] Prolog and XQuery
Hi, My ultimate goal is to investigate the advantages of using Prolog and XQuery together when querying XML files. In doing so, I want to take advantage of BaseX's XQuery engine. Since there is no Prolog client, I initially started writing one. But in SWI-Prolog, I did not manage to connect using the web connection tools. So I first wrote a C++ client and then started writing a bridge in Prolog that uses that client. The starting point for the C++ client was the RBaseX client I published earlier. I converted all functionality from that client to C++. A preliminary version of the C++ client can be found at https://github.com/BenEngbers/BasexCpp. As soon as I manage to create a real shared object using CMake I will publish a final version. Regarding the server protocol for BaseX, however, I still have a question. According to that protocol, the ADD command accepts 4 arguments (\09 {name} {path} {input}). However, neither in R nor in C++ have I succeeded in using the {name} argument. Is this a bug in the protocol? I don't know if there is a need for a C++ client that implements the full server protocol, but in any case I enjoyed working on this project. Have fun, Ben Engbers
Re: [basex-talk] BaseX and Fedora 38
After entering "~/basex/bin/basexgui &" in a terminal, the BaseX GUI started as usual. After quitting BaseX I could restart it in the usual way (from the GNOME Shell). But after a cold reboot, starting BaseX from the GNOME Shell again resulted in a logoff. I've reported this as a bug to Fedora. Greetings, Ben On 27-04-2023 at 20:29, Ben Engbers wrote: My Linux session. Since basex 103 gave no problems, I guess that maybe they changed the Java version that is installed? I don't know. I filed a bug report in Bugzilla and will let you know what they say. For all other Fedora users, be warned!
Re: [basex-talk] BaseX and Fedora 38
My Linux session. Since basex 103 gave no problems, I guess that maybe they changed the Java version that is installed? I don't know. I filed a bug report in Bugzilla and will let you know what they say. For all other Fedora users, be warned! Ben On 27-04-2023 at 18:56, Christian Grün wrote: Hi Ben, my session logs off. What kind of session is this? Greetings, Christian
[basex-talk] BaseX and Fedora 38
Hi, Today I upgraded my PC to Fedora Linux 38. The only problem I encountered is that when I try to start basexgui, my session logs off. I can start the server and the RBaseX client works well, so it is probably only the GUI that is giving problems. Is it possible to see what's causing the failure? Ben Engbers
[basex-talk] Binding multiple items to 1 variable (server protocol)
Hi, My ultimate goal is to investigate, using a SWI-Prolog client and BaseX, whether a combined use of Prolog and XQuery offers advantages. To write a Prolog client, I first had to learn C++, and the spin-off from that is that I am now almost done writing a C++ client. As with writing the R client at the time, I have questions about applying the 'Bind' command. The server protocol contains the following sentence: "the two items xs:integer(123) and xs:string('ABC') are encoded as 123, \02, xs:integer, \01, ABC, \02, xs:string and \00" Does this mean that multiple items of different types can be bound to one (1) {name} variable? If so, in what situation could this be applied? Where can I find an example FLWOR query that uses this? Ben Engbers
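The byte layout described in the quoted protocol sentence can be sketched as follows. This is only an illustration in Python, not code from any BaseX client: the function name `encode_bound_sequence` is hypothetical, and the sketch assumes that values and type names contain no \00 or \FF bytes (so no escaping is applied).

```python
def encode_bound_sequence(items):
    """Encode (value, type) pairs as a single protocol string.

    Layout taken from the quoted sentence: value \\x02 type for each item,
    items separated by \\x01, the whole string terminated by \\x00.
    Assumption: no \\x00 or \\xFF bytes occur, so nothing is escaped.
    """
    parts = [
        value.encode("utf-8") + b"\x02" + xs_type.encode("utf-8")
        for value, xs_type in items
    ]
    return b"\x01".join(parts) + b"\x00"


if __name__ == "__main__":
    # The two items from the protocol sentence
    print(encode_bound_sequence([("123", "xs:integer"), ("ABC", "xs:string")]))
```

Whether the server accepts such a mixed-type sequence for a single variable is exactly the open question of this message; the sketch only shows how the bytes would be laid out.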
Re: [basex-talk] Socket specifications?
Thanks, In the meantime I have narrowed my problem down to the code that reads from the socket. I'll see if this works better when using poll() instead of select(). Ben On 20-01-2023 at 19:20, Liam R. E. Quin wrote: On Fri, 2023-01-20 at 18:31 +0100, Ben Engbers wrote: Whether reading from a socket is non-blocking is a function of the API you use on the client, not the server end. I didn't know
[basex-talk] Socket specifications?
Hi, When I developed the BaseX client for R, I ran into problems with the socket for a long time. In the end it turned out that in R I had to configure the socket as a non-blocking socket. This solved all performance issues! I am now trying to develop a client for SWI-Prolog. Because that compiler doesn't offer enough low-level socket support, I need to develop a library in C++ first. And in doing so, I again run into problems with the socket. The BaseX documentation just says to use a socket, but there is no information on how to configure the socket itself. My question is: how do I configure the client side of the socket for optimal use? Ben Engbers
[basex-talk] RBaseX version 1.1.2
Hi, Version 1.1.2 of the RBaseX client has been accepted by CRAN. The differences from version 1.1.1 are that 'Replace' and 'Store' have been replaced by 'put' and 'putBinary', and that the tests now have to be executed with Test/testBasex credentials. The daily download average of RBaseX is 10. But since I haven't received any feedback yet, I don't know to what extent this package contributes to BaseX's popularity. Ben Engbers
[basex-talk] Client protocol updated
Hi, Thanks to the mail from Erik Peterson on the BaseX client implementation, I learned that the client protocol has been updated and that 'replace/store' have been renamed to 'put/putBinary'. In my RBaseX client, I used the 'retrieve' command to read binary data. I can't remember where I found that 'retrieve' was used for retrieving binary data. But while implementing 'put/putBinary' I noticed that 'retrieve' is no longer accepted as a command. I have added a remark on this to the Clients page. Ben Engbers PS. RBaseX is downloaded on average 10 times a day. I have never had any feedback on this package, so I don't know to what extent it is really used. Any feedback is welcome!
Re: [basex-talk] Client auth debugging
Any information on the platform, programming language and the way in which the socket is opened would be helpful. Ben
On 26 October 2022 at 00:42:41 CEST, Erik Peterson wrote:
> I'm implementing a client per basex api shown here: https://docs.basex.org/wiki/Server_Protocol#Authentication
>
> I'm working on digest auth and I can get back the realm and timestamp. I'm getting an access denied however when I send the username and token. My questions are:
>
> 1) What tips are there for debugging authentication implementation? I've set logs.debug to true in .basex and tail them. I can see access denied but there's no info to help me debug. I'd like to at least see what the server is receiving. Any way to do that? BTW, my token creation is correct per the example in the docs.
>
> 2) It's taking a long time, 60s, for the test to run. Any way to speed that up?
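For comparison while debugging, the digest token described on the linked Server Protocol page can be sketched in a few lines of Python. This is a sketch under assumptions, not a reference implementation: it assumes the server challenge has the form realm:nonce and that the token is md5(md5(user:realm:password) + nonce) in lowercase hex, as that page describes. The helper names and the sample credentials are hypothetical.

```python
import hashlib


def md5_hex(text):
    """Lowercase hexadecimal MD5 digest of a UTF-8 string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def parse_challenge(challenge):
    """Split the server challenge 'realm:nonce' at the first colon."""
    realm, _, nonce = challenge.partition(":")
    return realm, nonce


def digest_token(user, password, realm, nonce):
    """Token as described in the protocol docs (assumed layout):
    md5( md5(user:realm:password) + nonce )."""
    return md5_hex(md5_hex(f"{user}:{realm}:{password}") + nonce)


if __name__ == "__main__":
    # Hypothetical challenge and credentials, for illustration only
    realm, nonce = parse_challenge("BaseX:1234567890")
    print(digest_token("admin", "admin", realm, nonce))
```

A common source of "access denied" in such implementations is hashing raw digest bytes instead of the hex string, or using uppercase hex; printing the intermediate values from a sketch like this makes those differences visible.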
[basex-talk] Closing a socketconnection
Hi, While reading the basexdbc.c code from Alexander Holupirek, I saw that he explicitly sends the 'exit' command to the server before closing the socket. I couldn't find anything on this command in the client server protocol. Is it necessary to send this command to the server? If so what is the effect of sending this command? Ben Engbers
[basex-talk] RBaseX-client
Hi, Version 1.1.1 is finally stable and available at CRAN (https://cran.r-project.org/package=RBaseX). Hopefully this blog post, "https://r-posts.com/rbasex-a-basex-client-written-in-r/", will contribute to more support for BaseX from the R community. Ben Engbers
Re: [basex-talk] How to return/use the value of a nested counter?
Hi, In the GUI, I couldn't see whether all the //al/text() elements were really displayed as one (1) concatenated object or just repeated. Only after importing the result into an R data frame did I see that //al/text() was returned as separate elements. Adding 'fn:string-join($Beurt//al/text(), "")' to the statement did the trick. Ben
    for $Debat in collection("Parl_Test")
    let $debate-id := fn:analyze-string(
      $Debat/officiele-publicatie/metadata/meta/@content,
      "(\\d{8}-\\d*-\\d*)")//fn:match/*:group[@nr="1"]/text()
    for $Beurt at $CountInner in $Debat//spreekbeurt
    let $tekst := fn:string-join($Beurt//al/text(), "")
    order by $debate-id
    return($debate-id, $CountInner, $tekst)
On 09-03-2022 at 22:46, Ben Engbers wrote: Hi
    for $Debat at $CountOuter in collection("Parl_Test")
    (: where $CountOuter <= 3 :)
    let $debate-id := fn:analyze-string(
      $Debat/officiele-publicatie/metadata/meta/@content,
      "(\d{8}-\d*-\d*)")//fn:match/*:group[@nr="1"]/text()
    order by $debate-id
    for $Beurt at $CountInner in $Debat//spreekbeurt
    let $tekst := $Beurt//al/text()
    return($debate-id, $CountInner, $tekst)
:-) Ben
On 09-03-2022 at 17:45, Zimmel, Daniel wrote: Are you simply counting the wrong items? It seems to me you wanted to count:
    for $Beurt at $CountInner in $Debat//spreekbeurt
Daniel
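The effect of the fix (one counter per spreekbeurt, with all al texts of that turn joined into a single string) can be illustrated outside XQuery. The Python sketch below is only an analogy: the sample XML is hypothetical and merely reuses the element names that appear in the queries (officiele-publicatie, spreekbeurt, tekst, al), and `turns_with_counter` is an illustrative name.

```python
import xml.etree.ElementTree as ET

# Hypothetical miniature document; only the element names come from the queries.
SAMPLE = """<officiele-publicatie>
  <spreekbeurt><tekst><al>Allereerst ...</al></tekst></spreekbeurt>
  <spreekbeurt><tekst><al>Voorzitter...</al><al>De vragen ...</al></tekst></spreekbeurt>
</officiele-publicatie>"""


def turns_with_counter(xml_text):
    """Return (counter, joined text) per spreekbeurt element.

    Mirrors 'for $Beurt at $CountInner in $Debat//spreekbeurt' combined with
    fn:string-join($Beurt//al/text(), "").
    """
    root = ET.fromstring(xml_text)
    rows = []
    for count_inner, beurt in enumerate(root.iter("spreekbeurt"), start=1):
        # Join the text of every al element below this turn into one string
        tekst = "".join(al.text or "" for al in beurt.iter("al"))
        rows.append((count_inner, tekst))
    return rows


if __name__ == "__main__":
    for row in turns_with_counter(SAMPLE):
        print(row)
```

Without the join, each al text stays a separate item, which is what showed up as separate elements in the R data frame.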
Re: [basex-talk] How to return/use the value of a nested counter?
Hi
    for $Debat at $CountOuter in collection("Parl_Test")
    (: where $CountOuter <= 3 :)
    let $debate-id := fn:analyze-string(
      $Debat/officiele-publicatie/metadata/meta/@content,
      "(\d{8}-\d*-\d*)")//fn:match/*:group[@nr="1"]/text()
    order by $debate-id
    for $Beurt at $CountInner in $Debat//spreekbeurt
    let $tekst := $Beurt//al/text()
    return($debate-id, $CountInner, $tekst)
:-) Ben
On 09-03-2022 at 17:45, Zimmel, Daniel wrote: Are you simply counting the wrong items? It seems to me you wanted to count:
    for $Beurt at $CountInner in $Debat//spreekbeurt
Daniel
[basex-talk] XQuery versus XSL
Hi, I have a collection of 740 documents with the following structure: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://technische-documentatie.oep.overheid.nl/schema/op-xsd-2012-1"> content="https://zoek.officielebekendmakingen.nl/h-tk-20202021-102-2/metadata.xml" /> Allereerst hebben we het traditionele mondelinge vragenuur. Voorzitter. Het was altijd al een eer om hier te staan. De vragen die ik ga stellen, gaan over stikstof. We zijn allemaal 100 kilometer per uur gaan rijden, maar er is nog geen gram ammoniak uit de veehouderij minder uitgestoten. U heeft helaas maar één vraag, meneer Ephraim, als Groep Van Haga. Ik wil de minister bedanken voor haar beantwoording. I want to experiment with text mining, and for these experiments it would be useful if for every , all /text() elements were concatenated. The first option is to use XQuery for concatenating. Another option is to use XSL to transform the original documents to the following structure: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://technische-documentatie.oep.overheid.nl/schema/op-xsd-2012-1"> content="https://zoek.officielebekendmakingen.nl/h-tk-20202021-102-2/metadata.xml" /> Allereerst hebben we het traditionele mondelinge vragenuur. Voorzitter. Het was altijd al een eer om hier te staan. De vragen die ik ga stellen, gaan over stikstof. We zijn allemaal 100 kilometer per uur gaan rijden, maar er is nog geen gram ammoniak uit de veehouderij minder uitgestoten. U heeft helaas maar één vraag, meneer Ephraim, als Groep Van Haga. Ik wil de minister bedanken voor haar beantwoording. Question: What are the pros and cons of both methods? Is it difficult to do this in XSL (I have only used very simple transformations)? Ben
[basex-talk] How to return/use the value of a nested counter?
Hi, I have a collection of 740 documents with the following structure: xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="http://technische-documentatie.oep.overheid.nl/schema/op-xsd-2012-1"> content="https://zoek.officielebekendmakingen.nl/h-tk-20202021-102-2/metadata.xml" /> Allereerst hebben we het traditionele mondelinge vragenuur. Voorzitter. Het was altijd al een eer om hier te staan. De vragen die ik ga stellen, gaan over stikstof. We zijn allemaal 100 kilometer per uur gaan rijden, maar er is nog geen gram ammoniak uit de veehouderij minder uitgestoten. U heeft helaas maar één vraag, meneer Ephraim, als Groep Van Haga. Ik wil de minister bedanken voor haar beantwoording. I am trying to concatenate all the // children from elements. Together with an ID that I construct from //meta/@content and a counter for , I want this output:
    20202021-102-2, 1, Allereerst ...
    20202021-102-2, 2, Voorzitter... + De vragen ... + We zijn ...
    20202021-102-2, 3, U heeft ... + Ik wil..
I expected that the following XQuery statement would do it.
    import module namespace functx = "http://www.functx.com";
    for $Debat at $CountOuter in collection("Parliament")
    (: where $CountOuter <= 3 :)
    let $debate-id := fn:analyze-string(
      $Debat/officiele-publicatie/metadata/meta/@content,
      "(\d{8}-\d*-\d*)")//fn:match/*:group[@nr="1"]/text()
    order by $debate-id
    for $Beurt at $CountInner in $Debat
    let $tekst := $Beurt//spreekbeurt//al/text()
    return($debate-id, $CountInner, $tekst)
Instead it returns:
    20202021-102-2, 1, Allereerst ...+ Voorzitter... + De vragen ... + We zijn ... + U heeft ... + Ik wil..
How can I use the value of $CountInner? Ben
Re: [basex-talk] Develop a module, HOWTO
The link to expath is new to me! Thanx Ben On 25-02-2022 at 14:35, Bridger Dyson-Smith wrote: Hi Ben, On Fri, Feb 25, 2022 at 8:25 AM Ben Engbers <ben.engb...@be-logical.nl> wrote: Hi, I know that it is possible to create a module with functions (I have even done that once), but I can't find the documentation anymore on how to do that. Could someone please provide the URL to this information? Here is the specification: http://expath.org/spec/pkg and the BaseX-specific documentation: https://docs.basex.org/wiki/Repository If those aren't the pages you're thinking of, please say! :) Thanks, Ben Engbers HTH best, Bridger
[basex-talk] Develop a module, HOWTO
Hi, I know that it is possible to create a module with functions (I have even done that once), but I can't find the documentation anymore on how to do that. Could someone please provide the URL to this information? Thanks, Ben Engbers
Re: [basex-talk] string-join with a newline separator?
Ok, at least in the GUI using as separator works. Is this an HTML-specific separator? I use 'string-join' in R in the following statement:
    Query_Stmt <- paste(
      'import module namespace functx = "http://www.functx.com";',
      'for $Debat at $CountOuter in collection("Parliament"),',
      '$Turn in collection("Parliament")',
      'where $Turn/officiele-publicatie/metadata/meta/@content = $Debat/officiele-publicatie/metadata/meta/@content',
      'and $CountOuter <=2',
      ' let $debate-id := fn:analyze-string(',
      '$Debat/officiele-publicatie/metadata/meta/@content, "(\\d{8}-\\d*-\\d*)")//fn:match/*:group[@nr="1"]/text()',
      ' for $Speach at $CountInner in $Turn/officiele-publicatie/handelingen/agendapunt/spreekbeurt',
      'let $Spreker := $Speach/spreker/naam/achternaam/text()',
      'let $Pol := $Speach/spreker/politiek/text()',
      'order by $debate-id, $CountInner',
      'for $par at $CountPar in $Turn/officiele-publicatie/handelingen/agendapunt/spreekbeurt/tekst',
      ' let $tekst := fn:string-join(fn:data($par//al/text()), "")',
      'return($debate-id, $Spreker, ($Pol, "n.v.t")[1], $CountPar, $tekst)'
    )
When I use "." as item separator, this statement returns:
    $debate-id, $Spreker, ($Pol, "n.v.t")[1], $CountPar, $tekst1.$tekst2
But when I use "" it returns:
    $debate-id, $Spreker, ($Pol, "n.v.t")[1], $CountPar, $tekst1 NA NANA NA $tekst2
(NA means Not Available.) So in R the is interpreted as a splitter. ;-( I'll take a look at this and will let you know if I can find a solution. Ben
[basex-talk] string-join with a newline separator?
Hi, My xml has the structure bla bla bla The element contains 1 to many elements. let $tekst := fn:string-join(fn:data($par//al/text()), ".") concatenates this to: bla.bla.bla But I want it to return: bla bla bla Is it possible to add a newline item-separator to fn:string-join? Ben Engbers
Re: [basex-talk] Content is not allowed in prolog
Hi Christian, I have added “The input can be a UTF-8 encoded XML document, a binary resource, or any other data (such as JSON or CSV) that can be successfully converted to a resource by the server.” to my documentation. Create(), add(), replace() and store() now all use exec <- c(as.raw(), addVoid(name), addVoid(input_to_raw(input))) as their basic pattern. The 'Execute' command has been renamed to 'Command' (better alignment with the server protocol). On 23-02-2022 at 11:18, Christian Grün wrote: Hi Ben, (Writing a test took half an hour ;-() Good tests are sometimes more valuable than the implementation itself ;) I have looked at QueryParser.java and it should probably not be that difficult to convert it to an R version (it is still a lot of work). Do you have a test set with XQuery statements? Cheers, Ben
Re: [basex-talk] Content is not allowed in prolog
On 22-02-2022 at 18:39, Christian Grün wrote: So you distinguish a XML-DOCUMENT from a XML-FILE and that was something I didn't know. I guess so. Do we use these two terms in our documentation? I don't know. If I find places where it is confusing (at least for me), I'll let you know. Or did you want to point out that you used “document” and “files” for describing the same thing in our conversation? No, they are different. A 'file' lives on the file system (and a file pointer points to a file). A 'document', however, lives in memory. It can for example be a string which is constructed by XQuery by adding elements or attributes to the result of a query, or by writing valid XML code with a text editor. I thought that the client could deal with both files and documents. Are there more places in the server protocol where this difference is relevant? Could you please make a note of this in the documentation for the server protocol? We’ll be glad to improve the documentation. I’m not sure which of the formulations were misleading to you, so feel free to share them with us. From the server protocol (https://docs.basex.org/wiki/Server_Protocol), Command Protocol: The following byte sequences are sent and received from the client (please note that a specific client may not support all of the presented commands):
    Command   Client Request               Description
    COMMAND   {command}                    Executes a database command.
    QUERY     \00 {query}                  Creates a new query instance and returns its id.
    CREATE    \08 {name} {input}           Creates a new database with the specified input (may be empty).
    ADD       \09 {name} {path} {input}    Adds a new resource to the opened database.
    REPLACE   \0C {path} {input}           Replaces a resource with the specified input.
    STORE     \0D {path} {input}           Stores a binary resource in the opened database.
Everywhere you use 'input', it is unclear what is valid input: a file or a document? I already have this function which checks if input is already a raw vector or if the input can be transformed into a vector. 
Is "raw vector" a byte array or something else? What does is.VALID do? A raw vector is a byte array. is.VALID is a set of regular expressions. It checks if a URL is valid (https://asf.dfg.dfhg/ is valid, htp:/ery/ery is not). In R, before being able to read from the URL (httr::GET(input)) I had to check whether the URL was valid. Feature request: Could you implement the same functionality in the server protocol? I’m hesitant to change the server protocol at this stage, as almost all other client bindings are based on the current definitions, and would possibly need to be updated. But maybe you need to get more specific in your wording (or it’s my task to spend more time and find out what you mean): The "protocol" is the set of rules that are implemented by the various bindings to communicate with the server. If you say we should implement the functionality in the protocol, would you like to see new rules added? Or would you expect the server-side implementation of the protocol rules to check if the input for a CREATE command can be interpreted as file reference? I understand. I don't believe you really have to update the protocol. It is only the client that needs to be updated. As said before, I consistently use this pattern:
    exec <- c(as.raw(0x08), addVoid(name), addVoid(input))
It took me 2 minutes to change this into:
    raw_input <- input_to_raw(input)
    exec <- c(as.raw(0x08), addVoid(name), addVoid(raw_input))
Now I can use session$Create() with a document, a URL or a file descriptor. (Writing a test took half an hour ;-() I think we shouldn’t resolve client file references on the server, as clients and servers usually reside on different machines. You can provide file paths with CREATE DB, but the only reason is that this command was initially designed to work with the standalone version of BaseX. We even had thoughts on rejecting local file references if they are passed on by a client. 
I think BaseX is an excellent standalone tool for xquery and xml-related applications... Hope this helps, Cheers, Ben
Re: [basex-talk] Content is not allowed in prolog
On 22-02-2022 at 16:58, Christian Grün wrote: (On close reading I see that "Session$Execute(paste("Create db", DB_Name, Single_File))" should have been "Session$Execute(paste("Create db", DB_Name, "Single", Single_File))". The paste() function just concatenates the strings.) Does that solve your problem? No, this line executed without problems. The BaseX user command CREATE DB differs from the technical CREATE command that’s defined in the server protocol. With the latter one, the optional input must be a (single) XML document. The reason is that the client usually resides on a different system than the server, and specifying a file path wouldn’t work. That sounds better!!! This works: Session$Create(DB_Name, "Content 1") "Database 'Parl_Test' gemaakt [created] in 8.64 ms." So you distinguish a XML-DOCUMENT from a XML-FILE and that was something I didn't know. Are there more places in the server protocol where this difference is relevant? Could you please make a note of this in the documentation for the server protocol? I already have this function which checks if input is already a raw vector or if the input can be transformed into a vector. Even with limited R knowledge this should be readable ;-)
    input_to_raw <- function(input) {
      type <- typeof(input)
      switch(type,
        "raw" = raw_input <- input,              # already a raw vector
        "character" = {
          if (input == "") {                     # empty input
            raw_input <- raw(0)
          } else if (file.exists(input)) {       # file on the file system
            finfo <- file.info(input)
            toread <- file(input, "rb")
            raw_input <- readBin(toread, what = "raw", size = 1, n = finfo$size)
            close(toread)
          } else if (is.VALID(input)) {          # valid URL: fetch its content
            get_URL <- httr::GET(input)
            raw_input <- get_URL$content
          } else {                               # plain string
            raw_input <- charToRaw(input)
          }
        },
        # The unnamed last argument is the fallback of switch();
        # an argument named 'default' would not act as one.
        stop("Unknown input-type, please report the type of the input.")
      )
      return(raw_input)
    }
I'll see if I can use this function in Session$Create(). Feature request: Could you implement the same functionality in the server protocol? Cheers, Ben
Re: [basex-talk] Content is not allowed in prolog
I don't believe that the problem is R-related. It is probably more a misunderstanding on my side. I looked at https://docs.basex.org/wiki/Commands#CREATE_DB. According to that page, it is possible to create a db with all the documents in the input directory (i.e. XML_Files) or with one initial document. (On close reading I see that "Session$Execute(paste("Create db", DB_Name, Single_File))" should have been "Session$Execute(paste("Create db", DB_Name, "Single", Single_File))". The paste() function just concatenates the strings.) My guess was that the same conventions for specifying input would also be valid for the Session$Create() command. That is still my question. Ben On 22-02-2022 at 16:30, Christian Grün wrote: My R knowledge is very limited, so it’s difficult to give you advice (maybe someone else can). Does "XML_Files" mean that you are trying to pass on more than a single document?
Re: [basex-talk] Content is not allowed in prolog
Yes I did ;-) Both commands use the same set of XML files. Session$Execute(paste("Create db", DB_Name, XML_Files)) accepts them. Session$Create(DB_Name, XML_Files) does not. Ben On 22-02-2022 at 16:15, Christian Grün wrote: Hi Ben, The server protocol does not specify the format that is to be used for input. In order to understand the syntax of "{input}", you can have a look at the Conventions paragraph: {...}: utf8 strings or raw data, suffixed with a \00 byte. To avoid confusion with this end-of-string byte, all transferred \00 and \FF bytes are prefixed by an additional \FF byte. Maybe you don’t take care of 00 and FF bytes in the input yet? Best, Christian
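The convention quoted above (a \00 suffix, with every \00 and \FF byte in the payload prefixed by \FF) can be sketched like this. It is a minimal illustration in Python rather than the RBaseX code; the function names are hypothetical, and the CREATE layout (\08 {name} {input}) is taken from the command table quoted elsewhere in this thread.

```python
def escape_payload(data: bytes) -> bytes:
    """Prefix every \\x00 and \\xFF byte with an additional \\xFF byte."""
    out = bytearray()
    for byte in data:
        if byte in (0x00, 0xFF):
            out.append(0xFF)
        out.append(byte)
    return bytes(out)


def protocol_string(data: bytes) -> bytes:
    """Escaped payload followed by the terminating \\x00 byte."""
    return escape_payload(data) + b"\x00"


def create_request(name: str, xml_input: str) -> bytes:
    """CREATE request: \\x08 {name} {input}, where input is a document,
    not a path on the client machine."""
    return (b"\x08"
            + protocol_string(name.encode("utf-8"))
            + protocol_string(xml_input.encode("utf-8")))


if __name__ == "__main__":
    print(escape_payload(b"a\x00b\xffc"))
    print(create_request("Parl_Test", "<x/>"))
```

For plain UTF-8 XML the escaping rarely changes anything, since valid UTF-8 text contains neither byte; it matters mainly for binary input.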
Re: [basex-talk] Content is not allowed in prolog
Hi Christian, There are two differences between the server protocol and my implementation. 1) I use "Execute" instead of "Command" as in the command protocol. (When I started with this project I thought of it as "Executing" a Command. It is still possible to change Execute to Command if you prefer that.) 2) I introduced a little bit of scripting. The last byte of the response indicates success or failure. When the 'intercept' that I introduced is set to TRUE, the success indicator can be used in an R script to avoid abortion (a very basic form of exception handling and scripting). Apart from that I have followed the server protocol to the letter. ALL commands from the command and the query protocol are implemented and follow this pattern:
    exec <- c(as.raw(0x09), addVoid(path), addVoid(input_to_raw(input)))
    response <- private$sock$handShake(exec) %>% split_Response()
All input parameters are converted to a raw vector and each parameter has a 00 appended. Together with the preceding byte, this is sent to the server. The server returns a raw vector. This vector is split on 00. The last byte of the response indicates success. R6, the R object orientation system I used, does not know polymorphism, but copying the Java source to R6 was not very difficult. I am now really using the package. And it is now that I sometimes see bugs, but this is the first bug I don't understand. According to the protocol and the general BaseX documentation, there are two ways to create a database: 1) you can send a specific "Create" command (preceding byte is \08), or 2) you can execute a "Create db" command (no preceding byte). 
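The response handling described above (split on 00, last byte signals success) might look roughly as follows. This is a hedged Python sketch, not the RBaseX `split_Response()` implementation: the function name is illustrative, it assumes a status byte of \00 means success, and it ignores \FF escaping inside the strings.

```python
def split_response(raw: bytes):
    """Split a server response into its strings plus a success flag.

    Assumptions: every string is terminated by \\x00, the final byte is the
    status (\\x00 = success, anything else = failure), and the strings
    contain no \\xFF-escaped bytes.
    """
    if not raw:
        raise ValueError("empty response")
    status = raw[-1]
    parts = raw[:-1].split(b"\x00")
    # The last string's terminator leaves an empty tail after split(); drop it
    if parts and parts[-1] == b"":
        parts.pop()
    return [part.decode("utf-8") for part in parts], status == 0


if __name__ == "__main__":
    print(split_response(b"result\x00info\x00\x00"))
    print(split_response(b"\x00Stopped\x00\x01"))
```

Returning the flag instead of raising leaves the decision to the caller, which is the idea behind the 'intercept' option mentioned in the message.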
These variables are used in the examples:
    DB_Name <- "Parl_Test"
    XML_Files <- system.file("extdata", "xml_files", package="RBaseX")
    Single_File <- paste(XML_Files, "h-tk-20202021-102-12.xml", sep="/")
    Session$Execute(paste("Create db", DB_Name, Single_File))  # => success
    Session$Execute(paste("Create db", DB_Name, XML_Files))    # => success
    Session$Create(DB_Name)                                    # => success
    Session$Create(DB_Name, Single_File)                       # => error
    Session$Create(DB_Name, XML_Files)                         # => error
The server protocol does not specify the format that is to be used for input. It only says that input may be empty. Do I use the wrong format? Greetings, Ben On 22-02-2022 at 14:07, Christian Grün wrote: Hi Ben, I guess this could be caused by a little error in your implementation of the R client. Did you already have a look at the documentation of the server protocol [1] and an alternative implementation [2]? Cheers, Christian [1] https://docs.basex.org/wiki/Server_Protocol [2] https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/api/BaseXClient.java On Mon, Feb 21, 2022 at 1:03 PM Ben Engbers wrote: Hi, I have a directory with 12 test files. In the BaseX GUI, the command: CREATE DB Parl_Test /home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files/ creates database "Parl_Test" and loads the XML files. In my R client, Session$Create("Parl_Test") creates database "Parl_Test" => OK I want to create the same database with my client. I initialize the variable "XML_Files" with "/home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files". The client translates the command: Session$Create("Parl_Test", XML_Files) into a raw vector: '\bParl_Test\0/home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files' which is sent to the server. But the server responds with: "\"Parl_Test.xml\" (Regel 1 [line 1]): Content is not allowed in prolog." I didn't touch the XML files. Where is the content inserted? Ben Engbers
[basex-talk] Content is not allowed in prolog
Hi, I have a directory with 12 testfiles. In the BaseX-GUI, the command: CREATE DB Parl_Test /home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files/ Creates database "Parl_Test" and loads the xml-files. In my R-client, Session$Create("Parl_Test") creates database "Parl_test"=> OK I want to create the same database with my client. I initialize the variable "XML_Files" with "/home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files". The client translates the command: Session$Create("Parl_Test", XML_Files) into a raw vector: '\bParl_Test\0/home/bengbers/R/x86_64-redhat-linux-gnu-library/4.1/RBaseX/extdata/xml_files' which is sent to the server. But the server responds with: "\"Parl_Test.xml\" (Regel 1): Content is not allowed in prolog." I didn't touch the xml-files. Where is the content inserted? Ben Engbers
Re: [basex-talk] Syntax-checker
Hi Christian, I know that the R community is still looking for an XQuery tool. I won't say that RBaseX is the best, but for the moment it is the best option I know of ;-). And bugs are becoming more and more rare (and difficult to resolve ;-(). Even after my retirement I spend a lot of time programming, and working on the client is great fun. And introducing a syntax checker would improve the usability. I'll take a look at the QueryParser class and see if I can manage to implement it in R. Greetings, Ben PS. I've nearly completed a text in which I present my client and which I mean to submit to R-bloggers. Would you care to give it a look? On 17-02-2022 at 14:38, Christian Grün wrote: Hi Ben, An XQuery string is parsed by the QueryParser class [1]. It’s the largest Java class in the project, so it might take some time to get it reimplemented in R. Greetings, Christian
[basex-talk] Syntax-checker
Hi, After I had formulated the query in the BaseX GUI, I tried to execute the same multi-line query in R/RBaseX. Nada ;-( In R, one can use the paste function to concatenate strings. I use this function to build a string which is passed to the RBaseX client. Example:
    Stmt_1 <- "for $i in 1 to 2 return $i"    # => OK
    Stmt_2 <- paste0("for $i in 1 to 2",      # => ERROR
                     "return $i")
It took 2 days of debugging before I found the error. Stmt_2 is concatenated to "for $i in 1 to 2return $i" and it is clear that this can't be executed. Instead of the "paste0" function I should have used "paste", which introduces a space between the strings to be concatenated. This works fine. My problem is that the server/my client does not give an error message. And this leads me to the following question: It would be helpful if the syntax of the XQuery statement was checked before sending it to the server. Where in the BaseX sources can I find the code for XQuery checking? Is it possible to translate this code into R, or is that far too difficult? Ben Engbers
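The paste0-versus-paste pitfall is not specific to R. As a small illustration in Python (the helper name `join_query` is hypothetical), joining query fragments with an explicit separator avoids fusing the last token of one line with the first token of the next:

```python
def join_query(*fragments, sep=" "):
    """Join query fragments with a separator so that tokens from
    adjacent fragments cannot run together."""
    return sep.join(fragment.strip() for fragment in fragments)


# Equivalent of paste0(): no separator, tokens are fused
broken = "".join(["for $i in 1 to 2", "return $i"])

# Equivalent of paste(): a single space between fragments
fixed = join_query("for $i in 1 to 2", "return $i")

if __name__ == "__main__":
    print(broken)
    print(fixed)
```

Using a newline as the separator (`sep="\n"`) works equally well for XQuery and keeps line numbers in server error messages meaningful.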
Re: [basex-talk] 'Flatten' a collection
On 14-02-2022 at 19:30, Sebastian Albert wrote: Hi Ben, I learned about the `count` feature just from your example. It does not seem to do what you want; I would try the "at" in a "for" loop. According to XQuery 2nd Edition, Priscilla Walmsley, pg. 135, 'count' was introduced with XQuery version 3.0. 2: How can I formulate the query for getting the correct output? Your example is not well-formed, you're probably missing a closing in the second around the second . No, the missing was due to a typo while composing the mail ;-) Anyway, I think what you want is to iterate over $Turn//spreker/text(), not just use the entire sequence. Here's how I transformed your first query (I stored your example in a variable called $file for experimentation): My intention was to iterate over 2 sequences: $Blog and $Turn. Why do you see this as 1 sequence? Hope this helps, Sebastian
    for $Blog in collection("Blog"), $Turn in collection("Blog")
    where $Turn//datum/@date = $Blog//datum/@date
    order by $Blog//datum/@date
    count $CountOuter
    let $Id := $Blog/handeling/@id
    let $Datum := $Blog//datum/@date
    for $Speaker at $CountInner in $Turn//spreker/text()
    return($CountOuter, $Id, $Datum, $CountInner, $Speaker)
returns =>
    1, id="h_1", date="d_1", 1, spreker_1
    1, id="h_1", date="d_1", 2, spreker_3
    2, id="h_2", date="d_2", 1, spreker_2
    2, id="h_2", date="d_2", 2, spreker_1
    2, id="h_2", date="d_2", 3, spreker_4
    3, id="h_3", date="d_3", 1, spreker_2
    3, id="h_3", date="d_3", 2, spreker_3
    3, id="h_3", date="d_3", 3, spreker_2
    3, id="h_3", date="d_3", 4, spreker_1
OK! 
With for $Speaker in $Turn//spreker/text() count $CountInner return($CountOuter, $Id, $Datum, $CountInner, $Speaker) it returns => 1, id="h_1", date="d_1", 1, spreker_1 1, id="h_1", date="d_1", 2, spreker_3 2, id="h_2", date="d_2", 3, spreker_2 2, id="h_2", date="d_2", 4, spreker_1 2, id="h_2", date="d_2", 5, spreker_4 3, id="h_3", date="d_3", 6, spreker_2 3, id="h_3", date="d_3", 7, spreker_3 3, id="h_3", date="d_3", 8, spreker_2 3, id="h_3", date="d_3", 9, spreker_1 ERROR :-( While searching for a solution I also tried the following with a nested FLWOR (it does not return what I want): for $Blog at $countOuter in collection("Blog") order by $Blog//datum/@date let $BlogId := $Blog/handeling/@id let $BlogDatum := $Blog//datum/@date count $countOuter return ( for $Turn at $countInner in collection("Blog") where $Turn//datum/@date = $Blog//datum/@date let $Speaker := $Turn//spreker/text() return ($countOuter, $BlogId, $BlogDatum, $countInner, $Speaker) ) I also see your solution as a nested 'for' loop, but in your solution I am missing the 'LWO'. Do you know what the fundamental difference between the two nested FOR loops is? Ben (Thanks for the help)
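The two result listings above differ only in the inner counter: with `at` it restarts for every document, with `count` it keeps running. A minimal sketch of both behaviours in Python, assuming (as the outputs suggest) that `at` numbers positions within the sequence being iterated while `count` numbers the rows of the combined stream; the data and function names are made up for illustration.

```python
# Hypothetical stand-in for the collection: document id -> speakers in order.
blogs = {
    "h_1": ["spreker_1", "spreker_3"],
    "h_2": ["spreker_2", "spreker_1", "spreker_4"],
}


def positional_counter(docs):
    """Inner counter restarts per document (like `for $x at $i in ...`)."""
    rows = []
    for outer, (doc_id, speakers) in enumerate(docs.items(), start=1):
        for inner, speaker in enumerate(speakers, start=1):
            rows.append((outer, doc_id, inner, speaker))
    return rows


def running_counter(docs):
    """Inner counter numbers every emitted row (like a trailing `count $i`)."""
    rows = []
    running = 0
    for outer, (doc_id, speakers) in enumerate(docs.items(), start=1):
        for speaker in speakers:
            running += 1
            rows.append((outer, doc_id, running, speaker))
    return rows


for row in positional_counter(blogs):
    print(row)
```

The third row is where the two diverge: the positional version gives `(2, 'h_2', 1, 'spreker_2')`, the running version gives `(2, 'h_2', 3, 'spreker_2')`, matching the "OK" and "ERROR" listings in the mail.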
[basex-talk] 'Flatten' a collection
Hi, I have a collection of 740 XML documents which I want to flatten. The files all have the same structure: spreker_1 spreker_3 spreker_2 spreker_1 spreker_4 spreker_2 spreker_3 spreker_2 spreker_1 The following query gives this result: import module namespace functx = "http://www.functx.com"; let $Blogs := collection("Blog") let $Turns := collection("Blog") for $Blog in collection("Blog"), $Turn in collection("Blog") where $Turn//datum/@date = $Blog//datum/@date order by $Blog//datum/@date count $Count let $Id := $Blog/handeling/@id let $Datum := $Blog//datum/@date let $Speaker := $Turn//spreker/text() return($Id, $Datum, $Speaker, $Count) id="h_1" date="d_1" spreker_1 spreker_3 1 id="h_2" date="d_2" spreker_2 spreker_1 spreker_4 2 id="h_3" date="d_3" spreker_2 spreker_3 spreker_2 spreker_1 3 But what I eventually need is this (for clarity shown as a table): 1, id="h_1", date="d_1", 1, spreker_1 1, id="h_1", date="d_1", 2, spreker_3 2, id="h_2", date="d_2", 1, spreker_2 2, id="h_2", date="d_2", 2, spreker_1 2, id="h_2", date="d_2", 3, spreker_4 3, id="h_3", date="d_3", 1, spreker_2 3, id="h_3", date="d_3", 2, spreker_3 3, id="h_3", date="d_3", 3, spreker_2 3, id="h_3", date="d_3", 4, spreker_1 The first counter indicates the position in $Blog and the second counter indicates the position in $Turn. I expected that the following query would return what I was looking for: for $Blog in collection("Blog") order by $Blog//datum/@date let $Id := $Blog/handeling/@id let $Datum := $Blog//datum/@date count $countOuter return ( for $Turn in collection("Blog") where $Turn//datum/@date = $Blog//datum/@date let $Speaker := $Turn//spreker/text() return ($countOuter, $Id, $Datum, $Speaker)) Instead it returns 1, id="h_1", date="d_1", 1, spreker_1, spreker_3 2, id="h_2", date="d_2", 1, spreker_2, spreker_1, spreker_4 3, id="h_3", date="d_3", 1, spreker_2, spreker_3, spreker_2, spreker_1 I have 2 questions: 1: Is it possible to use separate counters for the inner and the outer loop? (How should I define the $countInner?) 2: How can I formulate the query for getting the correct output? Ben Engbers
Re: [basex-talk] How to extract value from fn:analyze-string
I'll experiment a little with the namespace but for the moment adding *: works! Thanks, Ben On 10-02-2022 at 18:46, Imsieke, Gerrit, le-tex wrote: It’s a namespace thing. The analyze-string() result is in the http://www.w3.org/2005/xpath-functions namespace, which is bound to the fn prefix. So you should write fn:match etc. instead of match, or, as Bridger suggested, *:match. But such a wildcard always seems a bit desperate to me (no offense, Bridger ;). Whereas you don’t need to use the privileged fn prefix when you invoke analyze-string(), it’s only important when you select the namespaced results. Gerrit On 10.02.2022 18:30, Ben Engbers wrote: Hi, This query produces the following result: let $debates := collection("Parliament") for $debate-item in $debates let $item-file := $debate-item/officiele-publicatie//meta/@content let $debate-id := fn:analyze-string( $debate-item/officiele-publicatie//meta/@content, "(\d{8}-\d*)-(\d*)") return ($debate-id) => <analyze-string-result xmlns="http://www.w3.org/2005/xpath-functions"><non-match>https://zoek.officielebekendmakingen.nl/h-tk-</non-match><match><group nr="1">20202021-102</group>-<group nr="2">1</group></match><non-match>/metadata.xml</non-match></analyze-string-result> ... I am trying to extract the values from group 1 and 2 but this query returns 0 results: let $debates := collection("Parliament") for $debate-item in $debates let $item-file := $debate-item/officiele-publicatie//meta/@content let $debate-id := fn:analyze-string( $debate-item/officiele-publicatie//meta/@content, "(\d{8}-\d*)-(\d*)") let $debate-nr := $debate-id//match/group[@nr="1"]/text() let $item-nr := $debate-id//match/group[@nr="2"]/text() return ($debate-nr, $item-nr) My guess is that analyze-string inserts new elements in the query and that that is the reason why this does not work. How can I extract debate-nr and item-nr from $debate-id? Ben Engbers
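Gerrit's point, that an unqualified name test finds nothing because the result elements live in a namespace, is not specific to XQuery. The sketch below shows the same effect with Python's standard `xml.etree.ElementTree`; the XML literal is a hand-written stand-in for the analyze-string() result, assumed to have the shape that the regular expression and the output in the mail imply.

```python
import xml.etree.ElementTree as ET

FN_NS = "http://www.w3.org/2005/xpath-functions"

# Hand-written stand-in for the result element (assumed shape, see lead-in).
result_xml = (
    f'<analyze-string-result xmlns="{FN_NS}">'
    "<non-match>https://zoek.officielebekendmakingen.nl/h-tk-</non-match>"
    '<match><group nr="1">20202021-102</group>-<group nr="2">1</group></match>'
    "<non-match>/metadata.xml</non-match>"
    "</analyze-string-result>"
)
root = ET.fromstring(result_xml)

# Unqualified name: matches only elements in no namespace, so nothing is found.
unqualified = root.findall(".//group")

# Qualified name: the fn prefix is mapped to the namespace of the result.
ns = {"fn": FN_NS}
qualified = {g.get("nr"): g.text for g in root.findall(".//fn:group", ns)}

print(unqualified)  # []
print(qualified)    # {'1': '20202021-102', '2': '1'}
```

The prefix itself is arbitrary; what has to match is the namespace URI it is bound to, which is why both `fn:match` and the wildcard `*:match` work in the query.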
[basex-talk] How to extract value from fn:analyze-string
Hi, This query produces the following result: let $debates := collection("Parliament") for $debate-item in $debates let $item-file := $debate-item/officiele-publicatie//meta/@content let $debate-id := fn:analyze-string( $debate-item/officiele-publicatie//meta/@content, "(\d{8}-\d*)-(\d*)") return ($debate-id) => <analyze-string-result xmlns="http://www.w3.org/2005/xpath-functions"><non-match>https://zoek.officielebekendmakingen.nl/h-tk-</non-match><match><group nr="1">20202021-102</group>-<group nr="2">1</group></match><non-match>/metadata.xml</non-match></analyze-string-result> ... I am trying to extract the values from group 1 and 2 but this query returns 0 results: let $debates := collection("Parliament") for $debate-item in $debates let $item-file := $debate-item/officiele-publicatie//meta/@content let $debate-id := fn:analyze-string( $debate-item/officiele-publicatie//meta/@content, "(\d{8}-\d*)-(\d*)") let $debate-nr := $debate-id//match/group[@nr="1"]/text() let $item-nr := $debate-id//match/group[@nr="2"]/text() return ($debate-nr, $item-nr) My guess is that analyze-string inserts new elements in the query and that that is the reason why this does not work. How can I extract debate-nr and item-nr from $debate-id? Ben Engbers
[basex-talk] BaseX and 'view'?
Hi, I am writing a blog for R-bloggers with the aim of raising awareness of my RbaseX package. The latest version is super fast! Loading and saving 740 xml documents from the R environment took only 34 seconds! Before I can use those documents I will probably have to convert them using xsl. I was wondering if BaseX, like Oracle, also has a 'view' over the original data? I can then define multiple views that are better adapted to the intended use. ben
Re: [basex-talk] Copy data from MariaDB into BaseX
Sorry, I should have been more precise in my question (and it would have been better not to talk about parity ;-() I have 4 schemas/databases in MariaDB which I want to copy to BaseX. The first one (schema = 'Innovate') uses 4 tables. The first table (= 'Dienst') has 2 attributes and 2 rows. In total there are 6 tables in this schema. I try to copy this first schema to BaseX - "Relational", which is created as an empty database (create db Relational). The result from: let $doc := element { $db } { for $table in $tables return element { $table } { let $rows := sql:execute($con, 'select * from ' || $table) for $row in $rows return element row { for $col in $row/sql:column return element { $col/@name } { $col/data() } } } } return $doc is: 1 CIO Office 2 Dictu .. other tables .. But the result from: return db:add($MariaBase, $doc, $db) is: BaseX database "Relational" .. .. other tables .. In the end, it was my intention to have created: BaseX database "Relational" .. other schemas/tables .. <3 other schemas> On 22-12-2021 at 20:53, Christian Grün wrote: With your query, you seem to add a single document into your database that contains the contents of all tables. Correct. One document in database "Relational" should represent a complete schema in MariaDB. That’s fine in general; but is it what you are trying to achieve, or would it probably be better to represent a single table as document? What would be the advantage of representing single tables as a document? Aren't both approaches equivalent? Ben
Re: [basex-talk] Copy data from MariaDB into BaseX
Hi Christian, For the time being, I ended up with this: sql:init("org.mariadb.jdbc.Driver"), let $MariaBase := 'Relational' let $db := 'Innovate' let $user := '' let $pass := '' let $con := sql:connect('jdbc:mariadb://localhost:3306/' || $db, $user, $pass) let $tables := sql:execute($con, 'show tables')/sql:column/text() let $doc := element { $db } { for $table in $tables return element { $table } { let $rows := sql:execute($con, 'select * from ' || $table) for $row in $rows return element row { for $col in $row/sql:column return element { $col/@name } { $col/data() } } } } (: return db:add($MariaBase, $doc, $db). :) return $doc return $doc gives: 1 CIO Office But return db:add($MariaBase, $doc, $db) results in my database in Relational -> Innovate/1 -> Innovate/1 -> Dienst/1 -> row/n (1 and n indicate parity) I expected that return db:add($MariaBase, $doc) would add $doc at the top level, resulting in Relational -> Innovate/1 -> Dienst/1 -> row/n but this results in an error (path is missing). According to the documentation, omitting the third parameter in db:add should be allowed, or did I misinterpret something? Cheers, Ben On 21-12-2021 at 13:20, Christian Grün wrote: Thanks. Does the query do what you are looking for? On Tue, Dec 21, 2021 at 12:47 PM wrote: Christian Grün wrote on 21-12-2021 10:18: Hi Ben, return db:add($db, $doc, $table || '.xml') Could you give us little examples for , and ? Best, Christian To the best of my knowledge, in MySQL and/or MariaDB DB-name and DB-schema are identical? The schema name I use is 'Innovate'. The table names (Tables_in_Innovate) are: Dienst, Mdw_Probleem, Mdw_Wens, Medewerker, Medewerker_dienst, Probleem, Wens. Ben PS. I hope you'll see this reply. For a few days now, all mail from basex-talk has been refused by Thunderbird. At least I don't see it anymore.
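The nesting that the query builds (schema element, then one element per table, then one `row` element per record, then one element per column) can be sketched independently of BaseX and MariaDB. The example below uses Python's built-in `sqlite3` as a stand-in database so that it runs anywhere; the column names `id` and `naam` are hypothetical, since the mail does not show the real ones.

```python
import sqlite3
import xml.etree.ElementTree as ET


def schema_to_xml(con: sqlite3.Connection, schema_name: str) -> ET.Element:
    """Build <schema><table><row><column>value</column></row></table></schema>."""
    root = ET.Element(schema_name)
    # SQLite's catalogue query plays the role of 'show tables' here.
    tables = [r[0] for r in con.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    for table in tables:
        table_el = ET.SubElement(root, table)
        cursor = con.execute(f'SELECT * FROM "{table}"')
        columns = [d[0] for d in cursor.description]
        for row in cursor:
            row_el = ET.SubElement(table_el, "row")
            for name, value in zip(columns, row):
                ET.SubElement(row_el, name).text = "" if value is None else str(value)
    return root


con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE Dienst (id INTEGER, naam TEXT)")  # hypothetical columns
con.executemany("INSERT INTO Dienst VALUES (?, ?)",
                [(1, "CIO Office"), (2, "Dictu")])

doc = schema_to_xml(con, "Innovate")
print(ET.tostring(doc, encoding="unicode"))
con.close()
```

Building one tree per schema, as here, corresponds to Ben's "one document per schema" layout; calling the inner loop once per table and serialising each table element separately would give Christian's "one document per table" alternative.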
Re: [basex-talk] Copy data from MariaDB into BaseX
At least this is a very good start! I'll see if I can manage to transfer all the tables in one nested command. But first I'll have to refresh my XPath and XQuery knowledge. I'll let you know about the results. Have a nice holiday, Ben On 21-12-2021 at 13:20, Christian Grün wrote: Thanks. Does the query do what you are looking for?
[basex-talk] Copy data from MariaDB into BaseX
Hi, After completing my work on the R-client, I started working on a Prolog-client. Long ago I wrote an application in SWI-Prolog which operated on data from a MySQL-database. (In the meantime I changed from MySQL to MariaDb). My goal is to write a new version of that application but now based on data which is stored in Basex. In the basexgui, I created an empty database "MariaBases" The following code can be used to select data in MariaDb: sql:init("org.mariadb.jdbc.Driver"), let $con := sql:connect('jdbc:mariadb://localhost:3306/', '', '') return sql:execute($con, "select * from Mdw_Wens") returns: http://basex.org/modules/sql;> 1 5 1 Is it possible to change the query-statement in such a way that the results are added to MariaBases//? -- Ben Engbers
Re: [basex-talk] Authentication in server protocol
Hi Christian, As far as I now understand, a socketConnection is not a single connection but in fact a pool of connections. And I believe this is language-independent. In R, the command socketSelect(list()) waits for the first of several socket connections and server sockets to become available. After inserting this command in my code, there is no longer any need to explicitly insert a sleep. Execution time for all the tests has been reduced to 1.4 seconds instead of 120 as before. Now I can really start using RbaseX! Ben On 08-12-2021 at 12:55, Christian Grün wrote: Hi Ben, I assume this challenge needs to be tackled in the R realm: If the Java client is used, no sleep is required at all. Hope this helps, Christian
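Waiting until a socket is readable, rather than sleeping for a fixed interval, is a general pattern. Here is a minimal sketch with Python's standard `select` module; a local socket pair and a background thread simulate the server, and the 0.2 second delay and the meaning of the `0x00` byte are assumptions made for the illustration.

```python
import select
import socket
import threading
import time

# A connected pair of sockets stands in for client and server.
server_side, client_side = socket.socketpair()


def delayed_status_byte() -> None:
    time.sleep(0.2)               # simulated server latency
    server_side.sendall(b"\x00")  # 0x00 stands for "accepted" in this sketch


threading.Thread(target=delayed_status_byte, daemon=True).start()

client_side.setblocking(False)
# Block until the socket is readable, or give up after 5 seconds.
readable, _, _ = select.select([client_side], [], [], 5.0)
status = client_side.recv(1) if readable else None
print(status)

client_side.close()
server_side.close()
```

Unlike a fixed sleep, the call returns as soon as data arrives, so the wait is never longer than the server actually needs.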
[basex-talk] Authentication in server protocol
Hi Christian, All my previous packages for RBaseX were based on using a blocking socket. Every attempt to use a non-blocking socket failed because I couldn't authenticate. In R, each read operation on a blocking socket uses a timeout of at least 1 second. The consequence was that executing 53 tests on my package took at least 116 seconds on my machine. I finally managed to use a non-blocking socket. Execution of the same tests now takes 3.8 seconds. It turned out that the crucial step was to introduce a sleep/wait between sending the authentication nonce and checking the status byte: code <- md5(paste(md5(code), nonce, sep = "")) %>% charToRaw() # send username + code auth <- c(charToRaw(username), as.raw(0x00), code, as.raw(0x00)) writeBin(auth, private$conn) ==> Sys.sleep(.1) Accepted <- readBin(conn, what = "raw", n = 1) == 0 My knowledge of working with sockets is limited, so maybe you can answer my question. Does the need for a sleep mean that I need to fix a bug in the R code, or should I use a setting in BaseX that takes the required delay into account? Ben Engbers
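The message layout in the R snippet above (username, a zero byte, the hashed code, a zero byte) can be written down compactly. A minimal sketch in Python that mirrors only what the snippet shows; how `code` is assembled before hashing is not shown in the mail, so it is left as a plain parameter here, and the function names are made up.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 digest of a string."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def auth_message(username: str, code: str, nonce: str) -> bytes:
    """Return username, 0x00, md5(md5(code) + nonce), 0x00 as raw bytes."""
    digest = md5_hex(md5_hex(code) + nonce)
    return username.encode("utf-8") + b"\x00" + digest.encode("ascii") + b"\x00"


# Example values only; a real nonce comes from the server's greeting.
message = auth_message("admin", "admin", "12345")
print(len(message))
```

After sending such a message, the client reads a single status byte; waiting for that byte with a readiness check instead of a fixed sleep is what the follow-up in this thread arrives at.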
[basex-talk] R-client RBaseX version 0.9.2
Hi, I have completely rewritten my R-client for BaseX. This new version can be downloaded from https://cran.r-project.org/web/packages/RBaseX/index.html or https://github.com/BenEngbers/RBaseX. This version should comply more with the server specification. Compared to the previous version, there are (only) a few changes to the interface. Ben Engbers
Re: [basex-talk] Access to "https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml" blocked?
They are both rejected. I also tried with "https://www.cnn.com" and "https://nos.nl". The first is accepted, the second is rejected. In Firefox all URLs are accepted. Ben On 11-11-2021 at 16:24, Imsieke, Gerrit, le-tex wrote: Hi Ben, What about other resources there, like: https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/input.xml https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/response.txt Do they pass on Windows? Gerrit On 11.11.2021 16:14, Ben Engbers wrote: Hi, Although I have never had any feedback on my R-client for BaseX, I have steadily been working on a new version (even though I am retired, I still like to program :-). At present all tests pass, except one test on adding content to a database. On my Linux machine 'url exists("https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml")' returns TRUE and 'first.xml' is added to a database. When executed on a Windows machine, the same test returns FALSE. I have tested other URLs - starting with https or http - and they are all accepted. Any clue why "https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml" is blocked? Ben
Re: [basex-talk] Access to "https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml" blocked?
On both machines I use the most recent versions of R and RStudio. I'll also drop a question in the R community and (tomorrow) I'll look at the URLs you suggested. Ben On 11-11-2021 at 16:45, Imsieke, Gerrit, le-tex wrote: Maybe related to the HTTP header field x-content-type-options: nosniff https://docs.microsoft.com/en-us/previous-versions/windows/internet-explorer/ie-developer/compatibility/gg622941(v=vs.85)?redirectedfrom=MSDN What is the tool/library you are using on Windows? Is it an R HTTP client that is interfacing some Windows DLL? Maybe they put this rejection in the DLL. Maybe you can find another proxying server for raw GitHub files? https://stackoverflow.com/questions/40728554/resource-blocked-due-to-mime-type-mismatch-x-content-type-options-nosniff Gerrit
[basex-talk] Access to "https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml" blocked?
Hi, Although I have never had any feedback on my R-client for BaseX, I have steadily been working on a new version (even though I am retired, I still like to program :-). At present all tests pass, except one test on adding content to a database. On my Linux machine 'url exists("https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml")' returns TRUE and 'first.xml' is added to a database. When executed on a Windows machine, the same test returns FALSE. I have tested other URLs - starting with https or http - and they are all accepted. Any clue why "https://raw.githubusercontent.com/BaseXdb/basex/master/basex-api/src/test/resources/first.xml" is blocked? Ben
Re: [basex-talk] Call for install/setup stories from users
Since I never had the need to keep older versions alive, apart from step 1 my install and upgrade procedures are the same: 1: mkdir ~/Programs/basex 2: Download BaseX.zip 3: Unzip to ~/Programs I never needed to (re)create the data directory or the symbolic link. After adding ~/Programs/basex/bin: to my path, I can start basexgui or basexserver & from the command line. Ben On 17-03-2021 at 20:26, Graydon wrote: On Wed, Mar 17, 2021 at 03:05:58PM -0400, Bridger Dyson-Smith scripsit: Per the recent thread about installing, I was hoping to convince some of you to share your experiences installing and running BaseX. Whether you use Mac OS, Windows, a Linux, or something else: how are you installing and running BaseX? This is on Fedora; it's pretty much strictly an update process by now, though the install process only skips step 3. 1. Download BaseX.zip from the website into ~/bin/basex 2. cd ~/bin/basex 3. mv basex BaseX$VERSION 4. unzip BaseX$VERSION.zip 5. cd basex 6. rmdir data 7. ln -s ../data . Because the executables are always on the same path -- ~/bin/basex/basex/bin -- I don't have to update the shortcut icons when I update versions. Every now and again I'll go through and prune old versions from bin/basex. It would be _better_ if there was a Fedora package and I didn't have to think about performing the update, but, well, BaseX is on a very short list of software that's useful enough to use even if it's not available as a Fedora RPM via dnf.
[basex-talk] Unsubscribe
I am not aware of ever having signed up for 'redhetpensioenstelsel', nor can I imagine that I created an account. What I do know is that I am regularly bothered by this list. Could you remove the address 'ben.engb...@be-logical.nl' from the list? Ben Engbers
Re: [basex-talk] RbaseX Client software, reading from a socket
Hi Christian, R provides a package which makes it rather easy to use C++ code. That is why I focused on C++. I first tried to understand the BaseXCPPAPI as provided by Jean-Marc Mercier but for a complete novice in C++, that code was way too complicated for an old man like me (I'm retiring TODAY ;-)). The C code from Alexander Holupirek is much easier to understand and for the moment I'm trying to convert his code to a C++ variant that can be used both by my RbaseX and by a new C++ client. Usually, I first experiment in the GUI to learn which statements I have to use for a query. After that I use the same statements in my client. I noticed that execution in the GUI often took only milliseconds while execution in the client could take minutes (depending on the size of the input or the results). It is my guess that this boils down to read/write operations on the connection. In R, I have isolated all actions upon the stream into one R class. And my first goal is to create a C++ class that is functionally equivalent. Hopefully that will improve performance. If I manage that, I am halfway to building a C++ client that offers the same functionality as my RbaseX client. Who knows if I'll succeed in that ;-) . Cheers, Ben On 30-06-2020 at 11:56, Christian Grün wrote: > Hi Ben, > > The BaseX server protocol was specified without focus on any > particular programming language. > > If there is no way to speed up stream processing with R, you could > have a look at the existing C++ client implementation [1]. Maybe > you’ve done so already? > > Cheers, > Christian > > [1] https://docs.basex.org/wiki/Clients
[basex-talk] RbaseX Client software, reading from a socket
Hi, I have no idea if it is used by others, but last March the most recent version of my RbaseX library was accepted by CRAN. To my knowledge there are no errors (all tests pass). The only problem is that performance is bad ;-(. Uploading a file or downloading the result of a query can take several minutes. I can understand why it takes so long. According to the server protocol, the end of a stream is indicated by a terminating 0-byte. And to distinguish a 'regular' 0-byte in a binary stream from the stop-0, 0-bytes (and FF-bytes) are preceded by an extra FF-byte. The only way to deal with these FF-bytes in R was to process each character/byte separately, and that takes a lot of time. I am trying to speed up everything by using C++ for all direct read/write operations. But I have never worked with C++ before. Nor do I understand exactly how streams are to be used. According to some posts on the internet, when reading from a stream the first 8 bytes are used to pass information on the length of the stream. My question is whether this is a standard way to pass information on that length. Or is it specific to C++ or Java? Ben
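The framing rule described above (a stream ends at a zero byte, and any zero or FF byte inside the payload is preceded by an extra FF byte) can be sketched in a few lines. The Python below follows only the rule as stated in the mail; the function names are made up for the illustration.

```python
def escape(data: bytes) -> bytes:
    """Prefix every 0x00 and 0xFF byte in the payload with 0xFF."""
    out = bytearray()
    for b in data:
        if b in (0x00, 0xFF):
            out.append(0xFF)
        out.append(b)
    return bytes(out)


def read_until_terminator(stream: bytes) -> tuple[bytes, bytes]:
    """Decode one escaped payload; return (payload, bytes after the terminator)."""
    out = bytearray()
    i = 0
    while i < len(stream):
        b = stream[i]
        if b == 0xFF:
            # Escape byte: the next byte is payload, whatever its value.
            if i + 1 >= len(stream):
                raise ValueError("dangling 0xFF escape byte")
            out.append(stream[i + 1])
            i += 2
        elif b == 0x00:
            # Unescaped zero byte: end of this payload.
            return bytes(out), stream[i + 1:]
        else:
            out.append(b)
            i += 1
    raise ValueError("no terminating 0x00 byte found")


payload = b"a\x00b\xffc"
framed = escape(payload) + b"\x00" + b"rest"
print(read_until_terminator(framed))
```

Because the end of the payload is only known once the unescaped zero byte is seen, a reader cannot skip ahead by a known length; this is a property of the framing itself rather than of any particular language. Reading the socket in large chunks and scanning the buffer in memory, as above, is usually much cheaper than issuing one read call per byte.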
[basex-talk] Error in the text for the server protocol?
Hi Michael, I only noticed today that in the server protocol (https://docs.basex.org/wiki/Server_Protocol) there are 2 different instructions for adding a new resource. On page 3: ADD \09 {name} {path} {input} On page 6: void add(String path, InputStream input) It was pure luck that I mixed up both instructions. I have implemented ADD as \09 {path} {input} and that works perfectly. Session$Add("Test.xml", "Content 1") adds resource Test.xml with the xml as content. Meanwhile, it is very easy for me to change the instructions that are sent to the server, so I tried to add {name} to the ADD command. That results in errors. My guess is that you should use either {name} or {path} and that ADD only works on the database in use. Is that correct? Cheers, Ben
[basex-talk] Binding a variable to a sequence
Hi, I am still trying to improve my R-package and at the moment I am working (again) on the 'Binding' command. I have added this XML to a database: This query: for $t in collection("TestDB/Books")/book where $t/@author = "Walmsley" return $t/@title/string() returns: XQuery My command: Bind(Query_3, "$name", list("Walmsley", "Wickham")) sends the following byte sequence to the server: 57 61 6c 6d 73 6c 65 79 01 57 69 63 6b 68 61 6d 00 00 And this sequence is accepted by the server. In Query_3, I try to bind $name to the sequence: declare variable $name external; for $t in collection('TestDB/Books')/book where $t/@author = $name return $t/@title/string() The query is executed but no results are given. (I had expected that it would return the sequence {"XQuery", "Advanced R"}.) How should I correct the query statement? Cheers, Ben
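The byte sequence quoted above can be reproduced with a few lines of code, which makes it easier to check what a client actually sends. A minimal sketch in Python, assuming that the `01` byte separates the items and that the two trailing `00` bytes terminate the value and an empty type string respectively; that reading is inferred from the bytes in the mail, not taken from the protocol text.

```python
def encode_bind_value(values: list[str], value_type: str = "") -> bytes:
    """Join items with 0x01, then append 0x00, the type string, and 0x00."""
    joined = b"\x01".join(v.encode("utf-8") for v in values)
    return joined + b"\x00" + value_type.encode("utf-8") + b"\x00"


payload = encode_bind_value(["Walmsley", "Wickham"])
print(payload.hex(" "))
# 57 61 6c 6d 73 6c 65 79 01 57 69 63 6b 68 61 6d 00 00
```

Printing the payload as hex, as done here, gives output that can be compared byte for byte with a capture of the R client's traffic.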
Re: [basex-talk] How to apply array:for-each on a - sequence - of arrays? SOLVED
Hi, > To insert the third value into each array I think you want > > let $result := $idf ! array:append(., math:log($count div .(2) )) This works! Martin and Graydon, thanks for the help and the explanation. Ben import module namespace tidyTM = 'http://www.be-logical.nl'; declare function local:step_one($nodes as node()*) as array(*)* { let $text := for $node in $nodes return $node/text() => tokenize() => distinct-values() let $idf := $text => tidyTM:wordCount_arr() return $idf }; declare function local:wordFreq_idf($nodes as node()*) as array(*) { let $count := count($nodes) let $idf := local:step_one($nodes) let $result := $idf ! array:append(., math:log($count div .(2) )) return $result }; let $nodes := collection('IncidentRemarks/Incidenten-180101-190630.csv')/csv/record/INC_RM let $Stoppers := doc('TextMining/Stopwoorden.txt')/text/line/text() return local:wordFreq_idf( tidyTM:remove_Stopwords($nodes, "Stp", $Stoppers)) -- declare function tidyTM:wordCount_arr( $Words as xs:string*) as array(*)* { for $w in $Words let $f := $w group by $f order by count($w) descending return ([$f, count($w)]) } ; --- ["probleem", 703, 9.362885817944681e-1] ["opgelost.", 248, 1.9782167274401508e0] ...
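The accepted fix maps over the sequence and appends one value to each array. The same step looks like this in Python, as a rough analogue only: lists stand in for XQuery arrays, the function name is made up, and the document count of 1780 is the figure quoted in the original question of this thread.

```python
import math


def append_idf(pairs, n_documents):
    """Return [word, doc_count, idf] for every [word, doc_count] pair."""
    return [[word, n, math.log(n_documents / n)] for word, n in pairs]


rows = append_idf([["probleem", 703], ["opgelost.", 248]], n_documents=1780)
for row in rows:
    print(row)
```

The list comprehension plays the role of the `!` operator: each pair is handled on its own, so the function that extends a pair never sees the whole sequence at once.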
Re: [basex-talk] How to apply array:for-each on an array of arrays?
Hi, For (my personal) clarity, I have split up the original function in two parts: declare function local:step_one($nodes as node()*) as array(*)* { let $text := for $node in $nodes return $node/text() => tokenize() => distinct-values() let $idf := $text => tidyTM:wordCount_arr() return $idf }; In local:step_one(), I first create a sequence with the distinct tokens for each $node. All the sequences are joined in $text. I then call wordCount_arr to count the occurrences of each word in $text: declare function tidyTM:wordCount_arr( $Words as xs:string*) as array(*) { for $w in $Words let $f := $w group by $f order by count($w) descending return ([$f, count($w)]) } ; I would say that tidyTM:wordCount_arr returns a sequence of arrays, but I am not certain whether I have specified the correct return type. Calling local:step_one(tidyTM:remove_Stopwords($nodes, "Stp", $Stoppers)) returns: ["probleem", 703] ["opgelost.", 248] I had hoped that calling the following local:wordFreq_idf would add the idf to each element, but instead I get an error: declare function local:wordFreq_idf($nodes as node()*) as array(*) { let $count := count($nodes) let $idf := local:step_one($nodes) let $result := for-each( $idf, function($z) {array:append ($z, math:log($count div $z(2) ) ) } ) return $result }; [XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): $idf := ([ "probleem", 703 ], [ "opgelost.", 248 ], ...). Cheers, Ben On 31-03-2020 at 16:29, Martin Honnen wrote: > So does the working function return a sequence of arrays? That doesn't > match the > as array(*) > return type declaration, it seems. > > What does tidyTM:wordCount_arr() return, a single array (of atomic items)?
Re: [basex-talk] How to apply array:for-each on an array of arrays?
Hi, > => means "take the thing on the left and substitute it for the first > parameter of the function on the right, so I thought it meant "The first parameter on the right will be substituted with the thing on the left"? > ('weasels') => replace('weasels','mustelids') works > > ('weasels','badgers') => replace('weasels','mustelids') DOES NOT work > > This is because a one-item sequence can be treated as the single string > value the first parameter of replace() requires, but a > greater-than-one-item sequence can't be. (This one gives you "item > expected, sequence found" if you try it from the GUI.) The following is quite similar to the 'piping' mechanism in R. I'll start experimenting with it. Thanks, Ben > ! means "take each item of the sequence on the left and pass it to the > thing on the right in turn", so > > ('weasels','badgers') ! replace(.,'weasels','mustelids') works. > > (note that replace() got its first parameter back as the context item > dot.) > > so if you take > > => array:for-each(function($idf) {array:append($idf,math:log($count div > $idf[2]) )}) > > and replace it with > ! array:for-each(.,function($idf) {array:append($idf,math:log($count div > $idf[2]) )}) > > (note the context-item dot!) > > you should at least get a different error message. > > -- Graydon >
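Graydon's distinction between `=>` and `!` corresponds to the difference between calling a function once with a whole list and calling it once per item. A loose analogy in Python; the `replace` function here is a stand-in that accepts a single string only, chosen to mimic the "item expected, sequence found" failure, and is not the XQuery function.

```python
def replace(text: str, old: str, new: str) -> str:
    """Stand-in for a function whose first parameter must be one string."""
    if not isinstance(text, str):
        raise TypeError("item expected, sequence found")
    return text.replace(old, new)


animals = ["weasels", "badgers"]

# Arrow style: the whole sequence becomes the first argument, which fails.
try:
    replace(animals, "weasels", "mustelids")
    arrow_ok = True
except TypeError:
    arrow_ok = False

# Bang style: each item is passed to the function in turn.
mapped = [replace(a, "weasels", "mustelids") for a in animals]

print(arrow_ok)  # False
print(mapped)    # ['mustelids', 'badgers']
```

In R terms, the arrow resembles a pipe that hands over the whole left-hand value, while the bang resembles applying the function over a vector element by element.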
Re: [basex-talk] How to apply array:for-each on an array of arrays?
On 31-03-2020 at 01:18, Graydon wrote: > On Mon, Mar 30, 2020 at 11:16:23PM +0200, Ben Engbers scripsit: > [snip] >> For "probleem", the idf should be calculated as ln($count/703). Since >> there are 1780 nodes this would result in 0.929011751. >> I tried to extend the 'let $idf' line with: >> => array:for-each(function($idf) {array:append($idf, >> math:log($count div $idf[2]) )}) >> which should result in ["probleem", 703, 0.929011751] >> >> but no matter what I do, every time I get this error: >> [XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): ([ >> "probleem", 703 ], [ "opgelost.", 248 ], ...). > > The error says you're trying to feed a sequence of arrays to an array > function; maybe you want ! where you have => ? > > -- Graydon > Hi, Following your remark about feeding a sequence of arrays, I first tried to apply 'for-each' instead of 'array:for-each'. Alas, that didn't help ;-(, the error was still the same. I then tried to understand what you mean by the '!'. In the book by Priscilla Walmsley, the ! is mentioned as a simple map operator. How is that related to this problem? Cheers, Ben
[basex-talk] How to apply array:for-each on an array of arrays?
Hi, In text mining, the 'idf' or inverse document frequency is defined as idf(term) = ln(n_documents / n_documents_containing_term). I am working on a function that should return this idf. This function: declare function local:wordFreq_idf($nodes as node()*) as array(*) { let $count := count($nodes) let $text := for $node in $nodes return $node/text() => tokenize() => distinct-values() let $idf := $text => tidyTM:wordCount_arr() return $idf }; returns: ["probleem", 703] ["opgelost.", 248] ["dictu", 235] ["opgelost", 217] ["medewerker", 193] ... For "probleem", the idf should be calculated as ln($count/703). Since there are 1780 nodes this would result in 0.929011751. I tried to extend the 'let $idf' line with: => array:for-each(function($idf) {array:append($idf, math:log($count div $idf[2]) )}) which should result in ["probleem", 703, 0.929011751] but no matter what I do, every time I get this error: [XPTY0004] Cannot promote (array(xs:anyAtomicType))+ to array(*): ([ "probleem", 703 ], [ "opgelost.", 248 ], ...). Is it possible to apply array:for-each on an array of arrays? Ben
[basex-talk] New version for RBaseX
Hi, I am glad that version 0.2.4 of my R-package 'RBaseX' has been accepted by CRAN (https://cran.r-project.org/package=RBaseX)! Large parts of the earlier version of 'RBaseX' have been rewritten and the resulting code is much cleaner. I have added tests and, thanks to those tests, I found (and fixed) several bugs. To my knowledge, the full server protocol has now been implemented. One of the main differences concerns error handling. All responses to client requests to the basexserver end with either a \00 byte or a \01 byte. I have used this feature to add an extra layer of error handling. The default is still the regular tryCatch method. But after setting 'intercept' to TRUE, you can now define your own reaction to errors. See the following example: Session <- BasexClient$new("localhost", 1984L, username = "admin", password = "admin") Session$set_intercept(TRUE) Session$Execute("drop DB TestDB") Session$Execute("Open TestDB") if (!Session$get_success()) { Session$Create("TestDB") Session$Add("Test.xml", "Content 1") } Session$Execute("Close") Session$restore_intercept() I am already working on a new version in which I will implement more specific R-related topics (populating dataframes with XQuery results). Ben
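The status-byte convention mentioned above is what makes a client-side success flag possible. A minimal sketch in Python of splitting a raw response into its body and a success flag; it relies only on the statement in the mail that the final byte is \00 or \01, with \00 read as success, and the function name and sample responses are made up.

```python
def split_status(response: bytes) -> tuple[bytes, bool]:
    """Return (everything before the last byte, True if the last byte is 0x00)."""
    if not response:
        raise ValueError("empty response")
    body, status = response[:-1], response[-1]
    if status not in (0x00, 0x01):
        raise ValueError(f"unexpected status byte: {status:#04x}")
    return body, status == 0x00


# Made-up responses for illustration.
ok_body, ok = split_status(b"Database 'TestDB' was opened.\x00\x00")
err_body, failed_ok = split_status(b"Database 'TestDB' was not found.\x00\x01")

print(ok, failed_ok)  # True False
```

A client can store the flag after each command, which is roughly what a getter such as `get_success()` in the example above would expose, instead of raising an error immediately.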
Re: [basex-talk] Is it possible to use 'Stopwords' in a query?
On 02-03-2020 at 13:27, Christian Grün wrote: > Hi Ben, > > Here is an alternative version that, as I believe, should match your > requirements better: > > let $words := distinct-values( > for $text in db:open('Incidents')/csv/record/INC_RM > return ft:tokenize($text) > ) > let $stopwords := db:open('Stopwords')/text/line > let $result := $words[not(. = $stopwords)] > return sort($result) > Hi Christian, I don't have a separate database 'Stopwords'. The file 'Stopwoorden.txt' was used as an option while creating the 'Incidents' database. Since I have several lists with stopwords and several lists that can be used for sentiment analysis, I have stored all those files in a 'Textmining' database. Without caring about stopwords, this query works: let $words := for $text in collection('IncidentRemarks/Incidents')/csv/record/INC_RM return ft:tokenize($text) return $words ("sort($words)" returns a long list of numbers) In an article ("Full-Text Search in XML Databases" by Skoglund, Robin, 2009), I saw this example on page 23: (: will match "propagating few errors" :) /books/book[@number="1"]//p ftcontains "propagation of errors" with stemming with stop words ("a", "the", "of") The query may be changed to "stemming without stop words". What I would like to see in BaseX is that, similar to that example, 'Stopwords' could be used as if it were a separate resource in the 'Incidents' database and that it could be used as follows in the query: let $words := for $text in collection('IncidentRemarks/Incidents')/csv/record/INC_RM with stemming without stop words return ft:tokenize($text) return $words As far as I understand, 'stemming' has already been made available in the ft module. Would it also be possible to use STOPWORDS in a similar way? Cheers, Ben
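Christian's query boils down to three steps: tokenize every text, take the distinct tokens, and drop those that appear in the stopword list. A minimal sketch of the same steps in Python; the regular-expression tokenizer is a crude stand-in for `ft:tokenize` (it lower-cases and splits on non-word characters but does no diacritics handling), and the sample sentences and stopwords are made up.

```python
import re


def tokenize(text: str) -> list[str]:
    """Crude tokenizer: lower-case the text and keep runs of word characters."""
    return re.findall(r"\w+", text.lower())


def distinct_words_without_stopwords(texts, stopwords):
    """Distinct tokens of all texts, minus the stopwords, in sorted order."""
    stop = {s.lower() for s in stopwords}
    words = {w for t in texts for w in tokenize(t)}
    return sorted(words - stop)


texts = ["Het probleem is opgelost.", "Probleem met de printer"]
stopwords = ["het", "is", "met", "de"]
print(distinct_words_without_stopwords(texts, stopwords))
# ['opgelost', 'printer', 'probleem']
```

Keeping the stopword list as ordinary data, as in the query and in this sketch, means several lists can be swapped in without rebuilding any index.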
Re: [basex-talk] Should it be possible to declare a function in the client?
On 02-03-2020 at 13:27, Christian Grün wrote:
> Hi Ben,
>
> Here is an alternative version that, as I believe, should match your
> requirements better:
>
> let $words := distinct-values(
>   for $text in db:open('Incidents')/csv/record/INC_RM
>   return ft:tokenize($text)
> )
> let $stopwords := db:open('Stopwords')/text/line
> let $result := $words[not(. = $stopwords)]
> return sort($result)
>
> There is no need to remove nbsp substrings as they’ll never occur in
> your input, and the ft:tokenize function will ensure that your input
> (case, special characters, diacritics) will be normalized (see [1,2]
> for more details). Using functx is perfectly valid; I only removed the
> reference to make the code a bit shorter.
>
> Hope this helps,
> Christian
>
> [1] http://docs.basex.org/wiki/Full-Text_Module#ft:tokenize
> [2] http://docs.basex.org/wiki/Full-Text

Hi Christian,

Since my primary goal at this moment is to see how BaseX/XQuery can be used for full-text analysis (and to compare the results and the effort needed with similar tasks in R), I am very glad that you brought the ft:tokenize() function to my attention!

Ben

PS. Just for fun, I created a repository with this tiny function:

    declare function tidyTM:wordFreqs(
      $Words as xs:string*
    ) {
      for $w in $Words
      let $f := $w
      group by $f
      order by count($w) descending
      return ($f, count($w))
    };

It took less than 10 minutes to create a repository and populate it with this function. Creating an R package takes much longer!
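Editorial sketch: for readers who want to cross-check the output of a word-frequency function like the one above outside BaseX, the same grouping can be expressed with Python's standard library. The name `word_freqs` is hypothetical, and unlike the XQuery version (which returns a flat sequence of word, count, word, count, ...) this sketch returns pairs.

```python
from collections import Counter


def word_freqs(words):
    """Return (word, count) pairs, most frequent first."""
    return Counter(words).most_common()


if __name__ == "__main__":
    for word, count in word_freqs(["kat", "mat", "kat", "zit", "kat", "mat"]):
        print(word, count)
```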
Re: [basex-talk] Should it be possible to declare a function in the client?
On 28-02-2020 at 14:39, Christian Grün wrote:
> I was wondering about nbsp as well. Maybe you don’t need it at all,
> but we’d need to have a look at your files.
>
> Could you additionally provide us with minimized instances of your
> Incidents and Stopwoorden.txt XML documents? They should have the same
> structure, but contain only a few lines of contents.

It should be relatively easy to create one database with the (approximately 500) stopwords and another database with the Incidents. Shall I send you a backup of those two databases?

Ben
Re: [basex-talk] Should it be possible to declare a function in the client?
On 27-02-2020 at 22:03, Majewski, Steven Dennis (sdm7g) wrote:
> Also: is ‘(nbsp;)’ what you want as part of your regex to also catch the
> ampersand?
> I’m just guessing your intent here.
> You could also try ‘(\W|nbsp;)+’ - i.e. non-word, but I’m kind of
> assuming that it handles non-normalized unicode accented characters correctly
> and reads them as word chars and not delimiters. That would be, of course,
> the right thing, but I’d probably test it first.
>
> -- Steve.

I just copied the regular expression from this page: "https://en.wikibooks.org/wiki/XQuery/Tag_Cloud" (using regex always gives me headaches ;-( ). But even after removing the "|[n][b][s][p][;]" from the regex, basexgui still returns 5843.

Ben
Re: [basex-talk] Should it be possible to declare a function in the client?
On 27-02-2020 at 19:19, Christian Grün wrote:
> It’s difficult to understand what’s going on here. Could you please
> provide us self-contained queries without the R wrapper code?

Version 1:

    import module namespace functx = 'http://www.functx.com';
    (: Extract the text :)
    let $txt := collection('IncidentRemarks/Incidents')/csv/record/INC_RM/text()
    (: Convert to lower-case and tokenize :)
    let $INC_RM := tokenize(lower-case(string-join($txt)), '(\\s|[,.!:;]|[n][b][s][p][;])+')
    (: Read Stopwords :)
    let $Stoppers := doc('TextMining/Stopwoorden.txt')/text/line/text()
    (: Remove Stopwords :)
    let $Stop := functx:value-except($INC_RM, $Stoppers)
    return $Stop

My R code first executes this as XQUERY and then calculates the length of the returned list (= 5842).

Version 2:

    import module namespace functx = 'http://www.functx.com';
    let $txt := collection('IncidentRemarks/Incidents')/csv/record/INC_RM/text()
    let $INC_RM := tokenize(lower-case(string-join($txt)), '(\\s|[,.!:;]|[n][b][s][p][;])+')
    let $Stoppers := doc('TextMining/Stopwoorden.txt')/text/line/text()
    let $Stop := functx:value-except($INC_RM, $Stoppers)
    return count($Stop)

This returns the length of the sequence as counted by BaseX (5843 words).

The '\\' in the regular expression is intentional (R-specific). With a single '\', the query can be executed in BaseXGUI.

Does this help?

Ben
Re: [basex-talk] Should it be possible to declare a function in the client?
On 27-02-2020 at 16:41, Christian Grün wrote:
> Hi Ben,
>
> …create a query object, and attach the actual function call to your
> query string.

I already thought about that, but what would be the benefit of repeating the function definition every time I want to call the function ;-( ?

> If you want to make XQuery code persistent for future invocations, you
> can include your function in an XQuery library module and install this
> module in the repository [1].

I will probably go for this.

While experimenting (I am trying to speed up the queries), I compared the results of these two queries:

    Word_Inc_Rm_Stop_txt <- "import module namespace functx = 'http://www.functx.com';
      let $txt := collection('IncidentRemarks/Incidents')/csv/record/INC_RM/text()
      let $INC_RM := tokenize(lower-case(string-join($txt)), '(\\s|[,.!:;]|[n][b][s][p][;])+')
      let $Stoppers := doc('TextMining/Stopwoorden.txt')/text/line/text()
      let $Stop := functx:value-except($INC_RM, $Stoppers)
      return $Stop"
    Word_Inc_Rm_Stop <- Session$Execute(as.character(glue("xquery {Word_Inc_Rm_Stop_txt}")))$result[[1]]
    Word_Inc_Rm_Stop_Count <- length(Word_Inc_Rm_Stop)

    Word_Inc_Rm_Stop_txt_2 <- "import module namespace functx = 'http://www.functx.com';
      let $txt := collection('IncidentRemarks/Incidents')/csv/record/INC_RM/text()
      let $INC_RM := tokenize(lower-case(string-join($txt)), '(\\s|[,.!:;]|[n][b][s][p][;])+')
      let $Stoppers := doc('TextMining/Stopwoorden.txt')/text/line/text()
      let $Stop := functx:value-except($INC_RM, $Stoppers)
      return count($Stop)"
    Word_Inc_Rm_Stop_Count_2 <- Session$Execute(as.character(glue("xquery {Word_Inc_Rm_Stop_txt_2}")))$result[[1]]

These are the processing times.

Version 1:

    > print(proc.time() - ptm)
       user  system elapsed
      2.903   0.022   3.160

Version 2:

    > print(proc.time() - ptm)
       user  system elapsed
      0.041   0.004   1.089

I guess it makes sense to put effort into speeding up my code. But what bothers me is the following.

The first query computes the length of the vector that is returned; the result is 5842. The second query returns the length as computed by BaseX; this result is 5843. The GUI also returns 5843. I copied the output of "return $Stop" into a new LibreOffice document, and that document counts 5842 words.

Who is right?

Cheers,
Ben
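Editorial sketch: one possible explanation for an off-by-one like this (not confirmed anywhere in the thread) is a zero-length token. XQuery's fn:tokenize returns an empty string when the input starts or ends with a separator, so count() includes it, whereas a client or word processor that ignores empty strings would report one fewer. Python's re.split behaves analogously, which makes the effect easy to reproduce; the helper names below are hypothetical and the pattern is a simplified version of the one in the query.

```python
import re

# Simplified separator pattern: whitespace or common punctuation, one or more times.
SEPARATORS = re.compile(r"(?:\s|[,.!:;])+")


def split_words(text):
    """Split on separators, keeping any empty tokens at the edges."""
    return SEPARATORS.split(text)


def non_empty(tokens):
    """Drop zero-length tokens, as a word counter typically would."""
    return [t for t in tokens if t]


if __name__ == "__main__":
    tokens = split_words(" alpha, beta")
    print(len(tokens), len(non_empty(tokens)))
```

If this is the cause, filtering with a predicate such as `[. != '']` on the server side would make both counts agree.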
[basex-talk] Should it be possible to declare a function in the client?
Hi,

My RBaseX client is finally stable enough to use for real development. All regular commands are executed without errors. But now I am facing another problem. In a client session, I want to use the following function:

    fn_get_words_txt <- "declare function local:cloudWords(
        $Veld as xs:string) as xs:string* {
      let $base := collection('IncidentRemarks/Incidents')/csv/record
      let $txt := string-join( $base/*[name() = $Veld]/text(), ' ')
      let $words := tokenize($txt,'(\\s|[,.!:;]|[n][b][s][p][;])+')
      return ($words)};"

(Doubling the '\' in the regular expression string is R-specific.)

    Session$Execute(fn_get_words_txt)

returns:

    Gestopt bij , 1/8: Onbekend commando: declare. Probeer 'help'.
    Error in Session$Execute(fn_get_words_txt) :
      Gestopt bij , 1/8: Onbekend commando: declare. Probeer 'help'.

(In English: "Stopped at , 1/8: Unknown command: declare. Try 'help'.")

    fn_get_words_Query <- Session$Query(fn_get_words_txt)
    fn_get_words_Query$queryObject$ExecuteQuery()

returns:

    Error in private$default_query_pattern(match.call()[[1]]) :
      Gestopt bij ., 5/20: [XPST0003] Expecting expression.

Since fn_get_words_txt represents neither a regular command nor a regular function call, I understand these errors.

Before I even start trying to implement this in my package, my question is whether it should be possible to create local functions for a session. If so, any idea how to tackle this problem? Could the problem be generalized to the question of how a prolog can be added or changed?

Cheers,
Ben
[basex-talk] Dynamic evaluation?
Hi,

I want to declare a function that can operate on various elements of a record. It should be possible to pass the element name as a parameter to the function. I tried this:

    declare function local:cloudWords(
      $Veld as xs:string
    ) as xs:string* {
      let $base := collection('IncidentRemarks/Incidentsv')/csv/record
      let $txt := string-join( $base/$Veld/text(), " ")
      let $words := tokenize($txt,'(\s|[,.!:;]|[n][b][s][p][;])+')
      return ($words)
    };

    let $retValue := local:cloudWords("INC_RM")
    return $retValue

But I get this error:

    [XPTY0019] text(): node expected, xs:string found: "INC_RM".

Should I use xquery:eval to transform "$base/$Veld/text()" into "$base/INC_RM/text()"?

Ben
[basex-talk] BaseX GUI language settings
Hi, My default language for basexgui is Dutch but I want to create screenshots from a GUI that uses English. How can I switch the language temporarily? Cheers, Ben
Re: [basex-talk] update:apply, Context is undeclared. (Newbie)
On 19-02-2020 at 12:08, Ben Engbers wrote:
> Hi,
>
> I have a database that contains several thousand records with elements
> that I will never need so I want to remove them.
> The following code snippet returns the expected element:
>
> I tried to use update:apply to update the database but when I execute
> the following function, I get this message:
> [XPDY0002] element(functx:remove-elements): Context is undeclared
> ---
> import module namespace functx = 'http://www.functx.com';
> declare %updating function local:clean_verbs(
>   $old as node(),
>   $rem as xs:string*
> ) as empty-sequence() {
>   update:apply(functx:remove-elements, [$old, $rem])
> };
>
> let $p := collection("TextMining/nl-verbs.csv")/csv/record[1]
> let $remove := ("onbekend1", "onbekend2", "onbekend3", "onbekend4")
>
> return local:clean_verbs($p, $remove)
> --
>
> I have two questions:
> 1: If I want to use the update module, how should I provide the context
> to the query?
> 2: How can I update all records without making use of update:apply or
> update:for-each (what is the benefit of the update-module)?

With this code, I managed to replace all records:

    import module namespace functx = 'http://www.functx.com';
    let $old := collection("TextMining/nl-verbs.csv")/csv/record
    let $remove := ("onbekend1", "onbekend2", "onbekend3", "onbekend4")
    for $o in $old
    return replace node $o with functx:remove-elements($o, $remove)

My remaining questions:
1: How can I achieve the same task using functx:remove-elements and update:for-each?
2: What's the benefit of using the update module?

Cheers,
Ben
[basex-talk] update:apply, Context is undeclared. (Newbie)
Hi,

I have a database that contains several thousand records with elements that I will never need, so I want to remove them. The following code snippet returns the expected element:

    import module namespace functx = 'http://www.functx.com';
    let $p := collection("Patterns/nl-verbs.csv")/csv/record[1]
    let $remove := ("onbekend1", "onbekend2", "onbekend3", "onbekend4")
    return functx:remove-elements($p, $remove)

I tried to use update:apply to update the database, but when I execute the following function, I get this message:

    [XPDY0002] element(functx:remove-elements): Context is undeclared

    import module namespace functx = 'http://www.functx.com';
    declare %updating function local:clean_verbs(
      $old as node(),
      $rem as xs:string*
    ) as empty-sequence() {
      update:apply(functx:remove-elements, [$old, $rem])
    };

    let $p := collection("Patterns/nl-verbs.csv")/csv/record[1]
    let $remove := ("onbekend1", "onbekend2", "onbekend3", "onbekend4")
    return local:clean_verbs($p, $remove)

I have two questions:
1: If I want to use the update module, how should I provide the context to the query?
2: How can I update all records without making use of update:apply or update:for-each (what is the benefit of the update module)?

Cheers,
Ben
[basex-talk] Restore original lay-out
Hi,

I don't know how ;( but somehow I managed to change the layout of the GUI. Now I have the Result panel on top of the Info panel. How can I restore the original layout (Result at bottom-left and Info at bottom-right)?

Cheers,
Ben
Re: [basex-talk] XDM metadata (SOLVED)
On 12-02-2020 at 10:23, Ben Engbers wrote:

While reading the online documentation, I saw that there is an internal link to http://www.docs.basex.org/wiki/Server_Protocol:_Types#XDM_Meta_Data, which explains that in most cases the XDM metadata is nothing other than the Type ID. So my code is giving the expected results.

Ben
Re: [basex-talk] Full text and stopwords
On 12-02-2020 at 10:21, Ben Engbers wrote:
> Hi Christian,
> Would it be a good approach to create a separate database for stop words and sentiments?
>
> Cheers,
> Ben
[basex-talk] XDM metadata
Hi Christian,

According to the server protocol, when \04 is sent to the server, the resulting items of the query are returned as strings, each prefixed by a single byte. With my RBaseX package, the result of "for $i in 1 to 2 return Text { $i }" is:

    "0b" "Text 1" "0b" "Text 2"

When \1F is sent, the strings should be prefixed by the XDM metadata. In my case, however, the output is the same as with \04.

Is it possible to get the same output for queries in the basexgui, so that I can see which output should be expected?

This is the last remaining problem in my package. If I can resolve it, I can upload a new version.

Cheers,
Ben
[basex-talk] Full text and stopwords
Hi Christian,

According to the docs, a stopword list can be used to decrease the size of the full-text index. I had no problems using such a list while creating a database. Is it also possible to use this list for other purposes?

1. According to XQueryX 3.1.pdf, it is possible to use a sequence of stopwords in a query:

       /books/book[@number="1"]//p contains text "propagating of errors" using stop words ("a", "the", "of")

   How can I use this list in BaseX while building queries?
2. Is it possible to add words to the list after it has been loaded? Suppose it turns out that my text contains a lot of names that I want to exclude. How can I add those names to the stopword list?
3. If I want to create a word cloud, I want to use all the words that remain after tokenization and after removing all the words on the stopword list. (I found this item: https://en.wikibooks.org/wiki/XQuery/Tag_Cloud. It might be a good starting point for creating a word cloud.)

Cheers,
Ben
Re: [basex-talk] basexclient "Failed to construct terminal"
On 11-02-2020 at 13:05, Graydon wrote:
> My guess -- stress "guess" -- is that lucene-stemmers is presumably
> Apache Lucene, which BaseX might well use -- writing your own stemmer
> seems like unnecessary suffering, and BaseX does do word stemming as
> part of the full-text capability -- and since current Apache Lucene is
> version 8.4.1 -- https://lucene.apache.org/ -- it seems likely you could
> be running into an error between a Lucene that BaseX expects to be using
> and the (way-old) version 3.4.0 on the CLASSPATH getting loaded instead.
>
> But I don't know.
>
> Can you take lucene-stemmers-3.4.0.jar off your CLASSPATH and see what
> happens?
>
> -- Graydon

If you want to use stemming in Dutch (as I do), http://docs.basex.org/wiki/Full-Text says that, in addition to the stemming support that is already present, you have to add http://files.basex.org/maven/org/apache/lucene-stemmers/3.4.0/lucene-stemmers-3.4.0.jar to your CLASSPATH.

I have removed that jar from the CLASSPATH, but that didn't make any difference. Later on, I'll gradually remove all Java-related entries from my PATH and see what happens.

Ben
Re: [basex-talk] basexclient "Failed to construct terminal"
On 10-02-2020 at 15:40, Graydon wrote:
> In general, it looks like this is your environment rather than the
> package, but it'd be nice to be able to prove it.
>
> Grab a fresh copy of current stable basex via the zip archive, unpack
> that in some other directory entirely -- ideally belonging to a
> different user, no reason you can't create a test user if you have root
> on the machine -- and see what it does there?
>
> -- Graydon (who will admit to being rather baffled)

I created a new user, then copied and unpacked Basex931.zip. Out of the box, everything works fine. :-)

I switched back to my personal account. There I get the same errors as before ;-(. What surprises me, however, is that despite the errors, basex is fully operational.

So you are right, it is my environment that causes the error. The only real difference between the regular and the test account is that for the regular account, I have added a CLASSPATH entry for lucene-stemmers-3.4.0.jar.

Could it be that this jar causes the error?

Ben
Re: [basex-talk] basexclient "Failed to construct terminal"
On 10-02-2020 at 14:59, Graydon wrote:
> On Mon, Feb 10, 2020 at 02:28:08PM +0100, Ben Engbers scripsit:
> What it looks like you've got going on is a situation where basex uses
> modules and the java it's getting doesn't, but that doesn't explain why
> the gui runs fine. It makes me suspect that you're having an
> interaction with httpd somehow.
>
> -- Graydon

The only thing that might use httpd somehow is my RBaseX package (I have again rewritten large portions, cleaned up the classes, added tests and so on). I can't think of anything else that uses httpd. And even after rebooting, basex/basexclient still give errors.

What else should I try?

Ben
Re: [basex-talk] basexclient "Failed to construct terminal"
On 10-02-2020 at 14:07, Graydon wrote:
> On Mon, Feb 10, 2020 at 11:47:55AM +0100, Ben Engbers scripsit:
>> Whenever I try to start basex or basexclient on my Fedora 31 linux
>> distribution, I get this output:
> [error message snipped]
>>
>> Am I missing something?
>
> I'm having no issues on Fedora 31, so I can at least say it's not
> inherent to the distro. But then again I'm using the gui; let's check.

The GUI works fine (always has). It was only last week that I first had to use the basexclient.

> 08:03 bin % ./basex
> /home/graydon/bin/basex/basex/.basex: writing new configuration file.
> BaseX 9.3.1 [Standalone]
> Try 'help' to get more information.
>>
>
> That's from inside the basex bin directory.
>
> How are you installing basex? I always use the zip version and just
> unpack it in $HOME/bin. Does the gui -- initialized from the basexgui
> script -- run for you?
>
> -- Graydon

I added the directory containing basex/bin to my path. 'basexserver &', 'basexserver stop' and 'basexserverstop' execute without returning errors.

This is the output from basexhttp (this is the first time I tried this):

    BaseX 9.3.1 [HTTP Server]
    SLF4J: Class path contains multiple SLF4J bindings.
    SLF4J: Found binding in [jar:file:/home/bengbers/Programs/basex/lib/slf4j-simple-1.7.26.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/bengbers/Programs/basex/lib/slf4j-simple-1.7.13.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/bengbers/Programs/basex/lib/slf4j-simple-1.7.28.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: Found binding in [jar:file:/home/bengbers/Programs/basex/lib/slf4j-simple-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
    SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
    SLF4J: Actual binding is of type [org.slf4j.impl.SimpleLoggerFactory]
    [main] INFO org.eclipse.jetty.util.log - Logging initialized @432ms to org.eclipse.jetty.util.log.Slf4jLog
    [main] INFO org.eclipse.jetty.util.TypeUtil - JVM Runtime does not support Modules
    java.lang.UnsupportedOperationException

Did you add something to CLASSPATH?

Ben
[basex-talk] basexclient "Failed to construct terminal"
Hi,

It probably has been asked before, but I found nothing on this topic.

Whenever I try to start basex or basexclient on my Fedora 31 Linux distribution, I get this output:

    BaseX 9.3.1 [Client]
    Probeer 'help' om informatie te krijgen.
    [ERROR] Failed to construct terminal; falling back to unsupported
    java.lang.NumberFormatException: For input string: "0x100"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:65)
        at java.lang.Integer.parseInt(Integer.java:580)
        at java.lang.Integer.valueOf(Integer.java:766)
        at jline.internal.InfoCmp.parseInfoCmp(InfoCmp.java:59)
        at jline.UnixTerminal.parseInfoCmp(UnixTerminal.java:233)
        at jline.UnixTerminal.<init>(UnixTerminal.java:64)
        at jline.UnixTerminal.<init>(UnixTerminal.java:49)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at java.lang.Class.newInstance(Class.java:442)
        at jline.TerminalFactory.getFlavor(TerminalFactory.java:209)
        at jline.TerminalFactory.create(TerminalFactory.java:100)
        at jline.TerminalFactory.get(TerminalFactory.java:184)
        at jline.TerminalFactory.get(TerminalFactory.java:190)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:240)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:232)
        at jline.console.ConsoleReader.<init>(ConsoleReader.java:220)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.basex.util.ConsoleReader$JLineConsoleReader.<init>(ConsoleReader.java:146)
        at org.basex.util.ConsoleReader.get(ConsoleReader.java:55)
        at org.basex.BaseX.console(BaseX.java:166)
        at org.basex.BaseX.<init>(BaseX.java:152)
        at org.basex.BaseXClient.<init>(BaseXClient.java:35)
        at org.basex.BaseXClient.main(BaseXClient.java:22)

(The second line is Dutch for "Try 'help' to get more information.")

'basex help' gives this output:

    Gestopt bij /home/bengbers/, 1/5: [XPDY0002] element(help): Context is undeclared.

Am I missing something?

Cheers,
Ben
Re: [basex-talk] No difference for output from 'FULL' or 'RESULTS'
On 04-02-2020 at 08:12, Christian Grün wrote:
> Hi Ben,
>
> The client API code hasn’t changed since BaseX 8. Maybe you need to
> revise your code.
>
> If you believe something wrong happens in the API, I’d still need some
> more information on what you believe has changed exactly?
>
> Best,
> Christian

Hi Christian,

It shouldn't be too difficult to read this code:

    More = function() {
      if (is.null(private$cache)) {   # The cache has to be filled
        in_stream <- private$sock$get_socket()
        private$write_code_ID(0x04)
        cache <- c()
        while ((rd <- readBin(in_stream, what = "raw", n = 1)) > 0) {
          cache <- c(cache, as.character(rd))
          cache <- c(cache, private$sock$str_receive())
        }
        success <- private$parent$get_socket()$bool_test_sock()
        private$parent$set_success(success)
        private$cache <- cache
        private$pos <- 0
      }
      if (length(private$cache) > private$pos) return(TRUE) else {
        private$cache <- NULL
        return(FALSE)
      }
    }

    Next = function() {
      if (self$More()) {
        private$pos <- private$pos + 1
        result <- private$cache[private$pos]
      }
      return(result)
    }

    Full = function() {
      in_stream <- out_stream <- private$sock$get_socket()
      private$write_code_ID(0x1F)
      cache <- c()
      while ((rd <- readBin(in_stream, what = "raw", n = 1)) > 0) {
        cache <- c(cache, as.character(rd))
        cache <- c(cache, private$sock$str_receive())
      }
      private$parent$get_socket()$bool_test_sock()
      result <- cache
      return(result)
    }

Both More() and Full() start by filling the cache. Next() calls More() and is used to iterate over the results. The main difference is the code that is sent to the database (0x04 versus 0x1F).

    Query_1 <- Query(Session, "for $i in 1 to 2 return Text { $i }")
    fullResult <- Full(Query_1)

results in:

    "0b" "Text 1" "0b" "Text 2"

The result of:

    iterResult <- c()
    while (More(Query_1)) {
      iterResult <- c(iterResult, Next(Query_1))
    }

is identical, but as far as I can remember, it should have been:

    "Text 1" "Text 2"

Can you tell me whether the results should be identical or different? If they should differ, I'll have to install older versions of my code ;-(

Cheers,
Ben
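Editorial sketch: the reading loop in More() and Full() treats the response as a run of items, each consisting of one type byte followed by a \00-terminated string, closed by a single \00 and then a status byte. The Python version below parses such a buffer and separates type IDs from text, which shows why a cache built this way contains "0b" entries unless the client drops them. The function names are hypothetical; the 0xFF escape handling is an assumption based on the BaseX server protocol description, not something shown in the thread.

```python
def read_string(buf, pos):
    """Read one 0x00-terminated string starting at pos.

    Assumption: a 0xFF byte escapes the byte that follows it.
    Returns the decoded text and the position after the terminator.
    """
    out = bytearray()
    while True:
        b = buf[pos]
        pos += 1
        if b == 0x00:
            break
        if b == 0xFF:           # escaped byte: take the next one literally
            b = buf[pos]
            pos += 1
        out.append(b)
    return out.decode("utf-8"), pos


def parse_items(buf):
    """Parse '<type> <string>\\0 ... \\0 <status>' into ([(type, text)], ok)."""
    pos = 0
    items = []
    while buf[pos] != 0x00:     # a lone 0x00 marks the end of the items
        type_id = buf[pos]
        pos += 1
        text, pos = read_string(buf, pos)
        items.append((type_id, text))
    pos += 1                    # skip the end-of-items marker
    ok = buf[pos] == 0x00       # 0x00 = success, 0x01 = error
    return items, ok


if __name__ == "__main__":
    raw = b"\x0bText 1\x00\x0bText 2\x00\x00\x00"
    items, ok = parse_items(raw)
    print([text for _, text in items], ok)
```

Keeping the type ID and the text as a pair lets an iterator return only the text while a "full" call returns both, without two different read loops.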
Re: [basex-talk] Finalizing Query-Objects
On 04-02-2020 at 08:17, Christian Grün wrote:
> It makes no difference for the BaseX server if you close the session and
> have open query objects (query objects exclusively reside in the client).
>
> It can make a difference in client implementations, though. If you have
> a chance to always close queries after the execution, I think you should
> do so. I assume you are caching the query results before iterating over
> them, as it’s done in the other client implementations?

Hi Christian,

I used the Java client as an example, so yes, I am caching the query results.

I will begin by explicitly closing all the queries, closing the socket connection and removing the session objects. Hopefully this will show what's causing the failure.

Ben

(The people from CRAN warned that this can be very difficult and can cause severe headaches ;-( )
[basex-talk] No difference for output from 'FULL' or 'RESULTS'
Hi,

As far as I can remember from early versions of my client software, the main difference in output after sending \04 or \1F to the database was that in the latter case the output was preceded by XDM metadata.

    # Full
    query_txt <- "for $i in 1 to 2 return Text { $i }"
    query_obj <- Query(Session, query_txt)
    result <- Full(query_obj)

resulted in:

    "0b" "Text 1" "0b" "Text 2"

    # Iterate over query
    query2 <- "for $i in 3 to 4 return Iter { $i }"
    query_iterate <- Query(Session, query2)   # <== Alternative call to query-object
    while (More(query_iterate)) {
      cat(Next(query_iterate), "\n")
    }

resulted in:

    Iter 3
    Iter 4

Now, iterating over the same query gives:

    0b
    Iter 3
    0b
    Iter 4

Did something change in the client/server protocol, or did I introduce an error somewhere?

Ben
[basex-talk] Finalizing Query-Objects
Hi,

The people from CRAN strongly suggested that I add tests (comparable to unit tests) to my package (RBaseX). Their request led me to take another critical look at my code.

So far, the tests do not give an error message. But after completing the last test, 'testthat' reports 1 failure without further explanation. When I change the order in which the tests are executed, the failure is always reported for whichever test runs last. Therefore I think that it is not the tests that cause the error, but the finalization process.

At this moment, my code is based on 3 classes. 'RBaseXClient' creates a new client session. This session uses 'SocketClass' to communicate with basexserver. When used in query mode, the session uses 'QueryClass' to create new query objects.

Due to this architecture, it is easy to explicitly close a regular query object, but (at least in R) it is difficult to close query objects when finalizing the session object.

How does the basexserver respond when the session is closed without first explicitly closing all open queries? Does this result in an error?

Ben
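Editorial sketch: one common way to make finalization deterministic is to let the session keep a registry of the query objects it has handed out and close whatever is still open before it closes itself. The Python classes below only model that bookkeeping; the names `Session` and `Query` are hypothetical, there is no socket I/O, and a real client would send the corresponding close request where the comments indicate.

```python
class Query:
    """A query handle that registers itself with its session."""

    def __init__(self, session, text):
        self.session = session
        self.text = text
        self.closed = False
        session._open.append(self)

    def close(self):
        if not self.closed:
            # A real client would send the query-close request here.
            self.closed = True
            self.session._open.remove(self)


class Session:
    """A session that closes any still-open queries before itself."""

    def __init__(self):
        self._open = []
        self.closed = False

    def query(self, text):
        return Query(self, text)

    def open_queries(self):
        return list(self._open)

    def close(self):
        for q in list(self._open):   # copy: close() mutates the registry
            q.close()
        # A real client would close the socket here.
        self.closed = True


if __name__ == "__main__":
    s = Session()
    q1 = s.query("1 to 3")
    q2 = s.query("4 to 6")
    q1.close()
    s.close()
    print(q2.closed, s.open_queries())
```

With this arrangement, a test teardown only has to call the session's close method once, regardless of how many query objects a test left open.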
[basex-talk] Load LibreOffice- and Word-documents?
Hi, While we were discussing possible usecases for basex, a colleague asked me if it is also possible to load libreoffice and Word documents into Basex and then perform full-text analysis on them. In essence, these are both XML files, so it should be possible. Does anybody have experience with this? Ben
Re: [basex-talk] Client software and command scripts
On 06-01-2020 at 11:39, Christian Grün wrote:
> What kind of arguments would you like to set?
>
> The easiest option may be to prefix your script string with some
> additional SET commands.

My suggestion/question about passing arguments to a command script was not restricted to passing 'set' options with the client API, but was more general.

Suppose I have a script with commands that add the content of a CSV file to a BaseX database. It would be nice if I could pass the name/path of the file as an argument to the script. That would make it easier to automate the update process.

Ben
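Editorial sketch: whether or not BaseX itself accepts script arguments, a client can get a similar effect by treating the command script as a template and filling in the file path before sending the resulting command string. The Python version below uses the standard library's string.Template; the placeholder names `db` and `path`, the helper `render_script`, and the two example commands (SET PARSER and CREATE DB, as I understand the BaseX command syntax) are assumptions for illustration only.

```python
from string import Template

# Hypothetical script template; $db and $path are filled in by the client.
CSV_IMPORT = Template(
    "SET PARSER csv\n"
    "CREATE DB $db $path\n"
)


def render_script(template, **args):
    """Fill in the placeholders; raises KeyError if one is missing."""
    return template.substitute(**args)


if __name__ == "__main__":
    print(render_script(CSV_IMPORT, db="Incidents", path="/tmp/incidents.csv"))
```

Using substitute rather than safe_substitute makes a forgotten argument fail loudly on the client instead of sending a half-filled command to the server.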
Re: [basex-talk] Client software and command scripts
Hi Christian,

Best wishes and thanks for your answer.

Ben

On 03-01-2020 at 16:27, Christian Grün wrote:
> Hi Ben,
>
>> I read in the documentation that a client should not only be able to
>> execute commands but should also be able to execute command scripts.
>
> Could you give me a link to the part of the documentation you refer to?

I could not find it anymore. I guess that somewhere I mixed up information about the API client bindings and the command-line interface.

>> My question is if a command script (a file with extension '.bxs') is
>> passed as a 'path' to the execute-command or is the client supposed to
>> read the file, line by line, and then executing each line separately?
>
> If you refer to the execute function in the API client bindings [1],
> the argument must be a BaseX command string.

That is the way I have already implemented it.

I used my client to load a CSV file into a database. In order to get the same result as with the GUI, I had to set some options. At that moment, I thought it might be handy to save all the commands in a script, so that the script could be reused later. For this to be usable, there has to be a way to pass arguments to the script. Is that already possible in BaseX?
[basex-talk] Client software and command scripts
Hi Christian, I read in the documentation that a client should not only be able to execute commands but should also be able to execute command scripts. My question is if a command script (a file with extension '.bxs') is passed as a 'path' to the execute-command or is the client supposed to read the file, line by line, and then executing each line separately? Cheers, Ben
Re: [basex-talk] Binding a variable
Hi Christian, Thanks for the explanation. I am glad to learn that - at least for this moment - I don't have to change my code :-) Ben > Hi Ben, > > If you... > >> and bind $p to root >> Bind(query_obj, "$p", "root") > > …you’ll need to add another external variable declaration in your query: > > "declare variable $p external;" > ... > > Please note, in addition, that your query won’t be executable as you > are trying to assign a dynamic path expression (e.g., 'root') to your > query. If you need to build dynamic query strings, you’ll have to > modify your original query string and send the result to the server. > > Hope this helps, > Christian
[basex-talk] Binding a variable
Hi,

When experimenting with my RBaseX package (I had hoped to submit it to CRAN today), I use the following pattern:

1. Define a query
2. Create a query object
3. Bind variables (optional)
4. Execute the query

When I used this pattern on the following query, everything functioned as expected:

    declare variable $name external;
    for $i in 1 to 3 return element { $name } { $i }

The following query also works:

    paste("declare variable $greet external;",
          "declare variable $friend external;",
          "declare variable $into external;",
          "let $greet := 'Greetings, my '",
          "let $friend := 'friend'",
          "for $i in 1 to 3",
          "let $friend_num := $greet || $friend || $i",
          "return insert nodes element { $friend } { $friend_num }",
          "into root",
          sep = " ")

When I change "into root" to "into $p" and bind $p to root with

    Bind(query_obj, "$p", "root")

I get the following error:

    [XPST0008] Undeclared variable $p

The Bind function returns with code \00, indicating that it was executed without errors.

Does this mean that there is a bug in my code, or am I violating XQuery syntax?

Ben Engbers
[basex-talk] binding-types?
Hi,

While creating an R package based on my R client implementation, I found that the binding function was malfunctioning. After rewriting that function, the following is accepted:

    query_txt <- "declare variable $name external; for $i in 1 to 5 return element { $name } { $i }"
    query_obj_1 <- Query(Sess, query_txt)
    success <- query_obj_1$queryObject$Bind("name", "number")
    print(query_obj_1$queryObject$ExecuteQuery())

This results in:

    "1" "2" "3" "4" "5"

When I change the line

    success <- query_obj_1$queryObject$Bind("name", "number")

to

    success <- query_obj_1$queryObject$Bind("name", "number", "xs:integer")

the following error is produced:

    "[XPST0081] No namespace declared for '\002xs:integer'."

What types are accepted?

Ben
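Editorial sketch: the '\002' in front of 'xs:integer' in that error message reads as if an extra byte reached the server ahead of the type name, which would point to message framing rather than to the type itself. As I understand the BaseX server protocol, a bind request is the byte \03 followed by four \00-terminated strings (query id, name, value, type). The Python helper below builds such a message so the byte layout can be inspected; the function names are hypothetical, and the 0xFF escaping of \00 and \FF bytes is an assumption taken from the protocol description.

```python
def encode_string(text):
    """UTF-8 encode and 0x00-terminate; assumed 0xFF escape for 0x00/0xFF."""
    out = bytearray()
    for b in text.encode("utf-8"):
        if b in (0x00, 0xFF):
            out.append(0xFF)
        out.append(b)
    out.append(0x00)
    return bytes(out)


def bind_message(query_id, name, value, type_name=""):
    """Build: 0x03 {id} {name} {value} {type}, each string 0x00-terminated."""
    parts = (query_id, name, value, type_name)
    return b"\x03" + b"".join(encode_string(p) for p in parts)


if __name__ == "__main__":
    print(bind_message("0", "name", "5", "xs:integer"))
```

Comparing the bytes a client actually writes with a reference layout like this one is a quick way to spot a stray length or code byte between the value and the type.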
Re: [basex-talk] Test if basexserver is running? (Partially solved)
Hi Michael,

When unit-testing a package, the first test should check whether a connection to basexserver can be established. This is not difficult; in fact, the first thing my code does is open a connection, so I already know that my code works. But what does it mean when an attempt to open a connection fails? Is there an error in my code, or does the attempt fail because no basexserver is running? So if you want to test the code, you first have to be certain that a server is running.

Using Google, I found this solution for Linux. The 'ps -fC java' command shows which Java processes are running. 'ps -fC java | grep -q basex; echo $?' prints 0 when a basexserver instance is running. I guess that it will be easy to incorporate this command in an R function. Do you know if a similar command is available for Windows?

Ben

PS. What do you mean by BaseX:123456789?

On 30-08-19 at 14:44, Michael Seiferle wrote:
> Hi Ben,
>
> I maybe don’t fully get your question right (and I admit I do not know
> much about R), but I’d simply open the socket on the port I expect BaseX
> to be listening on and see whether or not I receive a `BaseX:123456789`
> response and close the connection immediately after.
>
> Best
> Michael
>
>> On 29.08.2019 at 15:03, Ben Engbers
>> <mailto:ben.engb...@be-logical.nl> wrote:
>>
>> Hi,
>>
>> Last year I have written a R-client for basex
>> (https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/r/RbaseXClient.R).
>> The present version uses no exception handling and you have to include
>> the source-file in your R-code. A much cleaner solution would be catch
>> all the errors and to pack the sources in a package. At this moment, I
>> am working on such a R-package.
>>
>> The first test that should be executed in the package, is to test if a
>> basexserver is available.
>>
>> How can I test on Linux, Apple and Windows if a baseserver is running?
>>
>> Ben
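Michael's suggestion (open the socket on the expected port and close it again) does not depend on the operating system, unlike the `ps` approach. A minimal Python sketch of that check follows. The function name `basex_port_open` is hypothetical, and the default port 1984 is an assumption for illustration. The check only shows that something accepts TCP connections on that port; it does not verify that the listener is BaseX.

```python
import socket


def basex_port_open(host: str = "localhost", port: int = 1984,
                    timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection; the socket is closed on exit.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("server reachable:", basex_port_open())
```

A unit-test suite could call such a check first and skip the remaining tests with a clear message when it returns False, which separates "no server running" from "bug in the client code".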
[basex-talk] Test if basexserver is running?
Hi, Last year I wrote an R client for BaseX (https://github.com/BaseXdb/basex/tree/master/basex-api/src/main/r/RbaseXClient.R). The present version uses no exception handling, and you have to include the source file in your R code. A much cleaner solution would be to catch all the errors and to pack the sources in a package. At the moment, I am working on such an R package. The first test that should be executed in the package is a check whether a basexserver is available. How can I test on Linux, Apple and Windows whether a basexserver is running? Ben
[basex-talk] EXECUTE syntax
Hi,

I want to use my R client to insert CSV files into a database. These lines work:

  csv_add_run <- 'RUN "./DataScience/RBaseX/CSVexample.xq"'
  Session$command(csv_add_run)

When I take the content of CSVexample.xq and incorporate it into an EXECUTE command, I get this:

  csv_add_exe <- 'EXECUTE "let $root := '/home/bengbers/DataScience/RBaseX/Examples/Parse/;)'; for $path in file:children($root)[ends-with(., '.csv')] return db:add('CSV_test', $path, 'CSV_API', map { 'parser': 'csv', 'csvparser': map { 'header': 'yes', 'separator': ';' }) "'
  Session$command(csv_add_exe)

  Stopped at , 1/9: Unknown command: 'EXECUTE. Did you mean 'EXECUTE'?

My question is how to define the input for the EXECUTE command.

Cheers, Ben
Re: [basex-talk] Insert CSV into database
Hi Christian,

The alternative worked, so my first question is answered. But the second question still remains: why does the BaseX GUI use an old path (/home/bengbers/DataScience/Eindopdracht/Data/file), a path I didn't even enter, instead of the path I entered in the query (/home/bengbers/DataScience/RBaseX/Examples/Parse/)?

Cheers, Ben

On 01-06-18 at 12:38, Christian Grün wrote:
> Hi Ben,
>
> As file:list only returns relative file paths, you will have to prepend
> the root path later on:
>
> let $root := "/home/bengbers/DataScience/RBaseX/Examples/Parse/"
> for $file in file:list($root, false(), "*.csv")
> return db:add("CSV_test", $root || $file, "", map {
>   'parser': 'csv',
>   'csvparser': map { 'header': 'yes', 'separator': ';' }
> })
>
> Another alternative is to use the file:children function:
>
> let $root := "/home/bengbers/DataScience/RBaseX/Examples/Parse/"
> for $path in file:children($root)[ends-with(., ".csv")]
> return db:add("CSV_test", $path, "", map { ... })
>
> Cheers,
> Christian
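The point in Christian's reply, that a directory listing yields names relative to the listed directory and that the root has to be prepended before the names are used elsewhere, is not specific to XQuery. The same pattern can be sketched in Python as an analogy; `csv_paths` is a hypothetical helper for this illustration and has nothing to do with the BaseX API.

```python
import fnmatch
import os


def csv_paths(root: str) -> list:
    """Return full paths of the *.csv files directly inside root."""
    # os.listdir returns bare names relative to root, much like file:list.
    names = sorted(n for n in os.listdir(root) if fnmatch.fnmatch(n, "*.csv"))
    # Joining with root makes each path independent of the current
    # working directory of whatever process later opens the file.
    return [os.path.join(root, n) for n in names]
```

A bare name such as `Test_Parse.csv` is resolved against some base directory chosen by the consumer, which may explain why an unrelated directory shows up in the error message.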
[basex-talk] Insert CSV into database
Hi,

My goal is to use my R client driver to insert CSV files into a new database. But before that, I'm experimenting with the GUI. From the documentation for the CSV parser, I have taken this code:

  for $file in file:list("/home/bengbers/DataScience/RBaseX/Examples/Parse", false(), "*.csv")
  return db:add("CSV_test", $file, "", map {
    'parser': 'csv',
    'csvparser': map { 'header': 'yes', 'separator': ';' }
  })

BaseX returns:

  Error: Stopped at /home/bengbers/DataScience/Eindopdracht/Data/file, 2/14:
  [FODC0002] Resource 'Test_Parse.csv' does not exist.

I didn't enter this path. It was used yesterday when browsing to the data files that were inserted into another test database.

  let $file := file:list("/home/bengbers/DataScience/RBaseX/Examples/Parse", false(), "*.csv")
  return $file

BaseX returns:

  Test_Parse.csv
  Test_Parse (exemplaar).csv

If I create a new database, it neatly adds the two CSV files.

My questions are:
- Which query do I have to use to insert CSV files?
- Obviously, the BaseX GUI uses the wrong path. How should I adjust this path?

Cheers, Ben
[basex-talk] How to use BaseX on MacBook? (Urgent!)
Hi, If we manage to install BaseX on a MacBook, chances are good that we will use BaseX for our final project. I know how to install BaseX on Linux, but I have no experience with Apple. My fellow students know how to use applications but don't know how to deal with Java applications. My question is whether BaseX can be used on a MacBook. If so, where can I find instructions? Cheers, Ben Engbers
[basex-talk] Unlock database
Hi, Somehow, I managed to lock a (test)-database and now I can't get it unlocked. Is it possible to manually remove the lock? If so, how? Cheers, Ben Engbers
Re: [basex-talk] Missing 'DELETE' in server protocol?
Hi Christian,

I changed my code to:

  add = function(path = path, input = input) {
    writeBin(as.raw(0x09), private$sock)
    writeBin(private$raw_terminated_string(path), private$sock)
    writeBin(private$raw_terminated_string(input), private$sock)
    private$info <- self$str_receive()
    return(list(info = private$info, success = self$bool_test_sock()))
  }

and tested the new code with:

  Path1 <- "Test1.xml"
  Path2 <- "test/Test1.xml"
  Simple1 <- "Hello World!"
  Simple2 <- "/home/bengbers/DataScience/RBaseX/Test1.xml"
  Simple3 <- "Test1.xml"
  Added <- Session$add(path = Path1, input = Simple1)

(Simple2 is Simple1 written to Test1.xml.)

When used with either Path1 or Path2, Added$info returns:

  "Improper use? Potential bug? Your feedback is welcome:\nContact: basex-talk@mailman.uni-konstanz.de\nVersion: BaseX 9.0\nJava: Oracle Corporation, 1.8.0_162\nOS: Linux, amd64\nStack Trace: \njava.lang.RuntimeException: Learn: lock file does not exist.\n\tat org.basex.util.Util.notExpected(Util.java:61)\n\tat org.basex.data.DiskData.finishUpdate(DiskData.java:246)\n\tat org.basex.core.cmd.ACreate.update(ACreate.java:97)\n\tat org.basex.core.cmd.Add.run(Add.java:56)\n\tat org.basex.core.Command.run(Command.java:257)\n\tat org.basex.core.Command.execute(Command.java:93)\n\tat org.basex.core.Command.execute(Command.java:116)\n\tat org.basex.server.ClientListener.execute(ClientListener.java:343)\n\tat org.basex.server.ClientListener.add(ClientListener.java:314)\n\tat org.basex.server.ClientListener.run(ClientListener.java:96)\n"

With Path1/Simple2 or Path1/Simple3:

  "\"Test1.xml.xml\" (Line 1): Content is not allowed in prolog."

With Path2/Simple2 or Path2/Simple3:

  "\"test/Test1.xml.xml\" (Line 1): Content is not allowed in prolog."

In all cases Added$success returns FALSE.

In an old mail someone suggested that this might be caused by the encoding used. I converted the encoding of Test1.xml from US-ASCII to UTF-8, but this had no effect.

Cheers, Ben

On 24-04-18 at 13:46, Christian Grün wrote:
> Hi Ben,
>
> I assume that this part of the server protocol is indeed outdated. I
> have just checked out our Java client, which only sends the target
> path to the server (which includes the name of the document) [1].
>
> Could you check out if this solves the problem? If yes, I’ll be happy
> to update our documentation.
>
> Best,
> Christian
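The two-argument form described in Christian's reply (byte `0x09`, then the target path, then the input, each zero-terminated) can be sketched in Python as follows. `add_frame` is a hypothetical helper name. The well-formedness check is an addition of this sketch and rests on the assumption that the input is expected to be the XML content itself rather than plain text or a file name; "Content is not allowed in prolog" is a typical parser message for input that does not start with markup, which would fit `"Hello World!"` and the path strings used in the tests above.

```python
import xml.etree.ElementTree as ET


def add_frame(path: str, xml_input: str) -> bytes:
    """Build the raw bytes of an ADD request in its two-argument form."""
    # Fail early on input that is not well-formed XML (raises ET.ParseError).
    ET.fromstring(xml_input)
    # Well-formed XML encoded as UTF-8 contains no 0x00 or 0xFF bytes, so
    # this sketch does no byte escaping for the payload.
    return (b"\x09"
            + path.encode("utf-8") + b"\x00"
            + xml_input.encode("utf-8") + b"\x00")


print(add_frame("Test1.xml", "<greeting>Hello World!</greeting>"))
```

Under that assumption, a test input such as `<greeting>Hello World!</greeting>` would be a closer equivalent of the intended "Hello World!" document.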
Re: [basex-talk] Missing 'DELETE' in server protocol?
Hi Christian,

Thanks for your answer. It helped.

Now I have another question. Following the server protocol, I have coded the 'add' command as follows:

  writeBin(as.raw(0x09), private$sock)
  writeBin(private$raw_terminated_string(name), private$sock)
  writeBin(private$raw_terminated_string(path), private$sock)
  writeBin(private$raw_terminated_string(input), private$sock)
  private$info <- self$str_receive()
  return(list(info = private$info, success = self$bool_test_sock()))

When executing these lines:

  Name1 <- "Name1.xml"
  Path1 <- "path/test"
  Simple <- "Hello World!"
  test <- Session$add(name = "Name1.xml", path = "path/test", input = Simple)

I would expect a new resource to be created with the name, path and content specified by the parameters. However, I receive:

  > test
  $info
  [1] "\"Name1.xml.xml\" (Line 1): Content is not allowed in prolog."

  $success
  [1] FALSE

Using Name1 <- "Name1" produces no error but still fails. Can you give me any clue in which direction I should search? (Using the debugger didn't help.)

Ben

On 23-04-18 at 16:08, Christian Grün wrote:
> Hi Ben,
>
> You are right, there is no DELETE entry in the client binding. The
> reason is that you can simply send a DELETE command [1], as there is
> no need to transfer additional binary data.
>
> Does this help?
> Christian
[basex-talk] Missing 'DELETE' in server protocol?
Hi,

It was only after I started using my R client implementation in examples that I noticed there is no 'DELETE' command specified in the server protocol. Is this a deliberate omission? I would guess that implementing such a command would come down to something like this:

  delete = function(name = name) {
    writeBin(as.raw(---BYTE---), private$sock)
    writeBin(private$raw_terminated_string(name), private$sock)
    return(list(info = private$info, success = self$bool_test_sock()))
  }

If this is correct, where can I find (a list with) the required byte codes?

Ben Engbers
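According to Christian's reply in this thread, no dedicated byte code is needed: a DELETE can be sent as an ordinary database command string. A minimal Python sketch of that route follows. It assumes that a command is framed as its text followed by a single zero byte, and that the path contains no whitespace; both helper names are hypothetical.

```python
def delete_command(path: str) -> str:
    """Return the textual DELETE command for a resource path."""
    if not path or any(ch.isspace() for ch in path):
        # Quoting rules for paths with spaces are outside this sketch.
        raise ValueError(f"unsupported path: {path!r}")
    return f"DELETE {path}"


def command_frame(command: str) -> bytes:
    """Frame a command string for sending: UTF-8 text plus a 0x00 terminator."""
    return command.encode("utf-8") + b"\x00"


print(command_frame(delete_command("Test1.xml")))
```

In the R client this would correspond to passing the string to the existing generic command function (as is done with `Session$command(...)` elsewhere in these threads) instead of adding a separate `delete` method that writes a leading byte.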
Re: [basex-talk] baseX vs ExistDB
Hi,

Look at http://vschart.com/compare/basex/vs/exist-db
If you want, you can add other comparisons.

Cheers, Ben

On 18-04-18 at 16:34, Alexander Holupirek wrote:
>> On 18. Apr 2018, at 15:39, Feargal Hogan wrote:
>>
>> Hi
>>
>> Is anyone aware of any comparisons between baseX and Exist?
>> I have some familiarity with Exist and I’d like to understand what are the
>> benefits of each.
>>
>> Thanks
>>
>> Feargal
>
> Both are, of course, excellent systems.
> Do you have something special in mind that you would like to compare?
> Besides, I'm not aware of a general feature comparison site or something like
> that.
>
> Cheers,
> Alex
[basex-talk] RFC, Client for R
Hi,

For the past month I have been working on an R client. The results of my work are attached. The 'add', 'replace' and 'store' commands should be working but haven't been tested yet, since I don't have any good example commands at hand.

I'm looking forward to your comments!

Ben Engbers

[[1]]
 [1] "General Information:"  " Version: 9.0"
 [3] " Used Memory: 38 MB"  ""
 [5] "Global options:"  " AUTHMETHOD: Basic"
 [7] " CACHETIMEOUT: 3600"  " DBPATH: /home/bengbers/Programs/basex/data"
 [9] " DEBUG: false"  " FAIRLOCK: false"
[11] " HOST: localhost"  " HTTPLOCAL: false"
[13] " IGNORECERT: false"  " IGNOREHOSTNAME: false"
[15] " KEEPALIVE: 600"  " LANG: English"
[17] " LANGKEYS: false"  " LOG: true"
[19] " LOGMSGMAXLEN: 1000"  " LOGPATH: .logs"
[21] " NONPROXYHOSTS: "  " PARALLEL: 8"
[23] " PARSERESTXQ: 3"  " PASSWORD: "
[25] " PORT: 1984"  " PROXYHOST: "
[27] " PROXYPORT: 0"  " REPOPATH: /home/bengbers/Programs/basex/repo"
[29] " RESTPATH: "  " RESTXQPATH: "
[31] " SERVERHOST: "  " SERVERPORT: 1984"
[33] " STOPPORT: 8985"  " TIMEOUT: 30"
[35] " USER: "  " WEBPATH: /home/bengbers/Programs/basex/webapp"
[37] ""  "Local options"
[39] " ADDARCHIVES: true"  " ADDCACHE: false"
[41] " ADDRAW: false"  " ARCHIVENAME: false"
[43] " ATTRINCLUDE: "  " ATTRINDEX: true"
[45] " AUTOFLUSH: true"  " AUTOOPTIMIZE: false"
[47] " BINDINGS: "  " CASESENS: false"
[49] " CATFILE: "  " CHECKSTRINGS: true"
[51] " CHOP: true"  " COMPPLAN: true"
[53] " COPYNODE: true"  " CREATEFILTER: *.xml"
[55] " CREATEONLY: false"  " CSVPARSER: "
[57] " DEFAULTDB: false"  " DIACRITICS: false"
[59] " DOTCOMPACT: false"  " DOTPLAN: false"
[61] " DTD: false"  " ENFORCEINDEX: false"
[63] " EXPORTER: "  " FORCECREATE: false"
[65] " FTINCLUDE: "  " FTINDEX: false"
[67] " HTMLPARSER: "  " INLINELIMIT: 100"
[69] " INTPARSE: false"  " JSONPARSER: "
[71] " LANGUAGE: en"  " LSERROR: 0"
[73] " MAINMEM: false"  " MAXCATS: 100"
[75] " MAXLEN: 96"  " MAXSTAT: 30"