Re: [basex-talk] Call for assistance : BaseX as a preprocessor ?

2020-02-27 Thread thufir

you write:

"I would like to preprocess the xml before entering postgres, and stream 
it with the copy command."  but why?  I'm inferring that you want to 
dynamically generate XML as it's queried by postgres?


Just curious,

Thufir


On 2020-02-23 7:31 a.m., maxzor wrote:

Hello,

Thank you for your software, whose GUI has been my savior every time I 
needed to deal with XML.


I would like to know whether I can stream XML transforms, to pipe Wikimedia 
XML dumps into a format accepted by the postgres copy command?

I know SQL very well, but nothing about XPath or XQuery.

I managed to mock up an XPath (or is it XQuery? :/) snippet from postgres 
itself, but obviously this would need rewriting for the BaseX CLI:
https://stackoverflow.com/questions/60361030/how-to-transform-and-stream-large-xml-files-to-postgres-mediawiki-dump-basex 



Best regards, Maxime Chambonnet



Re: [basex-talk] how to store a result in a database?

2020-02-20 Thread thufir
I see what you mean by passing the query result as the second argument 
to the db:add function, but am getting:


[XPST0003] Expecting function argument.


A syntax problem?

xquery version "3.0";

db:add("foo.clean.xml",


{
for $x in db:open("foo.txt")/text/line
return $x
}








On 2020-02-19 6:08 a.m., Martin Honnen wrote:

On 19.02.2020 at 14:36, thufir wrote:

While I can certainly output the result to a file, and then add to a
database, how would I actually send the result directly to a database?

xquery version "3.0";


{
for $line in db:open("people.json")//text()
return
  if (matches($line, "[0-9]"))
  then {$line}
  else {$line}
}




See http://docs.basex.org/wiki/Database_Module#db:add, you seem to want

   db:add("people.all",

    
{
for $line in db:open("people.json")//text()
return
   if (matches($line, "[0-9]"))
   then {$line}
   else {$line}
}


)


that is, you simply want to pass your query result as the second
argument to the db:add function.
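
A minimal sketch of that call, under a few assumptions. The archive stripped the element constructors from the queries in this thread, so the names people, name and data below are placeholders. The target database people.all is assumed to exist already, and the third argument is a hypothetical resource path (as far as I recall, the db:add documentation asks for a path when the input is a node rather than a file reference).

```xquery
xquery version "3.0";

(: Assumption: the database "people.all" was created beforehand. :)
db:add(
  "people.all",
  (: second argument: the query result, wrapped in one root element :)
  <people>{
    for $line in db:open("people.json")//text()
    return
      if (matches($line, "[0-9]"))
      then <data>{ $line }</data>   (: placeholder element name :)
      else <name>{ $line }</name>   (: placeholder element name :)
  }</people>,
  (: hypothetical path under which the new document is stored :)
  "all.xml"
)
```

Since db:add is an updating function, the new document only becomes visible once the query has finished.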



Currently I'm running it like so:

thufir@dur:~/flwor/people.json$
thufir@dur:~/flwor/people.json$ basex all.xq > all.xml
thufir@dur:~/flwor/people.json$
thufir@dur:~/flwor/people.json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
>
> create database people.all
Database 'people.all' created in 234.74 ms.
>
> set parser xml
PARSER: xml
>
> add all.xml
Resource(s) added in 377.43 ms.
>
> exit
Have fun.
thufir@dur:~/flwor/people.json$


but that's a bit cumbersome.




thanks,

Thufir





[basex-talk] empty tags when grouping with tumbling window?

2020-02-20 Thread thufir
I found (another) wrinkle to parsing this data because blank lines seem 
to cause a problem with the grouping.  The grouping should "use a 
tumbling window which starts with any line not containing any ASCII 
digit (the name of the person) followed by any line containing at least 
one ASCII digit (i.e. the data lines)":


current output:


  
joe
phone1
phone2
phone3
sue

cell4

home5

ph3
  
  
alice
atrib6
x7
y9
z10
  



where "joe" and "sue" have been put into the same person tag.


desired output, more like:



   
  joe
  phone1
  phone2
  phone3
   
   
  sue
  cell4
  home5
   
   
  alice
  atrib6
  x7
  y9
  z10
   




xquery:

xquery version "3.0";


declare namespace output = 
"http://www.w3.org/2010/xslt-xquery-serialization";


declare option output:method 'xml';
declare option output:indent 'yes';


declare variable $input :=


  people
  joe
  phone1
  phone2
  phone3
  sue
  
  cell4
  
  home5
  
  ph3
  alice
  atrib6
  x7
  y9
  z10
;



{
for tumbling window $person in $input//line
start $name next $data when matches($name, '^[^0-9]+$') and 
matches($data, '[0-9]')

return

{
{ data($name) },
tail($person) ! {data()}

}

}



Provided the grouping is correct, that would be the main goal. 
Unfortunately, I don't fully understand how the tumbling window works 
yet, so I'm reviewing that section of a textbook.
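
A possible explanation, offered as a sketch rather than a confirmed diagnosis: a window only starts where a non-digit line is followed directly by a digit line. "sue" is followed by a blank line, so no window starts there and she stays in joe's window. Filtering the blank lines out before windowing should avoid that. The element names text and line come from the queries in this thread; people, person, name and data are guesses, because the archive stripped the markup, and the inline data is shortened.

```xquery
xquery version "3.0";

(: shortened sample input; <line/> stands for a blank line :)
declare variable $input :=
  <text>
    <line>people</line>
    <line>joe</line>
    <line>phone1</line>
    <line>sue</line>
    <line/>
    <line>cell4</line>
    <line/>
    <line>home5</line>
    <line>alice</line>
    <line>x7</line>
  </text>;

<people>{
  (: drop blank lines first, so a name is always followed directly by data :)
  for tumbling window $person in $input//line[normalize-space()]
    start $name next $data
      when matches($name, '^[^0-9]+$') and matches($data, '[0-9]')
  return
    <person>
      <name>{ data($name) }</name>
      { tail($person) ! <data>{ data(.) }</data> }
    </person>
}</people>
```

With this input I would expect three person elements (joe, sue, alice). The "people" line comes before the first window start and matches no start condition itself, so it ends up in no window at all.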


see also:  https://stackoverflow.com/q/60237739/262852





thanks,

Thufir


Re: [basex-talk] was: increment a variable only when a conditional is true?

2020-02-19 Thread thufir

Yes, that's pretty much it:

https://stackoverflow.com/q/60237739/262852


I'm a bit curious what happened to the "people" group, but I'll test and 
adapt.  Did "people" get dropped because it has no following siblings to 
group with?



thanks,

Thufir

On 2020-02-19 6:55 p.m., Majewski, Steven Dennis (sdm7g) wrote:
The reason I asked was: although XQuery looks more procedural than XSLT, 
it’s still basically a functional and declarative language, and thinking 
about incrementing or decrementing variables is probably the wrong way 
to think about solving your problem in XQuery.
Variables bind names to values over a particular scope, and they aren’t 
updatable variables that you can increment or decrement.


If what you want in English is roughly:
   For person element where numerical = true , group by the 
preceding-sibling where numerical = false


That maps fairly simply into XQuery syntax:

declare variable $XML := 
  people
  joe
  phone1
  phone2
  phone3
  sue
  cell4
  home5
  alice
  atrib6
  x7
  y9
  z10
 ;

for $P in $XML/person
where $P[@numerical="true"]
let $PREV := $P/preceding-sibling::person[@numerical="false"][1]
group by $PREV
return   { $P } 

Yields:



   phone1
   phone2
   phone3


   cell4
   home5


   atrib6
   x7
   y9
   z10


Once the grouping is working, you can work on tweaking the output format 
to exactly what you want. I tried to work towards something like what I 
THOUGHT you might want.
I had a bit of trouble understanding that $PREV in the return expression 
is an atomic string value and not a node, and that the other variables 
in return expression are multiple valued ( thus the ‘[1]’s ), but I 
ended up with this:



 {
for $P in $XML/person
where $P[@numerical="true"]
let $PREV := $P/preceding-sibling::person[@numerical="false"][1]
let $X := count($P[1]/preceding-sibling::person[@numerical="false"])
group by $PREV
return 
 { for $I in $P return {$I/@id, attribute x {$X[1]-1}, string($I)} } 
 

} 

Which yields:


   
     phone1
     phone2
     phone3
   
   
     cell4
     home5
   
   
     atrib6
     x7
     y9
     z10
   



It is also more functional to think in terms of count or position 
expressions instead of incrementing or decrementing variables.
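
As a sketch of that idea, the "increment x only when numerical is false" counter can be derived from the position instead of being updated. The inline list stands in for the lines in the database, and the attribute names id, x and numerical follow the ones used in this thread; everything else is an assumption.

```xquery
xquery version "3.0";

let $lines := ("people", "joe", "phone1", "phone2", "sue", "cell4")
for $line at $pos in $lines
let $numerical := matches($line, "[0-9]")
(: x = number of non-numeric lines seen up to and including this one :)
let $x := count($lines[position() le $pos][not(matches(., "[0-9]"))])
return
  <person id="{ $pos }" x="{ $x }" numerical="{ $numerical }">{ $line }</person>
```

Here joe, phone1 and phone2 all get x="2" and sue and cell4 get x="3", so x can serve as a record id without any variable being modified.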


I’m hitting the Walmsley book myself now to try to understand those 
issues with group by / return that were puzzling to me.



— Steve M.


On Feb 19, 2020, at 8:28 PM, thufir <hawat.thu...@gmail.com> wrote:


Hi Steve,

yes, it certainly does.  At this point, I'd settle for adding an 
attribute like "recordID", but that's exactly it.  I'll take a closer 
look at "group by" a bit later, thanks for the pointer.  See also:


https://stackoverflow.com/q/60237739/262852


thanks,

Thufir

On 2020-02-19 1:05 p.m., Majewski, Steven Dennis (sdm7g) wrote:
On Feb 19, 2020, at 10:01 AM, thufir <hawat.thu...@gmail.com> wrote:


where I'm trying to use attributes because I'm not sure how to 
conditionally nest tags.  But, this is interesting.  Not quite sure 
on syntax to set and then conditionally increment $x, however.
Does “conditionally nest tags” mean that you want to make the 
person[@numerical=“true”] elements children of the immediately 
previous person[@numerical=“false”] elements ?

If that is the case you can use “group by”
— Steve M.




Re: [basex-talk] was: increment a variable only when a conditional is true?

2020-02-19 Thread thufir

Hi Steve,

yes, it certainly does.  At this point, I'd settle for adding an 
attribute like "recordID", but that's exactly it.  I'll take a closer 
look at "group by" a bit later, thanks for the pointer.  See also:


https://stackoverflow.com/q/60237739/262852


thanks,

Thufir

On 2020-02-19 1:05 p.m., Majewski, Steven Dennis (sdm7g) wrote:



On Feb 19, 2020, at 10:01 AM, thufir <hawat.thu...@gmail.com> wrote:


where I'm trying to use attributes because I'm not sure how to 
conditionally nest tags.  But, this is interesting.  Not quite sure on 
syntax to set and then conditionally increment $x, however.


Does “conditionally nest tags” mean that you want to make the 
person[@numerical=“true”] elements children of the immediately previous 
person[@numerical=“false”] elements ?


If that is the case you can use “group by”


— Steve M.




Re: [basex-talk] increment a variable only when a conditional is true?

2020-02-19 Thread thufir

Hi Bridger,


Yes, this may very well help.  I'll dive into it later today or tomorrow. 
The difficulty for me will be to decrement only when a condition holds.  But, 
that may very well suffice.


If so, it will give the data some structure through attributes.

Then, I can use those attributes.


Thanks,

Thufir

On 2020-02-19 7:57 a.m., Bridger Dyson-Smith wrote:

Hi Thufir -

Maybe something like this will help?

```
xquery version "3.1";

let $y := 99

for $x in (1 to 9)
count $iterator
let $decrease := $y - $iterator
return(
   comment { "iterator = " ||  $iterator },
   comment { "decrease = " || $decrease },
   
)
```

I'm basically ripping Walmsley's book off for this example -- see pages 
~135-7 (examples P-6,7). The `count` clause makes this work.


Best,
Bridger

PS Remember, the first step in avoiding a *trap* is knowing of its 
existence. :)


On Wed, Feb 19, 2020 at 10:33 AM thufir <hawat.thu...@gmail.com> wrote:


How do I decrement y?

Pardon, output for a simpler example:











the FLWOR:

xquery version "3.0";
let $y := 99
for $x in (1 to 9)
   let $y := $y - 1
return 


is it not possible to decrement $y without using some external scripting
function?  That seems odd.


thanks,

Thufir



Re: [basex-talk] increment a variable only when a conditional is true?

2020-02-19 Thread thufir

How do I decrement y?

Pardon, output for a simpler example:











the FLWOR:

xquery version "3.0";
let $y := 99
for $x in (1 to 9)
 let $y := $y - 1
return 


is it not possible to decrement $y without using some external scripting 
function?  That seems odd.



thanks,

Thufir


[basex-talk] increment a variable only when a conditional is true?

2020-02-19 Thread thufir
How can I increment the x variable only when numerical is false?  (I've 
been reading how xquery isn't iterative...)



current output:


  people
  joe
  phone1
  phone2
  phone3
  sue
  cell4
  home5
  alice
  atrib6
  x7
  y9
  z10


desired output:



  people
  joe
  phone1
  phone2
...


Maybe with a second xquery?  Here's the first:

xquery version "3.0";



{
variable $x:=0;

for $line in db:open("foo.txt")//text()


count $id

return if (matches($line, "[0-9]"))
 then <person numerical="true">{$line}</person>
 else <person numerical="false">{$line}</person>

}




where I'm trying to use attributes because I'm not sure how to 
conditionally nest tags.  But, this is interesting.  Not quite sure on 
syntax to set and then conditionally increment $x, however.



thanks,

Thufir


Re: [basex-talk] how to nest tags using a conditional

2020-02-19 Thread thufir




On 2020-02-19 6:10 a.m., Martin Honnen wrote:

On 19.02.2020 at 15:08, thufir wrote:

How can I start a new "record" and then nest tags in that record?


but I'm getting output like:


  
  if (matches($line, "[0-9]"))
  then people
  else people
  
  
  if (matches($line, "[0-9]"))
  then joe
  else joe
  
..

whereas I just want output like:


  joe
  123



the query:

xquery version "3.0";


{
for $line in db:open("foo.txt")//text()
return
    



Nest any contained expression in further curly braces

   {


  if (matches($line, "[0-9]"))
  then {$line}
  else {$line}



}



}
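
The rule can be sketched on inline data: inside an element constructor everything is literal text until a curly brace switches back to XQuery, so each constructor that should evaluate something needs its own pair of braces. The element names records, record, name and data are placeholders, since the archive stripped the original tags.

```xquery
xquery version "3.0";

<records>{
  for $line in ("joe", "123")
  return
    <record>{
      (: without these inner braces the if/then/else would be copied as literal text :)
      if (matches($line, "[0-9]"))
      then <data>{ $line }</data>
      else <name>{ $line }</name>
    }</record>
}</records>
```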






that was quite helpful, thanks.  I'm getting:


joe
  
  
phone1
  
  
phone2
  


and want to only open the new record tag for something like:



joe
phone1
phone2



but get "incomplete if statement" when I try to add open and close 
record tags inside each if statement.


xquery version "3.0";


{
for $line in db:open("foo.txt")//text()
return

{
   if (matches($line, "[0-9]"))
   then {$line}
   else {$line}
}
 
}
 




thanks,

Thufir


[basex-talk] how to nest tags using a conditional

2020-02-19 Thread thufir

How can I start a new "record" and then nest tags in that record?


but I'm getting output like:


  
  if (matches($line, "[0-9]"))
  then people
  else people
  
  
  if (matches($line, "[0-9]"))
  then joe
  else joe
  
..

whereas I just want output like:


  joe
  123



the query:

xquery version "3.0";


{
for $line in db:open("foo.txt")//text()
return

  if (matches($line, "[0-9]"))
  then {$line}
  else {$line}

}




I think it's a matter of using the () and {} correctly.  Pardon, yes, 
I'm literally reading a book on this, still trying to understand the syntax.




thanks,

Thufir


[basex-talk] how to store a result in a database?

2020-02-19 Thread thufir
While I can certainly output the result to a file, and then add to a 
database, how would I actually send the result directly to a database?


xquery version "3.0";


{
for $line in db:open("people.json")//text()
return
  if (matches($line, "[0-9]"))
  then {$line}
  else {$line}
}


Currently I'm running it like so:

thufir@dur:~/flwor/people.json$
thufir@dur:~/flwor/people.json$ basex all.xq > all.xml
thufir@dur:~/flwor/people.json$
thufir@dur:~/flwor/people.json$ basex
BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
>
> create database people.all
Database 'people.all' created in 234.74 ms.
>
> set parser xml
PARSER: xml
>
> add all.xml
Resource(s) added in 377.43 ms.
>
> exit
Have fun.
thufir@dur:~/flwor/people.json$


but that's a bit cumbersome.




thanks,

Thufir


Re: [basex-talk] How to export or query JSON as an array?

2020-02-17 Thread thufir

This is the data I'm trying to pull from the database:

[
  {
"0":"z10",
"1":"y9",
"2":"x7",
"3":"atrib6",
"name":"alice"
  },
  {
"0":"home5",
"1":"cell4",
"name":"sue"
  },
  {
"0":"phone3",
"1":"phone2",
"2":"phone1",
"name":"joe"
  },
  {
"name":"people"
  }
]

which is, obviously, JSON rather than XML.  So use the export 
functionality?  But, how?


On 2020-02-17 10:09 a.m., thufir wrote:
Using Java, I can add JSON easily enough.  But how can I get that data 
back as a JSONArray, or even just a List of JSONObject?


https://stackoverflow.com/q/60268157/262852

which is the same question, of course.



thanks,

Thufir


[basex-talk] How to export or query JSON as an array?

2020-02-17 Thread thufir
Using Java, I can add JSON easily enough.  But how can I get that data 
back as a JSONArray, or even just a List of JSONObject?


https://stackoverflow.com/q/60268157/262852

which is the same question, of course.



thanks,

Thufir


[basex-talk] How to programmatically add JSON?

2020-02-16 Thread thufir
I know I've gone around on this before, but looking to import data 
directly from Java.  I've set the parser to "json", am only adding 
JSONObject string data, and not JSONArray, but am getting "resource not 
found":


https://stackoverflow.com/questions/60250724/


What does it mean "resource not found"?  The data is certainly valid, 
because the following will import the JSON fine, in fact it even imports 
the array:



let $database := "blgdfmbljm"
for $name in file:list('.', false(), '*.json')
let $file := file:read-text($name)
let $json := json:parse($file)
return db:add($database, $json, $name)


dummy data:

[{"0":"","1":"z10","2":"y9","3":"x7","4":"atrib6","name":"alice"},{"0":"home5","1":"cell4","name":"sue"},{"0":"phone3","1":"phone2","2":"phone1","name":"joe"},{"name":"people"}]


So, I guess a two-fold question:  how to import a JSONArray with Java, 
and how to even know what resource isn't found.  Doesn't seem likely to 
be the JSONObject as I've inspected it just prior to invoking Add.



thanks,

Thufir


[basex-talk] POJO for import?

2020-02-16 Thread thufir
I'm looking at the Java examples, but how would I add a POJO, or 
collection of POJOs, to BaseX?  JAXB certainly looks to be an option to 
serialize to XML, but what would be some alternate approaches?



thanks,

Thufir


Re: [basex-talk] how to export to CSV?

2020-02-15 Thread thufir
I'll read your response more carefully, but key takeaway is to maybe use 
something else?  I'm futzing with powershell, seems reasonable for the 
task (if a bit odd).


On 2020-02-15 11:47 a.m., Graydon wrote:

On Sat, Feb 15, 2020 at 11:12:36AM -0800, thufir wrote:

What I'm trying to export to, CSV:

joe,phone1,phone2,phone3,
sue,cell4 ,home5,,
alice,atrib6,x7,y9,z10


This is the outcome you want?


What needs to be done so that it can be exported to CSV:



   joe
   phone1
   phone2
   phone3
   sue
   cell4
   home5
   alice
   atrib6
   x7
   y9
   z10



This isn't flat.  Which means you have a column mapping problem.

Does the thing expecting the CSV have a fixed list of columns?  Do you
know what that is? (If the first answer is "yes" and the second answer
is "no", or if the first answer is unknown, that's the first thing to
do.)


is what I see in the documentation, but not sure how to get there.


Generally speaking XQuery is not the best option for transforming XML
content.

There is no requirement to use the CSV functions; those are there to be
convenient, and if they're not convenient you don't have to use them.

That vaguely-list thing you referenced needs to be split; the
usual way in XQuery would be tumbling windows.
http://docs.basex.org/wiki/XQuery_3.0#window

Once you've got a single record,

string-join($record/descendant::text(),',')

will give you one row of CSV. It won't necessarily look like you want it
to; that column mapping problem again.

You can use file:write-text-lines, which appends line ending characters
for you, so something like:

file:write-text-lines('path/to/file.csv',for $record in $data return 
string-join($record/descendant::text(),','))

will get you a CSV file.

The fun part is likely to be some combination of the column mapping and
the tumbling windows.
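
One way to approach the column mapping, sketched under assumptions: pad every row with empty cells up to the width of the widest record. The record, name and data elements are hypothetical stand-ins for whatever the grouping step produces.

```xquery
xquery version "3.0";

let $records := (
  <record><name>joe</name><data>phone1</data><data>phone2</data><data>phone3</data></record>,
  <record><name>sue</name><data>cell4</data><data>home5</data></record>,
  <record><name>alice</name><data>atrib6</data><data>x7</data><data>y9</data><data>z10</data></record>
)
(: the widest record decides the number of columns :)
let $columns := max($records ! count(*))
for $record in $records
let $cells := $record/*/string()
(: pad short rows with empty cells so every row has the same number of fields :)
let $padding := (1 to $columns - count($cells)) ! ""
return string-join(($cells, $padding), ",")
```

Each result is one CSV row as a string, so the whole FLWOR expression could be passed as the second argument of file:write-text-lines, as in the call quoted above.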

-- Graydon



[basex-talk] how to export to CSV?

2020-02-15 Thread thufir

What I'm trying to export to, CSV:

joe,phone1,phone2,phone3,
sue,cell4 ,home5,,
alice,atrib6,x7,y9,z10


What needs to be done so that it can be exported to CSV:



  joe
  phone1
  phone2
  phone3
  sue
  cell4
  home5
  alice
  atrib6
  x7
  y9
  z10


But I'm not sure how to get from the above list, which I can rewrite as XML, 
to something which BaseX can then export as the desired CSV.  I know 
that it has to be wrapped in <csv> and that <record> is used, as well. 
Something like:



  
Huber
Sepp
Hauptstraße 13
93547 Hintertupfing
  



is what I see in the documentation, but not sure how to get there.



thanks,

Thufir


Re: [basex-talk] Flat XML fods database

2020-02-14 Thread thufir

Hi Bridger,

thanks, and yes, I have her book -- still reading it.  But I'll skip 
ahead to that section :)



-Thufir


On 2020-02-14 5:05 a.m., Bridger Dyson-Smith wrote:

Hi Thufir,

You might find Priscilla Walmsley's Functx library[1] section on 
namespaces helpful (or something else interesting in the functions 
listed there).


Michael - that's a very nice FLWOR! Thank you for sharing.

Best,
Bridger

[1] http://www.xqueryfunctions.com/xq/c0021.html

On Fri, Feb 14, 2020, 7:29 AM Michael Seiferle <m...@basex.org> wrote:


Hi Thufir,

I think you might look up the namespaces from the Libre Office File
like so:

(
  for $node in $doc//(@*,*)
  let $nn := node-name($node)
  let $ns-uri := namespace-uri-from-QName($nn)
  let $prefix := prefix-from-QName($nn)
  where $ns-uri
  return ``[declare namespace `{ $prefix }` = "`{ $ns-uri }`"]``
) => distinct-values()
=> string-join(out:nl())

https://git.basex.io/snippets/77


Other than that you might have to consult the Schema files I guess :)
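
For the failing query in this thread, a sketch of the fix would be to declare the prefix in the query prolog. The URI below is the one OpenDocument files bind to the text prefix; the database name foo is taken from the original question.

```xquery
xquery version "3.0";

(: bind the prefix used in the path to the OpenDocument text namespace :)
declare namespace text = "urn:oasis:names:tc:opendocument:xmlns:text:1.0";

for $foo in db:open("foo")
return $foo//text:p
```

The prefix itself is arbitrary; only the URI has to match the one declared on the root element of the .fods file.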



On 14.02.2020 at 09:27, thufir <hawat.thu...@gmail.com> wrote:

I think I mainly need to add a namespace for fods:


thufir@dur:~/fods/flwor$
thufir@dur:~/fods/flwor$ basex text.xq
Stopped at /home/thufir/fods/flwor/text.xq, 3/14:
[XPST0081] No namespace declared for 'text:p'.
thufir@dur:~/fods/flwor$
thufir@dur:~/fods/flwor$ cat text.xq

for $foo  in db:open("foo")
return $foo//text:p

thufir@dur:~/fods/flwor$


https://stackoverflow.com/q/60222478/262852



how do I know all the namespaces?  from the libre office file?




Re: [basex-talk] Flat XML fods database

2020-02-14 Thread thufir

I think I mainly need to add a namespace for fods:


thufir@dur:~/fods/flwor$
thufir@dur:~/fods/flwor$ basex text.xq
Stopped at /home/thufir/fods/flwor/text.xq, 3/14:
[XPST0081] No namespace declared for 'text:p'.
thufir@dur:~/fods/flwor$
thufir@dur:~/fods/flwor$ cat text.xq

for $foo  in db:open("foo")
return $foo//text:p

thufir@dur:~/fods/flwor$


https://stackoverflow.com/q/60222478/262852



how do I know all the namespaces?  from the libre office file?


[basex-talk] Flat XML fods database

2020-02-13 Thread thufir

Hi all,

I've loaded a .fods file from Libre Office into BaseX and am looking to 
identify bold cells.  This is totally off-topic to BaseX itself, but 
does anyone have experience in working with .fods files?



thanks,

Thufir


Re: [basex-talk] how to count and remove "entities"

2020-02-04 Thread thufir
I'd have to experiment more, but I believe that if I kept the filename 
static each iteration of Replace would simply write over the previous 
tweet so that only one tweet was ever being stored to the db.  Only by 
changing the name was I able to add multiple tweets with Replace.  (I 
think.)


But, I thought there were 99 ways to skin a cat? ;)

I only selected BaseX to learn XQuery.

On 2020-02-04 4:17 a.m., Christian Grün wrote:

The filename is arbitrary, you can choose it as you like.

A general note: Your code may get easier again if you write more code
in XQuery. But there are always (I think it was) 42 ways to solve a
single problem.
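
As a rough sketch of what "more code in XQuery" could look like for this case, assuming the function names of the BaseX 9 documentation referenced in this thread: json:parse turns each string into an XML document, and db:replace stores it under a path built from the tweet id. The two inline tweets are hypothetical sample data, and the database twitter is assumed to exist.

```xquery
xquery version "3.0";

(: Hypothetical sample tweets; in practice these strings would come from the Twitter client. :)
let $tweets := (
  '{ "id_str": "1001", "text": "first tweet" }',
  '{ "id_str": "1002", "text": "second tweet" }'
)
for $tweet in $tweets
let $doc := json:parse($tweet)
(: the id doubles as the resource path, so running the query again overwrites instead of duplicating :)
let $path := $doc/json/id__str || ".xml"
return db:replace("twitter", $path, $doc)
```

The element is called id__str because the default JSON conversion escapes the underscore of id_str, which matches the paths used elsewhere in this thread.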



Re: [basex-talk] how to count and remove "entities"

2020-02-04 Thread thufir

yes, this worked.  Kinda lengthy, but this is the code I came up with:


private void replace(JSONArray tweets) throws JSONException, BaseXException, IOException {
    log.fine(tweets.toString());
    JSONObject tweet = null;
    long id = 0L;
    new Open(databaseName).execute(context);
    // parse incoming resources as JSON rather than XML
    new Set("parser", "json").execute(context);
    Command replace = null;

    for (int i = 0; i < tweets.length(); i++) {
        tweet = new JSONObject(tweets.get(i).toString());
        id = Long.parseLong(tweet.get("id_str").toString());
        // the tweet id doubles as the resource path, so a repeated tweet overwrites itself
        replace = new Replace(id + ".xml");
        replace.setInput(new ArrayInput(tweet.toString()));
        replace.execute(context);
    }
    log.fine((new XQuery(".")).execute(context).toString());
}

what I don't really understand there is that when creating the Replace 
command the "primary key" would seem to be the id_str from the tweet -- 
which is fine.  But that relates to a filename xxx.xml?


thanks,

Thufir

On 2020-02-03 11:05 p.m., Christian Grün wrote:
You could use REPLACE instead of ADD (or db:replace instead of db:add) 
and name your tweet by the JSON id. For more details, have a look at our 
documentation [1].


Deleting duplicates after the insertion would be another approach, but 
it surely is too slow if your plan is to store thousands or millions of 
tweets.
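
If duplicates are already in the database, they first have to be found. A minimal sketch for counting the occurrences of each id, assuming the json/id__str layout shown in the question; the duplicate element is only a hypothetical report format.

```xquery
xquery version "3.0";

for $id in db:open("twitter")/json/id__str
(: after grouping, $id holds all id__str elements that share the same value :)
group by $value := string($id)
where count($id) > 1
return <duplicate id="{ $value }" count="{ count($id) }"/>
```

An empty result would mean that every id occurs only once.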


[1] http://docs.basex.org/wiki/Database_Module#db:replace



On Tue, 4 Feb 2020 at 07:41, thufir <hawat.thu...@gmail.com> wrote:


Not sure of the correct lingo, but I'm building a database of tweets.
As I run it, duplicate tweets are added to the database.  I can see the
duplicates with:

for $tweets  in db:open("twitter")
return {$tweets/json/id__str}

Firstly, how would I select the json node for a duplicate entity.  But,
before even selecting that node, recursively look to see if there's more
than one result for that id__str value.

How would I even generate a count of each occurrence for the data of a
specific id__str?


thanks,

Thufir



Re: [basex-talk] how to count and remove "entities"

2020-02-03 Thread thufir

I think distinct-result is helpful here:

https://stackoverflow.com/q/60051384/262852

as is count.  How would I pipe the result from the set of 
distinct-result to a count?  If the count >1 then I could delete that tweet.


Just thinking out-loud.  Is that reasonable?  Or, might I not be 
re-inventing the wheel here?



On 2020-02-03 10:41 p.m., thufir wrote:
Not sure of the correct lingo, but I'm building a database of tweets. As 
I run it, duplicate tweets are added to the database.  I can see the 
duplicates with:


for $tweets  in db:open("twitter")
return {$tweets/json/id__str}

Firstly, how would I select the json node for a duplicate entity.  But, 
before even selecting that node, recursively look to see if there's more 
than one result for that id__str value.


How would I even generate a count of each occurrence for the data of a 
specific id__str?



thanks,

Thufir


[basex-talk] how to count and remove "entities"

2020-02-03 Thread thufir
Not sure of the correct lingo, but I'm building a database of tweets. 
As I run it, duplicate tweets are added to the database.  I can see the 
duplicates with:


for $tweets  in db:open("twitter")
return {$tweets/json/id__str}

Firstly, how would I select the json node for a duplicate entity.  But, 
before even selecting that node, recursively look to see if there's more 
than one result for that id__str value.


How would I even generate a count of each occurrence for the data of a 
specific id__str?



thanks,

Thufir


Re: [basex-talk] Add command: name of the input will be set as path?

2020-02-03 Thread thufir

I got it to work in a very kludgy way:


new Open(databaseName).execute(context);
for (int i = 0; i < tweets.length(); i++) {
    jsonStringTweet = tweets.get(i).toString();
    jsonObjectTweet = new org.json.JSONObject(jsonStringTweet);
    stringXml = XML.toString(jsonObjectTweet);
    stringXml = wrap(stringXml);
    write(stringXml,fileName);
    String stringFromFile = read(fileName);
    log.fine(stringFromFile);
    new Add(fileName, stringXml).execute(context);
}
}

but there I'm passing the fileName -- certainly I can just pass 
stringXml by itself somehow?


see also:

https://stackoverflow.com/a/60047738/262852



thanks,

Thufir

On 2020-02-03 1:42 p.m., Christian Grün wrote:

In this case there's no path argument, but there is an input argument of

stringXml.  Is that how to pass a String to Add()?

There are various ways; one is as follows:

 String json = "{ \"A\": 123 }";
 Context ctx = new Context();
 new CreateDB("test").execute(ctx);
 new Set("parser", "json").execute(ctx);
 Command add = new Add("json.xml");
 add.setInput(new ArrayInput(json));
 add.execute(ctx);
 System.out.println(new XQuery(".").execute(ctx));




On Mon, Feb 3, 2020 at 10:16 PM thufir  wrote:




On 2020-02-03 6:46 a.m., Christian Grün wrote:

What does it mean that "if null, the name of input will be set as the path"?


If your path argument points to a directory or a single file, and if
you specify no argument for the input variable, the filenames
resulting from your first argument will be adopted as database paths.

If you run the command "ADD myfile.xml", the input argument will be
null. If you run "ADD TO /db/path myfile.xml", input will be
"/db/path".




Right, but I'm not looking to run the command "ADD myfile.xml" from the
console but rather:


  new Add(null, stringXml).execute(context);

In this case there's no path argument, but there is an input argument of
stringXml.  Is that how to pass a String to Add()?



thanks,

Thufir


Re: [basex-talk] Add command: name of the input will be set as path?

2020-02-03 Thread thufir




On 2020-02-03 6:46 a.m., Christian Grün wrote:

What does it mean that "if null, the name of input will be set as the path"?


If your path argument points to a directory or a single file, and if
you specify no argument for the input variable, the filenames
resulting from your first argument will be adopted as database paths.

If you run the command "ADD myfile.xml", the input argument will be
null. If you run "ADD TO /db/path myfile.xml", input will be
"/db/path".




Right, but I'm not looking to run the command "ADD myfile.xml" from the 
console but rather:



new Add(null, stringXml).execute(context);

In this case there's no path argument, but there is an input argument of 
stringXml.  Is that how to pass a String to Add()?




thanks,

Thufir


Re: [basex-talk] convert JSON to XML to add to database

2020-02-03 Thread thufir

is this what you're referring to?

Command:
SET PARSER json
Command:
CREATE DB tweet /home/thufir/json/tweet.json
Result:
Database 'tweet' created in 166.11 ms.


Which, yes, is exactly the sequence which I'm looking to capture or 
replicate -- but not from a file as above.  It's more the usage of "Add" 
to add a string.


I've converted the JSON to XML, so that rather than tweet.json I have 
tweet.xml for convenience.


Using either ADD or CREATE is my goal -- but not with files.  Trying to 
use Strings.


thanks,

Thufir

On 2020-02-03 6:40 a.m., Christian Grün wrote:

How is JSON converted to XML in order to ADD to a database?

 JSONObject jsonTweet = tweets.getJSONObject(Long.toString(id));
 xmlStringTweet = XML.toString(jsonTweet);


Do you know how to create a database and add documents as JSON via the
BaseX GUI? If yes, you can enable the InfoView panel, and you will see
the commands that are called in the background. In the next step, you
can call these commands with Java.

See [1] for the available BaseX options, and see [2] for an example
that assigns an option via the SET command.

[1] http://docs.basex.org/wiki/Options
[2] 
https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/local/CreateCollection.java



[basex-talk] Add command: name of the input will be set as path?

2020-02-03 Thread thufir



What does it mean that "if null, the name of input will be set as the path"?


Javadoc:

Add

public Add(java.lang.String path,
   java.lang.String input)

Constructor, specifying a target path and an input.

Parameters:
path - target path, optionally terminated by a new file name. 
If null, the name of the input will be set as path.

input - input file or XML string



I'm looking to add an xml file, so am using "null" for the path:

https://stackoverflow.com/q/60035605/262852


but what are the implications?  the "name of the input" will be "set as 
path"?  Where is the "name of the input"?  What is "path" in relation to 
a String which exists only in memory?



Just pass a string like:

new Add(null, stringXml).execute(context);

and that should add to the currently open database?






thanks,

Thufir


[basex-talk] convert JSON to XML to add to database

2020-02-02 Thread thufir



Hi all,


How is JSON converted to XML in order to ADD to a database?

   JSONObject jsonTweet = tweets.getJSONObject(Long.toString(id));
   xmlStringTweet = XML.toString(jsonTweet);


alternately, could the JSON simply get directly added to the database?

see also:  https://stackoverflow.com/q/60034291/262852



thanks,

Thufir


[basex-talk] JSON to XML conversion

2020-02-01 Thread thufir

Hi,

How do I do something like:


public void transform(String fileName) throws IOException {
    String content = new String(Files.readAllBytes(Paths.get(fileName)), StandardCharsets.UTF_8);
    org.json.JSONObject json = new org.json.JSONObject(content);
    log.info(org.json.XML.toString(json));
}

but using BaseX as below?

private void baseXparseJsonFile(String fileName) throws IOException {
    org.basex.build.json.JsonParser jsonParser =
        new org.basex.build.json.JsonParser(new IOFile(fileName), new MainOptions());
    //where is the xml?
}


Unclear on how to use the different parsers (really, just need this one 
parser).  See also:


https://stackoverflow.com/q/60022419/262852



thanks,

Thufir


Re: [basex-talk] simplest possible FLWOR specifying a database

2019-10-12 Thread thufir
that's interesting. what's item in this example, and how can it be 
referenced?



thanks,

Thufir

On 2019-10-12 2:29 a.m., Christian Grün wrote:

On top of the query (in the query prolog), you can bind your database
to the context. After that, there’ll be no need to bind it to a
variable:

declare context item := db:open("com.w3schools.books");
/bookstore



On Sat, Oct 12, 2019 at 10:11 AM thufir  wrote:


nevermind, it's:

let $db := db:open("com.w3schools.books")
return $db/bookstore

I thought it was unusual because of db:open.


-Thufir

On 2019-10-12 12:50 a.m., thufir wrote:

these FLWOR .xq files work as is:

let $db := db:open("com.w3schools.books")
for $x in $db/bookstore
return $x


or

{
let $db := db:open("com.w3schools.books")
for $x in $db/bookstore/book
return {$x/title,$x/author}
}


Probably a silly question, but, rather than, as in the first query,
using $x can I not somehow specify just "/"?



thanks,

Thufir


Re: [basex-talk] simplest possible FLOWR specifying a database

2019-10-12 Thread thufir

nevermind, it's:

let $db := db:open("com.w3schools.books")
return $db/bookstore

I thought it was unusual because of db:open.


-Thufir

On 2019-10-12 12:50 a.m., thufir wrote:

these FLWOR .xq files work as is:

let $db := db:open("com.w3schools.books")
for $x in $db/bookstore
return $x


or

{
let $db := db:open("com.w3schools.books")
for $x in $db/bookstore/book
return {$x/title,$x/author}
}


Probably a silly question, but, rather than, as in the first query, 
using $x can I not somehow specify just "/"?




thanks,

Thufir


[basex-talk] simplest possible FLOWR specifying a database

2019-10-12 Thread thufir

these FLWOR .xq files work as is:

let $db := db:open("com.w3schools.books")
for $x in $db/bookstore
return $x


or

{
let $db := db:open("com.w3schools.books")
for $x in $db/bookstore/book
return {$x/title,$x/author}
}


Probably a silly question, but, rather than, as in the first query, 
using $x can I not somehow specify just "/"?




thanks,

Thufir


[basex-talk] what do with treasury data?

2019-10-11 Thread thufir

I randomly came across the data at:

https://www.treasurydirect.gov/xml/

which is all well and good, but what might be a good starter project 
with this data?  Perhaps join it with other data?  But what other data?


Not even sure what this data is, except that there's mention of 
announcements and auctions.



thanks,

Thufir




Re: [basex-talk] how to specify a database in a.xq FLOWR query?

2019-10-07 Thread thufir
From SO (and the fine manual), the solution is to use:

basex -i w3school_data titles.xq


Very interesting.  I'm sure this can be piped or simply written to a 
file, but is there a particularly "basex" method?  Not to just return a 
result but write that result back to basex.



thanks,

Thufir

On 2019-10-06 11:53 p.m., thufir wrote:

how do I specify a database for a .xq file to use?

a simple FLOWR:

thufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$ basex titles.xq
[warning] /usr/bin/basex: Unable to locate /usr/share/java/jing.jar in 
/usr/share/java

Learning XML
XQuery Kick Startthufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$ cat titles.xq
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
thufir@dur:~/basex/w3$

(from w3schools)

thanks,

Thufir


Re: [basex-talk] // versus /*/

2019-10-07 Thread thufir
Thanks, Liam.  No, not homework, just me futzing about.  I'll experiment 
a bit more -- and thanks for the suggestion.


-Thufir

On 2019-10-07 12:25 a.m., Liam R. E. Quin wrote:

On Sun, 2019-10-06 at 21:28 -0700, thufir wrote:

Do these have the same meaning?  Might there be a subtle distinction, or
might they be read differently but functionally identical?


Are we doing your homework? :-) :-)

  //* is the same as /descendant-or-self::*
  //book means, search the whole database to find "book" elements.


  /*/book means make a list of all children of the top-level node, and
find book elements that are children of items in that list.

So, given
   
 
//book will find one node, and /*/book won't find any.
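
The distinction described above can be checked outside BaseX with the XPath engine that ships with Java. This is only a sketch: the sample document and the class name PathDemo are made up for illustration (the example document from the original mail did not survive in this archive), and the book element is deliberately nested two levels below the root.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class PathDemo {
    // Hypothetical document: the only book is a grandchild of the root element.
    static final String XML =
        "<library><shelf><book id=\"b1\"/></shelf></library>";

    /** Returns how many nodes the given XPath expression selects in XML. */
    public static int count(String expression) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList nodes = (NodeList) xpath.evaluate(
                expression,
                new InputSource(new StringReader(XML)),
                XPathConstants.NODESET);
            return nodes.getLength();
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Descendant search: finds book at any depth.
        System.out.println("//book    -> " + count("//book"));
        // Only book elements that are direct children of the root element.
        System.out.println("/*/book   -> " + count("/*/book"));
        // One level deeper reaches the book in this sample.
        System.out.println("/*/*/book -> " + count("/*/*/book"));
    }
}
```

With this sample, //book selects the nested book while /*/book selects nothing, because no book is a direct child of the root element.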


They're equally efficient, at least as used above?

They are doing different things. To measure efficiency, use a much
larger database than the XQuery use cases example :)

You may find Priscilla Walmsley's XQuery book helpful in learning XPath
version 3.

Best,

Liam



[basex-talk] how to specify a database in a.xq FLOWR query?

2019-10-07 Thread thufir

how do I specify a database for a .xq file to use?

a simple FLOWR:

thufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$ basex titles.xq
[warning] /usr/bin/basex: Unable to locate /usr/share/java/jing.jar in 
/usr/share/java

Learning XML
XQuery Kick Startthufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$
thufir@dur:~/basex/w3$ cat titles.xq
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
thufir@dur:~/basex/w3$

(from w3schools)

thanks,

Thufir


[basex-talk] // versus /*/

2019-10-06 Thread thufir
Do these have the same meaning?  Might there be a subtle distinction, or 
might they be read differently but functionally identical?



>
> xquery /*/book[@id="bk112"]

  Galos, Mike
  Visual Studio 7: A Comprehensive Guide
  Computer
  49.95
  2001-04-16
  Microsoft Visual Studio 7 is explored in depth,
  looking at how Visual Basic, Visual C++, C#, and ASP+ are
  integrated into a comprehensive development
  environment.

Query executed in 1.22 ms.
>
> xquery //book[@id="bk112"]/title
Visual Studio 7: A Comprehensive Guide
Query executed in 1.52 ms.
>
> xquery /*/book[@id="bk112"]/title
Visual Studio 7: A Comprehensive Guide
Query executed in 1.67 ms.
>


They're equally efficient, at least as used above?





thanks,

Thufir


Re: [basex-talk] get list of databases

2019-10-06 Thread thufir



thanks, worked perfectly.

On 2019-10-06 3:42 p.m., Christian Grün wrote:

Try

new List().execute(ctx);

See [1] for details.

Cheers,
Christian

[1] 
https://github.com/BaseXdb/basex/blob/master/basex-core/src/main/java/org/basex/core/cmd/List.java#L28





[basex-talk] get list of databases

2019-10-06 Thread thufir

Hi,

pardon the silly question, but how do I get a list of databases exactly? 
 I'm looking at:


https://github.com/BaseXdb/basex/blob/master/basex-examples/src/main/java/org/basex/examples/local/CreateCollection.java

and want output like:

> list
Name   Resources  Size  Input Path
----------------------------------
books  0          4570

1 database(s).
>


but using:

http://docs.basex.org/javadoc/org/basex/core/cmd/List.html




thanks,

Thufir


Re: [basex-talk] console versus gui

2019-01-03 Thread thufir
I thought I was losing my mind -- thanks for at least confirming your 
understanding.  I'll be futzing with it, bit burnt out at the moment.


thanks,

Thufir

On 2019-01-03 12:41 p.m., Bridger Dyson-Smith wrote:

Hi Thufir,

AFAIK, the GUI and console (client or stand-alone mode) are *not* in 
sync. There is some wording on this in the wiki but I am unable to find 
it at the moment.


On Thu, Jan 3, 2019 at 3:30 PM thufir wrote:


I'll take a look at the database directory:

http://docs.basex.org/wiki/Configuration#Database_Directory



On 2019-01-03 11:03 a.m., thufir wrote:
 >
 > I'd like it consistent across access through the Java hooks, GUI and
 > console.


Best,
Bridger


Re: [basex-talk] console versus gui

2019-01-03 Thread thufir

I'll take a look at the database directory:

http://docs.basex.org/wiki/Configuration#Database_Directory



On 2019-01-03 11:03 a.m., thufir wrote:


I'd like it consistent across access through the Java hooks, GUI and 
console.


[basex-talk] console versus gui

2019-01-03 Thread thufir
Are the console and GUI in "sync"?  By which I mean, can a database in 
the GUI not list from the console, and vice versa?  I'm wondering if 
this perhaps relates to using BaseX as db server.


I notice that my Java application can create and delete databases fine, 
but it seems that only the GUI can view those databases.  Perhaps 
there's a way to migrate them from the GUI to the console?


I'd like it consistent across access through the Java hooks, GUI and 
console.  Or, perhaps I'm just writing bad code.



I asked on SO:

https://stackoverflow.com/q/54028203/262852





thanks,

Thufir


Re: [basex-talk] Java XQuery

2019-01-03 Thread thufir

The project is:

https://github.com/THUFIR/helloWorldBaseX


While I've been through many of the Java examples it's also entirely 
possible that I overlooked something.  Learning XQuery and Xpath syntax 
and capabilities.



-thufir


Re: [basex-talk] Java XQuery

2019-01-03 Thread thufir




On 2019-01-03 6:00 a.m., Christian Grün wrote:

If you use Java, there is quite a variety on running queries. Maybe
you could give us some insight into your use case first? For example,
what do you want to do with the result?


Yes, bit spaghetti-ish, pardon.  The notion is to first drop the 
database, then populate, then query.  For grabbing xml from w3schools, 
popping in a database, running an xquery, that works fine.


Moving to html, it then sort of works.  The db is dropped, a db is 
created and then populated.  Browsing in the GUI I can see, for example, 
a list of book categories -- so there's data to work from.  (Which 
tagsoup has fixed so that basex can parse it.)


That's really the end goal:  just running XQuery against html.

The only query I can get working against the html is for the query 
string to be "text()" or perhaps "/text()" which then returns all the 
html.  Rather, I'd want to traverse to pick out specific parts.


It's related, to a degree, with Selenium efforts.

---

The upshot being that the way tagsoup fixes malformed html either causes 
(me) problems with running xquery queries, or, more likely, I'm not 
understanding how to run xpath and xquery against the db properly.


The GUI is very interesting in this respect because it allows me to 
visualize the raw data, it's "clickable", and I can type and run xpath 
queries right in the GUI.


However, the *only* xpath query I can get results on is "text()".  Not 
so with "raw" xml from w3schools.  With that xml I can drill down to 
varying degrees as expected.


---

Either tagsoup is mashing the html too extremely, or it's my lack of 
knowledge.
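
One possible cause worth ruling out, and it is only a guess that this thread does not confirm: if the cleaned-up HTML ends up in the XHTML namespace, an unprefixed step such as //title matches nothing, while text() still returns content. The sketch below shows that effect with the Java standard library alone; the sample markup and the class name NamespaceDemo are made up for illustration and say nothing about how TagSoup is configured in this setup.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class NamespaceDemo {
    // Hypothetical output of an HTML cleaner that puts elements in the XHTML namespace.
    static final String XHTML =
        "<html xmlns=\"http://www.w3.org/1999/xhtml\">"
        + "<head><title>Books</title></head><body/></html>";

    /** Returns how many nodes the given XPath expression selects in XHTML. */
    public static int count(String expression) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // keep namespace information in the DOM
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XHTML)));
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expression, doc, XPathConstants.NODESET);
            return nodes.getLength();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Unprefixed name: only matches elements in no namespace.
        System.out.println("//title                    -> " + count("//title"));
        // Ignoring the namespace finds the element.
        System.out.println("//*[local-name()='title']  -> " + count("//*[local-name()='title']"));
        // Text nodes carry no name, so they are always found.
        System.out.println("//text()                   -> " + count("//text()"));
    }
}
```

If namespaces do turn out to be the issue, the XQuery counterpart of the second expression is the wildcard form //*:title.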




Hey, I appreciate the input.  Hope I made sense.


-Thufir


[basex-talk] XQuery from Java

2019-01-02 Thread thufir
I'm a tad more proficient with BaseX, but, how do I actually build an 
XQuery?  Yes, I can write an XQuery and save it in a text file, then use 
BaseX for execution.


But it would be far more flexible to write the actual XQuery from within 
Java itself.  But how?


Oracle gives an example:

OXQDataSource ds = new OXQDataSource();
XQConnection con = ds.getConnection();
String query = "{1 + 1}";
XQPreparedExpression expr = con.prepareExpression(query);
XQSequence result = expr.executeQuery();

using their datasource.  Cannot BaseX do something similar?


see also:

https://stackoverflow.com/q/53996575/262852





thanks,

Thufir


[basex-talk] upper limits on storage; database admin

2019-01-02 Thread thufir
What are the upper bounds to basex in size?  Assuming it's just text 
xml, gigabytes is quite a bit to my thinking.  At a certain point, it's "big 
data" -- but how do you know when you're approaching that point?


Or, is the bottleneck more read/write and consistency problems?  What 
little I know of RDBMS is that master/slave can alleviate some bottlenecks.


To put this another way:  I'm so enthusiastic about basex that I'm 
having trouble finding a place it doesn't fit.  As you approach 
terabytes and beyond what dbadmin approaches are employed?




-Thufir


[basex-talk] Java XQuery

2019-01-02 Thread thufir

After creating and populating a database:

try {
    new DropDB(databaseName).execute(context);
} catch (BaseXException ex) {
    Logger.getLogger(Database.class.getName()).log(Level.SEVERE, null, ex);
    LOG.fine("no databases to drop");
}

how would I query it?  The query:

db:open("note")//note



thanks,

Thufir


Re: [basex-talk] console usage

2019-01-02 Thread thufir




On 2019-01-02 8:07 a.m., Christian Grün wrote:

  > RUN fetch.note.text.xq
Resource "/home/thufir/fetch.note.text.xq" not found.


Is the script located at this path? If not, you will have to change to
this directory and start BaseX from there, or you will need to specify
the full path.




You were spot on:

thufir@dur:~/basex$
thufir@dur:~/basex$ basex
[warning] /usr/bin/basex: Unable to locate /usr/share/java/jing.jar in 
/usr/share/java

BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
>
> RUN fetch.note.text.xq

  Tove
  Jani
  Reminder
  Don't forget me this weekend!

Query "fetch.note.text.xq" executed in 1433.71 ms.
>
> exit
Enjoy life.
thufir@dur:~/basex$



Thanks.  And, enjoy life :)



Re: [basex-talk] console usage

2019-01-02 Thread thufir




On 2019-01-02 8:01 a.m., thufir wrote:

Resource "/home/thufir/fetch.note.text.xq" not found.



O

My pwd is /home/thufir/basex which is where the script is..


Re: [basex-talk] console usage

2019-01-02 Thread thufir




On 2019-01-02 8:07 a.m., Christian Grün wrote:

  > RUN fetch.note.text.xq
Resource "/home/thufir/fetch.note.text.xq" not found.


Is the script located at this path? If not, you will have to change to
this directory and start BaseX from there, or you will need to specify
the full path.



Yes, the script is there:

thufir@dur:~/basex$
thufir@dur:~/basex$ cat fetch.note.text.xq

fetch:xml("https://www.w3schools.com/xml/note.xml", map { 'chop': true() })


thufir@dur:~/basex$
thufir@dur:~/basex$ ls -al fetch.note.text.xq
-rw-r--r-- 1 thufir thufir 79 Jan  2 07:54 fetch.note.text.xq
thufir@dur:~/basex$
thufir@dur:~/basex$ pwd
/home/thufir/basex
thufir@dur:~/basex$


Yet my attempts to execute from the console give an error.  Thanks for 
confirming that the error, as it appears, is that the file cannot be 
found.  Makes no sense to me.



The script itself works from the GUI:


https://stackoverflow.com/questions/54009208/



thanks,

Thufir


ps:  I missed your earlier reply but have fixed my gmail filters.


[basex-talk] console usage

2019-01-02 Thread thufir

How can I run this script?


thufir@dur:~/basex$
thufir@dur:~/basex$ basex
[warning] /usr/bin/basex: Unable to locate /usr/share/java/jing.jar in 
/usr/share/java

BaseX 9.0.1 [Standalone]
Try 'help' to get more information.
>
> RUN fetch.note.text.xq
Resource "/home/thufir/fetch.note.text.xq" not found.
>
> exit
Have fun.
thufir@dur:~/basex$
thufir@dur:~/basex$ cat fetch.note.text.xq

fetch:xml("https://www.w3schools.com/xml/note.xml", map { 'chop': true() })


thufir@dur:~/basex$



The script seems to work fine from the GUI.




thanks,

Thufir


[basex-talk] basic xquery

2019-01-01 Thread thufir

How can I run this xquery:


fetch:xml("http://books.toscrape.com/", map {
  'parser': 'html',
  'htmlparser': map { 'html': false(), 'nodefaults': true() }
})


from basex using Java?  I can run that xquery from the GUI.



thanks,

Thufir