Hi, Alexander!
 
I have implemented this approach.
I have about 50 databases. If I search each database separately, it takes about 5-10 ms per database.
But if I search via "db:list...db:open...", it takes about 12-15 seconds.
 
Example takes ~12-15s:
let $db := for $i in db:list()[starts-with(.,'000999~')] return try {db:open($i)} catch * {}
for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
return $doc
 
Example takes ~180ms (returns 2 rows):
let $db := for $i in db:list()[starts-with(.,'000999~201807')] return db:open($i)
for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
return $doc
 
Example takes ~10ms (returns 2 rows):
for $doc in db:open('000999~201807')/.//*[text() contains text { 'TEN-9258' } any]
return $doc
 
Why do the last 2 examples take different times?
How can I improve this?
 
Example takes ~2s (returns 0 rows):
let $db := for $i in db:list()[starts-with(.,'000999~201806')] return db:open($i)
for $doc in $db/.//*[text() contains text { 'TEN-9258' } any]
return $doc
 
Example takes ~12ms (returns 0 rows):
for $doc in db:open('000999~201806')/.//*[text() contains text { 'TEN-9258' } any]
return $doc
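
One thing that might be worth trying here, as a minimal sketch and not a confirmed diagnosis: when the database name is only known at runtime (the db:list()/db:open() variants above), the optimizer may not rewrite the path to a full-text index lookup, whereas ft:search() from the BaseX Full-Text Module queries the index of a named database directly. The sketch assumes every matching database has a full-text index; databases without one are skipped by the try/catch, as in the first example.

```xquery
(: Sketch: query the full-text index of each database explicitly. :)
for $name in db:list()[starts-with(., '000999~')]
return try {
  (: ft:search returns the matching text nodes; step up to their parent elements :)
  ft:search($name, 'TEN-9258')/..
} catch * {
  (: assumption: a database without a full-text index raises an error and is skipped :)
  ()
}
```

Because each lookup is tied to one database name, the cost should stay close to the sum of the individual per-database queries, if the assumption about the index rewriting holds.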
 
 
25.06.2018, 13:07, "Alexander Shpack" <shadow...@gmail.com>:
Hi, Vladimir,
 
If you give your database names a particular prefix, for example "db_", you can use the following code:

let $docs := for $i in db:list()[starts-with(.,"db_")] return db:open($i)
return $docs/*


 
On Mon, Jun 25, 2018 at 12:32 PM Ветошкин Владимир <en-tra...@yandex.ru> wrote:
Hi, Alexander,
 
Some questions:
After that, how can I perform a search across all of these databases?
Can I search for a substring without full-text, using only the text index?
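
For reference, a substring search that needs no full-text index can be written with the standard contains() function. This is only a sketch: the prefix and search string are taken from the examples elsewhere in this thread, and as far as I know contains() is evaluated by scanning text nodes, since the text index targets exact and range comparisons, so it may be slow on large databases.

```xquery
(: Sketch: plain substring match, no full-text index involved. :)
for $name in db:list()[starts-with(., '000999~')]
return db:open($name)//*[text()[contains(., 'TEN-9258')]]
```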
 
25.06.2018, 11:56, "Alexander Shpack" <shadow...@gmail.com>:
Hey Vladimir,
 
You can use a sharding approach for your data import and split the data into separate DBs, even one per month.
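
A minimal sketch of what monthly sharding could look like. The '000999~' prefix follows the naming seen elsewhere in this thread, and the input path is hypothetical; adjust both to the real setup.

```xquery
(: Sketch: route an import to a per-month database. :)
declare variable $prefix := '000999~';              (: naming assumed from this thread :)
declare variable $input  := '/path/to/new-files/';  (: hypothetical import directory :)

(: e.g. '000999~201806' for June 2018 :)
let $name := $prefix || format-date(current-date(), '[Y0001][M01]')
return if (db:exists($name))
  then db:add($name, $input)
  else db:create($name, $input, (), map { 'ftindex': true() })
```

After db:add the full-text index of that month's database has to be rebuilt (for example with db:optimize in a follow-up query), but only the small monthly shard is affected, not the older ones.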

 
 
On Mon, Jun 25, 2018 at 11:50 AM Ветошкин Владимир <en-tra...@yandex.ru> wrote:
Hi, Alexander!
Thank you!
 
In my previous message I described the process only briefly.
I'll think about a separate DB, but I'm afraid that this database will also become very big in the future.
Although I could try to split the data into several databases, one per year... Hmm.
 
25.06.2018, 11:25, "Alexander Shpack" <shadow...@gmail.com>:
Hey, Vladimir!
 
Just put these specific files into a separate DB and then index it.
You can automate this: BaseX allows you to create and index a DB right from XQuery.
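
As a rough illustration of creating and indexing a database from XQuery, assuming a hypothetical database name and source directory:

```xquery
(: Sketch: create a separate database and build its full-text index in one step. :)
db:create(
  'special-files',              (: hypothetical database name :)
  '/path/to/specific-files/',   (: hypothetical directory with the XML files to index :)
  (),                           (: no explicit target paths :)
  map { 'ftindex': true() }     (: build the full-text index on creation :)
)
```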
 
I hope it helps you. Anyhow, you can provide more details about your task and we can figure out the best solution for you.
 
 
 
On Mon, Jun 25, 2018 at 10:42 AM Ветошкин Владимир <en-tra...@yandex.ru> wrote:
Hi, Fabrice!
Thank you.
 
All databases change constantly. That is why there is no way to single out "a big readonly collection" :(
Maybe it is possible to use some other incremental indexes?
I have to index specific xml-files, not all files in database.
 
21.06.2018, 17:16, "Fabrice ETANCHAUD" <fetanch...@pch.cerfrance.fr>:

Hi Vladimir,

 

I don’t think there is anything like an incremental full-text index at the moment [1].

As the index is per collection, the recommended way would be to split your data into two collections:

-          A big readonly collection of all the past updates, indexed once

-          A small/medium-sized collection whose full-text index can be recreated in an acceptable time after each update.

At the end of a predefined time period, you have to add the live collection to the readonly one, reindex it, and truncate the live one.
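
The periodic merge described above could be sketched roughly as follows. The database names 'live' and 'archive' are hypothetical, and the sketch relies on my understanding that BaseX applies db:optimize after db:add within one query; if in doubt, run the two steps as separate queries.

```xquery
(: Sketch: copy the live collection into the read-only archive, then reindex it. :)
let $live    := 'live'      (: hypothetical name of the small, frequently updated DB :)
let $archive := 'archive'   (: hypothetical name of the big, rarely changing DB :)
return (
  (: keep each document's original path inside the archive :)
  for $doc in db:open($live)
  return db:add($archive, $doc, db:path($doc)),
  (: rebuild the archive's full-text index after the documents were added :)
  db:optimize($archive, false(), map { 'ftindex': true() })
)
```

Truncating the live collection afterwards is left as a separate step.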

 

Best regards from France,

Fabrice Etanchaud

 

[1] http://docs.basex.org/wiki/Indexes#Updates

 

 

 

 

From: BaseX-Talk [mailto:basex-talk-boun...@mailman.uni-konstanz.de] On Behalf Of Ветошкин Владимир
Sent: Thursday, 21 June 2018 16:02
To: BaseX
Subject: [basex-talk] Full-Text

 

Hi, everyone!

 

Is there any way to index only imported xml-files?

Currently, when I import XML files, the full-text index is deleted.

After importing, I recreate the whole full-text index, and it takes too much time :(

 

-- 

Best regards,

Ветошкин Владимир Владимирович

 

 
 
 


--
s0rr0w
 
 