How about something like this:

let $docs := collection('/mydir')/*
for $doc in $docs
    return if (matches(document-uri(root($doc)), '^.+somestring$'))
           then $doc
           else ()
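
And if what you actually need is the list of databases that contain such
a document, I think you can wrap the same test in a loop over all of them.
An untested sketch, assuming BaseX's db:list() (all database names) and
db:open() (all documents of one database), with 'somestring' still just a
placeholder for the known part of the name:

let $name-pattern := '^.+somestring$'
for $db in db:list()
for $doc in db:open($db)
return if (matches(document-uri(root($doc)), $name-pattern))
       then $db
       else ()

If several documents in one database match you will get that name more
than once; wrapping the whole thing in distinct-values() would take care
of that.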

Cheers,

Eliot
----
Eliot Kimber, Owner
Contrext, LLC
http://contrext.com




On 8/31/15, 11:35 AM, "Martín Ferrari"
<basex-talk-boun...@mailman.uni-konstanz.de on behalf of
ferrari_mar...@hotmail.com> wrote:

>Hi Mansi,
>
>     I have a similar situation. I don't think there's a fast way to get
>documents by only knowing a part of their names. It seems you need to
>know the exact name. In my case, we might be able to group documents by
>a common id, so we might create subfolders inside the DB and store/get
>the contents of the subfolder directly, which is pretty fast.
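>
>     Roughly what I mean, with made-up database and folder names
>(db:add() and db:open() are the BaseX functions I have in mind; the two
>calls would be separate queries, since db:add() is updating):
>
>     db:add('mydb', 'file:///tmp/order-1.xml', 'customer-42/order-1.xml')
>
>     db:open('mydb', 'customer-42/')
>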
>     I've also tried indexing, but insertions got really slow (I assume
>because indexing is not granular: it indexes all values), and we need
>performance.
>
>     Oh, I've also tried using starts-with() instead of contains(), but
>it seems it does not pick up indexes.
>
>Martín.
>
>________________________________________
>Date: Fri, 28 Aug 2015 16:52:37 -0400
>From: mansi.sh...@gmail.com
>To: basex-talk@mailman.uni-konstanz.de
>Subject: [basex-talk] Finding document based on filename
>
>Hello, 
>I will have hundreds of databases, each holding 100 XML documents. I want
>to devise an algorithm where, given part of an XML file name, I can find
>out which database(s) contain it, or null if the document is not currently
>present in any database. Based on that, I would add the current document
>into the database. This is to always maintain the latest version of a
>document in the DB, removing the older version while adding the newer one.
>
>So far, the only way I could come up with is:
>
>let $partial-name := 'part-of-the-xml-file-name'
>for $db in db:list()
>for $resource in db:list($db)
>where contains($resource, $partial-name)
>return $db
>
>The above algorithm seems highly inefficient. Is there any indexing that
>can be done? Or do you suggest that, for each document insert, I should
>maintain a separate XML document which lists each file inserted, etc.?
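>
>Something like this is the kind of separate document I had in mind (shape
>and names purely hypothetical):
>
><catalog>
>  <file db="some-db" path="path/inside/that/db.xml"/>
></catalog>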
>
>Once I get hold of the above list of databases, I would eventually delete
>that file and insert the latest version of it (which would have the same
>partial file name). So constant updating of this external document also
>seems painful (maybe?).
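>
>For the replace step itself I was picturing roughly this (db:replace() is
>my guess at the right function, the names are made up, and I haven't
>tried it):
>
>db:replace('db-that-had-the-old-version', 'path/to/that/file.xml',
>           doc('file:///incoming/that/file.xml'))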
>
>Also, would it be faster to run XQuery script files through Java code, or
>to use the Java API for such operations?
>
>How do you all deal with such operations?
>
>- Mansi
>

