Thanks. I'm still trying to get this to work. Is it possible to put
updating expressions in a library module function (with the name of the
database hard-coded) and then call that function from within jobs:eval() in
a main module? When I do this, the jobs don't seem to run in parallel. But
if I put the updating expressions in the main module, the jobs do seem to
run in parallel. Is this a limitation?

I have millions of updates (inserts) that I'm trying to run on 10 large
databases (5GB each). In my current process, it takes about 48 hours to
update a single DB. Are there other options you'd recommend in order to
speed things up?

All best,
Tim


--
Tim A. Thompson
Discovery Metadata Librarian
Yale University Library

On Wed, Feb 10, 2021 at 3:27 AM Christian Grün <[email protected]>
wrote:

> Hi Tim,
>
> Updates can be run in parallel if the name of the database is directly
> specified in the query [1]:
>
>   jobs:eval('delete node db:open("db1")//abc'),
>   jobs:eval('delete node db:open("db2")//def')
>
> In a future version of BaseX, we might split up our compilation phase
> into multiple ones. After this, we could statically detect that a
> passed-on variable will be the name of a database.
>
> Until then, you could try to build a query string that includes
> hard-coded database names.
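>
> A minimal, untested sketch of that idea, assuming the database naming
> scheme from your query and using the delete expression above as a
> placeholder for the real update. The name is concatenated into the
> query string, so each job is compiled with a literal database name:
>
> ```xquery
> (: Sketch only: one job per database, with the name embedded as a
>    string literal in the query text instead of passed as a variable.
>    The //abc path and the delete are placeholders for the real update. :)
> for $i in 0 to 9
> let $db := 'marc.exp.20210115.' || $i
> let $query := 'delete node db:open("' || $db || '")//abc'
> return jobs:eval($query)
> ```
>
> Each call to jobs:eval() returns the ID of the registered job, so the
> loop returns ten job IDs.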
>
> Hope this helps,
> Christian
>
> [1] https://docs.basex.org/wiki/Transaction_Management#XQuery
>
>
>
> On Wed, Feb 10, 2021 at 1:56 AM Tim Thompson <[email protected]> wrote:
> >
> > Thank you, Christian, for the detailed explanation!
> >
> > One more question, if I may. Is it possible to run updating jobs on
> > different databases in parallel? Or can database update operations only be
> > run sequentially, one db at a time? I have a query that calls a function to
> > perform a series of operations:
> >
> > for $i in (0 to 9)
> > return (
> >   jobs:eval('
> >     declare variable $iter external;
> >     local:add-uris("marc.exp.20210115."||$iter)
> >   ', map {"iter": $i})
> > )
> >
> > The function:
> >
> > - opens a database
> > - iterates through its records
> > - performs lookups against an index
> > - inserts any matches into the database
> > - calls file:append-text-lines() to write the results of the lookups
> >
> > Based on some simple tests, it doesn't seem possible to run the jobs in
> > parallel, but I thought I would ask--to see whether there was something I
> > was missing.
> >
> > Thanks again,
> > Tim
>
