Dear basex and xquery experts,
I have to solve the following problem:
On a daily basis I receive data packages in CSV format (approx. 20,000 records).
I imported the first package into a BaseX database and now want to import the
changed or new records from the daily deliveries.
I have tried several solutions, but none of them is performant enough to be
executed every day: while importing a complete package takes a few seconds,
comparing against the records already in the database takes hours.
So I decided to use the index variant described by Christian Grün.
My core code looks as follows:
let $db  := db:open('mydb')
let $ndx := db:open('mydb-index')
let $xml := csv:parse($csv, map { "header": true(), "separator": ';' })
let $xml := (
  for $x in subsequence($xml//record, 2)
  count $c
  return
    if (local:recordUpdated($x, 'mydb', $ndx)) then
      element record {
        attribute recno { $c },
        $x/child::*
      }
    else ()
)
(: store the differences in the database :)
return db:replace($dbname, $tpalow || "/" || $seqno, $xml)

declare function local:recordUpdated($rec, $dbname, $ndx as item()*)
    as xs:boolean {
  let $id := $ndx//vp[@ptn   = $rec/_14_TPA-Key_VP/text() and
                      @tarif = $rec/_28_TPA-Key_Tarif/text()]/id
  return
    if ($id) then (
      let $oldrec := db:open-id($dbname, $id)/..
      return not(deep-equal($oldrec/child::*, $rec/child::*))
    ) else true()
};
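One alternative I am considering, to avoid the XPath lookup against $ndx for every single record, is loading the index into an XQuery 3.1 map once and doing the per-record lookup there. A rough sketch (the "|"-separated key format is my own invention, not part of my current code):

```
(: build the lookup map once, outside the per-record loop :)
let $lookup := map:merge(
  for $vp in $ndx//vp
  return map:entry(
    string($vp/@ptn) || '|' || string($vp/@tarif),
    $vp/id
  )
)
(: inside the loop, a single map access replaces the predicate scan :)
let $id := $lookup($x/_14_TPA-Key_VP || '|' || $x/_28_TPA-Key_Tarif)
```

Would a constant-time map lookup like this be the expected fix here, or should the predicate in local:recordUpdated already be rewritten by the optimizer to use the attribute index?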
Both databases 'mydb' and 'mydb-index' are created with attribute and text
indexes. After each run the index database is rebuilt so that it points to the
newest records for the given key values.
Can anybody point out why this process takes more than a second per record to
compare?
Thanks for your time