To my knowledge, putting the hexBinary inside binary { } is the way to create a real binary, so your approach should already work. Did you check?

A small optimization could be to make batches of, let's say, about 100 records, build a map:map of them, and pass that to a spawned process that inserts all 100. You are creating a new task server thread for every record now. The task server queue has a limit, and doing it in batches of 100 usually works faster. Here's a bit of sample 'transaction' code I copied from collector-feed.xqy (https://github.com/marklogic/infostudio-plugins/blob/master/collectors/collector-feed.xqy):

let $entries := …
let $entry-count := count($entries)
let $transaction-size := 100
let $total-transactions := ceiling($entry-count div $transaction-size)

(: create transactions by breaking the document set into maps;
   each map's documents are saved to the db in their own transaction :)
let $transactions :=
  for $i at $index in 1 to $total-transactions
  let $map := map:map()
  let $start := (($i - 1) * $transaction-size) + 1
  let $finish := min((($start - 1 + $transaction-size), $entry-count))
  let $put :=
    for $entry in ($entries)[$start to $finish]
    let $id := fn:concat(fn:string($entry/atom:id), ".xml")
    return map:put($map, $id, $entry)
  return $map

(: the callback function for ingest :)
let $ingestion :=
  for $transaction at $index in $transactions
  return
    infodev:transaction($transaction, $ticket-id,
      xdmp:function(xs:QName("feed:process-file")),
      $policy-deltas, $index, (), ())

Replace $entries with $table/row, change let $id to use your URIs, and replace that infodev:transaction call with your own spawn that takes an entire map, loops over its keys, and does an insert for each key/value within the map.
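The spawn target could be something as simple as this (untested sketch; the module name batch-insert.xqy and the $BATCH external variable are just placeholder names):

(: batch-insert.xqy -- minimal sketch of a spawn target that takes a whole
   batch. It expects a map:map of uri -> node in the external variable
   $BATCH and inserts every entry within this task's single transaction. :)
xquery version "1.0-ml";

declare variable $BATCH as map:map external;

for $uri in map:keys($BATCH)
return xdmp:document-insert($uri, map:get($BATCH, $uri))

The caller then builds each map as in the sample above and, in place of the infodev:transaction call, does something like xdmp:spawn('batch-insert.xqy', (xs:QName('BATCH'), $map)), assuming your MarkLogic version accepts a map:map as an external variable there.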
Cheers,
Geert

From: [email protected] [mailto: [email protected]] On Behalf Of Todd Gochenour
Sent: Saturday, February 25, 2012 7:10
To: MarkLogic Developer Discussion
Subject: Re: [MarkLogic Dev General] Processing Large Documents?

It's time for me to pick this project up now that the work week has passed. I'm attempting to implement Michael Blakeley's recommendation to move the SQL blob content into its own document as part of this initial load/chunk phase.

Here's how I see the strategy. As I iterate through each record in a table, when there is an element with the attribute xsi:type="xs:hexBinary", I want to extract this data, generate a new document, replace the original element with a reference to this new document, and then spawn two 'document-insert.xqy' operations, one for the original document and one for the binary document.

These are my current issues. I haven't figured out how to convert the hexBinary into binary so that when I fetch the document I get the correct format; I probably need to be setting the mime type. Also, the @xsi:type attribute isn't part of the table_data, so I can't trigger blob processing based upon that attribute; I'm currently only processing elements called file_blob.
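One option I may try for the mime type: the response content type is generally derived from the document URI's file extension (per the server's mimetypes configuration), so keeping the original extension on $file-uri may be enough; otherwise it could be set explicitly when serving the blob back. A rough, untested sketch (the module name get-file.xqy, the request field name, and the application/pdf value are just examples):

xquery version "1.0-ml";

(: get-file.xqy -- hypothetical sketch of serving a stored blob from an
   HTTP app server with an explicit content type, rather than relying on
   the URI extension mapping. :)
let $uri := xdmp:get-request-field("uri")
return (
  xdmp:set-response-content-type("application/pdf"),
  fn:doc($uri)
)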
My current working copy now looks like:

(: query console :)
xquery version "1.0-ml";

for $table in xdmp:document-get('C:\Users\servicelogix\slx\us_co_slx.xml')/*/*/table_data
let $table-name := $table/@name/string()
let $database-name := $table/../@name/string()
for $row in $table/row
let $record-uri := concat('/', $database-name, '/', $table-name, '/id-', $row/field[@name='id'])
let $file-uri := concat('/', $database-name, '/', $table-name, '/file-', $row/field[@name='id'])
let $blob :=
  if ($row/field[@name='file_blob'][1])
  then binary { xs:hexBinary($row/field[@name='file_blob'][1]) }
  else ()
let $record :=
  element { $table-name } {
    $row/field[text() and not(@name='file_blob')]/element {
      if (number(substring(@name, 1, 1)) = number(substring(@name, 1, 1)))
      then concat('_', @name)
      else @name
    } { text() },
    if ($blob) then element file_uri { $file-uri } else ()
  }
return (
  if ($record)
  then xdmp:spawn('document-insert.xqy',
         (xs:QName('URI'), $record-uri, xs:QName('NEW'), $record))
  else (),
  if ($blob)
  then xdmp:spawn('document-insert.xqy',
         (xs:QName('URI'), $file-uri, xs:QName('NEW'), $blob))
  else ()
)
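(document-insert.xqy itself isn't quoted in this thread; a minimal module matching the xs:QName('URI') / xs:QName('NEW') pairs passed above would look something like this sketch:)

(: document-insert.xqy -- sketch of the spawned insert module, assuming it
   only receives the target URI and the node to insert. $NEW may be an
   element (the record) or a binary node (the blob). :)
xquery version "1.0-ml";

declare variable $URI as xs:string external;
declare variable $NEW as node() external;

xdmp:document-insert($URI, $NEW)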
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general
