Hi Matisse,

1. If the files have a similar structure, load them into a collection:

./se_term test_db
create collection "files"&

2. Use se_term's log-less mode and load everything in one transaction (see
http://sedna.org/adminguide/AdminGuidesu4.html#x8-150002.4):

Write load.xq:

\nac
\ll
load '/tmp/docs/doc1.xml' 'doc1' 'files'&
load '/tmp/docs/doc2.xml' 'doc2' 'files'&
...
load ...
\commit
\quit

Run:

./se_term -file load.xq test_db

3. Don't create any indices before the bulk load.
4. Increase the number of buffers, e.g.:

se_sm -bufs-num 16000 test_db

This sets about 1GB of buffers (16000 pages of 64KB each). The right number
depends on the structure of your documents.
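
For step 2, with 3700 files it is easier to generate load.xq than to write it
by hand. Below is a minimal sketch, not something shipped with Sedna: the
function name gen_load_script and the script name gen_load.sh are made up for
illustration. It assumes all files sit in one directory, end in .xml, have no
single quotes in their names, and that the base names are unique (they become
the document names in the collection).

```shell
#!/bin/sh
# gen_load_script DIR COLLECTION  (hypothetical helper, not part of Sedna)
# Prints an se_term script that loads every DIR/*.xml file into COLLECTION
# in one transaction with log-less mode switched on.
gen_load_script() {
    dir=$1
    coll=$2
    # Autocommit off, log-less mode on.
    printf '%s\n' '\nac' '\ll'
    for f in "$dir"/*.xml; do
        # Skip the unexpanded pattern when the directory has no .xml files.
        [ -e "$f" ] || continue
        # Document name = file name without the .xml extension.
        name=$(basename "$f" .xml)
        printf "load '%s' '%s' '%s'&\n" "$f" "$name" "$coll"
    done
    # Commit the single transaction and leave se_term.
    printf '%s\n' '\commit' '\quit'
}

# When called with arguments, write the script to standard output.
if [ "$#" -ge 2 ]; then
    gen_load_script "$1" "$2"
fi
```

For example, "sh gen_load.sh /tmp/docs files > load.xq" writes the script,
which you then run with se_term as in step 2.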

Ivan Shcheklein,
Sedna Team


> I have a directory containing 3700 XML files, about 550MB in total.
> What is the fastest way to import these into sedna?
>
> For comparison I am able to import these files into BaseX in about 5
> minutes on a 2.5 GHz MacBook Pro.
>
> -M
>
>
> ------------------------------------------------------------------------------
> Virtualization & Cloud Management Using Capacity Planning
> Cloud computing makes use of virtualization - but cloud computing
> also focuses on allowing computing to be delivered as a service.
> http://www.accelacomm.com/jaw/sfnl/114/51521223/
> _______________________________________________
> Sedna-discussion mailing list
> Sedna-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>