Hi Matisse,
1. If the files have a similar structure, load them into a collection:
./se_term test_db
create collection "files"&
2. Use log-less mode (and a single transaction), which se_term supports
(http://sedna.org/adminguide/AdminGuidesu4.html#x8-150002.4).
Write load.xq:
\nac
\ll
load '/tmp/docs/doc1.xml' 'doc1' 'files'&
load '/tmp/docs/doc2.xml' 'doc2' 'files'&
...
load ...
\commit
\quit
Run:
./se_term -file load.xq test_db
3. Don't create any indices before the bulk load.
4. Increase buffers, e.g.:
se_sm -bufs-num 16000 test_db
This sets about 1 GB of buffers. The exact number to use depends on the
structure of your documents.
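With 3700 files, the load.xq from step 2 is easier to generate than to write by hand. A minimal sketch of one way to do that in POSIX shell follows. The function name gen_load_script is hypothetical (it is not part of Sedna), and the sketch assumes that each document is named after its file without the .xml extension and that no path contains a single quote:

```shell
#!/bin/sh
# gen_load_script DIR COLLECTION
# Hypothetical helper: prints an se_term script that loads every *.xml file
# in DIR into COLLECTION, in log-less mode and inside one transaction.
gen_load_script() {
    dir=$1
    coll=$2
    # Turn autocommit off and switch to log-less mode, as in step 2.
    printf '%s\n' '\nac' '\ll'
    for f in "$dir"/*.xml; do
        # Skip the unexpanded pattern when the directory has no XML files.
        [ -e "$f" ] || continue
        # Assumption: the document name is the file name without ".xml".
        name=$(basename "$f" .xml)
        printf "load '%s' '%s' '%s'&\n" "$f" "$name" "$coll"
    done
    # Commit the single transaction and leave se_term.
    printf '%s\n' '\commit' '\quit'
}

# Example use:
#   gen_load_script /tmp/docs files > load.xq
```

The output has the same shape as the hand-written load.xq in step 2, so it can be passed to se_term in the same way.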
Ivan Shcheklein,
Sedna Team
> I have a directory containing 3700 XML files, about 550MB in total.
> What is the fastest way to import these into sedna?
>
> For comparison I am able to import these files into BaseX in about 5
> minutes on a 2.5 GHz MacBook Pro.
>
> -M
>
>
> ------------------------------------------------------------------------------
> Virtualization & Cloud Management Using Capacity Planning
> Cloud computing makes use of virtualization - but cloud computing
> also focuses on allowing computing to be delivered as a service.
> http://www.accelacomm.com/jaw/sfnl/114/51521223/
> _______________________________________________
> Sedna-discussion mailing list
> Sedna-discussion@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/sedna-discussion
>