Today I released xmlsh 1.2.0 at http://www.xmlsh.org

This release is a fairly significant one, so the minor version and jar file
names were incremented. A major new feature is support for streaming XDM
without temporary files or serialization. This is supported by the classic
pipes ("|") when "set -xpipe" is used, as well as by a new feature, named pipes
(created with xmkpipe). Special support for XDM streaming is being added to
many of the core commands.

I will be updating the documentation over the next few weeks to cover these
new features in more detail, but for now the release is out and available for
experimentation.

Updates were made to the following extension modules:
aws, json, marklogic, jmx, calabash, exist

The marklogic put command has been updated to support streaming from either
named pipes or stdin. A necessary feature to support this is automatic naming
of document URIs. Now, in streaming mode (and in other modes in the future),
you can use {seq} for sequential naming or {random} for 64-bit random names.
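
To make the naming scheme concrete, here is a rough Python sketch of how a
template containing {seq} or {random} could be expanded into one URI per
document. This is only an illustration and not the xmlsh implementation: the
function name uri_namer, the starting value of the sequence, the hex rendering
of the random value, and the way the base URI is joined to the name are all
assumptions.

```python
import itertools
import secrets


def uri_namer(base_uri, template):
    """Return a function that yields a new document URI on each call.

    Hypothetical helper: the {seq} and {random} placeholders come from the
    announcement, everything else here is an assumption for illustration.
    """
    counter = itertools.count(1)  # assumption: sequence numbering starts at 1

    def next_uri():
        name = template
        if "{seq}" in name:
            name = name.replace("{seq}", str(next(counter)))
        if "{random}" in name:
            # 64 random bits, rendered as 16 hex digits (rendering is assumed)
            name = name.replace("{random}", format(secrets.randbits(64), "016x"))
        # assumption: the base URI and the name are joined with a single "/"
        return base_uri.rstrip("/") + "/" + name

    return next_uri
```

Each call to the returned function gives the next name, so a streaming
consumer can ask for one URI per incoming document without knowing in advance
how many documents will arrive.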


An example of where you might use this is streaming large or continuous feeds.
For example, a large database dump can be split into row records and inserted
into an XML database without intermediate files:

xmkpipe -xml feed
xsql ... "SELECT * from table" | xsplit -n -stream feed &
ml:put -stream feed -baseuri /test -uri  'doc{random}.xml'
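
The shape of the pipeline above (a splitter running in the background feeding
a consumer through a pipe) can be sketched in Python with a bounded in-memory
queue. This is only an analogy under stated assumptions, not how xmlsh or
ml:put work internally: split_rows, put_records and run are hypothetical
names, a plain dict stands in for the database, and row values are not
XML-escaped.

```python
import queue
import threading

_DONE = object()  # sentinel that marks the end of the stream


def split_rows(rows, pipe):
    """Producer: emit each row as its own record, then signal end of stream."""
    for row in rows:
        pipe.put("<row>%s</row>" % row)  # no XML escaping in this sketch
    pipe.put(_DONE)


def put_records(pipe, store):
    """Consumer: store records as they arrive, naming them sequentially."""
    count = 0
    while True:
        record = pipe.get()
        if record is _DONE:
            break
        count += 1
        store["/test/doc%d.xml" % count] = record
    return count


def run(rows, maxsize=2):
    """Wire producer and consumer together through a bounded queue."""
    pipe = queue.Queue(maxsize=maxsize)  # producer blocks while the queue is full
    store = {}  # stand-in for the database
    producer = threading.Thread(target=split_rows, args=(rows, pipe))
    producer.start()
    count = put_records(pipe, store)
    producer.join()
    return count, store
```

Because the queue is bounded, at most maxsize records are held in memory at
any time, which is the property that lets a feed of any length pass through
without temporary files.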

This concept also works with continuous feeds. For example, I have experimented
with the (upcoming) Twitter extension module and was able to stream a million
tweets into MarkLogic without the use of temporary files or serialization.
It only stopped because I killed it off, not because of memory or file
consumption. (Eventually the MarkLogic database would fill up, though:
1 million tweets took about 4 GB of indexed space.)


-----------------------------------------------------------------------------
David Lee
Lead Engineer
MarkLogic Corporation
Phone: +1 650-287-2531
Cell:  +1 812-630-7622
www.marklogic.com
