Hi, Jakob:

We don't currently provide an equivalent to mlcp for Node.js.

A content pump for Node.js might have characteristics similar to the following:

*  forming documents from input records parsed from one or more input streams
*  adding documents to batches based on forest assignment
*  sending batches to the appropriate dnodes, with multiple worker processes 
each issuing multiple concurrent requests
*  adding workers or providing backpressure to input streams as needed to 
maintain optimal throughput

While that's possible and would be an interesting challenge, the streaming 
libraries available on npm and the Node.js client API certainly don't do all of 
that heavy lifting by themselves.
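
As a rough illustration of only the batching and backpressure pieces from the 
list above, here is a minimal sketch. It is not an mlcp equivalent: it runs in 
a single process and knows nothing about forest assignment. The names 
batches, pump, and writeBatch are hypothetical, and writeBatch stands in for 
whatever call sends one batch to the server (for example, a function wrapping 
the client's multi-document write, assuming that call returns a promise).

```javascript
'use strict';

// Group an async iterable of records into arrays of at most `size` items.
async function* batches(records, size) {
  let batch = [];
  for await (const record of records) {
    batch.push(record);
    if (batch.length === size) {
      yield batch;
      batch = [];
    }
  }
  if (batch.length > 0) yield batch; // flush the final partial batch
}

// Send batches through `writeBatch` (hypothetical, caller-supplied) with at
// most `concurrency` requests in flight. Backpressure: the loop waits for a
// free slot before pulling the next batch, so the input is never read faster
// than the writes complete.
async function pump(records, writeBatch, { batchSize = 100, concurrency = 4 } = {}) {
  const inFlight = new Set();
  let written = 0;
  let failure = null;
  for await (const batch of batches(records, batchSize)) {
    if (failure) break; // stop reading input after the first failed write
    const request = Promise.resolve()
      .then(() => writeBatch(batch))
      .then(() => { written += batch.length; },
            (err) => { failure = failure || err; })
      .then(() => { inFlight.delete(request); });
    inFlight.add(request);
    if (inFlight.size >= concurrency) await Promise.race(inFlight);
  }
  await Promise.all(inFlight); // drain the remaining requests
  if (failure) throw failure;
  return written; // number of records in successfully written batches
}

// Assumed usage (not verified against a live server): `rows` is any async
// iterable of { uri, content } descriptors, such as the object stream
// produced by a CSV parser, and `db` is a MarkLogic database client.
//
//   pump(rows, (batch) => db.documents.write(batch).result(),
//        { batchSize: 100, concurrency: 4 });
```

Because the input is consumed with for await, any object-mode Readable works 
as the source, and the URI for each document can be computed per record 
rather than at stream creation.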

Could mlcp be used for ingestion in your environment?


Erik Hennum


________________________________
From: [email protected] 
[[email protected]] on behalf of Jakob Fix 
[[email protected]]
Sent: Monday, February 08, 2016 3:29 PM
To: General Mark Logic Developer Discussion
Subject: [MarkLogic Dev General] using node streams to write many documents 
into database

Hi,

I've found the documentation that explains how to use a WritableStream to get 
/one/ document into MarkLogic, but I couldn't find any example where it shows 
how one could stream /many thousands/ of documents.

The idea is to read a CSV file with > 1M lines as a ReadableStream, parse it 
with csv-parse, and on each "readable" event push the corresponding JSON 
object as a document into MarkLogic.

The signature of db.documents.createWriteStream [1] seems to require a 
document URI when the stream is created, which I cannot supply at that 
point. The example given in the documentation 
on how to load many documents doesn't really scale to "big data proportions" 
... [2].

Thanks for any help.

cheers,
Jakob.

[1]  
https://github.com/marklogic/node-client-api/blob/master/lib/documents.js#L468
[2] http://docs.marklogic.com/guide/node-dev/documents#id_18341