On Mon, May 14, 2012 at 12:50 AM, Stephen Weiss <[email protected]> wrote:
> I'm new to node.js so forgive what's probably a very newbie question...
>
> I've been trying to use writestreams in a few different use cases - one as
> an HTTP request stream and one just writing to a plain file - and no matter
> what, I always observe the same behavior, where nothing actually ever gets
> written to the stream, until program execution is over.
>
> So, for example, if I write this code:
> var fs = require('fs');
>
> var ws = fs.createWriteStream("/tmp/out", {
>   flags: 'w+'
> });
>
> var i = 0;
>
> while (i < 100000) {
>   console.log("writing");
>   ws.write("random text\n");
>   i++;
> }
>
> ws.end();
>
>
> I will see all the "writing" lines printed, and only after every one of the
> "writing" lines has been printed to my terminal do any of the "random text"
> lines get written to my file.  I could have set it to write 10 lines, or a
> billion lines, and I would see the same behavior.
>
> My problem is, I'm trying to write a routine that generates JSON for several
> million objects and writes those JSON objects to elasticsearch, via the
> elasticsearchclient module, which sends data to elasticsearch via an HTTP
> request (which is also a writestream).  However, my routine always fails,
> because node.js runs out of memory before any data actually gets written to
> the stream.  My routine works great if I only try to index 10 documents -
> once program execution ends, it sends all 10 documents over at once, and
> they are indexed - but when I try to index the entire database, it fails,
> even though I send the writes along 1000 at a time, and it has ample time to
> start sending at least some of the documents.  It would all work great if
> it could just start sending the data as soon as it's buffered, but nothing
> I've found gets it to do that.  What I really need is a "flush" command or
> something but there isn't one listed in the documentation.  The
> documentation would seem to indicate that this should happen automatically,
> but it just doesn't.
>
> Nothing in the documentation seems to indicate that writestreams should work
> this way, so I find this very baffling and frustrating.  Is there any way to
> flush the writestream, to force it to start writing to the stream before it
> runs out of memory?  It seems like a pretty obvious thing, and in other
> programming languages I've never had a problem like this, but I'm just not
> finding the documentation where it explains this.  In most languages, you
> write to a stream and it outputs the data as quickly as it can - it doesn't
> buffer everything until your program is done executing.  I tried listening
> for the "drain" event, but it never fires.  The stream's writable property
> is false right from the start - the kernel buffer seems to be full right
> away.  Nothing really seems to work the way it's documented...
>
> I'm running node 0.6.17.  I'm pretty sure I'm missing something very obvious
> here, but I've scoured the documentation and the forums for hours and I
> can't find anything that helps me solve my problem.  If anyone can please
> help, I'd really appreciate it.  Thanks.

In your example, you're doing all the work in a single "tick" of the
event loop, effectively queuing up 100K write requests before node.js
ever gets a chance to flush any of them.

If you slice up the work like below, you yield back to the event loop
periodically, giving node.js the opportunity to interleave the writes
with the actual I/O:

  var i = 0;
  function work() {
    while (i < 100000) {
      console.log("writing");
      ws.write("random text\n");
      // Every 1000 writes, return to the event loop so the queued
      // data can actually be flushed, then pick up where we left off.
      if (++i % 1000 === 0) return process.nextTick(work);
    }
    ws.end(); // only close the stream once all writes are queued
  }
  work();

-- 
Job Board: http://jobs.nodejs.org/
Posting guidelines: 
https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
You received this message because you are subscribed to the Google
Groups "nodejs" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/nodejs?hl=en
