> He could pipe uncompressed output from node process A onto compressor node process B.
Could he use buffers? Would that be faster? What is the overhead of the
piping compared to the compressing? I assume it would be minor.

On Thu, Oct 25, 2012 at 2:38 PM, Jorge <[email protected]> wrote:
> Hi Isaac,
>
> He could pipe uncompressed output from node process A onto compressor node
> process B. That's the Node Way®, isn't it?
>
> Or he could do it all in a single node process, but that would mean
> delegating CPU-intensive jobs to background threads, the mere idea of
> which unnerves more than one person here.
>
> Or not?
>
> Cheers,
> --
> Jorge.
>
> On 25/10/2012, at 23:24, Isaac Schlueter wrote:
>
> > Jorge,
> >
> > Please do not make snarky remarks about Node on this mailing list. If
> > you have a problem with something, bring it up in a new thread. If
> > you have something to add to this thread, then please do so, but this
> > is not helpful.
> >
> > On Thu, Oct 25, 2012 at 10:15 PM, Jorge <[email protected]> wrote:
> >> Threads are evil™; don't use threads.
> >>
> >> The Node Way® (just don't ask) is to pipeline processes, as in the
> >> good ol' 70s. Flower Power, peace and love, bro, etc.
> >>
> >> Cheers,
> >> --
> >> Jorge.
> >>
> >> On 25/10/2012, at 19:24, Vadim Antonov wrote:
> >>
> >>> There are 8 processes (1 per core, created with the cluster lib), and
> >>> every process has 6 threads.
> >>> Do you have an ETA for the thread pool updates? Is there a way we can
> >>> help you?
> >>>
> >>> Thank you.
> >>> --
> >>> Vadim
> >>>
> >>> On Wednesday, October 24, 2012 5:41:36 PM UTC-7, Ben Noordhuis wrote:
> >>> On Thu, Oct 25, 2012 at 12:26 AM, Vadim Antonov <[email protected]> wrote:
> >>>> Hi everybody,
> >>>>
> >>>> I've tried to google for information about node.js zlib performance
> >>>> and didn't find anything useful.
> >>>> I work on a high-load API which communicates with multiple backend
> >>>> servers.
> >>>> Throughput on 1 VM with 8 cores is around 200 QPS, and for every
> >>>> query the application makes up to 50 queries to caches/backends.
> >>>> Every response from a cache/backend is compressed and requires
> >>>> decompression. It ends up that the application needs to perform up to
> >>>> 10,000 decompressions per second.
> >>>>
> >>>> Based on the zlib code, every decompression takes a thread from the
> >>>> thread pool (one binding to the C++ code per decompression; we use
> >>>> the naive method
> >>>> http://nodejs.org/api/zlib.html#zlib_zlib_gunzip_buf_callback --
> >>>> there is no way to use the stream methods):
> >>>> https://github.com/joyent/node/blob/master/lib/zlib.js#L272
> >>>> https://github.com/joyent/node/blob/master/src/node_zlib.cc#L430
> >>>>
> >>>> The service has started to see huge performance spikes a couple of
> >>>> times a day, and they come from the decompression code: from time to
> >>>> time a decompression takes up to 5 seconds, and all decompression
> >>>> calls are blocked during that time.
> >>>> I think the issue is coming from the thread pool (uv_work_t) which
> >>>> zlib is using. Does anybody else see the same behavior? Are there any
> >>>> workarounds for it? Where can I find documentation about it? V8 code?
> >>>> For now we've started to use the snappy library
> >>>> (https://github.com/kesla/node-snappy) with sync
> >>>> compression/decompression calls, but the service still needs to
> >>>> decompress the gzipped backend responses...
> >>>>
> >>>> To illustrate what I'm talking about, here is a small example: it
> >>>> generates 'count' buffers, decompresses them 'count2' times, and
> >>>> writes out all the timings plus min/max/avg.
> >>>>
> >>>> var _ = require('underscore');
> >>>> var rbytes = require('rbytes');
> >>>> var step = require('step');
> >>>> var zlib = require('zlib');
> >>>>
> >>>> var count = 10;     // number of compressed buffers to generate
> >>>> var count2 = 1000;  // number of decompressions to run
> >>>> var count3 = 0;     // decompressions completed so far
> >>>> var len = 1024;     // size of each random payload, in bytes
> >>>> var buffers = [];
> >>>> var timings = {};   // elapsed ms -> number of calls that took that long
> >>>> var totalTime = 0;
> >>>>
> >>>> function addCompressed(done) {
> >>>>   zlib.gzip(rbytes.randomBytes(len), function (error, compressed) {
> >>>>     buffers.push(compressed);
> >>>>     done();
> >>>>   });
> >>>> }
> >>>>
> >>>> function decompress(done) {
> >>>>   var time = Date.now();
> >>>>   zlib.gunzip(buffers[Math.floor(Math.random() * count)],
> >>>>       function (error, decompressed) {
> >>>>     if (error) {
> >>>>       console.log(error);
> >>>>     }
> >>>>     var total = Date.now() - time;
> >>>>     totalTime += total;
> >>>>     if (!timings[total]) {
> >>>>       timings[total] = 0;
> >>>>     }
> >>>>     timings[total]++;
> >>>>     ++count3;
> >>>>     if (done && count3 === count2) {
> >>>>       done();
> >>>>     }
> >>>>   });
> >>>> }
> >>>>
> >>>> step(
> >>>>   function genBuffers() {
> >>>>     for (var i = 0; i < count; ++i) {
> >>>>       addCompressed(this.parallel());
> >>>>     }
> >>>>   },
> >>>>   function runDecompression() {
> >>>>     for (var i = 0; i < count2; ++i) {
> >>>>       decompress(this);
> >>>>     }
> >>>>   },
> >>>>   function writeTotal() {
> >>>>     var min = null;
> >>>>     var max = -1;
> >>>>     // underscore passes (value, key), i.e. (occurrences, ms) here
> >>>>     _.each(timings, function (occurrences, ms) {
> >>>>       ms = Number(ms);
> >>>>       max = Math.max(ms, max);
> >>>>       min = (min === null) ? ms : Math.min(min, ms);
> >>>>       console.log(ms + ' ' + occurrences);
> >>>>     });
> >>>>     console.log('min ' + min);
> >>>>     console.log('max ' + max);
> >>>>     console.log('avg ' + totalTime / count2);
> >>>>   }
> >>>> );
> >>>>
> >>>> Here are the results for different numbers of decompressions
> >>>> (count2, then min/max/avg timings in milliseconds):
> >>>>
> >>>>     count2   min    max    avg
> >>>>     10       0      1      0.1
> >>>>     100      1      6      3.8
> >>>>     1000     19     47     30.7
> >>>>     10000    149    382    255.0
> >>>>     100000   4120   18607  16094.3
> >>>>
> >>>> Decompression time grows with the number of concurrent
> >>>> decompressions. Is there a way to make it faster, or to limit the
> >>>> number of threads zlib is using?
> >>>
> >>> How many active threads do you see in e.g. htop? There should
> >>> preferably be as many threads as there are cores in your machine (give
> >>> or take one).
> >>>
> >>> Aside, the current thread pool implementation is a known bottleneck in
> >>> node right now. We're working on addressing that in master, but it's
> >>> not done yet.
> >>>
> >>> --
> >>> Job Board: http://jobs.nodejs.org/
> >>> Posting guidelines:
> >>> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
> >>> You received this message because you are subscribed to the Google
> >>> Groups "nodejs" group.
> >>> To post to this group, send email to [email protected]
> >>> To unsubscribe from this group, send email to
> >>> [email protected]
> >>> For more options, visit this group at
> >>> http://groups.google.com/group/nodejs?hl=en
