[ https://issues.apache.org/jira/browse/COUCHDB-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14603746#comment-14603746 ]

ASF GitHub Bot commented on COUCHDB-2724:
-----------------------------------------

GitHub user kocolosk opened a pull request:

    https://github.com/apache/couchdb-chttpd/pull/38

    Use an internal buffer to increase _changes throughput

    This set of commits refactors the `changes_callback` function to keep an 
internal data buffer and flush it over the socket after a configurable number of 
bytes has accumulated. It also adds support for a new callback message that 
causes the buffer to be flushed at the end of each traversal of the 
sequence tree for "live" feeds.
    
    COUCHDB-2724
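The buffering approach can be sketched in Python (CouchDB itself is Erlang; `ChunkBuffer`, its methods, and the example threshold are illustrative assumptions, not the patch's actual code):

```python
class ChunkBuffer:
    """Accumulate response rows and write them out in batches.

    Illustrative sketch of the strategy described above: rows are held in an
    internal buffer and only written to the socket once a configurable number
    of bytes has accumulated.
    """

    def __init__(self, send, threshold=1490):
        self.send = send          # callable that writes one chunk to the socket
        self.threshold = threshold
        self.parts = []
        self.size = 0

    def add(self, row):
        self.parts.append(row)
        self.size += len(row)
        if self.size >= self.threshold:
            self.flush()

    def flush(self):
        # Write everything buffered so far as a single chunk, then reset.
        if self.parts:
            self.send(b"".join(self.parts))
            self.parts = []
            self.size = 0
```

With a 10-byte threshold, five 4-byte rows produce two socket writes (one at the threshold, one on the final flush) instead of five, which is the throughput win the PR is after.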

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/couchdb-chttpd 2724-chunked-buffering

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/couchdb-chttpd/pull/38.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #38
    
----
commit 87fa8166701dd2a183d10e93a06f7f5ccb39478e
Author: Adam Kocoloski <[email protected]>
Date:   2015-06-22T23:15:23Z

    Remove temporary upgrade clause
    
    This clause was only needed for a very old hot code upgrade.

commit f45c8b2c4262c682000b5e2ac50d16c9cade3ec6
Author: Adam Kocoloski <[email protected]>
Date:   2015-06-22T23:41:37Z

    Use a record for changes_callback accumulator
    
    This change allows us to evolve the accumulator in a less brittle way
    and sets the stage for new data to be held in the accumulator to address
    COUCHDB-2724.
    
    In the course of this change I also switched the feed labels from strings
    to atoms (they're only used for pattern matching in the accumulator, and
    multiple matches are executed for every row in the feed, so it seemed
    silly to use Erlang lists for that comparison), and I made the start of a
    chunked response explicit instead of guessing it heuristically from other
    contents of the accumulator.
    
    COUCHDB-2724
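In Erlang the accumulator becomes a record; the equivalent idea, sketched in Python terms (all field names here are assumptions for illustration, not the patch's actual fields), is a small structured accumulator instead of a bare tuple:

```python
from dataclasses import dataclass, field

# Illustrative stand-in for an Erlang record accumulator. Named fields can be
# added or reordered without breaking every pattern match, which is the
# "less brittle" property the commit message describes.
@dataclass
class ChangesAcc:
    feed: str = "normal"          # the patch uses atoms rather than strings here
    buffer: list = field(default_factory=list)
    bufsize: int = 0
    started_chunked: bool = False  # explicit flag instead of a heuristic guess
```

A new field (like the buffer itself) slots in with a default value, so existing call sites keep working.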

commit a2a7a04141fe2911206f8dcd22b4490dfd6855e0
Author: Adam Kocoloski <[email protected]>
Date:   2015-06-23T01:45:26Z

    Buffer rows for normal/longpoll feeds
    
    This patch causes the coordinator to accumulate data in its own buffer,
    reducing the number of calls that write data to the socket. The size of
    the buffer is configurable:
    
    [httpd]
    chunked_response_buffer = 1490
    
    The default is chosen so that a full buffer approximately fills a
    standard Ethernet frame (1500-byte MTU, less protocol overhead).
    
    COUCHDB-2724
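A minimal Python sketch of reading that setting (the ini text mirrors the snippet above; the parsing code is illustrative, since CouchDB's real configuration layer is Erlang):

```python
import configparser

# Config fragment matching the [httpd] section shown in the commit message.
ini = """
[httpd]
chunked_response_buffer = 1490
"""

cfg = configparser.ConfigParser()
cfg.read_string(ini)

# Fall back to 1490 bytes when the key is absent, matching the stated default.
threshold = cfg.getint("httpd", "chunked_response_buffer", fallback=1490)
```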

commit 2e9a10ad2f9dc5ef94b5cb309fae96214fa8b383
Author: Adam Kocoloski <[email protected]>
Date:   2015-06-24T17:47:01Z

    Add basic buffering support for other feed types
    
    With this code it is possible that changes are buffered for a long
    period of time and not sent out. Will work on addressing that next.
    
    COUCHDB-2724

commit f73ddafd0de7a30756b0d21abfe110142eeade6b
Author: Adam Kocoloski <[email protected]>
Date:   2015-06-24T17:56:18Z

    Execute a callback for every complete DB traversal
    
    This ensures that we don't enter a receive statement waiting for new DB
    updates without first flushing the buffer.
    
    COUCHDB-2724
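The flush-before-receive pattern this commit describes can be sketched as follows (a Python illustration; `wait_for_updates`, `process_batch`, and the buffer API are assumptions, not the actual Erlang code):

```python
def live_feed_loop(buffer, wait_for_updates, process_batch):
    """Sketch of the flush-before-receive pattern: never block waiting for
    new DB updates while buffered rows sit unsent."""
    while True:
        # A complete traversal of the sequence tree has finished: flush the
        # buffer before blocking, so clients see every row produced so far.
        buffer.flush()
        updates = wait_for_updates()   # blocks until new updates (None = feed over)
        if updates is None:
            return
        process_batch(updates)         # may append more rows to the buffer
```

The key ordering is that the flush happens before every blocking wait, which is exactly what the end-of-traversal callback guarantees.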

----


> Batch rows in streaming responses to improve throughput
> -------------------------------------------------------
>
>                 Key: COUCHDB-2724
>                 URL: https://issues.apache.org/jira/browse/COUCHDB-2724
>             Project: CouchDB
>          Issue Type: Improvement
>      Security Level: public(Regular issues) 
>          Components: Database Core, HTTP Interface
>            Reporter: Adam Kocoloski
>            Assignee: Adam Kocoloski
>
> [~tonysun83] showed me some profiling of the {{_changes}} feed which 
> indicated that the coordinator process was spending about 1/3 of its time 
> executing inside {{send_delayed_chunk}}. We can reduce the number of 
> invocations of this function by buffering individual rows until we reach a 
> (configurable) threshold for sending the data out over the wire.
> We'll of course want to be careful about continuous feeds; if we're in the 
> "slow drip" portion of the feed we'll obviously want to emit right away 
> instead of adding latency unnecessarily.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
