[
https://issues.apache.org/jira/browse/THRIFT-401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tyler Kovacs updated THRIFT-401:
--------------------------------
Attachment: remove_slice_from_framed_transport.diff
Patch attached. I'm not sure if there are any tests that can be run to verify
the patch. We have run tests at our application layer and everything looks
good. Please let me know if you'd like any changes to the patch.
> ruby transport class read buffer very slow due to usage of slice!
> -----------------------------------------------------------------
>
> Key: THRIFT-401
> URL: https://issues.apache.org/jira/browse/THRIFT-401
> Project: Thrift
> Issue Type: Improvement
> Components: Library (Ruby)
> Environment: # uname -a
> Linux zvm.local 2.6.9-78.0.1.ELsmp #1 SMP Tue Aug 5 11:02:47 EDT 2008 i686
> i686 i386 GNU/Linux
> Reporter: Tyler Kovacs
> Priority: Minor
> Attachments: after.png, before.png,
> remove_slice_from_framed_transport.diff
>
>
> We use Thrift as a cross-language transport for Hypertable, an open-source
> distributed database. While profiling queries with large responses using the
> Ruby Thrift libraries, we discovered that the majority of time was spent in
> thrift/transport.rb. Specifically, the slice! method, which is used to
> manage the read buffer (@rbuf), was responsible for almost all of the latency.
> We tried an alternative implementation that showed a 300x speedup in our tests.
> Instead of repeatedly calling slice! to alter @rbuf (which apparently is
> extremely expensive), we maintain an offset counter (@rpos) which starts at
> zero and is incremented by sz each time we read from @rbuf. Before and after
> screenshots from kcachegrind are attached.
> I'll copy the monkey patch that we use into the description below, and
> I'll try to assemble a patch later today.
> module Thrift
>   class FramedTransport < Transport
>     def initialize(transport, read=true, write=true)
>       @transport = transport
>       @rbuf = ''
>       @wbuf = ''
>       @read = read
>       @write = write
>       @rpos = 0
>     end
>
>     def read(sz)
>       return @transport.read(sz) unless @read
>       return '' if sz <= 0
>       read_frame if @rpos >= @rbuf.length
>       @rpos += sz
>       @rbuf[@rpos - sz, sz] || ''
>     end
>
>     def borrow(requested_length = 0)
>       read_frame if @rpos >= @rbuf.length
>       # there isn't any more coming, so if it's not enough, it's an error.
>       raise EOFError if requested_length > (@rbuf.length - @rpos)
>       @rbuf[@rpos, requested_length]
>     end
>
>     def consume!(size)
>       @rpos += size
>       @rbuf[@rpos - size, size]
>     end
>
>     private
>
>     def read_frame
>       sz = @transport.read_all(4).unpack('N').first
>       @rpos = 0
>       @rbuf = @transport.read_all(sz)
>     end
>   end
> end
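
For anyone who wants to compare the two buffer-handling strategies outside of
Thrift, here is a minimal standalone sketch. It is not part of the attached
patch: the method names drain_with_slice and drain_with_offset, the 200,000-byte
buffer, and the 4-byte read size are all made up for illustration. It only
contrasts destructive slice! calls with an advancing offset like @rpos. Timings
will depend heavily on the Ruby version, so the 300x figure above may not
reproduce.

```ruby
require 'benchmark'

# Hypothetical helper: consume a buffer in sz-byte chunks by destructively
# removing each chunk from the front with slice!, the way the old code
# managed @rbuf.
def drain_with_slice(buf, sz)
  buf = buf.dup # leave the caller's string untouched
  chunks = []
  chunks << buf.slice!(0, sz) until buf.empty?
  chunks
end

# Hypothetical helper: consume the same buffer by advancing a read offset
# (the role @rpos plays in the monkey patch) and never modifying the buffer.
def drain_with_offset(buf, sz)
  rpos = 0
  chunks = []
  while rpos < buf.length
    chunks << buf[rpos, sz]
    rpos += sz
  end
  chunks
end

if __FILE__ == $PROGRAM_NAME
  data = 'x' * 200_000 # arbitrary size, chosen only for the demo
  Benchmark.bm(8) do |bm|
    bm.report('slice!') { drain_with_slice(data, 4) }
    bm.report('offset') { drain_with_offset(data, 4) }
  end
end
```

Both helpers return the same chunks, so the only difference being measured is
how the read position is tracked.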