On 01/11/2015 12:34 AM, Tiru Srikantha wrote:
Yeah, I'm looking at this as well. I don't like losing bulk support,
because then throughput drops under high load due to per-request HTTP overhead.
If I didn't care about bulk it'd just be a quick rewrite. The other
problem I ran into around bulk operations on a buffered queue is that
there are 4 points when you want to flush to the output:

1. Queued message count reaches the count to send.
2. Adding the new message would push the queued size past the max bulk
message size, even if the count is below the max, due to large messages.
3. A timer expires to force a flush to the output target even if the
count or max size hasn't been hit yet, to get the data into the output
target in a timely manner.
4. Plugin is shutting down.
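The first two conditions above can be sketched as a pair of checks; this is an illustrative sketch with made-up names and thresholds, not Heka's actual API. Conditions 3 and 4 (ticker expiry and shutdown) are external events that force a flush regardless of these checks:

```go
package main

import "fmt"

// batchState tracks the queued batch; field names are hypothetical.
type batchState struct {
	count    int // messages currently queued
	bytes    int // total size of queued messages
	maxCount int // flush once this many messages are queued
	maxBytes int // never let a batch exceed this size
}

// shouldFlushBefore reports whether the batch must be flushed *before*
// appending a new message of size n (condition 2: the new message would
// push the batch past maxBytes).
func (b *batchState) shouldFlushBefore(n int) bool {
	return b.count > 0 && b.bytes+n > b.maxBytes
}

// shouldFlushAfter reports whether the batch should be flushed after an
// append (condition 1: count threshold reached).
func (b *batchState) shouldFlushAfter() bool {
	return b.count >= b.maxCount
}

func main() {
	b := &batchState{count: 3, bytes: 900, maxCount: 10, maxBytes: 1000}
	fmt.Println(b.shouldFlushBefore(200)) // a large message would overflow the batch
	fmt.Println(b.shouldFlushAfter())     // count threshold not yet hit
}
```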

Because of this I'm actually rewriting a lot of the
plugins/buffered_output.go file in my local fork, teasing the "read the
next record from the file pile" operation apart from the "send the next
record" operation. What I'm aiming for is a BulkBufferedOutput interface
with a SendRecords(records [][]byte) method, implemented in lieu of
BufferedOutput's SendRecord(record []byte), plus some shared code to
buffer new messages and read buffered messages. SendRecords would not
advance the cursor until the bulk operation succeeded, as you might
expect.
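As a rough sketch of what that interface split and cursor behavior might look like (all names here are hypothetical; Heka's real plugin API carries more supporting types):

```go
package main

import "fmt"

// Existing single-record interface, as described above.
type BufferedOutput interface {
	SendRecord(record []byte) error
}

// Proposed bulk variant.
type BulkBufferedOutput interface {
	SendRecords(records [][]byte) error
}

// drainBulk hands batches of pending records to a BulkBufferedOutput,
// only advancing the cursor (here modeled as an index into pending)
// once the whole bulk send succeeds.
func drainBulk(out BulkBufferedOutput, pending [][]byte, batchSize int) (sent int, err error) {
	for sent < len(pending) {
		end := sent + batchSize
		if end > len(pending) {
			end = len(pending)
		}
		if err = out.SendRecords(pending[sent:end]); err != nil {
			return sent, err // cursor stays at the last durable point
		}
		sent = end
	}
	return sent, nil
}

// printOutput is a stand-in implementation for demonstration.
type printOutput struct{}

func (printOutput) SendRecords(records [][]byte) error {
	fmt.Printf("sent batch of %d\n", len(records))
	return nil
}

func main() {
	pending := [][]byte{[]byte("a"), []byte("b"), []byte("c")}
	n, _ := drainBulk(printOutput{}, pending, 2)
	fmt.Println(n) // all three records sent, in two batches
}
```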

I recommend reading my other message in this thread (http://is.gd/kzuhxs) for an alternate approach. I think you can achieve what you want with less effort, and better separation of concerns, by doing the batching *before* you pass data into the BufferedOutput. Ultimately each buffer "record" is just a slice of bytes; the buffer doesn't need to know or care whether that slice contains a single serialized message or an accumulated batch of them.
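In other words, the batching can be a trivial accumulation step upstream of the buffer. A minimal sketch, assuming newline-delimited framing (which matches e.g. the Elasticsearch bulk format; a real output might frame batches differently):

```go
package main

import (
	"bytes"
	"fmt"
)

// batch concatenates serialized messages into one newline-delimited
// byte slice, which the existing buffer can store as a single opaque
// "record". The framing choice here is an assumption for illustration.
func batch(msgs [][]byte) []byte {
	var buf bytes.Buffer
	for _, m := range msgs {
		buf.Write(m)
		buf.WriteByte('\n')
	}
	return buf.Bytes()
}

func main() {
	record := batch([][]byte{[]byte(`{"a":1}`), []byte(`{"b":2}`)})
	// The buffer sees one record; the output plugin later sends it
	// as a single bulk request.
	fmt.Printf("%q\n", record)
}
```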

I'll submit a PR once I finish.

I always look forward to PRs, but I'll warn you that one taking the approach described above will likely be rejected. I'm not keen on introducing a separate buffering interface specifically for batches when a simpler change can solve the same problems.

-r
_______________________________________________
Heka mailing list
[email protected]
https://mail.mozilla.org/listinfo/heka
