[
https://issues.apache.org/jira/browse/FLUME-936?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13198483#comment-13198483
]
[email protected] commented on FLUME-936:
-----------------------------------------------------
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/3704/
-----------------------------------------------------------
(Updated 2012-02-02 02:27:14.322838)
Review request for Flume.
Changes
-------
I decided to address Prasad's issue with failing rollbacks by implementing my
invariant idea.
I also added in a concurrent sink + source test, with a tight capacity(only
100), and 50 each of randomly committing/rollbacking sinks and sources.
One can note that in it, a failed commit for sources is not exception handled,
so the thread would die and with that the test.
On the other hand, the source thread commits are inside a try/catch and can
fail. If they do, they are just rolled back. In practice this means that
barring agent failure, anything that has been succesfully committed to the
memory channel shouldn't ever be lost.
I think that the code is pretty hard to understand... I have tried to annotate
and document concurrency guards and invariants, if anyone can think of ways to
make it easier to understand let me know.
Other than that, I believe this patch should hopefully be my final one, solving
the outstanding issues. I will be sure to remove the 935 changes from it once I
put it up to the JIRA
Summary
-------
This is an initial go at fixing the threading issues with memory channel.
It uses the preliminary work on FLUME-935 and I have included the code from
that.
The tagging of the events became unnecessary so I dropped that. One thing that
concerns me slightly is how to deal with not having enough space in the queue
to rollback failed takes. One method would be to keep a minimum buffer of
transactionCapacity. Another would be to implement the queue of queues as
suggested in FLUME-889
Anyway, just putting up this early version to see what people think
This addresses bug FLUME-936.
https://issues.apache.org/jira/browse/FLUME-936
Diffs (updated)
-----
flume-ng-core/src/main/java/org/apache/flume/channel/BasicChannelSemantics.java
PRE-CREATION
flume-ng-core/src/main/java/org/apache/flume/channel/BasicTransactionSemantics.java
PRE-CREATION
flume-ng-core/src/main/java/org/apache/flume/channel/ChannelUtils.java
PRE-CREATION
flume-ng-core/src/main/java/org/apache/flume/channel/MemoryChannel.java
d379b64
flume-ng-core/src/test/java/org/apache/flume/channel/TestFanoutChannel.java
ada9a72
flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannel.java
b44030e
flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannelConcurrency.java
PRE-CREATION
flume-ng-core/src/test/java/org/apache/flume/channel/TestMemoryChannelTransaction.java
d18045b
flume-ng-core/src/test/java/org/apache/flume/source/TestExecSource.java
6acbbd5
Diff: https://reviews.apache.org/r/3704/diff
Testing
-------
The original tests pass, though I had to take out the state checks because of
the changes to semantics from the flume-935 code. I also had to add a
transaction.close statement where semantics were not properly being followed
I have to retrofit my new concurrency test since without the tagged events it
cannot fail without checking that the content is correct. I'll put that up
asap, just wanted to get some eyes on this before I head out.
Thanks,
Juhani
> MemoryChannel is not thread safe
> --------------------------------
>
> Key: FLUME-936
> URL: https://issues.apache.org/jira/browse/FLUME-936
> Project: Flume
> Issue Type: Bug
> Components: Channel
> Affects Versions: NG alpha 2
> Reporter: Juhani Connolly
> Assignee: Juhani Connolly
> Fix For: v1.1.0
>
> Attachments: FLUME-936-unittest.patch
>
>
> The memory channel isn't thread safe as a couple of parallel transactions can
> commit/rollback each others entries if called in the wrong order.
> I'm attaching a unit test I made that demonstrates it using a cyclicbarrier
> to force the event order that causes the precondition to fail.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira