Thanks Hari,
First, remember that the Flume agent is embedded in the Appender. So the Log4j
EventSource is passing the event to the FileChannel. The Avro Sink then reads
from the channel and sends it on. The unit test has two Avro Sources listening
with each associated with its own MemoryChannel. The test logs 10 events then
reads the 10 events from the primary MemoryChannel, each within its own
transaction. The test then stops the primary source. Then it logs 10 more
events and tries to read them from the alternate MemoryChannel.
The code to read from the channel looks like:
for (int i = 0; i < 10; ++i) {
StructuredDataMessage msg = new StructuredDataMessage("Test", "Test
Primary " + i, "Test");
EventLogger.logEvent(msg);
}
for (int i = 0; i < 10; ++i) {
Transaction transaction = primaryChannel.getTransaction();
transaction.begin();
Event event = primaryChannel.take();
Assert.assertNotNull(event);
String body = getBody(event);
String expected = "Test Primary " + i;
Assert.assertTrue("Channel contained event, but not expected
message. Received: " + body,
body.endsWith(expected));
transaction.commit();
transaction.close();
}
primarySource.stop();
for (int i = 0; i < 10; ++i) {
StructuredDataMessage msg = new StructuredDataMessage("Test", "Test
Alternate " + i, "Test");
EventLogger.logEvent(msg);
}
for (int i = 0; i < 10; ++i) {
Transaction transaction = alternateChannel.getTransaction();
transaction.begin();
Event event = alternateChannel.take();
Assert.assertNotNull(event);
String body = getBody(event);
String expected = "Test Alternate " + i;
/* When running in Gump Flume consistently returns the last event
from the primary channel after
the failover, which fails this test */
Assert.assertTrue("Channel contained event, but not expected
message. Expected: " + expected +
" Received: " + body, body.endsWith(expected));
transaction.commit();
transaction.close();
}
When I run this on my Mac it never fails. But Gump fails almost every time
returning "Channel contained event, but not expected message. Expected: Test
Alternate 0 Received: <128>1 2012-08-30T05:50:04.143Z vmgump MyApp - Test
[Test@18060][mdc@18060] Test Primary 9"
Do we have any tests that are similar to this? I didn't see anything that
tests failover in this way but I might have missed it.
Ralph
On Aug 30, 2012, at 9:22 AM, Hari Shreedharan wrote:
> Hi Ralph,
>
> Sorry missed this message earlier. How are you simulating failover in your
> test - I did not look at your code. If the message was written by the Avro
> Source on the client and the Avro Sink on the other side simply did not get a
> success would cause the failover sink processor to retry the same message
> since it would be rolled back by the sink, and hence the channel will end up
> making it available for another sink. Generally, if a message is not ack-ed
> as being successfully written to the channel by the Avro Source, the sink
> will rollback the transaction - and throw an EventDeliveryException - and in
> case of Failover SinkProcessor, it will cause the next sink to pick it up.
>
> Also, note that Flume guarantees at least once semantics and weak ordering.
> If a failure happens, it is possible that there will be duplicates.
>
> And no, this is not related to any of the FileChannel issues we have been
> fixing.
>
> Thanks,
> Hari
>
> --
> Hari Shreedharan
>
>
> On Thursday, August 30, 2012 at 7:50 AM, Ralph Goers wrote:
>
>> I'm going to try again. Does this problem sound familiar to anyone?
>>
>> Ralph
>>
>> On Aug 27, 2012, at 3:36 PM, Ralph Goers wrote:
>>
>>> Does anyone have any thoughts on this? Is it possibly related to any of the
>>> issues already being fixed on the FileChannel?
>>>
>>> Ralph
>>>
>>> On Aug 26, 2012, at 4:05 PM, Ralph Goers wrote:
>>>
>>>> I have successfully embedded Flume into the Log4j 2 Appender. However, I
>>>> have a unit test that has Flume fail over from one AvroSink to another.
>>>> When this happens under some circumstances I am getting the last message
>>>> successfully delivered to the first source as the first message to the
>>>> second source, which doesn't seem correct. The unit test is
>>>> athttps://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java
>>>>
>>>> (http://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java).
>>>> The odd thing is that I cannot get this to fail on my local machine - it
>>>> only fails when Gump runs it, but it fail fairly consistently.
>>>>
>>>> The unit test has the AppenderSource connect to a FileChannel. Two
>>>> AvroSinks are connected to the FileChannel via the Failover processor.
>>>>
>>>> Is this a known behavior?
>>>>
>>>> Ralph
>
>