Thanks Hari,

First, remember that the Flume agent is embedded in the Appender. So the Log4j 
EventSource is passing the event to the FileChannel. The Avro Sink then reads 
from the channel and sends it on.  The unit test has two Avro Sources listening 
with each associated with its own MemoryChannel.  The test logs 10 events  then 
reads the 10 events from the primary MemoryChannel, each within its own 
transaction.  The test then stops the primary source. Then it logs 10 more 
events and tries to read them from the alternate MemoryChannel.  


The code to read from the channel looks like:

        for (int i = 0; i < 10; ++i) {
            StructuredDataMessage msg = new StructuredDataMessage("Test", "Test 
Primary " + i, "Test");
            EventLogger.logEvent(msg);
        }        
        for (int i = 0; i < 10; ++i) {
            Transaction transaction = primaryChannel.getTransaction();
            transaction.begin();

            Event event = primaryChannel.take();
            Assert.assertNotNull(event);
            String body = getBody(event);
            String expected = "Test Primary " + i;
            Assert.assertTrue("Channel contained event, but not expected 
message. Received: " + body,
                body.endsWith(expected));
            transaction.commit();
            transaction.close();
        }

        primarySource.stop();


        for (int i = 0; i < 10; ++i) {
            StructuredDataMessage msg = new StructuredDataMessage("Test", "Test 
Alternate " + i, "Test");
            EventLogger.logEvent(msg);
        }
        for (int i = 0; i < 10; ++i) {
            Transaction transaction = alternateChannel.getTransaction();
            transaction.begin();

            Event event = alternateChannel.take();
            Assert.assertNotNull(event);
            String body = getBody(event);
            String expected = "Test Alternate " + i;
            /* When running in Gump Flume consistently returns the last event 
from the primary channel after
               the failover, which fails this test */
            Assert.assertTrue("Channel contained event, but not expected 
message. Expected: " + expected +
                " Received: " + body, body.endsWith(expected));
            transaction.commit();
            transaction.close();
        }
When I run this on my Mac it never fails. But Gump fails almost every time 
returning "Channel contained event, but not expected message. Expected: Test 
Alternate 0 Received: <128>1 2012-08-30T05:50:04.143Z vmgump MyApp - Test 
[Test@18060][mdc@18060] Test Primary 9"

Do we have any tests that are similar to this?  I didn't see anything that 
tests failover in this way but I might have missed it.

Ralph

On Aug 30, 2012, at 9:22 AM, Hari Shreedharan wrote:

> Hi Ralph, 
> 
> Sorry missed this message earlier. How are you simulating failover in your 
> test - I did not look at your code. If the message was written by the Avro 
> Source on the client and the Avro Sink on the other side simply did not get a 
> success would cause the failover sink processor to retry the same message 
> since it would be rolled back by the sink, and hence the channel will end up 
> making it available for another sink. Generally, if a message is not ack-ed 
> as being successfully written to the channel by the Avro Source, the sink 
> will rollback the transaction - and throw an EventDeliveryException - and in 
> case of Failover SinkProcessor, it will cause the next sink to pick it up. 
> 
> Also, note that Flume guarantees at least once semantics and weak ordering. 
> If a failure happens, it is possible that there will be duplicates. 
> 
> And no, this is not related to any of the FileChannel issues we have been 
> fixing. 
> 
> Thanks,
> Hari
> 
> -- 
> Hari Shreedharan
> 
> 
> On Thursday, August 30, 2012 at 7:50 AM, Ralph Goers wrote:
> 
>> I'm going to try again. Does this problem sound familiar to anyone? 
>> 
>> Ralph
>> 
>> On Aug 27, 2012, at 3:36 PM, Ralph Goers wrote:
>> 
>>> Does anyone have any thoughts on this? Is it possibly related to any of the 
>>> issues already being fixed on the FileChannel?
>>> 
>>> Ralph
>>> 
>>> On Aug 26, 2012, at 4:05 PM, Ralph Goers wrote:
>>> 
>>>> I have successfully embedded Flume into the Log4j 2 Appender. However, I 
>>>> have a unit test that has Flume fail over from one AvroSink to another. 
>>>> When this happens under some circumstances I am getting the last message 
>>>> successfully delivered to the first source as the first message to the 
>>>> second source, which doesn't seem correct. The unit test is 
>>>> athttps://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java
>>>>  
>>>> (http://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java).
>>>>  The odd thing is that I cannot get this to fail on my local machine - it 
>>>> only fails when Gump runs it, but it fail fairly consistently.
>>>> 
>>>> The unit test has the AppenderSource connect to a FileChannel. Two 
>>>> AvroSinks are connected to the FileChannel via the Failover processor.
>>>> 
>>>> Is this a known behavior?
>>>> 
>>>> Ralph 
> 
> 

Reply via email to