Did you ever get a chance to look at this? I am still getting these failures almost every time it runs.
Ralph

On Aug 31, 2012, at 10:35 AM, Hari Shreedharan wrote:

> Thanks Ralph. Let me take a look at the code.
>
> --
> Hari Shreedharan
>
> On Friday, August 31, 2012 at 9:38 AM, Ralph Goers wrote:
>
>> Which file? The files from the FileChannel, the source or …? If you want the
>> FileChannel stuff, unfortunately it only is failing on the machine where
>> Gump runs and I don't have a clue how to get access to it or if it is even
>> left around after the run. As I said, I've never had this fail on the
>> Mac(s), my Linux system or in Jenkins. I have no idea what is peculiar about
>> that system but I do know the tests take about twice as long as they do on
>> my Mac.
>>
>> If you want to look at the actual unit test source, it is at
>> https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java
>>
>> Ralph
>>
>> On Aug 31, 2012, at 12:15 AM, Hari Shreedharan wrote:
>>
>>> It looks like the channel has already got the events before the source
>>> stops. Can you send me a link to the actual file, so I can take a look?
>>>
>>> Hari
>>>
>>> --
>>> Hari Shreedharan
>>>
>>> On Thursday, August 30, 2012 at 9:44 AM, Ralph Goers wrote:
>>>
>>>> Thanks Hari,
>>>>
>>>> First, remember that the Flume agent is embedded in the Appender. So the
>>>> Log4j EventSource is passing the event to the FileChannel. The Avro Sink
>>>> then reads from the channel and sends it on. The unit test has two Avro
>>>> Sources listening with each associated with its own MemoryChannel. The
>>>> test logs 10 events then reads the 10 events from the primary
>>>> MemoryChannel, each within its own transaction. The test then stops the
>>>> primary source. Then it logs 10 more events and tries to read them from
>>>> the alternate MemoryChannel.
>>>>
>>>> The code to read from the channel looks like:
>>>>
>>>>     for (int i = 0; i < 10; ++i) {
>>>>         StructuredDataMessage msg = new StructuredDataMessage("Test", "Test Primary " + i, "Test");
>>>>         EventLogger.logEvent(msg);
>>>>     }
>>>>     for (int i = 0; i < 10; ++i) {
>>>>         Transaction transaction = primaryChannel.getTransaction();
>>>>         transaction.begin();
>>>>
>>>>         Event event = primaryChannel.take();
>>>>         Assert.assertNotNull(event);
>>>>         String body = getBody(event);
>>>>         String expected = "Test Primary " + i;
>>>>         Assert.assertTrue("Channel contained event, but not expected message. Received: " + body,
>>>>             body.endsWith(expected));
>>>>         transaction.commit();
>>>>         transaction.close();
>>>>     }
>>>>
>>>>     primarySource.stop();
>>>>
>>>>     for (int i = 0; i < 10; ++i) {
>>>>         StructuredDataMessage msg = new StructuredDataMessage("Test", "Test Alternate " + i, "Test");
>>>>         EventLogger.logEvent(msg);
>>>>     }
>>>>     for (int i = 0; i < 10; ++i) {
>>>>         Transaction transaction = alternateChannel.getTransaction();
>>>>         transaction.begin();
>>>>
>>>>         Event event = alternateChannel.take();
>>>>         Assert.assertNotNull(event);
>>>>         String body = getBody(event);
>>>>         String expected = "Test Alternate " + i;
>>>>         /* When running in Gump Flume consistently returns the last event from the
>>>>            primary channel after the failover, which fails this test */
>>>>         Assert.assertTrue("Channel contained event, but not expected message. Expected: " + expected +
>>>>             " Received: " + body, body.endsWith(expected));
>>>>         transaction.commit();
>>>>         transaction.close();
>>>>     }
>>>>
>>>> When I run this on my Mac it never fails. But Gump fails almost every time
>>>> returning "Channel contained event, but not expected message. Expected:
>>>> Test Alternate 0 Received: <128>1 2012-08-30T05:50:04.143Z vmgump MyApp -
>>>> Test [Test@18060][mdc@18060] Test Primary 9"
>>>>
>>>> Do we have any tests that are similar to this?
>>>> I didn't see anything that tests failover in this way but I might have
>>>> missed it.
>>>>
>>>> Ralph
>>>>
>>>> On Aug 30, 2012, at 9:22 AM, Hari Shreedharan wrote:
>>>>
>>>>> Hi Ralph,
>>>>>
>>>>> Sorry, I missed this message earlier. How are you simulating failover in
>>>>> your test? I did not look at your code. If the message was written by
>>>>> the Avro Source on the client and the Avro Sink on the other side simply
>>>>> did not get a success, that would cause the failover sink processor to
>>>>> retry the same message, since it would be rolled back by the sink, and
>>>>> hence the channel would end up making it available for another sink.
>>>>> Generally, if a message is not ack-ed as being successfully written to
>>>>> the channel by the Avro Source, the sink will roll back the transaction
>>>>> - and throw an EventDeliveryException - and in the case of the Failover
>>>>> SinkProcessor, it will cause the next sink to pick it up.
>>>>>
>>>>> Also, note that Flume guarantees at-least-once semantics and weak
>>>>> ordering. If a failure happens, it is possible that there will be
>>>>> duplicates.
>>>>>
>>>>> And no, this is not related to any of the FileChannel issues we have been
>>>>> fixing.
>>>>>
>>>>> Thanks,
>>>>> Hari
>>>>>
>>>>> --
>>>>> Hari Shreedharan
>>>>>
>>>>> On Thursday, August 30, 2012 at 7:50 AM, Ralph Goers wrote:
>>>>>
>>>>>> I'm going to try again. Does this problem sound familiar to anyone?
>>>>>>
>>>>>> Ralph
>>>>>>
>>>>>> On Aug 27, 2012, at 3:36 PM, Ralph Goers wrote:
>>>>>>
>>>>>>> Does anyone have any thoughts on this? Is it possibly related to any of
>>>>>>> the issues already being fixed on the FileChannel?
>>>>>>>
>>>>>>> Ralph
>>>>>>>
>>>>>>> On Aug 26, 2012, at 4:05 PM, Ralph Goers wrote:
>>>>>>>
>>>>>>>> I have successfully embedded Flume into the Log4j 2 Appender. However,
>>>>>>>> I have a unit test that has Flume fail over from one AvroSink to
>>>>>>>> another.
>>>>>>>> When this happens under some circumstances I am getting the
>>>>>>>> last message successfully delivered to the first source as the first
>>>>>>>> message to the second source, which doesn't seem correct. The unit
>>>>>>>> test is at
>>>>>>>> https://svn.apache.org/repos/asf/logging/log4j/log4j2/trunk/flume-ng/src/test/java/org/apache/logging/log4j/flume/appender/FlumeEmbeddedAppenderTest.java
>>>>>>>>
>>>>>>>> The odd thing is that I cannot get this to fail on my local machine -
>>>>>>>> it only fails when Gump runs it, but it fails fairly consistently.
>>>>>>>>
>>>>>>>> The unit test has the AppenderSource connect to a FileChannel. Two
>>>>>>>> AvroSinks are connected to the FileChannel via the Failover processor.
>>>>>>>>
>>>>>>>> Is this a known behavior?
>>>>>>>>
>>>>>>>> Ralph
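
Given the at-least-once guarantee mentioned in the thread, one way a test like this could tolerate a redelivered event is to discard leftovers from the earlier batch before checking the new one. The following is only a minimal sketch under stated assumptions, not the fix used in the actual test: a plain java.util.Queue of message bodies stands in for the Flume channel, and the class name RedeliveryTolerantRead, the method readBatch, and the sample bodies are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical sketch: a Queue<String> stands in for the Flume channel.
public class RedeliveryTolerantRead {

    /**
     * Reads up to count bodies from the channel, skipping any events that
     * contain staleMarker and arrive before the first event of the new batch.
     */
    public static List<String> readBatch(Queue<String> channel, String staleMarker, int count) {
        List<String> batch = new ArrayList<>();
        while (batch.size() < count) {
            String body = channel.poll();
            if (body == null) {
                break; // channel drained before the batch was complete
            }
            if (batch.isEmpty() && body.contains(staleMarker)) {
                continue; // redelivered event from the earlier batch; skip it
            }
            batch.add(body);
        }
        return batch;
    }

    public static void main(String[] args) {
        // Simulate the Gump failure: the last primary event shows up first.
        Queue<String> alternate = new ArrayDeque<>();
        alternate.add("MyApp - Test Primary 9");
        for (int i = 0; i < 10; ++i) {
            alternate.add("MyApp - Test Alternate " + i);
        }

        List<String> batch = readBatch(alternate, "Test Primary ", 10);
        for (int i = 0; i < batch.size(); ++i) {
            if (!batch.get(i).endsWith("Test Alternate " + i)) {
                throw new IllegalStateException("Unexpected event: " + batch.get(i));
            }
        }
        System.out.println(batch.size() + " events, first: " + batch.get(0));
    }
}
```

Only duplicates at the head of the batch are skipped, so a stale event that turned up in the middle of the new batch would still be reported as a failure.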
