Hi Ted, thanks for clearing up the language; sequential consistency is what I intended to express, then.
Yes, you could do a forced sync too; that would be another good way to test. Taylor, the bug that I witnessed only occurs after you leave a trident topology running for at least a day. One day it'll just stop making progress and re-attempt the same batch forever. Unfortunately I can't send the particular trident code to you, but I don't think there's anything unique about it; I suspect any trident topology could reproduce the bug if run for a week. Other correlated factors may be that the trident topology has to occasionally fail batches, and that the zookeeper cluster has to be under significant load from other applications beyond trident. I don't have many more details, unfortunately.

-Jason

On Wed, Apr 9, 2014 at 3:03 PM, Ted Dunning <[email protected]> wrote:

> In what sense do you mean when you say that reads in ZK are eventually consistent?
>
> You may get a slightly old value, but you are guaranteed to see a consistent history. That is, if a value goes through versions v_1 ... v_n (each carrying a version number), then once you have seen v_i you will never see v_j where j < i.
>
> You can also guarantee that you don't even see delayed values by using sync.
>
> Normally when people say "eventually consistent" they mean that two participants can see inconsistent histories under partition. That isn't possible in ZK. As I understand it, ZK would be better described as providing sequential consistency, since all observers will see all updates in the same order.
>
>
> On Wed, Apr 9, 2014 at 2:50 PM, Jason Jackson <[email protected]> wrote:
>
>> I have one theory that because reads in zookeeper are eventually consistent, this is a necessary condition for the bug to manifest. One way to test this hypothesis is to run a zookeeper ensemble with 1 node, or an ensemble configured for 5 nodes but with 2 of them taken offline, so that every write operation only succeeds if every live member of the ensemble sees the write. This should produce strongly consistent reads. If you run this test, let me know what the results are. (Clearly this isn't a good production setup, since you're trading availability for consistency, but the results could help narrow down the bug.)
>>
>>
>> On Wed, Apr 9, 2014 at 2:43 PM, Jason Jackson <[email protected]> wrote:
>>
>>> Yeah, it's probably a bug in trident. It would be amazing if someone figured out the fix for this. I spent about 6 hours looking into it, but couldn't figure out why it was occurring.
>>>
>>> Beyond fixing this, one thing you could do to buy yourself time is to disable batch retries in trident. There's no option for this in the API, but it's a 1 or 2 line change to the code. Obviously you lose exactly-once semantics, but at least you would have a system that never falls behind real time.
>>>
>>>
>>> On Wed, Apr 9, 2014 at 1:10 AM, Danijel Schiavuzzi <[email protected]> wrote:
>>>
>>>> Thanks Jason. However, I don't think that was the case in my stuck topology, otherwise I'd have seen exceptions (thrown by my Trident functions) in the worker logs.
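
To make Ted's point about sync concrete: with the plain ZooKeeper Java client, a read can be preceded by sync(), which forces the server the client is connected to to catch up with the leader before the following getData() is served. The sketch below is illustrative only; the connect string, znode path, and no-op watcher are assumptions, not details from this thread.

import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.AsyncCallback;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class SyncedRead {

    // Ask the connected server to catch up with the leader before reading,
    // so the subsequent getData() doesn't return a stale value.
    public static byte[] readLatest(ZooKeeper zk, String path) throws Exception {
        final CountDownLatch synced = new CountDownLatch(1);
        zk.sync(path, new AsyncCallback.VoidCallback() {
            public void processResult(int rc, String p, Object ctx) {
                synced.countDown();
            }
        }, null);
        synced.await();
        return zk.getData(path, false, new Stat());
    }

    public static void main(String[] args) throws Exception {
        // Connect string and znode path are made-up examples; a real client
        // would also wait for the connection event and check the callback rc.
        ZooKeeper zk = new ZooKeeper("localhost:2181", 15000, new Watcher() {
            public void process(WatchedEvent event) { }
        });
        byte[] data = readLatest(zk, "/transactional/myTopology/coordinator/currattempts");
        System.out.println(new String(data, "UTF-8"));
        zk.close();
    }
}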
>>>>
>>>> On Wed, Apr 9, 2014 at 3:02 AM, Jason Jackson <[email protected]> wrote:
>>>>
>>>>> An example of "corrupted input" that causes a batch to fail would be, for example, if you expected a certain schema for the data that you read off kafka, or some queue, and for whatever reason the data didn't conform to that schema, and the function that you implement and pass to stream.each throws an exception when this unexpected situation occurs. This would cause the batch to be retried, but since it fails deterministically, the batch will be retried forever.
>>>>>
>>>>>
>>>>> On Mon, Apr 7, 2014 at 10:37 AM, Danijel Schiavuzzi <[email protected]> wrote:
>>>>>
>>>>>> Hi Jason,
>>>>>>
>>>>>> Could you be more specific -- what do you mean by "corrupted input"? Do you mean that there's a bug in Trident itself that causes the tuples in a batch to somehow become corrupted?
>>>>>>
>>>>>> Thanks a lot!
>>>>>>
>>>>>> Danijel
>>>>>>
>>>>>>
>>>>>> On Monday, April 7, 2014, Jason Jackson <[email protected]> wrote:
>>>>>>
>>>>>>> This could happen if you have corrupted input that always causes a batch to fail and be retried.
>>>>>>>
>>>>>>> I have seen this behaviour before and I didn't see corrupted input. It might be a bug in trident, I'm not sure. If you figure it out, please update this thread and/or submit a patch.
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Mar 31, 2014 at 7:39 AM, Danijel Schiavuzzi <[email protected]> wrote:
>>>>>>>
>>>>>>> To (partially) answer my own question -- I still have no idea about the cause of the stuck topology, but re-submitting the topology helps -- after re-submitting, my topology is now running normally.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 6:04 PM, Danijel Schiavuzzi <[email protected]> wrote:
>>>>>>>
>>>>>>> Also, I did have multiple cases of my IBackingMap workers dying (because of RuntimeExceptions) but successfully restarting afterwards (I throw RuntimeExceptions in my IBackingMap implementation as a strategy for rare SQL database deadlock situations, to force a worker restart and a fail+retry of the batch).
>>>>>>>
>>>>>>> From the logs, one such IBackingMap worker death (and subsequent restart) resulted in the Kafka spout re-emitting the pending tuple:
>>>>>>>
>>>>>>> 2014-03-22 16:26:43 s.k.t.TridentKafkaEmitter [INFO] re-emitting batch, attempt 29698959:736
>>>>>>>
>>>>>>> This is of course the normal behavior of a transactional topology, but this is the first time I've encountered a case of a batch retrying indefinitely. This is especially suspicious since the topology has been running fine for 20 days straight, re-emitting batches and restarting IBackingMap workers quite a number of times.
>>>>>>>
>>>>>>> I can see in my IBackingMap backing SQL database that the batch with the exact txid value 29698959 has been committed -- but I suspect that could come from another BackingMap, since there are two BackingMap instances running (parallelismHint 2).
>>>>>>>
>>>>>>> However, I have no idea why the batch is being retried indefinitely now, nor why it hasn't been successfully acked by Trident.
>>>>>>>
>>>>>>> Any suggestions on the area (topology component) to focus my research on?
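
As a concrete illustration of the deterministic failure Jason describes, a minimal sketch of an each() function that rejects malformed input could look like this (Storm 0.9.x package names; the field layout and comma-separated format are hypothetical):

import backtype.storm.topology.FailedException;
import backtype.storm.tuple.Values;
import storm.trident.operation.BaseFunction;
import storm.trident.operation.TridentCollector;
import storm.trident.tuple.TridentTuple;

// Hypothetical parsing function passed to stream.each(new Fields("bytes"), ...).
// Any exception thrown here fails the whole batch, and because the same bad
// message is replayed on every attempt, the batch fails deterministically and
// is retried forever.
public class ParseEvent extends BaseFunction {
    @Override
    public void execute(TridentTuple tuple, TridentCollector collector) {
        String raw = new String(tuple.getBinary(0));
        String[] parts = raw.split(",");
        if (parts.length != 2) {
            // Deterministic failure: every retry sees the same malformed message.
            throw new FailedException("Unexpected schema: " + raw);
        }
        collector.emit(new Values(parts[0], parts[1]));
    }
}

A defensive alternative is to catch the parse error and drop (or divert) the bad tuple instead of throwing, so a single malformed message cannot wedge the topology in an endless retry loop.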
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> On Wed, Mar 26, 2014 at 5:32 PM, Danijel Schiavuzzi <[email protected]> wrote:
>>>>>>>
>>>>>>> Hello,
>>>>>>>
>>>>>>> I'm having problems with my transactional Trident topology. It has been running fine for about 20 days, and suddenly it is stuck processing a single batch, with no tuples being emitted and none being persisted by the TridentState (IBackingMap).
>>>>>>>
>>>>>>> It's a simple topology which consumes messages off a Kafka queue. The spout is an instance of the storm-kafka-0.8-plus TransactionalTridentKafkaSpout, and I use the trident-mssql transactional TridentState implementation to persistentAggregate() data into a SQL database.
>>>>>>>
>>>>>>> In Zookeeper I can see Storm is re-trying a batch, i.e.
>>>>>>>
>>>>>>> "/transactional/<myTopologyName>/coordinator/currattempts" is "{"29698959":6487}"
>>>>>>>
>>>>>>> ... and the attempt count keeps increasing. It seems the batch with txid 29698959 is stuck: the attempt count in Zookeeper keeps increasing, so the batch apparently isn't being acked by Trident, and I have no idea why, especially since the topology has been running successfully for the last 20 days.
>>>>>>>
>>>>>>> I did rebalance the topology on one occasion, after which it continued running normally. Other than that, no other modifications were made. Storm is at version 0.9.0.1.
>>>>>>>
>>>>>>> Any hints on how to debug the stuck topology? Any other useful info I might provide?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> --
>>>>>>> Danijel Schiavuzzi
>>>>>>>
>>>>>>> E: [email protected]
>>>>>>> W: www.schiavuzzi.com
>>>>>>> T: +385989035562
>>>>>>> Skype: danijel.schiavuzzi
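
For context, the kind of topology Danijel describes could be wired up roughly as follows. This is only a sketch: the broker ZooKeeper address, topic, field names, and the in-memory state factory (standing in for the trident-mssql state, whose API isn't shown in the thread) are all assumptions, and ParseEvent is the hypothetical function from the earlier sketch.

import backtype.storm.Config;
import backtype.storm.StormSubmitter;
import backtype.storm.tuple.Fields;
import storm.kafka.ZkHosts;
import storm.kafka.trident.TransactionalTridentKafkaSpout;
import storm.kafka.trident.TridentKafkaConfig;
import storm.trident.TridentTopology;
import storm.trident.operation.builtin.Count;
import storm.trident.testing.MemoryMapState;

public class StuckTopologyExample {
    public static void main(String[] args) throws Exception {
        // Kafka spout configured against a hypothetical ZK/broker setup and topic.
        TridentKafkaConfig kafkaConf =
                new TridentKafkaConfig(new ZkHosts("zk1:2181"), "events");
        TransactionalTridentKafkaSpout spout =
                new TransactionalTridentKafkaSpout(kafkaConf);

        TridentTopology topology = new TridentTopology();
        topology.newStream("eventSpout", spout)
                // Assuming the default raw scheme, which emits a single "bytes" field.
                .each(new Fields("bytes"), new ParseEvent(), new Fields("key", "value"))
                .groupBy(new Fields("key"))
                // MemoryMapState stands in for the trident-mssql transactional state.
                .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
                .parallelismHint(2);

        StormSubmitter.submitTopology("myTopology", new Config(), topology.build());
    }
}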

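Finally, the worker-restart strategy Danijel mentions (rethrowing SQL deadlocks from the IBackingMap so the batch is failed and retried) might look roughly like the sketch below. This is a guess at the shape of such a map, not his actual code; the DataSource wiring and helper methods are hypothetical.

import java.sql.Connection;
import java.sql.SQLException;
import java.util.List;

import javax.sql.DataSource;

import storm.trident.state.map.IBackingMap;

// Rough sketch: translate SQL failures (e.g. deadlocks) into RuntimeExceptions
// so the worker dies and Trident fails and retries the batch.
public class SqlBackingMap implements IBackingMap<Long> {

    private final DataSource dataSource;

    public SqlBackingMap(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public List<Long> multiGet(List<List<Object>> keys) {
        try (Connection conn = dataSource.getConnection()) {
            return loadValues(conn, keys);
        } catch (SQLException e) {
            throw new RuntimeException("multiGet failed, forcing worker restart", e);
        }
    }

    @Override
    public void multiPut(List<List<Object>> keys, List<Long> vals) {
        try (Connection conn = dataSource.getConnection()) {
            upsertValues(conn, keys, vals);
        } catch (SQLException e) {
            // A deadlock surfaces here as an SQLException; rethrowing unchecked kills
            // the worker, and the transactional spout re-emits the batch afterwards.
            throw new RuntimeException("multiPut failed, forcing worker restart", e);
        }
    }

    // The actual SELECT/UPSERT statements are omitted from this sketch.
    private List<Long> loadValues(Connection conn, List<List<Object>> keys) throws SQLException {
        throw new UnsupportedOperationException("omitted in this sketch");
    }

    private void upsertValues(Connection conn, List<List<Object>> keys, List<Long> vals) throws SQLException {
        throw new UnsupportedOperationException("omitted in this sketch");
    }
}

In a real transactional state this map would typically sit under Trident's transactional map layer, which stores the txid alongside each value so that retried batches remain idempotent.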