Hi Stephan, I have upgraded to Flink 1.3.0 to test RocksDB with incremental checkpointing (the PredefinedOptions setting used is FLASH_SSD_OPTIMIZED).
I am currently creating a YARN session and running the job on EMR with r3.4xlarge instances (122GB of memory). I have observed that it utilizes almost all of that memory; this was not happening with the previous version, where at most 30GB was used. Because of this issue the JobManager was killed and the job failed. Are there any other configurations I need to set? P.S. I am currently using FRocksDB. Regards, Vinay Patil On Fri, May 5, 2017 at 1:01 PM, Vinay Patil <vinay18.pa...@gmail.com> wrote: > Hi Stephan, > > I tested the pipeline with the FRocksDB dependency (with the SSD_OPTIMIZED > option); none of the checkpoints failed. > > Checkpointing 10GB of state took 45 seconds, which is better than the > previous results. > > Let me know if there are any other configurations that would help get > better results. > > Regards, > Vinay Patil > > On Thu, May 4, 2017 at 10:05 PM, Vinay Patil <vinay18.pa...@gmail.com> > wrote: > >> Hi Stephan, >> >> I see that the RocksDB issue is solved by having a separate FRocksDB >> dependency. >> >> I have added this dependency as discussed on the JIRA. Is that the only >> thing we have to do, or do we also have to change the code that sets the >> RocksDB state backend? >> >> Regards, >> Vinay Patil >> >> On Tue, Mar 28, 2017 at 1:20 PM, Stefan Richter [via Apache Flink User >> Mailing List archive.] <ml-node+s2336050n12429...@n4.nabble.com> wrote: >> >>> Hi, >>> >>> I was able to come up with a custom build of RocksDB yesterday that >>> seems to fix the problems. I still have to build the native code for >>> different platforms and then test it. I cannot make promises about the >>> 1.2.1 release, but I am optimistic that it will make it in. 
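[Editor's example] The setup being tested in this thread (RocksDB state backend with incremental checkpointing and FLASH_SSD_OPTIMIZED) can be configured roughly as below in Flink 1.3. This is a minimal sketch; the HDFS checkpoint URI and the 10-minute interval are placeholders, not values from the thread:

```java
import org.apache.flink.contrib.streaming.state.PredefinedOptions;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackendSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Second constructor argument enables incremental checkpoints (new in Flink 1.3).
        RocksDBStateBackend backend =
                new RocksDBStateBackend("hdfs:///flink/checkpoints", true);

        // Tune RocksDB for flash storage, as discussed above.
        backend.setPredefinedOptions(PredefinedOptions.FLASH_SSD_OPTIMIZED);

        env.setStateBackend(backend);
        env.enableCheckpointing(10 * 60 * 1000); // checkpoint every 10 minutes
    }
}
```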
>>> >>> Best, >>> Stefan >>> >>> Am 27.03.2017 um 19:12 schrieb vinay patil <[hidden email] >>> <http:///user/SendEmail.jtp?type=node&node=12429&i=0>>: >>> >>> Hi Stephan, >>> >>> Just an update, last week I did a run with state size close to 18GB, I >>> did not observe the pipeline getting stopped in between with G1GC enabled. >>> >>> I had observed checkpoint failures when the state size was close to 38GB >>> (but in this case G1GC was not enabled) >>> >>> Is it possible to get the RocksDB fix in 1.2.1 so that I can test it out. >>> >>> >>> Regards, >>> Vinay Patil >>> >>> On Sat, Mar 18, 2017 at 12:25 AM, Stephan Ewen [via Apache Flink User >>> Mailing List archive.] <<a href="x-msg://1/user/SendEmail >>> .jtp?type=node&node=12425&i=0" target="_top" rel="nofollow" >>> link="external" class="">[hidden email]> wrote: >>> >>>> @vinay Let's see how fast we get this fix in - I hope yes. It may >>>> depend also a bit on the RocksDB community. >>>> >>>> In any case, if it does not make it in, we can do a 1.2.2 release >>>> immediately after (I think the problem is big enough to warrant that), or >>>> at least release a custom version of the RocksDB state backend that >>>> includes the fix. >>>> >>>> Stephan >>>> >>>> >>>> On Fri, Mar 17, 2017 at 5:51 PM, vinay patil <[hidden email] >>>> <http://user/SendEmail.jtp?type=node&node=12276&i=0>> wrote: >>>> >>>>> Hi Stephan, >>>>> >>>>> Is the performance related change of RocksDB going to be part of >>>>> Flink 1.2.1 ? >>>>> >>>>> Regards, >>>>> Vinay Patil >>>>> >>>>> On Thu, Mar 16, 2017 at 6:13 PM, Stephan Ewen [via Apache Flink User >>>>> Mailing List archive.] <[hidden email] >>>>> <http://user/SendEmail.jtp?type=node&node=12274&i=0>> wrote: >>>>> >>>>>> The only immediate workaround is to use windows with "reduce" or >>>>>> "fold" or "aggregate" and not "apply". And to not use an evictor. >>>>>> >>>>>> The good news is that I think we have a good way of fixing this soon, >>>>>> making an adjustment in RocksDB. 
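[Editor's example] The workaround Stephan describes, incremental aggregation with reduce() instead of buffering every element for apply(), looks roughly like this. The Tuple2 stream, key, and window size are illustrative, not taken from the thread:

```java
import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class IncrementalWindow {
    // With reduce(), RocksDB stores a single accumulated value per key and
    // window instead of the full element list that apply() would buffer,
    // which is the access pattern that triggers the slowdown discussed here.
    public static DataStream<Tuple2<String, Long>> countPerKey(
            DataStream<Tuple2<String, Long>> input) {
        return input
                .keyBy(0)
                .window(TumblingProcessingTimeWindows.of(Time.minutes(5)))
                .reduce(new ReduceFunction<Tuple2<String, Long>>() {
                    @Override
                    public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
                                                       Tuple2<String, Long> b) {
                        return new Tuple2<>(a.f0, a.f1 + b.f1);
                    }
                });
    }
}
```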
>>>>>> >>>>>> For the Yarn / g1gc question: Not 100% sure about that - you can >>>>>> check if it used g1gc. If not, you may be able to pass this through the >>>>>> "env.java.opts" parameter. (cc robert for confirmation) >>>>>> >>>>>> Stephan >>>>>> >>>>>> >>>>>> >>>>>> On Thu, Mar 16, 2017 at 8:31 AM, vinay patil <[hidden email] >>>>>> <http://user/SendEmail.jtp?type=node&node=12243&i=0>> wrote: >>>>>> >>>>>>> Hi Stephan, >>>>>>> >>>>>>> What can be the workaround for this ? >>>>>>> >>>>>>> Also need one confirmation : Is G1 GC used by default when running >>>>>>> the pipeline on YARN. (I see a thread of 2015 where G1 is used by >>>>>>> default >>>>>>> for JAVA8) >>>>>>> >>>>>>> >>>>>>> >>>>>>> Regards, >>>>>>> Vinay Patil >>>>>>> >>>>>>> On Wed, Mar 15, 2017 at 10:32 PM, Stephan Ewen [via Apache Flink >>>>>>> User Mailing List archive.] <[hidden email] >>>>>>> <http://user/SendEmail.jtp?type=node&node=12234&i=0>> wrote: >>>>>>> >>>>>>>> Hi Vinay! >>>>>>>> >>>>>>>> Savepoints also call the same problematic RocksDB function, >>>>>>>> unfortunately. >>>>>>>> >>>>>>>> We will have a fix next month. We either (1) get a patched RocksDB >>>>>>>> version or we (2) implement a different pattern for ListState in Flink. >>>>>>>> >>>>>>>> (1) would be the better solution, so we are waiting for a response >>>>>>>> from the RocksDB folks. (2) is always possible if we cannot get a fix >>>>>>>> from >>>>>>>> RocksDB. >>>>>>>> >>>>>>>> Stephan >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Mar 15, 2017 at 5:53 PM, vinay patil <[hidden email] >>>>>>>> <http://user/SendEmail.jtp?type=node&node=12225&i=0>> wrote: >>>>>>>> >>>>>>>>> Hi Stephan, >>>>>>>>> >>>>>>>>> Thank you for making me aware of this. >>>>>>>>> >>>>>>>>> Yes I am using a window without reduce function (Apply function). >>>>>>>>> The discussion happening on JIRA is exactly what I am observing, >>>>>>>>> consistent >>>>>>>>> failure of checkpoints after some time and the stream halts. 
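[Editor's example] If it turns out G1 is not enabled by default on the cluster, the `env.java.opts` route Stephan mentions is a flink-conf.yaml entry. A minimal sketch (the flag applies to both JobManager and TaskManager JVMs):

```yaml
# flink-conf.yaml: JVM options passed to JobManager and TaskManager processes
env.java.opts: -XX:+UseG1GC
```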
>>>>>>>>> We want to go live next month; I am not sure how this will behave in >>>>>>>>> production, as we are going to process over 200 million records. >>>>>>>>> >>>>>>>>> As a workaround, can I take a savepoint while the pipeline is >>>>>>>>> running? Let's say I take a savepoint every 30 minutes; will that work? >>>>>>>>> >>>>>>>>> Regards, >>>>>>>>> Vinay Patil >>>>>>>>> >>>>>>>>> On Tue, Mar 14, 2017 at 10:02 PM, Stephan Ewen [via Apache Flink >>>>>>>>> User Mailing List archive.] <[hidden email] >>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12224&i=0>> wrote: >>>>>>>>> >>>>>>>>>> The issue in Flink is https://issues.apache.org/jira/browse/FLINK-5756 >>>>>>>>>> >>>>>>>>>> On Tue, Mar 14, 2017 at 3:40 PM, Stefan Richter <[hidden email] >>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=0>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Vinay, >>>>>>>>>>> >>>>>>>>>>> I think the issue is tracked here: https://github.com/facebook/rocksdb/issues/1988. >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Stefan >>>>>>>>>>> >>>>>>>>>>> On 14.03.2017 at 15:31, Vishnu Viswanath <[hidden email] >>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=1>> wrote: >>>>>>>>>>> >>>>>>>>>>> Hi Stephan, >>>>>>>>>>> >>>>>>>>>>> Is there a ticket number/link to track this? My job has all the >>>>>>>>>>> conditions you mentioned. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Vishnu >>>>>>>>>>> >>>>>>>>>>> On Tue, Mar 14, 2017 at 7:13 AM, Stephan Ewen <[hidden email] >>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=2>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Vinay! >>>>>>>>>>>> >>>>>>>>>>>> We just discovered a bug in RocksDB. The bug affects windows >>>>>>>>>>>> without reduce() or fold(), windows with evictors, and ListState. 
>>>>>>>>>>>> A certain access pattern in RocksDB starts being so slow beyond >>>>>>>>>>>> a certain size-per-key that it basically brings down the streaming program >>>>>>>>>>>> and the snapshots. >>>>>>>>>>>> >>>>>>>>>>>> We are reaching out to the RocksDB folks and looking for >>>>>>>>>>>> workarounds in Flink. >>>>>>>>>>>> >>>>>>>>>>>> Greetings, >>>>>>>>>>>> Stephan >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Mar 1, 2017 at 12:10 PM, Stephan Ewen <[hidden email] >>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=3>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> @vinay Can you try to not set the buffer timeout at all? I am >>>>>>>>>>>>> actually not sure what the effect of setting it to a negative >>>>>>>>>>>>> value would be; that could be a cause of problems... >>>>>>>>>>>>> >>>>>>>>>>>>> On Mon, Feb 27, 2017 at 7:44 PM, Seth Wiesman <[hidden email] >>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=4>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>>> Vinay, >>>>>>>>>>>>>> >>>>>>>>>>>>>> The bucketing sink performs rename operations during the >>>>>>>>>>>>>> checkpoint, and if it tries to rename a file that is not yet consistent, >>>>>>>>>>>>>> that causes a FileNotFound exception which fails the checkpoint. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Stephan, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Currently my aws fork contains some very specific assumptions >>>>>>>>>>>>>> about the pipeline that will in general only hold for my pipeline. This is >>>>>>>>>>>>>> because there were still some open questions I had about how to solve >>>>>>>>>>>>>> consistency issues in the general case. I will comment on the Jira issue >>>>>>>>>>>>>> with more specifics. 
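[Editor's example] On the buffer-timeout question in this exchange: the timeout is set per environment, and the values have documented meanings. A minimal sketch, with the default value shown explicitly:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BufferTimeoutSetup {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        // Default is 100 ms. setBufferTimeout(0) flushes after every record
        // (lowest latency, lower throughput); setBufferTimeout(-1) flushes
        // only when a network buffer is full, which can hold records back
        // for a long time on low-volume streams.
        env.setBufferTimeout(100);
    }
}
```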
>>>>>>>>>>>>>> Seth Wiesman >>>>>>>>>>>>>> >>>>>>>>>>>>>> *From: *vinay patil <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=5>> *Reply-To: *"[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=6>" <[hidden email] <http://user/SendEmail.jtp?type=node&node=12209&i=7>> *Date: *Monday, February 27, 2017 at 1:05 PM *To: *"[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=12209&i=8>" <[hidden email] <http://user/SendEmail.jtp?type=node&node=12209&i=9>> *Subject: *Re: Checkpointing with RocksDB as statebackend >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Seth, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thank you for your suggestion. >>>>>>>>>>>>>> >>>>>>>>>>>>>> But if the issue is only related to S3, then why does this >>>>>>>>>>>>>> happen when I replace the S3 sink with HDFS as well (for checkpointing I am >>>>>>>>>>>>>> using HDFS only)? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Stephan, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Another issue I see is that when I set env.setBufferTimeout(-1) >>>>>>>>>>>>>> and keep the checkpoint interval at 10 minutes, nothing >>>>>>>>>>>>>> gets written to the sink (tried with S3 as well as HDFS); at least I was >>>>>>>>>>>>>> expecting pending files here. >>>>>>>>>>>>>> >>>>>>>>>>>>>> This issue gets worse when checkpointing is disabled, as >>>>>>>>>>>>>> nothing is written at all. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Vinay Patil >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Feb 27, 2017 at 10:55 PM, Stephan Ewen [via Apache >>>>>>>>>>>>>> Flink User Mailing List archive.] 
<[hidden email]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Seth! >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Wow, that is an awesome approach. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> We have actually seen these issues as well and we are looking >>>>>>>>>>>>>> to eventually implement our own S3 file system (and circumvent >>>>>>>>>>>>>> Hadoop's S3 >>>>>>>>>>>>>> connector that Flink currently relies on): >>>>>>>>>>>>>> https://issues.apache.org/jira/browse/FLINK-5706 >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Do you think your patch would be a good starting point for >>>>>>>>>>>>>> that and would you be willing to share it? >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> The Amazon AWS SDK for Java is Apache 2 licensed, so that is >>>>>>>>>>>>>> possible to fork officially, if necessary... >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Greetings, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Stephan >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Mon, Feb 27, 2017 at 5:15 PM, Seth Wiesman <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11943&i=0>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Just wanted to throw in my 2cts. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> I’ve been running pipelines with similar state size using >>>>>>>>>>>>>> rocksdb which externalize to S3 and bucket to S3. I was getting >>>>>>>>>>>>>> stalls like >>>>>>>>>>>>>> this and ended up tracing the problem to S3 and the bucketing >>>>>>>>>>>>>> sink. The >>>>>>>>>>>>>> solution was two fold: >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 1) I forked hadoop-aws and have it treat flink as a >>>>>>>>>>>>>> source of truth. Emr uses a dynamodb table to determine if S3 is >>>>>>>>>>>>>> inconsistent. 
Instead I say that if flink believes that a file >>>>>>>>>>>>>> exists on S3 >>>>>>>>>>>>>> and we don’t see it then I am going to trust that flink is in a >>>>>>>>>>>>>> consistent >>>>>>>>>>>>>> state and S3 is not. In this case, various operations will >>>>>>>>>>>>>> perform a back >>>>>>>>>>>>>> off and retry up to a certain number of times. >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> 2) The bucketing sink performs multiple renames over >>>>>>>>>>>>>> the lifetime of a file, occurring when a checkpoint starts and >>>>>>>>>>>>>> then again >>>>>>>>>>>>>> on notification after it completes. Due to S3’s consistency >>>>>>>>>>>>>> guarantees the >>>>>>>>>>>>>> second rename of file can never be assured to work and will >>>>>>>>>>>>>> eventually fail >>>>>>>>>>>>>> either during or after a checkpoint. Because there is no upper >>>>>>>>>>>>>> bound on the >>>>>>>>>>>>>> time it will take for a file on S3 to become consistent, retries >>>>>>>>>>>>>> cannot >>>>>>>>>>>>>> solve this specific problem as it could take upwards of many >>>>>>>>>>>>>> minutes to >>>>>>>>>>>>>> rename which would stall the entire pipeline. The only viable >>>>>>>>>>>>>> solution I >>>>>>>>>>>>>> could find was to write a custom sink which understands S3. Each >>>>>>>>>>>>>> writer >>>>>>>>>>>>>> will write file locally and then copy it to S3 on checkpoint. By >>>>>>>>>>>>>> only >>>>>>>>>>>>>> interacting with S3 once per file it can circumvent consistency >>>>>>>>>>>>>> issues all >>>>>>>>>>>>>> together. 
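[Editor's example] Seth's design above (each writer stages data locally and touches S3 exactly once per file, on checkpoint) can be illustrated with a simplified, self-contained sketch. `StagingSink` and the local "upload" directory are hypothetical stand-ins for the real Flink sink and the S3 client; the real implementation would hook into Flink's checkpoint callbacks:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Simplified illustration of the pattern: buffer records locally and
// publish them in a single write per checkpoint, so no rename on the
// (eventually consistent) store is ever needed.
public class StagingSink {
    private final List<String> buffer = new ArrayList<>();
    private final Path uploadDir; // stand-in for the S3 bucket
    private int checkpointId = 0;

    public StagingSink(Path uploadDir) {
        this.uploadDir = uploadDir;
    }

    // Per-record path: cheap local buffering, no remote interaction.
    public void invoke(String record) {
        buffer.add(record);
    }

    // Called once per checkpoint: write the staged data as one new object.
    public Path snapshot() throws IOException {
        Path part = uploadDir.resolve("part-" + (checkpointId++));
        Files.write(part, buffer);
        buffer.clear();
        return part;
    }
}
```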
>>>>>>>>>>>>>> Hope this helps, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Seth Wiesman >>>>>>>>>>>>>> >>>>>>>>>>>>>> *From: *vinay patil <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11943&i=1>> *Reply-To: *"[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11943&i=2>" <[hidden email] <http://user/SendEmail.jtp?type=node&node=11943&i=3>> *Date: *Saturday, February 25, 2017 at 10:50 AM *To: *"[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11943&i=4>" <[hidden email] <http://user/SendEmail.jtp?type=node&node=11943&i=5>> *Subject: *Re: Checkpointing with RocksDB as statebackend >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Stephan, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Just to avoid confusion here: I am using the S3 sink for >>>>>>>>>>>>>> writing the data and HDFS for storing checkpoints. >>>>>>>>>>>>>> >>>>>>>>>>>>>> There are 2 core nodes (HDFS) and two task nodes on EMR. >>>>>>>>>>>>>> >>>>>>>>>>>>>> I replaced the S3 sink with HDFS for writing data in my last test. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Let's say the checkpoint interval is 5 minutes. If within >>>>>>>>>>>>>> 5 minutes of the run the state size grows to 30GB, then on checkpointing the >>>>>>>>>>>>>> 30GB of state maintained in RocksDB has to be copied to HDFS, right? >>>>>>>>>>>>>> Is this causing the pipeline to stall? 
>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Vinay Patil >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Sat, Feb 25, 2017 at 12:22 AM, Vinay Patil <[hidden >>>>>>>>>>>>>> email]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Stephan, >>>>>>>>>>>>>> >>>>>>>>>>>>>> To verify whether S3 is making the pipeline stall, I replaced >>>>>>>>>>>>>> the S3 sink with HDFS and kept the minimum pause between checkpoints at >>>>>>>>>>>>>> 5 minutes; I still see the same issue with checkpoints failing. >>>>>>>>>>>>>> >>>>>>>>>>>>>> If I keep the pause time at 20 seconds, all checkpoints >>>>>>>>>>>>>> complete, but there is a hit to overall throughput. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Vinay Patil >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 10:09 PM, Stephan Ewen [via Apache >>>>>>>>>>>>>> Flink User Mailing List archive.] <[hidden email]> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Flink's state backends currently do a good number of "make >>>>>>>>>>>>>> sure this exists" operations on the file systems. Through Hadoop's S3 >>>>>>>>>>>>>> filesystem, that translates to S3 bucket list operations, for which there is a >>>>>>>>>>>>>> limit on how many operations may happen per time interval. After that, S3 >>>>>>>>>>>>>> blocks. >>>>>>>>>>>>>> >>>>>>>>>>>>>> It seems that operations that are totally cheap on HDFS are >>>>>>>>>>>>>> hellishly expensive (and limited) on S3. It may be that you are affected by >>>>>>>>>>>>>> that. >>>>>>>>>>>>>> >>>>>>>>>>>>>> We are gradually trying to improve the behavior there and be >>>>>>>>>>>>>> more S3 aware. 
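[Editor's example] The checkpoint pacing Vinay describes (interval plus minimum pause between checkpoints) is set on the environment's checkpoint config. A minimal sketch; the 5-minute values mirror the test above, and the timeout is an illustrative addition:

```java
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointPacing {
    public static void main(String[] args) {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(5 * 60 * 1000); // trigger a checkpoint every 5 minutes
        // Guarantee a gap after each completed checkpoint before the next starts.
        env.getCheckpointConfig().setMinPauseBetweenCheckpoints(5 * 60 * 1000);
        // Bound how long a slow (e.g. S3-throttled) checkpoint may take before
        // it is discarded.
        env.getCheckpointConfig().setCheckpointTimeout(10 * 60 * 1000);
    }
}
```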
>>>>>>>>>>>>>> Both 1.3-SNAPSHOT and 1.2-SNAPSHOT already contain >>>>>>>>>>>>>> improvements there. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Stephan >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 4:42 PM, vinay patil <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11891&i=0>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Stephan, >>>>>>>>>>>>>> >>>>>>>>>>>>>> So do you mean that S3 is causing the stall? As I mentioned >>>>>>>>>>>>>> in my previous mail, I could not see any progress for 16 minutes because >>>>>>>>>>>>>> checkpoints were failing continuously. >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Feb 24, 2017 8:30 PM, "Stephan Ewen [via Apache Flink User >>>>>>>>>>>>>> Mailing List archive.]" <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11887&i=0>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Vinay! >>>>>>>>>>>>>> >>>>>>>>>>>>>> True, the operator state (like Kafka) is currently not >>>>>>>>>>>>>> asynchronously checkpointed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> While it is rather small state, we have seen before that on >>>>>>>>>>>>>> S3 it can cause trouble, because S3 frequently stalls uploads of even data >>>>>>>>>>>>>> amounts as low as kilobytes due to its throttling policies. >>>>>>>>>>>>>> >>>>>>>>>>>>>> That would be a super important fix to add! 
>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Stephan >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Fri, Feb 24, 2017 at 2:58 PM, vinay patil <[hidden email] >>>>>>>>>>>>>> <http://user/SendEmail.jtp?type=node&node=11885&i=0>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi, >>>>>>>>>>>>>> >>>>>>>>>>>>>> I have attached a snapshot for reference. As you can see, all >>>>>>>>>>>>>> 3 checkpoints failed; for checkpoint IDs 2 and 3 the pipeline >>>>>>>>>>>>>> is stuck at the Kafka source after 50%. >>>>>>>>>>>>>> (The data sent so far by Kafka source 1 is 65GB and by source 2 is >>>>>>>>>>>>>> 15GB.) >>>>>>>>>>>>>> >>>>>>>>>>>>>> Within 10 minutes, 15M records were processed; for the next >>>>>>>>>>>>>> 16 minutes the pipeline was stuck and I saw no progress beyond 15M because >>>>>>>>>>>>>> checkpoints kept failing consistently. >>>>>>>>>>>>>> >>>>>>>>>>>>>> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/file/n11882/Checkpointing_Failed.png> >>>>>>>>>>>>>> >>>>>>>>>>>>>> -- >>>>>>>>>>>>>> View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Re-Checkpointing-with-RocksDB-as-statebackend-tp11752p11882.html >>>>>>>>>>>>>> Sent from the Apache Flink User Mailing List archive. mailing list archive at Nabble.com. 