Re: Job continuously failing after Checkpoint Restore

2019-03-06 Thread Yun Tang
e you ever changed anything before resuming your job? 4. If trying to restore checkpoint-60 again by submitting another job, will you also meet this NPE continuously again? Best Yun Tang From: Laura Uzcátegui Sent: Wednesday, March 6, 2019 21:35 To: user Subject

Re: Checkpoint recovery and state external to flink

2019-03-05 Thread Yun Tang
/features/2018/03/01/end-to-end-exactly-once-apache-flink.html [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html Best Yun Tang From: Aggarwal, Ajay Sent: Tuesday, March 5, 2019 4:08 To: user@flink.apache.org Subject: Checkpoint recovery

Re: KeyBy distribution across taskslots

2019-02-28 Thread Yun Tang
node just like Hadoop's combine and reduce in some way. Jark (in CC) might provide more information. Best Yun Tang [1] https://github.com/apache/flink/blob/2be5f47fb62126fa3a35e44459e660c39e9e0a39/flink-libraries/flink-table/src/main/java/org/apache/flink/table/api/TableConfigOptions.java#L39

Re: How to use my custom log4j.properties when running minicluster in idea

2019-02-20 Thread Yun Tang
/resources/log4j.properties . You could compare your directory structure and the file contents with those I mentioned above and I believe you could finally make it work. Best Yun Tang From: peibin wang Sent: Tuesday, February 19, 2019 12:04 To: user@flink.apache.org

Re: StandAlone job on k8s fails with "Unknown method truncate" on restore

2019-02-14 Thread Yun Tang
email, the more serious problem should be using 'Buckets' with Hadoop-2.6. From what I know the `RecoverableWriter` within 'Buckets' can only support Hadoop-2.7+ , I'm not sure whether existed work around solution. Best Yun Tang From: Vishal Santoshi Sent: Friday

Re: 如何正确的利用StateTtlConfig为State设置过期时间

2019-02-09 Thread Yun Tang
Hi 银兵 1. StateTtlConfig目前只能做到state层级的,目前还不支持每个key设置一个超时时间的。如果过期时间比较短的话,其实可以考虑给key附上自定义的时间戳用window来满足你的需求。 2. 就我所知,queryable state的工业级应用的话,有一个拿来做记账系统的[1],他们也在Github上分享了相关源码demo[2]。还有一个是实验性质或者说更像爱好领域的应用,将queryable state用在在线机器学习中[3]。国内似乎没有大规模使用queryable state的,如果您那里有什么进展或者尝试也欢迎分享。 [1]

Re: late element and expired state

2019-02-06 Thread Yun Tang
state not so large, e.g. only keep a fixed size of expired states. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/event_timestamps_watermarks.html Best Yun Tang From: Aggarwal, Ajay Sent: Tuesday, February 5, 2019 22:54 To: user@flink.apach

Re: Videos and slides on Flink Forward Beijing

2019-02-01 Thread Yun Tang
Hi Paul You could find slides here https://github.com/flink-china/flink-forward-china-2018, many talks are given in Chinese but most of slides are presented both in Chinese and English. Best Yun Tang From: Congxian Qiu Sent: Friday, February 1, 2019 21:55

Re: Writing a custom Rocksdb statistics collector

2019-02-01 Thread Yun Tang
Tang From: Harshvardhan Agrawal Sent: Friday, February 1, 2019 1:35 To: Yun Tang Cc: user Subject: Re: Writing a custom Rocksdb statistics collector It looks like the DBOptions that are created by the OptionsFactory class are used for opening RocksDB. And yes I

Re: Writing a custom Rocksdb statistics collector

2019-01-30 Thread Yun Tang
/config.html#rocksdb-native-metrics<https://ci.apache.org/projects/flink/flink-docs-release-1.7/ops/config.html#rocksdb-native-metrics> Best Yun Tang From: Harshvardhan Agrawal Sent: Thursday, January 31, 2019 0:23 To: user Subject: Writing a custom Rocksdb stat

Re: SQL Client (Streaming + Checkpoint)

2019-01-28 Thread Yun Tang
, not production-ready yet. Best Yun Tang From: Vijay Srinivasaraghavan Sent: Tuesday, January 29, 2019 0:53 To: User Subject: SQL Client (Streaming + Checkpoint) It looks like the SQL client does not configure enable checkpoint while submitting the streaming job

Re: How to infer table schema from Avro file

2019-01-28 Thread Yun Tang
+ Flink Users From: Yun Tang Sent: Monday, January 28, 2019 19:46 To: Soheil Pourbafrani Subject: Re: How to infer table schema from Avro file Hi Soheil You should provide your generated Avro record class as the type of AvroInputFormat not Avro's GenericRecord

Re: getting an error when configuring state backend to hdfs

2018-12-23 Thread Yun Tang
and export your HADOOP_CLASSPATH [1] [1] https://flink.apache.org/downloads.html#latest-stable-release-v171 Best Yun Tang From: Avi Levi Sent: Thursday, December 20, 2018 2:11 To: Steven Nelson Cc: Chesnay Schepler; user@flink.apache.org Subject: Re: getting an

Re: number of files in checkpoint directory grows endlessly

2018-12-06 Thread Yun Tang
other possible causes. Best Yun Tang From: Andrey Zagrebin Sent: Thursday, December 6, 2018 19:07 To: bernd.winterst...@dev.helaba.de Cc: Yun Tang; Kostas Kloudas; user; Stefan Richter; Till Rohrmann; Stephan Ewen Subject: Re: number of files in checkpoint

Re: Questions about Savepoints

2018-11-05 Thread Yun Tang
cts/flink/flink-docs-release-1.6/ops/cli.html#cancel-with-a-savepoint Best Yun Tang From: Ning Shi Sent: Monday, November 5, 2018 8:28 To: user@flink.apache.org Subject: Questions about Savepoints I have the following questions regarding savepoint recovery. - In my j

Re: Job manager UI improvement

2018-11-01 Thread Yun Tang
isplay information about your checkpoints: Overview, History, Summary, and Configuration. The following sections will cover all of ... ci.apache.org Best Yun Tang From: Michael Latta Sent: Friday, November 2, 2018 10:08 To: user@flink.apache.org Subject: Job mana

Re: Starting a seperate Java process within a Flink cluster

2018-11-01 Thread Yun Tang
ven-assembly-plugin/> or shaded-maven<https://maven.apache.org/plugins/maven-shade-plugin/> plugin to include your classes. Best Yun Tang From: Ly, The Anh Sent: Friday, November 2, 2018 6:33 To: user@flink.apache.org Subject: Starting a seperate Java pr

Re: 1.6 UI issues

2018-11-01 Thread Yun Tang
taskmanager.heap.size "1024m" JVM heap size for the TaskManagers, which are the parallel workers of the system. ci.apache.org Best Yun Tang From: Juan Gentile Sent: Wednesday, October 31, 2018 22:05 To: user@flink.apache.org Subject: 1.6 UI issues Hello!

Re: Savepoint failed with error "Checkpoint expired before completing"

2018-11-01 Thread Yun Tang
l-list. Best Yun Tang From: Gagan Agrawal Sent: Thursday, November 1, 2018 13:38 To: myas...@live.com Cc: happydexu...@gmail.com; user@flink.apache.org Subject: Re: Savepoint failed with error "Checkpoint expired before completing" Thanks Yun for

Re: Flink weird checkpointing behaviour

2018-10-31 Thread Yun Tang
gram’s state will eventually reflect every record from the data stream exactly once. Note that there is a switch to ... ci.apache.org Best Yun Tang From: Pawel Bartoszek Sent: Wednesday, October 24, 2018 23:11 To: User Subject: Flink weird checkpointing behaviour Hi,

Re: Savepoint failed with error "Checkpoint expired before completing"

2018-10-31 Thread Yun Tang
s; Checkpoints. Overview; Retained Checkpoints. Directory Structure; Difference to Savepoints; Resuming from a retained checkpoint ci.apache.org Best Yun Tang From: Gagan Agrawal Sent: Wednesday, October 31, 2018 19:03 To: happydexu...@gmail.com Cc: user@flink.apac

Re: Rocksdb Metrics

2018-09-26 Thread Yun Tang
Hi Sayat Before this future is on, you could also find some metrics information, such as hit/miss count, file status from RocksDB itself. By default, RocksDB will dump its stats to its information LOG file every 10 minutes (you could call DBOptions.setStatsDumpPeriodSec to reduce the time

Re: RocksDB Read IOPs

2018-09-26 Thread Yun Tang
also acts as a role for reading, from our experience, we use 4 max write buffers and 32MB each, e.g. setMaxWriteBufferNumber(4) and setWriteBufferSize(32*1024*1024) Best Yun Tang [1] https://github.com/facebook/rocksdb/wiki/Block-Cache#caching-index-and-filter-blocks [https:/

Re: JobManager container is running beyond physical memory limits

2018-09-25 Thread Yun Tang
Hi If your JM's container is killed by YARN due to beyond physical memory limit and your job's code is not changed but just bumped the Flink verion , I think you could use jmap command to dump the memory of your JobManager to see the difference between 1.4.2 and 1.5.2, and you could also open

Re: What is the right way to add classpath?

2018-09-12 Thread Yun Tang
to https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/cli.html#usage Best Yun Tang From: bupt_ljy Sent: Wednesday, September 12, 2018 16:34 To: user Subject: What is the right way to add classpath? Hi,all My program needs some dependencies before it’s

Re: Flink failure recovery tooks very long time

2018-09-06 Thread Yun Tang
duration. Best Yun Tang [1] https://ci.apache.org/projects/flink/flink-docs-release-1.6/monitoring/checkpoint_monitoring.html#checkpoint-details From: trung kien Sent: Thursday, September 6, 2018 22:31 To: vino yang Cc: myas...@live.com; user Subject: Re: Flink

Re: Flink failure recovery tooks very long time

2018-09-06 Thread Yun Tang
barrier to trigger the checkpoint. You can watch your metrics of checkpoint alignment time to verify the root cause, and if you do not need the exactly once guarantees, you can change the checkpoint mode to at-least-once[2]. Best Yun Tang [1] https://ci.apache.org/projects/flink/flink-doc

Re: Increased Size of Incremental Checkpoint

2018-09-06 Thread Yun Tang
+ user mail list From: Yun Tang Sent: Thursday, September 6, 2018 14:36 To: burgesschen Subject: Re: Increased Size of Incremental Checkpoint Hi I think the "checkpoint size" metrics showed in your graph means the total checkpoint size of

Re: Cannot configure akka.ask.timeout

2018-07-18 Thread Yun Tang
Hi Lukas >From your first two steps' description ("started this in Intellij") and the >exception log, I think you run your program locally within Intellij with >LocalStreamEnvironment. You can view the configuration related code from

Re: Ever increasing key space

2018-07-16 Thread Yun Tang
Hi Chen >From your description, I think you called keyedState.clear() to clear up the >key which has not been seen for several minutes. * For HeapKeyedStateBackend, it will just remove the related content from memory immediately, no worry about the increasing checkpoint size. * For

<    1   2   3   4   5   6