Re: custom writer fail to recover

2017-12-18 Thread xiatao123
Hi Das, Have you got your .pending issue resolved? I am running into the same issue where the parquet files are all in pending status. Please help to share your solutions. Thanks, Tao -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Pending parquet file with Bucking Sink

2017-12-20 Thread xiatao123
Hi Vipul, Thanks for the information. Yes, I do have checkpointing enabled with 10 millisecs. I think the issue here is that the stream ended before the checkpoint reached. This is a testing code that the DataStream only have 5 events then it ended. Once the stream ended, the checkpoint is

Re: Checkpoint is not triggering as per configuration

2018-05-09 Thread xiatao123
I ran into a similar issue. Since it is a "Custom File Source", the first source just listing folder/file path for all existing files. Next operator "Split Reader" will read the content of the file. "Custom File Source" went to "finished" state after first couple secs. That's way we got this

Re: Anyone got Flink working in EMR with KinesisConnector

2018-01-10 Thread xiatao123
got the issue fixed after applying patch from https://github.com/apache/flink/pull/4150 -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

NoClassDefFoundError of a Avro class after cancel then resubmit the same job

2018-01-18 Thread xiatao123
Not sure why, when I submit the job at the first time after a cluster launch, it is working fine. After I cancelled the first job, then resubmit the same job again, it will hit the NoClassDefFoundError. Very weird, feels like some clean up of a cancelled job messed up future job of the same

How to access JobManager and TaskManager

2018-01-30 Thread xiatao123
In the web UI, I can see these information under JobManager. How can I access variables job_env in main method? Job Manager Configuration env.hadoop.conf.dir /etc/hadoop/conf env.yarn.conf.dir /etc/hadoop/conf high-availability.cluster-idapplication_1517362137681_0001 job_env stage

Re: How to access JobManager and TaskManager

2018-01-31 Thread xiatao123
Hi Tim, "job_env" is a variable I passed to launch YARN application. I just want to access it in my flink application main method. There is is no documentation on how to access customized job environment variables or settings. Thanks, Tao -- Sent from:

how to match external checkpoints with jobs during recovery

2018-02-05 Thread xiatao123
The external checkpoints are in the format of checkpoint_metadata-0057 which I have no idea which job this checkpoint metadata belongs to if I have multiple jobs running at the same time. If a job failed unexpected, I need to know which checkpoints belongs to the failed job. Is there API

Support COUNT(DISTINCT 'field') Query Yet?

2018-02-14 Thread xiatao123
SELECT TUMBLE_START(event_timestamp, INTERVAL '1' HOUR), COUNT(DISTINCT session), COUNT(DISTINCT user_id), SUM(duration), SUM(num_interactions) FROM unified_events GROUP BY TUMBLE(event_timestamp, INTERVAL '1' HOUR) I have the above statement my flink query running on Flink 1.3.2, but got the