Re: Auto/Dynamic scaling in Flink

2018-11-13 Thread Tzu-Li (Gordon) Tai
Hi, Flink does not support auto-scaling, yet. Rescaling operations are currently always manual, i.e., take a savepoint of the Flink job, and when restoring from the savepoint, define a new parallelism for the job. As for the metrics to be used for auto-scaling, I can imagine that it would be possible …
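
A hedged sketch of that manual workflow from the CLI; the job id, savepoint directory, and new parallelism below are placeholders:

```sh
# Take a savepoint, stop the job, then restore with a different parallelism.
bin/flink savepoint <jobId> hdfs:///flink/savepoints     # trigger a savepoint
bin/flink cancel <jobId>                                 # stop the old job
bin/flink run -s hdfs:///flink/savepoints/savepoint-<id> -p 8 my-job.jar
```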

Re: Kinesis Shards and Parallelism

2018-11-13 Thread Tzu-Li (Gordon) Tai
Hi, Another detail not that apparent in the description is that the assignment would only be evenly distributed assuming that the open Kinesis shards have consecutive shard ids, and are of the same Kinesis stream. Once you reshard a Kinesis stream, it could be that the shard ids are no longer consecutive …

Re: Flink Streaming sink to InfluxDB

2018-11-13 Thread Tzu-Li (Gordon) Tai
Hi, This is most likely an exception that indicates either that 1) you are using mismatching versions of Flink in your application code and the installed Flink cluster, or 2) your application code isn't properly packaged. From your exception, I'm guessing it is the latter case. If so, I would suggest …

Re: Queryable state when key is UUID - getting Kyro Exception

2018-10-25 Thread Tzu-Li (Gordon) Tai
Hi Jayant, What is the Kryo exception message that you are getting? Cheers, Gordon On 25 October 2018 at 5:17:13 PM, Jayant Ameta (wittyam...@gmail.com) wrote: Hi, I've not configured any serializer in the descriptor. (Neither in flink job, nor in state query client). Which serializer should

Re: Fail to recover Keyed State after ReinterpretAsKeyedStream

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi Jose, As far as I know, you should be able to use keyed state on a stream returned by the DataStreamUtils.reinterpretAsKeyedStream function. That shouldn’t be the issue here. Have you looked into the logs for any meaningful exceptions of why the restore failed? That would be helpful here to understand …

Re: cannot find symbol of "fromargs"

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi! How are you packaging your Flink program? This looks like a simple dependency error. If you don’t know where to start when beginning to write your Flink program, the quickstart Maven templates are always a good place to begin with [1]. Cheers, Gordon [1]  https://ci.apache.org/projects/fli

Re: Question over Incremental Snapshot vs Full Snapshot in rocksDb state backend

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi, I’m forwarding this question to Stefan (cc’ed). He would most likely be able to answer your question, as he has done substantial work in the RocksDB state backends. Cheers, Gordon On 24 October 2018 at 8:47:24 PM, chandan prakash (chandanbaran...@gmail.com) wrote: Hi, I am new to Flink.

Re: Get request header from Kinesis

2018-10-24 Thread Tzu-Li (Gordon) Tai
Hi, Could you point to the AWS Kinesis Java API that exposes record headers? As far as I can tell from the Javadoc, I can’t seem to find how to retrieve headers from Kinesis records. If there is a way to do that, then it might make sense to expose that from the Kinesis connector’s serialization

Re: Using FlinkKinesisConsumer through a proxy

2018-10-04 Thread Tzu-Li (Gordon) Tai
Hi, Since Flink 1.5, you should be able to set all available configurations on the ClientConfiguration through the consumer Properties (see FLINK-9188 [1]). The way to do that would be to prefix the configuration you want to set with "aws.clientconfig" and add that to the properties, as such: …
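
The inline example was cut off in this preview. A hedged reconstruction of the approach, where the "proxyHost" / "proxyPort" suffixes mirror AWS ClientConfiguration setter names and are assumptions to verify against your connector version:

```java
Properties consumerConfig = new Properties();
consumerConfig.setProperty(ConsumerConfigConstants.AWS_REGION, "us-east-1");
// Keys with the "aws.clientconfig" prefix are forwarded to the AWS ClientConfiguration.
consumerConfig.setProperty("aws.clientconfig.proxyHost", "proxy.example.com");
consumerConfig.setProperty("aws.clientconfig.proxyPort", "3128");

FlinkKinesisConsumer<String> consumer = new FlinkKinesisConsumer<>(
        "my-stream", new SimpleStringSchema(), consumerConfig);
```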

Re: Why FlinkKafkaConsumer doesn't subscribe to topics?

2018-09-04 Thread Tzu-Li (Gordon) Tai
Hi Julio, As Renjie had already mentioned, to achieve exactly-once semantics with the Kafka consumer, Flink needs to have control over the Kafka partition to source subtask assignment. To add a bit more detail here, this is due to the fact that each subtask writes to Flink managed state the current …

Re: Flink job is not reading from certain kafka topic partitions

2018-08-07 Thread Tzu-Li (Gordon) Tai
Hi, The case you described looks a lot like this issue with the Flink Kafka Consumer in 1.3.0 / 1.3.1: https://issues.apache.org/jira/browse/FLINK-7143 If this is the case, you would have to upgrade to 1.3.2 or above to overcome this. The issue ticket description contains some things to keep in mind …

Re: Elasticsearch 6.3.x connector

2018-07-18 Thread Tzu-Li (Gordon) Tai
Hi Miki, The latest stable version of the Elasticsearch connector, as of Flink 1.5.x, is Elasticsearch 5.x. As for Elasticsearch 6.x, there have been some PRs that have been open for a while and have already been discussed quite thoroughly [1] [2]. Till and I have talked about merging these for

Re: How to deploy Flink in a geo-distributed environment

2018-06-28 Thread Tzu-Li (Gordon) Tai
Hi, It should be possible to deploy a single Flink cluster across geo-distributed nodes, but Flink currently offers no optimization for such a specific use case. AFAIK, the general pattern for dealing with geographically distributed data sources right now, would be to replicate data across cluster

Re: Flink kafka consumers don't honor group.id

2018-06-28 Thread Tzu-Li (Gordon) Tai
Hi Giriraj, The fact that the Flink Kafka Consumer doesn't use the group.id property, is an expected behavior. Since the partition-to-subtask assignment of the Flink Kafka Consumer needs to be deterministic in Flink, the consumer uses static assignment instead of the more high-level consumer group

RE: Debug job execution from savepoint

2018-06-22 Thread Tzu-Li (Gordon) Tai
Hi, The tests in Flink use an `AbstractStreamOperatorTestHarness` that allows wrapping an operator, feeding input elements into the operator, getting the outputs, and also snapshotting / restoring operator state. I’m not sure of your specific case, but in general that test harness utility can be used to …

[NOTICE] Flink Kinesis Producer for versions 1.4.2 and below needs to be built with higher AWS KPL version

2018-06-22 Thread Tzu-Li (Gordon) Tai
Hi, This is a notice for users who use the Flink Kinesis connector to produce data to AWS Kinesis Streams, with Flink versions 1.4.2 or below. For Flink versions 1.4.2 and below, the KPL client version used by default in the Kinesis connectors, KPL 0.12.5, is no longer supported by AWS Kinesis

Re: Backpressure from producer with flink connector kinesis 1.4.2

2018-06-21 Thread Tzu-Li (Gordon) Tai
…) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) From: "Liu, Gavin (CAI - Atlanta)" Date: Wednesday, June 20, 2018 at 12:11 PM To: "Tzu-…

Re: A question about Kryo and Window State

2018-06-21 Thread Tzu-Li (Gordon) Tai
Hi Vishal, Kryo has a serializer called `CompatibleFieldSerializer` that allows for simple backward compatibility changes, such as adding non-optional fields / removing fields. If using the KryoSerializer is a must, then a good thing to do is to register Kryo's `CompatibleFieldSerializer` as …
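
A minimal sketch of that registration, assuming a placeholder state type MyEvent:

```java
// Register Kryo's CompatibleFieldSerializer for a type whose fields may evolve.
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.getConfig().registerTypeWithKryoSerializer(
        MyEvent.class, CompatibleFieldSerializer.class);
```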

Re: Backpressure from producer with flink connector kinesis 1.4.2

2018-06-20 Thread Tzu-Li (Gordon) Tai
Hi Gavin, The problem is that the Kinesis producer currently does not propagate backpressure properly. Records are added to the internally used KPL client’s queue, without any queue size limit. This is considered a bug, and already has a pull request for it [1], which we should probably push to …

Re: How to get past "bad" Kafka message, restart, keep state

2018-06-20 Thread Tzu-Li (Gordon) Tai
Hi, You can “skip” the corrupted message by returning `null` from the deserialize method on the user-provided DeserializationSchema. This lets the Kafka connector consider the record as processed, advances the offset, but doesn’t emit anything downstream for it. Hope this helps! Cheers, Gordon
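
A sketch of such a schema; MyEvent and its parse helper are placeholders:

```java
public class SkippingSchema implements DeserializationSchema<MyEvent> {
    @Override
    public MyEvent deserialize(byte[] message) throws IOException {
        try {
            return MyEvent.parse(message);  // assumed application-specific parsing
        } catch (Exception e) {
            return null;  // null => record is treated as processed and skipped
        }
    }

    @Override
    public boolean isEndOfStream(MyEvent nextElement) {
        return false;  // never end the stream based on element contents
    }

    @Override
    public TypeInformation<MyEvent> getProducedType() {
        return TypeInformation.of(MyEvent.class);
    }
}
```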

Re: Problem producing to Kinesis

2018-06-15 Thread Tzu-Li (Gordon) Tai
…> :-) > > Alexey > > On Thu, Jun 14, 2018 at 12:20 PM Tzu-Li (Gordon) Tai > wrote: > > > Hi, > > > > This could be related: > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writin…

Re: Problem producing to Kinesis

2018-06-14 Thread Tzu-Li (Gordon) Tai
Hi, This could be related: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Flink-1-4-and-below-STOPS-writing-to-Kinesis-after-June-12th-td22687.html#a22701. Shortly put, the KPL library version used by default in the 1.4.x Kinesis connector is no longer supported by AWS.

Re: flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-06-14 Thread Tzu-Li (Gordon) Tai
…"records_lag_max" seems to be new (with the attempt thingy). On the "KafkaConsumer" front, it only has the "commited_offset" for each partition. On Wed, Jun 13, 2018 at 5:41 AM, Tzu-Li (Gordon) Tai wrote: Hi, Which Kafka version are you using? AFAIK, the only recent

Re: flink_taskmanager_job_task_operator_records_lag_max == -Inf on Flink 1.4.2

2018-06-13 Thread Tzu-Li (Gordon) Tai
Hi, Which Kafka version are you using? AFAIK, the only recent changes to Kafka connector metrics in the 1.4.x series would be FLINK-8419 [1]. The ‘records_lag_max’ metric is a Kafka-shipped metric simply forwarded from the internally used Kafka client, so nothing should have been affected. Do

Re: Implementation of ElasticsearchSinkFunction, how to handle class level variables

2018-06-12 Thread Tzu-Li (Gordon) Tai
Hi Jayant, Yes, you don’t have to use an anonymous class for the sink function. An actual separate class works just as well. The class fields don’t need to be marked as transient or checkpointed, since they should just be constants that come with instantiation of the sink function, or could even …

Re: Kryo Exception

2018-05-24 Thread Tzu-Li (Gordon) Tai
Hi, FYI, this is the JIRA ticket for the issue: https://issues.apache.org/jira/browse/FLINK-8836 Yes, this seems to be only included in 1.5.0 (to be released), and 1.4.3 (there has been no discussion on releasing that yet). It could also be possible that the reported issue was caused by …

Re: Flink does not read from some Kafka Partitions

2018-05-16 Thread Tzu-Li (Gordon) Tai
Hi, Timo is correct - partition discovery is supported by the consumer only starting from Flink 1.4. The expected behaviour without partition discovery is that the list of partitions picked up on the first execution of the job will be the list of subscribed partitions across all executions. …

Re: Dynamically deleting kafka topics does not remove partitions from kafkaConsumer

2018-05-06 Thread Tzu-Li (Gordon) Tai
Ah, correct, sorry for the incorrect link. Thanks Ted! On 7 May 2018 at 11:43:12 AM, Ted Yu (yuzhih...@gmail.com) wrote: It seems the correct JIRA should be FLINK-9303 On Sun, May 6, 2018 at 8:29 PM, Tzu-Li (Gordon) Tai wrote: Hi Edward, Thanks for bringing this up, and I think your

Re: Question regarding refreshing KafkaConsumer in FlinkKafkaConnector

2018-05-06 Thread Tzu-Li (Gordon) Tai
Hi, AFAIK, there is no built-in feature for scheduling / orchestrating submission of Flink jobs. However, you should be able to easily use tools like cron jobs to do that. It should work by just taking a savepoint of your running job, and then resuming from that, and you do this periodically. Cheers, …

Re: Assign JIRA issue permission

2018-05-06 Thread Tzu-Li (Gordon) Tai
Hi Sampath, Do you already have a target JIRA that you would like to work on? Once you have one, let us know the JIRA issue ID and your JIRA account ID, then we'll assign you contributor permissions. With that, you can pick up unassigned JIRA issues to work on by yourself in the future. Cheers,

Re: Dynamically deleting kafka topics does not remove partitions from kafkaConsumer

2018-05-06 Thread Tzu-Li (Gordon) Tai
Hi Edward, Thanks for bringing this up, and I think your suggestion makes sense. The problem is that the Kafka consumer has no notion of "closed" partitions at the moment, so partitions statically assigned to the Kafka client are never removed and are always continuously requested for records. For example, …

Re: Testing Metrics

2018-04-25 Thread Tzu-Li (Gordon) Tai
Hi, Do you mean tests to verify that some metric is actually registered? AFAIK, this is not really easy to do as a unit test. One possible way is to have an integration test that uses a metrics reporter, against which you can verify. For example, the Kafka consumer integration tests that use …

Re: Flink Kafka connector not exist

2018-04-19 Thread Tzu-Li (Gordon) Tai
Hi Sebastien, You need to add the dependency under a “dependencies” section, like so: … Then it should be working. I would also recommend using the Flink quickstart Maven templates [1], as they already have a well-defined Maven project skeleton for Flink jobs. Cheers, Gordon [1] …
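
For illustration, a hedged sketch of such a "dependencies" section; the Kafka/Scala version suffixes and the Flink version are placeholders to adjust to your setup:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-connector-kafka-0.11_2.11</artifactId>
    <version>1.4.2</version>
  </dependency>
</dependencies>
```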

Re: Tracking deserialization errors

2018-04-19 Thread Tzu-Li (Gordon) Tai
@Alexander Sorry about that, that would be my mistake. I’ll close FLINK-9204 as a duplicate and leave my thoughts on FLINK-9155. Thanks for pointing out! On 19 April 2018 at 2:00:51 AM, Elias Levy (fearsome.lucid...@gmail.com) wrote: Either proposal would work.  In the later case, at a minimum

Re: Flink & Kafka multi-node config

2018-04-18 Thread Tzu-Li (Gordon) Tai
Hi, The partition-to-subtask assignment is not locality-aware. There were discussions to expose functionality for custom user-defined assignment methods, which it might be possible to leverage for a locality-aware assignment. Unfortunately, this feature is not implemented yet. …

Re: Flink job testing with

2018-04-18 Thread Tzu-Li (Gordon) Tai
Hi, The docs here [1] provide some example snippets of using the Kafka connector to consume from / write to Kafka topics. Once you consumed a `DataStream` from a Kafka topic using the Kafka consumer, you can use Flink transformations such as map, flatMap, etc. to perform processing on the records

Re: Tracking deserialization errors

2018-04-18 Thread Tzu-Li (Gordon) Tai
Hi, These are valid concerns. And yes, AFAIK users have been writing to logs within the deserialization schema to track this. The connectors as of now have no logging themselves in case of a skipped record. I think we can implement both logging and metrics to track this, most of which you have

Re: Lot of data generated in out file

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi Ashish, I don't really see why there are outputs in the out file for the program you provided. Perhaps others could chime in here. As for your second question regarding window outputs: Yes, subsequent window operators should definitely be doable in Flink. This is just a matter of multiple transformations …

Re: Watermark and multiple streams

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi, How are you registering your event-time timers in processElement? If you are continuously registering them, and watermarks are correctly generated upstream, then the onTimer method should be invoked properly. For your 1-to-many case, I would assume that whenever a new key arrives (that previously …
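
A sketch of continuous timer registration on a keyed stream; MyEvent and the one-minute delay are placeholders:

```java
public class TimeoutFunction extends ProcessFunction<MyEvent, MyEvent> {
    @Override
    public void processElement(MyEvent value, Context ctx, Collector<MyEvent> out) {
        out.collect(value);
        // Register an event-time timer relative to this element's timestamp;
        // it fires only once the upstream watermark passes that point.
        ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 60_000L);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<MyEvent> out) {
        // react to the timeout here
    }
}
```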

Re: Simulating Time-based and Count-based Custom Windows with ProcessFunction

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi Max! Before we jump into the custom ProcessFunction approach: Have you also checked out using the RocksDB state backend, and whether or not it is suitable for your use case? For state that would not fit into memory, that is usually the go-to state backend to use. If you’re sure a custom ProcessFunction …

Re: Trying to understand KafkaConsumer_records_lag_max

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi Julio, I'm not really sure, but do you think it is possible that there could be some hard data retention setting for your Kafka topics in the staging environment? As in, at some point in time and maybe periodically, all data in the Kafka topics are dropped and therefore the consumers effectively …

Re: Timers and restore

2018-04-15 Thread Tzu-Li (Gordon) Tai
Hi Alberto, Looking at the code, I think the current behavior is that all timers (both processing time and event time) are re-registered on restore, and therefore should be triggered automatically. So, for processing time timers, on restore all timers that were supposed to be fired while the job …

Re: Kafka Consumers Partition Discovery doesn't work

2018-04-12 Thread Tzu-Li (Gordon) Tai
…partition to a topic that is being consumed. >> Secor started consuming it as expected, but Flink didn't – or at least it >> isn't reporting anything about doing so. The new partition is not shown in >> Flink task metrics or consumer offsets committed by Flink. >>

Re: Lucene SPI class loading fails with shaded flink-connector-elasticsearch

2018-03-23 Thread Tzu-Li (Gordon) Tai
Hi Manuel, Thanks a lot for reporting this! Yes, this issue is most likely related to the recent changes to shading the Elasticsearch connector dependencies, though it is a bit curious why I didn’t bump into it before while testing. The Flink job runs Lucene queries on a data stream, which …

Re: Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Tzu-Li (Gordon) Tai
…Flink dev? If yes, I could rebuild my 1.5-SNAPSHOT and try again. On Thu, Mar 22, 2018 at 4:18 PM, Tzu-Li (Gordon) Tai wrote: Hi Juho, Can you confirm that the new partition is consumed, but only that Flink’s reported metrics do not include them? If yes, then I think your observations can

Re: Kafka Consumers Partition Discovery doesn't work

2018-03-22 Thread Tzu-Li (Gordon) Tai
Hi Juho, Can you confirm that the new partition is consumed, but only that Flink’s reported metrics do not include them? If yes, then I think your observations can be explained by this issue: https://issues.apache.org/jira/browse/FLINK-8419 This issue should have been fixed in the recently released …

Re: Strange Kafka consumer behaviour

2018-03-19 Thread Tzu-Li (Gordon) Tai
Hi Gyula, Are you using Flink 1.4.x, and have partition discovery enabled? If yes, then both the state of previously existing topics, as well as partitions of the newly specified topics will be consumed. Cheers, Gordon On Tue, Mar 20, 2018 at 6:01 AM, Ankit Chaudhary wrote: > Did you change t…

[ANNOUNCE] Apache Flink 1.3.3 released

2018-03-15 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink 1.3.3, which is the third bugfix release for the Apache Flink 1.3 series.  Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming

[ANNOUNCE] Apache Flink 1.4.2 released

2018-03-07 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink 1.4.2, which is the second bugfix release for the Apache Flink 1.4 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming …

Re: Fwd: Flink throwing java.lang.ClassNotFoundException after upgrading to 1.4.1

2018-03-02 Thread Tzu-Li (Gordon) Tai
Hi Ankit, This is a known issue in 1.4.1. Please see https://issues.apache.org/jira/browse/FLINK-8741. The release for 1.4.2 will include a fix for this issue, and we already have a release candidate being voted on at the moment. Hopefully, it will be released soon, probably early next week. Cheers, …

Re: Suggested way to backfill for datastream

2018-02-27 Thread Tzu-Li (Gordon) Tai
Hi Chengzhi, Yes, generally speaking, you would launch a separate job to do the backfilling, and then shut down the job after the backfilling is completed. For this to work, you’ll also have to keep in mind that writes to the external sink must be idempotent. Are you using Kafka as the data source …

Re: Flink Kafka reads too many bytes .... Very rarely

2018-02-27 Thread Tzu-Li (Gordon) Tai
Hi Philip, Yes, I also have the question that Fabian mentioned. Did you start observing this only after upgrading to 1.4.0? Could you let me know what exactly your deserialization schema is doing? I don’t have any clues at the moment, but maybe there are hints there. Also, you mentioned that the …

Re: Cannot use managed keyed state in sink

2018-02-25 Thread Tzu-Li (Gordon) Tai
Hi, Are you using a `RichSinkFunction`? There you should have access to the runtime context, which you can use to access keyed state. Cheers, Gordon On 24 February 2018 at 3:04:55 PM, Kien Truong (duckientru...@gmail.com) wrote: Hi, It seems that I can't use managed keyed state inside s…
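
A sketch of that pattern, assuming the sink is applied on a keyed stream (names are placeholders):

```java
public class StatefulSink extends RichSinkFunction<MyEvent> {
    private transient ValueState<Long> seen;

    @Override
    public void open(Configuration parameters) {
        // Keyed state is obtained through the runtime context.
        seen = getRuntimeContext().getState(
                new ValueStateDescriptor<>("seen-count", Long.class));
    }

    @Override
    public void invoke(MyEvent value) throws Exception {
        Long current = seen.value();
        seen.update(current == null ? 1L : current + 1);
        // ... write to the external system
    }
}
```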

Re: Caused by: java.lang.ClassNotFoundException: org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase

2018-02-25 Thread Tzu-Li (Gordon) Tai
Hi, Good to see that you have it working! Yes, each of the Kafka version-specific connectors also has a dependency on the base Kafka connector module. Note that it is usually not recommended to put optional dependencies (such as the connectors) under the lib folder. To add additional dependencies …

[ANNOUNCE] Apache Flink 1.4.1 released

2018-02-15 Thread Tzu-Li (Gordon) Tai
The Apache Flink community is very happy to announce the release of Apache Flink 1.4.1, which is the first bugfix release for the Apache Flink 1.4 series. Apache Flink® is an open-source stream processing framework for distributed, high-performing, always-available, and accurate data streaming applications. …

RE: Kafka as source for batch job

2018-02-08 Thread Tzu-Li (Gordon) Tai
…need a different implementation of KafkaFetcher.runFetchLoop that has slightly different logic for changing running to be false. What would you recommend in this case? From: Tzu-Li (Gordon) Tai [mailto:tzuli...@apache.org] Sent: Thursday, February 08, 2018 12:24 PM To: user

Re: Kafka as source for batch job

2018-02-08 Thread Tzu-Li (Gordon) Tai
Hi Hayden, Have you tried looking into `KeyedDeserializationSchema#isEndOfStream` [1]? I think that could be what you are looking for. It signals the end of the stream when consuming from Kafka. Cheers, Gordon On 8 February 2018 at 10:44:59 AM, Marchant, Hayden (hayden.march...@citi.com) wrote
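
A sketch of a bounded read via isEndOfStream; MyEvent, its parse helper, and the timestamp-based end condition are assumptions:

```java
public class BoundedSchema implements KeyedDeserializationSchema<MyEvent> {
    private final long endTimestamp;

    public BoundedSchema(long endTimestamp) {
        this.endTimestamp = endTimestamp;
    }

    @Override
    public MyEvent deserialize(byte[] key, byte[] message,
                               String topic, int partition, long offset) throws IOException {
        return MyEvent.parse(message);  // assumed application-specific parsing
    }

    @Override
    public boolean isEndOfStream(MyEvent nextElement) {
        return nextElement.getTimestamp() > endTimestamp;  // stop once past the cut-off
    }

    @Override
    public TypeInformation<MyEvent> getProducedType() {
        return TypeInformation.of(MyEvent.class);
    }
}
```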

Re: Kafka and parallelism

2018-02-07 Thread Tzu-Li (Gordon) Tai
…there was some ability to override the topic. Is there a feature that allows me to do that? If not, do you think this would be a worthwhile addition? Thanks again, -- Christophe On Mon, Feb 5, 2018 at 9:52 AM, Tzu-Li (Gordon) Tai wrote: Hi Christophe, You can set the parallelism of the

Re: Kafka and parallelism

2018-02-05 Thread Tzu-Li (Gordon) Tai
…clearly no. You have to set your parallelism yourself and then it will round robin between them. Thanks again, -- Christophe On Mon, Feb 5, 2018 at 9:52 AM, Tzu-Li (Gordon) Tai wrote: Hi Christophe, You can set the parallelism of the FlinkKafkaConsumer independently of the total number …

Re: Kafka and parallelism

2018-02-05 Thread Tzu-Li (Gordon) Tai
Hi Christophe, You can set the parallelism of the FlinkKafkaConsumer independently of the total number of Kafka partitions (across all subscribed streams, including newly created streams that match a subscribed pattern). The consumer deterministically assigns each partition to a single consumer

Re: How to make savepoints more robust in the face of refactorings ?

2018-01-30 Thread Tzu-Li (Gordon) Tai
…long you don't rename this Tuple2TypeInformation around everything will work… but it feels very suboptimal. On 29 January 2018 at 12:33, Tzu-Li (Gordon) Tai wrote: Hi, In the Scala API, type serializers may be anonymous classes generated by Scala macros, and would therefore contain a

Re: How to make savepoints more robust in the face of refactorings ?

2018-01-29 Thread Tzu-Li (Gordon) Tai
Hi, In the Scala API, type serializers may be anonymous classes generated by Scala macros, and would therefore contain a reference to the wrapping class (i.e., your `Operators` class). Since Flink currently serializes serializers into the savepoint to be used for deserialization on restore, and

Re: ElasticsearchSink in Flink 1.4.0 with Elasticsearch 5.2+

2018-01-29 Thread Tzu-Li (Gordon) Tai
Hi Christophe, Thanks a lot for the contribution! I’ll add reviewing the PR to my backlog. I would like / will try to take a look at the PR by the end of this week, after some 1.4.1 blockers which I’m still busy with. Cheers, Gordon On 29 January 2018 at 9:25:27 AM, Fabian Hueske (fhue...@gmail

Re: Flink streaming (1.3.2) KafkaConsumer08 - Unable to retrieve any partitions

2018-01-23 Thread Tzu-Li (Gordon) Tai
Hi Tarandeep, The error you are seeing only occurs when, on startup of the consumer, it couldn’t retrieve any partition information from Kafka. Therefore, according to your description, there should be another error that caused the previous execution of the job to fail. Could you check that? Maybe …

Re: Flink Kinesis Consumer re-reading merged shards upon restart

2018-01-23 Thread Tzu-Li (Gordon) Tai
On Mon, Jan 22, 2018 at 6:26 PM, Tzu-Li (Gordon) Tai wrote: Hi Philip, Thanks a lot for reporting this, and looking into this in detail. Your observation sounds accurate to me. The `endingSequenceNumber` would no longer be null once a shard is closed, so on restore that would mislead the consumer into thinking that it’s a new shard and start consuming it from …

Re: Flink Kinesis Consumer re-reading merged shards upon restart

2018-01-22 Thread Tzu-Li (Gordon) Tai
Hi Philip, Thanks a lot for reporting this, and looking into this in detail. Your observation sounds accurate to me. The `endingSequenceNumber` would no longer be null once a shard is closed, so on restore that would mislead the consumer into thinking that it’s a new shard and start consuming it from …

Re: Classes missing from jar

2018-01-12 Thread Tzu-Li (Gordon) Tai
…versions of the connector jars. Once the base jar is in the mvn repository, this won't be as problematic. On Friday, January 12, 2018, 9:46:22 AM EST, Tzu-Li (Gordon) Tai wrote: Hi Jason, The KeyedDeserializationSchema is located in the flink-connector-kafka-base module, so you'll need

Re: Restoring 1.3.1 savepoint in 1.4.0 fails in TimerService

2018-01-12 Thread Tzu-Li (Gordon) Tai
Weird huh? I can't see how I would've changed anything related to these when making those minor code changes required in upgrading to 1.4. Cheers, Juho On Fri, Jan 12, 2018 at 2:58 PM, Tzu-Li (Gordon) Tai wrote: Hi Juho, Could your key type have possibly changed / been modified across …

Re: class loader issues when closing streams

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi Jared, I currently don't have a solid idea of what may be happening, but from the stack dump you provided, it seems like the client connection you are using in the Elasticsearch API call bridge is stuck, even after the cleanup. Do you think there could be some issue with closing the client you

Re: Classes missing from jar

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi Jason, The KeyedDeserializationSchema is located in the flink-connector-kafka-base module, so you'll need to include the jar for that too [1]. Cheers, Gordon [1] https://repo1.maven.org/maven2/org/apache/flink/flink-connector-kafka-base_2.11/1.4.0/

Re: How can I count the element in datastream

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi! Do you mean that you want to count all elements across all partitions of a DataStream? To do that, you'll need to transform the DataStream with an operator of parallelism 1, e.g. DataStream stream = ...; stream.map(new CountingMap<>()).setParallelism(1); Cheers, Gordon
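
A fleshed-out sketch of that CountingMap idea, assuming String elements:

```java
DataStream<String> stream = ...;
stream.map(new RichMapFunction<String, String>() {
    private long count = 0L;

    @Override
    public String map(String value) {
        count++;  // sees every element, since this operator runs with parallelism 1
        return value;
    }
}).setParallelism(1);
```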

Re: CaseClassTypeInfo fails to deserialize in flink 1.4 with Parent First Class Loading

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi Seth, Thanks a lot for the report! I think your observation is expected behaviour, if there really is a binary incompatible change between Scala minor releases. And yes, the type information macro in the Scala API is very sensitive to the exact Scala version used. I had in the past also observed …

Re: can we recover job use latest checkpointed state instead of savepoint, and how?

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi, Externalized checkpoints [1] seem to be exactly what you are looking for. Checkpoints are by default not persisted, unless configured otherwise to be externalized so that they are not automatically cleaned up when the job fails. They can be used to resume the job. On the other hand, it would …
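
A minimal sketch of enabling externalized checkpoints (API of the Flink 1.4/1.5 era):

```java
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(60_000L);  // checkpoint every minute
env.getCheckpointConfig().enableExternalizedCheckpoints(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
```

The job can then be resumed from a retained checkpoint the same way as from a savepoint, e.g. `bin/flink run -s <checkpointPath> …`.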

Re: Restoring 1.3.1 savepoint in 1.4.0 fails in TimerService

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi Juho, Could your key type have possibly changed / been modified across the upgrade? Also, from the error trace, it seems like the failing restore is of a window operator. What window type are you using? That exception is a result of either mismatching key serializers or namespace serializers …

Re: SideOutput doesn't receive anything if filter is applied after the process function

2018-01-12 Thread Tzu-Li (Gordon) Tai
Hi Juho, Now that I think of it this seems like a bug to me: why does the call to getSideOutput succeed if it doesn't provide _any_ input? With the way side outputs work, I don’t think this is possible (or would make sense). An operator does not know whether or not it would ever emit some elements …

Re: Kinesis Connectors - With Temporary Credentials

2018-01-11 Thread Tzu-Li (Gordon) Tai
…we can definitely look into that. If no, is there a workaround to implement or customize AWS Utils? Thank you On Jan 11, 2018, at 6:41 PM, Tzu-Li (Gordon) Tai wrote: Hi Sree, Are Temporary Credentials automatically shipped with AWS EC2 instances when delegated to the role? If yes, you should

Re: Kinesis Connectors - With Temporary Credentials

2018-01-11 Thread Tzu-Li (Gordon) Tai
Hi Sree, Are Temporary Credentials automatically shipped with AWS EC2 instances when delegated to the role? If yes, you should be able to just configure the properties so that the Kinesis consumer automatically fetches credentials from the AWS instance. To do that, simply do not provide the Access …
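
A sketch of that configuration; with the "AUTO" provider the connector falls back to the default AWS credential chain, which includes EC2 instance role credentials (region and stream name are placeholders):

```java
Properties config = new Properties();
config.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1");
config.setProperty(AWSConfigConstants.AWS_CREDENTIALS_PROVIDER, "AUTO");

FlinkKinesisConsumer<String> consumer = new FlinkKinesisConsumer<>(
        "my-stream", new SimpleStringSchema(), config);
```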

Re: Flink 1.3 -> 1.4 Kafka state migration issue

2018-01-08 Thread Tzu-Li (Gordon) Tai
Hi Gyula, Is your 1.3 savepoint from Flink 1.3.1 or 1.3.0? In those versions, we had a critical bug that caused duplicate partition assignments in corner cases, so the assignment logic was altered from 1.3.1 to 1.3.2 (and therefore also 1.4.0). If you indeed were using 1.3.1 or 1.3.0, and you are

Re: About Kafka08Fetcher and Kafka010Fetcher

2018-01-04 Thread Tzu-Li (Gordon) Tai
Hi Jaxon, The threading model is implemented differently between the Kafka08Fetcher and all fetchers for versions 0.9 and higher, because the Kafka Java clients used between these versions have different abstraction levels. The Kafka08Fetcher still uses the low-level `SimpleConsumer` API, which …

Re: Restoring from a SP

2017-12-20 Thread Tzu-Li (Gordon) Tai
Hi Vishal, AFAIK, intermittent restore failures from savepoints should not be expected. Do you still have the logs from the failed restore attempts? What exceptions were the restores failing on? We would need to take a look at the logs to figure out what may be going on. Best, Gordon

Re: Apache Flink - Difference between operator and function

2017-12-20 Thread Tzu-Li (Gordon) Tai
Hi Mans, What's the difference between an operator and a function? An operator in Flink needs to handle processing of watermarks, records, and checkpointing of the operator state. To implement one, you need to extend the AbstractStreamOperator base class. It is considered a very low-level API

Re: Setting jar file directory for Apache Flink

2017-12-19 Thread Tzu-Li (Gordon) Tai
…nized by the Gmail! On Mon, Dec 18, 2017 at 10:14 PM, Tzu-Li (Gordon) Tai wrote: Hi Soheil, It seems like you are trying to link optional Flink libraries that are not shipped with the binary Flink distributions. Have you taken a look at this doc [1]? It should contain sufficient information for

Re: Setting jar file directory for Apache Flink

2017-12-18 Thread Tzu-Li (Gordon) Tai
Hi Soheil, It seems like you are trying to link optional Flink libraries that are not shipped with the binary Flink distributions. Have you taken a look at this doc [1]? It should contain sufficient information for your problem. Cheers, Gordon [1] http://apache-flink-user-mailing-list-archive.2

Re: Fwd: Replace classes at runtime

2017-12-18 Thread Tzu-Li (Gordon) Tai
Hi Jayant, Updating your job application / operator code at runtime is currently not available in Flink. It is however achievable by taking a savepoint of your job, and then restoring from the savepoint with your upgraded application. There are a few points to keep in mind, especially job state compatibility …

Re: flink eventTime, lateness, maxoutoforderness

2017-12-18 Thread Tzu-Li (Gordon) Tai
Hi, "Is lateness record time or real-world time? Is maxOutOfOrderness record time or real-world time?" Both the allowed lateness of window operators and the maxOutOfOrderness of the BoundedOutOfOrdernessTimestampExtractor refer to event time. I.e., given the end timestamp of a window is x (in event …
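
A sketch tying the two settings together; MyEvent, its fields, and the durations are placeholders:

```java
DataStream<MyEvent> withTs = stream.assignTimestampsAndWatermarks(
        // maxOutOfOrderness: the watermark trails the max seen event time by 10s
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(10)) {
            @Override
            public long extractTimestamp(MyEvent e) {
                return e.getTimestamp();
            }
        });

withTs.keyBy("key")                        // assumed POJO key field
      .timeWindow(Time.minutes(1))
      .allowedLateness(Time.seconds(30))   // extra event-time slack after the window closes
      .sum("count");                       // assumed numeric field
```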

Re: Flink 1.4.0 keytab is unreadable

2017-12-15 Thread Tzu-Li (Gordon) Tai
…/browse/FLINK-8270 On 15 December 2017 at 4:12:24 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org) wrote: Hi 杨光, Thanks a lot for reporting and looking into this with such detail! Your observations are correct: the changes from 1.3.2 to 1.4.0 in the YarnTaskManagerRunner caused the local Keytab

Re: Watermark in broadcast

2017-12-14 Thread Tzu-Li (Gordon) Tai
Hi Seth, Some clarifications to point out: "Quick follow up question. Is there some way to notify a TimestampAssigner that is consuming from an idle source?" In Flink, idle sources would emit a special idleness marker event that notifies downstream time-based operators to not wait for its watermarks …

Re: Flink Kafka Producer Exception

2017-12-13 Thread Tzu-Li (Gordon) Tai
Hi Navneeth, The exception you are getting is a Kafka NetworkException. From the provided information I can’t really tell much and can only guess, but are you sure that the client / broker versions match? It seems like you are using 0.10; the default client version in the Flink Kafka 0.10 connector …

Re: Kafka topic partition skewness causes watermark not being emitted

2017-12-13 Thread Tzu-Li (Gordon) Tai
Hi, I've just elevated FLINK-5479 to BLOCKER for 1.5. Unfortunately, AFAIK there is no easy workaround for this issue yet in the releases so far. The min watermark logic that controls per-partition watermark emission is hidden inside the consumer, making it hard to work around. One possible …

Re: Flink-Kafka connector - partition offsets for a given timestamp?

2017-12-11 Thread Tzu-Li (Gordon) Tai
Hi Connie, We do have a pull request for the feature, which should almost be ready after rebasing: https://github.com/apache/flink/pull/3915, JIRA: https://issues.apache.org/jira/browse/FLINK-6352. This means, of course, that the feature isn't part of any release yet. We can try to make sure this …

Re: using regular expression to specify Kafka topics

2017-12-04 Thread Tzu-Li (Gordon) Tai
Hi, I’ve created a PR to publicly expose the feature:  https://github.com/apache/flink/pull/5117. Whether or not we should include this in the next release candidate for 1.4 is still up for discussion. Best, Gordon On 4 December 2017 at 3:02:29 PM, Tzu-Li (Gordon) Tai (tzuli...@apache.org

Re: using regular expression to specify Kafka topics

2017-12-03 Thread Tzu-Li (Gordon) Tai
Hi Soheil, That feature is actually already internally available. The only issue is that the functionality is not yet exposed via any public APIs on the Kafka consumer. Please see this JIRA: https://issues.apache.org/jira/browse/FLINK-8190. I’m not sure of exposing the pattern-based subscription …

Re: FlinkKafkaProducerXX

2017-11-30 Thread Tzu-Li (Gordon) Tai
Hi Mike, The rationale behind implementing the FlinkFixedPartitioner as the default is so that each Flink sink partition (i.e. one sink parallel subtask) maps to a single Kafka partition. One other thing to clarify: by setting the partitioner to null, the partitioning is based on a hash of the record …

Re: How to deal with java.lang.IllegalStateException: Could not initialize keyed state backend after a code update

2017-11-28 Thread Tzu-Li (Gordon) Tai
Hi Federico, It seems like the state cannot be restored because the class of the state type (i.e., Event) had been modified since the savepoint, and therefore has a conflicting serialVersionUID with whatever is in the savepoint. This can happen if Java serialization is used for some part of your …

Re: Creating flink byte[] deserialiser

2017-11-27 Thread Tzu-Li (Gordon) Tai
Hi Soheil, AFAIK, there is no built-in byte array deserializer in Flink. However, it is very simple to implement one. You can do that by implementing the `DeserializationSchema` interface, and for the implementation of the `deserialize` method, simply return the fetched bytes from Kafka as the
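
A sketch of such a pass-through schema:

```java
public class RawBytesSchema implements DeserializationSchema<byte[]> {
    @Override
    public byte[] deserialize(byte[] message) {
        return message;  // hand the fetched bytes through untouched
    }

    @Override
    public boolean isEndOfStream(byte[] nextElement) {
        return false;
    }

    @Override
    public TypeInformation<byte[]> getProducedType() {
        return TypeInformation.of(byte[].class);
    }
}
```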

Re: FlinkKafkaConsumer010 does not start from the next record on startup from offsets in Kafka

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi Robert, Uncaught exceptions that cause the job to fall into a fail-and-restart loop are similar to the corrupt record case I mentioned. With exactly-once guarantees, the job will roll back to the last complete checkpoint, which "resets" the Flink consumer to some earlier Kafka partition offset

Re: Kafka consumer to sync topics by event time?

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi! The FlinkKafkaConsumer can handle watermark advancement with per-Kafka-partition awareness (across partitions of different topics). You can see an example of how to do that here [1]. Basically what this does is that it generates watermarks within the Kafka consumer individually for each Kafka
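
A sketch of the approach in [1]: attach the assigner to the consumer itself so watermarks are tracked per Kafka partition (topics, schema, and properties are placeholders):

```java
FlinkKafkaConsumer010<MyEvent> consumer = new FlinkKafkaConsumer010<>(
        Arrays.asList("topicA", "topicB"), new MyEventSchema(), properties);

consumer.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<MyEvent>(Time.seconds(5)) {
            @Override
            public long extractTimestamp(MyEvent e) {
                return e.getTimestamp();
            }
        });

DataStream<MyEvent> stream = env.addSource(consumer);
```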

Re: FlinkKafkaConsumer010 does not start from the next record on startup from offsets in Kafka

2017-11-22 Thread Tzu-Li (Gordon) Tai
Hi Robert, As expected with exactly-once guarantees, a record that caused a Flink job to fail will be attempted to be reprocessed on the restart of the job. For some specific "corrupt" record that causes the job to fall into a fail-and-restart loop, there is a way to let the Kafka consumer skip the …

Re: kafka consumer client seems not auto commit offset

2017-11-15 Thread Tzu-Li (Gordon) Tai
Hi Tony, Thanks for the report. At first glance of the description, what you described doesn’t seem to match the expected behavior. I’ll spend some time later today to check this out. Cheers, Gordon On 15 November 2017 at 5:08:34 PM, Tony Wei (tony19920...@gmail.com) wrote: Hi Gordon, When I

Re: What happened if my parallelism more than kafka partitions.

2017-11-08 Thread Tzu-Li (Gordon) Tai
The `KafkaTopicPartitionAssigner.assign(partition, numParallelSubtasks)` method returns the index of the target subtask for a given Kafka partition. The implementation of that method ensures that the same subtask index will always be returned for the same partition. Each consumer subtask will …
