Re: withFormat(Csv) is undefined for the type BatchTableEnvironment

2018-08-30 Thread Timo Walther
Hi François, you should read the documentation from top to bottom. The overview part [1] explains how everything plays together with examples. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#overview Am 30.08.18 um 10:41 schrieb Till Rohrmann

Re: AvroSchemaConverter and Tuple classes

2018-08-24 Thread Timo Walther
Hi, tuples are just a subcategory of rows, but the tuple arity is limited to 25 fields. I think the easiest solution would be to write your own converter that maps rows to tuples if you know that you will not need more than 25 fields. Otherwise it might be easier to just use a TextInputF
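A plain-Java sketch of such a hand-written row-to-tuple converter (no Flink dependency; `Tuple2`, `toTuple`, and the field data are illustrative stand-ins, not Flink's own classes):

```java
public class RowToTupleDemo {
    // Minimal stand-in for a fixed-arity tuple type; illustrative only.
    public static class Tuple2<A, B> {
        public final A f0;
        public final B f1;
        public Tuple2(A f0, B f1) { this.f0 = f0; this.f1 = f1; }
    }

    // A hand-written converter from a row (here just Object[]) to a tuple,
    // viable because these rows never exceed the tuple arity limit.
    public static Tuple2<String, Integer> toTuple(Object[] row) {
        return new Tuple2<>((String) row[0], (Integer) row[1]);
    }

    public static void main(String[] args) {
        Tuple2<String, Integer> t = toTuple(new Object[]{"id-7", 42});
        System.out.println(t.f0 + "," + t.f1); // id-7,42
    }
}
```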

Re: lack of function and low usability of provided function

2018-08-23 Thread Timo Walther
Hi Henry, thanks for giving feedback. The set of built-in functions is a continuous effort that will never be considered "done". If you think a function should be supported, you can open issues in FLINK-6810 and we can discuss its priority. Flink is an open source project so feel also free

Re: Override CaseClassSerializer with custom serializer

2018-08-17 Thread Timo Walther
Hi Gerard, you are correct, Kryo serializers are only used when no built-in Flink serializer is available. Actually, the tuple and case class serializers are one of the most performant serializers in Flink (due to their fixed length, no null support). If you really want to reduce the seriali

Re: InvalidTypesException: Type of TypeVariable 'K' in 'class X' could not be determined

2018-08-17 Thread Timo Walther
Hi Miguel, the issue that you are observing is due to Java's type erasure. "new MyClass<K>()" is always erased to "new MyClass()" by the Java compiler, so it is impossible for Flink to extract something. For classes in declarations like class MyClass extends ... {    ... } the compiler adds t
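The erasure mechanism described here can be demonstrated with plain Java reflection; the class and method names below are made up for illustration, but the technique (an anonymous subclass preserving its supertype's type argument) is the same one Flink's type extraction relies on:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;

public class ErasureDemo {
    public static class Holder<K> {}

    // Read the type argument recorded for a subclass, if the compiler stored one.
    public static String typeArgOf(Holder<?> h) {
        Type sup = h.getClass().getGenericSuperclass();
        if (sup instanceof ParameterizedType) {
            return ((ParameterizedType) sup).getActualTypeArguments()[0].getTypeName();
        }
        return "unknown (erased)";
    }

    public static void main(String[] args) {
        // Plain instantiation: the type argument is erased, nothing to extract.
        System.out.println(typeArgOf(new Holder<String>()));    // unknown (erased)
        // Anonymous subclass: "extends Holder<String>" is kept in the class
        // file, so it survives erasure and can be read via reflection.
        System.out.println(typeArgOf(new Holder<String>() {})); // java.lang.String
    }
}
```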

Re: Scala 2.12 Support

2018-08-16 Thread Timo Walther
Hi Aaron, we just released Flink 1.6 and the discussion for the roadmap of 1.7 will begin soon. I guess the Jira issue will also be updated then. I would recommend watching it for now. Regards, Timo Am 16.08.18 um 17:08 schrieb Aaron Levin: Hi Piotr, Thanks for the update. Glad to hear it's

Re: Stream collector serialization performance

2018-08-15 Thread Timo Walther
Hi Mingliang, first of all the POJO serializer is not very performant. Tuple or Row are better. If you want to improve the performance of a collect() between operators, you could also enable object reuse. You can read more about this here [1] (section "Issue 2: Object Reuse"), but make sure y

Re: SQL parallelism setting

2018-08-10 Thread Timo Walther
Hi, currently, you can only set the parallelism for an entire Flink job using env.setParallelism(). There are rough ideas of how we could improve the situation in the future to control the parallelism of individual operators, but this might need one or two releases. Regards, Timo Am 10.08.

Re: Getting compilation error in Array[TypeInformation]

2018-08-09 Thread Timo Walther
Hi Mich, I strongly recommend reading a good Scala programming tutorial before writing to a mailing list. As the error indicates, you are missing generic parameters. If you don't know the parameter, use `Array[TypeInformation[_]]` or `TableSink[_]`. For the types class you need to import the t

Re: Table API, custom window

2018-08-09 Thread Timo Walther
Hi Oleksandr, currently, we don't support custom windows for the Table API. The Table & SQL API tries to solve the most common cases, but for more specific logic we recommend the DataStream API. Regards, Timo Am 09.08.18 um 14:15 schrieb Oleksandr Nitavskyi: Hello guys, I am curious, is there a

Re: unsubscribtion

2018-08-07 Thread Timo Walther
Hi, see https://flink.apache.org/community.html#mailing-lists for unsubscribing: Use: user-unsubscr...@flink.apache.org Regards, Timo Am 08.08.18 um 08:18 schrieb 네이버: On 7 Aug 2018, at 19:42, Yan Zhou [FDS Science] > wrote: Thank you Vino. It is very helpful

Re: FlinkCEP and scientific papers ?

2018-08-07 Thread Timo Walther
Hi Esa, the SQL/CEP integration might be part of Flink 1.7. The discussion has just been started again [1]. Regards, Timo [1] https://issues.apache.org/jira/browse/FLINK-6935 Am 07.08.18 um 15:36 schrieb Esa Heikkinen: There was one good example of pattern query in the paper made by SASE+

Re: Sink Multiple Stream Elastic search

2018-08-02 Thread Timo Walther
Hi, I'm not aware that multiple Flink operators can share transport connections. They usually perform independent communication with the target system. If the pressure is too high for Elasticsearch, have you thought about reducing the parallelism of the sink? Also the buffering options could

Re: Late events in streaming using SQL API

2018-08-02 Thread Timo Walther
Hi Juan, currently, there is no way of handling late events in SQL. This feature has been requested multiple times, so it is likely that some contributor will pick it up soon. I filed FLINK-10031 [1] for it. There is also [2] that aims for improving the situation with time windows. Regards, Timo

Re: Converting a DataStream into a Table throws error

2018-08-02 Thread Timo Walther
On Thu, 2 Aug 2018 at 08:27, Timo Walther <twal...@apache.org> wrote: Hi Mich, could you share your project with us (maybe on GitHub)? Then we can import it and debug what the problem is. Reg

Re: Converting a DataStream into a Table throws error

2018-08-01 Thread Timo Walther

Re: Converting a DataStream into a Table throws error

2018-08-01 Thread Timo Walther
Hi Mich, I would check your imports again [1]. This is a pure compiler issue that is unrelated to your actual data stream. Also check your project dependencies. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/common.html#implicit-conversion-for-scala Am 0

Re: Trying to implement UpsertStreamTableSink in Java

2018-07-23 Thread Timo Walther
Hi James, the method `Table.writeToSink()` calls `configure(String[] fieldNames, TypeInformation[] fieldTypes)` internally. Since you return null, you are trying to register null instead of a table sink. I hope this helps. Regards, Timo Am 23.07.18 um 14:33 schrieb Porritt, James: I put

Re: Keeping only latest row by key?

2018-07-18 Thread Timo Walther
Hi James, the easiest solution for this behavior is to use a user-defined LAST_VALUE aggregate function as discussed here [1]. I hope this helps. Regards, Timo [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Using-SQL-with-dynamic-tables-where-rows-are-updated-td20519.htm
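The "keep only the latest row by key" semantics of such a LAST_VALUE aggregate can be sketched in plain Java (no Flink dependency; the class name and the String[][] row encoding are illustrative assumptions):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LastValueDemo {
    // Keep only the most recent value per key, which is the behavior a
    // user-defined LAST_VALUE aggregate provides: later rows overwrite
    // earlier ones for the same key.
    public static Map<String, String> lastByKey(String[][] rows) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String[] row : rows) {
            latest.put(row[0], row[1]);
        }
        return latest;
    }

    public static void main(String[] args) {
        String[][] rows = {{"a", "1"}, {"b", "2"}, {"a", "3"}};
        System.out.println(lastByKey(rows)); // {a=3, b=2}
    }
}
```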

Re: rowTime from json nested timestamp field in SQL-Client

2018-07-17 Thread Timo Walther
ashwin.si...@go-mmt.com wrote: Thanks Timo for the clarification, but our processing actually involves aggregations on huge past data also, which won't be served by processing time. Is this a WIP feature? On Mon, Jul 16, 2018 at 7:29 PM Timo Walth

Re: Why is flink master bump version to 1.7?

2018-07-17 Thread Timo Walther
Hi Tison, I guess this was a mistake that will be fixed soon. Till (in CC) forked off the release-1.6 branch yesterday? Regards, Timo Am 17.07.18 um 04:00 schrieb 陈梓立: Hi, I see no 1.6 branch or tag. What's the reason we skip 1.6 and now 1.7-SNAPSHOT? or there is a 1.6 I miss. Best, tiso

Re: rowTime from json nested timestamp field in SQL-Client

2018-07-16 Thread Timo Walther
Hi Ashwin, the SQL Client is in an early development stage right now and has some limitations. Your problem is one of them. I filed an issue for this: https://issues.apache.org/jira/browse/FLINK-9864 There is no easy solution to fix this problem. Maybe you can use processing-time for your wi

Re: TumblingProcessingTimeWindow emits extra results for a same window

2018-07-12 Thread Timo Walther
Hi Yuan, this sounds indeed weird. The SQL API uses regular DataStream API windows underneath, so this problem should have come up earlier if this is a problem in the implementation. Is this behavior reproducible on your local machine? One thing that comes to my mind is that the "userId"s mig

Re: Support for detached mode for Flink1.5 SQL Client

2018-07-11 Thread Timo Walther
The INSERT INTO [1] statement will allow to submit queries detached. So you can close the client and let the Flink program do its job sinking into external systems. Regards, Timo [1] https://issues.apache.org/jira/browse/FLINK-8858 Am 12.07.18 um 02:47 schrieb Rong Rong: Is the Gateway Mode

Re: Need to better way to create JSON If we have TableSchema and Row

2018-07-11 Thread Timo Walther
Hi Shivam, Flink 1.5 provides full Row-JSON-Row conversions. You can take a look at the `flink-json` module. A table schema can be converted into a TypeInformation (Types.ROW(schema.getColumns(), schema.getTypes())) which can be used to configure JsonRowSerialization/DeserializationSchemas. I
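The row-to-JSON direction can be illustrated with a hand-rolled renderer in plain Java (no flink-json dependency; class name, flat string/number columns only, and the lack of escaping are all simplifying assumptions, so this is purely a sketch of what a JSON serialization schema derives from a table schema):

```java
import java.util.StringJoiner;

public class RowJsonDemo {
    // Render a flat row to a JSON object using the schema's column names.
    // Numbers are emitted bare; everything else is quoted (no escaping).
    public static String toJson(String[] columns, Object[] row) {
        StringJoiner json = new StringJoiner(",", "{", "}");
        for (int i = 0; i < columns.length; i++) {
            Object v = row[i];
            String rendered = (v instanceof Number) ? v.toString() : "\"" + v + "\"";
            json.add("\"" + columns[i] + "\":" + rendered);
        }
        return json.toString();
    }

    public static void main(String[] args) {
        System.out.println(toJson(new String[]{"user", "amount"},
                                  new Object[]{"bob", 3}));
        // {"user":"bob","amount":3}
    }
}
```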

Re: Want to write Kafka Sink to SQL Client by Flink-1.5

2018-07-11 Thread Timo Walther
Hi Shivam, a Kafka sink for the SQL Client will be part of Flink 1.6. For this we need to provide basic interfaces that sinks can extend, as Rong mentioned (FLINK-8866). In order to support all the formats that sources also support, we are also working on separating the connector from the formats [

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Timo Walther
tting more logs https://pastebin.com/fGTW9s2b On Tue, Jul 3, 2018 at 7:01 PM Timo Walther <twal...@apache.org> wrote: Hi Ashwin, which Flink version is your (local cluster) running? Are you executing Flink in the default (new deployment) or legacy

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Timo Walther
Hi Ashwin, which Flink version is your (local cluster) running? Are you executing Flink in the default (new deployment) or legacy mode? The SQL client supports only the new "FLIP-6" deployment model. I'm not sure about your error message but it might be related to that. Regards, Timo Am 03.07.18 u

Re: Regarding external metastore like HIVE

2018-07-03 Thread Timo Walther
Hi, you can follow the progress here: https://issues.apache.org/jira/browse/FLINK-9171 Regards, Timo Am 03.07.18 um 10:32 schrieb Fabian Hueske: Hi, The docs explain that the ExternalCatalog interface *can* be used to implement a catalog for HCatalog or Metastore. However, there is no suc

Re: DataStreamCalcRule$1802" grows beyond 64 KB when execute long sql.

2018-06-19 Thread Timo Walther
Hi, this is a known issue that is mentioned in https://issues.apache.org/jira/browse/FLINK-8921 and should be fixed soon. Currently, we only split by field but for your case we should also split expressions. As a workaround you could implement your own scalar UDF that contains the case/when l

Re: DataStreamCalcRule grows beyond 64 KB

2018-06-12 Thread Timo Walther
Hi, which version of Flink are you using? This issue is not entirely fixed but the most important cases have been solved in Flink 1.5. See https://issues.apache.org/jira/browse/FLINK-8274. Regards, Timo Am 12.06.18 um 03:52 schrieb Hequn Cheng: Hi rakeshchalasani, At the moment flink only

Re: why BlobServer use ServerSocket instead of Netty's ServerBootstrap?

2018-06-11 Thread Timo Walther
Hi, I think this question should rather be send to the dev@ mailing list. But I will loop in Nico that might know more about the implementation details. Regards, Timo Am 11.06.18 um 05:07 schrieb makeyang: after checking code, I found that BlobServer use ServerSocket instead of Netty's Serv

Re: Datastream[Row] covert to table exception

2018-06-08 Thread Timo Walther
ly, I used ds.map()(Types.ROW()), then it works fine, but I didn't know why. The code is: val inputStream: DataStream[Row] = env.addSource(myConsumer)(Types.ROW(fieldNameArray, flinkTypeArray)) On 8 Jun 2018, at 3:15 PM, Timo Walther <twal...@apache.org> wrote: Can you verify wit

Re: Datastream[Row] covert to table exception

2018-06-06 Thread Timo Walther
Sorry, I didn't see your last mail. The code looks good actually. What is the result of `inputStream.getType` if you print it to the console? Timo Am 07.06.18 um 08:24 schrieb Timo Walther: Hi, Row is a very special datatype where Flink cannot generate serializers based on the generic

Re: Datastream[Row] covert to table exception

2018-06-06 Thread Timo Walther
Hi, Row is a very special datatype where Flink cannot generate serializers based on the generics. By default, DeserializationSchema uses reflection-based type analysis, so you need to override the getResultType() method in WormholeDeserializationSchema and specify the type information manually t

Re: Cannot determine simple type name - [FLINK-7490]

2018-06-05 Thread Timo Walther
This sounds similar to https://issues.apache.org/jira/browse/FLINK-9220. Can you explain the steps that I have to do to reproduce the error? Regards, Timo Am 05.06.18 um 08:06 schrieb Chesnay Schepler: Please re-open the issue. It would be great if you could also provide us with a reproducing

Re: Ask for SQL using kafka in Flink

2018-06-05 Thread Timo Walther
Mon, Jun 4, 2018 at 12:57 AM, Timo Walther <twal...@apache.org> wrote: Hi, as you can see in the code [1], Kafka09JsonTableSource takes a TableSchema. You can create a table schema from type information, see [2]. Regards, Timo [1] https://github.com

Re: Ask for SQL using kafka in Flink

2018-06-04 Thread Timo Walther
Hi, as you can see in the code [1], Kafka09JsonTableSource takes a TableSchema. You can create a table schema from type information, see [2]. Regards, Timo [1] https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-kafka-0.9/src/main/java/org/apache/flink/streaming/connectors/k

Re: Java Code for Kafka Flink SQL

2018-06-04 Thread Timo Walther
Hi Rad, at a first glance your example does not look too bad. Which exceptions do you get? Did you create your pom.xml with the provided template [1] and then add flink-table, flink-connector-kafkaXXX, flink-streaming-scala? Regards, Timo [1] https://ci.apache.org/projects/flink/flink-doc

Re: latency critical job

2018-05-25 Thread Timo Walther
Hi, usually Flink should have constant latency if the job is implemented correctly. But if you want to implement something like an external monitoring process, you can use the REST API [1] and metrics [2] to model such a behavior by restarting your application. In theory, you could also imp

Re: Timers and Checkpoints

2018-05-25 Thread Timo Walther
Hi Alberto, do you get exactly the same exception? Maybe you can share some logs with us? Regards, Timo Am 25.05.18 um 13:41 schrieb Alberto Mancini: Hello, I think we are experiencing this issue: https://issues.apache.org/jira/browse/FLINK-6291 In fact we have a long running job that is un

Re: increasing parallelism increases the end2end latency in flink sql

2018-05-24 Thread Timo Walther
Hi Yan, SQL should not be the cause here. It is true that Flink removes the timestamp from a record when entering the SQL API but this timestamp is set again before time-based operations such as OVER windows. Watermarks are not touched. I think your issue is related to [2]. One explanation th

Re: AvroInputFormat Serialisation Issue

2018-05-15 Thread Timo Walther
tion > So your exception seems to be a bug if it works locally but not distributed. Hmm, well it's nice to know I'm not just doing something stupid :-) Perhaps I'll try to compile Flink myself so I can try and debug this. On Tue, May 15, 2018 at 8:54 PM Timo Walther <twal

Re: Async Source Function in Flink

2018-05-15 Thread Timo Walther
Hi Frederico, Flink's AsyncFunction is meant for enriching a record with information that needs to be queried externally. So I guess you can't use it for your use case because an async call is initiated by the input. However, your custom SourceFunction could implement a similar asynchronous lo

Re: Leader Retrieval Timeout with HA Job Manager

2018-05-15 Thread Timo Walther
The details in the current logs are insufficient to know what is happening. Thanks, Jason On Tuesday, May 15, 2018, 7:51:40 a.m. EDT, Timo Walther wrote: Hi Jason, this sounds more like a network connection/firewall issue to me. Can you tell us a bit more about your environment? Are you run

Re: Akka heartbeat configurations

2018-05-15 Thread Timo Walther
Hi, increasing the time to detect a dead task manager usually increases the amount of elements that need to be reprocessed in case of a failure. Once a dead task manager is identified, the entire application is rolled back to the latest successful checkpointed/consistent state of the applicat

Re: AvroInputFormat Serialisation Issue

2018-05-15 Thread Timo Walther
Hi Padarn, usually people are using the AvroInputFormat with the Avro class generated by an Avro schema. But after looking into the implementation, one should also be able to use the GenericRecord class as a parameter. So your exception seems to be a bug if it works locally but not distribute

Re: Flink does not read from some Kafka Partitions

2018-05-15 Thread Timo Walther
Hi Ruby, which Flink version are you using? When looking into the code of the org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumerBase you can see that the behavior for using partition discovery or not depends on the Flink version. Regards, Timo Am 15.05.18 um 02:01 schrieb Ruby

Re: minPauseBetweenCheckpoints for failed checkpoints

2018-05-15 Thread Timo Walther
Hi Dmitry, I think the minPauseBetweenCheckpoints is intended for pausing between successful checkpoints. Usually a user wants to get a successful checkpoint as quickly as possible again. Stefan (in CC) might know more about it. Regards, Timo Am 15.05.18 um 03:28 schrieb Dmitry Minaev: Hello!

Re: Leader Retrieval Timeout with HA Job Manager

2018-05-15 Thread Timo Walther
Hi Jason, this sounds more like a network connection/firewall issue to me. Can you tell us a bit more about your environment? Are you running your Flink cluster on a cloud provider? Regards, Timo Am 15.05.18 um 05:15 schrieb Jason Kania: Hi, I am using the 1.4.2 release on ubuntu and atte

Re: Wrong endpoints to cancel a job

2018-04-25 Thread Timo Walther
Hi Dongwon, please send such mails to the dev@ instead of the user@ as Flink 1.5.0 is not released yet. As far as I know the documentation around deployment and FLIP-6 has not been updated yet. But thank you for letting us know! Regards, Timo Am 25.04.18 um 11:03 schrieb Dongwon Kim: Hi,

Re: Kafka to Flink Avro Deserializer

2018-04-25 Thread Timo Walther
Hi Sebastien, for me this seems more an Avro issue than a Flink issue. You can ignore the shaded exception, we shade Google utilities to avoid dependency conflicts. The root cause is this: java.lang.NullPointerException     at org.apache.avro.specific.SpecificData.getSchema (SpecificDat

Re: Why assignTimestampsAndWatermarks parallelism same as map,it will not fired?

2018-04-25 Thread Timo Walther
Hi, did you set your time characteristic to event-time? env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime); Regards, Timo Am 25.04.18 um 05:15 schrieb 潘 功森: Hi all, I use the same parallelism between map and assignTimestampsAndWatermarks, and it is not fired; I saw the extractTime

Re: Managing state migrations with Flink and Avro

2018-04-20 Thread Timo Walther
? Regards, Timo Am 18.04.18 um 14:21 schrieb Timo Walther: Thank you. Maybe we already identified the issue (see https://issues.apache.org/jira/browse/FLINK-9202). I will use your code to verify it. Regards, Timo Am 18.04.18 um 14:07 schrieb Petter Arvidsson: Hi Timo, Please find the generated

Re: Applying an void function to DataStream

2018-04-19 Thread Timo Walther
>> wrote: Thanks, my map code is like this: stream.map(x -> parse(x)); I can't get what you mean! Something like the line below? DataStream t = stream.map(x -> parse(x)); ? On Thu, Apr 19, 2018 at 5:49 PM, Timo Walther mailto:twal...@apache

Re: Applying an void function to DataStream

2018-04-19 Thread Timo Walther
Hi Soheil, Flink supports the type "java.lang.Void" which you can use in this case. Regards, Timo Am 19.04.18 um 15:16 schrieb Soheil Pourbafrani: Hi, I have a void function that takes a String, parses it and writes it into Cassandra (using pure Java, not the Flink Cassandra connector). Using Apac
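A minimal plain-Java sketch of using java.lang.Void as the declared output type of a side-effecting "map" (no Flink dependency; the class name and the stand-in write target are illustrative assumptions):

```java
public class VoidMapDemo {
    // Side-effecting "map" whose declared return type is java.lang.Void.
    public static Void write(String s) {
        System.out.println("writing: " + s);
        return null; // Void cannot be instantiated; null is the only possible value
    }

    public static void main(String[] args) {
        // The same method viewed as a Function<String, Void>, which is how
        // a typed pipeline would carry a "void" transformation.
        java.util.function.Function<String, Void> mapper = VoidMapDemo::write;
        mapper.apply("record-1");
    }
}
```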

Re: Managing state migrations with Flink and Avro

2018-04-18 Thread Timo Walther
, Petter On Wed, Apr 18, 2018 at 11:32 AM, Timo Walther <twal...@apache.org> wrote: Hi Petter, could you share the source code of the class that Avro generates out of this schema? Thank you. Regards, Timo Am 18.04.18 um 11:00 schrieb Petter Arvidsson:

Re: Managing state migrations with Flink and Avro

2018-04-18 Thread Timo Walther
Hi Petter, could you share the source code of the class that Avro generates out of this schema? Thank you. Regards, Timo Am 18.04.18 um 11:00 schrieb Petter Arvidsson: Hello everyone, I am trying to figure out how to set up Flink with Avro for state management (especially the content of s

Re: Side outputs never getting consumed

2018-04-06 Thread Timo Walther
Apr 3, 2018 at 10:17 AM, Timo Walther <twal...@apache.org> wrote: Hi Julio, I tried to reproduce your problem locally but everything ran correctly. Could you share a little example job with us? This worked for me: class TestingClass { var hello: Int

Re: Multiple (non-consecutive) keyBy operators in a dataflow

2018-04-03 Thread Timo Walther
d the second keyBy and thus prevent the network shuffle. Thanks, Arun Timo Walther wrote Hi Andre, every keyBy is a shuffle over the network and thus introduces some overhead. Esp. serialization of records between operators if object reuse is disabled by default. If you think that not all slots (and

Re: Reading data from Cassandra

2018-04-03 Thread Timo Walther
Client. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/operators/asyncio.html Am 03.04.18 um 16:37 schrieb Timo Walther: Hi Soheil, yes Flink supports reading from Cassandra. You can find some examples here: https://github.com/apache/flink/tree/master

Re: Reading data from Cassandra

2018-04-03 Thread Timo Walther
Hi Soheil, yes Flink supports reading from Cassandra. You can find some examples here: https://github.com/apache/flink/tree/master/flink-connectors/flink-connector-cassandra/src/test/java/org/apache/flink/streaming/connectors/cassandra/example Regards, Timo Am 31.03.18 um 20:22 schrieb Soheil

Re: Task Manager fault tolerance does not work

2018-04-03 Thread Timo Walther
@Till: Do you have any advice for this issue? Am 03.04.18 um 11:54 schrieb dhirajpraj: What I have found is that the TM fault tolerance behaviour is not consistent. Sometimes it works and sometimes it doesn't. I am attaching my java code file (which is the main class). What I did was: 1) Run cl

Re: Kafka exceptions in Flink log file

2018-04-03 Thread Timo Walther
Hi Alex, which version of Flink are you running? There were some class loading issues with Kafka recently. I would try it with the newest Flink version. Otherwise ClassNotFoundException usually indicates that something is wrong with your dependencies. Maybe you can share your pom.xml with us.

Re: Side outputs never getting consumed

2018-04-03 Thread Timo Walther
Hi Julio, I tried to reproduce your problem locally but everything ran correctly. Could you share a little example job with us? This worked for me: class TestingClass { var hello: Int = 0 } class TestA extends TestingClass { var test: String = _ } def main(args: Array[String]) { // set u

Re: Savepointing with Avro Schema change

2018-04-03 Thread Timo Walther
Hi Aneesha, as far as I know Avro objects were serialized with Flink's POJO serializer in the past. This behavior changed in 1.4. @Gordon: do you have more information how good we support Avro schema evolution today? Regards, Timo Am 03.04.18 um 12:11 schrieb Aneesha Kaushal: Hello, I hav

Re: Watermark Question on Failed Process

2018-04-03 Thread Timo Walther
Hi Chengzhi, if you emit a watermark even though there is still data with a lower timestamp, you generate "late data" that either needs to be processed in a separate branch of your pipeline (see sideOutputLateData() [1]) or should force your existing operators to update their previously emitte
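The "late data" notion described here can be sketched in plain Java: an event is late when its timestamp is at or below the watermark already emitted, which is exactly what would be routed to a side output. All names and the parallel timestamp/watermark arrays are illustrative assumptions, not Flink API:

```java
import java.util.ArrayList;
import java.util.List;

public class LateDataDemo {
    // Classify events as late if their timestamp is not above the watermark
    // seen so far; watermarks[i] is the watermark emitted after event i.
    public static List<Long> lateEvents(long[] timestamps, long[] watermarks) {
        long watermark = Long.MIN_VALUE;
        List<Long> late = new ArrayList<>();
        for (int i = 0; i < timestamps.length; i++) {
            if (timestamps[i] <= watermark) {
                late.add(timestamps[i]); // arrived after the watermark passed it
            }
            watermark = Math.max(watermark, watermarks[i]);
        }
        return late;
    }

    public static void main(String[] args) {
        long[] ts = {1, 5, 3, 9};
        long[] wm = {1, 5, 5, 9};
        System.out.println(lateEvents(ts, wm)); // [3]
    }
}
```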

Re: Multiple (non-consecutive) keyBy operators in a dataflow

2018-04-03 Thread Timo Walther
Hi Andre, every keyBy is a shuffle over the network and thus introduces some overhead, esp. serialization of records between operators if object reuse is disabled (the default). If you think that not all slots (and thus all nodes) are fully occupied evenly in the first keyBy operation (e.g.
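Why a keyBy implies a network shuffle can be sketched with a toy hash partitioner in plain Java (illustrative only; this is not Flink's actual key-group computation): each key is deterministically assigned to one parallel slot, so records usually have to cross the network to reach it.

```java
public class KeyByDemo {
    // Toy keyBy-style partitioner: deterministically map a key to one of
    // `parallelism` slots via its hash code.
    public static int slotFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    public static void main(String[] args) {
        for (String k : new String[]{"user-1", "user-2", "user-3"}) {
            System.out.println(k + " -> slot " + slotFor(k, 4));
        }
    }
}
```

Because the assignment depends only on the key, every record for the same key lands in the same slot, which is what makes keyed state possible and what forces the shuffle.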

Re: Temporary failure in name resolution

2018-04-03 Thread Timo Walther
Hi Miki, for me this sounds like your job has a resource leak such that your memory fills up and the JVM of the TaskManager is killed at some point. How does your job look like? I see a WindowedStream.apply which might not be appropriate if you have big/frequent windows where the evaluation h

Re: Task Manager fault tolerance does not work

2018-04-03 Thread Timo Walther
Could you provide a little reproducible example? Which file system are you using? This sounds like a bug to me that should be fixed if valid. Am 03.04.18 um 11:28 schrieb dhirajpraj: I have not specified any parallelism in the job code. So I guess, the parallelism should be set to parallelism.d

Re: subuquery about flink sql

2018-04-03 Thread Timo Walther
Hi, there are multiple issues in your query. First of all, "SELECT DISTINCT(user), product" is MySQL-specific syntax and is interpreted as "SELECT DISTINCT user, product", which is not what you want, I guess. Secondly, SQL windows can only be applied on time attributes. Meaning: "As long as a

Re: Task Manager fault tolerance does not work

2018-04-03 Thread Timo Walther
Hi, does your job code declare a higher parallelism than 2? Or is it submitted with a higher parallelism? What is the Web UI displaying? Regards, Timo Am 03.04.18 um 10:48 schrieb dhirajpraj: Hi, I have done that env.enableCheckpointing(5000L); env.setRestartStrategy(RestartStrategies.fixedDela

Re: How can I set configuration of process function from job's main?

2018-03-29 Thread Timo Walther
Hi, the configuration parameter is just legacy API. You can simply pass any serializable object to the constructor of your process function. Regards, Timo Am 29.03.18 um 20:38 schrieb Main Frame: Hi guys! I am a newbie in Flink and I have a probably silly question about the streaming API. So for t
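The constructor-based configuration pattern looks like this in plain Java (no Flink dependency; `Parser` and its delimiter parameter are made-up examples). The only requirement is that the function object is Serializable so it can be shipped to the cluster with its configuration baked in:

```java
import java.io.Serializable;

public class ConfigDemo {
    // A function that receives its configuration through the constructor;
    // being Serializable, it travels to the workers with the config inside.
    public static class Parser implements Serializable {
        private final String delimiter;

        public Parser(String delimiter) {
            this.delimiter = delimiter;
        }

        public String[] parse(String line) {
            return line.split(delimiter);
        }
    }

    public static void main(String[] args) {
        Parser p = new Parser(","); // configuration passed at job-assembly time
        System.out.println(p.parse("a,b,c").length); // 3
    }
}
```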

Re: Table/SQL Kafka Sink Question

2018-03-27 Thread Timo Walther
Hi Alexandru, the KafkaTableSink does not expose all features of the underlying DataStream API. Either you convert your table program to the DataStream API for the sink operation or you just extend a class like Kafka010JsonTableSink and customize it. Regards, Timo Am 27.03.18 um 11:59 schr

Re: Out off memory when catching up

2018-03-26 Thread Timo Walther
Hi Lasse, in order to avoid OOM exception you should analyze your Flink job implementation. Are you creating a lot of objects within your Flink functions? Which state backend are you using? Maybe you can tell us a little bit more about your pipeline? Usually, there should be enough memory fo

Re: InterruptedException when async function is cancelled

2018-03-26 Thread Timo Walther
Hi Ken, as you can see here [1], Flink interrupts the timer service after a certain timeout. If you want to get rid of the exception, you should increase "task.cancellation.timers.timeout" in the configuration. Actually, the default is already set to 7 seconds. So your exception should not b

Re: How can I confirm a savepoint is used for a new job?

2018-03-26 Thread Timo Walther
Hi Hao, I quickly checked that manually. There should be a message similar to the one below in the JobManager log: INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator  - Starting job from savepoint ... Regards, Timo Am 22.03.18 um 06:45 schrieb Hao Sun: Do we have any logs in J

Re: Query regarding to CountinousFileMonitoring operator

2018-03-26 Thread Timo Walther
Hi Puneet, can you share a little code example with us? I could not reproduce your problem. You have to keep in mind that a setParallelism() only affects the last operation. If you want to change the default parallelism of the entire pipeline, you have to change it in StreamExecutionEnvironm

Re: Issue in Flink/Zookeeper authentication via Kerberos

2018-03-26 Thread Timo Walther
Hi Sarthak, I'm not a Kerberos expert but maybe Eron or Shuyi are more familiar with the details? Would be great if somebody could help. Thanks, Timo Am 22.03.18 um 10:16 schrieb Sahu, Sarthak 1. (Nokia - IN/Bangalore): Hi Folks, *_Environment Setup:_* 1. I have configured KDC 5 server.

Re: "dynamic" bucketing sink

2018-03-26 Thread Timo Walther
Hi Christophe, I think this will require more effort. As far as I know there is no such "dynamic" feature. Have you looked into the bucketing sink code? Maybe you can adapt it to your needs? Otherwise it might also make sense to open an issue for it to discuss a design for it. Maybe other c

Re: How to handle large lookup tables that update rarely in Apache Flink - Stack Overflow

2018-03-26 Thread Timo Walther
Hi Pete, you can find some basic examples about stream enrichment here [1]. I hope this helps a bit. Regards, Timo [1] http://training.data-artisans.com/exercises/rideEnrichment-flatmap.html [2] http://training.data-artisans.com/exercises/rideEnrichment-processfunction.html Am 25.03.18 um

Re: Question regarding effect of 'restart-strategy: none' on Flink (1.4.1) JobManager HA

2018-03-14 Thread Timo Walther
Hi Shaswata, are you using a standalone Flink cluster, or what does your deployment look like? E.g. YARN has its own restart attempts [1]. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/jobmanager_high_availability.html#yarn-cluster-high-availability Am 1

Re: Flink SSL Setup on a standalone cluster

2018-03-14 Thread Timo Walther
Hi Vinay, do you have any exception or log entry that describes the failure? Regards, Timo Am 14.03.18 um 15:51 schrieb Vinay Patil: Hi, I have keystore for each of the 4 nodes in cluster and respective trustore. The cluster is configured correctly with SSL , verified this by accessing job

Re: activemq connector not working..

2018-03-14 Thread Timo Walther
Hi Puneet, are you running this job on the cluster or locally in your IDE? Regards, Timo Am 14.03.18 um 13:49 schrieb Puneet Kinra: Hi, I used the Apache Bahir connector; below is the code. The job is getting finished and not generating the output as well; ideally it should keep running. Below th

Re: Production-readyness of Flink SQL

2018-03-12 Thread Timo Walther
Hi Philip, you are absolutely right that Flink SQL is definitely production ready. It has been developed for 2 years now and is used at Uber, Alibaba, Huawei and many other companies. We usually only merge production ready code or add an explicit warning about it. I will finally remove this

Re: Share state across operators

2018-03-12 Thread Timo Walther
Hi Max, I would go with the Either approach if you want to ensure that the initial state and the first element arrive in the right order. Performance-wise there should not be a big difference between both approaches. The side outputs are more meant for having a side channel beside the main stre

Re: POJO default constructor - how is it used by Flink?

2018-03-12 Thread Timo Walther
Hi Alex, I guess your "real" constructor is invoked by your code within your Flink program. The default constructor is used during serialization between operators. If you are interested in the internals, you can have a look at the PojoSerializer [1]. The POJO is created with the default constr
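The "default constructor plus reflective field population" pattern a POJO serializer uses can be sketched in plain Java (the `Event` class and `deserialize` helper are made-up illustrations, not Flink's PojoSerializer itself):

```java
import java.lang.reflect.Field;

public class PojoDemo {
    public static class Event {
        public String name;

        public Event() {}                               // used reflectively on deserialization
        public Event(String name) { this.name = name; } // used by user code
    }

    // Mimic deserialization: instantiate via the no-arg constructor,
    // then populate the public fields reflectively.
    public static Event deserialize(String value) {
        try {
            Event e = Event.class.getDeclaredConstructor().newInstance();
            Field f = Event.class.getField("name");
            f.set(e, value);
            return e;
        } catch (ReflectiveOperationException ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        System.out.println(deserialize("click").name); // click
    }
}
```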

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-09 Thread Timo Walther
ere any concern from performance or stability perspective? Best, Yan From: Xingcan Cui Sent: Thursday, March 8, 2018 8:21:42 AM To: Timo Walther Cc: user; Yan Zhou [FDS Science] Subject

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Timo Walther
Hi Xingcan, thanks for looking into this. This definitely seems to be a bug. Maybe in the org.apache.flink.table.calcite.RelTimeIndicatorConverter. In any case we should create an issue for it. Regards, Timo Am 3/8/18 um 7:27 AM schrieb Yan Zhou [FDS Science]: Hi Xingcan, Thanks for you

Re: Table Api and CSV builder

2018-03-08 Thread Timo Walther
Hi Karim, the CsvTableSource and its builder are currently not able to specify event time or processing time. I'm sure this will change in the near future. Until then I would recommend either extending it yourself or using the DataStream API first to do the parsing and watermarking and then con

Re: Table API Compilation Error in Flink

2018-03-05 Thread Timo Walther
Hi Nagananda, could you show us your entire pom.xml? From what I see it seems that you are using the wrong StreamTableEnvironment. First you need to decide if you want to program in Scala or Java. Depending on that you can add the dependencies as described in [1]. There are two environments
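For reference, a hedged sketch of what the relevant pom.xml excerpt could look like for the Java Table API (the artifact names follow the Flink 1.4 naming scheme; the version number and Scala suffix are assumptions to adapt to your setup):

```xml
<!-- Sketch: Table API plus the matching streaming module for Java.
     Use the _2.11/_2.10 suffix and version matching your cluster. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-table_2.11</artifactId>
  <version>1.4.2</version>
</dependency>
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.11</artifactId>
  <version>1.4.2</version>
</dependency>
```

Mixing the Scala and Java StreamTableEnvironment imports is a common source of the compilation error mentioned in the thread.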

Re: Using time window with SQL nested query

2018-03-05 Thread Timo Walther
Hi Bill, you can use HOP_ROWTIME()/HOP_PROCTIME() to propagate the time attribute to the outer query. See also [1] for an example. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/table/sql.html#selecting-group-window-start-and-end-timestamps Am 3/5/18 um
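A hedged sketch of what propagating the time attribute with HOP_ROWTIME could look like (table and column names here are hypothetical, not taken from the thread):

```sql
-- Inner query: hopping window aggregation; HOP_ROWTIME exposes the
-- window's rowtime so the outer query can window on it again.
SELECT
  user_id,
  SUM(cnt)
FROM (
  SELECT
    user_id,
    COUNT(*) AS cnt,
    HOP_ROWTIME(rowtime, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE) AS rtime
  FROM clicks
  GROUP BY user_id, HOP(rowtime, INTERVAL '1' MINUTE, INTERVAL '10' MINUTE)
)
GROUP BY user_id, TUMBLE(rtime, INTERVAL '1' HOUR)
```

The key point is selecting HOP_ROWTIME(...) in the inner query; without it, the outer query has no time attribute to group on.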

Re: CsvTableSource Types.TIMESTAMP

2018-03-05 Thread Timo Walther
Hi, SQL_TIMESTAMP is the same. A couple of months ago it was decided to rename this property so that it can be used for timestamps with timezone support in the future. Regards, Timo Am 3/5/18 um 2:10 PM schrieb Esa Heikkinen: I have tried to get the following example to work, but not succeeded yet

Re: Using RowTypeInfo with Table API

2018-02-27 Thread Timo Walther
schrieb Jens Grassel: Hi, On Tue, 27 Feb 2018 14:43:06 +0100 Timo Walther wrote: TW> You can create it with org.apache.flink.table.api.Types.ROW(...). TW> You can check the type of your stream using ds.getType(). You can TW> pass information explicitly in Scala e.g. ds.map()(Types.ROW(...)

Re: SQL Table API: Naming operations done in query

2018-02-27 Thread Timo Walther
Hi Juan, usually the Flink operators contain the optimized expression that was defined in SQL. You can also name the entire job using env.execute("Your Name") if that would help to identify the query. Regarding checkpoints, it depends how you define "small changes". You must ensure that

Re: Using RowTypeInfo with Table API

2018-02-27 Thread Timo Walther
Hi Jens, usually Flink extracts the type information from the generic signature of a function, e.g. it knows the fields of a Tuple2. The row type cannot be analyzed and therefore always needs explicit information. You can create it with org.apache.flink.table.api.Types.ROW(...). Yo

Re: How to find correct "imports"

2018-02-19 Thread Timo Walther
Hi Esa, the easiest and recommended way is: - Create your Flink project with the provided quickstart scripts [1] - Visit the documentation about a feature you want to use. E.g. for the Table & SQL API [2] Usually it is described which modules you need. I hope this helps. Regards, Timo [1]

Re: How do I run SQL query on a dataStream that generates my custom type.

2018-02-15 Thread Timo Walther
Or even easier: you can specify the type after the map call: eventStream.map({e: Event => toRow(e)})(Types.ROW_NAMED(...)) Regards, Timo Am 2/15/18 um 9:55 AM schrieb Timo Walther: Hi, In your case you don't have to convert to row if you don't want to. The Table API will

Re: How do I run SQL query on a dataStream that generates my custom type.

2018-02-15 Thread Timo Walther
Hi, In your case you don't have to convert to row if you don't want to. The Table API will do automatic conversion once the stream of Event is converted into a table. However, this only works if Event is a POJO. If you want to specify your own type information, your MapFunction can implement the R

Re: NullPointerException when asking for batched TableEnvironment

2018-02-14 Thread Timo Walther
Hi, by looking at the source code it seems that your "batchEnvironment" is null. Did you verify this? Regards, Timo Am 2/14/18 um 1:01 PM schrieb André Schütz: Hi, within the Flink Interpreter context, we try to get a Batch TableEnvironment with the following code. The code was executed wi
