Re: Table API: Joining on Tables of Complex Types

2020-01-17 Thread Timo Walther
Hi Andreas, if dataset.getType() returns a RowTypeInfo you can ignore this log message. The type extractor runs before the ".returns()" but with this method you override the old type. Regards, Timo On 15.01.20 15:27, Hailu, Andreas wrote: Dawid, this approach looks promising. I’m able to

Re: Question about Scala Case Class and List in Flink

2020-01-15 Thread Timo Walther
Hi, Reg. 1: Scala case classes are supported in the Scala specific version of the DataStream API. If you are using case classes in the Java API you will get the INFO below because the Java API uses pure reflection extraction for analyzing POJOs. The Scala API tries to analyze Scala classes

Re: Need guidance on a use case

2019-12-19 Thread Timo Walther
Hi Eva, I'm not 100% sure if your use case can be solved with SQL. JOIN in SQL always joins an incoming record with all previously arrived records. Maybe Jark in CC has some idea? It might make sense to use the DataStream API instead with a connect() and CoProcessFunction where you can simply

Re: POJO ERROR

2019-12-19 Thread Timo Walther
and we used it just where we need. We used scalaVersion to specify for each library what scala is used, so used flink will be flink-streaming-scala_2.12 Alex On Thu, Dec 19, 2019 at 3:40 PM Timo Walther <mailto:twal...@apache.org>> wrote: I see a mismatch between scalaBui

Re: POJO ERROR

2019-12-19 Thread Timo Walther
a stream with our object model which is like this: case class A(a:Map[String, other_case_class_obj], b: List[other_case_class_obj], c: String) .flatMap(CustomFlatMap()) .print Thank you, Alex On Thu, Dec 19, 2019 at 3:14 PM Timo Walther <mailto:twal...@apache.org>> wrote: That's soun

Re: POJO ERROR

2019-12-19 Thread Timo Walther
On Thu, Dec 19, 2019 at 2:18 PM Timo Walther mailto:twal...@apache.org>> wrote: Hi Alex, the problem is that `case class` classes are analyzed by Scala specific code whereas `class` classes are analyzed with Java specific code. So I wou

Re: Kafka table descriptor missing startFromTimestamp()

2019-12-19 Thread Timo Walther
This issue is work in progress: https://issues.apache.org/jira/browse/FLINK-15220 On 19.12.19 09:07, Dawid Wysakowicz wrote: Hi, The only reason why it was not exposed at the beginning is that not all versions of the consumers support starting from a specific timestamp. I think we could

Re: POJO ERROR

2019-12-19 Thread Timo Walther
Hi Alex, the problem is that `case class` classes are analyzed by Scala specific code whereas `class` classes are analyzed with Java specific code. So I would recommend using a case class to make sure you stay in the "Scala world"; otherwise the fallback is the Java-based TypeExtractor. For

Re: sink type error in scala

2019-12-17 Thread Timo Walther
Hi Fanbin, I think you are mixing different APIs together. We have a Scala and Java version of both DataStream and Table API. The error message indicates that `toRetractStream` is called on a Java Table API class because it returns org.apache.flink.api.java.tuple.Tuple2 but your sink is

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread Timo Walther
Hi, the serializers are created from TypeInformation. So you can simply inspect the type information. E.g. by using this in the Scala API: val typeInfo = createTypeInformation[MyClassToAnalyze] And going through the object using a debugger. Actually, I don't understand why scala.Tuple2 is

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Timo Walther
, /Arujit/ On Wed, Dec 11, 2019 at 3:56 PM Timo Walther <mailto:twal...@apache.org>> wrote: I remember that we fixed some bug around this topic recently. The legacy planner should not be affected. There is another user reporting this: https://issues.apache.org/jira/browse/FL

Re: Order events by filed that does not represent time

2019-12-11 Thread Timo Walther
Hi Krzysztof, first of all, Flink does not sort events based on timestamps. The concept of watermarks just postpones the triggering of a time operation until the watermark says all events until a time t have arrived. For your problem, you can simply use a ProcessFunction and buffer the events
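
The buffering Timo suggests could look roughly like the following plain-Java sketch. The class and field names are invented for illustration, and the Flink keyed-state and timer wiring of a real ProcessFunction is omitted: events carry a sequence number, may arrive out of order, and are released in ascending order once the next expected number is present.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Illustrative reordering buffer; in a ProcessFunction the TreeMap and the
// nextExpected counter would live in keyed state.
class SequenceReorderer {
    private final TreeMap<Long, String> buffer = new TreeMap<>();
    private long nextExpected = 1L;

    /** Buffers the event and returns all events that are now in order. */
    List<String> accept(long seq, String payload) {
        buffer.put(seq, payload);
        List<String> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.firstKey() == nextExpected) {
            ready.add(buffer.pollFirstEntry().getValue());
            nextExpected++;
        }
        return ready;
    }

    public static void main(String[] args) {
        SequenceReorderer r = new SequenceReorderer();
        System.out.println(r.accept(2L, "b")); // [] - still waiting for seq 1
        System.out.println(r.accept(1L, "a")); // [a, b]
    }
}
```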

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread Timo Walther
Little mistake: The key must be any constant instead of `e`. On 11.12.19 11:42, Timo Walther wrote: Hi Mans, I would recommend creating a little prototype to answer most of your questions in action. You can simply do: stream = env.fromElements(1L, 2L, 3L, 4L

Re: Apache Flink - Clarifications about late side output

2019-12-11 Thread Timo Walther
Hi Mans, I would recommend creating a little prototype to answer most of your questions in action. You can simply do: stream = env.fromElements(1L, 2L, 3L, 4L) .assignTimestampsAndWatermarks( new AssignerWithPunctuatedWatermarks{ extractTimestamp(e) = e,

Re: Open Method is not being called in case of AggregateFunction UDFs

2019-12-11 Thread Timo Walther
I remember that we fixed some bug around this topic recently. The legacy planner should not be affected. There is another user reporting this: https://issues.apache.org/jira/browse/FLINK-15040 Regards, Timo On 11.12.19 10:34, Dawid Wysakowicz wrote: Hi Arujit, Could you also share the query

Re: Thread access and broadcast state initialization in BroadcastProcessFunction

2019-12-11 Thread Timo Walther
1. Yes, methods will only be called by one thread. The Flink API aims to abstract all concurrency topics away when using the provided methods and state. 2. The open() method should always be the first method being called. If this is not the case, this is definitely a bug. Which Flink version

Re: Processing Events by custom rules kept in Broadcast State

2019-12-11 Thread Timo Walther
Hi, I think when it comes to the question "What data type should I put in state?", this question should usually be answered with a well-defined data structure that allows for future state upgrades. Like defining a database schema. So I would not put "arbitrary" classes such as Jackson's
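
The "database schema" analogy could be sketched like this; RuleState and its fields are made up for illustration, the point being a fixed, explicitly versioned class in broadcast state instead of an arbitrary parsed JSON tree:

```java
import java.io.Serializable;

// Illustrative state class: a small, versioned schema rather than a Jackson node.
class RuleState implements Serializable {
    private static final long serialVersionUID = 1L;

    public int schemaVersion = 1; // bump when the rule format changes
    public String ruleId;
    public String expression;

    public RuleState() {} // default constructor for POJO serialization

    public RuleState(String ruleId, String expression) {
        this.ruleId = ruleId;
        this.expression = expression;
    }

    public static void main(String[] args) {
        RuleState rule = new RuleState("r-1", "amount > 100");
        System.out.println(rule.ruleId + " v" + rule.schemaVersion); // r-1 v1
    }
}
```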

Re: Event Timestamp corrupted by timezone

2019-12-10 Thread Timo Walther
Hi, I hope we can solve these issues with the new type system. The core problem is the old planner uses java.sql.Timestamp which depends on the timezone of the current machine. I would recommend setting everything to UTC if possible for now. Regards, Timo On 03.12.19 18:49, Lasse Nedergaard
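
The machine dependence can be seen with plain JDK code: the same epoch millis render differently depending on the JVM's default time zone, which is why pinning everything to UTC avoids corruption. The render helper is illustrative.

```java
import java.sql.Timestamp;
import java.util.TimeZone;

// Same instant, different default time zones, different rendered values.
class TimestampZoneDemo {
    static String render(long epochMillis, String zoneId) {
        TimeZone original = TimeZone.getDefault();
        try {
            TimeZone.setDefault(TimeZone.getTimeZone(zoneId));
            return new Timestamp(epochMillis).toString();
        } finally {
            TimeZone.setDefault(original); // restore the JVM default
        }
    }

    public static void main(String[] args) {
        System.out.println(render(0L, "UTC"));              // 1970-01-01 00:00:00.0
        System.out.println(render(0L, "America/New_York")); // 1969-12-31 19:00:00.0
    }
}
```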

Re: Flink SQL: How to tag a column as 'rowtime' from a Avro based DataStream?

2019-08-14 Thread Timo Walther
Hi Niels, if you are coming from DataStream API, all you need to do is to write a timestamp extractor. When you call: tableEnv.registerDataStream("TestStream", letterStream, "EventTime.rowtime, letter, counter"); The ".rowtime" means that the framework will extract the rowtime from the

Re: 1.9 Release Timeline

2019-07-23 Thread Timo Walther
Hi Oytun, the community is working hard to release 1.9. You can see the progress here [1] and on the dev@ mailing list. [1] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=328=detail Regards, Timo Am 23.07.19 um 15:52 schrieb Oytun Tez: Ping, any estimates? --- Oytun Tez

Re: Flink Table API and Date fields

2019-07-08 Thread Timo Walther
Hi Flavio, yes I agree. This check is a bit confusing. The initial reason for that was that sql.Time, sql.Date, and sql.Timestamp extend from util.Date as well. But handling it as a generic type as Jingson mentioned might be the better option in order to write custom UDFs to handle them.

Re: Is the provided Serializer/TypeInformation checked "too late"?

2019-07-08 Thread Timo Walther
Hi Niels, the type handling evolved during the years and is a bit messed up through the different layers. You are almost right with your last assumption "Is the provided serialization via TypeInformation 'skipped' during startup and only used during runtime?". The type extraction returns a

Re: Generic return type on a user-defined scalar function

2019-05-20 Thread Timo Walther
Hi Morrisa, usually, this means that your class is not recognized as a POJO. Please check again the requirements of a POJO: Default constructor, getters and setters for every field etc. You can use org.apache.flink.api.common.typeinfo.Types.POJO(...) to verify if your class is a POJO or not.
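
The POJO requirements could be sketched as follows. MyPojo and hasAccessors are made-up names for illustration; the authoritative check remains Types.POJO(...) as mentioned above.

```java
import java.lang.reflect.Method;

// Sketch of the rules a POJO must satisfy: a public no-argument constructor
// plus a getter and setter per non-public field.
class PojoCheckDemo {

    public static class MyPojo {
        private double amount;

        public MyPojo() {} // required: default constructor

        public double getAmount() { return amount; }
        public void setAmount(double amount) { this.amount = amount; }
    }

    /** Rough stand-in for the per-field accessor check. */
    static boolean hasAccessors(Class<?> clazz, String field) {
        String cap = Character.toUpperCase(field.charAt(0)) + field.substring(1);
        try {
            clazz.getConstructor(); // default constructor must exist
            Method getter = clazz.getMethod("get" + cap);
            clazz.getMethod("set" + cap, getter.getReturnType()); // matching setter
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasAccessors(MyPojo.class, "amount")); // true
    }
}
```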

Re: Table program cannot be compiled

2019-05-20 Thread Timo Walther
Hi Shahar, yes the number of parameters should be the issue for a cannot compile exception. If you moved most of the constants to a member in the function, it should actually work. Do you have a little reproducible example somewhere? Thanks, Timo Am 16.05.19 um 19:59 schrieb shkob1: Hi

Re: Table program cannot be compiled

2019-05-16 Thread Timo Walther
Hi, too many arguments for calling a UDF could currently lead to "grows beyond 64 KB" and maybe also causes the GC exception. This is a known issue covered in https://issues.apache.org/jira/browse/FLINK-8921. Could you also add the tags to the function itself? Maybe as a static map for

Re: POJO with private fields and toAppendStream of StreamTableEnvironment

2019-04-29 Thread Timo Walther
Hi Sung, private fields are only supported if you specify getters and setters accordingly. Otherwise you need to use `Row.class` and perform the mapping in a subsequent map() function manually via reflection. Regards, Timo Am 29.04.19 um 15:44 schrieb Sung Gon Yi: In

Re: Serialising null value in case-class

2019-04-26 Thread Timo Walther
Currently, tuples and case classes are the most efficient data types because they avoid the need for special null handling. Everything else is hard to estimate. You might need to perform micro benchmarks with the serializers you want to use if you have a very performance critical use case.

Re: Serialising null value in case-class

2019-04-26 Thread Timo Walther
Hi Averell, the reason for this lies in the internal serializer implementation. In general, the composite/wrapping type serializer is responsible for encoding nulls. The case class serializer does not support nulls, because Scala discourages the use of nulls and promotes `Option`. Some

Re: Exceptions when launching counts on a Flink DataSet concurrently

2019-04-26 Thread Timo Walther
Hi Juan, as far as I know we do not provide any concurrency guarantees for count() or collect(). Those methods need to be used with caution anyways as the result size must not exceed a certain threshold. I will loop in Fabian who might know more about the internals of the execution.

Re: Emitting current state to a sink

2019-04-26 Thread Timo Walther
Hi Avi, did you have a look at the .connect() and .broadcast() API functionalities? They allow you to broadcast a control stream to all operators. Maybe this example [1] or other examples in this repository can help you. Regards, Timo [1]

Re: [DISCUSS] Introduction of a Table API Java Expression DSL

2019-03-21 Thread Timo Walther
doc. and also some features that I think will be beneficial to the final outcome. Please kindly take a look @Timo. Many thanks, Rong On Mon, Mar 18, 2019 at 7:15 AM Timo Walther mailto:twal...@apache.org>> wrote: > Hi everyone, > > some of you might h

Re: flink sql about nested json

2019-03-04 Thread Timo Walther
Hi, Flink SQL JSON format supports nested formats like the schema that you posted. Maybe the renaming with `from()` works not as expected. Did you try it without the `from()` where schema fields are equal to JSON fields? Alternatively, you could also define the schema only and use the

Re: FLIP-16, FLIP-15 Status Updates?

2019-02-19 Thread Timo Walther
Hi John, you are right that there was not much progress in the last years around these two FLIPs. Mostly due to shift of priorities. However, with the big Blink code contribution from Alibaba and joint development forces for a unified batch and streaming runtime [1], it is very likely that

Re: AssertionError: mismatched type $5 TIMESTAMP(3)

2019-02-06 Thread Timo Walther
Hi Chris, the error that you've observed is a bug that might be related to another bug that is not easily solvable. I created an issue for it nevertheless: https://issues.apache.org/jira/browse/FLINK-11543 In general, I think you need to adapt your program in any case. Because you are

Re: Table API zipWithIndex

2019-02-01 Thread Timo Walther
Hi Flavio, I guess you are looking for a unique identifier for rows, right? Currently, this is not possible in Table API. There, we only support UUID(). Once the Table API has been enhanced to be more interactive, we might support such features. Regards, Timo Am 01.02.19 um 11:16 schrieb

Re: UDAF Flink-SQL return null would lead to checkpoint fails

2019-01-30 Thread Timo Walther
Hi Henry, could you share a little reproducible example? From what I see you are using a custom aggregate function with a case class inside, right? Flink's case class serializer does not support null because the usage of `null` is also not very Scala like. Use a `Row` type for supporting

Re: AssertionError: mismatched type $5 TIMESTAMP(3)

2019-01-29 Thread Timo Walther
Hi Chris, the exception message is a bit misleading. The time attribute (time indicator) type is an internal type and should not be used by users. The following line should solve your issue. Instead of: DataStream> tradesByInstrStream = tableEnv.toRetractStream(tradesByInstr, typeInfo);

Re: SQL Client (Streaming + Checkpoint)

2019-01-29 Thread Timo Walther
Hi Vijay, in general Yun is right, the SQL Client is still in an early prototyping phase. Some configuration features are missing. You can track the progress of this feature here: https://issues.apache.org/jira/browse/FLINK-10265 It should be possible to use the global Flink configuration

Re: date format in Flink SQL

2019-01-29 Thread Timo Walther
Hi Soheil, the functions for date/time conversion are pretty limited so far. The full list of supported functions can be found here [1]. If you need more (which is usually the case), it is easy to implement a custom function [2]. We rely on Java's java.sql.Date as a data type. You can use
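
A custom function for a non-standard date format could wrap conversion logic like the sketch below. A real UDF would extend org.apache.flink.table.functions.ScalarFunction and expose this as eval(); parseDate and the example pattern are illustrative.

```java
import java.sql.Date;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Parses an arbitrary pattern into java.sql.Date, the type mentioned in the reply.
class DateParseDemo {
    static Date parseDate(String value, String pattern) {
        LocalDate local = LocalDate.parse(value, DateTimeFormatter.ofPattern(pattern));
        return Date.valueOf(local); // java.sql.Date, as the planner relies on
    }

    public static void main(String[] args) {
        System.out.println(parseDate("29/01/2019", "dd/MM/yyyy")); // 2019-01-29
    }
}
```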

Re: Is there a way to get all flink build-in SQL functions

2019-01-25 Thread Timo Walther
The problem right now is that Flink SQL has two stacks for defining functions. One is the built-in function stack that is based on Calcite and the other is the registered UDFs. What you can do is to use FunctionCatalog.withBuiltIns.getSqlOperatorTable() for listing Calcite built-in

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-23 Thread Timo Walther
+1 for Stephan's suggestion. For example, SQL connectors have never been part of the main distribution and nobody complained about this so far. I think what is more important than a big dist bundle is a helpful "Downloads" page where users can easily find available filesystems, connectors,

Re: NoMatchingTableFactoryException when test flink sql with kafka in flink 1.7

2019-01-11 Thread Timo Walther
Hi Jashua, according to the property list, you passed "connector.version=0.10" so a Kafka 0.8 factory will not match. Are you sure you are compiling the right thing? There seems to be a mismatch between your screenshot and the exception. Regards, Timo Am 11.01.19 um 15:43 schrieb Joshua

Re: The way to write a UDF with generic type

2019-01-08 Thread Timo Walther
Currently, this functionality is hard-coded in the aggregation translation. Namely in `org.apache.flink.table.runtime.aggregate.AggregateUtil#transformToAggregateFunctions` [1]. Timo [1]

Re: Building Flink from source according to vendor-specific version but causes protobuf conflict

2019-01-07 Thread Timo Walther
Hi Wei, did you play around with the classloading options mentioned here [1]? The -d option might impact how classes are loaded when the job is deployed on the cluster. I will loop in Gary who might know more about the YARN behavior. Regards, Timo [1]

Re: onTimer function is not getting executed and job is marked as finished.

2019-01-07 Thread Timo Walther
Hi Puneet, maybe you can show or explain us a bit more about your pipeline. From what I see your ProcessFunction looks correct. Are you sure the registering takes place? Regards, Timo Am 07.01.19 um 14:15 schrieb Puneet Kinra: Hi Hequn Its a streaming job . On Mon, Jan 7, 2019 at 5:51 PM

Re: Buffer stats when Back Pressure is high

2019-01-07 Thread Timo Walther
Hi Gagan, a typical solution to such a problem is to introduce an artificial key (enrichment id + some additional suffix); you can then keyBy on this artificial key and thus spread the workload more evenly. Of course you need to make sure that records of the second stream are duplicated to
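
The artificial-key idea can be sketched in plain Java. The suffix count of 4 and the method names are illustrative; as the reply notes, the second stream's records would then have to be duplicated to all sub-keys.

```java
import java.util.HashSet;
import java.util.Set;

// Fans one hot enrichment id out over a small fixed set of sub-keys so that
// keyBy can spread its load over several parallel subtasks.
class ArtificialKeyDemo {
    static final int SUFFIXES = 4;

    /** Derives a deterministic sub-key so the load for one id fans out. */
    static String artificialKey(String enrichmentId, long recordSeq) {
        return enrichmentId + "#" + (recordSeq % SUFFIXES);
    }

    public static void main(String[] args) {
        Set<String> keys = new HashSet<>();
        for (long seq = 0; seq < 100; seq++) {
            keys.add(artificialKey("hot-id", seq));
        }
        System.out.println(keys.size()); // one hot key became four sub-keys
    }
}
```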

Re: Reducing runtime of Flink planner

2019-01-07 Thread Timo Walther
Hi Niklas, it would be interesting to know which planner caused the long runtime. Could you use a debugger to figure out more details? Is it really the Flink Table API planner or the underlying DataSet planner one level deeper? There was an issue that was recently closed [1] about the DataSet

Re: How to get the temp result of each dynamic table when executing Flink-SQL?

2019-01-07 Thread Timo Walther
Hi Henry, such a feature is currently under discussion [1]; feel free to participate there and give feedback. So far you need to have some intermediate store; usually this could be Kafka or a filesystem. I would recommend writing little unit tests that test each SQL step like it is done here

Re: The way to write a UDF with generic type

2019-01-07 Thread Timo Walther
Currently, there is no more flexible approach for aggregate functions. Scalar functions can be overloaded but aggregate functions do not support this so far. Regards, Timo Am 07.01.19 um 02:27 schrieb yinhua.dai: Hi Timo, But getResultType should only return a concrete type information,

Re: The way to write a UDF with generic type

2019-01-04 Thread Timo Walther
Hi Yinhua, Flink needs to know how to serialize and deserialize a type `T`. If you are using a type variable here, Flink can not derive the type information. You need to override org.apache.flink.table.functions.AggregateFunction#getResultType and return type information that matches.

Re: Is there a better way to debug ClassNotFoundException from FlinkUserCodeClassLoaders?

2019-01-03 Thread Timo Walther
Hi Hao, which Flink version are you using? What do you mean with "suddenly", did it work before? Regards, Timo Am 03.01.19 um 07:13 schrieb Hao Sun: Yep, javap shows the class is there, but FlinkUserCodeClassLoaders somehow could not find it suddenly javap -cp

Re: SANSA 0.5 (Scalable Semantic Analytics Stack) Released

2018-12-14 Thread Timo Walther
Hi, looks like a very useful extension to Flink. Thanks for letting us know! You can also use the commun...@flink.apache.org mailing list to spread the news because the user@ list is more for user support questions and help. Regards, Timo Am 14.12.18 um 09:23 schrieb GezimSejdiu: Dear all,

Re: sql program throw exception when new kafka with csv format

2018-12-11 Thread Timo Walther
Hi Marvin, the CSV format is not supported for Kafka so far. Only formats that have the tag `DeserializationSchema` in the docs are supported. Right now you have to implement your own DeserializationSchemaFactory or use JSON or Avro. You can follow [1] to get informed once the CSV format is

Re: Run simple flink application via "java -jar"

2018-12-06 Thread Timo Walther
Hi Krishna, yes this should work given that you included all dependencies that are marked as "provided" in a Flink example project. In general, when you develop a Flink application, you can simply press the run button in your IDE. This will start a mini cluster locally for debugging

Re: Discuss [FLINK-9740] Support group windows over intervals of months

2018-12-06 Thread Timo Walther
Hi, thanks for working on this. The user mailing list is not the right place to start development discussions. Please use the dev@ mailing list. Can you attach your design to the Jira issue? We can then further discuss there. Thanks, Timo Am 06.12.18 um 09:45 schrieb x1q1j1: Hi! Timo

Re: If you are an expert in flink sql, then I really need your help...

2018-12-04 Thread Timo Walther
Unfortunately, setting the parallelism per SQL operator is not supported right now. We are currently thinking about a way of having fine-grained control about properties of SQL operators but this is in an early design phase and might take a while. Am 04.12.18 um 13:05 schrieb clay:

Re: If you are an expert in flink sql, then I really need your help...

2018-12-04 Thread Timo Walther
Hi, yes this was an unintended behavior that got fixed in Flink 1.7. See https://issues.apache.org/jira/browse/FLINK-10474 Regards, Timo Am 04.12.18 um 05:21 schrieb clay: I have found out that checkpoint is not triggered. Regarding the in operation in flink sql, this sql will trigger

Re: If you are an expert in flink sql, then I really need your help...

2018-12-03 Thread Timo Walther
Hi, it is very difficult to spot the problem with the little information you gave us. Maybe you can show us a simplified SQL query and the implementation of the `LAST_VALUE` function? An initial guess would be that you are running out of memory such that YARN kills your task manager. If

Re: Table exception

2018-11-29 Thread Timo Walther
Hi Michael, this dependency issue should have been fixed recently. Which Flink version are you using? Regards, Timo Am 29.11.18 um 16:01 schrieb TechnoMage: I have a simple test for looking at Flink SQL and hit an exception reported as a bug.  I wonder though if it is a missing dependency.

Re: Questions about UDTF in flink SQL

2018-11-29 Thread Timo Walther
Hi Wangsan, currently, UDFs have very strict result type assumptions. This is necessary to determine the serializers for the cluster. There were multiple requests for more flexible handling of types in UDFs. Please have a look at: - [FLINK-7358] Add implicitly converts support for

Re: SQL Query named operator exceeds 80 characters

2018-11-29 Thread Timo Walther
Unfortunately, renaming of operators is not supported so far. We are currently thinking about a way of having fine-grained control about properties of SQL operators but this is in an early design phase and might take a while. Regards, Timo Am 29.11.18 um 10:32 schrieb Kostas Kloudas: Hi,

Re:

2018-11-27 Thread Timo Walther
Hi Hengyu, currently, metadata between Flink programs can only be shared via code. For this, we recently introduced a programmatic descriptor-based way of specifying sources and sinks [1]. However, a catalog such as Hive metastore would be much easier and the community is currently working

Re: Group by with null keys

2018-11-20 Thread Timo Walther
I assigned the issue to me because I wanted to do that for a very long time. I already did some prerequisite work for the documentation in `org.apache.flink.api.common.typeinfo.Types`. Thanks, Timo Am 20.11.18 um 11:44 schrieb Flavio Pompermaier: Sure! The problem is that Dataset API does an

Re: ***UNCHECKED*** Table To String

2018-11-13 Thread Timo Walther
://github.com/apache/flink/tree/master/flink-end-to-end-tests Am 13.11.18 um 16:28 schrieb Steve Bistline: Hi Timo, Thank you... I am not very good with the MapFunction and trying to find an example. Will hack at it a bit. Appreciate your help. Steve On Tue, Nov 13, 2018 at 12:42 AM Timo

Re: Rich variant for Async IO in Scala

2018-11-13 Thread Timo Walther
, 12 Nov 2018 at 16:43 Timo Walther <mailto:twal...@apache.org>> wrote: Hi Bruno, `org.apache.flink.streaming.api.functions.async.RichAsyncFunction` should also work for the Scala API. `RichMapFunction` or `RichFilterFunction` are also shared between both APIs. Is there

Re: ***UNCHECKED*** Table To String

2018-11-13 Thread Timo Walther
Hi Steve, if you are ok with using the DataStream API you can simply use a map() function [1] and call row.toString(). However, usually people want custom logic to construct a string. This logic could either be in SQL using the concat operator `||` or in the DataStream API. Regards, Timo
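
The per-record logic of that map() step could look like the plain-Java sketch below, with Object[] standing in for a Flink Row; in the real job this would live inside a MapFunction over the stream converted from the Table.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

// Joins a row's fields into one string, the custom logic the reply refers to.
class RowToStringDemo {
    static String rowToString(Object[] fields) {
        return Arrays.stream(fields)
                .map(String::valueOf)
                .collect(Collectors.joining(","));
    }

    public static void main(String[] args) {
        System.out.println(rowToString(new Object[] {1, "a", 2.5})); // 1,a,2.5
    }
}
```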

Re: Rich variant for Async IO in Scala

2018-11-12 Thread Timo Walther
Hi Bruno, `org.apache.flink.streaming.api.functions.async.RichAsyncFunction` should also work for the Scala API. `RichMapFunction` or `RichFilterFunction` are also shared between both APIs. Is there anything that blocks you from using it? Regards, Timo Am 09.11.18 um 01:38 schrieb Bruno

Re: flink job restarts when flink cluster restarts?

2018-11-12 Thread Timo Walther
Hi, by default all the metadata is lost when shutting down the JobManager in a non high available setup. Flink uses Zookeeper together with a distributed filesystem to store the required metadata [1] in a persistent and distributed manner. A single node setup is rather uncommon, but you can

Re: Multiple operators to the same sink

2018-11-12 Thread Timo Walther
Hi, I'm not quite sure if I understand your problem correctly. But your use case sounds like a typical application of a union operation. What do you mean with "knowledge of their destination sink"? The operators don't need to be aware of the destination sink. The only thing that needs to be
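
The union idea can be shown with a plain-Java analogy: several upstream "operators" produce records of the same type, union merges them, and one sink consumes the merged result without any operator knowing about the sink. In Flink this would be streamA.union(streamB).addSink(...); the helper below is illustrative.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Merges the output of two producers so a single consumer sees everything.
class UnionDemo {
    static <T> List<T> union(List<T> a, List<T> b) {
        List<T> merged = new ArrayList<>(a);
        merged.addAll(b);
        return merged;
    }

    public static void main(String[] args) {
        List<String> opA = Arrays.asList("a1", "a2");
        List<String> opB = Arrays.asList("b1");
        System.out.println(union(opA, opB)); // [a1, a2, b1]
    }
}
```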

Re: Implementation error: Unhandled exception - "Implementation error: Unhandled exception."

2018-11-12 Thread Timo Walther
Hi Richard, this sounds like a bug to me. I will loop in Till (in CC) who might know more about this. Regards, Timo Am 07.11.18 um 20:35 schrieb Richard Deurwaarder: Hello, We have a flink job / cluster running in kubernetes. Flink 1.6.2 (but the same happens in 1.6.0 and 1.6.1) To

Re: InterruptedException when async function is cancelled

2018-11-12 Thread Timo Walther
Hi Anil, if I researched correctly we are talking about these changes [1]. I don't know if you can back port it, but I hope this helps. Regards, Timo [1] https://issues.apache.org/jira/browse/FLINK-9304 Am 07.11.18 um 17:41 schrieb Anil: Hi Till, Thanks for the reply. Is there

Re: flink run from savepoint

2018-11-12 Thread Timo Walther
Hi Franck, as a first hint: paths are hard-coded in the savepoint's metadata so you should make sure that the path is still the same and accessible by all JobManagers and TaskManagers. Can you share logs with us to figure out what caused the internal server error? Thanks, Timo Am

Re: Run a Flink job: REST/ binary client

2018-11-12 Thread Timo Walther
I will loop in Chesnay. He might know more about the REST service internals. Timo Am 07.11.18 um 16:15 schrieb Flavio Pompermaier: After a painful migration to Flink 1.6.2 we were able to run one of the jobs. Unfortunately we faced the same behaviour: all the code after the first

Re: Report failed job submission

2018-11-12 Thread Timo Walther
rrors":["org.apache.flink.client.program.ProgramInvocationException: The main method caused an error."]} I was wondering if there is any better way to handle this kind of problems.. On Mon, Nov 12, 2018 at 3:53 PM Timo Walther <mailto:twal...@apache.org>> wrote: Hi Flavio, I'm no

Re: How to run Flink 1.6 job cluster in "standalone" mode?

2018-11-12 Thread Timo Walther
Hi, a session cluster does not imply that JM + TM are always executed in the same JVM. Debugging a job running on different JVMs might be a bit more difficult, but it should still be straightforward. Maybe you can tell us what wrong behavior you observe? Btw. Flink's metrics can

Re: Report failed job submission

2018-11-12 Thread Timo Walther
Hi Flavio, I'm not entirely sure if I get your question correct but what you are looking for is more information (like categorization) why the submission failed right? Regards, Timo Am 06.11.18 um 14:33 schrieb Flavio Pompermaier: Any idea about how to address this issue? On Tue, Oct 16,

Re: Understanding checkpoint behavior

2018-11-12 Thread Timo Walther
Hi, do you observe such long checkpoint times also without performing external calls? If not, I guess the communication to the external system is flaky. Maybe you have to rethink how you perform such calls in order to make the pipeline more robust against these latencies. Flink also offers

Re: Why dont't have a csv formatter for kafka table source

2018-11-02 Thread Timo Walther
I already answered his question but forgot to CC the mailing list: Hi Jocean, a standard compliant CSV format for a Kafka table source is in the making right now. There is a PR that implements it [1] but it needs another review pass. It is high on my priority list and I hope we can finalize

Re: Flink SQL questions

2018-11-02 Thread Timo Walther
Usually, the problem occurs when users import the wrong classes. The current class naming is a bit confusing as there are 3 StreamTableEnvironment classes. You need to choose the one that matches your programming language. E.g. org.apache.flink.table.api.java.StreamTableEnvironment. Regards,

Re: Non deterministic result with Table API SQL

2018-10-31 Thread Timo Walther
As far as I know STDDEV_POP is translated into basic aggregate functions (SUM/AVG/COUNT). But if this error is reproducible in a little test case, we should definitely track this in JIRA. Am 31.10.18 um 16:43 schrieb Flavio Pompermaier: Adding more rows to the dataset lead to a deterministic

Re: Non deterministic result with Table API SQL

2018-10-31 Thread Timo Walther
Hi Flavio, do you execute this query in a batch or stream execution environment? In any case this sounds very strange to me. But is it guaranteed that it is not the fault of the connector? Regards, Timo Am 31.10.18 um 14:54 schrieb Flavio Pompermaier: Hi to all, I'm using Flink 1.6.1 and I

Re: Wired behavior of DATE_FORMAT UDF in Flink SQL

2018-10-31 Thread Timo Walther
Hi Henry, the DATE_FORMAT function is in a very bad state right now. I would recommend to implement your own custom function right now. This issue is tracked here: https://issues.apache.org/jira/browse/FLINK-10032 Regards, Timo Am 31.10.18 um 07:44 schrieb 徐涛: Hi Experts, I found that

Re: Needed more information about jdbc sink in Flink SQL Client

2018-10-29 Thread Timo Walther
Hi, all supported connectors and formats for the SQL Client with YAML can be found in the connect section [1]. However, the JDBC sink is not available for the SQL Client so far. It still needs to be ported, see [2]. However, if you want to use it, you could implement your own table factory

Re: Java Table API and external catalog bug?

2018-10-25 Thread Timo Walther
Hi Flavio, the external catalog support is not feature complete yet. I think you can only specify the catalog when reading from a table but `insertInto` does not consider the catalog name. Regards, Timo Am 25.10.18 um 10:04 schrieb Flavio Pompermaier: Any other help here? is this a bug or

Re: Reverse the order of fields in Flink SQL

2018-10-23 Thread Timo Walther
Hi Yinhua, your custom sink must implement `org.apache.flink.table.sinks.TableSink#configure`. This method is called when writing to a sink such that the sink can configure itself for the reverse order. The methods `getFieldTypes` and `getFieldNames` must then return the reconfigured schema;

Re: Flink Table API and table name

2018-10-16 Thread Timo Walther
Hi Flavio, yes you are right, I don't see a reason why we should not support such table names. Feel free to open an issue for it. Regards, Timo Am 16.10.18 um 10:56 schrieb miki haiat: Im not sure if it will solve this issue but can you try to register the your catalog [1]

Re: User jar is present in the flink job manager's class path

2018-10-11 Thread Timo Walther
Yes, you are right. I was not aware that the resolution order depends on the cluster deployment. I will loop in Gary (in CC) who might know about such a YARN setup. Regards, Timo Am 11.10.18 um 15:47 schrieb yinhua.dai: Hi Timo, I didn't tried to configure the classloader order, according

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-11 Thread Timo Walther
Hi Xuefu, thanks for your proposal, it is a nice summary. Here are my thoughts on your list: 1. I think this is also on our current mid-term roadmap. Flink has lacked proper catalog support for a very long time. Before we can connect catalogs we need to define how to map all the information

Re: Does Flink SQL "in" operation has length limit?

2018-10-01 Thread Timo Walther
Hi, tuple should not be used anywhere in flink-table. @Rong can you point us to the corresponding code? I haven't looked into the code, but we should definitely support this query. @Henry feel free to open an issue for it. Regards, Timo Am 28.09.18 um 19:14 schrieb Rong Rong: Yes. Thanks

Re: About the retract of the calculation result of flink sql

2018-10-01 Thread Timo Walther
Hi, you also need to keep the parallelism in mind. If your downstream operator or sink has a parallelism of 1 and your SQL query pipeline has a higher parallelism, the retract results are rebalanced and arrive in the wrong order. For example, if you view the changelog in SQL Client, the
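The ordering problem described above can be illustrated with a small self-contained simulation (this is not Flink code; the `Msg` changelog type and the replay logic are toy stand-ins for a retract stream). A retraction that overtakes, or is overtaken by, the add it belongs to leaves the sink with a wrong final result:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RetractOrderDemo {
    // A toy changelog message: add == true appends the new result for a key,
    // add == false retracts the current result for the key.
    static final class Msg {
        final boolean add; final String key; final long value;
        Msg(boolean add, String key, long value) {
            this.add = add; this.key = key; this.value = value;
        }
    }

    // Replays a changelog into a simple key -> latest-value table.
    static Map<String, Long> apply(List<Msg> msgs) {
        Map<String, Long> state = new HashMap<>();
        for (Msg m : msgs) {
            if (m.add) state.put(m.key, m.value);
            else state.remove(m.key);
        }
        return state;
    }

    public static void main(String[] args) {
        Msg add1 = new Msg(true, "user", 1);      // count becomes 1
        Msg retract1 = new Msg(false, "user", 1); // retract the old count
        Msg add2 = new Msg(true, "user", 2);      // count becomes 2

        // In-order arrival: the sink ends up with the correct count of 2.
        System.out.println(apply(Arrays.asList(add1, retract1, add2)));

        // If rebalancing lets add2 overtake retract1, the retraction is
        // applied last and deletes the correct result entirely.
        System.out.println(apply(Arrays.asList(add1, add2, retract1)));
    }
}
```

This is why a retract-producing pipeline feeding a parallelism-1 sink through a rebalance can show a changelog that never converges to the right table.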

Re: OpenSSL use in Flink

2018-09-26 Thread Timo Walther
Hi Suchithra, did you take a look at the documentation [1] about the SSL setup? Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/security-ssl.html Am 26.09.18 um 14:08 schrieb V N, Suchithra (Nokia - IN/Bangalore): Hello, I have a query regarding OpenSSL

Re: Could not find a suitable table factory for 'org.apache.flink.table.factories.DeserializationSchemaFactory' in the classpath.

2018-09-26 Thread Timo Walther
Hi, actually it should not be necessary to put the flink-json format into /lib. Is the class `org.apache.flink.formats.json.JsonRowFormatFactory` present in the jar file you are creating with Maven? There should also be an entry in

Re: Conversion to relational algebra failed to preserve datatypes

2018-09-14 Thread Timo Walther
Hi, could you maybe post the query that caused the exception? I guess the exception is related to a time attribute [1]. For the optimizer, time attributes and timestamps make no difference; however, they have a slightly different data type that might have caused the error. I think this is a bug that

Re: Orc Sink Table

2018-09-13 Thread Timo Walther
Hi Jose, you have to add additional Maven modules depending on the connector/format you want to use. See this page [1] for more information. Feel free to ask further questions if the description is not enough for you. Regards, Timo [1]

Re: ElasticSearch 6 - error with UpdateRequest

2018-08-31 Thread Timo Walther
The problem is that BulkProcessorIndexer is located in flink-connector-elasticsearch-base, which is compiled against a very old ES version. This old version is source-code compatible but apparently not binary compatible with newer Elasticsearch classes. By copying this class you force to

Re: Table API: get Schema from TableSchema, TypeInformation[] or Avro

2018-08-31 Thread Timo Walther
or should I hard code a List>? All the best François 2018-08-31 8:12 GMT+02:00 Timo Walther <mailto:twal...@apache.org>>: Hi, thanks for your feedback. I agree that the the current interfaces are not flexible enough to fit to every use case. The unified connector API

Re: ElasticSearch 6 - error with UpdateRequest

2018-08-31 Thread Timo Walther
Hi Averell, sorry for my wrong other mail. I also observed this issue when implementing FLINK-3875. Currently, update requests are broken due to a binary incompatibility. I already have a fix for this in a different branch. I opened FLINK-10269 [1] to track the issue. As a work around you

Re: Table API: get Schema from TableSchema, TypeInformation[] or Avro

2018-08-31 Thread Timo Walther
Hi, thanks for your feedback. I agree that the current interfaces are not flexible enough to fit every use case. The unified connector API is a very recent feature that still needs some polishing. I'm working on a design document to improve the situation there. For now, you can

Re: ElasticSearch 6 - error with UpdateRequest

2018-08-31 Thread Timo Walther
Hi, thanks for your feedback. I agree that the current interfaces are not flexible enough to fit every use case. The unified connector API is a very recent feature that still needs some polishing. I'm working on a design document to improve the situation there. For now, you can

Re: withFormat(Csv) is undefined for the type BatchTableEnvironment

2018-08-30 Thread Timo Walther
Hi François, you should read the documentation from top to bottom. The overview part [1] explains how everything plays together with examples. Regards, Timo [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/connect.html#overview Am 30.08.18 um 10:41 schrieb Till
