Re: [DISCUSS] Flink's supported APIs and Hive query syntax

2022-03-07 Thread Jing Zhang
to this migration work. Best, Jing Zhang Martijn Visser 于2022年3月7日周一 19:23写道: > Hi everyone, > > Flink currently has 4 APIs with multiple language support which can be used > to develop applications: > > * DataStream API, both Java and Scala > * Table API, both Java and Scala > *

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
Hi Jing, Please try this way, Only create one sink for final output, write the window aggregate and topN in one query, write the result of topN into the final sink. Best, Jing Zhang Jing 于2021年12月24日周五 03:13写道: > Hi Jing Zhang, > > Thanks for the reply! My current implementatio

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
he.org/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/ topn/ Jing Zhang 于2021年12月23日周四 17:04写道: > Hi Jing, > I'm afraid there is no possible to Window TopN in SQL on 1.12 version > because window TopN is introduced since 1.13. > > > I saw the one possibility is to c

Re: Window Top N for Flink 1.12

2021-12-23 Thread Jing Zhang
tps://nightlies.apache.org/flink/flink-docs-master/docs/dev/table/sql/queries/window-topn/ Best, Jing Zhang Jing 于2021年12月23日周四 16:12写道: > Hi, Flink community, > > Is there any existing code I can use to get the window top N with Flink > 1.12? I saw the one possibility is to create a table and insert t

Re: How to select an event that has max attribute

2021-12-11 Thread Jing Zhang
You are Welcome. Glad to hear the information is helpful. Guoqin Zheng 于2021年12月10日周五 03:28写道: > Hi Jing, > > Thanks for the advice. This is very helpful. > > -Guoqin > > On Wed, Dec 8, 2021 at 11:52 PM Jing Zhang wrote: > >> Hi Guoqin, >> I understand t

Re: How to select an event that has max attribute

2021-12-08 Thread Jing Zhang
stream instead of append stream, which means the result sent might be retracted later. Besides, you could take care of state clean. [1] https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/topn/ Best, Jing Zhang Guoqin Zheng 于2021年12月9日周四 14:16写道: > Hi J

Re: How to select an event that has max attribute

2021-12-08 Thread Jing Zhang
( TUMBLE(TABLE MyTable, DESCRIPTOR(readtime), INTERVAL '5' MINUTES)) ) WHERE rownum <= 1; [1] https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/sql/queries/window-topn/#window-top-n-follows-after-windowing-tvf Best, Jing Zhang Guoqin Zheng 于2021年12月9日周四 10:30写道: &

Re: Field names must be unique. Found duplicates

2021-11-28 Thread Jing Zhang
Hi, Thanks for reporting this BUG. It seems to be a duplicate with FLINK-23919 [1] which would be solved in 1.13.4. [1] https://issues.apache.org/jira/browse/FLINK-23919 Best, Jing Zhang Ivan Budanaev 于2021年11月28日周日 下午7:18写道: > I am getting the *Field names must be unique. Found duplica

Re: Is there a way to print key and state metadata/types for a job?

2021-11-26 Thread Jing Zhang
Hi Dan, AFAIK, there is no built-in way to solve your problem. You could check whether state processor API <https://nightlies.apache.org/flink/flink-docs-master/zh/docs/libs/state_processor_api/> could help you. Or you could add log in your program. Best, Jing Zhang Dan Hill 于2021年11月26日周

Re: How to express the datatype of sparksql collect_list(named_struct(...))inflinksql?

2021-11-11 Thread JING ZHANG
Hi, vtygoss +1 on Timo's solutions. I tried those two solutions in 1.12. Both them could work well. Thanks @Timo for good suggestions. Best, JING ZHANG

Re: How to express the datatype of sparksql collect_list(named_struct(...))in flinksql?

2021-11-09 Thread JING ZHANG
Hi vtygoss, I'm a little confused. The UDF could already work well without defining `DataTypeHint `annotation. Why do you define `DataTypeHint `annotation before input parameter of `eval `method? Best, JING ZHANG vtygoss 于2021年11月9日周二 下午8:17写道: > Hi, JING ZHANG! > > Thanks for your m

Re: How to express the datatype of sparksql collect_list(named_struct(...)) in flinksql?

2021-11-08 Thread JING ZHANG
ry("select TestFunc(COLLECT(ROW(id, name))) as info from table group by ...") @SerialVersionUID(1L) object TestFunc extends ScalarFunction { def eval(s: java.util.Map[Row, Integer]): String = s.keySet().mkString("\n") } Best regards, JING ZHANG vtygoss 于2021年11月8日周一 下午

Re: Need help with window TopN query

2021-11-04 Thread JING ZHANG
is not triggered yet? Best, JING ZHANG Francesco Guardiani 于2021年11月5日周五 上午12:57写道: > As a rule of thumb, I would first try to check that Flink ingests > correctly your csv. Perhaps try to run just a select on your input and see > if the input is parsed as expected and is ordered. &

Re: Alternate to PreserveWatermark() in recent Flink versions

2021-10-25 Thread JING ZHANG
erness`. > > I want to know if the preservation of watermarks happens there by default. > Since in the newer APIs source-level watermark definitions are deprecated. > > > > On Mon, Oct 25, 2021 at 2:17 PM JING ZHANG wrote: > >> Hi, >> I'm not

Re: Time different between checkpoint and savepoint restoration in GCS

2021-10-25 Thread JING ZHANG
ink-docs-master/docs/ops/state/savepoints/#what-is-a-savepoint-how-is-a-savepoint-different-from-a-checkpoint Best, JING ZHANG [2] Roman Khachatryan 于2021年10月25日周一 下午4:53写道: > Hi ChangZhuo, > > Yes, restoring from a savepoint is expected to be significantly slower > from a checkpoi

Re: Kafka Stream Window

2021-10-25 Thread JING ZHANG
Hi, I guess you would get suggestions more quickly if you send this email in user mail list of KStream. Or you could use Flink to complete your requirements directly. Mohammed Kamaal 于2021年10月25日周一 下午3:47写道: > Hi, > > Is there a way to define a sliding window with a count (number of >

Re: Alternate to PreserveWatermark() in recent Flink versions

2021-10-25 Thread JING ZHANG
Hi, I'm not sure I understand your requirement. However, are you looking for `PreserveWatermarks` in package `org.apache.flink.table.sources.wmstrategies`? Best, JING ZHANG Arujit Pradhan 于2021年10月25日周一 下午4:02写道: > Hi all, > > > We maintain an Open-sourced project for protobuf dat

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-24 Thread JING ZHANG
savepoint, you change the parallelism, whether the partitioner l1-> J and l2-> J still be 'forward'? Would you please check this? If they are still be 'forward', we need to look for other reason for this problem. Dan Hill 于2021年10月25日周一 上午10:43写道: > On Sat, Oct 23, 2021 at 9:16 PM JING ZH

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-23 Thread JING ZHANG
t; >> I found the following link about this. Still looks applicable. In my >> case, I don't need to do a broadcast join. >> >> https://www.alibabacloud.com/blog/flink-how-to-optimize-sql-performance-using-multiple-input-operators_597839 >> >> On Thu, Oct 21, 202

Re: Flink JDBC connect with secret

2021-10-23 Thread JING ZHANG
/connecting-with-ssl-encryption?view=sql-server-ver15 [5] https://www.ibm.com/docs/ar/informix-servers/14.10?topic=options-connecting-jdbc-applications-ssl Best, JING ZHANG Qihua Yang 于2021年10月23日周六 下午1:11写道: > Hi, > > We plan to use JDBC SQL connector to read/write database. I saw JDBC &g

Re: [ANNOUNCE] Apache Flink 1.13.3 released

2021-10-21 Thread JING ZHANG
Thank Chesnay, Martijn and every contributor for making this happen! Thomas Weise 于2021年10月22日周五 上午12:15写道: > Thanks for making the release happen! > > On Thu, Oct 21, 2021 at 5:54 AM Leonard Xu wrote: > > > > Thanks to Chesnay & Martijn and everyone who made this release happen. > > > > > >

Re: Latency Tracking in Apache Flink

2021-10-20 Thread JING ZHANG
track latency for each event processing because a group of raw records would be aggregated into an accumulator result. Best, JING ZHANG Puneet Duggal 于2021年10月21日周四 上午1:21写道: > Hi, > > Is there any way to track latency / time taken for each event processing. > Read about latency ma

Re: Wired rows in SQL Query Result (Changelog)

2021-10-20 Thread JING ZHANG
'Technology') Those two record arrives at downstream aggregate, each record would leads to two output record (-U, +U). so each update at source would lead to 4 output records. Hope this helps. Best, JING ZHANG thekingofcity 于2021年10月20日周三 下午5:52写道: > Hi, > > I'm working on

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-18 Thread JING ZHANG
defined in DML. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/table/jdbc/#idempotent-writes Best, JING ZHANG M Singh 于2021年10月19日周二 上午7:00写道: > Hi Jing: > > Thanks for your response and example. > > Is there a DataStream api for using the u

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-18 Thread JING ZHANG
d you please check Dan's question? Please correct me if I'm wrong. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-92%3A+Add+N-Ary+Stream+Operator+in+Flink?spm=a2c65.11461447.0.0.786d2323FtzWaR Best, JING ZHANG Dan Hill 于2021年10月16日周六 上午6:28写道: > Thanks Thias and JING ZHANG!

Re: Apache Flink - Using upsert JDBC sink for DataStream

2021-10-17 Thread JING ZHANG
+ " GROUP BY len, cTag\n" > + ")\n" > + "GROUP BY cnt, cTag") > .await(); > > [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/connectors/table/jdbc/#key-han

Re: Catching SIGINT With flink Jobs

2021-10-17 Thread JING ZHANG
Hi, Would you please describe your demand in more detail? Do you want to release resource when the job is closing? If yes, you could overwrite the close() method of the custom source and custom sink. Best, JING ZHANG Caizhi Weng 于2021年10月18日周一 上午10:27写道: > Hi! > > This is gener

Re: Removing metrics

2021-10-15 Thread JING ZHANG
, JING ZHANG Mason Chen 于2021年10月15日周五 上午8:43写道: > Hi all, > > Suppose I have a short lived process within a UDF that defines metrics. > After the process has completed, the underlying resources should be cleaned > up. Is there an API to remove/unregister metrics? > > Best, > Mason >

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-15 Thread JING ZHANG
those two operators chained into one vertex? Best, JING ZHANG Schwalbe Matthias 于2021年10月15日周五 下午2:57写道: > … didn’t mean to hit the send button so soon  > > > > I guess we are getting closer to a solution > > > > > > Thias > > > > > > > >

Re: Any issues with reinterpretAsKeyedStream when scaling partitions?

2021-10-14 Thread JING ZHANG
int? Would you please provide the code to reproduce the code? Best, JING ZHANG Dan Hill 于2021年10月14日周四 下午11:51写道: > I have a job that uses reinterpretAsKeyedStream across a simple map to > avoid a shuffle. When changing the number of partitions, I'm hitting an > issue with registerEventTi

Re: Table API joining 2 streams with periodic updates

2021-10-14 Thread JING ZHANG
need revise the query again or switch to `toChangelogStream` . You could find more explanation and demo in document [1] [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/data_stream_api/#converting-between-datastream-and-table Best, JING ZHANG Robert Cullen 于2021

Re: Reset of transient variables in state to default values.

2021-10-12 Thread JING ZHANG
currentFile) Best, JING ZHANG Yun Tang 于2021年10月12日周二 下午4:41写道: > Hi Alex, > > Since you use customized MultiStorePacketState class as the value state > type, it should use kryo serializer [1] to serialize your class via > accessing RocksDB state or checkpoint via FileSystemStateBackend

Re: Reset of transient variables in state to default values.

2021-10-11 Thread JING ZHANG
Hi Alex, It is a little weird. Would you please provide the program which could reproduce the problem, including DataStream job code and related classes code. I need some debug to find out the reason. Best, JING ZHANG Alex Drobinsky 于2021年10月11日周一 下午5:50写道: > Hi Jing Zhang, > >

Re: Unsubscribe

2021-10-11 Thread JING ZHANG
...@flink.apache.org For more information, please go to [1]. [1] https://flink.apache.org/community.html#mailing-lists Best, JING ZHANG Jesús Vásquez 于2021年10月11日周一 下午9:40写道: > Hello i want to unsubscribe >

Re: Reset of transient variables in state to default values.

2021-10-11 Thread JING ZHANG
thought it was expected behavior because RocksDBStateBackend stores all state as byte arrays in embedded RocksDB instances. Therefore, it de/serializes the state of a key for every read/write access. CurrentFile is null because the transient variable would not be serialized by default. Best, JING ZHANG

Re: flink job : TPS drops from 400 to 30 TPS

2021-09-27 Thread JING ZHANG
/projects/flink/flink-docs-release-1.13/docs/ops/debugging/flame_graphs/ Hope it helps. Best, JING ZHANG Ragini Manjaiah 于2021年9月27日周一 下午5:25写道: > please let me know how to check Does RPC response and CPU cost > > On Mon, Sep 27, 2021 at 1:19 PM JING ZHANG wrote: > >> Hi, >> S

Re: flink job : TPS drops from 400 to 30 TPS

2021-09-27 Thread JING ZHANG
fine? ... Would you please provide more information about the job, for example back pressure status, input data distribution, async mode or sync mode lookup. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/monitoring/back_pressure/ Best, JING ZHANG Ragini Manjaiah 于2021年9月27日

Re: 退订

2021-09-26 Thread JING ZHANG
...@flink.apache.org For more information, please go to [1]. [1] https://flink.apache.org/community.html#mailing-lists Best, JING ZHANG maozhaolin 于2021年9月26日周日 下午3:28写道: > 退订

Re: 退订

2021-09-26 Thread JING ZHANG
...@flink.apache.org For more information, please go to [1]. [1] https://flink.apache.org/community.html#mailing-lists Best, JING ZHANG maozhaolin 于2021年9月26日周日 下午3:28写道: > 退订

Re: stream processing savepoints and watermarks question

2021-09-24 Thread JING ZHANG
are continuous being registered. I would try to reproduce the problem in an ITCase, and once completed I would provide the code. Best, JING ZHANG Guowei Ma 于2021年9月24日周五 下午5:16写道: > Hi, JING > > Thanks for the case. > But I am not sure this would happen. As far as I know the event timer

Re: stream processing savepoints and watermarks question

2021-09-24 Thread JING ZHANG
is triggered, then registers another timer and forever. I'm not sure whether Macro meets a similar problem. Best, JING ZHANG Guowei Ma 于2021年9月24日周五 下午4:01写道: > Hi Macro > > Indeed, as mentioned by JING, if you want to drain when triggering > savepoint, you will encounter this M

Re: stream processing savepoints and watermarks question

2021-09-24 Thread JING ZHANG
-a-final-savepoint Best, JING ZHANG Marco Villalobos 于2021年9月24日周五 下午12:54写道: > Something strange happened today. > When we tried to shutdown a job with a savepoint, the watermarks became > equal to 2^63 - 1. > > This caused timers to fire indefinitely and crash downstream systems wi

Re: Built-in functions to manipulate MULTISET type

2021-09-18 Thread JING ZHANG
Hi Yuval, You could open a JIRA to track this if you think some functions should be added as built-in functions in Flink. Best, JING ZHANG Yuval Itzchakov 于2021年9月18日周六 下午3:33写道: > The problem with defining a UDF is that you have to create one overload > per key type in the MULTISET. It

Re: Built-in functions to manipulate MULTISET type

2021-09-17 Thread JING ZHANG
Hi Kai, AFAIK, there is no built-in function to extract the keys in MULTISET <https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/types/> to be an ARRAY. Define a UTF is a good solution. Best, JING ZHANG Kai Fu 于2021年9月18日周六 上午7:35写道: > Hi team, > >

Re: FlinkSQL Sinks

2021-09-15 Thread JING ZHANG
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sqlclient/#execute-a-set-of-sql-statements Best, JING ZHANG Martijn Visser 于2021年9月15日周三 下午11:07写道: > Hi, > > You can do this directly via the SQL client by defining (for example > Kafka) as a TABLE [1]

Re: Create a lookup table in a StreamExecutionEnvironment

2021-09-10 Thread JING ZHANG
-master/docs/dev/table/common/#connector-tables Best, JING ZHANG Robert Cullen 于2021年9月11日周六 上午1:02写道: > I have a developer that wants to create a lookup table in Kafka with data > that will be used later when sinking with S3. The lookup table will have > folder information that wil

Re: Usecase for flink

2021-09-09 Thread JING ZHANG
resources. Best, JING ZHANG Dipanjan Mazumder 于2021年9月10日周五 上午11:11写道: > Hi, > >I am working on a usecase and thinking of using flink for the same. > The use case is i will be having many large resource graphs , i need to > parse that graph for each node and edge and evaluate e

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread JING ZHANG
ime-and-watermarks Best, JING ZHANG Marco Villalobos 于2021年9月7日周二 下午11:45写道: > I am guessing the answer is right infront of me. > > If the window has these attributes: > > window start: 00:30:00.000. > window end: 00:45:00.000 > max timestamp: 00:44:59.999 > > Then pe

Re: What is the event time of an element produced in a timer?

2021-09-07 Thread JING ZHANG
SQL [2]. The situation is very similar. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/datastream/operators/windows/ [2] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sql/queries/window-agg/ Best regards, JING ZHANG Marco Villalobos 于2021年9月7日周二 下午2:24写

Re: [ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread JING ZHANG
Thanks Leonard for driving this. The information is helpful. Best, JING ZHANG Jark Wu 于2021年9月6日周一 下午4:59写道: > Thanks Leonard, > > I have seen many users complaining that the Flink mailing list doesn't > work (they were using Nabble). > I think this information would be very he

Re: [ANNOUNCE] Flink mailing lists archive service has migrated to Apache Archive service

2021-09-06 Thread JING ZHANG
Thanks Leonard for driving this. The information is helpful. Best, JING ZHANG Jark Wu 于2021年9月6日周一 下午4:59写道: > Thanks Leonard, > > I have seen many users complaining that the Flink mailing list doesn't > work (they were using Nabble). > I think this information would be very he

Re: Checkpointing failure, subtasks get stuck

2021-09-02 Thread JING ZHANG
could check whether there is any network problem. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/monitoring/checkpoint_monitoring/ [2] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/state/unaligned_checkpoints/ Best, JING ZHANG Xiangyu Su 于2021年9月1日周三 下午

Re: Session windows - how to get the last value from a window using FlinkSQL?

2021-08-31 Thread JING ZHANG
your UserDefined AggregateFunction [1] to walk around. Please implement `merge` method in the UserDefined Aggregate Function if they would be used in Session Window. * [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/functions/udfs/#aggregate-functions Best regards, JING

Re: Flink performance with multiple operators reshuffling data

2021-08-31 Thread JING ZHANG
a data skew, then consider how to solve it. Best, JING ZHANG Jason Liu 于2021年8月31日周二 下午4:23写道: > Thanks for the help guys! > > Yea we can potentially append random strings to the keys and duplicate > data across them to avoid skewness, if necessary. But I think the keys we >

Re: Delete Keyed State outside of StateTTL

2021-08-31 Thread JING ZHANG
utomatically? or will be present but the state > will be empty, in that case what is the implication on memory occupation? > > On Tue, Aug 31, 2021 at 8:14 AM JING ZHANG wrote: > >> Hi, >> All types of state also have a method clear() that clears the state for >> the currently act

Re: Flink performance with multiple operators reshuffling data

2021-08-30 Thread JING ZHANG
function with the lowest selectivity at the top. The lower the ratio of output records number to input records number, the lower the selectivity is. Best, JING ZHANG Jason Liu 于2021年8月31日周二 上午8:12写道: > Hi there, > > We have this use case where we need to have multiple keybys

Re: Delete Keyed State outside of StateTTL

2021-08-30 Thread JING ZHANG
Hi, All types of state also have a method clear() that clears the state for the currently active key, i.e. the key of the input element. Could we call the `clear()` method directly to remove the state under the specified key? Best, JING ZHANG narasimha 于2021年8月31日周二 上午9:44写道: > Hi, > &g

Re: Error while deserializing the element

2021-08-26 Thread JING ZHANG
to memory usage and increase memory resources if necessary. JING ZHANG 于2021年8月20日周五 下午1:27写道: > Hi Vijay, Yun, > I've created a JIRA https://issues.apache.org/jira/browse/FLINK-23886 to > track this. > > Best, > JING ZHANG > > JING ZHANG 于2021年8月20日周五 下午1:19写道: > >>

Re: DataStream to Table API

2021-08-22 Thread JING ZHANG
es.STRING, Types.INT)); Best, JING ZHANG Matthias Broecheler 于2021年8月21日周六 上午12:40写道: > Thank you, Caizhi, for looking into this and identifying the source of the > bug. Is there a way to work around this at the API level until this bug is > resolved? Can I somehow "injec

Re: Pre shuffle aggregation in flink is not working

2021-08-22 Thread JING ZHANG
-aggregation for window aggregate. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/window-agg/ Welcome to discuss in a further step. Best, JING ZHANG suman shil 于2021年8月22日周日 上午12:13写道: > Hi JING, > Thanks for the pointers. > > 1) I am a

Re: Pre shuffle aggregation in flink is not working

2021-08-20 Thread JING ZHANG
nction, MapBundleOperator, and CountBundleTrigger are not marked as @public, they have no guarantee of compatibility. You'd better copy them for your own use. Best, JING ZHANG suman shil 于2021年8月20日周五 下午2:18写道: > Hi Jing, > I tried using `*MapBundleOperator*` also (I am yet to test with >

Re: Error while deserializing the element

2021-08-19 Thread JING ZHANG
Hi Vijay, Yun, I've created a JIRA https://issues.apache.org/jira/browse/FLINK-23886 to track this. Best, JING ZHANG JING ZHANG 于2021年8月20日周五 下午1:19写道: > Hi Vijay, > I have encountered the same problem several times in online production > Flink jobs, but I have not found the r

Re: Error while deserializing the element

2021-08-19 Thread JING ZHANG
would invite Yun Tang who is an expert on the topic to look into the problem, we could also create a JIRA to track the issue. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/state/state_backends/#timers-heap-vs-rocksdb Best, JING ZHANG vijayakumar palaniappan 于2021年8月19日

Re: Periodic output at end of stream

2021-08-19 Thread JING ZHANG
Hi Matthias, Thanks for providing the example, I would reply back soon after I do some debug. Best, JING ZHANG Matthias Broecheler 于2021年8月19日周四 上午1:53写道: > Hey JING, > > thanks for getting back to me. I tried to produce the smallest, > self-contained example that produces th

Re: Pre shuffle aggregation in flink is not working

2021-08-19 Thread JING ZHANG
Hi Suman, Please try copy `*MapBundleOperator*`, update the `HashMap` to `LinkedHashMap` to keep the output sequence consistent with input sequence. Best, JING ZHANG suman shil 于2021年8月20日周五 上午2:23写道: > Hi Jing, > Thanks for looking into this. Here is the code of `TaxiFareStre

Re: Pre shuffle aggregation in flink is not working

2021-08-19 Thread JING ZHANG
is a HashMap object which could not keep the data input sequence. I'm afraid there exists unorder in a bundle (in your case 10) problem. I'm not sure whether it is reasonable to assign a watermark based on an unordered timestamp. Best, JING ZHANG suman shil 于2021年8月19日周四 下午12:43写道: > I am try

Re: Handling HTTP Requests for Keyed Streams

2021-08-17 Thread JING ZHANG
external I/O; else an external call would be triggered, and the result value would be cached into LRU Cache. E.g Hive dimension table source would load all data into Cache in Memory, the cache would refresh regularly according to the specified interval. Hope the information is helpful. Best, JING

Re: Can i contribute for flink doc ?

2021-08-16 Thread JING ZHANG
re would be one or more committers to review the pull request and merge it finally. [1] https://flink.apache.org/contributing/contribute-documentation.html Best, JING ZHANG Camile Sing 于2021年8月17日周二 上午10:57写道: > Hi, all > I'm a Flink user. recently I find some problems when I use Flink,

Re: Windowed Aggregation With Event Time over a Temporary View

2021-08-16 Thread JING ZHANG
time [3] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/concepts/time_attributes/#during-datastream-to-table-conversion [4] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/queries.html#selecting-group-window-start-and-end-timestamps Best, JING ZHANG Joseph Lore

Re: Periodic output at end of stream

2021-08-15 Thread JING ZHANG
Hi Matthias, How often do you register the event-time timer? It is registered per input record, or re-registered a new timer after an event-time timer is triggered? Would you please provide your test case code, it would be very helpful for troubleshooting. Best wishes, JING ZHANG Matthias

Re: Bug with PojoSerializer? java.lang.IllegalArgumentException: Can not set final double field Event.rating to java.lang.Integer

2021-08-15 Thread JING ZHANG
> > target = (T) > subclassSerializer.createInstance();// target is an > Integer instead of my Event class?? > > // also initialize fields for which the subclass serializer is > not responsible > > initializeFields(target); > > > > >

Re: UNION ALL on two LookupTableSources

2021-08-15 Thread JING ZHANG
nts `LookupTableSource` and ` ScanTableSource` at same time, it could work fine under the above case because the optimizer would automatically convert the physical plan to ` StreamPhysicalTableSourceScan` which would scan the data. Best, JING ZHANG Yuval Itzchakov 于2021年8月16日周一 上午4:15写道: > Hi, > >

Re: Bug with PojoSerializer? java.lang.IllegalArgumentException: Can not set final double field Event.rating to java.lang.Integer

2021-08-13 Thread JING ZHANG
Hi Yu, Thias provides a nice method to debug the issue. Big +1. Please try the way, feel free get back for further discussion. Best, JING ZHANG Schwalbe Matthias 于2021年8月13日周五 下午3:22写道: > Good Morning Nathan, > > > > The exception stack does not give enough informat

Re: Bug with PojoSerializer? java.lang.IllegalArgumentException: Can not set final double field Event.rating to java.lang.Integer

2021-08-12 Thread JING ZHANG
Hi Yu, The exception is thrown after processing some input data instead of at the beginning of the input, right? Is there any possible that input schema has updated? Nathan Yu 于2021年8月13日周五 上午8:38写道: > · Using local environment: > StreamExecutionEnvironment.createLocalEnvironment() > >

Re: Flink not processing the records

2021-08-12 Thread JING ZHANG
Hi Megha, Event window would be triggered after the watermark passed the end of window. Would you please check the watermark value on the Flink UI. Best, JING ZHANG Megha Gandhi 于2021年8月13日周五 上午2:33写道: > > > On Aug 12, 2021, at 11:31 AM, Megha Gandhi wrote: > > Hi > This is

Re: Questions on usage of SQL hints

2021-08-12 Thread JING ZHANG
Hi Paul, I'm very happy to hear that., Paul Lam 于2021年8月12日周四 下午3:17写道: > Hi JING, > > Thanks for your inputs! It helps a lot. > > Best, > Paul Lam > > 2021年8月12日 13:13,JING ZHANG 写道: > > Hi Paul, > There are Table hints and Query hints. > Query hints are

Re: Questions on usage of SQL hints

2021-08-11 Thread JING ZHANG
+for+Flink+SQL [2] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/hints/ [3] https://issues.apache.org/jira/browse/FLINK-20670 Best, JING ZHANG Paul Lam 于2021年8月12日周四 下午12:10写道: > Hi community, > > I’m trying out SQL hints on DML, but there’s not

Re: how to emit a deletion event for all data in iterating of production logic

2021-08-10 Thread JING ZHANG
of `RowData` to `Delete` upon the Datastream read from source [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/insert/#insert-from-select-queries Best JING ZHANG vtygoss 于2021年8月10日周二 下午7:35写道: > Hi, Flink community! > > > I have a problem wh

Re: Allowed lateness in Flink SQL

2021-08-10 Thread JING ZHANG
Hi Maciej, The pr is related to FLINK-21301 [1]. Sets the time by which elements are allowed to be late. Elements that arrive behind the watermark by more than the specified time " + "will be dropped. " + "Note: use the value if it is

Re: Approach to test custom Source/Sink

2021-08-09 Thread JING ZHANG
/flink/tree/master/flink-connectors Best, JING ZHANG Xinbin Huang 于2021年8月10日周二 上午4:22写道: > Hi team, > > I'm currently implementing a custom source and sink, and I'm trying to > find a way to test these implementations. The testing section > <https://ci.apache.org/projects/flink/

Re: Table API Throws Calcite Exception CannotPlanException When Tumbling Window is Used

2021-08-05 Thread JING ZHANG
Hi Joe, Window TVF is supported since Flink 1.13, while 1.12 does not support yet. Please upgrade to 1.13 version, or use the old Group Window Aggregate [1] syntax in 1.12. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/sql/queries.html#group-windows Best, JING ZHANG

Re: Over Window Aggregation Tuning

2021-08-03 Thread JING ZHANG
Hi Wanghui, Caizhi AFAIK, over window doesn't support mini batch optimization yet. Caizhi Weng 于2021年8月2日周一 上午10:20写道: > Hi! > > As the state grows the processing speed will slow down a bit. Which state > backend are you using? Is mini batch enabled[1]? > > [1] >

Re: Flink 1.13 Tumble Window Setting Time Attribute Column

2021-08-03 Thread JING ZHANG
+ "GROUP BY TUMBLE(ts, INTERVAL '1' SECOND)"; Table result = tableEnv.sqlQuery(query); tableEnv.toAppendStream(result, Row.class).print(); env.execute("Streaming Window SQL Job"); } } Best, JING ZHANG Pranav Patil 于2021年8月4日周三 上午6:29写道: > T

Re: Flink 1.13 Tumble Window Setting Time Attribute Column

2021-08-02 Thread JING ZHANG
://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/concepts/time_attributes/#event-time Best, JING ZHANG Pranav Patil 于2021年8月3日周二 上午8:51写道: > Hi, > > I'm upgrading a repository from Flink 1.11 to Flink 1.13. I have Flink SQL > command that used to do tumbling w

Re: [External] naming table stages

2021-07-28 Thread JING ZHANG
Hi Yuval, Thanks for pointing that out. Yes, metrics names would also be affected. The need sounds reasonable, we could create a JIRA and give a detailed description of the requirements. cc @godfrey @Kurt to provide more user perspective, they may be interested in the feature. Best, JING ZHANG

Re: Table Aggregate - Flink SQL

2021-07-28 Thread JING ZHANG
a single value based on a set of other values with same group key in ANSI SQL. Best, JING ZHANG Pranav Patil 于2021年7月29日周四 上午1:27写道: > I want to create a Python UDF for a table aggregate function. The > documentation explains this, and how to use its results by calling the > flatAggregate

Re: [External] naming table stages

2021-07-28 Thread JING ZHANG
operator is doing. But too long message would cause problem for log system and visual display in frontend. Maybe it's better to shorten the operator name if it is too long. And there is a way to query the complete name of a given operator which help us to troubleshoot problems. Best, JING ZHANG

Re: Issue with Flink SQL using RocksDB backend

2021-07-26 Thread JING ZHANG
Hi Yuval, I run a similar SQL (without `FIRST` aggregate function), there is nothing wrong. `FIRST` is a custom aggregate function? Would you please check if there is a drawback in `FIRST`? Whether the query could run without `FIRST`? Best, JING ZHANG Yuval Itzchakov 于2021年7月27日周二 上午12:29写道

Re: Migrating Kafka Sources (major version change)

2021-07-25 Thread JING ZHANG
earliest_offset because new topic name is different the previous one, so those KafkaTopicPartition could not be found in restored state. And restored state would be overwritten with new Kafka topic and offsets after a checkpoint. pease ensure that UID of the successor operators are not changed. Best, JING

Re: Finding matched state

2021-07-25 Thread JING ZHANG
search in a given job? Best, JING ZHANG Abu Bakar Siddiqur Rahman Rocky 于2021年7月25日周日 下午4:15写道: > Hello, > > is there any library or tools to get the matched state (if the current > state is similar to the previous state, in that case)? > > Thank you > > -- > Regards, > Abu Bakar Siddiqur Rahman > >

Re: Kafka data sources, multiple interval joins and backfilling

2021-07-21 Thread JING ZHANG
ts like an interval join (with fuzzy > matching). We wrote our own KeyedCoProcessFunction and modeled it closely > after the internal interval join code. Does Flink have special logic with > the built in interval join code that impacts how kafka data sources are > read? > > > > O

Re: Can we share state between different keys in the same window?

2021-07-19 Thread JING ZHANG
, let's see if there is other way to satisfy the requirement. Best, JING ZHANG Sweta Kalakuntla 于2021年7月20日周二 上午5:04写道: > Hi, > > I need to query the database(not a source but for additional information) > in ProcessFunction. I want to save the results in a state or some other way > so

Re: Flink UDF Scalar Function called only once for all rows of select SQL in case of no parameter passed

2021-07-13 Thread JING ZHANG
is a dynamic function. (not applied for UDF, it is used for flink built-in SqlOperator). Best, JING ZHANG shamit jain 于2021年7月14日周三 上午3:00写道: > Hi, > > I am facing an issue where scalar UDF called once in case if no parameter > passed as given below:- > > public class DateTime

Re: More detail information in sql validate exception

2021-07-08 Thread JING ZHANG
://flink.apache.org/contributing/contribute-code.html Best, JING ZHANG 纳兰清风 于2021年7月8日周四 下午12:00写道: > Hi, > Currently, When I was using a lot of the same udf in a sql, I can't locate > where the semantic occor if some udf being used in a wrong way. So I try to > change some code in

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-07 Thread JING ZHANG
e same application twice - with customer record changing in between the two runs, since the orders table has proc time and customer record does not show any time attribute, will the results of join differ - since the customer record has changed during the two runs ? Yes. Best, JING ZHANG [1] htt

Re: State Processor API and existing state

2021-07-07 Thread JING ZHANG
; AsyncSnapshotStrategySynchronicityBehavior.java:41) > > at org.apache.flink.runtime.state.heap.HeapSnapshotStrategy > .newStateTable(HeapSnapshotStrategy.java:243) > > at org.apache.flink.runtime.state.heap.HeapRestoreOperation > .createOrCheckStateForMetaInfo(HeapRestoreOperat

Re: Apache Flink - How to use/invoke LookupTableSource/Function

2021-07-06 Thread JING ZHANG
join, window join). Otherwise an exception would be thrown out in the compile phase. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/table/sql/queries/joins/ Best regards, JING ZHANG M Singh 于2021年7月7日周三 上午8:23写道: > Hey Folks: > > I am trying to understand

Re: How to calculate how long an event stays in flink?

2021-07-03 Thread JING ZHANG
larity of these distributions can be > controlled in the Flink configuration - metrics.latency.granularity > > However, I am not sure if this option meets my need - is it possible to > obtain only the whole time spent between the source and the sink, without > the detailed time spent on each operat

Re: How to calculate how long an event stays in flink?

2021-07-01 Thread JING ZHANG
Hi Xiuming, +1 on your idea. BTW, Flink also provides a debug tool to track the latency of records travelling through the system[1]. But you should note the following issue if enable the latency tracking. (1) It's a tool for debugging purposes because enabling latency metrics can significantly

Re: Converting Table API query to Datastream API

2021-06-30 Thread JING ZHANG
, JING ZHANG Le Xu 于2021年7月1日周四 上午5:51写道: > Thanks -- Is there a way to quickly visualize the Stream operator DAG > generated by the TableAPI/SQL queries? > > Le > > On Tue, Jun 29, 2021 at 9:34 PM JING ZHANG wrote: > >> Hi Le, >> link >> <https://stack

  1   2   >