Questions about the V-C Iteration in Gelly

2017-02-07 Thread Xingcan Cui
Hi all, I got some questions about the vertex-centric iteration in Gelly. a) It seems the postSuperstep method is called before the superstep barrier (I got different aggregate values for the same superstep in this method). Is this a bug, or is the design just like that? b) There is no setHalt

Re: Cross operation on two huge datasets

2017-02-23 Thread Xingcan Cui
a 1-n join on the partition ids + an additional check if the point >> is actually in the shape. >> >> The challenge here is to have partitions of similar size. >> >> Cheers, Fabian >> >> 2017-02-23 5:59 GMT+01:00 Xingcan Cui <xingc...@gmail.com>: >> >&

Re: Cross operation on two huge datasets

2017-02-22 Thread Xingcan Cui
Hi all, @Gwen From the database's point of view, the only way to avoid a Cartesian product in a join is to use an index, which manifests as key grouping in Flink. However, key grouping only supports many-to-one mapping now, i.e., a shape or a point can only be distributed to a single group. Only points and shapes

Re: Questions about the V-C Iteration in Gelly

2017-02-13 Thread Xingcan Cui
is empty) the "pact.runtime.workset-empty-aggregator" will judge convergence of the delta iteration and then the iteration just terminates. Is this a bug? Best, Xingcan On Mon, Feb 13, 2017 at 5:24 PM, Xingcan Cui <xingc...@gmail.com> wrote: > Hi Greg, > > Thanks for you

Re: Questions about the V-C Iteration in Gelly

2017-02-13 Thread Xingcan Cui
lt;c...@greghogan.com> wrote: > Hi Xingcan, > > FLINK-1885 looked into adding a bulk mode to Gelly's iterative models. > > As an alternative you could implement your algorithm with Flink operators > and a bulk iteration. Most of the Gelly library is written with native > o

Re: Questions about the V-C Iteration in Gelly

2017-02-10 Thread Xingcan Cui
during a vertex-centric iteration. Then what can we do if an algorithm really need that? Thanks for your patience. Best, Xingcan On Fri, Feb 10, 2017 at 4:50 PM, Vasiliki Kalavri <vasilikikala...@gmail.com > wrote: > Hi Xingcan, > > On 9 February 2017 at 18:16, Xingcan Cui <xingc..

Re: Questions about the V-C Iteration in Gelly

2017-02-09 Thread Xingcan Cui
ering the complexity, the example is not provided.) Really appreciate for all your help. Best, Xingcan On Thu, Feb 9, 2017 at 5:36 PM, Vasiliki Kalavri <vasilikikala...@gmail.com> wrote: > Hi Xingcan, > > On 7 February 2017 at 10:10, Xingcan Cui <xingc...@gmail.com> wrote: >

Re: Cross operation on two huge datasets

2017-03-02 Thread Xingcan Cui
Hi Gwen, in my view, indexing and searching are two isolated processes and they should be separated. Maybe you should treat the RTree structure as a new dataset (fortunately it's static, right?) and store it in a distributed cache or DFS so that it can be accessed by operators on any node. That will
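
For reference, a minimal sketch of the distributed-cache idea; the HDFS path, and the fact that a real job would deserialize and query the RTree, are hypothetical placeholders rather than details from this thread:
```
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;
import java.io.File;

public class DistributedCacheSketch {
  public static void main(String[] args) throws Exception {
    ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
    // Hypothetical path to the prebuilt, static index file.
    env.registerCachedFile("hdfs:///indexes/rtree.bin", "rtree");

    env.fromElements("point-1", "point-2")
       .map(new RichMapFunction<String, String>() {
          private transient File indexFile;

          @Override
          public void open(Configuration parameters) {
            // Every parallel task instance can read the cached file locally.
            indexFile = getRuntimeContext().getDistributedCache().getFile("rtree");
          }

          @Override
          public String map(String point) {
            // A real job would deserialize the RTree from indexFile and query it here.
            return point + " checked against " + indexFile.getName();
          }
       })
       .print();
  }
}
```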

Question about the custom partitioner

2017-06-14 Thread Xingcan Cui
Hi all, I want to duplicate records to multiple downstream tasks (not all of them, thus broadcasting would not work) in a stream environment. However, it seems that the current custom partitioner can return only one partition index. Why does this restriction exist, or am I missing something? Thanks,

Re: Question about the custom partitioner

2017-06-16 Thread Xingcan Cui
m as > DataStream.partitionCustom() and DataStream.setConnectionType() (the first > calls the latter) do. > > Best, > Aljoscha > > On 14. Jun 2017, at 09:09, Xingcan Cui <xingc...@gmail.com> wrote: > > Hi all, > > I want to duplicate records to multiple dow

Re: Handle event time

2017-09-07 Thread Xingcan Cui
Hi AndreaKinn, The AscendingTimestampExtractor does not work as you think. It should be applied to streams whose timestamps are naturally monotonically ascending. Flink uses watermarks to deal with out-of-order data. When a watermark *t* is received, it means there should be no more records whose
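
As a sketch of the watermark-based alternative (assuming Tuple2<String, Long> events whose second field is the event timestamp in milliseconds; the 5-second bound is arbitrary):
```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;

public class WatermarkSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

    DataStream<Tuple2<String, Long>> events =
        env.fromElements(Tuple2.of("a", 1000L), Tuple2.of("b", 3000L), Tuple2.of("c", 2000L));

    // Watermark = max seen timestamp - 5 seconds; records behind the watermark are late.
    events.assignTimestampsAndWatermarks(
        new BoundedOutOfOrdernessTimestampExtractor<Tuple2<String, Long>>(Time.seconds(5)) {
          @Override
          public long extractTimestamp(Tuple2<String, Long> element) {
            return element.f1;
          }
        })
        .print();

    env.execute("watermark-sketch");
  }
}
```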

Re: Handle event time

2017-09-12 Thread Xingcan Cui
Hi AndreaKinn, Reordering in a stream environment is quite costly. AFAIK, Flink doesn't provide such a function internally. The watermark is just one of the approaches to deal with the out-of-order problem. IMO, it's just like coarse-grained reordering. The late records should be dropped *manually*.

Re: termination of stream#iterate on finite streams

2017-09-04 Thread Xingcan Cui
deterministically end the iteration without needing a > timeout? > > Having an arbitrary timeout that must be longer than any iteration step > takes seems really awkward. > > What you think? > > Best regards > Peter > > > Am 02.09.2017 um 17:16 schrieb Xingcan C

Re: termination of stream#iterate on finite streams

2017-09-01 Thread Xingcan Cui
Hi Peter, Let me try to explain this. As you showed in the examples, the iterate method takes a function which "splits" the initial stream into two separate streams, i.e., initialStream => (stream1, stream2). stream2 works as the output stream, whose results will be emitted to the successor
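
A minimal sketch of such an iteration, where decremented values are fed back (stream1) and non-positive values are emitted downstream (stream2); the 5-second feedback timeout is an arbitrary choice:
```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.IterativeStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IterateSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    DataStream<Long> input = env.fromElements(3L, 1L, 4L);

    // 5000 ms is the feedback timeout: the iteration head terminates after the
    // feedback channel has been idle for that long (relevant for finite streams).
    IterativeStream<Long> iteration = input.iterate(5000L);

    DataStream<Long> minusOne = iteration.map(v -> v - 1);
    DataStream<Long> stillPositive = minusOne.filter(v -> v > 0); // stream1: fed back
    DataStream<Long> finished = minusOne.filter(v -> v <= 0);     // stream2: emitted downstream

    iteration.closeWith(stillPositive);
    finished.print();

    env.execute("iterate-sketch");
  }
}
```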

Re: termination of stream#iterate on finite streams

2017-09-02 Thread Xingcan Cui
ing/runtime/tasks/StreamIterationHead.java#L80> ). Hope everything is considered this time : ) Best, Xingcan On Sat, Sep 2, 2017 at 10:08 PM, Peter Ertl <peter.e...@gmx.net> wrote: > > Am 02.09.2017 um 04:45 schrieb Xingcan Cui <xingc...@gmail.com>: > > In your

Re: how does time-windowed join and Over-Window Aggregation implemented in flink SQL?

2017-12-13 Thread Xingcan Cui
Hi Yan Zhou, as you may have noticed, the SQL level stream join was not built on top of some join APIs but was implemented with the low-level CoProcessFunction (see TimeBoundedStreamInnerJoin.scala

Re: Apache Flink - Question about Global Windows

2017-11-15 Thread Xingcan Cui
Hi Mans, the "global" here indicates the "horizontal" (count, time, etc.) dimension instead of the "vertical" (keyBy) dimension, i.e., all the received data will be placed into a single huge window. Actually, it's a concept orthogonal to the *keyBy* operations, since both *DataStream* and
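
A minimal sketch of a keyed global window; since GlobalWindows never fires on its own, a count trigger is attached (the element values and the count of 3 are arbitrary):
```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.GlobalWindows;
import org.apache.flink.streaming.api.windowing.triggers.CountTrigger;

public class GlobalWindowSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    env.fromElements(
            Tuple2.of("a", 1), Tuple2.of("a", 2), Tuple2.of("a", 3),
            Tuple2.of("b", 10), Tuple2.of("b", 20), Tuple2.of("b", 30))
       .keyBy(0)                        // "vertical" dimension: one window per key
       .window(GlobalWindows.create())  // "horizontal" dimension: a single, unbounded window
       .trigger(CountTrigger.of(3))     // fire once 3 elements have arrived for a key
       .sum(1)
       .print();

    env.execute("global-window-sketch");
  }
}
```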

Re: Generate watermarks per key in a KeyedStream

2017-11-08 Thread Xingcan Cui
Hi Shailesh, actually, the watermarks are generated per partition, but all of them will be forcibly aligned to the minimum one during processing. That is decided by the semantics of watermarks and KeyedStream, i.e., watermarks belong to the whole stream, and a stream is made up of different

Re: How to use keyBy on ConnectedStream?

2018-05-10 Thread Xingcan Cui
Hi Ishwara, the `keyBy()` method automatically ensures that records with the same key will be processed by the same instance of a CoFlatMap. As for the exception, I suppose the types `MessageType1` and `MessageType2` are POJOs, which should follow some rules [1]. Also, make sure that (1)
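
A minimal sketch of keying a connected stream, with Tuple2 types standing in for the POJO message types:
```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

public class ConnectedKeyBySketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    DataStream<Tuple2<String, Integer>> typeOne =
        env.fromElements(Tuple2.of("k1", 1), Tuple2.of("k2", 2));
    DataStream<Tuple2<String, Long>> typeTwo =
        env.fromElements(Tuple2.of("k1", 10L), Tuple2.of("k2", 20L));

    typeOne.connect(typeTwo)
        .keyBy(0, 0) // key both inputs on field 0 so matching keys are co-located
        .flatMap(new CoFlatMapFunction<Tuple2<String, Integer>, Tuple2<String, Long>, String>() {
          @Override
          public void flatMap1(Tuple2<String, Integer> value, Collector<String> out) {
            out.collect("type1: " + value);
          }

          @Override
          public void flatMap2(Tuple2<String, Long> value, Collector<String> out) {
            out.collect("type2: " + value);
          }
        })
        .print();

    env.execute("connected-keyby-sketch");
  }
}
```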

Re: Better way to clean up state when connect

2018-05-12 Thread Xingcan Cui
Hi Chengzhi, you said that Stream B, which comes from a file, will be updated unpredictably. I wonder if you could share more about how you judge that an item (from Stream A, I suppose) is not in the file, and which watermark generation strategy you chose? Best, Xingcan > On May 12, 2018, at 12:48

Re: Better way to clean up state when connect

2018-05-15 Thread Xingcan Cui
ks for you suggestions again. > > Regards, > Chengzhi > > On Sat, May 12, 2018 at 8:55 PM, Xingcan Cui <xingc...@gmail.com > <mailto:xingc...@gmail.com>> wrote: > Hi Chengzhi, > > you said the Stream B which comes from a file will be updated unpredictabl

Re: Better way to clean up state when connect

2018-05-15 Thread Xingcan Cui
t, > Chengzhi > > On Tue, May 15, 2018 at 10:22 AM, Xingcan Cui <xingc...@gmail.com > <mailto:xingc...@gmail.com>> wrote: > Hi Chengzhi, > > currently, it's impossible to process both a stream and a (dynamically > updated) dataset in a single job. I'll pr

Re: Replaying logs with microsecond delay

2018-05-15 Thread Xingcan Cui
Hi Dhruv, since there are timestamps associated with each record, I was wondering why you try to replay them at a fixed interval. Could you explain that a little? Thanks, Xingcan > On May 16, 2018, at 2:11 AM, Ted Yu wrote: > > Please see the following: >

Re: Replaying logs with microsecond delay

2018-05-15 Thread Xingcan Cui
Science and Engineering > University of Minnesota > www.dhruvkumar.me <http://www.dhruvkumar.me/> > >> On May 15, 2018, at 20:29, Xingcan Cui <xingc...@gmail.com >> <mailto:xingc...@gmail.com>> wrote: >> >> Hi Dhruv, >> >> since

Re: Issues with Flink1.5 SQL-Client

2018-07-03 Thread Xingcan Cui
Hi Ashwin, I encountered this problem before. You should make sure that the version for your Flink cluster and the version you run the SQL-Client are exactly the same. Best, Xingcan > On Jul 3, 2018, at 10:00 PM, Chesnay Schepler wrote: > > Can you provide us with the JobManager logs? > >

Re: Joining data in Streaming

2018-01-30 Thread Xingcan Cui
Hi Hayden, Performing a full-history join on two streams is not natively supported yet. As a workaround, you may implement a CoProcessFunction and cache the records from both sides in state until the stream with less data has been fully cached. Then you could safely clear the cache for
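
A rough sketch of that workaround, assuming both inputs are Tuple2<String, String> records keyed and joined on the first field (clearing the smaller side's cache once it is complete is omitted for brevity):
```
import org.apache.flink.api.common.state.ListState;
import org.apache.flink.api.common.state.ListStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.util.Collector;

public class FullHistoryJoinSketch
    extends CoProcessFunction<Tuple2<String, String>, Tuple2<String, String>, String> {

  private transient ListState<String> leftCache;
  private transient ListState<String> rightCache;

  @Override
  public void open(Configuration parameters) {
    leftCache = getRuntimeContext().getListState(new ListStateDescriptor<>("left", String.class));
    rightCache = getRuntimeContext().getListState(new ListStateDescriptor<>("right", String.class));
  }

  @Override
  public void processElement1(Tuple2<String, String> left, Context ctx, Collector<String> out) throws Exception {
    leftCache.add(left.f1);
    for (String right : rightCache.get()) {
      out.collect(left.f0 + ": " + left.f1 + " <-> " + right);
    }
  }

  @Override
  public void processElement2(Tuple2<String, String> right, Context ctx, Collector<String> out) throws Exception {
    rightCache.add(right.f1);
    for (String left : leftCache.get()) {
      out.collect(right.f0 + ": " + left + " <-> " + right.f1);
    }
  }
}
// Usage sketch: streamA.connect(streamB).keyBy(0, 0).process(new FullHistoryJoinSketch())
```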

Re: CoProcess() VS union.Process() & Timers in them

2018-02-13 Thread Xingcan Cui
Hi Max, Currently, timers can only be used with keyed streams. As @Fabian suggested, you can "forge" a keyed stream with a special KeySelector which maps all records to the same key. IMO, Flink uses keyed streams/states because keying is a deterministic distribution mechanism. Here, "the
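
A minimal sketch of the "forged" keyed stream with a constant key, which makes the timer service available (the 10-second processing-time timer is arbitrary):
```
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class ConstantKeyTimerSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    env.fromElements("a", "b", "c")
       .keyBy(new KeySelector<String, Integer>() {
          @Override
          public Integer getKey(String value) {
            return 0; // all records share key 0, i.e., a single logical partition
          }
       })
       .process(new KeyedProcessFunction<Integer, String, String>() {
          @Override
          public void processElement(String value, Context ctx, Collector<String> out) {
            out.collect(value);
            // Register a processing-time timer 10 seconds from now.
            ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 10_000L);
          }

          @Override
          public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
            out.collect("timer fired at " + timestamp);
          }
       })
       .print();

    env.execute("constant-key-timer-sketch");
  }
}
```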

Re: CoProcess() VS union.Process()

2018-02-09 Thread Xingcan Cui
Hi Max, if I understood correctly, instead of joining three streams, you actually performed two separate joins, say S1 JOIN S3 and S2 JOIN S3, right? Your plan "(S1 UNION S2) JOIN S3" seems to be identical to "(S1 JOIN S3) UNION (S2 JOIN S3)", and if that's what you need, your pipeline

Re: A "per operator instance" window all ?

2018-02-19 Thread Xingcan Cui
not misunderstand, I think you can key your stream on a >> `Random.nextInt() % parallelism`, this way you can "group" together alerts >> from different and benefit from multi-parallelism. >> >> >> Sent from NetEase Mail Master >> >> On 02/19/2018 09:08, Xingcan Cui<xingc..

Re: Problems to use toAppendStream

2018-02-22 Thread Xingcan Cui
Hi Esa, just a reminder: don't miss the dot and the underscore. Best, Xingcan > On 22 Feb 2018, at 3:59 PM, Esa Heikkinen > wrote: > > Hi > > Actually I have also line "import org.apache.flink.streaming.api.scala" on my > code, but this line seems to be

Re: Problems to use toAppendStream

2018-02-22 Thread Xingcan Cui
> 2.11.11 > > > BR Esa > > From: Fabian Hueske [mailto:fhue...@gmail.com] > Sent: Thursday, February 22, 2018 10:35 AM > To: Esa Heikkinen <esa.heikki...@student.tut.fi> > Cc: Xingcan Cui <xingc...@gmail

Re: Problems to use toAppendStream

2018-02-22 Thread Xingcan Cui
018, at 5:20 PM, Xingcan Cui <xingc...@gmail.com> wrote: > > Hi Fabian and Esa, > > I ran the code myself and also noticed the strange behavior. It seems that > only I explicitly import the function i.e., > org.apache.flink.streaming.api.scala.asScalaStream, can

Re: A "per operator instance" window all ?

2018-02-18 Thread Xingcan Cui
or instance 1 to group all the alerts > received on those 25 resource ids and do 1 query for those 25 resource ids. > Same thing for operator instance 2, 3 and 4. > > > Thank you, > Regards. > > > On 18/02/2018 14:43, Xingcan Cui wrote: >> Hi Julien, >>

Re: Only a single message processed

2018-02-18 Thread Xingcan Cui
Hi Niclas, About the second point you mentioned, was the processed message a random one or a fixed one? The default startup mode for FlinkKafkaConsumer is StartupMode.GROUP_OFFSETS; maybe you could try StartupMode.EARLIEST while debugging. Also, before that, you may try fetching the messages
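
For example (the topic name, bootstrap servers, and group id below are placeholders):
```
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class KafkaEarliestSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    Properties props = new Properties();
    props.setProperty("bootstrap.servers", "localhost:9092");
    props.setProperty("group.id", "debug-group");

    FlinkKafkaConsumer<String> consumer =
        new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props);
    consumer.setStartFromEarliest(); // overrides the default StartupMode.GROUP_OFFSETS

    env.addSource(consumer).print();
    env.execute("kafka-earliest-sketch");
  }
}
```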

Re: Flink wrong Watermark in Periodic watermark

2018-07-30 Thread Xingcan Cui
Hi Soheil, That may relate to your parallelism, since each extractor instance computes its own watermarks. Try printing the max timestamps along with the current thread's name and you will notice this. Best, Xingcan > On Jul 30, 2018, at 3:05 PM, Soheil Pourbafrani wrote: > > Using Flink EventTime

Re: CoFlatMapFunction with more than two input streams

2018-08-15 Thread Xingcan Cui
Hi Averell, With the CoProcessFunction, you could get access to the time-related services which may be useful when maintaining the elements in Stream_C and you could get rid of type casting with the Either class. Best, Xingcan > On Aug 15, 2018, at 3:27 PM, Averell wrote: > > Thank you Vino

Re: CoFlatMapFunction with more than two input streams

2018-08-15 Thread Xingcan Cui
Hi Averell, I am also in favor of option 2. Besides, you could use CoProcessFunction instead of CoFlatMapFunction and try to wrap elements of stream_A and stream_B using the `Either` class. Best, Xingcan > On Aug 15, 2018, at 2:24 PM, vino yang wrote: > > Hi Averell, > > As far as these
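
A minimal sketch of option 2, with the three stream types simplified to String/Long/Integer:
```
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.CoProcessFunction;
import org.apache.flink.types.Either;
import org.apache.flink.util.Collector;

public class ThreeWayConnectSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    DataStream<String> streamA = env.fromElements("a1", "a2");
    DataStream<Long> streamB = env.fromElements(1L, 2L);
    DataStream<Integer> streamC = env.fromElements(10, 20);

    // Wrap both A and B into a common Either type so they can be unioned.
    DataStream<Either<String, Long>> wrappedA =
        streamA.map(new MapFunction<String, Either<String, Long>>() {
          @Override
          public Either<String, Long> map(String value) { return Either.Left(value); }
        });
    DataStream<Either<String, Long>> wrappedB =
        streamB.map(new MapFunction<Long, Either<String, Long>>() {
          @Override
          public Either<String, Long> map(Long value) { return Either.Right(value); }
        });

    wrappedA.union(wrappedB)
        .connect(streamC)
        .process(new CoProcessFunction<Either<String, Long>, Integer, String>() {
          @Override
          public void processElement1(Either<String, Long> value, Context ctx, Collector<String> out) {
            out.collect(value.isLeft() ? "from A: " + value.left() : "from B: " + value.right());
          }

          @Override
          public void processElement2(Integer value, Context ctx, Collector<String> out) {
            out.collect("from C: " + value);
          }
        })
        .print();

    env.execute("three-way-connect-sketch");
  }
}
```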

Re: Will idle state retention trigger retract in dynamic table?

2018-08-21 Thread Xingcan Cui
Hi Henry, Idle state retention is just a trade-off between accuracy and storage consumption. It can meet part of the calculation requirements in the stream environment, but not all. For instance, in your use case, if there exists a TTL for each article, their praise states can

Re: What's the advantage of using BroadcastState?

2018-08-27 Thread Xingcan Cui
Hi Radu, I cannot fully understand your question, but I guess the answer is NO. The broadcast state pattern just provides you with automatic data broadcasting and a bunch of map states to cache the "low-throughput" patterns. Also, to keep consistency, it forbids the
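
A minimal sketch of the pattern, with both the pattern stream and the event stream simplified to Strings:
```
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.co.BroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class BroadcastStateSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    DataStream<String> events = env.fromElements("event-1", "event-2");
    DataStream<String> patterns = env.fromElements("pattern-A");

    MapStateDescriptor<String, String> descriptor =
        new MapStateDescriptor<>("patterns", Types.STRING, Types.STRING);
    BroadcastStream<String> broadcastPatterns = patterns.broadcast(descriptor);

    events.connect(broadcastPatterns)
        .process(new BroadcastProcessFunction<String, String, String>() {
          @Override
          public void processElement(String event, ReadOnlyContext ctx, Collector<String> out) throws Exception {
            // The non-broadcast side only gets read access to the broadcast state.
            String pattern = ctx.getBroadcastState(descriptor).get("current");
            out.collect(event + " matched against " + pattern);
          }

          @Override
          public void processBroadcastElement(String pattern, Context ctx, Collector<String> out) throws Exception {
            // Only the broadcast side may modify the state, which keeps all instances consistent.
            ctx.getBroadcastState(descriptor).put("current", pattern);
          }
        })
        .print();

    env.execute("broadcast-state-sketch");
  }
}
```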

Re: Window Stream - Need assistance

2018-07-18 Thread Xingcan Cui
Hi Rakkesh, Did you call `execute()`on your `StreamExecutionEnvironment`? Best, Xingcan > On Jul 18, 2018, at 5:12 PM, Titus Rakkesh wrote: > > Dear Friends, > I have 2 streams of the below data types. > > DataStream> splittedActivationTuple; > > DataStream> unionReloadsStream; >

Re: Can not get OutPutTag datastream from Windowing function

2018-07-17 Thread Xingcan Cui
Hi Soheil, The `getSideOutput()` method is defined on the operator instead of the datastream. You can invoke it after any action (e.g., map, window) performed on a datastream. Best, Xingcan > On Jul 17, 2018, at 3:36 PM, Soheil Pourbafrani wrote: > > Hi, according to the documents I tried
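
A minimal sketch showing where the side output lives (the odd/even split is arbitrary): the OutputTag is used inside a ProcessFunction, and getSideOutput() is then called on the SingleOutputStreamOperator returned by process().
```
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.SingleOutputStreamOperator;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.ProcessFunction;
import org.apache.flink.util.Collector;
import org.apache.flink.util.OutputTag;

public class SideOutputSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    final OutputTag<Integer> oddTag = new OutputTag<Integer>("odd") {};

    SingleOutputStreamOperator<Integer> evens =
        env.fromElements(1, 2, 3, 4, 5)
           .process(new ProcessFunction<Integer, Integer>() {
              @Override
              public void processElement(Integer value, Context ctx, Collector<Integer> out) {
                if (value % 2 == 0) {
                  out.collect(value);          // main output
                } else {
                  ctx.output(oddTag, value);   // side output
                }
              }
           });

    DataStream<Integer> odds = evens.getSideOutput(oddTag); // retrieved from the operator result
    evens.print();
    odds.print();

    env.execute("side-output-sketch");
  }
}
```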

Re: streaming predictions

2018-07-22 Thread Xingcan Cui
Hi Cederic, If the model is a simple function, you can just load it and make predictions using a map/flatMap function in the stream environment. But I'm afraid a model trained by Flink-ML is a "batch job", whose predict method takes a DataSet as the parameter and outputs another

Re: data enrichment via endpoint, serializable issue

2018-07-19 Thread Xingcan Cui
Hi Steffen, You could make the class `TextAPIClient` serializable, or use `RichMapFunction` [1] and instantiate all the required objects in its `open()` method. [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/api_concepts.html#rich-functions
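
A minimal sketch of the RichMapFunction approach; the TextApiClient below is a hypothetical stand-in for the non-serializable client, shown only to keep the example self-contained:
```
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;

public class EnrichmentSketch extends RichMapFunction<String, String> {

  // transient: the client is never serialized and shipped with the function
  private transient TextApiClient client;

  @Override
  public void open(Configuration parameters) {
    // Instantiated once per parallel task instance, on the task manager.
    client = new TextApiClient("api-key-placeholder");
  }

  @Override
  public String map(String text) {
    return client.analyze(text); // hypothetical call
  }

  @Override
  public void close() {
    if (client != null) {
      client.shutdown(); // hypothetical cleanup
    }
  }

  // Hypothetical client type, included only to make the sketch compile.
  static class TextApiClient {
    TextApiClient(String apiKey) {}
    String analyze(String text) { return "enriched: " + text; }
    void shutdown() {}
  }
}
// Usage sketch: stream.map(new EnrichmentSketch())
```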

Re: Window Stream - Need assistance

2018-07-18 Thread Xingcan Cui
t; printed. I have a doubt whether "implemented function" is correct with my > "requirement". Please assist. > > On Wed, Jul 18, 2018 at 2:57 PM, Xingcan Cui <mailto:xingc...@gmail.com>> wrote: > Hi Rakkesh, > > Did you call `execute()`on your `Stre

Re: Does Flink support stream-stream outer joins in the latest version?

2018-03-06 Thread Xingcan Cui
Hi Kant, I suppose you refer to the stream join in SQL/Table API, since the outer join for windowed streams can always be achieved with a `JoinFunction` in the DataStream API. There are two kinds of stream joins, namely the time-windowed join and the non-windowed join, in Flink SQL/Table API [1,

Re: Sliding window based on event arrival

2018-03-12 Thread Xingcan Cui
Hi Miyuru, what you need should be something like a `SlidingCountWindow`. Flink's DataStream API already provides a `countWindow()` method for that, and a related example can be found here
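
A minimal sketch of a sliding count window (size 5, slide 2; the keys and values are arbitrary): for every key, the sum of the last 5 elements is emitted every 2 elements.
```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CountWindowSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    env.fromElements(
            Tuple2.of("a", 1), Tuple2.of("a", 2), Tuple2.of("a", 3),
            Tuple2.of("a", 4), Tuple2.of("a", 5), Tuple2.of("a", 6))
       .keyBy(0)
       .countWindow(5, 2) // window size = 5 elements, slide = 2 elements
       .sum(1)
       .print();

    env.execute("count-window-sketch");
  }
}
```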

Re: Emulate Tumbling window in Event Time Space

2018-03-08 Thread Xingcan Cui
Hi Dhruv, there's no need to implement the window logic with the low-level `ProcessFunction` yourself. Flink provides built-in window operators, and you just need to implement the `WindowFunction` for that [1]. Best, Xingcan [1]

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-08 Thread Xingcan Cui
Hi Yan & Timo, this is confirmed to be a bug and I’ve created an issue [1] for it. I’ll explain more about this query. In Flink SQL/Table API, the DISTINCT keyword will be implemented with an aggregation, which outputs a retract stream [2]. In that situation, all the time-related fields will

Re: flink sql timed-window join throw "mismatched type" AssertionError on rowtime column

2018-03-09 Thread Xingcan Cui
table to DataStream and follow the logic of > TimeBoundedStreamInnerJoin to do the timed-window join. Should I do this? Is > there any concern from performance or stability perspective? > > Best > Yan > > From: Xingcan Cui <xingc...@gmail.com> > Sent: Thursday, March 8,

Re: Does Flink support stream-stream outer joins in the latest version?

2018-03-07 Thread Xingcan Cui
n progress now(left/right/full), and > will probably be ready before the end of this month. You can check the > progress from[1]. > > Best, Hequn > > [1] https://issues.apache.org/jira/browse/FLINK-5878 > <https://issues.apache.org/jira/browse/FLINK-5878> > > O

Re: Slow watermark advances

2018-04-13 Thread Xingcan Cui
Hi Chengzhi, currently, the watermarks of the two streams of a connected stream are forcibly synchronized, i.e., the operator's watermark is decided by the stream with the larger delay. Thus the window trigger is also affected by this mechanism. As a workaround, you could try to add (or subtract) a static

Re: Slow watermark advances

2018-04-13 Thread Xingcan Cui
iousElementTimestamp: > Long): Long = { > val timestamp = element.getCreationTime() + 360L //1 hour delay > currentMaxTimestamp = max(timestamp, currentMaxTimestamp) > timestamp > } > > Appreciate your help! > > Best, > Chengzhi > > > On Fri,

Re: Flink join operator after sorting seems to group fields (Scala)

2018-03-03 Thread Xingcan Cui
Hi Felipe, the `sortPartition()` method just LOCALLY sorts each partition of a dataset. To achieve a global sorting, use this method after a `partitionByRange()` (e.g., `result.partitionByRange(0).sortPartition(0, Order.ASCENDING)`). Hope that helps, Xingcan > On 3 Mar 2018, at 9:33 PM,

Re: Timers and state

2018-03-05 Thread Xingcan Cui
Hi Alberto, an ultimate solution for your problem would be a map state with ordered keys (like a TreeMap), but unfortunately, this is still a WIP feature. For now, maybe you could "eagerly remove" the outdated values (with `iterator.remove()`) when iterating the map state in the process
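
A minimal sketch of that eager cleanup inside a keyed process function; the timestamp-keyed map layout and the one-hour cutoff are illustrative assumptions, not details from the thread.
```
import java.util.Iterator;
import java.util.Map;
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class EagerCleanupSketch extends KeyedProcessFunction<String, String, String> {

  private transient MapState<Long, String> valuesByTime;

  @Override
  public void open(Configuration parameters) {
    valuesByTime = getRuntimeContext().getMapState(
        new MapStateDescriptor<>("valuesByTime", Long.class, String.class));
  }

  @Override
  public void processElement(String value, Context ctx, Collector<String> out) throws Exception {
    long now = ctx.timerService().currentProcessingTime();
    valuesByTime.put(now, value);

    long cutoff = now - 60 * 60 * 1000L; // one hour, an arbitrary illustration value
    Iterator<Map.Entry<Long, String>> it = valuesByTime.iterator();
    while (it.hasNext()) {
      Map.Entry<Long, String> entry = it.next();
      if (entry.getKey() < cutoff) {
        it.remove(); // eagerly drop outdated entries while iterating
      } else {
        out.collect(entry.getValue());
      }
    }
  }
}
// Usage sketch: stream.keyBy(...).process(new EagerCleanupSketch())
```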

Re: Table API Compilation Error in Flink

2018-03-04 Thread Xingcan Cui
Hi Nagananda, adding `flink-streaming-scala_${scala version}` to your Maven dependencies would solve this. Best, Xingcan > On 5 Mar 2018, at 2:21 PM, Nagananda M wrote: > > Hi All, > Am trying to compile a sample program in apache flink using TableEnvironment > and

Re: Watermark not worked as expected in TimeCharacteristic.EvenTime setting.

2018-03-02 Thread Xingcan Cui
Hi, for periodically generated watermarks, you should use `ExecutionConfig.setAutoWatermarkInterval()` to set an interval. Hope that helps. Best, Xingcan > On 3 Mar 2018, at 12:08 PM, sundy <543950...@qq.com> wrote: > > > > Hi, I got a problem in Flink and need your help. > > I tried to
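
A minimal sketch using the legacy periodic-assigner API (the 200 ms interval and the Tuple2 input are arbitrary choices):
```
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.watermark.Watermark;

public class PeriodicWatermarkSketch {
  public static void main(String[] args) throws Exception {
    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
    env.getConfig().setAutoWatermarkInterval(200L); // emit a watermark every 200 ms

    env.fromElements(Tuple2.of("a", 1000L), Tuple2.of("b", 2000L))
       .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {
          private long maxTimestamp = Long.MIN_VALUE;

          @Override
          public long extractTimestamp(Tuple2<String, Long> element, long previousTimestamp) {
            maxTimestamp = Math.max(maxTimestamp, element.f1);
            return element.f1;
          }

          @Override
          public Watermark getCurrentWatermark() {
            // Invoked every autoWatermarkInterval milliseconds.
            return new Watermark(maxTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : maxTimestamp - 1);
          }
       })
       .print();

    env.execute("periodic-watermark-sketch");
  }
}
```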

Re: Flink SQL string literal does not support double quotation?

2018-11-01 Thread Xingcan Cui
Hi Henry, In most SQL conventions, single quotes are for string literals, while double quotes are for identifiers. Best, Xingcan > On Oct 31, 2018, at 7:53 PM, 徐涛 wrote: > > Hi Experts, > When I am running the following SQL in Flink 1.6.2, I got >

Re: IntervalJoin is stucked in rocksdb'seek for too long time in flink-1.6.2

2018-11-21 Thread Xingcan Cui
Hi Jiangang, The IntervalJoin is actually the DataStream-level implementation of the SQL time-windowed join[1]. To ensure the completeness of the join results, we have to cache all the records (from both sides) in the most recent time interval. That may lead to state backend problems when

Re: Flink join stream where one stream is coming 5 minutes late

2018-11-26 Thread Xingcan Cui
Hi Abhijeet, If you want to perform a window join in the DataStream API, the window configurations on both sides must be exactly the same. For your case, maybe you can try adding a 5-minute delay to the event times (and watermarks) of the faster stream. Hope that helps. Best, Xingcan > On Nov 26,

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Xingcan Cui
Hi John, I've not dug into this yet, but IMO, it shouldn't be the case. I just wonder how you judge that the data in the first five seconds is not processed by the system? Best, Xingcan > On Sep 17, 2018, at 11:21 PM, John Stone wrote: > > Hello, > > I'm checking if this is intentional

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Xingcan Cui
Hi John, I suppose that was caused by the groupBy field “timestamp”. You were actually grouping on two time fields simultaneously, the processing time and the time from your producer. As @Rong suggested, try removing the additional groupBy field “timestamp” and check the result again. Best,

Re: 2 Broadcast streams to a Single Keyed Stream....how to?

2018-09-18 Thread Xingcan Cui
ll look at second option... > > On Tue, Sep 18, 2018, 9:15 AM Xingcan Cui <mailto:xingc...@gmail.com>> wrote: > Hi Vishal, > > You could try 1) merging these two rule streams first with the `union` method > if they get the same type or 2) connecting them and encapsulate th

Re: 2 Broadcast streams to a Single Keyed Stream....how to?

2018-09-18 Thread Xingcan Cui
Hi Vishal, You could try 1) merging these two rule streams first with the `union` method if they have the same type, or 2) connecting them and encapsulating the records from both sides into a unified type (e.g., Scala's Either). Best, Xingcan > On Sep 18, 2018, at 8:59 PM, Vishal Santoshi > wrote:

Re: left join failing with FlinkLogicalJoinConverter NPE

2019-02-25 Thread Xingcan Cui
Hi Karl, It seems that some field types of your inputs were not properly extracted. Could you share the result of `printSchema()` for your input tables? Best, Xingcan > On Feb 25, 2019, at 4:35 PM, Karl Jin wrote: > > Hello, > > First time posting, so please let me know if the formatting

Re: left join failing with FlinkLogicalJoinConverter NPE

2019-02-25 Thread Xingcan Cui
-- i_uc_pk: String >> |-- i_uc_update_ts: TimeIndicatorTypeInfo(rowtime) >> |-- image_count: Long >> |-- i_data: Multiset> >> >> On Mon, Feb 25, 2019 at 4:54 PM Xingcan Cui wrote: >> >>> Hi Karl, >>> >>> It seems that some fiel

Re: left join failing with FlinkLogicalJoinConverter NPE

2019-02-26 Thread Xingcan Cui
d from a Kafka source through a query that looks like > "select collect(data) as i_data from ... group by pk" > > Do you think this is a bug or is this something I can get around using some > configuration? > > On Tue, Feb 26, 2019 at 1:20 AM Xingcan Cui <ma

Re: flink tableapi inner join exception

2019-03-15 Thread Xingcan Cui
Hi, As the message said, some columns share the same name. You could first rename the columns of one table with the `as` operation [1]. Best, Xingcan [1] https://ci.apache.org/projects/flink/flink-docs-master/dev/table/tableApi.html#scan-projection-and-filter > On Mar 15, 2019, at 9:03
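
A minimal sketch of the renaming workaround, using made-up literal tables and column names:
```
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class RenameBeforeJoinSketch {
  public static void main(String[] args) {
    TableEnvironment tEnv = TableEnvironment.create(EnvironmentSettings.inBatchMode());

    Table left = tEnv.fromValues(1, 2, 3).as("id");
    Table right = tEnv.fromValues(2, 3, 4).as("r_id"); // renamed to avoid the name clash

    left.join(right)
        .where($("id").isEqual($("r_id")))
        .execute()
        .print();
  }
}
```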

Re: Inconsistent documentation of Table Conditional functions

2019-05-08 Thread Xingcan Cui
Hi Flavio, In the description, resultX is just an identifier for the result of the first condition that is met. Best, Xingcan > On May 8, 2019, at 12:02 PM, Flavio Pompermaier wrote: > > Hi to all, > in the documentation of the Table Conditional functions [1] the example is > inconsistent with

[VOTE] How to Deal with Split/Select in DataStream API

2019-07-04 Thread Xingcan Cui
Hi folks, Two weeks ago, I started a thread [1] discussing whether we should discard the split/select methods (which have been marked as deprecated since v1.7) in the DataStream API. The fact is, these methods will cause "unexpected" results when used consecutively (e.g.,

Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread Xingcan Cui
Congrats Rong! Best, Xingcan > On Jul 11, 2019, at 1:08 PM, Shuyi Chen wrote: > > Congratulations, Rong! > > On Thu, Jul 11, 2019 at 8:26 AM Yu Li > wrote: > Congratulations Rong! > > Best Regards, > Yu > > > On Thu, 11 Jul 2019 at 22:54, zhijiang

Re: [VOTE] How to Deal with Split/Select in DataStream API

2019-07-08 Thread Xingcan Cui
er option. > > Aljoscha > >> On 8. Jul 2019, at 06:39, Xingcan Cui > <mailto:xingc...@gmail.com>> wrote: >> >> Hi all, >> >> Thanks for your participation. >> >> In this thread, we got one +1 for option 1 and option 3, respe

Re: [VOTE] How to Deal with Split/Select in DataStream API

2019-07-07 Thread Xingcan Cui
needed only for > splitting, in the outputs of flatMap functions. Replacing it with outputTags > would simplify data structures. > > Xingcan Cui mailto:xingc...@gmail.com>> 于 2019年7月5日周五 > 上午2:20写道: > Hi folks, > > Two weeks ago, I started a thread [1] discussing whethe

Re: Flink SQL client support for running in Flink cluster

2019-09-07 Thread Xingcan Cui
release event for the > same ,because i need to prototype my platform and present the same. > > Or is there any other way i can achieve this dynamic configuration and query > submission to Flink running as a full fledged cluster. > > Regards > Dipanjan > > On Saturday, Septe

Re: Apache Flink - Relation between stream time characteristic and timer triggers

2019-07-09 Thread Xingcan Cui
Yes, Mans. You can use both processing-time and event-time timers if you set the time characteristic to event-time. They'll be triggered by their own time semantics, separately. (actually there’s no watermark for processing time) Cheers, Xingcan > On Jul 9, 2019, at 11:40 AM, M Singh wrote: >

Re: Datadog reporter timeout & OOM issue

2021-01-27 Thread Xingcan Cui
does the trick. > > On 1/27/2021 5:52 AM, Xingcan Cui wrote: > > Hi all, > > Recently, I tried to use the Datadog reporter to collect some user-defined > metrics. Sometimes when reaching traffic peaks (which are also peaks for > metrics), the HTTP client wi

Datadog reporter timeout & OOM issue

2021-01-26 Thread Xingcan Cui
Hi all, Recently, I tried to use the Datadog reporter to collect some user-defined metrics. Sometimes when reaching traffic peaks (which are also peaks for metrics), the HTTP client will throw the following exception: ``` [OkHttp https://app.datadoghq.com/...] WARN

Timestamp type mismatch between Flink, Iceberg, and Avro

2021-05-21 Thread Xingcan Cui
Hi all, Recently, I tried to use Flink to write some Avro data to Iceberg. However, the timestamp representations for these systems really confused me. Here are some facts: - Avro uses `java.time.Instant` for logical type `timestamp_ms`; - Flink takes `java.time.Instant` as table type

Re: Timestamp type mismatch between Flink, Iceberg, and Avro

2021-05-21 Thread Xingcan Cui
epresent a TIMESTAMP as a long > value (this is also done internally), we can also support TIMESTAMP in > connectors. > > So I would assume that the issues is on the connector side which is not > properly integrated into the SQL type system. It might be a bug. > > Regards, > Tim

Re: Any way to improve list state get performance

2022-11-22 Thread Xingcan Cui
Hi Tao, I think you just need an extra `isEmpty` variable and to maintain it properly (e.g., when restoring the job, check whether the list state is empty or not). Also, I remember that the list state for RocksDB is not as performant as the map state when the state is large. Sometimes you could use a

Re: Types source code

2019-06-16 Thread Xingcan Cui
Hi, judging from the error, this looks like an issue with Scala's type inference mechanism. A forced cast with .asInstanceOf[Array[TypeInformation[_]]] should do. > On Jun 16, 2019, at 10:33 PM, liu_mingzhang wrote: > > Thanks for your reply. I tried running it, but it fails to compile and the project cannot be built. > Also, the issue link you posted doesn't open on my side... > > > On Jun 17, 2019 at 10:27, Zili Chen wrote: > Have you tried running it directly? IDEA sometimes has trouble with Scala type inference and may falsely report type mismatches for code that actually compiles and runs. If it runs, it should be an IDEA