Hi Di

Thank you for initiating this FLIP, +1 for this.

Regarding the `doris.filter.query` option of the Doris source table:

Could we implement Flink's filter push-down capability for the source directly,
as the JDBC source does [1], instead of introducing an option?
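
To make the idea concrete, here is a rough, dependency-free sketch of the split
that filter push-down implies. In Flink, a table source would implement
`SupportsFilterPushDown`, whose `applyFilters` method reports which filters the
source accepted and which remain for Flink to evaluate. The classes
`SimpleFilter` and `FilterPushDownSketch` below are hypothetical stand-ins, not
Flink or Doris connector classes, and the literal quoting is deliberately
simplified.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for one resolved filter expression; NOT a Flink class.
final class SimpleFilter {
    final String column;
    final String operator;
    final Object literal;

    SimpleFilter(String column, String operator, Object literal) {
        this.column = column;
        this.operator = operator;
        this.literal = literal;
    }
}

// Hypothetical sketch of the accepted/remaining split that a source
// supporting filter push-down would perform.
final class FilterPushDownSketch {
    // Comparison operators this sketch can translate; anything else stays in Flink.
    private static final List<String> SUPPORTED =
            Arrays.asList("=", "<>", "<", "<=", ">", ">=");

    final List<SimpleFilter> accepted = new ArrayList<>();
    final List<SimpleFilter> remaining = new ArrayList<>();

    // Sorts each filter into "pushed to the database" or "left for Flink".
    void applyFilters(List<SimpleFilter> filters) {
        for (SimpleFilter f : filters) {
            if (SUPPORTED.contains(f.operator)) {
                accepted.add(f);
            } else {
                remaining.add(f);
            }
        }
    }

    // Builds the predicate that would replace a hand-written filter option.
    String toWhereClause() {
        StringBuilder sb = new StringBuilder();
        for (SimpleFilter f : accepted) {
            if (sb.length() > 0) {
                sb.append(" AND ");
            }
            sb.append('`').append(f.column).append("` ")
              .append(f.operator).append(' ').append(render(f.literal));
        }
        return sb.toString();
    }

    // Simplified rendering: numbers as-is, everything else single-quoted.
    private static String render(Object literal) {
        if (literal instanceof Number) {
            return literal.toString();
        }
        return "'" + literal.toString().replace("'", "''") + "'";
    }
}
```

With this approach the planner decides what is pushed down, so users would not
have to keep a filter option in sync with the WHERE clause of their query.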


Regarding two-phase commit,

> At the same time, Doris will also abort transactions that have not been
> committed for a long time

Can we control the transaction timeout in the connector?
And can we control the behavior when a timeout occurs: discard the
transaction by default, or trigger a job failure?
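
As a rough illustration of the question, the sketch below models a timeout and
a configurable reaction to it. Everything here is an assumption for discussion:
the names `TimeoutBehavior` and `TxnTimeoutPolicy` are hypothetical, and so are
the option names mentioned in the comments; none of them are existing connector
options.

```java
import java.time.Duration;

// Hypothetical values for an option such as `sink.transaction.timeout-behavior`.
enum TimeoutBehavior { DISCARD, FAIL }

// Hypothetical policy object; not part of the Doris connector.
final class TxnTimeoutPolicy {
    // Would come from a Duration-typed option such as `sink.transaction.timeout`.
    final Duration timeout;
    final TimeoutBehavior behavior;

    TxnTimeoutPolicy(Duration timeout, TimeoutBehavior behavior) {
        this.timeout = timeout;
        this.behavior = behavior;
    }

    // True once more time than the configured timeout has passed since precommit.
    boolean isExpired(Duration elapsedSincePrecommit) {
        return elapsedSincePrecommit.compareTo(timeout) > 0;
    }

    // Returns true if the commit should proceed and false if the transaction
    // should be silently discarded; throws to fail the job when so configured.
    boolean shouldCommit(long txnId, Duration elapsedSincePrecommit) {
        if (!isExpired(elapsedSincePrecommit)) {
            return true;
        }
        if (behavior == TimeoutBehavior.FAIL) {
            throw new IllegalStateException(
                    "Transaction " + txnId + " exceeded timeout " + timeout);
        }
        return false;
    }
}
```

Discarding silently keeps the job running but can lose data, while failing the
job makes the problem visible, so the default matters for the exactly-once
guarantee.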


[1]. https://issues.apache.org/jira/browse/FLINK-16024

Best,
Feng


On Tue, Mar 12, 2024 at 12:12 AM Ferenc Csaky <ferenc.cs...@pm.me.invalid>
wrote:

> Hi,
>
> Thanks for driving this, +1 for the FLIP.
>
> Best,
> Ferenc
>
>
>
>
> On Monday, March 11th, 2024 at 15:17, Ahmed Hamdy <hamdy10...@gmail.com>
> wrote:
>
> >
> >
> > Hello,
> > Thanks for the proposal, +1 for the FLIP.
> >
> > Best Regards
> > Ahmed Hamdy
> >
> >
> > On Mon, 11 Mar 2024 at 15:12, wudi 676366...@qq.com.invalid wrote:
> >
> > > Hi, Leonard
> > > Thank you for your suggestion.
> > > > I referred to other connectors[1], modified the naming and types of
> > > > the relevant parameters[2], and also updated the FLIP.
> > > >
> > > > [1]
> > > > https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/connectors/table/overview/
> > > > [2]
> > > > https://github.com/apache/doris-flink-connector/blob/master/flink-doris-connector/src/main/java/org/apache/doris/flink/table/DorisConfigOptions.java
> > >
> > > Brs,
> > > di.wu
> > >
> > > > 2024年3月7日 14:33,Leonard Xu xbjt...@gmail.com 写道:
> > > >
> > > > > Thanks wudi for the update. The FLIP generally looks good to me; I
> > > > > only have two minor suggestions:
> > > > >
> > > > > (1) The suffix `.s` in the config option `doris.request.query.timeout.s`
> > > > > looks strange to me. Could we change the value type of all
> > > > > time-interval-related options to Duration?
> > > > >
> > > > > (2) Could you check and improve all config options, such as
> > > > > `doris.exec.mem.limit`, so that they follow Flink's config option
> > > > > naming and value types?
> > > >
> > > > Best,
> > > > Leonard
> > > >
> > > > > > > > On Mar 6, 2024, at 06:12, Jing Ge j...@ververica.com.INVALID wrote:
> > > > > >
> > > > > > Hi Di,
> > > > > >
> > > > > > > Thanks for your proposal. +1 for the contribution. I'd like to know
> > > > > > > your thoughts on the following questions:
> > > > > > >
> > > > > > > 1. According to your clarification of exactly-once (thanks for it,
> > > > > > > BTW), no PreCommitTopology is required. Does it make sense to let
> > > > > > > DorisSink[1] implement SupportsCommitter, since
> > > > > > > TwoPhaseCommittingSink is deprecated[2], before turning the Doris
> > > > > > > connector into a Flink connector?
> > > > > > > 2. OLAP engines are commonly used as the tail/downstream of a data
> > > > > > > pipeline to support further processing, e.g. ad-hoc queries or cubes
> > > > > > > with feasible pre-aggregation. Just out of curiosity, would you like
> > > > > > > to share some real use cases that use OLAP engines as the source of
> > > > > > > a streaming data pipeline? Or will it only be used as a source for
> > > > > > > batch jobs?
> > > > > > > 3. The E2E test only covers the sink[3], if I am not mistaken. Would
> > > > > > > you like to test the source in E2E too?
> > > > > >
> > > > > > > [1]
> > > > > > > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/main/java/org/apache/doris/flink/sink/DorisSink.java#L55
> > > > > > > [2]
> > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-372%3A+Enhance+and+synchronize+Sink+API+to+match+the+Source+API
> > > > > > > [3]
> > > > > > > https://github.com/apache/doris-flink-connector/blob/43e0e5cf9b832854ea228fb093077872e3a311b6/flink-doris-connector/src/test/java/org/apache/doris/flink/tools/cdc/MySQLDorisE2ECase.java#L96
> > > > > > >
> > > > > > Best regards,
> > > > > > Jing
> > > > > >
> > > > > > > On Tue, Mar 5, 2024 at 11:18 AM wudi 676366...@qq.com.invalid wrote:
> > > > > >
> > > > > > > Hi, Jeyhun Karimov.
> > > > > > > Thanks for your question.
> > > > > > >
> > > > > > > - How to ensure Exactly-Once?
> > > > > > > > 1. When the checkpoint barrier arrives, DorisSink triggers the
> > > > > > > > precommit API of Stream Load to persist the data in Doris (the data
> > > > > > > > is not visible at this point) and passes the TxnID to the Committer.
> > > > > > > > 2. When the checkpoint of the entire job completes, the Committer
> > > > > > > > calls the commit API of Stream Load with the TxnID, which makes the
> > > > > > > > transaction visible.
> > > > > > > > 3. When the task is restarted, any Txn whose precommit succeeded but
> > > > > > > > whose commit failed is aborted, based on the label-prefix, by calling
> > > > > > > > Doris' abort API. (Doris will also abort transactions that have not
> > > > > > > > been committed for a long time.)
> > > > > > > >
> > > > > > > > PS: This content has also been added to the FLIP.
> > > > > > >
> > > > > > > > - Because the default table model in Doris is Duplicate (
> > > > > > > > https://doris.apache.org/docs/data-table/data-model/), which does not
> > > > > > > > have a primary key, batch writing may cause data duplication. The
> > > > > > > > Unique model, however, has a primary key, which ensures the
> > > > > > > > idempotence of writes and thus achieves Exactly-Once.
> > > > > > >
> > > > > > > Brs,
> > > > > > > di.wu
> > > > > > >
> > > > > > > > > On Mar 2, 2024, at 17:50, Jeyhun Karimov je.kari...@gmail.com wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > Thanks for the proposal. +1 for the FLIP.
> > > > > > > > I have a few questions:
> > > > > > > >
> > > > > > > > > - How exactly will the combination of the two (Stream Load's
> > > > > > > > > two-phase commit and Flink's two-phase commit) ensure e2e
> > > > > > > > > exactly-once semantics?
> > > > > > > > >
> > > > > > > > > - The FLIP proposes to combine Doris's batch writing with the
> > > > > > > > > primary key table to achieve Exactly-Once semantics. Could you
> > > > > > > > > elaborate more on that? Why is it not the default behavior but a
> > > > > > > > > workaround?
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Jeyhun
> > > > > > > >
> > > > > > > > > On Sat, Mar 2, 2024 at 10:14 AM Yanquan Lv decq12y...@gmail.com wrote:
> > > > > > > >
> > > > > > > > > Thanks for driving this.
> > > > > > > > > > The content is very detailed. I recommend adding a Test Plan
> > > > > > > > > > section for completeness.
> > > > > > > > >
> > > > > > > > > > On Thu, Jan 25, 2024 at 15:40, Di Wu d...@apache.org wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > > Previously, we had some discussions about contributing the Flink
> > > > > > > > > > > Doris Connector to the Flink community [1]. I want to move this
> > > > > > > > > > > work forward.
> > > > > > > > > > > I hope everyone will participate in this FLIP discussion and
> > > > > > > > > > > provide more valuable opinions and suggestions.
> > > > > > > > > > > Thanks.
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > > > https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> > > > > > > > > >
> > > > > > > > > > Brs,
> > > > > > > > > > di.wu
> > > > > > > > > >
> > > > > > > > > > On 2023/12/07 05:02:46 wudi wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > > As discussed in the previous email [1], I would like to
> > > > > > > > > > > > contribute the Flink Doris Connector to the Flink community.
> > > > > > > > > > > >
> > > > > > > > > > > > Apache Doris[2] is a high-performance, real-time analytical
> > > > > > > > > > > > database based on MPP architecture. For scenarios where Flink
> > > > > > > > > > > > is used for data analysis, processing, or real-time writing on
> > > > > > > > > > > > Doris, the Flink Doris Connector is an effective tool.
> > > > > > > > > > > >
> > > > > > > > > > > > At the same time, contributing the Flink Doris Connector to
> > > > > > > > > > > > the Flink community will further expand the Flink Connectors
> > > > > > > > > > > > ecosystem.
> > > > > > > > > > > >
> > > > > > > > > > > > So I would like to start an official discussion on FLIP-399:
> > > > > > > > > > > > Flink Connector Doris[3].
> > > > > > > > > > > >
> > > > > > > > > > > > Looking forward to comments, feedback, and suggestions from
> > > > > > > > > > > > the community on the proposal.
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > > https://lists.apache.org/thread/lvh8g9o6qj8bt3oh60q81z0o1cv3nn8p
> > > > > > > > > > > > [2]
> > > > > > > > > > > > https://doris.apache.org/docs/dev/get-starting/what-is-apache-doris/
> > > > > > > > > > > > [3]
> > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-399%3A+Flink+Connector+Doris
> > > > > > > > > > > >
> > > > > > > > > > > Brs,
> > > > > > > > > > >
> > > > > > > > > > > di.wu
>
