[
https://issues.apache.org/jira/browse/FLINK-20972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
huajiewang updated FLINK-20972:
-------------------------------
Description:
When TwoPhaseCommitSinkFunction triggers notifyCheckpointComplete, a large
amount of event data may be written to the log (LOG.info), which can cause an
I/O bottleneck and waste disk space.
My code is in the attachment. With it, Flink writes a large amount of event
data to the log, e.g.:
{code:java}
Jdbc2PCSinkFunction 1/1 - checkpoint 4 complete, committing transaction
TransactionHolder{handle=Transaction(b420c880a951403984f231dd7e33597b,
ListBuffer(insert into table(field1,field2) value ('11','22') ... ... ),
transactionStartTime=1610426158532} from checkpoint 4{code}
The LOG.info call in TwoPhaseCommitSinkFunction is as follows:
{code:java}
LOG.info(
        "{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
        name(),
        checkpointId,
        pendingTransaction,
        pendingTransactionCheckpointId); {code}
This call invokes the toString method of pendingTransaction (pendingTransaction
is a TransactionHolder instance). The toString method of TransactionHolder is:
{code:java}
@Override
public String toString() {
    return "TransactionHolder{"
            + "handle="
            + handle
            + ", transactionStartTime="
            + transactionStartTime
            + '}';
}{code}
handle is the concrete implementation of my Transaction. My Transaction has a
field of List type that buffers the incoming data, so all of the buffered data
is printed by LOG.info.
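One possible workaround on the user side, independent of any change in Flink,
is to override toString in the transaction class so that it reports only the
buffer size. The sketch below is a minimal illustration under assumptions:
BufferedTransaction is a hypothetical stand-in for the Transaction in the
attachment, and the TransactionHolder here is a simplified copy that only
reuses the toString shown above. It does not depend on Flink.

```java
import java.util.ArrayList;
import java.util.List;

class CompactTransactionToString {

    /** Hypothetical transaction handle; stands in for the Transaction in the attachment. */
    static final class BufferedTransaction {
        private final String id;
        private final List<String> statements = new ArrayList<>();

        BufferedTransaction(String id) {
            this.id = id;
        }

        void add(String sql) {
            statements.add(sql);
        }

        // Report only the id and the buffer size, never the buffered rows themselves.
        @Override
        public String toString() {
            return "BufferedTransaction{id=" + id
                    + ", bufferedStatements=" + statements.size() + '}';
        }
    }

    /** Simplified stand-in for TransactionHolder, reusing the toString quoted above. */
    static final class TransactionHolder<T> {
        private final T handle;
        private final long transactionStartTime;

        TransactionHolder(T handle, long transactionStartTime) {
            this.handle = handle;
            this.transactionStartTime = transactionStartTime;
        }

        @Override
        public String toString() {
            return "TransactionHolder{"
                    + "handle="
                    + handle
                    + ", transactionStartTime="
                    + transactionStartTime
                    + '}';
        }
    }

    public static void main(String[] args) {
        BufferedTransaction txn = new BufferedTransaction("b420c880a951403984f231dd7e33597b");
        for (int i = 0; i < 10_000; i++) {
            txn.add("insert into table(field1,field2) value ('11','22')");
        }
        TransactionHolder<BufferedTransaction> holder =
                new TransactionHolder<>(txn, 1610426158532L);
        // The holder's string stays short no matter how many rows are buffered.
        System.out.println(holder);
    }
}
```

With this kind of toString, the message built by LOG.info contains the
transaction id and a row count instead of every buffered statement.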
> TwoPhaseCommitSinkFunction Output a large amount of EventData
> -------------------------------------------------------------
>
> Key: FLINK-20972
> URL: https://issues.apache.org/jira/browse/FLINK-20972
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream
> Affects Versions: 1.12.0
> Environment: flink 1.4.0 +
> Reporter: huajiewang
> Priority: Minor
> Labels: easyfix, pull-request-available
> Attachments: Jdbc2PCSinkFunction.scala
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> When TwoPhaseCommitSinkFunction triggers notifyCheckpointComplete, a large
> amount of event data may be written to the log (LOG.info), which can cause
> an I/O bottleneck and waste disk space.
>
> My code is in the attachment. With it, Flink writes a large amount of event
> data to the log, e.g.:
> {code:java}
> Jdbc2PCSinkFunction 1/1 - checkpoint 4 complete, committing transaction
> TransactionHolder{handle=Transaction(b420c880a951403984f231dd7e33597b,
> ListBuffer(insert into table(field1,field2) value ('11','22') ... ... ),
> transactionStartTime=1610426158532} from checkpoint 4{code}
> The LOG.info call in TwoPhaseCommitSinkFunction is as follows:
> {code:java}
> LOG.info(
>         "{} - checkpoint {} complete, committing transaction {} from checkpoint {}",
>         name(),
>         checkpointId,
>         pendingTransaction,
>         pendingTransactionCheckpointId); {code}
> This call invokes the toString method of pendingTransaction
> (pendingTransaction is a TransactionHolder instance). The toString method of
> TransactionHolder is:
>
> {code:java}
> @Override
> public String toString() {
>     return "TransactionHolder{"
>             + "handle="
>             + handle
>             + ", transactionStartTime="
>             + transactionStartTime
>             + '}';
> }{code}
>
> handle is the concrete implementation of my Transaction. My Transaction has
> a field of List type that buffers the incoming data, so all of the buffered
> data is printed by LOG.info.
>
>
>