[
https://issues.apache.org/jira/browse/FLINK-16496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jark Wu updated FLINK-16496:
----------------------------
Description:
Currently, HBase sink provides 3 flush options:
{code}
'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
'connector.write.buffer-flush.max-rows' = '1000' -- no default value
'connector.write.buffer-flush.interval' = '2s' -- no default value
{code}
That means that if the flush interval is not set, buffered output rows may not
be flushed to the database for a long time. This is surprising behavior because
no results are output by default.
So we propose a default flush interval of '1s' (same as the JDBC sink) and a
default flush size of '2mb' (the default value of the HBase client) for the
HBase sink. This only applies to the new HBase sink options:
{code}
'sink.buffer-flush.max-actions' = 'none'
'sink.buffer-flush.max-size' = '2mb'
'sink.buffer-flush.interval' = '1s'
{code}
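To illustrate why a missing interval can leave rows stuck in the buffer, here is a minimal sketch of a buffer with size, row-count, and interval triggers. It is not the Flink HBase sink implementation; the class `BufferedSink`, its parameters, and the `on_timer` hook are hypothetical names used only for illustration.

```python
import time
from typing import Callable, List, Optional


class BufferedSink:
    """Toy write buffer with three flush triggers (hypothetical, not Flink code)."""

    def __init__(
        self,
        flush_fn: Callable[[List[bytes]], None],
        max_size_bytes: int = 2 * 1024 * 1024,  # mirrors the '2mb' default
        max_rows: Optional[int] = None,          # None means no row-count trigger
        interval_s: Optional[float] = 1.0,       # mirrors the proposed '1s' default
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._flush_fn = flush_fn
        self._max_size_bytes = max_size_bytes
        self._max_rows = max_rows
        self._interval_s = interval_s
        self._clock = clock
        self._rows: List[bytes] = []
        self._bytes = 0
        self._last_flush = clock()

    def write(self, row: bytes) -> None:
        """Buffer one row; flush if the size or row-count threshold is reached."""
        self._rows.append(row)
        self._bytes += len(row)
        size_hit = self._bytes >= self._max_size_bytes
        rows_hit = self._max_rows is not None and len(self._rows) >= self._max_rows
        if size_hit or rows_hit:
            self.flush()

    def on_timer(self) -> None:
        """Called periodically by some scheduler; flushes if the interval elapsed."""
        if self._interval_s is None or not self._rows:
            return
        if self._clock() - self._last_flush >= self._interval_s:
            self.flush()

    def flush(self) -> None:
        """Hand the buffered rows to flush_fn and reset the buffer."""
        if self._rows:
            self._flush_fn(self._rows)
        self._rows = []
        self._bytes = 0
        self._last_flush = self._clock()
```

With `interval_s=None` and `max_rows=None`, a low-volume stream only flushes once 2 MB of rows have accumulated, which matches the behavior described above. With the 1-second interval, a single buffered row is written out on the next timer tick.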
was:
Currently, HBase sink provides 3 flush options:
{code}
'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
'connector.write.buffer-flush.max-rows' = '1000' -- no default value
'connector.write.buffer-flush.interval' = '2s' -- no default value
{code}
That means that if the flush interval is not set, buffered output rows may not
be flushed to the database for a long time. This is surprising behavior because
no results are output by default.
So I propose a default flush interval of '1s' for the HBase sink, or a default
flush size of 1 row.
> Improve default flush strategy for HBase sink to make it work out-of-box
> -------------------------------------------------------------------------
>
> Key: FLINK-16496
> URL: https://issues.apache.org/jira/browse/FLINK-16496
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / HBase, Table SQL / Ecosystem
> Reporter: Jark Wu
> Assignee: Jark Wu
> Priority: Critical
> Fix For: 1.11.0
>
>
> Currently, HBase sink provides 3 flush options:
> {code}
> 'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
> 'connector.write.buffer-flush.max-rows' = '1000' -- no default value
> 'connector.write.buffer-flush.interval' = '2s' -- no default value
> {code}
> That means that if the flush interval is not set, buffered output rows may not
> be flushed to the database for a long time. This is surprising behavior because
> no results are output by default.
> So we propose a default flush interval of '1s' (same as the JDBC sink) and a
> default flush size of '2mb' (the default value of the HBase client) for the
> HBase sink. This only applies to the new HBase sink options:
> {code}
> 'sink.buffer-flush.max-actions' = 'none'
> 'sink.buffer-flush.max-size' = '2mb'
> 'sink.buffer-flush.interval' = '1s'
> {code}