[ https://issues.apache.org/jira/browse/FLINK-16496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jark Wu updated FLINK-16496:
----------------------------
Description:
Currently, the HBase sink provides three flush options:
{code}
'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
'connector.write.buffer-flush.max-rows' = '1000' -- no default value
'connector.write.buffer-flush.interval' = '2s' -- no default value
{code}
That means that if the flush interval is not set, buffered output rows may not
be flushed to the database for a long time. This is surprising behavior, because
no results are output by default.
So we propose defaults of a '1s' flush interval, '1000' rows, and '2mb' size for
the HBase sink. These defaults apply only to the new HBase sink options:
{code}
'sink.buffer-flush.max-rows' = '1000' -- same as the ES sink
'sink.buffer-flush.max-size' = '2mb' -- default value of the HBase client
'sink.buffer-flush.interval' = '1s' -- same as the JDBC sink
{code}
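To illustrate why the missing interval default matters, here is a minimal
sketch of a buffering sink with the three flush triggers. It is a toy model,
not Flink or HBase code: the class BufferedSink, its parameters, and the
on_timer hook are hypothetical names chosen for illustration, and the real
connector may evaluate the triggers differently.

```python
import time


class BufferedSink:
    """Toy model of a sink with row-count, size, and interval flush triggers."""

    def __init__(self, max_rows=1000, max_size_bytes=2 * 1024 * 1024,
                 interval_s=1.0, clock=time.monotonic):
        self.max_rows = max_rows              # None disables the row-count trigger
        self.max_size_bytes = max_size_bytes  # None disables the size trigger
        self.interval_s = interval_s          # None disables the time trigger
        self.clock = clock                    # injectable so the demo can fake time
        self.buffer = []
        self.buffered_bytes = 0
        self.last_flush = clock()
        self.flushed = []                     # stands in for the external database

    def write(self, row: bytes) -> None:
        """Buffer one row and flush if a row-count or size threshold is reached."""
        self.buffer.append(row)
        self.buffered_bytes += len(row)
        if self.max_rows is not None and len(self.buffer) >= self.max_rows:
            self.flush()
        elif (self.max_size_bytes is not None
              and self.buffered_bytes >= self.max_size_bytes):
            self.flush()

    def on_timer(self) -> None:
        """Periodic callback: flush if rows are pending and the interval elapsed."""
        if (self.interval_s is not None and self.buffer
                and self.clock() - self.last_flush >= self.interval_s):
            self.flush()

    def flush(self) -> None:
        """Move all buffered rows to the 'database' and reset the counters."""
        self.flushed.extend(self.buffer)
        self.buffer.clear()
        self.buffered_bytes = 0
        self.last_flush = self.clock()


# Demo with a fake clock: without row and interval triggers, a single small
# row stays buffered no matter how much time passes.
fake_now = [0.0]
legacy = BufferedSink(max_rows=None, interval_s=None, clock=lambda: fake_now[0])
legacy.write(b"row-1")
fake_now[0] = 3600.0
legacy.on_timer()
assert legacy.flushed == []

# With the proposed defaults, the same row is flushed once 1s has elapsed.
proposed = BufferedSink(clock=lambda: fake_now[0])
proposed.write(b"row-1")
fake_now[0] += 1.0
proposed.on_timer()
assert proposed.flushed == [b"row-1"]
```

With only the size trigger active, a low-volume job never reaches 2mb, which
matches the "no results by default" behavior described above.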
was:
Currently, the HBase sink provides three flush options:
{code}
'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
'connector.write.buffer-flush.max-rows' = '1000' -- no default value
'connector.write.buffer-flush.interval' = '2s' -- no default value
{code}
That means that if the flush interval is not set, buffered output rows may not
be flushed to the database for a long time. This is surprising behavior, because
no results are output by default.
So we propose a default flush interval of '1s' (same as the JDBC sink) and a
default size of '2mb' (default value of the HBase client) for the HBase sink.
This only applies to new JDBC sink options:
{code}
'sink.buffer-flush.max-actions' = 'none'
'sink.buffer-flush.max-size' = '2mb'
'sink.buffer-flush.interval' = '1s'
{code}
> Improve default flush strategy for HBase sink to make it work out-of-box
> -------------------------------------------------------------------------
>
> Key: FLINK-16496
> URL: https://issues.apache.org/jira/browse/FLINK-16496
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / HBase, Table SQL / Ecosystem
> Reporter: Jark Wu
> Assignee: Jark Wu
> Priority: Critical
> Fix For: 1.11.0
>
>
> Currently, the HBase sink provides three flush options:
> {code}
> 'connector.write.buffer-flush.max-size' = '2mb' -- default 2mb
> 'connector.write.buffer-flush.max-rows' = '1000' -- no default value
> 'connector.write.buffer-flush.interval' = '2s' -- no default value
> {code}
> That means that if the flush interval is not set, buffered output rows may not
> be flushed to the database for a long time. This is surprising behavior, because
> no results are output by default.
> So we propose defaults of a '1s' flush interval, '1000' rows, and '2mb' size for
> the HBase sink. These defaults apply only to the new HBase sink options:
> {code}
> 'sink.buffer-flush.max-rows' = '1000' -- same as the ES sink
> 'sink.buffer-flush.max-size' = '2mb' -- default value of the HBase client
> 'sink.buffer-flush.interval' = '1s' -- same as the JDBC sink
> {code}