[ https://issues.apache.org/jira/browse/HBASE-9521?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicolas Liochon updated HBASE-9521:
-----------------------------------
Description:
The behavior with clearBufferOnFail is very fishy.
{code}
/**
* When you turn {@link #autoFlush} off, you should also consider the
* {@link #clearBufferOnFail} option. By default, asynchronous {@link Put}
* requests will be retried on failure until successful. However, this can
* pollute the writeBuffer and slow down batching performance. Additionally,
* you may want to issue a number of Put requests and call
* {@link #flushCommits()} as a barrier. In both use cases, consider setting
* clearBufferOnFail to true to erase the buffer after {@link #flushCommits()}
* has been called, regardless of success.
*
* @param autoFlush
* Whether or not to enable 'auto-flush'.
* @param clearBufferOnFail
* Whether to keep Put failures in the writeBuffer
* @see #flushCommits
*/
public void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
this.autoFlush = autoFlush;
this.clearBufferOnFail = autoFlush || clearBufferOnFail; // <============ yo man
}
{code}
{code}
public void setAutoFlush(boolean autoFlush) {
setAutoFlush(autoFlush, autoFlush); // <============ more yo
}
{code}
So by default, an HTable has
- autoflush == true
- clearBufferOnFail == true
BUT, if you call setAutoFlush(false), you have
- autoflush == false
- clearBufferOnFail == false
So:
- you're setting two parameters instead of only one, without being told so.
- a side effect is that failed operations will be tried twice:
- once in the standard process
- once in the table close, as we're flushing the buffer again
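The coupling and the double attempt can be reproduced with a small stand-in. This is only a sketch: BufferedTableSketch is a hypothetical class, not the real HTable. It copies the flag logic of the two setters quoted above and assumes, for illustration, that every send fails and that a failure is counted rather than thrown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the client-side write buffer; NOT the real HTable.
class BufferedTableSketch {
    // Defaults as described in the issue: both flags start out true.
    private boolean autoFlush = true;
    private boolean clearBufferOnFail = true;
    private final List<String> writeBuffer = new ArrayList<>();
    private int sendAttempts = 0;

    // Same flag logic as the two-argument setter quoted above.
    void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
        this.autoFlush = autoFlush;
        this.clearBufferOnFail = autoFlush || clearBufferOnFail;
    }

    // Same delegation as the one-argument setter quoted above.
    void setAutoFlush(boolean autoFlush) {
        setAutoFlush(autoFlush, autoFlush);
    }

    boolean isAutoFlush() { return autoFlush; }
    boolean isClearBufferOnFail() { return clearBufferOnFail; }
    int getSendAttempts() { return sendAttempts; }

    void put(String row) {
        writeBuffer.add(row);
        if (autoFlush) {
            flushCommits();
        }
    }

    // Assumption: every buffered operation fails when sent.
    // Each buffered row counts as one attempt.
    void flushCommits() {
        sendAttempts += writeBuffer.size();
        if (clearBufferOnFail) {
            writeBuffer.clear(); // failed operations are dropped
        }
        // Otherwise the failed operations stay in the buffer.
    }

    // Closing the table flushes whatever is still buffered.
    void close() {
        flushCommits();
    }

    public static void main(String[] args) {
        BufferedTableSketch implicit = new BufferedTableSketch();
        implicit.setAutoFlush(false); // silently sets clearBufferOnFail to false too
        implicit.put("row-1");
        implicit.flushCommits();
        implicit.close();
        System.out.println("clearBufferOnFail=" + implicit.isClearBufferOnFail()
            + " attempts=" + implicit.getSendAttempts());

        BufferedTableSketch explicit = new BufferedTableSketch();
        explicit.setAutoFlush(false, true); // both flags stated by the caller
        explicit.put("row-1");
        explicit.flushCommits();
        explicit.close();
        System.out.println("clearBufferOnFail=" + explicit.isClearBufferOnFail()
            + " attempts=" + explicit.getSendAttempts());
    }
}
```

In this model the one-argument call leaves the failed operation in the buffer, so it is attempted once by flushCommits() and once more by close(). The explicit two-argument call attempts it only once.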
I would like to:
- deprecate clearBufferOnFail.
- deprecate setAutoFlush(boolean), to make things clear about what we're doing.
was:
The behavior with clearBufferOnFail is very fishy.
{code}
/**
* When you turn {@link #autoFlush} off, you should also consider the
* {@link #clearBufferOnFail} option. By default, asynchronous {@link Put}
* requests will be retried on failure until successful. However, this can
* pollute the writeBuffer and slow down batching performance. Additionally,
* you may want to issue a number of Put requests and call
* {@link #flushCommits()} as a barrier. In both use cases, consider setting
* clearBufferOnFail to true to erase the buffer after {@link #flushCommits()}
* has been called, regardless of success.
*
* @param autoFlush
* Whether or not to enable 'auto-flush'.
* @param clearBufferOnFail
* Whether to keep Put failures in the writeBuffer
* @see #flushCommits
*/
public void setAutoFlush(boolean autoFlush, boolean clearBufferOnFail) {
this.autoFlush = autoFlush;
this.clearBufferOnFail = autoFlush || clearBufferOnFail; // <============ yo man
}
{code}
{code}
public void setAutoFlush(boolean autoFlush) {
setAutoFlush(autoFlush, autoFlush); // <============ more yo
}
{code}
So by default, an HTable has
- autoflush == true
- clearBufferOnFail == true
BUT, if you call setAutoFlush(false), you have
- autoflush == false
- clearBufferOnFail == false
So:
- you're setting two parameters instead of only one.
- the javadoc about writeBuffer being involved in the retry process may have
been true 5 years ago, but it's wrong now
I'm pretty sure that most people don't know this. You need to go to the
implementation to learn it, as the javadoc says the opposite.
A side effect is that failed operations will be tried twice:
- once in the standard process
- once in the table close, as we're flushing the buffer again
I would like to:
- deprecate clearBufferOnFail.
- change setAutoFlush(boolean) to make it change only autoFlush, without
activating clearBufferOnFail
It won't change the interface, but it would change the behavior. I would like to
put this into the next 0.96 release.
> clean clearBufferOnFail behavior and deprecate it
> -------------------------------------------------
>
> Key: HBASE-9521
> URL: https://issues.apache.org/jira/browse/HBASE-9521
> Project: HBase
> Issue Type: Bug
> Components: Client
> Affects Versions: 0.98.0, 0.96.0
> Reporter: Nicolas Liochon
> Assignee: Nicolas Liochon
> Priority: Critical
> Fix For: 0.98.0, 0.96.0
>