[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL

2018-04-22 Thread Yu Li (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446692#comment-16446692
 ] 

Yu Li edited comment on HBASE-20471 at 4/23/18 3:32 AM:


Some of my thoughts:

1. Only allow setting {{SKIP_WAL/ASYNC_WAL/SYNC_WAL}} through the 
{{Mutation#setDurability}} API
* We can separate FSYNC_WAL writes from SYNC_WAL writes only if we flush the 
mutations one by one. In practice we group the writes through the disruptor 
mechanism (and we should, for the sake of performance), so a mutation with 
SYNC_WAL durability may actually be fsync'ed, and vice versa. So for sync 
writes I suggest we only allow setting SYNC_WAL per mutation, and decide the 
detailed sync mode through a cluster-level setting (for now).
* To keep backward compatibility, if a user sets FSYNC_WAL, we should change 
it to SYNC_WAL in {{Mutation#setDurability}}
* We should add some documentation about this change to our ref-guide.

2. Instead, we allow the user to choose the cluster-level sync mode, hflush 
or hsync, through hbase-site.xml.

3. In the future we may allow users to use a dedicated WAL per table/CF and 
set its sync mode through the table/CF descriptor.
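
To make points 1 and 2 concrete, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the class {{GroupSyncSketch}}, the {{SyncMode}} enum and both method names are made up for illustration, and only the durability level names mirror the ones discussed above. It assumes that one sync call covers a whole group of appended edits, which is why the hflush/hsync choice is made per group from a cluster-level mode and not per mutation.

```java
import java.util.List;
import java.util.Optional;

/** Illustrative model only; none of these names are HBase classes. */
final class GroupSyncSketch {

    /** Level names mirror the durability levels discussed in this issue. */
    enum Durability { SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    /** Hypothetical cluster-level sync mode, e.g. read once from hbase-site.xml. */
    enum SyncMode { HFLUSH, HSYNC }

    /** Point 1: FSYNC_WAL is still accepted for compatibility but treated as SYNC_WAL. */
    static Durability normalize(Durability requested) {
        return requested == Durability.FSYNC_WAL ? Durability.SYNC_WAL : requested;
    }

    /**
     * Point 2: a single sync call covers the whole group, so its kind comes from
     * the cluster-level mode. Returns empty when no edit in the group waits for a
     * sync (in this model, a group of only SKIP_WAL and ASYNC_WAL edits).
     */
    static Optional<SyncMode> syncCallForGroup(List<Durability> group, SyncMode clusterMode) {
        boolean anyWaitsForSync = group.stream()
                .map(GroupSyncSketch::normalize)
                .anyMatch(d -> d == Durability.SYNC_WAL);
        return anyWaitsForSync ? Optional.of(clusterMode) : Optional.empty();
    }
}
```

In this model, with the cluster mode set to {{HSYNC}}, a group holding one ASYNC_WAL edit and one SYNC_WAL edit gets a single hsync that covers both edits. That is the grouping effect described in point 1.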


was (Author: carp84):
Some of my thoughts:

1. Only allow setting {{SYNC_WAL}} through the {{Mutation#setDurability}} API
* We can separate FSYNC_WAL writes from SYNC_WAL writes only if we flush the 
mutations one by one. In practice we group the writes through the disruptor 
mechanism (and we should, for the sake of performance), so a mutation with 
SYNC_WAL durability may actually be fsync'ed, and vice versa.
* To keep backward compatibility, if a user sets FSYNC_WAL, we should change 
it to SYNC_WAL in {{Mutation#setDurability}}
* We should add some documentation about this change to our ref-guide.

2. Instead, we allow the user to choose the cluster-level sync mode, hflush 
or hsync, through hbase-site.xml.

3. In the future we may allow users to use a dedicated WAL per table/CF and 
set its sync mode through the table/CF descriptor.

> Recheck the design and implementation of FSYNC_WAL durability for WAL
>
> Key: HBASE-20471
> URL: https://issues.apache.org/jira/browse/HBASE-20471
> Project: HBase
> Issue Type: Task
> Reporter: Yu Li
> Priority: Major
>
> This is derived from the discussion in HBASE-19024 around [this 
> comment|https://issues.apache.org/jira/browse/HBASE-19024?focusedCommentId=16445592=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16445592].
> We have long supplied users an API to set durability per mutation. By design, 
> SYNC_WAL durability calls {{FSDataOutputStream#hflush}} and FSYNC_WAL calls 
> {{FSDataOutputStream#hsync}}, but in the implementation we were calling 
> hflush for FSYNC_WAL as well until HBASE-19024. Although HBASE-19024 tried, 
> with good intentions, to fix the semantics, its implementation cannot 
> guarantee that FSYNC_WAL edits are truly hsync'ed, because of the disruptor 
> mechanism used in the WAL implementation. In this JIRA we aim to discuss this 
> further and arrive at a complete solution.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL

2018-04-22 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447499#comment-16447499
 ] 

Andrew Purtell edited comment on HBASE-20471 at 4/23/18 2:59 AM:

After HBASE-19024 the default durability setting becomes FSYNC_WAL if 
_hbase.wal.hsync_ is true, so all edits in the batch will have the same 
setting and will be fsynced (unless the user changes the Durability setting, 
in which case the outcome is unclear, but we don't expect this in the normal 
case). The way the new setting _hbase.wal.hsync_ works was good enough for 
HBASE-19024, but I agree it is hardly ideal. If the user assumes SYNC_WAL or 
FSYNC_WAL is applied to every mutation individually, this is not correct and 
should be documented. 

We do expect SKIP_WAL to take effect on a per-RPC basis. That is, a Put should 
skip the WAL if SKIP_WAL is set, and every mutation in a batch or RowMutations 
should skip the WAL if the RPC carries a Durability attribute with SKIP_WAL 
selected. If that assumption is now somehow not correct, we need to fix that.
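
The expectations above can be written down as a small plain-Java model. This is only a sketch and not the HBase implementation: the class {{DefaultDurabilitySketch}} and its method names are hypothetical, and it assumes that an unset durability (modelled here as {{USE_DEFAULT}}) resolves to SYNC_WAL when _hbase.wal.hsync_ is false, which the comment above does not state.

```java
/** Illustrative model only; not the HBase implementation. */
final class DefaultDurabilitySketch {

    /** USE_DEFAULT stands for "the user did not pick a durability". */
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    /**
     * Default as described above: FSYNC_WAL when hbase.wal.hsync is true.
     * The SYNC_WAL fallback for the false case is an assumption of this sketch.
     */
    static Durability defaultDurability(boolean walHsync) {
        return walHsync ? Durability.FSYNC_WAL : Durability.SYNC_WAL;
    }

    /** Resolves the durability carried by an RPC against the configured default. */
    static Durability effective(Durability rpcDurability, boolean walHsync) {
        return rpcDurability == Durability.USE_DEFAULT
                ? defaultDurability(walHsync)
                : rpcDurability;
    }

    /**
     * Per-RPC expectation: if the RPC carries SKIP_WAL, every mutation in it
     * skips the WAL, regardless of the hbase.wal.hsync setting.
     */
    static boolean skipsWal(Durability rpcDurability, boolean walHsync) {
        return effective(rpcDurability, walHsync) == Durability.SKIP_WAL;
    }
}
```

The model deliberately leaves out the unclear case mentioned above, where edits with different explicit Durability settings end up in the same sync group.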





[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL

2018-04-21 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446714#comment-16446714
 ] 

Anoop Sam John edited comment on HBASE-20471 at 4/21/18 9:31 AM:

Yes, I mean along with your point #3.
As a first step, we should at least write all these known limitations into the book.


was (Author: anoop.hbase):
As a first step, we should at least write all these known limitations into the book.
