[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL
[ https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446692#comment-16446692 ]

Yu Li edited comment on HBASE-20471 at 4/23/18 3:32 AM:

Some of my thoughts:

1. Only allow setting {{SKIP_WAL/ASYNC_WAL/SYNC_WAL}} through the {{Mutation#setDurability}} API
* We could separate FSYNC_WAL writes from SYNC_WAL writes only if we flushed the mutations one by one, but we actually group the writes through the disruptor mechanism (and we should, for the sake of performance). As a result, a mutation with SYNC_WAL durability may actually be fsync'ed, and vice versa. So for sync writes I suggest we only allow setting SYNC_WAL per mutation, and decide the detailed sync mode through a cluster-level setting (for now).
* To keep backward compatibility, if the user sets FSYNC_WAL, we should change it to SYNC_WAL in {{Mutation#setDurability}}.
* We should add some documentation about this change to our ref-guide.
2. Instead, we allow the user to set the cluster-level sync mode, hflush or hsync, through hbase-site.xml.
3. In the future we may allow the user to use a dedicated WAL per table/CF and set its sync mode through the table/CF descriptor.

was (Author: carp84):

Some of my thoughts:

1. Only allow setting {{SYNC_WAL}} through the {{Mutation#setDurability}} API
* We could separate FSYNC_WAL writes from SYNC_WAL writes only if we flushed the mutations one by one, but we actually group the writes through the disruptor mechanism (and we should, for the sake of performance). As a result, a mutation with SYNC_WAL durability may actually be fsync'ed, and vice versa.
* To keep backward compatibility, if the user sets FSYNC_WAL, we should change it to SYNC_WAL in {{Mutation#setDurability}}.
* We should add some documentation about this change to our ref-guide.
2. Instead, we allow the user to set the cluster-level sync mode, hflush or hsync, through hbase-site.xml.
3. In the future we may allow the user to use a dedicated WAL per table/CF and set its sync mode through the table/CF descriptor.

> Recheck the design and implementation of FSYNC_WAL durability for WAL
> ---------------------------------------------------------------------
>
>                 Key: HBASE-20471
>                 URL: https://issues.apache.org/jira/browse/HBASE-20471
>             Project: HBase
>          Issue Type: Task
>            Reporter: Yu Li
>            Priority: Major
>
> This is something derived from the discussion in HBASE-19024 around [this comment|https://issues.apache.org/jira/browse/HBASE-19024?focusedCommentId=16445592=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16445592]
>
> We have been offering users an API to set durability per mutation for a long
> time. By design, SYNC_WAL durability calls {{FSDataOutputStream#hflush}}
> and FSYNC_WAL calls {{FSDataOutputStream#hsync}}, while in the implementation
> we had been calling hflush for FSYNC_WAL as well until HBASE-19024. Although
> HBASE-19024 tried, with good intentions, to fix this, the implementation
> there cannot ensure that FSYNC_WAL edits are truly hsync'ed, due to the
> disruptor mechanism used in the WAL implementation. Here in this JIRA we aim to
> have more discussion and give it a complete solution.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
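
The grouping problem raised in the comment above can be illustrated with a small self-contained model. This is only a sketch, not HBase's WAL code: the class {{GroupSyncSketch}}, its nested enums, and the methods {{requested}}, {{mismatched}} and {{normalize}} are hypothetical names introduced for illustration. It assumes that a grouped batch receives exactly one sync call, so every edit in the batch shares the same sync mode whatever it asked for, and it shows the proposed mapping of FSYNC_WAL to SYNC_WAL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model for illustration only; this is not HBase's WAL implementation.
final class GroupSyncSketch {
    // Only the two sync-style durability levels under discussion are modelled.
    enum Durability { SYNC_WAL, FSYNC_WAL }

    // The two ways the WAL stream can be synced.
    enum SyncMode { HFLUSH, HSYNC }

    // By design: SYNC_WAL asks for hflush, FSYNC_WAL asks for hsync.
    static SyncMode requested(Durability d) {
        return d == Durability.FSYNC_WAL ? SyncMode.HSYNC : SyncMode.HFLUSH;
    }

    // Assumption: a grouped batch gets exactly one sync call, so every edit
    // shares `applied`. Returns the edits whose request differs from it.
    static List<Durability> mismatched(List<Durability> batch, SyncMode applied) {
        List<Durability> result = new ArrayList<>();
        for (Durability d : batch) {
            if (requested(d) != applied) {
                result.add(d);
            }
        }
        return result;
    }

    // Proposed backward-compatible mapping: FSYNC_WAL becomes SYNC_WAL.
    static Durability normalize(Durability d) {
        return d == Durability.FSYNC_WAL ? Durability.SYNC_WAL : d;
    }

    public static void main(String[] args) {
        List<Durability> batch = Arrays.asList(Durability.SYNC_WAL, Durability.FSYNC_WAL);
        // Whichever single mode is applied, one edit in a mixed batch gets
        // something other than what it requested.
        System.out.println(mismatched(batch, SyncMode.HSYNC));
        System.out.println(mismatched(batch, SyncMode.HFLUSH));
        System.out.println(normalize(Durability.FSYNC_WAL));
    }
}
```

With a mixed batch there is no single sync mode that honours every request, which is the argument for moving the hflush/hsync choice to a cluster-level setting.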
[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL
[ https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16447499#comment-16447499 ]

Andrew Purtell edited comment on HBASE-20471 at 4/23/18 2:59 AM:

After HBASE-19024 the default durability setting becomes FSYNC_WAL if _hbase.wal.hsync_ is true, so all edits in the batch will have the same setting and will be fsynced (unless the user changes the Durability setting, and then it's unclear, but we don't expect this for the normal case). The way the new setting _hbase.wal.hsync_ works was good enough for HBASE-19024 but, agreed, it is hardly ideal. If the user assumes SYNC_WAL or FSYNC_WAL is applied to every mutation individually, this is not correct and should be documented.

We do expect SKIP_WAL to take effect on a per-RPC basis: a Put should skip the WAL if it is set to SKIP_WAL, and every mutation in a batch or RowMutations should skip the WAL if the RPC carries a Durability attribute with SKIP_WAL selected. If that assumption is now somehow not correct, we need to fix that.
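
The behaviour described in this comment can be sketched as a small model. It is an illustration under stated assumptions, not the HBase implementation: the class {{DefaultDurabilitySketch}} and its methods {{effective}} and {{writesToWal}} are hypothetical, the configuration is represented by a plain map, and it is assumed that _hbase.wal.hsync_ defaults to false and that the default durability is SYNC_WAL in that case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model for illustration only; this is not HBase's implementation.
final class DefaultDurabilitySketch {
    // Mirrors the names of the durability levels discussed in the thread.
    enum Durability { USE_DEFAULT, SKIP_WAL, ASYNC_WAL, SYNC_WAL, FSYNC_WAL }

    // Resolves the durability that applies to an RPC.
    // Assumption: an explicit setting wins; otherwise the default is FSYNC_WAL
    // when hbase.wal.hsync is true and SYNC_WAL when it is false or unset.
    static Durability effective(Durability requested, Map<String, String> conf) {
        if (requested != Durability.USE_DEFAULT) {
            return requested;
        }
        boolean hsync = Boolean.parseBoolean(conf.getOrDefault("hbase.wal.hsync", "false"));
        return hsync ? Durability.FSYNC_WAL : Durability.SYNC_WAL;
    }

    // SKIP_WAL is expected to take effect for every mutation carried by the RPC.
    static boolean writesToWal(Durability effective) {
        return effective != Durability.SKIP_WAL;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("hbase.wal.hsync", "true");
        System.out.println(effective(Durability.USE_DEFAULT, conf));
        System.out.println(writesToWal(effective(Durability.SKIP_WAL, conf)));
    }
}
```

When every edit in a batch relies on the default, they all resolve to the same level, which is why the cluster-level setting behaves predictably in the normal case.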
[jira] [Comment Edited] (HBASE-20471) Recheck the design and implementation of FSYNC_WAL durability for WAL
[ https://issues.apache.org/jira/browse/HBASE-20471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16446714#comment-16446714 ]

Anoop Sam John edited comment on HBASE-20471 at 4/21/18 9:31 AM:

Yes, I mean along with your point #3.
As a first step, we should at least write all these known limitations into the book.

was (Author: anoop.hbase):

As a first step, we should at least write all these known limitations into the book.