Re: LOCAL vs TRANSACTIONAL indexes
Thanks James,

In response to "If your table is mutable and not transactional...": I'd like to assume mutable tables are configured by default to "Disable mutable indexes on write failure until consistency restored", but the documentation points to the parameters below. I assume these properties are configured per deployment (not per table). I will have to follow up with my team :) to see what we have configured.

phoenix.index.failure.block.write
phoenix.index.failure.handling.rebuild
phoenix.index.failure.handling.rebuild.interval
phoenix.index.failure.handling.rebuild.overlap.time

--Matt

On Thu, Sep 22, 2016 at 4:55 PM, James Taylor wrote:
> - If your table is mutable and not transactional, then your tables are
> left in a potentially inconsistent state.
Re: LOCAL vs TRANSACTIONAL indexes
In all cases, the client throws if a write fails (i.e. data table or index table failure). The state your table and index are left in depends on 1) how you've configured your table, and 2) when the failure happened. This is described here[1] in detail.

Here's a summary:

- If your table is transactional, then you're always left in a consistent state. Upon usage of the data or index table in queries, neither will have the update applied.
- If your table is immutable and not transactional, then your tables are left in a potentially inconsistent state and it's up to you to retry. It's "potentially" inconsistent because it depends on when the failure happened. If the write to the data table failed, then your index will still be consistent. If the write to the data table succeeded and the write to the index table failed, then you're in an inconsistent state.
- If your table is mutable and not transactional, then your tables are left in a potentially inconsistent state. Your data table may be one commit ahead of your index table and there are some options to:
  a) automatically "roll-forward" the index to get it back in sync in the background,
  b) disable writes to the data table and hide the prior failed commit until the index table is available again and caught up to the data table,
  c) mark the index inactive so it's not used by queries again until it's automatically caught up,
  d) disable the index until it's manually rebuilt.
- If you're using local indexes and you have a release that includes HBASE-15600 (not available in the open-source world until HBase 1.3 is out, but likely available in HDP 2.5), then your data table and index will remain in a consistent state.

It's complicated because it's a distributed system. We've made these various options for users who don't want the overhead of transactions, but if you're ok with the overhead, that's the simplest solution from the user's POV.

Thanks,
James

[1] https://phoenix.apache.org/secondary_indexing.html#Consistency_Guarantees

On Thu, Sep 22, 2016 at 10:53 AM, Matthew Van Wely wrote:
> James, what are the "write failure scenarios" in this case?
>
> Does client hang, throw error (leaving inconsistent results) or does the
> table write get rolled back?
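The summary of failure outcomes in this message can be restated as a small lookup. This is only an illustrative sketch in Python, not Phoenix code: the function name, the Outcome enum, and the parameter names are all hypothetical, and the branches simply encode the four bullets as written.

```python
from enum import Enum


class Outcome(Enum):
    CONSISTENT = "consistent"
    INCONSISTENT = "inconsistent, client must retry"
    DATA_MAY_BE_ONE_COMMIT_AHEAD = "data table may be one commit ahead of index"


def state_after_failed_write(transactional, immutable, data_write_succeeded,
                             local_index_with_hbase_15600=False):
    """Hypothetical helper: state of data table and index after a failed commit."""
    if transactional:
        # Neither the data table nor the index has the update applied.
        return Outcome.CONSISTENT
    if local_index_with_hbase_15600:
        # Local index on a release that includes HBASE-15600.
        return Outcome.CONSISTENT
    if immutable:
        # Consistent if the data write itself failed; inconsistent if the
        # data write succeeded but the index write failed.
        if data_write_succeeded:
            return Outcome.INCONSISTENT
        return Outcome.CONSISTENT
    # Mutable and non-transactional: Phoenix's configured failure handling
    # (roll-forward, block writes, deactivate or disable the index) takes over.
    return Outcome.DATA_MAY_BE_ONE_COMMIT_AHEAD


print(state_after_failed_write(transactional=False, immutable=True,
                               data_write_succeeded=True).value)
```

The mutable branch deliberately ignores where the failure happened, because the summary only says the data table "may be" one commit ahead.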
Re: LOCAL vs TRANSACTIONAL indexes
James, what are the "write failure scenarios" in this case? I can only assume one, index update fails and client while trying to rewrite to the index.

How does the statement below fare when an index cannot be updated? Does the client hang, throw an error (leaving inconsistent results), or does the table write get rolled back?

> "From the same client, there is no race condition. The upsert statement
> is synchronous, so when control returns back to you, all of your data has
> been written (both to the data and index table(s))."

On Tue, Sep 20, 2016 at 10:47 PM, James Taylor wrote:
> Glad to help, Matt. Just to be clear, there are no race conditions from
> the same client. The "unlikely" scenarios come into play when there are
> multiple clients. Other things to think about are to what degree you want
> to guard against various write failure scenarios.
Re: LOCAL vs TRANSACTIONAL indexes
Glad to help, Matt. Just to be clear, there are no race conditions from the same client. The "unlikely" scenarios come into play when there are multiple clients. Other things to think about are to what degree you want to guard against various write failure scenarios.

Thanks,
James

On Tue, Sep 20, 2016 at 10:30 PM, Matthew Van Wely wrote:
> Thanks James, knowing that there are no race conditions (or very
> unlikely) from the same client on a mutable table is really helpful.
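The same-client versus multiple-client distinction can be illustrated with a toy model. The sketch below is plain Python with two dicts standing in for the data table and a global index; the upsert generator and every name in it are hypothetical, and the only assumption taken from this thread is the order of the two writes (data table first, then index table) for a mutable, non-transactional table.

```python
def upsert(data_table, index_table, row_key, indexed_value):
    """Toy model of a non-transactional upsert with a global index.

    Yields after each step so a caller can interleave a read between them.
    """
    data_table[row_key] = indexed_value
    yield "data table written"
    index_table[indexed_value] = row_key
    yield "index table written"


data, index = {}, {}

# Same client: the upsert is synchronous, so both steps finish before the read.
for _ in upsert(data, index, "row1", "blue"):
    pass
same_client_read = index.get("blue")

# Different client: its read may land between the two steps.
write = upsert(data, index, "row2", "green")
next(write)                               # data table written, index not yet
other_client_read = index.get("green")    # index lookup misses the new row
for _ in write:                           # the writer finishes the index update
    pass
later_read = index.get("green")

print(same_client_read, other_client_read, later_read)
```

The window only exists for a reader that is not the writer, which is why the scenario is described as unlikely rather than impossible.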
Re: LOCAL vs TRANSACTIONAL indexes
Thanks James, knowing that there are no race conditions (or very unlikely) from the same client on a mutable table is really helpful.

Thx,
--Matt

On Sat, Sep 17, 2016 at 4:26 PM, James Taylor wrote:
> From the same client, there is no race condition. The upsert statement is
> synchronous, so when control returns back to you, all of your data has been
> written (both to the data and index table(s)).
Re: LOCAL vs TRANSACTIONAL indexes
On Fri, Sep 16, 2016 at 7:22 PM, Matthew Van Wely wrote:

> All,
>
> I would like some guidance on LOCAL vs TRANSACTIONAL indexes and I
> cannot quite get the details I need from the Phoenix site:
> https://phoenix.apache.org/secondary_indexing.html
>
> Transactional Tables
>
> transactional tables with secondary indexes potentially lowers your
> availability of being able to write to your data table, as both the data
> table and its secondary index tables must be available as otherwise the
> write will fail
>
> 1) What is the likelihood that an index is not available?

This is rare and unlikely. If a region server goes down, HBase relocates the regions it was hosting to another region server. If you write data exactly when this happens, it's possible that you'll get an exception back if this relocation takes longer than your retry count and timeout settings allow.

> 2) If rebuilding, is this on the order of minutes, hours?

Not sure what rebuilding you're asking about. For mutable, non-transactional secondary indexes, Phoenix has the ability to partially rebuild them if a write failure occurs. This is faster than a full rebuild because it only rebuilds index rows that were added after the writes began failing. See the options listed under https://phoenix.apache.org/secondary_indexing.html#Mutable_Tables

If on the other hand you're asking how long it takes to completely rebuild the index, then that depends on how much data the table has (so then you're really asking how fast HBase writes).

> 3) Does Phoenix give an indication the write failed due to unavailable
> table/index (bc if so client could handle this with other write options)?

Yes, Phoenix throws an exception if the write fails. It never fails silently. If your data is immutable, then it's up to you to handle the write failure (usually by just continually retrying the failed write). If mutable, then Phoenix has some options that can automate catching the index up with the data table (see https://phoenix.apache.org/secondary_indexing.html#Consistency_Guarantees). If your table is transactional, then it cannot get out of sync with the index.

> Local Indexes
>
> all local index data in the separate shadow column families in the
> same data table. At read time when the local index is used, every region
> must be examined for the data as the exact region location of index data
> cannot be predetermined. Thus some overhead occurs at read-time.
>
> 4) Are there any requirements on table PK and index key regarding key
> ordering?

No

> 5) How is something locally indexed if the keys are completely mismatched?
> I get the sense that it doesn't matter given that "every region must be
> examined".

The rows of a local index are sorted in each region. The client just has to do a merge sort between all the data it gets back for the scans over each region. This is very fast, so not too much overhead here.

> Mutable Tables
>
> indexes on non transactional mutable tables are only ever a single
> batch of edits behind the primary table
>
> 6) If my use case updates a table and then reads from an index, it seems a
> likely race condition that I can read-my-write.

From the same client, there is no race condition. The upsert statement is synchronous, so when control returns back to you, all of your data has been written (both to the data and index table(s)).

If the read happens from a different client than the write, with global, mutable, non-transactional indexes, it's possible that a read could occur after the write to the data table but before the write to the index table(s) (since with global indexes, the regions for the index table are potentially on different region servers than the regions of the data table).

With local indexes the above is even more unlikely because the writes are all occurring to the same region server, but in theory it's still possible. With the fix that was made as part of HBASE-15600, this wouldn't be possible at all, though.

With transactional tables, this scenario isn't possible.

> 7) Would you be willing to bet that most reads are consistent with the
> table and only in rare scenarios is the table/index out of sync?

Yes

> I appreciate your help and feedback on these questions. Thanks,
> --Matthew

Thanks,
James
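The answer to question 5 (each region returns its local index rows already sorted, and the client merge-sorts the per-region results) can be sketched with Python's standard heapq.merge. This is a minimal illustration of the merge step only, not Phoenix's client code: the region contents, row keys, and the merge_region_scans helper are made up for the example.

```python
import heapq

# Hypothetical per-region scan results: (indexed_value, row_key) pairs,
# already sorted within each region.
region_scans = [
    [("apple", "r07"), ("pear", "r02")],
    [("banana", "r11"), ("cherry", "r15")],
    [("apricot", "r21"), ("plum", "r29")],
]


def merge_region_scans(scans):
    """Merge already-sorted per-region results into one sorted list."""
    # heapq.merge never re-sorts a region's rows; it only compares the
    # current head of each region's stream to pick the next row.
    return list(heapq.merge(*scans))


merged = merge_region_scans(region_scans)
for indexed_value, row_key in merged:
    print(indexed_value, row_key)
```

Because every input is already sorted, the merge only compares the current head of each region's stream, which is consistent with the remark that the overhead is small.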
LOCAL vs TRANSACTIONAL indexes
All,

I would like some guidance on LOCAL vs TRANSACTIONAL indexes and I cannot quite get the details I need from the Phoenix site: https://phoenix.apache.org/secondary_indexing.html

Transactional Tables

transactional tables with secondary indexes potentially lowers your availability of being able to write to your data table, as both the data table and its secondary index tables must be available as otherwise the write will fail

1) What is the likelihood that an index is not available?
2) If rebuilding, is this on the order of minutes, hours?
3) Does Phoenix give an indication the write failed due to unavailable table/index (bc if so client could handle this with other write options)?

Local Indexes

all local index data in the separate shadow column families in the same data table. At read time when the local index is used, every region must be examined for the data as the exact region location of index data cannot be predetermined. Thus some overhead occurs at read-time.

4) Are there any requirements on table PK and index key regarding key ordering?
5) How is something locally indexed if the keys are completely mismatched? I get the sense that it doesn't matter given that "every region must be examined".

Mutable Tables

indexes on non transactional mutable tables are only ever a single batch of edits behind the primary table

6) If my use case updates a table and then reads from an index, it seems likely that there is a race condition affecting whether I can read my own write.
7) Would you be willing to bet that most reads are consistent with the table and only in rare scenarios is the table/index out of sync?

I appreciate your help and feedback on these questions.

Thanks,
--Matthew