[jira] [Commented] (OMID-56) Integrate with Apache Phoenix

James Taylor (JIRA) Wed, 04 Jan 2017 18:15:24 -0800

    [ https://issues.apache.org/jira/browse/OMID-56?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15800023#comment-15800023 ]


James Taylor commented on OMID-56:
----------------------------------

We use this VisibilityFence API to coordinate creating an index while the data 
table is taking writes, so that we don't miss any updates. There's an 
inherent race condition between a {{CREATE INDEX}} call and an {{UPSERT}} or 
{{DELETE}} call to the data table, in particular around inflight transactions. 
To solve this, we call commitDDLFence when an index is created:
{code}
    /**
     * Commit a write fence when creating an index so that we can detect
     * when a data table transaction is started before the create index
     * but completes after it. In this case, we need to rerun the data
     * table transaction after the index creation so that the index rows
     * are generated. See {@link #addDMLFence(PTable)} and TEPHRA-157
     * for more information.
     * @param dataTable the data table upon which an index is being added
     * @throws SQLException
     */
    public void commitDDLFence(PTable dataTable) throws SQLException {
{code}
and call addDMLFence when a DML operation (i.e. an UPSERT or DELETE) is 
executed:
{code}
    /**
     * Add an entry to the change set representing the DML operation that is starting.
     * These entries will not conflict with each other, but they will conflict with a
     * DDL operation of creating an index. See {@link #addDMLFence(PTable)} and TEPHRA-157
     * for more information.
     * @param table the table which is doing DML
     * @throws SQLException
     */
    private void addDMLFence(PTable table) throws SQLException {
{code}
Then this code in MutationState.commit() will retry if we get a transaction 
conflict and an index was added:
{code}
                } finally {
                    try {
                        resetState();
                    } finally {
                        if (retryCommit) {
                            startTransaction();
                            // Add back read fences
                            Set<TableRef> txTableRefs = txMutations.keySet();
                            for (TableRef tableRef : txTableRefs) {
                                PTable dataTable = tableRef.getTable();
                                addDMLFence(dataTable);
                            }
                            try {
                                // Only retry if an index was added
                                retryCommit = shouldResubmitTransaction(txTableRefs);
{code}
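The shape of that retry can be sketched in isolation. This is only an illustration, not the Phoenix code: {{RetrySketch}}, {{Tx}}, {{ConflictException}}, {{commitWithRetry}} and the {{maxAttempts}} bound are all hypothetical names introduced here, and the real MutationState.commit() does considerably more (resetting state, resending mutations).

```java
import java.util.function.BooleanSupplier;

/** Illustrative only: every name in this class is hypothetical, not Phoenix or Tephra API. */
final class RetrySketch {
    /** Stand-in for a transaction conflict reported at commit time. */
    static final class ConflictException extends RuntimeException {}

    /** Minimal view of a transaction for the purposes of this sketch. */
    interface Tx {
        void start();      // begin a fresh transaction
        void addFences();  // re-register one DML fence per data table
        void commit();     // may throw ConflictException
    }

    /**
     * Commits, and on a conflict starts a new transaction, adds the fences back,
     * and retries only if an index was added in the meantime.
     * Returns the number of attempts that were needed.
     */
    static int commitWithRetry(Tx tx, BooleanSupplier indexWasAdded, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                tx.commit();
                return attempt;
            } catch (ConflictException e) {
                tx.start();
                tx.addFences();
                // An ordinary conflict (no new index) is surfaced to the caller.
                if (attempt >= maxAttempts || !indexWasAdded.getAsBoolean()) {
                    throw e;
                }
            }
        }
    }
}
```

The attempt bound is an addition of the sketch to keep the loop finite; the excerpt above does not show one.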
You can review the comments in TEPHRA-157 to get more detail, but this 
VisibilityFence is purely an API on top of the transaction API. The await call 
starts a transaction, and the client code adds the <table name> + <txID of each 
open txn> combinations to the set of row keys for conflict detection. Then 
VisibilityFence.create() adds <table name> + <txID of current tx> to the set of 
row keys. We'll get a conflict if the current tx was in the list of inflight 
transactions.
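To make the conflict concrete, here is a toy model of the mechanism as described above. It is a sketch under assumptions, not Tephra's implementation: {{FenceSketch}}, its methods, the key format, and the single counter used for both tx IDs and commit timestamps are all invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy transaction manager. Not Tephra's API; all names here are hypothetical. */
final class FenceSketch {
    private long clock = 0;                                        // issues tx IDs and commit timestamps
    private final Set<Long> inFlight = new HashSet<>();            // IDs of open transactions
    private final Map<String, Long> lastCommit = new HashMap<>();  // key -> timestamp of last committed write

    long start() {
        long txId = ++clock;
        inFlight.add(txId);
        return txId;
    }

    /** Commits unless a key in the change set was committed after txId started. */
    boolean commit(long txId, Set<String> changeSet) {
        inFlight.remove(txId);
        for (String key : changeSet) {
            Long ts = lastCommit.get(key);
            if (ts != null && ts > txId) {
                return false; // write-write conflict
            }
        }
        long commitTs = ++clock;
        for (String key : changeSet) {
            lastCommit.put(key, commitTs);
        }
        return true;
    }

    /** DML side: the fence key is the table name plus the transaction's own ID. */
    static String fenceKey(String table, long txId) {
        return "fence:" + table + ":" + txId;
    }

    /** DDL side: write table name + tx ID for every transaction that is still open. */
    boolean commitDdlFence(String table) {
        Set<Long> open = new HashSet<>(inFlight); // snapshot taken before the fence tx starts
        long fenceTx = start();
        Set<String> keys = new HashSet<>();
        for (long id : open) {
            keys.add(fenceKey(table, id));
        }
        return commit(fenceTx, keys);
    }

    public static void main(String[] args) {
        FenceSketch tm = new FenceSketch();
        long dml = tm.start();                  // UPSERT begins before CREATE INDEX
        Set<String> dmlChanges = new HashSet<>();
        dmlChanges.add(fenceKey("T", dml));     // the DML fence
        dmlChanges.add("row:T:row1");           // an ordinary row write
        boolean ddl = tm.commitDdlFence("T");   // CREATE INDEX commits its fence
        boolean committed = tm.commit(dml, dmlChanges);
        System.out.println("ddl=" + ddl + " dmlCommitted=" + committed);
        // prints "ddl=true dmlCommitted=false"
    }
}
```

A DML transaction that starts after the fence has committed uses a tx ID the fence never wrote, so it commits without a conflict.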

bq. Another thing we do not fully understand is the use of both context and 
awares in MutationState. There is some comment noting that the context is not 
thread safe. Could you please explain this requirement as well?

This is simpler to explain. The AbstractTransactionAwareTable maintains a Map 
(changeSets) of all the row keys which have been modified. The Map is not 
thread safe, though. So when Phoenix has multiple threads that are writing to 
HBase (for example, in UpsertCompiler), MutationState won't use the 
{{TransactionContext txContext}} member variable but instead will use 
{{List<TransactionAware> txAwares}}. We'll then take the txAwares and add them 
from the parent thread where we know there's only a single thread operating on 
the MutationState. This is all essentially to work around Tephra's 
TransactionContext not being thread safe.
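The general pattern can be sketched as follows. This is not the Phoenix code: {{TxAwareSketch}}, {{Aware}} and {{writeInParallel}} are hypothetical stand-ins, with a plain {{HashSet}} playing the role of the non-thread-safe change set. Each writer thread fills its own private set, and only the parent thread merges them.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Illustrative only: all names here are hypothetical, not Phoenix or Tephra API. */
final class TxAwareSketch {
    /** Stand-in for a per-writer TransactionAware: a plain, non-thread-safe change set. */
    static final class Aware {
        final Set<String> changeSet = new HashSet<>();
    }

    /** Each writer records row keys in its own Aware; the calling thread merges them. */
    static Set<String> writeInParallel(int writers, int rowsPerWriter) {
        ExecutorService pool = Executors.newFixedThreadPool(writers);
        try {
            List<Future<Aware>> futures = new ArrayList<>();
            for (int w = 0; w < writers; w++) {
                final int id = w;
                futures.add(pool.submit(() -> {
                    Aware aware = new Aware();  // private to this writer thread
                    for (int r = 0; r < rowsPerWriter; r++) {
                        aware.changeSet.add("w" + id + "-row" + r);
                    }
                    return aware;
                }));
            }
            // Single-threaded from here on: only the parent touches the shared set.
            Set<String> contextChanges = new HashSet<>();
            for (Future<Aware> f : futures) {
                contextChanges.addAll(f.get().changeSet);
            }
            return contextChanges;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because {{Future.get()}} establishes a happens-before edge, the parent sees each writer's complete set without any locking on the sets themselves.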

> Integrate with Apache Phoenix
> -----------------------------
>
>                 Key: OMID-56
>                 URL: https://issues.apache.org/jira/browse/OMID-56
>             Project: Apache Omid
>          Issue Type: Improvement
>            Reporter: James Taylor
>              Labels: phoenix
>
> The current transaction implementation in Phoenix uses Tephra which is good 
> when the number of rows in the transaction is small and the chances of a 
> conflict are relatively rare. It's also not clear when the number of 
> simultaneous transactions would max out given the single, global transaction 
> manager component.
> Omid is very complementary in this regard. Though the overhead for small 
> transactions may be larger than Tephra, it will likely scale well as the 
> number of rows in a transaction grows and has no global transaction manager.
> It'd be great to figure out the best way to integrate Omid with Phoenix. The 
> trickiest issue may be with optimizing secondary indexes, in that conflict 
> detection is not necessary for them. We could leave this optimization for the 
> future and just treat them as any other HBase table. Perhaps a good first 
> step would be to just turn on Omid transactions at the HBase level and then 
> have Phoenix issue the appropriate Omid call for start transaction, commit 
> transaction, and rollback transaction. It might just work.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
