[ https://issues.apache.org/jira/browse/PHOENIX-5084?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16731382#comment-16731382 ]

James Taylor edited comment on PHOENIX-5084 at 12/31/18 6:54 PM:
-----------------------------------------------------------------

For consistency with the other metadata methods, change the rollback to a 
commit. Also, there's one more method that could use the same commit logic:
{code:java}
    @Override
    public ResultSet getTableTypes() throws SQLException {
{code}
Though not strictly needed, it'd make the behavior consistent across all 
metadata calls. 
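To make the suggestion concrete, here's a minimal self-contained sketch (not the actual Phoenix source; the class is a hypothetical stand-in) of the commit-in-finally pattern being proposed, where the metadata call ends with a commit instead of a rollback:
{code:java}
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the metadata method: the point is only that the
// call ends with commit() (matching the other metadata methods) instead of
// rollback().
class MetadataCommitSketch {
    final List<String> log = new ArrayList<>();

    List<String> getTableTypes() {
        try {
            log.add("scan metadata");           // the metadata work itself
            return List.of("TABLE", "VIEW");    // illustrative result only
        } finally {
            log.add("commit");                  // previously a rollback
        }
    }
}
{code}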
{quote}Won't that leave an aborted transaction around each time (which is 
costly at least in Tephra)?
{quote}
For Tephra, a commit or rollback would be the same. They only become expensive 
if the transaction manager cannot be reached (in which case they'd timeout and 
end up on the invalid list).

Separate from this issue (in a new JIRA), I'd recommend thinking through the 
behavior when
 # Auto commit is off. Do we want sqlline to be inadvertently starting a 
transaction and leaving it open? 
 # DML operations are mixed with metadata calls (again with auto commit off). 
What if we had something like this?
{code:java}
UPSERT VALUES ...                         // uncommitted DML, auto commit off
connection.getMetaData().getTables(...);  // implicitly commits the pending upsert
connection.rollback();                    // too late: the upsert is already committed
{code}
In this case, the upserted values would get committed before the rollback 
occurred, which would be a surprise to the user. There's a similar issue with 
issuing DDL on the same connection when there is uncommitted data. In that 
case, we issue a rollback before we start executing the DDL.
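The surprise can be demonstrated with a small self-contained simulation (this is not Phoenix code; it's a toy model of "a metadata call implicitly commits pending DML" with auto commit off):
{code:java}
import java.util.ArrayList;
import java.util.List;

// Toy model of a transactional connection with auto commit off: DML is
// buffered until commit, and a metadata call implicitly commits first.
class FakeTxnConnection {
    private final List<String> committed = new ArrayList<>();
    private final List<String> pending = new ArrayList<>();

    void upsert(String row) { pending.add(row); }                 // buffered DML
    void commit() { committed.addAll(pending); pending.clear(); }
    void rollback() { pending.clear(); }

    // Models a metadata call that commits (the behavior discussed above).
    List<String> getTables() { commit(); return List.of("T"); }

    List<String> committedRows() { return committed; }
}
{code}
Calling upsert, then getTables, then rollback leaves the upserted row committed - the rollback has nothing left to undo.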

IMO, we should either:
 # Recommend that users not mix metadata operations with DML, and follow 
the same approach as we do with DDL - execute a rollback prior to starting the 
operation and another rollback at the end (so we don't inadvertently 
leave a transaction open when auto commit is off).
 # Or alternatively, fix PhoenixRuntime.getTableNoCache so that it does not start a 
transaction. Since we're explicitly saying we want to bypass the cache, it 
seems like we should not start the transaction but just use LATEST_TIMESTAMP. 
Something to think about, [~tdsilva]. However, I'm not sure why we're not using 
PhoenixRuntime.getTable instead and relying on the cache - we could save a lot 
of RPCs that way. If we could use that *and* not start a transaction, that'd be 
ideal, but I think we need to start a transaction to get back a timestamp from 
the transaction manager that we can use in the cache (for consistency across 
queries/DML).
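A sketch of option 1 (again with hypothetical names, not Phoenix source): fence the metadata call with a rollback on both sides, mirroring what we already do before DDL:
{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Toy model: when auto commit is off, roll back before the metadata call
// (discarding uncommitted DML, as with DDL) and again after it (so no
// transaction is left open).
class RollbackFence {
    final List<String> log = new ArrayList<>();
    boolean autoCommit = false;

    void rollback() { log.add("rollback"); }

    <T> T runMetadataCall(Supplier<T> call) {
        if (!autoCommit) rollback();      // clear any uncommitted DML first
        try {
            return call.get();
        } finally {
            if (!autoCommit) rollback();  // don't leave a transaction open
        }
    }
}
{code}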



> Changes from Transactional Tables are not visible to query in different client
> ------------------------------------------------------------------------------
>
>                 Key: PHOENIX-5084
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-5084
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.15.0, 4.14.1
>            Reporter: Lars Hofhansl
>            Assignee: Lars Hofhansl
>            Priority: Blocker
>         Attachments: PHOENIX-5084-v2.txt, PHOENIX-5084.txt
>
>
> Scenario:
> # Upsert and commit some data into a transactional table. (Autocommit or 
> followed by an explicit commit)
> # Query same table from another client
> The first query on the other client will not see the newly upserted/committed 
> data (regardless of how long one waits).
> A second identical query will see the new data.
> This happens with both Omid and Tephra.
> I guess we can't write a test for this, since it requires multiple JVMs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
