virajjasani commented on PR #2199:
URL: https://github.com/apache/phoenix/pull/2199#issuecomment-2997111321

   > I have a comment on the user interface. Currently, one could use 
`executeAtomicUpdateReturnRow()` if they are interested in the new state (or 
old when unchanged) and the newly introduced 
`executeAtomicUpdateReturnOldRow()` if they are interested in the old state. 
However, it is not possible to receive both so why not let the user choose 
which states they want and make multiple result sets available via JDBC API 
[getMoreResults()](https://docs.oracle.com/javase/8/docs/api/java/sql/Statement.html#getMoreResults--)?
   
   Interesting point indeed. I think there are two challenges with it:
   - Implementation: Phoenix constructs the ResultSet from the Result retrieved 
from HBase. The server and client communicate only through Cells. Each Cell is 
expected to have the same rowkey for the given row and the CF:CQ determined by 
the encodings used by the Phoenix table columns. If the server were to send 
both old and new Cells for the same row in one Result, the client could not 
tell them apart by the rowkey/CF/CQ combination alone, which is what the 
Phoenix client usually relies on to generate the ResultSet. We would need 
additional implementation overhead to distinguish them by timestamp.
   - Data transfer: For small rows and tables with less complex data types, 
this approach can still be considered ideal. However, complex data types would 
make data transfer twice as expensive over the network. For instance, one 
usually starts with a small document and keeps using 
BSON_UPDATE_EXPRESSION() to incrementally add or update top-level or nested 
document fields. The expression language is simpler because the client does 
not need to send the whole document to the server; it only specifies what to 
change in the document. If the server sends both the new and the old row 
image with every atomic update, large-document data transfer would be twice 
as expensive and could impact atomic update latencies.
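
   To make the implementation point concrete, here is a minimal sketch (not 
Phoenix code) of what the client would have to do if one Result carried both 
images: group Cells by rowkey/CF/CQ and use the timestamp to tell old from new. 
`OldNewCellSplitter`, `SimpleCell`, and `splitByTimestamp` are hypothetical 
names used only for illustration; a real implementation would work on HBase 
Cells and on Phoenix's encoded column qualifiers.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class OldNewCellSplitter {

    /** Hypothetical stand-in for an HBase Cell: only the fields this argument relies on. */
    public static final class SimpleCell {
        public final String row;
        public final String family;
        public final String qualifier;
        public final long timestamp;
        public final String value;

        public SimpleCell(String row, String family, String qualifier,
                          long timestamp, String value) {
            this.row = row;
            this.family = family;
            this.qualifier = qualifier;
            this.timestamp = timestamp;
            this.value = value;
        }

        /** rowkey/CF:CQ, which is identical for the old and new version of a column. */
        public String columnKey() {
            return row + "/" + family + ":" + qualifier;
        }
    }

    /**
     * Groups cells by rowkey/CF:CQ. For each column, index 0 holds the cell with
     * the lowest timestamp (old image) and index 1 the cell with the highest
     * timestamp (new image). A column with a single version maps to the same
     * cell in both slots, so "unchanged" and "only one image sent" look alike.
     */
    public static Map<String, SimpleCell[]> splitByTimestamp(List<SimpleCell> cells) {
        Map<String, SimpleCell[]> images = new LinkedHashMap<>();
        for (SimpleCell cell : cells) {
            SimpleCell[] pair = images.get(cell.columnKey());
            if (pair == null) {
                images.put(cell.columnKey(), new SimpleCell[] {cell, cell});
            } else {
                if (cell.timestamp < pair[0].timestamp) {
                    pair[0] = cell;
                }
                if (cell.timestamp > pair[1].timestamp) {
                    pair[1] = cell;
                }
            }
        }
        return images;
    }

    public static void main(String[] args) {
        List<SimpleCell> cells = Arrays.asList(
                new SimpleCell("r1", "0", "COL1", 200L, "new-value"),
                new SimpleCell("r1", "0", "COL1", 100L, "old-value"),
                new SimpleCell("r1", "0", "COL2", 100L, "unchanged"));
        for (Map.Entry<String, SimpleCell[]> e : splitByTimestamp(cells).entrySet()) {
            System.out.println(e.getKey() + " old=" + e.getValue()[0].value
                    + " new=" + e.getValue()[1].value);
        }
    }
}
```

   Even in this simplified form, the client needs an extra grouping pass and a 
convention for columns that have only one version, which is the overhead 
mentioned above.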
   
   I have not seen a use case where the client needs both the old and the new 
row image from an atomic update; it has been either one or the other. The best 
use case I can think of for both is CDC, which reads the new and old row 
images with a raw scan.
   
   @haridsv this is a nice point to consider. Is there any database you are 
aware of that provides both the updated and the old row image with atomic 
updates?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

