[GitHub] [spark] turboFei commented on issue #24610: [SPARK-27716][SQL] Complete the transactions support for part of jdbc datasource operations.

GitBox Wed, 22 May 2019 09:17:10 -0700

turboFei commented on issue #24610: [SPARK-27716][SQL] Complete the 
transactions support for part of jdbc datasource operations.
URL: https://github.com/apache/spark/pull/24610#issuecomment-494872278
 
 
   > > I don't know this part well, but I'm pretty sure that the transaction is supposed to be per partition. These writes must happen separately.
   > 
   > Yes, you are right, the transaction is supposed to be per partition.
   > For cases where the destination table can be dropped first or does not yet exist, we can save the RDD data to a tempTable and record the number of successful partitions with the help of an accumulator.
   > Finally, we compare the accumulator's value with the number of partitions of the DataFrame to determine whether all partitions succeeded.
   > If all partitions succeeded, we rename the tempTable to the destination table.
   > Therefore, we can perform the save operation for all partitions as a single transaction.
   > 
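
A minimal sketch of the flow described in the quote above, to make the idea concrete. It is not the PR's code: it uses Python's built-in `sqlite3` in place of a JDBC connection, a plain counter in place of the Spark accumulator, and a list of row lists in place of the DataFrame's partitions. The function name `save_via_temp_table`, the table names, and the two-column schema are all hypothetical.

```python
import sqlite3


def save_via_temp_table(conn, partitions, dest="dest_table", temp="dest_table_tmp"):
    """Write each partition to a temp table in its own transaction, then
    rename the temp table to the destination only if every partition succeeded.

    `partitions` is a list of lists of (id, value) rows (stand-in for RDD partitions).
    Returns True if the destination table was replaced, False otherwise.
    """
    # Start from a clean temp table (hypothetical schema for illustration).
    conn.execute(f"DROP TABLE IF EXISTS {temp}")
    conn.execute(f"CREATE TABLE {temp} (id INTEGER PRIMARY KEY, value TEXT)")
    conn.commit()

    succeeded = 0  # stands in for the accumulator counting successful partitions
    for rows in partitions:
        try:
            # One transaction per partition: commit on success, roll back on error.
            with conn:
                conn.executemany(
                    f"INSERT INTO {temp} (id, value) VALUES (?, ?)", rows
                )
            succeeded += 1
        except sqlite3.Error:
            # The failed partition was rolled back and is not counted.
            pass

    # Compare the success count with the number of partitions.
    if succeeded == len(partitions):
        conn.execute(f"DROP TABLE IF EXISTS {dest}")
        conn.execute(f"ALTER TABLE {temp} RENAME TO {dest}")
        conn.commit()
        return True

    # At least one partition failed: discard the temp table, leave dest untouched.
    conn.execute(f"DROP TABLE IF EXISTS {temp}")
    conn.commit()
    return False
```

With this shape, a failure in any partition leaves the existing destination table as it was, because the only step that touches it is the final drop-and-rename. Whether the rename itself is atomic depends on the target database, so that part is an assumption to check per JDBC dialect.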
   
   @gatorsmile  Could you help me review this?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

