Hey Ryan,

I was using the Spark MERGE INTO command, but its performance became slower
as the data in the Iceberg table grew.
Part of the reason may be that I often need to overwrite some of the older
partitions as a side effect of deletes and updates.

Another reason may be that the *MERGE INTO* command requires reading all the
Iceberg data into Spark to perform the join.
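
For context, the statements I'm running look roughly like this (the table,
source, and column names below are placeholders, not my real schema):

    MERGE INTO prod.db.target t          -- placeholder Iceberg table
    USING updates s                      -- placeholder source of changed rows
    ON t.id = s.id
    WHEN MATCHED AND s.op = 'delete' THEN DELETE
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *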



I wanted to know if there is a way to acquire a lock on an Iceberg table.
After acquiring the lock, I would run my delete, update, and insert
operations explicitly and then release the lock.

If there were any issues, I would roll back to the snapshot ID that was the
latest one before applying my SQL operations.
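
To make the intent concrete, here is a rough sketch of the flow I have in
mind. The lock/unlock steps are hypothetical (I'm not aware of any such
command in Spark or Iceberg, which is why I'm asking), the table names and
predicates are placeholders, and the rollback call is Iceberg's
rollback_to_snapshot Spark procedure as I understand it:

    -- hypothetical: acquire an exclusive write lock on the table
    -- LOCK TABLE prod.db.target;

    -- capture the current snapshot id first, e.g. from prod.db.target.snapshots

    DELETE FROM prod.db.target WHERE event_date < '2021-01-01';
    UPDATE prod.db.target SET status = 'closed' WHERE status = 'stale';
    INSERT INTO prod.db.target SELECT * FROM staging_updates;

    -- if anything above fails, roll back to the snapshot id captured earlier
    CALL prod.system.rollback_to_snapshot('db.target', 1234567890);

    -- hypothetical: release the lock
    -- UNLOCK TABLE prod.db.target;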



I require a lock on the Iceberg table because no other process should write
to it while my SQL operations are in progress.

Thanks,

vivek



On 2021/05/02 18:10:19, Ryan Blue <b...@apache.org> wrote:

> Vivek,
>
> Currently, Spark doesn't support any of the BEGIN/COMMIT statements for
> transactions, so I don't think that it is possible right now. What are you
> trying to do? It may be that some of the newer commands, like MERGE INTO,
> would work for you instead.
>
> On Thu, Apr 29, 2021 at 5:49 PM vivek B <vivekbalachan...@gmail.com> wrote:
>
> > Hey All,
> > Is there a way to run multiple sql operations via spark as one single
> > transaction ?
> >
> > Thanks,
> > vivek
> >
>
> --
> Ryan Blue
