Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-22 Thread Ajay Chander
Thanks for the confirmation Mich!



Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-22 Thread Mich Talebzadeh
Hi Ajay,

I am afraid that, for now, transaction heartbeats do not work through Spark,
so I have no other solution.

This is an interesting point, as with Hive running on the Spark engine there
is no issue: Hive handles the transactions.

I gather that, in the simplest form, Hive has to deal with its metadata for
transaction logic, but Spark somehow cannot do that.

In short, that is it. You need to do that through Hive.
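
As a rough illustration of "doing it through Hive" from outside the Hive shell, the sketch below builds a Beeline command line that sends the DELETE to HiveServer2, so that Hive rather than Spark manages the transaction. It assumes HiveServer2 is running and `beeline` is on the PATH; the JDBC URL, table name, and predicate are hypothetical, and the helper names are made up for this example.

```python
import subprocess


def beeline_command(jdbc_url, sql):
    """Return the argv for running one statement through HiveServer2.

    `-u` supplies the JDBC URL and `-e` the statement to execute.
    """
    return ["beeline", "-u", jdbc_url, "-e", sql]


def delete_through_hive(jdbc_url, table, predicate):
    """Run the DELETE in Hive instead of Spark; raises if beeline fails.

    The table and predicate are illustrative assumptions, not from the thread.
    """
    sql = f"DELETE FROM {table} WHERE {predicate}"
    return subprocess.run(beeline_command(jdbc_url, sql), check=True)
```

A Spark application could call something like `delete_through_hive("jdbc:hive2://localhost:10000/default", "events", "id = 42")` from the driver before or after its own read-only work on the table.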

Cheers,



Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com





Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-22 Thread Ajay Chander
Hi Mich,

Right now I have a similar use case where I have to delete some rows from a
Hive table. My Hive table is stored as ORC, bucketed, and has the
transactional property set. I can delete from the Hive shell but not from my
spark-shell or Spark app. Were you able to find any workaround? Thank you.
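
For reference, a table of the kind described here (ORC, bucketed, transactional) and the row-level delete that works from the Hive shell can be sketched as HiveQL strings. The table name `events`, its columns, and the bucket count are hypothetical; the two helpers are illustrative only and merely build the statements.

```python
def transactional_table_ddl(table, columns, bucket_col, num_buckets):
    """Build HiveQL for an ORC, bucketed, transactional table.

    All names passed in are illustrative assumptions, not from the thread.
    """
    cols = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE TABLE {table} ({cols}) "
        f"CLUSTERED BY ({bucket_col}) INTO {num_buckets} BUCKETS "
        "STORED AS ORC "
        "TBLPROPERTIES ('transactional'='true')"
    )


def delete_statement(table, predicate):
    """Build the row-level DELETE that Hive accepts on such a table."""
    return f"DELETE FROM {table} WHERE {predicate}"


if __name__ == "__main__":
    print(transactional_table_ddl(
        "events", [("id", "INT"), ("payload", "STRING")], "id", 4))
    print(delete_statement("events", "id = 42"))
```

On the Hive side, statements like these typically also need the session configured for ACID, for example `hive.support.concurrency=true` and `hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager`.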

Regards,
Ajay



Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-02 Thread Mich Talebzadeh
Thanks for that.

I will have a look.

Dr Mich Talebzadeh








Re: Spark support for update/delete operations on Hive ORC transactional tables

2016-06-02 Thread Elliot West
Related to this, there is an API in Hive that simplifies the integration
of other frameworks with Hive's ACID feature:

See:
https://cwiki.apache.org/confluence/display/Hive/HCatalog+Streaming+Mutation+API

It contains code for maintaining heartbeats, handling locks and
transactions, and submitting mutations in a distributed environment.

We have used it to write to transactional tables from Cascading-based
processes.
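
To make the heartbeat idea concrete, here is a generic sketch of the pattern, not the Hive Mutation API itself: a background thread calls a callback at a fixed interval for as long as a unit of work is in progress. The class name `Heartbeater` and the `send_heartbeat` callback are hypothetical; in a real integration the callback would stand in for whatever call tells the metastore that the transaction is still alive.

```python
import threading


class Heartbeater:
    """Call `send_heartbeat` every `interval` seconds until stopped."""

    def __init__(self, send_heartbeat, interval):
        self._send = send_heartbeat      # hypothetical "still alive" callback
        self._interval = interval
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def _run(self):
        # Event.wait returns False on timeout and True once stop is requested.
        while not self._stop.wait(self._interval):
            self._send()

    def __enter__(self):
        self._thread.start()
        return self

    def __exit__(self, exc_type, exc, tb):
        # Stop heartbeating when the work finishes or fails.
        self._stop.set()
        self._thread.join()
        return False
```

The long-running mutation would go inside a `with Heartbeater(callback, interval=30):` block, so heartbeats stop as soon as the block exits, whether it commits or raises.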

Elliot.


On 2 June 2016 at 09:54, Mich Talebzadeh wrote:

>
> Hi,
>
> Spark does not support transactions because, as I understand it, there is a
> piece on the execution side that needs to send heartbeats to the Hive
> metastore saying "a transaction is still alive". That has not been
> implemented in Spark yet, to my knowledge.
>
> Any idea on the timeline for when we are going to have support for
> transactions in Spark for Hive ORC tables? This would be really useful.
>
>
> Thanks,
>
>
> Dr Mich Talebzadeh
>
>
>
>
>
>