SparkSQL read Hive transactional table

2018-10-12 Thread daily
Hi, I use the HCatalog Streaming Mutation API to write data to a Hive transactional table, and then I use SparkSQL to read data from that table. I get the right result. However, SparkSQL takes more time to read the Hive ORC bucketed transactional table, because SparkSQL reads all

Re: Timestamp Difference/operations

2018-10-12 Thread John Zhuge
Yeah, the "-" operator does not seem to be supported; however, you can use the "datediff" function: In [9]: select datediff(CAST('2000-02-01 12:34:34' AS TIMESTAMP), CAST('2000-01-01 00:00:00' AS TIMESTAMP)) Out[9]:
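
Note that "datediff" counts whole calendar days and ignores the time of day, so for the two timestamps in the original question (same day, different times) it would not capture the elapsed time. The following is a minimal plain-Python sketch of that arithmetic, not Spark code: the helper names `seconds_between` and `days_between` are hypothetical, and the `SPARK_SQL` string is an untested assumption that subtracting two `unix_timestamp` values yields a second-level difference on the Spark version discussed in the thread.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"


def seconds_between(end: str, start: str) -> int:
    """Elapsed whole seconds from start to end (hypothetical helper)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


def days_between(end: str, start: str) -> int:
    """Calendar-day difference that ignores the time of day (hypothetical helper)."""
    end_date = datetime.strptime(end, FMT).date()
    start_date = datetime.strptime(start, FMT).date()
    return (end_date - start_date).days


# Assumption: a Spark SQL query along these lines gives the same
# second-level difference; it has not been run against the thread's setup.
SPARK_SQL = (
    "SELECT unix_timestamp(CAST('2000-01-01 12:34:34' AS TIMESTAMP)) "
    "- unix_timestamp(CAST('2000-01-01 00:00:00' AS TIMESTAMP))"
)

if __name__ == "__main__":
    # Timestamps taken from the original question in this thread.
    print(seconds_between("2000-01-01 12:34:34", "2000-01-01 00:00:00"))  # 45274
    print(days_between("2000-01-01 12:34:34", "2000-01-01 00:00:00"))     # 0
```

If sub-day precision matters, a seconds-based difference is probably the closer match to what the Hive expression was being used for; a day count is enough only when the time of day can be discarded.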

Code review and Coding livestreams today

2018-10-12 Thread Holden Karau
I’ll be doing my regular weekly code review at 10am Pacific today - https://youtu.be/IlH-EGiWXK8 - with a look at the current RC, and in the afternoon at 3pm Pacific I’ll be doing some live coding around the WIP graceful decommissioning PR - https://youtu.be/4FKuYk2sbQ8

Timestamp Difference/operations

2018-10-12 Thread Paras Agarwal
Hello Spark Community, Currently in Hive we can do operations on timestamps, like: CAST('2000-01-01 12:34:34' AS TIMESTAMP) - CAST('2000-01-01 00:00:00' AS TIMESTAMP). This does not seem to be supported in Spark. Is there any way to do it? Kindly provide some insight on this. Paras

Spark Structured Streaming resource contention / memory issue

2018-10-12 Thread Patrick McGloin
Hi all, We have a Spark Structured Streaming job which is using mapGroupsWithState. After some time of processing in a stable manner, each micro-batch suddenly starts taking 40 seconds. Suspiciously, it looks like exactly 40 seconds each time. Before this the batches were taking less than a