
Re: Why the same INSERT OVERWRITE sql , final table file produced by spark sql is larger than hive sql？

2022-10-12 Thread Chartist

From: Sadha Chilukoori
Date: 10/12/2022 08:27
To: Chartist <13289341...@163.com>
Cc:
Subject: Re: Why the same INSERT OVERWRITE sql , final table file produced by spark sql is larger than hive sql？

I have faced the same problem, where hive and spark orc were using the

Re: Why the same INSERT OVERWRITE sql , final table file produced by spark sql is larger than hive sql？

2022-10-11 Thread Sadha Chilukoori

I have faced the same problem, where Hive and Spark were both writing ORC with Snappy compression.

Hive 2.1
Spark 2.4.8

I'm curious to learn what could be the root cause of this.

-S

On Tue, Oct 11, 2022, 2:18 AM Chartist <13289341...@163.com> wrote:
>
> Hi, All
>
> I encountered a problem as the

Why the same INSERT OVERWRITE sql , final table file produced by spark sql is larger than hive sql？

2022-10-11 Thread Chartist

Hi, All

I encountered the problem described in the e-mail subject. The details are as follows:

SQL:
insert overwrite table mytable partition(pt='20220518')
select guid, user_new_id, sum_credit_score, sum_credit_score_change, platform_credit_score_change, bike_credit_score_change,