[DISCUSS] clarify the definition of behavior changes

2024-04-30 Thread Wenchen Fan
Hi all, It's exciting to see innovations keep happening in the Spark community and Spark keeps evolving itself. To make these innovations available to more users, it's important to help users upgrade to newer Spark versions easily. We've done a good job on it: the PR template requires the author

Re: Potential Impact of Hive Upgrades on Spark Tables

2024-04-30 Thread Wenchen Fan
Yes, Spark has a shim layer to support all Hive versions. It shouldn't be an issue as many users create native Spark data source tables already today, by explicitly putting the `USING` clause in the CREATE TABLE statement. On Wed, May 1, 2024 at 12:56 AM Mich Talebzadeh wrote: > @Wenchen Fan

Potential Impact of Hive Upgrades on Spark Tables

2024-04-30 Thread Mich Talebzadeh
@Wenchen Fan Got your explanation, thanks! My understanding is that even if we create Spark tables using Spark's native data sources, by default, the metadata about these tables will be stored in the Hive metastore. As a consequence, a Hive upgrade can potentially affect Spark tables. For

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Kent Yao
+1 Kent Yao On 2024/04/30 09:07:21 Yuming Wang wrote: > +1 > > On Tue, Apr 30, 2024 at 3:31 PM Ye Xianjin wrote: > > > +1 > > Sent from my iPhone > > > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > > > >  > > +1 > > > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > > > >  > > To add

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Yuming Wang
+1 On Tue, Apr 30, 2024 at 3:31 PM Ye Xianjin wrote: > +1 > Sent from my iPhone > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > >  > +1 > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > >  > To add more color: > > Spark data source table and Hive Serde table are both stored in the

[VOTE][RESULT] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Dongjoon Hyun
The vote passes with 11 +1s (6 binding +1s) and one -1. Thanks to all who helped with the vote! (* = binding) +1: - Dongjoon Hyun * - Gengliang Wang * - Liang-Chi Hsieh * - Holden Karau * - Zhou Jiang - Cheng Pan - Hyukjin Kwon * - DB Tsai * - Ye Xianjin - XiDuo You - Nimrod Ofek +0: None -1: -

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Nimrod Ofek
+1 (non-binding) p.s How do I become binding? Thanks, Nimrod On Tue, Apr 30, 2024 at 10:53 AM Ye Xianjin wrote: > +1 > Sent from my iPhone > > On Apr 30, 2024, at 3:23 PM, DB Tsai wrote: > >  > +1 > > On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote: > >  > To add more color: > > Spark data

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread XiDuo You
+1 Dongjoon Hyun 于2024年4月27日周六 03:50写道: > > Please vote on SPARK-46122 to set spark.sql.legacy.createHiveTableByDefault > to `false` by default. The technical scope is defined in the following PR. > > - DISCUSSION: https://lists.apache.org/thread/ylk96fg4lvn6klxhj6t6yh42lyqb8wmd > - JIRA:

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread Ye Xianjin
+1 Sent from my iPhoneOn Apr 30, 2024, at 3:23 PM, DB Tsai wrote:+1 On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote:To add more color:Spark data source table and Hive Serde table are both stored in the Hive metastore and keep the data files in the table directory. The only difference is they

Re: [VOTE] SPARK-46122: Set spark.sql.legacy.createHiveTableByDefault to false

2024-04-30 Thread DB Tsai
+1 On Apr 29, 2024, at 8:01 PM, Wenchen Fan wrote:To add more color:Spark data source table and Hive Serde table are both stored in the Hive metastore and keep the data files in the table directory. The only difference is they have different "table provider", which means Spark will use different