Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2019-11-01 Thread Jiaxin Shan
+1 for Hadoop 3.2. It seems much of the cloud-integration work Steve did is only available in 3.2, and we see lots of users asking for better S3A support in Spark.

On Fri, Nov 1, 2019 at 9:46 AM Xiao Li wrote:
> Hi, Steve,
>
> Thanks for your comments! My major quality concern is not against Hadoop
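For context on the S3A interest, a minimal Scala sketch of reading data through the S3A connector, assuming hadoop-aws is on the classpath; the bucket and path are hypothetical placeholders, and credentials are assumed to come from S3A's default provider chain:

    import org.apache.spark.sql.SparkSession

    // Minimal sketch of an S3A read; "my-bucket" is a placeholder and
    // credentials are assumed to come from the default provider chain
    // (environment variables or instance profile).
    val spark = SparkSession.builder()
      .appName("s3a-read-sketch")
      .getOrCreate()

    val df = spark.read
      .option("header", "true")
      .csv("s3a://my-bucket/logs/events.csv")
    df.show()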

DSv2 sync notes - 30 October 2019

2019-11-01 Thread Ryan Blue
*Attendees*: Ryan Blue, Terry Kim, Wenchen Fan, Jose Torres, Jacky Lee, Gengliang Wang

*Topics*:
- DROP NAMESPACE cascade behavior (see the sketch after this list)
- 3.0 tasks
- TableProvider API changes
- V1 and V2 table resolution rules
- Separate logical and physical write (for streaming)
- Bucketing support
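As a concrete illustration of the cascade question, a hedged sketch (assumes a SparkSession named spark and a registered v2 catalog; "testcat" and the namespace/table names are hypothetical, and the exact semantics were the subject of the sync):

    // Hypothetical catalog/namespace names; illustrates the cascade
    // question: should dropping a namespace drop the tables under it?
    spark.sql("CREATE NAMESPACE IF NOT EXISTS testcat.ns")
    spark.sql("CREATE TABLE testcat.ns.t (id BIGINT) USING parquet")
    // CASCADE removes the namespace and everything it contains; without
    // it, dropping a non-empty namespace is expected to fail.
    spark.sql("DROP NAMESPACE testcat.ns CASCADE")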

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2019-11-01 Thread Dongjoon Hyun
Hi, Xiao. How can JDK 11 support make the `Hadoop-3.2 profile` risky? We build and publish with JDK8.

> In this release, the Hive execution module upgrade [from 1.2 to 2.3], the Hive thrift-server upgrade, and JDK 11 support are added to the Hadoop 3.2 profile only.

Since we build and publish with JDK8 and the
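Since the thread keeps returning to which JDK the artifacts are built and run with, a tiny illustrative sketch for checking the runtime JVM from the driver (standard Java system properties, nothing Spark-specific):

    // Prints which JDK the driver is running on,
    // e.g. "1.8.0_232" vs "11.0.5".
    println(s"java.version = ${System.getProperty("java.version")}")
    println(s"java.vm.name = ${System.getProperty("java.vm.name")}")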

Re: [VOTE] SPARK 3.0.0-preview (RC2)

2019-11-01 Thread Takeshi Yamamuro
+1, too.

On Sat, Nov 2, 2019 at 3:36 AM Hyukjin Kwon wrote:
> +1
>
> On Fri, 1 Nov 2019, 15:36 Wenchen Fan, wrote:
>> The PR builder uses the Hadoop 2.7 profile, which makes me think that 2.7 is
>> more stable and we should make releases using 2.7 by default.
>>
>> +1
>>
>> On Fri, Nov 1, 2019

Re: [VOTE] SPARK 3.0.0-preview (RC2)

2019-11-01 Thread Wenchen Fan
The PR builder uses the Hadoop 2.7 profile, which makes me think that 2.7 is more stable and we should make releases using 2.7 by default.

+1

On Fri, Nov 1, 2019 at 7:16 AM Xiao Li wrote:
> Spark 3.0 will still use the Hadoop 2.7 profile by default, I think.
> The Hadoop 2.7 profile is much more

Re: Spark 3.0 and S3A

2019-11-01 Thread Steve Loughran
On Mon, Oct 28, 2019 at 3:40 PM Sean Owen wrote:
> There will be a "Hadoop 3.x" version of 3.0, as it's essential to get
> a JDK 11-compatible build. You can see the hadoop-3.2 profile.
> hadoop-aws is pulled in by the hadoop-cloud module, I believe, so it bears
> checking whether the profile
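One hedged way to check Sean's point for a given build: probe from the driver whether the S3A filesystem class actually made it onto the classpath (the class name is the real one shipped in hadoop-aws; the probe itself is just an illustrative sketch):

    // Probe for the S3A filesystem class shipped in hadoop-aws.
    val s3aPresent =
      try { Class.forName("org.apache.hadoop.fs.s3a.S3AFileSystem"); true }
      catch { case _: ClassNotFoundException => false }
    println(s"S3AFileSystem on classpath: $s3aPresent")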

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2019-11-01 Thread Steve Loughran
What is the current default value? The 2.x releases are reaching EOL: 2.7 is dead, there might be a 2.8.x, and for now 2.9 is the branch-2 release getting attention. 2.10.0 shipped yesterday, but the ".0" means there will inevitably be surprises. One issue with using older versions is that any
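To answer "what is the current default value?" empirically for a given Spark distribution, one can ask Hadoop's own VersionInfo utility at runtime (a real Hadoop class; the sketch is illustrative):

    import org.apache.hadoop.util.VersionInfo

    // Prints the Hadoop version this Spark build is linked against,
    // e.g. "2.7.4" for the default profile or "3.2.0" for -Phadoop-3.2.
    println(s"Hadoop version: ${VersionInfo.getVersion}")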

Re: [VOTE] SPARK 3.0.0-preview (RC2)

2019-11-01 Thread Dongjoon Hyun
+1 for Apache Spark 3.0.0-preview (RC2).

Bests,
Dongjoon.

On Thu, Oct 31, 2019 at 11:36 PM Wenchen Fan wrote:
> The PR builder uses the Hadoop 2.7 profile, which makes me think that 2.7 is
> more stable and we should make releases using 2.7 by default.
>
> +1
>
> On Fri, Nov 1, 2019 at 7:16 AM

Re: Use Hadoop-3.2 as a default Hadoop profile in 3.0.0?

2019-11-01 Thread Xiao Li
Hi, Steve,

Thanks for your comments! My major quality concern is not against Hadoop 3.2. In this release, the Hive execution module upgrade [from 1.2 to 2.3], the Hive thrift-server upgrade, and JDK 11 support are added to the Hadoop 3.2 profile only. Compared with the Hadoop 2.x profile, the Hadoop 3.2 profile
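Related to the Hive 1.2-to-2.3 concern: regardless of which Hive execution jars are bundled, Spark can be pointed at an older metastore via spark.sql.hive.metastore.version and spark.sql.hive.metastore.jars (real configuration keys; the version value and app name below are illustrative):

    import org.apache.spark.sql.SparkSession

    // Sketch: point Spark at a Hive 1.2 metastore even when the built-in
    // Hive execution jars are 2.3. "maven" downloads matching Hive jars.
    val spark = SparkSession.builder()
      .appName("metastore-pin-sketch")
      .config("spark.sql.hive.metastore.version", "1.2.2")
      .config("spark.sql.hive.metastore.jars", "maven")
      .enableHiveSupport()
      .getOrCreate()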

Re: [VOTE] SPARK 3.0.0-preview (RC2)

2019-11-01 Thread Hyukjin Kwon
+1

On Fri, 1 Nov 2019, 15:36 Wenchen Fan, wrote:
> The PR builder uses the Hadoop 2.7 profile, which makes me think that 2.7 is
> more stable and we should make releases using 2.7 by default.
>
> +1
>
> On Fri, Nov 1, 2019 at 7:16 AM Xiao Li wrote:
>> Spark 3.0 will still use the Hadoop 2.7