I am +1 on having spark-2 and spark-3 modules as well.

> On 7 Mar 2020, at 15:03, RD <rdsr...@gmail.com> wrote:
> 
> I'm +1 on separate modules for spark-2 and spark-3 after the 0.8 release.
> Adopting Spark 3 would be a big change for organizations, since it brings in
> Scala 2.12, which is binary incompatible with previous Scala versions, so
> adoption could take a lot of time. I know that in our company we have no
> near-term plans to move to Spark 3.
> 
> -Best,
> R.
> 
> On Thu, Mar 5, 2020 at 6:33 PM Saisai Shao <sai.sai.s...@gmail.com 
> <mailto:sai.sai.s...@gmail.com>> wrote:
> I was wondering whether it is possible to limit the version-lock plugin to
> only the Iceberg core subprojects; the current consistent-versions plugin
> doesn't seem to allow that. Are there other plugins that could provide
> similar functionality with more flexibility?
> 
> Any suggestions on this?
> 
> Best regards,
> Saisai
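For illustration, the per-module version pinning being asked about here might look roughly like the following, if a plugin (or no global lock at all) allowed it. This is a sketch only; the module names, coordinates, and versions are assumptions, not the actual Iceberg build:

```groovy
// Sketch: each Spark sub-module pins its own Spark line, instead of one
// globally locked version (hypothetical module names and versions).

// iceberg-spark2/build.gradle
dependencies {
  // Spark 2.4.x is published for Scala 2.11
  compileOnly 'org.apache.spark:spark-sql_2.11:2.4.5'
}

// iceberg-spark3/build.gradle
dependencies {
  // Spark 3.0.x moves to Scala 2.12
  compileOnly 'org.apache.spark:spark-sql_2.12:3.0.0'
}
```

Because each dependency is `compileOnly` and confined to its own module, neither Spark version would leak into the dependency graph of the other modules.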
> 
> Saisai Shao <sai.sai.s...@gmail.com <mailto:sai.sai.s...@gmail.com>> wrote
> on Thu, Mar 5, 2020 at 3:12 PM:
> I think the requirement to support different versions should be quite
> common. Iceberg is a table format that needs to be adapted to different
> engines like Hive, Flink, and Spark, so supporting multiple versions is a
> real problem. Spark is just one case; Hive and Flink could hit the same
> issue if their interfaces change across major versions. Version locking can
> also cause problems when several engines coexist in the same build: they
> transitively introduce lots of dependencies that may conflict, it may be
> hard to find one version that satisfies all of them, and those dependencies
> are usually confined to a single module anyway.
> 
> So I think we should figure out a way to support this scenario, rather than
> maintaining branches one by one.
> 
> Ryan Blue <rb...@netflix.com <mailto:rb...@netflix.com>> wrote on Thu,
> Mar 5, 2020 at 2:53 AM:
> I think the key is that this wouldn't be using the same published artifacts. 
> This work would create a spark-2.4 artifact and a spark-3.0 artifact. (And 
> possibly a spark-common artifact.)
> 
> It seems reasonable to me to have those in the same build instead of in 
> separate branches, as long as the Spark dependencies are not leaked outside 
> of the modules. That said, I'd rather have the additional checks that 
> baseline provides in general since this is a short-term problem. It would 
> just be nice if we could have versions that are confined to a single module. 
> The Nebula plugin that baseline uses claims to support that, but I couldn't 
> get it to work.
> 
> On Wed, Mar 4, 2020 at 6:38 AM Saisai Shao <sai.sai.s...@gmail.com 
> <mailto:sai.sai.s...@gmail.com>> wrote:
> I thought a bit more about this. I agree that introducing different
> versions of the same dependency is generally error prone, but I don't think
> that would be an issue here:
> 
> 1. The two sub-modules, spark-2 and spark-3, are isolated; neither depends
> on the other.
> 2. Their generated jars can be differentiated by name, and no other Iceberg
> modules will depend on them.
> 
> So the dependency issue should not apply here, and in Maven this could be
> achieved easily. Please correct me if I'm wrong.
> 
> Best regards,
> Saisai 
> 
> Saisai Shao <sai.sai.s...@gmail.com <mailto:sai.sai.s...@gmail.com>> wrote
> on Wed, Mar 4, 2020 at 10:01 AM:
> Thanks Matt,
> 
> If branching is the only choice, then we would effectively have two *master*
> branches until spark-3 is widely adopted, which would increase the
> maintenance burden and could lead to inconsistency. I'm OK with the
> branching approach; I just think we should have a clear way to keep the two
> branches in sync.
> 
> Best,
> Saisai
> 
> Matt Cheah <mch...@palantir.com.invalid> wrote on Wed, Mar 4, 2020 at 9:50 AM:
> I think it’s generally dangerous and error-prone to try to support two
> versions of the same library in the same build, in the same published
> artifacts. This is the stance that Baseline 
> <https://github.com/palantir/gradle-baseline> + Gradle Consistent Versions 
> <https://github.com/palantir/gradle-consistent-versions> take. Gradle 
> Consistent Versions is specifically opinionated towards building against one 
> version of a library across all modules in the build.
> 
>  
> 
> I would think that branching would be the best way to build and publish 
> against multiple versions of a dependency.
> 
>  
> 
> -Matt Cheah
> 
>  
> 
> From: Saisai Shao <sai.sai.s...@gmail.com <mailto:sai.sai.s...@gmail.com>>
> Reply-To: "dev@iceberg.apache.org <mailto:dev@iceberg.apache.org>" 
> <dev@iceberg.apache.org <mailto:dev@iceberg.apache.org>>
> Date: Tuesday, March 3, 2020 at 5:45 PM
> To: Iceberg Dev List <dev@iceberg.apache.org <mailto:dev@iceberg.apache.org>>
> Cc: Ryan Blue <rb...@netflix.com <mailto:rb...@netflix.com>>
> Subject: Re: [Discuss] Merge spark-3 branch into master
> 
>  
> 
> I didn't realize that Gradle cannot support two different versions in one
> build. I did something similar for Livy, building Scala 2.10 and 2.11 jars
> simultaneously with Maven. I'm not that familiar with Gradle, but I can
> take a shot at finding a way, even a hacky one, to make it work.
> 
>  
> 
> Besides, are we saying that after the 0.8 release we will move master to
> spark-3 support, replacing spark-2, or that we will maintain two branches,
> one for spark-2 and one for spark-3, and make two releases? From my
> understanding, spark-3 adoption may not be fast, and there are still lots
> of users who will stick with spark-2. Ideally, it would be better to
> support both versions for the near future.
> 
>  
> 
> Thanks
> 
> Saisai
> 
>  
> 
>  
> 
>  
> 
> Mass Dosage <massdos...@gmail.com <mailto:massdos...@gmail.com>> wrote on
> Wed, Mar 4, 2020 at 1:33 AM:
> 
> +1 for a 0.8.0 release with Spark 2.4, then moving on to Spark 3.0 when
> it's ready.
> 
>  
> 
> On Tue, 3 Mar 2020 at 16:32, Ryan Blue <rb...@netflix.com.invalid> wrote:
> 
> Thanks for bringing this up, Saisai. I tried to do this a couple of months 
> ago, but ran into a problem with dependency locks. I couldn't get two 
> different versions of Spark packages in the build with baseline, but maybe I 
> was missing something. If you can get it working, I think it's a great idea 
> to get this into master.
> 
>  
> 
> Otherwise, I was thinking about proposing an 0.8.0 release in the next month 
> or so based on Spark 2.4. Then we could merge the branch into master and do 
> another release for Spark 3.0 when it's ready.
> 
>  
> 
> rb
> 
>  
> 
> On Tue, Mar 3, 2020 at 6:07 AM Saisai Shao <sai.sai.s...@gmail.com 
> <mailto:sai.sai.s...@gmail.com>> wrote:
> 
> Hi team,
> 
>  
> 
> I was thinking of merging the spark-3 branch into master. Per the earlier
> discussion, we could have spark-2 and spark-3 coexist as two different
> sub-modules. With this, one build could generate both spark-2 and spark-3
> runtime jars, and users could pick whichever they prefer.
> 
>  
> 
> One concern is that the two modules would share a lot of common code in the
> read/write path, which will increase the maintenance overhead of keeping
> the two copies consistent.
> 
>  
> 
> So I'd like to hear your thoughts. Any suggestions?
> 
>  
> 
> Thanks
> 
> Saisai
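The layout proposed here might look roughly like the following. This is a sketch only, with hypothetical module names (including a shared module of the spark-common style mentioned elsewhere in the thread):

```groovy
// settings.gradle — sketch of the proposed layout (module names are
// hypothetical, not the actual Iceberg build)
include 'iceberg-spark'   // shared, Spark-version-agnostic read/write code
include 'iceberg-spark2'  // Spark 2.4 integration, produces its own runtime jar
include 'iceberg-spark3'  // Spark 3.0 integration, produces its own runtime jar
```

Under this layout a single `gradlew build` would assemble both runtime jars, and the concern about duplicated read/write code is addressed by pushing as much as possible into the shared module.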
> 
> 
> 
>  
> 
> --
> 
> Ryan Blue
> 
> Software Engineer
> 
> Netflix
> 
> 
> 
