Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
> On Jul 25, 2016, at 1:16 PM, Sangjin Lee wrote:
>
> > Also: right now, the non-Linux and/or non-x86 platforms have to supply their
> > own leveldbjni jar (or at least the C level library?) in order to make YARN
> > even functional. How is that going to work with the class path manipulation?
>
> First, the native libraries are orthogonal to this. They're not governed by
> the java classpath.
>
> For those platforms where users/admins need to provide their own LevelDB
> libraries, the only requirement would be to add them to the
> share/hadoop/.../lib directory. I don't think we would ask end users of the
> clusters to bring in their own LevelDB library as it would not be an end-user
> concern. I assume the administrators of clusters (still users but not end
> users) would add it to the clusters. The classpath isolation doesn't really
> have an impact on that.

$ jar tf leveldbjni-all-1.8.jar | grep native
META-INF/native/
META-INF/native/linux32/
META-INF/native/linux32/libleveldbjni.so
META-INF/native/linux64/
META-INF/native/linux64/libleveldbjni.so
META-INF/native/osx/
META-INF/native/osx/libleveldbjni.jnilib
META-INF/native/windows32/
META-INF/native/windows32/leveldbjni.dll
META-INF/native/windows64/
META-INF/native/windows64/leveldbjni.dll

---------------------------------------------------------------------
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
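[Editor's note] The jar listing above is the crux of Allen's point: leveldbjni ships its native libraries *inside* the jar, keyed by platform directory. The sketch below is a hedged illustration (not the actual leveldbjni code) of how such a wrapper typically picks the embedded library; the directory names are taken from the listing above, and on an architecture with no matching entry (e.g. ppc64le) the lookup comes up empty and loading ends in an UnsatisfiedLinkError.

```java
// Hypothetical resolver modeled on the META-INF/native layout shown above.
public class NativeLibResolver {
    // Map the JVM's os.name/os.arch onto the platform directories in the jar.
    static String platformDir() {
        String os = System.getProperty("os.name").toLowerCase();
        boolean is64 = System.getProperty("os.arch").contains("64");
        if (os.contains("linux"))   return is64 ? "linux64" : "linux32";
        if (os.contains("mac"))     return "osx";
        if (os.contains("windows")) return is64 ? "windows64" : "windows32";
        return null; // e.g. ppc64le: no matching entry exists in the jar
    }

    // Full in-jar path of the shared library, or null if unsupported.
    static String embeddedLibPath() {
        String dir = platformDir();
        if (dir == null) return null;
        String file = dir.startsWith("windows") ? "leveldbjni.dll"
                    : dir.equals("osx")         ? "libleveldbjni.jnilib"
                                                : "libleveldbjni.so";
        return "META-INF/native/" + dir + "/" + file;
    }

    public static void main(String[] args) {
        System.out.println(embeddedLibPath());
    }
}
```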
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
On Fri, Jul 22, 2016 at 5:15 PM, Allen Wittenauer wrote:
>
> But if I don't use ApplicationClassLoader, my java app is basically
> screwed then, right?

If we start upgrading the libraries aggressively, then it would also mean
that the ApplicationClassLoader should be more of the default than the
other way around (i.e. opt-out rather than opt-in). If we're not willing
to go there, then we cannot be too aggressive in upgrading libraries.

I'm not sure what you mean by "my java app is basically screwed", but if
you meant whether your java app would be OK if hadoop upgraded libraries
aggressively and you don't use the ApplicationClassLoader, then yes.

> Also: right now, the non-Linux and/or non-x86 platforms have to supply
> their own leveldbjni jar (or at least the C level library?) in order to
> make YARN even functional. How is that going to work with the class path
> manipulation?

First, the native libraries are orthogonal to this. They're not governed
by the java classpath.

For those platforms where users/admins need to provide their own LevelDB
libraries, the only requirement would be to add them to the
share/hadoop/.../lib directory. I don't think we would ask end users of
the clusters to bring in their own LevelDB library as it would not be an
end-user concern. I assume the administrators of clusters (still users
but not end users) would add it to the clusters. The classpath isolation
doesn't really have an impact on that.

> > On Jul 22, 2016, at 9:57 AM, Sangjin Lee wrote:
> >
> > The work on HADOOP-13070 and the ApplicationClassLoader are generic
> > and go beyond YARN. It can be used in any JVM that uses hadoop. The
> > current use cases are MR containers, hadoop's RunJar (as in "hadoop
> > jar"), and the YARN node manager auxiliary services. I'm not sure if
> > that's what you were asking, but I hope it helps.
> >
> > Regards,
> > Sangjin
> >
> > On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey wrote:
> > My work on HADOOP-11804 *only* helps processes that sit outside of
> > YARN. :)
> >
> > > On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer wrote:
> > >
> > > Does any of this work actually help processes that sit outside of YARN?
> > >
> > >> On Jul 21, 2016, at 12:29 PM, Sean Busbey wrote:
> > >>
> > >> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> > >>
> > >> I have an updated patch for HADOOP-11804 ready to post this week. I've
> > >> been updating HBase's master branch to try to make use of it, but
> > >> could use some other reviews.
> > >>
> > >> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
> > >>> Hi developers,
> > >>>
> > >>> I'd like to discuss how to make an advance towards dependency
> > >>> management in Apache Hadoop trunk code since there has been lots work
> > >>> about updating dependencies in parallel. Summarizing recent works and
> > >>> activities as follows:
> > >>>
> > >>> 0) Currently, we have merged minimum update dependencies for making
> > >>> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> > >>> 1) After that, some people suggest that we should update the other
> > >>> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> > >>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> > >>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
> > >>>
> > >>> Main problems we try to solve in the activities above is as follows:
> > >>>
> > >>> * 1) tries to solve dependency hell between user-level jar and
> > >>> system(Hadoop)-level jar.
> > >>> * 2) tries to solve updating old libraries.
> > >>>
> > >>> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
> > >>> to separate class loader between client-side dependencies and
> > >>> server-side dependencies in Hadoop, so we can the change policy of
> > >>> updating libraries after doing 2). We can also decide which libraries
> > >>> can be shaded after 2).
> > >>>
> > >>> Hence, IMHO, a straight way we should go to is doing 2 at first.
> > >>> After that, we can update both client-side and server-side
> > >>> dependencies based on new policy(maybe we should discuss what kind of
> > >>> incompatibility is acceptable, and the others are not).
> > >>>
> > >>> Thoughts?
> > >>>
> > >>> Thanks,
> > >>> - Tsuyoshi
> > >>>
> > >>> ---------------------------------------------------------------------
> > >>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > >>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> > >>
> > >>
> > >> --
> > >> busbey
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> > >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
> On Jul 22, 2016, at 5:47 PM, Zheng, Kai wrote:
>
> For the leveldb thing, wouldn't we have an alternative option in Java for
> the platforms where leveldb isn't supported yet due to whatever reasons.
> IMO, native library would be best to be used for optimization and
> production for performance. For development and pure Java platform, by
> default pure Java approach should still be provided and used. That is to
> say, if no Hadoop native is used, all the functionalities should still
> work and not break.

Yes and no. I can certainly understand some high-end features being tied to
native libraries, simply because system programming with Java is like being
a touch typist with your nose.

That said, absolutely key functionality should definitely work. Take a look
at the last Linux/ppc64le report that was emailed to these very lists a few
days ago [1]:

https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/30/artifact/out/console-report.html

Almost all of those YARN failures are due to MiniYARN trying to initiate
leveldb as part of the service startup but can't because the embedded shared
library is the wrong hardware architecture. Rather than catch the exception
and do something else, the code just blows up in a very dramatic fashion.
That should translate into "YARN is completely busted and unusable" without
doing some very weird workarounds.

To get us back on topic: the class path isolation work absolutely cannot
make this situation worse. We either need to make sure end users can
replace/modify Hadoop's dependencies if they require native libraries or
work harder on making multiplatform stuff better supported. The nightly
PowerPC builds should help tremendously towards this goal. [2]

1 - While I greatly appreciate the OpenPOWER Foundation getting the ASF
access to these boxes -- Mesos and Hadoop are both actively using them --
it'd be great if they were more reliable so we could get a report every day
of the week. :(

2 - At some point, I'll set up a manually triggered precommit job to test
patches. But until both boxes are online and available on a consistent
basis, it just isn't worth the effort.
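[Editor's note] Allen's "catch the exception and do something else" is catchable in the literal Java sense: a wrong-architecture native library surfaces as an UnsatisfiedLinkError, which an Error handler can intercept. The sketch below is hedged and hypothetical — KeyValueStore and InMemoryStore are made-up names, not YARN's actual classes — and only demonstrates that service startup can degrade to a pure-Java store instead of blowing up.

```java
import java.util.HashMap;
import java.util.Map;

public class StoreFallback {
    interface KeyValueStore { void put(String k, String v); String get(String k); }

    // Pure-Java fallback with no native dependency.
    static class InMemoryStore implements KeyValueStore {
        private final Map<String, String> m = new HashMap<>();
        public void put(String k, String v) { m.put(k, v); }
        public String get(String k) { return m.get(k); }
    }

    // Stand-in for constructing a leveldbjni-backed store on a platform
    // whose jar lacks a matching META-INF/native entry.
    static KeyValueStore openNative() {
        throw new UnsatisfiedLinkError("no native leveldbjni for this os/arch");
    }

    static KeyValueStore open() {
        try {
            return openNative();
        } catch (UnsatisfiedLinkError e) {
            // Instead of letting the Error kill the service, degrade
            // (loudly, in a real service) to the pure-Java store.
            return new InMemoryStore();
        }
    }

    public static void main(String[] args) {
        KeyValueStore s = open();
        s.put("recovery", "enabled");
        System.out.println(s.get("recovery"));
    }
}
```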
RE: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
For the leveldb thing, wouldn't we have an alternative option in Java for
the platforms where leveldb isn't supported yet due to whatever reasons.
IMO, native library would be best to be used for optimization and
production for performance. For development and pure Java platform, by
default pure Java approach should still be provided and used. That is to
say, if no Hadoop native is used, all the functionalities should still work
and not break.

HDFS erasure coding goes this way. For that, we spent much effort in
developing an ISA-L compatible erasure coder in pure Java that's used by
default, though for performance the ISA-L native one is recommended in
production deployment.

Regards,
Kai

-----Original Message-----
From: Allen Wittenauer [mailto:a...@effectivemachines.com]
Sent: Saturday, July 23, 2016 8:16 AM
To: Sangjin Lee
Cc: Sean Busbey; common-...@hadoop.apache.org; yarn-...@hadoop.apache.org;
hdfs-...@hadoop.apache.org; mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] The order of classpath isolation work and
updating/shading dependencies on trunk

But if I don't use ApplicationClassLoader, my java app is basically screwed
then, right?

Also: right now, the non-Linux and/or non-x86 platforms have to supply
their own leveldbjni jar (or at least the C level library?) in order to
make YARN even functional. How is that going to work with the class path
manipulation?

> On Jul 22, 2016, at 9:57 AM, Sangjin Lee wrote:
>
> The work on HADOOP-13070 and the ApplicationClassLoader are generic and
> go beyond YARN. It can be used in any JVM that uses hadoop. The current
> use cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and
> the YARN node manager auxiliary services. I'm not sure if that's what
> you were asking, but I hope it helps.
>
> Regards,
> Sangjin
>
> On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey wrote:
> My work on HADOOP-11804 *only* helps processes that sit outside of
> YARN. :)
>
> > On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?
> >
> >> On Jul 21, 2016, at 12:29 PM, Sean Busbey wrote:
> >>
> >> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> >>
> >> I have an updated patch for HADOOP-11804 ready to post this week.
> >> I've been updating HBase's master branch to try to make use of it,
> >> but could use some other reviews.
> >>
> >> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
> >>> Hi developers,
> >>>
> >>> I'd like to discuss how to make an advance towards dependency
> >>> management in Apache Hadoop trunk code since there has been lots
> >>> work about updating dependencies in parallel. Summarizing recent
> >>> works and activities as follows:
> >>>
> >>> 0) Currently, we have merged minimum update dependencies for
> >>> making Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> >>> 1) After that, some people suggest that we should update the other
> >>> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> >>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> >>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
> >>>
> >>> Main problems we try to solve in the activities above is as follows:
> >>>
> >>> * 1) tries to solve dependency hell between user-level jar and
> >>> system(Hadoop)-level jar.
> >>> * 2) tries to solve updating old libraries.
> >>>
> >>> IIUC, 1) and 2) looks not related, but it's related in fact. 2)
> >>> tries to separate class loader between client-side dependencies
> >>> and server-side dependencies in Hadoop, so we can the change
> >>> policy of updating libraries after doing 2). We can also decide
> >>> which libraries can be shaded after 2).
> >>>
> >>> Hence, IMHO, a straight way we should go to is doing 2 at first.
> >>> After that, we can update both client-side and server-side
> >>> dependencies based on new policy(maybe we should discuss what kind
> >>> of incompatibility is acceptable, and the others are not).
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks,
> >>> - Tsuyoshi
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >>
> >>
> >> --
> >> busbey
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
> --
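[Editor's note] Kai's description of the HDFS erasure coding approach — pure Java by default, native ISA-L only when it loads — can be sketched as a small factory. The class names below are hypothetical (the real coders live elsewhere in Hadoop); the point is the selection pattern: prefer native, but never let its absence break functionality.

```java
public class CoderFactory {
    interface RawErasureCoder { String name(); }

    // Default implementation with no native dependency.
    static class PureJavaCoder implements RawErasureCoder {
        public String name() { return "pure-java"; }
    }

    // Stand-in for a JNI-backed ISA-L coder; here we assume libisal
    // is unavailable, which surfaces as an UnsatisfiedLinkError.
    static class NativeIsalCoder implements RawErasureCoder {
        NativeIsalCoder() { throw new UnsatisfiedLinkError("libisal not found"); }
        public String name() { return "isa-l"; }
    }

    static RawErasureCoder create(boolean preferNative) {
        if (preferNative) {
            try {
                return new NativeIsalCoder();
            } catch (UnsatisfiedLinkError ignored) {
                // Fall through: functionality must not depend on the native lib.
            }
        }
        return new PureJavaCoder();
    }

    public static void main(String[] args) {
        System.out.println(create(true).name());
    }
}
```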
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
But if I don't use ApplicationClassLoader, my java app is basically screwed
then, right?

Also: right now, the non-Linux and/or non-x86 platforms have to supply
their own leveldbjni jar (or at least the C level library?) in order to
make YARN even functional. How is that going to work with the class path
manipulation?

> On Jul 22, 2016, at 9:57 AM, Sangjin Lee wrote:
>
> The work on HADOOP-13070 and the ApplicationClassLoader are generic and
> go beyond YARN. It can be used in any JVM that uses hadoop. The current
> use cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and
> the YARN node manager auxiliary services. I'm not sure if that's what
> you were asking, but I hope it helps.
>
> Regards,
> Sangjin
>
> On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey wrote:
> My work on HADOOP-11804 *only* helps processes that sit outside of
> YARN. :)
>
> > On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?
> >
> >> On Jul 21, 2016, at 12:29 PM, Sean Busbey wrote:
> >>
> >> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> >>
> >> I have an updated patch for HADOOP-11804 ready to post this week. I've
> >> been updating HBase's master branch to try to make use of it, but
> >> could use some other reviews.
> >>
> >> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
> >>> Hi developers,
> >>>
> >>> I'd like to discuss how to make an advance towards dependency
> >>> management in Apache Hadoop trunk code since there has been lots work
> >>> about updating dependencies in parallel. Summarizing recent works and
> >>> activities as follows:
> >>>
> >>> 0) Currently, we have merged minimum update dependencies for making
> >>> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> >>> 1) After that, some people suggest that we should update the other
> >>> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> >>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> >>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
> >>>
> >>> Main problems we try to solve in the activities above is as follows:
> >>>
> >>> * 1) tries to solve dependency hell between user-level jar and
> >>> system(Hadoop)-level jar.
> >>> * 2) tries to solve updating old libraries.
> >>>
> >>> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
> >>> to separate class loader between client-side dependencies and
> >>> server-side dependencies in Hadoop, so we can the change policy of
> >>> updating libraries after doing 2). We can also decide which libraries
> >>> can be shaded after 2).
> >>>
> >>> Hence, IMHO, a straight way we should go to is doing 2 at first.
> >>> After that, we can update both client-side and server-side
> >>> dependencies based on new policy(maybe we should discuss what kind of
> >>> incompatibility is acceptable, and the others are not).
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks,
> >>> - Tsuyoshi
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >>
> >>
> >> --
> >> busbey
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
> --
> busbey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
The work on HADOOP-13070 and the ApplicationClassLoader are generic and go
beyond YARN. It can be used in any JVM that uses hadoop. The current use
cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and the YARN
node manager auxiliary services. I'm not sure if that's what you were
asking, but I hope it helps.

Regards,
Sangjin

On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey wrote:
> My work on HADOOP-11804 *only* helps processes that sit outside of
> YARN. :)
>
> > On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?
> >
> >> On Jul 21, 2016, at 12:29 PM, Sean Busbey wrote:
> >>
> >> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> >>
> >> I have an updated patch for HADOOP-11804 ready to post this week. I've
> >> been updating HBase's master branch to try to make use of it, but
> >> could use some other reviews.
> >>
> >> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
> >>> Hi developers,
> >>>
> >>> I'd like to discuss how to make an advance towards dependency
> >>> management in Apache Hadoop trunk code since there has been lots work
> >>> about updating dependencies in parallel. Summarizing recent works and
> >>> activities as follows:
> >>>
> >>> 0) Currently, we have merged minimum update dependencies for making
> >>> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> >>> 1) After that, some people suggest that we should update the other
> >>> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> >>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> >>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
> >>>
> >>> Main problems we try to solve in the activities above is as follows:
> >>>
> >>> * 1) tries to solve dependency hell between user-level jar and
> >>> system(Hadoop)-level jar.
> >>> * 2) tries to solve updating old libraries.
> >>>
> >>> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
> >>> to separate class loader between client-side dependencies and
> >>> server-side dependencies in Hadoop, so we can the change policy of
> >>> updating libraries after doing 2). We can also decide which libraries
> >>> can be shaded after 2).
> >>>
> >>> Hence, IMHO, a straight way we should go to is doing 2 at first.
> >>> After that, we can update both client-side and server-side
> >>> dependencies based on new policy(maybe we should discuss what kind of
> >>> incompatibility is acceptable, and the others are not).
> >>>
> >>> Thoughts?
> >>>
> >>> Thanks,
> >>> - Tsuyoshi
> >>>
> >>> ---------------------------------------------------------------------
> >>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> >>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
> >>
> >>
> >> --
> >> busbey
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> >> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
> --
> busbey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
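[Editor's note] For readers unfamiliar with the mechanism Sangjin describes: the idea behind an application classloader for isolation is child-first delegation with an allowlist of "system" classes that must still come from the parent. The toy model below is a hedged sketch of that mechanism, not Hadoop's actual ApplicationClassLoader implementation.

```java
import java.net.URL;
import java.net.URLClassLoader;

// Child-first URLClassLoader: resolves classes from its own URLs before
// delegating to the parent, except for allowlisted "system" prefixes
// (java., the Hadoop public API, etc. in the real thing).
public class ChildFirstClassLoader extends URLClassLoader {
    private final String[] systemPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, String[] systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    private boolean isSystemClass(String name) {
        for (String p : systemPrefixes) {
            if (name.startsWith(p)) return true;
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    c = findClass(name); // look in the app's own jars first
                } catch (ClassNotFoundException ignored) {
                    // not in our URLs; fall back to parent delegation below
                }
            }
            if (c == null) {
                c = super.loadClass(name, false); // parent-first path
            }
            if (resolve) resolveClass(c);
            return c;
        }
    }
}
```

With this shape, an app's bundled guava (say) wins over the one on Hadoop's server classpath, while `java.*` and the framework API stay shared.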
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
Does any of this work actually help processes that sit outside of YARN?

> On Jul 21, 2016, at 12:29 PM, Sean Busbey wrote:
>
> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
>
> I have an updated patch for HADOOP-11804 ready to post this week. I've
> been updating HBase's master branch to try to make use of it, but
> could use some other reviews.
>
> On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
>> Hi developers,
>>
>> I'd like to discuss how to make an advance towards dependency
>> management in Apache Hadoop trunk code since there has been lots work
>> about updating dependencies in parallel. Summarizing recent works and
>> activities as follows:
>>
>> 0) Currently, we have merged minimum update dependencies for making
>> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
>> 1) After that, some people suggest that we should update the other
>> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
>> 2) In parallel, Sangjin and Sean are working on classpath isolation:
>> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>>
>> Main problems we try to solve in the activities above is as follows:
>>
>> * 1) tries to solve dependency hell between user-level jar and
>> system(Hadoop)-level jar.
>> * 2) tries to solve updating old libraries.
>>
>> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
>> to separate class loader between client-side dependencies and
>> server-side dependencies in Hadoop, so we can the change policy of
>> updating libraries after doing 2). We can also decide which libraries
>> can be shaded after 2).
>>
>> Hence, IMHO, a straight way we should go to is doing 2 at first.
>> After that, we can update both client-side and server-side
>> dependencies based on new policy(maybe we should discuss what kind of
>> incompatibility is acceptable, and the others are not).
>>
>> Thoughts?
>>
>> Thanks,
>> - Tsuyoshi
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
>> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org
>
>
> --
> busbey
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: common-dev-h...@hadoop.apache.org
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
thanks for bringing this up! big +1 on upgrading dependencies for 3.0.

I have an updated patch for HADOOP-11804 ready to post this week. I've
been updating HBase's master branch to try to make use of it, but could
use some other reviews.

On Thu, Jul 21, 2016 at 4:30 AM, Tsuyoshi Ozawa wrote:
> Hi developers,
>
> I'd like to discuss how to make an advance towards dependency
> management in Apache Hadoop trunk code since there has been lots work
> about updating dependencies in parallel. Summarizing recent works and
> activities as follows:
>
> 0) Currently, we have merged minimum update dependencies for making
> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> 1) After that, some people suggest that we should update the other
> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>
> Main problems we try to solve in the activities above is as follows:
>
> * 1) tries to solve dependency hell between user-level jar and
> system(Hadoop)-level jar.
> * 2) tries to solve updating old libraries.
>
> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
> to separate class loader between client-side dependencies and
> server-side dependencies in Hadoop, so we can the change policy of
> updating libraries after doing 2). We can also decide which libraries
> can be shaded after 2).
>
> Hence, IMHO, a straight way we should go to is doing 2 at first.
> After that, we can update both client-side and server-side
> dependencies based on new policy(maybe we should discuss what kind of
> incompatibility is acceptable, and the others are not).
>
> Thoughts?
>
> Thanks,
> - Tsuyoshi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

--
busbey
Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
Thanks Tsuyoshi for opening the discussion.

One benefit of the dependency/classpath isolation work is that it can open
up a possibility of having diverging dependencies in a safe manner so that
upgrading libraries may have less impact. I'll spend some more time on
HADOOP-13070 to make some progress. Help is welcome there! :)

That said, upgrading libraries can still go on in parallel. If the past
experience is any guide, the hadoop dependencies badly trail current user
dependencies. If anything, we would be reducing the occurrences of problems
or workarounds people put in by upgrading our dependencies.

On Thu, Jul 21, 2016 at 2:30 AM, Tsuyoshi Ozawa wrote:
> Hi developers,
>
> I'd like to discuss how to make an advance towards dependency
> management in Apache Hadoop trunk code since there has been lots work
> about updating dependencies in parallel. Summarizing recent works and
> activities as follows:
>
> 0) Currently, we have merged minimum update dependencies for making
> Hadoop JDK-8 compatible(compilable and runnable on JDK-8).
> 1) After that, some people suggest that we should update the other
> dependencies on trunk(e.g. protobuf, netty, jackson etc.).
> 2) In parallel, Sangjin and Sean are working on classpath isolation:
> HADOOP-13070, HADOOP-11804 and HADOOP-11656.
>
> Main problems we try to solve in the activities above is as follows:
>
> * 1) tries to solve dependency hell between user-level jar and
> system(Hadoop)-level jar.
> * 2) tries to solve updating old libraries.
>
> IIUC, 1) and 2) looks not related, but it's related in fact. 2) tries
> to separate class loader between client-side dependencies and
> server-side dependencies in Hadoop, so we can the change policy of
> updating libraries after doing 2). We can also decide which libraries
> can be shaded after 2).
>
> Hence, IMHO, a straight way we should go to is doing 2 at first.
> After that, we can update both client-side and server-side
> dependencies based on new policy(maybe we should discuss what kind of
> incompatibility is acceptable, and the others are not).
>
> Thoughts?
>
> Thanks,
> - Tsuyoshi
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
> For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org
[DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk
Hi developers,

I'd like to discuss how to make an advance towards dependency management in
Apache Hadoop trunk code, since there has been lots of work about updating
dependencies in parallel. Summarizing recent works and activities as
follows:

0) Currently, we have merged minimum update dependencies for making Hadoop
JDK-8 compatible (compilable and runnable on JDK-8).
1) After that, some people suggest that we should update the other
dependencies on trunk (e.g. protobuf, netty, jackson etc.).
2) In parallel, Sangjin and Sean are working on classpath isolation:
HADOOP-13070, HADOOP-11804 and HADOOP-11656.

The main problems we try to solve in the activities above are as follows:

* 1) tries to solve dependency hell between user-level jars and
system (Hadoop)-level jars.
* 2) tries to solve updating old libraries.

IIUC, 1) and 2) look unrelated, but they are related in fact. 2) tries to
separate class loaders between client-side dependencies and server-side
dependencies in Hadoop, so we can change the policy of updating libraries
after doing 2). We can also decide which libraries can be shaded after 2).

Hence, IMHO, a straight way we should go is doing 2) at first. After that,
we can update both client-side and server-side dependencies based on the
new policy (maybe we should discuss what kind of incompatibility is
acceptable, and what is not).

Thoughts?

Thanks,
- Tsuyoshi
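[Editor's note] The "dependency hell" that 1) targets is concretely visible in any JVM: when a user-level jar and a Hadoop-level jar both provide a class, only one copy wins, and which one depends on classpath order and classloader. The helper below is a hedged, generic JVM diagnostic (not a Hadoop API) for seeing where a class was actually loaded from.

```java
import java.security.CodeSource;

public class WhichJar {
    // Returns the jar/directory a class was loaded from, or a marker for
    // bootstrap classes (which have no CodeSource).
    static String locationOf(Class<?> clazz) {
        CodeSource src = clazz.getProtectionDomain().getCodeSource();
        return src == null ? "(bootstrap)" : src.getLocation().toString();
    }

    public static void main(String[] args) throws Exception {
        // e.g. pass com.google.protobuf.Message on a Hadoop classpath to see
        // whose protobuf copy is being picked up.
        String name = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(name + " -> " + locationOf(Class.forName(name)));
    }
}
```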