Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-25 Thread Allen Wittenauer

> On Jul 25, 2016, at 1:16 PM, Sangjin Lee  wrote:
> 
> > Also:  right now, the non-Linux and/or non-x86 platforms have to supply their 
> > own leveldbjni jar (or at least the C level library?) in order to make YARN 
> > even functional.  How is that going to work with the class path manipulation?
> 
> First, the native libraries are orthogonal to this. They're not governed by 
> the java classpath.
> 
> For those platforms where users/admins need to provide their own LevelDB 
> libraries, the only requirement would be to add them to the 
> share/hadoop/.../lib directory. I don't think we would ask end users of the 
> clusters to bring in their own LevelDB library as it would not be an end-user 
> concern. I assume the administrators of clusters (still users but not end 
> users) would add it to the clusters. The classpath isolation doesn't really 
> have an impact on that.
> 

$ jar tf leveldbjni-all-1.8.jar | grep native
META-INF/native/
META-INF/native/linux32/
META-INF/native/linux32/libleveldbjni.so
META-INF/native/linux64/
META-INF/native/linux64/libleveldbjni.so
META-INF/native/osx/
META-INF/native/osx/libleveldbjni.jnilib
META-INF/native/windows32/
META-INF/native/windows32/leveldbjni.dll
META-INF/native/windows64/
META-INF/native/windows64/leveldbjni.dll
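
The listing above suggests that the native code ships inside the jar itself, one directory per OS and word size, with no directory for non-x86 hardware. As a rough illustration only, the sketch below builds the resource path that a loader following this layout might probe and checks whether it is visible on the classpath. The class name NativeResourceProbe, the OS-name matching, and the 32/64-bit guess from os.arch are all assumptions made for this example; the real leveldbjni loader may resolve its library differently.

```java
import java.util.Locale;

public class NativeResourceProbe {

    /** Directory name in the style of the listing, e.g. "linux64" or "osx" (assumed convention). */
    public static String platformDir(String osName, String osArch) {
        String os = osName.toLowerCase(Locale.ROOT);
        // Crude word-size guess: amd64, x86_64, aarch64 and ppc64le all contain "64".
        String bits = osArch.contains("64") ? "64" : "32";
        if (os.startsWith("mac")) {
            return "osx"; // the listing has a single osx directory
        }
        if (os.startsWith("windows")) {
            return "windows" + bits;
        }
        return "linux" + bits; // treat everything else as Linux for this sketch
    }

    /** File name as it appears in the listing for each OS family. */
    public static String libraryFile(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("mac")) {
            return "libleveldbjni.jnilib";
        }
        if (os.startsWith("windows")) {
            return "leveldbjni.dll";
        }
        return "libleveldbjni.so";
    }

    public static String resourcePath(String osName, String osArch) {
        return "META-INF/native/" + platformDir(osName, osArch) + "/" + libraryFile(osName);
    }

    public static void main(String[] args) {
        String path = resourcePath(System.getProperty("os.name"), System.getProperty("os.arch"));
        // A classpath lookup: the embedded library is a jar resource, not a file on java.library.path.
        boolean found = ClassLoader.getSystemResource(path) != null;
        System.out.println(path + (found ? " found on classpath" : " not on classpath"));
    }
}
```

Note that in this sketch a Linux/ppc64le JVM still resolves to the linux64 entry, because the path encodes no CPU architecture; finding the resource says nothing about whether the binary inside can actually be loaded.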



-
To unsubscribe, e-mail: mapreduce-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: mapreduce-dev-h...@hadoop.apache.org



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-25 Thread Sangjin Lee
On Fri, Jul 22, 2016 at 5:15 PM, Allen Wittenauer 
wrote:

>
> But if I don't use ApplicationClassLoader, my java app is basically
> screwed then, right?
>

If we start upgrading the libraries aggressively, then it would also mean
that the ApplicationClassLoader should be more of the default than the
other way around (i.e. opt-out rather than opt-in). If we're not willing to
go there, then we cannot be too aggressive in upgrading libraries.

I'm not sure what you mean by "my java app is basically screwed", but if
you meant whether your java app would be OK if hadoop upgraded libraries
aggressively and you don't use the ApplicationClassLoader, then yes.


>
> Also:  right now, the non-Linux and/or non-x86 platforms have to supply
> their own leveldbjni jar (or at least the C level library?) in order to
> make YARN even functional.  How is that going to work with the class path
> manipulation?
>

First, the native libraries are orthogonal to this. They're not governed by
the java classpath.

For those platforms where users/admins need to provide their own LevelDB
libraries, the only requirement would be to add them to the
share/hadoop/.../lib directory. I don't think we would ask end users of the
clusters to bring in their own LevelDB library as it would not be an
end-user concern. I assume the administrators of clusters (still users but
not end users) would add it to the clusters. The classpath isolation
doesn't really have an impact on that.



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

> On Jul 22, 2016, at 5:47 PM, Zheng, Kai  wrote:
> 
> For the leveldb thing, wouldn't we have an alternative option in Java for the 
> platforms where leveldb isn't supported yet, for whatever reason? IMO, the 
> native library is best used for optimization and for performance in 
> production. For development and pure-Java platforms, a pure-Java approach 
> should still be provided and used by default. That is to say, if no Hadoop 
> native code is used, all the functionality should still work and not break. 

Yes and no.  I can certainly understand some high-end features being 
tied to native libraries, simply because system programming with Java is like 
being a touch typist with your nose.  

That said, absolutely key functionality should definitely work. Take a 
look at the last Linux/ppc64le report that was emailed to these very lists a 
few days ago [1]:


https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-ppc/30/artifact/out/console-report.html

Almost all of those YARN failures are due to MiniYARN trying to 
initialize leveldb as part of the service startup and failing because the 
embedded shared library is built for the wrong hardware architecture. Rather 
than catch the exception and do something else, the code just blows up in a 
very dramatic fashion. That translates into YARN being completely busted and 
unusable without some very weird workarounds.
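
A minimal sketch of the "catch the exception and do something else" idea, and not the actual YARN code: the native-backed store is created through a factory, and a linkage failure switches to a pure-Java implementation instead of aborting startup. KeyValueStore, InMemoryStore and openStore are hypothetical names invented for this example, and the in-memory map only stands in for a pure-Java store; a real fallback would have to persist its data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

public class StoreFallbackDemo {

    /** Hypothetical minimal store interface; not a Hadoop or LevelDB API. */
    public interface KeyValueStore {
        void put(String key, String value);
        String get(String key);
        String name();
    }

    /** Pure-Java stand-in used when the native library cannot be loaded. */
    public static class InMemoryStore implements KeyValueStore {
        private final Map<String, String> data = new HashMap<>();
        @Override public void put(String key, String value) { data.put(key, value); }
        @Override public String get(String key) { return data.get(key); }
        @Override public String name() { return "in-memory"; }
    }

    /** Tries the native-backed factory first and falls back on any linkage failure. */
    public static KeyValueStore openStore(Supplier<KeyValueStore> nativeFactory) {
        try {
            return nativeFactory.get();
        } catch (LinkageError e) {
            // UnsatisfiedLinkError and NoClassDefFoundError are both LinkageErrors.
            System.err.println("native store unavailable (" + e + "), using pure-Java fallback");
            return new InMemoryStore();
        }
    }

    public static void main(String[] args) {
        // Simulate a shared library built for the wrong architecture.
        KeyValueStore store = openStore(() -> {
            throw new UnsatisfiedLinkError("wrong hardware architecture");
        });
        store.put("app", "state");
        System.out.println(store.name() + ": " + store.get("app"));
    }
}
```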

To get us back on topic:  the class path isolation work absolutely 
cannot make this situation worse.  We either need to make sure end users can 
replace/modify Hadoop's dependencies if they require native libraries or work 
harder on making multiplatform stuff better supported.  The nightly PowerPC 
builds should help tremendously towards this goal. [2]

1 - While I greatly appreciate the OpenPOWER Foundation getting the ASF access 
to these boxes -- Mesos and Hadoop are both actively using them -- it'd be 
great if they were more reliable so we could get a report every day of the 
week. :(

2 - At some point, I'll set up a manually triggered precommit job to test 
patches.  But until both boxes are online and available on a consistent basis, 
it just isn't worth the effort.



RE: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Zheng, Kai
For the leveldb thing, wouldn't we have an alternative option in Java for the 
platforms where leveldb isn't supported yet, for whatever reason? IMO, the 
native library is best used for optimization and for performance in 
production. For development and pure-Java platforms, a pure-Java approach 
should still be provided and used by default. That is to say, if no Hadoop 
native code is used, all the functionality should still work and not break. 

HDFS erasure coding goes this way. For that, we spent much effort 
developing an ISA-L compatible erasure coder in pure Java that's used by 
default, though for performance the native ISA-L one is recommended in 
production deployments.

Regards,
Kai

-Original Message-
From: Allen Wittenauer [mailto:a...@effectivemachines.com] 
Sent: Saturday, July 23, 2016 8:16 AM
To: Sangjin Lee 
Cc: Sean Busbey ; common-...@hadoop.apache.org; 
yarn-...@hadoop.apache.org; hdfs-...@hadoop.apache.org; 
mapreduce-dev@hadoop.apache.org
Subject: Re: [DISCUSS] The order of classpath isolation work and 
updating/shading dependencies on trunk


But if I don't use ApplicationClassLoader, my java app is basically screwed 
then, right?

Also:  right now, the non-Linux and/or non-x86 platforms have to supply their 
own leveldbjni jar (or at least the C level library?) in order to make YARN 
even functional.  How is that going to work with the class path manipulation?



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

But if I don't use ApplicationClassLoader, my java app is basically screwed 
then, right?

Also:  right now, the non-Linux and/or non-x86 platforms have to supply their 
own leveldbjni jar (or at least the C level library?) in order to make YARN 
even functional.  How is that going to work with the class path manipulation?


> On Jul 22, 2016, at 9:57 AM, Sangjin Lee  wrote:
> 
> The work on HADOOP-13070 and the ApplicationClassLoader are generic and go 
> beyond YARN. It can be used in any JVM that uses hadoop. The current use 
> cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and the YARN 
> node manager auxiliary services. I'm not sure if that's what you were asking, 
> but I hope it helps.
> 
> Regards,
> Sangjin
> 



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Sangjin Lee
The work on HADOOP-13070 and the ApplicationClassLoader are generic and go
beyond YARN. It can be used in any JVM that uses hadoop. The current use
cases are MR containers, hadoop's RunJar (as in "hadoop jar"), and the YARN
node manager auxiliary services. I'm not sure if that's what you were
asking, but I hope it helps.
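
For readers unfamiliar with the approach, the general technique behind an isolating class loader can be sketched as a child-first loader: classes are looked up in the application's own jars first, except for a list of "system" prefixes that always come from the parent. This is only an illustration of the technique, not Hadoop's ApplicationClassLoader; the class name ChildFirstClassLoader and the prefix list are assumptions made for the example.

```java
import java.net.URL;
import java.net.URLClassLoader;
import java.util.Arrays;
import java.util.List;

public class ChildFirstClassLoader extends URLClassLoader {

    private final List<String> systemPrefixes;

    public ChildFirstClassLoader(URL[] urls, ClassLoader parent, List<String> systemPrefixes) {
        super(urls, parent);
        this.systemPrefixes = systemPrefixes;
    }

    /** True if the class must always be loaded by the parent (illustrative rule). */
    public boolean isSystemClass(String name) {
        for (String prefix : systemPrefixes) {
            if (name.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> c = findLoadedClass(name);
            if (c == null && !isSystemClass(name)) {
                try {
                    // Child first: look in the application's own URLs before the parent.
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the application jars; fall through to the parent.
                }
            }
            if (c == null) {
                // System classes, and anything the child lacks, use normal delegation.
                c = super.loadClass(name, false);
            }
            if (resolve) {
                resolveClass(c);
            }
            return c;
        }
    }

    public static void main(String[] args) throws Exception {
        ChildFirstClassLoader loader = new ChildFirstClassLoader(
                new URL[0],
                ChildFirstClassLoader.class.getClassLoader(),
                Arrays.asList("java.", "javax.", "org.apache.hadoop."));
        // java.* is a system prefix, so this resolves through the parent chain.
        System.out.println(loader.loadClass("java.lang.String") == String.class);
    }
}
```

With such a loader, an application jar that bundles its own version of a library wins over the copy on the framework's classpath, which is what lets the two sets of dependencies diverge.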

Regards,
Sangjin

On Fri, Jul 22, 2016 at 9:16 AM, Sean Busbey  wrote:

> My work on HADOOP-11804 *only* helps processes that sit outside of YARN. :)
>
> On Fri, Jul 22, 2016 at 10:48 AM, Allen Wittenauer
>  wrote:
> >
> > Does any of this work actually help processes that sit outside of YARN?


Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-22 Thread Allen Wittenauer

Does any of this work actually help processes that sit outside of YARN?

> On Jul 21, 2016, at 12:29 PM, Sean Busbey  wrote:
> 
> thanks for bringing this up! big +1 on upgrading dependencies for 3.0.
> 
> I have an updated patch for HADOOP-11804 ready to post this week. I've
> been updating HBase's master branch to try to make use of it, but
> could use some other reviews.
> 



Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-21 Thread Sean Busbey
thanks for bringing this up! big +1 on upgrading dependencies for 3.0.

I have an updated patch for HADOOP-11804 ready to post this week. I've
been updating HBase's master branch to try to make use of it, but
could use some other reviews.




-- 
busbey




Re: [DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-21 Thread Sangjin Lee
Thanks Tsuyoshi for opening the discussion. One benefit of the
dependency/classpath isolation work is that it can open up a possibility of
having diverging dependencies in a safe manner so that upgrading libraries
may have less impact. I'll spend some more time on HADOOP-13070 to make
some progress. Help is welcome there! :)

That said, upgrading libraries can still go on in parallel. If the past
experience is any guide, the hadoop dependencies badly trail current user
dependencies. If anything, we would be reducing the occurrences of problems
or workarounds people put in by upgrading our dependencies.



[DISCUSS] The order of classpath isolation work and updating/shading dependencies on trunk

2016-07-21 Thread Tsuyoshi Ozawa
Hi developers,

I'd like to discuss how to move forward with dependency
management in the Apache Hadoop trunk code, since a lot of work on
updating dependencies has been going on in parallel. Recent work and
activities can be summarized as follows:

0) Currently, we have merged the minimum dependency updates needed to make
Hadoop JDK-8 compatible (compilable and runnable on JDK-8).
1) After that, some people suggested that we should update the other
dependencies on trunk (e.g. protobuf, netty, jackson etc.).
2) In parallel, Sangjin and Sean are working on classpath isolation:
HADOOP-13070, HADOOP-11804 and HADOOP-11656.

The main problems we are trying to solve in the activities above are as follows:

* 1) tries to solve dependency hell between user-level jars and
system (Hadoop)-level jars.
* 2) tries to solve the problem of updating old libraries.

IIUC, 1) and 2) look unrelated, but they are in fact related. 2) tries
to separate the class loaders for client-side dependencies and
server-side dependencies in Hadoop, so we can change the policy for
updating libraries after doing 2). We can also decide which libraries
can be shaded after 2).

Hence, IMHO, the straightforward way to go is to do 2) first.
After that, we can update both client-side and server-side
dependencies based on the new policy (maybe we should discuss what kinds of
incompatibility are acceptable and what kinds are not).

Thoughts?

Thanks,
- Tsuyoshi
