I am fine with a separate project for relocated dependencies (or even just 
separate packages, you do a maven install of them and not include them in the 
IDE at all).  Shading still has some drawbacks, but I think in a few cases it 
makes since.  I would prefer it if we picked a very small number of 
dependencies that cause people issues and just shade those.  Guava is the big 
one that I worry about. Netty is a possibility and I think asm would be 
another, but it is a transitive dependency so it would require us with our own 
version of kryo exposing the kryo API but pulling in a shaded asm.  
The servlet-api concerns me, but it looks like it is tied to the 
IHttpCredentialsPlugin which should move to the server package anyways.

The rest I am not concerned about, are things that are exposed to end users, or 
are for test and not actually shipped.
$ mvn dependecy:tree...
[INFO] --- maven-dependency-plugin:2.8:tree (default-cli) @ storm-client ---
[INFO] org.apache.storm:storm-client:jar:2.0.0-SNAPSHOT
[INFO] +- uk.org.lidalia:sysout-over-slf4j:jar:1.0.2:compile
[INFO] +- org.slf4j:slf4j-api:jar:1.7.21:compile 
[INFO] +- org.apache.logging.log4j:log4j-api:jar:2.8.2:compile
[INFO] +- org.apache.logging.log4j:log4j-core:jar:2.8.2:compile
[INFO] +- org.apache.logging.log4j:log4j-slf4j-impl:jar:2.8.2:compile
[INFO] +- org.slf4j:log4j-over-slf4j:jar:1.6.6:compile
[INFO] +- com.google.guava:guava:jar:16.0.1:compile
[INFO] +- org.apache.thrift:libthrift:jar:0.9.3:compile
[INFO] |  \- org.apache.httpcomponents:httpcore:jar:4.4.1:compile
[INFO] +- commons-io:commons-io:jar:2.5:compile
[INFO] +- commons-lang:commons-lang:jar:2.5:compile
[INFO] +- commons-collections:commons-collections:jar:3.2.2:compile
[INFO] +- com.lmax:disruptor:jar:3.3.2:compile
[INFO] +- com.googlecode.json-simple:json-simple:jar:1.1:compile
[INFO] +- org.yaml:snakeyaml:jar:1.11:compile
[INFO] +- io.netty:netty:jar:3.9.0.Final:compile
[INFO] +- com.esotericsoftware:kryo:jar:3.0.3:compile
[INFO] |  +- com.esotericsoftware:reflectasm:jar:1.10.1:compile
[INFO] |  |  \- org.ow2.asm:asm:jar:5.0.3:compile
[INFO] |  +- com.esotericsoftware:minlog:jar:1.3.0:compile
[INFO] |  \- org.objenesis:objenesis:jar:2.1:compile
[INFO] +- org.apache.zookeeper:zookeeper:jar:3.4.6:compile
[INFO] |  \- jline:jline:jar:0.9.94:compile
[INFO] +- org.apache.curator:curator-framework:jar:2.12.0:compile
[INFO] +- org.jgrapht:jgrapht-core:jar:0.9.0:compile
[INFO] +- javax.servlet:servlet-api:jar:2.5:compile
[INFO] +- org.apache.httpcomponents:httpclient:jar:4.3.3:compile
[INFO] |  +- commons-logging:commons-logging:jar:1.1.3:compile
[INFO] |  \- commons-codec:commons-codec:jar:1.6:compile
[INFO] +- org.apache.curator:curator-client:jar:2.12.0:compile
[INFO] +- junit:junit:jar:4.11:test
[INFO] |  \- org.hamcrest:hamcrest-core:jar:1.3:test
[INFO] +- org.mockito:mockito-core:jar:1.9.5:test
[INFO] \- org.hamcrest:hamcrest-library:jar:1.3:test
- Bobby


On Wednesday, July 12, 2017, 9:45:43 AM CDT, Jungtaek Lim <[email protected]> 
wrote:

I'd like to bump on this again, since we have a few huge issues for Storm
2.0.0, and this issue is a kind of regression and effectively blocker.
(Please note that current master branch removes shading for some libraries
to make IDE happy.)

At that time I didn't consider option 2 as possible solution, but now Flink
is going with this option, and I can't find reason to not doing this.

* Repository: https://github.com/apache/flink-shaded
* Discussion thread:
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Changing-Flink-s-shading-model-td17419.html

Thought?

Thanks,
Jungtaek Lim (HeartSaVioR)

2017년 3월 31일 (금) 오후 3:12, Jungtaek Lim <[email protected]>님이 작성:

> Bobby,
>
> I've worked on separating worker and daemon classpath.
>
> - Issue: STORM-2441: Break down 'storm-core' to extract client (worker)
> artifacts <http://issues.apache.org/jira/browse/STORM-2441>
> - PR: https://github.com/apache/storm/pull/2034
>
> I don't address your suggestion about "classpath selection" and "hiding
> local mode". Please file issues if you would like to address.
>
> Btw, I exclude artifacts from shade & relocation list so still need to
> address dependency issue.
>
> Folks,
>
> any other ideas or opinions around dependency issue?
>
> IMHO Option 2 is clearer but not sure where we can create a new git repo
> (ASF git or even outside), and also it's not against LICENSEs to repackage
> shade & relocated artifacts to Maven.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> 2017년 3월 29일 (수) 오후 10:42, Bobby Evans <[email protected]>님이 작성:
>
> I am fine with those changes so long as we finish the separation of worker
> and daemon classpaths.  Otherwise we have made some very big changes for
> our end users that are going to have a hard time upgrading.
> If all we support is the option to run an old worker version with a new
> supervisor/nimbus I think that would be good enough, although I would like
> to see a full separation of the classpaths.
>
>
> - Bobby
>
> On Tuesday, March 28, 2017, 6:03:26 PM CDT, Jungtaek Lim <
> [email protected]> wrote:Just FYI:
> I've worked with minimal patch for 3, though I still don't like such
> workaround:
>
> https://github.com/HeartSaVioR/storm/commit/d3122faa7ae182915242b979beaac156f91fe3b2
>
> It excludes 'libthrift', 'jetty', 'codahale metrics' from relocation
> targets. I can see IDEA is OK to build the project, and Maven build
> passing.
>
> - Jungtaek Lim (HeartSaVioR)
>
> 2017년 3월 29일 (수) 오전 11:02, Jungtaek Lim <[email protected]>님이 작성:
>
> > Back to origin issue (before breaking down 'storm-core'), turned out
> > IntelliJ doesn't recognize relocated classes within project. That's why
> > build (via Maven) for master branch succeeds but IDEA compile doesn't.
> >
> > There're some issues filed but no action has been made.
> > https://youtrack.jetbrains.com/issue/IDEA-93855
> > https://youtrack.jetbrains.com/issue/IDEA-126596
> >
> > So suppose we have two modules A and B within project, and A relocates L
> > to Lr.
> > B relies on A's method which returns a class of Lr or has parameters for
> a
> > class of LR, B needs to use Lr rather than L, and Lr is not recognized
> from
> > B.
> >
> > Moving 'storm-drpc-server' to 'storm-core' may help but it's not a nice
> > solution though. (think about why we add new module 'storm-drpc-server')
> > To minimize dependency for worker (which actually affects end users) we
> > should break down 'storm-core' and it will remain to be headache.
> >
> > There seemed to be little workarounds.
> >
> > 1. Guide IDEA users to take hacky workaround.
> >
> > Quoting
> https://youtrack.jetbrains.com/issue/IDEA-93855#comment=27-1838157
> > :
> > "A hacky workaround is to make the module in intellij with the dependency
> > depend on the jar explicitly in target/. This at least allows things to
> > compile and tests to run."
> >
> > That is really bad and annoying, but we might have no choice when we
> don't
> > want to take other workarounds.
> >
> > 2. Maintaining separate project for relocated dependencies.
> >
> > This avoids contributors to take hacky workaround so good to go, but
> > maintaining relocated artifacts might be another headache, and I'm not
> sure
> > ASF (or LICENSE of relocated targets) allows to do that.
> >
> > 3. Minimize (or remove) relocate targets and/or don't relocate
> troublesome
> > targets.
> >
> > For 'storm-drpc-server', there seems to be three troublesome targets:
> >
> > - 'thrift'
> > - 'codahale metrics'
> > - 'jetty server' (We may be able to move this to 'storm-drpc-server' when
> > another webapp port is done.)
> >
> > If we are OK to give up relocating those things we might be OK for now.
> We
> > may want to extend the list when we break down more modules from
> > 'storm-core'.
> >
> > Btw, IMHO relocating is not a good option. Elastic gives up shading
> > anything for 2.0. (https://www.elastic.co/blog/to-shade-or-not-to-shade)
> > Someone might feel that it's a regression, but we need to decide to do it
> > when it can provide better shape.
> >
> > Please add ideas if you have any, and give your opinions about above
> > options.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > 2017년 3월 28일 (화) 오후 10:23, Bobby Evans <[email protected]>님이
> 작성:
> >
> > Sure I am happy to help out how I can.  I really would like to spend more
> > time on storm, but sadly work has shifted and my team got 2 new projects
> > recently, but we have not increased the head count to cover it yet, so I
> am
> > swamped.  But if you do need help with some of these let me know and I'll
> > see what I can do in my spare time.
> >
> >
> > - Bobby
> >
> > On Tuesday, March 28, 2017, 2:10:46 AM CDT, Jungtaek Lim <
> > [email protected]> wrote:Bobby,
> >
> > I just tried to follow your suggestion and found it's less error-prone
> > compared to my approach, and has lots of benefits. (I am seeing the great
> > chance to minimize dependencies for 'storm-client', say, Worker.)
> >
> > Thanks for the suggestion. I'm working on this now. I'll mention you
> when I
> > finish working this, or need your help.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > 2017년 3월 28일 (화) 오전 8:15, Jungtaek Lim <[email protected]>님이 작성:
> >
> > I think we could also fix this issue for separating 'storm-core' and
> > 'storm-webapp' (rename from 'storm-drpc-server'), since local cluster
> > doesn't need to have 'storm-webapp', DRPC server (local DRPC will still
> be
> > in 'storm-core'), UI, Logviewer. That's what I'm working on, which seems
> to
> > require heavy efforts.
> >
> > Your plan looks really promising, but in other perspective this plan is
> > even much harder to address.
> > Do you have time frame for working on this? If you can finish the work in
> > time frame so that it can be included in 2.0.0, I'll just discard my work
> > and move forward to port other things (logviewer, ui) first.
> >
> > Regarding local mode, exposing local mode provides easy debug
> functionality
> > with IDE, and hiding it takes away such functionality. We have
> > ConfigurableTopology for 2.0.0 which helps to remove ceremony code, so
> > exposing is not that bad.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> > 2017년 3월 27일 (월) 오후 11:10, Bobby Evans <[email protected]>님이
> 작성:
> >
> > +1 for fixing dependency/IDE issues, but I am not sure it is as simple as
> > what you describe.
> >
> > The issue is that there is no clean way to get local mode without pulling
> > in almost all of the daemons too.  If we are going to go through the pain
> > of separating them out, I would prefer to do it once and do it right.  I
> am
> > happy to help out with this, as it is something I have been thinking
> about
> > for a while, but just haven't found the time to tackle on my own.
> > First we need a good way to give a control to our users about the base
> > classpath of the worker, ideally the JVM version too.  We have been
> doing a
> > really good job with rolling upgrades and I think it would be great if we
> > could have multiple versions of storm/JVM installed on the worker nodes
> and
> > the end user can pick what JVM and what version of storm they want their
> > worker to run with.  We can argue over details of how that would work
> > later. The point is that it lets us make changes to the classpath in very
> > drastic ways without breaking end users.
> >
> > Second we need a better way to hide local mode.  Every example we have
> > supports local mode which means we will ship a copy of the storm daemons
> in
> > each topology jar if we pull them out of the default classpath.  We need
> to
> > be able to run existing topologies that do not have "local mode support"
> in
> > local mode.  We should be able to make storm-submitter work, there are
> > already stubs for this kind of thing, but we may need to play around with
> > DRPC and a few other APIs to make it transparent.
> >
> > We then create new jars from the existing storm-core and
> storm-drpc-server.
> >
> > storm-client - Just what the client and worker needs.  The only external
> > dependencies are logging and possibly metrics.
> > storm-local - This would pull in local mode dependencies (almost
> everything
> > in storm core).  We might even make it a test jar.
> >
> > storm-daemon - all of our daemon processes (most if not all shading
> > removed).  We can subdivide this more if we want to.
> >
> > storm-core would go away or just pull in storm-client.
> > The storm jar command would by default only pull in storm-client and its
> > dependencies.  If you wanted local mode you could add in a flag that
> would
> > adjust the classpath, boot up a local mode cluster, change the client to
> > transparently interact with that instead of a regular cluster, and jump
> to
> > the end users main.  There could also be an option to just include
> > everything on the classpath without the local mode cluster.  Ideally if
> we
> > include everything on the classpath with storm jar, that would also add a
> > flag that would make the supervisor include everything on the classpath
> > when launching the worker.
> >
> >
> > - Bobby
> >
> > On Monday, March 27, 2017, 12:11:44 AM CDT, Jungtaek Lim <
> > [email protected]>
> > wrote:Hi devs,
> >
> > I took a first step of finalizing port work via resolving dependency
> issue
> > with DRPC.
> >
> > Here's what I'm giving a try:
> > - rename 'storm-drpc-server' to 'storm-webapp'
> > - remove 'storm-core' from 'storm-drpc-server'
> > -- 'storm-drpc-server' will have its own library directory or shaded jar
> > - create 'storm-common' and extract all the things used for both
> > 'storm-core' and 'storm-webapp'
> >
> > It requires numerous files to be moved to, and huge code block should be
> > moved / modified. A bit painful to work on.
> >
> > Other approach would be separating 'storm-worker' (or 'storm-client') and
> > 'storm-daemon', and link to different libraries directory.
> > (Maybe we could make uber jar for 'storm-daemon'.)
> > This also requires similar work and maybe introduce more big effect to
> > users.
> >
> > Other than above ideas I don't have any other ideas. We're shading
> > libraries which are both needed from 'storm-core' and 'storm-drpc-server'
> > which in turn makes known issue - able to build with maven but IDE can't
> > compile 'storm-drpc-server' project.
> >
> > Please share other ideas if you have one.
> >
> > Thanks,
> > Jungtaek Lim (HeartSaVioR)
> >
> >
>
>

Reply via email to