I'd like to bump on this again, since we have a few huge issues for Storm 2.0.0, and this issue is a kind of regression and effectively blocker. (Please note that current master branch removes shading for some libraries to make IDE happy.)
At that time I didn't consider option 2 as possible solution, but now Flink is going with this option, and I can't find reason to not doing this. * Repository: https://github.com/apache/flink-shaded * Discussion thread: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Changing-Flink-s-shading-model-td17419.html Thought? Thanks, Jungtaek Lim (HeartSaVioR) 2017년 3월 31일 (금) 오후 3:12, Jungtaek Lim <[email protected]>님이 작성: > Bobby, > > I've worked on separating worker and daemon classpath. > > - Issue: STORM-2441: Break down 'storm-core' to extract client (worker) > artifacts <http://issues.apache.org/jira/browse/STORM-2441> > - PR: https://github.com/apache/storm/pull/2034 > > I don't address your suggestion about "classpath selection" and "hiding > local mode". Please file issues if you would like to address. > > Btw, I exclude artifacts from shade & relocation list so still need to > address dependency issue. > > Folks, > > any other ideas or opinions around dependency issue? > > IMHO Option 2 is clearer but not sure where we can create a new git repo > (ASF git or even outside), and also it's not against LICENSEs to repackage > shade & relocated artifacts to Maven. > > Thanks, > Jungtaek Lim (HeartSaVioR) > > 2017년 3월 29일 (수) 오후 10:42, Bobby Evans <[email protected]>님이 작성: > > I am fine with those changes so long as we finish the separation of worker > and daemon classpaths. Otherwise we have made some very big changes for > our end users that are going to have a hard time upgrading. > If all we support is the option to run an old worker version with a new > supervisor/nimbus I think that would be good enough, although I would like > to see a full separation of the classpaths. > > > - Bobby > > On Tuesday, March 28, 2017, 6:03:26 PM CDT, Jungtaek Lim < > [email protected]> wrote:Just FYI: > I've worked with minimal patch for 3, though I still don't like such > workaround: > > https://github.com/HeartSaVioR/storm/commit/d3122faa7ae182915242b979beaac156f91fe3b2 > > It excludes 'libthrift', 'jetty', 'codahale metrics' from relocation > targets. I can see IDEA is OK to build the project, and Maven build > passing. > > - Jungtaek Lim (HeartSaVioR) > > 2017년 3월 29일 (수) 오전 11:02, Jungtaek Lim <[email protected]>님이 작성: > > > Back to origin issue (before breaking down 'storm-core'), turned out > > IntelliJ doesn't recognize relocated classes within project. That's why > > build (via Maven) for master branch succeeds but IDEA compile doesn't. > > > > There're some issues filed but no action has been made. > > https://youtrack.jetbrains.com/issue/IDEA-93855 > > https://youtrack.jetbrains.com/issue/IDEA-126596 > > > > So suppose we have two modules A and B within project, and A relocates L > > to Lr. > > B relies on A's method which returns a class of Lr or has parameters for > a > > class of LR, B needs to use Lr rather than L, and Lr is not recognized > from > > B. > > > > Moving 'storm-drpc-server' to 'storm-core' may help but it's not a nice > > solution though. (think about why we add new module 'storm-drpc-server') > > To minimize dependency for worker (which actually affects end users) we > > should break down 'storm-core' and it will remain to be headache. > > > > There seemed to be little workarounds. > > > > 1. Guide IDEA users to take hacky workaround. > > > > Quoting > https://youtrack.jetbrains.com/issue/IDEA-93855#comment=27-1838157 > > : > > "A hacky workaround is to make the module in intellij with the dependency > > depend on the jar explicitly in target/. This at least allows things to > > compile and tests to run." > > > > That is really bad and annoying, but we might have no choice when we > don't > > want to take other workarounds. > > > > 2. Maintaining separate project for relocated dependencies. > > > > This avoids contributors to take hacky workaround so good to go, but > > maintaining relocated artifacts might be another headache, and I'm not > sure > > ASF (or LICENSE of relocated targets) allows to do that. > > > > 3. Minimize (or remove) relocate targets and/or don't relocate > troublesome > > targets. > > > > For 'storm-drpc-server', there seems to be three troublesome targets: > > > > - 'thrift' > > - 'codahale metrics' > > - 'jetty server' (We may be able to move this to 'storm-drpc-server' when > > another webapp port is done.) > > > > If we are OK to give up relocating those things we might be OK for now. > We > > may want to extend the list when we break down more modules from > > 'storm-core'. > > > > Btw, IMHO relocating is not a good option. Elastic gives up shading > > anything for 2.0. (https://www.elastic.co/blog/to-shade-or-not-to-shade) > > Someone might feel that it's a regression, but we need to decide to do it > > when it can provide better shape. > > > > Please add ideas if you have any, and give your opinions about above > > options. > > > > Thanks, > > Jungtaek Lim (HeartSaVioR) > > > > 2017년 3월 28일 (화) 오후 10:23, Bobby Evans <[email protected]>님이 > 작성: > > > > Sure I am happy to help out how I can. I really would like to spend more > > time on storm, but sadly work has shifted and my team got 2 new projects > > recently, but we have not increased the head count to cover it yet, so I > am > > swamped. But if you do need help with some of these let me know and I'll > > see what I can do in my spare time. > > > > > > - Bobby > > > > On Tuesday, March 28, 2017, 2:10:46 AM CDT, Jungtaek Lim < > > [email protected]> wrote:Bobby, > > > > I just tried to follow your suggestion and found it's less error-prone > > compared to my approach, and has lots of benefits. (I am seeing the great > > chance to minimize dependencies for 'storm-client', say, Worker.) > > > > Thanks for the suggestion. I'm working on this now. I'll mention you > when I > > finish working this, or need your help. > > > > Thanks, > > Jungtaek Lim (HeartSaVioR) > > > > 2017년 3월 28일 (화) 오전 8:15, Jungtaek Lim <[email protected]>님이 작성: > > > > I think we could also fix this issue for separating 'storm-core' and > > 'storm-webapp' (rename from 'storm-drpc-server'), since local cluster > > doesn't need to have 'storm-webapp', DRPC server (local DRPC will still > be > > in 'storm-core'), UI, Logviewer. That's what I'm working on, which seems > to > > require heavy efforts. > > > > Your plan looks really promising, but in other perspective this plan is > > even much harder to address. > > Do you have time frame for working on this? If you can finish the work in > > time frame so that it can be included in 2.0.0, I'll just discard my work > > and move forward to port other things (logviewer, ui) first. > > > > Regarding local mode, exposing local mode provides easy debug > functionality > > with IDE, and hiding it takes away such functionality. We have > > ConfigurableTopology for 2.0.0 which helps to remove ceremony code, so > > exposing is not that bad. > > > > Thanks, > > Jungtaek Lim (HeartSaVioR) > > > > 2017년 3월 27일 (월) 오후 11:10, Bobby Evans <[email protected]>님이 > 작성: > > > > +1 for fixing dependency/IDE issues, but I am not sure it is as simple as > > what you describe. > > > > The issue is that there is no clean way to get local mode without pulling > > in almost all of the daemons too. If we are going to go through the pain > > of separating them out, I would prefer to do it once and do it right. I > am > > happy to help out with this, as it is something I have been thinking > about > > for a while, but just haven't found the time to tackle on my own. > > First we need a good way to give a control to our users about the base > > classpath of the worker, ideally the JVM version too. We have been > doing a > > really good job with rolling upgrades and I think it would be great if we > > could have multiple versions of storm/JVM installed on the worker nodes > and > > the end user can pick what JVM and what version of storm they want their > > worker to run with. We can argue over details of how that would work > > later. The point is that it lets us make changes to the classpath in very > > drastic ways without breaking end users. > > > > Second we need a better way to hide local mode. Every example we have > > supports local mode which means we will ship a copy of the storm daemons > in > > each topology jar if we pull them out of the default classpath. We need > to > > be able to run existing topologies that do not have "local mode support" > in > > local mode. We should be able to make storm-submitter work, there are > > already stubs for this kind of thing, but we may need to play around with > > DRPC and a few other APIs to make it transparent. > > > > We then create new jars from the existing storm-core and > storm-drpc-server. > > > > storm-client - Just what the client and worker needs. The only external > > dependencies are logging and possibly metrics. > > storm-local - This would pull in local mode dependencies (almost > everything > > in storm core). We might even make it a test jar. > > > > storm-daemon - all of our daemon processes (most if not all shading > > removed). We can subdivide this more if we want to. > > > > storm-core would go away or just pull in storm-client. > > The storm jar command would by default only pull in storm-client and its > > dependencies. If you wanted local mode you could add in a flag that > would > > adjust the classpath, boot up a local mode cluster, change the client to > > transparently interact with that instead of a regular cluster, and jump > to > > the end users main. There could also be an option to just include > > everything on the classpath without the local mode cluster. Ideally if > we > > include everything on the classpath with storm jar, that would also add a > > flag that would make the supervisor include everything on the classpath > > when launching the worker. > > > > > > - Bobby > > > > On Monday, March 27, 2017, 12:11:44 AM CDT, Jungtaek Lim < > > [email protected]> > > wrote:Hi devs, > > > > I took a first step of finalizing port work via resolving dependency > issue > > with DRPC. > > > > Here's what I'm giving a try: > > - rename 'storm-drpc-server' to 'storm-webapp' > > - remove 'storm-core' from 'storm-drpc-server' > > -- 'storm-drpc-server' will have its own library directory or shaded jar > > - create 'storm-common' and extract all the things used for both > > 'storm-core' and 'storm-webapp' > > > > It requires numerous files to be moved to, and huge code block should be > > moved / modified. A bit painful to work on. > > > > Other approach would be separating 'storm-worker' (or 'storm-client') and > > 'storm-daemon', and link to different libraries directory. > > (Maybe we could make uber jar for 'storm-daemon'.) > > This also requires similar work and maybe introduce more big effect to > > users. > > > > Other than above ideas I don't have any other ideas. We're shading > > libraries which are both needed from 'storm-core' and 'storm-drpc-server' > > which in turn makes known issue - able to build with maven but IDE can't > > compile 'storm-drpc-server' project. > > > > Please share other ideas if you have one. > > > > Thanks, > > Jungtaek Lim (HeartSaVioR) > > > > > >
