Hi devs,

The pull request for this proposal is now available:
https://github.com/apache/storm/pull/1608
It's only against the 1.x-branch for now, and I still need to update the docs. I'll take care of both accordingly.

Please take a look at the pull request and comment.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Aug 4, 2016 at 12:20 AM, Jungtaek Lim <[email protected]> wrote:

> One thing I've found while working on this is that we may want to support
> adding a package with excludes.
>
> Launching the child class with --packages for kafka_2.10 just fails, since
> it pulls in conflicting libraries as transitive dependencies. I'm not sure
> how to represent that, but technically Aether seems to support this.
>
> SLF4J: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the
> class path, preempting StackOverflowError.
> SLF4J: See also http://www.slf4j.org/codes.html#log4jDelegationLoop for
> more details.
>
> The two jars are transitive dependencies of org.apache.kafka:kafka_2.10:0.9.0
> (also 0.8.2.1). So if we would like to add the kafka lib at the submission
> step, exclusions should be supported.
>
> On Thu, Aug 4, 2016 at 12:03 AM, Jungtaek Lim <[email protected]> wrote:
>
>> FYI: This proposal is filed as STORM-2016
>> <https://issues.apache.org/jira/browse/STORM-2016> and I've been working
>> on it.
>>
>> I'd like to explain the details of the topology submitter, as I wasn't
>> clear on that.
>>
>> I've been experimenting with several ways of submitting a topology, but
>> they all have pros and cons.
>>
>> 1. Introduce a Submitter class which resolves dependencies and uploads them
>> to the blobstore, loads the topology code and dependencies into a custom
>> mutable classloader, and finally runs the child class' main method by
>> reflection. This is what SparkSubmit does, though that is more complicated
>> because it supports various options.
>>
>> pros.
>> - No need to handle communication between processes. That class
>> bootstraps and handles everything.
>> cons.
>> - We should pass the custom classloader to all usages of Class.forName in
>> order to prevent any CNFs (ClassNotFoundException).
>> - Spark uses checkstyle to check usages of Class.forName, but we don't
>> apply that, so we could miss some.
>>
>> 2. Introduce a Helper class which resolves transitive dependencies (fetching
>> them), uploads them to the blobstore, and returns a (blob key, file) map.
>> storm.py reads the response of the Helper class, adds the files to the
>> classpath, and runs the child class' main.
>>
>> pros.
>> - We don't need to use the classloader hack (?).
>> - If we make the Helper class a separate module, we can even place that
>> module outside of lib and avoid adding the Aether libraries to the lib
>> directory.
>> cons.
>> - It's annoying and error-prone to get and parse the Helper's output from
>> stdout.
>> - Also, storm.py needs to run two classes, but that's not a big deal since
>> we already do that (confvalue and ClientJarTransformerRunner).
>> - It's not easy to remove dependencies from the blobstore if topology
>> submission from the child class fails.
>>
>> 3. Let the Helper class just resolve transitive dependencies and return a
>> file list. storm.py reads the response of the Helper class, adds the files
>> to the classpath, and runs the child class' main. StormSubmitter uploads
>> them to the blobstore.
>>
>> pros.
>> - Same as 2.
>> - Easy to remove dependencies from the blobstore if submission fails.
>> - The Helper class no longer depends on storm-core, so it's easier to place
>> the module outside of lib.
>> cons.
>> - StormSubmitter should handle dependencies when submitting a topology.
>>
>> I've succeeded with 2, and will try 3 to see whether it helps.
>>
>> Any other suggestions or opinions on the existing options are much
>> appreciated!
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Aug 3, 2016 at 8:01 AM, Jungtaek Lim <[email protected]> wrote:
>>
>>> Hi Priyank,
>>>
>>> First of all, this feature is similar (close) to what Spark provides.
>>>
>>> https://spark.apache.org/docs/2.0.0/submitting-applications.html#advanced-dependency-management
>>>
>>> If you have additional jars which are not packed into the uber topology
>>> jar, you can use the --jars option to include them without repackaging the
>>> topology jar.
>>>
>>> And I think I was not clear about the submitter. I'm still trying to design
>>> that point in detail: resolving dependencies needs the Eclipse Aether
>>> libraries, so I'm thinking about how to avoid adding that dependency to
>>> storm-core. But it doesn't seem that easy or clear. I'll update once I'm
>>> clear on this.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> On Wed, Aug 3, 2016 at 7:43 AM, Priyank Shah <[email protected]> wrote:
>>>
>>>> Hi Jungtaek,
>>>>
>>>> For adding jars and maven artifacts at submission, you have used the word
>>>> Submitter. Is Submitter the person running the storm jar command, or is
>>>> Submitter the Java code that actually submits it to Nimbus?
>>>> Also, I did not quite understand the --jars option. If you could please
>>>> elaborate a little on that, that would be great.
>>>>
>>>> Thanks
>>>> Priyank
>>>>
>>>> On 8/2/16, 7:05 AM, "Jungtaek Lim" <[email protected]> wrote:
>>>>
>>>> >Ah, Satish, you got the point. I meant the copied version of the files in
>>>> >the supervisor, but that itself can be isolated.
>>>> >I didn't think about removing blobs, and it seems not easy to do.
>>>> >
>>>> >Jungtaek Lim (HeartSaVioR)
>>>> >
>>>> >On Tue, Aug 2, 2016 at 7:35 PM, Satish Duggana <[email protected]> wrote:
>>>> >
>>>> >> Hi Jungtaek,
>>>> >> With the current proposal, are we removing blob store files referred to
>>>> >> by a topology when it is killed?
>>>> >>
>>>> >> Thanks,
>>>> >> Satish.
>>>> >>
>>>> >> On Tue, Aug 2, 2016 at 3:50 PM, Jungtaek Lim <[email protected]> wrote:
>>>> >>
>>>> >> > Hi Satish,
>>>> >> >
>>>> >> > Thanks for reviewing and sharing your idea.
>>>> >> >
>>>> >> > Yes, this is shared dependencies vs isolated dependencies.
>>>> >> > If we name the dependency file so that it contains the group name,
>>>> >> > artifact name, and version, it can be shared.
>>>> >> > One downside of this approach is storage space, since we don't know
>>>> >> > when it's safe to delete without additional care, but I wonder whether
>>>> >> > the disk would really fill up due to dependency blob jar files in a
>>>> >> > normal situation.
>>>> >> > So I think we're OK to do this, but I would like to hear others'
>>>> >> > opinions.
>>>> >> >
>>>> >> > Btw, I'm designing the details based on the proposal. I will update this
>>>> >> > thread if there are things not covered by the initial design.
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Jungtaek Lim (HeartSaVioR)
>>>> >> >
>>>> >> > On Tue, Aug 2, 2016 at 6:58 PM, Satish Duggana <[email protected]> wrote:
>>>> >> >
>>>> >> > > Hi Jungtaek,
>>>> >> > > The proposal looks good to me. Good that we are not going with the
>>>> >> > > other alternative using a mutable classloader etc.
>>>> >> > >
>>>> >> > > Good to have the mentioned config in the proposal to add those jars
>>>> >> > > before or after storm core/libs. There is a property,
>>>> >> > > Config.TOPOLOGY_CLASSPATH_BEGINNING, whose value is used as the
>>>> >> > > initial classpath, and that should continue to work as expected even
>>>> >> > > with the new configuration.
>>>> >> > >
>>>> >> > > One enhancement which we may want to add to the existing proposal:
>>>> >> > > when --packages is used, the storm submitter can upload those
>>>> >> > > dependencies to the blob store with a defined naming convention, so
>>>> >> > > that the same set of packages is not uploaded again and can be reused
>>>> >> > > by other topologies that use the same package.
>>>> >> > >
>>>> >> > > Thanks,
>>>> >> > > Satish.
>>>> >> > >
>>>> >> > > On Tue, Aug 2, 2016 at 7:25 AM, Jungtaek Lim <[email protected]> wrote:
>>>> >> > >
>>>> >> > > > Hi dev,
>>>> >> > > >
>>>> >> > > > This is the proposal review thread for submitting a topology with
>>>> >> > > > added jars and maven artifacts. It also follows up the discussion
>>>> >> > > > thread "[DISCUSSION] Policy of resolving dependencies for non
>>>> >> > > > storm-core modules" [1].
>>>> >> > > >
>>>> >> > > > I've written a design doc which also describes the motivation for
>>>> >> > > > this.
>>>> >> > > >
>>>> >> > > > https://cwiki.apache.org/confluence/display/STORM/A.+Design+doc%3A+adding+jars+and+maven+artifacts+at+submission
>>>> >> > > >
>>>> >> > > > Please review it and comment on "this thread" instead of the wiki
>>>> >> > > > page, so that all devs are notified of updates.
>>>> >> > > >
>>>> >> > > > Thanks,
>>>> >> > > > Jungtaek Lim (HeartSaVioR)
>>>> >> > > >
>>>> >> > > > [1]
>>>> >> > > > http://mail-archives.apache.org/mod_mbox/storm-dev/201607.mbox/%3CCAF5108jByyJLTKrV_P4fS=dj8rsr_o5oubzqbviscggsc1c...@mail.gmail.com%3E
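
The thread leaves two details open: how a package with excludes could be written on the command line, and what naming convention would let dependency blobs be shared across topologies. The Python sketch below shows one possible shape for both, only to make the discussion concrete. Everything in it is an assumption for illustration: the `^` separator for exclusions, the `dep-` key prefix, and the names `Artifact`, `parse_artifact`, and `blob_key` are hypothetical and are not taken from the design doc or the pull request.

```python
from typing import NamedTuple, Tuple


class Artifact(NamedTuple):
    """A maven coordinate plus the (group, artifact) pairs to exclude."""
    group: str
    name: str
    version: str
    exclusions: Tuple[Tuple[str, str], ...]


def parse_artifact(spec: str) -> Artifact:
    """Parse 'group:artifact:version[^exGroup:exArtifact ...]'.

    The '^' separator is a hypothetical syntax chosen for this sketch.
    """
    coordinate, *excluded = spec.strip().split("^")
    parts = coordinate.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError("expected group:artifact:version, got %r" % coordinate)
    exclusions = []
    for item in excluded:
        ex = item.split(":")
        if len(ex) != 2 or not all(ex):
            raise ValueError("expected group:artifact exclusion, got %r" % item)
        exclusions.append((ex[0], ex[1]))
    return Artifact(parts[0], parts[1], parts[2], tuple(exclusions))


def blob_key(artifact: Artifact) -> str:
    """Build a blob key from group name, artifact name, and version.

    Two topologies that resolve the same jar get the same key, so the jar
    would only need to be uploaded once. The 'dep-' prefix is an assumption.
    """
    return "dep-%s-%s-%s.jar" % (artifact.group, artifact.name, artifact.version)


if __name__ == "__main__":
    kafka = parse_artifact(
        "org.apache.kafka:kafka_2.10:0.8.2.1^org.slf4j:slf4j-log4j12"
    )
    print(kafka.exclusions)
    print(blob_key(kafka))
```

Exclusions are deliberately left out of the key: they change which transitive jars get resolved, not the contents of any single jar, so each resolved jar can still be keyed by its own coordinate.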
