Hi devs,

The pull request for this proposal is now available:
https://github.com/apache/storm/pull/1608

It currently targets only 1.x-branch, and I still need to update the docs.
I'll take care of both.

Please take a look at the pull request and comment.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Thu, Aug 4, 2016 at 12:20 AM, Jungtaek Lim <[email protected]> wrote:

> One thing I've found while working on this is that we may want to support
> adding a package with exclusions.
>
> Launching the child class with --packages set to kafka_2.10 just fails,
> since its transitive dependencies include conflicting libraries. I'm not
> sure how to represent exclusions, but technically Aether seems to support
> them.
>
> SLF4J: Detected both log4j-over-slf4j.jar AND slf4j-log4j12.jar on the
> class path, preempting StackOverflowError.
> SLF4J: See also http://www.slf4j.org/codes.html#log4jDelegationLoop for
> more details.
>
> Both jars are transitive dependencies of org.apache.kafka:kafka_2.10:0.9.0
> (and also 0.8.2.1). So if we would like to add the Kafka library at the
> submission step, exclusions should be supported.
>
> On Thu, Aug 4, 2016 at 12:03 AM, Jungtaek Lim <[email protected]> wrote:
>
>> FYI: This proposal is filed as STORM-2016
>> <https://issues.apache.org/jira/browse/STORM-2016>, and I've been working
>> on it.
>>
>> I'd like to explain the details of the topology submitter, as I wasn't
>> clear on that.
>>
>> I've been experimenting with several ways of submitting a topology, and
>> they all have pros and cons.
>>
>> 1. Introduce a Submitter class which resolves dependencies and uploads
>> them to the blobstore, loads the topology code and dependencies into a
>> custom mutable classloader, and finally runs the child class' main method
>> by reflection. This is what SparkSubmit does, though that is more
>> complicated because it supports various options.
>>
>> pros.
>> - No need to handle communication between processes. That class
>> bootstraps and handles everything.
>> cons.
>> - We should pass the custom classloader to all usages of Class.forName in
>> order to prevent any ClassNotFoundExceptions.
>> - Spark uses checkstyle to check usage of Class.forName, but we don't
>> apply that, so we could miss some usages.
>>
>> 2. Introduce a Helper class which resolves transitive dependencies
>> (fetching them), uploads them to the blobstore, and returns a map of
>> (blob key, file) pairs. storm.py reads the response of the Helper class,
>> adds the files to the classpath, and runs the child class' main.
>>
>> pros.
>> - We don't need to use a classloader hack (?).
>> - If we make the Helper class a separate module, we can even place that
>> module outside of lib and avoid adding the Aether libraries to the lib
>> directory.
>> cons.
>> - It's annoying and error-prone to get and parse the Helper's output from
>> stdout.
>> - Also, storm.py needs to run two classes, but that's not a big deal
>> since we already do that (confvalue and ClientJarTransformerRunner).
>> - It's not easy to remove dependencies from the blobstore if topology
>> submission from the child class fails.
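A rough sketch of the stdout-parsing step on the storm.py side, to show
where the fragility comes from. It assumes the Helper prints a single JSON
object (blob key to local file) as its last non-empty line; that output
format and the function names `parse_helper_output` and `extend_classpath`
are hypothetical, not taken from the pull request.

```python
import json
import os
from typing import Dict, List


def parse_helper_output(stdout: str) -> Dict[str, str]:
    """Return {blob_key: local_jar_path} from the Helper's stdout.

    Assumption: the Helper prints one JSON object as its last non-empty
    line; anything before it (e.g. logging noise) is ignored.
    """
    lines = [line for line in stdout.splitlines() if line.strip()]
    if not lines:
        raise ValueError("Helper produced no output")
    mapping = json.loads(lines[-1])
    if not isinstance(mapping, dict):
        raise ValueError("expected a JSON object of blob key -> file")
    return {str(k): str(v) for k, v in mapping.items()}


def extend_classpath(base: List[str], mapping: Dict[str, str]) -> str:
    """Append resolved jars (sorted for determinism) to the base classpath."""
    return os.pathsep.join(base + sorted(mapping.values()))
```

Taking only the last line is one way to tolerate stray log output on
stdout, which is part of what makes this approach error-prone.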
>>
>> 3. Let the Helper class just resolve transitive dependencies and return
>> a file list. storm.py reads the response of the Helper class, adds the
>> files to the classpath, and runs the child class' main. StormSubmitter
>> will upload them to the blobstore.
>>
>> pros.
>> - Same as 2.
>> - Easy to remove dependencies from the blobstore if submission fails.
>> - The Helper class no longer depends on storm-core, so it is easier to
>> place the module outside of lib.
>> cons.
>> - StormSubmitter should handle dependencies when submitting a topology.
>>
>> I've succeeded with 2, and will try 3 to see whether it helps.
>>
>> Any other suggestions, or opinions on the existing options, are much
>> appreciated!
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Aug 3, 2016 at 8:01 AM, Jungtaek Lim <[email protected]> wrote:
>>
>>> Hi Priyank,
>>>
>>> First of all, this feature is similar to what Spark provides:
>>>
>>> https://spark.apache.org/docs/2.0.0/submitting-applications.html#advanced-dependency-management
>>>
>>> If you have additional jars which are not packed into the uber topology
>>> jar, you can use the --jars option to include them without repackaging
>>> the topology jar.
>>>
>>> And I think I was not clear about the submitter. I'm still trying to
>>> design that part in detail: resolving dependencies needs the Eclipse
>>> Aether libraries, so I'm thinking about how to avoid adding that
>>> dependency to storm-core. But it doesn't seem that easy or clear. I'll
>>> update once I'm clear on this.
>>>
>>> Thanks,
>>> Jungtaek Lim (HeartSaVioR)
>>>
>>> On Wed, Aug 3, 2016 at 7:43 AM, Priyank Shah <[email protected]> wrote:
>>>
>>>> Hi Jungtaek,
>>>>
>>>> For adding jars and Maven artifacts at submission, you have used the
>>>> word Submitter. Is Submitter the person running the storm jar command,
>>>> or is it the Java code that actually submits the topology to Nimbus?
>>>> Also, I did not quite understand the --jars option. If you could
>>>> elaborate a little on that, that would be great.
>>>>
>>>> Thanks
>>>> Priyank
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On 8/2/16, 7:05 AM, "Jungtaek Lim" <[email protected]> wrote:
>>>>
>>>> >Ah, Satish, you got the point. I meant the copied versions of the files
>>>> >on the supervisor, but that itself can be isolated.
>>>> >I didn't think about removing blobs, and it doesn't seem easy to do.
>>>> >
>>>> >Jungtaek Lim (HeartSaVioR)
>>>> >
>>>> >
>>>> >On Tue, Aug 2, 2016 at 7:35 PM, Satish Duggana <[email protected]> wrote:
>>>> >
>>>> >> Hi Jungtaek,
>>>> >> With the current proposal, are we removing blob store files referred
>>>> >> by a topology when it is killed?
>>>> >>
>>>> >> Thanks,
>>>> >> Satish.
>>>> >>
>>>> >> On Tue, Aug 2, 2016 at 3:50 PM, Jungtaek Lim <[email protected]> wrote:
>>>> >>
>>>> >> > Hi Satish,
>>>> >> >
>>>> >> > Thanks for reviewing and sharing your idea.
>>>> >> >
>>>> >> > Yes, this is shared dependencies vs. isolated dependencies.
>>>> >> > If we name each dependency file so that it contains the group name,
>>>> >> > artifact name, and version, it can be shared.
>>>> >> > One downside of this approach is storage space, since we don't know
>>>> >> > when it's safe to delete a file without additional care, but I'm
>>>> >> > curious whether the disk would really fill up with dependency blob
>>>> >> > jar files in a normal situation.
>>>> >> > So I think we're OK to do this, but I would like to hear others'
>>>> >> > opinions.
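One possible shape for the shared naming scheme described above, as a
sketch only. The "dep-" prefix, the ".jar" suffix, and the function names
`dependency_blob_key` and `keys_to_upload` are assumptions for
illustration, not the convention used in the proposal.

```python
def dependency_blob_key(group, artifact, version):
    """Build a deterministic blob key (hypothetical 'dep-' naming scheme)."""
    for part in (group, artifact, version):
        # Reject empty parts and characters that would make the key ambiguous.
        if not part or any(c in part for c in "/\\: "):
            raise ValueError("invalid coordinate part: %r" % (part,))
    return "dep-%s-%s-%s.jar" % (group, artifact, version)


def keys_to_upload(artifacts, existing_keys):
    """Return keys for artifacts whose blobs are not already stored."""
    existing = set(existing_keys)
    wanted = [dependency_blob_key(*a) for a in artifacts]
    return [k for k in wanted if k not in existing]
```

Because the key depends only on the coordinates, two topologies that use
the same artifact map to the same blob, which is what allows sharing and
also why deletion needs the additional care mentioned above.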
>>>> >> >
>>>> >> > Btw, I'm designing the details based on the proposal. I will update
>>>> >> > this thread if there are things not covered by the initial design.
>>>> >> >
>>>> >> > Thanks,
>>>> >> > Jungtaek Lim (HeartSaVioR)
>>>> >> >
>>>> >> > On Tue, Aug 2, 2016 at 6:58 PM, Satish Duggana <[email protected]> wrote:
>>>> >> >
>>>> >> > > Hi Jungtaek,
>>>> >> > > The proposal looks good to me. Good that we are not going with the
>>>> >> > > other alternative using a mutable classloader etc.
>>>> >> > >
>>>> >> > > It's good to have the config mentioned in the proposal for adding
>>>> >> > > those jars before or after storm core/libs. There is a property,
>>>> >> > > Config.TOPOLOGY_CLASSPATH_BEGINNING, whose value is used as the
>>>> >> > > initial classpath, and that should continue to work as expected
>>>> >> > > even with the new configuration.
>>>> >> > >
>>>> >> > > One enhancement we may want to add to the existing proposal:
>>>> >> > > when --packages is used, the storm submitter can upload those
>>>> >> > > dependencies to the blob store with a defined naming convention,
>>>> >> > > so that the same set of packages is not uploaded again and can be
>>>> >> > > reused by other topologies that use the same packages.
>>>> >> > >
>>>> >> > > Thanks,
>>>> >> > > Satish.
>>>> >> > >
>>>> >> > >
>>>> >> > > On Tue, Aug 2, 2016 at 7:25 AM, Jungtaek Lim <[email protected]> wrote:
>>>> >> > >
>>>> >> > > > Hi dev,
>>>> >> > > >
>>>> >> > > > This is the proposal review thread for submitting a topology
>>>> >> > > > with additional jars and Maven artifacts. It also follows up on
>>>> >> > > > the discussion thread "[DISCUSSION] Policy of resolving
>>>> >> > > > dependencies for non storm-core modules". [1]
>>>> >> > > >
>>>> >> > > > I've written a design doc which also describes the motivation
>>>> >> > > > for this.
>>>> >> > > >
>>>> >> > > > https://cwiki.apache.org/confluence/display/STORM/A.+Design+doc%3A+adding+jars+and+maven+artifacts+at+submission
>>>> >> > > >
>>>> >> > > > Please review this and comment on "this thread" instead of the
>>>> >> > > > wiki page, so that all devs are notified of updates.
>>>> >> > > >
>>>> >> > > > Thanks,
>>>> >> > > > Jungtaek Lim (HeartSaVioR)
>>>> >> > > >
>>>> >> > > > [1]
>>>> >> > > > http://mail-archives.apache.org/mod_mbox/storm-dev/201607.mbox/%3CCAF5108jByyJLTKrV_P4fS=dj8rsr_o5oubzqbviscggsc1c...@mail.gmail.com%3E
>>>> >> > > >
>>>> >> > >
>>>> >> >
>>>> >>
>>>>
>>>
