Sounds good.

If the libraries follow semantic versioning properly, we could try and only
have the major version in the module name?



On Mon, Jun 26, 2017 at 2:45 PM, Chesnay Schepler <ches...@apache.org>
wrote:

> You're raising good points, now i see why having the version in the name
> is useful.
>
> I'll adjust my PR accordingly. And yes, ideally we only release the
> modified modules, not everything again.
>
>
> On 26.06.2017 14:29, Stephan Ewen wrote:
>
>> How should it work when one of the shaded dependencies is updated? We
>> probably do not want to release all of them, just because the overall
>> version number is in their version number.
>>
>> How about that:
>>
>>    - Under normal circumstances, we only increase the version of the root
>> project, when we add / bump a shaded dependency version
>>
>>    - Shaded dependencies include the version in the same. That way we can
>> possibly have two different versions of a dependency, such as
>> "flink-shaded-kryo-2" and "flink-shaded-kryo-3".
>>    - Shaded dependencies should have the version in the relocation pattern
>> as well, for the same reason as above (unless the two versions have
>> separate namespaces already).
>>
>>    - The released version of the module and artifact is always "1.0"
>> unless
>> we find that we did shading/relocation errors, in which case we need to
>> re-release that artifact (1.1, 1.2, ...)
>>
>>
>>
>> On Mon, Jun 26, 2017 at 2:23 PM, Chesnay Schepler <ches...@apache.org>
>> wrote:
>>
>> the guava version is already included in the version of the flink-shaded
>>> module.
>>>
>>> For example, for the first release of flink-shaded-guava the version
>>> would
>>> be 1-18.0.
>>>
>>> 1 is the version of flink-shaded itself, 18.0 is the guava version.
>>>
>>>
>>> On 26.06.2017 14:01, Stephan Ewen wrote:
>>>
>>> Looks good, thanks Chesnay!.
>>>>
>>>> How about including the dependency version names in the module names,
>>>> like
>>>> "flink-shaded-guava-18.0"?
>>>>
>>>> On Mon, Jun 26, 2017 at 11:49 AM, Chesnay Schepler <ches...@apache.org>
>>>> wrote:
>>>>
>>>> The new repo was created and is accessible here:
>>>>
>>>>> https://github.com/apache/flink-shaded
>>>>>
>>>>> I've already opened a PR to add a shaded guava module.
>>>>>
>>>>> Once the shaded-guava module is merged I would like to do a first
>>>>> release
>>>>> of flink-shaded,
>>>>> only containing guava. I already have a branch with all the changes
>>>>> necessary to
>>>>> integrate this dependency into Flink.
>>>>>
>>>>> The alternative would be to first create shaded modules for all
>>>>> dependencies and make a
>>>>> single release, but I feel that would delay things quite a bit.
>>>>>
>>>>>
>>>>>
>>>>> On 21.06.2017 17:00, Robert Metzger wrote:
>>>>>
>>>>> Okay, I'll request a repo for the shading.
>>>>>
>>>>>> On Wed, Jun 21, 2017 at 1:38 PM, Chesnay Schepler <ches...@apache.org
>>>>>> >
>>>>>> wrote:
>>>>>>
>>>>>> I like your suggestion Robert. A lot actually.
>>>>>>
>>>>>> Having the actual dependency version (i.e 18 for guava) in the version
>>>>>>> should improve clarity a lot.
>>>>>>>
>>>>>>> Originally i intended to release 1 artifact per Flink version, with
>>>>>>> the
>>>>>>> normal versioning scheme
>>>>>>> that we use. But given that the shaded dependencies aren't changed
>>>>>>> often
>>>>>>> (even rarely might be a stretch),
>>>>>>> and aren't actually coupled to the Flink release cycle this doesn't
>>>>>>> make
>>>>>>> a
>>>>>>> lot of sense.
>>>>>>>
>>>>>>> Having separate repos looks like a reasonable separation of concerns.
>>>>>>> The
>>>>>>> release for Flink itself
>>>>>>> would work just like it does now; we don't have to modify any scripts
>>>>>>> or
>>>>>>> do extra steps.
>>>>>>>
>>>>>>> Since the build, release and development process are separate (since
>>>>>>> flink-shaded isn't part of Flink build
>>>>>>> process, has a separate release process and changes to it will /never
>>>>>>> /require immediate changes to Flink)
>>>>>>> it seems like a very good candidate to move it into a separate repo.
>>>>>>>
>>>>>>>
>>>>>>> On 21.06.2017 11:26, Robert Metzger wrote:
>>>>>>>
>>>>>>> Its not completely clear to me how we want to version the shaded
>>>>>>>
>>>>>>> dependencies, and where we are putting them.
>>>>>>>>
>>>>>>>> One concern are the official apache release rules. If we want to
>>>>>>>> release
>>>>>>>> something to maven central, we need to do a proper vote over a
>>>>>>>> source
>>>>>>>> archive.
>>>>>>>> I would propose to create a new repository "flink-shaded.git" that
>>>>>>>> contains
>>>>>>>> the following maven module structure:
>>>>>>>> - flink-shaded: 1
>>>>>>>>        - flink-shaded-asm: 1-5.2
>>>>>>>>        - flink-shaded-guava: 1-18.0
>>>>>>>>        - ...
>>>>>>>>
>>>>>>>> The number indicates the version (for ASM, I've just guessed).
>>>>>>>> The version for the parent "flink-shaded" needs to be updated on
>>>>>>>> each
>>>>>>>> parent pom change (new module added, new / changed plugins, ...)
>>>>>>>>
>>>>>>>> We could create a separate release script in this repository that
>>>>>>>> creates
>>>>>>>> the flink-shaded-src.zip from the code and deploys the artifacts to
>>>>>>>> the
>>>>>>>> maven staging area.
>>>>>>>>
>>>>>>>> The advantage of a separate repo would be that we don't need to
>>>>>>>> maintain
>>>>>>>> separate maven projects in the same git repo.
>>>>>>>> Also, the src archives for the release vote can be created from the
>>>>>>>> repo
>>>>>>>> content (without much filtering).
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Jun 20, 2017 at 9:44 PM, Stephan Ewen <se...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> I like this approach.
>>>>>>>>
>>>>>>>> Two additional things can be mention here:
>>>>>>>>
>>>>>>>>>       - We need to deploy these artifacts independently and not as
>>>>>>>>> part
>>>>>>>>> of
>>>>>>>>> the
>>>>>>>>> build. That is a manual step once per "bump" in the dependency of
>>>>>>>>> that
>>>>>>>>> library.
>>>>>>>>>
>>>>>>>>>       - We reduce the shading complexity of the original build and
>>>>>>>>> should
>>>>>>>>> thus
>>>>>>>>> also speed up build times :-)
>>>>>>>>>
>>>>>>>>> Stephan
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Jun 20, 2017 at 1:15 PM, Chesnay Schepler <
>>>>>>>>> ches...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> I would like to start working on this.
>>>>>>>>>
>>>>>>>>> I've looked into adding a flink-shaded-guava module. Working
>>>>>>>>> against
>>>>>>>>>
>>>>>>>>>> the
>>>>>>>>>> shaded namespaces seems
>>>>>>>>>> to work without problems from the IDE, and we could forbid
>>>>>>>>>> un-shaded
>>>>>>>>>> usages with checkstyle.
>>>>>>>>>>
>>>>>>>>>> So for the list of dependencies that we want to shade we currently
>>>>>>>>>> got:
>>>>>>>>>>
>>>>>>>>>>      * asm
>>>>>>>>>>      * guava
>>>>>>>>>>      * netty
>>>>>>>>>>      * hadoop
>>>>>>>>>>      * curator
>>>>>>>>>>
>>>>>>>>>> I've had a chat with Stephan Ewan and he brought up kryo + chill
>>>>>>>>>> as
>>>>>>>>>> well.
>>>>>>>>>>
>>>>>>>>>> The nice thing is that we can do this incrementally, one
>>>>>>>>>> dependency
>>>>>>>>>> at a
>>>>>>>>>> time. As such i would propose
>>>>>>>>>> to go through the whole process for guava and see what problems
>>>>>>>>>> arise.
>>>>>>>>>>
>>>>>>>>>> This would include adding a flink-shaded module and a child
>>>>>>>>>> flink-shaded-guava module to the flink repository
>>>>>>>>>> that are not part of the build process, replacing all usages of
>>>>>>>>>> guava
>>>>>>>>>> in
>>>>>>>>>> Flink, adding the
>>>>>>>>>> checkstyle rule (optional) and deploying the artifact to maven
>>>>>>>>>> central.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 11.05.2017 10:54, Stephan Ewen wrote:
>>>>>>>>>>
>>>>>>>>>> @Ufuk  - I have never set up artifact deployment in Maven, could
>>>>>>>>>> need
>>>>>>>>>> some
>>>>>>>>>> help there.
>>>>>>>>>>
>>>>>>>>>> Regarding shading Netty, I agree, would be good to do that as
>>>>>>>>>>
>>>>>>>>>>> well...
>>>>>>>>>>>
>>>>>>>>>>> On Thu, May 11, 2017 at 10:52 AM, Ufuk Celebi <u...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> The advantages you've listed sound really compelling to me.
>>>>>>>>>>>
>>>>>>>>>>> - Do you have time to implement these changes or do we need a
>>>>>>>>>>>
>>>>>>>>>>> volunteer?
>>>>>>>>>>>>
>>>>>>>>>>>> ;)
>>>>>>>>>>>
>>>>>>>>>> - I assume that republishing the artifacts as you propose doesn't
>>>>>>>>>>
>>>>>>>>>>> have
>>>>>>>>>>>> any new legal implications since we already publish them with
>>>>>>>>>>>> our
>>>>>>>>>>>> JARs, right?
>>>>>>>>>>>>
>>>>>>>>>>>> - We might think about adding Netty to the list of shaded
>>>>>>>>>>>> artifacts
>>>>>>>>>>>> since some dependency conflicts were reported recently. Would
>>>>>>>>>>>> have
>>>>>>>>>>>> to
>>>>>>>>>>>> double check the reported issues before doing that though. ;-)
>>>>>>>>>>>>
>>>>>>>>>>>> – Ufuk
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 10, 2017 at 8:45 PM, Stephan Ewen <se...@apache.org
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> @chesnay: I used ASM as an example in the proposal. Maybe I did
>>>>>>>>>>> not
>>>>>>>>>>>
>>>>>>>>>> say
>>>>>>>>>>
>>>>>>>>>>> that clearly.
>>>>>>>>>>> If we like that approach, we should deal with the other libraries
>>>>>>>>>>> (at
>>>>>>>>>>>
>>>>>>>>>>> least
>>>>>>>>>>>>
>>>>>>>>>>>>> the frequently used ones) in the same way.
>>>>>>>>>>>>>
>>>>>>>>>>>> I would imagine to have a project layout like that:
>>>>>>>>>>>>
>>>>>>>>>>>>> flink-shaded-deps
>>>>>>>>>>>>>        - flink-shaded-asm
>>>>>>>>>>>>>        - flink-shaded-guava
>>>>>>>>>>>>>        - flink-shaded-curator
>>>>>>>>>>>>>        - flink-shaded-hadoop
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> "flink-shaded-deps" would not be built every time (and not be
>>>>>>>>>>>>> released
>>>>>>>>>>>>> every time), but only when needed.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 10, 2017 at 7:28 PM, Chesnay Schepler <
>>>>>>>>>>>>> ches...@apache.org
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> I like the idea, thank you for bringing it up.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Given that the raised problems aren't really ASM specific would
>>>>>>>>>>>>> it
>>>>>>>>>>>>>
>>>>>>>>>>>>> make
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> sense to create one flink-shaded module that contains all
>>>>>>>>>>>>>
>>>>>>>>>>>> frequently
>>>>>>>>>>>>
>>>>>>>>>>>> shaded
>>>>>>>>>>>
>>>>>>>>>>> libraries? (or maybe even all shaded dependencies by core
>>>>>>>>>>>> modules)
>>>>>>>>>>>>
>>>>>>>>>>>>> The
>>>>>>>>>>>>>
>>>>>>>>>>>>> proposal limits the scope of this to ASM and i was wondering
>>>>>>>>>>>>> why.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I also remember that there was a discussion recently about why
>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> shade
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> things at all, and the idea of working against the shaded
>>>>>>>>>>>>>
>>>>>>>>>>>> namespaces
>>>>>>>>>>>>
>>>>>>>>>>>> was
>>>>>>>>>>>
>>>>>>>>>>> brought up. Back then i was expressing doubts as to whether IDE's
>>>>>>>>>>>>
>>>>>>>>>>>>> would
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> properly support this; what's the state on that?
>>>>>>>>>>>>>
>>>>>>>>>>>> On 10.05.2017 18:18, Stephan Ewen wrote:
>>>>>>>>>>>
>>>>>>>>>>> Hi!
>>>>>>>>>>>>
>>>>>>>>>>>>> This is a discussion about altering the way we handle
>>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> shading in Flink.
>>>>>>>>>>>>>>
>>>>>>>>>>>>> I ran into quite a view problems trying to adjust / fix some
>>>>>>>>>>>> shading
>>>>>>>>>>>>
>>>>>>>>>>>>> issues
>>>>>>>>>>>>>>> during release validation.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The issue is tracked under: https://issues.apache.org/jira
>>>>>>>>>>>>>>> /browse/FLINK-6529
>>>>>>>>>>>>>>> Bring this discussion thread up because it is a bigger issue
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> *Problem*
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Currently, Flink shades dependencies like ASM and Guava into
>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> jars
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>
>>>>>>>>>>>>> projects that reference it and relocate the classes.
>>>>>>>>>>>> There are some drawbacks to that approach, let's discuss them at
>>>>>>>>>>>>
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> example of ASM:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         - The ASM classes are for example in flink-core,
>>>>>>>>>>>>>>> flink-java,
>>>>>>>>>>>>>>> flink-scala,
>>>>>>>>>>>>>>> flink-runtime, etc.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         - Users that reference these dependencies have the
>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>> multiple
>>>>>>>>>>>>>>> times
>>>>>>>>>>>>>>> in the classpath. That is unclean (works, through, because
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> classes
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>
>>>>>>>>>>>>> identical). The same happens when building the final dist. jar.
>>>>>>>>>>>>         - Some of these dependencies require to include license
>>>>>>>>>>>>
>>>>>>>>>>>>> files
>>>>>>>>>>>>>
>>>>>>>>>>>>> in
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>> shaded jar. It is hard to impossible to build a good
>>>>>>>>>>>>>>> automatic
>>>>>>>>>>>>>>> solution
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>> that, partly due to Maven's very poor cross-project path
>>>>>>>>>>>>>>> support
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         - Most importantly: Scala does not support shading
>>>>>>>>>>>>>>> really
>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Scala
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> classes have references to classes in more places than just
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> class
>>>>>>>>>>>>>
>>>>>>>>>>>>> names
>>>>>>>>>>>>>
>>>>>>>>>>>>> (apparently for Scala reflect support). Referencing a Scala
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> project
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> shaded ASM still requires to add a reference to unshaded ASM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (at
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> least
>>>>>>>>>>>>>
>>>>>>>>>>>>> as
>>>>>>>>>>>>>
>>>>>>>>>>>>> a
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> compile dependency).
>>>>>>>>>>>>>
>>>>>>>>>>>>> *Proposal*
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I propose that we build and deploy a asm-flink-shaded version
>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>> ASM
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> directly program against the relocated namespaces. Since we
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> never
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> use
>>>>>>>>>>>>>
>>>>>>>>>>>>> classes that we relocate in public interfaces, Flink users will
>>>>>>>>>>>>>
>>>>>>>>>>>>> never
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>
>>>>>>>>>>>>> the relocated class names. Internally, it does not hurt to use
>>>>>>>>>>>>
>>>>>>>>>>> them.
>>>>>>>>>>>
>>>>>>>>>>>         - Proper maven dependency management, no hidden (shaded)
>>>>>>>>>>>>
>>>>>>>>>>>>> dependencies
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>         - One copy of each class for shaded dependencies
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>         - Proper Scala interoperability
>>>>>>>>>>>>>
>>>>>>>>>>>>>         - Natural License management (license is part of
>>>>>>>>>>>>>> deployed
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> asm-flink-shaded jar)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Happy to hear thoughts!
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Stephan
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>

Reply via email to