Hey Marco,

Just to clarify: Nick and I were not arguing against the value provided by shaded versions of Phoenix jars. We were more confused by your explanation of the problem(s) you outlined.

Personally, you will never see me argue that we should create a shaded client jar that *does not* relocate the vast majority of all classes contained within. The likelihood that such an artifact will continue to work over time is slim to none. As Hadoop, Guava, HBase, Protobuf and the like continue to grow and make new releases, it's inevitable that some incompatibilities will reduce the quality of the Apache Phoenix artifacts for our users (outright breakages or just lack of confidence via no testing).

The long term goal that I see here is reducing the barrier for _you_ (as a user) to easily build the jar that contains the dependencies you want for your stack. A Maven archetype might help, or maybe just a blog post. Actually, I really like this blog post on this exact subject[1].

(and, just to make sure it's clear, this is just my opinion. It is not necessarily representative of the rest of the community)

Some open questions/requests:

1. Regarding a non-shaded artifact, it is still not clear to me how the phoenix-core-$VERSION.jar we publish already (created by our official source release and published to Nexus) doesn't meet your needs. This would be good to understand. Are you conflating "non-shaded" with "shaded but not relocated" (e.g. a fat-jar)?

2. I do agree with you that attaching the current shaded client jars to the Maven build so that they are published for use would be very beneficial. A JIRA issue (and patch!) is the way to make this happen. While describing a problem can be helpful, including the actual changes to source which you'd like to see made goes a very long way.

- Josh

[1] https://www.elastic.co/blog/to-shade-or-not-to-shade

Marco Villalobos wrote:
My recommendations are this:

Provide an artifact called phoenix-client-shaded that is a shaded jar.

Refactor the current phoenix-client-shaded into a plain, ordinary Maven project 
in which any standards-based JSR APIs (such as servlet, xml) are provided scope.  
The rest of the dependencies are compile scope. Publish it in a Maven repository.



On Sep 15, 2016, at 11:09 PM, Marco Villalobos<mvillalo...@kineteque.com>  
wrote:

TL;DR; Publish a non-shaded version of phoenix client to a maven repository 
because it would be much easier for complex systems to utilize since it is meant 
to be used as a library.

I'll elaborate, because both approaches have their benefits when done correctly.

I want to constructively say that it was not done correctly in this project.

CURRENTLY

Currently, the phoenix-client jar is shaded and it only repackages (changes the 
package name structure of dependencies) for some of its dependencies.
Currently, the phoenix-client jar is NOT available in the most popular maven 
repositories.

However, most organizations use a dependency management tool such as Maven, 
Ivy, or Gradle. Typically, an organization simply declares an artifact as a 
dependency.
Their build system will then pull ALL the required dependencies and package 
them in the proper location for an application.

WHAT IF CLIENT WAS PUBLISHED IN MAVEN REPOSITORY?

Now, let's say phoenix-client jar was published in a maven repository.

FIRST BENEFIT

This would help the adoption rate since it would be easier for organizations to 
integrate into their projects via dependency management.

BUT THERE ARE CLASS LOADING PROBLEMS BECAUSE IT IS SHADED

However, providing only a shaded jar has less benefit and utility, and in fact 
probably introduces a greater chance of class loading conflicts at 
runtime.

If it was not shaded then all of those runtime class loading issues disappear, 
and build tools such as maven could easily manage the required dependencies.

If you listed all of the non-phoenix packages in the phoenix client:

jar tvf phoenix-4.8.0-HBase-1.2-client.jar | awk '{if (!match($8,/^org\/apache\/phoenix/)) {print $8}}'

It would list these packages (a sample):

org/apache/hadoop/
org/apache/hadoop/hbase/
org/apache/tephra/
com/google/gson/
com/google/inject
sqlline
com/google/common
com/google/protobuf
org/apache/commons/logging
org/apache/log4j/
org/apache/htrace/
org/apache/commons/csv
javax/annotation/
org/apache/hadoop/hbase/client/
org/apache/hadoop/hbase/zookeeper/
tables/
org/apache/curator/
com/sun/jersey/server
javax/ws/
com/sun/appserv/
com/sun/el/parser
javax/servlet/
org/owasp/esapi
bsh/
org/apache/batik
org/w3c/dom/svg
org/cyberneko/html/
org/apache/hadoop/hdfs
org/apache/xerces/
javax/xml/
org/w3c/dom
org/xml/sax
com/sun/jersey/json
com/sun/xml/bind/
org/slf4j/impl/
javax/activation
com/sun/jersey/api/client/
com/google/inject
org/apache/flume
org/apache/pig/

That listing alone would prevent a person from using the client in an 
application that needs a different version of Servlet, Guava, Protocol Buffers, 
Jersey (JAX-RS), SLF4J, or Jersey Client.

Those are some of the most useful and popular libraries available to the public.
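
A rough sketch of how the listing above could be condensed into unique package 
prefixes, so that bundled-but-unrelocated dependencies are easier to spot. The 
function name unrelocated_packages is hypothetical, and the sketch assumes that 
anything the build relocates ends up under org/apache/phoenix/ (and is therefore 
filtered out together with Phoenix's own classes):

```shell
# Hypothetical helper: reads `jar tf` output on stdin and prints the unique
# package prefixes (at most three path segments) of class files that live
# outside org/apache/phoenix/.
unrelocated_packages() {
  grep '\.class$' \
    | grep -v '^org/apache/phoenix/' \
    | sed 's|/[^/]*$||' \
    | cut -d/ -f1-3 \
    | sort -u
}

# Usage, with the jar name from the command above:
#   jar tf phoenix-4.8.0-HBase-1.2-client.jar | unrelocated_packages
```

Any prefix it prints (for example com/google/common) is a package that would 
sit on the classpath under its original name next to the application's own copy.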

Now let's say that it was shaded properly.  There could still be some problems 
in certain environments, such as one that uses JAX-RS, because classes that were 
annotated with @Provider would still be loaded inside a web application!

So really, this brings us back to square one.  An unshaded version of the 
client should be made available so that users can just declare it as a maven 
dependency.

THE BENEFITS OF A SHADED JAR

Now, there are a few benefits to providing a shaded jar:

It is easier to MANUALLY install on a system that has NONE of the dependencies 
that are shaded into it.

It's okay to provide shaded jars for executable applications.

REBUTTAL

Most systems use Maven for dependency management.  Phoenix client is not an 
application, it is a library.

CONCLUSION

I killed this, didn't I?  I hope my point is taken that a non-shaded version of 
the client should be provided by now. And a more complete effort to repackage 
its dependencies completely should be taken.

Libraries should never be shaded jars.


On Sep 15, 2016, at 1:10 PM, Nick Dimiduk<ndimi...@gmail.com>  wrote:

So how is no shading better than this partial shading? You'd end up with
the same conflicts, no?

As far as I know, a JDBC implementation exposes nothing outside java.sql.*.
Anything else exposed by the shaded jar is probably a bug. Can you provide
more details on what you're seeing, or -- better still -- file a JIRA?

On Thu, Sep 15, 2016 at 11:45 AM, Marco Villalobos<
mvillalo...@kineteque.com>  wrote:

Not all of its dependencies are repackaged though, which leads to
class loading conflicts.  When is that ever a good thing?

On Thu, Sep 15, 2016 at 9:28 AM, Nick Dimiduk<ndimi...@gmail.com>  wrote:
Maybe I'm missing something, but...

The whole point of providing a shaded client jar is to prevent exposing
Phoenix implementation details to the applications that consume it --
effectively allowing people to manage their own dependencies. Using a
shaded client jar means you don't have to worry about dependency conflict
because by definition there's only one dependency: the shaded client. What
are you able to achieve now with, say, the 4.7.0 unshaded client that you
cannot with the new 4.8.0 shaded client?

Thanks for the explanation.
-n

On Thu, Sep 15, 2016 at 7:56 AM, Marco Villalobos<
mvillalo...@kineteque.com>  wrote:
Good morning.

I want to provide a module that provides the unshaded version of the
JDBC client.

This will allow people to manage their own dependencies without worry of
conflict.

-Marco.



