[
https://issues.apache.org/jira/browse/AVRO-647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12905124#action_12905124
]
Scott Carey commented on AVRO-647:
----------------------------------
bq. Finally, to be clear, is there a motive for this beyond better expressing
dependencies? Functionally sticking everything in a single jar with lots of
optional dependencies works fine, but folks then have to guess which
dependencies they actually need, and that's the primary problem this seeks to
solve. Is that right, or are there other problems too?
That is the main case here. Dependencies become more explicit, and users should
be able to consume the parts they need without too much accidental baggage.
Instead, we could simply document this all clearly so that users are armed with
the information necessary to configure their builds to exclude transitive
dependencies they don't use.
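As a sketch of what that documentation would ask of users: with a single monolithic jar, a Maven consumer who doesn't need the Hadoop integration would have to write an exclusion by hand, something like the following (the coordinates and version here are illustrative, not the actual published artifacts):

```xml
<!-- Illustrative only: excluding an unused transitive dependency
     from a hypothetical monolithic avro artifact. -->
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.4.0</version>
  <exclusions>
    <exclusion>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-core</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Every consumer has to discover and maintain exclusions like this themselves, which is the guesswork the jar split would eliminate.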
However, Avro is by nature something that many things will depend on, and
portions of Avro might themselves depend on some of those things. In particular,
making it easy to avoid circular dependencies is a plus. As we have seen
(https://issues.apache.org/jira/browse/AVRO-545), even if it is possible to
use ivy/maven features to prevent circular dependencies, it makes users uneasy.
The guidelines I use for my projects are three-fold:
* If the cascaded set of dependencies is large and likely to conflict with
other things, it should be easy to separate (for Avro, this is the hadoop
dependency).
* If the dependency is physically large (large jar file), consider making it
easy to separate.
* If the dependency is for a minor, rarely used feature, be careful. For
example, Jackson 1.0.1, used by hadoop 0.20+ only for dumping configuration
files to JSON, causes problems.
So for the case of Reflect, if paranamer doesn't have a lot of cascaded
dependencies itself and isn't a large jar on its own, then including it in
avro-data is not going to be a big deal.
bq. If we separate jars, it might be good to split the build-time classpath in
the same manner, by splitting the src tree.
We have three choices, I think:
1. Leave the source tree as-is, and have the build use ant file
excludes/includes to define what is packaged in each one. Managing the
excludes/includes will be troublesome and would be easier if the split was
cleanly done by package. Not much else would have to change -- the compile and
test phases would stay the same. There would also be the downside that tests
would not implicitly test the packaging boundaries.
2. Break it into different source trees and continue using ant/ivy. This is
more work and means we would be breaking up tests and compile phases too.
3. Break it into different source trees and use maven. Maven is a natural fit
for this sort of thing and I'm experienced with it, but it is not trivial and
others here aren't as familiar with it. To wire up IDL and the Specific
compiler, Maven plugins would be required. Interop testing would probably
still require ant.
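To make option 3 concrete, the top level would become an aggregator pom listing one module per jar, along these lines (module names and coordinates are illustrative, matching the split proposed in the issue rather than any actual layout):

```xml
<!-- Illustrative aggregator pom for option 3: one module per jar. -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro-parent</artifactId>
  <version>1.4.0-SNAPSHOT</version>
  <packaging>pom</packaging>
  <modules>
    <module>avro-core</module>    <!-- runtime library, minimal deps -->
    <module>avro-dev</module>     <!-- compilers, IDL, dev tools -->
    <module>avro-hadoop</module>  <!-- hadoop integration -->
  </modules>
</project>
```

Each module would carry its own source tree and declared dependencies, so the packaging boundaries would be enforced at compile time rather than by include/exclude lists.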
> Break avro.jar into avro.jar, avro-dev.jar and avro-hadoop.jar
> --------------------------------------------------------------
>
> Key: AVRO-647
> URL: https://issues.apache.org/jira/browse/AVRO-647
> Project: Avro
> Issue Type: Improvement
> Components: java
> Reporter: Scott Carey
> Assignee: Scott Carey
>
> Our dependencies are starting to get a little complicated on the Java side.
> I propose we build two (possibly more) jars related to our major dependencies
> and functions.
> 1. avro.jar (or perhaps avro-core.jar)
> This contains all of the core avro functionality for _using_ avro as a
> library. This excludes the specific compiler, avro idl, and other build-time
> or development tools, as well as avro packages for third party integration
> such as hadoop. This jar should then have a minimal set of dependencies
> (jackson, jetty, SLF4J ?).
> 2. avro-dev.jar
> This would contain compilers, idl, development tools, etc. Most applications
> will not need this, but build systems and developers will.
> 3. avro-hadoop.jar
> This would contain the hadoop API and possibly pig/hive/whatever related to
> that. This makes it easier for pig/hive/hadoop to consume avro-core without
> circular dependencies.