Thanks for putting this together, Puja. I don't see the geotools-related issues anymore.

I took a glance at another JAR that a `mvn package` creates this time:

extras/indexing/target/rya.indexing-3.2.10-SNAPSHOT-accumulo-server.jar

In here, I found: hep/aida/bin/AbstractBin.class

Per https://dst.lbl.gov/ACSSoftware/colt/license.html, code in the hep/aida package is licensed under the LGPL.

It's really important to understand *all* of the dependencies that you're using when you're creating these massive shaded jars...

./extras/rya.merger/target/rya.merger-3.2.10-SNAPSHOT-shaded.jar also has the same issue. It appears to be coming in via tinkerpop-blueprints:

[INFO] +- org.apache.rya:rya.sail:jar:3.2.10-SNAPSHOT:compile
[INFO] |  +- com.tinkerpop.blueprints:blueprints-core:jar:2.5.0:compile
[INFO] |  |  \- colt:colt:jar:1.2.0:compile
[INFO] |  |     \- concurrent:concurrent:jar:1.3.4:compile
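
One way to keep colt out of the build, sketched here as an assumption rather than something taken from the Rya poms, is a Maven exclusion on the blueprints-core dependency shown in the tree above. Which pom.xml actually declares blueprints-core, and whether the blueprints code paths Rya uses still work at runtime without colt, are both unverified:

```xml
<!-- Hypothetical fragment for the pom.xml that declares blueprints-core
     (rya.sail, according to the dependency tree above). -->
<dependency>
  <groupId>com.tinkerpop.blueprints</groupId>
  <artifactId>blueprints-core</artifactId>
  <version>2.5.0</version>
  <exclusions>
    <!-- colt is the artifact the tree points to for the hep/aida classes;
         excluding it also drops concurrent:concurrent, which only arrives
         through colt on this path. -->
    <exclusion>
      <groupId>colt</groupId>
      <artifactId>colt</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

Re-running `mvn dependency:tree` and re-inspecting the shaded jars afterwards would show whether another path still pulls colt in.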

com.google.code.findbugs:jsr305, which comes in via hadoop-common (yes, Hadoop screwed up), is also bad. This is an easy fix: exclude this dependency and add com.github.stephenc.findbugs:findbugs-annotations instead.
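
A minimal sketch of that jsr305 swap, assuming hadoop-common is declared directly in the affected pom.xml and that its version is managed elsewhere. The findbugs-annotations version shown is an assumption and should be checked against what is published:

```xml
<!-- Exclude the jsr305 jar that hadoop-common brings in transitively. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <!-- version assumed to come from dependencyManagement -->
  <exclusions>
    <exclusion>
      <groupId>com.google.code.findbugs</groupId>
      <artifactId>jsr305</artifactId>
    </exclusion>
  </exclusions>
</dependency>

<!-- Replacement annotations. -->
<dependency>
  <groupId>com.github.stephenc.findbugs</groupId>
  <artifactId>findbugs-annotations</artifactId>
  <!-- assumed version; verify before use -->
  <version>1.3.9-1</version>
</dependency>
```

If jsr305 also arrives through other dependencies, each of those paths would need the same exclusion.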

Puja Valiyil wrote:
Hi everyone,
I put up a pull request Friday for making geoindexing an optional profile
( https://github.com/apache/incubator-rya/pull/101).  It pulls out
geoindexing into a separate project, and there are some obvious next steps
that we could punt to the next release that I list out in the pr.
If no one sees any issues it would be good to move forward with merging
this and going forward with another release candidate.


On Monday, October 10, 2016, Josh Elser <els...@apache.org> wrote:

Yup, you got it right, Caleb (given my current interpretation, anyways
:P). The incubator proposal[1] should have called out these dependencies to
begin with. I think this is why this is such a "shock".

Working under the assumption that GeoIndexing is the "tainted fruit" and
we call it optional, I think there is merit in making a release,
acknowledging that it needs work and that it is not entirely in the spirit
of "optional". Getting familiarity with how to make a release is extremely
important. End of the day, this is something that needs to be addressed
prior to graduation. I think I'm OK with suggesting that geoindexing is
optional to kick the problem down the road for a first release, but I want
to make sure it doesn't get repeatedly kicked. This should be an extremely
high priority to resolve.

In fewer words: if reworking the indexing to modularize the Geo-related
pieces is something someone can volunteer to do right now, great. That is
the ideal path forward. If it's going to take month(s) to do, I think
punting for one release is OK.

[1] https://wiki.apache.org/incubator/RyaProposal#External_Dependencies

Meier, Caleb wrote:

So, just to make sure I'm clear on what you said, I'll attempt to
summarize.  For the purposes of a release, it's okay to include source code
for components that have improperly licensed runtime dependencies, so long
as they are "optional" and turned off by default.  But when we actually
deploy our artifacts, we need to exclude the jars for all components that
have improperly licensed dependencies.  So in effect, any components that
have improperly licensed dependencies need to be truly optional from a
build perspective -- have an optional build profile -- and should not be
built and deployed by default.

What we are currently working on is making geoindexing optional from a
build perspective.  We're separating it out from the indexing project so
that it can have its own, optional build profile.  If what I said above is
correct, it seems like there is no way around this, other than making the
entire indexing project optional.  But that would be like throwing the baby
out with the bath water.
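
An optional build profile of the kind described above could look roughly like this in a parent pom.xml. The profile id and the module name are hypothetical placeholders, not taken from the pull request:

```xml
<!-- Sketch only: the geo module is listed under a profile instead of the
     default <modules> section, so a plain `mvn package` never builds it. -->
<profiles>
  <profile>
    <!-- hypothetical id; enabled with: mvn package -P geoindexing -->
    <id>geoindexing</id>
    <modules>
      <!-- hypothetical directory holding the code split out of indexing -->
      <module>geoindexing</module>
    </modules>
  </profile>
</profiles>
```

With this layout, the improperly licensed dependencies only enter a build when a user opts in on the command line, and the artifacts built and deployed by default never include them.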
________________________________________
From: Josh Elser [els...@apache.org]
Sent: Monday, October 10, 2016 10:26 AM
To: dev@rya.incubator.apache.org
Subject: Re: [DISCUSS] Path forward for release

Ok, I put some more thought into this one because it wasn't sitting
right with me. I think there are two main issues:

1) Is geoindexing actually "optional"

2) Would JARs also be published alongside the source release, and do
those JARs bundle these GPL-licensed dependencies?

Assuming #1 is "yes" (because I don't know it well enough technically),
if the geo-indexing modules are disabled by default, you can make the
release. I think this is what Venkatesh was getting at.

When you publish JARs, even though they are not an official release in
Apache's eyes (only source code is an Apache release -- everything else
is "supplemental" and not actually part of the release), you should
still make sure that they are being properly licensed. This also extends
to not being allowed to bundle Category-X dependencies (e.g. GPL). I
think this is how I noticed this in the first place.

I will leave the #1 discussion up to you all because I don't have enough
context -- should really get an answer in the spirit of the question:
"Is Rya useful if GeoIndexing is optional?". Meaning, will the people
using this release all be building the optional GeoIndexing support? In
this case, it's a core feature, and not an optional one.

Let me know if #2 is still not clear. I apologize for (likely) making
things more complicated.

Josh Elser wrote:

No, you're correct. I am disagreeing with Venkatesh :). That's why I
included documentation which outlines why I am disagreeing with him.

Meier, Caleb wrote:

Unless I am misunderstanding something, which I probably am, it seems
like Venkatesh and Josh are saying conflicting things. Venkatesh seems
to be implying that the licenses for runtime dependencies do not need
to be taken into account, while Josh seems to be saying that the
licenses of all artifacts created need to be compliant, and that the
licensing of those artifacts depends on the licensing of run time
dependencies. Am I missing something here?

Regarding geoindexing and indexing, those projects are somewhat
coupled right now. Puja took steps to remove geoindexing from indexing
in an effort to carry out 2. Going forward it might be best to make
the indexes pluggable.



Sent from my Verizon 4G LTE smartphone


-------- Original message --------
From: Josh Elser<els...@apache.org>
Date: 10/8/16 3:54 PM (GMT-05:00)
To: dev@rya.incubator.apache.org
Subject: Re: [DISCUSS] Path forward for release

Venkatesh is right in that the only "official" release in the ASF's eyes
is the source release. Any JARs you publish are supplementary and
technically not subject to the rules of Apache releases.

The area I'm still trying to fully grok is that the source-release you
publish must also create artifacts which are properly licensed[1]. Right
now, that means including numerous incompatible dependencies, and, thus,
does not meet the requirements of the ASL and the ASF.

Regarding David's last question: I would assume that the license applies
to both the source code and binary forms of the geo-related artifacts
that you are currently bundling in Rya. GPL is forcing that the source
code for those artifacts be available, but is not implying that the
license only applies to the code in source form.

"A" and 1/2 would be how I expected this to go forward (although, I'm
not sure how "removing GeoIndexing" evolved into "removing Indexing" --
are they so intertwined?). The area that currently makes me feel awkward
is how to interpret "optional dependencies". If every user of Rya would
just be building this support anyways, that's skirting a very gray area
in my current understanding of what is allowed.

- Josh

[1] http://www.apache.org/dev/licensing-howto.html#binary


Puja Valiyil wrote:

I don't think I follow. The source references an LGPL API, and we are
publishing binaries that reference it to Nexus. Are you sure it's not
an issue?

Sent from my iPhone

On Oct 6, 2016, at 10:36 PM, Seetharam Venkatesh <vseetha...@gmail.com> wrote:

If it's a runtime dependency, you are fine. Apache only supports
source releases. We vote on source tar ball and not binary artifacts.

Makes sense?

Sent from my iPhone,
Venkatesh

On Oct 6, 2016, at 12:40 PM, David Lotts<dlo...@gmail.com>   wrote:
Yes, geotools is a runtime dependency. No geotools source code is
distributed.

By that I mean: Geotools source code is not in our source code
repository. Only references: imports in our *.java files and dependency
entries in our pom.xml. Because of this, Maven will package geotools
JARs (binaries) in the shaded/uber JAR and WAR files that we distribute.

With option 1 or 2 as discussed, Maven will exclude the geotools jars
from our JARs and WARs. Users of Rya can follow some instructions that
we provide to add "-P indexing" (or similar) to their Maven build
command to create their own jar/war containing the optional Rya features
and geotools binaries.

Which of these does your "you should be okay" mean?
A. Option 1 and option 2 will work around the issue, and we should
proceed before we release,
- OR -
B. We are already in compliance and this is not a blocker for release,
as long as we are not redistributing geotools source code.

Hopeful for interpretation B, but expecting and happy with A.

david.

On Thu, Oct 6, 2016 at 1:22 PM, Seetharam Venkatesh <vseetha...@gmail.com> wrote:

Quick question - geotools is a runtime dependency? Are you shipping the
source code? If not, you should be okay.

Sent from my iPhone,
Venkatesh

On Oct 6, 2016, at 7:52 AM, Puja Valiyil<puja...@gmail.com>   wrote:
Hi everyone,
Talking with Aaron, it seems like there were two paths forward for
refactoring in order to create a release. To refresh everyone's memory,
the issue was that the geo-indexing extensions to Rya pull in geotools,
which prohibits us from releasing Rya under an Apache 2 license. There
may be some more particulars that I'm glossing over -- someone please
chime in if they feel it is key to the discussion.

The two paths forward we had were:

1. Make all of the indexing project and its downstream dependencies
optional and exclude them from a release.
-- The indexing project includes several "optional" extensions to Rya
(advanced indexing strategies). Prior to Rya becoming an Apache project,
these indexing extensions were optional and there was a separate profile
for including them. This option involves reverting back to that mindset.
The main argument against this is that these indexing
strategies/extensions are not in fact optional but are "core" to Rya and
can't be excluded.

2. Refactor Rya to pull geoindexing into a separate project and exclude
that project from the release.
-- We could refactor Rya to have geoindexing be its own project and add a
profile to include that in the build. This would involve moving the class
mvm.rya.indexing.GeoIndexer and the packages
mvm.rya.indexing.accumulo.geo and mvm.rya.indexing.mongodb.geo to a
separate project and then removing/moving references to geoindexing
anywhere else. Another option is to refactor the GeoIndexer interface to
remove the geotools dependency.

I think #1 is a good immediate path for a release and that #2 is a good
longer-term path forward. Since it's probably in our best interests as a
community to get an Apache release sooner rather than later, I'd rather
we go with #1 since it would be quicker. I also think that most users of
Rya would be OK with excluding the indexing project since it is not core
functionality for Rya. While #2 is a better long-term plan, it involves
some pretty extensive refactoring that would be difficult to do well in
a timely manner.

Any thoughts?

