I posted on general@incubator that:

> One goal is to have a binary that contains all resources, 
> which can be used to install cTAKES on a system that does
> not have an internet connection.
> For now we can focus on a first Apache release that 
> doesn't meet that goal, while pursuing the question with legal.
> If legal says we can't have that kind of binary here, 
> then in the future we can consider
> whether to host such a binary on a different site.

http://s.apache.org/bgp

Another motivation for this email is a post by Benson (below) to 
general@incubator, where he writes "It's not the mission of the ASF to create 
complete, end-user-friendly, software products".

I suggest we, or whoever among us is interested in such a thing, host an 
easy-to-install *binary* that includes cTAKES plus the models and jars, 
somewhere other than apache.org, that would be a single download with a simple 
unzip (and would be built off Apache cTAKES 3.0.0-incubating, once it is 
released).

This binary would probably be released shortly after each Apache cTAKES 
release, so it could be built from the officially released Apache cTAKES source.

From my understanding, we cannot have models in SVN here if they were built 
from data that is not available to the community since the models are not 
"source". That's based on this specific comment within LEGAL-157: 
https://issues.apache.org/jira/browse/LEGAL-157?focusedCommentId=13561092&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13561092

We also cannot have other compiled jars in our SVN here at apache.org, so they 
cannot be in our source release either; we are working on addressing that.

For people checking out code from SVN and using maven, those are not such big 
issues since maven will fetch the dependencies once we finish updating the POMs 
etc.

If we want to allow people to download a single binary and get the cTAKES code 
and the models, it sounds like we need to either
1) write something that would download the models for the users, or
2) host the binaries elsewhere
(or else require users to download things separately and put them together).

I strongly dislike option 1, so I will focus on option 2 in this email, as that 
will be more than enough for one email anyway ;)

For people to host such an all-inclusive binary elsewhere, those people would 
need to choose a name.
We could create a logo for their use, something like "Apache cTAKES inside" or  
"Powered by Apache cTAKES" (see 
http://www.apache.org/foundation/marks/pmcs.html#poweredby) and make it clear 
the binary is not being released directly by Apache (see http://s.apache.org/BAj).

I suggest that we wouldn't need to create a convenience binary here at Apache - 
one less thing to test and document.

This would bring up several questions though, which I'm guessing we don't want 
to get into here in great detail since it is really about something that is not 
to be released directly from Apache.
 - what to call the binary (we would not simply be able to call it "Apache 
cTAKES")
 - where to host the binary (I'd suggest the ohnlp sourceforge project, where 
previous versions of cTAKES live)
 - we would need a place to hold the documentation for this binary. I am 
assuming we could not host it at apache.org, but we would need either to have 
that confirmed here or to create a legal Jira issue to get that confirmation.
 - where would we tell people to go to post questions about the binary? 
 - where would the build of the binary take place?

I suggest taking those questions offline unless someone tells me those things 
are indeed OK to discuss here.

My main point to discuss here is whether there is enough value in providing a 
convenience binary of Apache cTAKES here at apache.org (which would not contain 
the models) for us to create and support it here, or whether we should skip 
creating a binary here at apache.org and create only source packages here.

I am not trying to splinter the group here. I would hope anyone involved in 
producing the binary would be involved here with Apache cTAKES too. But there 
might be people involved in Apache cTAKES that aren't interested in the details 
of how a binary is produced or what it looks like, or even if it is produced.

-- James

> -----Original Message-----
> From: [email protected]
> [mailto:[email protected]]
> On Behalf Of Benson Margulies
> Sent: Thursday, January 24, 2013 9:23 PM
> To: [email protected]
> Subject: Re: [VOTE] Apache cTAKES 3.0.0-incubating RC5 release
> 
> It's unfortunate to have this conversation in parallel here and on
> https://issues.apache.org/jira/browse/LEGAL-157.
> 
> Also, this thread is a combo of the discussion of ordinary jars-of-classes
> (where I'd forgotten the policy) and the much more tangled question of
> models, which is what the JIRA is wrestling with.
> 
> To answer Ted, I think that Roy might write something like:
> 
> "It's not the mission of the ASF to create complete, end-user-friendly,
> software products. It's our mission to create open source code. If someone
> else wants to build up an end-user-friendly aggregation of ASF code and
> models from bombs of whatever, that's great, and we encourage them."
> 
> On Thu, Jan 24, 2013 at 8:19 PM, Branko Čibej <[email protected]> wrote:
> > On 25.01.2013 01:50, Ted Dunning wrote:
> >> On Fri, Jan 25, 2013 at 7:37 AM, Branko Čibej <[email protected]> wrote:
> >>
> >>> On 21.01.2013 21:08, Benson Margulies wrote:
> >>> ...>>
> >>>>> I am referring to this discussion  http://s.apache.org/MUZ
> >>>> Well, that's clear enough, even if it is a typical example of how our
> >>>> founders yell at us but we have no mechanism to channel those yells
> >>>> into concise, unambiguous, documentation.
> >>> Perhaps off-topic ... but I fail to see how "source release" is
> >>> ambiguous or not concise.
> >>>
> >>> Unless the Java world has a different definition of "source code"
> >>> than us stuck-in-the-mud plodders, and it's only considered binary
> >>> once it's been JIT-compiled. :)
> >>>
> >>
> >> It isn't necessarily ambiguous when applied to code, but there is a
> >> different case when applied to models  or parameter settings.
> >>
> >> For instance, Commons Math has polynomial coefficients embedded in
> >> code that approximate certain functions.  These are the results of
> >> computations done using other systems and the source code and the
> >> data used in those other computations are not included in the
> >> released code, only the parameter values are.
> >>
> >> This same sort of thing applies here except that the model in
> >> question has a much larger set of values and is being packaged in a
> >> binary, inspectable format.  Would your opinion change if the model
> >> were expressed in a textual model?  Would it matter that the textual
> >> model is too large and obtuse to usefully inspect?
> >
> > In cases like this one, it would seem reasonable for the source code
> > to refer to those models and computations, which presumably anyone can
> > then reproduce to their own satisfaction. This is unlike compiled code
> > in that compilation results are notoriously hard to reproduce exactly,
> > because they depend on many factors that are usually hard to document,
> > let alone reproduce. I'd expect a mathematical model, no matter how
> > large, does not suffer from such ambiguities (and shut up, Gödel).
> >
> > However, that's beside the point, because ...
> >
> >> What about a hypothetical case where the model is derived from the
> >> explosion of a nuclear bomb?  Would the release of the numbers
> >> require the inclusion of a suitable bomb design so that everybody
> >> could replicate the derivation?
> >
> > ... the issue is not about exposing all the knowledge that goes
> > into writing the code, but to expose the code itself so that it can be
> > reviewed for, e.g., back-doors and other security issues. Neither of
> > your examples is relevant.
> >
> > -- Brane
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> 
