Hi folks,

Sorry to join the discussion late.  I had a look at the design doc
earlier in this thread, and it does not mention what types of
projects this new "spark extras" ASF umbrella is targeting.

Is the desire to have a maintained set of spark-related projects that
keep pace with the main Spark development schedule?  Is it just for
streaming connectors?  What about data sources, and other important
projects in the Spark ecosystem?

I'm worried that this would relegate spark-packages to third-tier
status, and that promoting a select set of committers, and the
project itself, to top-level ASF status (a la Arrow) would create a
further split in the community.

-Evan

On Sat, Apr 16, 2016 at 4:46 AM, Steve Loughran <ste...@hortonworks.com> wrote:
>
> On 15/04/2016, 17:41, "Mattmann, Chris A (3980)" 
> <chris.a.mattm...@jpl.nasa.gov> wrote:
>
>>Yeah in support of this statement I think that my primary interest in
>>this Spark Extras and the good work by Luciano here is that anytime we
>>take bits out of a code base and “move it to GitHub” I see a bad precedent
>>being set.
>>
>>Creating this project at the ASF creates a synergy with *Apache Spark*,
>>which is *at the ASF*.
>>
>>We welcome comments and as Luciano said, this is meant to invite and be
>>open to those in the Apache Spark PMC to join and help.
>>
>>Cheers,
>>Chris
>
> As one of the people named, here's my rationale:
>
> Throwing stuff into GitHub creates that world of branches, and it's no longer
> something that could be managed through the ASF, where managed means:
> governance, participation and a release process that includes auditing
> dependencies, code sign-off, etc.
>
>
> As an example, there's a mutant Hive JAR which Spark uses; that's something
> which has so far evolved between my repo and Patrick Wendell's. Now that Josh
> Rosen has taken on the bold task of "trying to move spark and twill to Kryo
> 3", he's going to own that code, and the reference branch will move
> somewhere else.
>
> In contrast, if there were an ASF location for this, then it'd be something
> anyone with commit rights could maintain and publish.
>
> (Actually, I've just realised life is hard here, as this Hive JAR is a fork of
> ASF Hive; really the Spark branch should be a separate branch in Hive's own
> repo. But the concept is the same: those bits of the codebase which are core
> parts of the Spark project should really live in or near it.)
>
>
> If everyone on the Spark commit list gets write access to this extras repo,
> moving things is straightforward. Release-wise, things could/should be in
> sync.
>
> If there's a risk, it's the eternal problem of the contrib/ dir: stuff
> ends up there that never gets maintained. I don't see that being any worse
> than if things were thrown to the wind of a thousand GitHub repos: at least
> now there'd be a central issue-tracking location.
