I wouldn’t favor merging, for the reason Tom gave and a few others.

For the template I maintain, there have been 2 PIO releases so far and soon 
there will be 7 template releases. The point is that active templates have 
their own revision schedules. You have only to look at the history of the 
templates to see that they are released independently of PIO releases. It is 
the ASF tooling that makes separate repos hard, not the needs of the project.

These were all separate repos in PIO days because they made sense as separate 
and because GitHub makes it easy. Now with ASF-hosted git there is more pain 
but still the same project needs. Let’s not confuse pain with need. Let’s 
remove the pain points. We already have self-service repo creation from pushing 
on the pain points, a big step forward from the days when it took an 
infra ticket to get a repo.

If `git pull template-url` is the basis of getting a template, merging repos 
will break this and make contributed templates work differently from external 
ones, to the confusion of users.

As Tom noted, it will also bloat the project when we’d like to see it more 
modular. For instance, an Admin server microservice may also end up in a 
separate repo so it can be released at different intervals.

The standard, IMO, is not Apache, which is a venerable institution (and one 
trying to remove friction points); it is OSS outside Apache, which most 
assuredly is more modular: pip, npm, gems, apt-get, ...

Growth leads either to bloat or to efforts to decouple and refactor. I’d 
actually like to see PIO split up along microservice refactoring lines, but 
all in good time. A move to bundle everything together seems like the wrong 
direction.

Another problem is the difficulty of binary releases in the ASF, as we all 
witnessed (it is especially hard for incubating projects). Think about the fact 
that templates currently do not need to be released in any sense. Wow, that is 
very cool, speaking as the part of me that likes to avoid ASF red tape.


On Nov 3, 2016, at 2:41 PM, Tom Chan <[email protected]> wrote:

This is mostly a good idea but then one of the templates is 3 times the
size of incubator-predictionio:

$ du -d 1  -h
53M ./incubator-predictionio
1.3M ./incubator-predictionio-sdk-java
288K ./incubator-predictionio-sdk-php
536K ./incubator-predictionio-sdk-python
264K ./incubator-predictionio-sdk-ruby
236K ./incubator-predictionio-template-attribute-based-classifier
220K ./incubator-predictionio-template-ecom-recommender
264K ./incubator-predictionio-template-java-ecom-recommender
184K ./incubator-predictionio-template-recommender
196K ./incubator-predictionio-template-similar-product
440K ./incubator-predictionio-template-skeleton
160M ./incubator-predictionio-template-text-classifier

This 160M will be downloaded by all users regardless of whether they use it
or not, if we choose to consolidate them all into one repo.

Tom

On Thu, Nov 3, 2016 at 2:16 PM, Simon Chan <[email protected]> wrote:

> Hi guys,
> 
> I'm actually thinking we should consolidate all core templates / SDKs repos
> that are donated to Apache (i.e.
> https://github.com/search?q=org%3Aapache+PredictionIO) into one main repo
> (https://github.com/apache/incubator-predictionio)
> 
> The benefit may be that:
> 1. We can track Apache PredictionIO project activity in a unified place;
> 2. Making these templates part of the main repo encourages contributors to
> make sure they are all compatible with the latest version of PredictionIO
> core;
> 3. I don't see other projects (e.g. Mahout and its libraries) hosting core
> and components separately.
> 
> Thought?
> 
> Simon
> 
