Re: [nupic-dev] NuPIC versioning and releases

stewart mackenzie Thu, 14 Nov 2013 09:55:54 -0800

Hi All,

My 2 cents' worth:


When following the forking versioning model, you stabilize on a
popularly used minor version. Forget stabilizing on bugfix versions;
Git SHAs can do that. That version should be "famous", just as
Python 2.7 rings a bell.
Over time the codebase feels like it's time to stabilize: it has its
own character and feel and, most importantly, it's already stable.
That's when you stabilize. It's just before the moment when a whole
bunch of new ideas and changes come in from the community that could
significantly change the code base. Terms like "code freezing" are a
legacy of SVN's architectural failure, and time boxing is corporate
stuff; neither applies to open source projects that grow organically.
You'll find that when it's time to stabilize, maintainers start making
sounds about holding back pull requests so that they can stabilize.
When that happens, the time is becoming ripe. Maintainers hold off the
pull requests because we are lazy; then a moment comes when you cannot
hold any more because the community is bitching away. Then you
stabilize.
Stabilization is for the larger community, who have millions of
dollars of development time sitting on that version and don't want to
update their code to follow the latest developments; they want to use
a stable version.
It's for an engineer who is tight on time and money, needs to make a
project work, and doesn't want to futz about with his own fork. He
just wants to use a stable fork on /numenta so that he can take
advantage of the community's bugfixes as well as his own. He knows that
there won't be any API changes. He knows that it is starting to gather
dust, but he doesn't care because it is stable and explicitly made so
by having its own repo.
In response to the chap sitting to the right of Matt: you won't carry
around a stabilization repo. It's becoming legacy; your attention is
focused on development. Besides, as a maintainer you don't carry the
codebase around with you on your laptop. You'll just click the green
merge button if the pull request follows coding standards and has an
associated issue on the stabilization repo. Then you move on to the
bleeding edge, where it's exciting.

Stabilization isn't about convenience for the developer. It's about
ease of community adoption, which is critical for building a larger
NuPIC community.

C4 is meant to keep software projects alive past the lifespan of their
main backing companies. It also makes sure that vendors can't lock the
code down. This includes Grok, as strange as that sounds. In a way (now
don't misunderstand me) Grok is NuPIC's greatest possible enemy, yet
its most loved one. NuPIC needs to take on its own identity, one that
can nurture the world, and the world it. If it becomes the world
nurturing Grok, then trouble will start to boil. Let's not have an
OpenOffice or D language complication. What a pity that would be. It
hurts everyone. NuPIC is reaching adulthood. It'll soon take on its own
identity, independent of Grok. Grok may one day die, but NuPIC will
live on.

Now I want to encourage everyone to think about something.
* Imagine all those hundreds of package managers out there for
Windows, Linux and Mac OS: they can easily handle an explicit URL to
automatically pull and build the code. Who wants to debug strange
scripting languages to make them run
'git checkout numenta-v1.3-stable'? It's good to make it easy so that
those package maintainers can spread NuPIC. Package maintainers will
most likely just want to download a zip file. This way they can point
to a stabilization fork's download URL without needing git
dependencies that might or might not be supported in their scripting
language. GitHub doesn't allow downloading zipped branches. (This
point cannot be stressed heavily enough.)
* Actually, it isn't a lot of work maintaining stabilizations.
Paradoxically, you'll have more work shifting gears if you version
using the branching model. What happens if you click the wrong branch
in the dropdown, etc.?
* The branching model encourages too many stabilizations at the wrong
time or, even worse, stabilization by release date.
* Stabilization repos will settle down over time, to the point that
you will forget about what is going on with them, especially when
you're ten years on. If you follow the branching versioning model,
you'll get a strange pull request to an old branch that you, as a new
developer, know nothing about. What will you make of that pull
request, especially when development ten years on has completely
whacked that section of code? You have to adjust your head.
* Stabilizations encourage money- and time-pressed individuals to use
your stable versions. Just because you, the developer, know how to
change branches doesn't mean other people know how to get to them.
Imagine someone coming from CVS who needs to get a stable version.
They'll start swearing. Using git forks allows them to get it with a
simple 'git clone'. Most likely this person will want to download a
zip file.
* Stabilization is for legacy software; it keeps those old code bases
workable.
* Stabilizations build their own subset communities. These communities
are knowledgeable about what is wrong with that code base. Development
fork developers want the bleeding edge action. Stabilization fork
developers want stability. They are earning their money from this
stable code base.

Response to Marek:
* It really doesn't matter that forking "doesn't follow the git model"
(some say the git model is actually broken, and I agree). What is
important is that you make it painfully easy for the community to use
your software so that your software can spread easily.
* Space is cheap; GitHub won't complain. Fork away!
* You should have something like 10 branches on your local development
fork. But the chances are you won't have a single stabilization fork
unless, that is, you choose to build something around something
stable. Then it'll only be one.
* You're conflating development and stabilization. Development is done
on the development fork against master. Stabilization too has one
branch, master, and only bugfixes go in there. Those bugfixes can
easily be cherry-picked to development if the bug is present in
development.
* Don't worry about release dates. Stabilize when it "feels" like it's
time.
* Once you have moved on past a stabilization, issues are essentially
for the development repo. You don't want issues being created for a
bit of code that was removed 10 years ago, which then require everyone
to shift gears, check out the code base and apply the patch.
* Bugfix issues should be created on the stabilization fork. It gives
a clear history of the bugfixes associated with it. (Then you can
cherry-pick if it's pertinent to development, and vice versa.)

Regarding development talk in the video:
Matt understands that C4 isn't just random people throwing crazy
code at master. Issues *have* to be created *before* the pull request
is created. This gets the community's eyeballs on the topic. Days and
weeks can go by with the wisdom of the crowd mulling over the problem
in the shower until a eureka moment is had. This gets communicated on
the issue page and the group moves on. That's when the *expected* pull
request comes in.
This process frees up maintainers, who aren't supposed to do deep code
review or bugfixing at all (Travis makes sure it builds). Maintainers
just make sure the pull request requirements and coding standards are
met. *If* there is a bug in a pull request it doesn't matter: the
maintainer merges it, another issue is created and another pull
request is submitted to fix it (or it can just stay on the same
issue). This creates a kind of Wikipedia (in the early days) effect,
whereby people fix each other's work: a hive of ants working together,
all looking to the conversations on the issues board for the path
ahead. This gets everyone on the same page and doesn't wear out
maintainers who think "wtf is that new thing touching such a core bit
of code?"
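
The maintainer's side of that process can be sketched in a few lines.
This is only an illustration of the rule as described above, not
anything C4 or NuPIC ships: the PullRequest class, its fields and the
example issue number are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PullRequest:
    """Hypothetical, simplified view of a pull request."""
    title: str
    issue_number: Optional[int]    # issue opened *before* the PR, if any
    builds: bool                   # result reported by CI (e.g. Travis)
    meets_coding_standards: bool


def maintainer_should_merge(pr: PullRequest) -> bool:
    """Return True when the process checks pass.

    Deliberately no deep code review: a bug that slips through is
    handled by a follow-up issue and pull request.
    """
    return (
        pr.issue_number is not None
        and pr.builds
        and pr.meets_coding_standards
    )


if __name__ == "__main__":
    # Illustrative values only.
    expected = PullRequest("Fix encoder bounds", 123, True, True)
    surprise = PullRequest("Rewrite core", None, True, True)
    print(maintainer_should_merge(expected))
    print(maintainer_should_merge(surprise))
```

The second pull request is held back only because nobody opened an
issue first, not because of anything in the code itself.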

Regarding algorithm changes in the video towards the end:
Don't conflate the conversation on the issues with the conversation on
the pull request.
master will take the 'best' algorithm. There was talk about easily
flipping a switch to be able to test the differences. At that point
things are still in the testing phase; there shouldn't even be a pull
request in the pipeline. The developers should have remotes for those
non-canonical forked branches. This allows you to easily flip that
switch (change branches) to determine which is the best algorithm.
Once there is community consensus that 1 of the 3 potential solutions
is the right way to go, this will be elaborated on in the issue.
The correct pull request (+ modifications?) is used and the other
branches get whacked by their owners. Issues are where the above
conversation should take place, *not* the pull request page.

I personally advocate a rolling release whereby the APIs don't change,
as C4 states. It simplifies things greatly and keeps mindshare in
one place, yet keeps the stability seekers happy, as they trust
maintainers who are hard as nails and disciplined about contribution
guidelines. (You need a German on the maintainer list btw. I'm not
joking.) This way legacy software can still use the tight, optimized,
always-improving code base. It makes life simple for _everyone_,
provided maintainers and contributors know their shit. I do not
advocate a branching model, and prefer a forked model if I can only
choose between the two. NuPIC, being a very algorithmic type of
project that does 2 (soon 3) things, can keep a very simple API. It
has a small API but mega stuff hidden away on the inside. This is
perfect for a rolling release.
Though! I would only declare NuPIC official and ready to build on once
the C++ core is stripped out, regression tests are in, and the language
bindings have their own repos (one for each language binding). I
wouldn't bother considering whether to use stabilizations until the
core is out and some _serious_ talk about the API is had. That would
be the beginning for me.
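
One way a rolling release could hold itself to "the APIs don't change"
is a regression test that compares the public names a module exposes
against a baseline recorded at an earlier release. The following is a
minimal sketch under that assumption; the function names, the stand-in
module and the baseline are hypothetical and are not NuPIC's actual
API.

```python
import types


def public_names(module: types.ModuleType) -> set:
    """Names a module exposes publicly (no leading underscore)."""
    return {name for name in dir(module) if not name.startswith("_")}


def removed_from_api(baseline: set, module: types.ModuleType) -> set:
    """Baseline names that the module no longer provides."""
    return baseline - public_names(module)


if __name__ == "__main__":
    # Stand-in module built in memory; a real check would import the
    # package under test instead.
    demo = types.ModuleType("demo_api")
    demo.encode = lambda value: value
    demo.predict = lambda value: value
    demo._helper = lambda: None

    baseline = {"encode", "predict"}   # hypothetical recorded baseline
    print(removed_from_api(baseline, demo))

    del demo.predict                   # simulate a breaking change
    print(removed_from_api(baseline, demo))
```

This only catches removed or renamed names. Changed signatures or
changed behaviour would need the regression tests mentioned above.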

Kind regards
Stewart

