Some great conversations about this topic. Thanks to the guys in Grok for
the great kickoff discussion.

As we can see from the preceding contributions, this is apt to become a
religious debate about exactly what sort of Open Source project we want to
be. That's grand but as Stewart so correctly points out at the very end of
his email (and I mean the last two sentences!), we should perhaps create a
design for what NuPIC is going to be when it grows up, and that will
illuminate the discussion about how it's released and stabilised. As Matt
said in the meeting, once you embark on one course, it's very hard to
change.

As Stewart says (and Matt alluded to it), Grok is now only one member of
the community, albeit a very important one and the primary user of the
NuPIC code. Grok's share of the application space using NuPIC as a
component will inevitably shrink, and so it is up to the whole community
(with Grok exercising a friendly veto power) to fill out the design
roadmap for NuPIC, starting now.

Here is my suggested starting point for that design:

1. (Obviously) NuPIC as a substrate for Grok's products and services, and
as a development platform for the community.
2. Binary NuPIC apps for Windows, MacOS and Linux. Point them at a CSV
file, run a wizard, and you get Cerebro-style results. You could also have
a scripting interface for these OSes which would allow actions to be taken
based on NuPIC's output.
3. NuPIC as a web app (again Cerebro comes to mind).
4. Visual NuPIC (Ian's Visualiser is a good starting point) as an
educational tool.
5. NuPIC with a REST API.
6. NuPIC provided as .rpm, .deb etc binary packages (and equivalent for
MacOS).
7. Packages for developing iOS and Android apps incorporating NuPIC.
8. Tools for creating and changing description and config files.
9. Language bindings for major languages (Ruby and PHP would be on my
wishlist).

The current NuPIC software is just the first of these products. At the
moment it serves its purpose well as the foundation for Grok and as the
development master for us. NuPIC is highly modular, and, as mentioned, we
can add new functionality to the core system and access it using parameters
and flags. The core, default functionality can be maintained and protected
so it can continue to be used by Grok and other experimenters.

I agree with Stewart that we may not need v1.2.3 style releases for the
core NuPIC software. If we have to make a change to the core API of NuPIC,
then we should make a release which provides the old API for those who
depend on it. I can't really envisage how this would happen at all often,
because the core API is very solid. Anyone advocating a change which might
break people's dependence on the API would have to have an outstanding
reason to do so, and get agreement from both Grok and the community.
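
To make "a release which provides the old API" a bit more concrete, here
is a rough sketch in Python. The function names are made up purely for
illustration and are not part of NuPIC; the idea is just that the old
entry point stays in place, warns the caller, and delegates to the new
one.

```python
import warnings


def compute_v2(record, learn=True):
    """Hypothetical new core entry point (not a real NuPIC function)."""
    return {"input": record, "learned": learn}


def compute(record):
    """Hypothetical old entry point, kept so existing callers keep working."""
    warnings.warn(
        "compute() is deprecated; use compute_v2() instead",
        DeprecationWarning,
        stacklevel=2,
    )
    # Delegate to the new API using the old default behaviour.
    return compute_v2(record, learn=True)
```

Existing code that calls compute() carries on unchanged and sees a
deprecation warning, while new code can move to compute_v2() at its own
pace.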

The other products listed above, however, are good candidates for v1.2.3
style versioning and release schedules. Each product should have a
maintainer (or team) who guides the feature roadmap for regular release (at
the minor level) and combines bug-fixes for a patch-level release (perhaps
based on sprints). These products have a consumer-facing element (even if
that consumer is a developer using NuPIC as a module in her app) which
NuPIC as it is now does not. The products are also clients of the core
NuPIC software, but not part of it.
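
As a rough illustration of the v1.2.3 scheme for these products, here is
a small Python sketch. The Version class and parse() helper are my own
invention for this email, and resetting the patch number on a minor
release is an assumption borrowed from common practice rather than
anything we have agreed.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A vMAJOR.MINOR.PATCH version number (illustrative only)."""
    major: int
    minor: int
    patch: int

    def __str__(self):
        return f"v{self.major}.{self.minor}.{self.patch}"

    def bump_patch(self):
        # Patch-level release: bug fixes only.
        return Version(self.major, self.minor, self.patch + 1)

    def bump_minor(self):
        # Minor release: new features; patch resets to 0 (assumed convention).
        return Version(self.major, self.minor + 1, 0)


def parse(text):
    """Parse a string such as 'v1.2.3' into a Version."""
    major, minor, patch = (int(part) for part in text.lstrip("v").split("."))
    return Version(major, minor, patch)
```

A maintainer collecting a sprint's bug fixes would go from
parse("v1.2.3") to bump_patch(), and a feature release would use
bump_minor(). Because Version is a tuple, versions compare in the
expected order, so v1.10.0 sorts after v1.9.5.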

Regards,

Fergal


On Thu, Nov 14, 2013 at 5:51 PM, stewart mackenzie <[email protected]> wrote:

> Hi All,
>
> my 2 cents worth:
>
> When following the forking versioning model you stabilize on a
> popularly used minor version. Forget stabilizing on a bugfix version;
> Git SHAs can do that. That version should be "famous", just as Python
> 2.7 rings a bell.
> Over time it starts to feel like it's time to stabilize the codebase:
> it has its own character and feel and, most importantly, it's already
> stable. That's when you stabilize. It's before the moment there are a
> whole bunch of new ideas and changes coming in from the community that
> could significantly change the code base. Terms like code freezing are
> a legacy of SVN's architectural failure, and time boxing is corporate
> stuff; it doesn't apply to open source projects that grow organically.
> You'll find when
> it's time to stabilize maintainers might start making sounds of
> holding back pull requests so that they can stabilize. When that
> happens the time is becoming ripe to stabilize. They hold off the pull
> requests because we are lazy; then a moment comes when you cannot hold
> off any more because the community is bitching away, and then you
> stabilize.
> Stabilization is for the larger community, who have millions of
> dollars of development time sitting on that version but don't want to
> update their code to follow the latest developments but want to use a
> stable version.
> It's for an engineer who is tight on time and money, needs to make a
> project work and doesn't want to futz about with using his own fork,
> but just use a stable fork on /numenta so that he can *also* take
> advantage of the community bugfixes as well as his own. He knows that
> there won't be any API changes. He knows that it is starting to catch
> dust but he doesn't care because it is stable and explicitly made so
> by having its own repo.
> In response to the chap sitting to the right of Matt. You won't carry
> around a stabilization repo. It's becoming legacy, your attention is
> focused on development. Besides as a maintainer you don't carry the
> codebase around with you on your laptop. You'll just click the green
> merge button if it follows coding standards and has an associated
> issue on stabilization. Then you move on to the bleeding edge where
> it's exciting.
>
> Stabilization isn't about convenience for the developer. It's about
> ease of community adoption. Critical for building a larger NuPIC
> community.
>
> C4 is meant to keep software projects alive past the lifespan of the
> main backing companies. It also makes sure that vendors can't lock the
> code down. This includes Grok, as strange as that sounds. In a way (now
> don't misunderstand me) Grok is NuPIC's greatest possible enemy, yet
> most loved one. NuPIC needs to take on its own identity, one that can
> nurture the world, and the world it. If it becomes the world nurturing
> Grok then trouble will start to boil. Let's not have an Open Office, or
> D Language complication. What a pity that would be. It hurts everyone.
> NuPIC is reaching adulthood. It'll soon take its own identity
> independent of Grok. Grok one day may die, but NuPIC will live on.
>
> Now I want to encourage everyone to think about something.
> * Imagine all those hundreds of package managers out there for
> Windows, Linux and MacOS. They can easily handle an explicit URL to
> automatically pull and build the code. Who wants to debug strange
> scripting languages to make them 'git checkout numenta-v1.3-stable'?
> It's good to make it easy so that those package maintainers can spread
> NuPIC. Package maintainers most likely will just want to download a
> zip file. This way they can point to a stabilization fork download URL
> without needing git dependencies that might or might not be supported
> in their scripting language. Github doesn't allow downloading zipped
> branches. (This point cannot be stressed heavily enough.)
> * Actually it isn't a lot of work maintaining stabilizations.
> Paradoxically you'll have more work having to shift gears if you have
> versioning using the branching model. What happens if you click the
> wrong branch in the dropdown, etc.?
> * The branching model encourages too many stabilizations at the wrong
> time or even worse stabilization by date releases.
> * Stabilization repos will settle down over time to the point that you
> will forget about what is going on with it - especially when you're
> ten years on. If you follow the branching versioning model you'll get
> a strange pull request to an old branch that you as a new developer
> know nothing about. What will you make of that Pull Request?
> Especially when development ten years on has completely whacked that
> section of code. You have to adjust your head.
> * Stabilizations encourage money and time pressed individuals to use
> your stable versions. Just because you, the developer, know how to
> change branches doesn't mean other people know how to get to them.
> Imagine someone from CVS needing to get a stable version. They'll
> start swearing. Using git forks allows you to get it with a simple
> 'git clone'. Most likely this person will want to download a zip file.
> * Stabilization is for legacy software that keeps those old code bases
> workable.
> * Stabilizations build their own subset communities. They are
> knowledgeable about what is wrong with that code base. Development
> fork developers want the bleeding edge action. Stabilization fork
> developers want stability. They are earning their money from this
> stable code base.
>
> Response to Marek:
> * It really doesn't matter that forking "doesn't follow the git model"
> - the git model some say is actually broken - (I agree). What is
> important is that you make it painfully easy for the community to use
> your software so that your software can spread easily.
> * Space is cheap, github won't complain. Fork away!
> * You should have like 10 branches on your local development fork. But
> the chances are you won't have a single stabilization unless you choose
> to build something around something stable that is. Then it'll only be
> one.
> * You're conflating development and stabilization. Development is done
> on the development fork against master. Stabilization too has one
> branch -> master and only bugfixes go in there. Those bugfixes can
> easily be cherrypicked to development if the bug is present in
> development.
> * Don't worry about release dates. Stabilize when it "feels" like it's time.
> * When moving on past stabilization, issues are essentially for the
> development repo. You don't want issues being created for a bit of
> code that was removed 10 years ago, then require everyone to shift
> gears, checkout the code base and apply the patch.
> * Bugfix issues should be created on the stabilization fork. It gives
> a clear history of bugfixes associated with it. (then you can cherry
> pick if it's pertinent to development and vice versa)
>
> Regarding development talk in the video:
> Matt understands that C4 isn't just random people throwing crazy
> code at master. Issues *have* to be created *before* the pull request
> is created. This gets the community's eyeballs on the topic. Days and
> weeks can go by with the wisdom of the crowds mulling over the problem
> in the shower, till a eureka moment is had, this gets communicated on
> the issue page and the group moves on. That's when the *expected* pull
> request comes in.
> This process frees up maintainers who aren't supposed to do deep code
> review nor bugfixing at all (travis makes sure it builds). Maintainers
> just make sure the Pull Request and coding standards are met. *If*
> there is a bug in a Pull Request it doesn't matter, the maintainer
> commits it, another issue is created and another pull-request is
> submitted to fix it (or it can just be on the same issue). This
> creates a kind of Wikipedia (in the early days) type of effect,
> whereby people fix others' work: a hive of ants working together, all
> looking to the conversations on the issues board for the path ahead.
> This gets everyone on the same page and doesn't wear out maintainers
> who think wtf is that new thing touching such a core bit of code?
>
> Regarding algorithm changes in the video towards the end:
> Don't conflate the conversation on the issues with the conversation on
> the pull request.
> master will take the 'best' algorithm. There was talk about easily
> flipping a switch to be able to test the differences. This stage is
> still at the testing phase, there shouldn't even be a pull request in
> the pipeline. The developers should have remotes of those
> non-canonical forked branches. This allows you to easily flip that
> switch (change branches) to determine which is the best algorithm.
> Once there is community consensus that 1 of the 3 potential solutions
> is the right way to go then this will be elaborated on in the issue.
> The correct pull request + modification? are used and the other
> branches get whacked by their owners. Issues are where the above
> conversation should take place *not* the pull request page.
>
> I personally advocate a rolling release whereby the APIs don't change
> as the C4 states. It simplifies things greatly and keeps mindshare in
> one place, yet keeps the stable seekers happy as they believe in the
> maintainers who are hard as nails and disciplined about contribution
> guidelines. (you need a German on the maintainer list btw - I'm not
> joking) This way legacy software can still use the tight optimized
> always improving code base. It makes life simple for _everyone_ given
> maintainers and contributors know their shit. I do not advocate a
> branching model, and prefer a forked model if I can only choose between the
> two. NuPIC being a very algorithmic type project that does 2 (soon 3)
> things can keep a very simple API. It has a small API but mega stuff
> hidden away on the inside. This is perfect for a rolling release.
> Though! I would only declare NuPIC official and ready to build on once
> the C++ core is stripped out, regression tests are in, language
> bindings have their own repo (one for each language binding). I
> wouldn't bother with considering using stabilizations or not till the
> core is out and some _serious_ talk about API is had. That would be
> the beginning for me.
>
> Kind regards
> Stewart
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
>



-- 

Fergal Byrne, Brenter IT

<http://www.examsupport.ie>http://inbits.com - Better Living through
Thoughtful Technology

e:[email protected] t:+353 83 4214179
Formerly of Adnet [email protected] http://www.adnet.ie
