On Sun, Oct 20, 2013 at 10:40:32PM -0700, Ian Piumarta wrote:
> > * Is the idea that everyone should be doing/forking his own,
> > CipherSaber style, or is there an intent to share and build a
> > common platform?
> 
> I'd love to build a common platform.  Maru is in particular trying
> to be malleable at the very lowest levels, so any special interest
> that cannot be accommodated easily within the common platform would
> be a strong indicator of a deficiency within the platform that
> should be addressed rather than disinherited.
> 
> Where there is a choice between forking to add a fix or feature and
> clamouring to get that fix or feature adopted into some kind of
> central repository, I believe clamouring always benefits vastly more
> people in the long run.  I intensely dislike the GitHub/Gitorious
> 'clone fest' mindset because it dilutes and destroys progress,
> encouraging trivial imitation rather than radical innovation --
> which, if there is any at all, finds itself fighting an intractably
> high noise floor.  Forking will always split a community (even if
> one side is only left with a community of one) so if you want what
> you are doing to be relevant one day and benefit the most people
> then trying to limit forks and clones is a good thing, IMO.  If
> anyone wants to do something hugely incompatible with Maru, with no
> guarantee of eventual success, I'm happy to make a branch in the
> repository for it.  While public forking might be a viable model for
> development that is closely linked with and intends to contribute back
> to a mature project, where the gravitational field of the parent
> repository is irresistibly high, I don't think it is very helpful or
> efficient for getting a small and unestablished project off the
> ground.

If I may, there may be one non-trivial argument in favour of forking:
learning.  I found that building my own stuff helps me learn.  For
instance, I had to write my own meta compilers[1] to really understand
the magic behind parsing expression grammars.  Reading OMeta's source
code[2] simply wasn't enough.
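
To give an idea of what finally made it click for me, here is roughly
the kind of minimal core I have in mind: a tiny PEG sketch in Python,
with made-up combinator names (this is not OMeta's code, nor my
metacompilers', just an illustration of ordered choice and
backtracking):

    # A PEG is just functions: each combinator takes (text, i) and
    # returns the new position on success, or None on failure.

    def lit(s):                        # match a literal string
        def parse(text, i):
            return i + len(s) if text.startswith(s, i) else None
        return parse

    def empty(text, i):                # always succeeds, consumes nothing
        return i

    def seq(*parsers):                 # match all, in order
        def parse(text, i):
            for p in parsers:
                i = p(text, i)
                if i is None:
                    return None
            return i
        return parse

    def choice(*parsers):              # ordered choice: first match wins
        def parse(text, i):
            for p in parsers:
                j = p(text, i)
                if j is not None:
                    return j
            return None
        return parse

    # Balanced parentheses:  S <- '(' S ')' S / ''
    def S(text, i):
        return choice(seq(lit('('), S, lit(')'), S), empty)(text, i)

    assert S("(()())", 0) == 6         # consumed the whole string

Seeing that ordered choice plus backtracking is essentially all the
machinery there is was the insight that reading alone never gave me.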

I'm now doing the same with Earley Parsing[3].  I have a working toy
recogniser, and now seek to reconstruct a tree (or several).  Since
the Wikipedia article didn't help much on that front, I have sought
the simplest implementation I could find[4].  But again, no luck
understanding how it reconstructs the parse tree.  (Can't I read
source code?)

So I tried to reconstruct a parse tree manually, from the states
generated by my recogniser.  Surprisingly, it worked.  The Wikipedia
article says that "the recogniser can be easily modified to create a
parse tree as it recognises, and in that way can be turned into a
parser".
Turns out I don't even need to.  And now, I think I understand the
reconstruction algorithm well enough to implement it.  I predict some
minor trouble with ambiguous parses, though.
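
For the record, the reconstruction I have in mind goes something like
the Python sketch below.  The (lhs, rhs, start, end) item format is
only an assumption about what a recogniser's completed states might
look like, not my actual toy:

    # Hypothetical item format: a completed item (lhs, rhs, start, end)
    # means the rule  lhs -> rhs  derived words[start:end].

    def build_tree(completed, terminals, words, lhs, start, end,
                   busy=frozenset()):
        # Return one tree (lhs, children) covering words[start:end], or None.
        # `busy` blocks re-entering the same sub-problem, so cyclic rules
        # cannot send the search into an infinite loop.
        key = (lhs, start, end)
        if key in busy:
            return None
        for (head, rhs, s, e) in completed:
            if head == lhs and s == start and e == end:
                kids = match_rhs(completed, terminals, words, rhs,
                                 start, end, busy | {key})
                if kids is not None:
                    return (lhs, kids)
        return None

    def match_rhs(completed, terminals, words, rhs, start, end, busy):
        # Cover words[start:end] with one child per symbol of rhs.
        if not rhs:
            return [] if start == end else None
        sym, rest = rhs[0], rhs[1:]
        if sym in terminals:               # terminal: must equal the next word
            if start < end and words[start] == sym:
                tail = match_rhs(completed, terminals, words, rest,
                                 start + 1, end, busy)
                if tail is not None:
                    return [sym] + tail
            return None
        for (head, _, s, e) in completed:  # non-terminal: try each item
            if head == sym and s == start and e <= end:
                child = build_tree(completed, terminals, words, sym, s, e, busy)
                if child is not None:
                    tail = match_rhs(completed, terminals, words, rest,
                                     e, end, busy)
                    if tail is not None:
                        return [child] + tail
        return None

    # Items as a recogniser might emit them for  S -> S '+' S | 'n'
    # on the ambiguous input "n + n + n":
    terminals = {'+', 'n'}
    words = ['n', '+', 'n', '+', 'n']
    completed = [
        ('S', ('n',), 0, 1), ('S', ('n',), 2, 3), ('S', ('n',), 4, 5),
        ('S', ('S', '+', 'S'), 0, 3), ('S', ('S', '+', 'S'), 2, 5),
        ('S', ('S', '+', 'S'), 0, 5),
    ]
    print(build_tree(completed, terminals, words, 'S', 0, 5))

Note that the ambiguous input above still comes out as a single tree
(whichever item the loops find first); enumerating all the trees
instead of stopping at the first is exactly where I expect that minor
trouble.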

[1]: http://loup-vaillant.fr/projects/metacompilers/
[2]: http://www.tinlizzie.org/ometa/
[3]: https://en.wikipedia.org/wiki/Earley_parser
[4]: https://github.com/tomerfiliba/tau/blob/master/earley3.py

---

Now, back to Maru.

A central repository for serious work is probably best.  But we also
need a reliable way to learn.  I simply cannot trust myself with a
COLA system until I know I can build one, for two reasons:

 - It's a new, untested, immature technology (or so many people will
   think).  If it breaks, I must be able to fix it myself, because,
   like Linus, Ian probably doesn't scale.
 - Its abstractions are "leaky" by design.  Unlike with C or Java, the
   actual machine behind the language is for me to take over.  There
   is no layer I cannot get past.  (Or so I guess.)

So if we want something like Maru to have a chance to spread, we
probably need to favour deep learning as well.  Surface understanding
is likely to get its users into trouble, and they will then blame the
tool.  Off the top of my head, I see a few ways one could learn Maru:

 - Dive into the source code of the real thing.  I may try, but I will
   likely fail miserably, just like I did with OMeta.
 - Read scientific papers.  I gathered a surface understanding of some
   principles, but nothing solid yet.
 - Build a toy from scratch.  I'll probably do that, since it worked
   so far.
 - Learn from an existing toy.  That toy would be the "useful fork".
   Bonus points if the toy can lift itself into the real thing.  Extra
   bonus points if the real thing is _actually_ lifted up from the
   toy.
 - Learn from tutorials, like "Write yourself a Maru in 48 hours"[5].
   Bonus points if there's a second tutorial to lift your toy into
   something serious.

[5]: https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours

---

Here is how I imagine my dream world.  It would be a central
repository with:

 - A toy Maru, optimised for clarity.
 - A tutorial for writing your own toy.
 - A serious Maru, lifted up from the toy.
 - A tutorial for lifting your own toy up.
 - The hand-written bootstrap compilers (for understanding, and the
   Trusting Trust problem).

Does this dream world sound possible?  Is it even a good idea?

Loup.