Re: [Simile-Widgets] alpha release of exhibit 2.3

David Karger Sat, 04 Feb 2012 21:57:55 -0800





On 2/4/2012 8:15 PM, Ryan Lee wrote:

On 2012-02-03 11:42 , David Karger wrote:

OK, I'm going to try to unpack a few different arguments.

I'll respond inline below, but I should note that there are several
other things going on in this thread that have wider repercussions
within this community than whether this proposed release goes forward.

First and foremost is closing all back doors into the privilege of
becoming a code committer on the project.  While I go on more about this
under your point 4 and others, I'd like to hear your response.  I've
brought it up in every message I've sent and am bringing it up again.  I
do very much believe this general policy will have little to no impact
on your research goals and methods, but if you think differently, that
will have to be addressed.

I am comfortable with any community-determined process for admitting
core committers. To date there hasn't been one, so I've been left to
make decisions on my own.


The second, as I think we've teased out in these discussions, is how
best to develop and present new extension level features to users.  It's
becoming clear to me that most extensions should be released on their
own schedule from the core of Exhibit.  To date, the ones colocated with
Exhibit have been in lockstep.  Perhaps there is a subset of these that
should continue to be held in sync with core, but there may be some that
we've got that could be cut loose to follow their own development flow.
  This suggests a bit of work is needed for extensions to identify which
version of core they're meant to work with.
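
One way an extension might identify the versions of core it is meant to
work with is sketched below. This is only an illustration under stated
assumptions: the `mapExtension` manifest, its `minCore` and `maxCore`
fields, and the helper functions are all hypothetical, and nothing in
Exhibit 2 or 3 is known to define them.

```javascript
// Hypothetical manifest an extension could ship with; not an Exhibit format.
const mapExtension = { name: "map", minCore: "2.2.0", maxCore: "2.3.0" };

// Split a dotted version string into numeric parts, e.g. "2.2.0" -> [2, 2, 0].
function parseVersion(version) {
  return version.split(".").map(Number);
}

// Return -1, 0, or 1 when a is lower than, equal to, or higher than b.
// Missing trailing parts count as 0, so "2.3" equals "2.3.0".
function compareVersions(a, b) {
  const pa = parseVersion(a);
  const pb = parseVersion(b);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// An extension is compatible when core falls inside its declared range.
function isCompatible(extension, coreVersion) {
  return (
    compareVersions(coreVersion, extension.minCore) >= 0 &&
    compareVersions(coreVersion, extension.maxCore) <= 0
  );
}
```

With a check like this, a loader could warn the page author when an
extension and core are mismatched, rather than failing silently.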

This discussion is highlighting a worrying tension between the developer
community and the user community. I hope we can figure out a way to
release this tension. You misread my analogy to medical research; it
was not trying to contrast maintenance releases with experimental
releases, but rather was trying to contrast work aimed at future benefit
(careful gatekeeping to build a robust development community) with work
aimed at addressing immediate needs of current users.

When I saw David's earliest version of Exhibit, an aspect that really
inspired me was the idea that here is something that a non-developer, an
HTML novice, can use. And that is where I have continued to focus my
energy. I worry about any changes that will impact such naive users of
exhibit. It's fine to tell a developer "well, take this core from this
site, then you can combine with this version of this extension from that
site (but not another version), and if you want this functionality I'm
sure that there's an extension for that somewhere else if you can find
it". But that's a complete non-starter for naive users. I think there's
a noticeable gap between telling such people "put type='application/xml'
in your link tag" and "to import xml you need to include such and such
script in your document". I foresee us heading in the direction of
jquery-ui, which has its own web application that lets you build and
download to host your own custom script bundle. Fine for developers,
but a disaster for naive users.

1.  It seems we disagree on whether or not the release I am proposing is
a fork.  I think this is based on a difference of opinion about whether
a fork is defined by an _action_ or by an _intention_.  I do not believe
this is a fork, because there is no intention (and, I believe, no
effect) to divide community effort.  I am fully supportive of the
completion of and ultimate migration to E3.

It's good to hear your intentions restated in this form; nevertheless,
forking *is* an action.  Well-intentioned forks still have all the
consequences of forking code.  It appears we are still at odds on what
it means.

While I've mostly avoided discussing the specifics of this proposed
release to avoid ratholes on debating the relative merits of each, I
will pick on two to illustrate my point.  You created ex:getter and
ex:parser for importers, a design problem I also noted and resolved in
3.0 - completely differently (via an API).  There might be a way to
automatically resolve it when the two are joined, or it might be a
nightmare that forces every importer developer to implement in such a
way as to answer to both.  You never intended it, but there is a
potential mire to bog down in if both are out there as official solutions.

As a minor point, I am suspicious of anything that is solved "via an
API" since (given my interests) that suggests it will not be accessible
to naive users.


I would also pick on the current implementation of logging.  It took the
easy way out and is as a result way out of line.  It should not be in
core.  It should instead be made into an extension users can load off of
a csail.mit.edu server since it already is opt-in and requires editing
the HTML.  The Exhibit parameters introduced with logging should not
exist in core.  I am rather certain it is difficult and ugly to do what
I just suggested in Exhibit 2.  It should be trivial in Exhibit 3.  Put
a hold on introducing your research team's logging to a release; it
definitely doesn't belong there.  (I haven't been tracking this closely,
but you may also want to warn Exhibit authors using logging to warn
their own users that their activity is being recorded).

Here I have to disagree on technical grounds. I can see absolutely no
way to accomplish logging without tinkering with core. We're trying to
record actions taken in the exhibit core. To do that from outside core
would require a script that, after loading core, redefines every single
logged function inside core to first log and then call the in-core
function. This seems incredibly fragile and will break any time
somebody changes core.
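
The wrap-and-delegate approach described above might be sketched as
follows. This is only an illustration under stated assumptions: `core`,
`applyFilter`, `selectItem`, and `renamedLater` are hypothetical
stand-ins, not functions that actually exist in Exhibit.

```javascript
// Hypothetical stand-in for a core object; not Exhibit's real API.
const core = {
  applyFilter(name) { return "filtered:" + name; },
  selectItem(id) { return "selected:" + id; },
};

// Collected log records; a real logger would send these somewhere.
const logEntries = [];

// Replace each named method with a wrapper that logs, then delegates
// to the original in-core function.
function wrapWithLogging(target, methodNames, log) {
  for (const name of methodNames) {
    const original = target[name];
    if (typeof original !== "function") {
      // A function renamed or removed in core can no longer be wrapped.
      log.push({ warning: "missing method", name });
      continue;
    }
    target[name] = function (...args) {
      log.push({ name, args });
      return original.apply(this, args);
    };
  }
}

// "renamedLater" plays the role of a function that core no longer defines.
wrapWithLogging(core, ["applyFilter", "selectItem", "renamedLater"], logEntries);
const result = core.applyFilter("price");
```

The list of method names has to be kept in step with core by hand, which
is the fragility at issue: any rename inside core leaves a gap in the
log unless the external script is updated too.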

To the separate concern of adding exhibit parameters, I am comfortable
changing the way we _activate_ the logging functionality, replacing the
current parameter setting with a script, e.g.
<script src="http://csail/logger.js"> which would turn on the logging.

To the question of warning users, note that we chose an opt-in mechanism
because we considered it very much the role of the web author to decide
whether to log. Subsequently, the web author has their own decision to
make about whether to give the page's visitors a choice about
logging---however, this is exactly the same decision as they make when
they decide to put the google analytics urchin into their pages, and I
don't think _anyone_ bothers to ask their users' permission for that.
The logger collects exactly the same kind of data as those analytics
tools; in fact if exhibit was server based and generated a new page for
each interface action, we could just use google analytics to gather the
same data.

2. You object that "making experimental commitments" may not coincide
with "making a great release".  This may be true, but remember that
there is no intention to ever make another "great release" of E2.
Instead, we expect the next great release to be E3.   With E2 abandoned,
there seems no downside to releasing some experimental functionality on it.

What I was saying too backhandedly was that experimental commits
probably make for a terrible point release, not that I think Exhibit 2
now needs a great release to have a release.

It's rather important to point out that to date, Exhibit releases have
been of the type that suggest if a feature is in, it's in it to stay.
That's quite contrary to experimenting.  Experiments are allowed to fail
in order to provide interesting information.  Releases of the type
Exhibit has undergone have not seemed to me to be intended for
experimenting, where failed features get removed.  Releasing web-based
services might carry some extra caveats about going too experimental, too.

I should note on the side that I do believe in releasing early and often
and unknowingly adding bugs and making mistakes to further a project -
as long as there is appropriate gatekeeping in place.  And maybe some
test harnesses.

I agree that past releases have been more reflective of future
commitment. However, as I said, we have all made quite clear that E3 is
the future of exhibit.

3.  You're right that I believe this code should be released because it
is there.  Of course this release could happen in different ways.  I
suppose I could put a copy of my code at
http://people.csail.mit.edu/karger/exhibit-api.js , but really why is
this any better?  The code still _exists_.  Isn't it just as much a fork
that way?  On the other hand, the option of deleting my changes seems
damaging, since I think they make the tool better and thus help our
current users---see my point 7 below.

There's a rather substantive difference (of thousands of existing users)
between api.simile-widgets.org and people.csail.mit.edu/karger.
Everybody is already free to "fork" hosting in the sense you describe,
I'm not sure it pertains to this discussion.  What you're proposing is
an *official* fork release.

If we can resolve this whole debate by serving the release from
simile-widgets.org/exhibit-mit , or by reactivating simile.mit.edu and
hosting the new release there, I'm all for it. Of course, I'll
recommend everyone switch to it, since I think it's better. And we'll
probably need a separate copy of the documentation wiki, to describe the
features that aren't in the official release. And a separate web-site
to which I can direct people who are interested in exhibit and may want
to use the added features. This seems much more forky to me than
anything we've yet discussed.

4.  "How has the liverpool group participated in the community?"  By
creating a pretty cool extension.  We might wish that they would
participate more in the discussion group, but I'd rather appreciate
their contribution than feel bad about what they haven't done.

That's a false dichotomy.  One of the points I'll keep drumming at you
is that people who aren't yet part of this community can't be let in to
make commits without first entering into it through the front door,
otherwise it doesn't work as a community.  That fact doesn't diminish
their eagerness to contribute or the usefulness of their work; it means
that work never should have come in the back door through you (or any of
us).  Officially releasing a non/back door participant's work is not a
good signal to the rest of the community, the way letting unknown
committers commit to trunk is not a good signal - it's just a lot bigger
of a signal.

In the future when newcomers approach any of us (or, I suppose, are
approached by us) regarding improving Exhibit, I hope our line is
something like, "Excellent, join the list, fork on GitHub, and send in
your pull requests" - just like everybody else.

So this helps me understand another issue to tease apart. Liverpool
produced an *extension* and never touched the exhibit core. In fact,
all I was thinking about when I provided commit access was that this
simplified the *hosting* of the extension on simile-widgets.org. I
understand that this had *development authority* implications I wasn't
considering. I'd be perfectly happy to resolve them by moving the
extension to a different subversion repository. However, in the
interest of keeping things simple for those naive users, I would still
want to *serve* the extension from simile-widgets.org.

And why simile-widgets.org instead of csail.mit.edu/karger ?
Resources. I'm happy to leverage the fact that the MIT libraries are
maintaining a well-run, high capacity server that can deliver our
scripts. Kenzie may disapprove of my taking advantage of MIT libraries
this way, but as long as they are willing to do the management, I don't
have to find the resources to do it.

5.  To the issue of what should be in core.  Note that neither the
data-editor from liverpool nor the map view are "in core".  They are

I am aware of that.  That's why I don't think it would be so burdensome
to shift it out of a release.

extensions.  I suppose we could have asked liverpool to host the data
editing extension somewhere else, but that's putting a big and as far as
I can see unnecessary barrier on the contribution.  Similarly the map
extension is outside core.  As proof, exhibit and the map extension can
be mixed and matched between 2.2 and trunk: the 2.2 map extension will
run fine with the current trunk (one user is already doing so), and the
trunk map extension should work fine with E2.2 core (though I haven't
tried that mix).    I can see some weight to an argument that I should
have created a new extension---mapv3-extension.js .  A similar argument
could be made that each data-format importer should be its own script
that must be explicitly included.  But such extensions would only be
available to people who explicitly ask for them, which bumps into the
next consideration:

Part of forming a community does involve barriers.  Yes, it's open to
the world, but membership requires some proof of interest and capacity.
  It's as low as it can get, but it's there.  Were this group not
suddenly granted access to trunk carte blanche, they would have
solicited community input on the extension - by putting it up and
hosting it somewhere.  If that's about the measure of how things will
progress in the future, then I'm asking you to not make special
exceptions for it now.

6.  I think the really big question is whether, at some point, we should
switch over simile-widgets.org/exhibit/api to use the 2.3 version
instead of 2.2.  Obviously, 2.2 will still exist at
api.simile-widget.org/exhibit/2.2.0 , so it isn't a matter of
eliminating something people depend on.  But a switchover would mean a
"forced upgrade" for anyone who isn't paying attention to their
exhibits.  Is this a good thing or a bad thing?  Thinking about the
changes, we have (i) maintenance that should rescue people from bugs
(and painter failures) and (ii) new features that they won't see at all
if they don't choose to use them.  On the downside, we have the risk
that I have introduced new bugs, or incompatibilities with certain
customizations people have already made.  That would obviously be bad,
which is why I want to take the time to let people test trunk
voluntarily before we do a switchover.    I believe that at a certain
point, we will have enough evidence that 2.3 is "safe" to make the
(known) benefits outweigh the (unknown) risks.

You understate the impact of new features, not all of which are outside
of core.  The code is there whether one particular user puts it into
play or not.  The fact that the code is there binds you to its
existence, and you have to wrestle with the difficult questions of how
to deal with its existence in the future.

For an outside example, HTML5 gets to wrestle with XHTML/XML namespaces.
  It's not at all an easy matter to resolve, because lots of people ended
up adopting namespaces, and HTML5 isn't XML.  Some people are simply
never going to leave XHTML because of it.  The current solution is that
a very few "xmlns:xx" attributes exist, for which the prefix "xmlns:" is
meaningless and just part of the name.  I'd like to avoid that kind of
thing.

And some people may never leave exhibit 2 because they are using some
feature only it has. But in the meantime they have a feature they can
use. Again, this comes back to the medicine metaphor. If I can provide
this needed feature to users now, when it's impossible to provide in E3,
then I'm helping them *now*. That is a different but valuable
contribution distinct from things that might help them *later*.

7.  This also ties to your question, "Do I intend for thousands of users
to move from E2.2 to E2.3"?  I believe that the painter issue has caused
widespread problems in the past, and that other bugs have been a general
nuisance, and therefore think there is benefit for the entire community
to make the shift.  That's why I think ultimately it will be better for
the default exhibit api to be 2.3.   This does have to be balanced against
the worry/risk that changes may cause some 2.2 exhibits to stop
working---which is why I want to test for a while before making the switch.

Your answer revolves around maintenance, to which I never objected.  I
haven't made it clear (avoiding specifics, etc.), but I've considered
your work on the map view to be welcome maintenance.

8.  Another way to interpret your question is as worrying about the
consequences if many exhibit users adopt the new features like data
import or data editing.  I think this would be great!  It would provide
a very strong signal that these things are important to incorporate into
E3.  Perhaps  this is where you worry about leaving users "stranded" in
the move to E3---that they will have become dependent on those features
and won't be able to move.  But again, I don't buy the argument that we
should withhold something from our current users just because we might
not be able to provide it in the future.

Not really.  You clearly don't need to make a release to make these
things available to people as they're already testing them.  Your
argument cuts both ways; if it isn't in the release, it can still be
provided for users to try out.  My argument was never to completely
shutter that work - it was to avoid conflating it with a release by
carving it out of one.

OK, it seems that if I just leave everything as it is, with people
needing the new features linking to trunk.simile-widgets.org , then
everything is fine?

9.  More generally, I think our differences may reflect an argument that
has played out often in the past: in medicine, should we be directing
our money towards treating poor current sufferers from a disease, or
should we be investing in research that may ultimately yield a complete
cure?  For us, this translates to "should we be doing things that
benefit the current *users* of the exhibit codebase, or should we be
investing in a development process that will yield a better development
community in the long term."  I can accept the argument that effort
invested in helping current exhibit users (through maintenance or new
features) might be effort taken away from long-term development
community building.  But by the same token, investing effort in
long-term development can detract from solving the problems that current
users face.  And I think the answer is the same as it is for medicine: both
the arguments are true, but neither is definitive.  We value both
investments---in current benefit and in future benefit---and really
don't have the foresight necessary to understand the optimal allocation
between them.   So we do both.

Again, I have repeatedly (and already above) stated I have no problems
with a maintenance release.  This analogy doesn't really seem to get at
our differences to me.  You'll pardon if I don't offer a counter
analogy, none come to mind at present.

10.  To the question of moving the goalposts, I am happy to commit, once
E3 is otherwise feature complete, to porting the E2 changes I made into
E3.  Although, as I remarked before, you probably don't want any code
contributions from a bad engineer like me into a production quality system!

It may not necessarily be a matter of simple porting.  I'm glad you're
making a commitment to it.  I am not certain the direction you've moved
in along the 2 line allows you to simply modify your code to achieve
compatibility.  If you don't create the expectations a formal and
official release entail, you will find yourself free to simply drop or
reformulate at will anything that doesn't come over naturally.

On 2/3/2012 5:56 AM, Ryan Lee wrote:

Thanks for taking the time to respond and offering the opportunity to
discuss this.

So I don't think it's on me to convince you why you shouldn't make a
fork release.  It's your responsibility to explain why you should.  If
you don't see this proposed 2.3 release meeting the definition of a
fork, I can and will explain myself in more depth below, but I'll just
be repeating conventional concepts couched in Exhibit specifics.  And
while you go to some lengths to describe the scope, content, and intent
of this material, I feel I should make clear that my objection is to the
act of forking and what it signals.

The reasons you give below for testing out your ideas using Exhibit 2
are spot on.  Were I in your position and on a timeline, I would have
done the same and chosen the existing code base over the one that was
just an idea at that time.  What I would not have done is make
experimental commits to trunk to satisfy external interests.  While it
is a certainty that this experimenting can lead to great improvements to
the code, it is not a certainty that satisfying said external interests
also coincides with making a great release.  And it's a release based on
such commit behavior that I'm objecting to.

One of the things I brought up in my last was how your research brings
in new and unknown committers to the main development line without any
prior community participation.  With that in mind - how have the
Liverpool group been participating in this open source community?

I mentioned that I thought a great number of the features I see coming
from the research direction don't belong in core from their outset.
Perhaps it would behoove all of us to find some consensus on what core
means.  What I'm generally advocating for is that these things begin
their life as extensions external to the project, making their way in to
the Exhibit repository as extensions as the community plays with and
develops them, finally making their way into core if the feature grows
to such a rate of adoption.  My ideal resolution to our differences
would be to see a 2.2.1 maintenance-only release and the new features
teased out as extensions for the community to poke at and play with.

Perhaps one way to sum up our differences is that you'd like to make a
release because it's there; I'd prefer you didn't as-is because a lot of
what's there shouldn't be there in the first place.

On forking.  You mention your moving to Exhibit 3.0 can happen when its
feature set meets or exceeds Exhibit 2's.  Yet you're essentially moving
the goalposts with this proposed 2.3 release.  This is one essence of
forking every developer should rightfully despise.

Do you intend for thousands of users to move from 2.2 to 2.3?  If not -
and I hope you don't - that suggests this proposed 2.3 release is only
meant for a small circle of adopters.  It doesn't sound like it merits a
full, community-wide release.  Stranding users on 2.3 to take on a
handful of experimental features while 3.0 marches away from them is
also a distasteful result of forking.

As for not taking any energy away, I think that's contestable, but it's
also only looking in one direction to claim that just because it hasn't
yet doesn't mean it won't in the future.  Groups fork from one another
precisely because it does do that.

On 2012-02-02 14:45 , David Karger wrote:

You may be surprised that I agree with almost everything you say.
However, there is one sticky fact that drove me onto the path of an
exhibit 2.3 release: Exhibit 2 is a full-featured system in active use
at a couple thousand sites, while Exhibit 3, due to the limits of what
we received funding to accomplish, is an incomplete upgrade that does
not yet meet the needs of the current E2 users.

As you observe, my proposed 2.3 is a mix of a maintenance and a
"research" release.  In the maintenance category we have bugfixes and
the elimination of the painter service dependence in the map view.  In
the research category we have logging, embedded data, new input formats,
and data editing.

The rationale for doing maintenance on E2 is that, as observed above, E3
is not yet at a point where current E2 users can transition to it.
Because painter has been a longstanding problem point for E2 users, I
judged it worth improving the quality of their current tool.

In a perfect world, we would have first completed development of E3 to
match E2 capabilities, then added these changes to E3.  However, none of
us have the manpower for that, so these needed changes would not have
happened without E2.  I judged that the need to have these changes
available *now* trumped the value of shifting all effort to E3.

As for the research component of the release, most of these changes were
again driven by current users.  The data editing extension was actually
created by the Ensemble Project at Liverpool University because they
need it for their application of Exhibit in e-learning (and E3 doesn't
yet have what they need).  The XML importer was also a request of the
Ensemble project.  Embedded data was a specific response for users who
had problems getting their content indexed when the content was on a
linked page, and also ties to the editable data work of the ensemble
project.  Logging is indeed something we inserted for our own research
purposes, but it's literally 10 lines of code, not worth attention.

In a sense, the existence of E3 reduced my concern about pushing
experimental changes to E2.  We know that eventually E3 will overtake E2
in functionality, and at that point E2 will be decommissioned.  E2
therefore becomes a perfect prototyping environment within which to
test-drive ideas that might someday be incorporated in E3 when it
reaches full functionality.  Again, those ideas can't be test-driven in
E3 yet, because E3 isn't complete.

To your forking  objection, that we "split focus and energy" from E3, I
can only observe how tiny our manpower is at MIT.  All of my (as opposed
to ensemble's) contributions to E2.3 represent tinkering at the edges
that I was able to carve out of a small amount of "hobby time".  My
contribution to E3 would have been negligible in quantity (and probably
negative in quality---as you say, production code is different).
Essentially, 2.3 is the exhibit "research lab" you recommend at the end
of your note.  It isn't a fork because it hasn't taken any meaningful
energy from E3.

I'm happy to continue this discussion, but so far none of the arguments
you've given convince me that there is any negative value in making the
small improvements we've produced available as a new 2.3 release.



On 02/02/2012 05:13 PM, Ryan Lee wrote:

This is going to be a bit long, so please bear with me.  It's
important.

I am supportive of a maintenance release to Exhibit 2.2.0 (what is
currently deployed) where long standing bugs get fixed, libraries
updated, etc., for those who feel they can't make a switch to 3.0 just
yet.  But this proposed alpha changes semantics and adds features.  It
is essentially a fork release.  And forking releases sucks: parallel and
divergent lines of development get very hard to reconcile, and they
split focus and energy.

Even so, I'd be happy to take a look at a diff between June and now
to see what fixes could be incorporated into Exhibit 3.0.  But I'm not
going to take in changes to the configuration language or other material
that almost certainly does not belong in the core of Exhibit at all.

Your involvement with Exhibit at the research level is incredibly
valuable, don't get me wrong there; I think it could be amazing to have
a constant flow into the Exhibit community of fresh ideas emanating from
your research group.  At the same time, how that's been done to date is
at direct odds with one of the cornerstones of making an open source
project successful: gatekeeping for who can get commit access to the
core trunk.

When any of your students can get in to satisfy your group's
requirements but others from the wider community need to actively
demonstrate participation and core competency to receive the same, the
overall quality of the project is rather more harmed than improved, and
the community gets unhelpful signals about how exactly they're involved.
Code that's been generated for research is almost never the same as
code that's been tested and engineered for production, for many good
reasons - but the difference is there nonetheless.

Still, I do believe these competing interests both deserve their place
in the project, and I think they can be reconciled.  One of the reasons
we moved to GitHub was to provide a better social model for working on
Exhibit.  With GitHub, everybody is working on their own personal fork
for development, even the gatekeepers.  It becomes the gatekeepers' job
to merge in any changes as submitted by contributors.  This way, anybody
can participate - subject to review.  The best contributors then
become gatekeepers themselves.  Within this model, your students get the
opportunity to both simply work on code and use it as a proving ground
for promotion to gatekeeper, if that's at all their interest.

Ideally, Exhibit 3.0 also makes it easier to write code for Exhibit
without touching its core.  I'm sure it could use some refinement with
experience, but given that that's the direction we're moving in, your
students could then write extensions to pursue their ideas, and your
group serve them up as a sort of Exhibit research lab to the community,
the best features and implementations being adopted into Exhibit over
time.

This release you propose conflates what is useful in a maintenance
release with what your group's most recent research focus has been.  I
do not believe the two should be joined together in one release.

The interim between the prior release and the next shows how little of a
release process we currently have in place as a community, so I suppose
it feels like fair game to just take individual initiative.  There's a
release proposal to the community coming up soon to address just that
point.

Nobody is going to force you to stop.  But please don't issue a fork
release.

On 2012-01-24 23:21 , David Karger wrote:

This is to announce an alpha release of an update to the Exhibit 2
codebase, one that I hope will eventually become Exhibit version 2.3. As
Exhibit 3 matures we aim to shift our development efforts there, but
for the time being the greater maturity of E2 makes it a better testbed
for these updates. This release fixes a number of bugs and also offers
additional functionality; we'd like to see how that functionality gets
used in order to understand what is important to incorporate into E3.

These changes are all live on
http://trunk.simile-widgets.org/exhibit/api, so all you need to do to
try them is link to that API instead of api.simile-widgets.org . Please
do so, and provide feedback on what is working and what isn't.

Major changes include:

* support for new import data formats including xml and html tables
* exhibit data can be embedded directly in html documents
* map view upgraded to use google maps v3 (gmaps key no longer required)
* map view renders icons locally (using canvas) instead of using painter
  service
* a new extension supporting wysiwyg inline editing of data displayed in
  any exhibit

There are also several bug fixes.  Details of these and other changes
can be found at http://people.csail.mit.edu/karger/Exhibit/alpha.html

