TL;DR: it would be feasible to port the entire Whimsy code base over to Node.js and get it up and running on whimsy-vm6 hosted on Ubuntu 20.04 (Focal Fossa). There would be a number of advantages to doing so.

The difference wouldn't merely be one of language syntax. The result would likely be closer to an actor model than an object oriented model.

https://en.wikipedia.org/wiki/Actor_model
https://en.wikipedia.org/wiki/Object-oriented_modeling

A number of quasi-independent observations lead up to this conclusion. Warning: the third part is rather lengthy.

- - - -

I wrote a test for posting an item to the agenda:

https://github.com/rubys/whimsy-board-agenda-nodejs/blob/d2b46afa81ccd980c416109f6c9c4fea198b508f/src/server/__tests__/post.js#L10

It looks remarkably similar to the analogous test for the Ruby implementation:

https://github.com/apache/whimsy/blob/1316a898d5e8c91e8a33d89565d12efc4842dd56/www/board/agenda/spec/actions_spec.rb#L20

With this test in place, I now feel that I have one of everything needed to make a complete application work. The question is whether this version of the codebase will attract a sustainable community.

My current thoughts are that if I were to start from scratch, I would definitely do so in Node.js, but at the moment, that's not what I am facing. The Ruby base is more mature, and the Node.js base is just starting.

- - - - -

It appears that svn added a new command-line parameter, --password-from-stdin. I added support for this parameter to the Node.js board agenda tool yesterday:

https://github.com/rubys/whimsy-board-agenda-nodejs/blob/d2b46afa81ccd980c416109f6c9c4fea198b508f/src/server/svn.js#L54

If you are running on a Mac, you may or may not have a version of svn that supports it. brew upgrade svn will get it for you.

The versions of svn in the Ubuntu 16.04 and 18.04 repositories don't support this parameter; the version in the Ubuntu 20.04 repository does.

If we are writing new code, it is relatively straightforward to handle this correctly. Updating all of the existing code, however, represents technical debt at this point.

- - - - -

While Ruby and JavaScript have very different surface syntaxes, their runtime models are superficially quite similar. There are some subtle differences, which I will oversimplify as follows:

Ruby tends to encourage a more object oriented approach to solving problems. JavaScript tends to encourage a more event driven approach.

The current Ruby model can be seen here:

https://whimsy.apache.org/docs/api/

We have some obvious domain model object classes: Person, Committee, CCLAFiles. There are some cross-cutting concerns shared by each, and those tend to be broken out into classes: LDAP, Git, SVN. Along the way, the reads and writes for any given data type tend to be clustered together.

I continued with this approach on the client, where I had an Agenda class which contained a list of items, each of which was an object that responded to method calls indicating whether that item was ready for review, flagged, or whatever.

There are a number of drawbacks to this approach. As an example, if you don't take care, performance suffers: a number of operations take a while because they require a large number of LDAP requests. Caching can help, but then there are cache invalidation issues to worry about. I built a clever solution using weak references, but clever is generally a sign that there is a flaw in the approach.

Going out to svn every time there is a request would be a problem, but that can be mitigated by keeping a local working copy up to date with cron jobs. This, too, is a form of caching.

We can improve on that with pubsub, and the infrastructure team is working on that.

But if you ignore the caches, the flow for displaying an agenda item on the client is very linear: you start by getting a file from svn, issue a bunch of LDAP requests, get another file from svn, issue more LDAP requests, then package up a JSON response that is sent to the client, which pulls it apart and renders the result in the DOM.
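That linear flow can be sketched as straight-line async code. The data-source functions below are stubs standing in for the real svn and LDAP calls, so this is a shape, not the actual Whimsy API:

```javascript
// Stub data sources standing in for svn and LDAP (the real calls are
// network requests; stubs keep the sketch self-contained).
async function svnFetch(path) {
  return `contents of ${path}`;
}
async function ldapLookup(uid) {
  return { uid, name: `Name of ${uid}` };
}

// The linear flow: file from svn, LDAP lookups, another file from svn,
// then a JSON response for the client to render.
async function agendaItemResponse(item) {
  const agenda = await svnFetch('board/agenda.txt');
  const chair = await ldapLookup(item.chair);
  const comments = await svnFetch(`board/comments/${item.title}.txt`);
  return JSON.stringify({
    title: item.title,
    chair: chair.name,
    agenda,
    comments
  });
}
```

Each `await` in that chain is a network round trip, which is exactly why the flow is infeasible without caches.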

None of that would be feasible without caches.

The JavaScript approach is different. Instead of starting with the objects and propping up the architecture with caches, you start with the data (what you previously would have called a cache) and build a number of quasi-independent units of work that operate on the data.

When I undertook the port of the board agenda tool to Node.js, I took whatever code I needed, figuring that I would find a way to factor it out into libraries later. The code to parse committee-info.txt is a prime example of something that would be useful to many tools.

On the client, I decided to replace my custom models, routing, and event models with the ones that are favored by the React community.

Stepping back, I see that the code tends to be considerably less linear.

On the client side, pushing a button used to do something directly, and often explicit code needed to be in place to cause data in another component to rerender.

Now pushing a button will generally do one of three things: surface a modal dialog, change state within the dialog, or dispatch an action to the Redux store (possibly based on data retrieved from the server). That's it. In the first two cases, everything is local. In the latter case, it is somebody else's problem to do something with the data.
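The dispatch-only pattern can be illustrated with a minimal Redux-style store — hand-rolled here so the sketch is self-contained, with a made-up action type rather than one from the board agenda tool:

```javascript
// Minimal Redux-style store: a button handler only dispatches an
// action; the reducer decides what changes, and subscribers rerender.
function createStore(reducer, initialState) {
  let state = initialState;
  const listeners = [];
  return {
    getState: () => state,
    subscribe: fn => listeners.push(fn),
    dispatch: action => {
      state = reducer(state, action);
      listeners.forEach(fn => fn());
    }
  };
}

// Illustrative reducer: approving an agenda item is just an action;
// nobody dispatching it knows (or cares) who rerenders as a result.
function agendaReducer(state, action) {
  switch (action.type) {
    case 'ITEM_APPROVED':
      return { ...state, approved: [...state.approved, action.item] };
    default:
      return state;
  }
}

const store = createStore(agendaReducer, { approved: [] });
store.subscribe(() => console.log('rerender with', store.getState()));

// What a button's click handler reduces to:
store.dispatch({ type: 'ITEM_APPROVED', item: 'Treasurer' });
```

The button's handler contains no knowledge of what happens next; any component subscribed to the store picks up the change.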

A similar thing happened on the server. I have cache files that represent the parsed version of the agenda, committee-info.txt, member data from LDAP, and the like. If a cache file changes, the client is notified and has the option to load that information into the store.

The infrastructure team has already enabled pubsub for LDAP data, and is working on pubsub for private svn repositories. The role of these functions will be to update the source of truth on the servers (i.e., the caches).

Instead of having shared libraries for parsing committee-info.txt, there can be a canonical JSON file for this data, and multiple tools can have file watchers that trigger when this file changes.

In fact, we already have these types of files; you can see them in https://whimsy.apache.org/public/. We can create some more and put them in a private directory (and perhaps even make them available to authenticated requests). But for the most part we make this data available for other tools; we don't use it much ourselves.

I'm finding that I'm liking the result more and more. Instead of looking at an object and a description of what a method should return, and trying to figure out what's broken when things don't work as expected, you can directly inspect the data to see if the problem is in the production of the data or in the consumption.

In other words, when something goes wrong, the first thing you do is examine the Redux store on the client (enabled by pressing "=" in the board agenda tool) or go to https://whimsy.apache.org/public/ or equivalent.

And when we are done, we can not only be a pubsub consumer, but perhaps also a pubsub source (likely by hosting an instance of the pypubsub package). Different tools running on different machines, perhaps written in different languages, can all collaborate in this manner.

- Sam Ruby
