Re: Beancount v3

Stefano Zacchiroli Mon, 06 Jul 2020 02:00:35 -0700

On Sat, Jul 04, 2020 at 02:34:35AM -0400, Martin Blais wrote:
> Today I'm starting development on Beancount v3.
> 
> This is going to be a pretty big change and will take a while.
> I've laid down the details in this document:
> https://docs.google.com/document/d/1qPdNXaz5zuDQ8M9uoZFyyFis7hA0G55BEfhWhrVBsfc/


This is very exciting. And, as usual, your design documents are very
interesting and insightful to read. I took some time to read through all
of them and I'm sharing some thoughts of mine about them below.

==================================


Directives
----------

Having as output of beancount core two streams of clearly separated
incomplete/syntactic v. complete/semantic directives sounds like a great
approach. In terms of terminology, you might use the "raw v. cooked"
terminology (which I've picked up from proof assistants years ago, but
which I find fitting here; YMMV). It's not yet clear to me if both
streams will be accessible to plugins (I think they should). And, if
they are, how will they be interleaved: a single stream with both raw
and cooked transactions? Two separate streams?


Parser
------

You mention you're gonna keep using flex/bison, which is for sure well
known technology. However, the expressivity of bison grammars makes it
kinda hard to hack on existing parsers, raising the barrier for
contributors. Have you considered switching to PEG parsing?

Unrelated (but still on parsing), I don't understand your point about
getting rid of the cache. Sure, we all hope it will no longer be needed
for interactive use, but it would still be useful for people building
small services on top of relatively static Beancount ledgers, including
Fava. Also, as the output of Beancount core is gonna be streams of
protobufs, those will be trivial to serialize and cross-language too.
Why not imagine a cache of protobufs serialized on disk?

The rework of includes sounds great. We have discussed it on the list in
the past, so I guess it's your goal, but as it's not explicitly stated
in the design doc let me repeat it here. I think the goal should be
"include invariance", i.e., one should always be able to take an
existing Beancount ledger in a single file and break it down into an
arbitrary number of smaller ledger files that include each other,
without any semantic change. (The stated goal in your doc of being able
to declare plugins elsewhere than in the main file will derive from
this, but this principle is more general.)
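To make "include invariance" concrete, here is a minimal sketch of how
one might check it. It assumes ledgers are held in memory as a dict of
file name to text, and it uses plain textual expansion of include
directives as a stand-in for the real loader; flatten and the file names
are hypothetical, not part of Beancount. A real check would compare the
directives produced by loading both variants.

```python
import re

# Matches a top-level directive of the form: include "path"
INCLUDE_RE = re.compile(r'^include\s+"([^"]+)"\s*$')


def flatten(name, files, stack=()):
    """Recursively expand include directives into a flat list of lines.

    `files` maps a file name to its text (an assumption made for this
    sketch; a real tool would read from disk). `stack` tracks the chain
    of files currently being expanded, so that cycles are detected.
    """
    if name in stack:
        raise ValueError(f"include cycle involving {name}")
    lines = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            lines.extend(flatten(match.group(1), files, stack + (name,)))
        else:
            lines.append(line)
    return lines


single = (
    'option "title" "Demo"\n'
    "2020-01-01 open Assets:Cash\n"
    "2020-01-01 open Expenses:Food\n"
)
split = {
    "main.beancount": 'option "title" "Demo"\ninclude "accounts.beancount"\n',
    "accounts.beancount": (
        "2020-01-01 open Assets:Cash\n2020-01-01 open Expenses:Food\n"
    ),
}

# The split ledger expands to exactly the single-file ledger.
same = flatten("main.beancount", split) == single.splitlines()
```

The interesting cases are of course the ones where textual equality is
not enough (options, plugins, pushtag and the like), which is where the
principle would need to be enforced by the loader itself.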

The main feature I miss to reach feature parity with Ledger-CLI is the
ability to add tags to individual transaction legs. I'm assuming this
will go hand in hand with relaxing the distinction between
metadata/tags/links (by making them syntactic sugar for metadata, I'm
guessing), which is great, thanks!


Ulque
-----

This sounds like an exciting project.

In addition to support for balance columns and totals, there are a bunch
of other features that would be very welcome, like the ability to filter
out 0 columns, or to add derived columns (e.g., differences between
columns, to compute P&L in investments). I don't know how much you plan
to build on top of Pandas (which will trivially offer many of these),
but it is absolutely brilliant to see the analogy between the two
worlds.

Something I'm surprised not to see mentioned here is your vision
(which we discussed a while ago on list) that the hierarchical nature of
account names is kinda arbitrary and gets in the way (e.g., one
often wants to pivot around from "Expenses:Home:Repair +
Expenses:Car:Repair" to "Expenses:Repair:Home + Expenses:Repair:Car" as
there is no right or wrong hierarchy there). Is this idea of being able
to pivot around the account hierarchy, considering each component a
facet of sorts, part of your plans for Ulque, or is it out of scope?
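As a rough illustration of what I mean by treating components as
facets, here is a small Python sketch. It assumes balances are available
as a plain dict of account name to Decimal; pivot and total_by_facet are
hypothetical helpers, not anything from Beancount or the Ulque design.

```python
from collections import defaultdict
from decimal import Decimal


def pivot(balances, order):
    """Reorder account components, e.g. order=(0, 2, 1) swaps the last two.

    Accounts whose depth differs from len(order) are left unchanged.
    """
    out = defaultdict(Decimal)
    for account, amount in balances.items():
        parts = account.split(":")
        if len(parts) == len(order):
            account = ":".join(parts[i] for i in order)
        out[account] += amount
    return dict(out)


def total_by_facet(balances, position):
    """Sum balances grouped by the component found at `position`."""
    totals = defaultdict(Decimal)
    for account, amount in balances.items():
        parts = account.split(":")
        if position < len(parts):
            totals[parts[position]] += amount
    return dict(totals)


balances = {
    "Expenses:Home:Repair": Decimal("120.00"),
    "Expenses:Car:Repair": Decimal("80.00"),
}
pivoted = pivot(balances, (0, 2, 1))
by_kind = total_by_facet(balances, 2)
```

Here pivoted holds the same amounts under "Expenses:Repair:Home" and
"Expenses:Repair:Car", and by_kind groups everything under "Repair",
without either view being the "right" hierarchy.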


Code quality
------------

Typing: outside of Google I have the feeling that the state-of-the-art
static type checker is Mypy. I've myself migrated a substantial codebase
to it and it's a vibrant environment (with a lot of involvement from
Guido himself), with active development that goes hand in hand with the
refinement of the type system (via periodic PEPs). I'd be wary of going
with pytype instead of Mypy, even though I realize that the type
annotations are (supposed to be) compatible.

How about automated code formatting via Black?
(https://github.com/psf/black) I've recently switched a substantial
code base to it and I find it pretty life-changing. It would also help
contributors, I think, which is one of your worthwhile meta-goals for
v3.


Strict payee
------------

YAY, anything that makes it possible to have even more automated sanity
checks is a welcome addition. I wonder if a relaxed policy would work
well as a default: any new payee is OK on first use even if undeclared,
unless it's "near" (in terms of string distance) to a previous one.
But that's probably a matter for a plugin anyway...
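For what it's worth, the relaxed policy could be sketched in a few
lines of Python. This uses difflib's similarity ratio from the standard
library as the notion of "near"; check_payee and the 0.8 cutoff are
assumptions of mine for illustration, not an existing Beancount API.

```python
import difflib


def check_payee(payee, known, cutoff=0.8):
    """Apply the relaxed policy to one payee.

    Returns (ok, suggestion). A payee is accepted if already known, or
    if it is not suspiciously close to a known one, in which case it is
    also recorded in `known`. Otherwise the closest known payee is
    returned as a suggestion.
    """
    if payee in known:
        return True, None
    near = difflib.get_close_matches(payee, sorted(known), n=1, cutoff=cutoff)
    if near:
        return False, near[0]
    known.add(payee)
    return True, None


known = {"Whole Foods"}
typo = check_payee("Whole Fods", known)   # likely a typo of a known payee
fresh = check_payee("Amazon", known)      # genuinely new, accepted
```

The cutoff would need tuning on real ledgers: too low and every new
payee is flagged, too high and typos slip through.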


Unsigned debit and credit
-------------------------

This is a very concrete need, which I routinely struggle with when
showing accounting reports extracted from Beancount (or Fava) to other
family members. But I'm surprised you mention it as a potential feature
for Beancount itself. Wouldn't it belong to front-ends, like Fava (or
maybe Ulque in the future), instead? In the view of "Beancount as an
accounting calculator", which I've always adhered to, that seems to
belong elsewhere.
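To show why this looks like a presentation concern to me, here is a
minimal sketch of a front-end deriving unsigned debit/credit columns at
rendering time. It assumes the usual Beancount convention that positive
amounts are debits and negative ones credits; debit_credit is a
hypothetical helper, not a Beancount or Fava function.

```python
from decimal import Decimal

ZERO = Decimal("0")


def debit_credit(amount):
    """Split a signed amount into unsigned (debit, credit) columns."""
    if amount >= ZERO:
        return amount, ZERO
    return ZERO, -amount


postings = [
    ("Expenses:Food", Decimal("25.00")),
    ("Assets:Cash", Decimal("-25.00")),
]
rows = [(account, *debit_credit(amount)) for account, amount in postings]
```

The stored numbers stay signed; only the rendered rows change.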


bean-sed
--------

This is something which is not in your design documents, but seems
important enough to me to be mentioned in light of a new Beancount
generation. In plain text accounting we maintain two things at once: the
semantic information captured in our books, and the syntax of those
books, which matters more than the syntax of paper-based books. That is
why we use Git to version them and often allow ourselves to amend/curate
very old transactions, which is something you never do with paper-based
books, and for sure not further back than the most recent book closure.

But our textual books grow larger and we often need to perform batch
changes. E.g., split an account category, merge some, rename accounts,
etc., spanning all our books. Some of these operations are purely
syntactic, some have an impact on the semantics of our accounting data. I
think we need a tool to automate this, more powerful than search and
replace in vim/emacs, and with some knowledge of the data it's
manipulating.
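To give an idea of the baseline, here is roughly what a purely textual
account rename looks like in Python. The regex and rename_account are a
sketch of mine, not an existing tool. It handles sub-accounts and avoids
partial-name matches, but it knows nothing about the data: it would also
rewrite matches inside narrations or comments, and it does not realign
amount columns when the new name has a different length.

```python
import re


def rename_account(text, old, new):
    """Rename account `old`, and its sub-accounts, to `new` in ledger text.

    The lookbehind rejects matches in the middle of a longer account
    name; the lookahead rejects longer components such as "Carpool"
    when renaming "Car", while still allowing a following ":".
    """
    pattern = re.compile(r"(?<![\w:-])" + re.escape(old) + r"(?![\w-])")
    return pattern.sub(lambda match: new, text)


ledger = (
    '2020-07-01 * "Garage"\n'
    "  Expenses:Car:Repair   80.00 EUR\n"
    "  Expenses:Carpool      10.00 EUR\n"
    "  Assets:Cash\n"
)
renamed = rename_account(ledger, "Expenses:Car", "Expenses:Vehicle")
```

In renamed, "Expenses:Car:Repair" becomes "Expenses:Vehicle:Repair" and
"Expenses:Carpool" is left alone, but the amount column on the renamed
line is now misaligned, which is exactly the kind of thing a
syntax-aware tool should take care of.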

The current style of plugins is not useful for this need. It is fine
for patching transactions/directives post parsing, but it cannot reflect
those changes back to the textual books.

Would something like this fit your vision for Beancount 3? In
particular, I'd like to know if the raw/syntactic directives you imagine
coming out of the new Beancount core would be close enough to the books'
concrete syntax to allow manipulations such as meddling with spacing.
Provided that, and a good pretty printer for concrete syntax, a
"bean-sed" project with a dedicated manipulation language can probably
be created and maintained separately from core.


==================================


> The short version is that v3's core is going to be ported to C++ using a
> Bazel build, and the codebase will be sectioned between core and the rest.
> I just merged the new build definition in master.

Bazel is indeed a great build system, but you should know that, at least
for now, it is not in Debian/Ubuntu. So it will be impossible to ship
Beancount v3 on those distros (and any other Debian-based distro) until
Bazel itself is part of Debian. Work is ongoing (see:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=782654 ), but I'm
unable to guess when it will actually happen.


Cheers
-- 
Stefano Zacchiroli . [email protected] . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »
