On Sat, Jul 04, 2020 at 02:34:35AM -0400, Martin Blais wrote:
> Today I'm starting development on Beancount v3.
>
> This is going to be a pretty big change and will take a while.
> I've laid down the details in this document:
> https://docs.google.com/document/d/1qPdNXaz5zuDQ8M9uoZFyyFis7hA0G55BEfhWhrVBsfc/

This is very exciting. And, as usual, your design documents are very
interesting and insightful to read. I took some time to read through all
of them and I'm sharing some thoughts of mine about them below.

==================================

Directives
----------

Having as output of beancount core two streams of clearly separated
incomplete/syntactic v. complete/semantic directives sounds like a great
approach. In terms of terminology, you might use the "raw v. cooked"
terminology (which I've picked up from proof assistants years ago, but
which I find fitting here; YMMV).

It's not yet clear to me if both streams will be accessible to plugins (I
think they should). And, if they are, how will they be interleaved: a
single stream with both raw and cooked transactions? Two separate streams?

Parser
------

You mention you're gonna keep using flex/bison, which is for sure
well-known technology. However, the expressivity of bison grammars makes
it kinda hard to hack on existing parsers, raising the barrier for
contributors. Have you considered switching to PEG parsing?

Unrelated (but still on parsing), I don't understand your point about
getting rid of the cache. Sure, we all hope it will no longer be needed
for interactive use, but it would still be useful for people building
small services on top of relatively static Beancount ledgers, including
Fava. Also, as the output of Beancount core is gonna be streams of
protobufs, those will be trivial to serialize, and also cross-language.
Why not imagine a cache of protobufs serialized on disk?

The rework of includes sounds great. We have discussed it on the list in
the past, so I guess it's your goal, but as it's not explicitly stated in
the design doc let me repeat it here. I think the goal should be "include
invariance", i.e., one should always be able to take an existing Beancount
ledger in a single file and break it down into an arbitrary number of
smaller ledger files that include each other, without any semantic change.
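To make "include invariance" a bit more concrete, here is a minimal Python sketch of the property. It is only an illustration under loud assumptions: the `flatten` helper and the in-memory `files` dictionaries are hypothetical (not Beancount APIs), and it compares ledgers textually after expanding `include` lines, whereas the real property would be checked on the parsed and booked directives.

```python
import re
from typing import Dict, List

# Assumption: an include directive sits alone on its line, as `include "file"`.
INCLUDE_RE = re.compile(r'^include\s+"([^"]+)"\s*$')


def flatten(name: str, files: Dict[str, str]) -> List[str]:
    """Recursively expand include lines and return the non-blank lines.

    `files` maps a file name to its content; it stands in for the file
    system so that the sketch is self-contained.
    """
    out: List[str] = []
    for line in files[name].splitlines():
        match = INCLUDE_RE.match(line)
        if match:
            out.extend(flatten(match.group(1), files))
        elif line.strip():
            out.append(line)
    return out


# One ledger kept in a single file...
single = {
    "main.beancount": (
        "2020-01-01 open Assets:Cash\n"
        "2020-01-01 open Expenses:Food\n"
        '2020-07-04 * "Lunch"\n'
        "  Expenses:Food   10 USD\n"
        "  Assets:Cash    -10 USD\n"
    ),
}

# ...and the same ledger broken down into files that include each other.
split = {
    "main.beancount": (
        'include "accounts.beancount"\n'
        'include "2020.beancount"\n'
    ),
    "accounts.beancount": (
        "2020-01-01 open Assets:Cash\n"
        "2020-01-01 open Expenses:Food\n"
    ),
    "2020.beancount": (
        '2020-07-04 * "Lunch"\n'
        "  Expenses:Food   10 USD\n"
        "  Assets:Cash    -10 USD\n"
    ),
}

# Include invariance (textual approximation): both layouts yield the same ledger.
assert flatten("main.beancount", single) == flatten("main.beancount", split)
```

A real check would also have to cover options and plugin declarations placed in included files, which is exactly where a purely textual comparison stops being enough.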
(The stated goal in your doc of being able to declare plugins elsewhere
than in the main file will derive from this, but this principle is more
general.)

The main feature I lack to have feature parity with Ledger-CLI is the
ability to add tags to individual transaction legs. I'm assuming this will
go hand-in-hand with relaxing the distinction between metadata/tags/links
(by making them syntactic sugar for metadata, I'm guessing), which is
great, thanks!

Ulque
-----

This sounds like an exciting project. In addition to support for balance
columns and totals, there are a bunch of other features that would be very
welcome, like the ability to filter out 0 columns, or to add derived
columns (e.g., differences between columns, to compute P&L in
investments). I don't know how much you plan to build on top of Pandas
(which will trivially offer many of these), but it is absolutely brilliant
to see the analogy between the two worlds.

Something I'm surprised not to see mentioned here is your vision (which we
discussed a while ago on list) that the hierarchical nature of the account
hierarchy is kinda arbitrary and gets in the way (e.g., one often wants to
pivot around from "Expenses:Home:Repair + Expenses:Car:Repair" to
"Expenses:Repair:Home + Expenses:Repair:Car" as there is no right or wrong
hierarchy there). Is this idea of being able to pivot around the account
hierarchy, considering each component a facet of sorts, part of your plans
for Ulque, or is it out of scope?

Code quality
------------

Typing: outside of Google I have the feeling that the state-of-the-art
static type checker is Mypy. I've myself migrated a substantial codebase
to it, and it's a vibrant environment (with a lot of involvement from
Guido himself) with active development that goes hand in hand with the
refinement of the type system (via periodic PEPs). I'd be wary of going
with pytype instead of Mypy, even though I realize that the type
annotations are (supposed to be) compatible.

How about automated code formatting via Black?
(https://github.com/psf/black) I've recently switched a substantial code
base to it and I find it pretty life changing. It would also help
contributors I think, which is one of your worthwhile meta-goals for v3.

Strict payee
------------

YAY, everything that makes it possible to have even more automated sanity
checks is a welcome addition. I wonder if a relaxed policy, where any new
payee is OK on first use even if undeclared unless it's "near" (as string
distance) to a previous one, would work well as a default policy. But
that's probably a matter for a plugin anyway...

Unsigned debit and credit
-------------------------

This is a very concrete need, which I routinely struggle with when showing
accounting reports extracted from Beancount (or Fava) to other family
members. But I'm surprised you mention it as a potential feature for
Beancount itself. Wouldn't it belong to front-ends, like Fava (or maybe
Ulque in the future), instead? In the view of "Beancount as an accounting
calculator", which I've always adhered to, that seems to belong elsewhere.

bean-sed
--------

This is something which is not in your design documents, but seems
important enough to me to be mentioned in light of a new Beancount
generation.

In plain text accounting we maintain two things at once: the semantic
information captured in our books, and the syntax of those books, which
matters more than the syntax of paper-based books (which is why we use Git
to version them and often allow ourselves to amend/curate very old
transactions, which is something you never do with paper-based books, and
for sure not reaching further in the past than the most recent book
closure).

But our textual books grow larger and we often need to perform batch
changes. E.g., split an account category, merge some, rename accounts,
etc., spanning all our books. Some of these operations are purely
syntactic, some have an impact on the semantics of our accounting data.
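As an illustration of a purely syntactic batch change, here is a minimal Python sketch of an account rename done with a regular expression. The `rename_account` helper is hypothetical and knows nothing about Beancount's grammar: it only matches the old name as a whole account prefix, so it would also rewrite matches inside comments or narration strings, and it does not re-align amounts.

```python
import re


def rename_account(text: str, old: str, new: str) -> str:
    """Rename account `old`, and its sub-accounts, to `new` in ledger text.

    Purely textual: everything that is not a match is left untouched.
    """
    pattern = re.compile(
        r"(?<![\w:-])"      # not in the middle of a longer account name
        + re.escape(old)
        + r"(?![\w-])"      # next char ends the component (':', space, end)
    )
    return pattern.sub(new, text)


ledger = (
    '2020-07-04 * "Garage"\n'
    "  Expenses:Car:Repair   100 USD\n"
    "  Expenses:Carpool        5 USD\n"
    "  Assets:Cash          -105 USD\n"
)

renamed = rename_account(ledger, "Expenses:Car", "Expenses:Auto")

# The sub-account is renamed; the look-alike "Expenses:Carpool" is not.
assert "Expenses:Auto:Repair   100 USD" in renamed
assert "Expenses:Carpool" in renamed
```

Splitting or merging accounts cannot be expressed this way, since those changes depend on which transactions are affected rather than on the text alone.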
I think we need a tool to automate this, more powerful than search and
replace in vim/emacs, and with some knowledge of the data it's
manipulating. The current style of plugins is not useful for this need:
plugins can patch transactions/directives post parsing, but they cannot
reflect those changes back to the textual books.

Would something like this fit your vision for Beancount 3? In particular,
I'd like to know if the raw/syntactic directives you imagine coming out of
the new Beancount core would be close enough to the book concrete syntax
to allow manipulation such as meddling with spacing.

Provided that, and a good pretty printer for concrete syntax, a "bean-sed"
project with a dedicated manipulation language can probably be created and
maintained separately from core.

==================================

> The short version is that v3's core is going to be ported to C++ using a
> Bazel build, and the codebase will be sectioned between core and the rest.
> I just merged the new build definition in master.

Bazel is indeed a great build system, but you should know that, at least
for now, it is not in Debian/Ubuntu yet. So it will be impossible to ship
Beancount v3 in those distros (or in any other Debian-based distro) until
Bazel itself is part of Debian. Work is ongoing (see:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=782654 ), but I'm unable
to guess when it will actually happen.

Cheers
-- 
Stefano Zacchiroli . [email protected] . upsilon.cc/zack . . o . . . o . o
Computer Science Professor . CTO Software Heritage . . . . . o . . . o o
Former Debian Project Leader & OSI Board Director  . . . o o o . . . o .
« the first rule of tautology club is the first rule of tautology club »
