Re: Beancount with large journals

Jason Chu Mon, 18 Feb 2019 12:51:57 -0800

If my vote counted (which I don't expect it does), I'd vote for Go over C++
because I'm more familiar with Go; my job involves a lot more of it
day to day.


On Mon, Feb 18, 2019, 12:35 PM Shreedhar Hardikar <
[email protected]> wrote:

> Will the rewrite in C++ really help speed that much? I mean, C++ does
> come with a number of additional costs, so do you believe that
> the benefit of C++ (execution speed) for an accounting tool like
> beancount really outweighs those costs?
>
> Here's some of my thoughts:
>
>    1. C++ cross-platform dependency management & build - I personally use
>    beancount on a FreeBSD system, and I do have to manually build it (even
>    when installing from pip) because there are some C/C++ library dependencies for
>    the parser etc. I can say that part is not very fun. If the entire thing
>    is written in C++, care would have to be taken to not use "fancy" C++
>    features because that means not being able to use on certain systems
>    (because they have older compilers or don't have the specific). Perhaps
>    bazel solves that?
>    2. Ease of development & hacking on the code - One prime reason I
>    chose beancount over ledger was the fact that the data structures and
>    algorithms used were written in Python and so easier to grok. I am fairly
>    adept in C++, but running through .h & .cpp & Make & inheritance
>    hierarchies is much more work in C++ than other languages. It was difficult
>    for me to follow along the datatypes available in ledger and how the python
>    integration really worked. I mean, perhaps some more documentation would
>    have helped. Also C++ bugs may give segfaults a lot more often than python
>    code does - a different beast than the stack trace bugs in python. I'm not
>    saying it's not possible to write seg-fault-free code. It gets harder very
>    fast as the complexity goes up.
>    3. Also, I'm not sure of what design you have in mind, but if you are
>    going to expose Python bindings for plugins (which, according to the docs
>    is a fundamental part of beancount extensions model), won't you need to be
>    constantly converting between Python objects & C++ objects anyway? That
>    might nullify all the benefits from C++. Caveat here: I'm not very
>    familiar with Python/C++ bindings, there may be a way to do this
>    efficiently. And maybe google/clif solves that problem superbly.
>
> Finally, I reckon that you can gain a lot in execution speed by
> using another compiled language. Have you considered Go? It should give much
> faster execution speeds of integers/decimals with easier development,
> maintenance (and package management) etc. Caveat here: I have not used Go
> very much, that is, I know only basics, and what I've heard from others. It
> may work really well to solve the problem beancount is facing in an elegant
> manner.
>
> Anyway, I do hope you take these points in good spirit - as they were well
> intentioned. Beancount is a great product and I can't wait till it gets
> even better with all the features you listed out here!
>
> Thanks,
> Shreedhar
>
> On Mon, Feb 18, 2019 at 12:22 PM Martin Blais <[email protected]> wrote:
>
>> On Thu, Feb 14, 2019 at 2:44 AM Stefano Zacchiroli <[email protected]>
>> wrote:
>>
>>> On Sun, Feb 10, 2019 at 11:07:03PM -0500, Martin Blais wrote:
>>> > You can view the breakdown in time with the -v option to bean-check:
>>>
>>> You've probably already thought about that, so out of curiosity: how
>>> much of this is potentially parallelizable, as an avenue for "easily"
>>> getting a performance boost? I guess not much, due to either I/O
>>> constraints or the GIL, right? I'm curious about whether
>>> validation, booking, and plugins might be made parallelizable in the
>>> future.
>>>
>>
>> None.
>> It's a sequential process.
>> Something that /might/ have an impact is to sequence all the operations
>> as a chain of streams consuming each other (think: generators/iterators),
>> for memory locality, but at this (small) scale I doubt it would make any
>> difference TBH. Some of the plugins do multiple passes over the stream,
>> which makes this not work and would require pirouettes to harvest
>> opportunities for reusing already computed quantities (e.g. results of
>> stuff from getters.py)
>>
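A minimal Python sketch of the "chain of streams" idea described above, and of why a stage that needs multiple passes breaks it. All names here (parse, validate, running_balance, share_of_total) and the toy "account amount" line format are hypothetical stand-ins for illustration; none of them are Beancount APIs.

```python
from typing import Iterable, Iterator


def parse(lines: Iterable[str]) -> Iterator[dict]:
    """Lazily turn toy 'account amount' lines into entry dicts."""
    for line in lines:
        account, amount = line.split()
        yield {"account": account, "amount": int(amount)}


def validate(entries: Iterable[dict]) -> Iterator[dict]:
    """Toy check (an assumption): account names must contain a colon."""
    for entry in entries:
        if ":" not in entry["account"]:
            raise ValueError(f"bad account name: {entry['account']!r}")
        yield entry


def running_balance(entries: Iterable[dict]) -> Iterator[dict]:
    """Single-pass stage: each entry is handled as soon as it arrives."""
    total = 0
    for entry in entries:
        total += entry["amount"]
        yield {**entry, "balance": total}


def share_of_total(entries: Iterable[dict]) -> Iterator[dict]:
    """Two-pass stage: it needs the grand total before emitting anything,
    so it has to materialize the whole stream first."""
    materialized = list(entries)
    total = sum(e["amount"] for e in materialized)
    for e in materialized:
        yield {**e, "share": e["amount"] / total}


def run_pipeline(lines: Iterable[str]) -> list:
    """Chain the single-pass stages; nothing runs until list() pulls."""
    return list(running_balance(validate(parse(lines))))
```

Calling run_pipeline(["Assets:Cash 100", "Expenses:Food 300"]) pulls each line through all three stages one entry at a time. Adding share_of_total to the chain forces the whole stream into a list, which is the kind of multi-pass stage that defeats the streaming arrangement.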
>> No, I think what should be done for the next major release is a rewrite.
>> At the very coarse level, it looks like this in my mind:
>> - Beancount reports/web gets deleted in favor of Fava.
>> - Beancount query/SQL gets forked to a separate project operating on
>> arbitrary schemas (via protobufs as common representation for various
>> sources of data) and has support for Beancount integration (e.g. a Decimal
>> type, and simple aggregators with the semantics of
>> beancount.core.Inventory/Position/Amount). That's all that's needed, and it
>> would enable the query language to work on CSV files and other data
>> sources. Moreover, this version would be tested properly, and have data
>> types in its compiler (no exceptions at runtime).
>> - Beancount core, parser, booking and plugins get rewritten in simple C++
>> (no boost/templates, but rather on top of a bazel + absl + protobuf + clif
>> base with functional-style and a straightforward subset of C++, no
>> classes), providing its parsed and booked contents as a stream of protobuf
>> objects.
>> - All tests would remain in Python (I'm not rewriting those).
>> Comprehensive clean Python bindings for beancount.core would be provided,
>> to do as much scripting as is done today, except with types implemented
>> fully in C++.
>> - Moreover, all the big ticket items would have to be addressed, e.g.
>> explicitly setting the precision instead of inference, currency trading
>> accounts, reports of trades built-in, etc.
>>
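To make the "explicitly setting the precision instead of inference" item above concrete, here is a rough Python sketch using the standard decimal module. The thread does not say how Beancount infers precision; the rule below (take the most common number of fractional digits seen) is only an assumption for illustration, and inferred_places and render are hypothetical helper names.

```python
from collections import Counter
from decimal import Decimal


def inferred_places(numbers):
    """Guess a precision: the most common count of fractional digits.

    This inference rule is an assumption made for this sketch.
    """
    counts = Counter(max(0, -n.as_tuple().exponent) for n in numbers)
    return counts.most_common(1)[0][0]


def render(number, places):
    """Render a Decimal with an explicitly chosen number of places."""
    quantum = Decimal(1).scaleb(-places)  # e.g. places=2 gives Decimal('0.01')
    return str(number.quantize(quantum))


numbers = [Decimal("10.50"), Decimal("3.25"), Decimal("7.123")]
guessed = inferred_places(numbers)        # depends on the data seen
explicit = 4                              # chosen by the user, data-independent
print(render(Decimal("7.123"), guessed))
print(render(Decimal("7.123"), explicit))
```

With inference, the rendered precision changes whenever the mix of numbers in the input changes; with an explicit setting it stays fixed regardless of the data.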
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Beancount" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To post to this group, send email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/beancount/CAK21%2BhMXqd9sOAey%2B3aFDi6gh22B5bG8Y08E7CKa5WssWcryZg%40mail.gmail.com
>> <https://groups.google.com/d/msgid/beancount/CAK21%2BhMXqd9sOAey%2B3aFDi6gh22B5bG8Y08E7CKa5WssWcryZg%40mail.gmail.com?utm_medium=email&utm_source=footer>
>> .
>> For more options, visit https://groups.google.com/d/optout.
>>
>

