This is fantastic, thanks for sharing, Martin. I typed up a review of my
experiences with it. See the doc here
<https://docs.google.com/document/d/1FsbdqbPWaupPg67kZp1CK0BoX2Ri4WhGnLa5vNTBBAs/edit#>.
I tried to paste it below, but the formatting is lost (this used to work
earlier).
*Overview*
Extracting cashflows needed to compute IRR is challenging because of the
flexibility that bookkeeping systems like Beancount need to offer in order
to be useful.
This is a review of beancount/experiments/returns that Martin posted
recently, including what you need to get up and running, and pitfalls to
watch out for.
First, a review:
- Fantastic documentation (as usual) and clarity in code (also as
usual), along with clear output that includes the transactions, posting
categorizations, templates that were identified, and extracted cashflows in
the output files.
- The simplicity of the generalized cashflow method (investments.py:
produce_cash_flows_general()) is neat and helpful in building one’s mental
model when working with this tool.
- Runtime performance: it takes ~3.5 minutes to run on my transactions
of about 10k investment related postings on older hardware (x86, intel core
duo 1.2GHz)
*Per-commodity accounting: not a hard requirement*
The documentation may seem to suggest that per-commodity accounts are a
requirement or are assumed by the code, but this is not actually true. The
code works at the account level, and if you don’t have or use per-commodity
accounting, you can still compute returns at an account level. Below are
some pitfalls, solutions, and some drawbacks to not using per-commodity
accounting.
- The configure.py script may not work for you in this case, but it’s
trivial to write your own configuration.
- One problem is, it’s impossible to know which commodity to ascribe a
dividend to. Even worse, reinvested dividends in the “Dividends”
computation, if the cash flows into the asset account. Eg:
2020-01-02 * "Buy"
Assets:Brokerage 1000 HOOL {1 USD}
Assets:Bank
2020-08-01 * "Dividends"
Assets:Brokerage 500 USD
Income:Dividends:Brokerage ; ← not a cash account; not counted
2020-08-01 * "Reinvest dividends"
Assets:Brokerage 500 HOOL {1 USD}
Assets:Brokerage -500 USD ; ← not an external account; not counted
However, this is only a problem if you want your total performance broken
down into dividends vs. appreciation. If not, the total performance numbers
you get will still be correct. Simply ignore the ex-dividends and dividends
numbers.
- Several of my accounts correspond to a “real” brokerage account,
meaning they mix multiple commodities, cash and/or the money market
settlement fund
- returns.py:truncate_cash_flows() fails because my account contains
both cash and a settlement fund, variably used to trade commodities. This
means the cashflow can’t be reduced to a single currency based on cost, but
ends up with two (cash, settlement fund). Note that I use price conversions
for the settlement fund (vs. holding them at cost), to avoid generating a
ton of lots with no real purpose given that the NAV of these don’t stray
much from 1 USD. Fix:
@@ -141,7 +144,10 @@ def truncate_cash_flows(
balance = compute_balance_at(account_data.transactions, date_start)
if not balance.is_empty():
cost_balance = balance.reduce(pricer.get_value, date_start)
+ cost_balance = cost_balance.reduce(convert.convert_position,
"USD", pricer.price_map)
cost_position = cost_balance.get_only_position()
@@ -154,7 +160,13 @@ def truncate_cash_flows(
balance = compute_balance_at(account_data.transactions, date_end)
if not balance.is_empty():
cost_balance = balance.reduce(pricer.get_value, date_end)
- cost_position = cost_balance.get_only_position()
+ cost_balance = cost_balance.reduce(convert.convert_position,
"USD", pricer.price_map)
+ try:
+ cost_position = cost_balance.get_only_position()
*Verification of results can be challenging:*
After building a basic config file and perhaps making a few simple changes
to your ledger, compute_returns.py spits out numbers. How can the
correctness of these numbers be verified?
One way to identify problems is to construct groupings that reflect IRRs
provided by brokerages. However, the details and terminology of how IRR
calculations are made can vary widely by brokerage. For example, there
seems to be no agreement on what the start/end dates should be when
computing “3-year performance”. It can thus be a lot of effort to figure
these out even if one has just a handful of brokerages.
Verifying the output of compute_returns.py may involve some effort. The
challenge fundamentally seems to stem from the (fantastic) flexibility
Beancount provides in constructing one's ledger, which means that the
configuration for compute_returns.py needs to be correct. Short of bugs,
all incorrect results are generally because of errors in configuration.
So how can one verify that their configuration is correct? Thankfully, this
is a place where compute_returns.py shines, with its clear, relevant
output. Below, I share a few tips and personal stumbling blocks to help
others figure their own:
- Examine the helpful “category map” in the output files. It is the set
of all accounts that are involved in transactions with an asset account in
question:
cat returns.output/investments/*.org | grep "^.'" | grep None
Also consider whether each of them has been categorized correctly:
cat returns.output/investments/*.org | grep "^.'"
- The root finding solver is initialized to 0.2, so if you see a 20.00%
as your CAGR, it’s almost certainly because the solver crashed
- Examining the cashflow graph:
- Do the inflows and outflows make sense? With many accounts, you
have a sense of how frequently you transfer money and how much (eg: every
paycheck). Is that being violated?
- Do the dividends make sense?
- Transfers should show up as an equal inflow/outflow. If not,
investigate
- Spot check at least some of the transactions in the investments/*.org
files to ensure their categories make sense. See if there are transactions
that may be missing
- Finally, building one’s intuition for what the numbers should be, how
the cashflow should look like, and how compute_returns.py works, is
generally helpful in spotting errors
*Modifications I had to make to my ledger:*
There were some gotchas that I had to work through that I’ve shared below
to give a sense of what one might need to look out for:
- Tighten initial pad postings, so they appear on the day of the first
transaction. Else, the period used for computing returns is lengthened,
resulting in incorrectly worse performance
- I use the rename plugin
<https://github.com/redstreet/beancount_reds_plugins/tree/master/beancount_reds_plugins/rename_accounts>
I wrote to rename Expenses:Taxes:* to Income:Taxes:*, to avoid it from
dominating my expenses for analysis. This meant that the income accounts
were categorized as income (investments.py matches against an “Income:”
regex) and therefore (incorrectly) not counted as casflow, which happens in
transactions that involve tax (eg: sell to cover). I had to add
“Income:Taxes:*” as a cash_account. The problems manifested as accounts
with obviously incorrect IRR values (eg: -144%), though this may not always
happen.
- In transaction entries involving a paycheck (eg: contributions to an
employer sponsored retirement account), many postings were uncategorized,
and thus eventually had incorrect workflows. My solution was to insulate
brokerage transactions from paycheck transactions by having a transfer
account in between. I started using transfer accounts in the last few years
anyway via the zerosum
<https://github.com/redstreet/beancount_reds_plugins/tree/master/beancount_reds_plugins/zerosum>
plugin, but older transactions didn’t
- Eventually, I plan to evaluate switching to per-commodity accounts
that Martin has long recommended
*Configuration maintenance*
As one’s ledger evolves, the reports config needs to be kept up to date.
One downside of per-commodity accounts is they are likely to be opened and
closed more frequently than physical accounts.
Omissions of all types in the returns configuration (asset, dividend, and
cash accounts) can be hard to detect. Solutions are likely to be highly
specific to the conventions one uses in their ledger. For my ledger for
example, I plan to ensure all accounts under
“(Assets|Income):Investments:*” appear at least once in the reports config
file.
*In-kind transfers and returns*
One case that is not covered seems to be in-kind transfers. For example:
option "operating_currency" "USD"
plugin "beancount.plugins.implicit_prices"
2005-01-01 commodity USD
2005-01-01 commodity HOOL
2000-01-01 open Assets:Bank
2000-01-01 open Assets:Brokerage:HOOL "STRICT"
2000-01-01 open Assets:Zero-Sum-Accounts:Transfers "STRICT"
2010-01-01 * "Buy"
Assets:Zero-Sum-Accounts:Transfers 1000 HOOL {0.5 USD}
Assets:Bank
2020-01-01 price HOOL 1 USD
2020-01-02 * "Transfer"
Assets:Brokerage:HOOL 1000 HOOL {0.5 USD}
Assets:Zero-Sum-Accounts:Transfers -1000 HOOL {0.5 USD}
2020-12-23 price HOOL 1.1 USD
Using the market value on the transfer date for the cashflow value should
solve this case.
This appears in several cases in me ledger: moving from one brokerage to
another, and options exercise. It could be argued that the former could be
handled by bundling together the old and new accounts, but that assumes all
accounts in question deal with only a single commodity each, which means
(for me), redoing a whole bunch of historical transactions.
The latter is an interesting case. For context, assume an option to buy
stock ABC at 1 USD was exercised when the fair market value of ABC was 5
USD. It was then held for a year, and sold for 10 USD. It is helpful to
compute the returns of holding on to the exercised stock (which is an
investing decision) separately from the “compensation” portion, which is
the 5-1 = 4 USD. To do this, I track them in separate accounts like so:
2015-01-01 * "Exercised"
Assets:Vested -100 ABC_OPTIONS
Expenses:Stock-Options 100 ABC_OPTIONS
Assets:XTrade:Exercised 100 ABC {1 USD} ; FMV was 4 USD
Assets:Bank 100 USD ; 100 * 1 USD
2015-01-01 price ABC 4 USD ; FMV
2015-01-01 * “Transferred to brokerage”
Assets:XTrade:Exercised -100 ABC {1 USD}
Assets:XTrade:Invested 100 ABC {1 USD}
If stock transfers in-kind were computed in the way above, one could simply
compute returns separately on Assets:XTrade:Exercised and
Assets:XTrade:Invested, and distinguish between the “compensation” and the
“investment” portions.
*Minor feedback*
- Legend in graph (in .svg): both amortized value from flows, and market
value show up as a black line. In the graph, one is blue + dots, other is
black
- How rae trailing, rolling, and total returns period define? Since
everyone defines these a bit differently, perhaps balloon help on the UI
would clarify it. This script helps, meanwhile:
# Run on 2020-12-23
>>> from reports import *
>>> from tabulate import tabulate
>>> print(tabulate(get_calendar_intervals(TODAY)))
---- ---------- ----------
2005 2005-01-01 2006-01-01
2006 2006-01-01 2007-01-01
2007 2007-01-01 2008-01-01
2008 2008-01-01 2009-01-01
2009 2009-01-01 2010-01-01
2010 2010-01-01 2011-01-01
2011 2011-01-01 2012-01-01
2012 2012-01-01 2013-01-01
2013 2013-01-01 2014-01-01
2014 2014-01-01 2015-01-01
2015 2015-01-01 2016-01-01
2016 2016-01-01 2017-01-01
2017 2017-01-01 2018-01-01
2018 2018-01-01 2019-01-01
2019 2019-01-01 2020-01-01
2020 2020-01-01 2020-12-23
---- ---------- ----------
>>> print(tabulate(get_cumulative_intervals(TODAY)))
-------------------- ---------- ----------
15_years_ago 2005-01-01 2020-12-23
10_years_ago 2010-01-01 2020-12-23
5_years_ago 2015-01-01 2020-12-23
4_years_ago 2016-01-01 2020-12-23
3_years_ago 2017-01-01 2020-12-23
2_years_ago 2018-01-01 2020-12-23
1_year_ago 2019-01-01 2020-12-23
ytd 2020-01-01 2020-12-23
rolling_6_months_ago 2020-06-25 2020-12-23
rolling_3_months_ago 2020-09-25 2020-12-23
-------------------- ---------- ----------
--
You received this message because you are subscribed to the Google Groups
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/beancount/ebba4de0-c401-4290-a119-2cab59d5b432n%40googlegroups.com.