Re: Inventory Reductions

Christopher Singley Sat, 15 Jun 2019 06:00:51 -0700


On 6/14/19 8:22 PM, Martin Blais wrote:

On Fri, Jun 14, 2019 at 3:36 PM Christopher Singley <[email protected]> wrote:
    <snip>
    You mean "booking" in the sense of "realizing capital gains",
    yes?  Why does booking depend on interpolation? Is this because of
    your emphasis of specific identification as a cost accounting
    method?  There are ways out of that.
No; it would be very easy to fix if the issue was just implementing different booking methods :-)
The problem occurs because the syntax I created specifically aims to allow users to elide some information and automatically fill in some missing numbers. For instance, you don't have to provide all the details of a reducing lot, as long as the list of lots it matches (when filtered down) yields an unambiguous set (either a single lot, or many lots for which the total number of units matches the size of the reducing lot precisely). This is the process I call "booking", that is, matching a partial specification for reducing lots against the available lots just before the transaction gets applied. It uses the accumulated inventory in order to fill in missing information.
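
The booking step described above can be sketched roughly as follows. This is only an illustration of the matching rule (a single lot, or several lots whose units add up exactly), not Beancount's implementation; `Lot`, `CostSpec`, `match_lots` and `book_reduction` are hypothetical names made up for the example.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Lot:
    units: Decimal
    currency: str
    cost: Decimal          # per-unit cost
    cost_currency: str
    acquired: date


@dataclass(frozen=True)
class CostSpec:
    """Partial description of the lot being reduced; None means 'elided'."""
    cost: Optional[Decimal] = None
    cost_currency: Optional[str] = None
    acquired: Optional[date] = None


def match_lots(lots, currency, spec):
    """Return the lots in `currency` compatible with every field given in `spec`."""
    return [
        lot for lot in lots
        if lot.currency == currency
        and (spec.cost is None or lot.cost == spec.cost)
        and (spec.cost_currency is None or lot.cost_currency == spec.cost_currency)
        and (spec.acquired is None or lot.acquired == spec.acquired)
    ]


def book_reduction(lots, units, currency, spec):
    """Resolve a reducing posting of `units` (a negative number) against `lots`.

    Succeeds only when the match is unambiguous: a single lot, or several
    lots whose total units equal the size of the reduction exactly.
    Returns a list of (lot, units_taken) pairs.
    """
    matches = match_lots(lots, currency, spec)
    size = -units
    if len(matches) == 1 and matches[0].units >= size:
        return [(matches[0], size)]
    if matches and sum(lot.units for lot in matches) == size:
        return [(lot, lot.units) for lot in matches]
    raise ValueError(f"ambiguous or insufficient match: {len(matches)} candidate lot(s)")
```

With two HOOL lots of 10 units each, a reduction of 5 units that names the acquisition date resolves to one lot, a reduction of 20 units with an empty spec consumes both lots, and a reduction of 5 units with an empty spec is rejected as ambiguous.
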
"Interpolation," on the other hand, is a similar process that fills in missing numbers, not by matching against the contents of the inventory just before the transaction gets applied, but rather only against the other postings, by assuming that the set of postings for each currency group must balance. This does not make use of the state of the inventory before the transaction gets applied, just the information provided on that one transaction. It basically attempts to figure out the cost currency of each posting, then groups them by cost currency, and then attempts to fill in missing bits and pieces (either numbers or currencies) in each of these currency groups.
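
A minimal sketch of interpolation as described above, under simplifying assumptions: only a missing number (not a missing currency) is filled in, and at most one number may be missing per currency group. `Posting` and `interpolate` are hypothetical names for this example and do not correspond to Beancount's own classes.

```python
from collections import defaultdict
from dataclasses import dataclass, replace
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Posting:
    account: str
    units: Optional[Decimal]              # None means "left out by the user"
    currency: str
    cost: Optional[Decimal] = None        # per-unit cost, if held at cost
    cost_currency: Optional[str] = None

    @property
    def group(self):
        """Currency in which this posting has to balance."""
        return self.cost_currency or self.currency

    @property
    def weight(self):
        """Balancing amount, or None while the number of units is unknown."""
        if self.units is None:
            return None
        return self.units * self.cost if self.cost is not None else self.units


def interpolate(postings):
    """Fill in at most one missing number per currency group so each group sums to zero."""
    groups = defaultdict(list)
    for index, posting in enumerate(postings):
        groups[posting.group].append(index)

    result = list(postings)
    for currency, indexes in groups.items():
        missing = [i for i in indexes if result[i].weight is None]
        if len(missing) > 1:
            raise ValueError(f"too many missing numbers in the {currency} group")
        if not missing:
            continue
        known = sum((result[i].weight for i in indexes if i not in missing), Decimal(0))
        target = result[missing[0]]
        # The missing posting must cancel everything else in its group.
        units = -known / target.cost if target.cost is not None else -known
        result[missing[0]] = replace(target, units=units)
    return result
```

Note that this sketch needs the per-unit cost of a reducing posting to be known already; when that cost is itself elided, it has to come from matching against the inventory first, which is where the ordering question between the two steps comes from.
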
These two are similar in goal: fill in missing information automatically to ease the burden of data entry. But depending on which particular bits are left missing in the input for Beancount to figure out, in some cases running booking before interpolation works, and in other cases running interpolation before booking works. I have seen cases that are impossible to resolve. It took me a while to figure out which order was the most useful in practice, and this is what's in there now.

I'm just trying to understand why you're having so much trouble with the cost accounting. I adore the Ledger style syntax, despite its obvious limitations for this kind of work. My main question is how much of the problem is inherent in the syntax & data entry format vs. the algorithm applied to it.



Your "Self-reductions" document contains this example:
"""
Assets:Invest     10 HOOL {50 USD, 2016-01-01} ;; A
Assets:Invest     10 HOOL {51 USD, 2016-01-02} ;; B

2016-12-04 *
  Assets:Invest    -5 HOOL {}
  Assets:Cash      255 USD
"""

This is not any sort of corner case; this is what normal JEs look like. As written, the interpolation is trivial. The trouble arises because this JE could theoretically contain thousands of other postings, so the algorithm needs to solve for the missing cash to figure out the proceeds of the HOOL sale.

If I've got that right, it seems like the algorithm is suffering at the hands of the syntax. Why bother trying to handle such pathological bookkeeping? It is no hardship to the user to enforce a constraint that an asset purchase/sale must only contain a single currency posting.

In general, to process securities transactions, I believe you're going to need to define new directives other than "txn" so the parser can route securities transactions to different handlers. For example, your docs contain an example of HOOL spinning off A-shares and B-shares... you need a way to signal the parser to update inventory but skip realizing gains. As it stands, I don't believe beancount's syntax offers the possibility of distinguishing "reducing" postings that realize gain from those that don't.

I've been able to get it down to 6 different types of securities transactions - trades, return of capital distributions, spinoffs, splits, transfers, and options exercise. I think you can reduce the number of needed directives. Splits are essentially a subtype of transfers. It may also be possible to treat trades and return of capital as subtypes of transfer. Spinoffs probably need their own directive. You might be able to decompose options exercise into a sequence of more fundamental types, but I'm skeptical because of the holding period rules.

I suspect minimal syntax extensions would greatly improve the algorithms at essentially no cost to the user. If that's something you're willing to consider, you might also consider at the same time what kind of ledger syntax is needed to specify cost accounting, which (unfortunately) can change from one transaction to another on the same day. You need to be able to handle input data that does this:


https://investor.vanguard.com/taxes/cost-basis/methods

Anyway, something to keep in mind next time you're working on the inventory system.


Cheers, Chris

P.S. Technical documentation nitpicking - the average cost basis method is only available for mutual funds (I think it was special pleading to allow them to keep this business logic in the database layer - SQL stored procedures). It's got nothing to do with the tax qualification of the holding account - you see average cost used both inside and outside retirement accounts.



    Specific identification is a very uncommon cost accounting
    method.  It's almost always FIFO or (for mutual fund companies)
    the degenerate average cost method.  It's good to support specific
    identification (generality is good!) but given its rarity, it's
    not unreasonable to enforce a requirement that opening/closing
    transactions (or augmenting/reducing transactions in your usage)
    must have matching labels if you want to use specific
    identification.  Don't attempt to interpolate the opening
    transaction from date/price, and the problem is solved, no?

    Is guessing the opening transaction from partial user input (i.e.
    date or price) a high priority?  The algorithm cannot reliably
    find a solution because of underspecified inputs, as you note in
    your docs, and it requires the user to manually duplicate a
    significant effort by keeping their own inventory outside of
    beancount (probably in a spreadsheet).  I don't know about you,
    but never maintaining another lot-matching spreadsheet ever again
    is very high on my list of priorities.

    You already have a good chunk of the inventory system built into
    beancount.  If you let go of the requirements that are introducing
    recursion into your algorithm, I guess you'd find the benefits a
    lot more valuable than the bits of interpolation that you'd need
    to drop in order to achieve it.

    I've written an inventory system that does this for me, so I know
    it can be done.  I doubt you'd find it terribly useful, but if
    you're interested I can show you how I handle cost accounting. 
    I've got a Python package that handles trades, splits, spinoffs,
    mergers, return of capital distributions, all that fun stuff. 
    It's somewhat battle-tested, too, with a relatively high volume of
    messy real-world transactions run through it, and the results
    audited (as in CPAs engaged to discover discrepancies, not just
    unit tests).  You might find it interesting to look at an
    alternative implementation. The code won't win any prizes for
    engineering elegance, and still needs some work, but the output is
    demonstrably correct for the most part.

I'd be curious to have a look, but unfortunately I'm too busy right now, I have very little time, just keeping my head above water, mostly.




        The implementation of the Inventory has already evolved since
        this was written (for performance reasons) and IIRC is treated
        mostly like a list, matching portions that have been specified
        to filter a list of candidate positions. I'm not beyond reviewing
        core classes - especially if it might help - but I believe
        changing the mapping would make no difference at all here.  I
        wrote an example some time ago - in a text file IIRC, which I
        shared on the list and had some comments about -  but I can't
        seem to find it right now.


    I'd be interested in seeing the doc if you happen to stumble
    across it, but it's not a big deal.  You're right that the dict
    keys aren't a deal breaker; they can be worked around without much
    trouble.  I'm still puzzling through how you do this.

I'll bring it up if I can find it. It was a text file in another branch IIRC, in the midst of code.




    Thanks for releasing beancount, it's nice software


Thank you!


        On Wed, Jun 12, 2019 at 12:27 AM Christopher Singley
        <[email protected]> wrote:

            I've been reading through this:

            http://furius.ca/beancount/doc/self-reductions

            and puzzling through parser.booking_full.

            It looks to me like the root cause of your struggles is
            that the keys to your Inventory mapping are overspecified -
            the necessity to perform the calculations to populate a
            Cost instance in order to look up a lot. I reckon you need
            to move the cost data from keys to values, so that
            inventory is a mapping from
            (account, security) -> [(units, cost, date)]
            instead of the current mapping from
            (account, security, cost, date) -> [(units, )].
            The former is a more natural data structure for cost
            accounting.
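
The two shapes contrasted above might look like this in Python. This is an illustrative sketch only: `Lot`, `inventory`, `by_full_key` and `position` are hypothetical names, and neither dictionary is claimed to match Beancount's actual Inventory class.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    units: Decimal
    cost: Decimal   # per-unit cost
    opened: date


# Proposed shape: (account, security) -> [Lot, ...]
# A transaction always knows its account and security, so the lookup
# never requires a fully populated cost.
inventory = defaultdict(list)
inventory[("Assets:Invest", "HOOL")].append(Lot(Decimal("10"), Decimal("50"), date(2016, 1, 1)))
inventory[("Assets:Invest", "HOOL")].append(Lot(Decimal("10"), Decimal("51"), date(2016, 1, 2)))

# Shape described as "current" in the message: cost and date are part of
# the key, so they must be known before a lot can be looked up.
by_full_key = {
    ("Assets:Invest", "HOOL", Decimal("50"), date(2016, 1, 1)): Decimal("10"),
    ("Assets:Invest", "HOOL", Decimal("51"), date(2016, 1, 2)): Decimal("10"),
}


def position(inventory, account, security):
    """Total units held, found without knowing any lot's cost or date."""
    lots = inventory.get((account, security), [])
    return sum((lot.units for lot in lots), Decimal(0))
```
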

            Any well-formed transaction natively has (account,
            security) fields. Use those to look up a sequence of lots
            containing (lot_units, cost, open_date). Filter that
            sequence using (transaction_date, transaction_units) to
            find lots that might be closed ("booked") by the incoming
            transaction - it will definitely have transaction_date,
            and if it doesn't have transaction_units for some reason,
            then that is trivially interpolated.

            Next step depends on your cost accounting method. Normally
            you'd sort transactions by date/time to do FIFO or average
            cost.  To instead do specific identification, you'd
            further filter the lots for a particular date/cost/label,
            and require a unique result.

            NOW you do the heavy lifting.

            Work through the surviving lots in order, popping lots and
            splitting them as necessary until you run out of
            transaction_units or lot_units.  For each popped lot,
            couple its data to (transaction_date, transaction_units,
            transaction_price), and you'll have all the data needed to
            fully populate a journal entry.

            There's nothing recursive about this calculation. You can
            implement it as a straight pipeline of iterators,
            evaluated lazily.
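
The pipeline outlined above could be sketched with generators as follows, here for FIFO only. This is a rough illustration, not the Python package mentioned in the thread: all names (`Lot`, `Closure`, `opened_by`, `fifo`, `take`, `close`, `book_fifo`) are hypothetical, `txn_units` is taken as a positive number of units sold, and writing the remaining lots back to the inventory is left out.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    units: Decimal
    cost: Decimal   # per-unit cost
    opened: date


@dataclass(frozen=True)
class Closure:
    opened: date
    closed: date
    units: Decimal
    basis: Decimal
    proceeds: Decimal

    @property
    def gain(self):
        return self.proceeds - self.basis


def opened_by(lots, txn_date):
    """Keep only lots that existed when the transaction happened."""
    return (lot for lot in lots if lot.opened <= txn_date)


def fifo(lots):
    """Oldest lots first. Sorting consumes its input; the other stages stay lazy."""
    return iter(sorted(lots, key=lambda lot: lot.opened))


def take(lots, units):
    """Yield (lot, units_closed) pairs, splitting the last lot if needed."""
    remaining = units
    for lot in lots:
        if remaining <= 0:
            return
        closed = min(lot.units, remaining)
        remaining -= closed
        yield lot, closed
    if remaining > 0:
        raise ValueError(f"{remaining} units could not be matched to any lot")


def close(matches, txn_date, txn_price):
    """Couple each matched lot with the transaction data."""
    for lot, units in matches:
        yield Closure(lot.opened, txn_date, units, units * lot.cost, units * txn_price)


def book_fifo(lots, txn_date, txn_units, txn_price):
    """Compose the stages; nothing is matched until the result is iterated."""
    return close(take(fifo(opened_by(lots, txn_date)), txn_units), txn_date, txn_price)
```

Specific identification would replace the `fifo` stage with a filter on date, cost or label that requires a unique surviving lot.
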

            An additional advantage is that this procedure is easy to
            extend to handling other securities transaction types that
            don't involve realizing gain. E.g. for a split, use
            (account, security) to look up your position. Filter that
            sequence for lots with an open_date before the
            transaction_date, and replace them with copies with the
            units/cost adjusted for the split. Keep a running total of
            the change in units, and require that total to match the
            input transaction_units (which is a hard requirement for a
            stock split transaction).
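
A sketch of the split handling described above, assuming the split is expressed as a ratio of new shares per old share and that `txn_units` is the change in units reported on the transaction. `Lot` and `apply_split` are hypothetical names for this example.

```python
from dataclasses import dataclass, replace
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)
class Lot:
    units: Decimal
    cost: Decimal   # per-unit cost
    opened: date


def apply_split(lots, txn_date, ratio, txn_units):
    """Return a new list of lots adjusted for a split; no gain is realized.

    Lots opened before `txn_date` are replaced by copies with units multiplied
    by `ratio` and per-unit cost divided by it, so each lot's total cost is
    preserved. The running change in units must equal `txn_units`.
    """
    adjusted = []
    delta = Decimal(0)
    for lot in lots:
        if lot.opened < txn_date:
            new_units = lot.units * ratio
            delta += new_units - lot.units
            lot = replace(lot, units=new_units, cost=lot.cost / ratio)
        adjusted.append(lot)
    if delta != txn_units:
        raise ValueError(f"split adds {delta} units but the transaction reports {txn_units}")
    return adjusted
```

For a 2-for-1 split applied to two lots of 10 units at 50 and 51, the result is two lots of 20 units at 25 and 25.5, and the reported change must be 20 units.
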

            Any conceptual problems with this setup?  I mean, other
            than being a huge PITA to rip up existing classes and
            everything that touches them.

            --
            You received this message because you are subscribed to
            the Google Groups "Beancount" group.
            To unsubscribe from this group and stop receiving emails
            from it, send an email to [email protected].
            To post to this group, send email to [email protected].
            To view this discussion on the web visit
            https://groups.google.com/d/msgid/beancount/e84f305c-9282-1d20-d74d-00d99620f2da%40singleys.com.
            For more options, visit https://groups.google.com/d/optout.

