Re: A word on Ledger structure

Alexandre Rademaker Wed, 14 Mar 2012 18:44:01 -0700

Hello John,

Many thanks for this email. Does cl-ledger (the Lisp version) have a similar
architecture? What are the differences? I don't know how many of Ledger's
users are programmers, but making Ledger's architecture more transparent
will surely help people understand and contribute to Ledger.


Best,

Alexandre Rademaker
http://arademaker.github.com/



On Wed, Mar 14, 2012 at 2:45 AM, John Wiegley <[email protected]> wrote:
> Ledger is developed as a tiered set of functionality, where lower tiers know
> nothing about the higher tiers.  In fact, I build multiple libraries during
> the process, and link unit tests to these libraries, so that it is a link
> error for a lower tier to violate this modularity.
>
> Those tiers are:
>
>  - Utility code
>
>   There's lots of general utility in Ledger for doing time parsing, using
>   Boost.Regex, error handling, etc.  It's all done in a way that can be
>   reused in other projects as needed.
>
>  - Commoditized Amounts (amount_t, commodity_t and friends)
>
>   A numerical abstraction combining multi-precision rational numbers (via
>   GMP) with commodities.  These structures can be manipulated like regular
>   numbers in either C++ or Python (as Amount objects).
>
>  - Commodity Pool
>
>   Commodities are all owned by a commodity pool, so that future parsing of
>   amounts can link to the same commodity and establish a consistent price
>   history and record of formatting details.
>
>  - Balances
>
>   Adds the concept of multiple amounts with varying commodities.  Supports
>   simple arithmetic, and multiplication and division with non-commoditized
>   values.
>
>  - Price history
>
>   Amounts have prices, and these are kept in a data graph which the amount
>   code itself is only dimly aware of (there are three points of access so an
>   amount can query its revalued price on a given date).
>
>  - Values
>
>   Often the higher layers in Ledger don't care if something is an amount or a
>   balance, they just want to add stuff to it or print it.  For this, I
>   created a type-erasure class, value_t/Value, into which many things can be
>   stuffed and then operated on.  They can contain amounts, balances, dates,
>   strings, etc.  If you try to apply an operation between two values that
>   makes no sense (like dividing an amount by a balance), an error occurs at
>   runtime, rather than at compile-time (as would happen if you actually tried
>   to divide an amount_t by a balance_t).
>
>   This is the core data type for the value expression language.
>
>  - Value expressions
>
>   The next layer up adds functions and operators around the Value concept.
>   This lets you apply transformations and tests to Values at runtime without
>   having to bake it into C++.  The set of functions available is defined by
>   each object type in Ledger (posts, accounts, transactions, etc.), though
>   the core engine knows nothing about these.  At its base, it only knows how
>   to apply operators to values, and how to pass them to and receive them from
>   functions.
>
>  - Query expressions
>
>   Expressions can be onerous to type at the command-line, so there's a
>   shorthand for reporting called "query expressions".  These add no
>   functionality of their own, but are purely translated from the input string
>   (cash) down to the corresponding value expression (account =~ /cash/).
>   This is a convenience layer.
>
>  - Format strings
>
>   Format strings let you interpolate value expressions into a string, with the
>   requirement that any interpolated value have a string representation.
>   Really all this does is calculate the value expression in the current
>   report context, call the resulting value's "to_string()" method, and stuff
>   the result into the output string.  It also provides printf-like behavior,
>   such as min/max width, right/left justification, etc.
>
>  - Journal items
>
>   Next is a base type shared by anything that can appear in a journal: an
>   item_t.  It contains details common to all such parsed entities, like what
>   file and line it was found on, etc.
>
>  - Journal posts
>
>   The most numerous object found in a Journal, postings are a type of item
>   that contains an account, an amount, a cost, and metadata.  There are some
>   other complications, like the account can be marked virtual, the amount
>   could be an expression, etc.
>
>  - Journal transactions
>
>   Postings are owned by transactions, always.  This subclass of item_t knows
>   about the date, the payee, etc.  If a date or metadata tag is requested
>   from a posting and it doesn't have that information, the transaction is
>   queried to see if it can provide it.
>
>  - Journal accounts
>
>   Postings are also shared by accounts, though the actual memory is managed
>   by the transaction.  Each account knows all the postings within it, but
>   contains relatively little information of its own.
>
>  - The Journal object
>
>   Finally, all transactions with their postings, and all accounts, are owned
>   by a journal_t object.  This is the go-to object for querying and reporting
>   on your data.
>
>  - Textual journal parser
>
>   There is a textual parser, wholly contained in textual.cc, which knows how
>   to parse text into journal objects, which then get "finalized" and added to
>   the journal.  Finalization is the step that enforces the double-entry
>   guarantee.
>
>  - Iterators
>
>   Every journal object is "iterable", and these iterators are defined in
>   iterators.h and iterators.cc.  This iteration logic is kept out of the
>   basic journal objects themselves for the sake of modularity.
>
>  - Comparators
>
>   Another abstraction isolated to its own layer, this class encapsulates the
>   comparison of journal objects, based on whatever value expression the user
>   passed to --sort.
>
>  - Temporaries
>
>   Many reports bring pseudo-journal objects into existence, like postings
>   which report totals in a "<Total>" account.  These objects are created and
>   managed by a temporaries_t object, which gets used in many places by the
>   reporting filters.
>
>  - Option handling
>
>   There is an option handling subsystem used by many of the layers further
>   down.  It makes it relatively easy for me to add new options, and to have
>   those option settings immediately accessible to value expressions.
>
>  - Session objects
>
>   Every journal object is owned by a session, with the session providing
>   support for that object.  In GUI terms, this is the Controller object for
>   the journal Data object, where every document window would be a separate
>   session.  They are all owned by the global scope.
>
>  - Report objects
>
>   Every time you create report output, a report object is created to
>   determine what you want to see.  In the Ledger REPL, a new report object is
>   created every time a command is executed.  In CLI mode, only one report
>   object ever comes into being, as Ledger immediately exits after displaying
>   the results.
>
>  - Reporting filters
>
>   The way Ledger generates data is this: it asks the session for the current
>   journal, and then creates an iterator applied to that journal.  The kind of
>   iterator depends on the type of report.
>
>   This iterator is then walked, and every object yielded from the iterator is
>   passed to an "item handler", whose type is directly related to the type of
>   the iterator.
>
>   There are many, many item handlers, which can be chained together.  Each
>   one receives an item (post, account, xact, etc.), performs some action on
>   it, and then passes it down to the next handler in the chain.  There are
>   filters which compute the running totals; that queue and sort all the input
>   items before playing them back out in a new order; that filter out items
>   which fail to match a predicate, etc.  Almost every reporting feature in
>   Ledger is related to one or more filters.  Looking at filters.h, I see over
>   25 of them defined currently.
>
>  - The filter chain
>
>   How filters get wired up, and in what order, is a complex process based on
>   all the various options specified by the user.  This is the job of the
>   chain logic, found entirely in chain.cc.  It took a really long time to get
>   this logic exactly right, which is why I haven't exposed this layer to the
>   Python bridge yet.
>
>  - Output modules
>
>   Although filters are great and all, in the end you want to see stuff.  This
>   is the job of special "leaf" filters called output modules.  They are
>   implemented just like a regular filter, but they don't have a "next" filter
>   to pass the item on down to.  Instead, they are the end of the line and
>   must do something with the item that results in the user seeing something
>   on their screen or in a file.
>
>  - Select queries
>
>   Select queries know a lot about everything, even though they implement
>   their logic by implementing the user's query in terms of all the other
>   features thus presented.  Select queries have no functionality of their
>   own; they are simply a shorthand to provide access to much of Ledger's
>   functionality via a cleaner, more consistent syntax.
>
>  - The Global Scope
>
>   There is a master object which owns every other object, and this is
>   Ledger's global scope.  It creates the other objects, provides REPL
>   behavior for the command-line utility, etc.  In GUI terms, this is the
>   Application object.
>
>  - The Main Driver
>
>   This creates the global scope object, performs error reporting, and handles
>   command-line options which must precede even the creation of the global
>   scope, such as --debug.
>
> And that's Ledger in a nutshell.  All the rest are details, such as which
> value expressions each journal item exposes, how many filters currently exist,
> which options the report and session scopes define, etc.
>
> John
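
To make the "Values" tier in the quoted message a bit more concrete, here is
a minimal Python sketch of the type-erasure idea: one wrapper holds an
amount, a balance, or a plain number, and an operation that makes no sense
(such as dividing an amount by a balance) fails only when it is attempted at
runtime. The names Amount, Balance and Value are hypothetical stand-ins
written for this illustration. They are not Ledger's C++ classes or its
Python bindings, and the rules implemented here are a small assumed subset.

```python
from fractions import Fraction


class Amount:
    """Hypothetical commoditized amount: an exact quantity plus a commodity."""

    def __init__(self, quantity, commodity):
        self.quantity = Fraction(quantity)
        self.commodity = commodity


class Balance:
    """Hypothetical balance: one exact total per commodity."""

    def __init__(self, amounts):
        self.totals = {}
        for amount in amounts:
            previous = self.totals.get(amount.commodity, Fraction(0))
            self.totals[amount.commodity] = previous + amount.quantity

    def as_amounts(self):
        # Rebuild one Amount per commodity held in this balance.
        return [Amount(q, c) for c, q in self.totals.items()]


class Value:
    """Hypothetical wrapper: holds any payload, checks operations at runtime."""

    def __init__(self, payload):
        self.payload = payload

    def __add__(self, other):
        left, right = self.payload, other.payload
        if isinstance(left, Amount) and isinstance(right, Amount):
            if left.commodity == right.commodity:
                total = left.quantity + right.quantity
                return Value(Amount(total, left.commodity))
            # Different commodities cannot merge, so the result is a balance.
            return Value(Balance([left, right]))
        if isinstance(left, Balance) and isinstance(right, Amount):
            return Value(Balance(left.as_amounts() + [right]))
        raise TypeError(
            f"cannot add {type(left).__name__} and {type(right).__name__}"
        )

    def __truediv__(self, other):
        left, right = self.payload, other.payload
        # Only an amount divided by a plain (non-commoditized) number is
        # modeled here; everything else is rejected at runtime.
        if isinstance(left, Amount) and isinstance(right, (int, Fraction)):
            return Value(Amount(left.quantity / right, left.commodity))
        raise TypeError(
            f"cannot divide {type(left).__name__} by {type(right).__name__}"
        )
```

With this sketch, adding Value(Amount(10, "USD")) and Value(Amount(5, "EUR"))
yields a Value holding a Balance, and dividing an amount Value by that
balance Value raises TypeError when the division is evaluated.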

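The reporting filters and output modules described in the quoted message can
be pictured as a chain of handlers, each receiving an item, acting on it, and
passing it to the next one, with a leaf handler at the end that produces
output. The Python sketch below is only an illustration under assumptions:
the class names, the handle/flush methods, and the dictionary-based postings
are all hypothetical and do not mirror the contents of filters.h or chain.cc.

```python
class Handler:
    """Hypothetical base handler: forwards each item to the next handler."""

    def __init__(self, next_handler=None):
        self.next = next_handler

    def handle(self, item):
        if self.next is not None:
            self.next.handle(item)

    def flush(self):
        # Signals the end of input so queued items can be released.
        if self.next is not None:
            self.next.flush()


class FilterByPredicate(Handler):
    """Drops items that fail to match a predicate."""

    def __init__(self, predicate, next_handler):
        super().__init__(next_handler)
        self.predicate = predicate

    def handle(self, item):
        if self.predicate(item):
            super().handle(item)


class SortItems(Handler):
    """Queues all items, then plays them back in sorted order on flush."""

    def __init__(self, key, next_handler):
        super().__init__(next_handler)
        self.key = key
        self.queue = []

    def handle(self, item):
        self.queue.append(item)

    def flush(self):
        for item in sorted(self.queue, key=self.key):
            super().handle(item)
        self.queue.clear()
        super().flush()


class RunningTotal(Handler):
    """Annotates each item with the running total of amounts seen so far."""

    def __init__(self, next_handler):
        super().__init__(next_handler)
        self.total = 0

    def handle(self, item):
        self.total += item["amount"]
        super().handle({**item, "total": self.total})


class CollectOutput(Handler):
    """Leaf handler: has no next handler and records lines for display."""

    def __init__(self):
        super().__init__(None)
        self.lines = []

    def handle(self, item):
        self.lines.append(
            f'{item["account"]} {item["amount"]} {item["total"]}'
        )


def run_report(posts):
    """Wire a fixed chain (filter, sort, total, output) and walk the posts."""
    output = CollectOutput()
    chain = FilterByPredicate(
        lambda post: post["account"].startswith("Expenses"),
        SortItems(lambda post: post["amount"], RunningTotal(output)),
    )
    for post in posts:
        chain.handle(post)
    chain.flush()
    return output.lines
```

Here the wiring is hard-coded in run_report; in the design described above,
choosing which handlers to chain and in what order depends on the user's
options, which is the part this sketch deliberately leaves out.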