Hello John,

Many thanks for this email. Does cl-ledger (the Lisp version) have a similar architecture? What are the differences? I don't know how many of Ledger's users are programmers, but making Ledger's architecture more transparent will surely help people understand and contribute to Ledger.
Best,
Alexandre Rademaker
http://arademaker.github.com/

On Wed, Mar 14, 2012 at 2:45 AM, John Wiegley <[email protected]> wrote:
> Ledger is developed as a tiered set of functionality, where lower tiers know
> nothing about the higher tiers. In fact, I build multiple libraries during
> the process, and link unit tests to these libraries, so that it is a link
> error for a lower tier to violate this modularity.
>
> Those tiers are:
>
> - Utility code
>
> There's lots of general utility in Ledger for doing time parsing, using
> Boost.Regex, error handling, etc. It's all done in a way that can be
> reused in other projects as needed.
>
> - Commoditized Amounts (amount_t, commodity_t and friends)
>
> A numerical abstraction combining multi-precision rational numbers (via
> GMP) with commodities. These structures can be manipulated like regular
> numbers in either C++ or Python (as Amount objects).
>
> - Commodity Pool
>
> Commodities are all owned by a commodity pool, so that future parsing of
> amounts can link to the same commodity and establish a consistent price
> history and record of formatting details.
>
> - Balances
>
> Adds the concept of multiple amounts with varying commodities. Supports
> simple arithmetic, and multiplication and division with non-commoditized
> values.
>
> - Price history
>
> Amounts have prices, and these are kept in a data graph which the amount
> code itself is only dimly aware of (there are three points of access so an
> amount can query its revalued price on a given date).
>
> - Values
>
> Often the higher layers in Ledger don't care if something is an amount or a
> balance; they just want to add stuff to it or print it. For this, I
> created a type-erasure class, value_t/Value, into which many things can be
> stuffed and then operated on. They can contain amounts, balances, dates,
> strings, etc.
> If you try to apply an operation between two values that
> makes no sense (like dividing an amount by a balance), an error occurs at
> runtime, rather than at compile-time (as would happen if you actually tried
> to divide an amount_t by a balance_t).
>
> This is the core data type for the value expression language.
>
> - Value expressions
>
> The next layer up adds functions and operators around the Value concept.
> This lets you apply transformations and tests to Values at runtime without
> having to bake it into C++. The set of functions available is defined by
> each object type in Ledger (posts, accounts, transactions, etc.), though
> the core engine knows nothing about these. At its base, it only knows how
> to apply operators to values, and how to pass them to and receive them from
> functions.
>
> - Query expressions
>
> Expressions can be onerous to type at the command-line, so there's a
> shorthand for reporting called "query expressions". These add no
> functionality of their own, but are purely translated from the input string
> (cash) down to the corresponding value expression (account =~ /cash/).
> This is a convenience layer.
>
> - Format strings
>
> Format strings let you interpolate value expressions into strings, with the
> requirement that any interpolated value have a string representation.
> Really all this does is calculate the value expression in the current
> report context, call the resulting value's "to_string()" method, and stuff
> the result into the output string. It also provides printf-like behavior,
> such as min/max width, right/left justification, etc.
>
> - Journal items
>
> Next is a base type shared by anything that can appear in a journal: an
> item_t. It contains details common to all such parsed entities, like what
> file and line it was found on, etc.
>
> - Journal posts
>
> The most numerous object found in a Journal, postings are a type of item
> that contains an account, an amount, a cost, and metadata.
> There are some other complications, like the account can be marked
> virtual, the amount could be an expression, etc.
>
> - Journal transactions
>
> Postings are owned by transactions, always. This subclass of item_t knows
> about the date, the payee, etc. If a date or metadata tag is requested
> from a posting and it doesn't have that information, the transaction is
> queried to see if it can provide it.
>
> - Journal accounts
>
> Postings are also shared by accounts, though the actual memory is managed
> by the transaction. Each account knows all the postings within it, but
> contains relatively little information of its own.
>
> - The Journal object
>
> Finally, all transactions with their postings, and all accounts, are owned
> by a journal_t object. This is the go-to object for querying and reporting
> on your data.
>
> - Textual journal parser
>
> There is a textual parser, wholly contained in textual.cc, which knows how
> to parse text into journal objects, which then get "finalized" and added to
> the journal. Finalization is the step that enforces the double-entry
> guarantee.
>
> - Iterators
>
> Every journal object is "iterable", and these iterators are defined in
> iterators.h and iterators.cc. This iteration logic is kept out of the
> basic journal objects themselves for the sake of modularity.
>
> - Comparators
>
> Another abstraction isolated to its own layer, this class encapsulates the
> comparison of journal objects, based on whatever value expression the user
> passed to --sort.
>
> - Temporaries
>
> Many reports bring pseudo-journal objects into existence, like postings
> which report totals in a "<Total>" account. These objects are created and
> managed by a temporaries_t object, which gets used in many places by the
> reporting filters.
>
> - Option handling
>
> There is an option handling subsystem used by many of the layers further
> down.
> It makes it relatively easy for me to add new options, and to have
> those option settings immediately accessible to value expressions.
>
> - Session objects
>
> Every journal object is owned by a session, with the session providing
> support for that object. In GUI terms, this is the Controller object for
> the journal Data object, where every document window would be a separate
> session. They are all owned by the global scope.
>
> - Report objects
>
> Every time you create report output, a report object is created to
> determine what you want to see. In the Ledger REPL, a new report object is
> created every time a command is executed. In CLI mode, only one report
> object ever comes into being, as Ledger immediately exits after displaying
> the results.
>
> - Reporting filters
>
> The way Ledger generates data is this: it asks the session for the current
> journal, and then creates an iterator applied to that journal. The kind of
> iterator depends on the type of report.
>
> This iterator is then walked, and every object yielded from the iterator is
> passed to an "item handler", whose type is directly related to the type of
> the iterator.
>
> There are many, many item handlers, which can be chained together. Each
> one receives an item (post, account, xact, etc.), performs some action on
> it, and then passes it down to the next handler in the chain. There are
> filters which compute the running totals; that queue and sort all the input
> items before playing them back out in a new order; that filter out items
> which fail to match a predicate, etc. Almost every reporting feature in
> Ledger is related to one or more filters. Looking at filters.h, I see over
> 25 of them defined currently.
>
> - The filter chain
>
> How filters get wired up, and in what order, is a complex process based on
> all the various options specified by the user. This is the job of the
> chain logic, found entirely in chain.cc.
> It took a really long time to get this logic exactly right, which is why I
> haven't exposed this layer to the Python bridge yet.
>
> - Output modules
>
> Although filters are great and all, in the end you want to see stuff. This
> is the job of special "leaf" filters called output modules. They are
> implemented just like a regular filter, but they don't have a "next" filter
> to pass the item on down to. Instead, they are the end of the line and
> must do something with the item that results in the user seeing something
> on their screen or in a file.
>
> - Select queries
>
> Select queries know a lot about everything, even though they implement
> their logic by implementing the user's query in terms of all the other
> features thus presented. Select queries have no functionality of their
> own; they are simply a shorthand to provide access to much of Ledger's
> functionality via a cleaner, more consistent syntax.
>
> - The Global Scope
>
> There is a master object which owns every other object, and this is
> Ledger's global scope. It creates the other objects, provides REPL
> behavior for the command-line utility, etc. In GUI terms, this is the
> Application object.
>
> - The Main Driver
>
> This creates the global scope object, performs error reporting, and handles
> command-line options which must precede even the creation of the global
> scope, such as --debug.
>
> And that's Ledger in a nutshell. All the rest are details, such as which
> value expressions each journal item exposes, how many filters currently exist,
> which options the report and session scopes define, etc.
>
> John
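
To check my own understanding of the "Reporting filters" and "Output modules" tiers quoted above, here is a minimal Python sketch of a handler chain. It is only an illustration under my own assumptions: every name in it (Post, Handler, FilterPosts, SortPosts, RunningTotal, CollectLines, run_report) is hypothetical and is not taken from Ledger's source or its Python bridge, and plain integers stand in for commoditized amounts.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass
class Post:
    """Hypothetical stand-in for a posting; not Ledger's post_t."""
    account: str
    amount: int      # a plain integer instead of a commoditized amount
    total: int = 0   # filled in by the running-total filter


class Handler:
    """Base item handler: forwards each item to the next handler, if any."""

    def __init__(self, next_handler: Optional["Handler"] = None) -> None:
        self.next_handler = next_handler

    def handle(self, post: Post) -> None:
        if self.next_handler is not None:
            self.next_handler.handle(post)

    def flush(self) -> None:
        # Signals the end of input so buffering filters can emit their items.
        if self.next_handler is not None:
            self.next_handler.flush()


class FilterPosts(Handler):
    """Drops items that fail to match a predicate."""

    def __init__(self, predicate: Callable[[Post], bool], next_handler: Handler) -> None:
        super().__init__(next_handler)
        self.predicate = predicate

    def handle(self, post: Post) -> None:
        if self.predicate(post):
            super().handle(post)


class SortPosts(Handler):
    """Queues every item, then plays them back in sorted order on flush."""

    def __init__(self, key: Callable[[Post], int], next_handler: Handler) -> None:
        super().__init__(next_handler)
        self.key = key
        self.queue: List[Post] = []

    def handle(self, post: Post) -> None:
        self.queue.append(post)

    def flush(self) -> None:
        for post in sorted(self.queue, key=self.key):
            super().handle(post)
        self.queue.clear()
        super().flush()


class RunningTotal(Handler):
    """Annotates each item with the running total of amounts seen so far."""

    def __init__(self, next_handler: Handler) -> None:
        super().__init__(next_handler)
        self.total = 0

    def handle(self, post: Post) -> None:
        self.total += post.amount
        post.total = self.total
        super().handle(post)


class CollectLines(Handler):
    """Leaf "output module": has no next handler and just records lines."""

    def __init__(self) -> None:
        super().__init__(None)
        self.lines: List[str] = []

    def handle(self, post: Post) -> None:
        self.lines.append(f"{post.account} {post.amount} {post.total}")


def run_report(posts: Iterable[Post], chain: Handler) -> None:
    """Walks the iterator and passes every item to the head of the chain."""
    for post in posts:
        chain.handle(post)
    chain.flush()


journal = [
    Post("Expenses:Food", 20),
    Post("Assets:Cash", -20),
    Post("Expenses:Rent", 500),
    Post("Expenses:Books", 35),
]
output = CollectLines()
chain = FilterPosts(
    lambda p: p.account.startswith("Expenses"),
    SortPosts(lambda p: p.amount, RunningTotal(output)),
)
run_report(journal, chain)
print("\n".join(output.lines))
```

The order of wiring matters here: placing RunningTotal after SortPosts makes the totals follow the sorted order, which I take to be the kind of ordering decision the chain logic in chain.cc has to make.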
