btw, my CORBA development experience causes me to issue the following
warning: I would be leery of IORs, as "interoperable" is not something
they always are... sigh.

-----Original Message-----
From: Christopher Browne [mailto:[EMAIL PROTECTED]]
Sent: Monday, November 22, 1999 10:00 AM
To: [EMAIL PROTECTED]
Subject: Re: The Ultimate Calculation Engine... 


On Mon, 22 Nov 1999 13:34:00 +0100, the world broke into rejoicing as
Jan Schrage <[EMAIL PROTECTED]>  said:
> On Sat, Nov 20, 1999 at 12:21:59PM -0600, Rob Browning wrote:
> > Christopher Browne <[EMAIL PROTECTED]> writes:
> > 
> > > Rob Browning suggested that it might be good to have a way of
> > > somehow handling calculations done in Guile; I think I have some
> > > suggestions for architecture.  Either better still, or scarier
> > > still, my "strawman" proposal has not, to the best of my 
> > > knowledge, been implemented in any system.
> > 
> > [...]
> > 
> > > a) Transactions need to become "first class objects," with the ability
> > >    to attach to them a schedule on which they are to be repeated.
> > 
> > I'll probably have more to say about this later, but I'm not sure I
> > think that this needs to involve modifying the structure of
> > transactions in the engine, though it might need to involve modifying
> > the data file format.
> 
> A much simpler, quicker and more reliable solution would be, IMHO, to
> leave the account database as it is and simply add a second database
> (i.e. structure in memory + file) containing nothing but the scheduled
> transactions for this account. Transactions belonging to some scheme
> could simply be flagged - not with an ID but simply as "automatic" or
> something (I dimly remember somebody saying there is an unused bitfield
> around).  

This approach has the merit that it is the way that CBB has handled
recurring transactions for quite some years now...
<a href="http://www.menet.umn.edu/~curt/cbb/cbb-devel/contrib/recur.pl">recur.pl</a>
<a href="http://www.menet.umn.edu/~curt/cbb/cbb-devel/contrib/loan_recur.pl">loan_recur.pl</a>

"recur" takes an input file where repetition criteria are specified 
exactly as in a crontab, and then applies those transactions to a
specified data file.

It traverses all the transactions, and rewrites them thusly:
--> Recurrence transactions are marked with "?" in the "cleared" field.
--> If the date of a particular transaction has passed (i.e. it is now
    in the past), change "?" to "!" so that "recur" won't touch it anymore.
--> Future-dated transactions with "?" get thrown away altogether, to be
    replaced by fresh new ones based on the recurrence set.
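A rough sketch of that rewrite pass, in Python rather than CBB's Perl, with hypothetical field names ("cleared", "date") standing in for CBB's actual record format:

```python
from datetime import date

# Hypothetical transaction records; CBB's real data format differs.
# Schedule-generated entries carry "?" in the "cleared" field; "!"
# marks a generated entry whose date has passed, so it is frozen.
def rewrite(transactions, today, fresh_from_schedule):
    kept = []
    for txn in transactions:
        if txn["cleared"] == "?" and txn["date"] <= today:
            kept.append({**txn, "cleared": "!"})  # past: freeze it
        elif txn["cleared"] == "?":
            continue              # future: discard, to be regenerated
        else:
            kept.append(txn)      # ordinary transaction: untouched
    return kept + fresh_from_schedule

txns = [
    {"date": date(1999, 11, 1), "cleared": "?", "amount": -500},  # past
    {"date": date(1999, 12, 1), "cleared": "?", "amount": -500},  # future
    {"date": date(1999, 11, 5), "cleared": "x", "amount": 100},   # manual
]
new = [{"date": date(1999, 12, 1), "cleared": "?", "amount": -500}]
result = rewrite(txns, date(1999, 11, 22), new)
```

The key property is that manually entered transactions are never touched; only "?"-flagged entries are ever rewritten or discarded.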

> This approach has, I think, several advantages over using just
> one file:
> 
> 1) Data for a single transaction and data for some transaction scheme
>    (loans etc.) differ considerably. It seems quite appropriate to store
>    them in distinct files with distinct data formats, each suited to
>    its task. Besides, a single transaction and a loan scheme are really
>    quite different concepts that should only be mixed at our peril.

It seems to me that it makes sense to separate out somehow
  a) Transaction data
  b) Scheduling information for recurring transactions
  c) Calculation information for recurring transactions
Those can logically have quite different properties, and should be separate.
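To make the separation concrete, here is a minimal sketch; all of the type and field names are hypothetical, not GnuCash structures:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Transaction:           # (a) the ledger entry itself
    date: date
    description: str
    amount: int              # in cents

@dataclass
class Schedule:              # (b) when a recurring transaction fires
    day_of_month: int
    last_generated: date

@dataclass
class Calculation:           # (c) how each instance's amount is computed
    formula: str             # e.g. some expression over account balances

# A recurring item ties one schedule and one calculation to a template
# transaction, without widening the Transaction record that every
# ordinary entry uses.
@dataclass
class RecurringItem:
    template: Transaction
    schedule: Schedule
    calculation: Calculation
```

The point of the design is that (a) stays small and uniform, while (b) and (c) live in their own records and can change shape independently.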

> 2) Accounts and scheduled transactions can easily be processed
>    separately, e.g. by conversion tools, external report generators
>    and whatever you like to think of. In return any tool designed
>    to automatically update accounts only has to look at the schedules
>    in order to determine whether anything needs to be done.
> 3) Reliability. Building parsers that act correctly in all circumstances
>    is not an easy task; especially not if you are dealing with objects
>    that can dynamically store information. In addition, dealing with
>    transaction objects of variable size is prone to make the transaction
>    engine less reliable. Just imagine a scheduling tool generating a
>    not-quite-correct transaction entry. A scheduling tool generating a 
>    not-quite-correct schedule is far less dramatic, for it doesn't
>    affect the accounts directly. (Of course, coming from the OO-world
>    I am all for encapsulation of data wherever possible. :-)

What this means is that it is necessary to be particularly paranoid about
*deleting* transactions.  That's the particularly unsafe action.

> 4) It is more like real life. This need not be an advantage in itself,
>    but I usually find that it makes a programmer's life easier.  Your
>    balance sheets show you transactions and balances. Full stop. Loan
>    schemes, depreciation schemes and the like are usually stored on
>    different pieces of paper. Thinking of how I am doing this by hand, I
>    don't need to know about transactions when setting up a scheme like
>    this, nor do I need to know about my exact depreciation schemes to
>    understand that a transaction on my balance sheet is part of one. I
>    don't really see why an accounting program needs to know about both in
>    the same data structure, so to say. 

The fact that the computer can context-switch millions of times per second
means that such a separation need only push these things a millisecond
apart...

> 5) Actually, it is what you would do in database design. Just create a
>    second table instead of adding fields to the first that will in most
>    cases remain empty, thereby adding needless complexity to your data
>    structures and the parts of your program handling them.

This tends to be called "normalization," and it would indeed be quite
fair to add some additional tables to track things that need to be tracked.

One "for instance" is that it would probably be worthwhile to have a
simple table that lists transactions in chronological order.  There
is at present no single table in GnuCash that stores transactions
themselves; merely their components.  Adding a way of walking through
transactions in an orderly manner seems to me to be a nice improvement.
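Such a table could even be derived from the existing per-account components. A sketch, with invented field names ("txn_id", "account") standing in for whatever the engine actually records:

```python
from datetime import date

# Hypothetical split records: each transaction's halves are stored
# per-account, and no single chronological transaction table exists.
splits = [
    {"account": "Checking", "txn_id": "t2", "date": date(1999, 11, 20)},
    {"account": "Rent",     "txn_id": "t1", "date": date(1999, 11, 1)},
    {"account": "Checking", "txn_id": "t1", "date": date(1999, 11, 1)},
]

# Derive the missing "transaction table": one row per transaction,
# ordered by date -- effectively a normalized index over the splits.
def transaction_index(splits):
    seen = {}
    for s in splits:
        seen.setdefault(s["txn_id"], s["date"])
    return sorted(seen.items(), key=lambda kv: (kv[1], kv[0]))

index = transaction_index(splits)
```

Walking `index` then visits each transaction exactly once, in date order, regardless of how many accounts its components are scattered across.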

> > I envision the system as having a "scheduling infrastructure", and
> > your scenario would just be handled via code that would instantiate
> > transactions when the right criteria are satisfied, rather than as a
> > modification to the engine-level idea of transactions themselves.  For
> > example (definitely pseudo-code):
> > 
> >   (gnc:add-event '(days-of-month 1 5 21)
> >     (gnc:add-transaction ...
> >                          other-transaction-details
> >                          (compute-some-value
> >                            (gnc:get-account-balance
> >                             (gnc:lookup-account-by-name "SomeAccount")))))
> > 
> > Of course something would have to be done to guarantee uniqueness,
> > either some event tagging scheme or some locking semantics, but that's
> > probably doable.  And this sort of sounds like a job for "cron" or
> > "at", but we probably can't use those.  I don't think they have
> > acceptable guarantees on execution or enough interface flexibility.
> > We'll have to do something ourselves.

Note that CBB uses the same syntax for specifying events as does
cron/at; even if we use none of the same code, it might still be a very
good idea to use the same syntax, as it is already well-understood.
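For flavor, here is a minimal sketch of matching a cron-style day-of-month field ("1,5,21", "*", or a range like "1-7") against a date; real crontab syntax has four more fields and additional forms such as step values:

```python
from datetime import date

# Minimal cron-style day-of-month matcher: "*" matches everything,
# comma-separated entries may be single days or "lo-hi" ranges.
def matches_day_spec(spec, d):
    if spec == "*":
        return True
    for part in spec.split(","):
        if "-" in part:
            lo, hi = map(int, part.split("-"))
            if lo <= d.day <= hi:
                return True
        elif int(part) == d.day:
            return True
    return False
```

The attraction of reusing the syntax is exactly this: the semantics are already documented in crontab(5), so users need learn nothing new.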

> Indeed they haven't. Just think of somebody switching off his PC. cron
> will not execute any task that should have occurred meanwhile. 
> I also believe this is not necessary. After all, you do not need files
> that are changed at a particular date or time. First time you really
> need the change is when you next look at the files, meaning the accounts
> could be updated upon startup. Uniqueness is - I think - guaranteed, too. 

There are two problems:
a) Identifying that a transaction is already on the system, or not.
   This corresponds more or less to my comment that it would be nice to
   have something like a CORBA IOR for each transaction.

   A present weakness of GnuCash that has been making me a bit uneasy
   for rather a while is that it has no "transaction table," and,
   relatedly, that transactions do not have any system-created
   identifier that is permanent across sessions.  Such permanent
   identifiers are a normal thing to see in "traditional" accounting
   systems.

b) Being able to update the "database" from offline at the same time
   that someone is running GnuCash.
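Problem (a) largely dissolves once each transaction carries a permanent, system-created identifier. A sketch of the idea (using a random 128-bit id rather than anything CORBA-specific; function names are invented):

```python
import uuid

# Give each transaction an identifier that is permanent across
# sessions, playing roughly the role a CORBA IOR would: a handle by
# which an external process can say "this exact transaction", e.g.
# to check whether it is already on the system.
def new_transaction_id():
    return uuid.uuid4().hex   # 128-bit random id, stable once assigned

def already_present(txn_id, known_ids):
    return txn_id in known_ids

known = {new_transaction_id() for _ in range(3)}
fresh = new_transaction_id()
```

Once ids are stored with the transactions, the "is it already on the system?" check becomes a set lookup rather than a fuzzy comparison of dates and amounts.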

> I think this, as well as the rest of your mail suggesting a server
> process, really implies that a computer be up and running all the time.
> For a number of gnucash users that will not be the case. Thinking of a
> client-server system means gnucash will be playing in a different league
> of accounting systems. In that case I would also suggest moving to RDBMS
> for the storage of accounting information. Ah, well, perhaps we should
> stick to the way it's going now at least for a while.

There lies the crux of one of the big issues; I *disagree* that moving
to an RDBMS would be appropriate; an appropriate move might be to an
embedded DBMS, with some tools to extract the data out to an RDBMS.  

But a certainly thorny issue is that of giving external processes a
way of getting data into the system whilst
  - Gnucash might be running, or
  - Gnucash might *not* be running at all.

That being said, what comes to mind as a resolution to *both* would be
some sort of transaction queueing system, where external processes would
marshal data, and push it to some little GnuCash process that would drop
that data into some form of "queue" where it could wait until the main
GnuCash process gets a chance to deal with it.
Should that be:
  a) Ten seconds from now, when you click on a "recalc" button at which
     point GnuCash checks to see if any changes are queued, or
  b) 8 days from now, when you start up GnuCash, and it pops up a dialog
     indicating: "42 transactions queued up offline.  Would you like to
     review them before they get inserted?"

Interesting thing...  If there is a transaction recurrence scheme, and a
text-based way of storing recurrences, then this makes it quite natural
to queue external updates by building a file of one-time recurrences,
and queueing them up as needed.
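The file-of-one-time-recurrences idea might look roughly like this; the spool format and field names are entirely hypothetical:

```python
import json
import os
import tempfile

# Sketch of the queue-file idea: an external process appends one-shot
# records to a spool file; the next time GnuCash runs (or the user
# clicks "recalc"), it drains the file and offers the entries for
# review before inserting them.
def enqueue(path, entry):
    with open(path, "a") as f:             # append-only: safe to do
        f.write(json.dumps(entry) + "\n")  # while GnuCash is not reading

def drain(path):
    if not os.path.exists(path):
        return []
    with open(path) as f:
        entries = [json.loads(line) for line in f if line.strip()]
    os.remove(path)                        # consumed: queue is now empty
    return entries

spool = os.path.join(tempfile.mkdtemp(), "queue.jsonl")
enqueue(spool, {"date": "1999-11-22", "amount": -500, "desc": "rent"})
enqueue(spool, {"date": "1999-11-23", "amount": 100, "desc": "refund"})
pending = drain(spool)
```

Because writers only ever append and the reader consumes the whole file at once, the same mechanism covers both the "GnuCash is running" and the "GnuCash starts up 8 days later" cases.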

In other words, solve one problem and you may solve them all...
--
"Without insects, our ecosystem would collapse and we would all die. In
that respect, insects are far more important than mere end-users."
-- Eugene O'Neil <[EMAIL PROTECTED]>
[EMAIL PROTECTED] - <http://www.ntlug.org/~cbbrowne/lsf.html>

--
Gnucash Developer's List 
To unsubscribe send empty email to: [EMAIL PROTECTED]

