On Fri, 24 Jan 2014 09:57:41 -0500, Martin Blais <[email protected]> wrote:
> These would be better done in two separate steps IMHO:
> 
> 1. extract the data from whichever external source format (e.g. OFX) into
> an internal transaction data structure
> 2. "complete" incomplete imported transaction objects by adding missing
> legs using the past Ledger history

I do agree that it could make sense to split these up into two projects
in the future. At the moment, though, Reckon's scope is small enough that
this is not needed yet.
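
To make step 1 concrete, here is a rough Python sketch of extracting CSV
rows into an internal transaction structure. It is only an illustration of
the idea, not how Reckon or Beancount actually work: the Posting and
Transaction classes, the parse_csv helper, the column names and the
Assets:Bank:Checking account are all made up for the example.

```python
import csv
import datetime as dt
import io
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass
class Posting:
    account: str
    amount: Decimal


@dataclass
class Transaction:
    date: dt.date
    description: str
    postings: List[Posting] = field(default_factory=list)


def parse_csv(text: str, bank_account: str) -> List[Transaction]:
    """Turn CSV rows into transactions that carry only the bank-side leg."""
    transactions = []
    for row in csv.DictReader(io.StringIO(text)):
        txn = Transaction(
            date=dt.datetime.strptime(row["date"], "%Y-%m-%d").date(),
            description=row["description"].strip(),
        )
        # Only one leg is known at this point; the other is added in step 2.
        txn.postings.append(Posting(bank_account, Decimal(row["amount"])))
        transactions.append(txn)
    return transactions


# Hypothetical bank export, assumed to have these three columns.
SAMPLE = """date,description,amount
2014-01-20,TESCO STORES 1234,-23.50
2014-01-22,SALARY ACME LTD,1500.00
"""

incomplete = parse_csv(SAMPLE, "Assets:Bank:Checking")
```

Each resulting transaction is deliberately incomplete: it has a single
posting, and a separate step would be responsible for adding the missing leg.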

> 
> About (1): CSV files are pretty rare. The only ones I've come across (in my
> own little bubble of a world) are PayPal, OANDA, and Ameritrade. Much more
> common for banks, investment and credit card companies is OFX and Quicken
> files. I also find it convenient to recognize at least *some* data from PDF
> files, such as the date of a statement, for automatic classification and
> filing into a folder (you could apply machine learning to this problem,
> i.e. give a whole bunch of disorganized words from what is largely
> imperfect PDF to text conversion, classify which statement it is, but
> crafting a few regexps by hand has proved to work quite well so far).  I'll
> add anonymized example input files to Beancount for automated testing at
> some point, they'll be going here:
> https://hg.furius.ca/public/beancount/file/tip/src/python/beancount/sources

In my experience banks always seem to support at least CSV, and QIF or
OFX only when you are lucky. I only have experience with personal banking in
the UK and the Netherlands though.

> 
> I'm thinking.... maybe it would make sense for importers (mine and/or
> yours) to spit out some sort of XML/JSON format that could be converted
> into either Ledger or Beancount syntax or whatever else? This way all those
> importers could be farmed out to another project and reused by users of
> various accounting software. Does this make sense?

I would love such a project, but have little use for reading anything
other than CSV files myself.
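
As a rough idea of what such an intermediate format could look like, here
is a small Python sketch. The JSON layout (date, description, postings with
account/amount/currency) is an assumption invented for this example, not an
existing standard, and the two converters only approximate Ledger and
Beancount syntax for the simplest kind of transaction.

```python
import json

# Hypothetical interchange record; amounts are strings to avoid float rounding.
TXN = {
    "date": "2014-01-20",
    "description": "TESCO STORES 1234",
    "postings": [
        {"account": "Assets:Bank:Checking", "amount": "-23.50", "currency": "GBP"},
        {"account": "Expenses:Groceries", "amount": None, "currency": None},
    ],
}


def posting_lines(txn: dict) -> list:
    """Render postings; a posting without an amount is left to be inferred."""
    lines = []
    for p in txn["postings"]:
        if p["amount"] is None:
            lines.append(f"    {p['account']}")
        else:
            lines.append(f"    {p['account']}  {p['amount']} {p['currency']}")
    return lines


def to_ledger(txn: dict) -> str:
    header = f"{txn['date'].replace('-', '/')} {txn['description']}"
    return "\n".join([header] + posting_lines(txn))


def to_beancount(txn: dict) -> str:
    header = f"{txn['date']} * \"{txn['description']}\""
    return "\n".join([header] + posting_lines(txn))


# What an importer would emit, and what a converter would read back.
payload = json.dumps([TXN], indent=2)
restored = json.loads(payload)

print(to_ledger(restored[0]))
print(to_beancount(restored[0]))
```

The importer only has to know about the bank's format and the JSON layout,
while each accounting tool needs just one small converter from that layout.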

> 
> About (2): If Ledger supports input'ing incomplete transactions, you could
> do this without relying on CSV conversion, that would be much more
> reusable. In Beancount, my importers are allowed to create invalid
> transaction objects, and I plan to put in a simple little perceptron
> function that should do a good enough job of adding missing legs
> automatically (one might call this "automatic categorization"),
> independently of input data format.

In some ways this is the main part of what Reckon already does, since it
fills in the missing information based on entries in an existing ledger.
The CSV parsing was only added to make it easier to read in the data.
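
The general idea of filling in the missing account from past entries can
be sketched in a few lines of Python. Note that this is a plain
word-overlap heuristic chosen for brevity; it is neither Reckon's actual
matching code nor the perceptron mentioned above, and the function names,
sample history and Expenses:Unknown fallback are made up for the example.

```python
import re
from collections import Counter, defaultdict


def tokenize(description: str) -> list:
    """Lower-case alphabetic words; digits such as store numbers are dropped."""
    return re.findall(r"[a-z]+", description.lower())


def train(history):
    """history: (description, account) pairs taken from an existing ledger."""
    counts = defaultdict(Counter)
    for description, account in history:
        counts[account].update(tokenize(description))
    return counts


def guess_account(counts, description, default="Expenses:Unknown"):
    """Pick the account whose past descriptions share the most words."""
    words = tokenize(description)
    best_account, best_score = default, 0
    for account, word_counts in sorted(counts.items()):
        score = sum(word_counts[w] for w in words)
        if score > best_score:
            best_account, best_score = account, score
    return best_account


# Hypothetical ledger history.
HISTORY = [
    ("TESCO STORES 1234", "Expenses:Groceries"),
    ("TESCO EXPRESS 88", "Expenses:Groceries"),
    ("SALARY ACME LTD", "Income:Salary"),
    ("SHELL PETROL 4471", "Expenses:Car:Fuel"),
]

model = train(HISTORY)
print(guess_account(model, "TESCO STORES 9999"))
print(guess_account(model, "UNSEEN MERCHANT"))
```

Because it only looks at the description text and the known accounts, a
function like this does not care whether the incomplete transaction came
from CSV, OFX or anything else.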

Edwin
