On Fri, Feb 24, 2012 at 7:59 PM, Russell Adams <[email protected]> wrote:
> Out of the many issues I've had "scaling up" automation has been
> fairly easy for my specific case. It's worth bringing up because it is
> unlikely that a large Ledger would be entirely written by
> hand. Whether you are dealing with stock values, or bank and credit
> card statements automation ought to be the first priority.
>
> Ledger's goal is to provide reporting on the data files, but creating
> those files is left as an exercise to the user. Perhaps this is
> another place where a UI could be useful, as an editor that
> compliments the command line reporting.

I tend to agree with you here. I've got a similar collection of scripts of various kinds, ranging from simple automation to a semi-complete scanned PDF -> tesseract OCR -> fuzzy matching classification engine -> MacRuby file sorting GUI, which I haven't hooked up to matching ledger transactions yet.

There's definitely a place out there for a "munge ledger files in interesting ways" tool: for example, one that could split or sort a ledger file by criteria, or invert all transactions in a ledger file. While this violates the idea of the ledger files standing on their own as input only to ledger calculation programs, there's nothing to say that a text editor is the only suitable tool for modifying that input.

> I utilize a single credit card as often as practical while traveling,
> so that I can import that data reliably from my bank. Using this as my
> primary data feed ensures I catch any unusual transactions (ie: fraud,
> cancellation fees, etc).
>
> I wrote CSV2Ledger to automate the import of CSV data into the Ledger
> format, and to automate as much account, category, file and metadata
> matching as I possibly could. This is such a common task that Ledger
> and Hledger have some new automation options, and there are many
> competing projects for importing data into Ledger.

Have you tried hledger's CSV conversion? I tried both, and while CSV2ledger has more features, I found I preferred hledger's single configuration file, and the fact that it didn't modify that file when used.

> I'm heavily dependent on deduplication because my CSV files I download
> often have overlapping date ranges. I added the ability to tag each
> txn with an md5sum of the original CSV to CSV2Ledger for this purpose,
> and use rough text matching (ie: grep) with optional caching to
> prevent duplication.
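
For anyone following along, here is a rough sketch of how I understand that deduplication idea. It is not the CSV2Ledger code. The function names, the "CSVMD5" tag name, the three-column CSV layout (date, payee, amount) and the account names are all made up for illustration, and it assumes one transaction per CSV line with no quoted newlines.

```python
import csv
import hashlib


def row_digest(raw_line: str) -> str:
    """Return the MD5 hex digest of one raw CSV line, whitespace-trimmed."""
    return hashlib.md5(raw_line.strip().encode("utf-8")).hexdigest()


def new_rows(csv_text: str, ledger_text: str):
    """Yield (digest, fields) for CSV rows whose digest is not in the ledger."""
    for raw_line in csv_text.splitlines():
        if not raw_line.strip():
            continue  # skip blank lines
        digest = row_digest(raw_line)
        # Rough text match, in the spirit of grep: if the digest appears
        # anywhere in the ledger text, treat the row as already imported.
        if digest in ledger_text:
            continue
        fields = next(csv.reader([raw_line]))
        yield digest, fields


def format_txn(digest: str, fields: list) -> str:
    """Render one row as a ledger transaction tagged with its digest.

    Assumes a hypothetical (date, payee, amount) column layout.
    """
    date, payee, amount = fields
    return (
        f"{date} {payee}\n"
        f"    ; CSVMD5: {digest}\n"
        f"    Expenses:Unknown    ${amount}\n"
        f"    Liabilities:CreditCard\n"
        "\n"
    )


if __name__ == "__main__":
    # Two downloads with overlapping date ranges.
    first_csv = "2012-02-01,Coffee Shop,3.50\n2012-02-02,Bookstore,20.00\n"
    second_csv = "2012-02-02,Bookstore,20.00\n2012-02-03,Grocer,45.10\n"

    ledger = ""
    for download in (first_csv, second_csv):
        for digest, fields in new_rows(download, ledger):
            ledger += format_txn(digest, fields)
    print(ledger)
```

One limitation of hashing the raw line: two genuinely separate purchases that produce byte-identical CSV lines get the same digest, so once one is in the ledger a later import would skip the other.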

A side note: there's a two-line pull request from me on GitHub that switches the CSV2ledger code to use hashlib instead of md5, which is deprecated and throws warnings in recent versions of Python.

- Zack
