BTW, here's an auto-generated example file that looks similar to how I organize mine using org-mode: https://bitbucket.org/blais/beancount/src/tip/examples/example.beancount
On Tue, Feb 2, 2016 at 11:07 PM, Martin Blais <[email protected]> wrote:
> On Tue, Feb 2, 2016 at 10:48 PM, John Hendy <[email protected]> wrote:
>>
>> On Monday, February 1, 2016 at 10:41:26 PM UTC-6, Martin Blais wrote:
>>>
>>> On Mon, Feb 1, 2016 at 1:13 PM, John Hendy <[email protected]> wrote:
>>>>
>>>> Greetings,
>>>>
>>>> It's a fresh year and I've been seeing ledger come up on the Org-mode
>>>> mailing list for some time and decided to give it a try. I'm coming
>>>> from Moneydance and just wanted to get away from the tedious GUI
>>>> method of adding information, as well as have flexibility to generate
>>>> my own reports/visualizations with python or R, etc. [1]
>>>>
>>>> Consider that I'm about a week into reading through docs here and
>>>> there during evenings. My first step was going to be importing a
>>>> downloaded .csv from my bank to get started. I'm still trying to
>>>> verify I get the terminology, so I'll use this from the manual:
>>>>
>>>> From 5.1 Basic format:
>>>> ```
>>>> This transaction has a date, a payee or description, a target account
>>>> (the first posting), and a source account (the second posting). Each
>>>> posting specifies what action is taken related to that account.
>>>> ```
>>>>
>>>> From 7.2.1.2 The convert command:
>>>> ```
>>>> The fields ledger can recognize contain these case-insensitive strings
>>>> date, posted, code, payee or desc or description, amount, cost, total,
>>>> and note.
>>>> ```
>>>>
>>>> For my purposes, I import my finances primarily to "categorize" (what
>>>> I believe here is called adding an account) and assign a payee so that
>>>> I can track my spending against a budget. So, I'm surprised there's no
>>>> special column keyword I can add for "account". It appears that all I
>>>> can do is pass, say, `--account "assets:checking"` to have ledger know
>>>> it's against assets:checking? Is that correct?
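For what it's worth, the "account column" idea can be handled outside of ledger with a few lines of script. Here is a minimal sketch, assuming a bank CSV that has been hand-annotated with an extra `account` column. The column names, the `Assets:Checking` default, and the `csv_to_ledger` helper are all illustrative assumptions; none of this is part of ledger's `convert` command.

```python
import csv
import io
from decimal import Decimal


def csv_to_ledger(csv_text, source_account="Assets:Checking", currency="$"):
    """Render each CSV row as a two-posting Ledger-style transaction.

    Assumes columns named date, payee, amount and account (hypothetical),
    with amount negative when money leaves the source account.
    """
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        amount = Decimal(row["amount"])
        # The annotated account gets the opposite of the bank-side amount;
        # the source posting is left blank so the amount is inferred.
        entries.append(
            f"{row['date']} {row['payee']}\n"
            f"    {row['account']}  {currency}{-amount}\n"
            f"    {source_account}\n"
        )
    return "\n".join(entries)


sample = (
    "date,payee,amount,account\n"
    "2016/02/01,Cafe,-12.50,Expenses:Food:Dining\n"
)
print(csv_to_ledger(sample))
```

The output can be appended to a journal file and reviewed there, so the categorizing still happens in the spreadsheet.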
>>>>
>>>> From trying to google "import csv account ledger" or similar
>>>> variations, I've been surprised that the only tools to do something
>>>> like this appear to be interactive one-trans-at-a-time programs like
>>>> icsv2ledger and reckon (granted, they can learn or follow rules). I
>>>> could quickly go through my bank's .csv and add exp:food:dining,
>>>> exp:auto:fuel to my ~100 transactions a month and have those imported
>>>> just like the other column data.
>>
>> Thanks for the awesome reply!
>>
>>> Keep in mind that part of the process of importing (they like to call it
>>> "reconciling") involves
>>> - Manually reviewing the transactions for correctness or fraud
>>
>> I'll get there. For better or worse, I take the downloaded bank .csv as
>> "truth" and am mostly interested in getting a better handle on what my
>> money is used on, budgeting, planning, etc.
>>
>>> - Merging new transactions with previous transactions imported from the
>>>   other side (e.g. a payment from a bank account to pay off one's credit
>>>   card will typically be imported from both the bank AND credit card
>>>   accounts; you must merge the corresponding transactions together)
>>
>> Definitely. Moneydance allowed me to input an account, which would "link"
>> the transaction. Then I'd have to delete or merge the other account's
>> record of the same transaction.
>
> BTW, there are some ideas around about automatically merging two
> incomplete transactions. This problem is the dual of solving the issue of
> settlement dates, i.e., the problem being that the two sides of a
> transaction may settle on different days.
> See http://furius.ca/beancount/doc/proposal-settlement for some
> ruminations and scour the mailing-list, there is more discussion about this.
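To make the merging problem concrete, one rough heuristic is to pair an entry from one account with an entry of the opposite amount from the other account when the two dates are within a few days of each other. This is only a sketch: the `Txn` type, the three-day window and `match_transfers` are assumptions made up for illustration, not how Ledger or Beancount handle it.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from decimal import Decimal


@dataclass(frozen=True)
class Txn:
    day: date
    account: str
    amount: Decimal  # signed from the point of view of `account`


def match_transfers(side_a, side_b, window=timedelta(days=3)):
    """Greedily pair entries with opposite amounts and nearby dates."""
    unmatched_b = list(side_b)
    pairs = []
    for a in side_a:
        for b in unmatched_b:
            if a.amount == -b.amount and abs(a.day - b.day) <= window:
                pairs.append((a, b))
                unmatched_b.remove(b)  # each entry is matched at most once
                break
    return pairs


bank = [Txn(date(2016, 1, 29), "Assets:Checking", Decimal("-500.00"))]
card = [
    Txn(date(2016, 2, 1), "Liabilities:CreditCard", Decimal("500.00")),
    Txn(date(2016, 2, 1), "Liabilities:CreditCard", Decimal("-20.00")),
]
for a, b in match_transfers(bank, card):
    print(a.day, a.account, "<->", b.day, b.account, b.amount)
```

A greedy match like this can pair the wrong entries when two payments of the same amount fall inside the window, so the result still needs a manual look.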
>
>>> - Assigning the right category (you can automate this with a script I
>>>   suppose; frankly it's not much work, I do all of mine manually with the
>>>   help of auto-completion from Emacs, which is the most important feature
>>>   IMO)
>>
>> Huh. Yes, I'll definitely have to look into the emacs mode. I assumed
>> once it was in ledger format it would be *a lot* harder to navigate around
>> vs. just doing it while it's already in a spreadsheet format.
>
> Definitely not, text is there for your pleasure. You typically organize
> your Ledger input file in the order that makes the most sense for you
> (minus some constraints: Ledger will report the transactions in the order
> they appear in the file and the balance assertions are computed as such.
> Beancount sorts everything by date so order doesn't matter).
>
>>> - Moving the resulting transactions to the right place in your file.
>>
>> I'll have to look into this more. I get that this is the ledger list...
>> but is beancount different in this respect? From reading your docs, it
>> sounded like beancount didn't care about order. Or are there other reasons
>> (besides date) that one would have to move transactions around?
>
> In Ledger, the reporting is done in file order. Balance assertions as well.
> In Beancount, order is by date, so you don't have to care about how you
> organize them.
> I think - but I'm not 100% sure - that most Ledger users must store their
> input file by section, and in each section in date order, to minimize the
> number of out-of-order transactions if they print out a register.
> I use org-mode to create sections and each section is stored in date order
> for some subset of accounts.
>
>>> - Verifying balances visually, or inserting a balance directive which
>>>   asserts what the final account balance should be (for correctness) after
>>>   the new transactions.
>>>
>>> If you do it often enough and you have editing chops, you get used to
>>> the dance and it's a breeze.
>>> I think the fourth step can be hypothetically solved using heuristics.
>>>
>>>> I feel like I must be missing something with respect to getting the
>>>> from/to accounts added to the bank data.
>>>>
>>>> Perhaps to take a step back...
>>>> - are the majority of folks writing their transactions by hand in
>>>>   ledger format?
>>>
>>> Can't say about others, but for me I want to say that about half the
>>> importing is semi-automatic.
>>> - Credit cards and banks: imported from downloads, but I need to
>>>   categorize manually (as described above). Fairly good quality downloads.
>>> - Investment accounts: buys are fully automated, but I need to manually
>>>   edit sales in some accounts. Great quality of downloads.
>>> - Payroll stubs and vesting and a few other things are provided only as
>>>   PDFs and I don't bother trying to extract (though I've made some headway
>>>   towards this, it's incomplete; it turns out fully automating table
>>>   extraction from PDF isn't trivial. The best OSS solution is TabulaPDF by
>>>   far but you still need to manually identify where the table is).
>>> - Cash transactions: I have to enter those by hand. I only book non-food
>>>   expenses as individual transactions directly, and for food maybe once
>>>   every six months I'll count my wallet balance and insert one transaction
>>>   per month to debit away the cash account toward food. If you do this,
>>>   you end up with surprisingly few transactions to book manually, maybe a
>>>   few/week. I suppose it could depend on lifestyle choices.
>>>
>>> It takes me less than 1 hour/week to run through the active accounts,
>>> usually first thing Saturday morning when I get up. Most of the pain is
>>> logging with user/passwords into the various institutions and clicking the
>>> right buttons to generate the downloaded files. Extraction and filing is
>>> automated using importers I wrote against LedgerHub. Less active accounts
>>> are updated every quarter or when I feel like it.
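The "automate this with a script" idea for assigning categories could look roughly like an ordered list of payee patterns where the first match wins. The rules, account names and the `categorize` function below are made up for illustration; real rules would depend entirely on what your bank puts in the payee field.

```python
import re

# Ordered (pattern, account) rules; the first match wins. All hypothetical.
RULES = [
    (re.compile(r"shell|exxon", re.IGNORECASE), "Expenses:Auto:Fuel"),
    (re.compile(r"cafe|pizza|restaurant", re.IGNORECASE), "Expenses:Food:Dining"),
]
FALLBACK = "Expenses:Uncategorized"


def categorize(payee):
    """Return the account of the first rule whose pattern matches the payee."""
    for pattern, account in RULES:
        if pattern.search(payee):
            return account
    return FALLBACK


for payee in ["SHELL OIL 5744", "Luigi's Pizza", "ACME HARDWARE"]:
    print(f"{payee:20} {categorize(payee)}")
```

Anything that lands in the fallback account is what is left to categorize by hand.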
>>
>> This is a helpful time estimate/reference. My main account (checking) has
>> ~100 transactions per month. I don't mind categorizing them myself, but I
>> hoped for a quick-ish way to do that. Typing "expenses:blah:blah" is pretty
>> fast in a spreadsheet. While I *use* emacs, I'm no navigation whiz, and
>> going to the right place in a block of text to type the same thing seems
>> super tedious vs. a spreadsheet. Hence I was puzzled that I couldn't use
>> ledger's convert command to just bring in accounts from the .csv along with
>> the rest. After all, all the dates and amounts are there, one can add
>> payees... why not accounts?
>
> You can probably script that away with a few rules.
> I admit that 100 txns/month is more than I have, and I might look into
> auto-categorizing most of it myself if I were in that situation.
> Problem is, everyone's little scripts appear to have little in common.
>
>>>> - is there some better way to import bulk data (e.g. via ledger's
>>>>   convert function) and post-edit once it's in ledger format? It seemed
>>>>   a .csv in LO calc was pretty convenient vs. scrolling through a long
>>>>   text file
>>>> - any other pointers along the above lines would be most welcome.
>>>
>>> Check out LedgerHub for ideas.
>>>
>>> Original design doc:
>>> http://furius.ca/ledgerhub/doc/design
>>>
>>> Post-mortem:
>>> http://furius.ca/ledgerhub/doc/postmortem
>>>
>>> The project is being killed right now, rewritten much better and simpler
>>> and migrated into the Beancount project; if you do end up looking at the
>>> code make sure you're checking out the "stable" branch, it's a bit of a
>>> riot on the default branch right now, it will be broken.
>>>
>>> Essentially, I'm defining a config (in Python) as a list of "importer"
>>> objects and boil the process down to three steps:
>>> 1. Identify: Given a messy list of downloaded files (e.g.
>>>    in ~/Downloads), automatically identify which importer is supposed to
>>>    handle them
>>> 2. Extract: Extracting transactions and statement date from each file,
>>>    if possible
>>> 3. File: Filing away the downloads to a directory hierarchy which
>>>    mirrors the chart of accounts, for preservation, e.g. in a personal git
>>>    repo.
>>>
>>> You could think of adding
>>> 0. Fetch: Automatically download the files
>>> but that's too hard. Personally I just don't have the stamina to
>>> implement this for myself. Given the nature of today's websites and the
>>> castles of JavaScript used to implement them, this would be a nightmare to
>>> implement for too little payoff. I love the idea of full automation, but I
>>> just don't have the time. Note that if you don't mind the nature of their
>>> business (they sell your data), you could potentially try to use Yodlee to
>>> pull much of it from a single place.
>>
>> Yeah, not interested in that. It's not a big deal to download the few
>> files I need.
>>
>>> In any case, you can't really get away without writing at least some
>>> code--it's just not realistic, the inputs from different people vary too
>>> much. There's very little shared code out there (just basic codes for CSV
>>> files, like the ones you mention) but too few users that share the same
>>> accounts to generate the critical mass needed for reuse. A while back I
>>> created the LedgerHub project to host shared importer code and provide a
>>> framework for doing the above, but never received many contributions and
>>> honestly I didn't put the care and quality attention to it I should have.
>>> More importantly, regression testing for those importers is most easily
>>> carried out using actual downloaded files compared to a corresponding
>>> expected output, but these files don't share well (they contain lots of
>>> personal data) so one ends up with two repositories anyhow.
>>> And besides
>>> there are several design decisions in some importers that may not please
>>> every user, in particular about how you choose your accounts for
>>> investments (there are degrees of freedom), so even sharing is not entirely
>>> an obvious win.
>>
>> That's okay, and I'm cool with trying some code. I primarily use R for
>> data analysis/plotting, but have started getting introduced to python via
>> Coursera recently and hope to dig in more.
>
> R won't be fun for doing this. R makes it a huge pain to even do the kind
> of data cleaning necessary for prepping data for analysis. Definitely use
> Python over it, you'll save a lot of time. If you really need some
> specialized R module, you can create numpy arrays in Python and there is a
> module that allows you to invoke the R runtime with these. Best of both
> worlds, but I doubt you'll need it.
>
>> That's another thing that attracts me to beancount :) That said, these are
>> more just general questions at this point. I'm amazed at how much
>> documentation there is... but for a total noob, I can say it's a bit
>> intimidating and kind of hard to know where one should start! Not to
>> mention having questions and not being sure you're even searching for the
>> right terminology to answer your question.
>>
>>> By the way, I've found that regression testing is the _key_ to
>>> maintaining your importer code, because those importers are often written
>>> against file formats with no official spec and unexpected surprises show up
>>> routinely (e.g. I have XML files with some unescaped "&" characters, which
>>> require a custom fix "just for that bank", for instance, lots of nasty
>>> surprises), so you really need to be able to reproduce your tests.
>>> I think
>>> I have to make at least _some_ fix to an importer about once/month, and
>>> that sinks maybe a half-hour (involves adding the new file which makes it
>>> break, fix the importer code, and potentially update the older expected
>>> files for changes).
>>>
>>> I hope this helps give some color to the process,
>>
>> Definitely, and sincere thanks for taking the time to give me some
>> pointers!
>>
>> John
>>
>>>> I tried to search the list for more of this sort of question, so
>>>> forgive me if I've missed something. Replying with links pointing me
>>>> in the right direction would be plenty sufficient if this has already
>>>> been discussed!
>>>>
>>>> Thanks!
>>>> John
>>>>
>>>> [1] http://moneydance.com/
>>>>
>>>> --
>>>>
>>>> ---
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Ledger" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> For more options, visit https://groups.google.com/d/optout.
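The three-step importer process quoted above (identify, extract, file) could be sketched along these lines. To be clear, this is not LedgerHub's or Beancount's actual API: the `CheckingCsvImporter` class, its method names, the expected CSV header and the `process` driver are all assumptions chosen to show the shape of the idea.

```python
import shutil
from pathlib import Path


class CheckingCsvImporter:
    """Hypothetical importer for one bank's CSV export."""

    account = "Assets:Checking"
    header = "date,payee,amount"

    def identify(self, path):
        # Step 1: claim the file if its first line is the header we expect.
        with path.open(errors="replace") as f:
            return f.readline().strip().lower() == self.header

    def extract(self, path):
        # Step 2: return raw (date, payee, amount) string tuples.
        lines = path.read_text().splitlines()[1:]
        return [tuple(line.split(",")) for line in lines if line]

    def file(self, path, root):
        # Step 3: move the download under a directory mirroring the account.
        dest = root.joinpath(*self.account.split(":"))
        dest.mkdir(parents=True, exist_ok=True)
        return Path(shutil.move(str(path), str(dest / path.name)))


def process(downloads, root, importers):
    """Run identify/extract/file over every file in the downloads directory."""
    results = {}
    for path in sorted(downloads.iterdir()):
        for importer in importers:
            if path.is_file() and importer.identify(path):
                results[path.name] = importer.extract(path)
                importer.file(path, root)
                break  # the first importer that claims a file handles it
    return results
```

Calling `process(Path.home() / "Downloads", Path("documents"), [CheckingCsvImporter()])` would leave unrecognized files where they are and move recognized ones under `documents/Assets/Checking/`. Keeping the filed downloads around is also what makes the regression testing mentioned above possible: each stored file can be re-extracted and compared against a saved expected output.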
