Hey, I'm in a very similar boat. Were you able to post your importer files publicly? I think seeing the conversation of you working through this, along with your finished files, would make them a lot easier to understand than the current examples I've seen.
Cheers,

On Friday, 20 July 2018 02:22:48 UTC+10, trs...@tutanota.com wrote:
>
> I figured it out. The dumb_categorizer does .lower() on the narration, and I was passing
> it a search term with a capital letter in it. Now I'm off to the races.. :)
>
> I think maybe I might publish my working setup once I get it all cleaned
> up, as yet another example for others to follow.
>
> TRS-80
>
> --
> Securely sent with Tutanota. Claim your encrypted mailbox today!
> https://tutanota.com
>
> 19. Jul 2018 10:44 by trs...@tutanota.com:
>
> OK, I am successfully calling dumb_categorizer from CSV Importer by
> defining it at the beginning of the .config file, and then passing categorizer =
> dumb_categorizer to CSV Importer. I know this because I replaced it with a
> simple print("something") and I got a bunch of "something" on stdout. So
> the categorizer is getting called, it's just either not matching or not
> attaching the other leg... ?
>
> Any help would be greatly appreciated.
>
> TRS-80
>
> 19. Jul 2018 08:52 by trs...@tutanota.com:
>
> I suppose I should have included a link to the CSV importer source:
> https://bitbucket.org/blais/beancount/src/80d30d6896cf5fdcff8c1156cab77107ee8e0f96/beancount/ingest/importers/csv.py?at=default&fileviewer=file-view-default
>
> Down toward the bottom (line 283) is where the categorizer gets called.
>
> Last night at my local LUG, I volunteered to do a talk next month on plain
> text accounting, and got the green light. So it would be nice to get this
> working by then. :)
>
> TRS-80
>
> 19. Jul 2018 08:32 by trs...@tutanota.com:
>
> It is still unclear to me where to put this categorizer code. I have tried
> putting it here, there, and everywhere.
> I am using the provided generic CSV
> importer, which calls it, but I cannot figure out where to put it or how to
> instantiate it or whatever it is you need to do in Python.
>
> Since I don't really know Python, I am happy to pay someone a few bucks to
> help me get this working.
>
> (from
> https://bitbucket.org/blais/beancount/pull-requests/24/improve-ingestimporterscsv/diff
> ):
>
> def dumb_categorizer(txn):
>     # At this time the txn has only one posting
>     try:
>         posting1 = txn.postings[0]
>     except IndexError:
>         return txn
>
>     # Guess the account(s) of the other posting(s)
>     if 'nutella' in txn.narration.lower():
>         account = 'Expenses:Food'
>     else:
>         return txn
>
>     # Make the other posting(s)
>     posting2 = posting1._replace(
>         account=account,
>         units=-posting1.units
>     )
>
>     # Insert / Append the posting into the transaction
>     if posting1.units < posting2.units:
>         txn.postings.append(posting2)
>     else:
>         txn.postings.insert(0, posting2)
>
>     return txn
>
> 25. Jun 2018 16:33 by trs...@tutanota.com:
>
> OK, stayed up late last night and actually got all my character stripping
> accomplished in Python within the provided tools. Yay me (first Python code
> I ever wrote)! :)
>
> OK so basic CSV importers are working, now trying to figure out where to
> stick the categorizer code I found here:
> https://bitbucket.org/blais/beancount/pull-requests/24/improve-ingestimporterscsv/diff
>
> I've been trying here and there without success as of yet. Any hints/pointers
> would be greatly appreciated!
>
> TRS-80
>
> 24. Jun 2018 15:21 by bl...@furius.ca:
>
> On Sun, Jun 24, 2018 at 11:58 AM <trs...@tutanota.com> wrote:
>
>> [...]But by all means, please correct me if I am wrong, or have missed
>> something.
>>
>> So now that I have attained some success, and see the light at the end of
>> the tunnel, it looks like I will have to do ~ the following:
>> 1. Manually download CSV file from bank.
>>
> Yes
>
>> 2. Do some pre-processing, either manually or with macros in Emacs, or
>> (more likely) programmatically, using scripts and sed, etc. to remove parens
>> and $s.
>>
> You can write code in your importer to do that.
>
>> 3. Run the actual bean-import.
>>
> You mean bean-extract.
>
>> 4. Run some post processing (I would like to change date: metadata name to
>> transaction_date: because I think it's more descriptive).
>>
> Do that in your importer code as well.
>
>> 5. And then finally hand copy these transactions into my main .beancount
>> file, double checking and tweaking (aka "clearing") them in the process,
>> categorizing remaining ones into Expense accounts and perhaps updating my
>> scripts in the process.
>>
> Yes.
>
>> I suppose 2, 4, and 5 could be done all in Emacs, but I'll just have to
>> figure out some workflow now that works for me.
>>
> Yes.
>
>> Also not mentioned is somehow programmatically inserting the other leg of
>> the transaction (which Expense account). I agree with Martin's basic
>> philosophy on this, and still plan on manually reviewing everything,
>> however I am already seeing that the bulk of transactions are the same
>> places in my case and could easily be categorized with some simple matching
>> (either in a post matching script or within bean-extract using
>> categorizer). I need to look into this more, and also experiment or read up
>> on how the de-duplication works, as I think it's probably related.
>>
>
> You can write some function for your importer to do that with your
> particular rules if it saves you time.
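
A minimal sketch of what such importer-side helpers could look like, using only the standard library. It is not beancount's API: clean_amount, guess_account, and the RULES table are hypothetical names, and the Expenses:Restaurant account is made up for illustration. It strips the parens and $s from an amount (leaving the sign to whichever column the value came from) and lowercases both the keyword and the narration before matching, which avoids the capital-letter mismatch mentioned at the top of the thread.

```python
from decimal import Decimal

# Hypothetical rules: keyword -> account for the other leg.
# Keywords may be written in any case; matching is case-insensitive.
RULES = {
    "Nutella": "Expenses:Food",
    "BAR & GRILL": "Expenses:Restaurant",
}


def clean_amount(text):
    """Strip parens, '$' and thousands separators; return a Decimal or None."""
    text = text.strip()
    if not text:
        return None  # empty CSV cell, e.g. no deposit on a withdrawal row
    return Decimal(text.strip("()").replace("$", "").replace(",", ""))


def guess_account(narration, rules=RULES):
    """Return the account of the first rule whose keyword is in the narration."""
    lowered = narration.lower()
    for keyword, account in rules.items():
        # Lowercase the keyword as well, so 'Nutella' matches 'NUTELLA'.
        if keyword.lower() in lowered:
            return account
    return None  # no rule matched; leave the transaction for manual review


print(clean_amount("($59.83)"))  # 59.83
print(clean_amount("$229.15"))   # 229.15
print(guess_account("Withdrawal Debit Card SOME BAR & GRILL CITY ST Card XXXX"))
```

A categorizer like the dumb_categorizer quoted earlier could call guess_account(txn.narration) instead of hard-coding 'nutella', and return the transaction unchanged when the result is None.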
>
>> Anyway, I will continue to report on what I find as I go along, and even
>> though I'm not getting any replies
>>
> Short emails with direct questions -> more replies more quickly
>
>> hopefully this will either encourage others to try and set this up or
>> perhaps help other noobs who come along later looking for more in depth
>> info (or perhaps stumble across similar error messages searching the
>> internet) and it eventually helps someone.
>>
>> Helpful tips, encouraging words, or even just letting me know if anyone
>> is actually reading my idiotic ramblings are always welcomed. :D
>>
>
> Sounds like you're making great progress!
> Unfortunately automating the importing still requires writing Python code
> and I see no way around that, I wish it was easier.
>
>> TRS-80
>>
>> 22. Jun 2018 19:21 by trs...@tutanota.com:
>>
>> Yeah I was completely on the wrong track before (I think). But I am on
>> the right one now (I think)?
>>
>> So what I have done is just copy the csv.py file and save it as
>> __init__.py in my importers/suncoast_g directory. Then I put the following
>> into ledger.config:
>> https://paste.pound-python.org/show/popHoa0wvVE2OiPCqIAL
>>
>> But now when doing bean-extract I get "ValueError: CSV config without
>> header has non-index fields: {'[DATE]': 'Posted Date', '[TXN_DATE]':
>> 'Transaction Date', '[NARRATION1]': 'Description', '[CREDIT]': 'Deposit',
>> '[DEBIT]': 'Withdrawal', '[BALANCE]': 'Balance'}"
>>
>> Yes, my CSV files have headers. I've been searching the internet for that error,
>> but still scratching my head. Also tried to change '[DATE]' to 'DATE' etc.
>> but that didn't seem to make a difference either.
>>
>> Of course, I could be completely off track (this is my fourth different
>> approach). I've been flailing around at this all day and a good part of
>> yesterday too. Early in the morning until late at night.
>> At this point I
>> would be willing to send someone a few dollars to help me get this set up.
>> I am sure I could get other accounts working and maintain it once I can
>> just get the first one working.
>>
>> When I first saw my credit union's CSV file I thought "this should be
>> easy" because it's very straightforward. I don't need all this complicated
>> parsing like I have seen in some of the other Importers I have been
>> studying. Just a straight CSV import. Or so I thought... :/
>>
>> Anyway, any help at all would be greatly appreciated at this point. Any
>> clue might help!
>>
>> TRS-80
>>
>> 22. Jun 2018 14:19 by trs...@tutanota.com:
>>
>> OK I sought and received some help in @python. I think I am on a much
>> better track now. I don't know where I got my original __init__.py from,
>> some similar thread here I think.
>>
>> But now I have downloaded from source the utrade one from:
>> https://bitbucket.org/blais/beancount/src/65212d1176bb427a7883d2593edbd0e0545a145a/examples/ingest/office/importers/utrade/__init__.py?at=default&fileviewer=file-view-default
>>
>> and am modifying that to my needs. I now see that I missed a whole bunch of
>> the methods listed in "Writing an Importer" section of "Importing External
>> Data" Docs. It will take me a while to work through it but I will post
>> something back later, including results. I just didn't want anyone to spend
>> time posting a long reply in the meantime.
>>
>> Fun fun! :)
>>
>> TRS-80
>>
>> 22. Jun 2018 12:08 by trs...@tutanota.com:
>>
>> OK, so this is quite challenging for someone who doesn't really know
>> Python. However I think it's a good exercise not only for myself but also
>> to help other newbies who would like to try and get this awesome feature
>> working.
>>
>> I have read everything I can in source and mailing list about CSV Import
>> / Ingest and I've made some progress, but now I'm stuck.
>>
>> Apologies in advance for ugly formatting, Google Groups apparently do not
>> support inline text formatting, and I am communicating with the group via
>> email.
>>
>> I've tried to (mostly) follow the naming conventions in the examples but
>> it seems they have changed over time. Anyway, file structure looks like so:
>> ~/fin
>> |---documents
>> |---Downloads
>> |---importers
>> |   |---suncoast_g
>> |   |   |---__init__.py (this file shared below)
>> |   |---__init__.py (this file is empty)
>> |---ledger.beancount
>> |---ledger.config (I have seen this also referenced as .import in docs)
>>
>> Here is my ledger.config file:
>> --------------------(begin ledger.config file)--------------------
>> #!/usr/bin/env python3
>> """Example import configuration."""
>>
>> # Insert our custom importers path here.
>> # (In practice you might just change your PYTHONPATH environment.)
>> import sys
>> from os import path
>> sys.path.insert(0, path.join(path.dirname(__file__)))
>>
>> from importers import suncoast_g
>> #from importers import acme_pdf
>>
>> from beancount.ingest import extract
>> #from beancount.ingest.importers import ofx
>>
>>
>> # Setting this variable provides a list of importer instances.
>> #
>> # Removed the following from below to replace with my own, saved for reference
>> #
>> # utrade.Importer("USD",
>> #                 "Assets:US:UTrade",
>> #                 "Assets:US:UTrade:Cash",
>> #                 "Income:US:UTrade:{}:Dividend",
>> #                 "Income:US:UTrade:{}:Gains",
>> #                 "Expenses:Financial:Fees",
>> #                 "Assets:US:BofA:Checking"),
>> #
>> # ofx.Importer("379700001111222",
>> #              "Liabilities:US:CreditCard",
>> #              "bofa"),
>> #
>> # acme_pdf.Importer("Assets:US:AcmeBank"),
>> #
>> CONFIG = [
>>     suncoast_g.Importer("Assets:Suncoast:Checking-G"),
>> ]
>>
>>
>> # Override the header on extracted text (if desired).
>> extract.HEADER = ';; -*- mode: org; mode: beancount; coding: utf-8; -*-\n'
>> --------------------(end ledger.config file)--------------------
>>
>> OK now the __init__.py that is in suncoast_g contains following:
>> --------------------(begin __init__.py file)--------------------
>> #!/usr/bin/env python3
>>
>> #
>> # Configuration file for extracting Suncoast-G data
>> #
>>
>> from beancount.ingest import regression
>> from beancount.ingest.importers import csv
>>
>> from beancount.plugins import auto_accounts
>>
>>
>> class Importer(csv.Importer):
>>
>>     config = {csv.Col.DATE: 'Posted Date',
>>               csv.Col.TXN_DATE: 'Transaction Date',
>>               csv.Col.NARRATION: 'Description',
>>               csv.Col.AMOUNT_CREDIT: 'Deposit',
>>               csv.Col.AMOUNT_DEBIT: 'Withdrawal',
>>               csv.Col.BALANCE: 'Balance'}
>>
>>     def __init__(self, account):
>>         csv.Importer.__init__(
>>             self, self.config,
>>             account, 'Currency',
>>             ('Posted Date,Transaction Date,Description,'
>>              'Deposit,Withdrawal,Balance'),
>>             1)
>>
>>     def get_description(self, row):
>>         payee, narration = super().get_description()
>>         narration = '{} ({})'.format(narration, row.category)
>>         return payee, narration
>> --------------------(end __init__.py file)--------------------
>>
>> I have just copied this stuff and tried to figure it out. I'm sure I've
>> got something wrong in here but I don't really know what I'm doing. FYI
>> here is what the data looks like which is in G.csv in Downloads:
>>
>> Posted Date,Transaction Date,Description,Deposit,Withdrawal,Balance
>> 6/4/2018,6/4/2018,Withdrawal Debit Card SOME BAR & GRILL CITY ST Card XXXX,,($59.83),$229.15
>>
>> OK I think that's all the relevant info. So now when I do:
>>
>> ~/fin$ bean-identify ledger.config Downloads
>>
>> I get:
>>
>> **** /home/myname/fin/Downloads/A Sunnet History 6186156 23032018_21062018.csv
>> **** /home/myname/fin/Downloads/G.csv
>>
>> Which I think means it is identifying those 2 files (the only ones in
>> there) as CSV, correct?
>> I will point out that G.csv is an Asset account and
>> is my first target here. The other one is a Liability account (credit card)
>> and therefore has different fields (only one amount, and no balance). But I
>> figure once I get this one working, that other one (and subsequent others)
>> should be pretty easy.
>>
>> OK so now when I do:
>>
>> ~/fin$ bean-extract ledger.config Downloads
>>
>> I get:
>>
>> **** /home/myname/fin/Downloads/A Sunnet History 6186156 23032018_21062018.csv
>>
>> **** /home/myname/fin/Downloads/G.csv
>>
>> ERROR:root:Importer importers.suncoast_g.Importer:
>> "Assets:Suncoast:Checking-G".extract() raised an unexpected error: CSV
>> config without header has non-index fields: {<Col.DATE: '[DATE]'>: 'Posted
>> Date', <Col.TXN_DATE: '[TXN_DATE]'>: 'Transaction Date', <Col.NARRATION:
>> '[NARRATION1]'>: 'Description', <Col.AMOUNT_CREDIT: '[CREDIT]'>: 'Deposit',
>> <Col.AMOUNT_DEBIT: '[DEBIT]'>: 'Withdrawal', <Col.BALANCE: '[BALANCE]'>:
>> 'Balance'}
>>
>> ERROR:root:Traceback: Traceback (most recent call last):
>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/extract.py", line 187, in extract
>>     allow_none_for_tags_and_links=allow_none_for_tags_and_links)
>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/extract.py", line 69, in extract_from_file
>>     new_entries = importer.extract(file, **kwargs)
>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/importers/csv.py", line 189, in extract
>>     iconfig, has_header = normalize_config(self.config, file.head())
>>   File "/usr/local/lib/python3.6/dist-packages/beancount/ingest/importers/csv.py", line 340, in normalize_config
>>     "{}".format(config))
>> ValueError: CSV config without header has non-index fields: {<Col.DATE:
>> '[DATE]'>: 'Posted Date', <Col.TXN_DATE: '[TXN_DATE]'>: 'Transaction Date',
>> <Col.NARRATION: '[NARRATION1]'>: 'Description', <Col.AMOUNT_CREDIT:
>> '[CREDIT]'>: 'Deposit',
>> <Col.AMOUNT_DEBIT: '[DEBIT]'>: 'Withdrawal',
>> <Col.BALANCE: '[BALANCE]'>: 'Balance'}
>>
>> ;; -*- mode: org; mode: beancount; coding: utf-8; -*-
>>
>> And this is where I'm currently stuck. I feel like it's something dumb,
>> something not pointing at something else correctly but I don't know enough
>> Python (yet) to figure it out myself. Any help would be greatly
>> appreciated. :)
>>
>> TRS-80
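
One way to check whether header detection is behind the ValueError quoted above. The message says the importer decided the CSV has no header, and in that case it only accepts integer column indexes. I am assuming, without having verified it against this revision of csv.py, that the decision comes from Python's csv.Sniffer, which guesses from the data rather than from the config. The sketch below uses only the standard library and the two sample lines from G.csv quoted above; the helper names are mine, not part of beancount. It asks the Sniffer directly and also shows the name-to-index mapping the header row would give.

```python
import csv

# First lines of G.csv as quoted in the thread.
HEAD = (
    "Posted Date,Transaction Date,Description,Deposit,Withdrawal,Balance\n"
    "6/4/2018,6/4/2018,Withdrawal Debit Card SOME BAR & GRILL CITY ST "
    "Card XXXX,,($59.83),$229.15\n"
)


def looks_like_header(head):
    """Ask csv.Sniffer whether the first row of `head` looks like a header."""
    try:
        return csv.Sniffer().has_header(head)
    except csv.Error:
        # The Sniffer could not work out the CSV dialect at all.
        return False


def header_indexes(head):
    """Map each column name in the first row to its 0-based position."""
    first_row = next(csv.reader(head.splitlines()))
    return {name.strip(): index for index, name in enumerate(first_row)}


print("Sniffer sees a header:", looks_like_header(HEAD))
print(header_indexes(HEAD))
```

Running looks_like_header on the first few kilobytes of the real file is more telling than on this two-line sample. If it reports no header there, that would explain the error even though the header row is present.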
--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to beancount+unsubscr...@googlegroups.com.
To post to this group, send email to beancount@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/660e92ff-2ba4-4c47-9fbd-eb76b8ec6571%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.