I did get this working but I don't think I will continue to use it.  I don't look at my finances often enough.  In fact I usually pick away at updating all my accounts every 6 months (for an hour or two on about 10 evenings).

The only real downside was restaurant transactions that included tips.  I would only get an email for the initial charge so my imported value would not have the final total.  Then I would have to comb through everything to find the issues and get my balance checks to pass.

I am going to pursue CSV import like a normal person, but I am glad to have tried.  I'll dump some code below in case it inspires and/or horrifies anyone someday.

Enjoy!


Importing the emails was quite fiddly...bizarre MIME quoted-printable stuff.  I had planned to use the Unread state to keep track of what had been processed.  Maybe that explains the ~5% mysteriously missing transaction rate.

fromdatetimeimportdatetime
importquopri
importemail
importimaplib
importbase64
importos
importre
fromemailimportpolicy
fromemail.messageimportEmailMessage
fromemail.parserimportParser
frompprintimportpprint
mail=imaplib.IMAP4_SSL(email_server, 993)
mail.login(email_user, email_pass)
test2=mail.select("INBOX.CC-notifications")
test=mail.list()
type, data=mail.search(None, 'UNSEEN')
fornumindata[0].split():
typ, data=mail.fetch(num, '(RFC822)')
forresponse_partindata:
ifisinstance(response_part, tuple):
raw_email=data[0][1]# converts byte literal to string removing b''
raw_email_string=raw_email.decode('utf-8')
msg=email.message_from_string(raw_email_string)
ifmsg['Message-Id']:
subject=msg['Subject']
# Like: 'Thu, 11 Jan 2024 22:39:49 -0500 (EST)'
date_with_parens=msg['Date']
date_string=re.sub(r' \([^)]*\)', '', date_with_parens)
date=datetime.strptime(date_string, '%a, %d%b%Y %H:%M:%S %z')
filename='downloads/'+date.strftime('%Y-%m-%d_%H-%M-%S') +'.eml'
withopen(filename, 'w') asfile:
file.write(raw_email_string)
else:
print('No Message-Id for %s')

Nothing too crazy with parsing the data out of the email body. It is susceptible to changing formatting

importdecimal
importos
importre
frombeancount.coreimportamount
frombeancount.coreimportdata
frombeancount.core.amountimportAmount
frombeancount.core.numberimportD
frombeancount.ingestimportimporter
frombeancount.ingestimportregression
fromchardetimportdetect
fromdatetimeimportdatetime
fromemailimportpolicy
fromemail.parserimportParser
frompprintimportpprint
classImporter(importer.ImporterProtocol):
"""An importer for <credit card> email notifications"""
def__init__(self, lastFour, account, accounts, currency):
self.lastFour=lastFour
self.account=account
self.accounts=accounts
self.currency=currency
defidentify(self, file):
#print(self.extract_amount(self.extract_body(file)))
return(re.search(self.lastFour, self.extract_body(file)) isnotNone)
deffile_name(self, file):
returnos.path.basename(file.name).split('_')[1]
deffile_account(self, _):
returnself.account
deffile_date(self, file):
returnself.extract_date(self.extract_body(file))
defextract(self, file):
entries=[]
body=self.extract_body(file)
meta=data.new_metadata(file.name, lineno=0)
date=self.extract_date(body)
flag="txn"
payee=self.extract_payee(body)
narration=""
tags=data.EMPTY_SET
links=data.EMPTY_SET
txn=data.Transaction(meta, date, flag, payee, narration, tags, links, postings=[])
account=self.file_account(file)
liabilities_units=Amount(-self.extract_amount(body), self.currency)
expenses_units=Amount(self.extract_amount(body), self.currency)
txn.postings.append(data.Posting(account, liabilities_units, None, None, None, None))
ifpayeeinself.accounts:
txn.postings.append(data.Posting(self.accounts.get(payee), expenses_units, None, None, None, None))
else:
txn.postings.append(data.Posting("Expenses:Uncategorized", expenses_units, None, None, None, None))
entries.append(txn)
returnentries
defextract_body(self, file):
withopen(file.name, 'r', encoding='utf-8', errors='ignore') asin_file:
msg=Parser(policy=policy.default).parse(in_file)
returnmsg.get_body().get_content()
defextract_date(self, body):
# Email contains "Jan 1, 2024 at 12:00 PM ET" as an example.
date_text=re.search("(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) [0-9]{1,2}, [0-9]{4} at ", body).group()
returndatetime.strptime(date_text, "%b%d, %Y at ").date()
defextract_payee(self, body):
# Email contains this as an example:
# <td align="left" style="vertical-align:top; padding:0px 0px 0px 0px; font-family:Arial, Helvetica, sans-serif; font-size:16px; color:#414042;" class="font14">Merchant</td> # <td align="right" style="vertical-align:top; padding:0px 0px 0px 5px; font-family:Arial, Helvetica, sans-serif; font-size:16px; font-weight:bold; color:#414042;" class="font14">GOOGLE *SyncTech</td>
returnre.search("Merchant<\\/td>\\n.*font14\">(.*?)<\\/td>", body).group(1)
defextract_amount(self, body):
# Email contains this as an example:
# <td align="left" style="vertical-align:top; padding:0px 0px 0px 0px; font-family:Arial, Helvetica, sans-serif; font-size:16px; color:#414042;" class="font14">Amount</td> # <td align="right" style="vertical-align:top; padding:0px 0px 0px 5px; font-family:Arial, Helvetica, sans-serif; font-size:16px; font-weight:bold; color:#414042;" class="font14">$5.36</td> returnD(re.search("Amount<\\/td>\\n.*font14\">\\$(.*?)<\\/td>", body).group(1))

The accountsvariable passed in to __init__ is just an array like this, for categorization of expenses:

expense_accounts={ }
expense_accounts["ADAFRUIT INDUSTRIES"] ="Expenses:Shopping:Electronics-Software"
expense_accounts["ARCO"] ="Expenses:Auto-Transport:Gas-Fuel"
expense_accounts["Barnes and Noble"] ="Expenses:Shopping:Books"

On 1/7/24 00:06, Timothy Jesionowski wrote:
So the thing about beancount is that it's just python. It sounds like you're using a library to parse QIF files, but anything you can parse into python can be fed into an importer. So just...figure out how to extract the relevant information from those emails and use whatever importer you think is the most similar as a starting point.

For example, when I was writing an importer for my credit card statements this code <https://github.com/beancount/beancount/blob/v2/examples/ingest/office/importers/utrade/utrade_csv.py> as a starting point. And the actual parsing is just python's standard csv library.

As for your specific case with the emails, that looks like some bespoke string parsing so I'd just throw regexes at the emails until you have the substrings you want.

That's what I did in my case, and it's what I'd probably do in your case. But if you want very fast ledger updates <https://reds-rants.netlify.app/personal-finance/the-five-minute-ledger-update/>, then you should look into using ofx <https://reds-rants.netlify.app/personal-finance/a-word-about-input-formats-use-ofx-when-you-can/> for your imports. It's on my wishlist for my own workflow, I just haven't gotten around to it yet.

Sincerely,
Timothy Jesionowski


On Sat, Jan 6, 2024 at 2:02 PM 'Patrick Ruckstuhl' via Beancount <[email protected]> wrote:

    Hi,

    not exactly the same case, but related.


    I created an importer which wrapps around other importers for
    attachments,
    https://tariochbctools.readthedocs.io/en/latest/importers.html#mail-adapter

    Maybe you can use some of it as a base for your importer.


    Regards,

    Patrick


    On 06.01.2024 19:57, 'Aaron Axvig' via Beancount wrote:
    > Hello, new Beancount user of a few weeks here.  I'm cutting over
    from
    > Quicken as of the first of the year.  The import was a lot of
    work but
    > I have fiddled my way to a beautiful 61,000 line file of converted
    > data.  Some fixing of QIF parsers and converters, and lots of
    fixing
    > my old data.  Quicken allows some real garbage!
    >
    > Anyways, as I get ready to do my first import of transactions
    directly
    > into Beancount, it occurs to me that I get an email of every
    > transaction that occurs on my credit cards.  With some email
    rules and
    > cron jobs this could potentially keep my credit card accounts
    > perpetually up to date on at least a daily basis if not better. 
    And
    > it would be verified on a monthly basis with ending balances, etc.
    >
    > Has anyone put some thought into this technique?  Any
    roadblocks? Are
    > any parts of such a processing flow already built?
    >
    > Some sample data from emails is below.  I think account, amount,
    > merchant, and date is all one needs?
    >
    > Example 1:
    >
    > You made a $5.36 transaction
    > Account     Visa (...1234)
    > Date     Jan 1, 2024 at 12:00 PM ET
    > Merchant     GOOGLE *SomeApp
    > Amount     $5.36
    >
    > Example 2:
    >
    > Amount: $16.07
    > Card Ending In 4321
    > Merchant ETSY, INC. 718-8557955 US
    > Date 01/02/2024
    > Time 10:19 AM E
    >

-- You received this message because you are subscribed to the Google
    Groups "Beancount" group.
    To unsubscribe from this group and stop receiving emails from it,
    send an email to [email protected]
    <mailto:beancount%[email protected]>.
    To view this discussion on the web visit
    
https://groups.google.com/d/msgid/beancount/f56658f7-92e0-4d2f-8646-a28e6117424a%40tario.org.

--
You received this message because you are subscribed to the Google Groups "Beancount" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/CAOVsoWTd1Y0M-%3D-7R7OPJN%3DRVtCRu5uYRtHzcAUX0Jtwkcc6-g%40mail.gmail.com <https://groups.google.com/d/msgid/beancount/CAOVsoWTd1Y0M-%3D-7R7OPJN%3DRVtCRu5uYRtHzcAUX0Jtwkcc6-g%40mail.gmail.com?utm_medium=email&utm_source=footer>.

--
You received this message because you are subscribed to the Google Groups 
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/beancount/8efc1197-867a-4e02-827a-327e8911b1b4%40axvig.com.

Reply via email to