I did get this working but I don't think I will continue to use it. I
don't look at my finances often enough. In fact I usually pick away at
updating all my accounts every 6 months (for an hour or two on about 10
evenings).
The only real downside was restaurant transactions that included tips.
I would only get an email for the initial charge so my imported value
would not have the final total. Then I would have to comb through
everything to find the issues and get my balance checks to pass.
I am going to pursue CSV import like a normal person, but I am glad to
have tried. I'll dump some code below in case it inspires and/or
horrifies anyone someday.
Enjoy!
Importing the emails was quite fiddly...bizarre MIME quoted-printable
stuff. I had planned to use the Unread state to keep track of what had
been processed. Maybe that explains the ~5% mysteriously missing
transaction rate.
fromdatetimeimportdatetime
importquopri
importemail
importimaplib
importbase64
importos
importre
fromemailimportpolicy
fromemail.messageimportEmailMessage
fromemail.parserimportParser
frompprintimportpprint
mail=imaplib.IMAP4_SSL(email_server, 993)
mail.login(email_user, email_pass)
test2=mail.select("INBOX.CC-notifications")
test=mail.list()
type, data=mail.search(None, 'UNSEEN')
fornumindata[0].split():
typ, data=mail.fetch(num, '(RFC822)')
forresponse_partindata:
ifisinstance(response_part, tuple):
raw_email=data[0][1]# converts byte literal to string removing b''
raw_email_string=raw_email.decode('utf-8')
msg=email.message_from_string(raw_email_string)
ifmsg['Message-Id']:
subject=msg['Subject']
# Like: 'Thu, 11 Jan 2024 22:39:49 -0500 (EST)'
date_with_parens=msg['Date']
date_string=re.sub(r' \([^)]*\)', '', date_with_parens)
date=datetime.strptime(date_string, '%a, %d%b%Y %H:%M:%S %z')
filename='downloads/'+date.strftime('%Y-%m-%d_%H-%M-%S') +'.eml'
withopen(filename, 'w') asfile:
file.write(raw_email_string)
else:
print('No Message-Id for %s')
Nothing too crazy with parsing the data out of the email body. It is
susceptible to changing formatting
importdecimal
importos
importre
frombeancount.coreimportamount
frombeancount.coreimportdata
frombeancount.core.amountimportAmount
frombeancount.core.numberimportD
frombeancount.ingestimportimporter
frombeancount.ingestimportregression
fromchardetimportdetect
fromdatetimeimportdatetime
fromemailimportpolicy
fromemail.parserimportParser
frompprintimportpprint
classImporter(importer.ImporterProtocol):
"""An importer for <credit card> email notifications"""
def__init__(self, lastFour, account, accounts, currency):
self.lastFour=lastFour
self.account=account
self.accounts=accounts
self.currency=currency
defidentify(self, file):
#print(self.extract_amount(self.extract_body(file)))
return(re.search(self.lastFour, self.extract_body(file)) isnotNone)
deffile_name(self, file):
returnos.path.basename(file.name).split('_')[1]
deffile_account(self, _):
returnself.account
deffile_date(self, file):
returnself.extract_date(self.extract_body(file))
defextract(self, file):
entries=[]
body=self.extract_body(file)
meta=data.new_metadata(file.name, lineno=0)
date=self.extract_date(body)
flag="txn"
payee=self.extract_payee(body)
narration=""
tags=data.EMPTY_SET
links=data.EMPTY_SET
txn=data.Transaction(meta, date, flag, payee, narration, tags, links,
postings=[])
account=self.file_account(file)
liabilities_units=Amount(-self.extract_amount(body), self.currency)
expenses_units=Amount(self.extract_amount(body), self.currency)
txn.postings.append(data.Posting(account, liabilities_units, None, None,
None, None))
ifpayeeinself.accounts:
txn.postings.append(data.Posting(self.accounts.get(payee),
expenses_units, None, None, None, None))
else:
txn.postings.append(data.Posting("Expenses:Uncategorized",
expenses_units, None, None, None, None))
entries.append(txn)
returnentries
defextract_body(self, file):
withopen(file.name, 'r', encoding='utf-8', errors='ignore') asin_file:
msg=Parser(policy=policy.default).parse(in_file)
returnmsg.get_body().get_content()
defextract_date(self, body):
# Email contains "Jan 1, 2024 at 12:00 PM ET" as an example.
date_text=re.search("(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)
[0-9]{1,2}, [0-9]{4} at ", body).group()
returndatetime.strptime(date_text, "%b%d, %Y at ").date()
defextract_payee(self, body):
# Email contains this as an example:
# <td align="left" style="vertical-align:top; padding:0px 0px 0px 0px;
font-family:Arial, Helvetica, sans-serif; font-size:16px;
color:#414042;" class="font14">Merchant</td>
# <td align="right" style="vertical-align:top; padding:0px 0px 0px 5px;
font-family:Arial, Helvetica, sans-serif; font-size:16px;
font-weight:bold; color:#414042;" class="font14">GOOGLE *SyncTech</td>
returnre.search("Merchant<\\/td>\\n.*font14\">(.*?)<\\/td>", body).group(1)
defextract_amount(self, body):
# Email contains this as an example:
# <td align="left" style="vertical-align:top; padding:0px 0px 0px 0px;
font-family:Arial, Helvetica, sans-serif; font-size:16px;
color:#414042;" class="font14">Amount</td>
# <td align="right" style="vertical-align:top; padding:0px 0px 0px 5px;
font-family:Arial, Helvetica, sans-serif; font-size:16px;
font-weight:bold; color:#414042;" class="font14">$5.36</td>
returnD(re.search("Amount<\\/td>\\n.*font14\">\\$(.*?)<\\/td>",
body).group(1))
The accountsvariable passed in to __init__ is just an array like this,
for categorization of expenses:
expense_accounts={ }
expense_accounts["ADAFRUIT INDUSTRIES"]
="Expenses:Shopping:Electronics-Software"
expense_accounts["ARCO"] ="Expenses:Auto-Transport:Gas-Fuel"
expense_accounts["Barnes and Noble"] ="Expenses:Shopping:Books"
On 1/7/24 00:06, Timothy Jesionowski wrote:
So the thing about beancount is that it's just python. It sounds like
you're using a library to parse QIF files, but anything you can parse
into python can be fed into an importer. So just...figure out how to
extract the relevant information from those emails and use whatever
importer you think is the most similar as a starting point.
For example, when I was writing an importer for my credit card
statements this code
<https://github.com/beancount/beancount/blob/v2/examples/ingest/office/importers/utrade/utrade_csv.py>
as a starting point. And the actual parsing is just python's standard
csv library.
As for your specific case with the emails, that looks like some
bespoke string parsing so I'd just throw regexes at the emails until
you have the substrings you want.
That's what I did in my case, and it's what I'd probably do in your
case. But if you want very fast ledger updates
<https://reds-rants.netlify.app/personal-finance/the-five-minute-ledger-update/>,
then you should look into using ofx
<https://reds-rants.netlify.app/personal-finance/a-word-about-input-formats-use-ofx-when-you-can/>
for your imports. It's on my wishlist for my own workflow, I just
haven't gotten around to it yet.
Sincerely,
Timothy Jesionowski
On Sat, Jan 6, 2024 at 2:02 PM 'Patrick Ruckstuhl' via Beancount
<[email protected]> wrote:
Hi,
not exactly the same case, but related.
I created an importer which wrapps around other importers for
attachments,
https://tariochbctools.readthedocs.io/en/latest/importers.html#mail-adapter
Maybe you can use some of it as a base for your importer.
Regards,
Patrick
On 06.01.2024 19:57, 'Aaron Axvig' via Beancount wrote:
> Hello, new Beancount user of a few weeks here. I'm cutting over
from
> Quicken as of the first of the year. The import was a lot of
work but
> I have fiddled my way to a beautiful 61,000 line file of converted
> data. Some fixing of QIF parsers and converters, and lots of
fixing
> my old data. Quicken allows some real garbage!
>
> Anyways, as I get ready to do my first import of transactions
directly
> into Beancount, it occurs to me that I get an email of every
> transaction that occurs on my credit cards. With some email
rules and
> cron jobs this could potentially keep my credit card accounts
> perpetually up to date on at least a daily basis if not better.
And
> it would be verified on a monthly basis with ending balances, etc.
>
> Has anyone put some thought into this technique? Any
roadblocks? Are
> any parts of such a processing flow already built?
>
> Some sample data from emails is below. I think account, amount,
> merchant, and date is all one needs?
>
> Example 1:
>
> You made a $5.36 transaction
> Account Visa (...1234)
> Date Jan 1, 2024 at 12:00 PM ET
> Merchant GOOGLE *SomeApp
> Amount $5.36
>
> Example 2:
>
> Amount: $16.07
> Card Ending In 4321
> Merchant ETSY, INC. 718-8557955 US
> Date 01/02/2024
> Time 10:19 AM E
>
--
You received this message because you are subscribed to the Google
Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it,
send an email to [email protected]
<mailto:beancount%[email protected]>.
To view this discussion on the web visit
https://groups.google.com/d/msgid/beancount/f56658f7-92e0-4d2f-8646-a28e6117424a%40tario.org.
--
You received this message because you are subscribed to the Google
Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/beancount/CAOVsoWTd1Y0M-%3D-7R7OPJN%3DRVtCRu5uYRtHzcAUX0Jtwkcc6-g%40mail.gmail.com
<https://groups.google.com/d/msgid/beancount/CAOVsoWTd1Y0M-%3D-7R7OPJN%3DRVtCRu5uYRtHzcAUX0Jtwkcc6-g%40mail.gmail.com?utm_medium=email&utm_source=footer>.
--
You received this message because you are subscribed to the Google Groups
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/beancount/8efc1197-867a-4e02-827a-327e8911b1b4%40axvig.com.