So if I see this correctly, after the filtering of the training data,
there is never any data left.
The logic looks like this:

    def training_data_filter(self, txn):
        """Filter function for the training data."""
        found_import_account = False
        for pos in txn.postings:
            if pos.account not in self.open_accounts:
                return False
            if self.account == pos.account:
                found_import_account = True
        return found_import_account or not self.account

And from the printout you have something in self.account. So if I see
this correctly, either none of your training data matches the
account or the account is no longer open.
It may be worth printing out self.open_accounts and maybe even adding
some debugging/logging to that training_data_filter code.
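
One way to see which of the two cases applies is to count the outcome of the filter per transaction. The following is only a sketch under assumptions: Posting and Txn are hypothetical stand-ins for the Beancount types, and rejection_reason and summarize are illustrative helpers that mirror the filter logic quoted above. They are not part of smart_importer.

```python
from collections import Counter, namedtuple

# Hypothetical stand-ins for Beancount's Posting and Transaction types.
Posting = namedtuple("Posting", "account")
Txn = namedtuple("Txn", "postings")


def rejection_reason(txn, account, open_accounts):
    """Return None if the filter would keep txn, else a short reason."""
    found_import_account = False
    for pos in txn.postings:
        if pos.account not in open_accounts:
            return "account not open: " + pos.account
        if account == pos.account:
            found_import_account = True
    if found_import_account or not account:
        return None
    return "no posting on " + account


def summarize(txns, account, open_accounts):
    """Count how many transactions fall under each outcome."""
    reasons = Counter()
    for txn in txns:
        reasons[rejection_reason(txn, account, open_accounts) or "kept"] += 1
    return reasons


# Made-up sample data, only to exercise the three possible outcomes.
open_accounts = {"Assets:US:Banks:Checking:myBank": None, "Expenses:Food": None}
txns = [
    Txn([Posting("Assets:US:Banks:Checking:myBank"), Posting("Expenses:Food")]),
    Txn([Posting("Assets:US:Banks:Checking:myBank"), Posting("Expenses:Rent")]),
    Txn([Posting("Expenses:Food")]),
]
summary = summarize(txns, "Assets:US:Banks:Checking:myBank", open_accounts)
for reason, count in summary.most_common():
    print(count, reason)
```

If nearly everything lands under "account not open", the Open directives are the place to look. If it lands under "no posting on ...", the account name in the importer config probably differs from the one used in the ledger.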
Regards,
Patrick
On 20.05.2021 02:02, Jonathan Goldman wrote:
Hi Patrick,
Thanks for the suggestions. I started doing this. Here is what I'm seeing:
------CHECKPOINT1-------
1353
1133
0
------CHECKPOINT2-------
[]
---__call__----
Assets:US:Banks:Checking:myBank
------CHECKPOINT1-------
1353
1133
0
------CHECKPOINT2-------
[]
---__call__----
Assets:US:Banks:Checking:myBank
Here is the code I added to predictor.py:
        #beg
        print('---__call__----')
        print(self.account)
        #print(existing_entries)
        #end
        with self.lock:
            self.define_pipeline()
            self.train_pipeline()
            return self.process_entries(imported_entries)

    def load_open_accounts(self, existing_entries):
        """Return map of accounts which have been opened but not closed."""
        account_map = {}
        if not existing_entries:
            return

        for entry in beancount_sorted(existing_entries):
            # pylint: disable=isinstance-second-argument-not-valid-type
            if isinstance(entry, Open):
                account_map[entry.account] = entry
            elif isinstance(entry, Close):
                account_map.pop(entry.account)

        self.open_accounts = account_map

    def load_training_data(self, existing_entries):
        """Load training data, i.e., a list of Beancount entries."""
        training_data = existing_entries or []
        self.load_open_accounts(existing_entries)
        #beg1
        print('------CHECKPOINT1-------')
        print(len(training_data))
        #end1
        training_data = list(filter_txns(training_data))
        print(len(training_data))
        length_all = len(training_data)
        training_data = [
            txn for txn in training_data if self.training_data_filter(txn)
        ]
        print(len(training_data))
        #beg2
        print('------CHECKPOINT2-------')
        print(training_data)
        #beg2
--------
I'm now trying to check that every account in the config file is
present in my beancount file. I noticed one was missing, and that changed
what was in the training_data, but I'm still getting the warning about
training data being empty. I'll keep digging as best I can but
could definitely use any additional help.
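
A check like this can be done mechanically instead of by eye. This is a minimal sketch, assuming the configured account names and the set of opened accounts are already available as plain strings. missing_accounts is a hypothetical helper, and the sample values are copied from the importer config quoted further down in this thread (account_number is left out because it is not an account name).

```python
def missing_accounts(configured, open_accounts):
    """Return configured account names that are not in open_accounts, sorted."""
    return sorted(set(configured) - set(open_accounts))


# Account names taken from the importer config in this thread.
importer_config = {
    "main_account": "Assets:US:Banks:Checking:myBank",
    "income": "Income:US:Interest:myBank",
    "fees": "Expenses:US:Bank-Fees:myBank",
}
# Assumed for illustration: only two of the three accounts have been opened.
opened = {"Assets:US:Banks:Checking:myBank", "Income:US:Interest:myBank"}

missing = missing_accounts(importer_config.values(), opened)
print(missing)
```

In a real run, the opened set could be the keys of the self.open_accounts map printed from predictor.py.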
On Wed, May 19, 2021 at 3:16 AM 'Patrick Ruckstuhl' via Beancount
<[email protected]> wrote:
Hi Jonathan,
Let's try to figure this out. In smart importer, can you print out
the following:
in smart_importer/predictor.py
in __call__ around line 64
    print(self.account)
    print(existing_entries)
in load_training_data around line 91
    print(training_data)
and around line 95
    print(training_data)
That should give an idea where the information is "lost".
Depending on where the information is lost, you can then dig a bit
deeper into what is happening.
Regards,
Patrick
On 18.05.2021 13:14, Jonathan Goldman wrote:
Thanks Red.
bean-query works fine on my input file, which now has >1000
transactions.
Ready with 1344 directives (2266 postings in 1133 transactions).
beancount>
I still get the error. I'm not sure what is causing it or
how to debug it. The only other issue I recall seeing was some
error with fund_info or something in getting prices, but I thought
it was an unrelated issue.
Do you or does anyone have suggestions on where/how to
debug? E.g. I should print some variables to STDOUT at such and
such point inside the smart_importer code or inside bean-extract.
thanks,
Jonathan
On Mon, May 17, 2021 at 9:34 PM [email protected] wrote:
A minimum of two transactions should suffice for
smart_importer. More will increase prediction quality, but
two should suffice. I can't tell what's happening at your
end, but you're likely ending up with zero transactions for
some reason. Run bean-query on the file you pass to "-f" of
bean-extract.
beancount-reds-importers supports smart_importer out of the
box for banking, that shouldn't be an issue AFAICT.
On Wednesday, May 12, 2021 at 10:23:14 PM UTC-7
[email protected] wrote:
Thanks for suggestions @Patrick and Alan. My beancount
file has about 64 Asset accounts. It has about 41 expense
accounts. I have only 2 months of labelled banking
transactions (about 42 transactions) all associated with
one bank account and various expense accounts.
I had thought that some transactions were relatively
deterministic (same $ amount and same description like
rent/mortgage) and I was under the impression that only a
few months of data are needed to get going.
Perhaps I'll just go back to manually labelling data for
now and trying again later or after I see more
posts/explanation of smart_importer. I'm not well-versed
enough with smart_importer to debug what is happening.
On Thu, May 13, 2021 at 3:04 AM Alan H
<[email protected]> wrote:
I get this error when there are insufficient entries
in the journal to teach the smart_importer how to
file new transactions. Specifically there are no
matches for payees or narrations.
Is that the case? Try adding a dummy transaction that
matches the narration in the import file.
Alan
On Wednesday, May 12, 2021 at 12:24:55 PM UTC+1
[email protected] wrote:
Hm, actually that looks ok, it has the
existing_entries on the interface. But to be
honest I'm not super familiar with how the apply
hook is hooking this in, so there might be an issue.
Maybe someone more familiar with this can respond
on that.
Otherwise if you could install smart_importer
from git and then maybe add a bit more debug
output in
hooks.py and predictor.py to make sure that the
existing entries arrive, this would give a better
idea how to progress.
On 12.05.2021 13:17, [email protected] wrote:
Thank you. I think that is it.
I'm using reds-importers and I see
site-packages/beancount_reds_importers/libimport/banking.py
and it has this entry:
def extract(self, file, existing_entries=None):
I think this importer tool needs to be updated
to support the smart_importer.
On Wednesday, May 12, 2021 at 11:11:37 PM UTC+12
[email protected] wrote:
I just remembered something. The issue could
be that the importer you're trying to use
does not have the new interface and instead
still uses the old (legacy) interface.
The new one looks like this:
    def extract(self, file, existing_entries):
The old one looks like this:
    def extract(self, file):
Smart importer uses the existing_entries for
training its model.
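
To check which interface a given importer exposes without reading its source, the signature can be inspected at runtime. This is a minimal sketch: LegacyImporter and NewImporter are hypothetical toy classes, and accepts_existing_entries is an illustrative helper built on the standard library's inspect.signature.

```python
import inspect


class LegacyImporter:
    """Toy importer with the old interface."""

    def extract(self, file):
        return []


class NewImporter:
    """Toy importer with the new interface."""

    def extract(self, file, existing_entries=None):
        return []


def accepts_existing_entries(importer):
    """True if importer.extract has a parameter named existing_entries."""
    return "existing_entries" in inspect.signature(importer.extract).parameters


print(accepts_existing_entries(LegacyImporter()))
print(accepts_existing_entries(NewImporter()))
```

Running the helper against the real importer instance from the config file shows whether existing_entries can be passed to it at all.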
Regards,
Patrick
On 12.05.2021 12:20, [email protected] wrote:
Just checked and I got the same result. I
can add some debugging code in the config
file perhaps. I'm not very experienced with
beancount or smart_importer so not sure
what to look for.
    bean-extract -e journal/accounts.beancount jonathan_smart.import ~/staging/mydata.qfx > ~/staging/dud.txt
gives 2 printouts of
    Cannot train the machine learning model because the training data is empty.
    Cannot train the machine learning model because the training data is empty.
On Wednesday, May 12, 2021 at 7:15:19 PM
UTC+12 [email protected] wrote:
Can you try -e instead of -f? That's
what I use.
On May 12, 2021 8:31:36 AM GMT+02:00,
[email protected] wrote:
Thanks for the suggestion @Patrick.
I just tried changing that but it still doesn't work. I get the exact
same behavior if I call it with an empty file, so it seems the -f option
doesn't make bean-extract behave as expected for me. Here is my call:
    bean-extract -f journal/myledger.beancount jonathan_smart.import ~/staging/62090_818496_1013051ofxdl.qfx > ~/staging/dud.txt
I get these messages:
    Cannot train the machine learning model because the training data is empty.
    Cannot train the machine learning model because the training data is empty.
On Wednesday, May 12, 2021 at
5:31:25 PM UTC+12
[email protected] wrote:
Hi,
I think your setup looks good; the smart importer hook is in
there, as otherwise you would not get the errors about not
being able to train.
I think the issue is in your call
    bean-extract jonathan_smart.import ~/staging/new_bank_data.qfx -f journal/myledger.beancount > ~/staging/dud.txt
My guess is that the -f argument needs to come before
you specify the import config and the location, so
    bean-extract -f journal/myledger.beancount jonathan_smart.import ~/staging/new_bank_data.qfx > ~/staging/dud.txt
Regards,
Patrick
On 12.05.2021 01:58,
[email protected] wrote:
Thanks for looking at this
module even though you aren't
using it!
I followed the code that was
further down on the readme
page
<https://github.com/beancount/smart_importer>
that describes how to convert
an existing importer.
>>
    from your_custom_importer import MyBankImporter
    from smart_importer import apply_hooks, PredictPayees, PredictPostings

    my_bank_importer = MyBankImporter('whatever', 'config', 'is', 'needed')
    apply_hooks(my_bank_importer, [PredictPostings(), PredictPayees()])

    CONFIG = [ my_bank_importer, ]
>>
(my code looks just like this example)
I had thought apply_hooks would operate on the importer,
so when I call it in the config I can then just use the
hookified bank_importer. Is this not the case?
On Wednesday, May 12, 2021 at
1:26:27 AM UTC+12
[email protected] wrote:
* Disclaimer * I have
never actually run smart
importer.
Looking at the README on
GitHub for smart importer
it looks like you need to
use the return object of
apply_hooks in your CONFIG
list.
    CONFIG = [
        apply_hooks(MyBankImporter(account='Assets:MyBank:MyAccount'), [PredictPostings()])
    ]
In your config you apply
the hooks but are not
using the returned object.
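
Whether the returned object matters depends on how the hook function is written. The toy below sketches one possible design, in which the function patches extract on the importer it is given and also returns that same importer, so both config styles end up equivalent. Everything here (toy_apply_hooks, ToyImporter, shout) is hypothetical, and whether smart_importer's apply_hooks behaves this way is an assumption to verify against the installed version. The training warning reported elsewhere in this thread, seen without using the return value, would be consistent with in-place patching.

```python
def toy_apply_hooks(importer, hooks):
    """Toy stand-in: wrap importer.extract in place and return the importer."""
    original_extract = importer.extract

    def patched_extract(file, existing_entries=None):
        # Run the original extraction, then pass the result through each hook.
        entries = original_extract(file, existing_entries)
        for hook in hooks:
            entries = hook(entries, existing_entries)
        return entries

    importer.extract = patched_extract
    return importer


class ToyImporter:
    """Hypothetical importer that returns one fake entry per file."""

    def extract(self, file, existing_entries=None):
        return ["entry from " + file]


def shout(entries, existing_entries):
    """Hypothetical hook: upper-case every entry."""
    return [e.upper() for e in entries]


importer = ToyImporter()
returned = toy_apply_hooks(importer, [shout])
print(returned is importer)
print(importer.extract("bank.qfx"))
```

With this design, putting either the original variable or the return value into CONFIG gives the same hooked importer.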
Hope that helps.
On Tuesday, 11 May 2021 at
04:06:33 UTC+1
[email protected] wrote:
Hi,
I'm trying to get smart_importer to work
and not sure what I'm doing wrong.
1. I have successfully done all the required beancount
setup, created my own bank importer, and
ran it on two months of data.
2. I then manually labelled about 2 months of data from
one of my banks.
3. I installed smart_importer using
"pip install smart_importer"
    (base) MacBook-Air:beandata jonathan$ pip show smart_importer
    Name: smart-importer
    Version: 0.3
    Summary: Augment Beancount importers with machine learning functionality.
    Home-page: https://github.com/beancount/smart_importer
    Author: Johannes Harms
    Author-email: UNKNOWN
    License: MIT
    Location: /Users/jonathan/opt/miniconda3/lib/python3.8/site-packages
    Requires: scikit-learn, beancount, numpy, scipy
4. I created a new config file I called
jonathan_smart.import
    (base) MacBook-Air:beandata jonathan$ more jonathan_smart.import
    #!/usr/bin/env python3
    """Import configuration."""

    import sys
    from os import path

    sys.path.insert(0, path.join(path.dirname(__file__)))

    from beancount_reds_importers import vanguard
    from myimporters.bfsfcu import bfsfcu_bank
    from myimporters.anz import anz_bank
    from fund_info import *
    from smart_importer import apply_hooks, PredictPayees, PredictPostings

    myBank_smart_importer = my_bank.Importer({
        'main_account'   : 'Assets:US:Banks:Checking:myBank',
        'account_number' : 'xxx',
        'transfer'       : 'Assets:US:Zero-Sum-Accounts:Transfers:Bank-Account',
        'income'         : 'Income:US:Interest:myBank',
        'fees'           : 'Expenses:US:Bank-Fees:myBank',
        'rounding_error' : 'Equity:US:Rounding-Errors:Imports',
    })
    apply_hooks(myBank_smart_importer, [PredictPayees(), PredictPostings()])

    CONFIG = [myBank_smart_importer, ...(other importers)]
5. I was following the README documentation that
said to run bean-extract -f to invoke it on existing
data. So I tried the following. Is this right?
    bean-extract jonathan_smart.import ~/staging/new_bank_data.qfx -f journal/myledger.beancount > ~/staging/dud.txt
    Cannot train the machine learning model because the training data is empty.
    Cannot train the machine learning model because the training data is empty.
The output is just
like the normal output
without all the
smart_importer stuff.
Seems I'm doing
something wrong as the
staging/dud.txt
doesn't have any
predictions.
Appreciate any
assistance on this!
thanks,
Jonathan
--
You received this message
because you are subscribed to
the Google Groups "Beancount"
group.
To unsubscribe from this group
and stop receiving emails from
it, send an email to
[email protected].
To view this discussion on the
web visit
https://groups.google.com/d/msgid/beancount/820ef641-8178-47d1-9e97-afbc709e6a83n%40googlegroups.com
<https://groups.google.com/d/msgid/beancount/820ef641-8178-47d1-9e97-afbc709e6a83n%40googlegroups.com?utm_medium=email&utm_source=footer>.