Thanks, Dan, for answering.

WRT the question in the subject line:
```
from pathlib import Path
from beancount import loader
filename = Path(__file__).parent.parent / "master.beancount"
entries, errors, options = loader.load_file(filename)
```
gives me all the entries, whereas `parser.parse_file` does not.
Dunno why. It was a bit of a rabbit hole looking down parse_file.
Maybe it's the includes. Whatever. Going with load_file for now.
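
A quick way to test the "includes" hunch without touching the beancount library is to list the `include` directives in the top-level file. This is only a sketch: `list_includes` is a made-up helper, it assumes each directive sits at the start of a line in the form `include "path"`, and it does not follow nested includes. My understanding is that `parse_file` parses a single file and leaves following includes to the loader, which would explain the missing entries, but I have not confirmed that here.

```python
import re
from pathlib import Path

# Assumption: include directives start at column 0 and look like: include "some/file.beancount"
INCLUDE_RE = re.compile(r'^include\s+"([^"]+)"', re.MULTILINE)


def list_includes(ledger_path):
    """Return the paths named by include directives in one ledger file.

    Hypothetical helper: it reads only the given file and does not recurse
    into the included files.
    """
    text = Path(ledger_path).read_text(encoding="utf-8")
    return INCLUDE_RE.findall(text)
```

If this returns a non-empty list for master.beancount, the entries missing from the `parse_file` result most likely live in those included files.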

FWIW plugins:

Attempting it did indeed lead to a recursive loop to nowhere.  
Didn't think it a better approach; just not clear what is for what.

So, it seems I am looking for a script. In the past I've built scripts
which generate beancount-compliant text, but I am ready to start using the
beancount library, if only I could find my way in.
The HTML docs certainly reduce "friction" vs. the Google Docs. Oddly, it
seems so minor, but there it is.
The ecosystem is all so... um... googly :-/ no offense intended.

WRT smart_importer:

There was some trauma due to a bug in scipy:
https://github.com/beancount/smart_importer/issues/116
Got past that, got it all to run and... nothing.  :shrugs:  Life is too
short.
That alone sucked up the entire afternoon I had set aside to "get the
accounts done".

WRT ML vs "fuzzy"

For a csv/json import from a bank, the text inputs of "payee" and 
"narration" are all it has to go on.  
Ok, maybe the ML can glean a little extra weight from the value but, when 
it won't install...
'overkill' comes to mind.
Previously I've done a regex thing on the payee and that worked well enough 
for my purposes, but I got a new bank (new importer) and ideas!
Fuzzy matching the strings with old entries seemed like a 
natural/incremental progression over regex.
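
To make the idea concrete, here is a minimal sketch of fuzzy matching a new payee against old ones using only the standard library's `difflib`. Everything named here is an assumption for illustration: `best_account`, the `(payee, account)` history pairs, the sample payees and the 0.6 threshold are all made up. In practice the history would be built from the transactions returned by `load_file`.

```python
from difflib import SequenceMatcher


def best_account(payee, history, threshold=0.6):
    """Return the account of the most similar past payee, or None.

    `history` is a list of (payee, account) pairs, a hypothetical shape
    chosen for this sketch. The threshold is an arbitrary starting point.
    """
    needle = payee.casefold()
    best_score, best = 0.0, None
    for old_payee, account in history:
        # ratio() is 1.0 for identical strings and 0.0 for nothing in common.
        score = SequenceMatcher(None, needle, old_payee.casefold()).ratio()
        if score > best_score:
            best_score, best = score, account
    return best if best_score >= threshold else None


# Made-up sample history; real data would come from the existing ledger.
history = [
    ("WOOLWORTHS 1234 SYDNEY", "Expenses:Groceries"),
    ("SHELL COLES EXPRESS", "Expenses:Fuel"),
]
```

Calling `best_account("Woolworths 5678 Sydney", history)` picks the groceries account because only the store number differs, while a payee with no resemblance to anything in the history returns `None` and can be left for manual categorisation.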


On Wednesday, September 14, 2022 at 8:43:25 AM UTC+10 [email protected] 
wrote:

> On 12/09/2022 11:54, John Koala wrote:
> > Hi,
> > 
> > Yes, sorry, in the context of V2 still I'm afraid...
> > 
> > Perhaps you already know of a "fuzzy string matcher" for transaction 
> > narrations/payees?
>
> I am not sure I grok the question: how could a fuzzy string matcher be 
> specialized for transaction narrations or payees?
>
> > I didn't have much luck with "smart_importer" and decided the 
> > scipy/numpy/etc dependency was a PITA so am (or was) thinking to knock 
> > up a plugin to complete my imported transactions.
>
> Fuzzy string matching is not all there is to writing a machine learning 
> classifier. I think that 'pip install scikit-learn' is immensely easier 
> than rolling your own algorithms.
>
> Maybe if you provide more details on how smart_importer does not work 
> for you, someone can help you in making it work.
>
> > Is a plugin the correct idea?
>
> I don't think so.
>
> A plugin operates on the transactions read from a ledger after 
> booking (the process by which all the postings in all the 
> transactions are balanced, padding amounts are calculated, lots are 
> computed, etc.). The transactions processed in this phase already need 
> to have all postings completed.
>
> Also, a plugin does not have a way to serialize the completed 
> transactions into a ledger. Unless you hack something together, your 
> plugin would run every time you load your ledger and will have to do its 
> job again. This would make fixing any mistake the automatic 
> categorization algorithm makes rather cumbersome.
>
> Why do you think a plugin is a better approach?
>
> > I noted that the importer is provided with an `existing_entries` list of 
> > transactions, which seems a very useful suite of items to match 
> > against.  But can I reach that from the plugin?
>
> That what? A plugin has access to all the transactions in the ledger on 
> which Beancount is operating. In this context there isn't the notion of 
> another ledger to which a batch of transactions will be added.
>
> > Where/how? And is that even a good idea?  (It's not going to re-read the 
> > entire history for every imported transaction, is it? Hmm, I'd tolerate 
> > that nonetheless :-))  I'm assuming a `beancount.loader.load_file` 
> > inside the plugin would create some recursive silliness?
>
> The beancount parser and loader are capable of loading more than one 
> file in the same process. However, there is no protection from a plugin 
> that recursively tries to load the Beancount ledger from which it has 
> been invoked. If you want to try, the ledger filename is available as the 
> "filename" entry in the "options_map" passed to the plugin entry point.
>
> The way I would approach this, if you want a solution independent from 
> the import framework, is to use beancount.parser.parse_file() to parse 
> the transactions from a ledger, use the technique you like the most to 
> complete or rewrite the transactions, and write them back with 
> beancount.parser.printer.print_entries().
>
> Cheers,
> Dan
>
>
