Thanks, Dan, for answering.

WRT the question in the subject line:

```
from pathlib import Path
from beancount import loader

filename = Path(__file__).parent.parent / "master.beancount"
entries, errors, options = loader.load_file(filename)
```

gives me all the entries, whereas `parser.parse_file` does not. Dunno why. It was a bit of a rabbit hole looking down parse_file. Maybe it's the includes. Whatever. Going with load_file for now.

FWIW, on plugins: attempting it did indeed lead to a recursive loop to nowhere. I didn't think it a better approach; it just wasn't clear what is for what. So, it seems I am looking for a script. In the past I've built scripts which generate beancount-compliant text, but I am ready to start using the beancount library, if only I could find my way in. The HTML docs certainly reduce "friction" vs. the Google Docs. Oddly, it seems so minor, but there it is. The ecosystem is all so...um... googly :-/ no offense intended.

WRT smart_importer: There was some trauma due to a bug in scipy: https://github.com/beancount/smart_importer/issues/116 Got past that, got it all to run and... nothing. :shrugs: Life is too short. That alone sucked up my entire afternoon to "get the accounts done".

WRT ML vs "fuzzy": For a csv/json import from a bank, the text inputs of "payee" and "narration" are all it has to go on. Ok, maybe the ML can glean a little extra weight from the value but, when it won't install... 'overkill' comes to mind. Previously I've done a regex thing on the payee and that worked well enough for my purposes, but I got a new bank (new importer) and ideas! Fuzzy matching the strings with old entries seemed like a natural/incremental progression over regex.

On Wednesday, September 14, 2022 at 8:43:25 AM UTC+10 [email protected] wrote:

> On 12/09/2022 11:54, John Koala wrote:
> > Hi,
> >
> > Yes, sorry, in the context of V2 still I'm afraid...
> >
> > Perhaps you already know of a "fuzzy string matcher" for transaction narrations/payees?
>
> I am not sure I grok the question: how could a fuzzy string matcher be specialized for transaction narrations or payees?
>
> > I didn't have much luck with "smart_importer" and decided the scipy/numpy/etc dependency was a PITA so am (or was) thinking to knock up a plugin to complete my imported transactions.
>
> Fuzzy matching strings is not all there is to writing a machine learning classifier.
> I think that 'pip install scikit-learn' is immensely easier than rolling your own algorithms.
>
> Maybe if you provide more details on how smart_importer does not work for you, someone can help you in making it work.
>
> > Is a plugin the correct idea?
>
> I don't think so.
>
> A plugin operates on the transactions read from a ledger after booking (the process by which all the postings in all the transactions are balanced, padding amounts are calculated, lots are computed, etc.). The transactions processed in this phase already need to have all postings completed.
>
> Also, a plugin does not have a way to serialize the completed transactions into a ledger. Unless you hack something together, your plugin would run every time you load your ledger and would have to do its job again. This would make fixing any mistake the automatic categorization algorithm makes rather cumbersome.
>
> Why do you think a plugin is a better approach?
>
> > I noted that the importer is provided with an `existing_entries` list of transactions, which seems a very useful suite of items to match against. But can I reach that from the plugin?
>
> That what? A plugin has access to all the transactions in the ledger on which Beancount is operating. In this context there isn't the notion of another ledger to which a batch of transactions will be added.
>
> > Where/how? and is that even a good idea? (it's not going to re-read the entire history for every imported transaction is it? Hmm, I'd tolerate that nonetheless :-)) I'm assuming a `beancount.loader.load_file` inside the plugin would create some recursive silliness?
>
> The beancount parser and loader are capable of loading more than one file in the same process. However, there is no protection from a plugin that recursively tries to load the Beancount ledger from which it has been invoked.
> If you want to try, the ledger filename is available as the "filename" entry in the "options_map" passed to the plugin entry point.
>
> The way I would approach this, if you want a solution independent from the import framework, is to use beancount.parser.parser.parse_file() to parse the transactions from a ledger, use the technique you like the most to complete or rewrite the transactions, and write them back with beancount.parser.printer.print_entries().
>
> Cheers,
> Dan
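
To make the "fuzzy matching against old entries" idea concrete, here is a minimal sketch using only the standard library's `difflib`. It is not what smart_importer does, and it does not touch the beancount API at all: `HISTORY`, `normalise`, `guess_account` and the 0.6 threshold are hypothetical names and values I made up for illustration. In a real script the history would presumably be built from the payee/narration and posting accounts of the transactions returned by `loader.load_file`.

```python
from difflib import SequenceMatcher

# Hypothetical history of (bank text, account) pairs. In practice these
# would be collected from existing ledger transactions; the values here
# are invented purely for illustration.
HISTORY = [
    ("WOOLWORTHS 1234 SYDNEY", "Expenses:Groceries"),
    ("SHELL COLES EXPRESS", "Expenses:Car:Fuel"),
    ("NETFLIX.COM", "Expenses:Entertainment"),
]


def normalise(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def guess_account(text, history, threshold=0.6):
    """Return the account of the most similar historical text.

    Similarity is difflib's ratio (0.0 to 1.0). If nothing reaches the
    threshold (an arbitrary default, tune to taste), return None so the
    transaction can be left for manual categorisation.
    """
    needle = normalise(text)
    best_score, best_account = 0.0, None
    for old_text, account in history:
        score = SequenceMatcher(None, needle, normalise(old_text)).ratio()
        if score > best_score:
            best_score, best_account = score, account
    return best_account if best_score >= threshold else None


if __name__ == "__main__":
    for payee in ("WOOLWORTHS 5678 SYDNEY", "ATM WITHDRAWAL"):
        print(payee, "->", guess_account(payee, HISTORY))
```

Returning `None` below the threshold is deliberate: I'd rather fill in an unknown payee by hand than have a weak match silently book it to the wrong account.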
