On Sat, Jul 22, 2023 at 3:34 AM Daniele Nicolodi <[email protected]> wrote:

> On 21/07/23 23:06, Eric Altendorf wrote:
> > I'm trying to figure out whether I can use the Beangulp import driver
> > with hooks, or if I need to write my own driver to call my importers and
> > do postprocessing. As you may recall, my workflow is atypical, as I
> > have no curated Beancount ledger file; my source of truth are my input
> > data files and the Beancount ledger is a built artifact for running
> > analysis.
> >
> > There are two things I'd like to do that I don't think are currently
> > possible; I'd appreciate feedback on whether these seem like things
> > Beangulp should support (I could contribute a patch), or if I'm better
> > off finding a different solution:
> >
> > - I'd like to deduplicate entries among different importers in a single
> > run, not just dedup against a pre-existing ledger
>
> I was going to reply that this is already supported, then I realized
> that I never merged the patch implementing it
> https://github.com/beancount/beangulp/pull/64 I'm going to rebase and
> merge it ASAP.
>

That's great! I have pulled the latest code, and it doesn't seem to be
deduplicating the expected items. Let me check my assumptions:

I'm not sure how one is supposed to run multiple importers at once, the doc
<https://docs.google.com/document/d/1O42HgYQBQEna6YpobTqszSgTGnbRX7RdjmzR2xumfjs/edit#heading=h.9lk1l7gqxxfs>
kind of only describes running one. So I'm currently running with a Python
script that builds a list of importers, then runs Ingest, as follows; is
this correct, or am I missing some other setup code?

    if __name__ == '__main__':
        importers = get_importers()
        hooks = []
        cli = beangulp.Ingest(importers, hooks).cli
        cli()

The deduplication is supposed to run by default, correct?

There seems to be a fairly good default implementation of similarity
comparison, yes?

Deduplication will happen among entries from *different* importers running
in the same run, right?

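To make concrete what I mean by deduplicating across importers in one run, here is a rough sketch in plain Python. It is not Beangulp's actual implementation: the `Txn` tuple, `is_similar`, `dedup_across_importers`, and the two-day window are all made up for illustration, and the similarity rule (same account, same amount, nearby date) is just an assumption about what a reasonable default might look like.

```python
import datetime
from collections import namedtuple
from decimal import Decimal

# Hypothetical stand-in for a transaction; real Beancount entries are richer.
Txn = namedtuple("Txn", "date account amount source")


def is_similar(a, b, window_days=2):
    """Assumed rule: same account and amount, dates within a small window."""
    return (
        a.account == b.account
        and a.amount == b.amount
        and abs((a.date - b.date).days) <= window_days
    )


def dedup_across_importers(batches, window_days=2):
    """Take one list of entries per importer; return (kept, dropped)."""
    kept, dropped = [], []
    for batch in batches:
        # Compare only against entries kept from *earlier* importers, so two
        # genuinely identical purchases within one file are not collapsed.
        earlier = list(kept)
        for entry in batch:
            if any(is_similar(entry, other, window_days) for other in earlier):
                dropped.append(entry)
            else:
                kept.append(entry)
    return kept, dropped


bank = [
    Txn(datetime.date(2023, 7, 1), "Assets:Checking", Decimal("-20.00"), "bank.csv"),
    Txn(datetime.date(2023, 7, 1), "Assets:Checking", Decimal("-20.00"), "bank.csv"),
]
card = [
    Txn(datetime.date(2023, 7, 2), "Assets:Checking", Decimal("-20.00"), "card.csv"),
    Txn(datetime.date(2023, 7, 10), "Assets:Checking", Decimal("-5.00"), "card.csv"),
]
kept, dropped = dedup_across_importers([bank, card])
```

In this toy run the two same-day entries from the first file both survive, while the near-identical entry from the second file is flagged as a duplicate. That is the cross-importer behavior I'm hoping for.
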
> >
> > - I'd like to be able to emit the output file globally sorted by date
> > (first the official entry date, then secondarily by a timestamp attached
> > to the metadata) rather than grouped by import file. (Broadly this will
> > make it easier for me to debug issues sequentially, and ordering
> > within-day may alleviate some of the issues I've seen with same-day
> > purchase & transfer transactions.)
>
> It is trivial to post-process the output of beangulp to apply any
> ordering you like. Indeed I do something very similar for ledgers.
> Writing from memory:
>
>     from beancount.parser import parser, printer
>
>     def key(entry):
>         return (entry.date, entry.meta['timestamp'])
>
>     entries, errors, options = parser.parse_file(filename)
>     entries.sort(key=key)
>     printer.print_entries(entries)
>

Hmm, OK, that may work fine, thanks.

> >
> > And just to double check that this should already be possible:
> >
> > - I'd like to be able to add entries (i.e., account declarations,
> > initial balance pads, etc.) via a hook
>
> You can do this as part of the sorting post-processing step, or with a
> beancount plugin. See for example the beancount.plugins.auto_accounts
> (and other) plugins.
>

Cool, sounds good. I hadn't dug into plugins yet. Thank you!

eric

>
> Cheers,
> Dan
>
> --
> You received this message because you are subscribed to the Google Groups
> "Beancount" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/beancount/3fdc241b-1fae-062b-22c6-42b718bd00cf%40grinta.net
> .
>

-- 
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/CAFXPr0tqHop_YMnYTxeGgEkKRcv%3D3oPmMH8LDF-b-qbKwPdBBg%40mail.gmail.com.
