On Sun, Feb 4, 2018 at 7:23 AM, 'Patrick Ruckstuhl' via Beancount < [email protected]> wrote:
>>>>> - Using prices in imports
>>>>>
>>>>> For some imports I would like to enhance the transactions with prices
>>>>> based on the current/daily price. I'm currently fetching and storing
>>>>> prices in beancount, so prices are available in the beancount file, but
>>>>> I'm not sure what the best way to hook this into the importer framework
>>>>> is.
>>>>
>>>> Fetching prices automatically /is/ OTOH intended to be automated.
>>>> (Note that we're in a funny situation right now with both the Yahoo and
>>>> Google Finance APIs disabled.)
>>>>
>>>> These are two separate processes at the moment; run one, then the
>>>> other. Concatenate to a file if you want to.
>>>
>>> That's what I'm doing. What I'm looking for is how to access the prices
>>> from the beancount file in the importer; do I have to parse/load the
>>> beancount file on my own?
>>
>> Yes.
>> You'd call beancount.loader.load_file() on your existing file, and then
>> build a price_map dict.
>> Grep for "price_map" in the source code; you'll find several examples of
>> doing that.
>
> I'm wondering if it would make sense to slightly enhance the
> ImporterProtocol. Right now bean-extract already has the ability to parse
> an existing beancount file and use it for duplicate detection.
> Now if those entries could be forwarded to the importer, it would open up
> some use cases such as:
>
> * custom duplicate logic (e.g. let's say my import file has a unique
>   identifier which I map to metadata on the transaction; if I now get the
>   existing import entries, I can make sure to only import new transactions
>   and either completely ignore duplicates or tag them with the
>   __duplicate__ meta)
>
> * my use case where I need data (e.g. prices) from the existing beancount
>   file to enhance the new entries
>
> I think all that would be needed is to add existingEntries to the extract
> method:
>
>     importer.extract(file, existing_entries)
>
> That way there is no additional parsing of the beancount file needed, and
> there is a clear way to define which file to parse (e.g. the same way to
> do it when called from Fava as well as from bean-extract).

That's an interesting idea. It's an easy change to make.

+ Note that if we add the entries to the extractor, it opens up the
  possibility for the particulars of the extractor to depend on particulars
  of previously imported transactions. To paraphrase your example, if an
  extractor knows that its input file contains a unique transaction id
  column and it consistently attaches that as a "link" on the transaction,
  it can then use that fact to very reliably flag transactions as
  duplicates in the future by inspecting the link field of those
  transactions (assuming the user hasn't removed them in the text). That
  may be a good thing, because that kind of check may NOT be generalizable
  across different importers, unless we'd establish some sort of guarantee
  that some links represent globally unique identifiers. In a sense, the
  current method for flagging duplicates assumes that a general method for
  detecting duplicates - after fiddling and manual adjustments by the
  user - exists.

- On the downside, preventing access to the previous entries essentially
  decouples the duplicate detection method from the importer logic, which
  forces the duplicate logic to remain generic. The importer having access
  to the prior directives creates a logical dependency between it and the
  duplicate detection.

I'm not sure we have to worry about that. Given that the duplicate logic
has been iffy ever since it existed, I think it's a reasonable thing to
try. Let's do it and see what happens, and whether people start relying on
it.
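To make the proposal concrete, here is a schematic sketch of what an
importer's extract(file, existing_entries) could do with the prior entries:
enrich new transactions with prices found in the existing file, and mark
duplicates by a unique id carried as a link. The data structures are
simplified stand-ins (plain dicts and sets), not the real beancount
directive types, and all names are illustrative.

```python
def build_price_map(existing_entries):
    """Collect the latest known rate per currency from prior price entries."""
    price_map = {}
    for entry in existing_entries:
        if entry.get("type") == "price":
            price_map[entry["currency"]] = entry["rate"]  # later entries win
    return price_map


def collect_known_ids(existing_entries):
    """Gather unique transaction ids previously attached as links."""
    known = set()
    for entry in existing_entries:
        known.update(entry.get("links", ()))
    return known


def extract(rows, existing_entries=None):
    """Turn raw import rows into entries, enriched and deduplicated
    against the previously imported file (if one was provided)."""
    existing_entries = existing_entries or []
    price_map = build_price_map(existing_entries)
    known_ids = collect_known_ids(existing_entries)
    new_entries = []
    for row in rows:
        entry = {
            "type": "transaction",
            "date": row["date"],
            "units": row["units"],
            "currency": row["currency"],
            # Enhance with a price taken from the existing file, if known.
            "price": price_map.get(row["currency"]),
            "links": {row["uid"]},
        }
        if row["uid"] in known_ids:
            # Custom duplicate logic: flag instead of silently re-importing.
            entry["meta"] = {"__duplicate__": True}
        new_entries.append(entry)
    return new_entries
```

With the real API, existing_entries would be the directives returned by
beancount.loader.load_file(), and the price lookup would go through a
price_map built from them.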
To be fair, I think more work could simply be done on the duplicate logic
to make it more resilient, but in the interest of flexibility, let's add
this. So here's the change:

- The Importer.extract() method now accepts a new parameter with the prior
  entries (or None, if not specified). It's free to use that as it pleases.

- The entries returned by Importer.extract() will be checked for
  __duplicate__ metadata and automatically inserted into that set if it is
  present. This allows the importer to return some duplicate entries for
  context - which will be rendered as such in the output, e.g. commented
  out - without having to necessarily throw them away.

- The current importer parameters are still supported as legacy (I really
  didn't want to break everyone's importers with this API change, so I
  inspect the signature).

Here:
https://bitbucket.org/blais/beancount/commits/f9728f0c9594fae38e3ff7fa7e1f8dd2190ab6da

--
You received this message because you are subscribed to the Google Groups
"Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/beancount/CAK21%2BhNBa7hM_DPtYeL5bUxTip06WoUnBcUsjRL6%2BNpYB_vz1w%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
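P.S. The backwards-compatible dispatch mentioned above ("I inspect the
signature") could look roughly like this sketch. It uses the standard
library's inspect.signature to decide whether an importer's extract()
accepts the second parameter; the function name call_extract is
illustrative, not the actual code in the commit.

```python
import inspect


def call_extract(importer, file, existing_entries):
    """Call importer.extract() with existing_entries only if the
    importer's method accepts a second parameter; otherwise fall back
    to the legacy single-argument call."""
    # For a bound method, signature() excludes 'self', so a legacy
    # extract(self, file) shows one parameter here.
    params = inspect.signature(importer.extract).parameters
    if len(params) >= 2:
        # New-style importer: receives the prior entries (possibly None).
        return importer.extract(file, existing_entries)
    # Legacy importer: the old signature keeps working unchanged.
    return importer.extract(file)
```

This way existing importers run unmodified, while new ones can opt in
simply by declaring the extra parameter.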
