On Mon, May 6, 2019 at 8:50 AM Florian Lindner <[email protected]> wrote:
> Hello,
>
> as you might have guessed from my previous questions, I am currently into
> the issues of importing and de-duplication. For that reason, I wrote an
> importer for the Frankfurter Sparkasse 1822direkt CSV input data.
>
> https://gist.github.com/floli/6df567d6f08993ddebe07662842c1d47
>
> * It does de-duplication by computing a hash of the CSV input line and
> saves it to meta data as "hash". Entries from the same input line are not
> imported again.
>
> * It converts "Rechnungabschlüsse" in the CSV files to balance assertions.
>
> * It adds some additional information as "empfaenger" and "buchungsart" as
> meta data.
>
> * The importer does some transformations of payees based on regular
> expressions and setting of accounts based on python expressions allowing
> for more flexible rules. The latter might be interesting to you.
>
> * It can also be used to transform an existing beancount file and apply
> the aforementioned transformations.
>
>
> Questions / Remarks:
>
> * Is "hash" the best meta variable name to store the hash too? Is there
> some notion of hidden/internal use only meta names, such as "__hash__"
> (which is invalid, as bean-check told me).
>

SGTM

In theory I've tried pretty hard to avoid using metadata from Beancount itself and to leave it alone for users to peruse, but a few instances of special keys have crept in:

  bergamot [hg|default]:~/p/beancount$ grep -Esrn "^[A-Z][A-Z_]+ = ['\"]__[a-z]+" beancount
  beancount/core/interpolate.py:199:AUTOMATIC_META = '__automatic__'
  beancount/core/interpolate.py:202:AUTOMATIC_RESIDUAL = '__residual__'
  beancount/core/interpolate.py:205:AUTOMATIC_TOLERANCES = '__tolerances__'
  beancount/ingest/extract.py:31:DUPLICATE_META = '__duplicate__'

I'd like to remove these eventually and put this information in the schema at a more appropriate place.
Just avoid the __...__ names and you should be alright.

> * UTF-8 in metadata key names would be cool, for me specifically, the
> German Umlauts (öüä).
>

Not impossible and perhaps not even difficult; somebody else has already done the legwork to add utf-8 to account names.
You'd have to change the KEY token in lexer.l to use some of the UTF-* definitions carefully.
See rule key_value in grammar.y
It's pretty isolated, I don't think it would break much else.

> * I am open to any other suggestions, remarks as this is my first piece
> of code using the beancount API.
>
> Best Thanks,
> Florian
>

--
You received this message because you are subscribed to the Google Groups "Beancount" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
To post to this group, send email to [email protected].
To view this discussion on the web visit https://groups.google.com/d/msgid/beancount/CAK21%2BhMYPpf_tZ9yZcYi%2BspLpUSSZGMTB%2B6y7WYtMrsoKrv5kQ%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.
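
P.S. For anyone skimming the thread, the hash-based de-duplication described in the first bullet could be sketched roughly as below. This is not the code from the gist and it does not use the Beancount API: entries are plain dicts here, and the names `line_hash`, `known_hashes`, `extract_new`, and the `;` delimiter are all assumptions made for illustration. The only things taken from the thread are the idea of hashing the raw CSV line and storing the digest under the plain metadata key "hash" rather than a `__...__` name.

```python
import csv
import hashlib

# Plain key name, as suggested in the thread; avoids the __...__ form.
HASH_KEY = "hash"


def line_hash(raw_line):
    """Return a stable hex digest for one raw CSV line (hypothetical helper)."""
    return hashlib.sha256(raw_line.strip().encode("utf-8")).hexdigest()


def known_hashes(existing_entries):
    """Collect the hashes stored in the metadata of already imported entries."""
    return {
        entry["meta"][HASH_KEY]
        for entry in existing_entries
        if HASH_KEY in entry.get("meta", {})
    }


def extract_new(csv_text, existing_entries, delimiter=";"):
    """Return entries only for CSV lines whose hash has not been imported yet.

    Entries are plain dicts standing in for real directives (an assumption).
    """
    seen = known_hashes(existing_entries)
    new_entries = []
    for raw in csv_text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        digest = line_hash(raw)
        if digest in seen:
            continue  # this input line was imported in an earlier run
        # Hashes from the current file are deliberately not added to `seen`,
        # so two identical lines in one export both survive.
        fields = next(csv.reader([raw], delimiter=delimiter))
        new_entries.append({"fields": fields, "meta": {HASH_KEY: digest}})
    return new_entries
```

Calling `extract_new(csv_text, previously_imported)` a second time with the same file yields nothing new, while appended lines still come through. One design caveat: because the hash covers the whole raw line, any change in how the bank formats a line in a later export would make the same transaction look new.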
