Code Review: Importer with duplicate detection and transformation

Florian Lindner Mon, 06 May 2019 05:50:01 -0700

Hello,

As you might have guessed from my previous questions, I am currently
working on importing and de-duplication. For that reason, I wrote an
importer for the Frankfurter Sparkasse 1822direkt CSV input data.

https://gist.github.com/floli/6df567d6f08993ddebe07662842c1d47

* It de-duplicates by computing a hash of each CSV input line and saving
it in the metadata as "hash". Entries from an input line that has already
been imported are not imported again.

* It converts "Rechnungabschlüsse" (periodic account closings) in the CSV
files to balance assertions.

* It adds some additional information, such as "empfaenger" (recipient)
and "buchungsart" (booking type), as metadata.

* The importer transforms payees based on regular expressions and sets
accounts based on Python expressions, which allows for more flexible
rules. The latter might be interesting to you.

* It can also be used to transform an existing beancount file by applying
the aforementioned transformations to it.
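
The de-duplication and rule steps in the list above can be sketched
roughly as follows. This is a minimal illustration, not the code from the
gist: the function names, the SHA-256 choice, the sample rules and the
account names are all assumptions made for the example, and it uses only
the Python standard library rather than the beancount API.

```python
import hashlib
import re


def line_hash(csv_line: str) -> str:
    """Return a stable hex digest for one raw CSV input line."""
    # Assumption: SHA-256 over the stripped UTF-8 line; any stable hash works.
    return hashlib.sha256(csv_line.strip().encode("utf-8")).hexdigest()


def new_lines(csv_lines, existing_hashes):
    """Yield (line, digest) for lines whose digest is not yet known."""
    seen = set(existing_hashes)
    for line in csv_lines:
        digest = line_hash(line)
        if digest not in seen:
            seen.add(digest)
            yield line, digest


# Hypothetical rules: a regex that normalizes payees, and a Python
# expression (evaluated with `payee` in scope) that selects an account.
PAYEE_RULES = [(re.compile(r"(?i)^rewe\b.*"), "REWE")]
ACCOUNT_RULES = [("payee == 'REWE'", "Expenses:Groceries")]


def apply_rules(payee, default_account="Expenses:Unknown"):
    """Return (normalized payee, account) after applying both rule sets."""
    for pattern, replacement in PAYEE_RULES:
        if pattern.search(payee):
            payee = replacement
            break
    for expression, account in ACCOUNT_RULES:
        # eval() without builtins; still only suitable for trusted rules.
        if eval(expression, {"__builtins__": {}}, {"payee": payee}):
            return payee, account
    return payee, default_account
```

In an importer, the digest would be stored in each transaction's metadata
under "hash", and `existing_hashes` would be collected from the entries
already in the ledger. Note that in this sketch two genuinely identical
lines in one CSV file collapse into a single entry.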

Questions / Remarks:

* Is "hash" the best metadata key name for storing the hash? Is there
some notion of hidden or internal-use-only metadata names, such as
"__hash__" (which is invalid, as bean-check told me)?

* UTF-8 in metadata key names would be cool; for me specifically, the
German umlauts (ö, ü, ä).

* I am open to any other suggestions or remarks, as this is my first piece
of code using the beancount API.
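
As a possible workaround for the key-name restrictions raised above, a
label can be transliterated into an ASCII key before it is used. The
sketch below assumes that a valid metadata key starts with a lowercase
ASCII letter followed by at least one letter, digit, dash or underscore;
that pattern is an assumption here and should be checked against the
beancount version in use. The helper name `to_meta_key` is hypothetical.

```python
import re

# Assumed key syntax: lowercase ASCII letter first, then letters,
# digits, '-' or '_'. Verify against your beancount version.
KEY_PATTERN = re.compile(r"[a-z][a-zA-Z0-9_-]+")

# Conventional German transliterations for umlauts and sharp s.
TRANSLIT = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def to_meta_key(label: str) -> str:
    """Turn a free-form label into an ASCII metadata key."""
    key = label.strip().lower().translate(TRANSLIT)
    # Replace runs of other characters with a dash, then trim the ends.
    key = re.sub(r"[^a-z0-9_-]+", "-", key).strip("-_")
    if not KEY_PATTERN.fullmatch(key):
        raise ValueError(f"cannot derive a valid key from {label!r}")
    return key
```

With this helper, `to_meta_key("Empfänger")` gives `"empfaenger"`, which
matches the spelling already used for the metadata above.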

Thanks and best regards,
Florian
