On Mon, May 6, 2019 at 8:50 AM Florian Lindner <[email protected]> wrote:

> Hello,
>
> as you might have guessed from my previous questions, I am currently into
> the issues of importing and de-duplication. For that reason, I wrote an
> importer for the Frankfurter Sparkasse 1822direkt CSV input data.
>
> https://gist.github.com/floli/6df567d6f08993ddebe07662842c1d47
>
> * It does de-duplication by computing a hash of the CSV input line and
> saves it to meta data as "hash". Entries from the same input line are not
> imported again.
>
> * It converts "Rechnungabschlüsse" in the CSV files to balance assertions.
>
> * It adds some additional information as "empfaenger" and "buchungsart" as
> meta data.
>
> * The importer does some transformations of payees based on regular
> expressions and setting of accounts based on python expressions allowing
> for more flexible rules. The latter might be interesting to you.
>
> * It can also be used to transform an existing beancount file and apply
> the aforementioned transformations.
>
>
> Questions / Remarks:
>
> * Is "hash" the best meta variable name to store the hash too? Is there
> some notion of hidden/internal use only meta names, such as "__hash__"
> (which is invalid, as bean-check told me).
>

SGTM.
In principle I've tried pretty hard to keep Beancount itself from using
metadata, leaving it entirely to users, but a few instances of special keys
have crept in:

bergamot [hg|default]:~/p/beancount$ grep -Esrn "^[A-Z][A-Z_]+ = ['\"]__[a-z]+" beancount
beancount/core/interpolate.py:199:AUTOMATIC_META = '__automatic__'
beancount/core/interpolate.py:202:AUTOMATIC_RESIDUAL = '__residual__'
beancount/core/interpolate.py:205:AUTOMATIC_TOLERANCES = '__tolerances__'
beancount/ingest/extract.py:31:DUPLICATE_META = '__duplicate__'

I'd like to remove these eventually and put this information in the schema
at a more appropriate place.
Just avoid the __...__ names and you should be alright.
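
For what it's worth, hash-based de-duplication with a plain "hash" key could
look roughly like the sketch below. It does not use the Beancount API at all:
the function names, the choice of SHA-256, and the assumption that each
existing entry exposes a dict-like .meta (or None) are illustrative and not
taken from the gist.

```python
import hashlib

# Plain key name; deliberately not of the __...__ form.
HASH_KEY = "hash"


def line_hash(csv_line):
    """Return a stable hex digest for one raw CSV line."""
    return hashlib.sha256(csv_line.strip().encode("utf-8")).hexdigest()


def existing_hashes(entries):
    """Collect the hashes stored in the metadata of existing entries.

    Assumption: every entry has a `.meta` attribute that is a dict or None.
    """
    return {e.meta[HASH_KEY] for e in entries if e.meta and HASH_KEY in e.meta}


def new_lines(csv_lines, entries):
    """Yield (line, digest) for CSV lines not yet present in `entries`.

    Two identical lines inside the same CSV file both pass through here;
    only lines already recorded in the ledger are skipped.
    """
    seen = existing_hashes(entries)
    for line in csv_lines:
        digest = line_hash(line)
        if digest not in seen:
            yield line, digest
```

The importer would then store the digest under HASH_KEY in the metadata of
each entry it creates, so that a later run over the same CSV file skips it.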



>
> * UTF-8 in metadata key names would be cool, for me specifically, the
> German Umlauts (öüä).
>

Not impossible and perhaps not even difficult; somebody else has already
done the legwork to add UTF-8 support to account names.
You'd have to carefully change the KEY token in lexer.l to use some of the
UTF-* definitions.
See the key_value rule in grammar.y.
It's pretty isolated; I don't think it would break much else.
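
To illustrate what such a change would mean, here is a rough Python model of
the key rule. ASCII_KEY is an assumed approximation of the current KEY
pattern in lexer.l (check the actual file before relying on it), and
is_unicode_key is a hypothetical relaxed rule, not something Beancount
implements. Under this assumed pattern, a name like "__hash__" fails because
it does not start with a lowercase letter.

```python
import re

# Assumed approximation of the ASCII-only KEY rule: a lowercase letter
# followed by one or more letters, digits, '-' or '_'.
ASCII_KEY = re.compile(r"[a-z][a-zA-Z0-9\-_]+")


def is_unicode_key(name):
    """Hypothetical relaxed rule that accepts non-ASCII letters.

    The first character must be a lowercase letter in any script; the
    remaining characters must be alphanumeric, '-' or '_'.
    """
    if len(name) < 2:
        return False
    if not name[0].islower():
        return False
    return all(c.isalnum() or c in "-_" for c in name[1:])


for candidate in ["hash", "buchungsart", "empfänger", "__hash__"]:
    ascii_ok = ASCII_KEY.fullmatch(candidate) is not None
    print(candidate, ascii_ok, is_unicode_key(candidate))
```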


> * I am open to any other suggestions or remarks, as this is my first piece
> of code using the beancount API.
>
> Best Thanks,
> Florian
>
> --
> You received this message because you are subscribed to the Google Groups
> "Beancount" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/beancount/f37bff25-8af4-4bda-b27f-9c1c08c37437%40googlegroups.com
> <https://groups.google.com/d/msgid/beancount/f37bff25-8af4-4bda-b27f-9c1c08c37437%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>

