Hi!

It's been a while since I've done much, but a few weekends ago I rewrote
all my CSV importers.
I had new changes to adapt my code to, and I was also behind on catching up
with changes in beangulp.
Some useful lessons came out of it.

For a long time I had been unhappy with the object-oriented mixins and the
CSV importer in beangulp.
Looking around for which file provided which implementation was always a
bit annoying.
It's a lot simpler to have a single protocol (beangulp.Importer) with all
abstract methods and just implementations of that (no inheritance of
functionality).
In fact, even if I have to duplicate some code in the implementation, I'm
still happier with the result that way.
The simplicity is worth the repetition, and having all the code locally
visible in a single file is advantageous, especially since this is the type
of thing that you end up doing reluctantly (in general, when I'm doing
accounting imports, the last thing I want is to have to hack on code to
adapt to changed file formats; the easier I can make it, the better).
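
To make the "single protocol, plain implementations" idea concrete, here is
a minimal sketch using only the standard library. The Importer class below
is a hypothetical stand-in, not beangulp.Importer itself: the method names
are modeled on identify, account and extract, but the signatures are
simplified assumptions, and ExampleBankImporter and its CSV layout are
invented for illustration.

```python
import abc
import csv
import datetime
from decimal import Decimal


class Importer(abc.ABC):
    """Hypothetical stand-in for the importer protocol (not beangulp's class)."""

    @abc.abstractmethod
    def identify(self, filepath: str) -> bool:
        """Return True if this importer can handle the file."""

    @abc.abstractmethod
    def account(self, filepath: str) -> str:
        """Return the account the file belongs to."""

    @abc.abstractmethod
    def extract(self, filepath: str) -> list:
        """Return the entries parsed from the file."""


class ExampleBankImporter(Importer):
    """All the logic lives in this one class; there are no mixins to chase."""

    HEADER = "Date,Description,Amount"  # assumed layout of the example file

    def __init__(self, account_name: str):
        self.account_name = account_name

    def identify(self, filepath):
        # Recognize the file by its header line.
        with open(filepath, encoding="utf-8") as infile:
            return infile.readline().strip() == self.HEADER

    def account(self, filepath):
        return self.account_name

    def extract(self, filepath):
        # Plain dicts stand in for real Transaction objects in this sketch.
        with open(filepath, encoding="utf-8", newline="") as infile:
            return [
                {
                    "date": datetime.datetime.strptime(
                        row["Date"], "%Y-%m-%d").date(),
                    "narration": row["Description"].strip(),
                    "amount": Decimal(row["Amount"]),
                    "account": self.account_name,
                }
                for row in csv.DictReader(infile)
            ]
```

Every method a reader needs is in one place, which is the point: when the
bank changes its format, there is only one file to open.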

As it turns out, a heavily configurable CSV importer is not best served by
a class + config abstraction. It's a lot simpler to read and massage the
input table with "petl" to convert the types (dates and numbers, mostly),
normalize the column names and then call a generic little helper function
to construct Transaction instances. For many of my simple CSVs, I've been
using this extremely simple helper:
https://github.com/beancount/beangulp/blob/master/beangulp/petl_utils.py#L16
and these parser functions:
https://github.com/beancount/beangulp/blob/master/beangulp/utils.py
The petl code really is as simple as, and much more powerful than, a custom
configuration that attempts to support all variations and anticipate all
the possibilities.
This is the key: that code *is* the transformation configuration, and the
petl API is quite elegant and minimal in that way.
(If you're interested in more involved usage of petl you can look here:
https://github.com/beancount/johnny/tree/master/johnny/sources)
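
As a rough illustration of "the code is the configuration", here is a
standard-library sketch of the same shape: normalize the column names,
convert the types, then hand the rows to a small helper. It deliberately
does not use petl or the beangulp helper linked above; the sample data, the
column names and the to_transactions function are all hypothetical, and
plain dicts stand in for Transaction instances.

```python
import csv
import datetime
import io
from decimal import Decimal

# Made-up sample input; a real importer would read the downloaded file.
RAW = """Posting Date,Details,Amount (USD)
01/05/2024,Coffee shop,"-3.50"
01/07/2024,Paycheck,"1,250.00"
"""


def parse_date(text):
    return datetime.datetime.strptime(text, "%m/%d/%Y").date()


def parse_amount(text):
    # Drop thousands separators before converting.
    return Decimal(text.replace(",", ""))


def normalize(rows):
    """Rename the columns and convert the types, one field per line."""
    for row in rows:
        yield {
            "date": parse_date(row["Posting Date"]),
            "narration": row["Details"].strip(),
            "amount": parse_amount(row["Amount (USD)"]),
        }


def to_transactions(rows, account):
    """Hypothetical helper: attach the account to each normalized row."""
    return [dict(row, account=account) for row in rows]


table = csv.DictReader(io.StringIO(RAW))
transactions = to_transactions(normalize(table), "Assets:Bank:Checking")
```

When a bank renames a column or switches date formats, the fix is a
one-line change in normalize rather than a new configuration option.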

Here's an example of such a CSV importer using petl (it does not use the
helper above; it creates one transaction per group of rows sharing an id):
https://github.com/beancount/beanbuff/blob/master/beanbuff/coinbase/coinbase_csv.py
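
The grouping idea itself can be sketched with the standard library. This is
not the code from the linked importer: the rows, the field names and the
group_rows function are hypothetical, and each group becomes a plain dict
with a list of postings instead of a real Transaction.

```python
import itertools
from decimal import Decimal

# Made-up rows; two of them share the same id and belong together.
rows = [
    {"id": "a1", "account": "Assets:Coinbase:USD", "amount": Decimal("-100.00")},
    {"id": "b2", "account": "Assets:Coinbase:USD", "amount": Decimal("50.00")},
    {"id": "a1", "account": "Assets:Coinbase:BTC", "amount": Decimal("0.002")},
]


def group_rows(rows, key="id"):
    """Build one transaction-like dict per distinct value of `key`."""
    # groupby only merges adjacent rows, so sort by the key first.
    ordered = sorted(rows, key=lambda row: row[key])
    return [
        {"id": value,
         "postings": [(row["account"], row["amount"]) for row in group]}
        for value, group in itertools.groupby(ordered, key=lambda row: row[key])
    ]


grouped = group_rows(rows)
```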

What I ended up with is so much easier to work with when debugging is
needed that I'm tempted to declare the CSV importer implementation that's
in beangulp deprecated.
I'm referring to all the files under
https://github.com/beancount/beangulp/tree/master/beangulp/importers/
I have no intention of adding to that functionality going forward.
I think we should probably even delete the mixins and the CSV importer in
the next release. I have a feeling nobody's been using them anyway (nobody
ever asked questions about them; I was probably alone in using them), and
it's less code to maintain. If you rely on them, say something.
We could add a tag for the last version with them available.

Any thoughts?
