We have a CSV view (not a serializer) that is linked from every
change_list page. This allows sufficiently privileged users to dump the
database table into Excel to do things not covered by our existing
views. We do not allow for a CSV import, but it's been something that
we've wanted.

We'd be very interested in this project.

On Tue, 2010-10-26 at 23:05 +0800, Russell Keith-Magee wrote:
> CSV has a basic
> structure (i.e., comma separated values), but doesn't have a natural
> way of representing multiple datatypes

I think this is the biggest challenge. Could we come up with some
criteria that would have to be met for a given field type's
representation?

Perhaps:
1) The field has to load correctly in Excel.
2) The field has to load correctly in OpenOffice.org.
3) The field has to be human readable, except where doing so would
   violate #1 or #2.
4) The field should match its most common SQL representation, except
   where doing so would violate #1, #2, or #3.

Handling foreign keys is problematic. If you just export the key, you
often end up with an integer that's meaningless. If you export the
related object, do you use its __unicode__ or something else?

On import, do you match the provided values to existing rows in the
related table, or can new ones be added?

To be honest, I haven't looked at the JSON serializer, so I'm not sure
how this is handled there. Of course, JSON would support nested objects
where CSV wouldn't.
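
To make the two import policies for foreign keys concrete, here is a
rough sketch. It assumes the export wrote a human-readable label for
the related object, and that `existing` is a plain dict mapping labels
to primary keys. `resolve_related` and `allow_create` are hypothetical
names, not anything in Django, and the way a new key is allocated is
only a stand-in for a real INSERT.

```python
def resolve_related(label, existing, allow_create=False):
    """Map an exported label back to a primary key.

    `existing` is a dict of label -> primary key for the related table
    (an assumption for this sketch; a real importer would query the
    database instead).
    """
    if label in existing:
        # Policy 1: match the provided value to an existing row.
        return existing[label]
    if not allow_create:
        raise LookupError("no related row labelled %r" % (label,))
    # Policy 2: add a new related row. The key allocation below is a
    # placeholder for whatever the database would assign.
    new_pk = max(existing.values(), default=0) + 1
    existing[label] = new_pk
    return new_pk
```

With `allow_create=False` an unknown label is an error; with
`allow_create=True` the import quietly grows the related table, which
is the behaviour that needs a decision.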

> multiple values for a single field

When would this matter? Is there a field type in Django that uses SQL
arrays? If not, SQL has the same issue.

> or differentiating NULL from empty string

Neither does CharField, so why does this matter?
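
For reference, a small sketch of what the quoted concern looks like
with Python's standard csv module. `round_trip` is a hypothetical
helper that writes one row and reads it straight back.

```python
import csv
import io


def round_trip(row):
    """Write a single row as CSV, then parse it again."""
    buffer = io.StringIO()
    csv.writer(buffer).writerow(row)
    buffer.seek(0)
    return next(csv.reader(buffer))


# None and "" go in as different values...
cells = round_trip([None, ""])
```

Both cells come back as the empty string, so whichever meaning the
importer picks has to be a convention rather than something the file
itself carries.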

> Even in-file
> metadata (sometimes represented as the first, commented out row of a
> CSV file) is the subject of inconsistency.

On export, you either have it or you don't. It seems that having a
header row is better than not, so include it. This meets your "useful in
Excel" criterion.
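
A minimal sketch of the export side with a header row, using only the
standard csv module. It assumes records are plain dicts standing in
for model instances; `export_rows` is a hypothetical helper, not part
of Django.

```python
import csv
import io


def export_rows(field_names, records):
    """Return CSV text whose first row is the list of field names."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    # Header row first, so spreadsheet users can see what each column is.
    writer.writerow(field_names)
    for record in records:
        writer.writerow([record[name] for name in field_names])
    return buffer.getvalue()
```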

As for import, it's easy to strip the first row or not, but the big
question is whether you want to make it *optional*.

If the goal of the serializer is to import data that you've previously
exported, then there's no need to make it optional. If you want something
more generally useful, you'll have to look at the first row and try to
match the columns to field names. If they all match, then it's a header
row; if they don't, it's not.
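
The header check described above could be sketched roughly like this.
It assumes the caller already knows the model's field names as a set
of strings; `split_header` is a hypothetical name.

```python
import csv
import io


def split_header(text, field_names):
    """Return (header, data_rows); header is None if the first row is data."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return None, []
    first = [cell.strip() for cell in rows[0]]
    # Treat the first row as a header only if every cell names a known field.
    if first and all(cell in field_names for cell in first):
        return first, rows[1:]
    return None, rows
```

One caveat with this heuristic: a data row whose values all happen to
equal field names would be mistaken for a header.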

Richard
