Re: [HACKERS] generic copy options

Emmanuel Cecchet Sun, 20 Sep 2009 19:31:37 -0700

The easiest approach for both implementation and documentation might just
be to have a matrix of options.
Each option has a row and a column in the matrix. The intersection of a
row and a column is set to 0 if the two options are not compatible and to 1
if they are. This way we are sure to capture all possible combinations.
Each time we find a new option, we just have to check in the matrix
whether it is compatible with the already existing options. Note that
we can also replace the 0 with an index into an error message array.
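
To make the matrix idea concrete, here is a minimal sketch in Python (not
PostgreSQL source code). The option names, the error strings, and the
helpers mark_incompatible and check_options are all assumptions made up
for illustration. Incompatible cells hold an index into the error message
array, as suggested above, and compatible cells hold None.

```python
from itertools import combinations

# Hypothetical option names, chosen only for illustration.
OPTIONS = ("csv", "binary", "delimiter", "header")
INDEX = {name: i for i, name in enumerate(OPTIONS)}

# Hypothetical error message array; incompatible cells index into it.
ERRORS = (
    "cannot specify both csv and binary",
    "delimiter is not allowed with binary",
    "header is not allowed with binary",
)

COMPATIBLE = None  # marker stored in cells where the two options may coexist

# One row and one column per option; every pair starts out compatible.
MATRIX = [[COMPATIBLE] * len(OPTIONS) for _ in OPTIONS]


def mark_incompatible(a, b, error_index):
    """Record that options a and b conflict; keep the matrix symmetric."""
    i, j = INDEX[a], INDEX[b]
    MATRIX[i][j] = MATRIX[j][i] = error_index


mark_incompatible("csv", "binary", 0)
mark_incompatible("delimiter", "binary", 1)
mark_incompatible("header", "binary", 2)


def check_options(selected):
    """Return the error messages for every conflicting pair in selected."""
    errors = []
    for a, b in combinations(selected, 2):
        cell = MATRIX[INDEX[a]][INDEX[b]]
        if cell is not COMPATIBLE:
            errors.append(ERRORS[cell])
    return errors
```

For example, check_options(["csv", "header", "delimiter"]) returns an
empty list, while check_options(["binary", "delimiter"]) returns the
delimiter/binary message. Note that a pairwise matrix like this one only
expresses conflicts; a dependency such as "header requires csv" would
need a separate mechanism.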


I can provide an implementation of that if this looks interesting to anyone.
Emmanuel


Robert Haas wrote:

On Sun, Sep 20, 2009 at 2:25 PM, Emmanuel Cecchet <[email protected]> wrote:

Tom Lane wrote:

Emmanuel Cecchet <[email protected]> writes:

Here you will force every format to use the same set of options

How does this "force" any such thing?

As far as I understand it, every format will have to handle every format
option that may exist, so that it can either implement the option or
throw an error.


I don't think this is really true.  To be honest with you, I think
it's exactly backwards.  The way the option-parsing logic works, we
parse each option individually FIRST.  Then at the end we do
cross-checks to see whether there is an incompatibility in the
combination specified.  So if two different formats support the same
option, we just change the cross-check to say that foo is OK with
either format bar or format baz.  On the other hand, if we split the
option into bar_foo and baz_foo, then the first loop that does the
initial parsing has to support both cases, and then you still need a
separate cross-check for each one.
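
The two-phase structure described above can be sketched roughly as
follows. This is an illustrative Python sketch, not the actual COPY
option-parsing code; the function parse_copy_options, the option names,
and the error strings are assumptions. The delimiter cross-check shows
the "foo is OK with either format bar or format baz" case.

```python
def parse_copy_options(pairs):
    """Parse (name, value) pairs, then cross-check the combination."""
    # Hypothetical defaults, for illustration only.
    opts = {"format": "text", "delimiter": None, "header": False}
    seen = set()

    # Phase 1: parse each option individually, with no knowledge of
    # the other options.
    for name, value in pairs:
        if name not in opts:
            raise ValueError(f"option {name!r} not recognized")
        if name in seen:
            raise ValueError(f"option {name!r} specified more than once")
        seen.add(name)
        opts[name] = value

    # Phase 2: cross-checks on the combination that was specified.
    # Supporting an option in another format only widens the tuple below.
    if opts["delimiter"] is not None and opts["format"] not in ("text", "csv"):
        raise ValueError("delimiter is only allowed with format text or csv")
    if opts["header"] and opts["format"] != "csv":
        raise ValueError("header is only allowed with format csv")
    return opts
```

With undecorated names the first loop handles a single "delimiter" entry;
splitting it into per-format names would add one entry per format to the
first loop and still leave one cross-check per name.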

That would argue in favor of a format option that defines the format. Right
now I find it bogus to have to say (csv on, csv_header on). If csv_header is
on that should imply csv on.
The only problem I have is that it is not obvious what options are generic
COPY options and what are options of an option (like format options).
So maybe a tradeoff is to differentiate format specific options like in:
(delimiter '.', format csv, format_header, format_escape...)
This should also make clear if someone develops a new format what options
need to be addressed.


I think this is a false dichotomy.  It isn't necessarily the case that
every format will support a delimiter option either.  For example, if
we were to add an XML or JSON format (which I'm not at all convinced
is a good idea, but I'm sure someone is going to propose it!) it
certainly won't support specifying an arbitrary delimiter.

IOW, *every* format will have different needs and we can't necessarily
know which options will be applicable to those needs.  But as long as
we agree that we won't use the same option for two different
format-specific options with wildly different semantics, I don't think
that undecorated names are going to cause us much trouble.  It's also
less typing.

PS: I don't know why but as I write this message I already feel that Tom
hates this new proposal :-D


I get those feelings sometimes myself.  :-)  Anyway, FWIW, I think Tom
has analyzed this one correctly...

...Robert



--
Emmanuel Cecchet

FTO @ Frog Thinker
Open Source Development & Consulting

--
Web: http://www.frogthinker.org
email: [email protected]
Skype: emmanuel_cecchet


--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
