Re: [PyGreSQL] sql array notation parsing bug in pygresql-5+

Christoph Zwerschke Tue, 13 Sep 2016 02:48:40 -0700

On 13.09.2016 at 08:35, raf wrote:

I suspect that if people aren't complaining about it, that might
be because array columns probably aren't a very popular datatype
in sql databases. The only reason I'm using them is because it
was the easiest and quickest way to migrate data from a database
designed probably in the 1980s. But perhaps they're more popular
than I imagine. I really don't know.

The question for me is not how frequently arrays are used, but how frequently arrays with non-default start indices are used. Note that the ARRAY constructor in Postgres doesn't even have a way to set a different start index (as far as I know).

Optional is completely fine. If the C version can be made to
optionally insert the None values, that would be awesome but I
was surprised that the speed difference between that and my
parser in Python didn't make a huge difference to the overall
speed of the tests. But I wasn't testing array parsing so maybe
that's not too surprising.

Actually, I just wrote a test to select an array of 26 ints
(with a default start index) 10000 times and the pygresql-5
version with the C array parser took 9s and the pygresql-4
version with my Python array parser took 4s. That can't be
right. There must be some other differences that explain it.
The parsing of the array is probably only a small part of the
overall database query.

If there's really a performance degradation, I'd like to fix that. Maybe you can create a small reproducible test case.


Note that you can test the array parser in isolation:
http://www.pygresql.org/contents/pg/module.html#cast-array-record-fast-parsers-for-arrays-and-records

It also parses multidimensional arrays and it should be pretty fast.
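
To compare parsers without the cost of a database round trip, something like the following could be used. This is only a sketch: parse_int_array is a hypothetical pure-Python stand-in (not raf's parser), time_parser is a made-up helper name, and the commented-out lines assume PyGreSQL 5 is installed so that pg.cast_array is available.

```python
import timeit


def parse_int_array(text):
    """Hypothetical stand-in parser for flat integer arrays like '{1,2,3}'."""
    body = text.strip('{}')
    return [int(item) for item in body.split(',')] if body else []


def time_parser(parser, text, number=10000):
    """Return the seconds needed to call parser(text) `number` times."""
    return timeit.timeit(lambda: parser(text), number=number)


if __name__ == '__main__':
    # An array of 26 ints with the default start index, as in the test above.
    sample = '{' + ','.join(str(i) for i in range(26)) + '}'
    print('stand-in parser: %.3fs' % time_parser(parse_int_array, sample))
    # Assumption: with PyGreSQL 5 installed, the C parser could be timed
    # in the same way:
    #   import pg
    #   print(time_parser(lambda s: pg.cast_array(s, int), sample))
```

If the two parsers turn out to be close in isolation, the difference seen in the full test probably comes from somewhere else in the query path.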

Maybe the method should create a subclass of list that corresponds to a one-dimensional Postgres array. It should behave like a normal list, but have settable (and changeable) start and end indices (if you change one, you also change the other) and return None for out-of-range indices. That way you could even properly convert an array like '[-2:-2]={1}'. Multidimensional arrays could be created by nesting these.


I'm thinking of something along these lines:

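class PgList(list):
    """List that mimics a one-dimensional Postgres array."""

    def __init__(self, value, lower=1):
        list.__init__(self, value)
        # Postgres arrays start at index 1 unless stated otherwise.
        self._lower = lower

    @property
    def lower(self):
        # An empty array has no bounds.
        return self._lower if self else None

    @property
    def upper(self):
        return self._lower + len(self) - 1 if self else None

    def __getitem__(self, key):
        # Only integer keys are handled here, slices are not supported.
        if self and self._lower <= key < self._lower + len(self):
            return list.__getitem__(self, key - self._lower)
        # Out-of-range indices yield None, like NULL in Postgres.
        return None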
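
To show how such a list could be fed from an array string like '[-2:-2]={1}', here is a rough sketch. The class is repeated so that the example runs on its own. The helpers split_bounds and to_pglist are hypothetical names, and the conversion is a toy that only handles flat integer arrays without quoting or NULLs; it is not how the C parser works.

```python
import re

# Optional one-dimensional bounds prefix such as "[-2:-2]=".
_BOUNDS = re.compile(r'^\[(-?\d+):(-?\d+)\]=')


class PgList(list):
    """Same idea as the sketch above, repeated to keep this runnable."""

    def __init__(self, value, lower=1):
        list.__init__(self, value)
        self._lower = lower

    @property
    def lower(self):
        return self._lower if self else None

    @property
    def upper(self):
        return self._lower + len(self) - 1 if self else None

    def __getitem__(self, key):
        if self and self._lower <= key < self._lower + len(self):
            return list.__getitem__(self, key - self._lower)
        return None


def split_bounds(text):
    """Return (lower, literal) for a one-dimensional array string."""
    match = _BOUNDS.match(text)
    if match is None:
        return 1, text  # default start index in Postgres
    return int(match.group(1)), text[match.end():]


def to_pglist(text):
    """Toy conversion for flat integer arrays only (hypothetical helper)."""
    lower, literal = split_bounds(text)
    body = literal.strip('{}')
    items = [int(item) for item in body.split(',')] if body else []
    return PgList(items, lower)


arr = to_pglist('[-2:-2]={1}')
print(arr.lower, arr.upper, arr[-2], arr[0])
```

Iterating over the list or comparing it with a normal list still works as usual, since only index access is shifted by the lower bound.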

We could also support mutability and slicing, but that will quickly become complicated.
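
To give an idea of the decisions involved, here is a rough sketch of a variant with slicing and item assignment. OffsetList and _shift are hypothetical names, and the choices made here (slices inclusive at both ends as in Postgres, no step support, assignment outside the bounds raises IndexError) are assumptions, not a worked-out design.

```python
class OffsetList(list):
    """Hypothetical list with a settable lower bound, slicing and assignment."""

    def __init__(self, value=(), lower=1):
        list.__init__(self, value)
        self._lower = lower

    def _shift(self, index):
        """Translate a Postgres-style index into a 0-based list offset."""
        offset = index - self._lower
        if not 0 <= offset < len(self):
            raise IndexError('array index out of range')
        return offset

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Assumption: both ends are inclusive and steps are rejected.
            if key.step is not None:
                raise ValueError('slice steps are not supported')
            first = self._lower
            last = self._lower + len(self) - 1
            start = first if key.start is None else max(key.start, first)
            stop = last if key.stop is None else key.stop
            # Clamp at 0 so a stop below the lower bound gives an empty list.
            end = max(stop - first + 1, 0)
            items = list.__getitem__(self, slice(start - first, end))
            return OffsetList(items, start)
        try:
            return list.__getitem__(self, self._shift(key))
        except IndexError:
            return None  # out-of-range reads behave like NULL

    def __setitem__(self, key, value):
        # Only integer keys inside the current bounds are supported here.
        list.__setitem__(self, self._shift(key), value)


arr = OffsetList([10, 20, 30], lower=-1)
arr[0] = 99
print(arr[-1:0], arr[1], arr[2])
```

Even this leaves open what deleting items, appending, or assigning to a slice should do to the bounds.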


Would it help you if we returned this kind of list?

-- Chris
_______________________________________________
PyGreSQL mailing list
[email protected]
https://mail.vex.net/mailman/listinfo.cgi/pygresql
