Re: [PyGreSQL] sql array notation parsing bug in pygresql-5+

Christoph Zwerschke Thu, 15 Sep 2016 02:23:48 -0700

On 15.09.2016 at 04:34, raf wrote:
>> The question for me is not how frequently arrays are used, but how
>> frequently arrays with non-default start indices are used. Note that the
>> ARRAY constructor in Postgres doesn't even have a way to set a different
>> start index (as far as I know).
>
> Yes, they're usually created by something like:
>
>   update some_table set array_column[index_other_than_1] = value...

Right, if you use an index < 1 or update an empty array, the start index
would be changed. I still don't think this is something that is
frequently done, but yes, it's possible.


> I've never really thought of them as having a non-default start
> index. I've always just thought of the '[#:#]={}' notation as a
> Postgres-specific "compression" format which needed to be
> "decompressed" when fetched but that seems not to be the case
> (and it doesn't explain negative start indexes).

Not really compression, since all empty values are returned as NULL.
Actually the end index is redundant in this notation.
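
To illustrate why the end index is redundant: given the start index and
the number of items, the end index follows directly. The sketch below is
only an illustration, not the PyGreSQL parser. The helper name
split_bounds is hypothetical, and it assumes a one-dimensional array of
integers without NULLs or quoting.

```python
import re

# Hypothetical helper, not part of PyGreSQL. Assumes a one-dimensional
# array of plain integers (no NULLs, no quoting, no nesting).
_BOUNDS = re.compile(r'^\[(-?\d+):(-?\d+)\]=(\{.*\})$')


def split_bounds(text):
    """Return (lower, upper, items) for a simple 1-D integer array literal."""
    match = _BOUNDS.match(text)
    if match:
        lower = int(match.group(1))
        upper = int(match.group(2))
        body = match.group(3)
    else:
        # No explicit bounds: Postgres arrays start at index 1 by default.
        lower, upper, body = 1, None, text
    inner = body[1:-1]
    items = [int(x) for x in inner.split(',')] if inner else []
    # The end index can always be derived from the start index and length.
    derived_upper = lower + len(items) - 1
    if upper is not None and upper != derived_upper:
        raise ValueError("bounds do not match the number of items")
    return lower, derived_upper, items
```

For example, split_bounds('[0:2]={1,2,3}') yields a start index of 0 and
a derived end index of 2, the same value the notation spells out.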


> I suspect that the speed difference is not because of the array
> parser. There must be some other reason so I've attached the test
> program I used rather than one that just tests the parser in
> isolation. The array parser in the attachment is slow and only
> handles 1-dimensional arrays. I think it's safe to say that the
> C parser would be much faster.

I think so. But it may point to a performance degradation elsewhere, so
I'd like to check this. Did you forget to attach the test?


> I'm not sure. I'd be happy for it to return None for indexes
> between 0 and the "real" start but I wouldn't want it to return
> None for indexes past the end even though that would mimic
> Postgres behaviour. I'd rather it behaved like a Python list
> with enough None values inserted at the beginning to make the
> indexes match (although being off-by-one of course). In other
> words, I'd want len(a) in Python to return the same value as
> array_upper(a, 1) in Postgres. But it sounds like that's too
> tacky which is fair enough. Just because it's what I want
> doesn't mean that's what anyone else would want.

Exactly. As you see, there are many different ways to implement this and
if we do it one way, there is always somebody who will complain.
Therefore it is best to implement only the straightforward, obvious way,
but allow for customization.

Again, the problem here is that Postgres arrays are different beasts
than lists in Python, much more similar to Arrays in JavaScript.
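
As an illustration of just one of these possible behaviours, the padding
variant quoted above (len(a) in Python matching array_upper(a, 1) in
Postgres) might be sketched as follows. The function name pad_to_upper is
hypothetical, and the sketch assumes a one-dimensional array. It also
shows a limit of that approach: a start index below 1 cannot be
represented by padding.

```python
def pad_to_upper(items, lower):
    """Prefix items with None so that len(result) equals the upper bound.

    Hypothetical helper for illustration only; one-dimensional arrays.
    """
    if lower < 1:
        # Padding can only shift items to the right, so a start index
        # of 0 or a negative start index has no representation here.
        raise ValueError("padding cannot represent a start index below 1")
    return [None] * (lower - 1) + list(items)


# An array stored in Postgres as '[3:4]={10,20}':
padded = pad_to_upper([10, 20], 3)
# The Postgres index i maps to the Python index i - 1.
first = padded[3 - 1]
```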


> If inserting None values into an ordinary Python list is not an
> option, my next thought was maybe the client can request
> non-optional behaviour somehow that means that, when fetching
> arrays, if the start index is 1, a Python list is returned (as
> is the case now) but if the start index is not 1, then a 2-tuple
> is returned instead containing the Postgres start index as one
> item and the list that would normally be returned as the other
> item.

I was also thinking along these lines, but the return value should
always be of the same type, otherwise code will always have to handle
both cases, making it more complicated, or risk raising errors. Also
keep in mind you can also have multidimensional arrays with more than
one start index.

I'd rather return a list subtype with the start index as an additional
attribute. However, the question is then how that list should behave
when retrieving items. Since this is not obvious, we should make it
customizable.

So the idea is that we provide a function for changing the base Python
class used for PG arrays, which should be a subclass of list. The array
parser would then only set an additional "lower" attribute in instances
of that class, and it's up to the class implementation how this is
handled when items of the array are returned.
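
A minimal sketch of how this could look. None of these names exist in
PyGreSQL: PgArray, OneBasedPgArray and make_array are hypothetical, and
make_array merely stands in for what the array parser would do. Only
one-dimensional arrays are considered.

```python
class PgArray(list):
    """List subclass carrying the Postgres start index (hypothetical)."""

    lower = 1  # default start index, used if the parser sets nothing


class OneBasedPgArray(PgArray):
    """One possible customization: integer indexes follow Postgres."""

    def __getitem__(self, index):
        if isinstance(index, slice):
            # Slices keep ordinary Python semantics in this sketch.
            return list.__getitem__(self, index)
        offset = index - self.lower
        if not 0 <= offset < len(self):
            return None  # Postgres returns NULL outside the bounds
        return list.__getitem__(self, offset)


def make_array(items, lower=1, array_class=PgArray):
    """Stand-in for the parser: build the list, then set "lower"."""
    arr = array_class(items)
    arr.lower = lower
    return arr


# An array stored in Postgres as '[0:2]={10,20,30}':
a = make_array([10, 20, 30], lower=0, array_class=OneBasedPgArray)
```

With the default class, a[0] is simply the first item and "lower" is
extra information that can be ignored. With the customized class, a[0]
is the item Postgres has at index 0, and a[3] gives None.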


Would that be a reasonable solution?

-- Christoph