Inlined below

On Fri, Jan 16, 2015 at 11:50 AM, Jan Van Besien <[email protected]> wrote:
> Hi,
>
> In the context of "store nulls", Phoenix seems to store empty arrays
> and null arrays both as an empty byte array. We have a use case where
> null means something different than empty.
>
> I had a quick look at how arrays are serialized. The serialization
> format starts by writing out the length of the array, hence I think it
> is relatively easy to change the serialization of empty arrays into "a
> byte array that represents an empty array by storing length=0"
> instead of "an empty byte array".

Just taking a look at this as well -- I guess another way of putting
it is that empty arrays don't (yet) exist in Phoenix (right?)

>
> I can go ahead and provide patches, but I am wondering whether maybe
> there is a reason why Phoenix would not want to make the distinction
> between null and empty?

My guess would be that it's more the fact that empty arrays don't
exist (if my assumption about that is correct), and that storing
nothing takes less serialization overhead than storing an "empty"
marker.
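
To make the difference concrete, here's a rough sketch in plain Java.
It is not Phoenix's actual serialization code: the class and method
names are made up, and the layout (a 4-byte length followed by the
elements) is just an assumption for illustration. Null stays an empty
byte array, while an empty array becomes a 4-byte length of 0:

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; NOT Phoenix's real array format.
final class ArrayEncodingSketch {

    // null  -> zero bytes (no marker stored at all)
    // empty -> a 4-byte length prefix holding 0
    static byte[] encode(int[] values) {
        if (values == null) {
            return new byte[0];
        }
        ByteBuffer buf = ByteBuffer.allocate(4 + 4 * values.length);
        buf.putInt(values.length);   // length prefix
        for (int v : values) {
            buf.putInt(v);           // elements
        }
        return buf.array();
    }

    // Zero bytes decode back to null; anything else starts with the length.
    static int[] decode(byte[] bytes) {
        if (bytes.length == 0) {
            return null;
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        int[] out = new int[buf.getInt()];
        for (int i = 0; i < out.length; i++) {
            out[i] = buf.getInt();
        }
        return out;
    }
}
```

The cost of the distinction in this sketch is the 4 extra bytes per
empty array, which is the overhead I was referring to.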

I guess if the concept of empty arrays were to be introduced (by
storing them explicitly), the potential for backwards-compatibility
issues would be pretty minimal. Code that was doing something like
this to set an array column to null:

   stmt.setArray(1, conn.createArrayOf("INTEGER", new Object[]{}));

instead of doing this:

   stmt.setNull(1, java.sql.Types.ARRAY);

would stop working as it does right now, but that seems like a pretty
far-off edge case.

If my assumptions are all correct here, the question becomes: do we
want to introduce empty arrays or not? I don't see a reason not to do
it, although maybe someone else does?

>
> This might also apply to strings?

I think strings (varchar) are a different case -- the non-existence of
empty strings is in line with what Oracle does, and supporting them
would require changing the actual serialization of varchar columns as
well (i.e. adding a length value to the serialization).
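
A quick sketch of why, assuming a varchar is stored as its raw UTF-8
bytes with no length prefix and null is stored as zero bytes (the
class below is made up for illustration, not Phoenix code): the empty
string and null end up as the same bytes, so nothing on the read side
can tell them apart.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; assumes raw UTF-8 bytes, no length prefix.
final class VarcharEncodingSketch {

    // Both null and "" come out as a zero-length byte array.
    static byte[] encode(String s) {
        return s == null ? new byte[0] : s.getBytes(StandardCharsets.UTF_8);
    }
}
```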

- Gabriel
