I think the answer to that depends on another question: if an inet type
column is a partition key, can I write to it in IPv4 and then query it
with IPv6 and find the record? I believe the behaviour of SAI and of
partition keys should be the same.
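
To make the partition key question concrete, here is a minimal sketch. It
assumes only that the token is derived from the serialized key bytes;
toy_token is a hypothetical stand-in for a partitioner, not Cassandra's
actual hashing code.

```python
import hashlib
import ipaddress


def toy_token(serialized_key: bytes) -> int:
    """Hypothetical stand-in for a partitioner: hash the raw key bytes."""
    digest = hashlib.md5(serialized_key).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


# The key as written (4 bytes) and as queried (16 bytes, IPv4-mapped form).
written = ipaddress.ip_address("127.0.0.1").packed
queried = ipaddress.ip_address("0:0:0:0:0:ffff:7f00:0001").packed

# Different byte strings hash to different tokens, so a lookup by the
# 16 byte form would not land on the partition written with the 4 byte form.
same_partition = toy_token(written) == toy_token(queried)
print(len(written), len(queried), same_partition)
```

Under that assumption, the write and the read address different partitions
unless one of the two forms is normalized before hashing.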
On 07/03/2024 17:43, Caleb Rackliffe wrote:
Yeah, what we have with inet is much like if we had a type like
"numeric" that allowed you to write both ints and doubles. If we had
actual "inet4" and "inet6" types, SAI would have been able to index
them as fixed length values without doing the 4 -> 16 byte conversion.
Given SAI could easily change this to go one way or another at
post-filtering time, perhaps there's another option:
4.) Have an option on the column index that allows the user to specify
whether ipv4 and ipv6 addresses are comparable. If they are, nothing
changes. If they aren't, we can just take the matches from the index
and filter "strictly".
I'm not sure what's best here, because what it seems to hinge on is
what users actually want to do when they throw both v4 and v6
addresses into a single column. Without any real loss in storage
efficiency, you could index them in two separate columns on the same
table, and none of this matters. If they are mixed, it feels like we
should at least have the option to make them comparable, kind of like
we have the option to make text case-insensitive or unicode normalized
right now.
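
A rough sketch of what option 4 could look like, assuming the index matches
on the widened 16 byte form and the strictness is applied at post-filtering
time. The names widen, index_lookup, post_filter and the comparable flag are
made up for illustration; none of them are existing SAI APIs or options.

```python
import ipaddress

IPV4_MAPPED_PREFIX = b"\x00" * 10 + b"\xff\xff"


def widen(addr: str) -> bytes:
    """Return a 16 byte form, mapping IPv4 into ::ffff:a.b.c.d."""
    ip = ipaddress.ip_address(addr)
    return ip.packed if ip.version == 6 else IPV4_MAPPED_PREFIX + ip.packed


def index_lookup(stored, query):
    """Stand-in for the index: equality on the widened 16 byte form."""
    return [s for s in stored if widen(s) == widen(query)]


def post_filter(matches, query, comparable):
    """If v4 and v6 are comparable, keep every index match.
    Otherwise keep only rows whose raw serialized bytes equal the query's."""
    if comparable:
        return matches
    raw_query = ipaddress.ip_address(query).packed
    return [m for m in matches if ipaddress.ip_address(m).packed == raw_query]


stored = ["127.0.0.1", "::ffff:7f00:1", "10.0.0.1"]
matches = index_lookup(stored, "127.0.0.1")

loose = post_filter(matches, "127.0.0.1", comparable=True)
strict = post_filter(matches, "127.0.0.1", comparable=False)
print(loose)
print(strict)
```

With comparable=True both the 4 byte and the 16 byte row come back; with
comparable=False only the row stored in the same form as the query survives.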
On Wed, Mar 6, 2024 at 4:35 PM Bowen Song via dev
<dev@cassandra.apache.org> wrote:
Technically, 127.0.0.1 (IPv4) is not 0:0:0:0:0:ffff:7f00:0001 (IPv6),
but their values are equal. Just like 1.0 (double) is not 1 (int), but
their values are equal. So, what is the meaning of "=" in CQL?
On 06/03/2024 21:36, David Capwell wrote:
> So, I was reviewing SAI and found that we convert ipv4 to ipv6 (which
> is valid for the type), and that made me wonder what the behavior would
> be if a client mixed ipv4 with ipv4 encoded as ipv6… this led me
> to find that SAI behaves differently from the rest of C*… where I
> feel C* is doing the wrong thing…
>
> Let's walk over a simple example
>
> ipv4: 127.0.0.1
> ipv6: 0:0:0:0:0:ffff:7f00:0001
>
> Both of these addresses are equal according to networking and
> java… but for C* they are different! These are 2 different values
> as ipv4 is 4 bytes and ipv6 is 16 bytes, so 4 != 16!
>
> With SAI we convert all ipv4 to ipv6 so that the search logic is
> correct… this causes SAI to return partitions that ALLOW FILTERING
> and other indexes wouldn’t…
>
> This gets to the question in the subject… what SHOULD we do for
> this type?
>
> I see 3 options:
>
> 1) SAI uses the custom C* semantics where 4 != 16… this keeps us
> consistent…
> 2) ALLOW FILTERING and other indexes are “fixed” so that we
> actually match correctly… we are not really able to fix this if the
> type is in a partition or clustering column though…
> 3) deprecate inet in favor of an inet_better type… where inet
> semantics are the custom C* semantics and inet_better handles this case
>
> Thoughts?
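
For reference, the 4 != 16 mismatch from the example above can be reproduced
outside C* with a short Python sketch. It assumes the 4 -> 16 byte conversion
is the standard IPv4-mapped form (::ffff:a.b.c.d); to_16_bytes is an
illustrative helper, not SAI code.

```python
import ipaddress


def to_16_bytes(addr: str) -> bytes:
    """Widen an address to 16 bytes, mapping IPv4 into ::ffff:a.b.c.d."""
    ip = ipaddress.ip_address(addr)
    if ip.version == 4:
        return b"\x00" * 10 + b"\xff\xff" + ip.packed
    return ip.packed


v4 = "127.0.0.1"
v6 = "0:0:0:0:0:ffff:7f00:0001"

raw_v4 = ipaddress.ip_address(v4).packed  # 4 bytes
raw_v6 = ipaddress.ip_address(v6).packed  # 16 bytes

# Byte-wise comparison of the serialized forms: lengths differ, so not equal.
raw_equal = raw_v4 == raw_v6

# Comparison after widening both to 16 bytes: equal.
widened_equal = to_16_bytes(v4) == to_16_bytes(v6)

print(len(raw_v4), len(raw_v6), raw_equal, widened_equal)
```

The first comparison corresponds to the byte-level semantics described for
C*, the second to what an index sees once every ipv4 value has been widened.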