rdblue commented on issue #2837:
URL: https://github.com/apache/iceberg/issues/2837#issuecomment-882846235
@RussellSpitzer, yes. But I think the question is whether we expect anyone
to have this problem. I'm not familiar enough with Unicode to know whether we
would expect regular use in other languages to hit this bug. If this only
affects code points like 💰 then I'm not sure that we need to add compatibility.
But if this affects normal use in character-based languages then we should
build and document a fix like the one for negative date values.
If we end up doing that, it should be a matter of updating the projections
from string predicates to bucket id predicates. For example, `eq("col", "💰")`
would normally project as `eq("col_bucket", 4)` but we need to create
`or(eq("col_bucket", 4), eq("col_bucket", 12))` instead, so that data written
with either hash is matched. This isn't too bad because we only need to update
equality and `in` predicates: bucket function projection doesn't work for
inequalities.
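
To make the projection idea concrete, here is a minimal Python sketch, not Iceberg code. It assumes the incompatible hash differs from the spec hash only in the bytes it feeds to the hash function: `legacy_bytes` is a hypothetical stand-in that encodes each UTF-16 code unit separately, and `zlib.crc32` stands in for the real hash. All function names are made up for illustration, and predicates are modeled as plain tuples.

```python
import zlib


def spec_bytes(value: str) -> bytes:
    # Standard UTF-8: a code point outside the BMP becomes one 4-byte sequence.
    return value.encode("utf-8")


def legacy_bytes(value: str) -> bytes:
    # ASSUMPTION: hypothetical legacy behavior that encodes every UTF-16 code
    # unit on its own, so a surrogate pair becomes two 3-byte sequences.
    utf16 = value.encode("utf-16-be")
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    return b"".join(chr(u).encode("utf-8", "surrogatepass") for u in units)


def bucket(data: bytes, num_buckets: int) -> int:
    # crc32 is only a stand-in hash; the mask keeps the value non-negative.
    return (zlib.crc32(data) & 0x7FFFFFFF) % num_buckets


def candidate_buckets(value: str, num_buckets: int) -> list:
    # Every bucket id the value could have been written to under either hash.
    return sorted({bucket(spec_bytes(value), num_buckets),
                   bucket(legacy_bytes(value), num_buckets)})


def project_eq(col: str, value: str, num_buckets: int) -> tuple:
    # Project eq(col, value) onto the bucket column, widening to an OR of
    # equalities when the two hashes disagree.
    name = col + "_bucket"
    ids = candidate_buckets(value, num_buckets)
    if len(ids) == 1:
        return ("eq", name, ids[0])
    return ("or",) + tuple(("eq", name, i) for i in ids)


def project_in(col: str, values: list, num_buckets: int) -> tuple:
    # Project in(col, values) onto the union of all candidate bucket ids.
    ids = sorted({i for v in values for i in candidate_buckets(v, num_buckets)})
    return ("in", col + "_bucket", ids)
```

Under these assumptions, a string made only of BMP characters always projects to a single `eq`, because both encodings produce the same bytes; only values containing surrogate pairs can widen to two bucket ids.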