[
https://issues.apache.org/jira/browse/CASSANDRA-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254294#comment-14254294
]
Tyler Hobbs commented on CASSANDRA-8374:
----------------------------------------
bq. the overwhelming majority of functions have only 2 reasonable ways to
handle null inputs: either throw, or return null. There is certainly functions
for which it makes sense to return a non-null value on null inputs, but it's
pretty rare.
Agreed.
bq. if we agree on the previous point, then making CALLED ON NULL INPUT the
default amounts to ask users to make one of those 2 choice in the vast majority
of cases.
Yes, but I think that perhaps forcing users to make that choice explicitly is a
good thing. Perhaps we can consider having no default and requiring one of the
two options? I don't think that users will be writing new UDFs so frequently
that four or five words for the sake of explicit correctness is a bad tradeoff.
I think it's fair to say that Cassandra doesn't control (or have tools for
controlling) the presence of nulls as strictly as relational DBs, so perhaps
the handling of them should be more careful.
bq. I think throwing an exception is the wrong choice in general because it
makes the method unusable (or dangerous to use) in the "called on column name"
case, more so than returning null in the "called on value" (which as said
above, while definitively not perfect, feels to me somewhat less evil)
IMO, returning null (potentially without knowing it) is more dangerous than
getting an error that indicates your functions are broken.
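To make the two behaviors concrete, here is a rough Java sketch. It is not Cassandra code: the class {{NullInputSketch}} and both wrapper helpers are made up for illustration, and {{ADD_TWO}} just mirrors the {{return val + 2;}} body from the ticket description. It only assumes that a UDF argument arrives as a boxed {{Integer}} that may be null.

```java
import java.util.function.Function;

public class NullInputSketch {
    // Hypothetical UDF body, mirroring 'return val + 2;' from the ticket description.
    // Unboxing a null Integer throws NullPointerException.
    public static final Function<Integer, Integer> ADD_TWO = val -> val + 2;

    // Sketch of RETURNS NULL ON NULL INPUT: the body is skipped for a null
    // argument and null is returned instead.
    public static <T, R> Function<T, R> returnsNullOnNullInput(Function<T, R> body) {
        return arg -> arg == null ? null : body.apply(arg);
    }

    // Sketch of CALLED ON NULL INPUT: the body always runs and has to deal
    // with null itself.
    public static <T, R> Function<T, R> calledOnNullInput(Function<T, R> body) {
        return body;
    }

    public static void main(String[] args) {
        // The null passes through silently; the caller may never notice.
        System.out.println(returnsNullOnNullInput(ADD_TWO).apply(null));

        // The body fails loudly because it never checked for null.
        try {
            calledOnNullInput(ADD_TWO).apply(null);
        } catch (NullPointerException e) {
            System.out.println("body failed on null input");
        }
    }
}
```

The first call is the "returns null without knowing it" case, and the second is the "error that indicates your functions are broken" case.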
bq. therefore, provided we don't want to make a difference based on the context
of the call (which I don't want to)
Agreed, we shouldn't change behavior based on context.
bq. This also means that if we do think RETURNS NULL ON NULL INPUT is a bad
default, then surely it's bad that all our functions do it, which begs the
question "what should they do instead?".
There can be good reasons for functions to return null on null input, I'm not
arguing that. Most of our existing functions are essentially casts, where
returning null on null input makes sense.
bq. again, I think most functions will end up returning null on null inputs
anyway, so having CALLED ON NULL INPUT the default feels to me like just
forcing clutter by default (and people will forget handle null by default).
As mentioned above, I think the "clutter" is worth it. And since you agree
that people will forget to handle it, perhaps having no default is the best
choice? That could help to cut down on that type of mistake. (FWIW, we could
always change from no default to having a default in the future if other
changes in C* make it a good idea. The reverse is not an option.)
bq. "it's particularly painful to work with java by default" is a problem I'd
much rather avoid
That's fair, we do want to encourage the best options.
> Better support of null for UDF
> ------------------------------
>
> Key: CASSANDRA-8374
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8374
> Project: Cassandra
> Issue Type: Bug
> Reporter: Sylvain Lebresne
> Assignee: Robert Stupp
> Fix For: 3.0
>
> Attachments: 8473-1.txt, 8473-2.txt
>
>
> Currently, every function needs to deal with its argument potentially being
> {{null}}. There are very many cases where that's just annoying; users should be
> able to define a function like:
> {noformat}
> CREATE FUNCTION addTwo(val int) RETURNS int LANGUAGE JAVA AS 'return val + 2;'
> {noformat}
> without it crashing as soon as a column it's applied to doesn't have a
> value for some rows (I'll note that this definition apparently cannot be
> compiled currently, which should be looked into).
> In fact, I think that by default methods shouldn't have to care about
> {{null}} values: if the value is {{null}}, we should not call the method at
> all and return {{null}}. There are still methods that may explicitly want to
> handle {{null}} (to return a default value for instance), so maybe we can add
> an {{ALLOW NULLS}} to the creation syntax.