I like the sentinel value approach (though maybe that is because I don't know
much about the cons).
*Pros*
- Type-stable
- Imposes no indirection
- Directly use machine ops
- No additional memory requirements
- NaN already implements NA-like semantics
- Existing functions like + behave correctly
*Cons*
- Requires a sentinel value for every new type
In Julia this is very easy to handle, by defining:
isna(x::MyType, MySentinelValues::Vector{MyType}) = x in MySentinelValues
isna(x::MyType, MySentinelValue::MyType) = x == MySentinelValue
isna(x::MyType) = isna(x, MySentinelValue)
- Potential binary incompatibility with other systems
I don't know about this one. Are there problems in R because of it? What is
the pandas approach to NAs? Is there some kind of general agreement?
- Vulnerable to compiler optimizations
I don't know about this one either... Can we work on that?
- NaN + NA != NA + NaN in some R builds
Is there any way to prevent it?
- Discards a potentially useful value
Is this a big problem? NaN already does the same, doesn't it?
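
To make the isna idea a bit more concrete, here is a minimal sketch of what I have in mind. It assumes current Julia syntax (struct, const). MyType, MY_SENTINEL, and the choice of typemin(Int) as the reserved value are only illustrative names and assumptions on my part, not an existing API:

```julia
# Hypothetical example type; the name and layout are illustrative only.
struct MyType
    value::Int
end

# Assumption: typemin(Int) is reserved as the "missing" marker for MyType.
const MY_SENTINEL = MyType(typemin(Int))

# Compare against one explicit sentinel, or against a list of sentinels.
isna(x::MyType, sentinel::MyType) = x == sentinel
isna(x::MyType, sentinels::Vector{MyType}) = x in sentinels

# Default method: fall back to the single reserved sentinel.
isna(x::MyType) = isna(x, MY_SENTINEL)

# For floats, NaN can play the sentinel role and propagates through arithmetic.
isna(x::Float64) = isnan(x)

isna(MyType(3))        # false
isna(MY_SENTINEL)      # true
isna(NaN + 1.0)        # true
```

Note that this also shows the last con in practice: with this choice, typemin(Int) can no longer be stored as an ordinary value in MyType.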
Best,