Dear Hive User Group,

I hope this email finds you well.

I’m writing to inquire about the behavior of string concatenation with NULL 
values in Hive. I’ve observed that when concatenating a string with a NULL 
value using the || operator or concat function, Hive returns NULL.

For example:
```sql
beeline> select “a” || null;
±------+
|  _c0  |
±------+
| NULL  |
±------+
```
My understanding is that in some other database systems, such as Oracle or 
PostgreSQL, a similar operation (e.g., `"a" || NULL` or `'a' || NULL`) would 
typically return the non-`NULL` string itself (i.e., "a"), effectively treating 
the `NULL` as an empty string or ignoring it for the concatenation.

I'm curious to understand the rationale behind Hive's approach to propagate 
`NULL` in this scenario. While I understand that `NULL` often means "unknown" 
and operations involving it can result in `NULL`, I find that in the context of 
string building, the behavior of ignoring `NULL`s (as seen in other RDBMSs) can 
sometimes be more intuitive and convenient, potentially reducing the need for 
explicit `COALESCE` calls.

>From my perspective, it seems that if Hive were to ignore `NULL` values in 
>`||` concatenation (similar to Oracle/Postgres), it might lead to more 
>straightforward query writing in certain scenarios, particularly when dealing 
>with potentially `NULL`able columns that need to be combined.

Could you please shed some light on why this design choice was made for Hive? 
Are there specific advantages or standards that Hive is adhering to with this 
behavior?

Thank you for your time and insights. I appreciate the work done by the Hive 
community and would be happy to contribute to any discussions or documentation 
if needed.

Best regards,
Sadegh Abedi

Reply via email to