alamb commented on code in PR #9689:
URL: https://github.com/apache/arrow-datafusion/pull/9689#discussion_r1534272307
##########
datafusion/functions/src/datetime/to_char.rs:
##########
@@ -172,6 +172,19 @@ fn _to_char_scalar(
let data_type = &expression.data_type();
let is_scalar_expression = matches!(&expression, ColumnarValue::Scalar(_));
let array = expression.into_array(1)?;
+
+ if format.is_none() {
+ if is_scalar_expression {
+ return Ok(ColumnarValue::Scalar(ScalarValue::Utf8(
+ Some(String::new()),
Review Comment:
I thought `None` is the correct value (as it will semantically be a `NULL`,
which is the correct result).
The sqllogictests have special formatting for `NULL` values (to distinguish
them from empty strings):
https://github.com/apache/arrow-datafusion/blob/1d8a41bc8e08b56e90d6f8e6ef20e39a126987e4/datafusion/sqllogictest/src/engines/datafusion_engine/normalize.rs#L198-L200
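For reference, here is a minimal sketch of the distinction in play (assuming the `datafusion_common` crate and its `Display` impl for `ScalarValue`, which renders `None` variants as `NULL`):
```rust
// Minimal sketch: ScalarValue::Utf8(None) is a SQL NULL, while
// Utf8(Some(String::new())) is an empty string -- sqllogictest renders
// the two differently, so they are observably distinct results.
use datafusion_common::ScalarValue;

fn main() {
    let null_value = ScalarValue::Utf8(None);
    let empty_string = ScalarValue::Utf8(Some(String::new()));

    println!("{null_value}");   // prints: NULL
    println!("{empty_string}"); // prints an empty line
}
```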
However, when I double-checked, the behavior of `to_date` in Spark seems to
be different still: passing in a null format simply ignores the format
string and parses with the default -- it doesn't return `null`.
```python
>>> df = spark.createDataFrame([('1997-02-28 10:30:00',)], ['t'])
>>> df.select(functions.to_date(df.t, 'yyyy-MM-dd HH:mm:ss').alias('date')).collect()
[Row(date=datetime.date(1997, 2, 28))]
>>> df.select(functions.to_date(df.t, None).alias('date')).collect()
[Row(date=datetime.date(1997, 2, 28))]
```
```python
>>> df = spark.createDataFrame([('1997-02-2ddddd',)], ['t'])
>>> df.select(functions.to_date(df.t, None).alias('date')).collect()
[Row(date=None)]
>>> df.select(functions.to_date(df.t, 'yyyy-MM-dd HH:mm:ss').alias('date')).collect()
[Row(date=None)]
```
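If we wanted to mirror Spark's behavior instead, one option would be to fall back to the default rendering when the format is null. A hedged sketch (the `default_to_char` name and the arrow-cast fallback are my assumptions, not this PR's code):
```rust
// Hedged sketch: when `format` is NULL, render with arrow's default Utf8
// cast instead of returning NULL, matching the Spark behavior shown above.
use arrow::array::ArrayRef;
use arrow::compute::cast;
use arrow::datatypes::DataType;
use datafusion_common::{Result, ScalarValue};
use datafusion_expr::ColumnarValue;

fn default_to_char(array: ArrayRef, is_scalar_expression: bool) -> Result<ColumnarValue> {
    // Arrow's cast kernel formats temporal types with a default pattern.
    let formatted = cast(&array, &DataType::Utf8)?;
    if is_scalar_expression {
        // Collapse the one-row array back into a scalar, preserving NULLs.
        Ok(ColumnarValue::Scalar(ScalarValue::try_from_array(&formatted, 0)?))
    } else {
        Ok(ColumnarValue::Array(formatted))
    }
}
```
Either way, documenting which semantics we pick seems worthwhile.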
Any hints, @Omega359?