Jefffrey commented on code in PR #22702:
URL: https://github.com/apache/datafusion/pull/22702#discussion_r3346120672
##########
datafusion/core/src/dataframe/mod.rs:
##########
@@ -2527,6 +2528,78 @@ impl DataFrame {
.collect()
}
+ /// Fill NaN values in specified columns with a given value
+ /// If no columns are specified (empty vector), applies to all columns
+ /// Only fills if the value can be cast to the column's type
+ ///
+ /// # Arguments
+ /// * `value` - Value to fill NaNs with
+ /// * `columns` - List of column names to fill. If empty, fills all columns.
+ ///
+ /// # Example
+ /// ```
+ /// # use datafusion::prelude::*;
+ /// # use datafusion::error::Result;
+ /// # use datafusion_common::ScalarValue;
+ /// # #[tokio::main]
+ /// # async fn main() -> Result<()> {
+ /// let ctx = SessionContext::new();
+ /// let df = ctx
+ /// .read_csv("tests/data/example.csv", CsvReadOptions::new())
+ /// .await?;
+ /// // Fill NaN in only columns "a" and "c":
+ /// let df = df.fill_nan(ScalarValue::from(0.0), vec!["a".to_owned(), "c".to_owned()])?;
+ /// // Fill NaN across all columns:
+ /// let df = df.fill_nan(ScalarValue::from(0.0), vec![])?;
+ /// # Ok(())
+ /// # }
+ /// ```
+ #[expect(clippy::needless_pass_by_value)]
+ pub fn fill_nan(
+ &self,
+ value: ScalarValue,
+ columns: Vec<String>,
+ ) -> Result<DataFrame> {
Review Comment:
I would prefer to get the API right from the start since, as you point
out, changing it later can be a breaking change (technically we can avoid
that if we do the follow-up within the same release window).
Requiring a `Vec<String>` is a bit unwieldy, which I think is what the
clippy lint is trying to tell us?
I think it would be good to try to get `&[impl Into<Column>]` to work if
possible, since that allows `&[&str]` to work as well.
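As a rough sketch of what that signature shape could look like (not the PR's actual code): the `Column` type below is a simplified, hypothetical stand-in for DataFusion's `Column`, and `resolve_columns` is a made-up helper standing in for the `columns` parameter handling inside `fill_nan`. The `Clone` bound is an assumption of this sketch; it is needed because `Into::into` consumes its receiver while a slice only hands out references.

```rust
/// Hypothetical stand-in for DataFusion's `Column`; only the name is modelled.
#[derive(Debug, Clone, PartialEq)]
pub struct Column {
    pub name: String,
}

impl From<&str> for Column {
    fn from(name: &str) -> Self {
        Column { name: name.to_owned() }
    }
}

impl From<String> for Column {
    fn from(name: String) -> Self {
        Column { name }
    }
}

/// Hypothetical helper: accepts any slice whose elements convert into `Column`.
/// Each element is cloned before conversion because `into()` takes ownership.
pub fn resolve_columns<T>(columns: &[T]) -> Vec<Column>
where
    T: Into<Column> + Clone,
{
    columns.iter().cloned().map(Into::into).collect()
}
```

With a bound like this, callers could pass `&["a", "c"]` directly, or a borrowed `Vec<String>`, without building owned strings just to satisfy the signature. An empty slice needs a type annotation (for example `resolve_columns::<&str>(&[])`), which is a usability cost worth weighing for the "empty means all columns" convention.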
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]