Kimahriman commented on code in PR #731: URL: https://github.com/apache/datafusion-comet/pull/731#discussion_r1700429573
########## native/spark-expr/src/structs.rs: ########## @@ -125,3 +125,103 @@ impl PartialEq<dyn Any> for CreateNamedStruct { .unwrap_or(false) } } + +#[derive(Debug, Hash)] +pub struct GetStructField { + child: Arc<dyn PhysicalExpr>, + ordinal: usize, +} + +impl GetStructField { + pub fn new(child: Arc<dyn PhysicalExpr>, ordinal: usize) -> Self { + Self { child, ordinal } + } + + fn child_field(&self, input_schema: &Schema) -> DataFusionResult<Arc<Field>> { + match self.child.data_type(input_schema)? { + DataType::Struct(fields) => Ok(fields[self.ordinal].clone()), + data_type => Err(DataFusionError::Plan(format!( + "Expect struct field, got {:?}", + data_type + ))), + } + } +} + +impl PhysicalExpr for GetStructField { Review Comment: In this particular case `GetStructField` might not even make sense as a UDF because it's created by the Spark analyzer from an `ExtractValue` expression, which resolves the name to an ordinal in the struct. It'd be odd to use `GetStructField` directly, since you would normally want a nested column by name, and not by index. I think the existing `get_field` UDF in DataFusion covers that use case, but using this PhysicalExpr on the Spark side seems like it makes more sense instead of having to convert the ordinal back into a field name to then use `get_field` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org