alamb commented on code in PR #19335:
URL: https://github.com/apache/datafusion/pull/19335#discussion_r2623746392
##########
datafusion/sqllogictest/test_files/metadata.slt:
##########
@@ -235,7 +235,56 @@ order by 1 asc nulls last;
3 1
NULL 1
+# Regression test: first_value should preserve metadata
Review Comment:
I think this would make it easier to debug various metadata issues.
I can file a ticket if you think this is reasonable.
##########
datafusion/sqllogictest/test_files/metadata.slt:
##########
@@ -235,7 +235,56 @@ order by 1 asc nulls last;
3 1
NULL 1
+# Regression test: first_value should preserve metadata
Review Comment:
I think these tests check that there is metadata on the input tables (rather
than on the output tables).
I do really like the idea of adding a UDF, similar to `arrow_typeof`, that
can show the metadata:
```sql
> select arrow_typeof('foo');
+---------------------------+
| arrow_typeof(Utf8("foo")) |
+---------------------------+
| Utf8 |
+---------------------------+
1 row(s) fetched.
Elapsed 0.024 seconds.
```
Possibility: add a new argument
```sql
> select arrow_typeof('foo', true);
```
Possibility: add a new function
```sql
> select arrow_metadata('foo');
```
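To make the input-versus-output distinction above concrete, here is a minimal std-only Rust sketch. `Field` and `first_value_output_field` are hypothetical stand-ins, not the Arrow or DataFusion types, and the key and value strings are illustrative only. The one assumption is that field metadata behaves like a string-to-string map.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an Arrow field: a name plus string metadata.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    metadata: HashMap<String, String>,
}

/// Hypothetical model of how an aggregate such as first_value should derive
/// its output field: a new name, but the same metadata as the input field.
fn first_value_output_field(input: &Field) -> Field {
    Field {
        name: format!("first_value({})", input.name),
        metadata: input.metadata.clone(),
    }
}

fn main() {
    // Illustrative key and value; not taken from the real test tables.
    let mut metadata = HashMap::new();
    metadata.insert("metadata_key".to_string(), "the id field".to_string());
    let input = Field { name: "id".to_string(), metadata };

    let output = first_value_output_field(&input);

    // Inspecting `input.metadata` only shows the table was registered correctly.
    // A regression test for metadata preservation has to inspect the output.
    assert_eq!(output.metadata, input.metadata);
    println!("{:?}", output.metadata.get("metadata_key"));
}
```

A SQL-level function such as the proposed `arrow_metadata` would let a `.slt` test make the second kind of assertion directly on a query result.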
##########
datafusion/sqllogictest/src/test_context.rs:
##########
@@ -398,6 +398,58 @@ pub async fn register_metadata_tables(ctx: &SessionContext) {
.unwrap();
ctx.register_batch("table_with_metadata", batch).unwrap();
+
+ // Register the get_metadata UDF for testing metadata preservation
+ ctx.register_udf(ScalarUDF::from(GetMetadataUdf::new()));
+}
+
+/// UDF to extract metadata from a field for testing purposes
+/// Usage: get_metadata(expr, 'key') -> returns the metadata value or NULL
+#[derive(Debug, PartialEq, Eq, Hash)]
+struct GetMetadataUdf {
+ signature: Signature,
+}
+
+impl GetMetadataUdf {
+ fn new() -> Self {
+ Self {
+ signature: Signature::any(2, Volatility::Immutable),
+ }
+ }
+}
+
+impl ScalarUDFImpl for GetMetadataUdf {
+ fn as_any(&self) -> &dyn Any {
+ self
+ }
+
+ fn name(&self) -> &str {
+ "get_metadata"
+ }
+
+ fn signature(&self) -> &Signature {
+ &self.signature
+ }
+
+ fn return_type(&self, _arg_types: &[DataType]) -> Result<DataType> {
+ Ok(DataType::Utf8)
+ }
+
+ fn invoke_with_args(&self, args: ScalarFunctionArgs) -> Result<ColumnarValue> {
+ // Get the metadata key from the second argument (must be a string literal)
Review Comment:
I think it would also be nice to support a single-column version that
returns the metadata as a struct array.
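As a rough illustration of the two call shapes, here is a std-only Rust sketch. It models field metadata as a `HashMap<String, String>`; the function names `get_metadata_value` and `get_all_metadata` are hypothetical, and a real UDF would return Arrow arrays rather than these plain Rust values.

```rust
use std::collections::{BTreeMap, HashMap};

/// Two-argument shape: look up one key. `None` plays the role of SQL NULL.
fn get_metadata_value(metadata: &HashMap<String, String>, key: &str) -> Option<String> {
    metadata.get(key).cloned()
}

/// Single-argument shape: return every entry. A BTreeMap keeps the entries
/// sorted by key, so the result is deterministic and easy to compare in tests.
fn get_all_metadata(metadata: &HashMap<String, String>) -> BTreeMap<String, String> {
    metadata
        .iter()
        .map(|(k, v)| (k.clone(), v.clone()))
        .collect()
}

fn main() {
    // Illustrative metadata; keys and values are made up for this sketch.
    let mut metadata = HashMap::new();
    metadata.insert("unit".to_string(), "seconds".to_string());
    metadata.insert("source".to_string(), "sensor_a".to_string());

    // Keyed lookup: present key yields a value, missing key yields None.
    println!("{:?}", get_metadata_value(&metadata, "unit"));
    println!("{:?}", get_metadata_value(&metadata, "missing"));

    // Whole-map form: all entries at once, ordered by key.
    for (key, value) in get_all_metadata(&metadata) {
        println!("{key} = {value}");
    }
}
```

The sorted map stands in for the struct result mentioned above: a stable ordering matters because hash map iteration order is not deterministic, which would make expected test output flaky.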
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]