dadepo opened a new issue, #6485:
URL: https://github.com/apache/arrow-datafusion/issues/6485

   ### Is your feature request related to a problem or challenge?
   
   I am not sure whether this is a bug, a feature request, or whether things are working as they should.
   
   How can I fix an error of the following kind?
   
   Background: I have a UDF that mimics `json_build_object` from Postgres. This UDF returns a `MapArray` as its return value.
   
   With a table like
   
   ```
   let df = ctx.sql(r#"
   select * from test"#).await?;
   df.show().await?;
   
   +-----------+-----------+-----------+
   | clientid  | name      | parentid  |
   +-----------+-----------+-----------+
   | c-string1 | n-string1 | p-string1 |
   | c-string2 | n-string2 | p-string2 |
   +-----------+-----------+-----------+
   ```
   
   I can build a JSON object like this just fine:
   
   ```
           let df = ctx.sql(r#"
           select json_build_object('name', name, 'type', 'test')
           from test"#).await?;
           df.show().await?;
   
   +---------------------------------------------------------------------+
   | json_build_object(Utf8("name"),test.name,Utf8("type"),Utf8("test")) |
   +---------------------------------------------------------------------+
   | {name: n-string1, type: test}                                       |
   | {name: n-string2, type: test}                                       |
   +---------------------------------------------------------------------+
   ```
   
   But if I remove `name`, or any other column that exists in the table, I get this error:
   
   ```
           let df = ctx.sql(r#"
           select json_build_object('type', 'test')
           from test"#).await?;
           df.show().await?;
   
   Error: This feature is not implemented: Can't create a scalar from array of 
type "Map(Field { name: "entries", data_type: Struct([Field { name: "keys", 
data_type: Utf8, nullable: false, dict_id: 0, dict_is_ordered: false, metadata: 
{} }, Field { name: "values", data_type: Utf8, nullable: true, dict_id: 0, 
dict_is_ordered: false, metadata: {} }]), nullable: false, dict_id: 0, 
dict_is_ordered: false, metadata: {} }, false)"
   ```
   
   A simple workaround would be to always include a column from the table in the query, but unfortunately I cannot guarantee this, especially for use cases where `json_build_object` works on computed values or values created via other UDFs.
   
   How can I make my UDF not depend on the presence of a column? For example, the `make_array` function works this way: I can use it without including a column from the table.
   
   ```
           let df = ctx.sql(r#"
           select make_array('type', 'test')
           from test"#).await?;
           df.show().await?;
   
   +--------------------------------------+
   | makearray(Utf8("type"),Utf8("test")) |
   +--------------------------------------+
   | [type, test]                         |
   | [type, test]                         |
   +--------------------------------------+
   ```
   
   ### Describe the solution you'd like
   
   The ability to have a UDF that does not depend on the presence of a column, and that generates as many rows as needed in the resulting record batch.
   
   ### Describe alternatives you've considered
   
   If I change the return type to `Utf8` instead of `Map`, that is
      
   ```
    let json_build_object_return_type: ReturnTypeFunction =
        Arc::new(move |_| Ok(Arc::new(DataType::Utf8)));
   ```
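
   Under this workaround, the UDF body has to build the JSON text itself. A std-only sketch of roughly what it would emit per row (the helpers `escape_json` and `build_json_object` are hypothetical, not DataFusion APIs; a real UDF body would do this over `StringArray` values):

```rust
// Sketch of the Utf8 workaround: build a JSON object string from
// alternating key/value pairs, roughly what the UDF body would emit
// per row when its return type is DataType::Utf8.
// Both helpers are hypothetical illustrations, not DataFusion APIs.

/// Escape quotes and backslashes so the value is safe inside JSON text.
fn escape_json(s: &str) -> String {
    s.chars()
        .flat_map(|c| match c {
            '"' => vec!['\\', '"'],
            '\\' => vec!['\\', '\\'],
            other => vec![other],
        })
        .collect()
}

/// Render key/value pairs as a JSON object string.
fn build_json_object(pairs: &[(&str, &str)]) -> String {
    let body: Vec<String> = pairs
        .iter()
        .map(|(k, v)| format!(r#""{}":"{}""#, escape_json(k), escape_json(v)))
        .collect();
    format!("{{{}}}", body.join(","))
}

fn main() {
    // Mirrors `json_build_object('type', 'test')` from the issue.
    assert_eq!(build_json_object(&[("type", "test")]), r#"{"type":"test"}"#);
    println!("{}", build_json_object(&[("type", "test")]));
}
```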
   
   And if I return `StringArray` from the UDF, I get the behaviour I am looking for. But this means the JSON structure returned from the UDF gets formatted into strings, which makes processing more difficult down the line, as I end up with values like this
   
   ```
   let value = r#"\"{\\\"key_one\\\":\\\"val_one\\\"}\""#;
   ```
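
   One way to see the difficulty: the cell value is a JSON string that itself contains encoded JSON, so it has to be unquoted once before it can be parsed as an object. A std-only sketch, assuming the actual cell value is the once-quoted string `"{\"key_one\":\"val_one\"}"` (the helper `unquote_json_string` is hypothetical and only handles escaped quotes and backslashes):

```rust
// Sketch: strip the outer quotes and undo one level of escaping, turning
// the doubly encoded cell value back into plain JSON text. This helper is
// a hypothetical illustration for the simple case only.
fn unquote_json_string(s: &str) -> Option<String> {
    // The value must be wrapped in literal quote characters.
    let inner = s.strip_prefix('"')?.strip_suffix('"')?;
    let mut out = String::with_capacity(inner.len());
    let mut chars = inner.chars();
    while let Some(c) = chars.next() {
        if c == '\\' {
            match chars.next()? {
                '"' => out.push('"'),
                '\\' => out.push('\\'),
                // Pass through any other escape unchanged.
                other => {
                    out.push('\\');
                    out.push(other);
                }
            }
        } else {
            out.push(c);
        }
    }
    Some(out)
}

fn main() {
    let value = r#""{\"key_one\":\"val_one\"}""#;
    let decoded = unquote_json_string(value).unwrap();
    assert_eq!(decoded, r#"{"key_one":"val_one"}"#);
    println!("{decoded}");
}
```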
   
   ### Additional context
   
   _No response_

