peterxcli commented on PR #20358:
URL: https://github.com/apache/datafusion/pull/20358#issuecomment-3910433621

   > > Warning that spark has `spark.sql.mapKeyDedupPolicy`
   > > ```
   > > spark.sql.mapKeyDedupPolicy | EXCEPTION | The policy to deduplicate map keys in builtin function: CreateMap, MapFromArrays, MapFromEntries, StringToMap, MapConcat and TransformKeys. When EXCEPTION, the query fails if duplicated map keys are detected. When LAST_WIN, the map key that is inserted at last takes precedence. | 3.0.0
   > > ```
   > > 
[spark.apache.org/docs/latest/configuration.html](https://spark.apache.org/docs/latest/configuration.html)
   > 
   > this is a good point
   
   Yeah, thanks for pointing that out. That's a big reason why I'm trying to reuse 
`map_from_keys_values_offsets_nulls` from 
`datafusion/spark/src/function/map/utils.rs`, which already handles the Spark 
dedup policy:
   
   
https://github.com/apache/datafusion/blob/ffcc7e3af8cfccb0c0705de2112d8277b28114fd/datafusion/spark/src/function/map/utils.rs#L114-L120
   
   > For now, configurable functions are not supported by DataFusion, so the more 
permissive `LAST_WIN` option is used in this implementation (instead of 
`EXCEPTION`). `EXCEPTION` behaviour can still be achieved externally at the cost of 
performance:
   `when(array_length(array_distinct(keys)) == array_length(keys), 
constructed_map)` `.otherwise(raise_error("duplicate keys occurred during map 
construction"))` 
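
To make the difference between the two policies concrete, here is a minimal standalone sketch in plain Rust. It is not the DataFusion code linked above: `DedupPolicy` and `build_map` are hypothetical names, the keys and values are simple slices rather than Arrow arrays, and keeping a duplicated key at its first-seen position is an assumption of this sketch.

```rust
use std::collections::HashMap;

/// Hypothetical mirror of the two `spark.sql.mapKeyDedupPolicy` values.
#[derive(Clone, Copy, Debug, PartialEq)]
enum DedupPolicy {
    /// Fail as soon as a duplicate key is seen.
    Exception,
    /// Keep the value of the last occurrence of each key.
    LastWin,
}

/// Builds map entries from parallel key/value slices.
/// Entries keep the position of the first occurrence of each key
/// (an assumption of this sketch, not a statement about DataFusion).
fn build_map(
    keys: &[&str],
    values: &[i32],
    policy: DedupPolicy,
) -> Result<Vec<(String, i32)>, String> {
    if keys.len() != values.len() {
        return Err("keys and values must have the same length".to_string());
    }

    // Maps each key to its position in `entries`.
    let mut index: HashMap<&str, usize> = HashMap::new();
    let mut entries: Vec<(String, i32)> = Vec::new();

    for (&k, &v) in keys.iter().zip(values.iter()) {
        match index.get(k).copied() {
            Some(pos) => match policy {
                DedupPolicy::Exception => {
                    return Err(format!("duplicate map key: {k}"));
                }
                // Overwrite the earlier value in place.
                DedupPolicy::LastWin => entries[pos].1 = v,
            },
            None => {
                index.insert(k, entries.len());
                entries.push((k.to_string(), v));
            }
        }
    }
    Ok(entries)
}
```

With keys `["a", "b", "a"]` and values `[1, 2, 3]`, `DedupPolicy::LastWin` yields the entries `("a", 3)` and `("b", 2)`, while `DedupPolicy::Exception` returns an error. Both policies need the same single pass, so the extra cost of the external workaround quoted above comes from its separate `array_distinct` check.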


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

