[ https://issues.apache.org/jira/browse/SPARK-25829?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16663118#comment-16663118 ]
Wenchen Fan edited comment on SPARK-25829 at 11/21/18 10:06 AM:
----------------------------------------------------------------

More investigation on "later entry wins". If we still allow duplicated keys in maps physically, the following functions need to be updated: Explode, PosExplode, GetMapValue, MapKeys, MapValues, MapEntries, TransformKeys, TransformValues, MapZipWith.

If we want to forbid duplicated keys in maps, the following functions need to be updated: CreateMap, MapFromArrays, MapFromEntries, StringToMap, MapConcat, TransformKeys, MapFilter, and also reading maps from data sources.

So the "later entry wins" semantic is more ideal but needs more work.

> Duplicated map keys are not handled consistently
> ------------------------------------------------
>
> Key: SPARK-25829
> URL: https://issues.apache.org/jira/browse/SPARK-25829
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.4.0
> Reporter: Wenchen Fan
> Priority: Major
>
> In Spark SQL, we apply "earlier entry wins" semantics to duplicated map keys, e.g.
> {code}
> scala> sql("SELECT map(1,2,1,3)[1]").show
> +------------------+
> |map(1, 2, 1, 3)[1]|
> +------------------+
> |                 2|
> +------------------+
> {code}
> However, this handling is not applied consistently.
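The two semantics discussed above differ only in which duplicate survives when a map is built. A minimal sketch in plain Python (not Spark code; the helper names `first_wins` and `last_wins` are invented for illustration):

```python
def first_wins(entries):
    """Deduplicate key-value pairs, keeping the first value
    seen for each key ("earlier entry wins")."""
    result = {}
    for k, v in entries:
        if k not in result:
            result[k] = v
    return result

def last_wins(entries):
    """Deduplicate key-value pairs, keeping the last value
    seen for each key ("later entry wins")."""
    result = {}
    for k, v in entries:
        result[k] = v  # later entries overwrite earlier ones
    return result

# map(1,2,1,3) from the issue description, as a list of entries
entries = [(1, 2), (1, 3)]
print(first_wins(entries)[1])  # -> 2, matching the current map(...)[1] behavior
print(last_wins(entries)[1])   # -> 3, the proposed "later entry wins" result
```

Under "later entry wins", map-producing functions such as CreateMap or MapConcat would deduplicate like `last_wins` at construction time, so map-consuming functions never observe duplicates.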