Shaobo Guan created SPARK-55560:
-----------------------------------
Summary: STR_TO_MAP returns wrong result with "|" in delimiter
Key: SPARK-55560
URL: https://issues.apache.org/jira/browse/SPARK-55560
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.5.6
Reporter: Shaobo Guan
STR_TO_MAP returns a wrong result when "|" is used as the key-value delimiter.
Calling it with the arguments ('key1|value1,key2|value2', ',', '|') fails with:
Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most
recent failure: Lost task 0.0 in stage 0.0 (TID 0)
(ip-10-0-3-243.us-east-2.compute.internal executor driver):
org.apache.spark.SparkRuntimeException: [DUPLICATED_MAP_KEY] Duplicate map key
k was found, please check the input data. If you want to remove the duplicated
keys, you can set "spark.sql.mapKeyDedupPolicy" to "LAST_WIN" so that the key
inserted at last takes precedence.
at org.apache.spark.sql.errors.QueryExecutionErrors$.duplicateMapKeyFoundError(QueryExecutionErrors.scala:1278)
at org.apache.spark.sql.catalyst.util.ArrayBasedMapBuilder.put(ArrayBasedMapBuilder.scala:69)
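With other inputs the call does not fail but returns a wrong map, with the key cut off after its first character:
Row 11 (inputs=(k10|푇ͳõ, &, |)):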
Column 'out': actual=Map(k -> 10|푇ͳõ), expected=Map(k10 -> 푇ͳõ)
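
A possible explanation, offered as a sketch rather than a confirmed diagnosis: the str_to_map documentation states that both delimiters are treated as regular expressions, and a bare "|" is an alternation of two empty patterns, so it matches the empty string at every position. Assuming the delimiters end up in Java's regex-based String.split, the first split point then falls right after the first character. The class and method names below are hypothetical stand-ins and not Spark's implementation, and the sample value is shortened to "v".

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

class StrToMapSketch {
    // Hypothetical stand-in for STR_TO_MAP (not Spark code): both delimiters
    // are handed to String.split, which interprets them as regular expressions.
    static Map<String, String> strToMap(String text, String pairDelim, String keyValueDelim) {
        Map<String, String> out = new LinkedHashMap<>();
        for (String pair : text.split(pairDelim, -1)) {
            // Limit 2: split each pair at the first delimiter match only.
            String[] kv = pair.split(keyValueDelim, 2);
            String value = kv.length > 1 ? kv[1] : null;
            if (out.containsKey(kv[0])) {
                // Mirrors the duplicate-key failure seen in the report.
                throw new IllegalStateException("Duplicate map key " + kv[0]);
            }
            out.put(kv[0], value);
        }
        return out;
    }

    public static void main(String[] args) {
        // Unescaped "|" matches the empty string, so the key is cut after one character.
        System.out.println(strToMap("k10|v", "&", "|"));                 // {k=10|v}
        // Quoting the delimiter makes it match a literal pipe.
        System.out.println(strToMap("k10|v", "&", Pattern.quote("|"))); // {k10=v}
    }
}
```

Under the same assumption, the first input ('key1|value1,key2|value2', ',', '|') yields the key "k" for both pairs, which would account for the DUPLICATED_MAP_KEY error above. If this reading is right, passing a regex-escaped delimiter (for example a character class such as '[|]') may serve as a workaround until the behavior is clarified.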
--
This message was sent by Atlassian Jira
(v8.20.10#820010)