[ https://issues.apache.org/jira/browse/SPARK-23603?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dzcxzl updated SPARK-23603:
---------------------------
    Labels: ca  (was: )

> When the length of the json is in a range, get_json_object will result in missing tail data
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-23603
>                 URL: https://issues.apache.org/jira/browse/SPARK-23603
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.0.0, 2.2.0, 2.3.0
>            Reporter: dzcxzl
>            Priority: Major
>              Labels: ca
>
> Jackson (>= 2.7.7) fixes a bug that can drop the tail of a value when the length of the value falls within a certain range:
> [https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.7.7]
> [https://github.com/FasterXML/jackson-core/issues/307]
>
> To reproduce in spark-shell:
>
> {code:java}
> val value = "x" * 3000
> val json = s"""{"big": "$value"}"""
> spark.sql("select length(get_json_object(\'"+json+"\','$.big'))" ).collect
> res0: Array[org.apache.spark.sql.Row] = Array([2991])
> {code}
> The correct result is 3000.
>
> There are two possible solutions:
> One is to bump the Jackson version to 2.7.7.
> The other is to replace writeRaw(char[] text, int offset, int len) with writeRaw(String text).
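
The report says the truncation occurs only when the value length is "in a range" but does not give the bounds. A minimal sketch for locating the affected lengths follows. `TailDataCheck`, `makeJson`, and `findMismatches` are hypothetical helper names, not part of Spark or Jackson, and the range scanned in the commented usage is an arbitrary assumption around the reported length of 3000.

```scala
// Hypothetical helper, not part of Spark or the original report.
object TailDataCheck {

  /** Builds a document of the same shape as in the report: {"big": "xxx..."}. */
  def makeJson(len: Int): String = {
    val value = "x" * len
    s"""{"big": "$value"}"""
  }

  /**
   * Runs `extractLength` on a generated document for every length and
   * returns (expectedLength, actualLength) for each length that does not
   * round-trip.
   */
  def findMismatches(lengths: Seq[Int])(extractLength: String => Int): Seq[(Int, Int)] =
    lengths
      .map(len => (len, extractLength(makeJson(len))))
      .filter { case (expected, actual) => expected != actual }
}

// Possible use in spark-shell (assumes a live `spark` session; the range
// 2900 to 3100 is an arbitrary guess, not taken from the report):
//
//   val bad = TailDataCheck.findMismatches(2900 to 3100) { json =>
//     spark.sql(s"select length(get_json_object('$json', '$$.big'))")
//       .collect.head.getInt(0)
//   }
//   bad.foreach(println)
```

An empty result after upgrading Jackson or switching to writeRaw(String text) would suggest, for the lengths scanned, that the tail data is no longer dropped.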