MaxGekk opened a new pull request #26613: [MINOR][SQL] Check `json_tuple` does 
not truncate results
URL: https://github.com/apache/spark/pull/26613
 
 
   ### What changes were proposed in this pull request?
   I propose to add the test from the commit 
https://github.com/apache/spark/commit/a9365221133caadffbbbbce1aae1ace799a588a3 
(made for 2.4). I extended the test with several more lengths of the requested 
field to cover more code branches in Jackson Core. In particular, [the 
optimization](https://github.com/apache/spark/blob/5eb8973f871fef557fb4ca3f494406ed676a431a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L473-L476)
 calls Jackson's method 
https://github.com/FasterXML/jackson-core/blob/42b8b566845e8f8d77537f51187a439029ed9bff/src/main/java/com/fasterxml/jackson/core/json/UTF8JsonGenerator.java#L742-L746
 where the internal buffer size is **8000**. The tested lengths are:
   - 2000 to check the case 2000+2000+2000 < 8000
   - 2800 from the 2.4 commit to check the user's case
   - 8000-1, 8000, 8000+1 as sizes around the size of the internal buffer
   - 65535 to test an exceptionally large field.
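
   The check described above can be sketched roughly as follows. This is a minimal standalone illustration, not the test added to `JsonFunctionsSuite`: the object name `JsonTupleTruncationCheck`, the helper `extractedLength`, the constant `BufferSize`, and the JSON field name `test` are assumptions made for the example, and it assumes Spark SQL is on the classpath.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.json_tuple

// Hypothetical standalone check; names here are illustrative only.
object JsonTupleTruncationCheck {
  // Internal buffer size cited in the PR description.
  val BufferSize = 8000

  // Field lengths listed in the PR description.
  val lengths: Seq[Int] =
    Seq(2000, 2800, BufferSize - 1, BufferSize, BufferSize + 1, 65535)

  // Builds {"test":"aaa...a"} with a value of `len` characters and returns
  // the length of the string that json_tuple extracts for the "test" field.
  def extractedLength(spark: SparkSession, len: Int): Int = {
    import spark.implicits._
    val value = "a" * len
    Seq(s"""{"test":"$value"}""").toDF("json")
      .select(json_tuple($"json", "test"))
      .as[String]
      .head()
      .length
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("json_tuple-truncation-check")
      .getOrCreate()
    try {
      lengths.foreach { len =>
        val actual = extractedLength(spark, len)
        // A truncated result would be shorter than the requested length.
        assert(actual == len, s"expected $len characters but got $actual")
      }
    } finally {
      spark.stop()
    }
  }
}
```

   Comparing only the length keeps the failure message short even for the 65535-character case; comparing the full strings would be an equally valid choice.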
   
   ### Why are the changes needed?
   To make sure that the current implementation and future versions of Spark 
do not have the bug that was fixed in 2.4.
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   By running `JsonFunctionsSuite`.

