[ 
https://issues.apache.org/jira/browse/SPARK-40170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17582777#comment-17582777
 ] 

Jungtaek Lim commented on SPARK-40170:
--------------------------------------

Premature optimization is the root of all evil. The thing is, your application 
converts (or makes Spark convert) UnsafeRow into an external Row. The two have 
different memory layouts, so the data has to be copied, and this can't be 
changed. I'd rather suggest rechecking whether the conversion is really needed 
and, if it is, what alternatives there are to avoid it. It would probably help 
to share your application/query code.
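
To illustrate why the copy is unavoidable once an external Row is requested, here is a minimal plain-Java sketch (not Spark code). The class and method names are hypothetical, and a byte array stands in for the UTF-8 bytes held in an UnsafeRow. The first method decodes to a java.lang.String, which allocates and copies on every call; the second answers the same question on the encoded bytes without decoding.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for illustration only; none of these names exist in Spark.
public class Utf8DecodeSketch {

    // Decoding path: builds a new String from the UTF-8 bytes.
    // Every call allocates and copies, which is the kind of per-value
    // work a conversion to an external Row has to do for string columns.
    static boolean startsWithDecoded(byte[] utf8, String prefix) {
        String decoded = new String(utf8, StandardCharsets.UTF_8);
        return decoded.startsWith(prefix);
    }

    // Byte-level path: compares the encoded bytes directly.
    // No String is created, so there is nothing to decode or copy.
    // Assumes both arrays hold valid UTF-8.
    static boolean startsWithBytes(byte[] utf8, byte[] prefixUtf8) {
        if (prefixUtf8.length > utf8.length) {
            return false;
        }
        for (int i = 0; i < prefixUtf8.length; i++) {
            if (utf8[i] != prefixUtf8[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] value = "spark-row".getBytes(StandardCharsets.UTF_8);
        byte[] prefix = "spark".getBytes(StandardCharsets.UTF_8);
        System.out.println(startsWithDecoded(value, "spark"));
        System.out.println(startsWithBytes(value, prefix));
    }
}
```

Both methods give the same answer for valid UTF-8 input. The point is only that the second never materializes a String, which is roughly what keeping the logic inside built-in SQL expressions buys you compared with pulling values out as external Rows.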

> StringCoding UTF8 decode slowly
> -------------------------------
>
>                 Key: SPARK-40170
>                 URL: https://issues.apache.org/jira/browse/SPARK-40170
>             Project: Spark
>          Issue Type: Improvement
>          Components: SQL
>    Affects Versions: 3.1.2, 3.2.0, 3.1.3, 3.2.1, 3.3.0, 3.2.2, 3.3.1
>            Reporter: caican
>            Priority: Major
>         Attachments: image-2022-08-22-10-56-54-768.png, 
> image-2022-08-22-10-57-11-744.png
>
>
> When `UnsafeRow` is converted to `Row` at 
> `org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificSafeProjection.createExternalRow`,
> the UTF8String decoding and copyMemory steps are very slow.
> Does anyone have any ideas for optimization?
> !image-2022-08-22-10-56-54-768.png!
>  
> !image-2022-08-22-10-57-11-744.png!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

