[ 
https://issues.apache.org/jira/browse/HIVE-29137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

shilongfei updated HIVE-29137:
------------------------------
    Priority: Major  (was: Critical)

> Write orc map type very slow
> ----------------------------
>
>                 Key: HIVE-29137
>                 URL: https://issues.apache.org/jira/browse/HIVE-29137
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0, 4.1.0, 4.0.1
>            Reporter: zhaolong
>            Priority: Major
>         Attachments: image-2025-08-11-19-58-36-722.png
>
>
> I find that when a map column contains complex data, writing it in ORC 
> format is very slow, whereas writing it in Parquet format is very fast.
>  
> jstack:
> "main" #1 prio=5 os_prio=0 tid=0x00007f8e3c065000 nid=0x1197 runnable [0x00007f8e44285000]
>    java.lang.Thread.State: RUNNABLE
>         at org.apache.hadoop.hive.ql.exec.vector.BytesColumnVector.ensureSize(BytesColumnVector.java:544)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:279)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.setColumn(WriterImpl.java:285)
>         at org.apache.hadoop.hive.ql.io.orc.WriterImpl.addRow(WriterImpl.java:311)
>         at org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat$OrcRecordWriter.write(OrcOutputFormat.java:96)
>         at org.apache.hadoop.hive.ql.exec.FileSinkOperator.process(FileSinkOperator.java:1184)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:888)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:94)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:214)
>         at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:445)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:393)
>         at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:422)
>         at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
>         at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
>  
> It looks like the time is spent in arraycopy:
> !image-2025-08-11-19-58-36-722.png!
>  
>  
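
The stack trace above ends in BytesColumnVector.ensureSize, and the reporter suspects the time goes to array copying. The following is a toy model only, not Hive or ORC code, of why a buffer that is reallocated and copied on growth can be dominated by System.arraycopy. The class GrowthDemo and the method copiedBytes are hypothetical names, and whether the growth policy in Hive matches either branch is an assumption this report does not confirm.

```java
public class GrowthDemo {
    /**
     * Appends `appends` one-byte items to a growable buffer and returns the
     * total number of bytes moved by System.arraycopy while growing.
     * geometric=false grows by exactly one slot each time (worst case);
     * geometric=true doubles the capacity.
     * Hypothetical illustration; not taken from Hive.
     */
    public static long copiedBytes(int appends, boolean geometric) {
        byte[] buf = new byte[1];
        int used = 0;
        long copied = 0;
        for (int i = 0; i < appends; i++) {
            if (used == buf.length) {
                int newCap = geometric ? buf.length * 2 : buf.length + 1;
                byte[] bigger = new byte[newCap];
                // Preserve existing data, as a growable vector must.
                System.arraycopy(buf, 0, bigger, 0, used);
                copied += used;
                buf = bigger;
            }
            buf[used++] = (byte) i;
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 20_000;
        System.out.println("exact-fit growth copied: " + copiedBytes(n, false));
        System.out.println("doubling growth copied:  " + copiedBytes(n, true));
    }
}
```

With exact-fit growth the copied volume grows quadratically in the number of appends, while doubling keeps it linear. If profiling confirms a similar pattern in the ORC write path, it would explain why large map values hurt far more than small ones.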



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
