[ https://issues.apache.org/jira/browse/HIVE-7741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth Jayachandran resolved HIVE-7741.
-----------------------------------------
    Resolution: Duplicate

Fixed by HIVE-10191.

> Don't synchronize WriterImpl.addRow() when dynamic.partition is enabled
> -----------------------------------------------------------------------
>
>                 Key: HIVE-7741
>                 URL: https://issues.apache.org/jira/browse/HIVE-7741
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.13.1
>         Environment: Loading into orc
>            Reporter: Mostafa Mokhtar
>            Assignee: Prasanth Jayachandran
>              Labels: performance
>
> When loading into an un-partitioned ORC table, the WriterImpl$StructTreeWriter.write method is synchronized.
> When hive.optimize.sort.dynamic.partition is enabled, the current thread will be the only writer and the synchronization is not needed.
> Also, checking memory on every row is overkill; this can be done every 1K rows or so.
> {code}
>   public void addRow(Object row) throws IOException {
>     synchronized (this) {
>       treeWriter.write(row);
>       rowsInStripe += 1;
>       if (buildIndex) {
>         rowsInIndex += 1;
>         if (rowsInIndex >= rowIndexStride) {
>           createRowIndexEntry();
>         }
>       }
>     }
>     memoryManager.addedRow();
>   }
> {code}
> This can improve ORC load performance by 7%.
> {code}
> Stack Trace                                                                  Sample Count  Percentage(%)
> WriterImpl.addRow(Object)                                                    5,852         65.782
> WriterImpl$StructTreeWriter.write(Object)                                    5,163         58.037
> MemoryManager.addedRow()                                                     666           7.487
> MemoryManager.notifyWriters()                                                648           7.284
> WriterImpl.checkMemory(double)                                               645           7.25
> WriterImpl.flushStripe()                                                     643           7.228
> WriterImpl$StructTreeWriter.writeStripe(OrcProto$StripeFooter$Builder, int)  584           6.565
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
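
The two ideas in the quoted description (skip the lock when only one thread writes, and check memory once per batch of rows instead of once per row) can be illustrated with a small standalone sketch. This is not Hive or ORC code, and it is not taken from HIVE-10191: the class BatchedMemoryCheckWriter, the singleWriter flag, and the MEMORY_CHECK_INTERVAL constant are all hypothetical names chosen for illustration, and checkMemory() is only a counter standing in for the real memory manager call.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an ORC-style writer; not the real WriterImpl.
class BatchedMemoryCheckWriter {
    // Assumed interval; the issue only says "per 1K rows or such".
    static final int MEMORY_CHECK_INTERVAL = 1000;

    // True when the caller guarantees that a single thread calls addRow().
    private final boolean singleWriter;
    private long rowsInStripe = 0;
    private long rowsSinceMemoryCheck = 0;
    private final AtomicLong memoryChecks = new AtomicLong();

    BatchedMemoryCheckWriter(boolean singleWriter) {
        this.singleWriter = singleWriter;
    }

    // Updates per-row state and reports whether a memory check is now due.
    private boolean writeRow(Object row) {
        rowsInStripe += 1;
        rowsSinceMemoryCheck += 1;
        if (rowsSinceMemoryCheck >= MEMORY_CHECK_INTERVAL) {
            rowsSinceMemoryCheck = 0;
            return true;
        }
        return false;
    }

    public void addRow(Object row) {
        boolean checkDue;
        if (singleWriter) {
            // Only one thread writes, so no lock is taken.
            checkDue = writeRow(row);
        } else {
            synchronized (this) {
                checkDue = writeRow(row);
            }
        }
        // As in the quoted addRow(), the memory check runs outside the lock,
        // but only once per MEMORY_CHECK_INTERVAL rows.
        if (checkDue) {
            checkMemory();
        }
    }

    // Placeholder for a call such as memoryManager.addedRow().
    private void checkMemory() {
        memoryChecks.incrementAndGet();
    }

    public long getRowsInStripe() {
        return rowsInStripe;
    }

    public long getMemoryChecks() {
        return memoryChecks.get();
    }

    public static void main(String[] args) {
        BatchedMemoryCheckWriter writer = new BatchedMemoryCheckWriter(true);
        for (int i = 0; i < 2500; i++) {
            writer.addRow("row-" + i);
        }
        // 2500 rows at an interval of 1000 trigger checks at rows 1000 and 2000.
        System.out.println("rows=" + writer.getRowsInStripe()
            + " memoryChecks=" + writer.getMemoryChecks());
    }
}
```

Whether the unsynchronized path is safe depends entirely on the caller's guarantee that there is exactly one writing thread, which is the condition the description ties to hive.optimize.sort.dynamic.partition; the sketch takes that guarantee as a constructor argument rather than detecting it.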