[ https://issues.apache.org/jira/browse/ORC-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang resolved ORC-1150.
---------------------------------
    Fix Version/s: 1.8.0
       Resolution: Fixed

Resolved by https://github.com/apache/orc/pull/1087

> [C++] Improve RowReaderImpl::computeBatchSize()
> -----------------------------------------------
>
>                 Key: ORC-1150
>                 URL: https://issues.apache.org/jira/browse/ORC-1150
>             Project: ORC
>          Issue Type: Improvement
>          Components: C++
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>             Fix For: 1.8.0
>
>         Attachments: RowReaderImpl_next_annotation.png, 
> image-2022-04-12-17-11-28-091.png
>
>
> RowReaderImpl::computeBatchSize() can be on the hot path when sargs exist. The 
> following perf report shows that orc::RowReaderImpl::next() itself takes 1/4 
> of the scan time. It was measured using orc-scan with the sarg 
> "inv_quantity_on_hand between -1 and 5000", scanning 4 ORC files of the 
> TPCDS inventory table (768.23MB in total).
> !image-2022-04-12-17-11-28-091.png|width=713,height=251!
> Looking at its disassembly, most of the time is spent in a loop:
> !RowReaderImpl_next_annotation.png|width=556,height=465!
> The annotation indicates that this is the inlined 
> RowReaderImpl::computeBatchSize() method. Disassembly:
> {code:java}
>        │ d0:┌─→mov    %r14,%r15
>   0.36 │    │  mov    %esi,%ecx
>   0.13 │    │  shr    $0x6,%rdx
>  22.81 │    │  shl    %cl,%r15
>  24.24 │    │  test   %r15,(%r9,%rdx,8)
>        │    │↓ je     fb  
>        │ e2:│  lea    0x1(%rsi),%edx
>   0.22 │    │  mov    %r10,%rax
>   0.18 │    │  imul   %rdx,%rax
>  25.31 │    │  mov    %rdx,%rsi
>        │    │  cmp    %rdi,%rax
>   0.54 │    │  cmova  %rdi,%rax
>   0.04 │    ├──cmp    %r11,%rdx
>  23.79 │    └──jb     d0  
>   0.31 │ fb:   sub    %r8,%rax{code}
> The corresponding loop:
> {code:cpp}
> endRowInStripe = currentRowInStripe;
> uint32_t rg = static_cast<uint32_t>(currentRowInStripe / rowIndexStride);
> for (; rg < includedRowGroups.size(); ++rg) {
>   if (!includedRowGroups[rg]) {
>     break;
>   } else {
>     endRowInStripe = std::min(rowsInCurrentStripe, (rg + 1) * rowIndexStride);
>   }
> } {code}
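
For illustration only, here is a minimal, self-contained sketch of the loop quoted above, next to one possible variant that scans for the first excluded row group and computes the end row once instead of on every iteration. This is not necessarily what the pull request does. The function names endRowPerIteration and endRowHoisted are hypothetical, the member variables are passed in as parameters, and includedRowGroups is assumed to be a std::vector<bool>.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Same logic as the quoted loop: endRowInStripe is recomputed for every
// included row group that is visited.
uint64_t endRowPerIteration(const std::vector<bool>& includedRowGroups,
                            uint64_t currentRowInStripe,
                            uint64_t rowsInCurrentStripe,
                            uint64_t rowIndexStride) {
  assert(rowIndexStride > 0);
  uint64_t endRowInStripe = currentRowInStripe;
  uint32_t rg = static_cast<uint32_t>(currentRowInStripe / rowIndexStride);
  for (; rg < includedRowGroups.size(); ++rg) {
    if (!includedRowGroups[rg]) {
      break;
    }
    endRowInStripe = std::min(rowsInCurrentStripe, (rg + 1) * rowIndexStride);
  }
  return endRowInStripe;
}

// Hypothetical variant: find the first excluded row group, then compute the
// end row a single time. Returns the same value as endRowPerIteration.
uint64_t endRowHoisted(const std::vector<bool>& includedRowGroups,
                       uint64_t currentRowInStripe,
                       uint64_t rowsInCurrentStripe,
                       uint64_t rowIndexStride) {
  assert(rowIndexStride > 0);
  const std::size_t first =
      static_cast<std::size_t>(currentRowInStripe / rowIndexStride);
  std::size_t rg = first;
  while (rg < includedRowGroups.size() && includedRowGroups[rg]) {
    ++rg;
  }
  if (rg == first) {
    // The current row group is excluded or out of range: nothing to read.
    return currentRowInStripe;
  }
  return std::min(rowsInCurrentStripe,
                  static_cast<uint64_t>(rg) * rowIndexStride);
}
```

Both functions still visit every included row group between the current position and the next excluded one, so the variant only removes the multiply and min from the loop body. Avoiding the rescan on each next() call would need state kept across calls, which this sketch does not attempt.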



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
