[ 
https://issues.apache.org/jira/browse/GOBBLIN-1918?focusedWorklogId=881440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-881440
 ]

ASF GitHub Bot logged work on GOBBLIN-1918:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Sep/23 17:31
            Start Date: 22/Sep/23 17:31
    Worklog Time Spent: 10m 
      Work Description: codecov-commenter commented on PR #3787:
URL: https://github.com/apache/gobblin/pull/3787#issuecomment-1731790732

   ## [Codecov](https://app.codecov.io/gh/apache/gobblin/pull/3787?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#3787](https://app.codecov.io/gh/apache/gobblin/pull/3787?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (7e2a590) into [master](https://app.codecov.io/gh/apache/gobblin/commit/1ca3192a29a3fd45045ca5bcddbdf58763ae1f79?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (1ca3192) will **increase** coverage by `6.59%`.
   > Report is 2 commits behind head on master.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3787      +/-   ##
   ============================================
   + Coverage     47.30%   53.90%   +6.59%     
   + Complexity    10957     1397    -9560     
   ============================================
     Files          2152      281    -1871     
     Lines         85111    10269   -74842     
     Branches       9452     1101    -8351     
   ============================================
   - Hits          40266     5535   -34731     
   + Misses        41196     4215   -36981     
   + Partials       3649      519    -3130     
   ```
   
   
   [see 1871 files with indirect coverage changes](https://app.codecov.io/gh/apache/gobblin/pull/3787/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache)
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 881440)
    Time Spent: 20m  (was: 10m)

> Optimize smart resizing for ORC Writer converter buffer
> -------------------------------------------------------
>
>                 Key: GOBBLIN-1918
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1918
>             Project: Apache Gobblin
>          Issue Type: Improvement
>          Components: gobblin-core
>            Reporter: William Lo
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The GobblinOrcWriter contains a converter and a buffer row batch. The buffer
> holds the records converted from Avro to ORC before they are added to the
> native ORC writer.
> Because the buffer can hold multiple records, the columns of the row batch
> must be resized repeatedly as records accumulate. Resizing too often (the
> enlarge factor is too low) costs performance, while resizing too rarely (the
> enlarge factor is too high) lets the buffer dominate the container memory.
> Because only a bounded number of records can persist in the buffer before it
> is flushed, we want the resizing algorithm to become less aggressive as more
> records are processed.
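
The idea in the description above can be illustrated with a minimal sketch. It is not Gobblin's implementation: the issue does not specify a decay schedule, so the linear decay, the function names `enlarge_factor` and `resized_length`, and the default factors 3.0 and 1.2 are all assumptions made for illustration.

```python
import math


def enlarge_factor(records_processed, max_records, start_factor=3.0, end_factor=1.2):
    """Hypothetical schedule: shrink the growth factor linearly as the buffer fills."""
    if max_records <= 0:
        raise ValueError("max_records must be positive")
    # Fraction of the flush bound already consumed, clamped to [0, 1].
    progress = min(max(records_processed / max_records, 0.0), 1.0)
    return start_factor - (start_factor - end_factor) * progress


def resized_length(current_length, required_length, records_processed, max_records):
    """Return the new column length, growing only when the column is too small."""
    if required_length <= current_length:
        return current_length  # the column already fits, no resize needed
    factor = enlarge_factor(records_processed, max_records)
    # Never allocate less than what is required, whatever the factor is.
    return max(required_length, math.ceil(required_length * factor))
```

Early in a batch the sketch over-allocates generously so that few resizes are needed; close to the flush bound it allocates little more than the required length, which limits how much memory the buffer can claim.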



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
