[ 
https://issues.apache.org/jira/browse/GOBBLIN-1715?focusedWorklogId=813811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-813811
 ]

ASF GitHub Bot logged work on GOBBLIN-1715:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/22 23:01
            Start Date: 30/Sep/22 23:01
    Worklog Time Spent: 10m 
      Work Description: codecov-commenter commented on PR #3574:
URL: https://github.com/apache/gobblin/pull/3574#issuecomment-1264116133

   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3574](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5ecf978) into [master](https://codecov.io/gh/apache/gobblin/commit/71da34b5d42e7b2c159ef241368b644a08de875d?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (71da34b) will **increase** coverage by `1.73%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3574      +/-   ##
   ============================================
   + Coverage     46.90%   48.64%   +1.73%     
   + Complexity    10610     7851    -2759     
   ============================================
     Files          2111     1467     -644     
     Lines         82533    57802   -24731     
     Branches       9178     6647    -2531     
   ============================================
   - Hits          38714    28117   -10597     
   + Misses        40261    27068   -13193     
   + Partials       3558     2617     -941     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...rg/apache/gobblin/writer/GobblinBaseOrcWriter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tb3JjL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9Hb2JibGluQmFzZU9yY1dyaXRlci5qYXZh) | | |
   | [...e/gobblin/qualitychecker/task/TaskLevelPolicy.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vcXVhbGl0eWNoZWNrZXIvdGFzay9UYXNrTGV2ZWxQb2xpY3kuamF2YQ==) | | |
   | [...he/gobblin/kafka/client/Kafka08ConsumerClient.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtMDgvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4va2Fma2EvY2xpZW50L0thZmthMDhDb25zdW1lckNsaWVudC5qYXZh) | | |
   | [...org/apache/gobblin/source/extractor/Extractor.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc291cmNlL2V4dHJhY3Rvci9FeHRyYWN0b3IuamF2YQ==) | | |
   | [.../gobblin/kafka/writer/KafkaWriterCommonConfig.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL3dyaXRlci9LYWZrYVdyaXRlckNvbW1vbkNvbmZpZy5qYXZh) | | |
   | [.../converter/BytesToRecordWithMetadataConverter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tbWV0YWRhdGEvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vY29udmVydGVyL0J5dGVzVG9SZWNvcmRXaXRoTWV0YWRhdGFDb252ZXJ0ZXIuamF2YQ==) | | |
   | [.../apache/gobblin/kafka/writer/Kafka1DataWriter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtMS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9rYWZrYS93cml0ZXIvS2Fma2ExRGF0YVdyaXRlci5qYXZh) | | |
   | [...gobblin/service/modules/spec/JobExecutionPlan.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zcGVjL0pvYkV4ZWN1dGlvblBsYW4uamF2YQ==) | | |
   | [.../apache/gobblin/service/FlowExecutionResource.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1zZXJ2ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9GbG93RXhlY3V0aW9uUmVzb3VyY2UuamF2YQ==) | | |
   | [.../gobblin/service/modules/core/NoopD2Announcer.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9jb3JlL05vb3BEMkFubm91bmNlci5qYXZh) | | |
   | ... and [635 more](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 813811)
    Time Spent: 20m  (was: 10m)

> Support vectorized row batch pooling
> ------------------------------------
>
>                 Key: GOBBLIN-1715
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1715
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-core
>            Reporter: Ratandeep Ratti
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The pre-allocation method allocates vastly more memory for ORC ColumnVectors 
> of arrays and maps than needed. It is also unpredictable, because it depends 
> on the current column vector’s length, which itself grows as more memory is 
> allocated to it. A heap dump taken while ingesting a Kafka topic showed that 
> on the second resize call for an array ColumnVector, where the request size 
> was ~1k elements, it requested an allocation of around 444M elements. This 
> over-allocated well past the heap size, and it was the primary reason we 
> see OOM failures during ingestion of deeply nested records.
> Update: Below is an example of how a very large amount of memory can be 
> allocated by the smart resizing procedure. The formula for allocating memory is 
> {noformat}
> child_vector resize = 
>    child_vector_request_size  + 
>   (child_vector_request_size / rowsAdded + 1) * current_vector_size
> {noformat}
> If we now have deeply nested arrays of arrays, each with 525 elements, in a 
> single row, memory will be allocated as follows.
> {noformat}
> 1st resize = 525 + (525/1 + 1) * 256 = 135181        ; current vector size defaults to the batch size = 256
> 2nd resize = 525 + (525/1 + 1) * 135181 = *71105731* ; current vector size = 135181
> {noformat}
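
The growth described in the issue can be checked with a small Java sketch. It only evaluates the quoted formula, read as request size plus (request size / rowsAdded + 1) times the current vector size, which is the reading that reproduces the figures 135181 and 71105731. The class name `ResizeSketch` and method name `nextChildSize` are hypothetical and are not taken from the Gobblin code base; integer division is also an assumption (it makes no difference here because rowsAdded is 1).

```java
// Hypothetical sketch: names are illustrative, not Gobblin's actual API.
final class ResizeSketch {

    // Evaluates the formula quoted in the issue:
    //   requestSize + (requestSize / rowsAdded + 1) * currentVectorSize
    // Integer division is assumed.
    static long nextChildSize(long requestSize, long rowsAdded, long currentVectorSize) {
        return requestSize + (requestSize / rowsAdded + 1) * currentVectorSize;
    }

    public static void main(String[] args) {
        long size = 256; // default batch size, as stated in the issue
        for (int resize = 1; resize <= 2; resize++) {
            // Each resize feeds the previous result back in as the current size,
            // which is why the growth compounds.
            size = nextChildSize(525, 1, size);
            System.out.println("resize " + resize + " -> " + size);
        }
        // prints:
        // resize 1 -> 135181
        // resize 2 -> 71105731
    }
}
```

Because the result of one resize becomes the multiplier for the next, each additional level of nesting multiplies the allocation by roughly the request size, which matches the runaway growth reported above.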



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
