[ https://issues.apache.org/jira/browse/GOBBLIN-1715?focusedWorklogId=813811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-813811 ]
ASF GitHub Bot logged work on GOBBLIN-1715:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 30/Sep/22 23:01
            Start Date: 30/Sep/22 23:01
    Worklog Time Spent: 10m
      Work Description: codecov-commenter commented on PR #3574:
URL: https://github.com/apache/gobblin/pull/3574#issuecomment-1264116133

# [Codecov](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#3574](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (5ecf978) into [master](https://codecov.io/gh/apache/gobblin/commit/71da34b5d42e7b2c159ef241368b644a08de875d?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (71da34b) will **increase** coverage by `1.73%`.
> The diff coverage is `n/a`.

```diff
@@             Coverage Diff              @@
##             master    #3574      +/-   ##
============================================
+ Coverage     46.90%   48.64%   +1.73%
+ Complexity    10610     7851    -2759
============================================
  Files          2111     1467     -644
  Lines         82533    57802   -24731
  Branches       9178     6647    -2531
============================================
- Hits          38714    28117   -10597
+ Misses        40261    27068   -13193
+ Partials       3558     2617     -941
```

| [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3574?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [...rg/apache/gobblin/writer/GobblinBaseOrcWriter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tb3JjL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3dyaXRlci9Hb2JibGluQmFzZU9yY1dyaXRlci5qYXZh) | | |
| [...e/gobblin/qualitychecker/task/TaskLevelPolicy.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vcXVhbGl0eWNoZWNrZXIvdGFzay9UYXNrTGV2ZWxQb2xpY3kuamF2YQ==) | | |
| [...he/gobblin/kafka/client/Kafka08ConsumerClient.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtMDgvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4va2Fma2EvY2xpZW50L0thZmthMDhDb25zdW1lckNsaWVudC5qYXZh) | | |
| [...org/apache/gobblin/source/extractor/Extractor.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc291cmNlL2V4dHJhY3Rvci9FeHRyYWN0b3IuamF2YQ==) | | |
| [.../gobblin/kafka/writer/KafkaWriterCommonConfig.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2thZmthL3dyaXRlci9LYWZrYVdyaXRlckNvbW1vbkNvbmZpZy5qYXZh) | | |
| [.../converter/BytesToRecordWithMetadataConverter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tbWV0YWRhdGEvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vY29udmVydGVyL0J5dGVzVG9SZWNvcmRXaXRoTWV0YWRhdGFDb252ZXJ0ZXIuamF2YQ==) | | |
| [.../apache/gobblin/kafka/writer/Kafka1DataWriter.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtMS9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9rYWZrYS93cml0ZXIvS2Fma2ExRGF0YVdyaXRlci5qYXZh) | | |
| [...gobblin/service/modules/spec/JobExecutionPlan.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zcGVjL0pvYkV4ZWN1dGlvblBsYW4uamF2YQ==) | | |
| [.../apache/gobblin/service/FlowExecutionResource.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1zZXJ2ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9GbG93RXhlY3V0aW9uUmVzb3VyY2UuamF2YQ==) | | |
| [.../gobblin/service/modules/core/NoopD2Announcer.java](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9jb3JlL05vb3BEMkFubm91bmNlci5qYXZh) | | |
| ... and [635 more](https://codecov.io/gh/apache/gobblin/pull/3574/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |

:mega: We’re building smart automated test selection to slash your CI/CD build times.
[Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)


Issue Time Tracking
-------------------

    Worklog Id:     (was: 813811)
    Time Spent: 20m  (was: 10m)

> Support vectorized row batch pooling
> ------------------------------------
>
>                 Key: GOBBLIN-1715
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1715
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-core
>            Reporter: Ratandeep Ratti
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> The pre-allocation method allocates vastly more memory than needed for ORC
> ColumnVectors of arrays and maps. It is also unpredictable, because it depends
> on the current column vector’s length, which can change as we allocate more
> memory to it. A heap dump taken while ingesting a Kafka topic showed that on
> the second resize call for an array ColumnVector, where the request size was
> ~1k elements, it requested allocation of around 444M elements. This
> over-allocated far past the heap size and was the primary reason we see OOM
> failures during ingestion of deeply nested records.
> Update: Below is an example of how a very large amount of memory can be
> allocated by the smart resizing procedure. The formula for allocating memory is
> {noformat}
> child_vector resize =
>   child_vector_request_size +
>   (child_vector_request_size / rowsAdded + 1) * current_vector_size
> {noformat}
> If we now have deeply nested arrays of arrays, each of 525 elements in a row,
> the memory will be allocated as follows.
> {noformat}
> 1st resize = 525 + (525/1 + 1) * 256 = 135181
>   ; current vector size by default is batch size = 256
> 2nd resize = 525 + (525/1 + 1) * 135181 = *71105731*
>   ; current vector size = 135181
> {noformat}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
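
The arithmetic in the quoted issue description can be checked with a short Python sketch. The helper name `smart_resize` and its parameter names are hypothetical, chosen only to mirror the terms of the quoted formula; this is not Gobblin's or ORC's actual code, and integer division is an assumption (it makes no difference here because `rowsAdded` is 1).

```python
def smart_resize(request_size: int, rows_added: int, current_size: int) -> int:
    """Hypothetical helper: the resize formula as quoted in the issue.

    child_vector resize =
        child_vector_request_size
        + (child_vector_request_size / rowsAdded + 1) * current_vector_size
    """
    # Integer division is an assumption; the issue does not state the rounding.
    return request_size + (request_size // rows_added + 1) * current_size


# Values taken from the issue: 525 elements per row, 1 row added,
# and a default batch size of 256 as the initial vector size.
batch_size = 256
first = smart_resize(525, 1, batch_size)
# The second resize starts from the size produced by the first one.
second = smart_resize(525, 1, first)

print(first)   # 135181
print(second)  # 71105731
```

Each resize multiplies the previous vector size by roughly `request_size / rowsAdded`, so two nested levels of 525-element arrays already take the child vector from 256 to about 71 million elements.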