[ 
https://issues.apache.org/jira/browse/GOBBLIN-1783?focusedWorklogId=844482&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-844482
 ]

ASF GitHub Bot logged work on GOBBLIN-1783:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 09/Feb/23 01:04
            Start Date: 09/Feb/23 01:04
    Worklog Time Spent: 10m 
      Work Description: codecov-commenter commented on PR #3640:
URL: https://github.com/apache/gobblin/pull/3640#issuecomment-1423459456

   # [Codecov](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#3640](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (80d0bce) into [master](https://codecov.io/gh/apache/gobblin/commit/13faea46bd2f23999fb1bf9ea579296fb86d1e3d?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (13faea4) will **decrease** coverage by `2.77%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #3640      +/-   ##
   ============================================
   - Coverage     46.58%   43.81%   -2.77%     
   + Complexity    10681     2066    -8615     
   ============================================
     Files          2133      409    -1724     
     Lines         83573    17639   -65934     
     Branches       9295     2152    -7143     
   ============================================
   - Hits          38931     7729   -31202     
   + Misses        41076     9052   -32024     
   + Partials       3566      858    -2708     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...pache/gobblin/configuration/ConfigurationKeys.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1hcGkvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vY29uZmlndXJhdGlvbi9Db25maWd1cmF0aW9uS2V5cy5qYXZh) | | |
   | [...che/gobblin/runtime/api/InstrumentedSpecStore.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0luc3RydW1lbnRlZFNwZWNTdG9yZS5qYXZh) | | |
   | [...java/org/apache/gobblin/runtime/api/SpecStore.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL1NwZWNTdG9yZS5qYXZh) | | |
   | [...apache/gobblin/runtime/metrics/RuntimeMetrics.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvbWV0cmljcy9SdW50aW1lTWV0cmljcy5qYXZh) | | |
   | [...ache/gobblin/runtime/spec\_catalog/FlowCatalog.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19jYXRhbG9nL0Zsb3dDYXRhbG9nLmphdmE=) | | |
   | [...apache/gobblin/runtime/spec\_store/FSSpecStore.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zdG9yZS9GU1NwZWNTdG9yZS5qYXZh) | | |
   | [...gobblin/runtime/spec\_store/MysqlBaseSpecStore.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvc3BlY19zdG9yZS9NeXNxbEJhc2VTcGVjU3RvcmUuamF2YQ==) | | |
   | [.../modules/scheduler/GobblinServiceJobScheduler.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1zZXJ2aWNlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3NlcnZpY2UvbW9kdWxlcy9zY2hlZHVsZXIvR29iYmxpblNlcnZpY2VKb2JTY2hlZHVsZXIuamF2YQ==) | | |
   | [...etention/policy/predicates/WhitelistPredicate.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L3JldGVudGlvbi9wb2xpY3kvcHJlZGljYXRlcy9XaGl0ZWxpc3RQcmVkaWNhdGUuamF2YQ==) | | |
   | [...gobblin/runtime/commit/DatasetStateCommitStep.java](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvY29tbWl0L0RhdGFzZXRTdGF0ZUNvbW1pdFN0ZXAuamF2YQ==) | | |
   | ... and [1719 more](https://codecov.io/gh/apache/gobblin/pull/3640?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   




Issue Time Tracking
-------------------

    Worklog Id:     (was: 844482)
    Time Spent: 20m  (was: 10m)

> Initialize scheduler with batch gets instead of individual get per flow
> -----------------------------------------------------------------------
>
>                 Key: GOBBLIN-1783
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1783
>             Project: Apache Gobblin
>          Issue Type: Bug
>          Components: gobblin-service
>            Reporter: Urmi Mustafi
>            Assignee: Abhishek Tiwari
>            Priority: Major
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> We seek to improve the initialization time of the JobScheduler upon restart or 
> leadership change by batching the MySQL queries that fetch flow specs. 
> Instead of making one MySQL get call for each flow execution id, which scales 
> extremely poorly with the number of flows, we should group the calls to reduce 
> both the number of calls and the downtime.
> This implementation adds two new functions to the SpecStore interface, 
> getSortedSpecURIs and getBatchedSpecs, that we use to achieve the batching. 
> Because these two functions are generic enough to be useful in any 
> implementation of SpecStore, we add them to the base interface. Although this 
> requires every implementing class to provide these functions, it allows any 
> consumer of SpecStore to use this functionality without 
> caring about the specific implementation of the SpecStore used (as 
> JobScheduler does). Additionally, getBatchedSpecs requires an offset, or 
> starting point, from which to obtain each batch, so the consumer has to do some 
> bookkeeping of its position in the paginated gets, but this again separates the 
> functionality from the consumer's use case. The entirety of the flow 
> catalog is too large for the Scheduler to load into memory, so we use this 
> batch functionality.
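
The offset bookkeeping described above can be sketched roughly as follows. This is a minimal illustration and not the code from PR #3640: the `PagedSpecStore` interface, the `InMemoryStore` fake, the `loadAll` helper, and the `(offset, batchSize)` signature are all assumptions made for the example, and specs are represented as plain strings. The real `getBatchedSpecs` on SpecStore may take different parameters and return different types.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchedSpecLoad {

    /** Hypothetical, simplified stand-in for the batched read on SpecStore. */
    interface PagedSpecStore {
        /** Returns at most batchSize specs, starting at offset in a stable sorted order. */
        List<String> getBatchedSpecs(int offset, int batchSize);
    }

    /** In-memory fake used only to exercise the loop; a real store would query MySQL. */
    static final class InMemoryStore implements PagedSpecStore {
        private final List<String> specs = new ArrayList<>();

        InMemoryStore(int count) {
            for (int i = 0; i < count; i++) {
                specs.add(String.format("flow-%04d", i));
            }
        }

        @Override
        public List<String> getBatchedSpecs(int offset, int batchSize) {
            int from = Math.min(offset, specs.size());
            int to = Math.min(from + batchSize, specs.size());
            return new ArrayList<>(specs.subList(from, to));
        }
    }

    /**
     * Pages through the store, appending every spec to sink.
     * Returns the number of store calls made, which grows with
     * (number of specs / batchSize) rather than with the number of specs.
     */
    static int loadAll(PagedSpecStore store, int batchSize, List<String> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int offset = 0; // consumer-side bookkeeping of the position in the paginated gets
        int calls = 0;
        while (true) {
            List<String> batch = store.getBatchedSpecs(offset, batchSize);
            calls++;
            sink.addAll(batch);
            offset += batch.size();
            if (batch.size() < batchSize) {
                return calls; // a short (or empty) batch means the store is exhausted
            }
        }
    }

    public static void main(String[] args) {
        List<String> scheduled = new ArrayList<>();
        int calls = loadAll(new InMemoryStore(7), 3, scheduled);
        System.out.println(scheduled.size() + " specs loaded in " + calls + " calls");
    }
}
```

Offset-based paging like this only returns each spec exactly once if the underlying order is stable between calls, which is presumably why a sorted URI listing is introduced alongside the batched get.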



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
