Re: Suitable Name Search is still Open
Thanks for the reminder. Yes, we plan to take it up soon.

Abhishek

On Thu, May 7, 2020 at 3:58 PM Dave Fisher wrote:

> Hi Gobblin Devs-
>
> Your Suitable Name Search is still open.
> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-131
>
> Please look into providing the necessary trademark searches
>
> Regards,
> Dave
>
Suitable Name Search is still Open
Hi Gobblin Devs-

Your Suitable Name Search is still open.
https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-131

Please look into providing the necessary trademark searches

Regards,
Dave
[jira] [Work logged] (GOBBLIN-1144) move spec store delete to gobblinservice job scheduler
[ https://issues.apache.org/jira/browse/GOBBLIN-1144?focusedWorklogId=431907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431907 ]

ASF GitHub Bot logged work on GOBBLIN-1144:
---
Author: ASF GitHub Bot
Created on: 07/May/20 21:48
Start Date: 07/May/20 21:48
Worklog Time Spent: 10m
Work Description: sv2000 commented on a change in pull request #2981:
URL: https://github.com/apache/incubator-gobblin/pull/2981#discussion_r421813658

## File path: gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_catalog/FlowCatalog.java ##
@@ -325,12 +325,14 @@ public Spec getSpecWrapper(URI uri) {
     return responseMap;
   }

-  private boolean isCompileSuccessful(Map responseMap) {
+  public static boolean isCompileSuccessful(Map responseMap) {
     AddSpecResponse addSpecResponse = responseMap.getOrDefault(ServiceConfigKeys.GOBBLIN_SERVICE_JOB_SCHEDULER_LISTENER_CLASS, new AddSpecResponse<>(""));
-    return addSpecResponse != null
-        && addSpecResponse.getValue() != null
-        && !addSpecResponse.getValue().contains("ConfigException");
+    return isCompileSuccessful(addSpecResponse.getValue());
+  }
+
+  public static boolean isCompileSuccessful(String dag) {

Review comment:
Can we move this method into BaseFlowCompiler class and expose the method via compiler interface? It seems like a property of the compiler. Definitely, not that of FlowCatalog.

## File path: gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java ##
@@ -293,19 +294,27 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
       response = Arrays.toString(flowSpec.getCompilationErrors().toArray());
     }

-    if (!isExplain) {
+    boolean compileSuccess = FlowCatalog.isCompileSuccessful(response);
+
+    if (!isExplain && compileSuccess) {
       this.scheduledFlowSpecs.put(addedSpec.getUri().toString(), addedSpec);
       if (jobConfig.containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY)) {
         _log.info("{} Scheduling flow spec: {} ", this.serviceName, addedSpec);
         scheduleJob(jobConfig, null);
         if (PropertiesUtils.getPropAsBoolean(jobConfig, ConfigurationKeys.FLOW_RUN_IMMEDIATELY, "false")) {
           _log.info("RunImmediately requested, hence executing FlowSpec: " + addedSpec);
-          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), false, jobConfig, null));
+          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, false, jobConfig, null));
         }
       } else {
         _log.info("No FlowSpec schedule found, so running FlowSpec: " + addedSpec);
-        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), true, jobConfig, null));
+        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, true, jobConfig, null));
+      }
+    } else {
+      _log.info("Removing the flow spec: {}, since it is an EXPLAIN request or the flow compilation failed.", addedSpec);

Review comment:
Any chance we can distinguish the log message based on whether it is an explain query or compile failed? Will help debugging.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431907)
Time Spent: 50m (was: 40m)

> move spec store delete to gobblinservice job scheduler
> --
>
> Key: GOBBLIN-1144
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1144
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Arjun Singh Bora
> Priority: Major
> Time Spent: 50m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
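The second review comment asks for separate log messages for the EXPLAIN case and the compile-failure case. A minimal sketch of that idea follows. It assumes the string check is the same `ConfigException` test that the diff removes from `FlowCatalog`. The class `AddSpecOutcome` and its method `describe` are hypothetical names used only for illustration, not Gobblin code, and the choice to report a compile failure before an EXPLAIN request is also an assumption.

```java
/** Hypothetical stand-in for the scheduler branch discussed in the review; not Gobblin code. */
final class AddSpecOutcome {
  private AddSpecOutcome() {}

  /** Assumed check: a response that mentions ConfigException counts as a failed compilation. */
  static boolean isCompileSuccessful(String response) {
    return response != null && !response.contains("ConfigException");
  }

  /** Builds the log message, with one distinct message per reason for not scheduling. */
  static String describe(String specUri, boolean isExplain, String response) {
    if (!isCompileSuccessful(response)) {
      // Reported first so a failed EXPLAIN request is still visible as a failure.
      return "Removing the flow spec: " + specUri + ", since the flow compilation failed.";
    }
    if (isExplain) {
      return "Removing the flow spec: " + specUri + ", since it is an EXPLAIN request.";
    }
    return "Scheduling flow spec: " + specUri;
  }
}
```

With this shape, a failed compilation and an EXPLAIN request can be told apart by searching the log for either message.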
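The first review comment suggests exposing the compile check through the compiler interface instead of `FlowCatalog`. One way that could look is sketched below. `FlowCompilerSketch` and `EchoCompiler` are hypothetical and much smaller than the real Gobblin compiler types, and the check itself is assumed to be the `ConfigException` test from the removed lines.

```java
/** Hypothetical, trimmed-down compiler interface; the real Gobblin interface differs. */
interface FlowCompilerSketch {
  /** Compiles a flow and returns a textual response: a DAG description or an error list. */
  String compile(String flowSpec);

  /** Shared check, so callers ask the compiler rather than the catalog. */
  default boolean isCompileSuccessful(String response) {
    return response != null && !response.contains("ConfigException");
  }
}

/** Toy implementation used only to exercise the interface. */
final class EchoCompiler implements FlowCompilerSketch {
  @Override
  public String compile(String flowSpec) {
    return flowSpec.isEmpty() ? "[ConfigException: empty spec]" : "dag(" + flowSpec + ")";
  }
}
```

A default method keeps the check in one place while still letting a specific compiler override it if its error format differs.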
[jira] [Resolved] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hung Tran resolved GOBBLIN-1142.
Fix Version/s: 0.15.0
Resolution: Fixed

Issue resolved by pull request #2979
[https://github.com/apache/incubator-gobblin/pull/2979]

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Zhixiong Chen
> Assignee: Zhixiong Chen
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> The change adds support for filtering a specific type of tables, e.g. snapshot,
> partitioned, in `HiveDatasetFinder`

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (GOBBLIN-1144) move spec store delete to gobblinservice job scheduler
[ https://issues.apache.org/jira/browse/GOBBLIN-1144?focusedWorklogId=431902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431902 ]

ASF GitHub Bot logged work on GOBBLIN-1144:
---
Author: ASF GitHub Bot
Created on: 07/May/20 21:28
Start Date: 07/May/20 21:28
Worklog Time Spent: 10m
Work Description: enjoyear commented on pull request #2981:
URL: https://github.com/apache/incubator-gobblin/pull/2981#issuecomment-625506736

@arjun4084346 , LGTM. Please commit.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431902)
Time Spent: 40m (was: 0.5h)

> move spec store delete to gobblinservice job scheduler
> --
>
> Key: GOBBLIN-1144
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1144
> Project: Apache Gobblin
> Issue Type: Bug
> Reporter: Arjun Singh Bora
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal
[ https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431884 ]

ASF GitHub Bot logged work on GOBBLIN-1127:
---
Author: ASF GitHub Bot
Created on: 07/May/20 20:29
Start Date: 07/May/20 20:29
Worklog Time Spent: 10m
Work Description: sv2000 opened a new pull request #2967:
URL: https://github.com/apache/incubator-gobblin/pull/2967

…n failures fatal

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-1127

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):
This option allows GobblinTaskRunner to "fail-fast" on metric reporting instantiation failures. This is particularly useful in scenarios where pipeline monitoring depends on metrics and tracking events being emitted.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
Added unit tests.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431884)
Time Spent: 1h 20m (was: 1h 10m)

> Provide an option to make metric reporting instantiation failures fatal
> ---
>
> Key: GOBBLIN-1127
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1127
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-cluster
> Affects Versions: 0.15.0
> Reporter: Sudarshan Vasudevan
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> This option allows GobblinTaskRunner to "fail-fast" on metric reporting
> instantiation failures. This is particularly useful in scenarios where
> pipeline monitoring depends on metrics and tracking events being emitted.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
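The "fail-fast" behavior described in GOBBLIN-1127 can be illustrated with a small, self-contained sketch. Nothing below is taken from the PR: the property name `example.metrics.reporting.failure.fatal`, the class `MetricReportingStarter`, and the use of `Supplier<String>` as a stand-in for a reporter factory are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import java.util.function.Supplier;

/** Hypothetical sketch of optional fail-fast reporter startup; not the GobblinTaskRunner code. */
final class MetricReportingStarter {
  /** Hypothetical property name, not a real Gobblin configuration key. */
  static final String FAILURE_FATAL_KEY = "example.metrics.reporting.failure.fatal";

  private MetricReportingStarter() {}

  /**
   * Instantiates every reporter. A failure is rethrown when the flag is true;
   * otherwise it is logged and the remaining reporters are still started.
   */
  static List<String> start(Properties props, List<Supplier<String>> reporterFactories) {
    boolean failFast = Boolean.parseBoolean(props.getProperty(FAILURE_FATAL_KEY, "false"));
    List<String> started = new ArrayList<>();
    for (Supplier<String> factory : reporterFactories) {
      try {
        started.add(factory.get());
      } catch (RuntimeException e) {
        if (failFast) {
          throw new IllegalStateException("Metric reporter instantiation failed", e);
        }
        System.err.println("Skipping metric reporter: " + e.getMessage());
      }
    }
    return started;
  }
}
```

Defaulting the flag to false keeps the lenient behavior for existing jobs, so only pipelines whose monitoring depends on the metrics need to opt in.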
[GitHub] [incubator-gobblin] codecov-io edited a comment on pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables
codecov-io edited a comment on pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-624346150

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=h1) Report
> Merging [#2979](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/a63461257c3fcea8f4ff67087f8cb29be25d6baf=desc) will **increase** coverage by `0.07%`.
> The diff coverage is `66.66%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/graphs/tree.svg?width=650=150=pr=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2979      +/-   ##
+ Coverage     44.60%   44.68%   +0.07%
- Complexity     8980     9039      +59
  Files          1936     1944       +8
  Lines         73234    73748     +514
  Branches       8083     8143      +60
+ Hits          32669    32951     +282
- Misses        37515    37703     +188
- Partials       3050     3094      +44
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...ta/management/copy/predicates/TableTypeFilter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvcHJlZGljYXRlcy9UYWJsZVR5cGVGaWx0ZXIuamF2YQ==) | `53.84% <53.84%> (ø)` | `5.00 <5.00> (?)` | |
| [...n/data/management/copy/hive/HiveDatasetFinder.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvaGl2ZS9IaXZlRGF0YXNldEZpbmRlci5qYXZh) | `80.76% <100.00%> (+0.97%)` | `22.00 <0.00> (+1.00)` | |
| [...gobblin/azkaban/AzkabanGobblinYarnAppLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tYXprYWJhbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9hemthYmFuL0F6a2FiYW5Hb2JibGluWWFybkFwcExhdW5jaGVyLmphdmE=) | `0.00% <0.00%> (-30.56%)` | `0.00% <0.00%> (-2.00%)` | |
| [...compaction/verify/CompactionThresholdVerifier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vdmVyaWZ5L0NvbXBhY3Rpb25UaHJlc2hvbGRWZXJpZmllci5qYXZh) | `66.66% <0.00%> (-8.34%)` | `5.00% <0.00%> (ø%)` | |
| [...n/java/org/apache/gobblin/util/logs/LogCopier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvbG9ncy9Mb2dDb3BpZXIuamF2YQ==) | `61.53% <0.00%> (-8.03%)` | `18.00% <0.00%> (ø%)` | |
| [...a/org/apache/gobblin/service/RequesterService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1zZXJ2ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9SZXF1ZXN0ZXJTZXJ2aWNlLmphdmE=) | `92.30% <0.00%> (-7.70%)` | `4.00% <0.00%> (ø%)` | |
| [...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=) | `85.71% <0.00%> (-7.15%)` | `3.00% <0.00%> (ø%)` | |
| [.../gobblin/metrics/kafka/KafkaAvroEventReporter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL21ldHJpY3Mva2Fma2EvS2Fma2FBdnJvRXZlbnRSZXBvcnRlci5qYXZh) | `53.84% <0.00%> (-5.25%)` | `3.00% <0.00%> (ø%)` | |
| [...che/gobblin/yarn/YarnContainerSecurityManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkNvbnRhaW5lclNlY3VyaXR5TWFuYWdlci5qYXZh) | `58.62% <0.00%> (-5.02%)` | `6.00% <0.00%> (+1.00%)` | :arrow_down: |
| [.../org/apache/gobblin/yarn/GobblinYarnLogSource.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vR29iYmxpbllhcm5Mb2dTb3VyY2UuamF2YQ==) | `19.35% <0.00%> (-3.73%)` | `3.00% <0.00%> (ø%)` | |
| ... and [77 more](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree-more) | |
[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal
[ https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431846 ]

ASF GitHub Bot logged work on GOBBLIN-1127:
---
Author: ASF GitHub Bot
Created on: 07/May/20 18:26
Start Date: 07/May/20 18:26
Worklog Time Spent: 10m
Work Description: sv2000 opened a new pull request #2967:
URL: https://github.com/apache/incubator-gobblin/pull/2967

…n failures fatal

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-1127

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):
This option allows GobblinTaskRunner to "fail-fast" on metric reporting instantiation failures. This is particularly useful in scenarios where pipeline monitoring depends on metrics and tracking events being emitted.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
Added unit tests.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431846)
Time Spent: 1h 10m (was: 1h)

> Provide an option to make metric reporting instantiation failures fatal
> ---
>
> Key: GOBBLIN-1127
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1127
> Project: Apache Gobblin
> Issue Type: Improvement
> Components: gobblin-cluster
> Affects Versions: 0.15.0
> Reporter: Sudarshan Vasudevan
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> This option allows GobblinTaskRunner to "fail-fast" on metric reporting
> instantiation failures. This is particularly useful in scenarios where
> pipeline monitoring depends on metrics and tracking events being emitted.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431840 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---
Author: ASF GitHub Bot
Created on: 07/May/20 18:12
Start Date: 07/May/20 18:12
Worklog Time Spent: 10m
Work Description: arjun4084346 commented on pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-625415427

+1 LGTM

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431840)
Time Spent: 1h 10m (was: 1h)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Zhixiong Chen
> Assignee: Zhixiong Chen
> Priority: Major
> Time Spent: 1h 10m
> Remaining Estimate: 0h
>
> The change adds support for filtering a specific type of tables, e.g. snapshot,
> partitioned, in `HiveDatasetFinder`

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431839 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---
Author: ASF GitHub Bot
Created on: 07/May/20 18:11
Start Date: 07/May/20 18:11
Worklog Time Spent: 10m
Work Description: zxcware commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421697928

## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
Fixed.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 431839)
Time Spent: 1h (was: 50m)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
> Issue Type: Task
> Reporter: Zhixiong Chen
> Assignee: Zhixiong Chen
> Priority: Major
> Time Spent: 1h
> Remaining Estimate: 0h
>
> The change adds support for filtering a specific type of tables, e.g. snapshot,
> partitioned, in `HiveDatasetFinder`

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
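The quoted hunk stops at the closing brace of the `switch`, so what follows it, and what the "Fixed" reply changed, is not visible here. As a general Java note, a `switch` statement over an enum with no `default` branch can complete normally, so a method that returns only inside the cases needs a `return` or `throw` after it. The sketch below shows one way to write the same predicate with an explicit `default`. It is a stand-in under stated assumptions: a list of partition key names replaces the Hive `Table`, `java.util.function.Predicate` replaces the Guava `Predicate` used in the PR, and `PartitionKeyTypeFilter` is a hypothetical name.

```java
import java.util.List;
import java.util.Locale;
import java.util.Properties;
import java.util.function.Predicate;

/** Stand-in sketch: filters on a list of partition key names instead of a Hive Table. */
final class PartitionKeyTypeFilter implements Predicate<List<String>> {
  static final String FILTER_TYPE = "tableTypeFilter.type";

  private enum TableType { SNAPSHOT, PARTITIONED }

  private final TableType tableType;

  PartitionKeyTypeFilter(Properties props) {
    // Defaults to SNAPSHOT when the property is absent, as in the quoted constructor.
    this.tableType = TableType.valueOf(
        props.getProperty(FILTER_TYPE, TableType.SNAPSHOT.name()).toUpperCase(Locale.ROOT));
  }

  @Override
  public boolean test(List<String> partitionKeys) {
    boolean partitioned = partitionKeys != null && !partitionKeys.isEmpty();
    switch (tableType) {
      case SNAPSHOT:
        return !partitioned;
      case PARTITIONED:
        return partitioned;
      default:
        // Unreachable today; keeps the method compiling if a new type is added later.
        throw new IllegalStateException("Unhandled table type: " + tableType);
    }
  }
}
```

Throwing in the `default` branch, instead of returning `false`, makes a forgotten case fail loudly when a new table type is introduced.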
[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal
[ https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431833=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431833 ] ASF GitHub Bot logged work on GOBBLIN-1127: --- Author: ASF GitHub Bot Created on: 07/May/20 18:01 Start Date: 07/May/20 18:01 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2967: URL: https://github.com/apache/incubator-gobblin/pull/2967#issuecomment-625409335 There's a Travis failure that seems relevant; can you reproduce it locally with `./gradlew :gobblin-modules:gobblin-kafka-09:test`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 431833) Time Spent: 1h (was: 50m) > Provide an option to make metric reporting instantiation failures fatal > --- > > Key: GOBBLIN-1127 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1127 > Project: Apache Gobblin > Issue Type: Improvement > Components: gobblin-cluster >Affects Versions: 0.15.0 >Reporter: Sudarshan Vasudevan >Assignee: Hung Tran >Priority: Major > Fix For: 0.15.0 > > Time Spent: 1h > Remaining Estimate: 0h > > This option allows GobblinTaskRunner to "fail-fast" on metric reporting > instantiation failures. This is particularly useful in scenarios where > pipeline monitoring depends on metrics and tracking events being emitted. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [incubator-gobblin] autumnust commented on pull request #2967: GOBBLIN-1127: Provide an option to make metric reporting instantiatio…
autumnust commented on pull request #2967: URL: https://github.com/apache/incubator-gobblin/pull/2967#issuecomment-625409335 There's a Travis failure that seems relevant; can you reproduce it locally with `./gradlew :gobblin-modules:gobblin-kafka-09:test`? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431812=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431812 ] ASF GitHub Bot logged work on GOBBLIN-1142: --- Author: ASF GitHub Bot Created on: 07/May/20 16:29 Start Date: 07/May/20 16:29 Worklog Time Spent: 10m Work Description: arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely regarded as bad style. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 431812) Time Spent: 50m (was: 40m) > Hive Distcp support filter on partitioned or snapshot table > --- > > Key: GOBBLIN-1142 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1142 > Project: Apache Gobblin > Issue Type: Task >Reporter: Zhixiong Chen >Assignee: Zhixiong Chen >Priority: Major > Time Spent: 50m > Remaining Estimate: 0h > > The change adds support for filtering tables of a specific type, e.g. snapshot or > partitioned, in `HiveDatasetFinder` -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables
arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely regarded as bad style. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
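For reference, the concern in the review comment above can be sketched as follows. This is a minimal illustration, not the change that was merged in the pull request: the class `TableTypeSwitchSketch`, the enum `TableType`, and the method `matches` are hypothetical names, and the method takes the partition keys as a plain list so the sketch does not depend on the Hive metastore `Table` class. The `default` branch throws, so an enum constant added later without a matching `case` fails loudly instead of falling through silently.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the filter under review; it receives the partition
// keys directly instead of a Hive Table so the example is self-contained.
class TableTypeSwitchSketch {

  // Hypothetical enum mirroring the two table types named in the diff.
  enum TableType { SNAPSHOT, PARTITIONED }

  static boolean matches(TableType tableType, List<String> partitionKeys) {
    boolean hasPartitions = partitionKeys != null && !partitionKeys.isEmpty();
    switch (tableType) {
      case SNAPSHOT:
        // A snapshot table is one without partition keys.
        return !hasPartitions;
      case PARTITIONED:
        return hasPartitions;
      default:
        // Reached only if a new enum constant is added without a matching case.
        throw new IllegalStateException("Unhandled table type: " + tableType);
    }
  }

  public static void main(String[] args) {
    System.out.println(matches(TableType.SNAPSHOT, Collections.<String>emptyList()));
    System.out.println(matches(TableType.PARTITIONED, Arrays.asList("datepartition")));
    System.out.println(matches(TableType.PARTITIONED, null));
  }
}
```

Throwing in the `default` branch is one common choice; returning `false` there would also satisfy a missing-default check, at the cost of hiding an unhandled type.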
[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431580=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431580 ] ASF GitHub Bot logged work on GOBBLIN-1142: --- Author: ASF GitHub Bot Created on: 07/May/20 06:33 Start Date: 07/May/20 06:33 Worklog Time Spent: 10m Work Description: arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely considered poor style. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 431580) Time Spent: 40m (was: 0.5h) > Hive Distcp support filter on partitioned or snapshot table > --- > > Key: GOBBLIN-1142 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1142 > Project: Apache Gobblin > Issue Type: Task >Reporter: Zhixiong Chen >Assignee: Zhixiong Chen >Priority: Major > Time Spent: 40m > Remaining Estimate: 0h > > The change adds support for filtering tables of a specific type, e.g. snapshot or > partitioned, in `HiveDatasetFinder` -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table
[ https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431576=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431576 ] ASF GitHub Bot logged work on GOBBLIN-1142: --- Author: ASF GitHub Bot Created on: 07/May/20 06:32 Start Date: 07/May/20 06:32 Worklog Time Spent: 10m Work Description: arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely discouraged. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 431576) Time Spent: 0.5h (was: 20m) > Hive Distcp support filter on partitioned or snapshot table > --- > > Key: GOBBLIN-1142 > URL: https://issues.apache.org/jira/browse/GOBBLIN-1142 > Project: Apache Gobblin > Issue Type: Task >Reporter: Zhixiong Chen >Assignee: Zhixiong Chen >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > > The change adds support for filtering tables of a specific type, e.g. snapshot or > partitioned, in `HiveDatasetFinder` -- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables
arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely considered poor style. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables
arjun4084346 commented on a change in pull request #2979: URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048 ## File path: gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java ## @@ -0,0 +1,65 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.gobblin.data.management.copy.predicates; + +import java.util.Properties; + +import org.apache.hadoop.hive.metastore.api.Table; + +import com.google.common.base.Predicate; + +import javax.annotation.Nullable; + + +/** + * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE} + * + * Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER} + */ +public class TableTypeFilter implements Predicate { + + public static final String FILTER_TYPE = "tableTypeFilter.type"; + + private enum TABLE_TYPE { +SNAPSHOT, +PARTITIONED + } + + private final TABLE_TYPE tableType; + + public TableTypeFilter(Properties props) { +tableType = TABLE_TYPE.valueOf( +props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase()); + } + + @Override + public boolean apply(@Nullable Table input) { +if (input == null) { + return false; +} + +switch (tableType) { + case SNAPSHOT: +return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0; + case PARTITIONED: +return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0; +} Review comment: I was not comfortable with this, so I googled it and found that a `switch-case` without a `default` is widely discouraged. https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org