Re: Suitable Name Search is still Open

2020-05-07 Thread Abhishek Tiwari
Thanks for the reminder. Yes, we plan to take it up soon.

Abhishek

On Thu, May 7, 2020 at 3:58 PM Dave Fisher wrote:

> Hi Gobblin Devs-
>
> Your Suitable Name Search is still open.
> https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-131
>
> Please look into providing the necessary trademark searches.
>
> Regards,
> Dave
>


Suitable Name Search is still Open

2020-05-07 Thread Dave Fisher
Hi Gobblin Devs-

Your Suitable Name Search is still open. 
https://issues.apache.org/jira/browse/PODLINGNAMESEARCH-131

Please look into providing the necessary trademark searches.

Regards,
Dave


[jira] [Work logged] (GOBBLIN-1144) move spec store delete to gobblinservice job scheduler

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1144?focusedWorklogId=431907&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431907
 ]

ASF GitHub Bot logged work on GOBBLIN-1144:
---

Author: ASF GitHub Bot
Created on: 07/May/20 21:48
Start Date: 07/May/20 21:48
Worklog Time Spent: 10m 
  Work Description: sv2000 commented on a change in pull request #2981:
URL: https://github.com/apache/incubator-gobblin/pull/2981#discussion_r421813658



##
File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_catalog/FlowCatalog.java
##
@@ -325,12 +325,14 @@ public Spec getSpecWrapper(URI uri) {
     return responseMap;
   }
 
-  private boolean isCompileSuccessful(Map responseMap) {
+  public static boolean isCompileSuccessful(Map responseMap) {
     AddSpecResponse addSpecResponse = responseMap.getOrDefault(ServiceConfigKeys.GOBBLIN_SERVICE_JOB_SCHEDULER_LISTENER_CLASS,
         new AddSpecResponse<>(""));
 
-    return addSpecResponse != null
-        && addSpecResponse.getValue() != null
-        && !addSpecResponse.getValue().contains("ConfigException");
+    return isCompileSuccessful(addSpecResponse.getValue());
+  }
+
+  public static boolean isCompileSuccessful(String dag) {

Review comment:
   Can we move this method into the BaseFlowCompiler class and expose it via
the compiler interface? It seems like a property of the compiler, and
definitely not of FlowCatalog.
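
For illustration only, the suggestion in this review comment could look roughly like the sketch below. It is not code from the pull request: `FlowCompilerSketch` and `BaseFlowCompilerSketch` are hypothetical stand-ins for the compiler interface and `BaseFlowCompiler`, and the body of the check is assumed to keep the null and "ConfigException" tests from the removed lines of the diff.

```java
/** Hypothetical stand-in for the flow compiler interface; not Gobblin's actual API. */
interface FlowCompilerSketch {
  /** Marker substring that the removed FlowCatalog check looked for. */
  String CONFIG_EXCEPTION_MARKER = "ConfigException";

  /**
   * Assumed semantics: compilation succeeded when the compiled DAG string
   * is present and does not carry the config error marker.
   */
  default boolean isCompileSuccessful(String dag) {
    return dag != null && !dag.contains(CONFIG_EXCEPTION_MARKER);
  }
}

/** Hypothetical stand-in for BaseFlowCompiler; it inherits the default check. */
class BaseFlowCompilerSketch implements FlowCompilerSketch {
}

class CompileCheckDemo {
  public static void main(String[] args) {
    FlowCompilerSketch compiler = new BaseFlowCompilerSketch();
    System.out.println(compiler.isCompileSuccessful("[job1 -> job2]"));               // true
    System.out.println(compiler.isCompileSuccessful("ConfigException: missing key")); // false
  }
}
```

With the check owned by the compiler, callers such as the catalog or the scheduler would ask the compiler instance rather than calling a static method on FlowCatalog.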

##
File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java
##
@@ -293,19 +294,27 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
       response = Arrays.toString(flowSpec.getCompilationErrors().toArray());
     }
 
-    if (!isExplain) {
+    boolean compileSuccess = FlowCatalog.isCompileSuccessful(response);
+
+    if (!isExplain && compileSuccess) {
       this.scheduledFlowSpecs.put(addedSpec.getUri().toString(), addedSpec);
 
       if (jobConfig.containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY)) {
         _log.info("{} Scheduling flow spec: {} ", this.serviceName, addedSpec);
         scheduleJob(jobConfig, null);
         if (PropertiesUtils.getPropAsBoolean(jobConfig, ConfigurationKeys.FLOW_RUN_IMMEDIATELY, "false")) {
           _log.info("RunImmediately requested, hence executing FlowSpec: " + addedSpec);
-          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), false, jobConfig, null));
+          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, false, jobConfig, null));
         }
       } else {
         _log.info("No FlowSpec schedule found, so running FlowSpec: " + addedSpec);
-        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), true, jobConfig, null));
+        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, true, jobConfig, null));
+      }
+    } else {
+      _log.info("Removing the flow spec: {}, since it is an EXPLAIN request or the flow compilation failed.", addedSpec);

Review comment:
   Any chance we can distinguish the log message based on whether it is an
explain query or a compile failure? It will help with debugging.
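
As a rough sketch of what this comment asks for, the single combined log line could be split by cause. The helper below is hypothetical (`RemovalReasonSketch` and `removalMessage` do not appear in the pull request) and only builds the message text. It assumes that an EXPLAIN request is reported first when both conditions hold.

```java
/** Hypothetical helper; illustrates separate log messages per removal reason. */
class RemovalReasonSketch {

  /**
   * Returns the log message for a flow spec that is being removed.
   * An EXPLAIN request takes precedence over a compilation failure (an assumption).
   */
  static String removalMessage(String specUri, boolean isExplain, boolean compileSuccess) {
    if (isExplain) {
      return "Removing the flow spec: " + specUri + ", since it is an EXPLAIN request.";
    }
    if (!compileSuccess) {
      return "Removing the flow spec: " + specUri + ", since the flow compilation failed.";
    }
    // Reaching this point means the spec should have been scheduled instead.
    throw new IllegalStateException("Flow spec " + specUri + " should be scheduled, not removed");
  }

  public static void main(String[] args) {
    System.out.println(removalMessage("flow1", true, true));
    System.out.println(removalMessage("flow1", false, false));
  }
}
```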





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 431907)
Time Spent: 50m  (was: 40m)

> move spec store delete to gobblinservice job scheduler
> --
>
> Key: GOBBLIN-1144
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1144
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] sv2000 commented on a change in pull request #2981: [GOBBLIN-1144] remove specs from gobblin service job scheduler

2020-05-07 Thread GitBox


sv2000 commented on a change in pull request #2981:
URL: https://github.com/apache/incubator-gobblin/pull/2981#discussion_r421813658



##
File path: 
gobblin-runtime/src/main/java/org/apache/gobblin/runtime/spec_catalog/FlowCatalog.java
##
@@ -325,12 +325,14 @@ public Spec getSpecWrapper(URI uri) {
     return responseMap;
   }
 
-  private boolean isCompileSuccessful(Map responseMap) {
+  public static boolean isCompileSuccessful(Map responseMap) {
     AddSpecResponse addSpecResponse = responseMap.getOrDefault(ServiceConfigKeys.GOBBLIN_SERVICE_JOB_SCHEDULER_LISTENER_CLASS,
         new AddSpecResponse<>(""));
 
-    return addSpecResponse != null
-        && addSpecResponse.getValue() != null
-        && !addSpecResponse.getValue().contains("ConfigException");
+    return isCompileSuccessful(addSpecResponse.getValue());
+  }
+
+  public static boolean isCompileSuccessful(String dag) {

Review comment:
   Can we move this method into the BaseFlowCompiler class and expose it via
the compiler interface? It seems like a property of the compiler, and
definitely not of FlowCatalog.

##
File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/scheduler/GobblinServiceJobScheduler.java
##
@@ -293,19 +294,27 @@ public AddSpecResponse onAddSpec(Spec addedSpec) {
       response = Arrays.toString(flowSpec.getCompilationErrors().toArray());
     }
 
-    if (!isExplain) {
+    boolean compileSuccess = FlowCatalog.isCompileSuccessful(response);
+
+    if (!isExplain && compileSuccess) {
       this.scheduledFlowSpecs.put(addedSpec.getUri().toString(), addedSpec);
 
       if (jobConfig.containsKey(ConfigurationKeys.JOB_SCHEDULE_KEY)) {
         _log.info("{} Scheduling flow spec: {} ", this.serviceName, addedSpec);
         scheduleJob(jobConfig, null);
         if (PropertiesUtils.getPropAsBoolean(jobConfig, ConfigurationKeys.FLOW_RUN_IMMEDIATELY, "false")) {
           _log.info("RunImmediately requested, hence executing FlowSpec: " + addedSpec);
-          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), false, jobConfig, null));
+          this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, false, jobConfig, null));
         }
       } else {
         _log.info("No FlowSpec schedule found, so running FlowSpec: " + addedSpec);
-        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpec.getUri(), true, jobConfig, null));
+        this.jobExecutor.execute(new NonScheduledJobRunner(flowSpecUri, true, jobConfig, null));
+      }
+    } else {
+      _log.info("Removing the flow spec: {}, since it is an EXPLAIN request or the flow compilation failed.", addedSpec);

Review comment:
   Any chance we can distinguish the log message based on whether it is an
explain query or a compile failure? It will help with debugging.









[jira] [Resolved] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread Hung Tran (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran resolved GOBBLIN-1142.

Fix Version/s: 0.15.0
   Resolution: Fixed

Issue resolved by pull request #2979
[https://github.com/apache/incubator-gobblin/pull/2979]

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g.
> snapshot or partitioned, in `HiveDatasetFinder`.
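
The filtering rule described here can be sketched in a self-contained form. The code below is an illustration, not the `TableTypeFilter` class from the pull request: `TableStub` is a hypothetical stand-in for the Hive metastore `Table`, `java.util.function.Predicate` is swapped in for Guava's `Predicate`, and the assumed rule is that a snapshot table has no partition keys while a partitioned table has at least one. The property key `tableTypeFilter.type` and its SNAPSHOT default are taken from the diff under review.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;
import java.util.Properties;
import java.util.function.Predicate;

/** Hypothetical stand-in for the Hive metastore Table; only partition keys matter here. */
class TableStub {
  private final List<String> partitionKeys;

  TableStub(List<String> partitionKeys) {
    this.partitionKeys = partitionKeys;
  }

  List<String> getPartitionKeys() {
    return partitionKeys;
  }
}

/** Sketch of a table-type predicate configured through a property. */
class TableTypeFilterSketch implements Predicate<TableStub> {
  static final String FILTER_TYPE = "tableTypeFilter.type";

  private enum TableType { SNAPSHOT, PARTITIONED }

  private final TableType tableType;

  TableTypeFilterSketch(Properties props) {
    // Defaults to SNAPSHOT when the property is absent.
    String configured = props.getProperty(FILTER_TYPE, TableType.SNAPSHOT.name());
    this.tableType = TableType.valueOf(configured.toUpperCase(Locale.ROOT));
  }

  @Override
  public boolean test(TableStub input) {
    if (input == null) {
      return false;
    }
    List<String> keys = input.getPartitionKeys();
    boolean hasPartitionKeys = keys != null && !keys.isEmpty();
    // PARTITIONED accepts tables with keys; SNAPSHOT accepts tables without.
    return tableType == TableType.PARTITIONED ? hasPartitionKeys : !hasPartitionKeys;
  }
}

class TableTypeFilterDemo {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(TableTypeFilterSketch.FILTER_TYPE, "partitioned");
    TableTypeFilterSketch filter = new TableTypeFilterSketch(props);
    System.out.println(filter.test(new TableStub(Arrays.asList("datepartition")))); // true
    System.out.println(filter.test(new TableStub(Collections.<String>emptyList()))); // false
  }
}
```

Returning a single boolean expression instead of a `switch` avoids a code path that falls out of the method without a return value.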





[jira] [Work logged] (GOBBLIN-1144) move spec store delete to gobblinservice job scheduler

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1144?focusedWorklogId=431902&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431902
 ]

ASF GitHub Bot logged work on GOBBLIN-1144:
---

Author: ASF GitHub Bot
Created on: 07/May/20 21:28
Start Date: 07/May/20 21:28
Worklog Time Spent: 10m 
  Work Description: enjoyear commented on pull request #2981:
URL: 
https://github.com/apache/incubator-gobblin/pull/2981#issuecomment-625506736


   @arjun4084346 , LGTM. Please commit.





Issue Time Tracking
---

Worklog Id: (was: 431902)
Time Spent: 40m  (was: 0.5h)

> move spec store delete to gobblinservice job scheduler
> --
>
> Key: GOBBLIN-1144
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1144
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Arjun Singh Bora
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[GitHub] [incubator-gobblin] enjoyear commented on pull request #2981: [GOBBLIN-1144] remove specs from gobblin service job scheduler

2020-05-07 Thread GitBox


enjoyear commented on pull request #2981:
URL: 
https://github.com/apache/incubator-gobblin/pull/2981#issuecomment-625506736


   @arjun4084346 , LGTM. Please commit.







[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431884&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431884
 ]

ASF GitHub Bot logged work on GOBBLIN-1127:
---

Author: ASF GitHub Bot
Created on: 07/May/20 20:29
Start Date: 07/May/20 20:29
Worklog Time Spent: 10m 
  Work Description: sv2000 opened a new pull request #2967:
URL: https://github.com/apache/incubator-gobblin/pull/2967


   …n failures fatal
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-1127
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   This option allows GobblinTaskRunner to "fail-fast" on metric reporting 
instantiation failures. This is particularly useful in scenarios where pipeline 
monitoring depends on metrics and tracking events being emitted.
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Added unit tests.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   





Issue Time Tracking
---

Worklog Id: (was: 431884)
Time Spent: 1h 20m  (was: 1h 10m)

> Provide an option to make metric reporting instantiation failures fatal
> ---
>
> Key: GOBBLIN-1127
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1127
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> This option allows GobblinTaskRunner to "fail-fast" on metric reporting 
> instantiation failures. This is particularly useful in scenarios where 
> pipeline monitoring depends on metrics and tracking events being emitted.
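
A minimal sketch of the fail-fast behaviour described above, under stated assumptions: the configuration key `example.metrics.reporting.failure.fatal`, the `ReporterStarter` interface, and `startReporting` are all hypothetical names chosen for illustration. The actual key and the code path in GobblinTaskRunner are not shown in this thread.

```java
import java.util.Properties;

/** Hypothetical illustration of an opt-in fail-fast switch for metric reporting. */
class MetricReportingSketch {
  // Hypothetical key; the real Gobblin configuration key is not given in this thread.
  static final String FAILURE_FATAL_KEY = "example.metrics.reporting.failure.fatal";

  /** Stand-in for whatever instantiates and starts the metric reporters. */
  interface ReporterStarter {
    void start() throws Exception;
  }

  /**
   * Returns true if reporting started, false if a failure was tolerated.
   * Throws IllegalStateException when the failure is configured to be fatal.
   */
  static boolean startReporting(Properties props, ReporterStarter starter) {
    boolean failureFatal = Boolean.parseBoolean(props.getProperty(FAILURE_FATAL_KEY, "false"));
    try {
      starter.start();
      return true;
    } catch (Exception e) {
      if (failureFatal) {
        // Fail fast: the caller is expected to abort startup.
        throw new IllegalStateException("Metric reporting failed to start", e);
      }
      System.err.println("Metric reporting failed to start, continuing: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    Properties lenient = new Properties();
    System.out.println(startReporting(lenient, () -> { throw new Exception("bad reporter"); })); // false
  }
}
```

Defaulting the switch to "false" keeps the existing tolerant behaviour unless a pipeline opts in, which matters when monitoring depends on the emitted metrics and tracking events.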





[GitHub] [incubator-gobblin] codecov-io edited a comment on pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


codecov-io edited a comment on pull request #2979:
URL: 
https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-624346150


   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=h1) Report
   > Merging [#2979](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/a63461257c3fcea8f4ff67087f8cb29be25d6baf&el=desc) will **increase** coverage by `0.07%`.
   > The diff coverage is `66.66%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2979      +/-   ##
   ============================================
   + Coverage     44.60%   44.68%   +0.07%
   - Complexity     8980     9039      +59
   ============================================
     Files          1936     1944       +8
     Lines         73234    73748     +514
     Branches       8083     8143      +60
   ============================================
   + Hits          32669    32951     +282
   - Misses        37515    37703     +188
   - Partials       3050     3094      +44
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...ta/management/copy/predicates/TableTypeFilter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvcHJlZGljYXRlcy9UYWJsZVR5cGVGaWx0ZXIuamF2YQ==)
 | `53.84% <53.84%> (ø)` | `5.00 <5.00> (?)` | |
   | 
[...n/data/management/copy/hive/HiveDatasetFinder.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvaGl2ZS9IaXZlRGF0YXNldEZpbmRlci5qYXZh)
 | `80.76% <100.00%> (+0.97%)` | `22.00 <0.00> (+1.00)` | |
   | 
[...gobblin/azkaban/AzkabanGobblinYarnAppLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tYXprYWJhbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9hemthYmFuL0F6a2FiYW5Hb2JibGluWWFybkFwcExhdW5jaGVyLmphdmE=)
 | `0.00% <0.00%> (-30.56%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...compaction/verify/CompactionThresholdVerifier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vdmVyaWZ5L0NvbXBhY3Rpb25UaHJlc2hvbGRWZXJpZmllci5qYXZh)
 | `66.66% <0.00%> (-8.34%)` | `5.00% <0.00%> (ø%)` | |
   | 
[...n/java/org/apache/gobblin/util/logs/LogCopier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvbG9ncy9Mb2dDb3BpZXIuamF2YQ==)
 | `61.53% <0.00%> (-8.03%)` | `18.00% <0.00%> (ø%)` | |
   | 
[...a/org/apache/gobblin/service/RequesterService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1zZXJ2ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9SZXF1ZXN0ZXJTZXJ2aWNlLmphdmE=)
 | `92.30% <0.00%> (-7.70%)` | `4.00% <0.00%> (ø%)` | |
   | 
[...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=)
 | `85.71% <0.00%> (-7.15%)` | `3.00% <0.00%> (ø%)` | |
   | 
[.../gobblin/metrics/kafka/KafkaAvroEventReporter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL21ldHJpY3Mva2Fma2EvS2Fma2FBdnJvRXZlbnRSZXBvcnRlci5qYXZh)
 | `53.84% <0.00%> (-5.25%)` | `3.00% <0.00%> (ø%)` | |
   | 
[...che/gobblin/yarn/YarnContainerSecurityManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkNvbnRhaW5lclNlY3VyaXR5TWFuYWdlci5qYXZh)
 | `58.62% <0.00%> (-5.02%)` | `6.00% <0.00%> (+1.00%)` | :arrow_down: |
   | 
[.../org/apache/gobblin/yarn/GobblinYarnLogSource.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vR29iYmxpbllhcm5Mb2dTb3VyY2UuamF2YQ==)
 | `19.35% <0.00%> (-3.73%)` | `3.00% <0.00%> (ø%)` | |
   | ... and [77 
more](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree-more)
 | |
   
   

[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431859&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431859
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 19:00
Start Date: 07/May/20 19:00
Worklog Time Spent: 10m 
  Work Description: codecov-io edited a comment on pull request #2979:
URL: 
https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-624346150


   # [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=h1) Report
   > Merging [#2979](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/a63461257c3fcea8f4ff67087f8cb29be25d6baf&el=desc) will **increase** coverage by `0.07%`.
   > The diff coverage is `66.66%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/graphs/tree.svg?width=650&height=150&src=pr&token=4MgURJ0bGc)](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff              @@
   ##             master    #2979      +/-   ##
   ============================================
   + Coverage     44.60%   44.68%   +0.07%
   - Complexity     8980     9039      +59
   ============================================
     Files          1936     1944       +8
     Lines         73234    73748     +514
     Branches       8083     8143      +60
   ============================================
   + Hits          32669    32951     +282
   - Misses        37515    37703     +188
   - Partials       3050     3094      +44
   ```
   
   
   | [Impacted 
Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2979?src=pr=tree) 
| Coverage Δ | Complexity Δ | |
   |---|---|---|---|
   | 
[...ta/management/copy/predicates/TableTypeFilter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvcHJlZGljYXRlcy9UYWJsZVR5cGVGaWx0ZXIuamF2YQ==)
 | `53.84% <53.84%> (ø)` | `5.00 <5.00> (?)` | |
   | 
[...n/data/management/copy/hive/HiveDatasetFinder.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1kYXRhLW1hbmFnZW1lbnQvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vZGF0YS9tYW5hZ2VtZW50L2NvcHkvaGl2ZS9IaXZlRGF0YXNldEZpbmRlci5qYXZh)
 | `80.76% <100.00%> (+0.97%)` | `22.00 <0.00> (+1.00)` | |
   | 
[...gobblin/azkaban/AzkabanGobblinYarnAppLauncher.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4tYXprYWJhbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9hemthYmFuL0F6a2FiYW5Hb2JibGluWWFybkFwcExhdW5jaGVyLmphdmE=)
 | `0.00% <0.00%> (-30.56%)` | `0.00% <0.00%> (-2.00%)` | |
   | 
[...compaction/verify/CompactionThresholdVerifier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1jb21wYWN0aW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbXBhY3Rpb24vdmVyaWZ5L0NvbXBhY3Rpb25UaHJlc2hvbGRWZXJpZmllci5qYXZh)
 | `66.66% <0.00%> (-8.34%)` | `5.00% <0.00%> (ø%)` | |
   | 
[...n/java/org/apache/gobblin/util/logs/LogCopier.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvbG9ncy9Mb2dDb3BpZXIuamF2YQ==)
 | `61.53% <0.00%> (-8.03%)` | `18.00% <0.00%> (ø%)` | |
   | 
[...a/org/apache/gobblin/service/RequesterService.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi1mbG93LWNvbmZpZy1zZXJ2aWNlL2dvYmJsaW4tZmxvdy1jb25maWctc2VydmljZS1zZXJ2ZXIvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vc2VydmljZS9SZXF1ZXN0ZXJTZXJ2aWNlLmphdmE=)
 | `92.30% <0.00%> (-7.70%)` | `4.00% <0.00%> (ø%)` | |
   | 
[...lin/util/filesystem/FileSystemInstrumentation.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3V0aWwvZmlsZXN5c3RlbS9GaWxlU3lzdGVtSW5zdHJ1bWVudGF0aW9uLmphdmE=)
 | `85.71% <0.00%> (-7.15%)` | `3.00% <0.00%> (ø%)` | |
   | 
[.../gobblin/metrics/kafka/KafkaAvroEventReporter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi1tb2R1bGVzL2dvYmJsaW4ta2Fma2EtY29tbW9uL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL21ldHJpY3Mva2Fma2EvS2Fma2FBdnJvRXZlbnRSZXBvcnRlci5qYXZh)
 | `53.84% <0.00%> (-5.25%)` | `3.00% <0.00%> (ø%)` | |
   | 
[...che/gobblin/yarn/YarnContainerSecurityManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2979/diff?src=pr=tree#diff-Z29iYmxpbi15YXJuL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3lhcm4vWWFybkNvbnRhaW5lclNlY3VyaXR5TWFuYWdlci5qYXZh)
 | `58.62% <0.00%> (-5.02%)` | `6.00% <0.00%> (+1.00%)` | 

[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431846&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431846
 ]

ASF GitHub Bot logged work on GOBBLIN-1127:
---

Author: ASF GitHub Bot
Created on: 07/May/20 18:26
Start Date: 07/May/20 18:26
Worklog Time Spent: 10m 
  Work Description: sv2000 opened a new pull request #2967:
URL: https://github.com/apache/incubator-gobblin/pull/2967


   …n failures fatal
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [x] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-1127
   
   
   ### Description
   - [x] Here are some details about my PR, including screenshots (if 
applicable):
   This option allows GobblinTaskRunner to "fail-fast" on metric reporting 
instantiation failures. This is particularly useful in scenarios where pipeline 
monitoring depends on metrics and tracking events being emitted.
   
   
   ### Tests
   - [x] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Added unit tests.
   
   ### Commits
   - [x] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   





Issue Time Tracking
---

Worklog Id: (was: 431846)
Time Spent: 1h 10m  (was: 1h)

> Provide an option to make metric reporting instantiation failures fatal
> ---
>
> Key: GOBBLIN-1127
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1127
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> This option allows GobblinTaskRunner to "fail-fast" on metric reporting 
> instantiation failures. This is particularly useful in scenarios where 
> pipeline monitoring depends on metrics and tracking events being emitted.





[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431840&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431840
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 18:12
Start Date: 07/May/20 18:12
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on pull request #2979:
URL: 
https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-625415427


   +1 LGTM





Issue Time Tracking
---

Worklog Id: (was: 431840)
Time Spent: 1h 10m  (was: 1h)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g.
> snapshot or partitioned, in `HiveDatasetFinder`.





[GitHub] [incubator-gobblin] arjun4084346 commented on pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


arjun4084346 commented on pull request #2979:
URL: 
https://github.com/apache/incubator-gobblin/pull/2979#issuecomment-625415427


   +1 LGTM







[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431839&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431839
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 18:11
Start Date: 07/May/20 18:11
Worklog Time Spent: 10m 
  Work Description: zxcware commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421697928



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   Fixed.







Issue Time Tracking
---

Worklog Id: (was: 431839)
Time Spent: 1h  (was: 50m)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g.
> snapshot or partitioned, in `HiveDatasetFinder`.





[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


zxcware commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421697928



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   Fixed.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Work logged] (GOBBLIN-1127) Provide an option to make metric reporting instantiation failures fatal

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1127?focusedWorklogId=431833&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431833
 ]

ASF GitHub Bot logged work on GOBBLIN-1127:
---

Author: ASF GitHub Bot
Created on: 07/May/20 18:01
Start Date: 07/May/20 18:01
Worklog Time Spent: 10m 
  Work Description: autumnust commented on pull request #2967:
URL: 
https://github.com/apache/incubator-gobblin/pull/2967#issuecomment-625409335


   There's a Travis failure that seems relevant; can you reproduce it locally with 
`./gradlew :gobblin-modules:gobblin-kafka-09:test`?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 431833)
Time Spent: 1h  (was: 50m)

> Provide an option to make metric reporting instantiation failures fatal
> ---
>
> Key: GOBBLIN-1127
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1127
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-cluster
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> This option allows GobblinTaskRunner to "fail-fast" on metric reporting 
> instantiation failures. This is particularly useful in scenarios where 
> pipeline monitoring depends on metrics and tracking events being emitted.
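
As a rough illustration of the "fail-fast" behaviour described above, here is a minimal sketch. It does not use Gobblin's actual classes or configuration keys: the class `FailFastMetricsSketch`, the `startReporting` helper and the `failOnInstantiationFailure` flag are all hypothetical names chosen for this example.

```java
// Hypothetical sketch; none of these names come from Gobblin itself.
class FailFastMetricsSketch {

  // Runs the reporter initialisation. Returns true if reporting started.
  // On failure it either rethrows (fail-fast) or logs and returns false,
  // depending on the hypothetical failOnInstantiationFailure flag.
  static boolean startReporting(Runnable reporterInit, boolean failOnInstantiationFailure) {
    try {
      reporterInit.run();
      return true;
    } catch (RuntimeException e) {
      if (failOnInstantiationFailure) {
        // Fail fast: surface the problem instead of running without metrics.
        throw new IllegalStateException("Metric reporting could not be started", e);
      }
      System.err.println("Continuing without metric reporting: " + e.getMessage());
      return false;
    }
  }

  public static void main(String[] args) {
    Runnable broken = () -> { throw new RuntimeException("reporter class not found"); };
    // Tolerant mode: the failure is logged and the caller keeps running.
    System.out.println(startReporting(broken, false));
  }
}
```

With the flag enabled, the same broken initialisation would abort startup, which is the behaviour a pipeline that depends on emitted metrics would presumably want.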



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] autumnust commented on pull request #2967: GOBBLIN-1127: Provide an option to make metric reporting instantiatio…

2020-05-07 Thread GitBox


autumnust commented on pull request #2967:
URL: 
https://github.com/apache/incubator-gobblin/pull/2967#issuecomment-625409335


   There's a Travis failure that seems relevant; can you reproduce it locally with 
`./gradlew :gobblin-modules:gobblin-kafka-09:test`?



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431812&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431812
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 16:29
Start Date: 07/May/20 16:29
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on a change in pull request 
#2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely accepted as not being good style.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch
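
To make the suggestion concrete, a minimal sketch of the same switch with a `default` branch. This is not the code that was eventually merged: `TableTypeSwitchSketch` and `matches` are hypothetical names, and the partition keys are passed in as a plain list so the example runs without Hive on the classpath.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for the filter under review.
class TableTypeSwitchSketch {

  enum TableType { SNAPSHOT, PARTITIONED }

  static boolean matches(TableType type, List<String> partitionKeys) {
    boolean hasPartitions = partitionKeys != null && !partitionKeys.isEmpty();
    switch (type) {
      case SNAPSHOT:
        return !hasPartitions;
      case PARTITIONED:
        return hasPartitions;
      default:
        // Reached only if a new enum constant is added without a case above.
        throw new IllegalStateException("Unhandled table type: " + type);
    }
  }

  public static void main(String[] args) {
    System.out.println(matches(TableType.SNAPSHOT, Collections.emptyList()));
    System.out.println(matches(TableType.PARTITIONED, Arrays.asList("datepartition")));
    System.out.println(matches(TableType.PARTITIONED, null));
  }
}
```

Throwing in the `default` branch is one possible choice; it also lets the method compile without a trailing `return` after the switch.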





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 431812)
Time Spent: 50m  (was: 40m)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g. snapshot or 
> partitioned, in `HiveDatasetFinder`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


arjun4084346 commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely accepted as not being good style.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431580&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431580
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 06:33
Start Date: 07/May/20 06:33
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on a change in pull request 
#2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely considered not good style.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 431580)
Time Spent: 40m  (was: 0.5h)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g. snapshot or 
> partitioned, in `HiveDatasetFinder`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (GOBBLIN-1142) Hive Distcp support filter on partitioned or snapshot table

2020-05-07 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-1142?focusedWorklogId=431576&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-431576
 ]

ASF GitHub Bot logged work on GOBBLIN-1142:
---

Author: ASF GitHub Bot
Created on: 07/May/20 06:32
Start Date: 07/May/20 06:32
Worklog Time Spent: 10m 
  Work Description: arjun4084346 commented on a change in pull request 
#2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely discouraged.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 431576)
Time Spent: 0.5h  (was: 20m)

> Hive Distcp support filter on partitioned or snapshot table
> ---
>
> Key: GOBBLIN-1142
> URL: https://issues.apache.org/jira/browse/GOBBLIN-1142
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Zhixiong Chen
>Assignee: Zhixiong Chen
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The change adds support for filtering on a specific type of table, e.g. snapshot or 
> partitioned, in `HiveDatasetFinder`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


arjun4084346 commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely considered not good style.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [incubator-gobblin] arjun4084346 commented on a change in pull request #2979: [GOBBLIN-1142] Hive Distcp support filter on partitioned or snapshot tables

2020-05-07 Thread GitBox


arjun4084346 commented on a change in pull request #2979:
URL: https://github.com/apache/incubator-gobblin/pull/2979#discussion_r421269048



##
File path: 
gobblin-data-management/src/main/java/org/apache/gobblin/data/management/copy/predicates/TableTypeFilter.java
##
@@ -0,0 +1,65 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.gobblin.data.management.copy.predicates;
+
+import java.util.Properties;
+
+import org.apache.hadoop.hive.metastore.api.Table;
+
+import com.google.common.base.Predicate;
+
+import javax.annotation.Nullable;
+
+
+/**
+ * A predicate to check if a hive {@link Table} is of a certain type in {@link TABLE_TYPE}
+ *
+ *  Example usage: {@link org.apache.gobblin.data.management.copy.hive.HiveDatasetFinder#TABLE_FILTER}
+ */
+public class TableTypeFilter implements Predicate<Table> {
+
+  public static final String FILTER_TYPE = "tableTypeFilter.type";
+
+  private enum TABLE_TYPE {
+    SNAPSHOT,
+    PARTITIONED
+  }
+
+  private final TABLE_TYPE tableType;
+
+  public TableTypeFilter(Properties props) {
+    tableType = TABLE_TYPE.valueOf(
+        props.getProperty(FILTER_TYPE, TABLE_TYPE.SNAPSHOT.name()).toUpperCase());
+  }
+
+  @Override
+  public boolean apply(@Nullable Table input) {
+    if (input == null) {
+      return false;
+    }
+
+    switch (tableType) {
+      case SNAPSHOT:
+        return input.getPartitionKeys() == null || input.getPartitionKeys().size() == 0;
+      case PARTITIONED:
+        return input.getPartitionKeys() != null && input.getPartitionKeys().size() > 0;
+    }

Review comment:
   I was not comfortable with this, so I googled it and found that a 
`switch-case` without a `default` is widely discouraged.
   https://help.semmle.com/wiki/display/JAVA/Missing+default+case+in+switch





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org