[GitHub] [incubator-gobblin] jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix invalid git links for classes in docs

2019-04-04 Thread GitBox
jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix invalid git links for 
classes in docs
URL: 
https://github.com/apache/incubator-gobblin/pull/2586#issuecomment-480150157
 
 
   @yukuai518, can you merge the changes? I need to update docs as part of 
#2578 and it will help to avoid conflicts. Thanks


This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services


[jira] [Work logged] (GOBBLIN-719) gobblin-docs has invalid git links

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-719?focusedWorklogId=223441&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223441
 ]

ASF GitHub Bot logged work on GOBBLIN-719:
--

Author: ASF GitHub Bot
Created on: 05/Apr/19 05:09
Start Date: 05/Apr/19 05:09
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on issue #2586: [GOBBLIN-719] fix 
invalid git links for classes in docs
URL: 
https://github.com/apache/incubator-gobblin/pull/2586#issuecomment-480150157
 
 
   @yukuai518, can you merge the changes? I need to update docs as part of 
#2578 and it will help to avoid conflicts. Thanks
 



Issue Time Tracking
---

Worklog Id: (was: 223441)
Time Spent: 0.5h  (was: 20m)

> gobblin-docs has invalid git links
> --
>
> Key: GOBBLIN-719
> URL: https://issues.apache.org/jira/browse/GOBBLIN-719
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Jay Sen
>Priority: Trivial
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> The gobblin docs had some invalid links, pointing not only to the LinkedIn repo but also 
> to old locations of classes that have changed since then.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=223440&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223440
 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--

Author: ASF GitHub Bot
Created on: 05/Apr/19 05:09
Start Date: 05/Apr/19 05:09
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2578: 
[GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272445741
 
 

 ##
 File path: gobblin-docs/user-guide/Gobblin-CLI.md
 ##
 @@ -28,29 +28,29 @@ Gobblin ingestion applications
 
 Gobblin ingestion applications can be accessed through the command `run`:
 ```bash
-bin/gobblin run [listQuickApps] [] -jobName  [OPTIONS]
+bin/gobblin cli run [listQuickApps] [] -jobName  [OPTIONS]
 
 Review comment:
   OK sure, it requires a lot of doc changes and some reorganization, which I 
can take care of, but can we get #2586 merged? Otherwise I'll have a lot of 
conflicts.
 



Issue Time Tracking
---

Worklog Id: (was: 223440)
Time Spent: 2h 20m  (was: 2h 10m)

> combine & standardize all gobblin scripts into one master script & 
> restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> gobblin supports multiple modes of execution (CLI, Standalone, 
> cluster-master, cluster-worker, AWS, YARN, MR) and there is an individual 
> script for each of them.
> 1. there can be one gobblin.sh script
> {{gobblin.sh   }}
> {{gobblin.sh   }}
> {{commands values: admin, cli, statestore-check, statestore-clean, 
> historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
> service}}
> 2. Also, the configs need to be restructured and deduplicated accordingly.
>  





[GitHub] [incubator-gobblin] jhsenjaliya commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command

2019-04-04 Thread GitBox
jhsenjaliya commented on a change in pull request #2578: [GOBBLIN-707] rewrite 
gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272445741
 
 

 ##
 File path: gobblin-docs/user-guide/Gobblin-CLI.md
 ##
 @@ -28,29 +28,29 @@ Gobblin ingestion applications
 
 Gobblin ingestion applications can be accessed through the command `run`:
 ```bash
-bin/gobblin run [listQuickApps] [] -jobName  [OPTIONS]
+bin/gobblin cli run [listQuickApps] [] -jobName  [OPTIONS]
 
 Review comment:
   OK sure, it requires a lot of doc changes and some reorganization, which I 
can take care of, but can we get #2586 merged? Otherwise I'll have a lot of 
conflicts.




[jira] [Work logged] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-724?focusedWorklogId=223370&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223370
 ]

ASF GitHub Bot logged work on GOBBLIN-724:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 23:47
Start Date: 04/Apr/19 23:47
Worklog Time Spent: 10m 
  Work Description: ibuenros commented on issue #2591: [GOBBLIN-724] 
Upgrade throttling server so waiting until tokens can be used is done…
URL: 
https://github.com/apache/incubator-gobblin/pull/2591#issuecomment-480102141
 
 
   @htran1 can you review? I can go over the changes with you if necessary.
 



Issue Time Tracking
---

Worklog Id: (was: 223370)
Time Spent: 20m  (was: 10m)

> Throttling server delays responses for throttling causing too many connections
> --
>
> Key: GOBBLIN-724
> URL: https://issues.apache.org/jira/browse/GOBBLIN-724
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Issac Buenrostro
>Priority: Major
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Currently, the throttling server implements throttling in part by delaying 
> the response with the permit allocation. However, when waiting to respond, 
> the request remains in flight utilizing system resources and severely 
> limiting how many clients can use the throttling server.
> As a fix, the server should respond immediately and ask the client to wait 
> before distributing the permits.





[jira] [Work logged] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-724?focusedWorklogId=223366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223366
 ]

ASF GitHub Bot logged work on GOBBLIN-724:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 23:47
Start Date: 04/Apr/19 23:47
Worklog Time Spent: 10m 
  Work Description: ibuenros commented on pull request #2591: [GOBBLIN-724] 
Upgrade throttling server so waiting until tokens can be used is done…
URL: https://github.com/apache/incubator-gobblin/pull/2591
 
 
   … by the client instead of the server. See GOBBLIN-724.
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-XXX
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 223366)
Time Spent: 10m
Remaining Estimate: 0h

> Throttling server delays responses for throttling causing too many connections
> --
>
> Key: GOBBLIN-724
> URL: https://issues.apache.org/jira/browse/GOBBLIN-724
> Project: Apache Gobblin
>  Issue Type: Bug
>Reporter: Issac Buenrostro
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently, the throttling server implements throttling in part by delaying 
> the response with the permit allocation. However, when waiting to respond, 
> the request remains in flight utilizing system resources and severely 
> limiting how many clients can use the throttling server.
> As a fix, the server should respond immediately and ask the client to wait 
> before distributing the permits.





[jira] [Created] (GOBBLIN-724) Throttling server delays responses for throttling causing too many connections

2019-04-04 Thread Issac Buenrostro (JIRA)
Issac Buenrostro created GOBBLIN-724:


 Summary: Throttling server delays responses for throttling causing 
too many connections
 Key: GOBBLIN-724
 URL: https://issues.apache.org/jira/browse/GOBBLIN-724
 Project: Apache Gobblin
  Issue Type: Bug
Reporter: Issac Buenrostro


Currently, the throttling server implements throttling in part by delaying the 
response with the permit allocation. However, when waiting to respond, the 
request remains in flight utilizing system resources and severely limiting how 
many clients can use the throttling server.

As a fix, the server should respond immediately and ask the client to wait 
before distributing the permits.
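
The proposed fix can be sketched roughly as follows. This is a minimal illustration, not the code from the pull request: the class names `ClientSideWaitSketch`, `PermitGrant` and `NonBlockingLimiter`, the token-bucket accounting, and the one-second burst cap are all assumptions made for the example. The point it shows is that the server answers at once with a wait time, and the client sleeps after the request has completed.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch of client-side waiting; none of these names come from Gobblin. */
final class ClientSideWaitSketch {

  /** What the server sends back right away: the permits plus how long to hold off. */
  static final class PermitGrant {
    final long permits;
    final long waitMillis;

    PermitGrant(long permits, long waitMillis) {
      this.permits = permits;
      this.waitMillis = waitMillis;
    }
  }

  /** Server side: a token bucket that never blocks; it goes into debt and reports the delay. */
  static final class NonBlockingLimiter {
    private final long permitsPerSecond;
    private long storedMilliPermits; // 1 permit = 1000 milli-permits; may go negative
    private long lastRefillMillis;

    NonBlockingLimiter(long permitsPerSecond, long nowMillis) {
      this.permitsPerSecond = permitsPerSecond;
      this.lastRefillMillis = nowMillis;
    }

    synchronized PermitGrant request(long permits, long nowMillis) {
      // Each elapsed millisecond earns permitsPerSecond milli-permits,
      // capped at one second's worth of burst (an assumed policy).
      long refilled = storedMilliPermits + (nowMillis - lastRefillMillis) * permitsPerSecond;
      storedMilliPermits = Math.min(refilled, permitsPerSecond * 1000);
      lastRefillMillis = nowMillis;

      // Grant immediately, even if that leaves the bucket in debt.
      storedMilliPermits -= permits * 1000;
      long deficit = Math.max(0, -storedMilliPermits);
      long waitMillis = (deficit + permitsPerSecond - 1) / permitsPerSecond; // ceiling division
      return new PermitGrant(permits, waitMillis);
    }
  }

  /** Client side: the wait happens here, after the request is no longer in flight. */
  static void awaitGrant(PermitGrant grant) throws InterruptedException {
    if (grant.waitMillis > 0) {
      TimeUnit.MILLISECONDS.sleep(grant.waitMillis);
    }
  }
}
```

With a rate of 100 permits per second and an empty bucket, a request for 50 permits returns immediately with a 500 ms wait, so the server holds no connection open while the client waits.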





[GitHub] [incubator-gobblin] ibuenros commented on issue #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done…

2019-04-04 Thread GitBox
ibuenros commented on issue #2591: [GOBBLIN-724] Upgrade throttling server so 
waiting until tokens can be used is done…
URL: 
https://github.com/apache/incubator-gobblin/pull/2591#issuecomment-480102141
 
 
   @htran1 can you review? I can go over the changes with you if necessary.




[GitHub] [incubator-gobblin] ibuenros opened a new pull request #2591: [GOBBLIN-724] Upgrade throttling server so waiting until tokens can be used is done…

2019-04-04 Thread GitBox
ibuenros opened a new pull request #2591: [GOBBLIN-724] Upgrade throttling 
server so waiting until tokens can be used is done…
URL: https://github.com/apache/incubator-gobblin/pull/2591
 
 
   … by the client instead of the server. See GOBBLIN-724.
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [ ] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-XXX
   
   
   ### Description
   - [ ] Here are some details about my PR, including screenshots (if 
applicable):
   
   
   ### Tests
   - [ ] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   
   
   ### Commits
   - [ ] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   




[jira] [Work logged] (GOBBLIN-720) delete the state store whenever a flow is deleted

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-720?focusedWorklogId=223232&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223232
 ]

ASF GitHub Bot logged work on GOBBLIN-720:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 20:24
Start Date: 04/Apr/19 20:24
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2587: [GOBBLIN-720 
Always delete state store
URL: https://github.com/apache/incubator-gobblin/pull/2587
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 223232)
Time Spent: 50m  (was: 40m)

> delete the state store whenever a flow is deleted
> -
>
> Key: GOBBLIN-720
> URL: https://issues.apache.org/jira/browse/GOBBLIN-720
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Arjun Singh Bora
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Work logged] (GOBBLIN-722) add option to unschedule a gaas flow

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-722?focusedWorklogId=223234&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223234
 ]

ASF GitHub Bot logged work on GOBBLIN-722:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 20:25
Start Date: 04/Apr/19 20:25
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2589: [GOBBLIN-722] 
Unschedule gaas flow
URL: https://github.com/apache/incubator-gobblin/pull/2589
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 223234)
Time Spent: 40m  (was: 0.5h)

> add option to unschedule a gaas flow
> 
>
> Key: GOBBLIN-722
> URL: https://issues.apache.org/jira/browse/GOBBLIN-722
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Arjun Singh Bora
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (GOBBLIN-722) add option to unschedule a gaas flow

2019-04-04 Thread Hung Tran (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran resolved GOBBLIN-722.
---
   Resolution: Fixed
Fix Version/s: 0.15.0

Issue resolved by pull request #2589
[https://github.com/apache/incubator-gobblin/pull/2589]

> add option to unschedule a gaas flow
> 
>
> Key: GOBBLIN-722
> URL: https://issues.apache.org/jira/browse/GOBBLIN-722
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Arjun Singh Bora
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>






[GitHub] [incubator-gobblin] asfgit closed pull request #2589: [GOBBLIN-722] Unschedule gaas flow

2019-04-04 Thread GitBox
asfgit closed pull request #2589: [GOBBLIN-722] Unschedule gaas flow
URL: https://github.com/apache/incubator-gobblin/pull/2589
 
 
   




[jira] [Work logged] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-723?focusedWorklogId=223229&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223229
 ]

ASF GitHub Bot logged work on GOBBLIN-723:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 20:22
Start Date: 04/Apr/19 20:22
Worklog Time Spent: 10m 
  Work Description: asfgit commented on pull request #2590: [GOBBLIN-723] 
Add support to the LogCopier for copying from multiple …
URL: https://github.com/apache/incubator-gobblin/pull/2590
 
 
   
 



Issue Time Tracking
---

Worklog Id: (was: 223229)
Time Spent: 20m  (was: 10m)

> Add support to the LogCopier for copying from multiple source paths
> ---
>
> Key: GOBBLIN-723
> URL: https://issues.apache.org/jira/browse/GOBBLIN-723
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The LogCopier should support multiple source paths.





[jira] [Resolved] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths

2019-04-04 Thread Hung Tran (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hung Tran resolved GOBBLIN-723.
---
   Resolution: Fixed
Fix Version/s: 0.15.0

Issue resolved by pull request #2590
[https://github.com/apache/incubator-gobblin/pull/2590]

> Add support to the LogCopier for copying from multiple source paths
> ---
>
> Key: GOBBLIN-723
> URL: https://issues.apache.org/jira/browse/GOBBLIN-723
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> The LogCopier should support multiple source paths.





[jira] [Created] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths

2019-04-04 Thread Hung Tran (JIRA)
Hung Tran created GOBBLIN-723:
-

 Summary: Add support to the LogCopier for copying from multiple 
source paths
 Key: GOBBLIN-723
 URL: https://issues.apache.org/jira/browse/GOBBLIN-723
 Project: Apache Gobblin
  Issue Type: Task
Reporter: Hung Tran
Assignee: Hung Tran


The LogCopier should support multiple source paths.





[jira] [Work logged] (GOBBLIN-723) Add support to the LogCopier for copying from multiple source paths

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-723?focusedWorklogId=223166&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223166
 ]

ASF GitHub Bot logged work on GOBBLIN-723:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 17:52
Start Date: 04/Apr/19 17:52
Worklog Time Spent: 10m 
  Work Description: htran1 commented on pull request #2590: [GOBBLIN-723] 
Add support to the LogCopier for copying from multiple …
URL: https://github.com/apache/incubator-gobblin/pull/2590
 
 
   …source paths
   
   Dear Gobblin maintainers,
   
   Please accept this PR. I understand that it will not be reviewed until I 
have checked off all the steps below!
   
   
   ### JIRA
   - [X] My PR addresses the following [Gobblin 
JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references 
them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
   - https://issues.apache.org/jira/browse/GOBBLIN-723
   
   
   ### Description
   - [X] Here are some details about my PR, including screenshots (if 
applicable):
   Add support for multiple source paths. The GobblinYarnLogSource will split 
the string value of LOG_DIRS and configure the LogCopier to look at multiple 
paths.
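
   The splitting step described above might look roughly like this. It is only a sketch: it assumes the `LOG_DIRS` value is a comma-separated list of directories, and the class `LogDirsSketch` and method `splitLogDirs` are hypothetical names, not taken from `GobblinYarnLogSource` or the `LogCopier`.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper; the real GobblinYarnLogSource code may differ. */
final class LogDirsSketch {

  /** Splits a comma-separated directory list, trimming whitespace and skipping empty entries. */
  static List<String> splitLogDirs(String logDirs) {
    List<String> paths = new ArrayList<>();
    if (logDirs == null) {
      return paths;
    }
    for (String dir : logDirs.split(",")) {
      String trimmed = dir.trim();
      if (!trimmed.isEmpty()) {
        paths.add(trimmed);
      }
    }
    return paths;
  }
}
```

   Each resulting path would then be registered with the copier as a separate source directory.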
   
   ### Tests
   - [X] My PR adds the following unit tests __OR__ does not need testing for 
this extremely good reason:
   Tested with a job on an environment with multiple log directories.
   
   ### Commits
   - [X] My commits all reference JIRA issues in their subject lines, and I 
have squashed multiple commits if they address the same issue. In addition, my 
commits follow the guidelines from "[How to write a good git commit 
message](http://chris.beams.io/posts/git-commit/)":
   1. Subject is separated from body by a blank line
   2. Subject is limited to 50 characters
   3. Subject does not end with a period
   4. Subject uses the imperative mood ("add", not "adding")
   5. Body wraps at 72 characters
   6. Body explains "what" and "why", not "how"
   
   
 



Issue Time Tracking
---

Worklog Id: (was: 223166)
Time Spent: 10m
Remaining Estimate: 0h

> Add support to the LogCopier for copying from multiple source paths
> ---
>
> Key: GOBBLIN-723
> URL: https://issues.apache.org/jira/browse/GOBBLIN-723
> Project: Apache Gobblin
>  Issue Type: Task
>Reporter: Hung Tran
>Assignee: Hung Tran
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The LogCopier should support multiple source paths.





[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223082&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223082
 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 16:23
Start Date: 04/Apr/19 16:23
Worklog Time Spent: 10m 
  Work Description: zxcware commented on pull request #2577: GOBBLIN-708: 
Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259273
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link 
DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the 
datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+if (this == other) {
+  return true;
+}
+if (!getClass().equals(other.getClass())) {
 
 Review comment:
   Maybe `other == null || !getClass().equals(other.getClass())`?
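
   A self-contained sketch of the suggested guard follows. The classes `Descriptor` and `SqlDescriptor` and the prefix-match containment rule are hypothetical stand-ins for this example; only the `other == null || !getClass().equals(other.getClass())` condition comes from the review comment.

```java
/** Hypothetical stand-ins for the descriptor classes; only the guard mirrors the review comment. */
final class ContainsGuardSketch {

  static class Descriptor {
    final String path;

    Descriptor(String path) {
      this.path = path;
    }

    boolean contains(Descriptor other) {
      if (this == other) {
        return true;
      }
      // Checking null first avoids a NullPointerException on other.getClass().
      if (other == null || !getClass().equals(other.getClass())) {
        return false;
      }
      // Placeholder containment rule (prefix match), assumed for illustration only.
      return other.path.startsWith(this.path);
    }
  }

  /** A second descriptor type, used to show that mismatched classes never contain each other. */
  static class SqlDescriptor extends Descriptor {
    SqlDescriptor(String path) {
      super(path);
    }
  }
}
```

   Without the null check, `contains(null)` would throw instead of returning `false`.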
 



Issue Time Tracking
---

Worklog Id: (was: 223082)
Time Spent: 50m  (was: 40m)

> Create SqlDatasetDescriptor for JDBC-sourced datasets 
> --
>
> Key: GOBBLIN-708
> URL: https://issues.apache.org/jira/browse/GOBBLIN-708
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-service
>Affects 

Re: Gobblin at ApacheCon ?

2019-04-04 Thread Abhishek Tiwari
+1

Last year we received good traction at ApacheCon NA. Totally worth it.

@Tamas, if Vegas doesn't work, you should definitely go for ApacheCon EU:
https://aceu19.apachecon.com/

Abhishek


On Thu, Apr 4, 2019 at 9:16 AM Jay Sen  wrote:

> I also think it would be really helpful to the project, especially while the
> core dev team is working hard to make this a top-level Apache
> project.
>
> -Jay
>
> On Wed, Apr 3, 2019 at 11:34 PM Tamas Nemeth wrote:
>
> > I love the idea as well!
> > I would be happy to see it at ApacheCon, but Las Vegas is a bit far from
> > here. :(
> >
> > Tamas
> >
> > On 2019. Apr 4., Thu at 8:24, Jean-Baptiste Onofré 
> > wrote:
> >
> > > It sounds good to me.
> > >
> > > Regards
> > > JB
> > >
> > > On 03/04/2019 18:59, Jay Sen wrote:
> > > > Hi Guys,
> > > >
> > > > Let's present Apache Gobblin at ApacheCon.
> > > >
> > > > I would be interested in presenting/co-presenting PayPal's use-case.
> > > >
> > > > @PMCs, Please share your thoughts.
> > > >
> > > > Thanks
> > > > Jay
> > > >
> > >
> > > --
> > > Jean-Baptiste Onofré
> > > jbono...@apache.org
> > > http://blog.nanthrax.net
> > > Talend - http://www.talend.com
> > >
> >
>


[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223079&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223079
 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 16:23
Start Date: 04/Apr/19 16:23
Worklog Time Spent: 10m 
  Work Description: zxcware commented on pull request #2577: GOBBLIN-708: 
Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272257572
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
 
 Review comment:
   Should we favor `formatConfig` over the input `config`?
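
   To make the precedence question concrete: in Typesafe Config, `a.withFallback(b)` lets keys in `a` win over keys in `b`. The sketch below only models that rule with plain maps; it does not use the real `Config` API, and the class name, the `format` key, and both values are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Plain-map model of "primary.withFallback(fallback)"; not the Typesafe Config API.
final class FallbackOrderSketch {

  // Keys present in primary win; fallback only fills in keys that are missing.
  static Map<String, String> withFallback(Map<String, String> primary, Map<String, String> fallback) {
    Map<String, String> merged = new LinkedHashMap<>(fallback);
    merged.putAll(primary);
    return merged;
  }

  public static void main(String[] args) {
    // Hypothetical values: the raw input and a normalized copy kept by a sub-config.
    Map<String, String> input = Collections.singletonMap("format", "AVRO");
    Map<String, String> normalized = Collections.singletonMap("format", "avro");

    // Order used in the diff: the input config takes precedence.
    System.out.println(withFallback(input, normalized).get("format")); // prints AVRO

    // Order the review comment asks about: the sub-config takes precedence.
    System.out.println(withFallback(normalized, input).get("format")); // prints avro
  }
}
```

   Whether the order matters in the PR depends on whether `FormatConfig` ever rewrites values it reads from the input, which this thread does not show.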
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 223079)
Time Spent: 20m  (was: 10m)

> Create SqlDatasetDescriptor for JDBC-sourced datasets 
> --
>
> Key: GOBBLIN-708
> URL: https://issues.apache.org/jira/browse/GOBBLIN-708
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-service
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> Create a new DatasetDescriptor for JDBC sourced datasets. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223080=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223080
 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 16:23
Start Date: 04/Apr/19 16:23
Worklog Time Spent: 10m 
  Work Description: zxcware commented on pull request #2577: GOBBLIN-708: 
Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272260089
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/EncryptionConfig.java
 ##
 @@ -20,18 +20,21 @@
 import java.io.IOException;
 
 import com.google.common.base.Enums;
-import com.google.common.base.Joiner;
 import com.google.common.collect.ImmutableMap;
 import com.typesafe.config.Config;
 import com.typesafe.config.ConfigFactory;
 
+import lombok.EqualsAndHashCode;
 import lombok.Getter;
+import lombok.ToString;
 import lombok.extern.slf4j.Slf4j;
 
 import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
 import org.apache.gobblin.util.ConfigUtils;
 
 @Slf4j
+@ToString(exclude = {"rawConfig"})
 
 Review comment:
   nice
 



Issue Time Tracking
---

Worklog Id: (was: 223080)
Time Spent: 0.5h  (was: 20m)

> Create SqlDatasetDescriptor for JDBC-sourced datasets 
> --
>
> Key: GOBBLIN-708
> URL: https://issues.apache.org/jira/browse/GOBBLIN-708
> Project: Apache Gobblin
>  Issue Type: Improvement
>  Components: gobblin-service
>Affects Versions: 0.15.0
>Reporter: Sudarshan Vasudevan
>Assignee: Abhishek Tiwari
>Priority: Major
> Fix For: 0.15.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Create a new DatasetDescriptor for JDBC sourced datasets. 





[jira] [Work logged] (GOBBLIN-708) Create SqlDatasetDescriptor for JDBC-sourced datasets

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-708?focusedWorklogId=223081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223081
 ]

ASF GitHub Bot logged work on GOBBLIN-708:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 16:23
Start Date: 04/Apr/19 16:23
Worklog Time Spent: 10m 
  Work Description: zxcware commented on pull request #2577: GOBBLIN-708: 
Create SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259071
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link 
DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the 
datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+if (this == other) {
+  return true;
+}
+if (!getClass().equals(other.getClass())) {
+  return false;
+}
+
+if (this.getPlatform() == null || other.getPlatform() == null || 
!this.getPlatform().equalsIgnoreCase(other.getPlatform())) {
 
 Review comment:
   we can simplify as `if (this.getPlatform() == null || 
!this.getPlatform().equalsIgnoreCase(other.getPlatform())) {`
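
   The simplification relies on `String.equalsIgnoreCase(null)` returning false rather than throwing, so a null on the right-hand side already counts as a mismatch. A minimal sketch comparing the two forms follows; the class and method names are made up for illustration and are not part of the PR.

```java
// Compares the condition from the diff with the simplified one from the review.
final class PlatformCheckSketch {

  // Form in the diff: both sides are null-checked explicitly.
  static boolean mismatchVerbose(String mine, String theirs) {
    return mine == null || theirs == null || !mine.equalsIgnoreCase(theirs);
  }

  // Suggested form: equalsIgnoreCase(null) is false, so the second null check is redundant.
  static boolean mismatchSimplified(String mine, String theirs) {
    return mine == null || !mine.equalsIgnoreCase(theirs);
  }

  public static void main(String[] args) {
    String[] samples = {null, "hdfs", "HDFS", "s3"};
    for (String a : samples) {
      for (String b : samples) {
        if (mismatchVerbose(a, b) != mismatchSimplified(a, b)) {
          throw new AssertionError("forms disagree for " + a + " vs " + b);
        }
      }
    }
    System.out.println("both forms agree on all sample pairs");
  }
}
```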
 



Issue Time Tracking
---

Worklog Id: (was: 223081)
Time Spent: 40m  (was: 0.5h)

> Create SqlDatasetDescriptor for JDBC-sourced datasets 
> --
>
> Key: 

[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.

2019-04-04 Thread GitBox
zxcware commented on a change in pull request #2577: GOBBLIN-708: Create 
SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272257572
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
 
 Review comment:
   Should we favor `formatConfig` over the input `config`?




With regards,
Apache Git Services


[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.

2019-04-04 Thread GitBox
zxcware commented on a change in pull request #2577: GOBBLIN-708: Create 
SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259071
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link 
DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the 
datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+if (this == other) {
+  return true;
+}
+if (!getClass().equals(other.getClass())) {
+  return false;
+}
+
+if (this.getPlatform() == null || other.getPlatform() == null || 
!this.getPlatform().equalsIgnoreCase(other.getPlatform())) {
 
 Review comment:
   we can simplify as `if (this.getPlatform() == null || 
!this.getPlatform().equalsIgnoreCase(other.getPlatform())) {`




[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.

2019-04-04 Thread GitBox
zxcware commented on a change in pull request #2577: GOBBLIN-708: Create 
SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272260089
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/EncryptionConfig.java
 ##
 @@ -20,18 +20,21 @@
 import java.io.IOException;
 
 import com.google.common.base.Enums;
-import com.google.common.base.Joiner;
 import com.google.common.collect.ImmutableMap;
 import com.typesafe.config.Config;
 import com.typesafe.config.ConfigFactory;
 
+import lombok.EqualsAndHashCode;
 import lombok.Getter;
+import lombok.ToString;
 import lombok.extern.slf4j.Slf4j;
 
 import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
 import org.apache.gobblin.util.ConfigUtils;
 
 @Slf4j
+@ToString(exclude = {"rawConfig"})
 
 Review comment:
   nice




[GitHub] [incubator-gobblin] zxcware commented on a change in pull request #2577: GOBBLIN-708: Create SqlDatasetDescriptor for JDBC-sourced datasets.

2019-04-04 Thread GitBox
zxcware commented on a change in pull request #2577: GOBBLIN-708: Create 
SqlDatasetDescriptor for JDBC-sourced datasets.
URL: https://github.com/apache/incubator-gobblin/pull/2577#discussion_r272259273
 
 

 ##
 File path: 
gobblin-service/src/main/java/org/apache/gobblin/service/modules/dataset/BaseDatasetDescriptor.java
 ##
 @@ -0,0 +1,92 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.gobblin.service.modules.dataset;
+
+import java.io.IOException;
+
+import com.google.common.base.Preconditions;
+import com.google.common.collect.ImmutableMap;
+import com.typesafe.config.Config;
+import com.typesafe.config.ConfigFactory;
+
+import lombok.EqualsAndHashCode;
+import lombok.Getter;
+import lombok.ToString;
+
+import 
org.apache.gobblin.service.modules.flowgraph.DatasetDescriptorConfigKeys;
+import org.apache.gobblin.util.ConfigUtils;
+
+@EqualsAndHashCode (exclude = {"description", "rawConfig"})
+@ToString (exclude = {"description", "rawConfig"})
+public abstract class BaseDatasetDescriptor implements DatasetDescriptor {
+  @Getter
+  private final String platform;
+  @Getter
+  private final FormatConfig formatConfig;
+  @Getter
+  private final boolean isRetentionApplied;
+  @Getter
+  private final String description;
+  @Getter
+  private final Config rawConfig;
+
+  private static final Config DEFAULT_FALLBACK =
+  ConfigFactory.parseMap(ImmutableMap.builder()
+  .put(DatasetDescriptorConfigKeys.PATH_KEY, 
DatasetDescriptorConfigKeys.DATASET_DESCRIPTOR_CONFIG_ANY)
+  .put(DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false)
+  .build());
+
+  public BaseDatasetDescriptor(Config config) throws IOException {
+
Preconditions.checkArgument(config.hasPath(DatasetDescriptorConfigKeys.PLATFORM_KEY),
 "Dataset descriptor config must specify platform");
+this.platform = 
config.getString(DatasetDescriptorConfigKeys.PLATFORM_KEY).toLowerCase();
+this.formatConfig = new FormatConfig(config);
+this.isRetentionApplied = ConfigUtils.getBoolean(config, 
DatasetDescriptorConfigKeys.IS_RETENTION_APPLIED_KEY, false);
+this.description = ConfigUtils.getString(config, 
DatasetDescriptorConfigKeys.DESCRIPTION_KEY, "");
+this.rawConfig = 
config.withFallback(this.formatConfig.getRawConfig()).withFallback(DEFAULT_FALLBACK);
+  }
+
+  /**
+   * {@inheritDoc}
+   */
+  protected abstract boolean isPathContaining(String otherPath);
+
+  /**
+   * @return true if this {@link DatasetDescriptor} contains the other {@link 
DatasetDescriptor} i.e. the
+   * datasets described by this {@link DatasetDescriptor} is a subset of the 
datasets described by the other
+   * {@link DatasetDescriptor}. This operation is non-commutative.
+   * @param other
+   */
+  @Override
+  public boolean contains(DatasetDescriptor other) {
+if (this == other) {
+  return true;
+}
+if (!getClass().equals(other.getClass())) {
 
 Review comment:
   Maybe `other == null || !getClass().equals(other.getClass())`?
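
   The reason for the guard: `other.getClass()` throws a NullPointerException when `other` is null, while a guarded comparison simply reports "not the same class". A small sketch, using hypothetical helper names rather than the actual `contains` method:

```java
// Illustrates why a null check should precede other.getClass().
final class NullGuardSketch {

  // Mirrors the check in the diff: fails with NullPointerException when other is null.
  static boolean sameClassUnguarded(Object self, Object other) {
    return self.getClass().equals(other.getClass());
  }

  // Mirrors the reviewer's suggestion: null is treated as "different class".
  static boolean sameClassGuarded(Object self, Object other) {
    return other != null && self.getClass().equals(other.getClass());
  }

  public static void main(String[] args) {
    System.out.println(sameClassGuarded("a", "b"));  // prints true
    System.out.println(sameClassGuarded("a", null)); // prints false
    try {
      sameClassUnguarded("a", null);
    } catch (NullPointerException e) {
      System.out.println("unguarded check threw NullPointerException");
    }
  }
}
```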




Re: Gobblin at ApacheCon ?

2019-04-04 Thread Tamas Nemeth
I love the idea as well!
I would be happy to see it at ApacheCon, but Las Vegas is a bit far from here.
:(

Tamas

On 2019. Apr 4., Thu at 8:24, Jean-Baptiste Onofré  wrote:

> It sounds good to me.
>
> Regards
> JB
>
> On 03/04/2019 18:59, Jay Sen wrote:
> > Hi Guys,
> >
> > Let's present Apache Gobblin at ApacheCon.
> >
> > I would be interested in presenting/co-presenting PayPal's use-case.
> >
> > @PMCs, Please share your thoughts.
> >
> > Thanks
> > Jay
> >
>
> --
> Jean-Baptiste Onofré
> jbono...@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>


Re: Gobblin at ApacheCon ?

2019-04-04 Thread Jean-Baptiste Onofré
It sounds good to me.

Regards
JB

On 03/04/2019 18:59, Jay Sen wrote:
> Hi Guys,
> 
> Let's present Apache Gobblin at ApacheCon.
> 
> I would be interested in presenting/co-presenting PayPal's use-case.
> 
> @PMCs, Please share your thoughts.
> 
> Thanks
> Jay
> 

-- 
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly

2019-04-04 Thread ASF GitHub Bot (JIRA)


 [ 
https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222822
 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--

Author: ASF GitHub Bot
Created on: 04/Apr/19 05:58
Start Date: 04/Apr/19 05:58
Worklog Time Spent: 10m 
  Work Description: jhsenjaliya commented on pull request #2578: 
[GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272026110
 
 

 ##
 File path: conf/yarn/application.conf
 ##
 @@ -22,15 +22,18 @@ gobblin.yarn.app.name=GobblinYarn
 gobblin.yarn.app.master.memory.mbs=256
 gobblin.yarn.initial.containers=2
 gobblin.yarn.container.memory.mbs=512
-gobblin.yarn.conf.dir=
-gobblin.yarn.lib.jars.dir=
-gobblin.yarn.app.master.files.local=${gobblin.yarn.conf.dir}"/log4j-yarn.properties,"${gobblin.yarn.conf.dir}"/application.conf,"${gobblin.yarn.conf.dir}"/reference.conf"
+gobblin.yarn.conf.dir=/tools/gobblin-dist/conf/yarn/
 
 Review comment:
   This was missed. Let me change this to 
`gobblin.yarn.conf.dir=${GOBBLIN_HOME}/conf/yarn/`, which will be better than having 
 By the way, thanks for catching this; this was my local config.
 



Issue Time Tracking
---

Worklog Id: (was: 222822)
Time Spent: 2h 10m  (was: 2h)

> combine & standardize all gobblin scripts into one master script & 
> restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
>  Issue Type: Improvement
>Reporter: Jay Sen
>Priority: Major
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, 
> cluster-master, cluster-worker, AWS, YARN, MR), and there is an individual 
> script for each of them.
> 1. There can be a single gobblin.sh script:
> {{gobblin.sh   }}
> {{gobblin.sh   }}
> {{commands values: admin, cli, statestore-check, statestore-clean, 
> historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, 
> service}}
> 2. The configs also need to be restructured and deduplicated accordingly.
>  


