[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=319799=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319799 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Sep/19 20:37
            Start Date: 27/Sep/19 20:37
    Worklog Time Spent: 10m
      Work Description: asfgit commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 319799)
    Time Spent: 10h 40m  (was: 10.5h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> ---------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 10h 40m
>  Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run cli and admin commands. The problem is that each cli and execution mode has its own script to manage the service, which brings the following problems.
> Having individual scripts introduces a lot of issues:
> # All scripts handle gobblin variables and user parameters differently, and the handling is highly inconsistent across the gobblin scripts, not to mention the different features supported by different scripts.
> # Functionality around start, stop, status checking and PID handling, among a lot of other things, varies vastly with the implementation of each script.
> # Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some gobblin scripts but not all, adding to an inconsistent user experience.
> # Code duplication: all the gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
> # Basically, the current 13 different scripts confuse new users about how to use Gobblin.
> Solution:
> 1. There can be one gobblin.sh script to handle all gobblin commands and deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run               -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh     -> gobblin cli job-state-to-json
> # class: StateStoreCleaner
> statestore-clean.sh       -> the class is deprecated so no need to migrate this over.
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh   -> gobblin cli job-store-schema-manager
>
> # class: Cli
> gobblin-admin.sh          -> gobblin cli admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin service cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin service cluster-worker start|stop|status
> gobblin-compaction.sh     -> gobblin-compaction.sh (kept as it is for now, can be migrated to new script framework)
> gobblin-mapreduce.sh      -> gobblin service mapreduce start|stop|status
> gobblin-service.sh        -> gobblin service service-manager start|stop|status
> gobblin-standalone.sh     -> gobblin service standalone start|stop|status
> gobblin-yarn.sh           -> gobblin service yarn start|stop|status
> {code}
>
> 2. Also, all configurations for each mode need to be structured and de-duped accordingly, to make it clear which config will be picked up for which execution mode. This would be well defined in the command help instructions.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
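The command layout proposed in the issue (a `cli` group and a `service` group with `start|stop|status`) can be sketched as a small dispatcher. This is only an illustration and not the script from PR #2578: the function names `usage` and `gobblin_dispatch` are hypothetical, the accepted service modes are taken from the migration table in the issue text, and the sketch only prints what it would run instead of launching anything.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a single entry point; NOT the actual gobblin.sh.
# It only echoes the resolved command, so it is safe to run anywhere.

usage() {
  echo "usage: gobblin cli <command> [args...]"
  echo "       gobblin service <mode> start|stop|status"
}

# gobblin_dispatch KIND [ARGS...]: route to the cli or service group.
gobblin_dispatch() {
  local kind="${1:-}"
  [ "$#" -gt 0 ] && shift
  case "$kind" in
    cli)
      # Every non-service tool lives under "cli", e.g. "cli run listQuickApps".
      if [ "$#" -eq 0 ]; then usage >&2; return 1; fi
      echo "cli: $*"
      ;;
    service)
      local mode="${1:-}" action="${2:-}"
      # Mode names assumed from the migration table in the issue description.
      case "$mode" in
        standalone|cluster-master|cluster-worker|mapreduce|service-manager|yarn) ;;
        *) echo "unknown service mode: '$mode'" >&2; return 1 ;;
      esac
      case "$action" in
        start|stop|status) echo "service: $mode $action" ;;
        *) echo "unknown action: '$action'" >&2; return 1 ;;
      esac
      ;;
    *)
      usage >&2
      return 1
      ;;
  esac
}
```

For example, `gobblin_dispatch service yarn status` prints `service: yarn status`, while an unknown group or action prints a message on stderr and returns a non-zero status. Validating the mode and action in one place is the point of the consolidation: every deployment mode gets the same argument handling.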
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=319662=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319662 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Sep/19 16:40
            Start Date: 27/Sep/19 16:40
    Worklog Time Spent: 10m
      Work Description: codecov-io commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-524705436

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=h1) Report
> Merging [#2578](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/9389a4b2dfcfc08ad05b34182e341d32b198d97c?src=pr=desc) will **increase** coverage by `0.04%`.
> The diff coverage is `7.14%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/graphs/tree.svg?width=650=4MgURJ0bGc=150=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2578      +/-   ##
============================================
+ Coverage     45.16%    45.2%    +0.04%
- Complexity     8800     8804       +4
============================================
  Files          1890     1889       -1
  Lines         70622    70572      -50
  Branches       7747     7745       -2
============================================
+ Hits          31896    31902       +6
+ Misses        35773    35719      -54
+ Partials       2953     2951       -2
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...c/main/java/org/apache/gobblin/cli/JobCommand.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1hZG1pbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9jbGkvSm9iQ29tbWFuZC5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...ava/org/apache/gobblin/runtime/cli/GobblinCli.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvY2xpL0dvYmJsaW5DbGkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...apache/gobblin/runtime/cli/CliEmbeddedGobblin.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvY2xpL0NsaUVtYmVkZGVkR29iYmxpbi5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...ore/util/DatabaseJobHistoryStoreSchemaManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1tZXRhc3RvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0YXN0b3JlL3V0aWwvRGF0YWJhc2VKb2JIaXN0b3J5U3RvcmVTY2hlbWFNYW5hZ2VyLmphdmE=) | `31.66% <100%> (+1.15%)` | `4 <1> (+1)` | :arrow_up: |
| [.../gobblin/runtime/util/JobStateToJsonConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdXRpbC9Kb2JTdGF0ZVRvSnNvbkNvbnZlcnRlci5qYXZh) | `20.21% <33.33%> (+0.85%)` | `6 <1> (ø)` | :arrow_down: |
| [...lin/restli/throttling/ZookeeperLeaderElection.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1yZXN0bGkvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2UvZ29iYmxpbi10aHJvdHRsaW5nLXNlcnZpY2Utc2VydmVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3Jlc3RsaS90aHJvdHRsaW5nL1pvb2tlZXBlckxlYWRlckVsZWN0aW9uLmphdmE=) | `70% <0%> (-2.23%)` | `13% <0%> (ø)` | |
| [.../apache/gobblin/runtime/api/JobExecutionState.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvYXBpL0pvYkV4ZWN1dGlvblN0YXRlLmphdmE=) | `79.43% <0%> (-0.94%)` | `24% <0%> (ø)` | |
| [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `65.27% <0%> (+0.46%)` | `28% <0%> (ø)` | :arrow_down: |
| [...e/gobblin/runtime/locks/ZookeeperBasedJobLock.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvbG9ja3MvWm9va2VlcGVyQmFzZWRKb2JMb2NrLmphdmE=) | `64.44% <0%> (+1.11%)` | `16% <0%> (+1%)` | :arrow_up: |
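The headline figures in this report can be cross-checked by hand, assuming the Coverage row is Hits divided by Lines and that Lines is the sum of Hits, Misses and Partials (both are assumptions about how Codecov derives the row, not something the email states). The helper name `coverage_pct` is hypothetical.

```shell
# coverage_pct HITS LINES: print HITS/LINES as a percentage with two decimals.
# Assumption: coverage = hits / lines, as the report's own totals suggest.
coverage_pct() {
  awk -v hits="$1" -v lines="$2" 'BEGIN { printf "%.2f\n", 100 * hits / lines }'
}

coverage_pct 31896 70622   # master column
coverage_pct 31902 70572   # PR #2578 column
```

Under that assumption the two calls give 45.16 and 45.20, which matches the reported `45.16%` and `45.2%` and the `+0.04%` delta. The totals are also consistent: 31902 + 35719 + 2951 = 70572.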
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=319637=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319637 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Sep/19 16:06
            Start Date: 27/Sep/19 16:06
    Worklog Time Spent: 10m
      Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-536001576

Ended up fixing the conflict with a merge. Please take a look. Thanks.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 319637)
    Time Spent: 10h 20m  (was: 10h 10m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=319329=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319329 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 27/Sep/19 03:26
            Start Date: 27/Sep/19 03:26
    Worklog Time Spent: 10m
      Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-535771167

@sv2000, I can't figure out what the conflict is here. I cannot see the diff, since I don't have write access. Can you please take a look?
By the way, `conf/service` is renamed to `conf/gobblin-as-service`.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 319329)
    Time Spent: 10h 10m  (was: 10h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=319084=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-319084 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Sep/19 16:57
            Start Date: 26/Sep/19 16:57
    Worklog Time Spent: 10m
      Work Description: sv2000 commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-535593899

@jhsenjaliya looks like there are conflicts. Can you take a look? Once you resolve the conflicts, I can merge the PR.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 319084)
    Time Spent: 10h  (was: 9h 50m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=301007=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301007 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Aug/19 03:52
            Start Date: 26/Aug/19 03:52
    Worklog Time Spent: 10m
      Work Description: codecov-io commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-524705436

# [Codecov](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=h1) Report
> Merging [#2578](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=desc) into [master](https://codecov.io/gh/apache/incubator-gobblin/commit/651c4a1265190d79036681d14a45f598f5060201?src=pr=desc) will **increase** coverage by `0.32%`.
> The diff coverage is `7.14%`.

[![Impacted file tree graph](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/graphs/tree.svg?width=650=4MgURJ0bGc=150=pr)](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=tree)

```diff
@@             Coverage Diff              @@
##             master    #2578      +/-   ##
============================================
+ Coverage     44.79%   45.12%    +0.32%
- Complexity     8689     8747      +58
============================================
  Files          1878     1879       +1
  Lines         70070    70120      +50
  Branches       7703     7698       -5
============================================
+ Hits          31391    31641     +250
+ Misses        35774    35554     -220
- Partials       2905     2925      +20
```

| [Impacted Files](https://codecov.io/gh/apache/incubator-gobblin/pull/2578?src=pr=tree) | Coverage Δ | Complexity Δ | |
|---|---|---|---|
| [...c/main/java/org/apache/gobblin/cli/JobCommand.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1hZG1pbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9jbGkvSm9iQ29tbWFuZC5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...apache/gobblin/runtime/cli/CliEmbeddedGobblin.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvY2xpL0NsaUVtYmVkZGVkR29iYmxpbi5qYXZh) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...ava/org/apache/gobblin/runtime/cli/GobblinCli.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi11dGlsaXR5L3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvY2xpL0dvYmJsaW5DbGkuamF2YQ==) | `0% <0%> (ø)` | `0 <0> (ø)` | :arrow_down: |
| [...ore/util/DatabaseJobHistoryStoreSchemaManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1tZXRhc3RvcmUvc3JjL21haW4vamF2YS9vcmcvYXBhY2hlL2dvYmJsaW4vbWV0YXN0b3JlL3V0aWwvRGF0YWJhc2VKb2JIaXN0b3J5U3RvcmVTY2hlbWFNYW5hZ2VyLmphdmE=) | `31.66% <100%> (+1.15%)` | `4 <1> (+1)` | :arrow_up: |
| [.../gobblin/runtime/util/JobStateToJsonConverter.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1ydW50aW1lL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL3J1bnRpbWUvdXRpbC9Kb2JTdGF0ZVRvSnNvbkNvbnZlcnRlci5qYXZh) | `20.21% <33.33%> (+0.85%)` | `6 <1> (ø)` | :arrow_down: |
| [...in/java/org/apache/gobblin/cluster/HelixUtils.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvSGVsaXhVdGlscy5qYXZh) | `35.51% <0%> (-6.33%)` | `12% <0%> (-1%)` | |
| [.../org/apache/gobblin/cluster/GobblinTaskRunner.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1jbHVzdGVyL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NsdXN0ZXIvR29iYmxpblRhc2tSdW5uZXIuamF2YQ==) | `64.78% <0%> (-1.41%)` | `29% <0%> (ø)` | |
| [...e/gobblin/config/store/zip/ZipFileConfigStore.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1jb25maWctbWFuYWdlbWVudC9nb2JibGluLWNvbmZpZy1jb3JlL3NyYy9tYWluL2phdmEvb3JnL2FwYWNoZS9nb2JibGluL2NvbmZpZy9zdG9yZS96aXAvWmlwRmlsZUNvbmZpZ1N0b3JlLmphdmE=) | `73.21% <0%> (-1.34%)` | `12% <0%> (ø)` | |
| [...apache/gobblin/hive/avro/HiveAvroSerDeManager.java](https://codecov.io/gh/apache/incubator-gobblin/pull/2578/diff?src=pr=tree#diff-Z29iYmxpbi1oaXZlLXJlZ2lzdHJhdGlvbi9zcmMvbWFpbi9qYXZhL29yZy9hcGFjaGUvZ29iYmxpbi9oaXZlL2F2cm8vSGl2ZUF2cm9TZXJEZU1hbmFnZXIuamF2YQ==) | `52.17% <0%> (-0.77%)` | `8% <0%> (ø)` | |
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=301000=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-301000 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

                Author: ASF GitHub Bot
            Created on: 26/Aug/19 03:22
            Start Date: 26/Aug/19 03:22
    Worklog Time Spent: 10m
      Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-524695862

@sv2000 @htran1 @ibuenros, I have been using this for some time now, lately for mapreduce mode, which is hard to use with the old gobblin.sh script. I have also added some updates in the last commits based on my usage.
Please take a look when you get a chance; I am waiting for this to get merged.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 301000)
    Time Spent: 9h 40m  (was: 9.5h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> ---------------------------------------------------------------------------------------------------
>
>                 Key: GOBBLIN-707
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-707
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Jay Sen
>            Priority: Major
>          Time Spent: 9h 40m
>  Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run cli and admin commands. There is an individual script for each of them.
> Having individual scripts introduces a lot of issues:
> # All scripts handle gobblin variables and user parameters differently, and the handling is highly inconsistent across the gobblin scripts.
> # Functionality around start, stop, status checking and PID handling, among a lot of other things, varies vastly with the implementation of each script.
> # Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some gobblin scripts but not all, adding to an inconsistent user experience.
> # Maintaining a total of 13 scripts would be too much effort.
> Also, all the gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> Solution:
> 1. There can be one gobblin.sh script to handle all gobblin commands and deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run               -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh     -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh       -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh   -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh          -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh     -> gobblin cluster-master start|stop|status
> gobblin-env.sh            -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh      -> gobblin cluster-master start|stop|status
> gobblin-service.sh        -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh     -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh           -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also, configs need to be structured and deduped accordingly, to make it clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
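The issue repeatedly points at start, stop, status and PID handling as logic that each script reimplements. A minimal sketch of what one shared set of helpers could look like follows. It is an illustration only, not code from the PR: the `PID_DIR` location and the names `pid_file`, `is_running`, `service_start`, `service_status` and `service_stop` are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical shared lifecycle helpers built on a PID file per service mode.
# PID_DIR and every function name here are assumptions for illustration.

PID_DIR="${PID_DIR:-${TMPDIR:-/tmp}/gobblin-sketch-pids}"

# pid_file MODE: path of the PID file for a given service mode.
pid_file() { echo "$PID_DIR/$1.pid"; }

# is_running MODE: succeeds if the PID file exists and the process is alive.
is_running() {
  local file
  file="$(pid_file "$1")"
  [ -f "$file" ] && kill -0 "$(cat "$file")" 2>/dev/null
}

# service_start MODE COMMAND [ARGS...]: run in the background, record the PID.
service_start() {
  local mode="$1"; shift
  if is_running "$mode"; then
    echo "$mode is already running"
    return 1
  fi
  mkdir -p "$PID_DIR"
  "$@" >/dev/null 2>&1 &
  echo "$!" > "$(pid_file "$mode")"
  echo "$mode started"
}

# service_status MODE: report whether the recorded process is alive.
service_status() {
  if is_running "$1"; then echo "$1 is running"; else echo "$1 is not running"; fi
}

# service_stop MODE: signal the recorded process and remove the PID file.
service_stop() {
  local mode="$1" file
  file="$(pid_file "$mode")"
  if ! is_running "$mode"; then
    echo "$mode is not running"
    rm -f "$file"   # clear a stale PID file, if any
    return 0
  fi
  kill "$(cat "$file")"
  rm -f "$file"
  echo "$mode stopped"
}
```

With helpers like these, every mode would share one code path, for example `service_start standalone sleep 30`, then `service_status standalone`, then `service_stop standalone`. A real script would additionally need to build the classpath and JVM options for each mode, which this sketch leaves out.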
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=300999=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300999 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 26/Aug/19 03:22
Start Date: 26/Aug/19 03:22
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-524695862

@sv2000 @htran1 @ibuenros, I have been using this for some time now, lately in mapreduce mode, which is hard to use with the old gobblin.sh script, and I have added some more updates for convenience. Please take a look when you get a chance; I am waiting for this to get merged.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 300999)
Time Spent: 9.5h (was: 9h 20m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 9.5h
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. There is an individual script for each of them.
> Having individual scripts introduces a lot of issues:
> # All scripts handle Gobblin variables and user parameters differently, and the handling is highly inconsistent across the scripts.
> # Functionality around start, stop, status checking, and PID handling, among many other things, varies widely with each script's implementation.
> # Features like GC and JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
> # Maintaining a total of 13 scripts takes too much effort.
> Also, all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> Solution:
> 1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also, configs need to be structured and deduplicated accordingly to make it clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}

-- This message was sent by Atlassian Jira (v8.3.2#803003)
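The single entry point described in the issue could be organized roughly as below. This is only a minimal sketch, not the gobblin.sh from the pull request: the command and service names come from the issue description, while the function names (contains_word, classify_mode, validate_service_action) are hypothetical and exist only for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of a single-entry dispatcher; NOT the actual gobblin.sh.

# Command and service names as listed in the issue description.
GOBBLIN_COMMANDS="admin cli statestore-check statestore-clean historystore-manager classpath"
GOBBLIN_SERVICES="standalone cluster-master cluster-worker aws yarn mr service"

# Succeeds if word $1 appears in the space-separated list $2.
contains_word() {
    local word="$1" list="$2" item
    for item in $list; do
        [ "$item" = "$word" ] && return 0
    done
    return 1
}

# Prints "command", "service", or "unknown" for the given first argument.
classify_mode() {
    if contains_word "$1" "$GOBBLIN_COMMANDS"; then
        echo "command"
    elif contains_word "$1" "$GOBBLIN_SERVICES"; then
        echo "service"
    else
        echo "unknown"
    fi
}

# Services accept only start, stop, or status as their action.
validate_service_action() {
    case "$1" in
        start|stop|status) return 0 ;;
        *) return 1 ;;
    esac
}
```

Keeping the two name lists in one place means argument validation, usage text, and PID handling can share a single definition instead of being repeated in every script, which is the consistency the issue asks for.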
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=300986=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-300986 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 26/Aug/19 02:48
Start Date: 26/Aug/19 02:48
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-524695862

@sv2000 @htran1 @ibuenros, can you guys pls take a look at this ?

Issue Time Tracking
---
Worklog Id: (was: 300986)
Time Spent: 9h 20m (was: 9h 10m)

-- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=288368=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-288368 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 03/Aug/19 03:02
Start Date: 03/Aug/19 03:02
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-517889096

@sv2000 @htran1 @ibuenros , added some more changes as per my usage. pls take a look.

Issue Time Tracking
---
Worklog Id: (was: 288368)
Time Spent: 9h 10m (was: 9h)

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=263498=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263498 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 20/Jun/19 02:52
Start Date: 20/Jun/19 02:52
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r295596935

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true

Review comment:
@ibuenros, made default to false as before, since it requires store config as well. pls take a look.

Issue Time Tracking
---
Worklog Id: (was: 263498)
Time Spent: 9h (was: 8h 50m)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=263497=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-263497 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 20/Jun/19 02:52
Start Date: 20/Jun/19 02:52
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r295596935

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true

Review comment:
@ibuenros, made default to false as before, since it requires store config as well.

Issue Time Tracking
---
Worklog Id: (was: 263497)
Time Spent: 8h 50m (was: 8h 40m)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=256423=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-256423 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 08/Jun/19 16:21
Start Date: 08/Jun/19 16:21
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-500137180

@htran1, @sv2000 , added wrapper scripts, pls take a look.

Issue Time Tracking
---
Worklog Id: (was: 256423)
Time Spent: 8h 40m (was: 8.5h)

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
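A backward-compatibility wrapper of the kind mentioned in this thread might look roughly like the following. It is a sketch under stated assumptions, not the wrapper scripts added in the pull request: the legacy-to-service mapping, the function names (legacy_to_service, build_new_command), and the "gobblin.sh SERVICE ACTION" argument order are all hypothetical, chosen to match the service values listed in the issue description.

```shell
#!/usr/bin/env bash
# Hypothetical compatibility shim; the mapping and argument order are assumptions.

# Maps a legacy script name to an assumed new service name.
# Prints nothing and returns non-zero for an unknown script.
legacy_to_service() {
    case "$1" in
        gobblin-standalone.sh)     echo "standalone" ;;
        gobblin-cluster-master.sh) echo "cluster-master" ;;
        gobblin-cluster-worker.sh) echo "cluster-worker" ;;
        gobblin-yarn.sh)           echo "yarn" ;;
        gobblin-mapreduce.sh)      echo "mr" ;;
        gobblin-service.sh)        echo "service" ;;
        *) return 1 ;;
    esac
}

# Prints the new-style command line equivalent to a legacy invocation.
# $1 is the legacy script name; remaining arguments are passed through.
build_new_command() {
    local legacy_name="$1" service
    shift
    service="$(legacy_to_service "$legacy_name")" || return 1
    echo "gobblin.sh $service $*"
}
```

A real wrapper would exec the resulting command and could print a deprecation notice first. As the thread points out, such a shim cannot be fully backward compatible where options have been standardized.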
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=254712=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-254712 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 05/Jun/19 22:17
Start Date: 05/Jun/19 22:17
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553190

## File path: bin/gobblin-admin.sh ##
@@ -1,142 +0,0 @@
-#!/bin/bash

Review comment:
I was aiming to clean up and simplify the many scripts Gobblin has, which can not only be confusing for new users but also offer no standard way of using Gobblin, since all scripts take different params. Users should really move to the new scripts to standardize, so I am not sure how much backward compatibility is required for scripts, as opposed to APIs; otherwise we will never be able to clean up and provide a better version. But I see your point, so maybe I will create wrappers as you suggested, with a note to remove those wrappers in the future, but I am afraid they won't be 100% backward compatible since a lot of things (options and features) are getting standardized here.

Issue Time Tracking
---
Worklog Id: (was: 254712)
Time Spent: 8.5h (was: 8h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248632 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 27/May/19 05:09
Start Date: 27/May/19 05:09
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553190

## File path: bin/gobblin-admin.sh ##
@@ -1,142 +0,0 @@
-#!/bin/bash

Review comment:
I was aiming to clean up and simplify the many scripts Gobblin has, which are not only confusing for new users but also not a standard way of using Gobblin, since the scripts all take different params. Users should really move to the new scripts to standardize, so I am not sure how much backward compatibility is required for scripts, as opposed to APIs; otherwise we will never be able to clean up and provide a better version. But I see your point, so maybe I will create wrappers as you suggested, with a note to remove those wrappers in the future. I'm afraid it won't be 100% backward compatible, though, since a lot of things are getting standardized here.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 248632)
Time Spent: 8h 20m (was: 8h 10m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 8h 20m
> Remaining Estimate: 0h
>
> gobblin supports multiple modes of execution ( CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR ) and various command-line utilities to run cli and admin commands. There is an individual script for each of them.
> Having individual scripts introduces a lot of issues:
> # all scripts handle gobblin variables and user parameters differently, and it's highly inconsistent among the various gobblin scripts
> # functionality around start, stop, status checking and handling PIDs, among a lot of other things, varies vastly as per the implementation of the script.
> # features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some gobblin scripts but not all, adding to the inconsistent user experience.
> # maintaining a total of 13 scripts would be too much effort.
> Also all the gobblin scripts share a lot of common code to handle params, start, stop services, status checks, pid handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> Solution:
> 1. there can be one gobblin.sh script to handle all gobblin commands and deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> with the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also configs need to be structured and deduped accordingly to make it clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}
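The issue description above names start, stop, status checking and PID handling as logic that every script implements differently. As a rough illustration of what a single shared implementation might look like, here is a minimal sketch. The PID directory, the one-file-per-mode naming and all function names are assumptions made for this example; they are not taken from the pull request.

```shell
#!/usr/bin/env bash
# Sketch only: PID-file based helpers that every service mode could share.
# PID_DIR and the function names below are hypothetical.

# Assumed location of PID files; can be overridden through the environment.
PID_DIR="${PID_DIR:-/tmp/gobblin-sketch-pids}"

pid_file_for() {
  # One PID file per execution mode, e.g. <PID_DIR>/standalone.pid
  echo "${PID_DIR}/$1.pid"
}

record_pid() {
  # Usage: record_pid <mode> <pid>; call right after launching the JVM in the background.
  mkdir -p "$PID_DIR"
  echo "$2" > "$(pid_file_for "$1")"
}

service_status() {
  # Prints "running" when the recorded PID is alive, otherwise "stopped".
  local pid_file
  pid_file="$(pid_file_for "$1")"
  if [ -f "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
    echo "running"
  else
    echo "stopped"
  fi
}

service_stop() {
  # Sends SIGTERM to the recorded PID (if it is alive) and removes the PID file.
  local pid_file
  pid_file="$(pid_file_for "$1")"
  if [ "$(service_status "$1")" = "running" ]; then
    kill "$(cat "$pid_file")"
  fi
  rm -f "$pid_file"
}
```

With helpers like these, `start`, `stop` and `status` would behave the same way for every mode, because only the mode name changes between calls.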
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248461=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248461 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 26/May/19 03:02
Start Date: 26/May/19 03:02
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287552990

## File path: bin/gobblin.sh ##
@@ -17,50 +17,488 @@
 # limitations under the License.
 #
-calling_dir() {
-  echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+# JAVA_HOME is required.
+if [[ -z "$JAVA_HOME" ]]; then
+    echo -e "\nError: Environment variable JAVA_HOME not set!\n"
+    exit 1
+fi
+
+# global vars
+
+GOBBLIN_VERSION=@project.version@
+GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)"
+GOBBLIN_LIB=${GOBBLIN_HOME}/lib
+GOBBLIN_BIN=${GOBBLIN_HOME}/bin
+GOBBLIN_LOGS=${GOBBLIN_HOME}/logs
+GOBBLIN_CONF=''
+
+#sourcing basic gobblin env vars like GOBBLIN_HOME and GOBBLIN_LIB
+. ${GOBBLIN_BIN}/gobblin-env.sh
+
+CLUSTER_NAME="gobblin_cluster"
+JVM_OPTS="-Xmx1g -Xms512m"
+LOG4J_FILE_PATH=''
+LOG4J_OPTS=''
+GOBBLIN_MODE=''
+ACTION=''
+JVM_FLAGS=''
+EXTRA_JARS=''
+VERBOSE=0
+ENABLE_GC_LOGS=0
+CMD_PARAMS=''
+
+
+# Gobblin Commands, Modes & respective Classes
+GOBBLIN_MODE_TYPE=''
+CLI='cli'
+SERVICE='service'
+
+# Commands
+JOB_STATE_TO_JSON_CMD='job-state-to-json'
+JOB_STORE_SCHEMA_MANAGER_CMD='job-store-schema-manager'
+CLASSPATH_CMD='classpath'
+
+# Execution Modes
+STANDALONE_MODE='standalone'
+CLUSTER_MASTER_MODE='cluster-master'
+CLUSTER_WORKER_MODE='cluster-worker'
+AWS_MODE='aws'
+YARN_MODE='yarn'
+MAPREDUCE_MODE='mapreduce'
+SERVICE_MANAGER_MODE='service-manager'
+
+GOBBLIN_EXEC_MODE_LIST="$STANDALONE_MODE $CLUSTER_MASTER_MODE $CLUSTER_WORKER_MODE $AWS_MODE $YARN_MODE $MAPREDUCE_MODE $SERVICE_MANAGER_MODE"
+
+# CLI Command class
+CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli'
+
+# Service Class
+STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon'
+CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager'
+CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner'
+AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher'
+YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher'
+MAPREDUCE_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher'
+SERVICE_MANAGER_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager'
+
+
+function print_gobblin_usage() {
+    echo "Usage:"
+    echo "gobblin.sh cli "
+    echo "gobblin.sh service "
+    echo ""
+    echo "Use \"gobblin --help\" for more information. (Gobblin Version: $GOBBLIN_VERSION)"
+}
+
+function print_gobblin_cli_usage() {

Review comment:
Yes, GobblinCLI has `printusage`, which lists all the alias classes, but it won't display the other help options. Also, it's better to have all help options at the script level rather than at the class level. The actual command help is still going to be managed by the implementing class, and both are available depending on the usage.

```
bin/gobblin cli somecommand
Could not find an application with alias somecommand
Available commands:
    job-state-to-json           To convert Job state to JSON
    jobs                        Command line job info and operations
    passwordManager             Encrypt or decrypt strings for the password manager.
    run                         Run a Gobblin application.
    decrypt                     Decryption utilities
    job-store-schema-manager    Database job history store schema manager
    stateMigration              Command line tools for migrating state store
    keystore                    Examine JCE Keystore files
    config                      Query the config library
    watermarks                  Inspect streaming watermarks
    cleaner                     Data retention utility
```

Issue Time Tracking
---
Worklog Id: (was: 248461)
Time Spent: 8h 10m (was: 8h)
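To make the "help at script level" idea from the comment above concrete, the following is a minimal sketch of how one script could answer `--help` itself and resolve an execution mode to its main class. The mode names and class names are copied from the quoted hunk. The function names (`resolve_service_class`, `print_usage`, `wants_help`) and the `<command>`, `[params]` and `<execution-mode>` placeholders are assumptions for illustration, not the pull request's actual code.

```shell
#!/usr/bin/env bash
# Sketch only: script-level help plus mode-to-class resolution.
# Function names and usage placeholders are hypothetical.

resolve_service_class() {
  # Maps an execution mode (as listed in the hunk above) to its Java main class.
  case "$1" in
    standalone)      echo 'org.apache.gobblin.scheduler.SchedulerDaemon' ;;
    cluster-master)  echo 'org.apache.gobblin.cluster.GobblinClusterManager' ;;
    cluster-worker)  echo 'org.apache.gobblin.cluster.GobblinTaskRunner' ;;
    aws)             echo 'org.apache.gobblin.aws.GobblinAWSClusterLauncher' ;;
    yarn)            echo 'org.apache.gobblin.yarn.GobblinYarnAppLauncher' ;;
    mapreduce)       echo 'org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher' ;;
    service-manager) echo 'org.apache.gobblin.service.modules.core.GobblinServiceManager' ;;
    *)
      echo "Unknown execution mode: $1" >&2
      return 1
      ;;
  esac
}

print_usage() {
  # $1 is the mode type: "cli", "service", or empty for the top-level summary.
  case "$1" in
    cli)     echo "Usage: gobblin.sh cli <command> [params]" ;;
    service) echo "Usage: gobblin.sh service <execution-mode> <start|stop|status>" ;;
    *)
      echo "Usage:"
      echo "  gobblin.sh cli <command> [params]"
      echo "  gobblin.sh service <execution-mode> <start|stop|status>"
      ;;
  esac
}

wants_help() {
  # Succeeds when any argument is -h or --help.
  local arg
  for arg in "$@"; do
    case "$arg" in
      -h|--help) return 0 ;;
    esac
  done
  return 1
}
```

A caller could check `wants_help "$@"` before starting a JVM and print the script-level usage, while an unrecognized `cli` command would still fall through to the class-level listing shown in the comment.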
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248460=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248460 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 26/May/19 02:59
Start Date: 26/May/19 02:59
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553190

## File path: bin/gobblin-admin.sh ##
@@ -1,142 +0,0 @@
-#!/bin/bash

Review comment:
I was aiming to clean up and simplify the many scripts Gobblin has, which could be confusing for new users. Users should really move to the new scripts, so I am not sure how much backward compatibility is required for scripts, as opposed to APIs; otherwise we will never be able to clean up. But I see your point, so maybe I will create wrappers as you suggested, with a note to remove those wrappers in the future.

Issue Time Tracking
---
Worklog Id: (was: 248460)
Time Spent: 8h (was: 7h 50m)
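The wrapper idea raised in the review comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the legacy-to-new mapping is taken from the table in the issue description (the final script may name its subcommands differently), the function names are hypothetical, and the sketch prints the forwarded command instead of executing it.

```shell
#!/usr/bin/env bash
# Sketch only: a backward-compatibility shim for removed legacy scripts.
# The mapping follows the issue description; function names are hypothetical.

legacy_to_subcommand() {
  # Translates a legacy script name into the subcommand of the combined script.
  case "$1" in
    gobblin-admin.sh)        echo 'admin' ;;
    statestore-checker.sh)   echo 'statestore-checker' ;;
    statestore-clean.sh)     echo 'statestore-clean' ;;
    historystore-manager.sh) echo 'historystore-manager' ;;
    *) return 1 ;;
  esac
}

build_forwarded_command() {
  # Usage: build_forwarded_command <legacy-script-name> [args...]
  # Prints a deprecation warning on stderr and the replacement command on stdout.
  local legacy_name="$1"
  shift
  local subcommand
  if ! subcommand="$(legacy_to_subcommand "$legacy_name")"; then
    echo "No replacement known for ${legacy_name}" >&2
    return 1
  fi
  echo "WARNING: ${legacy_name} is deprecated; use 'gobblin.sh ${subcommand}' instead." >&2
  if [ "$#" -gt 0 ]; then
    echo "gobblin.sh ${subcommand} $*"
  else
    echo "gobblin.sh ${subcommand}"
  fi
}
```

A real wrapper kept under the old file name would call `exec` on the combined script with the translated subcommand and the original arguments rather than printing the command. As the comment notes, such a shim cannot guarantee full backward compatibility when parameter names are also being standardized.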
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248416=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248416 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/May/19 07:51
Start Date: 25/May/19 07:51
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287552780

## File path: bin/gobblin.sh ##
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash

Review comment:
Yes, I have updated the doc with the updated help and sample commands, and will keep updating it as I make changes. Thanks.

Issue Time Tracking
---
Worklog Id: (was: 248416)
Time Spent: 7h 50m (was: 7h 40m)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248403=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248403 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/May/19 06:33
Start Date: 25/May/19 06:33
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553190

## File path: bin/gobblin-admin.sh ##
@@ -1,142 +0,0 @@
-#!/bin/bash

Review comment:
I was aiming to clean up and simplify the many scripts Gobblin has, which could be confusing for new users. Users should really move to the new scripts, so I am not sure how much backward compatibility is required for scripts, as opposed to APIs; otherwise we will never be able to clean up. But I see your point, so maybe I will create a wrapper as you suggested, with a note to remove it in the future.

Issue Time Tracking
---
Worklog Id: (was: 248403)
Time Spent: 7h 40m (was: 7.5h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248402=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248402 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/May/19 06:25
Start Date: 25/May/19 06:25
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553047

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixRetriggeringJobCallable.java ##
@@ -152,7 +152,7 @@ public Void call() throws JobException {
 GobblinClusterConfigurationKeys.JOB_ALWAYS_DELETE, "false");
-try {
+try { //TODO: what is really the difference ?

Review comment:
Oh, this got in by mistake; it was my own code comment. I will remove it.

Issue Time Tracking
---
Worklog Id: (was: 248402)
Time Spent: 7.5h (was: 7h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248401=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248401 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/May/19 06:24
Start Date: 25/May/19 06:24
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287553020

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true

Review comment:
This is so that first-time users don't have to figure it out. If there is a specific reason not to turn it on by default, I can remove this config. I have added the configs that I had to add myself when I first set up Gobblin to get it working.

Issue Time Tracking
---
Worklog Id: (was: 248401)
Time Spent: 7h 20m (was: 7h 10m)
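Since the comment above is about which defaults ship in a mode's `application.conf`, a quick way to check what a given file enables can be handy. The sketch below is a hypothetical helper (`get_conf_value` is not part of Gobblin) and assumes flat `key=value` lines like those in the quoted hunk; nested or included configuration is not handled.

```shell
#!/usr/bin/env bash
# Sketch only: read a flat key=value setting from a conf file.
# get_conf_value is a hypothetical helper, not part of Gobblin.

get_conf_value() {
  # Usage: get_conf_value <file> <key>
  # Prints the last value assigned to <key>; fails if the key is absent.
  local file="$1" key="$2"
  awk -v k="$key" '
    /^[ \t]*#/ { next }                      # skip comment lines
    index($0, "=") > 0 {
      name  = substr($0, 1, index($0, "=") - 1)
      value = substr($0, index($0, "=") + 1)
      gsub(/^[ \t]+|[ \t]+$/, "", name)      # trim surrounding whitespace
      gsub(/^[ \t]+|[ \t]+$/, "", value)
      if (name == k) { result = value; found = 1 }
    }
    END { if (!found) exit 1; print result }
  ' "$file"
}
```

For example, `get_conf_value conf/cluster-master/application.conf admin.server.enabled` would print the value set for the UI flag added in the hunk, provided the file exists at that path.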
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248399=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248399 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 25/May/19 06:13 Start Date: 25/May/19 06:13 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287552780 ## File path: bin/gobblin.sh ## @@ -1,4 +1,4 @@ -#!/bin/bash +#!/usr/bin/env bash Review comment: yes, i have updated the doc with updated help and sample commands. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 248399) Time Spent: 7h (was: 6h 50m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 7h > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. 
> # features like GC & JVM params, log4j file selection, classpath
> calculation, etc. exist in some gobblin scripts but not all, adding to the
> inconsistent user experience.
> # maintaining a total of 13 scripts would be too much effort.
> Also, all the gobblin scripts share a lot of common code to handle params,
> start and stop services, status checks, pid handling, etc. Combining all the
> scripts into one not only makes maintenance easier but also brings clarity
> and consistency.
>
> Solution:
> 1. There can be one gobblin.sh script to handle all gobblin commands and
> deployment options as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean,
> historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr,
> service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-worker start|stop|status
> gobblin-mapreduce.sh -> gobblin mr start|stop|status
> gobblin-service.sh -> gobblin service start|stop|status
> gobblin-standalone.sh -> gobblin standalone start|stop|status
> gobblin-yarn.sh -> gobblin yarn start|stop|status
> # gobblin-compaction.sh and gobblin-env.sh have no matching entry in the
> # service values listed above
> {code}
>
> 2. Configs also need to be structured and deduplicated accordingly, so that
> it is clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and
> hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
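The single-script idea described in the issue can be sketched roughly as follows. This is only an illustration, not the script from the pull request: the function name `resolve_class` is hypothetical, and the mode names and class names are the ones quoted in the `bin/gobblin.sh` diff later in this thread (`mapreduce` and `service-manager`, rather than the `mr` and `service` values listed in the issue text).

```shell
#!/usr/bin/env bash
# Sketch only: map "<type> <mode>" to the Java main class to launch.
# resolve_class is a hypothetical helper; class names are taken from the
# bin/gobblin.sh diff quoted in the review thread.

resolve_class() {
  local mode_type="$1" mode="$2"
  case "$mode_type" in
    cli)
      # every cli command goes through the same entry point
      echo 'org.apache.gobblin.runtime.cli.GobblinCli' ;;
    service)
      case "$mode" in
        standalone)      echo 'org.apache.gobblin.scheduler.SchedulerDaemon' ;;
        cluster-master)  echo 'org.apache.gobblin.cluster.GobblinClusterManager' ;;
        cluster-worker)  echo 'org.apache.gobblin.cluster.GobblinTaskRunner' ;;
        aws)             echo 'org.apache.gobblin.aws.GobblinAWSClusterLauncher' ;;
        yarn)            echo 'org.apache.gobblin.yarn.GobblinYarnAppLauncher' ;;
        mapreduce)       echo 'org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher' ;;
        service-manager) echo 'org.apache.gobblin.service.modules.core.GobblinServiceManager' ;;
        *) echo "unknown service mode: $mode" >&2; return 1 ;;
      esac ;;
    *)
      echo "unknown mode type: $mode_type" >&2; return 1 ;;
  esac
}

# Example: print the class that "service yarn" would launch.
resolve_class service yarn
```

Keeping the mapping in one `case` statement means start, stop and status handling can be written once and shared by every mode, which is the consistency argument the issue makes.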
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248400=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248400 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/May/19 06:23
Start Date: 25/May/19 06:23
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287552990

## File path: bin/gobblin.sh ##
@@ -17,50 +17,488 @@
 # limitations under the License.
 #
-calling_dir() {
-  echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+# JAVA_HOME is required.
+if [[ -z "$JAVA_HOME" ]]; then
+echo -e "\nError: Environment variable JAVA_HOME not set!\n"
+exit 1
+fi
+
+# global vars
+
+GOBBLIN_VERSION=@project.version@
+GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)"
+GOBBLIN_LIB=${GOBBLIN_HOME}/lib
+GOBBLIN_BIN=${GOBBLIN_HOME}/bin
+GOBBLIN_LOGS=${GOBBLIN_HOME}/logs
+GOBBLIN_CONF=''
+
+#sourcing basic gobblin env vars like GOBBLIN_HOME and GOBBLIN_LIB
+. ${GOBBLIN_BIN}/gobblin-env.sh
+
+CLUSTER_NAME="gobblin_cluster"
+JVM_OPTS="-Xmx1g -Xms512m"
+LOG4J_FILE_PATH=''
+LOG4J_OPTS=''
+GOBBLIN_MODE=''
+ACTION=''
+JVM_FLAGS=''
+EXTRA_JARS=''
+VERBOSE=0
+ENABLE_GC_LOGS=0
+CMD_PARAMS=''
+
+
+# Gobblin Commands, Modes & respective Classes
+GOBBLIN_MODE_TYPE=''
+CLI='cli'
+SERVICE='service'
+
+# Commands
+JOB_STATE_TO_JSON_CMD='job-state-to-json'
+JOB_STORE_SCHEMA_MANAGER_CMD='job-store-schema-manager'
+CLASSPATH_CMD='classpath'
+
+# Execution Modes
+STANDALONE_MODE='standalone'
+CLUSTER_MASTER_MODE='cluster-master'
+CLUSTER_WORKER_MODE='cluster-worker'
+AWS_MODE='aws'
+YARN_MODE='yarn'
+MAPREDUCE_MODE='mapreduce'
+SERVICE_MANAGER_MODE='service-manager'
+
+GOBBLIN_EXEC_MODE_LIST="$STANDALONE_MODE $CLUSTER_MASTER_MODE $CLUSTER_WORKER_MODE $AWS_MODE $YARN_MODE $MAPREDUCE_MODE $SERVICE_MANAGER_MODE"
+
+# CLI Command class
+CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli'
+
+# Service Class
+STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon'
+CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager'
+CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner'
+AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher'
+YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher'
+MAPREDUCE_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher'
+SERVICE_MANAGER_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager'
+
+
+function print_gobblin_usage() {
+echo "Usage:"
+echo "gobblin.sh cli "
+echo "gobblin.sh service "
+echo ""
+echo "Use \"gobblin --help\" for more information. (Gobblin Version: $GOBBLIN_VERSION)"
+}
+
+function print_gobblin_cli_usage() {

Review comment:
Yes, `GobblinCli` has `printusage`, which lists all the alias classes, but it won't display the other help options. Also, it's better to have all help options at the script level rather than at the class level. Both are available, depending on the usage.

```
bin/gobblin cli somecommand
Could not find an application with alias somecommand
Available commands:
  job-state-to-json         To convert Job state to JSON
  jobs                      Command line job info and operations
  passwordManager           Encrypt or decrypt strings for the password manager.
  run                       Run a Gobblin application.
  decrypt                   Decryption utilities
  job-store-schema-manager  Database job history store schema manager
  stateMigration            Command line tools for migrating state store
  keystore                  Examine JCE Keystore files
  config                    Query the config library
  watermarks                Inspect streaming watermarks
  cleaner                   Data retention utility
```

Issue Time Tracking
---
Worklog Id: (was: 248400)
Time Spent: 7h 10m (was: 7h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 7h 10m
> Remaining Estimate: 0h
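As a rough illustration of script-level handling, the `GOBBLIN_EXEC_MODE_LIST` variable from the diff could be used to reject an unknown mode before any Java class is launched. The helper name `is_valid_mode` is an assumption made for this sketch and does not appear in the quoted diff.

```shell
#!/usr/bin/env bash
# Sketch only: validate a requested execution mode against the
# space-separated list defined in the quoted diff.
# is_valid_mode is a hypothetical helper name.

GOBBLIN_EXEC_MODE_LIST="standalone cluster-master cluster-worker aws yarn mapreduce service-manager"

# Returns 0 if $1 is one of the listed modes, 1 otherwise.
is_valid_mode() {
  local candidate="$1" mode
  # word splitting on the unquoted list is intentional here
  for mode in $GOBBLIN_EXEC_MODE_LIST; do
    [[ "$mode" == "$candidate" ]] && return 0
  done
  return 1
}

if is_valid_mode "yarn"; then
  echo "yarn: known mode"
fi
if ! is_valid_mode "cluster-mater"; then
  echo "cluster-mater: not a known mode" >&2
fi
```

A check like this lets the script print its own usage text for a mistyped mode, while unknown cli aliases can still be reported by `GobblinCli` itself, as in the output quoted above.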
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=248136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-248136 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 24/May/19 16:40
Start Date: 24/May/19 16:40
Worklog Time Spent: 10m
Work Description: htran1 commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287436214

## File path: bin/gobblin-admin.sh ##
@@ -1,142 +0,0 @@
-#!/bin/bash

Review comment:
How about maintaining compatibility with callers of the old script? One option is to leave the old scripts as wrapper scripts that call the new one. Another option is to have the old script names be symlinks to the new script and detect the usage based on the value of $0.

Issue Time Tracking
---
Worklog Id: (was: 248136)
Time Spent: 6h 50m (was: 6h 40m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
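The symlink option mentioned in the review comment could look roughly like the sketch below. It assumes the old script names are symlinked to the new script and that the new syntax is `service <mode>`; the helper name `legacy_args_for`, the example path, and the exact argument strings are illustrative assumptions, not taken from the pull request.

```shell
#!/usr/bin/env bash
# Sketch only: translate a legacy script name (as seen in $0 when the
# unified script is invoked through a symlink) into new-style arguments.
# legacy_args_for and the mappings below are hypothetical.

legacy_args_for() {
  local invoked_as
  invoked_as="$(basename "$1")"
  case "$invoked_as" in
    gobblin-standalone.sh)     echo "service standalone" ;;
    gobblin-cluster-master.sh) echo "service cluster-master" ;;
    gobblin-cluster-worker.sh) echo "service cluster-worker" ;;
    gobblin-yarn.sh)           echo "service yarn" ;;
    gobblin-mapreduce.sh)      echo "service mapreduce" ;;
    *)                         return 1 ;;  # not a legacy name
  esac
}

# In a real script one would pass "$0"; a fixed path is used for demonstration.
legacy_args_for "/opt/gobblin/bin/gobblin-yarn.sh"
```

The wrapper-script option is simpler to reason about, since each old script would just call the new one with fixed arguments, while the symlink approach avoids keeping extra files in sync.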
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247923=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247923 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 24/May/19 07:23
Start Date: 24/May/19 07:23
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287240383

## File path: gobblin-cluster/src/main/java/org/apache/gobblin/cluster/HelixRetriggeringJobCallable.java ##
@@ -152,7 +152,7 @@ public Void call() throws JobException {
 GobblinClusterConfigurationKeys.JOB_ALWAYS_DELETE, "false");
-try {
+try { //TODO: what is really the difference ?

Review comment:
Why is this relevant to this PR?

Issue Time Tracking
---
Worklog Id: (was: 247923)
Time Spent: 6h 40m (was: 6.5h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247922=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247922 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 24/May/19 07:23
Start Date: 24/May/19 07:23
Worklog Time Spent: 10m
Work Description: sv2000 commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r287242524

## File path: bin/gobblin.sh ##
@@ -1,4 +1,4 @@
-#!/bin/bash
+#!/usr/bin/env bash

Review comment:
@jhsenjaliya since this is a big change, and since we use the existing shell scripts in some of our internal use cases, it would be great if you could provide documentation on how to run common commands using the new script.

Issue Time Tracking
---
Worklog Id: (was: 247922)
Time Spent: 6.5h (was: 6h 20m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6.5h
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247138 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 22/May/19 23:08
Start Date: 22/May/19 23:08
Worklog Time Spent: 10m
Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705253

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true
+admin.server.port=9000
+
+# is this required/redundent ?
+rest.server.host=localhost
+rest.server.port=9090
+
+# job history store
+job.execinfo.server.enabled=true

Review comment:
Ditto

Issue Time Tracking
---
Worklog Id: (was: 247138)
Time Spent: 6h 20m (was: 6h 10m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 20m
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247135 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 22/May/19 23:08
Start Date: 22/May/19 23:08
Worklog Time Spent: 10m
Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705377

## File path: conf/cluster-worker/application.conf ##
@@ -69,3 +69,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+failure.log.dir=${gobblin.cluster.work.dir}/failure-logs
+
+# UI
+admin.server.enabled=false
+# admin.server.port=9000
+
+rest.server.host=localhost
+rest.server.port=9090
+
+# job history store ( WARN [GobblinYarnAppLauncher] NOT starting the admin UI because the job execution info server is NOT enabled )
+job.execinfo.server.enabled=true

Review comment:
Why enabled by default?

Issue Time Tracking
---
Worklog Id: (was: 247135)
Time Spent: 6h 10m (was: 6h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247136 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 22/May/19 23:08
Start Date: 22/May/19 23:08
Worklog Time Spent: 10m
Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286703827

## File path: bin/gobblin.sh ##
@@ -17,50 +17,488 @@
 # limitations under the License.
 #
-calling_dir() {
-  echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+# JAVA_HOME is required.
+if [[ -z "$JAVA_HOME" ]]; then
+echo -e "\nError: Environment variable JAVA_HOME not set!\n"
+exit 1
+fi
+
+# global vars
+
+GOBBLIN_VERSION=@project.version@
+GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)"
+GOBBLIN_LIB=${GOBBLIN_HOME}/lib
+GOBBLIN_BIN=${GOBBLIN_HOME}/bin
+GOBBLIN_LOGS=${GOBBLIN_HOME}/logs
+GOBBLIN_CONF=''
+
+#sourcing basic gobblin env vars like GOBBLIN_HOME and GOBBLIN_LIB
+. ${GOBBLIN_BIN}/gobblin-env.sh
+
+CLUSTER_NAME="gobblin_cluster"
+JVM_OPTS="-Xmx1g -Xms512m"
+LOG4J_FILE_PATH=''
+LOG4J_OPTS=''
+GOBBLIN_MODE=''
+ACTION=''
+JVM_FLAGS=''
+EXTRA_JARS=''
+VERBOSE=0
+ENABLE_GC_LOGS=0
+CMD_PARAMS=''
+
+
+# Gobblin Commands, Modes & respective Classes
+GOBBLIN_MODE_TYPE=''
+CLI='cli'
+SERVICE='service'
+
+# Commands
+JOB_STATE_TO_JSON_CMD='job-state-to-json'
+JOB_STORE_SCHEMA_MANAGER_CMD='job-store-schema-manager'
+CLASSPATH_CMD='classpath'
+
+# Execution Modes
+STANDALONE_MODE='standalone'
+CLUSTER_MASTER_MODE='cluster-master'
+CLUSTER_WORKER_MODE='cluster-worker'
+AWS_MODE='aws'
+YARN_MODE='yarn'
+MAPREDUCE_MODE='mapreduce'
+SERVICE_MANAGER_MODE='service-manager'
+
+GOBBLIN_EXEC_MODE_LIST="$STANDALONE_MODE $CLUSTER_MASTER_MODE $CLUSTER_WORKER_MODE $AWS_MODE $YARN_MODE $MAPREDUCE_MODE $SERVICE_MANAGER_MODE"
+
+# CLI Command class
+CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli'
+
+# Service Class
+STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon'
+CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager'
+CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner'
+AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher'
+YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher'
+MAPREDUCE_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher'
+SERVICE_MANAGER_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager'
+
+
+function print_gobblin_usage() {
+echo "Usage:"
+echo "gobblin.sh cli "
+echo "gobblin.sh service "
+echo ""
+echo "Use \"gobblin --help\" for more information. (Gobblin Version: $GOBBLIN_VERSION)"
+}
+
+function print_gobblin_cli_usage() {

Review comment:
Why is this needed? `GobblinCli` should be able to automatically generate this usage info.

Issue Time Tracking
---
Worklog Id: (was: 247136)
Time Spent: 6h 10m (was: 6h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>
> Gobblin supports multiple modes of execution (CLI, Standalone,
> cluster-master, cluster-worker, AWS, YARN, MR) and various command-line
> utilities to run cli and admin commands. There is an individual script for
> each of them.
> Having individual scripts introduces a lot of issues:
> # all scripts handle gobblin variables and user parameters differently, and
> this is highly inconsistent among the various gobblin scripts
> # functionality around start, stop, status checking and handling PIDs, among
> a lot of other things, varies vastly as per the implementation of the script.
> # features like GC & JVM params, log4j file selection, classpath
> calculation, etc. exist in some gobblin scripts but not all, adding to the
> inconsistent user experience.
> # maintaining a total of 13 scripts would be too much effort.
> Also, all the gobblin scripts share a lot of common code to handle params,
> start and stop services, status checks, pid handling, etc. combining all the
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247137 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 22/May/19 23:08
Start Date: 22/May/19 23:08
Worklog Time Spent: 10m
Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705209

    ## File path: conf/cluster-master/application.conf ##
    @@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
     # Enable metrics / events
     metrics.enabled=true
    +# UI
    +admin.server.enabled=true

Review comment:
Why do we want admin server enabled by default?

Issue Time Tracking
-------------------

Worklog Id: (was: 247137)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> ---
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 6h 10m
> Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR) and various command-line utilities to run CLI and admin commands. There is an individual script for each of them.
> Having individual scripts introduces a lot of issues:
> # All scripts handle Gobblin variables and user parameters differently, and the handling is highly inconsistent across the various Gobblin scripts.
> # Functionality around start, stop, status checking, and PID handling, among many other things, varies widely depending on each script's implementation.
> # Features like GC and JVM params, log4j file selection, classpath calculation, etc. exist in some Gobblin scripts but not all, adding to an inconsistent user experience.
> # Maintaining 13 scripts in total would be too much effort.
> Also, all the Gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> Solution:
> 1. There can be one gobblin.sh script to handle all Gobblin commands and deployment options, as per the following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> With the above change, the following become valid commands.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also, configs need to be structured and deduplicated accordingly, to make it clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
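The issue description notes that status checking and PID handling differ from script to script. As a rough illustration of what a single shared check could look like, here is a minimal sketch. It assumes each service writes its process ID to a PID file; the function name `pid_status` and the PID-file argument are hypothetical and do not come from the PR.

```bash
# Sketch only: one status check that every service mode could share.
# pid_status and its PID-file argument are hypothetical names.
pid_status() {
  pid_file="$1"
  if [ ! -f "$pid_file" ]; then
    echo "stopped"
    return 1
  fi
  pid="$(cat "$pid_file")"
  # kill -0 delivers no signal; it only reports whether the process
  # exists and whether we are permitted to signal it.
  if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
    echo "running (pid $pid)"
    return 0
  fi
  echo "stopped (stale pid file)"
  return 1
}
```

A `status` action could print this function's output directly, and a `start` action could call it first to refuse a double start.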
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=244606&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-244606 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 19/May/19 01:11
Start Date: 19/May/19 01:11
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-493718575

@ibuenros, @autumnust since our last discussion, I have also brought all CLI and command classes under `GobblinCli` with aliases, so it is much better organized now, which also helps clarify the CLI options.
I would suggest checking out this PR locally and playing around with it to see how this works. Thanks

Issue Time Tracking
-------------------

Worklog Id: (was: 244606)
Time Spent: 6h (was: 5h 50m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=237116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-237116 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 03/May/19 23:12
Start Date: 03/May/19 23:12
Worklog Time Spent: 10m
Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r280953068

    ## File path: bin/gobblin.sh ##
    @@ -1,4 +1,4 @@
    -#!/bin/bash
    +#!/usr/bin/env bash

Review comment:
Can we leave `gobblin.sh` relatively simple and instead have `gobblin-cli.sh` and `gobblin-service.sh`? `gobblin.sh` would just redirect to the correct place depending on the first argument.

Issue Time Tracking
-------------------

Worklog Id: (was: 237116)
Time Spent: 5h 50m (was: 5h 40m)
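The review comment above proposes a thin `gobblin.sh` that only redirects on its first argument. A minimal sketch of that routing step might look like the following. The script names `gobblin-cli.sh` and `gobblin-service.sh` come from the comment; the helper `resolve_delegate` and the usage string are assumptions made for illustration, not what the PR ended up doing.

```bash
# Sketch only: choose the delegate script from the first argument.
# gobblin-cli.sh and gobblin-service.sh are the names proposed in the
# review comment; resolve_delegate is a hypothetical helper.
resolve_delegate() {
  case "$1" in
    cli)     echo "gobblin-cli.sh" ;;
    service) echo "gobblin-service.sh" ;;
    *)       echo "Usage: gobblin.sh <cli|service> [args...]" >&2; return 1 ;;
  esac
}

# A thin gobblin.sh could then end with something like (assumption):
#   delegate="$(resolve_delegate "$1")" || exit 1
#   shift
#   exec "$(dirname "$0")/$delegate" "$@"
```

With `exec`, the delegate replaces the dispatcher process, so exit codes and signals pass straight through to the caller.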
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=236098&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-236098 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 02/May/19 02:42
Start Date: 02/May/19 02:42
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-488537624

1 out of 4 test environments has failed; will look into it.

Issue Time Tracking
-------------------

Worklog Id: (was: 236098)
Time Spent: 5h 40m (was: 5.5h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=236061&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-236061 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 01/May/19 23:41
Start Date: 01/May/19 23:41
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-488498198

@ibuenros, @autumnust, would you please take a look? Thanks

Issue Time Tracking
-------------------

Worklog Id: (was: 236061)
Time Spent: 5.5h (was: 5h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=231807&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-231807 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 24/Apr/19 00:45
Start Date: 24/Apr/19 00:45
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-486024787

@autumnust, I updated the docs and also added new information on the usage of gobblin.sh. Please take a look when you get a chance. Thanks

Issue Time Tracking
-------------------

Worklog Id: (was: 231807)
Time Spent: 5h 20m (was: 5h 10m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=228081&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-228081 ]

ASF GitHub Bot logged work on GOBBLIN-707:
------------------------------------------

Author: ASF GitHub Bot
Created on: 16/Apr/19 02:19
Start Date: 16/Apr/19 02:19
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275608262

    ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ##
    @@ -28,29 +28,29 @@ Gobblin ingestion applications
     Gobblin ingestion applications can be accessed through the command `run`:
     ```bash
    -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS]
    +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS]

Review comment:
Hi @ibuenros, I have added the mapping from the old gobblin scripts to the new gobblin commands in the JIRA ticket and also started an email thread, "GOBBLIN-707 Review". Would you please take a look and reply there? Thanks

Issue Time Tracking
-------------------

Worklog Id: (was: 228081)
Time Spent: 5h 10m (was: 5h)
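The old-script-to-new-command mapping mentioned in this thread could, in principle, also be expressed as a small lookup, for example to print a migration hint when someone still calls a removed script. The sketch below covers only the four utility scripts whose targets are listed unambiguously in the issue description; the function name `legacy_to_new_command` is hypothetical and nothing like it is confirmed to exist in the PR.

```bash
# Sketch only: translate a removed script name into the new command.
# The pairs are copied from the mapping in the issue description;
# legacy_to_new_command is a hypothetical helper.
legacy_to_new_command() {
  case "$1" in
    statestore-checker.sh)   echo "gobblin statestore-checker" ;;
    statestore-clean.sh)     echo "gobblin statestore-clean" ;;
    historystore-manager.sh) echo "gobblin historystore-manager" ;;
    gobblin-admin.sh)        echo "gobblin admin" ;;
    *)                       return 1 ;;
  esac
}
```

For instance, `legacy_to_new_command gobblin-admin.sh` prints `gobblin admin`, and an unrecognized name returns a non-zero status with no output.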
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=227027=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227027 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 23:56 Start Date: 12/Apr/19 23:56 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275093655 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: @ibuenros , lets take this offline over email. Thanks This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 227027) Time Spent: 5h (was: 4h 50m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 5h > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is a individual > script for each of them. 
> Having individual scripts introduces a lot of issues:
> # All scripts handle gobblin variables and user parameters differently, which is highly inconsistent across the gobblin scripts.
> # Functionality around start, stop, status checking, and PID handling, among many other things, varies widely with each script's implementation.
> # Features like GC & JVM params, log4j file selection, classpath calculation, etc. exist in some gobblin scripts but not all, adding to an inconsistent user experience.
> # Maintaining a total of 13 scripts would be too much effort.
>
> All the gobblin scripts share a lot of common code to handle params, start and stop services, status checks, PID handling, etc. Combining all the scripts into one not only makes maintenance easier but also brings clarity and consistency.
>
> 1. There can be one gobblin.sh script:
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> 2. Also, configs need to be structured and deduped accordingly.
>
> -- This message was sent by Atlassian JIRA (v7.6.3#76005)
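The split into "commands" and "services" described above could be sketched in shell roughly as follows. This is only an illustration under assumptions: the function names `classify_target` and `validate_action` are hypothetical and not taken from the PR; the two value lists are copied from the issue description, and the `start|stop|status` actions come from the mapping table in the ticket.

```bash
# Hypothetical sketch; function and variable names are illustrative only.

# Values copied from the issue description.
GOBBLIN_COMMANDS="admin cli statestore-check statestore-clean historystore-manager"
GOBBLIN_SERVICES="standalone cluster-master cluster-worker aws yarn mr service"

# Print "command", "service", or "unknown" for the given first argument.
classify_target() {
  local target="$1" item
  for item in $GOBBLIN_COMMANDS; do
    if [ "$item" = "$target" ]; then echo "command"; return 0; fi
  done
  for item in $GOBBLIN_SERVICES; do
    if [ "$item" = "$target" ]; then echo "service"; return 0; fi
  done
  echo "unknown"
  return 1
}

# Services additionally take a lifecycle action.
validate_action() {
  case "$1" in
    start|stop|status) return 0 ;;
    *) return 1 ;;
  esac
}
```

A single entry point can then branch once on the result of `classify_target` and share parameter parsing, logging, and JVM setup between both kinds of target, which is the consistency argument the ticket makes.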
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=227025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227025 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 23:54 Start Date: 12/Apr/19 23:54 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275089641 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: Each removed shell script has been folded into the updated `gobblin.sh`. If you pull this PR and try some of the commands, it will be clear how things work with these updates. The git diff is a bit confusing for such a large PR. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 227025) Time Spent: 4h 50m (was: 4h 40m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4h 50m > Remaining Estimate: 0h > > gobblin supports multiple modes of execution ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is an individual > script for each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=227016=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-227016 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 23:21 Start Date: 12/Apr/19 23:21 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275089641 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: The plan is to handle it at the bash script level and call `GobblinCli` with the appropriate args for each command. By the way, each removed shell script can still be executed via the updated `gobblin.sh`. If you pull this PR and try some of the commands, it will be clear how things work with these updates. The git diff is a bit confusing for such a large PR. Thanks. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 227016) Time Spent: 4h 40m (was: 4.5h) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4h 40m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is a individual > script for each of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > > All the gobblin scripts share lot of common code to handle params, start, > stop services, status checks, pid handling, etc... combining all the scripts > into 1 not only makes maintenance easier but also brings clarity and > consistency. > > 1. there can be one gobblin.sh script > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > 2. Also configs needs to be structured and deduped accordingly. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
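The bash-level handling mentioned in the review comment above (keeping `gobblin run ...` working while also accepting `gobblin cli run ...`) might look something like the sketch below. Everything here is an assumption for illustration: the function `build_cli_args` is hypothetical, `CLI_MAIN_CLASS` holds only the bare class name because the thread does not give the package, and the list of top-level aliases is limited to the three commands named in the discussion.

```bash
# Hypothetical sketch; not the PR's implementation.

# Bare class name as mentioned in the thread; the package is not stated there.
CLI_MAIN_CLASS="GobblinCli"

# Print the main class and its arguments, one per line, for a user-facing command.
build_cli_args() {
  [ "$#" -gt 0 ] || return 1
  local cmd="$1"
  shift
  case "$cmd" in
    run|watermarks|passwordManager)
      # Top-level aliases: forwarded unchanged so existing command lines keep working.
      printf '%s\n' "$CLI_MAIN_CLASS" "$cmd" "$@"
      ;;
    cli)
      # `gobblin cli run ...` drops the `cli` prefix before forwarding.
      printf '%s\n' "$CLI_MAIN_CLASS" "$@"
      ;;
    *)
      return 1
      ;;
  esac
}
```

With this shape, `build_cli_args run listQuickApps` and `build_cli_args cli run listQuickApps` yield the same argument list, so both spellings could coexist during a transition.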
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226973=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226973 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 22:37 Start Date: 12/Apr/19 22:37 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275083422 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: What is the proposal for how to do that, though? Actually, I'm a bit confused by this PR: a lot of shell scripts are removed, but there seems to be no replacement. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226973) Time Spent: 4.5h (was: 4h 20m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4.5h > Remaining Estimate: 0h > > gobblin supports multiple modes of execution ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is an individual > script for each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226949=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226949 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 21:39 Start Date: 12/Apr/19 21:39 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275072150 ## File path: bin/gobblin.sh ## @@ -17,50 +17,435 @@ # limitations under the License. # -calling_dir() { - echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" -} -classpath() { - DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" - - for i in `ls $DIR/../lib` - do - if [[ $i != hadoop* ]] - then -CLASSPATH=${CLASSPATH:+${CLASSPATH}:}$DIR/../lib/$i - else - HADOOP_CLASSPATH=${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}$DIR/../lib/$i - fi - done - - if [ ! -z "$HADOOP_HOME" ] && [ -f $HADOOP_HOME/bin/hadoop ] - then -HADOOP_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath) - fi - - CLASSPATH=$CLASSPATH:$HADOOP_CLASSPATH - - if [ ! -z "$GOBBLIN_ADDITIONAL_JARS" ] - then - CLASSPATH=$GOBBLIN_ADDITIONAL_JARS:$CLASSPATH - fi - - echo $CLASSPATH +# JAVA_HOME is required. +if [[ -z "$JAVA_HOME" ]]; then +echo -e "\nError: Environment variable JAVA_HOME not set!\n" +exit 1 +fi + +# global vars + +GOBBLIN_VERSION="0.15.0" Review comment: will replace this with @project.version@ to get it placed by gradle. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226949) Time Spent: 4h 20m (was: 4h 10m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4h 20m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is a individual > script for each of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > > All the gobblin scripts share lot of common code to handle params, start, > stop services, status checks, pid handling, etc... combining all the scripts > into 1 not only makes maintenance easier but also brings clarity and > consistency. > > 1. there can be one gobblin.sh script > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > 2. Also configs needs to be structured and deduped accordingly. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
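For reference, the classpath logic that the diff above removes (non-Hadoop jars first, `hadoop*` jars kept separate, optional `GOBBLIN_ADDITIONAL_JARS` prepended) can be written without parsing `ls` output. The following is a rough sketch, not the code from the PR: the function name `build_classpath` is hypothetical, it takes the lib directory as an argument instead of deriving it from `BASH_SOURCE`, and it deliberately leaves out the `$HADOOP_HOME/bin/hadoop classpath` lookup that the removed function performed.

```bash
# Hypothetical sketch; simplified from the removed classpath() function.

# Join the jars under a lib directory into one colon-separated classpath,
# placing hadoop* jars after all other jars.
build_classpath() {
  local lib_dir="$1" jar name cp="" hadoop_cp=""
  for jar in "$lib_dir"/*; do
    # If the directory is empty the glob stays literal; skip it.
    [ -e "$jar" ] || continue
    name="${jar##*/}"
    case "$name" in
      hadoop*) hadoop_cp="${hadoop_cp:+${hadoop_cp}:}$jar" ;;
      *)       cp="${cp:+${cp}:}$jar" ;;
    esac
  done
  # Append the Hadoop jars, adding a separator only when both parts are non-empty.
  cp="${cp}${hadoop_cp:+${cp:+:}${hadoop_cp}}"
  # Extra jars supplied by the user take precedence, as in the removed code.
  if [ -n "${GOBBLIN_ADDITIONAL_JARS:-}" ]; then
    cp="${GOBBLIN_ADDITIONAL_JARS}${cp:+:${cp}}"
  fi
  echo "$cp"
}
```

Iterating over a quoted glob keeps file names with spaces intact, and the `${var:+...}` expansions avoid the stray leading or trailing colons that plain concatenation can produce.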
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226934=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226934 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 21:16 Start Date: 12/Apr/19 21:16 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275060881 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. Is `gobblin admin ` fine? Is your comment only about `gobblin cli run `? Then I can bring the `run`, `watermarks`, `passwordManager`, etc. commands up to the gobblin.sh level so that the command signature won't change and it will remain `gobblin run `. Does that make sense? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226934) Time Spent: 4h 10m (was: 4h) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4h 10m > Remaining Estimate: 0h > > gobblin supports multiple modes of execution ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is an individual > script for each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226929=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226929 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 21:00 Start Date: 12/Apr/19 21:00 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275060881 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. Is `gobblin admin ` fine? Is it only about `gobblin cli run `? Then I can bring the `run`, `watermarks`, `passwordManager`, etc. commands up to the gobblin.sh level so that the command signature won't change and it will remain `gobblin run `. Does that make sense? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226929) Time Spent: 4h (was: 3h 50m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 4h > Remaining Estimate: 0h > > gobblin supports multiple modes of execution ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is an individual > script for each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226925=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226925 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 20:53 Start Date: 12/Apr/19 20:53 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275060881 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. Is `gobblin admin ` fine? Is it only about `gobblin cli run `? Then I can bring the `run`, `watermarks`, `passwordManager`, etc. commands up to the gobblin.sh level so that the command signature won't change from the existing one. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226925) Time Spent: 3h 50m (was: 3h 40m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 3h 50m > Remaining Estimate: 0h > > gobblin supports multiple modes of execution ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is an individual > script for each of them.
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226918=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226918 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 20:44 Start Date: 12/Apr/19 20:44 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275058240 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: My point is that there is no reason to move all of those applications down one level. Instead, everything else that you would want to add to the `bin/gobblin` command should be implemented as a `CliApplication`. I do see the benefit of aggregating all functionality, I just think it should be done without changing current commands. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226918) Time Spent: 3h 40m (was: 3.5h) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 3h 40m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is a individual > script for each of them. 
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226913=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226913 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 20:37 Start Date: 12/Apr/19 20:37 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275056186 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: Hi @ibuenros, the `cli` has a lot of options other than `run`, like `passwordManager`, `watermarks`, etc., so we can keep it under the `cli` category for now. In the future, those options can be added directly to `gobblin.sh` if we really want to have the `run` command directly under gobblin without `cli`. The problem I'm trying to solve here is mostly the inconsistency and lack of clarity in the functionality of all the gobblin scripts. I have also added more detail to the GOBBLIN-707 JIRA ticket to explain why we should combine all these scripts. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 226913) Time Spent: 3h 20m (was: 3h 10m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 3h 20m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and there is a individual > script for each of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > > All the gobblin scripts share lot of common code to handle params, start, > stop services, status checks, pid handling, etc... combining all the scripts > into 1 not only makes maintenance easier but also brings clarity and > consistency. > > 1. there can be one gobblin.sh script > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > 2. Also configs needs to be structured and deduped accordingly. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226917=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226917 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 20:43 Start Date: 12/Apr/19 20:43 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r275056186 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: Hi @ibuenros, the `cli` has a lot of options other than `run`, like `passwordManager`, `watermarks`, etc., so we can keep it under the `cli` category for now. In the future, those options can be added directly to `gobblin.sh` if we really want to have the `run` command directly under gobblin without `cli`. By the way, bringing all possible command line options up to the `gobblin.sh` level could be too much aggregation. The problem I'm trying to solve here is mostly the inconsistency and lack of clarity in the functionality of all the gobblin scripts. I have also added more detail to the GOBBLIN-707 JIRA ticket to explain why we should combine all these scripts. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
Issue Time Tracking --- Worklog Id: (was: 226917) Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226865=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226865 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 19:20 Start Date: 12/Apr/19 19:20 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-482692999 @autumnust, thanks for checking on that. I will push some changes to re-trigger it after some review. Issue Time Tracking --- Worklog Id: (was: 226865) Time Spent: 3h (was: 2h 50m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226478=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226478 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 04:28 Start Date: 12/Apr/19 04:28 Worklog Time Spent: 10m Work Description: autumnust commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-482432842 @jhsenjaliya There seems to be a Travis issue, but I am not sure if it is related to your changes. Issue Time Tracking --- Worklog Id: (was: 226478) Time Spent: 2h 50m (was: 2h 40m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=226461=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-226461 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 12/Apr/19 02:41 Start Date: 12/Apr/19 02:41 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-482414899 @autumnust, I have updated and refactored the doc. Please take a look. Issue Time Tracking --- Worklog Id: (was: 226461) Time Spent: 2h 40m (was: 2.5h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=224485=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-224485 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 08/Apr/19 17:22 Start Date: 08/Apr/19 17:22 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on issue #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#issuecomment-480924487 Hi @autumnust, I am working on adding and refactoring documentation for commands and deployments. Once #2586 is merged, I will add the commit here. Issue Time Tracking --- Worklog Id: (was: 224485) Time Spent: 2.5h (was: 2h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=223440=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-223440 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 05/Apr/19 05:09 Start Date: 05/Apr/19 05:09 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272445741 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: OK, sure. It requires a lot of doc changes and some reorganization, which I can take care of, but can we get #2586 merged first? Otherwise I'll have a lot of conflicts. Issue Time Tracking --- Worklog Id: (was: 223440) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222822=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222822 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 04/Apr/19 05:58 Start Date: 04/Apr/19 05:58 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272026110 ## File path: conf/yarn/application.conf ## @@ -22,15 +22,18 @@ gobblin.yarn.app.name=GobblinYarn gobblin.yarn.app.master.memory.mbs=256 gobblin.yarn.initial.containers=2 gobblin.yarn.container.memory.mbs=512 -gobblin.yarn.conf.dir= -gobblin.yarn.lib.jars.dir= -gobblin.yarn.app.master.files.local=${gobblin.yarn.conf.dir}"/log4j-yarn.properties,"${gobblin.yarn.conf.dir}"/application.conf,"${gobblin.yarn.conf.dir}"/reference.conf" +gobblin.yarn.conf.dir=/tools/gobblin-dist/conf/yarn/ Review comment: I missed this; let me change it to `gobblin.yarn.conf.dir=${GOBBLIN_HOME}/conf/yarn/`, which will be better than having ... By the way, thanks for catching this; this was my local config. Issue Time Tracking --- Worklog Id: (was: 222822) Time Spent: 2h 10m (was: 2h)
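The idea in the comment above, deriving the YARN conf directory from `GOBBLIN_HOME` rather than committing a machine-specific path, can be sketched in shell as follows. This is only an illustration under stated assumptions: the helper name `yarn_conf_dir` is hypothetical and is not taken from the actual `gobblin.sh` or from the pull request.

```shell
# Hypothetical helper (not from the real gobblin.sh): build the YARN conf
# directory from GOBBLIN_HOME instead of hard-coding a local path such as
# /tools/gobblin-dist/conf/yarn/.
yarn_conf_dir() {
  # Fail early if the environment does not say where Gobblin is installed.
  if [ -z "${GOBBLIN_HOME:-}" ]; then
    echo "GOBBLIN_HOME is not set" >&2
    return 1
  fi
  # Strip one trailing slash so the result never contains "//".
  echo "${GOBBLIN_HOME%/}/conf/yarn/"
}
```

A caller could then pass the result on, for example `conf_dir="$(yarn_conf_dir)" || exit 1`, so that the same script works on any host where `GOBBLIN_HOME` is defined.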
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222821=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222821 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 04/Apr/19 05:56 Start Date: 04/Apr/19 05:56 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272026110 ## File path: conf/yarn/application.conf ## @@ -22,15 +22,18 @@ gobblin.yarn.app.name=GobblinYarn gobblin.yarn.app.master.memory.mbs=256 gobblin.yarn.initial.containers=2 gobblin.yarn.container.memory.mbs=512 -gobblin.yarn.conf.dir= -gobblin.yarn.lib.jars.dir= -gobblin.yarn.app.master.files.local=${gobblin.yarn.conf.dir}"/log4j-yarn.properties,"${gobblin.yarn.conf.dir}"/application.conf,"${gobblin.yarn.conf.dir}"/reference.conf" +gobblin.yarn.conf.dir=/tools/gobblin-dist/conf/yarn/ Review comment: I missed this; let me change it to `gobblin.yarn.conf.dir=${GOBBLIN_HOME}/conf/yarn/`. Thanks for catching this; this was my local config. Issue Time Tracking --- Worklog Id: (was: 222821) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222820=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222820 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 04/Apr/19 05:52 Start Date: 04/Apr/19 05:52 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r272025514 ## File path: conf/yarn/reference.conf ## @@ -38,6 +38,6 @@ gobblin.yarn.work.dir=/gobblin gobblin.cluster.helix.cluster.name=GobblinYarn gobblin.cluster.zk.connection.string="localhost:2181" -fs.uri="hdfs://localhost:9000" Review comment: Yes, both are acceptable. I am just changing it to 8020 as the default, which I believe most people use. I can change it back to 9000 if that is not the case, no problem. Issue Time Tracking --- Worklog Id: (was: 222820) Time Spent: 1h 50m (was: 1h 40m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222609 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 03/Apr/19 20:46 Start Date: 03/Apr/19 20:46 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r271922811 ## File path: conf/yarn/reference.conf ## @@ -38,6 +38,6 @@ gobblin.yarn.work.dir=/gobblin gobblin.cluster.helix.cluster.name=GobblinYarn gobblin.cluster.zk.connection.string="localhost:2181" -fs.uri="hdfs://localhost:9000" Review comment: Just curious, why does the port number need to be changed here? It seems both 8020 and 9000 can be the default port number for NameNode IPC. Issue Time Tracking --- Worklog Id: (was: 222609) Time Spent: 1.5h (was: 1h 20m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222611=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222611 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 03/Apr/19 20:46 Start Date: 03/Apr/19 20:46 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r271923038 ## File path: conf/yarn/application.conf ## @@ -22,15 +22,18 @@ gobblin.yarn.app.name=GobblinYarn gobblin.yarn.app.master.memory.mbs=256 gobblin.yarn.initial.containers=2 gobblin.yarn.container.memory.mbs=512 -gobblin.yarn.conf.dir= -gobblin.yarn.lib.jars.dir= -gobblin.yarn.app.master.files.local=${gobblin.yarn.conf.dir}"/log4j-yarn.properties,"${gobblin.yarn.conf.dir}"/application.conf,"${gobblin.yarn.conf.dir}"/reference.conf" +gobblin.yarn.conf.dir=/tools/gobblin-dist/conf/yarn/ Review comment: Why is this being hard-coded? Issue Time Tracking --- Worklog Id: (was: 222611) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222610=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222610 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 03/Apr/19 20:46 Start Date: 03/Apr/19 20:46 Worklog Time Spent: 10m Work Description: autumnust commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r271925108 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: Sorry for getting back to this late. Yes, that makes sense to me. I see them in the `printUsage()` method of `gobblin.sh`. Can you add or edit documentation to mention the existence of `gobblin.sh`, so that newer users will be aware of it? Issue Time Tracking --- Worklog Id: (was: 222610) Time Spent: 1h 40m (was: 1.5h)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=222117=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-222117 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 03/Apr/19 01:16 Start Date: 03/Apr/19 01:16 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r271550178 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: @autumnust, does that make sense? Issue Time Tracking --- Worklog Id: (was: 222117) Time Spent: 1h 20m (was: 1h 10m)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=218258=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-218258 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 25/Mar/19 20:53 Start Date: 25/Mar/19 20:53 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r268845649 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: @autumnust, with this change I am aiming to combine all services and commands, because the config and other management is the same for all services and commands. Since `gobblin` is just a link to `gobblin.sh`, with this change it will handle `cli` as a command. So the following becomes the new way of using the gobblin script for various activities: `gobblin.sh ` `gobblin.sh ` commands values: `admin, cli, statestore-check, statestore-clean, historystore-manager` service values: `standalone, cluster-master, cluster-worker, aws, yarn, mr, service` Issue Time Tracking --- Worklog Id: (was: 218258) Time Spent: 50m (was: 40m)
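The command/service split described in the comment above can be sketched as a small shell function. This is a minimal illustration, assuming only the command and service names quoted in the thread; the function name `classify_target` and its output strings are hypothetical and do not come from the actual `gobblin.sh`.

```shell
# Hypothetical sketch: decide whether the first argument of a unified
# launcher names a command or a service. The two value lists are copied
# from the review comment; everything else is an assumption.
classify_target() {
  case "$1" in
    admin|cli|statestore-check|statestore-clean|historystore-manager)
      echo "command" ;;
    standalone|cluster-master|cluster-worker|aws|yarn|mr|service)
      echo "service" ;;
    *)
      # Anything else is rejected so the caller can print usage and exit.
      echo "unknown"
      return 1 ;;
  esac
}
```

A single entry point could call `classify_target "$1"` once and then share the same parameter, classpath, and PID handling for both branches, which is the consistency argument made in the ticket.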
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=218257=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-218257 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 25/Mar/19 20:52 Start Date: 25/Mar/19 20:52 Worklog Time Spent: 10m Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r268845649 ## File path: gobblin-docs/user-guide/Gobblin-CLI.md ## @@ -28,29 +28,29 @@ Gobblin ingestion applications Gobblin ingestion applications can be accessed through the command `run`: ```bash -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS] +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS] Review comment: @autumnust, with this change, I am targetting to combine all services and commands together because the config and other mgmt is the same for all services and commands. since the gobblin is just link to gobblin.sh, with this change it will handle the cli as a command. so following becomes the new way of using gobblin script for various activities. `gobblin.sh gobblin.sh commands values: admin, cli, statestore-check, statestore-clean, historystore-manager service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 218257)
Time Spent: 40m (was: 0.5h)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR), and there is an individual script for each of them.
> 1. There can be one gobblin.sh script:
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> 2. Configs also need to be structured and deduplicated accordingly.
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
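The command/service split described in the review comment can be illustrated with a small sketch. It is not the script from the PR: the two value lists are copied from the comment, while the function name `mode_type` and the strings it prints are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical sketch: decide whether the first argument of a unified
# launcher names a "command" or a "service". The value lists come from the
# review comment; mode_type and its output strings are illustrative only.

COMMANDS="admin cli statestore-check statestore-clean historystore-manager"
SERVICES="standalone cluster-master cluster-worker aws yarn mr service"

# Prints "command", "service", or "unknown" for the given mode name.
# Returns non-zero when the name is not recognized.
mode_type() {
  for candidate in $COMMANDS; do
    if [ "$candidate" = "$1" ]; then
      echo "command"
      return 0
    fi
  done
  for candidate in $SERVICES; do
    if [ "$candidate" = "$1" ]; then
      echo "service"
      return 0
    fi
  done
  echo "unknown"
  return 1
}
```

A launcher built this way could branch on the result, for example running a command class in the foreground and handing services to start/stop/status handling. How the actual PR performs this dispatch is not shown in this message.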
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=218234=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-218234 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 25/Mar/19 20:08
Start Date: 25/Mar/19 20:08
Worklog Time Spent: 10m
Work Description: autumnust commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r268778696

## File path: gobblin-docs/user-guide/Gobblin-CLI.md
## @@ -28,29 +28,29 @@ Gobblin ingestion applications

     Gobblin ingestion applications can be accessed through the command `run`:
     ```bash
    -bin/gobblin run [listQuickApps] [] -jobName [OPTIONS]
    +bin/gobblin cli run [listQuickApps] [] -jobName [OPTIONS]

Review comment:
The change here seems incorrect, since `gobblin cli run` is not a valid command.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 218234)
Time Spent: 0.5h (was: 20m)

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR), and there is an individual script for each of them.
> 1. There can be one gobblin.sh script:
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean, historystore-manager}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, service}}
> 2. Configs also need to be structured and deduplicated accordingly.
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=217645=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-217645 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 23/Mar/19 20:29
Start Date: 23/Mar/19 20:29
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707]-rewrite gobblin script combining all gobblin modes and …
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r268408431

## File path: bin/gobblin.sh
## @@ -17,50 +17,410 @@

     # limitations under the License.
     #
    -calling_dir() {
    -  echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
    -}
    -classpath() {
    -  DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
    -
    -  for i in `ls $DIR/../lib`
    -  do
    -    if [[ $i != hadoop* ]]
    -    then
    -      CLASSPATH=${CLASSPATH:+${CLASSPATH}:}$DIR/../lib/$i
    -    else
    -      HADOOP_CLASSPATH=${HADOOP_CLASSPATH:+${HADOOP_CLASSPATH}:}$DIR/../lib/$i
    -    fi
    -  done
    -
    -  if [ ! -z "$HADOOP_HOME" ] && [ -f $HADOOP_HOME/bin/hadoop ]
    -  then
    -    HADOOP_CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath)
    -  fi
    -
    -  CLASSPATH=$CLASSPATH:$HADOOP_CLASSPATH
    -
    -  if [ ! -z "$GOBBLIN_ADDITIONAL_JARS" ]
    -  then
    -    CLASSPATH=$GOBBLIN_ADDITIONAL_JARS:$CLASSPATH
    -  fi
    -
    -  echo $CLASSPATH
    +# JAVA_HOME is required.
    +if [[ -z "$JAVA_HOME" ]]; then
    +    echo -e "\nError: Environment variable JAVA_HOME not set!\n"
    +    exit 1
    +fi
    +
    +# gobblin global vars, All these can be overridden by specified values in gobblin-env.sh
    +GOBBLIN_VERSION="0.15.0"
    +
    +GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)"
    +GOBBLIN_LIB=${GOBBLIN_HOME}/lib
    +GOBBLIN_BIN=${GOBBLIN_HOME}/bin
    +GOBBLIN_LOGS=${GOBBLIN_HOME}/logs
    +GOBBLIN_CONF=''
    +
    +LOG4J_FILE_PATH=''
    +CLUSTER_NAME="gobblin_cluster"
    +JVM_OPTS="-Xmx1g -Xms512m"
    +GOBBLIN_MODE=''
    +ACTION=''
    +JVM_FLAGS=''
    +EXTRA_JARS=''
    +VERBOSE=0
    +ENABLE_GC_LOGS=0
    +CMD_PARAMS=''
    +GOBBLIN_MODE_TYPE_COMMAND="GOBBLIN_COMMAND"
    +GOBBLIN_MODE_TYPE_SERVICE="GOBBLIN_SERVICE"
    +
    +# Gobblin Commands, Modes & respective Classes
    +# Commands
    +ADMIN_MODE='admin'
    +CLI_MODE='cli'
    +STATESTORE_CHECK_MODE='statestore-check'
    +STATESTORE_CLEAN_MODE='statestore-clean'
    +HISTORYSTORE_MANAGER_MODE='historystore-manager'
    +
    +# Services
    +STANDALONE_MODE='standalone'
    +CLUSTER_MASTER_MODE='cluster-master'
    +CLUSTER_WORKER_MODE='cluster-worker'
    +AWS_MODE='aws'
    +YARN_MODE='yarn'
    +MR_MODE='mr'
    +SERVICE_MODE='service'
    +
    +# Command class
    +ADMIN_CLASS="org.apache.gobblin.cli.Cli"
    +CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli'
    +STATESTORE_CHECK_CLASS='org.apache.gobblin.runtime.util.JobStateToJsonConverter'
    +STATESTORE_CLEAN_CLASS='org.apache.gobblin.metastore.util.StateStoreCleaner'
    +HISTORYSTORE_MANAGER_CLASS='org.apache.gobblin.metastore.util.DatabaseJobHistoryStoreSchemaManager'
    +
    +# Service Class
    +STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon'
    +CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager'
    +CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner'
    +AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher'
    +YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher'
    +MR_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher'
    +SERVICE_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager'
    +
    +
    +function print_usage() {
    +    echo "gobblin.sh "
    +    echo "gobblin.sh "
    +
    +    echo "Argument Options:"
    +    echo "values: $ADMIN_MODE, $CLI_MODE, $STATESTORE_CHECK_MODE, $STATESTORE_CLEAN_MODE, $HISTORYSTORE_MANAGER_MODE"
    +    echo " values: $STANDALONE_MODE, $CLUSTER_MASTER_MODE, $CLUSTER_WORKER_MODE, $AWS_MODE, $YARN_MODE, $MR_MODE, $SERVICE_MODE."
    +    echo "--cluster-nameassign cluster name ( default: $CLUSTER_NAME)."
    +    echo "--conf-dir default is '$GOBBLIN_HOME/conf/'."
    +    echo "--log4j-conf default is '$GOBBLIN_HOME/conf//log4j.properties'."
    +    echo "--jtOnly for MR mode: Job submission URL, if not set, taken from \${HADOOP_HOME}/conf."
    +    echo "--fs Only for MR mode: Target file system, if not set, taken from \${HADOOP_HOME}/conf."
    +    echo "--jvmopts String containing JVM flags to include, in addition to \"$JVM_OPTS\"."
    +    echo "--jars Column-separated list of extra jars to put on the CLASSPATH."
    +    echo "--enable-gc-logs enables gc logs & dumps."
    +    echo "--helpDisplay this help."
    +    echo "--verbose Display full command used to start the process."
     }
    +# TODO: use getopts
    +shopt -s nocasematch
     for i in "$@"
     do
    -  case "$1"
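The diff carries a `# TODO: use getopts` note ahead of the argument loop. The shell `getopts` builtin only handles single-letter options, so long options such as `--conf-dir` are commonly parsed with a `while`/`case`/`shift` loop instead. The sketch below shows that pattern under stated assumptions: the option names and variable names are taken from the usage text and globals in the diff, but which options take a value, the `parse_args` function, and the `POSITIONAL` variable are hypothetical and not taken from the PR.

```shell
#!/bin/sh
# Hypothetical sketch of a while/case/shift loop for long options.
# Option and variable names mirror the diff above; which options consume a
# value is an assumption, and parse_args/POSITIONAL are illustrative names.

CLUSTER_NAME="gobblin_cluster"
GOBBLIN_CONF=''
VERBOSE=0
ENABLE_GC_LOGS=0
POSITIONAL=''

parse_args() {
  while [ "$#" -gt 0 ]; do
    case "$1" in
      --cluster-name)
        # Assumed to take one value.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
        CLUSTER_NAME="$2"
        shift 2
        ;;
      --conf-dir)
        # Assumed to take one value.
        [ "$#" -ge 2 ] || { echo "missing value for $1" >&2; return 1; }
        GOBBLIN_CONF="$2"
        shift 2
        ;;
      --enable-gc-logs)
        ENABLE_GC_LOGS=1
        shift
        ;;
      --verbose)
        VERBOSE=1
        shift
        ;;
      --*)
        echo "unknown option: $1" >&2
        return 1
        ;;
      *)
        # Anything else (mode name, action) is collected in order.
        POSITIONAL="${POSITIONAL:+$POSITIONAL }$1"
        shift
        ;;
    esac
  done
}
```

Calling `parse_args standalone start --conf-dir /tmp/conf --verbose` would set `GOBBLIN_CONF` and `VERBOSE` and leave `standalone start` in `POSITIONAL`. Shifting inside the loop keeps `$1` pointing at the next unprocessed argument, which is the detail a `for i in "$@"` loop with `case "$1"` has to handle separately.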
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=217644=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-217644 ]

ASF GitHub Bot logged work on GOBBLIN-707:
--
Author: ASF GitHub Bot
Created on: 23/Mar/19 20:26
Start Date: 23/Mar/19 20:26
Worklog Time Spent: 10m
Work Description: jhsenjaliya commented on pull request #2578: [GOBBLIN-707]-rewrite gobblin script combining all gobblin modes and …
URL: https://github.com/apache/incubator-gobblin/pull/2578

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [*] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-707

### Description
- [*] Here are some details about my PR, including screenshots (if applicable):

This PR adds a new gobblin.sh script that does the following:
1. There were individual scripts for individual Gobblin modes and commands, each handling config, logging, and other parameter management differently. This PR combines all commands and modes into one script, which standardizes that handling and makes the script easier for developers to use.
2. Configs were scattered or duplicated per Gobblin mode; this PR standardizes them.
3. It removes the individual scripts and redundant configs.

### Tests
- [*] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
These are script and config changes, which do not require test cases.

### Commits
- [*] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 217644)
Time Spent: 10m
Remaining Estimate: 0h

> combine & standardize all gobblin scripts into one master script & restructure configs accordingly
> --
>
> Key: GOBBLIN-707
> URL: https://issues.apache.org/jira/browse/GOBBLIN-707
> Project: Apache Gobblin
> Issue Type: Improvement
> Reporter: Jay Sen
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Gobblin supports multiple execution modes (CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR), and there is an individual script for each of them.
> There can be one gobblin.sh script:
> {{gobblin.sh }}
> where possible values can be: {{CLI, Standalone, cluster-master, cluster-worker, AWS, YARN, MR.}}
> Also, the configs for each mode are currently scattered and need some structure.
>
--
This message was sent by Atlassian JIRA (v7.6.3#76005)