[jira] [Work logged] (GOBBLIN-775) Add job level retry for gobblin service
[ https://issues.apache.org/jira/browse/GOBBLIN-775?focusedWorklogId=247268=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247268 ] ASF GitHub Bot logged work on GOBBLIN-775: -- Author: ASF GitHub Bot Created on: 23/May/19 03:50 Start Date: 23/May/19 03:50 Worklog Time Spent: 10m Work Description: jack-moseley commented on issue #2640: [GOBBLIN-775] Add job level retries for gobblin service URL: https://github.com/apache/incubator-gobblin/pull/2640#issuecomment-495057221 - Changed `JobExecutionPlan` equals and hashCode instead of changing the key of the `dagNode` maps - Changed orchestration events to include attempt counter, so that when there's a failure event we can avoid updating the `JobStatus` to failed if there will be a retry This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247268) Time Spent: 0.5h (was: 20m) > Add job level retry for gobblin service > --- > > Key: GOBBLIN-775 > URL: https://issues.apache.org/jira/browse/GOBBLIN-775 > Project: Apache Gobblin > Issue Type: New Feature > Components: gobblin-service >Reporter: Jack Moseley >Assignee: Abhishek Tiwari >Priority: Major > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] jack-moseley commented on issue #2640: [GOBBLIN-775] Add job level retries for gobblin service
jack-moseley commented on issue #2640: [GOBBLIN-775] Add job level retries for gobblin service
URL: https://github.com/apache/incubator-gobblin/pull/2640#issuecomment-495057221

- Changed `JobExecutionPlan` equals and hashCode instead of changing the key of the `dagNode` maps
- Changed orchestration events to include an attempt counter, so that when there is a failure event we can avoid updating the `JobStatus` to failed if there will be a retry
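The two changes described in this comment can be illustrated with a minimal sketch. It is not the code from PR #2640: `PlanSketch`, `RetrySketch`, `isFinalFailure`, `onFailureEvent`, and the 1-based `attempt` field are hypothetical names, and the assumption that the attempt counter is left out of `equals`/`hashCode` is only one plausible reading of the comment.

```java
import java.util.Map;
import java.util.Objects;

/** Hypothetical stand-in for a job plan; not the real JobExecutionPlan. */
final class PlanSketch {
    final String flowName;
    final String jobName;
    final int attempt; // assumption: excluded from identity so a retry maps to the same entry

    PlanSketch(String flowName, String jobName, int attempt) {
        this.flowName = flowName;
        this.jobName = jobName;
        this.attempt = attempt;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof PlanSketch)) return false;
        PlanSketch other = (PlanSketch) o;
        return flowName.equals(other.flowName) && jobName.equals(other.jobName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(flowName, jobName);
    }
}

/** Hypothetical helper showing how an attempt counter can gate the FAILED status. */
final class RetrySketch {
    /** A failure is final only when no attempts remain (attempts are 1-based here). */
    static boolean isFinalFailure(int currentAttempt, int maxAttempts) {
        return currentAttempt >= maxAttempts;
    }

    /** Records FAILED only for the last attempt; otherwise leaves the status unchanged. */
    static void onFailureEvent(Map<PlanSketch, String> statuses, PlanSketch plan, int maxAttempts) {
        if (isFinalFailure(plan.attempt, maxAttempts)) {
            statuses.put(plan, "FAILED");
        }
    }
}
```

Because identity ignores the attempt number in this sketch, a map keyed by the plan needs no new key type when a job is retried, which is the trade-off the first bullet appears to describe.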
[jira] [Work logged] (GOBBLIN-779) make job status retriever configurable
[ https://issues.apache.org/jira/browse/GOBBLIN-779?focusedWorklogId=247267=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247267 ] ASF GitHub Bot logged work on GOBBLIN-779: -- Author: ASF GitHub Bot Created on: 23/May/19 03:38 Start Date: 23/May/19 03:38 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2643: [GOBBLIN-779] make job status retriever configurable URL: https://github.com/apache/incubator-gobblin/pull/2643 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247267) Time Spent: 10m Remaining Estimate: 0h > make job status retriever configurable > -- > > Key: GOBBLIN-779 > URL: https://issues.apache.org/jira/browse/GOBBLIN-779 > Project: Apache Gobblin > Issue Type: Bug >Reporter: Arjun Singh Bora >Priority: Major > Time Spent: 10m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] asfgit closed pull request #2643: [GOBBLIN-779] make job status retriever configurable
asfgit closed pull request #2643: [GOBBLIN-779] make job status retriever configurable
URL: https://github.com/apache/incubator-gobblin/pull/2643
[jira] [Created] (GOBBLIN-779) make job status retriever configurable
Arjun Singh Bora created GOBBLIN-779:

Summary: make job status retriever configurable
Key: GOBBLIN-779
URL: https://issues.apache.org/jira/browse/GOBBLIN-779
Project: Apache Gobblin
Issue Type: Bug
Reporter: Arjun Singh Bora

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] arjun4084346 opened a new pull request #2643: make job status retriever configurable
arjun4084346 opened a new pull request #2643: make job status retriever configurable
URL: https://github.com/apache/incubator-gobblin/pull/2643

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

@sv2000 please review

### JIRA
- [x] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title. For example, "[GOBBLIN-XXX] My Gobblin PR"
    - https://issues.apache.org/jira/browse/GOBBLIN-XXX

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
trivial changes

### Commits
- [ ] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247138=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247138 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 22/May/19 23:08 Start Date: 22/May/19 23:08 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705253 ## File path: conf/cluster-master/application.conf ## @@ -69,3 +70,20 @@ task.status.reportintervalinms=1000 # Enable metrics / events metrics.enabled=true +# UI +admin.server.enabled=true +admin.server.port=9000 + +# is this required/redundent ? +rest.server.host=localhost +rest.server.port=9090 + +# job history store +job.execinfo.server.enabled=true Review comment: Ditto This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247138) Time Spent: 6h 20m (was: 6h 10m) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 6h 20m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. 
> Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > Also all the gobblin scripts share lot of common code to handle params, > start, stop services, status checks, pid handling, etc... combining all the > scripts into 1 not only makes maintenance easier but also brings clarity and > consistency. > > Solution: > 1. there can be one gobblin.sh script to handle all gobblin commands and > deployment options as per following signature. NOTE: This > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager, classpath}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > with above change, following becomes valid command. 
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also configs need to be structured and deduped accordingly to make it
> clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and
> hence changes the syntax for all commands and services.{color}

-- This message
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247135=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247135 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 22/May/19 23:08 Start Date: 22/May/19 23:08 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705377 ## File path: conf/cluster-worker/application.conf ## @@ -69,3 +69,20 @@ task.status.reportintervalinms=1000 # Enable metrics / events metrics.enabled=true +failure.log.dir=${gobblin.cluster.work.dir}/failure-logs + +# UI +admin.server.enabled=false +# admin.server.port=9000 + +rest.server.host=localhost +rest.server.port=9090 + +# job history store ( WARN [GobblinYarnAppLauncher] NOT starting the admin UI because the job execution info server is NOT enabled ) +job.execinfo.server.enabled=true Review comment: Why enabled by default? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247135) Time Spent: 6h 10m (was: 6h) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. 
> Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > Also all the gobblin scripts share lot of common code to handle params, > start, stop services, status checks, pid handling, etc... combining all the > scripts into 1 not only makes maintenance easier but also brings clarity and > consistency. > > Solution: > 1. there can be one gobblin.sh script to handle all gobblin commands and > deployment options as per following signature. NOTE: This > {{gobblin.sh }} > {{gobblin.sh }} > {{commands values: admin, cli, statestore-check, statestore-clean, > historystore-manager, classpath}} > {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr, > service}} > with above change, following becomes valid command. 
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also configs need to be structured and deduped accordingly to make it
> clear which config will be picked up for which execution mode.
> {color:#ff}
>
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247136 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 22/May/19 23:08 Start Date: 22/May/19 23:08 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286703827 ## File path: bin/gobblin.sh ## @@ -17,50 +17,488 @@ # limitations under the License. # -calling_dir() { - echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )" +# JAVA_HOME is required. +if [[ -z "$JAVA_HOME" ]]; then +echo -e "\nError: Environment variable JAVA_HOME not set!\n" +exit 1 +fi + +# global vars + +GOBBLIN_VERSION=@project.version@ +GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)" +GOBBLIN_LIB=${GOBBLIN_HOME}/lib +GOBBLIN_BIN=${GOBBLIN_HOME}/bin +GOBBLIN_LOGS=${GOBBLIN_HOME}/logs +GOBBLIN_CONF='' + +#sourcing basic gobblin env vars like GOBBLIN_HOME and GOBBLIN_LIB +. 
${GOBBLIN_BIN}/gobblin-env.sh + +CLUSTER_NAME="gobblin_cluster" +JVM_OPTS="-Xmx1g -Xms512m" +LOG4J_FILE_PATH='' +LOG4J_OPTS='' +GOBBLIN_MODE='' +ACTION='' +JVM_FLAGS='' +EXTRA_JARS='' +VERBOSE=0 +ENABLE_GC_LOGS=0 +CMD_PARAMS='' + + +# Gobblin Commands, Modes & respective Classes +GOBBLIN_MODE_TYPE='' +CLI='cli' +SERVICE='service' + +# Commands +JOB_STATE_TO_JSON_CMD='job-state-to-json' +JOB_STORE_SCHEMA_MANAGER_CMD='job-store-schema-manager' +CLASSPATH_CMD='classpath' + +# Execution Modes +STANDALONE_MODE='standalone' +CLUSTER_MASTER_MODE='cluster-master' +CLUSTER_WORKER_MODE='cluster-worker' +AWS_MODE='aws' +YARN_MODE='yarn' +MAPREDUCE_MODE='mapreduce' +SERVICE_MANAGER_MODE='service-manager' + +GOBBLIN_EXEC_MODE_LIST="$STANDALONE_MODE $CLUSTER_MASTER_MODE $CLUSTER_WORKER_MODE $AWS_MODE $YARN_MODE $MAPREDUCE_MODE $SERVICE_MANAGER_MODE" + +# CLI Command class +CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli' + +# Service Class +STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon' +CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager' +CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner' +AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher' +YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher' +MAPREDUCE_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher' +SERVICE_MANAGER_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager' + + +function print_gobblin_usage() { +echo "Usage:" +echo "gobblin.sh cli " +echo "gobblin.sh service " +echo "" +echo "Use \"gobblin --help\" for more information. (Gobblin Version: $GOBBLIN_VERSION)" +} + +function print_gobblin_cli_usage() { Review comment: Why is this needed? `GobblinCli` should be able to automatically generate this usage info. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247136) Time Spent: 6h 10m (was: 6h) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. > # features like GC & JVM params, log4j file selection, classpath > calculation, etc... exists in some gobblin scripts but not all, adding to > inconsistent user experience. > # maintaining total 13 script would be too much effort. > Also all the gobblin scripts share lot of common code to handle params, > start, stop services, status checks, pid handling, etc... combining all the
[jira] [Work logged] (GOBBLIN-707) combine & standardize all gobblin scripts into one master script & restructure configs accordingly
[ https://issues.apache.org/jira/browse/GOBBLIN-707?focusedWorklogId=247137=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-247137 ] ASF GitHub Bot logged work on GOBBLIN-707: -- Author: ASF GitHub Bot Created on: 22/May/19 23:08 Start Date: 22/May/19 23:08 Worklog Time Spent: 10m Work Description: ibuenros commented on pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705209 ## File path: conf/cluster-master/application.conf ## @@ -69,3 +70,20 @@ task.status.reportintervalinms=1000 # Enable metrics / events metrics.enabled=true +# UI +admin.server.enabled=true Review comment: Why do we want admin server enabled by default? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 247137) > combine & standardize all gobblin scripts into one master script & > restructure configs accordingly > -- > > Key: GOBBLIN-707 > URL: https://issues.apache.org/jira/browse/GOBBLIN-707 > Project: Apache Gobblin > Issue Type: Improvement >Reporter: Jay Sen >Priority: Major > Time Spent: 6h 10m > Remaining Estimate: 0h > > gobblin supports multiple modes of executions ( CLI, Standalone, > cluster-master, cluster-worker, AWS, YARN, MR ) and various command lines > utility to run cli and admin commands. There is a individual script for each > of them. > Having individual script introduces lot of issues > # all scripts handles gobblin variables, user parameters differently, and > its highly inconsistent among various different gobblin scripts > # functionality around start, stop, status checking and handling PID's among > lot of other things, varies vastly as per the implementation of the script. 
> # features like GC & JVM params, log4j file selection, classpath
> calculation, etc... exists in some gobblin scripts but not all, adding to
> inconsistent user experience.
> # maintaining total 13 script would be too much effort.
> Also all the gobblin scripts share lot of common code to handle params,
> start, stop services, status checks, pid handling, etc... combining all the
> scripts into 1 not only makes maintenance easier but also brings clarity and
> consistency.
>
> Solution:
> 1. there can be one gobblin.sh script to handle all gobblin commands and
> deployment options as per following signature. NOTE: This
> {{gobblin.sh }}
> {{gobblin.sh }}
> {{commands values: admin, cli, statestore-check, statestore-clean,
> historystore-manager, classpath}}
> {{service values: standalone, cluster-master, cluster-worker, aws, yarn, mr,
> service}}
> with above change, following becomes valid command.
> {code:java}
> # all under GobblinCli class
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run listQuickApps -> gobblin cli run listQuickApps
> gobblin run -> gobblin cli run
> # class: JobStateToJsonConverter
> statestore-checker.sh -> gobblin statestore-checker
> # class: StateStoreCleaner
> statestore-clean.sh -> gobblin statestore-clean
> # class: DatabaseJobHistoryStoreSchemaManager
> historystore-manager.sh -> gobblin historystore-manager
> # class: Cli
> gobblin-admin.sh -> gobblin admin
> # all gobblin deployment modes
> gobblin-cluster-master.sh -> gobblin cluster-master start|stop|status
> gobblin-cluster-worker.sh -> gobblin cluster-master start|stop|status
> gobblin-compaction.sh -> gobblin cluster-master start|stop|status
> gobblin-env.sh -> gobblin cluster-master start|stop|status
> gobblin-mapreduce.sh -> gobblin cluster-master start|stop|status
> gobblin-service.sh -> gobblin cluster-master start|stop|status
> gobblin-standalone.sh -> gobblin cluster-master start|stop|status
> gobblin-yarn.sh -> gobblin cluster-master start|stop|status
> {code}
>
> 2. Also configs need to be structured and deduped accordingly to make it
> clear which config will be picked up for which execution mode.
> {color:#ff}
> NOTE: this refactoring adds all cli and service commands to gobblin.sh and
> hence changes the syntax for all commands and services.{color}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
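The split between "commands values" and "service values" in the issue description can be sketched as a small lookup. This is only an illustration in Java (the actual change is a shell script): the class `GobblinTargets` and the method `classify` are hypothetical, and only the two value lists are taken from the description above.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

/** Hypothetical classifier for the first argument after gobblin.sh. */
final class GobblinTargets {
    // Values copied from the "commands values" list in the issue description.
    static final Set<String> COMMANDS = new LinkedHashSet<>(Arrays.asList(
            "admin", "cli", "statestore-check", "statestore-clean",
            "historystore-manager", "classpath"));

    // Values copied from the "service values" list in the issue description.
    static final Set<String> SERVICES = new LinkedHashSet<>(Arrays.asList(
            "standalone", "cluster-master", "cluster-worker", "aws", "yarn", "mr", "service"));

    /** Returns "command" or "service"; rejects anything not in either list. */
    static String classify(String name) {
        if (COMMANDS.contains(name)) {
            return "command";
        }
        if (SERVICES.contains(name)) {
            return "service";
        }
        throw new IllegalArgumentException("Unknown gobblin target: " + name);
    }
}
```

Validating against a fixed list like this would reject a misspelled mode name up front instead of passing it on to a launcher class.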
[GitHub] [incubator-gobblin] ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705377

## File path: conf/cluster-worker/application.conf ##
@@ -69,3 +69,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+failure.log.dir=${gobblin.cluster.work.dir}/failure-logs
+
+# UI
+admin.server.enabled=false
+# admin.server.port=9000
+
+rest.server.host=localhost
+rest.server.port=9090
+
+# job history store ( WARN [GobblinYarnAppLauncher] NOT starting the admin UI because the job execution info server is NOT enabled )
+job.execinfo.server.enabled=true

Review comment:
Why enabled by default?
[GitHub] [incubator-gobblin] ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705253

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true
+admin.server.port=9000
+
+# is this required/redundent ?
+rest.server.host=localhost
+rest.server.port=9090
+
+# job history store
+job.execinfo.server.enabled=true

Review comment:
Ditto
[GitHub] [incubator-gobblin] ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286703827

## File path: bin/gobblin.sh ##
@@ -17,50 +17,488 @@
 # limitations under the License.
 #
-calling_dir() {
-  echo "$( cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd )"
+# JAVA_HOME is required.
+if [[ -z "$JAVA_HOME" ]]; then
+    echo -e "\nError: Environment variable JAVA_HOME not set!\n"
+    exit 1
+fi
+
+# global vars
+
+GOBBLIN_VERSION=@project.version@
+GOBBLIN_HOME="$(cd `dirname $0`/..; pwd)"
+GOBBLIN_LIB=${GOBBLIN_HOME}/lib
+GOBBLIN_BIN=${GOBBLIN_HOME}/bin
+GOBBLIN_LOGS=${GOBBLIN_HOME}/logs
+GOBBLIN_CONF=''
+
+#sourcing basic gobblin env vars like GOBBLIN_HOME and GOBBLIN_LIB
+. ${GOBBLIN_BIN}/gobblin-env.sh
+
+CLUSTER_NAME="gobblin_cluster"
+JVM_OPTS="-Xmx1g -Xms512m"
+LOG4J_FILE_PATH=''
+LOG4J_OPTS=''
+GOBBLIN_MODE=''
+ACTION=''
+JVM_FLAGS=''
+EXTRA_JARS=''
+VERBOSE=0
+ENABLE_GC_LOGS=0
+CMD_PARAMS=''
+
+
+# Gobblin Commands, Modes & respective Classes
+GOBBLIN_MODE_TYPE=''
+CLI='cli'
+SERVICE='service'
+
+# Commands
+JOB_STATE_TO_JSON_CMD='job-state-to-json'
+JOB_STORE_SCHEMA_MANAGER_CMD='job-store-schema-manager'
+CLASSPATH_CMD='classpath'
+
+# Execution Modes
+STANDALONE_MODE='standalone'
+CLUSTER_MASTER_MODE='cluster-master'
+CLUSTER_WORKER_MODE='cluster-worker'
+AWS_MODE='aws'
+YARN_MODE='yarn'
+MAPREDUCE_MODE='mapreduce'
+SERVICE_MANAGER_MODE='service-manager'
+
+GOBBLIN_EXEC_MODE_LIST="$STANDALONE_MODE $CLUSTER_MASTER_MODE $CLUSTER_WORKER_MODE $AWS_MODE $YARN_MODE $MAPREDUCE_MODE $SERVICE_MANAGER_MODE"
+
+# CLI Command class
+CLI_CLASS='org.apache.gobblin.runtime.cli.GobblinCli'
+
+# Service Class
+STANDALONE_CLASS='org.apache.gobblin.scheduler.SchedulerDaemon'
+CLUSTER_MASTER_CLASS='org.apache.gobblin.cluster.GobblinClusterManager'
+CLUSTER_WORKER_CLASS='org.apache.gobblin.cluster.GobblinTaskRunner'
+AWS_CLASS='org.apache.gobblin.aws.GobblinAWSClusterLauncher'
+YARN_CLASS='org.apache.gobblin.yarn.GobblinYarnAppLauncher'
+MAPREDUCE_CLASS='org.apache.gobblin.runtime.mapreduce.CliMRJobLauncher'
+SERVICE_MANAGER_CLASS='org.apache.gobblin.service.modules.core.GobblinServiceManager'
+
+
+function print_gobblin_usage() {
+    echo "Usage:"
+    echo "gobblin.sh cli "
+    echo "gobblin.sh service "
+    echo ""
+    echo "Use \"gobblin --help\" for more information. (Gobblin Version: $GOBBLIN_VERSION)"
+}
+
+function print_gobblin_cli_usage() {

Review comment:
Why is this needed? `GobblinCli` should be able to automatically generate this usage info.
[GitHub] [incubator-gobblin] ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
ibuenros commented on a change in pull request #2578: [GOBBLIN-707] rewrite gobblin script to combine all modes and command
URL: https://github.com/apache/incubator-gobblin/pull/2578#discussion_r286705209

## File path: conf/cluster-master/application.conf ##
@@ -69,3 +70,20 @@ task.status.reportintervalinms=1000
 # Enable metrics / events
 metrics.enabled=true
+# UI
+admin.server.enabled=true

Review comment:
Why do we want admin server enabled by default?
[jira] [Work logged] (GOBBLIN-778) Enhance SalesforceExtractor bulkConnection config for setting transport factory
[ https://issues.apache.org/jira/browse/GOBBLIN-778?focusedWorklogId=246936=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-246936 ] ASF GitHub Bot logged work on GOBBLIN-778: -- Author: ASF GitHub Bot Created on: 22/May/19 19:23 Start Date: 22/May/19 19:23 Worklog Time Spent: 10m Work Description: asfgit commented on pull request #2642: GOBBLIN-778 - Moving config creation to a separate method URL: https://github.com/apache/incubator-gobblin/pull/2642 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 246936) Time Spent: 20m (was: 10m) > Enhance SalesforceExtractor bulkConnection config for setting transport > factory > --- > > Key: GOBBLIN-778 > URL: https://issues.apache.org/jira/browse/GOBBLIN-778 > Project: Apache Gobblin > Issue Type: Task > Components: gobblin-salesforce >Reporter: Monish Vachhani >Assignee: Hung Tran >Priority: Major > Fix For: 0.15.0 > > Time Spent: 20m > Remaining Estimate: 0h > > SalesforceExtractor uses bulk connection to connect to Salesforce using bulk > API. Since bulkConnection is private variable it cannot be modified to pass > custom transportFactory via config. > This task is to separate the config creation from bulkApiLogin method so as > it can be overridden for passing custom params like setTransport. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[GitHub] [incubator-gobblin] asfgit closed pull request #2642: GOBBLIN-778 - Moving config creation to a separate method
asfgit closed pull request #2642: GOBBLIN-778 - Moving config creation to a separate method
URL: https://github.com/apache/incubator-gobblin/pull/2642
[jira] [Resolved] (GOBBLIN-778) Enhance SalesforceExtractor bulkConnection config for setting transport factory
[ https://issues.apache.org/jira/browse/GOBBLIN-778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Issac Buenrostro resolved GOBBLIN-778.
--
Resolution: Fixed
Fix Version/s: 0.15.0

Issue resolved by pull request #2642
[https://github.com/apache/incubator-gobblin/pull/2642]

> Enhance SalesforceExtractor bulkConnection config for setting transport factory
> ---
>
> Key: GOBBLIN-778
> URL: https://issues.apache.org/jira/browse/GOBBLIN-778
> Project: Apache Gobblin
> Issue Type: Task
> Components: gobblin-salesforce
> Reporter: Monish Vachhani
> Assignee: Hung Tran
> Priority: Major
> Fix For: 0.15.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> SalesforceExtractor uses a bulk connection to connect to Salesforce using the bulk API.
> Since bulkConnection is a private variable, it cannot be modified to pass a
> custom transportFactory via config.
> This task is to separate the config creation from the bulkApiLogin method so that
> it can be overridden to pass custom params like setTransport.
[jira] [Work logged] (GOBBLIN-778) Enhance SalesforceExtractor bulkConnection config for setting transport factory
[ https://issues.apache.org/jira/browse/GOBBLIN-778?focusedWorklogId=246613&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-246613 ]

ASF GitHub Bot logged work on GOBBLIN-778:
--
Author: ASF GitHub Bot
Created on: 22/May/19 08:23
Start Date: 22/May/19 08:23
Worklog Time Spent: 10m
Work Description: mvachhani commented on pull request #2642: GOBBLIN-778 - Moving config creation to a separate method
URL: https://github.com/apache/incubator-gobblin/pull/2642

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title.
    - https://issues.apache.org/jira/browse/GOBBLIN-778

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):
SalesforceExtractor uses a bulk connection to connect to Salesforce using the bulk API. Since bulkConnection is a private variable, it cannot be modified to pass a custom transportFactory via config.
This task is to separate the config creation from the bulkApiLogin method so that it can be overridden to pass custom params like setTransport.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
No new code added; this change only refactors the code into a separate method.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Issue Time Tracking
---
Worklog Id: (was: 246613)
Time Spent: 10m
Remaining Estimate: 0h

> Enhance SalesforceExtractor bulkConnection config for setting transport factory
> ---
>
> Key: GOBBLIN-778
> URL: https://issues.apache.org/jira/browse/GOBBLIN-778
> Project: Apache Gobblin
> Issue Type: Task
> Components: gobblin-salesforce
> Reporter: Monish Vachhani
> Assignee: Hung Tran
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> SalesforceExtractor uses a bulk connection to connect to Salesforce using the bulk API.
> Since bulkConnection is a private variable, it cannot be modified to pass a
> custom transportFactory via config.
> This task is to separate the config creation from the bulkApiLogin method so that
> it can be overridden to pass custom params like setTransport.
[jira] [Created] (GOBBLIN-778) Enhance SalesforceExtractor bulkConnection config for setting transport factory
Monish Vachhani created GOBBLIN-778:
---
Summary: Enhance SalesforceExtractor bulkConnection config for setting transport factory
Key: GOBBLIN-778
URL: https://issues.apache.org/jira/browse/GOBBLIN-778
Project: Apache Gobblin
Issue Type: Task
Components: gobblin-salesforce
Reporter: Monish Vachhani
Assignee: Hung Tran

SalesforceExtractor uses a bulk connection to connect to Salesforce using the bulk API. Since bulkConnection is a private variable, it cannot be modified to pass a custom transportFactory via config.
This task is to separate the config creation from the bulkApiLogin method so that it can be overridden to pass custom params like setTransport.
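The refactoring requested in GOBBLIN-778 can be illustrated with a minimal Java sketch. It is not the actual Gobblin or Salesforce code: `ConnectorConfig`, `BaseExtractor`, `CustomTransportExtractor`, `createConfig`, and the string-valued transport setting are all hypothetical stand-ins, chosen only to show how moving config creation into its own protected method lets a subclass customize the config without touching a private connection field.

```java
// Hypothetical stand-in for a connector configuration object (not the Salesforce SDK class).
class ConnectorConfig {
    private String sessionId;
    private String transportFactory = "default";

    String getSessionId() { return sessionId; }
    void setSessionId(String sessionId) { this.sessionId = sessionId; }
    String getTransportFactory() { return transportFactory; }
    void setTransportFactory(String transportFactory) { this.transportFactory = transportFactory; }
}

// Hypothetical stand-in for the extractor described in the issue.
class BaseExtractor {
    // Assumption: config creation is extracted into an overridable method.
    protected ConnectorConfig createConfig(String sessionId) {
        ConnectorConfig config = new ConnectorConfig();
        config.setSessionId(sessionId);
        return config;
    }

    // The login method no longer builds the config inline; it uses the hook above.
    public String bulkApiLogin(String sessionId) {
        ConnectorConfig config = createConfig(sessionId);
        // A real implementation would open a connection here; this sketch only
        // reports which settings the connection would have been built with.
        return "connected[" + config.getSessionId() + "," + config.getTransportFactory() + "]";
    }
}

// A subclass changes only the config, leaving the login logic untouched.
class CustomTransportExtractor extends BaseExtractor {
    @Override
    protected ConnectorConfig createConfig(String sessionId) {
        ConnectorConfig config = super.createConfig(sessionId);
        config.setTransportFactory("custom");
        return config;
    }
}

class OverrideDemo {
    public static void main(String[] args) {
        System.out.println(new BaseExtractor().bulkApiLogin("s1"));            // connected[s1,default]
        System.out.println(new CustomTransportExtractor().bulkApiLogin("s1")); // connected[s1,custom]
    }
}
```

The design choice here is a plain overridable hook: the login flow stays in one place, and only the step that varies (building the config) is exposed to subclasses.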
[GitHub] [incubator-gobblin] mvachhani opened a new pull request #2642: GOBBLIN-778 - Moving config creation to a separate method
mvachhani opened a new pull request #2642: GOBBLIN-778 - Moving config creation to a separate method
URL: https://github.com/apache/incubator-gobblin/pull/2642

Dear Gobblin maintainers,

Please accept this PR. I understand that it will not be reviewed until I have checked off all the steps below!

### JIRA
- [ ] My PR addresses the following [Gobblin JIRA](https://issues.apache.org/jira/browse/GOBBLIN/) issues and references them in the PR title.
    - https://issues.apache.org/jira/browse/GOBBLIN-778

### Description
- [x] Here are some details about my PR, including screenshots (if applicable):
SalesforceExtractor uses a bulk connection to connect to Salesforce using the bulk API. Since bulkConnection is a private variable, it cannot be modified to pass a custom transportFactory via config.
This task is to separate the config creation from the bulkApiLogin method so that it can be overridden to pass custom params like setTransport.

### Tests
- [x] My PR adds the following unit tests __OR__ does not need testing for this extremely good reason:
No new code added; this change only refactors the code into a separate method.

### Commits
- [x] My commits all reference JIRA issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "[How to write a good git commit message](http://chris.beams.io/posts/git-commit/)":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"