This is an automated email from the ASF dual-hosted git repository.

suvasude pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-gobblin.git


The following commit(s) were added to refs/heads/master by this push:
     new 70afd6d  [GOBBLIN-895] Fixes Gobblin Standalone configs and scripts so that the user guide is accurate
70afd6d is described below

commit 70afd6dbb7abb93907a3e2f61ee674c7af97fd2f
Author: William Lo <[email protected]>
AuthorDate: Thu Oct 10 20:54:49 2019 -0700

    [GOBBLIN-895] Fixes Gobblin Standalone configs and scripts so that the user guide is accurate
    
    Closes #2751 from Will-Lo/fix-gobblin-standalone-script
---
 bin/gobblin-admin.sh              |  2 +-
 bin/gobblin-aws.sh                |  2 +-
 bin/gobblin-cluster-master.sh     |  2 +-
 bin/gobblin-cluster-worker.sh     |  2 +-
 bin/gobblin-mapreduce.sh          |  2 +-
 bin/gobblin-service.sh            |  2 +-
 bin/gobblin-standalone.sh         |  2 +-
 bin/gobblin-yarn.sh               |  2 +-
 bin/gobblin.sh                    | 20 +++++++++++-
 bin/gobblin_password_encryptor.sh |  2 +-
 bin/historystore-manager.sh       |  2 +-
 bin/statestore-checker.sh         |  2 +-
 bin/statestore-cleaner.sh         |  2 +-
 conf/standalone/application.conf  | 69 ++++++++++++++-------------------------
 conf/standalone/log4j.xml         | 32 ++++++++++++++++++
 gobblin-docs/Getting-Started.md   |  4 +--
 16 files changed, 89 insertions(+), 60 deletions(-)

diff --git a/bin/gobblin-admin.sh b/bin/gobblin-admin.sh
index 1a510db..1e95d0e 100755
--- a/bin/gobblin-admin.sh
+++ b/bin/gobblin-admin.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin cli $@
\ No newline at end of file
diff --git a/bin/gobblin-aws.sh b/bin/gobblin-aws.sh
index 48fd15f..dab9a95 100755
--- a/bin/gobblin-aws.sh
+++ b/bin/gobblin-aws.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service aws $@
\ No newline at end of file
diff --git a/bin/gobblin-cluster-master.sh b/bin/gobblin-cluster-master.sh
index cc39264..0632d49 100755
--- a/bin/gobblin-cluster-master.sh
+++ b/bin/gobblin-cluster-master.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service cluster-master $@
\ No newline at end of file
diff --git a/bin/gobblin-cluster-worker.sh b/bin/gobblin-cluster-worker.sh
index ec99c83..8e607d5 100755
--- a/bin/gobblin-cluster-worker.sh
+++ b/bin/gobblin-cluster-worker.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service cluster-worker $@
diff --git a/bin/gobblin-mapreduce.sh b/bin/gobblin-mapreduce.sh
index 5e050fa..9ee0cc8 100755
--- a/bin/gobblin-mapreduce.sh
+++ b/bin/gobblin-mapreduce.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 ##############################################################
 ############### Run Gobblin Jobs on Hadoop MR ################
diff --git a/bin/gobblin-service.sh b/bin/gobblin-service.sh
index cafb15c..6c080a2 100755
--- a/bin/gobblin-service.sh
+++ b/bin/gobblin-service.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service service-manager $@
\ No newline at end of file
diff --git a/bin/gobblin-standalone.sh b/bin/gobblin-standalone.sh
index 356d3c4..9210004 100755
--- a/bin/gobblin-standalone.sh
+++ b/bin/gobblin-standalone.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service standalone $@
\ No newline at end of file
diff --git a/bin/gobblin-yarn.sh b/bin/gobblin-yarn.sh
index e531651..6e5c138 100755
--- a/bin/gobblin-yarn.sh
+++ b/bin/gobblin-yarn.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin service yarn $@
\ No newline at end of file
diff --git a/bin/gobblin.sh b/bin/gobblin.sh
index d8d97fd..7e32a90 100755
--- a/bin/gobblin.sh
+++ b/bin/gobblin.sh
@@ -238,6 +238,11 @@ if [[ "$GOBBLIN_MODE_TYPE" == "$CLI" ]]; then
     fi
 fi
 
+CHECK_ENV_VARS=false
+if [ $ACTION == "start" ] || [ $ACTION == "restart" ]; then
+  CHECK_ENV_VARS=true
+fi
+
 # derived based on input from user, $GOBBLIN_MODE
 PID_FILE_NAME=".gobblin-$GOBBLIN_MODE.pid"
 PID_FILE="$GOBBLIN_HOME/$PID_FILE_NAME"
@@ -263,6 +268,10 @@ if [[ -n "$USER_LOG4J_FILE" ]]; then
 elif [[ -f ${GOBBLIN_CONF}/log4j2.xml ]]; then
     LOG4J_FILE_PATH=file://${GOBBLIN_CONF}/log4j2.xml
     LOG4J_OPTS="-Dlog4j.configuration=$LOG4J_FILE_PATH"
+#prefer log4j.xml
+elif [[ -f ${GOBBLIN_CONF}/log4j.xml ]]; then
+    LOG4J_FILE_PATH=file://${GOBBLIN_CONF}/log4j.xml
+    LOG4J_OPTS="-Dlog4j.configuration=$LOG4J_FILE_PATH"
 #defaults to log4j.properties
 elif [[ -f ${GOBBLIN_CONF}/log4j.properties ]]; then
     LOG4J_FILE_PATH=file://${GOBBLIN_CONF}/log4j.properties
@@ -372,6 +381,7 @@ function start() {
 
     LOG_OUT_FILE="${GOBBLIN_LOGS}/${GOBBLIN_MODE}.out"
     LOG_ERR_FILE="${GOBBLIN_LOGS}/${GOBBLIN_MODE}.err"
+    ADDITIONAL_ARGS=""
 
     # for all gobblin commands
     if [[ "$GOBBLIN_MODE_TYPE" == "$CLI" ]]; then
@@ -417,7 +427,15 @@ function start() {
             CLASS_N_ARGS=''
             if [[ "$GOBBLIN_MODE" = "$STANDALONE_MODE" ]]; then
                 CLASS_N_ARGS="$STANDALONE_CLASS $GOBBLIN_CONF/application.conf"
+                ADDITIONAL_ARGS="-Dgobblin.logs.dir=${GOBBLIN_LOGS}"
+
+                if [ -z "$GOBBLIN_WORK_DIR" ] && [ "$CHECK_ENV_VARS" == true ]; then
+                  die "GOBBLIN_WORK_DIR is not set!"
+                fi
 
+                if [ -z "$GOBBLIN_JOB_CONFIG_DIR" ] && [ "$CHECK_ENV_VARS" == true ]; then
+                  die "Environment variable GOBBLIN_JOB_CONFIG_DIR not set!"
+                fi
             elif [[ "$GOBBLIN_MODE" = "$AWS_MODE" ]]; then
                 CLASS_N_ARGS="$AWS_CLASS"
 
@@ -442,7 +460,7 @@ function start() {
                 echo "Invalid gobblin command or execution mode... [EXITING]"
                 exit 1
             fi
-            GOBBLIN_COMMAND="$JAVA_HOME/bin/java -cp $GOBBLIN_CLASSPATH $GC_OPTS $JVM_OPTS $LOG4J_OPTS $CLASS_N_ARGS"
+            GOBBLIN_COMMAND="$JAVA_HOME/bin/java -cp $GOBBLIN_CLASSPATH $GC_OPTS $JVM_OPTS $LOG4J_OPTS $ADDITIONAL_ARGS $CLASS_N_ARGS"
         fi
 
         # execute the command
diff --git a/bin/gobblin_password_encryptor.sh b/bin/gobblin_password_encryptor.sh
index 0fdc211..9dec72a 100755
--- a/bin/gobblin_password_encryptor.sh
+++ b/bin/gobblin_password_encryptor.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 script_dir=$(dirname $0)
 lib_dir=${script_dir}/../lib
diff --git a/bin/historystore-manager.sh b/bin/historystore-manager.sh
index c2fbc25..0e06711 100755
--- a/bin/historystore-manager.sh
+++ b/bin/historystore-manager.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin cli job-store-schema-manager $@
\ No newline at end of file
diff --git a/bin/statestore-checker.sh b/bin/statestore-checker.sh
index ff9661e..60c0279 100755
--- a/bin/statestore-checker.sh
+++ b/bin/statestore-checker.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 CURRENT_DIR="$(cd `dirname $0`/..; pwd)"
 $CURRENT_DIR/bin/gobblin cli job-state-to-json $@
diff --git a/bin/statestore-cleaner.sh b/bin/statestore-cleaner.sh
index 2999220..f4eacbe 100755
--- a/bin/statestore-cleaner.sh
+++ b/bin/statestore-cleaner.sh
@@ -17,7 +17,7 @@
 # limitations under the License.
 #
 
-# @depricated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
+# @deprecated: This script is kept for backward compatibility only and will be removed in future. Use gobblin.sh
 
 FWDIR="$(cd `dirname $0`/..; pwd)"
 
diff --git a/conf/standalone/application.conf b/conf/standalone/application.conf
index 3a856e3..77e1182 100644
--- a/conf/standalone/application.conf
+++ b/conf/standalone/application.conf
@@ -15,70 +15,44 @@
 # limitations under the License.
 #
 
-# Cluster configuration properties
-gobblin.cluster.app.name=GobblinStandaloneCluster
-gobblin.cluster.email.notification.on.shutdown=false
-gobblin.cluster.helix.instance.max.retries=2
-gobblin.cluster.work.dir=/tmp/gobblin-cluster
-
-# Helix/Zookeeper configuration properties
-gobblin.cluster.helix.cluster.name=GobblinStandaloneCluster
-gobblin.cluster.zk.connection.string="localhost:2181"
-
-# job config monitor interval
-jobconf.monitor.interval=30000
-
-# Sample configuration properties for the Gobblin Standalone cluster
-gobblin.cluster.workDir=${gobblin.cluster.work.dir}/GobblinStandaloneCluster
-
-# default is the JobConfigurationManager
-# use this manager to accept jobs from Kafka. It requires some additional Kafka related parameters.
-#gobblin.cluster.job.configuration.manager=org.apache.gobblin.cluster.StreamingJobConfigurationManager
-#spec.kafka.topics=ruyang_test_kafka_gobblin
-#kafka.brokers="hostname:12913/kafka-queuing"
-#jobSpecMonitor.kafka.zookeeper.connect="hostname:12913/kafka-queuing"
-
-# Cluster configuration properties
-gobblin.cluster.helix.cluster.name=GobblinStandaloneClusterCli
-
-# used by the JobConfigurationManager
-gobblin.cluster.job.conf.path=${gobblin.cluster.work.dir}/jobs
-gobblin.cluster.jobconf.fullyQualifiedPath=${gobblin.cluster.work.dir}/jobs
-gobblin.cluster.job.catalog=org.apache.gobblin.runtime.job_catalog.FSJobCatalog
+# Thread pool settings for the task executor
+taskexecutor.threadpool.size=2
+taskretry.threadpool.coresize=1
+taskretry.threadpool.maxsize=2
 
 # File system URIs
-fs.uri="file:///"
+fs.uri=file:///
 writer.fs.uri=${fs.uri}
 state.store.fs.uri=${fs.uri}
 
 # Writer related configuration properties
-writer.destination.type=HDFS
 writer.output.format=AVRO
-writer.staging.dir=${gobblin.cluster.work.dir}/task-staging
-writer.output.dir=${gobblin.cluster.work.dir}/task-output
+writer.staging.dir=${env:GOBBLIN_WORK_DIR}/task-staging
+writer.output.dir=${env:GOBBLIN_WORK_DIR}/task-output
 
 # Data publisher related configuration properties
 data.publisher.type=org.apache.gobblin.publisher.BaseDataPublisher
-data.publisher.final.dir=${gobblin.cluster.work.dir}/job-output
+data.publisher.final.dir=${env:GOBBLIN_WORK_DIR}/job-output
 data.publisher.replace.final.dir=false
 
+# Directory where job configuration files are stored
+jobconf.dir=${env:GOBBLIN_JOB_CONFIG_DIR}
+jobconf.fullyQualifiedPath=file://${env:GOBBLIN_JOB_CONFIG_DIR}
+
 # Directory where job/task state files are stored
-state.store.dir=${gobblin.cluster.work.dir}/state-store
+state.store.dir=${env:GOBBLIN_WORK_DIR}/state-store
 
-# Directory where error files from the quality checkers are stored
-qualitychecker.row.err.file=${gobblin.cluster.work.dir}/err
+# Directory where commit sequences are stored
+gobblin.runtime.commit.sequence.store.dir=${env:GOBBLIN_WORK_DIR}/commit-sequence-store
 
-# Disable job locking for now
-job.lock.enabled=false
+# Directory where error files from the quality checkers are stored
+qualitychecker.row.err.file=${env:GOBBLIN_WORK_DIR}/err
 
 # Directory where job locks are stored
-job.lock.dir=${gobblin.cluster.work.dir}/locks
+job.lock.dir=${env:GOBBLIN_WORK_DIR}/locks
 
 # Directory where metrics log files are stored
-metrics.log.dir=${gobblin.cluster.work.dir}/metrics
-
-# Interval of task state reporting in milliseconds
-task.status.reportintervalinms=1000
+metrics.log.dir=${env:GOBBLIN_WORK_DIR}/metrics
 
 # Enable metrics / events
 metrics.enabled=true
@@ -94,3 +68,8 @@ rest.server.port=9090
 # job history store ( WARN [GobblinYarnAppLauncher] NOT starting the admin UI because the job execution info server is NOT enabled )
 job.execinfo.server.enabled=false
 job.history.store.enabled=false
+task.status.reportintervalinms=5000
+
+# The time gap for Job Detector to detect modification/deletion/creation of jobconfig.
+# Unit in milliseconds, configurable.
+jobconf.monitor.interval=30000
diff --git a/conf/standalone/log4j.xml b/conf/standalone/log4j.xml
new file mode 100644
index 0000000..436a2b0
--- /dev/null
+++ b/conf/standalone/log4j.xml
@@ -0,0 +1,32 @@
+<!DOCTYPE log4j:configuration SYSTEM "log4j.dtd">
+
+<log4j:configuration>
+
+  <appender name="FileRoll" class="org.apache.log4j.rolling.RollingFileAppender">
+    <param name="file" value="${gobblin.logs.dir}/standalone.out" />
+    <param name="append" value="true" />
+    <param name="encoding" value="UTF-8" />
+
+    <rollingPolicy class="org.apache.log4j.rolling.TimeBasedRollingPolicy">
+      <param name="FileNamePattern" value="${gobblin.logs.dir}/archive/gobblin.%d{yyyy-MM-dd}.log"/>
+    </rollingPolicy>
+
+    <layout class="org.apache.log4j.PatternLayout">
+      <param name="ConversionPattern" value="%d{yyyy-MM-dd HH:mm:ss z} %-5p [%t] %C %X{tableName} %L - %m%n"/>
+    </layout>
+  </appender>
+
+  <logger name="org.apache.commons.httpclient">
+    <level value="DEBUG"/>
+  </logger>
+
+  <logger name="httpclient.wire">
+    <level value="ERROR"/>
+  </logger>
+
+  <root>
+    <priority value ="INFO" />
+    <appender-ref ref="FileRoll" />
+  </root>
+
+</log4j:configuration>
\ No newline at end of file
diff --git a/gobblin-docs/Getting-Started.md b/gobblin-docs/Getting-Started.md
index 5f5d843..29ac986 100644
--- a/gobblin-docs/Getting-Started.md
+++ b/gobblin-docs/Getting-Started.md
@@ -89,7 +89,7 @@ Each Gobblin job minimally involves several constructs, e.g. [Source](https://gi
 
 Some of the classes relevant to this example include [WikipediaSource](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaSource.java), [WikipediaExtractor](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/java/org/apache/gobblin/example/wikipedia/WikipediaExtractor.java), [WikipediaConverter](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/jav [...]
 
-To run Gobblin in standalone daemon mode we need a Gobblin configuration file (such as uses [gobblin-standalone.properties](https://github.com/apache/incubator-gobblin/blob/master/conf/gobblin-standalone-v2.properties)). And for each job we wish to run, we also need a job configuration file (such as [wikipedia.pull](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)). The Gobblin configuration file, which is passed to Gobblin as a c [...]
+To run Gobblin in standalone daemon mode we need a Gobblin configuration file (such as uses [application.conf](https://github.com/apache/incubator-gobblin/blob/master/conf/standalone/application.conf)). And for each job we wish to run, we also need a job configuration file (such as [wikipedia.pull](https://github.com/apache/incubator-gobblin/blob/master/gobblin-example/src/main/resources/wikipedia.pull)). The Gobblin configuration file, which is passed to Gobblin as a command line argume [...]
 
 A list of commonly used configuration properties can be found here: [Configuration Properties Glossary](user-guide/Configuration-Properties-Glossary).
 
@@ -107,7 +107,7 @@ A list of commonly used configuration properties can be found here: [Configurati
 gobblin service standalone start
 ```
 
-The job log, which contains the progress and status of the job, will be written into `logs/<execution-mode>.out` & `logs/<execution-mode>.err` (to change where the log is written, modify the Log4j configuration file `conf/log4j.properties`).
+Stdout and the job log, which contains the progress and status of the job, will be written into `logs/<execution-mode>.out` & `logs/<execution-mode>.err` (to change where the log is written, modify the Log4j configuration file `conf/log4j.xml`).
 
 Among the job logs there should be the following information:
 

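Note for anyone trying this change locally: with this commit, `bin/gobblin.sh` refuses to start or restart standalone mode unless `GOBBLIN_WORK_DIR` and `GOBBLIN_JOB_CONFIG_DIR` are set, and `conf/standalone/application.conf` now resolves its paths from those two variables. The sketch below shows one way to prepare the environment before launching. The `require_env` helper and the example directory paths are assumptions made for illustration and are not part of the commit. The launch command is only echoed here, not executed.

```shell
# Hypothetical helper (not in gobblin.sh): return non-zero if the named
# variable is unset or empty, similar in spirit to the checks in this commit.
require_env() {
  eval "_value=\${$1:-}"
  if [ -z "$_value" ]; then
    echo "Environment variable $1 is not set!" >&2
    return 1
  fi
}

# Example locations only; pick directories that suit your machine.
export GOBBLIN_WORK_DIR="${GOBBLIN_WORK_DIR:-/tmp/gobblin-work}"
export GOBBLIN_JOB_CONFIG_DIR="${GOBBLIN_JOB_CONFIG_DIR:-/tmp/gobblin-jobs}"

# Verify both variables, then print the command to run from the Gobblin home directory.
require_env GOBBLIN_WORK_DIR &&
  require_env GOBBLIN_JOB_CONFIG_DIR &&
  echo "Environment ready. Next: bin/gobblin.sh service standalone start"
```

Job configuration files such as `wikipedia.pull` would then go into the directory named by `GOBBLIN_JOB_CONFIG_DIR`, since `jobconf.dir` in the new `application.conf` points there.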