hit-lacus edited a comment on issue #1166: Prepare DebugTomcat and Integration 
Test in Hadoop3 Env
URL: https://github.com/apache/kylin/pull/1166#issuecomment-603251050
 
 
   ## Prepare Spark 
   > If you are not interested in Spark Cubing, you may skip this part.
   
   Please download a Spark binary distribution and decompress it somewhere on 
your laptop. I think any Spark 2.3 release should work; I chose 
**spark-2.3.2-bin-hadoop2.7**.
   
   After that, set the following environment variables in `~/.zshrc` or 
`~/.bashrc`, depending on which shell you use.
   
   ```sh
   # Where Spark is installed
   export SPARK_HOME=/Users/XXX/Lab/laucs-libs/spark-2.3.2-bin-hadoop2.7
   
   # Where Spark looks for the Hadoop configuration
   export HADOOP_CONF_DIR=/Users/XXX/IntelliJ_IDEA_Project/MyKylinHadoop3/examples/test_case_data/sandbox
   
   # The user name under which Spark jobs are submitted
   export HADOOP_USER_NAME=root
   ```
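
Before launching Spark, it may help to confirm that the variables above are visible in a new shell. The following is only a minimal sketch: the helper name `check_spark_env` is hypothetical (it is not part of Kylin or Spark), and it only checks that the paths exist, not that their contents are correct.

```sh
# Hypothetical helper (an assumption, not part of Kylin or Spark):
# verify that the three variables set above point at something usable.
check_spark_env() {
    rc=0
    if [ ! -x "${SPARK_HOME:-}/bin/spark-shell" ]; then
        echo "SPARK_HOME is unset or has no bin/spark-shell: '${SPARK_HOME:-}'"
        rc=1
    fi
    if [ ! -d "${HADOOP_CONF_DIR:-}" ]; then
        echo "HADOOP_CONF_DIR is unset or not a directory: '${HADOOP_CONF_DIR:-}'"
        rc=1
    fi
    if [ -z "${HADOOP_USER_NAME:-}" ]; then
        echo "HADOOP_USER_NAME is unset"
        rc=1
    fi
    if [ "$rc" -eq 0 ]; then
        echo "Spark environment looks OK"
    fi
    return "$rc"
}
```

Open a new terminal (or `source` your rc file) and run `check_spark_env`; a non-zero exit status means at least one variable still needs attention.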
   
   Oddly, I ran into a class conflict issue when submitting a Spark job. As a 
workaround, I manually downloaded the following jars from the Maven Central 
repository and put them into `$SPARK_HOME/jars`:
   
   - jersey-client-1.16.jar
   - jersey-core-1.16.jar
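
To double-check that both jars landed in the right place, something like the following could be used. It is a sketch under assumptions: the helper name `missing_jersey_jars` is hypothetical, and it only tests that the files exist under `$SPARK_HOME/jars`, not that they resolve the class conflict.

```sh
# Hypothetical helper (an assumption, not part of Kylin or Spark):
# print the name of each jar listed above that is missing from $SPARK_HOME/jars.
missing_jersey_jars() {
    for jar in jersey-client-1.16.jar jersey-core-1.16.jar; do
        [ -f "$SPARK_HOME/jars/$jar" ] || echo "$jar"
    done
}
```

Running `missing_jersey_jars` prints nothing when both jars are present.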
   
   ### Submit Spark Job to Count
   To check whether you can submit a Spark job to the HDP3 container, let us 
verify it with `spark-shell`.
   
   The following command starts an interactive Spark shell, in which you can 
use Scala to work with `SparkContext` and `SparkSession`.
   
   ```sh
   [YOUR_NAME@YOUR_LAP_TOP ~] $SPARK_HOME/bin/spark-shell --master yarn --verbose
   ```
   
   If you are really lucky, running the following simple snippet in the shell 
should print a meaningful number.
   ```scala
   sc.textFile("/warehouse/tablespace/external/hive/kylin_account/DEFAULT.KYLIN_ACCOUNT.csv").count()
   ```
   
   Once you see the number, you know you have met the requirements for 
debugging Spark cube building on your laptop.
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


With regards,
Apache Git Services
