On 10/14/14 7:31 PM, Neeraj Garg02 wrote:
Hi All,
I’ve downloaded and installed Apache Spark 1.1.0 pre-built for Hadoop
2.4.
Now, I want to test two features of Spark:
1. *YARN deployment*: As per my understanding, I need to modify the
"spark-defaults.conf" file with the settings mentioned at
http://spark.apache.org/docs/1.1.0/running-on-yarn.html#configuration
for example, settings like spark.yarn.applicationMaster.waitTries etc.
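For illustration, a spark-defaults.conf prepared for YARN might contain entries like the following; the property names are from the running-on-yarn configuration page, but the specific values here are placeholders, not recommendations from this thread:

```
spark.master                             yarn-cluster
spark.yarn.applicationMaster.waitTries   10
spark.yarn.queue                         default
spark.executor.memory                    2g
```

Each line is a property name followed by whitespace and its value; spark-submit picks this file up automatically from conf/.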
*In order to launch* a Spark application in yarn-cluster mode, the
following command can be used once the configurations are done:
./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] <app jar> [app options]
*Is this understanding correct, or please suggest the steps to deploy
Spark on YARN.*
Yes.
2. *Testing Thrift JDBC server connection*: I have a Hadoop 2.4 cluster
set up, and Apache Spark is running on this cluster. In order to test
the JDBC Thrift server, I successfully followed the steps mentioned in
the *"Other SQL Interfaces"* section of the Spark SQL programming guide,
i.e. I can see the beeline prompt and it is connected to the Thrift
server using the given command. Please help me answer the following
queries:
a. Which kinds of queries can I execute from this beeline prompt? Would
these be Spark SQL queries or Hive queries?
You can only use HiveQL under beeline.
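For example, generic HiveQL statements like the following work from the beeline prompt (the table name src is a placeholder, not from this thread):

```sql
-- HiveQL statements usable from beeline against the Thrift server
SHOW TABLES;
CREATE TABLE IF NOT EXISTS src (key INT, value STRING);
SELECT key, COUNT(*) FROM src GROUP BY key;
```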
b. *"Configuration of Hive is done by placing your hive-site.xml file
in conf/."* Right now, I don't have Hive installed as part of the
Hadoop 2.4 cluster. Do I need to install Hive to test the Thrift JDBC
server, or to execute Spark SQL queries from the beeline prompt?
i. In case Hive installation is a prerequisite, is there a need to
rebuild the Spark package? What are the steps for this? Is internet
access required for the rebuild?
The Thrift server is used to interact with existing Hive data, and thus
needs the Hive Metastore to access the Hive catalog. In your case, you
need to build Spark with sbt/sbt -Phive,hadoop-2.4 clean package. But
since you've already started the Thrift server successfully, this step
should already have been done properly.
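If you later want to point Spark SQL at an existing Hive deployment, the usual approach is the hive-site.xml in conf/ mentioned above. A minimal sketch, where the metastore host name is a placeholder (port 9083 is the conventional metastore Thrift port):

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Point Spark SQL at an existing remote Hive Metastore
       (metastore-host is a placeholder, not a value from this thread) -->
  <property>
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```

Without this file, Spark SQL falls back to a local metastore created in the working directory, which is enough for testing.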
c. What else would I need in case I want to connect BI tools to Spark
SQL using the Thrift JDBC/ODBC server? Please share the steps or
pointers to do the same.
You can follow this awesome article authored by Denny Lee:
https://www.concur.com/blog/en-us/connect-tableau-to-sparksql
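As a rough sketch of what any JDBC-based BI tool needs: a HiveServer2-style JDBC URL pointing at the Thrift server. The host, port, and database below are the usual defaults, assumed here rather than taken from this thread:

```python
# Build the JDBC URL a BI tool (or beeline itself) would use to reach
# the Spark SQL Thrift server; host/port/database are assumed defaults.
host = "localhost"
port = 10000          # default listening port of the Thrift JDBC server
database = "default"

jdbc_url = "jdbc:hive2://%s:%d/%s" % (host, port, database)
beeline_cmd = "./bin/beeline -u %s" % jdbc_url

print(jdbc_url)       # jdbc:hive2://localhost:10000/default
print(beeline_cmd)
```

The same URL (with the hive-jdbc driver on the classpath) is what you would paste into a BI tool's generic JDBC connection dialog.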
As I could not find sufficient information on these topics, please help.
Please let me know if more information or explanation is required.
Thanks and Regards,
Neeraj Garg