On 10/14/14 7:31 PM, Neeraj Garg02 wrote:

Hi All,

I’ve downloaded and installed Apache Spark 1.1.0 pre-built for Hadoop 2.4.

Now, I want to test two features of Spark:

1. YARN deployment: As per my understanding, I need to modify the "spark-defaults.conf" file with the settings mentioned at http://spark.apache.org/docs/1.1.0/running-on-yarn.html#configuration, for example settings like spark.yarn.applicationMaster.waitTries.


In order to launch a Spark application in yarn-cluster mode, the following command can be used once the configuration is done:

./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] <app jar> [app options]


Is this understanding correct, or could you please suggest the steps to deploy Spark on YARN?

Yes.
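As a concrete illustration, submitting the bundled SparkPi example in yarn-cluster mode could look like the sketch below. This assumes the default layout of the pre-built Spark 1.1.0 distribution; the examples jar file name and the Hadoop configuration path are assumptions that may differ on your cluster.

```shell
# Point Spark at the directory holding your YARN/Hadoop config files
# (yarn-site.xml, core-site.xml, etc.); adjust the path to your setup.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Submit the bundled SparkPi example in yarn-cluster mode. The exact
# examples jar name below is an assumption; check the lib/ directory
# of your download for the actual file name.
./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 2 \
  --executor-memory 1g \
  lib/spark-examples-1.1.0-hadoop2.4.0.jar 10
```

In yarn-cluster mode the driver runs inside the YARN ApplicationMaster, so the job's output appears in the YARN container logs rather than on your terminal.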


2. Testing the Thrift JDBC server connection: I have a Hadoop 2.4 cluster set up, and Apache Spark is running on this cluster. In order to test the JDBC Thrift server, I have successfully followed the steps mentioned in the "Other SQL Interfaces" section of the Spark SQL programming guide, i.e. I can see the beeline prompt and it is connected to the Thrift server using the given command. Please help me get answers to the following queries:

a. Which kind of queries can I execute from this beeline prompt? Would these be Spark SQL queries or Hive queries?

You can only use HiveQL under beeline.
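For example, a non-interactive beeline session against the Thrift server might look like the sketch below. This assumes the server is listening on the default port 10000 on localhost; the table name is purely illustrative.

```shell
# Connect beeline to the Spark SQL Thrift server (default port 10000)
# and run HiveQL statements non-interactively with -e.
# "src" is a hypothetical table name used only for illustration.
./bin/beeline -u jdbc:hive2://localhost:10000 \
  -e "SHOW TABLES;" \
  -e "SELECT COUNT(*) FROM src;"
```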

b. "Configuration of Hive is done by placing your hive-site.xml file in conf/." Right now, I don't have Hive installed as part of the Hadoop 2.4 cluster. Do I need to install Hive to test the Thrift JDBC server, or to execute Spark SQL queries from the beeline prompt?

i. In case a Hive installation is a pre-requisite, is there a need to re-build the Spark package? What are the steps for this? Is internet access required for the re-build?

The Thrift server is used to interact with existing Hive data, and thus needs the Hive metastore to access the Hive catalog. In your case, you need to build Spark with "sbt/sbt -Phive,hadoop-2.4 clean package". But since you've already started the Thrift server successfully, this step should already have been done properly.
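For reference, the build-and-start sequence could be sketched as below, assuming the default scripts shipped in the Spark 1.1.0 source and binary distributions:

```shell
# Build Spark with Hive support for Hadoop 2.4. Internet access is
# needed on the first run, since sbt downloads build dependencies.
sbt/sbt -Phive,hadoop-2.4 clean package

# Start the Thrift JDBC server; it listens on port 10000 by default.
# The --master option is an assumption; pick the master URL matching
# your cluster (e.g. yarn-client).
./sbin/start-thriftserver.sh --master yarn-client
```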

c. What else would I need in case I want to connect BI tools to Spark SQL using the Thrift JDBC/ODBC server? Please share the steps or pointers to do the same.

You can follow this awesome article authored by Denny Lee: https://www.concur.com/blog/en-us/connect-tableau-to-sparksql


As I could not find sufficient information on the same, please help.

Please let me know if more information/ explanation is required.

Thanks and Regards,

Neeraj Garg

