Hi All,

I've downloaded and installed Apache Spark 1.1.0 pre-built for Hadoop 2.4.

Now, I want to test two features of Spark:

1. YARN deployment: As per my understanding, I need to add the settings listed
at http://spark.apache.org/docs/1.1.0/running-on-yarn.html#configuration to the
"spark-defaults.conf" file, for example spark.yarn.applicationMaster.waitTries.

Once the configuration is done, the following command can be used to launch a
Spark application in yarn-cluster mode:
./bin/spark-submit --class path.to.your.Class --master yarn-cluster [options] <app jar> [app options]

Is this understanding correct? If not, please suggest the steps to deploy Spark
on YARN.
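
To make the question concrete, below is a minimal Python sketch of how I would
assemble that command for the bundled SparkPi example. The helper name
build_spark_submit, the examples jar path, and the waitTries value are my own
assumptions for illustration. I am also assuming that --conf can be used to
override individual spark-defaults.conf entries and that HADOOP_CONF_DIR (or
YARN_CONF_DIR) already points at the cluster's configuration directory. The
script only prints the command; it does not run it.

```python
import shlex


def build_spark_submit(main_class, app_jar, app_args=(), master="yarn-cluster",
                       conf=None, spark_home="."):
    """Return the spark-submit argument list for a yarn-cluster run.

    Hypothetical helper, written only to illustrate the command layout:
    spark-submit --class <class> --master <master> [options] <app jar> [app options]
    """
    cmd = [f"{spark_home}/bin/spark-submit",
           "--class", main_class,
           "--master", master]
    # Assumption: each --conf key=value overrides the same key in
    # conf/spark-defaults.conf for this one submission.
    for key, value in sorted((conf or {}).items()):
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app_jar)
    cmd.extend(str(arg) for arg in app_args)
    return cmd


cmd = build_spark_submit(
    "org.apache.spark.examples.SparkPi",
    "lib/spark-examples-1.1.0-hadoop2.4.0.jar",  # assumed jar name and location
    app_args=[10],
    conf={"spark.yarn.applicationMaster.waitTries": 10},  # illustrative value
)

# Print a copy-pasteable command line instead of executing it.
print(" ".join(shlex.quote(part) for part in cmd))
```

If this layout is wrong for yarn-cluster mode, a corrected example would be
very helpful.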


2. Testing the Thrift JDBC server connection: I have a Hadoop 2.4 cluster set
up, and Apache Spark is running on it. To test the Thrift JDBC server, I
followed the steps in the "Other SQL Interfaces" section of the Spark SQL
programming guide. I can see the beeline prompt, and it connects to the Thrift
server using the given command. Please help me with the following questions:

a. What kind of queries can I execute from this beeline prompt? Would these be
Spark SQL queries or Hive queries?

b. The guide says that Hive is configured by placing the hive-site.xml file in
conf/. Right now, I don't have Hive installed as part of the Hadoop 2.4
cluster. Do I need to install Hive to test the Thrift JDBC server or to execute
Spark SQL queries from the beeline prompt?

   i. If a Hive installation is a prerequisite, do I need to rebuild the Spark
   package? If so, what are the steps, and is internet access required for the
   rebuild?

c. What else would I need in order to connect BI tools to Spark SQL through the
Thrift JDBC/ODBC server? Please share the steps or pointers for this.
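
For reference, this is a small Python sketch of the connection details I am
using for the beeline test in item 2, and which I assume a BI tool would need
as well. The helper names thrift_jdbc_url and beeline_command are hypothetical,
and localhost:10000 with the "default" database is my assumption about the
server defaults. I am also assuming the driver class is
org.apache.hive.jdbc.HiveDriver; please correct me if any of this is wrong.

```python
# Assumed JDBC driver class for the Thrift server (HiveServer2-compatible).
HIVE_JDBC_DRIVER = "org.apache.hive.jdbc.HiveDriver"


def thrift_jdbc_url(host="localhost", port=10000, database="default"):
    """Build the jdbc:hive2 URL that beeline's !connect (or a BI tool) would use."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(url, user=None, spark_home="."):
    """Return a beeline argument list that connects directly to the given URL.

    Hypothetical helper: -u passes the JDBC URL, -n the user name.
    """
    cmd = [f"{spark_home}/bin/beeline", "-u", url]
    if user:
        cmd += ["-n", user]
    return cmd


url = thrift_jdbc_url()
print(url)
print(" ".join(beeline_command(url, user="spark")))  # "spark" is a placeholder user
```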

I could not find sufficient information on these points, so any help is
appreciated.

Please let me know if more information or explanation is required.

Thanks and Regards,
Neeraj Garg

