Assuming I understood your query: in Spark SQL (that is, you log in to Spark SQL with something like spark-sql --master spark://<HOST_NAME>:7077), you do not need double quotes around column names for the SQL to work.
spark-sql> select "hello from Mich" from oraclehadoop.sales limit 1;
hello from Mich

Anything between a pair of "" will be interpreted as text, NOT a column name. In Spark SQL you do not need double quotes. So simply:

spark-sql> select prod_id, cust_id from sales limit 2;
17      28017
18      10419

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw

http://talebzadehmich.wordpress.com


On 10 June 2016 at 21:54, Ajay Chander <itsche...@gmail.com> wrote:

> Hi again, has anyone in this group tried to access a SAS dataset through
> Spark SQL? Thank you
>
> Regards,
> Ajay
>
>
> On Friday, June 10, 2016, Ajay Chander <itsche...@gmail.com> wrote:
>
>> Hi Spark Users,
>>
>> I hope everyone here is doing great.
>>
>> I am trying to read data from SAS through Spark SQL and write it into
>> HDFS. Initially, I started with a pure Java program; please find the
>> program and logs in the attached file sas_pure_java.txt. My program ran
>> successfully and returned the data from SAS to Spark SQL. Please note
>> the highlighted part in the log.
>>
>> My SAS dataset has 4 rows.
>>
>> The program ran successfully, so my output is:
>>
>> [2016-06-10 10:35:21,584] INFO stmt(1.1)#executeQuery SELECT
>> a.sr_no,a.start_dt,a.end_dt FROM sasLib.run_control a; created result
>> set 1.1.1; time= 0.122 secs (com.sas.rio.MVAStatement:590)
>>
>> [2016-06-10 10:35:21,630] INFO rs(1.1.1)#next (first call to next); time=
>> 0.045 secs (com.sas.rio.MVAResultSet:773)
>>
>> 1,'2016-01-01','2016-01-31'
>> 2,'2016-02-01','2016-02-29'
>> 3,'2016-03-01','2016-03-31'
>> 4,'2016-04-01','2016-04-30'
>>
>> Please find the full logs attached to this email in the file
>> sas_pure_java.txt.
>>
>> _______________________
>>
>> Now I am trying to do the same via Spark SQL. Please find my program and
>> logs attached to this email in the file sas_spark_sql.txt.
>>
>> Connection to the SAS dataset is established successfully, but please
>> note the highlighted log below:
>>
>> [2016-06-10 10:29:05,834] INFO conn(2)#prepareStatement sql=SELECT
>> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; prepared statement
>> 2.1; time= 0.038 secs (com.sas.rio.MVAConnection:538)
>>
>> [2016-06-10 10:29:05,935] INFO ps(2.1)#executeQuery SELECT
>> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; created result set
>> 2.1.1; time= 0.102 secs (com.sas.rio.MVAStatement:590)
>>
>> Please find the full logs attached to this email in the file
>> sas_spark_sql.txt.
>>
>> I am using the same driver in both the pure Java and Spark SQL programs,
>> but the query generated by Spark SQL has quotes around the column names
>> (highlighted above). So my resulting output for that query looks like
>> this:
>>
>> +-----+--------+------+
>> |  _c0|     _c1|   _c2|
>> +-----+--------+------+
>> |SR_NO|start_dt|end_dt|
>> |SR_NO|start_dt|end_dt|
>> |SR_NO|start_dt|end_dt|
>> |SR_NO|start_dt|end_dt|
>> +-----+--------+------+
>>
>> Since both programs use the same driver, com.sas.rio.MVADriver, the
>> expected output should be the same as the pure Java program's output,
>> but something else is happening behind the scenes.
>>
>> Any insights on this issue? Thanks for your time.
>>
>> Regards,
>> Ajay
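
[Editor's note] The double quotes in Ajay's generated query come from Spark's JDBC data source, which wraps column names in double quotes by default when it builds the SELECT; since the SAS driver treats double-quoted tokens as string literals, every row comes back as the literal column names. One way around this, as a sketch only (untested against SAS, and the "jdbc:sharenet:" URL prefix is an assumption to be checked against the actual MVADriver connection string), is to register a custom JdbcDialect that leaves identifiers unquoted:

```scala
import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Sketch: a dialect for SAS/SHARE JDBC connections. Spark's default dialect
// quotes identifiers with double quotes, which the SAS driver reads as
// string literals; this dialect emits the bare column name instead.
object SasDialect extends JdbcDialect {
  // Claim JDBC URLs that point at a SAS/SHARE server.
  // NOTE: the prefix below is an assumption; match it to your actual URL.
  override def canHandle(url: String): Boolean =
    url.startsWith("jdbc:sharenet:")

  // Return the column name unquoted instead of wrapped in double quotes.
  override def quoteIdentifier(colName: String): String = colName
}

// Register the dialect before creating the JDBC DataFrame,
// so sqlContext.read.jdbc(...) generates SELECT SR_NO,start_dt,... instead
// of SELECT "SR_NO","start_dt",...
JdbcDialects.registerDialect(SasDialect)
```

If bare identifiers are not acceptable to the SAS driver, quoteIdentifier could instead return a differently quoted form; the key point is overriding the default double-quoting.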