Hi Mich,

Thanks for the response. If you look at my programs, I am not writing my
queries with column names wrapped in a pair of "". The driver in my Spark
program is generating a query with column names in "", which I do not
want. On the other hand, when I use the same driver in my pure Java program
(attached), it generates a proper SQL query without "".

Pure Java log:

[2016-06-10 10:35:21,584] INFO stmt(1.1)#executeQuery SELECT
a.sr_no,a.start_dt,a.end_dt FROM sasLib.run_control a; created result set
1.1.1; time= 0.122 secs (com.sas.rio.MVAStatement:590)
Spark SQL log:

[2016-06-10 10:29:05,834] INFO conn(2)#prepareStatement sql=SELECT
"SR_NO","start_dt","end_dt" FROM sasLib.run_control ; prepared statement
2.1; time= 0.038 secs (com.sas.rio.MVAConnection:538)

[2016-06-10 10:29:05,935] INFO ps(2.1)#executeQuery SELECT
"SR_NO","start_dt","end_dt" FROM sasLib.run_control ; created result set
2.1.1; time= 0.102 secs (com.sas.rio.MVAStatement:590)
Please find the complete program and full logs attached in the thread below.
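For reference, here is a minimal sketch (plain Java, no Spark dependency; the
helper names are mine, not Spark's) of what the two logs suggest is happening:
Spark's JDBC layer appears to pass each column name through a dialect's
identifier-quoting step, which by default wraps the name in double quotes,
while the pure Java path sends the names untouched. SAS's SQL then reads the
double-quoted tokens as string literals, which is why each row comes back as
the column names themselves.

```java
import java.util.function.UnaryOperator;

// Illustrative only: these helpers loosely mimic the quoting step, they are
// not the actual Spark or SAS driver code.
public class QuotingDemo {

    // Default behavior the Spark SQL log shows: wrap the identifier in "".
    static String defaultQuote(String colName) {
        return "\"" + colName + "\"";
    }

    // What the pure Java path effectively does: leave the identifier alone.
    static String sasQuote(String colName) {
        return colName;
    }

    // Build a SELECT the way a JDBC source would, quoting each column
    // through the supplied strategy.
    static String buildSelect(UnaryOperator<String> quote,
                              String table, String... cols) {
        StringBuilder sb = new StringBuilder("SELECT ");
        for (int i = 0; i < cols.length; i++) {
            if (i > 0) sb.append(",");
            sb.append(quote.apply(cols[i]));
        }
        return sb.append(" FROM ").append(table).toString();
    }

    public static void main(String[] args) {
        String[] cols = {"SR_NO", "start_dt", "end_dt"};
        // Matches the Spark SQL log (column names become literals in SAS):
        System.out.println(buildSelect(QuotingDemo::defaultQuote,
                "sasLib.run_control", cols));
        // Matches the pure Java log (identifiers resolve normally):
        System.out.println(buildSelect(QuotingDemo::sasQuote,
                "sasLib.run_control", cols));
    }
}
```

If that is indeed the cause, the usual remedy on the Spark side is to register
a custom dialect for the SAS JDBC URL: a class extending
org.apache.spark.sql.jdbc.JdbcDialect whose quoteIdentifier returns the name
unquoted, registered via JdbcDialects.registerDialect, so the generated SELECT
matches what the pure Java program sends. I have not verified this against the
SAS driver, so treat it as a suggestion rather than a confirmed fix.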
Thank you.

Regards,
Ajay

On Friday, June 10, 2016, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:

> Assuming I understood your query: in Spark SQL (that is, when you log in to
> Spark SQL with spark-sql --master spark://<HOST_NAME>:7077) you do not
> need double quotes around column names for the SQL to work.
>
> spark-sql> select "hello from Mich" from oraclehadoop.sales limit 1;
> hello from Mich
>
> Anything between a pair of "" will be interpreted as text, NOT a column name.
>
> In Spark SQL you do not need double quotes. So simply
>
> spark-sql> select prod_id, cust_id from sales limit 2;
> 17      28017
> 18      10419
>
> HTH
>
> Dr Mich Talebzadeh
>
>
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
>
>
> http://talebzadehmich.wordpress.com
>
>
>
> On 10 June 2016 at 21:54, Ajay Chander <itsche...@gmail.com> wrote:
>
>> Hi again, has anyone in this group tried to access a SAS dataset through
>> Spark SQL? Thank you.
>>
>> Regards,
>> Ajay
>>
>>
>> On Friday, June 10, 2016, Ajay Chander <itsche...@gmail.com> wrote:
>>
>>> Hi Spark Users,
>>>
>>> I hope everyone here is doing great.
>>>
>>> I am trying to read data from SAS through Spark SQL and write it into HDFS.
>>> Initially, I started with a pure Java program; please find the program and
>>> logs in the attached file sas_pure_java.txt. The program ran successfully
>>> and returned the data from SAS. Please note the highlighted part in the log.
>>>
>>> My SAS dataset has 4 rows. The program ran successfully, so my output is:
>>>
>>> [2016-06-10 10:35:21,584] INFO stmt(1.1)#executeQuery SELECT
>>> a.sr_no,a.start_dt,a.end_dt FROM sasLib.run_control a; created result
>>> set 1.1.1; time= 0.122 secs (com.sas.rio.MVAStatement:590)
>>>
>>> [2016-06-10 10:35:21,630] INFO rs(1.1.1)#next (first call to next);
>>> time= 0.045 secs (com.sas.rio.MVAResultSet:773)
>>>
>>> 1,'2016-01-01','2016-01-31'
>>>
>>> 2,'2016-02-01','2016-02-29'
>>>
>>> 3,'2016-03-01','2016-03-31'
>>>
>>> 4,'2016-04-01','2016-04-30'
>>>
>>>
>>> Please find the full logs attached to this email in file
>>> sas_pure_java.txt.
>>>
>>> _______________________
>>>
>>>
>>> Now I am trying to do the same via Spark SQL. Please find my program and
>>> logs attached to this email in the file sas_spark_sql.txt.
>>>
>>> The connection to the SAS dataset is established successfully, but please
>>> note the highlighted log below.
>>>
>>> [2016-06-10 10:29:05,834] INFO conn(2)#prepareStatement sql=SELECT
>>> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; prepared
>>> statement 2.1; time= 0.038 secs (com.sas.rio.MVAConnection:538)
>>>
>>> [2016-06-10 10:29:05,935] INFO ps(2.1)#executeQuery SELECT
>>> "SR_NO","start_dt","end_dt" FROM sasLib.run_control ; created result
>>> set 2.1.1; time= 0.102 secs (com.sas.rio.MVAStatement:590)
>>> Please find the full logs attached to this email in the file
>>> sas_spark_sql.txt.
>>>
>>> I am using the same driver in both the pure Java and Spark SQL programs,
>>> but the query generated in Spark SQL has quotes around the column names
>>> (highlighted above). So my resulting output for that query is:
>>>
>>> +-----+--------+------+
>>> |  _c0|     _c1|   _c2|
>>> +-----+--------+------+
>>> |SR_NO|start_dt|end_dt|
>>> |SR_NO|start_dt|end_dt|
>>> |SR_NO|start_dt|end_dt|
>>> |SR_NO|start_dt|end_dt|
>>> +-----+--------+------+
>>>
>>> Since both programs use the same driver, com.sas.rio.MVADriver, the
>>> expected output should be the same as my pure Java program's output, but
>>> something else is happening behind the scenes.
>>>
>>> Any insights on this issue? Thanks for your time.
>>>
>>>
>>> Regards,
>>>
>>> Ajay
>>>
>>
>
