Happy new year, Raymond!

I'm not sure whether I understand your problem correctly, but it seems to me that you are just not processing your result.
sqlContext.sql(...) returns a DataFrame, which is evaluated lazily: nothing is computed or printed until you call an action on it.

Therefore, to get the result you are expecting, you just have to call:
sqlContext.sql(...).show()

You can also assign it to a variable or register it as a new table (view) to work with it further:
df2 = sqlContext.sql(...)
or:
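sqlContext.sql(...).createOrReplaceTempView("flight201601_carriers")
(createOrReplaceTempView requires Spark 2.0 or later; on Spark 1.x, use registerTempTable as in your script.)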

Regards,

Michal Šenkýř


On 2.1.2017 05:22, Raymond Xie wrote:
Happy new year!

Below is my script:

pyspark --packages com.databricks:spark-csv_2.10:1.4.0
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.read.format('com.databricks.spark.csv').options(header='true', inferschema='true').load('file:///root/Downloads/data/flight201601short2.csv')
df.show(5)
df.registerTempTable("flight201601")
sqlContext.sql("select distinct CARRIER from flight201601")

df.show(5) is below:

+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+
|YEAR|QUARTER|MONTH|DAY_OF_MONTH|DAY_OF_WEEK|   FL_DATE|UNIQUE_CARRIER|AIRLINE_ID|CARRIER|TAIL_NUM|FL_NUM|
+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+
|2016|      1|    1|           6|          3|2016-01-06|            AA|     19805|     AA|  N4YBAA|    43|
|2016|      1|    1|           7|          4|2016-01-07|            AA|     19805|     AA|  N434AA|    43|
|2016|      1|    1|           8|          5|2016-01-08|            AA|     19805|     AA|  N541AA|    43|
|2016|      1|    1|           9|          6|2016-01-09|            AA|     19805|     AA|  N489AA|    43|
|2016|      1|    1|          10|          7|2016-01-10|            AA|     19805|     AA|  N439AA|    43|
+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+

The final result is NOT what I am expecting. It currently shows the following:

>>> sqlContext.sql("select distinct CARRIER from flight201601")
DataFrame[CARRIER: string]

I am expecting the distinct CARRIER values to be listed:

AA
BB
CC
...

flight201601short2.csv is attached here for your reference.


Thank you very much.



Sincerely yours,

Raymond


