Happy new year, Raymond!
Not sure whether I understand your problem correctly, but it seems to me
that you are just not processing your result.
sqlContext.sql(...) returns a DataFrame, on which you have to call an
action. Therefore, to get the result you are expecting, you just have to call:
sqlContext.sql(...).show()
You can also assign it to a variable or register it as a new table
(view) to work with it further:
df2 = sqlContext.sql(...)
or:
sqlContext.sql(...).createOrReplaceTempView("flight201601_carriers")
Regards,
Michal Šenkýř
On 2.1.2017 05:22, Raymond Xie wrote:
Happy new year!
Below is my script:
pyspark --packages com.databricks:spark-csv_2.10:1.4.0
from pyspark.sql import SQLContext
sqlContext = SQLContext(sc)
df = sqlContext.read.format('com.databricks.spark.csv') \
    .options(header='true', inferschema='true') \
    .load('file:///root/Downloads/data/flight201601short2.csv')
df.show(5)
df.registerTempTable("flight201601")
sqlContext.sql("select distinct CARRIER from flight201601")
The output of df.show(5) is below:
+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+
|YEAR|QUARTER|MONTH|DAY_OF_MONTH|DAY_OF_WEEK|   FL_DATE|UNIQUE_CARRIER|AIRLINE_ID|CARRIER|TAIL_NUM|FL_NUM|
+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+
|2016|      1|    1|           6|          3|2016-01-06|            AA|     19805|     AA|  N4YBAA|    43|
|2016|      1|    1|           7|          4|2016-01-07|            AA|     19805|     AA|  N434AA|    43|
|2016|      1|    1|           8|          5|2016-01-08|            AA|     19805|     AA|  N541AA|    43|
|2016|      1|    1|           9|          6|2016-01-09|            AA|     19805|     AA|  N489AA|    43|
|2016|      1|    1|          10|          7|2016-01-10|            AA|     19805|     AA|  N439AA|    43|
+----+-------+-----+------------+-----------+----------+--------------+----------+-------+--------+------+
The final result is NOT what I am expecting; it currently shows the
following:
>>> sqlContext.sql("select distinct CARRIER from flight201601")
DataFrame[CARRIER: string]
I am expecting the distinct CARRIER will be created:
AA
BB
CC
...
flight201601short2.csv is attached here for your reference.
Thank you very much.
------------------------------------------------
Sincerely yours,
Raymond
---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org