How does Spark SQL/DataFrame know that train_users_2.csv has a field named
"id", or anything else domain-specific? Is there a header? If so, does
sc.textFile() know about this header?
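To illustrate the point: a plain line-by-line text reader (which is how sc.textFile() treats a file) hands back strings only, so a header row is just the first line and has to be split off by hand. Here is a minimal sketch in plain Python, no Spark needed. The sample data and the "age" column are made up for illustration; only "id" comes from the question above.

```python
import csv
import io

# Hypothetical stand-in for the first few lines of a CSV file with a header row.
sample = "id,age\nu1,34\nu2,27\n"

# A line-oriented reader sees only strings; the header is simply line 0.
lines = sample.splitlines()
header = lines[0].split(",")

# Field names exist only after we pair each data line with the header ourselves.
rows = [dict(zip(header, line.split(","))) for line in lines[1:]]

# A CSV-aware reader does that header handling for us.
parsed = [dict(r) for r in csv.DictReader(io.StringIO(sample))]
```

Both `rows` and `parsed` end up as dictionaries keyed by the header names, which is roughly the step a header-aware CSV reader performs before column names like "id" can be used.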
I'd suggest using the Databricks spark-csv package for reading CSV data. There
is an option in there to
Hi all,
I got this error when I tried to use the 'join' function to left outer join
two DataFrames in PySpark 1.4.1.
Please point out where I made mistakes. Thank you.
Traceback (most recent call last):
  File "/Users/wz/PycharmProjects/PysparkTraining/Airbnb/src/driver.py",