Hi, cc'ing Shivaram here, because he worked on this yesterday.
If I'm not mistaken, you can use the following workflow: start SparkR with the spark-csv package,

```
./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3
```

and then load the file with read.df:

```
df <- read.df(sqlContext, "/data", "csv", header = "true")
```

(A fuller end-to-end sketch follows the quoted message below.)

Best,
Burak

On Tue, Jun 2, 2015 at 11:52 AM, Eskilson, Aleksander <alek.eskil...@cerner.com> wrote:

> Are there any intentions to provide first-class support for CSV files as
> one of the loadable file types in SparkR? Databricks' spark-csv API [1]
> supports SQL, Python, and Java/Scala, and implements most of the
> arguments of R's read.table API [2], but currently there is no way to load
> CSV data in SparkR (1.4.0) besides separating our headers from the data,
> loading into an RDD, splitting by our delimiter, and then converting to a
> SparkR DataFrame with a vector of the column names gathered from the header.
>
> Regards,
> Alek Eskilson
>
> [1] -- https://github.com/databricks/spark-csv
> [2] -- http://www.inside-r.org/r-doc/utils/read.table
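
For completeness, here is a minimal end-to-end sketch of the workflow above, assuming SparkR 1.4.0 with spark-csv 1.0.3. The fully qualified source name com.databricks.spark.csv is used in case the short alias "csv" is not registered for the external package in this version; the path "/data" is carried over from the snippet above.

```
# Start SparkR with the external spark-csv package on the classpath:
#   ./bin/sparkR --packages com.databricks:spark-csv_2.10:1.0.3

# In the SparkR shell, sqlContext is created automatically.
# Load the CSV data into a SparkR DataFrame, taking column names from
# the header row; "com.databricks.spark.csv" is the fully qualified
# name of the external data source.
df <- read.df(sqlContext, "/data",
              source = "com.databricks.spark.csv",
              header = "true")

printSchema(df)  # column names should come from the CSV header
head(df)         # first rows collected as a local R data.frame
```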