RE: Can we load csv partitioned data into one DF?

2016-02-22 Thread Mohammed Guller
Are all the csv files in the same directory?

Mohammed
Author: Big Data Analytics with Spark

From: saif.a.ell...@wellsfargo.com [mailto:saif.a.ell...@wellsfargo.com]
Sent: Monday, February 22, 2016 7:25 AM
To:

Re: Can we load csv partitioned data into one DF?

2016-02-22 Thread Mich Talebzadeh
Indeed, this will work. Additionally, the files can be compressed as well (gz or bzip2):

val df = sqlContext.read.format("com.databricks.spark.csv").option("inferSchema", "true").option("header", "true").load("/data/stg")

On 22/02/2016 15:32, Alex Dzhagriev wrote:
> Hi Saif,
>
> You can put

Re: Can we load csv partitioned data into one DF?

2016-02-22 Thread Alex Dzhagriev
Hi Saif,

You can put your files into one directory and read it as text. Another option is to read them separately and then union the datasets.

Thanks, Alex.

On Mon, Feb 22, 2016 at 4:25 PM, wrote:
> Hello all, I am facing a silly data question.
>
> If I have +100
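
For reference, the second option Alex mentions (read each file separately, then union) might look roughly like the sketch below. It assumes Spark 1.x with the spark-csv package on the classpath and an existing sqlContext, as in the reply above. The helper name loadAndUnion and the example paths are hypothetical and not from the thread.

```scala
import org.apache.spark.sql.{DataFrame, SQLContext}

// Hypothetical helper (not from the thread): reads each CSV path into its own
// DataFrame and unions the results into a single DataFrame.
def loadAndUnion(sqlContext: SQLContext, paths: Seq[String]): DataFrame = {
  require(paths.nonEmpty, "need at least one path")

  val frames = paths.map { path =>
    sqlContext.read
      .format("com.databricks.spark.csv") // spark-csv data source, as used above
      .option("header", "true")
      .option("inferSchema", "true")
      .load(path)
  }

  // unionAll is the Spark 1.x name (Spark 2.x added union as the preferred name).
  // It matches columns by position, so every file must share the same layout.
  frames.reduce(_ unionAll _)
}

// Usage, with made-up file names:
// val df = loadAndUnion(sqlContext, Seq("/data/stg/a.csv", "/data/stg/b.csv"))
```

Because the schema is inferred per file here, a column could come out with different types in different files. Passing an explicit schema to the reader would avoid that, and pointing load at the directory, as in Mich's reply, is simpler when all the files sit together.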