Hi, I am trying to dynamically create DataFrames by reading the subdirectories under a parent directory in HDFS.
My code looks like this:

> import org.apache.spark._
> import org.apache.spark.sql._
>
> val hadoopConf = new org.apache.hadoop.conf.Configuration()
> val hdfsConn = org.apache.hadoop.fs.FileSystem.get(
>   new java.net.URI("hdfs://xxx.xx.xx.xxx:8020"), hadoopConf)
>
> hdfsConn.listStatus(new org.apache.hadoop.fs.Path("/TestDivya/Spark/ParentDir/")).foreach { fileStatus =>
>   val filePathName = fileStatus.getPath().toString()
>   val fileName = fileStatus.getPath().getName().toLowerCase()
>   var df = "df" + fileName
>   df = sqlContext.read.format("com.databricks.spark.csv")
>     .option("header", "true")
>     .option("inferSchema", "true")
>     .load(filePathName)
> }

and I am getting the error below:

> <console>:35: error: type mismatch;
>  found   : org.apache.spark.sql.DataFrame
>  required: String
>        df = sqlContext.read.format("com.databricks.spark.csv").option("header", "true").option("inferSchema", "true").load(filePathName)

Am I missing something? I would really appreciate the help.

Thanks,
Divya
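P.S. In case it clarifies what I am after: I want one DataFrame per file under the parent directory, each reachable by name. Since variable names cannot be built at runtime in Scala, below is a rough, untested sketch of what I mean, collecting the DataFrames into a Map keyed by file name instead ("dfByName" is just a name I made up for this sketch; it reuses hdfsConn and the imports from my code above):

> // Rough sketch (untested): store one DataFrame per file in a Map,
> // keyed by the lower-cased file name, instead of building variable names.
> import scala.collection.mutable
>
> val dfByName = mutable.Map[String, DataFrame]()
>
> hdfsConn.listStatus(new org.apache.hadoop.fs.Path("/TestDivya/Spark/ParentDir/")).foreach { fileStatus =>
>   val filePathName = fileStatus.getPath().toString()
>   val fileName = fileStatus.getPath().getName().toLowerCase()
>   dfByName(fileName) = sqlContext.read
>     .format("com.databricks.spark.csv")
>     .option("header", "true")
>     .option("inferSchema", "true")
>     .load(filePathName)
> }
>
> // e.g. a file named "Sales.csv" would then be reachable as:
> // dfByName("sales.csv").show()

Is a Map like this the right way to go, or is there a better approach?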