Hi Spark Users, I have a Spark job that reads data from a Parquet path generated by another system. Some of those Parquet paths contain no data, which is expected. Is there a way to read the Parquet such that, if no data is found, I can fall back to a dummy DataFrame and continue?
One way is to check whether the path exists:

val conf = spark.sparkContext.hadoopConfiguration
val fs = org.apache.hadoop.fs.FileSystem.get(conf)
val currentAreaExists = fs.exists(new org.apache.hadoop.fs.Path(consumableCurrentArea))

But I don't want to run this check for 300 Parquet paths. I'd rather just fall back to a dummy Parquet / custom DataFrame whenever a path has no data, since reading an empty path fails with:

AnalysisException: u'Unable to infer schema for Parquet. It must be specified manually.;'

Thanks
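One pattern that avoids an existence check per path is to attempt the read and fall back on failure. A minimal sketch, assuming you know the expected schema; `readParquetOrEmpty` and `fallbackSchema` are hypothetical names, and the single-column schema is a placeholder for your real one:

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.sql.{DataFrame, Row, SparkSession}
import org.apache.spark.sql.types.{StringType, StructField, StructType}

// Placeholder schema for the fallback DataFrame -- replace with the real one.
val fallbackSchema = StructType(Seq(StructField("id", StringType, nullable = true)))

// Try to read the path; if it fails (e.g. AnalysisException because the path
// is empty or missing), return an empty DataFrame with the expected schema.
def readParquetOrEmpty(spark: SparkSession, path: String): DataFrame =
  Try(spark.read.parquet(path)) match {
    case Success(df) => df
    case Failure(_) =>
      spark.createDataFrame(spark.sparkContext.emptyRDD[Row], fallbackSchema)
  }
```

Relatedly, if the directories exist but simply contain no files, supplying the schema explicitly with `spark.read.schema(fallbackSchema).parquet(path)` skips schema inference and yields an empty DataFrame instead of raising the AnalysisException.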