Hi Spark Users,

I have a Spark job that reads from Parquet paths generated by other systems. Some of those paths legitimately contain no data. Is there a way to attempt the read and, if nothing is found, fall back to a dummy DataFrame and continue?

One way is to check whether the path exists, like:

    val conf = spark.sparkContext.hadoopConfiguration
    val fs = org.apache.hadoop.fs.FileSystem.get(conf)
    val currentAreaExists = fs.exists(new org.apache.hadoop.fs.Path(consumableCurrentArea))

But I don't want to run this check for 300 Parquet paths. I'd rather just fall back to a dummy Parquet / custom DataFrame whenever a path has no data. Right now, reading an empty path fails with:

AnalysisException: u'Unable to infer schema for Parquet. It must be
specified manually.;'
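One pattern that may fit here is to wrap the read in a scala.util.Try and fall back to an empty DataFrame with an explicit schema when the read throws. This is a minimal sketch, not a tested solution; `dummySchema`, `readParquetOrEmpty`, and the column names are placeholders I made up, so substitute the real schema your downstream code expects:

    import scala.util.Try
    import org.apache.spark.sql.{DataFrame, Row, SparkSession}
    import org.apache.spark.sql.types._

    // Placeholder schema for illustration -- replace with your actual columns.
    val dummySchema = StructType(Seq(
      StructField("id", LongType, nullable = true),
      StructField("value", StringType, nullable = true)
    ))

    // Attempt the Parquet read; if it fails (e.g. AnalysisException because
    // the path is empty and the schema can't be inferred), return an empty
    // DataFrame with the explicit schema instead.
    def readParquetOrEmpty(spark: SparkSession, path: String): DataFrame =
      Try(spark.read.parquet(path)).getOrElse(
        spark.createDataFrame(spark.sparkContext.emptyRDD[Row], dummySchema)
      )

Note this swallows every read failure, not just the empty-path case, so you may want to match on AnalysisException specifically. Also, if all 300 paths share one known schema, passing it explicitly via spark.read.schema(dummySchema).parquet(path) may avoid the inference error in the first place, though I'd verify that against your Spark version.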

Thanks
