Hi Junfeng, can I ask why it is important to remove the empty column?
Regards,
Gourav Sengupta

On Tue, Apr 3, 2018 at 4:28 AM, Junfeng Chen <darou...@gmail.com> wrote:
> I am trying to read data from Kafka and write it in parquet format via
> Spark Streaming.
> The problem is that the data from Kafka have a variable structure. For
> example, app one has columns A, B, C, and app two has columns B, C, D. So
> the dataframe I read from Kafka has all columns A, B, C, D. When I write
> the dataframe to parquet files partitioned by app name, the parquet file
> for app one also contains column D, even though column D is empty and
> actually contains no data. So how can I filter out the empty columns when
> writing the dataframe to parquet?
>
> Thanks!
>
> Regards,
> Junfeng Chen
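One possible approach (a sketch, not something proposed in the thread): after splitting the data by app, count the non-null values in each column and drop the columns that are entirely null before writing. In Spark this could be done per partition with `F.count` (which counts only non-null values) followed by `df.drop(...)`. The plain-Python sketch below illustrates the same idea on rows represented as dicts; the helper name `drop_empty_columns` is hypothetical.

```python
# Hypothetical helper illustrating the idea: given one app's rows
# (dicts, with None for missing values), remove every column that
# holds no data in any row -- mirroring what one might do to each
# partition before writing it out as parquet.

def drop_empty_columns(rows):
    """Return rows with all-None columns removed."""
    if not rows:
        return []
    # Union of column names seen across all rows.
    columns = set().union(*(row.keys() for row in rows))
    # Keep a column only if at least one row has a non-null value for it.
    non_empty = [c for c in sorted(columns)
                 if any(row.get(c) is not None for row in rows)]
    return [{c: row.get(c) for c in non_empty} for row in rows]


# Example: app one never populates column D, so D is dropped.
app_one_rows = [
    {"A": 1, "B": 2, "C": 3, "D": None},
    {"A": 4, "B": 5, "C": 6, "D": None},
]
print(drop_empty_columns(app_one_rows))
# → [{'A': 1, 'B': 2, 'C': 3}, {'A': 4, 'B': 5, 'C': 6}]
```

In Spark itself, the analogous check would be a single aggregation such as `df.agg(*[F.count(c).alias(c) for c in df.columns])`, dropping any column whose count is zero before calling `write.partitionBy("app").parquet(...)`.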