Re: How to delete empty columns in df when writing to parquet?

2018-04-07 Thread Gourav Sengupta
Hi Junfeng,

You are welcome. If users are extremely adamant about seeing only a few columns, see whether you can create a view on just the selected columns and give that to them, in case you are using the Hive metastore.

Regards,
Gourav

On Sun, Apr 8, 2018 at 3:28 AM, Junfeng Chen
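
A minimal sketch of the view idea in spark-shell, assuming Hive support is enabled and a hypothetical table app_events with columns A, B, C, D (the table, view and column names are placeholders):

    // Register a view in the Hive metastore that exposes only the columns
    // app one's users care about; they query the view instead of the full table.
    spark.sql("""
      CREATE OR REPLACE VIEW app_one_events AS
      SELECT A, B, C FROM app_events
    """)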

Re: How to delete empty columns in df when writing to parquet?

2018-04-07 Thread Junfeng Chen
Hi,

Thanks for explaining!

Regards,
Junfeng Chen

On Wed, Apr 4, 2018 at 7:43 PM, Gourav Sengupta wrote:
> Hi,
>
> I do not think that in a columnar format it makes much of a difference.
> The amount of data that you will be parsing will not be much anyway.
>
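
To illustrate the point about columnar storage: Parquet only scans the column chunks that a query asks for, so the unused empty columns cost little at read time. A spark-shell sketch, where /data/events and the column names are placeholders:

    // Only the data for columns A, B and C is read from disk, even if the
    // files also contain an (entirely empty) column D.
    val appOne = spark.read.parquet("/data/events").select("A", "B", "C")
    appOne.show()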

Re: How to delete empty columns in df when writing to parquet?

2018-04-04 Thread Junfeng Chen
Our users ask for it.

Regards,
Junfeng Chen

On Wed, Apr 4, 2018 at 5:45 PM, Gourav Sengupta wrote:
> Hi Junfeng,
>
> Can I ask why it is important to remove the empty columns?
>
> Regards,
> Gourav Sengupta
>
> On Tue, Apr 3, 2018 at 4:28 AM, Junfeng Chen

Re: How to delete empty columns in df when writing to parquet?

2018-04-04 Thread Gourav Sengupta
Hi Junfeng,

Can I ask why it is important to remove the empty columns?

Regards,
Gourav Sengupta

On Tue, Apr 3, 2018 at 4:28 AM, Junfeng Chen wrote:
> I am trying to read data from Kafka and write it in Parquet format via
> Spark Streaming.
> The problem is, the data

Re: How to delete empty columns in df when writing to parquet?

2018-04-03 Thread Junfeng Chen
You mean I should start two Spark Streaming applications and read the topics separately?

Regards,
Junfeng Chen

On Tue, Apr 3, 2018 at 10:31 PM, naresh Goud wrote:
> I don't see any option other than starting two individual queries. It's
> just a thought.
>
> Thank you,
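
A rough sketch of the two-query idea with Structured Streaming in spark-shell; the broker address, topic names and paths below are placeholders:

    import org.apache.spark.sql.streaming.StreamingQuery

    def startQuery(topic: String, outPath: String): StreamingQuery =
      spark.readStream
        .format("kafka")
        .option("kafka.bootstrap.servers", "localhost:9092")
        .option("subscribe", topic)
        .load()
        .selectExpr("CAST(value AS STRING) AS json") // parse with from_json and a per-topic schema here
        .writeStream
        .format("parquet")
        .option("path", outPath)
        .option("checkpointLocation", "/checkpoints/" + topic)
        .start()

    // One independent query per topic, each with its own schema and output directory.
    val q1 = startQuery("app_one", "/data/app_one")
    val q2 = startQuery("app_two", "/data/app_two")
    spark.streams.awaitAnyTermination()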

How to delete empty columns in df when writing to parquet?

2018-04-02 Thread Junfeng Chen
I am trying to read data from Kafka and write it in Parquet format via Spark Streaming.

The problem is that the data from Kafka have a variable structure. For example, app one has columns A, B, C while app two has columns B, C, D, so the DataFrame I read from Kafka contains all of the columns A, B, C, D. When I decide
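
One way to drop the columns that are entirely null before writing each batch, as a sketch (the output path is a placeholder, and the extra aggregation adds a pass over the batch's data):

    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{col, count}

    // Drop every column whose values are all null in this DataFrame.
    def dropEmptyColumns(df: DataFrame): DataFrame = {
      // count(col) ignores nulls, so a single aggregation yields the
      // non-null count of every column.
      val counts = df.agg(
        count(col(df.columns.head)).alias(df.columns.head),
        df.columns.tail.map(c => count(col(c)).alias(c)): _*
      ).first()
      val keep = df.columns.filter(c => counts.getAs[Long](c) > 0L)
      df.select(keep.map(col): _*)
    }

    // e.g. inside foreachRDD / foreachBatch, before the Parquet write:
    // dropEmptyColumns(batchDf).write.mode("append").parquet("/data/output")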