Re: Converting None/Null into json in pyspark

2022-10-03 Thread Yeachan Park
Hi,

There's a config option for this. Try setting this to false in your Spark conf:

spark.sql.jsonGenerator.ignoreNullFields

On Tuesday, October 4, 2022, Karthick Nk wrote:
> Hi all,
>
> I need to convert a pyspark dataframe into json.
>
> While converting, if all row values are null/None for a particular
> column, that column is getting removed from the data.
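For reference, a minimal sketch of the suggestion above; the DataFrame contents and column names are illustrative, not from the thread:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("keep-null-fields")
    # Keep null/None fields in generated JSON instead of dropping them.
    .config("spark.sql.jsonGenerator.ignoreNullFields", "false")
    .getOrCreate()
)

# Explicit schema, since an all-None column cannot be inferred.
df = spark.createDataFrame([(1, None), (2, None)], "id INT, comment STRING")

# With ignoreNullFields=false each record keeps the null column,
# e.g. {"id":1,"comment":null} rather than {"id":1}.
for line in df.toJSON().collect():
    print(line)

The same behaviour can also be requested per write, e.g. df.write.option("ignoreNullFields", "false").json(path).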

Reading too many files

2022-10-03 Thread Sachit Murarka
Hello,

I am reading a large number of files in Spark 3.2 (Parquet). It is not giving any error in the logs, but after spark.read.parquet it is not able to proceed further.

Can anyone please suggest if there is any property to improve the parallel reads? I am reading more than 25,000 files.

Kind regards,
Sachit Murarka
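Not from the thread, but a hedged sketch of properties that commonly affect many-file Parquet reads. A hang right after spark.read.parquet is often driver-side file listing or schema inference rather than the read itself, so these knobs are a reasonable first check; the path, schema, and values below are placeholders, not tuned recommendations:

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("many-parquet-files")
    # Above this many paths, file listing runs as a distributed Spark job
    # instead of sequentially on the driver (default 32).
    .config("spark.sql.sources.parallelPartitionDiscovery.threshold", "32")
    # Parallelism of that distributed listing job (default 10000).
    .config("spark.sql.sources.parallelPartitionDiscovery.parallelism", "10000")
    # Upper bound on bytes packed into one read partition (default 128MB).
    .config("spark.sql.files.maxPartitionBytes", "128MB")
    .getOrCreate()
)

# Supplying the schema up front skips the footer-sampling inference pass,
# which can itself be slow over tens of thousands of files.
df = (
    spark.read
    .schema("id INT, payload STRING")   # illustrative schema
    .parquet("/data/events/")           # illustrative path
)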

Re: Reading too many files

2022-10-03 Thread Sid
Are you trying to run this on the cloud?

On Mon, 3 Oct 2022, 21:55 Sachit Murarka wrote:
> Hello,
>
> I am reading a large number of files in Spark 3.2 (Parquet). It is not
> giving any error in the logs, but after spark.read.parquet it is not
> able to proceed further.
>
> Can anyone please suggest if there is any property to improve the
> parallel reads? I am reading more than 25,000 files.

Converting None/Null into json in pyspark

2022-10-03 Thread Karthick Nk
Hi all,

I need to convert a pyspark dataframe into json. While converting, if all row values are null/None for a particular column, that column is getting removed from the data.

Could you suggest a way to do this? I need to convert the dataframe into json with all columns included.

Thanks
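A toy reproduction of the behaviour being described, under Spark's default settings; the table contents are made up:

from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("null-column-repro").getOrCreate()

# "comment" is null in every row; schema is explicit because an
# all-None column cannot be inferred.
df = spark.createDataFrame([(1, None), (2, None)], "id INT, comment STRING")

print(df.toJSON().collect())
# Prints ['{"id":1}', '{"id":2}']: the all-null "comment" column
# has vanished from every JSON record.

See the spark.sql.jsonGenerator.ignoreNullFields reply above for the fix.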

Re: Reading too many files

2022-10-03 Thread Henrik Pang
You may need a large amount of cluster memory and fast disk IO.

Sachit Murarka wrote:
> Can anyone please suggest if there is any property to improve the
> parallel reads? I am reading more than 25,000 files.

--
Simple Mail
https://simplemail.co.in/