
Re: Spark Sql group by less performant

2018-12-10 Thread Georg Heiler

See https://databricks.com/blog/2016/05/19/approximate-algorithms-in-apache-spark-hyperloglog-and-quantiles.html
You most probably do not require exact counts.

On Tue, 11 Dec 2018 at 02:09, 15313776907 <15313776...@163.com> wrote:
> I think you can add executor memory.
>
> 15313776907
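To make the trade-off behind approximate counts concrete, here is a minimal sketch of an approximate distinct counter using only the JDK. It uses linear counting, a simpler technique than the HyperLogLog covered in the linked post, and it is not how Spark implements anything. Within Spark, the built-in `approx_count_distinct` aggregate is the usual route. The class name `LinearCounter` and its methods are hypothetical, chosen for illustration.

```java
import java.util.BitSet;

/** Illustrative linear-counting sketch; all names here are hypothetical. */
final class LinearCounter {
    private final BitSet bits;
    private final int size;

    LinearCounter(int sizeInBits) {
        if (sizeInBits <= 0) {
            throw new IllegalArgumentException("sizeInBits must be positive");
        }
        this.size = sizeInBits;
        this.bits = new BitSet(sizeInBits);
    }

    /** Hash the value to one slot and mark it; duplicates hit the same slot. */
    void add(long value) {
        int slot = (int) Long.remainderUnsigned(mix(value), size);
        bits.set(slot);
    }

    /** Estimate the number of distinct values from the share of empty slots. */
    double estimate() {
        int zeros = size - bits.cardinality();
        if (zeros == 0) {
            // Bitmap is saturated: the sketch was sized too small for the input.
            return Double.POSITIVE_INFINITY;
        }
        return size * Math.log((double) size / zeros);
    }

    /** SplitMix64 finalizer, used here as a cheap bit mixer. */
    private static long mix(long z) {
        z = (z ^ (z >>> 30)) * 0xBF58476D1CE4E5B9L;
        z = (z ^ (z >>> 27)) * 0x94D049BB133111EBL;
        return z ^ (z >>> 31);
    }

    public static void main(String[] args) {
        LinearCounter counter = new LinearCounter(1 << 16);
        for (int repeat = 0; repeat < 3; repeat++) {
            for (long id = 0; id < 10_000; id++) {
                counter.add(id); // each id is added three times in total
            }
        }
        System.out.println("approximate distinct ids: " + counter.estimate());
    }
}
```

The sketch keeps a fixed 65,536-bit bitmap regardless of how many values are added, which is the point: memory stays constant and the answer is an estimate rather than an exact count.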

Re: Spark Sql group by less performant

2018-12-10 Thread 15313776907

I think you can add executor memory.

15313776907
Email: 15313776...@163.com

On 12/11/2018 08:28, lsn24 wrote:
> Hello, I have a requirement where I need to get total count of rows and total count of failedRows based on a grouping. The code looks like below:

Spark Sql group by less performant

2018-12-10 Thread lsn24

Hello,

I have a requirement where I need to get total count of rows and total count of failedRows based on a grouping. The code looks like below:

    myDataset.createOrReplaceTempView("temp_view");
    Dataset countDataset = sparkSession.sql("Select
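The query above is cut off in the archive, so what follows is not the poster's code. It is a minimal plain-Java sketch of the stated requirement only: one pass over the rows that produces, per group, the total row count and the failed row count. The `Row` class, its `groupKey` and `failed` fields, and the `countByGroup` method are all assumptions made for illustration. In SQL the same shape is commonly written as a single GROUP BY with `COUNT(*)` next to `SUM(CASE WHEN ... THEN 1 ELSE 0 END)`, so both counts come from one aggregation rather than two.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative single-pass grouped counting; all names are hypothetical. */
final class GroupedFailureCounts {

    /** Hypothetical input row: a grouping key plus a failure flag. */
    static final class Row {
        final String groupKey;
        final boolean failed;

        Row(String groupKey, boolean failed) {
            this.groupKey = groupKey;
            this.failed = failed;
        }
    }

    /** Returns, per group key, a two-element array: [0] = all rows, [1] = failed rows. */
    static Map<String, long[]> countByGroup(List<Row> rows) {
        Map<String, long[]> counts = new LinkedHashMap<>();
        for (Row row : rows) {
            long[] c = counts.computeIfAbsent(row.groupKey, k -> new long[2]);
            c[0]++;            // every row counts toward the group total
            if (row.failed) {
                c[1]++;        // failed rows are counted in the same pass
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("a", false),
                new Row("a", true),
                new Row("b", true),
                new Row("a", false));
        countByGroup(rows).forEach((key, c) ->
                System.out.println(key + ": total=" + c[0] + ", failed=" + c[1]));
    }
}
```

Keeping both counters in one accumulator per key means the data is scanned and grouped once; computing the two counts in separate grouped queries and joining them afterwards would repeat that work.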