Exactly: one row and two columns.

On Sat, Apr 9, 2022 at 17:44, Sean Owen <sro...@gmail.com> wrote:
> But it only has one row, right?
>
> On Sat, Apr 9, 2022, 10:06 AM sam smith <qustacksm2123...@gmail.com>
> wrote:
>
>> Yes. It returns the number of rows in the Dataset as *long*, but in my case
>> the aggregation returns a table of two columns.
>>
>> On Fri, Apr 8, 2022 at 14:12, Sean Owen <sro...@gmail.com> wrote:
>>
>>> Dataset.count() returns one value directly?
>>>
>>> On Thu, Apr 7, 2022 at 11:25 PM sam smith <qustacksm2123...@gmail.com>
>>> wrote:
>>>
>>>> My bad, yes, of course! Still, I don't like the ..
>>>> select("count(myCol)") .. part of my line. Is there any replacement for
>>>> that?
>>>>
>>>> On Fri, Apr 8, 2022 at 06:13, Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> Just do an average then? Most of my point is that filtering to one
>>>>> group and then grouping is pointless.
>>>>>
>>>>> On Thu, Apr 7, 2022, 11:10 PM sam smith <qustacksm2123...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> What if I do avg instead of count?
>>>>>>
>>>>>> On Fri, Apr 8, 2022 at 05:32, Sean Owen <sro...@gmail.com> wrote:
>>>>>>
>>>>>>> Wait, why groupBy at all? After the filter, only rows with myCol
>>>>>>> equal to your target are left. There is only one group. Don't group;
>>>>>>> just count after the filter?
>>>>>>>
>>>>>>> On Thu, Apr 7, 2022, 10:27 PM sam smith <qustacksm2123...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I want to aggregate a column by counting the number of rows having
>>>>>>>> the value "myTargetValue" and return the result.
>>>>>>>> I am doing it as follows, in Java:
>>>>>>>>
>>>>>>>>> long result =
>>>>>>>>> dataset.filter(dataset.col("myCol").equalTo("myTargetVal")).groupBy(col("myCol")).agg(count(dataset.col("myCol"))).select("count(myCol)").first().getLong(0);
>>>>>>>>
>>>>>>>> Is that the right way? If not, what is a more optimized way to do
>>>>>>>> that (still in Java)?
>>>>>>>> Thanks for the help.
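
For reference, the suggestion made in the thread (skip the groupBy and count directly after the filter) could look roughly like the sketch below. It assumes Spark's Java API is on the classpath. The class name FilterAggregateSketch, the method names, and the numericColumn parameter are hypothetical, introduced only for illustration. The second method shows one possible way to take an average without groupBy or select("..."): a global agg() yields a single row with a single column, so the value can be read by position.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;

// Hypothetical helper class; not part of the original thread.
final class FilterAggregateSketch {

    private FilterAggregateSketch() {}

    // Counts the rows whose `column` equals `target`.
    // After the filter there is only one group, so no groupBy is needed:
    // Dataset.count() returns the number of remaining rows as a long.
    static long countMatching(Dataset<Row> dataset, String column, String target) {
        return dataset.filter(col(column).equalTo(target)).count();
    }

    // Averages `numericColumn` (assumed to be a numeric column) over the rows
    // whose `column` equals `target`. agg() without groupBy aggregates over the
    // whole filtered Dataset and returns one row with one column, so
    // select("avg(...)") is not needed. Returns NaN when no row matches,
    // because the aggregate is then null.
    static double averageOfMatching(Dataset<Row> dataset, String column,
                                    String target, String numericColumn) {
        Row row = dataset
                .filter(col(column).equalTo(target))
                .agg(avg(col(numericColumn)))
                .first();
        return row.isNullAt(0) ? Double.NaN : row.getDouble(0);
    }
}
```

With the names from the original question, the count would then read: long result = FilterAggregateSketch.countMatching(dataset, "myCol", "myTargetVal");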