Yes, Dataset.count() returns the number of rows in the Dataset as a *long*. But in my case the aggregation returns a table of two columns.
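
For reference, here is a minimal sketch of the two variants discussed in this thread, not a definitive implementation. It assumes a Dataset<Row> with a string column myCol, as in the original question. The numeric column otherCol, the class name, and the sample rows are hypothetical and exist only for illustration. Calling agg() without groupBy() gives a single row with a single column, so there is no generated "count(myCol)" column name to select.

```java
import java.util.Arrays;
import java.util.List;

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.avg;
import static org.apache.spark.sql.functions.col;

public class TargetValueAggregation {

    // Counts the rows where myCol equals the target. Dataset.count() already
    // returns a long, so no groupBy, agg, or select is needed.
    static long countMatches(Dataset<Row> dataset, String target) {
        return dataset.filter(col("myCol").equalTo(target)).count();
    }

    // Averages otherCol (a hypothetical numeric column) over the matching rows.
    // agg() without groupBy() yields one row with one column.
    // Returns null when no row matches, because the average is then null.
    static Double avgOfMatches(Dataset<Row> dataset, String target) {
        Row row = dataset.filter(col("myCol").equalTo(target))
                .agg(avg(col("otherCol")))
                .first();
        if (row.isNullAt(0)) {
            return null;
        }
        return row.getDouble(0);
    }

    public static void main(String[] args) {
        // Local session for illustration only.
        SparkSession spark = SparkSession.builder()
                .master("local[*]")
                .appName("TargetValueAggregation")
                .getOrCreate();

        // Hypothetical sample data: two matching rows and one non-matching row.
        StructType schema = new StructType()
                .add("myCol", DataTypes.StringType)
                .add("otherCol", DataTypes.DoubleType);
        List<Row> rows = Arrays.asList(
                RowFactory.create("myTargetVal", 1.0),
                RowFactory.create("somethingElse", 10.0),
                RowFactory.create("myTargetVal", 3.0));
        Dataset<Row> dataset = spark.createDataFrame(rows, schema);

        System.out.println(countMatches(dataset, "myTargetVal")); // prints 2
        System.out.println(avgOfMatches(dataset, "myTargetVal")); // prints 2.0

        spark.stop();
    }
}
```

If the groupBy is kept for some other reason, the result has two columns (the grouping column and the aggregate), which is why a select on the aggregate column was needed in the original line.
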

On Fri, Apr 8, 2022 at 14:12, Sean Owen <sro...@gmail.com> wrote:

> Dataset.count() returns one value directly?
>
> On Thu, Apr 7, 2022 at 11:25 PM sam smith <qustacksm2123...@gmail.com>
> wrote:
>
>> My bad, yes of course! Still, I don't like the ..
>> select("count(myCol)") .. part of my line. Is there any replacement for that?
>>
>> On Fri, Apr 8, 2022 at 06:13, Sean Owen <sro...@gmail.com> wrote:
>>
>>> Just do an average then? Most of my point is that filtering to one group
>>> and then grouping is pointless.
>>>
>>> On Thu, Apr 7, 2022, 11:10 PM sam smith <qustacksm2123...@gmail.com>
>>> wrote:
>>>
>>>> What if I do avg instead of count?
>>>>
>>>> On Fri, Apr 8, 2022 at 05:32, Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> Wait, why groupBy at all? After the filter, only rows with myCol equal
>>>>> to your target are left. There is only one group. Don't group, just count
>>>>> after the filter?
>>>>>
>>>>> On Thu, Apr 7, 2022, 10:27 PM sam smith <qustacksm2123...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I want to aggregate a column by counting the number of rows having
>>>>>> the value "myTargetValue" and return the result.
>>>>>> I am doing it like the following, in Java:
>>>>>>
>>>>>>> long result =
>>>>>>> dataset.filter(dataset.col("myCol").equalTo("myTargetVal")).groupBy(col("myCol")).agg(count(dataset.col("myCol"))).select("count(myCol)").first().getLong(0);
>>>>>>
>>>>>> Is that the right way? If not, what is a more optimized way to do that
>>>>>> (always in Java)?
>>>>>> Thanks for the help.