Exactly: one row and two columns.
On Sat, Apr 9, 2022 at 5:44 PM, Sean Owen wrote:
> But it only has one row, right?
>
> On Sat, Apr 9, 2022, 10:06 AM sam smith
> wrote:
>
>> Yes, it returns the number of rows in the Dataset as a *long*, but in my case
>> the aggregation returns a table of two columns
Yes, it returns the number of rows in the Dataset as a *long*, but in my case
the aggregation returns a table of two columns.
On Fri, Apr 8, 2022 at 2:12 PM, Sean Owen wrote:
> Dataset.count() returns one value directly?
>
> On Thu, Apr 7, 2022 at 11:25 PM sam smith
> wrote:
>
>> My bad, yes of cour
Dataset.count() returns one value directly?
On Thu, Apr 7, 2022 at 11:25 PM sam smith
wrote:
> My bad, yes, of course! Still, I don't like the ..
> select("count(myCol)") .. part in my line. Is there a replacement for it?
>
> On Fri, Apr 8, 2022 at 6:13 AM, Sean Owen wrote:
>
>> Just do a
My bad, yes, of course! Still, I don't like the ..
select("count(myCol)") .. part in my line. Is there a replacement for it?
On Fri, Apr 8, 2022 at 6:13 AM, Sean Owen wrote:
> Just do an average then? Most of my point is that filtering to one group
> and then grouping is pointless.
>
> On
What if I do avg instead of count?
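
For readers following the avg question, here is a minimal plain-Java sketch of the same idea (filter to the target value, then average, with no grouping step). It uses java.util.stream as a stand-in for a Spark Dataset, so it only illustrates the logic. The Row class and its numeric "amount" field are hypothetical; the thread does not say which column would be averaged.

```java
import java.util.List;
import java.util.OptionalDouble;

public class FilterThenAverage {
    // Hypothetical row type: "myCol" comes from the thread, "amount" is assumed.
    public static final class Row {
        final String myCol;
        final double amount;

        public Row(String myCol, double amount) {
            this.myCol = myCol;
            this.amount = amount;
        }
    }

    // Keep only rows whose myCol equals the target, then average "amount".
    // No grouping is needed because only one key survives the filter.
    // The result is empty when no row matches the target.
    static OptionalDouble averageForTarget(List<Row> rows, String target) {
        return rows.stream()
                .filter(r -> target.equals(r.myCol))
                .mapToDouble(r -> r.amount)
                .average();
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(
                new Row("myTargetVal", 2.0),
                new Row("other", 10.0),
                new Row("myTargetVal", 4.0));
        System.out.println(averageForTarget(rows, "myTargetVal"));
    }
}
```

In Spark's Java API the comparable shape would presumably be a filter followed by agg(avg(...)) on the filtered Dataset, which yields a single row to read the value from; treat that as an assumption to check against the Dataset documentation rather than a tested line.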
On Fri, Apr 8, 2022 at 5:32 AM, Sean Owen wrote:
> Wait, why groupBy at all? After the filter, only rows with myCol equal to
> your target are left. There is only one group. Don't group; just count after
> the filter?
>
> On Thu, Apr 7, 2022, 10:27 PM sam smith
Wait, why groupBy at all? After the filter, only rows with myCol equal to
your target are left. There is only one group. Don't group; just count after
the filter?
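
To make the point above concrete, here is a small plain-Java sketch (java.util.stream, not the Spark API) comparing filter, group, count against filter, count. The class and method names are made up for illustration. In Spark, the direct version would correspond to calling count() on the filtered Dataset, which returns a long.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

public class FilterThenCount {
    // Filter, then group, then count. After the filter only one key can
    // remain, so the result has at most one entry (one "row": key and count).
    static Map<String, Long> filterGroupCount(List<String> myCol, String target) {
        return myCol.stream()
                .filter(target::equals)
                .collect(Collectors.groupingBy(Function.identity(), Collectors.counting()));
    }

    // Filter, then count directly. Same number, no grouping step.
    static long filterCount(List<String> myCol, String target) {
        return myCol.stream().filter(target::equals).count();
    }

    public static void main(String[] args) {
        List<String> myCol = List.of("a", "myTargetVal", "b", "myTargetVal", "myTargetVal");
        System.out.println(filterGroupCount(myCol, "myTargetVal"));
        System.out.println(filterCount(myCol, "myTargetVal"));
    }
}
```

The grouped version returns a single key/count pair, while the direct version returns the bare number, which is what the assignment to a long needs.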
On Thu, Apr 7, 2022, 10:27 PM sam smith wrote:
> I want to aggregate a column by counting the number of rows having the
> value "myTarg
I want to aggregate a column by counting the number of rows that have the
value "myTargetValue" and return the result.
I am doing it as follows, in Java:
> long result =
> dataset.filter(dataset.col("myCol").equalTo("myTargetVal")).groupBy(col("myCol")).agg(count(dataset.col("myCol"))).select("c