Re: Calculation of histogram bins and frequency in Apache spark 1.6

2016-02-25 Thread Yanbo Liang
Actually, Spark SQL's `groupBy` with `count` can compute the frequency in
each bin. You can also try DataFrameStatFunctions.freqItems() to get the
frequent items for columns.
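
A minimal PySpark sketch of the `groupBy`/`count` approach, assuming equal-width bins over a known range. The function names `bin_width` and `histogram_by_group_by`, the `bin` output column, and the `lo`/`hi`/`n_bins` parameters are hypothetical, chosen only for illustration:

```python
def bin_width(lo, hi, n_bins):
    """Width of each of n_bins equal-width bins covering [lo, hi]."""
    if n_bins <= 0 or hi <= lo:
        raise ValueError("need n_bins > 0 and hi > lo")
    return (hi - lo) / float(n_bins)


def histogram_by_group_by(df, col_name, lo, hi, n_bins):
    """Count rows per equal-width bin using groupBy and count.

    Returns a DataFrame with columns (bin, count). Bins that contain
    no rows do not appear in the result.
    """
    # Imported here so bin_width() can be used without Spark installed.
    from pyspark.sql import functions as F

    width = bin_width(lo, hi, n_bins)
    # floor((x - lo) / width) is the zero-based bin index.
    raw = F.floor((F.col(col_name) - F.lit(lo)) / F.lit(width))
    # Put x == hi (and anything above it) into the last bin.
    # Values below lo get a negative index and are not clamped.
    bin_col = F.when(raw >= n_bins, n_bins - 1).otherwise(raw).alias("bin")
    return df.select(bin_col).groupBy("bin").count().orderBy("bin")
```

For example, `histogram_by_group_by(df, "value", 0.0, 100.0, 10).show()` would print one row per non-empty bin, where `df` and the `value` column are assumed. Note that `df.stat.freqItems(["value"], 0.1)` answers a different question: it returns items that occur frequently in the column, not counts per bin.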

Thanks
Yanbo

2016-02-24 1:21 GMT+08:00 Burak Yavuz:

> You could use the Bucketizer transformer in Spark ML.
>
> Best,
> Burak
>
> On Tue, Feb 23, 2016 at 9:13 AM, Arunkumar Pillai wrote:
>
>> Hi,
>> Is there any predefined method to calculate histogram bins and frequencies
>> in Spark? Currently I take the range, find the bins, and then count the
>> frequencies using a SQL query.
>>
>> Is there a better way?
>>
>
>


Re: Calculation of histogram bins and frequency in Apache spark 1.6

2016-02-23 Thread Burak Yavuz
You could use the Bucketizer transformer in Spark ML.
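
A minimal PySpark sketch of the Bucketizer approach, assuming equal-width bins and a numeric input column of double type. The function names `equal_width_splits` and `histogram_with_bucketizer` and the `bucket` output column are hypothetical, chosen only for illustration:

```python
def equal_width_splits(lo, hi, n_bins):
    """Return n_bins + 1 increasing split points covering [lo, hi]."""
    # Bucketizer expects at least three split points, i.e. two buckets.
    if n_bins < 2 or hi <= lo:
        raise ValueError("need n_bins >= 2 and hi > lo")
    width = (hi - lo) / float(n_bins)
    splits = [lo + i * width for i in range(n_bins)]
    # Use hi itself as the last split to avoid floating-point drift.
    splits.append(float(hi))
    return [float(s) for s in splits]


def histogram_with_bucketizer(df, col_name, splits):
    """Assign each row to a bucket, then count rows per bucket.

    Returns a DataFrame with columns (bucket, count). Buckets that
    contain no rows do not appear in the result.
    """
    # Imported here so equal_width_splits() works without Spark installed.
    from pyspark.ml.feature import Bucketizer

    bucketizer = Bucketizer(splits=splits, inputCol=col_name,
                            outputCol="bucket")
    bucketed = bucketizer.transform(df)
    return bucketed.groupBy("bucket").count().orderBy("bucket")
```

For example, `histogram_with_bucketizer(df, "value", equal_width_splits(0.0, 100.0, 10))` would give the count per bucket, where `df` and the `value` column are assumed. If the data may fall outside the chosen range, adding `float("-inf")` and `float("inf")` as the first and last split points is one way to keep the transform from rejecting those rows.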

Best,
Burak

On Tue, Feb 23, 2016 at 9:13 AM, Arunkumar Pillai wrote:

> Hi,
> Is there any predefined method to calculate histogram bins and frequencies
> in Spark? Currently I take the range, find the bins, and then count the
> frequencies using a SQL query.
>
> Is there a better way?
>