en project, if you develop a third party library
>
> On Fri, Oct 19, 2018, 2:32 PM Matt Saunders wrote:
>
>> Thanks, Eric. I went ahead and created SPARK-25782 for this improvement
>> since it is a feature I and others have looked for in MLlib, but doesn't
>> seem t
or it, and then a pull
> request. Another possibility is to publish it as your own 3rd party
> library, which I have done for aggregators before.
>
>
> On Wed, Oct 17, 2018 at 4:54 PM Matt Saunders wrote:
>
I built an Aggregator that computes PCA on grouped datasets. I wanted to
use the PCA functions provided by MLlib, but they only work on a full
dataset, and I needed to do it on a grouped dataset (like a
RelationalGroupedDataset).
So I built a little Aggregator that can do that, here’s an example o
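
The example from the original message is cut off above and is not reproduced here. As a rough illustration of the general idea only, and not the code from this thread or from SPARK-25782, the sketch below mimics the zero / reduce / merge / finish contract of Spark's Aggregator in plain Python with no Spark dependency. Everything in it is an assumption made for illustration: the dict-based buffer layout, the function names, and the use of power iteration to recover only the first principal component (a real implementation would more likely use a full eigendecomposition or SVD). Power iteration can also stall if its starting vector happens to be orthogonal to the leading component.

```python
import math

def zero(dim):
    """Empty buffer: row count, per-column sums, and sum of outer products."""
    return {"n": 0,
            "sums": [0.0] * dim,
            "outer": [[0.0] * dim for _ in range(dim)]}

def reduce(buf, row):
    """Fold one row (a sequence of floats) into the buffer, in place."""
    buf["n"] += 1
    for i, xi in enumerate(row):
        buf["sums"][i] += xi
        for j, xj in enumerate(row):
            buf["outer"][i][j] += xi * xj
    return buf

def merge(a, b):
    """Combine two partial buffers, e.g. from two partitions of one group."""
    dim = len(a["sums"])
    return {"n": a["n"] + b["n"],
            "sums": [a["sums"][i] + b["sums"][i] for i in range(dim)],
            "outer": [[a["outer"][i][j] + b["outer"][i][j] for j in range(dim)]
                      for i in range(dim)]}

def normalize(v):
    """Scale v to unit length; fails if the group has no variance."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("zero variance: no principal component")
    return [x / norm for x in v]

def finish(buf, iters=100):
    """Return the first principal component as a unit vector."""
    n, dim = buf["n"], len(buf["sums"])
    if n < 2:
        raise ValueError("need at least two rows per group")
    mean = [s / n for s in buf["sums"]]
    # Sample covariance from the accumulated sums.
    cov = [[(buf["outer"][i][j] - n * mean[i] * mean[j]) / (n - 1)
            for j in range(dim)] for i in range(dim)]
    # Start from the covariance column of the highest-variance feature.
    start = max(range(dim), key=lambda i: cov[i][i])
    v = normalize([cov[i][start] for i in range(dim)])
    for _ in range(iters):
        v = normalize([sum(cov[i][j] * v[j] for j in range(dim))
                       for i in range(dim)])
    # Fix the sign so the largest-magnitude entry is positive.
    pivot = max(range(dim), key=lambda i: abs(v[i]))
    return v if v[pivot] >= 0 else [-x for x in v]

# Hypothetical grouped data: (group key, feature row).
rows = [("a", (1.0, 2.0)), ("a", (2.0, 4.0)), ("a", (3.0, 6.0)), ("a", (4.0, 8.0)),
        ("b", (1.0, -1.0)), ("b", (2.0, -2.0)), ("b", (3.0, -3.0))]

buffers = {}
for key, row in rows:
    if key not in buffers:
        buffers[key] = zero(len(row))
    reduce(buffers[key], row)

components = {key: finish(buf) for key, buf in buffers.items()}
```

Because the buffer holds only counts and sums, two partial buffers for the same group can be merged in any order, which is what lets this kind of aggregation run per group across partitions.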