On Wed, May 25, 2016 at 09:10:02AM +0000, Kouhei Kaigai wrote:
> > -----Original Message-----
> > From: Simon Riggs [mailto:si...@2ndquadrant.com]
> > Sent: Wednesday, May 25, 2016 4:39 PM
> > To: Kaigai Kouhei(海外 浩平)
> > Cc: email@example.com
> > Subject: Re: [HACKERS] Does people favor to have matrix data type?
> > On 25 May 2016 at 03:52, Kouhei Kaigai <kai...@ak.jp.nec.com> wrote:
> > For the past few days, I've been working on a data type that represents a
> > matrix in the mathematical sense. Would people favor having this data type
> > in the core, not only in my extension?
> > If we understood the use case, it might help understand whether to include
> > it or not.
> > Multi-dimensionality of arrays isn't always useful, so this could be good.
> As you may expect, I have been working on a matrix data type as part of the
> groundwork for GPU acceleration, though its use is not limited to that.
> What I am trying to do is in-database calculation of some analytic algorithms,
> without exporting the entire dataset to the client side.
> My first target is k-means clustering, which is often used in data mining.
> When we categorize N items, each with M attributes, into k clusters, the master
> data can be represented as an NxM matrix; that is equivalent to N vectors in
> M-dimensional space.
> Each cluster centroid is also located in the M-dimensional space, so the
> centroids can be represented as a kxM matrix; that is equivalent to k vectors
> in M-dimensional space.
> The k-means algorithm needs to calculate the distance from each item to every
> cluster centroid, so it produces an Nxk matrix, usually called the distance
> matrix. Next, it updates the cluster centroids using the distance matrix, then
> repeats the entire process until convergence.
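For readers following along, the loop described above can be sketched in plain Python. This is only an illustration under my own assumptions: Euclidean distance, caller-supplied initial centroids, and a centroid that stays put when its cluster is empty. The function names (distance_matrix, update_centroids, kmeans) are hypothetical and do not come from any existing extension.

```python
import math


def distance_matrix(items, centroids):
    """Return the N x k matrix of Euclidean distances (items x centroids)."""
    return [
        [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, c))) for c in centroids]
        for x in items
    ]


def update_centroids(items, dist, centroids):
    """Move each centroid to the mean of the items nearest to it."""
    k, m = len(centroids), len(centroids[0])
    sums = [[0.0] * m for _ in range(k)]
    counts = [0] * k
    for x, row in zip(items, dist):
        nearest = row.index(min(row))  # column of the smallest distance
        counts[nearest] += 1
        for j, v in enumerate(x):
            sums[nearest][j] += v
    # Assumption: an empty cluster keeps its previous centroid.
    return [
        [s / counts[c] for s in sums[c]] if counts[c] else list(centroids[c])
        for c in range(k)
    ]


def kmeans(items, centroids, max_iter=100, tol=1e-9):
    """Repeat distance-matrix and update steps until centroids stop moving."""
    centroids = [[float(v) for v in c] for c in centroids]
    for _ in range(max_iter):
        dist = distance_matrix(items, centroids)           # N x k
        new = update_centroids(items, dist, centroids)     # k x M
        shift = max(
            math.sqrt(sum((a - b) ** 2 for a, b in zip(old, cur)))
            for old, cur in zip(centroids, new)
        )
        centroids = new
        if shift <= tol:
            break
    return centroids
```

Calling kmeans(items, initial_centroids) with an NxM list of rows and a kxM list of starting centroids returns the final kxM centroids. Nearly all of the time goes into distance_matrix, which is the part a native matrix type would replace.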
> The heart of the workload is the calculation of the distance matrix. When I
> tried to write the k-means algorithm using SQL + R, its performance was poor.
> If we had native functions to use instead of complicated SQL expressions, it
> would make sense for people who try in-database analytics.
> Also, fortunately, PostgreSQL's 2-D array format is binary compatible with the
> BLAS library's requirements. It will allow a GPU to process large matrices in
> HPC grade
> NEC Business Creation Division / PG-Strom Project
> KaiGai Kohei <kai...@ak.jp.nec.com>
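On the BLAS compatibility point, here is a small sketch of why a flat row-major buffer can be handed to a column-major routine. It assumes the 2-D array has no NULLs and that its elements are stored contiguously in row-major order; that is my understanding of the format, not something checked here. The helper names are hypothetical. Under that assumption, an NxM row-major buffer read as a column-major MxN matrix with leading dimension M is simply the transpose, so no copy is needed.

```python
def flatten_row_major(matrix):
    """Flatten an N x M nested list into one contiguous row-major list."""
    return [v for row in matrix for v in row]


def at_row_major(buf, n_cols, i, j):
    """Element (i, j) of a row-major matrix with n_cols columns."""
    return buf[i * n_cols + j]


def at_col_major(buf, ld, i, j):
    """Element (i, j) of a column-major matrix with leading dimension ld."""
    return buf[j * ld + i]


# N = 2 rows, M = 3 columns
a = [[1, 2, 3],
     [4, 5, 6]]
buf = flatten_row_major(a)

# Reading the same buffer as a column-major M x N matrix (ld = M = 3)
# yields the transpose: element (j, i) there equals a[i][j] here.
transpose_matches = all(
    at_col_major(buf, 3, j, i) == a[i][j]
    for i in range(2)
    for j in range(3)
)
```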
Have you looked at Perl Data Language under pl/perl? It has pretty nice support
for matrix calculations:
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)