[ https://issues.apache.org/jira/browse/MADLIB-1086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15987818#comment-15987818 ]
Frank McQuillan edited comment on MADLIB-1086 at 4/27/17 10:24 PM:
-------------------------------------------------------------------
{code:sql}
DROP TABLE IF EXISTS arraytest1;
CREATE TABLE arraytest1(
    id INTEGER,
    arrays1 INTEGER[][]
);
INSERT INTO arraytest1 VALUES
    (1, '{{1,2},{3,4}}'),
    (2, '{{5,6},{7,8}}'),
    (3, '{{9,10},{11,12}}');
DROP TABLE IF EXISTS array_unnest_output;
CREATE TABLE array_unnest_output AS
    SELECT *, (madlib.array_unnest_2d_to_1d(arrays1)).*
    FROM arraytest1;
SELECT * FROM array_unnest_output ORDER BY id, unnest_row_id;
{code}
produces
{code:sql}
 id |     arrays1      | unnest_row_id | unnest_result
----+------------------+---------------+---------------
  1 | {{1,2},{3,4}}    |             1 | {1,2}
  1 | {{1,2},{3,4}}    |             2 | {3,4}
  2 | {{5,6},{7,8}}    |             1 | {5,6}
  2 | {{5,6},{7,8}}    |             2 | {7,8}
  3 | {{9,10},{11,12}} |             1 | {9,10}
  3 | {{9,10},{11,12}} |             2 | {11,12}
(6 rows)
{code}
which looks correct. FLOAT8 and TEXT arrays work as well.
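The FLOAT8 case can be checked the same way. The following is only a minimal sketch: the table names arraytest2 and array_unnest_output2 and the sample values are hypothetical, chosen for illustration, while madlib.array_unnest_2d_to_1d and its output columns unnest_row_id and unnest_result are the ones used in the example above.
{code:sql}
-- Hypothetical table and values, used only to illustrate FLOAT8 input.
DROP TABLE IF EXISTS arraytest2;
CREATE TABLE arraytest2(
    id INTEGER,
    arrays2 FLOAT8[][]
);
INSERT INTO arraytest2 VALUES
    (1, '{{1.5,2.5},{3.5,4.5}}'),
    (2, '{{5.5,6.5},{7.5,8.5}}');
-- Same call pattern as the INTEGER example: one output row per inner 1-D array.
DROP TABLE IF EXISTS array_unnest_output2;
CREATE TABLE array_unnest_output2 AS
    SELECT *, (madlib.array_unnest_2d_to_1d(arrays2)).*
    FROM arraytest2;
SELECT * FROM array_unnest_output2 ORDER BY id, unnest_row_id;
{code}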
I also updated the K-means workbook to use this new unnest function and posted it to
https://github.com/apache/incubator-madlib-site/blob/asf-site/community-artifacts/Kmeans-v2.ipynb
> Unnest 2-D array by one level (i.e. into rows of 1-D arrays)
> ------------------------------------------------------------
>
> Key: MADLIB-1086
> URL: https://issues.apache.org/jira/browse/MADLIB-1086
> Project: Apache MADlib
> Issue Type: New Feature
> Components: Module: Utilities
> Reporter: Frank McQuillan
> Assignee: Rashmi Raghu
> Priority: Minor
> Fix For: v1.11
>
>
> Context
> Currently k-means returns the following:
> {code}
> centroids        | {{13.7533333333333,1.905,2.425,16.0666666666667,90.3333333333333,2.805,2.98,0.29,2.005,5.40663333333333,1.04166666666667,3.31833333333333,1020.83333333333},
>                     {14.255,1.9325,2.5025,16.05,110.5,3.055,2.9775,0.2975,1.845,6.2125,0.9975,3.365,1378.75}}
> cluster_variance | {122999.110416013,30561.74805}
> objective_fn     | 153560.858466013
> frac_reassigned  | 0
> num_iterations   | 3
> {code}
> Story
> As a data scientist, I want to unnest a 2-D array by one level (i.e. into rows
> of 1-D arrays) in K-means, so that I can get one centroid per row for
> follow-on operations.
> Acceptance
> 1) Add function to array operations
> http://madlib.incubator.apache.org/docs/latest/group__grp__array.html
> 2) Add an example in k-means
> http://madlib.incubator.apache.org/docs/latest/group__grp__kmeans.html
> to demonstrate usage
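>
> A minimal sketch of what the k-means example in item 2 might look like. It assumes the k-means output shown under Context has been saved to a table named km_result (a hypothetical name) and uses the madlib.array_unnest_2d_to_1d function from the comment above:
> {code:sql}
> -- km_result is a hypothetical table holding the k-means output,
> -- including the 2-D centroids column shown under Context.
> -- Each returned row holds a row id and one centroid as a 1-D array.
> SELECT (madlib.array_unnest_2d_to_1d(centroids)).*
> FROM km_result;
> {code}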