[ 
https://issues.apache.org/jira/browse/MADLIB-1089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16126506#comment-16126506
 ] 

Frank McQuillan commented on MADLIB-1089:
-----------------------------------------

This issue existed before when installing MADlib to a non-default schema in 
HAWQ, so it’s not new with HAWQ 2.x.

Currently, allowing an arbitrary distance function for kmeans on HAWQ 2.2.0.0 
build 4141 gives this error:

{code}
psql:/tmp/madlib.Fbq28_/kmeans/test/kmeans.sql_in.tmp:116: ERROR:  
plpy.SPIError: Function "_closest_column(double precision[],double 
precision[],regproc,text)": Error while looking up a function in the system 
catalog. (UDF_impl.hpp:210)  (seg9 test3:20100 pid=5332) (plpython.c:4663)
{code}

This indicates that HAWQ is unable to look up the distance function from the 
segment, which was the original reason for hard-coding the distance function. 
It looks like the issue is not fixed (or the feature is not available) yet.
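
For illustration only: hard-coding amounts to checking the requested name 
against a fixed list instead of resolving it through the system catalog. The 
Python sketch below is not taken from the MADlib source. The helper name 
resolve_dist_func, the constant SUPPORTED_DIST_FUNCS, and the list contents are 
all assumptions. It shows one way such a check could accept a name qualified 
with the actual install schema (e.g. madlib1) rather than only a fixed one:

```python
# Hypothetical sketch, NOT MADlib's actual implementation.
# SUPPORTED_DIST_FUNCS and resolve_dist_func are made-up names, and the
# list of distance functions is an assumption for illustration.
SUPPORTED_DIST_FUNCS = frozenset([
    "squared_dist_norm2",
    "dist_norm2",
    "dist_norm1",
])


def resolve_dist_func(name, install_schema):
    """Return the bare function name if `name` is a supported distance
    function, given either unqualified or qualified with `install_schema`.
    Raise ValueError otherwise."""
    # Split "schema.func" at the last dot; an unqualified name yields an
    # empty schema and an empty separator.
    schema, sep, func = name.rpartition(".")
    if sep and schema != install_schema:
        raise ValueError(
            "Invalid distance metric provided: {0}".format(name))
    if func not in SUPPORTED_DIST_FUNCS:
        raise ValueError(
            "Invalid distance metric provided: {0}".format(name))
    return func
```

With a check along these lines, 'madlib1.squared_dist_norm2' would pass when 
MADlib is installed in schema madlib1, while a name qualified with any other 
schema would still be rejected.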

We do not want to do more research at this point on enabling arbitrary 
distance functions to work around the HAWQ limitation, so I would like to call 
this story a no-op. Please let me know if anyone thinks otherwise.

> Install check errors on HAWQ 2.2 when install MADlib on non-default schema
> --------------------------------------------------------------------------
>
>                 Key: MADLIB-1089
>                 URL: https://issues.apache.org/jira/browse/MADLIB-1089
>             Project: Apache MADlib
>          Issue Type: Bug
>          Components: All Modules
>            Reporter: Frank McQuillan
>            Priority: Minor
>             Fix For: v1.12
>
>         Attachments: k-means-IC-fail-on-hawq-2dot2, 
> linalg-IC-fail-on-hawq-2dot2
>
>
> Running install-check on a non-default schema in HAWQ 2.2 results in errors 
> for linalg and kmeans.
> {code}
> MADlib version: 1.10.0, git revision: rel/v1.9.1-58-ga3863b6, cmake 
> configuration time: Wed Mar  8 19:49:45 UTC 2017, build type: Release, 
> build system: Linux-2.6.18-238.27.1.el5.hotfix.bz516490, C compiler: gcc 
> 4.4.0, C++ compiler: g++ 4.4.0
>  PostgreSQL 8.2.15 (Greenplum Database 4.2.0 build 1) (HAWQ 2.2.0.0 build 
> 4141) on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.8.5 
> 20150623 (Red Hat 4.8.5-11) compiled on Mar 30 2017 21:45:26
> {code}
> See attached log files and summaries below:
> linalg.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/linalg/test/linalg.sql_in.tmp:165: ERROR:  Function 
> "closest_column(double precision[],double precision[],text)": Invalid 
> distance metric provided: madlib1.squared_dist_norm2. Currently only 
> madlib provided distance functions are supported.
> {code}
> kmeans.sql_in
> {code}
> psql:/tmp/madlib.sGu72l/kmeans/test/kmeans.sql_in.tmp:117: ERROR:  
> plpy.SPIError: Function "closest_column(double precision[],double 
> precision[],text)": Invalid distance metric provided: 
> madlib1.squared_dist_norm2. Currently only madlib provided distance 
> functions are supported.  (seg1 
> ip-10-32-127-188.ore6.vpc.pivotal.io:40000 pid=483012) (plpython.c:4663)
> CONTEXT:  Traceback (most recent call last):
>   PL/Python function "internal_compute_kmeanspp_seeding", line 22, in <module>
>     return kmeans.compute_kmeanspp_seeding(**globals())
>   PL/Python function "internal_compute_kmeanspp_seeding", line 154, in 
> compute_kmeanspp_seeding
>   PL/Python function "internal_compute_kmeanspp_seeding", line 415, in update
> PL/Python function "internal_compute_kmeanspp_seeding"
> SQL statement "SELECT  ( SELECT madlib1.internal_compute_kmeanspp_seeding( 
> '_madlib_kmeanspp_args', '_madlib_kmeanspp_state', 
> textin(regclassout( $1 )),  $2 ) )"
> PL/pgSQL function "kmeanspp_seeding" line 83 at assignment
> SQL statement "SELECT  madlib1.kmeans(  $1 ,  $2 , madlib1.kmeanspp_seeding( 
> $1 ,  $2 ,  $3 ,  $4 , NULL,  $5 ),  $4 ,  $6 ,  $7 ,  $8 )"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> SQL statement "SELECT  madlib1.kmeanspp( $1 ,  $2 ,  $3 , 
> 'madlib1.squared_dist_norm2'::VARCHAR, 'madlib1.avg'::VARCHAR, 20::INTEGER, 
> 0.001::DOUBLE PRECISION, 1.0::DOUBLE PRECISION)"
> PL/pgSQL function "kmeanspp" line 4 at assignment
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
