Frank McQuillan created MADLIB-978:
--------------------------------------
Summary: CLONE - Implement skipping of arrays-with-NULL for
elastic net predict
Key: MADLIB-978
URL: https://issues.apache.org/jira/browse/MADLIB-978
Project: Apache MADlib
Issue Type: Improvement
Components: Module: Regularized Regression
Reporter: Frank McQuillan
Assignee: Rahul Iyer
Priority: Minor
Fix For: v1.9
Implement skipping of arrays-with-NULL for elastic net predict. Some context
for this JIRA is below…
(Q)
Question came in this week from a MADlib user:
Function "madlib.elastic_net_gaussian_predict(double precision[],double
precision,double precision[])": Error converting an array w/ NULL values to
dense format. (UDF_impl.hpp:210)
Is there a typical pattern for handling nulls in such a scenario, perhaps
converting to 0.0 or something like this?
(A)
Answer:
The skipping of arrays-with-NULL has not been implemented for elastic net
predict yet.
You can work around it by creating the function below, taken from:
http://stackoverflow.com/questions/7819021/replace-null-values-in-an-array-in-postgresql
CREATE OR REPLACE FUNCTION f_array_replace_null (double precision[], double precision)
RETURNS double precision[] AS
$$
    SELECT ARRAY (
        SELECT COALESCE(x, $2)
        FROM unnest($1) x);
$$ LANGUAGE SQL IMMUTABLE;
They'll then have to wrap the feature array in this function in the
elastic_net statement:
f_array_replace_null(array["pf_calc_fdy_position", ...], 0)
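As a quick sanity check of the helper, something like the following can be run on its own. This is only a sketch: the literal array is made up for illustration, and the explicit cast is there so the argument matches the function's double precision[] parameter.

```sql
-- Illustrative input only: one NULL element in an otherwise numeric array.
SELECT f_array_replace_null(ARRAY[1.0, NULL, 3.0]::double precision[], 0);
-- The NULL element should come back as 0, with the other values unchanged.
```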
This replaces each NULL with 0. The downside is that it could be slower, since
the array is unnested and rebuilt on every call. If performance is a concern,
and if they're running over this data multiple times, I would create a new
table with the NULLs replaced once and execute elastic_net_xxx on that table
in the regular way.
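A minimal sketch of that second approach follows. All table and column names here (source_data, id, x1, x2, x3, source_data_no_null, model_tbl) are hypothetical, and the coef_all and intercept columns are an assumption about how the elastic net model table is laid out; adjust them to the actual schema.

```sql
-- Hypothetical names throughout; replace with the real tables and columns.
-- Step 1: materialize the feature arrays once, with NULLs replaced by 0.
CREATE TABLE source_data_no_null AS
SELECT id,
       f_array_replace_null(ARRAY[x1, x2, x3]::double precision[], 0) AS features
FROM source_data;

-- Step 2: predict against the cleaned table.
-- Assumes model_tbl holds a single row with coef_all and intercept columns.
SELECT d.id,
       madlib.elastic_net_gaussian_predict(m.coef_all, m.intercept, d.features) AS prediction
FROM source_data_no_null d, model_tbl m;
```

The NULL replacement then runs once at table creation rather than on every prediction call, at the cost of storing a second copy of the features.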