[
https://issues.apache.org/jira/browse/HIVE-15978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15904117#comment-15904117
]
Zoltan Haindrich commented on HIVE-15978:
-----------------------------------------
[~pxiong] I see multiple ways this could be achieved... and I'm not sure which
one to take :)
Most of these functions could, more or less, be translated into usages of
existing UDAF functions. It needs some tweaking, but it can be done. I don't
really want to reimplement all those things again; I think it would be better
to reuse them.
# if I create some 'cover' UDAF evaluators for each of these functions and
delegate the evaluation to the existing UDAFs inside the new evaluator, that
could work; but it will mean quite a few very similar classes
# the other alternative is to add slightly extended versions of some
existing UDAFs (like count and variance) and somehow rewrite the
{{regr_sxx(y,x)}} invocations to {{extended_COUNT(x, y) * extended_VAR_POP(x)}}
I guess from here that the first alternative may give slightly better runtimes,
but not significantly; in the second case the "original" evaluators would do
the real work.
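To make the second alternative concrete, here is a minimal Python sketch (not Hive code) of the identity it relies on: over the rows where both arguments are non-null, {{regr_sxx(y, x)}} is, as I understand the standard definition, the pair count times the population variance of {{x}}. The helper names ({{non_null_pairs}}, {{regr_count}}, {{var_pop}}, {{regr_sxx}}, {{regr_sxx_direct}}) are made up for illustration, and {{None}} stands in for SQL NULL.

```python
from typing import List, Optional, Sequence, Tuple

# One input row is (y, x); None plays the role of SQL NULL.
Row = Tuple[Optional[float], Optional[float]]


def non_null_pairs(rows: Sequence[Row]) -> List[Tuple[float, float]]:
    """Keep only the rows where both y and x are present."""
    return [(y, x) for y, x in rows if y is not None and x is not None]


def regr_count(rows: Sequence[Row]) -> int:
    """Number of rows in which neither argument is NULL."""
    return len(non_null_pairs(rows))


def var_pop(values: Sequence[float]) -> Optional[float]:
    """Population variance; None for an empty input."""
    if not values:
        return None
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def regr_sxx(rows: Sequence[Row]) -> Optional[float]:
    """regr_sxx(y, x) expressed as count * var_pop(x) over non-null pairs."""
    xs = [x for _, x in non_null_pairs(rows)]
    if not xs:
        return None
    return len(xs) * var_pop(xs)


def regr_sxx_direct(rows: Sequence[Row]) -> Optional[float]:
    """Same quantity computed directly as a sum of squared deviations of x."""
    xs = [x for _, x in non_null_pairs(rows)]
    if not xs:
        return None
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs)
```

Both functions should agree up to floating-point rounding; the point is only that the count and the variance must be taken over the same filtered set of rows, which is why the plain single-argument aggregates cannot be reused unchanged.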
As for why I need to change the existing UDAFs a bit: all these regr_*
functions are required to do work only when neither {{x}} nor {{y}} is
null (e.g. {{regr_sxx(y,x)}}).
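The null rule and the 'cover' evaluator idea can be sketched together. The Python below is only an illustration under stated assumptions, not the Hive API: {{VarPopAccumulator}} stands in for an existing single-argument aggregate, {{RegrSxxCover}} is a hypothetical wrapper, and the {{iterate}}/{{terminate}} method names merely echo the usual UDAF evaluator vocabulary.

```python
from typing import Optional


class VarPopAccumulator:
    """Stand-in for an existing aggregate: streaming population variance."""

    def __init__(self) -> None:
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def iterate(self, value: Optional[float]) -> None:
        if value is None:  # an ordinary aggregate skips its own NULL inputs
            return
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    def terminate(self) -> Optional[float]:
        return self.m2 / self.count if self.count else None


class RegrSxxCover:
    """Hypothetical cover evaluator: filters pairs, delegates the real work."""

    def __init__(self) -> None:
        self.inner = VarPopAccumulator()

    def iterate(self, y: Optional[float], x: Optional[float]) -> None:
        # Forward x only when BOTH arguments are non-null.
        if y is None or x is None:
            return
        self.inner.iterate(x)

    def terminate(self) -> Optional[float]:
        if self.inner.count == 0:
            return None
        return self.inner.count * self.inner.terminate()
```

Feeding the same {{x}} column straight into {{VarPopAccumulator}} would also count rows whose {{y}} is null, so the result would differ; the wrapper exists only to apply the pair filter before delegating.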
> Support regr_* functions
> ------------------------
>
> Key: HIVE-15978
> URL: https://issues.apache.org/jira/browse/HIVE-15978
> Project: Hive
> Issue Type: Sub-task
> Components: SQL
> Reporter: Carter Shanklin
> Assignee: Zoltan Haindrich
>
> Support the standard regr_* functions, regr_slope, regr_intercept, regr_r2,
> regr_sxx, regr_syy, regr_sxy, regr_avgx, regr_avgy, regr_count. SQL reference
> section 10.9
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)