[
https://issues.apache.org/jira/browse/HIVE-24645?focusedWorklogId=543770&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-543770
]
ASF GitHub Bot logged work on HIVE-24645:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 28/Jan/21 17:41
Start Date: 28/Jan/21 17:41
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on a change in pull request #1876:
URL: https://github.com/apache/hive/pull/1876#discussion_r566284005
##########
File path: ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java
##########
@@ -140,8 +141,23 @@ public ObjectInspector initialize(ObjectInspector rowInspector) throws HiveExcep
 childrenOIs[i] = children[i].initialize(rowInspector);
 }
 MapredContext context = MapredContext.get();
+ // It is possible that there is no context at this point. For example a context is not created
+ // when "hive.fetch.task.conversion" occurs.
 if (context != null) {
 context.setup(genericUDF);
+ } else {
+ // It is a bit unfortunate that currently the UDF configuration signature expects a
+ // MapredContext (even if execution is tez or another engine) - this causes an
+ // impedence mismatch. For example: MapredContext has Reporter objects that may or
+ // may not make sense for the current engine.
+ //
+ // We attempt to create a dummyContext that has at least access to the currently set
+ // configuration. The other unfortunate issue is that some paths set the configuration
+ // when creating ExprNodeGenericFuncEvaluator while others do not, so we fallback to a
+ // "default" new HiveConf object which might be missing configuration that changed during
+ // runtime.
+ MapredContext dummyContext = MapredContext.createDummy(getConf() != null ? getConf() : new HiveConf());
Review comment:
This is strange: during fetch task conversion we should have access to the
session state, and when we are running on the cluster a
MapredContext/TezContext should be available.
There is one thing which might have come up in the failures: when we run
HS2/clidriver we *always* have a session state while running queries. Because
of this, I think it makes sense to rely on that mechanism instead of working
around things here and there. However, some test(s) may have "forgotten" to
start the session (`SessionState.start`), which could cause an NPE; I think
this might be happening in `TestVectorizationContext`.
But there is also some code inside the metastore which can't have a session,
so you might need to handle the null case as well.
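The lookup order described in the comment above could be sketched roughly as
follows. This is only an illustration and not Hive code: `Conf`, `Session` and
`resolveConf` are hypothetical stand-ins for HiveConf, SessionState and
whatever helper the patch might end up using, and the ordering (session first,
then the evaluator's own conf, then a fresh default) is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfFallbackSketch {

  /** Hypothetical stand-in for HiveConf: a simple bag of string settings. */
  public static final class Conf {
    private final Map<String, String> values = new HashMap<String, String>();

    public Conf set(String key, String value) {
      values.put(key, value);
      return this;
    }

    public String get(String key, String defaultValue) {
      String value = values.get(key);
      return value != null ? value : defaultValue;
    }
  }

  /** Hypothetical stand-in for SessionState: a thread-local session carrying a Conf. */
  public static final class Session {
    private static final ThreadLocal<Session> CURRENT = new ThreadLocal<Session>();
    private final Conf conf;

    public Session(Conf conf) {
      this.conf = conf;
    }

    public static void start(Session session) {
      CURRENT.set(session);
    }

    public static void stop() {
      CURRENT.remove();
    }

    public static Session get() {
      return CURRENT.get();
    }

    public Conf getConf() {
      return conf;
    }
  }

  /**
   * Picks a configuration: the active session's conf if a session was started,
   * otherwise the conf the evaluator was given, otherwise a fresh default.
   */
  public static Conf resolveConf(Conf evaluatorConf) {
    Session session = Session.get();
    if (session != null && session.getConf() != null) {
      return session.getConf();
    }
    if (evaluatorConf != null) {
      return evaluatorConf;
    }
    // No session (for example a test that never started one, or code that
    // cannot have a session): fall back to defaults, which may miss runtime changes.
    return new Conf();
  }

  public static void main(String[] args) {
    Conf evaluatorConf = new Conf().set("k", "evaluator");
    System.out.println(resolveConf(null).get("k", "default"));
    System.out.println(resolveConf(evaluatorConf).get("k", "default"));
    Session.start(new Session(new Conf().set("k", "session")));
    System.out.println(resolveConf(evaluatorConf).get("k", "default"));
    Session.stop();
  }
}
```

The last branch is the "null case" mentioned above: callers without a session
still get a usable object instead of an NPE.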
----------------------------------------------------------------
Issue Time Tracking
-------------------
Worklog Id: (was: 543770)
Time Spent: 1h 20m (was: 1h 10m)
> UDF configure not called when fetch task conversion occurs
> ----------------------------------------------------------
>
> Key: HIVE-24645
> URL: https://issues.apache.org/jira/browse/HIVE-24645
> Project: Hive
> Issue Type: Bug
> Components: HiveServer2
> Reporter: John Sherman
> Assignee: John Sherman
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> When hive.fetch.task.conversion kicks in - UDF configure is not called.
> This is likely due to MapredContext not being available when this conversion
> occurs.
> The approach I suggest is to create a dummy MapredContext and provide it with
> the current configuration from ExprNodeGenericFuncEvaluator.
> It is slightly unfortunate that the UDF API relies on MapredContext, since
> some aspects of the context (such as the Reporter objects and the boolean
> indicating whether it is a Map context) do not apply to every engine and
> invocation path for UDFs, which makes it difficult to build a fully formed
> dummy object.
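
The behaviour described in the issue, and the proposed dummy-context
approach, can be illustrated with a small self-contained sketch. None of this
is Hive code: `Context`, `RecordingUdf` and both `initialize...` methods are
hypothetical stand-ins for MapredContext, a UDF with a configure hook, and the
evaluator's initialization, and the setting name is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class DummyContextSketch {

  /** Hypothetical stand-in for MapredContext; here it only carries configuration. */
  public static final class Context {
    private final Map<String, String> conf;

    private Context(Map<String, String> conf) {
      this.conf = conf;
    }

    /** Builds a context with configuration only (no reporter, no map/reduce flag). */
    public static Context createDummy(Map<String, String> conf) {
      return new Context(new HashMap<String, String>(conf));
    }

    public String get(String key, String defaultValue) {
      String value = conf.get(key);
      return value != null ? value : defaultValue;
    }
  }

  /** Hypothetical UDF that records what it saw in its configure hook. */
  public static class RecordingUdf {
    public String seenValue; // stays null until configure is called

    public void configure(Context context) {
      seenValue = context.get("my.udf.setting", "unset");
    }
  }

  /** Mirrors the reported bug: with no context, configure is silently skipped. */
  public static void initializeWithoutFallback(Context context, RecordingUdf udf) {
    if (context != null) {
      udf.configure(context);
    }
  }

  /** Mirrors the proposed approach: build a dummy context from the current conf. */
  public static void initializeWithFallback(
      Context context, Map<String, String> currentConf, RecordingUdf udf) {
    Context effective = context;
    if (effective == null) {
      Map<String, String> conf =
          currentConf != null ? currentConf : new HashMap<String, String>();
      effective = Context.createDummy(conf);
    }
    udf.configure(effective);
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<String, String>();
    conf.put("my.udf.setting", "42");

    RecordingUdf skipped = new RecordingUdf();
    initializeWithoutFallback(null, skipped);
    System.out.println("without fallback: " + skipped.seenValue);

    RecordingUdf configured = new RecordingUdf();
    initializeWithFallback(null, conf, configured);
    System.out.println("with fallback: " + configured.seenValue);
  }
}
```

The dummy object deliberately carries nothing but configuration, which
matches the concern above that the remaining parts of the context cannot be
filled in meaningfully for every invocation path.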