[
https://issues.apache.org/jira/browse/HIVE-790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748262#action_12748262
]
Ning Zhang commented on HIVE-790:
---------------------------------
@zheng, I'll fix the comment and the test query.
As for the new state, "FINISH" may not be a good name for it, but I think we
need two states because an operator with two or more parents can be in two
different situations:
1) close() has been called on this operator, but there is no guarantee that
close() has also been called on all of its child operators (the FINISH state);
2) close() has been called on this operator and on all of its children (the
CLOSE state).
The current code sets the state to CLOSE at the end of the function, which
means all of its children (and eventually all descendants) are closed. That is
the second semantics. What you proposed is the first semantics. To implement
it, we would need to move the statement that sets the state to CLOSE to the
beginning of the close() function (just after the check that returns if the
state is already CLOSE).
We need both states. Suppose we have just one state (CLOSE) and assign it at
the beginning, and the operator has two parents. When the first parent calls
close(), this operator sets its state to CLOSE and returns without calling
close() on its children (since the other parent has not been closed). When the
second parent calls close(), the call returns immediately since the state is
already CLOSE. So the children end up never being closed. We should not remove
the CLOSE state check at the beginning, since that may cause an operator to be
closed multiple times.
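
To illustrate the failure described above, here is a minimal single-threaded
sketch. SingleStateOp, its fields, and addChild() are hypothetical names made
up for this example; this is not Hive's actual Operator class, and it only
models the close() bookkeeping, not the threading.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator; not Hive's actual Operator class.
class SingleStateOp {
    final List<SingleStateOp> parents = new ArrayList<>();
    final List<SingleStateOp> children = new ArrayList<>();
    boolean closed = false; // the single CLOSE state

    void addChild(SingleStateOp child) {
        children.add(child);
        child.parents.add(this);
    }

    void close() {
        if (closed) {
            return; // guard against closing the operator twice
        }
        closed = true; // state assigned at the beginning of close()
        for (SingleStateOp p : parents) {
            if (!p.closed) {
                return; // another parent is still open, so do not propagate
            }
        }
        for (SingleStateOp c : children) {
            c.close();
        }
    }

    public static void main(String[] args) {
        SingleStateOp p1 = new SingleStateOp();
        SingleStateOp p2 = new SingleStateOp();
        SingleStateOp union = new SingleStateOp();
        SingleStateOp child = new SingleStateOp();
        p1.addChild(union);
        p2.addChild(union);
        union.addChild(child);

        p1.close(); // union marks itself closed, but p2 is still open
        p2.close(); // union is already marked closed and returns at once
        System.out.println("child closed: " + child.closed);
    }
}
```

With two parents feeding one operator, the child below it is never closed,
which is the situation the paragraph above describes.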
We cannot use just the CLOSE state as in the current implementation either,
since there the CLOSE state is set at the end of the close() function. When a
parent calls this operator's close(), the parent's own state is not yet CLOSE,
so this operator just returns and does not close its child operators. If we
have the FINISH state and set it at the beginning of close(), then whenever a
parent calls close() on this operator, the parent is already in the FINISH
state, and this operator can check for that and treat FINISH the same as CLOSE,
the only difference being that the operator in FINISH has not returned from
close() yet.
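
A minimal sketch of the two-state idea, under the same caveats: TwoStateOp,
State, timesClosed, and addChild() are hypothetical names chosen for
illustration, not the code in the attached patch. It is single-threaded and
shows only the state transitions, not the synchronization needed when
ScriptOperator's second thread is involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an operator; not Hive's actual Operator class.
class TwoStateOp {
    enum State { INIT, FINISH, CLOSE }

    final List<TwoStateOp> parents = new ArrayList<>();
    final List<TwoStateOp> children = new ArrayList<>();
    State state = State.INIT;
    int timesClosed = 0; // counts how often the close logic actually ran

    void addChild(TwoStateOp child) {
        children.add(child);
        child.parents.add(this);
    }

    void close() {
        if (state == State.CLOSE) {
            return; // already fully closed, never close twice
        }
        for (TwoStateOp p : parents) {
            // FINISH is treated like CLOSE: the parent has started closing.
            if (p.state != State.FINISH && p.state != State.CLOSE) {
                return; // wait for the remaining parents
            }
        }
        state = State.FINISH; // set at the beginning, before the children
        timesClosed++;
        for (TwoStateOp c : children) {
            c.close();
        }
        state = State.CLOSE; // set at the end, all children were called
    }

    public static void main(String[] args) {
        TwoStateOp p1 = new TwoStateOp();
        TwoStateOp p2 = new TwoStateOp();
        TwoStateOp union = new TwoStateOp();
        TwoStateOp child = new TwoStateOp();
        p1.addChild(union);
        p2.addChild(union);
        union.addChild(child);

        p1.close(); // union sees p2 still in INIT and returns
        p2.close(); // union sees p1 in CLOSE and p2 in FINISH, so it closes
        System.out.println("union: " + union.state + ", child: " + child.state);
    }
}
```

Here the operator under the two parents stays untouched after the first
parent's close() and is closed exactly once when the second parent closes.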
> race condition related to ScriptOperator + UnionOperator
> --------------------------------------------------------
>
> Key: HIVE-790
> URL: https://issues.apache.org/jira/browse/HIVE-790
> Project: Hadoop Hive
> Issue Type: Bug
> Reporter: Zheng Shao
> Assignee: Ning Zhang
> Attachments: Hive-790.patch
>
>
> ScriptOperator uses a second thread to output the rows to the children
> operators. In a corner case which contains a union, 2 threads might be
> outputting data into the same operator hierarchy, causing race conditions.
> {code}
> CREATE TABLE tablea (cola STRING);
> SELECT *
> FROM (
> SELECT TRANSFORM(cola)
> USING 'cat'
> AS cola
> FROM tablea
> UNION ALL
> SELECT cola as cola
> FROM tablea
> ) a;
> {code}