[
https://issues.apache.org/jira/browse/HIVE-10209?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chao updated HIVE-10209:
------------------------
Description:
ExecMapper.done is a static variable, and may cause issues in the following
example:
{code}
set hive.fetch.task.conversion=minimal;
select * from src where key < 10 limit 1;
set hive.fetch.task.conversion=more;
select *, BLOCK__OFFSET_INSIDE__FILE from src where key < 10;
{code}
The second select won't return any results when running in local mode.
The cause is that the first select query is converted to a MapRedTask with
only a mapper. When that task finishes, the limit operator causes
ExecMapper.done to be set to true.
Then, when the second select query begins to execute, it calls
{{FetchOperator::getRecordReader()}}, and since the query has a virtual column,
an instance of {{HiveRecordReader}} is returned. The problem is that
{{HiveRecordReader::doNext()}} checks ExecMapper.done. Because the value is
still true from the first query, it quits immediately.
In short, I think making ExecMapper.done static is a bad idea. The first query
should in no way affect the second one.
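The interaction described above can be illustrated with a minimal, self-contained
sketch. This is not Hive's actual code: {{SketchMapper}} and
{{SketchRecordReader}} are hypothetical stand-ins for ExecMapper and
HiveRecordReader, and the row data is made up. It only shows how a static flag
set by one task can silently cut off a later reader in the same JVM, which is
the situation in local mode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StaticDoneFlagSketch {

    /** Hypothetical stand-in for ExecMapper: the flag is shared JVM-wide. */
    static class SketchMapper {
        static boolean done = false;

        /** Emits rows until the limit is hit, then sets the static flag. */
        static List<String> run(List<String> rows, int limit) {
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                if (out.size() >= limit) {
                    // Limit reached: the flag is set and never reset,
                    // so it stays true after this task has ended.
                    done = true;
                    break;
                }
                out.add(row);
            }
            return out;
        }
    }

    /** Hypothetical stand-in for HiveRecordReader: stops when the flag is true. */
    static class SketchRecordReader {
        static List<String> readAll(List<String> rows) {
            List<String> out = new ArrayList<>();
            for (String row : rows) {
                if (SketchMapper.done) {
                    break; // stale flag from an earlier task ends the read early
                }
                out.add(row);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("0", "4", "8");
        // First "query": a mapper-only task with a limit of 1.
        List<String> first = SketchMapper.run(rows, 1);
        // Second "query": a fetch-style read that checks the shared flag.
        List<String> second = SketchRecordReader.readAll(rows);
        System.out.println(first.size() + " row(s), then " + second.size() + " row(s)");
    }
}
```

In this sketch, making {{done}} an instance field (or resetting it when a task
starts) would be enough to isolate the two runs. Whether either option suits
ExecMapper itself is a separate question for the patch.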
was:
ExecMapper.done is a static variable, and may cause issues in the following
example:
{code}
set hive.fetch.task.conversion=minimal;
select * from src where key < 10 limit 1;
set hive.fetch.task.conversion=more;
select *, BLOCK__OFFSET_INSIDE__FILE from src where key < 10;
{code}
The second select won't return any result.
The issue is, the first select query will be converted to a MapRedTask with
only a mapper. And, when the task is done, because of the limit operator,
ExecMapper.done will be set to true.
Then, when the second select query begins to execute, it will call
{{FetchOperator::getRecordReader()}}, and since here we have a virtual column, an
instance of {{HiveRecordReader}} will be returned. The problem is,
{{HiveRecordReader::doNext()}} will check ExecMapper.done. In this case, since
the value is true, it will quit immediately.
In short, I think making ExecMapper.done static is a bad idea. The first query
should in no way affect the second one.
> FetchTask with VC may fail because ExecMapper.done is true
> ----------------------------------------------------------
>
> Key: HIVE-10209
> URL: https://issues.apache.org/jira/browse/HIVE-10209
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 1.1.0
> Reporter: Chao
> Assignee: Chao
> Attachments: HIVE-10209.1-spark.patch