[
https://issues.apache.org/jira/browse/HIVE-3699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557993#comment-13557993
]
Hudson commented on HIVE-3699:
------------------------------
Integrated in Hive-trunk-h0.21 #1923 (See
[https://builds.apache.org/job/Hive-trunk-h0.21/1923/])
HIVE-3699 Multiple insert overwrite into multiple tables query stores same
results
in all tables (Navis via namit) (Revision 1435484)
Result = SUCCESS
namit : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1435484
Files :
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/ppd/OpProcFactory.java
* /hive/trunk/ql/src/test/queries/clientpositive/multi_insert_gby.q
*
/hive/trunk/ql/src/test/results/clientpositive/groupby_multi_single_reducer2.q.out
* /hive/trunk/ql/src/test/results/clientpositive/multi_insert.q.out
* /hive/trunk/ql/src/test/results/clientpositive/multi_insert_gby.q.out
*
/hive/trunk/ql/src/test/results/clientpositive/multi_insert_move_tasks_share_dependencies.q.out
> Multiple insert overwrite into multiple tables query stores same results in
> all tables
> --------------------------------------------------------------------------------------
>
> Key: HIVE-3699
> URL: https://issues.apache.org/jira/browse/HIVE-3699
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Environment: Cloudera 4.1 on Amazon Linux (rebranded Centos 6):
> hive-0.9.0+150-1.cdh4.1.1.p0.4.el6.noarch
> Reporter: Alexandre Fouché
> Assignee: Navis
> Fix For: 0.11.0
>
> Attachments: HIVE-3699.D7743.1.patch, HIVE-3699.D7743.2.patch,
> HIVE-3699.D7743.3.patch, HIVE-3699_hive-0.9.1.patch.txt
>
>
> (Note: This might be related to HIVE-2750)
> I am doing a query with multiple INSERT OVERWRITE to multiple tables in order
> to scan the dataset only 1 time, and i end up having all these tables with
> the same content ! It seems the GROUP BY query that returns results is
> overwriting all the temp tables.
> Weird enough, if i had further GROUP BY queries into additional temp tables,
> grouped by a different field, then all temp tables, even the ones that would
> have been wrong content are all correctly populated.
> This is the misbehaving query:
> FROM nikon
> INSERT OVERWRITE TABLE e1
> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions
> WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid
> INSERT OVERWRITE TABLE e2
> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues
> WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid
> ;
> It launches only one MR job and here are the results. Why does table 'e1'
> contains results from table 'e2' ?! Table 'e1' should have been empty (see
> individual SELECTs further below)
> hive> SELECT * from e1;
> OK
> NULL 2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 0.229 seconds
> hive> SELECT * from e2;
> OK
> NULL 2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 0.11 seconds
> Here is are the result to the indiviual queries (only the second query
> returns a result set):
> hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Impressions FROM
> nikon
> WHERE qs_cs_s_cat='PRINT' GROUP BY qs_cs_s_aid;
> (...)
> OK
> <- There are no results, this is normal
> Time taken: 41.471 seconds
> hive> SELECT qs_cs_s_aid AS Emplacements, COUNT(*) AS Vues FROM nikon
> WHERE qs_cs_s_cat='VIEW' GROUP BY qs_cs_s_aid;
> (...)
> OK
> NULL 2
> 1627575 25
> 1627576 70
> 1690950 22
> 1690952 42
> 1696705 199
> 1696706 66
> 1696730 229
> 1696759 85
> 1696893 218
> Time taken: 39.607 seconds
>
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira