[jira] [Updated] (HIVE-16936) wrong result with CTAS (create table as select)

2017-06-21 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-16936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-16936:
--
Description: 
1. 
{code}
hive> desc abc_test_old;
OK
did string
activetime  int
{code}

2. 
{code}
hive> select 'test' as did from abc_test_old
> where did = '5FCAFD34-C124-4E13-AF65-27B675C945CC' limit 1;
OK
test  
{code}
result is 'test'

3. 
{code}
hive> create table abc_test_12345 as
> select 'test' as did from abc_test_old
> where did = '5FCAFD34-C124-4E13-AF65-27B675C945CC' limit 1;

hive> select did from abc_test_12345 limit 1;
OK
5FCAFD34-C124-4E13-AF65-27B675C945CC 
{code}
result is '5FCAFD34-C124-4E13-AF65-27B675C945CC'
Why is the result not 'test'?

4. 
{code}
hive> explain
> create table abc_test_12345 as
> select 'test' as did from abc_test_old
> where did = '5FCAFD34-C124-4E13-AF65-27B675C945CC' limit 1;
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-6 depends on stages: Stage-1, consists of Stage-3, Stage-2, Stage-4
  Stage-3
  Stage-0 depends on stages: Stage-3, Stage-2, Stage-5
  Stage-7 depends on stages: Stage-0
  Stage-2
  Stage-4
  Stage-5 depends on stages: Stage-4

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: abc_test_old
            Statistics: Num rows: 32 Data size: 1152 Basic stats: COMPLETE Column stats: NONE
            Filter Operator
              predicate: (did = '5FCAFD34-C124-4E13-AF65-27B675C945CC') (type: boolean)
              Statistics: Num rows: 16 Data size: 576 Basic stats: COMPLETE Column stats: NONE
              Select Operator
                Statistics: Num rows: 16 Data size: 576 Basic stats: COMPLETE Column stats: NONE
                Limit
                  Number of rows: 1
                  Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
                  Reduce Output Operator
                    sort order:
                    Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Select Operator
          expressions: '5FCAFD34-C124-4E13-AF65-27B675C945CC' (type: string)
          outputColumnNames: _col0
          Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
          Limit
            Number of rows: 1
            Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: true
              Statistics: Num rows: 1 Data size: 36 Basic stats: COMPLETE Column stats: NONE
              table:
                  input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
                  output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
                  serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
                  name: default.abc_test_12345
..
{code}
Why is the expression '5FCAFD34-C124-4E13-AF65-27B675C945CC' rather than 'test'?
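An editor's note, not part of the original report: if the literal is being clobbered by the optimizer's constant folding, one unverified workaround sketch is to disable constant propagation for the session before re-running the CTAS.

```sql
-- Sketch of a possible workaround (unverified against this bug):
-- turn off constant propagation so the optimizer cannot substitute
-- the WHERE literal into the SELECT list, then run the CTAS again.
set hive.optimize.constant.propagation=false;

create table abc_test_12345 as
select 'test' as did from abc_test_old
where did = '5FCAFD34-C124-4E13-AF65-27B675C945CC' limit 1;
```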

  was: (previous description: a truncated verbatim duplicate of the steps above; omitted)

[jira] [Commented] (HIVE-11765) SMB Join fails in Hive 1.2

2017-02-20 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15875476#comment-15875476
 ] 

Xiaomeng Huang commented on HIVE-11765:
---

I also missed this issue when adding the patch of HIVE-10591 based on 1.2.1.

> SMB Join fails in Hive 1.2
> --
>
> Key: HIVE-11765
> URL: https://issues.apache.org/jira/browse/HIVE-11765
> Project: Hive
>  Issue Type: Bug
>  Components: Hive
>Affects Versions: 1.2.0, 1.2.1
>Reporter: Na Yang
>Assignee: Prasanth Jayachandran
> Attachments: employee (1).csv
>
>
> SMB join on Hive 1.2 fails with the following stack trace :
> {code}
> java.io.IOException: java.lang.reflect.InvocationTargetException
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
> at
> org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:266)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.(HadoopShimsSecure.java:213)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileInputFormatShim.getRecordReader(HadoopShimsSecure.java:333)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:719)
> at
> org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:173)
> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:437)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1595)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: java.lang.reflect.InvocationTargetException
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:408)
> at
> org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.initNextRecordReader(HadoopShimsSecure.java:252)
> ... 11 more
> Caused by: java.lang.IndexOutOfBoundsException: toIndex = 5
> at java.util.ArrayList.subListRangeCheck(ArrayList.java:1004)
> at java.util.ArrayList.subList(ArrayList.java:996)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.getSchemaOnRead(RecordReaderFactory.java:161)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderFactory.createTreeReader(RecordReaderFactory.java:66)
> at
> org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.(RecordReaderImpl.java:202)
> at
> org.apache.hadoop.hive.ql.io.orc.ReaderImpl.rowsOptions(ReaderImpl.java:539)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.createReaderFromFile(OrcInputFormat.java:230)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.(OrcInputFormat.java:163)
> at
> org.apache.hadoop.hive.ql.io.orc.OrcInputFormat.getRecordReader(OrcInputFormat.java:1104)
> at
> org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.(CombineHiveRecordReader.java:67)
> {code}
> This error happens after adding the patch of HIVE-10591. Reverting HIVE-10591 
> fixes this exception. 
> Steps to reproduce:
> {code}
> SET hive.enforce.sorting=true;
> SET hive.enforce.bucketing=true;
> SET hive.exec.dynamic.partition=true;
> SET mapreduce.reduce.import.limit=-1;
> SET hive.optimize.bucketmapjoin=true;
> SET hive.optimize.bucketmapjoin.sortedmerge=true;
> SET hive.auto.convert.join=true;
> SET hive.auto.convert.sortmerge.join=true;
> create Table table1 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) INTO 384 buckets 
> stored as ORC;
> create Table table2 (empID int, name varchar(64), email varchar(64), company 
> varchar(64), age int) clustered by (age) sorted by (age ASC) into 384 buckets 
> stored as ORC;
> create Table table_tmp (empID int, name varchar(64), email varchar(64), 
> company varchar(64), age int);
> load data local inpath '/tmp/employee.csv' into table table_tmp;
> INSERT OVERWRITE table  table1 select * from table_tmp;
> INSERT OVERWRITE table  table2 select * from table_tmp;
> SELECT table1.age, table2.age from table1 inner join table2 on 
> table1.age=table2.age;
> {code}



--
This message was sent by Atlassian JIRA

[jira] [Resolved] (HIVE-15836) CTAS failed when the table is stored as orc and select clause has null

2017-02-08 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang resolved HIVE-15836.
---
Resolution: Not A Problem

> CTAS failed when the table is stored as orc and select clause has null
> --
>
> Key: HIVE-15836
> URL: https://issues.apache.org/jira/browse/HIVE-15836
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Xiaomeng Huang
>
> Based on the stable version 1.2.1, I applied the patch from 
> https://issues.apache.org/jira/browse/HIVE-11217, but I still got this error.
> CASE:
> {quote}
> CREATE TABLE empty (x int);
> CREATE TABLE orc_table_with_null 
> STORED AS ORC 
> AS 
> SELECT 
> x,
> null
> FROM empty;
> {quote}
> ERROR:
> {quote}
> FAILED: SemanticException [Error 10305]: CREATE-TABLE-AS-SELECT creates a 
> VOID type, please use CAST to specify the type, near field:  _c1
> {quote}
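The "Not A Problem" resolution matches the hint in the error message itself: give the NULL column a concrete type with CAST. A working-form sketch (editor's example; the alias c1 is illustrative):

```sql
CREATE TABLE empty (x int);

-- CAST gives the NULL column a concrete type (string here),
-- so CTAS no longer derives the unsupported VOID type.
CREATE TABLE orc_table_with_null
STORED AS ORC
AS
SELECT
  x,
  CAST(null AS string) AS c1
FROM empty;
```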





[jira] [Updated] (HIVE-15836) CTAS failed when the table is stored as orc and select clause has null

2017-02-07 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-15836:
--
Summary: CTAS failed when the table is stored as orc and select clause has 
null  (was: CTAS failed when the table is stored as orc and select cause has 
null)

> CTAS failed when the table is stored as orc and select clause has null
> --
>
> Key: HIVE-15836
> URL: https://issues.apache.org/jira/browse/HIVE-15836
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Xiaomeng Huang
>





[jira] [Commented] (HIVE-15836) CTAS failed when the table is stored as orc and select cause has null

2017-02-07 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-15836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855608#comment-15855608
 ] 

Xiaomeng Huang commented on HIVE-15836:
---

Could anyone take a look?

> CTAS failed when the table is stored as orc and select cause has null
> -
>
> Key: HIVE-15836
> URL: https://issues.apache.org/jira/browse/HIVE-15836
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 1.2.1
>Reporter: Xiaomeng Huang
>





[jira] [Commented] (HIVE-14369) submit a task with hive on spark to other yarn cluster failed

2016-07-29 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-14369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15399006#comment-15399006
 ] 

Xiaomeng Huang commented on HIVE-14369:
---

Hi [~ruili]
That doesn't work either. I think I need to set both hivecluster and sparkcluster in 
the hdfs-site.xml of sparkcluster.

> submit a task with hive on spark to other yarn cluster failed
> -
>
> Key: HIVE-14369
> URL: https://issues.apache.org/jira/browse/HIVE-14369
> Project: Hive
>  Issue Type: Bug
>Reporter: Xiaomeng Huang
>
> In our environment, we have two hadoop clusters with HA,  named hivecluster 
> and sparkcluster.
> hivecluster is a HA hadoop cluster for hive, which has large hard disk.
> sparkcluster is a HA hadoop cluster for spark, which has large memory.
> e.g. below is the hdfs-site.xml of hivecluster:
> {code}
> <property>
>   <name>dfs.ha.namenodes.hivecluster</name>
>   <value>nn1,nn2</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hivecluster.nn1</name>
>   <value>10.17.21.32:9000</value>
> </property>
> <property>
>   <name>dfs.namenode.rpc-address.hivecluster.nn2</name>
>   <value>10.17.21.77:9000</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.hivecluster.nn1</name>
>   <value>10.17.21.32:50070</value>
> </property>
> <property>
>   <name>dfs.namenode.http-address.hivecluster.nn2</name>
>   <value>10.17.21.77:50070</value>
> </property>
> {code}
> Firstly, I created a hive table located as 
> hdfs://hivecluster/hive/warehouse/xxx
> If I use hive on mr, it will run successfully.
> But if I use hive on spark to submit a task to yarn cluster of sparkcluster, 
> it says:
> {code}
> FAILED: Execution Error, return code 2 from 
> org.apache.hadoop.hive.ql.exec.spark.SparkTask
> The YARN log shows:
> Diagnostics: java.lang.IllegalArgumentException: 
> java.net.UnknownHostException: hivecluster
> Failing this attempt. Failing the application.
> {code}
> I had not added the hivecluster nameservice to the hdfs-site.xml of sparkcluster.
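One plausible fix, matching the comment above: declare the hivecluster nameservice in the hdfs-site.xml read by the Spark/YARN side, so the name "hivecluster" resolves. A sketch reusing the namenode addresses quoted in the description; the property names are the standard HDFS HA client settings:

```xml
<!-- Added to the hdfs-site.xml used by sparkcluster (sketch) -->
<property>
  <name>dfs.nameservices</name>
  <value>sparkcluster,hivecluster</value>
</property>
<property>
  <name>dfs.ha.namenodes.hivecluster</name>
  <value>nn1,nn2</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hivecluster.nn1</name>
  <value>10.17.21.32:9000</value>
</property>
<property>
  <name>dfs.namenode.rpc-address.hivecluster.nn2</name>
  <value>10.17.21.77:9000</value>
</property>
<property>
  <name>dfs.client.failover.proxy.provider.hivecluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
```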





[jira] [Commented] (HIVE-10699) Query resultset metadata show wrong column name

2015-12-22 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-10699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15067992#comment-15067992
 ] 

Xiaomeng Huang commented on HIVE-10699:
---

Hi [~tracyy]
try setting {code} set hive.resultset.use.unique.column.names=false; {code}

> Query resultset metadata show wrong column name
> ---
>
> Key: HIVE-10699
> URL: https://issues.apache.org/jira/browse/HIVE-10699
> Project: Hive
>  Issue Type: Bug
>  Components: JDBC
>Affects Versions: 0.13.0, 0.14.0
>Reporter: Tracy Y
>Priority: Minor
>
> select c1 from table1
> resultset metadata shows getColumnName = "c1"
> however,
> select * from table1
> resultset metadata shows getColumnName = "table1.c1".
> getColumnName should be "c1";
> the table name can be shown via getTableName.
> The problem is not seen in Hive 0.12.
> Thanks.
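The setting suggested in the comment above controls whether HiveServer2 prefixes result-set column labels with the table name; a minimal session sketch (editor's example):

```sql
-- With hive.resultset.use.unique.column.names=true (the default),
-- "select * from table1" labels the column "table1.c1";
-- disabling it makes getColumnName return the bare name "c1".
set hive.resultset.use.unique.column.names=false;
select * from table1;
```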





[jira] [Updated] (HIVE-6131) New columns after table alter result in null values despite data

2015-10-12 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-6131:
-
Affects Version/s: 1.2.1

> New columns after table alter result in null values despite data
> 
>
> Key: HIVE-6131
> URL: https://issues.apache.org/jira/browse/HIVE-6131
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 1.2.1
>Reporter: James Vaughan
>Priority: Critical
> Attachments: HIVE-6131.1.patch
>
>
> Hi folks,
> I found and verified a bug on our CDH 4.0.3 install of Hive when adding 
> columns to tables with Partitions using 'REPLACE COLUMNS'.  I dug through the 
> Jira a little bit and didn't see anything for it so hopefully this isn't just 
> noise on the radar.
> Basically, when you alter a table with partitions and then re-upload data to 
> that partition, Hive doesn't seem to recognize the extra data that actually 
> exists in HDFS: it returns NULL values for the new column despite having 
> the data and recognizing the new column in the metadata.
> Here's some steps to reproduce using a basic table:
> 1.  Run this hive command:  CREATE TABLE jvaughan_test (col1 string) 
> partitioned by (day string);
> 2.  Create a simple file on the system with a couple of entries, something 
> like "hi" and "hi2" separated by newlines.
> 3.  Run this hive command, pointing it at the file:  LOAD DATA LOCAL INPATH 
> '' OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
> 4.  Confirm the data with:  SELECT * FROM jvaughan_test WHERE day = 
> '2014-01-02';
> 5.  Alter the column definitions:  ALTER TABLE jvaughan_test REPLACE COLUMNS 
> (col1 string, col2 string);
> 6.  Edit your file and add a second column using the default separator 
> (ctrl+v, then ctrl+a in Vim) and add two more entries, such as "hi3" on the 
> first row and "hi4" on the second
> 7.  Run step 3 again
> 8.  Check the data again like in step 4
> For me, these are the results that get returned:
> hive> select * from jvaughan_test where day = '2014-01-02';
> OK
> hi      NULL    2014-01-02
> hi2     NULL    2014-01-02
> This is despite the fact that there is data in the file stored by the 
> partition in HDFS.
> Let me know if you need any other information.  The only workaround for me 
> currently is to drop partitions for any I'm replacing data in and THEN 
> reupload the new data file.
> Thanks,
> -James
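The workaround the reporter describes (drop the partition, then re-upload) can be sketched as follows; the file path is a hypothetical placeholder:

```sql
-- Workaround from the report: drop the stale partition first so the
-- re-uploaded file is read against the post-ALTER column list.
ALTER TABLE jvaughan_test DROP PARTITION (day = '2014-01-02');
LOAD DATA LOCAL INPATH '/tmp/test_data.txt'  -- hypothetical path
  OVERWRITE INTO TABLE jvaughan_test PARTITION (day = '2014-01-02');
```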





[jira] [Updated] (HIVE-6131) New columns after table alter result in null values despite data

2015-10-12 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-6131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang updated HIVE-6131:
-
Priority: Critical  (was: Minor)

> New columns after table alter result in null values despite data
> 
>
> Key: HIVE-6131
> URL: https://issues.apache.org/jira/browse/HIVE-6131
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 0.11.0, 0.12.0, 0.13.0, 1.2.1
>Reporter: James Vaughan
>Priority: Critical
> Attachments: HIVE-6131.1.patch
>
>





[jira] [Assigned] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang reassigned HIVE-11931:
-

Assignee: Xiaomeng Huang

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
>Assignee: Xiaomeng Huang
> Attachments: 00_0
>
>
> I found a join issue in hive-1.2.1 and hive-1.1.1.
> The create table sql is as below.
> {code}
> CREATE TABLE IF NOT EXISTS join_case(
> orderid  bigint,
> tradeitemid bigint,
> id bigint
> ) ROW FORMAT DELIMITED
> FIELDS TERMINATED BY ',' 
> LINES TERMINATED BY '\n'
> STORED AS TEXTFILE;
> {code}
> Please put attached sample data file 00_0 in /tmp/join_case folder.
> Then load data.
> {code}
> LOAD DATA LOCAL INPATH '/tmp/join_case/00_0' OVERWRITE INTO TABLE 
> join_case;
> {code}
> Run the following sql; it returns no result.
> {code}
> select a.id from 
> (
> select orderid as orderid, max(id) as id from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> This issue also occurs in hive-1.1.0-cdh5.4.5.
> But in apache hive-1.0.1 the above sql can return 10 rows.
> After exchanging the sequence of "orderid as orderid" and "max(id) as id", 
> the following sql can get result in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select max(id) as id, orderid as orderid from join_case group by orderid
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10;
> {code}
> Also, the following sql can get results in hive-1.2.1 and hive-1.1.1.
> {code}
> select a.id from 
> (
> select orderid as orderid, id as id from join_case group by orderid, id
> ) a 
> join 
> (
> select id as id , orderid as orderid from join_case
> ) b
> on a.id = b.id limit 10; 
> {code}
> Can anyone take a look at this issue? 
> Thanks.





[jira] [Resolved] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaomeng Huang resolved HIVE-11931.
---
Resolution: Duplicate

Marking this as a duplicate.

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
>Assignee: Xiaomeng Huang
> Attachments: 00_0
>
>





[jira] [Commented] (HIVE-11931) Join sql cannot get result

2015-10-09 Thread Xiaomeng Huang (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-11931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14950052#comment-14950052
 ] 

Xiaomeng Huang commented on HIVE-11931:
---

Hi [~iceberg565]
I think this bug has been fixed by https://issues.apache.org/jira/browse/HIVE-10996

> Join sql cannot get result
> --
>
> Key: HIVE-11931
> URL: https://issues.apache.org/jira/browse/HIVE-11931
> Project: Hive
>  Issue Type: Bug
>  Components: Hive, Query Planning, Query Processor, SQL
>Affects Versions: 1.1.1, 1.2.1
>Reporter: NING DING
> Attachments: 00_0
>
>


