[jira] [Updated] (HIVE-25838) Hive SQL using TEZ as execution engine not giving result on empty partition

2021-12-31 Thread dinesh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dinesh updated HIVE-25838:
--
Description: 
Hive SQL queries on empty partitions give no result instead of 0 rows or the 
actual value. For example:

-- Create external table

1) Create external table test_tbl (name string) partitioned by (company 
string, processdate string) stored as orc location '/my/some/random/location';

-- Add partition

2) Alter table test_tbl add partition (company='aquaifer', 
processdate='20220101');

 

-- Execute the following SQLs, which return no records.

3) select max(company), processdate from test_tbl group by processdate;

4) select max(processdate) from test_tbl;

 

The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
and '20220101' respectively.
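
For reference, here are the steps above as a single script (a minimal sketch; the explicit hive.execution.engine setting is an assumption added only to make the engine under test unambiguous, and the expected results are those stated in this report):

{code:sql}
-- Assumed setting; run once with tez and once with spark/mr to compare.
set hive.execution.engine=tez;

-- 1) External table whose partition directory holds no data files.
create external table test_tbl (name string)
partitioned by (company string, processdate string)
stored as orc
location '/my/some/random/location';

-- 2) Register the partition; it stays empty.
alter table test_tbl add partition (company='aquaifer', processdate='20220101');

-- 3) and 4) On Tez these come back with no result; per this report, Spark
--    returns a '0' count and '20220101' respectively.
select max(company), processdate from test_tbl group by processdate;
select max(processdate) from test_tbl;
{code}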

 

  was:
Hive SQL queries on empty partitions give no result instead of 0 rows or the 
actual value. For example:

-- Create external table

1) Create external table test_tbl (name string) partitioned by (company 
string, processdate string) stored as orc location '/my/some/random/location';

-- Add partition

2) Alter table test_tbl add partition (company='aquaifer', 
processdate='20220101');

 

-- Execute the following SQLs, which return no records.

3) select count(*), processdate from test_tbl group by processdate;

4) select max(processdate) from test_tbl;

 

The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
and '20220101' respectively.

 


> Hive SQL using TEZ as execution engine not giving result on empty partition
> ---
>
> Key: HIVE-25838
> URL: https://issues.apache.org/jira/browse/HIVE-25838
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: dinesh
>Priority: Major
>
> Hive SQL queries on empty partitions give no result instead of 0 rows or the 
> actual value. For example:
> -- Create external table
> 1) Create external table test_tbl (name string) partitioned by (company 
> string, processdate string) stored as orc location '/my/some/random/location';
> -- Add partition
> 2) Alter table test_tbl add partition (company='aquaifer', 
> processdate='20220101');
>  
> -- Execute the following SQLs, which return no records.
> 3) select max(company), processdate from test_tbl group by processdate;
> 4) select max(processdate) from test_tbl;
>  
> The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
> and '20220101' respectively.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Updated] (HIVE-25838) Hive SQL using TEZ as execution engine not giving result on empty partition

2021-12-31 Thread dinesh (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dinesh updated HIVE-25838:
--
Description: 
Hive SQL queries on empty partitions give no result instead of 0 rows or the 
actual value. For example:

-- Create external table

1) Create external table test_tbl (name string) partitioned by (company 
string, processdate string) stored as orc location '/my/some/random/location';

-- Add partition

2) Alter table test_tbl add partition (company='aquaifer', 
processdate='20220101');

 

-- Execute the following SQLs, which return no records.

3) select count(*), processdate from test_tbl group by processdate;

4) select max(processdate) from test_tbl;

 

The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
and '20220101' respectively.
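
A quick way to confirm that the partition is registered but empty before comparing engines (a hedged diagnostic sketch; the switch to hive.execution.engine=mr is only an assumed comparison step, not part of this report):

{code:sql}
-- Confirm the partition is registered in the metastore.
show partitions test_tbl;

-- Confirm its location and storage details (it should contain no data files).
describe formatted test_tbl partition (company='aquaifer', processdate='20220101');

-- Re-run the queries under a different engine for comparison (assumed step).
set hive.execution.engine=mr;
select count(*), processdate from test_tbl group by processdate;
select max(processdate) from test_tbl;
{code}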

 

  was:
Hive SQL queries on empty partitions give no result instead of 0 rows or the 
actual value. For example:

-- Create external table

1) Create external table test_tbl (name string) partitioned by (company 
string, processdate string) stored as orc location '/my/some/random/location';

-- Add partition

2) Alter table test_tbl add partition (company='aquaifer', 
processdate='20220101');

 

-- Execute the following SQLs, which return no records.

3) select count(*), processdate from test_tbl group by processdate;

4) select max(processdate) from test_tbl;

 

The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
and '20220101' respectively.

 


> Hive SQL using TEZ as execution engine not giving result on empty partition
> ---
>
> Key: HIVE-25838
> URL: https://issues.apache.org/jira/browse/HIVE-25838
> Project: Hive
>  Issue Type: Bug
>Affects Versions: 3.1.0
>Reporter: dinesh
>Priority: Major
>
> Hive SQL queries on empty partitions give no result instead of 0 rows or the 
> actual value. For example:
> -- Create external table
> 1) Create external table test_tbl (name string) partitioned by (company 
> string, processdate string) stored as orc location '/my/some/random/location';
> -- Add partition
> 2) Alter table test_tbl add partition (company='aquaifer', 
> processdate='20220101');
>  
> -- Execute the following SQLs, which return no records.
> 3) select count(*), processdate from test_tbl group by processdate;
> 4) select max(processdate) from test_tbl;
>  
> The same SQL (#3 and #4 above), when executed with Spark, returns a '0' count 
> and '20220101' respectively.
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HIVE-3488) Issue trying to use the thick client (embedded) from windows.

2015-10-31 Thread Dinesh (JIRA)

[ 
https://issues.apache.org/jira/browse/HIVE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983997#comment-14983997
 ] 

Dinesh commented on HIVE-3488:
--

Hi, 

I was getting a similar exception trace. A quick solution is to not use the 
default warehouse location for the table, i.e. "/user/hive/warehouse". 

Alter the table to set its location to some other directory, like 
'hdfs://localhost:/user/youruser/hive/test'. 
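
For example, a minimal sketch (the table name matches the one in the quoted issue, and the 8020 port is a hypothetical placeholder; substitute your own values):

{code:sql}
-- Hypothetical table name and HDFS URI; adjust to your cluster.
alter table test set location 'hdfs://localhost:8020/user/youruser/hive/test';

-- Existing partitions keep their old locations, so they may need to be moved too.
alter table test partition (city='paris') set location 'hdfs://localhost:8020/user/youruser/hive/test/city=paris';
{code}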

This might help you. Please let me know if it works.

Thanks

> Issue trying to use the thick client (embedded) from windows.
> -
>
> Key: HIVE-3488
> URL: https://issues.apache.org/jira/browse/HIVE-3488
> Project: Hive
>  Issue Type: Bug
>  Components: Windows
>Affects Versions: 0.8.1
>Reporter: Rémy DUBOIS
>Priority: Critical
>
> I'm trying to execute a very simple SELECT query against my remote hive 
> server.
> If I do a SELECT * from table, everything works well. If I try to 
> execute a SELECT name from table, this error appears:
> {code:java}
> Job Submission failed with exception 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> 12/09/19 17:18:44 ERROR exec.Task: Job Submission failed with exception 
> 'java.io.IOException(cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> java.io.IOException: cannot find dir = 
> /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: 
> [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris]
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:290)
>   at 
> org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:257)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:104)
>   at 
> org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:407)
>   at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
>   at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
>   at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891)
>   at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Unknown Source)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>   at 
> org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
>   at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818)
>   at 
> org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
>   at 
> org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>   at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
>   at 
> org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>   at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
>   at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
>   at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
>   at 
> org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
>   at 
> org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
> {code}
> Indeed, this "dir" (/user/hive/warehouse/test/city=paris/out.csv) can't be 
> found since it refers to my data file, not a directory.
> Could you please help me?



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)