[jira] [Updated] (HIVE-25838) Hive SQL using TEZ as execution engine not giving result on empty partition
[ https://issues.apache.org/jira/browse/HIVE-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dinesh updated HIVE-25838:
--
    Description:
Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.

For example -
--Create external Table
1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';

– Add partion
2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');

– Execute following SQL's which returns no records.
3) select max( company ) , processdate from test_tbl group by processdate ;
4) select max(processdate ) from test_tbl ;

Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.

  was:
Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.

For example -
--Create external Table
1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';

– Add partion
2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');

– Execute following SQL's which returns no records.
3) select count( * ) , processdate from test_tbl group by processdate ;
4) select max(processdate ) from test_tbl ;

Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.


> Hive SQL using TEZ as execution engine not giving result on empty partition
> ---
>
>                 Key: HIVE-25838
>                 URL: https://issues.apache.org/jira/browse/HIVE-25838
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: dinesh
>            Priority: Major
>
> Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.
>
> For example -
> --Create external Table
> 1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';
>
> – Add partion
> 2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');
>
> – Execute following SQL's which returns no records.
> 3) select max( company ) , processdate from test_tbl group by processdate ;
> 4) select max(processdate ) from test_tbl ;
>
> Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (HIVE-25838) Hive SQL using TEZ as execution engine not giving result on empty partition
[ https://issues.apache.org/jira/browse/HIVE-25838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dinesh updated HIVE-25838:
--
    Description:
Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.

For example -
--Create external Table
1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';

– Add partion
2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');

– Execute following SQL's which returns no records.
3) select count( * ) , processdate from test_tbl group by processdate ;
4) select max(processdate ) from test_tbl ;

Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.

  was:
Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.

For example -
--Create external Table
1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';

-- Add partion
2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');

-- Execute following SQL's which returns no records.
3) select count(*), processdate from test_tbl group by processdate ;
4) select max(processdate ) from test_tbl ;

Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.


> Hive SQL using TEZ as execution engine not giving result on empty partition
> ---
>
>                 Key: HIVE-25838
>                 URL: https://issues.apache.org/jira/browse/HIVE-25838
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: dinesh
>            Priority: Major
>
> Hive SQL's on empty partitions giving no result instead of 0 rows or actual value.
>
> For example -
> --Create external Table
> 1) Create external table test_tbl ( name string) partitioned by ( company string, processdate string) stored as orc location '/my/some/random/location';
>
> – Add partion
> 2) Alter table test_tbl add partition ( company='aquaifer', processdate='20220101');
>
> – Execute following SQL's which returns no records.
> 3) select count( * ) , processdate from test_tbl group by processdate ;
> 4) select max(processdate ) from test_tbl ;
>
> Same SQL (#3 & #4 above) , when execute with SPARK, returns '0' count and '20220101' respectively.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (HIVE-3488) Issue trying to use the thick client (embedded) from windows.
[ https://issues.apache.org/jira/browse/HIVE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14983997#comment-14983997 ]

Dinesh commented on HIVE-3488:
--

Hi,

I was getting a similar exception trace. A quick solution is to avoid the default warehouse location for the table, i.e. "/user/hive/warehouse". Alter the table to set its location to some other directory, such as 'hdfs://localhost:/user/youruser/hive/test'.

This might help you. Please let me know if it works.

Thanks

> Issue trying to use the thick client (embedded) from windows.
> -
>
>                 Key: HIVE-3488
>                 URL: https://issues.apache.org/jira/browse/HIVE-3488
>             Project: Hive
>          Issue Type: Bug
>          Components: Windows
>    Affects Versions: 0.8.1
>            Reporter: Rémy DUBOIS
>            Priority: Critical
>
> I'm trying to execute a very simple SELECT query against my remote hive server.
> If I'm doing a SELECT * from table, everything works well. If I'm trying to execute a SELECT name from table, this error appears:
> {code:java}
> Job Submission failed with exception 'java.io.IOException(cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> 12/09/19 17:18:44 ERROR exec.Task: Job Submission failed with exception 'java.io.IOException(cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris])'
> java.io.IOException: cannot find dir = /user/hive/warehouse/test/city=paris/out.csv in pathToPartitionInfo: [hdfs://cdh-four:8020/user/hive/warehouse/test/city=paris]
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:290)
>     at org.apache.hadoop.hive.ql.io.HiveFileFormatUtils.getPartitionDescFromPathRecursively(HiveFileFormatUtils.java:257)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat$CombineHiveInputSplit.<init>(CombineHiveInputFormat.java:104)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getSplits(CombineHiveInputFormat.java:407)
>     at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:989)
>     at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:981)
>     at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:891)
>     at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:844)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Unknown Source)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1232)
>     at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:844)
>     at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:818)
>     at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:452)
>     at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:136)
>     at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:133)
>     at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
>     at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1332)
>     at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1123)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:931)
>     at org.apache.hadoop.hive.service.HiveServer$HiveServerHandler.execute(HiveServer.java:191)
>     at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:187)
> {code}
> Indeed, this "dir" (/user/hive/warehouse/test/city=paris/out.csv) can't be found since it deals with my data file, and not a directory.
> Could you please help me?

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
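
The workaround suggested in the comment above could be sketched in HiveQL roughly as follows. This is only an illustration, not a confirmed fix for HIVE-3488: the table name `test` and the partition column `city` are assumptions inferred from the warehouse path in the stack trace, and the target directory is the commenter's example path, copied as written.

```sql
-- Sketch only: `test` and `city` are inferred from the path
-- /user/hive/warehouse/test/city=paris in the stack trace (assumption).
-- The location below is the commenter's example, copied verbatim.
ALTER TABLE test SET LOCATION 'hdfs://localhost:/user/youruser/hive/test';

-- Partitions that already exist keep their own recorded locations,
-- so each one may need to be repointed separately.
-- The partition subdirectory name here is hypothetical.
ALTER TABLE test PARTITION (city='paris')
  SET LOCATION 'hdfs://localhost:/user/youruser/hive/test/city=paris';
```

Note that `SET LOCATION` only updates the metastore entry; it does not move data files, so the existing files would have to be copied to the new directory separately.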