Prasanna Ravichandran created CARBONDATA-4024:
-------------------------------------------------

             Summary: Select queries with filter and aggregate queries are not 
working in Hive write - carbon table. 
                 Key: CARBONDATA-4024
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-4024
             Project: CarbonData
          Issue Type: Bug
          Components: hive-integration
    Affects Versions: 2.0.0
            Reporter: Prasanna Ravichandran


Select queries with a filter, and aggregate queries, fail on a carbon table 
written through Hive. A plain "select *" on the same table succeeds.

Hive - console:

0: /> use t2;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be): use 
t2; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be); Time 
taken: 0.122 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be): use 
t2; Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be); Time 
taken: 0.019 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
No rows affected (0.207 seconds)
0: /> show tables;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1): show 
tables; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, 
type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1); Time 
taken: 0.015 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1): show 
tables; Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1); Time 
taken: 0.016 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+----------------+
| tab_name       |
+----------------+
| hive_carbon    |
| hive_table     |
| parquet_table  |
+----------------+
3 rows selected (0.114 seconds)
0: /> select * from hive_carbon;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da): 
select * from hive_carbon; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Current sql is not contains insert syntax, not need record dest table 
flag
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:hive_carbon.id, type:int, comment:null), 
FieldSchema(name:hive_carbon.name, type:string, comment:null), 
FieldSchema(name:hive_carbon.scale, type:decimal(10,0), comment:null), 
FieldSchema(name:hive_carbon.country, type:string, comment:null), 
FieldSchema(name:hive_carbon.salary, type:double, comment:null)], 
properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da); Time 
taken: 0.511 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da): 
select * from hive_carbon; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Completed executing 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da); Time 
taken: 0.001 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+-----------------+-------------------+--------------------+----------------------+---------------------+
| hive_carbon.id  | hive_carbon.name  | hive_carbon.scale  | hive_carbon.country  | hive_carbon.salary  |
+-----------------+-------------------+--------------------+----------------------+---------------------+
| 1               | Ram               | 2                  | India                | 3500.0              |
+-----------------+-------------------+--------------------+----------------------+---------------------+
1 row selected (0.614 seconds)
0: /> select * from hive_carbon where hive_carbon.id=1;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191854_20cca5e2-a9ae-470a-acbc-1d0ceb46f4e2): 
select * from hive_carbon where hive_carbon.id=1; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Current sql is not contains insert syntax, not need record dest table 
flag
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:hive_carbon.id, type:int, comment:null), 
FieldSchema(name:hive_carbon.name, type:string, comment:null), 
FieldSchema(name:hive_carbon.scale, type:decimal(10,0), comment:null), 
FieldSchema(name:hive_carbon.country, type:string, comment:null), 
FieldSchema(name:hive_carbon.salary, type:double, comment:null)], 
properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191854_20cca5e2-a9ae-470a-acbc-1d0ceb46f4e2); Time 
taken: 0.215 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191854_20cca5e2-a9ae-470a-acbc-1d0ceb46f4e2): 
select * from hive_carbon where hive_carbon.id=1; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
WARN : Hive-on-MR is deprecated in Hive 2 and may not be available in the 
future versions. Consider using a different execution engine (i.e. spark, tez) 
or using Hive 1.X releases.
INFO : Query ID = omm_20201008191854_20cca5e2-a9ae-470a-acbc-1d0ceb46f4e2, 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Number of reduce tasks is set to 0 since there's no reduce operator
INFO : number of splits:1
INFO : Submitting tokens for job: job_1601898485220_0037
INFO : Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: 
ha-hdfs:hacluster, Ident: (token for carbon: HDFS_DELEGATION_TOKEN 
owner=carbon, renewer=mapred, realUser=hive/hadoop.hadoop....@hadoop.com, 
issueDate=1602164934768, maxDate=1602769734768, sequenceNumber=5596, 
masterKeyId=10), Kind: HIVE_DELEGATION_TOKEN, Service: 
HiveServer2ImpersonationToken, Ident: 00 06 63 61 72 62 6f 6e 06 63 61 72 62 6f 
6e 21 68 69 76 65 2f 68 61 64 6f 6f 70 2e 68 61 64 6f 6f 70 2e 63 6f 6d 40 48 
41 44 4f 4f 50 2e 43 4f 4d 8a 01 75 08 78 47 0e 8a 01 75 2c 84 cb 0e 8e 15 f1 
5a]
INFO : The url to track the job: 
<Job-history-server-URL>/application_1601898485220_0037/
INFO : Starting Job = job_1601898485220_0037, Tracking URL = 
<Job-history-server-URL>/application_1601898485220_0037/, Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Kill Command = 
/opt/huawei/Bigdata/FusionInsight_HD_8.0.2/install/FusionInsight-Hive-3.1.0/hive-3.1.0/bin/..//../hadoop/bin/mapred
 job -kill job_1601898485220_0037
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of 
reducers: 0
INFO : 2020-10-08 19:19:11,339 Stage-1 map = 0%, reduce = 0%
INFO : 2020-10-08 19:19:44,231 Stage-1 map = 100%, reduce = 0%
ERROR : Ended Job = job_1601898485220_0037 with errors
INFO : MapReduce Jobs Launched:
INFO : Stage-Stage-1: Map: 1 HDFS Read: 0 HDFS Write: 0 FAIL
INFO : Total MapReduce CPU Time Spent: 0 msec
INFO : Completed executing 
command(queryId=omm_20201008191854_20cca5e2-a9ae-470a-acbc-1d0ceb46f4e2); Time 
taken: 50.866 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)
0: /> select count(*) from hive_carbon;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191958_51b13d46-d5f5-4b96-88bf-a953ad339c19): 
select count(*) from hive_carbon; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Current sql is not contains insert syntax, not need record dest table 
flag
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:_c0, 
type:bigint, comment:null)], properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191958_51b13d46-d5f5-4b96-88bf-a953ad339c19); Time 
taken: 0.238 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191958_51b13d46-d5f5-4b96-88bf-a953ad339c19): 
select count(*) from hive_carbon; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
WARN : Hive-on-MR is deprecated in Hive 2 and may not be available in the 
future versions. Consider using a different execution engine (i.e. spark, tez) 
or using Hive 1.X releases.
INFO : Query ID = omm_20201008191958_51b13d46-d5f5-4b96-88bf-a953ad339c19, 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Total jobs = 1
INFO : Launching Job 1 out of 1
INFO : Starting task [Stage-1:MAPRED] in serial mode
INFO : Number of reduce tasks determined at compile time: 1
INFO : In order to change the average load for a reducer (in bytes):
INFO : set hive.exec.reducers.bytes.per.reducer=<number>
INFO : In order to limit the maximum number of reducers:
INFO : set hive.exec.reducers.max=<number>
INFO : In order to set a constant number of reducers:
INFO : set mapreduce.job.reduces=<number>
INFO : number of splits:1
INFO : Submitting tokens for job: job_1601898485220_0038
INFO : Executing with tokens: [Kind: HDFS_DELEGATION_TOKEN, Service: 
ha-hdfs:hacluster, Ident: (token for carbon: HDFS_DELEGATION_TOKEN 
owner=carbon, renewer=mapred, realUser=hive/hadoop.hadoop....@hadoop.com, 
issueDate=1602164998767, maxDate=1602769798767, sequenceNumber=5597, 
masterKeyId=10), Kind: HIVE_DELEGATION_TOKEN, Service: 
HiveServer2ImpersonationToken, Ident: 00 06 63 61 72 62 6f 6e 06 63 61 72 62 6f 
6e 21 68 69 76 65 2f 68 61 64 6f 6f 70 2e 68 61 64 6f 6f 70 2e 63 6f 6d 40 48 
41 44 4f 4f 50 2e 43 4f 4d 8a 01 75 08 78 47 0e 8a 01 75 2c 84 cb 0e 8e 15 f1 
5a]
INFO : The url to track the job: 
<Job-history-server-URL>/application_1601898485220_0038/
INFO : Starting Job = job_1601898485220_0038, Tracking URL = 
<Job-history-server-URL>/application_1601898485220_0038/, Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Kill Command = 
/opt/huawei/Bigdata/FusionInsight_HD_8.0.2/install/FusionInsight-Hive-3.1.0/hive-3.1.0/bin/..//../hadoop/bin/mapred
 job -kill job_1601898485220_0038
INFO : Hadoop job information for Stage-1: number of mappers: 1; number of 
reducers: 1
INFO : 2020-10-08 19:20:17,684 Stage-1 map = 0%, reduce = 0%
INFO : 2020-10-08 19:20:50,546 Stage-1 map = 100%, reduce = 100%
ERROR : Ended Job = job_1601898485220_0038 with errors
INFO : MapReduce Jobs Launched:
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask (state=08S01,code=2)

*Common error in the map task, as shown at the Job history server URL, for 
both the filter query and the aggregate query above:*

Error: java.io.IOException: java.io.IOException: Database name is not set.
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
    at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.io.IOException: Database name is not set.
    at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
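
The trace shows the failure surfacing in CarbonInputFormat.getDatabaseName while 
the map task creates its record reader. In the console output above, only the 
two failing queries launch a MapReduce job. The following is a minimal sketch of 
that failure mode, not CarbonData's actual code: it assumes the database name is 
looked up from the job configuration, models the configuration as a plain Map, 
and uses a hypothetical property key (DATABASE_NAME_KEY) and class name 
(DatabaseNameCheck).

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: mimics the failure mode seen in the trace.
// This is NOT the CarbonData implementation.
final class DatabaseNameCheck {
    // Hypothetical property key; the key CarbonData really uses may differ.
    static final String DATABASE_NAME_KEY = "example.carbon.databaseName";

    // Returns the configured database name, or fails with the same message
    // that the map task reported.
    static String getDatabaseName(Map<String, String> jobConf) throws IOException {
        String databaseName = jobConf.get(DATABASE_NAME_KEY);
        if (databaseName == null) {
            throw new IOException("Database name is not set.");
        }
        return databaseName;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> jobConf = new HashMap<>();
        try {
            // Nothing was propagated to the task-side configuration.
            getDatabaseName(jobConf);
        } catch (IOException e) {
            System.out.println("without key: " + e.getMessage());
        }
        // Once the name is present, the lookup succeeds.
        jobConf.put(DATABASE_NAME_KEY, "t2");
        System.out.println("with key: " + getDatabaseName(jobConf));
    }
}
```

If the real lookup works along these lines, the place to check would be whether 
the database name reaches the configuration that the map task sees for tables 
written through Hive.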
 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
