[GitHub] [carbondata] Indhumathi27 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


Indhumathi27 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-705339900


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] maheshrajus commented on pull request #3912: [CARBONDATA-3977] Global sort partitions should be determined dynamically

2020-10-07 Thread GitBox


maheshrajus commented on pull request #3912:
URL: https://github.com/apache/carbondata/pull/3912#issuecomment-705339174


   retest this please







[GitHub] [carbondata] maheshrajus closed pull request #3968: [WIP] Partition optimization

2020-10-07 Thread GitBox


maheshrajus closed pull request #3968:
URL: https://github.com/apache/carbondata/pull/3968


   







[GitHub] [carbondata] maheshrajus commented on pull request #3968: [WIP] Partition optimization

2020-10-07 Thread GitBox


maheshrajus commented on pull request #3968:
URL: https://github.com/apache/carbondata/pull/3968#issuecomment-705337955


   Kunal already raised a PR for this. 







[GitHub] [carbondata] Indhumathi27 commented on pull request #3950: [CARBONDATA-3889] Enable scalastyle check for all scala test code

2020-10-07 Thread GitBox


Indhumathi27 commented on pull request #3950:
URL: https://github.com/apache/carbondata/pull/3950#issuecomment-705337439


   @QiangCai Please rebase







[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-07 Thread GitBox


ShreelekhyaG commented on a change in pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#discussion_r501455143



##
File path: 
integration/hive/src/test/java/org/apache/carbondata/hive/HiveCarbonTest.java
##
@@ -211,6 +248,85 @@ public void testStructType() throws Exception {
 checkAnswer(carbonResult, hiveResult);
   }
 
+  private ArrayList<DimensionRawColumnChunk> getDimRawChunk(Integer blockindex)
+      throws IOException {
+    File rootPath = new File(HiveTestUtils.class.getResource("/").getPath() + "../../../..");
+    String storePath = rootPath.getAbsolutePath()
+        + "/integration/hive/target/warehouse/warehouse/hive_carbon_table/";
+    CarbonFile[] dataFiles = FileFactory.getCarbonFile(storePath)
+        .listFiles(new CarbonFileFilter() {
+          @Override
+          public boolean accept(CarbonFile file) {
+            if (file.getName().endsWith(CarbonCommonConstants.FACT_FILE_EXT)) {
+              return true;
+            } else {
+              return false;
+            }
+          }
+        });
+    ArrayList<DimensionRawColumnChunk> dimensionRawColumnChunks =
+        read(dataFiles[0].getAbsolutePath(), blockindex);
+    return dimensionRawColumnChunks;
+  }
+
+  private ArrayList<DimensionRawColumnChunk> read(String filePath, Integer blockIndex) throws IOException {

Review comment:
   Done. 
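The quoted helper selects `.carbondata` fact files with a `CarbonFileFilter`. The same name-suffix filtering pattern can be sketched with plain `java.io` — a minimal, self-contained sketch; the class name and the temp-directory layout are hypothetical, and the suffix mirrors `CarbonCommonConstants.FACT_FILE_EXT`:

```java
import java.io.File;
import java.io.FilenameFilter;
import java.io.IOException;
import java.nio.file.Files;

public class FactFileFilterDemo {

  // Suffix of CarbonData fact files; mirrors CarbonCommonConstants.FACT_FILE_EXT.
  static final String FACT_FILE_EXT = ".carbondata";

  // Return only the fact files in a store directory, like the CarbonFileFilter above.
  static File[] listFactFiles(File storeDir) {
    return storeDir.listFiles(new FilenameFilter() {
      @Override
      public boolean accept(File dir, String name) {
        return name.endsWith(FACT_FILE_EXT);
      }
    });
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical store layout: one data file plus one index file.
    File dir = Files.createTempDirectory("fact-filter-demo").toFile();
    new File(dir, "part-0-0.carbondata").createNewFile();
    new File(dir, "part-0-0.carbonindex").createNewFile();
    System.out.println(listFactFiles(dir).length); // prints 1: only the .carbondata file matches
  }
}
```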









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-705121072


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2582/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-705119554


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4332/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705104850


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2581/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705103101


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4331/
   







[GitHub] [carbondata] nihal0107 commented on a change in pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


nihal0107 commented on a change in pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#discussion_r501158701



##
File path: docs/ddl-of-carbondata.md
##
@@ -812,7 +813,19 @@ Users can specify which columns to include and exclude for 
local dictionary gene
```
ALTER TABLE tablename UNSET TBLPROPERTIES('SORT_SCOPE')
```
+ - # Long String Columns
+   Example to SET Long String Columns:
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('LONG_STRING_COLUMNS'='column1')
+   ```
+   **NOTE:** Only string columns can be set to long string columns. Cannot 
set sort columns to long string columns.
 
+   Example to UNSET Long String Columns:
+   ```
+   ALTER TABLE tablename UNSET TBLPROPERTIES('LONG_STRING_COLUMNS')
+   ```
+   **NOTE:** On unset long string columns are set to their original 
datatypes.

Review comment:
   done
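For context, the LONG_STRING_COLUMNS property this doc change documents lets string columns exceed CarbonData's 32000-character limit by storing them as VARCHAR. A minimal end-to-end sketch (table and column names hypothetical):

```
-- Hypothetical table with a column that may exceed the 32000-character string limit.
CREATE TABLE demo_tbl (id INT, note STRING)
STORED AS carbondata;

-- SET: only string columns qualify, and they must not be in SORT_COLUMNS.
ALTER TABLE demo_tbl SET TBLPROPERTIES ('LONG_STRING_COLUMNS' = 'note');

-- UNSET: the columns revert to their original datatypes.
ALTER TABLE demo_tbl UNSET TBLPROPERTIES ('LONG_STRING_COLUMNS');
```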









[GitHub] [carbondata] Indhumathi27 commented on pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


Indhumathi27 commented on pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705036042


   retest this please 







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705016588


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2580/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971#issuecomment-705014817


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4330/
   







[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


Indhumathi27 commented on a change in pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#discussion_r501090468



##
File path: docs/ddl-of-carbondata.md
##
@@ -812,7 +813,19 @@ Users can specify which columns to include and exclude for 
local dictionary gene
```
ALTER TABLE tablename UNSET TBLPROPERTIES('SORT_SCOPE')
```
+ - # Long String Columns
+   Example to SET Long String Columns:
+   ```
+   ALTER TABLE tablename SET TBLPROPERTIES('LONG_STRING_COLUMNS'='column1')
+   ```
+   **NOTE:** Only string columns can be set to long string columns. Cannot 
set sort columns to long string columns.
 
+   Example to UNSET Long String Columns:
+   ```
+   ALTER TABLE tablename UNSET TBLPROPERTIES('LONG_STRING_COLUMNS')
+   ```
+   **NOTE:** On unset long string columns are set to their original 
datatypes.

Review comment:
   ```suggestion
  **NOTE:** On unset, long string columns are set to their original 
datatypes.
   ```









[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704967528


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4329/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704967821


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2579/
   







[jira] [Created] (CARBONDATA-4024) Select queries with filter and aggregate queries are not working in Hive write - carbon table.

2020-10-07 Thread Prasanna Ravichandran (Jira)
Prasanna Ravichandran created CARBONDATA-4024:
-

 Summary: Select queries with filter and aggregate queries are not 
working in Hive write - carbon table. 
 Key: CARBONDATA-4024
 URL: https://issues.apache.org/jira/browse/CARBONDATA-4024
 Project: CarbonData
  Issue Type: Bug
  Components: hive-integration
Affects Versions: 2.0.0
Reporter: Prasanna Ravichandran


Select queries with filter and aggregate queries are not working in Hive write 
- carbon table.

Hive - console:

0: /> use t2;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be): use 
t2; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:null, properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be); Time 
taken: 0.122 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be): use 
t2; Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=omm_20201008191831_ac10f1ae-8d39-4185-b25a-d690134a94be); Time 
taken: 0.019 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
No rows affected (0.207 seconds)
0: /> show tables;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1): show 
tables; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, 
type:string, comment:from deserializer)], properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1); Time 
taken: 0.015 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1): show 
tables; Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Starting task [Stage-0:DDL] in serial mode
INFO : Completed executing 
command(queryId=omm_20201008191835_5e1e9469-0054-446f-af82-ec3294ec77b1); Time 
taken: 0.016 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+----------------+
|    tab_name    |
+----------------+
| hive_carbon    |
| hive_table     |
| parquet_table  |
+----------------+
3 rows selected (0.114 seconds)
0: /> select * from hive_carbon;
INFO : State: Compiling.
INFO : Compiling 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da): 
select * from hive_carbon; 
Current sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : hive.compile.auto.avoid.cbo=true
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : Current sql is not contains insert syntax, not need record dest table 
flag
INFO : Semantic Analysis Completed (retrial = false)
INFO : Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:hive_carbon.id, type:int, comment:null), 
FieldSchema(name:hive_carbon.name, type:string, comment:null), 
FieldSchema(name:hive_carbon.scale, type:decimal(10,0), comment:null), 
FieldSchema(name:hive_carbon.country, type:string, comment:null), 
FieldSchema(name:hive_carbon.salary, type:double, comment:null)], 
properties:null)
INFO : Completed compiling 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da); Time 
taken: 0.511 seconds
INFO : Concurrency mode is disabled, not creating a lock manager
INFO : State: Executing.
INFO : Executing 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da): 
select * from hive_carbon; Current 
sessionId=35d8eaaa-6d9f-4e8e-a837-e059b4eb85b4
INFO : Completed executing 
command(queryId=omm_20201008191842_9378bab9-181c-455e-aa6d-9b4f787ce6da); Time 
taken: 0.001 seconds
INFO : OK
INFO : Concurrency mode is disabled, not creating a lock manager
+-----------------+-------------------+--------------------+----------------------+---------------------+
| hive_carbon.id  | hive_carbon.name  | hive_carbon.scale  | hive_carbon.country  | hive_carbon.salary  |
+-----------------+-------------------+--------------------+----------------------+---------------------+
| 1               | Ram               | 2                  | India                | 3500.0              |
+-----------------+-------------------+--------------------+----------------------+---------------------+
1 row selected (0.614 seconds)
0: /> select * from hive_carbon where 

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-704944183


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2577/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-704942707


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4327/
   







[jira] [Updated] (CARBONDATA-3937) Insert into select from another carbon /parquet table is not working on Hive Beeline on a newly create Hive write format - carbon table. We are getting “Database is

2020-10-07 Thread Prasanna Ravichandran (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-3937:
--
Description: 
Insert into select from another carbon or parquet table to a carbon table is 
not working on Hive Beeline on a newly create Hive write format carbon table. 
We are getting “Database is not set” error.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, 
salary double);
 create table parquet_table(id int, name string, scale decimal, country string, 
salary double) stored as parquet;
 insert into hive_table select 1,"Ram","2.3","India",3500;
 select * from hive_table;
 insert into parquet_table select 1,"Ram","2.3","India",3500;
 select * from parquet_table;

--execute the below query in hive beeline;

insert into hive_carbon select * from parquet_table;

Attached the logs for your reference. But the insert into select from the 
parquet and hive table into carbon table is working fine.

 

Only insert into select from a hive table to a carbon table is working.

Error details in MR job which run through hive query:

Error: java.io.IOException: java.io.IOException: Database name is not set.
  at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
  at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
  at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
  at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
  at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.io.IOException: Database name is not set.
  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
  ... 9 more

  was:
Insert into select from another carbon or parquet table to a carbon table is 
not working on Hive Beeline on a newly create Hive write format carbon table. 
We are getting “Database is not set” error.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, 
salary double);
 create table parquet_table(id int, name string, scale decimal, country string, 
salary double) stored as parquet;
 insert into hive_table select 1,"Ram","2.3","India",3500;
 select * from hive_table;
 insert into 

[jira] [Updated] (CARBONDATA-3937) Insert into select from another carbon /parquet table is not working on Hive Beeline on a newly create Hive write format - carbon table. We are getting “Database is

2020-10-07 Thread Prasanna Ravichandran (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-3937:
--
Description: 
Insert into select from another carbon or parquet table to a carbon table is 
not working on Hive Beeline on a newly create Hive write format carbon table. 
We are getting “Database is not set” error.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, 
salary double);
 create table parquet_table(id int, name string, scale decimal, country string, 
salary double) stored as parquet;
 insert into hive_table select 1,"Ram","2.3","India",3500;
 select * from hive_table;
 insert into parquet_table select 1,"Ram","2.3","India",3500;
 select * from parquet_table;

--execute the below query in hive beeline;

insert into hive_carbon select * from parquet_table;

Attached the logs for your reference. But the insert into select from the 
parquet and hive table into carbon table is working fine.

 

Error details in MR job which run through hive query:

Error: java.io.IOException: java.io.IOException: Database name is not set.
  at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
  at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
  at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
  at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
  at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
  at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:422)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.io.IOException: Database name is not set.
  at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
  at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
  at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
  ... 9 more

  was:
Insert into select from another carbon table is not working on Hive Beeline on 
a newly create Hive write format carbon table. We are getting “Carbondata files 
not found error”.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, 
salary double);
create table parquet_table(id int, name string, scale decimal, country string, 
salary double) stored as parquet;
insert into hive_table select 1,"Ram","2.3","India",3500;
select * from hive_table;
insert into parquet_table select 1,"Ram","2.3","India",3500;
select * from parquet_table;

--execute the below query in hive 

[jira] [Updated] (CARBONDATA-3937) Insert into select from another carbon /parquet table is not working on Hive Beeline on a newly create Hive write format - carbon table. We are getting “Database is

2020-10-07 Thread Prasanna Ravichandran (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-3937:
--
Description: 
Insert into select from another carbon table is not working on Hive Beeline on 
a newly create Hive write format carbon table. We are getting “Carbondata files 
not found error”.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, 
salary double);
create table parquet_table(id int, name string, scale decimal, country string, 
salary double) stored as parquet;
insert into hive_table select 1,"Ram","2.3","India",3500;
select * from hive_table;
insert into parquet_table select 1,"Ram","2.3","India",3500;
select * from parquet_table;

--execute the below query in hive beeline;

insert into hive_carbon select * from parquet_table;

Attached the logs for your reference. But the insert into select from the 
parquet and hive table into carbon table is working fine.

 

Error details in MR job which run through hive query:

Error: java.io.IOException: java.io.IOException: Database name is not set. at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
 at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
 at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
 at 
org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
 at 
org.apache.hadoop.mapred.MapTask$TrackedRecordReader.(MapTask.java:175) 
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444) at 
org.apache.hadoop.mapred.MapTask.run(MapTask.java:349) at 
org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175) at 
java.security.AccessController.doPrivileged(Native Method) at 
javax.security.auth.Subject.doAs(Subject.java:422) at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
 at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169) Caused by: 
java.io.IOException: Database name is not set. at 
org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
 at 
org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
 at 
org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
 at 
org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
 at 
org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
 ... 9 more
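The stack trace shows CarbonInputFormat.getDatabaseName failing because the database name is missing from the job configuration. As an illustrative workaround sketch only (not a verified fix; the table name hive_carbon_props is hypothetical), the carbon table can be declared with explicit SERDEPROPERTIES so MapredCarbonInputFormat can resolve the database and table names, the same way the uniqdata1 read-format table is declared later in this digest:

```sql
-- Hypothetical workaround sketch: declare the carbon table with explicit
-- database/table properties so MapredCarbonInputFormat can resolve them.
create table hive_carbon_props(id int, name string, scale decimal, country string,
salary double)
row format serde 'org.apache.carbondata.hive.CarbonHiveSerDe'
with serdeproperties (
  'mapreduce.input.carboninputformat.databaseName'='default',
  'mapreduce.input.carboninputformat.tableName'='hive_carbon_props')
stored as inputformat 'org.apache.carbondata.hive.MapredCarbonInputFormat'
outputformat 'org.apache.carbondata.hive.MapredCarbonOutputFormat';
```

Whether these properties are picked up for an insert-into-select MR job is an assumption here, not confirmed by this thread.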

  was:
Insert into select from another carbon table is not working on Hive Beeline on 
a newly created Hive write format carbon table. We are getting a "Carbondata 
files not found" error.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, 
salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 

Attached the logs for your reference. But the insert into select from the 
parquet and hive table into carbon table is working fine.

 

Error details in MR job which run through hive query:

Error: java.io.IOException: java.io.IOException: CarbonData file is not present 
in the table location at 
org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
 at 

[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-704904168


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4326/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3695: [WIP] partition optimization

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3695:
URL: https://github.com/apache/carbondata/pull/3695#issuecomment-704901815


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2576/
   







[GitHub] [carbondata] akkio-97 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


akkio-97 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704899485


   retest this please







[jira] [Updated] (CARBONDATA-3938) In Hive read table, we are unable to read a projection column or read a full scan - select * query. Even the aggregate queries are not working.

2020-10-07 Thread Prasanna Ravichandran (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prasanna Ravichandran updated CARBONDATA-3938:
--
Description: 
In the Hive read table, we are unable to read a projection column or run a full 
scan query. Even the aggregate queries are not working.

 

Test query:

 

--spark beeline;

drop table if exists uniqdata;

drop table if exists uniqdata1;

CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata ;

LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

CREATE TABLE IF NOT EXISTS uniqdata1 (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH 
SERDEPROPERTIES 
('mapreduce.input.carboninputformat.databaseName'='default','mapreduce.input.carboninputformat.tableName'='uniqdata')
 STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' 
OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 
'hdfs://hacluster/user/hive/warehouse/uniqdata';

select  count(*)  from uniqdata1;

 

 

--Hive Beeline;

select count(*) from uniqdata1; --not working, returns 0 rows even though 2000 
rows are there;--Issue 1 on Hive read format table;

select * from uniqdata1; --Returns no rows;--Issue 2-a: full scan on Hive read 
format table;

select cust_id from uniqdata1 limit 5; --Returns no rows;--Issue 2-b: select 
query with projection, not working, returns no rows;

 Attached the logs for your reference. With the Hive write table this issue is 
not seen. Issue is only seen in Hive read format table.

This issue also exists when a normal carbon table is created in Spark and read 
through Hive beeline.

  was:
In Hive read table, we are unable to read a projection column or full scan 
query. But the aggregate queries are working fine.

 

Test query:

 

--spark beeline;

drop table if exists uniqdata;

drop table if exists uniqdata1;

CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, 
DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) stored as carbondata ;

LOAD DATA INPATH 'hdfs://hacluster/user/prasanna/2000_UniqData.csv' into table 
uniqdata OPTIONS('DELIMITER'=',', 
'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');

CREATE TABLE IF NOT EXISTS uniqdata1 (CUST_ID int,CUST_NAME 
String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 
bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
int) ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe' WITH 
SERDEPROPERTIES 
('mapreduce.input.carboninputformat.databaseName'='default','mapreduce.input.carboninputformat.tableName'='uniqdata')
 STORED AS INPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonInputFormat' 
OUTPUTFORMAT 'org.apache.carbondata.hive.MapredCarbonOutputFormat' LOCATION 
'hdfs://hacluster/user/hive/warehouse/uniqdata';

select  count(*)  from uniqdata1;

 

 

--Hive Beeline;

select count(*) from uniqdata1; --Returns 2000;

select count(*) from uniqdata; --Returns 2000 - working fine;

select * from uniqdata1; --Returns no rows;--Issue 1 on Hive read format table;

select * from uniqdata; --Returns no rows;--Issue 2 while reading a normal 
carbon table created in Spark;

select cust_id from uniqdata1 limit 5; --Returns no rows;

 Attached the logs for your reference. With the Hive write table this issue is 
not seen. Issue is only seen in Hive read format table.

This issue also exists when a normal carbon table is created in Spark and read 
through Hive beeline.

Summary: In Hive read table, we are unable to read a projection column 
or read a full scan - select * query. Even the aggregate queries are not 
working.  (was: In Hive read table, we are unable to read a projection column 
or read a full scan - select * query.)

[GitHub] [carbondata] Indhumathi27 opened a new pull request #3971: [WIP] Do not clean stale data

2020-10-07 Thread GitBox


Indhumathi27 opened a new pull request #3971:
URL: https://github.com/apache/carbondata/pull/3971


### Why is this PR needed?


### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704885270


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4325/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704882187


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2575/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#issuecomment-704873036


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4323/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] change discovery.uri in presto guide and dml document update

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#issuecomment-704870859


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2573/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3964:
URL: https://github.com/apache/carbondata/pull/3964#issuecomment-704843056


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4322/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3964:
URL: https://github.com/apache/carbondata/pull/3964#issuecomment-704842487


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2572/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3967: [CARBONDATA-4004] Issue with select after update command

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3967:
URL: https://github.com/apache/carbondata/pull/3967#issuecomment-704828397











[jira] [Closed] (CARBONDATA-3921) SI load fails with unable to get filestatus error in concurrent scenario

2020-10-07 Thread Akshay (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akshay closed CARBONDATA-3921.
--
Resolution: Fixed

> SI load fails with unable to get filestatus error in concurrent scenario
> 
>
> Key: CARBONDATA-3921
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3921
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akshay
>Priority: Major
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> The clean-up mechanism in the SI finally block deletes the ongoing load 
> files, which caused the issue.
> Solution - Remove the call to the API 
> TableProcessingOperations.deletePartialLoadDataIfExist, as it is not required 
> for this scenario; clean-up happens via clean files and the 
> deleteLoadsAndUpdateMetadata API.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3794.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

The issue is resolved in Carbon 2.1.0 version.

0: jdbc:hive2://10.20.251.163:23040/default> show metacache;
+-------------+-------------------+---------------------+-----------------+
| Identifier  | Table Index size  | CgAndFg Index size  | Cache Location  |
+-------------+-------------------+---------------------+-----------------+
+-------------+-------------------+---------------------+-----------------+
No rows selected (7.059 seconds)
0: jdbc:hive2://10.20.251.163:23040/default> show metacache;
+-------------+-------------------+---------------------+-----------------+
| Identifier  | Table Index size  | CgAndFg Index size  | Cache Location  |
+-------------+-------------------+---------------------+-----------------+
+-------------+-------------------+---------------------+-----------------+
No rows selected (6.52 seconds)

> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> 
>
> Key: CARBONDATA-3794
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3794
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2 compatible carbon
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> *1st time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+------------+--------------+-----------------+
> | Identifier                  | Index size | Datamap size | Cache Location  |
> +-----------------------------+------------+--------------+-----------------+
> | TOTAL                       | 745 B      | 0 B          | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
> +-----------------------------+------------+--------------+-----------------+
> *2 rows selected (8.233 seconds)*
>  
> *2nd time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+------------+--------------+-----------------+
> | Identifier                  | Index size | Datamap size | Cache Location  |
> +-----------------------------+------------+--------------+-----------------+
> | TOTAL                       | 745 B      | 0 B          | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
> +-----------------------------+------------+--------------+-----------------+
> *2 rows selected (1.46 seconds)*
>  
> *Sometimes the 1st time show metacache takes up to 25 seconds compared to 3-4 
> seconds for the 2nd time show metacache.*





[jira] [Updated] (CARBONDATA-3794) show metacache command takes significantly more time 1st time when compared to 2nd time.

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat updated CARBONDATA-3794:

Description: 
show metacache command takes significantly more time 1st time when compared to 
2nd time.

*1st time*

0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+------------+--------------+-----------------+
| Identifier                  | Index size | Datamap size | Cache Location  |
+-----------------------------+------------+--------------+-----------------+
| TOTAL                       | 745 B      | 0 B          | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
+-----------------------------+------------+--------------+-----------------+
*2 rows selected (8.233 seconds)*

*2nd time*

0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+------------+--------------+-----------------+
| Identifier                  | Index size | Datamap size | Cache Location  |
+-----------------------------+------------+--------------+-----------------+
| TOTAL                       | 745 B      | 0 B          | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
+-----------------------------+------------+--------------+-----------------+
*2 rows selected (1.46 seconds)*

 

*Sometimes the 1st time show metacache takes up to 25 seconds compared to 3-4 
seconds for the 2nd time show metacache.*
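For reference, the two runs above are simply the same command issued back to back in one beeline session; the elapsed times are those reported by beeline. The comment on the first run below is an assumption (the report itself does not state why the first run is slower):

```sql
-- Run twice in the same beeline session; elapsed times come from beeline output.
show metacache;  -- 1st run: slower (observed ~8.2 s), presumably cold metadata
show metacache;  -- 2nd run: faster (observed ~1.5 s)
```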

  was:
show metacache command takes significantly more time 1st time when compared to 
2nd time.

*1st time*

0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+------------+--------------+-----------------+
| Identifier                  | Index size | Datamap size | Cache Location  |
+-----------------------------+------------+--------------+-----------------+
| TOTAL                       | 745 B      | 0 B          | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
+-----------------------------+------------+--------------+-----------------+
*2 rows selected (8.233 seconds)*

 

*2nd time*
0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
+-----------------------------+------------+--------------+-----------------+
| Identifier                  | Index size | Datamap size | Cache Location  |
+-----------------------------+------------+--------------+-----------------+
| TOTAL                       | 745 B      | 0 B          | DRIVER          |
| 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
+-----------------------------+------------+--------------+-----------------+
*2 rows selected (1.46 seconds)*

 

*Sometimes the 1st time show metacache takes up to 25 seconds compared to 3-4 
seconds for the 2nd time show metacache.*


> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> 
>
> Key: CARBONDATA-3794
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3794
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.0
> Environment: Spark 2.3.2 compatible carbon
>Reporter: Chetan Bhat
>Priority: Minor
>
> show metacache command takes significantly more time 1st time when compared 
> to 2nd time.
> *1st time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+------------+--------------+-----------------+
> | Identifier                  | Index size | Datamap size | Cache Location  |
> +-----------------------------+------------+--------------+-----------------+
> | TOTAL                       | 745 B      | 0 B          | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
> +-----------------------------+------------+--------------+-----------------+
> *2 rows selected (8.233 seconds)*
>  
> *2nd time*
> 0: jdbc:hive2://10.20.255.171:23040/default> show metacache;
> +-----------------------------+------------+--------------+-----------------+
> | Identifier                  | Index size | Datamap size | Cache Location  |
> +-----------------------------+------------+--------------+-----------------+
> | TOTAL                       | 745 B      | 0 B          | DRIVER          |
> | 1_6_1.uniqdata_comp_nosort  | 745 B      | 0 B          | DRIVER          |
> +-----------------------------+------------+--------------+-----------------+
> *2 rows selected (1.46 seconds)*
>  
> *Sometimes the 1st time show metacache takes up to 25 seconds compared to 3-4 
> seconds for the 2nd time show metacache.*





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-704799730


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2570/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-704798161


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4320/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-704794823


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2569/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970#issuecomment-704794039


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4319/
   







[jira] [Closed] (CARBONDATA-3950) Alter table drop column for non partition column throws error

2020-10-07 Thread Chetan Bhat (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chetan Bhat closed CARBONDATA-3950.
---
Fix Version/s: 2.1.0
   Resolution: Fixed

The issue is fixed in the latest Carbon 2.1.0 build.

> Alter table drop column for non partition column throws error
> -
>
> Key: CARBONDATA-3950
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3950
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 2.0.1
> Environment: Spark 2.4.5
>Reporter: Chetan Bhat
>Priority: Minor
> Fix For: 2.1.0
>
>
> From spark-sql the queries are executed as mentioned below-
> drop table if exists uniqdata_int;
> CREATE TABLE uniqdata_int (CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB 
> timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double, INTEGER_COLUMN1 
> int) Partitioned by (cust_id int) stored as carbondata TBLPROPERTIES 
> ("TABLE_BLOCKSIZE"= "256 MB");
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata_int partition(cust_id='1') OPTIONS ('FILEHEADER'='CUST_ID,CUST_NAME 
> ,ACTIVE_EMUI_VERSION,DOB,DOJ, 
> BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1, 
> Double_COLUMN2,INTEGER_COLUMN1','BAD_RECORDS_ACTION'='FORCE');
> show partitions uniqdata_int;
> select * from uniqdata_int order by cust_id;
> alter table uniqdata_int add columns(id int);
>  desc uniqdata_int;
>  *alter table uniqdata_int drop columns(CUST_NAME);*
>  desc uniqdata_int;
> Issue : Alter table drop column for non partition column throws error even 
> though the operation is success.
> org.apache.carbondata.spark.exception.ProcessMetaDataException: operation 
> failed for priyesh.uniqdata_int: Alter table drop column operation failed: 
> org.apache.hadoop.hive.ql.metadata.HiveException: Unable to alter table. The 
> following columns have types incompatible with the existing columns in their 
> respective positions :
> col;
>  at 
> org.apache.spark.sql.execution.command.MetadataProcessOperation$class.throwMetadataException(package.
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.throwMetadataException(package.scala:120)
>  at 
> org.apache.spark.sql.execution.command.schema.CarbonAlterTableDropColumnCommand.processMetadata(CarbonAlterTableDropColumnCommand.scala:201)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand$$anonfun$run$1.apply(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.Auditable$class.runWithAudit(package.scala:104)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.runWithAudit(package.scala:120)
>  at 
> org.apache.spark.sql.execution.command.MetadataCommand.run(package.scala:123)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:69)
>  at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:80)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3379)
>  at 
> org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:95
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:144)
>  at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:86)
>  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3378)
>  at org.apache.spark.sql.Dataset.<init>(Dataset.scala:196)
>  at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:79)
>  at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:651)
>  at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:67)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:387)
>  at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:406)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:279)
>  at 
> org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at 

[GitHub] [carbondata] QiangCai commented on pull request #3969: [CARBONDATA-3932] [CARBONDATA-3903] document update

2020-10-07 Thread GitBox


QiangCai commented on pull request #3969:
URL: https://github.com/apache/carbondata/pull/3969#issuecomment-704782229


   better to change the PR title to describe what this PR does.







[GitHub] [carbondata] QiangCai edited a comment on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete

2020-10-07 Thread GitBox


QiangCai edited a comment on pull request #3964:
URL: https://github.com/apache/carbondata/pull/3964#issuecomment-704778260


   LGTM, but CI didn't run for 2.3.4







[GitHub] [carbondata] QiangCai commented on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete

2020-10-07 Thread GitBox


QiangCai commented on pull request #3964:
URL: https://github.com/apache/carbondata/pull/3964#issuecomment-704779979


   retest this please







[GitHub] [carbondata] QiangCai commented on pull request #3964: [CARBONDATA-4015] Remove hardcode of Lock configuration in Update and Delete

2020-10-07 Thread GitBox


QiangCai commented on pull request #3964:
URL: https://github.com/apache/carbondata/pull/3964#issuecomment-704778260


   LGTM







[jira] [Resolved] (CARBONDATA-4017) insert fails when column name has back slash and Si creation fails

2020-10-07 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4017.
-
Resolution: Fixed

> insert fails when column name has back slash and Si creation fails
> --
>
> Key: CARBONDATA-4017
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4017
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> 1. when the column name contains the backslash character and the table is 
> created with carbon session , insert fails second time.
> 2. when column name has special characters, SI creation fails in parsing.
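A minimal repro sketch for item 1 (the table and column names here are hypothetical, not taken from the issue; backquotes are used so the special character is accepted by the parser):

```sql
-- Hypothetical repro: column name containing a backslash.
create table special_col(`a\b` string) stored as carbondata;
insert into special_col select 'v1';
insert into special_col select 'v2';  -- reported to fail on the second insert
```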





[GitHub] [carbondata] asfgit closed pull request #3962: [CARBONDATA-4017]Fix the insert issue when the column name contains '\' and fix SI creation issue

2020-10-07 Thread GitBox


asfgit closed pull request #3962:
URL: https://github.com/apache/carbondata/pull/3962


   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-704770724


   Build Failed with Spark 2.3.4, please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4321/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3914: [CARBONDATA-3979] Added Hive local dictionary support example

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3914:
URL: https://github.com/apache/carbondata/pull/3914#issuecomment-704770158


   Build Failed with Spark 2.4.5, please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2571/
   







[jira] [Resolved] (CARBONDATA-4018) CSV header validation is not considering the dimension columns

2020-10-07 Thread Akash R Nilugal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-4018?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Akash R Nilugal resolved CARBONDATA-4018.
-
Resolution: Fixed

> CSV header validation is not considering the dimension columns
> ---
>
> Key: CARBONDATA-4018
> URL: https://issues.apache.org/jira/browse/CARBONDATA-4018
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 2.1.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> CSV header validation is not considering the dimension columns in the schema.





[GitHub] [carbondata] asfgit closed pull request #3963: [CARBONDATA-4018]Fix CSV header validation not contains dimension columns

2020-10-07 Thread GitBox


asfgit closed pull request #3963:
URL: https://github.com/apache/carbondata/pull/3963


   







[jira] [Commented] (CARBONDATA-3970) Carbondata 2.0.1 MV ERROR CarbonInternalMetastore$: Adding/Modifying tableProperties operation failed

2020-10-07 Thread SHREELEKHYA GAMPA (Jira)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-3970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17209365#comment-17209365
 ] 

SHREELEKHYA GAMPA commented on CARBONDATA-3970:
---

Hi [~sushantsam], could you please provide the Spark configurations that were set, 
particularly those related to the metastore, and the complete stack trace of the error.

> Carbondata 2.0.1 MV  ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed
> --
>
> Key: CARBONDATA-3970
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3970
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, hive-integration
>Affects Versions: 2.0.1
> Environment: CarbonData 2.0.1 with Spark 2.4.5
>Reporter: Sushant Sammanwar
>Priority: Major
>
> Hi,
>  
> I am facing issues with materialized views - the query is not hitting the 
> view in the explain plan. I would really appreciate it if you could help me 
> with this.
> Below are the details: 
> I am using Spark shell to connect to Carbon 2.0.1 with Spark 2.4.5.
> The underlying table has data loaded.
> I think the problem occurs while creating the materialized view, as I am 
> getting an error related to the metastore.
>  
>  
> scala> carbon.sql("create MATERIALIZED VIEW agg_sales_mv as select country, 
> sex,sum(quantity),avg(price) from sales group by country,sex").show()
> 20/08/26 01:04:41 AUDIT audit: \{"time":"August 26, 2020 1:04:41 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"START"}
> 20/08/26 01:04:45 AUDIT audit: \{"time":"August 26, 2020 1:04:45 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"START"}
> 20/08/26 01:04:46 AUDIT audit: \{"time":"August 26, 2020 1:04:46 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"START"}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377696791275","opStatus":"SUCCESS","opTime":"2326 
> ms","table":"NA","extraInfo":{}}
> 20/08/26 01:04:48 AUDIT audit: \{"time":"August 26, 2020 1:04:48 AM 
> IST","username":"root","opName":"CREATE 
> TABLE","opId":"16462377160819798","opStatus":"SUCCESS","opTime":"2955 
> ms","table":"default.agg_sales_mv","extraInfo":{"local_dictionary_threshold":"1","bad_record_path":"","table_blocksize":"1024","local_dictionary_enable":"true","flat_folder":"false","external":"false","sort_columns":"","comment":"","carbon.column.compressor":"snappy","mv_related_tables":"sales"}}
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:50 ERROR CarbonInternalMetastore$: Adding/Modifying 
> tableProperties operation failed: 
> org.apache.spark.sql.hive.HiveExternalCatalog cannot be cast to 
> org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener
> 20/08/26 01:04:51 AUDIT audit: \{"time":"August 26, 2020 1:04:51 AM 
> IST","username":"root","opName":"CREATE MATERIALIZED 
> VIEW","opId":"16462372696035311","opStatus":"SUCCESS","opTime":"10551 
> ms","table":"NA","extraInfo":{"mvName":"agg_sales_mv"}}
> ++
> ||
> ++
> ++
>  





[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3961: [CARBONDATA-4019]Fix CDC merge failure join expression made of AND/OR expressions.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3961:
URL: https://github.com/apache/carbondata/pull/3961#issuecomment-704764695


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2568/
   







[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3961: [CARBONDATA-4019]Fix CDC merge failure join expression made of AND/OR expressions.

2020-10-07 Thread GitBox


CarbonDataQA1 commented on pull request #3961:
URL: https://github.com/apache/carbondata/pull/3961#issuecomment-704762415


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/4318/
   







[GitHub] [carbondata] Indhumathi27 commented on pull request #3959: [CARBONDATA-4010] Doc changes for long strings.

2020-10-07 Thread GitBox


Indhumathi27 commented on pull request #3959:
URL: https://github.com/apache/carbondata/pull/3959#issuecomment-704743825


   retest this please







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3953: [CARBONDATA-4008]Fixed IN filter on date column is returning 0 results when 'carbon.push.rowfilters.for.vector' is true

2020-10-07 Thread GitBox


akashrn5 commented on a change in pull request #3953:
URL: https://github.com/apache/carbondata/pull/3953#discussion_r500783398



##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/filterexpr/TestInFilter.scala
##
@@ -165,8 +168,27 @@ class TestInFilter extends QueryTest with 
BeforeAndAfterAll{
   Seq(Row(4, 1.00, 2.00, 3.00)))
   }
 
-  override def afterAll(): Unit = {
+  test("test infilter with date, timestamp columns") {
+sql("create table test_table(i int, dt date, ts timestamp) stored as 
carbondata")
+sql("insert into test_table select 1, '2020-03-30', '2020-03-30 10:00:00'")
+sql("insert into test_table select 2, '2020-07-04', '2020-07-04 14:12:15'")
+sql("insert into test_table select 3, '2020-09-23', '2020-09-23 12:30:45'")
+
+checkAnswer(sql("select * from test_table where dt IN ('2020-03-30', 
'2020-09-23')"),

Review comment:
   Can you add a query here with different values in the IN() list, as you 
mentioned in the description?









[GitHub] [carbondata] Karan980 opened a new pull request #3970: [CARBONDATA-4007] Fix multiple issues in SDK_IUD

2020-10-07 Thread GitBox


Karan980 opened a new pull request #3970:
URL: https://github.com/apache/carbondata/pull/3970


### Why is this PR needed?
   Fix multiple issues that occur in SDK_IUD:
   a) TupleId should always use the Linux file separator, independent of the system.
   b) The filtered-rows array gives an ArrayIndexOutOfBounds exception if the number 
of deleted rows is greater than 4096.
   c) Writing data of a date column type during an update operation gives a 
bad-record exception, as dates are converted to Integer during read.

### What changes were proposed in this PR?
   a) Changed the tupleId file separator to the Linux file separator.
   b) Changed the filtered-rows array size to the default column page row size.
   c) Converted the integer back to the date type before writing the data during the 
update operation.
   
 ### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- Yes
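   Fix (c) in this PR can be illustrated with a small sketch. Assuming DATE values 
are read back as an int count of days since the epoch (an assumption for this 
sketch, not necessarily CarbonData's exact internal encoding), converting that int 
back to a date before the update write avoids the bad-record path:

```java
import java.time.LocalDate;

public class DateReconvertSketch {
    // Hypothetical helper: convert an internal days-since-epoch int back to
    // its date string before handing the row to the writer. CarbonData's
    // real conversion lives in its update flow; this only shows the idea.
    static String daysToDateString(int daysSinceEpoch) {
        return LocalDate.ofEpochDay(daysSinceEpoch).toString();
    }

    public static void main(String[] args) {
        // 18382 days after 1970-01-01 is 2020-04-30
        System.out.println(daysToDateString(18382));
    }
}
```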
   
   
   







[GitHub] [carbondata] akashrn5 commented on a change in pull request #3961: [CARBONDATA-4019]Fix CDC merge failure join expression made of AND/OR expressions.

2020-10-07 Thread GitBox


akashrn5 commented on a change in pull request #3961:
URL: https://github.com/apache/carbondata/pull/3961#discussion_r500756989



##
File path: 
integration/spark/src/main/scala/org/apache/spark/sql/execution/command/mutation/merge/CarbonMergeDataSetCommand.scala
##
@@ -106,18 +106,22 @@ case class CarbonMergeDataSetCommand(
 // decide join type based on match conditions
 val joinType = decideJoinType
 
-val joinColumn = mergeMatches.joinExpr.expr.asInstanceOf[EqualTo].left
-  .asInstanceOf[UnresolvedAttribute].nameParts.tail.head
-// repartition the srsDs, if the target has bucketing and the bucketing 
column and join column
-// are same
+val joinColumns = mergeMatches.joinExpr.expr.collect {
+  case unresolvedAttribute: UnresolvedAttribute if 
unresolvedAttribute.nameParts.nonEmpty =>
+unresolvedAttribute.nameParts.tail.head

Review comment:
   done
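   The patch under review replaces a single `EqualTo` lookup with a `collect` over 
the whole join expression, so nested AND/OR trees contribute every referenced 
column. A minimal, framework-free sketch of that traversal (hypothetical `Node` 
type standing in for Catalyst's expression tree, not its real API):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class JoinColumnCollector {
    // Minimal stand-in for an expression tree: a node is either an
    // attribute reference (leaf) or an operator with children.
    static class Node {
        final String attribute;                       // non-null only for leaves
        final List<Node> children = new ArrayList<>();
        Node(String attribute) { this.attribute = attribute; }
        Node(Node... kids) {
            this.attribute = null;
            for (Node k : kids) children.add(k);
        }
    }

    // Depth-first walk gathering every distinct attribute name, mirroring
    // the collect-over-UnresolvedAttribute approach in the patch.
    static Set<String> collectColumns(Node n) {
        Set<String> out = new LinkedHashSet<>();
        walk(n, out);
        return out;
    }

    private static void walk(Node n, Set<String> out) {
        if (n.attribute != null) out.add(n.attribute);
        for (Node c : n.children) walk(c, out);
    }

    public static void main(String[] args) {
        // (id = id) AND ((name = name) OR (city = city))
        Node expr = new Node(
            new Node(new Node("id"), new Node("id")),
            new Node(new Node(new Node("name"), new Node("name")),
                     new Node(new Node("city"), new Node("city"))));
        System.out.println(collectColumns(expr)); // prints [id, name, city]
    }
}
```

   In the actual Scala change, the same effect comes from 
`mergeMatches.joinExpr.expr.collect { case u: UnresolvedAttribute => ... }`, which 
visits every subtree of the join condition rather than only the top-level equality.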




