[jira] [Created] (CARBONDATA-3659) alluxio without host and port cannot read or write data in carbon.

2020-01-09 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3659:
---

 Summary: alluxio without host and port cannot read or write data 
in carbon.
 Key: CARBONDATA-3659
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3659
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala


When an alluxio path is provided without host and port, such as 
alluxio:///user/warehouse, carbon cannot read or write data because the path 
comparison fails and extracting the parent path fails.
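The failure mode can be reproduced with plain `java.net.URI` (a standalone sketch, not CarbonData code): the short form has no authority component at all, so any comparison that assumes a `host:port` prefix must handle null.

```java
import java.net.URI;

public class AuthorityCheck {
    public static void main(String[] args) {
        // An alluxio path with an explicit host and port has an authority...
        URI withHost = URI.create("alluxio://localhost:19998/user/warehouse");
        // ...while the short form has none, so host/port-based path
        // comparison and parent-path extraction must handle null here.
        URI noHost = URI.create("alluxio:///user/warehouse");

        System.out.println(withHost.getAuthority()); // localhost:19998
        System.out.println(noHost.getAuthority());   // null
        System.out.println(noHost.getPath());        // /user/warehouse
    }
}
```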



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3597) Support DataSet merge functionality in Carbon

2019-11-27 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3597:

Description: 
A dataset merge operation is essential to improve the performance of SCD and 
CDC scenarios, and it also eases the development effort.

The expected behavior is as follows.

SQL Syntax

{code}

MERGE INTO [db_name.]target_table [AS target_alias]
USING [db_name.]source_table [AS source_alias]
ON <merge_condition>
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN MATCHED [ AND <condition> ] THEN <matched_action> ]
[ WHEN NOT MATCHED [ AND <condition> ] THEN <not_matched_action> ]

{code}

 

DataFrame API

{code}

targetDS.merge(sourceDS, __).
  whenMatched(__).
  updateExpr(updateMap).
  insertExpr(insertMap_u).
  whenNotMatched(__).
  insertExpr(insertMap).
  whenNotMatchedAndExistsOnlyOnTarget(__).
  delete().
  insertHistoryTableExpr(insertMap_d, ).
  execute()

{code}
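The API above is a proposal; as a rough illustration of the intended row-level semantics (a hypothetical sketch over keyed rows, not CarbonData code), matched rows are updated, unmatched source rows are inserted, and the clause for rows that exist only on the target corresponds to a delete:

```java
import java.util.HashMap;
import java.util.Map;

public class MergeSketch {
    // Hypothetical illustration of MERGE semantics over keyed rows.
    static Map<String, String> merge(Map<String, String> target,
                                     Map<String, String> source,
                                     boolean deleteTargetOnlyRows) {
        Map<String, String> result = new HashMap<>(target);
        // WHEN MATCHED THEN UPDATE / WHEN NOT MATCHED THEN INSERT:
        // source rows overwrite matched keys and add new keys.
        result.putAll(source);
        if (deleteTargetOnlyRows) {
            // whenNotMatchedAndExistsOnlyOnTarget -> delete():
            // drop keys that appear only in the target.
            result.keySet().retainAll(source.keySet());
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> target = new HashMap<>();
        target.put("1", "old");
        target.put("2", "stale");
        Map<String, String> source = new HashMap<>();
        source.put("1", "new");
        source.put("3", "inserted");
        System.out.println(merge(target, source, false));
        System.out.println(merge(target, source, true));
    }
}
```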

 

> Support DataSet merge functionality in Carbon
> -
>
> Key: CARBONDATA-3597
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3597
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support DataSet Merge in CarboonData_V1.docx
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3597) Support DataSet merge functionality in Carbon

2019-11-27 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3597:

Attachment: Support DataSet Merge in CarboonData_V1.docx

> Support DataSet merge functionality in Carbon
> -
>
> Key: CARBONDATA-3597
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3597
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support DataSet Merge in CarboonData_V1.docx
>
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3597) Support DataSet merge functionality in Carbon

2019-11-27 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3597:
---

 Summary: Support DataSet merge functionality in Carbon
 Key: CARBONDATA-3597
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3597
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala
 Attachments: Support DataSet Merge in CarboonData_V1.docx





--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3543) Support move segment in carbon

2019-10-07 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3543:
---

 Summary: Support move segment in carbon
 Key: CARBONDATA-3543
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3543
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


Since we support adding external segments to the carbon transactional table, we 
also need a move segment feature to move external segments to the table path.

 

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (CARBONDATA-3517) Support add segment feature in carbon

2019-09-10 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3517:
---

 Summary: Support add segment feature in carbon
 Key: CARBONDATA-3517
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3517
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Ravindra Pesala


We have a scenario where carbondata files are generated externally through the 
SDK, and there is no proper way to add that data to an existing table. Also, 
already existing data in other formats cannot be added to carbondata.

To support both of the above requirements, we need an add segment feature with 
the following syntax.

ALTER TABLE test ADD SEGMENT OPTIONS ('path'='hdfs://usr/oldtable', 
'format'='parquet')

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3516) Support heterogeneous format segments in carbondata

2019-09-10 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3516:

Attachment: Support heterogeneous format segments in carbondata_V1.docx

>  Support heterogeneous format segments in carbondata
> 
>
> Key: CARBONDATA-3516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support heterogeneous format segments in 
> carbondata_V1.docx
>
>
> Existing customers use other formats like parquet, orc, etc., but there is no 
> proper solution at hand if they want to migrate to carbon. So this feature 
> allows all the old data to be added as segments to carbondata. During a 
> query, old data is read in its respective format, while all new segments are 
> read in carbon.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3516) Support heterogeneous format segments in carbondata

2019-09-10 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3516:

Attachment: (was: Support heterogeneous format segments in 
carbondata_V1.docx)

>  Support heterogeneous format segments in carbondata
> 
>
> Key: CARBONDATA-3516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support heterogeneous format segments in 
> carbondata_V1.docx
>
>
> Existing customers use other formats like parquet, orc, etc., but there is no 
> proper solution at hand if they want to migrate to carbon. So this feature 
> allows all the old data to be added as segments to carbondata. During a 
> query, old data is read in its respective format, while all new segments are 
> read in carbon.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3516) Support heterogeneous format segments in carbondata

2019-09-10 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3516:

Attachment: Support heterogeneous format segments in carbondata_V1.docx

>  Support heterogeneous format segments in carbondata
> 
>
> Key: CARBONDATA-3516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Support heterogeneous format segments in 
> carbondata_V1.docx
>
>
> Existing customers use other formats like parquet, orc, etc., but there is no 
> proper solution at hand if they want to migrate to carbon. So this feature 
> allows all the old data to be added as segments to carbondata. During a 
> query, old data is read in its respective format, while all new segments are 
> read in carbon.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (CARBONDATA-3516) Support heterogeneous format segments in carbondata

2019-09-10 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3516:
---

 Summary:  Support heterogeneous format segments in carbondata
 Key: CARBONDATA-3516
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3516
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala
 Attachments: Support heterogeneous format segments in 
carbondata_V1.docx

Existing customers use other formats like parquet, orc, etc., but there is no 
proper solution at hand if they want to migrate to carbon. So this feature 
allows all the old data to be added as segments to carbondata. During a query, 
old data is read in its respective format, while all new segments are read in 
carbon.
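The per-segment dispatch described above can be sketched as follows (a hypothetical illustration; the class and reader names are invented, not CarbonData APIs): each segment records the format it was written in, and the query layer picks a reader per segment instead of assuming every segment is carbondata.

```java
import java.util.Map;

public class SegmentReaderSketch {
    // Hypothetical: choose a reader based on the segment's recorded format.
    static String pickReader(String format) {
        switch (format) {
            case "parquet": return "ParquetReader";
            case "orc":     return "OrcReader";
            default:        return "CarbonReader"; // all new segments
        }
    }

    public static void main(String[] args) {
        Map<String, String> segments = Map.of(
            "segment_0", "parquet",    // legacy data added as a segment
            "segment_1", "carbondata"  // newly loaded segment
        );
        segments.forEach((id, fmt) ->
            System.out.println(id + " -> " + pickReader(fmt)));
    }
}
```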



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3500) Time travel or multi versioning on Carbondata.

2019-08-23 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3500:

Attachment: Time Travel on CarbonData_v1.pdf

> Time travel or multi versioning on Carbondata.
> --
>
> Key: CARBONDATA-3500
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3500
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
>Priority: Major
> Attachments: Time Travel on CarbonData_v1.pdf
>
>
> CarbonData allows storing data incrementally and performing Update/Delete 
> operations on the stored data, but the user can only access the latest state 
> of the data at any point in time.
> In the current system, it is not possible to access an old version of the 
> data, nor to roll back to an old version in case of issues with the current 
> version.
> This proposal adds automatic versioning of the data we store, so that any 
> historical version of that data can be accessed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Created] (CARBONDATA-3500) Time travel or multi versioning on Carbondata.

2019-08-23 Thread Ravindra Pesala (Jira)
Ravindra Pesala created CARBONDATA-3500:
---

 Summary: Time travel or multi versioning on Carbondata.
 Key: CARBONDATA-3500
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3500
 Project: CarbonData
  Issue Type: New Feature
Reporter: Ravindra Pesala


CarbonData allows storing data incrementally and performing Update/Delete 
operations on the stored data, but the user can only access the latest state of 
the data at any point in time.
In the current system, it is not possible to access an old version of the data, 
nor to roll back to an old version in case of issues with the current version.

This proposal adds automatic versioning of the data we store, so that any 
historical version of that data can be accessed.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3401) Fix the java sdk create wrong carbondata filename

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3401:

Fix Version/s: (was: 1.6.0)

> Fix the java sdk create wrong carbondata filename 
> --
>
> Key: CARBONDATA-3401
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3401
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load, other
>Affects Versions: 1.5.3
>Reporter: lamber-ken
>Priority: Major
> Attachments: fix_the_java_sdk_create_wrong_carbondata_filename.patch
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> When using the sdk to write carbondata files, +System.nanoTime()+ is 
> currently assigned to the start timestamp and +System.nanoTime()+ is assigned 
> to taskNo.
> There are two bugs here. First, +System.nanoTime()+ is different from 
> +System.currentTimeMillis()+: it can only be used to measure elapsed time and 
> is not related to any other notion of system or wall-clock time. Second, the 
> carbondata file name written by the sdk is different from the one written by 
> spark.
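The distinction can be checked directly with a small standalone program (not from the report): `System.currentTimeMillis()` is anchored to the Unix epoch and is safe to embed in a file name as a timestamp, while `System.nanoTime()` has an arbitrary origin and is only meaningful as a difference between two calls.

```java
public class ClockDemo {
    public static void main(String[] args) throws InterruptedException {
        // Wall-clock time: milliseconds since 1970-01-01.
        long wallClock = System.currentTimeMillis();

        // Monotonic time: the origin is arbitrary (often JVM/OS start),
        // so the absolute value is meaningless; only differences are valid.
        long t0 = System.nanoTime();
        Thread.sleep(10);
        long elapsedNanos = System.nanoTime() - t0;

        System.out.println("epoch millis: " + wallClock);
        System.out.println("elapsed nanos: " + elapsedNanos);
    }
}
```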



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3388) Optimize the way to acquire project or root path

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3388:

Fix Version/s: (was: 1.6.0)

> Optimize the way to acquire project or root path 
> -
>
> Key: CARBONDATA-3388
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3388
> Project: CarbonData
>  Issue Type: Improvement
>  Components: core, examples, spark-integration
>Affects Versions: 1.5.3
>Reporter: lamber-ken
>Priority: Major
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> For now, the root path of the project is acquired by
> {code:java}
> val rootPath = new File(this.getClass.getResource("/").getPath  + 
> "../../../..").getCanonicalPath
> {code}
> we can use a simpler approach:
> {code:java}
> System.getProperty("user.dir")
> {code}
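As an aside on the two approaches (a runnable sketch with an invented class name, not project code): `user.dir` is the JVM's working directory, which matches the project root only when the process is launched from there, whereas the `getResource`-based form resolves relative to where the compiled classes live (e.g. `target/classes`).

```java
import java.net.URL;

public class RootPathDemo {
    public static void main(String[] args) {
        // JVM working directory: equals the project root only when the
        // process is launched from the project root.
        String userDir = System.getProperty("user.dir");

        // Classpath-based form from the original code: resolves relative
        // to the compiled classes; may be null when run from a jar.
        URL root = RootPathDemo.class.getResource("/");

        System.out.println("user.dir:  " + userDir);
        System.out.println("classpath: "
            + (root == null ? "(not resolvable)" : root.getPath()));
    }
}
```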



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3383) fix thrift version conflict bug in examples module

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3383:

Fix Version/s: (was: 1.6.0)

> fix thrift version conflict bug in examples module
> --
>
> Key: CARBONDATA-3383
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3383
> Project: CarbonData
>  Issue Type: Bug
>  Components: examples
>Affects Versions: 1.5.3
>Reporter: lamber-ken
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> There is a thrift version conflict in the examples module: it needs both 
> +carbondata-hive+ and +carbondata-store-sdk+, one of which uses thrift 0.9.3 
> and the other 0.9.2. This causes an NPE when +carbondata-sdk+ is used to 
> write demo data.
>  
> Detailed stack trace:
> {code:java}
> java.lang.NullPointerException
>     at 
> org.apache.thrift.protocol.TCompactProtocol.writeString(TCompactProtocol.java:356)
>     at 
> org.apache.carbondata.format.FileFooter3$FileFooter3StandardScheme.write(FileFooter3.java:1053)
>     at 
> org.apache.carbondata.format.FileFooter3$FileFooter3StandardScheme.write(FileFooter3.java:877)
>     at org.apache.carbondata.format.FileFooter3.write(FileFooter3.java:768)
>     at 
> org.apache.carbondata.core.util.CarbonUtil.getByteArray(CarbonUtil.java:1444)
>     at 
> org.apache.carbondata.processing.store.writer.v3.CarbonFactDataWriterImplV3.writeFooterToFile(CarbonFactDataWriterImplV3.java:122)
>     at 
> org.apache.carbondata.processing.store.writer.v3.CarbonFactDataWriterImplV3.writeFooter(CarbonFactDataWriterImplV3.java:415)
>     at 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.closeHandler(CarbonFactDataHandlerColumnar.java:502)
>     at 
> org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.processingComplete(CarbonRowDataWriterProcessorStepImpl.java:234)
>     at 
> org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.finish(CarbonRowDataWriterProcessorStepImpl.java:212)
>     at 
> org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.doExecute(CarbonRowDataWriterProcessorStepImpl.java:179)
>     at 
> org.apache.carbondata.processing.loading.steps.CarbonRowDataWriterProcessorStepImpl.execute(CarbonRowDataWriterProcessorStepImpl.java:133)
>     at 
> org.apache.carbondata.processing.loading.DataLoadExecutor.execute(DataLoadExecutor.java:52)
>     at 
> org.apache.carbondata.hadoop.api.CarbonTableOutputFormat$1.run(CarbonTableOutputFormat.java:274)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (CARBONDATA-3380) Fix missing appName and AnalysisException bug in DirectSQLExample

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3380?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3380.
-
Resolution: Fixed

> Fix missing appName and AnalysisException bug in DirectSQLExample 
> --
>
> Key: CARBONDATA-3380
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3380
> Project: CarbonData
>  Issue Type: Bug
>  Components: build, spark-integration
>Affects Versions: 1.5.3
>Reporter: lamber-ken
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 10h 10m
>  Remaining Estimate: 0h
>
> Fix two bugs in +DirectSQLExample+:
>  # fix the missing +appName+ when writing the carbondata files.
>  # fix the +AnalysisException+ bug caused by +carbonfile+



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3484) Unable to read data from carbontable, after successful writing in same or new carbonsession

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3484?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3484:

Fix Version/s: (was: 1.6.0)

> Unable to read data from carbontable, after successful writing in same or 
> new carbonsession
> 
>
> Key: CARBONDATA-3484
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3484
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.6.0
> Environment: ubuntu
>Reporter: anshul
>Priority: Critical
>  Labels: newbie
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I have read a csv from my local system and written it as a carbontable to an 
> s3 location using s3n:
>   public static void main(String args[]) {
>     String colNames = "";
>     SparkSession spark = null;
>     SparkSession carbon = null;
>     String storeLocation = "s3n://accesskey:secretkey@bucketnames3";
>     SparkConf config = new SparkConf();
>     config.setMaster("local[2]");
>     config.set("javax.jdo.option.ConnectionDriverName", 
> "org.postgresql.Driver");
>     config.set("javax.jdo.option.ConnectionPassword", "postgres");
>     config.set("javax.jdo.option.ConnectionUserName", "postgres");
>     config.set("hive.exec.dynamic.partition.mode", "nonstrict");
>     config.set("hive.exec.dynamic.partition", "true");
>     config.set("hive.exec.max.dynamic.partitions", "2556");
>     config.set("hive.exec.max.dynamic.partitions.pernode", "2556");
>     config.set("carbon.number.of.cores.while.loading", "1");
>     config.set("carbon.sort.temp.compressor", "SNAPPY");
>     config.set("carbon.sort.size", "5000");
>     config.set("carbon.sort.file.write.buffer.size", "500");
>     config.set("carbon.merge.sort.prefetch", "false");
>     config.set("carbon.sort.intermediate.files.limit", "10");
>     config.set("enable.unsafe.sort", "true");
>     config.set("spark.kryo.unsafe", "true");
>     config.set("hive.metastore.uris", "thrift://localhost:9083");
>     spark = 
> SparkSession.builder().appName("CarbonDataReader").config(config).enableHiveSupport().getOrCreate();
>     carbon = 
> CarbonSession.CarbonBuilder(spark.builder()).getOrCreateCarbonSession(storeLocation,
>     "jdbc:postgresql://localhost:5432/carbonmetastore");
>     carbon.sparkContext().hadoopConfiguration().set("fs.s3n.impl", 
> "org.apache.hadoop.fs.s3native.NativeS3FileSystem");
>     
> carbon.sparkContext().hadoopConfiguration().set("fs.s3n.awsAccessKeyId", 
> "");
>     
> carbon.sparkContext().hadoopConfiguration().set("fs.s3n.awsSecretAccessKey", 
> "xxx");
>     
>     Dataset ds = carbon.read().format("carbondata").option("header", 
> "true").option("inferSchema", "true")
>     .csv("/home/anshul.jain/Downloads/datasets/EMP_Age.csv");
>     ds.registerTempTable("temp_emp_age_test");
>     carbon.sql("describe formatted emp_age_test").show(100, false);
>     DataFrameWriter dfw = 
> ds.write().format(MaterializedViewConstants.CARBONFORMAT).option(MetadataConstants.TABLE_NAME,
>  "emp_age_test")
>     .option("bad_records_logger_enable", false);
>     dfw.mode(SaveMode.Overwrite).save();
>  
> carbon.sql("select * from emp_age_test").show();
>  



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Updated] (CARBONDATA-3370) fix missing version of maven-duplicate-finder-plugin

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3370:

Fix Version/s: (was: 1.6.0)
   1.6.1

> fix missing version of maven-duplicate-finder-plugin
> 
>
> Key: CARBONDATA-3370
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3370
> Project: CarbonData
>  Issue Type: Improvement
>  Components: build
>Affects Versions: 1.5.3
>Reporter: lamber-ken
>Priority: Critical
> Fix For: 1.6.1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> fix missing version of maven-duplicate-finder-plugin in pom file



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (CARBONDATA-3494) Nullpointer exception in case of drop table

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3494.
-
Fix Version/s: 1.6.1
   Resolution: Fixed

> Nullpointer exception in case of drop table
> ---
>
> Key: CARBONDATA-3494
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3494
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 1.6.1
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> Drop table fails with a NullPointerException in the below scenario with a 
> datamap.
> This issue will happen once PR #3339 is merged, because with that PR the 
> refresh table happens correctly and then this fails.
> CREATE TABLE datamap_test_1 (id int,name string,salary float,dob date)STORED 
> BY 'carbondata' TBLPROPERTIES('SORT_COLUMNS'='id');
> CREATE DATAMAP dm_datamap_test_1_2 ON TABLE datamap_test_1 USING 
> 'bloomfilter' DMPROPERTIES ('INDEX_COLUMNS' = 'salary,name', 
> 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true');
> CREATE DATAMAP dm_datamap_test3 ON TABLE datamap_test_1 USING 'bloomfilter' 
> DMPROPERTIES ('INDEX_COLUMNS' = 'dob', 'BLOOM_SIZE'='64', 
> 'BLOOM_FPP'='0.1', 'BLOOM_COMPRESS'='true');
> drop table if exists datamap_test_1;



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (CARBONDATA-3480) Remove Modified MDT and make relation refresh only when schema file is modified.

2019-08-20 Thread Ravindra Pesala (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3480.
-
Fix Version/s: 1.6.1
   Resolution: Fixed

> Remove Modified MDT and make relation refresh only when schema file is 
> modified.
> 
>
> Key: CARBONDATA-3480
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3480
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.1
>
>  Time Spent: 16h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.2#803003)


[jira] [Resolved] (CARBONDATA-3490) Concurrent data load failure with carbondata FileNotFound exception

2019-08-12 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3490.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Concurrent data load failure with carbondata FileNotFound exception
> ---
>
> Key: CARBONDATA-3490
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3490
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 4h 50m
>  Remaining Estimate: 0h
>
> Caused by: 
> org.apache.carbondata.core.datastore.exception.CarbonDataWriterException: 
> Problem while copying file from local store to carbon store
>   at 
> org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2750)
>   at 
> org.apache.carbondata.processing.store.writer.AbstractFactDataWriter.commitCurrentFile(AbstractFactDataWriter.java:283)
>   at 
> org.apache.carbondata.processing.store.writer.v3.CarbonFactDataWriterImplV3.closeWriter(CarbonFactDataWriterImplV3.java:393)
>   ... 11 more
> Caused by: java.io.FileNotFoundException: 
> /tmp/carbon865982118689228_1/Fact/Part0/Segment_6/1/part-0-1_batchno0-0-6-1565329654844.carbondata
>  (No such file or directory)
>   at java.io.FileInputStream.open0(Native Method)
>   at java.io.FileInputStream.open(FileInputStream.java:195)
>   at java.io.FileInputStream.<init>(FileInputStream.java:138)
>   at java.io.FileInputStream.<init>(FileInputStream.java:93)
>   at 
> org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:309)
>   at 
> org.apache.carbondata.core.datastore.filesystem.LocalCarbonFile.getDataInputStream(LocalCarbonFile.java:299)
>   at 
> org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:179)
>   at 
> org.apache.carbondata.core.datastore.impl.FileFactory.getDataInputStream(FileFactory.java:175)
>   at 
> org.apache.carbondata.core.util.CarbonUtil.copyLocalFileToCarbonStore(CarbonUtil.java:2781)
>   at 
> org.apache.carbondata.core.util.CarbonUtil.copyCarbonDataFileToCarbonStorePath(CarbonUtil.java:2746)
>   ... 13 more
> problem: When two loads happen concurrently, one load cleans up the temp 
> directory of the concurrent load.
> cause: The temp directory used to store the carbon files is created using 
> System.nanoTime(), so the two loads can get the same store location. When one 
> load completes, it cleans the temp directory, causing a data load failure for 
> the other load.
> solution: Use a UUID instead of nano time when creating the temp directory, 
> so that each load has a unique directory.
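The fix rests on `UUID.randomUUID()` being collision-free across concurrent loads, unlike `System.nanoTime()`, which two loads started close together can observe as the same value. A minimal sketch of the naming change (class and method names invented for illustration):

```java
import java.util.UUID;

public class TempDirNaming {
    // Old scheme: two concurrent loads can draw the same nanoTime value
    // and end up sharing (and later deleting) one temp directory.
    static String nanoTimeBased() {
        return "/tmp/carbon" + System.nanoTime() + "_1";
    }

    // Fixed scheme: a random UUID gives each load its own directory.
    static String uuidBased() {
        return "/tmp/carbon_" + UUID.randomUUID() + "_1";
    }

    public static void main(String[] args) {
        System.out.println(nanoTimeBased());
        System.out.println(uuidBased());
    }
}
```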



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (CARBONDATA-3482) Null pointer exception when concurrent select queries are executed from different beeline terminals.

2019-08-09 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3482.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Null pointer exception when concurrent select queries are executed from 
> different beeline terminals.
> 
>
> Key: CARBONDATA-3482
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3482
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> 1. Beeline1 => create tables (1K)
> 2. Beeline2 => insert into table t2 (only 1 record) till 7K
> 3. Concurrent queries:
> q1 : select count(*) from t1
> q2 : select * from t1 limit 1
> q3 : select count(*) from t2
> q4 : select * from t2 limit 1
>  
> Exception:
> java.lang.NullPointerException
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.getFileFooterEntrySchema(BlockDataMap.java:1061)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:727)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockDataMap.prune(BlockDataMap.java:821)
>  at 
> org.apache.carbondata.core.indexstore.blockletindex.BlockletDataMapFactory.getAllBlocklets(BlockletDataMapFactory.java:446)
>  at 
> org.apache.carbondata.core.datamap.TableDataMap.pruneWithoutFilter(TableDataMap.java:156)
>  at 
> org.apache.carbondata.core.datamap.TableDataMap.prune(TableDataMap.java:143)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getPrunedBlocklets(CarbonInputFormat.java:563)
>  at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getDataBlocksOfSegment(CarbonInputFormat.java:471)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:471)
>  at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:199)
>  at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:141)
>  at 
> org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:256)
>  at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:254)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (CARBONDATA-3460) EOF exception is thrown when querying using index server

2019-07-15 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3460?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3460.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> EOF exception is thrown when querying using index server
> ---
>
> Key: CARBONDATA-3460
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3460
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> java.lang.UnsupportedOperationException: Unsupported columnar format version: 
> 29556
>  at 
> org.apache.carbondata.core.metadata.ColumnarFormatVersion.valueOf(ColumnarFormatVersion.java:52)
>  at 
> org.apache.carbondata.hadoop.CarbonInputSplit.readFields(CarbonInputSplit.java:357)
>  at 
> org.apache.carbondata.hadoop.CarbonMultiBlockSplit.readFields(CarbonMultiBlockSplit.java:154)
>  at org.apache.hadoop.io.ObjectWritable.readObject(ObjectWritable.java:285)
>  at org.apache.hadoop.io.ObjectWritable.readFields(ObjectWritable.java:77)
>  at 
> org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply$mcV$sp(SerializableWritable.scala:45)
>  at 
> org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply(SerializableWritable.scala:41)
>  at 
> org.apache.spark.SerializableWritable$$anonfun$readObject$1.apply(SerializableWritable.scala:41)
>  at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1390)
>  at 
> org.apache.spark.SerializableWritable.readObject(SerializableWritable.scala:41)
>  at sun.reflect.GeneratedMethodAccessor65.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:497)
>  at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1896)
>  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>  at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1993)
>  at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1918)
>  at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1801)
>  at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1351)
>  at java.io.ObjectInputStream.readObject(ObjectInputStream.java:371)
>  at 
> org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:75)
>  at 
> org.apache.spark.serializer.JavaSerializerInstance.deserialize(JavaSerializer.scala:114)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:313)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)
> 2019-06-27 12:44:07 ERROR Executor:91 - Exception in task 1472.0 in stage 
> 10.0 (TID 3317)
> java.lang.IllegalStateException: unread block data
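The bogus version number in the stack trace above is the classic symptom of a misaligned stream: two ASCII bytes ("st", 0x73 0x74) read as a short give exactly 29556. The sketch below is illustrative, not CarbonData's actual code; it shows a Writable-style `readFields` that validates the format version before trusting the rest of the stream, so corruption fails fast instead of producing "unread block data" later.

```java
import java.io.*;

// Hypothetical sketch of a versioned, Writable-style payload. readFields
// validates the version field up front; a misaligned or corrupted stream
// yields a bogus version (e.g. 29556) and fails with a clear message.
public class VersionedSplit {
    static final short[] SUPPORTED = {1, 2, 3};

    short version;
    String path;

    void write(DataOutput out) throws IOException {
        out.writeShort(version);
        out.writeUTF(path);
    }

    void readFields(DataInput in) throws IOException {
        short v = in.readShort();
        boolean ok = false;
        for (short s : SUPPORTED) if (s == v) ok = true;
        if (!ok) {
            // Fail fast instead of deserializing garbage.
            throw new UnsupportedOperationException(
                "Unsupported columnar format version: " + v);
        }
        version = v;
        path = in.readUTF();
    }

    public static void main(String[] args) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        VersionedSplit s = new VersionedSplit();
        s.version = 3;
        s.path = "/store/part-0";
        s.write(new DataOutputStream(buf));

        VersionedSplit r = new VersionedSplit();
        r.readFields(new DataInputStream(
            new ByteArrayInputStream(buf.toByteArray())));
        System.out.println(r.version + " " + r.path);  // 3 /store/part-0

        // A stream starting at the wrong offset produces a bogus version:
        byte[] bad = {0x73, 0x74, 0, 0};  // "st" reads as 0x7374 = 29556
        try {
            new VersionedSplit().readFields(
                new DataInputStream(new ByteArrayInputStream(bad)));
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```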



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[jira] [Resolved] (CARBONDATA-3459) Fixed id based distribution for show cache command

2019-07-15 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3459.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Fixed id based distribution for show cache command
> --
>
> Key: CARBONDATA-3459
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3459
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> Currently, tasks are not being fired based on the executor ID because
> getPreferredLocations was not overridden.
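The point of overriding `getPreferredLocations` is executor affinity: the same segment should land on the same executor so its cached index is reused. The following is a hypothetical sketch of the routing idea only (names and logic are illustrative, not CarbonData's implementation): hash the segment id over the live executor list to get a stable assignment.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch: pin each segment's prune task to a fixed executor by
// hashing the segment id over the executor list. Repeated calls route the
// same segment to the same executor, so its cache stays warm across queries.
public class ExecutorAffinity {
    static String preferredExecutor(String segmentId, List<String> executors) {
        // floorMod keeps the index non-negative for any hashCode value
        int idx = Math.floorMod(segmentId.hashCode(), executors.size());
        return executors.get(idx);
    }

    public static void main(String[] args) {
        List<String> executors = Arrays.asList("exec-1", "exec-2", "exec-3");
        String first = preferredExecutor("segment_0", executors);
        String again = preferredExecutor("segment_0", executors);
        System.out.println(first.equals(again));  // true: assignment is stable
    }
}
```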





[jira] [Resolved] (CARBONDATA-3467) Fix count(*) with filter on string value

2019-07-13 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3467.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Fix count(*) with filter on string value
> 
>
> Key: CARBONDATA-3467
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3467
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3457) [MV]Fix Column not found with Cast Expression

2019-07-12 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3457.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> [MV]Fix Column not found with Cast Expression
> -
>
> Key: CARBONDATA-3457
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3457
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3456) Fix DataLoading on MV when Yarn-Application is killed

2019-07-12 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3456.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Fix DataLoading on MV when Yarn-Application is killed
> -
>
> Key: CARBONDATA-3456
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3456
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3440) Expose a DDL to add index size and data size to tableStatus for legacy segments

2019-07-02 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3440.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Expose a DDL to add index size and data size to tableStatus for legacy 
> segments
> ---
>
> Key: CARBONDATA-3440
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3440
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> For legacy segments the index size is not written to tablestatus, due to which
> the distribution will go for a toss.
> To counter this, a DDL will be exposed which writes the index size to the
> tablestatus for all the segments of a specified table.





[jira] [Resolved] (CARBONDATA-3398) Implement Show Cache for IndexServer and MV

2019-06-22 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3398.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Implement Show Cache for IndexServer and MV
> ---
>
> Key: CARBONDATA-3398
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3398
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 23h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3409) Fix Concurrent dataloading Issue with mv

2019-06-05 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3409.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Fix Concurrent dataloading Issue with mv
> 
>
> Key: CARBONDATA-3409
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3409
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3407) distinct, count, sum queries fail when MV is created on a single projection column

2019-06-05 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3407.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> distinct, count, sum queries fail when MV is created on a single projection
> column
> ---
>
> Key: CARBONDATA-3407
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3407
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> distinct, count, sum queries fail when MV is created on a single projection
> column
>  sql("drop table if exists maintable")
> sql("create table maintable(name string, age int, add string) stored by 
> 'carbondata'")
> sql("create datamap single_mv using 'mv' as select age from maintable")
> sql("insert into maintable select 'pheobe',31,'NY'")
> sql("insert into maintable select 'rachel',32,'NY'")
>  sql("select distinct(age) from maintable")
>  sql("select sum(age) from maintable")
> sql("select count(age) from maintable")
> Fails with below Exception:
> {quote}requirement failed: Fragment is not supported.  Current frag:
> org.apache.carbondata.mv.plans.util.SQLBuildDSL$$anon$1@1f7f2e76
> java.lang.IllegalArgumentException: requirement failed: Fragment is not 
> supported.  Current frag:
> org.apache.carbondata.mv.plans.util.SQLBuildDSL$$anon$1@1f7f2e76
>   at scala.Predef$.require(Predef.scala:224)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$SQLFragmentCompactPrinter.printFragment(Printers.scala:248)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$FragmentPrinter$$anonfun$print$1.apply(Printers.scala:82)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$FragmentPrinter$$anonfun$print$1.apply(Printers.scala:80)
>   at 
> scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
>   at scala.collection.mutable.WrappedArray.foreach(WrappedArray.scala:35)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$FragmentPrinter.print(Printers.scala:80)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$class.render(Printers.scala:318)
>   at 
> org.apache.carbondata.mv.plans.modular.ModularPlan.render(ModularPlan.scala:35)
>   at 
> org.apache.carbondata.mv.plans.util.Printers$class.asCompactString(Printers.scala:323)
>   at 
> org.apache.carbondata.mv.plans.modular.ModularPlan.asCompactString(ModularPlan.scala:35)
>   at 
> org.apache.carbondata.mv.plans.modular.ModularPlan.asCompactSQL(ModularPlan.scala:156)
>   at 
> org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:83)
>   at 
> org.apache.carbondata.mv.datamap.MVAnalyzerRule.apply(MVAnalyzerRule.scala:43)
>   at 
> org.apache.spark.sql.hive.CarbonAnalyzer.execute(CarbonAnalyzer.scala:46){quote}





[jira] [Resolved] (CARBONDATA-3404) Support CarbonFile API for configuring custom file systems

2019-06-05 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3404.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Support CarbonFile API for configuring custom file systems
> -
>
> Key: CARBONDATA-3404
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3404
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Kanaka Kumar Avvaru
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> Currently CarbonData supports only a few file systems, such as the HDFS, S3
> and VIEWFS schemes.
> If the user configures a table path from a file system apart from the
> supported ones, FileFactory takes CarbonLocalFile as the default and causes
> errors.
> This Jira proposes an API for the user to extend CarbonFile and give the
> correct instance of CarbonFile from the existing implementations, or a new
> implementation of the CarbonFile interface.
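The mechanism described above can be sketched as a scheme-keyed registry with a local-file fallback. All names here are illustrative, not CarbonData's real API; the point is that an unrecognized scheme dispatches to a user-registered handler instead of silently defaulting to the local implementation.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Illustrative sketch: users register a factory per URI scheme; paths with
// an unregistered (or missing) scheme fall back to a local-file handler.
public class FileFactorySketch {
    interface SimpleFile { String describe(); }

    private final Map<String, Function<String, SimpleFile>> registry = new HashMap<>();
    private final Function<String, SimpleFile> fallback =
        p -> () -> "local:" + p;  // default when no scheme matches

    void register(String scheme, Function<String, SimpleFile> factory) {
        registry.put(scheme, factory);
    }

    SimpleFile open(String path) {
        String scheme = URI.create(path).getScheme();  // null for bare paths
        return registry.getOrDefault(scheme, fallback).apply(path);
    }

    public static void main(String[] args) {
        FileFactorySketch f = new FileFactorySketch();
        // User-supplied handler for a custom scheme, e.g. alluxio.
        f.register("alluxio", p -> () -> "alluxio:" + p);
        System.out.println(f.open("alluxio://host:19998/warehouse/t1").describe());
        System.out.println(f.open("/tmp/t1").describe());  // falls back to local
    }
}
```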





[jira] [Resolved] (CARBONDATA-3350) enhance custom compaction to support resort single segment

2019-06-05 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3350?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3350.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> enhance custom compaction to support resort single segment
> --
>
> Key: CARBONDATA-3350
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3350
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: QiangCai
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3403) MV is not working for LIKE filters and for AND/OR filter queries

2019-05-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3403.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV is not working for LIKE filters and for AND/OR filter queries
> 
>
> Key: CARBONDATA-3403
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3403
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> MV is not working for LIKE filters and for AND/OR filter queries
>  
> Steps:
> create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' ;
>  
> create datamap brinjal_mv_tab_nlz_aa016 on table brinjal using 'mv' as select 
> imei,AMSize,channelsId from brinjal where ActiveCountry NOT LIKE 'US' group 
> by imei,AMSize,channelsId;
> create datamap brinjal_mv_tab_nlz_aa018 on table brinjal using 'mv' as select 
> imei,AMSize,channelsId,ActiveCountry from brinjal where ActiveCountry 
> ='Chinese' or channelsId =4 group by imei,AMSize,channelsId,ActiveCountry;
>  
> then 
> select imei,AMSize,channelsId from brinjal where ActiveCountry NOT LIKE 'US' 
> group by imei,AMSize,channelsId; and 
>   select imei,AMSize,channelsId,ActiveCountry from brinjal where 
> ActiveCountry ='Chinese' or channelsId =4 group by 
> imei,AMSize,channelsId,ActiveCountry;
> are not hitting the datamap created





[jira] [Resolved] (CARBONDATA-3399) Implement Executor ID based task distribution for Index Server

2019-05-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3399.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Implement Executor ID based task distribution for Index Server
> --
>
> Key: CARBONDATA-3399
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3399
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3402) Block complex data types and validate dmproperties in mv

2019-05-29 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3402.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Block complex data types and validate dmproperties in mv
> 
>
> Key: CARBONDATA-3402
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3402
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3393) Merge Index Job Failure should not trigger the merge index job again. Exception propagation should be decided by the User.

2019-05-29 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3393.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Merge Index Job Failure should not trigger the merge index job again. 
> Exception propagation should be decided by the User.
> --
>
> Key: CARBONDATA-3393
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3393
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> If the merge index job fails, the LOAD also fails. The load should not
> consider the merge index job status when deciding the LOAD status.





[jira] [Resolved] (CARBONDATA-3396) Range Compaction Data mismatch

2019-05-29 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3396.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Range Compaction Data mismatch
> --
>
> Key: CARBONDATA-3396
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3396
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3397) Remove SparkUnknown Expression to Index Server

2019-05-29 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3397.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Remove SparkUnknown Expression to Index Server
> --
>
> Key: CARBONDATA-3397
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3397
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> If a query has a UDF, it is registered in the main driver and the UDF function
> will not be available in the index server. So it is better to remove the
> SparkUnknown expression, because for pruning we anyway select all blocks in
> org.apache.carbondata.core.scan.filter.executer.RowLevelFilterExecuterImpl#isScanRequired.
>  





[jira] [Resolved] (CARBONDATA-3400) Support IndexServer for Spark-Shell in secure KERBEROS mode

2019-05-29 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3400?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3400.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Support IndexServer for Spark-Shell in secure KERBEROS mode
> --
>
> Key: CARBONDATA-3400
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3400
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Support IndexServer for Spark-Shell in secure KERBEROS mode





[jira] [Resolved] (CARBONDATA-3364) Support Read from Hive. Queries are giving empty results from hive.

2019-05-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3364?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3364.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Support Read from Hive. Queries are giving empty results from hive.
> ---
>
> Key: CARBONDATA-3364
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3364
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 12h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3395) When the same split object is passed to concurrent readers, build() fails randomly with an exception.

2019-05-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3395?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3395.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> When the same split object is passed to concurrent readers, build() fails
> randomly with an exception.
> --
>
> Key: CARBONDATA-3395
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3395
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> When the same split object is passed to concurrent readers, build() fails
> randomly with an exception.
>  
> 2019-05-24 13:51:55 ERROR CarbonVectorizedRecordReader:116 - 
> java.lang.ArrayIndexOutOfBoundsException: 4





[jira] [Resolved] (CARBONDATA-3387) Support Partition with MV datamap & Show DataMap Status

2019-05-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3387.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

>  Support Partition with MV datamap & Show DataMap Status
> 
>
> Key: CARBONDATA-3387
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3387
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3392) Make use of LRU mandatory when using IndexServer

2019-05-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3392.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Make use of LRU mandatory when using IndexServer
> 
>
> Key: CARBONDATA-3392
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3392
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 1.5h
>  Remaining Estimate: 0h
>
> *Background*
> Currently LRU is optional for the user to configure, but this raises some
> concerns in the case of the index server, because invalid segments have to be
> constantly removed from the cache in update/delete/compaction scenarios.
> Therefore, if the clear-segment job fails the main job would not fail, but
> there has to be a mechanism to prevent that segment from staying in the cache
> forever.
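The safety net described above is a size-bounded LRU: even if an eviction job misses a segment, a bounded cache eventually pushes stale entries out. A minimal sketch (not the index server's actual cache) using `LinkedHashMap` in access order:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch: a size-bounded LRU cache via LinkedHashMap's access-order mode.
// Without a bound, entries for invalid segments could stay in memory forever.
public class BoundedLru<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    public BoundedLru(int maxEntries) {
        super(16, 0.75f, true);  // true = iterate in access order
        this.maxEntries = maxEntries;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Evict the least-recently-used entry once the bound is exceeded.
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedLru<String, String> cache = new BoundedLru<>(2);
        cache.put("seg_0", "index-a");
        cache.put("seg_1", "index-b");
        cache.get("seg_0");           // touch seg_0, so seg_1 is now eldest
        cache.put("seg_2", "index-c"); // evicts seg_1
        System.out.println(cache.keySet());  // [seg_0, seg_2]
    }
}
```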





[jira] [Resolved] (CARBONDATA-3357) Support TableProperties from single parent table and restrict alter/delete/partition on mv

2019-05-27 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3357?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3357.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Support TableProperties from single parent table and restrict 
> alter/delete/partition on mv
> --
>
> Key: CARBONDATA-3357
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3357
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 21h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3384) Delete/Update is throwing NullPointerException when index server is enabled.

2019-05-27 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3384.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Delete/Update is throwing NullPointerException when index server is enabled.
> 
>
> Key: CARBONDATA-3384
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3384
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3303) MV datamap returns wrong results when using coalesce and fewer group-by columns

2019-05-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3303.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV datamap returns wrong results when using coalesce and fewer group-by columns
> 
>
> Key: CARBONDATA-3303
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3303
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Chenjian Qiu
>Priority: Blocker
> Fix For: 1.6.0
>
>  Time Spent: 5.5h
>  Remaining Estimate: 0h
>
> *SQL:*
> create table coalesce_test_main(id int,name string,height int,weight int)
> using carbondata
> insert into coalesce_test_main select 1,'tom',170,130
> insert into coalesce_test_main select 2,'tom',170,120
> insert into coalesce_test_main select 3,'lily',160,100
> create datamap coalesce_test_main_mv using 'mv' as select coalesce(sum(id),0) 
> as sum_id,name as myname,weight from coalesce_test_main group by name,weight
> select coalesce(sum(id),0) as sumid,name from coalesce_test_main group by name
> *Result:*
> 1 tom
> 2 tom
> 3 lily





[jira] [Resolved] (CARBONDATA-3338) Incremental data load support for datamap on a single table

2019-05-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3338.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Incremental data load support for datamap on a single table
> ---
>
> Key: CARBONDATA-3338
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3338
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Akash R Nilugal
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 30h 10m
>  Remaining Estimate: 0h
>
> Incremental data load support for datamap on a single table:
> projection cases, aggregates, and group-by cases





[jira] [Resolved] (CARBONDATA-3309) MV datamap adapt to spark 2.1 version

2019-05-20 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3309.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV datamap adapt to spark 2.1 version
> -
>
> Key: CARBONDATA-3309
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3309
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Chenjian Qiu
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 10.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3337) Implement a Hadoop RPC framework for communication

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3337.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Implement a Hadoop RPC framework for communication
> -
>
> Key: CARBONDATA-3337
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3337
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.6.0
>
>  Time Spent: 53h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3306) Implement a DistributableIndexPruneRDD and IndexPruneFileFormat

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3306?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3306.
-
Resolution: Fixed

> Implement a DistributableIndexPruneRDD and IndexPruneFileFormat
> ---
>
> Key: CARBONDATA-3306
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3306
> Project: CarbonData
>  Issue Type: Sub-task
>Affects Versions: 1.5.2
>Reporter: TaoLi
>Priority: Major
> Fix For: 1.6.0
>
>   Original Estimate: 336h
>  Time Spent: 4h 10m
>  Remaining Estimate: 331h 50m
>
> ○ The RDD will accept a list of segments and the filter expression for pruning.
> ○ The pruning job is handled by the executors in a distributed way for the 
> allocated segments.
> ○ The splits are equally divided based on the size of the index files.
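The last point above is a balancing problem: index files of varying sizes must be divided into roughly equal-sized groups, one per executor. A simple sketch of that idea (names are hypothetical, not CarbonData's code) is greedy largest-first assignment to the currently lightest bucket:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative sketch: divide index-file sizes across n buckets so the total
// size per bucket is roughly equal. Greedy: sort descending, always assign
// the next file to the bucket with the smallest running total.
public class SizeBalancedSplitter {
    static List<List<Long>> divide(List<Long> sizes, int n) {
        List<List<Long>> buckets = new ArrayList<>();
        long[] totals = new long[n];
        for (int i = 0; i < n; i++) buckets.add(new ArrayList<>());

        List<Long> sorted = new ArrayList<>(sizes);
        sorted.sort(Collections.reverseOrder());  // largest first
        for (long s : sorted) {
            int lightest = 0;
            for (int i = 1; i < n; i++) {
                if (totals[i] < totals[lightest]) lightest = i;
            }
            buckets.get(lightest).add(s);
            totals[lightest] += s;
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Five index files split across two executors; totals come out equal.
        System.out.println(divide(Arrays.asList(80L, 50L, 40L, 30L, 20L), 2));
    }
}
```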





[jira] [Resolved] (CARBONDATA-3295) MV datamap throws an exception in its rewrite algorithm when there are multiple subqueries

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3295.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV datamap throws an exception in its rewrite algorithm when there are
> multiple subqueries
> ---
>
> Key: CARBONDATA-3295
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3295
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Chenjian Qiu
>Priority: Blocker
> Fix For: 1.6.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> error:
> java.lang.UnsupportedOperationException was thrown.
> java.lang.UnsupportedOperationException
>  at 
> org.apache.carbondata.mv.plans.util.SQLBuildDSL$Fragment.productArity(SQLBuildDSL.scala:36)
>  at scala.runtime.ScalaRunTime$$anon$1.<init>(ScalaRunTime.scala:174)
>  at scala.runtime.ScalaRunTime$.typedProductIterator(ScalaRunTime.scala:172)
>  
> mv sql:
> sql(s"""create datamap data_table_mv using 'mv' as
>  | SELECT STARTTIME,LAYER4ID,
>  | COALESCE (SUM(seq),0) AS seq_c,
>  | COALESCE (SUM(succ),0) AS succ_c
>  | FROM data_table
>  | GROUP BY STARTTIME,LAYER4ID""".stripMargin)
>  
> Query sql:
> sql(s"""SELECT MT.`3600` AS `3600`,
>  | MT.`2250410101` AS `2250410101`,
>  | (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
>  | ELSE
>  | (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) = 0 THEN 0
>  | ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
>  | / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
>  | END) * 100
>  | END) AS rate
>  | FROM (
>  | SELECT sum_result.*, H_REGION.`2250410101` FROM
>  | (SELECT cast(floor((starttime + 28800) / 3600) * 3600 - 28800 as int) AS 
> `3600`,
>  | LAYER4ID,
>  | COALESCE(SUM(seq), 0) AS seq_c,
>  | COALESCE(SUM(succ), 0) AS succ_c
>  | FROM data_table
>  | WHERE STARTTIME >= 1549866600 AND STARTTIME < 1549899900
>  | GROUP BY cast(floor((STARTTIME + 28800) / 3600) * 3600 - 28800 as 
> int),LAYER4ID
>  | )sum_result
>  | LEFT JOIN
>  | (SELECT l4id AS `225040101`,
>  | l4name AS `2250410101`,
>  | l4name AS NAME_2250410101
>  | FROM region
>  | GROUP BY l4id, l4name) H_REGION
>  | ON sum_result.LAYER4ID = H_REGION.`225040101`
>  | WHERE H_REGION.NAME_2250410101 IS NOT NULL
>  | ) MT
>  | GROUP BY MT.`3600`, MT.`2250410101`
>  | ORDER BY `3600` ASC LIMIT 5000""".stripMargin)





[jira] [Resolved] (CARBONDATA-3294) MV datamap throws an error when using count(1) and a case-when expression

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3294.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV datamap throws an error when using count(1) and a case-when expression
> ---
>
> Key: CARBONDATA-3294
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3294
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Chenjian Qiu
>Priority: Blocker
> Fix For: 1.6.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Query SQL
>  ```
>  sql(s"""SELECT MT.`3600` AS `3600`,
>  MT.`2250410101` AS `2250410101`,
>  count(1) over() as countNum,
>  (CASE WHEN (SUM(COALESCE(seq_c, 0))) = 0 THEN NULL
>  ELSE
>  (CASE WHEN (CAST((SUM(COALESCE(seq_c, 0))) AS int)) = 0 THEN 0
>  ELSE ((CAST((SUM(COALESCE(succ_c, 0))) AS double))
>  / (CAST((SUM(COALESCE(seq_c, 0))) AS double)))
>  END) * 100
>  END) AS rate
>  FROM (
>  SELECT sum_result.*, H_REGION.`2250410101` FROM
>  (SELECT cast(floor((starttime + 28800) / 3600) * 3600 - 28800 as int) AS 
> `3600`,
>  LAYER4ID,
>  COALESCE(SUM(seq), 0) AS seq_c,
>  COALESCE(SUM(succ), 0) AS succ_c
>  FROM data_table
>  WHERE STARTTIME >= 1549866600 AND STARTTIME < 1549899900
>  GROUP BY cast(floor((STARTTIME + 28800) / 3600) * 3600 - 28800 as 
> int),LAYER4ID
>  )sum_result
>  LEFT JOIN
>  (SELECT l4id AS `225040101`,
>  l4name AS `2250410101`,
>  l4name AS NAME_2250410101
>  FROM region
>  GROUP BY l4id, l4name) H_REGION
>  ON sum_result.LAYER4ID = H_REGION.`225040101`
>  WHERE H_REGION.NAME_2250410101 IS NOT NULL
>  ) MT
>  GROUP BY MT.`3600`, MT.`2250410101`
>  ORDER BY `3600` ASC LIMIT 5000""".stripMargin)
>  ```
>  
> ERROR:
> mismatched input 'FROM' expecting \{, 'WHERE', 'GROUP', 'ORDER', 
> 'HAVING', 'LIMIT', 'LATERAL', 'WINDOW', 'UNION', 'EXCEPT', 'MINUS', 
> 'INTERSECT', 'SORT', 'CLUSTER', 'DISTRIBUTE'}(line 2, pos 0)
> == SQL ==
> SELECT MT.`3600`, MT.`2250410101`, `countNum`, `rate` 
> FROM
> ^^^





[jira] [Resolved] (CARBONDATA-3291) MV datamap doesn't take effect when the same table is joined

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3291.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> MV datamap doesn't take effect when the same table is joined
> ---
>
> Key: CARBONDATA-3291
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3291
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Chenjian Qiu
>Priority: Blocker
> Fix For: 1.6.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3367) OOM when huge number of carbondata files are read from SDK reader

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3367?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3367.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> OOM when huge number of carbondata files are read from SDK reader
> -
>
> Key: CARBONDATA-3367
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3367
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> Currently, one CarbonRecordReader is created for each carbondata file, and 
> the list of CarbonRecordReaders is maintained in the CarbonReader. Even after 
> a CarbonRecordReader is closed, it is not garbage collected because the list 
> still references it. 
> As a result, each CarbonRecordReader needs separate memory instead of 
> reusing the memory of the previous one. 
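The leak pattern described above, and the fix direction of dropping the list's reference on close, can be sketched with plain Java (all names here are illustrative stand-ins, not the actual SDK classes):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for CarbonRecordReader.
class RecordReader {
    private byte[] buffer = new byte[1 << 20]; // per-reader working memory

    void close() {
        buffer = null; // release the reader's own memory on close
    }
}

public class ReaderListSketch {
    // Problematic pattern from the report: this list keeps every reader
    // strongly reachable, so the GC cannot reclaim closed readers.
    private final List<RecordReader> readers = new ArrayList<>();

    RecordReader open() {
        RecordReader r = new RecordReader();
        readers.add(r);
        return r;
    }

    // Fix direction: drop the list's reference when a reader is closed,
    // so its memory can be reclaimed before the next reader allocates.
    void closeAndRelease(RecordReader r) {
        r.close();
        readers.remove(r);
    }

    int openCount() {
        return readers.size();
    }

    public static void main(String[] args) {
        ReaderListSketch sdk = new ReaderListSketch();
        for (int i = 0; i < 100; i++) {
            sdk.closeAndRelease(sdk.open());
        }
        System.out.println(sdk.openCount()); // 0: nothing stays pinned
    }
}
```

With the reference removed on close, each closed reader becomes unreachable and its buffer can be reclaimed before the next reader allocates.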





[jira] [Resolved] (CARBONDATA-3368) InferSchema from datafile instead of index file

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3368.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> InferSchema from datafile instead of index file
> ---
>
> Key: CARBONDATA-3368
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3368
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Major
> Fix For: 1.6.0
>
>
> Problem: In the SDK, when multiple readers were created for the same folder 
> location with different file lists, all the readers referred to the same 
> index file for schema inference, which caused a bottleneck and a JVM crash 
> in the case of JNI calls.
> Solution: Infer the schema from the data file specified while building the 
> reader. 





[jira] [Resolved] (CARBONDATA-3365) Support Apache arrow vector filling from carbondata SDK

2019-05-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3365.
-
   Resolution: Fixed
Fix Version/s: 1.6.0

> Support Apache arrow vector filling from carbondata SDK
> ---
>
> Key: CARBONDATA-3365
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3365
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Priority: Major
> Fix For: 1.6.0
>
>  Time Spent: 11h 50m
>  Remaining Estimate: 0h
>
> *Background:* 
> Apache Arrow is a cross-language development platform for in-memory data. It 
> specifies a standardised, language-independent columnar memory format for 
> flat and hierarchical data, organised for efficient analytic operations on 
> modern hardware. 
> By integrating carbon to support filling Arrow vectors, content read from 
> carbondata files can be used for analytics in any programming language: an 
> Arrow vector filled from the carbon Java SDK can be read by Python, C, C++ 
> and many other languages supported by Arrow. 
> This also widens the scope of carbondata use-cases, as Arrow is already 
> integrated with many query engines. 
> *Implementation:* 
> *Stage 1:* 
> After the SDK reads the carbondata file, convert the carbon rows and fill 
> the Arrow vector. 
> *Stage 2:* 
> Deep integration with the carbon vector; currently the carbon SDK vector 
> does not support filling complex columns. Once that is supported, the Arrow 
> vector can be wrapped around the carbon SDK vector for deep integration. 
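Stage 1 above is essentially a row-to-column fill. A stdlib-only sketch of that idea follows; the real implementation would target Arrow's vector API rather than a plain array, and all names here are illustrative:

```java
import java.util.Arrays;
import java.util.List;

public class RowToVectorSketch {
    // Fill a columnar int "vector" from row-wise records, the way stage 1
    // fills an Arrow vector from rows read by the SDK.
    static int[] fillIntColumn(List<Object[]> rows, int columnIndex) {
        int[] vector = new int[rows.size()];
        for (int i = 0; i < rows.size(); i++) {
            // each row contributes one slot of the column vector
            vector[i] = (Integer) rows.get(i)[columnIndex];
        }
        return vector;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[] {"a", 10},
            new Object[] {"b", 20},
            new Object[] {"c", 30});
        System.out.println(Arrays.toString(fillIntColumn(rows, 1)));
    }
}
```

Once the data sits in a columnar buffer like this (an Arrow vector in the real case), it can be handed to any language with an Arrow binding without copying row by row.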





[jira] [Resolved] (CARBONDATA-3386) Concurrent Merge index and query is failing

2019-05-17 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3386.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Concurrent Merge index and query is failing
> ---
>
> Key: CARBONDATA-3386
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3386
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 6h 50m
>  Remaining Estimate: 0h
>
> Concurrent merge index and query is failing. When a load is triggered on a 
> table, merge index is triggered at the end of the load, but only after the 
> table status is updated as SUCCESS/PARTIAL SUCCESS for those segments, so 
> the segment is already visible to a concurrent query. Once the merge index 
> is done, it deletes the index files that the query is still referring to, 
> which leads to the query failure.





[jira] [Resolved] (CARBONDATA-3377) String Type Column with huge strings and null values fails Range Compaction

2019-05-16 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3377.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> String Type Column with huge strings and null values fails Range Compaction
> ---
>
> Key: CARBONDATA-3377
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3377
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.4
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> h1. String Type Column with huge strings and null values fails giving 
> NullPointerException when it is a range column and compaction is done.





[jira] [Resolved] (CARBONDATA-3375) GC Overhead limit exceeded error for huge data in Range Compaction

2019-05-15 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3375.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> GC Overhead limit exceeded error for huge data in Range Compaction
> --
>
> Key: CARBONDATA-3375
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3375
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.4
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When only a single data item is present, it is launched as one single task, 
> which results in one executor getting overloaded.





[jira] [Resolved] (CARBONDATA-3371) Compaction shows ArrayIndexOutOfBoundsException after sort_columns modification

2019-05-09 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3371.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Compaction shows ArrayIndexOutOfBoundsException after sort_columns modification
> --
>
> Key: CARBONDATA-3371
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3371
> Project: CarbonData
>  Issue Type: Bug
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> 2019-05-05 15:26:39 ERROR DataTypeUtil:619 - Cannot convert� Z�w} to SHORT 
> type valueWrong length: 8, expected 2
> 2019-05-05 15:26:39 ERROR DataTypeUtil:621 - Problem while converting data 
> type� Z�w} 
> 2019-05-05 15:26:39 ERROR CompactionResultSortProcessor:185 - 3
> java.lang.ArrayIndexOutOfBoundsException: 3
>  at 
> org.apache.carbondata.core.scan.wrappers.ByteArrayWrapper.getNoDictionaryKeyByIndex(ByteArrayWrapper.java:81)
>  at 
> org.apache.carbondata.processing.merger.CompactionResultSortProcessor.prepareRowObjectForSorting(CompactionResultSortProcessor.java:332)
>  at 
> org.apache.carbondata.processing.merger.CompactionResultSortProcessor.processResult(CompactionResultSortProcessor.java:250)
>  at 
> org.apache.carbondata.processing.merger.CompactionResultSortProcessor.execute(CompactionResultSortProcessor.java:175)
>  at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD$$anon$1.(CarbonMergerRDD.scala:226)
>  at 
> org.apache.carbondata.spark.rdd.CarbonMergerRDD.internalCompute(CarbonMergerRDD.scala:84)
>  at org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:82)
>  at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
>  at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
>  at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
> 2019-05-05 15:26:39 ERROR CarbonMergerRDD:233 - Compaction Failed





[jira] [Resolved] (CARBONDATA-3369) Fix issues during concurrent execution of Create table If not exists

2019-05-08 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3369.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Fix issues during concurrent execution of Create table If not exists 
> -
>
> Key: CARBONDATA-3369
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3369
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kanaka Kumar Avvaru
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Create table if not exists has the following problems when run concurrently 
> from different drivers: 
> 1) Sometimes it fails with the error "Table  already exists." 
> 2) The driver whose create table failed still holds the table with the wrong 
> path or schema, and subsequent operations refer to the wrong path.
> 3) The stale path created during create table is never deleted [after 
> version 1.5.0 the table is created in a new folder using a UUID if a folder 
> with the table name already exists].





[jira] [Resolved] (CARBONDATA-3360) NullPointerException in clean files operation

2019-05-08 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3360.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> NullPointerException in clean files operation
> -
>
> Key: CARBONDATA-3360
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3360
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Minor
> Fix For: 1.5.4
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When a delete fails because the HDFS quota is exceeded or the disk is full, 
> a tableUpdateStatus.write file remains in the store.
> If a clean files operation is run after that, we try to assign null to a 
> primitive long, which throws a runtime exception, and the .write file is not 
> deleted, since we consider it an invalid file.
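The failure mode described above, unboxing a null wrapper into a primitive `long`, and a defensive fix can be shown in a few lines of Java (the field being parsed is hypothetical, not the actual carbon code path):

```java
public class NullUnboxSketch {
    // Hypothetical parse of a field that may be absent; returns null when
    // the value is missing.
    static Long parseTimestamp(String s) {
        return (s == null || s.isEmpty()) ? null : Long.valueOf(s);
    }

    public static void main(String[] args) {
        Long parsed = parseTimestamp("");
        // Problematic pattern from the report:
        // long t = parsed;  // NullPointerException: unboxing a null Long
        // Defensive fix: substitute a sentinel before unboxing to a primitive.
        long t = (parsed == null) ? -1L : parsed;
        System.out.println(t); // -1
    }
}
```

The runtime exception comes from the implicit unboxing conversion, so any assignment of a possibly-null wrapper to a primitive needs a null check or a default first.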





[jira] [Resolved] (CARBONDATA-3344) Fix Drop column not present in table

2019-04-19 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3344.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Fix Drop column not present in table
> 
>
> Key: CARBONDATA-3344
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3344
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 1.5.4
>
> Attachments: drop.png
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Steps to Reproduce:
>  # create table test1(col1 int) stored by 'carbondata'
>  # Try to Drop column not present in table => alter table test1 drop 
> columns(name)
>  # Find the null pointer exception
> !drop.png!





[jira] [Resolved] (CARBONDATA-3353) Fix MinMax Pruning for Measure column in case of Legacy store

2019-04-17 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3353.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Fix MinMax Pruning for Measure column in case of Legacy store
> -
>
> Key: CARBONDATA-3353
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3353
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> For tables created and loaded in a legacy store with a measure column, 
> querying the measure column with the current version returns wrong results.





[jira] [Resolved] (CARBONDATA-3334) multiple segment files created for partition table for one segment

2019-04-17 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3334.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> multiple segment files created for partition table for one segment
> --
>
> Key: CARBONDATA-3334
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3334
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> An unnecessary file _0.tmp is also created, along with duplicate segment 
> files (0_1553762464817.segment and 0_1553762469148.segment).
> create table f (name string) partitioned by (b int) stored by 'carbondata' ;
> insert into f select 'a',10;
> check hdfs store location 
> hadoop fs -ls 
> /user/hive/warehouse/carbon.store/carbon1_1_test/f/Metadata/segments/
> Found 3 items
> -rw-rw-r--+ 3 carbon hive 163 2019-03-28 14:11 
> /user/hive/warehouse/carbon.store/carbon1_1_test/f/Metadata/segments/0_1553762464817.segment
> -rw-rw-r--+ 3 carbon hive 158 2019-03-28 14:11 
> /user/hive/warehouse/carbon.store/carbon1_1_test/f/Metadata/segments/0_1553762469148.segment
> drwxrwx---+ - carbon hive 0 2019-03-28 14:11 
> /user/hive/warehouse/carbon.store/carbon1_1_test/f/Metadata/segments/_0.tmp
>  





[jira] [Resolved] (CARBONDATA-3331) Database index size is more than overall index size in SHOW METADATA command

2019-04-17 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3331.
-
   Resolution: Fixed
Fix Version/s: 1.5.4

> Database index size is more than overall index size in SHOW METADATA command
> 
>
> Key: CARBONDATA-3331
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3331
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Priority: Major
> Fix For: 1.5.4
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3001) Propose configurable page size in MB (via carbon property)

2019-04-15 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3001.
-
   Resolution: Fixed
 Assignee: (was: Ajantha Bhat)
Fix Version/s: 1.5.4

> Propose configurable page size in MB (via carbon property)
> --
>
> Key: CARBONDATA-3001
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3001
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Priority: Minor
> Fix For: 1.5.4
>
> Attachments: Propose configurable page size in MB (via carbon 
> property).pdf
>
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> For better in-memory processing of carbondata pages, I am proposing 
> configurable page size in MB (via carbon property).
> Please find the attachment for more details.
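The proposal above caps a page by accumulated bytes rather than a fixed row count. A minimal stdlib sketch of that cutting logic (the threshold handling is an assumption for illustration; the actual property name and semantics are in the attached proposal):

```java
import java.util.ArrayList;
import java.util.List;

public class PageSizeSketch {
    // Cut pages by an accumulated byte budget instead of a fixed row count.
    static List<List<byte[]>> cutPages(List<byte[]> rows, long pageSizeBytes) {
        List<List<byte[]>> pages = new ArrayList<>();
        List<byte[]> current = new ArrayList<>();
        long used = 0;
        for (byte[] row : rows) {
            // start a new page when this row would exceed the budget
            // (an oversized single row still gets its own page)
            if (!current.isEmpty() && used + row.length > pageSizeBytes) {
                pages.add(current);
                current = new ArrayList<>();
                used = 0;
            }
            current.add(row);
            used += row.length;
        }
        if (!current.isEmpty()) {
            pages.add(current); // flush the last partially filled page
        }
        return pages;
    }

    public static void main(String[] args) {
        List<byte[]> rows = new ArrayList<>();
        for (int i = 0; i < 10; i++) rows.add(new byte[300]); // 10 rows, 300 B each
        // With a 1000-byte budget each page holds 3 rows, so 4 pages result.
        System.out.println(cutPages(rows, 1000).size()); // 4
    }
}
```

Cutting by size keeps every page within a predictable memory footprint regardless of how wide individual rows are, which is the motivation stated for the configurable property.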





[jira] [Resolved] (CARBONDATA-3332) Concurrent update and compaction failure

2019-03-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3332.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Concurrent update and compaction failure
> 
>
> Key: CARBONDATA-3332
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3332
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Job aborted due to stage failure: Task 0 in stage 114.0 failed 4 times, most 
> recent failure: Lost task 0.3 in stage 114.0 (TID 257, linux-53, executor 
> 19): java.util.NoSuchElementException: key not found: 0
> at scala.collection.MapLike$class.default(MapLike.scala:228)
> at scala.collection.AbstractMap.default(Map.scala:59)
> at scala.collection.MapLike$class.apply(MapLike.scala:141)
> at scala.collection.AbstractMap.apply(Map.scala:59)
> at 
> org.apache.carbondata.spark.rdd.CarbonDataRDDFactory$SegmentPartitioner$1.getPartition(CarbonDataRDDFactory.scala:709)
> at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:153)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
> at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:325)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748)





[jira] [Resolved] (CARBONDATA-3333) Fixed No Sort Store Size issue and Compatibility issue after alter add column done in 1.1 and load in 1.5

2019-03-27 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3333.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Fixed No Sort Store Size issue and Compatibility issue after alter add 
> column done in 1.1 and load in 1.5
> --
>
> Key: CARBONDATA-3333
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3333
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Issue 1: Load fails in the latest version after an alter done in an older 
> version.
> Issue 2: After PR#3140 the store size increased.





[jira] [Resolved] (CARBONDATA-3330) Fix Invalid exception when SDK reader is trying to clear the datamap

2019-03-27 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3330.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Fix Invalid exception when SDK reader is trying to clear the datamap
> 
>
> Key: CARBONDATA-3330
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3330
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Priority: Minor
> Fix For: 1.5.3
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> java.io.IOException: File does not exist: 
> /opt/csdk/out/cmplx_Schema/Metadata/schema at 
> org.apache.carbondata.core.metadata.schema.SchemaReader.readCarbonTableFromStore(SchemaReader.java:60)
>  at 
> org.apache.carbondata.core.metadata.schema.table.CarbonTable.buildFromTablePath(CarbonTable.java:302)
>  at 
> org.apache.carbondata.core.datamap.DataMapStoreManager.getCarbonTable(DataMapStoreManager.java:512)
>  at 
> org.apache.carbondata.core.datamap.DataMapStoreManager.clearDataMaps(DataMapStoreManager.java:476)
>  at 
> org.apache.carbondata.hadoop.CarbonRecordReader.close(CarbonRecordReader.java:164)
>  at org.apache.carbondata.sdk.file.CarbonReader.close(CarbonReader.java:219)





[jira] [Resolved] (CARBONDATA-3328) Performance issue with merge small files distribution

2019-03-25 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3328.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Performance issue with merge small files distribution
> -
>
> Key: CARBONDATA-3328
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3328
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> After PR#3154, in the merge small files case the split length was coming 
> out as 0, so all the files were merged, and queries with merge small files 
> were slow.





[jira] [Resolved] (CARBONDATA-3329) DeadLock is observed when a query fails.

2019-03-25 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3329.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> DeadLock is observed when a query fails.
> 
>
> Key: CARBONDATA-3329
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3329
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> "HiveServer2-Handler-Pool: Thread-303" #303 prio=5 os_prio=0 
> tid=0x7fcfe129f800 nid=0x59eb9 waiting for monitor entry 
> [0x7fcfd3c42000]
>  java.lang.Thread.State: BLOCKED (on object monitor)
>  at org.apache.log4j.Category.callAppenders(Category.java:204)
>  - waiting to lock <0x7fd046f9ed60> (a org.apache.log4j.spi.RootLogger)
>  at org.apache.log4j.Category.forcedLog(Category.java:391)
>  at org.apache.log4j.Category.log(Category.java:856)
>  at org.slf4j.impl.Log4jLoggerAdapter.log(Log4jLoggerAdapter.java:581)
>  at 
> org.apache.commons.logging.impl.SLF4JLocationAwareLog.info(SLF4JLocationAwareLog.java:155)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient.close(HiveMetaStoreClient.java:622)
>  at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:156)
>  at com.sun.proxy.$Proxy28.close(Unknown Source)
>  at sun.reflect.GeneratedMethodAccessor33.invoke(Unknown Source)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2107)
>  - locked <0x7fd05611ef38> (a 
> org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler)
>  at com.sun.proxy.$Proxy28.close(Unknown Source)
>  at org.apache.hadoop.hive.ql.metadata.Hive.close(Hive.java:294)
>  at org.apache.hadoop.hive.ql.metadata.Hive.access$000(Hive.java:141)
>  at org.apache.hadoop.hive.ql.metadata.Hive$1.remove(Hive.java:161)
>  - locked <0x7fd051ba0bb0> (a org.apache.hadoop.hive.ql.metadata.Hive$1)
>  at org.apache.hadoop.hive.ql.metadata.Hive.closeCurrent(Hive.java:264)
>  at 
> org.apache.spark.sql.hive.client.HiveClientImpl$$anonfun$withHiveState$1.apply(HiveClientImpl.scala:294)
>  at 
> org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:246)
>  at 
> org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:245)
>  - locked <0x7fd04cbc4c78> (a 
> org.apache.spark.sql.hive.client.hiveClientObject)
>  at 
> org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:292)
>  at 
> org.apache.spark.sql.hive.client.HiveClientImpl.databaseExists(HiveClientImpl.scala:388)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply$mcZ$sp(HiveExternalCatalog.scala:178)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:178)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog$$anonfun$databaseExists$1.apply(HiveExternalCatalog.scala:178)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:97)
>  - locked <0x7fd04ce5ff48> (a 
> org.apache.spark.sql.hive.HiveExternalCatalog)
>  at 
> org.apache.spark.sql.hive.HiveExternalCatalog.databaseExists(HiveExternalCatalog.scala:177)
>  at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.databaseExists(SessionCatalog.scala:198)
>  at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.org$apache$spark$sql$catalyst$catalog$SessionCatalog$$requireDbExists(SessionCatalog.scala:138)
>  at 
> org.apache.spark.sql.catalyst.catalog.SessionCatalog.getDatabaseMetadata(SessionCatalog.scala:192)
>  at 
> org.apache.spark.sql.getDB$.getDBLocation(CarbonCatalystOperators.scala:107)
>  at 
> org.apache.spark.sql.hive.CarbonMetastore$$anonfun$loadMetadata$1.apply(CarbonMetastore.scala:253)
>  at 
> org.apache.spark.sql.hive.CarbonMetastore$$anonfun$loadMetadata$1.apply(CarbonMetastore.scala:251)
>  at scala.collection.Iterator$class.foreach(Iterator.scala:893)
>  at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
>  at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>  at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>  at 
> org.apache.spark.sql.hive.CarbonMetastore.loadMetadata(CarbonMetastore.scala:251)
>  at 
> 

[jira] [Resolved] (CARBONDATA-3323) Output is null when cache is empty

2019-03-24 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3323?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3323.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Output is null when cache is empty
> --
>
> Key: CARBONDATA-3323
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3323
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Priority: Minor
> Fix For: 1.5.3
>
>
> *Problem*:
> When "SHOW METACACHE ON TABLE" is executed and carbonLRUCache is null, the 
> output is an empty sequence, which is not standard.
>  
> *Fix*:
> Return standard output even when carbonLRUCache is not initialised (null), 
> with the size for index and dictionary as 0.
>  





[jira] [Resolved] (CARBONDATA-3322) After renaming table, "SHOW METACACHE ON TABLE" still works for old table

2019-03-24 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3322.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> After renaming table, "SHOW METACACHE ON TABLE" still works for old table
> -
>
> Key: CARBONDATA-3322
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3322
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *Problem*:
> After we alter the table name from t1 to t2, "SHOW METACACHE ON TABLE" works 
> for both the old table name "t1" and the new table name "t2".
>  
> *Fix*:
> Added a check for the table.





[jira] [Resolved] (CARBONDATA-3321) Improve Single/Concurrent query performance

2019-03-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3321.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Improve Single/Concurrent query performance 
> 
>
> Key: CARBONDATA-3321
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3321
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When the number of segments is high, single/concurrent queries are slow for 
> the following reasons: 
>  # The memory footprint is larger, so GC increases and query performance 
> drops.
>  # The unsafe datamap row is converted to a safe datamap row during pruning. 
>  # Multi-threaded pruning is not supported for non-filter queries. 
>  # Retrieval from the unsafe datamap row is slower. 





[jira] [Resolved] (CARBONDATA-3293) Prune datamaps improvement for count(*)

2019-03-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3293.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Prune datamaps improvement for count(*)
> ---
>
> Key: CARBONDATA-3293
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3293
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 14h 50m
>  Remaining Estimate: 0h
>
> +*Problem:*+
> (1) Currently for count ( *), pruning is the same as for a select * query. 
> Blocklet and ExtendedBlocklet are formed from the DataMapRow, which is 
> unnecessary and time-consuming.
> (2) Pruning in a select * query spends time in convertToSafeRow(), converting 
> the DataMapRow to a safe row: in an unsafe row we need to traverse the whole 
> row to reach the position of the data.
> (3) In the case of filter queries, whether the blocklet is valid or invalid, 
> we convert the DataMapRow to a safe row. The cost of this conversion grows 
> with the number of blocklets.
>  
> +*Solution:*+
> (1) We already have the blocklet row count in the DataMapRow itself, so it is 
> enough to read the count. With this, count ( *) query performance can be 
> improved.
> (2) Also maintain the data length in the DataMapRow, so that traversing the 
> whole row can be avoided; with the length we can seek directly to the data 
> position.
> (3) Read only the MinMax from the DataMapRow, decide whether a scan is 
> required on that blocklet, and only if required convert it to a safe row.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3315) Range Filter query with two between clauses with an OR gives wrong results

2019-03-15 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3315.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Range Filter query with two between clauses with an OR gives wrong results
> --
>
> Key: CARBONDATA-3315
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3315
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 7h 10m
>  Remaining Estimate: 0h
>
> # Create table t1(c1 string, c2 int) stored by 'carbondata' 
> tblproperties('sort_columns'='c2')
>  # insert some values into table t1
>  # select * from t1 where c2 between 2 and 3 or c2 between 3 and 4
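The query above combines two inclusive ranges that share the boundary value 3. A minimal sketch, with hypothetical names and not CarbonData's actual range-filter code, of how OR-combined inclusive BETWEEN ranges should behave: the shared boundary must still match, and merging overlapping ranges must keep both endpoints.

```java
import java.util.List;

// Hypothetical illustration of "c2 BETWEEN 2 AND 3 OR c2 BETWEEN 3 AND 4":
// both ranges are inclusive, and the shared boundary 3 must match.
public class RangeUnion {
    static final class Range {
        final int low, high;            // inclusive bounds
        Range(int low, int high) { this.low = low; this.high = high; }
        boolean contains(int v) { return v >= low && v <= high; }
    }

    // OR semantics: a value matches if any range contains it.
    static boolean matchesAny(List<Range> ranges, int v) {
        for (Range r : ranges) {
            if (r.contains(v)) return true;
        }
        return false;
    }

    // Merging two overlapping inclusive ranges must not drop an endpoint.
    static Range merge(Range a, Range b) {
        return new Range(Math.min(a.low, b.low), Math.max(a.high, b.high));
    }
}
```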



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3300) ClassNotFoundException when using UDF on spark-shell

2019-03-12 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3300.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> ClassNotFoundException when using UDF on spark-shell
> 
>
> Key: CARBONDATA-3300
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3300
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> create table x1 (imei string, deviceInformationId int, mac string, 
> productdate timestamp, updatetime timestamp, gamePointId double, 
> contractNumber double) STORED BY 'org.apache.carbondata.format';
> Load the data to x1:
> LOAD DATA inpath 'hdfs://localhost/x1_without_header.csv' into table x1 
> options('DELIMITER'=',', 'QUOTECHAR'='"','FILEHEADER'='imei, 
> deviceinformationid,mac, productdate,updatetime, gamepointid,contractnumber');
> Create another table res_1 using following sql:
> create table res_1 as select * from x1 limit 2;
> 2. Login spark-shell, register udf and run the join query
> import java.sql.Date
> import java.sql.Timestamp;
> spark.udf.register("castTimestampToDate", (x: Timestamp) =>
>   try {
>     Some(new Date(x.getTime - x.toLocalDateTime.getHour * 3600 * 1000L - 
> x.toLocalDateTime.getMinute * 60 * 1000L - x.toLocalDateTime.getSecond * 
> 1000L))
>   } catch {
>     case _: Exception => None
>   }
> )
> spark.sql("select res_1.* from x1, res_1 where 
> castTimestampToDate(x1.productdate) = castTimestampToDate(res_1.productdate) 
> and x1.deviceInformationId = res_1.deviceInformationId").show(false)
>  
> java.lang.RuntimeException: Error while reading filter expression
>   at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getFilterPredicates(CarbonInputFormat.java:392)
>   at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getSplits(CarbonTableInputFormat.java:204)
>   at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.internalGetPartitions(CarbonScanRDD.scala:139)
>   at 
> org.apache.carbondata.spark.rdd.CarbonRDD.getPartitions(CarbonRDD.scala:66)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at 
> org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:46)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:253)
>   at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:251)
>   at scala.Option.getOrElse(Option.scala:121)
>   at org.apache.spark.rdd.RDD.partitions(RDD.scala:251)
>   at org.apache.spark.sql.execution.SparkPlan.executeTake(SparkPlan.scala:340)
>   at 
> org.apache.spark.sql.execution.CollectLimitExec.executeCollect(limit.scala:38)
>   at 
> org.apache.spark.sql.Dataset.org$apache$spark$sql$Dataset$$collectFromPlan(Dataset.scala:3278)
>   at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
>   at org.apache.spark.sql.Dataset$$anonfun$head$1.apply(Dataset.scala:2489)
>   at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
>   at org.apache.spark.sql.Dataset.head(Dataset.scala:2489)
>   at org.apache.spark.sql.Dataset.take(Dataset.scala:2703)
>   at org.apache.spark.sql.Dataset.showString(Dataset.scala:254)
>   at org.apache.spark.sql.Dataset.show(Dataset.scala:725)
>   at org.apache.spark.sql.Dataset.show(Dataset.scala:702)
>   ... 49 elided
> Caused by: java.io.IOException: Could not read object
>   at 
> org.apache.carbondata.core.util.ObjectSerializationUtil.convertStringToObject(ObjectSerializationUtil.java:100)
>   at 
> org.apache.carbondata.hadoop.api.CarbonInputFormat.getFilterPredicates(CarbonInputFormat.java:389)
>   ... 84 more
> Caused by: 

[jira] [Resolved] (CARBONDATA-3301) Array column is giving null data in case of spark carbon file format

2019-03-07 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3301.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Array column is giving null data in case of spark carbon file format 
> ---
>
> Key: CARBONDATA-3301
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3301
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Steps to reproduce
>  
>  # drop table if exists issue
>  # create table issue(name string, dob array) using carbon
>  # insert into issue select 'abc',array('2017-11-11')
>  # select * from issue"
> output is 
> +----+------------+
> |name|dob         |
> +----+------------+
> |abc |[]          |
> +----+------------+
>  
> But parquet gives correct output
> +----+------------+
> |name|dob         |
> +----+------------+
> |abc |[2017-11-11]|
> +----+------------+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3284) Workaround for Create-PreAgg Datamap Fail

2019-02-13 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3284.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Workaround for Create-PreAgg Datamap Fail
> -
>
> Key: CARBONDATA-3284
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3284
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
> Fix For: 1.5.3
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> If for some reason^*[1]*^ creating a PreAgg datamap failed and dropping it 
> also failed, then the datamap cannot be dropped: it was not registered in the 
> parent table schema file, but got registered in spark-hive, so it shows up as 
> a table, yet carbon throws an error if we try to drop it as a table.
> Workaround:
> After this change, we can at least drop it as a hive table by the command
> {{drop table table_datamap; }}
> *[1]* - The reason could be something like setting an HDFS quota on the 
> database folder, so that the parent table schema file could not be modified.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3287) Remove the validation of same schema data files in location for external table and file format

2019-02-13 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3287.
-
   Resolution: Fixed
Fix Version/s: 1.5.3

> Remove the validation of same schema data files in location for external 
> table and file format
> --
>
> Key: CARBONDATA-3287
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3287
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 1.5.3
>
>  Time Spent: 7.5h
>  Remaining Estimate: 0h
>
> Currently we have a validation that fails the query if there are two 
> carbondata files with different schemas in a location. I think there is no 
> need to fail; parquet behaves the same way.
>  
> Instead of failing, we can read the latest schema from the latest carbondata 
> file in the given location and, based on that, read all the files and produce 
> the query output. For columns not present in some data files, the new column 
> will have null values.
>  
> Note that we do not merge schemas; the behavior stays the same, the only 
> change is taking the latest schema.
>  
> For example:
> one data file has columns a, b and c, and a second file has columns a, b, c, 
> d and e. We read the latest file and create the table with its 5 columns 
> (this applies when the user does not specify a schema). If the user specifies 
> a schema, the table is created with the specified schema.
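The "latest schema wins" idea above can be sketched briefly. This is a hypothetical illustration (names like `tableSchema` and `project` are invented), not CarbonData's actual inference code: the last written file defines the table schema, and rows from older files return null for columns they lack.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of "take the latest schema, no merging":
// missing columns in older files read as null.
public class LatestSchemaRead {
    static List<String> tableSchema(List<List<String>> fileSchemasOldestFirst) {
        // Latest schema wins: the most recently written file defines the table.
        return fileSchemasOldestFirst.get(fileSchemasOldestFirst.size() - 1);
    }

    // Project a row from one data file onto the table schema;
    // columns absent in that file become null.
    static Map<String, Object> project(List<String> tableSchema,
                                       Map<String, Object> fileRow) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String col : tableSchema) {
            out.put(col, fileRow.getOrDefault(col, null));
        }
        return out;
    }
}
```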



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3224) SDK should validate the improper value

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3224.
-
Resolution: Fixed

> SDK should validate the improper value
> --
>
> Key: CARBONDATA-3224
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3224
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.1
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>
> a) "BAD_RECORDS_ACTION","jfie"
> b) "BAD_RECORDS_LOGGER_ENABLE","FLSE"
> For both of the above cases the test case passes, which is improper: the 
> value given for the corresponding key is not validated as expected.
> Basically, it accepts any string value.
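The requested validation can be sketched as below. The valid `BAD_RECORDS_ACTION` values (FORCE, REDIRECT, IGNORE, FAIL) are CarbonData's documented options; the class and method names here are hypothetical, not the SDK's actual API:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: reject option values that are not in the allowed
// set instead of silently accepting any string.
public class LoadOptionValidator {
    static final Set<String> BAD_RECORDS_ACTIONS =
        new HashSet<>(Arrays.asList("FORCE", "REDIRECT", "IGNORE", "FAIL"));

    static void validateBadRecordsAction(String value) {
        if (!BAD_RECORDS_ACTIONS.contains(value.toUpperCase(Locale.ROOT))) {
            throw new IllegalArgumentException(
                "Invalid BAD_RECORDS_ACTION value: " + value);
        }
    }

    // Boolean-valued options like BAD_RECORDS_LOGGER_ENABLE must be
    // exactly "true" or "false" (case-insensitive).
    static void validateBooleanOption(String key, String value) {
        if (!"true".equalsIgnoreCase(value) && !"false".equalsIgnoreCase(value)) {
            throw new IllegalArgumentException(
                "Invalid boolean value for " + key + ": " + value);
        }
    }
}
```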



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2420) Support string longer than 32000 characters

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-2420.
-
Resolution: Fixed

> Support string longer than 32000 characters
> ---
>
> Key: CARBONDATA-2420
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2420
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Add a table property 'long_string_columns' at create time to support string 
> columns that contain more than 32000 characters.
> Inside carbondata, an integer instead of a short is used to store the length 
> of the bytes content.
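Why the short-to-int change matters can be shown in a few lines. This is an illustrative sketch of length-prefix encoding in general, not CarbonData's actual on-disk format: a 2-byte short length prefix caps a value at 32767 bytes, while a 4-byte int prefix removes that limit.

```java
import java.nio.ByteBuffer;

// Illustrative sketch: encode a byte value with a length prefix.
// A short prefix limits content to 32767 bytes; an int prefix does not.
public class LengthPrefix {
    static byte[] writeWithShortLength(byte[] content) {
        if (content.length > Short.MAX_VALUE) {
            throw new IllegalArgumentException("value longer than 32767 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(2 + content.length);
        buf.putShort((short) content.length).put(content);
        return buf.array();
    }

    static byte[] writeWithIntLength(byte[] content) {
        ByteBuffer buf = ByteBuffer.allocate(4 + content.length);
        buf.putInt(content.length).put(content);
        return buf.array();
    }
}
```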



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3198) ALTER ADD COLUMNS does not support datasource table with type carbon

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3198:

Fix Version/s: (was: 1.5.2)
   NONE

> ALTER ADD COLUMNS does not support datasource table with type carbon
> 
>
> Key: CARBONDATA-3198
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3198
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.1
>Reporter: xubo245
>Priority: Major
> Fix For: NONE
>
>
> code:
> {code:java}
>   test("test add columns for table of using carbon") {
> import spark.implicits._
> val df = spark.sparkContext.parallelize(1 to 10)
>   .map(x => ("a" + x % 10, "b", x))
>   .toDF("c1", "c2", "number")
> spark.sql("drop table if exists testparquet")
> spark.sql("drop table if exists testformat")
> // Saves dataframe to carbon file
> df.write
>   .format("parquet").saveAsTable("testparquet")
> spark.sql("create table carbon_table(c1 string, c2 string, number int) 
> using carbon")
> spark.sql("insert into carbon_table select * from testparquet")
> TestUtil.checkAnswer(spark.sql("select * from carbon_table where 
> c1='a1'"), spark.sql("select * from testparquet where c1='a1'"))
> if (!spark.sparkContext.version.startsWith("2.1")) {
>   val mapSize = DataMapStoreManager.getInstance().getAllDataMaps.size()
>   DataMapStoreManager.getInstance()
> .clearDataMaps(AbsoluteTableIdentifier.from(warehouse1 + 
> "/carbon_table"))
>   assert(mapSize > 
> DataMapStoreManager.getInstance().getAllDataMaps.size())
> }
> spark.sql("select * from carbon_table").show()
> spark.sql("ALTER TABLE carbon_table ADD COLUMNS (a1 INT, b1 STRING) ")
> spark.sql("select * from carbon_table").show()
> spark.sql("insert into carbon_table values('Bob','xu',12,1,'parquet')")
> spark.sql("select * from carbon_table").show()
> spark.sql("drop table if exists testparquet")
> spark.sql("drop table if exists testformat")
>   }
> {code}
> exception:
> {code:java}
> 2018-12-25 22:22:12 INFO  ContextHandler:781 - Started 
> o.s.j.s.ServletContextHandler@51e0301d{/metrics/json,null,AVAILABLE,@Spark}
> +---+---+--+
> | c1| c2|number|
> +---+---+--+
> | a1|  b| 1|
> | a2|  b| 2|
> | a3|  b| 3|
> | a4|  b| 4|
> | a5|  b| 5|
> | a6|  b| 6|
> | a7|  b| 7|
> | a8|  b| 8|
> | a9|  b| 9|
> | a0|  b|10|
> +---+---+--+
> ALTER ADD COLUMNS does not support datasource table with type carbon.
> You must drop and re-create the table for adding the new columns. Tables: 
> `carbon_table`
>  ;
> org.apache.spark.sql.AnalysisException: 
> ALTER ADD COLUMNS does not support datasource table with type carbon.
> You must drop and re-create the table for adding the new columns. Tables: 
> `carbon_table`
>  ;
>   at 
> org.apache.spark.sql.execution.command.AlterTableAddColumnsCommand.verifyAlterTableAddColumn(tables.scala:242)
>   at 
> org.apache.spark.sql.execution.command.AlterTableAddColumnsCommand.run(tables.scala:194)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>   at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
>   at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3259)
>   at 
> org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
>   at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3258)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:190)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
>   at 
> org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest$$anonfun$3.apply$mcV$sp(SparkCarbonDataSourceTest.scala:100)
>   at 
> org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest$$anonfun$3.apply(SparkCarbonDataSourceTest.scala:80)
>   at 
> org.apache.spark.sql.carbondata.datasource.SparkCarbonDataSourceTest$$anonfun$3.apply(SparkCarbonDataSourceTest.scala:80)
>   at 
> org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
>   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
>   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
>   at org.scalatest.Transformer.apply(Transformer.scala:22)
>   at 

[jira] [Resolved] (CARBONDATA-3216) There are some bugs in CSDK

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3216.
-
Resolution: Fixed

> There are some bugs in CSDK
> ---
>
> Key: CARBONDATA-3216
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3216
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.5.1
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>
> There are some bugs in CSDK:
>  1. enableLocalDictionary can't be set to false
> code:
> {code:java}
>writer.enableLocalDictionary(false);
> {code}
> exception:
> {code:java}
> libc++abi.dylib: terminating with uncaught exception of type 
> std::runtime_error: enableLocalDictionary parameter can't be NULL.
> {code}
> 2.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3215) Optimize the documentation

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3215.
-
Resolution: Fixed

> Optimize the documentation
> --
>
> Key: CARBONDATA-3215
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3215
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: 1.5.1
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> Optimize the documentation:
> # describe global dictionary, local dictionary and non-dictionary together in the doc
> # list the MV DataMap in the DataMap list



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3144) CarbonData support spark-2.4.0

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3144:

Fix Version/s: (was: 1.5.2)

> CarbonData support spark-2.4.0
> --
>
> Key: CARBONDATA-3144
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3144
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
>   Spark released spark-2.4 more than a month ago. CarbonData should start 
> to support spark-2.4.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3151) SDK supports LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3151:

Fix Version/s: (was: 1.5.2)
   NONE

> SDK supports LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE
> --
>
> Key: CARBONDATA-3151
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3151
> Project: CarbonData
>  Issue Type: New Feature
>Affects Versions: 1.5.1
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: NONE
>
>
> When users use the SDK and want to use the LOCAL DICTIONARY, they can't use 
> LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE because the SDK only 
> supports local_dictionary_threshold and local_dictionary_enable.
> So we should support LOCAL_DICTIONARY_INCLUDE and LOCAL_DICTIONARY_EXCLUDE 
> in the SDK, so that users can include or exclude a subset of columns.
> http://apache-carbondata-dev-mailing-list-archive.1130556.n5.nabble.com/SDK-supports-LOCAL-DICTIONARY-INCLUDE-and-LOCAL-DICTIONARY-EXCLUDE-td69870.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-3043) Add test framework for CSDK

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-3043:

Fix Version/s: (was: 1.5.2)
   NONE

> Add test framework for CSDK
> ---
>
> Key: CARBONDATA-3043
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3043
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xubo245
>Assignee: Babulal
>Priority: Major
> Fix For: NONE
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> Add a test framework for CSDK, for unit tests.
> googletest is a popular test framework; we can try to use it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2420) Support string longer than 32000 characters

2019-01-31 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala updated CARBONDATA-2420:

Fix Version/s: (was: 1.5.2)
   1.5.1

> Support string longer than 32000 characters
> ---
>
> Key: CARBONDATA-2420
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2420
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 19h 40m
>  Remaining Estimate: 0h
>
> Add a table property 'long_string_columns' at create time to support string 
> columns that contain more than 32000 characters.
> Inside carbondata, an integer instead of a short is used to store the length 
> of the bytes content.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3282) presto carbon doesn't work with Hadoop conf in cluster.

2019-01-30 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3282.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> presto carbon doesn't work with Hadoop conf in cluster.
> ---
>
> Key: CARBONDATA-3282
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3282
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> problem:
> when the datamap path is given, presto carbon throws a 'hacluster' unknown 
> host exception even when the hdfs configuration is present.
>  
> solution: in the HDFS environment, set the hadoop configuration on a thread 
> local, so that FileFactory can use this configuration.
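The thread-local approach can be sketched with plain Java. This is a simplified, hypothetical illustration (the configuration is modeled as a map rather than Hadoop's `Configuration`, and the names are invented): the caller binds its configuration to the thread, and a FileFactory-style helper reads it back on any later call from that thread.

```java
import java.util.Map;

// Hypothetical sketch: hold the caller's configuration in a ThreadLocal
// so file-access helpers on the same thread can pick it up.
public class ThreadLocalConf {
    private static final ThreadLocal<Map<String, String>> CONF =
        new ThreadLocal<>();

    // Called from the environment setup (e.g. when the connector starts).
    static void setConfiguration(Map<String, String> conf) {
        CONF.set(conf);
    }

    // FileFactory-style lookup: use the thread's configuration if bound,
    // otherwise fall back to the default.
    static String get(String key, String defaultValue) {
        Map<String, String> conf = CONF.get();
        return (conf != null && conf.containsKey(key))
            ? conf.get(key) : defaultValue;
    }
}
```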



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3273) For table without SORT_COLUMNS, Loading data is showing SORT_SCOPE=LOCAL_SORT instead of NO_SORT

2019-01-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3273.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> For table without SORT_COLUMNS, Loading data is showing SORT_SCOPE=LOCAL_SORT 
> instead of NO_SORT
> 
>
> Key: CARBONDATA-3273
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3273
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3269) Range_column throwing ArrayIndexOutOfBoundsException when using KryoSerializer

2019-01-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3269.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> Range_column throwing ArrayIndexOutOfBoundsException when using KryoSerializer
> --
>
> Key: CARBONDATA-3269
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3269
> Project: CarbonData
>  Issue Type: Bug
>Reporter: QiangCai
>Assignee: QiangCai
>Priority: Critical
> Fix For: 1.5.2
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Reproduce:
> For the range_column feature, when we set "spark.serializer" to 
> "org.apache.spark.serializer.KryoSerializer", data loading throws an 
> ArrayIndexOutOfBoundsException.
> Exception:
> 2019-01-25 13:00:19 ERROR DataLoadProcessorStepOnSpark$:367 - Data Loading 
> failed for table carbon_range_column4
>  java.lang.ArrayIndexOutOfBoundsException: 5
>  at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  2019-01-25 13:00:19 ERROR TaskContextImpl:91 - Error in TaskFailureListener
>  
> org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException:
>  Data Loading failed for table carbon_range_column4
>  at 
> org.apache.carbondata.spark.load.DataLoadProcessorStepOnSpark$.org$apache$carbondata$spark$load$DataLoadProcessorStepOnSpark$$wrapException(DataLoadProcessorStepOnSpark.scala:368)
>  at 
> org.apache.carbondata.spark.load.DataLoadProcessorStepOnSpark$$anonfun$convertFunc$3.apply(DataLoadProcessorStepOnSpark.scala:215)
>  at 
> org.apache.carbondata.spark.load.DataLoadProcessorStepOnSpark$$anonfun$convertFunc$3.apply(DataLoadProcessorStepOnSpark.scala:210)
>  at org.apache.spark.TaskContext$$anon$2.onTaskFailure(TaskContext.scala:144)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskFailed$1.apply(TaskContextImpl.scala:107)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$markTaskFailed$1.apply(TaskContextImpl.scala:107)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:130)
>  at 
> org.apache.spark.TaskContextImpl$$anonfun$invokeListeners$1.apply(TaskContextImpl.scala:128)
>  at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>  at 
> org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:128)
>  at org.apache.spark.TaskContextImpl.markTaskFailed(TaskContextImpl.scala:106)
>  at org.apache.spark.scheduler.Task.run(Task.scala:113)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
>  Caused by: java.lang.ArrayIndexOutOfBoundsException: 5
>  at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  ... 4 more



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3268) Query on Varchar showing as Null in Presto

2019-01-28 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3268.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> Query on Varchar showing as Null in Presto
> --
>
> Key: CARBONDATA-3268
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3268
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3267) Data loading is failing with OOM using range sort

2019-01-24 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3267.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> Data loading is failing with OOM using range sort
> -
>
> Key: CARBONDATA-3267
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3267
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> h3. Problem:
> Range sort is failing with OOM.
> h3. Root cause:
> UnsafeSortStorageMemory is not able to control the off-heap memory, so when 
> huge data is loaded an OOM exception is thrown from 
> UnsafeMemoryAllocator.allocate.
> h3. Solution:
> Control the sort storage memory: after sorting the rows, add the sorted 
> records to sort storage memory only if memory is available; otherwise write 
> them to disk.
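The spill-when-full policy can be sketched in a few lines. This is a minimal illustration with invented names, not CarbonData's actual `UnsafeSortStorageMemory` code: each sorted page is kept in memory only while it fits within the budget, and is otherwise counted as spilled to disk.

```java
// Hypothetical sketch: bound sort storage memory by keeping a sorted page
// in memory only when budget remains, spilling to disk otherwise.
public class SortStorage {
    private final long memoryLimitBytes;
    private long usedBytes;
    long inMemoryPages, spilledPages;

    SortStorage(long memoryLimitBytes) {
        this.memoryLimitBytes = memoryLimitBytes;
    }

    void addSortedPage(long pageSizeBytes) {
        if (usedBytes + pageSizeBytes <= memoryLimitBytes) {
            usedBytes += pageSizeBytes;   // memory available: keep in memory
            inMemoryPages++;
        } else {
            spilledPages++;               // budget exceeded: write to disk
        }
    }
}
```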



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3262) Failure to write merge index file results in merged segment being deleted when cleanup happens

2019-01-23 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3262.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> Failure to write merge index file results in merged segment being deleted 
> when cleanup happens
> --
>
> Key: CARBONDATA-3262
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3262
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3261) support float and byte reading from presto

2019-01-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3261.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> support float and byte reading from presto
> --
>
> Key: CARBONDATA-3261
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3261
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Problem: float and byte columns cannot be read from Presto.
> Cause: due to a code issue, float and byte values were treated as the double 
> data type, so an array-out-of-bounds error occurred when float/byte values 
> were read through the double stream reader.
> Solution: implement dedicated stream readers for float and byte.
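The width mismatch behind the bug can be shown with a minimal sketch, assuming nothing about the real Presto reader classes: a float occupies 4 bytes, so a reader that consumes 8 bytes per value (as a double reader does) runs past the end of the buffer, while a dedicated float reader sizes its reads correctly. The class and method names here are illustrative only.

```java
import java.nio.ByteBuffer;

public class FloatReaderSketch {
    /** Dedicated float reader: consumes exactly 4 bytes per value. */
    static float[] readFloats(byte[] data, int count) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        float[] out = new float[count];
        for (int i = 0; i < count; i++) {
            out[i] = buf.getFloat();
        }
        return out;
    }

    public static void main(String[] args) {
        // Two floats -> 8 bytes total.
        ByteBuffer buf = ByteBuffer.allocate(8);
        buf.putFloat(1.5f).putFloat(2.5f);
        float[] vals = readFloats(buf.array(), 2);
        System.out.println(vals[0] + "," + vals[1]); // 1.5,2.5
        // Reading the same 8 bytes as two doubles (8 bytes each) would need
        // 16 bytes and fail partway through -- the essence of the reported bug.
    }
}
```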





[jira] [Resolved] (CARBONDATA-3242) Range_Column should be table level property

2019-01-21 Thread Ravindra Pesala (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ravindra Pesala resolved CARBONDATA-3242.
-
   Resolution: Fixed
Fix Version/s: 1.5.2

> Range_Column should be table level property
> ---
>
> Key: CARBONDATA-3242
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3242
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: QiangCai
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 6h 10m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3260) Broadcast join is not working properly in carbon with spark-2.3.2

2019-01-20 Thread Ravindra Pesala (JIRA)
Ravindra Pesala created CARBONDATA-3260:
---

 Summary: Broadcast join is not working properly in carbon with spark-2.3.2
 Key: CARBONDATA-3260
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3260
 Project: CarbonData
  Issue Type: Bug
Reporter: Ravindra Pesala


It seems the stats that come from the Hive catalog table give wrong data sizes 
for carbon tables. Because of that, even large tables end up being chosen for 
broadcast join.
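A minimal sketch of the decision being described: Spark's planner compares the size reported in table statistics against the broadcast threshold (spark.sql.autoBroadcastJoinThreshold, 10 MB by default), so a wrongly small stat makes even a huge table eligible for broadcast. The class and method names below are hypothetical, written only to illustrate the comparison.

```java
public class BroadcastDecisionSketch {
    /** A table is broadcast-eligible when its reported size is within the threshold. */
    static boolean canBroadcast(long statsSizeInBytes, long thresholdBytes) {
        return statsSizeInBytes >= 0 && statsSizeInBytes <= thresholdBytes;
    }

    public static void main(String[] args) {
        long threshold = 10L * 1024 * 1024;            // default 10 MB threshold
        long actualSize = 5L * 1024 * 1024 * 1024;     // real table size: 5 GB
        long wrongStat = 1024;                         // bogus size from catalog stats
        System.out.println(canBroadcast(actualSize, threshold)); // false: too big
        System.out.println(canBroadcast(wrongStat, threshold));  // true: wrongly broadcast
    }
}
```

Since the planner only ever sees the reported stat, fixing the size that carbon tables report to the catalog is what prevents the bad broadcast.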




