[jira] [Resolved] (CARBONDATA-3233) JVM is getting crashed during dataload while compressing in snappy

2019-01-17 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3233.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> JVM is getting crashed during dataload while compressing in snappy
> --
>
> Key: CARBONDATA-3233
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3233
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> When a huge data load is performed, the load sometimes fails and the JVM 
> crashes during Snappy compression.
>  
> The log is below:
> Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
>  j org.xerial.snappy.SnappyNative.rawCompress(JJJ)J+0
>  j 
> org.apache.carbondata.core.datastore.compression.SnappyCompressor.rawCompress(JIJ)J+9
>  j 
> org.apache.carbondata.core.datastore.page.UnsafeFixLengthColumnPage.compress(Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+50
>  j 
> org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveCodec.encodeAndCompressPage(Lorg/apache/carbondata/core/datastore/page/ColumnPage;Lorg/apache/carbondata/core/datastore/page/ColumnPageValueConverter;Lorg/apache/carbondata/core/datastore/compression/Compressor;)[B+85
>  j 
> org.apache.carbondata.core.datastore.page.encoding.adaptive.AdaptiveDeltaIntegralCodec$1.encodeData(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)[B+45
>  j 
> org.apache.carbondata.core.datastore.page.encoding.ColumnPageEncoder.encode(Lorg/apache/carbondata/core/datastore/page/ColumnPage;)Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+2
>  j 
> org.apache.carbondata.processing.store.TablePage.encodeAndCompressMeasures()[Lorg/apache/carbondata/core/datastore/page/encoding/EncodedColumnPage;+54
>  j org.apache.carbondata.processing.store.TablePage.encode()V+6
>  j 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.processDataRows(Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+86
>  j 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar.access$500(Lorg/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar;Ljava/util/List;)Lorg/apache/carbondata/processing/store/TablePage;+2
>  j 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Void;+8
>  j 
> org.apache.carbondata.processing.store.CarbonFactDataHandlerColumnar$Producer.call()Ljava/lang/Object;+1
>  j java.util.concurrent.FutureTask.run()V+42
>  j 
> java.util.concurrent.ThreadPoolExecutor.runWorker(Ljava/util/concurrent/ThreadPoolExecutor$Worker;)V+95
>  j java.util.concurrent.ThreadPoolExecutor$Worker.run()V+5
>  j java.lang.Thread.run()V+11
>  v ~StubRoutines::call_stub



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-3241) Refactor the requested scan columns and the projection columns

2019-01-15 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3241.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Refactor the requested scan columns and the projection columns
> --
>
> Key: CARBONDATA-3241
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3241
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Trivial
> Fix For: 1.5.2
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3223) Datasize and Indexsize showing 0B for 1.1 store when show segments is done

2019-01-06 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3223.
--
   Resolution: Fixed
Fix Version/s: 1.5.3

> Datasize and Indexsize showing 0B for 1.1 store when show segments is done
> --
>
> Key: CARBONDATA-3223
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3223
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.3
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>
> # Create table and load in 1.1 store.
>  # Refresh and Load in 1.5.1 version.
>  # Show Segments on the table will give 0B for the older segment.





[jira] [Created] (CARBONDATA-3217) Optimize implicit filter expression performance by removing extra serialization

2018-12-31 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-3217:


 Summary: Optimize implicit filter expression performance by 
removing extra serialization
 Key: CARBONDATA-3217
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3217
 Project: CarbonData
  Issue Type: Bug
Reporter: Manish Gupta


# Currently all the filter values are serialized for every task, which 
increases the scheduler delay and thereby impacts query performance.
 # For each task, deserialization takes place twice on the executor side, 
which is not required; once is sufficient.
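
The idea in the two points above can be sketched as follows. This is only an illustration in Python using `pickle` as a stand-in serializer; the names `filter_values`, `num_tasks`, and the payload lists are hypothetical and do not correspond to CarbonData or Spark code.

```python
import pickle

filter_values = list(range(1000))  # stand-in for the implicit filter values
num_tasks = 8                      # hypothetical number of tasks

# Per-task serialization: the same payload is serialized once for every task.
per_task_payloads = [pickle.dumps(filter_values) for _ in range(num_tasks)]

# Serialize once, then hand the same bytes object to every task.
shared_payload = pickle.dumps(filter_values)
shared_payloads = [shared_payload] * num_tasks

# Executor side: deserialize a task's payload once and reuse the result
# instead of deserializing it a second time.
decoded = pickle.loads(shared_payloads[0])
```

In the second variant the serialization cost is paid once regardless of the number of tasks, which is the saving the issue is asking for.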





[jira] [Resolved] (CARBONDATA-3202) updated schema is not updated in session catalog after add, drop or rename column.

2018-12-30 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3202.
--
   Resolution: Fixed
 Assignee: Akash R Nilugal
Fix Version/s: 1.5.3

> updated schema is not updated in session catalog after add, drop or rename 
> column. 
> ---
>
> Key: CARBONDATA-3202
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3202
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 1.5.3
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> updated schema is not updated in session catalog after add, drop or rename 
> column. 
>  
> Spark does not support drop column or rename column, and supports add column 
> only from Spark 2.2 onwards, so after a rename, add, or drop column operation 
> the updated schema is not refreshed in the catalog.





[jira] [Resolved] (CARBONDATA-3203) Compaction failing for table which is restructured

2018-12-28 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3203.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Compaction failing for table which is restructured
> ---
>
> Key: CARBONDATA-3203
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3203
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.2
>
>
> Steps to reproduce:
>  # Create table with complex and primitive types.
>  # Load data 2-3 times.
>  # Drop one column.
>  # Trigger Compaction.





[jira] [Resolved] (CARBONDATA-3196) Compaction Failing for Complex datatypes with Dictionary Include

2018-12-28 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3196.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Compaction Failing for Complex datatypes with Dictionary Include
> 
>
> Key: CARBONDATA-3196
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3196
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Minor
> Fix For: 1.5.2
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
>  # Create Table with Complex type and Dictionary Include Complex type.
>  # Load data into the table 2-3 times.
>  # Alter table compact 'major'





[jira] [Resolved] (CARBONDATA-45) Support MAP type

2018-12-14 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-45.

   Resolution: Fixed
 Assignee: Manish Gupta  (was: Venkata Ramana G)
Fix Version/s: 1.5.2

> Support MAP type
> 
>
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, sql
>Reporter: cen yuhai
>Assignee: Manish Gupta
>Priority: Major
> Fix For: 1.5.2
>
> Attachments: MAP DATA-TYPE SUPPORT.pdf
>
>
> {code:sql}
> >>CREATE TABLE table1 (
>  deviceInformationId int,
>  channelsId string,
>  props map<int,string>)
>   STORED BY 'org.apache.carbondata.format'
> >>insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
> {code}
> format of data to be read from csv, with '$' as level 1 delimiter and map 
> keys terminated by '#'
> {code:sql}
> >>load data local inpath '/tmp/data.csv' into table1 options 
> >>('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 
> >>'COMPLEX_DELIMITER_FOR_KEY'='#')
> 20,channel2,2#user2$100#usercommon
> 30,channel3,3#user3$100#usercommon
> 40,channel4,4#user3$100#usercommon
> >>select channelsId, props[100] from table1 where deviceInformationId > 10;
> 20, usercommon
> 30, usercommon
> 40, usercommon
> >>select channelsId, props from table1 where props[2] = 'user2';
> 20, {2,'user2', 100, 'usercommon'}
> {code}
> The following cases need to be handled:
> ||Sub feature||Pending activity||Remarks||
> |Basic Maptype support|Develop|Create table DDL, load map data from CSV, select * from maptable|
> |Maptype lookup in projection and filter|Develop|Projection and filters need execution at Spark|
> |NULL values, UDFs, Describe support|Develop||
> |Compaction support|Test + fix|As compaction works at byte level, no changes are required. Test cases need to be added|
> |Insert into table|Develop|Source table data containing Map data needs to be converted from the Spark datatype to string, as Carbon takes a string as the input row|
> |Support DDL for Map fields Dictionary Include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle the same|
> |Support multilevel Map|Develop|Currently the DDL is validated to allow only 2 levels; remove this restriction|
> |Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs to change|
> |Support Alter table to add and remove Map column|Develop|Implement DDL; requires default value handling|
> |Projections of Map lookup push down to Carbon|Develop|This is an optimization for when a large number of values are present in the Map|
> |Filter Map lookup push down to Carbon|Develop|This is an optimization for when a large number of values are present in the Map|
> |Update Map values|Develop|Update map value|
> h4. Design suggestion:
> Map can be stored internally as Array<Struct<key,value>>, so conversion to 
> the Map data type is required when handing the data to Spark. The schema 
> will have a new column of map type, similar to Array.
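
A minimal sketch of how one map field from the CSV load example above could be split into an ordered list of key/value pairs, which is roughly the array-of-pairs shape the design suggestion describes. The function name `parse_map_field` and the use of Python are illustrative assumptions, not CarbonData code; only the `$` and `#` delimiters and the sample row come from the issue.

```python
from typing import List, Tuple


def parse_map_field(field: str, entry_delim: str = "$",
                    key_delim: str = "#") -> List[Tuple[str, str]]:
    """Split one raw CSV map field into an ordered list of (key, value) pairs."""
    if not field:
        return []
    pairs = []
    for entry in field.split(entry_delim):
        # partition() splits at the first key delimiter only,
        # so a value may itself contain the delimiter character.
        key, _, value = entry.partition(key_delim)
        pairs.append((key, value))
    return pairs


# Sample row taken from the load example in the issue.
row = "20,channel2,2#user2$100#usercommon"
device_id, channel, raw_map = row.split(",", 2)
props = parse_map_field(raw_map)
```

Keeping the pairs in a list rather than a dictionary preserves the input order and mirrors the array representation; converting to a real map type would then happen only at the boundary where the data is handed over.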





[jira] [Resolved] (CARBONDATA-3017) Create DDL Support for Map Type

2018-12-14 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3017.
--
   Resolution: Fixed
Fix Version/s: 1.5.2

> Create DDL Support for Map Type
> ---
>
> Key: CARBONDATA-3017
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3017
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Major
> Fix For: 1.5.2
>
>  Time Spent: 13h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3134) Wrong result when a column is dropped and added using alter with blocklet cache.

2018-11-28 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3134.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Wrong result when a column is dropped and added using alter with blocklet 
> cache.
> 
>
> Key: CARBONDATA-3134
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3134
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:*
> spark.sql("drop table if exists tile")
> spark.sql("create table tile(b int, s int,bi bigint, t timestamp) partitioned 
> by (i int) stored by 'carbondata' TBLPROPERTIES 
> ('DICTIONARY_EXCLUDE'='b,s,i,bi,t','SORT_COLUMS'='b,s,i,bi,t', 
> 'cache_level'='blocklet')")
> spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into 
> table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
> spark.sql("select * from tile")
>  spark.sql("alter table tile drop columns(t)")
>  spark.sql("alter table tile add columns(t timestamp)")
>  spark.sql("load data inpath 'C:/Users/k00475610/Documents/en_all.csv' into 
> table tile options('fileheader'='b,s,i,bi,t','DELIMITER'=',')")
> spark.sql("select * from tile").show()
>  
> *Result:*
> +---+---+-----------+----+----+
> |  b|  s|         bi|   t|   i|
> +---+---+-----------+----+----+
> |100|  2|93405673097|null|1644|
> |100|  2|93405673097|null|1644|
> +---+---+-----------+----+----+





[jira] [Resolved] (CARBONDATA-3113) Fixed Local Dictionary Query Performance and Added reusable buffer for direct flow

2018-11-21 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3113.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Fixed Local Dictionary Query Performance  and Added reusable buffer for 
> direct flow
> ---
>
> Key: CARBONDATA-3113
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3113
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> 1. Added reusable buffer for direct flow
> During a query, a byte array is created for each column of each page. When 
> the number of columns is high, this causes many minor GCs and degrades query 
> performance. As pages are uncompressed one by one, the same buffer can be 
> used for all the columns, and it is resized based on the requested size.
> 2. Fixed Local Dictionary performance issue.
> Reverted #2895 and fixed the NPE issue by setting null for the local 
> dictionary to the vector in the safe and unsafe VariableLengthDataChunkStore.
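
The reusable-buffer idea in point 1 can be sketched as below. This is a simplified illustration in Python, not the CarbonData implementation; the class name `ReusableBuffer`, its `get` method, and the `allocations` counter are hypothetical.

```python
class ReusableBuffer:
    """Hands out one backing array, reallocating only when a request exceeds it."""

    def __init__(self) -> None:
        self._data = bytearray()
        self.allocations = 0  # how many times the backing array was allocated

    def get(self, requested_size: int) -> memoryview:
        if requested_size > len(self._data):
            # Grow by allocating a larger array; smaller requests reuse it.
            self._data = bytearray(requested_size)
            self.allocations += 1
        # Return a view of exactly the requested length over the shared storage.
        return memoryview(self._data)[:requested_size]


buf = ReusableBuffer()
for page_size in (4096, 1024, 4096, 8192, 2048):
    page = buf.get(page_size)  # one column page is uncompressed at a time
    page[0] = 1                # stand-in for writing uncompressed bytes
```

Five pages are served with two allocations here instead of five. The sketch relies on the same property the issue states: pages are processed one at a time, so a view handed out earlier is no longer needed when the next one is requested.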





[jira] [Resolved] (CARBONDATA-3112) Optimise decompressing while filling the vector during conversion of primitive types

2018-11-20 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3112.
--
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.5.1

> Optimise decompressing while filling the vector during conversion of 
> primitive types
> 
>
> Key: CARBONDATA-3112
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3112
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> We can possibly avoid one copy by filling the vector during the conversion of 
> primitive types in codecs.
>  





[jira] [Resolved] (CARBONDATA-3088) enhance compaction performance by using prefetch

2018-11-20 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3088.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> enhance compaction performance by using prefetch
> 
>
> Key: CARBONDATA-3088
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3088
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3106) Written_BY_APPNAME is not serialized in executor with GlobalSort

2018-11-19 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3106.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Written_BY_APPNAME is not serialized in executor with GlobalSort
> 
>
> Key: CARBONDATA-3106
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3106
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Problem:
> Written_By_APPNAME when added in carbonproperty is not serialized in executor 
> with global sort
> Steps to Reproduce:
>  # Create table and set sort_scope='global_sort'
>  # Load data into table and find the exception
> *Exception: There is an unexpected error: null*
> NOTE: This issue is reproducible only if driver and executor are running in a 
> different JVM Process
>  





[jira] [Resolved] (CARBONDATA-3098) Negative value exponents giving wrong results

2018-11-14 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3098.
--
   Resolution: Fixed
 Assignee: MANISH NALLA
Fix Version/s: 1.5.1

> Negative value exponents giving wrong results
> -
>
> Key: CARBONDATA-3098
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3098
> Project: CarbonData
>  Issue Type: Bug
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Problem: When the exponent is a negative number, the stored data is 
> incorrect because of precision loss in floating-point values and a wrong 
> count of decimal places.
>  
> Steps to reproduce: 
> -> "create table float_c(f float) using carbon"
> -> "insert into float_c select '1.4E-38' "
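
To illustrate why the exponent matters when counting decimal places, here is a small sketch using Python's `decimal` module on the value from the reproduction step. It is not the CarbonData fix; both function names are hypothetical, and the naive variant is shown only as a contrast.

```python
from decimal import Decimal


def decimal_places(text: str) -> int:
    """Digits needed after the decimal point to write a finite value in plain notation."""
    exponent = Decimal(text).as_tuple().exponent
    return max(0, -exponent)


def naive_decimal_places(text: str) -> int:
    """Counts characters after '.' in the mantissa, ignoring the exponent."""
    mantissa = text.upper().split("E")[0]
    return len(mantissa.split(".")[1]) if "." in mantissa else 0


exact = decimal_places("1.4E-38")
naive = naive_decimal_places("1.4E-38")
```

For `'1.4E-38'` the exponent-aware count is 39, while the naive count is 1, so any scaling based on the naive count would lose the value entirely.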





[jira] [Resolved] (CARBONDATA-3081) NPE when boolean column has null values with Vectorized SDK reader

2018-11-13 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3081.
--
   Resolution: Fixed
 Assignee: Kunal Kapoor
Fix Version/s: 1.5.1

> NPE when boolean column has null values with Vectorized SDK reader
> --
>
> Key: CARBONDATA-3081
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3081
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 4h 40m
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-3077) Fixed query failure in fileformat due stale cache issue

2018-11-05 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-3077:
-
Attachment: 20181102101536.jpg

> Fixed query failure in fileformat due stale cache issue
> ---
>
> Key: CARBONDATA-3077
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3077
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
> Attachments: 20181102101536.jpg
>
>
> *Problem*
> While using the FileFormat API, if a table is created, dropped, and then 
> recreated with the same name, the query fails because of a schema mismatch.





[jira] [Created] (CARBONDATA-3077) Fixed query failure in fileformat due stale cache issue

2018-11-05 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-3077:


 Summary: Fixed query failure in fileformat due stale cache issue
 Key: CARBONDATA-3077
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3077
 Project: CarbonData
  Issue Type: Bug
Reporter: Manish Gupta
Assignee: Manish Gupta


*Problem*
While using the FileFormat API, if a table is created, dropped, and then 
recreated with the same name, the query fails because of a schema mismatch.
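
The failure mode can be illustrated with a cache keyed only by table name. The issue does not describe the actual fix, so the sketch below is an assumption about the general shape of the problem; `SchemaCache` and its methods are hypothetical names, not CarbonData classes.

```python
class SchemaCache:
    """Schema cache keyed by table name, with explicit invalidation."""

    def __init__(self) -> None:
        self._schemas = {}

    def get(self, table: str, load_schema):
        # Load the schema only on a cache miss.
        if table not in self._schemas:
            self._schemas[table] = load_schema()
        return self._schemas[table]

    def invalidate(self, table: str) -> None:
        self._schemas.pop(table, None)


cache = SchemaCache()
first = cache.get("t1", lambda: ["a int"])

# The table is dropped and recreated with a new schema, but the cache entry
# is not invalidated, so the old schema is still returned.
stale = cache.get("t1", lambda: ["a int", "b string"])

# Invalidating on drop makes the next lookup load the new schema.
cache.invalidate("t1")
fresh = cache.get("t1", lambda: ["a int", "b string"])
```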





[jira] [Resolved] (CARBONDATA-3057) Implement Vectorized CarbonReader for SDK

2018-11-04 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3057.
--
   Resolution: Fixed
 Assignee: Naman Rastogi
Fix Version/s: 1.5.1

> Implement Vectorized CarbonReader for SDK
> -
>
> Key: CARBONDATA-3057
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3057
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Naman Rastogi
>Assignee: Naman Rastogi
>Priority: Minor
> Fix For: 1.5.1
>
>  Time Spent: 16h 10m
>  Remaining Estimate: 0h
>
> Implement a Vectorized Reader and expose an API for the user to switch
> between CarbonReader and the Vectorized reader. Additionally, an API would be
> provided for the user to extract the columnar batch instead of rows. This
> would allow the user to have a deeper integration with Carbon.
> The reduction in method calls for the vector reader would also improve
> the read time.





[jira] [Resolved] (CARBONDATA-3066) ADD documentation for new APIs in SDK

2018-11-02 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3066.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> ADD documentation for new APIs in SDK
> -
>
> Key: CARBONDATA-3066
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3066
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 1.5.1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> ADD documentation for new APIs in SDK





[jira] [Resolved] (CARBONDATA-3062) Fix Compatibility issue with cache_level as blocklet

2018-11-01 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3062.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Fix Compatibility issue with cache_level as blocklet
> 
>
> Key: CARBONDATA-3062
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3062
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Please find below steps to reproduce the issue:
>  # Create table and load data in legacy store
>  # In new store, load data and alter table set table properties 
> 'CACHE_LEVEL'='BLOCKLET'
>  # Perform Filter operation on that table and find below Exception
>  
> |*Error: java.io.IOException: Problem in loading segment blocks. 
> (state=,code=0)*
>   |





[jira] [Updated] (CARBONDATA-3062) Fix Compatibility issue with cache_level as blocklet

2018-11-01 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-3062:
-
Issue Type: Bug  (was: Improvement)

> Fix Compatibility issue with cache_level as blocklet
> 
>
> Key: CARBONDATA-3062
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3062
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> Please find below steps to reproduce the issue:
>  # Create table and load data in legacy store
>  # In new store, load data and alter table set table properties 
> 'CACHE_LEVEL'='BLOCKLET'
>  # Perform Filter operation on that table and find below Exception
>  
> |*Error: java.io.IOException: Problem in loading segment blocks. 
> (state=,code=0)*
>   |





[jira] [Resolved] (CARBONDATA-3054) Dictionary file cannot be read in S3a with CarbonDictionaryDecoder.doConsume() codeGen

2018-10-31 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3054.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Dictionary file cannot be read in S3a with 
> CarbonDictionaryDecoder.doConsume() codeGen
> --
>
> Key: CARBONDATA-3054
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3054
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ajantha Bhat
>Assignee: Ajantha Bhat
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> problem: In an S3a environment, when data that has dictionary files is queried,
> the dictionary file cannot be read in S3a with 
> CarbonDictionaryDecoder.doConsume() codeGen even though the file is present.
>  
> cause: CarbonDictionaryDecoder.doConsume() codeGen doesn't set the hadoop conf 
> in a thread-local variable; only doExecute() sets it.
> Hence, when getDictionaryWrapper() is called from doConsume() codeGen,
> AbstractDictionaryCache.getDictionaryMetaCarbonFile() returns false for the 
> fileExists() operation.
>  
> solution:
> In CarbonDictionaryDecoder.doConsume() codeGen, set the hadoop conf in a 
> thread-local variable.
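
The cause described above is a general thread-local pitfall: a value set on one code path is invisible to a thread that never set it. The sketch below reproduces that pattern in Python with `threading.local`. All names (`set_conf`, `file_exists`, `run_in_thread`, the `dict.meta` path) are hypothetical stand-ins and not CarbonData or Hadoop APIs.

```python
import threading

_thread_conf = threading.local()  # stand-in for a per-thread configuration holder


def set_conf(conf: dict) -> None:
    _thread_conf.value = conf


def file_exists(path: str) -> bool:
    """Pretend lookup: succeeds only if this thread has a configuration set."""
    conf = getattr(_thread_conf, "value", None)
    return conf is not None and path in conf.get("files", ())


def run_in_thread(conf: dict, set_first: bool) -> bool:
    result = []

    def task() -> None:
        if set_first:
            set_conf(conf)  # the step the issue says was missing on one path
        result.append(file_exists("dict.meta"))

    worker = threading.Thread(target=task)
    worker.start()
    worker.join()
    return result[0]


conf = {"files": {"dict.meta"}}
without_conf = run_in_thread(conf, set_first=False)
with_conf = run_in_thread(conf, set_first=True)
```

The file "exists" in both runs; only the thread that set the configuration first can see it, which matches the reported symptom of `fileExists()` returning false for a file that is present.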





[jira] [Created] (CARBONDATA-3061) Add validation for supported format version and Encoding type to throw proper exception to the user while reading a file

2018-10-30 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-3061:


 Summary: Add validation for supported format version and Encoding 
type to throw proper exception to the user while reading a file
 Key: CARBONDATA-3061
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3061
 Project: CarbonData
  Issue Type: Improvement
Reporter: Manish Gupta
Assignee: Manish Gupta


This Jira is raised to handle forward compatibility. With this PR, if a data 
file is read using a lower version (>=1.5.1), a proper exception is thrown 
when the columnar format version or any encoding type is not supported for 
reading in that version.
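
A validation of this kind could look roughly like the sketch below. The supported version numbers, encoding names, exception class, and function name are all hypothetical placeholders chosen for illustration; they are not taken from the CarbonData file format.

```python
SUPPORTED_FORMAT_VERSIONS = {1, 2, 3}                 # hypothetical values
SUPPORTED_ENCODINGS = {"DICTIONARY", "RLE", "DELTA"}  # hypothetical names


class UnsupportedFileError(Exception):
    """Raised when a file uses a format version or encoding this reader does not know."""


def validate_file_header(format_version: int, encodings: list) -> None:
    # Check the format version first, then every encoding listed in the file.
    if format_version not in SUPPORTED_FORMAT_VERSIONS:
        raise UnsupportedFileError(
            f"unsupported columnar format version: {format_version}")
    unknown = sorted(set(encodings) - SUPPORTED_ENCODINGS)
    if unknown:
        raise UnsupportedFileError(
            f"unsupported encoding type(s): {', '.join(unknown)}")
```

Running the check before any page is decoded means an older reader fails with a clear message instead of an obscure error deep in the read path.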





[jira] [Resolved] (CARBONDATA-3042) Column Schema objects are present in Driver and Executor even after dropping table

2018-10-30 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3042.
--
   Resolution: Fixed
 Assignee: Indhumathi Muthumurugesh
Fix Version/s: 1.5.1

> Column Schema objects are present in Driver and Executor even after dropping 
> table
> --
>
> Key: CARBONDATA-3042
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3042
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-3052) Improve drop table performance by reducing the namenode RPC calls during physical deletion of files

2018-10-29 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-3052:


 Summary: Improve drop table performance by reducing the namenode 
RPC calls during physical deletion of files
 Key: CARBONDATA-3052
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3052
 Project: CarbonData
  Issue Type: Improvement
Reporter: Manish Gupta
Assignee: Manish Gupta


The current drop table command takes more than 1 minute to delete 3000 files 
from HDFS. This Jira is raised to improve the performance of the drop table 
operation.





[jira] [Resolved] (CARBONDATA-2977) Write uncompress_size to ChunkCompressMeta in the file

2018-10-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2977?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2977.
--
   Resolution: Fixed
 Assignee: Jacky Li
Fix Version/s: 1.5.1

> Write uncompress_size to ChunkCompressMeta in the file
> --
>
> Key: CARBONDATA-2977
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2977
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Jacky Li
>Assignee: Jacky Li
>Priority: Major
> Fix For: 1.5.1
>
>  Time Spent: 6h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2998) Refresh column schema for old store(before V3) for SORT_COLUMNS option

2018-10-24 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2998.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Refresh column schema for old store(before V3) for SORT_COLUMNS option
> --
>
> Key: CARBONDATA-2998
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2998
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.5.1
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-3022) Refactor ColumnPageWrapper

2018-10-23 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3022?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-3022.
--
   Resolution: Fixed
Fix Version/s: 1.5.1

> Refactor ColumnPageWrapper
> --
>
> Key: CARBONDATA-3022
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3022
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.5.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-2995) Queries slow down after some time due to broadcast issue

2018-10-08 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2995:


 Summary: Queries slow down after some time due to broadcast issue
 Key: CARBONDATA-2995
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2995
 Project: CarbonData
  Issue Type: Bug
Reporter: Manish Gupta
Assignee: Manish Gupta


*Problem Description*

During consecutive query runs, queries slow down after some time, which 
degrades overall query performance.

No exception is thrown in the driver or executor logs, but the logs show that 
the time taken to broadcast the Hadoop conf increases after every query run.





[jira] [Resolved] (CARBONDATA-2876) Support Avro datatype conversion to Carbon Format

2018-10-08 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2876.
--
   Resolution: Fixed
 Assignee: Indhumathi Muthumurugesh
Fix Version/s: 1.5.0

> Support Avro datatype conversion to Carbon Format
> -
>
> Key: CARBONDATA-2876
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2876
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 12h 50m
>  Remaining Estimate: 0h
>
> 1. Support Avro Complex Types: Enum, Union, Fixed with Carbon.
> 2. Support Avro Logical Types: TimeMillis, TimeMicros, Decimal with Carbon.
>  
> Please find the design document in the below link:
> https://docs.google.com/document/d/1Jne8vNZ3OSYmJ_72hTIk_5I4EeIVtxGNE5mN_hBlnVE/edit?usp=sharing





[jira] [Resolved] (CARBONDATA-2990) JVM crashes when rebuilding the datamap.

2018-10-04 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2990.
--
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.5.0

> JVM crashes when rebuilding the datamap.
> 
>
> Key: CARBONDATA-2990
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2990
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>
> create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='1');
>  
>  LOAD DATA INPATH '/home/root1/Downloads/vardhandaterestruct.csv' INTO TABLE 
> brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> CREATE DATAMAP dm_brinjal ON TABLE brinjal USING 'bloomfilter' DMPROPERTIES 
> ('INDEX_COLUMNS' = 'AMSize', 'BLOOM_SIZE'='64', 'BLOOM_FPP'='0.1');
>  
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 4.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 4.0 (TID 13, 192.168.0.12, executor 11): ExecutorLostFailure (executor 
> 11 exited caused by one of the running tasks) Reason: Remote RPC client 
> disassociated. Likely due to containers exceeding thresholds, or network 
> issues. Check driver logs for WARN messages.
> Driver stacktrace: (state=,code=0)





[jira] [Resolved] (CARBONDATA-2982) CarbonSchemaReader don't support Array

2018-10-03 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2982.
--
Resolution: Fixed

> CarbonSchemaReader don't support Array
> --
>
> Key: CARBONDATA-2982
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2982
> Project: CarbonData
>  Issue Type: Bug
>  Components: other
>Affects Versions: 1.5.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> CarbonSchemaReader does not support the Array data type.
> The problem appears when the schema is read from an index file and the data 
> includes an array column.
> Run org.apache.carbondata.examples.sdk.CarbonReaderExample:
> {code:java}
> Schema schema = CarbonSchemaReader
>     .readSchemaInIndexFile(dataFiles[0].getAbsolutePath())
>     .asOriginOrder();
> // Transform the schema
> String[] strings = new String[schema.getFields().length];
> for (int i = 0; i < schema.getFields().length; i++) {
>   strings[i] = (schema.getFields())[i].getFieldName();
>   System.out.println(strings[i] + "\t" + 
>       schema.getFields()[i].getSchemaOrdinal());
> }
> {code}
> It prints the following output and then throws an exception:
> {code:java}
> log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
> info.
> arrayfield.val0   -1
> stringfield       0
> shortfield        1
> intfield          2
> longfield         3
> doublefield       4
> boolfield         5
> datefield         6
> timefield         7
> decimalfield      8
> varcharfield      9
> arrayfield        10
> Complex child columns projection NOT supported through CarbonReader
> java.lang.UnsupportedOperationException: Complex child columns projection NOT 
> supported through CarbonReader
>   at 
> org.apache.carbondata.sdk.file.CarbonReaderBuilder.build(CarbonReaderBuilder.java:155)
>   at 
> org.apache.carbondata.examples.sdk.CarbonReaderExample.main(CarbonReaderExample.java:110)
> {code}
> It prints arrayfield.val0 with ordinal -1, which is a child schema entry.
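
The output above suggests that the child entry (arrayfield.val0) is the only
one with ordinal -1. As a rough illustration only, and not the fix that was
committed for this issue, a caller could drop such entries before building a
projection. The class and method names below are hypothetical and the sketch
uses plain arrays instead of the CarbonData Schema API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of CarbonData: keeps only the field names
// whose schema ordinal is non-negative. It assumes (from the log output in
// the issue) that child columns of complex types are reported with -1.
final class TopLevelFieldFilterSketch {

    static List<String> topLevelNames(String[] names, int[] ordinals) {
        if (names.length != ordinals.length) {
            throw new IllegalArgumentException("names and ordinals differ in length");
        }
        List<String> result = new ArrayList<>();
        for (int i = 0; i < names.length; i++) {
            // Skip entries such as "arrayfield.val0" that carry ordinal -1.
            if (ordinals[i] >= 0) {
                result.add(names[i]);
            }
        }
        return result;
    }
}
```

The input order is preserved, so the remaining names can be passed on as a
projection list in the order they were read.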





[jira] [Resolved] (CARBONDATA-2980) clear bloomindex cache when dropping datamap

2018-09-30 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2980.
--
   Resolution: Fixed
Fix Version/s: 1.5.0

> clear bloomindex cache when dropping datamap
> 
>
> Key: CARBONDATA-2980
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2980
> Project: CarbonData
>  Issue Type: Bug
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> The bloomindex cache should be cleared when a datamap is dropped. Otherwise, 
> if a table and datamap are dropped and recreated, queries fail because the 
> stale cache still exists.
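
A minimal sketch of the idea, assuming a cache keyed by table, datamap and
shard name. The class, the key layout and the method names are hypothetical
and do not reflect the actual CarbonData cache classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory cache keyed by "table/dataMap/shard".
final class BloomCacheSketch {

    private final Map<String, byte[]> cache = new ConcurrentHashMap<>();

    private static String key(String table, String dataMap, String shard) {
        return table + "/" + dataMap + "/" + shard;
    }

    void put(String table, String dataMap, String shard, byte[] bloomFilter) {
        cache.put(key(table, dataMap, shard), bloomFilter);
    }

    byte[] get(String table, String dataMap, String shard) {
        return cache.get(key(table, dataMap, shard));
    }

    // Called on DROP DATAMAP: removes every entry of that datamap so that a
    // recreated table/datamap with the same name never sees stale filters.
    // Returns the number of entries removed.
    int dropDataMap(String table, String dataMap) {
        String prefix = table + "/" + dataMap + "/";
        int before = cache.size();
        cache.keySet().removeIf(k -> k.startsWith(prefix));
        return before - cache.size();
    }
}
```

Entries of other datamaps on the same table are left untouched, since only
keys under the dropped datamap's prefix are removed.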





[jira] [Assigned] (CARBONDATA-2979) select count fails when carbondata file is written through SDK and read through sparkfileformat for complex datatype map(struct->array->map)

2018-09-27 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta reassigned CARBONDATA-2979:


Assignee: Manish Gupta

> select count fails when carbondata file is written through SDK and read 
> through sparkfileformat for complex datatype map(struct->array->map)
> 
>
> Key: CARBONDATA-2979
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2979
> Project: CarbonData
>  Issue Type: Bug
>  Components: file-format
>Affects Versions: 1.5.0
>Reporter: Rahul Singha
>Assignee: Manish Gupta
>Priority: Minor
> Attachments: MapSchema_15_int.avsc
>
>
> *Steps:*
> Create carbondata and carbonindex files using the SDK.
> Place the files in an HDFS location.
> Read the files using the Spark file format:
> create table schema15_int using carbon location 
> 'hdfs://hacluster/user/rahul/map/mapschema15_int';
> Select count(*) from  schema15_int;
> *Actual Result:*
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 24.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 24.0 (TID 34, BLR114238, executor 3): java.io.IOException: All the 
> files doesn't have same schema. Unsupported operation on nonTransactional 
> table. Check logs.
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.updateColumns(AbstractQueryExecutor.java:276)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getDataBlocks(AbstractQueryExecutor.java:234)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.initQuery(AbstractQueryExecutor.java:141)
>  at 
> org.apache.carbondata.core.scan.executor.impl.AbstractQueryExecutor.getBlockExecutionInfos(AbstractQueryExecutor.java:401)
>  at 
> org.apache.carbondata.core.scan.executor.impl.VectorDetailQueryExecutor.execute(VectorDetailQueryExecutor.java:44)
>  at 
> org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.initialize(VectorizedCarbonRecordReader.java:143)
>  at 
> org.apache.spark.sql.carbondata.execution.datasources.SparkCarbonFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(SparkCarbonFileFormat.scala:395)
>  at 
> org.apache.spark.sql.carbondata.execution.datasources.SparkCarbonFileFormat$$anonfun$buildReaderWithPartitionValues$2.apply(SparkCarbonFileFormat.scala:361)
>  at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:124)
>  at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:174)
>  at 
> org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:105)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown
>  Source)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithoutKey$(Unknown
>  Source)
>  at 
> org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)
>  at 
> org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>  at 
> org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
>  at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>  at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:125)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>  at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>  at org.apache.spark.scheduler.Task.run(Task.scala:108)
>  at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>  at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>  at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>  at java.lang.Thread.run(Thread.java:745)





[jira] [Resolved] (CARBONDATA-2972) Debug Logs and a function for type of Adaptive Encoding

2018-09-27 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2972.
--
   Resolution: Fixed
 Assignee: MANISH NALLA
Fix Version/s: 1.5.0

> Debug Logs and a function for type of Adaptive Encoding
> ---
>
> Key: CARBONDATA-2972
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2972
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: MANISH NALLA
>Assignee: MANISH NALLA
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 3h 10m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2973) Add Documentation for complex Columns for Local Dictionary Support

2018-09-26 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2973.
--
   Resolution: Fixed
Fix Version/s: 1.5.0

> Add Documentation for complex Columns for Local Dictionary Support
> --
>
> Key: CARBONDATA-2973
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2973
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Praveen M P
>Assignee: Praveen M P
>Priority: Minor
> Fix For: 1.5.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2896) Adaptive encoding for primitive data types

2018-09-18 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2896.
--
   Resolution: Fixed
Fix Version/s: 1.5.0

> Adaptive encoding for primitive data types
> --
>
> Key: CARBONDATA-2896
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2896
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 22h 20m
>  Remaining Estimate: 0h
>
> Currently, encoding and decoding are implemented only for dictionary and 
> measure columns; for no-dictionary primitive types encoding is *absent.*
> Encoding is a technique used to reduce the storage size. After encoding, the 
> result is compressed with snappy to reduce the storage size further.
> With this feature, encoding is also supported for no-dictionary primitive 
> data types.
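
To illustrate how encoding can shrink a page before compression, here is a
sketch of the general delta idea: store each value as an offset from the page
minimum and pick the narrowest integer width that holds the largest offset.
This is an assumption-laden illustration and not CarbonData's adaptive codec;
the class and method names are hypothetical.

```java
// Hypothetical sketch of delta-width selection for one page of long values.
final class AdaptiveDeltaSketch {

    // Returns how many bytes per value are needed when every value is stored
    // as (value - min) in a signed integer type.
    static int bytesPerValue(long[] page) {
        if (page.length == 0) {
            throw new IllegalArgumentException("page must not be empty");
        }
        long min = page[0];
        long max = page[0];
        for (long v : page) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        long range = max - min;
        if (range < 0) {
            return 8; // subtraction overflowed: the range needs the full width
        }
        if (range <= Byte.MAX_VALUE) {
            return 1;
        }
        if (range <= Short.MAX_VALUE) {
            return 2;
        }
        if (range <= Integer.MAX_VALUE) {
            return 4;
        }
        return 8;
    }
}
```

For a page such as {1000000, 1000005, 1000100} the range is 100, so one byte
per value is enough even though each raw value needs at least four bytes.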





[jira] [Created] (CARBONDATA-2943) Add configurable min max writing support for streaming table

2018-09-17 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2943:


 Summary: Add configurable min max writing support for streaming 
table
 Key: CARBONDATA-2943
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2943
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta


Add configurable min max writing support for streaming table





[jira] [Created] (CARBONDATA-2942) Add read and write support for writing min max based on configurable bytes count

2018-09-17 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2942:


 Summary: Add read and write support for writing min max based on 
configurable bytes count
 Key: CARBONDATA-2942
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2942
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Add read and write support for writing min max based on configurable bytes 
count for transactional and non transactional table which covers standard 
carbon table, File format and SDK





[jira] [Created] (CARBONDATA-2941) Support decision based min max writing for a column

2018-09-17 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2941:


 Summary: Support decision based min max writing for a column
 Key: CARBONDATA-2941
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2941
 Project: CarbonData
  Issue Type: Improvement
Reporter: Manish Gupta
Assignee: Manish Gupta


*Background* 
Currently we store min max for all the columns, in three places: page min 
max, blocklet min max in the file footer, and all the blocklet metadata 
entries in the shard. Consider the case where each column data 
size is more than 1 characters. In this case min max is written 3 times for 
each column, which increases the store size and in turn impacts query 
performance. 

*Design proposal* 
1. We will introduce a configurable system level property for max 
characters *"carbon.string.allowed.character.count".* If the data crosses 
this limit then min max will not be stored for that column. 
2. If a page does not contain min max for a column, then blocklet min max 
will also not contain the entry for min max of that column. 
3. The Thrift file will be modified to introduce an optional Boolean flag 
which will be used in the query to identify whether min max is stored for the 
filter column or not. 
4. As of now it will be supported only for dimensions of string/varchar 
type. We can extend it further to support bigDecimal type measures also in 
future if required. 
5. Block and blocklet dataMap cache will also include storing min max 
Boolean flag for dimensions column based on which filter pruning will be 
done. If min max is not written for any column then isScanRequired will 
return true in driver pruning. 
6. In executor again page and blocklet level min max will be checked for 
filter column. If min max is not written then complete page data will be 
scanned. 

*Backward compatibility* 
1. For stores prior to 1.5.0 min max flag for all the columns will be set 
to true during loading dataMap in query flow. 
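
The write-side decision (point 1) and the pruning rule (points 5 and 6) can be
sketched roughly as follows. This is only an illustration of the proposal:
the class and method names are hypothetical, an equality filter is assumed,
and String.compareTo stands in for whatever comparison the store really uses.

```java
// Hypothetical sketch of the proposed min max decision and pruning rule.
final class MinMaxDecisionSketch {

    // Write side: min max is skipped for a column as soon as one value is
    // longer than the limit configured through
    // "carbon.string.allowed.character.count".
    static boolean shouldWriteMinMax(String[] columnValues, int allowedCharCount) {
        for (String value : columnValues) {
            if (value != null && value.length() > allowedCharCount) {
                return false;
            }
        }
        return true;
    }

    // Read side: without min max nothing can be pruned, so the block,
    // blocklet or page must be scanned. With min max, an equality filter
    // only needs a scan when the value lies inside [min, max].
    static boolean isScanRequired(boolean minMaxWritten, String min, String max,
                                  String filterValue) {
        if (!minMaxWritten) {
            return true;
        }
        return filterValue.compareTo(min) >= 0 && filterValue.compareTo(max) <= 0;
    }
}
```

The flag passed as minMaxWritten corresponds to the Boolean flag from point 3;
for stores prior to 1.5.0 it would always be true, as described under
backward compatibility.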





[jira] [Created] (CARBONDATA-2924) Fix parsing issue for map as a nested array child and change the error message in sort column validation for SDK

2018-09-10 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2924:


 Summary: Fix parsing issue for map as a nested array child and 
change the error message in sort column validation for SDK
 Key: CARBONDATA-2924
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2924
 Project: CarbonData
  Issue Type: Bug
Reporter: Manish Gupta
Assignee: Manish Gupta
 Attachments: issue1.png, issue2.png

*Issue1:*

Parsing exception thrown while parsing map as child of nested array type like 
array> and struct> (Image attached)

*Issue2:*

Wrong error message is displayed when map type is specified as sort column 
while writing through SDK (Image attached)

*Issue3:*

When the length of complex type data for one row exceeds the range of the 
short data type during loading, a NegativeArraySizeException is thrown

java.lang.NegativeArraySizeException
 at 
org.apache.carbondata.processing.loading.sort.SortStepRowHandler.unpackNoSortFromBytes(SortStepRowHandler.java:271)
 at 
org.apache.carbondata.processing.loading.sort.SortStepRowHandler.readRowFromMemoryWithNoSortFieldConvert(SortStepRowHandler.java:461)
 at 
org.apache.carbondata.processing.loading.sort.unsafe.UnsafeCarbonRowPage.getRow(UnsafeCarbonRowPage.java:93)
 at 
org.apache.carbondata.processing.loading.sort.unsafe.holder.UnsafeInmemoryHolder.readRow(UnsafeInmemoryHolder.java:61)
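
One plausible mechanism for Issue3, assuming the row length is written as a
2-byte short and read back before the array is allocated: a length above
32767 wraps to a negative number. The sketch below only demonstrates that
narrowing behaviour; the class and method names are hypothetical and it is not
the CarbonData code from the stack trace.

```java
// Hypothetical demonstration of an int length squeezed through a short.
final class ShortLengthOverflowSketch {

    // Simulates writing the length as a short and reading it back as an int.
    static int roundTripThroughShort(int length) {
        return (short) length;
    }

    // Returns true when allocating a buffer with the round-tripped length
    // fails with NegativeArraySizeException.
    static boolean allocationFails(int length) {
        try {
            byte[] row = new byte[roundTripThroughShort(length)];
            return false;
        } catch (NegativeArraySizeException e) {
            return true;
        }
    }
}
```

For example, a length of 40000 comes back as -25536, which is why the
allocation fails. Lengths that wrap around to a positive number would not
throw here but would silently truncate the row, so storing the length in a
wider type avoids both problems.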





[jira] [Resolved] (CARBONDATA-2910) Support backward compatability in fileformat and support different sort colums per load

2018-09-07 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2910.
--
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.5.0

> Support backward compatability in fileformat and support different sort 
> colums per load
> ---
>
> Key: CARBONDATA-2910
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2910
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
> Fix For: 1.5.0
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>
> Currently, if the data was loaded by an old version with all columns as 
> dictionary exclude, the carbon fileformat cannot read it. 
> Also, giving different sort columns per load does not work while loading 
> through the SDK.





[jira] [Created] (CARBONDATA-2894) Add support for complex map type through spark carbon file format API

2018-08-27 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2894:


 Summary: Add support for complex map type through spark carbon 
file format API
 Key: CARBONDATA-2894
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2894
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta








[jira] [Updated] (CARBONDATA-45) Support MAP type

2018-08-24 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-45?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-45:
---
Attachment: MAP DATA-TYPE SUPPORT.pdf

> Support MAP type
> 
>
> Key: CARBONDATA-45
> URL: https://issues.apache.org/jira/browse/CARBONDATA-45
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, sql
>Reporter: cen yuhai
>Assignee: Venkata Ramana G
>Priority: Major
> Attachments: MAP DATA-TYPE SUPPORT.pdf
>
>
> {code:sql}
> >>CREATE TABLE table1 (
>  deviceInformationId int,
>  channelsId string,
>  props map)
>   STORED BY 'org.apache.carbondata.format'
> >>insert into table1 select 10,'channel1', map(1,'user1',101, 'root')
> {code}
> format of data to be read from csv, with '$' as level 1 delimiter and map 
> keys terminated by '#'
> {code:sql}
> >>load data local inpath '/tmp/data.csv' into table1 options 
> >>('COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 
> >>'COMPLEX_DELIMITER_FOR_KEY'='#')
> 20,channel2,2#user2$100#usercommon
> 30,channel3,3#user3$100#usercommon
> 40,channel4,4#user3$100#usercommon
> >>select channelsId, props[100] from table1 where deviceInformationId > 10;
> 20, usercommon
> 30, usercommon
> 40, usercommon
> >>select channelsId, props from table1 where props[2] = 'user2';
> 20, {2,'user2', 100, 'usercommon'}
> {code}
> The following cases need to be handled:
> ||Sub feature||Pending activity||Remarks||
> |Basic Maptype support|Develop|Create table DDL, Load map data from CSV, select * from maptable|
> |Maptype lookup in projection and filter|Develop|Projection and filters need execution at Spark|
> |NULL values, UDFs, Describe support|Develop||
> |Compaction support|Test + fix|As compaction works at byte level, no changes required. Test cases need to be added|
> |Insert into table|Develop|Source table data containing Map data needs to be converted from the Spark datatype to string, as carbon takes string as input row|
> |Support DDL for Map fields Dictionary include and Dictionary Exclude|Develop|CarbonDictionaryDecoder also needs to handle the same|
> |Support multilevel Map|Develop|Currently DDL is validated to allow only 2 levels; remove this restriction|
> |Support Map value to be a measure|Develop|Currently array and struct support only dimensions, which needs to change|
> |Support Alter table to add and remove Map column|Develop|Implement DDL; requires default value handling|
> |Projections of Map lookup push down to carbon|Develop|This is an optimization for when a large number of values are present in Map|
> |Filter map lookup push down to carbon|Develop|This is an optimization for when a large number of values are present in Map|
> |Update Map values|Develop|Update map value|
> h4. Design suggestion:
> Map can be stored internally as Array>, so the data must be converted to the 
> Map data type when it is given to Spark. The schema will have a new column 
> of map type, similar to Array.
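
The CSV cell format described in the issue ('$' between entries, '#' after
each key) can be parsed along these lines. This is a sketch under stated
assumptions: integer keys and string values, as in the insert example, and no
escaping of delimiters. The class and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical parser for one map cell such as "2#user2$100#usercommon".
final class MapCellParserSketch {

    static Map<Integer, String> parse(String cell, String entryDelimiter,
                                      String keyTerminator) {
        Map<Integer, String> result = new LinkedHashMap<>();
        if (cell.isEmpty()) {
            return result;
        }
        // Pattern.quote is needed because "$" is a regex metacharacter.
        for (String entry : cell.split(Pattern.quote(entryDelimiter))) {
            String[] keyValue = entry.split(Pattern.quote(keyTerminator), 2);
            if (keyValue.length != 2) {
                throw new IllegalArgumentException("Missing key terminator in: " + entry);
            }
            result.put(Integer.parseInt(keyValue[0].trim()), keyValue[1]);
        }
        return result;
    }
}
```

Calling parse("2#user2$100#usercommon", "$", "#") yields a map with the keys
2 and 100, which matches the props[100] lookup shown in the issue.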





[jira] [Assigned] (CARBONDATA-2869) SDK support for Map DataType

2018-08-24 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta reassigned CARBONDATA-2869:


Assignee: Manish Gupta

> SDK support for Map DataType
> 
>
> Key: CARBONDATA-2869
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2869
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Indhumathi Muthumurugesh
>Assignee: Manish Gupta
>Priority: Major
>






[jira] [Updated] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store

2018-08-07 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2829:
-
Issue Type: Bug  (was: Improvement)

> Fix creating merge index on older V1 V2 store
> -
>
> Key: CARBONDATA-2829
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2829
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Block creating merge index on older V1 V2 version





[jira] [Resolved] (CARBONDATA-2829) Fix creating merge index on older V1 V2 store

2018-08-07 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2829.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Fix creating merge index on older V1 V2 store
> -
>
> Key: CARBONDATA-2829
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2829
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Block creating merge index on older V1 V2 version





[jira] [Resolved] (CARBONDATA-2832) Block loading error for select query executed after merge index command executed on V1/V2 store table

2018-08-07 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2832?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2832.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Block loading error for select query executed after merge index command 
> executed on V1/V2 store table
> -
>
> Key: CARBONDATA-2832
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2832
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.4.1
> Environment: Spark 2.1
>Reporter: Chetan Bhat
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.1
>
>
> Steps :
> *Create and load data in V1/V2 carbon store:*
> create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='1');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/vardhandaterestruct.csv' INTO TABLE 
> brinjal OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');
> *In 1.4.1*
> refresh table brinjal;
> alter table brinjal compact 'segment_index';
> select * from brinjal where AMSize='8RAM size';
>  
> *Issue : Block loading error for select query executed after merge index 
> command executed on V1/V2 store table.*
> 0: jdbc:hive2://10.18.98.101:22550/default> select * from brinjal where 
> AMSize='8RAM size';
> *Error: java.io.IOException: Problem in loading segment blocks. 
> (state=,code=0)*
> *Expected :* select query executed after merge index command executed on 
> V1/V2 store table should return correct result set without error





[jira] [Resolved] (CARBONDATA-2813) Major compaction on partition table created in 1.3.x store is throwing Unable to get file status error.

2018-08-02 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2813.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Major compaction on partition table created in 1.3.x store is throwing Unable 
> to get file status error.
> ---
>
> Key: CARBONDATA-2813
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2813
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
>  # Create a partitioned table in 1.3.x version.
>  # Load data into the table.
>  # Move the table to the current version cluster (1.4.x).
>  # Load data into the table on the 1.4.x version.
>  # Run major compaction.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2805) Wrong order in custom compaction

2018-08-01 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2805.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Wrong order in custom compaction
> 
>
> Key: CARBONDATA-2805
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2805
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> When segments 0 to 6 exist and segments 1, 2, 3 are given for custom 
> compaction, the compacted segment should be 1.1, but sometimes 3.1 is 
> created instead, which is wrong.
> +-----------------+---------+--------------------+--------------------+---------+-----------+
> |SegmentSequenceId|   Status|     Load Start Time|       Load End Time|Merged To|File Format|
> +-----------------+---------+--------------------+--------------------+---------+-----------+
> |                4|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> |              3.1|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> |                3|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                2|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                1|Compacted|2018-07-27 07:25:...|2018-07-27 07:25:...|      3.1|COLUMNAR_V3|
> |                0|  Success|2018-07-27 07:25:...|2018-07-27 07:25:...|       NA|COLUMNAR_V3|
> +-----------------+---------+--------------------+--------------------+---------+-----------+
>  
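
The expected naming can be sketched as follows: whatever order the segment
IDs are supplied in, the merged segment takes the lowest ID plus ".1". This
assumes plain integer IDs, so it does not cover segments that were already
compacted (such as 0.1), and the class and method names are hypothetical.

```java
import java.util.List;

// Hypothetical sketch of order-independent naming for a compacted segment.
final class CompactedSegmentNameSketch {

    static String mergedSegmentName(List<Integer> segmentIds) {
        if (segmentIds.isEmpty()) {
            throw new IllegalArgumentException("no segments given");
        }
        // Use the minimum rather than the first or last element, so the
        // result does not depend on the order of the input list.
        int lowest = Integer.MAX_VALUE;
        for (int id : segmentIds) {
            lowest = Math.min(lowest, id);
        }
        return lowest + ".1";
    }
}
```

With the segments from the issue, the inputs (1, 2, 3) and (3, 1, 2) both give
"1.1", whereas naming after the last element of an unsorted list could give
"3.1" as in the table above.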





[jira] [Resolved] (CARBONDATA-2788) Fix bugs in incorrect query result with bloom datamap

2018-07-31 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2788.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Fix bugs in incorrect query result with bloom datamap
> -
>
> Key: CARBONDATA-2788
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2788
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: xuchuanyin
>Assignee: xuchuanyin
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 13h 20m
>  Remaining Estimate: 0h
>
> revert modification in PR2539





[jira] [Resolved] (CARBONDATA-2778) Empty result in query after IUD delete operation

2018-07-26 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2778.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Empty result in query after IUD delete operation
> 
>
> Key: CARBONDATA-2778
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2778
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # drop table if exists t1
>  # create table t1 (c1 int,c2 string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1', 
> 'dictionary_exclude'='c2')
>  # LOAD DATA LOCAL INPATH 'test.csv' INTO table t1 
> options('fileheader'='c1,c2')
>  # run delete command which should delete a whole block
>  # Run clean file operation.
>  # select from t1.
>  
> *NOTE*: Disable mergeindex property





[jira] [Updated] (CARBONDATA-2778) Empty result in query after IUD delete operation

2018-07-26 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2778:
-
Priority: Minor  (was: Major)

> Empty result in query after IUD delete operation
> 
>
> Key: CARBONDATA-2778
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2778
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> # drop table if exists t1
> # create table t1 (c1 int,c2 string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES('table_blocksize'='1', 
> 'dictionary_exclude'='c2')
> # LOAD DATA LOCAL INPATH 'test.csv' INTO table t1 
> options('fileheader'='c1,c2')
> # Run a delete command that deletes a whole block.
> # Run the clean files operation.
> # Select from t1.
>  
> *NOTE*: Disable the mergeindex property.





[jira] [Updated] (CARBONDATA-2779) Filter query is failing for store created with V1/V2 format

2018-07-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2779:
-
Issue Type: Bug  (was: Improvement)

> Filter query is failing for store created with V1/V2 format
> ---
>
> Key: CARBONDATA-2779
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2779
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> A filter query on a store created with the V1/V2 format fails with an 
> ArrayIndexOutOfBoundsException.





[jira] [Resolved] (CARBONDATA-2779) Filter query is failing for store created with V1/V2 format

2018-07-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2779.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Filter query is failing for store created with V1/V2 format
> ---
>
> Key: CARBONDATA-2779
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2779
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>
> A filter query on a store created with the V1/V2 format fails with an 
> ArrayIndexOutOfBoundsException.





[jira] [Resolved] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache

2018-07-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2638.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Implement driver min max caching for specified columns and segregate block 
> and blocklet cache
> -
>
> Key: CARBONDATA-2638
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
> Fix For: 1.4.1
>
> Attachments: Driver_Block_Cache.docx
>
>
> *Background*
> The current implementation of Blocklet dataMap caching in the driver caches 
> the min and max values of all the columns in the schema by default.
> *Problem*
> As the number of loads increases, the memory required to hold the min and 
> max values also increases considerably. In most scenarios there is a single 
> driver, and the memory configured for the driver is less than that of an 
> executor. With a continuously growing memory requirement, the driver can 
> even run out of memory, which makes the situation worse.
> *Solution*
> 1. Cache only the required columns in the driver
> 2. Segregate the block and blocklet level caches
> For more details, please check the attached document





[jira] [Commented] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties

2018-07-25 Thread Manish Gupta (JIRA)


[ 
https://issues.apache.org/jira/browse/CARBONDATA-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555654#comment-16555654
 ] 

Manish Gupta commented on CARBONDATA-2651:
--

https://github.com/apache/carbondata/pull/2558

> Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties
> ---
>
> Key: CARBONDATA-2651
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2651
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
> Fix For: 1.4.1
>
>
> Update document for caching properties





[jira] [Resolved] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties

2018-07-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2651.
--
   Resolution: Fixed
 Assignee: Gururaj Shetty  (was: Manish Gupta)
Fix Version/s: 1.4.1

> Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties
> ---
>
> Key: CARBONDATA-2651
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2651
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Manish Gupta
>Assignee: Gururaj Shetty
>Priority: Minor
> Fix For: 1.4.1
>
>
> Update document for caching properties





[jira] [Resolved] (CARBONDATA-2621) Lock problem in index datamap

2018-07-24 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2621.
--
   Resolution: Fixed
Fix Version/s: (was: 1.5.0)
   1.4.1

> Lock problem in index datamap
> -
>
> Key: CARBONDATA-2621
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2621
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.4.0
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> The locking for the index datamap is not correct.
> The HDFS lock will not work properly because the lock is created on the 
> local filesystem instead of HDFS.





[jira] [Resolved] (CARBONDATA-2753) Fix Compatibility issues

2018-07-23 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2753.
--
   Resolution: Fixed
 Assignee: dhatchayani  (was: Indhumathi Muthumurugesh)
Fix Version/s: 1.4.1

> Fix Compatibility issues
> 
>
> Key: CARBONDATA-2753
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2753
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: dhatchayani
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 9h 50m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2710) Refactor CarbonSparkSqlParser for better code reuse.

2018-07-18 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2710.
--
   Resolution: Fixed
 Assignee: Mohammad Shahid Khan
Fix Version/s: 1.4.1

> Refactor CarbonSparkSqlParser for better code reuse.
> 
>
> Key: CARBONDATA-2710
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2710
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 4.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2704) Index file size in describe formatted command is not updated correctly with the segment file

2018-07-15 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2704.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Index file size in describe formatted command is not updated correctly with 
> the segment file
> 
>
> Key: CARBONDATA-2704
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2704
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2684) Code Generator Error is thrown when Select filter contains more than one count of distinct of ComplexColumn with group by Clause

2018-07-05 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2684.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Code Generator Error is thrown when Select filter contains more than one 
> count of distinct of ComplexColumn with group by Clause
> 
>
> Key: CARBONDATA-2684
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2684
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 4h
>  Remaining Estimate: 0h
>






[jira] [Created] (CARBONDATA-2701) Refactor code to store minimal required info in Block and Blocklet Cache

2018-07-05 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2701:


 Summary: Refactor code to store minimal required info in Block and 
Blocklet Cache
 Key: CARBONDATA-2701
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2701
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Refactor code to store minimal required info in Block and Blocklet Cache





[jira] [Updated] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache

2018-06-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2638:
-
Attachment: Driver_Block_Cache.docx

> Implement driver min max caching for specified columns and segregate block 
> and blocklet cache
> -
>
> Key: CARBONDATA-2638
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
> Attachments: Driver_Block_Cache.docx
>
>
> *Background*
> The current implementation of Blocklet dataMap caching in the driver caches 
> the min and max values of all the columns in the schema by default.
> *Problem*
> As the number of loads increases, the memory required to hold the min and 
> max values also increases considerably. In most scenarios there is a single 
> driver, and the memory configured for the driver is less than that of an 
> executor. With a continuously growing memory requirement, the driver can 
> even run out of memory, which makes the situation worse.
> *Solution*
> 1. Cache only the required columns in the driver
> 2. Segregate the block and blocklet level caches
> For more details, please check the attached document





[jira] [Updated] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache

2018-06-25 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2638:
-
Attachment: (was: Driver_Block_Cache.docx)

> Implement driver min max caching for specified columns and segregate block 
> and blocklet cache
> -
>
> Key: CARBONDATA-2638
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
> Attachments: Driver_Block_Cache.docx
>
>
> *Background*
> The current implementation of Blocklet dataMap caching in the driver caches 
> the min and max values of all the columns in the schema by default.
> *Problem*
> As the number of loads increases, the memory required to hold the min and 
> max values also increases considerably. In most scenarios there is a single 
> driver, and the memory configured for the driver is less than that of an 
> executor. With a continuously growing memory requirement, the driver can 
> even run out of memory, which makes the situation worse.
> *Solution*
> 1. Cache only the required columns in the driver
> 2. Segregate the block and blocklet level caches
> For more details, please check the attached document





[jira] [Created] (CARBONDATA-2651) Update IDG for COLUMN_META_CACHE and CACHE_LEVEL properties

2018-06-25 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2651:


 Summary: Update IDG for COLUMN_META_CACHE and CACHE_LEVEL 
properties
 Key: CARBONDATA-2651
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2651
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Update document for caching properties





[jira] [Created] (CARBONDATA-2649) Add code for caching min/max only for specified columns

2018-06-25 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2649:


 Summary: Add code for caching min/max only for specified columns
 Key: CARBONDATA-2649
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2649
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Add code for caching min/max only for specified columns





[jira] [Created] (CARBONDATA-2648) Add support for COLUMN_META_CACHE in create table and alter table properties

2018-06-25 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2648:


 Summary: Add support for COLUMN_META_CACHE in create table and 
alter table properties
 Key: CARBONDATA-2648
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2648
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Add support for COLUMN_META_CACHE in create table and alter table properties





[jira] [Created] (CARBONDATA-2647) Add support for CACHE_LEVEL in create table and alter table properties

2018-06-25 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2647:


 Summary: Add support for CACHE_LEVEL in create table and alter 
table properties
 Key: CARBONDATA-2647
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2647
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Add support for CACHE_LEVEL in create table and alter table properties





[jira] [Created] (CARBONDATA-2645) Segregate block and blocklet cache

2018-06-25 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2645:


 Summary: Segregate block and blocklet cache
 Key: CARBONDATA-2645
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2645
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


Separate block and blocklet cache using the cache level configuration





[jira] [Created] (CARBONDATA-2638) Implement driver min max caching for specified columns and segregate block and blocklet cache

2018-06-24 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2638:


 Summary: Implement driver min max caching for specified columns 
and segregate block and blocklet cache
 Key: CARBONDATA-2638
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2638
 Project: CarbonData
  Issue Type: New Feature
Reporter: Manish Gupta
Assignee: Manish Gupta
 Attachments: Driver_Block_Cache.docx

*Background*

The current implementation of Blocklet dataMap caching in the driver caches 
the min and max values of all the columns in the schema by default.

*Problem*

As the number of loads increases, the memory required to hold the min and max 
values also increases considerably. In most scenarios there is a single driver, 
and the memory configured for the driver is less than that of an executor. With 
a continuously growing memory requirement, the driver can even run out of 
memory, which makes the situation worse.

*Solution*

1. Cache only the required columns in the driver

2. Segregate the block and blocklet level caches

For more details, please check the attached document
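
As a rough illustration of the first point of the solution, the sketch below
keeps only a configured subset of columns when min/max statistics are loaded
into a driver-side cache, and shows the trade-off: a filter on a column that
is not cached can no longer prune anything. All names here (BlockletStats,
build_driver_cache, prune, columns_to_cache) are hypothetical and are not
CarbonData APIs; integer values are assumed purely to keep the example short.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional, Tuple

MinMax = Tuple[int, int]


@dataclass(frozen=True)
class BlockletStats:
    """Hypothetical per-blocklet statistics: column name -> (min, max)."""
    blocklet_id: str
    min_max: Dict[str, MinMax]


def build_driver_cache(
    stats: Iterable[BlockletStats],
    columns_to_cache: Optional[List[str]] = None,
) -> Dict[str, Dict[str, MinMax]]:
    """Build a cache keyed by blocklet id.

    With columns_to_cache=None every column is kept (the behaviour the issue
    describes as the default); otherwise only the listed columns are kept.
    """
    cache: Dict[str, Dict[str, MinMax]] = {}
    for s in stats:
        if columns_to_cache is None:
            kept = dict(s.min_max)
        else:
            kept = {c: s.min_max[c] for c in columns_to_cache if c in s.min_max}
        cache[s.blocklet_id] = kept
    return cache


def prune(cache: Dict[str, Dict[str, MinMax]], column: str, value: int) -> List[str]:
    """Return the blocklets that may contain `value` for `column`.

    A blocklet without cached bounds for the column cannot be ruled out,
    so it is always selected.
    """
    selected = []
    for blocklet_id, min_max in cache.items():
        bounds = min_max.get(column)
        if bounds is None or bounds[0] <= value <= bounds[1]:
            selected.append(blocklet_id)
    return selected
```

Caching fewer columns shrinks the per-blocklet entry, at the cost of weaker
pruning for filters on the columns that were left out.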





[jira] [Resolved] (CARBONDATA-2623) Add DataMap Pre and Pevent listener

2018-06-21 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2623.
--
   Resolution: Fixed
 Assignee: Mohammad Shahid Khan
Fix Version/s: 1.4.1

> Add DataMap Pre and Pevent listener
> ---
>
> Key: CARBONDATA-2623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2623
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-2623) Add DataMap Pre and Pevent listener

2018-06-21 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2623:
-
Issue Type: Bug  (was: Improvement)

> Add DataMap Pre and Pevent listener
> ---
>
> Key: CARBONDATA-2623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2623
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Updated] (CARBONDATA-2623) Add DataMap Pre and Pevent listener

2018-06-21 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2623:
-
Priority: Minor  (was: Major)

> Add DataMap Pre and Pevent listener
> ---
>
> Key: CARBONDATA-2623
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2623
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table

2018-06-20 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2617.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Invalid tuple and block id getting formed for non partition table
> -
>
> Key: CARBONDATA-2617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2617
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.4.1
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 1.4.1
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While creating a partition table, a segment file was written in the Metadata 
> folder under the table structure. This was introduced during development of 
> the partition table feature. At that time the segment file was written only 
> for partition tables, and it was used to distinguish between partition and 
> non-partition tables in the code.
> Later the code was modified to write the segment file for both partition and 
> non-partition tables, but the code that distinguishes partition from 
> non-partition tables was not updated, which causes this incorrect formation 
> of block and tuple IDs.
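
The failure mode described above can be sketched as follows. This is a minimal
illustration only, assuming the old check treated the presence of a segment
file as proof of a partition table; TableInfo, is_partition_table_legacy and
is_partition_table are hypothetical names, not CarbonData code, and the actual
fix may look different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TableInfo:
    """Hypothetical table description used only for this illustration."""
    name: str
    partition_columns: List[str] = field(default_factory=list)
    has_segment_file: bool = False


def is_partition_table_legacy(table: TableInfo) -> bool:
    # Assumed old heuristic: a segment file used to exist only for
    # partition tables, so its presence was taken as the signal.
    return table.has_segment_file


def is_partition_table(table: TableInfo) -> bool:
    # Decide from the table's own partition metadata instead of a side
    # effect that other code paths may also produce.
    return len(table.partition_columns) > 0
```

Once segment files are written for every table, the legacy check reports a
plain table as partitioned, while the metadata-based check does not.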





[jira] [Updated] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table

2018-06-20 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2617:
-
Issue Type: Bug  (was: Improvement)

> Invalid tuple and block id getting formed for non partition table
> -
>
> Key: CARBONDATA-2617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2617
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.4.1
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While creating a partition table, a segment file was written in the Metadata 
> folder under the table structure. This was introduced during development of 
> the partition table feature. At that time the segment file was written only 
> for partition tables, and it was used to distinguish between partition and 
> non-partition tables in the code.
> Later the code was modified to write the segment file for both partition and 
> non-partition tables, but the code that distinguishes partition from 
> non-partition tables was not updated, which causes this incorrect formation 
> of block and tuple IDs.





[jira] [Updated] (CARBONDATA-2617) Invalid tuple and block id getting formed for non partition table

2018-06-20 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2617:
-
Affects Version/s: 1.4.1

> Invalid tuple and block id getting formed for non partition table
> -
>
> Key: CARBONDATA-2617
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2617
> Project: CarbonData
>  Issue Type: Bug
>Affects Versions: 1.4.1
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> While creating a partition table, a segment file was written in the Metadata 
> folder under the table structure. This was introduced during development of 
> the partition table feature. At that time the segment file was written only 
> for partition tables, and it was used to distinguish between partition and 
> non-partition tables in the code.
> Later the code was modified to write the segment file for both partition and 
> non-partition tables, but the code that distinguishes partition from 
> non-partition tables was not updated, which causes this incorrect formation 
> of block and tuple IDs.





[jira] [Resolved] (CARBONDATA-2604) getting ArrayIndexOutOfBoundException during compaction after IUD in cluster

2018-06-13 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2604.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> getting ArrayIndexOutOfBoundException during compaction after IUD in cluster
> 
>
> Key: CARBONDATA-2604
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2604
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> *Exception :* 
> !image-2018-06-12-19-19-05-257.png!
> *To reproduce the issue, follow these steps:*
> {quote} * *create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *insert into brinjal select * from brinjal;*
>  * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';*
>  * *delete from brinjal where AMSize='8RAM size';*
>  * *delete from table brinjal where segment.id IN(0);*
>  * *clean files for table brinjal;*
>  * *alter table brinjal compact 'minor';*
>  * *alter table brinjal compact 'major';*{quote}
>  





[jira] [Updated] (CARBONDATA-2604) getting ArrayIndexOutOfBoundException during compaction after IUD in cluster

2018-06-13 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2604:
-
Issue Type: Bug  (was: Improvement)

> getting ArrayIndexOutOfBoundException during compaction after IUD in cluster
> 
>
> Key: CARBONDATA-2604
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2604
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> *Exception :* 
> !image-2018-06-12-19-19-05-257.png!
> *To reproduce the issue, follow these steps:*
> {quote} * *create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *insert into brinjal select * from brinjal;*
>  * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';*
>  * *delete from brinjal where AMSize='8RAM size';*
>  * *delete from table brinjal where segment.id IN(0);*
>  * *clean files for table brinjal;*
>  * *alter table brinjal compact 'minor';*
>  * *alter table brinjal compact 'major';*{quote}
>  





[jira] [Updated] (CARBONDATA-2604) getting ArrayIndexOutOfBoundException during compaction after IUD in cluster

2018-06-13 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2604:
-
Priority: Minor  (was: Major)

> getting ArrayIndexOutOfBoundException during compaction after IUD in cluster
> 
>
> Key: CARBONDATA-2604
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2604
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Minor
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> *Exception :* 
> !image-2018-06-12-19-19-05-257.png!
> *To reproduce the issue, follow these steps:*
> {quote} * *create table brinjal (imei string,AMSize string,channelsId 
> string,ActiveCountry string, Activecity string,gamePointId 
> double,deviceInformationId double,productionDate Timestamp,deliveryDate 
> timestamp,deliverycharge double) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('table_blocksize'='2000','sort_columns'='imei');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *LOAD DATA INPATH '/user/loader/xyz.csv' INTO TABLE brinjal 
> OPTIONS('DELIMITER'=',', 'QUOTECHAR'= 
> '','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'= 
> 'imei,deviceInformationId,AMSize,channelsId,ActiveCountry,Activecity,gamePointId,productionDate,deliveryDate,deliverycharge');*
>  * *insert into brinjal select * from brinjal;*
>  * *update brinjal set (AMSize)= ('8RAM size') where AMSize='4RAM size';*
>  * *delete from brinjal where AMSize='8RAM size';*
>  * *delete from table brinjal where segment.id IN(0);*
>  * *clean files for table brinjal;*
>  * *alter table brinjal compact 'minor';*
>  * *alter table brinjal compact 'major';*{quote}
>  





[jira] [Resolved] (CARBONDATA-2571) Calculating the carbonindex and carbondata file size of a table is wrong

2018-06-05 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2571?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2571.
--
   Resolution: Fixed
Fix Version/s: 1.4.1

> Calculating the carbonindex and carbondata file size of a table is wrong
> 
>
> Key: CARBONDATA-2571
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2571
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2538) No exception is thrown if writer path has only lock files

2018-05-28 Thread Manish Gupta (JIRA)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2538.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> No exception is thrown if writer path has only lock files
> -
>
> Key: CARBONDATA-2538
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2538
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> Steps to reproduce:
>  # Create external table 
>  # Manually delete the index and carbon files
>  # Describe table (lock files would be created)
>  # Select from table





[jira] [Resolved] (CARBONDATA-2514) Duplicate columns in CarbonWriter is throwing NullPointerException

2018-05-23 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2514.
--
   Resolution: Fixed
 Assignee: Kunal Kapoor
Fix Version/s: 1.4.1

> Duplicate columns in CarbonWriter is throwing NullPointerException
> --
>
> Key: CARBONDATA-2514
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2514
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2503) Data write fails if empty value is provided for sort columns in sdk

2018-05-22 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2503.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Data write fails if empty value is provided for sort columns in sdk
> ---
>
> Key: CARBONDATA-2503
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2503
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Rahul Kumar
>Assignee: Rahul Kumar
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> *Steps to reproduce:* 
> Use the SDK to write data where an empty value is provided for the sort columns
>  





[jira] [Resolved] (CARBONDATA-2496) Change the bloom implementation to Hadoop for better performance and compression

2018-05-22 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2496.
--
   Resolution: Fixed
 Assignee: Ravindra Pesala
Fix Version/s: 1.4.0

> Change the bloom implementation to Hadoop for better performance and 
> compression
> 
>
> Key: CARBONDATA-2496
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2496
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 6.5h
>  Remaining Estimate: 0h
>
> The current bloom implementation does not give good performance or 
> compression, and it also adds a new Guava dependency to Carbon. So remove the 
> Guava dependency and use the Hadoop bloom implementation.





[jira] [Resolved] (CARBONDATA-2227) Add Partition Values and Location information in describe formatted for Standard partition feature

2018-05-22 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2227.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Add Partition Values and Location information in describe formatted for 
> Standard partition feature
> --
>
> Key: CARBONDATA-2227
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2227
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.4.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2448) Adding compacted segments to load and alter events

2018-05-08 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2448.
--
   Resolution: Fixed
 Assignee: dhatchayani
Fix Version/s: 1.4.0

> Adding compacted segments to load and alter events
> --
>
> Key: CARBONDATA-2448
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2448
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.4.0
>
>  Time Spent: 2h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2436) Block pruning problem post the carbon schema restructure.

2018-05-07 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2436.
--
   Resolution: Fixed
 Assignee: Mohammad Shahid Khan
Fix Version/s: 1.4.0

> Block pruning problem post the carbon schema restructure.
> -
>
> Key: CARBONDATA-2436
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2436
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> Currently the datamap prunes using the segment properties from the 0th block 
> of BlockletDataMap, which is not correct. After a restructure, if the table 
> is updated, the blocks within the same segment will not all have a symmetric 
> schema.
> Fix: Ensure that each block is pruned with the same schema.





[jira] [Created] (CARBONDATA-2433) Executor OOM because of GC when blocklet pruning is done using Lucene datamap

2018-05-03 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2433:


 Summary: Executor OOM because of GC when blocklet pruning is done 
using Lucene datamap
 Key: CARBONDATA-2433
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2433
 Project: CarbonData
  Issue Type: Sub-task
Affects Versions: 1.4.0
Reporter: Manish Gupta
Assignee: Manish Gupta








[jira] [Updated] (CARBONDATA-2433) Executor OOM because of GC when blocklet pruning is done using Lucene datamap

2018-05-03 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta updated CARBONDATA-2433:
-
Description: 
While searching using Lucene, a PriorityQueue is created to hold the documents. 
As the size is not specified, by default the PriorityQueue size is equal to the 
number of Lucene documents. As the documents start getting added to the heap, 
the GC time increases, and after some time the task fails due to excessive GC 
and an executor OOM occurs.


Reference blog:

*http://lucene.472066.n3.nabble.com/Optimization-of-memory-usage-in-PriorityQueue-td590355.html*
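The unbounded-queue problem described above can be illustrated with a minimal sketch. This is not the CarbonData fix and does not use the Lucene API; `BoundedTopK` and its methods are hypothetical names, and the sketch only assumes that the query needs the k best-scoring documents rather than all of them. It uses the JDK `java.util.PriorityQueue` as a min-heap capped at k entries, so heap memory stays proportional to k instead of to the number of documents.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical helper (not CarbonData or Lucene code): retains only the k highest scores. */
final class BoundedTopK {
  private final int k;
  // Min-heap: the smallest retained score is at the head and is evicted first.
  private final PriorityQueue<Float> heap;

  BoundedTopK(int k) {
    if (k <= 0) {
      throw new IllegalArgumentException("k must be positive");
    }
    this.k = k;
    this.heap = new PriorityQueue<>(k);
  }

  /** Offers a score; the heap never grows beyond k entries. */
  void offer(float score) {
    if (heap.size() < k) {
      heap.add(score);
    } else if (score > heap.peek()) {
      heap.poll();      // drop the current smallest retained score
      heap.add(score);  // keep the better one
    }
  }

  int size() {
    return heap.size();
  }

  /** Returns the retained scores, highest first. */
  List<Float> descending() {
    List<Float> out = new ArrayList<>(heap);
    out.sort(Collections.reverseOrder());
    return out;
  }
}
```

With k fixed, offering a million scores still leaves at most k boxed values on the heap, which is the property the issue description says is missing when the queue size defaults to the document count.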

> Executor OOM because of GC when blocklet pruning is done using Lucene datamap
> -
>
> Key: CARBONDATA-2433
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2433
> Project: CarbonData
>  Issue Type: Sub-task
>Affects Versions: 1.4.0
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
>
> While searching using Lucene, a PriorityQueue is created to hold the 
> documents. As the size is not specified, by default the PriorityQueue size is 
> equal to the number of Lucene documents. As the documents start getting added 
> to the heap, the GC time increases, and after some time the task fails due to 
> excessive GC and an executor OOM occurs.
> Reference blog:
> *http://lucene.472066.n3.nabble.com/Optimization-of-memory-usage-in-PriorityQueue-td590355.html*





[jira] [Resolved] (CARBONDATA-2410) Error message correction when column value length exceeds 320000 characters

2018-05-03 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2410.
--
   Resolution: Fixed
 Assignee: Mohammad Shahid Khan
Fix Version/s: 1.4.0

> Error message correction when column value length exceeds 320000 characters
> --
>
> Key: CARBONDATA-2410
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2410
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Trivial
> Fix For: 1.4.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2396) Add CTAS support for using DataSource Syntax

2018-05-03 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2396?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2396.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Add CTAS support for using DataSource Syntax
> 
>
> Key: CARBONDATA-2396
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2396
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Indhumathi Muthumurugesh
>Assignee: Indhumathi Muthumurugesh
>Priority: Minor
> Fix For: 1.4.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>






[jira] [Resolved] (CARBONDATA-2406) Dictionary Server and Dictionary Client MD5 Validation failed with hive.server2.enable.doAs = true

2018-05-02 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2406?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2406.
--
   Resolution: Fixed
 Assignee: Mohammad Shahid Khan
Fix Version/s: 1.4.0

> Dictionary Server and Dictionary Client  MD5 Validation failed with 
> hive.server2.enable.doAs = true 
> 
>
> Key: CARBONDATA-2406
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2406
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> With conf hive.server2.enable.doAs = true, the dictionary server is started 
> as the user who submits the load request, but the dictionary client runs as 
> the user who started the executor process. Because of this, the dictionary 
> client cannot communicate successfully with the dictionary server.





[jira] [Resolved] (CARBONDATA-2275) Query Failed for 0 byte deletedelta file

2018-04-26 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2275.
--
   Resolution: Fixed
 Assignee: Babulal
Fix Version/s: 1.4.0

> Query Failed for 0 byte deletedelta file 
> -
>
> Key: CARBONDATA-2275
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2275
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query, sql
>Affects Versions: 1.3.0
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 11h 10m
>  Remaining Estimate: 0h
>
> When a delete fails at the write step because of an exception from HDFS, 
> a 0 byte deletedelta file is currently created and is not cleaned up. 
> So when any select query is triggered, the query fails with the exception 
> Problem in loading segment blocks. 
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0 
>   at 
> org.apache.carbondata.core.datastore.filesystem.AbstractDFSCarbonFile.getLocations(AbstractDFSCarbonFile.java:514)
>  
>   at 
> org.apache.carbondata.core.indexstore.BlockletDataMapIndexStore.getAll(BlockletDataMapIndexStore.java:142)
>  
>   ... 109 more
>  
>  
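One way to guard against the empty file described above is to skip zero-length delete delta files before they are handed to the reader. The following is only a sketch under stated assumptions: it works on a local directory through `java.nio.file` rather than on HDFS, the class and method names are hypothetical, and the `.deletedelta` suffix is assumed from the file name used in the description. It is not the change that resolved this issue.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper: lists delete delta files while ignoring 0 byte leftovers. */
final class DeleteDeltaFilter {
  // Assumed suffix, taken from the "deletedelta" name in the issue text.
  private static final String SUFFIX = ".deletedelta";

  static List<Path> nonEmptyDeleteDeltas(Path dir) throws IOException {
    try (Stream<Path> files = Files.list(dir)) {
      return files
          .filter(p -> p.getFileName().toString().endsWith(SUFFIX))
          .filter(p -> sizeOf(p) > 0)   // a failed write can leave a 0 byte file behind
          .sorted()
          .collect(Collectors.toList());
    }
  }

  private static long sizeOf(Path p) {
    try {
      return Files.size(p);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Filtering at read time only hides the leftover file; cleaning it up when the delete fails addresses the cause mentioned in the description.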





[jira] [Created] (CARBONDATA-2405) Implement columnar filling during query result preparation in DictionaryBasedResultCollector

2018-04-26 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2405:


 Summary: Implement columnar filling during query result 
preparation in DictionaryBasedResultCollector
 Key: CARBONDATA-2405
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2405
 Project: CarbonData
  Issue Type: Sub-task
Reporter: Manish Gupta
Assignee: Manish Gupta


When the number of columns in a query is greater than 100, 
DictionaryBasedResultCollector is selected for result preparation. 
DictionaryBasedResultCollector fills the result row-wise, which reduces 
query performance.

As in compaction, we need to implement columnar filling of results in 
DictionaryBasedResultCollector to improve query performance when the 
number of columns in a query is greater than 100.
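The difference between the two filling orders can be sketched as follows. This is an illustration only, not DictionaryBasedResultCollector code: `ResultFill`, `fillRowWise` and `fillColumnar` are hypothetical names, and the column data is assumed to be already decoded into plain `int` arrays. Both methods produce the same rows; they differ only in loop order, so the columnar version reads each source column sequentially before moving to the next one.

```java
/** Hypothetical illustration of row-wise versus columnar result filling. */
final class ResultFill {

  /** Row-wise: for every row, jump across all columns in turn. */
  static Object[][] fillRowWise(int[][] columns, int rowCount) {
    Object[][] rows = new Object[rowCount][columns.length];
    for (int r = 0; r < rowCount; r++) {
      for (int c = 0; c < columns.length; c++) {
        rows[r][c] = columns[c][r];
      }
    }
    return rows;
  }

  /** Columnar: finish one column for all rows before touching the next column. */
  static Object[][] fillColumnar(int[][] columns, int rowCount) {
    Object[][] rows = new Object[rowCount][columns.length];
    for (int c = 0; c < columns.length; c++) {
      int[] column = columns[c];   // read sequentially, one column at a time
      for (int r = 0; r < rowCount; r++) {
        rows[r][c] = column[r];
      }
    }
    return rows;
  }
}
```

Whether the columnar order is actually faster depends on the column count and the underlying page layout, so any gain should be measured rather than assumed.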





[jira] [Created] (CARBONDATA-2391) Thread leak in compaction operation if prefetch is enabled and compaction process is killed

2018-04-23 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2391:


 Summary: Thread leak in compaction operation if prefetch is 
enabled and compaction process is killed
 Key: CARBONDATA-2391
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2391
 Project: CarbonData
  Issue Type: Bug
Reporter: Manish Gupta
Assignee: Manish Gupta


Problem
Thread leak in compaction operation if prefetch is enabled and compaction 
process is killed

Analysis
During compaction, if prefetch is enabled, RawResultIterator launches an 
executor service for prefetching the data.
If compaction fails or the process is killed, this can lead to a thread leak 
because the executor service is still in the running state.
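The leak described in the analysis can be avoided by tying the executor's lifetime to the object that created it. The sketch below is not the RawResultIterator implementation; `PrefetchingReader` and its methods are hypothetical, and it assumes the caller can always reach a `finally` block or try-with-resources when compaction fails. It relies only on the standard `java.util.concurrent` API.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical reader that owns a prefetch executor and always releases it on close. */
final class PrefetchingReader implements AutoCloseable {
  private final ExecutorService executor = Executors.newSingleThreadExecutor();

  /** Submits a stand-in prefetch task; a real task would read the next batch. */
  Future<int[]> prefetch(int size) {
    Callable<int[]> task = () -> new int[size];
    return executor.submit(task);
  }

  boolean isShutdown() {
    return executor.isShutdown();
  }

  /** Stops the worker thread so a failed or cancelled compaction does not leak it. */
  @Override
  public void close() {
    executor.shutdownNow();
  }
}
```

A caller would wrap usage in `try (PrefetchingReader reader = new PrefetchingReader()) { ... }` so that `close()` runs on both the success and the failure path.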





[jira] [Created] (CARBONDATA-2381) Improve compaction performance by filling batch result in columnar format and performing IO at blocklet level

2018-04-23 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-2381:


 Summary: Improve compaction performance by filling batch result in 
columnar format and performing IO at blocklet level
 Key: CARBONDATA-2381
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2381
 Project: CarbonData
  Issue Type: Improvement
Affects Versions: 1.3.1
Reporter: Manish Gupta
Assignee: Manish Gupta


Problem: Compaction is slow compared to data load. If the compaction 
threshold is set to 6,6, then on minor compaction after 6 loads the compaction 
time is almost 6-7 times the total load time for the 6 loads.

Analysis:
 # During compaction, result filling is done in row format. Because of this, as 
the number of columns increases, the dimension and measure data filling time 
increases. This happens because in row filling we cannot take advantage of OS 
cacheable buffers, as we continuously read data for the next column.
 # As compaction uses a page level reader flow in which both IO and 
uncompression are done at page level, the IO and uncompression time increases 
in this model.





[jira] [Resolved] (CARBONDATA-2307) OOM when using DataFrame.coalesce

2018-04-14 Thread Manish Gupta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manish Gupta resolved CARBONDATA-2307.
--
   Resolution: Fixed
 Assignee: Jin Zhou
Fix Version/s: 1.4.0

> OOM when using DataFrame.coalesce
> -
>
> Key: CARBONDATA-2307
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2307
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Jin Zhou
>Assignee: Jin Zhou
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 5h 40m
>  Remaining Estimate: 0h
>
> The TaskContext object holds the reader’s reference until the task finishes, 
> and coalesce combines many CarbonSparkPartition objects into one task.




