[jira] [Assigned] (CARBONDATA-3830) Presto read support for complex columns

2020-08-30 Thread Kumar Vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kumar Vishal reassigned CARBONDATA-3830:


Assignee: Ajantha Bhat

> Presto read support for complex columns
> ---
>
> Key: CARBONDATA-3830
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3830
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Assignee: Ajantha Bhat
>Priority: Minor
> Attachments: Presto Read Support.pdf
>
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> This feature enables Presto to read complex columns from CarbonData files.
> Complex columns include array, map, and struct.
> The design document covers only the array type;
> map and struct types will be handled later.
>  
> PR - [https://github.com/apache/carbondata/pull/3773]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3830) Presto read support for complex columns

2020-08-30 Thread Kumar Vishal (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kumar Vishal resolved CARBONDATA-3830.
--
Fix Version/s: 2.1.0
   Resolution: Fixed

> Presto read support for complex columns
> ---
>
> Key: CARBONDATA-3830
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3830
> Project: CarbonData
>  Issue Type: New Feature
>  Components: core, presto-integration
>Reporter: Akshay
>Assignee: Ajantha Bhat
>Priority: Minor
> Fix For: 2.1.0
>
> Attachments: Presto Read Support.pdf
>
>  Time Spent: 33h 40m
>  Remaining Estimate: 0h
>
> This feature enables Presto to read complex columns from CarbonData files.
> Complex columns include array, map, and struct.
> The design document covers only the array type;
> map and struct types will be handled later.
>  
> PR - [https://github.com/apache/carbondata/pull/3773]





[jira] [Updated] (CARBONDATA-1328) Refactoring unsafe code and added new property

2017-07-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-1328:
-
Description: 
Refactored the unsafe memory manager and unsafe sort.
Removed the sort memory manager and added a single memory manager.
Deprecated old properties: sort.inmemory.size.inmb, enable.offheap.sort
Added new properties: carbon.sort.storage.inmemory.size.inmb, 
carbon.unsafe.working.memory.in.mb, enable.offheap
Removed copying of blocks from working memory to sort storage memory.
All unsafe flows must now use CarbonUnsafeMemoryManager for allocating and 
freeing unsafe memory.
If the user has configured an old property, it is internally converted to the 
new properties. For example, if sort.inmemory.size.inmb is configured, 20% of 
that memory is used as working memory and the rest as sort storage memory.
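The backward-compatibility conversion described above can be sketched with simple arithmetic. The 20%/80% split and the property names come from this description; the class and method names below are hypothetical, and this is not the actual CarbonData implementation.

```java
// Hypothetical sketch: converting the deprecated sort.inmemory.size.inmb value
// into the two new properties, assuming a straight 20% working / 80% sort
// storage split as stated in the description above.
public class LegacyMemorySplit {

    // 20% of the legacy size goes to carbon.unsafe.working.memory.in.mb
    static int workingMemoryMb(int legacySortInMemoryMb) {
        return (int) (legacySortInMemoryMb * 0.2);
    }

    // The remainder goes to carbon.sort.storage.inmemory.size.inmb
    static int storageMemoryMb(int legacySortInMemoryMb) {
        return legacySortInMemoryMb - workingMemoryMb(legacySortInMemoryMb);
    }

    public static void main(String[] args) {
        int legacy = 1024; // old sort.inmemory.size.inmb value
        System.out.println("carbon.unsafe.working.memory.in.mb=" + workingMemoryMb(legacy));
        System.out.println("carbon.sort.storage.inmemory.size.inmb=" + storageMemoryMb(legacy));
    }
}
```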

  was:Add unsafe property validation + handle old unsafe parameter for backward 
compatibility

 Issue Type: Improvement  (was: Bug)
Summary: Refactoring unsafe code and added new property  (was: Carbon 
Unsafe Property validation)

> Refactoring unsafe code and added new property
> --
>
> Key: CARBONDATA-1328
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1328
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> Refactored the unsafe memory manager and unsafe sort.
> Removed the sort memory manager and added a single memory manager.
> Deprecated old properties: sort.inmemory.size.inmb, enable.offheap.sort
> Added new properties: carbon.sort.storage.inmemory.size.inmb, 
> carbon.unsafe.working.memory.in.mb, enable.offheap
> Removed copying of blocks from working memory to sort storage memory.
> All unsafe flows must now use CarbonUnsafeMemoryManager for allocating and 
> freeing unsafe memory.
> If the user has configured an old property, it is internally converted to 
> the new properties. For example, if sort.inmemory.size.inmb is configured, 
> 20% of that memory is used as working memory and the rest as sort storage 
> memory.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1370) Query is failing with array index out of bound exception

2017-08-09 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1370:


 Summary: Query is failing with array index out of bound exception
 Key: CARBONDATA-1370
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1370
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Query fails with an array index out of bounds exception when the table contains 
sort columns and a less-than filter is applied on a timestamp column that is 
not present in the sort columns.





[jira] [Created] (CARBONDATA-1402) JVM crashes when data loading is done with unsafe column page=true

2017-08-22 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1402:


 Summary: JVM crashes when data loading is done with unsafe column 
page=true
 Key: CARBONDATA-1402
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1402
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


JVM crashes when data loading is done with unsafe column page=true





[jira] [Created] (CARBONDATA-1410) Thread leak issue in case of data loading failure

2017-08-27 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1410:


 Summary: Thread leak issue in case of data loading failure
 Key: CARBONDATA-1410
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1410
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Thread leak issue in case of data loading failure





[jira] [Created] (CARBONDATA-1449) GC issue in case of date filter if it is going to rowlevel executor

2017-09-05 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1449:


 Summary: GC issue in case of date filter if it is going to 
rowlevel executor
 Key: CARBONDATA-1449
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1449
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


GC issue in case of date filter if it is going to rowlevel executor





[jira] [Created] (CARBONDATA-1474) Memory leak issue in case of vector reader

2017-09-12 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1474:


 Summary: Memory leak issue in case of vector reader
 Key: CARBONDATA-1474
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1474
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Memory leak issue in case of vector reader





[jira] [Commented] (CARBONDATA-1474) Memory leak issue in case of vector reader

2017-09-25 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16178909#comment-16178909
 ] 

kumar vishal commented on CARBONDATA-1474:
--

Fixed as part of CARBONDATA-1488, so closing this issue

> Memory leak issue in case of vector reader
> --
>
> Key: CARBONDATA-1474
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1474
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> Memory leak issue in case of vector reader





[jira] [Closed] (CARBONDATA-1474) Memory leak issue in case of vector reader

2017-09-25 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal closed CARBONDATA-1474.

Resolution: Fixed

> Memory leak issue in case of vector reader
> --
>
> Key: CARBONDATA-1474
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1474
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> Memory leak issue in case of vector reader





[jira] [Created] (CARBONDATA-1514) Sort Column Property is not getting added in case of alter operation

2017-09-25 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1514:


 Summary: Sort Column Property is not getting added in case of 
alter operation
 Key: CARBONDATA-1514
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1514
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Sort Column Property is not getting added in case of alter operation





[jira] [Created] (CARBONDATA-1515) Fixed NPE in Data loading

2017-09-25 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1515:


 Summary: Fixed NPE in Data loading
 Key: CARBONDATA-1515
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1515
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Scenario: 
Data size: 3.5 billion rows(4.1 tb data)
3 node cluster
Number of core while data loading 12.
No. of loads 100 times
Problem: DataConverterProcessorStepImpl uses an ArrayList to collect all the 
local converters. In a multi-threaded scenario this can create holes (null 
values), because ArrayList is not synchronized. While closing the converters, 
the null entry throws an NPE.
Solution: Add local converters inside a synchronized block.
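The fix direction can be sketched as below; the class is a hypothetical stand-in for the converter list in DataConverterProcessorStepImpl, not the actual CarbonData code. Synchronizing the add (and the snapshot used on the close path) prevents the null holes an unsynchronized ArrayList can develop under concurrent writes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the described fix: converters are added inside a
// synchronized block, so concurrent loading threads cannot corrupt the list.
public class ConverterRegistry {
    private final List<String> converters = new ArrayList<>();

    // Synchronized add: no two threads mutate the backing array at once,
    // so no null holes are created.
    public synchronized void addConverter(String converter) {
        converters.add(converter);
    }

    // Synchronized copy for the close path, so closing never sees a null entry.
    public synchronized List<String> snapshot() {
        return new ArrayList<>(converters);
    }

    public static void main(String[] args) throws InterruptedException {
        ConverterRegistry registry = new ConverterRegistry();
        Thread[] loaders = new Thread[12]; // mirrors the 12 loading cores in the scenario
        for (int i = 0; i < loaders.length; i++) {
            final int id = i;
            loaders[i] = new Thread(() -> registry.addConverter("converter-" + id));
            loaders[i].start();
        }
        for (Thread t : loaders) {
            t.join();
        }
        // All 12 converters are present and none is null, so closing cannot NPE.
        System.out.println(registry.snapshot().size()); // prints 12
    }
}
```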








[jira] [Commented] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata

2017-10-15 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16205427#comment-16205427
 ] 

kumar vishal commented on CARBONDATA-1516:
--

[~chenliang613] In case of drop column, if any aggregate table contains that 
column, that table becomes invalid. In that case the user needs to rebuild the 
aggregate table.
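The invalidation rule above can be sketched as follows (names are hypothetical and CarbonData's actual metadata handling is more involved): dropping a column invalidates every aggregate table whose column set contains it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: which pre-aggregate tables does dropping a column
// invalidate? Any aggregate table whose column set contains the dropped
// column must be rebuilt.
public class AggTableInvalidation {

    static List<String> invalidatedAggTables(String droppedColumn,
                                             Map<String, Set<String>> aggTableColumns) {
        List<String> invalid = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : aggTableColumns.entrySet()) {
            if (entry.getValue().contains(droppedColumn)) {
                invalid.add(entry.getKey());
            }
        }
        return invalid;
    }

    public static void main(String[] args) {
        Map<String, Set<String>> aggTables = Map.of(
            "agg_sales", Set.of("l_returnflag", "l_quantity"),
            "agg_price", Set.of("l_linestatus", "l_extendedprice"));
        // Dropping l_quantity invalidates only agg_sales; it must be rebuilt.
        System.out.println(invalidatedAggTables("l_quantity", aggTables));
    }
}
```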


> Support pre-aggregate tables and timeseries in carbondata
> -
>
> Key: CARBONDATA-1516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
> Attachments: CarbonData Pre-aggregation Table.pdf
>
>
> Currently CarbonData has standard SQL capability on distributed data sets. 
> CarbonData should support pre-aggregate tables for timeseries and improve 
> query performance.





[jira] [Created] (CARBONDATA-1589) Support Desc table and desc formatted table for Pre Aggregate table

2017-10-16 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1589:


 Summary: Support Desc table and desc formatted table for Pre 
Aggregate table
 Key: CARBONDATA-1589
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1589
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal


Support Desc table and desc formatted table for Pre Aggregate table





[jira] [Created] (CARBONDATA-1609) Update thrift to support Pre Aggregate support

2017-10-23 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1609:


 Summary: Update thrift to support Pre Aggregate support
 Key: CARBONDATA-1609
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1609
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal


Update thrift to support Pre Aggregate support





[jira] [Created] (CARBONDATA-1658) Thread Leak Issue in No Sort

2017-10-30 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1658:


 Summary: Thread Leak Issue in No Sort
 Key: CARBONDATA-1658
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1658
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Threads are not getting closed in the no-sort flow





[jira] [Assigned] (CARBONDATA-1713) Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table

2017-11-15 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1713:


Assignee: kumar vishal

> Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after 
> creating pre-aggregate table
> ---
>
> Key: CARBONDATA-1713
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1713
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: ANT Test cluster - 3 node
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>Priority: Blocker
>  Labels: sanity
> Fix For: 1.3.0
>
>
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or 
> view 'lineitem' not found in database 'default'; (state=,code=0)
> 0: jdbc:hive2://10.18.98.34:23040> create table if not exists lineitem(
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPMODE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPINSTRUCT string,
> 0: jdbc:hive2://10.18.98.34:23040> L_RETURNFLAG string,
> 0: jdbc:hive2://10.18.98.34:23040> L_RECEIPTDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_ORDERKEY string,
> 0: jdbc:hive2://10.18.98.34:23040> L_PARTKEY string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SUPPKEY   string,
> 0: jdbc:hive2://10.18.98.34:23040> L_LINENUMBER int,
> 0: jdbc:hive2://10.18.98.34:23040> L_QUANTITY double,
> 0: jdbc:hive2://10.18.98.34:23040> L_EXTENDEDPRICE double,
> 0: jdbc:hive2://10.18.98.34:23040> L_DISCOUNT double,
> 0: jdbc:hive2://10.18.98.34:23040> L_TAX double,
> 0: jdbc:hive2://10.18.98.34:23040> L_LINESTATUS string,
> 0: jdbc:hive2://10.18.98.34:23040> L_COMMITDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_COMMENT  string
> 0: jdbc:hive2://10.18.98.34:23040> ) STORED BY 'org.apache.carbondata.format'
> 0: jdbc:hive2://10.18.98.34:23040> TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.338 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (48.634 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create datamap agr_lineitem ON TABLE 
> lineitem USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
> select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from 
> lineitem group by  L_RETURNFLAG, L_LINESTATUS;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (16.552 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem 
> group by  L_RETURNFLAG, L_LINESTATUS;
> Error: org.apache.spark.sql.AnalysisException: Column doesnot exists in Pre 
> Aggregate table; (state=,code=0)





[jira] [Commented] (CARBONDATA-1713) Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after creating pre-aggregate table

2017-11-15 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16253143#comment-16253143
 ] 

kumar vishal commented on CARBONDATA-1713:
--

This is failing because the column names in the select statement are in upper 
case, so the aggregate table selection fails.
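A sketch of the fix direction: compare query column names against the pre-aggregate table's columns case-insensitively instead of by exact string match. The names below are hypothetical; this is not the actual CarbonData matching code.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical sketch: case-insensitive check that a pre-aggregate table
// covers all columns referenced by a query, so an upper-case select list
// still matches lower-case aggregate table columns.
public class ColumnMatcher {

    static boolean aggTableCovers(Set<String> aggTableColumns, List<String> queryColumns) {
        Set<String> normalized = new HashSet<>();
        for (String column : aggTableColumns) {
            normalized.add(column.toLowerCase(Locale.ROOT));
        }
        for (String queryColumn : queryColumns) {
            if (!normalized.contains(queryColumn.toLowerCase(Locale.ROOT))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> aggColumns = Set.of("l_returnflag", "l_linestatus", "l_quantity");
        // The failing query used upper-case names; with normalization they match.
        System.out.println(aggTableCovers(aggColumns, List.of("L_RETURNFLAG", "L_LINESTATUS"))); // prints true
    }
}
```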

> Carbon1.3.0-Pre-AggregateTable - Aggregate query on main table fails after 
> creating pre-aggregate table
> ---
>
> Key: CARBONDATA-1713
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1713
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: ANT Test cluster - 3 node
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>Priority: Blocker
>  Labels: sanity
> Fix For: 1.3.0
>
>
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> Error: org.apache.spark.sql.catalyst.analysis.NoSuchTableException: Table or 
> view 'lineitem' not found in database 'default'; (state=,code=0)
> 0: jdbc:hive2://10.18.98.34:23040> create table if not exists lineitem(
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPMODE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SHIPINSTRUCT string,
> 0: jdbc:hive2://10.18.98.34:23040> L_RETURNFLAG string,
> 0: jdbc:hive2://10.18.98.34:23040> L_RECEIPTDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_ORDERKEY string,
> 0: jdbc:hive2://10.18.98.34:23040> L_PARTKEY string,
> 0: jdbc:hive2://10.18.98.34:23040> L_SUPPKEY   string,
> 0: jdbc:hive2://10.18.98.34:23040> L_LINENUMBER int,
> 0: jdbc:hive2://10.18.98.34:23040> L_QUANTITY double,
> 0: jdbc:hive2://10.18.98.34:23040> L_EXTENDEDPRICE double,
> 0: jdbc:hive2://10.18.98.34:23040> L_DISCOUNT double,
> 0: jdbc:hive2://10.18.98.34:23040> L_TAX double,
> 0: jdbc:hive2://10.18.98.34:23040> L_LINESTATUS string,
> 0: jdbc:hive2://10.18.98.34:23040> L_COMMITDATE string,
> 0: jdbc:hive2://10.18.98.34:23040> L_COMMENT  string
> 0: jdbc:hive2://10.18.98.34:23040> ) STORED BY 'org.apache.carbondata.format'
> 0: jdbc:hive2://10.18.98.34:23040> TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (0.338 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> load data inpath 
> "hdfs://hacluster/user/test/lineitem.tbl.1" into table lineitem 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (48.634 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> create datamap agr_lineitem ON TABLE 
> lineitem USING "org.apache.carbondata.datamap.AggregateDataMapHandler" as 
> select L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from 
> lineitem group by  L_RETURNFLAG, L_LINESTATUS;
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (16.552 seconds)
> 0: jdbc:hive2://10.18.98.34:23040> select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem 
> group by  L_RETURNFLAG, L_LINESTATUS;
> Error: org.apache.spark.sql.AnalysisException: Column doesnot exists in Pre 
> Aggregate table; (state=,code=0)





[jira] [Commented] (CARBONDATA-1740) Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query with order by when main table is having pre-aggregate table

2017-11-16 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16255622#comment-16255622
 ] 

kumar vishal commented on CARBONDATA-1740:
--

This is failing because of the order by clause in the query; the order by 
scenario is not handled in the pre-aggregate rules.

> Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query 
> with order by when main table is having pre-aggregate table
> -
>
> Key: CARBONDATA-1740
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1740
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>  Labels: DFX
> Fix For: 1.3.0
>
>
> lineitem3: has a pre-aggregate table 
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, 
> l_linestatus;
> Error: org.apache.spark.sql.AnalysisException: expression 
> '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an 
> aggregate function. Add to group by or wrap in first() (or first_value) if 
> you don't care which value you get.;;
> Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, 
> sum(l_extendedprice)#2792]
> +- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true
>+- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, 
> l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, 
> sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, 
> lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS 
> aggOrder#2796]
>   +- SubqueryAlias lineitem3
>  +- 
> Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368]
>  CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name 
> :lineitem3, Schema :Some(StructType(StructField(L_SHIPDATE,StringType,true), 
> StructField(L_SHIPMODE,StringType,true), 
> StructField(L_SHIPINSTRUCT,StringType,true), 
> StructField(L_RETURNFLAG,StringType,true), 
> StructField(L_RECEIPTDATE,StringType,true), 
> StructField(L_ORDERKEY,StringType,true), 
> StructField(L_PARTKEY,StringType,true), 
> StructField(L_SUPPKEY,StringType,true), 
> StructField(L_LINENUMBER,IntegerType,true), 
> StructField(L_QUANTITY,DoubleType,true), 
> StructField(L_EXTENDEDPRICE,DoubleType,true), 
> StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), 
> StructField(L_LINESTATUS,StringType,true), 
> StructField(L_COMMITDATE,StringType,true), 
> StructField(L_COMMENT,StringType,true))) ] (state=,code=0)
> lineitem4: no pre-aggregate table created
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, 
> l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | A | F | 1.263625E7   | 1.8938515425239815E10  |
> | N | F | 327800.0 | 4.91387677622E8|
> | N | O | 2.5398626E7  | 3.810981608977963E10   |
> | R | F | 1.2643878E7  | 1.8948524305619884E10  |
> +---+---+--++--+
> *+Expected:+* aggregate query with order by should run fine
> *+Actual:+* aggregate query with order by failed





[jira] [Assigned] (CARBONDATA-1740) Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query with order by when main table is having pre-aggregate table

2017-11-16 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1740:


Assignee: kumar vishal

> Carbon1.3.0-Pre-AggregateTable - Query plan exception for aggregate query 
> with order by when main table is having pre-aggregate table
> -
>
> Key: CARBONDATA-1740
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1740
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>  Labels: DFX
> Fix For: 1.3.0
>
>
> lineitem3: has a pre-aggregate table 
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus order by l_returnflag, 
> l_linestatus;
> Error: org.apache.spark.sql.AnalysisException: expression 
> '`lineitem3_l_returnflag`' is neither present in the group by, nor is it an 
> aggregate function. Add to group by or wrap in first() (or first_value) if 
> you don't care which value you get.;;
> Project [l_returnflag#2356, l_linestatus#2366, sum(l_quantity)#2791, 
> sum(l_extendedprice)#2792]
> +- Sort [aggOrder#2795 ASC NULLS FIRST, aggOrder#2796 ASC NULLS FIRST], true
>+- !Aggregate [l_returnflag#2356, l_linestatus#2366], [l_returnflag#2356, 
> l_linestatus#2366, sum(l_quantity#2362) AS sum(l_quantity)#2791, 
> sum(l_extendedprice#2363) AS sum(l_extendedprice)#2792, 
> lineitem3_l_returnflag#2341 AS aggOrder#2795, lineitem3_l_linestatus#2342 AS 
> aggOrder#2796]
>   +- SubqueryAlias lineitem3
>  +- 
> Relation[L_SHIPDATE#2353,L_SHIPMODE#2354,L_SHIPINSTRUCT#2355,L_RETURNFLAG#2356,L_RECEIPTDATE#2357,L_ORDERKEY#2358,L_PARTKEY#2359,L_SUPPKEY#2360,L_LINENUMBER#2361,L_QUANTITY#2362,L_EXTENDEDPRICE#2363,L_DISCOUNT#2364,L_TAX#2365,L_LINESTATUS#2366,L_COMMITDATE#2367,L_COMMENT#2368]
>  CarbonDatasourceHadoopRelation [ Database name :test_db1, Table name 
> :lineitem3, Schema :Some(StructType(StructField(L_SHIPDATE,StringType,true), 
> StructField(L_SHIPMODE,StringType,true), 
> StructField(L_SHIPINSTRUCT,StringType,true), 
> StructField(L_RETURNFLAG,StringType,true), 
> StructField(L_RECEIPTDATE,StringType,true), 
> StructField(L_ORDERKEY,StringType,true), 
> StructField(L_PARTKEY,StringType,true), 
> StructField(L_SUPPKEY,StringType,true), 
> StructField(L_LINENUMBER,IntegerType,true), 
> StructField(L_QUANTITY,DoubleType,true), 
> StructField(L_EXTENDEDPRICE,DoubleType,true), 
> StructField(L_DISCOUNT,DoubleType,true), StructField(L_TAX,DoubleType,true), 
> StructField(L_LINESTATUS,StringType,true), 
> StructField(L_COMMITDATE,StringType,true), 
> StructField(L_COMMENT,StringType,true))) ] (state=,code=0)
> lineitem4: no pre-aggregate table created
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem4 group by l_returnflag, l_linestatus order by l_returnflag, 
> l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | A | F | 1.263625E7   | 1.8938515425239815E10  |
> | N | F | 327800.0 | 4.91387677622E8|
> | N | O | 2.5398626E7  | 3.810981608977963E10   |
> | R | F | 1.2643878E7  | 1.8948524305619884E10  |
> +---+---+--++--+
> *+Expected:+* aggregate query with order by should run fine
> *+Actual:+* aggregate query with order by failed





[jira] [Comment Edited] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session

2017-11-20 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258966#comment-16258966
 ] 

kumar vishal edited comment on CARBONDATA-1777 at 11/20/17 8:59 AM:


[~Ram@huawei] Please check the executor log; there you will find the detail 
"Query will be executed on table:".
You can also check the query plan to see which table the query hits.


was (Author: kumarvishal09):
[~Ram@huawei] Please check the executor log; there you will find the detail 
"Query will be executed on table:".

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell 
> sessions are not used in the beeline session
> -
>
> Key: CARBONDATA-1777
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create table and load with  data
> Spark-shell:
> 1. create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
> *+Expected:+* Pre-aggregate table should be used in the aggregate query 
> *+Actual:+* Pre-aggregate table is not used
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. 
>  carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
> 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 group by l_returnflag, l_linestatus").show();
> 3. 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +---+---+--+--+
> | database  | tableName | isTemporary  |
> +---+---+--+--+
> | test_db2  | lineitem1 | false|
> | test_db2  | lineitem1_agr1_lineitem1  | false|
> +---+---+--+--+
> 2 rows selected (0.047 seconds)
> Logs:
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
> with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
> db=test_db2 tbl=lineitem1 | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous 
> ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, 
> initialize called | 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,

[jira] [Commented] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session

2017-11-20 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16258966#comment-16258966
 ] 

kumar vishal commented on CARBONDATA-1777:
--

[~Ram@huawei] Please check the executor log; there you will find the detail: Query will be executed on table:

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell 
> sessions are not used in the beeline session
> -
>
> Key: CARBONDATA-1777
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create table and load with  data
> Spark-shell:
> 1. create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
> *+Expected:+* Pre-aggregate table should be used in the aggregate query 
> *+Actual:+* Pre-aggregate table is not used
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. 
>  carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
> 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 group by l_returnflag, l_linestatus").show();
> 3. 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +---+---+--+--+
> | database  | tableName | isTemporary  |
> +---+---+--+--+
> | test_db2  | lineitem1 | false|
> | test_db2  | lineitem1_agr1_lineitem1  | false|
> +---+---+--+--+
> 2 rows selected (0.047 seconds)
> Logs:
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
> with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
> db=test_db2 tbl=lineitem1 | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous 
> ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, 
> initialize called | 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,360 | INFO  | [pool-23-thread-53] | Reading in results 
> for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection 
> used is closing | org.datanucleus.util.Log4JLogger.info(Log4JLogger.java:77)
> 2017-11-20 15:46:48,362 | INFO  | [pool-23-thread-53] | Using di

[jira] [Assigned] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session

2017-11-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1777:


Assignee: kumar vishal  (was: Kunal Kapoor)

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell 
> sessions are not used in the beeline session
> -
>
> Key: CARBONDATA-1777
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>Priority: Minor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create table and load with  data
> Spark-shell:
> 1. create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
> *+Expected:+* Pre-aggregate table should be used in the aggregate query 
> *+Actual:+* Pre-aggregate table is not used
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. 
>  carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
> 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 group by l_returnflag, l_linestatus").show();
> 3. 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +---+---+--+--+
> | database  | tableName | isTemporary  |
> +---+---+--+--+
> | test_db2  | lineitem1 | false|
> | test_db2  | lineitem1_agr1_lineitem1  | false|
> +---+---+--+--+
> 2 rows selected (0.047 seconds)
> Logs:
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
> with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
> db=test_db2 tbl=lineitem1 | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous 
> ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, 
> initialize called | 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,360 | INFO  | [pool-23-thread-53] | Reading in results 
> for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection 
> used is closing | org.datanucleus.util.Log4JLogger.info(Log4JLogger.java:77)
> 2017-11-20 15:46:48,362 | INFO  | [pool-23-thread-53] | Using direct SQL, 
> underlying DB is MYSQL | 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.

[jira] [Assigned] (CARBONDATA-1760) Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when parent table name is not correct while creating datamap.

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1760:


Assignee: Kunal Kapoor

> Carbon 1.3.0- Pre_aggregate: Proper Error message should be displayed, when 
> parent table name is not correct while creating datamap.
> 
>
> Key: CARBONDATA-1760
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1760
> Project: CarbonData
>  Issue Type: Bug
>  Components: sql
>Affects Versions: 1.3.0
>Reporter: Ayushi Sharma
>Assignee: Kunal Kapoor
>Priority: Minor
>  Labels: dfx
>
> Steps:
> 1. CREATE DATAMAP tt3 ON TABLE cust_2 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" AS SELECT c_custkey, 
> c_name, sum(c_acctbal), avg(c_acctbal), count(c_acctbal) FROM tstcust GROUP 
> BY c_custkey, c_name;
> Issue:
> A proper error message is not displayed; it throws an "assertion failed" error.
> Expected:
> A proper error message should be displayed if the parent table name has any 
> ambiguity.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1737) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate table loads partially when segment filter is set on the main table

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1737:


Assignee: Kunal Kapoor

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate table loads partially when 
> segment filter is set on the main table
> -
>
> Key: CARBONDATA-1737
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1737
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> 1. Create a table
> create table if not exists lineitem2(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> 2. Load 2 times to create 2 segments
>  load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem2 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 3. Check the table content without setting any filter:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem2 group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 327800.0 | 4.91387677624E8|
> | A | F | 1.263625E7   | 1.893851542524009E10   |
> | N | O | 2.5398626E7  | 3.810981608977967E10   |
> | R | F | 1.2643878E7  | 1.8948524305619976E10  |
> +---+---+--++--+
> 4. Set segment filter on the main table:
> set carbon.input.segments.test_db1.lineitem2=1;
> +---++--+
> |key| value  |
> +---++--+
> | carbon.input.segments.test_db1.lineitem2  | 1  |
> +---++--+
> 5. Create pre-aggregate table 
> create datamap agr_lineitem2 ON TABLE lineitem2 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem2 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 6. Check table content:
>  select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem2 group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 163900.0 | 2.456938388124E8   |
> | A | F | 6318125.0| 9.469257712620043E9|
> | N | O | 1.2699313E7  | 1.9054908044889835E10  |
> | R | F | 6321939.0| 9.474262152809986E9|
> +---+---+--++--+
> 7. remove the filter on segment
> 0: jdbc:hive2://10.18.98.48:23040> reset;
> 8. Check the table content:
>  select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem2 group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 163900.0 | 2.456938388124E8   |
> | A | F | 6318125.0| 9.469257712620043E9|
> | N | O | 1.2699313E7  | 1.9054908044889835E10  |
> | R | F | 6321939.0| 9.474262152809986E9|
> +---+---+
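The totals in step 6 are exactly half of those in step 3 (e.g. sum(l_quantity) for N/F drops from 327800.0 to 163900.0), which is consistent with the pre-aggregate table having been built from only one of two identical segments while the segment filter was set. A minimal sketch in Python with made-up rows:

```python
# Loading the same file twice creates two identical segments.
segment = [('N', 'F', 81950.0), ('N', 'F', 81950.0)]  # made-up rows
table = segment + segment                             # segments 0 and 1

def sum_quantity(rows):
    """sum(l_quantity) over the given rows."""
    return sum(q for _, _, q in rows)

print(sum_quantity(table))    # 327800.0, the main-table result in step 3
print(sum_quantity(segment))  # 163900.0, the halved result in step 6
```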

[jira] [Assigned] (CARBONDATA-1736) Carbon1.3.0-Pre-AggregateTable -Query from segment set is not effective when pre-aggregate table is present

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1736?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1736:


Assignee: kumar vishal

>  Carbon1.3.0-Pre-AggregateTable -Query from segment set is not effective when 
> pre-aggregate table is present 
> -
>
> Key: CARBONDATA-1736
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1736
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>  Labels: DFX
> Fix For: 1.3.0
>
>
> 1. Create a table
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> 2. Run load :
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 3. create pre-agg table 
> create datamap agr_lineitem3 ON TABLE lineitem3 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem3 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 3.  Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 4913382.0| 7.369901176949993E9|
> | A | F | 1.88818373E8 | 2.8310705145736383E11  |
> | N | O | 3.82400594E8 | 5.734650756707479E11   |
> | R | F | 1.88960009E8 | 2.833523780876951E11   |
> +---+---+--++--+
> 4 rows selected (1.568 seconds)
> 4. Load one more time:
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 5.  Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 9826764.0| 1.4739802353899986E10  |
> | A | F | 3.77636746E8 | 5.662141029147278E11   |
> | N | O | 7.64801188E8 | 1.1469301513414958E12  |
> | R | F | 3.77920018E8 | 5.667047561753901E11   |
> +---+---+--++--+
> 6. Set query from segment 1:
> 0: jdbc:hive2://10.18.98.48:23040> set 
> carbon.input.segments.test_db1.lilneitem1=1;
> +++--+
> |key | value  |
> +++--+
> | carbon.input.segments.test_db1.lilneitem1  | 1  |
> +++--+
> 7. Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> *+Expected+*: It should return the values from segment 1 alone.
> *+Actual :+* : It returns values from both segments
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)

[jira] [Assigned] (CARBONDATA-1518) 2. Support creating timeseries while creating main table.

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1518?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1518:


Assignee: kumar vishal

> 2. Support creating timeseries while creating main table.
> -
>
> Key: CARBONDATA-1518
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1518
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: kumar vishal
>
> User can give timeseries option while creating the main table itself and 
> carbon will create aggregate tables automatically.
> {code}
> CREATE TABLE agg_sales
> STORED BY 'carbondata'
> TBLPROPERTIES ('parent_table'='sales', 'timeseries_column'='order_time', 
> 'granularity'='hour', 'rollup'='quantity:sum, max # user_id: count # price: 
> sum, max, min, avg') 
> {code}
> In the above case, the user chooses the timeseries_column, granularity, and 
> aggregation types for measures, and Carbon automatically generates year-, 
> month-, day-, and hour-level aggregation tables (4 tables in total; their 
> table names will be prefixed with agg_sales). 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
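The proposed 'rollup' property packs several column/aggregation pairs into one string. As an illustration of that format, here is a hypothetical parser; the "col: agg, agg # col: agg" layout is taken from the example above, while the function name and output shape are illustrative assumptions, not CarbonData code:

```python
# Hypothetical parser for the proposed 'rollup' table property.
def parse_rollup(prop: str) -> dict:
    """Map each measure column to its list of aggregate functions."""
    rollup = {}
    for entry in prop.split('#'):
        column, aggs = entry.split(':', 1)
        rollup[column.strip()] = [a.strip() for a in aggs.split(',')]
    return rollup

spec = 'quantity:sum, max # user_id: count # price: sum, max, min, avg'
print(parse_rollup(spec))
# {'quantity': ['sum', 'max'], 'user_id': ['count'],
#  'price': ['sum', 'max', 'min', 'avg']}
```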


[jira] [Assigned] (CARBONDATA-1519) 3. Create UDF for timestamp to extract year,month,day,hour and minute from timestamp and date

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1519:


Assignee: kumar vishal

> 3. Create UDF for timestamp to extract year,month,day,hour and minute from 
> timestamp and date
> -
>
> Key: CARBONDATA-1519
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1519
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: kumar vishal
>
> Create UDF for timestamp to extract year,month,day,hour and minute from 
> timestamp and date



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
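The requested UDFs boil down to component extraction from a timestamp. A sketch of the intended semantics using Python's datetime; the `extract` helper and its unit names are illustrative assumptions, since the real UDFs would be implemented inside Carbon/Spark:

```python
from datetime import datetime

def extract(ts: datetime, unit: str) -> int:
    """Return the year/month/day/hour/minute component of a timestamp."""
    return {
        'year': ts.year, 'month': ts.month, 'day': ts.day,
        'hour': ts.hour, 'minute': ts.minute,
    }[unit]

ts = datetime(2017, 12, 7, 15, 46)
print([extract(ts, u) for u in ('year', 'month', 'day', 'hour', 'minute')])
# [2017, 12, 7, 15, 46]
```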


[jira] [Assigned] (CARBONDATA-1526) 10. Handle compaction in aggregation tables.

2017-11-21 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1526:


Assignee: Kunal Kapoor

> 10. Handle compaction in aggregation tables.
> 
>
> Key: CARBONDATA-1526
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1526
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Kunal Kapoor
>
> Users can trigger compaction on a pre-aggregate table directly; it will 
> further merge the segments inside the pre-aggregate table. To do that, use 
> the ALTER TABLE COMPACT command on the pre-aggregate table just like on the 
> main table. 
> For implementation, there are two kinds of compaction: 
> 1. Mergeable pre-aggregate tables: if the aggregate functions are count, max, 
> min, sum, or avg, the pre-aggregate table segments can be merged directly 
> without re-computing them.
> 2. Non-mergeable pre-aggregate tables: if the aggregate functions include 
> distinct_count, the results need to be re-computed when doing compaction on 
> the pre-aggregate table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
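The mergeable/non-mergeable split above can be seen with plain numbers: partial results for sum/max/min combine across segments, while per-segment distinct counts do not. A quick sketch with made-up segment data:

```python
# Why sum/max/min merge across segments while distinct_count does not.
# Segment contents are made-up example data.
seg1, seg2 = [1, 2, 2, 3], [3, 3, 4]

# Mergeable: combining per-segment partial results equals recomputing
# from the raw rows, so compacted segments can be merged directly.
assert sum(seg1) + sum(seg2) == sum(seg1 + seg2)
assert max(max(seg1), max(seg2)) == max(seg1 + seg2)

# Non-mergeable: per-segment distinct counts over-count values present
# in both segments (3 here), so compaction must recompute from raw data.
merged_counts = len(set(seg1)) + len(set(seg2))   # 3 + 2 = 5
recomputed = len(set(seg1 + seg2))                # {1, 2, 3, 4} -> 4
print(merged_counts, recomputed)  # 5 4
```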


[jira] [Reopened] (CARBONDATA-1841) Data is not being loaded into pre-aggregation table after creation

2017-11-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reopened CARBONDATA-1841:
--

> Data is not being loaded into pre-aggregation table after creation
> --
>
> Key: CARBONDATA-1841
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1841
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1877) Rollup in query on timeseries table is not working

2017-12-07 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1877:


 Summary: Rollup in query on timeseries table is not working
 Key: CARBONDATA-1877
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1877
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


*Problem:* When an hour-level timeseries table is present and the user fires a 
year-level query, the query hits the main table instead of rolling up from the 
hour-level table.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1877) Rollup on query in case of timeseries table is not working

2017-12-07 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-1877:
-
Summary: Rollup on query in case of timeseries table is not working  (was: 
Rollup in query on timeseries table is not working)

> Rollup on query in case of timeseries table is not working
> --
>
> Key: CARBONDATA-1877
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1877
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> *Problem:* When an hour-level timeseries table is present and the user fires 
> a year-level query, the query hits the main table instead of rolling up from 
> the hour-level table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1877) Rollup in query on timeseries table is not working

2017-12-07 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1877:


Assignee: kumar vishal

> Rollup in query on timeseries table is not working
> --
>
> Key: CARBONDATA-1877
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1877
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>
> *Problem:* When an hour-level timeseries table is present and the user fires 
> a year-level query, the query hits the main table instead of rolling up from 
> the hour-level table.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (CARBONDATA-1516) Support pre-aggregate tables and timeseries in carbondata

2017-12-07 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1516?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-1516:
-
Attachment: CarbonData Pre-aggregation Table_v1.2.pdf

> Support pre-aggregate tables and timeseries in carbondata
> -
>
> Key: CARBONDATA-1516
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1516
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Ravindra Pesala
> Attachments: CarbonData Pre-aggregation Table.pdf, CarbonData 
> Pre-aggregation Table_v1.1.pdf, CarbonData Pre-aggregation Table_v1.2.pdf
>
>
> Currently CarbonData has standard SQL capability on distributed data 
> sets. CarbonData should support pre-aggregate tables and timeseries to 
> improve query performance.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (CARBONDATA-1888) Compaction is failing in case of timeseries

2017-12-12 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1888:


 Summary: Compaction is failing in case of timeseries
 Key: CARBONDATA-1888
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1888
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Compaction is failing in case of timeseries



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1881) insert overwrite not working properly for pre-aggregate tables

2017-12-13 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1881.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> insert overwrite not working properly for pre-aggregate tables
> --
>
> Key: CARBONDATA-1881
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1881
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> When insert overwrite is fired on the main table, the pre-aggregate 
> tables are not overwritten with the new values; instead the values are 
> appended to the table as in a normal insert.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1891) None.get when creating timeseries table after loading data into main table

2017-12-13 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1891.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> None.get when creating timeseries table after loading data into main table
> --
>
> Key: CARBONDATA-1891
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1891
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 1h 20m
>  Remaining Estimate: 0h
>
> *Steps to reproduce*
> 1. CREATE TABLE mainTable(mytime timestamp, name string, age int) STORED BY 
> 'org.apache.carbondata.format'
> 2. LOAD DATA LOCAL INPATH 'timeseriestest.csv' into table mainTable
> 3. create datamap agg0 on table mainTable using 'preaggregate' DMPROPERTIES 
> ('timeseries.eventTime'='mytime', 
> 'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1') as 
> select mytime, sum(age) from mainTable group by mytime



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (CARBONDATA-1887) block pruning not happening is carbon for ShortType and SmallIntType columns

2017-12-14 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1887.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> block pruning not happening is carbon for ShortType and SmallIntType columns
> 
>
> Key: CARBONDATA-1887
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1887
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.3.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> spark.sql(
>   s"""
>  | create table test_numeric_type(c1 int, c2 long, c3 smallint, c4 
> bigint, c5 short) stored by 'carbondata'
>""".stripMargin).show()
> spark.sql(
>   s"""
>  | insert into test_numeric_type select 
> 1,111,111,11,'2019-01-03 12:12:12'
>""".stripMargin).show()
> spark.sql(
>   s"""
>  | insert into test_numeric_type select 
> 2,222,222,22,'2020-01-03 12:12:12'
>""".stripMargin).show()
> spark.sql(
>   s"""
>  | insert into test_numeric_type select 
> 3,333,333,33,'2021-01-03 12:12:12'
>""".stripMargin).show()
> spark.sql(
>   s"""
>  | select * from test_numeric_type where c5>
>""".stripMargin).show()
> Only two blocks should be selected but all blocks are selected during query 
> execution.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
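The fix restores min/max block pruning for short/smallint columns. A minimal sketch of how such pruning decides which blocks a `c5 > value` filter can skip; the block ranges and the `prune` helper are made-up illustrations, not Carbon internals:

```python
# Block-level metadata: (min, max) of column c5 per block; made-up data.
blocks = [(1, 100), (101, 200), (201, 333)]

def prune(blocks, lower):
    """For a filter `c5 > lower`, keep only blocks whose max exceeds it."""
    return [(lo, hi) for (lo, hi) in blocks if hi > lower]

print(prune(blocks, 150))  # [(101, 200), (201, 333)]; block (1, 100) skipped
```

When the column's min/max statistics are ignored, as in the reported bug, every block survives this step and must be scanned.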


[jira] [Created] (CARBONDATA-1898) Like, Contains, Ends With query optimization in case of or filter

2017-12-15 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1898:


 Summary: Like, Contains, Ends With query optimization in case of 
or filter
 Key: CARBONDATA-1898
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1898
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


*Problem:* Queries with like, contains, or ends-with filters combined entirely 
with OR conditions take more time in Carbon.
*Solution:* For this type of query, avoid filter pushdown and let Spark handle 
those filters.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
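One way to see why pushing such filters down buys nothing: a block's lexicographic [min, max] range cannot rule out values that merely contain or end with a substring, so no block can be skipped and the predicate is evaluated row by row anyway, work Spark can do itself. An illustrative sketch with made-up blocks:

```python
# Per-block values for a sorted string column; made-up data. The block
# ranges (apple..cherry, grape..mango) are disjoint, yet both can hold
# values ending in 'go', so an ends-with filter can never skip a block.
blocks = {
    'block0': ['apple', 'cargo', 'cherry'],
    'block1': ['grape', 'mango'],
}

# The predicate must be evaluated row by row in every block; pushing it
# down to the storage layer adds work without pruning anything.
hits = {name: [v for v in rows if v.endswith('go')]
        for name, rows in blocks.items()}
print(hits)  # {'block0': ['cargo'], 'block1': ['mango']}
```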


[jira] [Created] (CARBONDATA-1901) Fix Pre aggregate data map creation and query parsing issue

2017-12-15 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1901:


 Summary: Fix Pre aggregate data map creation and query parsing 
issue
 Key: CARBONDATA-1901
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1901
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


*Problem:* The following issues occur with pre-aggregate tables:
1. The pre-aggregate data map table's column order does not match the query 
given by the user, so data is loaded into the wrong columns.
2. When an aggregate function contains an expression, the query fails with a 
match error.
3. The encoders of the pre-aggregate data map columns do not match those of 
the parent table's columns.
*Solution:*
1. Do not consider group-by columns in the pre-aggregate table.
2. When an aggregate function contains an expression, hit the main table.
3. Get the encoder from the main table and add it to the pre-aggregate table 
column; when the aggregation type is sum or avg, create a measure column.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (CARBONDATA-1901) Fixed Pre aggregate data map creation and query parsing

2017-12-15 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1901:


Assignee: kumar vishal

> Fixed Pre aggregate data map creation and query parsing
> ---
>
> Key: CARBONDATA-1901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1901
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem:*Fixed below issues in case of pre aggregate
> 1. Pre aggregate data map table column order is not as per query given  by 
> user because of which while data is loaded to wrong column
> 2. when aggregate function contains any expression query is failing with 
> match error
> 3. pre aggregate data map columns and parent tables columns encoder is not 
> matching
> *Solution:*
> 1. Do not consider group columns in pre aggregate
> 2. when aggregate function contains any expression hit the maintable
> 3. Get encoder from main table and add in pre aggregate table column
> When aggregation type is sum or avg create measure column





[jira] [Updated] (CARBONDATA-1901) Fixed Pre aggregate data map creation and query parsing

2017-12-15 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-1901:
-
Summary: Fixed Pre aggregate data map creation and query parsing  (was: Fix 
Pre aggregate data map creation and query parsing issue)

> Fixed Pre aggregate data map creation and query parsing
> ---
>
> Key: CARBONDATA-1901
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1901
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> *Problem:*Fixed below issues in case of pre aggregate
> 1. Pre aggregate data map table column order is not as per query given  by 
> user because of which while data is loaded to wrong column
> 2. when aggregate function contains any expression query is failing with 
> match error
> 3. pre aggregate data map columns and parent tables columns encoder is not 
> matching
> *Solution:*
> 1. Do not consider group columns in pre aggregate
> 2. when aggregate function contains any expression hit the maintable
> 3. Get encoder from main table and add in pre aggregate table column
> When aggregation type is sum or avg create measure column





[jira] [Resolved] (CARBONDATA-1743) Carbon1.3.0-Pre-AggregateTable - Query returns no value if run at the time of pre-aggregate table creation

2017-12-19 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1743?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1743.
--
Resolution: Fixed

> Carbon1.3.0-Pre-AggregateTable - Query returns no value if run at the time of 
> pre-aggregate table creation
> --
>
> Key: CARBONDATA-1743
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1743
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>  Labels: DFX
> Fix For: 1.3.0
>
>  Time Spent: 5h 20m
>  Remaining Estimate: 0h
>
> Steps:
> 1. Create table and load with large data
> create table if not exists lineitem4(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem4 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. Create a pre-aggregate table 
> create datamap agr_lineitem4 ON TABLE lineitem4 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem4 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 3. Run aggregate query at the same time
>  select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem4 group by l_returnflag, l_linestatus;
> *+Expected:+*: aggregate query should fetch data either from main table or 
> pre-aggregate table.
> *+Actual:+* aggregate query does not return data until the pre-aggregate 
> table is created
> 0: jdbc:hive2://10.18.98.48:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
> group by l_returnflag, l_linestatus;
> +---+---+--+---+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
> +---+---+--+---+--+
> +---+---+--+---+--+
> No rows selected (1.74 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
> group by l_returnflag, l_linestatus;
> +---+---+--+---+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
> +---+---+--+---+--+
> +---+---+--+---+--+
> No rows selected (0.746 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
> group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 2.9808092E7  | 4.471079473931997E10   |
> | A | F | 1.145546488E9| 1.717580824169429E12   |
> | N | O | 2.31980219E9 | 3.4789002701143467E12  |
> | R | F | 1.146403932E9| 1.7190627928317903E12  |
> +---+---+--++--+
> 4 rows selected (0.8 seconds)
> 0: jdbc:hive2://10.18.98.48:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem4 
> group by l_returnflag, l_linestatus;
> +---+---+--++--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  |  sum(l_extendedprice)  |
> +---+---+--++--+
> | N | F | 2.9808092E7  | 4.471079473931997E10   |
> | A | F | 1.145546488E9| 1.717580824169429E12   |
> | N

[jira] [Resolved] (CARBONDATA-1907) Avoid unnecessary logging to improve query performance for no dictionary non string columns

2017-12-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1907?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1907.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Avoid unnecessary logging to improve query performance for no dictionary non 
> string columns
> ---
>
> Key: CARBONDATA-1907
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1907
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> For no-dictionary columns of non-string data types, an exception is thrown 
> while parsing empty data. The resulting excessive logging impacts query 
> performance.
> Log message printed:
> "Problem while converting data type"





[jira] [Assigned] (CARBONDATA-1777) Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell sessions are not used in the beeline session

2017-12-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1777:


Assignee: Kunal Kapoor  (was: kumar vishal)

> Carbon1.3.0-Pre-AggregateTable - Pre-aggregate tables created in Spark-shell 
> sessions are not used in the beeline session
> -
>
> Key: CARBONDATA-1777
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1777
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Kunal Kapoor
>Priority: Minor
>  Labels: DFX
> Fix For: 1.3.0
>
>
> Steps:
> Beeline:
> 1. Create table and load with  data
> Spark-shell:
> 1. create a pre-aggregate table
> Beeline:
> 1. Run aggregate query
> *+Expected:+* Pre-aggregate table should be used in the aggregate query 
> *+Actual:+* Pre-aggregate table is not used
> 1.
> create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.5" into table 
> lineitem1 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> 2. 
>  carbon.sql("create datamap agr1_lineitem1 ON TABLE lineitem1 USING 
> 'org.apache.carbondata.datamap.AggregateDataMapHandler' as select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 group by l_returnflag, l_linestatus").show();
> 3. 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus;
> Actual:
> 0: jdbc:hive2://10.18.98.136:23040> show tables;
> +---+---+--+--+
> | database  | tableName | isTemporary  |
> +---+---+--+--+
> | test_db2  | lineitem1 | false|
> | test_db2  | lineitem1_agr1_lineitem1  | false|
> +---+---+--+--+
> 2 rows selected (0.047 seconds)
> Logs:
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Running query 'select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus' 
> with 7f3091a8-4d7b-40ac-840f-9db6f564c9cf | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,314 | INFO  | [pool-23-thread-53] | Parsing command: 
> select 
> l_returnflag,l_linestatus,sum(l_quantity),avg(l_quantity),count(l_quantity) 
> from lineitem1 where l_returnflag = 'R' group by l_returnflag, l_linestatus | 
> org.apache.spark.internal.Logging$class.logInfo(Logging.scala:54)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | 55: get_table : 
> db=test_db2 tbl=lineitem1 | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logInfo(HiveMetaStore.java:746)
> 2017-11-20 15:46:48,353 | INFO  | [pool-23-thread-53] | ugi=anonymous 
> ip=unknown-ip-addr  cmd=get_table : db=test_db2 tbl=lineitem1| 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.logAuditEvent(HiveMetaStore.java:371)
> 2017-11-20 15:46:48,354 | INFO  | [pool-23-thread-53] | 55: Opening raw store 
> with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore | 
> org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:589)
> 2017-11-20 15:46:48,355 | INFO  | [pool-23-thread-53] | ObjectStore, 
> initialize called | 
> org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:289)
> 2017-11-20 15:46:48,360 | INFO  | [pool-23-thread-53] | Reading in results 
> for query "org.datanucleus.store.rdbms.query.SQLQuery@0" since the connection 
> used is closing | org.datanucleus.util.Log4JLogger.info(Log4JLogger.java:77)
> 2017-11-20 15:46:48,362 | INFO  | [pool-23-thread-53] | Using direct SQL, 
> underlying DB is MYSQL | 
> org.apache.hadoop.hive.metastore.MetaStoreDirectSql.

[jira] [Resolved] (CARBONDATA-1913) Global Sort data dataload fails for big with RPC timeout exception

2017-12-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1913.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Global Sort data dataload fails for big with RPC timeout exception 
> ---
>
> Key: CARBONDATA-1913
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1913
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Mohammad Shahid Khan
>Assignee: Mohammad Shahid Khan
> Fix For: 1.3.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When the global sort option is used for big data, the load sometimes fails 
> with an RPC timeout after 120s.
> This happens because the driver is not able to unpersist the RDD cache 
> within 120s.
> The issue is caused by the blocking RDD unpersist call. Sometimes Spark is 
> not able to unpersist the RDD within the default "spark.rpc.askTimeout" or 
> "spark.network.timeout" time.





[jira] [Resolved] (CARBONDATA-1714) Carbon1.3.0-Alter Table - Select columns with is null and limit throws ArrayIndexOutOfBoundsException after multiple alter

2017-12-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1714.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Carbon1.3.0-Alter Table - Select columns with is null and limit throws 
> ArrayIndexOutOfBoundsException after multiple alter
> --
>
> Key: CARBONDATA-1714
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1714
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
> Environment: 3 node ant cluster- SUSE 11 SP4
>Reporter: Chetan Bhat
>Assignee: Jatin
>  Labels: DFX
> Fix For: 1.3.0
>
>  Time Spent: 7h 20m
>  Remaining Estimate: 0h
>
> Steps -
> Execute the below queries in sequence.
> create database test;
> use test;
> CREATE TABLE uniqdata111785 (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='INTEGER_COLUMN1,CUST_ID');
> LOAD DATA INPATH 'hdfs://hacluster/chetan/2000_UniqData.csv' into table 
> uniqdata111785 OPTIONS('DELIMITER'=',' , 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> alter table test.uniqdata111785 RENAME TO  uniqdata1117856;
> select * from test.uniqdata1117856 limit 100;
> ALTER TABLE test.uniqdata1117856 ADD COLUMNS (cust_name1 int);
> select * from test.uniqdata1117856 where cust_name1 is null limit 100;
> ALTER TABLE test.uniqdata1117856 DROP COLUMNS (cust_name1);
> select * from test.uniqdata1117856 where cust_name1 is null limit 100;
> ALTER TABLE test.uniqdata1117856 CHANGE CUST_ID CUST_ID BIGINT;
> select * from test.uniqdata1117856 where CUST_ID in (10013,10011,1,10019) 
> limit 10;
> ALTER  TABLE test.uniqdata1117856 ADD COLUMNS (a1 INT, b1 STRING) 
> TBLPROPERTIES('DICTIONARY_EXCLUDE'='b1');
> select a1,b1 from test.uniqdata1117856  where a1 is null and b1 is null limit 
> 100;
> Actual Issue : Select columns with is null and limit throws 
> ArrayIndexOutOfBoundsException after multiple alter operations.
> 0: jdbc:hive2://10.18.98.34:23040> select a1,b1 from test.uniqdata1117856  
> where a1 is null and b1 is null limit 100;
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: 
> Task 0 in stage 9.0 failed 4 times, most recent failure: Lost task 0.3 in 
> stage 9.0 (TID 14, BLR114269, executor 2): 
> java.lang.ArrayIndexOutOfBoundsException: 7
> at 
> org.apache.carbondata.core.scan.model.QueryModel.setDimAndMsrColumnNode(QueryModel.java:223)
> at 
> org.apache.carbondata.core.scan.model.QueryModel.processFilterExpression(QueryModel.java:172)
> at 
> org.apache.carbondata.core.scan.model.QueryModel.processFilterExpression(QueryModel.java:181)
> at 
> org.apache.carbondata.hadoop.util.CarbonInputFormatUtil.processFilterExpression(CarbonInputFormatUtil.java:118)
> at 
> org.apache.carbondata.hadoop.api.CarbonTableInputFormat.getQueryModel(CarbonTableInputFormat.java:791)
> at 
> org.apache.carbondata.spark.rdd.CarbonScanRDD.internalCompute(CarbonScanRDD.scala:250)
> at 
> org.apache.carbondata.spark.rdd.CarbonRDD.compute(CarbonRDD.scala:60)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at 
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> at org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
> at org.apache.spark.scheduler.Task.run(Task.scala:99)
> at 
> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:282)
> at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> at java.lang.Thread.run(Thread.java:745)
> Driver stacktrace: (state=,code=0)
> Expected : The select query should be successful after m

[jira] [Created] (CARBONDATA-1925) Support expression inside aggregate expression in create and load data on Pre aggregate table

2017-12-21 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1925:


 Summary: Support expression inside aggregate expression in create 
and load data on Pre aggregate table 
 Key: CARBONDATA-1925
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1925
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal
Assignee: kumar vishal


Support expression inside aggregate expression in create and load data on Pre 
aggregate table 





[jira] [Created] (CARBONDATA-1926) Support expression inside aggregate expression during query on Pre Aggregate table

2017-12-21 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1926:


 Summary: Support expression inside aggregate expression during 
query on Pre Aggregate table
 Key: CARBONDATA-1926
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1926
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal
Assignee: kumar vishal


Support expression inside aggregate expression during query on Pre Aggregate 
table





[jira] [Created] (CARBONDATA-1927) Support sub query on Pre Aggregate table

2017-12-21 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-1927:


 Summary: Support sub query on Pre Aggregate table
 Key: CARBONDATA-1927
 URL: https://issues.apache.org/jira/browse/CARBONDATA-1927
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal
Assignee: kumar vishal


Currently a sub-query does not hit the pre-aggregate table. This Jira is to 
handle sub-queries on the pre-aggregate table.





[jira] [Resolved] (CARBONDATA-1930) Dictionary not found exception is thrown when filter expression is given in aggergate table query

2017-12-22 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1930.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Dictionary not found exception is thrown when filter expression is given in 
> aggergate table query
> -
>
> Key: CARBONDATA-1930
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1930
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> Steps to reproduce;
> 1. CREATE TABLE filtertable(id int, name string, city string, age string) 
> STORED BY  'org.apache.carbondata.format' 
> TBLPROPERTIES('dictionary_include'='name,age')
> 2. LOAD DATA LOCAL INPATH 
> 3. create datamap agg9 on table filtertable using 'preaggregate' as select 
> name, age, sum(age) from filtertable group by name, age
> 4. select name, sum(age) from filtertable where age = '29' group by name, age





[jira] [Assigned] (CARBONDATA-1931) DataLoad failed for Aggregate table when measure is used for groupby

2017-12-26 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1931:


Assignee: Babulal  (was: kumar vishal)

> DataLoad failed for Aggregate table when measure is used for groupby
> 
>
> Key: CARBONDATA-1931
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1931
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Run commands in sequence
>  
>  spark.sql(
>   "create table y(year int,month int,name string,salary int) stored by 
> 'carbondata'"
> )
> spark.sql(
>   s"insert into y select 10,11,'x',12"
> )
> spark.sql("create datamap y1_sum1 on table y using 'preaggregate' as select 
> year,name,sum(salary) from y group by year,name")
> Result: Aggregate table creation fails.





[jira] [Resolved] (CARBONDATA-1931) DataLoad failed for Aggregate table when measure is used for groupby

2017-12-26 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1931.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> DataLoad failed for Aggregate table when measure is used for groupby
> 
>
> Key: CARBONDATA-1931
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1931
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Run commands in sequence
>  
>  spark.sql(
>   "create table y(year int,month int,name string,salary int) stored by 
> 'carbondata'"
> )
> spark.sql(
>   s"insert into y select 10,11,'x',12"
> )
> spark.sql("create datamap y1_sum1 on table y using 'preaggregate' as select 
> year,name,sum(salary) from y group by year,name")
> Result :- Aggregate creation is failed. 





[jira] [Assigned] (CARBONDATA-1931) DataLoad failed for Aggregate table when measure is used for groupby

2017-12-26 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1931:


Assignee: kumar vishal

> DataLoad failed for Aggregate table when measure is used for groupby
> 
>
> Key: CARBONDATA-1931
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1931
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: kumar vishal
> Fix For: 1.3.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> Run commands in sequence
>  
>  spark.sql(
>   "create table y(year int,month int,name string,salary int) stored by 
> 'carbondata'"
> )
> spark.sql(
>   s"insert into y select 10,11,'x',12"
> )
> spark.sql("create datamap y1_sum1 on table y using 'preaggregate' as select 
> year,name,sum(salary) from y group by year,name")
> Result :- Aggregate creation is failed. 





[jira] [Resolved] (CARBONDATA-1953) Pre-aggregate Should inherit sort column,sort_scope,dictionary encoding from main table

2018-01-03 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1953.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Pre-aggregate Should inherit sort column,sort_scope,dictionary encoding from 
> main table
> ---
>
> Key: CARBONDATA-1953
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1953
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
> Fix For: 1.3.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> Pre-aggregate Should inherit sort column,sort_scope,dictionary encoding from 
> main table
>  spark.sql("drop table if exists y ")
> spark.sql("create table y(year int,month int,name string,salary int) 
> stored by 'carbondata' 
> tblproperties('NO_INVERTED_INDEX'='name','sort_scope'='Global_sort','table_blocksize'='23','Dictionary_include'='month','Dictionary_exclude'='year,name','sort_columns'='month,year,name')")
> spark.sql("insert into y select 10,11,'babu',12")
> spark.sql("create datamap y1_sum1 on table y using 'preaggregate' as select 
> year,month,name,sum(salary) from y group by year,month,name")
> spark.sql("desc formatted y").show(100,false)
> spark.sql("desc formatted y_y1_sum1").show(100,false)
> --
> |col_name|data_type   
> |comment  
>|
> ++++
> |y_year  |int 
> |KEY COLUMN,NOINVERTEDINDEX,null  
>|
> |y_month |int 
> |DICTIONARY, KEY 
> COLUMN,NOINVERTEDINDEX,null |
> |y_name  |string  
> |KEY COLUMN,null  
>|
> |y_salary_sum|bigint  
> |MEASURE,null 
>|
> ||
> | 
>|
> |##Detailed Table Information|
> | 
>|
> |Database Name   |default 
> | 
>|
> |Table Name  |y_y1_sum1   
> | 
>|
> |CARBON Store Path   
> |D:\code\carbondata\myfork\incubator-carbondata/examples/spark2/target/store  
>||
> |Comment |
> | 
>|
> |Table Block Size|1024 MB 
> | 
>|
> |Table Data Size |1297
> | 
>|
> |Table Index Size|1076
> | 
>|
> |Last Update Time|1514546841061   
> | 
>|
> |SORT_SCOPE  |LOCAL_SORT  
> |LOCAL_SORT   
>|
> |Streaming   |false  

[jira] [Assigned] (CARBONDATA-1719) Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table created in parallel with table load, aggregate query returns no data

2018-01-04 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1719:


Assignee: Jatin  (was: kumar vishal)

> Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table 
> created in parallel with table load, aggregate query returns no data
> --
>
> Key: CARBONDATA-1719
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1719
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Jatin
>  Labels: DFX
> Fix For: 1.3.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> 1. Create a table
> create table if not exists lineitem3(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> 2. Run load queries and create pre-agg table queries in diff console:
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> create datamap agr_lineitem3 ON TABLE lineitem3 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem3 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 3.  Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> 0: jdbc:hive2://10.18.98.34:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem3 
> group by l_returnflag, l_linestatus;
> +---+---+--+---+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
> +---+---+--+---+--+
> +---+---+--+---+--+
> No rows selected (1.258 seconds)
> HDFS data:
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3_agr_lineitem3/Fact/Part0/Segment_0
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3/Fact/Part0/Segment_0
> Found 27 items
> -rw-r--r--   2 root users  22148 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/1510740293106.carbonindexmerge
> -rw-r--r--   2 root users   58353052 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58351680 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58364823 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58356303 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58342246 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58353186 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58352964 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58357183 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58345739 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-2-0_batchno0-0-1510740300247.carbondata
> Yarn job stages:
> 29
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PAR

[jira] [Assigned] (CARBONDATA-1719) Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table created in parallel with table load, aggregate query returns no data

2018-01-04 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1719:


Assignee: kumar vishal  (was: Jatin)

> Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table 
> created in parallel with table load, aggregate query returns no data
> --
>
> Key: CARBONDATA-1719
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1719
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: kumar vishal
>  Labels: DFX
> Fix For: 1.3.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> 1. Create a table
> create table if not exists lineitem3(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> 2. Run load queries and create pre-agg table queries in diff console:
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> create datamap agr_lineitem3 ON TABLE lineitem3 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem3 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 3.  Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> 0: jdbc:hive2://10.18.98.34:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem3 
> group by l_returnflag, l_linestatus;
> +---+---+--+---+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
> +---+---+--+---+--+
> +---+---+--+---+--+
> No rows selected (1.258 seconds)
> HDFS data:
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3_agr_lineitem3/Fact/Part0/Segment_0
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3/Fact/Part0/Segment_0
> Found 27 items
> -rw-r--r--   2 root users  22148 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/1510740293106.carbonindexmerge
> -rw-r--r--   2 root users   58353052 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58351680 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58364823 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58356303 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58342246 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58353186 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58352964 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58357183 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58345739 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-2-0_batchno0-0-1510740300247.carbondata
> Yarn job stages:
> 29
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKE

[jira] [Resolved] (CARBONDATA-1719) Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table created in parallel with table load, aggregate query returns no data

2018-01-04 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1719.
--
Resolution: Fixed

> Carbon1.3.0-Pre-AggregateTable - Empty segment is created when pre-aggr table 
> created in parallel with table load, aggregate query returns no data
> --
>
> Key: CARBONDATA-1719
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1719
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-load
>Affects Versions: 1.3.0
> Environment: Test - 3 node ant cluster
>Reporter: Ramakrishna S
>Assignee: Jatin
>  Labels: DFX
> Fix For: 1.3.0
>
>  Time Spent: 3h 40m
>  Remaining Estimate: 0h
>
> 1. Create a table
> create table if not exists lineitem3(L_SHIPDATE string,L_SHIPMODE 
> string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
> string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY   string,L_LINENUMBER 
> int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
> double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT  string) STORED BY 
> 'org.apache.carbondata.format' TBLPROPERTIES 
> ('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
> 2. Run load queries and create pre-agg table queries in diff console:
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');
> create datamap agr_lineitem3 ON TABLE lineitem3 USING 
> "org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
> L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem3 
> group by  L_RETURNFLAG, L_LINESTATUS;
> 3.  Check table content using aggregate query:
> select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
> lineitem3 group by l_returnflag, l_linestatus;
> 0: jdbc:hive2://10.18.98.34:23040> select 
> l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from lineitem3 
> group by l_returnflag, l_linestatus;
> +---+---+--+---+--+
> | l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
> +---+---+--+---+--+
> +---+---+--+---+--+
> No rows selected (1.258 seconds)
> HDFS data:
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3_agr_lineitem3/Fact/Part0/Segment_0
> BLR114307:/srv/spark2.2Bigdata/install/hadoop/datanode # bin/hadoop fs 
> -ls /carbonstore/default/lineitem3/Fact/Part0/Segment_0
> Found 27 items
> -rw-r--r--   2 root users  22148 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/1510740293106.carbonindexmerge
> -rw-r--r--   2 root users   58353052 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58351680 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58364823 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58356303 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-0-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58342246 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58353186 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-0_batchno1-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58352964 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-1_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58357183 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-1-2_batchno0-0-1510740300247.carbondata
> -rw-r--r--   2 root users   58345739 2017-11-15 18:05 
> /carbonstore/default/lineitem3/Fact/Part0/Segment_0/part-2-0_batchno0-0-1510740300247.carbondata
> Yarn job stages:
> 29
> load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
> lineitem3 
> options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUM

[jira] [Created] (CARBONDATA-2022) Query With table alias is not hitting pre aggregate table

2018-01-11 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2022:


 Summary: Query With table alias is not hitting pre aggregate table
 Key: CARBONDATA-2022
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2022
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: Babulal


**Problem:** Query with a table alias is not hitting the pre-aggregate table.
**Solution:** For an aliased table the query plan comes as 
SubqueryAlias(alias, SubqueryAlias(...)), and this case is not present in the 
transform of the query plan for pre-aggregate tables.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
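The missing plan-transform case described above can be illustrated with a stand-alone sketch. The `Relation` and `SubqueryAlias` classes below are simplified stand-ins for the real Catalyst/CarbonData plan nodes, not the actual API:

```python
from dataclasses import dataclass

# Simplified stand-ins for Catalyst's logical-plan nodes (illustrative only,
# not the real Spark/CarbonData classes).
@dataclass(frozen=True)
class Relation:
    table: str

@dataclass(frozen=True)
class SubqueryAlias:
    alias: str
    child: object

def unwrap(plan):
    """Peel off nested SubqueryAlias wrappers to reach the underlying
    relation, covering the SubqueryAlias(alias, SubqueryAlias(...)) shape
    produced by a table alias."""
    while isinstance(plan, SubqueryAlias):
        plan = plan.child
    return plan

# An aliased query wraps the relation twice; both layers must be unwrapped.
aliased = SubqueryAlias("t", SubqueryAlias("mainTable", Relation("mainTable")))
```

With this helper, `unwrap(aliased)` reaches `Relation("mainTable")`, which is the case the original transform rule failed to match.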


[jira] [Resolved] (CARBONDATA-1927) Support sub query on Pre Aggregate table

2018-01-15 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1927.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

Fixed as part of https://github.com/apache/carbondata/pull/1728

> Support sub query on Pre Aggregate table
> 
>
> Key: CARBONDATA-1927
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1927
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Fix For: 1.3.0
>
>
> Currently a sub-query does not hit the pre-aggregate table. This Jira is to 
> handle sub-queries on the pre-aggregate table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2029) Query with expression is giving wrong result

2018-01-15 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2029:


 Summary: Query with expression is giving wrong result 
 Key: CARBONDATA-2029
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2029
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Create Maintable:

CREATE TABLE mainTable(id int, name string, city string, age string) STORED BY 
'org.apache.carbondata.format'

 

Create datamap
create datamap agg1 on table mainTable using 'preaggregate' as select 
name,sum(id) from mainTable group by name

Load data 

Run query: select sum(id)+count(id) from maintable gives a wrong result.

Problem: When the query contains an expression, it does not check which 
aggregate functions are applied inside the expression before selecting the 
aggregate table.

Solution: While extracting the aggregate expressions from the query plan, if an 
expression is present, extract which aggregate functions are applied on each 
column and use them to select the aggregate table.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
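The fix described above, checking every aggregate function inside an expression such as sum(id)+count(id) rather than only the top-level node, can be sketched with a toy expression tree (all class names here are illustrative, not the Catalyst API):

```python
from dataclasses import dataclass

# Toy expression tree (illustrative names, not the Catalyst API).
@dataclass(frozen=True)
class Column:
    name: str

@dataclass(frozen=True)
class Sum:
    child: object

@dataclass(frozen=True)
class Count:
    child: object

@dataclass(frozen=True)
class Add:
    left: object
    right: object

def collect_aggregates(expr):
    """Collect every aggregate function inside an expression, so that
    aggregate-table selection can check all of them, not just the
    outermost node."""
    if isinstance(expr, (Sum, Count)):
        return [expr]
    if isinstance(expr, Add):
        return collect_aggregates(expr.left) + collect_aggregates(expr.right)
    return []

# sum(id) + count(id): both aggregates must be visible to table selection.
expr = Add(Sum(Column("id")), Count(Column("id")))
```

For the repro query, `collect_aggregates(expr)` yields both `Sum` and `Count`; a datamap defined only over `sum(id)` therefore cannot serve the query.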


[jira] [Created] (CARBONDATA-2034) Improve query performance

2018-01-16 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2034:


 Summary: Improve query performance 
 Key: CARBONDATA-2034
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2034
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


*Problem: Dictionary loading takes more time on the executor side when the 
number of nodes is high.*

*Solution: During query there is no need to load the dictionary for non-complex 
dimensions; the dictionary decoder will take care of loading and decoding the 
dictionary columns.*



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2042) Data Mismatch issue in case of Timeseries Year, Month and Day level table

2018-01-17 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2042:


 Summary: Data Mismatch issue in case of Timeseries Year, Month and 
Day level table
 Key: CARBONDATA-2042
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2042
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal
 Attachments: data_sort.csv

sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table 
mainTable")
sql("CREATE TABLE table_03 (imei string,age int,mac string,productdate 
timestamp,updatedate timestamp,gamePointId double,contractid double ) STORED BY 
'org.apache.carbondata.format'")
sql(s"LOAD DATA inpath '$resourcesPath/data_sort.csv' INTO table table_03 
options ('DELIMITER'=',', 
'QUOTECHAR'='','FILEHEADER'='imei,age,mac,productdate,updatedate,gamePointId,contractid')")
sql("create datamap ag1 on table table_03 using 'preaggregate' DMPROPERTIES ( 
'timeseries.eventtime'='productdate','timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')as
 select productdate,mac,sum(age) from table_03 group by productdate,mac")



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
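The `timeseries.hierarchy` property used in the DMPROPERTIES above is a comma-separated list of level=interval pairs. A small sketch of how such a value could be parsed (the parser below is illustrative; the real property handling lives inside CarbonData):

```python
def parse_hierarchy(value):
    """Parse a timeseries.hierarchy value such as
    'second=1,minute=1,hour=1,day=1,month=1,year=1' into
    (level, interval) pairs. Sketch of the property format only."""
    pairs = []
    for entry in value.split(","):
        level, interval = entry.split("=")
        pairs.append((level.strip(), int(interval)))
    return pairs

# The hierarchy from the repro above expands to six rollup levels,
# one pre-aggregate child table per level.
levels = parse_hierarchy("second=1,minute=1,hour=1,day=1,month=1,year=1")
```

Each parsed level corresponds to one rollup table (e.g. `table_03_ag1_hour`), which is where a year/month/day-level data mismatch would surface.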


[jira] [Created] (CARBONDATA-2045) Query from segment set is not effective when pre-aggregate table is present

2018-01-17 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2045:


 Summary: Query from segment set is not effective when 
pre-aggregate table is present
 Key: CARBONDATA-2045
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2045
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


1. Create a table
create table if not exists lineitem1(L_SHIPDATE string,L_SHIPMODE 
string,L_SHIPINSTRUCT string,L_RETURNFLAG string,L_RECEIPTDATE 
string,L_ORDERKEY string,L_PARTKEY string,L_SUPPKEY string,L_LINENUMBER 
int,L_QUANTITY double,L_EXTENDEDPRICE double,L_DISCOUNT double,L_TAX 
double,L_LINESTATUS string,L_COMMITDATE string,L_COMMENT string) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES 
('table_blocksize'='128','NO_INVERTED_INDEX'='L_SHIPDATE,L_SHIPMODE,L_SHIPINSTRUCT,L_RETURNFLAG,L_RECEIPTDATE,L_ORDERKEY,L_PARTKEY,L_SUPPKEY','sort_columns'='');
2. Run load :
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
lineitem1 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

3. create pre-agg table 
create datamap agr_lineitem3 ON TABLE lineitem3 USING 
"org.apache.carbondata.datamap.AggregateDataMapHandler" as select 
L_RETURNFLAG,L_LINESTATUS,sum(L_QUANTITY),sum(L_EXTENDEDPRICE) from lineitem3 
group by L_RETURNFLAG, L_LINESTATUS;

3. Check table content using aggregate query:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
lineitem3 group by l_returnflag, l_linestatus;

+---------------+---------------+------------------+-----------------------+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
+---------------+---------------+------------------+-----------------------+
| N             | F             | 4913382.0        | 7.369901176949993E9   |
| A             | F             | 1.88818373E8     | 2.8310705145736383E11 |
| N             | O             | 3.82400594E8     | 5.734650756707479E11  |
| R             | F             | 1.88960009E8     | 2.833523780876951E11  |
+---------------+---------------+------------------+-----------------------+
4 rows selected (1.568 seconds)

4. Load one more time:
load data inpath "hdfs://hacluster/user/test/lineitem.tbl.1" into table 
lineitem1 
options('DELIMITER'='|','FILEHEADER'='L_ORDERKEY,L_PARTKEY,L_SUPPKEY,L_LINENUMBER,L_QUANTITY,L_EXTENDEDPRICE,L_DISCOUNT,L_TAX,L_RETURNFLAG,L_LINESTATUS,L_SHIPDATE,L_COMMITDATE,L_RECEIPTDATE,L_SHIPINSTRUCT,L_SHIPMODE,L_COMMENT');

5. Check table content using aggregate query:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
lineitem3 group by l_returnflag, l_linestatus;

+---------------+---------------+------------------+-----------------------+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
+---------------+---------------+------------------+-----------------------+
| N             | F             | 9826764.0        | 1.4739802353899986E10 |
| A             | F             | 3.77636746E8     | 5.662141029147278E11  |
| N             | O             | 7.64801188E8     | 1.1469301513414958E12 |
| R             | F             | 3.77920018E8     | 5.667047561753901E11  |
+---------------+---------------+------------------+-----------------------+

6. Set query from segment 1:

0: jdbc:hive2://10.18.98.48:23040> set 
carbon.input.segments.test_db1.lilneitem1=1;
+---------------------------------------------+--------+
| key                                         | value  |
+---------------------------------------------+--------+
| carbon.input.segments.test_db1.lilneitem1   | 1      |
+---------------------------------------------+--------+

7. Check table content using aggregate query:
select l_returnflag,l_linestatus,sum(l_quantity),sum(l_extendedprice) from 
lineitem3 group by l_returnflag, l_linestatus;

*+Expected+*: It should return the values from segment 1 alone.
*+Actual:+* It returns values from both segments.
+---------------+---------------+------------------+-----------------------+
| l_returnflag  | l_linestatus  | sum(l_quantity)  | sum(l_extendedprice)  |
+---------------+---------------+------------------+-----------------------+
| N             | F             | 9826764.0        | 1.4739802353899986E10 |
| A             | F             | 3.77636746E8     | 5.662141029147278E11  |
| N             | O             | 7.64801188E8     | 1.1469301513414958E12 |
| R             | F             | 3.77920018E8     | 5.667047561753901E11  |
+---------------+---------------+------------------+-----------------------+



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
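The session above sets a per-table segment filter through a property of the form carbon.input.segments.&lt;db&gt;.&lt;table&gt;. A tiny sketch of building that key (illustrative helper, not CarbonData code) makes the naming convention explicit:

```python
def segment_filter_key(db, table):
    """Build the per-table segment-filter property key used in the
    session above (key format only; illustrative helper)."""
    return "carbon.input.segments.{}.{}".format(db, table)
```

Note that the key is matched per table name, so the typo in the session (`lilneitem1` for table `lineitem1`) would itself make the filter a no-op; the bug report is about the filter also not being propagated to the pre-aggregate table.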


[jira] [Updated] (CARBONDATA-2042) Data Mismatch issue in case of Timeseries Year, Month and Day level table

2018-01-17 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-2042:
-
Issue Type: Bug  (was: Improvement)

> Data Mismatch issue in case of Timeseries Year, Month and Day level table
> -
>
> Key: CARBONDATA-2042
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2042
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
> Attachments: data_sort.csv
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into table 
> mainTable")
> sql("CREATE TABLE table_03 (imei string,age int,mac string,productdate 
> timestamp,updatedate timestamp,gamePointId double,contractid double ) STORED 
> BY 'org.apache.carbondata.format'")
> sql(s"LOAD DATA inpath '$resourcesPath/data_sort.csv' INTO table table_03 
> options ('DELIMITER'=',', 
> 'QUOTECHAR'='','FILEHEADER'='imei,age,mac,productdate,updatedate,gamePointId,contractid')")
> sql("create datamap ag1 on table table_03 using 'preaggregate' DMPROPERTIES ( 
> 'timeseries.eventtime'='productdate','timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')as
>  select productdate,mac,sum(age) from table_03 group by productdate,mac")



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2031) Select column with is null for no_inverted_index column throws java.lang.ArrayIndexOutOfBoundsException

2018-01-17 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2031.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Select column with is null for no_inverted_index column throws 
> java.lang.ArrayIndexOutOfBoundsException
> ---
>
> Key: CARBONDATA-2031
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2031
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Minor
> Fix For: 1.3.0
>
> Attachments: dest.csv
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> steps:
> {color:#33}1) create table zerorows_part (c1 string,c2 int,c3 string,c5 
> string) STORED BY 'carbondata' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='C2','NO_INVERTED_INDEX'='C2'){color}
> {color:#33}2){color}{color:#33}LOAD DATA LOCAL INPATH 
> '$filepath/dest.csv' INTO table zerorows_part 
> OPTIONS('delimiter'=',','fileheader'='c1,c2,c3,c5'){color}
> {color:#33}3){color}{color:#33}select c2 from zerorows_part where c2 
> is null{color}
>  
> *Previous exception in task: java.util.concurrent.ExecutionException: 
> java.lang.ArrayIndexOutOfBoundsException: 0*
>     
> *org.apache.carbondata.core.scan.processor.AbstractDataBlockIterator.updateScanner(AbstractDataBlockIterator.java:136)*
>     
> *org.apache.carbondata.core.scan.processor.impl.DataBlockIteratorImpl.processNextBatch(DataBlockIteratorImpl.java:64)*
>     
> *org.apache.carbondata.core.scan.result.iterator.VectorDetailQueryResultIterator.processNextBatch(VectorDetailQueryResultIterator.java:46)*
>     
> *org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextBatch(VectorizedCarbonRecordReader.java:283)*
>     
> *org.apache.carbondata.spark.vectorreader.VectorizedCarbonRecordReader.nextKeyValue(VectorizedCarbonRecordReader.java:171)*
>     
> *org.apache.carbondata.spark.rdd.CarbonScanRDD$$anon$1.hasNext(CarbonScanRDD.scala:370)*
>     
> *org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.scan_nextBatch$(Unknown
>  Source)*
>     
> *org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown
>  Source)*
>     
> *org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)*
>     
> *org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)*
>     
> *org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:234)*
>     
> *org.apache.spark.sql.execution.SparkPlan$$anonfun$2.apply(SparkPlan.scala:228)*
>     
> *org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)*
>     
> *org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$25.apply(RDD.scala:827)*
>     *org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)*
>     *org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)*
>     *org.apache.spark.rdd.RDD.iterator(RDD.scala:287)*
>     *org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)*
>     *org.apache.spark.scheduler.Task.run(Task.scala:108)*
>     *org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)*
>     
> *java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*
>     
> *java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*
>     *java.lang.Thread.run(Thread.java:748)*
>     *at 
> org.apache.spark.TaskContextImpl.invokeListeners(TaskContextImpl.scala:138)*
>     *at 
> org.apache.spark.TaskContextImpl.markTaskCompleted(TaskContextImpl.scala:116)*
>     *at org.apache.spark.scheduler.Task.run(Task.scala:118)*
>     *at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)*
>     *at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)*
>     *at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)*
>     *at java.lang.Thread.run(Thread.java:748)*
>  
>  
> {color:#33}[^dest.csv]{color}
> {color:#33} {color}
>  
>  
>  
> **



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2028) Select Query failed with preagg having timeseries and normal agg table together

2018-01-17 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2028.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Select Query failed with preagg having timeseries and normal agg table 
> together
> ---
>
> Key: CARBONDATA-2028
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2028
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> sql("drop table if exists maintabletime")
>  sql("create table maintabletime(year int,month int,name string,salary 
> int,dob timestamp) stored by 'carbondata' 
> tblproperties('sort_scope'='Global_sort','table_blocksize'='23','sort_columns'='month,year,name')")
>  sql("insert into maintabletime select 10,11,'babu',12,'2014-01-01 00:00:00'")
>  sql("create datamap agg0 on table maintabletime using 'preaggregate' as 
> select dob,name from maintabletime group by dob,name")
>  sql("create datamap agg1 on table maintabletime using 'preaggregate' 
> DMPROPERTIES ('timeseries.eventTime'='dob', 
> 'timeseries.hierarchy'='hour=1,day=1,month=1,year=1') as select dob,name from 
> maintabletime group by dob,name")
>  val df = sql("select timeseries(dob,'year') from maintabletime group by 
> timeseries(dob,'year')")
>  
>  
> *Exception* 
> Exception in thread "main" org.apache.spark.sql.AnalysisException: Column 
> does not exists in Pre Aggregate table;
>  at 
> org.apache.spark.sql.hive.CarbonPreAggregateQueryRules.getChildAttributeReference(CarbonPreAggregateRules.scala:719)
>  at 
> org.apache.spark.sql.hive.CarbonPreAggregateQueryRules$$anonfun$19$$anonfun$4.applyOrElse(CarbonPreAggregateRules.scala:855)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2030) avg with Aggregate table for double data type is failed.

2018-01-17 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2030.
--
   Resolution: Fixed
 Assignee: Babulal
Fix Version/s: 1.3.0

> avg with Aggregate table for double data type is failed. 
> -
>
> Key: CARBONDATA-2030
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2030
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> spark.sql("drop table if exists y ")
> spark.sql("create table y(year int,month int,name string,salary double) 
> stored by 'carbondata' 
> tblproperties('sort_scope'='Global_sort','table_blocksize'='23','sort_columns'='month,year,name')")
> spark.sql("insert into y select 10,11,'babu',12.89")
> spark.sql("insert into y select 10,11,'babu',12.89")
> spark.sql("create datamap y1_sum1 on table y using 'preaggregate' as select 
> name,avg(salary) from y group by name")
> spark.sql("select name,avg(salary) from y group by name").show(false)
>  
>  
> Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot 
> resolve '(sum(y_y1_sum1.`y_salary_sum`) / sum(y_y1_sum1.`y_salary_count`))' 
> due to data type mismatch: differing types in '(sum(y_y1_sum1.`y_salary_sum`) 
> / sum(y_y1_sum1.`y_salary_count`))' (double and bigint).;;
> 'Aggregate [y_name#25], [y_name#25 AS name#41, (sum(y_salary_sum#26) / 
> sum(y_salary_count#27L)) AS avg(salary)#46]
> +- Relation[y_name#25,y_salary_sum#26,y_salary_count#27L] 
> CarbonDatasourceHadoopRelation [ Database name :default, Table name 
> :y_y1_sum1, Schema :Some(StructType(StructField(y_name,StringType,true), 
> StructField(y_salary_sum,DoubleType,true), 
> StructField(y_salary_count,LongType,true))) ]
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
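The failing rewrite above turns avg(salary) into sum(y_salary_sum) / sum(y_salary_count), dividing a double column by a bigint column. A minimal stand-alone sketch of the intended arithmetic, with the count explicitly widened to float before dividing (plain Python, not the Spark planner code):

```python
def avg_from_preagg(sum_parts, count_parts):
    """avg(salary) rewritten over pre-aggregate columns: total the
    per-segment sums and per-segment counts, widening the count
    before the division (the cast the failing plan was missing)."""
    return sum(sum_parts) / float(sum(count_parts))

# Two inserts of salary 12.89 (matching the repro above): one pre-aggregate
# segment per insert, each holding sum=12.89, count=1.
result = avg_from_preagg([12.89, 12.89], [1, 1])
```

The rewritten plan must perform the same widening, i.e. cast `y_salary_count` from LongType to DoubleType before the division, so the expression type-checks.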


[jira] [Resolved] (CARBONDATA-2036) Insert overwrite on static partition cannot work properly

2018-01-19 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2036.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Insert overwrite on static partition cannot work properly
> -
>
> Key: CARBONDATA-2036
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2036
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 5h 10m
>  Remaining Estimate: 0h
>
> When trying to insert overwrite on a static partition where an int partition 
> column value has a leading 0, there is an issue.
> Example: 
> create table test(d1 string) partitioned by (c1 int, c2 int, c3 int)
> And use insert overwrite table partition(01, 02, 03) select "s1"
>  
> The above case has a problem: 01 is not converted to the actual integer in 
> the partition map file.
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
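The fix implied above is to normalize the static partition value to its canonical integer form so that 01 and 1 address the same partition. A minimal sketch, assuming simple trim-and-parse normalization (illustrative, not the actual CarbonData conversion code):

```python
def normalize_int_partition(raw):
    """Normalize a static partition value for an int partition column so
    that '01', '1', and ' 1 ' all address the same partition directory."""
    return str(int(raw.strip()))
```

With this normalization, partition(01, 02, 03) and partition(1, 2, 3) resolve to the same entry in the partition map file, so the overwrite replaces the intended partition.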


[jira] [Commented] (CARBONDATA-2068) Drop datamap should work for timeseries

2018-01-23 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2068?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335758#comment-16335758
 ] 

kumar vishal commented on CARBONDATA-2068:
--

[~xubo245] can you please add your test case for the above scenario 

> Drop datamap should  work for timeseries 
> -
>
> Key: CARBONDATA-2068
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2068
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, spark-integration
>Affects Versions: 1.3.0
>Reporter: xubo245
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Drop datamap does not work after creating a timeseries datamap for a 
> preaggregate table, but it should work.
> refer:
> https://issues.apache.org/jira/browse/CARBONDATA-1516



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CARBONDATA-2085) It's different between load twice and create datamap with load again after load data and create datamap

2018-01-29 Thread kumar vishal (JIRA)

[ 
https://issues.apache.org/jira/browse/CARBONDATA-2085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16343347#comment-16343347
 ] 

kumar vishal commented on CARBONDATA-2085:
--

[~xubo245]

+*First Case:*+

*Create table*

*Load data*

*Create data map*

*Load data*

*In this case the data map will have two segments, so it will return 2 rows.*

*Second case:*

*Create table*

*Load data*

*Load data*

*Create data map*

*In this case the data map will have 1 segment: the data of the 2 main-table 
segments is aggregated and only 1 segment is created for the data map, so it 
returns 1 row because the complete data is aggregated.*

*Note: While creating a data map, if main-table data is already loaded then 
only one segment is created and the complete aggregated data is present in 
that one segment.*

*When data is loaded after creating the data map, a new segment is created for 
the data map; that segment contains the data of only that load.*

*To validate whether the result of the data map is correct, please run the 
query on the main table.*

> It's different between load twice and create datamap with load again after 
> load data and create datamap
> ---
>
> Key: CARBONDATA-2085
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2085
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, spark-integration
>Affects Versions: 1.3.0
>Reporter: xubo245
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> It's different between two test case
> test case 1: load twice and create datamap , and then query
> test case 2:load once , create datamap and load again, and then query
> {code:java}
> +  test("load data into mainTable after create timeseries datamap on table 
> 1") {
>  +sql("drop table if exists mainTable")
>  +sql(
>  +  """
>  +| CREATE TABLE mainTable(
>  +|   mytime timestamp,
>  +|   name string,
>  +|   age int)
>  +| STORED BY 'org.apache.carbondata.format'
>  +  """.stripMargin)
>  +
>  +sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into 
> table mainTable")
>  +
>  +sql(
>  +  """
>  +| create datamap agg0 on table mainTable
>  +| using 'preaggregate'
>  +| DMPROPERTIES (
>  +|   'timeseries.eventTime'='mytime',
>  +|   
> 'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')
>  +| as select mytime, sum(age)
>  +| from mainTable
>  +| group by mytime""".stripMargin)
>  +
>  +sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into 
> table mainTable")
>  +val df = sql(
>  +  """
>  +| select
>  +|   timeseries(mytime,'minute') as minuteLevel,
>  +|   sum(age) as sum
>  +| from mainTable
>  +| where timeseries(mytime,'minute')>='2016-02-23 01:01:00'
>  +| group by
>  +|   timeseries(mytime,'minute')
>  +| order by
>  +|   timeseries(mytime,'minute')
>  +  """.stripMargin)
>  +
>  +// only for test, it need remove before merge
>  +df.show()
>  +sql("select * from maintable_agg0_minute").show(100)
>  +
>  +checkAnswer(df,
>  +  Seq(Row(Timestamp.valueOf("2016-02-23 01:01:00"), 120),
>  +Row(Timestamp.valueOf("2016-02-23 01:02:00"), 280)))
>  +
>  +  }
>  +
>  +  test("load data into mainTable after create timeseries datamap on table 
> 2") {
>  +sql("drop table if exists mainTable")
>  +sql(
>  +  """
>  +| CREATE TABLE mainTable(
>  +|   mytime timestamp,
>  +|   name string,
>  +|   age int)
>  +| STORED BY 'org.apache.carbondata.format'
>  +  """.stripMargin)
>  +
>  +sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into 
> table mainTable")
>  +sql(s"LOAD DATA LOCAL INPATH '$resourcesPath/timeseriestest.csv' into 
> table mainTable")
>  +sql(
>  +  """
>  +| create datamap agg0 on table mainTable
>  +| using 'preaggregate'
>  +| DMPROPERTIES (
>  +|   'timeseries.eventTime'='mytime',
>  +|   
> 'timeseries.hierarchy'='second=1,minute=1,hour=1,day=1,month=1,year=1')
>  +| as select mytime, sum(age)
>  +| from mainTable
>  +| group by mytime""".stripMargin)
>  +
>  +
>  +val df = sql(
>  +  """
>  +| select
>  +|   timeseries(mytime,'minute') as minuteLevel,
>  +|   sum(age) as sum
>  +| from mainTable
>  +| where timeseries(mytime,'minute')>='2016-02-23 01:01:00'
>  +| group by
>  +|   timeseries(mytime,'minute')
>  +| order by
>  + 

[jira] [Created] (CARBONDATA-2101) Restrict Direct query on aggregation and timeseries data map

2018-01-30 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2101:


 Summary: Restrict Direct query on aggregation and timeseries data 
map
 Key: CARBONDATA-2101
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2101
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


Restrict direct query on timeseries and pre-aggregate data map

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-2101) Restrict Direct query on aggregation and timeseries data map

2018-01-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-2101:


Assignee: kumar vishal

> Restrict Direct query on aggregation and timeseries data map
> 
>
> Key: CARBONDATA-2101
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2101
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
>
> Restrict direct query on timeseries and pre-aggregate data map
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-1224) Going out of memory if more segments are compacted at once in V3 format

2018-01-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1224.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Going out of memory if more segments are compacted at once in V3 format
> ---
>
> Key: CARBONDATA-1224
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1224
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Ravindra Pesala
>Assignee: Ravindra Pesala
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 5h 50m
>  Remaining Estimate: 0h
>
> In V3 format we read the whole blocklet at once to memory in order save IO 
> time. But it turns out  to be costlier in case of parallel reading of more 
> carbondata files. 
> For example if we need to compact 50 segments then compactor need to open the 
> readers on all the 50 segments to do merge sort. But the memory consumption 
> is too high if each reader reads whole blocklet to the memory and there is 
> high chances of going out of memory.
> Solution:
> In this type of scenarios we can introduce new readers for V3 to read the 
> data page by page instead of reading whole blocklet at once to reduce the 
> memory footprint.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2107) Average query is failing when data map has both sum(column) and avg(column) of big int, int type

2018-01-31 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2107:


 Summary: Average query is failing when data map has both 
sum(column) and avg(column) of big int, int type
 Key: CARBONDATA-2107
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2107
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Average query is failing when data map has both sum(column) and avg(column) of 
big int, int type



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2108) Refactor Unsafe sort property

2018-01-31 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2108:


 Summary: Refactor Unsafe sort property 
 Key: CARBONDATA-2108
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2108
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


Deprecate the old property sort.inmemory.size.inmb and
add the new property carbon.sort.storage.inmemory.size.inmb. 
If the user has configured the old property (sort.inmemory.size.inmb), it will 
internally be converted to the new property. 
For example: if the user has configured sort.inmemory.size.inmb, then 20% of that 
memory will be used as working memory and the rest as storage memory.
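The described conversion can be sketched as below. The 20%/80% split is taken from the issue text; the function name and the working-memory property key are invented for illustration.

```python
def convert_legacy_sort_memory(legacy_inmb: int) -> dict:
    """Split the deprecated sort.inmemory.size.inmb value into
    working memory (20%) and storage memory (the rest)."""
    working = int(legacy_inmb * 0.20)
    storage = legacy_inmb - working
    return {
        "carbon.sort.working.inmemory.size.inmb": working,   # hypothetical key
        "carbon.sort.storage.inmemory.size.inmb": storage,
    }

# 1024 MB configured via the old property: 204 MB working, 820 MB storage.
print(convert_legacy_sort_memory(1024))
```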



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2094) Filter DataMap Tables in "Show Table Command"

2018-02-01 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2094.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Filter DataMap Tables in "Show Table Command"
> -
>
> Key: CARBONDATA-2094
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2094
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Priority: Major
> Fix For: 1.3.0
>
>  Time Spent: 4h 10m
>  Remaining Estimate: 0h
>
> Currently the SHOW TABLES command shows datamap tables (agg tables), but it 
> should not show aggregate tables. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2117) Fixed Synchronization issue while creating multiple carbon session

2018-02-01 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2117:


 Summary: Fixed Synchronization issue while creating multiple 
carbon session 
 Key: CARBONDATA-2117
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2117
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


+*Problem:*+ When creating multiple sessions (100), session initialisation fails 
with the below error:

java.lang.IllegalArgumentException: requirement failed: Config entry 
enable.unsafe.sort already registered!

+*Solution:*+ Currently in CarbonEnv we update the global (shared) configuration 
and location configuration inside a class-level synchronized block. With 
multiple sessions a class-level lock is not sufficient; a global-level lock is 
needed so that only one thread updates the global configuration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2082) Timeseries pre-aggregate table should support the blank space

2018-02-02 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2082.
--
Resolution: Fixed

> Timeseries pre-aggregate table should support the blank space
> -
>
> Key: CARBONDATA-2082
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2082
> Project: CarbonData
>  Issue Type: Bug
>  Components: core, spark-integration
>Affects Versions: 1.3.0
>Reporter: xubo245
>Assignee: xubo245
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>
> timeseries pre-aggregate table should support the blank space
> 1.scenario 1
> {code:java}
>test("test timeseries create table 35: support event_time and granularity 
> key with space") {
>   sql("DROP DATAMAP IF EXISTS agg1_month ON TABLE maintable")
>   sql(
> s"""CREATE DATAMAP agg1_month ON TABLE mainTable
>|USING '$timeSeries'
>|DMPROPERTIES (
>|   'event_time '=' dataTime',
>|   'MONTH_GRANULARITY '='1')
>|AS SELECT dataTime, SUM(age) FROM mainTable
>|GROUP BY dataTime
>   """.stripMargin)
>   checkExistence(sql("SHOW TABLES"), true, "maintable_agg1_month")
> }
> {code}
> problem: NPE
> {code:java}
>   java.lang.NullPointerException was thrown.
>   java.lang.NullPointerException
>   at 
> org.apache.spark.sql.execution.command.timeseries.TimeSeriesUtil$.validateTimeSeriesEventTime(TimeSeriesUtil.scala:50)
>   at 
> org.apache.spark.sql.execution.command.preaaggregate.CreatePreAggregateTableCommand.processMetadata(CreatePreAggregateTableCommand.scala:104)
>   at 
> org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:75)
>   at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:84)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
> {code}
> 2.scenario 2
> {code:java}
> test("test timeseries create table 36: support event_time and 
> granularity key with space") {
>   sql("DROP DATAMAP IF EXISTS agg1_month ON TABLE maintable")
>   sql(
> s"""CREATE DATAMAP agg1_month ON TABLE mainTable
>|USING '$timeSeries'
>|DMPROPERTIES (
>|   'event_time '='dataTime',
>|   'MONTH_GRANULARITY '=' 1')
>|AS SELECT dataTime, SUM(age) FROM mainTable
>|GROUP BY dataTime
>   """.stripMargin)
>   checkExistence(sql("SHOW TABLES"), true, 
> "maintable_agg1_month")
> }
>   
> {code}
> problem:
> {code:java}
>   Granularity only support 1
>   org.apache.carbondata.spark.exception.MalformedDataMapCommandException: 
> Granularity only support 1
>   at 
> org.apache.spark.sql.execution.command.timeseries.TimeSeriesUtil$.getTimeSeriesGranularityDetails(TimeSeriesUtil.scala:118)
>   at 
> org.apache.spark.sql.execution.command.datamap.CarbonCreateDataMapCommand.processMetadata(CarbonCreateDataMapCommand.scala:58)
>   at 
> org.apache.spark.sql.execution.command.AtomicRunnableCommand.run(package.scala:84)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
>   at 
> org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
>   at org.apache.spark.sql.Dataset.(Dataset.scala:183)
>   at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
>   at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:632)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2120) Fixed data mismatch for No dictionary numeric data type

2018-02-02 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2120:


 Summary: Fixed data mismatch for No dictionary numeric data type
 Key: CARBONDATA-2120
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2120
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal


*Problem:* The "is null" filter is failing for numeric data types (no-dictionary 
columns).

*Root cause:* The min/max calculation is wrong when the no-dictionary column is not 
the first column. 

As it is not the first column, null values can appear in between rows, but the 
min/max for null values was updated only when the first row was null.

*Solution:* Update the min/max in every case, whether the value is null or not, 
for all types.
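The fix can be illustrated with a small sketch: record nulls and update min/max on every row, rather than only handling null when it occurs in the first row. This is illustrative Python, not the actual statistics-collector code.

```python
def compute_min_max(values):
    """Compute (min, max, has_null) over a column page, updating the
    stats for every row rather than only when the first row is null."""
    lo = hi = None
    has_null = False
    for v in values:
        if v is None:
            has_null = True      # null seen anywhere, not just in row 0
            continue
        if lo is None or v < lo:
            lo = v
        if hi is None or v > hi:
            hi = v
    return lo, hi, has_null

# Null appears in the middle of the page, not in the first row.
print(compute_min_max([5, None, 2, 9]))  # (2, 9, True)
```

With correct min/max plus the null flag, an "is null" filter can prune or match pages reliably even when the column is not first.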



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-1918) Incorrect data is displayed when String is updated using Sentences

2018-02-02 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-1918.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Incorrect data is displayed when String is updated using Sentences
> --
>
> Key: CARBONDATA-1918
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1918
> Project: CarbonData
>  Issue Type: Bug
>Reporter: dhatchayani
>Assignee: dhatchayani
>Priority: Minor
> Fix For: 1.3.0
>
>  Time Spent: 4h 20m
>  Remaining Estimate: 0h
>
> update t_carbn01 set (active_status)= (sentences('Hello there! How are 
> you?'));
> +-+--+
> | Result  |
> +-+--+
> +-+--+
> No rows selected (2.784 seconds)
> select active_status from t_carbn01;
> +-+--+
> |  active_status  |
> +-+--+
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> *| Hello\:there\\$How\:are\:you\\  |*
> +-+--+
>  
> The issue for sentences function also occurs when the below update is 
> performed.
>   update t_carbn01 set (active_status)= (split('ab', 'a'));



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2105) Incorrect result displays after creating data map

2018-02-03 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2105.
--
   Resolution: Fixed
Fix Version/s: 1.3.0

> Incorrect result displays after creating data map
> -
>
> Key: CARBONDATA-2105
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2105
> Project: CarbonData
>  Issue Type: Bug
>  Components: data-query
>Affects Versions: 1.3.0
>Reporter: Vandana Yadav
>Assignee: anubhav tarar
>Priority: Major
> Fix For: 1.3.0
>
> Attachments: 2000_UniqData.csv
>
>  Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> Incorrect result displays after creating data map
> Steps to Reproduce:
> 1. Create a TAble:
> CREATE TABLE uniqdata(CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION 
> string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 
> bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 
> decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 
> int) STORED BY 'org.apache.carbondata.format' 
> TBLPROPERTIES('DICTIONARY_INCLUDE'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> 2. Load Data
> a) LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> b) LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> c) LOAD DATA INPATH 'HDFS_URL/BabuStore/Data/uniqdata/2000_UniqData.csv' into 
> table uniqdata OPTIONS('DELIMITER'=',', 
> 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1')
> Execute Query:
> a) select avg(cust_id) from uniqdata group by cust_id
> output: 
> | 9460.0 |
> | 9671.0 |
> | 10403.0 |
> | 10725.0 |
> | 10867.0 |
> +---+--+
> | avg(cust_id) |
> +---+--+
> | 9067.0 |
> | 9901.0 |
> +---+--+
> 2,002 rows selected (1.718 seconds)
> b) create data map:
> create datamap uniqdata_agg on table uniqdata using 'preaggregate' as select 
> avg(cust_id) from uniqdata group by cust_id;
> c) select avg(cust_id) from uniqdata group by cust_id;
> output:
> | NULL |
> | NULL |
> | NULL |
> +---+--+
> | avg(cust_id) |
> +---+--+
> | NULL |
> | NULL |
> +---+--+
> 2,002 rows selected (0.895 seconds)
> Expected result: it should display the same result as before creating the datamap.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2142) Fixed aggregate data map creation issue in case of hive metastore

2018-02-07 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2142:


 Summary: Fixed aggregate data map creation issue in case of hive 
metastore
 Key: CARBONDATA-2142
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2142
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Fixed aggregate data map creation issue in case of hive metastore



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2208) Pre aggregate datamap creation is failing when count(*) present in query

2018-02-27 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2208:


 Summary: Pre aggregate datamap creation is failing when count(*) 
present in query
 Key: CARBONDATA-2208
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2208
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


Pre-aggregate data map creation is failing with a parsing error for:

create datamap agg9 on table maintable using 'preaggregate' as select name, 
count(*) from maintable group by name



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2208) Pre aggregate datamap creation is failing when count(*) present in query

2018-02-27 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-2208:
-
Description: 
Pre-aggregate data map creation is failing with a parsing error for:

create datamap agg on table maintable using 'preaggregate' as select name, 
count(*) from maintable group by name

  was:
Pre aggregate data map creation is failing with parsing error 

create datamap agg9 on table maintable using 'preaggregate' as select name, 
count(*) from maintable group by name

 

 


> Pre aggregate datamap creation is failing when count(*) present in query
> 
>
> Key: CARBONDATA-2208
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2208
> Project: CarbonData
>  Issue Type: Bug
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> Pre-aggregate data map creation is failing with a parsing error for:
> create datamap agg on table maintable using 'preaggregate' as select name, 
> count(*) from maintable group by name
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2248) Removing parsers thread local objects after parsing of carbon query

2018-03-12 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2248:


 Summary: Removing parsers thread local objects after parsing of 
carbon query
 Key: CARBONDATA-2248
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2248
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


In some scenarios where many sessions are created, parser failure objects 
accumulate in memory inside thread locals.

Solution: Remove the parser object from the thread local after parsing of the query.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (CARBONDATA-1522) 6. Loading aggregation tables for streaming data tables.

2018-03-20 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-1522?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal reassigned CARBONDATA-1522:


Assignee: Kunal Kapoor

> 6. Loading aggregation tables for streaming data tables.
> 
>
> Key: CARBONDATA-1522
> URL: https://issues.apache.org/jira/browse/CARBONDATA-1522
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: Ravindra Pesala
>Assignee: Kunal Kapoor
>Priority: Major
>
>  We can finish the segment load after it receives a configurable amount of data 
> and create a new segment to load new streaming data. While finishing the 
> segment we can trigger the agg tables on it. So there would not be any agg 
> tables on the ongoing streaming segment, but querying can still be done on the 
> streaming segment of the actual table.
> For example, if the user configures the stream_segment size as 1 GB, then for 
> every 1 GB of stream data received, a new segment is created and the current 
> segment is finished. While finishing the current segment we can trigger agg 
> table loading and compaction of segments.
> While querying data we change the query plan to apply a union of the agg table 
> and the streaming segment of the actual table to get the current data.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2269) Support Query on Pre Aggregate on streaming table

2018-03-23 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2269:


 Summary: Support Query on Pre Aggregate on streaming table
 Key: CARBONDATA-2269
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2269
 Project: CarbonData
  Issue Type: Sub-task
Reporter: kumar vishal
Assignee: kumar vishal


For querying the data of a pre-aggregate table on a streaming table, change the 
query plan to apply a union of the agg table and the streaming segment of the 
actual table to get the current data.

For more detail, see the streaming ingest design document

Query Example for streaming table:

+User Query:+

SELECT name, sum(Salary) as totalSalary

FROM maintable

+Updated Query:+

SELECT name, sum(totalSalary) FROM(

 SELECT name, sum(Salary) as totalSalary

 FROM maintable

 GROUP BY name

 UNION ALL

 SELECT maintable_name,sum(maintable_salary) as totalSalary

 FROM maintable_agg

     GROUP BY maintable_name)

  GROUP BY name)

+User Query:+

SELECT name, AVG(Salary) as avgSalary

FROM maintable.

+Updated Query:+

SELECT name, Divide(sum(sumSalary)/sum(countsalary))

FROM(

    SELECT name, sum(Salary) as sumSalary,count(salary) countsalary

    FROM maintable

    GROUP BY name

    UNION ALL

    SELECT maintable_name,sum(maintable_salary) as sumSalary, 
count(maintable_salary) countsalary

   FROM maintable_agg

   GROUP BY maintable_name)

   GROUP BY name)

 +User Query:+

   SELECT name, count(Salary) as countSalary

    FROM maintable.

+Updated Query:+

   SELECT name, sum(countsalary)

   FROM(

    SELECT name, count(Salary) as countSalary

  FROM maintable

  GROUP BY name

    UNION ALL

    SELECT maintable_name,sum(maintable_count)

  FROM maintable_agg

  GROUP BY maintable_name)

   GROUP BY name)



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2269) Support Query on Pre Aggregate on streaming table

2018-03-23 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-2269:
-
Description: 
Support query on a pre-aggregate table created on a streaming table.
For querying the data of the pre-aggregate table on a streaming table, change 
the query plan to apply a union of the agg table and the streaming segment of 
the actual table to get the current data.
Query Example for streaming table:
**User Query:**
SELECT name, sum(Salary) as totalSalary
FROM maintable
**Updated Query:**
SELECT name, sum(totalSalary) FROM(
 SELECT name, sum(Salary) as totalSalary
 FROM maintable
 GROUP BY name
 UNION ALL
 SELECT maintable_name,sum(maintable_salary) as totalSalary
 FROM maintable_agg
 GROUP BY maintable_name)
GROUP BY name)

**User Query:**
SELECT name, AVG(Salary) as avgSalary
FROM maintable.
**Updated Query:**
SELECT name, Divide(sum(sumSalary)/sum(countsalary))
FROM(
 SELECT name, sum(Salary) as sumSalary,count(salary) countsalary
 FROM maintable
 GROUP BY name
 UNION ALL
 SELECT maintable_name,sum(maintable_salary) as sumSalary, 
count(maintable_salary) countsalary
 FROM maintable_agg
 GROUP BY maintable_name)
GROUP BY name)

**User Query:**
 SELECT name, count(Salary) as countSalary
 FROM maintable.
**Updated Query:**
SELECT name, sum(countsalary)
FROM(
 SELECT name, count(Salary) as countSalary
 FROM maintable
 GROUP BY name
 UNION ALL
 SELECT maintable_name,sum(maintable_count)
 FROM maintable_agg
 GROUP BY maintable_name)
GROUP BY name)

  was:
For querying the data on PreAggregate table on streaming table change the query 
plan to apply union of agg table and streaming segment of actual table to get 
the current data.

For more detail, see the streaming ingest design document

Query Example for streaming table:

+User Query:+

SELECT name, sum(Salary) as totalSalary

FROM maintable

+Updated Query:+

SELECT name, sum(totalSalary) FROM(

 SELECT name, sum(Salary) as totalSalary

 FROM maintable

 GROUP BY name

 UNION ALL

 SELECT maintable_name,sum(maintable_salary) as totalSalary

 FROM maintable_agg

     GROUP BY maintable_name)

  GROUP BY name)

+User Query:+

SELECT name, AVG(Salary) as avgSalary

FROM maintable.

+Updated Query:+

SELECT name, Divide(sum(sumSalary)/sum(countsalary))

FROM(

    SELECT name, sum(Salary) as sumSalary,count(salary) countsalary

    FROM maintable

    GROUP BY name

    UNION ALL

    SELECT maintable_name,sum(maintable_salary) as sumSalary, 
count(maintable_salary) countsalary

   FROM maintable_agg

   GROUP BY maintable_name)

   GROUP BY name)

 +User Query:+

   SELECT name, count(Salary) as countSalary

    FROM maintable.

+Updated Query:+

   SELECT name, sum(countsalary)

   FROM(

    SELECT name, count(Salary) as countSalary

  FROM maintable

  GROUP BY name

    UNION ALL

    SELECT maintable_name,sum(maintable_count)

  FROM maintable_agg

  GROUP BY maintable_name)

   GROUP BY name)


> Support Query on Pre Aggregate on streaming table
> -
>
> Key: CARBONDATA-2269
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2269
> Project: CarbonData
>  Issue Type: Sub-task
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
>
> Support query on a pre-aggregate table created on a streaming table.
> For querying the data of the pre-aggregate table on a streaming table, change 
> the query plan to apply a union of the agg table and the streaming segment of 
> the actual table to get the current data.
> Query Example for streaming table:
> **User Query:**
> SELECT name, sum(Salary) as totalSalary
> FROM maintable
> **Updated Query:**
> SELECT name, sum(totalSalary) FROM(
>  SELECT name, sum(Salary) as totalSalary
>  FROM maintable
>  GROUP BY name
>  UNION ALL
>  SELECT maintable_name,sum(maintable_salary) as totalSalary
>  FROM maintable_agg
>  GROUP BY maintable_name)
> GROUP BY name)
> **User Query:**
> SELECT name, AVG(Salary) as avgSalary
> FROM maintable.
> **Updated Query:**
> SELECT name, Divide(sum(sumSalary)/sum(countsalary))
> FROM(
>  SELECT name, sum(Salary) as sumSalary,count(salary) countsalary
>  FROM maintable
>  GROUP BY name
>  UNION ALL
>  SELECT maintable_name,sum(maintable_salary) as sumSalary, 
> count(maintable_salary) countsalary
>  FROM maintable_agg
>  GROUP BY maintable_name)
> GROUP BY name)
> **User Query:**
>  SELECT name, count(Salary) as countSalary
>  FROM maintable.
> **Updated Query:**
> SELECT name, sum(countsalary)
> FROM(
>  SELECT name, count(Salary) as countSalary
>  FROM maintable
>  GROUP BY name
>  UNION ALL
>  SELECT maintable_name,sum(maintable_count)
>  FROM maintable_agg
>  GROUP BY maintable_name)
> GROUP BY name)
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2312) Support In Memory catalog

2018-04-03 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2312:


 Summary: Support In Memory catalog
 Key: CARBONDATA-2312
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2312
 Project: CarbonData
  Issue Type: New Feature
Reporter: kumar vishal
Assignee: kumar vishal


Support storing the catalog in memory (not in Hive) for each session; after a 
session restart the user can create an external table and run select queries.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2325) Page level uncompress and Query performance improvement for Unsafe No Dictionary

2018-04-09 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2325:


 Summary: Page level uncompress and Query performance improvement 
for Unsafe No Dictionary
 Key: CARBONDATA-2325
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2325
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal


*Page Level Decoder for query*
Add page-level on-demand decoding. In the current code, all pages of a blocklet 
are uncompressed at once, so the memory footprint is too high and can cause 
OOM. Code is now added to support page-level decoding: one page is decoded at a 
time, and only when all its records are processed is the next page decoded. This 
improves query performance, for example for limit queries.
*Unsafe No Dictionary (Unsafe variable length)*
Optimized the getRow (for vector processing) and putArray methods.
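The limit-query benefit of on-demand decoding can be sketched as follows: pages are decoded lazily and the scan stops early, so trailing pages are never uncompressed. This is an illustrative Python sketch; `decode_page` stands in for the real uncompress step.

```python
def decode_page(page):
    """Stand-in for uncompressing one page on demand."""
    return list(page)

def scan_with_limit(pages, limit):
    """Decode pages one at a time and stop once `limit` rows are
    produced; returns (rows, number_of_pages_actually_decoded)."""
    out, decoded = [], 0
    for page in pages:
        rows = decode_page(page)
        decoded += 1
        for r in rows:
            out.append(r)
            if len(out) == limit:
                return out, decoded
    return out, decoded

pages = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
rows, pages_decoded = scan_with_limit(pages, 4)
print(rows, pages_decoded)  # [1, 2, 3, 4] 2  -- the third page is never decoded
```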



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2328) Fixed Table Alias Issue in Pre Aggregate

2018-04-10 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2328:


 Summary: Fixed Table Alias Issue in Pre Aggregate
 Key: CARBONDATA-2328
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2328
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal


*Issue:* A query with a table alias is not fetching data from the pre-aggregate 
table. 

*Problem:* When the table has an alias, all attribute references' qualifiers 
carry the alias name, but since the data map was created without the alias, the 
expression comparison fails.

*Solution:* While comparing, remove the qualifiers (aliases) and then compare the 
expressions.
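The comparison can be sketched as: strip the qualifier (table name or alias) from each attribute reference before matching the query expression against the data-map expression. This is illustrative Python; the (qualifier, column) tuples are invented for the example.

```python
def strip_qualifiers(expr):
    """Drop the qualifier from every (qualifier, column) attribute so
    that 't1.salary' and 'maintable.salary' compare equal."""
    return [column for _qualifier, column in expr]

# The query uses alias t1; the data map was created on maintable directly.
query_expr = [("t1", "name"), ("t1", "salary")]
datamap_expr = [("maintable", "name"), ("maintable", "salary")]

# Direct comparison fails, qualifier-free comparison succeeds.
print(query_expr == datamap_expr)                                      # False
print(strip_qualifiers(query_expr) == strip_qualifiers(datamap_expr))  # True
```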



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2346) Dropping partition failing with null error for Partition table with Pre-Aggregate tables

2018-04-16 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2346.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Dropping partition failing with null error for Partition table with 
> Pre-Aggregate tables
> 
>
> Key: CARBONDATA-2346
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2346
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Praveen M P
>Assignee: Praveen M P
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 2h 50m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2335) Autohandoff is failing when preaggregate is created on streaming table

2018-04-16 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2335.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Autohandoff is failing when preaggregate is created on streaming table
> --
>
> Key: CARBONDATA-2335
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2335
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Kunal Kapoor
>Assignee: Kunal Kapoor
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 3h 20m
>  Remaining Estimate: 0h
>
> Auto handoff is failing with a NullPointerException when a pre-aggregate 
> table is present on the streaming table.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2386) Query on Pre-Aggregate table is slower

2018-04-23 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2386:


 Summary: Query on Pre-Aggregate table is slower
 Key: CARBONDATA-2386
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2386
 Project: CarbonData
  Issue Type: Bug
Reporter: kumar vishal
Assignee: kumar vishal
 Fix For: 1.4.0


*Problem:* Queries on the pre-aggregate table are consuming too much time.
*Root cause:* Computing the sizes needed to select the smallest pre-aggregate 
table takes approximately 76 seconds, because the index file is read even when 
the segment file is present.
*Solution:* Read the table status and get the data file and index file sizes for 
valid segments. For older segments where datasize and indexsize are not present, 
calculate the size of the store folder.
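The selection logic can be sketched as follows (a toy model with illustrative field names, not CarbonData's actual tablestatus schema): sum recorded data and index sizes for valid segments, and fall back to a folder scan only for older segments that lack those fields.

```java
import java.util.List;

public class PreAggTableSize {
    // Simplified stand-in for a tablestatus entry; field names are
    // illustrative, not CarbonData's real metadata schema.
    static class SegmentEntry {
        final boolean valid;
        final long dataSize;   // -1 when not recorded (older segments)
        final long indexSize;  // -1 when not recorded
        SegmentEntry(boolean valid, long dataSize, long indexSize) {
            this.valid = valid; this.dataSize = dataSize; this.indexSize = indexSize;
        }
    }

    // Sum data + index sizes for valid segments from the table status,
    // falling back to a (stubbed) store-folder scan for older segments.
    static long tableSize(List<SegmentEntry> status, long folderScanSize) {
        long total = 0;
        for (SegmentEntry e : status) {
            if (!e.valid) continue;
            if (e.dataSize >= 0 && e.indexSize >= 0) {
                total += e.dataSize + e.indexSize; // fast path: no file IO
            } else {
                total += folderScanSize;           // older segment: scan folder
            }
        }
        return total;
    }

    public static void main(String[] args) {
        List<SegmentEntry> status = List.of(
                new SegmentEntry(true, 100, 10),
                new SegmentEntry(true, -1, -1),    // older segment, no sizes
                new SegmentEntry(false, 999, 99)); // invalid, skipped
        System.out.println(tableSize(status, 50)); // prints 160
    }
}
```

Reading sizes from table status avoids opening index files per candidate table, which is where the observed 76 seconds went.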



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2322) Data mismatch in Aggregate query after compaction on Pre-Agg table on Partition table

2018-04-23 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2322.
--
Resolution: Fixed

> Data mismatch in Aggregate query after compaction on Pre-Agg table on 
> Partition table
> -
>
> Key: CARBONDATA-2322
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2322
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Praveen M P
>Assignee: Praveen M P
>Priority: Major
>  Time Spent: 6h 40m
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (CARBONDATA-2407) Removed All Unused code

2018-04-26 Thread kumar vishal (JIRA)
kumar vishal created CARBONDATA-2407:


 Summary: Removed All Unused code 
 Key: CARBONDATA-2407
 URL: https://issues.apache.org/jira/browse/CARBONDATA-2407
 Project: CarbonData
  Issue Type: Improvement
Reporter: kumar vishal
Assignee: kumar vishal


After adding the datamap, the executor BTree is no longer used, as the driver 
loads the blocklet information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (CARBONDATA-2407) Removed All Unused Executor BTree code

2018-04-26 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal updated CARBONDATA-2407:
-
Summary: Removed All Unused Executor BTree code   (was: Removed All Unused 
code )

> Removed All Unused Executor BTree code 
> ---
>
> Key: CARBONDATA-2407
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2407
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: kumar vishal
>Assignee: kumar vishal
>Priority: Major
>
> After adding the datamap, the executor BTree is no longer used, as the driver 
> loads the blocklet information.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2381) Improve compaction performance by filling batch result in columnar format and performing IO at blocklet level

2018-04-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2381.
--
   Resolution: Fixed
Fix Version/s: 1.4.0

> Improve compaction performance by filling batch result in columnar format and 
> performing IO at blocklet level
> -
>
> Key: CARBONDATA-2381
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2381
> Project: CarbonData
>  Issue Type: Improvement
>Affects Versions: 1.3.1
>Reporter: Manish Gupta
>Assignee: Manish Gupta
>Priority: Major
> Fix For: 1.4.0
>
>  Time Spent: 9h 10m
>  Remaining Estimate: 0h
>
> Problem: Compaction performance is slow compared to data load. If the 
> compaction threshold is set to 6,6 then a minor compaction after 6 loads takes 
> almost 6-7 times the total load time of those 6 loads.
> Analysis:
>  # During compaction, result filling is done in row format, so as the number 
> of columns increases, the dimension and measure data filling time increases. 
> This happens because row-wise filling cannot take advantage of OS cacheable 
> buffers, since we continuously switch to reading data for the next column.
>  # Compaction uses a page-level reader flow in which both IO and 
> decompression are done at page level, which increases the IO and 
> decompression time in this model.
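As an illustration of why column-wise filling is cache-friendlier (a toy sketch, not CarbonData's actual result-filling code): filling one column at a time reads each source array sequentially, while row-wise filling jumps between column arrays on every value.

```java
public class FillOrder {
    // Row-wise fill: for each row, touch every column array once.
    // Access is strided across columns, defeating sequential read-ahead.
    static int[][] fillRowWise(int[][] columns, int rows) {
        int[][] out = new int[rows][columns.length];
        for (int r = 0; r < rows; r++)
            for (int c = 0; c < columns.length; c++)
                out[r][c] = columns[c][r];
        return out;
    }

    // Column-wise fill: copy each column in one sequential pass, so
    // reads stay within a single array and benefit from OS/CPU caching.
    static int[][] fillColumnWise(int[][] columns, int rows) {
        int[][] out = new int[columns.length][rows];
        for (int c = 0; c < columns.length; c++)
            System.arraycopy(columns[c], 0, out[c], 0, rows);
        return out;
    }

    public static void main(String[] args) {
        int[][] cols = {{1, 2, 3}, {10, 20, 30}};
        System.out.println(fillRowWise(cols, 3)[1][1]);    // prints 20
        System.out.println(fillColumnWise(cols, 3)[1][2]); // prints 30
    }
}
```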



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (CARBONDATA-2411) infinite loop when sdk writer throws Exception

2018-04-30 Thread kumar vishal (JIRA)

 [ 
https://issues.apache.org/jira/browse/CARBONDATA-2411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

kumar vishal resolved CARBONDATA-2411.
--
Resolution: Fixed

> infinite loop when sdk writer throws Exception
> --
>
> Key: CARBONDATA-2411
> URL: https://issues.apache.org/jira/browse/CARBONDATA-2411
> Project: CarbonData
>  Issue Type: Bug
>Reporter: Babulal
>Assignee: Babulal
>Priority: Minor
>  Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> When the SDK CSVWriter throws an error (cast-to-string exception), the 
> application is stuck and the data loading threads are left in a waiting state. 
> Use the code below to reproduce the issue:
>  
> val fields: Array[Field] = new Array[Field](2)
> fields(0) = new Field("stringField", DataTypes.STRING)
> fields(1) = new Field("intField", DataTypes.INT)
> val builder: CarbonWriterBuilder = CarbonWriter.builder.withSchema(new Schema(fields))
>   .isTransactionalTable(false).outputPath("D:/data/yyy").taskNo("5")
> val writer: CarbonWriter = builder.buildWriterForCSVInput
> writer.write(Array("xyz", 1))
> writer.close()
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


  1   2   3   >