Re: [New Feature] Range Filter Optimization

2017-03-21 Thread QiangCai
+1

Best Regards
David QiangCai



--
View this message in context: 
http://apache-carbondata-mailing-list-archive.1130556.n5.nabble.com/New-Feature-Range-Filter-Optimization-tp9343p9383.html
Sent from the Apache CarbonData Mailing List archive mailing list archive at 
Nabble.com.


[jira] [Created] (CARBONDATA-803) Incorrect results returned by not equal to filter on dictionary column with numeric data type

2017-03-21 Thread Manish Gupta (JIRA)
Manish Gupta created CARBONDATA-803:
---

     Summary: Incorrect results returned by not equal to filter on dictionary column with numeric data type
         Key: CARBONDATA-803
         URL: https://issues.apache.org/jira/browse/CARBONDATA-803
     Project: CarbonData
  Issue Type: Bug
    Reporter: Manish Gupta
    Assignee: Manish Gupta
     Fix For: 1.1.0-incubating


Whenever a not-equal-to filter is applied on a dictionary column with a numeric
data type, the cast added by the Spark plan is removed while the Carbon filters
are created from the Spark filter. Because of this plan modification, Spark
returns incorrect results.
Steps to reproduce the issue:
1. CREATE TABLE IF NOT EXISTS carbon(ID Int, date Timestamp, country String, 
name String, phonetype String, serialname String, salary Int) STORED BY 
'org.apache.carbondata.format' TBLPROPERTIES('dictionary_include'='id')
2. LOAD DATA LOCAL INPATH '$csvFilePath' into table carbon
3. select Id from carbon where id != '7'



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Created] (CARBONDATA-802) Select query is throwing exception if new dictionary column is added without any default value

2017-03-21 Thread Naresh P R (JIRA)
Naresh P R created CARBONDATA-802:
-

     Summary: Select query is throwing exception if new dictionary column is added without any default value
         Key: CARBONDATA-802
         URL: https://issues.apache.org/jira/browse/CARBONDATA-802
     Project: CarbonData
  Issue Type: Bug
    Reporter: Naresh P R
    Assignee: Naresh P R
    Priority: Minor


A select query throws an exception if a new dictionary column is added
without a default value.

e.g.
create table test(id int, name string) stored as 'carbondata'
alter table test add columns(country string) tblproperties('default.value.country'='india')  --> select query passes
alter table test add columns(state string)  --> select query fails






[jira] [Created] (CARBONDATA-801) [Documentation] Examples format to be fixed

2017-03-21 Thread Gururaj Shetty (JIRA)
Gururaj Shetty created CARBONDATA-801:
-

     Summary: [Documentation] Examples format to be fixed
         Key: CARBONDATA-801
         URL: https://issues.apache.org/jira/browse/CARBONDATA-801
     Project: CarbonData
  Issue Type: Bug
    Reporter: Gururaj Shetty
    Assignee: Gururaj Shetty
    Priority: Minor


Some examples provided in the DDL documentation are enclosed in “”, which might
not work in some scenarios. The “” in these examples need to be replaced with ‘’.





Re: [New Feature] Range Filter Optimization

2017-03-21 Thread Kumar Vishal
+1.

This will help range queries, as the filter will be applied only once and the
range of the binary search will shrink because we can do an incremental binary
search. Currently in Carbon, a range filter on a dictionary column is slow
because many filters may be applied, which degrades filter query performance.
The above optimisation will improve dictionary columns as well.
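
To illustrate what "incremental binary search" could mean here, below is a minimal sketch, not CarbonData code. It assumes the column values of a block are sorted and held in a plain Python list; `range_scan` is a hypothetical helper name. The second search starts where the first one ended, so it covers a smaller range.

```python
from bisect import bisect_left, bisect_right


def range_scan(sorted_values, low, high):
    """Return (start, end) so that sorted_values[start:end] holds every
    value v with low < v < high.

    Hypothetical helper for illustration only, not a CarbonData API.
    """
    # First search: first position whose value is strictly greater than low.
    start = bisect_right(sorted_values, low)
    # Second search is incremental: it only looks at positions >= start,
    # because everything before start is already known to be <= low.
    end = bisect_left(sorted_values, high, lo=start)
    return start, end


column = [1, 3, 5, 5, 6, 8, 9, 10, 12]
start, end = range_scan(column, 5, 10)
print(column[start:end])  # [6, 8, 9]
```

Both bounds are resolved in one pass over the block, instead of evaluating "greater than 5" and "less than 10" as two independent filters.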

-Regards
Kumar Vishal



On Tue, Mar 21, 2017 at 9:14 AM, sounak wrote:

> Hi,
>
> I am currently working on a RANGE filter optimization. Previously, if there
> were both Greater Than and Less Than predicates on the same column, they
> were evaluated separately, which involved scanning the blocks multiple
> times and is undoubtedly costly.
>
> To optimize this, if Greater Than and Less Than predicates are on the same
> column and joined by an AND expression, the two expressions can be merged
> into a single Range expression. Rather than evaluating the expressions
> separately, the block is scanned only once, and rows whose values lie
> within the Range Min and Max are selected.
>
> Currently the filter expression tree is transformed as shown below. This
> transformation helps us carry the Range MIN and MAX values so that the
> pruning optimization can be applied.
>
> //            AND                              AND
> //             |                                |
> //            / \                              / \
> //           /   \                            /   \
> //        Less   Greater       =>          TRUE   Range
> //         / \     / \                             / \
> //        /   \   /   \                           /   \
> //       a    10 a     5                       Less   greater
> //                                              / \     / \
> //                                             /   \   /   \
> //                                            a    10 a     5
>
> The presence of a Range in a single expression allows a few more
> optimizations during block pruning. By comparing the filter Min and Max
> with the block Min and Max values, a few decisions can be taken, as shown
> below.
>
> Case A)  In case the Filter Range lies completely outside of Block range =>
> Then no scanning is required and the block can be completely skipped.
>
>             ------------------
>             |     BLOCK      |
>             ------------------
>             Min            MAX
>
>  ----------                      ----------
>  | FILTER |         or           | FILTER |
>  ----------                      ----------
>  Min    Max                      Min    Max
>
> Case B ) The Filter Completely Covers the Block => No scanning is required.
> Select all Rows of the Block. i.e. turn all bits of block to true.
>
>        ------------------
>        |     BLOCK      |
>        ------------------
>
>   ----------------------------
>   |          FILTER          |
>   ----------------------------
>
> Case C) The Filter overlaps the Block but does not completely cover it. =>
> Choose the Block scan Start or End based on the filter overlap; there is no
> need to do a binary search to find the Start and End Index of the Block.
>
>
>      ------------------              ------------------
>      |     BLOCK      |      OR      |     BLOCK      |
>      ------------------              ------------------
>
>  ------------                                  ------------
>  |  FILTER  |                                  |  FILTER  |
>  ------------                                  ------------
>
>  Start of Block will be default StartIndex Value
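
The merge and the three pruning cases described in the quoted mail can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual CarbonData implementation: `Range`, `merge_predicates`, `Decision` and `prune` are hypothetical names, the filter is treated as an open interval (`low < a < high`), and the block statistics are treated as inclusive min and max values.

```python
from dataclasses import dataclass
from enum import Enum


@dataclass(frozen=True)
class Range:
    """Open interval (low, high), i.e. low < a < high. Hypothetical type."""
    low: int
    high: int


def merge_predicates(greater_than, less_than):
    """Merge `a > greater_than AND a < less_than` into a single Range."""
    return Range(low=greater_than, high=less_than)


class Decision(Enum):
    SKIP_BLOCK = "Case A: filter lies outside the block, skip it"
    SELECT_ALL = "Case B: filter covers the block, select every row"
    SCAN_PART = "Case C: partial overlap, scan part of the block"


def prune(filter_range, block_min, block_max):
    """Classify a block from its inclusive [block_min, block_max] statistics."""
    # Case A: no value in the block can satisfy low < a < high.
    if filter_range.high <= block_min or filter_range.low >= block_max:
        return Decision.SKIP_BLOCK
    # Case B: every value in the block satisfies the filter.
    if filter_range.low < block_min and filter_range.high > block_max:
        return Decision.SELECT_ALL
    # Case C: some rows may match, so part of the block has to be scanned.
    return Decision.SCAN_PART


merged = merge_predicates(greater_than=5, less_than=10)
for block in [(10, 20), (6, 9), (1, 7)]:
    print(block, prune(merged, *block).name)
# (10, 20) SKIP_BLOCK
# (6, 9) SELECT_ALL
# (1, 7) SCAN_PART
```

In this sketch a filter that sits entirely inside the block also falls under the last branch, since the block still has to be scanned in that situation.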

[GitHub] incubator-carbondata-site pull request #21: Updated the Meetup Page

2017-03-21 Thread PallaviSingh1992
Github user PallaviSingh1992 closed the pull request at:

https://github.com/apache/incubator-carbondata-site/pull/21


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] incubator-carbondata-site pull request #22: Added BlackDuck Award slide in t...

2017-03-21 Thread PallaviSingh1992
Github user PallaviSingh1992 closed the pull request at:

https://github.com/apache/incubator-carbondata-site/pull/22

