Re: [New Feature] Range Filter Optimization
+1

Best Regards
David QiangCai
[jira] [Created] (CARBONDATA-803) Incorrect results returned by not equal to filter on dictionary column with numeric data type
Manish Gupta created CARBONDATA-803:
---
Summary: Incorrect results returned by not equal to filter on dictionary column with numeric data type
Key: CARBONDATA-803
URL: https://issues.apache.org/jira/browse/CARBONDATA-803
Project: CarbonData
Issue Type: Bug
Reporter: Manish Gupta
Assignee: Manish Gupta
Fix For: 1.1.0-incubating

Whenever a not-equal-to filter is applied on a dictionary column with a numeric data type, the cast added by the Spark plan is removed while creating Carbon filters from the Spark filter. Because of this plan modification, Spark returns incorrect results.

Steps to reproduce the issue:
1. CREATE TABLE IF NOT EXISTS carbon(ID Int, date Timestamp, country String, name String, phonetype String, serialname String, salary Int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES('dictionary_include'='id')
2. LOAD DATA LOCAL INPATH '$csvFilePath' into table carbon
3. select Id from test_not_equal_to_carbon where id != '7'

-- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-802) Select query is throwing exception if new dictionary column is added without any default value
Naresh P R created CARBONDATA-802:
---
Summary: Select query is throwing exception if new dictionary column is added without any default value
Key: CARBONDATA-802
URL: https://issues.apache.org/jira/browse/CARBONDATA-802
Project: CarbonData
Issue Type: Bug
Reporter: Naresh P R
Assignee: Naresh P R
Priority: Minor

A select query throws an exception if a new dictionary column is added without any default value, e.g.:

create table test(id int, name string) stored as 'carbondata'
alter table test add columns(country string) tblproperties('default.value.country'='india') --> select query passes
alter table test add columns(state string) --> select query fails
[jira] [Created] (CARBONDATA-801) [Documentation] Examples format to be fixed
Gururaj Shetty created CARBONDATA-801:
---
Summary: [Documentation] Examples format to be fixed
Key: CARBONDATA-801
URL: https://issues.apache.org/jira/browse/CARBONDATA-801
Project: CarbonData
Issue Type: Bug
Reporter: Gururaj Shetty
Assignee: Gururaj Shetty
Priority: Minor

Some examples provided in the DDL documentation are enclosed in “”, which might not work in some scenarios. The “” in the examples needs to be replaced with ‘’.
Re: [New Feature] Range Filter Optimization
+1. This will help range queries, because the filter is applied only once and the binary-search range shrinks, since we can do an incremental binary search. Currently in Carbon, range filters on dictionary columns are slow because many filters may be applied, which degrades filter query performance; the above optimisation will improve dictionary columns as well.

-Regards
Kumar Vishal

On Tue, Mar 21, 2017 at 9:14 AM, sounak wrote:
> Hi,
>
> I am currently working on a RANGE filter optimization. Previously, when
> both Greater Than and Less Than predicates were present on the same column,
> they were evaluated separately, which involved scanning the blocks multiple
> times and is undoubtedly costly.
>
> To optimize this, if Greater Than and Less Than predicates are on the same
> column and joined by an AND expression, the two expressions can be merged
> into a single Range expression. Rather than evaluating the expressions
> separately, the block is scanned only once, and a value is selected if it
> lies within the Range Min and Max.
>
> Currently the filter expression tree is transformed as shown below. The
> transformation lets us carry the Range MIN and MAX values so that the
> pruning optimization can be applied.
>
> //          AND                         AND
> //           |                           |
> //          / \                         / \
> //         /   \                       /   \
> //      Less   Greater      =>      TRUE   Range
> //      / \     / \                         / \
> //     /   \   /   \                       /   \
> //    a    10 a     5                  Less   greater
> //                                      /\      /\
> //                                     /  \    /  \
> //                                    a   10  a    5
>
> The presence of a Range in a single expression allows a few more
> optimizations during block pruning. By comparing the filter Min and Max
> with the block Min and Max, a few decisions can be taken, as shown below.
>
> Case A) The filter range lies completely outside the block range => no
> scanning is required and the block can be skipped completely.
>
>                  -----------------
>                  |     BLOCK     |
>                  -----------------
>                  Min           Max
>
>  --------------                       --------------
>  |   FILTER   |          or           |   FILTER   |
>  --------------                       --------------
>  Min        Max                       Min        Max
>
> Case B) The filter completely covers the block => no scanning is required.
> Select all rows of the block, i.e. turn all bits of the block to true.
>
>          ---------------
>          |    BLOCK    |
>          ---------------
>     -------------------------
>     |        FILTER         |
>     -------------------------
>
> Case C) The filter overlaps the block but does not completely cover it =>
> choose the block scan start or end based on the filter overlap; there is no
> need to do a binary search to find the start and end index of the block.
>
>      ---------------                    ---------------
>      |    BLOCK    |                    |    BLOCK    |
>      ---------------                    ---------------
>                            OR
>  -------------                                  -------------
>  |  FILTER   |                                  |  FILTER   |
>  -------------                                  -------------
>
> Start of Block will be default StartIndex Value
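
The merge step and the three pruning cases described in the quoted proposal can be sketched roughly as follows. This is only an illustrative Python sketch, not CarbonData's implementation: the names `Range`, `Decision`, `merge_predicates` and `classify_block` are hypothetical, values are assumed to be plain integers, and both bounds are assumed to be strict (`>` and `<`).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class Decision(Enum):
    SKIP_BLOCK = "skip"         # Case A: filter range lies outside the block
    SELECT_ALL = "select_all"   # Case B: filter range covers the whole block
    PARTIAL_SCAN = "partial"    # Case C: filter range overlaps part of the block


@dataclass(frozen=True)
class Range:
    """Hypothetical merged expression for `col > low AND col < high`."""
    column: str
    low: int    # exclusive lower bound
    high: int   # exclusive upper bound


Predicate = Tuple[str, str, int]  # (column, operator, literal), e.g. ("a", ">", 5)


def merge_predicates(greater: Predicate, less: Predicate) -> Optional[Range]:
    """Merge a Greater Than and a Less Than predicate joined by AND.

    Returns None when the pair does not qualify (different columns or
    unexpected operators), in which case the caller would keep the
    original two expressions.
    """
    g_col, g_op, g_val = greater
    l_col, l_op, l_val = less
    if g_op != ">" or l_op != "<" or g_col != l_col:
        return None
    return Range(g_col, g_val, l_val)


def classify_block(rng: Range, block_min: int, block_max: int) -> Decision:
    """Compare the filter range with a block's min/max values."""
    # Case A: no value in [block_min, block_max] can satisfy low < v < high.
    if rng.high <= block_min or rng.low >= block_max:
        return Decision.SKIP_BLOCK
    # Case B: every value in the block satisfies the range.
    if rng.low < block_min and rng.high > block_max:
        return Decision.SELECT_ALL
    # Case C: only part of the block can match, so scan from one end.
    return Decision.PARTIAL_SCAN


rng = merge_predicates(("a", ">", 5), ("a", "<", 10))
```

With `rng` built from `a > 5 AND a < 10`, a block with min/max 20/30 would be skipped, a block with min/max 6/9 would be selected entirely, and a block with min/max 8/15 would need a partial scan.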
[GitHub] incubator-carbondata-site pull request #21: Updated the Meetup Page
Github user PallaviSingh1992 closed the pull request at: https://github.com/apache/incubator-carbondata-site/pull/21 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata-site pull request #22: Added BlackDuck Award slide in t...
Github user PallaviSingh1992 closed the pull request at: https://github.com/apache/incubator-carbondata-site/pull/22