[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/916 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2094/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/916 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2095/
[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/916 LGTM
[GitHub] carbondata pull request #916: [CARBONDATA-938] Prune partitions for filter q...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/916
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117429668 --- Diff: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java --- @@ -66,6 +66,26 @@ public static int compare(byte[] buffer1, byte[] buffer2) { .compareTo(buffer1, offset1, len1, buffer2, offset2, len2); } + /** + * Compare method for bytes + * + * @param buffer1 + * @param buffer2 + * @return + */ + public static int compareOne(byte[] buffer1, byte[] buffer2) { --- End diff -- Remove this unused method
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117430177 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java --- @@ -1366,8 +1350,9 @@ public Codec encodeAndCompressMeasures(TablePage tablePage) { } catch (InterruptedException e) { LOGGER.error(e, e.getMessage()); } - IndexStorage[] dimColumns = new IndexStorage[ - colGrpModel.getNoOfColumnStore() + noDictionaryCount + getExpandedComplexColsCount()]; + IndexStorage[] dimColumns = --- End diff -- Some of these changes are showing up only because of formatting; please revert them.
[jira] [Resolved] (CARBONDATA-938) 4. Detail filter query on partition column
[ https://issues.apache.org/jira/browse/CARBONDATA-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-938. - Resolution: Fixed Fix Version/s: 1.2.0 > 4. Detail filter query on partition column > --- > > Key: CARBONDATA-938 > URL: https://issues.apache.org/jira/browse/CARBONDATA-938 > Project: CarbonData > Issue Type: Sub-task > Components: core, data-load, data-query >Reporter: QiangCai >Assignee: QiangCai > Fix For: 1.2.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > Use filters (equal, range, in, etc.) to get the partition id list, then use > this partition id list to filter the BTree. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
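The pruning idea in the issue description — evaluate the filter once against partition metadata to get a partition id list, then scan only those partitions' BTrees — can be sketched as follows. This is an illustrative sketch only, not CarbonData's implementation; the `PartitionPruner` class, its `prune` helper, and the single-value-per-partition model are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of partition pruning: evaluate the filter against each
// partition's bound value and keep only the ids of partitions that can match.
public class PartitionPruner {

    // partitionValues.get(i) is the value bound to partition id i (assumption
    // for this sketch: one value per partition).
    public static List<Integer> prune(List<String> partitionValues,
                                      Predicate<String> filter) {
        List<Integer> matchedIds = new ArrayList<>();
        for (int id = 0; id < partitionValues.size(); id++) {
            if (filter.test(partitionValues.get(id))) {
                matchedIds.add(id);  // this partition may contain matching rows
            }
        }
        return matchedIds;           // only these partitions' BTrees need scanning
    }

    public static void main(String[] args) {
        List<String> parts = List.of("2015", "2016", "2017", "2018");
        // Equality filter: only the partition holding "2017" survives pruning.
        System.out.println(PartitionPruner.prune(parts, v -> v.equals("2017")));
    }
}
```

The same shape works for range and in-list filters: only the `Predicate` changes, while the id-collection loop stays identical.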
[GitHub] carbondata issue #745: [CARBONDATA-876] Clear segment access count ASAP
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/745 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2096/
[GitHub] carbondata pull request #928: sync
GitHub user xuchuanyin opened a pull request: https://github.com/apache/carbondata/pull/928 sync Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[CARBONDATA-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - What manual testing you have done? - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/xuchuanyin/carbondata master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/928.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #928 commit ce1c79b9540fca198dd82c9e21d45fceea79be5a Author: RedCactus Date: 2017-05-11T12:29:53Z Fix minor mistakes in documents Fix minor mistakes in documents commit 85b1abee6543ea5bec01c7f5a7d37e28ad3ed675 Author: RedCactus Date: 2017-05-11T12:33:10Z Merge pull request #1 from xuchuanyin/ddl_doc Fix minor mistakes in documents commit 82edf26e6c15d3514ee785e20359a5ea9cecd59e Author: RedCactus Date: 2017-05-11T12:41:54Z Fix word misspelling commit b7b26894377c52ea525b99d847279dc35afeb77d Author: RedCactus Date: 2017-05-11T12:43:48Z Remove redundant word commit 508c48fdd37a13abbc8b18d9e170672fc8d6b767 Author: RedCactus Date: 
2017-05-11T12:45:43Z Fix incorrect word commit 2e70fe39692150e0aeebfd601d56e86598e5f213 Author: RedCactus Date: 2017-05-11T12:51:41Z Optimize property description Gender information like 'HE' should not appear commit bb79ed52e648ba005e069000f3a4a6ac284f8be2 Author: RedCactus Date: 2017-05-11T12:55:54Z Merge pull request #2 from xuchuanyin/introduction_doc Fix word misspelling commit 3edc7ead00436af3c5b03434db9c85bc440e16e6 Author: RedCactus Date: 2017-05-11T12:56:04Z Merge pull request #3 from xuchuanyin/faq_doc Remove redundant word commit 0ee5598c8e4c1b8eefb67f9e4d6ce5b19e9a4efe Author: RedCactus Date: 2017-05-11T12:56:14Z Merge pull request #4 from xuchuanyin/trouble_doc Fix incorrect word commit 929ddbc2526540bd4f6b5ad1876363ca8a72cbaa Author: RedCactus Date: 2017-05-11T12:56:21Z Merge pull request #5 from xuchuanyin/datamgt_doc Optimize property description
[GitHub] carbondata pull request #928: sync
Github user xuchuanyin closed the pull request at: https://github.com/apache/carbondata/pull/928
[jira] [Commented] (CARBONDATA-1030) Support reading specified segment or carbondata file
[ https://issues.apache.org/jira/browse/CARBONDATA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017107#comment-16017107 ] Weizhong commented on CARBONDATA-1030: -- We can add mapreduce.input.carboninputformat.segmentnumbers in mapred-site.xml, so that we can query from the specified segments {noformat} mapreduce.input.carboninputformat.segmentnumbers 0,1 {noformat} > Support reading specified segment or carbondata file > > > Key: CARBONDATA-1030 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1030 > Project: CarbonData > Issue Type: Improvement >Reporter: Jin Zhou >Priority: Minor > > We can currently query the whole table in SQL, but reading specified segments > or data files is useful in some scenarios such as incremental data processing.
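For illustration, here is a hedged sketch of how a comma-separated property value such as "0,1" could be parsed into the set of segments a query is allowed to read. Only the property name comes from the comment above; the `SegmentSelector` class and its `parseSegments` helper are hypothetical names invented for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: turn the comma-separated segment-number property value
// into the set of segment ids a query should be restricted to.
public class SegmentSelector {

    // Property name proposed in the JIRA comment above.
    public static final String SEGMENT_PROPERTY =
        "mapreduce.input.carboninputformat.segmentnumbers";

    public static Set<String> parseSegments(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return Set.of();  // no property set -> no restriction configured
        }
        return Arrays.stream(propertyValue.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        // "0,1" restricts the query to segments 0 and 1 only.
        System.out.println(SegmentSelector.parseSegments("0,1"));
    }
}
```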
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2097/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2098/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2099/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2100/
[GitHub] carbondata pull request #929: [CARBONDATA-1070]Not In Filter Expression Null...
GitHub user sounakr opened a pull request: https://github.com/apache/carbondata/pull/929 [CARBONDATA-1070] Not In Filter Expression Null Value Handling Problem: filter test case failures. a) NullPointer handling in the Not expression. b) LessThan filter expression: wrong calculation of the StartKey when setting the bits. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sounakr/incubator-carbondata filter_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/929.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #929
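For context, here is a minimal sketch of null-safe NOT IN evaluation under SQL three-valued logic — the kind of guard a fix like this needs, where a NULL row value or a NULL in the value list must be handled explicitly instead of being dereferenced. This illustrates the semantics only and is not the PR's actual code; class and method names are invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Null-safe NOT IN under SQL three-valued logic: if the row value is NULL the
// result is UNKNOWN, and if the list contains NULL the result can never be
// TRUE (only FALSE or UNKNOWN). UNKNOWN filters the row out, like FALSE.
public class NotInEvaluator {

    // Returns true only when the row value definitely does not match the list.
    public static boolean notIn(Object rowValue, List<Object> values) {
        if (rowValue == null) {
            return false;            // NULL row value -> UNKNOWN -> filtered out
        }
        boolean listHasNull = false;
        for (Object v : values) {
            if (v == null) {
                listHasNull = true;  // guard instead of dereferencing null
            } else if (Objects.equals(rowValue, v)) {
                return false;        // definite match -> NOT IN is false
            }
        }
        return !listHasNull;         // NULL in list -> UNKNOWN -> filtered out
    }

    public static void main(String[] args) {
        System.out.println(NotInEvaluator.notIn("a", Arrays.asList("b", null)));
        System.out.println(NotInEvaluator.notIn("a", Arrays.asList("b", "c")));
    }
}
```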
[jira] [Created] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
sounak chakraborty created CARBONDATA-1070: -- Summary: Not In Filter Expression throwing NullPointer Exception Key: CARBONDATA-1070 URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 Project: CarbonData Issue Type: Bug Components: core Affects Versions: 1.2.0 Reporter: sounak chakraborty Assignee: sounak chakraborty Not In Filter Expression throwing NullPointer Exception
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user rahulforallp commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117447500 --- Diff: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java --- @@ -66,6 +66,26 @@ public static int compare(byte[] buffer1, byte[] buffer2) { .compareTo(buffer1, offset1, len1, buffer2, offset2, len2); } + /** + * Compare method for bytes + * + * @param buffer1 + * @param buffer2 + * @return + */ + public static int compareOne(byte[] buffer1, byte[] buffer2) { --- End diff -- @kumarvishal09 the unused function has been removed
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2101/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2102/
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2103/
[jira] [Created] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
SWATI RAO created CARBONDATA-1071: - Summary: test cases of TestSortColumns class will never fails Key: CARBONDATA-1071 URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.1.0 Environment: test Reporter: SWATI RAO Fix For: 1.1.0 test("create table with direct-dictioanry sort_columns") { sql("CREATE TABLE sorttable3 (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,salary int) STORED BY 'org.apache.carbondata.format' ") sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE sorttable3 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""") sql("select doj from sorttable3").show() sql("select doj from sorttable3 order by doj").show() checkAnswer(sql("select doj from sorttable3"), sql("select doj from sorttable3 order by doj")) } result: ++ | doj| ++ |2010-12-29 00:00:...| |2007-01-17 00:00:...| |2011-11-09 00:00:...| |2015-12-01 00:00:...| |2013-09-22 00:00:...| |2008-05-29 00:00:...| |2009-07-07 00:00:...| |2012-10-14 00:00:...| |2015-05-12 00:00:...| |2014-08-15 00:00:...| ++ | doj| ++ |2007-01-17 00:00:...| |2008-05-29 00:00:...| |2009-07-07 00:00:...| |2010-12-29 00:00:...| |2011-11-09 00:00:...| |2012-10-14 00:00:...| |2013-09-22 00:00:...| |2014-08-15 00:00:...| |2015-05-12 00:00:...| |2015-12-01 00:00:...| ++ The test case passed, but it should fail: checkAnswer(sql("select doj from sorttable3"), sql("select doj from sorttable3 order by doj")) only validates the data, not the order of the data, which is the real purpose of sort columns. The test cases must be modified so that the sort_columns functionality is actually verified.
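The order-insensitivity the reporter describes can be demonstrated with a minimal sketch: a multiset-style comparison (what `checkAnswer` effectively performs, per the report) passes on unsorted output, while an order-sensitive comparison catches a broken sort. Class and method names here are illustrative, not Spark's or CarbonData's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demonstrates why an order-insensitive row comparison cannot detect a
// sort_columns regression: unsorted and sorted outputs compare equal.
public class OrderCheckDemo {

    // Order-insensitive: compares the rows as multisets (sorted copies).
    public static boolean sameRows(List<String> a, List<String> b) {
        List<String> x = new ArrayList<>(a);
        List<String> y = new ArrayList<>(b);
        Collections.sort(x);
        Collections.sort(y);
        return x.equals(y);
    }

    // Order-sensitive: what a sort_columns test actually needs to assert.
    public static boolean sameRowsInOrder(List<String> a, List<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> unsorted = List.of("2010-12-29", "2007-01-17", "2011-11-09");
        List<String> sorted   = List.of("2007-01-17", "2010-12-29", "2011-11-09");
        System.out.println(sameRows(unsorted, sorted));        // passes anyway
        System.out.println(sameRowsInOrder(unsorted, sorted)); // catches the bug
    }
}
```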
[jira] [Assigned] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
[ https://issues.apache.org/jira/browse/CARBONDATA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anubhav tarar reassigned CARBONDATA-1071: - Assignee: anubhav tarar > test cases of TestSortColumns class will never fails > - > > Key: CARBONDATA-1071 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.1.0 > Environment: test >Reporter: SWATI RAO >Assignee: anubhav tarar > Fix For: 1.1.0
[jira] [Updated] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
[ https://issues.apache.org/jira/browse/CARBONDATA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anubhav tarar updated CARBONDATA-1071: -- Request participants: (was: ) > test cases of TestSortColumns class will never fails > - > > Key: CARBONDATA-1071 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.1.0 > Environment: test >Reporter: SWATI RAO > Fix For: 1.1.0
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2104/
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user gvramana commented on the issue: https://github.com/apache/carbondata/pull/929 LGTM
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2105/
[GitHub] carbondata pull request #929: [CARBONDATA-1070]Not In Filter Expression Null...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/929
[jira] [Updated] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G updated CARBONDATA-1070: - Priority: Minor (was: Major) > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Not In Filter Expression throwing NullPointer Exception
[jira] [Updated] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G updated CARBONDATA-1070: - Description: Query containing Not In Filter Expression with null value is throwing NullPointerException (was: Not In Filter Expression throwing NullPointer Exception) > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Query containing Not In Filter Expression with null value is throwing > NullPointerException
[jira] [Resolved] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G resolved CARBONDATA-1070. -- Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Fix For: 1.2.0, 1.1.1 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Query containing Not In Filter Expression with null value is throwing > NullPointerException
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user rahulforallp commented on the issue: https://github.com/apache/carbondata/pull/927 retest this please
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2106/
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
GitHub user rahulforallp reopened a pull request: https://github.com/apache/carbondata/pull/927 [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other table properties and NPE in compaction 1. A measure shouldn't be supported inside no_inverted_index if it is not included in sort_columns or dictionary_include. 2. A dimension excluded from the dictionary should be supported in NO_INVERTED_INDEX. 3. Fix NullPointerException in compaction for decimal values after multiple loads. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rahulforallp/incubator-carbondata CARBONDATA-1066 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/927.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #927 commit 33632d34d31ab8b101a76638f3543c8f2f576be7 Author: rahulforallp Date: 2017-05-18T11:12:49Z measure shouldnot added to no_inverted_index commit 59cf9803033754e2b9ec813b85c130c56821fe49 Author: rahulforallp Date: 2017-05-18T11:23:49Z CARBONDATA-1066 supported commit 7747f38627685d04650f1c2ec13818def9ee Author: rahulforallp Date: 2017-05-18T12:11:27Z nullpointer exception resolved for multiple load and major compaction some test case added commit 5c0a1f44498489be714d89afee7a10a4bcf0c7b4 Author: rahulforallp Date: 2017-05-19T10:00:53Z comment resolved
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user rahulforallp closed the pull request at: https://github.com/apache/carbondata/pull/927
[GitHub] carbondata issue #918: [WIP] Support add store read size metrics for carbon ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/918 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2107/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2108/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2109/
[GitHub] carbondata pull request #918: [WIP] Support add store read size metrics for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/918#discussion_r117495284 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/processor/AbstractDataBlockIterator.java --- @@ -102,6 +107,17 @@ public AbstractDataBlockIterator(BlockExecutionInfo blockExecutionInfo, FileHold this.executorService = executorService; this.nextBlock = new AtomicBoolean(false); this.nextRead = new AtomicBoolean(false); +List allStatistics = FileSystem.getAllStatistics(); --- End diff -- Move this code to AbstractDetailQueryResultIterator and pass the statistics object from there to this class.
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117498967 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala --- @@ -495,15 +495,36 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser { // check duplicate columns and only 1 col left val distinctCols = noInvertedIdxColsProps.toSet // extract the no inverted index columns +val dictionaryInclude = tableProperties.getOrElse(CarbonCommonConstants.DICTIONARY_INCLUDE, "") --- End diff -- remove this validation
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117488392 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala --- @@ -495,15 +495,36 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser { // check duplicate columns and only 1 col left val distinctCols = noInvertedIdxColsProps.toSet // extract the no inverted index columns +val dictionaryInclude = tableProperties.getOrElse(CarbonCommonConstants.DICTIONARY_INCLUDE, "") + .split(",") +val sortColumns = tableProperties.getOrElse(CarbonCommonConstants.SORT_COLUMNS, "").split(",") fields.foreach(field => { - if (distinctCols.exists(x => x.equalsIgnoreCase(field.column))) { + if (distinctCols.exists(x => x.equalsIgnoreCase(field.column)) && + validateColumnsForInvertedIndex(field, dictionaryInclude ++ sortColumns)) { noInvertedIdxCols :+= field.column } } ) noInvertedIdxCols } + + private def validateColumnsForInvertedIndex(field: Field, + dictionaryIncludeOrSortColumn: Array[String]): Boolean = { +val invertedIndexColumns = Array("date", "timestamp", "struct", "array") --- End diff -- Struct and array will not have an inverted index, while date and timestamp will have an inverted index
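The validation debated in these review comments comes down to: complex types (struct, array) never carry an inverted index, date/timestamp dimensions do, and a measure only qualifies once it is promoted via SORT_COLUMNS or DICTIONARY_INCLUDE. A hedged Python sketch of that rule follows; the function and parameter names are illustrative, not CarbonData's actual parser API (CarbonDDLSqlParser is Scala):

```python
# Hedged sketch of the NO_INVERTED_INDEX validation discussed above.
# Names are illustrative, not CarbonData's actual parser API.

COMPLEX_TYPES = {"struct", "array"}   # never carry an inverted index

def can_disable_inverted_index(column, data_type, is_measure,
                               dictionary_include, sort_columns):
    """Return True if `column` may legally appear in NO_INVERTED_INDEX."""
    if data_type.lower() in COMPLEX_TYPES:
        return False      # nothing to disable for struct/array
    if is_measure:
        # a measure only becomes an indexed dimension when promoted via
        # SORT_COLUMNS or DICTIONARY_INCLUDE
        return column in dictionary_include or column in sort_columns
    return True           # plain dimensions, incl. date/timestamp, have one

# e.g. a double measure is rejected unless promoted:
ok = can_disable_inverted_index("price", "double", True, {"price"}, set())
```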
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2110/
[GitHub] carbondata issue #918: [WIP] Support add store read size metrics for carbon ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/918 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2111/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2112/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2113/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user cenyuhai commented on the issue: https://github.com/apache/carbondata/pull/890 Why did it fail? I tested it on my Mac and it is OK...
[jira] [Commented] (CARBONDATA-910) Implement Partition feature
[ https://issues.apache.org/jira/browse/CARBONDATA-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017763#comment-16017763 ] cen yuhai commented on CARBONDATA-910: -- Can you add a new partition type like hive? > Implement Partition feature > --- > > Key: CARBONDATA-910 > URL: https://issues.apache.org/jira/browse/CARBONDATA-910 > Project: CarbonData > Issue Type: New Feature > Components: core, data-load, data-query >Reporter: Cao, Lionel >Assignee: Cao, Lionel > > Why we need partition tables > A partition table provides an option to divide a table into smaller pieces. > With a partition table: > 1. Data can be better managed, organized and stored. > 2. We can avoid full table scans in some scenarios and improve query > performance. (partition column in filter, > multiple partition tables joined on the same partition column etc.) > Partitioning design > Range Partitioning >range partitioning maps data to partitions according to the range of > partition column values; the operator '<' defines the non-inclusive upper bound of > the current partition. > List Partitioning >list partitioning allows you to map data to partitions with a specific > value list > Hash Partitioning >hash partitioning maps data to partitions with a hash algorithm and puts > them into the given number of partitions > Composite Partitioning (2 levels at most for now) >Range-Range, Range-List, Range-Hash, List-Range, List-List, List-Hash, > Hash-Range, Hash-List, Hash-Hash > DDL-Create > Create table sales( > itemid long, > logdate datetime, > customerid int > ... > ...) 
> [partition by range logdate(...)] > [subpartition by list area(...)] > Stored By 'carbondata' > [tblproperties(...)]; > range partition: > partition by range logdate(< '2016-01-01', < '2017-01-01', < > '2017-02-01', < '2017-03-01', < '2099-01-01') > list partition: > partition by list area('Asia', 'Europe', 'North America', 'Africa', > 'Oceania') > hash partition: > partition by hash(itemid, 9) > composite partition: > partition by range logdate(< '2016-01-01', < '2017-01-01', < > '2017-02-01', < '2017-03-01', < '2099-01-01') > subpartition by list area('Asia', 'Europe', 'North America', 'Africa', > 'Oceania') > DDL-Rebuild, Add > Alter table sales rebuild partition by (range|list|hash)(...); > Alter table sales add partition (< '2018-01-01'); #only supports range > partitioning and list partitioning > Alter table sales add partition ('South America'); > #Note: No delete operation for partition, please use rebuild. > If you need to delete data, use a delete statement, but the definition of the partition > will not be deleted. > Partition Table Data Store > [Option One] > Use the current design, keep partition folder out of segments > Fact >|___Part0 >| |___Segment_0 >| |___ ***-[bucketId]-.carbondata >| |___ ***-[bucketId]-.carbondata >| |___Segment_1 >| ... >|___Part1 >| |___Segment_0 >| |___Segment_1 >|... > [Option Two] > remove the partition folder, add the partition id into the file name and build the btree on the > driver side. > Fact >|___Segment_0 >| |___ ***-[bucketId]-[partitionId].carbondata >| |___ ***-[bucketId]-[partitionId].carbondata >|___Segment_1 >|___Segment_2 >... > Pros & Cons: > Option one would be faster to locate target files > Option two needs to store more metadata of folders > Partition Table MetaData Store > partition info should be stored in the file footer/index file and loaded into > memory before user query. > Relationship with Bucket > Bucket should be a lower level of partition. 
> Partition Table Query > Example: > Select * from sales > where logdate <= date '2016-12-01'; > User should remember to add a partition filter when writing SQL on a partition > table.
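Given the range-partition DDL above (non-inclusive '<' upper bounds), the partition pruning that the related PR targets can be sketched as follows. This is an illustrative Python sketch under those assumptions, not CarbonData's actual Java pruning code:

```python
from bisect import bisect_right

# Illustrative sketch of pruning range partitions for a "col <= value"
# filter; bounds are the non-inclusive '<' upper bounds from the DDL
# above. Partition i holds rows with bounds[i-1] <= col < bounds[i]
# (partition 0 holds col < bounds[0]).

bounds = ["2016-01-01", "2017-01-01", "2017-02-01", "2017-03-01", "2099-01-01"]

def prune_le(bounds, value):
    """Indexes of partitions that may hold rows satisfying col <= value."""
    # bisect_right counts how many upper bounds are <= value; the next
    # partition after those can still contain matching rows
    last = bisect_right(bounds, value)
    return list(range(min(last + 1, len(bounds))))

# the example query "where logdate <= date '2016-12-01'" touches only
# the first two partitions; the other three are pruned
hit = prune_le(bounds, "2016-12-01")
```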
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/890 retest this please
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2114/
[jira] [Commented] (CARBONDATA-910) Implement Partition feature
[ https://issues.apache.org/jira/browse/CARBONDATA-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018277#comment-16018277 ] xuchuanyin commented on CARBONDATA-910: --- As described above, which option will Carbon use? > Implement Partition feature > --- > > Key: CARBONDATA-910 > URL: https://issues.apache.org/jira/browse/CARBONDATA-910 > Project: CarbonData > Issue Type: New Feature > Components: core, data-load, data-query >Reporter: Cao, Lionel >Assignee: Cao, Lionel
[jira] [Created] (CARBONDATA-1072) Streaming Ingestion Feature
Aniket Adnaik created CARBONDATA-1072: - Summary: Streaming Ingestion Feature Key: CARBONDATA-1072 URL: https://issues.apache.org/jira/browse/CARBONDATA-1072 Project: CarbonData Issue Type: New Feature Components: core, data-load, data-query, examples, file-format, spark-integration, sql Affects Versions: 1.1.0 Reporter: Aniket Adnaik Fix For: 1.2.0 High-level breakdown of work items/implementation phases: Design document will be attached soon. Phase 1 – Spark Structured Streaming with regular CarbonData format. This phase will mainly focus on supporting streaming ingestion using Spark Structured Streaming. 1. Write Path Implementation - Integration with Spark's Structured Streaming framework (FileStreamSink etc.) - StreamingOutputWriter (StreamingOutputWriterFactory) - Prepare write (schema validation, segment creation, streaming file creation etc.) - StreamingRecordWriter (data conversion from Catalyst InternalRow to CarbonData-compatible format, making use of the new load path) 2. Read Path Implementation (some overlap with Phase 2) - Modify getSplits() to read from the streaming segment - Read committed info from metadata to get correct offsets - Make use of the Min-Max index if available - Use sequential scan: data is unsorted, so the BTree index cannot be used 3. Compaction - Minor compaction - Major compaction 4. Metadata Management - Streaming metadata store (e.g. offsets, timestamps etc.) 5. Failure Recovery - Rollback on failure - Handle asynchronous writes to CarbonData (using hflush) Phase 2 – Spark Structured Streaming with appendable CarbonData format 1. Streaming File Format - Writers use the V3 file format for appending columnar unsorted data blocklets - Modify readers to read from the appendable streaming file format - Phase 3: 1. Interoperability Support - Functionality with other features/components - Concurrent queries with streaming ingestion - Concurrent operations with streaming ingestion (e.g. compaction, alter table, secondary index etc.) 2. 
Kafka Connect Ingestion / CarbonData connector - Direct ingestion from Kafka Connect without Spark Structured Streaming - Separate Kafka connector to receive data through a network port - Data commit and offset management - Phase 4: Support for other streaming engines - Analysis of streaming APIs/interfaces of other streaming engines - Implementation of connectors for different streaming engines: Storm, Flink, Flume, etc. - Phase 5: In-memory streaming table (probable feature) - 1. In-memory cache for streaming data - Fault-tolerant in-memory buffering / checkpoint with WAL - Readers read from in-memory tables if available - Background threads for writing streaming data, etc.
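The Phase 1 read-path items above (read committed offsets from metadata, then sequentially scan the streaming segment) can be illustrated with a minimal sketch. Class and method names here are hypothetical; the real implementation would be Java inside CarbonData:

```python
# Minimal sketch of the committed-offset read path described above:
# readers scan a streaming file only up to the last committed offset,
# so rows from an in-flight (uncommitted) append are never returned.
# Names are hypothetical, not CarbonData's actual classes.

class StreamingSegment:
    def __init__(self):
        self.rows = []            # appended rows (the streaming file)
        self.committed = 0        # offset recorded in streaming metadata

    def append(self, batch):
        self.rows.extend(batch)   # data lands first (e.g. via hflush)...

    def commit(self):
        self.committed = len(self.rows)   # ...then the offset is committed

    def scan(self):
        # sequential scan bounded by the committed offset
        return self.rows[:self.committed]

seg = StreamingSegment()
seg.append([1, 2, 3])
seg.commit()
seg.append([4, 5])                # in flight: invisible to scan()
```

Bounding the scan at the committed offset is what makes asynchronous hflush-based writes safe for concurrent readers: a failure before commit simply leaves the tail rows unread and eligible for rollback.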
[jira] [Updated] (CARBONDATA-1072) Streaming Ingestion Feature
[ https://issues.apache.org/jira/browse/CARBONDATA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Adnaik updated CARBONDATA-1072: -- Request participants: (was: ) Description: same phase breakdown as in the issue above, with minor restructuring of Phases 2-5.
[GitHub] carbondata issue #821: [CARBONDATA-921] resolved bug for unable to select ou...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/821 @chenliang613 looks good, suggest merging to the hive branch