[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/916 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2094/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/916 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2095/
[GitHub] carbondata issue #916: [CARBONDATA-938] Prune partitions for filter query on...
Github user jackylk commented on the issue: https://github.com/apache/carbondata/pull/916 LGTM
[GitHub] carbondata pull request #916: [CARBONDATA-938] Prune partitions for filter q...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/916
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117429668 --- Diff: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java --- @@ -66,6 +66,26 @@ public static int compare(byte[] buffer1, byte[] buffer2) { .compareTo(buffer1, offset1, len1, buffer2, offset2, len2); } + /** + * Compare method for bytes + * + * @param buffer1 + * @param buffer2 + * @return + */ + public static int compareOne(byte[] buffer1, byte[] buffer2) { --- End diff -- Remove this unused method
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117430177 --- Diff: processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java --- @@ -1366,8 +1350,9 @@ public Codec encodeAndCompressMeasures(TablePage tablePage) { } catch (InterruptedException e) { LOGGER.error(e, e.getMessage()); } - IndexStorage[] dimColumns = new IndexStorage[ - colGrpModel.getNoOfColumnStore() + noDictionaryCount + getExpandedComplexColsCount()]; + IndexStorage[] dimColumns = --- End diff -- Some of these changes are showing up only because of formatting; please revert them.
[jira] [Resolved] (CARBONDATA-938) 4. Detail filter query on partition column
[ https://issues.apache.org/jira/browse/CARBONDATA-938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacky Li resolved CARBONDATA-938. - Resolution: Fixed Fix Version/s: 1.2.0 > 4. Detail filter query on partition column > --- > > Key: CARBONDATA-938 > URL: https://issues.apache.org/jira/browse/CARBONDATA-938 > Project: CarbonData > Issue Type: Sub-task > Components: core, data-load, data-query >Reporter: QiangCai >Assignee: QiangCai > Fix For: 1.2.0 > > Time Spent: 4h 50m > Remaining Estimate: 0h > > Use filters (equal, range, in, etc.) to get the partition id list, then use > this partition id list to filter the BTree. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
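The pruning idea in the issue description — evaluate the filter once against partition metadata to get a partition id list, then scan only those partitions' BTrees — can be sketched as follows. This is an illustrative sketch only, not CarbonData's implementation; the `PartitionPruner` class, its `prune` helper, and the single-value-per-partition model are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of partition pruning: evaluate the filter against each
// partition's bound value and keep only the ids of partitions that can match.
public class PartitionPruner {

    // partitionValues.get(i) is the value bound to partition id i (assumption
    // for this sketch: one value per partition).
    public static List<Integer> prune(List<String> partitionValues,
                                      Predicate<String> filter) {
        List<Integer> matchedIds = new ArrayList<>();
        for (int id = 0; id < partitionValues.size(); id++) {
            if (filter.test(partitionValues.get(id))) {
                matchedIds.add(id);  // this partition may contain matching rows
            }
        }
        return matchedIds;           // only these partitions' BTrees need scanning
    }

    public static void main(String[] args) {
        List<String> parts = List.of("2015", "2016", "2017", "2018");
        // Equality filter: only the partition holding "2017" survives pruning.
        System.out.println(PartitionPruner.prune(parts, v -> v.equals("2017")));
    }
}
```

The same shape works for range and in-list filters: only the `Predicate` changes, while the id-collection loop stays identical.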
[GitHub] carbondata issue #745: [CARBONDATA-876] Clear segment access count ASAP
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/745 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2096/
[GitHub] carbondata pull request #928: sync
GitHub user xuchuanyin opened a pull request: https://github.com/apache/carbondata/pull/928 sync Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[CARBONDATA-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - What manual testing you have done? - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/xuchuanyin/carbondata master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/928.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #928 commit ce1c79b9540fca198dd82c9e21d45fceea79be5a Author: RedCactus Date: 2017-05-11T12:29:53Z Fix minor mistakes in documents Fix minor mistakes in documents commit 85b1abee6543ea5bec01c7f5a7d37e28ad3ed675 Author: RedCactus Date: 2017-05-11T12:33:10Z Merge pull request #1 from xuchuanyin/ddl_doc Fix minor mistakes in documents commit 82edf26e6c15d3514ee785e20359a5ea9cecd59e Author: RedCactus Date: 2017-05-11T12:41:54Z Fix word misspelling commit b7b26894377c52ea525b99d847279dc35afeb77d Author: RedCactus Date: 2017-05-11T12:43:48Z Remove redundant word commit 508c48fdd37a13abbc8b18d9e170672fc8d6b767 Author: RedCactus Date: 
2017-05-11T12:45:43Z Fix incorrect word commit 2e70fe39692150e0aeebfd601d56e86598e5f213 Author: RedCactus Date: 2017-05-11T12:51:41Z Optimize property description Gender information like 'HE' should not appear commit bb79ed52e648ba005e069000f3a4a6ac284f8be2 Author: RedCactus Date: 2017-05-11T12:55:54Z Merge pull request #2 from xuchuanyin/introduction_doc Fix word misspelling commit 3edc7ead00436af3c5b03434db9c85bc440e16e6 Author: RedCactus Date: 2017-05-11T12:56:04Z Merge pull request #3 from xuchuanyin/faq_doc Remove redundant word commit 0ee5598c8e4c1b8eefb67f9e4d6ce5b19e9a4efe Author: RedCactus Date: 2017-05-11T12:56:14Z Merge pull request #4 from xuchuanyin/trouble_doc Fix incorrect word commit 929ddbc2526540bd4f6b5ad1876363ca8a72cbaa Author: RedCactus Date: 2017-05-11T12:56:21Z Merge pull request #5 from xuchuanyin/datamgt_doc Optimize property description
[GitHub] carbondata pull request #928: sync
Github user xuchuanyin closed the pull request at: https://github.com/apache/carbondata/pull/928
[jira] [Commented] (CARBONDATA-1030) Support reading specified segment or carbondata file
[ https://issues.apache.org/jira/browse/CARBONDATA-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017107#comment-16017107 ] Weizhong commented on CARBONDATA-1030: -- We can add mapreduce.input.carboninputformat.segmentnumbers in mapred-site.xml, so that we can query from the specified segments {noformat} mapreduce.input.carboninputformat.segmentnumbers 0,1 {noformat} > Support reading specified segment or carbondata file > > > Key: CARBONDATA-1030 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1030 > Project: CarbonData > Issue Type: Improvement >Reporter: Jin Zhou >Priority: Minor > > We can currently query the whole table in SQL, but reading specified segments > or data files is useful in some scenarios such as incremental data processing.
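For illustration, here is a hedged sketch of how a comma-separated property value such as "0,1" could be parsed into the set of segments a query is allowed to read. Only the property name comes from the comment above; the `SegmentSelector` class and its `parseSegments` helper are hypothetical names invented for this example.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical sketch: turn the comma-separated segment-number property value
// into the set of segment ids a query should be restricted to.
public class SegmentSelector {

    // Property name proposed in the JIRA comment above.
    public static final String SEGMENT_PROPERTY =
        "mapreduce.input.carboninputformat.segmentnumbers";

    public static Set<String> parseSegments(String propertyValue) {
        if (propertyValue == null || propertyValue.trim().isEmpty()) {
            return Set.of();  // no property set -> no restriction configured
        }
        return Arrays.stream(propertyValue.split(","))
            .map(String::trim)
            .filter(s -> !s.isEmpty())
            .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    public static void main(String[] args) {
        // "0,1" restricts the query to segments 0 and 1 only.
        System.out.println(SegmentSelector.parseSegments("0,1"));
    }
}
```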
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2097/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2098/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2099/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2100/
[GitHub] carbondata pull request #929: [CARBONDATA-1070]Not In Filter Expression Null...
GitHub user sounakr opened a pull request: https://github.com/apache/carbondata/pull/929 [CARBONDATA-1070] Not In Filter Expression Null Value Handling Problem: filter test case failures. a) NullPointer handling in the Not expression. b) LessThan filter expression: wrong calculation of the StartKey when setting the bits. You can merge this pull request into a Git repository by running: $ git pull https://github.com/sounakr/incubator-carbondata filter_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/929.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #929
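For context, here is a minimal sketch of null-safe NOT IN evaluation under SQL three-valued logic — the kind of guard a fix like this needs, where a NULL row value or a NULL in the value list must be handled explicitly instead of being dereferenced. This illustrates the semantics only and is not the PR's actual code; class and method names are invented.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Null-safe NOT IN under SQL three-valued logic: if the row value is NULL the
// result is UNKNOWN, and if the list contains NULL the result can never be
// TRUE (only FALSE or UNKNOWN). UNKNOWN filters the row out, like FALSE.
public class NotInEvaluator {

    // Returns true only when the row value definitely does not match the list.
    public static boolean notIn(Object rowValue, List<Object> values) {
        if (rowValue == null) {
            return false;            // NULL row value -> UNKNOWN -> filtered out
        }
        boolean listHasNull = false;
        for (Object v : values) {
            if (v == null) {
                listHasNull = true;  // guard instead of dereferencing null
            } else if (Objects.equals(rowValue, v)) {
                return false;        // definite match -> NOT IN is false
            }
        }
        return !listHasNull;         // NULL in list -> UNKNOWN -> filtered out
    }

    public static void main(String[] args) {
        System.out.println(NotInEvaluator.notIn("a", Arrays.asList("b", null)));
        System.out.println(NotInEvaluator.notIn("a", Arrays.asList("b", "c")));
    }
}
```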
[jira] [Created] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
sounak chakraborty created CARBONDATA-1070: -- Summary: Not In Filter Expression throwing NullPointer Exception Key: CARBONDATA-1070 URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 Project: CarbonData Issue Type: Bug Components: core Affects Versions: 1.2.0 Reporter: sounak chakraborty Assignee: sounak chakraborty Not In Filter Expression throwing NullPointer Exception
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user rahulforallp commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117447500 --- Diff: core/src/main/java/org/apache/carbondata/core/util/ByteUtil.java --- @@ -66,6 +66,26 @@ public static int compare(byte[] buffer1, byte[] buffer2) { .compareTo(buffer1, offset1, len1, buffer2, offset2, len2); } + /** + * Compare method for bytes + * + * @param buffer1 + * @param buffer2 + * @return + */ + public static int compareOne(byte[] buffer1, byte[] buffer2) { --- End diff -- @kumarvishal09 the unused function has been removed
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2101/
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2102/
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2103/
[jira] [Created] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
SWATI RAO created CARBONDATA-1071: - Summary: test cases of TestSortColumns class will never fails Key: CARBONDATA-1071 URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 Project: CarbonData Issue Type: Bug Components: spark-integration Affects Versions: 1.1.0 Environment: test Reporter: SWATI RAO Fix For: 1.1.0 test("create table with direct-dictioanry sort_columns") { sql("CREATE TABLE sorttable3 (empno int, empname String, designation String, doj Timestamp, workgroupcategory int, workgroupcategoryname String, deptno int, deptname String, projectcode int, projectjoindate Timestamp, projectenddate Timestamp,attendance int,utilization int,salary int) STORED BY 'org.apache.carbondata.format' ") sql(s"""LOAD DATA local inpath '$resourcesPath/data.csv' INTO TABLE sorttable3 OPTIONS('DELIMITER'= ',', 'QUOTECHAR'= '\"')""") sql("select doj from sorttable3").show() sql("select doj from sorttable3 order by doj").show() checkAnswer(sql("select doj from sorttable3"), sql("select doj from sorttable3 order by doj")) } result: ++ | doj| ++ |2010-12-29 00:00:...| |2007-01-17 00:00:...| |2011-11-09 00:00:...| |2015-12-01 00:00:...| |2013-09-22 00:00:...| |2008-05-29 00:00:...| |2009-07-07 00:00:...| |2012-10-14 00:00:...| |2015-05-12 00:00:...| |2014-08-15 00:00:...| ++ | doj| ++ |2007-01-17 00:00:...| |2008-05-29 00:00:...| |2009-07-07 00:00:...| |2010-12-29 00:00:...| |2011-11-09 00:00:...| |2012-10-14 00:00:...| |2013-09-22 00:00:...| |2014-08-15 00:00:...| |2015-05-12 00:00:...| |2015-12-01 00:00:...| ++ The test case passed, but it should fail: checkAnswer(sql("select doj from sorttable3"), sql("select doj from sorttable3 order by doj")) only validates the data, not the order of the data, which is the real purpose of sort columns. The test cases must be modified so that the sort_columns functionality is actually verified.
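The order-insensitivity the reporter describes can be demonstrated with a minimal sketch: a multiset-style comparison (what `checkAnswer` effectively performs, per the report) passes on unsorted output, while an order-sensitive comparison catches a broken sort. Class and method names here are illustrative, not Spark's or CarbonData's API.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Demonstrates why an order-insensitive row comparison cannot detect a
// sort_columns regression: unsorted and sorted outputs compare equal.
public class OrderCheckDemo {

    // Order-insensitive: compares the rows as multisets (sorted copies).
    public static boolean sameRows(List<String> a, List<String> b) {
        List<String> x = new ArrayList<>(a);
        List<String> y = new ArrayList<>(b);
        Collections.sort(x);
        Collections.sort(y);
        return x.equals(y);
    }

    // Order-sensitive: what a sort_columns test actually needs to assert.
    public static boolean sameRowsInOrder(List<String> a, List<String> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        List<String> unsorted = List.of("2010-12-29", "2007-01-17", "2011-11-09");
        List<String> sorted   = List.of("2007-01-17", "2010-12-29", "2011-11-09");
        System.out.println(sameRows(unsorted, sorted));        // passes anyway
        System.out.println(sameRowsInOrder(unsorted, sorted)); // catches the bug
    }
}
```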
[jira] [Assigned] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
[ https://issues.apache.org/jira/browse/CARBONDATA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anubhav tarar reassigned CARBONDATA-1071: - Assignee: anubhav tarar > test cases of TestSortColumns class will never fails > - > > Key: CARBONDATA-1071 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.1.0 > Environment: test >Reporter: SWATI RAO >Assignee: anubhav tarar > Fix For: 1.1.0
[jira] [Updated] (CARBONDATA-1071) test cases of TestSortColumns class will never fails
[ https://issues.apache.org/jira/browse/CARBONDATA-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] anubhav tarar updated CARBONDATA-1071: -- Request participants: (was: ) > test cases of TestSortColumns class will never fails > - > > Key: CARBONDATA-1071 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1071 > Project: CarbonData > Issue Type: Bug > Components: spark-integration >Affects Versions: 1.1.0 > Environment: test >Reporter: SWATI RAO > Fix For: 1.1.0
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2104/
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user gvramana commented on the issue: https://github.com/apache/carbondata/pull/929 LGTM
[GitHub] carbondata issue #929: [CARBONDATA-1070]Not In Filter Expression Null Value ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/929 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2105/
[GitHub] carbondata pull request #929: [CARBONDATA-1070]Not In Filter Expression Null...
Github user asfgit closed the pull request at: https://github.com/apache/carbondata/pull/929
[jira] [Updated] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G updated CARBONDATA-1070: - Priority: Minor (was: Major) > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Not In Filter Expression throwing NullPointer Exception
[jira] [Updated] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G updated CARBONDATA-1070: - Description: Query containing Not In Filter Expression with null value is throwing NullPointerException (was: Not In Filter Expression throwing NullPointer Exception) > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Time Spent: 1h 10m > Remaining Estimate: 0h > > Query containing Not In Filter Expression with null value is throwing > NullPointerException
[jira] [Resolved] (CARBONDATA-1070) Not In Filter Expression throwing NullPointer Exception
[ https://issues.apache.org/jira/browse/CARBONDATA-1070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venkata Ramana G resolved CARBONDATA-1070. -- Resolution: Fixed Fix Version/s: 1.1.1 1.2.0 > Not In Filter Expression throwing NullPointer Exception > --- > > Key: CARBONDATA-1070 > URL: https://issues.apache.org/jira/browse/CARBONDATA-1070 > Project: CarbonData > Issue Type: Bug > Components: core >Affects Versions: 1.2.0 >Reporter: sounak chakraborty >Assignee: sounak chakraborty >Priority: Minor > Fix For: 1.2.0, 1.1.1 > > Time Spent: 1h 10m > Remaining Estimate: 0h > > Query containing Not In Filter Expression with null value is throwing > NullPointerException
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user rahulforallp commented on the issue: https://github.com/apache/carbondata/pull/927 retest this please
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2106/
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
GitHub user rahulforallp reopened a pull request: https://github.com/apache/carbondata/pull/927 [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other table properties and NPE in compaction 1. A measure shouldn't be supported inside no_inverted_index if it is not included in sort_columns or dictionary_include. 2. A dimension excluded from the dictionary should be supported in NO_INVERTED_INDEX. 3. Fix NullPointerException in compaction for decimal values after multiple loads. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rahulforallp/incubator-carbondata CARBONDATA-1066 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/carbondata/pull/927.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #927 commit 33632d34d31ab8b101a76638f3543c8f2f576be7 Author: rahulforallp Date: 2017-05-18T11:12:49Z measure shouldnot added to no_inverted_index commit 59cf9803033754e2b9ec813b85c130c56821fe49 Author: rahulforallp Date: 2017-05-18T11:23:49Z CARBONDATA-1066 supported commit 7747f38627685d04650f1c2ec13818def9ee Author: rahulforallp Date: 2017-05-18T12:11:27Z nullpointer exception resolved for multiple load and major compaction some test case added commit 5c0a1f44498489be714d89afee7a10a4bcf0c7b4 Author: rahulforallp Date: 2017-05-19T10:00:53Z comment resolved
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user rahulforallp closed the pull request at: https://github.com/apache/carbondata/pull/927
[GitHub] carbondata issue #918: [WIP] Support add store read size metrics for carbon ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/918 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2107/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2108/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2109/
[GitHub] carbondata pull request #918: [WIP] Support add store read size metrics for ...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/carbondata/pull/918#discussion_r117495284 --- Diff: core/src/main/java/org/apache/carbondata/core/scan/processor/AbstractDataBlockIterator.java --- @@ -102,6 +107,17 @@ public AbstractDataBlockIterator(BlockExecutionInfo blockExecutionInfo, FileHold this.executorService = executorService; this.nextBlock = new AtomicBoolean(false); this.nextRead = new AtomicBoolean(false); +List allStatistics = FileSystem.getAllStatistics(); --- End diff -- Move this code to AbstractDetailQueryResultIterator and pass the statistics object from there to this class.
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117498967 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala --- @@ -495,15 +495,36 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser { // check duplicate columns and only 1 col left val distinctCols = noInvertedIdxColsProps.toSet // extract the no inverted index columns +val dictionaryInclude = tableProperties.getOrElse(CarbonCommonConstants.DICTIONARY_INCLUDE, "") --- End diff -- remove this validation
[GitHub] carbondata pull request #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for ...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/carbondata/pull/927#discussion_r117488392 --- Diff: integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala --- @@ -495,15 +495,36 @@ abstract class CarbonDDLSqlParser extends AbstractCarbonSparkSQLParser { // check duplicate columns and only 1 col left val distinctCols = noInvertedIdxColsProps.toSet // extract the no inverted index columns +val dictionaryInclude = tableProperties.getOrElse(CarbonCommonConstants.DICTIONARY_INCLUDE, "") + .split(",") +val sortColumns = tableProperties.getOrElse(CarbonCommonConstants.SORT_COLUMNS, "").split(",") fields.foreach(field => { - if (distinctCols.exists(x => x.equalsIgnoreCase(field.column))) { + if (distinctCols.exists(x => x.equalsIgnoreCase(field.column)) && + validateColumnsForInvertedIndex(field, dictionaryInclude ++ sortColumns)) { noInvertedIdxCols :+= field.column } } ) noInvertedIdxCols } + + private def validateColumnsForInvertedIndex(field: Field, + dictionaryIncludeOrSortColumn: Array[String]): Boolean = { +val invertedIndexColumns = Array("date", "timestamp", "struct", "array") --- End diff -- Struct and array will not have an inverted index, while date and timestamp will have an inverted index
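The validation debated in these review comments comes down to: complex types (struct, array) never carry an inverted index, date/timestamp dimensions do, and a measure only qualifies once it is promoted via SORT_COLUMNS or DICTIONARY_INCLUDE. A hedged Python sketch of that rule follows; the function and parameter names are illustrative, not CarbonData's actual parser API (CarbonDDLSqlParser is Scala):

```python
# Hedged sketch of the NO_INVERTED_INDEX validation discussed above.
# Names are illustrative, not CarbonData's actual parser API.

COMPLEX_TYPES = {"struct", "array"}   # never carry an inverted index

def can_disable_inverted_index(column, data_type, is_measure,
                               dictionary_include, sort_columns):
    """Return True if `column` may legally appear in NO_INVERTED_INDEX."""
    if data_type.lower() in COMPLEX_TYPES:
        return False      # nothing to disable for struct/array
    if is_measure:
        # a measure only becomes an indexed dimension when promoted via
        # SORT_COLUMNS or DICTIONARY_INCLUDE
        return column in dictionary_include or column in sort_columns
    return True           # plain dimensions, incl. date/timestamp, have one

# e.g. a double measure is rejected unless promoted:
ok = can_disable_inverted_index("price", "double", True, {"price"}, set())
```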
[GitHub] carbondata issue #927: [CARBONDATA-1066] Fixed NO_INVERTED_INDEX for other t...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/927 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2110/
[GitHub] carbondata issue #918: [WIP] Support add store read size metrics for carbon ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/918 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2111/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2112/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2113/
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user cenyuhai commented on the issue: https://github.com/apache/carbondata/pull/890 Why did it fail? I tested it on my Mac and it is OK...
[jira] [Commented] (CARBONDATA-910) Implement Partition feature
[ https://issues.apache.org/jira/browse/CARBONDATA-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16017763#comment-16017763 ] cen yuhai commented on CARBONDATA-910: -- Can you add a new partition type like hive? > Implement Partition feature > --- > > Key: CARBONDATA-910 > URL: https://issues.apache.org/jira/browse/CARBONDATA-910 > Project: CarbonData > Issue Type: New Feature > Components: core, data-load, data-query >Reporter: Cao, Lionel >Assignee: Cao, Lionel > > Why we need partition tables > A partition table provides an option to divide a table into smaller pieces. > With a partition table: > 1. Data can be better managed, organized and stored. > 2. We can avoid full table scans in some scenarios and improve query > performance. (partition column in filter, > multiple partition tables joined on the same partition column etc.) > Partitioning design > Range Partitioning >range partitioning maps data to partitions according to the range of > partition column values; the operator '<' defines the non-inclusive upper bound of > the current partition. > List Partitioning >list partitioning allows you to map data to partitions with a specific > value list > Hash Partitioning >hash partitioning maps data to partitions with a hash algorithm and puts > them into the given number of partitions > Composite Partitioning (2 levels at most for now) >Range-Range, Range-List, Range-Hash, List-Range, List-List, List-Hash, > Hash-Range, Hash-List, Hash-Hash > DDL-Create > Create table sales( > itemid long, > logdate datetime, > customerid int > ... > ...) 
> [partition by range logdate(...)] > [subpartition by list area(...)] > Stored By 'carbondata' > [tblproperties(...)]; > range partition: > partition by range logdate(< '2016-01-01', < '2017-01-01', < > '2017-02-01', < '2017-03-01', < '2099-01-01') > list partition: > partition by list area('Asia', 'Europe', 'North America', 'Africa', > 'Oceania') > hash partition: > partition by hash(itemid, 9) > composite partition: > partition by range logdate(< '2016-01-01', < '2017-01-01', < > '2017-02-01', < '2017-03-01', < '2099-01-01') > subpartition by list area('Asia', 'Europe', 'North America', 'Africa', > 'Oceania') > DDL-Rebuild, Add > Alter table sales rebuild partition by (range|list|hash)(...); > Alter table sales add partition (< '2018-01-01'); #only supports range > partitioning and list partitioning > Alter table sales add partition ('South America'); > #Note: No delete operation for partition, please use rebuild. > If you need to delete data, use a delete statement, but the definition of the partition > will not be deleted. > Partition Table Data Store > [Option One] > Use the current design, keep partition folder out of segments > Fact >|___Part0 >| |___Segment_0 >| |___ ***-[bucketId]-.carbondata >| |___ ***-[bucketId]-.carbondata >| |___Segment_1 >| ... >|___Part1 >| |___Segment_0 >| |___Segment_1 >|... > [Option Two] > remove the partition folder, add the partition id into the file name and build the btree on the > driver side. > Fact >|___Segment_0 >| |___ ***-[bucketId]-[partitionId].carbondata >| |___ ***-[bucketId]-[partitionId].carbondata >|___Segment_1 >|___Segment_2 >... > Pros & Cons: > Option one would be faster to locate target files > Option two needs to store more metadata of folders > Partition Table MetaData Store > partition info should be stored in the file footer/index file and loaded into > memory before user query. > Relationship with Bucket > Bucket should be a lower level of partition. 
> Partition Table Query > Example: > Select * from sales > where logdate <= date '2016-12-01'; > User should remember to add a partition filter when writing SQL on a partition > table.
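Given the range-partition DDL above (non-inclusive '<' upper bounds), the partition pruning that the related PR targets can be sketched as follows. This is an illustrative Python sketch under those assumptions, not CarbonData's actual Java pruning code:

```python
from bisect import bisect_right

# Illustrative sketch of pruning range partitions for a "col <= value"
# filter; bounds are the non-inclusive '<' upper bounds from the DDL
# above. Partition i holds rows with bounds[i-1] <= col < bounds[i]
# (partition 0 holds col < bounds[0]).

bounds = ["2016-01-01", "2017-01-01", "2017-02-01", "2017-03-01", "2099-01-01"]

def prune_le(bounds, value):
    """Indexes of partitions that may hold rows satisfying col <= value."""
    # bisect_right counts how many upper bounds are <= value; the next
    # partition after those can still contain matching rows
    last = bisect_right(bounds, value)
    return list(range(min(last + 1, len(bounds))))

# the example query "where logdate <= date '2016-12-01'" touches only
# the first two partitions; the other three are pruned
hit = prune_le(bounds, "2016-12-01")
```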
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user chenliang613 commented on the issue: https://github.com/apache/carbondata/pull/890 retest this please
[GitHub] carbondata issue #890: [CARBONDATA-1008] Make Caron table schema compatible ...
Github user CarbonDataQA commented on the issue: https://github.com/apache/carbondata/pull/890 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/2114/
[jira] [Commented] (CARBONDATA-910) Implement Partition feature
[ https://issues.apache.org/jira/browse/CARBONDATA-910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018277#comment-16018277 ] xuchuanyin commented on CARBONDATA-910: --- As described above, which option will Carbon use? > Implement Partition feature > --- > > Key: CARBONDATA-910 > URL: https://issues.apache.org/jira/browse/CARBONDATA-910 > Project: CarbonData > Issue Type: New Feature > Components: core, data-load, data-query >Reporter: Cao, Lionel >Assignee: Cao, Lionel
[jira] [Created] (CARBONDATA-1072) Streaming Ingestion Feature
Aniket Adnaik created CARBONDATA-1072: - Summary: Streaming Ingestion Feature Key: CARBONDATA-1072 URL: https://issues.apache.org/jira/browse/CARBONDATA-1072 Project: CarbonData Issue Type: New Feature Components: core, data-load, data-query, examples, file-format, spark-integration, sql Affects Versions: 1.1.0 Reporter: Aniket Adnaik Fix For: 1.2.0 High-level breakdown of work items/implementation phases: Design document will be attached soon. Phase 1 – Spark Structured Streaming with regular CarbonData format. This phase will mainly focus on supporting streaming ingestion using Spark Structured Streaming. 1. Write Path Implementation - Integration with Spark's Structured Streaming framework (FileStreamSink etc.) - StreamingOutputWriter (StreamingOutputWriterFactory) - Prepare write (schema validation, segment creation, streaming file creation etc.) - StreamingRecordWriter (data conversion from Catalyst InternalRow to CarbonData-compatible format, making use of the new load path) 2. Read Path Implementation (some overlap with Phase 2) - Modify getSplits() to read from the streaming segment - Read committed info from metadata to get correct offsets - Make use of the Min-Max index if available - Use sequential scan: data is unsorted, so the BTree index cannot be used 3. Compaction - Minor compaction - Major compaction 4. Metadata Management - Streaming metadata store (e.g. offsets, timestamps etc.) 5. Failure Recovery - Rollback on failure - Handle asynchronous writes to CarbonData (using hflush) Phase 2 – Spark Structured Streaming with appendable CarbonData format 1. Streaming File Format - Writers use the V3 file format for appending columnar unsorted data blocklets - Modify readers to read from the appendable streaming file format - Phase 3: 1. Interoperability Support - Functionality with other features/components - Concurrent queries with streaming ingestion - Concurrent operations with streaming ingestion (e.g. compaction, alter table, secondary index etc.) 2. 
Kafka Connect Ingestion / CarbonData connector - Direct ingestion from Kafka Connect without Spark Structured Streaming - Separate Kafka connector to receive data through a network port - Data commit and offset management - Phase 4: Support for other streaming engines - Analysis of streaming APIs/interfaces of other streaming engines - Implementation of connectors for different streaming engines: Storm, Flink, Flume, etc. - Phase 5: In-memory streaming table (probable feature) - 1. In-memory cache for streaming data - Fault-tolerant in-memory buffering / checkpoint with WAL - Readers read from in-memory tables if available - Background threads for writing streaming data, etc.
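The Phase 1 read-path items above (read committed offsets from metadata, then sequentially scan the streaming segment) can be illustrated with a minimal sketch. Class and method names here are hypothetical; the real implementation would be Java inside CarbonData:

```python
# Minimal sketch of the committed-offset read path described above:
# readers scan a streaming file only up to the last committed offset,
# so rows from an in-flight (uncommitted) append are never returned.
# Names are hypothetical, not CarbonData's actual classes.

class StreamingSegment:
    def __init__(self):
        self.rows = []            # appended rows (the streaming file)
        self.committed = 0        # offset recorded in streaming metadata

    def append(self, batch):
        self.rows.extend(batch)   # data lands first (e.g. via hflush)...

    def commit(self):
        self.committed = len(self.rows)   # ...then the offset is committed

    def scan(self):
        # sequential scan bounded by the committed offset
        return self.rows[:self.committed]

seg = StreamingSegment()
seg.append([1, 2, 3])
seg.commit()
seg.append([4, 5])                # in flight: invisible to scan()
```

Bounding the scan at the committed offset is what makes asynchronous hflush-based writes safe for concurrent readers: a failure before commit simply leaves the tail rows unread and eligible for rollback.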
[jira] [Updated] (CARBONDATA-1072) Streaming Ingestion Feature
[ https://issues.apache.org/jira/browse/CARBONDATA-1072?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aniket Adnaik updated CARBONDATA-1072: -- Request participants: (was: ) Description: same phase breakdown as in the issue above, with minor restructuring of Phases 2-5.
[GitHub] carbondata issue #821: [CARBONDATA-921] resolved bug for unable to select ou...
Github user QiangCai commented on the issue: https://github.com/apache/carbondata/pull/821 @chenliang613 looks good, suggest merging to the hive branch