[GitHub] incubator-carbondata issue #660: [WIP]Bad record making configurable empty d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/660 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1179/ --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1178/
[GitHub] incubator-carbondata issue #658: [CARBONDATA-775]Updated Date DataType in Da...
Github user PallaviSingh1992 commented on the issue: https://github.com/apache/incubator-carbondata/pull/658 @chenliang613 please review
[GitHub] incubator-carbondata issue #660: [WIP]Bad record making configurable empty d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/660 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1177/
[GitHub] incubator-carbondata issue #429: [CARBONDATA-530]Modified optimizer to place...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/429 @ashokblend Spark 2.1 testcases are failing, can you fix them?
[GitHub] incubator-carbondata issue #635: [WIP]support SORT_COLUMNS
Github user QiangCai commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 @kumarvishal09 1. If the user has not mentioned any sort column, it will go to the old flow, sorting based on all dimension columns. 2. Yes. 3. During data loading, the start/end key of the blocklet info contains only sort columns. 4. For data loading, just use sort columns to build the start/end key of the blocklet info. Code line: CarbonFactDataHandlerColumnar.java 1041. For a select query, just use sort columns to build the start/end key of filters. Code lines: FilterUtil.java 1159 and 1206. @ravipesala I have removed the date & timestamp datatypes from no-dictionary. Better to raise another PR to implement the new numeric datatype encoding.
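The start/end-key construction described in the comment above can be sketched in pure Python. This is an illustrative sketch only, not CarbonData's actual API; the function name and row layout are invented for the example.

```python
# Hypothetical sketch: a blocklet's start/end key is built from only the
# designated sort columns, ignoring measures and non-sorted dimensions.

def blocklet_start_end_keys(rows, sort_columns):
    """rows: list of dicts; sort_columns: column names the data is sorted by.
    Returns (start_key, end_key), each a tuple of sort-column values only."""
    ordered = sorted(rows, key=lambda r: tuple(r[c] for c in sort_columns))
    start_key = tuple(ordered[0][c] for c in sort_columns)
    end_key = tuple(ordered[-1][c] for c in sort_columns)
    return start_key, end_key

rows = [
    {"col3": 5, "col7": "b", "m1": 10.0},
    {"col3": 2, "col7": "a", "m1": 3.5},
    {"col3": 9, "col7": "a", "m1": 7.1},
]
start, end = blocklet_start_end_keys(rows, ["col7", "col3"])
# start == ("a", 2), end == ("b", 5); the measure m1 plays no part in the key
```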
[GitHub] incubator-carbondata issue #660: [WIP]Bad record making configurable empty d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/660 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1175/
[GitHub] incubator-carbondata issue #635: [WIP]support SORT_COLUMNS
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1174/
[jira] [Created] (CARBONDATA-782) Support SORT_COLUMNS
QiangCai created CARBONDATA-782: --- Summary: Support SORT_COLUMNS Key: CARBONDATA-782 URL: https://issues.apache.org/jira/browse/CARBONDATA-782 Project: CarbonData Issue Type: New Feature Reporter: QiangCai Assignee: QiangCai The tasks of SORT_COLUMNS: 1. Support creating a table with the sort_columns property, e.g. tblproperties('sort_columns' = 'col7,col3'). A table with the SORT_COLUMNS property will be sorted by SORT_COLUMNS, and the sort order of columns is decided by SORT_COLUMNS. 2. Change the encoding rule of SORT_COLUMNS. Firstly, the column encoding rule will remain consistent with the previous behavior. Secondly, if a column of SORT_COLUMNS was previously a measure, it will now be created as a dimension, and this dimension is a no-dictionary column (better to use another direct dictionary). Thirdly, the dimensions of SORT_COLUMNS have RLE and ROWID pages; other dimensions have only RLE (not sorted). 3. The start/end key should be composed of SORT_COLUMNS, using SORT_COLUMNS to build the start/end key during data loading and select queries. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
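Task 3 above (composing the start/end key from SORT_COLUMNS) is what enables blocklet pruning at query time. A minimal, hypothetical sketch of that pruning idea, not CarbonData code:

```python
# Hypothetical sketch of sort-column-based pruning. A blocklet's start/end
# key bounds the range of its sort-column values, so a filter key falling
# outside that range lets the whole blocklet be skipped without reading it.

def can_prune(filter_key, start_key, end_key):
    """Return True when the blocklet cannot possibly contain filter_key.
    Keys are tuples of sort-column values, compared lexicographically."""
    return filter_key < start_key or filter_key > end_key

# Blocklet sorted by ('col7', 'col3'), keys spanning ("a", 2)..("b", 5):
assert can_prune(("c", 1), ("a", 2), ("b", 5))       # beyond the end key
assert not can_prune(("a", 9), ("a", 2), ("b", 5))   # inside the range
```

Tuple comparison being lexicographic is why the order of columns in SORT_COLUMNS matters: the first-listed column dominates the comparison.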
[GitHub] incubator-carbondata issue #661: remove shutdown dictionary server
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/661 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1170/
[GitHub] incubator-carbondata issue #429: [CARBONDATA-530]Modified optimizer to place...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/429 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1173/
[GitHub] incubator-carbondata issue #635: [WIP]support SORT_COLUMNS
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1172/
[GitHub] incubator-carbondata issue #635: [WIP]support SORT_COLUMNS
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/635 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1171/
[GitHub] incubator-carbondata issue #429: [CARBONDATA-530]Modified optimizer to place...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/429 retest this please
[GitHub] incubator-carbondata pull request #661: remove shutdown dictionary server
GitHub user lionelcao opened a pull request: https://github.com/apache/incubator-carbondata/pull/661 remove shutdown dictionary server remove shutdown dictionary server You can merge this pull request into a Git repository by running: $ git pull https://github.com/lionelcao/incubator-carbondata carbon761 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/661.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #661 commit f4cf54f844758441adecb3a40b16cc2868504a2d Author: lucao Date: 2017-03-16T02:53:08Z remove shutdown dictionary server
[GitHub] incubator-carbondata issue #660: [WIP]Bad record making configurable empty d...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/660 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1169/
[GitHub] incubator-carbondata pull request #660: [WIP]Bad record making configurable ...
GitHub user mohammadshahidkhan opened a pull request: https://github.com/apache/incubator-carbondata/pull/660 [WIP]Bad record making configurable empty data not a bad record. Be sure to do all of the following to help us incorporate your contribution quickly and easily: - [ ] Make sure the PR title is formatted like: `[CARBONDATA-] Description of pull request` - [ ] Make sure tests pass via `mvn clean verify`. (Even better, enable Travis-CI on your fork and ensure the whole test matrix passes). - [ ] Replace `` in the title with the actual Jira issue number, if there is one. - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.txt). - [ ] Testing done Please provide details on - Whether new unit test cases have been added or why no new tests are required? - What manual testing you have done? - Any additional information to help reviewers in testing this change. - [ ] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA. --- You can merge this pull request into a Git repository by running: $ git pull https://github.com/mohammadshahidkhan/incubator-carbondata bad_record_optimization Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/660.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #660 commit 75d16f2a9c3d49eb29fcb62056b3a144dbcd7063 Author: mohammadshahidkhan Date: 2017-03-08T10:04:00Z Bad record making configurable empty data not a bad record.
[GitHub] incubator-carbondata pull request #659: Reuse the same SegmentProperties obj...
GitHub user watermen opened a pull request: https://github.com/apache/incubator-carbondata/pull/659 Reuse the same SegmentProperties objects to reduce the memory When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. ![carbonproperties](https://cloud.githubusercontent.com/assets/1400819/23979443/82d44320-0a34-11e7-9a5b-c4dcab4f9232.jpg) I don't have small files so I don't want to compact the segments. I analyzed the dump file and found the values of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible. cc @jackylk @QiangCai You can merge this pull request into a Git repository by running: $ git pull https://github.com/watermen/incubator-carbondata CARBONDATA-781 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/659.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #659 commit b82e26907bd6882399cd4084e7584379b10c934c Author: Yadong Qi Date: 2017-03-15T09:41:42Z Reuse the same SegmentProperties to reduce the memory.
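The reuse described in this PR is essentially value-based interning: many value-equal objects collapse into one shared instance. A minimal Python sketch of that pattern, under the assumption that segment properties can be represented as a hashable value; the class and method names are hypothetical, not CarbonData's API:

```python
# Hypothetical sketch of the reuse idea: intern value-equal property
# objects so N identical copies share a single instance in memory.

class SegmentPropertiesPool:
    def __init__(self):
        self._pool = {}

    def intern(self, props):
        # props must be hashable, e.g. a tuple of column metadata.
        # setdefault stores props on first sight, then always returns
        # the stored instance for any value-equal key.
        return self._pool.setdefault(props, props)

pool = SegmentPropertiesPool()
a = pool.intern(("col1", "STRING", "col2", "INT"))
b = pool.intern(("col1", "STRING", "col2", "INT"))
assert a is b  # 1000 loads with an identical schema would share one object
```

With this shape, 1000 segments of 76K each would cost one 76K object plus 1000 small references, which matches the memory reduction the PR is after.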
[jira] [Updated] (CARBONDATA-781) Some SegmentProperties objects occupy too much memory in driver
[ https://issues.apache.org/jira/browse/CARBONDATA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-781: - Description: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties objects occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compact the segments. I analyzed the dump file and found the values of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible. (was: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compact the segments. I analyzed the dump file and found the values of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible.) > Some SegmentProperties objects occupy too much memory in driver > --- > > Key: CARBONDATA-781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-781 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.1.0-incubating >Reporter: Yadong Qi > Attachments: CarbonProperties.jpg > > Time Spent: 10m > Remaining Estimate: 0h > > When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties > objects occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small > files so I don't want to compact the segments. I analyzed the dump file and > found the values of SegmentProperties are the same, so I think we can reuse > the SegmentProperties object if possible.
[GitHub] incubator-carbondata issue #659: [CARBONDATA-781] Reuse the same SegmentProp...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/659 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1168/
[jira] [Updated] (CARBONDATA-781) Some SegmentProperties objects occupy too much memory in driver
[ https://issues.apache.org/jira/browse/CARBONDATA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-781: - Description: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compact the segments. I analyzed the dump file and found the values of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible. (was: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compate the segment. I analyze the dump file and found the value of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible.) > Some SegmentProperties objects occupy too much memory in driver > --- > > Key: CARBONDATA-781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-781 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.1.0-incubating >Reporter: Yadong Qi > Attachments: CarbonProperties.jpg > > > When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties > occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I > don't want to compact the segments. I analyzed the dump file and found the > values of SegmentProperties are the same, so I think we can reuse the > SegmentProperties object if possible.
[jira] [Updated] (CARBONDATA-781) Some SegmentProperties objects occupy too much memory in driver
[ https://issues.apache.org/jira/browse/CARBONDATA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-781: - Description: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compate the segment. I analyze the dump file and found the value of SegmentProperties are the same, so I think we can reuse the SegmentProperties object if possible. (was: When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compate the segment. I analyze the dump file and found the value of SegmentProperties are the same, so I think we can reuse the SegmentProperties if possible.) > Some SegmentProperties objects occupy too much memory in driver > --- > > Key: CARBONDATA-781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-781 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.1.0-incubating >Reporter: Yadong Qi > Attachments: CarbonProperties.jpg > > > When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties > occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I > don't want to compate the segment. I analyze the dump file and found the > value of SegmentProperties are the same, so I think we can reuse the > SegmentProperties object if possible.
[jira] [Updated] (CARBONDATA-781) Some SegmentProperties objects occupy too much memory in driver
[ https://issues.apache.org/jira/browse/CARBONDATA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-781: - Summary: Some SegmentProperties objects occupy too much memory in driver (was: Some SegmentProperties object occupy too much memory in driver) > Some SegmentProperties objects occupy too much memory in driver > --- > > Key: CARBONDATA-781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-781 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.1.0-incubating >Reporter: Yadong Qi > Attachments: CarbonProperties.jpg > > > When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties > occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I > don't want to compate the segment. I analyze the dump file and found the > value of SegmentProperties are the same, so I think we can reuse the > SegmentProperties if possible.
[jira] [Updated] (CARBONDATA-781) Some SegmentProperties object occupy too much memory in driver
[ https://issues.apache.org/jira/browse/CARBONDATA-781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yadong Qi updated CARBONDATA-781: - Summary: Some SegmentProperties object occupy too much memory in driver (was: SegmentProperties occupy too much memory in driver) > Some SegmentProperties object occupy too much memory in driver > -- > > Key: CARBONDATA-781 > URL: https://issues.apache.org/jira/browse/CARBONDATA-781 > Project: CarbonData > Issue Type: Improvement > Components: core >Affects Versions: 1.1.0-incubating >Reporter: Yadong Qi > Attachments: CarbonProperties.jpg > > > When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties > occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I > don't want to compate the segment. I analyze the dump file and found the > value of SegmentProperties are the same, so I think we can reuse the > SegmentProperties if possible.
[GitHub] incubator-carbondata issue #655: [CARBONDATA-762] Change schemaName to datab...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/655 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1167/
[jira] [Created] (CARBONDATA-781) SegmentProperties occupy too much memory in driver
Yadong Qi created CARBONDATA-781: Summary: SegmentProperties occupy too much memory in driver Key: CARBONDATA-781 URL: https://issues.apache.org/jira/browse/CARBONDATA-781 Project: CarbonData Issue Type: Improvement Components: core Affects Versions: 1.1.0-incubating Reporter: Yadong Qi Attachments: CarbonProperties.jpg When I load carbondata 1000+ times with 35 nodes, I found SegmentProperties occupy 2.5+G(76K * 35 * 1000) memory in driver. I don't have small files so I don't want to compate the segment. I analyze the dump file and found the value of SegmentProperties are the same, so I think we can reuse the SegmentProperties if possible.
[GitHub] incubator-carbondata issue #514: [CARBONDATA-614]Fix issue: Dictionary file ...
Github user cyq130 commented on the issue: https://github.com/apache/incubator-carbondata/pull/514 I have pulled the latest version of carbondata but still encounter this issue. My carbondata version is carbondata_2.10-1.1.0-incubating-SNAPSHOT-shade-hadoop2.7.1.jar.
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user gvramana commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Modified ManualApacheCarbonPRBuilder2.1 to test Spark 1.6 and tried building. Tests have failed; @manishgupta88 please fix. http://136.243.101.176:8080/job/ManualApacheCarbonPRBuilder2.1/107/
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user gvramana commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Compilation errors in 1.6.2, please fix the same.
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1165/
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1164/
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user gvramana commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 LGTM
[jira] [Assigned] (CARBONDATA-767) Alter table support for carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta reassigned CARBONDATA-767: --- Assignee: (was: Manish Gupta) > Alter table support for carbondata > -- > > Key: CARBONDATA-767 > URL: https://issues.apache.org/jira/browse/CARBONDATA-767 > Project: CarbonData > Issue Type: New Feature >Affects Versions: 1.1.0-incubating >Reporter: Manish Gupta > Fix For: 1.1.0-incubating > > Time Spent: 3.5h > Remaining Estimate: 0h > > Currently in carbondata, once a table is created it becomes immutable. > Deletion or addition of columns is not supported, because of which the same > table and data cannot be used again. To add more flexibility to the > carbondata system, alter table support needs to be added. > Please refer to the design document at the below location. > https://drive.google.com/open?id=0B1DnrpMgGOu9a3dBSzhqVlEwY2s
[jira] [Created] (CARBONDATA-780) Alter table support for compaction through sort step
Manish Gupta created CARBONDATA-780: --- Summary: Alter table support for compaction through sort step Key: CARBONDATA-780 URL: https://issues.apache.org/jira/browse/CARBONDATA-780 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Assignee: Manish Gupta Alter table needs to support a compaction process where the complete data needs to be sorted again and then written to file. Currently in the compaction process, data is directly given to the writer step, where it is split into columns and written. But as columns are sorted from left to right, on dropping a column the data will again become unorganized, as the dropped column's data will not be considered during compaction. In these scenarios the complete data needs to be sorted again and then submitted to the writer step.
[GitHub] incubator-carbondata issue #641: [CARBONDATA-777] Alter table support for sp...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1163/
[jira] [Assigned] (CARBONDATA-777) Alter table support for spark 2.1
[ https://issues.apache.org/jira/browse/CARBONDATA-777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manish Gupta reassigned CARBONDATA-777: --- Assignee: Manish Gupta > Alter table support for spark 2.1 > - > > Key: CARBONDATA-777 > URL: https://issues.apache.org/jira/browse/CARBONDATA-777 > Project: CarbonData > Issue Type: Sub-task >Reporter: Manish Gupta >Assignee: Manish Gupta > Fix For: 1.1.0-incubating > > Time Spent: 10m > Remaining Estimate: 0h > > Alter table needs to be supported for Spark 2.1. > As part of this jira the following features will be supported. > 1. Support alter table result preparation. > 2. Support reading data with different block key generators. > 3. Support addition of a new column. > 4. Support deletion of a column. > 5. Support change in data type from INT to BIGINT. > 6. Support change of decimal datatype from lower to higher precision. > 7. Support filtering on newly added columns. > 8. Support rename table. > 9. Parsing support for the new DDL commands added.
[jira] [Created] (CARBONDATA-779) Alter table support for column group
Manish Gupta created CARBONDATA-779: --- Summary: Alter table support for column group Key: CARBONDATA-779 URL: https://issues.apache.org/jira/browse/CARBONDATA-779 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Priority: Minor Alter table needs to support adding, dropping, and changing the datatype of column groups.
[jira] [Created] (CARBONDATA-778) Alter table support for complex type
Manish Gupta created CARBONDATA-778: --- Summary: Alter table support for complex type Key: CARBONDATA-778 URL: https://issues.apache.org/jira/browse/CARBONDATA-778 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Priority: Minor Alter table needs to support adding and dropping complex type columns and changing their data type
[jira] [Created] (CARBONDATA-777) Alter table support for spark 2.1
Manish Gupta created CARBONDATA-777: --- Summary: Alter table support for spark 2.1 Key: CARBONDATA-777 URL: https://issues.apache.org/jira/browse/CARBONDATA-777 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Alter table needs to be supported for Spark 2.1
[jira] [Created] (CARBONDATA-776) Alter table support for spark 1.6
Manish Gupta created CARBONDATA-776: --- Summary: Alter table support for spark 1.6 Key: CARBONDATA-776 URL: https://issues.apache.org/jira/browse/CARBONDATA-776 Project: CarbonData Issue Type: Sub-task Reporter: Manish Gupta Priority: Minor The alter feature needs to be supported for Spark 1.6
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106192438 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -313,6 +307,100 @@ private[sql] case class AlterTableAddColumns( } } +private[sql] case class AlterTableRenameTable(alterTableRenameModel: AlterTableRenameModel) --- End diff -- Done
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106189693 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -747,14 +835,14 @@ case class LoadTable( true } else { LOGGER.error("Can't use single_pass, because SINGLE_PASS and ALL_DICTIONARY_PATH" + - "can not be used together, and USE_KETTLE must be set as false") + "can not be used together, and USE_KETTLE must be set as false") --- End diff -- wrong indentation
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user gvramana commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106174923 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/carbonTableSchema.scala --- @@ -313,6 +307,100 @@ private[sql] case class AlterTableAddColumns( } } +private[sql] case class AlterTableRenameTable(alterTableRenameModel: AlterTableRenameModel) --- End diff -- Move these commands and case class AlterTableRenameTable to AlterTableCommands.scala
[GitHub] incubator-carbondata pull request #652: [CARBONDATA-768] unit-testcase for o...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/652#discussion_r105890299 --- Diff: core/src/test/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStoreTest.java --- @@ -0,0 +1,40 @@ +package org.apache.carbondata.core.datastore.chunk.store.impl.unsafe; --- End diff -- Please add test cases for the other public methods
[GitHub] incubator-carbondata pull request #652: [CARBONDATA-768] unit-testcase for o...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/652#discussion_r105890189 --- Diff: core/src/test/java/org/apache/carbondata/core/datastore/chunk/store/impl/unsafe/UnsafeFixedLengthDimensionDataChunkStoreTest.java --- @@ -0,0 +1,40 @@ +package org.apache.carbondata.core.datastore.chunk.store.impl.unsafe; + +import org.junit.BeforeClass; +import org.junit.Test; + +public class UnsafeFixedLengthDimensionDataChunkStoreTest { + + static UnsafeFixedLengthDimensionDataChunkStore unsafeFixedLengthDimensionDataChunkStore; + + @BeforeClass public static void setup() { --- End diff -- Please add test cases with inverted index and reverse inverted index
[GitHub] incubator-carbondata issue #655: [CARBONDATA-762] Change schemaName to datab...
Github user chenliang613 commented on the issue: https://github.com/apache/incubator-carbondata/pull/655 @lionelcao please push a null (empty) commit to trigger the Travis CI again; CI should be OK. Verified on my machine, it is OK.
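A "null commit" here means an empty commit, which re-triggers CI without changing any files. A minimal sketch follows; the throwaway repository is created only for illustration (in a real PR you would run the `git commit` in your existing branch):

```shell
# Create a throwaway repo just to demonstrate the empty-commit trick.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "dev@example.com"   # placeholder identity
git config user.name  "dev"
git commit -q --allow-empty -m "initial commit"

# --allow-empty records a commit even though no files changed;
# pushing such a commit re-triggers CI builds on the pull request.
git commit -q --allow-empty -m "Retrigger CI"
git log --oneline
```

In a real pull-request branch the follow-up would be `git push`, which the CI system picks up as a new build.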
[GitHub] incubator-carbondata issue #641: [CARBONDATA-767] Alter table support for ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1162/
[GitHub] incubator-carbondata issue #653: [CARBONDATA-769] Added codegen support to s...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/653 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1161/
[jira] [Resolved] (CARBONDATA-744) The property "spark.carbon.custom.distribution" should be changed to carbon.custom.block.distribution and should be part of CarbonProperties
[ https://issues.apache.org/jira/browse/CARBONDATA-744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-744. Resolution: Fixed Fix Version/s: 1.0.1-incubating > The property "spark.carbon.custom.distribution" should be changed to > carbon.custom.block.distribution and should be part of CarbonProperties > -- > > Key: CARBONDATA-744 > URL: https://issues.apache.org/jira/browse/CARBONDATA-744 > Project: CarbonData > Issue Type: Improvement > Reporter: Mohammad Shahid Khan > Assignee: Mohammad Shahid Khan > Priority: Trivial > Fix For: 1.0.1-incubating > > Time Spent: 2h 40m > Remaining Estimate: 0h > > The property "spark.carbon.custom.distribution" should be part of > CarbonProperties. > Per the naming style adopted in carbon, we should name the key > carbon.custom.distribution
[GitHub] incubator-carbondata pull request #622: [CARBONDATA-744] The property "spark...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/622
[GitHub] incubator-carbondata issue #622: [CARBONDATA-744] The property "spark.carbon...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/622 LGTM
[GitHub] incubator-carbondata pull request #653: [CARBONDATA-769] Added codegen suppo...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/653#discussion_r106162467 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDictionaryDecoder.scala --- @@ -200,6 +201,123 @@ case class CarbonDictionaryDecoder( } } + override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String = { + +val storePath = CarbonEnv.get.carbonMetastore.storePath +val absoluteTableIdentifiers = relations.map { relation => + val carbonTable = relation.carbonRelation.carbonRelation.metaData.carbonTable + (carbonTable.getFactTableName, carbonTable.getAbsoluteTableIdentifier) +}.toMap + +if (isRequiredToDecode) { + val cacheProvider: CacheProvider = CacheProvider.getInstance + val forwardDictionaryCache: Cache[DictionaryColumnUniqueIdentifier, Dictionary] = +cacheProvider.createCache(CacheType.FORWARD_DICTIONARY, storePath) + val dicts: Seq[ForwardDictionaryWrapper] = getDictionaryWrapper(absoluteTableIdentifiers, +forwardDictionaryCache, storePath) + + val exprs = child.output.map(x => +ExpressionCanonicalizer.execute(BindReferences.bindReference(x, child.output))) + ctx.currentVars = input + val resultVars = exprs.zipWithIndex.map { e => --- End diff -- ok
[GitHub] incubator-carbondata issue #641: [CARBONDATA-767] Alter table support for ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1160/
[jira] [Assigned] (CARBONDATA-772) Number format exception displays to user in spark 2.1.
[ https://issues.apache.org/jira/browse/CARBONDATA-772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Kumar reassigned CARBONDATA-772: -- Assignee: Rahul Kumar
> Number format exception displays to user in spark 2.1.
> Key: CARBONDATA-772
> URL: https://issues.apache.org/jira/browse/CARBONDATA-772
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1
> Reporter: Vinod Rohilla
> Assignee: Rahul Kumar
> Priority: Trivial
> Attachments: 2000_UniqData.csv
>
> A number format exception is displayed to the user while executing the queries below.
> Steps to reproduce:
> 1. Create table in Hive:
> CREATE TABLE uniqdata4 (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
> 2. Load data in Hive:
> LOAD DATA LOCAL INPATH '/home/vinod/Desktop/AllCSV/2000_UniqData.csv' into table uniqdata4;
> 3. Select:
> select CUST_ID,CUST_NAME,DOB,BIGINT_COLUMN1,DECIMAL_COLUMN1,Double_COLUMN1,INTEGER_COLUMN1 from uniqdata4 where CUST_ID in ('10020','10030','10032','10035','10040','10060','',NULL,' ') or INTEGER_COLUMN1 in (1021,1031,1032,1033,'',NULL,' ') or DECIMAL_COLUMN1 in ('12345679921.123400','12345679931.123400','12345679936.123400','',NULL,' ');
> Result:
> | CUST_ID | CUST_NAME       | DOB                   | BIGINT_COLUMN1 | DECIMAL_COLUMN1    | Double_COLUMN1      | INTEGER_COLUMN1 |
> | 10020   | CUST_NAME_01020 | 1972-10-17 01:00:03.0 | 123372037874   | 12345679921.123400 | 1.12345674897976E10 | 1021 |
> | 10030   | CUST_NAME_01030 | 1972-10-27 01:00:03.0 | 123372037884   | 12345679931.123400 | 1.12345674897976E10 | 1031 |
> | 10031   | CUST_NAME_01031 | 1972-10-28 01:00:03.0 | 123372037885   | 12345679932.123400 | 1.12345674897976E10 | 1032 |
> | 10032   | CUST_NAME_01032 | 1972-10-29 01:00:03.0 | 123372037886   | 12345679933.123400 | 1.12345674897976E10 | 1033 |
> | 10035   | CUST_NAME_01035 | 1972-11-01 01:00:03.0 | 123372037889   | 12345679936.123400 | 1.12345674897976E10 | 1036 |
> | 10040   | CUST_NAME_01040 | 1972-11-06 01:00:03.0 | 123372037894   | 12345679941.123400 | 1.12345674897976E10 | 1041 |
> | 10060   | CUST_NAME_01060 | 1972-11-26 01:00:03.0 | 123372037914   | 12345679961.123400 | 1.12345674897976E10 | 1061 |
>
> Create table in CarbonData:
> 1. Create table:
> CREATE TABLE uniqdata (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) STORED BY 'org.apache.carbondata.format' TBLPROPERTIES ("TABLE_BLOCKSIZE"= "256 MB");
> 2. Load data into the table:
> LOAD DATA INPATH 'hdfs://localhost:54310/2000_UniqData.csv' into table uniqdata OPTIONS('DELIMITER'=',' , 'QUOTECHAR'='"','BAD_RECORDS_ACTION'='FORCE','FILEHEADER'='CUST_ID,CUST_NAME,ACTIVE_EMUI_VERSION,DOB,DOJ,BIGINT_COLUMN1,BIGINT_COLUMN2,DECIMAL_COLUMN1,DECIMAL_COLUMN2,Double_COLUMN1,Double_COLUMN2,INTEGER_COLUMN1');
> 3. Execute select query:
> select CUST_ID,CUST_NAME,DOB,BIGINT_COLUMN1,DECIMAL_COLUMN1,Double_COLUMN1,INTEGER_COLUMN1 from uniqdata where CUST_ID in ('10020','10030','10032','10035','10040','10060','',NULL,' ') or INTEGER_COLUMN1 in (1021,1031,1032,1033,'',NULL,' ') or DECIMAL_COLUMN1 in ('12345679921.123400','12345679931.123400','12345679936.123400','',NULL,' ');
> Result:
> Error: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 27.0 failed 1 times, most recent failure: Lost task 2.0 in stage 27.0 (TID 44, localhost, executor driver): java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NumberFormatException
> at
[GitHub] incubator-carbondata issue #658: [CARBONDATA-775]Updated Date DataType in Da...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/658 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1158/
[GitHub] incubator-carbondata pull request #657: [CARBONDATA-770] Fixed null filter i...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/657
[GitHub] incubator-carbondata issue #657: [CARBONDATA-770] Fixed null filter issue
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/657 LGTM
[GitHub] incubator-carbondata issue #657: [CARBONDATA-770] Fixed null filter issue
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/657 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1157/
[GitHub] incubator-carbondata pull request #658: [CARBONDATA-775]Updated Date DataTyp...
GitHub user PallaviSingh1992 opened a pull request: https://github.com/apache/incubator-carbondata/pull/658 [CARBONDATA-775]Updated Date DataType in DataTypes supported by CarbonData Added the support for the Date data type and re-structured the file. You can merge this pull request into a Git repository by running: $ git pull https://github.com/PallaviSingh1992/incubator-carbondata feature/UpdateDataTypeDoc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/658.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #658 commit bbf4ecbe61a83b0e61c72f8801fb93c283b679d7 Author: PallaviSingh1992 Date: 2017-03-15T11:27:20Z Updated Date DataType in DataTypes supported by CarbonData
[jira] [Assigned] (CARBONDATA-775) Update Documentation for Supported Datatypes
[ https://issues.apache.org/jira/browse/CARBONDATA-775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Pallavi Singh reassigned CARBONDATA-775: Assignee: Pallavi Singh > Update Documentation for Supported Datatypes > > > Key: CARBONDATA-775 > URL: https://issues.apache.org/jira/browse/CARBONDATA-775 > Project: CarbonData > Issue Type: Improvement > Components: docs > Reporter: Pallavi Singh > Assignee: Pallavi Singh > Time Spent: 10m > Remaining Estimate: 0h
[GitHub] incubator-carbondata pull request #653: [CARBONDATA-769] Added codegen suppo...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/653#discussion_r106149917 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDictionaryDecoder.scala --- @@ -225,6 +343,25 @@ case class CarbonDictionaryDecoder( dicts } + private def getDictionaryWrapper(atiMap: Map[String, AbsoluteTableIdentifier], + cache: Cache[DictionaryColumnUniqueIdentifier, Dictionary], storePath: String) = { +val dicts: Seq[ForwardDictionaryWrapper] = getDictionaryColumnIds.map { f => --- End diff -- ok
[GitHub] incubator-carbondata pull request #653: [CARBONDATA-769] Added codegen suppo...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/653#discussion_r106149905 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/CarbonDictionaryDecoder.scala --- @@ -200,6 +201,123 @@ case class CarbonDictionaryDecoder( } } + override def doConsume(ctx: CodegenContext, input: Seq[ExprCode], row: ExprCode): String = { + +val storePath = CarbonEnv.get.carbonMetastore.storePath +val absoluteTableIdentifiers = relations.map { relation => + val carbonTable = relation.carbonRelation.carbonRelation.metaData.carbonTable + (carbonTable.getFactTableName, carbonTable.getAbsoluteTableIdentifier) +}.toMap + +if (isRequiredToDecode) { + val cacheProvider: CacheProvider = CacheProvider.getInstance + val forwardDictionaryCache: Cache[DictionaryColumnUniqueIdentifier, Dictionary] = +cacheProvider.createCache(CacheType.FORWARD_DICTIONARY, storePath) + val dicts: Seq[ForwardDictionaryWrapper] = getDictionaryWrapper(absoluteTableIdentifiers, +forwardDictionaryCache, storePath) + + val exprs = child.output.map(x => --- End diff -- ok
[jira] [Updated] (CARBONDATA-774) Not like operator does not work properly in carbondata
[ https://issues.apache.org/jira/browse/CARBONDATA-774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vinod Rohilla updated CARBONDATA-774: - Summary: Not like operator does not work properly in carbondata (was: Not like operators does not work properly in carbondata)
> Not like operator does not work properly in carbondata
> Key: CARBONDATA-774
> URL: https://issues.apache.org/jira/browse/CARBONDATA-774
> Project: CarbonData
> Issue Type: Bug
> Components: sql
> Affects Versions: 1.0.0-incubating
> Environment: Spark 2.1
> Reporter: Vinod Rohilla
> Priority: Trivial
> Attachments: CSV.tar.gz
>
> The NOT LIKE operator does not return the same result as Hive.
> Steps to reproduce:
> A) Create table in Hive:
> CREATE TABLE uniqdata_h (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
> 2. Load data in Hive:
> a) load data local inpath '/opt/TestData/Data/uniqdata/2000_UniqData.csv' into table uniqdata_h
> b) load data local inpath '/opt/TestData/Data/uniqdata/4000_UniqData.csv' into table uniqdata_h
> c) load data local inpath '/opt/TestData/Data/uniqdata/6000_UniqData.csv' into table uniqdata_h
> d) load data local inpath '/opt/TestData/Data/uniqdata/7000_UniqData.csv' into table uniqdata_h
> e) load data local inpath '/opt/TestData/Data/uniqdata/3000_1_UniqData.csv' into table uniqdata_h
> 3. Run the query:
> select CUST_ID from uniqdata_h where CUST_ID NOT LIKE 100079
> 4. Result in Hive:
> [long single-column listing of CUST_ID values 8999 through 9197; truncated in the archive]
[jira] [Created] (CARBONDATA-774) Not like operators does not work properly in carbondata
Vinod Rohilla created CARBONDATA-774: Summary: Not like operators does not work properly in carbondata Key: CARBONDATA-774 URL: https://issues.apache.org/jira/browse/CARBONDATA-774 Project: CarbonData Issue Type: Bug Components: sql Affects Versions: 1.0.0-incubating Environment: Spark 2.1 Reporter: Vinod Rohilla Priority: Trivial Attachments: CSV.tar.gz
Not Like operator result does not display the same as Hive.
Steps to reproduce:
A) Create table in Hive:
CREATE TABLE uniqdata_h (CUST_ID int,CUST_NAME String,ACTIVE_EMUI_VERSION string, DOB timestamp, DOJ timestamp, BIGINT_COLUMN1 bigint,BIGINT_COLUMN2 bigint,DECIMAL_COLUMN1 decimal(30,10), DECIMAL_COLUMN2 decimal(36,10),Double_COLUMN1 double, Double_COLUMN2 double,INTEGER_COLUMN1 int) ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
2. Load data in Hive:
a) load data local inpath '/opt/TestData/Data/uniqdata/2000_UniqData.csv' into table uniqdata_h
b) load data local inpath '/opt/TestData/Data/uniqdata/4000_UniqData.csv' into table uniqdata_h
c) load data local inpath '/opt/TestData/Data/uniqdata/6000_UniqData.csv' into table uniqdata_h
d) load data local inpath '/opt/TestData/Data/uniqdata/7000_UniqData.csv' into table uniqdata_h
e) load data local inpath '/opt/TestData/Data/uniqdata/3000_1_UniqData.csv' into table uniqdata_h
3. Run the query:
select CUST_ID from uniqdata_h where CUST_ID NOT LIKE 100079
4. Result in Hive:
[long single-column listing of CUST_ID values 8999 through 9251; truncated in the archive]
[GitHub] incubator-carbondata issue #641: [CARBONDATA-767] Alter table support for ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1156/
[GitHub] incubator-carbondata pull request #657: [CARBONDATA-770] Fixed null filter i...
GitHub user kumarvishal09 opened a pull request: https://github.com/apache/incubator-carbondata/pull/657 [CARBONDATA-770] Fixed null filter issue In case of an exclude filter on a dictionary column, DictionaryColumnVisitor adds the null-value surrogate twice, and because of this an IS NOT NULL query gives a wrong result You can merge this pull request into a Git repository by running: $ git pull https://github.com/kumarvishal09/incubator-carbondata FixedNullFilterIssue Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/657.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #657 commit bfc7e64fed9329531b96b12d772e6714ba42ba22 Author: kumarvishal Date: 2017-03-15T12:08:19Z Fixed null filter issue
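For context, the class of query affected by this fix is a negative (exclude) filter on a dictionary-encoded column. A hedged sketch of the query shapes involved (the table and column names below are hypothetical, not from the PR):

```sql
-- "t" and "dict_col" are made-up names; any dictionary-encoded column applies.
-- Both predicates compile to an exclude filter, where the null surrogate
-- was being counted twice before this fix, skewing the result:
SELECT COUNT(*) FROM t WHERE dict_col IS NOT NULL;
SELECT * FROM t WHERE dict_col NOT IN ('a', 'b');
```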
[GitHub] incubator-carbondata issue #641: [CARBONDATA-767] Alter table support for ca...
Github user manishgupta88 commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 @gvramana ...Handled all review comments and all the test cases are passing. Please refer to the link below. Kindly review and merge. ---
[GitHub] incubator-carbondata issue #656: [CARBONDATA-773] Fixed multiple DictionaryS...
Github user kunal642 commented on the issue: https://github.com/apache/incubator-carbondata/pull/656 @ravipesala changes made. Please review. ---
[GitHub] incubator-carbondata issue #641: [CARBONDATA-767] Alter table support for ca...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/641 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1154/ ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106138538 --- Diff: core/src/main/java/org/apache/carbondata/core/util/path/CarbonTablePath.java --- @@ -236,9 +236,9 @@ public String getTableUpdateStatusFilePath() { * @return absolute path of data file stored in carbon data format */ public String getCarbonDataFilePath(String partitionId, String segmentId, Integer filePartNo, - Integer taskNo, int bucketNumber, String factUpdateTimeStamp) { + Integer taskNo, int taskExtension, int bucketNumber, String factUpdateTimeStamp) { --- End diff -- Ok, updated ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106136281 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/SortProcessorStepImpl.java --- @@ -74,7 +81,6 @@ public void initialize() throws IOException { public Iterator[] execute() throws CarbonDataLoadingException { final Iterator[] iterators = child.execute(); Iterator[] sortedIterators = sorter.sort(iterators); -child.close(); --- End diff -- It was removed as part of refactoring the code to support batch sort. Earlier it was needed to get the cardinality of all dimensions; now a new interface, `DictionaryCardinalityFinder`, has been added to figure out the cardinalities. ---
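[Editor's note] The shape of the refactoring described in this comment can be sketched as follows. This is an illustrative stand-in, not the actual CarbonData code: the method name `getCardinality` and the `ExampleSortStep` class are assumptions; only the interface name `DictionaryCardinalityFinder` and its purpose come from the comment.

```java
// Hypothetical sketch of the idea behind DictionaryCardinalityFinder: rather
// than closing the child step early just to learn dimension cardinalities, a
// pipeline step exposes them through a dedicated interface.
interface DictionaryCardinalityFinder {
    int[] getCardinality();
}

// A sort step can then hand out the cardinalities it discovered while its
// iterators are still being consumed downstream.
class ExampleSortStep implements DictionaryCardinalityFinder {
    private final int[] cardinality;

    ExampleSortStep(int[] cardinality) {
        this.cardinality = cardinality;
    }

    @Override
    public int[] getCardinality() {
        return cardinality;
    }
}

public class CardinalityDemo {
    public static void main(String[] args) {
        DictionaryCardinalityFinder finder = new ExampleSortStep(new int[] {100, 42});
        System.out.println(finder.getCardinality().length); // 2
    }
}
```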
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106135534 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/SortProcessorStepImpl.java --- @@ -58,11 +59,17 @@ public void initialize() throws IOException { boolean offheapsort = Boolean.parseBoolean(CarbonProperties.getInstance() .getProperty(CarbonCommonConstants.ENABLE_UNSAFE_SORT, CarbonCommonConstants.ENABLE_UNSAFE_SORT_DEFAULT)); +boolean batchSort = Boolean.parseBoolean(CarbonProperties.getInstance() +.getProperty(CarbonCommonConstants.LOAD_USE_BATCH_SORT, +CarbonCommonConstants.LOAD_USE_BATCH_SORT_DEFAULT)); if (offheapsort) { --- End diff -- ok ---
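[Editor's note] The diff in this comment reads two boolean flags with defaults and uses them to pick a sorter. A minimal standalone sketch of that selection pattern is below, using `java.util.Properties` in place of `CarbonProperties`; the property names and the batch-over-offheap precedence are assumptions of this sketch, not taken from the PR.

```java
import java.util.Properties;

// Sketch of sorter selection driven by two boolean properties, each read with
// a default so a missing property falls back to the plain heap sorter.
public class SorterSelectionDemo {

    enum SorterVariant { HEAP, OFFHEAP, BATCH }

    static SorterVariant choose(Properties props) {
        boolean offheapSort = Boolean.parseBoolean(
            props.getProperty("enable.unsafe.sort", "false"));
        boolean batchSort = Boolean.parseBoolean(
            props.getProperty("load.use.batch.sort", "false"));
        if (batchSort) {
            return SorterVariant.BATCH; // precedence is an assumption of this sketch
        }
        return offheapSort ? SorterVariant.OFFHEAP : SorterVariant.HEAP;
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("load.use.batch.sort", "true");
        System.out.println(choose(props)); // BATCH
    }
}
```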
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106134851 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.processing.newflow.steps; + +import java.io.File; +import java.io.IOException; +import java.util.Iterator; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.IgnoreDictionary; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory; +import org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep; +import org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration; +import org.apache.carbondata.processing.newflow.DataField; +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * It reads data from sorted files which are generated in previous sort step. + * And it writes data to carbondata file. 
It also generates mdk key while writing to carbondata file + */ +public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorStep { + + private static final LogService LOGGER = + LogServiceFactory.getLogService(DataWriterBatchProcessorStepImpl.class.getName()); + + private int noDictionaryCount; + + private int complexDimensionCount; + + private int measureCount; + + private int measureIndex = IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex(); + + private int noDimByteArrayIndex = IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex(); + + private int dimsArrayIndex = IgnoreDictionary.DIMENSION_INDEX_IN_ROW.getIndex(); + + public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuration, + AbstractDataLoadProcessorStep child) { +super(configuration, child); + } + + @Override public DataField[] getOutput() { +return child.getOutput(); + } + + @Override public void initialize() throws IOException { +child.initialize(); + } + + private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) { +String storeLocation = CarbonDataProcessorUtil +.getLocalDataFolderLocation(tableIdentifier.getDatabaseName(), +tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId, +configuration.getSegmentId() + "", false); +new File(storeLocation).mkdirs(); +return storeLocation; + } + + @Override public Iterator[] execute() throws CarbonDataLoadingException { +Iterator[] iterators = child.execute(); +CarbonTableIdentifier tableIdentifier = +configuration.getTableIdentifier().getCarbonTableIdentifier(); +String tableName = tableIdentifier.getTableName(); +try { + CarbonFactDataHandlerModel dataHandlerModel = CarbonFactDataHandlerModel + .createCarbonFactDataHandlerModel(configuration, + getStoreLocation(tableIdentifier, String.valueOf(0)), 0, 0); + noDictionaryCount =
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106134073 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- (quoted file excerpt omitted; identical to the excerpt quoted earlier, and truncated before the reviewer's remark in the archive) ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106133618 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- (quoted file excerpt omitted; identical to the excerpt quoted earlier, and truncated before the reviewer's remark in the archive) ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106133451 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- (quoted file excerpt omitted; identical to the excerpt quoted earlier, and truncated before the reviewer's remark in the archive) ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106133491 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- (quoted file excerpt omitted; identical to the excerpt quoted earlier, and truncated before the reviewer's remark in the archive) ---
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106131227 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) +val alterTableChangeDataTypeModel = + AlterTableDataTypeChangeModel(parseDataType(dataType.toLowerCase, values), +convertDbNameToLowerCase(dbName), +table.toLowerCase, +columnName.toLowerCase, +columnNameCopy.toLowerCase) +AlterTableDataTypeChange(alterTableChangeDataTypeModel) +} + + protected lazy val alterTableAddColumns: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ +(ADD ~> COLUMNS ~> "(" ~> repsep(anyFieldDef, ",") <~ ")") ~ +(TBLPROPERTIES ~> "(" ~> repsep(loadOptions, ",") <~ ")").? 
<~ opt(";") ^^ { + case dbName ~ table ~ fields ~ tblProp => +fields.foreach{ f => + if (isComplexDimDictionaryExclude(f.dataType.get)) { +throw new MalformedCarbonCommandException( + s"Add column is unsupported for complex datatype column: ${f.column}") + } +} +val tableProps = if (tblProp.isDefined) { + // default value should not be converted to lower case + val tblProps = tblProp.get.map(f => if (f._1.toLowerCase.startsWith("default.value.")) { +f._1 -> f._2 + } else { +f._1 -> f._2.toLowerCase + }) + scala.collection.mutable.Map(tblProps: _*) +} else { + scala.collection.mutable.Map.empty[String, String] +} + +val tableModel = prepareTableModel (false, + convertDbNameToLowerCase(dbName), + table.toLowerCase, + fields.map(convertFieldNamesToLowercase), + Seq.empty, + tableProps, + None, + true) + +val alterTableAddColumnsModel = AlterTableAddColumnsModel(convertDbNameToLowerCase(dbName), + table, + tableProps, + tableModel.dimCols, + tableModel.msrCols, + tableModel.highcardinalitydims.getOrElse(Seq.empty)) +AlterTableAddColumns(alterTableAddColumnsModel) +} + + private def convertFieldNamesToLowercase(field: Field): Field = { +val name = field.column.toLowerCase +field.copy(column = name, name = Some(name)) + } + protected lazy val alterTableDropColumn: Parser[LogicalPlan] = --- End diff -- Ok ---
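[Editor's note] For reference, the parser rules quoted in this review would accept DDL of roughly the following shape. The table, column, and property names here are invented for illustration; the authoritative syntax is whatever the `alterTableModifyDataType` and `alterTableAddColumns` rules in `CarbonSpark2SqlParser` actually parse.

```sql
-- Change a column's data type; the grammar expects the column name twice,
-- and validateColumnNames checks that both occurrences match.
ALTER TABLE mydb.customers CHANGE cust_id cust_id BIGINT;

-- Add columns, optionally with table properties; per the code above,
-- "default.value." property values keep their original case.
ALTER TABLE mydb.customers ADD COLUMNS (city STRING, age INT)
  TBLPROPERTIES ('DEFAULT.VALUE.age' = '0');
```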
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106129635 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/parser/CarbonSpark2SqlParser.scala --- @@ -129,4 +134,80 @@ class CarbonSpark2SqlParser extends CarbonDDLSqlParser { case databaseName ~ tableName ~ limit => ShowLoadsCommand(convertDbNameToLowerCase(databaseName), tableName.toLowerCase(), limit) } + + protected lazy val alterTableModifyDataType: Parser[LogicalPlan] = +ALTER ~> TABLE ~> (ident <~ ".").? ~ ident ~ CHANGE ~ ident ~ ident ~ +ident ~ opt("(" ~> rep1sep(valueOptions, ",") <~ ")") <~ opt(";") ^^ { + case dbName ~ table ~ change ~ columnName ~ columnNameCopy ~ dataType ~ values => +// both the column names should be same +CommonUtil.validateColumnNames(columnName, columnNameCopy) --- End diff -- Ok ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106129176 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.processing.newflow.steps; + +import java.io.File; +import java.io.IOException; +import java.util.Iterator; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.IgnoreDictionary; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory; +import org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep; +import org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration; +import org.apache.carbondata.processing.newflow.DataField; +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * It reads data from sorted files which are generated in previous sort step. + * And it writes data to carbondata file. 
It also generates mdk key while writing to carbondata file + */ +public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorStep { + + private static final LogService LOGGER = + LogServiceFactory.getLogService(DataWriterBatchProcessorStepImpl.class.getName()); + + private int noDictionaryCount; + + private int complexDimensionCount; + + private int measureCount; + + private int measureIndex = IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex(); + + private int noDimByteArrayIndex = IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex(); + + private int dimsArrayIndex = IgnoreDictionary.DIMENSION_INDEX_IN_ROW.getIndex(); + + public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuration, + AbstractDataLoadProcessorStep child) { +super(configuration, child); + } + + @Override public DataField[] getOutput() { +return child.getOutput(); + } + + @Override public void initialize() throws IOException { +child.initialize(); + } + + private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) { +String storeLocation = CarbonDataProcessorUtil +.getLocalDataFolderLocation(tableIdentifier.getDatabaseName(), +tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId, +configuration.getSegmentId() + "", false); +new File(storeLocation).mkdirs(); +return storeLocation; + } + + @Override public Iterator[] execute() throws CarbonDataLoadingException { +Iterator[] iterators = child.execute(); +CarbonTableIdentifier tableIdentifier = +configuration.getTableIdentifier().getCarbonTableIdentifier(); +String tableName = tableIdentifier.getTableName(); +try { + CarbonFactDataHandlerModel dataHandlerModel = CarbonFactDataHandlerModel + .createCarbonFactDataHandlerModel(configuration, + getStoreLocation(tableIdentifier, String.valueOf(0)), 0, 0); + noDictionaryCount =
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106128671 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.processing.newflow.steps; + +import java.io.File; +import java.io.IOException; +import java.util.Iterator; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.IgnoreDictionary; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory; +import org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep; +import org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration; +import org.apache.carbondata.processing.newflow.DataField; +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * It reads data from sorted files which are generated in previous sort step. + * And it writes data to carbondata file. 
It also generates mdk key while writing to carbondata file + */ +public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorStep { + + private static final LogService LOGGER = + LogServiceFactory.getLogService(DataWriterBatchProcessorStepImpl.class.getName()); + + private int noDictionaryCount; + + private int complexDimensionCount; + + private int measureCount; + + private int measureIndex = IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex(); + + private int noDimByteArrayIndex = IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex(); + + private int dimsArrayIndex = IgnoreDictionary.DIMENSION_INDEX_IN_ROW.getIndex(); + + public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuration, + AbstractDataLoadProcessorStep child) { +super(configuration, child); + } + + @Override public DataField[] getOutput() { +return child.getOutput(); + } + + @Override public void initialize() throws IOException { +child.initialize(); + } + + private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) { +String storeLocation = CarbonDataProcessorUtil +.getLocalDataFolderLocation(tableIdentifier.getDatabaseName(), +tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId, +configuration.getSegmentId() + "", false); +new File(storeLocation).mkdirs(); +return storeLocation; + } + + @Override public Iterator[] execute() throws CarbonDataLoadingException { +Iterator[] iterators = child.execute(); +CarbonTableIdentifier tableIdentifier = +configuration.getTableIdentifier().getCarbonTableIdentifier(); +String tableName = tableIdentifier.getTableName(); +try { + CarbonFactDataHandlerModel dataHandlerModel = CarbonFactDataHandlerModel + .createCarbonFactDataHandlerModel(configuration, + getStoreLocation(tableIdentifier, String.valueOf(0)), 0, 0); + noDictionaryCount =
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106128302 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.processing.newflow.steps; + +import java.io.File; +import java.io.IOException; +import java.util.Iterator; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.IgnoreDictionary; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory; +import org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep; +import org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration; +import org.apache.carbondata.processing.newflow.DataField; +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * It reads data from sorted files which are generated in previous sort step. + * And it writes data to carbondata file. 
It also generates mdk key while writing to carbondata file + */ +public class DataWriterBatchProcessorStepImpl extends AbstractDataLoadProcessorStep { + + private static final LogService LOGGER = + LogServiceFactory.getLogService(DataWriterBatchProcessorStepImpl.class.getName()); + + private int noDictionaryCount; + + private int complexDimensionCount; + + private int measureCount; + + private int measureIndex = IgnoreDictionary.MEASURES_INDEX_IN_ROW.getIndex(); + + private int noDimByteArrayIndex = IgnoreDictionary.BYTE_ARRAY_INDEX_IN_ROW.getIndex(); + + private int dimsArrayIndex = IgnoreDictionary.DIMENSION_INDEX_IN_ROW.getIndex(); + + public DataWriterBatchProcessorStepImpl(CarbonDataLoadConfiguration configuration, + AbstractDataLoadProcessorStep child) { +super(configuration, child); + } + + @Override public DataField[] getOutput() { +return child.getOutput(); + } + + @Override public void initialize() throws IOException { +child.initialize(); + } + + private String getStoreLocation(CarbonTableIdentifier tableIdentifier, String partitionId) { +String storeLocation = CarbonDataProcessorUtil +.getLocalDataFolderLocation(tableIdentifier.getDatabaseName(), +tableIdentifier.getTableName(), String.valueOf(configuration.getTaskNo()), partitionId, +configuration.getSegmentId() + "", false); +new File(storeLocation).mkdirs(); --- End diff -- if it already exists then we do nothing and continue loading. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
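The reviewer's point about `new File(storeLocation).mkdirs()` is that an already-existing directory is harmless: `mkdirs()` simply returns false and loading continues. A small runnable sketch of that behaviour (the helper name is illustrative, not CarbonData code):

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Minimal sketch of the behaviour discussed above: File.mkdirs() is a no-op
// returning false when the directory already exists, but the directory is
// still usable, so a loader can continue either way.
public class StoreLocationDemo {
    public static boolean ensureStoreLocation(String path) {
        File dir = new File(path);
        dir.mkdirs();              // creates it, or does nothing if present
        return dir.isDirectory();  // what actually matters for loading
    }

    public static void main(String[] args) throws IOException {
        File tmp = Files.createTempDirectory("carbon-store").toFile();
        String loc = new File(tmp, "seg_0").getPath();
        System.out.println(ensureStoreLocation(loc)); // first call creates it
        System.out.println(ensureStoreLocation(loc)); // already exists
    }
}
```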
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106127463 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/steps/DataWriterBatchProcessorStepImpl.java --- @@ -0,0 +1,206 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.carbondata.processing.newflow.steps; + +import java.io.File; +import java.io.IOException; +import java.util.Iterator; + +import org.apache.carbondata.common.logging.LogService; +import org.apache.carbondata.common.logging.LogServiceFactory; +import org.apache.carbondata.core.constants.IgnoreDictionary; +import org.apache.carbondata.core.datastore.block.SegmentProperties; +import org.apache.carbondata.core.metadata.CarbonTableIdentifier; +import org.apache.carbondata.core.util.CarbonTimeStatisticsFactory; +import org.apache.carbondata.processing.newflow.AbstractDataLoadProcessorStep; +import org.apache.carbondata.processing.newflow.CarbonDataLoadConfiguration; +import org.apache.carbondata.processing.newflow.DataField; +import org.apache.carbondata.processing.newflow.exception.CarbonDataLoadingException; +import org.apache.carbondata.processing.newflow.row.CarbonRow; +import org.apache.carbondata.processing.newflow.row.CarbonRowBatch; +import org.apache.carbondata.processing.store.CarbonFactDataHandlerModel; +import org.apache.carbondata.processing.store.CarbonFactHandler; +import org.apache.carbondata.processing.store.CarbonFactHandlerFactory; +import org.apache.carbondata.processing.store.writer.exception.CarbonDataWriterException; +import org.apache.carbondata.processing.util.CarbonDataProcessorUtil; + +/** + * It reads data from sorted files which are generated in previous sort step. --- End diff -- Actually it is not always get the data from in-memory sorted files. It actually gets the data batch of sort files(it could be in-memory/disk). It always depends on batch size and memory availability. If the memory configured is 1GB and batch size configured is 2GB then after it reaches the memory limit it flushes to disk. I will update the comment as per the above --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. 
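The batch-sort behaviour described above -- rows accumulate until either the configured batch size completes a batch or the in-memory limit forces a spill to disk -- can be sketched as a small decision function. This is a hedged illustration of the stated policy, not the actual CarbonData sorter:

```java
// Sketch of the policy from the review comment: with a 1GB memory limit and
// a 2GB batch size, the sorter spills to disk once memory fills, and only
// finishes the batch once the batch size is reached.
public class BatchSortPolicy {
    public static String onRowAdded(long bytesInMemory, long memoryLimit,
                                    long bytesInBatch, long batchSize) {
        if (bytesInBatch >= batchSize) {
            return "FINISH_BATCH";  // batch complete, hand over to the writer
        }
        if (bytesInMemory >= memoryLimit) {
            return "SPILL_TO_DISK"; // batch not done, but memory is full
        }
        return "KEEP_IN_MEMORY";
    }

    public static void main(String[] args) {
        // 1GB limit, 2GB batch: memory fills first, so the sorter spills
        System.out.println(onRowAdded(1L << 30, 1L << 30, 100, 2L << 30));
    }
}
```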
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106125641 --- Diff: integration/spark2/src/main/scala/org/apache/spark/sql/hive/CarbonMetastore.scala --- @@ -304,38 +341,76 @@ class CarbonMetastore(conf: RuntimeConfig, val storePath: String) { if (tableExists(tableName, Some(dbName))(sparkSession)) { sys.error(s"Table [$tableName] already exists under Database [$dbName]") } +val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) val schemaConverter = new ThriftWrapperSchemaConverterImpl val thriftTableInfo = schemaConverter .fromWrapperToExternalTableInfo(tableInfo, dbName, tableName) -val schemaEvolutionEntry = new SchemaEvolutionEntry(tableInfo.getLastUpdatedTime) thriftTableInfo.getFact_table.getSchema_evolution.getSchema_evolution_history .add(schemaEvolutionEntry) +val carbonTablePath = createSchemaThriftFile(tableInfo, + thriftTableInfo, + dbName, + tableName)(sparkSession) +updateSchemasUpdatedTime(touchSchemaFileSystemTime(dbName, tableName)) +LOGGER.info(s"Table $tableName for Database $dbName created successfully.") +carbonTablePath + } + /** + * This method will write the schema thrift file in carbon store and load table metadata + * + * @param tableInfo + * @param thriftTableInfo + * @param dbName + * @param tableName + * @param sparkSession + * @return + */ + private def createSchemaThriftFile( + tableInfo: org.apache.carbondata.core.metadata.schema.table.TableInfo, + thriftTableInfo: org.apache.carbondata.format.TableInfo, + dbName: String, tableName: String) +(sparkSession: SparkSession): String = { val carbonTableIdentifier = new CarbonTableIdentifier(dbName, tableName, tableInfo.getFactTable.getTableId) val carbonTablePath = CarbonStorePath.getCarbonTablePath(storePath, carbonTableIdentifier) val schemaFilePath = carbonTablePath.getSchemaFilePath val schemaMetadataPath = CarbonTablePath.getFolderContainingFile(schemaFilePath) 
tableInfo.setMetaDataFilepath(schemaMetadataPath) tableInfo.setStorePath(storePath) -CarbonMetadata.getInstance().loadTableMetadata(tableInfo) -val tableMeta = new TableMeta(carbonTableIdentifier, storePath, - CarbonMetadata.getInstance().getCarbonTable(dbName + "_" + tableName)) - val fileType = FileFactory.getFileType(schemaMetadataPath) if (!FileFactory.isFileExist(schemaMetadataPath, fileType)) { FileFactory.mkdirs(schemaMetadataPath, fileType) } val thriftWriter = new ThriftWriter(schemaFilePath, false) -thriftWriter.open() +thriftWriter.open(FileWriteOperation.OVERWRITE) thriftWriter.write(thriftTableInfo) thriftWriter.close() +removeTableFromMetadata(dbName, tableName) --- End diff -- As per the old code, when a new table is created, the table info is added into CarbonMetastore and the modified.mdt file timestamp is updated to refresh other sessions. Those changes were extracted into a new method used in both the CreateTable and AlterTable flows; the code updating the mdt file was missing, which I added. ---
[GitHub] incubator-carbondata pull request #620: [CARBONDATA-742] Added batch sort to...
Github user ravipesala commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/620#discussion_r106125107 --- Diff: processing/src/main/java/org/apache/carbondata/processing/newflow/DataLoadProcessBuilder.java --- @@ -52,10 +53,15 @@ public AbstractDataLoadProcessorStep build(CarbonLoadModel loadModel, String storeLocation, CarbonIterator[] inputIterators) throws Exception { +boolean batchSort = Boolean.parseBoolean(CarbonProperties.getInstance() +.getProperty(CarbonCommonConstants.LOAD_USE_BATCH_SORT, --- End diff -- I guess it would be better to add it as a load option. I will raise a JIRA for it. ---
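The quoted diff reads the batch-sort flag from a property store with a default value. A self-contained sketch of that pattern using plain `java.util.Properties` (the property key and default here are assumptions, not the real `CarbonCommonConstants` values):

```java
import java.util.Properties;

// Sketch of the flag lookup in the diff above: read a boolean property with
// a default. The key name is illustrative, not the real CarbonData constant.
public class LoadOptions {
    static final String LOAD_USE_BATCH_SORT = "carbon.load.use.batch.sort"; // assumed key
    static final String LOAD_USE_BATCH_SORT_DEFAULT = "false";              // assumed default

    public static boolean useBatchSort(Properties props) {
        return Boolean.parseBoolean(
            props.getProperty(LOAD_USE_BATCH_SORT, LOAD_USE_BATCH_SORT_DEFAULT));
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(useBatchSort(props)); // default applies
        props.setProperty(LOAD_USE_BATCH_SORT, "true");
        System.out.println(useBatchSort(props));
    }
}
```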
[jira] [Resolved] (CARBONDATA-770) Filter Query not null data mismatch issue
[ https://issues.apache.org/jira/browse/CARBONDATA-770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ravindra Pesala resolved CARBONDATA-770. Resolution: Fixed Fix Version/s: 1.0.1-incubating > Filter Query not null data mismatch issue > - > > Key: CARBONDATA-770 > URL: https://issues.apache.org/jira/browse/CARBONDATA-770 > Project: CarbonData > Issue Type: Bug >Reporter: kumar vishal >Assignee: kumar vishal > Fix For: 1.0.1-incubating > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Problem: Not null filter query is selecting null values. > Solution: Problem is while parsing the data based on data type we are not > parsing for int, double, float, and long data type, need to add case for the > same -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[GitHub] incubator-carbondata pull request #654: [CARBONDATA-770] Fixed Not null filt...
Github user asfgit closed the pull request at: https://github.com/apache/incubator-carbondata/pull/654 ---
[GitHub] incubator-carbondata issue #654: [CARBONDATA-770] Fixed Not null filter quer...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/654 LGTM ---
[GitHub] incubator-carbondata issue #656: [CARBONDATA-773] Fixed multiple DictionaryS...
Github user ravipesala commented on the issue: https://github.com/apache/incubator-carbondata/pull/656 @kunal642 It is not only about avoiding multiple server instances: we always shut the server down after a load completes, so even if you avoid creating multiple instances you cannot run a second load, because the server has already been shut down. So we need to come up with a proper solution. Solution 1 -> Start the dictionary server once per driver and avoid shutting it down after a load completes, but make sure the dictionary data is flushed to the store after each load of that table finishes. Solution 2 -> Always start a new dictionary server on a new port (first checking port availability). ---
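Solution 1 above amounts to a lazily created, driver-wide singleton that is never shut down between loads, with a flush step at the end of each load. A hedged sketch of that lifecycle (the class and methods are hypothetical, not the real CarbonData `DictionaryServer` API):

```java
// Illustrative sketch of "Solution 1": one dictionary server per driver,
// created lazily with double-checked locking, never shut down between loads;
// each load only flushes dictionary data to the store when it completes.
public class DictionaryServerHolder {
    private static volatile DictionaryServerHolder instance;
    private int flushCount = 0;

    private DictionaryServerHolder() {
        // in a real server this would bind a port and start listening, once
    }

    public static DictionaryServerHolder getInstance() {
        if (instance == null) {
            synchronized (DictionaryServerHolder.class) {
                if (instance == null) {
                    instance = new DictionaryServerHolder();
                }
            }
        }
        return instance;
    }

    // called at the end of each load instead of shutting the server down
    public synchronized void flushDictionary(String tableName) {
        flushCount++;
    }

    public synchronized int getFlushCount() {
        return flushCount;
    }

    public static void main(String[] args) {
        // parallel loads all see the same server instance
        System.out.println(getInstance() == getInstance());
    }
}
```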
[GitHub] incubator-carbondata issue #650: [WIP] add intergation with presto
Github user ffpeng90 commented on the issue: https://github.com/apache/incubator-carbondata/pull/650 I'm focusing on two things: 1. letting users debug presto-carbondata in their IDE; 2. using the new Presto API to support lazy decode. They will be ready soon. At 2017-03-15 10:52:01 he had written: Hi: 1. This version only supports DML. All tables for the tests were created by spark-sql (DML part), and I submit queries to Presto to get the results. I only tested the "Select" case: where, group, sum, join. 2. I use APIs like createQueryPlan and resolveFilter from the class "CarbonInputFormatUtil". To read a carbon-formatted table, I split the read process into several steps: a) load the table metadata; b) get splits from the table (pushing down filtering to filter the data blocks of a segment, @CarbonTableReader.getInputSplits2); c) parse records (pushing down column projection and filtering into the QueryModel, @CarbondataRecordSetProvider.getRecordSet). 3. As described in part c, "parse records", I use QueryModel to get decoded records. For lazy decoding I will keep exploring a better solution; maybe we can get inspiration from the presto-orc and presto-parquet modules. At 2017-03-15 09:11:19, "Jacky Li" wrote: Thanks for working on this. Can you describe what feature is added in terms of: What SQL syntax is supported? DDL? I think it uses CarbonInputFormat to read, so are you pushing down column projection and filtering by setting the configuration in CarbonInputFormat? Is there any SQL optimization integration with Presto's optimizer, like leveraging carbon's global dictionary to do lazy decode? ---
[GitHub] incubator-carbondata pull request #656: [CARBONDATA-773] Fixed multiple Dict...
GitHub user kunal642 opened a pull request: https://github.com/apache/incubator-carbondata/pull/656 [CARBONDATA-773] Fixed multiple DictionaryServer instances issue Fixed the issue where multiple dictionary server instances were being created for parallel load. You can merge this pull request into a Git repository by running: $ git pull https://github.com/kunal642/incubator-carbondata dictionary_fix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/incubator-carbondata/pull/656.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #656 commit 3a29c2f995f35df7d166e992891ad6b2f27a7823 Author: kunal642 Date: 2017-03-15T09:08:31Z fixed multiple dictionary server issue ---
[GitHub] incubator-carbondata issue #654: [CARBONDATA-770] Fixed Not null filter quer...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/654 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1152/ ---
[GitHub] incubator-carbondata pull request #654: [CARBONDATA-770] Fixed Not null filt...
Github user kumarvishal09 commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/654#discussion_r106116102 --- Diff: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java --- @@ -507,12 +507,25 @@ public static String parseValue(String value, CarbonDimension dimension) { switch (dimension.getDataType()) { case DECIMAL: return parseStringToBigDecimal(value, dimension); +case INT: + Integer.parseInt(value); + break; +case DOUBLE: + Double.parseDouble(value); + break; +case LONG: + Long.parseLong(value); + break; +case FLOAT: + Float.parseFloat(value); + break; default: - return value; + // do nothing } } catch (Exception e) { return null; --- End diff -- yes ---
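The fix in the quoted diff validates the string against the column's numeric type and returns null (treated as a bad/null value) when it does not parse, instead of passing the raw value through. A self-contained sketch of that logic, with a simplified stand-in for the dimension's data type:

```java
// Sketch of the parseValue fix above: numeric dimension values are validated
// by attempting to parse them; failures yield null so that NOT NULL filters
// correctly exclude them. The enum here is a simplified stand-in for the
// real CarbonData DataType.
public class NumericValueParser {
    public enum DataType { INT, DOUBLE, LONG, FLOAT, STRING }

    public static String parseValue(String value, DataType type) {
        try {
            switch (type) {
                case INT:    Integer.parseInt(value);   break;
                case DOUBLE: Double.parseDouble(value); break;
                case LONG:   Long.parseLong(value);     break;
                case FLOAT:  Float.parseFloat(value);   break;
                default:     /* non-numeric types pass through */ break;
            }
            return value;
        } catch (NumberFormatException e) {
            return null; // unparseable numeric value becomes null
        }
    }

    public static void main(String[] args) {
        System.out.println(parseValue("12", DataType.INT));
        System.out.println(parseValue("abc", DataType.INT));
    }
}
```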
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106112796 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/CarbonScalaUtil.scala --- @@ -194,4 +196,102 @@ object CarbonScalaUtil { } } } + + /** + * This method will validate a column for its data type and check whether the column data type + * can be modified and update if conditions are met + * + * @param dataTypeInfo + * @param carbonColumn + */ + def validateColumnDataType(dataTypeInfo: DataTypeInfo, carbonColumn: CarbonColumn): Unit = { +carbonColumn.getDataType.getName match { + case "INT" => +if (!dataTypeInfo.dataType.equals("bigint")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn +.getDataType.getName +} cannot be modified. Int can only be changed to bigInt") +} + case "DECIMAL" => +if (!dataTypeInfo.dataType.equals("decimal")) { + sys +.error(s"Given column ${ carbonColumn.getColName } with data type ${ + carbonColumn.getDataType.getName +} cannot be modified. Decimal can be only be changed to Decimal of higher precision") +} +if (dataTypeInfo.precision <= carbonColumn.getColumnSchema.getPrecision) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. Specified precision value ${ + dataTypeInfo +.precision +} should be greater or equal to current precision value ${ + carbonColumn.getColumnSchema +.getPrecision +}") +} else if (dataTypeInfo.scale <= carbonColumn.getColumnSchema.getScale) { + sys +.error(s"Given column ${ + carbonColumn +.getColName +} cannot be modified. 
Specified scale value ${ + dataTypeInfo +.scale +} should be greater or equal to current scale value ${ + carbonColumn.getColumnSchema +.getScale +}") +} else { + // difference of precision and scale specified by user should not be less than the + // difference of already existing precision and scale else it will result in data loss + val carbonColumnPrecisionScaleDiff = carbonColumn.getColumnSchema.getPrecision - + carbonColumn.getColumnSchema.getScale + val dataInfoPrecisionScaleDiff = dataTypeInfo.precision - dataTypeInfo.scale + if (dataInfoPrecisionScaleDiff < carbonColumnPrecisionScaleDiff) { +sys + .error(s"Given column ${ +carbonColumn + .getColName + } cannot be modified. Specified precision and scale values will lead to data loss") + } +} + case _ => +sys + .error(s"Given column ${ carbonColumn.getColName } with data type ${ +carbonColumn + .getDataType.getName + } cannot be modified. Only Int and Decimal data types are allowed for modification") +} + } + + /** + * This method will create a copy of the same object + * + * @param thriftColumnSchema object to be cloned + * @return + */ + def createColumnSchemaCopyObject(thriftColumnSchema: org.apache.carbondata.format.ColumnSchema) --- End diff -- Done --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
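The decimal rule in the diff above boils down to three checks: the new precision and the new scale must each grow, and the number of integer digits (precision minus scale) must not shrink, or existing data could be truncated. A hedged sketch of just that predicate (note the quoted code rejects equal values even though its error message says "greater or equal"; the sketch follows the code's strict form):

```java
// Sketch of the decimal change-datatype rule from the diff: for
// decimal(p, s) -> decimal(p', s'), require p' > p, s' > s, and
// (p' - s') >= (p - s) so the integer-digit capacity never shrinks.
public class DecimalChangeValidator {
    public static boolean canModify(int oldPrecision, int oldScale,
                                    int newPrecision, int newScale) {
        if (newPrecision <= oldPrecision || newScale <= oldScale) {
            return false; // precision and scale must both increase
        }
        // integer digits must not decrease, else data loss
        return (newPrecision - newScale) >= (oldPrecision - oldScale);
    }

    public static void main(String[] args) {
        // decimal(10,2) -> decimal(12,4): 8 integer digits preserved
        System.out.println(canModify(10, 2, 12, 4));
        // decimal(10,2) -> decimal(12,5): only 7 integer digits left
        System.out.println(canModify(10, 2, 12, 5));
    }
}
```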
[GitHub] incubator-carbondata pull request #641: [CARBONDATA-767] Alter table support...
Github user nareshpr commented on a diff in the pull request: https://github.com/apache/incubator-carbondata/pull/641#discussion_r106112761 --- Diff: integration/spark-common/src/main/scala/org/apache/carbondata/spark/util/DataTypeConverterUtil.scala --- @@ -77,4 +77,40 @@ object DataTypeConverterUtil { case DataType.STRUCT => "struct" } } + + /** + * convert from wrapper to external data type + * + * @param dataType + * @return + */ + def convertToThriftDataType(dataType: String): org.apache.carbondata.format.DataType = { --- End diff -- There is no direct string-to-thrift data type conversion. When creating a table, we first convert the string to the wrapper data type, and from the wrapper data type we convert to the thrift data type. This method is required for alter table, as we are converting the string directly to the thrift type in the alter table change datatype flow.
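The two conversion paths described in the comment above can be sketched as follows. This is a hedged illustration only: the trait and function names are stand-ins, not CarbonData's real wrapper or thrift types.

```scala
// Hypothetical sketch of the two paths described above: create table goes
// string -> wrapper -> thrift, while alter table maps string -> thrift directly.
// All type and function names here are illustrative, not CarbonData's real API.
sealed trait WrapperType
case object WrapperInt extends WrapperType
case object WrapperBigInt extends WrapperType
case object WrapperDecimal extends WrapperType

sealed trait ThriftType
case object ThriftInt extends ThriftType
case object ThriftLong extends ThriftType
case object ThriftDecimal extends ThriftType

// Create-table path, step 1: parse the user-supplied string into the wrapper type.
def convertToWrapperType(dataType: String): WrapperType = dataType.toLowerCase match {
  case "int"     => WrapperInt
  case "bigint"  => WrapperBigInt
  case "decimal" => WrapperDecimal
  case other     => sys.error(s"Unsupported data type: $other")
}

// Create-table path, step 2: map the wrapper type onto the thrift type.
def wrapperToThrift(wrapper: WrapperType): ThriftType = wrapper match {
  case WrapperInt     => ThriftInt
  case WrapperBigInt  => ThriftLong
  case WrapperDecimal => ThriftDecimal
}

// Alter-table path: a single direct string -> thrift mapping, skipping the
// wrapper step, as the new convertToThriftDataType method does.
def convertToThriftType(dataType: String): ThriftType = dataType.toLowerCase match {
  case "int"     => ThriftInt
  case "bigint"  => ThriftLong
  case "decimal" => ThriftDecimal
  case other     => sys.error(s"Unsupported data type: $other")
}
```

The key design point is that both paths must agree: for every supported string, the direct mapping must produce the same thrift type as composing the two create-table steps.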
[jira] [Assigned] (CARBONDATA-773) During parallel load multiple instances of DictionaryServer are being created.
[ https://issues.apache.org/jira/browse/CARBONDATA-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor reassigned CARBONDATA-773: Assignee: Kunal Kapoor

Key: CARBONDATA-773 URL: https://issues.apache.org/jira/browse/CARBONDATA-773 Project: CarbonData Issue Type: Improvement Reporter: Kunal Kapoor Assignee: Kunal Kapoor Priority: Minor

During parallel load multiple instances of DictionaryServer are being created. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (CARBONDATA-773) During parallel load multiple instances of DictionaryServer are being created.
Kunal Kapoor created CARBONDATA-773: --- Summary: During parallel load multiple instances of DictionaryServer are being created. Key: CARBONDATA-773 URL: https://issues.apache.org/jira/browse/CARBONDATA-773 Project: CarbonData Issue Type: Improvement Reporter: Kunal Kapoor Priority: Minor
[GitHub] incubator-carbondata issue #630: [CARBONDATA-730] added decimal type in carb...
Github user anubhav100 commented on the issue: https://github.com/apache/incubator-carbondata/pull/630 @jackylk All changes are done, can you please review?
[GitHub] incubator-carbondata issue #630: [CARBONDATA-730] added decimal type in carb...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/630 Build Success with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1151/
[GitHub] incubator-carbondata issue #630: [CARBONDATA-730] added decimal type in carb...
Github user CarbonDataQA commented on the issue: https://github.com/apache/incubator-carbondata/pull/630 Build Failed with Spark 1.6.2, Please check CI http://136.243.101.176:8080/job/ApacheCarbonPRBuilder/1150/