[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader
ajantha-bhat commented on a change in pull request #3770: URL: https://github.com/apache/carbondata/pull/3770#discussion_r433650411

## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java

## @@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.cache.CarbonLRUCache;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.sdk.file.cache.BlockletRows;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+
+/**
+ * CarbonData SDK reader with pagination support
+ */
+@InterfaceAudience.User
+@InterfaceStability.Evolving
+public class PaginationCarbonReader extends CarbonReader {
+  // Splits based on the files present in the reader path when the reader is built.
+  private List<InputSplit> allBlockletSplits;
+
+  // Cumulative row counts up to each split, stored as a list.
+  private List<Long> rowCountInSplits;
+
+  // Reader builder used to create the pagination reader; used for building split-level readers.
+  private CarbonReaderBuilder readerBuilder;
+
+  private boolean isClosed;
+
+  // Stores the rows of each blocklet in a memory-based LRU cache.
+  // key: unique blocklet id
+  // value: BlockletRows
+  private CarbonLRUCache cache =
+      new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB,
+          CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB_DEFAULT);
+
+  /**
+   * Call {@link #builder(String)} to construct an instance
+   */
+  PaginationCarbonReader(List<InputSplit> splits, CarbonReaderBuilder readerBuilder) {
+    // Initialize the super class with no readers.
+    // Based on the splits identified for the pagination query, readers will be built per query.
+    super(null);
+    this.allBlockletSplits = splits;
+    this.readerBuilder = readerBuilder;
+    // prepare the mapping.
+    rowCountInSplits = new ArrayList<>(splits.size());
+    long sum = ((CarbonInputSplit) splits.get(0)).getDetailInfo().getRowCount();
+    rowCountInSplits.add(sum);
+    for (int i = 1; i < splits.size(); i++) {
+      // Prepare a summation (prefix-sum) array of the row counts in each blocklet;
+      // this is used for pruning with pagination values.
+      // At the current index, it contains the sum of rows of all previous blocklets plus the current one.
+      sum += ((CarbonInputSplit) splits.get(i)).getDetailInfo().getRowCount();
+      rowCountInSplits.add(sum);
+    }
+  }
+
+  /**
+   * Pagination query with from and to range.
+   *
+   * @param from must be greater than 0 and <= to
+   * @param to must be >= from and not outside the total rows
+   * @return array of rows between from and to (inclusive)
+   * @throws Exception
+   */
+  public Object[] read(long from, long to) throws IOException, InterruptedException {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. Please build again.");
+    }
+    if (from < 1) {
+      throw new IllegalArgumentException("from row id:" + from + " is less than 1");
+    }
+    if (from > to) {
+      throw new IllegalArgumentException(
+          "from row id:" + from + " is greater than to row id:" + to);
+    }
+    if (to > getTotalRows()) {
+      throw new IllegalArgumentException(
+          "to row id:" + to + " is greater than total rows:" + getTotalRows());
+    }
+    return getRows(from, to);
+  }
+
+  /**
+   * Get total rows in the folder.
+   * It is based on the snapshot of files taken while building the reader.
+   *
+   * @return total rows from all the files in the reader.
+   */
+  public long getTotalRows() {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. Please build again.");
+    }
+    return rowCountInSplits.get(rowCountInSplits.size() - 1);
+  }
+
+  /**
+   * This interface
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader
ajantha-bhat commented on a change in pull request #3770: URL: https://github.com/apache/carbondata/pull/3770#discussion_r433650373

## File path: python/pycarbon/tests/sdk/test_read_write_carbon.py

## @@ -25,7 +26,8 @@
 import os
 import jnius_config

-jnius_config.set_classpath("../../../sdk/sdk/target/carbondata-sdk.jar")
+jnius_config.set_classpath("../../../../sdk/sdk/target/carbondata-sdk.jar")
+# jnius_config.add_options('-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=')

Review comment: done

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader
ajantha-bhat commented on a change in pull request #3770: URL: https://github.com/apache/carbondata/pull/3770#discussion_r433644354

## File path: python/pycarbon/sdk/PaginationCarbonReader.py

## @@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+class PaginationCarbonReader(object):
+  def __init__(self):
+    from jnius import autoclass
+    self.readerClass = autoclass('org.apache.carbondata.sdk.file.PaginationCarbonReader')
+
+  def builder(self, path, table_name):
+    self.PaginationCarbonReaderBuilder = self.readerClass.builder(path, table_name)
+    return self
+
+  def projection(self, projection_list):
+    self.PaginationCarbonReaderBuilder.projection(projection_list)
+    return self
+
+  def withHadoopConf(self, key, value):

Review comment: That temporary AK/SK support is a separate requirement. Once a PR is raised for that, the Python code can go in the same PR.
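The wrapper above follows a fluent builder pattern: each method records state on the underlying Java builder and returns self so calls chain. A rough sketch of that pattern with a plain-Python stub standing in for the jnius-backed class (the stub class and its attribute names are hypothetical; the real wrapper needs a JVM and carbondata-sdk.jar on the classpath):

```python
class _StubJavaBuilder:
    # Hypothetical stand-in for the Java-side builder reached via jnius autoclass.
    def __init__(self, path, table_name):
        self.path = path
        self.table_name = table_name
        self.projection_list = None

    def projection(self, projection_list):
        self.projection_list = projection_list
        return self


class PaginationReaderSketch(object):
    def builder(self, path, table_name):
        # Mirrors the wrapper's builder(): keep the builder object, return self.
        self._builder = _StubJavaBuilder(path, table_name)
        return self

    def projection(self, projection_list):
        # Delegate to the builder and return self so calls can be chained.
        self._builder.projection(projection_list)
        return self
```

Usage mirrors the wrapper: `PaginationReaderSketch().builder("/tmp/table", "t1").projection(["name", "age"])`.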
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader
ajantha-bhat commented on a change in pull request #3770: URL: https://github.com/apache/carbondata/pull/3770#discussion_r433644058

## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java (diff context identical to the first comment on this file)
[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader
ajantha-bhat commented on a change in pull request #3770: URL: https://github.com/apache/carbondata/pull/3770#discussion_r433643240

## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java (diff context identical to the first comment on this file)
[GitHub] [carbondata] QiangCai commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA
QiangCai commented on pull request #3782: URL: https://github.com/apache/carbondata/pull/3782#issuecomment-637256162

@ajantha-bhat IntelliJ IDEA has this function in the menu Analyze > Inspect Code; it outputs a report of issues that can be improved. I hope to check all the suggestions in the report one by one and fix them.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.
CarbonDataQA1 commented on pull request #3776: URL: https://github.com/apache/carbondata/pull/3776#issuecomment-637158658

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1396/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.
CarbonDataQA1 commented on pull request #3776: URL: https://github.com/apache/carbondata/pull/3776#issuecomment-637158541

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3120/
[GitHub] [carbondata] ajantha-bhat commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA
ajantha-bhat commented on pull request #3782: URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636989906

@QiangCai: you are going through it file by file and fixing each one? Big work!
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command
CarbonDataQA1 commented on pull request #3784: URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636923342

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1395/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command
CarbonDataQA1 commented on pull request #3784: URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636923009

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3119/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command
CarbonDataQA1 commented on pull request #3784: URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636798214

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3118/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command
CarbonDataQA1 commented on pull request #3784: URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636796556

Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1394/
[jira] [Updated] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object
[ https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor updated CARBONDATA-3839:
-------------------------------------
    Issue Type: Improvement  (was: New Feature)

> Rename file fails in hdfs for FilterFileSystem Object
> -----------------------------------------------------
>
>                 Key: CARBONDATA-3839
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
>             Project: CarbonData
>          Issue Type: Improvement
>            Reporter: Akash R Nilugal
>            Assignee: Akash R Nilugal
>            Priority: Major
>             Fix For: 2.0.1
>
>          Time Spent: 1h
>  Remaining Estimate: 0h
>
> Rename file fails for FilterFileSystem Object

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (CARBONDATA-3816) Support Float and Decimal in the Merge Flow
[ https://issues.apache.org/jira/browse/CARBONDATA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kunal Kapoor updated CARBONDATA-3816:
-------------------------------------
    Fix Version/s:     (was: 2.0.1)
                   2.1.0

> Support Float and Decimal in the Merge Flow
> -------------------------------------------
>
>                 Key: CARBONDATA-3816
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3816
>             Project: CarbonData
>          Issue Type: New Feature
>          Components: data-load
>    Affects Versions: 2.0.0
>            Reporter: Xingjun Hao
>            Priority: Major
>             Fix For: 2.1.0
>
> We don't support FLOAT and DECIMAL datatype in the CDC Flow.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental
CarbonDataQA1 commented on pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636766357

Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3117/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental
CarbonDataQA1 commented on pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636766737

Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1393/
[GitHub] [carbondata] kevinjmh opened a new pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command
kevinjmh opened a new pull request #3784: URL: https://github.com/apache/carbondata/pull/3784

### Why is this PR needed?
1. LoadDataCommand has duplicated info about the user input. Since the SQL plan is printed in the driver log and shown on the Spark UI, we want to make it pretty.
2. Some variables/files are not in use.

### What changes were proposed in this PR?
Remove the duplicated info and the unused variables/files.

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes
[jira] [Created] (CARBONDATA-3841) Remove useless string in create and alter command
Manhua Jiang created CARBONDATA-3841:
----------------------------------------
             Summary: Remove useless string in create and alter command
                 Key: CARBONDATA-3841
                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3841
             Project: CarbonData
          Issue Type: Improvement
            Reporter: Manhua Jiang
[GitHub] [carbondata] asfgit closed pull request #3783: [CARBONDATA-3840] Mark features as experimental
asfgit closed pull request #3783: URL: https://github.com/apache/carbondata/pull/3783
[GitHub] [carbondata] jackylk commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental
jackylk commented on pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636695185

LGTM
[GitHub] [carbondata] kunal642 commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental
kunal642 commented on a change in pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#discussion_r433098465

## File path: README.md

## @@ -69,6 +69,14 @@
 * [Carbon as Spark's Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
 * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md)
+## Experimental Features
+
+Some features are marked as experimental because the syntax/implementation might change in the future.
+1. Hybrid format table using Add Segment.
+2. Accelerating performance using MV on parquet/orc.
+3. CDC and SCD.

Review comment: done
[GitHub] [carbondata] jackylk commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental
jackylk commented on a change in pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#discussion_r433095610

## File path: README.md

## @@ -69,6 +69,14 @@
 * [Carbon as Spark's Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
 * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md)
+## Experimental Features
+
+Some features are marked as experimental because the syntax/implementation might change in the future.
+1. Hybrid format table using Add Segment.
+2. Accelerating performance using MV on parquet/orc.
+3. CDC and SCD.

Review comment: Mention only the MERGE API for Spark DataFrame. UPDATE and DELETE are production features, and there is no need to mention CDC and SCD.
[GitHub] [carbondata] jackylk commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental
jackylk commented on a change in pull request #3783: URL: https://github.com/apache/carbondata/pull/3783#discussion_r433095610

## File path: README.md (diff context identical to the previous comment)

Review comment: mention only for MERGE API for Spark DataFrame UPDATE and DELETE is production feature
[GitHub] [carbondata] kunal642 opened a new pull request #3783: [CARBONDATA-3840] Mark features as experimental
kunal642 opened a new pull request #3783: URL: https://github.com/apache/carbondata/pull/3783

### Why is this PR needed?
Mark features as experimental because they are subject to change in future.

### What changes were proposed in this PR?
Mark features as experimental because they are subject to change in future.

### Does this PR introduce any user interface change?
- No

### Is any new testcase added?
- No
[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3773: [CARBONDATA-3830]Presto complex columns read support
Indhumathi27 commented on a change in pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#discussion_r433080176

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveFloatingCodec.java
## @@ -247,6 +247,25 @@ public void decodeAndFillVector(byte[] pageData, ColumnVectorInfo vectorInfo, Bi
DataType vectorDataType = vector.getType(); vector = ColumnarVectorWrapperDirectFactory .getDirectVectorWrapperFactory(vector, null, nullBits, deletedRows, true, false);
+ CarbonColumnVector vectorColumn = null, childrenVectorInfo = null;
+ vectorColumn = vectorInfo.vector.getColumnVector();
+ if (vectorColumn != null) childrenVectorInfo = vectorColumn.getChildrenVector();

Review comment: Enclose if statement in braces

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveIntegralCodec.java
## @@ -308,15 +308,49 @@ public void decodeAndFillVector(byte[] pageData, ColumnVectorInfo vectorInfo, Bi
private void fillVector(byte[] pageData, CarbonColumnVector vector, DataType vectorDataType, DataType pageDataType, int pageSize, ColumnVectorInfo vectorInfo, BitSet nullBits) { int rowId = 0;
+ CarbonColumnVector vectorColumn = null, childrenVectorInfo = null;
+ vectorColumn = vectorInfo.vector.getColumnVector();
+ if (vectorColumn != null) childrenVectorInfo = vectorColumn.getChildrenVector();

Review comment: Same code is added in AdaptiveFloatingCodec. Extract common code to a method and reuse

## File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/readers/ArrayStreamReader.java
## @@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import io.prestosql.spi.type.*;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.block.BlockBuilder;
+
+/**
+ * Class to read the Array Stream
+ */
+
+public class ArrayStreamReader extends CarbonColumnVectorImpl implements PrestoVectorBlockBuilder {
+
+ protected int batchSize;
+ protected Type type;
+ Block childBlock = null;
+ protected BlockBuilder builder;
+
+ public ArrayStreamReader(int batchSize, DataType dataType) {
+super(batchSize, dataType);
+this.batchSize = batchSize;
+if(dataType == DataTypes.STRING)

Review comment: enclose statements in braces

## File path: core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
## @@ -508,7 +508,7 @@ private BlockExecutionInfo getBlockExecutionInfoForBlock(QueryModel queryModel, int[] dimensionChunkIndexes = QueryUtil.getDimensionChunkIndexes(projectDimensions, segmentProperties.getDimensionOrdinalToChunkMapping(), currentBlockFilterDimensions, allProjectionListDimensionIdexes);
-ReusableDataBuffer[] dimensionBuffer = new ReusableDataBuffer[projectDimensions.size()];
+ReusableDataBuffer[] dimensionBuffer = new ReusableDataBuffer[projectDimensions.size() * 2];

Review comment: why this change?

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveFloatingCodec.java
## @@ -51,8 +51,8 @@ */ public class AdaptiveFloatingCodec extends AdaptiveCodec {
- private double factor;
 private float floatFactor;
+ private double factor;

Review comment: Please revert this change

## File path: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DictionaryBasedVectorResultCollector.java
## @@ -98,6 +98,14 @@ void prepareDimensionAndMeasureColumnVectors() { columnVectorInfo.dimension = queryDimensions[i]; columnVectorInfo.ordinal = queryDimensions[i].getDimension().getOrdinal(); allColumnInfo[queryDimensions[i].getOrdinal()] = columnVectorInfo;
+ } else if (queryDimensions
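The two refactors the reviewer asks for (brace single-statement ifs, and hoist the child-vector lookup duplicated across AdaptiveFloatingCodec and AdaptiveIntegralCodec into one shared helper) could look roughly like this generic sketch. The `Vector` interface and `childOrNull` helper below are illustrative stand-ins, not CarbonData's actual types:

```java
public class BracedIfExample {
  // Illustrative stand-in for a column vector that may expose a child vector.
  interface Vector {
    Vector getChild();
  }

  // One shared helper replaces the copy-pasted lookup in both codecs;
  // the null check is braced even though the body is a single statement.
  static Vector childOrNull(Vector parent) {
    if (parent != null) {
      return parent.getChild();
    }
    return null;
  }

  public static void main(String[] args) {
    Vector leaf = () -> null;   // leaf has no child
    Vector root = () -> leaf;   // root's child is leaf
    System.out.println(childOrNull(root) == leaf);   // true
    System.out.println(childOrNull(null) == null);   // true
  }
}
```

Both call sites would then reduce to a one-line call, so a later fix (such as the braces) lands in exactly one place.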
[jira] [Created] (CARBONDATA-3840) Mark some features as experimental.
Kunal Kapoor created CARBONDATA-3840: Summary: Mark some features as experimental. Key: CARBONDATA-3840 URL: https://issues.apache.org/jira/browse/CARBONDATA-3840 Project: CarbonData Issue Type: Task Affects Versions: 2.0.1 Reporter: Kunal Kapoor
Experimental features:
1. Hybrid format table using Add Segment.
2. Accelerating performance using MV on parquet/orc
3. cdc/scd merge scenario
4. Hive write for non-transactional table.
-- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (CARBONDATA-3837) Should fallback to the original plan when MV rewrite throw exception
[ https://issues.apache.org/jira/browse/CARBONDATA-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Cai reassigned CARBONDATA-3837:
Assignee: David Cai
> Should fallback to the original plan when MV rewrite throw exception
> Key: CARBONDATA-3837
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3837
> Project: CarbonData
> Issue Type: Improvement
> Reporter: David Cai
> Assignee: David Cai
> Priority: Major
> Fix For: 2.0.1
> Time Spent: 2.5h
> Remaining Estimate: 0h
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA
CarbonDataQA1 commented on pull request #3782: URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636662968 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3115/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA
CarbonDataQA1 commented on pull request #3782: URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636662453 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1391/
[GitHub] [carbondata] QiangCai opened a new pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA
QiangCai opened a new pull request #3782: URL: https://github.com/apache/carbondata/pull/3782
### Why is this PR needed?
1. pointless bitwise expression
2. field can be local
3. standard Charset object can be used
4. unnecessary conversion to String
5. unnecessary interface modifier
6. unnecessary semicolon
7. duplicate condition in 'if' statement
8. 'if' statement with common parts
9. redundant 'if' statement
10. unnecessary 'null' check before method call
11. redundant local variable
12. unused import
### What changes were proposed in this PR?
### Does this PR introduce any user interface change? - No
### Is any new testcase added? - No
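A few of the inspections the PR lists can be illustrated with a tiny before/after sketch. The method names below are made up for illustration and are not code from this PR:

```java
import java.nio.charset.StandardCharsets;

public class InspectionExamples {

  // "Redundant local variable" + "unnecessary conversion to String":
  // before: String s = "" + value; return s;
  static String nameOf(Object value) {
    return String.valueOf(value);
  }

  // "standard Charset object can be used":
  // before: s.getBytes("UTF-8") — which also forces a checked
  // UnsupportedEncodingException on every caller.
  static byte[] utf8(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // "Redundant 'if' statement":
  // before: if (a > b) { return true; } else { return false; }
  static boolean greater(int a, int b) {
    return a > b;
  }

  public static void main(String[] args) {
    System.out.println(nameOf(42));          // 42
    System.out.println(utf8("ab").length);   // 2
    System.out.println(greater(3, 2));       // true
  }
}
```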
[jira] [Resolved] (CARBONDATA-3836) Fix carbon store path & avoid exception when creating new carbon table
[ https://issues.apache.org/jira/browse/CARBONDATA-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3836.
Fix Version/s: 2.0.1
Resolution: Fixed
> Fix carbon store path & avoid exception when creating new carbon table
> Key: CARBONDATA-3836
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3836
> Project: CarbonData
> Issue Type: Improvement
> Reporter: Manhua Jiang
> Priority: Major
> Fix For: 2.0.1
> Time Spent: 2h
> Remaining Estimate: 0h
[jira] [Resolved] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object
[ https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3839.
Fix Version/s: 2.0.1, 2.1.0
Resolution: Fixed
> Rename file fails in hdfs for FilterFileSystem Object
> Key: CARBONDATA-3839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Akash R Nilugal
> Assignee: Akash R Nilugal
> Priority: Major
> Fix For: 2.1.0, 2.0.1
> Time Spent: 1h
> Remaining Estimate: 0h
> Rename file fails for FilterFileSystem Object
[jira] [Updated] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object
[ https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor updated CARBONDATA-3839:
Fix Version/s: (was: 2.1.0)
> Rename file fails in hdfs for FilterFileSystem Object
> Key: CARBONDATA-3839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
> Project: CarbonData
> Issue Type: New Feature
> Reporter: Akash R Nilugal
> Assignee: Akash R Nilugal
> Priority: Major
> Fix For: 2.0.1
> Time Spent: 1h
> Remaining Estimate: 0h
> Rename file fails for FilterFileSystem Object