[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader

2020-06-01 Thread GitBox


ajantha-bhat commented on a change in pull request #3770:
URL: https://github.com/apache/carbondata/pull/3770#discussion_r433650411



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java
##
@@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.cache.CarbonLRUCache;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.sdk.file.cache.BlockletRows;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+
+/**
+ * CarbonData SDK reader with pagination support
+ */
+@InterfaceAudience.User
+@InterfaceStability.Evolving
+public class PaginationCarbonReader extends CarbonReader {
+  // Splits based on the files present in the reader path when the reader is built.
+  private List<InputSplit> allBlockletSplits;
+
+  // Row count up to (and including) each split, stored as a list.
+  private List<Long> rowCountInSplits;
+
+  // Reader builder used to create the pagination reader, used for building split level readers.
+  private CarbonReaderBuilder readerBuilder;
+
+  private boolean isClosed;
+
+  // to store the rows of each blocklet in a memory-based LRU cache.
+  // key: unique blocklet id
+  // value: BlockletRows
+  private CarbonLRUCache cache =
+      new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB,
+          CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB_DEFAULT);
+
+  /**
+   * Call {@link #builder(String)} to construct an instance
+   */
+
+  PaginationCarbonReader(List<InputSplit> splits, CarbonReaderBuilder readerBuilder) {
+    // Initialize super class with no readers.
+    // Based on the splits identified for the pagination query, readers will be built for the query.
+    super(null);
+    this.allBlockletSplits = splits;
+    this.readerBuilder = readerBuilder;
+    // prepare the mapping.
+    rowCountInSplits = new ArrayList<>(splits.size());
+    long sum = ((CarbonInputSplit) splits.get(0)).getDetailInfo().getRowCount();
+    rowCountInSplits.add(sum);
+    for (int i = 1; i < splits.size(); i++) {
+      // prepare a summation array of row counts in each blocklet;
+      // this is used for pruning with pagination values.
+      // At index i, it contains the sum of rows of all previous blocklets plus the current one.
+      sum += ((CarbonInputSplit) splits.get(i)).getDetailInfo().getRowCount();
+      rowCountInSplits.add(sum);
+    }
+  }
+
+  /**
+   * Pagination query with from and to range.
+   *
+   * @param from must be greater than 0 and <= to
+   * @param to must be >= from and not outside the total rows
+   * @return array of rows between from and to (inclusive)
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public Object[] read(long from, long to) throws IOException, InterruptedException {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    if (from < 1) {
+      throw new IllegalArgumentException("from row id:" + from + " is less than 1");
+    }
+    if (from > to) {
+      throw new IllegalArgumentException(
+          "from row id:" + from + " is greater than to row id:" + to);
+    }
+    if (to > getTotalRows()) {
+      throw new IllegalArgumentException(
+          "to row id:" + to + " is greater than total rows:" + getTotalRows());
+    }
+    return getRows(from, to);
+  }
+
+  /**
+   * Get total rows in the folder.
+   * It is based on the snapshot of files taken while building the reader.
+   *
+   * @return total rows from all the files in the reader.
+   */
+  public long getTotalRows() {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    return rowCountInSplits.get(rowCountInSplits.size() - 1);
+  }
+
+  /**
+   * This interface
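
A minimal usage sketch of the API quoted above (illustrative only; the PaginationCarbonReader instance is assumed to come from the SDK builder changes in this PR, which are not quoted here). Row ids passed to read() are 1-based and both ends are inclusive:

    // Sketch: page through all rows in fixed-size chunks via read(from, to).
    // 'reader' is assumed to be built through the SDK builder proposed in this PR.
    static void readInPages(PaginationCarbonReader reader, long pageSize)
        throws IOException, InterruptedException {
      long total = reader.getTotalRows();
      for (long from = 1; from <= total; from += pageSize) {
        // clamp the last page so 'to' never exceeds the total row count
        long to = Math.min(from + pageSize - 1, total);
        Object[] rows = reader.read(from, to);
        // consume the page of rows here
      }
    }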

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader

2020-06-01 Thread GitBox


ajantha-bhat commented on a change in pull request #3770:
URL: https://github.com/apache/carbondata/pull/3770#discussion_r433650373



##
File path: python/pycarbon/tests/sdk/test_read_write_carbon.py
##
@@ -25,7 +26,8 @@
 import os
 import jnius_config
 
-jnius_config.set_classpath("../../../sdk/sdk/target/carbondata-sdk.jar")
+jnius_config.set_classpath("../../../../sdk/sdk/target/carbondata-sdk.jar")
+# jnius_config.add_options('-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=')

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader

2020-06-01 Thread GitBox


ajantha-bhat commented on a change in pull request #3770:
URL: https://github.com/apache/carbondata/pull/3770#discussion_r433644354



##
File path: python/pycarbon/sdk/PaginationCarbonReader.py
##
@@ -0,0 +1,57 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+class PaginationCarbonReader(object):
+  def __init__(self):
+    from jnius import autoclass
+    self.readerClass = autoclass('org.apache.carbondata.sdk.file.PaginationCarbonReader')
+
+  def builder(self, path, table_name):
+    self.PaginationCarbonReaderBuilder = self.readerClass.builder(path, table_name)
+    return self
+
+  def projection(self, projection_list):
+    self.PaginationCarbonReaderBuilder.projection(projection_list)
+    return self
+
+  def withHadoopConf(self, key, value):

Review comment:
   That temporary AK/SK support is a separate requirement. Once a PR is raised for that, the Python code can be added in the same PR.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader

2020-06-01 Thread GitBox


ajantha-bhat commented on a change in pull request #3770:
URL: https://github.com/apache/carbondata/pull/3770#discussion_r433644058



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java
##
@@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.cache.CarbonLRUCache;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.sdk.file.cache.BlockletRows;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+
+/**
+ * CarbonData SDK reader with pagination support
+ */
+@InterfaceAudience.User
+@InterfaceStability.Evolving
+public class PaginationCarbonReader extends CarbonReader {
+  // Splits based on the files present in the reader path when the reader is built.
+  private List<InputSplit> allBlockletSplits;
+
+  // Row count up to (and including) each split, stored as a list.
+  private List<Long> rowCountInSplits;
+
+  // Reader builder used to create the pagination reader, used for building split level readers.
+  private CarbonReaderBuilder readerBuilder;
+
+  private boolean isClosed;
+
+  // to store the rows of each blocklet in a memory-based LRU cache.
+  // key: unique blocklet id
+  // value: BlockletRows
+  private CarbonLRUCache cache =
+      new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB,
+          CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB_DEFAULT);
+
+  /**
+   * Call {@link #builder(String)} to construct an instance
+   */
+
+  PaginationCarbonReader(List<InputSplit> splits, CarbonReaderBuilder readerBuilder) {
+    // Initialize super class with no readers.
+    // Based on the splits identified for the pagination query, readers will be built for the query.
+    super(null);
+    this.allBlockletSplits = splits;
+    this.readerBuilder = readerBuilder;
+    // prepare the mapping.
+    rowCountInSplits = new ArrayList<>(splits.size());
+    long sum = ((CarbonInputSplit) splits.get(0)).getDetailInfo().getRowCount();
+    rowCountInSplits.add(sum);
+    for (int i = 1; i < splits.size(); i++) {
+      // prepare a summation array of row counts in each blocklet;
+      // this is used for pruning with pagination values.
+      // At index i, it contains the sum of rows of all previous blocklets plus the current one.
+      sum += ((CarbonInputSplit) splits.get(i)).getDetailInfo().getRowCount();
+      rowCountInSplits.add(sum);
+    }
+  }
+
+  /**
+   * Pagination query with from and to range.
+   *
+   * @param from must be greater than 0 and <= to
+   * @param to must be >= from and not outside the total rows
+   * @return array of rows between from and to (inclusive)
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public Object[] read(long from, long to) throws IOException, InterruptedException {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    if (from < 1) {
+      throw new IllegalArgumentException("from row id:" + from + " is less than 1");
+    }
+    if (from > to) {
+      throw new IllegalArgumentException(
+          "from row id:" + from + " is greater than to row id:" + to);
+    }
+    if (to > getTotalRows()) {
+      throw new IllegalArgumentException(
+          "to row id:" + to + " is greater than total rows:" + getTotalRows());
+    }
+    return getRows(from, to);
+  }
+
+  /**
+   * Get total rows in the folder.
+   * It is based on the snapshot of files taken while building the reader.
+   *
+   * @return total rows from all the files in the reader.
+   */
+  public long getTotalRows() {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    return rowCountInSplits.get(rowCountInSplits.size() - 1);
+  }
+
+  /**
+   * This interface
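
The constructor quoted above builds a running total of blocklet row counts. The sketch below shows how such an array lets a requested row id be mapped back to a split index with a binary search; it is an assumption about how the array can be used, not the getRows implementation from this PR (which is not quoted here):

    // Sketch: locate the split containing a given 1-based row id using the
    // cumulative row counts prepared in the constructor. rowCountInSplits.get(i)
    // holds the total number of rows in splits 0..i.
    static int findSplitIndex(List<Long> rowCountInSplits, long rowId) {
      int index = Collections.binarySearch(rowCountInSplits, rowId);
      // When the exact value is absent, binarySearch returns (-(insertionPoint) - 1);
      // the insertion point is the first split whose cumulative count exceeds rowId.
      return index >= 0 ? index : -(index + 1);
    }
    // Splits covering a range [from, to] are then findSplitIndex(from) .. findSplitIndex(to).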

[GitHub] [carbondata] ajantha-bhat commented on a change in pull request #3770: [CARBONDATA-3829] Support pagination in SDK reader

2020-06-01 Thread GitBox


ajantha-bhat commented on a change in pull request #3770:
URL: https://github.com/apache/carbondata/pull/3770#discussion_r433643240



##
File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/PaginationCarbonReader.java
##
@@ -0,0 +1,276 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.sdk.file;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.annotations.InterfaceStability;
+import org.apache.carbondata.core.cache.CarbonLRUCache;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.indexstore.BlockletDetailInfo;
+import org.apache.carbondata.hadoop.CarbonInputSplit;
+import org.apache.carbondata.sdk.file.cache.BlockletRows;
+
+import org.apache.hadoop.mapreduce.InputSplit;
+
+/**
+ * CarbonData SDK reader with pagination support
+ */
+@InterfaceAudience.User
+@InterfaceStability.Evolving
+public class PaginationCarbonReader extends CarbonReader {
+  // Splits based on the files present in the reader path when the reader is built.
+  private List<InputSplit> allBlockletSplits;
+
+  // Row count up to (and including) each split, stored as a list.
+  private List<Long> rowCountInSplits;
+
+  // Reader builder used to create the pagination reader, used for building split level readers.
+  private CarbonReaderBuilder readerBuilder;
+
+  private boolean isClosed;
+
+  // to store the rows of each blocklet in a memory-based LRU cache.
+  // key: unique blocklet id
+  // value: BlockletRows
+  private CarbonLRUCache cache =
+      new CarbonLRUCache(CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB,
+          CarbonCommonConstants.CARBON_MAX_PAGINATION_LRU_CACHE_SIZE_IN_MB_DEFAULT);
+
+  /**
+   * Call {@link #builder(String)} to construct an instance
+   */
+
+  PaginationCarbonReader(List<InputSplit> splits, CarbonReaderBuilder readerBuilder) {
+    // Initialize super class with no readers.
+    // Based on the splits identified for the pagination query, readers will be built for the query.
+    super(null);
+    this.allBlockletSplits = splits;
+    this.readerBuilder = readerBuilder;
+    // prepare the mapping.
+    rowCountInSplits = new ArrayList<>(splits.size());
+    long sum = ((CarbonInputSplit) splits.get(0)).getDetailInfo().getRowCount();
+    rowCountInSplits.add(sum);
+    for (int i = 1; i < splits.size(); i++) {
+      // prepare a summation array of row counts in each blocklet;
+      // this is used for pruning with pagination values.
+      // At index i, it contains the sum of rows of all previous blocklets plus the current one.
+      sum += ((CarbonInputSplit) splits.get(i)).getDetailInfo().getRowCount();
+      rowCountInSplits.add(sum);
+    }
+  }
+
+  /**
+   * Pagination query with from and to range.
+   *
+   * @param from must be greater than 0 and <= to
+   * @param to must be >= from and not outside the total rows
+   * @return array of rows between from and to (inclusive)
+   * @throws IOException
+   * @throws InterruptedException
+   */
+  public Object[] read(long from, long to) throws IOException, InterruptedException {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    if (from < 1) {
+      throw new IllegalArgumentException("from row id:" + from + " is less than 1");
+    }
+    if (from > to) {
+      throw new IllegalArgumentException(
+          "from row id:" + from + " is greater than to row id:" + to);
+    }
+    if (to > getTotalRows()) {
+      throw new IllegalArgumentException(
+          "to row id:" + to + " is greater than total rows:" + getTotalRows());
+    }
+    return getRows(from, to);
+  }
+
+  /**
+   * Get total rows in the folder.
+   * It is based on the snapshot of files taken while building the reader.
+   *
+   * @return total rows from all the files in the reader.
+   */
+  public long getTotalRows() {
+    if (isClosed) {
+      throw new RuntimeException("Pagination Reader is closed. please build again");
+    }
+    return rowCountInSplits.get(rowCountInSplits.size() - 1);
+  }
+
+  /**
+   * This interface

[GitHub] [carbondata] QiangCai commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA

2020-06-01 Thread GitBox


QiangCai commented on pull request #3782:
URL: https://github.com/apache/carbondata/pull/3782#issuecomment-637256162


   @ajantha-bhat IntelliJ IDEA has this function in the menu (Analyze > Inspect Code); it outputs a report of issues that can be improved. I plan to check all the suggestions in the report one by one and fix them.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3776:
URL: https://github.com/apache/carbondata/pull/3776#issuecomment-637158658


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1396/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3776: [CARBONDATA-3834]Segment directory and the segment file in metadata are not created for partitioned table when 'carbon.merge.index.

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3776:
URL: https://github.com/apache/carbondata/pull/3776#issuecomment-637158541


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3120/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ajantha-bhat commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA

2020-06-01 Thread GitBox


ajantha-bhat commented on pull request #3782:
URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636989906


   @QiangCai : you are going through every file one by one and doing it? Big work! 



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636923342


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1395/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636923009


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3119/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636798214


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3118/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784#issuecomment-636796556


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1394/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Updated] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object

2020-06-01 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor updated CARBONDATA-3839:
-
Issue Type: Improvement  (was: New Feature)

> Rename file fails in hdfs for FilterFileSystem Object
> -
>
> Key: CARBONDATA-3839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 2.0.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Rename file fails for FilterFileSystem Object



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3816) Support Float and Decimal in the Merge Flow

2020-06-01 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor updated CARBONDATA-3816:
-
Fix Version/s: (was: 2.0.1)
   2.1.0

> Support Float and Decimal in the Merge Flow
> ---
>
> Key: CARBONDATA-3816
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3816
> Project: CarbonData
>  Issue Type: New Feature
>  Components: data-load
>Affects Versions: 2.0.0
>Reporter: Xingjun Hao
>Priority: Major
> Fix For: 2.1.0
>
>
> We don't support the FLOAT and DECIMAL datatypes in the CDC flow. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636766357


   Build Success with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3117/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636766737


   Build Success with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1393/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kevinjmh opened a new pull request #3784: [CARBONDATA-3841] Remove useless string in create and alter command

2020-06-01 Thread GitBox


kevinjmh opened a new pull request #3784:
URL: https://github.com/apache/carbondata/pull/3784


### Why is this PR needed?
1. LoadDataCommand has duplicated info about the user input. Since the SQL plan is printed in the driver log and shown on the Spark UI, we want to make it tidy.
2. Some variables/files are not in use.

### What changes were proposed in this PR?
   Remove the duplicated info and the unused variables/files.
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Created] (CARBONDATA-3841) Remove useless string in create and alter command

2020-06-01 Thread Manhua Jiang (Jira)
Manhua Jiang created CARBONDATA-3841:


 Summary: Remove useless string in create and alter command
 Key: CARBONDATA-3841
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3841
 Project: CarbonData
  Issue Type: Improvement
Reporter: Manhua Jiang






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] asfgit closed pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


asfgit closed pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jackylk commented on pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


jackylk commented on pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#issuecomment-636695185


   LGTM



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


kunal642 commented on a change in pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#discussion_r433098465



##
File path: README.md
##
@@ -69,6 +69,14 @@ CarbonData is built using Apache Maven, to [build 
CarbonData](https://github.com
 * [Carbon as Spark's 
Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
 
 * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md) 
 
+## Experimental Features
+
+Some features are marked as experimental because the syntax/implementation 
might change in the future.
+1. Hybrid format table using Add Segment.
+2. Accelerating performance using MV on parquet/orc.
+3. CDC and SCD.

Review comment:
   done





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jackylk commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


jackylk commented on a change in pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#discussion_r433095610



##
File path: README.md
##
@@ -69,6 +69,14 @@ CarbonData is built using Apache Maven, to [build 
CarbonData](https://github.com
 * [Carbon as Spark's 
Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
 
 * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md) 
 
+## Experimental Features
+
+Some features are marked as experimental because the syntax/implementation 
might change in the future.
+1. Hybrid format table using Add Segment.
+2. Accelerating performance using MV on parquet/orc.
+3. CDC and SCD.

Review comment:
   mention only the MERGE API for Spark DataFrame.
   UPDATE and DELETE are production features, so there is no need to mention CDC and SCD.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] jackylk commented on a change in pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


jackylk commented on a change in pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783#discussion_r433095610



##
File path: README.md
##
@@ -69,6 +69,14 @@ CarbonData is built using Apache Maven, to [build 
CarbonData](https://github.com
 * [Carbon as Spark's 
Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md)
 
 * [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md) 
 
+## Experimental Features
+
+Some features are marked as experimental because the syntax/implementation 
might change in the future.
+1. Hybrid format table using Add Segment.
+2. Accelerating performance using MV on parquet/orc.
+3. CDC and SCD.

Review comment:
   mention only the MERGE API for Spark DataFrame.
   UPDATE and DELETE are production features.





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] kunal642 opened a new pull request #3783: [CARBONDATA-3840] Mark features as experimental

2020-06-01 Thread GitBox


kunal642 opened a new pull request #3783:
URL: https://github.com/apache/carbondata/pull/3783


### Why is this PR needed?
Mark features as experimental because they are subject to change in future.

### What changes were proposed in this PR?
Mark features as experimental because they are subject to change in future.
   
### Does this PR introduce any user interface change?
- No
   
### Is any new testcase added?
- No
   
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] Indhumathi27 commented on a change in pull request #3773: [CARBONDATA-3830]Presto complex columns read support

2020-06-01 Thread GitBox


Indhumathi27 commented on a change in pull request #3773:
URL: https://github.com/apache/carbondata/pull/3773#discussion_r433080176



##
File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveFloatingCodec.java
##
@@ -247,6 +247,25 @@ public void decodeAndFillVector(byte[] pageData, ColumnVectorInfo vectorInfo, Bi
   DataType vectorDataType = vector.getType();
   vector = ColumnarVectorWrapperDirectFactory
       .getDirectVectorWrapperFactory(vector, null, nullBits, deletedRows, true, false);
+  CarbonColumnVector vectorColumn = null, childrenVectorInfo = null;
+  vectorColumn = vectorInfo.vector.getColumnVector();
+  if (vectorColumn != null) childrenVectorInfo = vectorColumn.getChildrenVector();

Review comment:
   Enclose if statement in braces

##
File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveIntegralCodec.java
##
@@ -308,15 +308,49 @@ public void decodeAndFillVector(byte[] pageData, ColumnVectorInfo vectorInfo, Bi
 private void fillVector(byte[] pageData, CarbonColumnVector vector, DataType vectorDataType,
     DataType pageDataType, int pageSize, ColumnVectorInfo vectorInfo, BitSet nullBits) {
   int rowId = 0;
+  CarbonColumnVector vectorColumn = null, childrenVectorInfo = null;
+  vectorColumn = vectorInfo.vector.getColumnVector();
+  if (vectorColumn != null) childrenVectorInfo = vectorColumn.getChildrenVector();

Review comment:
   Same code is added in AdaptiveFloatingCodec. Extract common code to a 
method and reuse
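
   One possible shape for the suggested extraction (hypothetical helper name; it only illustrates the braces-plus-shared-method idea using the calls already visible in the diff, not actual PR code):

    // Hypothetical helper shared by AdaptiveFloatingCodec and AdaptiveIntegralCodec,
    // with the if statement braced as requested.
    private static CarbonColumnVector getChildVector(ColumnVectorInfo vectorInfo) {
      CarbonColumnVector parentVector = vectorInfo.vector.getColumnVector();
      if (parentVector != null) {
        return parentVector.getChildrenVector();
      }
      return null;
    }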

##
File path: integration/presto/src/main/prestosql/org/apache/carbondata/presto/readers/ArrayStreamReader.java
##
@@ -0,0 +1,113 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto.readers;
+
+import io.prestosql.spi.type.*;
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+
+import io.prestosql.spi.block.Block;
+import io.prestosql.spi.block.BlockBuilder;
+
+/**
+ * Class to read the Array Stream
+ */
+
+public class ArrayStreamReader extends CarbonColumnVectorImpl implements PrestoVectorBlockBuilder {
+
+  protected int batchSize;
+  protected Type type;
+  Block childBlock = null;
+  protected BlockBuilder builder;
+
+  public ArrayStreamReader(int batchSize, DataType dataType) {
+    super(batchSize, dataType);
+    this.batchSize = batchSize;
+    if (dataType == DataTypes.STRING)

Review comment:
   enclose statements in braces

##
File path: core/src/main/java/org/apache/carbondata/core/scan/executor/impl/AbstractQueryExecutor.java
##
@@ -508,7 +508,7 @@ private BlockExecutionInfo getBlockExecutionInfoForBlock(QueryModel queryModel,
   int[] dimensionChunkIndexes = QueryUtil.getDimensionChunkIndexes(projectDimensions,
       segmentProperties.getDimensionOrdinalToChunkMapping(),
       currentBlockFilterDimensions, allProjectionListDimensionIdexes);
-  ReusableDataBuffer[] dimensionBuffer = new ReusableDataBuffer[projectDimensions.size()];
+  ReusableDataBuffer[] dimensionBuffer = new ReusableDataBuffer[projectDimensions.size() * 2];

Review comment:
   why this change?

##
File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveFloatingCodec.java
##
@@ -51,8 +51,8 @@
  */
 public class AdaptiveFloatingCodec extends AdaptiveCodec {
 
-  private double factor;
   private float floatFactor;
+  private double factor;

Review comment:
   Please revert this change

##
File path: core/src/main/java/org/apache/carbondata/core/scan/collector/impl/DictionaryBasedVectorResultCollector.java
##
@@ -98,6 +98,14 @@ void prepareDimensionAndMeasureColumnVectors() {
 columnVectorInfo.dimension = queryDimensions[i];
  columnVectorInfo.ordinal = queryDimensions[i].getDimension().getOrdinal();
 allColumnInfo[queryDimensions[i].getOrdinal()] = columnVectorInfo;
+  } else if (queryDimensions

[jira] [Created] (CARBONDATA-3840) Mark some features as experimental.

2020-06-01 Thread Kunal Kapoor (Jira)
Kunal Kapoor created CARBONDATA-3840:


 Summary: Mark some features as experimental.
 Key: CARBONDATA-3840
 URL: https://issues.apache.org/jira/browse/CARBONDATA-3840
 Project: CarbonData
  Issue Type: Task
Affects Versions: 2.0.1
Reporter: Kunal Kapoor


Experimental Features
1. Hybrid format table using Add Segment.
2. Accelerating performance using MV on parquet/orc
3. cdc/scd merge scenario
4. Hive write for non-transactional table.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (CARBONDATA-3837) Should fallback to the original plan when MV rewrite throw exception

2020-06-01 Thread David Cai (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Cai reassigned CARBONDATA-3837:
-

Assignee: David Cai

> Should fallback to the original plan when MV rewrite throw exception
> 
>
> Key: CARBONDATA-3837
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3837
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: David Cai
>Assignee: David Cai
>Priority: Major
> Fix For: 2.0.1
>
>  Time Spent: 2.5h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3782:
URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636662968


   Build Failed  with Spark 2.3.4, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3115/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA

2020-06-01 Thread GitBox


CarbonDataQA1 commented on pull request #3782:
URL: https://github.com/apache/carbondata/pull/3782#issuecomment-636662453


   Build Failed  with Spark 2.4.5, Please check CI 
http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/1391/
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] QiangCai opened a new pull request #3782: [WIP] Fix the result of inspecting code in Intellij IDEA

2020-06-01 Thread GitBox


QiangCai opened a new pull request #3782:
URL: https://github.com/apache/carbondata/pull/3782


### Why is this PR needed?
1. pointless bitwise expression
   2. field can be local
   3. standard Charset object can be used
   4. unnecessary conversion to String
   5. unnecessary interface modifier
   6. unnecessary semicolon
   7. duplicate condition in 'if' statement
   8. 'if' statement with common parts 
   9. Redundant 'if' statement
   10. unnecessary 'null' check before method call
   11. Redundant local variable
   12. unused import
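
   For readers unfamiliar with these inspection categories, a small made-up illustration of a few of them (not code from this PR):

    // Illustrative only: the kind of code the inspections above flag, with the cleaned-up form.
    class InspectionExamples {
      // 1. pointless bitwise expression + 11. redundant local variable
      int before(int flags) {
        int result = flags | 0;   // '| 0' changes nothing; 'result' is a redundant local
        return result;
      }
      int after(int flags) {
        return flags;
      }

      // 10. unnecessary 'null' check before method call
      boolean beforeCheck(String name) {
        return name != null && "carbon".equals(name);   // "carbon".equals(null) is already false
      }
      boolean afterCheck(String name) {
        return "carbon".equals(name);
      }
    }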
### What changes were proposed in this PR?
   
   
### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)
   
### Is any new testcase added?
- No
- Yes
   
   
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[jira] [Resolved] (CARBONDATA-3836) Fix carbon store path & avoid exception when creating new carbon table

2020-06-01 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3836.
--
Fix Version/s: 2.0.1
   Resolution: Fixed

> Fix carbon store path & avoid exception when creating new carbon table
> --
>
> Key: CARBONDATA-3836
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3836
> Project: CarbonData
>  Issue Type: Improvement
>Reporter: Manhua Jiang
>Priority: Major
> Fix For: 2.0.1
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object

2020-06-01 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor resolved CARBONDATA-3839.
--
Fix Version/s: 2.0.1
   2.1.0
   Resolution: Fixed

> Rename file fails in hdfs for FilterFileSystem Object
> -
>
> Key: CARBONDATA-3839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 2.1.0, 2.0.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Rename file fails for FilterFileSystem Object



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (CARBONDATA-3839) Rename file fails in hdfs for FilterFileSystem Object

2020-06-01 Thread Kunal Kapoor (Jira)


 [ 
https://issues.apache.org/jira/browse/CARBONDATA-3839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kunal Kapoor updated CARBONDATA-3839:
-
Fix Version/s: (was: 2.1.0)

> Rename file fails in hdfs for FilterFileSystem Object
> -
>
> Key: CARBONDATA-3839
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3839
> Project: CarbonData
>  Issue Type: New Feature
>Reporter: Akash R Nilugal
>Assignee: Akash R Nilugal
>Priority: Major
> Fix For: 2.0.1
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Rename file fails for FilterFileSystem Object



--
This message was sent by Atlassian Jira
(v8.3.4#803005)