from:"jackylk"

[GitHub] carbondata pull request #3066: [CARBONDATA-3244] Add benchmark for Change Da...

2019-01-10 Thread jackylk

GitHub user jackylk opened a pull request:

https://github.com/apache/carbondata/pull/3066

[CARBONDATA-3244] Add benchmark for Change Data Capture scenario

CDC (change data capture) is a common scenario for analyzing slowly changed 
table in data warehouse.
It is good to add benchmark test comparing two update methods:
1. hive_solution, which uses INSERT OVERWRITE. This is a popular method for 
hive warehouse.
2. carbon_solution, which uses CarbonData's update syntax to update the 
history table directly.

This test simulates updates to history table using CDC table.
When running in a 8-cores laptop, the benchmark shows:
1. test one
History table 1M records, update 10K records everyday and insert 10K 
records everyday, simulated 3 days.
hive_solution: total process time takes 13,516 ms
carbon_solution: total process time takes 7,521 ms


2. test two
History table 10M records, update 10K records everyday and insert 10K 
records everyday,
simulated 3 days.
hive_solution: total process time takes 104,250 ms
carbon_solution: total process time takes 17,384 ms

 - [X] Any interfaces changed?
 No
 - [X] Any backward compatibility impacted?
 No
 - [X] Document update required?
No
 - [X] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
  Only example is added 
 - [X] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
NA


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata cdc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/3066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3066


commit ebb5ef79ac85a6c736496fe19f719bfed74902c1
Author: Jacky Li 
Date:   2019-01-10T16:44:58Z

add benchmark for Change Data Capture scenario




---

[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...

2019-01-03 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2963#discussion_r245013088
  
--- Diff: pom.xml ---
@@ -527,6 +526,7 @@
 examples/spark2
 datamap/lucene
 datamap/bloom
+datamap/example
--- End diff --

I think it is better not to add this, since it will make the assembling 
bigger


---

[GitHub] carbondata issue #3019: [CARBONDATA-3194] Integrating Carbon with Presto usi...

2018-12-28 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3019
  
LGTM


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244269584
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/table/CarbonCreateTableCommand.scala
 ---
@@ -157,7 +157,7 @@ case class CarbonCreateTableCommand(
  |  tableName "$tableName",
  |  dbName "$dbName",
  |  tablePath "$tablePath",
- |  path "$tablePath",
+ |  path "${FileFactory.addSchemeIfNotExists(tablePath)}",
--- End diff --

Can you update the comment from line 147 to 150. I feel it need to be 
rephrased


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244269478
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/impl/FileFactory.java 
---
@@ -369,6 +369,24 @@ public static boolean createNewLockFile(String 
filePath, FileType fileType) thro
 LOCAL, HDFS, ALLUXIO, VIEWFS, S3
   }
 
+  public static String addSchemeIfNotExists(String filePath) {
--- End diff --

add comment


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244269435
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataModule.java
 ---
@@ -17,62 +17,150 @@
 
 package org.apache.carbondata.presto;
 
-import javax.inject.Inject;
+import java.util.function.Supplier;
 
 import static java.util.Objects.requireNonNull;
 
-import org.apache.carbondata.presto.impl.CarbonTableConfig;
 import org.apache.carbondata.presto.impl.CarbonTableReader;
 
+import com.facebook.presto.hive.CoercionPolicy;
+import com.facebook.presto.hive.DirectoryLister;
+import com.facebook.presto.hive.FileFormatDataSourceStats;
+import com.facebook.presto.hive.GenericHiveRecordCursorProvider;
+import com.facebook.presto.hive.HadoopDirectoryLister;
+import com.facebook.presto.hive.HdfsConfiguration;
+import com.facebook.presto.hive.HdfsConfigurationUpdater;
+import com.facebook.presto.hive.HdfsEnvironment;
+import com.facebook.presto.hive.HiveClientConfig;
+import com.facebook.presto.hive.HiveClientModule;
+import com.facebook.presto.hive.HiveCoercionPolicy;
+import com.facebook.presto.hive.HiveConnectorId;
+import com.facebook.presto.hive.HiveEventClient;
+import com.facebook.presto.hive.HiveFileWriterFactory;
+import com.facebook.presto.hive.HiveHdfsConfiguration;
+import com.facebook.presto.hive.HiveLocationService;
+import com.facebook.presto.hive.HiveMetadataFactory;
+import com.facebook.presto.hive.HiveNodePartitioningProvider;
+import com.facebook.presto.hive.HivePageSinkProvider;
+import com.facebook.presto.hive.HivePageSourceFactory;
+import com.facebook.presto.hive.HivePartitionManager;
+import com.facebook.presto.hive.HiveRecordCursorProvider;
+import com.facebook.presto.hive.HiveSessionProperties;
+import com.facebook.presto.hive.HiveSplitManager;
+import com.facebook.presto.hive.HiveTableProperties;
+import com.facebook.presto.hive.HiveTransactionManager;
+import com.facebook.presto.hive.HiveTypeTranslator;
+import com.facebook.presto.hive.HiveWriterStats;
+import com.facebook.presto.hive.LocationService;
+import com.facebook.presto.hive.NamenodeStats;
+import com.facebook.presto.hive.OrcFileWriterConfig;
+import com.facebook.presto.hive.OrcFileWriterFactory;
+import com.facebook.presto.hive.PartitionUpdate;
+import com.facebook.presto.hive.RcFileFileWriterFactory;
+import com.facebook.presto.hive.TableParameterCodec;
+import com.facebook.presto.hive.TransactionalMetadata;
+import com.facebook.presto.hive.TypeTranslator;
+import com.facebook.presto.hive.orc.DwrfPageSourceFactory;
+import com.facebook.presto.hive.orc.OrcPageSourceFactory;
+import com.facebook.presto.hive.parquet.ParquetPageSourceFactory;
+import com.facebook.presto.hive.parquet.ParquetRecordCursorProvider;
+import com.facebook.presto.hive.rcfile.RcFilePageSourceFactory;
+import com.facebook.presto.spi.connector.ConnectorNodePartitioningProvider;
+import com.facebook.presto.spi.connector.ConnectorPageSinkProvider;
 import com.facebook.presto.spi.connector.ConnectorPageSourceProvider;
 import com.facebook.presto.spi.connector.ConnectorSplitManager;
-import com.facebook.presto.spi.type.Type;
-import com.facebook.presto.spi.type.TypeManager;
-import com.fasterxml.jackson.databind.DeserializationContext;
-import com.fasterxml.jackson.databind.deser.std.FromStringDeserializer;
 import com.google.inject.Binder;
-import com.google.inject.Module;
 import com.google.inject.Scopes;
+import com.google.inject.TypeLiteral;
+import com.google.inject.multibindings.Multibinder;
+import io.airlift.event.client.EventClient;
 
-import static 
com.facebook.presto.spi.type.TypeSignature.parseTypeSignature;
-import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.inject.multibindings.Multibinder.newSetBinder;
 import static io.airlift.configuration.ConfigBinder.configBinder;
+import static io.airlift.json.JsonCodecBinder.jsonCodecBinder;
 
-public class CarbondataModule implements Module {
+import static org.weakref.jmx.ObjectNames.generatedNameOf;
+import static org.weakref.jmx.guice.ExportBinder.newExporter;
+
+public class CarbondataModule extends HiveClientModule {
--- End diff --

Please add comment for this class


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244254924
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -78,32 +72,33 @@
   private static final String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
 
   /**
-   * @param carbondataColumnHandle
+   * @param columnHandle
* @return
*/
-  private static DataType spi2CarbondataTypeMapper(CarbondataColumnHandle 
carbondataColumnHandle) {
-Type colType = carbondataColumnHandle.getColumnType();
-if (colType == BooleanType.BOOLEAN) {
+  private static DataType spi2CarbondataTypeMapper(HiveColumnHandle 
columnHandle) {
+HiveType colType = columnHandle.getHiveType();
+if (colType.equals(HiveType.HIVE_BOOLEAN)) {
   return DataTypes.BOOLEAN;
-} else if (colType == SmallintType.SMALLINT) {
+} else if (colType.equals(HiveType.HIVE_SHORT)) {
   return DataTypes.SHORT;
-} else if (colType == IntegerType.INTEGER) {
+} else if (colType.equals(HiveType.HIVE_INT)) {
   return DataTypes.INT;
-} else if (colType == BigintType.BIGINT) {
+} else if (colType.equals(HiveType.HIVE_LONG)) {
   return DataTypes.LONG;
-} else if (colType == DoubleType.DOUBLE) {
+} else if (colType.equals(HiveType.HIVE_DOUBLE)) {
   return DataTypes.DOUBLE;
-} else if (colType == VarcharType.VARCHAR) {
+} else if (colType.equals(HiveType.HIVE_STRING)) {
   return DataTypes.STRING;
-} else if (colType == DateType.DATE) {
+} else if (colType.equals(HiveType.HIVE_DATE)) {
   return DataTypes.DATE;
-} else if (colType == TimestampType.TIMESTAMP) {
+} else if (colType.equals(HiveType.HIVE_TIMESTAMP)) {
   return DataTypes.TIMESTAMP;
-} else if 
(colType.equals(DecimalType.createDecimalType(carbondataColumnHandle.getPrecision(),
-carbondataColumnHandle.getScale( {
-  return 
DataTypes.createDecimalType(carbondataColumnHandle.getPrecision(),
-  carbondataColumnHandle.getScale());
-} else {
+}
+else if (colType.getTypeInfo() instanceof DecimalTypeInfo) {
+  DecimalTypeInfo typeInfo = (DecimalTypeInfo) colType.getTypeInfo();
+  return 
DataTypes.createDecimalType(typeInfo.getPrecision(),typeInfo.getScale());
+}
+else {
--- End diff --

move up


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244254915
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/PrestoFilterUtil.java
 ---
@@ -78,32 +72,33 @@
   private static final String HIVE_DEFAULT_DYNAMIC_PARTITION = 
"__HIVE_DEFAULT_PARTITION__";
 
   /**
-   * @param carbondataColumnHandle
+   * @param columnHandle
* @return
*/
-  private static DataType spi2CarbondataTypeMapper(CarbondataColumnHandle 
carbondataColumnHandle) {
-Type colType = carbondataColumnHandle.getColumnType();
-if (colType == BooleanType.BOOLEAN) {
+  private static DataType spi2CarbondataTypeMapper(HiveColumnHandle 
columnHandle) {
+HiveType colType = columnHandle.getHiveType();
+if (colType.equals(HiveType.HIVE_BOOLEAN)) {
   return DataTypes.BOOLEAN;
-} else if (colType == SmallintType.SMALLINT) {
+} else if (colType.equals(HiveType.HIVE_SHORT)) {
   return DataTypes.SHORT;
-} else if (colType == IntegerType.INTEGER) {
+} else if (colType.equals(HiveType.HIVE_INT)) {
   return DataTypes.INT;
-} else if (colType == BigintType.BIGINT) {
+} else if (colType.equals(HiveType.HIVE_LONG)) {
   return DataTypes.LONG;
-} else if (colType == DoubleType.DOUBLE) {
+} else if (colType.equals(HiveType.HIVE_DOUBLE)) {
   return DataTypes.DOUBLE;
-} else if (colType == VarcharType.VARCHAR) {
+} else if (colType.equals(HiveType.HIVE_STRING)) {
   return DataTypes.STRING;
-} else if (colType == DateType.DATE) {
+} else if (colType.equals(HiveType.HIVE_DATE)) {
   return DataTypes.DATE;
-} else if (colType == TimestampType.TIMESTAMP) {
+} else if (colType.equals(HiveType.HIVE_TIMESTAMP)) {
   return DataTypes.TIMESTAMP;
-} else if 
(colType.equals(DecimalType.createDecimalType(carbondataColumnHandle.getPrecision(),
-carbondataColumnHandle.getScale( {
-  return 
DataTypes.createDecimalType(carbondataColumnHandle.getPrecision(),
-  carbondataColumnHandle.getScale());
-} else {
+}
+else if (colType.getTypeInfo() instanceof DecimalTypeInfo) {
--- End diff --

move up 


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244254822
  
--- Diff: 
integration/presto/src/test/scala/org/apache/carbondata/presto/server/PrestoServer.scala
 ---
@@ -129,6 +130,21 @@ class PrestoServer {
 }
   }
 
+  def execute(query: String) = {
+
+Try {
+  val conn: Connection = createJdbcConnection(dbName)
+  logger.info(s"* executing the query * \n $query")
--- End diff --

rename logger to LOGGER


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244254647
  
--- Diff: 
integration/presto/src/test/scala/org/apache/carbondata/presto/util/CarbonDataStoreCreator.scala
 ---
@@ -80,7 +80,7 @@ object CarbonDataStoreCreator {
   UUID.randomUUID().toString))
   //   val factFilePath: String = new 
File(dataFilePath).getCanonicalPath
   val storeDir: File = new File(absoluteTableIdentifier.getTablePath)
-  CarbonUtil.deleteFoldersAndFiles(storeDir)
+//  CarbonUtil.deleteFoldersAndFiles(storeDir)
--- End diff --

delete it if not required, same for line 81


---

[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...

2018-12-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2161
  
LGTM. Merging into master branch


---

[GitHub] carbondata issue #3018: [HOTFIX] rename field "thread_pool_size" to match ca...

2018-12-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3018
  
LGTM


---

[GitHub] carbondata issue #2984: [CARBONDATA-3165]Protection of Bloom Null Exception

2018-12-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2984
  
LGTM


---

[GitHub] carbondata pull request #3019: [CARBONDATA-3194] Integrating Carbon with Pre...

2018-12-27 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3019#discussion_r244172111
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/CarbondataConnectorFactory.java
 ---
@@ -17,69 +17,177 @@
 
 package org.apache.carbondata.presto;
 
+import java.lang.management.ManagementFactory;
+import java.lang.reflect.*;
 import java.util.Map;
+import java.util.Optional;
+import java.util.Set;
 
 import static java.util.Objects.requireNonNull;
 
-import com.facebook.presto.spi.ConnectorHandleResolver;
+import org.apache.carbondata.presto.impl.CarbonTableConfig;
+
+import com.facebook.presto.hive.HiveConnector;
+import com.facebook.presto.hive.HiveConnectorFactory;
+import com.facebook.presto.hive.HiveMetadataFactory;
+import com.facebook.presto.hive.HiveProcedureModule;
+import com.facebook.presto.hive.HiveSchemaProperties;
+import com.facebook.presto.hive.HiveSessionProperties;
+import com.facebook.presto.hive.HiveStorageFormat;
+import com.facebook.presto.hive.HiveTableProperties;
+import com.facebook.presto.hive.HiveTransactionManager;
+import com.facebook.presto.hive.NodeVersion;
+import com.facebook.presto.hive.RebindSafeMBeanServer;
+import com.facebook.presto.hive.authentication.HiveAuthenticationModule;
+import com.facebook.presto.hive.metastore.HiveMetastoreModule;
+import com.facebook.presto.hive.s3.HiveS3Module;
+import com.facebook.presto.hive.security.HiveSecurityModule;
+import com.facebook.presto.hive.security.PartitionsAwareAccessControl;
+import com.facebook.presto.spi.NodeManager;
+import com.facebook.presto.spi.PageIndexerFactory;
+import com.facebook.presto.spi.PageSorter;
 import com.facebook.presto.spi.classloader.ThreadContextClassLoader;
-import com.facebook.presto.spi.connector.*;
-import 
com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorMetadata;
+import com.facebook.presto.spi.connector.Connector;
+import com.facebook.presto.spi.connector.ConnectorAccessControl;
+import com.facebook.presto.spi.connector.ConnectorContext;
+import com.facebook.presto.spi.connector.ConnectorNodePartitioningProvider;
+import com.facebook.presto.spi.connector.ConnectorPageSinkProvider;
+import com.facebook.presto.spi.connector.ConnectorPageSourceProvider;
+import com.facebook.presto.spi.connector.ConnectorSplitManager;
+import 
com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSinkProvider;
 import 
com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorPageSourceProvider;
 import 
com.facebook.presto.spi.connector.classloader.ClassLoaderSafeConnectorSplitManager;
-import com.google.common.base.Throwables;
+import 
com.facebook.presto.spi.connector.classloader.ClassLoaderSafeNodePartitioningProvider;
+import com.facebook.presto.spi.procedure.Procedure;
+import com.facebook.presto.spi.type.TypeManager;
+import com.google.common.collect.ImmutableSet;
 import com.google.inject.Injector;
+import com.google.inject.Key;
+import com.google.inject.TypeLiteral;
 import io.airlift.bootstrap.Bootstrap;
 import io.airlift.bootstrap.LifeCycleManager;
+import io.airlift.event.client.EventModule;
 import io.airlift.json.JsonModule;
+import io.airlift.units.DataSize;
+import org.weakref.jmx.guice.MBeanModule;
+import sun.reflect.ConstructorAccessor;
 
+import static com.google.common.base.Throwables.throwIfUnchecked;
+import static io.airlift.configuration.ConfigBinder.configBinder;
 
 /**
  * Build Carbondata Connector
  * It will be called by CarbondataPlugin
  */
-public class CarbondataConnectorFactory implements ConnectorFactory {
+public class CarbondataConnectorFactory extends HiveConnectorFactory {
 
-  private final String name;
   private final ClassLoader classLoader;
 
   public CarbondataConnectorFactory(String connectorName, ClassLoader 
classLoader) {
-this.name = connectorName;
+super(connectorName, classLoader, null);
 this.classLoader = requireNonNull(classLoader, "classLoader is null");
   }
 
-  @Override public String getName() {
-return name;
-  }
-
-  @Override public ConnectorHandleResolver getHandleResolver() {
-return new CarbondataHandleResolver();
-  }
-
-  @Override public Connector create(String connectorId, Map config,
+  @Override public Connector create(String catalogName, Map config,
   ConnectorContext context) {
 requireNonNull(config, "config is null");
 
 try (ThreadContextClassLoader ignored = new 
ThreadContextClassLoader(classLoader)) {
-  Bootstrap app = new Bootstrap(new JsonModule(),
-

[GitHub] carbondata issue #3004: [CARBONDATA-3188] Create carbon table as hive unders...

2018-12-27 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3004
  
LGTM


---

[GitHub] carbondata pull request #3017: [HOTFIX] remove this useless assignment

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3017#discussion_r244065800
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/locks/LocalFileLock.java ---
@@ -112,7 +112,7 @@ public LocalFileLock(String lockFileLocation, String 
lockFile) {
 status = true;
   }
 } catch (IOException e) {
-  status = false;
+  // status = false;
--- End diff --

Yes, I think HdfsFileLock has similar problem. Actually there is a PR #2878 
trying to fix all exception code style problem, by @kevinjmh . I suggest we can 
have it handled in that PR. Then for this PR, I think you can remove the 
assignment directly, same for HdfsFileLock


---

[GitHub] carbondata pull request #3026: [WIP] Added support to compile carbon CDH spa...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3026#discussion_r244065240
  
--- Diff: 
integration/spark-datasource/src/main/spark2.1andspark2.2/org/apache/spark/sql/CarbonDictionaryUtil.java
 ---
@@ -0,0 +1,116 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql;
+
+import java.lang.reflect.Array;
+import java.lang.reflect.Field;
+import java.lang.reflect.Method;
+
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+
+import org.apache.spark.sql.execution.vectorized.ColumnVector;
+
+/**
+ * This class uses the java reflection to create parquet dictionary class 
as CDH distribution uses
+ * twitter parquet instead of apache parquet.
+ */
+public class CarbonDictionaryUtil {
--- End diff --

It it better to make it as `ReflectionUtil` 
And please add InterfaceAudience annotation


---

[GitHub] carbondata pull request #2963: [CARBONDATA-3139] Fix bugs in MinMaxDataMap e...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2963#discussion_r244007049
  
--- Diff: 
datamap/example/src/main/java/org/apache/carbondata/datamap/minmax/MinMaxDataMapFactory.java
 ---
@@ -0,0 +1,353 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.datamap.minmax;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+import java.util.concurrent.ConcurrentHashMap;
+
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import 
org.apache.carbondata.common.exceptions.sql.MalformedDataMapCommandException;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.cache.Cache;
+import org.apache.carbondata.core.cache.CacheProvider;
+import org.apache.carbondata.core.cache.CacheType;
+import org.apache.carbondata.core.datamap.DataMapDistributable;
+import org.apache.carbondata.core.datamap.DataMapLevel;
+import org.apache.carbondata.core.datamap.DataMapMeta;
+import org.apache.carbondata.core.datamap.DataMapStoreManager;
+import org.apache.carbondata.core.datamap.Segment;
+import org.apache.carbondata.core.datamap.TableDataMap;
+import org.apache.carbondata.core.datamap.dev.DataMapBuilder;
+import org.apache.carbondata.core.datamap.dev.DataMapWriter;
+import org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMap;
+import 
org.apache.carbondata.core.datamap.dev.cgdatamap.CoarseGrainDataMapFactory;
+import org.apache.carbondata.core.datastore.block.SegmentProperties;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFile;
+import org.apache.carbondata.core.datastore.filesystem.CarbonFileFilter;
+import org.apache.carbondata.core.datastore.impl.FileFactory;
+import org.apache.carbondata.core.features.TableOperation;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.DataMapSchema;
+import 
org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.scan.filter.intf.ExpressionType;
+import org.apache.carbondata.core.statusmanager.SegmentStatusManager;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.core.util.path.CarbonTablePath;
+import org.apache.carbondata.events.Event;
+
+import org.apache.log4j.Logger;
+
+/**
+ * Min Max DataMap Factory
+ */
+@InterfaceAudience.Internal
+public class MinMaxDataMapFactory extends CoarseGrainDataMapFactory {
+  private static final Logger LOGGER =
+  
LogServiceFactory.getLogService(MinMaxDataMapFactory.class.getName());
+  private DataMapMeta dataMapMeta;
+  private String dataMapName;
+  // segmentId -> list of index files
+  private Map> segmentMap = new ConcurrentHashMap<>();
+  private Cache cache;
+
+  public MinMaxDataMapFactory(CarbonTable carbonTable, DataMapSchema 
dataMapSchema)
+  throws MalformedDataMapCommandException {
+super(carbonTable, dataMapSchema);
+
+// this is an example for datamap, we can choose the columns and 
operations that
+// will be supported by this datamap. Furthermore, we can add 
cache-support for this datamap.
+
+this.dataMapName = dataMapSchema.getDataMapName();
+List indexedColumns = 
carbonTable.getIndexedColumns(dataMapSchema);
+
+// operations that will be supported on the indexed columns
+List optOperations = new ArrayList<>();
+optOperations.add(ExpressionType.NOT);
+optOperations.add(ExpressionType.EQUALS);
+optOperations.add(ExpressionType.NOT_EQUALS);
+optOperations.add(ExpressionType.GREATERTHAN);
+optOperations.add(ExpressionType.GREATERTHAN_EQUALTO);
+opt

[GitHub] carbondata issue #3001: [Presto][Streaming] support presto read streaming ta...

2018-12-26 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3001
  
@QiangCai Can you explain the dependency of readers after this PR? Ideally, 
presto module should depends on streaming reader in core or hadoop modules.


---

[GitHub] carbondata pull request #2970: [CARBONDATA-3142]Add timestamp with thread na...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2970#discussion_r244005751
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
 ---
@@ -144,14 +144,14 @@ private void 
initParameters(CarbonFactDataHandlerModel model) {
 blockletProcessingCount = new AtomicInteger(0);
 producerExecutorService = 
Executors.newFixedThreadPool(model.getNumberOfCores(),
 new CarbonThreadFactory(
-"ProducerPool_" + System.nanoTime() + ":" + 
model.getTableName() + ", range: " + model
-.getBucketId()));
+"ProducerPool:" + model.getTableName() + ", range: " + model
--- End diff --

better to use String.format


---

[GitHub] carbondata pull request #2970: [CARBONDATA-3142]Add timestamp with thread na...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2970#discussion_r244005776
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/store/CarbonFactDataHandlerColumnar.java
 ---
@@ -144,14 +144,14 @@ private void 
initParameters(CarbonFactDataHandlerModel model) {
 blockletProcessingCount = new AtomicInteger(0);
 producerExecutorService = 
Executors.newFixedThreadPool(model.getNumberOfCores(),
 new CarbonThreadFactory(
-"ProducerPool_" + System.nanoTime() + ":" + 
model.getTableName() + ", range: " + model
-.getBucketId()));
+"ProducerPool:" + model.getTableName() + ", range: " + model
+.getBucketId(), true));
 producerExecutorServiceTaskList =
 new ArrayList<>(CarbonCommonConstants.DEFAULT_COLLECTION_SIZE);
 LOGGER.debug("Initializing writer executors");
 consumerExecutorService = Executors.newFixedThreadPool(1, new 
CarbonThreadFactory(
-"ConsumerPool_" + System.nanoTime() + ":" + model.getTableName() + 
", range: " + model
-.getBucketId()));
+"ConsumerPool:" + model.getTableName() + ", range: " + model
--- End diff --

better to use String.format()


---

[GitHub] carbondata pull request #2970: [CARBONDATA-3142]Add timestamp with thread na...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2970#discussion_r244005707
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/util/CarbonThreadFactory.java ---
@@ -34,14 +34,26 @@
*/
   private String name;
 
+  private boolean withTime = false;
+
   public CarbonThreadFactory(String name) {
 this.defaultFactory = Executors.defaultThreadFactory();
 this.name = name;
   }
 
+  public CarbonThreadFactory(String name, boolean withTime) {
+this(name);
+this.withTime = withTime;
+  }
+
   @Override public Thread newThread(Runnable r) {
 final Thread thread = defaultFactory.newThread(r);
-thread.setName(name);
+if (withTime) {
+  thread.setName(name + "_" + System.currentTimeMillis());
+}
+else {
--- End diff --

move to previous line


---

[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...

2018-12-26 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2161
  
@chandrasaripaka thanks for contributing this, I have only last 1 commend. 
It can be merged after that is resolved, if it is really an issue


---

[GitHub] carbondata pull request #2161: [CARBONDATA-2218] AlluxioCarbonFile while try...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2161#discussion_r244005006
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java
 ---
@@ -90,21 +97,82 @@ public CarbonFile getParentFile() {
 return null == parent ? null : new AlluxioCarbonFile(parent);
   }
 
+  /**
+   * RenameForce of the fileName for the AlluxioFileSystem 
Implementation.
+   * Involves by opening a {@link FSDataInputStream} from the existing 
path and copy
+   * bytes to {@link FSDataOutputStream}.
+   * 
+   * Close the output and input streams only after the files have been 
written
+   * Also check for the existence of the changed path and then delete the 
previous Path.
+   * The No of Bytes that can be read is controlled by {@literal 
io.file.buffer.size},
+   * where the default value is 4096.
+   * @param changeToName
+   * @return
+   */
   @Override
   public boolean renameForce(String changeToName) {
-FileSystem fs;
+FileSystem fs = null;
+FSDataOutputStream fsdos = null;
+FSDataInputStream fsdis = null;
 try {
-  fs = 
fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
-  if (fs instanceof DistributedFileSystem) {
-((DistributedFileSystem) fs).rename(fileStatus.getPath(), new 
Path(changeToName),
-org.apache.hadoop.fs.Options.Rename.OVERWRITE);
+  Path actualPath = fileStatus.getPath();
+  Path changedPath = new Path(changeToName);
+  fs = actualPath.getFileSystem(hadoopConf);
+  fsdos = fs.create(changedPath, true);
+  fsdis = fs.open(actualPath);
+  if (null != fsdis && null != fsdos) {
+IOUtils.copyBytes(fsdis, fsdos, hadoopConf, true);
 return true;
-  } else {
-return false;
   }
+  return false;
 } catch (IOException e) {
   LOGGER.error("Exception occured: " + e.getMessage());
   return false;
+} finally {
+  try {
+if (null != fsdis && null != fsdos) {
+  if (fs.exists(new Path(changeToName))) {
+fs.delete(fileStatus.getPath(), true);
--- End diff --

Here the fileStatus is not updated, I suspect there will be potential issue


---

[GitHub] carbondata pull request #2161: [CARBONDATA-2218] AlluxioCarbonFile while try...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2161#discussion_r244004960
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/filesystem/AlluxioCarbonFile.java
 ---
@@ -90,21 +97,82 @@ public CarbonFile getParentFile() {
 return null == parent ? null : new AlluxioCarbonFile(parent);
   }
 
+  /**
+   * RenameForce of the fileName for the AlluxioFileSystem 
Implementation.
+   * Involves by opening a {@link FSDataInputStream} from the existing 
path and copy
+   * bytes to {@link FSDataOutputStream}.
+   * 
+   * Close the output and input streams only after the files have been 
written
+   * Also check for the existence of the changed path and then delete the 
previous Path.
+   * The No of Bytes that can be read is controlled by {@literal 
io.file.buffer.size},
+   * where the default value is 4096.
+   * @param changeToName
+   * @return
+   */
   @Override
   public boolean renameForce(String changeToName) {
-FileSystem fs;
+FileSystem fs = null;
+FSDataOutputStream fsdos = null;
+FSDataInputStream fsdis = null;
 try {
-  fs = 
fileStatus.getPath().getFileSystem(FileFactory.getConfiguration());
-  if (fs instanceof DistributedFileSystem) {
-((DistributedFileSystem) fs).rename(fileStatus.getPath(), new 
Path(changeToName),
-org.apache.hadoop.fs.Options.Rename.OVERWRITE);
+  Path actualPath = fileStatus.getPath();
+  Path changedPath = new Path(changeToName);
+  fs = actualPath.getFileSystem(hadoopConf);
+  fsdos = fs.create(changedPath, true);
+  fsdis = fs.open(actualPath);
+  if (null != fsdis && null != fsdos) {
+IOUtils.copyBytes(fsdis, fsdos, hadoopConf, true);
 return true;
--- End diff --

I think what you do in the finally block should move to here and fileStatus 
should be assigned to point to the newly created file. 



---

[GitHub] carbondata pull request #3017: [HOTFIX] remove this useless assignment

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3017#discussion_r244003686
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/locks/LocalFileLock.java ---
@@ -112,7 +112,7 @@ public LocalFileLock(String lockFileLocation, String 
lockFile) {
 status = true;
   }
 } catch (IOException e) {
-  status = false;
+  // status = false;
--- End diff --

I suggest we print a warning message, and remove this assignment
@ravipesala @QiangCai 


---

[GitHub] carbondata issue #3004: [CARBONDATA-3188] Create carbon table as hive unders...

2018-12-26 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3004
  
@SteNicholas Let's first merge this PR and then you can start refactory on 
carbondata-hive module


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244002248
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
+if (SparkUtil.isSparkVersionEqualTo("2.2")) {
+  assert(table.storage.locationUri.get.equals(new 
Path(s"file:$storeLocation/source").toUri))
+}
+
assert(table.storage.inputFormat.get.equals(classOf[CarbonFileInputFormat[_]].getName))
--- End diff --

Shouldn't it be CarbonTableInputFormat class?


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244002150
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
+if (SparkUtil.isSparkVersionEqualTo("2.2")) {
+  assert(table.storage.locationUri.get.equals(new 
Path(s"file:$storeLocation/source").toUri))
+}
+
assert(table.storage.inputFormat.get.equals(classOf[CarbonFileInputFormat[_]].getName))
--- End diff --

please assert the serde class and outputformat class also


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001857
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
--- End diff --

same for line 61, 63


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001900
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
+if (SparkUtil.isSparkVersionEqualTo("2.2")) {
+  assert(table.storage.locationUri.get.equals(new 
Path(s"file:$storeLocation/source").toUri))
+}
+
assert(table.storage.inputFormat.get.equals(classOf[CarbonFileInputFormat[_]].getName))
+  }
+
+  test("test create table and verify the hive table correctness with using 
carbondata") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |using carbondata
+ """.stripMargin)
+
+verifyTable
+
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with using 
carbon") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |using carbon
+ """.stripMargin)
+
+verifyTable
+
--- End diff --

remove extra line


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001878
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
+if (SparkUtil.isSparkVersionEqualTo("2.2")) {
+  assert(table.storage.locationUri.get.equals(new 
Path(s"file:$storeLocation/source").toUri))
+}
+
assert(table.storage.inputFormat.get.equals(classOf[CarbonFileInputFormat[_]].getName))
+  }
+
+  test("test create table and verify the hive table correctness with using 
carbondata") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |using carbondata
+ """.stripMargin)
+
+verifyTable
+
+
--- End diff --

remove extra line


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001830
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
+ """.stripMargin)
+
+verifyTable
+
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  private def verifyTable = {
+val table = 
sqlContext.sparkSession.asInstanceOf[CarbonSession].sessionState.catalog.asInstanceOf[CarbonSessionCatalog].getClient().getTable("default",
 "source")
+assert(table.schema.fields.length == 3)
--- End diff --

use assertResult so that it can show the actual value when failed


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001671
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
--- End diff --

```suggestion
 |STORED AS carbondata
```


---

[GitHub] carbondata pull request #3004: [CARBONDATA-3188] Create carbon table as hive...

2018-12-26 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/3004#discussion_r244001631
  
--- Diff: 
integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/createTable/TestCreateHiveTableWithCarbonDS.scala
 ---
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.spark.testsuite.createTable
+
+import java.io.File
+
+import org.apache.hadoop.fs.Path
+import org.apache.spark.sql.catalyst.TableIdentifier
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.sql.test.util.QueryTest
+import org.apache.spark.sql.{AnalysisException, CarbonEnv, CarbonSession}
+import org.apache.spark.util.SparkUtil
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.constants.CarbonCommonConstants
+import org.apache.carbondata.core.util.CarbonProperties
+import org.apache.carbondata.hadoop.api.CarbonFileInputFormat
+
+class TestCreateHiveTableWithCarbonDS extends QueryTest with 
BeforeAndAfterAll {
+
+  override def beforeAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  override def afterAll(): Unit = {
+sql("DROP TABLE IF EXISTS source")
+  }
+
+  test("test create table and verify the hive table correctness with 
stored by") {
+sql("DROP TABLE IF EXISTS source")
+sql(
+  s"""
+ |CREATE TABLE source (key INT, value string, col1 double)
+ |STORED BY 'carbondata'
--- End diff --

please use STORED AS carbondata


---

[GitHub] carbondata issue #3012: [CARBONDATA-3127]Fix the HiveExample & TestCarbonSer...

2018-12-21 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3012
  
Thanks for contributing this fix


---

[GitHub] carbondata issue #3012: [CARBONDATA-3127]Fix the HiveExample & TestCarbonSer...

2018-12-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3012
  
LGTM


---

[GitHub] carbondata issue #2990: [CARBONDATA-3149]Support alter table column rename

2018-12-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2990
  
LGTM. Thanks for working on this


---

[GitHub] carbondata issue #2161: [CARBONDATA-2218] AlluxioCarbonFile while trying to ...

2018-12-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2161
  
Since user normally uses Alluxio as a read cache, I think we can firstly 
verify carbon on alluxio for the query scenario. As I am still not very sure 
what is the correct way to implement rename for Alluxio, in the meantime, we 
can merge this PR first. So please rebase it. @chandrasaripaka 


---

[GitHub] carbondata issue #2969: [CARBONDATA-3127]Fix the HiveExample & TestCarbonSer...

2018-12-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2969
  
LGTM, can you squash your commits and rebase to master. I could not squash 
it


---

[GitHub] carbondata issue #3010: Fix PreAggregate Datamap Issue

2018-12-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/3010
  
Please create a JIRA issue and put it in the PR title


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243250514
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/spark/carbondata/restructure/vectorreader/AlterTableColumnRenameTestCase.scala
 ---
@@ -0,0 +1,344 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.carbondata.restructure.vectorreader
+
+import org.apache.spark.sql.common.util.Spark2QueryTest
+import org.scalatest.BeforeAndAfterAll
+
+import org.apache.carbondata.core.metadata.CarbonMetadata
+import org.apache.carbondata.spark.exception.ProcessMetaDataException
+
+class AlterTableColumnRenameTestCase extends Spark2QueryTest with 
BeforeAndAfterAll {
+
+
--- End diff --

remove extra empty line


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243250452
  
--- Diff: 
integration/spark2/src/test/scala/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMapSuite.scala
 ---
@@ -548,11 +548,16 @@ class BloomCoarseGrainDataMapSuite extends QueryTest 
with BeforeAndAfterAll with
  | USING 'bloomfilter'
  | DMProperties( 'INDEX_COLUMNS'='city,id', 'BLOOM_SIZE'='64')
   """.stripMargin)
-val exception: MalformedCarbonCommandException = 
intercept[MalformedCarbonCommandException] {
+val exception1: MalformedCarbonCommandException = 
intercept[MalformedCarbonCommandException] {
--- End diff --

use a proper name


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243248566
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/util/AlterTableUtil.scala ---
@@ -269,12 +269,50 @@ object AlterTableUtil {
   }
 }
   }
-  metastore
+  metaStore
 
.revertTableSchemaInAlterFailure(carbonTable.getCarbonTableIdentifier,
   thriftTable, 
carbonTable.getAbsoluteTableIdentifier)(sparkSession)
 }
   }
 
+  /**
+   * This method modifies the table properties if column rename happened
+   *
+   * @param tableProperties
--- End diff --

add comment for all parameter


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243248408
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/strategy/StreamingTableStrategy.scala
 ---
@@ -54,10 +54,10 @@ private[sql] class StreamingTableStrategy(sparkSession: 
SparkSession) extends Sp
   new TableIdentifier(model.tableName, model.databaseName),
   "Alter table drop column")
 Nil
-  case CarbonAlterTableDataTypeChangeCommand(model) =>
+  case CarbonAlterTableColRenameDataTypeChangeCommand(model, _) =>
 rejectIfStreamingTable(
   new TableIdentifier(model.tableName, model.databaseName),
-  "Alter table change datatype")
+  "Alter table change datatype or column rename")
--- End diff --

I think you can know whether it is rename by 
`CarbonAlterTableColRenameDataTypeChangeCommand(model, rename)`, so you can use 
different message


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243247751
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/execution/command/schema/CarbonAlterTableColRenameDataTypeChangeCommand.scala
 ---
@@ -0,0 +1,324 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution.command.schema
+
+import scala.collection.JavaConverters._
+import scala.collection.mutable
+
+import org.apache.spark.sql.{CarbonEnv, Row, SparkSession}
+import 
org.apache.spark.sql.execution.command.{AlterTableDataTypeChangeModel, 
DataTypeInfo,
+  MetadataCommand}
+import org.apache.spark.sql.hive.CarbonSessionCatalog
+import org.apache.spark.util.AlterTableUtil
+
+import 
org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+import org.apache.carbondata.common.logging.LogServiceFactory
+import org.apache.carbondata.core.features.TableOperation
+import org.apache.carbondata.core.locks.{ICarbonLock, LockUsage}
+import 
org.apache.carbondata.core.metadata.converter.ThriftWrapperSchemaConverterImpl
+import org.apache.carbondata.core.metadata.datatype.DecimalType
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn
+import 
org.apache.carbondata.events.{AlterTableColRenameAndDataTypeChangePostEvent,
+  AlterTableColRenameAndDataTypeChangePreEvent, OperationContext, 
OperationListenerBus}
+import org.apache.carbondata.format.{ColumnSchema, SchemaEvolutionEntry, 
TableInfo}
+import org.apache.carbondata.spark.util.DataTypeConverterUtil
+
+abstract class CarbonAlterTableColumnRenameCommand(oldColumnName: String, 
newColumnName: String)
+  extends MetadataCommand {
+
+  protected def validColumnsForRenaming(carbonColumns: 
mutable.Buffer[CarbonColumn],
+  oldCarbonColumn: CarbonColumn,
+  carbonTable: CarbonTable): Unit = {
+// check whether new column name is already an existing column name
+if 
(carbonColumns.exists(_.getColName.equalsIgnoreCase(newColumnName))) {
+  throw new MalformedCarbonCommandException(s"Column Rename Operation 
failed. New " +
+s"column name 
$newColumnName already exists" +
+s" in table ${ 
carbonTable.getTableName }")
+}
+
+// if the column rename is for complex column, block the operation
+if (oldCarbonColumn.isComplex) {
+  throw new MalformedCarbonCommandException(s"Column Rename Operation 
failed. Rename " +
+s"column is unsupported 
for complex datatype " +
+s"column ${ 
oldCarbonColumn.getColName }")
+}
+
+// if column rename operation is on partition column, then fail the 
rename operation
+if (null != carbonTable.getPartitionInfo) {
+  val partitionColumns = 
carbonTable.getPartitionInfo.getColumnSchemaList
+  partitionColumns.asScala.foreach {
+col =>
+  if (col.getColumnName.equalsIgnoreCase(oldColumnName)) {
+throw new MalformedCarbonCommandException(
+  s"Column Rename Operation failed. Renaming " +
+  s"the partition column $newColumnName is not " +
+  s"allowed")
+  }
+  }
+}
+
+  }
+}
+
+private[sql] case class CarbonAlterTableColRenameDataTypeChangeCommand(
+alterTableColRenameAndDataTypeChangeModel: 
AlterTableDataTypeChangeModel,
+childTableColumnRename: Boolean = false)
+  extends 
CarbonAlterTableColumnRenameCommand(alterTableColRenameAndDataTypeChangeModel.columnName,
+alterTableColRenameAndDataTypeChangeModel.newColu

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243245093
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
 ---
@@ -1511,7 +1514,16 @@ abstract class CarbonDDLSqlParser extends 
AbstractCarbonSparkSQLParser {
 }
 DataTypeInfo("decimal", precision, scale)
   case _ =>
-throw new MalformedCarbonCommandException("Data type provided is 
invalid.")
+if (isColumnRename) {
+  dataType match {
--- End diff --

why here match `dataType` again which is already match in line 1496. Seems 
this func need to be refactored


---

[GitHub] carbondata pull request #2990: [CARBONDATA-3149]Support alter table column r...

2018-12-20 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2990#discussion_r243242285
  
--- Diff: 
integration/spark-common/src/main/scala/org/apache/spark/sql/catalyst/CarbonDDLSqlParser.scala
 ---
@@ -1487,16 +1487,19 @@ abstract class CarbonDDLSqlParser extends 
AbstractCarbonSparkSQLParser {
* @param values
* @return
*/
-  def parseDataType(dataType: String, values: Option[List[(Int, Int)]]): 
DataTypeInfo = {
+  def parseDataType(
+  dataType: String,
+  values: Option[List[(Int, Int)]],
+  isColumnRename: Boolean): DataTypeInfo = {
--- End diff --

complete the comment start from line 1486


---

[GitHub] carbondata pull request #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exce...

2018-12-19 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2969#discussion_r243150967
  
--- Diff: 
integration/hive/src/main/scala/org/apache/carbondata/hiveexample/HiveExample.scala
 ---
@@ -22,52 +22,49 @@ import java.sql.{DriverManager, ResultSet, Statement}
 import org.apache.spark.sql.SparkSession
--- End diff --

Please move this example to example folder, I suggest to put all example in 
one module, so instead of adding another example module for hive, I think it is 
better add it in existing example module and rename it to carbondata-example 
from carbondata-example-spark2


---

[GitHub] carbondata pull request #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exce...

2018-12-19 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2969#discussion_r243149546
  
--- Diff: 
integration/hive/src/main/scala/org/apache/carbondata/hiveexample/HiveExample.scala
 ---
@@ -85,25 +82,35 @@ object HiveExample {
 
 logger.info(s"HIVE CLI IS STARTED ON PORT $port 
==")
 
-statement.execute("CREATE TABLE IF NOT EXISTS " + "HIVE_CARBON_EXAMPLE 
" +
-  " (ID int, NAME string,SALARY double)")
-statement
-  .execute(
-"ALTER TABLE HIVE_CARBON_EXAMPLE SET FILEFORMAT INPUTFORMAT 
\"org.apache.carbondata." +
-"hive.MapredCarbonInputFormat\"OUTPUTFORMAT 
\"org.apache.carbondata.hive." +
-"MapredCarbonOutputFormat\"SERDE \"org.apache.carbondata.hive." +
-"CarbonHiveSerDe\" ")
+statement.execute(
+  s"""
+ | CREATE TABLE IF NOT EXISTS HIVE_CARBON_EXAMPLE
+ | (ID int, NAME string,SALARY double)
+ | ROW FORMAT SERDE 'org.apache.carbondata.hive.CarbonHiveSerDe'
+ | WITH SERDEPROPERTIES 
('mapreduce.input.carboninputformat.databaseName'='default',
+ | 
'mapreduce.input.carboninputformat.tableName'='HIVE_CARBON_EXAMPLE')
+   """.stripMargin)
+
+statement.execute(
+  s"""
+ | ALTER TABLE HIVE_CARBON_EXAMPLE
+ | SET FILEFORMAT
+ | INPUTFORMAT 
\"org.apache.carbondata.hive.MapredCarbonInputFormat\"
--- End diff --

This is fine for this PR, and after merging #3004 this ALTER TABLE will not 
be required


---

[GitHub] carbondata pull request #2969: [CARBONDATA-3127]Fix the TestCarbonSerde exce...

2018-12-19 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2969#discussion_r243149331
  
--- Diff: 
integration/hive/src/main/scala/org/apache/carbondata/hiveexample/HiveExample.scala
 ---
@@ -49,17 +49,19 @@ object HiveExample {
 
 sparkSession.sql("""DROP TABLE IF EXISTS 
HIVE_CARBON_EXAMPLE""".stripMargin)
 
-sparkSession
-  .sql(
-"CREATE TABLE HIVE_CARBON_EXAMPLE (ID int,NAME string,SALARY 
double) " +
-  "STORED BY 'CARBONDATA' ")
+sparkSession.sql(
+  s"""
+ | CREATE TABLE HIVE_CARBON_EXAMPLE
+ | (ID int,NAME string,SALARY double)
+ | STORED BY 'CARBONDATA'
--- End diff --

```suggestion
 | STORED AS CARBONDATA
```


---

[GitHub] carbondata issue #2925: [CARBONDATA-3102] Fix NoClassDefFoundError when use ...

2018-12-15 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2925
  
LGTM


---

[GitHub] carbondata pull request #2984: [CARBONDATA-3165]Protection of Bloom Null Exc...

2018-12-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2984#discussion_r241938979
  
--- Diff: 
datamap/bloom/src/main/java/org/apache/carbondata/datamap/bloom/BloomCoarseGrainDataMap.java
 ---
@@ -227,6 +231,12 @@ private String getAncestorTablePath(CarbonTable 
currentTable) {
 }
   }
 }
+if (hitBlocklets == null) {
+  LOGGER.warn(String.format("HitBlocklets is empty in bloom filter 
prune method. " +
--- End diff --

can you make it non-null, such as making it an empty set


---

[GitHub] carbondata pull request #2925: [CARBONDATA-3102] Fix NoClassDefFoundError wh...

2018-12-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2925#discussion_r241774500
  
--- Diff: integration/spark2/pom.xml ---
@@ -134,6 +134,11 @@
 
   
 
+
--- End diff --

I see `org.apache.httpcomponents` is already introduced in other modules, 
please unify the version for this jar in all places 


---

[GitHub] carbondata issue #2973: [WIP][CARBONDATA-3144] CarbonData support spark-2.4....

2018-12-14 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2973
  
Please close this if you are not working on it. Thanks


---

[GitHub] carbondata pull request #2925: [CARBONDATA-3102] Fix NoClassDefFoundError wh...

2018-12-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2925#discussion_r241736440
  
--- Diff: integration/spark2/pom.xml ---
@@ -134,6 +134,11 @@
 
   
 
+
--- End diff --

I have never encounter this problem before, can you describe the scenario 
in more detail, in what case NoClassDefFoundError will be thrown?


---

[GitHub] carbondata pull request #2925: [CARBONDATA-3102] Fix NoClassDefFoundError wh...

2018-12-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2925#discussion_r241736214
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/carbondata/spark/thriftserver/CarbonThriftServer.scala
 ---
@@ -48,8 +48,13 @@ object CarbonThriftServer {
   System.exit(0)
 }
 
+val master = Option(System.getProperty("spark.master"))
--- End diff --

I think better to let user give --master local, user can control this.


---

[GitHub] carbondata pull request #2890: [CARBONDATA-3002] Fix some spell error

2018-12-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2890#discussion_r241735498
  
--- Diff: 
processing/src/main/java/org/apache/carbondata/processing/util/CarbonLoaderUtil.java
 ---
@@ -569,25 +569,25 @@ public static Dictionary 
getDictionary(AbsoluteTableIdentifier absoluteTableIden
   nodes.add(relation.getNode());
 }
 
-int noofNodes = (-1 == noOfNodesInput) ? nodes.size() : noOfNodesInput;
+int noOfNodes = (-1 == noOfNodesInput) ? nodes.size() : noOfNodesInput;
--- End diff --

```suggestion
int numNodes = (-1 == noOfNodesInput) ? nodes.size() : numNodesInput;
```


---

[GitHub] carbondata pull request #2914: [CARBONDATA-3093] Provide property builder fo...

2018-12-12 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2914#discussion_r241285417
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -40,1372 +41,1024 @@ private CarbonCommonConstants() {
   @CarbonProperty
   public static final String STORE_LOCATION = "carbon.storelocation";
 
-  /**
-   * blocklet size in carbon file
-   */
-  @CarbonProperty
-  public static final String BLOCKLET_SIZE = "carbon.blocklet.size";
-
-  /**
-   * default blocklet size
-   */
-  public static final String BLOCKLET_SIZE_DEFAULT_VAL = "12";
-
-  /**
-   * min blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MIN_VAL = 2000;
-
-  /**
-   * max blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MAX_VAL = 1200;
-
-  /**
-   * min block size in MB
-   */
-  public static final int BLOCK_SIZE_MIN_VAL = 1;
-
-  /**
-   * max block size in MB
-   */
-  public static final int BLOCK_SIZE_MAX_VAL = 2048;
-
-  /**
-   * carbon properties file path
-   */
-  @CarbonProperty
-  public static final String CARBON_PROPERTIES_FILE_PATH = 
"carbon.properties.filepath";
-
-  /**
-   * default carbon properties file path
-   */
-  public static final String CARBON_PROPERTIES_FILE_PATH_DEFAULT =
-  "../../../conf/carbon.properties";
-
-  /**
-   * CARBON_DDL_BASE_HDFS_URL
-   */
-  @CarbonProperty
-  public static final String CARBON_DDL_BASE_HDFS_URL = 
"carbon.ddl.base.hdfs.url";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION
-   */
-  @CarbonProperty
-  public static final String CARBON_BADRECORDS_LOC = 
"carbon.badRecords.location";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION_DEFAULT
-   */
-  public static final String CARBON_BADRECORDS_LOC_DEFAULT_VAL = "";
-
-  /**
-   * Property for specifying the format of TIMESTAMP data type column.
-   * e.g. /MM/dd HH:mm:ss, or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_TIMESTAMP_FORMAT = 
"carbon.timestamp.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_TIMESTAMP_DEFAULT_FORMAT = "-MM-dd 
HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP = "dd-MM- HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP_MILLIS = "dd-MM- 
HH:mm:ss:SSS";
-
-  /**
-   * Property for specifying the format of DATE data type column.
-   * e.g. /MM/dd , or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_DATE_FORMAT = "carbon.date.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_DATE_DEFAULT_FORMAT = "-MM-dd";
-
-  /**
-   * compressor for writing/reading CarbonData file
-   */
-  @CarbonProperty
-  public static final String COMPRESSOR = "carbon.column.compressor";
-
-  /**
-   * default compressor is snappy
-   */
-  public static final String DEFAULT_COMPRESSOR = "snappy";
-
-  /**
-   * ZOOKEEPER_ENABLE_LOCK if this is set to true then zookeeper
-   * will be used to handle locking
-   * mechanism of carbon
-   */
-  @CarbonProperty
-  public static final String LOCK_TYPE = "carbon.lock.type";
-
-  /**
-   * ZOOKEEPER_ENABLE_DEFAULT the default value for zookeeper will be true 
for carbon
-   */
-  public static final String LOCK_TYPE_DEFAULT = "LOCALLOCK";
-
-  /**
-   * Specifies the path where the lock files have to be created.
-   * By default, lock files are created in table path.
-   */
-  @CarbonProperty
-  public static final String LOCK_PATH = "carbon.lock.path";
-
-  public static final String LOCK_PATH_DEFAULT = "";
-
-  /**
-   * ZOOKEEPER_LOCATION this is the location in zookeeper file system 
where locks are created.
-   * mechanism of carbon
-   */
-  public static final String ZOOKEEPER_LOCATION = "/CarbonLocks";
-
-  /**
-   * xxhash algorithm property for hashmap
-   */
-  @CarbonProperty
-  public static final String ENABLE_XXHASH = "carbon.enableXXHash";
-
-  /**
-   * xxhash algorithm property for hashmap Default value false
-   */
-  public static final String ENABLE_XXHASH_DEFAULT = "true&qu

[GitHub] carbondata pull request #2914: [CARBONDATA-3093] Provide property builder fo...

2018-12-12 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2914#discussion_r241285278
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -40,1372 +41,1024 @@ private CarbonCommonConstants() {
   @CarbonProperty
   public static final String STORE_LOCATION = "carbon.storelocation";
 
-  /**
-   * blocklet size in carbon file
-   */
-  @CarbonProperty
-  public static final String BLOCKLET_SIZE = "carbon.blocklet.size";
-
-  /**
-   * default blocklet size
-   */
-  public static final String BLOCKLET_SIZE_DEFAULT_VAL = "12";
-
-  /**
-   * min blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MIN_VAL = 2000;
-
-  /**
-   * max blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MAX_VAL = 1200;
-
-  /**
-   * min block size in MB
-   */
-  public static final int BLOCK_SIZE_MIN_VAL = 1;
-
-  /**
-   * max block size in MB
-   */
-  public static final int BLOCK_SIZE_MAX_VAL = 2048;
-
-  /**
-   * carbon properties file path
-   */
-  @CarbonProperty
-  public static final String CARBON_PROPERTIES_FILE_PATH = 
"carbon.properties.filepath";
-
-  /**
-   * default carbon properties file path
-   */
-  public static final String CARBON_PROPERTIES_FILE_PATH_DEFAULT =
-  "../../../conf/carbon.properties";
-
-  /**
-   * CARBON_DDL_BASE_HDFS_URL
-   */
-  @CarbonProperty
-  public static final String CARBON_DDL_BASE_HDFS_URL = 
"carbon.ddl.base.hdfs.url";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION
-   */
-  @CarbonProperty
-  public static final String CARBON_BADRECORDS_LOC = 
"carbon.badRecords.location";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION_DEFAULT
-   */
-  public static final String CARBON_BADRECORDS_LOC_DEFAULT_VAL = "";
-
-  /**
-   * Property for specifying the format of TIMESTAMP data type column.
-   * e.g. /MM/dd HH:mm:ss, or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_TIMESTAMP_FORMAT = 
"carbon.timestamp.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_TIMESTAMP_DEFAULT_FORMAT = "-MM-dd 
HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP = "dd-MM- HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP_MILLIS = "dd-MM- 
HH:mm:ss:SSS";
-
-  /**
-   * Property for specifying the format of DATE data type column.
-   * e.g. /MM/dd , or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_DATE_FORMAT = "carbon.date.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_DATE_DEFAULT_FORMAT = "-MM-dd";
-
-  /**
-   * compressor for writing/reading CarbonData file
-   */
-  @CarbonProperty
-  public static final String COMPRESSOR = "carbon.column.compressor";
-
-  /**
-   * default compressor is snappy
-   */
-  public static final String DEFAULT_COMPRESSOR = "snappy";
-
-  /**
-   * ZOOKEEPER_ENABLE_LOCK if this is set to true then zookeeper
-   * will be used to handle locking
-   * mechanism of carbon
-   */
-  @CarbonProperty
-  public static final String LOCK_TYPE = "carbon.lock.type";
-
-  /**
-   * ZOOKEEPER_ENABLE_DEFAULT the default value for zookeeper will be true 
for carbon
-   */
-  public static final String LOCK_TYPE_DEFAULT = "LOCALLOCK";
-
-  /**
-   * Specifies the path where the lock files have to be created.
-   * By default, lock files are created in table path.
-   */
-  @CarbonProperty
-  public static final String LOCK_PATH = "carbon.lock.path";
-
-  public static final String LOCK_PATH_DEFAULT = "";
-
-  /**
-   * ZOOKEEPER_LOCATION this is the location in zookeeper file system 
where locks are created.
-   * mechanism of carbon
-   */
-  public static final String ZOOKEEPER_LOCATION = "/CarbonLocks";
-
-  /**
-   * xxhash algorithm property for hashmap
-   */
-  @CarbonProperty
-  public static final String ENABLE_XXHASH = "carbon.enableXXHash";
-
-  /**
-   * xxhash algorithm property for hashmap Default value false
-   */
-  public static final String ENABLE_XXHASH_DEFAULT = "true&qu

[GitHub] carbondata pull request #2914: [CARBONDATA-3093] Provide property builder fo...

2018-12-12 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2914#discussion_r241285056
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -40,1372 +41,1024 @@ private CarbonCommonConstants() {
   @CarbonProperty
   public static final String STORE_LOCATION = "carbon.storelocation";
 
-  /**
-   * blocklet size in carbon file
-   */
-  @CarbonProperty
-  public static final String BLOCKLET_SIZE = "carbon.blocklet.size";
-
-  /**
-   * default blocklet size
-   */
-  public static final String BLOCKLET_SIZE_DEFAULT_VAL = "12";
-
-  /**
-   * min blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MIN_VAL = 2000;
-
-  /**
-   * max blocklet size
-   */
-  public static final int BLOCKLET_SIZE_MAX_VAL = 1200;
-
-  /**
-   * min block size in MB
-   */
-  public static final int BLOCK_SIZE_MIN_VAL = 1;
-
-  /**
-   * max block size in MB
-   */
-  public static final int BLOCK_SIZE_MAX_VAL = 2048;
-
-  /**
-   * carbon properties file path
-   */
-  @CarbonProperty
-  public static final String CARBON_PROPERTIES_FILE_PATH = 
"carbon.properties.filepath";
-
-  /**
-   * default carbon properties file path
-   */
-  public static final String CARBON_PROPERTIES_FILE_PATH_DEFAULT =
-  "../../../conf/carbon.properties";
-
-  /**
-   * CARBON_DDL_BASE_HDFS_URL
-   */
-  @CarbonProperty
-  public static final String CARBON_DDL_BASE_HDFS_URL = 
"carbon.ddl.base.hdfs.url";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION
-   */
-  @CarbonProperty
-  public static final String CARBON_BADRECORDS_LOC = 
"carbon.badRecords.location";
-
-  /**
-   * CARBON_BADRECORDS_LOCATION_DEFAULT
-   */
-  public static final String CARBON_BADRECORDS_LOC_DEFAULT_VAL = "";
-
-  /**
-   * Property for specifying the format of TIMESTAMP data type column.
-   * e.g. /MM/dd HH:mm:ss, or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_TIMESTAMP_FORMAT = 
"carbon.timestamp.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_TIMESTAMP_DEFAULT_FORMAT = "-MM-dd 
HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP = "dd-MM- HH:mm:ss";
-
-  /**
-   * CARBON_TIMESTAMP
-   */
-  public static final String CARBON_TIMESTAMP_MILLIS = "dd-MM- 
HH:mm:ss:SSS";
-
-  /**
-   * Property for specifying the format of DATE data type column.
-   * e.g. /MM/dd , or using default value
-   */
-  @CarbonProperty
-  public static final String CARBON_DATE_FORMAT = "carbon.date.format";
-
-  /**
-   * default value
-   */
-  public static final String CARBON_DATE_DEFAULT_FORMAT = "-MM-dd";
-
-  /**
-   * compressor for writing/reading CarbonData file
-   */
-  @CarbonProperty
-  public static final String COMPRESSOR = "carbon.column.compressor";
-
-  /**
-   * default compressor is snappy
-   */
-  public static final String DEFAULT_COMPRESSOR = "snappy";
-
-  /**
-   * ZOOKEEPER_ENABLE_LOCK if this is set to true then zookeeper
-   * will be used to handle locking
-   * mechanism of carbon
-   */
-  @CarbonProperty
-  public static final String LOCK_TYPE = "carbon.lock.type";
-
-  /**
-   * ZOOKEEPER_ENABLE_DEFAULT the default value for zookeeper will be true 
for carbon
-   */
-  public static final String LOCK_TYPE_DEFAULT = "LOCALLOCK";
-
-  /**
-   * Specifies the path where the lock files have to be created.
-   * By default, lock files are created in table path.
-   */
-  @CarbonProperty
-  public static final String LOCK_PATH = "carbon.lock.path";
-
-  public static final String LOCK_PATH_DEFAULT = "";
-
-  /**
-   * ZOOKEEPER_LOCATION this is the location in zookeeper file system 
where locks are created.
-   * mechanism of carbon
-   */
-  public static final String ZOOKEEPER_LOCATION = "/CarbonLocks";
-
-  /**
-   * xxhash algorithm property for hashmap
-   */
-  @CarbonProperty
-  public static final String ENABLE_XXHASH = "carbon.enableXXHash";
-
-  /**
-   * xxhash algorithm property for hashmap Default value false
-   */
-  public static final String ENABLE_XXHASH_DEFAULT = "true&qu

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-11 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
@chenliang613 column vector code (CarbonColumnVector interface) and a base 
implementation (CarbonColumnVectorImpl class) is in carbon-core module, but 
still for every engine integration layer, they need to be adapt to the compute 
engine. 


---

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-11 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
LGTM


---

[GitHub] carbondata issue #2978: [CARBONDATA-3157] Added lazy load and direct vector ...

2018-12-11 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2978
  
LGTM


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240530952
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
+  /**
+   * It is adapter class of complete ColumnarBatch.
+   */
+  protected CarbonColumnVectorImpl columnVector;
+
+  private DataType blockDataType;
+
+  private CarbonColumnVector dictionaryVector;
+
+  private BitSet nullBitset;
+
+  ColumnarVectorWrapperDirect(CarbonColumnVectorImpl columnVector) {
+this.columnVector = columnVector;
+this.dictionaryVector = columnVector.getDictionaryVector();
+this.nullBitset = new BitSet();
+  }
+
+  @Override public void setNullBits(BitSet nullBits) {
+this.nullBitset = nullBits;
+  }
+
+  @Override public void putBoolean(int rowId, boolean value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putBoolean(rowId, value);
+}
+  }
+
+  @Override public void putFloat(int rowId, float value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putFloat(rowId, value);
+}
+  }
+
+  @Override public void putShort(int rowId, short value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putShort(rowId, value);
+}
+  }
+
+  @Override public void putShorts(int rowId, int count, short value) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putShort(rowId, value);
+  }
+  rowId++;
+}
+
+  }
+
+  @Override public void putInt(int rowId, int value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putInt(rowId, value);
+}
+  }
+
+  @Override public void putInts(int rowId, int count, int value) {
+columnVector.putInts(rowId, count, value);
+  }
+
+  @Override public void putLong(int rowId, long value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putLong(rowId, value);
+}
+  }
+
+  @Override public void putLongs(int rowId, int count, long value) {
+columnVector.putLongs(rowId, count, value);
+  }
+
+  @Override public void putDecimal(int rowId, BigDecimal value, int 
precision) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putDecimal(rowId, value, precision);
+}
+  }
+
+  @Override public void putDecimals(int rowId, int count, BigDecimal 
value, int precision) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putDecimal(rowId, value, precision);

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240530928
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
+  /**
+   * It is adapter class of complete ColumnarBatch.
+   */
+  protected CarbonColumnVectorImpl columnVector;
+
+  private DataType blockDataType;
+
+  private CarbonColumnVector dictionaryVector;
+
+  private BitSet nullBitset;
+
+  ColumnarVectorWrapperDirect(CarbonColumnVectorImpl columnVector) {
+this.columnVector = columnVector;
+this.dictionaryVector = columnVector.getDictionaryVector();
+this.nullBitset = new BitSet();
+  }
+
+  @Override public void setNullBits(BitSet nullBits) {
+this.nullBitset = nullBits;
+  }
+
+  @Override public void putBoolean(int rowId, boolean value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putBoolean(rowId, value);
+}
+  }
+
+  @Override public void putFloat(int rowId, float value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putFloat(rowId, value);
+}
+  }
+
+  @Override public void putShort(int rowId, short value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putShort(rowId, value);
+}
+  }
+
+  @Override public void putShorts(int rowId, int count, short value) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putShort(rowId, value);
+  }
+  rowId++;
+}
+
+  }
+
+  @Override public void putInt(int rowId, int value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putInt(rowId, value);
+}
+  }
+
+  @Override public void putInts(int rowId, int count, int value) {
+columnVector.putInts(rowId, count, value);
+  }
+
+  @Override public void putLong(int rowId, long value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putLong(rowId, value);
+}
+  }
+
+  @Override public void putLongs(int rowId, int count, long value) {
+columnVector.putLongs(rowId, count, value);
+  }
+
+  @Override public void putDecimal(int rowId, BigDecimal value, int 
precision) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putDecimal(rowId, value, precision);
+}
+  }
+
+  @Override public void putDecimals(int rowId, int count, BigDecimal 
value, int precision) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putDecimal(rowId, value, precision);

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240531041
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
+  /**
+   * It is adapter class of complete ColumnarBatch.
+   */
+  protected CarbonColumnVectorImpl columnVector;
+
+  private DataType blockDataType;
+
+  private CarbonColumnVector dictionaryVector;
+
+  private BitSet nullBitset;
+
+  ColumnarVectorWrapperDirect(CarbonColumnVectorImpl columnVector) {
+this.columnVector = columnVector;
+this.dictionaryVector = columnVector.getDictionaryVector();
+this.nullBitset = new BitSet();
+  }
+
+  @Override public void setNullBits(BitSet nullBits) {
+this.nullBitset = nullBits;
+  }
+
+  @Override public void putBoolean(int rowId, boolean value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putBoolean(rowId, value);
+}
+  }
+
+  @Override public void putFloat(int rowId, float value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putFloat(rowId, value);
+}
+  }
+
+  @Override public void putShort(int rowId, short value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putShort(rowId, value);
+}
+  }
+
+  @Override public void putShorts(int rowId, int count, short value) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putShort(rowId, value);
+  }
+  rowId++;
+}
+
+  }
+
+  @Override public void putInt(int rowId, int value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putInt(rowId, value);
+}
+  }
+
+  @Override public void putInts(int rowId, int count, int value) {
+columnVector.putInts(rowId, count, value);
+  }
+
+  @Override public void putLong(int rowId, long value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putLong(rowId, value);
+}
+  }
+
+  @Override public void putLongs(int rowId, int count, long value) {
+columnVector.putLongs(rowId, count, value);
+  }
+
+  @Override public void putDecimal(int rowId, BigDecimal value, int 
precision) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putDecimal(rowId, value, precision);
+}
+  }
+
+  @Override public void putDecimals(int rowId, int count, BigDecimal 
value, int precision) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putDecimal(rowId, value, precision);

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240530272
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
+  /**
+   * It is adapter class of complete ColumnarBatch.
+   */
+  protected CarbonColumnVectorImpl columnVector;
+
+  private DataType blockDataType;
+
+  private CarbonColumnVector dictionaryVector;
+
+  private BitSet nullBitset;
+
+  ColumnarVectorWrapperDirect(CarbonColumnVectorImpl columnVector) {
+this.columnVector = columnVector;
+this.dictionaryVector = columnVector.getDictionaryVector();
+this.nullBitset = new BitSet();
+  }
+
+  @Override public void setNullBits(BitSet nullBits) {
+this.nullBitset = nullBits;
+  }
+
+  @Override public void putBoolean(int rowId, boolean value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putBoolean(rowId, value);
+}
+  }
+
+  @Override public void putFloat(int rowId, float value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putFloat(rowId, value);
+}
+  }
+
+  @Override public void putShort(int rowId, short value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putShort(rowId, value);
+}
+  }
+
+  @Override public void putShorts(int rowId, int count, short value) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putShort(rowId, value);
+  }
+  rowId++;
+}
+
+  }
+
+  @Override public void putInt(int rowId, int value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putInt(rowId, value);
+}
+  }
+
+  @Override public void putInts(int rowId, int count, int value) {
+columnVector.putInts(rowId, count, value);
+  }
+
+  @Override public void putLong(int rowId, long value) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putLong(rowId, value);
+}
+  }
+
+  @Override public void putLongs(int rowId, int count, long value) {
+columnVector.putLongs(rowId, count, value);
+  }
+
+  @Override public void putDecimal(int rowId, BigDecimal value, int 
precision) {
+if (nullBitset.get(rowId)) {
+  columnVector.putNull(rowId);
+} else {
+  columnVector.putDecimal(rowId, value, precision);
+}
+  }
+
+  @Override public void putDecimals(int rowId, int count, BigDecimal 
value, int precision) {
+for (int i = 0; i < count; i++) {
+  if (nullBitset.get(rowId)) {
+columnVector.putNull(rowId);
+  } else {
+columnVector.putDecimal(rowId, value, precision);

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240528243
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
--- End diff --

remove extra empty line


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240528010
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
+
+
+  /**
+   * It is adapter class of complete ColumnarBatch.
+   */
+  protected CarbonColumnVectorImpl columnVector;
+
+  private DataType blockDataType;
--- End diff --

it is column's datatype?


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240528169
  
--- Diff: 
integration/presto/src/main/java/org/apache/carbondata/presto/ColumnarVectorWrapperDirect.java
 ---
@@ -0,0 +1,320 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.presto;
+
+import java.math.BigDecimal;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.CarbonDictionary;
+import 
org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import 
org.apache.carbondata.core.scan.result.vector.impl.directread.SequentialFill;
+import org.apache.carbondata.core.scan.scanner.LazyPageLoader;
+
+/**
+ * Fills the vector directly with out considering any deleted rows.
+ */
+class ColumnarVectorWrapperDirect implements 
CarbonColumnVector,SequentialFill {
--- End diff --

add space after ','


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240526032
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
 ---
@@ -367,7 +378,14 @@ public void setBlockDataType(DataType blockDataType) {
   }
 
   @Override public void setLazyPage(LazyPageLoader lazyPage) {
-lazyPage.loadPage();
+this.lazyPage = lazyPage;
+  }
+
+  public void loadPage() {
+if (lazyPage != null) {
+  lazyPage.loadPage();
+}
+loaded = true;
--- End diff --

if lazyPath is null, still set to true?


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240524512
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ---
@@ -83,20 +84,37 @@ public StringVectorFiller(int numberOfRows, int 
actualDataLength) {
   @Override
   public void fillVector(byte[] data, CarbonColumnVector vector) {
 // start position will be used to store the current data position
+boolean invertedIndex = vector instanceof 
ColumnarVectorWrapperDirectWithInvertedIndex
+|| vector instanceof SequentialFill;
+
 int localOffset = 0;
 ByteUtil.UnsafeComparer comparator = ByteUtil.UnsafeComparer.INSTANCE;
-for (int i = 0; i < numberOfRows; i++) {
-  int length = (((data[localOffset] & 0xFF) << 8) | (data[localOffset 
+ 1] & 0xFF));
-  localOffset += 2;
-  if 
(comparator.equals(CarbonCommonConstants.MEMBER_DEFAULT_VAL_ARRAY, 0,
-  CarbonCommonConstants.MEMBER_DEFAULT_VAL_ARRAY.length, data, 
localOffset, length)) {
-vector.putNull(i);
-  } else {
-vector.putArray(i, localOffset, length);
+if (invertedIndex) {
--- End diff --

can you add comment for this if-else branch


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240524219
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/chunk/store/impl/safe/AbstractNonDictionaryVectorFiller.java
 ---
@@ -83,20 +84,37 @@ public StringVectorFiller(int numberOfRows, int 
actualDataLength) {
   @Override
   public void fillVector(byte[] data, CarbonColumnVector vector) {
 // start position will be used to store the current data position
+boolean invertedIndex = vector instanceof 
ColumnarVectorWrapperDirectWithInvertedIndex
+|| vector instanceof SequentialFill;
--- End diff --

It seems SequentialFill is not subclass of CarbonColumnVector?


---

[GitHub] carbondata pull request #2978: [CARBONDATA-3157] Added lazy load and direct ...

2018-12-11 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2978#discussion_r240519199
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/directread/SequentialFill.java
 ---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.scan.result.vector.impl.directread;
+
+import java.util.BitSet;
+
+/**
+ * It is sort of a marker interface to let execution engine know that it 
is appendable/sequential
+ * data adding vector. It means we cannot add random rowids to it.
+ */
+public interface SequentialFill {
--- End diff --

Please add @InterfaceAudience annotation


---

[GitHub] carbondata issue #2940: [CARBONDATA-3116] Support set carbon.query.directQue...

2018-12-11 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2940
  
LGTM


---

[GitHub] carbondata pull request #2941: [WIP] Support CarbonSession from pyspark shel...

2018-11-24 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2941#discussion_r236035510
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/__init__.py ---
@@ -0,0 +1,6 @@
+from __future__ import absolute_import
+
+
+__all__ = [
--- End diff --

I think it is better to add example and doc to describe how to use it in 
the same PR


---

[GitHub] carbondata pull request #2941: [WIP] Support CarbonSession from pyspark shel...

2018-11-24 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2941#discussion_r236035500
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.py ---
@@ -0,0 +1,122 @@
+from __future__ import print_function
+import sys
+from threading import RLock
+
+if sys.version >= '3':
+basestring = unicode = str
+xrange = range
+else:
+from itertools import izip as zip, imap as map
+
+from pyspark.sql.conf import RuntimeConfig
+
+__all__ = ["CarbonSession"]
+
+
+class CarbonSession(object):
--- End diff --

please add comment to this file


---

[GitHub] carbondata pull request #2941: [WIP] Support CarbonSession from pyspark shel...

2018-11-24 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2941#discussion_r236035495
  
--- Diff: 
integration/spark2/src/main/scala/org/apache/spark/sql/CarbonSession.py ---
@@ -0,0 +1,122 @@
+from __future__ import print_function
--- End diff --

license header is required


---

[GitHub] carbondata pull request #2940: [CARBONDATA-3116] Support set carbon.query.di...

2018-11-21 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2940#discussion_r235494866
  
--- Diff: integration/spark2/pom.xml ---
@@ -105,6 +105,11 @@
 
   
 
+
+  org.apache.httpcomponents
--- End diff --

Is this required in this PR?


---

[GitHub] carbondata issue #2936: [WIP] Parallelize block pruning of default datamap i...

2018-11-21 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2936
  
please describe this PR


---

[GitHub] carbondata issue #2926: [HOTFIX] Reduce blocklet minimum configurable size

2018-11-21 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2926
  
LGTM


---

[GitHub] carbondata issue #2932: [HOTFIX]Fix Describe Formatted Testcases

2018-11-21 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2932
  
LGTM
can be merged after CI passed


---

[GitHub] carbondata issue #2872: [WIP] Added reusable buffer code

2018-11-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2872
  
LGTM
please change the title to incluce jira ticket number


---

[GitHub] carbondata issue #2863: [WIP] Optimise decompressing while filling the vecto...

2018-11-20 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2863
  
LGTM


---

[GitHub] carbondata issue #2917: [WIP]Show load/insert/update/delete row number

2018-11-18 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2917
  
This is a good feature, will it show the number of rows got deleted or 
updated?
hope it can get into next version.


---

[GitHub] carbondata pull request #2926: [HOTFIX] Reduce blocklet minimum configurable...

2018-11-16 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2926#discussion_r234399060
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonV3DataFormatConstants.java
 ---
@@ -35,9 +35,9 @@
   String BLOCKLET_SIZE_IN_MB_DEFAULT_VALUE = "64";
 
   /**
-   * blocklet group size min value
+   * blocklet size min value
--- End diff --

since you are modifying this, can you please modify line 33, 27 also


---

[GitHub] carbondata issue #2926: [HOTFIX] Reduce blocklet minimum configurable size

2018-11-16 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2926
  
`table_blocklet_size` is a table property that can be used in create table


---

[GitHub] carbondata pull request #2872: [WIP] Added reusable buffer code

2018-11-16 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2872#discussion_r234209828
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/ReusableDataBuffer.java 
---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore;
+
+public class ReusableDataBuffer {
+
+  private byte[] dataBuffer;
+
+  private int size;
+
+  public byte[] getDataBuffer(int requestedSize) {
--- End diff --

add comment


---

[GitHub] carbondata pull request #2872: [WIP] Added reusable buffer code

2018-11-16 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2872#discussion_r234209700
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/ReusableDataBuffer.java 
---
@@ -0,0 +1,33 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore;
+
+public class ReusableDataBuffer {
--- End diff --

add interface annotation also


---

[GitHub] carbondata issue #2922: [WIP]s3 lock file fix

2018-11-15 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2922
  
Is this for CARBONDATA-3103?


---

[GitHub] carbondata pull request #2920: [HOTFIX] Improve log message in CarbonWriterB...

2018-11-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2920#discussion_r233470872
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java 
---
@@ -438,13 +438,13 @@ public CarbonWriter build() throws IOException, 
InvalidLoadOptionException {
 Objects.requireNonNull(path, "path should not be null");
 if (this.writerType == null) {
   throw new IOException(
-  "Writer type is not set, use withCsvInput() or withAvroInput() 
or withJsonInput()  "
+  "'writerType' must be set, use withCsvInput() or withAvroInput() 
or withJsonInput()  "
   + "API based on input");
 }
 if (this.writtenByApp == null || this.writtenByApp.isEmpty()) {
   throw new RuntimeException(
-  "AppName is not set, please use writtenBy() API to set the App 
Name"
-  + "which is using SDK");
+  "'writtenBy' must be set when writting carbon files, use 
writtenBy() API to "
+  + "set it, it can be the application name which using the 
SDK");
--- End diff --

```suggestion
  + "set it, it can be the name of the application which using 
the SDK");
```


---

[GitHub] carbondata pull request #2920: [HOTFIX] Improve log message in CarbonWriterB...

2018-11-14 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2920#discussion_r233437690
  
--- Diff: 
store/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java 
---
@@ -438,13 +438,13 @@ public CarbonWriter build() throws IOException, 
InvalidLoadOptionException {
 Objects.requireNonNull(path, "path should not be null");
 if (this.writerType == null) {
   throw new IOException(
-  "Writer type is not set, use withCsvInput() or withAvroInput() 
or withJsonInput()  "
+  "'writerType' must be set, use withCsvInput() or withAvroInput() 
or withJsonInput()  "
   + "API based on input");
 }
 if (this.writtenByApp == null || this.writtenByApp.isEmpty()) {
   throw new RuntimeException(
-  "AppName is not set, please use writtenBy() API to set the App 
Name"
-  + "which is using SDK");
+  "'writtenBy' must be set when writting carbon files, use 
writtenBy() API to " 
--- End diff --

```suggestion
  "'writtenBy' must be set when writting carbon files, use 
writtenBy() API to "
```


---

[GitHub] carbondata pull request #2920: [HOTFIX] Improve log message in CarbonWriterB...

2018-11-14 Thread jackylk

GitHub user jackylk opened a pull request:

https://github.com/apache/carbondata/pull/2920

[HOTFIX] Improve log message in CarbonWriterBuilder

In master the log message is not proper: 
`AppName is not set, please use writtenBy() API to set the App Namewhich is 
using SDK`

This PR improves log message in CarbonWriterBuilder

 - [X] Any interfaces changed?
 No
 - [X] Any backward compatibility impacted?
 No
 - [X] Document update required?
No
 - [X] Testing done
Please provide details on 
- Whether new unit test cases have been added or why no new tests 
are required?
- How it is tested? Please attach test report.
- Is it a performance related change? Please attach the performance 
test report.
- Any additional information to help reviewers in testing this 
change.
 rerun all test  
 - [X] For large changes, please consider breaking it into sub-tasks under 
an umbrella JIRA. 
NA

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata patch-17

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/2920.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2920


commit b7fd6a6a9469feb99fdbc6b82907bc5c57381792
Author: Jacky Li 
Date:   2018-11-14T12:54:33Z

Update CarbonWriterBuilder.java




---

[GitHub] carbondata issue #2909: [CARBONDATA-3089] Change task distribution for NO_SO...

2018-11-14 Thread jackylk

Github user jackylk commented on the issue:

https://github.com/apache/carbondata/pull/2909
  
Since it is using CSV split in this PR, there are more tasks and each task 
is smaller, thus we can get better parallelism and require less resource to run 
in each task


---

[GitHub] carbondata pull request #2863: [WIP] Optimise decompressing while filling th...

2018-11-13 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2863#discussion_r233341050
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/scan/result/vector/impl/CarbonColumnVectorImpl.java
 ---
@@ -53,6 +53,12 @@
 
   private DataType blockDataType;
 
+  private int[] lengths;
--- End diff --

please add comment for these three newly added variables


---

[GitHub] carbondata pull request #2863: [WIP] Optimise decompressing while filling th...

2018-11-13 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2863#discussion_r233340004
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/compress/DirectCompressCodec.java
 ---
@@ -224,130 +238,134 @@ public void decodeAndFillVector(ColumnPage 
columnPage, ColumnVectorInfo vectorIn
   }
 }
 
-private void fillVector(ColumnPage columnPage, CarbonColumnVector 
vector,
-DataType vectorDataType, DataType pageDataType, int pageSize, 
ColumnVectorInfo vectorInfo) {
+private void fillVector(byte[] pageData, CarbonColumnVector vector, 
DataType vectorDataType,
+DataType pageDataType, int pageSize, ColumnVectorInfo vectorInfo, 
BitSet nullBits) {
+  int rowId = 0;
   if (pageDataType == DataTypes.BOOLEAN || pageDataType == 
DataTypes.BYTE) {
-byte[] byteData = columnPage.getBytePage();
 if (vectorDataType == DataTypes.SHORT) {
   for (int i = 0; i < pageSize; i++) {
-vector.putShort(i, (short) byteData[i]);
+vector.putShort(i, (short) pageData[i]);
   }
 } else if (vectorDataType == DataTypes.INT) {
   for (int i = 0; i < pageSize; i++) {
-vector.putInt(i, (int) byteData[i]);
+vector.putInt(i, (int) pageData[i]);
   }
 } else if (vectorDataType == DataTypes.LONG) {
   for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, byteData[i]);
+vector.putLong(i, pageData[i]);
   }
 } else if (vectorDataType == DataTypes.TIMESTAMP) {
   for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, (long) byteData[i] * 1000);
+vector.putLong(i, (long) pageData[i] * 1000);
   }
 } else if (vectorDataType == DataTypes.BOOLEAN || vectorDataType 
== DataTypes.BYTE) {
-  vector.putBytes(0, pageSize, byteData, 0);
+  vector.putBytes(0, pageSize, pageData, 0);
 } else if (DataTypes.isDecimal(vectorDataType)) {
   DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
-  decimalConverter.fillVector(byteData, pageSize, vectorInfo, 
columnPage.getNullBits());
+  decimalConverter.fillVector(pageData, pageSize, vectorInfo, 
nullBits, pageDataType);
 } else {
   for (int i = 0; i < pageSize; i++) {
-vector.putDouble(i, byteData[i]);
+vector.putDouble(i, pageData[i]);
   }
 }
   } else if (pageDataType == DataTypes.SHORT) {
-short[] shortData = columnPage.getShortPage();
+int size = pageSize * DataTypes.SHORT.getSizeInBytes();
 if (vectorDataType == DataTypes.SHORT) {
-  vector.putShorts(0, pageSize, shortData, 0);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putShort(rowId++, 
(ByteUtil.toShortLittleEndian(pageData, i)));
+  }
 } else if (vectorDataType == DataTypes.INT) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putInt(i, (int) shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putInt(rowId++, ByteUtil.toShortLittleEndian(pageData, 
i));
   }
 } else if (vectorDataType == DataTypes.LONG) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putLong(rowId++, ByteUtil.toShortLittleEndian(pageData, 
i));
   }
 } else if (vectorDataType == DataTypes.TIMESTAMP) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, (long) shortData[i] * 1000);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putLong(rowId++, (long) 
ByteUtil.toShortLittleEndian(pageData, i) * 1000);
   }
 } else if (DataTypes.isDecimal(vectorDataType)) {
   DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
-  decimalConverter.fillVector(shortData, pageSize, vectorInfo, 
columnPage.getNullBits());
+  decimalConverter.fillVector(pageData, pageSize, vectorInfo, 
nullBits, pageDataType);
 } else {
-  for (int i = 0; i < pageSize; i++) {
-vector.putDouble(i, shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putDouble(rowId++,

[GitHub] carbondata pull request #2863: [WIP] Optimise decompressing while filling th...

2018-11-13 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2863#discussion_r29949
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/compress/DirectCompressCodec.java
 ---
@@ -224,130 +238,134 @@ public void decodeAndFillVector(ColumnPage 
columnPage, ColumnVectorInfo vectorIn
   }
 }
 
-private void fillVector(ColumnPage columnPage, CarbonColumnVector 
vector,
-DataType vectorDataType, DataType pageDataType, int pageSize, 
ColumnVectorInfo vectorInfo) {
+private void fillVector(byte[] pageData, CarbonColumnVector vector, 
DataType vectorDataType,
+DataType pageDataType, int pageSize, ColumnVectorInfo vectorInfo, 
BitSet nullBits) {
+  int rowId = 0;
   if (pageDataType == DataTypes.BOOLEAN || pageDataType == 
DataTypes.BYTE) {
-byte[] byteData = columnPage.getBytePage();
 if (vectorDataType == DataTypes.SHORT) {
   for (int i = 0; i < pageSize; i++) {
-vector.putShort(i, (short) byteData[i]);
+vector.putShort(i, (short) pageData[i]);
   }
 } else if (vectorDataType == DataTypes.INT) {
   for (int i = 0; i < pageSize; i++) {
-vector.putInt(i, (int) byteData[i]);
+vector.putInt(i, (int) pageData[i]);
   }
 } else if (vectorDataType == DataTypes.LONG) {
   for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, byteData[i]);
+vector.putLong(i, pageData[i]);
   }
 } else if (vectorDataType == DataTypes.TIMESTAMP) {
   for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, (long) byteData[i] * 1000);
+vector.putLong(i, (long) pageData[i] * 1000);
   }
 } else if (vectorDataType == DataTypes.BOOLEAN || vectorDataType 
== DataTypes.BYTE) {
-  vector.putBytes(0, pageSize, byteData, 0);
+  vector.putBytes(0, pageSize, pageData, 0);
 } else if (DataTypes.isDecimal(vectorDataType)) {
   DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
-  decimalConverter.fillVector(byteData, pageSize, vectorInfo, 
columnPage.getNullBits());
+  decimalConverter.fillVector(pageData, pageSize, vectorInfo, 
nullBits, pageDataType);
 } else {
   for (int i = 0; i < pageSize; i++) {
-vector.putDouble(i, byteData[i]);
+vector.putDouble(i, pageData[i]);
   }
 }
   } else if (pageDataType == DataTypes.SHORT) {
-short[] shortData = columnPage.getShortPage();
+int size = pageSize * DataTypes.SHORT.getSizeInBytes();
 if (vectorDataType == DataTypes.SHORT) {
-  vector.putShorts(0, pageSize, shortData, 0);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putShort(rowId++, 
(ByteUtil.toShortLittleEndian(pageData, i)));
+  }
 } else if (vectorDataType == DataTypes.INT) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putInt(i, (int) shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putInt(rowId++, ByteUtil.toShortLittleEndian(pageData, 
i));
   }
 } else if (vectorDataType == DataTypes.LONG) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putLong(rowId++, ByteUtil.toShortLittleEndian(pageData, 
i));
   }
 } else if (vectorDataType == DataTypes.TIMESTAMP) {
-  for (int i = 0; i < pageSize; i++) {
-vector.putLong(i, (long) shortData[i] * 1000);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putLong(rowId++, (long) 
ByteUtil.toShortLittleEndian(pageData, i) * 1000);
   }
 } else if (DataTypes.isDecimal(vectorDataType)) {
   DecimalConverterFactory.DecimalConverter decimalConverter = 
vectorInfo.decimalConverter;
-  decimalConverter.fillVector(shortData, pageSize, vectorInfo, 
columnPage.getNullBits());
+  decimalConverter.fillVector(pageData, pageSize, vectorInfo, 
nullBits, pageDataType);
 } else {
-  for (int i = 0; i < pageSize; i++) {
-vector.putDouble(i, shortData[i]);
+  for (int i = 0; i < size; i += DataTypes.SHORT.getSizeInBytes()) 
{
+vector.putDouble(rowId++,

[GitHub] carbondata pull request #2863: [WIP] Optimise decompressing while filling th...

2018-11-13 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2863#discussion_r28236
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
 ---
@@ -2006,6 +2006,12 @@ private CarbonCommonConstants() {
*/
   public static final String CARBON_WRITTEN_BY_APPNAME = 
"carbon.writtenby.app.name";
 
+  /**
+   * When more global dictionary columns are there then there is issue in 
generating codegen to them
--- End diff --

Is it only valid for table with global dictionary, or for normal table also?


---

[GitHub] carbondata pull request #2863: [WIP] Optimise decompressing while filling th...

2018-11-13 Thread jackylk

Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/2863#discussion_r27530
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/datastore/page/ColumnPageValueConverter.java
 ---
@@ -37,5 +40,6 @@
   double decodeDouble(long value);
   double decodeDouble(float value);
   double decodeDouble(double value);
-  void decodeAndFillVector(ColumnPage columnPage, ColumnVectorInfo 
vectorInfo);
+  void decodeAndFillVector(byte[] pageData, ColumnVectorInfo vectorInfo, 
BitSet nullBits,
+  DataType pageDataType, int pageSize);
--- End diff --

can you provide comment for this func


---

1 2 3 4 5 6 7 8 9 10 >

1 - 100 of 3249 matches

Mail list logo