[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-12 Thread jackylk
Github user jackylk closed the pull request at:

https://github.com/apache/carbondata/pull/1970


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167482420
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/LoadOption.java ---
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.loading.model;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.carbondata.common.Maps;
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.CarbonLoadOptionConstants;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+import org.apache.carbondata.processing.util.CarbonLoaderUtil;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.hadoop.conf.Configuration;
+
+@InterfaceAudience.Developer
+public class LoadOption {
+
+  private static LogService LOG = LogServiceFactory.getLogService(LoadOption.class.getName());
+
+  /**
+   * Get data loading options and initialise default values
+   */
+  public static Map<String, String> fillOptionWithDefaultValue(
+      Map<String, String> options) throws InvalidLoadOptionException {
+    Map<String, String> optionsFinal = new HashMap<>();
+    optionsFinal.put("delimiter", Maps.getOrDefault(options, "delimiter", ","));
+    optionsFinal.put("quotechar", Maps.getOrDefault(options, "quotechar", "\""));
+    optionsFinal.put("fileheader", Maps.getOrDefault(options, "fileheader", ""));
+    optionsFinal.put("commentchar", Maps.getOrDefault(options, "commentchar", "#"));
+    optionsFinal.put("columndict", Maps.getOrDefault(options, "columndict", null));
+
+    optionsFinal.put(
+        "escapechar",
+        CarbonLoaderUtil.getEscapeChar(Maps.getOrDefault(options, "escapechar", "\\")));
+
+    optionsFinal.put(
+        "serialization_null_format",
+        Maps.getOrDefault(options, "serialization_null_format", "\\N"));
+
+    optionsFinal.put(
+        "bad_records_logger_enable",
+        Maps.getOrDefault(
+            options,
+            "bad_records_logger_enable",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE,
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE_DEFAULT)));
+
+    String badRecordActionValue = CarbonProperties.getInstance().getProperty(
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION_DEFAULT);
+
+    optionsFinal.put(
+        "bad_records_action",
+        Maps.getOrDefault(
+            options,
+            "bad_records_action",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_ACTION,
+                badRecordActionValue)));
+
+    optionsFinal.put(
+        "is_empty_data_bad_record",
+        Maps.getOrDefault(
+            options,
+            "is_empty_data_bad_record",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD,
+                CarbonLoadOptionConstants.CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD_DEFAULT)));
+
+    optionsFinal.put(
+        "skip_empty_line",

[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167482170
  
--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java ---
@@ -708,4 +710,129 @@ public static Boolean checkIfValidLoadInProgress(AbsoluteTableIdentifier absolut
     }
   }
 
+  private static boolean isLoadDeletionRequired(String metaDataLocation) {
+    LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation);
+    if (details != null && details.length > 0) {
+      for (LoadMetadataDetails oneRow : details) {
+        if ((SegmentStatus.MARKED_FOR_DELETE == oneRow.getSegmentStatus()
+            || SegmentStatus.COMPACTED == oneRow.getSegmentStatus()
+            || SegmentStatus.INSERT_IN_PROGRESS == oneRow.getSegmentStatus()
+            || SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS == oneRow.getSegmentStatus())
+            && oneRow.getVisibility().equalsIgnoreCase("true")) {
+          return true;
+        }
+      }
+    }
+    return false;
+  }
+
+  /**
+   * This will update the old table status details before clean files to the latest table status.
+   * @param oldList
+   * @param newList
+   * @return
+   */
+  public static List<LoadMetadataDetails> updateLoadMetadataFromOldToNew(
+      LoadMetadataDetails[] oldList, LoadMetadataDetails[] newList) {
+
+    List<LoadMetadataDetails> newListMetadata =
+        new ArrayList<LoadMetadataDetails>(Arrays.asList(newList));
+    for (LoadMetadataDetails oldSegment : oldList) {
+      if ("false".equalsIgnoreCase(oldSegment.getVisibility())) {
+        newListMetadata.get(newListMetadata.indexOf(oldSegment)).setVisibility("false");
+      }
+    }
+    return newListMetadata;
+  }
+
+  private static void writeLoadMetadata(AbsoluteTableIdentifier identifier,
+      List<LoadMetadataDetails> listOfLoadFolderDetails) throws IOException {
+    String dataLoadLocation = CarbonTablePath.getTableStatusFilePath(identifier.getTablePath());
+
+    DataOutputStream dataOutputStream;
+    Gson gsonObjectToWrite = new Gson();
+    BufferedWriter brWriter = null;
+
+    AtomicFileOperations writeOperation =
+        new AtomicFileOperationsImpl(dataLoadLocation, FileFactory.getFileType(dataLoadLocation));
+
+    try {
+
+      dataOutputStream = writeOperation.openForWrite(FileWriteOperation.OVERWRITE);
+      brWriter = new BufferedWriter(new OutputStreamWriter(dataOutputStream,
+          Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET)));
+
+      String metadataInstance = gsonObjectToWrite.toJson(listOfLoadFolderDetails.toArray());
+      brWriter.write(metadataInstance);
+    } finally {
+      try {
+        if (null != brWriter) {
+          brWriter.flush();
+        }
+      } catch (Exception e) {
+        LOG.error("error in flushing");
+      }
+      CarbonUtil.closeStreams(brWriter);
+      writeOperation.close();
+    }
+  }
+
+  public static void deleteLoadsAndUpdateMetadata(
+      CarbonTable carbonTable,
+      boolean isForceDeletion) throws IOException {
+    if (isLoadDeletionRequired(carbonTable.getMetadataPath())) {
+      LoadMetadataDetails[] details =
+          SegmentStatusManager.readLoadMetadata(carbonTable.getMetadataPath());
+      AbsoluteTableIdentifier identifier = carbonTable.getAbsoluteTableIdentifier();
+      ICarbonLock carbonTableStatusLock = CarbonLockFactory.getCarbonLockObj(
+          identifier, LockUsage.TABLE_STATUS_LOCK);
+
+      // Delete marked loads
+      boolean isUpdationRequired = DeleteLoadFolders.deleteLoadFoldersFromFileSystem(
+          identifier,
--- End diff --

fixed


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167482163
  
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datetype/DateTypeTest.scala ---
@@ -16,10 +16,11 @@
  */
 package org.apache.carbondata.spark.testsuite.datetype
 
-import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
 import org.apache.spark.sql.test.util.QueryTest
 import org.scalatest.BeforeAndAfterAll
 
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+
--- End diff --

fixed


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167481586
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/LoadOption.java ---
@@ -0,0 +1,245 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.loading.model;
+
+import java.io.IOException;
+import java.util.HashMap;
+import java.util.Map;
+
+import org.apache.carbondata.common.Maps;
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.common.logging.LogService;
+import org.apache.carbondata.common.logging.LogServiceFactory;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.constants.CarbonLoadOptionConstants;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.loading.exception.CarbonDataLoadingException;
+import org.apache.carbondata.processing.util.CarbonDataProcessorUtil;
+import org.apache.carbondata.processing.util.CarbonLoaderUtil;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.hadoop.conf.Configuration;
+
+@InterfaceAudience.Developer
+public class LoadOption {
+
+  private static LogService LOG = LogServiceFactory.getLogService(LoadOption.class.getName());
+
+  /**
+   * Get data loading options and initialise default values
+   */
+  public static Map<String, String> fillOptionWithDefaultValue(
+      Map<String, String> options) throws InvalidLoadOptionException {
+    Map<String, String> optionsFinal = new HashMap<>();
+    optionsFinal.put("delimiter", Maps.getOrDefault(options, "delimiter", ","));
+    optionsFinal.put("quotechar", Maps.getOrDefault(options, "quotechar", "\""));
+    optionsFinal.put("fileheader", Maps.getOrDefault(options, "fileheader", ""));
+    optionsFinal.put("commentchar", Maps.getOrDefault(options, "commentchar", "#"));
+    optionsFinal.put("columndict", Maps.getOrDefault(options, "columndict", null));
+
+    optionsFinal.put(
+        "escapechar",
+        CarbonLoaderUtil.getEscapeChar(Maps.getOrDefault(options, "escapechar", "\\")));
+
+    optionsFinal.put(
+        "serialization_null_format",
+        Maps.getOrDefault(options, "serialization_null_format", "\\N"));
+
+    optionsFinal.put(
+        "bad_records_logger_enable",
+        Maps.getOrDefault(
+            options,
+            "bad_records_logger_enable",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE,
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_LOGGER_ENABLE_DEFAULT)));
+
+    String badRecordActionValue = CarbonProperties.getInstance().getProperty(
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION,
+        CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION_DEFAULT);
+
+    optionsFinal.put(
+        "bad_records_action",
+        Maps.getOrDefault(
+            options,
+            "bad_records_action",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_BAD_RECORDS_ACTION,
+                badRecordActionValue)));
+
+    optionsFinal.put(
+        "is_empty_data_bad_record",
+        Maps.getOrDefault(
+            options,
+            "is_empty_data_bad_record",
+            CarbonProperties.getInstance().getProperty(
+                CarbonLoadOptionConstants.CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD,
+                CarbonLoadOptionConstants.CARBON_OPTIONS_IS_EMPTY_DATA_BAD_RECORD_DEFAULT)));
+
+    optionsFinal.put(
+        "skip_empty_line",

[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167481299
  
--- Diff: core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java ---
@@ -708,4 +710,129 @@ public static Boolean checkIfValidLoadInProgress(AbsoluteTableIdentifier absolut
     }
   }
 
+  private static boolean isLoadDeletionRequired(String metaDataLocation) {
+    LoadMetadataDetails[] details = SegmentStatusManager.readLoadMetadata(metaDataLocation);
+    if (details != null && details.length > 0) {
+      for (LoadMetadataDetails oneRow : details) {
+        if ((SegmentStatus.MARKED_FOR_DELETE == oneRow.getSegmentStatus()
+            || SegmentStatus.COMPACTED == oneRow.getSegmentStatus()
+            || SegmentStatus.INSERT_IN_PROGRESS == oneRow.getSegmentStatus()
+            || SegmentStatus.INSERT_OVERWRITE_IN_PROGRESS == oneRow.getSegmentStatus())
+            && oneRow.getVisibility().equalsIgnoreCase("true")) {
+          return true;
+        }
+      }
+    }
+    return false;
+  }
+
+  /**
+   * This will update the old table status details before clean files to the latest table status.
+   * @param oldList
+   * @param newList
+   * @return
+   */
+  public static List<LoadMetadataDetails> updateLoadMetadataFromOldToNew(
+      LoadMetadataDetails[] oldList, LoadMetadataDetails[] newList) {
+
+    List<LoadMetadataDetails> newListMetadata =
+        new ArrayList<LoadMetadataDetails>(Arrays.asList(newList));
+    for (LoadMetadataDetails oldSegment : oldList) {
+      if ("false".equalsIgnoreCase(oldSegment.getVisibility())) {
+        newListMetadata.get(newListMetadata.indexOf(oldSegment)).setVisibility("false");
+      }
+    }
+    return newListMetadata;
+  }
+
+  private static void writeLoadMetadata(AbsoluteTableIdentifier identifier,
+      List<LoadMetadataDetails> listOfLoadFolderDetails) throws IOException {
+    String dataLoadLocation = CarbonTablePath.getTableStatusFilePath(identifier.getTablePath());
+
+    DataOutputStream dataOutputStream;
+    Gson gsonObjectToWrite = new Gson();
+    BufferedWriter brWriter = null;
+
+    AtomicFileOperations writeOperation =
+        new AtomicFileOperationsImpl(dataLoadLocation, FileFactory.getFileType(dataLoadLocation));
+
+    try {
+
+      dataOutputStream = writeOperation.openForWrite(FileWriteOperation.OVERWRITE);
+      brWriter = new BufferedWriter(new OutputStreamWriter(dataOutputStream,
+          Charset.forName(CarbonCommonConstants.DEFAULT_CHARSET)));
+
+      String metadataInstance = gsonObjectToWrite.toJson(listOfLoadFolderDetails.toArray());
+      brWriter.write(metadataInstance);
+    } finally {
+      try {
+        if (null != brWriter) {
+          brWriter.flush();
+        }
+      } catch (Exception e) {
+        LOG.error("error in flushing");
+      }
+      CarbonUtil.closeStreams(brWriter);
+      writeOperation.close();
+    }
+  }
+
+  public static void deleteLoadsAndUpdateMetadata(
+      CarbonTable carbonTable,
+      boolean isForceDeletion) throws IOException {
+    if (isLoadDeletionRequired(carbonTable.getMetadataPath())) {
+      LoadMetadataDetails[] details =
+          SegmentStatusManager.readLoadMetadata(carbonTable.getMetadataPath());
+      AbsoluteTableIdentifier identifier = carbonTable.getAbsoluteTableIdentifier();
+      ICarbonLock carbonTableStatusLock = CarbonLockFactory.getCarbonLockObj(
+          identifier, LockUsage.TABLE_STATUS_LOCK);
+
+      // Delete marked loads
+      boolean isUpdationRequired = DeleteLoadFolders.deleteLoadFoldersFromFileSystem(
+          identifier,
--- End diff --

please apply Java code style


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread QiangCai
Github user QiangCai commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167466587
  
--- Diff: integration/spark-common-test/src/test/scala/org/apache/carbondata/spark/testsuite/datetype/DateTypeTest.scala ---
@@ -16,10 +16,11 @@
  */
 package org.apache.carbondata.spark.testsuite.datetype
 
-import org.apache.carbondata.spark.exception.MalformedCarbonCommandException
 import org.apache.spark.sql.test.util.QueryTest
 import org.scalatest.BeforeAndAfterAll
 
+import org.apache.carbondata.common.exceptions.sql.MalformedCarbonCommandException
+
--- End diff --

not required


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167465747
  
--- Diff: processing/src/main/java/org/apache/carbondata/processing/loading/model/CarbonLoadModelBuilder.java ---
@@ -0,0 +1,322 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.processing.loading.model;
+
+import java.io.IOException;
+import java.text.SimpleDateFormat;
+import java.util.List;
+import java.util.Map;
+
+import org.apache.carbondata.common.Maps;
+import org.apache.carbondata.common.Strings;
+import org.apache.carbondata.common.annotations.InterfaceAudience;
+import org.apache.carbondata.common.constants.LoggerAction;
+import org.apache.carbondata.common.exceptions.sql.InvalidLoadOptionException;
+import org.apache.carbondata.core.constants.CarbonCommonConstants;
+import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
+import org.apache.carbondata.core.metadata.schema.table.column.CarbonColumn;
+import org.apache.carbondata.core.util.CarbonProperties;
+import org.apache.carbondata.core.util.CarbonUtil;
+import org.apache.carbondata.processing.loading.constants.DataLoadProcessorConstants;
+import org.apache.carbondata.processing.loading.csvinput.CSVInputFormat;
+import org.apache.carbondata.processing.loading.sort.SortScopeOptions;
+import org.apache.carbondata.processing.util.TableOptionConstant;
+
+import org.apache.commons.lang.StringUtils;
+import org.apache.hadoop.conf.Configuration;
+
+/**
+ * Builder for {@link CarbonLoadModel}
+ */
+@InterfaceAudience.Developer
+public class CarbonLoadModelBuilder {
+
+  private CarbonTable table;
+
+  public CarbonLoadModelBuilder(CarbonTable table) {
+    this.table = table;
+  }
+
+  /**
+   * Build CarbonLoadModel for data loading
+   * @param options Load options from user input
+   * @return a new CarbonLoadModel instance
+   */
+  public CarbonLoadModel build(
+      Map<String, String> options) throws InvalidLoadOptionException, IOException {
+    Map<String, String> optionsFinal = LoadOption.fillOptionWithDefaultValue(options);
+    optionsFinal.put("sort_scope", "no_sort");
+    if (!options.containsKey("fileheader")) {
+      List<CarbonColumn> csvHeader = table.getCreateOrderColumn(table.getTableName());
+      String[] columns = new String[csvHeader.size()];
+      for (int i = 0; i < columns.length; i++) {
+        columns[i] = csvHeader.get(i).getColName();
+      }
+      optionsFinal.put("fileheader", Strings.mkString(columns, ","));
+    }
+    CarbonLoadModel model = new CarbonLoadModel();
+
+    // we have provided 'fileheader', so hadoopConf can be null
+    build(options, optionsFinal, model, null);
+
+    // set default values
+    model.setTimestampformat(CarbonCommonConstants.CARBON_TIMESTAMP_DEFAULT_FORMAT);
+    model.setDateFormat(CarbonCommonConstants.CARBON_DATE_DEFAULT_FORMAT);
+    model.setUseOnePass(Boolean.parseBoolean(Maps.getOrDefault(options, "onepass", "false")));
+    model.setDictionaryServerHost(Maps.getOrDefault(options, "dicthost", null));
+    try {
+      model.setDictionaryServerPort(Integer.parseInt(Maps.getOrDefault(options, "dictport", "-1")));
+    } catch (NumberFormatException e) {
+      throw new InvalidLoadOptionException(e.getMessage());
+    }
+    return model;
+  }
+
+  /**
+   * Build CarbonLoadModel for data loading
+   * @param options Load options from user input
+   * @param optionsFinal Load options populated with default values for optional options
+   * @param carbonLoadModel The output load model
+   * @param hadoopConf hadoopConf is needed to read the CSV header if 'fileheader' is not set in
+   *                   user-provided load options
+   */
+  public void build(
--- End diff --

This code was moved from

[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
Github user jackylk commented on a diff in the pull request:

https://github.com/apache/carbondata/pull/1970#discussion_r167465707
  
--- Diff: 
core/src/main/java/org/apache/carbondata/core/statusmanager/SegmentStatusManager.java
 ---
@@ -708,4 +710,129 @@ public static Boolean 
checkIfValidLoadInProgress(AbsoluteTableIdentifier absolut
 }
   }
 
+  private static boolean isLoadDeletionRequired(String metaDataLocation) {
--- End diff --

This code was moved from DataLoadingUtil.scala in the carbon-spark module


---


[GitHub] carbondata pull request #1970: [CARBONDATA-2159] Remove carbon-spark depende...

2018-02-11 Thread jackylk
GitHub user jackylk opened a pull request:

https://github.com/apache/carbondata/pull/1970

[CARBONDATA-2159] Remove carbon-spark dependency in store-sdk module

store-sdk module should not depend on carbon-spark module.
This PR changes:
1. A `Maps` utility is added to provide a `getOrDefault` method and avoid a JDK 8 dependency (a minimal sketch is shown below)
2. `CarbonLoadModelBuilder` is added to build `CarbonLoadModel` (a usage sketch follows the checklist below)
3. `DataLoadingUtil.scala` and `ValidateUtil.scala` are converted to Java and moved into `CarbonLoadModelBuilder` in the processing module

After all these changes, the carbon-spark dependency can be removed from the store-sdk module.
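
For illustration, here is a minimal sketch of what such a `Maps.getOrDefault` helper could look like. The body is an assumption based on the JDK 8 semantics it replaces, not the exact code in the common module:

package org.apache.carbondata.common;

import java.util.Map;

// Sketch of a JDK 7-compatible stand-in for JDK 8's Map.getOrDefault
public class Maps {

  // Return the mapped value when the key is present, otherwise the supplied default
  public static <K, V> V getOrDefault(Map<K, V> map, K key, V defaultValue) {
    V value = map.get(key);
    return (value != null) ? value : defaultValue;
  }
}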

 - [X] Any interfaces changed?
 No
 - [X] Any backward compatibility impacted?
 No
 - [X] Document update required?
 No
 - [X] Testing done
 No functionality is added
 - [X] For large changes, please consider breaking it into sub-tasks under an umbrella JIRA.
 NA
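
As a usage sketch for item 2 above, the snippet below shows how a caller might drive the new builder. Obtaining the CarbonTable instance and the option values are assumptions for the example, not part of this PR:

import java.util.HashMap;
import java.util.Map;

import org.apache.carbondata.core.metadata.schema.table.CarbonTable;
import org.apache.carbondata.processing.loading.model.CarbonLoadModel;
import org.apache.carbondata.processing.loading.model.CarbonLoadModelBuilder;

class BuildModelExample {
  // Builds a load model for an existing table using user-supplied CSV options
  static CarbonLoadModel buildModel(CarbonTable table) throws Exception {
    Map<String, String> options = new HashMap<>();
    options.put("delimiter", ",");              // CSV field delimiter
    options.put("fileheader", "id,name,city");  // header supplied, so no file read needed
    return new CarbonLoadModelBuilder(table).build(options);
  }
}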

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/jackylk/incubator-carbondata sdk-remove-spark-dependency

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/carbondata/pull/1970.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1970


commit 952665a8c1c52f28951463fef989333ae0e6d83e
Author: Jacky Li 
Date:   2018-01-06T12:28:44Z

[CARBONDATA-1992] Remove partitionId in CarbonTablePath

In CarbonTablePath there is a deprecated partition id that is always 0; it is removed to avoid confusion.

This closes #1765

commit 111c3821557820241d1114d87eae2f7cd017e610
Author: Jacky Li 
Date:   2018-01-02T15:46:14Z

[CARBONDATA-1968] Add external table support

This PR adds support for creating an external table on existing carbondata files, using Hive syntax.
CREATE EXTERNAL TABLE tableName STORED BY 'carbondata' LOCATION 'path'

This closes #1749
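
For illustration, the DDL above could be issued through the Spark SQL Java API as below; the table name and location are made-up example values:

import org.apache.spark.sql.SparkSession;

class ExternalTableExample {
  public static void main(String[] args) {
    SparkSession spark = SparkSession.builder()
        .appName("CarbonExternalTableExample")
        .getOrCreate();
    // Register an external table over carbondata files already at this location
    spark.sql("CREATE EXTERNAL TABLE source_tbl STORED BY 'carbondata' "
        + "LOCATION '/tmp/carbonstore/default/source_tbl'");
  }
}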

commit 80b42ac662ebd2bc243ca91c86b035717223daf4
Author: SangeetaGulia 
Date:   2017-09-21T09:26:26Z

[CARBONDATA-1827] S3 Carbon Implementation

1. Provide support for S3 in carbondata.
2. Added S3Example to create a carbon table on S3.
3. Added S3CSVExample to load a carbon table from CSV on S3.

This closes #1805

commit 71c2d8ca4a3212cff1eedbe78ee03e521f57fbbc
Author: Jacky Li 
Date:   2018-01-31T16:25:31Z

[REBASE] Solve conflict after rebasing master

commit 15b4e192ee904a2e7c845ac67e0fcf1ba151a683
Author: Jacky Li 
Date:   2018-01-30T13:24:04Z

[CARBONDATA-2099] Refactor query scan process to improve readability

Unified concepts in the scan process flow:

1. QueryModel contains all parameters for a scan; it is created by an API in CarbonTable. (In future, CarbonTable will be the entry point for various table operations.)
2. Use the term ColumnChunk to represent one column in one blocklet, and use ChunkIndex in the reader to read a specified column chunk.
3. Use the term ColumnPage to represent one page in one ColumnChunk.
4. QueryColumn => ProjectionColumn, indicating it is for projection.

This closes #1874

commit c3e99681bcd397ed33bc90e8d73b1fd33e0e60f7
Author: Jacky Li 
Date:   2018-01-31T08:14:27Z

[CARBONDATA-2025] Unify all path construction through CarbonTablePath static methods

Refactor CarbonTablePath:

1. Remove CarbonStorePath and use CarbonTablePath only.
2. Make CarbonTablePath a utility class without object creation; this avoids creating an object before using it, so the code is cleaner and there is less GC overhead.

This closes #1768

commit e502c59a2d0b95d80db3aff04c749654254eadbe
Author: Jatin 
Date:   2018-01-25T11:23:00Z

[CARBONDATA-2080] [S3-Implementation] Propagate hadoopConf from driver to executor for the S3 implementation in cluster mode.

Problem: hadoopConf was not getting propagated from the driver to the executors, which is why load was failing in a distributed environment.
Solution: Set the Hadoop conf in the base class CarbonRDD (see the sketch after this commit entry).
How to verify this PR:
Execute a load in cluster mode using an S3 location; it should succeed.

This closes #1860
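
For context, a common way to carry a Hadoop configuration from the driver to executors is to wrap it in a Java-serializable holder that the RDD base class keeps as a field. The sketch below illustrates the idea only; the class name and how CarbonRDD uses it are assumptions, not the exact patch:

import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

import org.apache.hadoop.conf.Configuration;

// Hadoop Configuration is not Serializable, but it is Writable, so we can
// delegate Java serialization to its write()/readFields() methods.
class SerializableConfiguration implements Serializable {

  private transient Configuration conf;

  SerializableConfiguration(Configuration conf) {
    this.conf = conf;
  }

  Configuration get() {
    return conf;
  }

  private void writeObject(ObjectOutputStream out) throws IOException {
    out.defaultWriteObject();
    conf.write(out);            // serialize the conf entries on the driver
  }

  private void readObject(ObjectInputStream in) throws IOException, ClassNotFoundException {
    in.defaultReadObject();
    conf = new Configuration(false);
    conf.readFields(in);        // rebuild the conf on the executor side
  }
}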

commit cae74a8cecea74e8899a87dcb7d12e0dec1b8069
Author: sounakr 
Date:   2017-09-28T10:51:05Z

[CARBONDATA-1480] Min Max Index Example for DataMap

DataMap example: an implementation of a Min/Max index through DataMap, and use of the index while pruning.

This closes #1359

commit e972fd3d5cc8f392d47ca111b2d8f262edb29ac6
Author: ravipesala 
Date:   2017-11-15T14:18:40Z

[CARBONDATA-1544][Datamap] Datamap FineGrain implementation

Implemented interfaces for the FG DataMap and integrated them into the filter scanner to use the pruned bitset.