[GitHub] [carbondata] vikramahuja1001 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
vikramahuja1001 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675868683 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-675868405 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3782/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-675867872 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2040/
[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI
Karan980 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675858348 retest this please
[jira] [Updated] (CARBONDATA-3925) flink-integration write carbon file to hdfs error
[ https://issues.apache.org/jira/browse/CARBONDATA-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yutao updated CARBONDATA-3925: -- Fix Version/s: 2.1.0 Issue Type: Bug (was: Improvement) Priority: Major (was: Minor) Summary: flink-integration write carbon file to hdfs error (was: flink-integration CarbonWriter.java LOG print use CarbonS3Writer's classname) > flink-integration write carbon file to hdfs error > - > > Key: CARBONDATA-3925 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3925 > Project: CarbonData > Issue Type: Bug > Components: flink-integration >Affects Versions: 2.0.0 >Reporter: yutao >Priority: Major > Fix For: 2.1.0, 2.0.1 > > > In CarbonWriter.java you can find this: > public abstract class CarbonWriter extends > ProxyFileWriter { > private static final Logger LOGGER = > > LogServiceFactory.getLogService(CarbonS3Writer.class.getName()); } > so the log file always prints entries like: > 2020-07-27 14:19:25,107 DEBUG org.apache.carbon.flink.CarbonS3Writer > which is puzzling. -- This message was sent by Atlassian Jira (v8.3.4#803005)
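The issue above can be reproduced in a few lines. This is a standalone sketch, not the actual flink-integration code: the class names only mirror the CarbonData ones, and it uses `java.util.logging` in place of Carbon's `LogServiceFactory`. A logger created with another class's name attributes every log record to the wrong class, which is exactly the misleading `CarbonS3Writer` line in the reported DEBUG output.

```java
import java.util.logging.Logger;

// Stand-in for org.apache.carbon.flink.CarbonS3Writer (illustrative only).
class CarbonS3Writer { }

public class CarbonWriterLoggerDemo {
    // Buggy form, as in CARBONDATA-3925: records appear to come from CarbonS3Writer.
    static final Logger BUGGY = Logger.getLogger(CarbonS3Writer.class.getName());

    // Fixed form: the logger is named after the class that owns it.
    static final Logger FIXED = Logger.getLogger(CarbonWriterLoggerDemo.class.getName());

    public static void main(String[] args) {
        System.out.println(BUGGY.getName()); // prints "CarbonS3Writer"
        System.out.println(FIXED.getName()); // prints "CarbonWriterLoggerDemo"
    }
}
```

The fix in the linked PR amounts to replacing the `CarbonS3Writer.class` reference with the enclosing class when building the logger.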
[GitHub] [carbondata] yutaoChina commented on a change in pull request #3892: flink write carbon file to hdfs when file size is less than 1M,can't write
yutaoChina commented on a change in pull request #3892: URL: https://github.com/apache/carbondata/pull/3892#discussion_r472681680 ## File path: integration/flink/src/main/java/org/apache/carbon/core/metadata/StageManager.java ## @@ -81,7 +81,7 @@ public static void writeStageInput(final String stageInputPath, final StageInput private static void writeSuccessFile(final String successFilePath) throws IOException { final DataOutputStream segmentStatusSuccessOutputStream = FileFactory.getDataOutputStream(successFilePath, -CarbonCommonConstants.BYTEBUFFER_SIZE, 1024); +CarbonCommonConstants.BYTEBUFFER_SIZE, 1024 * 1024 * 2); Review comment: I set it to 2M because the HDFS configured minimum block size (dfs.namenode.fs-limits.min-block-size) is 1M, and in CarbonUtil.java the `getMaxOfBlockAndFileSize(long blockSize, long fileSize)` method uses `long maxSize = blockSize; if (fileSize > blockSize) { maxSize = fileSize; }`, so if the default size or the file size is less than 1M the program gets an error. Why 2M? The default is 1M, so default * 2 is safely bigger than it.
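The sizing argument in that review comment can be sketched as follows. This is an illustrative standalone mirror of the `CarbonUtil.getMaxOfBlockAndFileSize` logic quoted above, not the CarbonData source; the constants and the tiny success-file size are assumptions for the demo.

```java
public class BlockSizeDemo {
    static final long HDFS_MIN_BLOCK_SIZE = 1024L * 1024;     // dfs.namenode.fs-limits.min-block-size default: 1 MB
    static final long OLD_BLOCK_SIZE = 1024L;                 // previous hard-coded value in StageManager
    static final long NEW_BLOCK_SIZE = 1024L * 1024 * 2;      // the fix: 2 MB

    // Simplified mirror of CarbonUtil.getMaxOfBlockAndFileSize:
    // the effective block size is the larger of the configured size and the file size.
    static long getMaxOfBlockAndFileSize(long blockSize, long fileSize) {
        long maxSize = blockSize;
        if (fileSize > blockSize) {
            maxSize = fileSize;
        }
        return maxSize;
    }

    public static void main(String[] args) {
        long successFileSize = 512; // the success marker file is tiny
        // Old value: effective size 1024 bytes, below the 1 MB NameNode minimum -> write rejected.
        System.out.println(getMaxOfBlockAndFileSize(OLD_BLOCK_SIZE, successFileSize) >= HDFS_MIN_BLOCK_SIZE);
        // New value: 2 MB comfortably clears the 1 MB minimum regardless of file size.
        System.out.println(getMaxOfBlockAndFileSize(NEW_BLOCK_SIZE, successFileSize) >= HDFS_MIN_BLOCK_SIZE);
    }
}
```

Because the chosen size is a max, raising the constant to 2 MB guarantees the result stays above the 1 MB floor even when the file itself is only a few bytes.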
[jira] [Resolved] (CARBONDATA-3927) TupleID/Position reference is long , make it short
[ https://issues.apache.org/jira/browse/CARBONDATA-3927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3927. -- Fix Version/s: 2.1.0 Resolution: Fixed > TupleID/Position reference is long, make it short > -- > > Key: CARBONDATA-3927 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3927 > Project: CarbonData > Issue Type: Improvement >Reporter: Akash R Nilugal >Assignee: Akash R Nilugal >Priority: Minor > Fix For: 2.1.0 > > Time Spent: 2h 40m > Remaining Estimate: 0h > > the current tuple id is long; some parts of it can be avoided to improve performance.
[GitHub] [carbondata] asfgit closed pull request #3837: [CARBONDATA-3927]Remove unwanted fields from tupleID to make it short and to improve store size and performance.
asfgit closed pull request #3837: URL: https://github.com/apache/carbondata/pull/3837
[GitHub] [carbondata] kunal642 commented on pull request #3837: [CARBONDATA-3927]Remove unwanted fields from tupleID to make it short and to improve store size and performance.
kunal642 commented on pull request #3837: URL: https://github.com/apache/carbondata/pull/3837#issuecomment-675843217 LGTM
[jira] [Resolved] (CARBONDATA-3863) index service goes back to embedded mode
[ https://issues.apache.org/jira/browse/CARBONDATA-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kunal Kapoor resolved CARBONDATA-3863. -- Fix Version/s: 2.1.0 Resolution: Fixed > index service goes back to embedded mode > --- > > Key: CARBONDATA-3863 > URL: https://issues.apache.org/jira/browse/CARBONDATA-3863 > Project: CarbonData > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Taoli >Priority: Major > Fix For: 2.1.0 > > Time Spent: 6h > Remaining Estimate: 0h > > when using the index service, some usage patterns may cause the folder "/tmp/indexservertmp" to hit the max-directory-item exception; in that case the index service falls back to embedded mode. > the error looks like this: > > Exception occured: The directory item limit of /tmp/indexservertmp is > exceeded: limit=1048576 > items=1048576.
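The fix in PR #3855 ("after using index service clean the temp data") boils down to removing the accumulated temp files so the directory never reaches HDFS's dfs.namenode.fs-limits.max-directory-items limit (1048576 by default). This is an illustrative standalone sketch on the local filesystem using `java.nio.file`; the actual patch works through CarbonData's file abstractions, and the method and path names here are assumptions.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class IndexServerTmpCleanup {
    // Delete dir and everything under it, children before parents,
    // so the directory's item count returns to zero after each use.
    static void cleanRecursively(Path dir) throws IOException {
        if (!Files.exists(dir)) {
            return;
        }
        try (Stream<Path> walk = Files.walk(dir)) {
            walk.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        }
    }

    public static void main(String[] args) throws IOException {
        // Simulate a few leftover index-server temp files, then clean them up.
        Path tmp = Files.createTempDirectory("indexservertmp");
        Files.createFile(tmp.resolve("cache-entry-1"));
        Files.createFile(tmp.resolve("cache-entry-2"));
        cleanRecursively(tmp);
        System.out.println(Files.exists(tmp)); // prints "false"
    }
}
```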
[GitHub] [carbondata] asfgit closed pull request #3855: [CARBONDATA-3863], after using index service clean the temp data
asfgit closed pull request #3855: URL: https://github.com/apache/carbondata/pull/3855
[GitHub] [carbondata] kunal642 commented on pull request #3855: [CARBONDATA-3863], after using index service clean the temp data
kunal642 commented on pull request #3855: URL: https://github.com/apache/carbondata/pull/3855#issuecomment-675841436 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675783198 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2038/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675778572 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3780/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675713570 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2037/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675690653 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2034/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes
CarbonDataQA1 commented on pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675687794 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3777/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675687528 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3776/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes
CarbonDataQA1 commented on pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675687520 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2035/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675662000 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3779/
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
VenuReddy2103 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r472408878 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/utils/SDKUtil.java ## @@ -79,4 +98,75 @@ public static ArrayList listFiles(String sourceImageFolder, return (Object[]) input[i]; } + public static List extractFilesFromFolder(String path, + String suf, Configuration hadoopConf) { +List dataFiles = listFiles(path, suf, hadoopConf); +List carbonFiles = new ArrayList<>(); +for (Object dataFile: dataFiles) { + carbonFiles.add(FileFactory.getCarbonFile(dataFile.toString(), hadoopConf)); +} +if (CollectionUtils.isEmpty(dataFiles)) { + throw new RuntimeException("No file found at given location. Please provide" + + "the correct folder location."); +} +return carbonFiles; + } + + public static DataFileStream buildAvroReader(CarbonFile carbonFile, + Configuration configuration) throws IOException { +try { + GenericDatumReader genericDatumReader = + new GenericDatumReader<>(); + DataFileStream avroReader = + new DataFileStream<>(FileFactory.getDataInputStream(carbonFile.getPath(), + -1, configuration), genericDatumReader); + return avroReader; +} catch (FileNotFoundException ex) { + throw new FileNotFoundException("File " + carbonFile.getPath() + + " not found to build carbon writer."); +} catch (IOException ex) { + if (ex.getMessage().contains("Not a data file")) { +throw new RuntimeException("File " + carbonFile.getPath() + " is not in avro format."); + } else { +throw ex; + } +} + } + + public static Reader buildOrcReader(String path, Configuration conf) throws IOException { +try { + Reader orcReader = OrcFile.createReader(new Path(path), + OrcFile.readerOptions(conf)); + return orcReader; +} catch (FileFormatException ex) { + throw new RuntimeException("File " + path + " is not in ORC format"); +} catch (FileNotFoundException ex) { + throw new FileNotFoundException("File " + path + " not found to build carbon writer."); +} + 
} + + public static ParquetReader buildPqrquetReader(String path, Configuration conf) Review comment: Please correct the spelling mistake for "parquet" in the method name.
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
VenuReddy2103 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r472407770 ## File path: sdk/sdk/src/main/java/org/apache/carbondata/sdk/file/CarbonWriterBuilder.java ## @@ -594,6 +607,446 @@ public CarbonWriterBuilder withJsonInput(Schema carbonSchema) { return this; } + private void validateCsvFiles() throws IOException { +CarbonFile[] dataFiles = this.extractDataFiles(CarbonCommonConstants.CSV_FILE_EXTENSION); +if (CollectionUtils.isEmpty(Arrays.asList(dataFiles))) { + throw new RuntimeException("CSV files can't be empty."); +} +for (CarbonFile dataFile : dataFiles) { + try { +CsvParser csvParser = SDKUtil.buildCsvParser(this.hadoopConf); + csvParser.beginParsing(FileFactory.getDataInputStream(dataFile.getPath(), +-1, this.hadoopConf)); + } catch (IllegalArgumentException ex) { +if (ex.getCause() instanceof FileNotFoundException) { + throw new FileNotFoundException("File " + dataFile + + " not found to build carbon writer."); +} +throw ex; + } +} +this.dataFiles = dataFiles; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading CSV files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath) throws IOException { +this.validateFilePath(filePath); +this.filePath = filePath; +this.setIsDirectory(filePath); +this.withCsvInput(); +this.validateCsvFiles(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts CSV files directory and + * list of file which has to be loaded. + * + * @param filePath directory where the CSV file exists. + * @param fileList list of files which has to be loaded. 
+ * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withCsvPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withCsvPath(filePath); +return this; + } + + private void validateJsonFiles() throws IOException { +CarbonFile[] dataFiles = this.extractDataFiles(CarbonCommonConstants.JSON_FILE_EXTENSION); +for (CarbonFile dataFile : dataFiles) { + try { +new JSONParser().parse(SDKUtil.buildJsonReader(dataFile, this.hadoopConf)); + } catch (FileNotFoundException ex) { +throw new FileNotFoundException("File " + dataFile + " not found to build carbon writer."); + } catch (ParseException ex) { +throw new RuntimeException("File " + dataFile + " is not in json format."); + } +} +this.dataFiles = dataFiles; + } + + /** + * to build a {@link CarbonWriter}, which accepts loading JSON files. + * + * @param filePath absolute path under which files should be loaded. + * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withJsonPath(String filePath) throws IOException { +this.validateFilePath(filePath); +this.filePath = filePath; +this.setIsDirectory(filePath); +this.withJsonInput(); +this.validateJsonFiles(); +return this; + } + + /** + * to build a {@link CarbonWriter}, which accepts JSON file directory and + * list of file which has to be loaded. + * + * @param filePath directory where the json file exists. + * @param fileList list of files which has to be loaded. + * @return CarbonWriterBuilder + * @throws IOException + */ + public CarbonWriterBuilder withJsonPath(String filePath, List fileList) + throws IOException { +this.fileList = fileList; +this.withJsonPath(filePath); +return this; + } + + private void validateFilePath(String filePath) { +if (StringUtils.isEmpty(filePath)) { + throw new IllegalArgumentException("filePath can not be empty"); +} + } + + /** + * to build a {@link CarbonWriter}, which accepts loading Parquet files. + * + * @param filePath absolute path under which files should be loaded. 
+ * @return CarbonWriterBuilder + */ + public CarbonWriterBuilder withParquetPath(String filePath) throws IOException { +this.validateFilePath(filePath); +this.filePath = filePath; +this.setIsDirectory(filePath); +this.writerType = WRITER_TYPE.PARQUET; +this.validateParquetFiles(); +return this; + } + + private void setIsDirectory(String filePath) { +if (this.hadoopConf == null) { + this.hadoopConf = new Configuration(FileFactory.getConfiguration()); Review comment: Had checked the base code. In the base code, we seem to directly assign the return value of FileFactory.getConfiguration() instead of new Configuration. Suggest to check and keep it consistent.
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
VenuReddy2103 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r472388572 ## File path: examples/spark/pom.xml ## @@ -38,6 +38,12 @@ org.apache.carbondata carbondata-spark_${spark.binary.version} ${project.version} + Review comment: Was wondering why this exclusion is in examples/spark/pom.xml and integration/spark/pom.xml. You don't seem to have any change in these 2 modules. I think you want to exclude it elsewhere?
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-675631086 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2033/
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472382210 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". 
Changing " + + dimensionValue + " to " + dateToStr); + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + return validateTimeStampRange(dateToStr.getTime()); +} catch (ParseException ex) { + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + throw new NumberFormatException(ex.getMessage()); +} + } else { +throw new NumberFormatException(e.getMessage()); + } +} + } + + private static Long validateTimeStampRange(Long timeValue) { +SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); Review comment: rechecked and made use of the existing values from `DateDirectDictionaryGenerator.MIN_VALUE` and `DateDirectDictionaryGenerator.MAX_VALUE`
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381642 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing " Review comment: agree. removed.
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381408 ## File path: core/src/main/java/org/apache/carbondata/core/util/SessionParams.java ## @@ -153,6 +154,12 @@ private boolean validateKeyValue(String key, String value) throws InvalidConfigu case ENABLE_UNSAFE_IN_QUERY_EXECUTION: case ENABLE_AUTO_LOAD_MERGE: case CARBON_PUSH_ROW_FILTERS_FOR_VECTOR: + case CARBON_LOAD_SETLENIENT_ENABLE: Review comment: done
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472380864 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". 
Changing " + + dimensionValue + " to " + dateToStr); + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + return validateTimeStampRange(dateToStr.getTime()); +} catch (ParseException ex) { Review comment: ok added ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -306,6 +307,39 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA } } + test("test load, update data with daylight saving time from different timezone") { +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_LOAD_SETLENIENT_ENABLE, "true") +val defaultTimeZone = TimeZone.getDefault +TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) +sql("DROP TABLE IF EXISTS t3") +sql( + """ + CREATE TABLE IF NOT EXISTS t3 + (ID Int, date date, starttime Timestamp, country String, + name String, phonetype String, serialname String, salary Int) + STORED AS carbondata TBLPROPERTIES('dateformat'='yyyy/MM/dd', + 'timestampformat'='yyyy-MM-dd HH:mm') +""") +sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData3.csv' into table t3") +sql(s"insert into t3 select 11,'2015-7-23','1941-3-15 00:00:00','china','aaa1','phone197'," +s"'ASD69643',15000") +sql("update t3 set (starttime) = ('1941-3-15 00:00:00') where name='aaa2'") +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 1"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 11"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 2"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +TimeZone.setDefault(defaultTimeZone) Review comment: done
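The strict-then-lenient fallback discussed in this review thread can be sketched in isolation. This is an illustrative standalone version, not the `DataTypeUtil.parseTimestamp` code above: a non-lenient `SimpleDateFormat` rejects calendar-invalid inputs, including wall-clock times skipped by a daylight-saving jump, while re-parsing with `setLenient(true)` rolls them forward instead of failing the load. Because the exact DST gap depends on the JVM's timezone data, the demo uses a date overflow ("2020-02-30") to show the same strict-vs-lenient behaviour deterministically.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;

public class LenientParseDemo {
    public static Date parseWithFallback(String value, String pattern) throws ParseException {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern);
        fmt.setLenient(false);
        try {
            return fmt.parse(value);       // strict parse first, as before the patch
        } catch (ParseException e) {
            fmt.setLenient(true);          // fallback the patch enables via a property
            Date rolled = fmt.parse(value);
            fmt.setLenient(false);         // restore strictness afterwards
            return rolled;
        }
    }

    public static void main(String[] args) throws ParseException {
        // Strict parsing rejects Feb 30; lenient parsing rolls it into March
        // (2020 is a leap year, so Feb 30 becomes Mar 1).
        Date d = parseWithFallback("2020-02-30 00:00:00", "yyyy-MM-dd HH:mm:ss");
        System.out.println(new SimpleDateFormat("yyyy-MM-dd").format(d)); // prints "2020-03-01"
    }
}
```

The same rolling behaviour is what turns a nonexistent DST timestamp such as 1941-3-15 00:00:00 in Asia/Shanghai into 01:00:00 in the PR's test expectations.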
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
VenuReddy2103 commented on a change in pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#discussion_r472379038 ## File path: core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java ## @@ -2456,4 +2471,24 @@ private CarbonCommonConstants() { * property which defines the insert stage flow */ public static final String IS_INSERT_STAGE = "is_insert_stage"; + + /** + * the level 1 complex delimiter default value + */ + @CarbonProperty Review comment: This looks to be just a value. Not the user configuration property. If so, @CarbonProperty is not required. please check and remove. check the same for below 2 more properties as well.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-675621560 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3775/
[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI
Karan980 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675619431 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3862: [CARBONDATA-3933]Fix DDL/DML failures after table is created with column names having special characters like #,\,%
CarbonDataQA1 commented on pull request #3862: URL: https://github.com/apache/carbondata/pull/3862#issuecomment-675594693 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3773/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675591167 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3774/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675590164 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2031/
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
VenuReddy2103 commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472332314 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing " + + dimensionValue + " to " + dateToStr); + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + return validateTimeStampRange(dateToStr.getTime()); +} catch (ParseException ex) { Review comment: `validateTimeStampRange()` throws `NumberFormatException`. You would want to do `dateFormatter.setLenient(false);` in that case too.
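The lenient fallback under review hinges on how `SimpleDateFormat` treats wall-clock times that fall into a daylight-saving gap: with `setLenient(false)` the parse is rejected, while with `setLenient(true)` the time is rolled past the gap. A minimal standalone sketch of that behavior (the class name, timezone, and timestamps are chosen for illustration; this is not CarbonData code, and it assumes the JDK's non-lenient calendar rejects nonexistent wall-clock times, which is the failure mode this PR works around):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Locale;
import java.util.TimeZone;

public class LenientParseDemo {
    // Parse `value` as "yyyy-MM-dd HH:mm:ss" in America/New_York.
    // Returns epoch millis, or null when strict (non-lenient) parsing rejects the value.
    static Long tryParse(String value, boolean lenient) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss", Locale.US);
        fmt.setTimeZone(TimeZone.getTimeZone("America/New_York"));
        fmt.setLenient(lenient);
        try {
            return fmt.parse(value).getTime();
        } catch (ParseException e) {
            return null; // non-lenient parse rejected the wall-clock time
        }
    }

    public static void main(String[] args) {
        // 2015-03-08 02:30 does not exist in America/New_York: clocks jump 02:00 -> 03:00.
        String inGap = "2015-03-08 02:30:00";
        System.out.println("strict:  " + tryParse(inGap, false));
        System.out.println("lenient: " + tryParse(inGap, true));
    }
}
```

This is why the patch flips the formatter to lenient only after a strict parse fails, and why the reviewers insist on restoring `setLenient(false)` on every exit path: the thread-local formatter is reused for later rows.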
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3862: [CARBONDATA-3933]Fix DDL/DML failures after table is created with column names having special characters like #,\,%
CarbonDataQA1 commented on pull request #3862: URL: https://github.com/apache/carbondata/pull/3862#issuecomment-675586969 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2032/
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
VenuReddy2103 commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472327182 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing " Review comment: `Changing setLenient to true for TimeStamp: " + dimensionValue` is redundant. We have already logged it in line 452.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675571056 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3771/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675569959 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2029/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes
CarbonDataQA1 commented on pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675563624 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3770/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
nihal0107 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472300831 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.util + +import java.io.{File, FileFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.commons.io.FileUtils + +object BadRecordUtil { + + /** + * get the bad record redirected csv file path + * @param dbName Review comment: done ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. 
You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.util + +import java.io.{File, FileFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.commons.io.FileUtils + +object BadRecordUtil { + + /** + * get the bad record redirected csv file path + * @param dbName + * @param tableName + * @param segment + * @param task + * @return csv File + */ + def getRedirectCsvPath(dbName: String, +tableName: String, segment: String, task: String): File = { +var badRecordLocation = CarbonProperties.getInstance() + .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC) +badRecordLocation = badRecordLocation + "/" + dbName + "/" + tableName + "/" + segment + "/" + + task +val listFiles = new File(badRecordLocation).listFiles(new FileFilter { + override def accept(pathname: File): Boolean = { +pathname.getPath.endsWith(".csv") + } +}) +listFiles(0) + } + + /** + * compare data of csvfile and redirected csv file. + * @param csvFilePath csv file path Review comment: done ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.util + +import java.io.{File, FileFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.commons.io.FileUtils + +object BadRecordUtil { + + /** + * get the
[GitHub] [carbondata] akashrn5 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
akashrn5 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472296594 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.util + +import java.io.{File, FileFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.commons.io.FileUtils + +object BadRecordUtil { + + /** + * get the bad record redirected csv file path + * @param dbName Review comment: remove these @param, not required ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. 
+ * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.carbondata.spark.util + +import java.io.{File, FileFilter} + +import org.apache.carbondata.core.constants.CarbonCommonConstants +import org.apache.carbondata.core.datastore.impl.FileFactory +import org.apache.carbondata.core.util.CarbonProperties +import org.apache.commons.io.FileUtils + +object BadRecordUtil { + + /** + * get the bad record redirected csv file path + * @param dbName + * @param tableName + * @param segment + * @param task + * @return csv File + */ + def getRedirectCsvPath(dbName: String, +tableName: String, segment: String, task: String): File = { +var badRecordLocation = CarbonProperties.getInstance() + .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC) +badRecordLocation = badRecordLocation + "/" + dbName + "/" + tableName + "/" + segment + "/" + + task +val listFiles = new File(badRecordLocation).listFiles(new FileFilter { + override def accept(pathname: File): Boolean = { +pathname.getPath.endsWith(".csv") + } +}) +listFiles(0) + } + + /** + * compare data of csvfile and redirected csv file. 
+ * @param csvFilePath csv file path + * @param redirectCsvPath redirected csv file path + * @return boolean + */ + def checkRedirectedCsvContentAvailableInSource(csvFilePath: String, +redirectCsvPath: File): Boolean = { +val origFileLineList = FileUtils.readLines(new File(csvFilePath)) +val redirectedFileLineList = FileUtils.readLines(redirectCsvPath) +val iterator = redirectedFileLineList.iterator() +while (iterator.hasNext) { + if (!origFileLineList.contains(iterator.next())) { +return false; + } +} +true + } + + /** + * delete the files at bad record location + * @param dbName database name + * @param tableName table name + * @return boolean + */ Review comment: same as above ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala ## @@ -0,0 +1,82 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472294029 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing " + + dimensionValue + " to " + dateToStr); + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + return validateTimeStampRange(dateToStr.getTime()); +} catch (ParseException ex) { + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + throw new NumberFormatException(ex.getMessage()); +} + } else { +throw new NumberFormatException(e.getMessage()); + } +} + } + + private static Long validateTimeStampRange(Long timeValue) { +SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); Review comment: Here, the `DateDirectDictionaryGenerator.MIN_VALUE` is ("0001-01-01"), which is not equal to the timestamp min value ("0001-01-01 00:00:00"). As the format is different, we will get different long values after parse.
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
VenuReddy2103 commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472288683 ## File path: integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala ## @@ -306,6 +307,39 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA } } + test("test load, update data with daylight saving time from different timezone") { +CarbonProperties.getInstance().addProperty( + CarbonCommonConstants.CARBON_LOAD_SETLENIENT_ENABLE, "true") +val defaultTimeZone = TimeZone.getDefault +TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai")) +sql("DROP TABLE IF EXISTS t3") +sql( + """ + CREATE TABLE IF NOT EXISTS t3 + (ID Int, date date, starttime Timestamp, country String, + name String, phonetype String, serialname String, salary Int) + STORED AS carbondata TBLPROPERTIES('dateformat'='yyyy/MM/dd', + 'timestampformat'='yyyy-MM-dd HH:mm') +""") +sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData3.csv' into table t3") +sql(s"insert into t3 select 11,'2015-7-23','1941-3-15 00:00:00','china','aaa1','phone197'," + +s"'ASD69643',15000") +sql("update t3 set (starttime) = ('1941-3-15 00:00:00') where name='aaa2'") +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 1"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 11"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +checkAnswer( + sql("SELECT starttime FROM t3 WHERE ID = 2"), + Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00"))) +) +TimeZone.setDefault(defaultTimeZone) Review comment: Remove `CARBON_LOAD_SETLENIENT_ENABLE` from carbon properties at the end of the testcase.
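The cleanup Venu asks for is the usual save-and-restore pattern: a test should put a global property back to its prior state even when an assertion fails mid-body. A generic Java sketch using `java.util.Properties` as a stand-in for `CarbonProperties` (the helper name and key are illustrative):

```java
import java.util.Properties;

public class PropertyRestoreDemo {
    // Run `body` with `key` temporarily set to `value`, then restore the prior state,
    // even if `body` throws. Distinguishes "was unset" from "had a previous value".
    static void withProperty(Properties props, String key, String value, Runnable body) {
        String saved = props.getProperty(key); // null when the key was not set before
        props.setProperty(key, value);
        try {
            body.run();
        } finally {
            if (saved == null) {
                props.remove(key);              // key was absent: remove it again
            } else {
                props.setProperty(key, saved);  // key existed: put the old value back
            }
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        withProperty(props, "carbon.load.dateformat.setlenient.enable", "true",
            () -> System.out.println("inside: " +
                props.getProperty("carbon.load.dateformat.setlenient.enable")));
        System.out.println("after: " +
            props.getProperty("carbon.load.dateformat.setlenient.enable"));
    }
}
```

Wrapping the body in `try`/`finally` (or Scala's `BeforeAndAfterAll`) is safer than appending a cleanup line at the end of the test, since a failed `checkAnswer` would otherwise leak the property into later tests.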
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
VenuReddy2103 commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472285299 ## File path: core/src/main/java/org/apache/carbondata/core/util/SessionParams.java ## @@ -153,6 +154,12 @@ private boolean validateKeyValue(String key, String value) throws InvalidConfigu case ENABLE_UNSAFE_IN_QUERY_EXECUTION: case ENABLE_AUTO_LOAD_MERGE: case CARBON_PUSH_ROW_FILTERS_FOR_VECTOR: + case CARBON_LOAD_SETLENIENT_ENABLE: Review comment: It can be a fall-through case. Can remove lines 158-162.
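The fall-through Venu suggests groups every boolean-valued session property under one shared validation arm instead of duplicating the check per key. A simplified sketch of the idea (the property-name strings are illustrative, not the exact CarbonData constants):

```java
public class BooleanPropertyValidator {
    // Boolean-valued session properties share one validation arm via case fall-through:
    // adding a new boolean property means adding one `case` label, not a new block.
    static boolean validateKeyValue(String key, String value) {
        switch (key) {
            case "enable.unsafe.in.query.processing":
            case "carbon.enable.auto.load.merge":
            case "carbon.push.rowfilters.for.vector":
            case "carbon.load.dateformat.setlenient.enable": // new property joins the group
                return "true".equalsIgnoreCase(value) || "false".equalsIgnoreCase(value);
            default:
                // Non-boolean properties would get their own validation elsewhere.
                return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(validateKeyValue("carbon.enable.auto.load.merge", "maybe"));
    }
}
```

The review point is exactly this: since the new `CARBON_LOAD_SETLENIENT_ENABLE` needs the same true/false check as the existing boolean cases, a dedicated `case` body (the cited lines 158-162) is redundant.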
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3896: [WIP] Fix load failures due to daylight saving time changes
CarbonDataQA1 commented on pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#issuecomment-675546141 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2028/
[GitHub] [carbondata] VenuReddy2103 commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes
VenuReddy2103 commented on a change in pull request #3896: URL: https://github.com/apache/carbondata/pull/3896#discussion_r472278473 ## File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java ## @@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue, } private static Object parseTimestamp(String dimensionValue, String dateFormat) { -Date dateToStr; -DateFormat dateFormatter; +Date dateToStr = null; +DateFormat dateFormatter = null; try { if (null != dateFormat && !dateFormat.trim().isEmpty()) { dateFormatter = new SimpleDateFormat(dateFormat); -dateFormatter.setLenient(false); } else { dateFormatter = timestampFormatter.get(); } + dateFormatter.setLenient(false); dateToStr = dateFormatter.parse(dimensionValue); - return dateToStr.getTime(); + return validateTimeStampRange(dateToStr.getTime()); } catch (ParseException e) { - throw new NumberFormatException(e.getMessage()); + // If the parsing fails, try to parse again with setLenient to true if the property is set + if (CarbonProperties.getInstance().isSetLenientEnabled()) { +try { + LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue); + dateFormatter.setLenient(true); + dateToStr = dateFormatter.parse(dimensionValue); + LOGGER.info( + "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing " + + dimensionValue + " to " + dateToStr); + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + return validateTimeStampRange(dateToStr.getTime()); +} catch (ParseException ex) { + dateFormatter.setLenient(false); + LOGGER.info("Changing setLenient back to false"); + throw new NumberFormatException(ex.getMessage()); +} + } else { +throw new NumberFormatException(e.getMessage()); + } +} + } + + private static Long validateTimeStampRange(Long timeValue) { +SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss"); Review comment: Instead of creating an instance of SimpleDateFormat each time, suggest using the existing `DateDirectDictionaryGenerator.MIN_VALUE` and `DateDirectDictionaryGenerator.MAX_VALUE` to validate.
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
CarbonDataQA1 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675526677 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2027/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
CarbonDataQA1 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675516186 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3769/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-675511761 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3768/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
CarbonDataQA1 commented on pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#issuecomment-675511043 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2026/
[jira] [Resolved] (CARBONDATA-3943) Handling the addition of geo column to hive at the time of table creation
[ https://issues.apache.org/jira/browse/CARBONDATA-3943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Akash R Nilugal resolved CARBONDATA-3943.
Fix Version/s: 2.1.0
Resolution: Fixed

> Handling the addition of geo column to hive at the time of table creation
> Key: CARBONDATA-3943
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3943
> Project: CarbonData
> Issue Type: Bug
> Reporter: SHREELEKHYA GAMPA
> Priority: Minor
> Fix For: 2.1.0
> Time Spent: 2h 20m
> Remaining Estimate: 0h

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] asfgit closed pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
asfgit closed pull request #3879: URL: https://github.com/apache/carbondata/pull/3879
[GitHub] [carbondata] akashrn5 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
akashrn5 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675494423 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
CarbonDataQA1 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675488619 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3764/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675487964 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2025/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3876: TestingCI
CarbonDataQA1 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675486176 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3765/
[GitHub] [carbondata] xubo245 commented on pull request #3819: [CARBONDATA-3855]support carbon SDK to load data from different files
xubo245 commented on pull request #3819: URL: https://github.com/apache/carbondata/pull/3819#issuecomment-675484851 LGTM
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
CarbonDataQA1 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675479510 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2024/
[GitHub] [carbondata] ShreelekhyaG opened a new pull request #3896: [WIP] Fix load failures due to daylight saving time changes
ShreelekhyaG opened a new pull request #3896: URL: https://github.com/apache/carbondata/pull/3896 ### Why is this PR needed? 1. Fix load failures due to daylight saving time changes. 2. During load, date/timestamp values with a year of more than 4 digits should fail or be treated as null according to the bad records action property. ### What changes were proposed in this PR? Added a new property to set lenient as true and parse the timestamp format. Added validation for timestamp range values. ### Does this PR introduce any user interface change? - No - Yes. (please explain the change and update document) ### Is any new testcase added? - Yes
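The lenient-parse-plus-range-validation approach this PR describes can be sketched roughly as follows. This is only an illustration of the underlying `SimpleDateFormat` behavior, not CarbonData code: the actual property name and validation logic live in the PR, and `parsesStrictly`, `parseLenient`, and `yearInRange` are hypothetical helper names.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;

public class TimestampParseSketch {
    static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

    // Strict parsing: rejects impossible calendar values outright
    // (and can fail on edge cases around daylight saving changes).
    static boolean parsesStrictly(String value) {
        SimpleDateFormat strict = new SimpleDateFormat(PATTERN);
        strict.setLenient(false);
        try {
            strict.parse(value);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }

    // Lenient parsing (the behavior the new property opts into):
    // tolerates values that strict parsing rejects, so the whole load
    // does not fail on a single odd timestamp.
    static Date parseLenient(String value) throws ParseException {
        SimpleDateFormat lenient = new SimpleDateFormat(PATTERN);
        lenient.setLenient(true);
        return lenient.parse(value);
    }

    // Hypothetical range check: a 5-digit year parses without error in
    // either mode, so it must be validated explicitly and then handled
    // according to the bad-records-action property.
    static boolean yearInRange(Date d) {
        Calendar cal = Calendar.getInstance();
        cal.setTime(d);
        int year = cal.get(Calendar.YEAR);
        return year >= 1 && year <= 9999;
    }
}
```

For example, `"2020-02-30 10:00:00"` fails strict parsing but lenient parsing rolls it into March, while `"20202-02-01 10:00:00"` parses fine in both modes and is only caught by the explicit year-range check.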
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
CarbonDataQA1 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675439020 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2023/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
CarbonDataQA1 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675437596 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3763/
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
nihal0107 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472122694

## File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
## @@ -145,47 +150,162 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     sql("drop table if exists carbon_table")
   }
-  test("test insert / update with data more than 32000 characters") {
+  test("test load / insert / update with data more than 32000 characters and bad record action as Redirect") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='REDIRECT','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    var redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "0")
+    assert(checkRedirectedCsvContentAvailableInSource(testdata, redirectCsvPath))
+    val longChar: String = RandomStringUtils.randomAlphabetic(33000)
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "true")
-    val testdata = s"$resourcesPath/32000char.csv"
-    sql("drop table if exists load32000chardata")
-    sql("drop table if exists load32000chardata_dup")
-    sql("CREATE TABLE load32000chardata(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql("CREATE TABLE load32000chardata_dup(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "REDIRECT");
+    sql(s"insert into longerthan32kchar values('33000', '$longChar', 4)")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "1", "0")
+    var redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    var iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("33000,"+longChar+",4"))
+    }
+
+    // Update strings of length greater than 32000
+    sql(s"update longerthan32kchar set(longerthan32kchar.dim2)=('$longChar') " +
+      "where longerthan32kchar.mes1=1").show()
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "1")
+    redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("ok,"+longChar+",1"))
+    }
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "false")
+
+    // Insert longer string without converter step will throw exception
     intercept[Exception] {

Review comment: Added at some place but here exception message is not user formatted
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
nihal0107 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472122959

## File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
## @@ -145,47 +150,162 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     sql("drop table if exists carbon_table")
   }
-  test("test insert / update with data more than 32000 characters") {
+  test("test load / insert / update with data more than 32000 characters and bad record action as Redirect") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='REDIRECT','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    var redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "0")
+    assert(checkRedirectedCsvContentAvailableInSource(testdata, redirectCsvPath))
+    val longChar: String = RandomStringUtils.randomAlphabetic(33000)
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "true")
-    val testdata = s"$resourcesPath/32000char.csv"
-    sql("drop table if exists load32000chardata")
-    sql("drop table if exists load32000chardata_dup")
-    sql("CREATE TABLE load32000chardata(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql("CREATE TABLE load32000chardata_dup(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "REDIRECT");
+    sql(s"insert into longerthan32kchar values('33000', '$longChar', 4)")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "1", "0")
+    var redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    var iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("33000,"+longChar+",4"))
+    }
+
+    // Update strings of length greater than 32000
+    sql(s"update longerthan32kchar set(longerthan32kchar.dim2)=('$longChar') " +
+      "where longerthan32kchar.mes1=1").show()
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "1")
+    redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("ok,"+longChar+",1"))
+    }
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "false")
+
+    // Insert longer string without converter step will throw exception
     intercept[Exception] {
-      sql("insert into load32000chardata_dup select dim1,concat(load32000chardata.dim2,''),mes1 from load32000chardata").show()
+      sql(s"insert into longerthan32kchar values('32000', '$longChar', 3)")
     }
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata_dup OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+  }
+
+  test("test load / insert / update with data more than 32000 characters and bad record action as Force") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2), Row("32123", null, 3)))
+    val longChar: String = RandomStringUtils.randomAlphabetic(33000)
+
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "true")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "FORCE");
+    sql(s"insert into longerthan32kchar values('33000', '$longChar', 4)")
+    checkAnswer(sql("select * from longerthan32kchar"),
+      Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2),
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
nihal0107 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472122318

## File path: integration/spark/src/main/scala/org/apache/spark/sql/test/util/QueryTest.scala
## @@ -207,6 +208,45 @@ class QueryTest extends PlanTest {
       }
     }
   }
+
+  def getRedirectCsvPath(dbName: String,

Review comment: done
[GitHub] [carbondata] nihal0107 commented on a change in pull request #3865: [CARBONDATA-3928] Handled the Strings which length is greater than 32000 as a bad record.
nihal0107 commented on a change in pull request #3865: URL: https://github.com/apache/carbondata/pull/3865#discussion_r472122106

## File path: integration/spark/src/test/scala/org/apache/carbondata/integration/spark/testsuite/dataload/TestLoadDataGeneral.scala
## @@ -145,47 +150,162 @@ class TestLoadDataGeneral extends QueryTest with BeforeAndAfterEach {
     sql("drop table if exists carbon_table")
   }
-  test("test insert / update with data more than 32000 characters") {
+  test("test load / insert / update with data more than 32000 characters and bad record action as Redirect") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='REDIRECT','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    var redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "0")
+    assert(checkRedirectedCsvContentAvailableInSource(testdata, redirectCsvPath))
+    val longChar: String = RandomStringUtils.randomAlphabetic(33000)
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "true")
-    val testdata = s"$resourcesPath/32000char.csv"
-    sql("drop table if exists load32000chardata")
-    sql("drop table if exists load32000chardata_dup")
-    sql("CREATE TABLE load32000chardata(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql("CREATE TABLE load32000chardata_dup(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_BAD_RECORDS_ACTION, "REDIRECT");
+    sql(s"insert into longerthan32kchar values('33000', '$longChar', 4)")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "1", "0")
+    var redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    var iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("33000,"+longChar+",4"))
+    }
+
+    // Update strings of length greater than 32000
+    sql(s"update longerthan32kchar set(longerthan32kchar.dim2)=('$longChar') " +
+      "where longerthan32kchar.mes1=1").show()
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("itsok", "hello", 2)))
+    redirectCsvPath = getRedirectCsvPath("default", "longerthan32kchar", "0", "1")
+    redirectedFileLineList = FileUtils.readLines(redirectCsvPath)
+    iterator = redirectedFileLineList.iterator()
+    while (iterator.hasNext) {
+      assert(iterator.next().equals("ok,"+longChar+",1"))
+    }
+    CarbonProperties.getInstance()
+      .addProperty(CarbonCommonConstants.CARBON_ENABLE_BAD_RECORD_HANDLING_FOR_INSERT, "false")
+
+    // Insert longer string without converter step will throw exception
     intercept[Exception] {
-      sql("insert into load32000chardata_dup select dim1,concat(load32000chardata.dim2,''),mes1 from load32000chardata").show()
+      sql(s"insert into longerthan32kchar values('32000', '$longChar', 3)")
     }
-    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table load32000chardata_dup OPTIONS('FILEHEADER'='dim1,dim2,mes1')")
+
+    FileFactory.deleteAllFilesOfDir(new File(CarbonProperties.getInstance()
+      .getProperty(CarbonCommonConstants.CARBON_BADRECORDS_LOC)))
+  }
+
+  test("test load / insert / update with data more than 32000 characters and bad record action as Force") {
+    val testdata = s"$resourcesPath/MoreThan32KChar.csv"
+    sql("CREATE TABLE longerthan32kchar(dim1 String, dim2 String, mes1 int) STORED AS carbondata")
+    sql(s"LOAD DATA LOCAL INPATH '$testdata' into table longerThan32kChar OPTIONS('FILEHEADER'='dim1,dim2,mes1', " +
+      s"'BAD_RECORDS_ACTION'='FORCE','BAD_RECORDS_LOGGER_ENABLE'='TRUE')")
+    checkAnswer(sql("select * from longerthan32kchar"), Seq(Row("ok", "hi", 1), Row("itsok", "hello", 2), Row("32123", null, 3)))

Review comment: done
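The review threads above exercise CarbonData's bad-records actions for strings longer than 32000 characters. The row-level semantics the tests assert (REDIRECT drops the row and writes it to a redirect CSV; FORCE keeps the row with the over-long value replaced by null) can be sketched roughly like this. This is an illustration of the tested behavior only, not CarbonData's implementation; all names here are made up.

```java
import java.util.ArrayList;
import java.util.List;

public class BadRecordSketch {
    static final int MAX_STRING_CHARS = 32000;

    enum Action { FAIL, IGNORE, REDIRECT, FORCE }

    // Returns the row to load, or null if the row is dropped.
    // Rows dropped by REDIRECT are collected so they can be written
    // to the redirect CSV, which is what the tests above verify.
    static String[] handle(String[] row, Action action, List<String[]> redirected) {
        boolean bad = false;
        for (String col : row) {
            if (col != null && col.length() > MAX_STRING_CHARS) {
                bad = true;
                break;
            }
        }
        if (!bad) {
            return row; // good record, loaded unchanged
        }
        switch (action) {
            case FAIL:
                throw new RuntimeException(
                    "bad record: string exceeds " + MAX_STRING_CHARS + " characters");
            case IGNORE:
                return null; // silently dropped
            case REDIRECT:
                redirected.add(row); // dropped, but preserved in the redirect CSV
                return null;
            case FORCE:
                // Keep the row; replace each over-long value with null.
                String[] fixed = row.clone();
                for (int i = 0; i < fixed.length; i++) {
                    if (fixed[i] != null && fixed[i].length() > MAX_STRING_CHARS) {
                        fixed[i] = null;
                    }
                }
                return fixed;
        }
        return null;
    }
}
```

This mirrors the expectations in the tests: under FORCE the table still contains a row like Row("32123", null, 3), while under REDIRECT that row is absent from the table and appears verbatim in the redirected CSV.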
[GitHub] [carbondata] akashrn5 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
akashrn5 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675417600 retest this please
[GitHub] [carbondata] Karan980 commented on pull request #3876: TestingCI
Karan980 commented on pull request #3876: URL: https://github.com/apache/carbondata/pull/3876#issuecomment-675417286 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3895: [WIP]SI fix for not equal to filter
CarbonDataQA1 commented on pull request #3895: URL: https://github.com/apache/carbondata/pull/3895#issuecomment-675415732 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3760/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3895: [WIP]SI fix for not equal to filter
CarbonDataQA1 commented on pull request #3895: URL: https://github.com/apache/carbondata/pull/3895#issuecomment-675397590 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2019/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
CarbonDataQA1 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675395912 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3759/
[GitHub] [carbondata] vikramahuja1001 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
vikramahuja1001 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675389481 retest this please
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3861: [CARBONDATA-3922] Support order by limit push down for secondary index queries
CarbonDataQA1 commented on pull request #3861: URL: https://github.com/apache/carbondata/pull/3861#issuecomment-675386736 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3758/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675382970 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3762/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675382272 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2022/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
CarbonDataQA1 commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675379676 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2018/
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3861: [CARBONDATA-3922] Support order by limit push down for secondary index queries
CarbonDataQA1 commented on pull request #3861: URL: https://github.com/apache/carbondata/pull/3861#issuecomment-675377981 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2017/
[jira] [Updated] (CARBONDATA-3954) Global sorting with array, if read from ORC format, write to carbon, error; If you use no_sort, success;
[ https://issues.apache.org/jira/browse/CARBONDATA-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xiaohui updated CARBONDATA-3954:
Attachment: wx20200818-174...@2x.png, wx20200818-174...@2x.png

> Global sorting with array, if read from ORC format, write to carbon, error; If you use no_sort, success;
> Key: CARBONDATA-3954
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3954
> Project: CarbonData
> Issue Type: Bug
> Components: spark-integration
> Affects Versions: 2.0.0
> Reporter: xiaohui
> Priority: Major
> Attachments: wx20200818-174...@2x.png, wx20200818-174...@2x.png
>
> 0: jdbc:hive2://localhost:1> use dict;
> No rows selected (0.391 seconds)
> 0: jdbc:hive2://localhost:1> select * from array_orc;
> | name  | col                                    | fee |
> | xiao3 | ["",null,"j"]                          | 3   |
> | xiao2 | ["上呼吸道疾病1","白内障1","胃溃疡1"]         | 2   |
> | xiao3 | ["",null,"j"]                          | 3   |
> | xiao1 | ["上呼吸道疾病","白内障","胃溃疡"]            | 1   |
> | xiao9 | NULL                                   | 3   |
> | xiao9 | NULL                                   | 3   |
> | xiao3 | NULL                                   | 3   |
> | xiao6 | NULL                                   | 3   |
> | xiao2 | ["上呼吸道疾病 1","白内障 1","胃溃疡 1"]      | 2   |
> | xiao1 | ["上呼吸道疾病 ","白内障 ","胃溃疡 "]         | 1   |
> | xiao3 | NULL                                   | 3   |
> | xiao3 | [null]                                 | 3   |
> | xiao3 | [""]                                   | 3   |
> 13 rows selected (0.416 seconds)
> 0: jdbc:hive2://localhost:1> create table array_carbon4(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='no_SORT');
> No rows selected (1.04 seconds)
> 0: jdbc:hive2://localhost:1> insert overwrite table array_carbon4 select name,col,fee from array_orc;
> No rows selected (5.065 seconds)
> 0: jdbc:hive2://localhost:1> create table array_carbon5(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='global_SORT');
> No rows selected (0.098 seconds)
> 0: jdbc:hive2://localhost:1> insert overwrite table array_carbon5 select name,col,fee from array_orc;
> Error: java.lang.Exception: DataLoad failure (state=,code=0)
[jira] [Updated] (CARBONDATA-3954) Global sorting with array, if read from ORC format, write to carbon, error; If you use no_sort, success;
[ https://issues.apache.org/jira/browse/CARBONDATA-3954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

xiaohui updated CARBONDATA-3954:
Description:

0: jdbc:hive2://localhost:1> use dict;
No rows selected (0.391 seconds)
0: jdbc:hive2://localhost:1> select * from array_orc;
| name  | col                                    | fee |
| xiao3 | ["",null,"j"]                          | 3   |
| xiao2 | ["上呼吸道疾病1","白内障1","胃溃疡1"]         | 2   |
| xiao3 | ["",null,"j"]                          | 3   |
| xiao1 | ["上呼吸道疾病","白内障","胃溃疡"]            | 1   |
| xiao9 | NULL                                   | 3   |
| xiao9 | NULL                                   | 3   |
| xiao3 | NULL                                   | 3   |
| xiao6 | NULL                                   | 3   |
| xiao2 | ["上呼吸道疾病 1","白内障 1","胃溃疡 1"]      | 2   |
| xiao1 | ["上呼吸道疾病 ","白内障 ","胃溃疡 "]         | 1   |
| xiao3 | NULL                                   | 3   |
| xiao3 | [null]                                 | 3   |
| xiao3 | [""]                                   | 3   |
13 rows selected (0.416 seconds)
0: jdbc:hive2://localhost:1> create table array_carbon4(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='no_SORT');
No rows selected (1.04 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon4 select name,col,fee from array_orc;
No rows selected (5.065 seconds)
0: jdbc:hive2://localhost:1> create table array_carbon5(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='global_SORT');
No rows selected (0.098 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon5 select name,col,fee from array_orc;
Error: java.lang.Exception: DataLoad failure (state=,code=0)

was:

0: jdbc:hive2://localhost:1> use dict;
No rows selected (0.391 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon3 select name,col,fee from array_orc;
Error: java.lang.Exception: DataLoad failure (state=,code=0)
0: jdbc:hive2://localhost:1> create table array_carbon4(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='no_SORT');
No rows selected (1.04 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon4 select name,col,fee from array_orc;
No rows selected (5.065 seconds)
0: jdbc:hive2://localhost:1> create table array_carbon5(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='global_SORT');
No rows selected (0.098 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon5 select name,col,fee from array_orc;
Error: java.lang.Exception: DataLoad failure (state=,code=0)
0: jdbc:hive2://localhost:1> select * from array_orc;
| name  | col                                    | fee |
| xiao3 | ["",null,"j"]                          | 3   |
| xiao2 | ["上呼吸道疾病1","白内障1","胃溃疡1"]         | 2   |
| xiao3 | ["",null,"j"]                          | 3   |
| xiao1 | ["上呼吸道疾病","白内障","胃溃疡"]            | 1   |
| xiao9 | NULL                                   | 3   |
| xiao9 | NULL                                   | 3   |
| xiao3 | NULL                                   | 3   |
| xiao6 | NULL                                   | 3   |
| xiao2 | ["上呼吸道疾病 1","白内障 1","胃溃疡 1"]      | 2   |
| xiao1 | ["上呼吸道疾病 ","白内障 ","胃溃疡 "]         | 1   |
| xiao3 | NULL                                   | 3   |
| xiao3 | [null]                                 | 3   |
| xiao3 | [""]                                   | 3   |
13 rows selected (0.416 seconds)

> Global
[jira] [Created] (CARBONDATA-3954) Global sorting with array, if read from ORC format, write to carbon, error; If you use no_sort, success;
xiaohui created CARBONDATA-3954:
-----------------------------------
Summary: Global sorting with array, if read from ORC format, write to carbon, error; If you use no_sort, success;
Key: CARBONDATA-3954
URL: https://issues.apache.org/jira/browse/CARBONDATA-3954
Project: CarbonData
Issue Type: Bug
Components: spark-integration
Affects Versions: 2.0.0
Reporter: xiaohui

0: jdbc:hive2://localhost:1> use dict;
No rows selected (0.391 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon3 select name,col,fee from array_orc;
Error: java.lang.Exception: DataLoad failure (state=,code=0)
0: jdbc:hive2://localhost:1> create table array_carbon4(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='no_SORT');
No rows selected (1.04 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon4 select name,col,fee from array_orc;
No rows selected (5.065 seconds)
0: jdbc:hive2://localhost:1> create table array_carbon5(name string, col array, fee int) STORED AS carbondata TBLPROPERTIES ('SORT_COLUMNS'='name', 'TABLE_BLOCKSIZE'='128', 'TABLE_BLOCKLET_SIZE'='128', 'SORT_SCOPE'='global_SORT');
No rows selected (0.098 seconds)
0: jdbc:hive2://localhost:1> insert overwrite table array_carbon5 select name,col,fee from array_orc;
Error: java.lang.Exception: DataLoad failure (state=,code=0)
0: jdbc:hive2://localhost:1> select * from array_orc;
+--------+------------------------------------------+------+
| name   | col                                      | fee  |
+--------+------------------------------------------+------+
| xiao3  | ["",null,"j"]                            | 3    |
| xiao2  | ["上呼吸道疾病1","白内障1","胃溃疡1"]        | 2    |
| xiao3  | ["",null,"j"]                            | 3    |
| xiao1  | ["上呼吸道疾病","白内障","胃溃疡"]           | 1    |
| xiao9  | NULL                                     | 3    |
| xiao9  | NULL                                     | 3    |
| xiao3  | NULL                                     | 3    |
| xiao6  | NULL                                     | 3    |
| xiao2  | ["上呼吸道疾病 1","白内障 1","胃溃疡 1"]     | 2    |
| xiao1  | ["上呼吸道疾病 ","白内障 ","胃溃疡 "]        | 1    |
| xiao3  | NULL                                     | 3    |
| xiao3  | [null]                                   | 3    |
| xiao3  | [""]                                     | 3    |
+--------+------------------------------------------+------+
13 rows selected (0.416 seconds)

-- This message was sent by Atlassian Jira (v8.3.4#803005)
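[Editor's note] The transcript in this issue boils down to a short reproduction, sketched below. This is a hedged reconstruction: the element type of `col` is assumed to be STRING (the type parameter of `array` was lost when the message was archived), and the table names follow the transcript.

```sql
-- Source table: ORC, with an array column (element type assumed STRING).
-- Loading it into a carbon table succeeds with NO_SORT:
CREATE TABLE array_carbon4 (name STRING, col ARRAY<STRING>, fee INT)
STORED AS carbondata
TBLPROPERTIES ('SORT_COLUMNS'='name', 'SORT_SCOPE'='NO_SORT');
INSERT OVERWRITE TABLE array_carbon4 SELECT name, col, fee FROM array_orc;  -- ok

-- The identical table with GLOBAL_SORT fails:
CREATE TABLE array_carbon5 (name STRING, col ARRAY<STRING>, fee INT)
STORED AS carbondata
TBLPROPERTIES ('SORT_COLUMNS'='name', 'SORT_SCOPE'='GLOBAL_SORT');
INSERT OVERWRITE TABLE array_carbon5 SELECT name, col, fee FROM array_orc;
-- Error: java.lang.Exception: DataLoad failure (state=,code=0)
```

Note that only the sort scope differs between the two tables; the source rows include empty strings, NULL arrays, and arrays containing null elements, which is likely where the global-sort path diverges from the no-sort path.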
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675366072 Build Failed with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3761/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] akkio-97 closed pull request #3773: [CARBONDATA-3830]Presto array columns read support
akkio-97 closed pull request #3773: URL: https://github.com/apache/carbondata/pull/3773 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] akkio-97 commented on pull request #3773: [CARBONDATA-3830]Presto array columns read support
akkio-97 commented on pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#issuecomment-675365410 okay, thanks for all your suggestions. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3887: [WIP] Refactor #3773 and support struct type
CarbonDataQA1 commented on pull request #3887: URL: https://github.com/apache/carbondata/pull/3887#issuecomment-675365569 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2021/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] akkio-97 commented on a change in pull request #3773: [CARBONDATA-3830]Presto array columns read support
akkio-97 commented on a change in pull request #3773: URL: https://github.com/apache/carbondata/pull/3773#discussion_r472038054

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/adaptive/AdaptiveIntegralCodec.java
@@ -23,6 +23,7 @@
 import java.util.BitSet;
 import java.util.List;
 import java.util.Map;
+import java.util.Stack;

Review comment: okay

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/FillVector.java
@@ -0,0 +1,347 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.carbondata.core.datastore.page.encoding;

Review comment: okay

## File path: core/src/main/java/org/apache/carbondata/core/datastore/page/encoding/FillVector.java
@@ -0,0 +1,347 @@
+(Apache License 2.0 header, as above)
+
+package org.apache.carbondata.core.datastore.page.encoding;
+
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.BitSet;
+
+import org.apache.carbondata.core.metadata.datatype.DataType;
+import org.apache.carbondata.core.metadata.datatype.DataTypes;
+import org.apache.carbondata.core.metadata.datatype.DecimalConverterFactory;
+import org.apache.carbondata.core.scan.result.vector.CarbonColumnVector;
+import org.apache.carbondata.core.scan.result.vector.ColumnVectorInfo;
+import org.apache.carbondata.core.scan.result.vector.impl.CarbonColumnVectorImpl;
+import org.apache.carbondata.core.util.ByteUtil;
+
+public class FillVector {
+  private byte[] pageData;
+  private float floatFactor = 0;
+  private double factor = 0;
+  private ColumnVectorInfo vectorInfo;
+  private BitSet nullBits;
+
+  public FillVector(byte[] pageData, ColumnVectorInfo vectorInfo, BitSet nullBits) {
+    this.pageData = pageData;
+    this.vectorInfo = vectorInfo;
+    this.nullBits = nullBits;
+  }
+
+  public void setFactor(double factor) {
+    this.factor = factor;
+  }
+
+  public void setFloatFactor(float floatFactor) {
+    this.floatFactor = floatFactor;
+  }
+
+  public void basedOnType(CarbonColumnVector vector, DataType vectorDataType, int pageSize,
+      DataType pageDataType) {
+    if (vectorInfo.vector.getColumnVector() != null && ((CarbonColumnVectorImpl) vectorInfo.vector
+        .getColumnVector()).isComplex()) {
+      fillComplexType(vector.getColumnVector(), pageDataType);
+    } else {
+      fillPrimitiveType(vector, vectorDataType, pageSize, pageDataType);
+      vector.setIndex(0);
+    }
+  }
+
+  private void fillComplexType(CarbonColumnVector vector, DataType pageDataType) {
+    CarbonColumnVectorImpl vectorImpl = (CarbonColumnVectorImpl) vector;
+    if (vector != null && vector.getChildrenVector() != null) {
+      ArrayList childElements = ((CarbonColumnVectorImpl) vector).getChildrenElements();
+      for (int i = 0; i < childElements.size(); i++) {
+        int count = childElements.get(i);
+        typeComplexObject(vectorImpl.getChildrenVector().get(0), count, pageDataType);
+        vector.putArrayObject();
+      }
+      vectorImpl.getChildrenVector().get(0).setIndex(0);
+    }
+  }
+
+  private void fillPrimitiveType(CarbonColumnVector vector, DataType vectorDataType, int pageSize,
+      DataType pageDataType) {
+    // offset which denotes the start index for pageData
+    int pageIndex = vector.getIndex();
+    int rowId = 0;
+
+    // Filling into vector is done based on page
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #1431: [WIP] DataMap Access Path Optimization
CarbonDataQA1 commented on pull request #1431: URL: https://github.com/apache/carbondata/pull/1431#issuecomment-675362465 Build Failed with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2020/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Resolved] (CARBONDATA-3919) Improve concurrent query performance
[ https://issues.apache.org/jira/browse/CARBONDATA-3919?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Akash R Nilugal resolved CARBONDATA-3919.
-----------------------------------------
Fix Version/s: 2.1.0
Resolution: Fixed

> Improve concurrent query performance
>
> Key: CARBONDATA-3919
> URL: https://issues.apache.org/jira/browse/CARBONDATA-3919
> Project: CarbonData
> Issue Type: Improvement
> Reporter: Ajantha Bhat
> Assignee: Ajantha Bhat
> Priority: Major
> Fix For: 2.1.0
> Time Spent: 3.5h
> Remaining Estimate: 0h
>
> Problem 1: when 500 queries were executed concurrently, the checkIfRefreshIsNeeded method was synchronized, so only one thread could make progress at a time. Synchronization is actually required only when the schema is modified to drop tables, not for the whole method.
>
> Solution: synchronize only the table-removal part. With this change, the total time for 500 concurrent queries improved from 10 seconds to 3 seconds in a cluster.
>
> Problem 2: TokenCache.obtainTokensForNamenodes was causing a performance bottleneck for concurrent queries, so it was removed.

-- This message was sent by Atlassian Jira (v8.3.4#803005)
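[Editor's note] The "Problem 1" fix described above — narrowing a method-wide `synchronized` down to just the mutation — can be sketched generically. The class and method names below are illustrative, not CarbonData's actual API; the point is the pattern: lock-free reads over a concurrent map, with a short synchronized block (plus a re-check inside it) only around the stale-entry removal.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SchemaCache {
    private final Map<String, Long> tableModifiedTimes = new ConcurrentHashMap<>();
    private final Object removalLock = new Object();

    // Before the fix: the whole method was synchronized, serializing all readers.
    // After: the read-mostly freshness check runs without a lock; only the
    // removal of a stale entry is synchronized.
    public boolean checkIfRefreshIsNeeded(String tableName, long latestModifiedTime) {
        Long cachedTime = tableModifiedTimes.get(tableName);
        if (cachedTime == null) {
            return true; // not cached yet; caller should load the schema
        }
        if (cachedTime < latestModifiedTime) {
            synchronized (removalLock) {
                // Re-check inside the lock: another thread may have removed it already.
                Long current = tableModifiedTimes.get(tableName);
                if (current != null && current < latestModifiedTime) {
                    tableModifiedTimes.remove(tableName);
                }
            }
            return true;
        }
        return false;
    }

    public void put(String tableName, long modifiedTime) {
        tableModifiedTimes.put(tableName, modifiedTime);
    }

    public static void main(String[] args) {
        SchemaCache cache = new SchemaCache();
        cache.put("t1", 100L);
        System.out.println(cache.checkIfRefreshIsNeeded("t1", 100L)); // false: up to date
        System.out.println(cache.checkIfRefreshIsNeeded("t1", 200L)); // true: stale, removed
        System.out.println(cache.checkIfRefreshIsNeeded("t1", 200L)); // true: no longer cached
    }
}
```

Because concurrent readers never contend on a lock in the common (cache-hit) path, throughput under many parallel queries improves, which matches the 10s-to-3s number reported in the issue.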
[GitHub] [carbondata] asfgit closed pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
asfgit closed pull request #3858: URL: https://github.com/apache/carbondata/pull/3858 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] akashrn5 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
akashrn5 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675355454 LGTM This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
CarbonDataQA1 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675351077 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2015/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3858: [CARBONDATA-3919] Improve concurrent query performance
CarbonDataQA1 commented on pull request #3858: URL: https://github.com/apache/carbondata/pull/3858#issuecomment-675349093 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3756/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3894: [WIP] Added property to enable disable SIforFailed segments and added prope…
CarbonDataQA1 commented on pull request #3894: URL: https://github.com/apache/carbondata/pull/3894#issuecomment-675347031 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[jira] [Created] (CARBONDATA-3953) Dead lock when doing dataframe persist and loading
ChenKai created CARBONDATA-3953:
-----------------------------------
Summary: Dead lock when doing dataframe persist and loading
Key: CARBONDATA-3953
URL: https://issues.apache.org/jira/browse/CARBONDATA-3953
Project: CarbonData
Issue Type: Bug
Affects Versions: 2.1.0
Reporter: ChenKai
Attachments: image-2020-08-18-15-59-46-108.png, image-2020-08-18-16-03-33-370.png

Thread-1: !image-2020-08-18-15-59-46-108.png!
Thread-2: !image-2020-08-18-16-03-33-370.png!

-- This message was sent by Atlassian Jira (v8.3.4#803005)
[GitHub] [carbondata] vikramahuja1001 opened a new pull request #3895: [WIP]SI fix fr not equal to filter
vikramahuja1001 opened a new pull request #3895: URL: https://github.com/apache/carbondata/pull/3895

### Why is this PR needed?

### What changes were proposed in this PR?

### Does this PR introduce any user interface change?
- No
- Yes. (please explain the change and update document)

### Is any new testcase added?
- No
- Yes

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
ShreelekhyaG commented on a change in pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#discussion_r471986162

## File path: integration/spark/src/main/scala/org/apache/spark/sql/CarbonSource.scala
@@ -281,10 +281,22 @@ object CarbonSource {
 isExternal)
 val updatedFormat = CarbonToSparkAdapter
 .getUpdatedStorageFormat(storageFormat, updatedTableProperties, tableInfo.getTablePath)

Review comment: Added validation and changed the schema ordinal value of the geo column from -1 to 0, so it is now added to the schema and handled. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] ShreelekhyaG commented on pull request #3879: [CARBONDATA-3943] Handling the addition of geo column to hive at the time of table creation.
ShreelekhyaG commented on pull request #3879: URL: https://github.com/apache/carbondata/pull/3879#issuecomment-675319661 retest this please This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] vikramahuja1001 commented on pull request #3861: [CARBONDATA-3922] Support order by limit push down for secondary index queries
vikramahuja1001 commented on pull request #3861: URL: https://github.com/apache/carbondata/pull/3861#issuecomment-675315576 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-675307163 Build Success with Spark 2.4.5, Please check CI http://121.244.95.60:12545/job/ApacheCarbon_PR_Builder_2.4.5/2013/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [carbondata] CarbonDataQA1 commented on pull request #3885: [CARBONDATA-3946] Support IndexServer with Presto Engine
CarbonDataQA1 commented on pull request #3885: URL: https://github.com/apache/carbondata/pull/3885#issuecomment-675299911 Build Success with Spark 2.3.4, Please check CI http://121.244.95.60:12545/job/ApacheCarbonPRBuilder2.3/3754/ This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org