akashrn5 commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r476487539



##########
File path: 
core/src/main/java/org/apache/carbondata/core/constants/CarbonCommonConstants.java
##########
@@ -1592,6 +1592,15 @@ private CarbonCommonConstants() {
 
   public static final String CARBON_LUCENE_INDEX_STOP_WORDS_DEFAULT = "false";
 
+  // Property to enable parsing the timestamp/date data with setLenient = true 
in load
+  // flow if it fails with parse invalid timestamp data. (example: 1941-03-15 
00:00:00
+  // is valid time in Asia/Calcutta zone and is invalid and will fail to parse 
in Asia/Shanghai
+  // zone as DST is observed and clocks were turned forward 1 hour to 
1941-03-15 01:00:00)
+  @CarbonProperty(dynamicConfigurable = true) public static final String

Review comment:
       `move public static final String` to next line

##########
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##########
@@ -444,9 +446,38 @@ private static Object parseTimestamp(String 
dimensionValue, String dateFormat) {
         dateFormatter = timestampFormatter.get();
       }
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      timeValue = dateToStr.getTime();
+      validateTimeStampRange(timeValue);
+      return timeValue;
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if 
the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          dateFormatter.setLenient(false);
+          timeValue = dateToStr.getTime();
+          validateTimeStampRange(timeValue);
+          LOGGER.info("Parsed data with lenience as true, setting back to 
default mode");
+          return timeValue;
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Failed to parse data with lenience as true, setting 
back to default mode");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static void validateTimeStampRange(Long timeValue) {
+    if (timeValue < DateDirectDictionaryGenerator.MIN_VALUE
+        || timeValue > DateDirectDictionaryGenerator.MAX_VALUE) {
+      throw new NumberFormatException(
+          "timestamp column data value: " + timeValue + "is not in valid " + 
"range of: "

Review comment:
       `"is not in valid " + "range of: "` correct the formatting here, there 
are unnecessary concatenations

##########
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala
##########
@@ -68,4 +71,32 @@ object BadRecordUtil {
     badRecordLocation = badRecordLocation + "/" + dbName + "/" + tableName
     
FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(badRecordLocation))
   }
+
+  def createCSV(rows: ListBuffer[Array[String]], csvPath: String): Unit = {
+    val out = new BufferedWriter(new FileWriter(csvPath))
+    val writer: CSVWriter = new CSVWriter(out)
+    try {
+      for (row <- rows) {
+        writer.writeNext(row)
+      }
+    }
+    catch {
+      case e: Exception =>
+        Assert.fail(e.getMessage)
+    }
+    finally {
+      out.close()
+      writer.close()
+    }
+  }
+
+  def deleteCSVFile(csvPath: String): Unit = {

Review comment:
       i don't think this method is required, just call 
`FileUtils.forceDelete(new File(csvPath))` in each test case itself and no need 
of any Assert here, as its not doing any functional validation

##########
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/util/BadRecordUtil.scala
##########
@@ -68,4 +71,32 @@ object BadRecordUtil {
     badRecordLocation = badRecordLocation + "/" + dbName + "/" + tableName
     
FileFactory.deleteAllCarbonFilesOfDir(FileFactory.getCarbonFile(badRecordLocation))
   }
+
+  def createCSV(rows: ListBuffer[Array[String]], csvPath: String): Unit = {
+    val out = new BufferedWriter(new FileWriter(csvPath))

Review comment:
       correct the formatting of this method and may be you can use 
try-with-resource here, you can check once




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to