[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

2020-08-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472382210



##
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static Long validateTimeStampRange(Long timeValue) {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Review comment:
   Rechecked. The range check now uses the existing values
`DateDirectDictionaryGenerator.MIN_VALUE` and
`DateDirectDictionaryGenerator.MAX_VALUE`.
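
For readers following the thread, a range check of this kind might look roughly like the sketch below. This is not the code from the PR: the class name is hypothetical, the two bounds are illustrative stand-ins computed in UTC (the real limits are the `DateDirectDictionaryGenerator` constants), and throwing `NumberFormatException` for out-of-range values is an assumption modeled on how `parseTimestamp` reports parse failures.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

final class TimestampRangeSketch {
  // Illustrative bounds only (assumption); CarbonData keeps its own limits
  // in DateDirectDictionaryGenerator.MIN_VALUE / MAX_VALUE.
  public static final long MIN_VALUE = parseUtcDay("0001-01-01");
  public static final long MAX_VALUE = parseUtcDay("9999-12-31");

  // Parses a yyyy-MM-dd day strictly in UTC and returns epoch milliseconds.
  static long parseUtcDay(String day) {
    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd");
    df.setTimeZone(TimeZone.getTimeZone("UTC"));
    df.setLenient(false);
    try {
      return df.parse(day).getTime();
    } catch (ParseException e) {
      throw new IllegalStateException("bad bound literal: " + day, e);
    }
  }

  // Returns the value unchanged when it lies inside the bounds,
  // otherwise rejects it (the exception type is an assumption).
  public static long validateTimeStampRange(long timeValue) {
    if (timeValue < MIN_VALUE || timeValue > MAX_VALUE) {
      throw new NumberFormatException("timestamp value " + timeValue
          + " is outside the range [" + MIN_VALUE + ", " + MAX_VALUE + "]");
    }
    return timeValue;
  }

  public static void main(String[] args) {
    // 0L is 1970-01-01T00:00:00Z, well inside the illustrative bounds.
    System.out.println(validateTimeStampRange(0L));
  }
}
```

Comparing against precomputed long constants avoids building a new `SimpleDateFormat` on every call, which is one reason reusing existing constants can be preferable to parsing the bounds inside the validation method.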





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

2020-08-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381642



##
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "

Review comment:
   Agreed. Removed.









[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

2020-08-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472381408



##
File path: core/src/main/java/org/apache/carbondata/core/util/SessionParams.java
##
@@ -153,6 +154,12 @@ private boolean validateKeyValue(String key, String value) throws InvalidConfigu
   case ENABLE_UNSAFE_IN_QUERY_EXECUTION:
   case ENABLE_AUTO_LOAD_MERGE:
   case CARBON_PUSH_ROW_FILTERS_FOR_VECTOR:
+  case CARBON_LOAD_SETLENIENT_ENABLE:

Review comment:
   Done.









[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

2020-08-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472380864



##
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {

Review comment:
   OK, added.

##
File path: 
integration/spark/src/test/scala/org/apache/carbondata/spark/testsuite/dataload/TestLoadDataWithDiffTimestampFormat.scala
##
@@ -306,6 +307,39 @@ class TestLoadDataWithDiffTimestampFormat extends QueryTest with BeforeAndAfterA
     }
   }
 
+  test("test load, update data with daylight saving time from different timezone") {
+    CarbonProperties.getInstance().addProperty(
+      CarbonCommonConstants.CARBON_LOAD_SETLENIENT_ENABLE, "true")
+    val defaultTimeZone = TimeZone.getDefault
+    TimeZone.setDefault(TimeZone.getTimeZone("Asia/Shanghai"))
+    sql("DROP TABLE IF EXISTS t3")
+    sql(
+      """
+        CREATE TABLE IF NOT EXISTS t3
+        (ID Int, date date, starttime Timestamp, country String,
+        name String, phonetype String, serialname String, salary Int)
+        STORED AS carbondata TBLPROPERTIES('dateformat'='yyyy/MM/dd',
+        'timestampformat'='yyyy-MM-dd HH:mm')
+      """)
+    sql(s" LOAD DATA LOCAL INPATH '$resourcesPath/timeStampFormatData3.csv' into table t3")
+    sql(s"insert into t3 select 11,'2015-7-23','1941-3-15 00:00:00','china','aaa1','phone197'," +
+        s"'ASD69643',15000")
+    sql("update t3 set (starttime) = ('1941-3-15 00:00:00') where name='aaa2'")
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 1"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 11"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    checkAnswer(
+      sql("SELECT starttime FROM t3 WHERE ID = 2"),
+      Seq(Row(Timestamp.valueOf("1941-3-15 01:00:00")))
+    )
+    TimeZone.setDefault(defaultTimeZone)

Review comment:
   Done.
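
The daylight-saving behavior this test exercises can be reproduced outside CarbonData with plain `java.text.SimpleDateFormat`. The sketch below is only an illustration under stated assumptions: the class and method names are hypothetical, and it uses the 2021-03-14 spring-forward gap in `America/New_York` rather than the 1941 `Asia/Shanghai` transition from the test, because historical rules depend on the time zone data shipped with the JDK. A wall-clock time inside the gap does not exist, so strict parsing rejects it, while lenient parsing shifts it forward.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

final class LenientDstParseSketch {
  private static final String PATTERN = "yyyy-MM-dd HH:mm:ss";

  // Builds a fresh formatter per call so that no lenient flag is ever
  // left switched on in a shared or thread-local instance.
  static SimpleDateFormat newFormatter(String zoneId, boolean lenient) {
    SimpleDateFormat df = new SimpleDateFormat(PATTERN);
    df.setTimeZone(TimeZone.getTimeZone(zoneId));
    df.setLenient(lenient);
    return df;
  }

  // Strict parse first; retry leniently only when the (hypothetical)
  // fallback flag is enabled. Failures surface as NumberFormatException,
  // mirroring the style of the method under review.
  public static long parseWithLenientFallback(String value, String zoneId,
      boolean lenientFallbackEnabled) {
    try {
      return newFormatter(zoneId, false).parse(value).getTime();
    } catch (ParseException strictFailure) {
      if (!lenientFallbackEnabled) {
        throw new NumberFormatException(strictFailure.getMessage());
      }
      try {
        return newFormatter(zoneId, true).parse(value).getTime();
      } catch (ParseException lenientFailure) {
        throw new NumberFormatException(lenientFailure.getMessage());
      }
    }
  }

  // Formats epoch milliseconds back into wall-clock text for the zone.
  public static String render(long millis, String zoneId) {
    return newFormatter(zoneId, false).format(new Date(millis));
  }

  public static void main(String[] args) {
    String zone = "America/New_York";
    String inGap = "2021-03-14 02:30:00"; // clocks jumped from 02:00 to 03:00
    long millis = parseWithLenientFallback(inGap, zone, true);
    System.out.println(inGap + " was stored as " + render(millis, zone));
  }
}
```

Creating a new formatter for the lenient retry is a design choice of this sketch, not something taken from the PR; it sidesteps the need to reset `setLenient(false)` on every exit path.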









[GitHub] [carbondata] ShreelekhyaG commented on a change in pull request #3896: [WIP] Fix load failures due to daylight saving time changes

2020-08-18 Thread GitBox


ShreelekhyaG commented on a change in pull request #3896:
URL: https://github.com/apache/carbondata/pull/3896#discussion_r472294029



##
File path: core/src/main/java/org/apache/carbondata/core/util/DataTypeUtil.java
##
@@ -434,20 +434,59 @@ public static Object getDataDataTypeForNoDictionaryColumn(String dimensionValue,
   }
 
   private static Object parseTimestamp(String dimensionValue, String dateFormat) {
-    Date dateToStr;
-    DateFormat dateFormatter;
+    Date dateToStr = null;
+    DateFormat dateFormatter = null;
     try {
       if (null != dateFormat && !dateFormat.trim().isEmpty()) {
         dateFormatter = new SimpleDateFormat(dateFormat);
-        dateFormatter.setLenient(false);
       } else {
         dateFormatter = timestampFormatter.get();
       }
+      dateFormatter.setLenient(false);
       dateToStr = dateFormatter.parse(dimensionValue);
-      return dateToStr.getTime();
+      return validateTimeStampRange(dateToStr.getTime());
     } catch (ParseException e) {
-      throw new NumberFormatException(e.getMessage());
+      // If the parsing fails, try to parse again with setLenient to true if the property is set
+      if (CarbonProperties.getInstance().isSetLenientEnabled()) {
+        try {
+          LOGGER.info("Changing setLenient to true for TimeStamp: " + dimensionValue);
+          dateFormatter.setLenient(true);
+          dateToStr = dateFormatter.parse(dimensionValue);
+          LOGGER.info(
+              "Changing setLenient to true for TimeStamp: " + dimensionValue + ". Changing "
+                  + dimensionValue + " to " + dateToStr);
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          return validateTimeStampRange(dateToStr.getTime());
+        } catch (ParseException ex) {
+          dateFormatter.setLenient(false);
+          LOGGER.info("Changing setLenient back to false");
+          throw new NumberFormatException(ex.getMessage());
+        }
+      } else {
+        throw new NumberFormatException(e.getMessage());
+      }
+    }
+  }
+
+  private static Long validateTimeStampRange(Long timeValue) {
+    SimpleDateFormat df = new SimpleDateFormat("yyyy-MM-dd HH:mm:ss");

Review comment:
   Here, `DateDirectDictionaryGenerator.MIN_VALUE` is ("0001-01-01"),
which is not equal to the timestamp min value ("0001-01-01 00:00:00"). Because
the formats differ, parsing the two yields different long values.
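
One way to check what the two strings parse to is to run both through `SimpleDateFormat` and compare the resulting long values. The sketch below is not CarbonData code: the class and method names are hypothetical, and the time zone given to each formatter is an assumption that directly affects the numbers printed.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.TimeZone;

final class MinValueCompareSketch {
  // Strictly parses the value with the given pattern in the given zone
  // and returns epoch milliseconds.
  public static long parse(String value, String pattern, String zoneId)
      throws ParseException {
    SimpleDateFormat df = new SimpleDateFormat(pattern);
    df.setTimeZone(TimeZone.getTimeZone(zoneId));
    df.setLenient(false);
    return df.parse(value).getTime();
  }

  public static void main(String[] args) throws ParseException {
    // The zone is an assumption; pass another id as the first argument to compare.
    String zone = args.length > 0 ? args[0] : "UTC";
    long dateMin = parse("0001-01-01", "yyyy-MM-dd", zone);
    long timestampMin = parse("0001-01-01 00:00:00", "yyyy-MM-dd HH:mm:ss", zone);
    System.out.println("date-only pattern: " + dateMin);
    System.out.println("timestamp pattern: " + timestampMin);
    System.out.println("difference (ms):   " + (timestampMin - dateMin));
  }
}
```

Running it with different zone ids, or with a different zone for each formatter, shows how much of any difference comes from the zone setting and how much from the pattern.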




