Hi,

In analytics use cases we use a couple of extensions to extract attributes (year, month, day, hour, etc.) from the event timestamp. In product analytics implementations, we mostly use the time:extract() Siddhi extension [1] in real-time scenarios and DateTimeUDF in batch scenarios [2]. However, both extensions compute their output time units in the local time zone of the server on which DAS is running.
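
To illustrate the effect of the server time zone, here is a minimal sketch in plain Java. It is not the code of either extension; the class name TimeUnitExtractor and the extract() helper are hypothetical, and "Asia/Kolkata" is assumed as the zone ID for IST. It extracts the same time units from one epoch timestamp (1461135538669, i.e. 2016/04/20 06:58:58 GMT), first in IST and then in UTC.

```java
import java.util.Arrays;
import java.util.Calendar;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical helper for illustration only; not part of Siddhi or shared-analytics.
class TimeUnitExtractor {

    // Returns {year, month (1-12), day of month, hour (0-23), minute}
    // for the given epoch milliseconds, resolved in the given time zone.
    static int[] extract(long epochMillis, TimeZone zone) {
        Calendar cal = new GregorianCalendar(zone);
        cal.setTimeInMillis(epochMillis);
        return new int[] {
            cal.get(Calendar.YEAR),
            cal.get(Calendar.MONTH) + 1,   // Calendar months are 0-based
            cal.get(Calendar.DAY_OF_MONTH),
            cal.get(Calendar.HOUR_OF_DAY),
            cal.get(Calendar.MINUTE)
        };
    }

    public static void main(String[] args) {
        long ts = 1461135538669L;

        // Assumed zone ID for a server configured with IST (UTC+05:30).
        int[] local = extract(ts, TimeZone.getTimeZone("Asia/Kolkata"));
        int[] utc = extract(ts, TimeZone.getTimeZone("UTC"));

        System.out.println(Arrays.toString(local)); // [2016, 4, 20, 12, 28]
        System.out.println(Arrays.toString(utc));   // [2016, 4, 20, 6, 58]
    }
}
```

The point of the sketch is that the zone is passed in explicitly rather than taken from the JVM default, so the result does not change with the server configuration. Whether the extensions should take the zone as a parameter or always use UTC is a separate design question.
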

For example, if DAS is running on a server whose time zone is set to IST, then for the timestamp 1461135538669 (2016/04/20 06:58:58 GMT) the output time units (year, month, day, hour, minute) are resolved as 2016, 04, 20, 12, 28, whereas in UTC the hour and minute would be 06 and 58.

In most scenarios, analytics is implemented on a per <time unit> basis, i.e., we maintain summary tables per minute, per hour, per day and per month. These summary tables have columns for year, month, date, hour, etc. Since the aforementioned extensions return time units based on the local time zone, what we store in these tables are local time units.

IMO, we should store UTC time units instead, since it is better to keep time units uniform and independent of the time zone of the server on which DAS is running. We have also found that this inconsistency can cause issues in incremental data processing.

Shall we extend our analytics extensions to support UTC time units throughout?

[1] https://github.com/wso2/siddhi/blob/master/modules/siddhi-extensions/time/src/main/java/org/wso2/siddhi/extension/time/ExtractAttributesFunctionExtension.java
[2] https://github.com/wso2/shared-analytics/blob/master/components/spark-udf/org.wso2.carbon.analytics.shared.spark.common.udf/src/main/java/org/wso2/carbon/analytics/shared/common/udf/DateTimeUDF.java

--
Thanks & Regards,
Inosh Goonewardena
Associate Technical Lead - WSO2 Inc.
Mobile: +94779966317
_______________________________________________
Architecture mailing list
[email protected]
https://mail.wso2.org/cgi-bin/mailman/listinfo/architecture
