EandrewJones commented on code in PR #46:
URL: https://github.com/apache/flagon-distill/pull/46#discussion_r1663059324


##########
distill/core/log.py:
##########
@@ -53,3 +55,27 @@ def to_json(self) -> str:
 
     def to_dict(self) -> JsonDict:
         return self.data.model_dump(by_alias=True)
+
+
+def normalize_timestamp(timestamp: Timestamp, tz: str ='+0000') -> datetime:
+    """
+    Attempts to normalize a given timestamp to a datetime object 
+    Arguments:
+        timestamp: a int or float representing (milli)seconds since the epoch or an
+            arbitrary timestamp string (ex: '02/19/24 10:32:02', 1719530111079)
+        tz: an arbitrary timestamp string (ex: '+0100')
+    """
+    if isinstance(timestamp, str):
+        # Only uses provided timezone arg if there is no associated timezone in the timestamp
+        parsed = dateparser.parse(timestamp, settings={'TIMEZONE': tz})
+        if parsed is None:
+            raise ValueError("ERROR: could not parse timestamp " + str(timestamp))
+        return parsed.astimezone(timezone.utc)
+    elif isinstance(timestamp, float) or isinstance(timestamp, int):
+        tzinformation = dateparser.parse("00:01", settings={'TIMEZONE': tz}).tzinfo
+
+        if timestamp > datetime.now().timestamp():
+            timestamp = timestamp / 1000
+        return datetime.fromtimestamp(float(timestamp), tzinformation)
+    else:
+        raise TypeError("ERROR: " + str(type(timestamp)) + " timestamp should be a string, int, or datetime object")

Review Comment:
   The function looks good, but the user experience is off.
   
   I don't think this is something we want to directly expose to the user. As a 
user, my logs' timestamps should automatically be converted, if possible. I 
shouldn't have to call this function as part of my processing pipeline.
   
   One way to do this is to move this over to the schema. I believe you can 
inject behavior into the schema during the validation lifecycle, so that _if_ 
there's a timestamp field, we normalize on load.
   
   We can leave this as a function for users, bake it into our schema, and then 
allow users to utilize it if supplying a custom schema.
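
   As a rough sketch of what "normalize on load" could look like, assuming the schema is a pydantic v2 model (the existing `model_dump` call suggests so). `ExampleLogSchema` and the `client_time` field are hypothetical names, and the `normalize_timestamp` below is a simplified stdlib-only stand-in for the function in this PR (no `dateparser`, no `tz` argument), so treat it as an illustration rather than the final design:

```python
from datetime import datetime, timezone
from typing import Any, Union

from pydantic import BaseModel, field_validator


def normalize_timestamp(timestamp: Union[str, int, float]) -> datetime:
    """Simplified stand-in for the PR's function (ISO strings and epoch numbers only)."""
    if isinstance(timestamp, str):
        parsed = datetime.fromisoformat(timestamp)
        # Assumption: treat naive strings as UTC.
        if parsed.tzinfo is None:
            parsed = parsed.replace(tzinfo=timezone.utc)
        return parsed.astimezone(timezone.utc)
    if isinstance(timestamp, (int, float)):
        # Same heuristic as the PR: a value later than "now" is taken to be milliseconds.
        if timestamp > datetime.now().timestamp():
            timestamp = timestamp / 1000
        return datetime.fromtimestamp(float(timestamp), timezone.utc)
    # ValueError (not TypeError) so pydantic reports it as a validation error.
    raise ValueError(f"unsupported timestamp type: {type(timestamp)}")


class ExampleLogSchema(BaseModel):  # hypothetical schema, not the real distill one
    client_time: datetime  # hypothetical timestamp field

    # mode="before" runs on the raw input, ahead of pydantic's own datetime parsing.
    @field_validator("client_time", mode="before")
    @classmethod
    def _normalize_client_time(cls, value: Any) -> datetime:
        return normalize_timestamp(value)


# The user never calls normalize_timestamp; it happens while the log is loaded.
log = ExampleLogSchema.model_validate({"client_time": 1719530111079})
print(log.client_time.isoformat())
```

   Someone supplying a custom schema could attach the same function to their own timestamp field in the same way, which is why keeping it as a standalone function still seems useful.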



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscr...@flagon.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

