[GitHub] [spark] HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] Introduce new option to Kafka source: offset by timestamp (starting/ending)

GitBox Tue, 19 Mar 2019 02:27:40 -0700

HeartSaVioR commented on a change in pull request #23747: [SPARK-26848][SQL] 
Introduce new option to Kafka source: offset by timestamp (starting/ending)
URL: https://github.com/apache/spark/pull/23747#discussion_r266793948


 ##########
 File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/JsonUtils.scala
 ##########
 @@ -76,6 +76,16 @@ private object JsonUtils {
     }
   }
 
+  def topicTimestamps(str: String): Map[String, Long] = {
+    try {
+      Serialization.read[Map[String, Long]](str)
+    } catch {
+      case NonFatal(x) =>
+        throw new IllegalArgumentException(
+          s"""Expected e.g. {"topicA": 1549597128110,"topicB": 1549597120110}, got $str""")
 
 Review comment:
   The offsets read for each partition at a specific time would differ in most
cases, so the two options are different. My 2 cents: I think referring to
offsets was mostly needed when Kafka didn't have a timestamp index. Once we
provide a feature based on the timestamp index, end users would love to use it
instead of offsets, except for those who know the exact offsets to replay from.
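
To illustrate why a single timestamp maps to a different offset in each partition, here is a minimal sketch over a made-up in-memory model. It is not the code in this PR and does not call Kafka or Spark: `Record`, `TimestampLookupSketch`, `offsetsForTimestamp`, and the sample data are hypothetical names introduced only for this example. The lookup rule (earliest offset whose timestamp is greater than or equal to the target) follows the documented behavior of Kafka's `KafkaConsumer.offsetsForTimes`, and the sketch assumes timestamps do not decrease as offsets grow within a partition.

```scala
// Hypothetical in-memory model of one record in a partition; not a Kafka or Spark type.
final case class Record(offset: Long, timestamp: Long)

object TimestampLookupSketch {
  // For each partition, return the earliest offset whose timestamp is >= target,
  // or None when every record in that partition is older than the target.
  // Assumes records are ordered by offset and timestamps are non-decreasing.
  def offsetsForTimestamp(
      partitions: Map[Int, Seq[Record]],
      target: Long): Map[Int, Option[Long]] =
    partitions.map { case (partition, records) =>
      partition -> records.find(_.timestamp >= target).map(_.offset)
    }

  // Made-up sample data: three partitions of the same topic.
  val samplePartitions: Map[Int, Seq[Record]] = Map(
    0 -> Seq(Record(0L, 1000L), Record(1L, 2000L), Record(2L, 3000L)),
    1 -> Seq(Record(0L, 500L), Record(1L, 900L), Record(2L, 1200L), Record(3L, 2500L)),
    2 -> Seq(Record(0L, 100L), Record(1L, 200L))
  )

  def main(args: Array[String]): Unit = {
    val resolved = offsetsForTimestamp(samplePartitions, target = 2000L)
    resolved.toSeq.sortBy(_._1).foreach { case (partition, offset) =>
      val shown = offset.map(_.toString).getOrElse("no record at or after the timestamp")
      println(s"partition $partition -> $shown")
    }
  }
}
```

With the sample data and a target of 2000, partition 0 resolves to offset 1, partition 1 to offset 3, and partition 2 has no matching record, so one timestamp cannot be expressed as one shared offset.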

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
