[ https://issues.apache.org/jira/browse/CALCITE-3073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Matt Wang updated CALCITE-3073:
-------------------------------
Description:
Currently the KafkaAdapter consumes data from the default offset
(latest/earliest/last_offset) and runs forever.
In other words, if the app runs for the first time and the user wants to
consume past data, the user must set the {{auto.offset.reset}} parameter to
*earliest*.
{quote}{{auto.offset.reset}}: What to do when there is no initial offset in Kafka
or if the current offset does not exist any more on the server (e.g. because
that data has been deleted):
* earliest: automatically reset the offset to the earliest offset
* latest: automatically reset the offset to the latest offset
* none: throw exception to the consumer if no previous offset is found for the
consumer's group
* anything else: throw exception to the consumer.{quote}
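As a rough illustration of the setting described above, the following sketch builds Kafka consumer properties with {{auto.offset.reset}} set to earliest. The keys are standard Kafka consumer configuration names; the class name, the broker address and the group id are placeholders chosen for this example, and how the KafkaAdapter receives these properties is not shown here.

```java
import java.util.Properties;

// Hypothetical helper, not part of Calcite: collects the consumer settings
// needed to start from the earliest retained offset on a first run.
final class OffsetResetConfigSketch {

  // Builds consumer properties that fall back to the earliest retained offset
  // when the consumer group has no committed offset yet.
  static Properties consumerProps(String bootstrapServers, String groupId) {
    Properties props = new Properties();
    props.setProperty("bootstrap.servers", bootstrapServers);
    props.setProperty("group.id", groupId);
    // Only takes effect when no valid committed offset exists for the group.
    props.setProperty("auto.offset.reset", "earliest");
    return props;
  }

  public static void main(String[] args) {
    // Placeholder broker address and group id (assumptions for this sketch).
    Properties props = consumerProps("localhost:9092", "calcite-demo");
    System.out.println(props.getProperty("auto.offset.reset"));
  }
}
```

Note that this setting only chooses between the two ends of the log; it cannot express "start from yesterday".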
For example, suppose data in Kafka is retained for 7 days and you only want to
read yesterday's data. If you cannot control the start timestamp, you can only
read from the earliest offset, which is very inefficient. Supporting
consumption from a specific timestamp in KafkaAdapter would be a good idea for
such cases.
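To make the 7-day example concrete, here is a minimal, library-free sketch of what "start from a timestamp" means for one partition: find the first offset whose record timestamp is at or after the target. It is a model only, assuming record timestamps are non-decreasing by offset (which Kafka does not guarantee) and using made-up hourly data; the class and method names are hypothetical. With the real Kafka Java client, the corresponding lookup is {{KafkaConsumer.offsetsForTimes}} followed by {{seek}}; how the KafkaAdapter would expose this is not specified in this issue.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical model of a timestamp-based start offset for a single partition.
final class TimestampSeekSketch {

  // Returns the first offset whose record timestamp is >= targetMs,
  // or -1 if every record is older than the target.
  // timestampsMs[i] is the timestamp of the record at offset i and is
  // assumed to be non-decreasing (an assumption of this sketch).
  static long firstOffsetAtOrAfter(long[] timestampsMs, long targetMs) {
    int lo = 0;
    int hi = timestampsMs.length;
    while (lo < hi) {
      int mid = (lo + hi) >>> 1;
      if (timestampsMs[mid] < targetMs) {
        lo = mid + 1;
      } else {
        hi = mid;
      }
    }
    return lo == timestampsMs.length ? -1 : lo;
  }

  public static void main(String[] args) {
    // Arbitrary reference instant and made-up data: one record per hour
    // across a 7-day retention window.
    Instant now = Instant.parse("2019-05-20T00:00:00Z");
    Instant oldest = now.minus(Duration.ofDays(7));
    long[] timestampsMs = new long[7 * 24];
    for (int i = 0; i < timestampsMs.length; i++) {
      timestampsMs[i] = oldest.plus(Duration.ofHours(i)).toEpochMilli();
    }

    // Target: everything from the last 24 hours.
    long targetMs = now.minus(Duration.ofDays(1)).toEpochMilli();
    long start = firstOffsetAtOrAfter(timestampsMs, targetMs);

    // Reading from the earliest offset would process 'start' extra records.
    System.out.println("start offset = " + start
        + ", records skipped vs. earliest = " + start);
  }
}
```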
was:
Currently the KafkaAdapter consumes data from default
offset(latest/earliest/last_offset) and runs forever.
In other words, if the app runs at first time and user want to consume the past
data, user must set the value of '{color:#707070}*auto.offset.reset'*{color}
parameter to *earliest*.
{quote}{{auto.offset.reset:What to do when there is no initial offset in Kafka
or if the current offset does not exist any more on the server (e.g. because
that data has been deleted):}}
* earliest: automatically reset the offset to the earliest offset
* latest: automatically reset the offset to the latest offset
* none: throw exception to the consumer if no previous offset is found for the
consumer's group
* anything else: throw exception to the consumer.{quote}
for example, suppose data in Kafka is retained for 7 days and you just want to
read from the data of yesterday, if you could not control the start timestamp,
you can read from the earliest offset, it's very inefficient. If supporting to
consume from special timestamp will be a good idea for some cases.
> Support to consume from timestamp in KafkaAdapter
> -------------------------------------------------
>
> Key: CALCITE-3073
> URL: https://issues.apache.org/jira/browse/CALCITE-3073
> Project: Calcite
> Issue Type: Improvement
> Components: kafka-adapter
> Reporter: Xu Mingmin
> Assignee: Matt Wang
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 40m
> Remaining Estimate: 0h