[
https://issues.apache.org/jira/browse/SAMZA-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Cameron Lee updated SAMZA-2297:
-------------------------------
Description:
# When a stream is empty for an "InMemory" system, the admin reports an
oldest/newest offset of "0" and an upcoming offset of "1".
# The "newest" offset returned by the admin is one higher than the offset of
the last message in the list. Therefore, consuming from this offset would give
no messages.
# The "upcoming" offset returned by the admin is two higher than the offset of
the last message in the list. However, the next message actually gets an offset
which is (offset of the last message + 1).
For case (1), a job that uses UPCOMING offsets misses the first message in a
stream if the job starts while the stream is empty and messages are produced
afterwards. The current usages go through the InMemorySystemDescriptor, which
uses OLDEST offsets, so the current tests do not hit this issue. However,
internal streams don't use the descriptor, so some other tests run into this
problem when they are moved to an in-memory system.
For cases (2) and (3), the current usage of the in-memory system doesn't seem
to be affected, apparently because the current tests don't consume from the
newest/upcoming offsets.
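To make the three cases concrete, here is a minimal sketch of the offsets one
would expect if a partition is modelled as a plain list and a message's offset
is its zero-based index in that list. This is not the InMemorySystemAdmin code
or the eventual fix: the class and method names are hypothetical, and returning
null for the oldest/newest offset of an empty partition is an assumption made
for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, not Samza code. Assumption: a partition is a list and
// each message's offset is its zero-based index in that list.
public class InMemoryOffsetSketch {

    // Offset of the first message; null for an empty partition (assumption).
    static String oldestOffset(List<?> messages) {
        return messages.isEmpty() ? null : "0";
    }

    // Offset of the last message, so consuming from it yields that message;
    // null for an empty partition (assumption).
    static String newestOffset(List<?> messages) {
        return messages.isEmpty() ? null : String.valueOf(messages.size() - 1);
    }

    // Offset the next produced message will receive: (last offset + 1),
    // which is 0 while the partition is still empty.
    static String upcomingOffset(List<?> messages) {
        return String.valueOf(messages.size());
    }

    public static void main(String[] args) {
        List<String> partition = new ArrayList<>();

        // Case (1): empty partition. Upcoming is 0, so a consumer starting
        // from UPCOMING does not skip the first message produced later.
        System.out.println("empty: upcoming=" + upcomingOffset(partition));

        partition.add("m0");
        partition.add("m1");
        partition.add("m2");

        // Cases (2)/(3): the last message has offset 2, so newest is 2 and
        // upcoming is 3.
        System.out.println("3 msgs: oldest=" + oldestOffset(partition)
                + " newest=" + newestOffset(partition)
                + " upcoming=" + upcomingOffset(partition));
    }
}
```

Under this model, the behavior described in (1) corresponds to an upcoming
offset of "1" instead of "0" for an empty partition, and (2)/(3) correspond to
newest and upcoming each being one higher than the values computed here.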
was:
# When a stream is empty for an "InMemory" system, the admin returns that the
oldest/newest offset is "0" and the upcoming offset is "1". This means that if
consumption starts from the upcoming offset when the job initially saw an empty
stream, then the first message would get skipped once messages started getting
produced.
# The "newest" offset returned by the admin is one higher than the offset of
the last message in the list. Therefore, consuming from this offset would give
no messages.
# The "upcoming" offset returned by the admin is two higher than the offset of
the last message in the list. However, the next message actually gets an offset
which is (offset of the last message + 1).
> InMemorySystemAdmin offsets are off-by-one in some cases
> --------------------------------------------------------
>
> Key: SAMZA-2297
> URL: https://issues.apache.org/jira/browse/SAMZA-2297
> Project: Samza
> Issue Type: Bug
> Reporter: Cameron Lee
> Assignee: Cameron Lee
> Priority: Major