Apache Beam does not currently guarantee any ordering of elements, by
timestamp or otherwise. Any ordering you observe is incidental and is not
guaranteed in any way.

Since the smallest unit of work is a bundle containing a single element, the
only way to get ordering is to collect all the data that needs to be ordered
into one giant element and perform the ordering yourself (e.g. GroupByKey
with a single dummy key, then sort the grouped values).
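
A minimal sketch of that approach with the Beam Python SDK, assuming events
are (timestamp, value) tuples. The names DUMMY_KEY, sort_group and run_sketch
are hypothetical and only for illustration. The input here is a small bounded
in-memory collection; with an unbounded source such as PubSub, GroupByKey
would also need a window or trigger, and each group has to be small enough to
sort in memory on a single worker.

```python
from typing import Iterable, List, Tuple

# Hypothetical event shape: (timestamp_millis, value).
Event = Tuple[int, bool]

# Hypothetical constant: one key so that every element lands in one group.
DUMMY_KEY = "all"


def sort_group(kv: Tuple[str, Iterable[Event]]) -> Tuple[str, List[Event]]:
    """Sort the grouped values by timestamp; Beam does not do this for us."""
    key, events = kv
    return key, sorted(events, key=lambda e: e[0])


def run_sketch(events: List[Event]) -> None:
    # Imported lazily so sort_group can be used and tested without Beam.
    import apache_beam as beam

    with beam.Pipeline() as p:
        (
            p
            | "Create" >> beam.Create(events)
            # Attach the same key to every element.
            | "AddDummyKey" >> beam.Map(lambda e: (DUMMY_KEY, e))
            # Collect everything under that key into one element.
            | "Group" >> beam.GroupByKey()
            # Do the ordering ourselves inside that single element.
            | "Sort" >> beam.Map(sort_group)
            | "Print" >> beam.Map(print)
        )
```

To order per key rather than globally, the same shape should work with the
real key in place of DUMMY_KEY, at the cost of one group per key.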

On Thu, Aug 3, 2017 at 12:41 PM, Eric Fang <ef...@stacklighting.com> wrote:

> Hi all,
>
> We have a stream of data that's ordered by a timestamp and our use case
> requires us to process the data in order with respect to the previous
> element. For example, we have a stream of true/false ingested from PubSub
> and we want to make sure that, for each key, a true is always followed by a false.
>
> I know from PubSub, the order is not guaranteed, but for the same Dataflow
> job, does the ProcessContext.output guarantee order when processElement is
> called based on event time or process time? From my experiment, this
> assumption seems to hold up but I wonder if this is an actual assumption of
> the system.
>
> In addition, if I key the stream with another key, does the assumption
> still hold? If not, is there any way with Beam to ensure that
> processElement is called in order of some timestamp?
>
> Thanks
> Eric
>
>
> --
>
> Eric Fang
>
> Stack Labs  |  10054 Pasadena Ave, Cupertino, CA 95014
>
>
