Grouping everything under "Reify" sounds good to me: it's a short name, which makes pipeline code easier to read.  It's also friendlier to people learning the API, since anyone who doesn't need it can ignore a single class called Reify instead of many similarly named classes (ReifyTimestamps, ReifyValueInSingleWindows, Reify*...).

Would we mark ReifyTimestamps deprecated at the same time, so that in the long term there is only a single Reify class?  Or would that need to wait for a 3.x release?

On 10/11/2017 11:34 AM, Eugene Kirpichov wrote:
Luke, I think you're talking about the ability to *output into the given
window*. Wesley's code is about just *extracting* the current element's
windowing info and packaging it into a ValueInSingleWindow. I'd say +1,
this is a safe and potentially handy little utility transform. Such
reification is also mentioned in s.apache.org/context-fn as an argument
against needing explicit windowing information in context for user code
closures.

In terms of API, I'd suggest to package this under Reify:
Reify.timestampedValues() could be a synonym for ReifyTimestamps,
Reify.valuesInWindows() could be what you've implemented.
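For illustration only, the namespace shape suggested here can be modeled in plain Java without Beam. In this sketch a List stands in for a PCollection, and Element, Timestamped, InWindow, and this toy Reify class are hypothetical stand-ins rather than Beam classes (pane information is left out for brevity). Only the two factory-method names are taken from the suggestion above.

```java
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical stand-in for an element plus the metadata a runner tracks for it.
final class Element<T> {
  final T value;
  final long timestampMillis;
  final String window;

  Element(T value, long timestampMillis, String window) {
    this.value = value;
    this.timestampMillis = timestampMillis;
    this.window = window;
  }
}

// Toy counterpart of a timestamped value: the payload plus its timestamp.
final class Timestamped<T> {
  final T value;
  final long timestampMillis;

  Timestamped(T value, long timestampMillis) {
    this.value = value;
    this.timestampMillis = timestampMillis;
  }
}

// Toy counterpart of a value in a single window: payload, timestamp, window.
final class InWindow<T> {
  final T value;
  final long timestampMillis;
  final String window;

  InWindow(T value, long timestampMillis, String window) {
    this.value = value;
    this.timestampMillis = timestampMillis;
    this.window = window;
  }
}

// One short entry point; each kind of reification is a static factory method.
final class Reify {
  private Reify() {}

  // Copies each element's timestamp into the output value.
  static <T> Function<List<Element<T>>, List<Timestamped<T>>> timestampedValues() {
    return input -> input.stream()
        .map(e -> new Timestamped<T>(e.value, e.timestampMillis))
        .collect(Collectors.toList());
  }

  // Copies each element's timestamp and window into the output value.
  static <T> Function<List<Element<T>>, List<InWindow<T>>> valuesInWindows() {
    return input -> input.stream()
        .map(e -> new InWindow<T>(e.value, e.timestampMillis, e.window))
        .collect(Collectors.toList());
  }
}
```

Under these assumptions a call site reads as Reify.<String>timestampedValues().apply(elements), so the only name a reader has to recognize is Reify. Both methods only extract metadata the element already carries; neither assigns a new window or timestamp.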

There are other kinds of reification possible; I don't know whether it's a
good idea to put them under the same namespace or not: e.g. Reify.asIterable():
PCollection<T> -> PCollection<Iterable<T>> (equivalent to grouping by a
Void key and taking the values).
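The equivalence mentioned for asIterable() can be sketched in plain Java, again with a List standing in for a PCollection. AsIterableSketch is a hypothetical name, and the sketch deliberately ignores windowing and triggering, which a real grouping would have to respect.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class AsIterableSketch {
  private AsIterableSketch() {}

  // Pairs every element with the same (null) Void key, groups by that key,
  // then keeps only the grouped values and drops the key.
  static <T> List<Iterable<T>> asIterable(List<T> input) {
    Map<Void, List<T>> grouped = new HashMap<>();
    for (T element : input) {
      // Void has no instances, so null is the only possible key.
      grouped.computeIfAbsent(null, k -> new ArrayList<>()).add(element);
    }
    return new ArrayList<>(grouped.values());
  }
}
```

In this toy model a non-empty input yields exactly one Iterable holding every element, and an empty input yields no groups at all.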

On Wed, Oct 11, 2017 at 2:14 PM Lukasz Cwik <lc...@google.com.invalid>
wrote:

Reifying requires outputting records within a given window and timestamp.
Giving access to the underlying information and the ability to output arbitrary
records within arbitrary windows is dangerous, as a user may not honor the
required windowing/triggering semantics and a runner may drop records,
causing confusion for users.

On Sat, Oct 7, 2017 at 12:52 PM, Wesley Tanaka <wtanaka+b...@wtanaka.com>
wrote:

GatherAllPanes.ReifyTimestampsAndWindowsFn looks useful for giving
MapElements, Filter, et al. access to PaneInfo and BoundedWindow. Is there a
reason why that functionality shouldn't be made into a public PTransform?
I filed https://issues.apache.org/jira/browse/BEAM-3035 which can be
resolved as invalid if this is a bad idea.
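As a rough sketch of why this would help element-wise transforms: once the window and pane details travel inside the value, an ordinary predicate or mapping function can read them. The example below is plain Java with no Beam dependency; Reified, its fields, and ReifiedFilterSketch are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

// Hypothetical reified element: the payload plus metadata that a plain
// element-wise function normally cannot see.
final class Reified<T> {
  final T value;
  final String window;
  final boolean lastPane;

  Reified(T value, String window, boolean lastPane) {
    this.value = value;
    this.window = window;
    this.lastPane = lastPane;
  }
}

final class ReifiedFilterSketch {
  private ReifiedFilterSketch() {}

  // Keeps only payloads that arrived in the last pane of their window,
  // using nothing more than a standard Predicate over the reified value.
  static <T> List<T> lastPaneValues(List<Reified<T>> input) {
    Predicate<Reified<T>> isLast = r -> r.lastPane;
    return input.stream()
        .filter(isLast)
        .map(r -> r.value)
        .collect(Collectors.toList());
  }
}
```

The filter needs no special context argument here, because everything it inspects is already part of the element.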

More generally, it seems like ValueInSingleWindow is hardly used across
the API.  Is there a reason to avoid it, either in the API or in user
code or both?

--
Wesley Tanaka
https://wtanaka.com/