Beam throws an error at submission time in Python if you pass a single
PCollection to Flatten. The scenario you describe concerns a one-element
list.
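
To make the distinction concrete, here is a plain-Python sketch of the two
semantics. It is not Beam code: `flatten` and `flat_map` are hypothetical
stand-ins that model the transforms on ordinary lists, and the identity
default on `flat_map` only mirrors the proposal in this thread.

```python
from itertools import chain


def flatten(pcolls):
    """Hypothetical model of beam.Flatten: merge several collections into one."""
    return list(chain.from_iterable(pcolls))


def flat_map(pcoll, fn=lambda x: x):
    """Hypothetical model of beam.FlatMap: apply fn to each element, unpack the result."""
    return [out for element in pcoll for out in fn(element)]


# One collection whose elements happen to be lists.
single = [[1, 2], [3]]

# Flattening a one-element list of collections leaves the elements untouched.
merged = flatten([single])

# An identity FlatMap unpacks every element instead.
unpacked = flat_map(single)
```

Under this model a one-element list passed to Flatten keeps its elements
as they are, which is why treating that case as a FlatMap would change
results.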

On Thu, Mar 21, 2024, 13:43 Joey Tran <joey.t...@schrodinger.com> wrote:

> I think it'd be quite surprising if beam.Flatten became equivalent
> to FlatMap when passed only a single pcollection. One use case that
> would break is flattening a variable number of pcollections, which may
> include only one pcollection. In that case, the single pcollection
> would suddenly get FlatMapped.
>
>
>
> On Thu, Mar 21, 2024 at 4:36 PM Valentyn Tymofieiev via dev <
> dev@beam.apache.org> wrote:
>
>> One possible alternative is to define beam.Flatten for a single
>> collection to be functionally equivalent to beam.FlatMap(lambda x: x), but
>> that would be a larger change and such behavior might need to be
>> consistent across SDKs and documented. Adding a default value is a simpler
>> change.
>>
>> I can also confirm that the usage
>>
>>     |  'Flatten' >> beam.FlatMap(lambda x: x)
>>
>> is fairly common, judging by uses of Beam internally.
>>
>> On Thu, Mar 21, 2024 at 1:30 PM Robert Bradshaw via dev <
>> dev@beam.apache.org> wrote:
>>
>>> IIRC, Java has Flatten.iterables() and Flatten.collections(), the first
>>> of which does what you want.
>>>
>>> Giving FlatMap a default arg of lambda x: x is an interesting idea. The
>>> only downside I see is a less clear error if one forgets to provide this
>>> (currently mandatory) parameter, but maybe that cost is low enough to be
>>> worth the convenience?
>>>
>>> On Thu, Mar 21, 2024 at 12:02 PM Joey Tran <joey.t...@schrodinger.com>
>>> wrote:
>>>
>>>> That's not really the same thing, is it? `beam.Flatten` combines two or
>>>> more pcollections into a single pcollection while beam.FlatMap unpacks
>>>> iterables of elements (i.e. PCollection<Iterable<T>> -> PCollection<T>)
>>>>
>>>> On Thu, Mar 21, 2024 at 2:57 PM Valentyn Tymofieiev via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> Hi, you can use beam.Flatten() instead.
>>>>>
>>>>> On Thu, Mar 21, 2024 at 10:55 AM Joey Tran <joey.t...@schrodinger.com>
>>>>> wrote:
>>>>>
>>>>>> Hey all,
>>>>>>
>>>>>> Using an identity function for FlatMap comes up more often than using
>>>>>> FlatMap without an identity function. Would it make sense to use the
>>>>>> identity function as a default?
