Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
It's fair. If we change the default value, we can perhaps add error-handling logic so that (pcoll) | beam.Flatten() fails with an error that recommends (pcoll) | beam.FlatMap(), instead of saying that the input is not an iterable.

On Thu, Mar 21, 2024 at 3:41 PM Joey Tran wrote:
> +1
>
> On Thu,
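A minimal sketch of the kind of check described above, written in plain Python so it runs without Beam. `FakePCollection` and `validate_flatten_input` are hypothetical names used only for illustration; this is not how Beam implements Flatten, and the error wording is an assumption.

```python
class FakePCollection:
    """Hypothetical stand-in for a PCollection; not a Beam class."""

    def __init__(self, elements):
        self.elements = list(elements)


def validate_flatten_input(pcolls):
    """Reject a bare collection and point the user at FlatMap instead."""
    if isinstance(pcolls, FakePCollection):
        # Assumed wording: name the likely intent instead of reporting
        # only that the input is not an iterable.
        raise TypeError(
            "Flatten expects an iterable of PCollections but received a "
            "single PCollection. To unpack iterable elements, use FlatMap."
        )
    return tuple(pcolls)


pcoll = FakePCollection([[1, 2], [3]])
validated = validate_flatten_input((pcoll,))  # a one-element tuple is accepted
```

Passing `pcoll` itself, without the enclosing tuple, raises the `TypeError` that mentions FlatMap.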

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
+1

On Thu, Mar 21, 2024 at 6:30 PM Robert Bradshaw via dev wrote:
> I would be more comfortable with a default for FlatMap than overloading
> Flatten in this way. Distinguishing between
>
> (pcoll,) | beam.Flatten()
>
> and
>
> (pcoll) | beam.Flatten()
>
> seems a bit error prone.

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Robert Bradshaw via dev
I would be more comfortable with a default for FlatMap than overloading Flatten in this way. Distinguishing between

    (pcoll,) | beam.Flatten()

and

    (pcoll) | beam.Flatten()

seems a bit error-prone.

On Thu, Mar 21, 2024 at 2:23 PM Joey Tran wrote:
> Ah, I misunderstood your original
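The two spellings differ only by a trailing comma, which in Python decides whether the parentheses build a tuple at all. A small plain-Python illustration, with a list standing in for a PCollection (an assumption made so the snippet runs without Beam):

```python
# A list stands in for a PCollection here; no Beam import is needed.
pcoll = [1, 2, 3]

one_tuple = (pcoll,)   # trailing comma: a tuple holding one collection
just_parens = (pcoll)  # parentheses only: the collection itself

# The first is a container around the collection; the second is the
# very same object as pcoll.
is_tuple = isinstance(one_tuple, tuple)
same_object = just_parens is pcoll
```

Here `is_tuple` and `same_object` are both `True`, so any behavior that depended on telling the two forms apart would hinge on a single character.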

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
Ah, I misunderstood your original suggestion. That makes sense then. I have already seen someone get a little confused by the names and surprised that Flatten doesn't do what FlatMap does.

On Thu, Mar 21, 2024 at 5:20 PM Valentyn Tymofieiev wrote:
> Beam throws an error at submission

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Beam throws an error at submission time in Python if you pass a single PCollection to Flatten. The scenario you describe concerns a one-element list.

On Thu, Mar 21, 2024, 13:43 Joey Tran wrote:
> I think it'd be quite surprising if beam.Flatten would become equivalent
> to FlatMap if passed

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
I think it'd be quite surprising if beam.Flatten became equivalent to FlatMap when passed only a single pcollection. One use case that this would break is flattening a variable number of pcollections, which may include only one pcollection. In that case,

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
One possible alternative is to define beam.Flatten for a single collection to be functionally equivalent to beam.FlatMap(lambda x: x), but that would be a larger change and such behavior might need to be consistent across SDKs and documented. Adding a default value is a simpler change. I can also

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Robert Bradshaw via dev
IIRC, Java has Flatten.iterables() and Flatten.collections(), the first of which does what you want. Giving FlatMap a default arg of lambda x: x is an interesting idea. The only downside I see is a less clear error if one forgets to provide this (currently mandatory) parameter, but maybe that's low

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
That's not really the same thing, is it? `beam.Flatten` combines two or more pcollections into a single pcollection, while `beam.FlatMap` unpacks iterables of elements (i.e. PCollection<Iterable<T>> -> PCollection<T>).

On Thu, Mar 21, 2024 at 2:57 PM Valentyn Tymofieiev via dev <dev@beam.apache.org> wrote:
> Hi,
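The distinction can be modeled in plain Python, with lists standing in for pcollections. The `flatten` and `flat_map` functions below are hypothetical models of the two transforms' semantics as described in this message, not Beam's API or implementation.

```python
from itertools import chain
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def flatten(collections: Iterable[Iterable[T]]) -> List[T]:
    """Model of Flatten: merge several collections into one."""
    return list(chain.from_iterable(collections))


def flat_map(collection: Iterable[T], fn: Callable[[T], Iterable[U]]) -> List[U]:
    """Model of FlatMap: apply fn to each element, unpack what it returns."""
    return [out for element in collection for out in fn(element)]


# Flatten operates on a group of collections.
merged = flatten([["a", "b"], ["c"]])

# FlatMap operates on the elements of one collection.
unpacked = flat_map([[1, 2], [3], []], lambda x: x)
```

In this model `merged` is `['a', 'b', 'c']` and `unpacked` is `[1, 2, 3]`: the first call works across collections, the second across the elements inside a single collection.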

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Actually, disregard that; Flatten is used in a different context, to flatten multiple collections.

On Thu, Mar 21, 2024 at 11:55 AM Valentyn Tymofieiev wrote:
> Hi, you can use beam.Flatten() instead.
>
> On Thu, Mar 21, 2024 at 10:55 AM Joey Tran wrote:
>> Hey all,
>>
>> Using an identity

Re: Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Valentyn Tymofieiev via dev
Hi, you can use beam.Flatten() instead.

On Thu, Mar 21, 2024 at 10:55 AM Joey Tran wrote:
> Hey all,
>
> Using an identity function for FlatMap comes up more often than using
> FlatMap without an identity function. Would it make sense to use the
> identity function as a default?

Python API: FlatMap default -> lambda x:x?

2024-03-21 Thread Joey Tran
Hey all, Using an identity function for FlatMap comes up more often than using FlatMap without an identity function. Would it make sense to use the identity function as a default?
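A rough plain-Python sketch of what the proposed default would mean, with a list standing in for a pcollection. `flat_map_default` is a hypothetical name chosen for illustration and is not Beam's FlatMap.

```python
def flat_map_default(collection, fn=lambda x: x):
    """Apply fn to each element and unpack the iterable it returns.

    With the identity default, each element is treated as an iterable
    and unpacked directly.
    """
    return [out for element in collection for out in fn(element)]


# No function given: nested elements are unpacked as they are.
unpacked = flat_map_default([[1, 2], [3]])

# An explicit function still works as before.
words = flat_map_default(["a b", "c"], str.split)
```

In this model `unpacked` is `[1, 2, 3]` and `words` is `['a', 'b', 'c']`; the default only removes the need to spell out `lambda x: x` in the first call.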