@Kyle Weaver sure thing! So the input/output definition for Flatten.Iterables (https://beam.apache.org/releases/javadoc/2.25.0/org/apache/beam/sdk/transforms/Flatten.Iterables.html) is:

Input: PCollection<Iterable<T>>
Output: PCollection<T>

The input/output for an explode transform would look like this:

Input: PCollection<Row>. The row schema has a field which is an array of T.
Output: PCollection<Row>. The array type field from the input schema is replaced with a new field of type T. The elements from the array type field are flattened into multiple rows in the new table (the other fields of the input table are just duplicated).

Hope this clarification helps!

From: Kyle Weaver <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, January 12, 2021 at 4:58 PM
To: "[email protected]" <[email protected]>
Cc: Reuven Lax <[email protected]>
Subject: Re: Is there an array explode function/transform?

> @Reuven Lax yes I am aware of that transform, but that’s different from the explode operation I was referring to: https://spark.apache.org/docs/latest/api/sql/index.html#explode

How is it different? It'd help if you could provide the signature (input and output PCollection types) of the transform you have in mind.

On Tue, Jan 12, 2021 at 4:49 PM Tao Li <[email protected]> wrote:

@Reuven Lax yes I am aware of that transform, but that’s different from the explode operation I was referring to: https://spark.apache.org/docs/latest/api/sql/index.html#explode

From: Reuven Lax <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Tuesday, January 12, 2021 at 2:04 PM
To: user <[email protected]>
Subject: Re: Is there an array explode function/transform?

Have you tried Flatten.iterables

On Tue, Jan 12, 2021, 2:02 PM Tao Li <[email protected]> wrote:

Hi community,

Is there a beam function to explode an array (similarly to spark sql’s explode())? I did some research but did not find anything.

BTW I think we can potentially use FlatMap to implement the explode functionality, but a Beam provided function would be very handy.

Thanks a lot!
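
For reference, here is a minimal sketch of the explode semantics described in this thread, written in plain Python so it runs without Beam installed. It is not a Beam-provided transform. The function name `explode`, the dict-based row representation, and the field names `tags` and `tag` are all assumptions made for illustration. Dropping rows whose array is empty or missing is also an assumption, chosen to mirror Spark SQL's explode().

```python
from typing import Any, Dict, Iterator

# Assumption: a row is modeled as a plain dict, not a schema-aware Beam Row.
Row = Dict[str, Any]


def explode(row: Row, array_field: str, new_field: str) -> Iterator[Row]:
    """Yield one output row per element of row[array_field].

    The array field is replaced by new_field holding a single element;
    every other field is duplicated unchanged into each output row.
    Rows whose array is empty or None produce no output (an assumption).
    """
    others = {k: v for k, v in row.items() if k != array_field}
    for element in row.get(array_field) or []:
        yield {**others, new_field: element}


# Hypothetical sample data.
rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
]

exploded = [out for row in rows for out in explode(row, "tags", "tag")]
print(exploded)
```

Because `explode` returns an iterable of rows for each input row, it has the shape that the FlatMap idea from the original question needs. In the Beam Python SDK it could presumably be applied as `beam.FlatMap(explode, "tags", "tag")` over a PCollection of dicts, though that wiring is not tested here.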
