I do suspect that over time we'll find more and more cases we can't express, and will be asked to extend this little templating language in more directions. To head that off, could we just reuse an existing language (SQL, Lua, or something similar) instead of creating something new?
On Tue, Apr 2, 2024 at 8:55 AM Kenneth Knowles <k...@apache.org> wrote:

> I really like this proposal. I think it has narrowed down and solved the
> essential problem of not shuffling excess redundant data, and also provides
> the vast majority of the functionality that a lambda would, with
> significantly better debuggability and usability too, since the dynamic
> destination pattern string can be in display data, etc.
>
> Kenn
>
> On Wed, Mar 27, 2024 at 1:58 PM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> On Wed, Mar 27, 2024 at 10:20 AM Reuven Lax <re...@google.com> wrote:
>>
>>> Can the prefix still be generated programmatically at graph creation
>>> time?
>>>
>>
>> Yes. It's just a property of the transform passed by the user at
>> configuration time.
>>
>>
>>> On Wed, Mar 27, 2024 at 9:40 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> On Wed, Mar 27, 2024 at 9:12 AM Reuven Lax <re...@google.com> wrote:
>>>>
>>>>> This does seem like the best compromise, though I think there will
>>>>> still end up being performance issues. A common pattern I've seen is that
>>>>> there is a long common prefix to the dynamic destination followed by the
>>>>> dynamic component. e.g. the destination might be
>>>>> long/common/path/to/destination/files/<per-user-file>. In this case, the
>>>>> prefix is often much larger than messages themselves and is what gets
>>>>> effectively encoded in the lambda.
>>>>>
>>>>
>>>> The idea here is that the destination would be given as a format
>>>> string, say, "long/common/path/to/destination/files/{dest_info.user}".
>>>> Another way to put this is that we support (only) "lambdas" that are
>>>> represented as string substitutions. (The fact that dest_info does not have
>>>> to be part of the record, and can be the output of an arbitrary map if need
>>>> be, makes this restriction not so bad.)
>>>>
>>>> As well as solving the performance issues, I think this is actually a
>>>> pretty convenient and natural way for the user to name their destination
>>>> (for the common use case, even easier than providing a lambda), and has the
>>>> benefit of being much more transparent than an arbitrary callable as well
>>>> for introspection (for both machine and human that may look at the
>>>> resulting pipeline).
>>>>
>>>>
>>>>> I'm not entirely sure how to address this in a portable context. We
>>>>> might simply have to accept the extra overhead when going cross language.
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Wed, Mar 27, 2024 at 8:51 AM Robert Bradshaw via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Thanks for putting this together, it will be a really useful feature
>>>>>> to have.
>>>>>>
>>>>>> I am in favor of the string-pattern approaches. I think we need to
>>>>>> support both the {record=..., dest_info=...} and the elide-fields
>>>>>> approaches, as the former is nicer when one has a fixed representation
>>>>>> for the output record (e.g. a proto or Avro schema) and the flattened
>>>>>> form for ease of use in more free-form contexts (e.g. when producing
>>>>>> records from YAML and SQL).
>>>>>>
>>>>>> Also left some comments on the doc.
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 27, 2024 at 6:51 AM Ahmed Abualsaud via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Hey all,
>>>>>>>
>>>>>>> There have been some conversations lately about how best to enable
>>>>>>> dynamic destinations in a portable context. Usually, this comes up for
>>>>>>> cross-language transforms and more recently for Beam YAML.
>>>>>>>
>>>>>>> I've started a short doc outlining some routes we could take. The
>>>>>>> purpose is to establish a good standard for supporting dynamic
>>>>>>> destinations with portability, one that can be applied to most use
>>>>>>> cases and IOs. Please take a look and add any thoughts!
>>>>>>>
>>>>>>> https://s.apache.org/portable-dynamic-destinations
>>>>>>>
>>>>>>> Best,
>>>>>>> Ahmed
>>>>>>>
>>>>>>
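
For concreteness, the string-substitution idea from the quoted thread (a destination given as a pattern such as "long/common/path/to/destination/files/{dest_info.user}") could be sketched roughly as follows. This is only an illustration under stated assumptions, not Beam's API: the function name `resolve_destination`, the regular expression, and the nested-dict layout of the element (with `record` and `dest_info` keys) are all hypothetical.

```python
import re

# Hypothetical placeholder syntax: {name} or {name.nested.field}.
_FIELD = re.compile(
    r"\{([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\}"
)


def resolve_destination(pattern, element):
    """Substitute each {dotted.path} in pattern with the matching field.

    `element` is assumed to be a nested dict; a missing field raises
    KeyError rather than silently producing a bad destination.
    """
    def lookup(match):
        value = element
        for part in match.group(1).split("."):
            value = value[part]
        return str(value)

    return _FIELD.sub(lookup, pattern)


# Assumed element shape: the payload plus separate destination metadata.
element = {"record": {"msg": "hello"}, "dest_info": {"user": "alice"}}
pattern = "long/common/path/to/destination/files/{dest_info.user}"
destination = resolve_destination(pattern, element)
```

Because the pattern is a plain string, only the short dynamic part (here `dest_info.user`) has to travel with each element, and the pattern itself can be shown as-is for introspection. Anything beyond field substitution (conditionals, arithmetic, formatting) is outside what this sketch can express, which is the limitation raised at the top of this message.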