I do suspect that over time we'll find more and more cases we can't
express, and will be asked to extend this little templating in more
directions. To head that off, could we simply reuse an existing
language (SQL, Lua, or something similar) instead of creating something
new?
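
For concreteness, here is a minimal sketch of what the string-substitution
templating quoted below amounts to, using only Python's standard library.
The function name resolve_destination and the plain-dict form of dest_info
are assumptions made for illustration; this is not Beam's API or the
proposal's actual implementation.

```python
from string import Formatter


def resolve_destination(pattern, dest_info):
    """Fill a pattern such as 'path/{dest_info.user}' from a dict (hypothetical helper).

    Only plain dotted lookups under 'dest_info' are accepted; format specs
    and conversions are rejected so the pattern stays a pure substitution.
    """
    parts = []
    for literal, field, spec, conversion in Formatter().parse(pattern):
        parts.append(literal)
        if field is None:
            continue  # literal text with no placeholder after it
        if spec or conversion:
            raise ValueError(f"unsupported placeholder: {field!r}")
        root, *path = field.split(".")
        if root != "dest_info":
            raise ValueError(f"unknown placeholder root: {root!r}")
        value = dest_info
        for key in path:
            value = value[key]  # walk nested dicts one key at a time
        parts.append(str(value))
    return "".join(parts)
```

Because the pattern is just a string, the long common prefix is stored once
in the transform's configuration and only the small dest_info value has to
travel with each record. Anything beyond a field lookup would need either a
preceding map step or a richer expression language.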

On Tue, Apr 2, 2024 at 8:55 AM Kenneth Knowles <k...@apache.org> wrote:

> I really like this proposal. I think it has narrowed down and solved the
> essential problem of not shuffling excess redundant data, and also provides
> the vast majority of the functionality that a lambda would, with
> significantly better debuggability and usability too, since the dynamic
> destination pattern string can be in display data, etc.
>
> Kenn
>
> On Wed, Mar 27, 2024 at 1:58 PM Robert Bradshaw via dev <
> dev@beam.apache.org> wrote:
>
>> On Wed, Mar 27, 2024 at 10:20 AM Reuven Lax <re...@google.com> wrote:
>>
>>> Can the prefix still be generated programmatically at graph creation
>>> time?
>>>
>>
>> Yes. It's just a property of the transform passed by the user at
>> configuration time.
>>
>>
>>> On Wed, Mar 27, 2024 at 9:40 AM Robert Bradshaw <rober...@google.com>
>>> wrote:
>>>
>>>> On Wed, Mar 27, 2024 at 9:12 AM Reuven Lax <re...@google.com> wrote:
>>>>
>>>>> This does seem like the best compromise, though I think there will
>>>>> still end up being performance issues. A common pattern I've seen is that
>>>>> there is a long common prefix to the dynamic destination followed by the
>>>>> dynamic component, e.g. the destination might be
>>>>> long/common/path/to/destination/files/<per-user-file>. In this case, the
>>>>> prefix is often much larger than the messages themselves and is what gets
>>>>> effectively encoded in the lambda.
>>>>>
>>>>
>>>> The idea here is that the destination would be given as a format
>>>> string, say, "long/common/path/to/destination/files/{dest_info.user}".
>>>> Another way to put this is that we support (only) "lambdas" that are
>>>> represented as string substitutions. (The fact that dest_info does not have
>>>> to be part of the record, and can be the output of an arbitrary map if need
>>>> be, makes this restriction not so bad.)
>>>>
>>>> As well as solving the performance issues, I think this is actually a
>>>> pretty convenient and natural way for the user to name their destination
>>>> (for the common use case, even easier than providing a lambda), and it has
>>>> the benefit of being much more transparent than an arbitrary callable for
>>>> introspection (by both machines and humans that may look at the
>>>> resulting pipeline).
>>>>
>>>>
>>>>> I'm not entirely sure how to address this in a portable context. We
>>>>> might simply have to accept the extra overhead when going cross language.
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Wed, Mar 27, 2024 at 8:51 AM Robert Bradshaw via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>> Thanks for putting this together, it will be a really useful feature
>>>>>> to have.
>>>>>>
>>>>>> I am in favor of the string-pattern approaches. I think we need to
>>>>>> support both the {record=..., dest_info=...} and the elide-fields
>>>>>> approaches, as the former is nicer when one has a fixed representation
>>>>>> for the output record (e.g. a proto or avro schema) and the flattened
>>>>>> form for ease of use in more free-form contexts (e.g. when producing
>>>>>> records from YAML and SQL).
>>>>>>
>>>>>> Also left some comments on the doc.
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 27, 2024 at 6:51 AM Ahmed Abualsaud via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> Hey all,
>>>>>>>
>>>>>>> There have been some conversations lately about how best to enable
>>>>>>> dynamic destinations in a portable context. Usually, this comes up for
>>>>>>> cross-language transforms and more recently for Beam YAML.
>>>>>>>
>>>>>>> I've started a short doc outlining some routes we could take. The
>>>>>>> purpose is to establish a good standard for supporting dynamic
>>>>>>> destinations with portability, one that can be applied to most use
>>>>>>> cases and IOs. Please take a look and add any thoughts!
>>>>>>>
>>>>>>> https://s.apache.org/portable-dynamic-destinations
>>>>>>>
>>>>>>> Best,
>>>>>>> Ahmed
>>>>>>>
>>>>>>
