Re: Any recommendation for key for GroupIntoBatches

2024-04-28 Thread Wiśniowski Piotr
Hi, I might be late to the discussion, but here is another option (I think it was not mentioned, or I missed it). Take a look at [this](https://beam.apache.org/releases/pydoc/current/apache_beam.transforms.util.html#apache_beam.transforms.util.BatchElements) as I think this is precisely

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Robert Bradshaw via user
On Fri, Apr 12, 2024 at 1:39 PM Ruben Vargas wrote:
> On Fri, Apr 12, 2024 at 2:17 PM Jaehyeon Kim wrote:
> >
> > Here is an example from a book that I'm reading now and it may be applicable.
> >
> > JAVA - (id.hashCode() & Integer.MAX_VALUE) % 100
> > PYTHON - ord(id[0]) % 100
> or

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Reuven Lax via user
There are various strategies. Here is an example of how Beam does it (taken from Reshuffle.viaRandomKey().withNumBuckets(N)). Note that this does some extra hashing to work around issues with the Spark runner. If you don't care about that, you could implement something simpler (e.g. initialize
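A simpler version of that idea, sketched in plain Python (my own illustration of the random-key technique, not the Beam internals): assign each element a random bucket number as its key, which spreads work evenly across N keys regardless of the element contents. The bucket count is a hypothetical tuning knob:

```python
import random

NUM_BUCKETS = 100  # hypothetical; tune to your desired parallelism

def assign_random_key(element, num_buckets=NUM_BUCKETS):
    """Pair an element with a random bucket key, as a beam.Map would
    do before GroupIntoBatches. Randomness spreads load on average."""
    return (random.randrange(num_buckets), element)

# Simulate keying a batch of elements.
keyed = [assign_random_key(i) for i in range(1000)]
```

In a pipeline this would run as a `beam.Map` ahead of `GroupIntoBatches`; `Reshuffle.viaRandomKey().withNumBuckets(N)` additionally rehashes the key, which per Reuven's note is only needed to work around the Spark runner issue.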

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Damon Douglas
Good day, Ruben,

Would you be able to compute a shasum on the group of IDs to use as the key?

Best, Damon

On 2024/04/12 19:22:45 Ruben Vargas wrote:
> Hello guys
>
> Maybe this question was already answered, but I cannot find it and
> want some more input on this topic.
>
> I have some
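A sketch of that suggestion (function names and the shard count are my own assumptions): hash the ID with SHA-256 and reduce the digest modulo a shard count. Unlike Python's built-in `hash()`, which is salted per process, a digest is stable across workers and runs, so it is safe to use as a distributed key:

```python
import hashlib

NUM_SHARDS = 100  # hypothetical shard count

def sha_key(message_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministic, well-distributed key derived from a SHA-256
    digest of the ID (or of a concatenation of a group of IDs)."""
    digest = hashlib.sha256(message_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards
```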

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Ruben Vargas
Yeah, unfortunately the data on the endpoint could change at any point in time and I need to make sure to have the latest one :/ That limits my options here. But I also have other sources that can benefit from this caching :) Thank you very much!

On Mon, Apr 15, 2024 at 9:37 AM XQ Hu wrote: >

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread XQ Hu via user
I am not sure you still need to do batching, since the Web API can handle caching. If you really need it, I think GroupIntoBatches is a good way to go.

On Mon, Apr 15, 2024 at 11:30 AM Ruben Vargas wrote:
> Is there a way to do batching in that transformation? I'm assuming for
> now no. or may be

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Ruben Vargas
Is there a way to do batching in that transformation? I'm assuming for now no, or maybe by using it in conjunction with GroupIntoBatches.

On Mon, Apr 15, 2024 at 9:29 AM Ruben Vargas wrote:
>
> Interesting
>
> I think the cache feature could be interesting for some use cases I have.
>
> On Mon, Apr 15,

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Ruben Vargas
Interesting. I think the cache feature could be interesting for some use cases I have.

On Mon, Apr 15, 2024 at 9:18 AM XQ Hu wrote:
>
> For the new web API IO, the page lists these features:
>
> developers provide minimal code that invokes Web API endpoint
> delegate to the transform to handle

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread XQ Hu via user
For the new web API IO, the page lists these features:
- developers provide minimal code that invokes the Web API endpoint
- delegate to the transform to handle request retries and exponential backoff
- optional caching of request and response associations
- optional metrics

On Mon,

Re: Any recommendation for key for GroupIntoBatches

2024-04-15 Thread Ruben Vargas
That one looks interesting. What is not clear to me is: what are the advantages of using it? Is it only the error/retry handling, or is there anything in terms of performance? My PCollection is unbounded, but I was thinking of sending my messages in batches to the external API in order to gain some performance

Re: Any recommendation for key for GroupIntoBatches

2024-04-14 Thread XQ Hu via user
To enrich your data, have you checked https://cloud.google.com/dataflow/docs/guides/enrichment? This transform is built on top of https://beam.apache.org/documentation/io/built-in/webapis/

On Fri, Apr 12, 2024 at 4:38 PM Ruben Vargas wrote:
> On Fri, Apr 12, 2024 at 2:17 PM Jaehyeon Kim

Re: Any recommendation for key for GroupIntoBatches

2024-04-12 Thread Ruben Vargas
On Fri, Apr 12, 2024 at 2:17 PM Jaehyeon Kim wrote:
>
> Here is an example from a book that I'm reading now and it may be applicable.
>
> JAVA - (id.hashCode() & Integer.MAX_VALUE) % 100
> PYTHON - ord(id[0]) % 100

Maybe this is what I'm looking for. I'll give it a try. Thanks!

On Sat, 13

Re: Any recommendation for key for GroupIntoBatches

2024-04-12 Thread Jaehyeon Kim
Here is an example from a book that I'm reading now and it may be applicable.

JAVA - (id.hashCode() & Integer.MAX_VALUE) % 100
PYTHON - ord(id[0]) % 100

On Sat, 13 Apr 2024 at 06:12, George Dekermenjian wrote:
> How about just keeping track of a buffer and flush the buffer after 100
> messages
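The Python one-liner, expanded into runnable form with a caveat I'm adding (not from the book): `ord(id[0]) % 100` only looks at the first character, so IDs drawn from a narrow alphabet (e.g. hex UUIDs) yield at most as many distinct keys as there are distinct first characters. A digest over the whole ID spreads keys more evenly:

```python
import hashlib

# The book's Python one-liner: key from the first character only.
def first_char_key(message_id: str) -> int:
    return ord(message_id[0]) % 100

# Caveat (my note): hex IDs have only 16 possible first characters,
# so first_char_key yields at most 16 distinct keys out of 100.
# A digest over the whole ID is stable and much better distributed.
def digest_key(message_id: str) -> int:
    return int(hashlib.md5(message_id.encode()).hexdigest(), 16) % 100
```

The Java variant `(id.hashCode() & Integer.MAX_VALUE) % 100` already mixes the whole string, so it does not have this skew; the `& Integer.MAX_VALUE` just clears the sign bit before the modulo.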

Re: Any recommendation for key for GroupIntoBatches

2024-04-12 Thread George Dekermenjian
How about just keeping track of a buffer and flushing it after 100 messages, also flushing whatever remains in the buffer in finish_bundle?

On Fri, Apr 12, 2024 at 21:23 Ruben Vargas wrote:
> Hello guys
>
> Maybe this question was already answered, but I cannot find it and
> want some more input on
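That buffering pattern, sketched as a plain Python class (my own naming; in Beam this logic would live in a DoFn's `process` and `finish_bundle` methods):

```python
class BufferedSender:
    """Buffers items and flushes every `batch_size` items; close() flushes
    any remainder, playing the role of a DoFn's finish_bundle."""

    def __init__(self, send, batch_size=100):
        self.send = send          # callable taking a list, e.g. an API call
        self.batch_size = batch_size
        self.buffer = []

    def process(self, item):
        self.buffer.append(item)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.send(list(self.buffer))
            self.buffer.clear()

    def close(self):
        # finish_bundle analogue: flush the partial batch.
        self.flush()
```

One caveat relevant to the wider thread: streaming runners may use small bundles, so per-bundle buffering often flushes well before reaching 100 items; stateful `GroupIntoBatches` batches across bundles instead.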

Any recommendation for key for GroupIntoBatches

2024-04-12 Thread Ruben Vargas
Hello guys

Maybe this question was already answered, but I cannot find it and want some more input on this topic.

I have some messages that don't have any particular key candidate, except the ID, but I don't want to use it because the idea is to group multiple IDs in the same batch. This is