Hi Joey,
We just deployed the 2.54 changes on playground. Could you check now?
Thanks,
Vlado
From: Joey Tran
Sent: Friday, March 8, 2024 2:45:32 PM
To: Andrey Devyatkin
Subject: [EXTERNAL] Re: Hiding logging for beam playground examples
Ah thanks Andrey! I had
Subject: [EXTERNAL] Re: SDF to read from a Spark custom receiver
Though the Spark custom receiver protocol doesn't look rich enough to
support SDFs, it does look like the Hubspot APIs do support pagination
which could be used to build an SDF (using the page as the offsets)
directly. (That being said
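As an illustrative sketch of the idea (plain Python modeling the restriction-claiming pattern, not Beam's actual SplittableDoFn API; `fetch_page` is a hypothetical stand-in for a paginated HubSpot-style API call), page numbers can serve as the offsets of a claimable restriction:

```python
# Sketch: treating page numbers as SDF-style restriction offsets.
# fetch_page is a hypothetical stand-in for a paginated API call.

def fetch_page(page):
    # Fake paginated API: three pages of two records each.
    data = {0: ["a", "b"], 1: ["c", "d"], 2: ["e", "f"]}
    return data.get(page)

class PageRestriction:
    """A claimable range of page offsets [start, stop)."""
    def __init__(self, start, stop):
        self.start, self.stop = start, stop

    def try_claim(self, page):
        # A runner could split the restriction by shrinking `stop`;
        # claiming past it fails and the reader checkpoints.
        if page < self.stop:
            self.start = page + 1
            return True
        return False

def read_pages(restriction):
    """Emit records page by page, claiming each page offset first."""
    page = restriction.start
    while restriction.try_claim(page):
        records = fetch_page(page)
        if records is None:
            return  # past the last page
        yield from records
        page += 1

print(list(read_pages(PageRestriction(0, 10))))
```

Because each page offset is claimed before it is read, a runner-initiated split of the `[start, stop)` range leaves the unread pages to be resumed later, which is exactly what makes pagination a workable restriction.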
Hello,
Is it somehow related to this work [1]?
No, this work adds the ability to return values from a SQL insert query. It
contains no improvements for working with rows and schemas.
Not sure that I got it. Could you elaborate a bit on this?
When we use "Write" with a table and without
To: dev
Cc: Reuven Lax; pabl...@google.com; Ilya Kozyrev
Subject: [EXTERNAL] Re:
> And also the ticket and "// TODO: BEAM-10396 use writeRows() when it's
> available" appeared later than this functionality was added to "JdbcIO.Write".
Note that this TODO has
Hi Boyuan,
Thanks for replying. We are using beam 2.25.0 and direct runner for testing. We
are trying to develop an unbounded streaming service connector with splittable
DoFn. In our connector.read(), we want to commit the message back to the stream
after outputting the record to the downstream user
On Wed, Sep 16, 2020 at 8:48 AM Tyson Hamilton wrote:
> The use case is to support an unbounded stream-stream join, where the
> elements are arriving in roughly time sorted order. Removing a specific
> element from the timestamp indexed collection is necessary when a match is
> found.
>
Just
The use case is to support an unbounded stream-stream join, where the
elements are arriving in roughly time sorted order. Removing a specific
element from the timestamp indexed collection is necessary when a match is
found. Having clearRange is helpful to expire elements that are no longer
Hi,
Currently we only support removing a timestamp range. You can remove a
single timestamp of course by removing [ts, ts+1), however if there are
multiple elements with the same timestamp this will remove all of those
elements.
Does this fit your use case? If not, I wonder if MapState is closer
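A minimal plain-Python model of the semantics described above (illustrative only, not Beam's Java `OrderedListState` API) shows why removing `[ts, ts+1)` drops every element stored at that timestamp:

```python
# Model of an ordered-list state keyed by timestamp.
# clear_range removes ALL entries with min_ts <= ts < limit_ts,
# so clear_range(ts, ts + 1) removes every element at exactly ts.

class OrderedListState:
    def __init__(self):
        self.entries = []  # list of (timestamp, value), kept sorted

    def add(self, ts, value):
        self.entries.append((ts, value))
        self.entries.sort(key=lambda e: e[0])

    def clear_range(self, min_ts, limit_ts):
        self.entries = [e for e in self.entries
                        if not (min_ts <= e[0] < limit_ts)]

state = OrderedListState()
state.add(5, "x")
state.add(5, "y")   # a second element at the same timestamp
state.add(6, "z")
state.clear_range(5, 6)   # removes both "x" and "y"
print(state.entries)      # only the timestamp-6 entry survives
```

This is the crux of the use-case mismatch: range removal cannot distinguish two matched and unmatched elements that happen to share a timestamp.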
Hi Reuven,
I noticed that there was an implementation of the in-memory OrderedListState
introduced [1]. Where can I find out more regarding the plan and design? Is
there a design doc? I'd like to know more details about the implementation to
see if it fits my use case. I was hoping it would
I have a very naive question. I know Jan suggested to use 2 successive fixed
overlapping windows with offset as a temporary solution to dedup the events.
However, I am wondering whether using a single fixed window of length let's say
1 day followed by a deduplicate function is a good
> If the user chooses to create a window of 10 years, I'd say it is
> expected behavior that the state will be kept for as long as this duration.
State will be kept, the problem is that each key in the window will
carry a cleanup timer, although there might be nothing to clear (there
is no
If the user chooses to create a window of 10 years, I'd say it is
expected behavior that the state will be kept for as long as this duration.
GlobalWindows are different because they represent the default case
where the user does not even use windowing. I think it warrants to be
treated
Window triggering is afaik operation that is specific to GBK. Stateful
DoFns can have (as shown in the case of deduplication) timers set for
the GC only, triggering has no effect there. And yes, if we have other
timers than GC (any user timers), then we have to have GC timer (because
timers
The inefficiency described happens if and only if the following two conditions
are met:
a) there are many timers per single window (as otherwise they will be
negligible)
b) there are many keys which actually contain no state (as otherwise the timer would be negligible wrt the state size)
On 8/25/20 9:27 PM, Maximilian Michels wrote:
I agree that this probably solves the described issue in the most
straightforward way, but special handling for global window feels
weird, as there is really nothing special about global window wrt
state cleanup.
Why is special handling for the
I agree that this probably solves the described issue in the most straightforward way, but special handling for global window feels weird, as there is really nothing special about global window wrt state cleanup.
Why is special handling for the global window weird? After all, it is a
special
I'd suggest a modified option (2) which does not use a timer to perform
the cleanup (as mentioned, this will cause problems with migrating state).
Instead, whenever we receive a watermark which closes the global window,
we enumerate all keys and cleanup the associated state.
This is the
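The modified option (2) can be sketched in plain Python (an illustrative model of the proposal, not actual runner code; the class and names are hypothetical):

```python
# Sketch: when the watermark that closes the global window arrives
# (modeled here as MAX_WATERMARK), enumerate all keys and clear their
# state, instead of registering a per-key cleanup timer.

MAX_WATERMARK = float("inf")

class StateBackend:
    def __init__(self):
        self.state = {}  # key -> per-key state

    def on_watermark(self, watermark):
        if watermark >= MAX_WATERMARK:
            # Global window closed: enumerate keys, clean everything up.
            for key in list(self.state):
                del self.state[key]

backend = StateBackend()
backend.state = {"k1": [1, 2], "k2": [3]}
backend.on_watermark(100)            # window still open, state kept
backend.on_watermark(MAX_WATERMARK)  # closes the global window
print(backend.state)
```

The design point is that no timers are stored per key, so there is nothing timer-related to migrate in savepoints, at the cost of a full key enumeration at shutdown.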
Awesome! Thanks a lot for the memory profile. Couple remarks:
a) I can see that there are about 378k keys and each of them sets a timer.
b) Based on the settings for DeduplicatePerKey you posted, you will keep
track of all keys of the last 30 minutes.
Unless you have much fewer keys, the
Hey folks,
Sorry I'm late to this thread, but this might be very helpful for the problem
we're dealing with. Do we have a design doc or a jira ticket I can follow?
Cheers,
Catlyn
On Thu, Jun 18, 2020 at 1:11 PM Jan Lukavský wrote:
> My questions were just an example. I fully agree there is a
Thanks for all the ideas! We looked around and found out that
@RequiresTimeSortedInput
is not supported on portable Flink runner for streaming pipelines
Hi,
I'm afraid @RequiresTimeSortedInput currently does not fit this
requirement, because it works on event _timestamps_ only. Assigning
Kafka offsets as event timestamps is probably not a good idea. In the
original proposal [1] there is mention that it would be good to
implement sorting by
I would not recommend using RequiresTimeSortedInput in this way. I also
would not use ingestion time, as in a distributed environment, time skew
between workers might mess up the order.
I will ping the discussion on the sorted state API and add you. My hope is
that this can be implemented
Thank y’all for the input!
About the RequiresTimeSortedInput, we were thinking of the following 2
potential approaches:
1.
Assign the Kafka offset as the timestamp while doing a GroupByKey on
partition_id in a GlobalWindow
2.
Rely on the fact that Flink consumes from Kafka
*Cc:* Stefan Djelekar <stefan.djele...@redbox.com>
*Subject:* [EXTERNAL] Re: Java Build broken
Hi All,
Was this issue resolved? I started to get the sa
>>>> It looks like on localhost build references
>>>> https://oss.sonatype.org/content/repositories/staging/com/google/errorprone/error_prone_check_api/2.3.4/
>>>>
>>>> instead of
>>>>
>>>> http
fact/com.google.errorprone/error_prone_check_api/2.3.4
> and the first link returns 404
>
> Can you please advise?
>
> All the best,
> Stefan
>
> *From:* Pulasthi Supun Wickramasinghe
> *Sent:* Tuesday, February 18, 2020 5:11 PM
> *To:* dev
> *Cc:* Stefan Djelekar
> *Subject:* [EXTERNAL] Re: Java Build broken
Subject: [EXTERNAL] Re: Java Build broken
Hi All,
Was this issue resolved? I started to get the same error on my local build
suddenly.
Best Regards,
Pulasthi
On Thu, Jan 23, 2020 at 10:17 AM Maximilian Michels
<m...@apache.org> wrote:
Do you have any overrides in your ~/.m2/settin
> Getting a review would be nice.
>
> All the best,
> Stefan
>
> *From:* Chamikara Jayalath
> *Sent:* Wednesday, November 6, 2019 8:05 PM
> *To:* dev
> *Subject:* Re: [EXTERNAL] Re: FirestoreIO connector [JavaSDK]
>
> BTW
Subject: Re: [EXTERNAL] Re: FirestoreIO connector [JavaSDK]
BTW, FYI, I'm also talking with folks from Google Firestore team regarding
this. I think they had shown some interest in taking this up but I'm not sure.
If they are able to contribute here, and if we can coordinate with some of
ll make the PR in the next few days and let’s pick it up from there.
>
> Kind regards,
> Stefan
>
> *From:* Chamikara Jayalath
> *Sent:* Tuesday, November 5, 2019 2:24 AM
> *To:* dev
> *Subject:* [EXTERNAL] Re: FirestoreIO connector [
To: dev
Subject: [EXTERNAL] Re: FirestoreIO connector [JavaSDK]
Thanks for the contribution. Happy to help with the review.
Also, probably it'll be good to follow patterns employed by the existing
Datastore connector when applicable:
https://github.com/apache/beam/blob/master/sdks/java/io