[
https://issues.apache.org/jira/browse/BEAM-13175?focusedWorklogId=733031&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-733031
]
ASF GitHub Bot logged work on BEAM-13175:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 25/Feb/22 12:27
Start Date: 25/Feb/22 12:27
Worklog Time Spent: 10m
Work Description: mosche commented on pull request #16077:
URL: https://github.com/apache/beam/pull/16077#issuecomment-1050811277
@aromanenko-dev This should be in good shape as an initial version. In my
tests I was able to outperform the KPL-based writer of SDK v1.
Using Spark, I ran into some interesting issues at high scale. These were
caused by having classes from multiple Netty 4 versions on the classpath (see
[here](https://github.com/aws/aws-sdk-java-v2/issues/1803)). It took me quite a
while to figure this out, as it only happened at very high throughput.
I'm now enforcing Beam's version of Netty over the more recent one used by the
AWS SDK. In my tests this worked very well.
Another change is the output type of the writer. As recommended
[here](https://docs.google.com/document/d/1V2FkGGunVgvLwi1dKHr-7mtDuwYjTuvuESV_oPzVnfQ/edit?resourcekey=0-KvfQq-5iCcMlu3f3MFJ-GQ#heading=h.xl97lw3dyot7)
I changed it from `Void` to `Result` to allow for future backwards-compatible
changes. Currently the writer is fail-fast (after retries, of course), but
adding a dead-letter output would make sense...
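A minimal sketch of why a dedicated result type leaves room to evolve, in plain
Java without any Beam dependency. All names here (`WriteResultSketch`, `Result`,
`failedRecords`, `write`) are hypothetical and are not the API of the pull
request; a `null` record merely stands in for a write that failed after retries.

```java
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; not Beam's or the pull request's API. */
public class WriteResultSketch {

  /** Result type returned instead of Void; it can gain accessors later. */
  public static final class Result {
    private final List<String> failedRecords;

    Result(List<String> failedRecords) {
      this.failedRecords = failedRecords;
    }

    // An accessor like this could be added in a later release (for example to
    // expose a dead-letter output) without changing the signature of write().
    public List<String> failedRecords() {
      return failedRecords;
    }
  }

  /** Fail-fast writer: throws on the first failure, so no failures are returned. */
  public static Result write(List<String> records) {
    for (String record : records) {
      if (record == null) { // stand-in for "write failed after retries"
        throw new IllegalStateException("write failed after retries");
      }
    }
    return new Result(Collections.emptyList());
  }
}
```

Callers that ignore the returned value keep compiling when new accessors are
added, which is not possible once a method is declared to return `Void`.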
The next step is to provide a more powerful internal partitioner that is
aware of the hash key ranges assigned to Kinesis shards.
With that, record aggregation would be as powerful as the one provided by KPL.
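To illustrate the idea, here is a minimal Java sketch of shard-aware
partitioning. It relies on Kinesis mapping a partition key to a 128-bit integer
via MD5 and on each shard owning a contiguous hash key range. The class and
method names (`ShardAwarePartitioner`, `addShard`, `shardFor`) are hypothetical,
UTF-8 encoding of the key is an assumption, and this is not the partitioner
implemented in the pull request.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper; not part of Beam or the AWS SDK. */
public class ShardAwarePartitioner {
  // Maps the inclusive starting hash key of each shard to its shard id.
  private final TreeMap<BigInteger, String> shardsByStartingHashKey = new TreeMap<>();

  /** Registers a shard by the inclusive start of its hash key range. */
  public ShardAwarePartitioner addShard(String shardId, BigInteger startingHashKey) {
    shardsByStartingHashKey.put(startingHashKey, shardId);
    return this;
  }

  /** MD5 of the partition key, interpreted as an unsigned 128-bit integer. */
  public static BigInteger hashKey(String partitionKey) {
    try {
      MessageDigest md5 = MessageDigest.getInstance("MD5");
      byte[] digest = md5.digest(partitionKey.getBytes(StandardCharsets.UTF_8));
      return new BigInteger(1, digest); // signum 1 keeps the value non-negative
    } catch (NoSuchAlgorithmException e) {
      throw new IllegalStateException("MD5 is not available", e);
    }
  }

  /** Finds the shard whose range starts at or below the given hash key. */
  public String shardForHashKey(BigInteger hashKey) {
    Map.Entry<BigInteger, String> entry = shardsByStartingHashKey.floorEntry(hashKey);
    if (entry == null) {
      throw new IllegalStateException("No shard covers hash key " + hashKey);
    }
    return entry.getValue();
  }

  /** Resolves the target shard for a partition key. */
  public String shardFor(String partitionKey) {
    return shardForHashKey(hashKey(partitionKey));
  }
}
```

Once records are grouped by the shard returned from `shardFor`, records bound
for the same shard can be aggregated into one request entry, which is the kind
of aggregation KPL performs.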
cc @echauchot I'd be more than happy to get another review if you have the
capacity for it ...
Issue Time Tracking
-------------------
Worklog Id: (was: 733031)
Time Spent: 7h 50m (was: 7h 40m)
> Implement missing KinesisIO.Write for aws2
> ------------------------------------------
>
> Key: BEAM-13175
> URL: https://issues.apache.org/jira/browse/BEAM-13175
> Project: Beam
> Issue Type: New Feature
> Components: io-java-aws
> Reporter: Moritz Mack
> Assignee: Moritz Mack
> Priority: P2
> Labels: aws-sdk-v2
> Time Spent: 7h 50m
> Remaining Estimate: 0h
>
> The Kinesis Producer Library (KPL) isn't available for aws2. Hence, we cannot
> trivially port the old KinesisIO.Write over.
> But at the same time KPL also doesn't align with the ideas behind SDFs. So
> it's a good opportunity to implement it properly.