[
https://issues.apache.org/jira/browse/BEAM-7896?focusedWorklogId=291586&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291586
]
ASF GitHub Bot logged work on BEAM-7896:
----------------------------------------
Author: ASF GitHub Bot
Created on: 08/Aug/19 21:29
Start Date: 08/Aug/19 21:29
Worklog Time Spent: 10m
Work Description: akedin commented on pull request #9298: [BEAM-7896]
Implementing RateEstimation for KafkaTable
URL: https://github.com/apache/beam/pull/9298#discussion_r312251383
##########
File path:
sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/kafka/BeamKafkaTable.java
##########
@@ -84,7 +98,18 @@ public BeamKafkaTable updateConsumerProperties(Map<String,
Object> configUpdates
@Override
public PCollection<Row> buildIOReader(PBegin begin) {
- KafkaIO.Read<byte[], byte[]> kafkaRead = null;
+ return begin
+ .apply("read", createKafkaRead().withoutMetadata())
+ .apply("in_format", getPTransformForInput())
+ .setRowSchema(getSchema());
+ }
+
+ public static void setNumberOfRecordsForRate(int numberOfRecordsForRate) {
Review comment:
The problem is we don't know the concrete user scenarios. For one, this
table provider can also be used in JDBC path where users might not have access
to the class to mutate it. The specific approach to make this configurable can
depend on the use case (e.g. what if the user want to use a different number
for different topics or retrieve it from metadata somewhere?) Until we know
that I'd rather just make a non-static default, and only override what we need
for testing. This way we're not exposing a new API contract and can change the
behavior if needed. If we expose an API, users will start using it, and we will
have to support it. Without understanding potential use cases I'd rather not
commit to that.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 291586)
Time Spent: 2h 20m (was: 2h 10m)
> Rate estimation for Kafka Table
> -------------------------------
>
> Key: BEAM-7896
> URL: https://issues.apache.org/jira/browse/BEAM-7896
> Project: Beam
> Issue Type: New Feature
> Components: dsl-sql
> Reporter: Alireza Samadianzakaria
> Assignee: Alireza Samadianzakaria
> Priority: Major
> Time Spent: 2h 20m
> Remaining Estimate: 0h
>
> Currently, KafkaTable returns UNKNOWN statistics for its rate.
> We can use previously arrived tuples to estimate the rate and return correct
> statistics (See
> [https://docs.google.com/document/d/1vi1PBBu5IqSy-qZl1Gk-49CcANOpbNs1UAud6LnOaiY|https://docs.google.com/document/d/1vi1PBBu5IqSy-qZl1Gk-49CcANOpbNs1UAud6LnOaiY/])
>
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)