[ 
https://issues.apache.org/jira/browse/BEAM-7896?focusedWorklogId=291583&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-291583
 ]

ASF GitHub Bot logged work on BEAM-7896:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Aug/19 21:22
            Start Date: 08/Aug/19 21:22
    Worklog Time Spent: 10m 
      Work Description: riazela commented on pull request #9298: [BEAM-7896] Implementing RateEstimation for KafkaTable 
URL: https://github.com/apache/beam/pull/9298#discussion_r312249059
 
 

 ##########
 File path: sdks/java/extensions/sql/src/main/java/org/apache/beam/sdk/extensions/sql/meta/provider/kafka/BeamKafkaTable.java
 ##########
 @@ -84,7 +98,18 @@ public BeamKafkaTable updateConsumerProperties(Map<String, Object> configUpdates
 
   @Override
   public PCollection<Row> buildIOReader(PBegin begin) {
-    KafkaIO.Read<byte[], byte[]> kafkaRead = null;
+    return begin
+        .apply("read", createKafkaRead().withoutMetadata())
+        .apply("in_format", getPTransformForInput())
+        .setRowSchema(getSchema());
+  }
+
+  public static void setNumberOfRecordsForRate(int numberOfRecordsForRate) {
 
 Review comment:
   I included this so that the user can set it before running the query and 
change the estimator's behavior. Since the user does not have access to the 
table while optimization is happening, I made it static. 
   I'm wondering whether the current approach is fine, or whether it would be 
better to rename this variable to defaultNumberOfRecords and add a non-static 
member, NumberOfRecords. (We could still keep the setter for the static one so 
that the user can change the default.)
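
For illustration, the second option (a static default plus a per-table member) could look roughly like the sketch below. The class `KafkaTableSketch`, its method names, and the initial default of 50 are hypothetical and are not taken from the PR; only the idea of a static setter for the default and a non-static per-table value comes from the comment above.

```java
/** Hypothetical stand-in for BeamKafkaTable; only the rate-estimation setting is modeled. */
final class KafkaTableSketch {
  // Process-wide default, adjustable before the query is planned.
  // The initial value 50 is an assumption made for this sketch.
  private static volatile int defaultNumberOfRecordsForRate = 50;

  // Per-table value, captured from the default when the table is created.
  private int numberOfRecordsForRate = defaultNumberOfRecordsForRate;

  /** Changes the default picked up by tables created after this call. */
  static void setDefaultNumberOfRecordsForRate(int numberOfRecords) {
    requirePositive(numberOfRecords);
    defaultNumberOfRecordsForRate = numberOfRecords;
  }

  /** Overrides the value for this table only; the static default is untouched. */
  KafkaTableSketch withNumberOfRecordsForRate(int numberOfRecords) {
    requirePositive(numberOfRecords);
    this.numberOfRecordsForRate = numberOfRecords;
    return this;
  }

  int getNumberOfRecordsForRate() {
    return numberOfRecordsForRate;
  }

  private static void requirePositive(int numberOfRecords) {
    if (numberOfRecords <= 0) {
      throw new IllegalArgumentException("numberOfRecords must be positive");
    }
  }
}
```

With this shape, a user without access to the table during optimization can still call the static setter up front, while code that does hold a table instance can override the value without affecting other tables.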
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 291583)
    Time Spent: 2h 10m  (was: 2h)

> Rate estimation for Kafka Table
> -------------------------------
>
>                 Key: BEAM-7896
>                 URL: https://issues.apache.org/jira/browse/BEAM-7896
>             Project: Beam
>          Issue Type: New Feature
>          Components: dsl-sql
>            Reporter: Alireza Samadianzakaria
>            Assignee: Alireza Samadianzakaria
>            Priority: Major
>          Time Spent: 2h 10m
>  Remaining Estimate: 0h
>
> Currently, KafkaTable returns UNKNOWN statistics for its rate. 
> We can use previously arrived tuples to estimate the rate and return correct 
> statistics (see 
> https://docs.google.com/document/d/1vi1PBBu5IqSy-qZl1Gk-49CcANOpbNs1UAud6LnOaiY/).
>  
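
As a rough illustration of estimating a rate from previously arrived tuples, the sketch below keeps the arrival timestamps of the most recent records and divides the number of intervals by the time they span. The class `RecentRecordsRateEstimator` and its methods are hypothetical; this is not the design from the linked document or the code in the PR, only one simple way such an estimate might be computed.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sliding-window rate estimator over record arrival times. */
final class RecentRecordsRateEstimator {
  private final int windowSize;
  private final Deque<Long> arrivalMillis = new ArrayDeque<>();

  RecentRecordsRateEstimator(int windowSize) {
    if (windowSize < 2) {
      throw new IllegalArgumentException("windowSize must be at least 2");
    }
    this.windowSize = windowSize;
  }

  /** Records one arrival; timestamps are assumed to be non-decreasing. */
  void recordArrival(long timestampMillis) {
    arrivalMillis.addLast(timestampMillis);
    if (arrivalMillis.size() > windowSize) {
      arrivalMillis.removeFirst(); // keep only the most recent windowSize arrivals
    }
  }

  /** Returns records per second, or NaN when the rate cannot be estimated yet. */
  double estimateRatePerSecond() {
    if (arrivalMillis.size() < 2) {
      return Double.NaN;
    }
    long spanMillis = arrivalMillis.peekLast() - arrivalMillis.peekFirst();
    if (spanMillis <= 0) {
      return Double.NaN;
    }
    // n timestamps delimit n - 1 inter-arrival intervals.
    return (arrivalMillis.size() - 1) * 1000.0 / spanMillis;
  }
}
```

Returning NaN here plays the role of the UNKNOWN statistic mentioned above: with fewer than two observed records there is no interval to measure.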



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
