[ 
https://issues.apache.org/jira/browse/GOBBLIN-1876?focusedWorklogId=876249&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-876249
 ]

ASF GitHub Bot logged work on GOBBLIN-1876:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 15/Aug/23 02:53
            Start Date: 15/Aug/23 02:53
    Worklog Time Spent: 10m 
      Work Description: homatthew commented on code in PR #3738:
URL: https://github.com/apache/gobblin/pull/3738#discussion_r1294135399


##########
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaExtractor.java:
##########
@@ -337,4 +340,24 @@ public void close() throws IOException {
   public long getHighWatermark() {
     return 0;
   }
+
+  public static String getKafkaBrokerSimpleName(State state) {

Review Comment:
   In my mind, this kafka extractor method is already a downstream method that 
wants exactly 1 broker. Any implementation of this class that properly supports 
multiple kafka brokers, should have its own implementation and work with the 
map directly. 
   
   This implementation works out of the box for those that are just consuming 
from a single kafka broker per extractor, which is IMO the general case and 
seems to be the case for existing implementations. 



##########
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaExtractor.java:
##########
@@ -337,4 +340,24 @@ public void close() throws IOException {
   public long getHighWatermark() {
     return 0;
   }
+
+  public static String getKafkaBrokerSimpleName(State state) {

Review Comment:
   I put the annoying boiler plate for parsing the map in the common utils 
which we would never want to duplicate, and I think that is the correct layer 
of abstraction for all use cases (single kafka broker or multiple). 



##########
gobblin-modules/gobblin-kafka-common/src/main/java/org/apache/gobblin/source/extractor/extract/kafka/KafkaExtractor.java:
##########
@@ -337,4 +340,24 @@ public void close() throws IOException {
   public long getHighWatermark() {
     return 0;
   }
+
+  public static String getKafkaBrokerSimpleName(State state) {

Review Comment:
   The reason I separated this from the original common utils is because if 
someone has multiple kafka brokers, then it wouldn't make much sense to return 
the list of strings. Realistically they'd need to map out which simple name 
corresponds to which broker uri. 
   
   In other words, they would definitely want the entire map so that they it's 
easier to reason about in the code.





Issue Time Tracking
-------------------

    Worklog Id:     (was: 876249)
    Time Spent: 40m  (was: 0.5h)

> Kafka source / extractor utility to get a simple name for kafka brokers
> -----------------------------------------------------------------------
>
>                 Key: GOBBLIN-1876
>                 URL: https://issues.apache.org/jira/browse/GOBBLIN-1876
>             Project: Apache Gobblin
>          Issue Type: Improvement
>            Reporter: Matthew Ho
>            Priority: Major
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> This utility can be used to write human readable names for the URIs that can 
> be used in Gobblin tracking events, Gobblin Metadata Change Events



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to