Github user tdas commented on the issue:
https://github.com/apache/spark/pull/20767
The idea is good, but how do you propose exposing that information?
A periodic printout in the log?
From a different angle, I would rather not add feature creep to this PR,
which is intended to be backported to 2.3.
On Mar 15, 2018 7:31 PM, "tedyu" <[email protected]> wrote:
> *@tedyu* commented on this pull request.
> ------------------------------
>
> In external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/
> KafkaDataConsumer.scala
> <https://github.com/apache/spark/pull/20767#discussion_r174984237>:
>
> > CachedKafkaDataConsumer(newInternalConsumer)
>
> - } else if (existingInternalConsumer.inuse) {
> + } else if (existingInternalConsumer.inUse) {
> // If consumer is already cached but is currently in use, then return a new consumer
> NonCachedKafkaDataConsumer(newInternalConsumer)
>
> Maybe keep an internal counter for how many times the non-cached consumer
> is created.
> This would give us information on how effective the cache is.
>
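For context, a minimal sketch of the counter idea suggested above; the names (`CacheMissCounter`, `recordNonCachedCreation`) are hypothetical and not part of this PR:

```scala
import java.util.concurrent.atomic.AtomicLong

// Hypothetical helper (not in the PR): counts how often a non-cached
// consumer must be created because the cached one is still in use.
object CacheMissCounter {
  private val nonCachedCreations = new AtomicLong(0)

  // Would be called from the branch that returns NonCachedKafkaDataConsumer.
  def recordNonCachedCreation(): Long = nonCachedCreations.incrementAndGet()

  // One possible answer to the exposure question: a periodic log line,
  // e.g. driven by an existing reporting thread.
  def report(log: String => Unit): Unit =
    log(s"Non-cached Kafka consumers created so far: ${nonCachedCreations.get()}")
}
```

A high count relative to the total number of consumer requests would suggest the cache is undersized or heavily contended.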