[ 
https://issues.apache.org/jira/browse/KAFKA-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steven Zhen Wu updated KAFKA-2121:
----------------------------------
    Attachment: KAFKA-2121_2015-04-20_09:06:09.patch

> prevent potential resource leak in KafkaProducer and KafkaConsumer
> ------------------------------------------------------------------
>
>                 Key: KAFKA-2121
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2121
>             Project: Kafka
>          Issue Type: Bug
>          Components: producer 
>    Affects Versions: 0.8.2.0
>            Reporter: Steven Zhen Wu
>            Assignee: Jun Rao
>         Attachments: KAFKA-2121.patch, KAFKA-2121_2015-04-16_09:55:14.patch, 
> KAFKA-2121_2015-04-16_10:43:55.patch, KAFKA-2121_2015-04-18_20:09:20.patch, 
> KAFKA-2121_2015-04-19_20:08:45.patch, KAFKA-2121_2015-04-19_20:30:18.patch, 
> KAFKA-2121_2015-04-20_09:06:09.patch
>
>
> On Mon, Apr 13, 2015 at 7:17 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> It is a valid problem and we should correct it as soon as possible, I'm
> with Ewen regarding the solution.
> On Mon, Apr 13, 2015 at 5:05 PM, Ewen Cheslack-Postava <e...@confluent.io>
> wrote:
> > Steven,
> >
> > Looks like there is even more that could potentially be leaked -- since key
> > and value serializers are created and configured at the end, even the IO
> > thread allocated by the producer could leak. Given that, I think 1 isn't a
> > great option since, as you said, it doesn't really address the underlying
> > issue.
> >
> > 3 strikes me as bad from a user experience perspective. It's true we might
> > want to introduce additional constructors to make testing easier, but the
> > more components I need to allocate myself and inject into the producer's
> > constructor, the worse the default experience is. And since you would have
> > to inject the dependencies to get correct, non-leaking behavior, it will
> > always be more code than previously (and a backwards incompatible change).
> > Additionally, the code creating a the producer would have be more
> > complicated since it would have to deal with the cleanup carefully whereas
> > it previously just had to deal with the exception. Besides, for testing
> > specifically, you can avoid exposing more constructors just for testing by
> > using something like PowerMock that let you mock private methods. That
> > requires a bit of code reorganization, but doesn't affect the public
> > interface at all.
> >
> > So my take is that a variant of 2 is probably best. I'd probably do two
> > things. First, make close() safe to call even if some fields haven't been
> > initialized, which presumably just means checking for null fields. (You
> > might also want to figure out if all the methods close() calls are
> > idempotent and decide whether some fields should be marked non-final and
> > cleared to null when close() is called). Second, add the try/catch as you
> > suggested, but just use close().
> >
> > -Ewen
> >
> >
> > On Mon, Apr 13, 2015 at 3:53 PM, Steven Wu <stevenz...@gmail.com> wrote:
> >
> > > Here is the resource leak problem that we have encountered when 0.8.2
> > java
> > > KafkaProducer failed in constructor. here is the code snippet of
> > > KafkaProducer to illustrate the problem.
> > >
> > > -------------------------------
> > > public KafkaProducer(ProducerConfig config, Serializer<K> keySerializer,
> > > Serializer<V> valueSerializer) {
> > >
> > >     // create metrcis reporter via reflection
> > >     List<MetricsReporter> reporters =
> > >
> > >
> > config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG,
> > > MetricsReporter.class);
> > >
> > >     // validate bootstrap servers
> > >     List<InetSocketAddress> addresses =
> > >
> > >
> > ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
> > >
> > > }
> > > -------------------------------
> > >
> > > let's say MyMetricsReporter creates a thread in constructor. if hostname
> > > validation threw an exception, constructor won't call the close method of
> > > MyMetricsReporter to clean up the resource. as a result, we created
> > thread
> > > leak issue. this becomes worse when we try to auto recovery (i.e. keep
> > > creating KafkaProducer again -> failing again -> more thread leaks).
> > >
> > > there are multiple options of fixing this.
> > >
> > > 1) just move the hostname validation to the beginning. but this is only
> > fix
> > > one symtom. it didn't fix the fundamental problem. what if some other
> > lines
> > > throw an exception.
> > >
> > > 2) use try-catch. in the catch section, try to call close methods for any
> > > non-null objects constructed so far.
> > >
> > > 3) explicitly declare the dependency in the constructor. this way, when
> > > KafkaProducer threw an exception, I can call close method of metrics
> > > reporters for releasing resources.
> > >     KafkaProducer(..., List<MetricsReporter> reporters)
> > > we don't have to dependency injection framework. but generally hiding
> > > dependency is a bad coding practice. it is also hard to plug in mocks for
> > > dependencies. this is probably the most intrusive change.
> > >
> > > I am willing to submit a patch. but like to hear your opinions on how we
> > > should fix the issue.
> > >
> > > Thanks,
> > > Steven
> > >
> >
> >
> >
> > --
> > Thanks,
> > Ewen
> >
> --
> -- Guozhang



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to