[
https://issues.apache.org/jira/browse/KUDU-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15283135#comment-15283135
]
Dan Burkert commented on KUDU-1453:
-----------------------------------
Spark doesn't provide an API for notification on task shutdown, so client
closure is tied to a JVM shutdown hook. I think the best bet from a
performance perspective is to create a static, lazily initialized Netty
instance shared between all clients. Unfortunately this means modifying
the public API, and it may be tricky if we don't want to expose Netty
(it's currently shaded).
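
A minimal sketch of the sharing-plus-shutdown-hook pattern described above,
illustrated with a whole shared client rather than a shared Netty instance
(since the shaded Netty types aren't public). The class name, and the
assumption that every caller passes the same master addresses, are
hypothetical; this is not the proposed implementation.

{code:java}
import org.apache.kudu.client.KuduClient;

// Hypothetical sketch: one lazily initialized KuduClient per JVM,
// shared by all tasks on an executor, closed from a JVM shutdown hook
// because Spark offers no task-shutdown callback.
public final class SharedKuduClient {
  private static volatile KuduClient client;

  private SharedKuduClient() {}

  public static KuduClient get(String masterAddresses) {
    if (client == null) {
      synchronized (SharedKuduClient.class) {
        if (client == null) {
          // Assumes all callers in this JVM use the same master addresses.
          client = new KuduClient.KuduClientBuilder(masterAddresses).build();
          Runtime.getRuntime().addShutdownHook(new Thread(() -> {
            try {
              client.close();
            } catch (Exception e) {
              // Best effort: the JVM is going down anyway.
            }
          }));
        }
      }
    }
    return client;
  }
}
{code}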
> Spark executors leak kudu clients and netty threads
> ---------------------------------------------------
>
> Key: KUDU-1453
> URL: https://issues.apache.org/jira/browse/KUDU-1453
> Project: Kudu
> Issue Type: Bug
> Components: spark
> Affects Versions: 0.8.0
> Reporter: Todd Lipcon
> Priority: Blocker
>
> On a test cluster, every time I run a Spark SQL query against a table, each
> of my Spark worker tasks ends up with another ~500 Netty worker threads
> created. It seems like each Spark partition/task is creating its own
> KuduClient, which then creates a bunch of worker threads and never cleans
> them up.
> I'm calling this a blocker since after ~20 queries or so, the machines would
> run out of threads and crash.
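
For reference, a self-contained way to observe the leak from inside an
executor JVM is to count live threads whose names match Netty's I/O worker
naming. The "New I/O worker" prefix is Netty 3's default and is an assumption
here; the shaded client may use a different name.

{code:java}
// Hypothetical diagnostic: count threads that look like Netty I/O workers.
// Run (or call count()) before and after a query to see the growth.
public final class NettyThreadCount {
  public static long count() {
    return Thread.getAllStackTraces().keySet().stream()
        // "New I/O worker" is Netty 3's default thread name; adjust if
        // the shaded client renames its threads.
        .filter(t -> t.getName().contains("New I/O worker"))
        .count();
  }

  public static void main(String[] args) {
    System.out.println("Netty-style worker threads: " + count());
  }
}
{code}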
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)