[GitHub] spark issue #10846: [SPARK-12920][SQL] Fix high CPU usage in spark thrift se...

rajeshbalamohan Mon, 08 Aug 2016 17:03:02 -0700

Github user rajeshbalamohan commented on the issue:

    https://github.com/apache/spark/pull/10846
  
    Soft references put a lot of memory pressure on the thrift server. To be
precise, when executing a query over a large dataset, the server can very
quickly reach 1200% CPU with all threads doing nothing but GC. This comes from
the HadoopRDD conf caching: because the entries are held through soft
references, they accumulate until the GC threshold is reached and only then get
cleared. The server does not OOM, but it runs at very high CPU due to GC.
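
    To make the soft-reference behaviour concrete, here is a minimal sketch of
a soft-valued cache. The class and method names (SoftValueCache, getOrCompute)
are hypothetical and this is not Spark's HadoopRDD code; it only assumes the
standard java.lang.ref.SoftReference semantics, where the JVM keeps the
referent until it is short on memory.

```java
import java.lang.ref.SoftReference;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical soft-valued cache; a stand-in for illustration, not Spark's code. */
final class SoftValueCache<K, V> {
    private final ConcurrentHashMap<K, SoftReference<V>> entries = new ConcurrentHashMap<>();

    /** Returns the cached value, rebuilding it if it was never cached or the GC cleared it. */
    V getOrCompute(K key, Function<K, V> loader) {
        SoftReference<V> ref = entries.get(key);
        // ref.get() returns null once the collector has cleared the referent.
        V value = (ref == null) ? null : ref.get();
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, new SoftReference<>(value));
        }
        return value;
    }

    /** Number of map entries, including those whose referent has already been cleared. */
    int size() {
        return entries.size();
    }
}
```

    Nothing in such a cache evicts entries on its own, so cached values pile up
until the collector decides memory is tight, which would be consistent with the
heap sitting near the GC threshold as described above.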
    
    JobProgress* does not clean up its data fast enough in some cases (e.g.,
when too many queries are executed back to back), and in such cases the memory
pressure on the thrift server increases.
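
    For comparison, one way to keep retained per-job data bounded is to evict
the oldest entry on every insert. The sketch below is only an illustration
under that assumption: BoundedJobStore and maxRetained are hypothetical names,
and this is not how the JobProgress* code is implemented.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical bounded store for per-job data; not Spark's listener code. */
final class BoundedJobStore<V> extends LinkedHashMap<Integer, V> {
    private final int maxRetained;

    BoundedJobStore(int maxRetained) {
        super(16, 0.75f, false); // insertion order, so the oldest job is the eldest entry
        this.maxRetained = maxRetained;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<Integer, V> eldest) {
        // Called by LinkedHashMap after each put; returning true drops the eldest entry.
        return size() > maxRetained;
    }
}
```

    With a cap of 2, inserting jobs 1, 2 and 3 leaves only jobs 2 and 3 in the
store, so memory use stays bounded no matter how quickly queries arrive.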
    
    Both of them contribute to the high CPU usage. I am afraid that fixing only
one of them would still leave the high-CPU issue in place.




