[GitHub] spark pull request: [SPARK-12196][Core] Store/retrieve blocks in d...

yucai Mon, 11 Apr 2016 00:14:19 -0700

Github user yucai commented on the pull request:

    https://github.com/apache/spark/pull/10225#issuecomment-208193697
  
    @JoshRosen @rxin 
    Dear committers:
    In China, most companies still use HDDs as external storage, so the 
I/O bottleneck in shuffle is quite obvious. Our solution uses a single PCIe 
SSD as a cache, which completely eliminates the I/O bottleneck at very low cost.
    
    In a real production case at Baidu, Spark SQL improves by 1.7x with this patch.
    In Youku's machine learning case, the application improves by 1.8x with this 
patch. 
    And because only one SSD is added, the cost is very attractive to customers.
    
    This PR is really important to Chinese internet and big data companies; more 
and more companies are showing interest and evaluating it in their environments. 
We are quite sure it will benefit others as well, so please help review it.
    
    Many thanks!



