[GitHub] spark issue #19810: [SPARK-22599][SQL] In-Memory Table Pruning without Extra...

CodingCat Mon, 27 Nov 2017 19:47:50 -0800

Github user CodingCat commented on the issue:

    https://github.com/apache/spark/pull/19810
  
    Hi @cloud-fan, this PR is not only for the case where the data size is 
larger than the memory size. Even when all data is in memory, I observed 
speedups of 10-40% because the implementation here
    
    (1) reads less data
    
    (2) starts fewer tasks
    
    You can think of this PR as implementing the functionality of Parquet's 
footer for the in-memory table.
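
A minimal sketch of the footer-like pruning idea, not the PR's actual code. It assumes each cached partition keeps min/max statistics for one integer column, so a filter can skip a partition without reading it or starting a task for it. The names `CachedPartition`, `build_partition`, and `scan_greater_than` are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CachedPartition:
    """Hypothetical cached partition with footer-like min/max stats for one column."""
    rows: List[int]
    min_value: int
    max_value: int


def build_partition(rows: List[int]) -> CachedPartition:
    # Compute the statistics once, when the partition is cached.
    return CachedPartition(rows=rows, min_value=min(rows), max_value=max(rows))


def scan_greater_than(
    partitions: List[CachedPartition], threshold: int
) -> Tuple[List[int], int]:
    """Return the matching rows and the number of partitions actually scanned."""
    tasks_started = 0
    result: List[int] = []
    for p in partitions:
        # The stats prove that no row in this partition can match: skip it entirely.
        if p.max_value <= threshold:
            continue
        tasks_started += 1
        result.extend(r for r in p.rows if r > threshold)
    return result, tasks_started


# Four partitions covering 0-9, 10-19, 20-29, and 30-39.
partitions = [build_partition(list(range(s, s + 10))) for s in (0, 10, 20, 30)]
matches, tasks = scan_greater_than(partitions, 24)
print(len(matches), tasks)
```

In this toy setup, only the partitions whose maximum exceeds the threshold are scanned, which corresponds to both points above: less data is read and fewer tasks are started.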




