[GitHub] spark issue #21500: Scalable Memory option for HDFSBackedStateStore

HeartSaVioR Mon, 11 Jun 2018 07:26:17 -0700

Github user HeartSaVioR commented on the issue:

    https://github.com/apache/spark/pull/21500
  
    After enabling the option, I observed a small, expected latency at the 
start of each batch for each partition. The median/average was 4~50 ms in my 
case, but the max latency was a bit higher than 700 ms.
    
    Please note that the state size in my experiment is not very large, so if 
a partition has a much bigger state, the latency could be much higher: 
    
    ```
    memory used by state total (min, med, max): 812.6 KB (2.1 KB, 4.1 KB, 4.1 KB)
    time to commit changes total (min, med, max): 13.5 s (21 ms, 35 ms, 449 ms)
    total time to remove rows total (min, med, max): 22 ms (22 ms, 22 ms, 22 ms)
    number of updated state rows: 5,692
    total time to update rows total (min, med, max): 1.4 s (3 ms, 5 ms, 42 ms)
    ```
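
    For reference, each metric line above appears to report a total across tasks followed by the per-task min, median, and max. A minimal Python sketch of that kind of summary follows; the function names and sample values are hypothetical and are not taken from Spark or from the measurements above.

```python
from statistics import median


def summarize(samples_ms):
    """Return (total, min, med, max) for per-task durations in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one sample")
    return (sum(samples_ms), min(samples_ms), median(samples_ms), max(samples_ms))


def format_summary(name, samples_ms):
    """Render a summary in the 'total (min, med, max)' layout used above."""
    total, lo, med, hi = summarize(samples_ms)
    return f"{name} total (min, med, max): {total} ms ({lo} ms, {med} ms, {hi} ms)"


if __name__ == "__main__":
    # Hypothetical per-partition samples, for illustration only.
    samples = [21, 35, 449]
    print(format_summary("time to commit changes", samples))
```

    With a summary like this, a single slow partition shows up in the max while barely moving the median, which is the pattern visible in the numbers above.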
    
    As I explained earlier, loading the last version from files adds latency 
that could be avoided.


