Github user suyanNone commented on the pull request:

    https://github.com/apache/spark/pull/6586#issuecomment-108707568
  
    @srowen 
    Today I spent some time running a performance test.
    
    With a single cycle, TestOutputStream has a slight edge in most cases, 
possibly because creating and destroying a direct buffer is costly.
    
    cycle: 1, data: 10MB
    TestOutputStream: 12
    TestChannel: 14
    
    cycle: 1, data: 50MB
    TestOutputStream: 46
    TestChannel: 54
    
    cycle: 1, data: 100MB
    TestOutputStream: 110
    TestChannel: 112
    
    cycle: 1, data: 500MB
    TestOutputStream: 620
    TestChannel: 600
    
    When the cycle count is increased to 10, the FileOutputStream time grows 
in direct proportion to the number of cycles. The channel, thanks to the direct 
buffer pool, takes only slightly longer than its "cycle 1" time.
     
    cycle: 10, data: 10MB
    TestOutputStream: 100
    TestChannel: 16
    
    cycle: 10, data: 50MB
    TestOutputStream: 474
    TestChannel: 63
    
    cycle: 10, data: 100MB
    TestOutputStream: 1118
    TestChannel: 138
    
    cycle: 10, data: 500MB
    TestOutputStream: 6332
    TestChannel: 690
    
    The test also shows that the time to create a direct buffer is directly 
proportional to the data size, so I think slicing large data into smaller 
pieces would be good for performance and would reduce the size of the direct 
buffer pool.
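
    To illustrate the slicing idea, here is a rough sketch, not the code 
used in the test above. The class name SlicedChannelWrite, the helper 
writeSliced, and the 64 KB slice size are all hypothetical. It assumes the 
data sits in a heap ByteBuffer and that the JDK copies each write into a 
temporary direct buffer about as large as the bytes handed to the channel (my 
understanding of OpenJDK behavior, not something verified here). Under that 
assumption, capping each write at sliceSize bytes caps the direct buffer too.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SlicedChannelWrite {

    // Hypothetical helper: writes all remaining bytes of `data` to `channel`,
    // exposing at most `sliceSize` bytes to the channel per write round.
    // Returns the total number of bytes written.
    static long writeSliced(FileChannel channel, ByteBuffer data, int sliceSize)
            throws IOException {
        if (sliceSize <= 0) {
            throw new IllegalArgumentException("sliceSize must be positive");
        }
        // duplicate() shares the content but has its own position and limit,
        // so the caller's buffer is left untouched.
        ByteBuffer view = data.duplicate();
        int end = view.limit();
        long written = 0;
        while (view.position() < end) {
            int chunk = Math.min(sliceSize, end - view.position());
            view.limit(view.position() + chunk);
            // A single write() may be partial, so drain the current slice.
            while (view.hasRemaining()) {
                written += channel.write(view);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = new byte[10 * 1024 * 1024]; // 10 MB of zeros
        Path tmp = Files.createTempFile("sliced-write", ".bin");
        try (FileChannel ch = FileChannel.open(tmp, StandardOpenOption.WRITE)) {
            long n = writeSliced(ch, ByteBuffer.wrap(payload), 64 * 1024);
            System.out.println("bytes written: " + n);
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

    The slice size is a tuning knob. I have not measured which value works 
best, so it would need the same kind of test as above.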
    
    
    


