marsupialtail commented on PR #13662: URL: https://github.com/apache/arrow/pull/13662#issuecomment-1216997121
Let's review our options here:

1) O_DIRECT with O_SYNC. Horribly slow, with the same problems as O_DIRECT on its own. Not worth it if you don't want to persist every write.
2) O_DIRECT without O_SYNC. This is my preferred option. It does not persist each write onto the SSD immediately, but it uses the SSD cache to move data off the page cache, which saves memory. It offers no persistence for fault tolerance, but it saves memory for our purposes.
3) fadvise with O_SYNC. Horribly slow (> 15x slower than option 2 on my machine).
4) fadvise without O_SYNC. This does not free up the page cache.

To see swapping occur, you can run a memory-intensive job alongside the binary compiled with option 2 or option 4. (Make sure SIZE and N are the same in direct.cpp and fadvise.cpp.) A good choice would be SIZE = 1024 * 1024 and N = 1024 * 30, i.e. write 30 GB in total, 1 MB per write. Then, while this write job is running, run this Python script:

```
import time
import numpy as np
start = time.time()
a = np.random.normal(size=(1024,1024,1024))
print(time.time() - start)
```

With option 2 (./direct & python script.py), the write itself takes 12s and the script takes 30s on my machine.

With option 4 (./fadvise & python script.py), the write itself takes 25s and the script takes 40s on my machine, and it uses up all free memory.
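For readers who want to try option 2 (O_DIRECT without O_SYNC) without the C++ sources, the write loop might look roughly like the sketch below. It is not the contents of direct.cpp, which is not shown in this comment. The names `write_chunks` and `_write_all` are hypothetical, and the fallback to buffered writes when O_DIRECT is unavailable or rejected by the filesystem is an assumption added so the sketch runs anywhere. The buffer comes from an anonymous `mmap` because O_DIRECT generally needs an aligned buffer, and an anonymous mapping is page-aligned.

```python
import mmap
import os

SIZE = 1024 * 1024  # bytes per write (1 MiB, matching the SIZE suggested above)
N = 8               # number of writes; 1024 * 30 would give the 30 GB run


def _write_all(path, flags, buf, n):
    """Open path with the given flags and write buf to it n times."""
    fd = os.open(path, flags, 0o644)
    try:
        for _ in range(n):
            # A short write would leave the next offset unaligned, so treat it as an error.
            if os.write(fd, buf) != len(buf):
                raise OSError("short write")
    finally:
        os.close(fd)


def write_chunks(path, size=SIZE, n=N):
    """Write n chunks of `size` bytes to path; return True if O_DIRECT was used."""
    base = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    o_direct = getattr(os, "O_DIRECT", 0)  # 0 on platforms that lack the flag
    buf = mmap.mmap(-1, size)              # anonymous mapping: page-aligned memory
    buf.write(b"x" * size)
    try:
        if o_direct:
            try:
                _write_all(path, base | o_direct, buf, n)
                return True
            except OSError:
                pass  # assumption: fall back to buffered I/O if O_DIRECT is rejected
        _write_all(path, base, buf, n)
        return False
    finally:
        buf.close()
```

Calling `write_chunks("out.bin", n=1024 * 30)` would approximate the 30 GB run described above; note that O_SYNC is deliberately left out of the flags, so no per-write persistence is requested.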
