[ 
https://issues.apache.org/jira/browse/HDDS-12895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan Andika updated HDDS-12895:
-------------------------------
    Description: 
We need to evaluate whether to enable O_DIRECT (ExtendedOpenOption.DIRECT) for 
datanode reads and writes. Java has supported it since JDK 10 
([https://bugs.openjdk.org/browse/JDK-8164900]).
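
For reference, the JDK constant is com.sun.nio.file.ExtendedOpenOption.DIRECT. 
Below is a minimal sketch of what a direct write and read could look like; it 
is not Ozone code. The class DirectIoSketch and its helpers (alignUp, 
blockSize, open, roundTrip) are hypothetical names, and the 4096-byte default 
and the silent fallback to buffered I/O are assumptions made for illustration. 
Direct I/O requires the buffer address, the file position and the transfer 
size to be multiples of the file store block size, so the payload is 
zero-padded to whole blocks.

```java
import com.sun.nio.file.ExtendedOpenOption;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Arrays;

public class DirectIoSketch {

  /** Rounds n up to the next multiple of blockSize (blockSize must be > 0). */
  static long alignUp(long n, long blockSize) {
    return ((n + blockSize - 1) / blockSize) * blockSize;
  }

  /** Block size of the file store holding path; 4096 is an assumed default. */
  static int blockSize(Path path) throws IOException {
    try {
      return Math.toIntExact(Files.getFileStore(path).getBlockSize());
    } catch (UnsupportedOperationException e) {
      return 4096;
    }
  }

  /** Tries to open with DIRECT; falls back to a buffered channel if that fails. */
  static FileChannel open(Path path, OpenOption... base) throws IOException {
    OpenOption[] direct = Arrays.copyOf(base, base.length + 1, OpenOption[].class);
    direct[base.length] = ExtendedOpenOption.DIRECT;
    try {
      return FileChannel.open(path, direct);
    } catch (UnsupportedOperationException | IOException e) {
      // Platform or file system (e.g. some tmpfs versions) rejects O_DIRECT.
      return FileChannel.open(path, base);
    }
  }

  /** Writes payload padded to whole blocks, reads it back, returns the payload bytes. */
  static byte[] roundTrip(Path path, byte[] payload) throws IOException {
    int block = blockSize(path);
    int padded = Math.toIntExact(alignUp(payload.length, block));

    // Over-allocate, then slice so that the buffer address is block-aligned.
    ByteBuffer buf = ByteBuffer.allocateDirect(padded + block - 1).alignedSlice(block);
    buf.put(payload);
    buf.position(0).limit(padded); // the tail beyond the payload is zero padding

    try (FileChannel ch = open(path, StandardOpenOption.CREATE,
        StandardOpenOption.WRITE, StandardOpenOption.TRUNCATE_EXISTING)) {
      while (buf.hasRemaining()) {
        ch.write(buf);
      }
    }

    buf.clear().limit(padded);
    try (FileChannel ch = open(path, StandardOpenOption.READ)) {
      while (buf.hasRemaining() && ch.read(buf) >= 0) {
        // keep reading until the padded length is filled or EOF is reached
      }
    }

    buf.flip();
    byte[] out = new byte[payload.length];
    buf.get(out);
    return out;
  }

  public static void main(String[] args) throws IOException {
    Path tmp = Files.createTempFile("direct-io-sketch", ".bin");
    try {
      byte[] payload = "hello chunk".getBytes(StandardCharsets.UTF_8);
      byte[] back = roundTrip(tmp, payload);
      System.out.println(Arrays.equals(payload, back));
    } finally {
      Files.deleteIfExists(tmp);
    }
  }
}
```

A real datanode would more likely gate this behind a configuration flag and 
report a failed DIRECT open rather than fall back silently; the fallback here 
only keeps the sketch runnable on file systems without O_DIRECT support.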

Resources
 * 
[https://events19.linuxfoundation.org/wp-content/uploads/2017/11/Accelerating-IO-in-Big-Data-%E2%80%93-A-Data-Driven-Approach-and-Case-Studies-Yingqi-Lucy-Lu-Intel-Corporation.pdf]
 * https://github.com/facebook/rocksdb/wiki/Direct-IO

On datanodes that are colocated with compute engines (e.g. YARN / Spark / 
Presto), we want the DN to NOT use the file system cache, since the cache 
competes for memory with the colocated workloads.

However, some performance degradation is expected, since writes are not 
buffered and reads are not cached.

  was:
We need to see whether we should enable O_DIRECT (ExtendedOptions.DIRECT) in 
datanodes reads and writes. It has been supported since JDK 10 
([https://bugs.openjdk.org/browse/JDK-8164900])

Resources
 * 
[https://events19.linuxfoundation.org/wp-content/uploads/2017/11/Accelerating-IO-in-Big-Data-%E2%80%93-A-Data-Driven-Approach-and-Case-Studies-Yingqi-Lucy-Lu-Intel-Corporation.pdf]
 * https://github.com/facebook/rocksdb/wiki/Direct-IO

In some datanodes that is colocated with compute engines (e.g. Yarn / Spark / 
Presto), we want the DN to NOT use file system cache since it can affect the 
colocated machines.

However, there should be an expected performance degradations since writes are 
not buffered and reads are not cached.


> Explore O_DIRECT in Datanodes
> -----------------------------
>
>                 Key: HDDS-12895
>                 URL: https://issues.apache.org/jira/browse/HDDS-12895
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Ivan Andika
>            Assignee: Ivan Andika
>            Priority: Major


