LZD-PratyushBhatt commented on code in PR #7529:
URL: https://github.com/apache/ozone/pull/7529#discussion_r1881427563
##########
hadoop-hdds/docs/content/feature/Short-Circuit-Read.md:
##########
@@ -0,0 +1,75 @@
+---
+title: "Short Circuit Local Read in Datanode"
+weight: 2
+menu:
+   main:
+      parent: Features
+summary: Introduction to Ozone Datanode Short Circuit Local Read Feature
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+-->
+
+By default, client reads data over GRPC from the Datanode. When the client asks the Datanode to read a file, the DataNode reads that file off of the disk and sends the data to the client over a GRPC connection.
+
+This “short-circuit” local read feature will bypass the DataNode, allowing the client to read the file from local disk directly when the client is co-located with the data on the same server.
+
+Short-circuit local read can provide a substantial performance boost to many applications, by removing the overhead of network communication.
+
+## Prerequisite
+
+Short-circuit local reads make use of a UNIX domain socket. This is a special path in the filesystem that allows the client and the DataNodes to communicate.
+
+The Hadoop native library `libhadoop.so` provides support to for Unix domain sockets. Please refer to Hadoop's [Native Libraries Guide](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/NativeLibraries.html) for details.
+
+The Hadoop version used in Ozone is defined by `hadoop.version` in pom.xml. Before enabling short-circuit local reads, find the `libhadoop.so` from the corresponding version Hadoop release package, put it under one of the directories specified by Java `java.library.path` property. The default value of `java.library.path` depends on the OS and Java version. For example, on Linux with OpenJDK 8 it is `/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib`.
+
+The `ozone checknative` command can be used to detect whether `libhadoop.so` can be found and loaded successfully by Ozone service.
+
+
+## Configuration
+
+Short-circuit local reads need to be configured on both the DataNode and the client. By default, it is disabled.
+
+```XML
+<property>
+  <name>ozone.client.read.short-circuit</name>
+  <value>false</value>
+  <description>Disable or enable the short-circuit local read feature.</description>
+</property>
+```
+
+It makes use of a UNIX domain socket, a special path in the filesystem. You will need to set a path to this socket.
+
+```XML
+<property>
+  <name>ozone.domain.socket.path</name>
+  <value>/var/lib/ozone_dn_socket</value>
+  <description>The path used to create domain socket.</description>
+</property>
+```
+
+The DataNode needs to be able to create this path. On the other hand, it should not be possible for any user except the Ozone user(user who launches Ozone service) or root to create this path. For this reason, paths under `/var/run` or `/var/lib` are often used, just like the current default value `/var/lib/ozone_dn_socket`.
+
+If you configure the `ozone.domain.socket.path` to another value, for example `/dir1/dir2/ozone_dn_socket`, please make sure that both `dir1` and `dir2` are exiting directories, but the file `ozone_dn_socket` does not exist under `dir2`.
`ozone_dn_socket` will be created by Ozone Datanode later during its startup.

Review Comment:
   Small nit: Typo in 'exiting', it should be 'existing'.
