[ 
https://issues.apache.org/jira/browse/HDDS-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Duong resolved HDDS-10429.
--------------------------
    Resolution: Not A Problem

> Confusing behavior of ozone.fs.datastream.auto.threshold in FS API.
> -------------------------------------------------------------------
>
>                 Key: HDDS-10429
>                 URL: https://issues.apache.org/jira/browse/HDDS-10429
>             Project: Apache Ozone
>          Issue Type: Improvement
>            Reporter: Duong
>            Priority: Major
>
> When writing data in streaming mode, *ozone.fs.datastream.auto.threshold* 
> defines the data-size threshold for switching between the streaming write 
> path and the normal write path (async-api using gRPC). Its default value is 
> {*}4MB{*}.
> However, the behavior of this config is inconsistent between S3G and the FS 
> API.
> In S3G, the threshold is compared against the file/key size. For instance, 
> if the written file is 5MB and the threshold is 4MB, the file is written in 
> streaming mode.
> In the FS API, the write mode is decided {*}not by the file size, but by the 
> size of the first write buffer{*}, i.e. by the buffer size passed to 
> {_}FSDataOutputStream.write(byte[] buffer){_}. For instance, if the written 
> file is *1GB*, the threshold is the same 4MB, and the file is put to Ozone 
> with the following code, the file is *not* written using streaming.
> {code:java}
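> // Copy a local file to Ozone through the FileSystem API using a 1KB buffer.
> try (InputStream is = new FileInputStream(localInputPath);
>      OutputStream os = fileSystem.create(outputPath)) {
>     byte[] buffer = new byte[1024];
>     int n;
>     // The first write is only 1KB, far below the 4MB threshold.
>     while (-1 != (n = is.read(buffer))) {
>         os.write(buffer, 0, n);
>     }
> }{code}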
> It is common practice to use a small buffer when writing data. Hence, 
> with the default threshold of 4MB, streaming mode is never used through the 
> FS API in practice.
> Also note that the FileSystem API does not provide a size hint when 
> creating files, so we may need a different way to switch automatically 
> between streaming and async-api.
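
The difference between the two decision rules described above can be illustrated with a small standalone sketch. This is not Ozone's actual implementation: the class name WriteModeSketch, both method names, and the strict greater-than comparison are assumptions made only for illustration. Only the 4MB default and the "key size vs. first write buffer size" distinction come from the issue text.

```java
public class WriteModeSketch {
    // Default of ozone.fs.datastream.auto.threshold as stated in the issue: 4MB.
    static final long THRESHOLD_BYTES = 4L * 1024 * 1024;

    // Hypothetical S3G-style rule: the whole key size is compared to the threshold.
    static boolean s3gUsesStreaming(long keySizeBytes) {
        return keySizeBytes > THRESHOLD_BYTES; // strict comparison is an assumption
    }

    // Hypothetical FS-style rule: only the length of the first write() call counts,
    // no matter how large the file eventually becomes.
    static boolean fsUsesStreaming(int firstWriteLength) {
        return firstWriteLength > THRESHOLD_BYTES; // strict comparison is an assumption
    }

    public static void main(String[] args) {
        long fiveMb = 5L * 1024 * 1024;
        // 5MB key through S3G: above the threshold.
        System.out.println(s3gUsesStreaming(fiveMb));          // prints "true"
        // 1GB file through the FS API with a 1KB buffer: first write is 1KB.
        System.out.println(fsUsesStreaming(1024));             // prints "false"
        // Same file with an 8MB first write: above the threshold.
        System.out.println(fsUsesStreaming(8 * 1024 * 1024));  // prints "true"
    }
}
```

Under these assumptions, a FileSystem client would only reach the streaming path if its first write exceeded the threshold, which is why the 1KB copy loop in the description never triggers it.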



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
