[
https://issues.apache.org/jira/browse/KUDU-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xixu Wang updated KUDU-3523:
----------------------------
Description:
On my aarch64 system, st_blksize does not equal the real filesystem block
size. st_blksize is 65536 bytes, but the filesystem block size is 4096 bytes.
When data smaller than 4096 bytes is written, the file occupies 4096 bytes on
disk, not 65536 bytes. Kudu, however, uses st_blksize to determine the
filesystem block size, which is not always correct.
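To check whether a host shows the same mismatch, the two values can be read side by side. The following is a minimal sketch, not Kudu code: BlockSizes and QueryBlockSizes are hypothetical names, and it assumes that statvfs(3) f_bsize reflects the filesystem block size on the filesystem in question, which is typical on Linux but not guaranteed everywhere.

```cpp
#include <sys/stat.h>
#include <sys/statvfs.h>

#include <cstdint>
#include <string>

// Block sizes reported for one path by two different POSIX calls.
struct BlockSizes {
  bool ok = false;          // false if either call failed
  uint64_t st_blksize = 0;  // stat(2): preferred I/O size hint
  uint64_t f_bsize = 0;     // statvfs(3): filesystem block size
};

// Hypothetical helper (not part of Kudu): query both values for `path`.
inline BlockSizes QueryBlockSizes(const std::string& path) {
  BlockSizes out;
  struct stat st;
  struct statvfs vfs;
  if (stat(path.c_str(), &st) != 0) return out;
  if (statvfs(path.c_str(), &vfs) != 0) return out;
  out.st_blksize = static_cast<uint64_t>(st.st_blksize);
  out.f_bsize = static_cast<uint64_t>(vfs.f_bsize);
  out.ok = true;
  return out;
}
```

On the system described in this issue, calling QueryBlockSizes() on the data directory would be expected to report st_blksize=65536 and f_bsize=4096. On many x86_64 hosts both values are 4096, which is why the mismatch usually goes unnoticed.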
The following unit test fails because of this issue:
EncryptionEnabled/LogBlockManagerTest.ContainerPreallocationTest/1
{code:java}
/root/kudu/src/kudu/fs/log_block_manager-test.cc:541: Failure
Expected equality of these values:
FLAGS_log_container_preallocate_bytes
Which is: 33554432
size
Which is: 33492992
{code}
The relevant code is as follows:
!image-2023-11-08-14-35-08-189.png!
FLAGS_log_container_preallocate_bytes=33554432 bytes
The file is encrypted, so the encryption header occupies one filesystem block.
After the first Kudu block is created, there should be 2 blocks on disk.
On my system (aarch64 kylin-10), st_blksize=65536, but the filesystem block
size is 4096 (see part 4 below).
When the encryption header is written to the file, its on-disk size is 4096
bytes. When a new block is written, its offset is 65536, because Kudu uses
st_blksize to decide the next block offset (see
src/kudu/util/env_posix.cc#GetBlockSize()). So the header takes only 4096
bytes on disk, but Kudu assumes it occupies 65536 bytes and preallocates
(FLAGS_log_container_preallocate_bytes - 1) bytes for the file. This leaves a
hole of (65536 - 4096) bytes after the header. As a result, the file size on
disk is (FLAGS_log_container_preallocate_bytes - (65536 - 4096)) = 33492992.
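The arithmetic above can be written out as a small model. This is only a sketch of the reasoning in this report, not Kudu's preallocation logic: RoundUp and ExpectedOnDiskBytes are hypothetical names, and the 64-byte header size is taken from the stat output in part 3.

```cpp
#include <cstdint>

// Round `n` up to the next multiple of `block` (block must be non-zero).
constexpr uint64_t RoundUp(uint64_t n, uint64_t block) {
  return (n + block - 1) / block * block;
}

// Illustrative model: bytes expected on disk after preallocation when the
// first data block is placed using `assumed_block` (st_blksize) while the
// filesystem really allocates in units of `fs_block`.
inline uint64_t ExpectedOnDiskBytes(uint64_t preallocate_bytes,
                                    uint64_t assumed_block,
                                    uint64_t fs_block,
                                    uint64_t header_bytes) {
  // Offset at which the first data block is written.
  const uint64_t first_block_offset = RoundUp(header_bytes, assumed_block);
  // Space the header actually occupies on disk.
  const uint64_t header_on_disk = RoundUp(header_bytes, fs_block);
  // Unallocated gap between the header and the first data block.
  const uint64_t hole = first_block_offset - header_on_disk;
  return preallocate_bytes - hole;
}
```

With preallocate_bytes=33554432, assumed_block=65536, fs_block=4096 and header_bytes=64, the model gives 33492992, the value seen in the failing test. With assumed_block=4096 the hole is zero and the result is 33554432, the value the test expects.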
{color:#de350b}In my opinion, Kudu should use the filesystem block size
(f_bsize) as the Kudu block size, not st_blksize.{color}
*1. The test environment*
Linux hybrid01 4.19.90-23.30.v2101.ky10.aarch64 #1 SMP Thu Dec 15 09:57:55 CST
2022 aarch64 aarch64 aarch64 GNU/Linux. A Docker container runs on this host.
*2. Create a file with an encryption header*
{code:java}
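const string kFile = JoinPathSegments(test_dir_, "encrypted_file");
unique_ptr<RWFile> rw;
RWFileOptions opts;
// Marking the file as sensitive makes Kudu write an encryption header.
opts.is_sensitive = true;
ASSERT_OK(env_->NewRWFile(opts, kFile, &rw));
// Query how many bytes the new file occupies on disk.
uint64_t file_size = 0;
ASSERT_OK(env_->GetFileSizeOnDisk(kFile, &file_size));
{code}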
*3. stat the file*
The IO Block size is 65536, which means st_blksize is 65536. The logical file
size is 64 bytes.
!image-2023-11-06-15-42-46-082.png!
*4. The filesystem block size is 4096 bytes*
!image-2023-11-06-15-45-39-233.png!
*5. The on-disk size of the file is 4096 bytes*
!image-2023-11-06-15-52-41-834.png!
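Steps 2 to 5 can be approximated without Kudu by writing a small file and reading its sizes back with fstat(2). This is a rough sketch under stated assumptions: FileSizes and WriteAndStat are hypothetical names, the 64-byte payload stands in for the encryption header, and st_blocks is assumed to count 512-byte units as it does on Linux.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <string>

// Sizes reported by fstat(2) for a freshly written file.
struct FileSizes {
  bool ok = false;
  uint64_t logical_bytes = 0;   // st_size
  uint64_t on_disk_bytes = 0;   // st_blocks * 512 (Linux unit assumed)
  uint64_t io_block_bytes = 0;  // st_blksize
};

// Hypothetical helper (not part of Kudu): write `num_bytes` bytes to `path`,
// sync them, and report the sizes the kernel returns for the file.
inline FileSizes WriteAndStat(const std::string& path, size_t num_bytes) {
  FileSizes out;
  const int fd = open(path.c_str(), O_CREAT | O_TRUNC | O_RDWR, 0600);
  if (fd < 0) return out;
  const std::string payload(num_bytes, 'x');
  struct stat st;
  const ssize_t written = write(fd, payload.data(), payload.size());
  if (written == static_cast<ssize_t>(payload.size()) &&
      fsync(fd) == 0 && fstat(fd, &st) == 0) {
    out.logical_bytes = static_cast<uint64_t>(st.st_size);
    out.on_disk_bytes = static_cast<uint64_t>(st.st_blocks) * 512;
    out.io_block_bytes = static_cast<uint64_t>(st.st_blksize);
    out.ok = true;
  }
  close(fd);
  return out;
}
```

Calling WriteAndStat(path, 64) on the affected filesystem would be expected to show a logical size of 64 bytes, an on-disk size of 4096 bytes and an I/O block size of 65536 bytes, matching the screenshots above. The on-disk figure depends on the filesystem, so other hosts may report something different.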
> st_blksize is not always equal to the filesystem block size
> -----------------------------------------------------------
>
> Key: KUDU-3523
> URL: https://issues.apache.org/jira/browse/KUDU-3523
> Project: Kudu
> Issue Type: Bug
> Reporter: Xixu Wang
> Priority: Major
> Attachments: image-2023-11-06-15-42-46-082.png,
> image-2023-11-06-15-45-11-819.png, image-2023-11-06-15-45-39-233.png,
> image-2023-11-06-15-52-41-834.png, image-2023-11-08-14-35-08-189.png