[ 
https://issues.apache.org/jira/browse/KUDU-3523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xixu Wang updated KUDU-3523:
----------------------------
    Description: 
On my aarch64 system, st_blksize is not equal to the real filesystem block 
size. The st_blksize on my system is 65536 bytes, but the block size of the 
filesystem is 4096 bytes. When data smaller than 4096 bytes is written, the 
file's on-disk size is 4096 bytes, not 65536 bytes. However, Kudu uses 
st_blksize to determine the filesystem block size, which is not always 
correct.

 

The following unit test fails because of this issue: 
EncryptionEnabled/LogBlockManagerTest.ContainerPreallocationTest/1
{code:java}
/root/kudu/src/kudu/fs/log_block_manager-test.cc:541: Failure

Expected equality of these values:
  FLAGS_log_container_preallocate_bytes
    Which is: 33554432
  size
    Which is: 33492992 {code}
The relevant code is as follows:

!image-2023-11-08-14-35-08-189.png!
 
FLAGS_log_container_preallocate_bytes=33554432 bytes
The file is encrypted, so the encryption header occupies one filesystem 
block. After the first block is created, there should be 2 blocks on disk.
On my system (aarch64 kylin-10), st_blksize=65536, but the filesystem block 
size is 4096 (see part 4 below).
When the encryption header is written to the file, the on-disk size is 4096 
bytes. When a new block is written, its offset is 65536, because Kudu uses 
st_blksize to decide the next block offset (see 
src/kudu/util/env_posix.cc#GetBlockSize()). Therefore, only 4096 bytes of the 
first block are on disk, but Kudu assumes the block occupies 65536 bytes and 
preallocates (FLAGS_log_container_preallocate_bytes - 1) bytes for this file. 
In reality, this leaves a hole of (65536 - 4096) bytes in that block. 
As a result, the file's on-disk size is (FLAGS_log_container_preallocate_bytes 
- (65536 - 4096)) = 33492992 bytes.
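
The arithmetic above can be checked with a small compile-time sketch. The constant and function names are illustrative only and are not Kudu identifiers; the values are the ones reported in this issue.

```cpp
#include <cstdint>

// Values reported in this issue. The names are hypothetical, not Kudu code.
constexpr uint64_t kPreallocateBytes = 33554432;  // FLAGS_log_container_preallocate_bytes
constexpr uint64_t kStBlksize = 65536;            // st_blksize reported for the file
constexpr uint64_t kFsBlockSize = 4096;           // real filesystem block size

// Bytes of the first st_blksize-sized region that are not backed by disk
// blocks when only one filesystem block (the encryption header) is written.
constexpr uint64_t HoleBytes(uint64_t st_blksize, uint64_t fs_block_size) {
  return st_blksize - fs_block_size;
}

// On-disk size seen by the test: the preallocated size minus the hole.
constexpr uint64_t ExpectedOnDiskBytes(uint64_t preallocate_bytes,
                                       uint64_t st_blksize,
                                       uint64_t fs_block_size) {
  return preallocate_bytes - HoleBytes(st_blksize, fs_block_size);
}

static_assert(HoleBytes(kStBlksize, kFsBlockSize) == 61440,
              "hole left in the first block");
static_assert(ExpectedOnDiskBytes(kPreallocateBytes, kStBlksize,
                                  kFsBlockSize) == 33492992,
              "matches the size reported by the failing test");
```

When the two block sizes agree, the hole is zero and the on-disk size equals the preallocation size, which is the case the test expects.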
 
{color:#de350b}In my opinion, Kudu should use the filesystem block size 
(f_bsize) as the Kudu block size, not st_blksize.{color}
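
As a rough illustration of the difference between the two values, the sketch below reads both for a given path using the POSIX stat() and statvfs() calls. The struct and function names are hypothetical; this is not how Kudu's GetBlockSize() is implemented, only a standalone way to compare the two numbers on a given system.

```cpp
#include <sys/stat.h>
#include <sys/statvfs.h>

#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper type, not part of Kudu: the two "block size" values
// that POSIX exposes for a path.
struct BlockSizes {
  uint64_t stat_blksize;   // st_blksize from stat(): preferred I/O block size
  uint64_t statvfs_bsize;  // f_bsize from statvfs(): filesystem block size
};

// Fills 'out' for 'path'. Returns false if either system call fails.
bool QueryBlockSizes(const std::string& path, BlockSizes* out) {
  assert(out != nullptr);
  struct stat st;
  if (stat(path.c_str(), &st) != 0) return false;
  struct statvfs vfs;
  if (statvfs(path.c_str(), &vfs) != 0) return false;
  out->stat_blksize = static_cast<uint64_t>(st.st_blksize);
  out->statvfs_bsize = static_cast<uint64_t>(vfs.f_bsize);
  return true;
}
```

On the system described in this issue, calling QueryBlockSizes() on the data directory would be expected to report 65536 and 4096 respectively; on many other systems the two values are equal.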
 

*1. The test environment*

Linux hybrid01 4.19.90-23.30.v2101.ky10.aarch64 #1 SMP Thu Dec 15 09:57:55 CST 
2022 aarch64 aarch64 aarch64 GNU/Linux. A Docker container runs on this host.

*2. Create a file with an encryption header*

 
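{code:java}
// Mark the file as sensitive so that it is encrypted when encryption is enabled.
const string kFile = JoinPathSegments(test_dir_, "encrypted_file");
unique_ptr<RWFile> rw;
RWFileOptions opts;
opts.is_sensitive = true;
ASSERT_OK(env_->NewRWFile(opts, kFile, &rw));
// Query the on-disk size of the newly created file and check the status.
uint64_t file_size = 0;
ASSERT_OK(env_->GetFileSizeOnDisk(kFile, &file_size)); {code}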
*3. stat the file*

 

The IO Block size is 65536, which means st_blksize is 65536; the file's 
logical size is 64 bytes.

!image-2023-11-06-15-42-46-082.png!

*4. The filesystem block size is 4096 bytes*

!image-2023-11-06-15-45-39-233.png!

*5. The file's on-disk size is 4096 bytes*

!image-2023-11-06-15-52-41-834.png!
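
Steps 3 and 5 can also be approximated outside Kudu with plain POSIX calls. The sketch below writes a small file and reports its logical size, its on-disk size, and st_blksize. The names are hypothetical, and the on-disk size is computed as st_blocks * 512, which assumes the 512-byte units that Linux uses for st_blocks.

```cpp
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper type, not part of Kudu.
struct FileSizes {
  uint64_t logical_bytes;   // st_size
  uint64_t on_disk_bytes;   // st_blocks * 512 (assumes 512-byte units, as on Linux)
  uint64_t io_block_bytes;  // st_blksize
};

// Creates (or truncates) 'path', writes 'num_bytes' bytes, syncs the file,
// and fills 'out' from fstat(). Returns false if any step fails.
bool WriteAndStat(const std::string& path, size_t num_bytes, FileSizes* out) {
  assert(out != nullptr);
  int fd = open(path.c_str(), O_CREAT | O_TRUNC | O_WRONLY, 0644);
  if (fd < 0) return false;
  const std::string payload(num_bytes, 'x');
  bool ok = write(fd, payload.data(), payload.size()) ==
            static_cast<ssize_t>(payload.size());
  ok = ok && fsync(fd) == 0;
  struct stat st;
  ok = ok && fstat(fd, &st) == 0;
  close(fd);
  if (!ok) return false;
  out->logical_bytes = static_cast<uint64_t>(st.st_size);
  out->on_disk_bytes = static_cast<uint64_t>(st.st_blocks) * 512;
  out->io_block_bytes = static_cast<uint64_t>(st.st_blksize);
  return true;
}
```

Calling WriteAndStat() with 64 bytes mirrors the 64-byte encrypted file above. On a system affected by this issue, on_disk_bytes would be expected to be smaller than io_block_bytes, although the exact on-disk value depends on the filesystem.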

  was:
On my aarch64 system, st_blksize is not equal to the real filesystem block 
size. The st_blksize on my system is 65536 bytes, but the block size of the 
filesystem is 4096 bytes. When data smaller than 4096 bytes is written, the 
file's on-disk size is 4096 bytes, not 65536 bytes. However, Kudu uses 
st_blksize to determine the filesystem block size, which is not always 
correct.

 

*1. The test environment*

Linux hybrid01 4.19.90-23.30.v2101.ky10.aarch64 #1 SMP Thu Dec 15 09:57:55 CST 
2022 aarch64 aarch64 aarch64 GNU/Linux. A Docker container runs on this host.

*2. Create a file with an encryption header*

 
{code:java}
const string kFile = JoinPathSegments(test_dir_, "encrypted_file");  
unique_ptr<RWFile> rw;  
RWFileOptions opts;  
opts.is_sensitive = true;  
ASSERT_OK(env_->NewRWFile(opts, kFile, &rw));  
uint64_t file_size = 0;  
env_->GetFileSizeOnDisk(kFile, &file_size); {code}
*3. stat the file*

 

The IO Block size is 65536, which means st_blksize is 65536; the file's 
logical size is 64 bytes.

!image-2023-11-06-15-42-46-082.png!

*4. The filesystem block size is 4096 bytes*

!image-2023-11-06-15-45-39-233.png!

*5. The file's on-disk size is 4096 bytes*

!image-2023-11-06-15-52-41-834.png!


> st_blksize is not always equal to the filesystem block size
> -----------------------------------------------------------
>
>                 Key: KUDU-3523
>                 URL: https://issues.apache.org/jira/browse/KUDU-3523
>             Project: Kudu
>          Issue Type: Bug
>            Reporter: Xixu Wang
>            Priority: Major
>         Attachments: image-2023-11-06-15-42-46-082.png, 
> image-2023-11-06-15-45-11-819.png, image-2023-11-06-15-45-39-233.png, 
> image-2023-11-06-15-52-41-834.png, image-2023-11-08-14-35-08-189.png



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
