On 11/19/25 18:28, [email protected] wrote:
The mkemmc.sh script calculates file sizes via `wc -c'. `wc'
normally works by reading the entire file, resulting in O(n) performance.
Even something as mundane as 'wc' can surprise you! Running "strace wc
-c < somefile.txt" shows this:
fstat(0, {st_mode=S_IFREG|0644, st_size=6900, ...}) = 0
lseek(0, 0, SEEK_CUR) = 0
lseek(0, 6900, SEEK_CUR) = 6900
So wc notices you don't need word or line counts and takes a shortcut:
it gets the size from fstat() and seeks to the end instead of reading
the whole file.
Paolo
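The shortcut can also be checked without strace. The following is a minimal sketch, assuming a file system that supports sparse files and a wc that behaves like the one traced above. The scratch file and the variable names are only illustrative. It creates a 1 GiB sparse file and asks wc -c for its size, which would take noticeably longer if wc read every byte.

```shell
# Illustrative check, not part of the patch.
tmp=$(mktemp)                                   # hypothetical scratch file

# Reading nothing and seeking 1 GiB into the output leaves a sparse
# file of exactly that length.
dd if=/dev/null of="$tmp" bs=1 seek=1073741824 2>/dev/null

size=$(wc -c < "$tmp")
size=$((size))                                  # strip padding some wc versions emit
echo "$size"

rm -f "$tmp"
```

Prefixing the wc line with "time" gives a rough impression of whether the data is actually being read.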
Unix file systems obviously know a file's size and POSIX `ls' reports this
information unambiguously, so replacing `wc' with `ls' ensures O(1)
performance. The files in question tend to be large, making this change
worthwhile.
Signed-off-by: Konrad Schwarz <[email protected]>
---
scripts/mkemmc.sh | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/scripts/mkemmc.sh b/scripts/mkemmc.sh
index 45dc3f08fa..d2c4e84b16 100755
--- a/scripts/mkemmc.sh
+++ b/scripts/mkemmc.sh
@@ -37,13 +37,19 @@ usage() {
exit "$1"
}
+file_size() {
+ ls_line=$(ls -Hdog "$1") || return
+ printf %s\\n "$ls_line" | cut -d\ -f3
+ unset ls_line
+}
+
process_size() {
name=$1
image_file=$2
alignment=$3
image_arg=$4
if [ "${image_arg#*:}" = "$image_arg" ]; then
- if ! size=$(wc -c < "$image_file" 2>/dev/null); then
+ if ! size=$(file_size "$image_file"); then
echo "Missing $name image '$image_file'." >&2
exit 1
fi
@@ -105,7 +111,7 @@ check_truncation() {
if [ "$image_file" = "/dev/zero" ]; then
return
fi
- if ! actual_size=$(wc -c < "$image_file" 2>/dev/null); then
+ if ! actual_size=$(file_size "$image_file"); then
echo "Missing image '$image_file'." >&2
exit 1
fi
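
To compare the two approaches outside the script, the helper from the patch can be run against a file of known size. This is only a sketch: the scratch file and the from_ls/from_wc names are hypothetical. It also assumes an ls whose -o and -g options drop the owner and group columns and which separates the fields of a single-file listing with single spaces, as GNU ls does.

```shell
# file_size as in the patch: print the size field of one ls -l style line.
file_size() {
    ls_line=$(ls -Hdog "$1") || return
    # With -o and -g the owner and group columns are omitted,
    # so the fields are: mode, link count, size, ...
    printf '%s\n' "$ls_line" | cut -d' ' -f3
    unset ls_line
}

tmp=$(mktemp)                       # hypothetical scratch file
printf 'hello world\n' > "$tmp"     # 12 bytes

from_ls=$(file_size "$tmp")
from_wc=$(( $(wc -c < "$tmp") ))    # arithmetic expansion strips any padding
echo "ls: $from_ls wc: $from_wc"

rm -f "$tmp"
```

If the two values differ on a given platform, the field position or the separator assumption is the first thing to check.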