On 2020-02-06 20:14, Matt Graham wrote:

2 930G disks here, in softRAID, with LVM.  630G ext4 filesystem on
/home in an LV.  1,090,498 files and dirs on that filesystem, file
sizes all over the map but more files in the M and K range than in the
G range.  No problems with filesystem speed ever.  Is this RAID in
hardware or software?  Are you running baloo or some kind of file
indexing/searching service?

No indexing services. It's just file storage.



I ran time { rsync -av /home/myuser/.cache/
remote:/backup/dir/.cache/; } and cancelled it after 75 minutes
because it still wasn't finished. There are 46k files in that folder,
roughly 2GB in total. Note that this is running over an NFS link, just
FYI.

This seems off.  What was the exact command you ran?  I don't think
rsync's remote:/path syntax goes over NFS; that form uses ssh or an
rsync daemon.  What options is your NFS share mounted with?  What was
rsync displaying after 75 minutes?  Are the clocks on both machines in
sync?  46,000 files should take a minute or so unless it had to
transfer every file in full.  For comparison, doing "rsync -av
/home/me/ /mnt/backup1/me/" on that 630G filesystem with 1,090,498
files, from ext4 to a USB2 backup disk, took 6 minutes 3 seconds of
wall-clock time.  I don't have anything using NFS at the moment so I
can't check that.  However, rsyncing a 15G dir with 150,000 files to
my ext4 filesystem on RAID over ssh took ~2 minutes.
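
If you re-run it, something along these lines (paths copied from your
example; the NFS mount point is a placeholder) would show where the
time goes, since "--stats" prints how many files and how much literal
data actually went over the wire:

  # timed copy to the NFS-mounted backup path (mount point is a guess)
  time rsync -av --stats /home/myuser/.cache/ /mnt/nfs/backup/.cache/
  # same tree pushed over ssh instead, for comparison
  time rsync -av --stats /home/myuser/.cache/ remote:/backup/dir/.cache/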


The volume on the server is mounted with noatime,nodiratime,noexec
The export on the server is (rw,sync,no_root_squash,no_subtree_check)
The mounted export on my system is mounted with nfsvers=4,proto=tcp,tcp,intr,nolock,sec=sys,noexec,fsc,noatime,nodiratime,x-systemd.idle-timeout=1min
and `mount` shows the actual options as:
(rw,noexec,noatime,nodiratime,vers=4.2,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=192.168.238.100,fsc,local_lock=none,addr=192.168.238.1)
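
For reference, that should correspond to roughly this export and fstab
entry (the server name, paths, and the /24 below are guesses on my
part, and I've dropped the duplicate "tcp"):

  # /etc/exports on the server (export path is a placeholder)
  /export/home  192.168.238.0/24(rw,sync,no_root_squash,no_subtree_check)
  # /etc/fstab on the client (server name and mount point are placeholders)
  server:/export/home  /mnt/home  nfs  nfsvers=4,proto=tcp,intr,nolock,sec=sys,noexec,fsc,noatime,nodiratime,x-systemd.idle-timeout=1min  0  0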


Both machines have their time synced.



So I created a 4GB tmpfs, mounted it where I needed it, and ran my
timed backup again; it took 2 minutes and 6 seconds. Obviously my
network is not the issue.
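
(For the curious, the tmpfs was just something along these lines; the
mount point is whatever was handy:)

  mount -t tmpfs -o size=4G tmpfs /mnt/ramtest   # 4GB RAM-backed scratch fs
  # run the timed rsync against /mnt/ramtest, then:
  umount /mnt/ramtest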

Doesn't tmpfs keep everything in RAM?  Also, how full is the ext4
destination?  My filesystem is only 42% used, which probably avoids
any fragmentation problems.  How is the ext4 filesystem mounted?  The
thing that helps the most is "noatime", but that's really only a small
win.
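
Quick ways to check those (the device and mount point below are
placeholders for whatever the destination filesystem actually is):

  df -h /home                      # how full the destination filesystem is
  mount | grep ' /home '           # effective mount options; look for noatime
  mount -o remount,noatime /home   # add noatime on the fly if it's missing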


Yes, it does, but I figured this would be a good way to rule out the HDD as a potential bottleneck. The first run took two minutes; the second run was nearly instant. So I ran it again, but instead of pushing the files to the same directory I pushed them to a new directory, and it was just as fast.
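
(The second run being nearly instant is presumably the page cache; to
take that out of the picture you can drop the caches between runs,
roughly:)

  sync
  echo 3 > /proc/sys/vm/drop_caches   # as root: drop page cache, dentries, inodes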

Reading files from the server to my desktop is as fast as I would expect from a gigabit network. Writing large files to the network storage also works as I expect.
It's just all of these small files that are killing me.

The hard drives in the server are Seagate 7200RPM drives. The processor does not support AES-NI, and even though the filesystem is on LUKS2, that doesn't seem to be much of a penalty.
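
(If anyone wants to double-check the crypto overhead, cryptsetup has a
built-in throughput benchmark; the device name below is a placeholder:)

  cryptsetup benchmark                             # per-cipher MB/s on this CPU
  cryptsetup luksDump /dev/sdX2 | grep -i cipher   # which cipher the volume uses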



Is there a program that watches and optimizes the placement of files on a
hard drive? I know these exist for Windows, but what about Linux?

Can you umount and e2fsck -p the filesystem?  That should at least
tell you how fragmented the thing is.  The "filefrag" utility will
tell you how fragmented an individual file is.  I don't see anything
about defragging tools for ext2/3/4 in portage/sys-fs/, but that may
be just me.  There is http://vleu.net/shake/ but it's a bit old and
I'm not sure whether it would help you.  Finally, make sure you don't
see anything about disk errors in the output from dmesg on the machine
with the RAID, and check that "cat /proc/mdstat" shows UU for that
md device.
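
Roughly, assuming the LV is /dev/mapper/vg-home (substitute your
actual device; I'm using "e2fsck -fn" rather than -p so it only
reports and never writes, even on a clean filesystem):

  filefrag -v /home/myuser/some-big-file   # extents for one file, while mounted
  umount /home
  e2fsck -fn /dev/mapper/vg-home           # summary line gives "% non-contiguous"
  dmesg | grep -iE 'ata|i/o error'         # any disk errors?
  cat /proc/mdstat                         # the md line should show [UU]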

Defragging an ext4 filesystem is done with e4defrag.
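
For example (the path is a placeholder, and the filesystem stays
mounted while it runs):

  e4defrag -c /home/myuser   # -c: only report the fragmentation score
  e4defrag /home/myuser      # actually defragment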


---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.phxlinux.org
To subscribe, unsubscribe, or to change your mail settings:
https://lists.phxlinux.org/mailman/listinfo/plug-discuss
