thx again for the tip, but disconnecting the peer (i.e. WFConnection mode) was the first thing i did.

i'm currently deleting with
find subdir/ -type f | while IFS= read -r LINE ; do rm -vf -- "$LINE" && sleep 0.03; done
that delay seems to be enough to keep the device from blocking I/O access, so at least the machine is online again. deleting this way, though, will most likely take until the end of next week.
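
A possible variation on the loop above, as a rough sketch only: delete in batches and pause once per batch instead of once per file. The function name throttled_rm, the batch size of 500 and the 1-second pause are assumptions for illustration, not values measured on this system; whether batches of that size would stall the device again would have to be tested.

```shell
#!/bin/sh
# throttled_rm DIR [BATCH] [PAUSE]
# Hypothetical helper: removes regular files under DIR in batches of BATCH
# files and sleeps PAUSE seconds after each batch so other I/O can get through.
throttled_rm() {
    dir=$1
    batch=${2:-500}   # assumed default, tune for the device
    pause=${3:-1}     # assumed default, in seconds
    # -print0 / -0 keep filenames containing spaces or newlines intact.
    # Inside the sh -c snippet, $0 is the pause and "$@" is one batch of files.
    find "$dir" -type f -print0 |
        xargs -0 -n "$batch" sh -c 'rm -f -- "$@"; sleep "$0"' "$pause"
}
```

Usage would be something like `throttled_rm subdir/ 500 1`. Directories are left in place and can be removed afterwards once they are empty.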

enjoy your weekend,

joe

On 28.01.2011 21:59, Moti Levy wrote:
All I can think of is that DRBD is trying to catch up and causes the
delays.
Maybe take one of the nodes offline and try to delete without "real time
replication" ?

Moti


On Fri, Jan 28, 2011 at 2:44 PM, Joseph Hauptmann <[email protected]> wrote:

  Yes, I did try that. Doesn't make much of a (speed) difference.

It seems the problem is not so much that rm gets stuck for good, but that it
takes really long breaks (about 20 sec.) while deleting. During those
breaks the whole partition is stuck and iostat reports 100% utilization,
compared to ~95% while actually deleting files. Could the "hang time" be
DRBD writing its meta-data (internal in my case) and blocking every other
access as long as the meta-data isn't written to disk? Of course there is
also the ext3 journal that has to be written, but I still don't see why it
should take that long: I'm currently timing how long it takes to delete a
subdir with 285868 block-sized files in it (already more than 30 min).


dmesg is clear, so it does not seem to be a SATA reset.

any other ideas?





On 2011-01-28 20:02, Moti Levy wrote:

Have you tried :
find dirname -type f -exec rm {} \;
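
For comparison, two variants of the same idea that avoid starting one rm process per file. This is only a sketch: the function names are made up for illustration, and -delete is an extension offered by GNU and BSD find rather than POSIX. If the stall is caused by disk I/O and not by process start-up, neither variant is likely to change much.

```shell
#!/bin/sh
# delete_files DIR (hypothetical name)
# find unlinks each matching file itself, no rm process is started.
delete_files() {
    find "$1" -type f -delete
}

# delete_files_batched DIR (hypothetical name)
# "{} +" makes find pass many paths to each rm invocation.
delete_files_batched() {
    find "$1" -type f -exec rm -f -- {} +
}
```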


  On Fri, Jan 28, 2011 at 1:46 PM, Joseph Hauptmann <[email protected]> wrote:
Hello DRBD-users worldwide...

I've been using DRBD almost a year now, until now without problems that I
couldn't resolve myself.
But now I ran into quite a serious problem and I'm interested if someone
else experienced something similar with or without DRBD (as of course I
can't really be sure that DRBD is the problem):

A few months ago a colleague of mine forgot to activate a cronjob that
deletes a couple thousand very small temporary files each night on a
DRBD device. Now I have a directory with, I guess, more than a million files,
which wouldn't be so bad if rm -rf {dir}/ could delete it. But sadly that
is not the case.
rm gets stuck after it has deleted a few hundred files and doesn't resume
operation. Furthermore, all I/O access on the DRBD device is completely
stuck until the rm process is killed.

I've already disconnected all resources from their peers and shut down most
of the non-essential services on the machine.

It's running Debian Lenny with

uname -a
Linux srv1.xxx.at 2.6.26-2-openvz-amd64 #1 SMP Wed May 12 18:14:56 UTC 2010 x86_64 GNU/Linux

cat /proc/drbd
version: 8.3.7 (api:88/proto:86-91)
GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-03-28 21:47:13
  0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r----
    ns:1875795496 nr:0 dw:225995436 dr:566154981 al:105639961 bm:11019801 lo:2 pe:0 ua:0 ap:1 ep:1 wo:b oos:1242040
  1: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:31796784 dw:31796784 dr:2253416 al:0 bm:1134 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
  2: cs:StandAlone ro:Secondary/Unknown ds:UpToDate/DUnknown   r----
    ns:0 nr:57709884 dw:143774088 dr:8480 al:0 bm:50 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0

The filesystem on resource 0 is ext3  with a block size of 4096 and lies
on a SW-RAID5 (far from ideal - I know).


At the moment I'm using a bash hack that kills the rm process every 30 seconds
and restarts it as long as the directory still exists.
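
A minimal sketch of what such a kill-and-restart loop could look like. This is a reconstruction for illustration, not the script actually in use: the function name rm_in_bursts is made up, and it assumes the timeout utility from a newer GNU coreutils is installed, which may not be the case on an older system such as Lenny.

```shell
#!/bin/sh
# rm_in_bursts DIR [SECONDS]
# Hypothetical reconstruction: run rm -rf on DIR, stop it after SECONDS,
# and start it again until DIR no longer exists.
# Note: this loops forever if rm can never remove DIR (e.g. permissions).
rm_in_bursts() {
    dir=$1
    burst=${2:-30}   # 30 seconds, as described above
    while [ -d "$dir" ]; do
        # timeout sends SIGTERM to rm after $burst seconds. A non-zero exit
        # (rm stopped early or failed) is ignored and the loop tries again.
        timeout "$burst" rm -rf -- "$dir" || true
    done
}
```

It would be called as `rm_in_bursts /path/to/dir 30`, where the path is a placeholder.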

Thanks for any hints as to what might cause this problem.

Joe

--
Joseph Hauptmann

/digiconcept/ - GmbH.
1080 Wien
Blindengasse 52/1

Tel. +43 1 218 0 212 - 24
Fax +43 1 218 0 212 - 10

_______________________________________________
drbd-user mailing list
[email protected]
http://lists.linbit.com/mailman/listinfo/drbd-user


