Re: [SLUG] correct form of find - exec rm on RH5.2?
This one time, at band camp, Matthew Hannigan wrote:
>On Wed, Apr 12, 2006 at 06:19:21PM +1000, Jamie Wilkinson wrote:
>> This one time, at band camp, Terry Collins wrote:
>> >Can anyone tell me what is the correct form of
>> >
>> >find smtpd* -atime 7 -exec ` rm -f {} ` \;
>> >on a RH5.2 system?
>>
>> You can only specify one directory to look in, so smtpd* isn't going to
>> work.
>
>not true -- do as many as you like

Ok, woops!

>> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;
>>
>> or
>>
>> find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f
>>
>> I only just learnt -wholename so I may have gotten the syntax wrong. Buyer
>> beware! :-)
>
>I suspect older find's don't have -wholename.

"buyer beware".

--
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html
Re: [SLUG] correct form of find - exec rm on RH5.2?
On Thu, 2006-04-13 at 11:27 +1000, Peter Rundle wrote:
> Incidently the last time the -exec vs xargs came up the -exec script
> given worked, the xargs didn't. Then when this was pointed out the reply
> was "well it was only an example" I.E a polite way of saying go RTFM.
> For my money, I'll take the slower simpler but working example every time.

Maybe it's just me, but I find piping to xargs simpler than grappling with
quoting shenanigans on a find -exec command line. But then, I was using
xargs long before I even found out that find had an -exec.

Horses for courses, I suppose.

--
Pete
Re: [SLUG] correct form of find - exec rm on RH5.2?
On Wed, Apr 12, 2006 at 09:05:08PM -0400, Bruce Woodward wrote:
> For what it's worth;
>
> $ i=0
> $ while ((i<1000)); do touch $i; ((i++)); done
> $ time find . -type f -exec rm -f {} \;

A more realistic test would be to use non-zero files in many subdirectories.
Re: [SLUG] correct form of find - exec rm on RH5.2?
On Thu, Apr 13, 2006 at 09:09:47AM +1000, Peter Rundle wrote:
> Matthew Hannigan wrote:
> >Yeah, not a bad call - it follows the 'correct first, fast later
> >/ premature optimization is the root of all evil' principle.
>
> Obviously xargs is a "better" solution based on the unix concept of only
> doing one thing and doing it well. xargs adds the capability to any data
> stream where as -exec is tied to find. I just reckon that people are
> just a bit quick to diss -exec. Consider these two cmds;
>
> find . -exec echo {} \;
>
> find . -print0 | xargs -0 echo
>
> Don't produce the same result.
>
> >Still, it can be a lot slower to use -exec.
>
> Maybe but given that rm is the command in this case, the time to
> remove the files from disk is orders of magnitude greater than the time
> to load and execute the program.

Lots of good points.

> Also what happens if find returns a million file names? xargs can't put
> them all on a single command line else the dreaded "too many args" error
> will occur. So xargs must have some smarts in it to call the command
> multiple times, passing it 1,2,...N args at a time? Speculation on my
> part, I don't know this for a fact but man xargs says;
>
>     If any invocation of the command exits with a status of 255,
>     xargs will stop immediately without reading any further input.
>
> So it obviously breaks the args up into chunks and invokes the command
> multiple times in any case.

Indeed it does.

> The stopping on error could be a "good thing" or a "bad thing" depending
> on what you are trying to do. In the original example the find statement
> will pick up directory names, hence passing it to "xargs rm -f" should
> cause an error straight up as the first name found will be a directory.
> Hence the find with xargs method won't clear the smtpd logs which was
> Terry's intention. (Or will it as what error code does a failed rm
> return?) On the other hand, -exec will continue on blindly executing the
> given command for each filename passed to it, which may or may not be "a
> good thing" depending on circumstance.

Yep, and another thing. Since xargs might run the command with just one
argument, the command may behave differently. E.g. grep, when given one
file, will not print the filename as part of the match, but will when
given many files.

So

    find . -print0 | xargs -0 grep PATTERN | dosomething

might misbehave very occasionally, depending on what exactly dosomething
does.

The fix is to use -H with grep (non-portable) or just add a file that is
certain not to match after the pattern, so xargs runs grep with at least
2 files each time:

    find . -print0 | xargs -0 grep PATTERN /dev/null | dosomething

Of course using -exec avoids the problem entirely.

Matt
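The grep behaviour Matt describes is easy to check in a scratch directory. This is only a sketch, assuming a shell with mktemp available; the file names and the pattern "spam" are made up for illustration.

```shell
# Scratch directory with two hypothetical log files.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'spam\n' > a.log
printf 'spam\n' > b.log

# One file argument: grep prints only the matching line.
one=$(grep spam a.log)

# Two file arguments: grep prefixes each match with the file name.
two=$(grep spam a.log b.log)

# Adding /dev/null forces the multi-file format even for one real file.
padded=$(grep spam a.log /dev/null)

printf 'one file:       %s\n' "$one"
printf 'two files:      %s\n' "$two"
printf 'with /dev/null: %s\n' "$padded"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Anything downstream that splits on "file:line" would see the first form only when xargs happens to hand grep a single leftover file, which is why the bug shows up so rarely.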
Re: [SLUG] correct form of find - exec rm on RH5.2?
Bruce Woodward wrote:
> Forking and exec'ing isn't effortless.

Never said it was, but why don't you try it again but this time make the
1000 files say a few hundred megabytes in size instead of zero bytes.

Unix System V file accesses were asynchronous; this could be seen when you
wrote to a floppy. For example

    $ rm /mnt/floppy/*

would return to the $ prompt after only a second or so, but then when you
did

    $ sync; sync; umount /dev/dsk/floppy

you'd be waiting ages whilst the disk was erased. If that is still the case
in Linux, the forking and exec'ing will occur in parallel to the disk
activity and will finish inside that time window, so from a practical point
of view that one second saved isn't worth a whole lot. (And given that I've
now got a godzillion hz processor bored to tears out of its silicon picking
head, who cares!)

Incidentally, the last time the -exec vs xargs came up the -exec script
given worked, the xargs didn't. Then when this was pointed out the reply
was "well it was only an example", i.e. a polite way of saying go RTFM.
For my money, I'll take the slower, simpler but working example every time.

P.
Re: [SLUG] correct form of find - exec rm on RH5.2?
For what it's worth;

$ i=0
$ while ((i<1000)); do touch $i; ((i++)); done
$ time find . -type f -exec rm -f {} \;

real    0m0.949s
user    0m0.221s
sys     0m0.725s

$ i=0
$ while ((i<1000)); do touch $i; ((i++)); done
$ time find . -type f | xargs rm -f

real    0m0.036s
user    0m0.003s
sys     0m0.033s

Forking and exec'ing isn't effortless.

$ time ruby -e '1000.times { system("/bin/date >/dev/null") }'

real    0m2.660s
user    0m0.956s
sys     0m1.676s

$ time ruby -e 'system("/bin/date")'
Thu Apr 13 11:02:36 EDT 2006

real    0m0.011s
user    0m0.008s
sys     0m0.003s

-b.

On 4/12/06, Jeff Waugh wrote:
> > Maybe but given that rm is the command in this case, the time to
> > remove the files from disk is orders of magnitude greater than the
> > time to load and execute the program.
>
> You'd be surprised - benchmark it. (Hint: You're adding the overhead of
> bringing up rm for *every* *single* file. Doesn't matter that the time to
> remove files from disk takes longer - you're adding time to every single
> cycle.)
>
> > Also what happens if find returns a million file names? xargs can't put
> > them all on a single command line else the dreaded "too many args" error
> > will occur. So xargs must have some smarts in it to call the command
> > multiple times, passing it 1,2,...N args at a time?
>
> > So it obviously breaks the args up into chunks and invokes the command
> > multiple times in any case.
>
> That's actually the main point of xargs. It just happens to do things
> faster by reducing the number of times it invokes the target command. If
> you want to do the same thing to 2000 files - in the majority of cases,
> you're better off doing it 10 times to 200 files than 2000 times to 1
> file. Of course, you can specify how many files xargs will operate on in
> one invocation too, if there is some kind of arbitrary limit you must
> adhere to.
>
> Also, you would have to write something extremely fiendishly clever to
> win a SLUG Shell Scripting Smackdown that included find -exec.
>
> - Jeff
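Jeff's remark about capping the number of files per invocation refers to the -n option of xargs. A tiny sketch, using five made-up arguments and echo as a stand-in for the real command:

```shell
# Five made-up arguments, at most two per invocation of echo.
# Each output line corresponds to one run of the command.
batched=$(printf '%s\n' 1 2 3 4 5 | xargs -n 2 echo)
printf '%s\n' "$batched"

# Count how many times echo was invoked (one line per invocation).
runs=$(printf '%s\n' "$batched" | wc -l)
echo "echo ran $runs times"
```

Without -n, xargs would fit all five arguments onto a single command line here; it only splits on its own when the command line would otherwise get too long.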
Re: [SLUG] correct form of find - exec rm on RH5.2?
> Maybe but given that rm is the command in this case, the time to
> remove the files from disk is orders of magnitude greater than the time to
> load and execute the program.

You'd be surprised - benchmark it. (Hint: You're adding the overhead of
bringing up rm for *every* *single* file. Doesn't matter that the time to
remove files from disk takes longer - you're adding time to every single
cycle.)

> Also what happens if find returns a million file names? xargs can't put
> them all on a single command line else the dreaded "too many args" error
> will occur. So xargs must have some smarts in it to call the command
> multiple times, passing it 1,2,...N args at a time?

> So it obviously breaks the args up into chunks and invokes the command
> multiple times in any case.

That's actually the main point of xargs. It just happens to do things
faster by reducing the number of times it invokes the target command. If
you want to do the same thing to 2000 files - in the majority of cases,
you're better off doing it 10 times to 200 files than 2000 times to 1 file.
Of course, you can specify how many files xargs will operate on in one
invocation too, if there is some kind of arbitrary limit you must adhere to.

Also, you would have to write something extremely fiendishly clever to win
a SLUG Shell Scripting Smackdown that included find -exec.

- Jeff

--
LinuxWorldExpo: Johannesburg, South Africa http://www.linuxworldexpo.co.za/
"Gah. Out of coffee. Shall think whilst auto-caffeinating." - Telsa Gwynne
Re: [SLUG] correct form of find - exec rm on RH5.2?
Matthew Hannigan wrote:
> Yeah, not a bad call - it follows the 'correct first, fast later
> / premature optimization is the root of all evil' principle.

Obviously xargs is a "better" solution based on the unix concept of only
doing one thing and doing it well. xargs adds the capability to any data
stream, whereas -exec is tied to find. I just reckon that people are a bit
quick to diss -exec. Consider these two cmds;

    find . -exec echo {} \;

    find . -print0 | xargs -0 echo

They don't produce the same result.

> Still, it can be a lot slower to use -exec.

Maybe, but given that rm is the command in this case, the time to remove
the files from disk is orders of magnitude greater than the time to load
and execute the program.

Also what happens if find returns a million file names? xargs can't put
them all on a single command line else the dreaded "too many args" error
will occur. So xargs must have some smarts in it to call the command
multiple times, passing it 1,2,...N args at a time? Speculation on my part,
I don't know this for a fact but man xargs says;

    If any invocation of the command exits with a status of 255,
    xargs will stop immediately without reading any further input.

So it obviously breaks the args up into chunks and invokes the command
multiple times in any case.

The stopping on error could be a "good thing" or a "bad thing" depending on
what you are trying to do. In the original example the find statement will
pick up directory names, hence passing it to "xargs rm -f" should cause an
error straight up as the first name found will be a directory. Hence the
find with xargs method won't clear the smtpd logs, which was Terry's
intention. (Or will it? What error code does a failed rm return?) On the
other hand, -exec will continue on blindly executing the given command for
each filename passed to it, which may or may not be "a good thing"
depending on circumstance.

P.
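Both of Peter's observations can be checked in a scratch directory. A rough sketch, assuming a find and xargs that support -print0 / -0 (GNU and BSD do; older Unices may not) and using made-up file names:

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch a b c

# -exec ... \; runs echo once per path: one output line each,
# for ".", "./a", "./b" and "./c".
exec_lines=$(find . -exec echo {} \; | wc -l)

# xargs packs the paths onto as few command lines as it can,
# so a single echo prints everything on one line here.
xargs_lines=$(find . -print0 | xargs -0 echo | wc -l)

echo "find -exec: $exec_lines lines, xargs: $xargs_lines line(s)"

# What does a failed rm return? Try rm -f on a directory.
mkdir sub
if rm -f sub 2>/dev/null; then rm_status=0; else rm_status=$?; fi
echo "rm -f on a directory exited with status $rm_status"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Since xargs only stops early on an exit status of 255, an ordinary rm failure on a directory does not stop it from processing the remaining names; xargs just reports a non-zero status of its own at the end.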
Re: [SLUG] correct form of find - exec rm on RH5.2?
On Thu, Apr 13, 2006 at 08:09:18AM +1000, Peter Rundle wrote:
> [ .. ]
> But congrats on resisting the politically correct push to use xargs, go
> -exec I reckon. I've yet to see any real advantage in using xargs when
> combined with find just the fact that you have to remember to add -0
> because file names can have space in them these days, what a mess,
> whereas exec rm -f {} works everytime, no dramas.

Yeah, not a bad call - it follows the 'correct first, fast later
/ premature optimization is the root of all evil' principle.

Especially if you want to use the script on Unices that don't have the -0
option to find/xargs.

Still, it can be a lot slower to use -exec.

Matt
Re: [SLUG] correct form of find - exec rm on RH5.2?
Terry Collins wrote:
> Damn, just tested above as
>
> find smtpd* -atime +7 -exec rm -f {} \;
>
> and found no quotes needed.
> head scratch.

But of course quotes are not required. The * is interpreted by the shell,
not by find. Assuming smtpd1, smtpd2, smtpd.log are files in the current
directory, the above would be expanded to

    find smtpd1 smtpd2 smtpd.log -atime +7 -exec rm -f {} \;

by the shell *before* being executed.

If however you want to do

    find . -name 'smtpd*' -print

(i.e. only files in all subdirs that start with smtpd) then you need the
quotes to stop the shell from converting that into

    find . -name smtpd1 smtpd2 smtpd.log -print

BTW your find statement will return directory names as well as file names.
Assuming you don't want to delete the directories, rm -f will still try to
do so and return an un-trapped error. So you should probably add a -type f
to find.

But congrats on resisting the politically correct push to use xargs, go
-exec I reckon. I've yet to see any real advantage in using xargs when
combined with find, just the fact that you have to remember to add -0
because file names can have spaces in them these days. What a mess, whereas
-exec rm -f {} works every time, no dramas.

P.
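The expansion Peter describes can be seen by putting echo in front of the command, so the shell shows what find would receive without running it. A small sketch in a scratch directory, using the three file names from his example:

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch smtpd1 smtpd2 smtpd.log

# Unquoted: the shell replaces the glob with the matching file names
# before the command ever runs (the order depends on the locale).
unquoted=$(echo find smtpd* -atime +7)

# Quoted: the pattern reaches find untouched, for -name to match against.
quoted=$(echo find . -name 'smtpd*' -print)

echo "$unquoted"
echo "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

The first line printed lists the three real file names as starting points; the second still contains the literal smtpd* pattern.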
Re: [SLUG] correct form of find - exec rm on RH5.2?
Jamie Wilkinson wrote:
> This one time, at band camp, Terry Collins wrote:
>
>>Can anyone tell me what is the correct form of
>>
>>find smtpd* -atime 7 -exec ` rm -f {} ` \;
>>on a RH5.2 system?
>
>
> You can only specify one directory to look in, so smtpd* isn't going to
> work.

err, nope

    find smtpd* -atime 7 -print

works just fine. It is the quoting of exec that is the problem (missing
argument or invalid is the continual squawk). Apologies for not being
clearer.

>
> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;

Damn, just tested above as

    find smtpd* -atime +7 -exec rm -f {} \;

and found no quotes needed.
head scratch.

Thanks for the help. Now to write and cron the script to just auto dump
filtered spam after a week.

--
Terry Collins {:-)}}}
email: terryc at woa.com.au www: http://www.woa.com.au
Wombat Outdoor Adventures

"Any society that would give up a little liberty to gain a little
security will deserve neither and lose both." Benjamin Franklin
Re: [SLUG] correct form of find - exec rm on RH5.2?
On Wed, Apr 12, 2006 at 06:19:21PM +1000, Jamie Wilkinson wrote:
> This one time, at band camp, Terry Collins wrote:
> >Can anyone tell me what is the correct form of
> >
> >find smtpd* -atime 7 -exec ` rm -f {} ` \;
> >on a RH5.2 system?
>
> You can only specify one directory to look in, so smtpd* isn't going to
> work.

not true -- do as many as you like

> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;
>
> or
>
> find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f
>
> I only just learnt -wholename so I may have gotten the syntax wrong. Buyer
> beware! :-)

I suspect older find's don't have -wholename.

Here's my literal translation (the backquotes should definitely not be
there)

    find smtpd* -atime 7 -exec rm -f {} \;

It'll only work, of course, if you're in the parent dir of smtpd*.
Meaning the ./ bit is redundant.

Using -atime is a bit odd; you usually want -mtime. atime can be updated
every backup, i.e. every night!

Also rm is only good for files, not dirs.

And add the usual -0 and xargs goodness and you get:

    find smtpd* -type f -mtime 7 -print0 | xargs -0 rm -f

Matt
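Matt's final command can be tried safely in a scratch directory before pointing it at real logs. The sketch below makes a few assumptions: the file names are invented, the timestamps are faked with touch -t, and it uses -mtime +7 ("more than 7 days ago", the form Terry settled on for -atime in his follow-up) where Matt's literal translation has -mtime 7.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical test data: an old directory, an old file and a fresh file.
mkdir smtpd.d
touch -t 200001010000 smtpd.d       # directory with an old timestamp
touch -t 200001010000 smtpd.old     # regular file modified long ago
touch smtpd.new                     # regular file modified just now

# Dry run: list regular files last modified more than 7 days ago.
# -type f keeps the old directory out of the result.
matches=$(find smtpd* -type f -mtime +7 -print)
echo "would remove: $matches"

# The real thing, NUL-delimited so odd file names survive the pipe.
find smtpd* -type f -mtime +7 -print0 | xargs -0 rm -f

# Only the directory and the fresh file should be left.
remaining=$(ls)
echo "left behind:"
echo "$remaining"

# Clean up the scratch directory.
cd / && rm -rf "$dir"
```

Running the -print form first and reading the list before swapping in the rm is a cheap safeguard for a job that will end up in cron.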
Re: [SLUG] correct form of find - exec rm on RH5.2?
This one time, at band camp, Terry Collins wrote:
>Can anyone tell me what is the correct form of
>
>find smtpd* -atime 7 -exec ` rm -f {} ` \;
>on a RH5.2 system?

You can only specify one directory to look in, so smtpd* isn't going to
work.

    find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;

or

    find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f

I only just learnt -wholename so I may have gotten the syntax wrong. Buyer
beware! :-)
[SLUG] correct form of find - exec rm on RH5.2?
Can anyone tell me what is the correct form of

    find smtpd* -atime 7 -exec ` rm -f {} ` \;

on a RH5.2 system?

--
Terry Collins {:-)}}}
email: terryc at woa.com.au www: http://www.woa.com.au
Wombat Outdoor Adventures