Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Jamie Wilkinson
This one time, at band camp, Matthew Hannigan wrote:
>On Wed, Apr 12, 2006 at 06:19:21PM +1000, Jamie Wilkinson wrote:
>> This one time, at band camp, Terry Collins wrote:
>> >Can anyone tell me what is the correct form of
>> >
>> >find smtpd* -atime 7 -exec ` rm -f {} ` \;
>> >on a RH5.2 system?
>> 
>> You can only specify one directory to look in, so smtpd* isn't going to
>> work.
>
>not true -- do as many as you like

Ok, woops!

>> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;
>> 
>> or
>> 
>> find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f
>> 
>> I only just learnt -wholename so I may have gotten the syntax wrong.  Buyer
>> beware! :-)
>
>I suspect older find's don't have -wholename.

"buyer beware".
-- 
SLUG - Sydney Linux User's Group Mailing List - http://slug.org.au/
Subscription info and FAQs: http://slug.org.au/faq/mailinglists.html


Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Peter Hardy
On Thu, 2006-04-13 at 11:27 +1000, Peter Rundle wrote:
> Incidentally, the last time -exec vs xargs came up, the -exec script 
> given worked and the xargs one didn't. When this was pointed out, the reply 
> was "well it was only an example", i.e. a polite way of saying go RTFM.
> For my money, I'll take the slower simpler but working example every time.

Maybe it's just me, but I find piping to xargs simpler than grappling
with quoting shenanigans on a find -exec command line. But then, I was
using xargs long before I even found out that find had an -exec. Horses
for courses, I suppose.

-- 
Pete



Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Matthew Hannigan
On Wed, Apr 12, 2006 at 09:05:08PM -0400, Bruce Woodward wrote:
> For what it's worth;
> 
> [EMAIL PROTECTED] a]$ i=0
> [EMAIL PROTECTED] a]$ while ((i<1000)); do touch $i; ((i++)); done
> [EMAIL PROTECTED] a]$ time find . -type f -exec rm -f {} \;

A more realistic test would be to use non-zero files in 
many subdirectories.


Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Matthew Hannigan
On Thu, Apr 13, 2006 at 09:09:47AM +1000, Peter Rundle wrote:
> Matthew Hannigan wrote:
> >Yeah, not a bad call - it follows the 'correct first, fast later
> >/ premature optimization is the root of all evil' principle.
> 
> Obviously xargs is a "better" solution based on the unix concept of only 
> doing one thing and doing it well. xargs adds the capability to any data 
> stream, whereas -exec is tied to find. I just reckon that people are 
> a bit quick to diss -exec. Consider these two cmds;
> 
>  find . -exec echo {} \;
> 
>  find . -print0 | xargs -0 echo
> 
> Don't produce the same result.
> 
> >Still, it can be a lot slower to use -exec.
> 
> Maybe but given that rm is the command in this case, the time to 
> remove the files from disk is orders of magnitude greater than the time 
> to load and execute the program.

Lots of good points.

> Also what happens if find returns a million file names? xargs can't put 
> them all on a single command line else the dreaded "too many args" error 
> will occur. So xargs must have some smarts in it to call the command 
> multiple times, passing it 1,2,...N args at a time? Speculation on my 
> part, I don't know this for a fact but man xargs says;
> 
>  If any invocation of the command exits with a status of 255,
>  xargs will stop immediately without reading any further input.
> 
> So it obviously breaks the args up into chunks and invokes the command 
> multiple times in any case.

Indeed it does.

> The stopping on error could be a "good thing" or a "bad thing" depending 
> on what you are trying to do. In the original example the find statement 
> will pick up directory names, hence passing it to "xargs rm -f" should 
> cause an error straight up as the first name found will be a directory. 
> Hence the find with xargs method won't clear the smtpd logs which was 
> Terry's intention. (Or will it as what error code does a failed rm 
> return?) On the other hand, -exec will continue on blindly executing the 
> given command for each filename passed to it, which may or may not be "a 
> good thing" depending on circumstance.

Yep, and another thing.  Since xargs might run the command with just one
arg, the command may behave differently.  E.g. grep, when given one file,
will not print the filename as part of the match, but will when given
many files.

So
find . -print0 | xargs -0 grep pattern | dosomething

might misbehave very occasionally, depending on what exactly dosomething does.

The fix is to use -H with grep (non-portable), or just
add a file that is certain not to match after the pattern, so xargs runs
grep with at least 2 files each time:

find . -print0 | xargs -0 grep pattern /dev/null | dosomething

Of course using -exec avoids the problem entirely.
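
A minimal sketch of the single-file case and the /dev/null workaround,
assuming a find and xargs that support -print0 and -0. The scratch
directory, the file name only.txt and the pattern "hello" are made up
for illustration.

```shell
# Scratch directory holding exactly one file (hypothetical names).
tmp=$(mktemp -d)
printf 'hello\n' > "$tmp/only.txt"

# One file argument: grep prints only the matching line.
one=$(find "$tmp" -type f -print0 | xargs -0 grep hello)

# /dev/null added after the pattern: grep always sees at least two
# file arguments, so every match is prefixed with its file name.
two=$(find "$tmp" -type f -print0 | xargs -0 grep hello /dev/null)

echo "one file:       $one"
echo "with /dev/null: $two"
rm -rf "$tmp"
```

Whatever consumes the output downstream then sees the same
"name:match" shape no matter how xargs happened to batch the names.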


Matt







Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Peter Rundle

Bruce Woodward wrote:
> Forking and exec'ing isn't effortless.


Never said it was, but why don't you try it again, this time making the 
1000 files, say, a few hundred megabytes in size instead of zero bytes? 
Unix System V file accesses were asynchronous; this could be seen when 
you wrote to a floppy. For example,


  $ rm /mnt/floppy/*

would return to the $ prompt after only a second or so, but then when 
you did


  $ sync; sync; umount /dev/dsk/floppy

you'd be waiting ages whilst the disk was erased. If that is still the 
case in Linux, the forking and exec'ing will occur in parallel with the 
disk activity and will finish inside that time window, so from a 
practical point of view that one second saved isn't worth a whole lot. 
(And given that I've now got a godzillion Hz processor bored to tears 
out of its silicon-picking head, who cares!)


Incidentally, the last time -exec vs xargs came up, the -exec script 
given worked and the xargs one didn't. When this was pointed out, the reply 
was "well it was only an example", i.e. a polite way of saying go RTFM.

For my money, I'll take the slower simpler but working example every time.


P.









Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Bruce Woodward
For what it's worth;

[EMAIL PROTECTED] a]$ i=0
[EMAIL PROTECTED] a]$ while ((i<1000)); do touch $i; ((i++)); done
[EMAIL PROTECTED] a]$ time find . -type f -exec rm -f {} \;

real    0m0.949s
user    0m0.221s
sys     0m0.725s

[EMAIL PROTECTED] a]$ i=0
[EMAIL PROTECTED] a]$ while ((i<1000)); do touch $i; ((i++)); done
[EMAIL PROTECTED] a]$ time find . -type f | xargs rm -f

real    0m0.036s
user    0m0.003s
sys     0m0.033s

Forking and exec'ing isn't effortless.

[EMAIL PROTECTED] a]$ time ruby -e '1000.times { system("/bin/date >/dev/null") }'

real    0m2.660s
user    0m0.956s
sys     0m1.676s

[EMAIL PROTECTED] a]$ time ruby -e 'system("/bin/date")'
Thu Apr 13 11:02:36 EDT 2006

real    0m0.011s
user    0m0.008s
sys     0m0.003s

-b.

On 4/12/06, Jeff Waugh <[EMAIL PROTECTED]> wrote:
> > Maybe but given that rm is the command in this case, the time to
> > remove the files from disk is orders of magnitude greater than the time to
> > load and execute the program.
>
> You'd be surprised - benchmark it. (Hint: You're adding the overhead of
> bringing up rm for *every* *single* file. Doesn't matter that the time to
> remove files from disk takes longer - you're adding time to every single
> cycle.)
>
> > Also what happens if find returns a million file names? xargs can't put
> > them all on a single command line else the dreaded "too many args" error
> > will occur. So xargs must have some smarts in it to call the command
> > multiple times, passing it 1,2,...N args at a time?
>
> > So it obviously breaks the args up into chunks and invokes the command
> > multiple times in any case.
>
> That's actually the main point of xargs. It just happens to do things faster
> by reducing the number of times it invokes the target command. If you want
> to do the same thing to 2000 files - in the majority of cases, you're better
> off doing it 10 times to 200 files than 2000 times to 1 file. Of course, you
> can specify how many files xargs will operate on in one invocation too, if
> there is some kind of arbitrary limit you must adhere to.
>
> Also, you would have to write something extremely fiendishly clever to win a
> SLUG Shell Scripting Smackdown that included find -exec.
>
> - Jeff

Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Jeff Waugh


> Maybe but given that rm is the command in this case, the time to
> remove the files from disk is orders of magnitude greater than the time to
> load and execute the program.

You'd be surprised - benchmark it. (Hint: You're adding the overhead of
bringing up rm for *every* *single* file. Doesn't matter that the time to
remove files from disk takes longer - you're adding time to every single
cycle.)

> Also what happens if find returns a million file names? xargs can't put
> them all on a single command line else the dreaded "too many args" error
> will occur. So xargs must have some smarts in it to call the command
> multiple times, passing it 1,2,...N args at a time?

> So it obviously breaks the args up into chunks and invokes the command 
> multiple times in any case.

That's actually the main point of xargs. It just happens to do things faster
by reducing the number of times it invokes the target command. If you want
to do the same thing to 2000 files - in the majority of cases, you're better
off doing it 10 times to 200 files than 2000 times to 1 file. Of course, you
can specify how many files xargs will operate on in one invocation too, if
there is some kind of arbitrary limit you must adhere to.
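
A rough sketch of that batching, using the standard xargs -n option to
cap the number of arguments per invocation. echo stands in for the
real command here, and the ten input names are just the numbers 1 to
10, picked for illustration.

```shell
# Each run of echo prints exactly one line, so counting output lines
# counts how many times xargs started the command.
default_runs=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | xargs echo | wc -l | tr -d ' ')

# -n 3 caps every invocation at three arguments: 3 + 3 + 3 + 1.
capped_runs=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 10 | xargs -n 3 echo | wc -l | tr -d ' ')

echo "default: $default_runs run(s); with -n 3: $capped_runs run(s)"
```

Without -n, xargs packs as many names as fit under the system's
argument-length limit, which for ten short names is a single run.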

Also, you would have to write something extremely fiendishly clever to win a
SLUG Shell Scripting Smackdown that included find -exec.

- Jeff

-- 
LinuxWorldExpo: Johannesburg, South Africa  http://www.linuxworldexpo.co.za/
 
"Gah. Out of coffee. Shall think whilst auto-caffeinating." - Telsa
   Gwynne


Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Peter Rundle

Matthew Hannigan wrote:
> Yeah, not a bad call - it follows the 'correct first, fast later
> / premature optimization is the root of all evil' principle.


Obviously xargs is a "better" solution based on the unix concept of only 
doing one thing and doing it well. xargs adds the capability to any data 
stream, whereas -exec is tied to find. I just reckon that people are 
a bit quick to diss -exec. Consider these two cmds;


 find . -exec echo {} \;

 find . -print0 | xargs -0 echo

Don't produce the same result.
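
One way to see the difference, as a small sketch: count the output
lines from each form. The scratch directory and the three file names
are invented for the example, and -print0 / -0 support is assumed.

```shell
tmp=$(mktemp -d)
touch "$tmp/a" "$tmp/b" "$tmp/c"

# -exec ... \; starts echo once per file: one name per output line.
exec_lines=$(find "$tmp" -type f -exec echo {} \; | wc -l | tr -d ' ')

# xargs hands all three names to a single echo: one output line.
xargs_lines=$(find "$tmp" -type f -print0 | xargs -0 echo | wc -l | tr -d ' ')

echo "-exec: $exec_lines line(s); xargs: $xargs_lines line(s)"
rm -rf "$tmp"
```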


> Still, it can be a lot slower to use -exec.


Maybe but given that rm is the command in this case, the time to 
remove the files from disk is orders of magnitude greater than the time 
to load and execute the program.


Also what happens if find returns a million file names? xargs can't put 
them all on a single command line else the dreaded "too many args" error 
will occur. So xargs must have some smarts in it to call the command 
multiple times, passing it 1,2,...N args at a time? Speculation on my 
part, I don't know this for a fact but man xargs says;


 If any invocation of the command exits with a status of 255,
 xargs will stop immediately without reading any further input.

So it obviously breaks the args up into chunks and invokes the command 
multiple times in any case.


The stopping on error could be a "good thing" or a "bad thing" depending 
on what you are trying to do. In the original example the find statement 
will pick up directory names, so passing them to "xargs rm -f" should 
cause an error straight up, as the first name found will be a directory. 
Hence the find with xargs method won't clear the smtpd logs, which was 
Terry's intention. (Or will it? What exit code does a failed rm 
return?) On the other hand, -exec will continue on blindly executing the 
given command for each filename passed to it, which may or may not be "a 
good thing" depending on circumstance.
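
The exit-code question can be checked directly. A sketch, assuming a
find that supports -mindepth and -print0 (GNU and BSD do; an old find
may not). The names adir and afile are hypothetical.

```shell
tmp=$(mktemp -d)
mkdir "$tmp/adir"
touch "$tmp/afile"

# rm -f without -r refuses to remove a directory and exits non-zero,
# but not with the special status 255 that makes xargs stop.
rm_status=0
rm -f "$tmp/adir" 2>/dev/null || rm_status=$?

# So xargs carries on: rm still removes the regular file, and only the
# overall exit status records the failure (GNU xargs documents 123).
xargs_status=0
find "$tmp" -mindepth 1 -print0 | xargs -0 rm -f 2>/dev/null || xargs_status=$?

file_left=no; [ -e "$tmp/afile" ] && file_left=yes
dir_left=no;  [ -d "$tmp/adir" ]  && dir_left=yes

echo "rm status: $rm_status, xargs status: $xargs_status"
echo "file still there: $file_left, directory still there: $dir_left"
rm -rf "$tmp"
```

Adding -type f to find avoids the error altogether, since rm is then
never handed a directory.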


P.




Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Matthew Hannigan
On Thu, Apr 13, 2006 at 08:09:18AM +1000, Peter Rundle wrote:
> [ .. ] 
> But congrats on resisting the politically correct push to use xargs, go 
> -exec I reckon. I've yet to see any real advantage in using xargs when 
> combined with find just the fact that you have to remember to add -0 
> because file names can have space in them these days, what a mess,
> whereas exec rm -f {} works everytime, no dramas.

Yeah, not a bad call - it follows the 'correct first, fast later
/ premature optimization is the root of all evil' principle.

Especially if you want to use the script on Unices that
don't have the -0 option to find/xargs.

Still, it can be a lot slower to use -exec.


Matt



Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Peter Rundle

Terry Collins wrote:
> Damm, just tested above as
>
> find smtpd* -atime +7 -exec rm -f {} \;
>
> and found no quotes needed.
> head scratch.

But of course quotes are not required. The * is interpreted by the 
shell, not by find. Assuming smtpd1, smtpd2 and smtpd.log are files in the 
current directory, the above would be expanded to


 find smtpd1 smtpd2 smtpd.log -atime +7 -exec rm -f {} \;

by the shell *before* being executed. If however you want to do

 find . -name 'smtpd*' -print

(i.e. only files, in any subdirectory, whose names start with smtpd) then 
you need the quotes to stop the shell from converting that into


 find . -name smtpd1 smtpd2 smtpd.log -print

BTW given that your find statement will return directory names as well 
as file names (and assuming you don't want to delete the directories) 
then rm -f will try to do so and return an un-trapped error. So you 
should probably add a -type f to find.
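
A small sketch of the two cases, using the three file names from the
example above plus one made-up file in a hypothetical subdirectory
called sub.

```shell
tmp=$(mktemp -d)
touch "$tmp/smtpd1" "$tmp/smtpd2" "$tmp/smtpd.log"
mkdir "$tmp/sub"
touch "$tmp/sub/smtpd9"

# Unquoted: the shell expands smtpd* before any command runs.
# Count the words it produces in the current directory.
expanded=$(cd "$tmp" && set -- smtpd* && echo "$#")

# Quoted: find receives the pattern itself and applies it at every
# depth, so the file under sub/ is matched as well.
matched=$(cd "$tmp" && find . -name 'smtpd*' -type f | wc -l | tr -d ' ')

echo "shell expanded smtpd* to $expanded names"
echo "find -name 'smtpd*' matched $matched files"
rm -rf "$tmp"
```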


But congrats on resisting the politically correct push to use xargs; go 
-exec, I reckon. I've yet to see any real advantage in using xargs when 
combined with find, just the fact that you have to remember to add -0 
because file names can have spaces in them these days. What a mess,
whereas -exec rm -f {} \; works every time, no dramas.


P.



Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Terry Collins
Jamie Wilkinson wrote:
> This one time, at band camp, Terry Collins wrote:
> 
>>Can anyone tell me what is the correct form of
>>
>>find smtpd* -atime 7 -exec ` rm -f {} ` \;
>>on a RH5.2 system?
> 
> 
> You can only specify one directory to look in, so smtpd* isn't going to
> work.

err, nope

find smtpd* -atime 7 -print  works just fine.
It is the quoting of -exec that is the problem ("missing argument" or
"invalid" is the continual squawk).

Apologies for not being clearer.

> 
> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;

Damn, just tested above as

find smtpd* -atime +7 -exec rm -f {} \;

and found no quotes needed.

head scratch.

Thanks for the help.

Now to write and cron the script to just auto dump filtered spam after a
week.




-- 
   Terry Collins {:-)}}}
   email: terryc at woa.com.au  www: http://www.woa.com.au
   Wombat Outdoor Adventures 

 "Any society that would give up a little liberty to gain a little
  security will deserve neither and lose both." Benjamin Franklin


Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Matthew Hannigan
On Wed, Apr 12, 2006 at 06:19:21PM +1000, Jamie Wilkinson wrote:
> This one time, at band camp, Terry Collins wrote:
> >Can anyone tell me what is the correct form of
> >
> >find smtpd* -atime 7 -exec ` rm -f {} ` \;
> >on a RH5.2 system?
> 
> You can only specify one directory to look in, so smtpd* isn't going to
> work.

not true -- do as many as you like

> find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;
> 
> or
> 
> find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f
> 
> I only just learnt -wholename so I may have gotten the syntax wrong.  Buyer
> beware! :-)

I suspect older find's don't have -wholename.

Here's my literal translation (the backquotes should definitely not be there)

find smtpd* -atime 7 -exec rm -f {} \;

It'll only work, of course, if you're in the parent dir
of smtpd*, meaning the ./ bit is redundant.
Using -atime is a bit odd; you usually want -mtime.
atime can be updated every backup, i.e. every night!
Also, rm is only good for files, not dirs.
Add the usual -0 and xargs goodness and you get:

find smtpd* -type f -mtime 7 -print0 | xargs -0 rm -f
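
Note that the thread uses both 7 and +7 as the age argument, and they
select different files. A sketch of the difference, assuming GNU touch
(for -d with a relative date) and find's usual rule of dropping the
fractional part of the age in days. The file names are made up.

```shell
tmp=$(mktemp -d)
touch -d '12 hours ago'  "$tmp/fresh"   # under a day old
touch -d '180 hours ago' "$tmp/week"    # 7.5 days old, counts as 7
touch -d '250 hours ago' "$tmp/stale"   # over 10 days old

# -mtime 7: only files whose age in whole days is exactly 7.
exact=$(find "$tmp" -type f -mtime 7 | sed 's|.*/||')

# -mtime +7: only files whose age in whole days is more than 7.
older=$(find "$tmp" -type f -mtime +7 | sed 's|.*/||')

echo "-mtime 7  matches: $exact"
echo "-mtime +7 matches: $older"
rm -rf "$tmp"
```

For a weekly clean-out of everything past a week old, +7 is the form
that keeps matching files as they continue to age.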




Matt


Re: [SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Jamie Wilkinson
This one time, at band camp, Terry Collins wrote:
>Can anyone tell me what is the correct form of
>
>find smtpd* -atime 7 -exec ` rm -f {} ` \;
>on a RH5.2 system?

You can only specify one directory to look in, so smtpd* isn't going to
work.

find . -wholename './smtpd*' -atime 7 -exec rm -f {} \;

or

find . -wholename './smtpd*' -atime 7 -print0 | xargs -0 rm -f

I only just learnt -wholename so I may have gotten the syntax wrong.  Buyer
beware! :-)


[SLUG] correct form of find - exec rm on RH5.2?

2006-04-12 Thread Terry Collins
Can anyone tell me what is the correct form of

find smtpd* -atime 7 -exec ` rm -f {} ` \;
on a RH5.2 system?


-- 
   Terry Collins {:-)}}}
   email: terryc at woa.com.au  www: http://www.woa.com.au
   Wombat Outdoor Adventures 

 "Any society that would give up a little liberty to gain a little
  security will deserve neither and lose both." Benjamin Franklin