On 2015-08-02 08:23, Alessandro DE LAURENZIS wrote:
> On Sat 01/08/2015 14:09, Vijay Sankar wrote:
>> alias nof='ls -l . | egrep -c '^-''
>> I have always wondered if there is a better way of doing this.
> 
> In general, I would avoid using a pipe when a native command exists (and
> particularly in this case, where grep string comparison is a slow
> operation); this could probably be more appropriate:

There IS no native command doing what Vijay wants... you introduced a
pipe in your own example, too.

Don't be afraid of pipes!

There isn't necessarily a disadvantage in splitting jobs through a pipe.
For example, it enables the system to better utilize multiple
processors/cores, which may or may not make a difference.


In this case, your example is undoubtedly faster.

*But*, what you did was speed-optimize an interactive process, one with
a human operator in the loop, to run half a second faster in a rather
contrived scenario with over a hundred thousand files in one directory.

In practice, the difference is completely imperceptible for the operator:


----8<--------8<--------8<--------8<--------8<---- (cut)
bl@paddan:~$ cd /usr/share/man/man3      # [1]
bl@paddan:/usr/share/man/man3$ time ls -l . | egrep -c '^-'
4045
    0m0.05s real     0m0.02s user     0m0.03s system
bl@paddan:/usr/share/man/man3$ time find . -maxdepth 1 -type f | wc -l
    4045
    0m0.04s real     0m0.01s user     0m0.02s system
bl@paddan:/usr/share/man/man3$ _
----8<--------8<--------8<--------8<--------8<---- (cut)


This kind of optimization is really not that productive.

There is for sure a good lesson in showing how to do things in different
ways, to broaden one's horizon when it comes to thinking outside the box
(or pipe).


But talking about shaving fractions of a second off an interactive
command in an edge case is just a red herring, in my opinion.
It teaches the wrong lesson.


A much better optimization for this question, in my mind, is this:

Don't use an alias at all! Instead use a shell function, like this:

----8<--------8<--------8<--------8<--------8<---- (cut)
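nof() {
        # Count regular files in the given directory (default: the current
        # one). Quoting keeps directory names with spaces in one piece.
        ls -l "${1:-.}" | egrep -c '^-'
}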
----8<--------8<--------8<--------8<--------8<---- (cut)

(In this case, substituting find is *not* immediately applicable.)


The advantage of this approach is that in the regular case "nof" works
just like in Vijay's original alias, but this has the added
functionality of being able to "nof" any directory with a command line
argument, like this:

----8<--------8<--------8<--------8<--------8<---- (cut)
bl@paddan:/usr/share/man/man3$ nof
4045
bl@paddan:/usr/share/man/man3$ nof /bin
42             <-- (Who knew Douglas Adams was an OpenBSD contributor!)
bl@paddan:/usr/share/man/man3$
----8<--------8<--------8<--------8<--------8<---- (cut)


You can't do the above (as easily) with the find approach, since find
doesn't work without a directory argument. (Yes, of course we can add
code to fix that, but that's not the point here.)
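
For completeness, here is a minimal sketch of what that extra code could
look like. The name nof_find is made up for this example, and it assumes a
find that supports -maxdepth (as the one used earlier in this thread does).
It falls back to "." when no argument is given. Note that a file name
containing a newline would be counted twice, which is usually acceptable
for a quick interactive check.

```sh
# nof_find [dir]: count regular files directly inside dir.
# Hypothetical helper for illustration; not part of the original thread.
nof_find() {
        # ${1:-.} expands to the first argument, or "." if none was given.
        find "${1:-.}" -maxdepth 1 -type f | wc -l
}
```

Called as "nof_find" or "nof_find /bin", it behaves like the ls-based
function, at the cost of one more bit of shell syntax to remember.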


This isn't a SPEED optimization, it is a FUNCTIONALITY optimization.

It is a better way to do the same thing, just what Vijay asked for. :-)


Moral of my story: KISS. Keep It Simple, Stupid.

Put your efforts in the right place.


Regards,

/Benny



[1] I first did this to quickly find out which directory in my machine
was the biggest, to have somewhere to play:

bl@paddan:~$ sudo find / -type d -ls | cut -c48- | grep -v "^   "

The cut and grep business filters out all smaller directories with three
or four digit sizes, giving me a quick overview of the biggest
directories.

This whole operation took me less than a minute, including a couple of
trial-and-error runs to find out the best position for the cut.

I am sure there are much better and more accurate ways of doing this,
still with simple shell commands and pipe chaining, but this is what I
thought of off the top of my head, and it did this one-shot job much
more quickly than if I had sat down to come up with a more accurate or
general solution.
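
One possible variant, still just a pipe chain, is sketched below. It sorts
on the size column instead of cutting at a fixed character position. The
name bigdirs is made up for this example, and the sketch assumes that the
size is the seventh whitespace-separated field of "find -ls" output, which
holds as long as user and group names contain no spaces.

```sh
# bigdirs [start-dir] [count]: list the directories whose own directory
# entry is largest. Hypothetical helper for illustration only.
bigdirs() {
        # Field 7 of "find -ls" is assumed to be the size in bytes.
        # Errors from unreadable directories are discarded.
        find "${1:-.}" -type d -ls 2>/dev/null |
                sort -n -k 7,7 |
                tail -n "${2:-10}"
}
```

Something like "sudo sh" followed by "bigdirs / 5" would then show the five
biggest directories, with the largest on the last line.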

Optimizing your *work* doesn't have to include measuring cpu cycles!


> 
> alias nof='find ./ -type f -maxdepth 1 | wc -l'
> 
> See the difference in runtime in case of a huge file listing (not so
> uncommon...):
> 
> just22@poseidon:[tmp]> time find ./ -type f -maxdepth 1 | wc -l
>   113069
> 
>   real    0m1.732s
>   user    0m0.100s
>   sys     0m1.560s
> 
> 
> just22@poseidon:[tmp]> time ls -l ./ | egrep -c "^-"
> 113069
> 
> real    0m2.238s
> user    0m0.630s
> sys     0m1.550s
> 
> 
> All the best
