Resources are always scarce (limited) and should be used responsibly.
They are always limited, but they are not always scarce. In the case of memory,
as long as you do not exceed the (limited) amount of RAM you have, there is
no penalty. And by using more memory, a program can be made faster.
I pointed you to https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff
for real-world examples, but I can explain it with a basic example too.
Let us say you have a file with one million lines and a program repeatedly
needs to read specific lines, identified by their line numbers. The program
can access the disk every time it needs a line. That strategy uses as little
memory as possible: the memory usage is constant, i.e., it does not depend on
the size of the file. But the program is slow. Storing the whole file in RAM
makes the program several orders of magnitude faster... unless there is not
enough free space in RAM to store the file. Let us take that second case and
imagine that the same line must often be reread, whereas most of the lines
are never read. The program can implement a cache: it keeps in RAM the last
lines that were accessed, so that rereading a recently read line is fast (no
disk access). The cache uses some memory to speed up the program. As long as
the size of the cache does not exceed the amount of available RAM, the larger
the cache, the faster the program.
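In Python, such a cache is a few lines. A minimal sketch (the file name and
cache size are illustrative; a real program would tune `maxsize` to the
available RAM):

```python
from functools import lru_cache

# Hypothetical file path. The cache size is the time/space knob:
# more entries means more RAM used and fewer disk accesses.
FILENAME = "million_lines.txt"

@lru_cache(maxsize=10_000)  # keep the 10,000 most recently read lines in RAM
def read_line(line_number):
    """Return the line at the given 1-based number, hitting the disk on a miss."""
    with open(FILENAME) as f:
        for current, line in enumerate(f, start=1):
            if current == line_number:
                return line.rstrip("\n")
    raise IndexError(line_number)
```

Rereading a recently read line then costs a dictionary lookup instead of a
disk access; `read_line.cache_info()` reports the hit/miss counts.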
Going on with our one-million-line file to give another example of a
trade-off between time and space: let us imagine 95% of the lines are
actually the same, a default "line". If there is enough free memory, the
fastest implementation strategy remains to have an array of size one million,
so that the program can access any line in constant time. If there is not
enough free memory, the program can store only the pairs (line number, line)
where "line" is not the default one, i.e., 5% of the lines. After ordering
those pairs by line number, a binary search returns any line in a time that
grows logarithmically with the number of non-default lines (if the line
number is not stored, the default line, stored once, is returned). That
strategy is slower (logarithmic time vs. constant time) than the one using
an array of size one million, if there is enough free space to store such
an array.
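That second strategy is a sketch away too, using the standard `bisect`
module for the binary search (the sample pairs and the default line are made
up for illustration):

```python
import bisect

DEFAULT_LINE = "default"  # the value shared by ~95% of the lines

# Only the non-default lines are stored, as (line number, line) pairs
# sorted by line number -- roughly 5% of the memory of a full array.
exceptions = [(7, "seven"), (42, "forty-two"), (100, "hundred")]
numbers = [n for n, _ in exceptions]  # parallel key list for bisect

def get_line(line_number):
    """Return the line, in time logarithmic in the number of non-default lines."""
    i = bisect.bisect_left(numbers, line_number)
    if i < len(numbers) and numbers[i] == line_number:
        return exceptions[i][1]
    return DEFAULT_LINE  # the default line is stored only once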
In the end, the fastest implementation is the one that uses the most space
while remaining below the amount of available RAM. That is the strategy you
want, not the one that uses as little memory as possible.
You need free RAM for handling new processes and peak loads.
That is correct, and it is an important point for server systems. On
desktop systems, processes do not pop up from nowhere and you usually do not
want to do many things at the same time.
Python is an interpreted language and you don't know how the interpreter
handles the data internally. A valid test would be near the hardware level,
perhaps in assembler.
You certainly run far more Python than assembler. So, for a real-life
comparison, Python makes more sense than assembler. The same holds for
programs that take into consideration the specifics of your hardware:
you run far more generic code (unless we are talking about supercomputing).
I was talking about the algorithmic memory fragmentation which results in
extra CPU cycles.
No, it does not, because it is *random-access* memory. As I wrote in my
other post, RAM fragmentation is only a problem when you run short of RAM:
there are free blocks but, because they are not contiguous, the kernel
cannot allocate them to store a large "object" (for example a large array).
It therefore has to swap.
Running a browser like Firefox on 512MB resulted in swapping. If you assume
that software should be bloated and incompatible with older hardware, you
will never create a lightweight program.
There can be space-efficient (and probably time-inefficient, given the
trade-offs as in my examples above) programs for systems with little RAM,
where the faster (but more space-consuming) program would lead to swapping.
For systems with enough RAM, the space-consuming programs are faster and that
is what users want: a fast program. It makes sense that programmers consider
what most users have, several GB of RAM nowadays, when designing their
programs to be as fast as possible.
Your tests show that 'dd' is faster with a cache. That is what onpon4 and I
keep telling you: by storing more data, you can get a faster program.
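You can reproduce the effect on any Linux box. A minimal sketch (the file
path and size are arbitrary; assuming /tmp has ~50 MB free; 'dd' prints the
throughput of each read on stderr):

```shell
# Create a 50 MB test file.
dd if=/dev/zero of=/tmp/cache_demo bs=1M count=50 2>/dev/null

# First read: the data may have to come from the disk.
dd if=/tmp/cache_demo of=/dev/null bs=1M

# Second read: the kernel serves the data from the page cache in RAM,
# which is typically several times faster.
dd if=/tmp/cache_demo of=/dev/null bs=1M

rm /tmp/cache_demo
```

The page cache is exactly the kind of "wasted" RAM we are talking about:
memory spent to make repeated accesses fast.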