[Trisquel-users] Re : Lightweight Browser
Thank you for the discussion. :-)
[Trisquel-users] Re : Lightweight Browser
Here is a shell script, "fact.sh", that reads integers on the standard input and factorizes them:

    #!/bin/sh
    while read nb
    do
        factor $nb
    done

Calling it with twice the same integer and measuring the overall run time:

    $ printf '531691198313966348700354693990401\n531691198313966348700354693990401\n' | /usr/bin/time -f %U ./fact.sh
    531691198313966348700354693990401: 2305843009213693951 2305843009213693951
    531691198313966348700354693990401: 2305843009213693951 2305843009213693951
    185.00

The result of the first call of 'factor' was not cached. As a consequence, the exact same computation is done twice. The kernel cannot know that, given the same number, 'factor' will return the same factorization. It cannot know that, for some reason, 'factor' will likely be called on an integer that it recently factorized. The programmer knows. She implements a cache of the last 1000 calls to 'factor' (the worst cache ever, I know):

    #!/bin/sh
    TMP=`mktemp -t fact.sh.XXXXXX`
    trap "rm $TMP* 2>/dev/null" 0
    while read nb
    do
        if ! grep -m 1 "^$nb:" $TMP
        then
            factor $nb
        fi | tee $TMP.1
        head -999 $TMP >> $TMP.1
        mv $TMP.1 $TMP
    done

And our execution on twice the same integer is about twice as fast:

    $ printf '531691198313966348700354693990401\n531691198313966348700354693990401\n' | /usr/bin/time -f %U ./fact.sh
    531691198313966348700354693990401: 2305843009213693951 2305843009213693951
    531691198313966348700354693990401: 2305843009213693951 2305843009213693951
    92.71

It does not look like there is what you call "a higher (strategic) reason" here. Just a costly function that is frequently called with the same arguments and whose result only depends on those arguments. A quite common situation.
[Trisquel-users] Re : Lightweight Browser
> You see, we are back to the subtleties between grand design and tactical design choices. It really depends on for which purpose you allocate RAM.

I agree. There is no reason to re-implement what the kernel does (probably better).

> If it is for *direct* data caching, then it is both "selfish" (as I've explained in a previous post) and inefficient too, as the kernel makes more efficient use of free RAM. But, on the other hand, if it is related to *indirect* caching, that employs a rather complicated algorithm beyond that of the kernel's idea of data caching, then it may be worthwhile.

The algorithm does not have to be complicated. It is just that the kernel cannot take initiatives that require application-level knowledge (such as the fact that a function will often be called with the same arguments). The programmer has to do the work in that case.
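To make the point concrete, here is a minimal Python sketch of such application-level caching (memoization). The naive `factorize` function and the cache size of 1000 are made up for illustration; `functools.lru_cache` is Python's standard-library memoization decorator.

```python
# A sketch of application-level caching (memoization) in Python.
# The kernel cannot know that `factorize` is pure (its result only
# depends on its argument); the programmer can, and caches accordingly.
from functools import lru_cache

@lru_cache(maxsize=1000)  # keep the results of the last 1000 distinct calls
def factorize(n):
    """Naive trial-division factorization (illustrative, not efficient)."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return tuple(factors)

print(factorize(3120))  # (2, 2, 2, 2, 3, 5, 13)
print(factorize(3120))  # same arguments: answered from the cache, no recomputation
```

The second call is a cache hit: the costly computation runs only once, exactly as in the shell example above.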
[Trisquel-users] Re : Lightweight Browser
Etymologies are not definitions. The source you cite tells it on its front page:

> Etymologies are not definitions; they're explanations of what our words meant

https://www.etymonline.com

In the case of "scarce", the page you show says "c. 1300". So, that is what "scarce" meant circa 1300. Today, it has a different meaning.
[Trisquel-users] Re : Lightweight Browser
> And also because DRAM is accessed page-wise. Changing page is much more expensive than accessing data on the same (already selected) page.

Yes, there is that too. And accessing recent pages is fast thanks to yet another cache, the translation lookaside buffer: https://en.wikipedia.org/wiki/Translation_lookaside_buffer

> This, along with onpon4's similar views, is overlooking a basic fact: That the kernel is already using free memory for data caching. A user space program attempting to do its own data caching is a grave error (a bug, in essence) because it tries to take the kernel's job on itself, rather selfishly.

The kernel cannot know that a costly function will be frequently called with the same arguments and will always return the same value given the same arguments (i.e., depends on nothing but its arguments). A cache at the application level is not reimplementing the caches at the system level.

> And how would you really check that from within a user space program and take necessary steps?

I am not suggesting that the program should do that. I am only saying that there is no benefit in choosing "lightweight applications" and always having much free RAM. That it is a waste of RAM. If you always have much free RAM, you had better choose applications that require more memory to be faster.

> The obvious thing to do is that you must allocate no more RAM than you really need, and leave the rest (deciding what to do with free RAM) to the kernel.

An implementation strategy that minimizes the space requirements ("no more RAM than you really need") will usually be slower than alternatives that require more space. As with the one-million-line examples I gave to heyjoe. Or with the examples on https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff
[Trisquel-users] Re : Lightweight Browser
> Scarce means restricted in quantity.

No, it does not. It means "insufficient to satisfy the need or demand": http://www.dictionary.com/browse/scarce

When your system does not swap, the amount of RAM is sufficient to satisfy the need or demand. There is no scarcity of RAM. One more time (as with the word "freedom"), you are rewriting the dictionary so as not to admit you are wrong. You are ready to play dumb too:

> Reading/writing 1GB of RAM is slower than reading/writing 1KB of RAM.

Thank you, Captain Obvious. You are also grossly distorting what we write (so that your wrong statements about the sequentiality of RAM or its fragmentation eating up CPU cycles or ... do not look too stupid in comparison?):

> Which implies that one should fill up the whole available RAM just to print(7) and that won't affect performance + will add a benefit, which is nonsense.

And then you accuse us of derailing the discussion:

> The space-time trade-off has absolutely nothing to do with where all this started. (...) Then the whole discussion went into some unsolicited mini lecturing

By the way, sorry for using arguments, giving examples, etc. It is easy to verify who derailed the "whole discussion" because he does not want to admit he is wrong: just go up the hierarchy of posts. It starts with you writing:

> It is possible to optimize performance through about:config settings (turn off disk cache, tune mem cache size and others).

https://trisquel.info/forum/lightweight-browser#comment-127383

Me replying:

> Caches improve performance when you revisit a site.

https://trisquel.info/forum/lightweight-browser#comment-127396

And onpon4 adding the amount of RAM, which is not scarce nowadays, as the only limitation to my affirmation, which can be generalized to many other programming techniques that reduce time requirements by using more space:

> Exactly. I don't think a lot of people understand that increased RAM and hard disk consumption is often done intentionally to improve performance. The only way reducing RAM consumption will ever help performance is if you're using so much RAM that it's going into swap, and very few people have so little RAM that that's going to happen.

https://trisquel.info/forum/lightweight-browser#comment-127400

Then you start saying we are wrong. Onpon4 and I are still talking about why programs eating most of your RAM (but not more) are usually faster than the so-called "lightweight" alternatives and how, in particular, caching improves performance. In contrast, and although you stayed on topic at the beginning (e.g., claiming that "caching in RAM is not a performance benefit per se"), you now pretend that "the space-time trade-off has absolutely nothing to do with where all this started" and that "Reading/writing 1GB of RAM is slower than reading/writing 1KB of RAM" is a relevant argument to close the "whole discussion". Also, earlier, you were trying to question onpon4's skills, starting a sentence with "I don't know what your programming experience is but". Kind of funny from somebody who believes the Web could be broadcast to every user. FYI, both onpon4 and I are programmers.
[Trisquel-users] Re : Lightweight Browser
> Resources are always scarce (limited) and should be used responsibly.

They are always limited. They are not always scarce. In the case of memory, as long as you do not reach the (limited) amount of RAM you have, there is no penalty. And by using more memory, a program can be made faster. I pointed you to https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff for real-world examples. But I can explain it on a basic example too.

Let us say you have a file with one million lines and a program repetitively needs to read specific lines, identified by their line numbers. The program can access the disk every time it needs a line. That strategy uses as little memory as possible. It is constant, i.e., it does not depend on the size of the file. But the program is slow. Storing the whole file in RAM makes the program several orders of magnitude faster... unless there is not enough free space in RAM to store the file.

Let us take that second case and imagine that it often happens that the same line must be reread, whereas most of the lines are never read. The program can implement a cache. It will keep in RAM the last lines that were accessed so that rereading a line that was recently read is fast (no disk access). The cache uses some memory to speed up the program. As long as the size of the cache does not exceed the amount of available RAM, the larger the cache, the faster the program.

Going on with our one-million-line file to give another example of a trade-off between time and space: let us imagine 95% of the lines are actually the same, a default "line". If there is enough free memory, the fastest implementation strategy remains to have an array of size one million so that the program can access any line in constant time. If there is not enough free memory, the program can store the sole pairs (line number, line) where "line" is not the default one, i.e., 5% of the lines. After ordering those pairs by line number, a binary search can return any line in a time that grows logarithmically with the number of non-default lines (if the line number is not stored, the default line, stored once, is returned). That strategy is slower (logarithmic time vs. constant time) than the one using an array of size one million, if there is enough free space to store such an array.

In the end, the fastest implementation is the one that uses the most space while remaining below the amount of available RAM. It is that strategy that you want. Not the one that uses as little memory as possible.

> You need free RAM for handling new processes and peak loads.

That is correct. And it is an important point for server systems. On desktop systems, processes do not pop up from nowhere and you usually do not want to do many things at the same time.

> Python is an interpreted language and you don't know how the interpreter handles the data internally. A valid test would be near the hardware level, perhaps in assembler.

You certainly run far more Python than assembler. So, for a real-life comparison, Python makes more sense than assembler. The same holds for programs that take into consideration the specificities of your hardware: you run far more generic code (unless we are talking about supercomputing).

> I was talking about the algorithmic memory fragmentation which results in extra CPU cycles.

No, it does not, because it is *random-access* memory. As I wrote in my other post, RAM fragmentation is only a problem when you run short of RAM: there are free blocks but, because they are not contiguous, the kernel cannot allocate them to store a large "object" (for example a large array). It therefore has to swap.

> Running a browser like Firefox on 512MB resulted in swapping. If you assume that software should be bloated and incompatible with older hardware, you will never create a lightweight program.

There can be space-efficient (and probably time-inefficient, given the trade-offs as in my examples above) programs for systems with little RAM, where the faster (but more space-consuming) program would lead to swapping. For systems with enough RAM, the space-consuming programs are faster and that is what users want: a fast program. It makes sense that programmers consider what most users have, several GB of RAM nowadays, when it comes to designing their programs to be as fast as possible.

Your tests show that 'dd' is faster with a cache. That is what onpon4 and I keep telling you: by storing more data you can get a faster program.
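The sparse-storage strategy from the one-million-line example can be sketched in a few lines of Python. The sample data and the `get_line` helper below are hypothetical, just to make the logarithmic lookup concrete; `bisect` is the standard-library binary-search module.

```python
# Sketch of the trade-off described above: 95% of the lines are a
# default line, so we store only the (line_number, line) pairs of the
# non-default lines, sorted by line number, and binary-search them.
import bisect

DEFAULT_LINE = "default"  # the line shared by ~95% of the file (made-up value)

# Sorted list of (line_number, line) pairs for the non-default lines.
non_default = [(3, "third"), (42, "answer"), (1000, "thousandth")]
numbers = [n for n, _ in non_default]  # parallel list of keys for bisect

def get_line(i):
    """Return line i in O(log k) time, k being the number of non-default lines."""
    pos = bisect.bisect_left(numbers, i)
    if pos < len(numbers) and numbers[pos] == i:
        return non_default[pos][1]
    return DEFAULT_LINE  # stored once, returned for the other 95% of lines

print(get_line(42))  # answer
print(get_line(7))   # default
```

A dense array of size one million would answer in constant time instead, at the cost of storing every line: the same trade-off as above.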
[Trisquel-users] Re : Lightweight Browser
> I don't know what your programming experience is but your expectations of efficiency are contrary to the basic programming principle: that a program should use only as much memory as it actually needs for completing the task and that memory usage should be optimized.

You can often get a faster program by using more memory. See https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff for examples. As long as the system does not swap, it is the way to go.

> Occupying as much RAM as possible just because there is free RAM is meaningless.

Storing in memory data that will never be needed again is, of course, stupid. We are not talking about that.

> RAM access is sequential.

You know that RAM means "random-access memory", don't you? The access is not sequential. Manipulating data that is sequentially stored in RAM is faster because of the CPU cache and sequential prefetching: https://en.wikipedia.org/wiki/Cache_prefetching

The idea of the CPU cache is, well, that of caching: keeping a copy of recent data/software closer to the CPU because it may have to be accessed again soon. The same idea, at another level, that a program can implement to be faster: keeping data in main memory instead of recomputing it. Are you also arguing that having free space in the CPU cache has benefits?

> On a system with more memory (e.g. 16GB) you can keep more data cached in RAM but that doesn't mean that programs should simply occupy lots or all because there is plenty of it and/or because RAM is faster than HDD.

It kind of means that. You want fast programs, don't you? If that can be achieved by taking more memory, the program will indeed be faster, unless the memory requirements become so huge that the system swaps. So, I ask you again: "How often does your system run out of RAM?". If the answer is "rarely", then choosing programs with higher memory requirements may be a good idea: they can be faster than lightweight alternatives.

> It is more time consuming to manage scattered memory blocks and thousand of pointers than reading a whole block at once.

It is not because of fragmentation (which only becomes a problem when the system swaps: free RAM cannot be allocated because too little is available in contiguous blocks). It is because of sequential prefetching in the CPU cache. Not an argument against caching. Quite the opposite.
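As one more concrete instance of trading space for time, here is a small Python sketch (the `POPCOUNT` table and the `popcount_bytes` helper are made up for illustration): a 256-entry table of precomputed bit counts turns counting the set bits of a byte string into one lookup per byte instead of a per-bit loop.

```python
# Illustrative space-time trade-off: spend 256 table entries of memory
# so that counting the set bits of a byte string becomes a single
# table lookup per byte instead of a loop over each of its 8 bits.
POPCOUNT = [bin(b).count("1") for b in range(256)]  # precomputed once

def popcount_bytes(data):
    """Count the set bits in a bytes object using the precomputed table."""
    return sum(POPCOUNT[b] for b in data)

print(popcount_bytes(b"\xff\x0f\x01"))  # 8 + 4 + 1 = 13
```

The table is computed once and reused on every call: more memory, less time, exactly the kind of choice the kernel cannot make for the programmer.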
[Trisquel-users] Re : Lightweight Browser
> As in, a browser with large memory footprint and heavy CPU usage will usually also have large package download size and more complex user interface. There is a strong correlation between them.

That may be, in practice, the "rule" for Web browsers (I am not sure). That is not true in general.

> While you might be able to dig up an exception, it would still be an exception to the rule, which I was talking about in the first place.

The program I work on (pattern mining, nothing to do with Web browsers) is a 650 kB binary which can easily use GBs of RAM (it depends on the data at input) and, with such a memory consumption, it can take 100% CPU for seconds or for hours (it depends on the parameters it is given). One of the parameters actually controls a trade-off between space and time (a threshold to decide whether the data should be stored in a dense way or in a sparse way).

> For instance, one may reduce memory consumption by 30% at the cost of 30% higher CPU usage (just for the sake of example, no pedantry please) but a bad design can boost both CPU and RAM usage 200%.

I am not sure what you call design. Design includes choosing a solution with a good trade-off between CPU usage and memory usage. The gain/loss is usually not fixed. It depends on the size of the data at input. The choice is often between two algorithms with different time and space complexities (in big O notation), i.e., the percentage is asymptotically infinite. You may say theory does not matter... but it does. There are popular computing techniques (e.g., dynamic programming) that precisely aim to get a smaller time complexity at the cost of a higher space complexity. https://en.wikipedia.org/wiki/Space%E2%80%93time_tradeoff gives many other examples. Including one that deals with Web browsers: rendering the SVG every time the page changes vs. creating a raster version of the SVG. It is far from 30% here: the SVG is orders of magnitude smaller in size but takes orders of magnitude more time to render.

> Likewise a full featured software can use 400% more resources than a barebones one.

A feature that you do not use should not take significantly more resources.
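The dense/sparse threshold mentioned above can be sketched as follows. This is a hypothetical Python illustration (the 0.5 threshold and the `store` helper are invented for the example), not the actual pattern-mining program's code:

```python
# Hypothetical sketch of a density threshold choosing between a dense
# representation (a list indexed by position: fast access, more memory)
# and a sparse one (a dict of non-default entries: compact, slower).
DENSITY_THRESHOLD = 0.5  # made-up threshold, analogous to the parameter above

def store(values, size):
    """values: dict {index: value} of non-zero entries; size: total positions."""
    if len(values) / size > DENSITY_THRESHOLD:
        dense = [0] * size           # O(size) memory, O(1) access
        for i, v in values.items():
            dense[i] = v
        return ("dense", dense)
    return ("sparse", dict(values))  # O(len(values)) memory, hashed access

kind, _ = store({0: 1, 2: 5}, size=1000)
print(kind)  # sparse: only 2 of 1000 positions are non-zero
```

Which representation wins depends on the input data, which is why the gain/loss is not a fixed percentage.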
[Trisquel-users] Re : Lightweight Browser
> In essence #2 and #4 are redundant - they are by-products of #1 and #3 to large extent.

No, they are not. #1 is about RAM consumption, #2 about disk consumption, #3 about (CPU) time consumption and #4 about the human-computer interface.

> So, in practice, there is really only one definition of "lightweight" which entails *both* CPU usage and memory footprint. They usually both go up and down (also directly affecting #2 and #4) depending on design perfection and functionality span.

That is not true. There is often a choice to be made between storing data or repetitively computing it, i.e., a trade-off between (CPU) time and (memory) space.
[Trisquel-users] Re : Lightweight Browser
> If you constantly allocate and deallocate huge amounts of memory this is an overhead. So caching in RAM is not a performance benefit per se.

Yes, it is. It is about *not* deallocating recent data that may have to be computed/accessed again, unless there is a shortage of free memory.

> Starting a new program requires free memory. If all (or most) memory is already full, this will cause swapping. You need to have enough free memory.

Onpon4 did not say otherwise. She also rightfully said that "there is zero benefit to having RAM free that you're not using". How often does your system run out of RAM?
[Trisquel-users] Re : Lightweight Browser
Caches improve performance when you revisit a site.