Re: [freenet-dev] Disk I/O thread
Freetalk and WoT may be better designed on that level. However they produce more disk I/O. And the reason for this is, mainstream database practice and theory are designed for two cases: 1. Lots of data (absolute amount and transactions/sec) on fast, reliable hardware with professional sysadmins. 2. Tiny amounts of data on commodity hardware. If you have lots of data on commodity disks, it falls down. And note that mainstream use of databases in the second case frequently fails. For example, I have a huge set of bookmarks in a tree on one of my browsers. Every time I create a new category, another one gets renamed. Thats not a database error, thats an application one for sure. The reason databases fail on standard consumer hardware is because these systems often write buffer on various levels, undermining the whole ACID architecture. This leads to half-applied transactions or simply database corruption on power failure or even just os crashes. The results are random, though the standard one being a corrupt db which just doesn't get back up, any replicable behaviour isn't likely to be related to this. Apart from that databases of any size will run reliably on any hardware as long as you ensure fsync does what it is supposed to do and don't intentionally or unintentionally disable any of the transaction-based safety features. Performance will suffer on consumer hardware, greatly so if the indices are too big to be cached, but identical stuff runs slower on slower hardware, no surprise here. It doesn't mean that there is no point in running large DBs on consumer hardware or that it is somehow inherently unreliable and you end up with randomly modified data sets. The approach of fred to not use it is just wrong from a computer-science perspective. The fact that it does not perform so well is an implementation issue of the database, not of the client code which uses the database. No it is not. The load you put on it requires many seeks. Are you sure of that? The primary database will require seeking on pretty much any access due to its nature, alright, but it doesn't need to be accessed all that often actually. How many requests must a busy node handle per second? Maybe 10, if at all and those will be lucky to cause 10 ios. Your standard sata disk can deal with more then that, the standard figures are usually 50 - 100 iops. Write requests to the DBs, caused by insert requests or fetched unknown data, which will require multiple seeks, will drive the load up a bit but if the write per second to db figures of my node are any indication then this won't cause severe disk load either, it just doesn't occure often enough. This is also my general experience with freenet, the main DBs aren't the issue: Just keep it running on its own and there won't be any issues, even with 500+ GB stores on old 5400 rpm sata disks. Load the node with local requests, be it downloads or something else, and the whole things gets severly disk-limited very very fast. With stuff like WoT, Freetalk or the Spider it is even worse, these can easily get things to unbearable levels in my experience. So imho the offender is rather the db4o database which is basically used by clients. The load here is different though, I don't really see how it must strictly be seek-heavy for long durations of time. Splitfile handling for example is more of a bulk data thing. Ideally it should get the data out of the db as needed during a decode, keeping temporary results in memory or in tempfiles if needed and only store the final result back to the db, minimising writes. This is still the equivalent of reading a terribly fragmented file from disk but thats not an unsurmountable task for a standard disk. But considering how io limited splitfile decoding often is on standard disks, and the time it takes, I would really suspect that it loads the DB with uneeded stuff, trying to minimise memory footprint, temp-file usage or something like that. Espescially since just the data retrieval, before the request actually completes, is already often io heavy in my experience. The same goes for Freetalk Co: They can't really cause many transactions per second, since freenet just can't fetch data fast enough to cause enough changes, so either they run really complex queries on really large or complex data sets or the db is inefficiently constructed or inefficiently accessed. No idea but it certainly seems weird. I suspect in general that the db4o database is loaded with way too many writes per second which is what will kill DB performance fast. Even your average professionally run DB, which you talk about above, quite often runs on a raid 5 or even raid 6 array out of 10k rpm disks, simply because thats a cost effective way to store bulky stuff, and cost is saved everywhere if at all possible. Those arrays will happily deal with very high read iops compared to your standard consumer disk but
[freenet-dev] Logging in Fred
If I remember right then the main issue with debug logging wasn't GC but that generating some of the log string arguments in freenet was simply very resource intensive in itself. So if Java isn't somehow intelligent enough to detect that argument x isn't needed since it won't be used in the end anyway and thus doesn't evaluate it before passing the value, then removing the current logic will cause higher ressource usage once again. In 'log.info("Random message: %s", obj.toString())' evaluating 'obj.toString()' was what caused issues, not recycling it, or so I believe to remember. The main things causing GC issues which I remember were not reusing threads and spawning a few new objects per received and send packet, not the log stuff, that probably just tipped things over the edge when GC was already done heavily because of the former. Speeking out of memory as a non Java-dev here, so ignore if applicable. :P Von: Ximin Luo An: devl at freenetproject.org Gesendet: 4:33 Samstag, 24.M?rz 2012 Betreff: Re: [freenet-dev] Logging in Fred I see. Actually we didn't even need the side effects thing - I did notice that your example stopped doing GC when setting "shouldLog = false" always, so I tried to introduce your "if (shouldLog) { shouldLog = false; }" into my example, but was still unable to reproduce GC. Then changing - System.out.println("don't optimise me out"); + System.out.println(format + args); worked to re-introduce GC. Well, that sucks. I guess the escape analyzer isn't sophisticated enough to see that ar2.toString() does not cause ar2 to escape - and removing that from your log() method does result in no GC. I still think it's negligible though :p - for my test, I'm seeing only ~8ms lost to GC after 20 million calls, and this appears to not change by much even if I increase the number of arguments (and objects created) to log_info. On 24/03/12 03:02, Zlatin Balevsky wrote: > Your test has nothing to do with stack allocation because you never > use the objects that you pass to the log_info method so JIT removes > them.? Apply the following patch and see yourself create garbage in > both testWith and testWithout. > > --- Test2.java??? 2012-03-23 23:01:05.540475497 -0400 > +++ Test.java??? 2012-03-23 23:01:47.304477562 -0400 > @@ -6,8 +6,11 @@ >? ? final Object existing = new Object(); >? ? Object tmp = null; > > +? long sideEffect; >? ? public void log_info(String format, Object ... args) { >? ? ? if (log_level) { > +? ? ? for (Object o : args) > +??? sideEffect += o.hashCode(); >? ? ? ? System.out.println("don't optimise me out"); >? ? ? } >? ? } > @@ -88,11 +91,8 @@ >? ? ? printSummary("with alloc", res[0], "w/o alloc", res[1]); >? ? ? printSummary("with alloc", res[2], "w/o alloc", res[3]); >? ? ? printSummary("with alloc", res[4], "w/o alloc", res[5]); > -? ? System.out.println(" with stack allocation and \"real use case\""); > -? ? res = testWithOtherCode(z); > -? ? printSummary("log only", res[0], "log+work", res[1]); > -? ? printSummary("log only", res[2], "log+work", res[3]); > -? ? printSummary("log only", res[4], "log+work", res[5]); > + > +? ? System.out.println(sideEffect); >? ? } > >? ? public static void main(String[] args) { > > > On Fri, Mar 23, 2012 at 10:48 PM, Ximin Luo wrote: >> On 24/03/12 02:44, Zlatin Balevsky wrote: >>> Wrong.? Extracting the log statement into its own method and calling >>> that method inside the while loop still creates garbage.? Stack frames >>> are different. >>> >> >> How do you explain the results obtained from running my test code, then? Turn >> on -verbose:gc. testWithStackAllocation results in no GC, >> testWithoutStackAllocation results in lots of GC. >> >>> Read again exactly how Escape Analysis works.? The objects it prevents >>> from creating do not appear in heap histograms.? Take a "jmap -histo >>> " and you will see a huge number of java.lang.Integer objects in >>> the heap, not on the stack. >>> >> >> If you're still talking about your example, this is exactly consistent with >> my >> explanation, i.e. that escape analysis is NOT occurring, properly anyway. >> >>> The difference in run times of your example could be due to many >>> things.? How many iterations did you run?? It may be still in >>> interpreted mode.? Enable -XX:+PrintCompilation and discard any >>> results until JIT stops printing. >> >> Done that, JIT stops printing after the 3rd iteration and results are the >> same >> thereafter. >> >> -- >> GPG: 4096R/5FBBDBCE >> https://github.com/infinity0 >> https://bitbucket.org/infinity0 >> https://launchpad.net/~infinity0 >> >> >> ___ >> Devl mailing list >> Devl at freenetproject.org >> https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl > ___ > Devl mailing list > Devl at freenetproject.org > https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl -- GPG:
Re: [freenet-dev] Logging in Fred
If I remember right then the main issue with debug logging wasn't GC but that generating some of the log string arguments in freenet was simply very resource intensive in itself. So if Java isn't somehow intelligent enough to detect that argument x isn't needed since it won't be used in the end anyway and thus doesn't evaluate it before passing the value, then removing the current logic will cause higher ressource usage once again. In 'log.info(Random message: %s, obj.toString())' evaluating 'obj.toString()' was what caused issues, not recycling it, or so I believe to remember. The main things causing GC issues which I remember were not reusing threads and spawning a few new objects per received and send packet, not the log stuff, that probably just tipped things over the edge when GC was already done heavily because of the former. Speeking out of memory as a non Java-dev here, so ignore if applicable. :P Von: Ximin Luo infini...@gmx.com An: devl@freenetproject.org Gesendet: 4:33 Samstag, 24.März 2012 Betreff: Re: [freenet-dev] Logging in Fred I see. Actually we didn't even need the side effects thing - I did notice that your example stopped doing GC when setting shouldLog = false always, so I tried to introduce your if (shouldLog) { shouldLog = false; } into my example, but was still unable to reproduce GC. Then changing - System.out.println(don't optimise me out); + System.out.println(format + args); worked to re-introduce GC. Well, that sucks. I guess the escape analyzer isn't sophisticated enough to see that ar2.toString() does not cause ar2 to escape - and removing that from your log() method does result in no GC. I still think it's negligible though :p - for my test, I'm seeing only ~8ms lost to GC after 20 million calls, and this appears to not change by much even if I increase the number of arguments (and objects created) to log_info. On 24/03/12 03:02, Zlatin Balevsky wrote: Your test has nothing to do with stack allocation because you never use the objects that you pass to the log_info method so JIT removes them. Apply the following patch and see yourself create garbage in both testWith and testWithout. --- Test2.java 2012-03-23 23:01:05.540475497 -0400 +++ Test.java 2012-03-23 23:01:47.304477562 -0400 @@ -6,8 +6,11 @@ final Object existing = new Object(); Object tmp = null; + long sideEffect; public void log_info(String format, Object ... args) { if (log_level) { + for (Object o : args) + sideEffect += o.hashCode(); System.out.println(don't optimise me out); } } @@ -88,11 +91,8 @@ printSummary(with alloc, res[0], w/o alloc, res[1]); printSummary(with alloc, res[2], w/o alloc, res[3]); printSummary(with alloc, res[4], w/o alloc, res[5]); - System.out.println( with stack allocation and \real use case\); - res = testWithOtherCode(z); - printSummary(log only, res[0], log+work, res[1]); - printSummary(log only, res[2], log+work, res[3]); - printSummary(log only, res[4], log+work, res[5]); + + System.out.println(sideEffect); } public static void main(String[] args) { On Fri, Mar 23, 2012 at 10:48 PM, Ximin Luo infini...@gmx.com wrote: On 24/03/12 02:44, Zlatin Balevsky wrote: Wrong. Extracting the log statement into its own method and calling that method inside the while loop still creates garbage. Stack frames are different. How do you explain the results obtained from running my test code, then? Turn on -verbose:gc. testWithStackAllocation results in no GC, testWithoutStackAllocation results in lots of GC. Read again exactly how Escape Analysis works. The objects it prevents from creating do not appear in heap histograms. Take a jmap -histo pid and you will see a huge number of java.lang.Integer objects in the heap, not on the stack. If you're still talking about your example, this is exactly consistent with my explanation, i.e. that escape analysis is NOT occurring, properly anyway. The difference in run times of your example could be due to many things. How many iterations did you run? It may be still in interpreted mode. Enable -XX:+PrintCompilation and discard any results until JIT stops printing. Done that, JIT stops printing after the 3rd iteration and results are the same thereafter. -- GPG: 4096R/5FBBDBCE https://github.com/infinity0 https://bitbucket.org/infinity0 https://launchpad.net/~infinity0 ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl ___ Devl mailing list Devl@freenetproject.org https://emu.freenetproject.org/cgi-bin/mailman/listinfo/devl -- GPG: 4096R/5FBBDBCE https://github.com/infinity0 https://bitbucket.org/infinity0 https://launchpad.net/~infinity0