Re: no movement on Critical issues; 2.16 in Oct ?
Wols Lists writes:

> On 31/07/11 17:47, David Kastrup wrote:
>> Windows 2000 (not NT-based IIRC) does not usefully employ memory
>> protection IIRC, so likely Cygwin does not add all too much on top.
>
> Windows 2000 most definitely IS NT-based. You're thinking of Windows ME,
> which is the last of the DOS7/Win9x line.

Ok. It might have been that the security implications of uninitialized
memory had not caught up with NT/2000 development at that point in time.
It took quite some time before Linux closed that leak, too.

--
David Kastrup

___
lilypond-devel mailing list
lilypond-devel@gnu.org
https://lists.gnu.org/mailman/listinfo/lilypond-devel
Re: no movement on Critical issues; 2.16 in Oct ?
On 31/07/11 17:47, David Kastrup wrote:
> Windows 2000 (not NT-based IIRC) does not usefully employ memory
> protection IIRC, so likely Cygwin does not add all too much on top.

Windows 2000 most definitely IS NT-based. You're thinking of Windows ME,
which is the last of the DOS7/Win9x line.

Cheers,
Wol
Re: no movement on Critical issues; 2.16 in Oct ?
Graham Percival writes:

> On Sun, Jul 31, 2011 at 10:26:11AM +0200, David Kastrup wrote:
>> Modern operating systems don't give your code any leftovers from a
>> previous run. That would be a security violation.
>
> I'm certain that I've seen an uninitialized variable being
> 123456789 in some cases, and 0 in others. I sincerely doubt that
> modern operating systems remember what collection of bits were in
> memory just before the first initialization, so the security
> step would surely be simply writing 0s to that location in memory.

If the stack never previously was used to that depth. I did not say that
you don't get leftovers from previous function calls. And yes, you
usually get zeros for uninitialized memory.

>> And even user stack initialization below the stack pointer is not
>> stochastical.
>
> Hmm, I may be misunderstanding this sentence due to my relative
> ignorance of low-level OS stuff (I had a quite varied career as an
> undergraduate). If you mean "the computer starts reserving pieces of
> memory for variables in different places in memory on each run", then
> my 0-theorizing above is false.

That's not what I mean, though Linux indeed nowadays has kernel
parameters for randomizing its virtual storage layout to make it harder
to develop exploits for system libraries. If bugs pop up only
occasionally, it might be worth switching this off and seeing whether it
stabilizes the problem in either direction.

> But I'm pretty certain that I've seen student programs (running in
> 3-year-old cygwin on windows 2000, so perhaps not the most secure of
> environments) share uninitialized variable locations across program
> runs.

Windows 2000 (not NT-based IIRC) does not usefully employ memory
protection IIRC, so likely Cygwin does not add all too much on top.

--
David Kastrup
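
[Editor's sketch, not part of the original message.] The suggestion to
switch off layout randomization can be tried per process rather than
system-wide. The following is a minimal sketch, assuming a Linux system
where /proc/sys/kernel/randomize_va_space is readable and where setarch
from util-linux is installed. The helper names read_aslr_mode and
without_aslr are made up for illustration, and "lilypond test.ly" in the
note below is a placeholder command, not one taken from the thread.

```python
import platform
import shutil
from pathlib import Path

# Linux exposes the address-space randomization setting in this file:
# 0 = off, 1 = stack/mmap/libraries randomized, 2 = heap randomized as well.
ASLR_SYSCTL = Path("/proc/sys/kernel/randomize_va_space")


def read_aslr_mode(path=ASLR_SYSCTL):
    """Return the kernel's ASLR mode as an int, or None if it cannot be read."""
    try:
        return int(path.read_text().strip())
    except (OSError, ValueError):
        return None


def without_aslr(command):
    """Return `command` prefixed so that it runs with randomization disabled.

    Uses `setarch <machine> -R` (-R is --addr-no-randomize) when setarch is
    on the PATH; otherwise the command is returned unchanged.
    """
    setarch = shutil.which("setarch")
    if setarch is None:
        return list(command)
    return [setarch, platform.machine(), "-R", *command]
```

Calling without_aslr(["lilypond", "test.ly"]) only builds the argument
list; it still has to be handed to subprocess.run or a shell. If the
failure rate changes noticeably with randomization off, that hints at a
layout-dependent bug, but it does not prove one.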
Re: no movement on Critical issues; 2.16 in Oct ?
On Sun, Jul 31, 2011 at 10:26:11AM +0200, David Kastrup wrote:
> Modern operating systems don't give your code any leftovers from a
> previous run. That would be a security violation.

I'm certain that I've seen an uninitialized variable being
123456789 in some cases, and 0 in others. I sincerely doubt that
modern operating systems remember what collection of bits were in
memory just before the first initialization, so the security
step would surely be simply writing 0s to that location in memory.

I think it's quite reasonable that if C++ interpreted a random
collection of bits (i.e. uninitialized memory), guile would barf
when trying to do some math with the resulting value. But since
that pile of bits would be set to 0 on program exit, and if the
initial programmer just assumed that uninitialized variables were
0 (as they are in Java), that would very neatly explain why we've
never seen two successive runs of this problem.

> And even user stack initialization below the stack pointer is not
> stochastical.

Hmm, I may be misunderstanding this sentence due to my relative
ignorance of low-level OS stuff (I had a quite varied career as an
undergraduate). If you mean "the computer starts reserving pieces of
memory for variables in different places in memory on each run", then
my 0-theorizing above is false.

But I'm pretty certain that I've seen student programs (running in
3-year-old cygwin on windows 2000, so perhaps not the most secure of
environments) share uninitialized variable locations across program
runs.

Cheers,
- Graham
Re: no movement on Critical issues; 2.16 in Oct ?
On Sunday, 31 July 2011, 07:45:20, Graham Percival wrote:
> I haven't seen any interest in
> http://code.google.com/p/lilypond/issues/detail?id=1732
> This is unfortunate, since it means that we can't have a release
> candidate on Aug 01.

Without a reproducible test case, it's simply not possible to debug the
problem. As I mentioned in my comment, the GUILE error message indicates
an error in the threaded code. But there is no other indication where
the problem might be.

Cheers,
Reinhold

--
Reinhold Kainhofer, reinh...@kainhofer.com, http://reinhold.kainhofer.com/
 * Financial & Actuarial Math., Vienna Univ. of Technology, Austria
 * http://www.fam.tuwien.ac.at/, DVR: 0005886
 * LilyPond, Music typesetting, http://www.lilypond.org
Re: no movement on Critical issues; 2.16 in Oct ?
2011/7/31 David Kastrup:
> Graham Percival writes:
>
>> On Sun, Jul 31, 2011 at 09:42:36AM +0200, David Kastrup wrote:
>>> Graham Percival writes:
>>>
>>> > I haven't seen any interest in
>>> > http://code.google.com/p/lilypond/issues/detail?id=1771
>>>
>>> My take on this (if nobody is going to protest in the next few hours) is
>>> to revert the flawed fix.
>>
>> I think that's entirely reasonable. IMO, if there's no clear
>> offer of a fix within 48 hours of a bad commit, we should revert
>> it.
>
> Within 48 hours of pinpointing the bad commit.

+1. If we manage to get stable releases every few months, I think a
policy of reverting any flawed commit that appeared after the last
stable release (I mean x in 2.x.y, not y) would be good.

I can help with these bugs once I have closed my currently open issues
and sorted out my repository after the grand fixcc-ing (estimated to
happen next weekend).

cheers,
Janek
Re: no movement on Critical issues; 2.16 in Oct ?
Graham Percival writes:

> On Sun, Jul 31, 2011 at 10:04:59AM +0200, David Kastrup wrote:
>> But this bug has been reported as occurring non-deterministically even in
>> successive runs on the same machine, and there are rather few things
>> that can introduce such stochastic behavior (another possibility would
>> be timer-triggered garbage collection).
>
> In C++ code, I'd suspect some uninitialized variables (especially
> since it always seems to work on the second run on a machine that
> failed in the first run).

Modern operating systems don't give your code any leftovers from a
previous run. That would be a security violation. And even user stack
initialization below the stack pointer is not stochastical. System
processes (like those triggered by interrupts and/or preemption) use
their own stack, and again: it would be a security violation if a user
process could access any information from their operation.

So the sources for variation in successive identical runs are very
limited.

--
David Kastrup
Re: no movement on Critical issues; 2.16 in Oct ?
On Sun, Jul 31, 2011 at 10:04:59AM +0200, David Kastrup wrote:
> But this bug has been reported as occurring non-deterministically even in
> successive runs on the same machine, and there are rather few things
> that can introduce such stochastic behavior (another possibility would
> be timer-triggered garbage collection).

In C++ code, I'd suspect some uninitialized variables (especially
since it always seems to work on the second run on a machine that
failed in the first run). But since that throw() message seems to
come from guile, and AFAIK you can't have an uninitialized variable
in guile, I guess that's not an issue? Or could we be setting a
guile variable from some (uninitialized) C++ variable?

It's a shame that we can't (usefully) run valgrind on lilypond, or
that nobody's experimented with llvm or even AFAIK the more
sophisticated gcc options. Finding uninitialized variables is a
task that can be done by the computer; humans should never need to
theorize about whether that's a cause or not. Just run the
program with a trusted tool, and then you'll either find the
variables, or else you can cross that off from the list of
possibilities.

Cheers,
- Graham
Re: no movement on Critical issues; 2.16 in Oct ?
David Kastrup wrote Sunday, July 31, 2011 8:42 AM
> Graham Percival writes:
>
>> I haven't seen any interest in
>> http://code.google.com/p/lilypond/issues/detail?id=1771
>
> My take on this (if nobody is going to protest in the next few hours) is
> to revert the flawed fix.

+1

> The original bug fixer does not appear to be in the queue for fixing the
> effects of his patch, and the patch adds a considerable amount of
> material.

Fixing LilyPond is rarely trivial. In my experience the first fix one
thinks of is usually flawed (and the second ...) We need to be doubly
cautious when applying fixes from casual contributors.

Trevor
Re: no movement on Critical issues; 2.16 in Oct ?
Graham Percival writes:

> On Sun, Jul 31, 2011 at 09:42:36AM +0200, David Kastrup wrote:
>> Graham Percival writes:
>>
>> > I haven't seen any interest in
>> > http://code.google.com/p/lilypond/issues/detail?id=1771
>>
>> My take on this (if nobody is going to protest in the next few hours) is
>> to revert the flawed fix.
>
> I think that's entirely reasonable. IMO, if there's no clear
> offer of a fix within 48 hours of a bad commit, we should revert
> it.

Within 48 hours of pinpointing the bad commit.

--
David Kastrup
Re: no movement on Critical issues; 2.16 in Oct ?
Graham Percival writes:

> On Sun, Jul 31, 2011 at 09:42:36AM +0200, David Kastrup wrote:
>> Graham Percival writes:
>>
>> > I haven't seen any interest in
>> > http://code.google.com/p/lilypond/issues/detail?id=1771
>>
>> My take on this (if nobody is going to protest in the next few hours) is
>> to revert the flawed fix.
>
> I think that's entirely reasonable. IMO, if there's no clear
> offer of a fix within 48 hours of a bad commit, we should revert
> it.
>
>> The other critical bug appears to be related to multithreading, and I
>> consider it likely, given its random appearance, that it will mainly
>> affect multicore systems. I don't have such a one.
>
> I thought lilypond was single-threaded? Or is the C++ stuff
> single-threaded, but the guile stuff multi-threaded? I mean, I
> know that functional programming is great for multi-threaded work
> in general, but I didn't think that we used it as such.

Guile explicitly differentiates functions "map" and "map-in-order". In
theory, it would be free to evaluate "map" in multiple threads. I have
no indication that it does so and would be quite surprised if they
indeed had as fine-grained threading as that.

But this bug has been reported as occurring non-deterministically even in
successive runs on the same machine, and there are rather few things
that can introduce such stochastic behavior (another possibility would
be timer-triggered garbage collection).

--
David Kastrup
Re: no movement on Critical issues; 2.16 in Oct ?
On Sun, Jul 31, 2011 at 09:42:36AM +0200, David Kastrup wrote:
> Graham Percival writes:
>
> > I haven't seen any interest in
> > http://code.google.com/p/lilypond/issues/detail?id=1771
>
> My take on this (if nobody is going to protest in the next few hours) is
> to revert the flawed fix.

I think that's entirely reasonable. IMO, if there's no clear
offer of a fix within 48 hours of a bad commit, we should revert
it.

> The other critical bug appears to be related to multithreading, and I
> consider it likely, given its random appearance, that it will mainly
> affect multicore systems. I don't have such a one.

I thought lilypond was single-threaded? Or is the C++ stuff
single-threaded, but the guile stuff multi-threaded? I mean, I
know that functional programming is great for multi-threaded work
in general, but I didn't think that we used it as such.

Cheers,
- Graham
Re: no movement on Critical issues; 2.16 in Oct ?
Graham Percival writes:

> I haven't seen any interest in
> http://code.google.com/p/lilypond/issues/detail?id=1771

My take on this (if nobody is going to protest in the next few hours) is
to revert the flawed fix. Reason: we get rid of a critical issue. The
original bug fixer does not appear to be in the queue for fixing the
effects of his patch, and the patch adds a considerable amount of
material. For me this means that it is easier to think about fixing the
original bug than it is to think about the flawed fix.

I'll revert in my personal copy and start thinking from there. However,
it will be about noonish before I actually have time.

The other critical bug appears to be related to multithreading, and I
consider it likely, given its random appearance, that it will mainly
affect multicore systems. I don't have such a one.

--
David Kastrup
no movement on Critical issues; 2.16 in Oct ?
I haven't seen any interest in
http://code.google.com/p/lilypond/issues/detail?id=1771
http://code.google.com/p/lilypond/issues/detail?id=1732

This is unfortunate, since it means that we can't have a release
candidate on Aug 01. I fully expect to lose a whole week of
otherwise productive work in the confusion of the fixcc.py run.
I'm also seeing a couple of reports of builds failing; those are
very serious since it means that programmers and doc writers can't
test their own work.

Unless action is taken, we're looking at a much-delayed 2.16
release.

Many people have said that they would like to have stable releases
more regularly. Some people have expressed a willingness to work
on a team, i.e. spending a few hours a week on stuff that
(potentially) doesn't interest them in the least, simply to keep
momentum. I implore those people to investigate + fix Critical
issues.

I know that some problems only occur on certain machines, so those
are going to be a royal pain to investigate... but if nobody does
anything, then it's not going to fix itself. In cases of
occasional failures, it would be good to know if anybody can
reproduce the problematic behaviour.

Cheers,
- Graham
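
[Editor's sketch, not part of the original message.] For the occasional
failures mentioned above, one low-effort way to report whether a problem
reproduces on a given machine is to run the same command many times and
count the exit codes. This is a minimal sketch; the function name
tally_exit_codes and the default of 20 runs are arbitrary choices for
illustration, not something proposed in the thread.

```python
import subprocess
import sys
from collections import Counter


def tally_exit_codes(command, runs=20):
    """Run `command` `runs` times and count how often each exit code occurs."""
    codes = Counter()
    for _ in range(runs):
        # Output is discarded; only the exit status is of interest here.
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        codes[result.returncode] += 1
    return codes
```

A hypothetical call would be tally_exit_codes(["lilypond", "test.ly"],
runs=50), where test.ly stands for whatever input triggers the issue. A
result with more than one distinct exit code is the kind of evidence
that confirms a failure is intermittent on that machine.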