Re: Filesystem corruption
On 30-May-07, at 2:22 PM, devsk wrote: I think people just like to spread FUD without doing any analysis of what really caused the FS corruption. I fear you're right. OTOH, filesystem developers on this list (and others including ZFS list) tend to be extremely meticulous. --Toby
Re: Filesystem corruption
On 30-May-07, at 10:25 AM, David Masover wrote: On Tuesday 29 May 2007 07:36:13 Toby Thain wrote: but you can't mention using reiserfs in mixed company without someone accusing you of throwing your data away. People who repeat this rarely have any direct experience of Reiser; they repeat what they've heard; like all myths and legends they are transmitted orally rather than based on scientific observation. Well, there is one problem I vaguely remember that I don't think has been addressed, I think it was one of those lets-put-it-off-till-v4 things. It was the fact that there are a limited number of inodes (or keys, or whatever you call a unique file), But does it cause data loss? One usually sees claims that "reiserfs ate my data", or "I heard reiserfs ate somebody's data", but without supplying a root cause - bad memory? powerfail? bad disk? etc. and no way of knowing how many you have left until your FS will suddenly, one day refuse to create another file. ... switching away from Reiser4 means I no longer see random files (including stuff in, for example, /sbin, that I hadn't touched in months) go up in smoke. I only wish sanity had prevailed over kernel inclusion, then we'd see it shaken down a lot quicker, like R3 was. Ordinarily I like to help debug things, but not at the risk of my data. Maybe I'll try again later, and see if I can reproduce it in a VM or somewhere safe... I do still follow the list, though, in case something interesting happens. Yeah, R4 is "something interesting". :) I still hope it gets finished... --Toby It was fun while it lasted!
Re: Filesystem corruption
I have always found reiser3 to be rock solid My experience too, over many server years. but you can't mention using reiserfs in mixed company without someone accusing you of throwing your data away. People who repeat this rarely have any direct experience of Reiser; they repeat what they've heard; like all myths and legends they are transmitted orally rather than based on scientific observation. You would think the developers would be doing more to counter this but I have been following reiserfs for years and nobody seems to really care all that much. Can't do much about human nature. MySQL suffers from the same baseless poisoned folk wisdom. --Toby
Re: Why reiser does a disk write on every sync() call?
On 12-May-07, at 9:03 AM, Grzegorz Jaśkiewicz wrote: sounds like useless waste of time and space You haven't stated the reason, why it has to create and commit empty transaction. Why'd you call sync()? --T -- GJ
Re: Hans Reiser arrested...
On 12-Oct-06, at 10:38 AM, [EMAIL PROTECTED] wrote: On Wed, 11 Oct 2006 13:32:22 EDT, Toby Thain said: He's in custody of the police; apparently even his lawyer can't see him. "Even his lawyer can't see him" is the sort of thing that only happens in 3rd world countries with shaky grasp on human rights. Of course, this *is* the US, so maybe in fact Hans isn't being allowed to see his lawyer I'm told this is not particularly unusual in California (!!) --T
Re: Hans Reiser arrested...
On 11-Oct-06, at 5:31 AM, Giovanni A. Orlando wrote: Hi, This morning I get this bad news. ... Again, locate Hans and offer comfort and a path to leave this situation. He's in custody of the police; apparently even his lawyer can't see him. --T Thanks, Giovanni. -- Future Technologies E-Learning, Linux, OpenSource Projects, ... Operating Systems, Web 2.0 and more!!! Italy - USA - Venezuela Check http://www.futuretg.com/FT/Contact.html for addresses.
Re: Relocating files for faster boot/start-up on reiser(fs/4)
On 14-Sep-06, at 6:23 PM, David Masover wrote: Quinn Harris wrote: On Thursday 14 September 2006 13:55, David Masover wrote: ... That is a good point. Recording the disk layout before and after to compare relative fragmentation would be a good idea. As well as randomizing the sequence as a sanity check. Also note that during boot I was using readahead on all 3885 files. So the kernel has a good opportunity to rearrange the reads. And the read sequence doesn't necessarily match the order it's needed (though I tried to get that). Speaking of which, did you parallelize the boot process at all? Just off the top of my head, wouldn't that make the access sequence asynchronous & thereby less predictable? (Although I'm sure it's a net win.) I'd estimate my system easily spent more than 50% of its boot time not touching the disk at all before I did that. Gentoo can do this, I'm not sure what else, as it kind of needs your init system to understand dependencies. ...
Re: Reiser4 und LZO compression
On 29-Aug-06, at 4:03 PM, David Masover wrote: Hans Reiser wrote: David Masover wrote: John Carmack is pretty much the only superstar programmer in video games, and his first fairly massive attempt to make Quake 3 have two threads (since he'd just gotten a dual-core machine to play with) actually resulted in the game running some 30-40% slower than it did with a single thread. Do the two processors have separate caches, and thus being overly fine grained makes you memory transfer bound or? It wasn't anything that intelligent. Let me see if I can find it... Taken from http://techreport.com/etc/2005q3/carmack-quakecon/index.x?pg=1 "Graphics accelerators are a great example of parallelism working well, he noted, but game code is not similarly parallelizable. ... The downside is, most game developers are working on Windows, for which FS compression has always sucked. Thus, they most often implement their own compression, often something horrible, like storing the whole game in CAB or ZIP files, and loading the entire level into RAM before play starts, making load times less relevant for gameplay. Reiser4's cryptocompress would be a marked improvement over that, but it would also not be used in many games. Gamer systems, whether from coder's or player's p.o.v., would appear fairly irrelevant to reiserfs and this list. I'd trust Carmack's eye candy credentials but doubt he has much to say about filesystems or server threading...
Re: reiserfs and IDE write cache
On 18-Aug-06, at 3:22 AM, Francisco Javier Cabello wrote: Hello, I have been 'googling' and I have found a lot of people warning about the problems with IDE write cache and journaling filesystems. Should I disable write cache in my systems using reiserfs3+2.4.25? I have tried to disable write cache with hdparm (hdparm -W0 /dev/hdc) but it is not working (you can see write cache is enabled with hdparm -i /dev/hdc). Is there other way to disable write cache? Get a UPS. :) Here are some links I collected on this subject a few months ago.
* _Due to loose interpretations and vendor uniqueness in the ATA Standard, there is no defined way that a driver can be assured that the disk's cache has been flushed._ (http://developer.apple.com/technotes/tn/tn1040.html)
* _if write back cache is turned on, it is not difficult to create metadata inconsistency or corruption at the file system upon power failure._ (http://sr5tech.com/write_back_cache_experiments.htm)
* Apple forum post (http://lists.apple.com/archives/darwin-dev/2005/Feb/msg00072.html) discussing the issues and OS X's F_FULLFSYNC feature, which tries hard to flush drive caches.
* Linux kernel mailing list: _How long can the unwritten data linger in the drive cache if the drive is otherwise idle?_ http://lkml.org/lkml/2003/11/2/73
* Interesting blog post (http://peter-zaitsev.livejournal.com/12639.html?mode=reply) on the issue by a MySQL developer. _Transaction will be durable and database intact on the crash only if database will perform synchronous IO as synchronous - reporting it is done when data is physically on the disk._
* Detailed post (http://www.opensolaris.org/os/community/arc/caselog/2004/652/) about Open Solaris' approach to the issue (see 'spec' link)
Regards, Paco -- One of my most productive days was throwing away 1000 lines of code (Ken Thompson) - PGP fingerprint: AF69 62B4 97EB F5BB 2C60 B802 568A E122 BBBE 5820 PGP Key available at http://pgp.mit.edu -
Re: Checksumming blocks? [was Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion]
On 4-Aug-06, at 3:25 AM, Russell Leighton wrote: If the software (filesystem like ZFS or database like Berkeley DB) finds a mismatch for a checksum on a block read, then what? Is there a recovery mechanism, or do you just be happy you know there is a problem (and go to backup)? ZFS will correct from a good mirror (http://blogs.sun.com/roller/page/bonwick?entry=zfs_end_to_end_data). --T Thx Matthias Andree wrote: Berkeley DB can, since version 4.1 (IIRC), write checksums (newer versions document this as SHA1) on its database pages, to detect corruptions and writes that were supposed to be atomic but failed (because you cannot write 4K or 16K atomically on a disk drive).
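To make the "then what?" concrete, here is a minimal sketch of the detect-and-repair idea on a two-way mirror. It is only an illustration of the general technique, not ZFS's or Berkeley DB's actual code: the function names are made up for this example, SHA-256 is an arbitrary choice of checksum, and the mirror copies are plain in-memory buffers. The assumption that matters is that the expected checksum is stored somewhere other than the block it protects.

```python
import hashlib
from typing import List


def checksum(block: bytes) -> bytes:
    """Checksum of one block (SHA-256 chosen only for this illustration)."""
    return hashlib.sha256(block).digest()


def read_with_repair(copies: List[bytearray], expected: bytes) -> bytes:
    """Return the first mirror copy matching `expected` and heal the others.

    `copies` stands in for the same block on each side of a mirror.
    `expected` is assumed to be stored apart from the block itself.
    """
    good = None
    for copy in copies:
        if checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        # No redundancy left: detection is all we get, so go to backup.
        raise IOError("every copy fails its checksum")
    for copy in copies:
        if bytes(copy) != good:
            copy[:] = good  # rewrite the damaged side from the good one
    return good
```

With a single copy the same check still detects the damage, but the only "recovery mechanism" is the backup, which is the distinction Russell's question is getting at.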
Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion
On 1-Aug-06, at 4:15 AM, Jeffrey V. Merkey wrote: ...I was and have remained loyal to Linux through it all. Except for that little fling with SCO, eh? Off topic, but no more so than your self-aggrandising. --T
Re: Solaris ZFS on Linux [Was: Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion]
On 31-Jul-06, at 11:18 PM, Horst H. von Brand wrote: Adrian Ulrich <[EMAIL PROTECTED]> wrote: [...] ZFS uses 'dnodes'. The dnodes are allocated on demand from your available space so running out of [di]nodes is impossible. Great to see that Sun ships a state-of-the-art Filesystem with Solaris... I think linux should do the same... This would be worthwhile, if only to be able to futz around in Solaris-made filesystems. Are you volunteering? You'd probably need a friend in Solaris-land who passes you information on how things are done, and copies of filesystems to take apart, and so on. First question is if there are any restrictions (patent or otherwise) on doing this, just copying is out of the question due to (unfortunate) licence on Sun's part. This may eventually be good enough for "futzing around": http://zfs-on-fuse.blogspot.com/ -- Dr. Horst H. von Brand User #22616 counter.li.org Departamento de Informatica Fono: +56 32 654431 Universidad Tecnica Federico Santa Maria +56 32 654239 Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513
Re: metadata plugins (was Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion)
If reiser4 is delayed enough, for reasons that have nothing to do with its needs, and without it having encumbered anyone else, it won't be ahead of the other filesystems when it ships. How is that important in any way for the Linux kernel? This is not (and has not been for quite some time now) an experimental operating system. And it has /never/ been a dumping ground for the next grand idea, it has always been about sound engineering. One of the groundbreaking things about Linux' (modular) development model was, I thought, that it can succeed at both... not necessarily at the same time, but for different users? --T
Re: Viewing files as directories
On 25-Jul-06, at 8:08 PM, David Masover wrote: Timothy Webster wrote: ... Yes it would be really wonderful, if we could just see directories as file and files as directories. Which of course means a file and a directory are one and the same. Ever use OS X? It does this, to some extent, in Finder, which supports the lkml point that doing this in the filesystem, or anywhere in the kernel, is unnecessary and a bad idea. As things stand now the way forward seems to be per application program mime types. Simple right, but it is not because, applications tools like svn, brz, There are two OS X file types that I know of, and probably quite a few more, which are actually stored on disk as folders, Apple calls them "packages": http://developer.apple.com/documentation/DeveloperTools/Conceptual/SoftwareDistribution/Managed_Installs/chapter_5_section_2.html (I actually thought they were called "bundles", but as the following page explains, "The term bundle indicates a directory with a specific hierarchical structure, whereas the term package indicates a directory that is treated as an opaque entity by the Finder.") http://developer.apple.com/documentation/CoreFoundation/Conceptual/CFBundles/Concepts/about.html which is why most Mac software is distributed as disk images or zipfiles. One is the Application type (.app, though Finder hides the extension) and the other is the MPKG type (whatever it stands for, extension is .mpkg). Apart from .pkg and .mpkg, bundle extensions are also treated specially including .bundle, .component, .qtx, etc. Basically, they appear as ordinary files to Finder, which means that most of the time, you cannot see that there are files inside them. You double click on a .app, and it runs a script in a predefined relative location inside the folder. Double click on a .mpkg, and it launches their installer program. ... By the way, Hans, Apple has beaten you by quite a bit for at least some of the functionality we've discussed.
You can do operations on Search Folders easily, which work by using Spotlight (an indexed fulltext local system search engine). Apparently running SQLite underneath. You can have files-as-directories, to a point. There are generic ways of getting at metadata, and they are done as plugins -- Spotlight plugins, anyway. I'd much rather use the Reiser4 described in the whitepaper, of course, and I am getting sick of the lack of decent package management for my Mac, so I'll be adding a Linux boot. I'm curious to see if Reiser4 is stable on PowerPC -- this is a year-old G4, I missed the Intel cores by just a few short months...
Re: the " 'official' point of view" expressed by kernelnewbies.org regarding reiser4 inclusion
On 23-Jul-06, at 7:48 AM, Jan-Benedict Glaw wrote: On Sun, 2006-07-23 01:20:40 -0600, Hans Reiser <[EMAIL PROTECTED]> wrote: There is nothing about small patches that makes them better code. There Erm, a small patch is something which should _obviously_ fix one issue. A small patch, containing at max some 100 lines, can easily be read and understood. A complete filesystem (I'm co-maintaining one for an ancient on-disk format, too) isn't really easy to understand or to verify from looking at it for 5min. Nonetheless, "There is nothing about small patches that makes them better code". Hans is quite right. Long patches just take longer to read. This can make it harder for them to get through review, as he describes, with analogy. is no reason we should favor them, if the developers are willing to work on something for 5 years to escape a local optimum, that is often the RIGHT thing to do. I give a shit of nothing to some 5 year work if I cannot verify that it won't hurt me at some point. Do you really review all patches to ensure this? It is well understood that only once r4 reaches mainline will it get the wider testing it must have to shake down. Lucky Namesys is not deterred by ingratitude or there would be no "5 year work" for us to contemplate at all. It is important that we embrace our diversity, and be happy for the strength it gives us. Some of us are good at small patches that evolve, and some are good at escaping local optimums. We all have value, both trees and grass have their place in the world. Just put reiser4 in some GIT tree and publish it. Maybe you can place it on git.kernel.org . Why should Hans give up the aspiration to have r4 in mainline due to a small number of regressive personalities (a.k.a. politics)? To much of the Linux world R3 has been an extremely valuable contribution; r4 promises to be even more so.
--T MfG, JBG -- Jan-Benedict Glaw [EMAIL PROTECTED] +49-172-7608481 Signature of: ...und wenn Du denkst, es geht nicht mehr, the second :kommt irgendwo ein Lichtlein her.
Re: somewhat OT query on journalling
On 19-Jul-06, at 11:27 AM, Payal Rathod wrote: Hi, ... And lastly don't the journalling fs give a false sense of security to the user, saying that the data is written to disk when in reality only an entry is made in journal and data is still not committed to disk. This last one is easy to answer: No. Regardless of the filesystem you're using, there is no guarantee your data hits the disk until you fsync(). Journalling filesystems don't change this. (And even after that, it depends on the device doing the right thing :) Thanks a lot for the patience and eagerly waiting for any replies. With warm regards, -Payal
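A minimal sketch of what "until you fsync()" means in practice, using Python's standard os module. The helper name is made up for illustration; the point is only that write() returning says nothing about the disk, while fsync() asks the kernel to push the file's data to the device. Whether the device then really puts it on stable media is the caveat in the parenthesis above, and this sketch cannot check that.

```python
import os


def write_durably(path: str, data: bytes) -> None:
    """Write `data` to `path` and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # write() may accept fewer bytes than offered, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Up to here the data may exist only in the page cache, on any
        # filesystem, journalled or not.
        os.fsync(fd)
    finally:
        os.close(fd)
```

For a newly created file, a careful program on POSIX systems would also fsync() the containing directory so that the directory entry itself is durable; that step is left out here to keep the example short.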
Re: data corruption with 2.4.25 and datalogging patches
On 17-Jul-06, at 2:14 PM, Brad Dameron wrote: On Mon, 2006-07-17 at 21:55 +0400, Vladimir V. Saveliev wrote: Hello On Mon, 2006-07-17 at 10:53 +0200, Francisco Javier Cabello wrote: Hello Vladimir, such corruptions used to be considered as hardware bugs. Memory failure, for instance. Did you ever run memtest on your systems? Yes, We have run memtest in our system. It's very seldom to find a system with a hardware memory problem running. When we find a memory problem the kernel doesn't boot. I am going to pass memtest in some of the system with reiserfs corruption problem. This is not true. There are certain memory issues that can still allow the system to boot and appear to run ok. I had a system that didn't show a memory error until the 4th pass on memtest. I just happened to let it run over the weekend. I have seen other issues with my larger systems that have 64GB of ram. To where memtest after a week didn't detect anything but the kernel mcelog reported weird ECC memory issues. I replaced several DIMMs and the issue went away. But who knows what could have occurred had I not replaced the memory. I agree with Brad. Memory problems can certainly manifest in obvious or obscure ways that don't prevent boot. I spent months chasing down what I thought was an IDE controller chipset problem (corrupt disk I/O invisible to the kernel, hence corrupt filesystems, etc) that was simply bad RAM. --T Brad Dameron SeaTab Software www.seatab.com
Re: any way to disable fsync?
On 11-Jul-06, at 5:57 PM, [EMAIL PROTECTED] wrote: On Tue, 11 Jul 2006 23:03:12 +0200, Łukasz Mierzwa said: I got problem with apps that are calling fsync, it makes my hard drive flush like mad and it slows down things quite a lot. Several have posted how to bypass it. I'll pose the opposite side: Usually, applications call fsync() because they're pretty sure that if the disk and in-memory copies aren't lined up, a crash at that point could result in data loss and/or corruption. So sqlite calls fsync() - probably because if it *doesn't*, and your system crashes/reboots, you *will* lose that sqlite database. Absolutely; it's required for commit semantics. :-) Your data, your decision.
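For anyone wondering why a commit cannot do without the flush, here is a rough sketch of the simplest commit pattern there is: write a new copy, fsync it, rename it over the old one. This is not what sqlite does internally (it keeps a journal alongside the database); it is a simpler stand-in that shows the same ordering requirement. The function name and the ".tmp" suffix are made up for the example, and a POSIX system is assumed, since the directory is opened and fsync()ed as well.

```python
import os


def commit_file(path: str, data: bytes) -> None:
    """Replace `path` with `data` so a crash leaves the old or the new version."""
    directory = os.path.dirname(os.path.abspath(path))
    tmp_path = path + ".tmp"  # hypothetical naming scheme for the new copy
    with open(tmp_path, "wb") as f:
        f.write(data)
        f.flush()
        # Without this fsync the rename could reach the disk before the
        # data does, and a crash would leave a committed but empty file.
        os.fsync(f.fileno())
    os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)  # make the rename itself durable
    finally:
        os.close(dir_fd)
```

Bypassing fsync() (by preloading a no-op, say) turns both flushes into nothing, which is fast right up to the first power cut.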
Re: ReiserFS v3 choking when free space falls below 10%?
On 6-Jul-06, at 11:43 AM, Mike Benoit wrote: On Thu, 2006-07-06 at 12:58 +0200, Jure Pečar wrote: On Tue, 04 Jul 2006 19:37:34 -0700 Hans Reiser <[EMAIL PROTECTED]> wrote: Mike Benoit wrote: Hi Jeff, I just tried the patch you suggested and it didn't make a difference. The load still spikes as soon as the free space falls below ~10%. Jeff, please audit your code for what happens when all the bitmap blocks reach 90% full. Could you discuss your design and code in that regard for our benefit? Mike, thanks so much for going to this much effort. It is rather likely this is a problem affecting many users. I run my busy mailservers with 0.5-2% free space (that's still a couple of gigabytes) and have no problems. It's true that I haven't touched the kernel & reiserfs there (2.4.21), so it does not have any additions to the reiserfs v3 code since then. It just works, so I don't have any desire to fix it :) My desktop machine (v2.6.16, same as my MythTV box) is running with 9% free space right now and it is not experiencing any slow down. I think the problem is caused by the usage pattern of MythTV and how it simultaneously streams one or more large files to the HD in relatively small chunks over a long period of time. ...And then has a hard timing requirement when reusing the free space, which a desktop/server doesn't have, exposing the issue. --T -- Mike Benoit <[EMAIL PROTECTED]>
Re: reiserfs performance on ssd
On 27-Apr-06, at 10:28 AM, Gregory Maxwell wrote: On 4/27/06, Sander <[EMAIL PROTECTED]> wrote: I have a simple solid state disk to play with here. See http://nerv.eu.org/iram/ Interesting review, thanks. To get better reliability you could raid1 them. I guess this is a 'must' anyway when used in servers (just like with harddisks). Have to try this product myself.. Because they have no ECC most failures will just be completely silent data corruption. A sadly useless device. Sure ECC would be nice, but how does this differ from disk? Silent failures are certainly possible. The fact that error detection and propagation doesn't really happen in modern disk subsystems is why systems like Sun's ZFS are coming into being. --T
Re: Reiser4 crash 2.6.16-mm1
On 28-Mar-06, at 10:34 AM, Joachim Feise wrote: Toby Thain wrote on 03/27/06 22:34: On 27-Mar-06, at 11:39 PM, [EMAIL PROTECTED] wrote: On Mon, 27 Mar 2006 14:32:14 PST, Joe Feise said: Thanks for the suggestion. I haven't run a memtest, but I don't really think that the memory is bad. The machine most likely would have had other issues if that was the case. You'd be *amazed*. Intermittently weak memory (especially if it's just one bad bit) can manifest in the most odd ways. I tend to agree. I spent weeks/months chasing down what I thought was a chipset bug, when it was bad RAM. Disk reads (and probably writes) were being corrupted and the kernel did not know about it. Was very frustrating ... until I figured out the real problem. Joe, did you soak the test at least overnight? Have you done any heavy compiles (like building X11, or gcc) lately? Compilers are often the canaries in the mine, when it comes to RAM. I'm not saying this is your problem but it would be good to rule out first. This is a production machine that I can't take offline for too long. But yes, I have compiled the kernel on another reiser4 partition over night, without problems. If this was a memory problem, it would indeed manifest itself in other areas with more or less random errors. The fact that it does not indicates to me that this is a fs problem. So, at this point I am ruling out a memory issue. I agree with all of the above, except the "random errors" part. Depending on the type of fault, it may manifest only under very specific but repeatable conditions, and never be seen in ordinary workload. Other faults, of course, will manifest in many contexts and apparently randomly. But I guess this is OT by now. :-) --Toby -Joe
Re: Reiser4 crash 2.6.16-mm1
On 27-Mar-06, at 11:39 PM, [EMAIL PROTECTED] wrote: On Mon, 27 Mar 2006 14:32:14 PST, Joe Feise said: Thanks for the suggestion. I haven't run a memtest, but I don't really think that the memory is bad. The machine most likely would have had other issues if that was the case. You'd be *amazed*. Intermittently weak memory (especially if it's just one bad bit) can manifest in the most odd ways. I tend to agree. I spent weeks/months chasing down what I thought was a chipset bug, when it was bad RAM. Disk reads (and probably writes) were being corrupted and the kernel did not know about it. Was very frustrating ... until I figured out the real problem. Joe, did you soak the test at least overnight? Have you done any heavy compiles (like building X11, or gcc) lately? Compilers are often the canaries in the mine, when it comes to RAM. I'm not saying this is your problem but it would be good to rule out first. --Toby In fact, if you think about it, if it's bad memory, your trashed reiser4 partition could very well *be* that "would have had other issues" that you said you'd see if it was bad memory. ;)
Re: Reiser4 crash 2.6.16-mm1
On 27-Mar-06, at 4:41 PM, Joe Feise wrote: Hi, I had an interesting crash on my 2.6.16-mm1 machine earlier today. I usually mount /usr/local readonly: /dev/sda6 on /usr/local type reiser4 (ro) However, since I wanted to update a sw package, I remounted it r/w. The installation of the sw package failed with reiser4 errors. Sorry, I don't have a dmesg output at this point, I'll send that tonight when I'm back at the machine. But anyway, it reported index errors, and suggested running fsck. I then ran an fsck.reiser4 --build-fs, which finished without reporting errors. In further tests, I mounted /usr/local rw from the start, and saw the installation fail again. After that, /usr/local/lib was no longer accessible. To me, this looks like corruption of some in-memory data structures. Have you run a memtest lately? --Toby Sorry to not be of more help wrt dmesg output. But I was hoping this is already known somewhere... -Joe
Re: lost partition table
On 24-Mar-06, at 8:55 AM, [EMAIL PROTECTED] wrote: Hello folks. i am using slackware linux(2.6.14 with reiser4 patch) on x86_64 and was trying to install free_bsd on a separate partition. well, during that installation, accidentally i have pressed a wrong key, so my partition table is overwritten by something else, but i realized this straight away and stopped the installation process. so in fact right now i've got around 200GiB of data, all mushed up. there were 11 linux partitions without raid. many of those were reiserfs and reiser4. I have located Reiser3 partitions after overwriting a partition table. I imagine Reiser4 would be similar. The method I used is described here: http://slashdot.org/~toby/journal/110587 Once you have starting sector locations, you can write a new partition table. Tedious but it will work. --Toby i have found on some linux distro's mailing list that the boot sector's backup made by the LILO should contain the table. i'm not 100% sure of that, but i believe it must do. so basically i have this backup file, but it exists somewhere in between all that mess on my drive. now the question is how to recover that file? obviously i have tried gpart, which was updated last time back in 1999, so could find only my swap partition :) i also read about magicrescue on this list's archive ..but the file i need seems to be binary and i'm still not sure does it contain the table or just the LILO code. Can anybody help please, because i think that this is 99.(9)% recoverable! thank you anyway.!
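As a rough illustration of how starting sectors can be found, the sketch below scans a disk (or an image of it) for the ReiserFS v3 superblock magic. It is not necessarily the method described in the journal entry linked above. It assumes the usual v3 layout, with the superblock 64 KiB into the partition and the magic string 52 bytes into the superblock, assumes 512-byte sectors, and does not handle Reiser4, which would need its own magic and offset. The function name is made up for the example. Work read-only, preferably on a copy of the disk.

```python
# Assumed ReiserFS v3 on-disk layout; verify against your kernel headers.
SUPERBLOCK_OFFSET = 64 * 1024   # superblock position from partition start
MAGIC_OFFSET = 52               # position of the magic within the superblock
MAGICS = (b"ReIsErFs", b"ReIsEr2Fs", b"ReIsEr3Fs")
SECTOR_SIZE = 512               # assumed sector size


def find_reiserfs_starts(image, sector_size=SECTOR_SIZE):
    """Return candidate partition start sectors in a seekable binary file.

    `image` is any object opened in binary mode, e.g. open("/dev/sda", "rb")
    or a disk image file. Every sector is tried as a possible partition
    start, which is slow on a large disk but simple.
    """
    image.seek(0, 2)            # seek to the end to learn the size
    size = image.tell()
    longest = max(len(m) for m in MAGICS)
    starts = []
    sector = 0
    while True:
        pos = sector * sector_size + SUPERBLOCK_OFFSET + MAGIC_OFFSET
        if pos + longest > size:
            break
        image.seek(pos)
        if image.read(longest).startswith(MAGICS):
            starts.append(sector)
        sector += 1
    return starts
```

Each hit is only a candidate: stale superblocks from older filesystems can match too, so check every start (for instance by mounting read-only through a loop device at that offset) before writing a new partition table.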