Re: Trying to port data-logging to RH 2.4.18-19.7.x kernel
On Fri, 2003-01-31 at 16:28, John Dalbec wrote:
> The immediate caller is the "ReiserFS specific hack" in
> fs/inode.c:get_inode signed <[EMAIL PROTECTED]>. Is the BKL supposed to be
> held when get_inode is called?

Traditionally, the BKL is supposed to be held when iget or iget4 is called. RedHat might have patches that do away with that and simply missed reiserfs, but it is more likely they have a patch to reduce BKL use in NFS that missed the iget4 case.

So your two basic choices are adding the BKL to reiserfs_read_inode2, or going into the nfsd source and putting them around the iget4 call.

You might want to double check to see if their source had the BKL in reiserfs_read_inode2 before you started the data logging port. If not, you should be able to reproduce the oops on an unmodified redhat kernel (compiled with SMP on), and I'd appreciate it if you could send them a bug report as well.

-chris
Re: when distros do not support official Marcelo kernels they are not being team players (was Re: reiserfs on redhat advanced server?)
On Mon, 2003-02-03 at 09:20, Hans Reiser wrote:
> It is different from refusing to support the user who downloads
> Marcelo's kernel after it does ship (after the distro CD went into the
> stamping plant). That is what I am complaining about. The default
> should be to support all Marcelo kernels unless there is a motivated
> reason not to (e.g. he ships a broken NFS kernel and the user is
> complaining about NFS). Users should feel that they can download any
> latest official stable kernel (it is okay though to tell them to check a
> website created by the distro to see if it is a known bad/unsupported
> kernel), and everything will be fine with the distro. When distros
> don't do this, they are not being team players.

Hans, the vanilla kernels are lacking both bug fixes and features that are critical to what our users are doing. Even if the bug fixes all got in, there are various reasons the features probably won't.

If there was any vanilla kernel that had everything we needed, we'd support it, and do a dance around a bonfire made from all of our patch maintenance scripts and code.

The whole point of buying the distro is that you don't have the time and energy to collect and compile every application and turn it into something you can easily install on your personal machine. The kernel is one of those applications. Feel free to replace it, but it doesn't make sense to expect us to help you fix the problems when we don't have control over the configuration, compile or sources.

That would be like switching engines in your car and expecting the original car company to do a warranty repair on the new engine.

-chris (speaking only for himself and not SuSE)
when distros do not support official Marcelo kernels they are not being team players (was Re: reiserfs on redhat advanced server?)
Juan Quintela wrote:
> "hans" == Hans Reiser <[EMAIL PROTECTED]> writes:
>
> hans> I understand and support being pissed at Linus for calling it 2.4.0
> hans> when it wasn't stable enough before 2.4.18 because VM and VFS were
> hans> still being changed, but Marcelo is pretty stable in all of his
> hans> official releases, and it is easy to get him to take good code.
>
> But it is not possible to get Marcelo to adapt his release schedule to
> the distro's release schedule :p That is one of the BIG problems.
>
> If, when your release is about to freeze, the Marcelo kernel is in
> pre5/pre6, what do you do:
> - bet that the final kernel will be there by the end of the distro
>   release and switch, and in the process invalidate all the testing
>   that you have done so far?
> - get the old known stable kernel, and adapt all the bugfixes that you
>   found in the pre series?

This is reasonable, and I am not complaining about it. It is different from refusing to support the user who downloads Marcelo's kernel after it does ship (after the distro CD went into the stamping plant). That is what I am complaining about. The default should be to support all Marcelo kernels unless there is a motivated reason not to (e.g. he ships a broken NFS kernel and the user is complaining about NFS). Users should feel that they can download any latest official stable kernel (it is okay though to tell them to check a website created by the distro to see if it is a known bad/unsupported kernel), and everything will be fine with the distro. When distros don't do this, they are not being team players.

> Which strategy do you think is better? If you bet (as almost everybody
> does) that the second one is better, you are going to have a heavily
> patched kernel.
>
> And that is without taking into account that a lot of the bug fixes
> that go to the Marcelo kernel go the route:
> - user finds bug
> - user blames distro kernel
> - distro kernel team finds the problem (sometimes in cooperation with
>   the subsystem maintainer)

I don't see as many ReiserFS bugs found/fixed by distro kernel teams responding to complaints by their users as I would expect. Perhaps we are unusual; I lack the perspective to know. I would like to see more of them, and I don't really understand why there are so few.

--
Hans
Re: reiserfs on redhat advanced server?
> "hans" == Hans Reiser <[EMAIL PROTECTED]> writes:

hans> I understand and support being pissed at Linus for calling it 2.4.0
hans> when it wasn't stable enough before 2.4.18 because VM and VFS were
hans> still being changed, but Marcelo is pretty stable in all of his
hans> official releases, and it is easy to get him to take good code.

But it is not possible to get Marcelo to adapt his release schedule to the distro's release schedule :p That is one of the BIG problems.

If, when your release is about to freeze, the Marcelo kernel is in pre5/pre6, what do you do:
- bet that the final kernel will be there by the end of the distro release and switch, and in the process invalidate all the testing that you have done so far?
- get the old known stable kernel, and adapt all the bugfixes that you found in the pre series?

Which strategy do you think is better? If you bet (as almost everybody does) that the second one is better, you are going to have a heavily patched kernel.

And that is without taking into account that a lot of the bug fixes that go to the Marcelo kernel go the route:
- user finds bug
- user blames distro kernel
- distro kernel team finds the problem (sometimes in cooperation with the subsystem maintainer)
- distro kernel team sends the patch to the subsystem maintainer
- subsystem maintainer sends the patch to Marcelo (perhaps after some local modification)

hans> I am not really opposed to vendors shipping their own kernels and
hans> supporting them, but I am opposed to them not supporting an official
hans> stable Marcelo kernel unless they have a specific reason not to. The
hans> Marcelo kernels need to be considered the official supported ones by
hans> the entire community, regardless of what other ones might also be
hans> supported by parts of the community.

Believe me, if it were possible (not indeed easy) to get that done, my life would be much, much better :p

Later, Juan.

--
In theory, practice and theory are the same, but in practice they are different -- Larry McVoy
Re: reiserfsck --rebuild-tree all-in-one problem.
On Sunday 02 February 2003 21:33, Brian Chu wrote:
> Hello.
>
> Last friday when I went to upgrade my server, I noticed that there had
> been a lot of kernel messages on my server that were saying that one
> partition was spewing this:
>
> Jan 5 13:48:14 simmy kernel: hde: dma_intr: status=0x51 { DriveReady
> SeekComplete Error }
> Jan 5 13:48:14 simmy kernel: hde: dma_intr: error=0x40 {
> UncorrectableError }, LBAsect=91887, high=0, low=91887, sector=91824
> Jan 5 13:48:14 simmy kernel: end_request: I/O error, dev 21:01 (hde),
> sector 91824
> Jan 5 13:48:14 simmy kernel: vs-13070: reiserfs_read_inode2: i/o failure
> occurred trying to find stat data of [7495 7710 0x0 SD]
>
> I gave up that night, because running dd once took 7 hours and
> reiserfsck twice took 2 hours each, so the whole day was wasted. I had
> read on the first time I ran --rebuild-tree that a "dd_rescue" was
> suggested, so I downloaded it, installed it, and ran it again (since I had
> used just plain dd the first time). I'm not sure if that made a difference
> or not.

Right, plain dd seems to just skip the bad blocks, writing nothing for them into the output.

> Today I started again, assuming that with dd_rescue, I would have a
> greater chance of getting the filesystem recovered, but --check told me I
> had to run --rebuild-tree, and this time I just did --logfile /dev/null,
> because screen dumps during the run would make it impossible to see what's
> going on. But again, it stopped again at the same place- Pass 2. Since the
> logfiles spit so much STUFF out, I have none at the moment (I can remake
> them if needed).
>
> Screen dump:
>
> Pass 2:
> 0%20%40%.. left 36, 0
> /sec
>
> And it stops there. top indicates reiserfsck is using all of the cpu
> cycles, even after it seemingly freezes.

Looks like you built reiserfsck on another machine. Could you rebuild it on the same machine you run it on?

It is possible to suppress the logfile with the -n option, but I think the logfile was so big due to this endless loop.

--
Thanks,
Vitaly Fertman