Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Lawrence Stewart

On 12/27/11 16:13, Ron McDowell wrote:

Doug Barton wrote:

The story so far ...

sysinstall was removed from HEAD in October. I (and others) objected on
the basis that at this time there is no replacement for the post-install
configuration role that sysinstall played. More sysinstall components
were then removed. Then the old version of libdialog (which sysinstall
used) was removed. Thus at this point it's not possible to easily
restore sysinstall.

So my question is, how much do you care? Is lack of that functionality
in HEAD something that we care about?


Doug


We have around 90 web servers running 8.2p5 right now [and yes, I did
update the lot on Christmas Eve but that's a different story] and they
will not be upgraded to 9.0 until/unless the post-install functionality
that was lost by the removal of sysinstall is reintegrated in some way.
I also complained about it and was told in effect, too bad. Everyone
who commented said sysinstall caused more problems than it solved,
although I've been using it for any system changes I needed that it was
capable of doing for as long back as I can remember, and my first
FreeBSD box was v2.2.

I think removing any functionality that was in a previous release
without providing an equal-or-better alternative is a bad idea, and that
needs to be considered more carefully in the future.

So this is not just a +1 vote, it's a +90.


Sysinstall is in 9 and will not be removed from the 9 branch. The 
installer used on the release media has changed, but as far as I 
understand, there is nothing stopping you from running sysinstall from an 
installer shell or using it for post installation configuration.


Doug is only referring to the head branch (which will eventually in 
~18-24 months become the 10 branch), so you should be able to have the 
best of both worlds with 9 i.e. try bsdinstall, fall back to sysinstall 
when you find bugs or missing features (don't forget to lodge bug 
reports for problems you find so that bsdinstall can be improved).


On the topic of Doug's actual question, I see minimal sense in 
resurrecting sysinstall in head now. I would suggest it be done much 
closer to (say, 6 months before) the 10.0 release cycle, if no suitable 
post-installation configuration tool has materialised.


In the meantime, cajole everyone who pops up saying "I really want post 
installation configuration support" to get involved with writing a 
bsdinstaller-like script (I think it should be completely separate to 
bsdinstaller, but perhaps use the same backend shell script 
functions/infrastructure) to do the job.


Cheers,
Lawrence
___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Bruce Cran
I think such a tool should /not/ be a port, since I expect it would include a 
package browser in it. I think it's something that could really help new users 
get used to FreeBSD without having to trawl through man pages right at the 
start.

-- 
Bruce Cran

Sent from my iPad

On 27 Dec 2011, at 05:05, Doug Barton do...@freebsd.org wrote:

 On 12/26/2011 20:29, Xin LI wrote:
 On Mon, Dec 26, 2011 at 3:36 PM, Doug Barton do...@freebsd.org wrote:
 The story so far ...
 
 sysinstall was removed from HEAD in October. I (and others) objected on
 the basis that at this time there is no replacement for the post-install
 configuration role that sysinstall played. More sysinstall components
 were then removed. Then the old version of libdialog (which sysinstall
 used) was removed. Thus at this point it's not possible to easily
 restore sysinstall.
 
 So my question is, how much do you care? Is lack of that functionality
 in HEAD something that we care about?
 
 Perhaps make it a port instead?  I personally don't use sysinstall for
 post-install tasks at all, but it won't hurt to have such
 functionality.
 
 You're not the first person to suggest that, but I don't see how it's
 actually responsive to the problem. This issue only affects HEAD, so a
 port would not be generally useful. It would also be an enormous amount
 of work to make it into a port. It would be much easier to revert the
 necessary changes to bring back the old libdialog and sysinstall itself.
 
 
 Doug
 
 -- 
 
Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/
 


Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Philip Paeps
On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
 i grep'ed through src/sys and found several places where WERROR= was set in
 order to get rid of the default -Werror setting. i tried to remove those
 WERROR= overrides from any Makefile, where doing so did not break tinderbox.
 
 in those cases, where it couldn't be completely removed, i added conditions to
 only set WERROR= for the particular architecture or compiler, where tinderbox
 did not succeed without the WERROR=.

Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says "this
code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the
following kind of bugs which break tinderbox".

Possibly wrapped in an architecture-test where appropriate.

 - Philip

-- 
Philip Paeps
Senior Reality Engineer
Ministry of Information


Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Alexander Best
On Tue Dec 27 11, Philip Paeps wrote:
 On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
  i grep'ed through src/sys and found several places where WERROR= was set in
  order to get rid of the default -Werror setting. i tried to remove those
  WERROR= overrides from any Makefile, where doing so did not break tinderbox.
  
  in those cases, where it couldn't be completely removed, i added conditions
  to only set WERROR= for the particular architecture or compiler, where
  tinderbox did not succeed without the WERROR=.
 
 Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says this
 code has bugs, it breaks tinderbox whereas WARNS=x says this code has the
 following kind of bugs which break tinderbox.
 
 Possibly wrapped in an architecture-test where appropriate.

well there are a few issues here:

1) Jan Beich informed me via a private email that enclosing WERROR in arch
   specific conditions is a bad idea. if the code gets ported to a new
   architecture WERROR doesn't get set and so for every new architecture one
   has to evaluate again, whether WERROR needs to be set or not.
   so i'm inclined to agree with dim@ that we should not add architecture
   specific conditions -- however i think adding compiler specific conditions
   is a good idea.

2) the problem with setting WARNS= or specific -Wno-X or -Wno-error=X is that
   especially GCC doesn't have specific -WX flags for certain warnings. some
   warnings are implied by -Wall and cannot be turned off separately. so in
   order to get rid of these warnings (which are being handled as errors), we
   would need to disable -Wall. and i think setting WERROR= in order to handle
   all warnings for specific code as warnings rather than as errors is the
   better solution.

i've reworked the patch to only remove WERROR=, where it is not needed anymore
for any supported arch, or where it can be wrapped in a compiler condition.
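
For readers unfamiliar with the make modifiers, the compiler condition used in the patch, .if ${CC:T:Mclang} == clang, takes the last path component of each word in CC and keeps the words that are exactly "clang". A rough shell equivalent, as a sketch only (the function names cc_is_clang and werror_override are made up for illustration and are not part of the build system):

```shell
#!/bin/sh
# cc_is_clang "CC VALUE": succeed if any word of the CC value has the
# last path component "clang" (roughly what ${CC:T:Mclang} == clang tests).
cc_is_clang() {
    for word in $1; do
        # ${word##*/} strips everything up to the last slash, like :T
        [ "${word##*/}" = clang ] && return 0
    done
    return 1
}

# werror_override "CC VALUE": print the WERROR= override only for clang.
# Hypothetical helper; it mirrors the conditional in the nve Makefile hunk.
werror_override() {
    if cc_is_clang "$1"; then
        echo "WERROR="
    fi
}

for cc in /usr/bin/clang "ccache /usr/local/bin/clang" /usr/bin/gcc; do
    printf '%s -> [%s]\n' "$cc" "$(werror_override "$cc")"
done
```

With this shape, only the clang builds lose -Werror, and every other compiler keeps the default.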

cheers.
alex

 
  - Philip
 
 -- 
 Philip Paeps
 Senior Reality Engineer
 Ministry of Information
Index: sys/modules/xfs/Makefile
===================================================================
--- sys/modules/xfs/Makefile    (revision 228911)
+++ sys/modules/xfs/Makefile    (working copy)
@@ -6,8 +6,6 @@
 
 KMOD=   xfs
 
-WERROR=
-
 SRCS =  vnode_if.h \
         xfs_alloc.c \
         xfs_alloc_btree.c \
Index: sys/modules/sound/driver/maestro/Makefile
===================================================================
--- sys/modules/sound/driver/maestro/Makefile   (revision 228911)
+++ sys/modules/sound/driver/maestro/Makefile   (working copy)
@@ -5,6 +5,5 @@
 KMOD=  snd_maestro
 SRCS=  device_if.h bus_if.h pci_if.h
 SRCS+= maestro.c
-WERROR=
 
 .include <bsd.kmod.mk>
Index: sys/modules/aic7xxx/ahd/Makefile
===================================================================
--- sys/modules/aic7xxx/ahd/Makefile    (revision 228911)
+++ sys/modules/aic7xxx/ahd/Makefile    (working copy)
@@ -4,7 +4,6 @@
 .PATH: ${.CURDIR}/../../../dev/aic7xxx
 KMOD=  ahd
 
-WERROR=
 GENSRCS= aic79xx_seq.h aic79xx_reg.h
 REG_PRINT_OPT=
 AHD_REG_PRETTY_PRINT=1
Index: sys/modules/agp/Makefile
===================================================================
--- sys/modules/agp/Makefile    (revision 228911)
+++ sys/modules/agp/Makefile    (working copy)
@@ -20,7 +20,6 @@
 SRCS+= device_if.h bus_if.h agp_if.h pci_if.h
 SRCS+= opt_agp.h opt_bus.h
 MFILES=kern/device_if.m kern/bus_if.m dev/agp/agp_if.m dev/pci/pci_if.m
-WERROR=
 
 EXPORT_SYMS=   agp_find_device \
         agp_state   \
Index: sys/modules/bios/smapi/Makefile
===================================================================
--- sys/modules/bios/smapi/Makefile     (revision 228911)
+++ sys/modules/bios/smapi/Makefile     (working copy)
@@ -6,7 +6,6 @@
 KMOD=  smapi
 SRCS=  smapi.c smapi_bios.S \
         bus_if.h device_if.h
-WERROR=
 .if ${CC:T:Mclang} == clang
 # XXX: clang integrated-as doesn't grok 16-bit assembly yet
 CFLAGS+=   ${.IMPSRC:T:Msmapi_bios.S:C/^.+$/-no-integrated-as/}
Index: sys/modules/nve/Makefile
===================================================================
--- sys/modules/nve/Makefile    (revision 228911)
+++ sys/modules/nve/Makefile    (working copy)
@@ -7,7 +7,9 @@
         device_if.h bus_if.h pci_if.h miibus_if.h \
         os+%DIKED-nve.h
 OBJS+= nvenetlib.o
+.if ${CC:T:Mclang} == clang
 WERROR=
+.endif
 
 CLEANFILES+=   nvenetlib.o os+%DIKED-nve.h
 nvenetlib.o: ${.CURDIR}/../../contrib/dev/nve/${MACHINE}/${.TARGET}.bz2.uu

Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Luigi Rizzo
On Tue, Dec 27, 2011 at 11:27:43AM +, Alexander Best wrote:
 On Tue Dec 27 11, Philip Paeps wrote:
  On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
   i grep'ed through src/sys and found several places where WERROR= was set
   in order to get rid of the default -Werror setting. i tried to remove those
   WERROR= overrides from any Makefile, where doing so did not break
   tinderbox.
   
   in those cases, where it couldn't be completely removed, i added
   conditions to only set WERROR= for the particular architecture or
   compiler, where tinderbox did not succeed without the WERROR=.
  
  Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says
  this code has bugs, it breaks tinderbox whereas WARNS=x says this code has
  the following kind of bugs which break tinderbox.
  
  Possibly wrapped in an architecture-test where appropriate.
 
 well there are a few issues here:
 
 1) Jan Beich informed me via a private email that enclosing WERROR in arch
specific conditions is a bad idea. if the code gets ported to a new
architecture WERROR doesn't get set and so for every new architecture one
has to evaluate again, whether WERROR needs to be set or not.
so i'm inclined to agree with dim@ that we should not add architecture
specific conditions -- however i think adding compiler specific conditions
is a good idea.
 
 2) the problem with setting WARNS= or specific -Wno-X or -Wno-error=X is that
especially GCC doesn't have specific -WX flags for certain warnings. some
warnings are implied by -Wall and cannot be turned off separately. so in
order to get rid of these warnings (which are being handled as errors), we
would need to disable -Wall. and i think setting WERROR= in order to handle
all warnings for specific code as warnings rather than as errors is the
better solution.
 
 i've reworked the patch to only remove WERROR=, where it is not needed anymore
 for any supported arch, or where it can be wrapped in a compiler condition.

It seems to me that the removal of unnecessary WERROR= needed no
discussion since day one, so why don't you go ahead and commit it?

I don't understand the comment on issue #1 above. There is a minuscule
number (six, before your patch?) of Makefiles with WERROR=. If you make
the assignment architecture-specific, the worst that can happen is that
the variable is not cleared, and if the build breaks, all you need is to
add the extra architecture in these few places.
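
To check how many Makefiles still carry the override, something along these lines should do. This is only a sketch: the function name list_werror and the toy directory tree are invented for the example, and on a real checkout you would point it at sys/modules instead.

```shell
#!/bin/sh
# list_werror DIR: print every Makefile under DIR that assigns WERROR=
# at the start of a line (the override that cancels the default -Werror).
list_werror() {
    find "$1" -name Makefile -exec grep -l '^WERROR=' {} + | sort
}

# Toy tree standing in for sys/modules (hypothetical contents).
demo=$(mktemp -d)
trap 'rm -rf "$demo"' EXIT
mkdir -p "$demo/xfs" "$demo/agp" "$demo/nve"
printf 'KMOD=\txfs\nWERROR=\n' > "$demo/xfs/Makefile"
printf 'KMOD=\tagp\n' > "$demo/agp/Makefile"
printf '.if ${CC:T:Mclang} == clang\nWERROR=\n.endif\n' > "$demo/nve/Makefile"

list_werror "$demo"
count=$(list_werror "$demo" | wc -l | tr -d ' ')
echo "Makefiles still overriding WERROR: $count"
```

Note that a conditional override, as in the nve case, is still listed, so the output is a list of places to review rather than a list of unconditional overrides.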

cheers
luigi

 cheers.
 alex
 
  
   - Philip
  
  -- 
  Philip Paeps
  Senior Reality Engineer
  Ministry of Information


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Ron McDowell

Lawrence Stewart wrote:

On 12/27/11 16:13, Ron McDowell wrote:

Doug Barton wrote:

The story so far ...

sysinstall was removed from HEAD in October. I (and others) objected on
the basis that at this time there is no replacement for the post-install
configuration role that sysinstall played. More sysinstall components
were then removed. Then the old version of libdialog (which sysinstall
used) was removed. Thus at this point it's not possible to easily
restore sysinstall.

So my question is, how much do you care? Is lack of that functionality
in HEAD something that we care about?


Doug


We have around 90 web servers running 8.2p5 right now [and yes, I did
update the lot on Christmas Eve but that's a different story] and they
will not be upgraded to 9.0 until/unless the post-install functionality
that was lost by the removal of sysinstall is reintegrated in some way.
I also complained about it and was told in effect, too bad. Everyone
who commented said sysinstall caused more problems than it solved,
although I've been using it for any system changes I needed that it was
capable of doing for as long back as I can remember, and my first
FreeBSD box was v2.2.

I think removing any functionality that was in a previous release
without providing an equal-or-better alternative is a bad idea, and that
needs to be considered more carefully in the future.

So this is not just a +1 vote, it's a +90.


Sysinstall is in 9 and will not be removed from the 9 branch. The 
installer used on the release media has changed, but as far as I 
understand, there is nothing stopping you from running sysinstall from 
an installer shell or using it for post installation configuration.


You're right.  I stand corrected and am happy to see I'll be able to 
upgrade to 9.0 after -RELEASE.


Doug is only referring to the head branch (which will eventually in 
~18-24 months become the 10 branch), so you should be able to have the 
best of both worlds with 9 i.e. try bsdinstall, fall back to 
sysinstall when you find bugs or missing features (don't forget to 
lodge bug reports for problems you find so that bsdinstall can be 
improved).


On the topic of Doug's actual question, I see minimal sense in 
resurrecting sysinstall in head now. I would suggest it be done much 
closer to (say, 6 months before) the 10.0 release cycle, if no 
suitable post-installation configuration tool has materialised.


In the meantime, cajole everyone who pops up saying I really want 
post installation configuration support to get involved with writing 
a bsdinstaller-like script (I think it should be completely separate 
to bsdinstaller, but perhaps use the same backend shell script 
functions/infrastructure) to do the job.


I guess this is a good time for me to quit bitching, get off my butt, 
and contribute something back to a project I've been using daily for 
almost 20 years.  Having done similar sysadm development work [way back] 
on Tandy Xenix, SCO Xenix/Unix, and Dell SVR4 Unix, this is an area 
where I actually might know enough to be useful.  To that end, the first 
task I'm assigning myself is to poke around in bsdinstall/libdialog and 
see how they work.


As a related question, is there a good primer somewhere about how to use 
SVN?  I'm using csup at present.



--
Ron McDowell
San Antonio TX



Re: re(4) driver dropping packets when reading NFS files

2011-12-27 Thread YongHyeon PYUN
On Mon, Dec 26, 2011 at 10:55:50PM -0500, Rick Macklem wrote:
 Way back in Nov 2010, this thread was related to a problem I
 had, where an re(4) { 810xE PCIe 10/100baseTX, according to the
 driver } interface dropped received packets, resulting in a
 significant impact on NFS performance.
 
 Well, it turns out that a recent (post r224506) commit seems to
 have fixed the problem. It hasn't dropped any packets since I
 upgraded to a kernel with a r228281 version of if_re.c.
 
 So, good news.
 
 Thanks to those maintaining this driver, rick
 ps: If you have a need to know which commit fixed this, I can
 probably test variants to find out. Otherwise, I'm just
 happy that it's fixed.:-)

Glad to know the issue was fixed.  The change made in r227593 or
r227854 probably fixed it.


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Steve Kargl
On Tue, Dec 27, 2011 at 11:50:54AM -0600, Ron McDowell wrote:
 
 As a related question, is there a good primer somewhere about how to use 
 SVN?  I'm using csup at present.
 

http://wiki.freebsd.org/SubversionPrimer

-- 
Steve


Re: dogfooding over in clusteradm land

2011-12-27 Thread Florian Smeets
On 14.12.11 14:20, Sean Bruno wrote:
 We're seeing what looks like a syncher/ufs resource starvation on 9.0 on
 the cvs2svn ports conversion box.  I'm not sure what resource is tapped
 out.  Effectively, I cannot access the directory under use and the
 converter application stalls out waiting for some resource that isn't
 clear. (Peter had posited kmem of some kind).
 
 I've upped maxvnodes a bit on the host, turned off SUJ and mounted the
 f/s in question with async and noatime for performance reasons.
 
 Can someone hit me up with the cluebat?  I can give you direct access to
 the box for debuginationing.
 

Just for the archives. This is fixed or at least considerably improved
by r228838.

The ports cvs2svn run went from panicking after ~22h to finishing
after ~10h.

Thanks to Sean and Attilio for giving me access to test boxes.

Florian





Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Alexander Best
On Tue Dec 27 11, Luigi Rizzo wrote:
 On Tue, Dec 27, 2011 at 11:27:43AM +, Alexander Best wrote:
  On Tue Dec 27 11, Philip Paeps wrote:
   On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org 
   wrote:
    i grep'ed through src/sys and found several places where WERROR= was
    set in order to get rid of the default -Werror setting. i tried to remove
    those WERROR= overrides from any Makefile, where doing so did not break
    tinderbox.
    
    in those cases, where it couldn't be completely removed, i added
    conditions to only set WERROR= for the particular architecture or
    compiler, where tinderbox did not succeed without the WERROR=.
   
   Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says
   this code has bugs, it breaks tinderbox whereas WARNS=x says this code has
   the following kind of bugs which break tinderbox.
   
   Possibly wrapped in an architecture-test where appropriate.
  
  well there are a few issues here:
  
  1) Jan Beich informed me via a private email that enclosing WERROR in arch
     specific conditions is a bad idea. if the code gets ported to a new
     architecture WERROR doesn't get set and so for every new architecture one
     has to evaluate again, whether WERROR needs to be set or not.
     so i'm inclined to agree with dim@ that we should not add architecture
     specific conditions -- however i think adding compiler specific
     conditions is a good idea.
  
  2) the problem with setting WARNS= or specific -Wno-X or -Wno-error=X is
     that especially GCC doesn't have specific -WX flags for certain warnings.
     some warnings are implied by -Wall and cannot be turned off separately.
     so in order to get rid of these warnings (which are being handled as
     errors), we would need to disable -Wall. and i think setting WERROR= in
     order to handle all warnings for specific code as warnings rather than as
     errors is the better solution.
  
  i've reworked the patch to only remove WERROR=, where it is not needed
  anymore for any supported arch, or where it can be wrapped in a compiler
  condition.
 
 It seems to me that the removal of unnecessary WERROR= needed no
 discussion since day one so why don't you go ahead and commit it.

anybody is free to commit this part, since i don't own a commit bit. ;)

 
 I don't understand the comment on issue #1 above. There is a minuscule
 (six, before your patch ?)
 number of Makefiles with WERROR= . If you make the assignment
 architecture-specific, the worst it can happen is that the variable
 is not cleared, and if the build breaks, all you need is to
 add the extra architecture in these few places.

good point. basically the question with WERROR is: should it be a big hammer
to disable turning warnings into errors for all archs or do we want to set
WERROR in a more specific manner, where it's absolutely necessary.

cheers.
alex

 
 cheers
 luigi
 
  cheers.
  alex
  
   
- Philip
   
   -- 
   Philip Paeps
   Senior Reality Engineer
   Ministry of Information
 

Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Doug Barton
On 12/27/2011 03:48, Lawrence Stewart wrote:
 On the topic of Doug's actual question, I see minimal sense in
 resurrecting sysinstall in head now. I would suggest it be done much
 closer to (say, 6 months before) the 10.0 release cycle, if no suitable
 post-installation configuration tool has materialised.

My concern about that approach is that 9.0 hasn't even been released yet
and we've already seen changes that are going to make it hard to
resurrect sysinstall if that's the decision we come to. Waiting another
year or 2 would make it impossible.


Doug

-- 

You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Gavin Atkinson
On Tue, 27 Dec 2011, Ron McDowell wrote:
 As a related question, is there a good primer somewhere about how to use SVN?
 I'm using csup at present.

-  Install the subversion port

-  Download the source.  To get HEAD code:

   svn co svn://svn.freebsd.org/base/head/

   or to get 9-stable code:

   svn co svn://svn.freebsd.org/base/stable/9

  (If you want to check it out into a different directory, append the dir 
   name,  for example: svn co svn://svn.freebsd.org/base/head/ src)

-  Make your changes :)

-  To get a diff of your changes, you can just use svn diff
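
The steps above can be strung together in a small script. The following is a dry-run sketch: the helper names run and checkout, the DRY_RUN switch, and the target directories src and src-9 are all made up for the example. By default it only prints the svn commands; set DRY_RUN=0 to actually execute them.

```shell
#!/bin/sh
# Base URL taken from the steps above.
SVN_BASE="svn://svn.freebsd.org/base"

# run CMD...: print the command when DRY_RUN=1 (the default here),
# otherwise execute it. Illustrative helper, not part of svn.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# checkout BRANCH DIR: check out e.g. "head" or "stable/9" into DIR.
checkout() {
    run svn co "$SVN_BASE/$1" "$2"
}

checkout head src
checkout stable/9 src-9
# After editing files under src/, review the pending changes with:
run svn diff src
```

Redirecting the output of svn diff into a file gives you a patch you can attach to a mail or a bug report.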

Gavin


Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Warner Losh

On Dec 26, 2011, at 6:04 PM, Philip Paeps wrote:

 On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
 i grep'ed through src/sys and found several places where WERROR= was set in
 order to get rid of the default -Werror setting. i tried to remove those
 WERROR= overrides from any Makefile, where doing so did not break tinderbox.
 
 in those cases, where it couldn't be completely removed, i added conditions
 to only set WERROR= for the particular architecture or compiler, where
 tinderbox did not succeed without the WERROR=.
 
 Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says this
 code has bugs, it breaks tinderbox whereas WARNS=x says this code has the
 following kind of bugs which break tinderbox.

Agreed...

 Possibly wrapped in an architecture-test where appropriate.

Not so much...  When you make architecture-specific tests, experience has shown 
that we don't fix bugs and they languish for a long time.  Many times, these 
warnings are real.  Sadly, we've found no way to tag the ones that aren't real 
yet as safe to ignore...

Warner



SU+J systems do not fsck themselves

2011-12-27 Thread David Thiel
I've had multiple machines now (9.0-RC3, amd64, i386 and earlier 
9-CURRENT on ppc) running SU+J that have had unexplained panics and 
crashes start happening relating to disk I/O. When I end up running a 
full fsck, it keeps turning out that the disk is dirty and corrupted, 
but no mechanism is in place with SU+J to detect and fix this. A bgfsck 
never happens, but a manual fsck in single-user does indeed fix the 
crashing and weird behavior. Others have tested their SU+J volumes and 
found them to have errors as well. This makes me super nervous.

Basically, the way SU+J seems to operate is this:

http://redundancy.redundancy.org/fscklog2

Oh hey, I see you shut down uncleanly, let's check everything looks 
good, off you go, whee

Until I actually go and fsck, when I get:

http://redundancy.redundancy.org/fscklog1

So, I understand that journalling doesn't replace the need for a 
potential fsck (though I never had this problem with gjournal), but 
without a way for the system to detect that a fsck is necessary, this 
seems pretty much a guaranteed recipe for data corruption, and seems to 
offer little to no benefit over plain SU+fsck, or even just mounting 
async.

So: is everyone else seeing this? Am I misunderstanding how SU+J should 
be used? How should the error resolution process really happen? 

Thanks,
David


fetch reading one char at a time

2011-12-27 Thread Julian Elischer

I noted the following behaviour from fetch today.. I am actually hunting
another problem so I'm just posting it here in case anyone recognises it
and knows where to fix it...



   d
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte

 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   s
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   u
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   c
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   c
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   e
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   s
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   s
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   f
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   u
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   l
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   .
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
   \r
 48885 fetch    RET   read 1
 48885 fetch    CALL  gettimeofday(0x7fffcda0,0)
 48885 fetch    RET   gettimeofday 0
 48885 fetch    CALL  read(0x3,0x7fffce0f,0x1)
 48885 fetch    GIO   fd 3 read 1 byte
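
To put the cost in the trace above into numbers, here is a minimal sketch that counts read calls for a one-byte-per-call loop versus a buffered reader. It is not fetch's code: the pipe, the 4096-byte buffer size, the function names and the sample payload are all assumptions made for illustration.

```python
import os

def read_line_unbuffered(fd):
    """Read one line using one os.read() call per byte, as in the trace."""
    calls = 0
    line = bytearray()
    while True:
        chunk = os.read(fd, 1)
        calls += 1
        if not chunk:          # EOF
            break
        line += chunk
        if chunk == b"\n":
            break
    return bytes(line), calls

def read_line_buffered(fd, bufsize=4096):
    """Read one line by pulling up to bufsize bytes per os.read() call.

    A real reader would keep any bytes after the newline for the next
    line; this sketch discards them for brevity.
    """
    calls = 0
    buf = bytearray()
    while b"\n" not in buf:
        chunk = os.read(fd, bufsize)
        calls += 1
        if not chunk:          # EOF before a newline arrived
            break
        buf += chunk
    end = buf.find(b"\n")
    line = bytes(buf) if end == -1 else bytes(buf[:end + 1])
    return line, calls

def count_reads(payload):
    """Feed payload through a pipe to each reader; return {name: (line, calls)}."""
    results = {}
    for name, reader in (("unbuffered", read_line_unbuffered),
                         ("buffered", read_line_buffered)):
        rfd, wfd = os.pipe()
        os.write(wfd, payload)
        os.close(wfd)
        results[name] = reader(rfd)
        os.close(rfd)
    return results

if __name__ == "__main__":
    # Hypothetical payload; the real server reply is not shown in the trace.
    for name, (line, calls) in count_reads(b"command successful.\r\n").items():
        print(name, calls, line)
```

With the unbuffered reader the number of read calls equals the number of bytes in the line, which matches the pattern in the trace; the buffered reader needs one call for the same short line.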


Re: SU+J systems do not fsck themselves

2011-12-27 Thread Xin LI
On Tue, Dec 27, 2011 at 1:53 PM, David Thiel
l...@redundancy.redundancy.org wrote:
 I've had multiple machines now (9.0-RC3, amd64, i386 and earlier
 9-CURRENT on ppc) running SU+J that have had unexplained panics and
 crashes start happening relating to disk I/O. When I end up running a
 full fsck, it keeps turning out that the disk is dirty and corrupted,
 but no mechanism is in place with SU+J to detect and fix this. A bgfsck
 never happens, but a manual fsck in single-user does indeed fix the
 crashing and weird behavior. Others have tested their SU+J volumes and
 found them to have errors as well. This makes me super nervous.

 Basically, the way SU+J seems to operate is this:

 http://redundancy.redundancy.org/fscklog2

 Oh hey, I see you shut down uncleanly, let's check everything looks
 good, off you go, whee

 Until I actually go and fsck, when I get:

 http://redundancy.redundancy.org/fscklog1

 So, I understand that journalling doesn't replace the need for a
 potential fsck (though I never had this problem with gjournal), but
 without a way for the system to detect that a fsck is necessary, this
 seems pretty much a guaranteed recipe for data corruption, and seems to
 offer little to no benefit over plain SU+fsck, or even just mounting
 async.

 So: is everyone else seeing this? Am I misunderstanding how SU+J should
 be used? How should the error resolution process really happen?

I'm not sure your experiments are right here. The second log shows
you're running it read-only, which is likely caused by running it on
a live file system.  What I would suggest is:

 - Reset the system while it's running;
 - Boot into single user mode;
 - Use 'dd' to copy the disk to an image file;
 - Boot the system normally and:
   - use mdconfig -a -t vnode -f on a copy of the image;
   - use journalled fsck;
   - use normal fsck to check if the journalled fsck did the right thing.

This would rule out changes introduced after mounting, etc.  I
personally did not hit problems a few months ago, but I haven't re-tested
recently.
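
The steps above can be written down as a command list. This is only a sketch: it builds the argument lists and executes nothing. The device name, image paths and md unit are hypothetical placeholders, and the fsck_ffs flags (-p for the preen run that is expected to use the journal, -f -n for a forced read-only full check) are assumptions that should be checked against fsck_ffs(8) on the release in use.

```python
def image_check_commands(disk="/dev/ada0p2", image="/tmp/fs.img",
                         copy="/tmp/fs-copy.img", md="/dev/md0"):
    """Return argv lists for the suggested procedure; nothing is executed.

    All default names are hypothetical. mdconfig prints the md unit it
    attached, so `md` has to be adjusted to whatever it reports.
    """
    return [
        # In single-user mode, right after the reset: capture the disk.
        ["dd", f"if={disk}", f"of={image}", "bs=1m"],
        # After a normal boot: work on a copy so the capture stays pristine.
        ["cp", image, copy],
        ["mdconfig", "-a", "-t", "vnode", "-f", copy],
        # Journalled check (assumed: preen mode uses the SU+J journal).
        ["fsck_ffs", "-p", md],
        # Full check without writing, to see what the journalled run missed.
        ["fsck_ffs", "-f", "-n", md],
    ]

if __name__ == "__main__":
    for argv in image_check_commands():
        print(" ".join(argv))
```

If the final full check still reports errors on the image, the journalled run did not leave the file system consistent, and the two outputs can be attached to a bug report.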

Cheers,
-- 
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die


Re: SU+J systems do not fsck themselves

2011-12-27 Thread David Thiel
On Tue, Dec 27, 2011 at 02:29:03PM -0800, Xin LI wrote:
 I'm not sure if your experiments are right here, the second log shows
 you're running it read-only, which is likely caused by running it on
 live file system.  

Yes, this most recent instance is me running it on a live FS, because 
I'm using that machine to type this right now. :) However, I've had the 
issues fixed in single-user on other systems and had the problems go 
away. At least for a bit.

 - use journalled fsck;
 - use normal fsck to check if the journalled fsck did the right thing.

When you say use journalled fsck, what's the proper way to initiate 
that? I don't see any journal-related options in the man page.


Re: SU+J systems do not fsck themselves

2011-12-27 Thread Xin Li

On 12/27/11 14:36, David Thiel wrote:
 On Tue, Dec 27, 2011 at 02:29:03PM -0800, Xin LI wrote:
 I'm not sure if your experiments are right here, the second log
 shows you're running it read-only, which is likely caused by
 running it on live file system.
 
 Yes, this most recent instance is me running it on a live FS,
 because I'm using that machine to type this right now. :) However,
 I've had the issues fixed in single-user on other systems and had
 the problems go away. At least for a bit.
 
 - use journalled fsck; - use normal fsck to check if the
 journalled fsck did the right thing.
 
 When you say use journalled fsck, what's the proper way to
 initiate that? I don't see any journal-related options in the man
 page.

fsck -p perhaps?  IIRC fsck_ufs(8) will use the journal if it's
available and up to date.

Cheers,
-- 
Xin LI delp...@delphij.net https://www.delphij.net/
FreeBSD - The Power to Serve! Live free or die


Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Alexander Best
On Tue Dec 27 11, Warner Losh wrote:
 
 On Dec 26, 2011, at 6:04 PM, Philip Paeps wrote:
 
  On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:
  i grep'ed through src/sys and found several places where WERROR= was set in
  order to get rid of the default -Werror setting. i tried to remove those
  WERROR= overrides from any Makefile, where doing so did not break tinderbox.
  
  in those cases, where it couldn't be completely removed, i added conditions to
  only set WERROR= for the particular architecture or compiler, where tinderbox
  did not succeed without the WERROR=.
  
  Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says "this
  code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the
  following kind of bugs which break tinderbox".
 
 Agreed...

in this case it would have to be WARNS=1 then, because anything > 1 will enable
-Wall, which is the warning that breaks sys/modules/ie.

cheers.
alex

 
  Possibly wrapped in an architecture-test where appropriate.
 
 Not so much...  When you make architecture-specific tests, experience has 
 shown that we don't fix bugs and they languish for a long time.  Many times, 
 these warnings are real.  Sadly, we've found no way to tag the ones that 
 aren't real yet as safe to ignore...
 
 Warner
 


Re: [rfc] removing/conditionalising WERROR= in Makefiles

2011-12-27 Thread Dimitry Andric

On 2011-12-27 02:04, Philip Paeps wrote:

On 2011-12-26 10:10:40 (+), Alexander Best arun...@freebsd.org wrote:

i grep'ed through src/sys and found several places where WERROR= was set in
order to get rid of the default -Werror setting. i tried to remove those
WERROR= overrides from any Makefile, where doing so did not break tinderbox.

in those cases, where it couldn't be completely removed, i added conditions to
only set WERROR= for the particular architecture or compiler, where tinderbox
did not succeed without the WERROR=.


Wouldn't it be better to set WARNS=x rather than WERROR=?  WERROR= says "this
code has bugs, it breaks tinderbox" whereas WARNS=x says "this code has the
following kind of bugs which break tinderbox".


In my opinion, WERROR= says: there are warnings in this code which
cannot be fixed right now, due to varying reasons, but we don't want to
muffle them entirely, so somebody will eventually fix them in the future
(or just delete the code, if it is unmaintained, or unmaintainable, like
nve).

If you set WARNS to a low level, you can be sure nobody ever sees the
warnings, and they will never be fixed.  That may be appropriate in some
cases, but not the ones I just added WERROR= to.  Those are just crufty
drivers, that nobody wants to burn their fingers on. :)


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Lawrence Stewart

On 12/28/11 06:29, Doug Barton wrote:

On 12/27/2011 03:48, Lawrence Stewart wrote:

On the topic of Doug's actual question, I see minimal sense in
resurrecting sysinstall in head now. I would suggest it be done much
closer to (say, 6 months before) the 10.0 release cycle, if no suitable
post-installation configuration tool has materialised.


My concern about that approach is that 9.0 hasn't even been released yet
and we've already seen changes that are going to make it hard to
resurrect sysinstall if that's the decision we come to. Waiting another
year or 2 would make it impossible.


Which changes are you referring to? I would have thought a reverse merge 
to undo the deletion of the sysinstall and old libdialog sources would 
be very minimal work. We'd also probably need a few extra build system 
changes to make sure old libdialog is perhaps statically compiled into 
sysinstall as it would be the only in-tree consumer, but that's not hard 
either. I may be lacking some imagination, but don't really see why it 
would become harder the longer we wait.


Cheers,
Lawrence


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Doug Barton
On 12/27/2011 18:32, Lawrence Stewart wrote:
 On 12/28/11 06:29, Doug Barton wrote:
 On 12/27/2011 03:48, Lawrence Stewart wrote:
 On the topic of Doug's actual question, I see minimal sense in
 resurrecting sysinstall in head now. I would suggest it be done much
 closer to (say, 6 months before) the 10.0 release cycle, if no suitable
 post-installation configuration tool has materialised.

 My concern about that approach is that 9.0 hasn't even been released yet
 and we've already seen changes that are going to make it hard to
 resurrect sysinstall if that's the decision we come to. Waiting another
 year or 2 would make it impossible.
 
 Which changes are you referring to? I would have thought a reverse merge
 to undo the deletion of the sysinstall and old libdialog sources would
 be very minimal work.

Then I admire your mad skillz, because it sounds like a lot of work to
me. :)

 We'd also probably need a few extra build system
 changes to make sure old libdialog is perhaps statically compiled into
 sysinstall as it would be the only in-tree consumer, but that's not hard
 either. I may be lacking some imagination, but don't really see why it
 would become harder the longer we wait.

My concern is that it's going to get worse as time goes along. Without
sysinstall in the base people are going to feel free to make changes to
things that sysinstall depends on (as they have already), and waiting a
year or 2 to resurrect it will cause that problem to grow exponentially.


Doug

-- 

You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: SU+J systems do not fsck themselves

2011-12-27 Thread David Thiel
On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
  - use journalled fsck; - use normal fsck to check if the
  journalled fsck did the right thing.

Ok, here is the log of fsck with and without journal.

http://redundancy.redundancy.org/fscklog3

That was done the very next boot, after a clean shutdown. The errors
from the previous live fsck aren't there (oddly), but there are still
apparently some corrections made. The next fsck still complains, but
doesn't give any salvage prompts.

Here is jsa@'s, done on a live FS with SU+J:

http://redundancy.redundancy.org/fscklog4

I'm not actually looking to solve my particular problem per se. The 
issue is that almost everyone I've checked with that's running SU+J gets 
unref'd file and other errors when they check their filesystem (with the 
fs live). Unless I'm missing something, a running FS should never have 
those kinds of errors unless you deliberately disabled fsck.

This leaves only a couple options:

- SU+J and fsck do not work correctly together to fix corruption on 
  boot, i.e. bgfsck isn't getting run when it should
- Stuff is getting completely screwed up after boot
- fsck is giving incorrect results
- I'm completely clueless about how SU+J is supposed to behave or be 
  deployed

I'm pretty certain that the first is the issue here. It would be great 
if others could check their own SU+J filesystems so we could get a few 
more data points.



Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Adrian Chadd
Hi,

Why not just list the things that sysinstall did that people like, and
extract out / reimplement those bits?

No one's going to complain if you write, say, a stand-alone package
browser, or a stand-alone GUI upgrade tool, or a stand-alone
configuration program, etc.



Adrian


Re: Removal of sysinstall from HEAD and lack of a post-install configuration tool

2011-12-27 Thread Doug Barton
On 12/27/2011 22:08, Adrian Chadd wrote:
 Hi,
 
 Why not just list the things that sysinstall did that people like, and
 extract out / reimplement those bits?

That sounds great. As soon as that's done, we can remove sysinstall
from the base. Until those things exist, removing it is premature.

-- 

You can observe a lot just by watching. -- Yogi Berra

Breadth of IT experience, and depth of knowledge in the DNS.
Yours for the right price.  :)  http://SupersetSolutions.com/



Re: SU+J systems do not fsck themselves

2011-12-27 Thread Scott Long

On Dec 27, 2011, at 10:14 PM, David Thiel wrote:

 On Tue, Dec 27, 2011 at 02:48:22PM -0800, Xin Li wrote:
 - use journalled fsck; - use normal fsck to check if the
 journalled fsck did the right thing.
 
 Ok, here is the log of fsck with and without journal.
 
 http://redundancy.redundancy.org/fscklog3
 

The first run of fsck, using the journal, gives results that I would expect.  
The second run seems to imply that the fixes made on the first run didn't 
actually get written to disk.  This is definitely an oddity.  I see that you're 
using geli, maybe there's some strange side-effect there.  No idea.  Report as 
a bug, this is definitely undesired behavior.

 That was done the very next boot, after a clean shutdown. The errors
 from the previous live fsck aren't there (oddly), but there are still
 apparently some corrections made. The next fsck still complains, but
 doesn't give any salvage prompts.
 
 Here is jsa@'s, done on a live FS with SU+J:
 
 http://redundancy.redundancy.org/fscklog4
 

For the love that is all good and holy, don't ever run fsck on a live 
filesystem.  It's going to report these kinds of problems!  It's normal; 
filesystem metadata updates stay cached in memory, and fsck bypasses that 
cache.  Also, what you see in your log is a file that has been unlinked but 
held open.  This is a common Unix idiom, and one that gets cleaned up by fsck 
on reboot, whether through the SUJ intent log processing or through a 
traditional fsck.
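
For anyone unfamiliar with the unlinked-but-held-open idiom mentioned above, here is a minimal sketch of it on a POSIX system. The helper name and the scratch data are made up for illustration; the point is only that the name disappears while the open descriptor keeps the inode and its blocks alive.

```python
import os
import tempfile

def unlink_while_open():
    """Create a file, unlink it while it is open, and keep using the fd.

    Returns (path_still_exists, link_count, data_read_back).
    """
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"scratch data")
        os.unlink(path)                    # the name is gone from the namespace
        exists = os.path.exists(path)
        nlink = os.fstat(fd).st_nlink      # typically 0 on a local filesystem
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd, 64)             # the contents are still readable
    finally:
        os.close(fd)                       # only now can the blocks be freed
    return exists, nlink, data

if __name__ == "__main__":
    print(unlink_while_open())
```

If the machine loses power between the unlink and the close, the inode is still allocated on disk with no directory entry pointing at it. That is the state a live fsck reports as an unreferenced file, and the state fsck cleans up on the next boot.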

 I'm not actually looking to solve my particular problem per se. The 
 issue is that almost everyone I've checked with that's running SU+J gets 
 unref'd file and other errors when they check their filesystem (with the 
 fs live). Unless I'm missing something, a running FS should never have 
 those kinds of errors unless you deliberately disabled fsck.
 

Nope, you are completely incorrect here.

 This leaves only a couple options:
 
 - SU+J and fsck do not work correctly together to fix corruption on 
 boot, i.e. bgfsck isn't getting run when it should

The point of SUJ is to eliminate the need for bgfsck.  Effectively, they are 
exclusive ideas.  It's possible that there are still problems with SUJ and how 
fsck processes and commits the journal entries.  However, bgfsck has nothing to 
do with this, and I'd also like to know if your use of geli is complicating the 
problem.

 - Stuff is getting completely screwed up after boot

Possibly but unlikely

 - fsck is giving incorrect results

Very unlikely

 - I'm completely clueless about how SU+J is supposed to behave or be 
 deployed

No comment =-)

 
 I'm pretty certain that the first is the issue here. It would be great 
 if others could check their own SU+J filesystems so we could get a few 
 more data points.
 

Indeed, more data is needed.

Scott



Re: SU+J systems do not fsck themselves

2011-12-27 Thread David Thiel
On Tue, Dec 27, 2011 at 11:54:20PM -0700, Scott Long wrote:
 The first run of fsck, using the journal, gives results that I would 
 expect.  The second run seems to imply that the fixes made on the 
 first run didn't actually get written to disk.  This is definitely an 
 oddity.  I see that you're using geli, maybe there's some strange 
 side-effect there.  No idea.  Report as a bug, this is definitely 
 undesired behavior.

Not impossible, but I was seeing similar issues on two non-geli systems 
as well, i.e. tons of errors fixed when doing a single-user 
non-journalled fsck, but journalled fsck not fixing stuff. I'll try to 
replicate on a test machine, as I already lost data on the last 
(non-geli) machine this happened to.

 For the love that is all good and holy, don't ever run fsck on a live 
 filesystem.  It's going to report these kinds of problems!  It's 
 normal; filesystem metadata updates stay cached in memory, and fsck 
 bypasses that cache.  

Ok. I expected fsck would be softupdate-aware in that way, but I 
understand it not doing so.

  - SU+J and fsck do not work correctly together to fix corruption on 
  boot, i.e. bgfsck isn't getting run when it should
 
 The point of SUJ is to eliminate the need for bgfsck.  Effectively, 
 they are exclusive ideas.  

This is surprising to me. It is my impression that under Linux at least, 
ext3fs is checked against the journal, and gets a full e2fsck if it 
finds it's still dirty. Additionally, there's a periodic fsck after 180 
days continuous runtime or x number of mounts (see tune2fs -i and -c).  
Is SU+J somehow implemented in such a way that this is unnecessary? What 
does it do that the ext3fs people have missed?



Re: SU+J systems do not fsck themselves

2011-12-27 Thread Scott Long

On Dec 28, 2011, at 12:34 AM, David Thiel wrote:

 On Tue, Dec 27, 2011 at 11:54:20PM -0700, Scott Long wrote:
 The first run of fsck, using the journal, gives results that I would 
 expect.  The second run seems to imply that the fixes made on the 
 first run didn't actually get written to disk.  This is definitely an 
 oddity.  I see that you're using geli, maybe there's some strange 
 side-effect there.  No idea.  Report as a bug, this is definitely 
 undesired behavior.
 
 Not impossible, but I was seeing similar issues on two non-geli systems 
 as well, i.e. tons of errors fixed when doing a single-user 
 non-journalled fsck, but journalled fsck not fixing stuff. I'll try to 
 replicate on a test machine, as I already lost data on the last 
 (non-geli) machine this happened to.
 
 For the love that is all good and holy, don't ever run fsck on a live 
 filesystem.  It's going to report these kinds of problems!  It's 
 normal; filesystem metadata updates stay cached in memory, and fsck 
 bypasses that cache.  
 
 Ok. I expected fsck would be softupdate-aware in that way, but I 
 understand it not doing so.
 
 - SU+J and fsck do not work correctly together to fix corruption on 
 boot, i.e. bgfsck isn't getting run when it should
 
 The point of SUJ is to eliminate the need for bgfsck.  Effectively, 
 they are exclusive ideas.  
 
 This is surprising to me. It is my impression that under Linux at least, 
 ext3fs is checked against the journal, and gets a full e2fsck if it 
 finds it's still dirty. Additionally, there's a periodic fsck after 180 
 days continuous runtime or x number of mounts (see tune2fs -i and -c).  
 Is SU+J somehow implemented in such a way that this is unnecessary? What 
 does it do that the ext3fs people have missed?
 

SUJ isn't like ext3 journaling, it doesn't do 100% metadata logging.  Instead, 
it's an extension of softupdates.  Softupdates (SU) is still responsible for 
ordering dependent writes to the disk to maintain consistency.  What SU can't 
handle is the Unix/POSIX idiom of unlinking a file from the namespace but 
keeping its inode active through refcounts.  When you have an unclean shutdown, 
you wind up with stale blocks allocated to orphaned inodes.  The point of 
bgfsck was to scan the filesystem for these allocations and free them, just 
like fsck does, but to do it in the background so that the boot could continue.
SUJ is basically just an intent log for this case; it tells fsck where to find 
these allocations so that fsck doesn't have to do the lengthy scan.  FWIW, this 
problem is present in most any journaling implementation and is usually solved 
via the use of intent records in a journal, not unlike SUJ.
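
A toy model may help show why the intent log saves work. This is not the UFS on-disk format or the real SUJ record layout; the class, its methods and the numbers are invented purely to contrast a full scan with a journal-guided cleanup.

```python
class ToyFS:
    """Toy model: inodes with link counts plus a log of unlink intents."""

    def __init__(self):
        self.links = {}     # inode number -> link count
        self.journal = []   # inodes whose last name was removed while open
        self._next = 1

    def create(self):
        ino = self._next
        self._next += 1
        self.links[ino] = 1
        return ino

    def unlink_while_open(self, ino):
        # Record the intent first, then drop the name. The inode stays
        # allocated because some process still holds it open.
        self.journal.append(ino)
        self.links[ino] = 0

    def recover_full_scan(self):
        """Traditional approach: look at every inode to find orphans."""
        examined = len(self.links)
        orphans = [ino for ino, n in self.links.items() if n == 0]
        for ino in orphans:
            del self.links[ino]
        self.journal.clear()
        return examined, len(orphans)

    def recover_from_journal(self):
        """Journal-guided approach: look only at the logged inodes."""
        examined = freed = 0
        for ino in self.journal:
            examined += 1
            if self.links.get(ino) == 0:
                del self.links[ino]
                freed += 1
        self.journal.clear()
        return examined, freed

def build(n_files=1000, orphaned=(10, 20, 30)):
    """Build a toy filesystem that 'crashed' with a few orphaned inodes."""
    fs = ToyFS()
    for _ in range(n_files):
        fs.create()
    for ino in orphaned:
        fs.unlink_while_open(ino)
    return fs

if __name__ == "__main__":
    print("journal:  ", build().recover_from_journal())
    print("full scan:", build().recover_full_scan())
```

Both recoveries free the same three orphans, but the journal-guided one examines three inodes where the full scan examines all thousand. The model also shows the assumption being questioned in this thread: the journal only helps if everything else on disk is already consistent.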

So, there's an assumption with SUJ+fsck that SU is keeping the filesystem 
consistent.  Maybe that's a bad assumption, and I'm not trying to discredit 
your report.  But the intention with SUJ is to eliminate the need for anything 
more than a cursory check of the superblocks and a processing of the SUJ intent 
log.  If either of these fails then fsck reverts to a traditional scan.  In the 
same vein, ext3 and most other traditional journaling filesystems assume that 
the journal is correct and is preserving consistency, and don't do anything 
more than a cursory data structure scan and journal replay as well, but then 
revert to a full scan if that fails (zfs seems to be an exception here, with 
there being no actual fsck available for it).
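
The fallback described above can be sketched as a small decision function. The names here (JournalError, check_filesystem and the two callbacks) are hypothetical and only model the control flow; they are not taken from the fsck sources.

```python
class JournalError(Exception):
    """Raised by the hypothetical journal replay when the log is unusable."""

def check_filesystem(superblock_ok, replay_journal, full_scan):
    """Try the cheap path first; fall back to a traditional scan on failure.

    superblock_ok:  result of the cursory superblock check
    replay_journal: callable that processes the intent log, may raise
                    JournalError
    full_scan:      callable that performs the traditional full check
    Returns the name of the path that was taken.
    """
    if superblock_ok:
        try:
            replay_journal()
            return "journal"
        except JournalError:
            pass  # journal unusable, fall through to the full scan
    full_scan()
    return "full scan"

if __name__ == "__main__":
    def broken_replay():
        raise JournalError("journal is stale")

    print(check_filesystem(True, lambda: None, lambda: None))
    print(check_filesystem(True, broken_replay, lambda: None))
    print(check_filesystem(False, lambda: None, lambda: None))
```

Note what the sketch does not cover: a replay that completes without error but leaves damage behind never triggers the fallback, which is the scenario David's report suggests.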

As for the 180 day forced scan on ext3, I have no public comment.  SU has 
matured nicely over the last 10+ years, and I'm happy with the progress that 
SUJ has made in the last 2-3 years.  If there are bugs, they need to be exposed 
and addressed ASAP.

Scott
