question about rebuilding a gmirror on 6.1

2006-09-12 Thread George Hartzell

A friend of mine decided to take me up on my offer to help him set up
and run a freebsd-stable system (he's a photographer who's only ever
used shell accounts onto linux systems before).

We set up a gmirror using Approach 2 from
http://people.freebsd.org/~rse/mirror.

It's been up and running for a while and I just noticed that it was
running in DEGRADED mode, having failed ad4.

I had him reboot it with either ad4 or ad6 attached, and both seem to
work individually.  The data on ad4 looks older, which makes sense
because it hasn't been updated.

Since we're not sure why it was failed in the first place and it seems
to work, we're going to hook it up and try again.

I'm nervous about whether the system will sync the newer ad6s1 data
onto ad4s1 (what I'd like to happen) or sync the ad4s1 data onto the
ad6s1 (which would suck).

Our solution is to boot off a CD with only ad4 hooked up, use "gmirror
clear ad4s1" to blow away the metadata on ad4s1, then reboot with both
drives and do "gmirror forget gm0s1" and "gmirror insert gm0s1 ad4s1".
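
As a rough sketch of that sequence (the names gm0s1 and ad4s1 are the
ones used above; the run wrapper, the two function names and the
DRY_RUN switch are made up for this example and only print the
commands unless DRY_RUN is unset):

```shell
#!/bin/sh
# Sketch only. With DRY_RUN=1 each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1 (booted from CD, only the stale disk attached):
# wipe the gmirror metadata from the stale component.
clear_stale() {
    run gmirror clear "$1"
}

# Step 2 (booted normally, both disks attached):
# drop the missing component from the mirror, re-add the wiped
# disk as a new component, then show the mirror state.
reinsert() {
    run gmirror forget "$1" &&
    run gmirror insert "$1" "$2" &&
    run gmirror status "$1"
}

DRY_RUN=1
clear_stale ad4s1
reinsert gm0s1 ad4s1
```

Because ad4s1 no longer carries any mirror metadata when it is
inserted, it should be treated as a fresh component and filled from
the surviving disk.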

Am I being overly paranoid about the direction in which it'll sync the
data?

Is there a better way that I should have handled this (other than
forcing a practice run back when we got started...)?

Thanks,

g.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: Anyone??? (was Reproducible data corruption on 6.1-Stable)

2006-09-12 Thread George Hartzell
Jonathan Stewart writes:
  [...]
  I set up a new server recently and transferred all the information from
  my old server over.  I tried to use unison to synchronize the backup of
  pictures I have taken and noticed that a large number of pictures were
  marked as changed on the server.  After checking the pictures by hand I
  confirmed that many of the pictures on the server were corrupted.  I
  attempted to use unison to update the files on the server with the
  correct local copies but it would fail on almost all the files with the
  message "destination updated during synchronization".
  
  It appears the corruption happens during the read process because when I
  recompare the files in a graphical diff tool between cache flushes the
  differences move around!?!?!?  The differences also appear to be very
  small for the most part, single bytes scattered throughout the file.  I
  really have no idea what is causing the problem and would like to pin it
  down so I can either replace hardware if it's bad or fix whatever the
  bug is.
  [...]

It might be a memory problem.  I had a linux server that was serving a
subversion repository, plus some web stuff.  I added some additional
memory to keep it from wheezing and it seemed to be running fine.  We
started noticing problems with things that had been checked out of the
repository (e.g. binary tarballs).  Removing the extra memory made
things work again.

memtest86 didn't find anything wrong, which I gather isn't that
unusual in these situations.

Then again, your problem might be something else entirely


g.


Re: IBM T42 freezes when going to sleep under X11

2005-12-02 Thread George Hartzell
Jacques Garrigue writes:
  From: Jacques Garrigue [EMAIL PROTECTED]
   I've got a strange problem with my IBM T42 / Radeon M10 setup.
   
   When using the 6.0-RELEASE kernel (including GENERIC), I cannot go to
   sleep when X11 is running: the machine freezes, display still on. I
   tried disabling DRI, but this does not seem to be the problem: I have
   no DRM anyway.
   
   On the other hand, everything works fine with a 6.0-RC1 kernel.
   Was there a big change in between, such that I need to change my
   configuration?
  
  I finally found the cause of my problems: there have been changes in
  the em driver (Gb ethernet), such that the machine freezes when trying
  to switch automatically from the X11 VT to the system console, before
  going to sleep. The interaction is surprising, but clearly the problem
  disappears when I remove device em from the kernel configuration,
  and it reappears when I do kldload if_em. Since I'm using only ath
  (wireless) anyway, this is fine with me...
  
  A previous partial solution suggested to me was to add
hw.syscons.sc_no_suspend_vtswitch=1
  to sysctl.conf, but this means the screen gets garbled and I have to
  do the switch by hand anyway, which is a real pain.
  Worse still: the machine would still freeze when going to sleep while
  the disk is active.
  
  The last step is to track down the bug in em, as it still seems to
  be there in yesterday's STABLE.

I don't seem to have any problem with my T42p using a kernel compiled
on 11/29 11:21

My copy of if_em.c is:

/*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.8 2005/11/25 14:11:59 glebius Exp $*/

g.


Re: IBM T42 freezes when going to sleep under X11

2005-12-03 Thread George Hartzell
Gleb Smirnoff writes:
  On Fri, Dec 02, 2005 at 12:52:58PM -0800, George Hartzell wrote:
  G   I finally found the cause of my problems: there have been changes in
  G   the em driver (Gb ethernet), such that the machine freezes when trying
  G   to switch automatically from the X11 VT to the system console, before
  G   going to sleep. The interaction is surprising, but clearly the problem
  G   disappears when I remove device em from the kernel configuration,
  G   and it reappears when I do kldload if_em. Since I'm using only ath
  G   (wireless) anyway, this is fine with me...
  G   
  G   A previous partial solution suggested to me was to add
  G hw.syscons.sc_no_suspend_vtswitch=1
  G   to sysctl.conf, but this means the screen gets garbled and I have to
  G   do the switch by hand anyway, which is a real pain.
  G   Worse still: the machine would still freeze when going to sleep while
  G   the disk is active.
  G   
  G   The last step is to track down the bug in em, as it still seems to
  G   be there in yesterday's STABLE.
  G 
  G I don't seem to have any problem with my T42p using a kernel compiled
  G on 11/29 11:21
  G 
  G My copy of if_em.c is:
  G 
  G /*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.8 2005/11/25 14:11:59 glebius Exp $*/
  
  George, Jacques,
  
  what em(4) cards exactly do you have?
  
  pciconf -lv | grep -A4 ^em

(satchel)[10:35am]~pciconf -lv | grep -A4 ^em
[EMAIL PROTECTED]:1:0:   class=0x02 card=0x05491014 chip=0x101e8086 rev=0x03 hdr=0x00
vendor   = 'Intel Corporation'
device   = '82540EP Gigabit Ethernet Controller (Mobile)'
class= network
subclass = ethernet
(satchel)[10:36am]~

  
  Can you please try the attached patch?
  

I'll give it a try this weekend.

g.





Re: IBM T42 freezes when going to sleep under X11

2005-12-03 Thread George Hartzell
Gleb Smirnoff writes:
  On Fri, Dec 02, 2005 at 12:52:58PM -0800, George Hartzell wrote:
  G   I finally found the cause of my problems: there have been changes in
  G   the em driver (Gb ethernet), such that the machine freezes when trying
  G   to switch automatically from the X11 VT to the system console, before
  G   going to sleep. The interaction is surprising, but clearly the problem
  G   disappears when I remove device em from the kernel configuration,
  G   and it reappears when I do kldload if_em. Since I'm using only ath
  G   (wireless) anyway, this is fine with me...
  G   
  G   A previous partial solution suggested to me was to add
  G hw.syscons.sc_no_suspend_vtswitch=1
  G   to sysctl.conf, but this means the screen gets garbled and I have to
  G   do the switch by hand anyway, which is a real pain.
  G   Worse still: the machine would still freeze when going to sleep while
  G   the disk is active.
  G   
  G   The last step is to track down the bug in em, as it still seems to
  G   be there in yesterday's STABLE.
  G 
  G I don't seem to have any problem with my T42p using a kernel compiled
  G on 11/29 11:21
  G 
  G My copy of if_em.c is:
  G 
  G /*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.8 2005/11/25 14:11:59 glebius Exp $*/
  
  George, Jacques,
  
  what em(4) cards exactly do you have?
  
  pciconf -lv | grep -A4 ^em
  
  Can you please try the attached patch?
  
  -- 
  Totus tuus, Glebius.
  GLEBIUS-RIPN GLEB-RIPE
  Index: if_em.c
  ===
  RCS file: /home/ncvs/src/sys/dev/em/if_em.c,v
  retrieving revision 1.85
  diff -u -r1.85 if_em.c
  --- if_em.c  10 Nov 2005 11:44:37 -  1.85
  +++ if_em.c  11 Nov 2005 12:13:48 -
  @@ -129,8 +129,11 @@
   static int  em_attach(device_t);
   static int  em_detach(device_t);
   static int  em_shutdown(device_t);
  +static int  em_suspend(device_t);
  +static int  em_resume(device_t);
   static void em_intr(void *);
   static void em_start(struct ifnet *);
  +static void em_start_locked(struct ifnet *ifp);
   static int  em_ioctl(struct ifnet *, u_long, caddr_t);
   static void em_watchdog(struct ifnet *);
   static void em_init(void *);
  @@ -208,6 +211,8 @@
   DEVMETHOD(device_attach, em_attach),
   DEVMETHOD(device_detach, em_detach),
   DEVMETHOD(device_shutdown, em_shutdown),
  +DEVMETHOD(device_suspend, em_suspend),
  +DEVMETHOD(device_resume, em_resume),
   {0, 0}
   };
   
  @@ -580,6 +585,41 @@
   return(0);
   }
   
  +/*
  + * Suspend/resume device methods.
  + */
  +static int
  +em_suspend(device_t dev)
  +{
  +struct adapter *adapter = device_get_softc(dev);
  +
  +EM_LOCK(adapter);
  +em_stop(adapter);
  +EM_UNLOCK(adapter);
  +
  +return bus_generic_suspend(dev);
  +}
  +
  +static int
  +em_resume(device_t dev)
  +{
  +struct adapter *adapter = device_get_softc(dev);
  +struct ifnet *ifp;
  +
  +EM_LOCK(adapter);
  +ifp = adapter->ifp;
  +if (ifp->if_flags & IFF_UP) {
  +em_init_locked(adapter);
  +if (ifp->if_drv_flags & IFF_DRV_RUNNING)
  +em_start_locked(ifp);
  +}
  +
  +em_init_locked(adapter);
  +EM_UNLOCK(adapter);
  +
  +return bus_generic_resume(dev);
  +}
  +
   
   /*
*  Transmit entry point

I'll post details as a reply to earlier in the thread, but I have
started seeing crashes.  I don't suspect the em driver, I *do* suspect
synaptics support.  But I have more digging to do

With respect to this patch, it causes me a problem.

I have

  ifconfig_em0="DHCP NOAUTO"

in my /etc/rc.conf, so that the interface doesn't come up unless I ask
it to (usually via /etc/rc.d/netif start em0).

With this patch applied, even if I've never started it, the interface
gets started.  If I have a cable plugged in, it grabs a dhcp address
and takes off.

My devd.conf is stock.

g.





Re: IBM T42 freezes when going to sleep under X11

2005-12-03 Thread George Hartzell
Jacques Garrigue writes:
  From: George Hartzell [EMAIL PROTECTED]
  
   Jacques Garrigue writes:
 From: Jacques Garrigue [EMAIL PROTECTED]
  I've got a strange problem with my IBM T42 / Radeon M10 setup.
  
  When using the 6.0-RELEASE kernel (including GENERIC), I cannot go to
  sleep when X11 is running: the machine freezes, display still on. I
  tried disabling DRI, but this does not seem to be the problem: I have
  no DRM anyway.
  
  On the other hand, everything works fine with a 6.0-RC1 kernel.
  Was there a big change in between, such that I need to change my
  configuration?
 
 I finally found the cause of my problems: there have been changes in
 the em driver (Gb ethernet), such that the machine freezes when trying
 to switch automatically from the X11 VT to the system console, before
 going to sleep. The interaction is surprising, but clearly the problem
 disappears when I remove device em from the kernel configuration,
 and it reappears when I do kldload if_em. Since I'm using only ath
 (wireless) anyway, this is fine with me...
   
   I don't seem to have any problem with my T42p using a kernel compiled
   on 11/29 11:21
   
   My copy of if_em.c is:
   
   /*$FreeBSD: src/sys/dev/em/if_em.c,v 1.65.2.8 2005/11/25 14:11:59 glebius Exp $*/
  
  The very same version I could reproduce the bug with...
  
  I suppose the cause is a complex interaction.
  For instance it only appears under X11.
  So part of the reason might be the difference between the radeon M10
  and the FIRE GL T2. Or the fact I'm simultaneously using
  ath. Or anything else...
  My point was just that the direct trigger was a change in em between
  6.0-RC1 and 6.0-RELEASE. But if it cannot be reproduced on any other
  machine, this is going to be difficult to track down.
  [...]

I've been enabling more of the laptop-y features as I polish off my
upgrade (I'm going from 5.4BETAsomething to 6.0-STABLE, encouraged by
losing my hard drive and being handed the clean slate).

I *have* started getting occasional hangs after a suspend/resume
cycle.  It seems to be related to the mouse.  I get it with both X and
on the console.  I've disabled the psm flag that mega-kicks the device
(0x30?) and end up with a dead mouse.  Restarting the moused via
/etc/rc.d/moused gets my mouse back, but I'll frequently hang
immediately afterwards.

I've disabled synaptics support and am waiting to see if that makes a
difference, other than disabling my middle mouse button

g.


Re: HEADS UP: Release schedule for 2006

2005-12-16 Thread George Hartzell
Kevin Oberman writes:
  [...]
  No. There is no conflict between Cx states and EST. Cx states specifies
  how deeply the CPU will sleep when idle. EST controls processor speed
  and voltage. In most cases, you REALLY want to use both of these. They
  are very significant in saving power. (Of course, USB tends to limit the
  effectiveness of Cx states. I need to run without USB to get really good
  battery life and to make suspend (S3) really cut power drain.)

Can you expand a bit on that "Of course, USB..."?  What's the problem
with USB?  Can one just kldunload it before suspend?

g.


6.2-PRERELEASE mostly works on Sony PCG-Z505JE (APM problem).

2006-10-29 Thread George Hartzell

I installed 6.2-PRERELEASE on my trusty but slow Sony PCG-Z505JE and
things generally just worked (including an Atheros based pc-card).

Historically I've set the machine up to use APM, and suspend and
resume to either memory or a magically prepared disk partition worked
well.

With 6.2 I can't seem to get APM hooked up.  I've disabled ACPI by
unsetting ACPI_LOAD at the loader prompt and loaded apm, but when I
boot there aren't any apm messages in the dmesg output and /dev/apm
doesn't exist (which irritates apmd).

Things actually work surprisingly well with ACPI, including suspending
into S3 with acpiconf.  I'd be happy to run that way except that the
suspend key (Fn+Esc) doesn't work, so I have to

  sudo acpiconf -s 3

every time I want to suspend.

Does anyone have any idea how to either make the suspend key work or
get apm to behave?

Thanks,

g.

 


help identifying gmirror, ata, or motherboard problem (Tyan S2865G2NR)

2006-11-14 Thread George Hartzell

I'm having a problem with a machine that I support and would like some
feedback.

The system was built up from a barebones Transport PX22, which uses a
Tyan S2865G2NR motherboard.

It has two drives:

  ad4: 286188MB Maxtor 6V300F0 VA111680 at ata2-master SATA300
  ad6: 286188MB Maxtor 6V300F0 VA111630 at ata3-master SATA300

A while back I noticed in the daily periodic report that gmirror had
dropped ad4.  We rebooted and got things going again and it ran
smoothly for a month or so, then dropped it again.

At that point we did a warranty replacement of ad4 and things have
been running smoothly for a couple of months.

A few days ago gmirror kicked ad6 out of the raid, with the following
lines in dmesg:

  ad6: FAILURE - device detached
  subdisk6: detached
  ad6: detached
  GEOM_MIRROR: Device gm0s1: provider ad6s1 disconnected.
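
For spotting this sort of event without waiting for the daily report,
a small filter over the kernel messages can help. This is only a
sketch: the function name is invented here, and the pattern is based
solely on the GEOM_MIRROR line quoted above.

```shell
#!/bin/sh
# Read kernel log text on stdin and print "mirror provider" for every
# component that gmirror reports as disconnected.
dropped_components() {
    sed -n 's/^.*GEOM_MIRROR: Device \([^:]*\): provider \([^ ]*\) disconnected\.$/\1 \2/p'
}

# Typical use on a live system:
#   dmesg | dropped_components
```

Fed the four dmesg lines above, it prints "gm0s1 ad6s1".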

We're adding an external device into the mirror and are planning to do
a warranty swap on this drive too.

The system is running, but feels sluggish.  It might be interesting to
note that the disk activity light is continuously lit.

The system is running the stock 6.1 RELEASE.

FreeBSD foo.com 6.1-RELEASE FreeBSD 6.1-RELEASE #0: Sun May  7 04:15:57 UTC 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  amd64

I'm trying to figure out if we've just gotten two lousy disks, or if
there might be a driver or motherboard issue.

Does any of this ring any bells?

I'm suggesting that we upgrade to the tip of the stable tree, but the
owner's not convinced.  I can't tell if there's been anything relevant
in the stable release that might address this (aside from all the
other great stuff that's in there).

Thanks for any input,

g.


Re: help identifying gmirror, ata, or motherboard problem (Tyan S2865G2NR)

2006-11-15 Thread George Hartzell
Miroslav Lachman writes:
  George Hartzell wrote:
  
   I'm having a problem with a machine that I support and would like some
   feedback.
   
   The system was built up from a barebones Transport PX22, which uses a
   Tyan S2865G2NR motherboard.
   
   It has two drives:
   
 ad4: 286188MB Maxtor 6V300F0 VA111680 at ata2-master SATA300
 ad6: 286188MB Maxtor 6V300F0 VA111630 at ata3-master SATA300

Just to follow up on this, Maxtor asked if the board used an Nvidia
controller (it does...) and then claimed that a newer rev. of their
firmware for these drives would work better.

They're shipping a replacement drive.  We'll see

Thanks for all the feedback.

g.


Dell poweredge 850 hangs on shutdown -p

2006-12-03 Thread George Hartzell

I have a Dell PowerEdge 850 that hangs when I try to power it down
using shutdown -p now.  Otherwise it seems to run splendidly.

It reaches the point where it says: "Powering the system off using
acpi" and then just sits there.

Powering it off by just pressing the power button works perfectly.

It's running a minimal installation from a FreeBSD 6.2-BETA3 cd, which
I burned 11/6/06.

I've pasted the system's dmesg output below.

Has anyone seen this before?

Does anyone have any suggestions on how to whip it into shape?

g.





Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-BETA3 #0: Mon Oct 30 22:04:37 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC
ACPI APIC Table: DELL   PE850   
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Pentium(R) 4 CPU 2.80GHz (2800.11-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf49  Stepping = 9
  
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14>
  AMD Features=0x2010<NX,LM>
  AMD Features2=0x1<LAHF>
real memory  = 1073479680 (1023 MB)
avail memory = 1041481728 (993 MB)
ioapic0: Changing APIC ID to 1
ioapic1: Changing APIC ID to 2
ioapic1: WARNING: intbase 32 != expected base 24
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 32-55 on motherboard
kbd1 at kbdmux0
ath_hal: 0.9.17.2 (AR5210, AR5211, AR5212, RF5111, RF5112, RF2413, RF5413)
acpi0: DELL PE850 on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0
cpu0: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 28.0 on pci0
pci2: ACPI PCI bus on pcib2
pcib3: ACPI PCI-PCI bridge at device 0.0 on pci2
pci3: ACPI PCI bus on pcib3
pcib4: ACPI PCI-PCI bridge at device 28.4 on pci0
pci4: ACPI PCI bus on pcib4
bge0: Broadcom BCM5750 B1, ASIC rev. 0x4101 mem 0xfe8f-0xfe8f irq 16 
at device 0.0 on pci4
miibus0: MII bus on bge0
brgphy0: BCM5750 10/100/1000baseTX PHY on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 
1000baseTX-FDX, auto
bge0: Ethernet address: 00:13:72:fc:92:b8
pcib5: ACPI PCI-PCI bridge at device 28.5 on pci0
pci5: ACPI PCI bus on pcib5
bge1: Broadcom BCM5750 B1, ASIC rev. 0x4101 mem 0xfe6f-0xfe6f irq 17 
at device 0.0 on pci5
miibus1: MII bus on bge1
brgphy1: BCM5750 10/100/1000baseTX PHY on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 
1000baseTX-FDX, auto
bge1: Ethernet address: 00:13:72:fc:92:b9
uhci0: UHCI (generic) USB controller port 0xbce0-0xbcff irq 20 at device 29.0 
on pci0
uhci0: [GIANT-LOCKED]
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0xbcc0-0xbcdf irq 21 at device 29.1 
on pci0
uhci1: [GIANT-LOCKED]
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: UHCI (generic) USB controller port 0xbca0-0xbcbf irq 22 at device 29.2 
on pci0
uhci2: [GIANT-LOCKED]
usb2: UHCI (generic) USB controller on uhci2
usb2: USB revision 1.0
uhub2: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
ehci0: Intel 82801GB/R (ICH7) USB 2.0 controller mem 0xfeb00400-0xfeb007ff 
irq 20 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb3: EHCI version 1.0
usb3: wrong number of companions (7 != 3)
usb3: companion controllers, 2 ports each: usb0 usb1 usb2
usb3: Intel 82801GB/R (ICH7) USB 2.0 controller on ehci0
usb3: USB revision 2.0
uhub3: Intel EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub3: 6 ports with 6 removable, self powered
pcib6: ACPI PCI-PCI bridge at device 30.0 on pci0
pci6: ACPI PCI bus on pcib6
pci6: display, VGA at device 5.0 (no driver attached)
isab0: PCI-ISA bridge at device 31.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel ICH7 UDMA100 controller port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 31.1 on pci0
ata0: ATA channel 0 on atapci0
ata1: ATA channel 1 on atapci0
atapci1: Intel ICH7 SATA300 controller port 
0xbc98-0xbc9f,0xbc90-0xbc93,0xbc80-0xbc87,0xbc78-0xbc7b,0xbc60-0xbc6f mem 
0xfeb0-0xfeb003ff irq 20 at device 31.2 on pci0
ata2: ATA channel 0 on atapci1
ata3: ATA channel 1 on atapci1
pci0: serial bus, SMBus at device 31.3 (no driver attached)
fdc0: floppy drive controller port 

RE: Dell poweredge 850 hangs on shutdown -p

2006-12-04 Thread George Hartzell
Kirk Davis writes:
   
  
   -Original Message-
   From: [EMAIL PROTECTED] 
   
   I have a Dell PowerEdge 850 that hangs when I try to power it down
   using shutdown -p now.  Otherwise it seems to run splendidly.
   
   It reaches the point where it says: "Powering the system off using
   acpi" and then just sits there.
   
   Powering it off by just pressing the power button works perfectly.
   
   It's running a minimal installation from a FreeBSD 6.2-BETA3 cd, which
   I burned 11/6/06.
   
   I've pasted the system's dmesg output below.
   
   Has anyone seen this before?
  
   I have a couple of Dell 2950's with the same problem.  It
  doesn't happen all the time but it seems to happen more in the reboot
  command.  Sorry this information doesn't help as I haven't found a work
  around yet but it might help to ease your mind knowing you're not alone
  ;-)

Thanks,

I've only tried it a couple of times, but for me "shutdown -r now"
seems to reliably reboot the machine, but "shutdown -p now" seems to
reliably stop at the "Powering off..." message.

Let me know if you're sick of the 2950's, you can leave them on my
doorstep anytime ;)

g.


Re: Dell poweredge 850 hangs on shutdown -p

2006-12-04 Thread George Hartzell
Doug Barton writes:
  George Hartzell wrote:
   Kirk Davis writes:
  
 
  -Original Message-
  From: [EMAIL PROTECTED] 
  
  I have a Dell PowerEdge 850 that hangs when I try to power it down
  using shutdown -p now.  Otherwise it seems to run splendidly.
  
  What happens if you do 'acpiconf -s5' ?

It seems that it's non-deterministic, and that shutdown -p sometimes
works too.

Here's what I did.

1) Log in, su to root, acpiconf -s5 and it shut down cleanly.
2) Log in, su to root, acpiconf -s5 and it shut down cleanly. (just checking)
3) Log in, su to root, shutdown -p and it shut down cleanly.
4) Log in, su to root, 
ifconfig bge0 inet 10.8.0.2 up
ssh otherhost dd if=/dev/urandom bs=1M | dd of=/dev/null bs=1M
kill it after a couple of moments
dd if=/dev/ad4 of=/dev/null count=1
   acpiconf -s5
   hung
5) Log in, su to root, 
dd if=/dev/ad4 of=/dev/null count=1
   acpiconf -s5
   worked
6) Log in, su to root, 
ifconfig bge0 inet 10.8.0.2 up
ssh otherhost dd if=/dev/urandom bs=1M | dd of=/dev/null bs=1M
kill it after a couple of moments
   acpiconf -s5
   worked

sigh.

Anything else I can try to generate leads?

Thanks!

g.


saving power in a Dell Poweredge 750.

2007-01-10 Thread George Hartzell

I'm setting up a Dell Poweredge 750 1U server.  A friend is loaning me
space in his rack and since his rack usage is limited by power I'd
like to be as thrifty as possible.

I hooked my kill-a-watt meter up and ran the machine for a couple of
days and it uses 88 watts (3.90KWH/44.01H).

Then I kldloaded cpufreq and enabled powerd and it still uses 88 watts
(8.35KWH/93.47H).
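
Those averages are just the meter's accumulated energy divided by the
elapsed hours. A quick sketch of the arithmetic (avg_watts is only a
name picked for this example):

```shell
#!/bin/sh
# Average power in watts: energy in kWh divided by hours, times 1000.
avg_watts() {
    awk -v kwh="$1" -v hours="$2" 'BEGIN { printf "%.1f\n", kwh / hours * 1000 }'
}

avg_watts 3.90 44.01   # before powerd: prints 88.6
avg_watts 8.35 93.47   # with cpufreq and powerd: prints 89.3
```

Either way the two runs are within a watt of each other.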

That surprised me a bit, and seems to suggest that it's spending most
of its energy spinning fans or something.

Is anyone familiar with the poweredge 750 and freebsd-stable?  I can't
find anything in the bios that suggests fan control, although I guess
it's possible that they're running efficiently by default and I just
haven't caused them to *really* run.

Any other suggestions to help economize?

Thanks,

g.


Re: saving power in a Dell Poweredge 750.

2007-01-10 Thread George Hartzell
Peter Jeremy writes:
  On Wed, 2007-Jan-10 09:34:21 -0800, George Hartzell wrote:
  I hooked my kill-a-watt meter up and ran the machine for a couple of
  days and it uses 88 watts (3.90KWH/44.01H).
  
  What was it doing for those couple of days?  [...]

It's a small time mail server and web host.  It was running under its
real world load.

  I presume you confirmed that cpufreq/powerd was actually functioning
  (ie the CPU frequency was being changed).

Yep, or at least I confirmed that powerd -v from a shell cycled up and
down w/ demand, then I configured it to run as a daemon and confirmed
that cpufreq was loaded and that powerd was running in the
background.

  That surprised me a bit, and seems to suggest that it's spending most
  of its energy spinning fans or something.
  
  PSU overheads, fans, northbridge, video, RAM, disk, ...  it all adds up.

That's sort of what I was figuring, it is/was just that my laptop
experience with powerd and battery life suggested that there would be
more of a difference.

  I can't specifically help with the Dell.

Thanks for the thoughts!

g.


Re: saving power in a Dell Poweredge 750.

2007-01-11 Thread George Hartzell
Oliver Fromme writes:
  George Hartzell wrote:
I'm setting up a Dell Poweredge 750 1U server.  A friend is loaning me
space in his rack and since his rack usage is limited by power I'd
like to be as thrifty as possible.

I hooked my kill-a-watt meter up and ran the machine for a couple of
days and it uses 88 watts (3.90KWH/44.01H).

Then I kldloaded cpufreq and enabled powerd and it still uses 88 watts
(8.35KWH/93.47H).
  
  Did you verify that powerd actually reduced the CPU
  frequency?  What's the output from sysctl dev.cpu.0?
  
  It might be enlightening to watch the following shell
  loop for a while:
  while :; do sysctl dev.cpu.0.freq; sleep 1; done

I hadn't actually done *that* (but I had run powerd -v for a while and
watched).  Here you go:

  (merlin)[9:11am]~while (1)
  while? sysctl dev.cpu.0.freq
  while? sleep 5
  while? end
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 350
  dev.cpu.0.freq: 1051
  dev.cpu.0.freq: 2102
  dev.cpu.0.freq: 1401
  dev.cpu.0.freq: 700
  dev.cpu.0.freq: 350
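
For a longer run it may be easier to summarize the samples than to
eyeball them. A minimal sketch, assuming the output format shown
above (the freq_summary name is invented for this example):

```shell
#!/bin/sh
# Summarize "dev.cpu.0.freq: N" lines read from stdin:
# sample count, lowest and highest frequency, and the mean.
freq_summary() {
    awk '$1 == "dev.cpu.0.freq:" {
             f = $2 + 0
             if (n == 0 || f < min) min = f
             if (n == 0 || f > max) max = f
             sum += f
             n++
         }
         END {
             if (n) printf "n=%d min=%d max=%d avg=%.0f\n", n, min, max, sum / n
         }'
}

# Typical use, twelve samples five seconds apart:
#   i=0; while [ $i -lt 12 ]; do sysctl dev.cpu.0.freq; sleep 5; i=$((i+1)); done | freq_summary
```

For the eleven samples above this gives n=11 min=350 max=2102 avg=700,
i.e. the CPU sits at its lowest setting most of the time.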

  By the way, do you have an SMP system, or are you running
  a kernel without SMP?  sysctl machdep.cpu_idle_hlt?

It's a uniprocessor machine, hyperthreading capable but that's
disabled in the bios.

  (merlin)[9:12am]~sysctl machdep.cpu_idle_hlt
  machdep.cpu_idle_hlt: 1

Thanks for thinking about this!

g.


Re: saving power in a Dell Poweredge 750.

2007-01-11 Thread George Hartzell
Bruno Ducrot writes:
  [...]
  What specific driver(s) were loaded actually?
  A devinfo might help.

It looks like:

  p4tcc0
  cpufreq0

Here's a devinfo and a dmesg:

 http://shrimp.alerce.com/merlin/merlin.devinfo
 http://shrimp.alerce.com/merlin/merlin.dmesg

I'm starting to understand that the box is probably running along as
quietly as it knows how, unless there's some magic about fans and
disks that I've missed.

Thanks for the help,

g.



Re: tap device at boot time

2007-03-15 Thread George Hartzell
Willy Offermans writes:
  On Wed, Mar 14, 2007 at 11:06:15AM +, Vince wrote:
   Willy Offermans wrote:
Dear FreeBSD friends,

Is it possible to add and configure a tap device at boot time of
FreeBSD? I mean the same as a normal NIC. In my rc.conf:

snip
...
ifconfig_xl0="inet 192.168.0.2 promisc netmask 255.255.255.0"
ifconfig_rl0="inet 192.168.4.2 netmask 255.255.255.0"
ifconfig_tap0="inet 10.8.0.1 netmask 255.255.255.0"
...
/snip

   try adding
   cloned_interfaces="tap0"
   
   to your rc.conf
   
   Vince
and in my /boot/loader.conf:
snip
...
if_tap_load="YES"
...
/snip

if_xl0 and if_rl0 are compiled into the kernel.

Maybe it is even possible to set the MAC address of the tap device!?

The tap device should be available before named and dhcpd have been
started. In that way I can provide IP addresses over the tap device 
and add appropriate DNS entries.

I like to run openvpn with tap devices and want to use the dhcpd server
to provide IP addresses and update the named. This works quite well.
However after reboot I always have to restart named and dhcpd again
since the tap device becomes available after these services have started
during boot. I guess this problem will be solved if the tap device is
already available and configured before named and dhcpd have started.

   
  
  Hello Vince,
  
  Thank you for your response, but unfortunately adding
  cloned_interfaces="tap0" to my rc.conf did not solve the issue. The
  tap0 device only appeared after I started the openvpn daemon. Is there
  a way to determine the order in which the daemons start? Maybe I can
  solve the problem that way.
  
  I wonder why it is so hard to accomplish this. FreeBSD is usually very
  intuitive in initialising device support. Naively I would think: load
  the kernel_module and run ifconfig and you are there. For xl0 and rl0
  it will work like this, I guess, but for tap0 certainly not. What kind
  of a kick does this tap device need? Is it that special? Openvpn needs
  to know which tap to use, but that is it, I guess. The rest is up to
  the kernel to do the trick, isn't it? Maybe I have to dig in the source
  code of openvpn to find out how to initialise the tap device.
  [...]

Are you sure that you need to initialize the tap0 device like this?

I use tun's instead of tap's, but in my openvpn server config I have a
line that says

  dev tun

and a bit further down I have a line that says

  server 10.8.0.0 255.255.255.0

and openvpn takes care of setting up the device itself.  Everything
I've read suggests that it should work the same way for a tap device.

g.


Re: gmirror Issues

2007-03-25 Thread George Hartzell


On Mar 25, 2007, at 5:12 PM, Joe Kelsey wrote:


Ivan Voras wrote:

Joe Kelsey wrote:


So, after loading the mirror stuff, I regularly lock up the system by
trying to perform simple activities on the mirror.  What do I need to do
differently?

Here are the relevant dmesg lines:
atapci0: SiI 3512 SATA150 controller port
0xa000-0xa007,0x9800-0x9803,0x9400-0x9407,0x9000-0x9003,0x8800-0x880f
mem 0xfba0-0xfba001ff irq 18 at device 13.0 on pci0



I did almost the same thing you did with gmirror on 6.2-release on amd64
the other day and it worked. There were several complaints about SiI
hardware in the past, though - you might want to search the lists.


Thank you for the suggestion, but it does not help.  There is some
traffic on the list about the 3112, but I have a 3512, which does
not have any list traffic about bugs.


The major thing that needs doing is a detailed explanation of how
to take two brand new disk drives and mirror them.  Nothing in the
documentation discusses this.  Do you have to create file systems
on the drives first?  Do you have to use fdisk to slice them up?
Is there a size limit on drives?  I am trying to mirror two 400G
drives, is this supported?  There is no information anywhere that I
can find about these topics.


Have you seen this:

   http://people.freebsd.org/~rse/mirror/

g.


Re: Hardcoding gmirror provider [was Re: Problem with migrating...]

2005-02-06 Thread George Hartzell
Pawel Jakub Dawidek writes:
  [...]
  It happens because ad0 and ad0s1 share the same last sector.
  To fix this you should use '-h' option as you did or you should recreate
  ad0s1 slice one sector smaller.

Thanks for the help!

That makes sense, which is always a nice feeling.

Does that mean that the instructions at: 

   http://people.freebsd.org/~rse/mirror/

in the section labeled:

   GEOM mirror Approach 2: Single Slice, Preferred, More Flexible

are incorrect and will result in the same kind of slice-table breakage
that I was seeing or is there something going on that I'm not getting?

Would it be more correct for Ralf to update the instructions to
include a -h arg on his gmirror label step?

g.




ACPI Suspend/resume [was Re: ATA mkIII first official patches...]

2005-02-07 Thread George Hartzell
Søren Schmidt writes:
  [...]
  Finding such a machine might be very hard, if not plain impossible :/
  I already have 3 laptops here (of which none has worked for several 
  months regarding suspend/resume) so I have plenty. [...]

How bad is the acpi suspend/resume situation?

I have 5.3-BETA4 on an IBM T42p and suspend to memory and resume work
fine.

I haven't had time to upgrade (cobbler's kids, no shoes, etc...) but
it's on my list of things to do.

Does 5.3 Release have a working acpi based suspend/resume for anyone?

Does 5-STABLE have a working acpi based suspend/resume for anyone?

g.


Re: Hardcoding gmirror provider [was Re: Problem with migrating...]

2005-02-22 Thread George Hartzell

I just skimmed through your comment about hardcoding the provider name
if ad0 and ad0s1 have the same length.  I think that you need to
mention that you need to add the -h flag to both the gmirror label
command and the final gmirror insert command when you add in the
second disk.

g.
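
To make the point above concrete, here is a minimal sketch. The helper only prints the two commands with -h added to each; it does not run gmirror. The label options copy the `gmirror label -v -n -b round-robin` invocation from the original problem report, and the helper name, mirror name and provider names are placeholders.

```shell
#!/bin/sh
# Print (do not run) the label and insert commands with the provider
# name hardcoded (-h) in both. All arguments are illustrative names.
hardcoded_mirror_cmds() {
    mirror=$1
    first=$2
    second=$3
    echo "gmirror label -v -n -h -b round-robin $mirror $first"
    echo "gmirror insert -h $mirror $second"
}

# Example: hardcoded_mirror_cmds gm0 /dev/ad0s1 /dev/ad2s1
```

The other steps of the procedure (copying data, rebooting onto the mirror) still sit between the two commands; only the flags change.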



periodic scripts for checking gmirror status?

2005-03-17 Thread George Hartzell

I'd like to set up a periodic-style script to check the status of my
gmirror RAID.

Before I reinvent the wheel, does anyone have one that they'd care to
share?

I'm running Stable, FreeBSD 5.3-STABLE #10: Sun Feb  6 17:25:02 PST 2005

Thanks,

g.
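
A minimal sketch of what such a check could look like. It assumes that `gmirror status` prints a header line followed by one line per mirror of the form "mirror/<name> <STATUS> <component>", with further components on indented continuation lines; the function name is made up for illustration.

```shell
#!/bin/sh
# Read "gmirror status" output on stdin and print every mirror whose
# state is not COMPLETE. Returns 1 if any was printed, 0 otherwise.
# Assumption: mirror lines start with "mirror/"; header and
# continuation lines do not, so they are ignored.
report_unhealthy_mirrors() {
    awk '
        $1 ~ /^mirror\// && $2 != "COMPLETE" { print $1 ": " $2; bad = 1 }
        END { exit bad + 0 }
    '
}

# Intended use from a daily periodic(8) script (not executed here):
#   gmirror status | report_unhealthy_mirrors || echo "gmirror needs attention"
```

Keeping the parsing separate from the `gmirror` call makes it possible to try the check against saved output before installing it.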



Re: gmirror oddities

2005-05-03 Thread George Hartzell

Eirik Øverby writes:
  Hi!
  
  I've been using gmirror for a while to safeguard my system disks. I have
  taken the slice-based mirror approach, where I use, say, ad0s1 and ad2s1 as
  providers.
  On one of my servers, this seems to be impossible. I create the mirror using
  ad2s1 first (to keep my system running while I do some of the work), and
  then I re-initialize ad0s1 (making it exactly the size of ad2s1) before
  using gmirror insert to add it to the mirror.
  However, at this point - when doing a gmirror list - it turns out that it
  never added ad0s1 as a provider, but ad0 itself! As a result, I now have a
  load of slices (ad0a, ad0b, ad0d, ad0e, ad0f) instead of having the same
  structure as I have on ad2s1. It's just like ad2s1, just without the s1
  part.
  
  I've tried dd if=/dev/zero of=/dev/ad0 bs=65536 a couple of times, in case
  some old provider metadata was stored there. I also have exactly the same
  setup in another server, the only difference being that it behaves as
  expected..
  
  Am I doing something blatantly wrong here? This IS supposed to work, right?
  I've even found a very nice description of how to do it at
  http://people.freebsd.org/~rse/mirror/
  confirming that what I'm doing is right.
  
  I'm on 5.4-PRERELEASE, but this problem has been there since 5.3-p2 or
  something, which was when I first tried this.

I bet you're getting bitten by a problem that bit me.  It's described
in the fine print in http://people.freebsd.org/~rse/mirror/.

Gmirror saves its metadata on the last sector of its disk space.
Since the slice (adXs1) and the disk device (adX) end at the same
place on the disk, gmirror gets confused.  It tastes devices in a
particular order, apparently devices first, then slices.  It finds the
metadata when it tastes adX and goes ahead and uses it, even though it
should be associating it w/ adXsY.  Hilarity ensues.

The fix is described in the fourth comment block of Ralf's doc, either
make the slice a sector smaller than the disk device or hardcode the
provider name.  I've been using the hardcoding approach, and it seems
to work for me.

g.
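
The overlap can be checked with simple arithmetic. The sketch below assumes that the geometry fdisk reports (cylinders x heads x sectors) covers the whole disk; the function name is made up. The example numbers are the ones from the fdisk output in the original problem report: 387621 cyl, 16 hd, 63 sec, with slice 1 starting at 63 and 390721905 sectors long.

```shell
#!/bin/sh
# Succeeds if a slice ends on the same sector as the whole disk.
# Arguments: cylinders heads sectors-per-track slice-start slice-size
slice_shares_last_sector() {
    disk_sectors=$(( $1 * $2 * $3 ))   # total sectors per the geometry
    slice_end=$(( $4 + $5 ))           # first sector past the slice
    [ "$slice_end" -eq "$disk_sectors" ]
}

# Example with the numbers from the report:
#   slice_shares_last_sector 387621 16 63 63 390721905 && echo "overlap"
```

With those numbers the slice ends exactly where the disk does, which is the situation described above; a slice one sector smaller would not.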


Re: Current status of nullfs and/or unionfs?

2005-05-05 Thread George Hartzell
Eirik Øverby writes:
  [...]
  What can I expect to see when trying nullfs and/or unionfs today? Has
  anything changed? Do I have even a remote chance of making it work - and if
  it doesn't work, what are my chances of anyone having time or energy to look
  into it? I'm an admin only, no coder, otherwise I'd be happy to look into it
  myself.

I'm using unionfs to mount a copy of my ports tree into a jail on a
fairly currently patched 5.3 system.  It works beautifully except that
it sometimes can't be unmounted as the machine shuts down, leading to
an fsck.

I've been trying to characterize it.  Seems like I can mount it, start
a jail, stop the jail, and unmount it just fine.  However if I do
anything in the jail's ports tree, then it won't unmount.  Last
experiment I did was to log into the jail and do a couple of 'syncs',
then log out, shut the jail down and unmount it.  That worked that one
time.

Not enough to file a bug yet, but the anecdote might be useful.

g.


Re: unionfs limitations?

2005-05-06 Thread George Hartzell
Eirik Øverby writes:
  Hi,
  
  I just started playing with mounting ports into jails using unionfs
  (mount_unionfs -b /usr/ports_jail /usr/local/jails/jail-0/usr/ports), and
  many things seem to work fine.
  However, when trying to install either of mysql41-server or mysql41-client,
  I see the following:
  
  [EMAIL PROTECTED] /usr/ports/databases/mysql41-server# make install
  ===>  Installing for mysql-server-4.1.11_1
  ===>   mysql-server-4.1.11_1 depends on shared library: mysqlclient.14 - found
  ===>   Generating temporary packing list
  ===>  Checking if databases/mysql41-server already installed
  ln: POSIX: Operation not supported
  *** Error code 1
  
  Stop in /usr/ports/databases/mysql41-server.
  
  Did I miss out on something, or is this not going to work? Do I need to
  think in other ways?
  [...]

Here's one unionfs/jail gotcha that's bitten me a couple of times.  If
you actually *use* (or, have used) the ports directory to build and
install stuff onto the host machine, the ports infrastructure in the
jail gets kind of confused.  It seems to be checking for the files in
the dependencies, doesn't find them, goes to make them, and then
[depending on what state the relevant port directory is in], things
get odd.

I've started just using a virgin ports tree as the underpinnings for
my unionfs'ed jails.

Is there any chance that you've installed mysql-server on the host?

g.


Re: unionfs limitations?

2005-05-07 Thread George Hartzell
Marc G. Fournier writes:
  On Sat, 7 May 2005, Eirik [ISO-8859-1] Øverby wrote:
  
   On 07-05-05 03:19, George Hartzell [EMAIL PROTECTED] wrote:
  [...]
   Here's one unionfs/jail gotcha that's bitten me a couple of times.  If
   you actually *use* (or, have used) the ports directory to build and
   install stuff onto the host machine, the ports infrastructure in the
   jail gets kind of confused.  It seems to be checking for the files in
   the dependencies, doesn't find them, goes to make them, and then
   [depending on what state the relevant port directory is in], things
   get odd.
  
   I noticed that pretty early on, yea ;)
  
  Has 5.x *really* gone that far downhill? :(

That particular problem isn't a 5.x thing *at all*, it's just a
consequence of the way ports work (recording some info as dot files in
the work directory) and dependencies are tracked (looking for files
out in /usr/local).  As soon as I thought about it, it was clear what
*I'd* done.

  I've been doing the above since about day one ... *but* ... is this the 
  case if you set WRKDIRPREFIX= to somewhere else in /etc/make.conf?  I 
  don't build the actual port *in* /usr/ports, so the only thing that gets 
  written in /usr/ports is /usr/ports/distfiles ...

I think that pointing the work directories somewhere else would have
avoided the problem quite nicely.

Again, it's not a FreeBSD 5 issue at all, just me getting exactly what
I asked for.

g.
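
For illustration, assuming the ports framework's usual rule that a port's work directory is ${WRKDIRPREFIX}${.CURDIR}/work, this sketch shows where the build state ends up for a given prefix. The function name and the /var/tmp/ports-work prefix are made up for the example.

```shell
#!/bin/sh
# Print the work directory a port would use.
# Arguments: WRKDIRPREFIX value (may be empty), port directory
port_wrkdir() {
    printf '%s%s/work\n' "$1" "$2"
}

# With no prefix the state lives inside the (shared) ports tree:
#   port_wrkdir "" /usr/ports/databases/mysql41-server
# With a prefix it moves out of the tree entirely:
#   port_wrkdir /var/tmp/ports-work /usr/ports/databases/mysql41-server
```

Since the dot files that record build progress live under the work directory, a prefix outside /usr/ports keeps them out of the tree that the jail sees.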


Is crash dumping supported onto a gmirror swap partition?

2004-11-13 Thread George Hartzell

Is it possible to save crashdumps onto a gmirror device?

I'm trying to understand why one of my systems (being upgraded from
5.3BETA4 to 5.3) is spontaneously rebooting.

The first step seems to be to try to get a crash dump, but:

   (merlin)[5:16pm]~sudo dumpon -v /dev/mirror/gm0s1b
   Password:
   dumpon: ioctl(DIOCSKERNELDUMP): Operation not supported
   (merlin)[5:16pm]~

Is the problem that one can't dump onto a gmirror'ed device, or is
there something else that I'm missing?

I'm running a default kernel, w/ an 'options smp' added.

g.



Problem with migrating onto a gmirror slice.

2005-02-05 Thread George Hartzell

I have a system that I set up to use a gmirror back in the 5.3beta
days.  It's running fine but I don't remember exactly how I set it up.

It's a scsi system w/ two identical disks.

I'd like to migrate the installation to a new box that uses ide disks,
and am basing my attempts on the

  GEOM mirror Approach 2: Single Slice, Preferred, More Flexible

portion of these instructions:

   http://people.freebsd.org/~rse/mirror/

Although the disk that I ended up with was bootable in the new system,
I noticed that the slice table was messed up.  After a couple of
tries, here's what I've found:

The machine is:

   FreeBSD merlin.alerce.com 5.3-RELEASE-p2 FreeBSD 5.3-RELEASE-p2 #9:
   Sat Dec 18 12:38:37 PST 2004 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/MERLIN  i386

Here's the series of commands that I've performed to illustrate the
problem:

   138  15:31   fdisk -v -B -I /dev/ad0
   139  15:31   fdisk -s /dev/ad0
   140  15:31   fdisk -s /dev/ad0 > ~hartzell/fdisk-initial
   141  15:32   gmirror label -v -n -b round-robin disk0 /dev/ad0s1
   142  15:32   fdisk -s /dev/ad0
   143  15:32   bsdlabel -w -B mirror/disk0
   144  15:32   bsdlabel -e mirror/disk0
   145  15:33   fdisk -s /dev/ad0
   146  15:34   fdisk -s /dev/ad0 > ~hartzell/fdisk-after
   147  15:34   history
   148  15:34   history > ~hartzell/history

After the fdisk at line 138, here's the slice table:

/dev/ad0: 387621 cyl 16 hd 63 sec
Part        Start        Size Type Flags
   1:          63   390721905 0xa5 0x80

The fdisk at line 142 showed that the slice table was fine after the
gmirror step. 

But after the bsdlabels at lines 143 and 144 the slice table looks
like this:

/dev/ad0: 387621 cyl 16 hd 63 sec
Part        Start        Size Type Flags
   4:           0           5 0xa5 0x80

Here's the output of bsdlabel /dev/mirror/disk0:

# /dev/mirror/disk0:
8 partitions:
#         size    offset    fstype   [fsize bsize bps/cpg]
  a:    524272        16    4.2BSD     2048 16384 32768
  b:   8336976    524288      swap
  c: 390721904         0    unused        0     0         # raw part, don't edit
  d:    524288   8861264    4.2BSD     2048 16384 32776
  e:    524288   9385552    4.2BSD     2048 16384 32776
  f: 380812064   9909840    4.2BSD     2048 16384 28552
Anyone see what I'm missing?

g.


Re: HEADS UP: Release schedule for 2006

2005-12-21 Thread George Hartzell
Kevin Oberman writes:
  [discussion of USB/Cx level interactions clipped out...]
  
  If you unload the drivers, you should be able to get to lower levels. Take a look at
  sysctl hw.acpi.cpu for detail and to see how much time is spent in each
  sleep state.
  
  I assume that you can unload the drivers, but my kernel has USB at this
  time. I do plan on building a kernel without USB and see if unloading is
  a workable solution. I think it should be.

I was spending all of my time in C1.  After I added

  performance_cx_lowest="LOW"
  economy_cx_lowest="LOW"

to my /etc/rc.conf, I found I spent all of my time in C2.

I built a kernel w/ all of the usb devices commented out (and
eventually remembered to set usbd_enable="NO" in /etc/rc.conf, else
the modules just get kldloaded...), and now I have:

hw.acpi.cpu.cx_supported: C1/1 C2/1 C3/85
hw.acpi.cpu.cx_lowest: C3
hw.acpi.cpu.cx_usage: 0.00% 15.21% 84.78%

If I start usbd by hand the system starts spending time in C2.  If I
stop usbd and kldunload usb, the system starts spending time in C3
again.

g.
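
A small sketch for reading that sysctl, assuming `hw.acpi.cpu.cx_usage` has the format shown above (one percentage per C-state, in order C1, C2, ...). The function name is made up for illustration.

```shell
#!/bin/sh
# Print the C-state with the largest share of time, given the value of
# hw.acpi.cpu.cx_usage as a single argument.
busiest_cx() {
    echo "$1" | tr -d '%' | awk '{
        best = 1
        for (i = 2; i <= NF; i++)
            if ($i + 0 > $best + 0) best = i
        printf "C%d %s\n", best, $best
    }'
}

# Intended use (not executed here):
#   busiest_cx "$(sysctl -n hw.acpi.cpu.cx_usage)"
```

Run before and after unloading the usb module, it gives a one-line answer to whether the system is actually reaching C3.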



Re: Fresh install on gmirror'ed disks?

2006-03-03 Thread George Hartzell
Patrick M. Hausen writes:
  Hello!
  
   Is it possible to boot off the install CD, setup a gmirror, and then 
   reboot and install on the mirror (and expect things to work ok)? Anyone 
   try this? It would be nice if the installer let you do this...
  
  AFAIK, no.
  
  Install a minimal system on the first disk, then follow
  these instructions:
  
  http://ezine.daemonnews.org/200502/diskmirror.html
  
  When the mirror is up and running, cvsup, buildworld, buildkernel,
  installkernel, installworld, mergemaster, reboot, enjoy ;-)

I think that the instructions in the above mentioned article are mildly
incorrect in that they enable soft-updates when they newfs the root
partition.

I just asked a question about this in -stable but haven't heard any
commentary.

Am I misguided, or?

Thanks,

g.


Re: SATA RAID: Adaptec 1420SA, Promise TX4300?

2006-04-02 Thread George Hartzell
Daniel O'Connor writes:
  On Sunday 02 April 2006 17:48, Matthias Andree wrote:
You can't boot off a system with a dead primary disk with software RAID1.
(well you MIGHT but.. in any case RAID1 cards are quite cheap)
  
   It's a matter of the BIOS:
   will it complain, or will it proceed to the next SATA disk?
  
  Yes indeed.
  It also depends on the failure mode of the disk.
  
  Personally I think the price is worth paying :)
  (Although for a home server you can get your hands on easily then software 
  RAID should not be a problem)

One of the advantages that purely software raid (e.g. gmirror) has
over hardware raid (faux or genuine) is that in an emergency I can
take one or both of my gmirror'ed disks and put them in just about any
system that I can come up with and they'll work.

With raid systems that use proprietary metadata I'd need to find a
similar controller to hook them up to.

I think that this is one of those Darned Engineering Tradeoffs, but
I'd rather have the flexibility in assembling hardware than having the
raid be able to boot w/out intervention w/ a dead disk.

g.



Re: SATA RAID: Adaptec 1420SA, Promise TX4300?

2006-04-02 Thread George Hartzell
Daniel O'Connor writes:
  On Monday 03 April 2006 04:39, George Hartzell wrote:
   With raid systems that use proprietary metadata I'd need to find a
   similar controller to hook them up to.
  
  Actually no..
  If you are using a cheap RAID like Promise TX2 or just about any onboard 
  IDE/SATA RAID that FreeBSD supports the array can be used on ANY system. 
  (Except for booting)

Cool.  Learned something new.  Thanks.

g.


help ith burncd (Input/output error, 6.1-RC, plextor PX-740a)

2006-04-11 Thread George Hartzell

I have a new system which includes a Plextor PX-740a DVD+-R/RW CD-R/RW
drive attached to an Asus A8V-MX motherboard.

When I try to use burncd to burn a cd, it writes all of the data, says
"fixating CD, please wait.." and then reports

  burncd: ioctl(CDRIOCFIXATE): Input/output error

Oddly enough, the CD seems to be usable.

I can successfully burn the same file if I use cdrecord.

The system was cvsup-ed a couple of days ago, and the kernel config is
the example SMP file plus "device atapicam" (the error also happened
before adding that device, but cdrecord needed it).

I saw this before in the 6.0 days with older hardware and just assumed
that the drive was wonky.

Is it a known problem?

Is it worth pursuing?

g.


Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RC #0: Tue Apr 11 07:26:47 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/DEMI
ACPI APIC Table: A M I  OEMAPIC 
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2000.09-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20fb1  Stepping = 1
  
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,<b25>,LM,3DNow+,3DNow>
real memory  = 2147155968 (2047 MB)
avail memory = 2096123904 (1999 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 Version 0.3 irqs 0-23 on motherboard
kbd1 at kbdmux0
npx0: [FAST]
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: A M I OEMXSDT on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0
cpu0: ACPI CPU on acpi0
acpi_throttle0: ACPI CPU Throttling on cpu0
cpu1: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
agp0: VIA 8380 host to PCI bridge mem 0xf800-0xfbff at device 0.0 on 
pci0
pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
atapci0: VIA 8251 SATA150 controller port 
0xec00-0xec07,0xe880-0xe883,0xe800-0xe807,0xe480-0xe483,0xe400-0xe40f mem 
0xfebffc00-0xfebf irq 21 at device 15.0 on pci0
ata2: ATA channel 0 on atapci0
ata3: ATA channel 1 on atapci0
atapci1: VIA 8251 UDMA133 controller port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: ATA channel 0 on atapci1
ata1: ATA channel 1 on atapci1
uhci0: VIA 83C572 USB controller port 0xe080-0xe09f irq 20 at device 16.0 on 
pci0
uhci0: [GIANT-LOCKED]
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: VIA 83C572 USB controller port 0xe000-0xe01f irq 22 at device 16.1 on 
pci0
uhci1: [GIANT-LOCKED]
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: VIA 83C572 USB controller port 0xdc00-0xdc1f irq 21 at device 16.2 on 
pci0
uhci2: [GIANT-LOCKED]
usb2: VIA 83C572 USB controller on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: VIA 83C572 USB controller port 0xd880-0xd89f irq 23 at device 16.3 on 
pci0
uhci3: [GIANT-LOCKED]
usb3: VIA 83C572 USB controller on uhci3
usb3: USB revision 1.0
uhub3: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: VIA VT6202 USB 2.0 controller mem 0xfebff800-0xfebff8ff irq 22 at 
device 16.4 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: VIA VT6202 USB 2.0 controller on ehci0
usb4: USB revision 2.0
uhub4: VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
umass0: USB2.0 CardReader, rev 2.00/91.44, addr 2
isab0: PCI-ISA bridge at device 17.0 on pci0
isa0: ISA bus on isab0
pcm0: VIA VT8233X port 0xd400-0xd4ff irq 22 at device 17.5 on pci0
pcm0: Avance Logic ALC655 AC97 Codec
pcm0: VIA DXS Enabled: DXS 4 / SGD 1 / REC 1
vr0: VIA VT6102 Rhine II 10/100BaseTX port 0xd000-0xd0ff mem 
0xfebff400-0xfebff4ff irq 23 at device 18.0 on pci0
miibus0: MII bus on vr0
rlphy0: RTL8201L 10/100 media interface on miibus0
rlphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
vr0: Ethernet address: 00:15:f2:2c:c3:86
pcib2: ACPI PCI-PCI bridge at device 19.0 on pci0
pci2: ACPI PCI bus on pcib2
pcib3: ACPI PCI-PCI bridge at device 0.0 on pci2
pci3: ACPI PCI bus on pcib3
pcib4: ACPI PCI-PCI bridge at device 0.1 on pci2
pci4: ACPI PCI bus on pcib4

Re: help ith burncd (Input/output error, 6.1-RC, plextor PX-740a)

2006-04-12 Thread George Hartzell
Igor Robul writes:
  On Tue, Apr 11, 2006 at 06:49:02PM -0700, George Hartzell wrote:
   When I try to use burncd to burn a cd, it writes all of the data, says
   "fixating CD, please wait.." and then reports
   
 burncd: ioctl(CDRIOCFIXATE): Input/output error
   
   Oddly enough, the CD seems to be usable.
   
   I can successfully burn the same file if I use cdrecord.
  On same CD-R disc? :-)

No, on a fresh disk... ;)

  I sometimes burn many ISO images to CD-Rs from one box, and on some
  CD-s I get this error, and on some I dont.

I seem to get it reliably, although I *guess* that it could just be
chance.

I guess my question is: Is this one of those known things that
everyone just ignores, or do I have an unusual problem?

g.



Re: help ith burncd (Input/output error, 6.1-RC, plextor PX-740a)

2006-04-12 Thread George Hartzell
Vladimir Botka writes:
  Hello,
  for me Plextor-750 works well with cdrecord and SCSI emulation (ATAPI/CAM 
  module) 
  http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/creating-cds.html#ATAPICAM
  
  Plextor is a good choice.

It works for me with cdrecord too.

I'm just trying to understand if I have a fixable problem, or burncd
has a fixable problem, or if it's just the way that things are.

g.


Re: powerd not behaving with an Asus A8V-MX and Athlon 64 X2 3800+

2006-04-14 Thread George Hartzell
George Hartzell writes:
  
  I have an Asus A8V-MX motherboard with an AMD Athlon 64 X2 3800+ CPU
  and I'm trying to run powerd to keep it cooler/quieter/greener.
  [...]

[for the archives]

I now have powerd running w/out any complaints, although I still don't
understand what was causing the problem.

I've added the following entries to /boot/loader.conf

  hint.acpi_throttle.0.disabled="1"
  hint.acpi_throttle.1.disabled="1"

and then just run powerd by adding the following lines to /etc/rc.conf

  powerd_enable="YES"
  powerd_flags="-a adaptive"

and the system happily cycles between 2000, 1800, and 1000 MHz
depending on what it's doing (it actually never hangs out at 1800 for
long, just while transitioning between the extremes).

I guess that the CPU or bios or ??? was advertising support for
throttling that it didn't really implement, but I don't really
understand the whys and hows.

g.
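
To see which frequencies are on offer, a sketch like the following can help. It assumes `dev.cpu.0.freq_levels` reports space-separated "MHz/power" pairs; the function name is made up, and the power figures in the example are placeholders, not measurements from this machine.

```shell
#!/bin/sh
# Print one frequency (MHz) per line from a freq_levels string such as
# "2000/-1 1800/-1 1000/-1" (the part after the slash is discarded).
freq_levels_mhz() {
    for pair in $1; do
        printf '%s\n' "${pair%%/*}"
    done
}

# Intended use (not executed here):
#   freq_levels_mhz "$(sysctl -n dev.cpu.0.freq_levels)"
```

Comparing the list with and without the acpi_throttle hints shows which levels came from throttling rather than from the CPU's own frequency control.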





Sound problem w/ 6.1-RC and ASUS A8V-MX (VIA VT8233X)

2006-04-14 Thread George Hartzell

I have a new ASUS A8V-MX motherboard that's running 6.1-RC cvsuped
earlier this week.  I'm running w/ ACPI enabled, but I still see the
problem if I boot w/ ACPI disabled at the loader prompt.

I'm running a kernel based on the standard SMP config file with the
addition of an atapicam device.

I have:
  sound_load="YES"
  snd_via8233_load="YES"
in /boot/loader.conf

I can listen to audio cd's using cdplay, I think that's just testing
the analog cable from the back of the drive to the motherboard and out
to the headphone jacks.  At least I know that I have that much
correct.

None of the gnome apps that I've tried make any sound, either playing
a cd from the drive or an mp3.

If I use a sound app, or "cat /etc/termcap > /dev/dsp" in an attempt
to make some noise, I get nothing except the following line in
/var/log/messages:

  pcm0:play:0:dsp0.0: play interrupt timeout, channel dead

I've attached my dmesg output, my mptable output, and my pciconf -lv
output.

Can anyone help me get this going?

Thanks,

g.

-cut here-
Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RC #0: Tue Apr 11 07:26:47 UTC 2006
[EMAIL PROTECTED]:/usr/obj/usr/src/sys/DEMI
ACPI APIC Table: A M I  OEMAPIC 
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2000.10-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20fb1  Stepping = 1
  
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,<b25>,LM,3DNow+,3DNow>
real memory  = 2147155968 (2047 MB)
avail memory = 2096119808 (1999 MB)
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0 Version 0.3 irqs 0-23 on motherboard
kbd1 at kbdmux0
npx0: [FAST]
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: A M I OEMXSDT on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x808-0x80b on acpi0
cpu0: ACPI CPU on acpi0
powernow0: Cool`n'Quiet K8 on cpu0
cpu1: ACPI CPU on acpi0
powernow1: Cool`n'Quiet K8 on cpu1
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
agp0: VIA 8380 host to PCI bridge mem 0xf800-0xfbff at device 0.0 on 
pci0
pcib1: ACPI PCI-PCI bridge at device 1.0 on pci0
pci1: ACPI PCI bus on pcib1
pci1: display, VGA at device 0.0 (no driver attached)
atapci0: VIA 8251 SATA150 controller port 
0xec00-0xec07,0xe880-0xe883,0xe800-0xe807,0xe480-0xe483,0xe400-0xe40f mem 
0xfebffc00-0xfebf irq 21 at device 15.0 on pci0
ata2: ATA channel 0 on atapci0
ata3: ATA channel 1 on atapci0
atapci1: VIA 8251 UDMA133 controller port 
0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0xfc00-0xfc0f at device 15.1 on pci0
ata0: ATA channel 0 on atapci1
ata1: ATA channel 1 on atapci1
uhci0: VIA 83C572 USB controller port 0xe080-0xe09f irq 20 at device 16.0 on 
pci0
uhci0: [GIANT-LOCKED]
usb0: VIA 83C572 USB controller on uhci0
usb0: USB revision 1.0
uhub0: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: VIA 83C572 USB controller port 0xe000-0xe01f irq 22 at device 16.1 on 
pci0
uhci1: [GIANT-LOCKED]
usb1: VIA 83C572 USB controller on uhci1
usb1: USB revision 1.0
uhub1: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: VIA 83C572 USB controller port 0xdc00-0xdc1f irq 21 at device 16.2 on 
pci0
uhci2: [GIANT-LOCKED]
usb2: VIA 83C572 USB controller on uhci2
usb2: USB revision 1.0
uhub2: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub2: 2 ports with 2 removable, self powered
uhci3: VIA 83C572 USB controller port 0xd880-0xd89f irq 23 at device 16.3 on 
pci0
uhci3: [GIANT-LOCKED]
usb3: VIA 83C572 USB controller on uhci3
usb3: USB revision 1.0
uhub3: VIA UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub3: 2 ports with 2 removable, self powered
ehci0: VIA VT6202 USB 2.0 controller mem 0xfebff800-0xfebff8ff irq 22 at 
device 16.4 on pci0
ehci0: [GIANT-LOCKED]
usb4: EHCI version 1.0
usb4: companion controllers, 2 ports each: usb0 usb1 usb2 usb3
usb4: VIA VT6202 USB 2.0 controller on ehci0
usb4: USB revision 2.0
uhub4: VIA EHCI root hub, class 9/0, rev 2.00/1.00, addr 1
uhub4: 8 ports with 8 removable, self powered
uhub4: device problem (SET_ADDR_FAILED), disabling port 7
isab0: PCI-ISA bridge at device 17.0 on pci0
isa0: ISA bus on isab0
pcm0: VIA VT8233X port 0xd400-0xd4ff irq 22 at device 17.5 on pci0
pcm0: Avance Logic ALC655 AC97 Codec
pcm0: VIA DXS Enabled: DXS 4 / SGD 1 / REC 1
vr0: VIA VT6102 Rhine II 10/100BaseTX port 0xd000-0xd0ff mem 
0xfebff400-0xfebff4ff irq 23 at device 18.0 on pci0
miibus0: MII bus on vr0
rlphy0: 

howto/hack for Matrox's mga_hal and Xorg 6.9.

2006-05-05 Thread George Hartzell

[I've seen some comments here about people struggling w/ mga_hal, so I
 thought I'd share this.]

I wanted to use features of the Matrox mga x11 driver (dual headed
digital video) that required the hal, but I wasn't able to get the
mga_hal port to work with Xorg 6.9.

I cobbled up an underhanded hack that resulted in a working set of
binaries, based on some hacks that the Linux community was using.

If you need to use mga_hal w/ Xorg 6.9 on -STABLE, my hack might be
useful.

You can find the details at:

  http://forum.matrox.com/mga/viewtopic.php?t=19868

g.



acpi_smbus_read_2: AE_ERROR on mac pro

2007-06-25 Thread George Hartzell

I have -STABLE (amd64) running on an 8-way Mac Pro system.  It's all
working great except I get the following message on the console every
couple of seconds.

  acpi_smbus_read_2: AE_ERROR 0x10

I've (google...) found one comment about FreeBSD and the Mac Pro that
includes this message in its dmesg output but doesn't actually
mention it.  It doesn't seem to be causing any problems but I'd just
as soon fix whatever it's whining about.

Is there some central site for FreeBSD on Intel Macs?

Is this error familiar to anyone?

g.


Re: ZFS boot on zfs mirror

2009-05-26 Thread George Hartzell
Dmitry Morozovsky writes:
  On Tue, 26 May 2009, Mickael MAILLOT wrote:
  
  MM Hi,
  MM 
  MM i prefere use zfsboot boot sector, an example is better than a long talk:
  MM 
  MM $ zpool create tank mirror ad4 ad6
  MM $ zpool export tank
  MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
  MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
  MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1  seek=1024
  MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1  seek=1024
  
  s/skeep/skip/ ? ;-)

What is the reason for copying zfsboot one bit at a time, as opposed
to 

  dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2

g.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org


Re: loader not working with GPT and LOADER_ZFS_SUPPORT

2009-05-26 Thread George Hartzell
Artis Caune writes:
  2009/5/26 Philipp Wuensche cryx-free...@h3q.com:
   Hi,
  
   I tried booting from a disk with GPT scheme, with a /boot/loader build
   with LOADER_ZFS_SUPPORT=yes in make.conf. I get the following error:
  
   panic: free: guard1 fail @ 0x2fd4a6ac from
   /usr/src/sys/boot/i386/libi386/biosdisk.c:1053
  
  Same problem for me. I also tried with MBR scheme, same problem.

I had a similar problem (different @ 0x address) with -STABLE over
the weekend.  I just wanted to boot an old-fashioned system, MBR and
no ZFS.  I ended up building a loader with

  LOADER_ZFS_SUPPORT=NO
  LOADER_NO_GPT_SUPPORT=YES

and it worked.

g.



Re: ZFS boot on zfs mirror

2009-05-26 Thread George Hartzell
Andriy Gapon writes:
  on 26/05/2009 19:21 George Hartzell said the following:
   Dmitry Morozovsky writes:
 On Tue, 26 May 2009, Mickael MAILLOT wrote:
 
 MM Hi,
 MM 
 MM i prefere use zfsboot boot sector, an example is better than a long 
   talk:
 MM 
 MM $ zpool create tank mirror ad4 ad6
 MM $ zpool export tank
 MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
 MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 count=1
 MM $ dd if=/boot/zfsboot of=/dev/ad4 bs=512 skeep=1  seek=1024
 MM $ dd if=/boot/zfsboot of=/dev/ad6 bs=512 skeep=1  seek=1024
 
 s/skeep/skip/ ? ;-)
   
   What is the reason for copying zfsboot one bit at a time, as opposed
   to 
   
 dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=2
  
  seek=1024 for the second part? and no 'count=1' for it? :-)
  
  [Just guessing] Apparently the first block of zfsboot is some form of MBR
  and the rest is zfs-specific code that goes to magical sector 1024.

Ok, I managed to misread the argument to seek as one block; apparently
my coffee hasn't hit yet.

I'm still confused about the two parts of zfsboot and what's magical
about seeking to 1024.
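
For reference, the byte offsets implied by the dd arguments quoted
above can be worked out with a few lines of Python.  This is only a
sketch of the arithmetic (dd counts skip and seek in units of bs); the
helper name dd_offsets is made up for illustration, and it says nothing
about why zfsboot wants its second part at that location.

```python
BLOCK_SIZE = 512  # the bs=512 used in every dd command in this thread


def dd_offsets(bs, skip=0, seek=0, count=None):
    """Return (input_start, output_start, nbytes) in bytes.

    skip is measured in input blocks, seek in output blocks, and
    nbytes is None when no count is given (copy to end of input).
    """
    nbytes = None if count is None else count * bs
    return skip * bs, seek * bs, nbytes


# dd if=/boot/zfsboot of=/dev/ad4 bs=512 count=1
first = dd_offsets(BLOCK_SIZE, count=1)

# dd if=/boot/zfsboot of=/dev/ad4 bs=512 skip=1 seek=1024
second = dd_offsets(BLOCK_SIZE, skip=1, seek=1024)

print(first)   # (0, 0, 512)
print(second)  # (512, 524288, None)
```

So the second dd starts reading 512 bytes into zfsboot and starts
writing 512 KiB (1024 * 512 bytes) into the disk, which is why a single
count=2 copy would not put the data in the same place.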

g.



Re: Does this disk/filesystem layout look sane to you?

2009-06-15 Thread George Hartzell
Freddie Cash writes:
  On Sun, Jun 14, 2009 at 9:17 AM, Dan Naumov dan.nau...@gmail.com wrote:
  
   I just wanted to have an extra pair (or a dozen) of eyes look this
   configuration over before I commit to it (tested it in VMWare just in
   case, it works, so I am considering doing this on real hardware soon).
   I drew a nice diagram: http://www.pastebin.ca/1460089 Since it doesn't
   show on the diagram, let me clarify that the geom mirror consumers as
   well as the vdevs for ZFS RAIDZ are going to be partitions (raw disk
   = full disk slice = swap partition | mirror provider partition | zfs
   vdev partition | unused).
  
  
  I don't know for sure if it's the same on FreeBSD, but on Solaris, ZFS will
  disable the onboard disk cache if the vdevs are not whole disks.  IOW, if
  you use slices, partitions, or files, the onboard disk cache is disabled.
  This can lead to poor write performance.
  
  Unless you can use one of the ZFS-on-root facilities, I'd look into getting
  a couple of CompactFlash or USB sticks to use for the gmirror for / and /usr
  (put the rest on ZFS).  Then you can dedicate the entirety of all 5 drives
  to ZFS.

Even if you do a bootable ZFS on root, you'll end up with a couple
of gpt partitions (boot code, swap, then root) and therefore end up
constructing your ZFS file system from a partition.

Pawel said, back on April 6, 2007, 

   We support cache flushing operations on any GEOM provider (disk,
   partition, slice, anything disk-like), so basically currently I
   treat everything as a whole disk [...]

Does anyone know for sure if we disable caching for partitions?

g.



good/best practices for gmirror and gjournal on a pair of disks?

2008-05-13 Thread George Hartzell

I've been running many of my systems for some time now using gmirror
on a pair of identical disks, as described by Ralf at:

  http://people.freebsd.org/~rse/mirror/

Each disk has a single slice that covers almost all of the disk.  These
slices are combined into the gmirror device (gm0), which is then
carved up by bsdlabel into gm0a (/), gm0b (swap), gm0d (/var), gm0e
(/tmp), and gm0f (/usr).

My latest machine is using Seagate 1TB disks so I thought I should add
gjournal to the mix to avoid ugly fsck's if/when the machine doesn't
shut down cleanly.  I ended up just creating a gm0f.journal and using
it for /usr, which basically seems to be working.

I'm left with a couple of questions though:

  - I've read in the gjournal man page that when it is "... configured
    on top of gmirror(8) or graid3(8) providers, it also keeps them in
    a consistent state..."  I've been trying to figure out if this
    simply falls out of how gjournal works or if there's explicit
    collusion with gmirror/graid3 but can't come up with a
    satisfactory explanation.  Can someone walk me through it?

    Since I'm only gjournal'ing a portion of the underlying gmirror
    device I assume that I don't get this benefit?

  - I've also read in the gjournal man page "... that sync(2) and
    fsync(2) system calls do not work as expected anymore."  Does this
    invalidate any of the assumptions made by various database
    packages such as postgresql, sqlite, berkeley db, etc. about
    if/when/whether their data is safely on the disk?

  - What's the cleanest gjournal adaptation of rse's
    two-disk-mirror-everything setup that would be able to avoid
    tedious gmirror syncs?  The best I've come up with is to do two
    slices per disk, combine the slices into a pair of gmirror
    devices, bsdlabel the first into gm0a (/), gm0b (swap), gm0d
    (/var) and gm0e (/tmp) and bsdlabel the second into a gm1f which
    gets a gjournal device.

    Alternatively, would it work and/or make sense to give each disk a
    single slice, combine them into a gmirror, put a gjournal on top
    of that, then use bsdlabel to slice it up into partitions?

Is anyone using gjournal and gmirror for all of the system on a pair
of disks in some other configuration?

Thanks,

g.


Re: good/best practices for gmirror and gjournal on a pair of disks?

2008-05-13 Thread George Hartzell
Adam McDougall writes:
  [...]
  I believe gjournal uses 1G for journal (2x512) which seemed to be 
  sufficient on all of the systems where I have used the default, but I 
  quickly found that using a smaller journal is a bad idea and leads to 
  panics that I was unable to avoid with tuning.  Considering 1G was such 
  a close value, I chose to go several times above the default journal 
  size (disk is cheap and I want to be sure) but I ran into problems using 
  gjournal label -s (size) rejecting my sizes or wrapping the value around 
  to something too low. [...]

I also stumbled on this and was unable to find any mention of it in
the PR database.  One of my todo items is to make sure I'm not messing
up somehow, dig further into the PR db for an existing report, and
file one if I can't find one.

I tried "-s 2147483648" and it was found to be too small.  A quick
read of the source led me to find that jsize is an intmax_t and that
gctl_get_intmax() should be returning an intmax_t and that intmax_t
ought to be an __int64_t (I'm on amd64), which left me confused.
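
As a sanity check on the numbers: 2147483648 is 2^31, one more than
the largest signed 32-bit value.  The sketch below shows what would
happen if the size were squeezed through a signed 32-bit int somewhere
along the way.  That is only a hypothesis, not something confirmed in
the gjournal source, and to_int32 is a hypothetical helper written for
this illustration.

```python
def to_int32(value):
    """Reinterpret the low 32 bits of value as a signed 32-bit integer."""
    value &= 0xFFFFFFFF
    if value & 0x80000000:
        return value - 0x100000000
    return value


one_gig = 1 << 30          # 1073741824, the default mentioned above
two_gig = 2147483648       # the value passed to -s

print(to_int32(one_gig))   # 1073741824 (fits, unchanged)
print(to_int32(two_gig))   # -2147483648 (wraps negative)
print(to_int32(4 << 30))   # 0 (wraps all the way around)
```

A negative or zero result would certainly be "too small", which is at
least consistent with the symptom, but the real cause may be something
else entirely.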

Has anyone else seen/reported a problem with gjournal -s and values
greater than 1G?

g.


Re: good/best practices for gmirror and gjournal on a pair of disks?

2008-05-13 Thread George Hartzell
Adam McDougall writes:
  George Hartzell wrote:
  [...]
 - I've read in the gjournal man page that when it is "... configured
   on top of gmirror(8) or graid3(8) providers, it also keeps them in
   a consistent state..."  I've been trying to figure out if this
   simply falls out of how gjournal works or if there's explicit
   collusion with gmirror/graid3 but can't come up with a
   satisfactory explanation.  Can someone walk me through it?
  
   Since I'm only gjournal'ing a portion of the underlying gmirror
   device I assume that I don't get this benefit?
  [...]
  [...]
  I decided to journal /usr /var /tmp and leave / as a standard UFS 
  partition because it is so small, fsck doesn't take long anyway and 
  hopefully doesn't get written to enough to cause damage by an abrupt 
  reboot.  Because I'm not journaling the root partition, I chose to 
  ignore the possibility of gjournal marking the mirror clean.  Sudden 
  reboots don't happen enough on servers for me to care.  And all my 
  servers got abruptly rebooted this sunday and they all came up fine :)
  [...]

So you're confirming my belief that setting up gjournal on a
bsdlabel'ed partition of a gmirror does *not* provide the consistency
guarantee and that I should leave autosynchronization enabled.  Right?

g.


problem moving gmirror between two machines.

2008-11-13 Thread George Hartzell

I have an HP DL360 with a pair of 1TB seagate disks that's been
running -STABLE with a ZFS root partition set up using the tools
available here:

  http://yds.coolrat.org/zfsboot.shtml

It's been working great.  As part of trying to understand what's going
on, I csup'ed to -RELENG earlier today and rebuilt/installed the
kernel and world whilst running on the DL360, so everything should be
current.

I tried to move the disks into an HP DL320 G4 and it fails to boot
because it can't find /dev/mirror/boot (which it wants to mount onto
/strap and then parts get nullfs'ed onto /boot and /rescue).  It gives
me the opportunity to start a shell, and from that shell I can do a
zfs mount -a and get all of the zfs filesystems mounted, but there's
nothing in /dev/mirror.  gmirror status and gmirror list are both silent.

I can move the disks back into the older machine and they work fine.

I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the
output from the two machines and they're identical.

I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes
/dev/ad4s1a (along with everything else) but doesn't do anything with
it.

Any ideas?

g.


Re: problem moving gmirror between two machines.

2008-11-13 Thread George Hartzell
George Hartzell writes:
  [...]
  It's been working great.  As part of trying to understand what's going
  on, I csup'ed to -RELENG earlier today and rebuilt/installed the
  kernel and world whilst running on the DL360, so everything should be
  current.
  [...]

Just to be clear, I mean that I have an up to date version of -STABLE
on the machine (it claims to be 7.1-PRERELEASE), not that I'm running
-CURRENT.

g.


Re: problem moving gmirror between two machines.

2008-11-16 Thread George Hartzell
George Hartzell writes:
  
  I have an HP DL360 with a pair of 1TB seagate disks that's been
  running -STABLE with a ZFS root partition set up using the tools
  available here:
  
http://yds.coolrat.org/zfsboot.shtml
  
  It's been working great.  As part of trying to understand what's going
  on, I csup'ed to -RELENG earlier today and rebuilt/installed the
  kernel and world whilst running on the DL360, so everything should be
  current.
  
  I tried to move the disks into an HP DL320 G4 and it fails to boot
  because it can't find /dev/mirror/boot (which it wants to mount onto
  /strap and then parts get nullfs'ed onto /boot and /rescue).  It gives
  me the opportunity to start a shell, and from that shell I can do a
  zfs mount -a and get all of the zfs filesystems mounted, but there's
  nothing in /dev/mirror.  gmirror status and gmirror list are both silent.
  
  I can move the disks back into the older machine and they work fine.
  
  I've run fdisk -s ad4 and bsdlabel -A /dev/ad4s1a and diffed the
  output from the two machines and they're identical.
  
  I've booted with kern.geom.mirror.debug=2 and the DL320G4 tastes
  /dev/ad4s1a (along with everything else) but doesn't do anything with
  it.
  
  Any ideas?
  

[for the archives]

Solved.  gmirror had been set up with -h specifying the device, and
although the newer server used the same device names for its disks
(ad[46]) it assigned them to different hot swap bays.  Once I switched
the disks everything came up fine.

g.


success with snd_hda and 7.1-STABLE on Mac Pro, default changed.

2009-01-19 Thread George Hartzell

I upgraded my early-2008 Mac Pro to 7.1-STABLE and Gnome 2.24.3 over
the weekend; it had been tracking -STABLE.

I'd imported the snd_hda driver and had it running with a few tweaks,
which I needed to adjust to get it running under this version of the
driver.

I'm only able to get the rear line-out jack to work; I haven't found
any combination of config values and/or default_unit's that
makes the front headphone jack work.

For posterity's sake, here are the details:

  /boot/device.hints (added at the bottom)
  hint.hdac.0.config="gpio0 gpio1"

  /boot/loader.conf
  hw.snd.default_unit=3

Details are available at:

  http://shrimp.alerce.com/snd_hda/sndstat.txt
  http://shrimp.alerce.com/snd_hda/dmesg.txt
  http://shrimp.alerce.com/snd_hda/pindump.txt

g.



ZFS and disappearing glabels

2009-12-31 Thread George Hartzell

I've set up a system as described here.

  http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition

Using the 8.0 Release DVD and then csup'ing to RELENG_8 and
rebuilding.

I set it up with a single drive, the only change that I made was that
after creating ad10s1a I glabeled it as disk0, then added
/dev/label/disk0 to the pool.

That worked great.

Then I added a second larger drive, giving it an MBR, a bsd label, and
an s1a partition that I glabeled as disk1.  I attached that to the
pool and it resilvered happily.

However, when I rebooted I found that the pool now consists of
label/disk0 and ad12s1a.  I detached ad12s1a, relabeled it as disk1,
and attached disk1 to the pool again.  It resilvered fine.  Running
strings on /boot/zfs/zpool.cache shows /dev/label/disk0 and
/dev/label/disk1.

But, when I reboot I find I'm back to label/disk0 and ad12s1a.  At
this point strings on zpool.cache lists /dev/label/disk0 and ad12s1a.

I'd like to have the device independence of using labels, and am also
worried about problems caused by the different disk sizes (since the
glabeled partition is 512 bytes smaller).

Any ideas what's going wrong?

Thanks,

g.


Re: ZFS and disappearing glabels

2009-12-31 Thread George Hartzell
Eric writes:
  On 12/31/2009 1:48 PM, George Hartzell wrote:
  
   I've set up a system as described here.
  
  http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition
  
   Using the 8.0 Release DVD and then csup'ing to RELENG_8 and
   rebuilding.
  
   I set it up with a single drive, the only change that I made was that
   after creating ad10s1a I glabeled it as disk0, then added
   /dev/label/disk0 to the pool.
  
   That worked great.
  
   Then I added a second larger drive, giving it an MBR, a bsd label, and
   an s1a partition that I glabeled as disk1.  I attached that to the
   pool and it resilvered happily.
  
   However, when I rebooted I found that the pool now consists of
   label/disk0 and ad12s1a.  I detached ad12s1a, relabeled it as disk1,
   and attached disk1 to the pool again.  It resilvered fine.  Running
   strings on /boot/zfs/zpool.cache shows /dev/label/disk0 and
   /dev/label/disk1.
  
   But, when I reboot I find I'm back to label/disk0 and ad12s1a.  At
   this point strings on zpool.cache lists /dev/label/disk0 and ad12s1a.
  
   I'd like to have the device independence of using labels, and am also
   worried about problems caused by the different disk sizes (since the
   glabeled partition is 512 bytes smaller).
  
   Any ideas what's going wrong?
  
   Thanks,
  
  
  
  i ran into the same issues. every reboot i would have to fight to 
  relabel the drive (on 7.2). I upgraded to 8 and used GPT for everything 
  (and ZFS on root) and i have not had any issues. I would recommend going 
  that route. You can still label the drives with labels.
  
  This is the docs I followed:
  
  http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot/Mirror
  
  Works great!

I'm running something like that on another machine, but can't on this
one.

The gory details include the fact that this is a Mac Pro.  Its EFI
firmware only does magic BIOS emulation if it sees an MBR formatted
disk and so setting things up via GPT won't work for me.  I did try it
using the link that you pointed to above and it wouldn't boot.  I tried
it both via Apple's firmware "choose a boot disk by holding down the
option key" trick and via rEFIt.

g.



Re: ZFS and disappearing glabels

2009-12-31 Thread George Hartzell
Roland Smith writes:
  On Thu, Dec 31, 2009 at 12:48:28PM -0800, George Hartzell wrote:
   
   I've set up a system as described here.
   
 http://wiki.freebsd.org/RootOnZFS/ZFSBootPartition
   
   Using the 8.0 Release DVD and then csup'ing to RELENG_8 and
   rebuilding.
   
   I set it up with a single drive, the only change that I made was that
   after creating ad10s1a I glabeled it as disk0, then added
   /dev/label/disk0 to the pool.
   
   That worked great.
   
   Then I added a second larger drive, giving it an MBR, a bsd label, and
   an s1a partition that I glabeled as disk1.  I attached that to the
   pool and it resilvered happily.
   
   However, when I rebooted I found that the pool now consists of
   label/disk0 and ad12s1a.  I detached ad12s1a, relabeled it as disk1,
   and attached disk1 to the pool again.  It resilvered fine.  Running
   strings on /boot/zfs/zpool.cache shows /dev/label/disk0 and
   /dev/label/disk1.
   
  How did you create the labels? See glabel(8) about the difference between the
  manual and automatic method. Maybe you accidentally used the manual method
  on the second disk?
  [...]

+1 bonus point to Roland, just in time under the New Year's wire.

I created the first label with 'glabel label', which creates an
automatic label, but created the second with 'glabel create' (assuming
it was a synonym), which creates a manual label.

I did a detach, relabeled, reattached, and away I go.

Thanks!

g.


Help with filing a [maybe] ZFS/mmap bug.

2013-07-17 Thread George Hartzell

Hi All,

I have what I think is a ZFS related bug.  Unfortunately my simplest
test case is a bit cumbersome and I haven't definitively proven that
the problem is ZFS related.

I'm hoping for some feedback on how to move forward.

Quick background: I rip my CDs using grip and produce flac files.  I
tag the music using MusicBrainz Picard and transcode it to mp3s
within Picard using a plugin that I wrote.  Picard is a Python-based
app and uses the Mutagen library to tag files.

I'm working on a Mac Pro with 10GB RAM and using Seagate ST31000340AS
drives updated to the latest firmware (SD1A).  The system is running
9-STABLE from late June.  It is ZFS only and boots from a mirrored
pool that provides a bunch of zfs filesystems, including my home
directory.

I recently realized that some of the flacs were corrupt and have been
chasing down the problem.  I've blamed Picard, my disks (there was
newer, important firmware, which they're now running), my RAM,
etc...

After blaming each of the moving parts in turn I offer up the
following experiment as evidence that I have found a ZFS problem.

- start with a bunch of untagged flac files that pass validation with
  flac -t.

- load them into Picard, tag them and save them (this also transcodes
  them to mp3s using my plugin and runs a plugin which runs flac -t
  on the tagged file).

- run flac -t on all of the tagged flac files and collect the result as
  pre-exit-validation.

- exit Picard politely (using the menu options, not killing it from
  the command line...).

- run flac -t on all of the tagged flac files and collect the result as
  post-exit-validation.

- reboot the machine.

- run flac -t on all of the tagged flac files and collect the result as
  post-reboot-validation.
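
The validation steps above could be scripted roughly as follows.  This
is a minimal sketch, assuming flac is on the PATH and that "flac -t"
exits non-zero for a file that fails its test; the function names
(flac_ok, validate_tree, newly_broken) are hypothetical and are not
part of Picard or of my plugins.

```python
import subprocess
from pathlib import Path


def flac_ok(path):
    """Return True if `flac -t` reports success (exit status 0)."""
    result = subprocess.run(
        ["flac", "-t", "-s", str(path)],  # -t: test only, -s: silent
        capture_output=True,
    )
    return result.returncode == 0


def validate_tree(root, check=flac_ok):
    """Map every *.flac file under root to True (valid) or False."""
    return {str(p): check(p) for p in sorted(Path(root).rglob("*.flac"))}


def newly_broken(before, after):
    """List files that validated in `before` but fail in `after`."""
    return sorted(
        path for path, ok in after.items()
        if not ok and before.get(path, False)
    )
```

Calling validate_tree() on the music directory before exiting Picard,
after exiting, and again after the reboot, then passing pairs of
results to newly_broken(), gives the list of files that went bad
between two checkpoints.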

On multiple runs through this routine I'll sometimes see errors in the
{pre,post}-exit-validations, but they'll often all validate perfectly.

On all of the runs through the validation I'll see many invalid files
in the post-reboot-validation output.

I've even scp'd the directories to an unrelated machine (Mac OS X
10.8) at the various points to do the flac -t validation, with the
same results.

Looking carefully at a couple of instances shows that they differ in a
few bytes.  E.g. one file differs by a few bytes starting at 139253 to
139264 (I might have an off by one counting issue, using emacs' buffer
positions here).  2^17 + 2^13 = 139264, which is an interesting
coincidence.  In another file I see a difference ending at 2^17+2^12
(again, I might be off by one or so in my counting).  Patching the
different hunk from a good file into a bad file (again via emacs)
results in a file that passes validation.
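
To take the off-by-one guesswork out of the comparison, the differing
byte ranges can be computed directly instead of read off emacs buffer
positions.  The sketch below is an illustration only: diff_ranges is a
hypothetical helper, the two buffers are synthetic stand-ins for a good
and a bad copy of a file, and offsets are 0-based with an exclusive
end.

```python
def diff_ranges(good, bad):
    """Return half-open (start, end) byte ranges where the inputs differ.

    Only the common prefix is compared; a length difference would have
    to be reported separately.
    """
    ranges = []
    start = None
    for i, (a, b) in enumerate(zip(good, bad)):
        if a != b:
            if start is None:
                start = i
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, min(len(good), len(bad))))
    return ranges


# Synthetic example: corrupt 12 bytes ending at offset 2^17 + 2^13.
good = bytes(140000)
bad = bytearray(good)
bad[139252:139264] = b"\xff" * 12

ranges = diff_ranges(good, bad)
print(ranges)                     # [(139252, 139264)]
print(2**17 + 2**13)              # 139264
print(ranges[0][1] % 4096)        # 0
print((2**17 + 2**12) % 4096)     # 0
```

Both end offsets mentioned above (2^17 + 2^13 and 2^17 + 2^12) are
exact multiples of 4096, so the damage appears to stop on a 4 KiB
boundary in each case.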

At one point I was blaming RAM and was pulling/swapping SIMMs.  Running
with less memory increased the likelihood of files being invalid.

I built up a similar system running 9-STABLE as of yesterday (7/16)
that uses UFS and have been unable to recreate the problem.

Given that the files are valid after exiting Picard, I do not think
that there is anything in my tagging pipeline that is causing the
problem.

The fact that the files become invalid after a reboot suggests
something in the ZFS buffering and/or interactions with the VM system.
The observation that running with less memory causes more/earlier
problems reinforces this.  The fact that the garbage in the file
happens near a power-of-two boundary also reinforces this.

My current test case involves my local version of Picard and my
plugins, and 165 flac files (some of which Picard can discover
automatically based on grip's freedb based metadata, some of which
need a helping hand).  Not particularly minimal but I'm not sure that
I can ever get it trimmed down to something trivial that a ZFS
developer might be able to run locally.

Thanks for making it this far!

How should I move forward with this?

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread George Hartzell
Andriy Gapon writes:
  on 17/07/2013 23:47 George Hartzell said the following:
   How should I move forward with this?
  
  Could you please try to reproduce this problem using a kernel built with
  INVARIANTS options?

I added INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
kernel, rebuilt it, and installed it.  Running through my test case
still generated a lot of invalid flac files.  I'm not sure what the
options are/were supposed to do though; it looks like they generally
lead to KASSERTs, which lead to abort()'s.  There was nothing in
/var/log/messages or on the console.

g.



Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-18 Thread George Hartzell
Richard Todd writes:
  George Hartzell hartz...@alerce.com writes:
  
   Hi All,
  
   I have what I think is a ZFS related bug.
   [...]
 
  [summary: Picard seems to trigger an mmap consistency bug in ZFS].
  
  [...]
  Anyway, what I'd suggest is the following: see if my patch for py-mutagen
  disabling the mmap() in those two functions lets you run picard reliably.

Removing the mmap support from those two routines seems to avoid the
issue.

  If so, then the issue is triggered by one or both of those two routines;
  hack them to print out the exact offsets used on each call and use that to 
  try and code up a simple C++ test case.  
  [...]

Your test case doesn't use mmap, I assume that you've offered it up as
a hint, not as something that's nearly done.  The shell script in
particular seems useful.

In my case I'd want to find a particular set of file size, offset, and
insertion size that triggers the problem and code up a c/c++ equiv. of
the mmap calls that py-mutagen does.  Right?

I'm hesitant about that.  I believe (and will try to prove) that the
problem does not occur deterministically for a particular track
between different test runs.  I'm worried that it's not as simple as
"using mmap to insert 27 bytes into a 1024-byte file at pos 42 causes
corruption" but rather that it depends on a more complex set of
interactions.
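
For concreteness, the kind of operation being discussed (growing a
file and shifting its tail through a memory mapping) can be sketched
in Python as below.  This is loosely modelled on the description in
this thread and is not py-mutagen's actual code; insert_bytes_mmap and
its arguments are hypothetical, and it assumes a non-empty file.

```python
import mmap
import os


def insert_bytes_mmap(path, size, offset):
    """Insert `size` zero bytes at `offset`, shifting the tail via mmap."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        old_len = f.tell()
        if not 0 <= offset <= old_len:
            raise ValueError("offset outside file")
        # Grow the file first so the mapping can cover the new length.
        f.write(b"\x00" * size)
        f.flush()
        with mmap.mmap(f.fileno(), old_len + size) as m:
            # Shift everything from `offset` onward up by `size` bytes.
            m.move(offset + size, offset, old_len - offset)
            # Clear the gap that was opened up.
            m[offset:offset + size] = b"\x00" * size
            m.flush()
```

A reduced test case along these lines would call this with the file
size, offset, and insertion size seen in a failing run, checksum the
file, reboot, and checksum it again.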

My next step will be to see if a track that has trouble in one run has
trouble in another.  If not, then I'm not sure that a simple test will
be successful.

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-19 Thread George Hartzell
Richard Todd writes:
  On Thu, Jul 18, 2013 at 11:40:51AM -0700, George Hartzell wrote:
  [...]
   [...]  
   In my case I'd want to find a particular set of file size, offset, and
   insertion size that triggers the problem and code up a c/c++ equiv. of
   the mmap calls that py-mutagen does.  Right?
  
  Yeah. 

I'm stuck.  Or I've discovered something relevant.  Or both.

I've identified a slightly simpler test case.  I load my handful of
test albums, look up a single, particular album, and save a particular
track.  The tagged flac file appears to be valid.  Then I reboot.  Now
the flac file is invalid.

It's repeatable, which is useful.

Following the lead of your test script I created a new zfs filesystem,
mounted it, and had picard save the tagged files there.  After exiting
picard the file appears to be valid.  After unmounting and remounting
the filesystem the file *still* appears to be valid.  After rebooting,
the file *still* appears to be valid.

So, it would seem that there's something about the filesystem in which
my home directory resides that contributes to the problem.

The only obvious thing I saw is that my homedir filesystem has a quota
and is 80% full.  I tried creating a new, small, zfs filesystem and
running the test there.  The tagged flac file validates successfully;
I do not see the problem (the single file makes the filesystem 88%
full).

All of the filesystems have automagically created snapshots, so I
tried creating a snapshot of the new zfs filesystem before running
through the test case.  I was still unable to replicate the problem.

My spin on your gen4.cpp test case (modified to use the filesize and
offset that picard uses) does not generate a difference when run in my
home directory followed by a reboot (picard calls insert_bytes twice,
using either set of values does not cause a problem).
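
For reference, a minimal sketch of the kind of mmap-based insertion
under discussion.  This is not mutagen's actual insert_bytes code; the
function name and the choice to zero-fill the gap are assumptions made
for illustration only.

```python
import mmap
import os


def insert_bytes_mmap(path, size, offset):
    """Insert `size` zero bytes at `offset`, shifting the tail with mmap."""
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        filesize = f.tell()
        if size <= 0 or not 0 <= offset <= filesize:
            raise ValueError("bad size or offset")
        # Grow the file first so the mapping can cover the new length.
        f.truncate(filesize + size)
        f.flush()
        with mmap.mmap(f.fileno(), filesize + size) as m:
            # Shift everything from `offset` onward up by `size` bytes.
            m.move(offset + size, offset, filesize - offset)
            # Zero-filling the gap is an assumption of this sketch.
            m[offset:offset + size] = b"\x00" * size
            m.flush()
```

Calling insert_bytes_mmap(path, 27, 42) on a 1024-byte file gives the
"27 bytes at pos 42" case; the real sizes and offsets that picard uses
would have to be substituted.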

The only difference I see in zfs get all output (excluding obvious
sizes, etc...) is that the new filesystem has xattr on via the
default, whereas my home directory has it off via temporary.  I'm
not sure why it's off.
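
A rough sketch of automating that comparison, assuming the scripted
output of "zfs get -H -o property,value,source all <dataset>" (one
tab-separated line per property).  The helper names and the list of
ignored size-like properties are made up for this example.

```python
import subprocess


def zfs_get_all(dataset):
    """Return raw `zfs get` output: tab-separated property, value, source."""
    cmd = ["zfs", "get", "-H", "-o", "property,value,source", "all", dataset]
    return subprocess.run(cmd, check=True, capture_output=True,
                          text=True).stdout


def parse_props(text):
    """Map property name -> (value, source) from scripted zfs get output."""
    props = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) == 3:
            props[fields[0]] = (fields[1], fields[2])
    return props


def diff_props(a, b, ignore=("used", "available", "referenced", "creation")):
    """Return properties whose (value, source) differ, skipping size-like ones."""
    out = {}
    for name in sorted(set(a) | set(b)):
        if name in ignore:
            continue
        if a.get(name) != b.get(name):
            out[name] = (a.get(name), b.get(name))
    return out
```

Usage would be along the lines of
diff_props(parse_props(zfs_get_all(new_fs)), parse_props(zfs_get_all(home_fs))),
where new_fs and home_fs stand in for the real dataset names.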

So, I currently have a repeatable, not-too-efficient test case using
my home directory.  I am unable to repeat the test case using a newly
created zfs filesystem (even a very small one) nor am I able to make
any headway with Richard's test case.

As I described in another thread with Andriy, adding INVARIANTS and
INVARIANT_SUPPORT to the kernel did not lead to any different
behaviour; in fact the experiments described above were run on this
new kernel.

Any suggestions for a next step?

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-19 Thread George Hartzell
George Hartzell writes:
  [...]
  So, it would seem that there's something about the filesystem in which
  my home directory resides that contributes to the problem.
  [...]

Another data point.

I just ran through my test case, saving the tagged and transcoded
files into /tmp, a zfs filesystem that was created back when I built
up the system (contemporaneously with /usr/home).  I was unable to
trigger the bug there.

As a control, I then ran through the test case, saving to a directory
in my home directory, and triggered the bug.

I then created a new directory /usr/home/foo (within the same zfs
filesystem as my home directory).  I was unable to trigger the bug
there either.

I then ran through all 165 flac files in the full test case, saving
the results to /usr/home/foo.  After exiting picard and running flac
-t on all of the files I had errors on many files, including the file
in my single-file test case above.  I did not even need to reboot.
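
The "flac -t on all of the files" check can be scripted roughly as
follows.  This is only a sketch: it assumes the flac binary is on the
PATH and that a non-zero exit status from "flac -t -s" means the file
failed the test; the function name and the tester parameter are
invented for the example.

```python
import subprocess
from pathlib import Path


def failing_flacs(root, tester=("flac", "-t", "-s")):
    """Return the .flac files under `root` for which the tester exits non-zero."""
    bad = []
    for path in sorted(Path(root).rglob("*.flac")):
        # -t tests (decodes without writing output), -s keeps flac quiet.
        result = subprocess.run([*tester, str(path)], capture_output=True)
        if result.returncode != 0:
            bad.append(path)
    return bad
```

Running it once right after exiting picard and again after a reboot
gives two lists that can be compared directly.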

I then ran the single file test case, saving into /usr/home/foo (as
above) and was now able to observe the error after a reboot.

I then ran the single file test case (again to make sure I wasn't
crazy), saving into /usr/home/foo (as above) and was now able to
observe the error after a reboot.

One more control.

Create /usr/home/bar.
Run single file test case.  Reboot.  This time I observed an invalid
flac.  Not sure what this means about the test case above.

Sigh.

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-21 Thread George Hartzell

George Hartzell writes:
  George Hartzell writes:
[...]
So, it would seem that there's something about the filesystem in which
my home directory resides that contributes to the problem.
[...]
  
  Another data point.
  
  [...]

Yet another data point or three.

I took an unused disk, set it up with a single pool and copied
everything from my two-disk system to it using zfs send/recv.  I was
hoping that if there was something goofy about the state of the
filesystems on the older two disk pool it might get cleaned up in the
transfer.

I tagged the entire set of flac files; they were all successfully
validated via the plugin.  After exiting Picard, one failed
validation.  After rebooting, many failed validation.

Next I created a new filesystem on this new pool, mounted it,
configured Picard to save to that filesystem and ran through all of
the tracks.  They validated fine via the plugin and by hand after
exiting Picard.  They also validated properly after unmounting and
remounting the filesystem and after a reboot.  Sigh.

Then I destroyed all of the snapshots on the filesystems that I
transferred over from my real dual-disk system.  Tagging all of the
flac files into my home directory generated errors from the validation
plugin and by hand after exiting picard.  I didn't bother rebooting
and checking.

So it seems to be something about the filesystem{s} themselves.

I'm running a scrub now.

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-07-24 Thread George Hartzell
George Hartzell writes:
  
  George Hartzell writes:
George Hartzell writes:
  [...]
  So, it would seem that there's something about the filesystem in which
  my home directory resides that contributes to the problem.
  [...]

Another data point.

[...]
  
  Yet another data point or three.
  
  I took an unused disk, set it up with a single pool and copied
  everything from my two-disk system to it using zfs send/recv.  I was
  hoping that if there was something goofy about the state of the
  filesystems on the older two disk pool it might get cleaned up in the
  transfer.
  
  I tagged the entire set of flac files, they were all successfully
  validated via the plugin.  After exiting Picard, one failed
  validation.  After rebooting, many failed validation.
  
  Next I created a new filesystem on this new pool, mounted it,
  configured Picard to save to that filesystem and ran through all of
  the tracks.  They validated fine via the plugin and by hand after
  exiting Picard.  They also validated properly after unmounting and
  remounting the filesystem and after a reboot.  Sigh.
  
  Then I destroyed all of the snapshots on the filesystems that I
  transferred over from my real dual-disk system.  Tagging all of the
  flac files into my home directory generated errors from the validation
  plugin and by hand after exiting picard.  I didn't bother rebooting
  and checking.
  
  So it seems to be something about the filesystem{s} themselves.
  [...]

A [small] breakthrough.  I understand why saving to a freshly created
filesystem never led to any errors.

I'd tentatively concluded that there was something hinky with the
filesystem itself that was causing the problem, something that
came along when I recreated the filesystem via zfs send/recv.

This was based on my inability to trigger the problem when I saved the
files to a newly created zfs filesystem.

Yesterday I used dump and restore to transfer my trouble-free home
directory from its UFS partition to a newly created zfs filesystem (I
hadn't known that restore would write to a zfs filesystem but it
appears to...).

The resulting system generated errors when I ran through my test
case, even though it wasn't a zfs send/recv copy.

Next I created a new zfs filesystem and arranged to write the tagged
files there.  The resulting files were error free, even after a
reboot.

Next I copied the untagged source flacs onto the newly created zfs
filesystem and ran through the test routine, saving the tagged files
to the newly created zfs filesystem.  This resulted in a glorious pile
of errors.

Conclusion: my test case only generates errors when the untagged files
are on the filesystem to which the tagged files will be written.

A bit of poking around in the sources provided the explanation.
Picard tries to move the tagged file to its final destination.  If
it's within the same filesystem this ends up being a rename operation
and I'm left with the inconsistent flac file.  If the destination is
in another filesystem then it copies the file, which ends up reading
the clean memory-resident data.
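
A small sketch of how to predict which path a move will take, on the
assumption that the move behaves like Python's shutil.move (try a
rename, fall back to copying when the destination is on another
filesystem) and that each zfs filesystem reports its own st_dev.  The
function name is made up for illustration.

```python
import os


def move_strategy(src, dest_dir):
    """Return 'rename' if src and dest_dir share a device, else 'copy'."""
    same_fs = os.stat(src).st_dev == os.stat(dest_dir).st_dev
    return "rename" if same_fs else "copy"
```

With the untagged flacs and the destination directory on the same zfs
filesystem this should report 'rename', which is the case that leaves
the bad file behind.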

So, now I have a smaller test version of my workflow that doesn't
involve rebooting my machine to generate the error.  I'll get back to
trying to come up with a variant of Richard's stand-alone
bug-tickler.

phew.

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-08-07 Thread George Hartzell
Andriy Gapon writes:
  on 18/07/2013 20:44 George Hartzell said the following:
   Andriy Gapon writes:
 on 17/07/2013 23:47 George Hartzell said the following:
  How should I move forward with this?
 
 Could you please try to reproduce this problem using a kernel built with
 INVARIANTS options?
   
   I added INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
   kernel, rebuilt it, installed it and running through my test case
   generated a lot of invalid flac files.  I'm not sure what the options
   are/were supposed to do though, it looks like they generally lead to
   KASSERTS, which lead to abort()'s.  Nothing in /var/log/messages or on
   the console.
  
  George,
  
  do you have anything new on this issue?

Since the message that you quoted I narrowed down my test case
somewhat but I have not yet produced a stand-alone tool that
reproduces it (you still have to go through picard et al.).

  Could you please try the following patch?
  http://people.freebsd.org/~avg/zfs-putpages.diff
  
  I expect it to not really fix the issue, but it may help to narrow it down.
  Please keep INVARIANTS.

Absolutely.  Probably not until the weekend, but I'll give it a go.

Thanks for following up.

g.


Re: Help with filing a [maybe] ZFS/mmap bug.

2013-09-22 Thread George Hartzell
Andriy Gapon writes:
  on 18/07/2013 20:44 George Hartzell said the following:
   Andriy Gapon writes:
 on 17/07/2013 23:47 George Hartzell said the following:
  How should I move forward with this?
 
 Could you please try to reproduce this problem using a kernel built with
 INVARIANTS options?
   
   I added INVARIANT_SUPPORT and INVARIANTS options to the GENERIC
   kernel, rebuilt it, installed it and running through my test case
   generated a lot of invalid flac files.  I'm not sure what the options
   are/were supposed to do though, it looks like they generally lead to
   KASSERTS, which lead to abort()'s.  Nothing in /var/log/messages or on
   the console.
  
  George,
  
  do you have anything new on this issue?
  
  Could you please try the following patch?
  http://people.freebsd.org/~avg/zfs-putpages.diff
  
  I expect it to not really fix the issue, but it may help to narrow it down.
  Please keep INVARIANTS.
  Thank you.
  -- 
  Andriy Gapon

Hi Andriy,

This weekend I built up a system using the 10.0 beta 2 dvd, then
updated /usr/src from head.

I grabbed a fresh copy of your patch this afternoon.

I applied your patch with no problems.  I was unable to build a new
kernel though; you have one reference to m->busy, where m is a
vm_page_t (if I remember correctly).  I dug around a bit and decided
that you meant m->busy_lock, which let me build a usable kernel.

It looks like INVARIANTS and INVARIANT_SUPPORT are included in the
GENERIC conf file.

I ran through my test routine with the original system and was able to
reproduce the problem.

After building and installing a kernel with your patch I was still
able to trigger the problem.  If anything it was worse (sample size =
1, I know...).

I did not see any interesting output in /var/log/messages or to the
console or anywhere else obvious.

I'm not sure what to do next.  It's likely that my m->busy to
m->busy_lock change was not The Right Thing to Do and might have
invalidated what the patch was trying to do.

In any case, I now have a system running HEAD and should be able to
test things more easily.

Thanks for the help,

g.