Re: KBI unexpected change in stable/11 ?

2018-03-28 Thread Greg Byshenk
On Wed, Mar 28, 2018 at 03:11:50PM +0100, tech-lists wrote:
> On 28/03/2018 14:39, Gregory Byshenk wrote:
> > You can do this manually, or by adding a PORTS_MODULES line to
> > /etc/make.conf. This will rebuild the listed modules from ports
> > when you build a new kernel.
> 
> Are you sure it's in /etc/make.conf and not /etc/src.conf?

No. But it is in the man page for make.conf and not src.conf.
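
As a minimal sketch of the make.conf approach (the port named here is
only an assumed example; list whichever kernel-module ports are actually
installed on the machine):

```conf
# /etc/make.conf
# Ports listed in PORTS_MODULES are rebuilt from the ports tree whenever a
# new kernel is built.
# x11/nvidia-driver is an assumed example, not taken from this thread.
PORTS_MODULES=x11/nvidia-driver
```

Several ports can be given as a space-separated list on the same line.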

-- 
gregory byshenk  -  gbysh...@byshenk.net  -  Leiden, NL
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: NFS and amd on older FreeBSD

2017-01-12 Thread Greg Byshenk
On Wed, Jan 11, 2017 at 03:47:37PM -0800, Karl Young wrote:
> I inherited a lab that has a few hundred hosts running FreeBSD 7.2.
> These hosts run test scripts that access files that are stored on
> FreeBSD 6.3 host.  The 6.3 host exports a /data directory with NFS
> 
> [...]
>
> $ showmount -e  9.3-host
> Exports list on 9.3-host:
> /data   Everyone
> 
> But I can't automount it:
> 
> $ ls -l /net/9.3-host/data
> ls: /net/9.3-host/data: No such file or directory
> 
> If I manually mount the exported directory, it works:
> 
> $ sudo mount -t nfs 9.3-host:/data /mnt/data/
> $ mount | grep nfs
> 9.3-host:/data on /mnt/data (nfs)
> 
> $ ls -l /mnt/data
> total 4
> drwxr-xr-x  9 root  wheel  512 Dec 20 17:41 iaf2
> 
> I've spent some time on Google, but haven't found a solution.  I realize
> these are very old versions, but I'm not in a position to upgrade them
> right now.  My last resort will be to use /etc/fstab to do the NFS
> mount, but I'd rather avoid that if I can.

If you can mount the share manually, there is almost 
certainly nothing wrong with the server. Based on the
error ("No such file or directory"), I would recommend
checking your amd config on the client.
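
As a rough sketch of what to look for on the client, assuming the stock
FreeBSD amd setup (the flags below are what I believe the defaults in
/etc/defaults/rc.conf to be; check your own copy): the /net path can
only resolve if amd is running and has a map attached to /net.

```conf
# /etc/rc.conf on the client (assumed stock setup; verify against
# /etc/defaults/rc.conf on your release)
amd_enable="YES"
# The "/net /etc/amd.map" pair is what makes /net/<host>/<export> resolve.
amd_flags="-a /.amd_mnt -l syslog /host /etc/amd.map /net /etc/amd.map"
```

With amd running, 'amq' should list /net among its mount points; if it
does not, the map is probably not attached.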


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: ZFS Panic after freebsd-update

2013-07-02 Thread Greg Byshenk
On Tue, Jul 02, 2013 at 12:57:16AM -0700, Jeremy Chadwick wrote:
 
> But in the OP's case, the situation sounds dire given the limitations --
> limitations that someone (apparently not him) chose, which greatly
> hinder debugging/troubleshooting.  Had a heterogeneous setup been
> chosen, the debugging/troubleshooting pains are less (IMO).  When I see
> this, it makes me step back and ponder the decisions that lead to the
> ZFS-only setup.

As an observer (though one who has used ZFS for some time, now),
I might suggest that this can at least -seem- like FUD about ZFS
because the limitations don't necessarily have anything to do
with ZFS. That is, a situation in which one cannot recover, nor
even effectively troubleshoot, if there is a problem, will be a
dire one, regardless of what the problem might be or where its
source might lie.

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL - Portland, OR USA


Re: svn - but smaller?

2013-01-25 Thread Greg Byshenk
On Fri, Jan 25, 2013 at 03:12:03PM +0200, Daniel Kalchev wrote:
[...]
> It is absurd to require the installation of any port, if your only
> intention is to update the base system sources.

I think others have already pointed this out, but
if your only intention is to update the base system
sources, then 'freebsd-update' (from the base system)
will do the job.

Or am I missing/misunderstanding something?
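
A minimal sketch, assuming a stock /etc/freebsd-update.conf and a
machine that tracks a release branch (freebsd-update does not follow
-STABLE): the sources are only kept up to date if 'src' is among the
configured components.

```conf
# /etc/freebsd-update.conf
# 'src' must be listed for /usr/src to be updated along with the rest.
Components src world kernel
```

After that, 'freebsd-update fetch' followed by 'freebsd-update install'
applies the updates.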

 
-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL - Portland, OR USA


Re: fsck_ufs running too often

2012-06-23 Thread Greg Byshenk
On Sat, Jun 23, 2012 at 06:23:58PM +1000, Sean wrote:
> On 23/06/2012, at 7:47 AM, Leonardo M. Ramé wrote:
>
> > Hi, since a few of days ago, I noticed my home server turns very
> > slow more than once a day, so every time I run top to see what's
> > processes are running, I can see fsck_ufs at the very top, and the
> > hard drive working like mad.
> >
> > I've checked my crontab and there's nothing related to fsck_ufs,
> > where can I start searching for the cause of the problem?, I
> > thought this process should run only at boot or shutdown, but this
> > time it is running -apparently- without a cause.
>
> Background fsck. Your server crashed, rebooted, started up and fsck
> is running in the background while everything else continues.
>
> [...]
>
> The more important thing is to find out why it crashed - if there
> was a power outage, hardware or software issue.

Another thing to do is look in the logs to see if background fsck
is failing for some reason. I've seen it happen in some cases that
background fsck fails and asks for a manual run, in which case the
filesystem remains dirty, and further reboots will continue to fail
until a manual fsck is run.
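
A small sketch of the log check, in case it helps. The function name is
made up, and /var/log/messages is only the assumed default location:

```shell
#!/bin/sh
# Print log lines that mention fsck, so that failed background runs
# (and any request for a manual run) stand out.
# The log path is an optional argument; /var/log/messages is assumed.
fsck_log_lines() {
    grep -i 'fsck' "${1:-/var/log/messages}"
}
```

If the output shows that a manual run is being requested, boot to
single-user mode and run 'fsck -y' on the affected filesystem.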


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL - Portland, OR USA


Re: Experience with Intel SATA and fbsd 8.3-amd64 ?

2012-06-15 Thread Greg Byshenk
On Fri, Jun 15, 2012 at 07:20:28PM +0200, Kurt Jaeger wrote:
> > Kurt Jaeger <li...@opsec.eu> wrote:
> >
> > > I have a problem with some host: If I put heavy IO load on that
> > > system, write errors happen, and then it crashes.
> >
> > What kind of write errors, exactly?  What messages do you
> > get on the console?
>
> g_vfs_done():ada0s1f[WRITE(offset=50699862016, length=16384)]error = 22
> g_vfs_done():ada0s1f[WRITE(offset=50699862016, length=16384)]error = 22
> g_vfs_done():ada0s1e[WRITE(offset=44693307392, length=16384)]error = 22
> g_vfs_done():ada0s1e[WRITE(offset=44693211136, length=2048)]error = 5
>
> > It's also worth mentioning that such problems could also
> > be caused by bad RAM, or even by the power supply (though
> > the latter is unlikely in this case, I think).
>
> Well, the device was probably a bit on the cheap side (ALLNET FW9000).

Could it be a device problem? I've seen that type of error
(including a crash in the end) when a device can't handle DMA.
Disabling DMA solved the problem for me.
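
To see at a glance which slices and error numbers are involved, the
g_vfs_done() messages can be tallied. This is only a sketch: the
function name is made up, and it assumes the messages are fed in on
standard input (for example from /var/log/messages). Per errno(2),
5 is EIO and 22 is EINVAL.

```shell
#!/bin/sh
# Read kernel log text on stdin and print one line per (device, errno)
# pair in the form "<device> <errno> <count>".
summarize_gvfs_errors() {
    # Pull the device name and error number out of each g_vfs_done() line,
    # then count identical pairs.
    sed -n 's/^.*g_vfs_done():\([^[]*\)\[.*]error = \([0-9][0-9]*\).*$/\1 \2/p' |
        sort | uniq -c | awk '{ print $2, $3, $1 }'
}
```

Usage would be something like 'summarize_gvfs_errors < /var/log/messages'.
A mix of EIO and EINVAL across several slices of the same disk points at
the disk, controller, or transfer mode rather than at one filesystem.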


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL - Portland, OR USA


Re: Strange 'hangs' with RELENG_9

2012-01-19 Thread Greg Byshenk
On Thu, Jan 19, 2012 at 04:00:24PM +0100, László KÁROLYI wrote:
>
> Moreover, I couldn't set SCHED_BSD in the kernel config, it said that
> it's an illegal option. Maybe it does not exist in RELENG_9.

This should be 

options SCHED_4BSD
              ^
if you want to try it.

It can be used with RELENG_9; check the NOTES file.
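
A minimal sketch of the change, assuming a custom kernel config copied
from GENERIC (the file name MYKERNEL and the amd64 path are
hypothetical). Only one scheduler can be selected, so the ULE line has
to be removed or commented out:

```conf
# /usr/src/sys/amd64/conf/MYKERNEL  (hypothetical name, copied from GENERIC)
#options 	SCHED_ULE		# ULE scheduler (the GENERIC default)
options 	SCHED_4BSD		# 4BSD scheduler
```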


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Serial multiport error Oxford/Startech PEX2S952

2011-08-24 Thread Greg Byshenk
On Mon, Aug 22, 2011 at 11:59:11AM +0100, David Wood wrote:
 
> In message <20110822094756.gj92...@core.byshenk.net>, Greg Byshenk
> <free...@byshenk.net> writes
> > It doesn't seem to matter; both cuau?.lock and cuau?.init produce the
> > error (for both ports), and cuau? itself remains a no-op.
>
> You could try
> hint.uart.2.baud=115200
>
> in /boot/device.hints - making the relevant changes to port number and
> speed according to your needs.

This does not help; speed remains set to 9600.
 
 
> > Now that I can see that the card is working (at least minimally), it
> > begins to look as if there might be a problem somewhere in 9.x. I'll
> > try to install 8.x and see if the results are different.
>
> It will be interesting to see if there is a difference between 8.x and
> 9.x.

Yes, there is.

Using 8-STABLE (with sources from 17 August 2011) and inbuilt puc,
the controller works as expected. It defaults to 9600, but setting
the speed on the cuaa?.lock and cuaa?.init devices works.

Interestingly, setting the speed in device.hints does _not_ work.


So, it appears that there is something wrong (or at least different)
with 9.x

Doing some poking around, I see that, in 9.x, termios.h is not
included in dev/uart/uart_core.c and dev/uart/uart_tty.c, while
it is included under 8.x.

If I look at the 8.x .c files, they want 

#include <sys/termios.h>

... which appears to no longer be used. But adding either that,
or

#include <termios.h>

... produces errors:

/usr/src/sys/dev/uart/uart_core.c:47:21: error: termios.h: No such file or directory
/usr/src/sys/dev/uart/uart_tty.c:42:21: error: termios.h: No such file or directory
mkdep: compile failed
*** Error code 1

Though a fresh build of world seems to produce termios.h:

# find /usr/obj/ |grep termios.h
/usr/obj/usr/src/lib32/usr/include/sys/termios.h
/usr/obj/usr/src/lib32/usr/include/sys/_termios.h
/usr/obj/usr/src/lib32/usr/include/termios.h
/usr/obj/usr/src/tmp/usr/include/termios.h
/usr/obj/usr/src/tmp/usr/include/sys/termios.h
/usr/obj/usr/src/tmp/usr/include/sys/_termios.h
# 

But I may be completely confused here, as I don't pretend to be
familiar with all of the details of the build process.


Does this look like a bug with 9.x, or something that should be
done differently?


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Serial multiport error Oxford/Startech PEX2S952

2011-08-22 Thread Greg Byshenk
On Mon, Aug 22, 2011 at 12:20:33AM +0200, Greg Byshenk wrote:
 On Sun, Aug 21, 2011 at 09:44:41PM +0100, David Wood wrote:
  
  I wrote and contributed the support code for the OXPCIe95x serial chips 
  - and just happened to notice your report.
 
 Thanks for the response.
 
 
  In message 20110821154249.ge92...@core.byshenk.net, Greg Byshenk 
  free...@byshenk.net writes
  I'm having a problem with a StarTech PEX2S952 dual-port serial
  card.
  
  I believe that it should be supported, as it has this entry in
  pucdata.c
  
  [...]
 {   0x1415, 0xc158, 0x, 0,
 Oxford Semiconductor OXPCIe952 UARTs,
 DEFAULT_RCLK * 0x22,
 PUC_PORT_NONSTANDARD, 0x10, 0, -1,
 .config_function = puc_config_oxford_pcie
 },
  [...]
  
  It should be supported. The OXPCIe952 is more awkward to support than 
  the OXPCIe954 and OXPCIe958 because it can be configured in so many 
  different ways by the board manufacturer. However, 0xc158 is 
  configuration that is identical in arrangement as the larger chips, so 
  is the configuration I'm most confident of. I've just double-checked the 
  data sheets, and can't see any relevant differences between 0xc158 
  OXPCIe952 and the OXPCIe954 I tested the code with.
  
  I use my OXPCIe954 board on FreeBSD 8.2, and have had success reports 
  from other OXPCIe954 and OXPCIe958 board users (including someone with a 
  16 port board based on dual OXPCIe958s). I have yet to try FreeBSD 9.x 
  on my hardware.
  
  
  And, while it is recognized at boot -- after adding
  
device  puc
options COM_MULTIPORT
  
  I'm 99% certain that options COM_MULTIPORT relates to the old sio(4) 
  code - I certainly don't need it on 8.x. Does it make any difference if 
  you delete that line and just leave device puc?
 
 I will rebuild my kernel and try.
  
  
  to my kernel, it doesn't seem to be working. The devices '/dev/cuau2'
  and '/dev/cuau3' show up, and I can connect to them, but they don't
  seem to pass any traffic. If I connect to the serial console of
  another machine (one that I know for certain is working), I get
  nothing at all.
  
  Have you remembered to set the speed (and other relevant options) on the 
  .init devices? This is a feature (or is it a quirk) of the uart(4) 
  driver that catches many people out. Setting options on the base device 
  is normally a no-op.
  
  For example, if the remote device on /dev/cuau2 operates at 115200 bps 
  with hardware handshaking, try:
  
  stty -f /dev/cuau2.init speed 115200 crtscts
 
 Interestingly, it -is- a no-op on the device, which I hadn't noticed.
 But trying to set it on the .init fails:
 
   # stty -f /dev/cuau2.init speed 115200 crtscts
   stty: /dev/cuau2.init isn't a terminal
   # 
 
  
  One frustrating aspect of adding puc(4) support for many devices is that 
  you can't be certain of the clock rate multiplier - the same device can 
  crop up on a different manufacturer's board with a different multiplier. 
  This problem doesn't occur with the OXPCIe95x devices as they derive 
  their 62.5MHz UART clock from the PCI Express clock. Consequently, the 
  problem can't be that your board inadvertently operating the UARTs at 
  the wrong speed.
  
  
  I suspect (?) that it may not be recognized as the proper card. Boot
  and pciconf messages are:
  
  puc0: Oxford Semiconductor OXPCIe952 UARTs mem 
  0xf9dfc000-0xf9df,0xfa00-0xfa1f,0xf9e0-0xf9ff irq 
  30 at device 0.0 on pci4
  
  That is correct. Are there any more lines afterwards - especially one 
  giving the number of UARTs detected? That line is crucial, as, on these 
  chips, the number of UARTs has to be read from configuration space 
  because you can slave two chips together.
  
  My OXPCIe954 board is recognised thus (FreeBSD 8.2 amd64):
  
  puc0: Oxford Semiconductor OXPCIe954 UARTs mem 
  0xd5efc000-0xd5ef,0xd5c0-0xd5df,0xd5a0-0xd5bf irq 18 
  at device 0.0 on pci8
  puc0: 4 UARTs detected
  puc0: [FILTER]
  uart2: 16950 or compatible on puc0
  uart2: [FILTER]
  uart3: 16950 or compatible on puc0
  uart3: [FILTER]
  uart4: 16950 or compatible on puc0
  uart4: [FILTER]
  uart5: 16950 or compatible on puc0
  uart5: [FILTER]
 
 puc0: Oxford Semiconductor OXPCIe952 UARTs mem 
 0xf9dfc000-0xf9df,0xfa00-0xfa1f,0xf9e0-0xf9ff irq 30 at 
 device 0.0 on pci4
 puc0: 2 UARTs detected
 uart2: 16950 or compatible at port 1 on puc0
 uart3: 16950 or compatible at port 2 on puc0
 
  
  puc0@pci0:4:0:0:class=0x070002 card=0xc1581415 chip=0xc1581415 
  rev=0x00 hdr=0x00
 vendor = 'Oxford Semiconductor Ltd'
 class  = simple comms
 subclass   = UART
 bar   [10] = type Memory, range 32, base 0xf9dfc000, size 16384, enabled
 bar   [14] = type Memory, range 32, base 0xfa00, size 2097152, 
 enabled
 bar   [18] = type Memory, range 32, base 0xf9e0, size 2097152

Re: Serial multiport error Oxford/Startech PEX2S952

2011-08-22 Thread Greg Byshenk
On Mon, Aug 22, 2011 at 10:23:14AM +0100, David Wood wrote:
 
 In message 20110822083336.gi92...@core.byshenk.net, Greg Byshenk 
 free...@byshenk.net writes
 On Mon, Aug 22, 2011 at 12:20:33AM +0200, Greg Byshenk wrote:
 puc0: Oxford Semiconductor OXPCIe952 UARTs mem 
 0xf9dfc000-0xf9df,0xfa00-0xfa1f,0xf9e0-0xf9ff irq 
 30 at device 0.0 on pci4
 puc0: 2 UARTs detected
 uart2: 16950 or compatible at port 1 on puc0
 uart3: 16950 or compatible at port 2 on puc0
 
 This indicates that the puc(4) code is working correctly - it recognises 
 the board, reads via one of the BARs to confirm there are two UARTs, 
 initialises both UARTs to 16950 mode, then hands off these ports to 
 uart(4).
 
 I'll follow up tomorrow. Thanks.
 
 Following up:
 
 It appears that indeed, the options COM_MULTIPORT is unnecessary
 for 9-BETA; I've rebuilt the kernel without it, and the card is
 still recognized, along with the ports.
 
 That's what I expected. The only line needed is device puc. I have no 
 idea why this can't be included in GENERIC, especially as puc(4) doesn't 
 work as a module (no drivers are attached to the ports on the puc 
 board).
 
 
 But all is not as it should be. I still can't set the speed on the
 card.
 
  # stty -f /dev/cuau2.init speed 115200 crtscts
  stty: /dev/cuau2.init isn't a terminal
  #
 
 And setting speed on the device itself remains a no-op:
 
   # stty -f /dev/cuau2 speed 115200 crtscts
   9600
   #
 
 That said, the card -does- seem to work, at least at some level.
 With the speed issue pointed out, I set the connection on the
 other end to 9600, and then it works. But I'd really like it to
 be faster than that (it's just a serial console, so we could
 probably live with 9600, though we wouldn't like it).
 
 If there is reason to think that this could be a 9.x issue,
 then I could try going to 8.x.
 
 My earlier instructions omitted mention of the lock, which is really 
 needed if you want to force a particular speed
 
 
 On 8.2:
 
 [root@manganese ~]# PORT='/dev/cuau5' ; OPTIONS='speed 115200 crtscts' ; 
 stty -f ${PORT}.lock 0 ; stty -f ${PORT}.init ${OPTIONS} > /dev/null ; 
 stty -f ${PORT}.lock 1 ; stty -f ${PORT}
 speed 115200 baud;
 lflags: echoe echoke echoctl
 oflags: tab0
 cflags: cs8 -parenb crtscts
 [root@manganese ~]# cu -l cuau5
 Connected
 ATI4
 U.S. Robotics 56K FAX EXT Settings...
 
B0  E1  F1  L2  M1  Q0  V1  X4  Y1
SPEED=115200  PARITY=N  WORDLEN=8
DIAL=TONEOFF LINE   CID=1
 
A3  B1  C1  D2  H2  I2  K1
M4  N0  R1  S0  T5  U0  Y1
 
S00=000  S01=000  S02=043  S03=013  S04=010  S05=008  S06=004
S07=060  S08=002  S09=006  S10=014  S11=072  S12=050  S13=000
S15=000  S16=000  S18=000  S19=000  S21=010  S22=017  S23=019
S25=005  S27=001  S28=008  S29=020  S30=000  S31=128  S32=002
S33=000  S34=000  S35=000  S36=014  S38=000  S39=012  S40=000
S41=004  S42=000
 
LAST DIALLED #:
 
 OK
 ~
 [EOT]
 [root@manganese ~]# PORT='/dev/cuau5' ; OPTIONS='speed 38400 crtscts' ; 
 stty -f ${PORT}.lock 0 ; stty -f ${PORT}.init ${OPTIONS} > /dev/null ; 
 stty -f ${PORT}.lock 1 ; stty -f ${PORT}
 speed 38400 baud;
 lflags: echoe echoke echoctl
 oflags: tab0
 cflags: cs8 -parenb crtscts
 [root@manganese ~]# cu -l cuau5
 Connected
 ATI4
 U.S. Robotics 56K FAX EXT Settings...
 
B0  E1  F1  L2  M1  Q0  V1  X4  Y1
SPEED=38400  PARITY=N  WORDLEN=8
DIAL=TONEOFF LINE   CID=1
 
A3  B1  C1  D2  H2  I2  K1
M4  N0  R1  S0  T5  U0  Y1
 
S00=000  S01=000  S02=043  S03=013  S04=010  S05=008  S06=004
S07=060  S08=002  S09=006  S10=014  S11=072  S12=050  S13=000
S15=000  S16=000  S18=000  S19=000  S21=010  S22=017  S23=019
S25=005  S27=001  S28=008  S29=020  S30=000  S31=128  S32=002
S33=000  S34=000  S35=000  S36=014  S38=000  S39=012  S40=000
S41=004  S42=000
 
LAST DIALLED #:
 
 OK
 ~
 [EOT]
 
 
 This is one of my OXPCIe954 ports - the modem on that port identifies 
 the speed it is being talked to in the ATI4 output.
 
 If this is a 9.x issue, it seems more likely to be in the uart(4) code - 
 though I haven't been following development. If you are getting nowhere 
 with 9.x, can you try with 8.x? stable/8 might be the best choice, as 
 the necessary pucdata.c changes postdates 8.2-RELEASE. That said, I 
 patch 8.2-RELEASE on my machine, choosing to keep things conservative.
 
 I look forward to your feedback.

It doesn't seem to matter; both cuau?.lock and cuau?.init produce the
error (for both ports), and cuau? itself remains a no-op.
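
For reference, the unlock / set / lock sequence from the 8.2 transcript
quoted above can be wrapped up as below. This is a sketch only: the
function names and the DRY_RUN switch are my own additions, and the
stty calls simply mirror the quoted one-liner.

```shell
#!/bin/sh
# Run a command, or just print it when DRY_RUN is set (useful for checking
# what would be done without touching the device).
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Usage: set_port_speed /dev/cuau5 speed 115200 crtscts
set_port_speed() {
    port=$1
    shift
    run stty -f "${port}.lock" 0 &&       # release the lock device
    run stty -f "${port}.init" "$@" &&    # set defaults on the .init device
    run stty -f "${port}.lock" 1 &&       # lock again, as in the transcript
    run stty -f "${port}"                 # show the resulting settings
}
```

Running it with DRY_RUN=1 prints the four stty commands without
executing them.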

Now that I can see that the card is working (at least minimally), it
begins to look as if there might be a problem somewhere in 9.x. I'll
try to install 8.x and see if the results are different.

I'll follow up again when I have something to report.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL

Serial multiport error Oxford/Startech PEX2S952

2011-08-21 Thread Greg Byshenk
Not sure if -stable is the right place for this, but I'll give it
a shot; if it's not, then a pointer in the right direction would
be much appreciated.

I'm having a problem with a StarTech PEX2S952 dual-port serial
card.

I believe that it should be supported, as it has this entry in
pucdata.c

[...]
    {   0x1415, 0xc158, 0xffff, 0,
        "Oxford Semiconductor OXPCIe952 UARTs",
        DEFAULT_RCLK * 0x22,
        PUC_PORT_NONSTANDARD, 0x10, 0, -1,
        .config_function = puc_config_oxford_pcie
    },
[...]

And, while it is recognized at boot -- after adding

device  puc
options COM_MULTIPORT

to my kernel, it doesn't seem to be working. The devices '/dev/cuau2'
and '/dev/cuau3' show up, and I can connect to them, but they don't
seem to pass any traffic. If I connect to the serial console of
another machine (one that I know for certain is working), I get 
nothing at all.

I suspect (?) that it may not be recognized as the proper card. Boot
and pciconf messages are:

puc0: <Oxford Semiconductor OXPCIe952 UARTs> mem 
0xf9dfc000-0xf9dfffff,0xfa000000-0xfa1fffff,0xf9e00000-0xf9ffffff irq 30 at 
device 0.0 on pci4

puc0@pci0:4:0:0:class=0x070002 card=0xc1581415 chip=0xc1581415 rev=0x00 
hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
class  = simple comms
subclass   = UART
bar   [10] = type Memory, range 32, base 0xf9dfc000, size 16384, enabled
bar   [14] = type Memory, range 32, base 0xfa000000, size 2097152, enabled
bar   [18] = type Memory, range 32, base 0xf9e00000, size 2097152, enabled

The kernel is actually FreeBSD 9.0-BETA1 amd64, which is not quite
'STABLE' yet, but I don't think that this should matter.

Any advice would be much appreciated. The machine is still in
test phase, so I can mess around with it as necessary.

Thanks.

-- 
greg byshenk  -  free...@byshenk.net  -  Leiden, NL


Re: Serial multiport error Oxford/Startech PEX2S952

2011-08-21 Thread Greg Byshenk
On Sun, Aug 21, 2011 at 09:44:41PM +0100, David Wood wrote:
 
 I wrote and contributed the support code for the OXPCIe95x serial chips 
 - and just happened to notice your report.

Thanks for the response.


 In message 20110821154249.ge92...@core.byshenk.net, Greg Byshenk 
 free...@byshenk.net writes
 I'm having a problem with a StarTech PEX2S952 dual-port serial
 card.
 
 I believe that it should be supported, as it has this entry in
 pucdata.c
 
 [...]
{   0x1415, 0xc158, 0x, 0,
Oxford Semiconductor OXPCIe952 UARTs,
DEFAULT_RCLK * 0x22,
PUC_PORT_NONSTANDARD, 0x10, 0, -1,
.config_function = puc_config_oxford_pcie
},
 [...]
 
 It should be supported. The OXPCIe952 is more awkward to support than 
 the OXPCIe954 and OXPCIe958 because it can be configured in so many 
 different ways by the board manufacturer. However, 0xc158 is 
 configuration that is identical in arrangement as the larger chips, so 
 is the configuration I'm most confident of. I've just double-checked the 
 data sheets, and can't see any relevant differences between 0xc158 
 OXPCIe952 and the OXPCIe954 I tested the code with.
 
 I use my OXPCIe954 board on FreeBSD 8.2, and have had success reports 
 from other OXPCIe954 and OXPCIe958 board users (including someone with a 
 16 port board based on dual OXPCIe958s). I have yet to try FreeBSD 9.x 
 on my hardware.
 
 
 And, while it is recognized at boot -- after adding
 
   device  puc
   options COM_MULTIPORT
 
 I'm 99% certain that options COM_MULTIPORT relates to the old sio(4) 
 code - I certainly don't need it on 8.x. Does it make any difference if 
 you delete that line and just leave device puc?

I will rebuild my kernel and try.
 
 
 to my kernel, it doesn't seem to be working. The devices '/dev/cuau2'
 and '/dev/cuau3' show up, and I can connect to them, but they don't
 seem to pass any traffic. If I connect to the serial console of
 another machine (one that I know for certain is working), I get
 nothing at all.
 
 Have you remembered to set the speed (and other relevant options) on the 
 .init devices? This is a feature (or is it a quirk) of the uart(4) 
 driver that catches many people out. Setting options on the base device 
 is normally a no-op.
 
 For example, if the remote device on /dev/cuau2 operates at 115200 bps 
 with hardware handshaking, try:
 
 stty -f /dev/cuau2.init speed 115200 crtscts

Interestingly, it -is- a no-op on the device, which I hadn't noticed.
But trying to set it on the .init fails:

# stty -f /dev/cuau2.init speed 115200 crtscts
stty: /dev/cuau2.init isn't a terminal
# 

 
 One frustrating aspect of adding puc(4) support for many devices is that 
 you can't be certain of the clock rate multiplier - the same device can 
 crop up on a different manufacturer's board with a different multiplier. 
 This problem doesn't occur with the OXPCIe95x devices as they derive 
 their 62.5MHz UART clock from the PCI Express clock. Consequently, the 
 problem can't be that your board inadvertently operating the UARTs at 
 the wrong speed.
 
 
 I suspect (?) that it may not be recognized as the proper card. Boot
 and pciconf messages are:
 
 puc0: Oxford Semiconductor OXPCIe952 UARTs mem 
 0xf9dfc000-0xf9df,0xfa00-0xfa1f,0xf9e0-0xf9ff irq 
 30 at device 0.0 on pci4
 
 That is correct. Are there any more lines afterwards - especially one 
 giving the number of UARTs detected? That line is crucial, as, on these 
 chips, the number of UARTs has to be read from configuration space 
 because you can slave two chips together.
 
 My OXPCIe954 board is recognised thus (FreeBSD 8.2 amd64):
 
 puc0: Oxford Semiconductor OXPCIe954 UARTs mem 
 0xd5efc000-0xd5ef,0xd5c0-0xd5df,0xd5a0-0xd5bf irq 18 
 at device 0.0 on pci8
 puc0: 4 UARTs detected
 puc0: [FILTER]
 uart2: 16950 or compatible on puc0
 uart2: [FILTER]
 uart3: 16950 or compatible on puc0
 uart3: [FILTER]
 uart4: 16950 or compatible on puc0
 uart4: [FILTER]
 uart5: 16950 or compatible on puc0
 uart5: [FILTER]

puc0: <Oxford Semiconductor OXPCIe952 UARTs> mem 
0xf9dfc000-0xf9dfffff,0xfa000000-0xfa1fffff,0xf9e00000-0xf9ffffff irq 30 at 
device 0.0 on pci4
puc0: 2 UARTs detected
uart2: <16950 or compatible> at port 1 on puc0
uart3: <16950 or compatible> at port 2 on puc0

 
 puc0@pci0:4:0:0:class=0x070002 card=0xc1581415 chip=0xc1581415 
 rev=0x00 hdr=0x00
vendor = 'Oxford Semiconductor Ltd'
class  = simple comms
subclass   = UART
bar   [10] = type Memory, range 32, base 0xf9dfc000, size 16384, enabled
bar   [14] = type Memory, range 32, base 0xfa00, size 2097152, 
enabled
bar   [18] = type Memory, range 32, base 0xf9e0, size 2097152, 
enabled
 
 That is correct.
 
 The kernel is actually FreeBSD 9.0-BETA1 amd64, which is not quite
 'STABLE' yet, but I don't think that this should matter.
 
 Any advice would be much

Re: Question about packages installed via `pkg_add -r`

2011-03-06 Thread Greg Byshenk
On Sun, Mar 06, 2011 at 10:09:17AM +0800, Yue Wu wrote:
 On Sat, Mar 05, 2011 at 08:46:47PM -0500, ill...@gmail.com wrote:
  On 5 March 2011 20:14, Yue Wu vano...@gmail.com wrote:
   On Sat, Mar 05, 2011 at 08:02:47PM -0500, ill...@gmail.com wrote:
   On 5 March 2011 20:00, Yue Wu vano...@gmail.com wrote:
Hello, sorry for poor English, I will try to explan clearer with my
best.
   
On Sat, Mar 05, 2011 at 04:48:17PM +0100, Greg Byshenk wrote:
On Sat, Mar 05, 2011 at 11:04:36PM +0800, Yue Wu wrote:
   
 I'm trying to use package instead of ports these day, but a few
 questions have:

 1. How to reserve packages that fetched via `pkg_add -r`?

 2. How to know if there are updates for packages, and how to update?
   
For (1), do you mean 'preserve', as in save a copy?  If so, then
'portmaster -b [...]' will save a backup copy of installed packages.
   
Yes, I mean 'preserve'. I've maned portmaster, seems -b is for a
installed package, so it will preserve it by packing up the files from
a installed package, why not preserve it just when fetching with
`pkg_add -r`? I think it's the best way, I don't like the portmaster
way to do it after.
  
   from man 1 pkg_add:
  
        -K, --keep
                Keep any downloaded package in PKGDIR if it is defined
                or in current directory by default.
  
  
   Thanks, sorry for no attentively reading ;p
  
   Another question arises after checking the pkg 'pkg_add' saves, why the
   pkg doesn't have a version appended to its name, it's hard to know the
   version the pkg downloaded...
  
  Without digging in too deeply (I use ports, so I'm not the
  _most_ knowledgeable on packages) I believe it has to
  do with the fact that the packages are symlinked to non-
  versioned names on the distribution server(s), probably
  to simplify fetching.  The packages themselves should
  have the version information in their metadata somewhere,
  which might be possible to rename via script.
  
  I apologise if that isn't helpful.
 
 Thank you for info, I got the reason :)
 
 ports with portmaster makes pkg installation mangement be much more
 flexiable and more friendly than package by pkg_add -r on FreeBSD,
 except that ports take much more time and resource. After trying with
 packages, I think I have to stick to ports.

As suggested by some of the other comments, you can choose to use
portmaster with packages, if you prefer not to do local builds.

In my own case, I use ports and packages, via portmaster. That is,
I use one machine to build locally-configured packages (in some 
cases with non-standard options), and then install them on the rest
of the machines as packages. It works very well in my environment.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Question about packages installed via `pkg_add -r`

2011-03-05 Thread Greg Byshenk
On Sat, Mar 05, 2011 at 11:04:36PM +0800, Yue Wu wrote:
 
 I'm trying to use package instead of ports these day, but a few
 questions have:
 
 1. How to reserve packages that fetched via `pkg_add -r`?
 
 2. How to know if there are updates for packages, and how to update?

For (1), do you mean 'preserve', as in save a copy?  If so, then
'portmaster -b [...]' will save a backup copy of installed packages.

There may be a better way, but one way to deal with (2) is to have an
up-to-date ports tree. Then 'pkg_version -vL=' will show you which of
your ports are out of date. Then 'portmaster -PP [...]' will force
package use for updates.

If you have an up-to-date ports tree, then I think that

portmaster -abPP

will update all of your ports, using packages, and save a backup copy
of the installed versions.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: root mount error

2010-12-28 Thread Greg Byshenk
On Tue, Dec 28, 2010 at 01:36:01AM +0100, Damien Fleuriot wrote:
 On 12/27/10 9:18 PM, Michael BlackHeart wrote:

  I've got trouble with FreeBSD 8 Stable
  First I've put on notebook 8.2 RELEASE amd64, then SVN'ed src's to
  yesterday revision I don't remember exact number, but I've have this
  problem aobut week or two so it's not so important, also as it doesn't
  work on i386 too.
  
  After installing new kernel I've just build - indeed it always was
  GENERIC for both arch's on clean system - I've got an a kernel painc
  caused by disability to mount root partition because kernel couldn't
  see the drive. By pressing '?' I've sen only acd0 that represents
  CD-ROM.
  
  In debug messages I haven't found anything about ad0 - than hdd was
  identified before new kernel was installed.
  I've got an HP 6720s notebook with SATA 160GB Hitachi HDD that is
  working with diabled SATA native mode.
  
  I've not found any info 'bout this error in recent 8.Stable so I don't
  know how to handle this one.

 First, I'd advise making use of FreeBSD's nextboot utility to test new
 kernels:
 http://fuse4bsd.creo.hu/localcgi/man-cgi.cgi?nextboot+8
 
 Second, I would suggest reading the handbook's excellent section on
 upgrading your machine or rebuilding the kernel:
 http://www.freebsd.org/doc/en_US.ISO8859-1/books/handbook/updating-upgrading.html
 
 Now, a likely cause of your problem is the installation of a custom
 kernel with removed support for whatever your hard disk drive or raid
 controller is recognized as.
 
 Did you reinstall your old, working kernel, or are you actually asking
 for help doing just that ?

What kind of laptop?

For information, I had a similar problem when I updated my laptop
(HP Compaq 6910p) to 8.2-PRERELEASE as of 14 December. For some reason,
the system was no longer seeing the main hard drive. 

I solved the problem by setting 'SATA Native Mode' (or some such) in the
BIOS, which then led my (SATA) drive to be seen at '/dev/ad8'. After 
booting from ad8 and modifying my 'fstab', everything works fine.

So you might try the same thing. At least change the setting in your 
BIOS to see if you can see a drive.
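
As an illustration of the fstab change described above, here is a minimal
Python sketch. It assumes the old entries refer to /dev/ad0 and that the
disk reappears as ad8 (that is what happened on my machine; yours may get
a different number). The function name and the sample fstab are made up
for the example.

```python
import re

# Hypothetical sample; a real /etc/fstab will differ.
SAMPLE_FSTAB = """# Device      Mountpoint  FStype  Options    Dump  Pass#
/dev/ad0s1b   none        swap    sw         0     0
/dev/ad0s1a   /           ufs     rw         1     1
/dev/acd0     /cdrom      cd9660  ro,noauto  0     0
"""


def retarget_fstab(fstab_text, old_disk="ad0", new_disk="ad8"):
    """Rename /dev/<old_disk>... device entries to /dev/<new_disk>...

    Only entries at the start of a line are touched, so comment lines
    and other devices (acd0, ad01, ...) are left alone.
    """
    pattern = re.compile(
        r"^(\s*/dev/)" + re.escape(old_disk) + r"(?=\D)", re.MULTILINE
    )
    return pattern.sub(lambda m: m.group(1) + new_disk, fstab_text)


if __name__ == "__main__":
    print(retarget_fstab(SAMPLE_FSTAB))
```

Review the result before putting it in place; a wrong fstab leaves you
back at the mountroot prompt.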

-greg


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: root mount error

2010-12-28 Thread Greg Byshenk
On Tue, Dec 28, 2010 at 11:08:44PM +0300, Michael BlackHeart wrote:
 I'm not looking for help or for instructions on how to build a kernel. I
 just installed 8.1 RELEASE and updated it via svn to last week's 8-STABLE.
 Going step by step through the handbook to install the kernel, I have a
 problem: it seems that the new kernel doesn't recognize my HDD. I'm not
 doing anything special; if I were, I would certainly have mentioned it.
 I'm just building a GENERIC kernel without any configuration of the system
 after installation: no tweaks, no tuning, nothing. It's a new GENERIC
 kernel and it can't find my HDD, but the 8.1 i386/amd64 releases work well
 and, as I remember, so did stable about a month ago.
 Now, a likely cause of your problem is the installation of a custom
 kernel with removed support for whatever your hard disk drive or raid
 controller is recognized as.
 When it works it's just an ad0 HDD, no RAID or special driver.
 I'm just trying to say that recent changes in the kernel or kernel modules
 broke my HDD support, and I'd like to alert the developers so they can
 check where the problem is.
 And of course I've tried switching SATA native mode, and it doesn't
 change anything.
 The loader, at its own stage, easily detects the HDD and root partition, so
 I can just select the old kernel and boot. I'm not sure how it gains
 access to the HDD, so I can't draw any conclusion; probably through BIOS
 interrupts, but that's beside the point.
 Unfortunately I don't know how to dump dmesg without a serial
 connection or a usable disk drive; maybe to a flash drive, but I
 don't know how. And anyway there's no real kernel panic, it just asks
 for the root mountpoint.
 
 And to be precise, I've got a 2.5-inch Hitachi HTS542516K9A300 160GB SATA HDD.
 
 If you need any additional info I'll give it all, just ask.

If you change to SATA native mode, then your HD may show up as a
different device (mine moved to ad8). If you switch to native mode and
enter '?' at the prompt when it fails to mount root, does it show any HD
devices?

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: ZFS and Storage Systems

2010-10-12 Thread Greg Byshenk
On Tue, Oct 12, 2010 at 10:33:49AM +0100, Michal wrote:
 
 Apologies for the basic question but I just want to make sure. I have 
 been looking at storage systems like this one 
 http://www.icc-usa.com/storage-35-2u.asp. I am guessing it would be a 
 case of sorting the discs out, probably on some GUI or command line for 
 the box itself, then using a FreeBSD box I can set up ZFS over the 
 drives...or is it not that simple?

You say "using a FreeBSD box I can set up ZFS over the drives...", which
doesn't make sense, if I understand the ICC system. The device appears to
be a NAS system, with an OS, not an external disk bay.

What you would want, I think, is either a) an external FCAL or iSCSI box
that you could connect to another machine (running FreeBSD or some other
OS); or b) a 'storage' server upon which to install FreeBSD and use as
a NAS system.

It may be that the ICC system can export the drives, but it seems like an
unnecessary complication.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Serial console problems with stable/8

2010-09-12 Thread Greg Byshenk
On Sun, Sep 12, 2010 at 05:26:12PM +0200, Oliver Fromme wrote:
 
 On Friday I have updated a machine from 7.1 to stable/8.
 It is connected to a serial console.  With 7.1 everything
 worked fine, but with stable/8 things seem to break.

[...]
 
 Here's my setup (which worked perfectly fine with 7.1):
 
 /boot.config:
 -P
 
 /boot/loader.conf:
 kernel_options="-P"
 console="comconsole"
 
 /etc/ttys:
 ttyu0   "/usr/libexec/getty std.9600"   vt100   off secure

Shouldn't the 'off' in this ttyu0 entry be 'on'...?
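
To make the point concrete: the status field of an /etc/ttys entry decides
whether a getty is started on that line. The following is only a sketch of
how one might check that field with Python; the helper name is made up,
and it assumes the usual layout (name, quoted getty command, terminal
type, then status flags).

```python
import shlex


def tty_enabled(ttys_line):
    """Return True if the ttys(5) entry carries the 'on' status flag.

    Assumed layout: name, getty command (quoted), terminal type, flags.
    """
    fields = shlex.split(ttys_line, comments=True)
    if len(fields) < 4:
        raise ValueError("not a complete ttys entry: %r" % ttys_line)
    return "on" in fields[3:]


if __name__ == "__main__":
    entry = 'ttyu0 "/usr/libexec/getty std.9600" vt100 off secure'
    print("getty enabled:", tty_enabled(entry))
```

With the entry as posted, the check reports the line as disabled, so no
login prompt would be offered on ttyu0 even if console output itself works.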



-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL
 
 /boot/device.hints:
 hint.uart.0.at="isa"
 hint.uart.0.port="0x3F8"
 hint.uart.0.flags="0x10"
 hint.uart.0.irq="4"
 
 /var/run/dmesg.boot:
 uart0: 16550 or compatible port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0
 uart0: [FILTER]
 uart0: console (9600,n,8,1)
 
 The serial port is connected to another PC that runs tip(1)
 in a screen(1) session, using a 9-pin nullmodem cable.
 That setup hasn't changed in ages; that other PC is running
 an older version of FreeBSD.
 
 I need this issue to be resolved, because the serial console
 is required for remote management (the machine is a 3-hours
 ride away from home).  If it can't be resolved, I will have
 to downgrade it to 7.x.
 
 Best regards
Oliver
 
 -- 
 Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
 Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
 secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
 chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart
 
 FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd
 
 I suggested holding a Python Object Oriented Programming Seminar,
 but the acronym was unpopular.
 -- Joseph Strout


Re: igb related(?) panics on 7.3-STABLE

2010-09-02 Thread Greg Byshenk
On Mon, Aug 30, 2010 at 05:22:47AM -0700, Jeremy Chadwick wrote:
 On Mon, Aug 30, 2010 at 04:08:45AM -0700, Jeremy Chadwick wrote:
  Bcc: 
  Subject: Re: igb related(?) panics on 7.3-STABLE
  Reply-To: 
  In-Reply-To: 20100830094631.gd12...@core.byshenk.net
  {snip}

 My apologies -- somehow my mail client completely broke the Subject line
 and pulled it from another thread.  I'm not quite sure how mutt managed
 to do that, but probably an extraneous newline when editing mail
 headers, e.g. PEBKAC.

As an informational followup on this issue, I've updated the problem
machine to 8-STABLE (FreeBSD 8.1-STABLE #7: Mon Aug 23 13:01:15 CEST 2010)
and the problem seems to have gone away.

I had a journal overflow this morning, but that is a different problem,
and I think that it should be fixable via tuning a bit.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: igb related(?) panics on 7.3-STABLE

2010-08-30 Thread Greg Byshenk
On Sun, Aug 29, 2010 at 08:16:59PM +0200, Greg Byshenk wrote:

 I've begun seeing problems on a machine running FreeBSD-7.3-STABLE, 64-bit,
 with two igb nics in use.  Previously the machine was fine, running earlier
 versions of 7-STABLE, although the load on the network has increased due
 to additional machines being added to the network (the machine functions
 as a fileserver, serving files to compute machines via NFS(v3)).
 
 Any advice is much appreciated. System info is below.


Followup with more information. The machine just panic'ed again, with 
a lot of load on the network.

Output from the 'systat' that was running at the time:

   3 usersLoad 54.47 42.35 24.25  Aug 30 11:17

   Mem:KBREALVIRTUAL   VN PAGER   SWAP PAGER
   Tot   Share  TotShareFree   in   out in   out
   Act   462325504   86814010548  943324  count
   All  4564847852 1074772k27740  pages
   Proc:Interrupts
 r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltcow   54220 total
 1 170  392k8  278  22k  1951zfod sio0 irq4
 ozfod   fdc0 irq6
   70.4%Sys   3.1%Intr  0.0%User  0.0%Nice 26.5%Idle%ozfod27 twa0 uhci0
   |||||||||||   daefr  2001 cpu0: time
   ===++ prcfr   igb0 256
 9938 dtbuf 1247 totfr   igb0 257
   Namei Name-cache   Dir-cache10 desvn  react   igb0 258
  Callshits   %hits   % 34443 numvn1 pdwak   igb0 259
24996 frevn   112852 pdpgs   igb0 262
 intrn   igb0 263
   Disks   da0   da1 pass0 pass1 2570672 wire igb0 264
   KB/t   0.00 12.23  0.00  0.00   46760 act igb0 265
   tps   026 0 014706896 inact 19449 igb1 266
   MB/s   0.00  0.31  0.00  0.000 769796  26585
 021 0 0  173528


-greg
 
 
 
 Machine:
 ===
 
 FreeBSD server.example.com 7.3-STABLE FreeBSD 7.3-STABLE #36: Wed Aug 25 
 11:01:07 CEST 2010 r...@server.example.com:/usr/obj/usr/src/sys/KERNEL 
 amd64
 
 Kernel was csup'd earlier in the day on 25 August, immediately prior to 
 the build.
 
 
 Panic:
 ==
 
 Fatal trap 9: general protection fault while in kernel mode
 cpuid = 2; apic id = 02
 instruction pointer = 0x8:0x8052f40c
 stack pointer   = 0x10:0xff82056819d0
 frame pointer   = 0x10:0xff82056819f0
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 65 (igb1 que)
 trap number = 9
 panic: general protection fault
 cpuid = 2
 KDB: stack backtrace:
 db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
 panic() at panic+0x182
 trap_fatal() at trap_fatal+0x294
 trap() at trap+0x106
 calltrap() at calltrap+0x8
 --- trap 0x9, rip = 0x8052f40c, rsp = 0xff82056819d0, rbp = 0xff82056819f0 ---
 m_tag_delete_chain() at m_tag_delete_chain+0x1c
 uma_zfree_arg() at uma_zfree_arg+0x41
 m_freem() at m_freem+0x54
 ether_demux() at ether_demux+0x85
 ether_input() at ether_input+0x1bb
 igb_rxeof() at igb_rxeof+0x29d
 igb_handle_que() at igb_handle_que+0x9a
 taskqueue_run() at taskqueue_run+0xac
 taskqueue_thread_loop() at taskqueue_thread_loop+0x46
 fork_exit() at fork_exit+0x122
 fork_trampoline() at fork_trampoline+0xe
 --- trap 0, rip = 0, rsp = 0xff8205681d30, rbp = 0 ---
 Uptime: 11h57m6s
 Physical memory: 18411 MB
 Dumping 3770 MB:
 
 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x80
 fault code  = supervisor write data, page not present
 instruction pointer = 0x8:0x80188b5f
 stack pointer   = 0x10:0xff82056811f0
 frame pointer   = 0x10:0xff82056812f0
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 65 (igb1 que)
 trap number = 12
 
 
 pciconf:
 ===
 
 igb0@pci0:10:0:0:   class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
 vendor = 'Intel Corporation'
 class  = network
 subclass   = ethernet
 igb1@pci0:10:0:1:   class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
 vendor = 'Intel Corporation'
 class

Re: Crashes on X7SPE-HF with em

2010-08-30 Thread Greg Byshenk
On Mon, Aug 30, 2010 at 04:08:45AM -0700, Jeremy Chadwick wrote:
 Bcc: 
 Subject: Re: igb related(?) panics on 7.3-STABLE
 Reply-To: 
 In-Reply-To: 20100830094631.gd12...@core.byshenk.net
 
 On Mon, Aug 30, 2010 at 11:46:31AM +0200, Greg Byshenk wrote:
  On Sun, Aug 29, 2010 at 08:16:59PM +0200, Greg Byshenk wrote:
  
   I've begun seeing problems on a machine running FreeBSD-7.3-STABLE, 
   64-bit,
   with two igb nics in use.  Previously the machine was fine, running 
   earlier
   versions of 7-STABLE, although the load on the network has increased due
   to additional machines being added to the network (the machine functions
   as a fileserver, serving files to compute machines via NFS(v3)).
   
   Any advice is much appreciated. System info is below.
  
  
  Followup with more information. The machine just panic'ed again, with 
  a lot of load on the network.
  
  Output from the 'systat' that was running at the time:
  
     3 usersLoad 54.47 42.35 24.25  Aug 30 11:17
  
     Mem:KBREALVIRTUAL   VN PAGER   SWAP PAGER
     Tot   Share  TotShareFree   in   out in   out
     Act   462325504   86814010548  943324  count
     All  4564847852 1074772k27740  pages
     Proc:Interrupts
   r   p   d   s   w   Csw  Trp  Sys  Int  Sof  Fltcow   54220 total
   1 170  392k8  278  22k  1951zfod sio0 irq4
   ozfod   fdc0 irq6
     70.4%Sys   3.1%Intr  0.0%User  0.0%Nice 26.5%Idle%ozfod27 twa0 uhci0
     |||||||||||   daefr  2001 cpu0: time
     ===++ prcfr   igb0 256
   9938 dtbuf 1247 totfr   igb0 257
     Namei Name-cache   Dir-cache10 desvn  react   igb0 258
    Callshits   %hits   % 34443 numvn1 pdwak   igb0 259
  24996 frevn   112852 pdpgs   igb0 262
   intrn   igb0 263
     Disks   da0   da1 pass0 pass1 2570672 wire igb0 264
     KB/t   0.00 12.23  0.00  0.00   46760 act igb0 265
     tps   026 0 014706896 inact 19449 igb1 266
     MB/s   0.00  0.31  0.00  0.000 769796  26585
   021 0 0  173528
  
  
  -greg
   
   
   
   Machine:
   ===
   
   FreeBSD server.example.com 7.3-STABLE FreeBSD 7.3-STABLE #36: Wed Aug 25 
   11:01:07 CEST 2010 
   r...@server.example.com:/usr/obj/usr/src/sys/KERNEL amd64
   
   Kernel was csup'd earlier in the day on 25 August, immediately prior to 
   the build.
   
   
   Panic:
   ==
   
   Fatal trap 9: general protection fault while in kernel mode
   cpuid = 2; apic id = 02
   instruction pointer = 0x8:0x8052f40c
   stack pointer   = 0x10:0xff82056819d0
   frame pointer   = 0x10:0xff82056819f0
   code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long 1, def32 0, gran 1
   processor eflags= interrupt enabled, resume, IOPL = 0
   current process = 65 (igb1 que)
   trap number = 9
   panic: general protection fault
   cpuid = 2
   KDB: stack backtrace:
   db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
   panic() at panic+0x182
   trap_fatal() at trap_fatal+0x294
   trap() at trap+0x106
   calltrap() at calltrap+0x8
   --- trap 0x9, rip = 0x8052f40c, rsp = 0xff82056819d0, rbp = 0xff82056819f0 ---
   m_tag_delete_chain() at m_tag_delete_chain+0x1c
   uma_zfree_arg() at uma_zfree_arg+0x41
   m_freem() at m_freem+0x54
   ether_demux() at ether_demux+0x85
   ether_input() at ether_input+0x1bb
   igb_rxeof() at igb_rxeof+0x29d
   igb_handle_que() at igb_handle_que+0x9a
   taskqueue_run() at taskqueue_run+0xac
   taskqueue_thread_loop() at taskqueue_thread_loop+0x46
   fork_exit() at fork_exit+0x122
   fork_trampoline() at fork_trampoline+0xe
   --- trap 0, rip = 0, rsp = 0xff8205681d30, rbp = 0 ---
   Uptime: 11h57m6s
   Physical memory: 18411 MB
   Dumping 3770 MB:
   
   Fatal trap 12: page fault while in kernel mode
   cpuid = 0; apic id = 00
   fault virtual address   = 0x80
   fault code  = supervisor write data, page not present
   instruction pointer = 0x8:0x80188b5f
   stack pointer   = 0x10:0xff82056811f0
   frame pointer   = 0x10:0xff82056812f0
   code segment= base 0x0, limit 0xf, type 0x1b
   = DPL 0, pres 1, long

igb related(?) panics on 7.3-STABLE

2010-08-29 Thread Greg Byshenk
I've begun seeing problems on a machine running FreeBSD-7.3-STABLE, 64-bit,
with two igb nics in use.  Previously the machine was fine, running earlier
versions of 7-STABLE, although the load on the network has increased due
to additional machines being added to the network (the machine functions
as a fileserver, serving files to compute machines via NFS(v3)).

Any advice is much appreciated. System info is below.
-greg



Machine:
===

FreeBSD server.example.com 7.3-STABLE FreeBSD 7.3-STABLE #36: Wed Aug 25 
11:01:07 CEST 2010 r...@server.example.com:/usr/obj/usr/src/sys/KERNEL amd64

Kernel was csup'd earlier in the day on 25 August, immediately prior to 
the build.


Panic:
==

Fatal trap 9: general protection fault while in kernel mode
cpuid = 2; apic id = 02
instruction pointer = 0x8:0x8052f40c
stack pointer   = 0x10:0xff82056819d0
frame pointer   = 0x10:0xff82056819f0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65 (igb1 que)
trap number = 9
panic: general protection fault
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
trap_fatal() at trap_fatal+0x294
trap() at trap+0x106
calltrap() at calltrap+0x8
--- trap 0x9, rip = 0x8052f40c, rsp = 0xff82056819d0, rbp = 0xff82056819f0 ---
m_tag_delete_chain() at m_tag_delete_chain+0x1c
uma_zfree_arg() at uma_zfree_arg+0x41
m_freem() at m_freem+0x54
ether_demux() at ether_demux+0x85
ether_input() at ether_input+0x1bb
igb_rxeof() at igb_rxeof+0x29d
igb_handle_que() at igb_handle_que+0x9a
taskqueue_run() at taskqueue_run+0xac
taskqueue_thread_loop() at taskqueue_thread_loop+0x46
fork_exit() at fork_exit+0x122
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xff8205681d30, rbp = 0 ---
Uptime: 11h57m6s
Physical memory: 18411 MB
Dumping 3770 MB:

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x80
fault code  = supervisor write data, page not present
instruction pointer = 0x8:0x80188b5f
stack pointer   = 0x10:0xff82056811f0
frame pointer   = 0x10:0xff82056812f0
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 65 (igb1 que)
trap number = 12


pciconf:
===

igb0@pci0:10:0:0:   class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
class  = network
subclass   = ethernet
igb1@pci0:10:0:1:   class=0x02 card=0x10c915d9 chip=0x10c98086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
class  = network
subclass   = ethernet


dmesg:
=

igb0: Intel(R) PRO/1000 Network Connection version - 1.9.5 port 0xe880-0xe89f 
mem 0xfbe6-0xfbe
7,0xfbe4-0xfbe5,0xfbeb8000-0xfbebbfff irq 16 at device 0.0 on pci10
igb0: Using MSIX interrupts with 10 vectors
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: [ITHREAD]
igb0: Ethernet address: 00:30:48:ca:cd:72
igb1: Intel(R) PRO/1000 Network Connection version - 1.9.5 port 0xec00-0xec1f 
mem 0xfbee-0xfbe
f,0xfbec-0xfbed,0xfbebc000-0xfbeb irq 17 at device 0.1 on pci10
igb1: Using MSIX interrupts with 10 vectors
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: [ITHREAD]
igb1: Ethernet address: 00:30:48:ca:cd:73


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: I broke my SSH to jails after 7.2-8.0 src upgrade

2010-03-14 Thread Greg Byshenk
On Sat, Mar 13, 2010 at 09:33:50PM -0800, Doug Barton wrote:
 On 03/12/10 02:13, Greg Byshenk wrote:

  I would put in a word for 'mergemaster -F' (or maybe '-iF') in such
  cases.
 
 At this point the -U option is generally a safer bet. The only time this
 won't work for you is when upgrading from an older -RELEASE where you've
 never run mergemaster previously, in which case it will bark loudly that
 there is no mtree database. You could then run 'mergemaster -Fi' as you
 suggested, and run 'mergemaster -U' immediately thereafter and you
 should get as much automation as is possible.

I don't actually want as much 'automation' as is possible.  Generally
I want to know what is being modified, even if it is in a file that I
haven't changed.  I like '-F' because it allows me to ignore the huge
number of files that aren't actually changed -- except the RCS line --
that sometimes arise when moving between versions.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: I broke my SSH to jails after 7.2-8.0 src upgrade

2010-03-12 Thread Greg Byshenk
On Thu, Mar 11, 2010 at 08:33:29PM -0800, Garrett Cooper wrote:
 
 I've done a few RELENG_8_0 to STABLE-8 to 9-CURRENT upgrades lately
 and mergemaster was goofing up the contents a bit based on the RCS
 versions. I had to hand-edit a crapload of stuff going from 8 to 9,
 and I still don't trust mergemaster's automatic merging logic because
 it goofs up on /etc/group // /etc/passwd still (doesn't merge
 anything, discards my info, etc) for starters.
 
 -a doesn't actually do any merging though, FWIW:

[...]

I would put in a word for 'mergemaster -F' (or maybe '-iF') in such
cases.

It doesn't try to automate much, but it allows one to concentrate on
actual differences by automating the handling of those files where
only the VCS Id is different.
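
To show what "only the VCS Id is different" means in practice, here is a
small Python sketch of that comparison. It is not mergemaster's own code
(mergemaster is a shell script); the function names are invented for the
example, and it assumes the Id is a $FreeBSD ...$ keyword on a single line.

```python
import re

# Matches an expanded or unexpanded $FreeBSD$ keyword on one line.
VCS_ID_RE = re.compile(r"\$FreeBSD[^$\n]*\$")


def strip_vcs_id(text):
    """Collapse every $FreeBSD: ... $ keyword to its unexpanded form."""
    return VCS_ID_RE.sub("$FreeBSD$", text)


def differs_only_in_vcs_id(installed, new):
    """True if the two texts differ, but only inside the VCS Id keyword."""
    return installed != new and strip_vcs_id(installed) == strip_vcs_id(new)


if __name__ == "__main__":
    installed = "# $FreeBSD: src/etc/nsswitch.conf,v 1.1 2006/05/03 15:14:47 ume Exp $\ngroup: compat\n"
    new = "# $FreeBSD: src/etc/nsswitch.conf,v 1.1.10.1 2009/08/03 08:13:06 kensmith Exp $\ngroup: compat\n"
    print(differs_only_in_vcs_id(installed, new))
```

Files for which this test holds are the ones that need no human attention;
everything else still gets looked at by hand.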

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: Supplementary groups on LDAP cannot work with RELENG_8 +nss_ldap

2010-03-09 Thread Greg Byshenk
On Tue, Mar 09, 2010 at 09:00:49AM +0800, Linghua Tseng wrote:
 
 Here is the output of `diff -u /usr/src/etc/nsswitch.conf /etc/nsswitch.conf'.
 --- /usr/src/etc/nsswitch.conf  2010-03-08 09:04:25.0 +0800
 +++ /etc/nsswitch.conf  2010-03-08 18:01:08.0 +0800
 @@ -1,13 +1,13 @@
 #
 # nsswitch.conf(5) - name service switch configuration file
 -# $FreeBSD: src/etc/nsswitch.conf,v 1.1.10.1 2009/08/03 08:13:06 kensmith Exp $
 +# $FreeBSD: src/etc/nsswitch.conf,v 1.1 2006/05/03 15:14:47 ume Exp $
 #
 group: compat
 -group_compat: nis
 +group_compat: ldap nis
 hosts: files dns
 networks: files
 passwd: compat
 -passwd_compat: nis
 +passwd_compat: ldap nis
 shells: files
 services: compat
 services_compat: nis
 
 The line `+:*' has already been put into /etc/master.passwd,
 and the line `+:*::' has already been put into /etc/group.

I may be completely wrong (I can't seem to find the source), and I
don't know if it is the source of your problem, but I recall it being
reported that 'passwd_compat' and 'group_compat' require a *single*
source entry. 
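
If that recollection is right (and I stress that I have not confirmed it),
a quick way to spot the suspect lines is to list every *_compat database
that names more than one source. A minimal Python sketch, with made-up
function names and the values from your diff as sample input:

```python
SAMPLE_NSSWITCH = """group: compat
group_compat: ldap nis
hosts: files dns
passwd: compat
passwd_compat: ldap nis
services: compat
services_compat: nis
"""


def compat_sources(nsswitch_text):
    """Map each *_compat database to the list of sources on its line."""
    result = {}
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        database, sources = line.split(":", 1)
        database = database.strip()
        if database.endswith("_compat"):
            result[database] = sources.split()
    return result


def multi_source_compat(nsswitch_text):
    """Return only the *_compat databases that list more than one source."""
    return {
        db: srcs
        for db, srcs in compat_sources(nsswitch_text).items()
        if len(srcs) > 1
    }


if __name__ == "__main__":
    for db, srcs in multi_source_compat(SAMPLE_NSSWITCH).items():
        print(db, "lists", len(srcs), "sources:", " ".join(srcs))
```

For the posted file this flags group_compat and passwd_compat, which are
the two lines I would try reducing to a single source as a test.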


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: device.hints isn't setting what I want

2010-01-22 Thread Greg Byshenk
On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote:
 
 First, see also my post: do I want ch0 or pass1?
 
 I have an external tape library and an external tape drive.  They are
 not always powered up.  My goal: always get the same devices regardless
 of whether or not the tape library is powered on at boot.
 
 After booting, with the tape library powered on, I have these devices:
 
 # camcontrol devlist
 QUANTUM DLT7000 1E48 at scbus0 target 5 lun 0 (sa0,pass0)
 DEC TL800(C) DEC 0326at scbus1 target 0 lun 0 (ch0,pass1)
 DEC TZ89 (C) DEC 1837at scbus1 target 5 lun 0 (sa1,pass2)
 HL-DT-ST DVDRAM GSA-H10A JL02at scbus2 target 0 lun 0 (cd0,pass3)
 USB 2.0 Storage Device 0100  at scbus5 target 0 lun 0 (da0,pass4)
 
 In /boot/device.hints, I have added these entries:
 
 hint.scbus.1.at="ahc0"
 hint.scbus.0.at="ahc1"
 hint.scbus.2.at="acd0"
 hint.scbus.5.at="umass0"

I think that this is wrong.

I had a similar issue (multiple tape drives and changer devices that 
needed to stay at the same ids).

Your device.hints entries should look something like this:

   hint.sa.0.at="scbus0"
   hint.sa.0.target="5"
   hint.sa.0.unit="0"
   hint.sa.1.at="scbus0"
   hint.sa.1.target="3"
   hint.sa.1.unit="0"
   hint.sa.2.at="scbus0"
   hint.sa.2.target="1"
   hint.sa.2.unit="0"
   hint.ch.0.at="scbus0"
   hint.ch.0.target="4"
   hint.ch.0.unit="0"
   hint.ch.1.at="scbus0"
   hint.ch.1.target="2"
   hint.ch.1.unit="0"
   hint.ch.2.at="scbus0"
   hint.ch.2.target="0"
   hint.ch.2.unit="0"

Which I use to get this:

   # camcontrol devlist
   SONY LIB-162 0208at scbus0 target 0 lun 0 (pass0,ch2)
   SONY SDX-1100 0102   at scbus0 target 1 lun 0 (sa2,pass1)
   SONY LIB-162 0203at scbus0 target 2 lun 0 (pass2,ch1)
   SONY SDX-900V 0102   at scbus0 target 3 lun 0 (sa1,pass3)
   # 

(Currently the first changer is not powered up.)


So I think that what you want is something like:

   hint.sa.0.at="scbus0"
   hint.sa.0.target="5"
   hint.sa.0.unit="0"
   hint.sa.1.at="scbus1"
   hint.sa.1.target="5"
   hint.sa.1.unit="0"
   hint.ch.0.at="scbus1"
   hint.ch.0.target="0"
   hint.ch.0.unit="0"
   [...]
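
If it helps, the mapping from 'camcontrol devlist' output to hint lines is
mechanical enough to script. The Python below is only a sketch: the
function name is made up, it handles just the 'sa' and 'ch' drivers, and
it copies the LUN into the 'unit' hint because that matches the entries
above, where both are 0 (treat that as an assumption, not a rule).

```python
import re

# "... at scbus0 target 5 lun 0 (sa0,pass0)"
DEVLIST_RE = re.compile(
    r"at\s+scbus(\d+)\s+target\s+(\d+)\s+lun\s+(\d+)\s+\(([^)]*)\)"
)
PERIPH_RE = re.compile(r"([a-z]+)(\d+)$")

SAMPLE_DEVLIST = """\
<QUANTUM DLT7000 1E48>             at scbus0 target 5 lun 0 (sa0,pass0)
<DEC TL800    (C) DEC 0326>        at scbus1 target 0 lun 0 (ch0,pass1)
<DEC TZ89     (C) DEC 1837>        at scbus1 target 5 lun 0 (sa1,pass2)
<HL-DT-ST DVDRAM GSA-H10A JL02>    at scbus2 target 0 lun 0 (cd0,pass3)
"""


def hints_from_devlist(devlist_text, drivers=("sa", "ch")):
    """Build device.hints lines that pin each listed sa/ch unit in place."""
    hints = []
    for line in devlist_text.splitlines():
        match = DEVLIST_RE.search(line)
        if match is None:
            continue
        bus, target, lun, periphs = match.groups()
        for periph in periphs.split(","):
            name = PERIPH_RE.match(periph.strip())
            if name is None or name.group(1) not in drivers:
                continue  # skip pass, cd, da, ...
            prefix = "hint.%s.%s" % (name.group(1), name.group(2))
            hints.append('%s.at="scbus%s"' % (prefix, bus))
            hints.append('%s.target="%s"' % (prefix, target))
            # Assumption: 'unit' mirrors the LUN, as in the entries above.
            hints.append('%s.unit="%s"' % (prefix, lun))
    return hints


if __name__ == "__main__":
    print("\n".join(hints_from_devlist(SAMPLE_DEVLIST)))
```

Run it against the devlist taken while everything is powered on, so that
every device you want wired down actually appears in the input.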


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: device.hints isn't setting what I want

2010-01-22 Thread Greg Byshenk
On Fri, Jan 22, 2010 at 10:01:02AM +0100, Greg Byshenk wrote:
 On Thu, Jan 21, 2010 at 08:23:23PM -0500, Dan Langille wrote:
  
  First, see also my post: do I want ch0 or pass1?
  
  I have an external tape library and an external tape drive.  They are
  not always powered up.  My goal: always get the same devices regardless
  of whether or not the tape library is powered on at boot.
  
  After booting, with the tape library powered on, I have these devices:
  
  # camcontrol devlist
  QUANTUM DLT7000 1E48 at scbus0 target 5 lun 0 (sa0,pass0)
  DEC TL800(C) DEC 0326at scbus1 target 0 lun 0 (ch0,pass1)
  DEC TZ89 (C) DEC 1837at scbus1 target 5 lun 0 (sa1,pass2)
  HL-DT-ST DVDRAM GSA-H10A JL02at scbus2 target 0 lun 0 (cd0,pass3)
  USB 2.0 Storage Device 0100  at scbus5 target 0 lun 0 (da0,pass4)
  
  In /boot/device.hints, I have added these entries:
  
  hint.scbus.1.at="ahc0"
  hint.scbus.0.at="ahc1"
  hint.scbus.2.at="acd0"
  hint.scbus.5.at="umass0"
 
 I think that this is wrong.
 
 I had a similar issue (multiple tape drives and changer devices that 
 needed to stay at the same ids).
 
 Your device.hints entries should look something like this:
 
hint.sa.0.at="scbus0"
hint.sa.0.target="5"
hint.sa.0.unit="0"
hint.sa.1.at="scbus0"
hint.sa.1.target="3"
hint.sa.1.unit="0"
hint.sa.2.at="scbus0"
hint.sa.2.target="1"
hint.sa.2.unit="0"
hint.ch.0.at="scbus0"
hint.ch.0.target="4"
hint.ch.0.unit="0"
hint.ch.1.at="scbus0"
hint.ch.1.target="2"
hint.ch.1.unit="0"
hint.ch.2.at="scbus0"
hint.ch.2.target="0"
hint.ch.2.unit="0"
 
 Which I use to get this:
 
# camcontrol devlist
SONY LIB-162 0208at scbus0 target 0 lun 0 (pass0,ch2)
SONY SDX-1100 0102   at scbus0 target 1 lun 0 (sa2,pass1)
SONY LIB-162 0203at scbus0 target 2 lun 0 (pass2,ch1)
SONY SDX-900V 0102   at scbus0 target 3 lun 0 (sa1,pass3)
# 
 
 (Currently the first changer is not powered up.)
 
 
 So I think that what you want is something like:
 
hint.sa.0.at="scbus0"
hint.sa.0.target="5"
hint.sa.0.unit="0"
hint.sa.1.at="scbus1"
hint.sa.1.target="5"
hint.sa.1.unit="0"
hint.ch.0.at="scbus1"
hint.ch.0.target="0"
hint.ch.0.unit="0"
[...]


Just saw your second message.

I don't know if you can wire down 'pass?' the same way, but if you can,
I would assume that you need to set it the same way as the 'sa?' and 
other devices.

That is, if you want:

  QUANTUM DLT7000 1E48 at scbus0 target 5 lun 0 (sa0,pass0)

Then the device.hints entry would look like:

   hint.pass.0.at="scbus0"
   hint.pass.0.target="5"
   hint.pass.0.unit="0"

(If you can do that.)

-greg

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: em0 watchdog timeouts

2009-10-05 Thread Greg Byshenk
On Mon, Oct 05, 2009 at 08:32:14PM +0200, Daniel Bond wrote:
 
 What I need is useful advice/help. I never stated I needed a driver  
 developer.
 
 I'd like to be able to run my favorite OS on cool hardware, in the  
 future, for a high-performing NFS-server, without problems like I've  
 experienced the past 6months, on a production system.
 Please note that I'm managing a server-park almost completely based on  
 FreeBSD, and I'm running many NFS servers on other hardware, for other  
 services, without issues.
 
 I've seen several other FreeBSD-users having problems with this too,  
 so I think it's of importance for the project. As I mentioned  
 originally, I'm happy to dispose the hardware to any FreeBSD developer
 that might want to look further into this. Debugging it further is  
 above my skill-set, I don't even know where to begin looking,  
 especially since I can't produce any panics.

I can give one bit of advice that helped me in a similar situation:
check your motherboards.

I run about a dozen fileservers on FreeBSD, and have always been very
happy with their performance, but some months ago I began to experience
problems with one of them.  These problems were 'watchdog timeout'
errors.  Tried all manner of things, different NICs of different types,
changing settings, etc., but nothing helped over the long term.  At 
some point, when very heavy i/o was going on to our Beowulf cluster, the
'watchdog timeouts' would begin.  What was strange is that other 
(supposedly identical) machines handled _more_ i/o without a problem.

Finally, while doing some comparisons, I realized that the motherboard
having the problem was _not_ the same as the others; it was similar, but
not identical.  I changed the motherboard and all the problems went away,
never to reappear.

I don't know if it was a specific problem with that particular
motherboard, or something about that model, but for whatever reason, it
appears that the buses just couldn't handle a RAID card and three active
NICs.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: issues with Intel Pro/1000 and 1000baseTX

2009-05-16 Thread Greg Byshenk
On Fri, May 15, 2009 at 06:01:33PM -0300, Nenhum_de_Nos wrote:

 I know this is a bit off, but as I never had CAT6 stuff to deal with here
 it goes. is there any problems in using CAT6 cabling and not 1000baseTX
 capable switch ?
 
 I plan to install cat6 cables and just use 1000baseTX in future. this will
 be my new home network and all I have now is 100baseTX and two 1000baseT
 cards.
 
There should be no problem at all.  CAT6 must meet higher standards, but
the basic cable design is the same as CAT5, and it works for 100baseTX,
and even for 10baseT (if you really wanted to use it).

When my company relocated to a new building, the entire network was 
cabled with CAT6, but we still have some machines and switches that are
100baseTX, and they work fine.

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: em? watchdog timeout 7-stable

2009-05-15 Thread Greg Byshenk
 on this machine:

# pciconf -lvb
e...@pci0:7:1:0: class=0x02 card=0x10028086 chip=0x10118086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = '82545EM Gigabit Ethernet Controller (Fiber)'
class  = network
subclass   = ethernet
bar   [10] = type Memory, range 64, base 0xda30, size 131072, enabled
bar   [20] = type I/O Port, range 32, base 0x5000, size 64, enabled


# vmstat -i
interrupt                          total       rate
irq4: sio0                          1479          0
irq6: fdc0                            10          0
irq14: ata0                           58          0
irq16: skc0 em0                   758850         85
irq18: twa0                      2085338        234
irq24: em1                             1          0
cpu0: timer                     17806226       1999
cpu3: timer                     17798161       1998
cpu2: timer                     17798127       1998
cpu1: timer                     17798043       1998
cpu5: timer                     17798058       1998
cpu6: timer                     17798161       1998
cpu4: timer                     17798160       1998
cpu7: timer                     17798160       1998
Total                          145238832      16311


# ifconfig em1
em1: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> metric 0 mtu 1500
options=db<RXCSUM,TXCSUM,VLAN_MTU,VLAN_HWTAGGING,POLLING,VLAN_HWCSUM>
ether 00:07:e9:1a:ae:dc
inet 192.168.1.62 netmask 0xf800 broadcast 192.168.7.255
media: Ethernet autoselect (1000baseLX full-duplex)
status: active


Any ideas?



On Wed, May 13, 2009 at 06:44:38PM +0200, Greg Byshenk wrote:
 On Wed, May 13, 2009 at 06:42:07PM +0200, Greg Byshenk wrote:
 
  As a followup to my own previous message, I continue to have annoying 
  problems with em?: watchdog timeout on one of my machines (now running
  7.2-STABLE as of 2009-05-08).
  
  I have discontinued using the on-board (em, copper) NICs, and replaced
  the original fibre NIC with a newer model, but the problem persists.
  I've also set
  
 hw.pci.enable_msix=0
 hw.pci.enable_msi=0
 hw.em.rxd=1024
 hw.em.txd=1024
 net.inet.tcp.tso=0
  
  ...as suggested in some discussions of this problem, and set the em1
  interface to 'polling', all to no avail.  Frequently, though irregularly
  (once or twice a day), the console begins to display
  
 em1: watchdog timeout -- resetting
 em1: watchdog timeout -- resetting
 em1: watchdog timeout -- resetting
  
  the network is down, and the machine locks up.
  
  [Note: I am getting 'em1' now instead of 'em0' as previously, but this
  is due to changing all of the nics, which led to a different numbering;
  the timeout is still occurring on the (main) interface, the fibre 
  gigabit connection.]
  
  What is particularly perverse (IMO) is that, since changing the NIC to
  the newer model (and updating the kernel), I can no longer break to the
  debugger when the lockup occurs (there is no response to the break) --
  but I _can_ shut the machine down cleanly via hardware (a touch of the
  power switch sends 'shutdown', and the machine shuts down cleanly --
  after killing off processes waiting on network i/o).
  
  The machine is running nfs and samba (3.2.10, from ports), and pretty
  much nothing else.
  
  
  Anyone have any ideas about this...?  I'm going mad with this.

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: em0 watchdog timeout 7-stable

2009-05-13 Thread Greg Byshenk
As a followup to my own previous message, I continue to have annoying 
problems with em?: watchdog timeout on one of my machines (now running
7.2-STABLE as of 2009-05-08).

I have discontinued using the on-board (em, copper) NICs, and replaced
the original fibre NIC with a newer model, but the problem persists.
I've also set

   hw.pci.enable_msix=0
   hw.pci.enable_msi=0
   hw.em.rxd=1024
   hw.em.txd=1024
   net.inet.tcp.tso=0

...as suggested in some discussions of this problem, and set the em1
interface to 'polling', all to no avail.  Frequently, though irregularly
(once or twice a day), the console begins to display

   em1: watchdog timeout -- resetting
   em1: watchdog timeout -- resetting
   em1: watchdog timeout -- resetting

the network is down, and the machine locks up.

[Note: I am getting 'em1' now instead of 'em0' as previously, but this
is due to changing all of the nics, which led to a different numbering;
the timeout is still occurring on the (main) interface, the fibre 
gigabit connection.]
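
For reference, a sketch of where settings like those listed above would
normally live. The file placement is an assumption about a stock system,
not something stated in the message: the hw.* values are boot-time
tunables, so they go in /boot/loader.conf, while net.inet.tcp.tso is a
run-time sysctl and goes in /etc/sysctl.conf.

```
# /boot/loader.conf  (boot-time tunables; take effect on the next boot)
hw.pci.enable_msix="0"
hw.pci.enable_msi="0"
hw.em.rxd="1024"
hw.em.txd="1024"

# /etc/sysctl.conf  (run-time sysctl; applied at boot)
net.inet.tcp.tso=0
```

Polling is switched on per interface (ifconfig em1 polling) and only
works if the kernel was built with options DEVICE_POLLING.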

What is particularly perverse (IMO) is that, since changing the NIC to
the newer model (and updating the kernel), I can no longer break to the
debugger when the lockup occurs (there is no response to the break) --
but I _can_ shut the machine down cleanly via hardware (a touch of the
power switch sends 'shutdown', and the machine shuts down cleanly --
after killing off processes waiting on network i/o).

The machine is running nfs and samba (3.2.10, from ports), and pretty
much nothing else.


Anyone have any ideas about this...?  I'm going mad with this.

-greg byshenk



# pciconf -lvb
[...]
e...@pci0:7:1:0: class=0x02 card=0x10028086 chip=0x10118086 rev=0x01 hdr=0x00
vendor = 'Intel Corporation'
device = '82545EM Gigabit Ethernet Controller (Fiber)'
class  = network
subclass   = ethernet
bar   [10] = type Memory, range 64, base 0xda30, size 131072, enabled
bar   [20] = type I/O Port, range 32, base 0x5000, size 64, enabled
[...]

# vmstat -i
interrupt                          total       rate
irq4: sio0                          1666          0
irq6: fdc0                            10          0
irq14: ata0                           58          0
irq16: skc0 em0                  1437801         98
irq18: twa0                       846981         57
irq24: em1                       4378650        299
cpu0: timer                     29258004       1999
cpu1: timer                     29249758       1999
cpu3: timer                     29249816       1999
cpu7: timer                     29249779       1999
cpu2: timer                     29249729       1999
cpu4: timer                     29249852       1999
cpu6: timer                     29249851       1999
cpu5: timer                     29249814       1999
Total                          240671769      16450



On Sun, Apr 26, 2009 at 02:50:08PM +0200, Greg Byshenk wrote:
 I have one machine that is seeing watchdog timeouts on em0, running 7-STABLE
 amd64 as of 2009.04.19, and also some other more perverse errors.
 
 Twice now in the last 48 hours, this machine has become unreachable via the
 network, and connecting to the console shows an endless string of 
 
[...]
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting
em0: watchdog timeout -- resetting
 
 messages. The machine is almost locked up.  That is, I can get a login
 prompt, but can go no further than typing in a username; after the
 username, no password prompt, and nothing further.  The only option is
 to hard reset the machine or to drop to debugger and reboot.
 
 Now the perverse part.  After restarting, the system partition is no
 more.
 
 Background detail:  the machine is a fileserver, with a 3Ware 9650SE-16ML
 SATA controller, connected to 16 1TB SATA drives, this configured as
 a 14-drive RAID10 array (+ 2 hot spares), with a 50GB system partition
 and 6.5TB data partition.  The system partition is configured as da0,
 with one slice and more or less standard partitions for / /var /tmp, etc.
 (the data partition of the array is sliced with gpt).
 
 The issue here is that, upon restart, all partition information on da0
 seems to have disappeared, and restarting results in a "no operating
 system found" message, and a failure to boot (obviously).
 
 But all of the data is still present.  If I boot into rescue mode,
 recreate da0s1, mark it bootable, and restore the bsdlabel, then
 everything works again.  I can restart the machine, and it comes back
 up normally (it requires an fsck of everything on da0, but after that
 everything is back to normal).
 
 I don't know if this is two unrelated problems, or one problem with
 two symptoms, or something else.  I think that I can safely say that
 it is not a problem with the 3Ware controller itself, as I replaced
 the controller with a spare (identical model), and the problem
 recurred.  Additionally, I have an almost-identical configuration on
 four other machines

Re: em0 watchdog timeout 7-stable

2009-05-13 Thread Greg Byshenk
On Wed, May 13, 2009 at 06:42:07PM +0200, Greg Byshenk wrote:

 As a followup to my own previous message, I continue to have annoying 
 problems with em?: watchdog timeout on one of my machines (now running
 7.2-STABLE as of 2009-05-08).
 
 I have discontinued using the on-board (em, copper) NICs, and replaced
 the original fibre NIC with a newer model, but the problem persists.
 I've also set
 
hw.pci.enable_msix=0
hw.pci.enable_msi=0
hw.em.rxd=1024
hw.em.txd=1024
net.inet.tcp.tso=0
 
 ...as suggested in some discussions of this problem, and set the em1
 interface to 'polling', all to no avail.  Frequently, though irregularly
 (once or twice a day), the console begins to display
 
em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
em1: watchdog timeout -- resetting
 
 the network is down, and the machine locks up.
 
 [Note: I am getting 'em1' now instead of 'em0' as previously, but this
 is due to changing all of the nics, which led to a different numbering;
 the timeout is still occurring on the (main) interface, the fibre 
 gigabit connection.]
 
 What is particularly perverse (IMO) is that, since changing the NIC to
 the newer model (and updating the kernel), I can no longer break to the
 debugger when the lockup occurs (there is no response to the break) --
 but I _can_ shut the machine down cleanly via hardware (a touch of the
 power switch sends 'shutdown', and the machine shuts down cleanly --
 after killing off processes waiting on network i/o).
 
 The machine is running nfs and samba (3.2.10, from ports), and pretty
 much nothing else.
 
 
 Anyone have any ideas about this...?  I'm going mad with this.


Just as an FYI, the drive errors I described in my previous message
appear to have been due to a bad BBU on the RAID controller, and to
have been resolved.
 

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: [7.2] R/W mount of / denied. Filesystem not clean - run fsck.

2009-05-06 Thread Greg Byshenk
On Wed, May 06, 2009 at 11:50:11AM +0200, Helmut Schneider wrote:
 Marat N.Afanasyev ama...@ksu.ru wrote:
 Helmut Schneider wrote:

 after upgrading a few systems yesterday from 7.1-RELEASE to 7.2-RELEASE 
 on one machine I got the error above. The problem was that
 
 - I was unable to cope with it but booting from a live CD.
 - the message appeared ~ 1000 times and then the kernel paniced.
 
 After fsck'ing / with the help of the live CD I rebooted the machine but 
 now I got the same problem with /home.
 
 How can I avoid such issues (except of not letting the machine crash)? Is 
 there a way to boot at least to single user mode and then run fsck (I was 
 at home, far away from the machine, not funny)?

 There is no 'login' when / cannot be mounted...
 
 fsck it. if you have another machine in there, you can try to make a 
 serial console. or install a ip-kvm extender ;)
 
 I do have such thing (IBM Blade Center) but I'm looking for something to 
 avoid the situation above. Something that lets me at least boot into single 
 user mode.

If you had access to the console (I'm guessing you did in order to use the
live CD), did you try booting into single-user from the beastie menu?

IME, failure to fsck / should drop automatically to single-user
at the console, but if this fails, then you should be able to choose
single-user boot from the menu, which will then not try to run fsck or
mount / rw.  From there you should be able to fsck and remount /, as well
as /home or anything else.  This will fail if there is something horribly
wrong with /, causing a failure even when / is mounted ro, but then there
may be no good solution.
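
The single-user steps described above can be sketched roughly as follows.
This is only an illustration, assuming UFS filesystems and that /home is
the other dirty filesystem; the run and single_user_recover names are
made up for the example, and DRY_RUN defaults to printing the commands
rather than executing them.

```shell
#!/bin/sh
# Sketch only: recover from "R/W mount of / denied" at the single-user prompt.
# DRY_RUN defaults to 1, so the commands are printed, not executed.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Check and remount the root filesystem, then the remaining ones.
single_user_recover() {
    run fsck -y /            # repair / while it is still mounted read-only
    run mount -u -o rw /     # upgrade the existing mount of / to read-write
    run fsck -y /home        # repeat for any other dirty filesystem
    run mount -a -t ufs      # mount the remaining UFS filesystems from fstab
}
```

At a real single-user prompt one would call it as
DRY_RUN=0 single_user_recover, or simply type the four commands by hand.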


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: [7.2] R/W mount of / denied. Filesystem not clean - run fsck.

2009-05-06 Thread Greg Byshenk
On Wed, May 06, 2009 at 09:18:02PM +0200, Helmut Schneider wrote:
 Marat N.Afanasyev ama...@ksu.ru wrote:
 Helmut Schneider wrote:

 I do have such thing (IBM Blade Center) but I'm looking for something to 
 avoid the situation above. Something that lets me at least boot into 
 single user mode.

 if you have an ip-kvm you can drop into single-user and fsck any disk you 
 have. all you need to do is to choose 'single user' from beastie-menu. or 
 start kernel with -s parameter
 
 I *do* know how to enter single user mode but the kernel panic'ed *before* 
 the shell started. :)

The problem is that, if something is so far wrong that you can't even
get to the single-user shell, then there probably isn't anything else
but rescue.

One thing that might be an option:  at work, we use PXE for Linux and
FreeBSD installs, so one thing I've done is to create a pxeboot rescue
image (using the mfsroot from the rescue CD).  This means that, if there
is this sort of problem, we can boot into rescue mode from the network
(the BIOS is also redirected to the serial console) and not have to 
worry about swapping CDs.  The same thing should also work for remote
locations.


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


em0 watchdog timeout (and 3ware problems) 7-stable

2009-04-26 Thread Greg Byshenk
I have one machine that is seeing watchdog timeouts on em0, running 7-STABLE
amd64 as of 2009.04.19, and also some other more perverse errors.

Twice now in the last 48 hours, this machine has become unreachable via the
network, and connecting to the console shows an endless string of 

   [...]
   em0: watchdog timeout -- resetting
   em0: watchdog timeout -- resetting
   em0: watchdog timeout -- resetting

messages. The machine is almost locked up.  That is, I can get a login
prompt, but can go no further than typing in a username; after the
username, no password prompt, and nothing further.  The only option is
to hard reset the machine or to drop to debugger and reboot.

Now the perverse part.  After restarting, the system partition is no
more.

Background detail:  the machine is a fileserver, with a 3Ware 9650SE-16ML
SATA controller, connected to 16 1TB SATA drives, this configured as
a 14-drive RAID10 array (+ 2 hot spares), with a 50GB system partition
and 6.5TB data partition.  The system partition is configured as da0,
with one slice and more or less standard partitions for / /var /tmp, etc.
(the data partition of the array is sliced with gpt).

The issue here is that, upon restart, all partition information on da0
seems to have disappeared, and restarting results in a "no operating
system found" message, and a failure to boot (obviously).

But all of the data is still present.  If I boot into rescue mode,
recreate da0s1, mark it bootable, and restore the bsdlabel, then
everything works again.  I can restart the machine, and it comes back
up normally (it requires an fsck of everything on da0, but after that
everything is back to normal).

I don't know if this is two unrelated problems, or one problem with
two symptoms, or something else.  I think that I can safely say that
it is not a problem with the 3Ware controller itself, as I replaced
the controller with a spare (identical model), and the problem
recurred.  Additionally, I have an almost-identical configuration on
four other machines, none of which are experiencing any problems.
One thing that is different is that the other machines use
Intel PRO/1000 PF (pci-e) NICs.

Is there some known problem with the Intel 2572 fibre NIC?  Or some
potential interaction of it with the 3ware RAID controller?

For the moment, I've set hw.pci.enable_msi=0 (as discussed in the
threads on 7.2/bge), and am building a new kernel/world from sources
csup'd one hour ago, but I'd really like to hear any ideas about this
-- particularly the wiping of the label.

Some information about the system:


# /dev/da0s1:
8 partitions:
#        size    offset    fstype   [fsize bsize bps/cpg]
  a:   2097152         0    4.2BSD        0     0     0
  b:   8388608   2097152      swap
  c: 104856192         0    unused        0     0         # raw part, don't edit
  d:   8388608  10485760    4.2BSD        0     0     0
  e:   2097152  18874368    4.2BSD        0     0     0
  f:  41943040  20971520    4.2BSD        0     0     0
  g:  41941632  62914560    4.2BSD        0     0     0


e...@pci0:4:1:0: class=0x02 card=0x10038086 chip=0x10018086 rev=0x02 hdr=0x00
vendor = 'Intel Corporation'
device = '2572 10/100/1000 Ethernet Controller (Fiber)'
class  = network
subclass   = ethernet
bar   [10] = type Memory, range 32, base 0xda00, size 131072, enabled
bar   [14] = type Memory, range 32, base 0xda02, size 65536, enabled

t...@pci0:9:0:0: class=0x010400 card=0x100413c1 chip=0x100413c1 rev=0x01 hdr=0x00
device = '9650SE Series PCI-Express SATA2 Raid Controller'
class  = mass storage
subclass   = RAID
bar   [10] = type Prefetchable Memory, range 64, base 0xd800, size 33554432, enabled
bar   [18] = type Memory, range 64, base 0xda30, size 4096, enabled
bar   [20] = type I/O Port, range 32, base 0x3000, size 256, enabled
cap 01[40] = powerspec 2  supports D0 D1 D2 D3  current D0
cap 05[50] = MSI supports 32 messages, 64 bit
cap 10[70] = PCI-Express 1 legacy endpoint

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: X.Org/xdm 'frozen' after installworld (7-stable)

2009-02-04 Thread Greg Byshenk
On Tue, Feb 03, 2009 at 10:58:42AM -0800, Kent Stewart wrote:
 On Tuesday 03 February 2009 09:29:05 am Steve Franks wrote:

  This is a new weird one I've never had before.  Consoles work fine,
  but the mouse and keyboard won't move/type when xdm pops up.
  ctrl-alt-F2 takes you right to a working console, and the mouse works
  fine in the console...ctrl-alt-backspace no longer kills X either...
 
 The option that I found the easiest was to add
 
  Option "AutoAddDevices" "off"
 
 To the ServerFlags section. I was told in the ports list that you can add it 
 to the ServerLayout section but I could never make that work.

I had the same problem yesterday after updating X.

For me, adding dbus_enable=YES and hald_enable=YES to rc.conf and
restarting solved the problem.
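
As a rough sketch of the rc.conf change, the helper below appends a
name="YES" line only when the variable is not already present. The
enable_rc_var function and the RC_CONF override are hypothetical names
for this example, so that it can be tried on a scratch file before
touching /etc/rc.conf.

```shell
#!/bin/sh
# Sketch only: append name="YES" to an rc.conf-style file unless the
# variable is already set there (an existing value is left unchanged).
# RC_CONF is a hypothetical override used to point at a scratch file.

enable_rc_var() {
    var="$1"
    conf="${RC_CONF:-/etc/rc.conf}"
    if grep -q "^${var}=" "$conf" 2>/dev/null; then
        echo "${var} already set in ${conf}"
    else
        echo "${var}=\"YES\"" >> "$conf"
        echo "added ${var} to ${conf}"
    fi
}
```

Running enable_rc_var dbus_enable and enable_rc_var hald_enable as root,
followed by a restart, would match what is described above.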


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: SSH problem

2009-01-26 Thread Greg Byshenk
On Mon, Jan 26, 2009 at 11:21:57AM -0800, Xin LI wrote:
 Xian Chen wrote:

  I can use scp to move files from a linux to my Freebsd machine.
  
  But, when I try to use WinSCP under windows, it always failed. WinSCP
  errors: Network error: Connection refused. Both scp  sftp fail if using
  WinSCP.
  
  Any clues for this?

 My guess is that you have specified an incorrect port number.  Try tcpdump?

Another possibility, IIRC, is a bad ssh hostkey (I haven't used WinSCP in
quite some time, but I recall that its error messages are not particularly
informative).

You can also check to see if you can reach the server.  Try a plain telnet
to port 22.  You won't actually be able to establish a connection if you
aren't running ssh, but you should see something like:

   Connected to hostname.
   Escape character is '^]'.
   SSH-2.0-OpenSSH_5.1p1 FreeBSD-20080901
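
The same check can be scripted instead of done by hand with telnet. The
sketch below assumes an nc(1) that accepts -w (timeout in seconds); the
function names are made up for the example.

```shell
#!/bin/sh
# Sketch only: decide whether the first line a server sends looks like an
# SSH identification string ("SSH-<protoversion>-<software>").

is_ssh_banner() {
    case "$1" in
        SSH-*) return 0 ;;
        *)     return 1 ;;
    esac
}

# check_ssh_port HOST [PORT]: read the greeting with nc(1) and classify it.
# Assumes an nc that supports -w (timeout in seconds).
check_ssh_port() {
    banner=$(nc -w 5 "$1" "${2:-22}" </dev/null 2>/dev/null | head -n 1)
    if is_ssh_banner "$banner"; then
        echo "sshd answered: $banner"
    else
        echo "no SSH banner from $1 port ${2:-22}"
        return 1
    fi
}
```

For example, check_ssh_port hostname 22. If no banner comes back, the
cause is more likely the port, a firewall, or sshd not running than
anything specific to WinSCP.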

 
-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: mergemaster broken -- take 2

2009-01-08 Thread Greg Byshenk
On Thu, Jan 08, 2009 at 10:10:25AM +0200, Andrei Kolu wrote:
 Mike Lempriere wrote:

 Hi folks -- sorry to be a nag, but my main production system is barely 
 limping along on an old kernel with mismatched libraries.  I have no 
 idea what else to do -- please help!
 ---
 I'm upgrading 5-stable (was at 5.5) to 6-stable, in preparation for 
 6-stable to 7-stable.
 No problems with cvsup, make buildworld, make installworld, make 
 buildkernel, mergemaster -p.
 make installkernel, boot to single user.  Then mergemaster -- blammo:
 What is your exact make sequences are?
 
 I usually do this way:
 
 # csup /usr/share/examples/cvsup/standard-supfile
 # cd /usr/src
 here I usually softlink my kernel config file in /root directory to 
 appropriate architecture one and edit /etc/make.conf:
 ---
 SUP_UPDATE=yes
 SUPHOST=cvsup.no.FreeBSD.org
 SUPFILE=/usr/share/examples/cvsup/standard-supfile
 PORTSSUPFILE=/usr/share/examples/cvsup/ports-supfile
 DOCSUPFILE=/usr/share/examples/cvsup/doc-supfile
 KERNCONF=KERNEL
 ---
 /usr/src/sys/amd64/conf
 KERNEL - /root/kernel/KERNEL
 
 # make buildkernel
 # make installkernel
 # make buildworld
 # mergemaster -p
 # make installworld
 # mergemaster

It may be me that is mistaken, but this seems wrong to me, as does the
sequence in the original message:

   # cvsup
   # make buildworld
   # make installworld
   # make buildkernel
   # mergemaster -p.
   # make installkernel
   # boot to single user
   # mergemaster

If I am not very much mistaken, the canonical process is:

# make buildworld
# make buildkernel
# make installkernel
# reboot (*)
# mergemaster -p
# make installworld
# mergemaster

The reasons for the other methods being wrong are (as I understand them):

- You should build your new world before building your new kernel, as
  it may be the case that some aspects of the new kernel build are
  dependent upon aspects of the new world build.  If you build your
  new kernel before building your new world, you will be building 
  your new kernel against the old world.

- You should install your new kernel before installing your new world,
  as it can be the case that some aspects of the new world will not be
  understood by your old kernel. A new kernel should always be
  compatible with an old userland/world, but an old kernel may not 
  always be compatible with a new userland/world.

 NOTE: I do not reboot my system until everything is updated. Why it is 
 necessary to boot new kernel and then upgrade world is beyound me..YMMW

- I suppose that it is not strictly necessary to reboot between 
  installing kernel and world, but I always do so.  The reason for
  this is that, if something has gone horribly wrong, it is quite easy
  to go back and boot kernel.old. If you don't realize that there is
  something wrong until after you have installed everything (kernel and
  userland), it can be much more difficult to recover.
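
To make the ordering constraints above concrete, here is a small sketch
that only prints the sequence and lets one check that a given step comes
before another. Nothing is built or installed; the function names are
hypothetical, and /usr/src is assumed as the source directory.

```shell
#!/bin/sh
# Sketch only: print the update order described above, one step per line.
# Nothing is executed; the reboot is a manual step and is only printed.

update_steps() {
    cat <<'EOF'
cd /usr/src
make buildworld
make buildkernel
make installkernel
reboot
mergemaster -p
make installworld
mergemaster
EOF
}

# step_before A B: succeed if step A comes before step B in the list.
step_before() {
    a=$(update_steps | grep -n -x "$1" | cut -d: -f1)
    b=$(update_steps | grep -n -x "$2" | cut -d: -f1)
    [ -n "$a" ] && [ -n "$b" ] && [ "$a" -lt "$b" ]
}
```

The two rules map onto step_before "make buildworld" "make buildkernel"
and step_before "make installkernel" "make installworld", both of which
succeed for this list.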


-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Re: mergemaster broken -- take 2

2009-01-08 Thread Greg Byshenk
On Thu, Jan 08, 2009 at 11:47:42AM +0100, Oliver Fromme wrote:
 Greg Byshenk wrote:
   Andrei Kolu wrote:
 
NOTE: I do not reboot my system until everything is updated. Why it is 
necessary to boot new kernel and then upgrade world is beyound me..YMMW
   
   - I suppose that it is not strictly necessary to reboot between 
 installing kernel and world, but I always do so.
 
 It _is_ necessary.  If you don't reboot, you're still running
 the old kernel which might not be able to support new binaries
 and libraries that installworld will install on your system.

Of course this is correct; my error.

The chance of something going wrong in this case is probably quite
small, but if something does go wrong it can go horribly wrong.
 

-- 
greg byshenk  -  gbysh...@byshenk.net  -  Leiden, NL


Problem with Adaptec 29320LPE

2008-11-24 Thread Greg Byshenk
 0x0 0x0 0x0 0x0 0x0 0x0 0x0
Nov 20 15:01:16 backuphost kernel:  Dump Card State Ends 

Nov 20 15:01:16 backuphost kernel: (ch2:ahd0:0:0:0): warning, READ ELEMENT STATUS avail != count
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 43927d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31239d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31616d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 30983d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 30983d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31239d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31616d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 30215d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 25603d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31239d to a valid element type
Nov 20 15:01:16 backuphost kernel: ch: warning: could not map element source address 31616d to a valid element type


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Problem with Adaptec 29320LPE

2008-11-24 Thread Greg Byshenk
On Mon, Nov 24, 2008 at 12:49:12PM +0100, Rink Springer wrote:
 Hi Greg,
 
 On Mon, Nov 24, 2008 at 12:42:49PM +0100, Greg Byshenk wrote:
  backuphost# camcontrol devlist
  <SONY LIB-162 0208>                at scbus0 target 0 lun 0 (pass0,ch3)
  <SONY SDX-1100 0102>               at scbus0 target 1 lun 0 (sa3,pass1)
  <SONY LIB-162 0203>                at scbus0 target 2 lun 0 (pass2,ch4)
  <SONY SDX-900V 0102>               at scbus0 target 3 lun 0 (sa4,pass3)
  <AMCC 9650SE-16M DISK 3.06>        at scbus1 target 0 lun 0 (da0,pass4)
  <AMCC 9650SE-16M DISK 3.06>        at scbus1 target 0 lun 1 (da1,pass5)

 Are these volumes perhaps >2TB? If so, it won't work...  we stumbled on
 this at work a few weeks ago, and once we resized the volumes so they'd
 all be <2TB, the controller worked fine...
 
 As far as I know, this is the only workaround - I couldn't see relevant
 patches in Open/NetBSD either that might have fixed this issue :-(
 
The volume da1 is indeed >2TB, but it is not connected to the Adaptec
controller; it (along with da0) is actually a RAID-10 array connected to a
3Ware/AMCC SATA controller.  The Adaptec controller is used only for the
tape drives (the SDX-900V is AIT4; the SDX-1100 is AIT5), and they are <2TB.

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: System deadlock when using mksnap_ffs

2008-11-14 Thread Greg Byshenk
On Thu, Nov 13, 2008 at 05:08:10PM +0100, Greg Byshenk wrote:
 On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
  
  The rest of the below information is good -- but I'm confused about
  something: is there anyone out there who can use mksnap_ffs on a
  filesystem (/usr is a good test source) and NOT experience this
  deadlocking problem?  Literally *every* FreeBSD box I have root access
  to suffers from this problem, so I'm a little baffled why we end-users
  need to keep providing debugging output when it should be easy as pie
  for a developer to do dump -0 -L -a -f /path/fs.dump /usr and watch
  their system wedge.
 
 As an answer to the question (and additional information), I am 
 experiencing the problem, but not on all filesystems. 
 
 This is under FreeBSD 7.1-PRERELEASE #7: Thu Nov  6 11:29:52 CET 2008,
 amd64 (from sources csup'ed immediately prior to the build).
 
 I have four filesystems used for data storage:
 
 /dev/da1p1    96850470   7866026   81236408     9%    /export/mail
 /dev/da1p2  1937058312 972070320  810023328    55%    /export/home
 /dev/da1p3  1937058312  79027008 1703066640     4%    /export/misc
 /dev/da1p4  2598991534 271980564 2119091648    11%    /export/spare
 
 I can successfully mksnap_ffs the first (smaller) partition, but an
 attempt to do so on any of the others causes a lock.
 
 Note: this is a lockup, not a slow.  The system becomes unresponsive
 to any input, and there is no hard drive activity, and this does not
 change over a period of more than 12 hours.


As a followup to my own post, after reading this discussion, I applied
the patches and rebuilt my system last night.

As of today, with the patched ffs_snapshot.c, I can now make snapshots
of all the filesystems listed above.  It takes rather a long time, but
that is to be expected, I think, and the snapshots finish normally.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: System deadlock when using mksnap_ffs

2008-11-13 Thread Greg Byshenk
On Wed, Nov 12, 2008 at 08:42:00PM -0800, Jeremy Chadwick wrote:
 
 The rest of the below information is good -- but I'm confused about
 something: is there anyone out there who can use mksnap_ffs on a
 filesystem (/usr is a good test source) and NOT experience this
 deadlocking problem?  Literally *every* FreeBSD box I have root access
 to suffers from this problem, so I'm a little baffled why we end-users
 need to keep providing debugging output when it should be easy as pie
 for a developer to do dump -0 -L -a -f /path/fs.dump /usr and watch
 their system wedge.

As an answer to the question (and additional information), I am 
experiencing the problem, but not on all filesystems. 

This is under FreeBSD 7.1-PRERELEASE #7: Thu Nov  6 11:29:52 CET 2008,
amd64 (from sources csup'ed immediately prior to the build).

I have four filesystems used for data storage:

/dev/da1p1    96850470   7866026   81236408     9%    /export/mail
/dev/da1p2  1937058312 972070320  810023328    55%    /export/home
/dev/da1p3  1937058312  79027008 1703066640     4%    /export/misc
/dev/da1p4  2598991534 271980564 2119091648    11%    /export/spare

I can successfully mksnap_ffs the first (smaller) partition, but an
attempt to do so on any of the others causes a lock.

Note: this is a lockup, not a slow.  The system becomes unresponsive
to any input, and there is no hard drive activity, and this does not
change over a period of more than 12 hours.

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: challenge: end of life for 6.2 is premature with buggy 6.3

2008-06-08 Thread Greg Byshenk
On Sat, Jun 07, 2008 at 03:11:42PM -0700, Jo Rhett wrote:
 On Jun 7, 2008, at 1:44 PM, Patrick M. Hausen wrote:
 
 This is why EoLing 6.2 and forcing people to upgrade to a release
 with lots of known issues is a problem.

 People who have issues with RELENG_6_3 should upgrade to RELENG_6
 which is perfectly supported.
 
 I'm sorry, but you clearly don't run RELENG_6 on anything.  I run it  
 on two home computers, and grabbing it on any given day and trying to  
 run with it in production is insanity.  Lots and lots of things are  
 committed, reverted, recommitted, reverted and then finally  
 redesigned.  Each of those steps are often committed to the source  
 tree.  The -RELEASE versions prevent this kind of insanity.

I can't speak for Patrick, but I can add that I very definitely _do_
run RELENG_6 on ~40 machines (web, mail, file, and applications
servers), and do so without any serious problems. Which is not to say
that there are never problems, but that when there have been problems,
they have been uncovered during testing.

Of course it is true that "grabbing something and trying to run
with it in production is insanity". But this (at least IMO) has
nothing at all to do with RELENG_X _per_ _se_, as it applies equally
to X-RELEASE, and also to any production systems running any other OS.
Before we roll out a new RELENG_6 build, we test it first to discover
any potential problems -- but this is standard practice for
_everything_ that goes into production, including changes to Linux,
Solaris, and Windows systems, and also changes to samba, apache, or any
other software running on the systems.  My point here is that it is the
"grabbing something and throwing it into production without testing"
that is insanity, and that this has nothing specifically to do with
RELENG_6.

I might also add that I have machines that grab (actually, pretty 
much randomly -- that is, on a given day and without particular 
concern from me) RELENG_6 and RELENG_7, and even these machines very
rarely exhibit any problems. Of course, these are just test machines,
and without the full pre-production testing it is possible that there
are some problems in these cases that just don't manifest themselves,
but my experience (and, I suspect, that of many others) indicates that
your description of RELENG_6 as a seething cauldron of uncertainty is
inaccurate. 


 I'm struggling to find a phrase here that can't be taken to be an  
 insult, so forgive me and try to understand when I say that you really  
 should try watching the cvs tree for a bit before making a nonsense  
 comment like that.

You don't seem to have struggled very hard. After all, you could have
made the same point by noting that you consider it a mistake to run
RELENG_6 in production. And by not doing this, you have undermined
your own position, as it seems clear that there are _many_ people and
organizations who run RELENG_6 in production (by which I mean, some
version of RELENG_6, and not the tracking of daily changes to RELENG_6),
which means that your assertion that such is "nonsense" is itself
mistaken.

Somewhat more generally, this sort of thing may be why you are getting
the amount of push-back you see. That is, what you are claiming seems
to match the experience of few (if any) others.  As you may have
noticed from this thread, the general view (a consensus, seemingly,
apart from yourself) is that 6.3 is _better_ (more stable, etc.) than
6.2.  Given that such is the case (as it seems very much to be), then
the response to your statement that "6.3 isn't good enough" of "what
exactly is wrong?" seems (at least to me) to be entirely reasonable.
When one of my people comes to me and says that something is wrong with
X (and particularly when my experience is that there is nothing wrong
with X), my first response is almost invariably:  "what, specifically,
is wrong with X?"


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: challenge: end of life for 6.2 is premature with buggy 6.3

2008-06-04 Thread Greg Byshenk
On Wed, Jun 04, 2008 at 04:41:45PM -0500, Kevin Kinsey wrote:
 Clifton Royston wrote:

   For example, if I take a 6.3R CD, or build one for 6-RELENG, is there
 a way to do an upgrade in place on each server?  Or would it work
 better to do a build from recent source on the development server, then
 export /usr/src and /usr/obj via NFS to the production servers and do
 the usual make installkernel; reboot; etc. sequence on them?  (In my
 case I do have all machines on one GigE switch.)
 
 I've heard of the latter being done with decent results.

I can't say that it is better, but I do the latter (well, actually I
build on a test machine to make sure there are no problems, then sync
to an NFS server and mount src and object from there, followed by
installkernel-reboot-installworld-merge-reboot) on a number of different
machines (currently running 6.3-STABLE of 2008-05-22 and 7.0-STABLE of
2008-05-27), and it is certainly faster and easier than doing a build
on each individual machine.

I do the same thing with ports, doing a 'portupgrade -p' on the build
machine followed by a 'portupgrade -P' on the clients (building
packages on the build machine, and then installing via my own packages
on the others).  Again, I can't say that it is better, but it is
certainly faster and easier.
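
For what it's worth, the client side of the procedure described above
might be sketched roughly as follows.  This is only an illustration under
stated assumptions: the server name 'buildhost', the idea that it exports
/usr/src and /usr/obj over NFS, the '-a' flag on portupgrade, and the
dry-run helper 'run' are all hypothetical and not taken from the thread.
By default it only prints the commands; the reboots between phases are
left as comments.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
# Set DRY_RUN=0 to really run the steps (as root, on a FreeBSD client).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Hypothetical build/NFS server name (an assumption for this sketch).
BUILDHOST="${BUILDHOST:-buildhost}"

# Mount the source and object trees that were built and tested elsewhere.
mount_build_trees() {
    run mount -t nfs "${BUILDHOST}:/usr/src" /usr/src
    run mount -t nfs "${BUILDHOST}:/usr/obj" /usr/obj
}

# Phase 1: install the new kernel, then reboot.
phase_kernel() {
    mount_build_trees
    run cd /usr/src
    run make installkernel
}

# Phase 2 (after the reboot): install world, merge config, reboot again.
phase_world() {
    mount_build_trees
    run cd /usr/src
    run make installworld
    run mergemaster
}

# Ports: build packages on the build machine (-p), then install from
# those packages on the clients (-P).
ports_on_build_machine() { run portupgrade -a -p; }
ports_on_client()        { run portupgrade -a -P; }
```

The clients also need to be able to see the packages produced on the
build machine (for example via another NFS mount); how that is arranged
is site-specific and not shown here.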

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: possible zfs bug? lost all pools

2008-05-18 Thread Greg Byshenk
On Sun, May 18, 2008 at 09:56:17AM -0300, JoaoBR wrote:
 
 after trying to mount my zfs pools in single user mode I got the following 
 message for each:
 
 May 18 09:09:36 gw kernel: ZFS: WARNING: pool 'cache1' could not be loaded as 
 it was last accessed by another system (host: gw.bb1.matik.com.br hostid: 
 0xbefb4a0f).  See: http://www.sun.com/msg/ZFS-8000-EY
 
 any zpool cmd returned nothing else as not existing zfs, seems the zfs info 
 on 
 disks was gone
 
 to double-check I recreated them, rebooted in single user mode and repeated 
 the story, same thing, trying to /etc/rc.d/zfs start returnes the above msg 
 and pools are gone ...
 
 I guess this is kind of wrong 


I think that the problem is related to the absence of a hostid when in
single-user mode.  Try running '/etc/rc.d/hostid start' before mounting.

http://lists.freebsd.org/pipermail/freebsd-current/2007-July/075001.html
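
A minimal sketch of that suggestion, assuming a root shell in single-user
mode on a system with the stock rc.d scripts.  The 'run' helper is
hypothetical and only prints the commands unless DRY_RUN=0 is set:

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Set the hostid first (the suspicion above is that its absence makes ZFS
# treat the pools as last used by another system), then start ZFS.
single_user_zfs_start() {
    run /etc/rc.d/hostid start
    run /etc/rc.d/zfs start
}

single_user_zfs_start
```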


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: samba build failure on 6-STABLE

2008-05-03 Thread Greg Byshenk
On Sat, May 03, 2008 at 11:42:14AM +0100, Doug Rabson wrote:
 On 1 May 2008, at 15:39, Michael Proto wrote:
 Greg Byshenk wrote:

 [...] Basically my problem is that the current Samba3 (samba-3.0.28,1)  
 won't build on a recent 6-STABLE system (I noticed it with sources
 csup'd 24 April, and it continues with sources csup'd today, 1 May).
 The strange thing is that this is a version of samba that has
 previously built successfully,  on the machine and with the
 configuration that is now failing.  (I was  attempting
 to rebuild because I saw some strange library errors.)  This at least
 suggests to me that the problem is _not_ due to something changing  
 with Samba, but to some other change that is being reflected in the
 Samba build.  [...]

 I can confirm this on a 6-STABLE system last SUPed (kernel and world
 rebuilt) to 20080428 11:23 EDT. samba-3.0.28,1 built fine on this box
 when it was 6.3-RELEASE, and now fails in exactly the same place when
 trying to rebuild on 6-STABLE.
 
 The attached patch should fix the problem.

It appears that it does.

I've applied the patch on a test machine, and Samba now builds successfully.
I've also done a reinstall of Samba, and the rebuilt version appears to be
working properly (though I have not yet done any extensive testing).

Thanks,
-greg

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


samba build failure on 6-STABLE

2008-05-01 Thread Greg Byshenk
I'm posting this to freebsd-stable even though it is a problem with a port,
because the port itself has not changed, but a rebuild fails (on a system
and with a configuration that worked before my most recent system updates).

Basically my problem is that the current Samba3 (samba-3.0.28,1) won't build
on a recent 6-STABLE system (I noticed it with sources csup'd 24 April, and
it continues with sources csup'd today, 1 May). The strange thing is that
this is a version of samba that has previously built successfully, on the
machine and with the configuration that is now failing.  (I was attempting
to rebuild because I saw some strange library errors.)  This at least
suggests to me that the problem is _not_ due to something changing with Samba,
but to some other change that is being reflected in the Samba build.


The system in question is built from sources csup'd today (1 May 2008), with
all installed ports current as of today.  The same Samba did build successfully
with a source and ports tree csup'd on 7 March 2008.

As a test to see if there is some problem with the ports dependencies, I've 
tried a 'portupgrade -fR samba'; all of the dependencies built fine, but then
I got the same error when attempting to build Samba itself. It is not
definitive, but this suggests to me that this is not a ports problem (per se),
but a kernel/world problem.

This latter is highlighted by the fact that Samba builds without error on a
system with sources csup'd on 17 April.  That is, if I take the exact same
system on which the build fails, revert my world/kernel to a build from
17 April (leaving everything else exactly the same), then the error 
disappears and Samba builds successfully.


The actual error is below. Any ideas are welcome. I have a machine that I can
play with if someone would like me to try anything.

-greg


Compiling smbd/oplock_linux.c
smbd/oplock_linux.c: In function `signal_handler':
smbd/oplock_linux.c:73: error: structure has no member named `si_fd'
The following command failed:
cc -I. -I/usr/ports/net/samba3/work/samba-3.0.28/source  -O2 
-fno-strict-aliasing -pipe -D_SAMBA_BUILD_=3 -I/usr/local/include  
-I/usr/ports/net/samba3/work/samba-3.0.28/source/iniparser/src -Iinclude 
-I./include  -I. -I. -I./lib/replace -I./lib/talloc -I./tdb/include 
-I./libaddns -I./librpc -DHAVE_CONFIG_H  -I/usr/local/include -DLDAP_DEPRECATED 
   -I/usr/ports/net/samba3/work/samba-3.0.28/source/lib -D_SAMBA_BUILD_=3 -fPIC 
-DPIC -c smbd/oplock_linux.c -o smbd/oplock_linux.o
*** Error code 1

Stop in /usr/ports/net/samba3/work/samba-3.0.28/source.
*** Error code 1

Stop in /usr/ports/net/samba3.
*** Error code 1

Stop in /usr/ports/net/samba3.

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Recent bootloaders not working also on FIC PA-2005 board

2008-04-09 Thread Greg Byshenk
On Wed, Apr 09, 2008 at 10:32:48PM +0200, Marcin Cieslak wrote:

 It would be these changes.  Debugging this will be hard. :(  Are you 
 familiar with x86 assembly at all?

In relation to John Baldwin's question, I (at least) have basically zero
knowledge of x86 assembler.  :-(
 

But this bit caught my eye:

 How can I try to debug this? I have tried to attach serial console
 with AT keyboard unplugged I still get message that VGA console will
 be used. The serial port is working correctly (verified with Windows and
 later with NetBSD).

 Can I get serial console while booting from CDROM - do I need to remove
 VGA card for this?

When my error occurs (with the Asus TR-DLS), I get the message about
using internal console (vga?), even when the machine is set to 
use a serial console.  I don't know if this is relevant, but in my
case I can't use serial.


I can also add that -- though I am not much of a programmer -- I will
happily test anything that anyone might suggest.  My machine is not
in production (I built it to do some testing with FreeBSD7 and ZFS),
and I can break it without any real consequences.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


7-STABLE bootloader not working on Asus TR-DLS

2008-04-08 Thread Greg Byshenk
I'm piggybacking this onto the previous bootloader thread because I have
a suspicion that my problem may be related to the 'fix' for the previous
problem.

I've got a machine (old-ish) that will not boot with the changes to
src/sys/boot/i386 in March.

It is a dual-P3 system running on an Asus TR-DLS motherboard (with the most
recent BIOS updates, which date from 2002):

   Timecounter i8254 frequency 1193182 Hz quality 0
   CPU: Intel(R) Pentium(R) III CPU family  1266MHz (1266.72-MHz 686-class CPU)
     Origin = GenuineIntel  Id = 0x6b1  Stepping = 1
     Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
   real memory  = 2147463168 (2047 MB)
   avail memory = 2091913216 (1995 MB)
   ACPI APIC Table: <ASUS   TR-DLS  >
   FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
    cpu0 (BSP): APIC ID:  3
    cpu1 (AP): APIC ID:  0

When I install the most recent world (for example, a build of 7-STABLE from
01-04-2008), it simply fails to boot.  No panic, no crash, but just stops.

I get to:

   [...]
   BTX loader 1.00  BTX version is 1.02
   Consoles: internal video/keyboard
   BIOS drive A: is disk0

   ... and then nothing ... just hangs permanently

If I change back to 7-RELEASE, or to 7-STABLE as of 18-03-2008, there is 
no problem at all. If I run the system with 01-04-2008 world, but copy
back in the contents of /boot from 18-03-2008, then there is again no
problem. I can copy in the 01-04-2008 kernel and run under that, and there
is no problem (it is running like that now).  But I have to use the old
version of the bootloader.

I'm not a coder, and haven't looked more deeply, but it appears that 
something in here:

   i386/src/sys/boot/i386/btx/btx/Makefile
   i386/src/sys/boot/i386/btx/btx/btx.S
   i386/src/sys/boot/i386/libi386/biosmem.c
   i386/src/sys/boot/i386/libi386/biossmap.c

...has broken booting on this machine.


Any advice gladly accepted.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Cannot mount a nfs share after doing a snapshot

2008-01-07 Thread Greg Byshenk
On Sun, Jan 06, 2008 at 05:38:30PM +0100, Jose Garcia Juanino wrote:
 On Sunday 06 January at 15:41:21 CET, Greg Byshenk wrote:
  On Sat, Jan 05, 2008 at 11:28:31PM +0100, Jose Garcia Juanino wrote:
   
   I have a 7.0-PRERELEASE i386 system with an nfs server, with a single
   export line in the /etc/exports file:
   
   / -maproot=root -network 192.168.1.0 -mask 255.255.255.0
   
   After a reboot, I have no problem mounting this nfs share from a nfs 
   client.
   But after issuing the following command on the server:
   
   # mount -u -o snapshot /.snap/now /

  Is the problem that you are trying to mount your snapshot on top of the /
  directory?  I use snapshots, but have never tried to do this, and can 
  imagine that there might be a problem, since the snapshot is itself a
  snapshot of a filesystem (different than the actual root filesystem).
  
  That would explain the error:

   Jan  5 22:47:03 gauss mountd[542]: can't delete exports for /: Cross-device link
 
 No, I am not trying to mount the snapshot. I am just taking (making) the
 snapshot, as man mount says.

Sorry, I wasn't following this (as I said, I don't work with snapshots in
this way).

I've looked at the 'mount' man page, and it seems that it should work the
way you are trying to do it. That said, because taking a snapshot grabs
the entirety of a filesystem, I can well imagine that trying to take a 
snapshot of the root filesystem while at the same time exporting that
filesystem via NFS will cause a problem.

  What happens if you create a directory and mount your snapshot there:
  
  mkdir /snapshotmount
  mount -u -o snapshot /.snap/now /snapshotmount
 
  If this works, then you may need a separate exports line for /snapshotmount.
 
 # file /.snap/now
 /.snap/now: Unix Fast File system [v2] (little-endian) last mounted on
 /, last written at Sun Jan  6 16:24:19 2008, clean flag 1, readonly flag
 1, number of blocks 130721, number of data blocks 126520, number of
 cylinder groups 4, block size 16384, fragment size 2048, average file
 size 16384, average number of files in dir 64, pending blocks to free 0,
 pending inodes to free 0, system-wide uuid 0, minimum percentage of free
 blocks 8, TIME optimization

Ok, so your /.snap/now snapshot actually exists and is being made, which
means that the command

# mount -u -o snapshot /.snap/now /

is actually working. (So ignore the rest of what I said last time...)


I've just played with this a bit myself (I'm no expert, but I use snapshots
currently with 6-STABLE and want to know about any future problems), and I
can reproduce the problem (7.0-PRERELEASE as of 2 Jan 2008). I see the same
sort of errors as you report, and they cannot be cleared even by removing
the snapshot file and restarting nfsd/mountd. The only solution appears to
be to remove the snapshot and restart the machine. I can see how this might
be a bit inconvenient.

That said, there appears to be a problem with using the 

# mount -u -o snapshot <snapshot> <filesystem>

form of the command.

The problem does _not_ occur (at least in my test) if you use the

# mksnap_ffs <filesystem> <snapshot>

command. Can you try taking a snapshot using mksnap_ffs?

If mksnap_ffs works, while 'mount -u -o' fails, then it looks like a bug...
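
With the paths from this thread filled in, the two forms being compared
look roughly like this.  Note that the argument order differs between
them.  The 'run' helper is a hypothetical dry-run wrapper (it only prints
unless DRY_RUN=0 is set), and the argument order for mksnap_ffs is the one
given above for 7.0-PRERELEASE; check mksnap_ffs(8) on your own version.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Form 1: mount -u -o snapshot <snapshot> <filesystem>
# (the form that triggered the NFS export errors in this thread)
snapshot_via_mount() {
    fs="$1"; snap="$2"
    run mount -u -o snapshot "$snap" "$fs"
}

# Form 2: mksnap_ffs <filesystem> <snapshot>
# (the form that did not show the problem in my test)
snapshot_via_mksnap() {
    fs="$1"; snap="$2"
    run mksnap_ffs "$fs" "$snap"
}

snapshot_via_mount  / /.snap/now
snapshot_via_mksnap / /.snap/now
```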

-greg

 

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Cannot mount a nfs share after doing a snapshot

2008-01-06 Thread Greg Byshenk
 .
 Mounting late file systems:
 .
 Starting ntpd.
 postfix/postfix-script: starting the Postfix mail system
 Starting distccd.
 Performing sanity check on apache22 configuration:
 Syntax OK
 Starting apache22.
 Starting anacron.
 Configuring syscons:
  keymap
  keyrate
  font8x16
  font8x14
  font8x8
  blanktime
 .
 Starting sshd.
 Starting cron.
 Local package initialization:
 #
 
 
 
 Also, my /etc/src.conf used to build the world:
 
 
 #
 WITHOUT_ACPI=1
 WITHOUT_ASSERT_DEBUG=1
 WITHOUT_ATM=1
 WITHOUT_AUDIT=1
 WITHOUT_AUTHPF=1
 WITHOUT_BIND_DNSSEC=1
 WITHOUT_BIND_ETC=1
 WITHOUT_BIND_LIBS_LWRES=1
 WITHOUT_BIND_MTREE=1
 WITHOUT_BIND_NAMED=1
 WITHOUT_BLUETOOTH=1
 WITHOUT_I4B=1
 WITHOUT_IPFILTER=1
 WITHOUT_IPX=1
 WITHOUT_KERBEROS=1
 WITHOUT_LPR=1
 WITHOUT_NIS=1
 WITHOUT_PF=1
 WITHOUT_PROFILE=1
 WITHOUT_SENDMAIL=1
 WITHOUT_SHAREDOCS=1
 #
 
 
 The /etc/make.conf file:
 
 #
 CPUTYPE?=pentium3
 MODULES_OVERRIDE=   linux if_tap sound/driver/emu10k1  syscons/green \
 linprocfs linsysfs  smbfs ntfs ext2fs libiconv \
 libmchain aio if_bridge vesa \
 cd9660_iconv udf_iconv msdosfs_iconv ntfs_iconv \
 zfs bridgestp
 BOOT_COMCONSOLE_PORT=   0x3F8
 BOOT_COMCONSOLE_SPEED=  115200
 PERL_VER=5.8.8
 PERL_VERSION=5.8.8
 #
 
 
 
 Regards



-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Nagios + 6.3-RELEASE == Hung Process

2008-01-02 Thread Greg Byshenk
On Wed, Jan 02, 2008 at 07:24:28PM -0400, Marc G. Fournier wrote:
 - --On Wednesday, January 02, 2008 22:54:33 + Tom Judge [EMAIL PROTECTED] wrote:

  Not sure if this is related at all but out of the 3 nagios deployments we
  have here I have only ever seen it on one (It currently has 2 nagios threads
  spinning CPU time atm).

  The differences on that server are:
 
  * It is amd64 compared to i386

 I never tried on i386, but in my case it was an amd64 system as well ... not 
 sure if that is relevant or not ... has anyone seen this problem *with* i386?

Yes.

We run Nagios on an i386 machine (dual Athlon MP 1800+), and I first saw this
problem with a build of 6-STABLE as of 2007-10-04, and it continues (if I don't
use the libmap.conf settings) with the running system of 6.3-PRERELEASE as of
2007-12-18 and nagios-2.10 (from ports of same date).

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: 7.0-BETA1

2007-10-24 Thread Greg Byshenk
On Tue, Oct 23, 2007 at 11:08:27PM +0200, Per olof Ljungmark wrote:
 rihad wrote:

 How risky is it to start using 7.0-BETA1 in production, with the 
 intention of upgrading to release as soon as possible? Thanks.
 
 We've used 7-CURRENT since January on a couple of production boxes and 
 had very few disasters, well, none, but a couple of issues.
 
 Risky is a relative term really, but if you ask me I'd say the risk 
 is rather low.
 
 But: TEST FIRST!

I concur with Per.  I've been running 7-CURRENT on a couple of production
machines for some months, without any serious problems -- but these are not
mission-critical machines.

Risk is a relative thing, and it is relative to both the risk of failure and
the cost of that failure should it occur.  I have 7- running on one fileserver
that is used only by our IT group (for online copies of distfiles and other
installable software), meaning that if something should go horribly wrong, it
would be an annoyance, but not a disaster. The same could _not_ be said about
our central user fileservers, and so they do not run 7-.

I could also note that I've been running 7-CURRENT on my own workstation
(including X, but only fvwm2 and nothing too fancy) for about 6 months, and
have experienced no serious problems (though I have swapped out SCHED_4BSD
for SCHED_ULE due to poor interactivity with 4BSD).


And I also emphasise:  TEST FIRST!  My situation is not the same as yours,
and something that works fine in my environment may break horribly in yours.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: 7.0-BETA1

2007-10-24 Thread Greg Byshenk
On Wed, Oct 24, 2007 at 02:00:42PM +0500, rihad wrote:
 rihad wrote:

 How risky is it to start using 7.0-BETA1 in production, with the 
 intention of upgrading to release as soon as possible? Thanks.

 My question was more a theoretical one: it's called BETA for some 
 reason, otherwise it'd still be in HEAD. To me BETA means that no major 
 architectural changes are expected in it any more, no?

Yes, but it doesn't mean that there can't be undiscovered bugs that could
cause problems.

 
 Our machine-to-be is quite mission-critical... But if I start with the 
 latest 6.x release, it would be more difficult to migrate to 7.0 when it 
 comes out than if I start with 7.0-BETA?. I've known people running 
 4-STABLE or 5-STABLE branches on mission-critical machines, without even 
 bothering to upgrade, but I think they're stress-testing their luck ;-) 
 So I don't want to join their camp, that's why I asked for advice ;-) 
 Again it's named BETA for a reason, so it could be less intrusive than 
 STABLE?..
 
 I will definitely start with beta if it reaches BETA2 in a week or two - 
 the time I got ;-) Thanks for advice.

Well, if it is a machine-to-be, then I suspect that you should be safe
in starting with 7.0-BETA. First, there don't appear to be any serious
problems with it, and second, if it is a new build machine-to-be, then
you will have the opportunity to do the testing required to ensure that
there are no problems (in your situation) prior to rollout.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: can I do 6.1-RELEASE to 6.2 via cvsup

2007-10-24 Thread Greg Byshenk
On Wed, Oct 24, 2007 at 11:18:30AM -0400, Tuc at T-B-O-H.NET wrote:

Also, the list of things to do is a bit mis-ordered and truncated. The
official list is in /usr/src/UPDATING and reads:

make sure you have good level 0 dumps
make buildworld
make kernel KERNCONF=YOUR_KERNEL_HERE
[1]
reboot in single user [3]
mergemaster -p  [5]
make installworld
make delete-old

   Um, I went to go check the file on a 7.0-BETA1 I just installed and
 am doing the ground up on.. And I just realized something...
 
   WHERE is the step to install the kernel?? I always thought it was :
 
 make buildworld
 make kernel KERNCONF=YOUR_KERNEL_HERE
   make installkernel KERNCONF=YOUR_KERNEL_HERE
   [1]
   reboot in single  [3]


Pay attention to the make options (you can find them in /usr/src/Makefile).

'make kernel' is equivalent to 'make buildkernel + installkernel', just like
'make world' is equivalent to 'make buildworld + installworld'. The latter
can be dangerous, but the former usually isn't.

One process is:

[csup, etc.]
make buildworld
make buildkernel
make installkernel  [reboot single user]
[mergemaster -p if necessary]
make installworld   
mergemaster [reboot]
[ports or other stuff]

If you wish, the 'make buildkernel' + 'make installkernel' can be replaced
with 'make kernel', which does them both in sequence with one command.
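
The ordering above can be sketched as a small helper that prints (or, with
DRY_RUN=0, runs) the commands for each phase.  The phase names, the 'run'
wrapper, and the default KERNCONF of GENERIC are assumptions made for
illustration; the reboots are deliberately left to the operator.

```shell
#!/bin/sh
# Dry-run by default: print each command instead of executing it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Kernel configuration name; GENERIC is only a placeholder default.
KERNCONF="${KERNCONF:-GENERIC}"

upgrade_phase() {
    case "$1" in
        build)
            run cd /usr/src
            run make buildworld
            run make buildkernel "KERNCONF=${KERNCONF}"
            ;;
        kernel)
            run cd /usr/src
            run make installkernel "KERNCONF=${KERNCONF}"
            # now reboot into single-user mode
            ;;
        world)
            run cd /usr/src
            run mergemaster -p
            run make installworld
            run mergemaster
            # now reboot
            ;;
        *)
            echo "usage: upgrade_phase build|kernel|world" >&2
            return 1
            ;;
    esac
}
```

Typical use would be 'upgrade_phase build', then 'upgrade_phase kernel',
a reboot to single user, and finally 'upgrade_phase world'.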


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: FreeBSD 6.x, NIS, local root password, and nsswitch.conf

2006-11-22 Thread Greg Byshenk
On Wed, Nov 22, 2006 at 10:49:01PM +0800, David Adam wrote:
 On Wed, 22 Nov 2006, Gerrit Kühn wrote:
  On Wed, 22 Nov 2006 09:07:34 -0500 (EST) Mark Hennessy [EMAIL PROTECTED]
  wrote about Re: FreeBSD 6.x, NIS, local root password, and nsswitch.conf:

  MH I'm a bit unsure about it myself.
  MH I tried exactly what you suggested, putting files on the compat line
  MH and before nis for both passwd and groups on the NIS slave server
  MH only, and no go.  Perhaps it is the master server that actually
  MH controls this? I don't know.  Any further advice would be greatly
  MH appreciated.

  Sorry to disturb, but I don't understand why you distribute the server's
  root pw via NIS at all. Is it really shown by ypcat passwd on the
  client? If so, how about removing it from the list of exported accounts?
 
 That's a really good point. When you consider the inherent insecurity of
 NIS, having a root password in the maps is a pretty bad plan anyway.
 
 Given my vague handwaving at PAM, and the fact that the OP probably has
 NIS as sufficient above pam_unix, the obvious solution if my unverified
 assertions are correct is to remove the root password from the NIS maps.

I could be mistaken, but isn't the 'compat' entry there to cover the case of
the old format passwd/group files, in which one used '+:...' or similar to
include NIS (or other authentication)?  As such, 'compat' means use the
file, plus whatever is added under 'compat', further meaning that you 
can have only one entry under 'compat'.

So, if you want old style behavior, what you want is something like:

   passwd: compat
   passwd_compat: nis

Alternatively, you can use something like:

   passwd: files nis
   # passwd_compat: nis

or even:

   passwd: winbind nis files
   # passwd_compat: nis


[Corrections welcome if I have this wrong]
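
As a quick way to see which of the variants above a given machine is
actually using, something like the following could be used.  It is only a
sketch: the function name 'passwd_sources' is made up for this example,
and it simply prints the sources listed on the 'passwd:' line of an
nsswitch.conf-style file, ignoring anything after a '#'.

```shell
#!/bin/sh
# Print the lookup sources from the "passwd:" line of an nsswitch.conf-style
# file (default: /etc/nsswitch.conf).  Commented-out lines such as
# "# passwd_compat: nis" are skipped because their first field is "#".
passwd_sources() {
    awk '$1 == "passwd:" {
        out = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^#/) break          # stop at a trailing comment
            out = out (out == "" ? "" : " ") $i
        }
        print out
    }' "${1:-/etc/nsswitch.conf}"
}
```

If this prints just 'compat', the old-style '+' entries and the
passwd_compat line are what decide where NIS comes in; if it prints
something like 'files nis', the order on that line is what matters.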


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Cruel and unusual problems with Proliant ML350

2006-11-13 Thread Greg Byshenk
On Mon, Nov 13, 2006 at 09:19:45AM -0800, Jeremy Chadwick wrote:
 
 I'll agree with this (re: webservers not needing USB), except in
 regards to one item: keyboards.
 
 More and more x86 PCs these days are expecting keyboards to be
 USB-based.  Yes, PS/2 ports are still present on most (but not all)
 motherboards, but eventually that will be phased out.
 
 I like the idea of being able to go to my co-location facility and
 plug in a USB keyboard to begin working on a server, and when
 finished remove the keyboard and leave.

Don't you really need to have a monitor, as well?  I _have_ worked
blind before, but I didn't enjoy it.  I can imagine having a 
keyboard with me when wandering around, but wouldn't normally have
a monitor.  I had always thought that the preferred solution for 
this sort of case was to use a serial console.

And what seems to be becoming common on servers is a BIOS that allows
you to fully redirect to serial, including BIOS configuration.  The
servers that I have recently purchased have had a keyboard and monitor
plugged into them _once_ -- for the first BIOS setup -- and then never
again.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: em driver testing

2006-11-07 Thread Greg Byshenk
On Mon, Nov 06, 2006 at 04:14:40PM -0800, Jack Vogel wrote:
 Well, so run 6.2 BETA3 plus the patch I posted as Patrick
 mentioned and then report on that. You've got a lot of
 potential problem areas here, I have no experience with
 samba on FreeBSD. And that motherboard only has PCI
 as I recall, yes? Still, it should get rid of the watchdogs
 unless you have real hardware issues.


As a point of information, I don't think that samba specifically has
anything to do with the problem.

I am running samba on FreeBSD, and have two servers that are rather
heavily used (one is the filestore for a CFD cluster, and the other
for a Maya/Muster rendering cluster), each having two em interfaces
and SMP -- and have not seen any watchdog issues (they are currently
running FreeBSD 6.2-PRERELEASE as of Oct  7 -- but no problems with
any earlier 6.1-STABLE versions either).
 

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: probs on 6.2-prerelease

2006-09-25 Thread Greg Byshenk
On Mon, Sep 25, 2006 at 09:08:26AM -0400, Michael Proto wrote:
 Michael Vince wrote:

  I don't know if this is pre 6.2 specific but I changed my /etc/ttys for
  device ttyd0 to 'on' from 'off' and when I rebooted the pc I couldn't
  login via regular KVM console, just don't get a login.
  The more alarming thing was that while it appeared everything was
  booting up from the boot up messages on the screen, I couldn't remotely
  log into the server in fact it appears the machine didn't bring up the
  Ethernet device as I couldn't even ping it.
  As soon as I switched the ttyd0 back to 'off' and rebooted it I could
  ssh back into the server etc.
  I have a regular kernel and 1 jail and samba on this machine.
 
 I know this isn't a "yes I'm having problems" response but thought it
 might be useful anyway.
 
 I'm running 6.2-pre on a Soekris Engineering Net4501 with ttyd0 enabled
 in /etc/ttys and I'm not having any problems with the system booting or
 logging in via serial console. SSH logins work fine and the network is
 brought-up as normal during boot. I've had this system in the same
 config (in regards to /etc/ttys) since the 6 was still the HEAD branch
 and I have yet to see problems with it. One difference here is that I
 don't have any virtual consoles enabled BUT ttyd0 (and
 pseudo-terminals), as this box doesn't have a video card, just a serial
 port.

I can also report no problems running 6.2-pre on i686.  I am running it on
several machines using serial consoles; these are machines _with_ video
cards, but the cards are mostly unused (one machine has a KVM connected, and
it works fine as well).  No problems with video, no problems with network,
no problems with ssh login, etc.

-greg


FreeBSD xxx.xxx.com 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #21: Tue Sep 19 
19:37:00 CEST 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/  i386

/etc/ttys:
[...]
ttyd0   "/usr/libexec/getty std.9600"   xterm   on  secure
[...]

/boot/loader.conf
[...]
console=comconsole
[...]
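
For anyone comparing their own setup against the two fragments above, a
small check along these lines may help.  It is only a sketch: the function
name tty_status and its optional file argument are made up for
illustration, and it assumes the usual whitespace-separated /etc/ttys
layout (name, getty command, terminal type, status, flags).

```shell
#!/bin/sh
# Sketch only: report whether a terminal line in an /etc/ttys-style file
# is switched "on" or "off".  tty_status is a hypothetical helper name.
tty_status() {
    tty="$1"
    file="${2:-/etc/ttys}"
    # Commented-out entries never match because their first field starts
    # with "#".  For a matching entry, print the first on/off status word.
    awk -v tty="$tty" '
        $1 == tty {
            for (i = 2; i <= NF; i++) {
                if ($i == "on" || $i == "off") { print $i; found = 1; exit }
            }
        }
        END { if (!found) print "missing" }
    ' "$file"
}
```

Running "tty_status ttyd0" on a box set up like the one above should print
"on"; "missing" would suggest the entry is commented out or absent.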


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: ARRRRGH! Guys, who's breaking -STABLE's GMIRROR code?!

2006-09-16 Thread Greg Byshenk
On Fri, Sep 15, 2006 at 03:41:04PM -0300, Marc G. Fournier wrote:

 But, I'm just curious here ... for all of the talk going around about this 
 whole issue, how many ppl have truly ever been bitten by an unstable 
 -STABLE?  And for those that have, how long did it take to get help from a 
 developer to get it fixed?

I run -STABLE on a number of production machines.

I have twice been bitten by an unstable -STABLE -- but bitten in a 
very small way.

When we build a new -STABLE (on average perhaps once per month), we
build it on a test machine, so that we can be sure that it actually
works. Once it is tested and we know it works, then we can roll it out
to the production machines without undue concern.

I note that we follow the same process with our Linux machines, our
Irix machines, and our Windows machines.  Blindly rolling out updates
or patches to critical production machines is unwise and dangerous (at
least IMO).

I will add that I have never even needed to contact a maintainer.
When there has been a problem, I checked the lists.  In one case the
fix was already committed; in the other there was already an "I'm
working on it" message and a fix was committed in less than 24 hours.
In the interim, my test machine had a problem -- but that's what a 
test machine is for.


 In the case that started this thread, it seems to be that the developer 
 fixed his mistake fairly quickly, which is what one would expect ... it 
 shouldn't be so much that he *broke* -STABLE (shit happens, do you want 
 your money back?), but it should be 'was he around to reverse his mistake 
 in a reasonable amount of time?' ... ?

Exactly.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: NFS locking: lockf freezes (rpc.lockd problem?)

2006-08-31 Thread Greg Byshenk
On Tue, Aug 29, 2006 at 05:05:26PM +, Michael Abbott wrote:

[I wrote]
 An alternative would be to update to RELENG_6 (or at least RELENG_6_1)
 and then try again.
 
 So.  I have done this.  And I can't reproduce the problem.

 # uname -a
 FreeBSD venus.araneidae.co.uk 6.1-STABLE FreeBSD 6.1-STABLE #1: Mon Aug 28 
 18:32:17 UTC 2006 
 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/GENERIC  i386
 
 Hmm.  Hopefully this is a *good* thing, ie, the problem really has been 
 fixed, rather than just going into hiding.
 
 So, as far as I can tell, lockf works properly in this release.


Just as an interesting side note, I just experienced rpc.lockd crashing.
The server is not running RELENG_6, but RELENG_5 (FreeBSD 5.5-STABLE
#15: Thu Aug 24 18:47:20 CEST 2006).  Due to user error, someone ended
up with over 1000 processes trying to lock the same NFS mounted file at
the same time.  The result was over 1000 "Cannot allocate memory" errors
followed by rpc.lockd crashing.

I guess the server is telling me it wants an update...


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: NFS locking: lockf freezes (rpc.lockd problem?)

2006-08-27 Thread Greg Byshenk
On Sun, Aug 27, 2006 at 11:24:13AM +, Michael Abbott wrote:
 I've been trying to make some sense of the NFS locking issue.  I am 
 trying to run
   # make installworld DESTDIR=/mnt
 where /mnt is an NFS mount on a FreeBSD 4.11 server, but I am unable to 
 get past a call to `lockf`.

I have not closely followed the discussion, as I have not experienced 
the problem.

I am currently running FreeBSD6 based fileservers in an environment that
includes FreeBSD, Linux (multiple flavors), Solaris, and Irix clients,
and have experienced no nfs locking issues (I have one occasional
problem with 64-bit Linux clients, but it is not locking related and
appears to be due to a 64-bit Linux problem).

Further, (though there may well be problems with nfs locking) I cannot
recreate the problem you described -- at least in a FreeBSD6 environment.

I have just performed a test of what you describe, using 'smbtest'
(6.1-STABLE #17: Fri Aug 25 12:25:19 CEST 2006) as the client and 
'data-2' (FreeBSD 6.1-STABLE #16: Wed Aug  9 15:38:12 CEST 2006) as the
server.

   data-2 # mkdir /export/rw/bsd6root/
   ## /export/rw is already exported via NFS
   smbtest # mount data-2:/export/rw/bsd6root /mnt
   smbtest # cd /usr/src
   smbtest # make installworld DESTDIR=/mnt
   [...]
   makewhatis /mnt/usr/share/man
   makewhatis /mnt/usr/share/openssl/man
   rm -rf /tmp/install.2INObZ3j
   smbtest #

Which is to say that it completed successfully.  Which suggests that there
is not a serious and ongoing problem.

There may well be a problem with FreeBSD4, but I no longer have any NFS
servers running FreeBSD4.x, so I cannot confirm.  Alternatively, there
may have been a problem in 6.1-RELEASE that has since been solved in
6.1-STABLE that I am using.  Or there could be a problem with the 
configuration of your server.  Or there could be something else going
on (in the network...?).

But to see what exactly is happening in your case, you would probably 
want to look at what exactly is happening on the client, the server, and
the network between them.
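
As a first look at the server side, one could check whether the lock
manager and the status monitor are registered with the portmapper at all.
The following is only a sketch: check_lock_services is a hypothetical
helper name, and it assumes the usual "rpcinfo -p" layout in which the
first column is the RPC program number (100021 for nlockmgr, 100024 for
status).

```shell
#!/bin/sh
# Sketch only: read "rpcinfo -p <server>" output on stdin and report whether
# the lock manager (RPC program 100021, nlockmgr) and the status monitor
# (RPC program 100024, status) are registered.  check_lock_services is a
# hypothetical helper name.
check_lock_services() {
    awk '
        $1 == "100021" { lockd = 1 }
        $1 == "100024" { statd = 1 }
        END {
            print "nlockmgr: " (lockd ? "registered" : "MISSING")
            print "status: "   (statd ? "registered" : "MISSING")
        }
    '
}

# Typical use (the host name is a placeholder):
#   rpcinfo -p nfs-server | check_lock_services
```

A "MISSING" line would point at the server's rpc.lockd or rpc.statd rather
than at the client or the network.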
 

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: NFS locking: lockf freezes (rpc.lockd problem?)

2006-08-27 Thread Greg Byshenk
On Sun, Aug 27, 2006 at 07:17:34PM +, Michael Abbott wrote:
 On Sun, 27 Aug 2006, Kostik Belousov wrote:

 Make sure that rpc.statd is running.
 Yep.  Took me some while to figure that one out, but the first lockf test 
 failed without that.
 
[...]
 
 As for the other test, let's have a look.  Here we are before the test 
 (NFS server, 4.11, is saturn, test machine, 6.1, is venus):
 
 saturn$ ps auxww | grep rpc\\.
 root    48917  0.0  0.1    980  640  ??  Is    7:56am   0:00.01 rpc.lockd
 root      115  0.0  0.1 263096  536  ??  Is   18Aug06   0:00.00 rpc.statd
 
[...]
 
 Well, how odd: as soon as I start the test process 515 on venus goes away. 
 Now to wait for it to fail... (doesn't take too long):
 
[...] 
 
 In conclusion: I agree with Greg Byshenk that the NFS server is bound to 
 be the one at fault, BUT, is this freeze until reboot behaviour really 
 what we want?  I remain astonished (and irritated) that `kill -9` doesn't 
 work!

The problem here is that the process is waiting for something, and 
thus not listening to signals (including your 'kill').

I'm not an expert on this, but my first guess would be that saturn (your
server) is offering something that it can't deliver.  That is, the client
asks the server "can you do X?", and the server says "yes I can", so the
client says "do X" and waits -- and the server never does it.

Or alternatively (based on your rpc.statd dying), rpc.lockd on your
client is trying to use rpc.statd to communicate with your server.  And
it starts successfully, but then rpc.statd dies (for some reason) and
your lock ends up waiting forever for it to answer.


I would recommend starting both rpc.lockd and rpc.statd with the '-d'
flag, to see if this provides any information as to what is going on.
There may well be a bug somewhere, but you need to find where it is.
I suspect that it is not actually in rpc.statd, as nothing in the
source has changed since January 2005.
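
To catch a daemon quietly going away during such a test, something like
the following could be run against "ps" output on both machines.  This is
a sketch, not a tested recipe: missing_rpc_daemons is a hypothetical
helper name, and the debug invocations in the comments assume (from the
man pages, as I read them) that rpc.lockd wants a numeric level after -d
while rpc.statd takes -d alone.

```shell
#!/bin/sh
# Sketch only: read "ps auxww" output on stdin and list which of the two
# NFS locking daemons are absent.  missing_rpc_daemons is a hypothetical
# helper name.
missing_rpc_daemons() {
    out=$(cat)
    for d in rpc.lockd rpc.statd; do
        # -F: match the name literally; -w: match it as a whole word only
        printf '%s\n' "$out" | grep -Fqw "$d" || echo "$d"
    done
}

# Typical use:
#   ps auxww | missing_rpc_daemons
#
# Debug runs, started by hand as root (flag usage is an assumption, see
# rpc.lockd(8) and rpc.statd(8) on the system in question):
#   rpc.statd -d
#   rpc.lockd -d 1
```

No output means both daemons were found; a printed name is the one that
has gone missing.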

An alternative would be to update to RELENG_6 (or at least RELENG_6_1)
and then try again.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: 5.5 to 6.1 upgrade

2006-08-23 Thread Greg Byshenk
On Tue, Aug 22, 2006 at 12:23:00PM -0700, Chuck Swiger wrote:
 
 In practice, however, pretty much all software nowadays depends on  
 shared libraries, so it's reasonable to do a pkg_delete -a after  
 upgrading to a new major version of FreeBSD, and then reinstall all  
 of the ports you use once you've finished upgrading.  Run pkg_info  
 before the upgrade and keep track of this output to help you remember  
 what ports you've got installed...

As a possible point of clarification, my comments earlier (and, I
suspect similar comments of others) were not meant to imply that one
should not rebuild ports after a major upgrade, but only that one need
not do so _before_ upgrading.

[...probably ... it worked for me ... YMMV ... if it is a critical
package, then it wouldn't hurt to rebuild it first ... usw.]
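
For keeping track of what was installed, a before/after comparison of port
origins might look like the sketch below.  The helper name
ports_not_reinstalled and the file names are made up for illustration, and
the workflow in the comments (saving "pkg_info -qoa" output before and
after) is my assumption about how one would use it, not something
prescribed above.

```shell
#!/bin/sh
# Sketch only: given two files of port origins (one per line), print the
# origins present in the first file but not in the second.
# ports_not_reinstalled is a hypothetical helper name.
ports_not_reinstalled() {
    before=$(mktemp) && after=$(mktemp) || return 1
    sort -u "$1" > "$before"
    sort -u "$2" > "$after"
    # comm -23: keep only lines unique to the first (sorted) file
    comm -23 "$before" "$after"
    rm -f "$before" "$after"
}

# Assumed workflow (file names are placeholders):
#   pkg_info -qoa > /root/ports.before    # before the upgrade
#   ...upgrade, rebuild ports...
#   pkg_info -qoa > /root/ports.after
#   ports_not_reinstalled /root/ports.before /root/ports.after
```

Anything printed is a port that was installed before the upgrade and has
not come back yet.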


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: ATA problems again ... general problem of ICH7 or ATA?

2006-08-21 Thread Greg Byshenk
On Mon, Aug 21, 2006 at 04:03:47AM +0200, Konstantin Saurbier wrote:
 Am 20.08.2006 um 18:20 schrieb Greg Byshenk:

 What is different is that this was with a 3Ware RAID controller --
 which made removing/reconfiguring/rebuilding much easier -- but I was
 seeing the exact same errors.
 
 No your errors are not related. As of my experience (and the  
 experience of others) the controller forgetting or losing drives is  
 a feature of 3ware.
 We had similar problems with 3ware-7500-8 ATA controllers and i was  
 reported of the same errors with 3ware-9000 series. Our in-house  
 3ware-9500S are not showing this kind of errors.
 
 This errors are not driver or OS dependent such as they appear on  
 FreeBSD as well on different Linux distros.
 Since not all controllers suffering of these errors it is maybe  
 depending on the firmware or board/chip revisions.

I hesitate to make too strong a statement on this matter, as I have
not done any deep investigation, however...

The explanation above does not appear consistent with my experience.
I am now using (and have used over the past several years) a number
of different 3Ware controllers (7000, 8000, and 9000 series) and have
not previously seen this problem.  Of course I have had drives fail
-- and in one case one port of one controller simply stopped working
-- but never this particular problem.

Further, the very same controller that demonstrated problems (in the
numerically identical server, performing the exact same jobs), had
not demonstrated this problem (over a period of more than six months)
until I installed the June 6.1 STABLE, after which the problem appeared
consistently, until installing the July 6.1 STABLE, at which point the
problem disappeared, and has not occurred since (despite my trying very
hard to make it do so).

It may well be that there is some bug in the 3Ware controllers, but 
my experience suggests that there is/was something else going on.  At
the very least, it suggests that there was something about the June
6.1 STABLE (but not the earlier or later versions) that was triggering
a 3Ware bug -- as my problems occurred only when running the June
6.1 STABLE, and that was the _only_ difference between the cases of
having problems and those of not having problems.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: 5.5 to 6.1 upgrade

2006-08-21 Thread Greg Byshenk
On Mon, Aug 21, 2006 at 11:52:02PM +0200, Stefan Bethke wrote:
 Am 21.08.2006 um 18:19 schrieb Ian Smith:

 I recently (without drama) upgraded a 5.4-RELEASE system to
 FreeBSD 5.5-STABLE #1: Tue Aug  1 11:11:20 EST 2006
 for 'target practice' at least, on the way to 6.1-STABLE

 I was preparing to portupgrade everything next, when I wondered:

 a) should I upgrade from RELENG_5 straight to RELENG_6 or should I be
 stopping off at 6.1-RELEASE along the way first?  and
 
 I'd go straight to 6-stable. Make sure you have a good backup, even  
 if you stop over at 6.1.

I see no reason not to go directly to 6-stable (if that is what you plan
to run); I've done it with multiple machines, and just jump right to the
6-stable version that is active on the machines running 6.x.

Though I've had no problems, I second the recommendation to have a good
backup.  Also, if you don't have a known-good 6-stable build, you might
want to upgrade to the GENERIC kernel.
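
For reference, the source upgrade itself would follow the usual sequence,
sketched below for review rather than for blind execution.  upgrade_steps
is a hypothetical helper that only prints the commands; it assumes that
/usr/src has already been updated to the branch you want, and the order
is the one I remember from /usr/src/UPDATING, so check it against the
copy in your tree.

```shell
#!/bin/sh
# Sketch only: print the usual source-upgrade sequence for a given kernel
# config, so it can be reviewed before anything is run.  upgrade_steps is a
# hypothetical helper; it executes nothing itself.
upgrade_steps() {
    kernconf="${1:-GENERIC}"
    cat <<EOF
cd /usr/src
make buildworld
make buildkernel KERNCONF=$kernconf
make installkernel KERNCONF=$kernconf
shutdown -r now   # boot the new kernel, preferably in single-user mode
mergemaster -p
make installworld
mergemaster
shutdown -r now
EOF
}

# Typical use:
#   upgrade_steps            # steps for the GENERIC kernel
#   upgrade_steps MYKERNEL   # MYKERNEL is a placeholder config name
```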
 
 b) do I need to upgrade all existing ports (way out of date) before  
 the source upgrade, or can I be confident of doing that from 6.1
 (-R or -S)?

 FWIW: a wee Celeron 300, so minimising upgrade build times is  
 desirable.
 
 Unless you have business critical apps running (downtime must be  
 minimal), you can wait until you've completed the upgrade to 6- 
 stable, and then run portupgrade -af.  If you'd like to run the  
 portupgrade overnight, you might want to define BATCH, and possibly  
 set any port building options in /usr/local/etc/pkgtools.conf,  
 otherwise, the port builds will be frequently interrupted by make  
 config questions.

It shouldn't be necessary to rebuild ports before the upgrade.  If 
there is something running that is critical, you might want to upgrade
it first, just to be sure, but it probably isn't necessary.  I upgraded a
workstation with 200+ ports installed, and saw no problems (I can't say
for certain that nothing was broken before I upgraded the ports, but
I experienced no problems). 


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: ATA problems again ... general problem of ICH7 or ATA?

2006-08-20 Thread Greg Byshenk
On Sun, Aug 20, 2006 at 01:38:55PM +0100, Matt Dawson wrote:
 On Sunday 20 August 2006 13:00, [EMAIL PROTECTED] wrote:

  Do you mean different type of cables, or just another piece? I can't
  change cables by myself, servers are dedicated from provider, but as I
  can saw, they picked whole new machine from their HW storage and put new
  Samsung disk drives in. So these two last machines are brand new with
  new cables. (Probably with a same type of cables - all machines are ASUS
  RS120)
 
 I can confirm the same behaviour with a ULi M1689/Newcastle Athlon64 based 
 system running 6.1-RELEASE-p3 (i386). ad6 just detaches without warning and 
 it takes a reboot to bring it back. atacontrol reinit has no effect. Tried 
 the following to resolve the problems:
 
 - Changed cables (both ad4 and ad6)
 - Changed SATA power to legacy
 - Moved the NIC and anything else from the shared PCI INT (thought I'd 
   cracked it at this point as it was stable for a month, then it lost ad6
   on a nightly dump)
 - Remade my gmirror array as an ar. Put it straight back to gmirror again 
   when I found out what a pain it is to rebuild after ad6 disappears.

I am not sure if it is related, but...  I experienced a similar sort of
problem, although the details in my case are quite different.

What was similar was that I would lose two ATA drives from an array,
inexplicably.  Reconfiguring the same drives and rebuilding would cause
them to work perfectly again -- for some number of days, after which 
the same failure would occur.

What is different is that this was with a 3Ware RAID controller -- 
which made removing/reconfiguring/rebuilding much easier -- but I was
seeing the exact same errors.

This happened four times (with the same errors that have been discussed
here), running 6.1 STABLE as of June 22.  Before attempting to RMA the
drives, I tried an updated kernel, 6.1 STABLE as of July 19.  Strangely
enough, the problems disappeared.

So, while I have not checked everything that has changed, it _might_ be
worth trying 6.1 STABLE...
 

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Motherboard RAID problem

2006-08-20 Thread Greg Byshenk
On Sun, Aug 20, 2006 at 07:38:28PM +0200, Roland Smith wrote:
 On Sun, Aug 20, 2006 at 10:02:05AM -0700, Bill Blue wrote:

  I'm not sure if I'm expecting too much, or this is a real bug.

  Using FreeBSD 6.1 release, CVSup'd to current. The motherboard is a
  Supermicro P4SCT0 with a 3.2Ghz P4 and 2 DDR400 1G sticks of RAM.  On
  the MB is a built-in RAID controller (Adaptec chip) for the SATA
  drives.  You set it for discrete SATA or RAID.  If RAID is set, on the
  next boot you have essentially a BIOS configuration for that 'device'
  consisting of the two SATA devices in either RAID 0 (striped) or RAID
  1 (mirrored). 

 The ataraid(4) driver supports the Adaptec HostRAID.
  
 <snip>
  Boot the OS now and all goes well with the device still showing up on
  /dev/ad4* but I couldn't tell if the mirroring was really working
  since the drives have no individual led indications.  I then noticed
  that there was a new ad6* device, and guess what -- it was the second
  SATA drive and a mirror image of the *original* first drive.  Watching
  it with DF for size changes when copying a large file to my home
  directory, it didn't change at all. 

  ad6* were the only new devices seen in the OS.
 
 If FreeBSD supports the device, you should see an ar0 device.
 
 Do you have the ataraid(4) driver loaded, or built into your kernel?


Alternatively, are you sure you have identified your hardware correctly?

According to Supermicro here

   http://www.supermicro.com/products/motherboard/P4/875/P4SCT.cfm

the P4SCT has an Intel 6300ESB onboard RAID controller, which according to
this page

   
   http://www.gamepc.com/labs/view_content.asp?id=eoyraid&page=2&cookie%5Ftest=1

is based upon the ICH4, and is not an Adaptec controller.  And, unless I've
missed it, this controller is not supported.


What does dmesg output say about the drives and controller?
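
To pull just the relevant lines out of a long dmesg, a filter along these
lines could be used.  It is a sketch: ata_lines is a hypothetical helper
name, and the pattern assumes the usual FreeBSD device naming (atapciN and
ataN for controllers and channels, adN for disks, arN for ataraid arrays).

```shell
#!/bin/sh
# Sketch only: read dmesg output on stdin and keep the lines that describe
# ATA controllers/channels (atapciN, ataN), disks (adN) and ataraid arrays
# (arN).  ata_lines is a hypothetical helper name.
ata_lines() {
    grep -E '^(atapci|ata|ad|ar)[0-9]+:'
}

# Typical use:
#   dmesg | ata_lines
```

If no arN line shows up while both adN disks do, the array is most likely
not being recognized as such.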



-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: ATA problems again ... general problem of ICH7 or ATA?

2006-08-20 Thread Greg Byshenk
On Sun, Aug 20, 2006 at 07:51:29PM +0200, Miroslav Lachman wrote:
 Greg Byshenk wrote:

[...]

 This happened four times (with the same errors that have been discussed
 here), running 6.1 STABLE as of June 22.  Before attempting to RMA the
 drives, I tried an updated kernel, 6.1 STABLE as of July 19.  Strangely
 enough, the problems disappeared.

 So, while I have not checked everything that has changed, it _might_ be
 worth trying 6.1 STABLE...
 
 I have problems with 6.1-RELEASE same as with 6.1-STABLE from August 2. 
 I can try newer STABLE, but as I see on cvsweb, there are not much 
 changes in ATA driver sources, only new chipsets added.

It is only an idea, based on something that worked for me.  And, as I
said, my situation is not exactly the same as the others.
 
 It is strange to me, that I can see significant changes of read/write 
 speed. (I am running nonstop tests with writing disk full of files, 
 delete them, and start again + generating graphs) Speed vary from 
 2.5MB/s to 11MB/s by jumps. Not continuous from the lowest to the 
 highest. Writing is for example 3MB/s for 20 hours, then jump to 10MB/s 
 and after some time (6 - 20 hours) jump down to about 3MB/s.
 After some days of testing, a disk disappears, the system reboots itself, 
 resynchronizes gmirror and works for the next few days until the next disk
 loss. Also, synchronization used to finish after 1:30 hours (at about
 30MB/s); now it runs at lower speeds, from 2.5MB/s to 15MB/s, so the 
 whole synchronization takes more than 5 hours (the longest was 
 20 hours to synchronize 250GB HDDs).

 I don't know what more can I test, what more could be done to solve 
 these problems. :(

You are using gmirror, which I am not, so the situations are not
analogous, since my situation was with h/w RAID.  And I have no direct
experience with gmirror (I use gvinum on a couple of secondary systems,
but those are SCSI based).

Does the output of 'systat -vm' tell you anything of interest?  That is,
are the disks running at or close to 100%, are the CPUs fully loaded, or
anything else...?
 

-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: Motherboard RAID problem

2006-08-20 Thread Greg Byshenk
On Sun, Aug 20, 2006 at 08:22:48PM +0200, Roland Smith wrote:
 On Sun, Aug 20, 2006 at 07:55:47PM +0200, Greg Byshenk wrote:
  On Sun, Aug 20, 2006 at 07:38:28PM +0200, Roland Smith wrote:

   If FreeBSD supports the device, you should see an ar0 device.

   Do you have the ataraid(4) driver loaded, or built into your kernel?

  Alternatively, are you sure you have identified your hardware correctly?

  According to Supermicro here

 http://www.supermicro.com/products/motherboard/P4/875/P4SCT.cfm

  the P4SCT has an Intel 6300ESB onboard RAID controller, which according to
  this page

 
  http://www.gamepc.com/labs/view_content.asp?id=eoyraid&page=2&cookie%5Ftest=1

  is based upon the ICH4, and is not an Adaptec controller.  And, unless I've
  missed it, this controller is not supported.

 The ata(4) manual page lists the 6300ESB as supported. The ataraid(4)
 manual only lists the Intel MatrixRAID metadata format as supported.


Well, the controller itself is supported, obviously (as an ATA
controller), but I don't see that it is supported as a RAID controller.

And, if this is the case -- ie: 1) the controller is indeed 6300ESB; and
2) it is supported as an ATA controller; but 3) it is not supported as a
RAID controller -- then that would explain the situation described in the
original message.  That is:  the system happily sees two individual ATA
drives, but cannot see any array.

This is all guesswork, but it makes sense.


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL


Re: [FreeBSD 6.0-RELEASE] Incorrect geometry for VIA RAID0 array

2005-12-08 Thread greg byshenk
On freebsd-stable, [EMAIL PROTECTED] (Jason Harmening) wrote:

  Here's the dmesg output from the installer:

  ad4: 70911MB WDC WD740GD-00FLA1 27.08D27 at ata2-master SATA150
  ad6: 70911MB WDC WD740GD-00FLC0 33.08F33 at ata3-master SATA150
  ar0: 70911MB VIA Tech V-RAID RAID0 (stripe 64 KB) status: READY
  ar0: disk0 READY using ad4 at ata2-master
  ar0: disk1 READY using ad6 at ata3-master

Are you _sure_ that the array is being recognized properly?

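Based on the dmesg output, it looks like the array is being read as a
single 74G drive: ar0 is reported as 70911MB, the same size as each of
ad4 and ad6, where a two-disk stripe should be roughly twice that.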
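
The arithmetic behind that, as a quick sketch: a RAID0 stripe should come
out at about the member count times the smallest member.  The helper name
expected_stripe_mb is made up for illustration, and the result is only
approximate, since the controller reserves some space for its metadata.

```shell
#!/bin/sh
# Sketch only: expected capacity of a RAID0 stripe is the member count times
# the smallest member.  Sizes are in MB as printed by dmesg.
# expected_stripe_mb is a hypothetical helper name.
expected_stripe_mb() {
    n=$#
    min=$1
    for size in "$@"; do
        if [ "$size" -lt "$min" ]; then
            min=$size
        fi
    done
    echo $((n * min))
}

# Values from the dmesg output quoted above: ad4 and ad6 are 70911MB each,
# and ar0 is reported as 70911MB.
expected=$(expected_stripe_mb 70911 70911)
reported=70911
if [ "$reported" -lt "$expected" ]; then
    echo "ar0 reports ${reported}MB, expected about ${expected}MB"
fi
```

With the numbers above this prints that ar0 reports 70911MB against an
expected figure of about 141822MB.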

FWIW, this is the section of my dmesg output, for a _mirror_:

   ad4: 78167MB Maxtor 6Y080P0 YAR41BW0 at ata2-master UDMA133
   ad6: 78167MB Maxtor 6Y080P0 YAR41BW0 at ata3-master UDMA133
   ar0: 77247MB Promise Fasttrak RAID1 status: READY
   ar0: disk0 READY (master) using ad4 at ata2-master
   ar0: disk1 READY (mirror) using ad6 at ata3-master


  On 12/7/05, Jason Harmening [EMAIL PROTECTED] wrote:

  I'm trying to install FreeBSD 6.0-RELEASE on a RAID0 array attached to the
  VIA 8237 controller on my Asus A8V Deluxe motherboard.  The array consists
  of two 74G drives.  The installer recognizes the array as ar0, but when I
  enter FDISK to set up my partition, the size of the array is only recognized
  as 74G, rather than the true 148G.  I've double-checked all my BIOS
  settings, and nothing seems out of order.  Please help!


-- 
greg byshenk  -  [EMAIL PROTECTED]  -  Leiden, NL