from:"Greg 'groggy' Lehey"

Re: grep(1) bug - duplicate output lines

2023-09-27 Thread Greg 'groggy' Lehey

On Wednesday, 27 September 2023 at 22:30:43 -0500, Kyle Evans wrote:
> On 9/27/23 21:40, Jamie Landeg-Jones wrote:
>> When using color=always and a regex of '.' (for example), output lines
>> are duplicated.
>>
>> $ grep --version
>> grep (BSD grep, GNU compatible) 2.6.0-FreeBSD
>>
>> E.G.:
>>
>> $ grep --color=always . /etc/fstab
>
> I think this is what we want:
>
> https://people.freebsd.org/~kevans/grep-color.diff

That looks surprisingly complicated.  FWIW, this issue didn't occur
with older versions of grep.
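
For anybody who wants to check whether a given grep shows the problem, here is a minimal sketch in Python.  It is not part of grep or of Kyle's diff.  The helper names are made up for illustration, and it assumes that the colour markup consists only of escape sequences of the form ESC [ ... m and ESC [ K.  It strips the markup and reports any line that appears more often in the output than in the input.

```python
from collections import Counter
import re

# Assumption: colour markup is limited to SGR sequences (ESC [ ... m)
# and erase-in-line sequences (ESC [ K).
ANSI_RE = re.compile(r"\x1b\[[0-9;]*[mK]")


def strip_ansi(text):
    """Remove the colour escape sequences from grep output."""
    return ANSI_RE.sub("", text)


def duplicated_lines(input_text, grep_output):
    """Return lines that occur more often in the output than in the input."""
    wanted = Counter(input_text.splitlines())
    seen = Counter(strip_ansi(grep_output).splitlines())
    return sorted(line for line, count in seen.items() if count > wanted[line])
```

Feed it the contents of the file and the captured output of grep --color=always; an empty list means no duplication.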

Greg
--
Sent from my desktop computer.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft mail program
reports problems, please read http://lemis.com/broken-MUA.php


signature.asc
Description: PGP signature

Re: Potential show-stopper in em driver?

2023-08-15 Thread Greg 'groggy' Lehey

On Monday, 14 August 2023 at 17:34:12 -0700, Kevin Bowling wrote:
> On Mon, Aug 14, 2023 at 4:45 PM Greg 'groggy' Lehey  wrote:
>> Thanks.  Let me know when you have something and I'll test it.
>
> I went ahead and reverted: 797e480cba8834e584062092c098e60956d28180

Is it that bad?  I had the impression that where it worked, it was an
advantage.  Couldn't you just leave it there disabled, with the option
of enabling it along with a warning that it hasn't been tested on all
hardware?

Greg

Re: Potential show-stopper in em driver?

2023-08-14 Thread Greg 'groggy' Lehey

[moving to current as requested by bz@]

On Monday, 14 August 2023 at 10:09:22 -0700, Kevin Bowling wrote:
>
> I'm able to replicate this on my I217 using iperf3.  It happens
> quickly with flow control enabled (default) and takes about 15 minutes
> of line rate with flow control disabled.  I am looking into the scope
> of the issue and will commit a fix or enable chicken bits for affected
> parts soon.

Thanks.  Let me know when you have something and I'll test it.

I'll reply to the other message later, but things look better without
TCO.

Greg

Potential show-stopper in em driver?

2023-08-13 Thread Greg 'groggy' Lehey

I've spent the last couple of days chasing random hangs on my -CURRENT
box.  It seems to be related to the Ethernet driver (em).  I've been
trying without much success to chase it down, and I'd be grateful for help.
The box is headless, and all communication is via the net, which
doesn't make it any easier.  I've tried a verbose boot, but nothing of
interest shows up.  Typically it happens during the nightly backups,
which are over NFS:

  Aug 13 21:06:46 dereel kernel: <<<66>n>nffs server s6>neurekfs server 
aeureka:/dum:p: /ndoump: tnot responding
  Aug 13 21:06:46 dereel kernel:
  Aug 13 21:06:46 dereel kernel: responding
  Aug 13 21:06:46 dereel kernel:
  Aug 13 21:06:46 dereel kernel: server eureka:/dump:n not responding

And if you haven't seen those garbled messages before, admire.
They've been there for a long time, and they have nothing to do with
the problem.  More to the point, there are no other error messages.

I've run three kernels on this box over the last few weeks:

1. FreeBSD dereel 14.0-CURRENT FreeBSD 14.0-CURRENT amd64 1400093 #10
   main-n264292-7f9318a022ef: Mon Jul 24 17:13:32 AEST 2023
   grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC
   amd64

   This works with no problems.

2. FreeBSD 14.0-CURRENT amd64 1400094 #11 main-n264653-517e0978db1f:
   Thu Aug 10 14:17:13 AEST 2023
   grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC

3. FreeBSD dereel 14.0-ALPHA1 FreeBSD 14.0-ALPHA1 amd64 1400094 #12
   main-n264693-b231322dbe95: Sat Aug 12 14:31:44 AEST 2023
   grog@dereel:/usr/obj/eureka/home/src/FreeBSD/git/main/amd64.amd64/sys/GENERIC
   amd64

   Both of these exhibit the problem.

Note that we're now ALPHA1, so it's a good idea to get to the bottom
of it.  The box is a ThinkCentre M93p.  I'm attaching a verbose boot
log, though I don't expect anybody to find something of use there.
I'm also currently building a new world in case something has happened
since Saturday.

Greg

Re: FreeBSD wont boot on AMD Ryzen 9 7950X

2023-05-20 Thread Greg 'groggy' Lehey

On Saturday, 20 May 2023 at 17:15:25 -0400, Mike Jakubik wrote:
> On Sat, May 20, 2023 at 4:49 PM Yuri  wrote:
>> Mike Jakubik wrote:
>>>
>>> Thanks for the info. At least I know it's not specific to my parts. Is
>>> there any knob one can turn in the BIOS to enable/disable this feature?
>>> iirc UART is old school serial ports? Wonder if removing UART support
>>> from the kernel would be a workaround.
>>
>> Try the following from the loader prompt (option 3 from the beastie menu):
>>
>> set hint.uart.0.disabled=1
>> set hint.uart.1.disabled=1
>> boot
>
> That did the job!

Good to hear.  But it's a workaround, of course.  I hope you have
entered a PR so that somebody can fix the problem.

Greg

Re: Posting Netiquette [ref: Threads "look definitely like" unreadable mess. Handbook project.]

2022-06-23 Thread Greg 'groggy' Lehey

On Thursday, 23 June 2022 at  6:03:20 +0200, Polytropon wrote:
> On Thu, 23 Jun 2022 13:01:18 +1000, Greg 'groggy' Lehey wrote:
>>   [...] I personally find that prepending
>>   ">" to the original message works best. Leaving white space after
>>   the "> " and leave empty lines between your text and the original
>>   text both make the result more readable.
>
> Prepending what? After the what? Seems there is a charset mismatch.

Ugh.  Yes, you're right.

> Or is it just my MUA displaying nonsense (which would be new to me).
> Oh the joy of UTF-8... ;-)

What happened here was that I copied the text from the (UTF-8) web
page into a text that was (I think) implicitly ISO 8859.  My copy of
the message also shows this mutilation.  But strangely, replying to
this message, I find that the text has been automatically recovered.
It doesn't stay that way: in the editor it looks correct, but the MUA
displays it incorrectly.  The issue was with the quotation marks, and
it should look correct above now.

> Otherwise, I completely agree to the concept that form and content
> should match, and that form can help a lot to improve readability
> and accessibility of information in general.

And people shouldn't make the kind of mess that I managed to make :-(

Does anybody have an opinion on character set recommendations?  I
think we should ask for UTF-8 if at all possible.

Greg

Re: Posting Netiquette [ref: Threads "look definitely like" unreadable mess. Handbook project.]

2022-06-22 Thread Greg 'groggy' Lehey

On Wednesday, 22 June 2022 at 15:41:39 -0400, grarpamp wrote:
> Around 6/2x/22, Many  rammed their horribly formed
> msgs upon others to parse:
> ...
> FreeBSD needs to add an entire section on the
> email post formatting netiquette to the Handbook,
> and link it on the List Subscription pages, and in the List Welcome
> emails, and even in quarterly automated administrivia post to all lists.

In fact we have guidelines, just not exactly where you might expect.
"How to get Best Results from the FreeBSD-questions Mailing List"
(https://docs.freebsd.org/en/articles/freebsd-questions/) contains:

  Unless there is a good reason to do otherwise, reply to the sender
  and to FreeBSD-questions.

  Include relevant text from the original message. Trim it to the
  minimum, but do not overdo it. It should still be possible for
  somebody who did not read the original message to understand what
  you are talking about.

  Use some technique to identify which text came from the original
  message, and which text you add. I personally find that prepending
  â>â to the original message works best. Leaving white space after
  the â> ;â and leave empty lines between your text and the original
  text both make the result more readable.

  Put your response in the correct place (after the text to which it
  replies). It is very difficult to read a thread of responses where
  each reply comes before the text to which it replies.

  Most mailers change the subject line on a reply by prepending a text
  such as "Re: ". If your mailer does not do it automatically, you
  should do it manually.

  If the submitter did not abide by format conventions (lines too
  long, inappropriate subject line) please fix it. In the case of an
  incorrect subject line (such as "HELP!!??"), change the subject line
  to (say) "Re: Difficulties with sync PPP (was: HELP!!??)". That way
  other people trying to follow the thread will have less difficulty
  following it.

  In such cases, it is appropriate to say what you did and why you did
  it, but try not to be rude. If you find you can not answer without
  being rude, do not answer.

Arguably these recommendations should be separated out into their own
page.  Re-reading them, I see that there is no explicit line length
recommendation.  That should be included.

And yes, I agree entirely with your concerns, though without my core
hat.

Greg

Re: Considering stepping down from all of my FreeBSD responsibilities

2022-04-01 Thread Greg 'groggy' Lehey

On Friday,  1 April 2022 at  9:38:07 -0700, Cy Schubert wrote:
> In message <20220401064816.gs60...@eureka.lemis.com>, Greg 'groggy' Lehey
> writes:
>>
>> On Friday,  1 April 2022 at  5:58:39 +, Alexey Dokuchaev wrote:
>>> I don't think 2.2.10 is warranted.
>>
>> Agreed.  The upgrade isn't sufficiently important.
>>
>> How about 2.2.9.1?
>
> I had a different more sinister thought: Announcing that we've moved from
> BSDL to GPLv3 to be more like Linux.

Well, since we have accepted (or at least put up with) git, why not?
Of course, things go both ways.  For those of you who missed it,
www.lemis.com/grog/slashdot/

And that wasn't even 1 April.

Greg

Re: Considering stepping down from all of my FreeBSD responsibilities

2022-04-01 Thread Greg 'groggy' Lehey

On Friday,  1 April 2022 at  5:58:39 +, Alexey Dokuchaev wrote:
> On Fri, Apr 01, 2022 at 02:20:31PM +0900, Yasuhiro Kimura wrote:
>> Hi Glen,
>>
>> From: Glen Barber 
>> Subject: Considering stepping down from all of my FreeBSD responsibilities
>> Date: Fri, 1 Apr 2022 00:15:02 +
>>
>>> Dear community,
>>>
>>> Given the mental toll the past two years or so have taken on me, I have
>>> decided to step down from all of my "hats" within the Project, and take
>>> some time to sort out what my future looks like going forward.
>>>
>>> Happy April 1st.  I'm not going anywhere.  :-)
>>
>> We are waiting for the announce of FreeBSD 2.2.10-RELEASE. :-)
>>
>> Cf. 
>> https://lists.freebsd.org/pipermail/freebsd-announce/2006-April/001055.html
>
> I don't think 2.2.10 is warranted.

Agreed.  The upgrade isn't sufficiently important.

How about 2.2.9.1?

Greg

Re: HEADS-UP: PIE enabled by default on main

2021-02-25 Thread Greg 'groggy' Lehey

On Thursday, 25 February 2021 at 21:22:43 -0500, Ed Maste wrote:
> On Thu, 25 Feb 2021 at 19:23, John Kennedy  wrote:
>>
>>   Not sure if Ed Maste just wants to make sure that all the executables
>> are rebuilt as PIE (vs hit-and-miss) or there is a sneaker corner-case that
>> he knows about.
>
> The issue is that without a clean build you may have some .o files
> left around that are built without PIE enabled (i.e., compiled without
> -fPIE), and attempting to link them into a PIE executable will fail
> with an error like:
>
> ld: error: can't create dynamic relocation R_X86_64_32 against local symbol 
> in readonly segment; recompile object files with -fPIC or pass 
> '-Wl,-z,notext' to allow text relocations in the output

Ah, thanks.  That makes more sense.

Greg

Re: HEADS-UP: PIE enabled by default on main

2021-02-25 Thread Greg 'groggy' Lehey

On Thursday, 25 February 2021 at 15:58:07 -0500, Ed Maste wrote:
> As of 9a227a2fd642 (main-n245052) base system binaries are now built
> as position-independent executable (PIE) by default, for 64-bit
> architectures. PIE executables are used in conjunction with address
> randomization as a mitigation for certain types of security
> vulnerabilities.
>
> If you track -CURRENT and normally build WITHOUT_CLEAN you'll need to
> do one initial clean build -- either run `make cleanworld` or set
> WITH_CLEAN=yes.

This detail worries me.  How compatible are PIE executables with
non-PIE executables?  Can I run PIE executables on older systems?  Can
I run older executables on a PIE system?

Greg

Plans for git (was: Please check the current beta git conversions)

2020-09-01 Thread Greg 'groggy' Lehey

On Tuesday,  1 September 2020 at 13:14:10 -0400, Ed Maste wrote:
> We've been updating the svn-git converter and pushing out a new
> converted repo every two weeks, and are now approaching the time where
> we'd like to commit to the tree generated by the exporter,
> ...

Somehow I've missed this development.  Reading between the lines, it
seems that we're planning to move from svn to git, but I can't recall
seeing any announcement on the subject.  Can you give some background?
It would also be nice to find a HOWTO both for the migration and for
life with git.

Greg

Re: New Xorg - different key-codes

2020-03-10 Thread Greg 'groggy' Lehey


On Wednesday, 11 March 2020 at  0:20:03 +, Poul-Henning Kamp wrote:
[originally sent to current@]
> I just updated my laptop from source, and somewhere along the way
> the key-codes Xorg sees changed.

Indeed.  This doesn't just affect -CURRENT: it happened to me on
-STABLE last week, so I'm copying that list too.  See
http://www.lemis.com/grog/diary-mar2020.php?topics=c=Daily%20teevee%20update=D-20200306-002910#D-20200306-002910
, not the first entry on the subject.

> I have the right Alt key mapped to "Multi_key", which is now
> keycode 108 instead of 113, which is now arrow left instead.

Interesting.  Mine wandered from 117 to 147, with PageDown ("Next") as
collateral damage.  It seems that there are a lot of strange new key
bindings (partial output of xmodmap -pk):

  117  0xff56 (Next)            0x0000 (NoSymbol)  0xff56 (Next)
  130  0xff31 (Hangul)          0x0000 (NoSymbol)  0xff31 (Hangul)
  131  0xff34 (Hangul_Hanja)    0x0000 (NoSymbol)  0xff34 (Hangul_Hanja)
  135  0xff67 (Menu)            0x0000 (NoSymbol)  0xff67 (Menu)
  147  0x1008ff65 (XF86MenuKB)  0x0000 (NoSymbol)  0x1008ff65 (XF86MenuKB)

Some of these may reflect other remappings that I have done.
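
To see which keycode a given keysym has ended up on, the output of xmodmap -pk can be picked apart mechanically.  The following is only a sketch in Python; the function names are invented for illustration, and it assumes each data line starts with a decimal keycode followed by pairs of the form "0x... (Name)".

```python
import re

# One keysym entry: a hex value followed by the name in parentheses.
KEYSYM_RE = re.compile(r"0x[0-9a-fA-F]* \(([^)]+)\)")


def parse_keymap(text):
    """Map keycode -> list of keysym names from xmodmap -pk style lines."""
    table = {}
    for line in text.splitlines():
        fields = line.split(None, 1)
        if len(fields) < 2 or not fields[0].isdigit():
            continue  # header, blank line, or keycode without keysyms
        table[int(fields[0])] = KEYSYM_RE.findall(fields[1])
    return table


def keycodes_for(table, keysym):
    """Return all keycodes that carry the named keysym."""
    return sorted(code for code, names in table.items() if keysym in names)
```

Running parse_keymap on the output before and after the upgrade and comparing keycodes_for(table, "Multi_key") would show where the key has moved.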

> I hope this email saves somebody else from the frustrating
> morning I had...

Sorry.  I should have thought of reporting it.  For me, with a number
of other issues, it was a frustrating week, some of which are still not
resolved.

Greg

Re: Pkg repository is broken...

2020-03-07 Thread Greg 'groggy' Lehey

On Saturday,  7 March 2020 at 16:46:58 +0100, Michael Gmelin wrote:

[much irrelevant text deleted]

People, please trim your replies.  Only relevant text should remain.

> On Sat, 07 Mar 2020 11:30:58 -0400 Waitman Gobble  wrote:
>>
>> I installed 12.1 on a new laptop yesterday, I have not experienced
>> issues with pkg.
>
> This was only an issue on the "latest" branch. If you don't alter
> "/etc/pkg/FreeBSD.conf", you'll get packages from the "quarterly"
> branch, which fortunately wasn't affected.

No, this isn't necessarily correct.  I have never modified this file,
but I ended up with a copy of /usr/src/usr.bin/pkg/FreeBSD.conf.latest
with this revision string:

  # $FreeBSD: stable/11/etc/pkg/FreeBSD.conf 263937 2014-03-30 15:24:17Z bdrewery $

Despite the age, this appears to be identical to the current version,
according to svn blame.  Arguably this should be the default anyway.

Greg

Re: Pkg repository is broken...

2020-03-06 Thread Greg 'groggy' Lehey

On Friday,  6 March 2020 at 12:29:44 +0100, Lars Engels wrote:
> On Wed, Mar 04, 2020 at 03:16:14PM +1100, Greg 'groggy' Lehey wrote:
>>
>> Any workarounds in the meantime?  This must affect a lot of people,
>> including those who use 12-:
>>
>>   pkg: wrong architecture: FreeBSD:12.0:amd64 instead of FreeBSD:12:amd64
>>   pkg: repository FreeBSD contains packages with wrong ABI: FreeBSD:12.0:amd64
>
> Still broken for me on 12.1.

Strange.  Mine cleared up automatically the following day.

It's also strange how few replies I have received.  Two private
messages (why?), yours, and that was it.  You'd think that people
would be screaming.

Greg

Re: Pkg repository is broken...

2020-03-03 Thread Greg 'groggy' Lehey

On Monday,  2 March 2020 at 17:58:01 +, marco wrote:
> On Sun, Mar 01, 2020 at 04:50:59PM -0500, you (Brennan Vincent) sent the 
> following to [freebsd-current] :
>> Apparently something has its ABI erroneously listed as FreeBSD:13.0:amd64
>> instead of FreeBSD:13:amd64.
>>
>> ```
>> $ sudo pkg update -f
>> Updating FreeBSD repository catalogue...
>> Fetching meta.conf: 100%    163 B   0.2kB/s    00:01
>> Fetching packagesite.txz: 100%    6 MiB   6.4MB/s    00:01
>> Processing entries:  72%
>> pkg: wrong architecture: FreeBSD:13.0:amd64 instead of FreeBSD:13:amd64
>> pkg: repository FreeBSD contains packages with wrong ABI: FreeBSD:13.0:amd64
>> Processing entries: 100%
>> Unable to update repository FreeBSD
>> Error updating repositories!
>
> Ran into this very same problem today too.
> Just learned on #freebsd that the repos are temporarily borked and
> people are working hard to fix it.

Any workarounds in the meantime?  This must affect a lot of people,
including those who use 12-:

  pkg: wrong architecture: FreeBSD:12.0:amd64 instead of FreeBSD:12:amd64
  pkg: repository FreeBSD contains packages with wrong ABI: FreeBSD:12.0:amd64
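
The two strings in the message differ only in the minor version of the middle field.  A small Python sketch of that difference follows.  It is not pkg's code, and the function name is hypothetical; it just reduces an ABI string to its major version so that the two forms can be compared.

```python
def major_abi(abi):
    """Reduce 'FreeBSD:12.0:amd64' to 'FreeBSD:12:amd64'."""
    os_name, version, arch = abi.split(":")
    # Keep only the part of the version before the first dot.
    return ":".join((os_name, version.split(".")[0], arch))


def same_major_abi(package_abi, system_abi):
    """True if the two ABI strings agree once minor versions are dropped."""
    return major_abi(package_abi) == major_abi(system_abi)
```

With this, "FreeBSD:12.0:amd64" and "FreeBSD:12:amd64" compare equal, while a plain string comparison of the two does not.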

Greg

Re: src committer please

2018-12-13 Thread Greg 'groggy' Lehey

On Thursday, 13 December 2018 at 13:07:54 +, Bob Bishop wrote:
> Hi,
>
> Please could somebody take a look at 
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=221350
>
> It's been open for over a year with a patch that solves the problem.
>
> Failing to install out of the box on commodity HP kit is not a good look.

OK, I'll take a look.

Greg

Re: Nvidia issue with CURRENT

2018-04-23 Thread Greg 'groggy' Lehey

On Monday, 23 April 2018 at  9:55:40 +0200, Mariusz Zaborski wrote:
> On Mon, Apr 23, 2018 at 05:51:01PM +1000, Greg 'groggy' Lehey wrote:
>> On Monday, 23 April 2018 at  9:00:33 +0200, O. Hartmann wrote:
>>> On Sun, 22 Apr 2018 14:38:55 +0200  Mariusz Zaborski <osho...@freebsd.org> wrote:
>>> In /etc/src.conf , therefore you should add something similar to
>>> (like I added to mine):
>>>
>>> PORTS_MODULES=
>>> PORTS_MODULES+= x11/nvidia-driver
>>> PORTS_MODULES+= emulators/virtualbox-ose-kmod
>>>
>>> This is one of the great advantages of having an operating system which
>>> you can compile yourself.
>>
>> Yes, but this has nothing to do with the bug.  Clearly Mariusz and I
>> have the configuration correct, but something has changed in the last
>> few months.
>
> Yea this is a known issue so I rebuild nvidia-driver.
> I'm just not sure if this is a problem with kernel or with the
> driver itself.

Almost by definition, it's a driver issue.  Something in the kernel
has changed which makes it no longer work.

>> Mariusz, as I commented, your log wasn't appended to the message I
>> received.  What is your hardware?
>
> https://people.freebsd.org/~oshogbo/Xorg.0.log

A brief scan doesn't show anything very similar to my issues.  I'll
look again tomorrow when I have time.

Did you try the most recent driver?

Greg

Re: Nvidia issue with CURRENT

2018-04-23 Thread Greg 'groggy' Lehey

On Monday, 23 April 2018 at  9:00:33 +0200, O. Hartmann wrote:
> On Sun, 22 Apr 2018 14:38:55 +0200
> Mariusz Zaborski  wrote:
>
>> Hi,
>>
>> Normally I build my CURRENT by myself from Xorg - r332861.
>> But I also tried latest SNAPSHOT.
>
> All my boxes running with nVidia hardware running most recent CURRENT
> (compiled this morning on an almost daily basis) and I'm using the latest
> official driver available from nVidia, 390.48.
>
> It happens to be as a natural byproduct of CURRENT that very often
> the kernel module of the nVidia driver is out of sync so I made it a
> habit to recompile the module from sources whenever I
> recompile/install a kernel.

As I commented, I've had this on -STABLE as well.

My guess is that this is GPU dependent.  I'm using an old card:

[32.251] Current Operating System: FreeBSD teevee.lemis.com 11.1-STABLE FreeBSD 11.1-STABLE #2 r327971: Mon Jan 15 10:55:53 AEDT 2018 g...@teevee.lemis.com:/home/obj/eureka/home/src/FreeBSD/svn/stable/11/sys/GENERIC amd64
...
[32.763] (II) NVIDIA dlloader X Driver  390.25  Wed Jan 24 19:00:20 PST 2018
...
[33.785] (II) NVIDIA(0): NVIDIA GPU GeForce GT 710 (GK208) at PCI:1:0:0 (GPU-0)
[33.785] (--) NVIDIA(0): Memory: 2097152 kBytes
[33.785] (--) NVIDIA(0): VideoBIOS: 80.28.b8.00.45
[33.785] (II) NVIDIA(0): Detected PCI Express Link width: 8X

> In /etc/src.conf , therefore you should add something similar to
> (like I added to mine):
>
> PORTS_MODULES=
> PORTS_MODULES+= x11/nvidia-driver
> PORTS_MODULES+= emulators/virtualbox-ose-kmod
>
> This is one of the great advantages of having an operating system which
> you can compile yourself.

Yes, but this has nothing to do with the bug.  Clearly Mariusz and I
have the configuration correct, but something has changed in the last
few months.

Mariusz, as I commented, your log wasn't appended to the message I
received.  What is your hardware?

Greg

Re: Nvidia issue with CURRENT

2018-04-22 Thread Greg 'groggy' Lehey

On Sunday, 22 April 2018 at 12:42:37 +0200, Mariusz Zaborski wrote:
> Hello,
>
> I upgraded my FreeBSD to CURRENT and nvidia-driver-390.48. But it has
> stopped working.
> I tried also nvidia-driver-390.25 without luck as well.

Yes, I've had this trouble as well with -STABLE.  It happened some
time in the February/March time frame.  See
http://www.lemis.com/grog/diary-mar2018.php#D-20180324-031830.  I
haven't reported it yet because I had intended to try the latest
version of the driver.  At the time that was 390.42, but now it's
390.48.  You might like to try that (see
http://www.nvidia.com/object/unix.html).

> I'm attaching also Xorg log.

This seems to have got lost.

Greg

Re: swapfile query

2017-08-19 Thread Greg 'groggy' Lehey

On Saturday, 19 August 2017 at 16:00:28 +0100, tech-lists wrote:
> Hello list,
>
> (freebsd-current is r317212 on this machine)
>
> I have a machine with 128GB RAM. When 12-current was installed, for some
> reason the swap partition was set to 4GB. I see sometimes via top and
> also via daily status reports that sometimes the machine runs out of
> swap. It doesn't crash the machine though.
>
> I know how to add more swap with a swapfile.

That's one way.  It's really better to use a swap partition.  If you
repartition the SSD for whatever reason, you should consider creating
a larger swap partition.

> 1. should I make more than one swapfile, say 4x32GB or will it be ok
> with one 128GB swapfile?

It doesn't make any difference, but 128 GB seems excessive.  You might
like to try with one 32 GB swap file and see if that's enough.  On my
machine I have 32 GB of memory and 10 GB swap, and I don't have much
of a problem with that.

> 2. will the 4GB already there as swap play nice with a swapfile, or
> multiple swapfiles? Or should I deactivate the 4GB swap partition
> first?

Yes.

> 3. should total swap be 1x 2x or some other multiple of RAM these days?

It never needed to be.  The only issue is that if you want processor
dumps, you once needed a swap partition (and not a swap file) at least
marginally larger than memory.  With compressed dumps, that
requirement is relaxed, but I suspect that a 4 GB partition could be
too small.

Greg

Re: date(1) default format changed between 10.3 and 11.0-BETA3

2016-08-05 Thread Greg 'groggy' Lehey

On Friday,  5 August 2016 at 18:56:33 +0300, Andrey A. Chernov wrote:
> On 05.08.2016 18:44, Mark Martinec wrote:
>> On 2016-08-05 17:23, Andrey Chernov wrote:
>>> On 05.08.2016 17:47, Mark Martinec wrote:
>>>> [Bug 211598]
>>>>   date(1) default format in en_EN locale breaks compatibility with 10.3
>>>>   and violates POSIX
>>>>
>>>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=211598
>>>
>>> It breaks compatibility but not violates POSIX. POSIX care of only its
>>> own POSIX (or C) locale.
>>
>> POSIX does say that the default format should be the same
>> as with "+%a %b %e %H:%M:%S %Z %Y".
>> It also says that %a and %b are locale's abbreviated names.
>
> It is true for _POSIX_ locale only, as I already say. en_US.* is not
> POSIX or C locale.

It still violates POLA.
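
For reference, this is roughly what people expect to see.  The sketch below is Python, not date(1); it spells out the format "+%a %b %e %H:%M:%S %Z %Y" quoted above with the C locale's English abbreviations hard-wired, so the result does not depend on the locale of the machine running it.  The function name and the time zone string passed in are illustrative assumptions.

```python
from datetime import datetime

# Abbreviated names as used in the C/POSIX locale.
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def posix_default_date(when, zone):
    """Format like "+%a %b %e %H:%M:%S %Z %Y" in the C locale."""
    return "{} {} {:2d} {:02d}:{:02d}:{:02d} {} {}".format(
        DAYS[when.weekday()],    # %a
        MONTHS[when.month - 1],  # %b
        when.day,                # %e: day of month, space padded
        when.hour, when.minute, when.second,
        zone,                    # %Z, supplied by the caller here
        when.year,               # %Y
    )
```

Calling posix_default_date(datetime(2016, 8, 5, 18, 56, 33), "EEST") gives "Fri Aug  5 18:56:33 EEST 2016".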

Greg

Re: FreeBSD Quarterly Status Report - First Quarter 2016 (fwd)

2016-05-02 Thread Greg 'groggy' Lehey

[line lengths recovered]

On Sunday,  1 May 2016 at 20:16:38 -0700, Jordan Hubbard wrote:
>
>> On May 1, 2016, at 5:49 PM, Warren Block  wrote:
>>
>>   The first quarter of 2016 showed that FreeBSD retains a strong sense of
>>   ipseity. Improvements were pervasive, lending credence to the concept
>>   of meliorism. [ ??? ]
>
>
> I, for one, learned at least 4 new words in that announcement, 3 of
> which were actually real.

And the other is int?  OK, I'll bite.  Which one is unreal?

Greg

Re: [RFC] Removin the old make

2015-02-10 Thread Greg 'groggy' Lehey

On Tuesday, 10 February 2015 at 23:38:54 +0100, Baptiste Daroussin wrote:
> Hi,
>
> I would like to start using bmake only syntax on our infrastructure for that I
> want to make sure noone is using the old make, so I plan to remove the old make
> from base, I plan to do it by Feb 16th.

How does this affect non-system Makefiles that depend on pmake?  Is
bmake completely upward compatible?

Greg

Re: Simple check if current is broken

2012-11-13 Thread Greg 'groggy' Lehey

On Tuesday, 13 November 2012 at  9:11:21 +0200, Alexander Yerenkow wrote:
> Hello there!
> I sometimes see in this list such mails:
> I got problem with rXX
> It's known, it's fixed in rYY.
>
> Sometimes it's my problem, sometimes it's problem of other peoples.
>
> How about make simple web-service where revision numbers could be marked
> as bwoken, with minimal info - like next working rev?
>
> Can we make small sub-task while buildworld/buildkernel going, to simply
> fetch info about current rev and if it's broken warn user?
>
> This probably would improve user experience for those who use current, but
> have no time/proficiency to read commit logs.

On Tuesday, 13 November 2012 at 12:15:29 -0800, Jakub Lach wrote:
> What about cases when something is broken, but not for everybody?

Just say so.

> If only GENERIC build, how does that differ from existing
> tinderboxing?

It's all in one place.

The idea sounds good to me.  All we need is somebody to implement it
and somebody to maintain it.

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua
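
The web-service idea in this thread boils down to a lookup from a revision number to the next working revision. A minimal Python sketch of that lookup, assuming a hypothetical list of broken ranges; the revision numbers, the data layout and the function names are invented for illustration and do not come from the thread:

```python
from typing import Optional

# Hypothetical registry: each entry marks a closed range of broken
# revisions and the first revision known to work again.  The numbers
# are made up for illustration only.
BROKEN_RANGES = [
    (242700, 242776, 242777),
    (242900, 242910, 242911),
]


def next_working_revision(rev: int) -> Optional[int]:
    """Return the next working revision if rev is marked broken, else None."""
    for first_bad, last_bad, fixed_in in BROKEN_RANGES:
        if first_bad <= rev <= last_bad:
            return fixed_in
    return None


def warning_for(rev: int) -> str:
    """Build the message a build-time check might print for rev."""
    fixed_in = next_working_revision(rev)
    if fixed_in is None:
        return f"r{rev}: not marked as broken"
    return f"r{rev}: marked as broken, fixed in r{fixed_in}"
```

A check run during buildworld would only need the checked-out revision and a copy of the registry; who maintains that registry is the open question raised in the reply.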



Traditional cpp (was: /usr/bin/calendar broken on current)

2012-11-09 Thread Greg 'groggy' Lehey

On Friday,  9 November 2012 at 13:52:24 +0100, Dimitry Andric wrote:
> On 2012-11-09 08:26, Greg 'groggy' Lehey wrote:
>> On Thursday,  8 November 2012 at 22:58:37 -0800, Manfred Antar wrote:
>>> Sometime in the last week calendar stopped working.
>>> not sure the cause
>>> here is some of the output:
>>> /usr/share/calendar/calendar.music:231:17: warning: missing terminating '
>>> character [-Winvalid-pp-token]
>>> 12/16   Don McLean's American Pie is released, 1971
>>>   ^
>>
>> This is unexpected fallout from the transition from gcc to clang.
>> calendar invokes cpp, and it seems that clang's cpp doesn't like what
>> it sees.  This patch works around the issue:
>>
>> --- pathnames.h  (revision 242777)
>> +++ pathnames.h  (working copy)
>> @@ -32,5 +32,5 @@
>>
>>  #include <paths.h>
>>
>> -#define _PATH_CPP       "/usr/bin/cpp"
>> +#define _PATH_CPP       "/usr/bin/gcpp"
>>  #define _PATH_INCLUDE   "/usr/share/calendar"
>>
>> Clearly that's not the solution.  I'll investigate.
>
> Looks like yet another cpp -traditional abuse.

Use or abuse?  In any case, it's not the only one.  In the Good Old
Days people did things like that.  So, it seems, does imake, and I'm
sure others will come out of the woodwork.

> Clang will most likely never support traditional preprocessing.

OK.

> It is probably better to just use sed or awk for this kind of
> trickery.

I'm not sure that's the way to go.  It's more work than it's worth.

What we really need is a traditional cpp.  That's not difficult:
there's one in 4.3BSD (all 32 kB of source).  OpenBSD also had one,
though it's gone now, so presumably that one has a clean license.
Both appear to be from pcc.  Should we import it into the tree as,
say, tradcpp?

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua
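
The warning quoted in this thread comes from a preprocessor that tokenizes its input: an apostrophe opens a character constant, and the constant has to be closed on the same line. A rough Python sketch of that one rule follows. It is a deliberately simplified model, not clang's lexer, and the function name is made up for illustration:

```python
from typing import Optional


def unterminated_quote_column(line: str) -> Optional[int]:
    """Return the 1-based column of an apostrophe that is never closed
    on the same line, or None if every apostrophe is paired.

    Simplified on purpose: escapes, double-quoted strings and comments
    are ignored, all of which a real preprocessor has to handle.
    """
    open_at = None
    for column, char in enumerate(line, start=1):
        if char == "'":
            # An apostrophe either opens a constant or closes the open one.
            open_at = column if open_at is None else None
    return open_at


# The calendar entry as it appears in the quoted warning (tab after the date).
entry = "12/16\tDon McLean's American Pie is released, 1971"
```

Applied to `entry`, the function points at the apostrophe in "McLean's", which is the position the quoted diagnostic marks. Free-form calendar text is full of such lines, which is why a traditional cpp, or a filter that does not tokenize at all, suits this input better.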



Re: sysutils/lsof Author Question (for CLANG)....

2012-11-08 Thread Greg 'groggy' Lehey

[Text formatting recovered]

On Thursday,  8 November 2012 at  9:23:11 -0600, Larry Rosenman wrote:
> On 2012-11-08 09:20, Edward Tomasz Napierała wrote:
>> Message written by Andriy Gapon on 8 Nov 2012, at 15:17:
>>> Just curious why lsof can't use interfaces that e.g.
>>> fstat/sockstat/etc use?  Those base utilities do not seem to
>>> experience as much trouble as lsof.
>>
>> Note that fstat(8) does not report file paths. On the other hand,
>> procstat(8) does.  It looks like procstat -fa and procstat -va
>> together provide the same information lsof(8) does; unfortunately
>> there doesn't seem to be a way to show a merged output for files
>> opened (-f) and files mmapped, but closed (-v).

Hmm.  I don't know the details, but potentially there *would* be a
more kosher way of doing what lsof wants.

> Remember also that lsof is portable between MANY flavors of *nix.

Only because the author goes to a lot of effort to make it so.
There's special-case code for most kernels.  In the case of FreeBSD,
it would make sense to use documented interfaces where possible, and
create them where they don't exist.

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua



Re: /usr/bin/calendar broken on current

2012-11-08 Thread Greg 'groggy' Lehey

On Thursday,  8 November 2012 at 22:58:37 -0800, Manfred Antar wrote:
> Sometime in the last week calendar stopped working.
> not sure the cause
> here is some of the output:
> /usr/share/calendar/calendar.music:231:17: warning: missing terminating '
> character [-Winvalid-pp-token]
> 12/16   Don McLean's American Pie is released, 1971
>   ^

This is unexpected fallout from the transition from gcc to clang.
calendar invokes cpp, and it seems that clang's cpp doesn't like what
it sees.  This patch works around the issue:

--- pathnames.h (revision 242777)
+++ pathnames.h (working copy)
@@ -32,5 +32,5 @@

 #include <paths.h>

-#define _PATH_CPP       "/usr/bin/cpp"
+#define _PATH_CPP       "/usr/bin/gcpp"
 #define _PATH_INCLUDE   "/usr/share/calendar"

Clearly that's not the solution.  I'll investigate.

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua



Re: sysutils/lsof Author Question (for CLANG)....

2012-11-07 Thread Greg 'groggy' Lehey

On Wednesday,  7 November 2012 at 16:35:22 -0600, Larry Rosenman wrote:
> On 2012-11-07 15:39, Greg 'groggy' Lehey wrote:
>> On Wednesday,  7 November 2012 at 10:32:23 -0500, Benjamin Kaduk wrote:
>>>
>>> Once again, attempting to use kernel internals outside of the
>>> supported interfaces is just asking for trouble; I do not understand
>>> why this message is not sinking in over the course of your previous
>>> mails to these lists, so I will not try to belabor it further.
>>
>> IIRC lsof is a special case that always needs to be built with
>> intimate knowledge of the kernel.
>
> This is VERY true.  Since some of the information lsof uses has
> no API/ABI/KPI/KBI to get, it grovels around in the kernel.

And until those interfaces are provided, I think this is legitimate.
If there's anybody out there who hasn't used lsof, you should try it.
It's good.

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua



Re: Panic on boot after svn update

2012-07-28 Thread Greg 'groggy' Lehey

On Sunday, 29 July 2012 at  0:53:55 -0400, David J. Weller-Fahy wrote:
> So, I recently updated and encountered a panic on boot which is
> reproducible, and wanted to see if anyone's encountered this before I
> file a PR.  I found a problem in (I think) recent changes to the e1000
> driver.  I'm running FreeBSD 10-CURRENT as a VirtualBox guest.
>
> #v+
> FreeBSD fork-pooh 10.0-CURRENT FreeBSD 10.0-CURRENT #0 r238764: Sat Jul 28 17:21:47 EDT 2012 root@fork-pooh:/usr/obj/usr/src/sys/GENERIC  amd64
> #v-
>
> I have the Adapter Type set to, Intel PRO/1000 MT Desktop (82540EM), and the
> following card is detected by pciconf.
> ...
> Updating motd:.
> Starting ntpd.
> panic: _mtx_lock_sleep: recursed on non-recursive mutex em0 @ /usr/src/sys/dev/e1000/if_lem.c:881

<aol>Me too</aol>  The panic message is identical, and I'm also running
in VirtualBox.  My version string (from strings on the kernel) is:

FreeBSD 10.0-CURRENT #4: Sat Jul 28 09:45:10 EST 2012
r...@swamp.lemis.com:/usr/obj/src/FreeBSD/svn/head/sys/GENERIC

Note that this is a different EST (UTC+10).

I have a dump, but I can't get much sense out of it:

kgdb: kvm_read: invalid address (0x354540a)
#0  0x in ?? ()

I'm currently rebuilding the system, but it looks as if that won't
help much.  One interesting point is that the first panic happened
after installing the new image (from yesterday's sources) while I was
trying to reboot with the old kernel, dating back to

FreeBSD swamp.lemis.com 10.0-CURRENT FreeBSD 10.0-CURRENT #3: Sun May 13 
14:34:43 EST 2012 
r...@swamp.lemis.com:/usr/obj/src/FreeBSD/svn/head/sys/GENERIC  i386

Greg
--
Sent from my desktop computer.
Finger g...@freebsd.org for PGP public key.
See complete headers for address and phone numbers.
This message is digitally signed.  If your Microsoft MUA reports
problems, please read http://tinyurl.com/broken-mua



Re: 5.2-RELEASE TODO

2003-12-01 Thread Greg 'groggy' Lehey

On Monday,  1 December 2003 at 10:01:23 -0500, Robert Watson wrote:
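> This is an automated bi-weekly mailing of the FreeBSD 5.2 open issues list.
>
> Show stopper defects for 5.2-RELEASE
>
>  +------------------------------------------------------------------------+
>  |   Issue   |  Status   |Responsible |            Description            |
>  |-----------+-----------+------------+-----------------------------------|
>  |           |           |            |The new i386 interrupt code        |
>  |ACPI kernel|           |            |requires that ACPI be compiled into|
>  |module     |In progress|John Baldwin|the kernel if it to be used. Work  |
>  |           |           |            |is underway to restore the ability |
>  |           |           |            |to load it as a module.            |
>  |-----------+-----------+------------+-----------------------------------|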

I'm currently investigating ACPI problems on a dual processor Intel
motherboard (re@ knows about this).  It looks as if the new code is
much fussier than the old code about the quality of the motherboard
BIOS: this machine runs fine on 5.1, but won't finish booting on
5.2-BETA.  Yes, this is probably an ACPI bug, but users aren't going
to see it that way: if we release a 5.2 which won't boot on a lot of
machines, people are going to blame 5.2, not the machine.  I think we
should ensure that there's at least a fallback for machines with
broken ACPI.

Greg
-- 
See complete headers for address and phone numbers.



Re: 5.2-RELEASE TODO

2003-12-01 Thread Greg 'groggy' Lehey

On Monday,  1 December 2003 at 17:12:23 -0700, Scott Long wrote:
> On Tue, 2 Dec 2003, Greg 'groggy' Lehey wrote:
>> On Monday,  1 December 2003 at 10:01:23 -0500, Robert Watson wrote:
>>> This is an automated bi-weekly mailing of the FreeBSD 5.2 open issues list.
>>>
>>> Show stopper defects for 5.2-RELEASE
>>
>> I'm currently investigating ACPI problems on a dual processor Intel
>> motherboard (re@ knows about this).  It looks as if the new code is
>> much fussier than the old code about the quality of the motherboard
>> BIOS: this machine runs fine on 5.1, but won't finish booting on
>> 5.2-BETA.  Yes, this is probably an ACPI bug, but users aren't going
>> to see it that way: if we release a 5.2 which won't boot on a lot of
>> machines, people are going to blame 5.2, not the machine.  I think we
>> should ensure that there's at least a fallback for machines with
>> broken ACPI.
>
> This argument is exactly why I added the 'disable acpi' option in the boot
> loader menu.  Of course, we STILL need to get good debugging information
> from you as to why you get a Trap 9 when ACPI is disabled.  This is the
> more important issue.

I've sent information, and I'm waiting for feedback about what to do
next.  The fact that the stack is completely trashed doesn't help,
admittedly.

Greg
--
See complete headers for address and phone numbers.



Re: requesting vinum help

2003-11-26 Thread Greg 'groggy' Lehey

On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:

> I am using vinum atm, and I am having serious problems with it.  After
> about 16 hrs of writing data to a vinum volume via NFS at a constant data
> stream of 200k/sec and reading at 400k/sec at the same time, the whole
> machine just freezes, hard.  The only thing I can do is reboot.  This
> behavior appears in 4.8 and 5-CURRENT.  I have no indication of what is
> wrong, or how to go about finding it out.  The problem is either with NFS
> or Vinum, and I'm leaning towards Vinum (because of the failure in both
> -STABLE and -CURRENT).
>
> I'm not the kind of person that relies on other people, and I like to fix
> my own problems, but this is a problem which I cannot fix at this time.
> So, I'm planning to look through the code of vinum and start messing with
> it to figure out how it works and how to debug it.

This is unlikely to get you very far.  Some more details (offline if
you prefer) would be handy, but as you say, you can't even be sure
that it's Vinum.  The best thing would be to get the system into the
kernel debugger at the point of freeze, if that's possible, and try to
work out what has happened.

> What would also be appreciated is an overall map of how vinum is
> organized and how it works.

You've read the documentation on http://www.vinumvm.org/, right?  If
you have any questions, I'm sure it can be improved on.

Greg
--
See complete headers for address and phone numbers.



Re: requesting vinum help

2003-11-26 Thread Greg 'groggy' Lehey

On Thursday, 27 November 2003 at  0:13:09 -0600, Cosmin Stroe wrote:
> On Thu, 27 Nov 2003, Greg 'groggy' Lehey wrote:
>
>> On Wednesday, 26 November 2003 at 12:04:52 -0600, Cosmin Stroe wrote:
>>>
>>> I am using vinum atm, and I am having serious problems with it.  After
>>> about 16 hrs of writing data to a vinum volume via NFS at a constant data
>>> stream of 200k/sec and reading at 400k/sec at the same time, the whole
>>> machine just freezes, hard.  The only thing I can do is reboot.  This
>>> behavior appears in 4.8 and 5-CURRENT.  I have no indication of what is
>>> wrong, or how to go about finding it out.  The problem is either with NFS
>>> or Vinum, and I'm leaning towards Vinum (because of the failure in both
>>> -STABLE and -CURRENT).
>>>
>>> I'm not the kind of person that relies on other people, and I like to fix
>>> my own problems, but this is a problem which I cannot fix at this time.
>>> So, I'm planning to look through the code of vinum and start messing with
>>> it to figure out how it works and how to debug it.
>>
>> This is unlikely to get you very far.  Some more details (offline if
>> you prefer) would be handy, but as you say, you can't even be sure
>> that it's Vinum.  The best thing would be to get the system into the
>> kernel debugger at the point of freeze, if that's possible, and try to
>> work out what has happened.
>
> Quick question: If this is a software problem with vinum, there
> should be no way it can hard lock a machine.  Is this assumption
> correct ?

Heh.  Depends on what you mean by a software problem.  The right kind
of software problem anywhere can hard lock machines :-(

> I should be able to invoke the kernel debugger by pressing the
> hotkey (ctrl+alt+esc) while the machine is locked and get a
> backtrace (altho i'd be in an ISR servicing the hotkey, so i'm not
> sure it'd do much good).

It would enable you to look around and figure out what's gone wrong.

> Any special suggestions on debugging this kind of freezing problem ?
> The hardware has been tested and it's good (CPU,RAM,HDs). (some kind
> of watchdog in software ??)

I have some debugging help in Vinum which will log what's going on,
but it doesn't help much in the case of a hard freeze.  It could be a
deadlock.  Do you have swap on Vinum?

Greg
--
See complete headers for address and phone numbers.



Re: requesting vinum help

2003-11-25 Thread Greg 'groggy' Lehey

On Tuesday, 25 November 2003 at 10:48:44 -0600, Eric Anderson wrote:

>> Could a vinum guru please contact me via email?
>>
>> I've lost 2 vinum volumes as a result of the latest fiasco and naturally
>> am eager to figure out what's going on and recover the data.
>
> This isn't necessarily directed at you - I'm just using this email as a
> footstep to send this general comment -
>
> I am kind of under the assumption that -current is more of a test bed,
> and anything can happen at any time, which is why it's bad to run
> -current on a machine you care deeply about (at least its data).

Correct.  More to the point, though, it requires you to rely more on
yourself.  At the very least, this means RTFM, which in this case
includes a number of things to submit if you have problems.  It's at
the end of vinum(4) or at
http://www.vinumvm.org/vinum/how-to-debug.html.

Greg
--
See complete headers for address and phone numbers.



Re: vinum still not working

2003-11-23 Thread Greg 'groggy' Lehey

On Sunday, 23 November 2003 at 22:46:30 +0100, Matthias Schuendehuette wrote:
> Hello,
>
> I just built a new world+kernel after the commit of grogs corrections
> but I still get:
>
> [EMAIL PROTECTED] - ~
> 503 # vinum start
> ** no drives found: No such file or directory

Yes.  The fix wasn't enough.  I was holding off committing until I
could test it.

Greg
--
See complete headers for address and phone numbers.



Re: Vinum breakage - please update UPDATING

2003-11-21 Thread Greg 'groggy' Lehey

On Friday, 21 November 2003 at 15:42:12 -0800, Marcus Reid wrote:
> Hello,
>
> I just upgraded a -CURRENT box this afternoon to discover that vinum
> is broken. If I hadn't done dumps of my working world beforehand I
> would be in pretty sad shape. Should UPDATING make note of this
> breakage?

No.  UPDATING is for things that will change relatively permanently.

> It would have saved me some embarassment, and I'm sure others are
> about to clobber their machines.

As far as I can tell, this breakage doesn't harm the data.  Others
have reported that it works on older versions of CURRENT.  I hope to
have it fixed this weekend.

Greg
--
See complete headers for address and phone numbers.



Repeatable panic from 'camcontrol devlist'

2003-10-26 Thread Greg 'groggy' Lehey

I'm running a -CURRENT kernel built about a week ago, and on
'camcontrol devlist' I get the following repeatable panic:

#10 0xc063b1d5 in panic (fmt=0xc084458e vmapbuf)
at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/kern_shutdown.c:534
#11 0xc0684d4e in vmapbuf (bp=0xc4659400) at 
/src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/vfs_bio.c:3729
#12 0xc0444c81 in cam_periph_mapmem (ccb=0x0, mapinfo=0xcda8f8a8)
at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/cam/cam_periph.c:652
#13 0xc0446eaa in xptioctl (dev=0x0, cmd=3255201792, addr=0xcda8f8a8 , flag=3, 
td=0xc2204390)
at /src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/cam/cam_xpt.c:1132
#14 0xc06009ec in spec_ioctl (ap=0xcda8fb7c) at 
/src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/fs/specfs/spec_vnops.c:351
#15 0xc0600108 in spec_vnoperate (ap=0x0) at 
/src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/fs/specfs/spec_vnops.c:122
#16 0xc069e0e1 in vn_ioctl (fp=0xc2117b6c, com=3261076738, data=0xc2067000, 
active_cred=0xc211c980, td=0xc2204390)
at vnode_if.h:503
#17 0xc0660e35 in ioctl (td=0xc2204390, uap=0xcda8fd10) at 
/src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/sys/file.h:261

It doesn't happen on another machine running a kernel built
yesterday.  If anybody can confirm that this problem has been fixed,
I'll leave it; otherwise any pointers would be of use.  FWIW, it dies
here:

(kgdb) f 11
#11 0xc0684d4e in vmapbuf (bp=0xc4659400) at
/src/FreeBSD/5-CURRENT-WANTADILLA/src/sys/kern/vfs_bio.c:3729
3729                    panic("vmapbuf: mapped more than MAXPHYS");
(kgdb) l
3724                            if (m == NULL)
3725                                    goto retry;
3726                            bp->b_pages[pidx] = m;
3727                    }
3728            if (pidx > btoc(MAXPHYS))
3729                    panic("vmapbuf: mapped more than MAXPHYS");
3730            pmap_qenter((vm_offset_t)bp->b_saveaddr, bp->b_pages, pidx);
3731
3732            kva = bp->b_saveaddr;
3733            bp->b_npages = pidx;
(kgdb) p pidx
$2 = 0xcda8f8a8

Greg
--
See complete headers for address and phone numbers.
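
The test that fires in the listing above compares a page index against the number of pages in MAXPHYS. A small Python sketch of that arithmetic, assuming a 4096-byte page and a 128 kB MAXPHYS; both values are assumptions for illustration and should be checked against the kernel headers of the system in question:

```python
# Assumed values for illustration only; the real constants come from
# the kernel headers of the affected system.
PAGE_SIZE = 4096          # assumed page size in bytes
MAXPHYS = 128 * 1024      # assumed maximum physical I/O size in bytes


def btoc(nbytes: int) -> int:
    """Convert bytes to pages ("clicks"), rounding up."""
    return (nbytes + PAGE_SIZE - 1) // PAGE_SIZE


def exceeds_maxphys(page_count: int) -> bool:
    """Mirror of the `pidx > btoc(MAXPHYS)` test shown in the listing."""
    return page_count > btoc(MAXPHYS)
```

Under these assumptions the limit is 32 pages, so the printed value 0xcda8f8a8 is far outside anything a valid request could produce. It is also the same number that the backtrace shows for mapinfo in frame 12.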



Re: sata + vinum + Asus p4p800 = :(

2003-10-14 Thread Greg 'groggy' Lehey

On Tuesday, 14 October 2003 at 18:46:44 +0200, Balazs Nagy wrote:
> Hi,
>
> I had a -CURRENT setting with an Abit BE7-S and two SATA disks with
> vinum configuration.  It worked very well until a power failure, and
> the mainboard died.  Yesterday I got a replacement mainboard, the only
> type met the requirements (eg. two SATA ports) in the store: an Asus
> P4P800.
>
> My only problem is with the disks.  I can use all USB ports (8; what a
> server could do with eight USB ports?), the 3C940 Gigabit Ethernet port
> (I disabled the sound subsystem), and everything works until the first
> fsck, when the kernel paniced.  Here is the dmesg:
>
> GEOM: create disk ad0 dp=0xc6b4bb70
> ad0: 4028MB Maxtor 90422D2 [8184/16/63] at ata0-master UDMA33
> acd0: CDROM GCR-8523B at ata0-slave PIO4
> GEOM: create disk ad4 dp=0xc6b4b070
> ad4: 117246MB Maxtor 6Y120M0 [238216/16/63] at ata2-master UDMA133
> GEOM: create disk ad6 dp=0xc6b4b170
> ad6: 117246MB Maxtor 6Y120M0 [238216/16/63] at ata3-master UDMA133
> Mounting root from ufs:/dev/vinum/root
> panic: ata_dmasetup: transfer active on this device!
>
> I did further investigation: I booted from ata0-master, and mounted
> /dev/vinum/root as /mnt.  A simple fsck -f -B /mnt killed the system.
> I did the same with /dev/ad4s1a (this is the boot hack partition from
> the handbook), then I switched off softupdates.  No win. I tried to
> boot with safe mode either, but it hung with page fault.
>
> What can I do?

Provide a dump?  Analyse the problem yourself?  This *is* -CURRENT,
after all.  

> Besides, why my SATA interfaces are recognized as UDMA133?

It sounds like this could be an ATA compatibility issue with this
motherboard.  You should be able to mount your root file
system from the underlying UFS partition, thus disabling Vinum; at
least that would help you track down the problem.

Greg
--
See complete headers for address and phone numbers.



Re: Serial debug broken in recent -CURRENT?

2003-10-14 Thread Greg 'groggy' Lehey

On Wednesday,  8 October 2003 at  2:08:55 +1000, Bruce Evans wrote:
> On Tue, 30 Sep 2003, Sam Leffler wrote:
>
>> It reliably locks up for me when you break into a running system; set a
>> breakpoint; and then continue.  Machine is UP+HTT.  Haven't tried other
>> machines.
>
> This seems to be because rev.1.75 of db_interface.c disturbed some much
> larger bugs related to the ones that it fixed.  It takes miracles for
> entering ddb to even sort of work in the SMP case.

Ah, interesting.  I hadn't thought that it might be related to SMP.

> If one of multiple CPUs in kdb_trap() somehow stops the others, then the
> others face different problems when they restart.  They can't just return
> because debugger traps are not restartable (by just returning).  They can't
> just proceed because the first CPU may changed the state in such a way as
> to make proceeding in the normal way not work (e.g., it may have deleted
> a breakpoint).
>
> These problems are not correctly or completely fixed in:
>
> Index: db_interface.c
> ===================================================================
> RCS file: /home/ncvs/src/sys/i386/i386/db_interface.c,v
> retrieving revision 1.75
> diff -u -2 -r1.75 db_interface.c
> --- db_interface.c	7 Sep 2003 13:43:01 -0000	1.75
> +++ db_interface.c	7 Oct 2003 14:11:35 -0000
> ...
> This is supposed to stop the other CPUs either in kdb_trap() or normally.
> The timeouts are hopefully long enough for all the CPUs to stop in 1
> of these ways.  But it doesn't always work.  1 possible problem is
> that stop and start IPIs may be delivered out of order, so CPUs stopped
> in kdb_trap() may end up stopped (since we don't wait for them to see
> the stop IPI).

Correct.  This patch doesn't fix the problem on my system.  I've built
a single processor kernel (comment out SMP and APIC_IO), and that
*does* work with remote gdb, so it's almost certainly an SMP issue.  I
have a dump of a partially hanging system if that's of any help.

Greg
--
See complete headers for address and phone numbers.



Re: Serial debug broken in recent -CURRENT?

2003-09-30 Thread Greg 'groggy' Lehey

On Tuesday, 30 September 2003 at 16:23:35 +1000, Bruce Evans wrote:
> On Mon, 29 Sep 2003, Greg 'groggy' Lehey wrote:
>
>> After building a new kernel, remote serial gdb no longer works.  When
>> I issue a 'continue' command, I lose control of the system, but it
>> doesn't continue running.  Has anybody else seen this?
>
> It works as well as it did a few months ago here.  (Not very well compared
> with ddb.  E.g., calling a function is usually fatal.)

Hmm, that's not what Sam or I are seeing.  How old is your kernel?
You *are* able to continue, right?  Everything else works for me.

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Re: Serial debug broken in recent -CURRENT?

2003-09-30 Thread Greg 'groggy' Lehey

On Tuesday, 30 September 2003 at 16:13:09 -0400, Andrew Gallatin wrote:

> Sam Leffler writes:
>> It reliably locks up for me when you break into a running system; set a
>> breakpoint; and then continue.  Machine is UP+HTT.  Haven't tried other
>> machines.
>
> Perhaps related, perhaps a red-herring:   With a single P4 + HTT, +
> SMP kernel, if I break into the ddb debugger on a serial console, the
> machine locks solid about 1 in 4 times.

Hmm, the first suggestion that it's possibly transient.  My machine is
a 2 processor Celeron 500 (obviously not HTT :-).  I get the same
results when debugging over firewire, which suggests that the problem
isn't in the serial link handling.

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Serial debug broken in recent -CURRENT?

2003-09-29 Thread Greg 'groggy' Lehey

After building a new kernel, remote serial gdb no longer works.  When
I issue a 'continue' command, I lose control of the system, but it
doesn't continue running.  Has anybody else seen this?

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Re: HEADSUP: Change of makedev() semantics.

2003-09-28 Thread Greg 'groggy' Lehey

On Sunday, 28 September 2003 at 23:22:07 +0200, Poul-Henning Kamp wrote:
> Basically:
>
>   3. If you do a normal device driver, cache the result
>      from when you call make_dev().
> ...
>
>   ./dev/vinum
>   Failure to cache result of make_dev() ?

Where should this be cached?  Can you point to example code?

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Re: HEADSUP: Change of makedev() semantics.

2003-09-28 Thread Greg 'groggy' Lehey

On Sunday, 28 September 2003 at 19:46:20 -0400, Robert Watson wrote:

> On Mon, 29 Sep 2003, Greg 'groggy' Lehey wrote:
>
>> On Sunday, 28 September 2003 at 23:22:07 +0200, Poul-Henning Kamp wrote:
>>> Basically:
>>>
>>> 3. If you do a normal device driver, cache the result
>>>    from when you call make_dev().
>>> ...
>>>
>>> ./dev/vinum
>>> Failure to cache result of make_dev() ?
>>
>> Where should this be cached?  Can you point to example code?
>
> Actually, it looks like Vinum is caching the dev_t's,

Ah, you mean saving the results rather than calling make_dev() every
time?  Yes, it only calls make_dev() once for any device.

> but it's not always using them to get back to the dev_t--sometimes
> it's invoking makedev() instead.  However, this appears to happen
> only in the vinumrevive.c code, so I'm not sure if that's a property
> of the cached reference being unavailable it looks like it should be
> available in that context though.

No, it should always be available.  I was going to say I don't see
any references to make_dev() in vinumrevive.c, nor any references to
makedev() at all, but I see that VINUM_SD includes both.

> I.e., using sd->dev instead of VINUM_SD() -- it looks like there is
> a valid (struct sd *) reference there to follow, so you can get to
> the dev_t without doing a makedev().

Yes, this is a bug (and an indication of the dangers of using macros :-)
I'll fix it.

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.
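
The point made in this exchange is to create the device once, keep the result in the object that owns it, and follow that stored reference afterwards instead of rebuilding the device from its numbers. A language-neutral Python sketch of that pattern follows. The `make_dev` stand-in, the `Subdisk` class and `device_for` are hypothetical illustrations and are not the kernel API or Vinum's code:

```python
import itertools

_next_handle = itertools.count(1)


def make_dev(name: str) -> int:
    """Stand-in for a device-creation call: hands out a fresh handle each time."""
    return next(_next_handle)


class Subdisk:
    """Hypothetical subdisk object that creates its device once and keeps it."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.dev = make_dev(name)   # created exactly once, cached on the object


def device_for(sd: Subdisk) -> int:
    """Follow the cached reference instead of rebuilding the device."""
    return sd.dev
```

Every caller that already holds the object gets the same handle back, which is the behaviour the thread asks for in place of a macro that reconstructs the device on each use.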



Re: recent changes prohibit vinum swap.

2003-09-26 Thread Greg 'groggy' Lehey

On Friday, 26 September 2003 at 18:38:48 -0400, Robert Watson wrote:

> On Fri, 26 Sep 2003, David Gilbert wrote:
>
>> Recent changes to -CURRENT prohibit vinum swap:
>>
>> [1:6:[EMAIL PROTECTED]:~ swapon /dev/vinum/swapmu
>> swapon: /dev/vinum/swapmu: Operation not supported by device
>
> In order to support swapping, Vinum will need to be modified to use struct
> disk and the disk(9) API, rather than exposing its storage devices
> directly via struct cdevsw and make_dev(9).  I.e., Vinum probably needs to
> start approaching things as disks rather than devices, a distinction
> that's becoming more mature in -CURRENT.
>
> From a quick read of vinumconfig.c, I'm guessing this wouldn't be hard to
> implement.  Some subset of struct sd, struct plex, and struct volume will
> need to start holding a struct disk instance which would be passed to
> disk_create() instead of a call to make_dev().  Much of the remainder will
> just consist of a bit of tweaking to make Vinum extract its data from
> bp->bio_disk->d_drv1 instead of bp->b_dev, replacing the ioctl dev_t
> argument with a disk argument, etc.

I'll take a look at this soon.  If somebody else wants to look first,
please let me know.  The introduction of GEOM means quite a shake-up
in the Vinum structure.

> I recently noticed that Vinum may be averse to blocksizes other than
> 512 bytes.

It shouldn't be.  There's never been any dependency on it.

> Or at least, I can get Vinum mirrors up and running on md devices
> backed to memory, but not to swap, and the usual reason for problems
> on that front is the 4k blocksize for swap-backed md devices.

I've had a number of problems with md devices.  This one may be that
Vinum is presenting a 512 byte block size upwards instead of the 4 kB
that it should be showing.  Again, I'll take a look.

> I also noticed that the vinum commandline tool is a bit
> devfs-unfriendly, or at least, it gets pretty verbose about how all
> the files/directories it wants to create are already present.  It
> could be that a test for devfs conditionally causing a test for
> EEXIST would go a long way in muffling the somewhat loud complaining
> :-).

I'm not sure I understand this.  Can you give me a concrete example?

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.
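
The EEXIST suggestion quoted in this message is a general pattern: when a directory or node that the tool wants to create is already there, treat that as success rather than reporting it. A minimal Python sketch of the pattern, assuming a hypothetical helper named `ensure_dir`; it is not taken from the vinum utility:

```python
import os
import tempfile


def ensure_dir(path: str) -> bool:
    """Create path if needed.

    Returns True if the directory was created and False if it already
    existed.  Anything else in the way is still reported as an error.
    """
    try:
        os.mkdir(path)
        return True
    except FileExistsError:          # the errno EEXIST case
        if not os.path.isdir(path):
            raise                    # a non-directory is in the way
        return False
```

With a check like this, a second run over the same tree stays quiet, while real failures such as permission problems or a plain file in the way are still raised.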



Re: recent changes prohibit vinum swap.

2003-09-26 Thread Greg 'groggy' Lehey

On Friday, 26 September 2003 at 19:28:45 -0400, David Gilbert wrote:
 Robert == Robert Watson [EMAIL PROTECTED] writes:

 Robert On Fri, 26 Sep 2003, David Gilbert wrote:

 Recent changes to -CURRENT prohibit vinum swap:

 [1:6:[EMAIL PROTECTED]:~ swapon /dev/vinum/swapmu
 swapon: /dev/vinum/swapmu: Operation not supported by device

 Robert In order to support swapping, Vinum will need to be modified
 Robert to use struct disk and the disk(9) API, rather than exposing
 Robert its storage devices directly via struct cdevsw and
 Robert make_dev(9).  I.e., Vinum probably needs to start approaching
 Robert things as disks rather than devices, a distinction that's
 Robert becoming more mature in -CURRENT.

 Robert From a quick read of vinumconfig.c, I'm guessing this wouldn't
 Robert be hard to implement.  Some subset of struct sd, struct plex,
 Robert and struct volume will need to start holding a struct disk
 Robert instance which would be passed to disk_create() instead of a
 Robert call to make_dev().  Much of the remainder will just consist of
 Robert a bit of tweaking to make Vinum extract its data from
 Robert bp->bio_disk->d_drv1 instead of bp->b_dev, replacing the ioctl
 Robert dev_t argument with a disk argument, etc.

 Is this something that someone can help me with quickly, or should I
 downgrade the machine until it's been done? 

Don't hold your breath.  This will probably happen in the course of
migrating Vinum functionality to GEOM.

 Is there a quick hack to make it work for now?

None that I know of.

 If I must downgrade, what date would be appropriate?

Sorry, I can't help there.  Maybe phk can give you some indication.

 Robert I also noticed that the vinum commandline tool is a bit
 Robert devfs-unfriendly, or at least, it gets pretty verbose about
 Robert how all the files/directories it wants to create are already
 Robert present.  It could be that a test for devfs conditionally
 Robert causing a test for EEXIST would go a long way in muffling the
 Robert somewhat loud complaining :-).

 Well... vinum is fragile in a whole bunch of ways.  vinum rm often
 leaves things in an inconsistent state.  I almost always reboot now
 after using it.  vinum rename doesn't change the devfs vinum directory
 ... which then also requires a reboot to correct.

Hmm.  That's another one to look at.

 Another thing that's very fragile is resetconfig.  It blanks memory,
 but not disk.

It should do.  It leaves the device names, though.  That's arguably a
bug.

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Re: recent changes prohibit vinum swap.

2003-09-26 Thread Greg 'groggy' Lehey

On Friday, 26 September 2003 at 22:08:25 -0400, David Gilbert wrote:
 Greg == Greg Lehey Greg writes:

 Greg Don't hold your breath.  This will probably happen in the course
 Greg of migrating Vinum functionality to GEOM.

 So... is vinum-as-we-know-it going to disappear into the GEOM
 monster?

I suppose that depends on how you know it :-)

 There seems to be cross purposes here.

I'm not sure what you mean.  GEOM is a generalized framework which
fits around Vinum.  It also does a lot of the same things that Vinum
does.  There's no reason to have duplicated effort, so Vinum is going
to have to adapt.

Greg
--
See complete headers for address and phone numbers.
NOTE: Due to the currently active Microsoft-based worms, I am limiting
all incoming mail to 131,072 bytes.  This is enough for normal mail,
but not for large attachments.  Please send these as URLs.



Re: Where is my SLIP interface

2003-09-24 Thread Greg 'groggy' Lehey

On Thursday, 25 September 2003 at  3:04:52 +0200, Willem Jan Withagen wrote:
 Hi,

 I'm trying to upgrade my firewall/router to 5.x but I'm getting caught by
 the fact that I cannot find a 'sl0' interface.

Are you really still using SLIP?  What's wrong with PPP?

 I've tried it both with if_sl compiled into the kernel and as a module.

 In neither case does ifconfig show a sl0 device.

IIRC it doesn't show now until it's configured.

Greg
--
See complete headers for address and phone numbers



Re: Where is my SLIP interface

2003-09-24 Thread Greg 'groggy' Lehey

On Thursday, 25 September 2003 at  1:12:21 -0400, Lanny Baron wrote:
 On Wed, 2003-09-24 at 22:42, Greg 'groggy' Lehey wrote:
 On Thursday, 25 September 2003 at  3:04:52 +0200, Willem Jan Withagen wrote:
 Hi,

 I'm trying to upgrade my firewall/router to 5.x but I'm getting caught by
 the fact that I cannot find a 'sl0' interface.

 Are you really still using SLIP?  What's wrong with PPP?

 I've tried it both with if_sl compiled into the kernel and as a module.

 In neither case does ifconfig show a sl0 device.

 IIRC it doesn't show now until it's configured.

 Perhaps 'The Complete FreeBSD' by Greg Lehey will help out.

Not any more.  I removed that chapter from the book.  The chapter's
available (also covers UUCP) if anybody wants it; just ask.  But it
seems that Willem has already had SLIP up and running.

Greg
--
See complete headers for address and phone numbers



Re: When this panic will be fixed?

2003-08-24 Thread Greg 'groggy' Lehey

On Saturday, 23 August 2003 at  5:05:11 -0700, Rostislav Krasny wrote:
 When FreeBSD 5.0-RELEASE had been released I tried to install it from
 floppies. I got system panic and then reported this problem into
 [EMAIL PROTECTED] mailing list. You can find this report in
 http://www.atm.tut.fi/list-archive/freebsd-stable/msg08385.html or in
 http://docs.freebsd.org/cgi/getmsg.cgi?fetch=297316+0+archive/2003/freebsd-stable/20030126.freebsd-stable

FreeBSD-CURRENT is a "some assembly required" list.  If you have a
panic, you should post a backtrace and ask specific questions.
Pointing to mail messages which don't even identify the panic string
is not going to get much in the way of response.

Greg
--
See complete headers for address and phone numbers



HEADS UP: Vinum working again

2003-08-22 Thread Greg 'groggy' Lehey

Some changes in device driver locking recently broke Vinum for a short
period of time.  The problem is now fixed.  If you have any problems
with a recent version of Vinum, please let me know.

Greg
--
See complete headers for address and phone numbers



Re: vinum lock panic at startup -current

2003-08-14 Thread Greg 'groggy' Lehey

On Thursday,  7 August 2003 at 18:23:10 -0600, Aaron Wohl wrote:
 I just cvsuped -current this afternoon to get about 1 week's updates.
 After that the kernel panics booting starting vinum.  I removed the one
 vinum volume (reformatted as UFS2) I had for testing, and it still panics.
 I changed the /etc/rc.conf start_vinum=YES to NO and can start ok now.

 Anyone else seeing this?  Is there a fix for it?

This panic actually happens in GEOM.  I believe there were some
questions about GEOM recently, but I haven't had any reply yet from
phk to my last question on the issue.

Greg
--
See complete headers for address and phone numbers



Re: Questions about stability of snapshots and vinum in 5.1

2003-08-14 Thread Greg 'groggy' Lehey

[Format recovered--see http://www.lemis.com/email/email-format.html]

Long/short syndrome.

On Tuesday, 12 August 2003 at 20:49:05 -0400, James Quick wrote:

 I am seeking feedback on the status of vinum, and whether the
 following plan makes sense as an upgrade plan for a host with a
 light load but whose downtime windows are short.  I am curious if my
 planned use of snapshots is risky in 5.1.  I have used them under
 a much older 5.0 version with no problems, but a lot has changed.

As of right now, recent changes in -CURRENT have broken Vinum.  I hope
to have time to fix it in the next day or two.

 I have not migrated my data onto the first of these drives since I
 need to configure one, migrate from 2 old drives, then put in the
 second new drive before continuing.  I also need to do as much of
 this work as possible without interruption.  My plan is, to build
 out the first set of partitions,

Have you read the documentation on this subject?  There are easier
ways.   I don't see that any of this is necessary.

Greg
--
When replying to this message, please take care not to mutilate the
original text.  
For more information, see http://www.lemis.com/email.html
See complete headers for address and phone numbers



Re: vinum problems with todays current

2003-08-14 Thread Greg 'groggy' Lehey

On Tuesday,  5 August 2003 at 22:21:41 +0200, Rob wrote:
 Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Rob writes:

 Hi all,

 After cvs'upping (about 12 hours ago) and building world/kernel vinum
 stopped working. It does show my two disks but nothing more. I also
 get an error message right after the bootloader:

 Can you try this patch:

 ...

 I noticed I had an older version of spec_vnops.c (1.205), so I
 cvsupped again and built the kernel; this gave me the same msgbuf error,
 but with different values. Then I applied your patch and the error
 message disappeared, but still my vinum doesn't come up.

Can I assume that this is related to GEOM, and not to Vinum?

Greg
--
See complete headers for address and phone numbers



Re: vinum problems with todays current

2003-08-11 Thread Greg 'groggy' Lehey

On Friday,  8 August 2003 at 16:04:05 +0200, Rob wrote:
 Greg 'groggy' Lehey wrote:
 On Tuesday,  5 August 2003 at 22:21:41 +0200, Rob wrote:
 Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Rob writes:

 Hi all,

 After cvs'upping (about 12 hours ago) and building world/kernel vinum
 stopped working. It does show my two disks but nothing more. I also
 get an error message right after the bootloader:

 Can you try this patch:

 I noticed I had an older version of spec_vnops.c (1.205), so I
 cvsupped again and built the kernel; this gave me the same msgbuf error,
 but with different values. Then I applied your patch and the error
 message disappeared, but still my vinum doesn't come up.

 Can I assume that this is related to GEOM, and not to Vinum?

 After investigating a little further today, I found the config info
 on the drives to be mangled.

 --
 # rm -f log
 # for i in /dev/da0s1h /dev/da1s1h /dev/da2s1h /dev/da3s1h; do
 (dd if=$i skip=8 count=6|tr -d '\000-\011\200-\377'; echo) >> log
 done
 # cat log
 IN VINOx-server.debank.tvbCc3??Z${m5?
 IN VINOx-server.debank.tvaC3?WPZ${m5?
 --
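
What the dd and tr pipeline quoted above does to each device can be sketched in Python as follows.  This is an illustration only: the function name visible_config and the fake device image are assumptions, and the sketch relies on dd's default block size of 512 bytes.  The tr ranges are octal, so they delete bytes 0 to 9 and 128 to 255 and keep newlines and printable ASCII.

```python
# Bytes deleted by tr -d '\000-\011\200-\377' (the ranges are octal).
DELETED = bytes(range(0o000, 0o011 + 1)) + bytes(range(0o200, 0o377 + 1))

BLOCK = 512  # dd's default block size


def visible_config(image, skip=8, count=6):
    """Return the printable residue of blocks skip..skip+count-1 of image."""
    chunk = image[skip * BLOCK:(skip + count) * BLOCK]
    return chunk.translate(None, DELETED)


# Fake device image: zero padding with a short text fragment at block 8.
image = bytearray(16 * BLOCK)
image[8 * BLOCK:8 * BLOCK + 8] = b"IN VINO\x00"
print(visible_config(bytes(image)))
```

On an intact drive the residue would be expected to contain readable configuration text after the header, which is why the short, garbled lines in the log above look worrying.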

 I guess the drives can't be started again unless I have the
 parameters which I used during install (please say I'm wrong).

Hmm.  That doesn't look good.  No trace of the original config?

Greg
--
See complete headers for address and phone numbers



GEOM/vinum compatibility (was: vinum lock panic at startup -current)

2003-08-11 Thread Greg 'groggy' Lehey

On Friday,  8 August 2003 at 15:24:09 +0200, Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Aaron Wohl writes:

  Panicstring: mutex Giant owned at /usr/src/sys/geom/geom_dev.c:198

 Ok, then I think I know what it is.

 Vinum apparently does not go through SPECFS but rather calls into
 the disk device drivers directly.  That is a pretty wrong thing to
 do,

It used to be the standard.  What's the issue?

 and it seems that vinum does not respect the D_NOGIANT flag which
 GEOM recently started setting.

Probably because it didn't know about it.  As I've said before, it
would be nice to be informed about the changes you're making,
particularly given your stated intention of doing no work on Vinum.
Could you please give details (privately if you want, but I think this
could be of interest to other people too).

Greg
--
See complete headers for address and phone numbers



Re: vinum lock panic at startup -current

2003-08-08 Thread Greg 'groggy' Lehey

On Friday,  8 August 2003 at 10:27:31 +0200, Erwin Lansing wrote:
 On Fri, Aug 08, 2003 at 10:22:06AM +0200, Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Aaron Wohl writes:
 I just cvsuped -current this afternoon to get about 1 week's updates.
 After that the kernel panics booting starting vinum.  I removed the one
 vinum volume (reformatted as UFS2) I had for testing, and it still panics.
 I changed the /etc/rc.conf start_vinum=YES to NO and can start ok now.

 What was the actual panic message ?

 Would http://people.freebsd.org/~erwin/koala.trace2 be related ?

Hmm.  I haven't seen this one before.

 This happens after a couple of hours of activity, things are fine
 again after reboot (for a while) on 5.1-RELEASE.

This is a very different backtrace from the last one you showed me.
Can I take a look at the dump?  The easiest way would be to access it
on your system, if that's possible.  I have a horrible feeling it's
going to be a memory corruption bug.

Greg
--
See complete headers for address and phone numbers



Re: Lucent IBSS mode doesn't work in -CURRENT?

2003-08-04 Thread Greg 'groggy' Lehey

On Sunday,  3 August 2003 at 23:51:55 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Thursday, 31 July 2003 at  9:30:31 +0200, Eirik Oeverby wrote:
 Hey,

 I have a few Orinoco cards, and they 'work' in both ad-hoc and
 infrastructure mode. However with dhclient it gets tricky, because it
 will only work the first time dhclient assigns an address to the card.
 Whenever it tries to refresh it or whatever, I start getting those
 timeout and busy bit errors, and network connectivity drops. This
 usually happens within a few minutes or latest after 30 minutes or so -
 probably depending on your dhcpd/dhclient configuration. Configuring a
 static IP lets me use the card, and it seems stable.

 I am really glad someone else is seeing this, perhaps it can get fixed
 some day :)

 Oh and btw.. Get the *latest* firmware onto all your cards. That is
 essential for anything to work right at all..

 That sounds wrong to me.  If it worked before, and it doesn't now,
 that's not the fault of the firmware.

 Quit harping on it, ok.  We know there's a bug and carping like this
 makes me less willing to find and fix it.

I'm not harping on it, just pointing out that there's a difference
between a workaround and a fix.  If it hadn't been for that comment, I
wouldn't have replied at all.  I've borrowed an access point, so I'm
not in any pain right now.  Let me know if you want me to test
something.

Greg
--
See complete headers for address and phone numbers



Re: Lucent IBSS mode doesn't work in -CURRENT?

2003-08-04 Thread Greg 'groggy' Lehey

On Monday,  4 August 2003 at 11:37:44 +0200, Brad Knowles wrote:
 At 11:51 PM -0600 2003/08/03, M. Warner Losh wrote:

 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:

 On Thursday, 31 July 2003 at  9:30:31 +0200, Eirik Oeverby wrote:

 Oh and btw.. Get the *latest* firmware onto all your cards. That is
 essential for anything to work right at all..

 That sounds wrong to me.  If it worked before, and it doesn't now,
 that's not the fault of the firmware.

 Quit harping on it, ok.  We know there's a bug and carping like this
 makes me less willing to find and fix it.

   I'm confused.  I agree that I have sometimes found Greg to be a
 bit annoying, but it seems to me that he's asking a perfectly
 legitimate question -- if things worked fine in the past (including
 the firmware versions at the time), and they don't work now, then why
 is a firmware update needed?

   I would ask:

   What changed so that things broke, and why can't we go back
   to the way things worked before?

I think you're misunderstanding Warner.  He's not disagreeing.  My
message wasn't directed at Warner, it was directed at Eirik.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-03 Thread Greg 'groggy' Lehey

On Sunday,  3 August 2003 at  0:31:45 -0400, John Baldwin wrote:

 On 03-Aug-2003 Greg 'groggy' Lehey wrote:
 On Saturday,  2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
 [EMAIL PROTECTED]:~/tmp/debug  gdb -k kernel.debug
 (kgdb) list *(g_dev_strategy+29)

 This is almost certainly the wrong function.  At the very least you
 should look at the arguments passed to it.

 Actually, this line can be very instructive.  Since 'bp' is valid
 it is probably the bp2 from g_clone_bio() that is NULL.  You might
 want to ask phk about that one.

I think you'll find that there's a null dev pointer in there.  As I
say, I've seen this scenario before (without GEOM), and I'd be
surprised if this were phk's problem.

 (kgdb) list *(launch_requests+448)
 No symbol launch_requests in current context.
 (kgdb) list *(vinumstart+2b2)
 No symbol vinumstart in current context.
 (kgdb)

 Read the links I just sent you.  You haven't loaded the Vinum symbols.

 Bah, this isn't hard for you to do either:

... once you've loaded the symbols.  That's why I pointed to the
links.

As I said to Terry, the real issue here is probably what was happening
at the time, not the contents of the dump.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-03 Thread Greg 'groggy' Lehey

On Sunday,  3 August 2003 at 11:17:49 +0200, Eivind Olsen wrote:
 --On 3. august 2003 09:37 +0930 Greg 'groggy' Lehey [EMAIL PROTECTED]
 wrote:
 Read the links I just sent you.  You haven't loaded the Vinum symbols.

 I'm not sure exactly what to do here. I have absolutely no previous
 experience with kernel debugging, using gdb etc. so I'm lost without
 specific instructions on what to do, what to try etc.

Don't worry too much about that at the moment.  Let me analyze the
info you've sent me, and I'll ask some more questions.

Greg
--
See complete headers for address and phone numbers



Re: Lucent IBSS mode doesn't work in -CURRENT?

2003-08-03 Thread Greg 'groggy' Lehey

On Thursday, 31 July 2003 at  9:30:31 +0200, Eirik Oeverby wrote:
 Hey,

 I have a few Orinoco cards, and they 'work' in both ad-hoc and
 infrastructure mode. However with dhclient it gets tricky, because it
 will only work the first time dhclient assigns an address to the card.
 Whenever it tries to refresh it or whatever, I start getting those
 timeout and busy bit errors, and network connectivity drops. This
 usually happens within a few minutes or latest after 30 minutes or so -
 probably depending on your dhcpd/dhclient configuration. Configuring a
 static IP lets me use the card, and it seems stable.

 I am really glad someone else is seeing this, perhaps it can get fixed
 some day :)

 Oh and btw.. Get the *latest* firmware onto all your cards. That is
 essential for anything to work right at all..

That sounds wrong to me.  If it worked before, and it doesn't now,
that's not the fault of the firmware.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at  2:11:24 -0700, Terry Lambert wrote:
 Eivind Olsen wrote:
 Can anyone suggest what I do next to find out about this crash?

 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0x14

 Dereference of NULL pointer; reference is for element at offset
 0x14 in some structure; this is the equivalent of 5 32 bit ints
 or pointers into the structure.
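
 The arithmetic behind that reading can be checked with a small Python sketch.  The structure below is hypothetical (it is not any real kernel structure); it only assumes five 32 bit members ahead of the one being accessed.

```python
import ctypes


class Example(ctypes.Structure):
    """Hypothetical structure: five 32-bit members, then the one we want."""
    _fields_ = [
        ("a", ctypes.c_uint32),
        ("b", ctypes.c_uint32),
        ("c", ctypes.c_uint32),
        ("d", ctypes.c_uint32),
        ("e", ctypes.c_uint32),
        ("target", ctypes.c_uint32),
    ]


base = 0  # a NULL structure pointer
fault_address = base + Example.target.offset
print(hex(fault_address))
```

 Accessing the sixth member through a NULL pointer touches address 0x14, which matches the fault virtual address in the panic message.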

 db trace
 g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
 launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
 vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2

 gdb -k kernel.debug
 (gdb) list *(g_dev_strategy+29)
 [ ... ]
 (gdb) list *(launch_requests+448)
 [ ... ]
 (gdb) list *(vinumstart+2b2)
 [ ... ]

 Will give you the exact source lines involved, assuming you
 built a debug kernel.
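
 One detail worth noting: the offsets in the DDB trace are hexadecimal, and gdb reads an unprefixed number as decimal by default, so the 0x prefix should be kept.  A small Python sketch (the helper name gdb_list_commands is made up for illustration) that turns trace lines into the corresponding gdb commands might look like this.  Whether gdb can resolve the Vinum frames still depends on the module's symbols being loaded.

```python
import re

# Matches the tail of a DDB trace line: "... at symbol+0xOFFSET".
FRAME = re.compile(r"\bat\s+(\w+)\+0x([0-9a-fA-F]+)\s*$")


def gdb_list_commands(trace):
    """Turn DDB trace lines into gdb "list" commands, keeping hex offsets."""
    commands = []
    for line in trace.splitlines():
        match = FRAME.search(line)
        if match:
            symbol, offset = match.groups()
            commands.append("list *(%s+0x%s)" % (symbol, offset))
    return commands


trace = """\
g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
"""

for command in gdb_list_commands(trace):
    print(command)
```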

 You don't actually need a crash dump to debug a stack traceback.

Great!  So you know the answer?  Please submit a patch.

Seriously, this is nonsense.  Yes, it's a null pointer dereference.
What?  Why?  How do you fix it?  Finding the first step doesn't solve
the problem.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 17:00:59 +0200, Eivind Olsen wrote:
 --On 2. august 2003 11:16 +0200 Bernd Walter [EMAIL PROTECTED]
 wrote:
 Looks like a problem in vinum.  The other backtrace was the same, right?
 Please take a look at an older thread named (IIRC) vinum or geom bug?
 Greg asked for special debug output, but it never happened again for me.
 A real murphy bug - it happened on three machines once a day and after
 Greg's response nothing happened over weeks.

 Are you thinking of the thread vinum and/or geom panic on alpha from 10th
 of June? I forgot to mention this but my system is i386 uniprocessor
 (Pentium2 at 450MHz).

 In case it's relevant, yes I do run vinum:

Yes, of course you do.  That's what the stack trace says, and that's
why people mentioned Vinum in the first place:

On Saturday,  2 August 2003 at 10:11:24 +0200, Eivind Olsen wrote:
 Here's some output from DDB:

 db trace
 g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
 launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
 vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
 vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6

On Saturday,  2 August 2003 at 11:16:21 +0200, Bernd Walter wrote:
 On Sat, Aug 02, 2003 at 02:00:52AM -0700, Kris Kennaway wrote:
 On Sat, Aug 02, 2003 at 10:11:24AM +0200, Eivind Olsen wrote:

 db trace
 g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
 launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
 vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
 vinumstrategy(c5ada2d0,0,c09719b0,40,0) at vinumstrategy+0xa6

 Looks like a problem in vinum.  The other backtrace was the same, right?

 Please take a look at an older thread named (IIRC) vinum or geom bug?
 Greg asked for special debug output, but it never happened again for me.
 A real murphy bug - it happened on three machines once a day and after
 Greg's response nothing happened over weeks.

This is the real issue.  Until you supply the information I ask for in
the man page or at http://www.vinumvm.org/vinum/how-to-debug.html,
only Terry can help you.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 16:47:13 +0200, Eivind Olsen wrote:
 --On 2. august 2003 02:11 -0700 Terry Lambert [EMAIL PROTECTED]
 wrote:
 db trace
 g_dev_strategy(c2156024,c2153800,0,cfb528d0,c2099eca) at g_dev_strategy+0x29
 launch_requests(c299bf00,0,1,,47) at launch_requests+0x448
 vinumstart(c5ada2d0,0,c22ab000,cfb5294c,c02e5bc6) at vinumstart+0x2b2
 gdb -k kernel.debug
 (gdb) list *(g_dev_strategy+29)
 [ ... ]
 (gdb) list *(launch_requests+448)
 [ ... ]
 (gdb) list *(vinumstart+2b2)
 [ ... ]
 Will give you the exact source lines involved, assuming you
 built a debug kernel.

 I did. At least I've tried to. :)
 (I have a kernel.debug which was compiled at the same time as the real
 kernel I'm using, and it's approx. 30MB in size).

 You don't actually need a crash dump to debug a stack traceback.

 This is what I found by using those commands you mentioned:

 [EMAIL PROTECTED]:~/tmp/debug  gdb -k kernel.debug
 GNU gdb 5.2.1 (FreeBSD)
 Copyright 2002 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain
 conditions.
 Type "show copying" to see the conditions.
 There is absolutely no warranty for GDB.  Type "show warranty" for details.
 This GDB was configured as i386-undermydesk-freebsd...
 (kgdb) list *(g_dev_strategy+29)

This is almost certainly the wrong function.  At the very least you
should look at the arguments passed to it.

 (kgdb) list *(launch_requests+448)
 No symbol launch_requests in current context.
 (kgdb) list *(vinumstart+2b2)
 No symbol vinumstart in current context.
 (kgdb)

Read the links I just sent you.  You haven't loaded the Vinum symbols.

 If anyone wants to take a look at this themselves I've put the compressed
 (gzip) debug-kernel available on
 http://eivind.aminor.no/debug/kernel.debug.gz
 NOTE! It's approx. 13MB compressed!

The kernel's not much use by itself.  

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 17:54:03 -0700, Terry Lambert wrote:
 Eivind Olsen wrote:
 (kgdb) list *(launch_requests+448)
 No symbol launch_requests in current context.
 (kgdb) list *(vinumstart+2b2)
 No symbol vinumstart in current context.
 (kgdb)

 If anyone wants to take a look at this themselves I've put the compressed
 (gzip) debug-kernel available on
 http://eivind.aminor.no/debug/kernel.debug.gz
 NOTE! It's approx. 13MB compressed!

 If this is repeatable for you, it's recommended that you compile
 Vinum statically into your kernel, so that you can look at the
 other symbols in the traceback and obtain source lines for them,
 as well.

No.  It is explicitly discouraged.

 It may be that this will be debuggable without that information, but
 in my experience with similar problems, without a list of arguments
 to the functions from a live remote debug session and/or a
 crashdump, the problem is going to have to be found by an engineer
 eyeballing the call graph and seeing how that particular line could
 end up with a NULL in bp2 or bp.

Terry hasn't read the debug instructions.  You can load symbols from
klds.  See the links I pointed to.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 17:56:49 -0700, Terry Lambert wrote:
 Greg 'groggy' Lehey wrote:
 You don't actually need a crash dump to debug a stack traceback.

 Great!  So you know the answer?  Please submit a patch.

 Seriously, this is nonsense.  Yes, it's a null pointer dereference.
 What?

 That is precisely what doing what I suggested discovers, Greg.

Yes, that's what you said already.

 If you haven't seen his response posting:

I saw it and explained why it didn't help.

 Clearly, bp2 or bp is NULL at the time of the dereference.

 Why?

 Programmer error.  Either bp2 or bp is a NULL pointer.

You're repeating yourself.

 How do you fix it?

 It depends on the root cause.

*bingo*  Here you are having found the first (obvious) step and acting
as if the problem has been solved.

 I really can't answer it

OK, why don't you either:

1.  Find a way to answer it, or
2.  Keep quiet.

You're just confusing the issue here.

 Finding the first step doesn't solve the problem.

 No.  Finding the first step is *necessary* to solving the problem,
 but you are entirely correct in pointing out that it's not in
 itself *sufficient*.

 But it's one step farther along than he was.  I didn't see anyone
 else helping him take that first step, so I did.

Sorry, I don't hack in the middle of the night.  If you had read the
documentation at your disposal, you'd have discovered a lot of help,
and also that this is a known problem that crops up sporadically, and
that so far we can't find out why.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 18:06:36 -0700, Terry Lambert wrote:
 Greg 'groggy' Lehey wrote:
 Please take a look at an older thread named (IIRC) vinum or geom bug?
 Greg asked for special debug output, but it never happened again for me.
 A real murphy bug - it happened on three machines once a day and after
 Greg's response nothing happened over weeks.

 This is the real issue.  Until you supply the information I ask for in
 the man page or at http://www.vinumvm.org/vinum/how-to-debug.html,
 only Terry can help you.

 This is BS, Greg.

 I deal with about a traceback every other day, and sometimes as
 high as 5 in a single day, if it's a busy day for it.

Stack traces are pretty common stuff.  Your point?

 The information I gave him gets him to lines of source code, instead
 of just function names with strange hexadecimal numbers that resolve
 to instruction offsets that may be specific to his compile flags,
 date of checkout of the sources from CVS, etc..

The first step of the link above does the same thing.  But it's only
the first step.

 I don't know about you, but I can't easily write assembly
 instructions to tape, run the tape through my teeth, and read
 the bits using my dental fillings.

Terry, why don't you come to my debug tutorial at the BSDCon next
month?  I'll show you how to do this properly.  I'm not asking for
people to interpret hex.  I'm asking for people, you included, to find
out what debugging help is available.

 If it's a NULL pointer dereference, the place to find it is by
 turning on what debugging there is, and, if that fails, which it
 probably will,

No, that will find the null pointer dereference pretty quickly.

 by eyeballing the lines of source code in question and understanding
 the code around it well enough that you can tell *how* a pointer
 there could be NULL.  My instructions *get* him those lines of
 source.

You obviously still haven't read the reference.  Do that first, and
come back when you have either understood things or are having
difficulty understanding.  But don't shoot off your mouth without
knowing what's going on.

 If you'll notice from his followup posting of the source in
 question, Vinum is loaded as a module, and it's the FreeBSD code
 that Vinum calls, not Vinum, that's causing the crash.

The bug is almost certainly in Vinum.

 There's no reason to be paranoid about your baby with me; unlike
 some people, personally I like Vinum, so relax and realize that I'm
 not trying to blame your code by trying to help him squeeze more
 information out of the data he *is* able to gather.

This has nothing to do with being paranoid about babies.  This has to
do with people shooting off their mouths in a public forum without
bothering to check details first.

Greg
--
See complete headers for address and phone numbers



Re: Yet another crash in FreeBSD 5.1

2003-08-02 Thread Greg 'groggy' Lehey

On Saturday,  2 August 2003 at 18:36:24 -0700, Terry Lambert wrote:
 Greg 'groggy' Lehey wrote:
 The information I gave him gets him to lines of source code, instead
 of just function names with strange hexadecimal numbers that resolve
 to instruction offsets that may be specific to his compile flags,
 date of checkout of the sources from CVS, etc..

 The first step of the link above does the same thing.  But it's only
 the first step.
 by eyeballing the lines of source code in question and understanding
 the code around it well enough that you can tell *how* a pointer
 there could be NULL.  My instructions *get* him those lines of
 source.

 You obviously still haven't read the reference.  Do that first, and
 come back when you have either understood things or are having
 difficulty understanding.  But don't shoot off your mouth without
 knowing what's going on.

 I read the reference.

 How does it apply in cases like this one, where you don't have a
 vmcore file?

You don't seem to have read the reference very well.  It also asks for
other supporting information.  That's the most important thing at the
moment.  I know that because I've been there before, and I've looked
at a number of these dumps: it's almost certainly related to something
he's doing which is not normal.  You don't know that, and that's
excusable, but it's not excusable that after four or five requests,
you still haven't RTFM'd.

 The way I would approach finding this, with only:

 1)The line of code where the failure occurred
 2)The stack traceback, with no arguments
 3)The sources for the code in the stack traceback

 would be to eyeball the code in #1, and try to figure out how
 I could get to that point with that pointer having a NULL value,
 given my a priori knowledge of the forward call graph.

You have that?

 I would examine every intermediate conditional and function call
 that could affect the value of the pointer and cause it to be NULL
 at the point in question.

Go for it.  Once I get the log files, I'll start there.

 One of the details I wish you would check is whether or not he has a
 vmcore file, or the ability to get one...

We'll address that issue when it becomes necessary.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Lucent IBSS mode doesn't work in -CURRENT?

2003-07-30 Thread Greg 'groggy' Lehey

Earlier this month I sent a message saying that my wireless card
(Orinoco) doesn't work at all any more.  In the meantime, I've
narrowed the problem down to IBSS (ad-hoc) mode: it works fine in
BSS (base station) mode.  I'd like to know if *anybody* is using IBSS
(maybe with Orinoco cards) on a -CURRENT newer than about mid-May.

Here's a summary of what I see:

It happens on two different cards with different firmware.  The
ifconfig and wicontrol outputs look identical modulo MAC address and
IBSS channel.
  
wi0: flags=8802<BROADCAST,SIMPLEX,MULTICAST> mtu 1500
ether 00:02:2d:04:09:3a
media: IEEE 802.11 Wireless Ethernet autoselect (none)
ssid 
stationname FreeBSD WaveLAN/IEEE node
channel -1 authmode OPEN powersavemode OFF powersavesleep 100
wepmode OFF weptxkey 1
 
NIC serial number:  [  ]
Station name:   [ FreeBSD WaveL ]
SSID for IBSS creation: [  ]
Current netname (SSID): [  ]
Desired netname (SSID): [  ]
Current BSSID:  [ 00:00:00:00:00:00 ]
Channel list:   [ 7ff ]
IBSS channel:   [ 3 ]
Current channel:[ 65535 ]
Comms quality/signal/noise: [ 0 0 0 ]
Promiscuous mode:   [ Off ]
Process 802.11b Frame:  [ Off ]
Intersil-Prism2 based card: [ 0 ]
Port type (1=BSS, 3=ad-hoc):[ 1 ]
MAC address:[ 00:02:2d:04:09:3a ]
TX rate (selection):[ 0 ]
TX rate (actual speed): [ 0 ]
RTS/CTS handshake threshold:[ 2312 ]
Create IBSS:[ Off ]
Access point density:   [ 1 ]
Power Mgmt (1=on, 0=off):   [ 0 ]
Max sleep time: [ 100 ]
WEP encryption: [ Off ]
TX encryption key:  [ 1 ]
Encryption keys:[  ][  ][  ][  ]
 
wi0: Lucent Technologies WaveLAN/IEEE at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: 802.11 address: 00:02:2d:04:09:3a
wi0: using Lucent Technologies, WaveLAN/IEEE
wi0: Lucent Firmware: Station (6.6.1)
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 
wi0: Lucent Technologies WaveLAN/IEEE at port 0x100-0x13f irq 11 function 0 config 1 on pccard1
wi0: 802.11 address: 00:02:2d:1e:d9:60
wi0: using Lucent Technologies, WaveLAN/IEEE
wi0: Lucent Firmware: Station (6.16.1)
wi0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
 
When I run dhclient against the first card, I don't get a connection,
and the other end doesn't see any data traffic, but it finds the
network:
 
wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet6 fe80::202:2dff:fe04:93a%wi0 prefixlen 64 scopeid 0x4
inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
ether 00:02:2d:04:09:3a
media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
status: associated
ssid FOOXX 1:FOOXX
stationname FreeBSD WaveLAN/IEEE node
channel 3 authmode OPEN powersavemode OFF powersavesleep 100
wepmode OFF weptxkey 1
 
I had guessed that it might be turning WEP on without saying so, but
setting WEP on at both ends didn't help either.

The second card is much worse than the first: when I try to start
dhclient against it, I get the following messages:

  wi0: timeout in wi_cmd 0x0002; event status 0x8080
  wi0: timeout in wi_cmd 0x0121; event status 0x8080
  wi0: wi_cmd: busy bit won't clear.

This last one continues forever.  At least the keyboard is locked, so
I can't do anything (not even get into ddb, which might have been
useful).  While trying to power down I got these messages:

  wi0: failed to allocate 2372 bytes on NIC.
  wi0: tx buffer allocateion failed (error 12)

After that, it continued until I finally managed to power down.

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: We have ath, now what about Broadcom?

2003-07-27 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at 11:00:40 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 M. Warner Losh [EMAIL PROTECTED] writes:
 The reason I keep saying that is that nobody knows for sure.  Nobody
 has reverse engineered anything, got sued and won (or lost).  Just

 However, there are one or two cases that are close to relevant working
 their ways through the courts.  Since they are in different districts,
 the answer is different depending on where you live in the US.

Or *whether* you live in the US.  There's a very good reason nobody's
ever been sued for reverse engineering in Australia: it's not illegal
(which may be a different statement from saying it's legal).  That
gets back to the original question: is it legal to use reverse
engineered software in the USA?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at 22:18:59 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 Presuming that it's the ROM driver, I get this in the dmesg I posted:
 pnpbios: Bad PnP BIOS data checksum

 That's likely the problem.  However, PnP BIOS information isn't the
 same thing that the orm[sic] driver probes for.

They look related.  I've now found the orm output:

  orm0: Option ROMs at iomem 0xe-0xe3fff,0xdf800-0xd,0xd-0xd17ff,0xc-0xcefff on isa0

The last one is the video BIOS.  It's interesting to note that it
doesn't report the 4 kB BIOS at 0xcf000, which suggests that at this
point the 16 kB area is already unmapped.  

I've worked around the problem by compiling the video BIOS into the X
server and not trying to access the BIOS in the machine.  Obviously
not a solution, but it works for the moment.  I'd really like to track
down the problem.  Does anybody have an idea?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Sunday, 27 July 2003 at 21:42:35 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Saturday, 26 July 2003 at 22:18:59 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 Presuming that it's the ROM driver, I get this in the dmesg I posted:
 pnpbios: Bad PnP BIOS data checksum

 That's likely the problem.  However, PnP BIOS information isn't the
 same thing that the orm[sic] driver probes for.

 They look related.  I've now found the orm output:

   orm0: Option ROMs at iomem 0xe-0xe3fff,0xdf800-0xd,0xd-0xd17ff,0xc-0xcefff on isa0

 The last one is the video BIOS.  It's interesting to note that it
 doesn't report the 4 kB BIOS at 0xcf000, which suggests that at this
 point the 16 kB area is already unmapped.

 Hmm.  The list comes from scanning the ISA HOLE for certain memory
 signatures.  These signatures have a length in them that say I'm a
 rom that's X long.

Sure.  The data at offset 0xc are:

C000:  55 AA 78 E9 44 06 00 00-00 00 00 00 00 00 00 00   U.x.D...

The 0xaa55 is the BIOS signature (Here be a BIOS), and the 0x78 is
the length byte (120 sectors, or 60 kB).  That's how orm0 knows the
end address.
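
A minimal sketch of the header check described above, written in Python
and not taken from the orm driver.  It assumes the first two bytes are
the 0x55 0xAA signature and the third byte is a length in 512-byte
units.  The function name and the base addresses 0xC0000 and 0xCF000
are assumptions made for illustration.

```python
ROM_SIGNATURE = b"\x55\xaa"   # reads as 0xaa55 when taken as a little-endian word
BLOCK_SIZE = 512              # the length byte counts 512-byte units


def parse_option_rom_header(header: bytes, base: int):
    """Return (length_in_bytes, last_address) for a ROM header, or None.

    `base` is the physical address the header was read from (hypothetical
    parameter; the caller has to know where the bytes came from).
    """
    if len(header) < 3 or header[:2] != ROM_SIGNATURE:
        return None
    length = header[2] * BLOCK_SIZE
    return length, base + length - 1


# First bytes of the two dumps quoted in this thread; base addresses assumed.
video = bytes.fromhex("55 AA 78 E9 44 06")
second = bytes.fromhex("55 AA 08 E8 6D 0B")

for base, header in ((0xC0000, video), (0xCF000, second)):
    length, last = parse_option_rom_header(header, base)
    print("ROM at %#x: %d bytes, ends at %#x" % (base, length, last))
```

With a length byte of 0x78 this gives 61440 bytes (60 kB) and an end
address of 0xcefff, which agrees with the last range in the orm0 line.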

 I don't think that it suggests that things are 'unmapped'...

If the area between 0xcc000 and 0xc had been mapped, orm0 would
have found this too:

C000:F000  55 AA 08 E8 6D 0B CB 11-FE 02 00 00 00 00 00 00   U...m...

 I've worked around the problem by compiling the video BIOS into the X
 server and not trying to access the BIOS in the machine.  Obviously
 not a solution, but it works for the moment.  I'd really like to track
 down the problem.  Does anybody have an idea?

 I don't, I'm sorry.

Understood.  I was hoping that somebody else might have some ideas.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Sunday, 27 July 2003 at 22:03:57 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 Sure.  The data at offset 0xc are:

 C000:  55 AA 78 E9 44 06 00 00-00 00 00 00 00 00 00 00   U.x.D...

 The 0xaa55 is the BIOS signature (Here be a BIOS), and the 0x78 is
 the length byte (120 sectors, or 60 kB).  That's how orm0 knows the
 end address.

 I don't think that it suggests that things are 'unmapped'...

 If the area between 0xcc000 and 0xc had been mapped, orm0 would
 have found this too:

 C000:F000  55 AA 08 E8 6D 0B CB 11-FE 02 00 00 00 00 00 00   U...m...

 08 - 4k

Correct.  It should have shown a BIOS from 0xcf000 to 0xc

 It could also be that there's a bug in orm that's missing it...

Sure, but given the other indications, that's not so likely.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
 Where are you getting the data?  A windows tool?

If you're talking about the BIOS contents I'm printing, yes, I'm using
a Microsoft tool called DEBUG (which has been around since before
Microsoft bought DOS :-).

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Sunday, 27 July 2003 at 22:17:32 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
 Where are you getting the data?  A windows tool?

 If you're talking about the BIOS contents I'm printing, yes, I'm using
 a Microsoft tool called DEBUG (which has been around since before
 Microsoft bought DOS :-).

 I don't suppose that you could use FreeBSD's /dev/mem + od?

Yup, can do.

  # dd if=/dev/mem bs=64k skip=12 count=1 | hd | less

    55 aa 78 e9 44 06 00 00  00 00 00 00 00 00 00 00  |U.x.D...|
  0010  00 00 00 00 00 00 00 00  68 01 00 00 00 00 49 42  |h.IB|
  ...
  bff0  04 03 80 00 0c 00 00 00  20 00 10 0b 3e 00 02 40  | .@|
  c000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ||
  *
  0001

That's pretty much what I expected.  Up to offset bff0, it's identical
with the Microsoft dump.
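
For comparison, a rough Python equivalent of the dd | hd pipeline
above.  This is only a sketch: the helper names are made up, the output
imitates the hd layout but does not collapse repeated lines, and a
BytesIO object stands in for /dev/mem so that it runs without root.

```python
import io


def read_window(stream, block_size, skip, count=1):
    """Mimic dd bs=<block_size> skip=<skip> count=<count> on a seekable stream."""
    stream.seek(block_size * skip)
    return stream.read(block_size * count)


def hexdump_line(offset, chunk):
    """Format up to 16 bytes as offset, two groups of hex bytes, and ASCII."""
    left = " ".join("%02x" % b for b in chunk[:8])
    right = " ".join("%02x" % b for b in chunk[8:16])
    text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
    return "%08x  %-23s  %-23s  |%s|" % (offset, left, right, text)


# Stand-in for physical memory: zeros up to 0xC0000, then a ROM header.
fake_mem = io.BytesIO(bytes(0xC0000) + bytes.fromhex("55aa78e94406") + bytes(10))

window = read_window(fake_mem, 64 * 1024, 12)
for off in range(0, len(window), 16):
    print(hexdump_line(off, window[off:off + 16]))
```

On a real system the stream would be open("/dev/mem", "rb"), which
needs appropriate privileges.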

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-27 Thread Greg 'groggy' Lehey

On Sunday, 27 July 2003 at 22:32:42 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Sunday, 27 July 2003 at 22:17:32 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Sunday, 27 July 2003 at 22:11:29 -0600, M. Warner Losh wrote:
 Where are you getting the data?  A windows tool?

 If you're talking about the BIOS contents I'm printing, yes, I'm using
 a Microsoft tool called DEBUG (which has been around since before
 Microsoft bought DOS :-).

 I don't suppose that you could use FreeBSD's /dev/mem + od?

 Yup, can do.

 dd if=/dev/mem bs=64k skip=12 count=1 | hd | less

     55 aa 78 e9 44 06 00 00  00 00 00 00 00 00 00 00  |U.x.D...|
   0010  00 00 00 00 00 00 00 00  68 01 00 00 00 00 49 42  |h.IB|
   ...
   bff0  04 03 80 00 0c 00 00 00  20 00 10 0b 3e 00 02 40  | .@|
   c000  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  ||
   *
   0001

 That's pretty much what I expected.  Up to offset bff0, it's identical
 with the Microsoft dump.

 Shouldn't you be looking at 0x000c instead of 0xc000?

Yes, I am.  Look at the calculations in the dd above: skip 12 blocks
of 64 kB, or 0xc.  If you mean the output of Microsoft's DEBUG,
that's in 8086 real mode, segment:offset.  The segment registers are
logically shifted 4 bits to the left, so C000: is 0xc.
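
The two calculations can be checked with a few lines of Python.  This
is a sketch of the arithmetic only; the function names are invented for
the example.

```python
def real_mode_to_physical(segment: int, offset: int) -> int:
    """8086 real mode: shift the segment left 4 bits and add the offset.

    The result is truncated to 20 bits, as on the original 8086.
    """
    return ((segment << 4) + offset) & 0xFFFFF


def dd_skip_to_physical(block_size: int, skip: int) -> int:
    """Byte offset at which dd starts reading for the given bs= and skip=."""
    return block_size * skip


# C000:0000 as shown by DEBUG, and bs=64k skip=12 as given to dd.
print(hex(real_mode_to_physical(0xC000, 0x0000)))
print(hex(dd_skip_to_physical(64 * 1024, 12)))
```

Both lines print the same physical address, so the dd command and the
DEBUG dump are looking at the same memory.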

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Mapping Video BIOS?

2003-07-26 Thread Greg 'groggy' Lehey

I've spent the last couple of days tracking down a problem starting X
on a Dell Inspiron 5100.  I've got as far as discovering that the
video BIOS is not being completely mapped: it's 60 kB long, but only
48 kB are being mapped into memory.  To make matters worse, the
machine doesn't have a serial port, so I can't apply a kernel debugger
to find out what's going on.

Can anybody point me in the right direction?  Where should I be
looking for this?  Is this memory mapped permanently, or is it only
during X startup?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-26 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at  9:41:14 +0100, Bruce M Simpson wrote:
 On Sat, Jul 26, 2003 at 05:32:17PM +0930, Greg 'groggy' Lehey wrote:
 Can anybody point me in the right direction?  Where should I be
 looking for this?  Is this memory mapped permanently, or is it only
 during X startup?

 The video BIOS is usually mapped by system BIOS into real memory to
 begin with, so it should be just sitting there.  There are usually northbridge
 chipset registers for dealing with this sort of thing.

 The SMM mode might reuse that window, though, but generally this is hidden
 from non-SMM mode applications.

 You're in luck - been rebuilding X, so have xc tarballs handy.

 The XFree86 code responsible is:
xc/programs/Xserver/hw/xfree86/int10

Yup, I've been playing around with it.  I currently have my arms in
xf86ExtendedInitInt10, which does the mapping.  It tries to map 256 kB
of memory, and I suppose it does, for some definition:

(II) RADEON(0): mapped system memory at 0xc, len 0x4, video BIOS offset 0xc, to 0x28368000

But at 0xcc00, I get:

(gdb) x/20x 0x28373ff0
0x28373ff0: 0x00800304  0x000c  0x0b100020  0x4002003e
0x28374000: 0x  0x  0x  0x
0x28374010: 0x  0x  0x  0x

I've looked in the same space with Microsoft, which says:

C000:BFF0  04 03 80 00 0C 00 00 00-20 00 10 0B 3E 00 02 40    .@
C000:C000  00 2E 05 01 06 10 40 01-90 01 02 97 01 45 01 0D   [EMAIL PROTECTED]
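
To line the two dumps up: gdb prints 32-bit little-endian words, DEBUG
prints single bytes, and the gdb addresses are in the X server's
mapping, not physical memory.  The following Python sketch shows the
conversion.  The mapping base 0x28368000 comes from the log line above;
the physical base 0xC0000 and the helper names are assumptions.

```python
import struct


def words_le(raw: bytes):
    """Interpret raw bytes as 32-bit little-endian words, as gdb's x/x does on i386."""
    n = len(raw) // 4
    return list(struct.unpack("<%dI" % n, raw[:n * 4]))


def mapped_to_physical(vaddr: int, map_base: int, phys_base: int) -> int:
    """Translate an address inside the mapping back to a physical address."""
    return vaddr - map_base + phys_base


# The row DEBUG shows at C000:BFF0.
row = bytes.fromhex("04 03 80 00 0C 00 00 00 20 00 10 0B 3E 00 02 40")
print(["%#010x" % w for w in words_le(row)])

# Where gdb's 0x28373ff0 sits in physical memory (physical base assumed).
print(hex(mapped_to_physical(0x28373FF0, 0x28368000, 0xC0000)))
```

The four words match the first gdb row, so the two tools agree up to
that point and differ only in the row that follows.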

 Some drivers like to call VBE via int10h, so this module acts as a bridge.
 It just memcpy()'s the ROM and uses various methods, depending on the
 compilation target, to call int10h.

 Is the onboard video AGP/PCI?

Intel 82845, if that's the correct answer.  I've put the dmesg up at
http://www.lemis.com/grog/Inspiron/dmesg.boot.

 It is possible that the device isn't reporting its memory window in
 the ROM BAR correctly. I've seen this happen with some low-end
 network cards before.

I could believe that, but I think we have a different problem here:
since it's mapping up to the end of low memory.  My guess is that
something else shares this space, and that it has been turned off.
I'm going to carry on investigating, but if anybody else recognizes
the problem, I'd be interested to hear from you.

 Try my tools at this URL to check this:
 http://www.incunabulum.com/code/projects/pci/freebsd/

Thanks, I'll try that anyway.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-26 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at 11:27:06 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 machine doesn't have a serial port, so I can't apply a kernel debugger
 to find out what's going on.

 Does it have a firewire port?

Yes.  How can I use that?

I had also expected that you could shed some light on the BIOS mapping
issue.  Since my last message I've become pretty sure that it must be
something to do with the chip set setup.  Is it possible that we're
not mapping the entire area 0xc to 0xf?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-26 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at 18:44:43 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Saturday, 26 July 2003 at 11:27:06 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 machine doesn't have a serial port, so I can't apply a kernel debugger
 to find out what's going on.

 Does it have a firewire port?

 Yes.  How can I use that?

 If you have a second machine with firewire, then you can use the
 firewire port as your console.  Look at /usr/ports/devel/dcons.  It is
 one of the under-publicized cool features from Japan (Thanks
 Shimokawa-san!).

Ah, good stuff.  I'll have to check if it also works with gdb.
Unfortunately, this is my only machine with firewire.  I was wondering
if there were USB/conventional serial converters that I could use.

 I had also expected that you could shed some light on the BIOS mapping
 issue.  Since my last message I've become pretty sure that it must be
 something to do with the chip set setup.  Is it possible that we're
 not mapping the entire area 0xc to 0xf?

 I'm not sure what you mean by this question.  Since OLDCARD works, and
 requires read/write access to that physical memory range, I doubt that
 it is unmapped.

I'm not sure at what level.  I suspect that something in the chipset
is turning off that area of memory, or mapping something else to it.
The dump from Microsoft shows that there's another BIOS at 0xcf000,
but what I have mapped in memory shows only 0xff up to address
0xd, where I find another BIOS signature:

0x28377fe0: 0x  0x  0x  0x
0x28377ff0: 0x  0x  0x  0x
0x28378000: 0xe80caa55  0x4ecb14c8  0x033b  0x
0x28378010: 0x  0x0020  0x00600040  0x90c08b2e
0x28378020: 0x49444e55  0xea16  0x0c9d0201  0xad100800

 It may be the case that we aren't setting things up so that XFree86
 can call the BIOS, but given that we used PCIBIOS before ACPI, it
 seems unlikely.

Well, this is a new laptop, so it's possible that something *is*
getting set up incorrectly.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Mapping Video BIOS?

2003-07-26 Thread Greg 'groggy' Lehey

On Saturday, 26 July 2003 at 19:47:50 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 On Saturday, 26 July 2003 at 18:44:43 -0600, M. Warner Losh wrote:
 In message: [EMAIL PROTECTED]
 Greg 'groggy' Lehey [EMAIL PROTECTED] writes:
 I had also expected that you could shed some light on the BIOS mapping
 issue.  Since my last message I've become pretty sure that it must be
 something to do with the chip set setup.  Is it possible that we're
 not mapping the entire area 0xc to 0xf?

 I'm not sure what you mean by this question.  Since OLDCARD works, and
 requires read/write access to that physical memory range, I doubt that
 it is unmapped.

 I'm not sure at what level.  I suspect that something in the chipset
 is turning off that area of memory, or mapping something else to it.
 The dump from Microsoft shows that there's another BIOS at 0xcf000,
 but what I have mapped in memory shows only 0xff up to address
 0xd, where I find another BIOS signature:

 0x28377fe0: 0x  0x  0x  0x
 0x28377ff0: 0x  0x  0x  0x
 0x28378000: 0xe80caa55  0x4ecb14c8  0x033b  0x
 0x28378010: 0x  0x0020  0x00600040  0x90c08b2e
 0x28378020: 0x49444e55  0xea16  0x0c9d0201  0xad100800

 Typically, there are a number of different ROM sections.  The orm
 driver searches for these things out.  Does it report anything

Presuming that it's the ROM driver, I get this in the dmesg I posted:

pnpbios: Bad PnP BIOS data checksum

That's pretty much the same problem reported by the X server.

Where would I go from there?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Can't connect to wireless network with recent -CURRENT

2003-07-05 Thread Greg 'groggy' Lehey

On Thursday,  3 July 2003 at 16:33:30 +0200, Harti Brandt wrote:
 On Thu, 3 Jul 2003, M. Warner Losh wrote:

 MWLIn message: [EMAIL PROTECTED]
 MWLHarti Brandt [EMAIL PROTECTED] writes:
 MWL: I think the same problem was reported by Rob Holmes two weeks ago and by
 MWL: me (although with lesser detail) yesterday. I converted my kernel from
 MWL: OLDBUS to NEWBUS and now one out of four or five tries the card works, but
 MWL: this is really annoying. I have an Inspiron 8200 and an Avaya (that is a
 MWL: Lucent) card. I have found no solution until now.
 MWL
 MWLThe lucent problem is well known and has been known for a long time.
 MWLIt was broken between 5.0 and 5.1 for some people with lucent cards
 MWL(not me and mine).  Enabling WITNESS seems to help, but that likely
 MWLmeans that it is a race that the overhead of WITNESS tickles in
 MWLcertain ways.  Sam indicated he'd try to find some time to fix it.
 MWLThere's something subtle going on with the lucent cards, and I've
 MWLgiven up trying to find it.  I just don't have the time.

 Updating the firmware from www.agere.com to 8.72.1 has cured the problem
 (except for two messages from the kernel):

 Jul  3 16:09:05 harti kernel: wi0: bad alloc 204 != 201, cur 0 nxt 0
 Jul  3 16:09:09 harti kernel: wi0: bad alloc 208 != 205, cur 0 nxt 0

Hmm.  I'd look on that as a workaround, not a fix.  The driver
shouldn't become more sensitive towards microcode revisions.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Can't connect to wireless network with recent -CURRENT

2003-07-02 Thread Greg 'groggy' Lehey

I've just upgraded my laptop to a recent -CURRENT, and since then I've
been having a lot of network problems.  Here's a rough chronology:

- Machine is a Dell Inspiron 7500, which I've been using with releases
  4 and 5 of FreeBSD without problems for the last 3 years.  It's
  usually connected to my house 802.11b network, which is run by an
  old 486 in ad-hoc mode, no WEP.  I use DHCP to set up the
  connection.

- Things worked fine up to my last kernel:

  Jun 26 14:03:43 kondoparinga kernel: FreeBSD 5.0-CURRENT #0: Sun May 11 13:25:03 CST 2003

- On 28 June, I upgraded to the then -CURRENT.  I had a lot of trouble
  getting things working, including the following from the gateway
  machine:

   Jun 29 09:35:15 air-gw dhcpd: DHCPREQUEST for 192.109.197.199 from 00:02:2d:04:09:3a via wi0
   Jun 29 09:35:15 air-gw dhcpd: DHCPACK on 192.109.197.199 to 00:02:2d:04:09:3a via wi0
   Jun 29 09:35:16 air-gw dhcpd: DHCPDECLINE on 192.109.197.199 from 00:02:2d:04:09:3a via wi0
   Jun 29 09:35:16 air-gw dhcpd: Abandoning IP address 192.109.197.199: declined.
   Jun 29 09:35:16 air-gw dhcpd: DHCPDISCOVER from 00:02:2d:04:09:3a via wi0
   Jun 29 09:35:16 air-gw dhcpd: DHCPOFFER on 192.109.197.199 to 00:02:2d:04:09:3a via wi0

  Nothing was mentioned in the log files on the laptop.

- I managed to connect, however, and things worked for a while, but
  the machine kept freezing.  I tried with a 100 Mb/s Ethernet card,
  and it had problems too.  With both network cards, it reported
  various error messages which I didn't write down because I thought
  they would be logged; unfortunately they weren't.  The one from wi0
  is still occurring:

  wi0: bad alloc 3b4 != ff, cur 0 nxt 0

- I built a new kernel and world on 1 July.  Since then I haven't had
  any trouble with the system freezing up, but I am no longer able
  to connect at all with the wireless card.  After booting, I get:

   wi0: flags=8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
inet6 fe80::202:2dff:fe04:93a%wi0 prefixlen 64 scopeid 0x3 
inet 0.0.0.0 netmask 0xff00 broadcast 255.255.255.255
ether 00:02:2d:04:09:3a
media: IEEE 802.11 Wireless Ethernet autoselect (DS/2Mbps)
status: associated
ssid Netname 1:Netname
stationname FreeBSD WaveLAN/IEEE node
channel 3 authmode OPEN powersavemode OFF powersavesleep 100
wepmode OFF weptxkey 1

   However, no traffic comes through.  

It's pretty clear that it's this laptop: I have other machines on the
net which work without problems, and this machine also works if I boot
it with 4.8-STABLE.

Any thoughts?

Greg
--
Finger [EMAIL PROTECTED] for PGP public key
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: vinum and/or geom panic on alpha

2003-06-13 Thread Greg 'groggy' Lehey

On Tuesday, 10 June 2003 at 14:05:11 +0200, Bernd Walter wrote:

 fatal kernel trap:

 Stopped at  g_dev_strategy+0x44:stq t0,0x20(v0) 0x20  
 t0=0x1a61da400,v0=0x0
 db trace
 g_dev_strategy() at g_dev_strategy+0x44
 launch_requests() at launch_requests+0x390
 prologue botch: displacement 128
 frame size botch: adjust register offsets?
 vinumstart() at vinumstart+0x250
 prologue botch: displacement 64
 frame size botch: adjust register offsets?
 intr_n() at 0xccec340

Can you check the locals of launch_requests(), please?

Thanks
Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Problems building today's world

2003-06-02 Thread Greg 'groggy' Lehey

I've just cvsupped the latest -CURRENT, and it dies on me in
gnu/usr.bin/gperf/doc:

=== gnu/usr.bin/gperf/doc
c++ -O -pipe   -std=iso9899:1999  -I/usr/obj/src/FreeBSD/5-CURRENT-ZAPHOD/src/i386/legacy/usr/include -I/src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf/../../../contrib/gperf/lib -I/src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf -c /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:154,
 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:59,
 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.icc:27: syntax 
   error before `:' token
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:59,
 from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:150:1: unterminated #ifdef
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/options.h:32:1: unterminated #ifndef
In file included from /src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.cc:21:
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:55:1: unterminated #ifdef
/src/FreeBSD/5-CURRENT-ZAPHOD/src/contrib/gperf/src/bool-array.h:27:1: unterminated #ifndef
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src/gnu/usr.bin/gperf.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.
*** Error code 1

Stop in /src/FreeBSD/5-CURRENT-ZAPHOD/src.

The funny thing is that there's nothing obviously wrong with the
source files.  I suspect c++, which dates from:

-r-xr-xr-x  3 root  wheel  78708 May 22 17:38 /usr/bin/c++

Is there something I should be doing first?

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Problems building today's world

2003-06-02 Thread Greg 'groggy' Lehey

On Monday,  2 June 2003 at 10:54:06 +0930, Greg 'groggy' Lehey wrote:
 I've just cvsupped the latest -CURRENT, and it dies on me in
 gnu/usr.bin/gperf/doc:

*sigh*

Yes, of course I saw the dialogue between DES and obrien, and the
subsequent commit, so I re-supped and cvs updated and it still
happened.  But then, I should have updated the correct tree :-(

Sorry for the noise
Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Kernel panic - never had one before, what do I do?

2003-03-26 Thread Greg 'groggy' Lehey

On Wednesday, 26 March 2003 at 13:35:28 +, Jason Morgan wrote:
 I just got a panic. As I have never had one before, I don't know what to
 do. It's on another system so I don't have to reboot immediately (that
 would solve the problem temporarily, wouldn't it?) if someone would give
 me some advice, I could try to help debug it; however, as I'm not a
 coder (not a real one anyway), I don't know how much help I would be.

 It's a 5.0-CURRENT system, just installed and built last week. It
 panicked right after doing a source update (not a build, just cvsup).
 The panic error is as follows:

 panic: mtx_lock() of spin mutex vnode interlock @
 /usr/src/sys/kern/vfs_subr.c:3187

Take a look at http://www.lemis.com/texts/panic.txt or
http://www.lemis.com/texts/panic.pdf and tell me if that helps.  This
will be going into the new edition of The Complete FreeBSD in a few
days time, so I'm interested in getting something which is helpful.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: vinum broken by devstat changes?

2003-03-25 Thread Greg 'groggy' Lehey

On Tuesday, 25 March 2003 at 18:44:03 +0100, Hartmut Brandt wrote:

 Hi,

 when calling 'vinum start' it responds with

 usage: read drive [drive ...]

 from looking at the code, it appears that it cannot find the disk drives
 to read the configuration from.

 vinum read da0 da1

 just works.

 So what's the problem? (kernel and user land from today)

Check vinum(8), function vinum_start (in
/usr/src/sbin/vinum/commands.c).  It's possible that the changes have
broken some of the tests, probably of stat->device_type.  I can't
think it's too difficult to fix.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: SI_SUB_RAID and SI_SUB_VINUM

2003-03-24 Thread Greg 'groggy' Lehey

On Monday, 24 March 2003 at 19:07:56 -0500, Hiten Pandya wrote:
 Hi Gang!

 I was wondering, what's the point of making Vinum use a totally
 different SYSINIT type?  Isn't there a possibility it can just use
 SI_SUB_RAID?

Probably.  SI_SUB_VINUM was there first.

Greg
--
See complete headers for address and phone numbers


pgp0.pgp
Description: PGP signature

Re: Anyone working on fsck?

2003-03-17 Thread Greg 'groggy' Lehey

On Monday, 17 March 2003 at 22:39:02 +0100, Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Bakul Shah writes:

 UFS is the real problem here, not fsck.  Its tradeoffs for
 improving normal access latencies may have been right in the
 past but not for modern big disks.  The seek time & RPM have
 not improved very much in the past 20 years while disk
 capacity has increased by a factor of about 20,000 (and GB/$
 even more).  IMHO there is not much you can do at the fsck
 level -- you still have to visit all the cyl groups and what
 not.  Even a factor of 10 improvement in fsck means 36
 minutes which is far too long.

 Now, before we go off and design YABFS, can we just get real for
 a second ?

 I have been tending UNIX computers of all sorts for many years and
 there is one bit of wisdom that has yet to fail me:

   Every now and then, boot in single-user and run full fsck
   on all filesystems.

 If this had failed to be productive, I would have given up the
 habit years ago, but it is still a good idea it seems.

 Personally, I think background-fsck is close to the ideal situation
 since I can skip the boot in single-user part of the above
 prophylactic.

 If you start to implement any sort of journaling (that is what you
 talked about in your email), you might as well just stop right at
 the clean bit, and avoid the complexity.

 Optimizing fsck is a valid project, I just wish it would be somebody
 who would also finish the last 30% who would do it.

Poul-Henning, how can you justify the second half of that sentence?  I
take exception to the implications.  In case anybody is in any doubt,
I've heard you say this sort of thing about julian before.  Please
don't do it again.

This is without my core hat.  As most people here know, core has
warned you about this kind of behaviour multiple times before.  What I
say here in no way prejudices what core may decide to do about the
incident.

Greg
--
See complete headers for address and phone numbers



Software RAID caching? (was: Anyone working on fsck?)

2003-03-17 Thread Greg 'groggy' Lehey

On Monday, 17 March 2003 at 23:02:38 -0500, Jeff Roberson wrote:
 On Mon, 17 Mar 2003, Terry Lambert wrote:

 Jeff Roberson wrote:
 On Mon, 17 Mar 2003, Brooks Davis wrote:
 I am still interested in improvements to fsck since I'm planning to buy
 several systems with two 1.4TB IDE RAID5 arrays in them soon.

 For these types of systems doing a block caching layer with a prefetch
 that understands how many spindles there are would be a huge benefit.

 I call that layer Vinum or RAIDFrame, since that's a job I
 expect that code to do for me.  8-).

 They are not responsible for data caching.  Only informing the upper
 layers how many spindles they have.  Software RAID should be a transform
 only in my opinion.  There is no reason to have duplicate block caches in
 system memory.

Agreed.  Vinum doesn't cache.  There is one case, though, where it
could be argued that it's worthwhile, namely in RAID-[45] parity
blocks.
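
To illustrate the parity case: a minimal sketch, not Vinum code, of why a
small RAID-5 write touches the parity block.  It assumes one-byte "blocks"
and plain XOR parity; the function name parity_update and the sample values
are made up for the illustration.

```sh
#!/bin/sh
# Illustration only: one-byte "blocks", XOR parity as used by RAID-4/5.

# parity_update OLD_PARITY OLD_DATA NEW_DATA
# Print the new parity after overwriting one data block, without
# reading the other data blocks in the stripe.
parity_update() {
    echo $(( $1 ^ $2 ^ $3 ))
}

# Hypothetical stripe of three data blocks and its parity.
d0=$((0x5a)); d1=$((0x3c)); d2=$((0xf0))
parity=$(( d0 ^ d1 ^ d2 ))

# Overwrite d1.  The update needs the old data and the old parity.
new_d1=$((0x0f))
new_parity=$(parity_update "$parity" "$d1" "$new_d1")

# Cross-check against recomputing the parity over the whole stripe.
full_parity=$(( d0 ^ new_d1 ^ d2 ))
echo "incremental=$new_parity full=$full_parity"
```

Every small write needs the old parity block, so if that block is already
in memory, one disk read is saved.  That is the whole of the argument; it
says nothing in favour of caching data blocks.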

Greg
--
See complete headers for address and phone numbers



Re: Vinum R5

2003-03-15 Thread Greg 'groggy' Lehey

On Saturday, 15 March 2003 at 10:34:54 +0200, Vallo Kallaste wrote:
 On Sat, Mar 15, 2003 at 12:02:23PM +1030, Greg 'groggy' Lehey
 [EMAIL PROTECTED] wrote:

 -current, system did panic every time at the end of
 initialisation of parity (raidctl -iv raid?). So I used the
 raidframe patch for -stable at
 http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz
 Had to do some patching by hand, but otherwise works well.

 I don't think that problems with RAIDFrame are related to these
 problems with Vinum.  I seem to remember a commit to the head branch
 recently (in the last 12 months) relating to the problem you've seen.
 I forget exactly where it went (it wasn't from me), and in cursory
 searching I couldn't find it.  It's possible that it hasn't been
 MFC'd, which would explain your problem.  If you have a 5.0 machine,
 it would be interesting to see if you can reproduce it there.

 Yes, yes, the whole raidframe story was meant as information about
 the conditions I did the raidframe vs. Vinum testing on. Nothing to
 do with Vinum, besides that raidframe works and Vinum does not.

 Will it suffice to switch off power for one disk to simulate more
 real-world disk failure? Are there any hidden pitfalls for failing
 and restoring operation of non-hotswap disks?

 I don't think so.  It was more thinking aloud than anything else.  As
 I said above, this is the way I tested things in the first place.

 Ok, I'll try to simulate the disk failure by switching off the
 power, then.

I think you misunderstand.  I simulated the disk failures by doing a
stop -f.  I can't see any way that the way they go down can
influence the revive integrity.  I can see that powering down might
not do the disks any good.

Greg
--
See complete headers for address and phone numbers



Re: Vinum R5

2003-03-15 Thread Greg 'groggy' Lehey

On Saturday, 15 March 2003 at 23:56:24 +0100, Poul-Henning Kamp wrote:
 In message [EMAIL PROTECTED], Greg 'groggy' Lehey
  writes:

 Ok, I'll try to simulate the disk failure by switching off the
 power, then.

 I think you misunderstand.  I simulated the disk failures by doing a
 stop -f.  I can't see any way that the way they go down can
 influence the revive integrity.  I can see that powering down might
 not do the disks any good.

 Are you saying that you only tested Vinum's recovery with disks which
 had been cleanly shut down?

No.  stop -f doesn't shut down cleanly.  But I also tested with
powering down.  As you might expect, it didn't make much difference.

Greg
--
See complete headers for address and phone numbers



Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]

2003-03-14 Thread Greg 'groggy' Lehey

On Friday, 14 March 2003 at 10:05:28 +0200, Vallo Kallaste wrote:
 On Fri, Mar 14, 2003 at 01:16:02PM +1030, Greg 'groggy' Lehey
 [EMAIL PROTECTED] wrote:

 So I did. Borrowed two SCSI disks and a 50-pin cable. Things haven't
 improved a bit, I'm very sorry to say.

 Sorry for the slow reply to this.  I thought it would make sense to
 try things out here, and so I kept trying to find time, but I have to
 admit I just don't have it yet for a while.  I haven't forgotten, and
 I hope that in a few weeks time I can spend some time chasing down a
 whole lot of Vinum issues.  This is definitely the worst I have seen,
 and I'm really puzzled why it always happens to you.

 # simulate disk crash by forcing one arbitrary subdisk down
 # seems that vinum doesn't return values for command completion status
 # checking?
 echo Stopping subdisk.. degraded mode
 vinum stop -f r5.p0.s3  # assume it was successful

 I wonder if there's something relating to stop -f that doesn't happen
 during a normal failure.  But this was exactly the way I tested it in
 the first place.

 Thank you Greg, I really appreciate your ongoing effort to make
 Vinum a stable, trusted volume manager.
 I have to add some facts to the mix. Raidframe on the same hardware
 does not have any problems. The later tests I conducted were done
 under -stable, because I couldn't get raidframe to work under
 -current: the system panicked every time at the end of initialisation of
 parity (raidctl -iv raid?). So I used the raidframe patch for
 -stable at
 http://people.freebsd.org/~scottl/rf/2001-08-28-RAIDframe-stable.diff.gz
 Had to do some patching by hand, but otherwise works well.

I don't think that problems with RAIDFrame are related to these
problems with Vinum.  I seem to remember a commit to the head branch
recently (in the last 12 months) relating to the problem you've seen.
I forget exactly where it went (it wasn't from me), and in cursory
searching I couldn't find it.  It's possible that it hasn't been
MFC'd, which would explain your problem.  If you have a 5.0 machine,
it would be interesting to see if you can reproduce it there.

 Will it suffice to switch off power for one disk to simulate more
 real-world disk failure? Are there any hidden pitfalls for failing
 and restoring operation of non-hotswap disks?

I don't think so.  It was more thinking aloud than anything else.  As
I said above, this is the way I tested things in the first place.

Greg
--
See complete headers for address and phone numbers



Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]

2003-03-13 Thread Greg 'groggy' Lehey

On Saturday,  1 March 2003 at 20:43:10 +0200, Vallo Kallaste wrote:
 On Thu, Feb 27, 2003 at 11:53:02AM +0200, Vallo Kallaste vallo wrote:

 The vinum R5 and system as a whole were stable without
 softupdates. Only one problem remained after disabling softupdates:
 with the volume online and user I/O going on, rebuilding a failed disk
 corrupted the R5 volume completely.

 Yes, we've fixed a bug in that area.  It had nothing to do with soft
 updates, though.

 Oh, that's very good news, thank you! Yes, it had nothing to do with
 soft updates at all and that's why I had the "remained after" in the
 sentence.

 I don't know whether it is fixed, as I don't have the necessary hardware at
 the moment. The only way around was to quiesce the volume before
 rebuilding, umount it, and wait until the rebuild finished. I suggest
 an extensive testing cycle for everyone who's going to work with vinum
 R5. Concat, striping and mirroring have been a breeze, but not so with
 R5.

 IIRC the rebuild bug bit any striped configuration.

 Ok, I definitely had problems only with R5, but you certainly know
 much better what it was exactly. I'll need to borrow a 50-pin SCSI cable
 and test vinum again. Does it matter which version of FreeBSD I
 try it on? My home system runs -current of Feb 5, but if you suggest
 -stable for consistent results, I'll do it.

 So I did. Borrowed two SCSI disks and a 50-pin cable. Things haven't
 improved a bit, I'm very sorry to say.

Sorry for the slow reply to this.  I thought it would make sense to
try things out here, and so I kept trying to find time, but I have to
admit I just don't have it yet for a while.  I haven't forgotten, and
I hope that in a few weeks time I can spend some time chasing down a
whole lot of Vinum issues.  This is definitely the worst I have seen,
and I'm really puzzled why it always happens to you.

 # simulate disk crash by forcing one arbitrary subdisk down
 # seems that vinum doesn't return values for command completion status
 # checking?
 echo Stopping subdisk.. degraded mode
 vinum stop -f r5.p0.s3  # assume it was successful

I wonder if there's something relating to stop -f that doesn't happen
during a normal failure.  But this was exactly the way I tested it in
the first place.
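
On the status question in the script: a sketch of one way to avoid relying
on the exit status, by checking the state shown in the subdisk listing after
the stop.  The function name subdisk_state and the sample listing lines are
assumptions made for the illustration; the real listing format should be
checked before relying on this.

```sh
#!/bin/sh
# Illustration only.  Assumes the listing has one line per subdisk of the
# form "S <name> State: <state> ...", supplied on standard input.

# subdisk_state NAME
# Print the state recorded for subdisk NAME, or nothing if it is not listed.
subdisk_state() {
    awk -v name="$1" '
        $1 == "S" && $2 == name {
            for (i = 3; i < NF; i++)
                if ($i == "State:") { print $(i + 1); exit }
        }'
}

# Hypothetical listing text standing in for real output.
sample='S r5.p0.s2 State: up PO: 1024 kB Size: 2047 MB
S r5.p0.s3 State: down PO: 1536 kB Size: 2047 MB'

state=$(printf '%s\n' "$sample" | subdisk_state r5.p0.s3)
if [ "$state" = "down" ]; then
    echo "r5.p0.s3 is down, continuing in degraded mode"
else
    echo "r5.p0.s3 is not down (state: ${state:-unknown})" >&2
fi
```

In a test script the listing would come from the vinum list commands rather
than from the sample text, and the check would replace the "assume it was
successful" comment.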

Greg
--
See complete headers for address and phone numbers



Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]

2003-02-26 Thread Greg 'groggy' Lehey

On Friday, 21 February 2003 at 10:00:46 +0200, Vallo Kallaste wrote:
 On Thu, Feb 20, 2003 at 02:28:45PM -0800, Darryl Okahata
 [EMAIL PROTECTED] wrote:

 Vallo Kallaste [EMAIL PROTECTED] wrote:

 I'll second Brad's statement about vinum and softupdates
 interactions. My last experiments with vinum were more than half a
 year ago, but I guess it still holds. BTW, the interactions showed
 up _only_ on R5 volumes. I had a 6-disk (SCSI) R5 volume in a Compaq
 Proliant 3000 and the system was very stable before I enabled
 softupdates.. and of course after I disabled softupdates. In between
 there were crashes and nasty problems with the filesystem. Unfortunately
 it was a production system and I had no chance to experiment.

  Did you believe that the crashes were caused by enabling softupdates on
 an R5 vinum volume, or were the crashes unrelated to vinum/softupdates?
 I can see how crashes unrelated to vinum/softupdates might trash vinum
 filesystems.

 The crashes and anomalies with filesystem residing on R5 volume were
 related to vinum(R5)/softupdates combo.

Well, at one point we suspected that.  But the cases I have seen were
based on a misassumption.  Do you have any concrete evidence that
points to that particular combination?

 The vinum R5 and system as a whole were stable without
 softupdates. Only one problem remained after disabling softupdates:
 with the volume online and user I/O going on, rebuilding a failed disk
 corrupted the R5 volume completely.

Yes, we've fixed a bug in that area.  It had nothing to do with soft
updates, though.

 I don't know whether it is fixed, as I don't have the necessary hardware at
 the moment. The only way around was to quiesce the volume before
 rebuilding, umount it, and wait until the rebuild finished. I suggest
 an extensive testing cycle for everyone who's going to work with vinum
 R5. Concat, striping and mirroring have been a breeze, but not so with
 R5.

IIRC the rebuild bug bit any striped configuration.

Greg
--
See complete headers for address and phone numbers
Please note: we block mail from major spammers, notably yahoo.com.
See http://www.lemis.com/yahoospam.html for further details.



Re: Vinum R5 [was: Re: background fsck deadlocks with ufs2 and big disk]

2003-02-26 Thread Greg 'groggy' Lehey

On Friday, 21 February 2003 at  1:56:56 -0800, Terry Lambert wrote:
 Vallo Kallaste wrote:
 The crashes and anomalies with filesystem residing on R5 volume were
 related to vinum(R5)/softupdates combo. The vinum R5 and system as
 a whole were stable without softupdates. Only one problem remained
 after disabling softupdates: with the volume online and user I/O going
 on, rebuilding a failed disk corrupted the R5 volume completely.
 I don't know whether it is fixed, as I don't have the necessary hardware at
 the moment. The only way around was to quiesce the volume before
 rebuilding, umount it, and wait until the rebuild finished. I suggest
 an extensive testing cycle for everyone who's going to work with
 vinum R5. Concat, striping and mirroring have been a breeze, but not
 so with R5.

 I think this is an expected problem with a lot of concatenation,
 whether through Vinum, GEOM, RAIDFrame, or whatever.

Can you be more specific?  What you say below doesn't address any
basic difference between virtual and real disks.

 This comes about for the same reason that you can't mount -u
 to turn Soft Updates from off to on: Soft Updates does not
 tolerate dirty buffers for which a dependency does not exist, and
 will crap out when a pending dirty buffer causes a write.

I don't understand what this has to do with virtual disks.

 This could be fixed in the mount -u case for Soft Updates, and it
 can also be fixed for Vinum (et al.).

 The key is the difference between a mount -u vs. a umount ; mount,
 which comes down to flushing and invalidating all buffers on the
 underlying device, e.g.:

   vn_lock(devvp, LK_EXCLUSIVE | LK_RETRY, p);
   vinvalbuf(devvp, V_SAVE, NOCRED, p, 0, 0);
   error = VOP_CLOSE(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, p);
   error = VOP_OPEN(devvp, ronly ? FREAD : FREAD|FWRITE, FSCRED, p);
   VOP_UNLOCK(devvp, 0, p);

 ... Basically, after rebuilding, before allowing the mount to proceed,
 the Vinum (and GEOM and RAIDFrame, etc.) code needs to cause all the
 pending dirty buffers to be written.  This will guarantee that there
 are no outstanding dirty buffers at mount time, which in turn guarantees
 that there will be no dirty buffers that the dependency tracking in
 Soft Updates does not know about.

I don't understand what you're assuming here.  Certainly I can't see
any relevance to Vinum, RAIDframe or any other virtual disk system.

Greg
--
See complete headers for address and phone numbers
Please note: we block mail from major spammers, notably yahoo.com.
See http://www.lemis.com/yahoospam.html for further details.


