Re: SD card reader only works after a suspend/resume

2018-09-12 Thread Marius Strobl
On Fri, Sep 07, 2018 at 04:52:12PM +0200, Jakob Alvermark wrote:
> On 9/7/18 12:41 AM, Marius Strobl wrote:
> > On Thu, Sep 06, 2018 at 12:33:39PM +0200, Jakob Alvermark wrote:
> >> Hi,
> >>
> >>
> >> I discovered this by chance.
> >>
> >> The SD card reader in my laptop has never worked, but now I noticed it
> >> does after suspending and resuming.
> >>
> >> The controller is probed and attached on boot:
> >>
> >> sdhci_acpi1:  iomem
> >> 0x90a0-0x90a00fff irq 47 on acpi0
> >>
> >> But nothing happens if I put a card in. Unless I suspend and resume:
> >>
> >> mmc1:  on sdhci_acpi1
> >> mmcsd0: 32GB  at mmc1
> >> 50.0MHz/4bit/65535-block
> >>
> >> Then I can remove and replug cards and it seems to work just fine.
> > I believe that making SD card insertion/removal with the integrated
> > SDHCI controllers of newer Intel SoCs work out-of-the-box requires
> > support for ACPI GPE interrupts and ACPI GPIO events respectively to
> > be added to FreeBSD. Otherwise insertion/removal interrupts/events
> > aren't reported and polling the card present state doesn't generally
> > work as a workaround with these controllers either, unfortunately.
> > I'm not aware of anyone working on the former, though.
> >
> > Polling the card present state happens to work one time after SDHCI
> > initialization with these controllers which is why a card will be
> > attached when inserted as part of a suspend/resume cycle (resume of
> > mmc(4) had some bugs until some months ago, which probably explains
> > why that procedure hasn't worked as a workaround for you in the past).
> > Inserting the card before boot, unloading/loading sdhci_acpi.ko or
> > triggering detach/attach of sdhci_acpi(4) via devctl(8) should allow
> > a card to attach, too.
> 
> 
> If a card is inserted before booting it is not detected.
> 
> Removing and inserting card after boot is not detected unless I suspend 
> and resume.
> 
> After I have suspended and resumed once, cards are detected. Removals 
> and insertions are detected as they happen.

Okay, then you are seeing somewhat different behavior than I do. What
SoC model is this? Are you loading a GPIO controller driver such as
bytgpio(4) or chvgpio(4)? Doing so might be sufficient to kick ACPI
GPIO events into working, but it would still lack dependency information
between drivers (which might explain what you are experiencing if
sdhci_acpi1 attaches first) and some other bits needed to do it properly.
Also, could you please check whether a suspend/resume cycle of only
sdhci_acpi1 via devctl(8) kicks the card detection into working?
That test should indicate whether the firmware plays a role in making
the latter work.

Marius

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: SD card reader only works after a suspend/resume

2018-09-06 Thread Marius Strobl
On Thu, Sep 06, 2018 at 12:33:39PM +0200, Jakob Alvermark wrote:
> Hi,
> 
> 
> I discovered this by chance.
> 
> The SD card reader in my laptop has never worked, but now I noticed it 
> does after suspending and resuming.
> 
> The controller is probed and attached on boot:
> 
> sdhci_acpi1:  iomem 
> 0x90a0-0x90a00fff irq 47 on acpi0
> 
> But nothing happens if I put a card in. Unless I suspend and resume:
> 
> mmc1:  on sdhci_acpi1
> mmcsd0: 32GB  at mmc1 
> 50.0MHz/4bit/65535-block
> 
> Then I can remove and replug cards and it seems to work just fine.

I believe that making SD card insertion/removal with the integrated
SDHCI controllers of newer Intel SoCs work out-of-the-box requires
support for ACPI GPE interrupts and ACPI GPIO events respectively to
be added to FreeBSD. Otherwise insertion/removal interrupts/events
aren't reported and polling the card present state doesn't generally
work as a workaround with these controllers either, unfortunately.
I'm not aware of anyone working on the former, though.

Polling the card present state happens to work one time after SDHCI
initialization with these controllers which is why a card will be
attached when inserted as part of a suspend/resume cycle (resume of
mmc(4) had some bugs until some months ago, which probably explains
why that procedure hasn't worked as a workaround for you in the past).
Inserting the card before boot, unloading/loading sdhci_acpi.ko or
triggering detach/attach of sdhci_acpi(4) via devctl(8) should allow
a card to attach, too.

Marius



Re: SG116j install crashed

2018-01-22 Thread Marius Strobl
On Sat, Jan 20, 2018 at 08:25:08PM +0900, KIRIYAMA Kazuhiko wrote:
> At Thu, 18 Jan 2018 15:35:41 +0900,
> my wrote:
> > 
> > Hi, all
> > 
> > I bought Biccamera's original brand notebook PC (SG116j) on
> > impulse because it was cheap ($1780), and installed
> > 12.0-CURRENT (r327788) right away. It booted smoothly with
> > "unset hint.uart.1.at" set in the loader, and I configured the disk with
> > 
> > mmcsd0  58 GB   GPT
> >   mmcsd0p1  200MB   efi
> >   mmcsd0p2  54 GB   freebsd-ufs /
> >   mmcsd0p3  4.2 GB  freebsd-swap  none
> > 
> > But while "Fetching distribution files" for base.txz, it crashed
> > with:
> > 
> > sdhci_acpi0-slot0: Controller timeout
> > sdhci_acpi0-slot0: == REGISTER DUMP ==
> > sdhci_acpi0-slot0: Sys addr: 0x02158000 | Version:  0x1002
> > sdhci_acpi0-slot0: Blk size: 0x0200 | Blk cnt:  0x00f8
> > sdhci_acpi0-slot0: Argument: 0x017f57e8 | Trn mode: 0x0027
> > sdhci_acpi0-slot0: Present:  0x1fff0106 | Host ctl: 0x1025
> > sdhci_acpi0-slot0: Power:0x000b | Blk gap:  0x0080
> > sdhci_acpi0-slot0: Wake-up:  0x | Clock:0x0007
> > sdhci_acpi0-slot0: Timeout:  0x0007 | Int stat: 0x0001
> > sdhci_acpi0-slot0: Int enab: 0x05ff0033 | Sig enab: 0x05ff003a
> > sdhci_acpi0-slot0: AC12 err: 0x8000 | Host ctl2:0x008b
> > sdhci_acpi0-slot0: Caps: 0x446cc8b2 | Caps2:0x0807
> > sdhci_acpi0-slot0: Max curr: 0x | ADMA err: 0x
> > sdhci_acpi0-slot0: ADMA addr:0x | Slot int: 0x
> > sdhci_acpi0-slot0: ===
> > mmcsd0: Error indicated: 1 Timeout
> >   :
> > (snip)
> >   :
> > Stopped at kdb_enter+0x3b: movq $0,kdb_why
> > db>
> > 
> > The detailed log is at [1]. BTW, I used [2], so everything needed is
> > included in it and nothing should have to be fetched from the internet.
> > 
> > Any ideas on how to proceed?
> > 
> > Best regards.
> > 
> > [1] http://35.200.82.201/~kiri/freebsd/sg116j/crash_in_install.jpeg
> > [2] FreeBSD-12.0-CURRENT-amd64-20180110-r327788-memstick.img
> 
> I've got the r328126 memstick image and installed with it, and then
> everything went perfectly! Thanks to the FreeBSD-CURRENT team!

FYI, I believe that you had hit the bug fixed in r327924; sorry about
that.

Marius



Re: [tzsetup] can't set up local timezone if CMOS is set to UTC

2017-08-07 Thread Marius Strobl
On Mon, Aug 07, 2017 at 09:51:15AM +0300, Boris Samorodov wrote:
> 07.08.2017 09:44, Boris Samorodov wrote:
> > Hi Marius, All,
> > 
> > Subj at today's amd64-HEAD. If I use command "sudo tzsetup" and
> > choose YES (CMOS clock is set to UTC), the program just quits.
> > Yea, my clocks are at UTC but I want to get time at local timezone. :-)
> > 
> > I've found a recent commit to tzsetup, is it the cause?
> 
> Hm. There is a log message at r322097:
> ---
> - Make the initial UTC dialog actually work by giving the relevant files
>the necessary treatment and then exit when choosing "Yes" there instead
>of moving on to the time zone menu regardless.
> ---
> 
> I must misunderstand something.
> 
> So my question is: how to set up local time zone if CMOS is set to UTC?

Yeah, I hadn't thought of the case where one would like to set up
a configuration in which the RTC is using UTC but the timezone is
not. So I've reverted the corresponding part of r322097 for now as
I don't see an obvious way to give /etc/wall_cmos_clock appropriate
treatment in all 3 relevant cases (UTC/UTC, !UTC/UTC and !UTC/!UTC
regarding RTC/timezone) for all interactive and non-interactive
ways of using tzsetup(8).
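
Spelled out, the RTC/timezone combinations look like the sketch below. It
assumes, as adjkerntz(8) documents, that the presence of
/etc/wall_cmos_clock marks an RTC kept in local time and its absence marks
a UTC RTC. Both helper names are hypothetical and are not taken from
tzsetup(8). The sketch only states the desired end state, not how tzsetup(8)
would learn it in each interactive or non-interactive mode.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not from tzsetup(8): the marker file should exist
 * only when the RTC keeps local time, whatever time zone is selected.
 */
static bool
want_wall_cmos_clock(bool rtc_is_utc)
{
	return (!rtc_is_utc);
}

/* Hypothetical helper: label one RTC/timezone combination. */
static const char *
describe_case(bool rtc_is_utc, bool tz_is_utc)
{
	if (rtc_is_utc)
		return (tz_is_utc ? "UTC/UTC" : "UTC/!UTC");
	return (tz_is_utc ? "!UTC/UTC" : "!UTC/!UTC");
}
```

The "UTC/!UTC" label corresponds to the setup asked about in this thread:
an RTC running on UTC with a local time zone on top, for which no marker
file is wanted.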

Marius



Re: Re: Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-13 Thread Marius Strobl
On Sat, Feb 13, 2016 at 09:21:06PM +0100, Stefan Kohl wrote:
> Hi Marius,
> 
> I finally got my RT 8168 Ethernet Card (Zotac Ri323) working after
> patching if_re.c (r295601). Contrary to the assumption that
> HWREV_8168E_VL with Chip Rev 0x2c80 should not require RTL8168G
> handling as soon as I expand the sc->rl_flags for the respective
> HWREV and define the (ominous) 8168G_Plus Flag for RL_HWREV_8168E_VL
> the card is functioning correctly.

My best guess currently is that treating HWREV_8168E_VL as an RTL8168G
or later chip, which it simply isn't, serves as a workaround by e.g.
resetting parts of the RX/TX MAC configuration. That doesn't make it
an appropriate fix, though. I have a WIP which does a more complete
initialization of Realtek Ethernet MACs, part of which is a workaround
for broken BIOSes and is specific to HWREV_8168E_VL. I suspect that's
the more likely cause of your problem, and it would also explain why there
has been no other such report so far. Currently, 10.3-RELEASE and its
showstoppers have higher priority for me, though.

> When broken (without the patch) I got the following tcpdump output:
> 
> 19:18:46.299360 00:00:00:00:00:00 (oui Ethernet) > 00:00:00:00:00:00
> (oui Ethernet) Null Information, send seq 0, rcv seq 0, Flags [Command],
> length 84

Actually, this pretty much confirms the assumption that your problem
is caused by a broken BIOS as the correct workaround for that bug
consists of making the GMAC aware of the MAC address via the driver
in addition to only setting it in the MAC.
Err, wait, IIRC yongari@ had a similar change as far as the broken
BIOS workaround is concerned. You may want to give the following
patch a try instead of treating HWREV_8168E_VL as RTL8168G+ (I don't
know whether that patch applies cleanly to current re(4), though):
https://people.freebsd.org/~yongari/re/re.8168evl.diff

Marius



Re: Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-05 Thread Marius Strobl
On Fri, Feb 05, 2016 at 09:04:02PM +0100, s@web.de wrote:
> Hi Marius and Pyun,
> 
> actually it is Chip rev. 0x2c80 (I have overlooked that information in my 
> first post)
> 
> re0:  port 
> 0xe000-0xe0ff mem 0xf0104000-0xf0104fff,0xf010-0xf0103fff irq 19 at 
> device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: turning off MSI enable bit.
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x0010
> miibus0:  on re0
> rgephy0:  PHY 1 on miibus0
> 
> Does that help in any way? Thanks Stefan
> 

Unfortunately, it doesn't make a whole lot of sense to me; 0x2c80
translates to RL_HWREV_8168E_VL, which is an older chip that should
never have required the handling of RTL8168G and later revisions (or
may not actually work when applying it). So r290566 should only make
a positive difference, if it changes anything for that revision at all.
Did the interface work before r290151, or actually before r281337?
Does reverting r290946 and r290566 locally make it work again?
Another candidate causing that breakage would be r291676 if the PHY
is an RTL8211F one. If you boot verbosely, you'll have a line in the
dmesg(8) output with "OUI 0x00e04c" in it. If the "rev." number in
that line is 6, you have an RTL8211F.
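
The two checks described here can be sketched as a small lookup. This is
only an illustration: the revision values are copied exactly as they are
printed in this thread, the table is limited to the two revisions mentioned
here, and the struct and function names are made up rather than taken from
re(4).

```c
#include <stddef.h>
#include <stdint.h>

/* Revision values exactly as they are printed in this thread. */
struct hwrev_name {
	uint32_t	rev;
	const char	*name;
};

static const struct hwrev_name hwrev_names[] = {
	{ 0x2c80, "RL_HWREV_8168E_VL" },
	{ 0x5c80, "RL_HWREV_8411B" },
};

/* Return the name for a "Chip rev." value, or NULL if it is not listed. */
static const char *
hwrev_lookup(uint32_t rev)
{
	size_t i;

	for (i = 0; i < sizeof(hwrev_names) / sizeof(hwrev_names[0]); i++)
		if (hwrev_names[i].rev == rev)
			return (hwrev_names[i].name);
	return (NULL);
}

/* The PHY is an RTL8211F when the OUI is 0x00e04c and "rev." is 6. */
static int
phy_is_rtl8211f(uint32_t oui, unsigned int rev)
{
	return (oui == 0x00e04c && rev == 6);
}
```

Feeding it the "Chip rev." line from the dmesg(8) output above yields
RL_HWREV_8168E_VL, and the PHY check needs the OUI and "rev." numbers from
a verbose boot.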

Marius



Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-05 Thread Marius Strobl
On Wed, Feb 03, 2016 at 08:57:01PM +0100, s@web.de wrote:
> After updating -current at Jan, 31st (r295091) the Realtek ethernet device 
> driver of my Zotac ZBox RI323 mini pc seems to be broken: I can neither 
> connect to the host even though the interface is shown as active, nor can I 
> initiate connection from the host through re0.
> Reverting the kernel to my previous build -current r290151 (install date Nov 
> 1st, 2015) the re0 interface is working OK.
> 
> Looking through the svn logs regarding /head/sys/dev/re/if_re.c I suspect
> that Revision 290566 might have something to do with this and that I have to
> include my Realtek Chipset to the exclusion list for "enabling RX/TX after
> initial configuration" (or vice versa; I am really confused here), but I haven't
> got a clue how, as I do not know how to find the right RL_HWREV_XXX flag for
> my device.
> 
> dmesg shows RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet and 
> pciconf -l -v re0 shows:
> re0@pci0:2:0:0: class=0x02 card=0x816819da chip=0x816810ec rev=0x07 
> hdr=0x00
> vendor = 'Realtek Semiconductor Co., Ltd.'
> device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
> 
> I am grateful for any suggestion towards a solution and I am willing (and 
> able) to assist by patching or debugging my kernel or giving further hw 
> information about my system.
> 

Hrm, does that happen to be RL_HWREV_8411B (0x5c80) according to
the "Chip rev." in the dmesg(8) output?

Marius



Re: sparc64 traps during probe (r293243)

2016-01-08 Thread Marius Strobl
On Fri, Jan 08, 2016 at 04:57:58PM +, Mark Cave-Ayland wrote:
> On 08/01/16 15:42, Kurt Lidl wrote:
> 
> This looks amazingly similar to what I get trying to boot FreeBSD under
> QEMU, i.e. pointing at sched_clock():
> 

<...>

> -- kernel stack fault %o7=0xc011a050 --
> panic: longjmp botch
> cpuid = -1012475520
> KDB: stack backtrace:
> Uptime: 3s
> 
> Note also the "longjmp botch" error right at the end. This is with the
> sun4u timer fix patch developed with help from Marius which has recently
> been applied to QEMU git master. So maybe this is a kernel bug after all?

No, that still is a completely trashed kernel stack as previously
seen when running under QEMU so the whole backtrace is questionable.
Apart from that, sched_clock() is called rather frequently so it is
not unlikely to show up in a kernel back trace but neither of the
two back traces in question suggest it's the culprit (assuming that
the one seen with QEMU actually is sane).

Marius




Re: sparc64 traps during probe (r293243)

2016-01-08 Thread Marius Strobl
On Fri, Jan 08, 2016 at 10:42:33AM -0500, Kurt Lidl wrote:
> I recently updated a sparc64 V120 from r291993
> to r293243, and it now traps during the
> autoconfiguration phase of the kernel boot:
> 

<...>

> -- data access exception sfar=0xfcf821ca0218 sfsr=0x41029 
> %o7=0xc06165e8 --

What code line does 0xc06165e8 translate to?

Marius



Re: Wake on LAN broken (probably between r290542 - r290606)?

2015-11-15 Thread Marius Strobl
On Sat, Nov 14, 2015 at 09:56:36AM -0800, David Wolfskill wrote:
> On Wed, Nov 11, 2015 at 06:33:37AM -0800, David Wolfskill wrote:
> > ...
> > But a quick perusal of
> >  doesn't show
> > anything especially like a "smoking gun" -- to me, anyway.
> > 
> > Can anyone else confirm or refute my observations?  Or suggest a
> > hint?  I'll try narrowing it down myself, but I need to do it during
> > times I'm at home (so I can manually power the machine back up when
> > it fails to respond to WoL), so it may be a few days before I can
> > accomplish much that way.
> > 
> 
> r290565 still works; r290566 fails -- in my case.  r290566 changed some
> re(4) behavior, and the NIC on my affected machine is an re(4):
> 
> re0@pci0:3:0:0: class=0x02 card=0x05b71028 chip=0x816810ec rev=0x0c
> hdr=0x00
> vendor = 'Realtek Semiconductor Co., Ltd.'
> device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet
> Controller'
> class  = network
> subclass   = ethernet
> 
> from "pciconf -lv" while running:
> 
> D freebeast.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1904  
> r290565M/290565:1100089: Sat Nov 14 09:44:33 PST 2015 
> r...@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/GENERIC  amd64
> 
> I've placed a copy of a verbose dmesg.boot in
> .
> 
> I'm happy to test suggested changes.
> 

*sigh* Okay, could you please test whether the attached patch restores
WOL capability for you?

Marius

Index: if_re.c
===
--- if_re.c	(revision 290566)
+++ if_re.c	(working copy)
@@ -3851,6 +3852,11 @@ re_setwol(struct rl_softc *sc)
 			CSR_READ_1(sc, RL_GPIO) & ~0x01);
 	}
 	if ((ifp->if_capenable & IFCAP_WOL) != 0) {
+		if ((sc->rl_flags & RL_FLAG_8168G_PLUS) != 0) {
+			/* Disable RXDV gate. */
+			CSR_WRITE_4(sc, RL_MISC, CSR_READ_4(sc, RL_MISC) &
+			~0x0008);
+		}
 		re_set_rxmode(sc);
 		if ((sc->rl_flags & RL_FLAG_WOL_MANLINK) != 0)
 			re_set_linkspeed(sc);
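
The added lines are a read-modify-write that clears one bit of a register
and leaves the rest alone. A minimal stand-alone sketch of that pattern
follows. The variable and function names are hypothetical stand-ins for the
driver's CSR_READ_4()/CSR_WRITE_4() accessors, and a plain variable takes
the place of the hardware register.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 32-bit device register. */
static uint32_t fake_misc_reg;

static uint32_t
reg_read_4(void)
{
	return (fake_misc_reg);
}

static void
reg_write_4(uint32_t val)
{
	fake_misc_reg = val;
}

/* Clear the bits in mask and leave every other bit unchanged. */
static void
reg_clear_bits(uint32_t mask)
{
	reg_write_4(reg_read_4() & ~mask);
}
```

Calling reg_clear_bits() a second time with the same mask changes nothing,
which is why the patch can clear the gate bit unconditionally whenever WOL
is enabled on these chips.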

Re: HEADS UP: sparc64 backend for llvm/clang imported

2015-08-26 Thread Marius Strobl
On Wed, Aug 19, 2015 at 04:19:03PM -0400, Kurt Lidl wrote:
> > Dimitry Andric wrote this message on Fri, Feb 28, 2014 at 20:22 +0100:
> >> In r262613 I have merged the clang-sparc64 branch back to head.  This
> >> imports an updated sparc64 backend for llvm and clang, allowing clang to
> >> bootstrap itself on sparc64, and to completely build world.  To be able
> >> to build the GENERIC kernel, there is still one patch to be finalized,
> >> see below.
> >>
> >> If you have any sparc64 hardware, and are not afraid to encounter rough
> >> edges, please try out building and running your system with clang.  To
> >> do so, update to at least r262613, and enable the following options in
> >> e.g. src.conf, or in your build environment:
> >>
> >> WITH_CLANG=y
> >> WITH_CLANG_IS_CC=y
> >> WITH_LIBCPLUSPLUS=y  (optional)
> >>
> >> Alternatively, if you would rather keep gcc as /usr/bin/cc for the
> >> moment, build world using just WITH_CLANG, enabling clang to be built
> >> (by gcc) and installed.  After installworld, you can then set CC=clang,
> >> CXX=clang++ and CPP=clang-cpp for building another world.
> >>
> >> For building the sparc64 kernel, there is one open issue left, which is
> >> that sys/sparc64/include/pcpu.h uses global register variables, and this
> >> is not supported by clang.  A preliminary patch for this is attached,
> >> but it may or may not blow up your system, please beware!
> >>
> >> The patch changes the pcpu and curpcb global register variables into
> >> inline functions, similar to what is done on other architectures.
> >> However, the current approach is not optimal, and the emitted code is
> >> slightly different from what gcc outputs.  Any improvements to this
> >> patch are greatly appreciated!
> >>
> >> Last but not least, thanks go out to Roman Divacky for his work with
> >> llvm/clang upstream in getting the sparc64 backend into shape.
> >
> > Ok, I have a new pcpu patch to try.  I have only compile tested it.
> >
> > It is available here:
> > https://www.funkthat.com/~jmg/sparc64.pcpu.patch
> >
> > I've also attached it.
> >
> > Craig, do you mind testing it?
> >
> > This patch also removes curpcb as it appears to not be used by any
> > sparc64 C code.  A GENERIC kernel compiles fine, and fxr only turns up
> > curpcb used in machdep code, and no references to it under sparc64.
> >
> > This is not a proper solution in that
> > it can mean counters/stats can be copied/moved to other cpus overwriting
> > the previous values if a race happens...  We use
> > PCPU_SET(mem, PCPU_GET(mem) + val) for PCPU_ADD, not great, but it's
> > no worse than what we were previously using..
> >
> > Until we get a proper fix which involves mapping all the cpu's PCPU
> > data on all CPUs, this will have to suffice..
> >
> > This patch is based upon, I believe, a patch from Marius and possibly
> > modified by rdivacky.
> >
> > Thanks for testing..
> 
> The above message was posted a while ago, and I decided that I would
> give the patch a test run on a spare sparc that I have, now that the
> instability problem with multiprocessor sparc64 machines has been
> resolved.
> 
> So, I have an up-to-date stable/10 V240 (2x1.5GHz cpus, 8GB of memory),
> running a completely stock r286861.  That all seems to work just fine.
> 
> I applied the patch referenced in the email:
> 
> https://www.funkthat.com/~jmg/sparc64.pcpu.patch
> 
> (it applied cleanly), and then rebuilt the kernel on the machine,
> using the stock gcc 4.2.1 compiler.
> 
> When rebooting with that kernel, the machine panics pretty early
> in the boot:
> 
> FreeBSD 10.2-STABLE #3 r286861M: Wed Aug 19 14:28:45 EDT 2015
>  l...@spork.pix.net:/usr/obj/usr/src/sys/GENERIC sparc64
> gcc version 4.2.1 20070831 patched [FreeBSD]
> real memory  = 8589934592 (8192 MB)
> avail memory = 8379719680 (7991 MB)
> cpu0: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
> cpu1: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> random device not loaded; using insecure entropy
> panic: trap: illegal instruction (kernel)
> cpuid = 0
> KDB: stack backtrace:
> #0 0xc05750e0 at panic+0x20
> #1 0xc08db9f8 at trap+0x558
> Uptime: 1s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> timeout shutting down CPUs.
> 
> So, the patch to get rid of the pcpu usage (as a prereq to poking
> at the clang compiler issues) does not work properly.
> 

As I pointed out when that patch was posted, the approach taken by
it assumes a CPU can access foreign PCPU data, which currently isn't
true on sparc64. So the patch is at least incomplete but also may
have further issues.

Such a patch is no longer a prerequisite for compiling a sparc64
kernel with clang, though, as clang meanwhile has been told to
grok at least the global registers used by the PCPU code.

Besides some default options like the choice of code model not
being appropriate for FreeBSD, clang-compiled loader 

Re: [Call for testers] DRM device-independent code update to Linux 3.8

2015-02-18 Thread Marius Strobl
On Wed, Feb 18, 2015 at 12:45:36AM +0100, Jean-Sébastien Pédron wrote:
> Hi!
> 
> An update to the DRM subsystem, not including the drivers, is ready for
> wider testing!
> 
> The patch against HEAD is here:
> https://people.freebsd.org/~dumbbell/graphics/drm-update-38.f.patch
> 

Have you looked into using a MTX_SPIN lock where Linux actually
employs a DRM_SPINTYPE one? That should allow using a filter
instead of an ithread handler, solving a great number of problems
with pre-loading of DRM drivers and allowing them to be statically
compiled into the kernel as, unlike ithreads, filters work right
from the moment they are set up during attach. In turn, that
would make the lack of a VESA driver for vt(4) less painful and
likely even forgivable, as resolutions higher than VGA could be
used way earlier, etc.

Marius



Re: r276200: EFI boot failure: kernel stops booting at pci0: on pcib0

2014-12-29 Thread Marius Strobl
On Mon, Dec 29, 2014 at 08:12:42PM +0100, Marius Strobl wrote:
> On Mon, Dec 29, 2014 at 05:55:28PM +0100, Roger Pau Monné wrote:
> > El 29/12/14 a les 12.41, Roger Pau Monné ha escrit:
> > > Hello,
> > > 
> > > Sorry for not noticing this earlier, I've been without a computer for
> > > some days. Do you get a panic message, or the system just freezes?
> > > 
> > > Can you please post the full boot output with boot_verbose enabled?
> > 
> > I'm not able to reproduce the problem with Qemu and OVMF, and I don't
> > have any box right now that uses UEFI.
> > 
> > I'm guessing that this is due to some memory reservation conflict, so
> > I'm attaching a patch that should help diagnose it.
> 
> You'll probably want to nuke RF_ACTIVE so the resources are marked
> as taken but in case of vt_efifb(4), the memory isn't mapped twice.
> I don't know whether the latter actually is a problem for x86,
> though; it'll likely at least replace the VM_MEMATTR_WRITE_COMBINING
> mapping done in vt_efifb_remap(). Removing RF_ACTIVE in turn might
> not be sufficient for the Xen bits to mark the resource as reserved;
> this should be fixed in the FreeBSD/Xen code then, however.
> Also end = size - 1, see the attached patch.

Err, end = start + size - 1 that is.
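
Since the end argument of bus_alloc_resource() is inclusive, the arithmetic
can be sketched as follows. The helper names are made up for illustration
and the values in the usage note are arbitrary, not taken from the patch.

```c
#include <stdint.h>

/*
 * Last address of a range that starts at start and spans size bytes.
 * size must be non-zero.
 */
static uint64_t
range_end(uint64_t start, uint64_t size)
{
	return (start + size - 1);
}

/* Number of addresses covered by the inclusive range [start, end]. */
static uint64_t
range_count(uint64_t start, uint64_t end)
{
	return (end - start + 1);
}
```

For example, a range starting at 0xa0000 with a size of 0x20000 ends at
0xbffff; using start + size instead would claim one byte too many and could
collide with a neighbouring reservation.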

Marius
 


Re: r276200: EFI boot failure: kernel stops booting at pci0: on pcib0

2014-12-29 Thread Marius Strobl
On Mon, Dec 29, 2014 at 05:55:28PM +0100, Roger Pau Monné wrote:
> El 29/12/14 a les 12.41, Roger Pau Monné ha escrit:
> > Hello,
> > 
> > Sorry for not noticing this earlier, I've been without a computer for
> > some days. Do you get a panic message, or the system just freezes?
> > 
> > Can you please post the full boot output with boot_verbose enabled?
> 
> I'm not able to reproduce the problem with Qemu and OVMF, and I don't
> have any box right now that uses UEFI.
> 
> I'm guessing that this is due to some memory reservation conflict, so
> I'm attaching a patch that should help diagnose it.

You'll probably want to nuke RF_ACTIVE so the resources are marked
as taken but in case of vt_efifb(4), the memory isn't mapped twice.
I don't know whether the latter actually is a problem for x86,
though; it'll likely at least replace the VM_MEMATTR_WRITE_COMBINING
mapping done in vt_efifb_remap(). Removing RF_ACTIVE in turn might
not be sufficient for the Xen bits to mark the resource as reserved;
this should be fixed in the FreeBSD/Xen code then, however.
Also end = size - 1, see the attached patch.

Marius

Index: dev/vt/hw/efifb/efifb.c
===
--- dev/vt/hw/efifb/efifb.c	(revision 276343)
+++ dev/vt/hw/efifb/efifb.c	(working copy)
@@ -211,8 +211,8 @@
 	res_id = 0;
 	pseudo_phys_res = bus_alloc_resource(dev, SYS_RES_MEMORY,
 	&res_id, local_info.fb_pbase,
-	local_info.fb_pbase + local_info.fb_size,
-	local_info.fb_size, RF_ACTIVE);
+	local_info.fb_pbase + local_info.fb_size - 1,
+	local_info.fb_size, 0);
 	if (pseudo_phys_res == NULL)
 		panic("Unable to reserve vt_efifb memory");
 	return (0);
Index: dev/vt/hw/vga/vt_vga.c
===
--- dev/vt/hw/vga/vt_vga.c	(revision 276343)
+++ dev/vt/hw/vga/vt_vga.c	(working copy)
@@ -1275,8 +1275,8 @@
 
 	res_id = 0;
 	pseudo_phys_res = bus_alloc_resource(dev, SYS_RES_MEMORY,
-	&res_id, VGA_MEM_BASE, VGA_MEM_BASE + VGA_MEM_SIZE,
-	VGA_MEM_SIZE, RF_ACTIVE);
+	&res_id, VGA_MEM_BASE, VGA_MEM_BASE + VGA_MEM_SIZE - 1,
+	VGA_MEM_SIZE, 0);
 	if (pseudo_phys_res == NULL)
 		panic("Unable to reserve vt_vga memory");
 	return (0);

Re: panic on sparc64 running 10-beta4

2013-12-27 Thread Marius Strobl
On Sun, Dec 08, 2013 at 02:50:23PM +0100, Marius Strobl wrote:
> On Wed, Dec 04, 2013 at 11:01:30AM -0500, Kurt Lidl wrote:
> > I installed a sparc V120 (4GB memory, dual 72GB disks) with the 10-beta4
> > install image today.
> > 
> > Installation went fine.  I rebooted the machine, and then went to get
> > a fresh ports tree, and the machine panic'd:
> > 
> > root@host:/usr/ports # portsnap fetch
> > Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
> > Fetching public key from your-org.portsnap.freebsd.org... done.
> > Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
> > Fetching snapshot metadata... done.
> > Fetching snapshot generated at Tue Dec  3 19:06:18 EST 2013:
> > 43b6803c6d94efd5b2e2bc9df0b66a84b75417fa3c1728100% of   69 MB 3225 kBps 
> > 00m22s
> > Extracting snapshot... done.
> > Verifying snapshot integrity... panic: trap: illegal instruction (kernel)
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xc08836d4 at trap+0x554
> > Uptime: 6m59s
> > Dumping 4096 MB (4 chunks)
> >chunk at 0: 1073741824 bytes ... ok
> >chunk at 0x4000: 1073741824 bytes ... ok
> >chunk at 0x8000: 1073741824 bytes ... ok
> >chunk at 0xc000: 1073741824 bytes ... ok
> > 
> > Dump complete
> > Automatic reboot in 15 seconds - press a key on the console to abort
> > Rebooting...
> > 
> > And then it panic'd again when attempting to run 'savecore'!
> > (I typed a  after it printed out the line about
> > writing the core file, that's where the "load: 0.72 ..." line
> > came from...)
> 
> Hrm, I don't seem to be able to reproduce this with an installation
> built from sources and also can't remember a commit between BETA3 and
> BETA4 which should be able to cause this. I currently can't test the
> 10-BETA4 install image, though. Was the machine in question running
> FreeBSD before, i. e. is it known good hardware? Did savecore eventually
> succeed on writing out a dump?
> 

FYI, I tried again with a machine installed from the 10.0-RC3 binary
image and couldn't reproduce that problem either.

Marius



Re: Request for testing an alternate branch

2013-12-11 Thread Marius Strobl
On Sun, Dec 08, 2013 at 03:48:54PM -0800, Justin Hibbits wrote:
> On Sun, 8 Dec 2013 14:38:53 +0100
> Marius Strobl  wrote:
> 
> > On Wed, Dec 04, 2013 at 10:21:13PM -0800, Justin Hibbits wrote:
> > > I've been working on the projects/pmac_pmu branch for some time now
> > > to add suspend/resume as well as CPU speed change for certain
> > > PowerPC machines, about a year since I created the branch, and now
> > > it's stable enough that I want to merge it into HEAD, hence this
> > > request. However, it does touch several drivers, turning them into
> > > "early drivers", such that they can be initialized, and suspended
> > > and resumed at a different time.  Saying that, I do need testing
> > > from other architectures, to make sure I haven't broken anything.
> > > 
> > > The technical details:
> > > 
> > > To get proper ordering, I've extended the bus_generic_suspend() and
> > > bus_generic_resume() to do multiple passes.  Devices which cannot be
> > > enabled or disabled at the current pass level would return an
> > > EAGAIN. This could possibly cause problems, since it's an addition
> > > to an existing API rather than a new API to run along side it, so
> > > it needs a great deal of testing.  It works fine on PowerPC, but I
> > > don't have any i386/amd64 or sparc64 hardware to test it on, so
> > > would like others who do to test it.  I don't think that it would
> > > impact x86 at all (testing is obviously required), because the
> > > nexus is not an EARLY_DRIVER_MODULE, so all devices would be
> > > handled at the same pass.  But, I do know the sparc64 has an
> > > EARLY_DRIVER_MODULE() nexus, so that will likely be impacted.
> > > 
> > > Also, any comments are of course welcome.  Technical concerns are
> > > obviously welcome, and I will try to address everything.
> > 
> > Do you have a patch against head?
> > 
> > Marius
> > 
> 
> Here you go.
> 

Thanks; on a sparc64 machine where the EARLY_DRIVER_MODULE nexus actually
matters, your patch doesn't seem to have an ill effect.

Marius
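
A note for readers of the archive: the multi-pass scheme quoted above can
be pictured with the small user-space sketch below. It is not the code
from the projects/pmac_pmu branch; struct toy_dev, the pass numbers,
toy_suspend() and toy_suspend_all() are made up for illustration, and
sweeping the passes from latest to earliest is an assumption. Only the
idea that a device which cannot be handled at the current pass returns
EAGAIN is taken from the description.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of a device; not a real newbus structure. */
struct toy_dev {
	const char *name;
	int pass;	/* pass level the device belongs to */
	int suspended;	/* 0 = running, else order of suspension */
};

/* Suspend one device, or refuse with EAGAIN if this is not its pass. */
int
toy_suspend(struct toy_dev *d, int cur_pass, int *order)
{
	assert(d != NULL && order != NULL);
	if (d->suspended != 0)
		return (0);		/* already done in an earlier sweep */
	if (d->pass != cur_pass)
		return (EAGAIN);	/* ask again at another pass */
	d->suspended = ++(*order);
	return (0);
}

/*
 * Sweep the passes from the latest to the earliest (an assumption), so
 * ordinary drivers go down before the "early" drivers they sit on.
 * Returns the number of devices that are still running afterwards.
 */
int
toy_suspend_all(struct toy_dev *devs, size_t n, int max_pass)
{
	int order = 0, left = 0;

	for (int pass = max_pass; pass >= 0; pass--)
		for (size_t i = 0; i < n; i++)
			(void)toy_suspend(&devs[i], pass, &order);
	for (size_t i = 0; i < n; i++)
		if (devs[i].suspended == 0)
			left++;
	return (left);
}
```

In this model a nexus placed at pass 0 is suspended last, which is the
kind of ordering an EARLY_DRIVER_MODULE nexus would exercise.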


Re: panic on sparc64 running 10-beta4

2013-12-08 Thread Marius Strobl
On Wed, Dec 04, 2013 at 11:01:30AM -0500, Kurt Lidl wrote:
> I installed a sparc V120 (4GB memory, dual 72GB disks) with the 10-beta4
> install image today.
> 
> Installation went fine.  I rebooted the machine, and then went to get
> a fresh ports tree, and the machine panic'd:
> 
> root@host:/usr/ports # portsnap fetch
> Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
> Fetching public key from your-org.portsnap.freebsd.org... done.
> Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
> Fetching snapshot metadata... done.
> Fetching snapshot generated at Tue Dec  3 19:06:18 EST 2013:
> 43b6803c6d94efd5b2e2bc9df0b66a84b75417fa3c1728100% of   69 MB 3225 kBps 00m22s
> Extracting snapshot... done.
> Verifying snapshot integrity... panic: trap: illegal instruction (kernel)
> cpuid = 0
> KDB: stack backtrace:
> #0 0xc08836d4 at trap+0x554
> Uptime: 6m59s
> Dumping 4096 MB (4 chunks)
>   chunk at 0: 1073741824 bytes ... ok
>   chunk at 0x4000: 1073741824 bytes ... ok
>   chunk at 0x8000: 1073741824 bytes ... ok
>   chunk at 0xc000: 1073741824 bytes ... ok
> 
> Dump complete
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> 
> And then it panic'd again when attempting to run 'savecore'!
> (I typed a  after it printed out the line about
> writing the core file, that's where the "load: 0.72 ..." line
> came from...)

Hrm, I don't seem to be able to reproduce this with an installation
built from sources and also can't remember a commit between BETA3 and
BETA4 which should be able to cause this. I currently can't test the
10-BETA4 install image, though. Was the machine in question running
FreeBSD before, i.e. is it known-good hardware? Did savecore eventually
succeed in writing out a dump?

Marius


Re: Request for testing an alternate branch

2013-12-08 Thread Marius Strobl
On Wed, Dec 04, 2013 at 10:21:13PM -0800, Justin Hibbits wrote:
> I've been working on the projects/pmac_pmu branch for some time now to
> add suspend/resume as well as CPU speed change for certain PowerPC
> machines, about a year since I created the branch, and now it's stable
> enough that I want to merge it into HEAD, hence this request. However,
> it does touch several drivers, turning them into "early drivers", such
> that they can be initialized, and suspended and resumed at a different
> time.  Saying that, I do need testing from other architectures, to make
> sure I haven't broken anything.
> 
> The technical details:
> 
> To get proper ordering, I've extended the bus_generic_suspend() and
> bus_generic_resume() to do multiple passes.  Devices which cannot be
> enabled or disabled at the current pass level would return an EAGAIN.
> This could possibly cause problems, since it's an addition to an
> existing API rather than a new API to run along side it, so it needs a
> great deal of testing.  It works fine on PowerPC, but I don't have any
> i386/amd64 or sparc64 hardware to test it on, so would like others who
> do to test it.  I don't think that it would impact x86 at all (testing
> is obviously required), because the nexus is not an EARLY_DRIVER_MODULE,
> so all devices would be handled at the same pass.  But, I do know the
> sparc64 has an EARLY_DRIVER_MODULE() nexus, so that will likely be
> impacted.
> 
> Also, any comments are of course welcome.  Technical concerns are
> obviously welcome, and I will try to address everything.

Do you have a patch against head?

Marius


Re: newcons comming

2013-10-27 Thread Marius Strobl
On Fri, Oct 25, 2013 at 03:18:47PM +0300, Aleksandr Rybalko wrote:
> Hello fellow hackers!
> 
> I finally reach the point when I can work with newcons instead of
> syscons on my laptop. Yes, I know it still buggy and have a lot of
> style(9) problems. But we really have to get it into HEAD and 10.0 to
> enable shiny new Xorg features, drivers, etc.
> 
> So I ask everyone to look "hard" into that[1] and tell me your opinion.
> I expect a lot of opinions, since it have to affect almost all good
> guys, as result I have to ask to split "bug reports" into two parts:
> 1. Should be done before merge to 10.0;
> 2. Can be done later.
> 
> If it possible, please do it(review - report) ASAP.

Could you please port at least either creator(4) or machfb(4) to newcons
before it even hits head so we don't have the same situation as with
syscons again where we need to make square pegs fit into round holes? My
main concerns in this regard are:
o Making these drivers work as low-level console in the syscons sense so
  they already work for printing the Copyright notice of the kernel. The
  problem here is that the respective chips don't necessarily come up with
  the frame buffer mapped and we can't do that on our own at that point with
  the VM not yet up. So all access has to be done via bus_space_*(9) and
  specially crafted bus tags and handles. In short: Except for some specific
  model and firmware combinations, in general the generic OFW frame buffer
  approach doesn't work here, that's why these drivers exist in the first
  place.
o For coexistence of e.g. machfb(4) with ofwfb.c, allow some probing of
  drivers in the BUS_PROBE_GENERIC/BUS_PROBE_DEFAULT etc. manner. The
  crucial point here is that in case a more specific driver is willing
  to attach to a certain device, a generic driver must not touch the
  hardware in any way. It seems that vd_priority is too late in the game
  for that requirement. With syscons, this is achievable by letting the
  generic driver call vid_configure(VIO_PROBE_ONLY) and then check whether
  another driver has taken the device.
o Using hardware acceleration for drawing characters and the mouse pointer,
  i.e. using a hardware cursor. Employing the respective chips as "dumb"
  frame buffers instead is just dog slow. Currently, I don't see how a
  hardware cursor could be hooked up to newcons. The current putc code in
  these drivers _might_ be suitable for implementing bitbltchr methods.
  Apart from that these chips also can do simple bitblt etc. of course.
o Using the 12 x 22 gallant font.
o Allowing Xorg to map the frame buffer but additionally also other register
  banks as needed through newcons. With syscons, a driver can provide a
  mmap method for that (see machfb(4)). I currently don't see how to do that
  with the newcons infrastructure. An alternative might be to make Xorg/
  libpciaccess aware of newcons and go through a /dev/fdX in that case.
  Still, I don't see how to currently do that for resources besides the
  actual frame buffer with existing fdc.c. I'm also not sure whether the
  latter is the appropriate route to go in the first place given that
  besides mmap'ing from userland, newcons'ified creator(4) and machfb(4)
  still should be used directly.
  In any case, for creator(4) Xorg expects a /dev/fdX anyway.
o Allowing late attachment in case the primary console is the serial one,
  another graphics chip etc. during regular device attachment when everything
  needed (mainly the VM) to bring the frame buffer fully online on our own
  is available. Is that what vt_allocate() is for?

Marius
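
A note for readers of the archive: the probing point above (a generic
driver must step back when a more specific one wants the device) can be
sketched in user space as follows. This is only an illustration of how
newbus-style probe priorities are compared, not newcons or syscons code.
The two BUS_PROBE_* values match FreeBSD's <sys/bus.h>; struct
toy_driver, the toy_*() functions and TOY_MACH64_ID are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Priorities as in FreeBSD's <sys/bus.h>: <= 0 is a match, larger wins. */
#define BUS_PROBE_DEFAULT	(-20)
#define BUS_PROBE_GENERIC	(-100)

#define TOY_MACH64_ID	0x1234u	/* hypothetical device ID */

struct toy_driver {
	const char *name;
	int (*probe)(unsigned int id);	/* must not touch the hardware */
};

/* Chip-specific driver: claims only the device it knows. */
int
toy_machfb_probe(unsigned int id)
{
	return (id == TOY_MACH64_ID ? BUS_PROBE_DEFAULT : ENXIO);
}

/* Generic frame buffer driver: offers to take anything, at low priority. */
int
toy_ofwfb_probe(unsigned int id)
{
	(void)id;
	return (BUS_PROBE_GENERIC);
}

/* Ask every driver first; only the winner may go on to attach. */
const struct toy_driver *
toy_pick_driver(const struct toy_driver *drv, size_t n, unsigned int id)
{
	const struct toy_driver *best = NULL;
	int best_pri = 0;

	assert(drv != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		int pri = drv[i].probe(id);

		if (pri > 0)
			continue;	/* positive value: no match */
		if (best == NULL || pri > best_pri) {
			best = &drv[i];
			best_pri = pri;
		}
	}
	return (best);
}
```

The important property is that probing is side-effect free, so the
generic driver never gets to touch hardware that the specific driver
ends up owning.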


Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-16 Thread Marius Strobl
On Tue, Sep 03, 2013 at 11:48:38PM +0200, Olivier Cochard-Labbé wrote:
> On Tue, Sep 3, 2013 at 8:10 PM, Outback Dingo wrote:
> > Can anyone confirm how well tested/stable this patch set might be?? if
> > theres positive input i have a zoo of dev machines i could load it on, to
> > help further it.
> > Just checking to see how widely its been tested,
> 
> I've installed this patch on 3 differents machines there status after
> about 12hours:
> - SUN FIRE X4170 M2 (amd64: r255178) with 6 SAS harddrives in one big
> zraid (LSI MegaSAS Gen2 controller): Used for generating package with
> poudriere? no probleme since;
> - HAL/Fujitsu SPARC64-V (sparc64: r255178) with two SCSI-3 disks in
> gmirror: Used for generating package with poudriere too? no probleme
> since;

For testing GEOM direct dispatch on sparc64, please additionally use
the following patch:
http://people.freebsd.org/~marius/sparc64_GET_STACK_USAGE.diff

Marius


Re: CFT: PCI Command Register fixups

2013-08-10 Thread Marius Strobl
On Fri, Aug 09, 2013 at 09:56:48PM -0600, Scott Long wrote:
> All,
> 
> Subversion rev 250418 affected approximately 63 drivers by making them 
> vulnerable to resource allocation failures on motherboards with buggy BIOSes. 
>  The revision itself is good, but it needs to address these drivers and bring 
> them up to what is, in effect, a modified way for drivers to manage their PCI 
> resources.  If you've been seeing something like the following message since 
> June 24/27, then you need this patch:
> 
> mps0:  port 0xd000-0xd0ff mem 0xfb79c000-0xfb79 irq 19 at 
> device 0.0 on pci4
> mps0: PCI memory window not available
> device_attach: mps0 attach returned 6
> 
> The patch originated from John Baldwin, I merely fixed up a few nits and am 
> passing it around for review and testing.  Please find it here:
> 
> http://people.freebsd.org/~scottl/pci_command_fixes.patch
> 

In mpt_pci.c, there's a style nit/inconsistency regarding the other
drivers touched by the above patch; if after these fixes, a driver
still fiddles with PCIR_COMMAND, it should be just fine to also OR
in PCIM_CMD_BUSMASTEREN as part of that and to not additionally call
pci_enable_busmaster().
Apart from that, the patch looks good to me.

Marius
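
A note for readers of the archive: the suggestion above amounts to doing
one read-modify-write of the command register instead of a write plus a
separate pci_enable_busmaster() call. The sketch below simulates that in
user space; it is not taken from mpt_pci.c or from the patch. The
register offset and bit values match FreeBSD's <dev/pci/pcireg.h>, while
struct toy_pci_dev and the toy_*() helpers are hypothetical stand-ins for
a device and its configuration space accessors.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Offset and bits as in FreeBSD's <dev/pci/pcireg.h>. */
#define PCIR_COMMAND		0x04
#define PCIM_CMD_PORTEN		0x0001
#define PCIM_CMD_MEMEN		0x0002
#define PCIM_CMD_BUSMASTEREN	0x0004

/* Hypothetical stand-in for a device's 256-byte configuration space. */
struct toy_pci_dev {
	uint8_t cfg[256];
};

/* Read a little-endian 16-bit configuration register. */
uint16_t
toy_read_config16(const struct toy_pci_dev *d, int reg)
{
	assert(d != NULL && reg >= 0 && reg < 255);
	return ((uint16_t)(d->cfg[reg] | (d->cfg[reg + 1] << 8)));
}

/* Write a little-endian 16-bit configuration register. */
void
toy_write_config16(struct toy_pci_dev *d, int reg, uint16_t val)
{
	assert(d != NULL && reg >= 0 && reg < 255);
	d->cfg[reg] = (uint8_t)(val & 0xff);
	d->cfg[reg + 1] = (uint8_t)(val >> 8);
}

/*
 * Since the driver touches PCIR_COMMAND anyway, bus mastering is ORed
 * in during the same update; bits that were already set are preserved.
 */
void
toy_enable_mem_and_busmaster(struct toy_pci_dev *d)
{
	uint16_t cmd = toy_read_config16(d, PCIR_COMMAND);

	cmd |= PCIM_CMD_MEMEN | PCIM_CMD_BUSMASTEREN;
	toy_write_config16(d, PCIR_COMMAND, cmd);
}
```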


Re: [RFC] USB keyboard and devd.conf

2013-02-13 Thread Marius Strobl
On Wed, Feb 13, 2013 at 09:23:25AM +0100, Hans Petter Selasky wrote:
> On Tuesday 12 February 2013 15:51:01 Marius Strobl wrote:
> > On Mon, Feb 11, 2013 at 01:43:29PM +0100, Hans Petter Selasky wrote:
> > > Hi,
> > > 
> > > Does anyone need these lines in /etc/devd.conf ?
> > > 
> > > === etc/devd.conf
> > > ==
> > > --- etc/devd.conf (revision 246620)
> > > +++ etc/devd.conf (local)
> > > @@ -105,16 +105,6 @@
> > > 
> > >  #action "sleep 2 && /usr/sbin/ath3kfw -d $device-name -f
> > >  /usr/local/etc/ath3k-1.fw"; #};
> > > 
> > > -# When a USB keyboard arrives, attach it as the console keyboard.
> > > -attach 100 {
> > > - device-name "ukbd0";
> > > - action "/etc/rc.d/syscons setkeyboard /dev/ukbd0";
> > > -};
> > > -detach 100 {
> > > - device-name "ukbd0";
> > > - action "/etc/rc.d/syscons setkeyboard /dev/kbd0";
> > > -};
> > > -
> > > 
> > >  notify 100 {
> > >  
> > >   match "system" "DEVFS";
> > >   match "subsystem" "CDEV";
> > > 
> > > I plan to remove the lines marked with minus, because we now have kbdmux.
> > 
> > Do these entries have negative impact on systems using kbdmux(4)?
> > Will their lack have impact on systems not using kbdmux(4)? I typically
> > remove or at least disable the latter on machines without atkbd(4) etc.
> > hardware and thus ukbd(4) is the only keyboard driver ever used there.
> > 
> 
> Hi,
> 
> I suspect a system without kbdmux will still need these. However, these lines 
> are not correct with regard to multiple USB keyboards.
> 

Yes, but do these lines have ill effects for configurations with kbdmux(4)
and multiple keyboards? If not, then I'd strongly suggest keeping them for
the sake of making configurations without kbdmux(4) work out of the box.
If yes, I'd at least keep them in commented-out form and add a note
saying that they are required without kbdmux(4).

Marius


Re: [RFC] USB keyboard and devd.conf

2013-02-12 Thread Marius Strobl
On Mon, Feb 11, 2013 at 01:43:29PM +0100, Hans Petter Selasky wrote:
> Hi,
> 
> Does anyone need these lines in /etc/devd.conf ?
> 
> === etc/devd.conf
> ==
> --- etc/devd.conf (revision 246620)
> +++ etc/devd.conf (local)
> @@ -105,16 +105,6 @@
>  #action "sleep 2 && /usr/sbin/ath3kfw -d $device-name -f 
> /usr/local/etc/ath3k-1.fw";
>  #};
>  
> -# When a USB keyboard arrives, attach it as the console keyboard.
> -attach 100 {
> - device-name "ukbd0";
> - action "/etc/rc.d/syscons setkeyboard /dev/ukbd0";
> -};
> -detach 100 {
> - device-name "ukbd0";
> - action "/etc/rc.d/syscons setkeyboard /dev/kbd0";
> -};
> -
>  notify 100 {
>   match "system" "DEVFS";
>   match "subsystem" "CDEV";
> 
> 
> I plan to remove the lines marked with minus, because we now have kbdmux.
> 

Do these entries have negative impact on systems using kbdmux(4)?
Will their lack have impact on systems not using kbdmux(4)? I typically
remove or at least disable the latter on machines without atkbd(4) etc.
hardware and thus ukbd(4) is the only keyboard driver ever used there.

Marius


Re: Physbio changes final call for tests and reviews

2013-02-11 Thread Marius Strobl
On Sat, Feb 02, 2013 at 10:47:09PM +0100, Marius Strobl wrote:
> On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
> > Hi,
> > I finished the last (insignificant) missed bits in the Jeff' physbio
> > work. Now I am asking for the last round of testing and review, esp. for
> > the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
> > controllers which drivers are changed by the patchset. Please do test
> > this before the patchset is committed into HEAD !
> > 
> > The plan is to commit the patch somewhere in two weeks from this moment.
> > The patch is required for the finalizing of the unmapped I/O work for UFS
> > I did in parallel, which I hope to finish shortly after the commit.
> > 
> > Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
> > 
> 
> First tests on sparc64 with ata(4), mpt(4) and sym(4) look good (to
> be sure I still need to test with a machine using a streaming buffer
> in addition to the IOMMU, though).

FYI, the latter case is also fine.

Marius


Re: Physbio changes final call for tests and reviews

2013-02-03 Thread Marius Strobl
On Sun, Feb 03, 2013 at 06:11:45PM +0200, Konstantin Belousov wrote:
> On Sun, Feb 03, 2013 at 04:57:18PM +0100, Marius Strobl wrote:
> > On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
> > > Hi,
> > > I finished the last (insignificant) missed bits in the Jeff' physbio
> > > work. Now I am asking for the last round of testing and review, esp. for
> > > the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
> > > controllers which drivers are changed by the patchset. Please do test
> > > this before the patchset is committed into HEAD !
> > > 
> > > The plan is to commit the patch somewhere in two weeks from this moment.
> > > The patch is required for the finalizing of the unmapped I/O work for UFS
> > > I did in parallel, which I hope to finish shortly after the commit.
> > > 
> > > Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
> > > 
> > 
> > Once you bring in said UFS changes, will the use of bus_dmamap_load_ccb(9)
> > be a requirement for disk controller drivers?
> 
> Generally speaking, no. I plan to do the gradual migration of the drivers,
> definitely not forcing the unmapped bios down to the drivers which are
> not tested yet. In the patch, driver indicates the support for unmapped
> bios by a DISKFLAG. If flag is set, driver could receive both mapped
> and unmapped bios, and use of the bus_dmamap_load_ccb(), while formally
> is only convenience, is essentially the requirement.
> 
> If driver does not set the flag, it receives the same i/o requests as
> it does now. Geom performs transient compat mapping for the unmapped
> requests on its own for such drivers. As result, driver does not need
> a change.
> 
> My plan is to convert ahci(4) and then some often used high-profile drivers
> like mfi(4) and mps(4). I can also hope for isci(4) help.
> 
> Everything else, IMO, could be done on the best efforts basis, when both
> developers time and testing facilities are available. Jeff wanted to do
> all driver conversion in one pass, but IMO this is unrealistic. Still, I
> started write some helpers which should provide the transient one-page
> mappings for PIO modes.

Okay

> 
> You can look at some previous version of the unmapped patch at
> http://people.freebsd.org/~kib/misc/unmapped.8.patch. It only contain a
> hack for ahci(4), which should be fixed properly after physbio is committed.

Hrm, the changes to the sparc64 pmap code in the latter patch might
need some more attention as some of the functions used for copying
pages there IIRC have constraints on the alignment of source and
destination as well as on the count. Can you say something about
these properties when pmap_copy_page_offs() is called via
pmap_copy_pages()?

Marius
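
A note for readers of the archive: the opt-in scheme described in the
quoted reply (a driver announces support for unmapped bios with a flag,
otherwise GEOM maps the request before handing it down) can be modelled
roughly as below. This is a user-space toy, not code from the physbio or
unmapped I/O patches; the flag name, both structs and toy_dispatch() are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_DISKFLAG_UNMAPPED_BIO	0x1u	/* hypothetical flag */

struct toy_disk {
	unsigned int flags;	/* capabilities announced by the driver */
};

struct toy_bio {
	int unmapped;		/* 1 = data has no kernel mapping */
	int mapped_by_geom;	/* 1 = a transient compat mapping was made */
};

/*
 * Hand a request to the driver.  Drivers that did not set the flag keep
 * seeing mapped requests only, so they need no change.
 * Returns 1 if the driver receives an unmapped bio, 0 otherwise.
 */
int
toy_dispatch(const struct toy_disk *dp, struct toy_bio *bp)
{
	assert(dp != NULL && bp != NULL);
	if (bp->unmapped && (dp->flags & TOY_DISKFLAG_UNMAPPED_BIO) == 0) {
		bp->unmapped = 0;	/* stands in for the compat mapping */
		bp->mapped_by_geom = 1;
	}
	return (bp->unmapped);
}
```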


Re: Physbio changes final call for tests and reviews

2013-02-03 Thread Marius Strobl
On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
> Hi,
> I finished the last (insignificant) missed bits in the Jeff' physbio
> work. Now I am asking for the last round of testing and review, esp. for
> the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
> controllers which drivers are changed by the patchset. Please do test
> this before the patchset is committed into HEAD !
> 
> The plan is to commit the patch somewhere in two weeks from this moment.
> The patch is required for the finalizing of the unmapped I/O work for UFS
> I did in parallel, which I hope to finish shortly after the commit.
> 
> Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
> 

Once you bring in said UFS changes, will the use of bus_dmamap_load_ccb(9)
be a requirement for disk controller drivers?

Marius


Re: Physbio changes final call for tests and reviews

2013-02-02 Thread Marius Strobl
On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
> Hi,
> I finished the last (insignificant) missed bits in the Jeff' physbio
> work. Now I am asking for the last round of testing and review, esp. for
> the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
> controllers which drivers are changed by the patchset. Please do test
> this before the patchset is committed into HEAD !
> 
> The plan is to commit the patch somewhere in two weeks from this moment.
> The patch is required for the finalizing of the unmapped I/O work for UFS
> I did in parallel, which I hope to finish shortly after the commit.
> 
> Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
> 

First tests on sparc64 with ata(4), mpt(4) and sym(4) look good (to
be sure I still need to test with a machine using a streaming buffer
in addition to the IOMMU, though).
However, by accident I noticed that your patch (i.e. stock head is
fine) somehow breaks smartd of smartmontools with ata(4):
root@b1k2:/root # smartd
ata3: timeout waiting for write DRQ
The machine just hangs at this point (it's also strange that the above
message is from the PIO rather than from the DMA path).

One note: mjacob@ probably will be annoyed if you don't wrap the
changes to isp(4) in __FreeBSD_version so the same source still
compiles on older ones.

Marius
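
A note for readers of the archive: wrapping a change in
__FreeBSD_version, as suggested above for isp(4), usually looks like the
sketch below. It is not taken from isp(4) or from the physbio patch. The
cut-off TOY_PHYSBIO_VERSION is a placeholder rather than the real version
bump, and toy_uses_new_dma_api() is a hypothetical function that only
reports which branch was compiled.

```c
/*
 * On FreeBSD the macro comes from <sys/param.h>.  The fallback value
 * below is a placeholder so that the sketch also compiles elsewhere.
 */
#ifndef __FreeBSD_version
#define __FreeBSD_version 1000000
#endif

/* Placeholder cut-off; NOT the real version bump for the physbio work. */
#define TOY_PHYSBIO_VERSION 1000029

/* Returns 1 if the new code path was compiled in, 0 for the old one. */
int
toy_uses_new_dma_api(void)
{
#if __FreeBSD_version >= TOY_PHYSBIO_VERSION
	return (1);	/* the new loading routine would be called here */
#else
	return (0);	/* pre-change code, kept for older releases */
#endif
}
```

The same source file then builds on releases from before and after the
interface change, which is the point of the request.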


Re: Fast gettimeofday(2) and static linking

2013-01-29 Thread Marius Strobl
On Mon, Jan 28, 2013 at 05:55:24PM +0200, Konstantin Belousov wrote:
> On Mon, Jan 28, 2013 at 04:45:17PM +0100, Marius Strobl wrote:
> > On Fri, Jan 25, 2013 at 02:35:54PM +0200, Konstantin Belousov wrote:
> > > Bruce Evans reported that statically linked binaries on HEAD and stable/9
> > > use the syscall for gettimeofday(2) and clock_gettime(2). Apparently, this
> > > is due to my use of the weak reference to the __vdso* symbols in the
> > > libc implementations.
> > > 
> > > Patch below reworks the __vdso* attributes to only make the symbols
> > > weak, but keep the references strong. Since I have to add a stub for
> > > each architecture, I would like to ask non-x86 machines owners to test
> > > the patch.
> > > 
> > 
> > Hi Konstantin,
> > 
> > what's the appropriate way to test this?
> 
> Please rebuild the world with the patch and check that gettimeofday(2) still
> works on your architecture, both for the static and dynamic binaries.
> I think that just booting multiuser is enough.

Okay, looks good on sparc64 (tested with a dynamically as well as a
statically built time(1)).

Marius
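
A note for readers of the archive: the difference between a weak
reference and a weak symbol with a strong reference can be illustrated
as below. This is a sketch using the GCC/Clang weak attribute and is not
the libc patch under discussion; toy_vdso_gettimeofday() and
toy_gettimeofday() are hypothetical names standing in for the __vdso*
symbols and their callers. As far as I understand static linking, an
undefined weak reference does not pull a member out of an archive, which
would explain the reported fallback to the syscall; a strong reference
to a weakly defined stub avoids that.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/time.h>

/*
 * Weak definition: a per-architecture stub.  A stronger definition in
 * another object file replaces it at link time.
 */
__attribute__((weak)) int
toy_vdso_gettimeofday(struct timeval *tv)
{
	(void)tv;
	return (ENOSYS);	/* no fast path on this architecture */
}

/*
 * Strong reference: the call below is an ordinary one, so the linker
 * has to resolve it, to the stub above if nothing better exists.
 */
int
toy_gettimeofday(struct timeval *tv)
{
	int error = toy_vdso_gettimeofday(tv);

	if (error == ENOSYS)
		return (gettimeofday(tv, NULL));	/* plain syscall path */
	return (error);
}
```

Checking that the wrapper still returns a sane time with both a static
and a dynamic build corresponds to the time(1) test mentioned above.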


Re: Fast gettimeofday(2) and static linking

2013-01-28 Thread Marius Strobl
On Fri, Jan 25, 2013 at 02:35:54PM +0200, Konstantin Belousov wrote:
> Bruce Evans reported that statically linked binaries on HEAD and stable/9
> use the syscall for gettimeofday(2) and clock_gettime(2). Apparently, this
> is due to my use of the weak reference to the __vdso* symbols in the
> libc implementations.
> 
> Patch below reworks the __vdso* attributes to only make the symbols
> weak, but keep the references strong. Since I have to add a stub for
> each architecture, I would like to ask non-x86 machines owners to test
> the patch.
> 

Hi Konstantin,

what's the appropriate way to test this?

Marius


Re: [RFC/RFT] calloutng

2013-01-21 Thread Marius Strobl
On Sun, Jan 13, 2013 at 09:36:11PM +0200, Alexander Motin wrote:
> On 13.01.2013 20:09, Marius Strobl wrote:
> > On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
> >> On 06.01.2013 17:23, Marius Strobl wrote:
> >>> I'm not really sure what to do about that. Earlier you already said
> >>> that sched_bind(9) also isn't an option in case td_critnest > 1.
> >>> To be honest, I don't really understand why using a spin lock in the
> >>> timecounter path makes sparc64 the only problematic architecture
> >>> for your changes. The x86 i8254_get_timecount() also uses a spin lock
> >>> so it should be in the same boat.
> >>
> >> The problem is not in using spinlock, but in waiting for other CPU while
> >> spinlock is held. Other CPU may also hold spinlock and wait for
> >> something, causing deadlock. i8254 code uses spinlock just to atomically
> >> access hardware registers, so it causes no problems.
> > 
> > Okay, but wouldn't that be a general problem then? Pretty much
> > anything triggering an IPI holds smp_ipi_mtx while doing so and
> > the lower level IPI stuff waits for other CPU(s), including on
> > x86.
> 
> The problem is general. But now it works because single smp_ipi_mtx is
> used in all cases where IPI result is waited. As soon as spinning
> happens with interrupts still enabled, there is no deadlocks. But
> problem reappears if any different lock is used, or locks are nested.

I'm having a hard time getting an alternate time counter device to
work. The crystal required for the counters in the south bridge just
doesn't seem to be mounted anywhere near it (I've not looked at the
bottom of the PCB though). While the time counter part of the on-
board bge(4) driven chips basically work, they don't seem to like
concurrent accesses caused by the rest of bge(4). I.e. although the
counter is just read, sooner or later this causes a fatal bus error.
I haven't tried serializing accesses to the chip, but getting to such
a complexity for just reading a non-indexed register at least doesn't
feel good ...
However, AFAICT the scenario you describe can't happen. On sparc64,
spinlock_enter() only raises the processor interrupt level, which
doesn't block the direct cross traps I've used to implement remote
reading of (S)TICK (which also means that the actions such traps may
perform are very limited and must occur in interrupt context, but
they are sufficient for this purpose and that in turn makes them very
fast). I.e. although the AP holds smp_ipi_mtx or any number of
nested spin locks, this will not deadlock in case the BSP also holds
any spin lock when reading (S)TICK from it.

Marius


Re: [RFC/RFT] calloutng

2013-01-13 Thread Marius Strobl
On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
> On 06.01.2013 17:23, Marius Strobl wrote:
> > On Wed, Dec 26, 2012 at 09:24:46PM +0200, Alexander Motin wrote:
> >> On 26.12.2012 01:21, Marius Strobl wrote:
> >>> On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
> >>>> Experiments with dummynet shown ineffective support for very short
> >>>> tick-based callouts. New version fixes that, allowing to get as many
> >>>> tick-based callout events as hz value permits, while still be able to
> >>>> aggregate events and generating minimum of interrupts.
> >>>>
> >>>> Also this version modifies system load average calculation to fix some
> >>>> cases existing in HEAD and 9 branches, that could be fixed with new
> >>>> direct callout functionality.
> >>>>
> >>>> http://people.freebsd.org/~mav/calloutng_12_17.patch
> >>>>
> >>>> With several important changes made last time I am going to delay commit
> >>>> to HEAD for another week to do more testing. Comments and new test cases
> >>>> are welcome. Thanks for staying tuned and commenting.
> >>>
> >>> FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
> >>> try on sparc64 and it at least survives a buildworld there. However,
> >>> with the patched kernels, buildworld times seem to increase slightly but
> >>> reproducible by 1-2% (I only did four runs but typically buildworld
> >>> times are rather stable and don't vary more than a minute for the
> >>> same kernel and source here). Is this an expected trade-off (system
> >>> time as such doesn't seem to increase)?
> >>
> >> I don't think build process uses significant number of callouts to 
> >> affect results directly. I think this additional time could be result of 
> >> the deeper next event look up, done by the new code, that is practically 
> >> useless for sparc64, which effectively has no cpu_idle() routine. It 
> >> wouldn't affect system time and wouldn't show up in any statistics 
> >> (except PMC or something alike) because it is executed inside timer 
> >> hardware interrupt handler. If my guess is right, that is a part that 
> >> probably still could be optimized. I'll look on it. Thanks.
> >>
> >>> Is there anything specific to test?
> >>
> >> Since the most of code is MI, for sparc64 I would mostly look on related 
> >> MD parts (eventtimers and timecounters) to make sure they are working 
> >> reliably in more stressful conditions.  I still have some worries about 
> >> possible deadlock on hardware where IPIs are used to fetch present time 
> >> from other CPU.
> > 
> > Well, I've just learnt two things the hard way:
> > a) We really need the mutex in that path.
> > b) Assuming that the initial synchronization of the counters is good
> >    enough and they won't drift considerably across the CPUs so we can
> >    always use the local one makes things go south pretty soon after
> >    boot. At least with your calloutng_12_26.patch applied.
> 
> Do you think it means they are not really synchronized for some reason?

There's definitely no hardware in place which would synchronize them.
I've no idea how to properly measure the difference between two tick
counters, but I think it's rather their drift and not the software
synchronization we do when starting APs that is causing problems.
Mainly, because I can't really think of a better algorithm for doing
the latter when starting the APs. The symptoms are that about 30 to
60 seconds after that I start to see weird timeouts from device
drivers. I'd need to check how long these timeouts actually are to
see whether it could be a problem right from the start though. In
any case, it seems foolish to think that synchronizing them once
would be sufficient and they won't drift until shutdown. Linux
probably also doesn't keep re-synchronizing them without a reason.
Just using a single timecounter source simply appears to be the
better choice.
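
To put a number on the drift argument, here is a back-of-the-envelope
sketch. It only shows how skew between two free-running counters grows
with time after a perfect one-time synchronization; it does not claim to
explain the timeouts. The struct, the function names and any frequency
or ppm figures fed into it are assumptions, not measurements.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical per-CPU tick counter: nominal rate plus an error in ppm. */
struct toy_counter {
	uint64_t nominal_hz;	/* assumed to be a multiple of 1000000 */
	int32_t ppm_error;	/* parts per million, may be negative */
};

/* Ticks shown after 'seconds' of real time, counting from a sync at 0. */
uint64_t
toy_ticks_after(const struct toy_counter *c, uint64_t seconds)
{
	int64_t hz;

	assert(c != NULL);
	hz = (int64_t)c->nominal_hz +
	    (int64_t)(c->nominal_hz / 1000000) * c->ppm_error;
	return ((uint64_t)hz * seconds);
}

/* Skew of counter a relative to b, in microseconds of nominal time. */
int64_t
toy_skew_us(const struct toy_counter *a, const struct toy_counter *b,
    uint64_t seconds)
{
	int64_t diff = (int64_t)toy_ticks_after(a, seconds) -
	    (int64_t)toy_ticks_after(b, seconds);

	return (diff * 1000000 / (int64_t)a->nominal_hz);
}
```

With two assumed 1 GHz counters that are off by +20 ppm and -20 ppm, the
skew reaches 2400 microseconds after 60 seconds and keeps growing, so a
thread that moves between CPUs and always reads the local counter sees
time jump by that amount.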

> 
> > I'm not really sure what to do about that. Earlier you already said
> > that sched_bind(9) also isn't an option in case td_critnest > 1.
> > To be honest, I don't really understand why using a spin lock in the
> > timecounter path makes sparc64 the only problematic architecture
> > for your changes. The x86 i8254_get_timecount() also uses a spin lock
> > so it should be in the same boat.
> 
> The 

Re: [RFC/RFT] calloutng

2013-01-06 Thread Marius Strobl
On Wed, Dec 26, 2012 at 09:24:46PM +0200, Alexander Motin wrote:
> On 26.12.2012 01:21, Marius Strobl wrote:
> > On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
> >> Experiments with dummynet shown ineffective support for very short
> >> tick-based callouts. New version fixes that, allowing to get as many
> >> tick-based callout events as hz value permits, while still be able to
> >> aggregate events and generating minimum of interrupts.
> >>
> >> Also this version modifies system load average calculation to fix some
> >> cases existing in HEAD and 9 branches, that could be fixed with new
> >> direct callout functionality.
> >>
> >> http://people.freebsd.org/~mav/calloutng_12_17.patch
> >>
> >> With several important changes made last time I am going to delay commit
> >> to HEAD for another week to do more testing. Comments and new test cases
> >> are welcome. Thanks for staying tuned and commenting.
> >
> > FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
> > try on sparc64 and it at least survives a buildworld there. However,
> > with the patched kernels, buildworld times seem to increase slightly but
> > > reproducibly by 1-2% (I only did four runs but typically buildworld
> > times are rather stable and don't vary more than a minute for the
> > same kernel and source here). Is this an expected trade-off (system
> > time as such doesn't seem to increase)?
> 
> I don't think build process uses significant number of callouts to 
> affect results directly. I think this additional time could be result of 
> the deeper next event look up, done by the new code, that is practically 
> useless for sparc64, which effectively has no cpu_idle() routine. It 
> wouldn't affect system time and wouldn't show up in any statistics 
> (except PMC or something alike) because it is executed inside timer 
> hardware interrupt handler. If my guess is right, that is a part that 
> probably still could be optimized. I'll look on it. Thanks.
> 
> > Is there anything specific to test?
> 
> Since the most of code is MI, for sparc64 I would mostly look on related 
> MD parts (eventtimers and timecounters) to make sure they are working 
> reliably in more stressful conditions.  I still have some worries about 
> possible deadlock on hardware where IPIs are used to fetch present time 
> from other CPU.

Well, I've just learnt two things the hard way:
a) We really need the mutex in that path.
b) Assuming that the initial synchronization of the counters is good
   enough and they won't drift considerably across the CPUs so we can
   always use the local one makes things go south pretty soon after
   boot. At least with your calloutng_12_26.patch applied.
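
To illustrate point b), here is a minimal sketch of why even a small skew
between per-CPU counters hurts. It is not code from the patch; the 1 GHz
frequency, the 10 ppm difference and the function names are made-up
assumptions. It shows how far two counters diverge over time and what an
unsigned delta looks like when a reading comes from a CPU whose counter is
behind.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Skew in ticks accumulated after 'seconds' between two counters that
 * nominally run at 'hz' but differ by 'ppm' parts per million.
 */
uint64_t
skew_after(uint64_t hz, uint64_t ppm, uint64_t seconds)
{
	assert(ppm <= 1000000);
	return (hz * ppm / 1000000 * seconds);
}

/*
 * Ticks elapsed between two readings of a free-running 64-bit counter,
 * using unsigned wrap-around arithmetic. If 'now' was read from a counter
 * that is behind the one 'last' came from, the result is a huge value
 * rather than a small negative one.
 */
uint64_t
ticks_elapsed(uint64_t last, uint64_t now)
{
	return (now - last);
}
```

With these assumed numbers, skew_after(1000000000, 10, 60) gives 600000
ticks, i.e. 600 us of divergence after one minute, and
ticks_elapsed(1000000, 999600) yields a value close to 2^64 instead of -400.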

I'm not really sure what to do about that. Earlier you already said
that sched_bind(9) also isn't an option if td_critnest > 1.
To be honest, I don't really understand why using a spin lock in the
timecounter path makes sparc64 the only problematic architecture
for your changes. The x86 i8254_get_timecount() also uses a spin lock
so it should be in the same boat.

The affected machines are equipped with a x86-style south bridge
which exposes a power management unit (intended to be used as a SMBus
bridge only in these machines) on the PCI bus. Actually, this device
also includes an ACPI power management timer. However, I've just
spent a day trying to get that one working without success - it
just doesn't increment. Probably its clock input isn't connected as
it's not intended to be used in these machines.
That south bridge also includes 8254 compatible timers on the ISA/
LPC side, but these are hidden from the OFW device tree. I can hack these
devices into existence and give it a try, but even if that works this
likely would use the same code as the x86 i8254_get_timecount() so I
don't see what would be gained with that.

The last option I can think of to avoid using the tick counter as
timecounter in the MP case is that the Broadcom MACs in the affected
machines also provide a counter driven by a 1 MHz clock. If that's good
enough for a timecounter I can hook these up (in case these work ...)
and hack bge(4) to not detach in that case (given that we can't detach
timecounters ...).
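
As a rough sketch of what a 1 MHz counter would offer: each tick is 1 us,
and the wrap-around period depends on the counter width. The width of the
Broadcom counter is not stated above, so the 32 bits used in the example
values are an assumption, and the function names are hypothetical. The
masked delta is the usual way to make a single wrap between two reads
harmless.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Elapsed ticks for a free-running counter that is 'bits' wide. Masking
 * the difference keeps the result correct across one wrap-around.
 */
uint32_t
counter_delta(uint32_t last, uint32_t now, unsigned bits)
{
	uint32_t mask;

	assert(bits >= 1 && bits <= 32);
	mask = (bits == 32) ? UINT32_MAX : ((UINT32_C(1) << bits) - 1);
	return ((now - last) & mask);
}

/* Whole seconds until a counter of the given width wraps at 'hz'. */
uint64_t
wrap_seconds(unsigned bits, uint64_t hz)
{
	assert(bits >= 1 && bits <= 32 && hz > 0);
	return ((UINT64_C(1) << bits) / hz);
}
```

Under the 32-bit assumption, wrap_seconds(32, 1000000) is 4294, so such a
counter would have to be sampled at least once every 71 minutes or so; a
16-bit counter at the same rate would wrap in well under a second.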

> 
> Here is small tool we are using for test correctness and performance of 
> different user-level APIs: http://people.freebsd.org/~mav/testsleep.c
> 

I've run Ian's set of tests on a v215 with and without your
calloutng_12_26.patch and on a v210 (these use the IPI approach)
with the latter also applied.
I'm not really sure what to make out of the numbers.

   v215 w/o v215 w/ 

Re: [RFC/RFT] calloutng

2012-12-25 Thread Marius Strobl
On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
> Experiments with dummynet shown ineffective support for very short 
> tick-based callouts. New version fixes that, allowing to get as many 
> tick-based callout events as hz value permits, while still be able to 
> aggregate events and generating minimum of interrupts.
> 
> Also this version modifies system load average calculation to fix some 
> cases existing in HEAD and 9 branches, that could be fixed with new 
> direct callout functionality.
> 
> http://people.freebsd.org/~mav/calloutng_12_17.patch
> 
> With several important changes made last time I am going to delay commit 
> to HEAD for another week to do more testing. Comments and new test cases 
> are welcome. Thanks for staying tuned and commenting.

FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
try on sparc64 and it at least survives a buildworld there. However,
with the patched kernels, buildworld times seem to increase slightly but
reproducibly by 1-2% (I only did four runs but typically buildworld
times are rather stable and don't vary more than a minute for the
same kernel and source here). Is this an expected trade-off (system
time as such doesn't seem to increase)?
Is there anything specific to test?

Marius

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: [CFC/CFT] large changes in the loader(8) code

2012-07-16 Thread Marius Strobl
On Mon, Jul 16, 2012 at 04:00:49PM +0400, Andrey V. Elsukov wrote:
> On 16.07.2012 15:31, Andriy Gapon wrote:
> >> Yes. It should work as before.
> > 
> > Well, but it's obvious that zfs_probe_dev would be attempting to do some
> > unneeded stuff (trying to treat partitions as disks) for that case.  To me
> > this is a clear indication zfs_probe_dev is not optimal for
> > arch-independent implementation.  So I still think that arch_zfs_probe
> > should decide what disks and partitions to probe, and zfs_probe_dev should
> > only probe what it's given and not try to be any smarter.
> > But I've repeated myself three times already :-)
> 
> And we will have the same - several copies of the same code in each
> architecture, which I have deleted...
> 
> Sparc doesn't support DIOCGMEDIASIZE and DIOCGSECTORSIZE ioctls,
> so it will not check each partition, only fd that is passed to the
> zfs_probe_dev.
> 
> Currently there is only one problem with ZFS tasting, that can affect users -
> now we taste each disk and partition, but in my branch ZFS tastes only
> disks and partitions with type "freebsd" and "freebsd-zfs". So if you have
> created ZFS on top of MBR partition with type "ntfs", then loader will be
> unable to detect it.
> 

Sorry, I'm missing the big picture of ZFS support in the loader and
currently unfortunately don't have the time to look into it or your
patches. I don't think there's a way to determine the media and
sector sizes without actually looking at the Sun and/or VTOC8 labels
though. As for zfs_probe_dev, some user recently indicated that
on sparc64 we should rather look at the disk devices listed in
the "boot-device" environment variable in order to mimic what Solaris
does rather than trying to probe anything that might be a disk device,
mimicking what the FreeBSD/i386 ZFS loader does. Maybe that's a hint
whether an arch_zfs_probe should exist.
I can test patches once you guys have figured out how things should
work though.

Marius



Re: [head tinderbox] failure on powerpc64/powerpc

2011-12-24 Thread Marius Strobl
On Sat, Dec 24, 2011 at 03:43:51PM +, FreeBSD Tinderbox wrote:
> TB --- 2011-12-24 13:54:50 - tinderbox 2.8 running on freebsd-current.sentex.ca
> TB --- 2011-12-24 13:54:50 - starting HEAD tinderbox run for powerpc64/powerpc
> TB --- 2011-12-24 13:54:50 - cleaning the object tree
> TB --- 2011-12-24 13:55:13 - cvsupping the source tree
> TB --- 2011-12-24 13:55:13 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca /tinderbox/HEAD/powerpc64/powerpc/supfile
> TB --- 2011-12-24 13:55:26 - building world
> TB --- 2011-12-24 13:55:26 - CROSS_BUILD_TESTING=YES
> TB --- 2011-12-24 13:55:26 - MAKEOBJDIRPREFIX=/obj
> TB --- 2011-12-24 13:55:26 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
> TB --- 2011-12-24 13:55:26 - SRCCONF=/dev/null
> TB --- 2011-12-24 13:55:26 - TARGET=powerpc
> TB --- 2011-12-24 13:55:26 - TARGET_ARCH=powerpc64
> TB --- 2011-12-24 13:55:26 - TZ=UTC
> TB --- 2011-12-24 13:55:26 - __MAKE_CONF=/dev/null
> TB --- 2011-12-24 13:55:26 - cd /src
> TB --- 2011-12-24 13:55:26 - /usr/bin/make -B buildworld
> >>> World build started on Sat Dec 24 13:55:27 UTC 2011
> >>> Rebuilding the temporary build tree
> >>> stage 1.1: legacy release compatibility shims
> >>> stage 1.2: bootstrap tools
> >>> stage 2.1: cleaning up the object tree
> >>> stage 2.2: rebuilding the object tree
> >>> stage 2.3: build tools
> >>> stage 3: cross tools
> >>> stage 4.1: building includes
> >>> stage 4.2: building libraries
> >>> stage 4.3: make dependencies
> >>> stage 4.4: building everything
> [...]
> rsyncfile.o:(.text+0xf8): undefined reference to `MD5Update'
> stream.o:(.text+0x544): undefined reference to `MD5Init'
> stream.o:(.text+0xb9c): undefined reference to `MD5Update'
> stream.o:(.text+0xd0c): undefined reference to `MD5Update'
> stream.o:(.text+0xd40): undefined reference to `MD5Update'
> stream.o:(.text+0xd54): undefined reference to `MD5Update'
> stream.o:(.text+0xd84): undefined reference to `MD5Update'
> stream.o:(.text+0xd98): more undefined references to `MD5Update' follow
> *** Error code 1
> 

The tinderbox output isn't very helpful here and I've no idea how this
could happen as r228857 also added -lmd, nor can I reproduce it. Could
this be a transient failure due to the tinderbox updating sources at
an unfortunate point in time or a glitch in the exporter (according to
the sources presented by cvsweb.freebsd.org r228857 has reached the CVS
repository just fine though)?

Marius



Re: sparc64 r228561 panic: kmem_suballoc: bad status return of 3

2011-12-16 Thread Marius Strobl
On Fri, Dec 16, 2011 at 11:19:22AM +, Anton Shterenlikht wrote:
> On Fri, Dec 16, 2011 at 11:37:20AM +0100, Marius Strobl wrote:
> > On Fri, Dec 16, 2011 at 08:40:48AM +, Anton Shterenlikht wrote:
> > > Updating from r216048 to r228561 on sparc64,
> > > with sys/conf/newvers.sh changed to REVISION="9.9".
> > > 
> > > Transcribed by hand:
> > > 
> > > FreeBSD 9.9-CURRENT #3 r228561M:
> > > 
> > > panic: kmem_suballoc: bad status return of 3
> > > KDB: enter: panic
> > > [ thread pid 0 tid 0 ]
> > > Stopped at 0x02937e0:   ta%xcc,1
> > > db>
> > > 
> > > The keyboard froze, couldn't get a bt,
> > > required a cold reboot.
> > > 
> > > My /etc/make.conf and kernel config files are below.
> > > 
> > > Any advice?
> > > 
> > 
> > Hrm, doesn't look like I can reproduce this. What machine model is
> > that and how much RAM does it have?
> 
> From dmesg:
> 
> real memory  = 2147483648 (2048 MB)
> avail memory = 2079449088 (1983 MB)
> cpu0: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
> 
> > Do you use any loader tunables?
> 
> I don't think so. You mean like /boot/loader.conf?
> I haven't got this file at all.
> 

Even with a Blade 1500, which is the closest match to your machine
that I have, and a kernel built with your configuration file I can't
reproduce this using r228583. I'd suggest testing with a kernel built
using an empty object directory and without any local modifications.
If that still doesn't solve the problem then, given that there isn't
even a backtrace, I can only suggest doing a binary search for the
offending commit, probably paying particular attention to the changes
to the VM within the window of revisions in question.
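
The binary search suggested above can be sketched as follows. This is only
an illustration: the callback stands in for "build a kernel from that
revision, boot it and see whether it panics", all names are hypothetical,
and it assumes every revision from the offending commit onwards shows the
panic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical test: nonzero if a kernel built from 'rev' panics. */
typedef int (*is_bad_fn)(long rev, void *arg);

/*
 * Given a known-good and a known-bad revision, return the first bad one.
 * If 'steps' is not NULL it is incremented once per revision tested.
 */
long
bisect_first_bad(long good, long bad, is_bad_fn is_bad, void *arg,
    int *steps)
{
	assert(good < bad);
	while (bad - good > 1) {
		long mid = good + (bad - good) / 2;

		if (is_bad(mid, arg))
			bad = mid;	/* culprit is at or before mid */
		else
			good = mid;	/* culprit is after mid */
		if (steps != NULL)
			(*steps)++;
	}
	return (bad);
}

/* Stand-in predicate for demonstration: *arg is the pretend culprit. */
int
panics_since(long rev, void *arg)
{
	return (rev >= *(long *)arg);
}
```

For the window mentioned in this thread, r216048 (working) to r228561
(panicking), this needs at most 14 test builds, whichever revision turns out
to be the culprit.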

Marius



Re: sparc64 r228561 panic: kmem_suballoc: bad status return of 3

2011-12-16 Thread Marius Strobl
On Fri, Dec 16, 2011 at 08:40:48AM +, Anton Shterenlikht wrote:
> Updating from r216048 to r228561 on sparc64,
> with sys/conf/newvers.sh changed to REVISION="9.9".
> 
> Transcribed by hand:
> 
> FreeBSD 9.9-CURRENT #3 r228561M:
> 
> panic: kmem_suballoc: bad status return of 3
> KDB: enter: panic
> [ thread pid 0 tid 0 ]
> Stopped at 0x02937e0:   ta%xcc,1
> db>
> 
> The keyboard froze, couldn't get a bt,
> required a cold reboot.
> 
> My /etc/make.conf and kernel config files are below.
> 
> Any advice?
> 

Hrm, doesn't look like I can reproduce this. What machine model is
that and how much RAM does it have? Do you use any loader tunables?

Marius



Re: Burning CDs and DVDs on SATA drive in FreeBSD 9.0

2011-12-10 Thread Marius Strobl
On Sat, Dec 10, 2011 at 02:27:22AM -0800, Thomas Mueller wrote:
> 
> --- On Fri, 12/9/11, Marius Strobl  wrote:
> 
> > +, Thomas Mueller wrote:
> > > > Recompile the port; the CAM ioctl numbers have changed.
> > > >
> > > > Cheers
> > > > Michiel
> > >
> > > When did these CAM ioctl numbers change?  Was it before or after I
> > > built and installed cdrtools?
> > >
> > > Running ls -rtl /var/db/pkg/cdrtools-3.00_1 produces
> > >
> > > total 48
> > > -rw-r--r--  1 root  wheel  17550 Sep 26 09:20 +MTREE_DIRS
> > > -rw-r--r--  1 root  wheel    470 Sep 26 09:20 +DISPLAY
> > > -rw-r--r--  1 root  wheel   1009 Sep 26 09:20 +DESC
> > > -rw-r--r--  1 root  wheel  11102 Sep 26 09:20 +CONTENTS
> > > -rw-r--r--  1 root  wheel     63 Sep 26 09:20 +COMMENT
> > > -rw-r--r--  1 root  wheel     17 Dec  7 15:44 +REQUIRED_BY
> > >
> > > So it might have been on FreeBSD 9.0-BETA2.
> >
> > I'm not sure what CAM IOCTL number change others are referring to but
> > you certainly need to rebuild libcam consumers after r225950, which
> > was merged to stable/9 in r226067 on October 6 2011.
> >
> > Marius
>  
> Thanks for response.  I'm at the older computer now, but will need to check 
> /usr/src/UPDATING, and portupgrade or portmaster cdrtools after 
> source-upgrading FreeBSD 9.0-RC2 to RC3. 
> 

There's no corresponding entry in UPDATING.

Marius



Re: Burning CDs and DVDs on SATA drive in FreeBSD 9.0

2011-12-09 Thread Marius Strobl
On Fri, Dec 09, 2011 at 09:23:58AM +, Thomas Mueller wrote:
> > Recompile the port; the CAM ioctl numbers have changed.
> 
> > Cheers
> > Michiel
> 
> When did these CAM ioctl numbers change?  Was it before or after I built and 
> installed cdrtools?
> 
> Running ls -rtl /var/db/pkg/cdrtools-3.00_1 produces
> 
> 
> total 48
> -rw-r--r--  1 root  wheel  17550 Sep 26 09:20 +MTREE_DIRS
> -rw-r--r--  1 root  wheel    470 Sep 26 09:20 +DISPLAY
> -rw-r--r--  1 root  wheel   1009 Sep 26 09:20 +DESC
> -rw-r--r--  1 root  wheel  11102 Sep 26 09:20 +CONTENTS
> -rw-r--r--  1 root  wheel     63 Sep 26 09:20 +COMMENT
> -rw-r--r--  1 root  wheel     17 Dec  7 15:44 +REQUIRED_BY
> 
> 
> So it might have been on FreeBSD 9.0-BETA2.
> 

I'm not sure what CAM IOCTL number change others are referring to but
you certainly need to rebuild libcam consumers after r225950, which
was merged to stable/9 in r226067 on October 6 2011.

Marius



Re: nge(4), tl(4), wb(4) and rl(4) 8129 testers wanted [Re: Question about GPIO bitbang MII]

2011-10-15 Thread Marius Strobl
On Sun, Oct 16, 2011 at 02:46:23AM +0200, Damien Fleuriot wrote:
> 
> 
> On 15 Oct 2011, at 22:56, Marius Strobl  wrote:
> 
> > 
> > Could owners of nge(4), tl(4), wb(4) and rl(4) driven hardware (as for
> > rl(4) only 8129 need testing, 8139 don't) please give the following
> > patch a try in order to ensure it doesn't break anything?
> > for 9/head:
> > http://people.freebsd.org/~marius/mii_bitbang.diff
> > for 8:
> > http://people.freebsd.org/~marius/mii_bitbang.diff8
> > 
> > Thanks,
> > Marius
> > 
> 
> 
> While I don't have any box with this hardware, I'm thinking you might want to 
> get a bit more specific about what you want tested...
> 
> What do you think the patch might break ?
> 

Basically, if there's something wrong with the patch, the driver should
fail to attach; if it still attaches and gets a link, all should be fine.

Marius



nge(4), tl(4), wb(4) and rl(4) 8129 testers wanted [Re: Question about GPIO bitbang MII]

2011-10-15 Thread Marius Strobl

Could owners of nge(4), tl(4), wb(4) and rl(4) driven hardware (as for
rl(4) only 8129 need testing, 8139 don't) please give the following
patch a try in order to ensure it doesn't break anything?
for 9/head:
http://people.freebsd.org/~marius/mii_bitbang.diff
for 8:
http://people.freebsd.org/~marius/mii_bitbang.diff8

Thanks,
Marius



Re: SCSI descriptor sense changes, testing needed

2011-09-23 Thread Marius Strobl
On Thu, Sep 22, 2011 at 01:33:05PM -0600, Kenneth D. Merry wrote:
> 
> I have attached a set of patches against head that implement SCSI
> descriptor sense support for CAM.
> 
> Descriptor sense is a new sense (SCSI error) format introduced in the SPC-3
> spec in 2006.  FreeBSD doesn't currently support it.
> 
> Seagate's new 3TB SAS drives come with descriptor sense enabled by default,
> and it's possible that other newer drives do as well.  Because all the
> sense key, additional sense code, and additional sense code qualifier
> fields are in different places, the CAM error recovery code will not do the
> right thing when it gets descriptor sense.
> 
> These patches do bump up the size of struct scsi_sense_data, and so I have
> incremented CAM_VERSION as well.  I have discussed this with re@, and it
> looks like we'll be putting the changes in before 9.0, so it ships with
> support for newer SCSI devices.

Hi Ken,

as far as I understand this also requires consumers of scsi_sense_data
and SSD_FULL_SIZE etc in userland to be recompiled. So while you are at
breaking the API and ABI of CAM anyway, could you please take the
opportunity to change CAM_XPT_PATH_ID and CAM_BUS_WILDCARD to not use
the same value so incorrect uses will fail? Currently, there seems to
be a lot of confusion about when to use which one, including camcontrol(8)
just encoding this as -1:
	/*
	 * We don't want to rescan or reset the xpt bus.
	 * See above.
	 */
	if ((int)bus_result->path_id == -1)
		continue;

Moreover, AFAICT CAM_XPT_PATH_ID corresponds to what the ANSI CAM Draft
refers to as "XPT Path ID" and specifies a value of 0xff for.
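
The point about incorrect uses failing can be illustrated with a small
stand-alone sketch. The OLD_ and NEW_ macros and the path_id_t typedef below
are local stand-ins, not the definitions from the CAM headers; the OLD_ pair
models the situation described above (one value behind two names), and the
NEW_ pair is a hypothetical split using the 0xff value mentioned for the
XPT Path ID.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t path_id_t;	/* stand-in for CAM's path_id_t */

/* Situation described above: both names share one value. */
#define	OLD_XPT_PATH_ID		((path_id_t)~0)
#define	OLD_BUS_WILDCARD	((path_id_t)~0)

/* Hypothetical split: 0xff for the xpt bus, all-ones for the wildcard. */
#define	NEW_XPT_PATH_ID		((path_id_t)0xff)
#define	NEW_BUS_WILDCARD	((path_id_t)~0)

/* Nonzero if 'id' refers to the xpt bus under the given definition. */
int
is_xpt_bus(path_id_t id, path_id_t xpt_id)
{
	return (id == xpt_id);
}

void
demo(void)
{
	/* Shared value: passing the wildcard by mistake goes unnoticed. */
	assert(is_xpt_bus(OLD_BUS_WILDCARD, OLD_XPT_PATH_ID));
	/* Distinct values: the same mistake no longer matches. */
	assert(!is_xpt_bus(NEW_BUS_WILDCARD, NEW_XPT_PATH_ID));
}
```

With distinct values, a check like the camcontrol(8) one quoted above could
compare against the named constant instead of a literal -1.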

Marius



Re: SSD - TRIM, SU and SUJ - Installer Options

2011-09-16 Thread Marius Strobl
On Thu, Sep 08, 2011 at 06:21:40PM +0200, Nathan Whitehorn wrote:
> On 09/08/11 16:22, mysph...@web.de wrote:
> >Hi there,
> >
> >first off all I have to say, that your new Installer in FBSD v9.0 is
> >very well done.
> >I just found an option, which is not activated at this time. So I wanted
> >to ask, if it's possible a bug or something like that.
> >
> >I've tried to change the  in the section Partition Editor with
> >the subsection Add Partition and wanted to save my changes: Softupdates
> >= disabled, Softupdates journaling = disabled, TRIM = enabled with the
> >  button.
> >But if I reenter the  menu, so my changes will be overwritten
> >with the default values: SU = enabled, SUJ = enabled ([UFS1 +] TRIM =
> >disabled).
> >
> >I've tried this without success in FBSD v9.0 BETA1+2 (amd64) with the
> >ISO- and the IMG Images.
> >
> >My workaround is, that I run in single-user-mode and change the values
> >with tunefs.
> >
> >Would you please check, that  point - if my act is accurate.
> >
> >Thanks in advance and have a nice day!
> 
> This is an interesting point that I hadn't tested. The options do work 
> -- the state of the dialog is just not restored when the Options menu is 
> reentered and so a second trip to Options resets the defaults, unless 
> you then change it again. I'm traveling at the moment, so am not able to 
> fix it at the moment. The internal architecture may also make it 
> slightly tricky to fix.

In my experience the filesystem options menu doesn't work at all, i.e.
the options selected there are simply ignored, even when selecting them
just once and not re-entering that menu. I've tried to create a filesystem
with SUJ disabled and TRIM enabled twice now, last time with BETA2 on
amd64, and I always end up with a filesystem that has SUJ enabled but
TRIM disabled.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-17 Thread Marius Strobl
On Sat, Jul 16, 2011 at 10:42:22PM -0700, Doug Barton wrote:
> On 07/15/2011 01:40, Marius Strobl wrote:
> 
> > The generated config.h and platform.h for sparc64 are these:
> > http://people.freebsd.org/~marius/bind96_config.h
> > http://people.freebsd.org/~marius/bind96_platform.h
> 
> Marius,
> 
> Thanks again for all your help on this. During the work to upgrade to
> BIND 9.8 in HEAD I first tried your patch but I got some odd errors on
> some of the non-mainstream archs, so I ultimately went with something
> similar to what you sent but much more conservative.
> 

Thanks!

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-15 Thread Marius Strobl
On Thu, Jul 14, 2011 at 05:31:49PM -0700, Doug Barton wrote:
> On 07/14/2011 16:21, Marius Strobl wrote:
> > On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
> >> 2011/7/11 KOT MATPOCKuH :
> >>>> Oops, sorry, I forgot to revert the previous patch when test-compiling.
> >>>> Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
> >>> I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
> >>> and it worked properly till Sun Jul 10 22:25:41 MSD.
> >>> At 22:25:41 I restarted bind from base system with your
> >>> sparc64_isc_atomic.h.diff2.
> >>> From this moment till today, 15:57:05 he crashed 3 times:
> >>> Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
> >>> Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
> >>> Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
> >>>
> >>> To make to ensure proper operation of bind from ports, I ran it again
> >>> at 15:57:05, and, I think, we need to wait several days.
> >> And from that time till now bind from ports never died and works properly...
> >>
> > 
> > Okay.
> > Doug, could you please disable the use of atomic operations for sparc64
> > in the in-tree BIND via the following patch in order to match what the
> > vendor source does?
> > http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
> 
> If you use the port and do 'make configure' are the values in config.h
> the same as the ones in your patch?  If so, that's likely to be the
> right answer, and I'll go ahead and apply your patch.
> 

The generated config.h and platform.h for sparc64 are these:
http://people.freebsd.org/~marius/bind96_config.h
http://people.freebsd.org/~marius/bind96_platform.h

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-14 Thread Marius Strobl
On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
> 2011/7/11 KOT MATPOCKuH :
> >> Oops, sorry, I forgot to revert the previous patch when test-compiling.
> >> Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
> > I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
> > and it worked properly till Sun Jul 10 22:25:41 MSD.
> > At 22:25:41 I restarted bind from base system with your
> > sparc64_isc_atomic.h.diff2.
> > From this moment till today, 15:57:05 he crashed 3 times:
> > Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 6
> > Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 6
> > Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 6
> >
> > To make to ensure proper operation of bind from ports, I ran it again
> > at 15:57:05, and, I think, we need to wait several days.
> And from that time till now bind from ports never died and works properly...
> 

Okay.
Doug, could you please disable the use of atomic operations for sparc64
in the in-tree BIND via the following patch in order to match what the
vendor source does?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
I've no idea why they don't work properly (apart from the fact that there
additionally should be memory barriers at least when used for reference
counting just like the alpha version of the ISC atomic operations uses),
I just can say they match what we use in the kernel without problems
pretty closely and that they work as described in the respective comments
when testing them stand-alone. So my best guess is that the BIND source
additionally depends on some x86-specific behavior of the atomic operations
there or in general, but from a glance at the source it's not obvious to me
what that could be. Given that the vendor source doesn't even use atomic
operations on Solaris/SPARC I suspect this is a non-trivial problem.
It probably would be a good idea to also disable the use of atomic
operations for arm again just like the vendor source does as they don't
work there either but nobody seems to care (see PR 154306).

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl
On Fri, Jul 08, 2011 at 11:17:20PM +0400, KOT MATPOCKuH wrote:
> 2011/7/8 Marius Strobl :
> 
> > Please try the following:
> > a) Instead of the base BIND use the dns/bind96 port. The native build
> >    of the latter defaults to not using the ISC atomic implementation
> >    on sparc64 (and arm) and should properly enable the alternative. I
> >    can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
> >    configuration on -CURRENT without problems.
> dns/bind96? Why not bind98?

In order to have a result which can be compared with the base BIND.
Whether bind98 works or works without the ISC atomic operations says
nothing about the bind96 port or the base version.

> As I see dns/bind98 configures without atomic swap operations.
> I will try to use dns/bind98 at first :)
> 
> > b) Revert the above patch and try the base bind with the following
> >    (third) patch:
> >    http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
> >    That one adds the memory barriers required for reference counting
> >    albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
> >    allow to distinguish between acquire and release semantics.
> 
> Hmmm... With this patch build fails:

Oops, sorry, I forgot to revert the previous patch when test-compiling.
Please re-fetch sparc64_isc_atomic.h.diff2 and try again.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl
On Fri, Jul 08, 2011 at 03:47:08PM +0400, KOT MATPOCKuH wrote:
> 2011/7/7 Marius Strobl :
> > That's not the patch I was referring to. I did a second one which just
> > entirely disables the use of atomic operations on sparc64:
> > http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
> Omg. I'm sorry.
> I applied this patch and restarted named, but named crashed immediately
> after start:
> 08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
> 08-Jul-2011 15:29:54.633 using up to 4096 sockets
> Segmentation fault (core dumped)
> 
> core's backtrace:
> #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
> (gdb) bt
> #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
> #1  0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
> #2  0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
> #3  0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
> #4  0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
> #5  0x00228dcc in ?? ()
> Previous frame identical to this frame (corrupt stack?)
> 
> Could this be a sign to a problem in libthr?

Could be but IMO that's unlikely; if there were a bug affecting
pthread_mutex_lock() there should be more fallout from that. I'm probably
missing something about how to properly disable the use of the ISC atomic
implementation and to enable the alternative locking.
Please try the following:
a) Instead of the base BIND use the dns/bind96 port. The native build
   of the latter defaults to not using the ISC atomic implementation
   on sparc64 (and arm) and should properly enable the alternative. I
   can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
   configuration on -CURRENT without problems.
b) Revert the above patch and try the base bind with the following
   (third) patch:
   http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
   That one adds the memory barriers required for reference counting
   albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
   allow to distinguish between acquire and release semantics.
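
For reference, the distinction between acquire and release semantics in
reference counting can be sketched with C11 atomics. This is a generic
illustration and not the ISC atomic API or the contents of the patch above;
the struct and function names are hypothetical. Dropping a reference uses
release ordering so that earlier writes to the object are visible, and the
thread that drops the last reference adds an acquire fence before it tears
the object down.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct object {
	atomic_uint refs;
	int payload;
};

void
obj_init(struct object *o)
{
	atomic_init(&o->refs, 1);	/* the creator holds one reference */
	o->payload = 0;
}

void
obj_ref(struct object *o)
{
	/* Taking another reference needs no ordering on its own. */
	unsigned prev = atomic_fetch_add_explicit(&o->refs, 1,
	    memory_order_relaxed);

	assert(prev > 0);
	(void)prev;
}

/* Returns true if the caller dropped the last reference. */
bool
obj_unref(struct object *o)
{
	/* Release: publish this thread's writes before the count drops. */
	unsigned prev = atomic_fetch_sub_explicit(&o->refs, 1,
	    memory_order_release);

	assert(prev > 0);
	if (prev != 1)
		return (false);
	/* Acquire: see all other threads' writes before destroying. */
	atomic_thread_fence(memory_order_acquire);
	return (true);
}
```

An API that offers only one flavor of atomic add has to use full barriers
on both paths, which is correct but heavier than necessary.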

> 
> PS.
> Also one month ago I got a problems with another multithreaded
> application from ports (www/oops). oops was crashed with stack's
> backtrace:
> #0  0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
> #1  0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
> #2  0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
> #3  0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
> #4  0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
> #5  0x40c039b8 in fork () from /lib/libthr.so.3
> #6  0x40c03d38 in fork () from /lib/libthr.so.3
> #7  0x40c03f50 in pthread_exit () from /lib/libthr.so.3
> #8  0x40c04414 in pthread_detach () from /lib/libthr.so.3
> #9  0x40c04710 in pthread_create () from /lib/libthr.so.3
> 
> But on yesterday's world's build oops works properly. I think it may
> be related to r223228 (?)

Unlikely, the crash caused by the assertion in _malloc_thread_cleanup()
was solved with r223369.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl
On Thu, Jul 07, 2011 at 03:44:32PM +0400, KOT MATPOCKuH wrote:
> 2011/7/7 Marius Strobl :
> > On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
> >> I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
> >> but problem is still exists:
> >> 07-Jul-2011 13:24:22.765 general:
> >> /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
> >> REQUIRE(prev > 0) failed
> >> 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
> >>
> >> How can I find root cause of the problem?
> > From your description it's unclear whether you've built BIND with or
> > without sparc64_isc_disable_atomic.diff. If it was built without that
> > patch please give it a try.
> As You can see, Doug is already included your patch in head:
> http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
> And, of course, bind builded with your patch...
> 

That's not the patch I was referring to. I did a second one which just
entirely disables the use of atomic operations on sparc64:
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl
On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
> I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
> but problem is still exists:
> 07-Jul-2011 13:24:22.765 general:
> /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
> REQUIRE(prev > 0) failed
> 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
> 
> How can I find root cause of the problem?
> 

From your description it's unclear whether you've built BIND with or
without sparc64_isc_disable_atomic.diff. If it was built without that
patch please give it a try. If you had applied it then this apparently
is a generic bug in BIND and unrelated to the MD atomic implementation
and I don't know how to proceed in order to get that fixed. Hopefully
Doug can help you in that case.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl
On Wed, Jul 06, 2011 at 11:55:15AM +0200, Marius Strobl wrote:
> On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
> > On 06/28/2011 08:58, Marius Strobl wrote:
> > >Uhm, we once fixed a problem in the MD atomic implementation which
> > >still seems to present in the ISC copy. Could you please test whether
> > >the following patch makes a difference?
> > >http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
> > 
> > I haven't seen any verification from the OP that this patch solved the 
> > problem,
> 
> It simply doesn't so apparently there's another bug in other parts of
> BIND causing it to trip over that assertion. Still, the clobber lists
> of the sparc64 atomic bits were incomplete and fixing that IMO was the
> right thing to do.
> 

MATPOCKuH, could you please test the following patch?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
That one simply disables the use of atomic operations for sparc64 as
I doubt that these have seen much testing except on x86, be it on
sparc64 or in general; given that they are also used for reference
counting they should provide acquire and release semantics for that
purpose, which include the necessary memory barriers, but the
ISC atomic API simply doesn't account for that. Moreover, the sparc64
implementation of the ISC atomic operations is FreeBSD-specific as it's
the only OS I'm aware of using the primary instead of the secondary MMU
context for the userland (i.e. ASI_P; generally this is a wise choice
though), i.e. it doesn't work on the other *BSDs, Linux or Solaris.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl
On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
> On 06/28/2011 08:58, Marius Strobl wrote:
> >Uhm, we once fixed a problem in the MD atomic implementation which
> >still seems to present in the ISC copy. Could you please test whether
> >the following patch makes a difference?
> >http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
> 
> I haven't seen any verification from the OP that this patch solved the 
> problem,

It simply doesn't so apparently there's another bug in other parts of
BIND causing it to trip over that assertion. Still, the clobber lists
of the sparc64 atomic bits were incomplete and fixing that IMO was the
right thing to do.

> however it did pass 'make universe' on both 9-current and 
> RELENG_8, so I've committed it to those 2 branches along with the recent 
> update. I'll also submit it upstream.
> 

Thanks!
Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Marius Strobl
On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
> 2011/6/29 KOT MATPOCKuH :
> >>> I'm got a problem with named on FreeBSD-CURRENT/sparc64.
> >>> Up to 5 times a day it crashes with these messages:
> >>> 27-Jun-2011 03:42:14.384 general:
> >>> /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
> >>> REQUIRE(prev > 0) failed
> >>> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
> >
> >>> I found a some similar problems on alpha and IA64, which was related
> >>> to problems with isc_atomic_xadd() function in include/isc/atomic.h.
> >>> But I don't understand that there may be incorrect for sparc64 and
> >>> this function was not changed for a minimum 4 years...
> >> Uhm, we once fixed a problem in the MD atomic implementation which
> >> still seems to present in the ISC copy. Could you please test whether
> >> the following patch makes a difference?
> >> http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
> 
> > I ran named with your patch and and watching him.
> Omg.
> Or I incorrectly rebuilt named, or the problem is not solved.
> I got a crash after about 2 hours after named restarted:
> 29-Jun-2011 13:51:28.855 general:
> /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
> REQUIRE(prev > 0) failed
> 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
> 

The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6.-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.

Marius



Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-28 Thread Marius Strobl
On Mon, Jun 27, 2011 at 07:19:33PM +0400, KOT MATPOCKuH wrote:
> Hello!
> 
> I'm got a problem with named on FreeBSD-CURRENT/sparc64.
> Up to 5 times a day it crashes with these messages:
> 27-Jun-2011 03:42:14.384 general:
> /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
> REQUIRE(prev > 0) failed
> 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
> 
> The problem is still in latest system's bind:
> # named -v
> BIND 9.6.-ESV-R4-P1
> 
> This problem exists only on SMP sparc64 system. On my another sparc64,
> with 1 processor, I does not have this problem.
> 
> I found a some similar problems on alpha and IA64, which was related
> to problems with isc_atomic_xadd() function in include/isc/atomic.h.
> But I don't understand that there may be incorrect for sparc64 and
> this function was not changed for a minimum 4 years...
> 
> How can I help solve this problem?
> 

Uhm, we once fixed a problem in the MD atomic implementation which
still seems to be present in the ISC copy. Could you please test whether
the following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

Marius



Re: TLS bug?

2011-06-17 Thread Marius Strobl
On Fri, Jun 17, 2011 at 03:31:29PM -0400, Nathaniel W Filardo wrote:
> On Fri, Jun 17, 2011 at 08:07:13PM +0200, Marius Strobl wrote:
> > Using bonnie++ I can't reproduce this (didn't try mysql) but I have
> 
> I seem to have good luck reproducing it with "-r 5 -s 10 -x 10" by about the
> third iteration.

Ok, with these parameters I can reproduce it.

> 
> > some TLS fixes for libthr I forgot about but could be relevant here
> > (most actually date back to 2008 when the base binutils didn't support
> > GNUTLS for sparc64 so I couldn't test them easily). Could you please
> > give a libthr build with the following patch a try?
> > http://people.freebsd.org/~marius/libthr_sparc64.diff
> 
> Concurrent runs both with and without those diffs still asserted.
> Interestingly, libc's .tbss section, even after the assertion, is still full
> of zeros, so it looks like something stranger than a wild-write back to
> .tbss.  I'll go diving through the tls allocation code again when I get a
> minute.
> 

In combination with the below patch bonnie++ survived 100 iterations
here. I'm not sure what this means though as I don't have much knowledge
about TLS, I merely implemented the necessary relocations. Could be
that malloc() actually requires the initial exec model for variant II.
Unfortunately, it's not documented why it was added for x86.
Jason, can you shed some light on this?

Marius

Index: malloc.c
===
--- malloc.c    (revision 219535)
+++ malloc.c    (working copy)
@@ -234,7 +234,7 @@
 #ifdef __sparc64__
 #  define LG_QUANTUM       4
 #  define LG_SIZEOF_PTR    3
-#  define TLS_MODEL        /* default */
+#  define TLS_MODEL        __attribute__((tls_model("initial-exec")))
 #endif
 #endif
 #ifdef __amd64__
 #  define LG_QUANTUM   4
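
For readers unfamiliar with the pieces involved, the following is a
simplified, single-file model of how the thread cache slot is used. It
assumes a GCC or Clang compiler (__thread and the tls_model attribute
are extensions). Only the shape of the cleanup path is taken from the
malloc.c hunk quoted in this thread; cache_tls, dummy_cache,
thread_cache_init and thread_cache_cleanup are hypothetical names, and
the allocation side is a guess at the behaviour described in the
report, not the real code.

```c
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "this thread's cache was already destroyed". */
#define CACHE_DESTROYED	((void *)(uintptr_t)1)

/* Stand-in for tcache_tls, declared with the TLS model from the patch. */
static __thread void *cache_tls __attribute__((tls_model("initial-exec")));

/* Stand-in for a real cache object. */
static int dummy_cache;

/*
 * Lazily set up the cache.  Returns 1 if a cache is available, 0 if
 * the slot already holds the sentinel, in which case none is created.
 */
int
thread_cache_init(void)
{
	if (cache_tls == CACHE_DESTROYED)
		return (0);
	if (cache_tls == NULL)
		cache_tls = &dummy_cache;
	return (1);
}

/*
 * Thread exit path.  Returns 1 if a cache was torn down, 0 if there
 * was none, and -1 where the real code would trip its assertion.
 */
int
thread_cache_cleanup(void)
{
	void *cache = cache_tls;

	if (cache == NULL)
		return (0);
	if (cache == CACHE_DESTROYED)
		return (-1);
	cache_tls = CACHE_DESTROYED;
	return (1);
}
```

In this model a fresh thread must start with a zeroed slot. If a new
thread instead observes the sentinel left by an earlier one, no cache
is set up and the cleanup hits the -1 case, which matches the symptom
reported with the t_s/_m_t_c debugging output.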


Re: TLS bug?

2011-06-17 Thread Marius Strobl
On Thu, Jun 16, 2011 at 03:53:19AM -0400, Nathaniel W Filardo wrote:
> Atcht; it's late.  I forgot to mention that this system is a sparc64 V240
> 2-way SMP machine.  It's running a kernel from 9.0-CURRENT r222833+262af52:
> Tue Jun  7 18:47:35 EDT 2011 and a userland from a little later.
> 
> Sorry about that.
> --nwf;
> 
> On Thu, Jun 16, 2011 at 03:31:38AM -0400, Nathaniel W Filardo wrote:
> > I have a few applications (bonnie++ and mysql, specifically, both from
> > ports) which trip over the assertion in
> > lib/libc/stdlib/malloc.c:/^_malloc_thread_cleanup that
> > >   assert(tcache != (void *)(uintptr_t)1);
> > 
> > I have patched malloc.c thus:
> > 
> > > --- a/lib/libc/stdlib/malloc.c
> > > +++ b/lib/libc/stdlib/malloc.c
> > > @@ -1108,7 +1108,7 @@ static __thread arena_t   *arenas_map 
> > > TLS_MODEL;
> > >  
> > >  #ifdef MALLOC_TCACHE
> > >  /* Map of thread-specific caches. */
> > > -static __thread tcache_t   *tcache_tls TLS_MODEL;
> > > +__thread tcache_t  *tcache_tls TLS_MODEL;
> > >  
> > >  /*
> > >   * Number of cache slots for each bin in the thread cache, or 0 if tcache
> > >   * is
> > > @@ -6184,10 +6184,17 @@ _malloc_thread_cleanup(void)
> > >  #ifdef MALLOC_TCACHE
> > > tcache_t *tcache = tcache_tls;
> > >  
> > > +fprintf(stderr, "_m_t_c for %d:%lu with %p\n", 
> > > +   getpid(),
> > > +   (unsigned long) _pthread_self(),
> > > +   tcache);
> > > +
> > > if (tcache != NULL) {
> > > -   assert(tcache != (void *)(uintptr_t)1);
> > > -   tcache_destroy(tcache);
> > > -   tcache_tls = (void *)(uintptr_t)1;
> > > +   /* assert(tcache != (void *)(uintptr_t)1); */
> > > +   if((uintptr_t)tcache != (uintptr_t)1) {
> > > +   tcache_destroy(tcache);
> > > +   tcache_tls = (void *)(uintptr_t)1;
> > > +   }
> > 
> > and libthr/thread/thr_create.c thus:
> > 
> > > --- a/lib/libthr/thread/thr_create.c
> > > +++ b/lib/libthr/thread/thr_create.c
> > > @@ -243,6 +243,8 @@ create_stack(struct pthread_attr *pattr)
> > > return (ret);
> > >  }
> > >  
> > > +extern __thread void *tcache_tls;
> > > +
> > >  static void
> > >  thread_start(struct pthread *curthread)
> > >  {
> > > @@ -280,6 +282,11 @@ thread_start(struct pthread *curthread)
> > > curthread->attr.stacksize_attr;
> > >  #endif
> > >  
> > > +fprintf(stderr, "t_s for %d:%lu with %p\n",
> > > +getpid(),
> > > +(unsigned long) _pthread_self(),
> > > +tcache_tls);
> > > +
> > > /* Run the current thread's start routine with argument: */
> > > _pthread_exit(curthread->start_routine(curthread->arg));
> > >  
> > 
> > to attempt to debug this issue.  With those changes in place, bonnie++'s
> > execution looks like this:
> > 
> > >[...]
> > > Writing a byte at a time...done
> > > Writing intelligently...done
> > > Rewriting...done
> > > Reading a byte at a time...done
> > > Reading intelligently...done
> > > t_s for 79654:1086343168 with 0x0
> > > t_s for 79654:1086345216 with 0x0
> > > t_s for 79654:1086346240 with 0x0
> > > t_s for 79654:1086347264 with 0x0
> > > t_s for 79654:1086344192 with 0x0
> > > start 'em...done...done...done...done..._m_t_c for 79654:1086344192 with
> > > 0x41404400
> > > _m_t_c for 79654:1086346240 with 0x40d2c400
> > > _m_t_c for 79654:1086343168 with 0x41404200
> > > _m_t_c for 79654:1086345216 with 0x41804200
> > > done...
> > > _m_t_c for 79654:1086347264 with 0x41004200
> > > Create files in sequential order...done.
> > > Stat files in sequential order...done.
> > > Delete files in sequential order...done.
> > > Create files in random order...done.
> > > Stat files in random order...done.
> > > Delete files in random order...done.
> > > 1.96,1.96,hydra.priv.oc.ietfng.org,1,1308217772,10M,,7,81,2644,7,3577,14,34,93,+,+++,773.7,61,16,,,
> > > ,,2325,74,13016,99,2342,86,3019,91,11888,99,2184,89,16397ms,1237ms,671ms,2009ms,177us,1305ms,489ms,1029
> > > us,270ms,140ms,53730us,250ms
> > > Writing a byte at a time...done
> > > Writing intelligently...done
> > > Rewriting...done
> > > Reading a byte at a time...done
> > > Reading intelligently...done
> > > t_s for 79654:1086343168 with 0x1
> > > t_s for 79654:1086346240 with 0x1
> > > t_s for 79654:1086345216 with 0x1
> > > t_s for 79654:1086347264 with 0x1
> > > t_s for 79654:1086344192 with 0x1
> > > start 'em...done...done...done...done...done...
> > > _m_t_c for 79654:1086347264 with 0x1
> > > _m_t_c for 79654:1086344192 with 0x1
> > > _m_t_c for 79654:1086343168 with 0x1
> > >[...]
> > 
> > So what seems to be happening is that the TLS area is being set up
> > incorrectly, eventually: rather than zeroing the tcache_tls value, it is
> > being set to 1, which means no tcache is ever allocated, so when we get
> > around to exiting, the assert trips.
> > 
> > Unfortunately, setting a b

Re: ZFS panic with concurrent recv and read-heavy workload

2011-06-08 Thread Marius Strobl
On Fri, Jun 03, 2011 at 03:03:56AM -0400, Nathaniel W Filardo wrote:
> I just got this on another machine, no heavy workload needed, just booting
> and starting some jails.  Of interest, perhaps, both this and the machine
> triggering the below panic are SMP V240s with 1.5GHz CPUs (though I will
> confess that the machine in the original report may have had bad RAM).  I
> have run a UP 1.2GHz V240 for months and never seen this panic.
> 
> This time the kernel is
> > FreeBSD 9.0-CURRENT #9: Fri Jun  3 02:32:13 EDT 2011
> csup'd immediately before building.  The full panic this time is
> > panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @
> > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
> >
> > cpuid = 1
> > KDB: stack backtrace:
> > panic() at panic+0x1c8
> > _sx_assert() at _sx_assert+0xc4
> > _sx_xunlock() at _sx_xunlock+0x98
> > l2arc_feed_thread() at l2arc_feed_thread+0xeac
> > fork_exit() at fork_exit+0x9c
> > fork_trampoline() at fork_trampoline+0x8
> >
> > SC Alert: SC Request to send Break to host.
> > KDB: enter: Line break on console
> > [ thread pid 27 tid 100121 ]
> > Stopped at  kdb_enter+0x80: ta  %xcc, 1
> > db> reset
> > ttiimmeeoouutt  sshhuuiinngg  ddoowwnn  CCPPUUss..
> 
> Half of the memory in this machine is new (well, came with the machine) and
> half is from the aforementioned UP V240 which seemed to work fine (I was
> attempting an upgrade when this happened); none of it (or indeed any of the
> hardware save the disk controller and disks) are common between this and the
> machine reporting below.
> 
> Thoughts?  Any help would be greatly appreciated.
> Thanks.
> --nwf;
> 
> On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
> >[...]
> > panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ 
> > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
> >
> > cpuid = 1
> > KDB: stack backtrace:
> > panic() at panic+0x1c8
> > _sx_assert() at _sx_assert+0xc4
> > _sx_xunlock() at _sx_xunlock+0x98
> > arc_evict() at arc_evict+0x614
> > arc_get_data_buf() at arc_get_data_buf+0x360
> > arc_buf_alloc() at arc_buf_alloc+0x94
> > dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
> > dmu_write() at dmu_write+0xec
> > dmu_recv_stream() at dmu_recv_stream+0x8a8
> > zfs_ioc_recv() at zfs_ioc_recv+0x354
> > zfsdev_ioctl() at zfsdev_ioctl+0xe0
> > devfs_ioctl_f() at devfs_ioctl_f+0xe8
> > kern_ioctl() at kern_ioctl+0x294
> > ioctl() at ioctl+0x198
> > syscallenter() at syscallenter+0x270
> > syscall() at syscall+0x74
> > -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
> > userland() at 0x40e72cc8
> > user trace: trap %o7=0x40c13e24
> > pc 0x40e72cc8, sp 0x7fd4641
> > pc 0x40c158f4, sp 0x7fd4721
> > pc 0x40c1e878, sp 0x7fd47f1
> > pc 0x40c1ce54, sp 0x7fd8b01
> > pc 0x40c1dbe0, sp 0x7fd9431
> > pc 0x40c1f718, sp 0x7fdd741
> > pc 0x10731c, sp 0x7fdd831
> > pc 0x10c90c, sp 0x7fdd8f1
> > pc 0x103ef0, sp 0x7fde1d1
> > pc 0x4021aff4, sp 0x7fde291
> > done
> >[...]

Apparently this is a locking issue in the ARC code, the ZFS people should
be able to help you.

Marius



Re: Old ATA disk names emulation [Was: Switch from legacy ata(4) to CAM-based ATA]

2011-04-25 Thread Marius Strobl
On Mon, Apr 25, 2011 at 01:23:37PM +0300, Alexander Motin wrote:
> Hi.
> 
> I've implemented following patch to keep basic compatibility for the
> migrating users. I don't like such hacky things, but at least I tried to
> make it less invasive.
> 
> The idea:
>  - New xpt_path_legacy_ata_id() function in CAM tries to predict bus
> unit number and then device unit number for specified path, as if it was
> with legacy ATA with ATA_STATIC_ID option.
>  - on attach, ada driver fetches that number (if not disabled using
> tunable kern.cam.ada.legacy_aliases), prints to console something like:
> ada0: Previously was known as ad12
> , and sets kernel environment variable like:
> kern.devalias.ada0="ad12"
>  - when geom_dev tastes new geom and creates device node for it, it also
> tries to match prefix of the device name with present kern.devalias.*
> enviromnent variables, and, if some match found, creates alias with
> substituted name (ada0 -> ad12, ada0s1 -> ad12s1, etc.).
> 
> The patch is here: http://people.freebsd.org/~mav/legacy_aliases.patch
> 
> I did few tests and it seems like working -- two sets of device nodes
> appeared for each device, I can successfully label and mount any of them.
> 
> What will not work:
>  - old device names won't be seen inside GEOM, so users who hardcoded
> provider names in gmirror/gstripe/... metadata (not the default
> behavior) are still in trouble.
>  - patch mimics ATA_STATIC_ID behavior, if user had custom kernel
> without it, he should update device names manually.
>  - it won't work for users with hot-unplugging ATA controllers (not
> devices), but I believe it is really rare case.
>  - low-level tools, such as smartmontools, won't be able to work with
> alias devices, as background ada driver doesn't implements legacy
> ioctls. May be I could partially fix this.
> 
> Except those, I think this patch should work for the most of users.
> 
> Any more objections/ideas? Is this an acceptable solution?
> 

Hi,

given that only the amd64, i386 and pc98 GENERIC kernel configuration
files had ATA_STATIC_ID enabled by default it would be highly desirable
that your compatibility shim also only mimics that behavior on these
archs or, probably better, that it actually checks for ATA_STATIC_ID,
with that option put back into the respective kernel configuration files.
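
For reference, the name substitution described in the quoted proposal
(ada0 -> ad12, ada0s1 -> ad12s1) could look roughly like the sketch
below. This is not taken from legacy_aliases.patch; legacy_alias() and
its parameters are hypothetical, and the digit check that keeps "ada1"
from matching "ada10" is my own assumption about what a prefix match
would need.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * If `name` starts with the device name `from`, write the same name
 * with that prefix replaced by `to` into `out` and return 1.
 * Return 0 if there is no match or the result does not fit.
 */
int
legacy_alias(const char *name, const char *from, const char *to,
    char *out, size_t outlen)
{
	size_t n = strlen(from);

	if (strncmp(name, from, n) != 0)
		return (0);
	/* Assumption: a following digit means a different unit number. */
	if (isdigit((unsigned char)name[n]))
		return (0);
	if ((size_t)snprintf(out, outlen, "%s%s", to, name + n) >= outlen)
		return (0);
	return (1);
}
```

With from = "ada0" and to = "ad12" this maps "ada0s1" to "ad12s1" and
leaves names such as "ada1" alone.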

Marius



Re: Switch from legacy ata(4) to CAM-based ATA

2011-04-21 Thread Marius Strobl
On Thu, Apr 21, 2011 at 01:26:25PM +0300, Alexander Motin wrote:
> Marius Strobl wrote:
> > On Wed, Apr 20, 2011 at 12:57:47PM +0300, Alexander Motin wrote:
> >> With 9.0 release approaching quickly, I believe it the best time now to
> >> manage migration from legacy ata(4) ATA to the new CAM-based one. New
> >> ATA code present in the tree for more then a year now, used by many
> >> people and proved it's superior functionality and reliability. The only
> >> major issue with it now is the migration process. Sooner or later we
> >> have to pass it, but due to major UI and API changes we can't do it
> >> after 9.0 release. So I propose to do it the next Sunday (April 24) to
> >> have as much time for troubleshooting as possible.
> >>
> >> I have prepared the following patch to do it:
> >> http://people.freebsd.org/~mav/ata_switch.patch
> > 
> > Could you please add descriptions of the controllers supported by
> > ahci(4), mvs(4) and siis(4) to the kernel configuration files and
> > preserve alphabetical ordering, i.e. list ata(4) after ahci(4)?
> 
> OK. Here is the new patch:
> http://people.freebsd.org/~mav/ata_switch2.patch
> 

Thanks!

Marius



Re: Switch from legacy ata(4) to CAM-based ATA

2011-04-20 Thread Marius Strobl
On Wed, Apr 20, 2011 at 12:57:47PM +0300, Alexander Motin wrote:
> Hi.
> 
> With 9.0 release approaching quickly, I believe it the best time now to
> manage migration from legacy ata(4) ATA to the new CAM-based one. New
> ATA code present in the tree for more then a year now, used by many
> people and proved it's superior functionality and reliability. The only
> major issue with it now is the migration process. Sooner or later we
> have to pass it, but due to major UI and API changes we can't do it
> after 9.0 release. So I propose to do it the next Sunday (April 24) to
> have as much time for troubleshooting as possible.
> 
> I have prepared the following patch to do it:
> http://people.freebsd.org/~mav/ata_switch.patch
> 

Could you please add descriptions of the controllers supported by
ahci(4), mvs(4) and siis(4) to the kernel configuration files and
preserve alphabetical ordering, i.e. list ata(4) after ahci(4)?

Marius



Re: Fwd: OpenSSL 1.0.0d for Freebsd HEAD

2011-03-02 Thread Marius Strobl
On Wed, Mar 02, 2011 at 10:58:00AM +0100, Alexandre Martins wrote:
> Hello,
> 
> This sound great :)
> 
> SIGILL is raised when the program try to execute an assembly code that the 
> CPU 
> cannot execute. It mean that the library or the binary is miscompiled.
> 

Not necessarily, the sparc64 code f.e. also kills programs with SIGILL
when they overflow their stack or the stack pointer is corrupt.

Marius



Re: Fwd: OpenSSL 1.0.0d for Freebsd HEAD

2011-03-01 Thread Marius Strobl
On Tue, Mar 01, 2011 at 10:31:16AM +0100, Alexandre Martins wrote:
> Dear,
> 
> Have you extracted the tarball fo openssl source (1.0.0d) in crypto/openssl ?
> 

Ah, I missed that, the last couple of mails in this thread were only
talking about the patch :)
With the tarball untarred it actually builds and works on sparc64 as
far as ssh(d) and HTTPS via fetch are concerned. The problem reports
(programs getting killed with SIGILL probably due to an infinite
recursion or some such) were about apache and unbound using an
OpenSSL 1.0.0 port. I'm not sure whether their use of OpenSSL would
make a difference or the port is broken.

Marius



Re: OpenSSL 1.0.0d for Freebsd HEAD

2011-02-28 Thread Marius Strobl
On Mon, Feb 28, 2011 at 12:00:19PM +0100, Fabien Thomas wrote:
> 
> 
> > Dears,
> > 
> > After several research, i have removed the problematic part.
> > 
> > You can find the new version here:
> > 
> > http://people.freebsd.org/~fabient/patch-head20110222-openssl1.0.0d
> > 
> 
> 
> It will be great to have it in 9.0.
> 
> To do that how is it possible rebuild the port for all platform with openssl 
> 1.0.0d in base?
> Is there some people against that inclusion?
> 

Given that some users report ports linked against the port version
of OpenSSL 1.0.0 (c I think) to not work on sparc64 I wanted to
give your patch a try, but unfortunately it doesn't even build:
===> secure/lib/libcrypto (buildincludes)
cp /usr/home/marius/co/head3/src/secure/lib/libcrypto/opensslconf-sparc64.h 
opensslconf.h
( echo "#ifndef MK1MF_BUILD";  echo "  /* auto-generated by crypto/Makefile.ssl 
for crypto/cversion.c */";  echo "  #define CFLAGS \"cc\"";  echo "  #define 
PLATFORM \"FreeBSD-sparc64\"";  echo "  #define DATE \"`LC_ALL=C date`\"";  
echo "#endif" ) > buildinf.h
make: don't know how to make asn1_locl.h. Stop
*** Error code 2

Marius



Re: libcompiler_rt now part of FreeBSD's base system

2010-11-12 Thread Marius Strobl
On Fri, Nov 12, 2010 at 03:57:20PM +0100, Florian Smeets wrote:
> On 11.11.10 16:52, Ed Schouten wrote:
> > I just committed libcompiler_rt.a to HEAD. Even though I don't expect
> > serious issues -- especially not on the tier 1 architectures -- be sure
> > to contact me in case something goes wrong. I hooked it up to the build
> > in a separate commit, so if your system starts to act weird, just revert
> > r215127.
> > 
> 
> Hi Ed,
> 
> i'm at r215149 on sparc64, and my compiler stopped working. buildworld
> stops after 42 lines (http://smeets.im/~flo/bw.log). cc1 dumps a 1GB
> core file.
> 
> Program terminated with signal 4, Illegal instruction.
> #0  0x004ced80 in ?? ()
> (gdb) where
> #0  0x004ced80 in ?? ()
> #1  0x004cedb0 in ?? ()
> Previous frame identical to this frame (corrupt stack?)
> 
> Right now i cannot go back to r215126 to verify that it really is this
> change which is causing it :-) Previously the system was running a build
> from around Nov. 1st
> 

I was just about to report the same based on a test of r214838. With
debugging symbols I get a more meaningful backtrace though:
nimrod# gdb 
/tmp/objrt.old/usr/home/marius/co/compiler-rt/gnu/usr.bin/cc/cc1/cc1 
/tmp/objrt/usr/home/marius/co/compiler-rt/tmp/usr/home/marius/co/compiler-rt/tools/build/cc1.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc64-marcel-freebsd"...(no debugging symbols 
found)...
Core was generated by `cc1'.
Program terminated with signal 4, Illegal instruction.
#0  0x004c0aa0 in __ctzdi2 ()
(gdb) bt
#0  0x004c0aa0 in __ctzdi2 ()
#1  0x004c0ad0 in __ctzdi2 ()
(gdb) 

The corresponding assembler code is:
004c0aa0 <__ctzdi2>:
  4c0aa0:   9d e3 bf 40 save  %sp, -192, %sp
  4c0aa4:   82 10 00 18 mov  %i0, %g1
  4c0aa8:   80 a0 00 18 cmp  %g0, %i0
  4c0aac:   85 3e 30 20 srax  %i0, 0x20, %g2
  4c0ab0:   b0 40 3f ff addc  %g0, -1, %i0
  4c0ab4:   90 38 00 18 xnor  %g0, %i0, %o0
  4c0ab8:   84 0e 00 02 and  %i0, %g2, %g2
  4c0abc:   90 0a 00 01 and  %o0, %g1, %o0
  4c0ac0:   b0 0e 20 20 and  %i0, 0x20, %i0
  4c0ac4:   90 12 00 02 or  %o0, %g2, %o0
  4c0ac8:   7f ff ff f6 call  4c0aa0 <__ctzdi2>
  4c0acc:   91 32 20 00 srl  %o0, 0, %o0
  4c0ad0:   b0 06 00 08 add  %i0, %o0, %i0
  4c0ad4:   81 cf e0 08 rett  %i7 + 8
  4c0ad8:   91 3a 20 00 sra  %o0, 0, %o0
  4c0adc:   01 00 00 00 nop

I think what happens here is that GCC uses __ctzdi2() to implement
__builtin_ctz(), while the libcompiler-rt version of __ctzdi2() uses
__builtin_ctz(), so __ctzdi2() is called recursively until the stack
overflows. Note that GCC has code like:
int __ctzsi2 (uSI x) { return __builtin_ctz (x); }
and rwindow_save() returns SIGILL, so I think this theory is correct
but I've no idea how to solve that.
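
Purely as an illustration of what a count-trailing-zeros routine looks
like when it is written without any builtin (so the compiler has no
reason to emit a call back into the runtime library), here is a small
sketch. ctz64_portable() is a made-up name and not the libcompiler_rt
or libgcc symbol, returning 64 for a zero argument is an arbitrary
choice (the builtin is undefined there), and I am not claiming this is
how the problem should or will be fixed.

```c
#include <stdint.h>

/*
 * Count trailing zero bits with a binary search over the word,
 * using only shifts and masks.
 */
int
ctz64_portable(uint64_t x)
{
	int n = 0;

	if (x == 0)
		return (64);	/* assumption: pick a defined result */
	if ((x & 0xffffffffu) == 0) {
		n += 32;
		x >>= 32;
	}
	if ((x & 0xffffu) == 0) {
		n += 16;
		x >>= 16;
	}
	if ((x & 0xffu) == 0) {
		n += 8;
		x >>= 8;
	}
	if ((x & 0xfu) == 0) {
		n += 4;
		x >>= 4;
	}
	if ((x & 0x3u) == 0) {
		n += 2;
		x >>= 2;
	}
	if ((x & 0x1u) == 0)
		n += 1;
	return (n);
}
```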

Another thing that worries me is that by switching to libcompiler-rt
we lose all the assembler optimizations libgcc has for sparc64. When
building with libcompiler-rt the buildworld time increases by 2.6%
on sparc64. I guess this mostly is due to the fact that now both
libcompiler-rt and libgcc are built though. Do you have an idea how
to benchmark the possible performance loss with libcompiler-rt for
typical applications?

Marius



Re: Sense fetching [Was: cdrtools /devel ...]

2010-11-05 Thread Marius Strobl
On Fri, Nov 05, 2010 at 08:50:49PM +0200, Alexander Motin wrote:
> Hi.
> 
> I've reviewed tests that scgcheck does to SCSI subsystem. It shown
> combination of several issues in both CAM, ahci(4) and cdrtools itself.
> Several small patches allow us to pass most of that tests:
> http://people.freebsd.org/~mav/sense/
> 
> ahci_resid.patch: Add support for reporting residual length on data
> underrun. SCSI commands often returns results shorter then expected.
> Returned value allows application to know/check how much data it really
> has. It is also important for sense fetching, as ATAPI and USB devices
> return sense as data in response to REQUEST_SENSE command.
> 
> sense_resid.patch: When manually requesting sense data (ATAPI or USB),
> request only as much data as user requested (not the fixed structure
> size), and return respective sense residual length.
> 
> pass_autosence.patch: Unless CAM_DIS_AUTOSENSE is set, always fetch
> sense if not done by SIM, independently of CAM_PASS_ERR_RECOVER. As soon
> as device freeze released before returning to user-level, user-level
> application by definition can't reliably fetch sense data if some other
> application (like hald) tries to access device same time.
> 
> cdrtools.patch: Make libscg (part of cdrtools) on FreeBSD to submit
> wanted sense length to CAM and do not clear sense return buffer. It is
> mostly cosmetics, important probably only for scgcheck.

Please don't commit this to the port directly but let it loop back
via upstream (CC'ed) instead, otherwise we would need to obey the
following, which is undesirable, especially if these really are
mostly cosmetic issues:
/*
 *  Warning: you may change this source, but if you do that
 *  you need to change the _scg_version and _scg_auth* string below.
 *  You may not return "schily" for an SCG_AUTHOR request anymore.
 *  Choose your name instead of "schily" and make clear that the version
 *  string is related to a modified source.
 */

> 
> Testers and reviewers welcome. I am especially interested in opinion
> about pass_autosence.patch -- may be we should lower sense fetching even
> deeper, to make it work for all cam_periph_runccb() consumers.
> 

Marius



Re: bge0 does not work anymore

2010-10-18 Thread Marius Strobl
On Mon, Oct 18, 2010 at 09:32:13AM +0800, Buganini wrote:
> my last known usable kernel revision is r213813
> with r213920, leds are extinguished when executing dhclient

Sorry, it looks like it was my fault this time, should be fixed again
with r214012.

Marius



Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-09-24 Thread Marius Strobl
On Tue, Jul 20, 2010 at 01:55:28PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-20 at 12:17, Marius Strobl wrote:
> > On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
> > > On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > > > > Downgrading now...
> > > > > 
> > > > > And it crashed again, with current from r209598...
> > > > > 
> > > > 
> > > > Ok, this at least means that your problem isn't caused by the recent
> > > > changes to mpt(4) as the pre-r209599 version only differed from the
> > > > 8-STABLE one in a cosmetic change at that time.
> > > 
> > > I have another data-point, I cvsup'ed to the latest current again, and
> > > rebuilt without INVARIANT and WITNESS, and now it seems to survive the
> > > timeouts.
> > 
> > That's more or less expected as the sanity check issuing the panic
> > just isn't compiled in then. However, my understanding was that with
> > STABLE you don't get the timeouts in the first place, or do you see
> > them there also?
> 
> I got the timeouts with STABLE as well, that was the reason for me to
> try out CURRENT. I'm sorry I didn't mention that earlier.
> 
> My main concern is to get rid of the timeouts, but a panic on one can't be
> right. How can I debug this further? I can get timeout fairly consistent by
> putting a bit of load on the drives. If it would help I can also provide
> remote access.
> 

FYI, that panic is fixed with r213105.

Marius



Re: {arch}/conf/DEFAULTS and uart

2010-09-12 Thread Marius Strobl
On Sun, Sep 12, 2010 at 02:40:49AM +, Alexander Best wrote:
> On Fri Sep 10 10, John Baldwin wrote:
> > On Thursday, September 09, 2010 3:50:45 pm Alexander Best wrote:
> > > On Thu Sep  9 10, Alexander Best wrote:
> > > > On Thu Sep  9 10, Alexander Best wrote:
> > > > > hi there,
> > > > > 
> > > > > except for arm most archs seem to enforce uart support in 
> > > > > conf/DEFAULTS. is
> > > > > this really necessary? shouldn't DEFAULTS only contain vital 
> > > > > devices/options
> > > > > without a kernel on a specific arch won't function at all?
> > > > 
> > > > jhb just explained to me, that the uart entry in DEFAULTS is not a 
> > > > controller
> > > > or something like that, but the uart backend to use *if* uart gets 
> > > > defined in
> > > > the kernel config.
> > > > 
> > > > sorry for the noise folks.
> > > 
> > > however i found some missing comments and incorrect syntax which i fixed.
> > > 
> > > see the attached patch.
> > 
> > I think the ia64 ordering for 'io and mem' is probably more correct
> > (alphabetically sorted), so I would fix i386 and amd64 and leave ia64 alone.
> > 
> > The powerpc 'machine' changes are wrong I think as it would break GENERIC64
> > and powerpc64 kernel configs in general.  Nathan purposefully removed
> > 'machine' from the powerpc DEFAULTS.
> 
> here's try #2. ;)
> 
> diff --git a/sys/sparc64/conf/DEFAULTS b/sys/sparc64/conf/DEFAULTS
> index 38b2408..2e60c94 100644
> --- a/sys/sparc64/conf/DEFAULTS
> +++ b/sys/sparc64/conf/DEFAULTS
> @@ -5,7 +5,7 @@
>  
>  machine  sparc64
>  
> -# Pseudo devices.
> +# Pseudo devices
>  device   mem # Memory and kernel memory devices
>  
>  # UART chips on this platform
> @@ -17,5 +17,5 @@ device  uart_z8530
>  options  GEOM_PART_BSD
>  options  GEOM_PART_VTOC8
>  
> -# Let sunkbd emulate an AT keyboard by default.
> +# Let sunkbd emulate an AT keyboard by default

IMO this is a complete sentence and thus the period should stay.

Marius



Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-20 Thread Marius Strobl
On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-18 at 14:20, Marius Strobl wrote:
> > > > Downgrading now...
> > > 
> > > And it crashed again, with current from r209598...
> > > 
> > 
> > Ok, this at least means that your problem isn't caused by the recent
> > changes to mpt(4) as the pre-r209599 version only differed from the
> > 8-STABLE one in a cosmetic change at that time.
> 
> I have another data-point, I cvsup'ed to the latest current again, and
> rebuilt without INVARIANT and WITNESS, and now it seems to survive the
> timeouts.

That's more or less expected as the sanity check issuing the panic
just isn't compiled in then. However, my understanding was that with
STABLE you don't get the timeouts in the first place, or do you see
them there also?
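
To illustrate why the panic goes away without INVARIANTS, here is a
userland sketch of a conditionally compiled link check. It is not the
actual sys/queue.h or mpt(4) code; SKETCH_INVARIANTS, struct elm and
link_is_sane() are made-up names, and the sketch reports the problem
instead of panicking.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's INVARIANTS option. */
#ifndef SKETCH_INVARIANTS
#define SKETCH_INVARIANTS 1
#endif

struct elm {
	struct elm *next;
	struct elm *prev;
};

/*
 * Return 1 if the forward link of elm is matched by the backward link
 * of its successor, 0 otherwise. With SKETCH_INVARIANTS set to 0 the
 * check is not compiled in at all and the function always returns 1.
 */
static int
link_is_sane(const struct elm *elm)
{
#if SKETCH_INVARIANTS
	if (elm->next != NULL && elm->next->prev != elm) {
		fprintf(stderr, "Bad link elm %p next->prev != elm\n",
		    (const void *)elm);
		return (0);
	}
#else
	(void)elm;
#endif
	return (1);
}
```

Building the same file with -DSKETCH_INVARIANTS=0 makes a corrupted
list pass unnoticed, which matches what a kernel without INVARIANTS
does: the corruption is still there, it just isn't detected.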

Marius



Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-18 Thread Marius Strobl
On Fri, Jul 16, 2010 at 12:31:26PM +0200, Ståle Kristoffersen wrote:
> On 2010-07-15 at 19:52, Ståle Kristoffersen wrote:
> > On 2010-07-15 at 18:00, Marius Strobl wrote:
> > > On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> > > > Upgraded to from stable to current yesterday and very quickly received a
> > > > panic. It did however not dump it's core, so I was unable to debug it.
> > > > Today it did panic again, and I took a picture: (Sorry about the bad
> > > > quality)
> > > > 
> > > > http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> > > > 
> > > > And from the backtrace:
> > > > http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> > > > 
> > > > Both times I hade the mpt0: request timed out just before the panic.
> > > > 
> > > > I'm not sure why it's not dumping it's core (It was working under 
> > > > stable,
> > > > and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)
> > > 
> > > What revision were you using?
> > 
> > Not sure exactly what revision I was using, is there an easy way to figure
> > that out? I ran cvsupdate around 13:00 CEST yesterday.
> > 
> > > Does using current as of r209598 make a difference?
> > 
> > Downgrading now...
> 
> And it crashed again, with current from r209598...
> 

Ok, this at least means that your problem isn't caused by the recent
changes to mpt(4) as the pre-r209599 version only differed from the
8-STABLE one in a cosmetic change at that time.

Marius



Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next->prev != elm

2010-07-15 Thread Marius Strobl
On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
> Upgraded to from stable to current yesterday and very quickly received a
> panic. It did however not dump it's core, so I was unable to debug it.
> Today it did panic again, and I took a picture: (Sorry about the bad
> quality)
> 
> http://folk.uio.no/stalk/mpt/IMG_1403.JPG
> 
> And from the backtrace:
> http://folk.uio.no/stalk/mpt/IMG_1404.JPG
> 
> Both times I hade the mpt0: request timed out just before the panic.
> 
> I'm not sure why it's not dumping it's core (It was working under stable,
> and I have dumpdev="AUTO" and dumpdir="/var/crash" in rc.conf)

What revision were you using?
Does using current as of r209598 make a difference?

Marius



Re: [sparc64] [panic] cheetah_ipi_selected: CPU can't IPI itself

2010-06-29 Thread Marius Strobl
On Mon, Jun 28, 2010 at 10:25:15AM -0400, Nathaniel W Filardo wrote:
> Well, I'm back in the same town as my sparc64 and so csup'd, built, and
> rebooted, trying to get more information about the "vm object not owned"
> panic I reported a while ago.  To my dismay, I now get this panic, also late
> enough in the boot process to be starting up jails:
> 
> panic: cheetah_ipi_selected: CPU can't IPI itself
> cpuid = 0
> KDB: stack backtrace:
> panic() at panic+0x1c8
> cheetah_ipi_selected() at cheetah_ipi_selected+0x48
> tlb_page_demap() at tlb_page_demap+0xdc
> pmap_copy_page() at pmap_copy_page+0x4c4
> vm_fault() at vm_fault+0x13ec
> trap_pfault() at trap_pfault+0x190
> trap() at trap+0xd0
> -- data access protection tar=0x224b93 sfar=0x224550 sfsr=0x85
> %o7=0x4063398c --
> userland() at 0x40633830
> user trace: trap %o7=0x4063398c
> ...
> 
> And the system hangs; I had to use the ALOM to reboot it.
> Sorry to not have more useful news.

Could you please give the following patch a try?
http://people.freebsd.org/~marius/sparc64_pin_ipis.diff
If that doesn't fix the above panic I have no clue how this can
happen apart from the per-CPU pages getting corrupted.

Marius



Re: Followup: if_em.c prevents the 2nd time resuming

2010-05-15 Thread Marius Strobl
On Sat, May 15, 2010 at 10:23:17PM +0900, Taku YAMAMOTO wrote:
> PR filed as kern/146614.
> http://www.freebsd.org/cgi/query-pr.cgi?pr=146614
> 

That was a mismerge introduced when moving the original patch
forward to a newer version of the e1000 source. It's now fixed.
Thanks for reporting.

Marius



Re: AESNI driver and fpu_kern KPI

2010-05-15 Thread Marius Strobl
On Sat, May 15, 2010 at 01:04:01PM +0300, Kostik Belousov wrote:
> 
> I am interested in the problem reports and reviews. Maintainers of
> !x86-oids are welcome to provide feedback whether they feel that
> proposed KPI could be implemented on their architectures, or what
> modifications they consider as needed to be able to implement
> it.
> 

FYI, sparc64 doesn't need such a KPI as it has supported using the
FPU in the kernel unconditionally for ages.

Marius



Re: Switchover to CAM ATA?

2010-05-03 Thread Marius Strobl
On Mon, Apr 26, 2010 at 09:18:07AM -0600, Scott Long wrote:
> On Apr 26, 2010, at 6:51 AM, Alexander Motin wrote:
> > Marius Strobl wrote:
> >> As noted earlier, pc98 and sparc64 need ada(4)/CAM ATA to perform
> >> geometry translation as done by ad_firmware_geom_adjust() for ad(4),
> >> which the following patch hooks up to both:
> >> http://people.freebsd.org/~marius/ata_disk_firmware_geom_adjust.diff
> >> You preferred to implement such functionality via XPT_CALC_GEOMETRY
> >> though (I'm still not convinced that it makes sense to put this
> >> functionality into every ATA SIM the same way it is done for SCSI
> >> rather than letting ada(4) handle it the same way for all SIMs
> >> however). Have you looked into implementing XPT_CALC_GEOMETRY for
> >> ATA CAM or is it okay to commit the above patch?
> > 
> > Sorry, I have forgotten about this.
> > 
> > I don't have better idea. For ATA translation seems indeed more
> > platform- then controller-specific. May be I would just preferred to see
> > this hack to be done inside XPT_CALC_GEOMETRY handler, as it is done now
> > for PC98 SCSI. But looking that whole this topic is quite crappy and
> > hopefully going to die sometimes, I won't argue much against committing
> > this as-is for now.
> 
> Put this into XPT_CALC_GEOMETRY.  There's no point in perpetuating the 
> mistakes of the ata driver.
> Give me a day or two to think of a reasonable way to do it right.
> 

Did you get further with this approach?

Marius



Re: mpt(4) MPI_EVENT_IR_RESYNC_UPDATE

2010-05-01 Thread Marius Strobl
On Fri, Apr 30, 2010 at 06:50:26PM +0400, pluknet wrote:
> On 30 April 2010 18:22, Matthew Jacob  wrote:
> > pluknet wrote:
> > Seems good to me- why not trhow it freebsd-scsi? if nobody says no, I'll put
> > it in
> 
> Err.. I thought that list is dedicated for cam related stuff.
> 
> [cc'ing scsi@ for better coverage. Sorry for cross-posting :/ ]
> 
> >
> >> --- RELENG_7_3/src/sys/dev/mpt/mpt_cam.c        2010-03-02 15:38:13.0 +0300
> >> +++ RELENG_7_3.ours/src/sys/dev/mpt/mpt_cam.c   2010-04-21 19:31:00.0 +0400
> >> @@ -2564,6 +2564,12 @@ mpt_cam_event(struct mpt_softc *mpt, req
> >>                 CAMLOCK_2_MPTLOCK(mpt);
> >>                 break;
> >>         }
> >> +       case MPI_EVENT_IR_RESYNC_UPDATE:
> >> +       {
> >> +               uint8_t resync = (data0 >> 16) & 0xff;
> >> +               mpt_prt(mpt, "IR resync update %d completed\n", resync);
> >> +               break;
> >> +       }
> >>         case MPI_EVENT_EVENT_CHANGE:
> >>         case MPI_EVENT_INTEGRATED_RAID:
> >>         case MPI_EVENT_SAS_DEVICE_STATUS_CHANGE:
> >>
> >> Another way - just hide such event since mptutil displays rebuild
> >> progress.
> >>
> >>
> 

Could you maybe avoid defining a variable inside a nested scope for
consistency with the majority of the existing cases and in order to
not violate style(9) unnecessarily?

Marius

Index: mpt_cam.c
===
--- mpt_cam.c   (revision 207463)
+++ mpt_cam.c   (working copy)
@@ -2575,6 +2575,10 @@ mpt_cam_event(struct mpt_softc *mpt, request_t *re
CAMLOCK_2_MPTLOCK(mpt);
break;
}
+   case MPI_EVENT_IR_RESYNC_UPDATE:
+   mpt_prt(mpt, "IR resync update %d completed\n",
+   (data0 >> 16) & 0xff);
+   break;
case MPI_EVENT_EVENT_CHANGE:
case MPI_EVENT_INTEGRATED_RAID:
case MPI_EVENT_SAS_DEVICE_STATUS_CHANGE:
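
As a standalone illustration of the expression used in the patch, the
following userland sketch extracts bits 16..23 of the event data word
and formats the message without a variable declared in a nested scope.
ir_resync_field() and format_resync_msg() are made-up helper names, and
snprintf() merely stands in for mpt_prt() here.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: return bits 16..23 of the event data word. */
static unsigned int
ir_resync_field(uint32_t data0)
{

	return ((data0 >> 16) & 0xff);
}

/*
 * Hypothetical stand-in for the mpt_prt() call: format the message
 * into buf and return what snprintf() returns.
 */
static int
format_resync_msg(char *buf, size_t len, uint32_t data0)
{

	return (snprintf(buf, len, "IR resync update %u completed\n",
	    ir_resync_field(data0)));
}
```

Passing the shifted and masked expression directly as the argument keeps
the case label free of a block and a local declaration.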


Re: Switchover to CAM ATA?

2010-04-24 Thread Marius Strobl
On Thu, Apr 22, 2010 at 06:31:37PM +0300, Alexander Motin wrote:
> Hi.
> 
> With time passed, CAM-based ATA infrastructure IMHO looks enough mature
> now to enable it in HEAD. Now we have two new stable drivers ahci(4) and
> siis(4), covering major part of modern SATA HBAs, `options ATA_CAM`
> wrapper for ata(4) to supports legacy hardware, and one more improved
> driver for Marvell HBAs (mvs) is now in development and soon will be
> present for testing. Together with many other people I have tested above
> at least on i386, amd64, arm and sparc64 architectures.
> 
> This switchover would give us significant performance improvement on new
> hardware because of NCQ support in ahci/siis/mvs drivers; improved
> functionality, including SATA Port Multipliers support, better hot-plug
> support; and reduced code duplication between ata(4) and cam(4)
> subsystems and applications.
> 
> Two issues left at this moment are:
>  1) POLA breakage due to disk device being renamed from adX to adaY;
>  2) lack of ataraid(4) alternative in new infrastructure. It should be
> reimplemented in GEOM in some way, but it still wasn't.
> 
> So what is the public opinion: Is the lack of ataraid(4) fatal or we can
> live without it?
> 
> Can we do switchover now, or some more reasons preventing this?
> 

As noted earlier, pc98 and sparc64 need ada(4)/CAM ATA to perform
geometry translation as done by ad_firmware_geom_adjust() for ad(4),
which the following patch hooks up to both:
http://people.freebsd.org/~marius/ata_disk_firmware_geom_adjust.diff
You preferred to implement such functionality via XPT_CALC_GEOMETRY
though (I'm still not convinced that it makes sense to put this
functionality into every ATA SIM the same way it is done for SCSI
rather than letting ada(4) handle it the same way for all SIMs
however). Have you looked into implementing XPT_CALC_GEOMETRY for
ATA CAM or is it okay to commit the above patch?

Marius



Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl
On Mon, Nov 10, 2003 at 02:12:56PM -0500, John Baldwin wrote:
> 
> On 10-Nov-2003 John Hay wrote:
> >> 
> >> With the new interrupt code I get:
> >> <...>
> >> OK boot
> >> cpuid = 0; apic id = 00
> >> instruction pointer = 0x0:0xa00
> >> stack pointer   = 0x0:0xffe
> >> frame pointer   = 0x0:0x0
> >> code segment= base 0x0, limit 0x0, type 0x0
> >> = DPL 0, pres 0, def32 0, gran 0
> >> processor eflags= interrupt enabled, vm86, IOPL = 0
> >> current process = 0 ()
> >> kernel: type 30 trap, code=0
> >> Stopped at  0xa00:  cli
> >> db> tr
> >> (null)(0,0,0,0,0) at 0xa00
> >> <...>
> >> 
> >> However, if I enter 'continue' at the DDB prompt it continues to boot
> >> and the system seems to run fine:
> >> 
> >> <...>
> >> db> continue
> > ...
> >> Copyright (c) 1992-2003 The FreeBSD Project.
> >> <...>
> >> 
> > 
> > Now why didn't I think of trying 'continue'? Hey there my old dual
> > Pentium I diskless machine is running in SMP mode.
> 
> Can you try this patch:
> 
> http://www.FreeBSD.org/~jhb/patches/atpic.patch
> 

Works here, thanks!
Btw., I also get such a stray interrupt on my Sun U60, IIRC also from the
printer port :)



Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl
On Thu, Nov 06, 2003 at 12:22:45PM -0500, John Baldwin wrote:
> 
> On 06-Nov-2003 Harti Brandt wrote:
> > JB>I figured out what is happenning I think.  You are getting a spurious
> > JB>interrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
> > JB>lists pending interrupts still waiting to be serviced.  Try using
> > JB>'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
> > JB>the spurious IRQ 7 interrupts go away.
> > 
> > Ok, that seems to help. Interesting although why do these interrupts
> > happen only with a larger HZ and when the kernel is doing printfs (this
> > machine has a serial console). I have also not tried to disable SIO2 and
> > the parallel port.
> 
> Can you also try turning mixed mode back on and using
> http://www.FreeBSD.org/~jhb/patches/spurious.patch
> 
> You should get some stray IRQ 7's in the vmstat -i output as well as a few
> printf's to the kernel console.
> 

I think I'm seeing something related here, with the old interrupt code I
got:
<...>
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...   
ACPI autoload failed - no such file or directory
stray irq 7
^^^
Copyright (c) 1992-2003 The FreeBSD Project.
<...>

With the new interrupt code I get:
<...>
OK boot
cpuid = 0; apic id = 00
instruction pointer = 0x0:0xa00
stack pointer   = 0x0:0xffe
frame pointer   = 0x0:0x0
code segment= base 0x0, limit 0x0, type 0x0
= DPL 0, pres 0, def32 0, gran 0
processor eflags= interrupt enabled, vm86, IOPL = 0
current process = 0 ()
kernel: type 30 trap, code=0
Stopped at  0xa00:  cli
db> tr
(null)(0,0,0,0,0) at 0xa00
<...>

However, if I enter 'continue' at the DDB prompt it continues to boot
and the system seems to run fine:

<...>
db> continue
SMAP type=01 base= len=0009f400
SMAP type=02 base=0009f400 len=0c00
SMAP type=02 base=000d len=0003
SMAP type=01 base=0010 len=1fdf
SMAP type=03 base=1fef len=f000
SMAP type=04 base=1feff000 len=1000
SMAP type=01 base=1ff0 len=0008
SMAP type=02 base=1ff8 len=0008
SMAP type=02 base=fec0 len=4000
SMAP type=02 base=fee0 len=1000
SMAP type=02 base=fff8 len=0008
Copyright (c) 1992-2003 The FreeBSD Project.
<...>

Neither the spurious interrupt patch nor setting 'options NO_MIXED_MODE'
makes a difference. This is on a Tyan Tiger MPX S2466N-4M board, a full
verbose boot log is at: http://quad.zeist.de/newintr.log
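
For context, the usual way to tell a spurious IRQ 7 on an 8259A from a
real one can be sketched as a pure function. This is not code from
either patch mentioned above; irq7_is_spurious() and SKETCH_IRQ7_BIT are
made-up names, and the sketch assumes the caller has already read the
master PIC's in-service register (ISR).

```c
#include <stdint.h>

/* Bit 7 of the in-service register corresponds to IRQ 7. */
#define	SKETCH_IRQ7_BIT	0x80u

/*
 * Return 1 if an interrupt delivered as IRQ 7 is spurious, 0 if it is
 * a real one. For a genuine IRQ 7 the PIC sets the matching ISR bit;
 * for a spurious one that bit stays clear. Reading the ISR from the
 * hardware is left to the caller.
 */
static int
irq7_is_spurious(uint8_t isr)
{

	return ((isr & SKETCH_IRQ7_BIT) == 0);
}
```

A handler using such a check would skip the EOI and just count the
event when the function returns 1.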



Re: g++ problem

2003-11-06 Thread Marius Strobl
On Thu, Nov 06, 2003 at 11:51:12AM -0500, Alexander Kabaev wrote:
> On Thu, 6 Nov 2003 17:44:59 +0100
> Marius Strobl <[EMAIL PROTECTED]> wrote:
> 
> > This happens with g++ 3.x ...
> This will happen with g++ 3.x, 2.x, 1.x and future 4.x too. I.e. the
> GCC is not at fault and the subject of the original message is
> misleading.
> 

It's at least not fatal with g++ 2.95.4 on 4-stable, that's what I
meant. But yes, GCC is not at fault.



Re: g++ problem

2003-11-06 Thread Marius Strobl
On Thu, Nov 06, 2003 at 11:28:28AM -0500, Alexander Kabaev wrote:
> On Thu, 6 Nov 2003 16:55:00 +0100 (CET)
> "C. Kukulies" <[EMAIL PROTECTED]> wrote:
> 
> > I tried to compile a virus-scanner for Linux that allows for scanning
> > Windoze PCs in a network for all sorts of recent viruses (RPC/DCOM and
> > such).
> > 
> > http://www.enyo.de/fw/software/doscan
> > 
> > Compilation fails with the following:
> > 
> > kukuboo2k# gmake
> > g++ -g -O2 -Wall -I/usr/local/include -I. -I. -I./lib \
> > -MMD -MF src/doscan.d \
> > -c -o src/doscan.o src/doscan.cc
> > In file included from src/doscan.cc:28:
> > /usr/local/include/getopt.h:115: error: declaration of C function `int
> > getopt()
> >' conflicts with
> > /usr/include/unistd.h:377: error: previous declaration `int
> > getopt(int, char*
> >const*, const char*)' here
> > gmake: *** [src/doscan.o] Error 1
> > 
> > I wonder where /usr/local/include comes from. If I remove that it
> > compiles smoothly.
> 
> Uhm, from you command line? What _this_ has to do with a compiler?
> 

This happens with g++ 3.x when the devel/libgnugetopt port is installed
and both its getopt.h and the base unistd.h are included. There are
several ports that have workarounds for this issue.
I have a patch for devel/libgnugetopt at
ftp://ftp.zeist.de/pub/patches/devel_libgnugetopt.diff
that should fix this issue by updating to the latest sources.
In my opinion the right thing to do is however to also include
getopt_long_only() in libc and not only getopt_long() so one can get
rid of the devel/libgnugetopt port. I have a patch for this at
ftp://ftp.zeist.de/pub/patches/src_getopt_long_only.diff
When I have time I'll continue testing of both and eventually submit
PRs.
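
For reference, a program that only needs getopt_long() can get by with
a single <getopt.h> providing it, roughly as in the sketch below. The
option table, struct sketch_opts and sketch_parse() are made up for
illustration and are not taken from doscan or from either patch.

```c
#include <getopt.h>
#include <stddef.h>

/* Hypothetical option set, for illustration only. */
static const struct option sketch_longopts[] = {
	{ "verbose",	no_argument,		NULL,	'v' },
	{ "output",	required_argument,	NULL,	'o' },
	{ NULL,		0,			NULL,	0 }
};

struct sketch_opts {
	int		 verbose;
	const char	*output;
};

/*
 * Parse argv into *out. Returns the index of the first non-option
 * argument, or -1 if an unknown option or a missing argument was seen.
 */
static int
sketch_parse(int argc, char *const argv[], struct sketch_opts *out)
{
	int ch;

	out->verbose = 0;
	out->output = NULL;
	optind = 1;		/* start scanning at the first argument */
	while ((ch = getopt_long(argc, argv, "vo:", sketch_longopts,
	    NULL)) != -1) {
		switch (ch) {
		case 'v':
			out->verbose++;
			break;
		case 'o':
			out->output = optarg;
			break;
		default:
			return (-1);
		}
	}
	return (optind);
}
```

The short options "-v" and "-o file" are accepted as well because they
are listed in the option string passed to getopt_long().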



Re: panic with cdrecord-devel @ Mon Oct 20 21:28:57 EEST 2003

2003-10-21 Thread Marius Strobl
On Tue, Oct 21, 2003 at 10:27:11AM +0300, Paulius Bulotas wrote:
> Hello,
> 
> 5.1-CURRENT #0: Mon Oct 20 21:28:57 EEST 2003
> 
> % cdrecord -scanbus
> panics, and trace looks like:
> vmapbuf
> cam_periph_mapmem
> xptioctl
> spec_ioctl
> spec_vnoperate
> vn_ioctl
> ioctl
> syscall(2f,2f,2f,,3)
> Xint0x80_syscall
> 
> Any ideas?
> 

I was told that a `camcontrol devlist` triggers a panic in
sys/kern/vfs_bio.c:3729 on recent -current and that turning off
INVARIANTS and WITNESS avoids it. This may be related.



Re: Random signals in {build,install}world recently?

2003-10-20 Thread Marius Strobl
On Mon, Oct 20, 2003 at 05:08:26PM +0200, Christian Brueffer wrote:
> On Mon, Oct 20, 2003 at 10:50:02AM -0400, Barney Wolff wrote:
> > On Mon, Oct 20, 2003 at 03:20:56PM +0200, Mark Santcroos wrote:
> > > On Mon, Oct 20, 2003 at 10:27:38AM +0200, Harti Brandt wrote:
> > > > On Mon, 20 Oct 2003, Vallo Kallaste wrote:
> > > > 
> > > > VK>Hi
> > > > VK>
> > > > VK>It seems to be a recent problem. The hardware is OK, both Windows XP
> > > > VK>(which I use very seldom) and Gentoo Linux do not exhibit any
> > > > VK>problems.
> > > > VK>Basically one will get random signals as I have got in build- and
> > > > VK>installworld. It's impossible to complete make -j2 buildworld on my
> > > > VK>machine, but sometimes non-parallel buildworld will do, only to die
> > > > VK>later in installworld.
> > > > VK>This is on two-processor AMD 2400+ MP system, ASUS A7M-266D mobo and
> > > > VK>1GB ECC memory, ATA disks and CD/RW-DVD only. 4BSD scheduler if it
> > > > VK>matters.
> > > > 
> > > > I have the same MB just with 1800+ processors. I had to reduce the CPU
> > > > frequency by about 10% in the BIOS setup to get the machine stable. I
> > > > assume the problem is actually the memory.
> > > 
> > > Couldn't the following be of help here?
> > > 
> > > options DISABLE_PSE
> > > options DISABLE_PG_G
> > 
> > I don't think so.  I tried that on my A7M266D with no effect.  I believe
> > something in recent pmap code doesn't like this mobo, or maybe dual
> > athlons in general.  I can run RELENG_5_1 rock solid, and -current from
> > 9/24/03 rock solid, but -current from 10/3 or later gets random sigs
> > and eventually panics.  I have scsi disks so it's not ata.
> > 
> 
> I have the same experiences.  Also AMD A7M-266D with two 1800+ Athlons here.
> Used to work fine, but got random signals with my latest builds.
> 

Here, too. However, on a Tyan Thunder K7 with MP 1200 and a Tyan Tiger MPX
with MP 1600+. In addition to random signals I also get ICEs from GCC at
random places and freezes with `make -jX buildworld` but no panics.



Re: panic: vm_map_wire: lookup failed

2003-10-17 Thread Marius Strobl
On Fri, Oct 17, 2003 at 03:19:12PM +0200, John Hay wrote:
> > > > 
> > > > The latest development source of ntpd started to use setrlimit() before
> > > > using mlockall(). This combination proves fatal on -current. The code
> > > > in ntpd/ntpd.c looks like this:
> > > 
> > > Ok, I found an easier way to provoke the panic. Just compile the following
> > > program like this:
> > 
> > >   if (mlockall(MCL_CURRENT|MCL_FUTURE) < 0)
> > >   perror("mlockall()");
> > 
> > Did you tested it on a recent -current? It is supposed to be fixed
> > (since a day or two, I think).
> 
> Nope it is still not fixed. I have tried again just now and it still
> throw a panic on both UP and SMP. Try for yourself if you are brave.
> :-)))
> 

It's probably a different problem; the mlockall()-related panic that
was fixed had "vm_fault_copy_wired: page missing" or "mutex vm object
not owned at /usr/src/sys/vm/vm_page.c:7XX" (depending on the FreeBSD
version) as the panic message.



Re: panic with cdrecord -- anybody else seeing this? [backtrace obtained]

2003-10-12 Thread Marius Strobl
On Sun, Oct 12, 2003 at 09:18:08PM -0700, Kris Kennaway wrote:
> On Sun, Oct 12, 2003 at 06:32:21PM -0700, John Reynolds wrote:
> > Hi all, forgive me if I give incomplete information. This is the first time
> > I've created a debugging kernel and gotten a dump after a panic, so I might not
> > have done everything right.
> > 
> > Ever since the tail end of July it seems, any time I've tried to burn a CD with
> > cdrecord (cdrtools 2.0.3 from ports) I get a panic
> > 
> >   vm_fault_copy_wired: page missing
> > 
> > General busy-ness and the thought that "somebody will see it too and fix it"
> > has prevented me from caring too much about it until now, but it seems it's
> > still there in the kernel from Oct 11th, and I figured I might as well try to
> > provide somebody some information ..
> 
> Thanks..Alan made a commit which he thought might have fixed this, but
> someone else also claimed it did not.
> 
> See also
> 
>   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/56380
> 

And
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/57611

The latter is a bit more detailed and correct (it's not limited to ATAPI
burners). It also doesn't seem to be limited to cdrecord; the latest
ntpd also causes a panic when using mlockall(2), as reported on this list,
although the backtrace looks different.
Btw., the cdrtools-devel port contains a workaround.



Re: X does not work on todays current with ATI

2003-10-05 Thread Marius Strobl
On Sun, Oct 05, 2003 at 04:56:53PM +0200, Matt Douhan wrote:
> Hello
> 
> I am unable to start X with a current as of today 08.00 CEST, it crashes to
> the debugger with a fatal trap 12, I poked around to see if anything useful
> was in the logs but I could not find anything, can you please advice what
> log I could send to aid in hunting down this problem?
> 

"Me, too"

GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-undermydesk-freebsd"...
panic: from debugger
panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 0100
fault virtual address   = 0x1c
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc0503499
stack pointer   = 0x10:0xdc91ab9c
frame pointer   = 0x10:0xdc91abb8
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 582 (XFree86)
panic: from debugger
cpuid = 0; lapic.id = 0100
boot() called on cpu#0
Uptime: 28m17s
Dumping 512 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 
384 400 416 432 448 464 480 496
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc050c3a3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372
#2  0xc050c788 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
#3  0xc043eff2 in db_panic () at /usr/src/sys/ddb/db_command.c:450
#4  0xc043ef6a in db_command (last_cmdp=0xc06d8e00, cmd_table=0x0, 
aux_cmd_tablep=0xc0693a6c, aux_cmd_tablep_end=0xc0693a70)
at /usr/src/sys/ddb/db_command.c:346
#5  0xc043f078 in db_command_loop () at /usr/src/sys/ddb/db_command.c:472
#6  0xc0441db9 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:73
#7  0xc0623143 in kdb_trap (type=12, code=0, regs=0xdc91ab5c)
at /usr/src/sys/i386/i386/db_interface.c:171
#8  0xc063bb96 in trap_fatal (frame=0xdc91ab5c, eva=0)
at /usr/src/sys/i386/i386/trap.c:814
#9  0xc063b881 in trap_pfault (frame=0xdc91ab5c, usermode=0, eva=28)
at /usr/src/sys/i386/i386/trap.c:733
#10 0xc063b453 in trap (frame=
  {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -1066886654, tf_esi = 1645,
tf_ebp = -594433096, tf_isp = -594433144, tf_ebx = 0, tf_edx = 0, tf_ecx = 1,
tf_eax = 0, tf_trapno = 12, tf_err = 0, tf_eip = -1068485479, tf_cs = 8,
tf_eflags = 66118, tf_esp = 1, tf_ss = -997343232}) at /usr/src/sys/i386/i386/trap.c:418
#11 0xc0624a48 in calltrap () at {standard input}:103
#12 0xc05fb35e in vm_page_zero_invalid (m=0x66d, setvalid=1)
at /usr/src/sys/vm/vm_page.c:1645
---Type <return> to continue, or q <return> to quit---
#13 0xc05ebca2 in vm_fault (map=0xc19216e4, vaddr=674037760, 
fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_pager.h:131
#14 0xc063b7b6 in trap_pfault (frame=0xdc91ad48, usermode=1, eva=674041472)
at /usr/src/sys/i386/i386/trap.c:709
#15 0xc063b364 in trap (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077937450, tf_esi = 674041472, 
tf_ebp = -1077937480, tf_isp = -594432652, tf_ebx = 674037760, tf_edx = 2, tf_ecx = 2, 
tf_eax = -1077937450, tf_trapno = 12, tf_err = 4, tf_eip = 673893577, tf_cs = 31, 
tf_eflags = 66050, tf_esp = -1077937532, tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:317
#16 0xc0624a48 in calltrap () at {standard input}:103
---Can't read userspace from dump, or kernel process---



Re: ATAng still problematic

2003-09-19 Thread Marius Strobl
On Sat, Sep 20, 2003 at 01:47:44AM +0100, Bruce M Simpson wrote:
> On Sat, Sep 20, 2003 at 02:17:21AM +0200, Marius Strobl wrote:
> > > Isn't it still a kernel bug if a user process can trigger a panic?
> > 
> > Yes, it seems to be a bug in the mlockall(2) implementation. Backing
> > it out or preventing cdrecord from using it avoids the panic. I already
> > wrote an email to bms@, who committed the mlockall(2) and munlockall(2)
> > support, regarding this issue.
> 
> I don't think that's been conclusively established yet, so statements
> of the form above are a bit unhelpful.
> 

Ok, sorry.

> The problem may well lie elsewhere in the system, as a parameter in
> vm_map_copy_entry() is being unexpectedly set to NULL in the backtrace
> which you provided me with.
> 

It's certainly not ATAng or ATAPICAM, though, as I get this panic on a
SCSI-only box, too.

> If more people can exercise the same codepath as you appear to be
> exercising with different configurations, then I will have more to go on.
> 


Re: ATAng still problematic

2003-09-19 Thread Marius Strobl
On Fri, Sep 19, 2003 at 04:36:32PM -0700, Kris Kennaway wrote:
> On Fri, Sep 19, 2003 at 06:21:52PM +0200, Marius Strobl wrote:
> > On Thu, Sep 18, 2003 at 05:51:25PM +0200, Jan Srzednicki wrote:
> > > 
> > > Anyway, here's backtrace for atapicam panic I've mentioned. It's
> > > triggered by:
> > > 
> > > cdrecord dev=1,1,0 /some/track
> > > 
> > 
> > This panic isn't ATAPICAM related. Could you try the patch below? It's
> > against the cdrtools-devel port but should also work with the cdrtools
> > port.
> 
> Isn't it still a kernel bug if a user process can trigger a panic?
> 

Yes, it seems to be a bug in the mlockall(2) implementation. Backing
it out or preventing cdrecord from using it avoids the panic. I already
wrote an email to bms@, who committed the mlockall(2) and munlockall(2)
support, regarding this issue.
The patch for the cdrtools ports is only a workaround until the real
cause is fixed. I also wasn't sure whether it would work for Bryan, as I
originally didn't get the same panic as he did.



Re: ATAng still problematic

2003-09-19 Thread Marius Strobl
On Thu, Sep 18, 2003 at 05:51:25PM +0200, Jan Srzednicki wrote:
> 
> Anyway, here's backtrace for atapicam panic I've mentioned. It's
> triggered by:
> 
> cdrecord dev=1,1,0 /some/track
> 

This panic isn't ATAPICAM related. Could you try the patch below? It's
against the cdrtools-devel port but should also work with the cdrtools
port.


Index: files/patch-conf::configure
===================================================================
RCS file: files/patch-conf::configure
diff -N files/patch-conf::configure
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ files/patch-conf::configure 19 Sep 2003 16:03:35 -0000
@@ -0,0 +1,10 @@
+--- conf/configure.orig	Fri Sep 19 16:47:37 2003
++++ conf/configure	Fri Sep 19 16:49:26 2003
+@@ -5567,6 +5567,7 @@
+ int
+ main()
+ {
++	exit(1);
+ 	if (mlockall(MCL_CURRENT|MCL_FUTURE) < 0) {
+ 		if (errno == EINVAL || errno ==  ENOMEM ||
+ 		    errno == EPERM  || errno ==  EACCES)
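
As far as can be told from the hunk, the added exit(1) makes the
configure probe for mlockall(2) fail unconditionally, so the build treats
mlockall(2) as unavailable and cdrecord never calls it. A minimal sketch of
the same idea as a runtime probe follows; probe_mlockall() and its
force_disable flag are hypothetical names for illustration and are not part
of cdrtools or its configure script.

```c
#include <assert.h>
#include <errno.h>
#include <sys/mman.h>

/*
 * Hypothetical probe: returns 1 if mlockall(2) can be used, 0 otherwise.
 * force_disable plays the role of the exit(1) the patch inserts: the
 * probe reports "not usable" before mlockall(2) is ever called.
 */
static int
probe_mlockall(int force_disable)
{
	assert(force_disable == 0 || force_disable == 1);

	if (force_disable)
		return (0);		/* workaround: never touch mlockall(2) */

	if (mlockall(MCL_CURRENT | MCL_FUTURE) < 0)
		return (0);		/* e.g. EPERM or ENOMEM: treat as unusable */

	/* The probe only tests availability, so release the locks again. */
	(void)munlockall();
	return (1);
}
```

With force_disable set to 1 the function returns 0 regardless of what the
kernel would do, which is the effect the workaround relies on.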


Re: Question about genassym, locore.s and 0-sized arrays (showstopper for an icc compiled kernel)

2003-09-05 Thread Marius Strobl
On Fri, Sep 05, 2003 at 07:34:39PM +1000, Bruce Evans wrote:
> On Fri, 5 Sep 2003, I wrote:
> 
> > ...
> > If some values are unrepresentable then they need to be represented
> > using other values.  E.g., add 1 to avoid 0, or multiply by the alignment
> > size if some element of the tool chain insists on rounding up things
> > for alignment like a broken aout version used to do.  16-bit values
> > would need 17 bits to represent after adding 1.
> 
> Better, add 0x10000 to avoid 0.  awk has no support for parsing hex numbers
> so subtracting the bias of 1 would take a lot more code, but ignoring
> leading hexdigits requires no changes in genassym.sh -- it already ignores
> everything except the last 4 hexdigits.
> 

This works, too. Thanks for the detailed explanation Bruce!
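
A minimal sketch of why the bias trick needs no decoding step, assuming the
bias is 0x10000 (one more than the largest 16-bit word): the biased size is
never zero, and keeping only the last 4 hex digits of its printed form gives
back the original word. The names below (SKETCH_BIAS, biased_size,
format_size, decode_last4) are made up for illustration and are not taken
from genassym.sh or the kernel sources.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_BIAS	0x10000UL	/* assumed bias, see lead-in */

/* Size an array would get for a 16-bit word once the bias is added. */
static unsigned long
biased_size(unsigned word)
{
	assert(word <= 0xFFFF);
	return (word + SKETCH_BIAS);	/* always >= 0x10000, never 0 */
}

/* Print the biased size the way a symbol table dump might show it. */
static void
format_size(unsigned word, char *buf, size_t buflen)
{
	snprintf(buf, buflen, "%016lx", biased_size(word));
}

/* Recover the word by looking only at the last 4 hex digits. */
static unsigned long
decode_last4(const char *hex)
{
	size_t len = strlen(hex);
	const char *tail = len > 4 ? hex + len - 4 : hex;

	return (strtoul(tail, NULL, 16));
}
```

Because the bias only touches digits above the low four, a script that
already discards everything except those digits does not have to subtract
anything.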



Re: Question about genassym, locore.s and 0-sized arrays (showstopper for an icc compiled kernel)

2003-09-04 Thread Marius Strobl
On Thu, Sep 04, 2003 at 03:47:09PM -0700, Marcel Moolenaar wrote:
> 
> We use the size of the symbol (ie the size of the object identified
> by the symbol) to pass around values. This we do by creating arrays.
> If we want to export a C constant 'FOOBAR' to assembly and the constant
> is defined to be 6, then we create an array for the sign, of which the
> size is 1 for negative numbers and 0 otherwise. In this case the array
> will be named FOOBARsign and its size is 0. We also create 4 arrays (*w0,
> *w1, *w2 and *w3), each with a maximum of 64K and corresponding to the
> 4 16-bit words that constitute a single 64-bit entity.
> In this case
>   0000000000000006 C FOOBARw0
>   0000000000000000 C FOOBARw1
>   0000000000000000 C FOOBARw2
>   0000000000000000 C FOOBARw3
> 
> If the compiler creates arrays of size 1 for arrays we define as a
> zero-sized array, you get exactly what you've observed.
> 

Is this rather complex approach really necessary? I have successfully
generated assym.s files using genassym.sh(8) from NetBSD with both ICC and
GCC on i386, and they have exactly the same values as one generated with
sys/kern/genassym.sh from FreeBSD. NetBSD's genassym.sh(8) exports the
C constants more or less directly, so it needs just one symbol per
constant and doesn't require zero-sized arrays. Given that it's from
NetBSD, their approach should also be very MI.
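
For reference, the encoding Marcel describes can be modelled at run time
roughly as follows. This is only a sketch: it computes the sizes the sign
and w0..w3 arrays would have instead of declaring the arrays, it assumes the
words carry the absolute value of the constant, and the names assym_words,
assym_split and assym_join are hypothetical rather than anything from the
FreeBSD sources.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes the five arrays would have for one exported constant. */
struct assym_words {
	unsigned sign;		/* 1 for negative values, 0 otherwise */
	uint16_t w[4];		/* w[0] is the least significant word */
};

/* Split a 64-bit constant into a sign and four 16-bit words. */
static struct assym_words
assym_split(int64_t value)
{
	struct assym_words out;
	/* Magnitude, computed in unsigned arithmetic to avoid overflow. */
	uint64_t mag = value < 0 ? (uint64_t)0 - (uint64_t)value :
	    (uint64_t)value;

	out.sign = value < 0;
	for (int i = 0; i < 4; i++)
		out.w[i] = (uint16_t)(mag >> (16 * i));
	return (out);
}

/* Reassemble the constant, as the script reading the sizes would. */
static int64_t
assym_join(const struct assym_words *in)
{
	uint64_t mag = 0;

	assert(in->sign <= 1);
	for (int i = 3; i >= 0; i--)
		mag = (mag << 16) | in->w[i];
	return (in->sign ? (int64_t)((uint64_t)0 - mag) : (int64_t)mag);
}
```

For FOOBAR = 6 this gives sign = 0 and w[0] = 6 with the other words 0,
i.e. several zero sizes, which is where a compiler that turns zero-sized
arrays into arrays of size 1 breaks the scheme.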
