from:"Marius Strobl"

Re: SD card reader only works after a suspend/resume

2018-09-12 Thread Marius Strobl

On Fri, Sep 07, 2018 at 04:52:12PM +0200, Jakob Alvermark wrote:
> On 9/7/18 12:41 AM, Marius Strobl wrote:
> > On Thu, Sep 06, 2018 at 12:33:39PM +0200, Jakob Alvermark wrote:
> >> Hi,
> >>
> >>
> >> I discovered this by chance.
> >>
> >> The SD card reader in my laptop has never worked, but now I noticed it
> >> does after suspending and resuming.
> >>
> >> The controller is probed and attached on boot:
> >>
> >> sdhci_acpi1:  iomem
> >> 0x90a0-0x90a00fff irq 47 on acpi0
> >>
> >> But nothing happens if I put a card in. Unless I suspend and resume:
> >>
> >> mmc1:  on sdhci_acpi1
> >> mmcsd0: 32GB  at mmc1
> >> 50.0MHz/4bit/65535-block
> >>
> >> Then I can remove and replug cards and it seems to work just fine.
> > I believe that making SD card insertion/removal with the integrated
> > SDHCI controllers of newer Intel SoCs work out-of-the-box requires
> > support for ACPI GPE interrupts and ACPI GPIO events respectively to
> > be added to FreeBSD. Otherwise insertion/removal interrupts/events
> > aren't reported and polling the card present state doesn't generally
> > work as a workaround with these controllers either, unfortunately.
> > I'm not aware of anyone working on the former, though.
> >
> > Polling the card present state happens to work one time after SDHCI
> > initialization with these controllers, which is why a card will be
> > attached when inserted as part of a suspend/resume cycle (resume of
> > mmc(4) had some bugs until some months ago, which probably explains
> > why that procedure hasn't worked as a workaround for you in the past).
> > Inserting the card before boot, unloading/loading sdhci_acpi.ko or
> > triggering detach/attach of sdhci_acpi(4) via devctl(8) should allow
> > a card to be attached, too.
> 
> 
> If a card is inserted before booting it is not detected.
> 
> Removing and inserting card after boot is not detected unless I suspend 
> and resume.
> 
> After I have suspended and resumed once, cards are detected. Removals 
> and insertions are detected as they happen.

Okay, then you are seeing somewhat different behavior than I do. What
SoC model is this? Are you loading a GPIO controller driver such as
bytgpio(4) or chvgpio(4)? Doing so might be sufficient to kick ACPI
GPIO events into working but would be missing dependency information
between drivers (which might explain what you are experiencing if
sdhci_acpi1 attaches first) and some other bits to do it properly.
Also, could you please try whether doing a suspend/resume cycle of
sdhci_acpi1 via devctl(8) only kicks the card detection into working?
That test should indicate whether the firmware plays a role in making
the latter work.

Marius

___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"

Re: SD card reader only works after a suspend/resume

2018-09-06 Thread Marius Strobl

On Thu, Sep 06, 2018 at 12:33:39PM +0200, Jakob Alvermark wrote:
> Hi,
> 
> 
> I discovered this by chance.
> 
> The SD card reader in my laptop has never worked, but now I noticed it 
> does after suspending and resuming.
> 
> The controller is probed and attached on boot:
> 
> sdhci_acpi1:  iomem 
> 0x90a0-0x90a00fff irq 47 on acpi0
> 
> But nothing happens if I put a card in. Unless I suspend and resume:
> 
> mmc1:  on sdhci_acpi1
> mmcsd0: 32GB  at mmc1 
> 50.0MHz/4bit/65535-block
> 
> Then I can remove and replug cards and it seems to work just fine.

I believe that making SD card insertion/removal with the integrated
SDHCI controllers of newer Intel SoCs work out-of-the-box requires
support for ACPI GPE interrupts and ACPI GPIO events respectively to
be added to FreeBSD. Otherwise insertion/removal interrupts/events
aren't reported and polling the card present state doesn't generally
work as a workaround with these controllers either, unfortunately.
I'm not aware of anyone working on the former, though.

Polling the card present state happens to work one time after SDHCI
initialization with these controllers, which is why a card will be
attached when inserted as part of a suspend/resume cycle (resume of
mmc(4) had some bugs until some months ago, which probably explains
why that procedure hasn't worked as a workaround for you in the past).
Inserting the card before boot, unloading/loading sdhci_acpi.ko or
triggering detach/attach of sdhci_acpi(4) via devctl(8) should allow
a card to be attached, too.

Marius

Re: SG116j install crashed

2018-01-22 Thread Marius Strobl

On Sat, Jan 20, 2018 at 08:25:08PM +0900, KIRIYAMA Kazuhiko wrote:
> At Thu, 18 Jan 2018 15:35:41 +0900,
> my wrote:
> > 
> > Hi, all
> > 
> > I've bought Biccamera's original bland note PC (SG116j)
> > impulsively because of cheapness($1780). I've installed
> > 12.0-CURRENT(r327788) right away. Booted smoothly but set
> > loader conf "unset hint.uart.1.at" and configure disk with
> > 
> > mmcsd0  58 GB   GPT
> >   mmcsd0p1  200MB   efi
> >   mmcsd0p2  54 GB   freebsd-ufs /
> >   mmcsd0p3  4.2 GB  freebsd-swap    none
> > 
> > But in "Fetching distribution files" of base.txz, it crashed
> > with:
> > 
> > sdhci_acpi0-slot0: Controller timeout
> > sdhci_acpi0-slot0: == REGISTER DUMP ==
> > sdhci_acpi0-slot0: Sys addr: 0x02158000 | Version:  0x1002
> > sdhci_acpi0-slot0: Blk size: 0x0200 | Blk cnt:  0x00f8
> > sdhci_acpi0-slot0: Argument: 0x017f57e8 | Trn mode: 0x0027
> > sdhci_acpi0-slot0: Present:  0x1fff0106 | Host ctl: 0x1025
> > sdhci_acpi0-slot0: Power:0x000b | Blk gap:  0x0080
> > sdhci_acpi0-slot0: Wake-up:  0x | Clock:0x0007
> > sdhci_acpi0-slot0: Timeout:  0x0007 | Int stat: 0x0001
> > sdhci_acpi0-slot0: Int enab: 0x05ff0033 | Sig enab: 0x05ff003a
> > sdhci_acpi0-slot0: AC12 err: 0x8000 | Host ctl2:0x008b
> > sdhci_acpi0-slot0: Caps: 0x446cc8b2 | Caps2:0x0807
> > sdhci_acpi0-slot0: Max curr: 0x | ADMA err: 0x
> > sdhci_acpi0-slot0: ADMA addr:0x | Slot int: 0x
> > sdhci_acpi0-slot0: ===
> > mmcsd0: Error indicated: 1 Timeout
> >   :
> > (snip)
> >   :
> > Stopped at kdb_enter+0x3b: movq    $0,kdb_why
> > db>
> > 
> > Detail log has put in [1]. BTW I used [2] so all stuffs are
> > within it and it should not be fetched to internet.
> > 
> > Is there any idea to go forth?
> > 
> > Best regards.
> > 
> > [1] http://35.200.82.201/~kiri/freebsd/sg116j/crash_in_install.jpeg
> > [2] FreeBSD-12.0-CURRENT-amd64-20180110-r327788-memstick.img
> 
> I've got the r328126 memstick image and installed with it, then all went
> perfectly! Thanx to the FreeBSD-CURRENT team!

FYI, I believe that you had hit the bug fixed in r327924; sorry about
that.

Marius

Re: [tzsetup] can't set up local timezone if CMOS is set to UTC

2017-08-07 Thread Marius Strobl

On Mon, Aug 07, 2017 at 09:51:15AM +0300, Boris Samorodov wrote:
> 07.08.2017 09:44, Boris Samorodov wrote:
> > Hi Marius, All,
> > 
> > Subj at today's amd64-HEAD. If I use command "sudo tzsetup" and
> > choose YES (CMOS clock is set to UTC), the program just quits.
> > Yea, my clocks are at UTC but I want to get time at local timezone. :-)
> > 
> > I've found a recent commit to tzsetup, is it the cause?
> 
> Hm. There is a log message at r322097:
> ---
> - Make the initial UTC dialog actually work by giving the relevant files
>the necessary treatment and then exit when choosing "Yes" there instead
>of moving on to the time zone menu regardless.
> ---
> 
> I must misunderstand something.
> 
> So my question is: how to set up local time zone if CMOS is set to UTC?

Yeah, I hadn't thought of the case where one would like to set up
a configuration in which the RTC is using UTC but the timezone is
not. So I've reverted the corresponding part of r322097 for now as
I don't see an obvious way to give /etc/wall_cmos_clock appropriate
treatment in all 3 relevant cases (UTC/UTC, !UTC/UTC and !UTC/!UTC
regarding RTC/timezone) for all interactive and non-interactive
ways of using tzsetup(8).

Marius

Re: Re: Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-13 Thread Marius Strobl

On Sat, Feb 13, 2016 at 09:21:06PM +0100, Stefan Kohl wrote:
> Hi Marius,
> 
> I finally got my RT 8168 Ethernet Card (Zotac Ri323) working after
> patching if_re.c (r295601). Contrary to the assumption that
> HWREV_8168E_VL with Chip Rev 0x2c80 should not require RTL8168G
> handling as soon as I expand the sc->rl_flags for the respective
> HWREV and define the (ominous) 8168G_Plus Flag for RL_HWREV_8168E_VL
> the card is functioning correctly.

My best guess currently is that treating HWREV_8168E_VL as an RTL8168G
or later chip - which it simply isn't - serves as a workaround by e.g.
resetting parts of the RX/TX MAC configuration; that doesn't make it
an appropriate fix, though. I have a WIP which does a more complete
initialization of Realtek Ethernet MACs, part of which is a workaround
for broken BIOSes and is specific to HWREV_8168E_VL. I suspect that's
the more likely cause of your problem and would also explain why there
has been no other such report so far. Currently, 10.3-RELEASE and its
show-stoppers have higher priority for me, though.

> When broken (without the patch) I got the following tcpdump output:
> 
> 19:18:46.299360 00:00:00:00:00:00 (oui Ethernet) > 00:00:00:00:00:00
> (oui Ethernet) Null Information, send seq 0, rcv seq 0, Flags [Command],
> length 84

Actually, this pretty much confirms the assumption that your problem
is caused by a broken BIOS as the correct workaround for that bug
consists of making the GMAC aware of the MAC address via the driver
in addition to only setting it in the MAC.
Err, wait, IIRC yongari@ had a similar change as far as the broken
BIOS workaround is concerned. You may want to give the following
patch a try instead of treating HWREV_8168E_VL as RTL8168G+ (I don't
know whether that patch applies cleanly to current re(4), though):
https://people.freebsd.org/~yongari/re/re.8168evl.diff

Marius

Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-05 Thread Marius Strobl

On Wed, Feb 03, 2016 at 08:57:01PM +0100, s@web.de wrote:
> After updating -current at Jan, 31st (r295091) the Realtek ethernet device 
> driver of my Zotac ZBox RI323 mini pc seems to be broken: I can neither 
> connect to the host even though the interface is shown as active, nor can I 
> initiate connection from the host through re0.
> Reverting the kernel to my previous build -current r290151 (install date Nov 
> 1st, 2015) the re0 interface is working OK.
> 
> Looking through the svn logs regarding /head/sys/dev/re/if_re.c I suspect
> that Revision 290566 might have something to do with this and that I have to
> include my Realtek chipset in the exclusion list for "enabling RX/TX after
> initial configuration" (or vice versa; I am really confused here), but I haven't
> got a clue how, as I do not know how to find the right RL_HWREV_XXX flag for
> my device.
> 
> dmesg shows RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet and 
> pciconf -l -v re0 shows:
> re0@pci0:2:0:0: class=0x02 card=0x816819da chip=0x816810ec rev=0x07 
> hdr=0x00
> vendor = 'Realtek Semiconductor Co., Ltd.'
> device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller'
> 
> I am grateful for any suggestion towards a solution and I am willing (and 
> able) to assist by patching or debugging my kernel or giving further hw 
> information about my system.
> 

Hrm, does that happen to be RL_HWREV_8411B (0x5c80) according to
the "Chip rev." in the dmesg(8) output?

Marius

Re: Re: Realtek 8168/8111 if_re not working in current r295091

2016-02-05 Thread Marius Strobl

On Fri, Feb 05, 2016 at 09:04:02PM +0100, s@web.de wrote:
> Hi Marius and Pyun,
> 
> actually it is Chip rev. 0x2c80 (I have overlooked that information in my 
> first post)
> 
> re0:  port 
> 0xe000-0xe0ff mem 0xf0104000-0xf0104fff,0xf010-0xf0103fff irq 19 at 
> device 0.0 on pci2
> re0: Using 1 MSI-X message
> re0: turning off MSI enable bit.
> re0: Chip rev. 0x2c80
> re0: MAC rev. 0x0010
> miibus0:  on re0
> rgephy0:  PHY 1 on miibus0
> 
> Does that help in any way? Thanks Stefan
> 

Unfortunately, it doesn't make a whole lot of sense to me; 0x2c80
translates to RL_HWREV_8168E_VL, which is an older chip that should
never have required the handling of RTL8168G and later revisions (or
may not actually work when applying it). So r290566 should only make
a positive difference, if it changes anything for that revision at all.
Did the interface work before r290151, or actually before r281337?
Does reverting r290946 and r290566 locally make it work again?
Another candidate causing that breakage would be r291676 if the PHY
is an RTL8211F one. If you boot verbosely, you'll have a line in the
dmesg(8) output with "OUI 0x00e04c" in it. If the "rev." number in
that line is 6, you have an RTL8211F.
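
As a rough illustration of that check, the following C sketch scans one
verbose-boot line for the OUI and the revision. The function name and the
exact layout of the line it expects are assumptions made for this example
and are not taken from the mii(4) code; only the values 0x00e04c and 6
come from the paragraph above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if a dmesg line carries "OUI 0x00e04c" and "rev. 6", else 0.
 * Hypothetical helper; it only assumes the line contains the substrings
 * "OUI 0x<hex>" and "rev. <decimal>" somewhere.
 */
int
looks_like_rtl8211f(const char *line)
{
	const char *p;
	unsigned int oui;
	int rev;

	assert(line != NULL);

	/* Locate and parse the OUI field. */
	p = strstr(line, "OUI 0x");
	if (p == NULL || sscanf(p, "OUI 0x%x", &oui) != 1)
		return (0);

	/* Locate and parse the revision field. */
	p = strstr(line, "rev. ");
	if (p == NULL || sscanf(p, "rev. %d", &rev) != 1)
		return (0);

	return (oui == 0x00e04c && rev == 6);
}
```

Any line without an OUI field, such as the "Chip rev." line of re(4), is
simply rejected by this sketch.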

Marius

Re: sparc64 traps during probe (r293243)

2016-01-08 Thread Marius Strobl

On Fri, Jan 08, 2016 at 10:42:33AM -0500, Kurt Lidl wrote:
> I recently updated a sparc64 V120 from r291993
> to r293243, and it now traps during the
> autoconfiguration phase of the kernel boot:
> 

<...>

> -- data access exception sfar=0xfcf821ca0218 sfsr=0x41029 
> %o7=0xc06165e8 --

What code line does 0xc06165e8 translate to?

Marius

Re: sparc64 traps during probe (r293243)

2016-01-08 Thread Marius Strobl

On Fri, Jan 08, 2016 at 04:57:58PM +, Mark Cave-Ayland wrote:
> On 08/01/16 15:42, Kurt Lidl wrote:
> 
> This looks amazingly similar to what I get trying to boot FreeBSD under
> QEMU, i.e. pointing at sched_clock():
> 

<...>

> -- kernel stack fault %o7=0xc011a050 --
> panic: longjmp botch
> cpuid = -1012475520
> KDB: stack backtrace:
> Uptime: 3s
> 
> Note also the "longjmp botch" error right at the end. This is with the
> sun4u timer fix patch developed with help from Marius which has recently
> been applied to QEMU git master. So maybe this is a kernel bug after all?

No, that still is a completely trashed kernel stack as previously
seen when running under QEMU so the whole backtrace is questionable.
Apart from that, sched_clock() is called rather frequently so it is
not unlikely to show up in a kernel back trace, but neither of the
two back traces in question suggests it's the culprit (assuming that
the one seen with QEMU actually is sane).

Marius

Re: Wake on LAN broken (probably between r290542 - r290606)?

2015-11-15 Thread Marius Strobl

On Sat, Nov 14, 2015 at 09:56:36AM -0800, David Wolfskill wrote:
> On Wed, Nov 11, 2015 at 06:33:37AM -0800, David Wolfskill wrote:
> > ...
> > But a quick perusal of
> >  doesn't show
> > anything especially like a "smoking gun" -- to me, anyway.
> > 
> > Can anyone else confirm or refute my observations?  Or suggest a
> > hint?  I'll try narrowing it down myself, but I need to do it during
> > times I'm at home (so I can manually power the machine back up when
> > it fails to respond to WoL), so it may be a few days before I can
> > accomplish much that way.
> > 
> 
> r290565 still works; r290566 fails -- in my case.  r290566 changed some
> re(4) behavior, and the NIC on my affected machine is an re(4):
> 
> re0@pci0:3:0:0: class=0x02 card=0x05b71028 chip=0x816810ec rev=0x0c
> hdr=0x00
> vendor = 'Realtek Semiconductor Co., Ltd.'
> device = 'RTL8111/8168/8411 PCI Express Gigabit Ethernet
> Controller'
> class  = network
> subclass   = ethernet
> 
> from "pciconf -lv" while running:
> 
> D freebeast.catwhisker.org 11.0-CURRENT FreeBSD 11.0-CURRENT #1904  
> r290565M/290565:1100089: Sat Nov 14 09:44:33 PST 2015 
> r...@freebeast.catwhisker.org:/common/S3/obj/usr/src/sys/GENERIC  amd64
> 
> I've placed a copy of a verbose dmes.boot in
> .
> 
> I'm happy to test suggested changes.
> 

*sigh* Okay, could you please test whether the attached patch restores
WOL capability for you?

Marius

Index: if_re.c
===
--- if_re.c	(revision 290566)
+++ if_re.c	(working copy)
@@ -3851,6 +3852,11 @@ re_setwol(struct rl_softc *sc)
 			CSR_READ_1(sc, RL_GPIO) & ~0x01);
 	}
 	if ((ifp->if_capenable & IFCAP_WOL) != 0) {
+		if ((sc->rl_flags & RL_FLAG_8168G_PLUS) != 0) {
+			/* Disable RXDV gate. */
+			CSR_WRITE_4(sc, RL_MISC, CSR_READ_4(sc, RL_MISC) &
+			~0x0008);
+		}
 		re_set_rxmode(sc);
 		if ((sc->rl_flags & RL_FLAG_WOL_MANLINK) != 0)
 			re_set_linkspeed(sc);

Re: HEADS UP: sparc64 backend for llvm/clang imported

2015-08-26 Thread Marius Strobl

On Wed, Aug 19, 2015 at 04:19:03PM -0400, Kurt Lidl wrote:
> > Dimitry Andric wrote this message on Fri, Feb 28, 2014 at 20:22 +0100:
> > In r262613 I have merged the clang-sparc64 branch back to head.  This
> > imports an updated sparc64 backend for llvm and clang, allowing clang to
> > bootstrap itself on sparc64, and to completely build world.  To be able
> > to build the GENERIC kernel, there is still one patch to be finalized,
> > see below.
> >
> > If you have any sparc64 hardware, and are not afraid to encounter rough
> > edges, please try out building and running your system with clang.  To
> > do so, update to at least r262613, and enable the following options in
> > e.g. src.conf, or in your build environment:
> >
> > WITH_CLANG=y
> > WITH_CLANG_IS_CC=y
> > WITH_LIBCPLUSPLUS=y  (optional)
> >
> > Alternatively, if you would rather keep gcc as /usr/bin/cc for the
> > moment, build world using just WITH_CLANG, enabling clang to be built
> > (by gcc) and installed.  After installworld, you can then set CC=clang,
> > CXX=clang++ and CPP=clang-cpp for building another world.
> >
> > For building the sparc64 kernel, there is one open issue left, which is
> > that sys/sparc64/include/pcpu.h uses global register variables, and this
> > is not supported by clang.  A preliminary patch for this is attached,
> > but it may or may not blow up your system, please beware!
> >
> > The patch changes the pcpu and curpcb global register variables into
> > inline functions, similar to what is done on other architectures.
> > However, the current approach is not optimal, and the emitted code is
> > slightly different from what gcc outputs.  Any improvements to this
> > patch are greatly appreciated!
> >
> > Last but not least, thanks go out to Roman Divacky for his work with
> > llvm/clang upstream in getting the sparc64 backend into shape.
> >
> > Ok, I have a new pcpu patch to try.  I have only compile tested it.
> >
> > It is available here:
> > https://www.funkthat.com/~jmg/sparc64.pcpu.patch
> >
> > I've also attached it.
> >
> > Craig, do you mind testing it?
> >
> > This patch also removes curpcb as it appears to not be used by any
> > sparc64 C code.  A GENERIC kernel compiles fine, and fxr only turns up
> > curpcb used in machdep code, and no references to it under sparc64.
> >
> > This is not a proper solution in that
> > it can mean counters/stats can be copied/moved to other cpus overwriting
> > the previous values if a race happens...  We use
> > PCPU_SET(mem, PCPU_GET(mem) + val) for PCPU_ADD, not great, but it's
> > no worse than what we were previously using..
> >
> > Until we get a proper fix which involves mapping all the cpu's PCPU
> > data on all CPUs, this will have to suffice..
> >
> > This patch is based upon, I believe, a patch from Marius and possibly
> > modified by rdivacky.
> >
> > Thanks for testing..
> >
> The above message was posted a while ago, and I decided that I would
> give the patch a test run on a spare sparc that I have, now that the
> instability problem with multiprocessor sparc64 machines has been
> resolved.
>
> So, I have an up-to-date stable/10 V240 (2x1.5Ghz cpus, 8GB of memory),
> running a completely stock r286861.  That all seems to work just fine.
>
> I applied the patch referenced in the email:
>
> https://www.funkthat.com/~jmg/sparc64.pcpu.patch
>
> (it applied cleanly), and then rebuilt the kernel on the machine,
> using the stock gcc 4.2.1 compiler.
>
> When rebooting with that kernel, the machine panics pretty early
> in the boot:
>
> FreeBSD 10.2-STABLE #3 r286861M: Wed Aug 19 14:28:45 EDT 2015
>  l...@spork.pix.net:/usr/obj/usr/src/sys/GENERIC sparc64
> gcc version 4.2.1 20070831 patched [FreeBSD]
> real memory  = 8589934592 (8192 MB)
> avail memory = 8379719680 (7991 MB)
> cpu0: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
> cpu1: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
> FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
> random device not loaded; using insecure entropy
> panic: trap: illegal instruction (kernel)
> cpuid = 0
> KDB: stack backtrace:
> #0 0xc05750e0 at panic+0x20
> #1 0xc08db9f8 at trap+0x558
> Uptime: 1s
> Automatic reboot in 15 seconds - press a key on the console to abort
> Rebooting...
> timeout shutting down CPUs.
>
> So, the patch to get rid of the pcpu usage (as a prereq to poking
> at the clang compiler issues) does not work properly.
>

As I pointed out when that patch was posted, the approach taken by
it assumes a CPU can access foreign PCPU data, which currently isn't
true on sparc64. So the patch is at least incomplete but also may
have further issues.

Such a patch is no longer a prerequisite for compiling a sparc64
kernel with clang, though, as clang meanwhile has been told to
grok at least the global registers used by the PCPU code.

Besides some default options like the choice of code model not
being appropriate for FreeBSD, clang-compiled loader and kernel
don't work due to two major problems present in clang up to at
least version 3.6.0: a) it uses a different stack layout than
GCC so any unwinding code fails and b) it produces broken

Re: [Call for testers] DRM device-independent code update to Linux 3.8

2015-02-18 Thread Marius Strobl

On Wed, Feb 18, 2015 at 12:45:36AM +0100, Jean-Sébastien Pédron wrote:
> Hi!
>
> An update to the DRM subsystem, not including the drivers, is ready for
> wider testing!
>
> The patch against HEAD is here:
> https://people.freebsd.org/~dumbbell/graphics/drm-update-38.f.patch
>

Have you looked into using a MTX_SPIN lock where Linux actually
employs a DRM_SPINTYPE one? That should allow using a filter
instead of an ithread handler, solving a great number of problems
with pre-loading of DRM drivers and allowing them to be statically
compiled into the kernel as - unlike ithreads - filters work right
from the moment they are set up during attach. In turn, that
would make the lack of a VESA driver for vt(4) less painful and
likely even forgivable, as resolutions higher than VGA could be
used way earlier, etc.

Marius

Re: r276200: EFI boot failure: kernel stops booting at pci0: ACPI PCI bus on pcib0

2014-12-29 Thread Marius Strobl

On Mon, Dec 29, 2014 at 05:55:28PM +0100, Roger Pau Monné wrote:
> On 29/12/14 at 12:41, Roger Pau Monné wrote:
> > Hello,
> >
> > Sorry for not noticing this earlier, I've been without a computer for
> > some days. Do you get a panic message, or the system just freezes?
> >
> > Can you please post the full boot output with boot_verbose enabled?
>
> I'm not able to reproduce the problem with Qemu and OVMF, and I don't
> have any box right now that uses UEFI.
>
> I'm guessing that this is due to some memory reservation conflict, so
> I'm attaching a patch that should help diagnose it.

You'll probably want to nuke RF_ACTIVE so the resources are marked
as taken but, in the case of vt_efifb(4), the memory isn't mapped twice.
I don't know whether the latter actually is a problem for x86,
though; it'll likely at least replace the VM_MEMATTR_WRITE_COMBINING
mapping done in vt_efifb_remap(). Removing RF_ACTIVE in turn might
not be sufficient for the Xen bits to mark the resource as reserved;
this should be fixed in the FreeBSD/Xen code then, however.
Also end = size - 1, see the attached patch.

Marius

Index: dev/vt/hw/efifb/efifb.c
===
--- dev/vt/hw/efifb/efifb.c	(revision 276343)
+++ dev/vt/hw/efifb/efifb.c	(working copy)
@@ -211,8 +211,8 @@
 	res_id = 0;
 	pseudo_phys_res = bus_alloc_resource(dev, SYS_RES_MEMORY,
 	res_id, local_info.fb_pbase,
-	local_info.fb_pbase + local_info.fb_size,
-	local_info.fb_size, RF_ACTIVE);
+	local_info.fb_pbase + local_info.fb_size - 1,
+	local_info.fb_size, 0);
 	if (pseudo_phys_res == NULL)
 		panic("Unable to reserve vt_efifb memory");
 	return (0);
Index: dev/vt/hw/vga/vt_vga.c
===
--- dev/vt/hw/vga/vt_vga.c	(revision 276343)
+++ dev/vt/hw/vga/vt_vga.c	(working copy)
@@ -1275,8 +1275,8 @@
 
 	res_id = 0;
 	pseudo_phys_res = bus_alloc_resource(dev, SYS_RES_MEMORY,
-	res_id, VGA_MEM_BASE, VGA_MEM_BASE + VGA_MEM_SIZE,
-	VGA_MEM_SIZE, RF_ACTIVE);
+	res_id, VGA_MEM_BASE, VGA_MEM_BASE + VGA_MEM_SIZE - 1,
+	VGA_MEM_SIZE, 0);
 	if (pseudo_phys_res == NULL)
 		panic("Unable to reserve vt_vga memory");
 	return (0);

Re: r276200: EFI boot failure: kernel stops booting at pci0: ACPI PCI bus on pcib0

2014-12-29 Thread Marius Strobl

On Mon, Dec 29, 2014 at 08:12:42PM +0100, Marius Strobl wrote:
> On Mon, Dec 29, 2014 at 05:55:28PM +0100, Roger Pau Monné wrote:
> > On 29/12/14 at 12:41, Roger Pau Monné wrote:
> > > Hello,
> > >
> > > Sorry for not noticing this earlier, I've been without a computer for
> > > some days. Do you get a panic message, or the system just freezes?
> > >
> > > Can you please post the full boot output with boot_verbose enabled?
> >
> > I'm not able to reproduce the problem with Qemu and OVMF, and I don't
> > have any box right now that uses UEFI.
> >
> > I'm guessing that this is due to some memory reservation conflict, so
> > I'm attaching a patch that should help diagnose it.
>
> You'll probably want to nuke RF_ACTIVE so the resources are marked
> as taken but, in the case of vt_efifb(4), the memory isn't mapped twice.
> I don't know whether the latter actually is a problem for x86,
> though; it'll likely at least replace the VM_MEMATTR_WRITE_COMBINING
> mapping done in vt_efifb_remap(). Removing RF_ACTIVE in turn might
> not be sufficient for the Xen bits to mark the resource as reserved;
> this should be fixed in the FreeBSD/Xen code then, however.
> Also end = size - 1, see the attached patch.

Err, end = start + size - 1 that is.
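
To spell out the arithmetic, here is a small C sketch of the inclusive
range convention. The helper names are made up for this example and are
not part of the bus_alloc_resource(9) API; the numbers used to exercise
them are arbitrary.

```c
#include <assert.h>
#include <stdint.h>

/* Last address of a range of `size` addresses beginning at `start`. */
uint64_t
range_end(uint64_t start, uint64_t size)
{
	assert(size > 0);
	return (start + size - 1);
}

/* Number of addresses covered by the inclusive range [start, end]. */
uint64_t
range_count(uint64_t start, uint64_t end)
{
	assert(end >= start);
	return (end - start + 1);
}
```

With start 0x1000 and size 0x1000, range_end() gives 0x1fff; passing
start + size (0x2000) as the end instead would describe 0x1001
addresses, one more than intended.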

Marius
 

Re: panic on sparc64 running 10-beta4

2013-12-27 Thread Marius Strobl

On Sun, Dec 08, 2013 at 02:50:23PM +0100, Marius Strobl wrote:
> On Wed, Dec 04, 2013 at 11:01:30AM -0500, Kurt Lidl wrote:
> > I installed a sparc V120 (4GB memory, dual 72GB disks) with the 10-beta4
> > install image today.
> >
> > Installation went fine.  I rebooted the machine, and then went to get
> > a fresh ports tree, and the machine panic'd:
> >
> > root@host:/usr/ports # portsnap fetch
> > Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
> > Fetching public key from your-org.portsnap.freebsd.org... done.
> > Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
> > Fetching snapshot metadata... done.
> > Fetching snapshot generated at Tue Dec  3 19:06:18 EST 2013:
> > 43b6803c6d94efd5b2e2bc9df0b66a84b75417fa3c1728 100% of   69 MB 3225 kBps 00m22s
> > Extracting snapshot... done.
> > Verifying snapshot integrity... panic: trap: illegal instruction (kernel)
> > cpuid = 0
> > KDB: stack backtrace:
> > #0 0xc08836d4 at trap+0x554
> > Uptime: 6m59s
> > Dumping 4096 MB (4 chunks)
> >   chunk at 0: 1073741824 bytes ... ok
> >   chunk at 0x4000: 1073741824 bytes ... ok
> >   chunk at 0x8000: 1073741824 bytes ... ok
> >   chunk at 0xc000: 1073741824 bytes ... ok
> >
> > Dump complete
> > Automatic reboot in 15 seconds - press a key on the console to abort
> > Rebooting...
> >
> > And then it panic'd again when attempting to run 'savecore'!
> > (I typed a ctrl-t after it printed out the line about
> > writing the core file, that's where the load: 0.72 ... line
> > came from...)
>
> Hrm, I don't seem to be able to reproduce this with an installation
> built from sources and also can't remember a commit between BETA3 and
> BETA4 which should be able to cause this. I currently can't test the
> 10-BETA4 install image, though. Was the machine in question running
> FreeBSD before, i.e. is it known good hardware? Did savecore eventually
> succeed on writing out a dump?
>

FYI, I tried again with a machine installed from the 10.0-RC3 binary
image and couldn't reproduce that problem either.

Marius

Re: Request for testing an alternate branch

2013-12-11 Thread Marius Strobl

On Sun, Dec 08, 2013 at 03:48:54PM -0800, Justin Hibbits wrote:
> On Sun, 8 Dec 2013 14:38:53 +0100
> Marius Strobl <mar...@alchemy.franken.de> wrote:
>
> > On Wed, Dec 04, 2013 at 10:21:13PM -0800, Justin Hibbits wrote:
> > > I've been working on the projects/pmac_pmu branch for some time now
> > > to add suspend/resume as well as CPU speed change for certain
> > > PowerPC machines, about a year since I created the branch, and now
> > > it's stable enough that I want to merge it into HEAD, hence this
> > > request. However, it does touch several drivers, turning them into
> > > early drivers, such that they can be initialized, and suspended
> > > and resumed at a different time.  Saying that, I do need testing
> > > from other architectures, to make sure I haven't broken anything.
> > >
> > > The technical details:
> > >
> > > To get proper ordering, I've extended the bus_generic_suspend() and
> > > bus_generic_resume() to do multiple passes.  Devices which cannot be
> > > enabled or disabled at the current pass level would return an
> > > EAGAIN. This could possibly cause problems, since it's an addition
> > > to an existing API rather than a new API to run along side it, so
> > > it needs a great deal of testing.  It works fine on PowerPC, but I
> > > don't have any i386/amd64 or sparc64 hardware to test it on, so
> > > would like others who do to test it.  I don't think that it would
> > > impact x86 at all (testing is obviously required), because the
> > > nexus is not an EARLY_DRIVER_MODULE, so all devices would be
> > > handled at the same pass.  But, I do know the sparc64 has an
> > > EARLY_DRIVER_MODULE() nexus, so that will likely be impacted.
> > >
> > > Also, any comments are of course welcome.  Technical concerns are
> > > obviously welcome, and I will try to address everything.
> >
> > Do you have a patch against head?
> >
> > Marius
> >
>
> Here you go.
>

Thanks; on a sparc64 machine where the EARLY_DRIVER_MODULE nexus actually
matters, your patch doesn't seem to have an ill effect.
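
For readers unfamiliar with the multi-pass idea described in the quoted
text, here is a rough user-space sketch of it. It is not the code from the
projects/pmac_pmu branch; the toy_* names and the two pass levels are made
up for illustration. A device that cannot be suspended at the current pass
returns EAGAIN and is simply asked again on a later pass.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical pass levels: early drivers attach first and suspend last. */
enum { PASS_EARLY = 0, PASS_DEFAULT = 1, PASS_MAX = PASS_DEFAULT };

struct toy_dev {
	const char *name;
	int pass;	/* pass level the driver was registered at */
	int suspended;	/* non-zero once the device has been suspended */
};

/* Suspend one device, or return EAGAIN if it is not its turn yet. */
static int
toy_suspend(struct toy_dev *d, int cur_pass)
{
	if (d->suspended)
		return (0);
	if (d->pass < cur_pass)
		return (EAGAIN);	/* early driver: defer to a later pass */
	d->suspended = 1;
	return (0);
}

/*
 * Walk the passes from the latest to the earliest.  EAGAIN is not treated
 * as an error, it only means "ask again on a later pass".  The order in
 * which devices were actually suspended is recorded in order[].
 */
static int
toy_suspend_all(struct toy_dev *devs, size_t n, const char **order)
{
	size_t i, done = 0;
	int error, pass, was;

	for (pass = PASS_MAX; pass >= PASS_EARLY; pass--) {
		for (i = 0; i < n; i++) {
			was = devs[i].suspended;
			error = toy_suspend(&devs[i], pass);
			if (error == EAGAIN)
				continue;
			if (error != 0)
				return (error);
			if (!was)
				order[done++] = devs[i].name;
		}
	}
	return (done == n ? 0 : EBUSY);
}
```

With a nexus registered at the early pass, ordinary devices are suspended
on the first walk and the nexus only on the second one, which is the
ordering an EARLY_DRIVER_MODULE() nexus would be exposed to.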

Marius

Re: Request for testing an alternate branch

2013-12-08 Thread Marius Strobl

On Wed, Dec 04, 2013 at 10:21:13PM -0800, Justin Hibbits wrote:
 I've been working on the projects/pmac_pmu branch for some time now to
 add suspend/resume as well as CPU speed change for certain PowerPC
 machines, about a year since I created the branch, and now it's stable
 enough that I want to merge it into HEAD, hence this request. However,
 it does touch several drivers, turning them into early drivers, such
 that they can be initialized, and suspended and resumed at a different
 time.  Saying that, I do need testing from other architectures, to make
 sure I haven't broken anything.
 
 The technical details:
 
 To get proper ordering, I've extended the bus_generic_suspend() and
 bus_generic_resume() to do multiple passes.  Devices which cannot be
 enabled or disabled at the current pass level would return an EAGAIN.
 This could possibly cause problems, since it's an addition to an
 existing API rather than a new API to run along side it, so it needs a
 great deal of testing.  It works fine on PowerPC, but I don't have any
 i386/amd64 or sparc64 hardware to test it on, so would like others who
 do to test it.  I don't think that it would impact x86 at all (testing
 is obviously required), because the nexus is not an EARLY_DRIVER_MODULE,
 so all devices would be handled at the same pass.  But, I do know the
 sparc64 has an EARLY_DRIVER_MODULE() nexus, so that will likely be
 impacted.
 
 Also, any comments are of course welcome.  Technical concerns are
 obviously welcome, and I will try to address everything.

Do you have a patch against head?

Marius

Re: panic on sparc64 running 10-beta4

2013-12-08 Thread Marius Strobl

On Wed, Dec 04, 2013 at 11:01:30AM -0500, Kurt Lidl wrote:
 I installed a sparc V120 (4GB memory, dual 72GB disks) with the 10-beta4
 install image today.
 
 Installation went fine.  I rebooted the machine, and then went to get
 a fresh ports tree, and the machine panic'd:
 
 root@host:/usr/ports # portsnap fetch
 Looking up portsnap.FreeBSD.org mirrors... 7 mirrors found.
 Fetching public key from your-org.portsnap.freebsd.org... done.
 Fetching snapshot tag from your-org.portsnap.freebsd.org... done.
 Fetching snapshot metadata... done.
 Fetching snapshot generated at Tue Dec  3 19:06:18 EST 2013:
 43b6803c6d94efd5b2e2bc9df0b66a84b75417fa3c1728100% of   69 MB 3225 kBps 
 00m22s
 Extracting snapshot... done.
 Verifying snapshot integrity... panic: trap: illegal instruction (kernel)
 cpuid = 0
 KDB: stack backtrace:
 #0 0xc08836d4 at trap+0x554
 Uptime: 6m59s
 Dumping 4096 MB (4 chunks)
chunk at 0: 1073741824 bytes ... ok
chunk at 0x4000: 1073741824 bytes ... ok
chunk at 0x8000: 1073741824 bytes ... ok
chunk at 0xc000: 1073741824 bytes ... ok
 
 Dump complete
 Automatic reboot in 15 seconds - press a key on the console to abort
 Rebooting...
 
 And then it panic'd again when attempting to run 'savecore'!
 (I typed a ctrl-t after it printed out the line about
 writing the core file, that's where the load: 0.72 ... line
 came from...)

Hrm, I don't seem to be able to reproduce this with an installation
built from sources and also can't remember a commit between BETA3 and
BETA4 which should be able to cause this. I currently can't test the
10-BETA4 install image, though. Was the machine in question running
FreeBSD before, i.e. is it known good hardware? Did savecore eventually
succeed in writing out a dump?

Marius

Re: newcons comming

2013-10-27 Thread Marius Strobl

On Fri, Oct 25, 2013 at 03:18:47PM +0300, Aleksandr Rybalko wrote:
 Hello fellow hackers!
 
 I finally reach the point when I can work with newcons instead of
 syscons on my laptop. Yes, I know it still buggy and have a lot of
 style(9) problems. But we really have to get it into HEAD and 10.0 to
 enable shiny new Xorg features, drivers, etc.
 
 So I ask everyone to look hard into that[1] and tell me your opinion.
 I expect a lot of opinions, since it have to affect almost all good
 guys, as result I have to ask to split bug reports into two parts:
 1. Should be done before merge to 10.0;
 2. Can be done later.
 
 If it possible, please do it(review - report) ASAP.

Could you please port at least either creator(4) or machfb(4) to newcons
before it even hits head so we don't have the same situation as with
syscons again where we need to make square pegs fit into round holes? My
main concerns in this regard are:
o Making these drivers work as low-level console in the syscons sense so
  they already work for printing the Copyright notice of the kernel. The
  problem here is that the respective chips don't necessarily come up with
  the frame buffer mapped and we can't do that on our own at that point with
  the VM not up, yet. So all access has to be done via bus_space_*(9) and
  specially crafted bus tags and handles. In short: Except for some specific
  model and firmware combinations, in general the generic OFW frame buffer
  approach doesn't work here, that's why these drivers exist in the first
  place.
o For coexistence of f. e. machfb(4) with ofwfb.c, allow some probing of
  drivers in the BUS_PROBE_GENERIC/BUS_PROBE_DEFAULT etc. manner. The
  crucial point here is that in case a more specific driver is willing
  to attach to a certain device, a generic driver must not touch the
  hardware in any way. It seems that vd_priority is too late in the game
  for that requirement. With syscons, this is achievable by letting the
  generic driver call vid_configure(VIO_PROBE_ONLY) and then check whether
  another driver has taken the device.
o Using hardware acceleration for drawing characters and the mouse pointer,
  i. e. using a hardware cursor. Employing the respective chips as dumb
  frame buffers instead is just dog slow. Currently, I don't see how a
  hardware cursor could be hooked up to newcons. The current putc code in
  these drivers _might_ be suitable for implementing bitbltchr methods.
  Apart from that these chips also can do simple bitblt etc. of course.
o Using the 12 x 22 gallant font.
o Allowing Xorg to map the frame buffer but additionally also other register
  banks as needed through newcons. With syscons, a driver can provide a
  mmap method for that (see machfb(4)). I currently don't see how to do that
  with the newcons infrastructure. An alternative might be to make Xorg/
  libpciaccess aware of newcons and go through a /dev/fbX in that case.
  Still, I don't see how to currently do that for resources besides the
  actual frame buffer with existing fbd.c. I'm also not sure whether the
  latter is the appropriate route to go in the first place given that
  besides mmap'ing from userland, newcons'ified creator(4) and machfb(4)
  still should be used directly.
  In any case, for creator(4) Xorg expects a /dev/fbX anyway.
o Allowing late attachment in case the primary console is the serial one,
  another graphics chip etc. during regular device attachment when everything
  needed (mainly the VM) to bring the frame buffer fully online on our own
  is available. Is that what vt_allocate() is for?
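
To illustrate the probing requirement from the second item, here is a
small stand-alone sketch. The two BUS_PROBE_* values are the ones from
<sys/bus.h>; everything prefixed toy_ and the chip IDs are invented for
the example and are not part of newcons or syscons. The point is only
that probe routines return a priority without touching the hardware, and
that the generic driver attaches solely when no more specific driver
claimed the device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Priorities as in <sys/bus.h>; a value closer to zero is a better match. */
#define BUS_PROBE_DEFAULT	(-20)
#define BUS_PROBE_GENERIC	(-100)

/* Hypothetical chip identifiers, for illustration only. */
#define TOY_CHIP_MACH64	1
#define TOY_CHIP_OTHER	2

struct toy_driver {
	const char *name;
	int (*probe)(int chip);	/* must not touch the hardware */
	int attached;		/* set only for the winning driver */
};

/* Chip-specific driver: only claims the chip it knows. */
static int
toy_machfb_probe(int chip)
{
	return (chip == TOY_CHIP_MACH64 ? BUS_PROBE_DEFAULT : ENXIO);
}

/* Generic firmware frame buffer driver: willing to take anything. */
static int
toy_ofwfb_probe(int chip)
{
	(void)chip;
	return (BUS_PROBE_GENERIC);
}

/* Pick the best bidder; positive return values mean "no match". */
static struct toy_driver *
toy_pick_driver(struct toy_driver *drv, size_t n, int chip)
{
	struct toy_driver *best = NULL;
	int best_pri = 0, pri;
	size_t i;

	for (i = 0; i < n; i++) {
		pri = drv[i].probe(chip);
		if (pri > 0)
			continue;
		if (best == NULL || pri > best_pri) {
			best = &drv[i];
			best_pri = pri;
		}
	}
	if (best != NULL)
		best->attached = 1;	/* only the winner touches the device */
	return (best);
}
```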

Marius

Re: [RFC][CFT] GEOM direct dispatch and fine-grained CAM locking

2013-09-16 Thread Marius Strobl

On Tue, Sep 03, 2013 at 11:48:38PM +0200, Olivier Cochard-Labbé wrote:
 On Tue, Sep 3, 2013 at 8:10 PM, Outback Dingo outbackdi...@gmail.com wrote:
  Can anyone confirm how well tested/stable this patch set might be?? if
  theres positive input i have a zoo of dev machines i could load it on, to
  help further it.
  Just checking to see how widely its been tested,
 
 I've installed this patch on 3 differents machines there status after
 about 12hours:
 - SUN FIRE X4170 M2 (amd64: r255178) with 6 SAS harddrives in one big
 zraid (LSI MegaSAS Gen2 controller): Used for generating package with
 poudriere? no probleme since;
 - HAL/Fujitsu SPARC64-V (sparc64: r255178) with two SCSI-3 disks in
 gmirror: Used for generating package with poudriere too? no probleme
 since;

For testing GEOM direct dispatch on sparc64, please additionally use
the following patch:
http://people.freebsd.org/~marius/sparc64_GET_STACK_USAGE.diff

Marius

Re: CFT: PCI Command Register fixups

2013-08-10 Thread Marius Strobl

On Fri, Aug 09, 2013 at 09:56:48PM -0600, Scott Long wrote:
 All,
 
 Subversion rev 250418 affected approximately 63 drivers by making them 
 vulnerable to resource allocation failures on motherboards with buggy BIOSes. 
  The revision itself is good, but it needs to address these drivers and bring 
 them up to what is, in effect, a modified way for drivers to manage their PCI 
 resources.  If you've been seeing something like the following message since 
 June 24/27, then you need this patch:
 
 mps0: LSI SAS2116 port 0xd000-0xd0ff mem 0xfb79c000-0xfb79 irq 19 at 
 device 0.0 on pci4
 mps0: PCI memory window not available
 device_attach: mps0 attach returned 6
 
 The patch originated from John Baldwin, I merely fixed up a few nits and am 
 passing it around for review and testing.  Please find it here:
 
 http://people.freebsd.org/~scottl/pci_command_fixes.patch
 

In mpt_pci.c, there's a style nit/inconsistency regarding the other
drivers touched by the above patch; if after these fixes, a driver
still fiddles with PCIR_COMMAND, it should be just fine to also OR
in PCIM_CMD_BUSMASTEREN as part of that and to not additionally call
pci_enable_busmaster().
Apart from that, the patch looks good to me.
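
What I mean by ORing in the bit is roughly the following. This is a
user-space mock, not the mpt(4) code: the register offset and bit values
are the ones from <dev/pci/pcireg.h>, while struct toy_pci_dev and the
toy_* accessors merely stand in for pci_read_config(9) and
pci_write_config(9).

```c
#include <assert.h>
#include <stdint.h>

/* Offset and bits as defined in <dev/pci/pcireg.h>. */
#define PCIR_COMMAND		0x04
#define PCIM_CMD_PORTEN		0x0001
#define PCIM_CMD_MEMEN		0x0002
#define PCIM_CMD_BUSMASTEREN	0x0004

/* Mock configuration space; counts writes to show there is only one. */
struct toy_pci_dev {
	uint8_t cfg[256];
	unsigned writes;
};

/* Read a 16-bit little-endian value from the mock configuration space. */
static uint16_t
toy_read_config16(const struct toy_pci_dev *d, int reg)
{
	return ((uint16_t)(d->cfg[reg] | (d->cfg[reg + 1] << 8)));
}

/* Write a 16-bit little-endian value to the mock configuration space. */
static void
toy_write_config16(struct toy_pci_dev *d, int reg, uint16_t val)
{
	d->cfg[reg] = (uint8_t)(val & 0xff);
	d->cfg[reg + 1] = (uint8_t)(val >> 8);
	d->writes++;
}

/*
 * A driver that modifies PCIR_COMMAND anyway can set the bus master bit
 * in the same read-modify-write cycle instead of additionally calling
 * pci_enable_busmaster().
 */
static void
toy_enable_mem_and_busmaster(struct toy_pci_dev *d)
{
	uint16_t cmd;

	cmd = toy_read_config16(d, PCIR_COMMAND);
	cmd |= PCIM_CMD_MEMEN | PCIM_CMD_BUSMASTEREN;
	toy_write_config16(d, PCIR_COMMAND, cmd);
}
```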

Marius

Re: [RFC] USB keyboard and devd.conf

2013-02-13 Thread Marius Strobl

On Wed, Feb 13, 2013 at 09:23:25AM +0100, Hans Petter Selasky wrote:
 On Tuesday 12 February 2013 15:51:01 Marius Strobl wrote:
  On Mon, Feb 11, 2013 at 01:43:29PM +0100, Hans Petter Selasky wrote:
   Hi,
   
   Does anyone need these lines in /etc/devd.conf ?
   
   === etc/devd.conf
   ==
   --- etc/devd.conf (revision 246620)
   +++ etc/devd.conf (local)
   @@ -105,16 +105,6 @@
   
    #action sleep 2  /usr/sbin/ath3kfw -d $device-name -f
   /usr/local/etc/ath3k-1.fw;
    #};
    
   -# When a USB keyboard arrives, attach it as the console keyboard.
   -attach 100 {
   - device-name ukbd0;
   - action /etc/rc.d/syscons setkeyboard /dev/ukbd0;
   -};
   -detach 100 {
   - device-name ukbd0;
   - action /etc/rc.d/syscons setkeyboard /dev/kbd0;
   -};
   -
    notify 100 {
      match system DEVFS;
      match subsystem CDEV;
   
   I plan to remove the lines marked with minus, because we now have kbdmux.
  
  Do these entries have negative impact on systems using kbdmux(4)?
  Will their lack have impact on systems not using kbdmux(4)? I typically
  remove or at least disable the latter on machines without atkbd(4) etc.
  hardware and thus ukbd(4) is the only keyboard driver ever used there.
  
 
 Hi,
 
 I suspect a system without kbdmux will still need these. However, these lines 
 are not correct with regard to multiple USB keyboards.
 

Yes, but do these lines have ill effects for configurations with kbdmux(4)
and multiple keyboards? If not, then I'd strongly suggest keeping them for
the sake of making configurations without kbdmux(4) work out of the box.
If yes, I'd at least keep them in a commented-out form and add a note
saying that these are required without kbdmux(4).

Marius

Re: [RFC] USB keyboard and devd.conf

2013-02-12 Thread Marius Strobl

On Mon, Feb 11, 2013 at 01:43:29PM +0100, Hans Petter Selasky wrote:
 Hi,
 
 Does anyone need these lines in /etc/devd.conf ?
 
 === etc/devd.conf
 ==
 --- etc/devd.conf (revision 246620)
 +++ etc/devd.conf (local)
 @@ -105,16 +105,6 @@
  #action sleep 2  /usr/sbin/ath3kfw -d $device-name -f 
 /usr/local/etc/ath3k-1.fw;
  #};
  
 -# When a USB keyboard arrives, attach it as the console keyboard.
 -attach 100 {
 - device-name ukbd0;
 - action /etc/rc.d/syscons setkeyboard /dev/ukbd0;
 -};
 -detach 100 {
 - device-name ukbd0;
 - action /etc/rc.d/syscons setkeyboard /dev/kbd0;
 -};
 -
  notify 100 {
   match system DEVFS;
   match subsystem CDEV;
 
 
 I plan to remove the lines marked with minus, because we now have kbdmux.
 

Do these entries have negative impact on systems using kbdmux(4)?
Will their lack have impact on systems not using kbdmux(4)? I typically
remove or at least disable the latter on machines without atkbd(4) etc.
hardware and thus ukbd(4) is the only keyboard driver ever used there.

Marius

Re: Physbio changes final call for tests and reviews

2013-02-11 Thread Marius Strobl

On Sat, Feb 02, 2013 at 10:47:09PM +0100, Marius Strobl wrote:
 On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
  Hi,
  I finished the last (insignificant) missed bits in the Jeff' physbio
  work. Now I am asking for the last round of testing and review, esp. for
  the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
  controllers which drivers are changed by the patchset. Please do test
  this before the patchset is committed into HEAD !
  
  The plan is to commit the patch somewhere in two weeks from this moment.
  The patch is required for the finalizing of the unmapped I/O work for UFS
  I did in parallel, which I hope to finish shortly after the commit.
  
  Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
  
 
 First tests on sparc64 with ata(4), mpt(4) and sym(4) look good (to
 be sure I still need to test with a machine using a streaming buffer
 in addition to the IOMMU, though).

FYI, the latter case is also fine.

Marius

Re: Physbio changes final call for tests and reviews

2013-02-03 Thread Marius Strobl

On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
 Hi,
 I finished the last (insignificant) missed bits in the Jeff' physbio
 work. Now I am asking for the last round of testing and review, esp. for
 the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
 controllers which drivers are changed by the patchset. Please do test
 this before the patchset is committed into HEAD !
 
 The plan is to commit the patch somewhere in two weeks from this moment.
 The patch is required for the finalizing of the unmapped I/O work for UFS
 I did in parallel, which I hope to finish shortly after the commit.
 
 Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
 

Once you bring in said UFS changes, will the use of bus_dmamap_load_ccb(9)
be a requirement for disk controller drivers?

Marius

Re: Physbio changes final call for tests and reviews

2013-02-03 Thread Marius Strobl

On Sun, Feb 03, 2013 at 06:11:45PM +0200, Konstantin Belousov wrote:
 On Sun, Feb 03, 2013 at 04:57:18PM +0100, Marius Strobl wrote:
  On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
   Hi,
   I finished the last (insignificant) missed bits in the Jeff' physbio
   work. Now I am asking for the last round of testing and review, esp. for
   the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
   controllers which drivers are changed by the patchset. Please do test
   this before the patchset is committed into HEAD !
   
   The plan is to commit the patch somewhere in two weeks from this moment.
   The patch is required for the finalizing of the unmapped I/O work for UFS
   I did in parallel, which I hope to finish shortly after the commit.
   
   Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
   
  
  Once you bring in said UFS changes, will the use of bus_dmamap_load_ccb(9)
  be a requirement for disk controller drivers?
 
 Generally speaking, no. I plan to do the gradual migration of the drivers,
 definitely not forcing the unmapped bios down to the drivers which are
 not tested yet. In the patch, driver indicates the support for unmapped
 bios by a DISKFLAG. If flag is set, driver could receive both mapped
 and unmapped bios, and use of the bus_dmamap_load_ccb(), while formally
 is only convenience, is essentially the requirement.
 
 If driver does not set the flag, it receives the same i/o requests as
 it does now. Geom performs transient compat mapping for the unmapped
 requests on its own for such drivers. As result, driver does not need
 a change.
 
 My plan is to convert ahci(4) and then some often used high-profile drivers
 like mfi(4) and mps(4). I can also hope for isci(4) help.
 
 Everything else, IMO, could be done on the best efforts basis, when both
 developers time and testing facilities are available. Jeff wanted to do
 all driver conversion in one pass, but IMO this is unrealistic. Still, I
 started write some helpers which should provide the transient one-page
 mappings for PIO modes.

Okay

 
 You can look at some previous version of the unmapped patch at
 http://people.freebsd.org/~kib/misc/unmapped.8.patch. It only contain a
 hack for ahci(4), which should be fixed properly after physbio is committed.

Hrm, the changes to the sparc64 pmap code in the latter patch might
need some more attention as some of the functions used for copying
pages there IIRC have constraints on the alignment of source and
destination as well as on the count. Can you say something about
these properties when pmap_copy_page_offs() is called via
pmap_copy_pages()?
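
To make the concern more concrete, here is a user-space sketch of the kind
of check I have in mind. It is not the sparc64 pmap code; the 64-byte
granularity, the toy_* names and the use of memcpy(3) as a stand-in for
both copy routines are assumptions made for the example. A block copy
routine can only be used when both offsets and the count are multiples of
the block size, otherwise a slower fallback is needed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_BLOCK_SIZE	64u	/* assumed block copy granularity */
#define TOY_PAGE_SIZE	8192u	/* sparc64 base page size */

enum toy_copy_path { TOY_COPY_BLOCK, TOY_COPY_BYTES };

/* True if both offsets and the count are block-aligned. */
static int
toy_block_ok(size_t a_offs, size_t b_offs, size_t cnt)
{
	return (((a_offs | b_offs | cnt) & (TOY_BLOCK_SIZE - 1)) == 0);
}

/*
 * Copy cnt bytes between two pages at the given offsets and report which
 * path would have been taken.  Only the path selection is modeled here;
 * memcpy(3) stands in for the block copy and the bytewise fallback alike.
 */
static enum toy_copy_path
toy_copy_page_offs(const uint8_t *src, size_t a_offs, uint8_t *dst,
    size_t b_offs, size_t cnt)
{
	assert(a_offs + cnt <= TOY_PAGE_SIZE);
	assert(b_offs + cnt <= TOY_PAGE_SIZE);
	memcpy(dst + b_offs, src + a_offs, cnt);
	return (toy_block_ok(a_offs, b_offs, cnt) ?
	    TOY_COPY_BLOCK : TOY_COPY_BYTES);
}
```

If callers can pass arbitrary sub-page offsets and lengths, the fallback
path has to exist and be correct, which is the property I'm asking about.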

Marius

Re: Physbio changes final call for tests and reviews

2013-02-02 Thread Marius Strobl

On Sat, Feb 02, 2013 at 06:33:22PM +0200, Konstantin Belousov wrote:
 Hi,
 I finished the last (insignificant) missed bits in the Jeff' physbio
 work. Now I am asking for the last round of testing and review, esp. for
 the !x86 architectures. Another testing focus are the SCSI HBAs and RAID
 controllers which drivers are changed by the patchset. Please do test
 this before the patchset is committed into HEAD !
 
 The plan is to commit the patch somewhere in two weeks from this moment.
 The patch is required for the finalizing of the unmapped I/O work for UFS
 I did in parallel, which I hope to finish shortly after the commit.
 
 Patch is available at http://people.freebsd.org/~kib/misc/physbio.5.diff
 

First tests on sparc64 with ata(4), mpt(4) and sym(4) look good (to
be sure I still need to test with a machine using a streaming buffer
in addition to the IOMMU, though).
However, by accident I noticed that your patch (i.e. stock head is
fine) somehow breaks smartd of smartmontools with ata(4):
root@b1k2:/root # smartd
ata3: timeout waiting for write DRQ
The machine just hangs at this point (it's also strange that the above
message is from the PIO rather than from the DMA path).

One note: mjacob@ probably will be annoyed if you don't wrap the
changes to isp(4) in __FreeBSD_version so the same source still
compiles on older ones.
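
The usual pattern for that looks roughly like the sketch below. The
version number is a placeholder, since the real value depends on the
__FreeBSD_version bump that goes in with the commit, and
toy_isp_dma_load_name() is a made-up helper that merely reports which
branch was compiled in.

```c
#include <assert.h>
#include <string.h>

#ifdef __FreeBSD__
#include <sys/param.h>		/* provides __FreeBSD_version */
#endif

/* Placeholder only; use the value of the actual version bump. */
#define TOY_PHYSBIO_VERSION	9999999

/* Report which DMA load routine this build of the driver would use. */
static const char *
toy_isp_dma_load_name(void)
{
#if defined(__FreeBSD_version) && __FreeBSD_version >= TOY_PHYSBIO_VERSION
	return ("bus_dmamap_load_ccb");
#else
	return ("bus_dmamap_load");
#endif
}
```

This way the same source keeps compiling on branches that predate the
change, where the old code path is selected at compile time.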

Marius

Re: Fast gettimeofday(2) and static linking

2013-01-29 Thread Marius Strobl

On Mon, Jan 28, 2013 at 05:55:24PM +0200, Konstantin Belousov wrote:
 On Mon, Jan 28, 2013 at 04:45:17PM +0100, Marius Strobl wrote:
  On Fri, Jan 25, 2013 at 02:35:54PM +0200, Konstantin Belousov wrote:
   Bruce Evans reported that statically linked binaries on HEAD an stable/9
   use the syscall for gettimeofday(2) and clock_gettime(2). Apparently, this
   is due to my use of the weak reference to the __vdso* symbols in the
   libc implementations.
   
   Patch below reworks the __vdso* attributes to only make the symbols
   weak, but keep the references strong. Since I have to add a stub for
   each architecture, I would like to ask non-x86 machines owners to test
   the patch.
   
  
  Hi Konstantin,
  
  what's the appropriate way to test this?
 
 Please rebuild the world with the patch and check that gettimeofday(2) still
 works on your architecture, both for the static and dynamic binaries.
 I think that just booting multiuser is enough.

Okay, looks good on sparc64 (tested with a dynamically as well as a
statically built time(1)).
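
For anyone wanting a more direct check than time(1), something along the
lines of the following sketch could be built once dynamically and once
with -static. It is not part of the patch, and the two-second tolerance
as well as the toy_check_time() name are arbitrary choices.

```c
#define _XOPEN_SOURCE 700	/* for clock_gettime(2) and gettimeofday(2) */

#include <assert.h>
#include <stdlib.h>
#include <sys/time.h>
#include <time.h>

/*
 * Return 0 if gettimeofday(2) and clock_gettime(2) both succeed and
 * agree to within two seconds, -1 otherwise.
 */
static int
toy_check_time(void)
{
	struct timeval tv;
	struct timespec ts;

	if (gettimeofday(&tv, NULL) != 0)
		return (-1);
	if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
		return (-1);
	if (tv.tv_sec <= 0)
		return (-1);
	if (labs((long)(ts.tv_sec - tv.tv_sec)) > 2)
		return (-1);
	return (0);
}
```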

Marius

Re: Fast gettimeofday(2) and static linking

2013-01-28 Thread Marius Strobl

On Fri, Jan 25, 2013 at 02:35:54PM +0200, Konstantin Belousov wrote:
 Bruce Evans reported that statically linked binaries on HEAD an stable/9
 use the syscall for gettimeofday(2) and clock_gettime(2). Apparently, this
 is due to my use of the weak reference to the __vdso* symbols in the
 libc implementations.
 
 Patch below reworks the __vdso* attributes to only make the symbols
 weak, but keep the references strong. Since I have to add a stub for
 each architecture, I would like to ask non-x86 machines owners to test
 the patch.
 

Hi Konstantin,

what's the appropriate way to test this?

Marius

Re: [RFC/RFT] calloutng

2013-01-21 Thread Marius Strobl

On Sun, Jan 13, 2013 at 09:36:11PM +0200, Alexander Motin wrote:
 On 13.01.2013 20:09, Marius Strobl wrote:
  On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
  On 06.01.2013 17:23, Marius Strobl wrote:
  I'm not really sure what to do about that. Earlier you already said
  that sched_bind(9) also isn't an option in case if td_critnest > 1.
  To be honest, I don't really understand why using a spin lock in the
  timecounter path makes sparc64 the only problematic architecture
  for your changes. The x86 i8254_get_timecount() also uses a spin lock
  so it should be in the same boat.
 
  The problem is not in using spinlock, but in waiting for other CPU while
  spinlock is held. Other CPU may also hold spinlock and wait for
  something, causing deadlock. i8254 code uses spinlock just to atomically
  access hardware registers, so it causes no problems.
  
  Okay, but wouldn't that be a general problem then? Pretty much
  anything triggering an IPI holds smp_ipi_mtx while doing so and
  the lower level IPI stuff waits for other CPU(s), including on
  x86.
 
 The problem is general. But now it works because single smp_ipi_mtx is
 used in all cases where IPI result is waited. As soon as spinning
 happens with interrupts still enabled, there is no deadlocks. But
 problem reappears if any different lock is used, or locks are nested.

I'm having a hard time getting an alternate time counter device to
work. The crystal required for the counters in the south bridge just
doesn't seem to be mounted anywhere near it (I've not looked at the
bottom of the PCB though). While the time counter part of the on-board
bge(4) driven chips basically works, the chips don't seem to like
concurrent accesses caused by the rest of bge(4). I.e. although the
counter is just read, sooner or later this causes a fatal bus error.
I haven't tried serializing accesses to the chip, but going to such
complexity for just reading a non-indexed register at least doesn't
feel good ...
However, AFAICT the scenario you describe can't happen. On sparc64,
spinlock_enter() only raises the processor interrupt level, which
doesn't block the direct cross traps I've implemented remote reading
of (S)TICK as (which also means that the actions such traps may
perform are very limited and must occur in interrupt context, but
which are sufficient for this purpose and in turn makes them very
fast). I.e. even if the AP holds smp_ipi_mtx or any amount of
nested spin locks, this will not deadlock in case the BSP also holds
any spin lock when reading (S)TICK from it.
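
As a toy model of that argument (not the actual sparc64 code; the PIL
value and all toy_* names are illustrative assumptions): entering a spin
lock section only raises the interrupt level, ordinary interrupts at or
below that level are held off, but a direct cross trap reading the tick
counter is answered regardless.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PIL_SPIN	14	/* illustrative level used while spinning */

struct toy_cpu {
	int pil;		/* current processor interrupt level */
	int spin_depth;		/* nesting depth of spin lock sections */
	uint64_t tick;		/* stand-in for the (S)TICK counter */
};

/* Raise the interrupt level on the outermost entry only. */
static void
toy_spinlock_enter(struct toy_cpu *c)
{
	if (c->spin_depth++ == 0)
		c->pil = TOY_PIL_SPIN;
}

/* Drop the interrupt level again on the outermost exit. */
static void
toy_spinlock_exit(struct toy_cpu *c)
{
	assert(c->spin_depth > 0);
	if (--c->spin_depth == 0)
		c->pil = 0;
}

/* Ordinary interrupt: delivered only if its level is above the PIL. */
static int
toy_deliver_intr(const struct toy_cpu *c, int level)
{
	return (level > c->pil);
}

/* Direct cross trap: not subject to the PIL in this model. */
static uint64_t
toy_cross_trap_read_tick(const struct toy_cpu *c)
{
	return (c->tick);
}
```

So no matter how many spin locks the target CPU holds, the requesting CPU
never has to wait for it to leave its spin lock section.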

Marius

Re: [RFC/RFT] calloutng

2013-01-13 Thread Marius Strobl

On Tue, Jan 08, 2013 at 12:46:57PM +0200, Alexander Motin wrote:
 On 06.01.2013 17:23, Marius Strobl wrote:
  On Wed, Dec 26, 2012 at 09:24:46PM +0200, Alexander Motin wrote:
  On 26.12.2012 01:21, Marius Strobl wrote:
  On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
  Experiments with dummynet shown ineffective support for very short
  tick-based callouts. New version fixes that, allowing to get as many
  tick-based callout events as hz value permits, while still be able to
  aggregate events and generating minimum of interrupts.
 
  Also this version modifies system load average calculation to fix some
  cases existing in HEAD and 9 branches, that could be fixed with new
  direct callout functionality.
 
  http://people.freebsd.org/~mav/calloutng_12_17.patch
 
  With several important changes made last time I am going to delay commit
  to HEAD for another week to do more testing. Comments and new test cases
  are welcome. Thanks for staying tuned and commenting.
 
  FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
  try on sparc64 and it at least survives a buildworld there. However,
  with the patched kernels, buildworld times seem to increase slightly but
  reproducible by 1-2% (I only did four runs but typically buildworld
  times are rather stable and don't vary more than a minute for the
  same kernel and source here). Is this an expected trade-off (system
  time as such doesn't seem to increase)?
 
  I don't think build process uses significant number of callouts to 
  affect results directly. I think this additional time could be result of 
  the deeper next event look up, done by the new code, that is practically 
  useless for sparc64, which effectively has no cpu_idle() routine. It 
  wouldn't affect system time and wouldn't show up in any statistics 
  (except PMC or something alike) because it is executed inside timer 
  hardware interrupt handler. If my guess is right, that is a part that 
  probably still could be optimized. I'll look on it. Thanks.
 
  Is there anything specific to test?
 
  Since the most of code is MI, for sparc64 I would mostly look on related 
  MD parts (eventtimers and timecounters) to make sure they are working 
  reliably in more stressful conditions.  I still have some worries about 
  possible deadlock on hardware where IPIs are used to fetch present time 
  from other CPU.
  
  Well, I've just learnt two things the hard way:
  a) We really need the mutex in that path.
  b) Assuming that the initial synchronization of the counters is good
 enough and they won't drift considerably across the CPUs so we can
 always use the local one makes things go south pretty soon after
 boot. At least with your calloutng_12_26.patch applied.
 
 Do you think it means they are not really synchronized for some reason?

There's definitely no hardware in place which would synchronize them.
I've no idea how to properly measure the difference between two tick
counters, but I think it's rather their drift and not the software
synchronization we do when starting APs that is causing problems.
Mainly, because I can't really think of a better algorithm for doing
the latter when starting the APs. The symptoms are that about 30 to
60 seconds after that I start to see weird timeouts from device
drivers. I'd need to check how long these timeouts actually are to
see whether it could be a problem right from the start though. In
any case, it seems foolish to think that synchronizing them once
would be sufficient and they won't drift until shutdown. Linux
probably also doesn't keep re-synchronizing them without a reason.
Just using a single timecounter source simply appears to be the
better choice.
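
A back-of-the-envelope simulation of the drift problem, with made-up
numbers (a nominal 1 MHz counter per CPU and deviations of +50 and -50
ppm; none of this is measured on real hardware): after a minute the two
counters are 6000 ticks apart, so a reader migrating from one CPU to the
other can see time go backwards, while sticking to one source cannot.

```c
#include <assert.h>
#include <stdint.h>

/* Per-CPU counter with a nominal rate of one tick per microsecond. */
struct toy_tick {
	int64_t ppm;	/* assumed rate deviation in parts per million */
};

/* Counter value at true time t_us, given the assumed rate deviation. */
static int64_t
toy_read(const struct toy_tick *c, int64_t t_us)
{
	return (t_us + t_us * c->ppm / 1000000);
}

/*
 * Return 1 if a reader that samples "first" at t_us and "second" a short
 * while later observes the counter value going backwards.
 */
static int
toy_goes_backwards(const struct toy_tick *first,
    const struct toy_tick *second, int64_t t_us, int64_t later_us)
{
	return (toy_read(second, t_us + later_us) < toy_read(first, t_us));
}
```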

 
  I'm not really sure what to do about that. Earlier you already said
  that sched_bind(9) also isn't an option in case if td_critnest > 1.
  To be honest, I don't really understand why using a spin lock in the
  timecounter path makes sparc64 the only problematic architecture
  for your changes. The x86 i8254_get_timecount() also uses a spin lock
  so it should be in the same boat.
 
 The problem is not in using spinlock, but in waiting for other CPU while
 spinlock is held. Other CPU may also hold spinlock and wait for
 something, causing deadlock. i8254 code uses spinlock just to atomically
 access hardware registers, so it causes no problems.

Okay, but wouldn't that be a general problem then? Pretty much
anything triggering an IPI holds smp_ipi_mtx while doing so and
the lower level IPI stuff waits for other CPU(s), including on
x86.

 
  The affected machines are equipped with a x86-style south bridge
  which exposes a powermanagment unit (intended to be used as a SMBus
  bridge only in these machines) on the PCI bus. Actually, this device
  also includes an ACPI power management timer. However, I've just
  spent a day trying to get that one working without success - it
  just doesn't increment. Probably its clock input isn't connected as
  it's

Re: [RFC/RFT] calloutng

2013-01-06 Thread Marius Strobl

On Wed, Dec 26, 2012 at 09:24:46PM +0200, Alexander Motin wrote:
 On 26.12.2012 01:21, Marius Strobl wrote:
  On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
  Experiments with dummynet shown ineffective support for very short
  tick-based callouts. New version fixes that, allowing to get as many
  tick-based callout events as hz value permits, while still be able to
  aggregate events and generating minimum of interrupts.
 
  Also this version modifies system load average calculation to fix some
  cases existing in HEAD and 9 branches, that could be fixed with new
  direct callout functionality.
 
  http://people.freebsd.org/~mav/calloutng_12_17.patch
 
  With several important changes made last time I am going to delay commit
  to HEAD for another week to do more testing. Comments and new test cases
  are welcome. Thanks for staying tuned and commenting.
 
  FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
  try on sparc64 and it at least survives a buildworld there. However,
  with the patched kernels, buildworld times seem to increase slightly but
  reproducible by 1-2% (I only did four runs but typically buildworld
  times are rather stable and don't vary more than a minute for the
  same kernel and source here). Is this an expected trade-off (system
  time as such doesn't seem to increase)?
 
 I don't think build process uses significant number of callouts to 
 affect results directly. I think this additional time could be result of 
 the deeper next event look up, done by the new code, that is practically 
 useless for sparc64, which effectively has no cpu_idle() routine. It 
 wouldn't affect system time and wouldn't show up in any statistics 
 (except PMC or something alike) because it is executed inside timer 
 hardware interrupt handler. If my guess is right, that is a part that 
 probably still could be optimized. I'll look on it. Thanks.
 
  Is there anything specific to test?
 
 Since the most of code is MI, for sparc64 I would mostly look on related 
 MD parts (eventtimers and timecounters) to make sure they are working 
 reliably in more stressful conditions.  I still have some worries about 
 possible deadlock on hardware where IPIs are used to fetch present time 
 from other CPU.

Well, I've just learnt two things the hard way:
a) We really need the mutex in that path.
b) Assuming that the initial synchronization of the counters is good
   enough and they won't drift considerably across the CPUs so we can
   always use the local one makes things go south pretty soon after
   boot. At least with your calloutng_12_26.patch applied.

I'm not really sure what to do about that. Earlier you already said
that sched_bind(9) also isn't an option in case td_critnest > 1.
To be honest, I don't really understand why using a spin lock in the
timecounter path makes sparc64 the only problematic architecture
for your changes. The x86 i8254_get_timecount() also uses a spin lock
so it should be in the same boat.

The affected machines are equipped with a x86-style south bridge
which exposes a power management unit (intended to be used as a SMBus
bridge only in these machines) on the PCI bus. Actually, this device
also includes an ACPI power management timer. However, I've just
spent a day trying to get that one working without success - it
just doesn't increment. Probably its clock input isn't connected as
it's not intended to be used in these machines.
That south bridge also includes 8254 compatible timers on the ISA/
LPC side, but these are hidden from the OFW device tree. I can hack these
devices into existence and give them a try, but even if that works this
likely would use the same code as the x86 i8254_get_timecount() so I
don't see what would be gained with that.

The last thing I can think of in order to avoid using the tick counter
as timecounter in the MP case is that the Broadcom MACs in the affected
machines also provide a counter driven by a 1 MHz clock. If that's good
enough for a timecounter I can hook these up (in case these work ...)
and hack bge(4) to not detach in that case (given that we can't detach
timecounters ...).

 
 Here is small tool we are using for test correctness and performance of 
 different user-level APIs: http://people.freebsd.org/~mav/testsleep.c
 

I've run Ian's set of tests on a v215 with and without your
calloutng_12_26.patch and on a v210 (these use the IPI approach)
with the latter also applied.
I'm not really sure what to make out of the numbers.

               v215 w/o            v215 w/             v210 w/
---------- ------------------- ------------------- -------------------
select          1   1999.61         1     23.87         1     29.97
poll            1   1999.70         1   1069.61         1   1075.24
usleep          1   1999.86         1     23.43         1     28.99
nanosleep       1   1999.92         1     23.28         1     28.66
kqueue          1   1000.12         1   1071.13         1   1076.35

Re: [RFC/RFT] calloutng

2012-12-25 Thread Marius Strobl

On Tue, Dec 18, 2012 at 11:03:47AM +0200, Alexander Motin wrote:
 Experiments with dummynet shown ineffective support for very short 
 tick-based callouts. New version fixes that, allowing to get as many 
 tick-based callout events as hz value permits, while still be able to 
 aggregate events and generating minimum of interrupts.
 
 Also this version modifies system load average calculation to fix some 
 cases existing in HEAD and 9 branches, that could be fixed with new 
 direct callout functionality.
 
 http://people.freebsd.org/~mav/calloutng_12_17.patch
 
 With several important changes made last time I am going to delay commit 
 to HEAD for another week to do more testing. Comments and new test cases 
 are welcome. Thanks for staying tuned and commenting.

FYI, I gave both calloutng_12_15_1.patch and calloutng_12_17.patch a
try on sparc64 and it at least survives a buildworld there. However,
with the patched kernels, buildworld times seem to increase slightly but
reproducibly by 1-2% (I only did four runs but typically buildworld
times are rather stable and don't vary more than a minute for the
same kernel and source here). Is this an expected trade-off (system
time as such doesn't seem to increase)?
Is there anything specific to test?

Marius

___
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to freebsd-current-unsubscr...@freebsd.org

Re: [CFC/CFT] large changes in the loader(8) code

2012-07-16 Thread Marius Strobl

On Mon, Jul 16, 2012 at 04:00:49PM +0400, Andrey V. Elsukov wrote:
 On 16.07.2012 15:31, Andriy Gapon wrote:
  Yes. It should work as before.
  
  Well, but it's obvious that zfs_probe_dev would be attempting to do some 
  unneeded
  stuff (trying to treat partitions as disks) for that case.  To me this is a 
  clear
  indication zfs_probe_dev is not optimal for arch-independent 
  implementation.  So I
  still think that arch_zfs_probe should decide what disks and partitions to 
  probe,
  and zfs_probe_dev should only probe what it's given and not try to be any 
  smarter.
  But I've repeated myself three times already :-)
 
 And we will have the same - several copies of the same code in each 
 architecture,
 which i have deleted...
 
 Sparc doesn't support DIOCGMEDIASIZE and DIOCGSECTORSIZE ioctls,
 so it will not check each partition, only fd that is passed to the 
 zfs_probe_dev.
 
 Currently there is only one problem with ZFS tasting, that can affect users -
 now we taste each disk and partition, but in the my branch ZFS tastes only 
 disks and
 partitions with type freebsd and freebsd-zfs. So if you have created ZFS 
 on top
 of MBR partition with type ntfs, then loader will be unable to detect it.
 

Sorry, I'm missing the big picture of ZFS support in the loader and
currently unfortunately don't have the time to look into it or your
patches. I don't think there's a way to determine the media and
sector sizes without actually looking at the Sun and/or VTOC8 labels
though. As for zfs_probe_dev, some user recently indicated that
on sparc64 we should rather look at the disk devices listed in
the boot-device environment variable in order to mimic what Solaris
does rather than trying to probe anything that might be a disk device,
mimicking what the FreeBSD/i386 ZFS loader does. Maybe that's a hint
whether an arch_zfs_probe should exist.
I can test patches once you guys have figured out how things should
work though.

Marius


Re: [head tinderbox] failure on powerpc64/powerpc

2011-12-24 Thread Marius Strobl

On Sat, Dec 24, 2011 at 03:43:51PM +, FreeBSD Tinderbox wrote:
 TB --- 2011-12-24 13:54:50 - tinderbox 2.8 running on 
 freebsd-current.sentex.ca
 TB --- 2011-12-24 13:54:50 - starting HEAD tinderbox run for powerpc64/powerpc
 TB --- 2011-12-24 13:54:50 - cleaning the object tree
 TB --- 2011-12-24 13:55:13 - cvsupping the source tree
 TB --- 2011-12-24 13:55:13 - /usr/bin/csup -z -r 3 -g -L 1 -h cvsup.sentex.ca 
 /tinderbox/HEAD/powerpc64/powerpc/supfile
 TB --- 2011-12-24 13:55:26 - building world
 TB --- 2011-12-24 13:55:26 - CROSS_BUILD_TESTING=YES
 TB --- 2011-12-24 13:55:26 - MAKEOBJDIRPREFIX=/obj
 TB --- 2011-12-24 13:55:26 - PATH=/usr/bin:/usr/sbin:/bin:/sbin
 TB --- 2011-12-24 13:55:26 - SRCCONF=/dev/null
 TB --- 2011-12-24 13:55:26 - TARGET=powerpc
 TB --- 2011-12-24 13:55:26 - TARGET_ARCH=powerpc64
 TB --- 2011-12-24 13:55:26 - TZ=UTC
 TB --- 2011-12-24 13:55:26 - __MAKE_CONF=/dev/null
 TB --- 2011-12-24 13:55:26 - cd /src
 TB --- 2011-12-24 13:55:26 - /usr/bin/make -B buildworld
  World build started on Sat Dec 24 13:55:27 UTC 2011
  Rebuilding the temporary build tree
  stage 1.1: legacy release compatibility shims
  stage 1.2: bootstrap tools
  stage 2.1: cleaning up the object tree
  stage 2.2: rebuilding the object tree
  stage 2.3: build tools
  stage 3: cross tools
  stage 4.1: building includes
  stage 4.2: building libraries
  stage 4.3: make dependencies
  stage 4.4: building everything
 [...]
 rsyncfile.o:(.text+0xf8): undefined reference to `MD5Update'
 stream.o:(.text+0x544): undefined reference to `MD5Init'
 stream.o:(.text+0xb9c): undefined reference to `MD5Update'
 stream.o:(.text+0xd0c): undefined reference to `MD5Update'
 stream.o:(.text+0xd40): undefined reference to `MD5Update'
 stream.o:(.text+0xd54): undefined reference to `MD5Update'
 stream.o:(.text+0xd84): undefined reference to `MD5Update'
 stream.o:(.text+0xd98): more undefined references to `MD5Update' follow
 *** Error code 1
 

The tinderbox output isn't very helpful here and I've no idea how this
could happen as r228857 also added -lmd nor can I reproduce it. Could
this be a transient failure due to the tinderbox updating sources at
an unfortunate point in time or a glitch in the exporter (according to
the sources presented by cvsweb.freebsd.org r228857 has reached the CVS
repository just fine though)?

Marius


Re: sparc64 r228561 panic: kmem_suballoc: bad status return of 3

2011-12-16 Thread Marius Strobl

On Fri, Dec 16, 2011 at 08:40:48AM +, Anton Shterenlikht wrote:
 Updating from r216048 to r228561 on sparc64,
 with sys/conf/newvers.sh changed to REVISION=9.9.
 
 Transcribed by hand:
 
 FreeBSD 9.9-CURRENT #3 r228561M:
 
 panic: kmem_suballoc: bad status return of 3
 KDB: enter: panic
 [ thread pid 0 tid 0 ]
 Stopped at 0x02937e0:   ta%xcc,1
 db
 
 The keyboard froze, couldn't get a bt,
 required a cold reboot.
 
 My /etc/make.conf and kernel config files are below.
 
 Any advice?
 

Hrm, doesn't look like I can reproduce this. What machine model is
that and how much RAM does it have? Do you use any loader tunables?

Marius


Re: sparc64 r228561 panic: kmem_suballoc: bad status return of 3

2011-12-16 Thread Marius Strobl

On Fri, Dec 16, 2011 at 11:19:22AM +, Anton Shterenlikht wrote:
 On Fri, Dec 16, 2011 at 11:37:20AM +0100, Marius Strobl wrote:
  On Fri, Dec 16, 2011 at 08:40:48AM +, Anton Shterenlikht wrote:
   Updating from r216048 to r228561 on sparc64,
   with sys/conf/newvers.sh changed to REVISION=9.9.
   
   Transcribed by hand:
   
   FreeBSD 9.9-CURRENT #3 r228561M:
   
   panic: kmem_suballoc: bad status return of 3
   KDB: enter: panic
   [ thread pid 0 tid 0 ]
   Stopped at 0x02937e0:   ta%xcc,1
   db
   
   The keyboard froze, couldn't get a bt,
   required a cold reboot.
   
   My /etc/make.conf and kernel config files are below.
   
   Any advice?
   
  
  Hrm, doesn't look like I can reproduce this. What machine model is
  that and how much RAM does it have?
 
 From dmesg:
 
 real memory  = 2147483648 (2048 MB)
 avail memory = 2079449088 (1983 MB)
 cpu0: Sun Microsystems UltraSparc-IIIi Processor (1503.00 MHz CPU)
 
  Do you use any loader tuneables?
 
 I don't think so. You mean like /boot/loader.conf?
 I haven't got this file at all.
 

Even with a Blade 1500, which is the closest match to your machine
that I have, and a kernel built with your configuration file I can't
reproduce this using r228583. I'd suggest testing with a kernel built
using an empty object directory and without any local modifications.
If that still doesn't solve the problem, given that there isn't even
a backtrace, I can only suggest doing a binary search for the offending
commit, probably accounting especially for the changes to the VM
within the window of revisions in question.
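
The binary search suggested above could be sketched roughly as follows. This is only an illustration: first_bad_revision() and the is_bad callback are hypothetical names, and in practice each probe means checking out that revision, building a kernel from a clean object directory and booting it. It assumes the lower end of the window is known good, the upper end is known bad, and the failure persists once introduced. ex_fails_since() is merely a stand-in predicate for trying the function out.

```c
#include <assert.h>

/*
 * Hypothetical callback: returns nonzero if a kernel built from
 * revision `rev` shows the panic.
 */
typedef int (*rev_test_fn)(unsigned long rev, void *arg);

/*
 * Return the first bad revision in (good, bad], assuming `good` works,
 * `bad` fails and every revision after the culprit fails as well.
 */
unsigned long
first_bad_revision(unsigned long good, unsigned long bad,
    rev_test_fn is_bad, void *arg)
{
	assert(good < bad);
	while (bad - good > 1) {
		unsigned long mid = good + (bad - good) / 2;

		if (is_bad(mid, arg))
			bad = mid;	/* failure already present at mid */
		else
			good = mid;	/* failure introduced after mid */
	}
	return (bad);
}

/* Stand-in predicate: pretends everything from *arg onwards fails. */
int
ex_fails_since(unsigned long rev, void *arg)
{
	return (rev >= *(unsigned long *)arg);
}
```

For the window in question (r216048 to r228561, about 12500 revisions) this halving needs at most 14 probes, and restricting the candidates to revisions touching the VM would shrink that further.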

Marius


Re: Burning CDs and DVDs on SATA drive in FreeBSD 9.0

2011-12-10 Thread Marius Strobl

On Sat, Dec 10, 2011 at 02:27:22AM -0800, Thomas Mueller wrote:
 
 --- On Fri, 12/9/11, Marius Strobl mar...@alchemy.franken.de wrote:
 
  +, Thomas Mueller wrote:
    Recompile the port; the CAM ioctl numbers have changed.
    Cheers
    Michiel
  
   When did these CAM ioctl numbers change? Was it before or after I
   built and installed cdrtools?
  
   Running ls -rtl /var/db/pkg/cdrtools-3.00_1 produces
  
   total 48
   -rw-r--r--  1 root  wheel  17550 Sep 26 09:20 +MTREE_DIRS
   -rw-r--r--  1 root  wheel    470 Sep 26 09:20 +DISPLAY
   -rw-r--r--  1 root  wheel   1009 Sep 26 09:20 +DESC
   -rw-r--r--  1 root  wheel  11102 Sep 26 09:20 +CONTENTS
   -rw-r--r--  1 root  wheel     63 Sep 26 09:20 +COMMENT
   -rw-r--r--  1 root  wheel     17 Dec  7 15:44 +REQUIRED_BY
  
   So it might have been on FreeBSD 9.0-BETA2.
  
 
  I'm not sure what CAM IOCTL number change others are referring to but
  you certainly need to rebuild libcam consumers after r225950, which
  was merged to stable/9 in r226067 on October 6 2011.
 
  Marius
  
 Thanks for response.  I'm at the older computer now, but will need to check 
 /usr/src/UPDATING, and portupgrade or portmaster cdrtools after 
 source-upgrading FreeBSD 9.0-RC2 to RC3. 
 

There's no corresponding entry in UPDATING.

Marius


Re: Burning CDs and DVDs on SATA drive in FreeBSD 9.0

2011-12-09 Thread Marius Strobl

On Fri, Dec 09, 2011 at 09:23:58AM +, Thomas Mueller wrote:
  Recompile the port; the CAM ioctl numbers have changed.
 
  Cheers
  Michiel
 
 When did these CAM ioctl numbers change?  Was it before or after I built and 
 installed cdrtools?
 
 Running ls -rtl /var/db/pkg/cdrtools-3.00_1 produces
 
 
 total 48
 -rw-r--r--  1 root  wheel  17550 Sep 26 09:20 +MTREE_DIRS
 -rw-r--r--  1 root  wheel    470 Sep 26 09:20 +DISPLAY
 -rw-r--r--  1 root  wheel   1009 Sep 26 09:20 +DESC
 -rw-r--r--  1 root  wheel  11102 Sep 26 09:20 +CONTENTS
 -rw-r--r--  1 root  wheel     63 Sep 26 09:20 +COMMENT
 -rw-r--r--  1 root  wheel     17 Dec  7 15:44 +REQUIRED_BY
 
 
 So it might have been on FreeBSD 9.0-BETA2.
 

I'm not sure what CAM IOCTL number change others are referring to but
you certainly need to rebuild libcam consumers after r225950, which
was merged to stable/9 in r226067 on October 6 2011.

Marius


nge(4), tl(4), wb(4) and rl(4) 8129 testers wanted [Re: Question about GPIO bitbang MII]

2011-10-15 Thread Marius Strobl


Could owners of nge(4), tl(4), wb(4) and rl(4) driven hardware (as for
rl(4) only 8129 need testing, 8139 don't) please give the following
patch a try in order to ensure it doesn't break anything?
for 9/head:
http://people.freebsd.org/~marius/mii_bitbang.diff
for 8:
http://people.freebsd.org/~marius/mii_bitbang.diff8

Thanks,
Marius


Re: nge(4), tl(4), wb(4) and rl(4) 8129 testers wanted [Re: Question about GPIO bitbang MII]

2011-10-15 Thread Marius Strobl

On Sun, Oct 16, 2011 at 02:46:23AM +0200, Damien Fleuriot wrote:
 
 
 On 15 Oct 2011, at 22:56, Marius Strobl mar...@alchemy.franken.de wrote:
 
  
  Could owners of nge(4), tl(4), wb(4) and rl(4) driven hardware (as for
  rl(4) only 8129 need testing, 8139 don't) please give the following
  patch a try in order to ensure it doesn't break anything?
  for 9/head:
  http://people.freebsd.org/~marius/mii_bitbang.diff
  for 8:
  http://people.freebsd.org/~marius/mii_bitbang.diff8
  
  Thanks,
  Marius
  
 
 
 While I don't have any box with this hardware, I'm thinking you might want to 
 get a bit more specific about what you want tested...
 
 What do you think the patch might break ?
 

Basically, if there's something wrong with the patch the driver should
fail to attach; if it still attaches and gets a link, all should be fine.

Marius


Re: SCSI descriptor sense changes, testing needed

2011-09-23 Thread Marius Strobl

On Thu, Sep 22, 2011 at 01:33:05PM -0600, Kenneth D. Merry wrote:
 
 I have attached a set of patches against head that implement SCSI
 descriptor sense support for CAM.
 
 Descriptor sense is a new sense (SCSI error) format introduced in the SPC-3
 spec in 2006.  FreeBSD doesn't currently support it.
 
 Seagate's new 3TB SAS drives come with descriptor sense enabled by default,
 and it's possible that other newer drives do as well.  Because all the
 sense key, additional sense code, and additional sense code qualifier
 fields are in different places, the CAM error recovery code will not do the
 right thing when it gets descriptor sense.
 
 These patches do bump up the size of struct scsi_sense_data, and so I have
 incremented CAM_VERSION as well.  I have discussed this with re@, and it
 looks like we'll be putting the changes in before 9.0, so it ships with
 support for newer SCSI devices.

Hi Ken,

as far as I understand this also requires consumers of scsi_sense_data
and SSD_FULL_SIZE etc in userland to be recompiled. So while you are at
breaking the API and ABI of CAM anyway, could you please take the
opportunity to change CAM_XPT_PATH_ID and CAM_BUS_WILDCARD to not use
the same value so incorrect uses will fail? Currently, there seems to
be a lot of confusion when to use which one, including camcontrol(8)
just encoding this as -1:
	/*
	 * We don't want to rescan or reset the xpt bus.
	 * See above.
	 */
	if ((int)bus_result->path_id == -1)
		continue;

Moreover, AFAICT CAM_XPT_PATH_ID corresponds to what the ANSI CAM Draft
refers to as XPT Path ID and specifies a value of 0xff for.
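
To illustrate why sharing one value hides mistakes, here is a minimal stand-alone sketch. The EX_-prefixed constants, the ex_path_id_t type and the helper functions are hypothetical and are not the definitions from the CAM headers; the value 0xff for the XPT path ID simply follows the ANSI CAM draft mentioned above, and the all-ones wildcard is an assumption chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; NOT the definitions from the CAM headers. */
typedef unsigned int ex_path_id_t;

#define	EX_BUS_WILDCARD	((ex_path_id_t)~0U)	/* "any bus" in a match */
#define	EX_XPT_PATH_ID	((ex_path_id_t)0xffU)	/* the transport's own bus */

/* With distinct values, mixing the two up can be caught at build time. */
static_assert(EX_BUS_WILDCARD != EX_XPT_PATH_ID,
    "wildcard and XPT path ID must differ");

/* True only for the XPT bus, which must not be rescanned or reset. */
bool
ex_is_xpt_bus(ex_path_id_t id)
{
	return (id == EX_XPT_PATH_ID);
}

/* True only for the wildcard, which never names a real bus. */
bool
ex_is_bus_wildcard(ex_path_id_t id)
{
	return (id == EX_BUS_WILDCARD);
}
```

With such a split, a check like the camcontrol(8) one quoted above could name the constant it means instead of comparing against a bare -1, and code that passes the wildcard where the XPT bus is expected would no longer silently match.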

Marius


Re: SSD - TRIM, SU and SUJ - Installer Options

2011-09-16 Thread Marius Strobl

On Thu, Sep 08, 2011 at 06:21:40PM +0200, Nathan Whitehorn wrote:
 On 09/08/11 16:22, mysph...@web.de wrote:
 Hi there,
 
 first off all I have to say, that your new Installer in FBSD v9.0 is
 very well done.
 I just found an option, which is not activated at this time. So I wanted
 to ask, if it's possible a bug or something like that.
 
 I've tried to change the Options in the section Partition Editor with
 the subsection Add Partition and wanted to save my changes: Softupdates
 = disabled, Softupdates journaling = disabled, TRIM = enabled with the
 OK button.
 But if I reenter the Options menu, so my changes will be overwritten
 with the default values: SU = enabled, SUJ = enabled ([UFS1 +] TRIM =
 disabled).
 
 I've tried this without success in FBSD v9.0 BETA1+2 (amd64) with the
 ISO- and the IMG Images.
 
 My workaround is, that I run in single-user-mode and change the values
 with tunefs.
 
 Would you please check, that Options point - if my act is accurate.
 
 Thanks in advance and have a nice day!
 
 This is an interesting point that I hadn't tested. The options do work 
 -- the state of the dialog is just not restored when the Options menu is 
 reentered and so a second trip to Options resets the defaults, unless 
 you then change it again. I'm traveling at the moment, so am not able to 
 fix it at the moment. The internal architecture may also make it 
 slightly tricky to fix.

In my experience the filesystem options menu doesn't work at all, i.e.
the options selected there are just ignored also when selecting them just
once and not re-entering that menu. I've tried to create a filesystem
with SUJ disabled and TRIM enabled twice now, last time with BETA2 on
amd64, and I always end up with a filesystem that has SUJ enabled but
TRIM disabled.

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-17 Thread Marius Strobl

On Sat, Jul 16, 2011 at 10:42:22PM -0700, Doug Barton wrote:
 On 07/15/2011 01:40, Marius Strobl wrote:
 
  The generated config.h and platform.h for sparc64 are these:
  http://people.freebsd.org/~marius/bind96_config.h
  http://people.freebsd.org/~marius/bind96_platform.h
 
 Marius,
 
 Thanks again for all your help on this. During the work to upgrade to
 BIND 9.8 in HEAD I first tried your patch but I got some odd errors on
 some of the non-mainstream archs, so I ultimately went with something
 similar to what you sent but much more conservative.
 

Thanks!

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-15 Thread Marius Strobl

On Thu, Jul 14, 2011 at 05:31:49PM -0700, Doug Barton wrote:
 On 07/14/2011 16:21, Marius Strobl wrote:
  On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
  2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
  Oops, sorry, I forgot to revert the previous patch when test-compiling.
  Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
  I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
  and it worked properly till Sun Jul 10 22:25:41 MSD.
  At 22:25:41 I restarted bind from base system with your
  sparc64_isc_atomic.h.diff2.
  From this moment till today, 15:57:05 he crashed 3 times:
  Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on 
  signal 6
  Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on 
  signal 6
  Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on 
  signal 6
 
  To make to ensure proper operation of bind from ports, I ran it again
  at 15:57:05, and, I think, we need to wait several days.
  And from that time till now bind from ports never died and works 
  properly...
 
  
  Okay.
  Doug, could you please disable the use of atomic operations for sparc64
  in the in-tree BIND via the following patch in order to match what the
  vendor source does?
  http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
 
 If you use the port and do 'make configure' are the values in config.h
 the same as the ones in your patch?  If so, that's likely to be the
 right answer, and I'll go ahead and apply your patch.
 

The generated config.h and platform.h for sparc64 are these:
http://people.freebsd.org/~marius/bind96_config.h
http://people.freebsd.org/~marius/bind96_platform.h

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-14 Thread Marius Strobl

On Thu, Jul 14, 2011 at 09:53:42AM +0400, KOT MATPOCKuH wrote:
 2011/7/11 KOT MATPOCKuH matpoc...@gmail.com:
  Oops, sorry, I forgot to revert the previous patch when test-compiling.
  Please re-fetch sparc64_isc_atomic.h.diff2 and try again.
  I started named from ports (dns/bind96) at Sat Jul  9 10:08:41 MSD,
  and it worked properly till Sun Jul 10 22:25:41 MSD.
  At 22:25:41 I restarted bind from base system with your
  sparc64_isc_atomic.h.diff2.
  From this moment till today, 15:57:05 he crashed 3 times:
  Jul 10 23:19:19 sunrise kernel: pid 45352 (named), uid 53: exited on signal 
  6
  Jul 11 14:52:20 sunrise kernel: pid 52032 (named), uid 53: exited on signal 
  6
  Jul 11 15:14:15 sunrise kernel: pid 71300 (named), uid 53: exited on signal 
  6
 
  To make to ensure proper operation of bind from ports, I ran it again
  at 15:57:05, and, I think, we need to wait several days.
 And from that time till now bind from ports never died and works properly...
 

Okay.
Doug, could you please disable the use of atomic operations for sparc64
in the in-tree BIND via the following patch in order to match what the
vendor source does?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
I've no idea why they don't work properly (apart from the fact that there
additionally should be memory barriers at least when used for reference
counting just like the alpha version of the ISC atomic operations uses),
I just can say they match what we use in the kernel without problems
pretty closely and that they work as described in the respective comments
when testing them stand-alone. So my best guess is that the BIND source
additionally depends on some x86-specific behavior of the atomic operations
there or in general, but from a glance at the source it's not obvious to me
what that could be. Given that the vendor source doesn't even use atomic
operations on Solaris/SPARC I suspect this is a non-trivial problem.
It probably would be a good idea to also disable the use of atomic
operations for arm again just like the vendor source does as they don't
work there either but nobody seems to care (see PR 154306).

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl

On Fri, Jul 08, 2011 at 03:47:08PM +0400, KOT MATPOCKuH wrote:
 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
  That's not the patch I was referring to. I did a second one which just
  entirely disables the use of atomic operations on sparc64:
  http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
 Omg. I'm sorry.
 I applied this patch and restarted named, but named crashed immediately
 after start:
 08-Jul-2011 15:29:54.631 found 2 CPUs, using 2 worker threads
 08-Jul-2011 15:29:54.633 using up to 4096 sockets
 Segmentation fault (core dumped)
 
 core's backtrace:
 #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
 (gdb) bt
 #0  0x40953ba8 in __sparc_utrap_install () from /lib/libc.so.7
 #1  0x40953ccc in __sparc_utrap_install () from /lib/libc.so.7
 #2  0x40953f70 in __sparc_utrap_install () from /lib/libc.so.7
 #3  0x409537ac in __sparc_utrap_install () from /lib/libc.so.7
 #4  0x407c2d54 in pthread_mutex_lock () from /lib/libthr.so.3
 #5  0x00228dcc in ?? ()
 Previous frame identical to this frame (corrupt stack?)
 
 Could this be a sign to a problem in libthr?

Could be but IMO that's unlikely, if there'd be a bug affecting
pthread_mutex_lock() there should be more fallout from that. I'm probably
missing something about how to properly disable the use of the ISC atomic
implementation and to enable the alternative locking.
Please try the following:
a) Instead of the base BIND use the dns/bind96 port. The native build
   of the latter defaults to not using the ISC atomic implementation
   on sparc64 (and arm) and should properly enable the alternative. I
   can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
   configuration on -CURRENT without problems.
b) Revert the above patch and try the base bind with the following
   (third) patch:
   http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
   That one adds the memory barriers required for reference counting
   albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
   allow distinguishing between acquire and release semantics.
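
As a rough illustration of the difference mentioned in b), here is a sketch using C11 <stdatomic.h> rather than the ISC atomic API; all ex_ names are made up for this example. ex_ref_release() uses the ordering reference counting actually needs (release on the decrement, acquire before the object is torn down), while ex_ref_release_full() places a full barrier on both sides of the operation, which is roughly what one has to resort to when the API cannot express the distinction.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference counter, for illustration only. */
struct ex_ref {
	atomic_uint count;
};

static inline void
ex_ref_init(struct ex_ref *r, unsigned int n)
{
	atomic_init(&r->count, n);
}

/* Taking another reference needs no ordering against other accesses. */
static inline void
ex_ref_acquire(struct ex_ref *r)
{
	atomic_fetch_add_explicit(&r->count, 1, memory_order_relaxed);
}

/* Returns true if the caller dropped the last reference. */
static inline bool
ex_ref_release(struct ex_ref *r)
{
	unsigned int prev;

	/* Release: our earlier writes to the object become visible first. */
	prev = atomic_fetch_sub_explicit(&r->count, 1, memory_order_release);
	assert(prev > 0);
	if (prev != 1)
		return (false);
	/* Acquire: see everyone else's writes before destroying the object. */
	atomic_thread_fence(memory_order_acquire);
	return (true);
}

/* Sledgehammer variant: full barriers around a plain atomic decrement. */
static inline bool
ex_ref_release_full(struct ex_ref *r)
{
	unsigned int prev;

	atomic_thread_fence(memory_order_seq_cst);
	prev = atomic_fetch_sub_explicit(&r->count, 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst);
	assert(prev > 0);
	return (prev == 1);
}
```

Both variants are correct for reference counting; the second merely pays for stronger ordering than needed on every decrement.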

 
 PS.
 Also one month ago I got a problems with another multithreaded
 application from ports (www/oops). oops was crashed with stack's
 backtrace:
 #0  0x40d8fc88 in __sparc_utrap_install () from /lib/libc.so.7
 #1  0x40d8fdac in __sparc_utrap_install () from /lib/libc.so.7
 #2  0x40d90050 in __sparc_utrap_install () from /lib/libc.so.7
 #3  0x40d8f88c in __sparc_utrap_install () from /lib/libc.so.7
 #4  0x40d64044 in _malloc_thread_cleanup () from /lib/libc.so.7
 #5  0x40c039b8 in fork () from /lib/libthr.so.3
 #6  0x40c03d38 in fork () from /lib/libthr.so.3
 #7  0x40c03f50 in pthread_exit () from /lib/libthr.so.3
 #8  0x40c04414 in pthread_detach () from /lib/libthr.so.3
 #9  0x40c04710 in pthread_create () from /lib/libthr.so.3
 
 But on yesterday's world's build oops works properly. I think it may
 be related to r223228 (?)

Unlikely, the crash caused by the assertion in _malloc_thread_cleanup()
was solved with r223369.

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-08 Thread Marius Strobl

On Fri, Jul 08, 2011 at 11:17:20PM +0400, KOT MATPOCKuH wrote:
 2011/7/8 Marius Strobl mar...@alchemy.franken.de:
 
  Please try the following:
  a) Instead of the base BIND use the dns/bind96 port. The native build
     of the latter defaults to not using the ISC atomic implementation
     on sparc64 (and arm) and should properly enable the alternative. I
     can at least start named from bind96-9.6.3.1.ESV.R4.3 with the default
     configuration on -CURRENT without problems.
 dns/bind96? Why not bind98?

In order to have a result which can be compared with the base BIND.
Whether bind98 works or works without the ISC atomic operations says
nothing about the bind96 port or the base version.

 As I see dns/bind98 configures without atomic swap operations.
 I will try to use dns/bind98 at first :)
 
  b) Revert the above patch and try the base bind with the following
     (third) patch:
     http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff2
     That one adds the memory barriers required for reference counting
     albeit in a sledgehammer-like fashion as the ISC atomic API doesn't
     allow to distinguish between acquire and release semantics.
 
 Hmmm... With this patch build fails:

Oops, sorry, I forgot to revert the previous patch when test-compiling.
Please re-fetch sparc64_isc_atomic.h.diff2 and try again.

Marius


Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl

On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
 I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
 but problem is still exists:
 07-Jul-2011 13:24:22.765 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
 REQUIRE(prev > 0) failed
 07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
 
 How can I find root cause of the problem?
 

From your description it's unclear whether you've built BIND with or
without sparc64_isc_disable_atomic.diff. If it was built without that
patch please give it a try. If you had applied it then this apparently
is a generic bug in BIND and unrelated to the MD atomic implementation
and I don't know how to proceed in order to get that fixed. Hopefully
Doug can help you in that case.

Marius

Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-07 Thread Marius Strobl

On Thu, Jul 07, 2011 at 03:44:32PM +0400, KOT MATPOCKuH wrote:
 2011/7/7 Marius Strobl mar...@alchemy.franken.de:
  On Thu, Jul 07, 2011 at 01:46:23PM +0400, KOT MATPOCKuH wrote:
  I updated system to r223824 and got named patched to 9.6.-ESV-R4-P3,
  but problem is still exists:
  07-Jul-2011 13:24:22.765 general:
  /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1622:
  REQUIRE(prev > 0) failed
  07-Jul-2011 13:24:22.781 general: exiting (due to assertion failure)
 
  How can I find root cause of the problem?
  From your description it's unclear whether you've built BIND with or
  without sparc64_isc_disable_atomic.diff. If it was built without that
  patch please give it a try.
 As you can see, Doug has already included your patch in head:
 http://svnweb.freebsd.org/base/head/contrib/bind9/lib/isc/sparc64/include/isc/atomic.h?r1=222395&r2=223811
 And, of course, bind was built with your patch...
 

That's not the patch I was referring to. I did a second one which just
entirely disables the use of atomic operations on sparc64:
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff

Marius

Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl

On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
 On 06/28/2011 08:58, Marius Strobl wrote:
 Uhm, we once fixed a problem in the MD atomic implementation which
 still seems to be present in the ISC copy. Could you please test whether
 the following patch makes a difference?
 http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
 
 I haven't seen any verification from the OP that this patch solved the 
 problem,

It simply doesn't, so apparently there's another bug in other parts of
BIND causing it to trip over that assertion. Still, the clobber lists
of the sparc64 atomic bits were incomplete and fixing that IMO was the
right thing to do.

 however it did pass 'make universe' on both 9-current and 
 RELENG_8, so I've committed it to those 2 branches along with the recent 
 update. I'll also submit it upstream.
 

Thanks!
Marius

Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-07-06 Thread Marius Strobl

On Wed, Jul 06, 2011 at 11:55:15AM +0200, Marius Strobl wrote:
 On Tue, Jul 05, 2011 at 05:55:09PM -0700, Doug Barton wrote:
  On 06/28/2011 08:58, Marius Strobl wrote:
  Uhm, we once fixed a problem in the MD atomic implementation which
  still seems to be present in the ISC copy. Could you please test whether
  the following patch makes a difference?
  http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
  
  I haven't seen any verification from the OP that this patch solved the 
  problem,
 
 It simply doesn't, so apparently there's another bug in other parts of
 BIND causing it to trip over that assertion. Still, the clobber lists
 of the sparc64 atomic bits were incomplete and fixing that IMO was the
 right thing to do.
 

MATPOCKuH, could you please test the following patch?
http://people.freebsd.org/~marius/sparc64_isc_disable_atomic.diff
That one simply disables the use of atomic operations for sparc64, as
I doubt that these have seen much testing except on x86, be it on
sparc64 or in general. Given that they are also used for reference
counting, they should provide acquire and release semantics for that
purpose, including the necessary memory barriers, but the ISC atomic
API simply doesn't account for that. Moreover, the sparc64
implementation of the ISC atomic operations is FreeBSD-specific, as
FreeBSD is the only OS I'm aware of that uses the primary instead of the
secondary MMU context for userland (i.e. ASI_P; generally this is a
wise choice, though), so it doesn't work on the other *BSDs, Linux or
Solaris.
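
As a rough illustration of the acquire/release point above: a minimal
sketch only, written against C11 <stdatomic.h> rather than the ISC atomic
API, with the names obj, obj_ref and obj_unref made up for this example.
It shows where the release and acquire orderings would typically sit in a
reference count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical reference-counted object; not the ISC/BIND data structure. */
struct obj {
	atomic_uint refs;
};

static void
obj_ref(struct obj *o)
{
	/* Taking a reference needs atomicity only, no ordering. */
	atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Returns true when the caller has dropped the last reference. */
static bool
obj_unref(struct obj *o)
{
	unsigned int prev;

	/* Release: our earlier writes to *o are visible before the count drops. */
	prev = atomic_fetch_sub_explicit(&o->refs, 1, memory_order_release);
	assert(prev > 0);	/* counter must not already have been zero */
	if (prev != 1)
		return (false);
	/* Acquire: observe the other threads' writes before tearing down. */
	atomic_thread_fence(memory_order_acquire);
	return (true);
}
```

An API that offers only a single, unordered atomic add cannot express the
difference between the two orderings, which is why a fix on top of it has
to use full barriers everywhere.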

Marius

Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-29 Thread Marius Strobl

On Wed, Jun 29, 2011 at 02:33:06PM +0400, KOT MATPOCKuH wrote:
 2011/6/29 KOT MATPOCKuH matpoc...@gmail.com:
  I've got a problem with named on FreeBSD-CURRENT/sparc64.
  Up to 5 times a day it crashes with these messages:
  27-Jun-2011 03:42:14.384 general:
  /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
  REQUIRE(prev > 0) failed
  27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
 
  I found some similar problems on alpha and IA64, which were related
  to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
  But I don't understand what may be incorrect for sparc64, and
  this function has not changed for at least 4 years...
  Uhm, we once fixed a problem in the MD atomic implementation which
  still seems to be present in the ISC copy. Could you please test whether
  the following patch makes a difference?
  http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff
 
  I ran named with your patch and am watching it.
 Omg.
 Either I rebuilt named incorrectly, or the problem is not solved.
 I got a crash about 2 hours after named was restarted:
 29-Jun-2011 13:51:28.855 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 29-Jun-2011 13:51:28.856 general: exiting (due to assertion failure)
 

The remainder of the isc atomic.h looks fine though, so this likely
is a general bug in BIND, especially if it didn't happen before
BIND 9.6.-ESV-R4-P1. Doug should be able to help you.
Doug, could you please nevertheless take care of getting the above
patch into BIND? It's a merge of r148453.

Marius

Re: named crashes on assertion in rbtdb.c on sparc64/SMP

2011-06-28 Thread Marius Strobl

On Mon, Jun 27, 2011 at 07:19:33PM +0400, KOT MATPOCKuH wrote:
 Hello!
 
 I've got a problem with named on FreeBSD-CURRENT/sparc64.
 Up to 5 times a day it crashes with these messages:
 27-Jun-2011 03:42:14.384 general:
 /usr/src/lib/bind/dns/../../../contrib/bind9/lib/dns/rbtdb.c:1614:
 REQUIRE(prev > 0) failed
 27-Jun-2011 03:42:14.385 general: exiting (due to assertion failure)
 
 The problem is still in latest system's bind:
 # named -v
 BIND 9.6.-ESV-R4-P1
 
 This problem exists only on an SMP sparc64 system. On my other sparc64,
 with 1 processor, I do not have this problem.
 
 I found some similar problems on alpha and IA64, which were related
 to problems with the isc_atomic_xadd() function in include/isc/atomic.h.
 But I don't understand what may be incorrect for sparc64, and
 this function has not changed for at least 4 years...
 
 How can I help solve this problem?
 

Uhm, we once fixed a problem in the MD atomic implementation which
still seems to be present in the ISC copy. Could you please test whether
the following patch makes a difference?
http://people.freebsd.org/~marius/sparc64_isc_atomic.h.diff

Marius

Re: TLS bug?

2011-06-17 Thread Marius Strobl

On Thu, Jun 16, 2011 at 03:53:19AM -0400, Nathaniel W Filardo wrote:
 Atcht; it's late.  I forgot to mention that this system is a sparc64 V240
 2-way SMP machine.  It's running a kernel from 9.0-CURRENT r222833+262af52:
 Tue Jun  7 18:47:35 EDT 2011 and a userland from a little later.
 
 Sorry about that.
 --nwf;
 
 On Thu, Jun 16, 2011 at 03:31:38AM -0400, Nathaniel W Filardo wrote:
  I have a few applications (bonnie++ and mysql, specifically, both from
  ports) which trip over the assertion in
  lib/libc/stdlib/malloc.c:/^_malloc_thread_cleanup that
 assert(tcache != (void *)(uintptr_t)1);
  
  I have patched malloc.c thus:
  
   --- a/lib/libc/stdlib/malloc.c
   +++ b/lib/libc/stdlib/malloc.c
   @@ -1108,7 +1108,7 @@ static __thread arena_t   *arenas_map 
   TLS_MODEL;

#ifdef MALLOC_TCACHE
/* Map of thread-specific caches. */
   -static __thread tcache_t   *tcache_tls TLS_MODEL;
   +__thread tcache_t  *tcache_tls TLS_MODEL;

/*
 * Number of cache slots for each bin in the thread cache, or 0 if tcache
 * is
   @@ -6184,10 +6184,17 @@ _malloc_thread_cleanup(void)
#ifdef MALLOC_TCACHE
   tcache_t *tcache = tcache_tls;

   +fprintf(stderr, "_m_t_c for %d:%lu with %p\n",
   +   getpid(),
   +   (unsigned long) _pthread_self(),
   +   tcache);
   +
   if (tcache != NULL) {
   -   assert(tcache != (void *)(uintptr_t)1);
   -   tcache_destroy(tcache);
   -   tcache_tls = (void *)(uintptr_t)1;
   +   /* assert(tcache != (void *)(uintptr_t)1); */
   +   if((uintptr_t)tcache != (uintptr_t)1) {
   +   tcache_destroy(tcache);
   +   tcache_tls = (void *)(uintptr_t)1;
   +   }
  
  and libthr/thread/thr_create.c thus:
  
   --- a/lib/libthr/thread/thr_create.c
   +++ b/lib/libthr/thread/thr_create.c
   @@ -243,6 +243,8 @@ create_stack(struct pthread_attr *pattr)
   return (ret);
}

   +extern __thread void *tcache_tls;
   +
static void
thread_start(struct pthread *curthread)
{
   @@ -280,6 +282,11 @@ thread_start(struct pthread *curthread)
   curthread->attr.stacksize_attr;
#endif

   +fprintf(stderr, "t_s for %d:%lu with %p\n",
   +getpid(),
   +(unsigned long) _pthread_self(),
   +tcache_tls);
   +
   /* Run the current thread's start routine with argument: */
   _pthread_exit(curthread->start_routine(curthread->arg));

  
  to attempt to debug this issue.  With those changes in place, bonnie++'s
  execution looks like this:
  
  [...]
   Writing a byte at a time...done
   Writing intelligently...done
   Rewriting...done
   Reading a byte at a time...done
   Reading intelligently...done
   t_s for 79654:1086343168 with 0x0
   t_s for 79654:1086345216 with 0x0
   t_s for 79654:1086346240 with 0x0
   t_s for 79654:1086347264 with 0x0
   t_s for 79654:1086344192 with 0x0
   start 'em...done...done...done...done..._m_t_c for 79654:1086344192 with
   0x41404400
   _m_t_c for 79654:1086346240 with 0x40d2c400
   _m_t_c for 79654:1086343168 with 0x41404200
   _m_t_c for 79654:1086345216 with 0x41804200
   done...
   _m_t_c for 79654:1086347264 with 0x41004200
   Create files in sequential order...done.
   Stat files in sequential order...done.
   Delete files in sequential order...done.
   Create files in random order...done.
   Stat files in random order...done.
   Delete files in random order...done.
   1.96,1.96,hydra.priv.oc.ietfng.org,1,1308217772,10M,,7,81,2644,7,3577,14,34,93,+,+++,773.7,61,16,,,
   ,,2325,74,13016,99,2342,86,3019,91,11888,99,2184,89,16397ms,1237ms,671ms,2009ms,177us,1305ms,489ms,1029
   us,270ms,140ms,53730us,250ms
   Writing a byte at a time...done
   Writing intelligently...done
   Rewriting...done
   Reading a byte at a time...done
   Reading intelligently...done
   t_s for 79654:1086343168 with 0x1
   t_s for 79654:1086346240 with 0x1
   t_s for 79654:1086345216 with 0x1
   t_s for 79654:1086347264 with 0x1
   t_s for 79654:1086344192 with 0x1
   start 'em...done...done...done...done...done...
   _m_t_c for 79654:1086347264 with 0x1
   _m_t_c for 79654:1086344192 with 0x1
   _m_t_c for 79654:1086343168 with 0x1
  [...]
  
  So what seems to be happening is that the TLS area is being set up
  incorrectly, eventually: rather than zeroing the tcache_tls value, it is
  being set to 1, which means no tcache is ever allocated, so when we get
  around to exiting, the assert trips.
  
  Unfortunately, setting a breakpoint on __libc_allocate_tls seems to do bad
  things to the kernel (inducing a SIR without any panic message).  I am
  somewhat at a loss; help?
  

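Using bonnie++ I can't reproduce this (didn't try mysql) but I have
some TLS fixes for libthr which I forgot about but which could be
relevant here (most actually date back to 2008 when the base binutils
didn't support GNUTLS for sparc64 so I couldn't test them easily). Could
you please give a libthr build with the following patch a try?
http://people.freebsd.org/~marius/libthr_sparc64.diff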

Re: TLS bug?

2011-06-17 Thread Marius Strobl

On Fri, Jun 17, 2011 at 03:31:29PM -0400, Nathaniel W Filardo wrote:
 On Fri, Jun 17, 2011 at 08:07:13PM +0200, Marius Strobl wrote:
  Using bonnie++ I can't reproduce this (didn't try mysql) but I have
 
 I seem to have good luck reproducing it with -r 5 -s 10 -x 10 by about the
 third iteration.

Ok, with these parameters I can reproduce it.

 
  some TLS fixes for libthr I forgot about but could be relevant here
  (most actually date back to 2008 when the base binutils didn't support
  GNUTLS for sparc64 so I couldn't test them easily). Could you please
  give a libthr build with the following patch a try?
  http://people.freebsd.org/~marius/libthr_sparc64.diff
 
 Concurrent runs both with and without those diffs still asserted.
 Interestingly, libc's .tbss section, even after the assertion, is still full
 of zeros, so it looks like something stranger than a wild-write back to
 .tbss.  I'll go diving through the tls allocation code again when I get a
 minute.
 

In combination with the below patch bonnie++ survived 100 iterations
here. I'm not sure what this means though as I don't have much knowledge
about TLS, I merely implemented the necessary relocations. Could be
that malloc() actually requires the initial exec model for variant II.
Unfortunately, it's not documented why it was added for x86.
Jason, can you shed some light on this?

Marius

Index: malloc.c
===================================================================
--- malloc.c    (revision 219535)
+++ malloc.c    (working copy)
@@ -234,7 +234,7 @@
 #ifdef __sparc64__
 #  define LG_QUANTUM           4
 #  define LG_SIZEOF_PTR        3
-#  define TLS_MODEL            /* default */
+#  define TLS_MODEL            __attribute__((tls_model("initial-exec")))
 #endif
 #ifdef __amd64__
 #  define LG_QUANTUM   4
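
For readers unfamiliar with the attribute: a minimal sketch, assuming GCC
or a compatible compiler, of a thread-local variable forced to the
initial-exec TLS model in the same way the TLS_MODEL macro is applied in
malloc.c. The names cache_slot and cache_get are hypothetical and are not
taken from malloc.c.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the TLS_MODEL macro; assumes a GCC-compatible compiler. */
#define	TLS_MODEL	__attribute__((tls_model("initial-exec")))

/* One slot per thread, zero-initialized at thread creation. */
static __thread void *cache_slot TLS_MODEL;

/*
 * Return this thread's cached pointer, installing `fresh` on first use.
 * Later calls from the same thread keep returning the first value.
 */
static void *
cache_get(void *fresh)
{
	assert(fresh != NULL);
	if (cache_slot == NULL)
		cache_slot = fresh;
	return (cache_slot);
}
```

With the default model the compiler is free to pick a more general access
sequence; the attribute only pins the choice and does not change what the
variable means.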

Re: ZFS panic with concurrent recv and read-heavy workload

2011-06-08 Thread Marius Strobl

On Fri, Jun 03, 2011 at 03:03:56AM -0400, Nathaniel W Filardo wrote:
 I just got this on another machine, no heavy workload needed, just booting
 and starting some jails.  Of interest, perhaps, both this and the machine
 triggering the below panic are SMP V240s with 1.5GHz CPUs (though I will
 confess that the machine in the original report may have had bad RAM).  I
 have run a UP 1.2GHz V240 for months and never seen this panic.
 
 This time the kernel is
  FreeBSD 9.0-CURRENT #9: Fri Jun  3 02:32:13 EDT 2011
 csup'd immediately before building.  The full panic this time is
  panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @
  /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:4659
 
  cpuid = 1
  KDB: stack backtrace:
  panic() at panic+0x1c8
  _sx_assert() at _sx_assert+0xc4
  _sx_xunlock() at _sx_xunlock+0x98
  l2arc_feed_thread() at l2arc_feed_thread+0xeac
  fork_exit() at fork_exit+0x9c
  fork_trampoline() at fork_trampoline+0x8
 
  SC Alert: SC Request to send Break to host.
  KDB: enter: Line break on console
  [ thread pid 27 tid 100121 ]
  Stopped at  kdb_enter+0x80: ta  %xcc, 1
  db> reset
  ttiimmeeoouutt  sshhuuiinngg  ddoowwnn  CCPPUUss..
 
 Half of the memory in this machine is new (well, came with the machine) and
 half is from the aforementioned UP V240 which seemed to work fine (I was
 attempting an upgrade when this happened); none of it (or indeed any of the
 hardware save the disk controller and disks) are common between this and the
 machine reporting below.
 
 Thoughts?  Any help would be greatly appreciated.
 Thanks.
 --nwf;
 
 On Wed, Apr 06, 2011 at 04:00:43AM -0400, Nathaniel W Filardo wrote:
 [...]
  panic: Lock buf_hash_table.ht_locks[i].ht_lock not exclusively locked @ 
  /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1869
 
  cpuid = 1
  KDB: stack backtrace:
  panic() at panic+0x1c8
  _sx_assert() at _sx_assert+0xc4
  _sx_xunlock() at _sx_xunlock+0x98
  arc_evict() at arc_evict+0x614
  arc_get_data_buf() at arc_get_data_buf+0x360
  arc_buf_alloc() at arc_buf_alloc+0x94
  dmu_buf_will_fill() at dmu_buf_will_fill+0xfc
  dmu_write() at dmu_write+0xec
  dmu_recv_stream() at dmu_recv_stream+0x8a8
  zfs_ioc_recv() at zfs_ioc_recv+0x354
  zfsdev_ioctl() at zfsdev_ioctl+0xe0
  devfs_ioctl_f() at devfs_ioctl_f+0xe8
  kern_ioctl() at kern_ioctl+0x294
  ioctl() at ioctl+0x198
  syscallenter() at syscallenter+0x270
  syscall() at syscall+0x74
  -- syscall (54, FreeBSD ELF64, ioctl) %o7=0x40c13e24 --
  userland() at 0x40e72cc8
  user trace: trap %o7=0x40c13e24
  pc 0x40e72cc8, sp 0x7fd4641
  pc 0x40c158f4, sp 0x7fd4721
  pc 0x40c1e878, sp 0x7fd47f1
  pc 0x40c1ce54, sp 0x7fd8b01
  pc 0x40c1dbe0, sp 0x7fd9431
  pc 0x40c1f718, sp 0x7fdd741
  pc 0x10731c, sp 0x7fdd831
  pc 0x10c90c, sp 0x7fdd8f1
  pc 0x103ef0, sp 0x7fde1d1
  pc 0x4021aff4, sp 0x7fde291
  done
 [...]

Apparently this is a locking issue in the ARC code; the ZFS people should
be able to help you.

Marius

Re: Old ATA disk names emulation [Was: Switch from legacy ata(4) to CAM-based ATA]

2011-04-25 Thread Marius Strobl

On Mon, Apr 25, 2011 at 01:23:37PM +0300, Alexander Motin wrote:
 Hi.
 
 I've implemented following patch to keep basic compatibility for the
 migrating users. I don't like such hacky things, but at least I tried to
 make it less invasive.
 
 The idea:
  - New xpt_path_legacy_ata_id() function in CAM tries to predict bus
 unit number and then device unit number for specified path, as if it was
 with legacy ATA with ATA_STATIC_ID option.
  - on attach, ada driver fetches that number (if not disabled using
 tunable kern.cam.ada.legacy_aliases), prints to console something like:
 ada0: Previously was known as ad12
 , and sets kernel environment variable like:
 kern.devalias.ada0=ad12
  - when geom_dev tastes new geom and creates device node for it, it also
 tries to match prefix of the device name with present kern.devalias.*
 environment variables, and, if some match found, creates alias with
 substituted name (ada0 -> ad12, ada0s1 -> ad12s1, etc.).
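
The prefix substitution described in the last item can be sketched roughly
as follows. This is an illustration under assumptions only: make_alias()
is a hypothetical userland helper, not code from legacy_aliases.patch, and
the rule that the matched prefix must not be followed by another digit is
an added guess to keep ada1 from matching ada10.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: if `name` starts with `newpfx`, write the alias
 * obtained by substituting `oldpfx` for that prefix into `buf` and
 * return 1; otherwise return 0 and leave the decision to the caller.
 */
static int
make_alias(const char *name, const char *newpfx, const char *oldpfx,
    char *buf, size_t len)
{
	size_t plen;

	assert(name != NULL && newpfx != NULL && oldpfx != NULL && buf != NULL);
	plen = strlen(newpfx);
	if (strncmp(name, newpfx, plen) != 0)
		return (0);
	/* Assumption: a following digit means a different unit number. */
	if (name[plen] >= '0' && name[plen] <= '9')
		return (0);
	if ((size_t)snprintf(buf, len, "%s%s", oldpfx, name + plen) >= len)
		return (0);	/* alias would not fit into buf */
	return (1);
}
```

Called with name "ada0s1", new prefix "ada0" and old prefix "ad12", the
helper produces "ad12s1", matching the examples in the list above.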
 
 The patch is here: http://people.freebsd.org/~mav/legacy_aliases.patch
 
 I did few tests and it seems like working -- two sets of device nodes
 appeared for each device, I can successfully label and mount any of them.
 
 What will not work:
  - old device names won't be seen inside GEOM, so users who hardcoded
 provider names in gmirror/gstripe/... metadata (not the default
 behavior) are still in trouble.
  - patch mimics ATA_STATIC_ID behavior, if user had custom kernel
 without it, he should update device names manually.
  - it won't work for users with hot-unplugging ATA controllers (not
 devices), but I believe it is really rare case.
  - low-level tools, such as smartmontools, won't be able to work with
 alias devices, as background ada driver doesn't implement legacy
 ioctls. Maybe I could partially fix this.
 
 Except those, I think this patch should work for the most of users.
 
 Any more objections/ideas? Is this an acceptable solution?
 

Hi,

given that only the amd64, i386 and pc98 GENERIC kernel configuration
files had ATA_STATIC_ID enabled by default, it would be highly desirable
that your compatibility shim also only mimics that behavior on these
archs or, probably better, actually checks for ATA_STATIC_ID, with that
option put back into the respective kernel configuration files.

Marius

Re: Switch from legacy ata(4) to CAM-based ATA

2011-04-21 Thread Marius Strobl

On Thu, Apr 21, 2011 at 01:26:25PM +0300, Alexander Motin wrote:
 Marius Strobl wrote:
  On Wed, Apr 20, 2011 at 12:57:47PM +0300, Alexander Motin wrote:
  With 9.0 release approaching quickly, I believe it is the best time now to
  manage migration from legacy ata(4) ATA to the new CAM-based one. New
  ATA code has been present in the tree for more than a year now, is used
  by many people and has proved its superior functionality and reliability. The only
  major issue with it now is the migration process. Sooner or later we
  have to pass it, but due to major UI and API changes we can't do it
  after 9.0 release. So I propose to do it the next Sunday (April 24) to
  have as much time for troubleshooting as possible.
 
  I have prepared the following patch to do it:
  http://people.freebsd.org/~mav/ata_switch.patch
  
  Could you please add descriptions of the controllers supported by
  ahci(4), mvs(4) and siis(4) to the kernel configuration files and
  preserve alphabetical ordering, i.e. list ata(4) after ahci(4)?
 
 OK. Here is the new patch:
 http://people.freebsd.org/~mav/ata_switch2.patch
 

Thanks!

Marius

Re: Switch from legacy ata(4) to CAM-based ATA

2011-04-20 Thread Marius Strobl

On Wed, Apr 20, 2011 at 12:57:47PM +0300, Alexander Motin wrote:
 Hi.
 
 With 9.0 release approaching quickly, I believe it is the best time now to
 manage migration from legacy ata(4) ATA to the new CAM-based one. New
 ATA code has been present in the tree for more than a year now, is used
 by many people and has proved its superior functionality and reliability. The only
 major issue with it now is the migration process. Sooner or later we
 have to pass it, but due to major UI and API changes we can't do it
 after 9.0 release. So I propose to do it the next Sunday (April 24) to
 have as much time for troubleshooting as possible.
 
 I have prepared the following patch to do it:
 http://people.freebsd.org/~mav/ata_switch.patch
 

Could you please add descriptions of the controllers supported by
ahci(4), mvs(4) and siis(4) to the kernel configuration files and
preserve alphabetical ordering, i.e. list ata(4) after ahci(4)?

Marius

Re: Fwd: OpenSSL 1.0.0d for Freebsd HEAD

2011-03-02 Thread Marius Strobl

On Wed, Mar 02, 2011 at 10:58:00AM +0100, Alexandre Martins wrote:
 Hello,
 
 This sound great :)
 
 SIGILL is raised when the program tries to execute assembly code that the
 CPU cannot execute. It means that the library or the binary is miscompiled.
 

Not necessarily; the sparc64 code, for example, also kills programs with
SIGILL when they overflow their stack or the stack pointer is corrupt.

Marius

Re: Fwd: OpenSSL 1.0.0d for Freebsd HEAD

2011-03-01 Thread Marius Strobl

On Tue, Mar 01, 2011 at 10:31:16AM +0100, Alexandre Martins wrote:
 Dear,
 
 Have you extracted the tarball of the openssl source (1.0.0d) in crypto/openssl?
 

Ah, I missed that, the last couple of mails in this thread were only
talking about the patch :)
With the tarball untarred it actually builds and works on sparc64 as
far as ssh(d) and HTTPS via fetch are concerned. The problem reports
(programs getting killed with SIGILL probably due to an infinite
recursion or some such) were about apache and unbound using an
OpenSSL 1.0.0 port. I'm not sure whether their use of OpenSSL would
make a difference or the port is broken.

Marius

Re: OpenSSL 1.0.0d for Freebsd HEAD

2011-02-28 Thread Marius Strobl

On Mon, Feb 28, 2011 at 12:00:19PM +0100, Fabien Thomas wrote:
 
 
  Dears,
  
  After several research, i have removed the problematic part.
  
  You can find the new version here:
  
  http://people.freebsd.org/~fabient/patch-head20110222-openssl1.0.0d
  
 
 
 It will be great to have it in 9.0.
 
 To do that how is it possible rebuild the port for all platform with openssl 
 1.0.0d in base?
 Is there some people against that inclusion?
 

Given that some users report that ports linked against the port version
of OpenSSL 1.0.0 (c, I think) do not work on sparc64, I wanted to
give your patch a try, but unfortunately it doesn't even build:
===> secure/lib/libcrypto (buildincludes)
cp /usr/home/marius/co/head3/src/secure/lib/libcrypto/opensslconf-sparc64.h opensslconf.h
( echo "#ifndef MK1MF_BUILD";  echo "  /* auto-generated by crypto/Makefile.ssl for crypto/cversion.c */";  echo "  #define CFLAGS \"cc\"";  echo "  #define PLATFORM \"FreeBSD-sparc64\"";  echo "  #define DATE \"`LC_ALL=C date`\"";  echo "#endif" ) > buildinf.h
make: don't know how to make asn1_locl.h. Stop
*** Error code 2

Marius

Re: libcompiler_rt now part of FreeBSD's base system

2010-11-12 Thread Marius Strobl

On Fri, Nov 12, 2010 at 03:57:20PM +0100, Florian Smeets wrote:
 On 11.11.10 16:52, Ed Schouten wrote:
  I just committed libcompiler_rt.a to HEAD. Even though I don't expect
  serious issues -- especially not on the tier 1 architectures -- be sure
  to contact me in case something goes wrong. I hooked it up to the build
  in a separate commit, so if your system starts to act weird, just revert
  r215127.
  
 
 Hi Ed,
 
 i'm at r215149 on sparc64, and my compiler stopped working. buildworld
 stops after 42 lines (http://smeets.im/~flo/bw.log). cc1 dumps a 1GB
 core file.
 
 Program terminated with signal 4, Illegal instruction.
 #0  0x004ced80 in ?? ()
 (gdb) where
 #0  0x004ced80 in ?? ()
 #1  0x004cedb0 in ?? ()
 Previous frame identical to this frame (corrupt stack?)
 
 Right now i cannot go back to r215126 to verify that it really is this
 change which is causing it :-) Previously the system was running a build
 from around Nov. 1st
 

I was just about to report the same based on a test of r214838. With
debugging symbols I get a more meaningful backtrace, though:
nimrod# gdb 
/tmp/objrt.old/usr/home/marius/co/compiler-rt/gnu/usr.bin/cc/cc1/cc1 
/tmp/objrt/usr/home/marius/co/compiler-rt/tmp/usr/home/marius/co/compiler-rt/tools/build/cc1.core
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "sparc64-marcel-freebsd"...(no debugging symbols found)...
Core was generated by `cc1'.
Program terminated with signal 4, Illegal instruction.
#0  0x004c0aa0 in __ctzdi2 ()
(gdb) bt
#0  0x004c0aa0 in __ctzdi2 ()
#1  0x004c0ad0 in __ctzdi2 ()
(gdb) 

The corresponding assembler code is:
004c0aa0 <__ctzdi2>:
  4c0aa0:   9d e3 bf 40 save  %sp, -192, %sp
  4c0aa4:   82 10 00 18 mov  %i0, %g1
  4c0aa8:   80 a0 00 18 cmp  %g0, %i0
  4c0aac:   85 3e 30 20 srax  %i0, 0x20, %g2
  4c0ab0:   b0 40 3f ff addc  %g0, -1, %i0
  4c0ab4:   90 38 00 18 xnor  %g0, %i0, %o0
  4c0ab8:   84 0e 00 02 and  %i0, %g2, %g2
  4c0abc:   90 0a 00 01 and  %o0, %g1, %o0
  4c0ac0:   b0 0e 20 20 and  %i0, 0x20, %i0
  4c0ac4:   90 12 00 02 or  %o0, %g2, %o0
  4c0ac8:   7f ff ff f6 call  4c0aa0 <__ctzdi2>
  4c0acc:   91 32 20 00 srl  %o0, 0, %o0
  4c0ad0:   b0 06 00 08 add  %i0, %o0, %i0
  4c0ad4:   81 cf e0 08 rett  %i7 + 8
  4c0ad8:   91 3a 20 00 sra  %o0, 0, %o0
  4c0adc:   01 00 00 00 nop

I think what happens here is that GCC uses __ctzdi2() to implement
__builtin_ctz(), while the libcompiler-rt version of __ctzdi2() uses
__builtin_ctz(), so __ctzdi2() is called recursively until the stack
overflows. Note that GCC has code like:
int __ctzsi2 (uSI x) { return __builtin_ctz (x); }
and rwindow_save() returns SIGILL, so I think this theory is correct
but I've no idea how to solve that.
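
To make the suspected recursion concrete: a minimal sketch, not the
libcompiler-rt source, of a count-trailing-zeros routine written with
plain shifts and masks, so that it does not depend on __builtin_ctz() and
the compiler has no builtin to lower back into a call to __ctzdi2(). The
name soft_ctz64 is hypothetical, and whether a given compiler recognizes
such code and still emits a library call is a separate question.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: count the trailing zero bits of a non-zero
 * 64-bit value by binary search over the low halves, using only
 * shifts and masks.
 */
static int
soft_ctz64(uint64_t x)
{
	int n = 0;

	assert(x != 0);		/* undefined for 0, as with the builtin */
	if ((x & 0xffffffffULL) == 0) { n += 32; x >>= 32; }
	if ((x & 0xffffULL) == 0) { n += 16; x >>= 16; }
	if ((x & 0xffULL) == 0) { n += 8; x >>= 8; }
	if ((x & 0xfULL) == 0) { n += 4; x >>= 4; }
	if ((x & 0x3ULL) == 0) { n += 2; x >>= 2; }
	if ((x & 0x1ULL) == 0) { n += 1; }
	return (n);
}
```

A library routine written this way terminates regardless of how the
compiler chooses to implement its own builtins.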

Another thing that worries me is that by switching to libcompiler-rt
we lose all the assembler optimizations libgcc has for sparc64. When
building with libcompiler-rt the buildworld time increases by 2.6%
on sparc64. I guess this mostly is due to the fact that now both
libcompiler-rt and libgcc are built though. Do you have an idea how
to benchmark the possible performance loss with libcompiler-rt for
typical applications?

Marius

Re: Sense fetching [Was: cdrtools /devel ...]

2010-11-05 Thread Marius Strobl

On Fri, Nov 05, 2010 at 08:50:49PM +0200, Alexander Motin wrote:
 Hi.
 
 I've reviewed the tests that scgcheck runs against the SCSI subsystem.
 They showed a combination of several issues in CAM, ahci(4) and cdrtools
 itself. Several small patches allow us to pass most of those tests:
 http://people.freebsd.org/~mav/sense/
 
 ahci_resid.patch: Add support for reporting residual length on data
 underrun. SCSI commands often return results shorter than expected.
 The returned value allows the application to know/check how much data it
 really has. It is also important for sense fetching, as ATAPI and USB
 devices return sense as data in response to the REQUEST_SENSE command.
 
 sense_resid.patch: When manually requesting sense data (ATAPI or USB),
 request only as much data as user requested (not the fixed structure
 size), and return respective sense residual length.
 
 pass_autosence.patch: Unless CAM_DIS_AUTOSENSE is set, always fetch
 sense if not done by SIM, independently of CAM_PASS_ERR_RECOVER. As soon
 as device freeze released before returning to user-level, user-level
 application by definition can't reliably fetch sense data if some other
 application (like hald) tries to access device same time.
 
 cdrtools.patch: Make libscg (part of cdrtools) on FreeBSD to submit
 wanted sense length to CAM and do not clear sense return buffer. It is
 mostly cosmetics, important probably only for scgcheck.

Please don't commit this to the port directly but let it loop back
via upstream (CC'ed) instead, otherwise we would need to obey the
following, which is undesirable, especially if these really are
mostly cosmetic issues:
/*
 *  Warning: you may change this source, but if you do that
 *  you need to change the _scg_version and _scg_auth* string below.
 *  You may not return "schily" for an SCG_AUTHOR request anymore.
 *  Choose your name instead of "schily" and make clear that the version
 *  string is related to a modified source.
 */

 
 Testers and reviewers welcome. I am especially interested in opinion
 about pass_autosence.patch -- may be we should lower sense fetching even
 deeper, to make it work for all cam_periph_runccb() consumers.
 

Marius

Re: bge0 does not work anymore

2010-10-18 Thread Marius Strobl

On Mon, Oct 18, 2010 at 09:32:13AM +0800, Buganini wrote:
 my last known usable kernel revision is r213813
 with r213920, leds are extinguished when executing dhclient

Sorry, it looks like it was my fault this time, should be fixed again
with r214012.

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next-prev != elm

2010-09-24 Thread Marius Strobl

On Tue, Jul 20, 2010 at 01:55:28PM +0200, Ståle Kristoffersen wrote:
 On 2010-07-20 at 12:17, Marius Strobl wrote:
  On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
   On 2010-07-18 at 14:20, Marius Strobl wrote:
  Downgrading now...
 
 And it crashed again, with current from r209598...
 

Ok, this at least means that your problem isn't caused by the recent
changes to mpt(4) as the pre-r209599 version only differed from the
8-STABLE one in a cosmetic change at that time.
   
   I have another data-point, I cvsup'ed to the latest current again, and
   rebuilt without INVARIANT and WITNESS, and now it seems to survive the
   timeouts.
  
  That's more or less expected as the sanity check issuing the panic
  just isn't compiled in then. However, my understanding was that with
  STABLE you don't get the timeouts in the first place, or do you see
  them there also?
 
 I got the timeouts with STABLE as well, that was the reason for me to
 try out CURRENT. I'm sorry I didn't mention that earlier.
 
 My main concern is to get rid of the timeouts, but a panic on one can't be
 right. How can I debug this further? I can get timeout fairly consistent by
 putting a bit of load on the drives. If it would help I can also provide
 remote access.
 

FYI, that panic is fixed with r213105.

Marius


Re: {arch}/conf/DEFAULTS and uart

2010-09-12 Thread Marius Strobl

On Sun, Sep 12, 2010 at 02:40:49AM +, Alexander Best wrote:
 On Fri Sep 10 10, John Baldwin wrote:
  On Thursday, September 09, 2010 3:50:45 pm Alexander Best wrote:
   On Thu Sep  9 10, Alexander Best wrote:
On Thu Sep  9 10, Alexander Best wrote:
 hi there,
 
 except for arm most archs seem to enforce uart support in 
 conf/DEFAULTS. is
 this really necessary? shouldn't DEFAULTS only contain vital 
 devices/options
 without a kernel on a specific arch won't function at all?

jhb just explained to me, that the uart entry in DEFAULTS is not a 
controller
or something like that, but the uart backend to use *if* uart gets 
defined in
the kernel config.

sorry for the noise folks.
   
   however i found some missing comments and incorrect syntax which i fixed.
   
   see the attached patch.
  
  I think the ia64 ordering for 'io and mem' is probably more correct
  (alphabetically sorted), so I would fix i386 and amd64 and leave ia64 alone.
  
  The powerpc 'machine' changes are wrong I think as it would break GENERIC64
  and powerpc64 kernel configs in general.  Nathan purposefully removed
  'machine' from the powerpc DEFAULTS.
 
 here's try #2. ;)
 
 diff --git a/sys/sparc64/conf/DEFAULTS b/sys/sparc64/conf/DEFAULTS
 index 38b2408..2e60c94 100644
 --- a/sys/sparc64/conf/DEFAULTS
 +++ b/sys/sparc64/conf/DEFAULTS
 @@ -5,7 +5,7 @@
  
  machine  sparc64
  
 -# Pseudo devices.
 +# Pseudo devices
  device   mem # Memory and kernel memory devices
  
  # UART chips on this platform
 @@ -17,5 +17,5 @@ device  uart_z8530
  options  GEOM_PART_BSD
  options  GEOM_PART_VTOC8
  
 -# Let sunkbd emulate an AT keyboard by default.
 +# Let sunkbd emulate an AT keyboard by default

IMO this is a complete sentence and thus the period should stay.

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next-prev != elm

2010-07-20 Thread Marius Strobl

On Mon, Jul 19, 2010 at 07:06:54PM +0200, Ståle Kristoffersen wrote:
 On 2010-07-18 at 14:20, Marius Strobl wrote:
Downgrading now...
   
   And it crashed again, with current from r209598...
   
  
  Ok, this at least means that your problem isn't caused by the recent
  changes to mpt(4) as the pre-r209599 version only differed from the
  8-STABLE one in a cosmetic change at that time.
 
 I have another data-point, I cvsup'ed to the latest current again, and
 rebuilt without INVARIANT and WITNESS, and now it seems to survive the
 timeouts.

That's more or less expected as the sanity check issuing the panic
just isn't compiled in then. However, my understanding was that with
STABLE you don't get the timeouts in the first place, or do you see
them there also?
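
The point about the sanity check not being compiled in can be sketched as follows. This is a minimal illustration, not the kernel's queue(3) code: the type struct elm, the function list_remove() and the macro DEBUG_LINKS are all hypothetical, with DEBUG_LINKS standing in for a debugging kernel option.

```c
#include <stddef.h>

struct elm {
	struct elm *next;
	struct elm *prev;
};

/*
 * Unlink elm from a doubly-linked list. Returns 0 on success and -1 if
 * a neighbour does not point back at elm. The consistency check only
 * exists when built with -DDEBUG_LINKS; without it a corrupted list
 * goes unnoticed and the removal proceeds regardless.
 */
static int
list_remove(struct elm *elm)
{
#ifdef DEBUG_LINKS
	if (elm->next != NULL && elm->next->prev != elm)
		return (-1);
	if (elm->prev != NULL && elm->prev->next != elm)
		return (-1);
#endif
	if (elm->next != NULL)
		elm->next->prev = elm->prev;
	if (elm->prev != NULL)
		elm->prev->next = elm->next;
	elm->next = elm->prev = NULL;
	return (0);
}
```

A build without the check does not make the underlying corruption go away; it only removes the place where it would have been reported.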

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next-prev != elm

2010-07-18 Thread Marius Strobl

On Fri, Jul 16, 2010 at 12:31:26PM +0200, Ståle Kristoffersen wrote:
 On 2010-07-15 at 19:52, Ståle Kristoffersen wrote:
  On 2010-07-15 at 18:00, Marius Strobl wrote:
   On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
Upgraded from stable to current yesterday and very quickly received a
panic. It did however not dump its core, so I was unable to debug it.
Today it did panic again, and I took a picture: (Sorry about the bad
quality)

http://folk.uio.no/stalk/mpt/IMG_1403.JPG

And from the backtrace:
http://folk.uio.no/stalk/mpt/IMG_1404.JPG

Both times I had the mpt0: request timed out just before the panic.

I'm not sure why it's not dumping its core (It was working under 
stable,
and I have dumpdev=AUTO and dumpdir=/var/crash in rc.conf)
   
   What revision were you using?
  
  Not sure exactly what revision I was using, is there an easy way to figure
  that out? I ran cvsupdate around 13:00 CEST yesterday.
  
   Does using current as of r209598 make a difference?
  
  Downgrading now...
 
 And it crashed again, with current from r209598...
 

Ok, this at least means that your problem isn't caused by the recent
changes to mpt(4) as the pre-r209599 version only differed from the
8-STABLE one in a cosmetic change at that time.

Marius


Re: current + mpt = panic: Bad link elm 0xffffff80002d6480 next-prev != elm

2010-07-15 Thread Marius Strobl

On Thu, Jul 15, 2010 at 02:34:23PM +0200, Ståle Kristoffersen wrote:
 Upgraded from stable to current yesterday and very quickly received a
 panic. It did however not dump its core, so I was unable to debug it.
 Today it did panic again, and I took a picture: (Sorry about the bad
 quality)
 
 http://folk.uio.no/stalk/mpt/IMG_1403.JPG
 
 And from the backtrace:
 http://folk.uio.no/stalk/mpt/IMG_1404.JPG
 
 Both times I had the mpt0: request timed out just before the panic.
 
 I'm not sure why it's not dumping its core (It was working under stable,
 and I have dumpdev=AUTO and dumpdir=/var/crash in rc.conf)

What revision were you using?
Does using current as of r209598 make a difference?

Marius


Re: [sparc64] [panic] cheetah_ipi_selected: CPU can't IPI itself

2010-06-29 Thread Marius Strobl

On Mon, Jun 28, 2010 at 10:25:15AM -0400, Nathaniel W Filardo wrote:
 Well, I'm back in the same town as my sparc64 and so csup'd, built, and
 rebooted, trying to get more information about the vm object not owned
 panic I reported a while ago.  To my dismay, I now get this panic, also late
 enough in the boot process to be starting up jails:
 
 panic: cheetah_ipi_selected: CPU can't IPI itself
 cpuid = 0
 KDB: stack backtrace:
 panic() at panic+0x1c8
 cheetah_ipi_selected() at cheetah_ipi_selected+0x48
 tlb_page_demap() at tlb_page_demap+0xdc
 pmap_copy_page() at pmap_copy_page+0x4c4
 vm_fault() at vm_fault+0x13ec
 trap_pfault() at trap_pfault+0x190
 trap() at trap+0xd0
 -- data access protection tar=0x224b93 sfar=0x224550 sfsr=0x85
 %o7=0x4063398c --
 userland() at 0x40633830
 user trace: trap %o7=0x4063398c
 ...
 
 And the system hangs; I had to use the ALOM to reboot it.
 Sorry to not have more useful news.

Could you please give the following patch a try?
http://people.freebsd.org/~marius/sparc64_pin_ipis.diff
If that doesn't fix the above panic I have no clue how this can
happen apart from the per-CPU pages getting corrupted.

Marius


Re: AESNI driver and fpu_kern KPI

2010-05-15 Thread Marius Strobl

On Sat, May 15, 2010 at 01:04:01PM +0300, Kostik Belousov wrote:
 
 I am interested in the problem reports and reviews. Maintainers of
 !x86-oids are welcome to provide feedback whether they feel that
 proposed KPI could be implemented on their architectures, or what
 modifications they consider as needed to be able to implement
 it.
 

FYI, sparc64 doesn't need such a KPI as it has supported using the FPU
in the kernel unconditionally for ages.

Marius


Re: Followup: if_em.c prevents the 2nd time resuming

2010-05-15 Thread Marius Strobl

On Sat, May 15, 2010 at 10:23:17PM +0900, Taku YAMAMOTO wrote:
 PR filed as kern/146614.
 http://www.freebsd.org/cgi/query-pr.cgi?pr=146614
 

That was a mismerge introduced when moving the original patch
forward to a newer version of the e1000 source. It's now fixed.
Thanks for reporting.

Marius


Re: Switchover to CAM ATA?

2010-05-03 Thread Marius Strobl

On Mon, Apr 26, 2010 at 09:18:07AM -0600, Scott Long wrote:
 On Apr 26, 2010, at 6:51 AM, Alexander Motin wrote:
  Marius Strobl wrote:
  As noted earlier, pc98 and sparc64 need ada(4)/CAM ATA to perform
  geometry translation as done by ad_firmware_geom_adjust() for ad(4),
  which the following patch hooks up to both:
  http://people.freebsd.org/~marius/ata_disk_firmware_geom_adjust.diff
  You preferred to implement such functionality via XPT_CALC_GEOMETRY
  though (I'm still not convinced that it makes sense to put this
  functionality into every ATA SIM the same way it is done for SCSI
  rather than letting ada(4) handle it the same way for all SIMs
  however). Have you looked into implementing XPT_CALC_GEOMETRY for
  ATA CAM or is it okay to commit the above patch?
  
  Sorry, I have forgotten about this.
  
  I don't have a better idea. For ATA, translation indeed seems more
  platform- than controller-specific. Maybe I would just have preferred to see
  this hack done inside the XPT_CALC_GEOMETRY handler, as it is done now
  for PC98 SCSI. But seeing that this whole topic is quite crappy and
  hopefully going to die sometime, I won't argue much against committing
  this as-is for now.
 
 Put this into XPT_CALC_GEOMETRY.  There's no point in perpetuating the 
 mistakes of the ata driver.
 Give me a day or two to think of a reasonable way to do it right.
 

Did you get further with this approach?

Marius


Re: mpt(4) MPI_EVENT_IR_RESYNC_UPDATE

2010-05-01 Thread Marius Strobl

On Fri, Apr 30, 2010 at 06:50:26PM +0400, pluknet wrote:
 On 30 April 2010 18:22, Matthew Jacob m...@feral.com wrote:
  pluknet wrote:
  Seems good to me- why not throw it at freebsd-scsi? if nobody says no, I'll put
  it in
 
 Err.. I thought that list is dedicated for cam related stuff.
 
 [cc'ing scsi@ for better coverage. Sorry for cross-posting :/ ]
 
 
  --- RELENG_7_3/src/sys/dev/mpt/mpt_cam.c        2010-03-02
  15:38:13.0 +0300
  +++ RELENG_7_3.ours/src/sys/dev/mpt/mpt_cam.c   2010-04-21
  19:31:00.0 +0400
  @@ -2564,6 +2564,12 @@ mpt_cam_event(struct mpt_softc *mpt, req
                 CAMLOCK_2_MPTLOCK(mpt);
                 break;
         }
  +       case MPI_EVENT_IR_RESYNC_UPDATE:
  +       {
  +               uint8_t resync = (data0 >> 16) & 0xff;
  +               mpt_prt(mpt, "IR resync update %d completed\n", resync);
  +               break;
  +       }
         case MPI_EVENT_EVENT_CHANGE:
         case MPI_EVENT_INTEGRATED_RAID:
         case MPI_EVENT_SAS_DEVICE_STATUS_CHANGE:
 
  Another way - just hide such event since mptutil displays rebuild
  progress.
 
 
 

Could you maybe avoid defining a variable inside a nested scope for
consistency with the majority of the existing cases and in order to
not violate style(9) unnecessarily?

Marius

Index: mpt_cam.c
===
--- mpt_cam.c   (revision 207463)
+++ mpt_cam.c   (working copy)
@@ -2575,6 +2575,10 @@ mpt_cam_event(struct mpt_softc *mpt, request_t *re
                CAMLOCK_2_MPTLOCK(mpt);
                break;
        }
+       case MPI_EVENT_IR_RESYNC_UPDATE:
+               mpt_prt(mpt, "IR resync update %d completed\n",
+                   (data0 >> 16) & 0xff);
+               break;
        case MPI_EVENT_EVENT_CHANGE:
        case MPI_EVENT_INTEGRATED_RAID:
        case MPI_EVENT_SAS_DEVICE_STATUS_CHANGE:
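
The value printed for the resync event is a single byte taken out of the 32-bit event data word. A minimal sketch of that extraction, assuming a right shift by 16 followed by a mask of 0xff; the helper name event_byte2() is hypothetical and does not exist in mpt(4):

```c
#include <stdint.h>

/*
 * Hypothetical helper for illustration only: return bits 16..23 of an
 * event data word. Written as a plain expression so that no variable
 * has to be declared inside a nested case block.
 */
static uint8_t
event_byte2(uint32_t data0)
{
	return ((uint8_t)((data0 >> 16) & 0xff));
}
```

Passing the expression straight to the print call, as the patch does, avoids the extra braces and the block-local declaration that style(9) discourages.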

Re: Switchover to CAM ATA?

2010-04-24 Thread Marius Strobl

On Thu, Apr 22, 2010 at 06:31:37PM +0300, Alexander Motin wrote:
 Hi.
 
 With time passed, the CAM-based ATA infrastructure IMHO looks mature enough
 now to enable it in HEAD. Now we have two new stable drivers, ahci(4) and
 siis(4), covering the major part of modern SATA HBAs, an `options ATA_CAM`
 wrapper for ata(4) to support legacy hardware, and one more improved
 driver for Marvell HBAs (mvs) is now in development and soon will be
 present for testing. Together with many other people I have tested the above
 at least on i386, amd64, arm and sparc64 architectures.
 
 This switchover would give us significant performance improvement on new
 hardware because of NCQ support in ahci/siis/mvs drivers; improved
 functionality, including SATA Port Multipliers support, better hot-plug
 support; and reduced code duplication between ata(4) and cam(4)
 subsystems and applications.
 
 Two issues left at this moment are:
  1) POLA breakage due to disk device being renamed from adX to adaY;
  2) lack of an ataraid(4) alternative in the new infrastructure. It should be
 reimplemented in GEOM in some way, but it still hasn't been.
 
 So what is the public opinion: Is the lack of ataraid(4) fatal or we can
 live without it?
 
 Can we do switchover now, or some more reasons preventing this?
 

As noted earlier, pc98 and sparc64 need ada(4)/CAM ATA to perform
geometry translation as done by ad_firmware_geom_adjust() for ad(4),
which the following patch hooks up to both:
http://people.freebsd.org/~marius/ata_disk_firmware_geom_adjust.diff
You preferred to implement such functionality via XPT_CALC_GEOMETRY
though (I'm still not convinced that it makes sense to put this
functionality into every ATA SIM the same way it is done for SCSI
rather than letting ada(4) handle it the same way for all SIMs
however). Have you looked into implementing XPT_CALC_GEOMETRY for
ATA CAM or is it okay to commit the above patch?

Marius


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl

On Thu, Nov 06, 2003 at 12:22:45PM -0500, John Baldwin wrote:
 
 On 06-Nov-2003 Harti Brandt wrote:
  JBI figured out what is happenning I think.  You are getting a spurious
  JBinterrupt from the 8259A PIC (which comes in on IRQ 7).  The IRR register
  JBlists pending interrupts still waiting to be serviced.  Try using
  JB'options NO_MIXED_MODE' to stop using the 8259A's for the clock and see if
  JBthe spurious IRQ 7 interrupts go away.
  
  Ok, that seems to help. Interesting although why do these interrupts
  happen only with a larger HZ and when the kernel is doing printfs (this
  machine has a serial console). I have also not tried to disable SIO2 and
  the parallel port.
 
 Can you also try turning mixed mode back on and using
 http://www.FreeBSD.org/~jhb/patches/spurious.patch
 
 You should get some stray IRQ 7's in the vmstat -i output as well as a few
 printf's to the kernel console.
 

I think I'm seeing something related here, with the old interrupt code I
got:
...
Hit [Enter] to boot immediately, or any other key for command prompt.
Booting [/boot/kernel/kernel]...   
ACPI autoload failed - no such file or directory
stray irq 7
^^^
Copyright (c) 1992-2003 The FreeBSD Project.
...

With the new interrupt code I get:
...
OK boot
cpuid = 0; apic id = 00
instruction pointer = 0x0:0xa00
stack pointer   = 0x0:0xffe
frame pointer   = 0x0:0x0
code segment= base 0x0, limit 0x0, type 0x0
= DPL 0, pres 0, def32 0, gran 0
processor eflags= interrupt enabled, vm86, IOPL = 0
current process = 0 ()
kernel: type 30 trap, code=0
Stopped at  0xa00:  cli
db> tr
(null)(0,0,0,0,0) at 0xa00
...

However, if I enter 'continue' at the DDB prompt it continues to boot
and the system seems to run fine:

...
db> continue
SMAP type=01 base= len=0009f400
SMAP type=02 base=0009f400 len=0c00
SMAP type=02 base=000d len=0003
SMAP type=01 base=0010 len=1fdf
SMAP type=03 base=1fef len=f000
SMAP type=04 base=1feff000 len=1000
SMAP type=01 base=1ff0 len=0008
SMAP type=02 base=1ff8 len=0008
SMAP type=02 base=fec0 len=4000
SMAP type=02 base=fee0 len=1000
SMAP type=02 base=fff8 len=0008
Copyright (c) 1992-2003 The FreeBSD Project.
...

Neither the spurious interrupt patch nor setting 'options NO_MIXED_MODE'
makes a difference. This is on a Tyan Tiger MPX S2466N-4M board, a full
verbose boot log is at: http://quad.zeist.de/newintr.log
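
For readers unfamiliar with the "stray irq 7" message: the usual way to tell a spurious IRQ 7 from a real one on an 8259A is to look at bit 7 of the master PIC's In-Service Register. The sketch below only models that decision on a value that has already been read; the function name irq7_is_spurious() is hypothetical, and this is not the code from either patch mentioned in this thread.

```c
#include <stdint.h>

/*
 * Given the master 8259A's In-Service Register value, decide whether an
 * interrupt delivered on IRQ 7 was spurious. A real IRQ 7 has bit 7 set
 * in the ISR; a spurious one does not.
 */
static int
irq7_is_spurious(uint8_t isr)
{
	return ((isr & 0x80) == 0);
}
```

A handler would typically count and ignore spurious interrupts without sending an EOI for them.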


Re: New interrupt stuff breaks ASUS 2 CPU system

2003-11-10 Thread Marius Strobl

On Mon, Nov 10, 2003 at 02:12:56PM -0500, John Baldwin wrote:
 
 On 10-Nov-2003 John Hay wrote:
  
  With the new interrupt code I get:
  ...
  OK boot
  cpuid = 0; apic id = 00
  instruction pointer = 0x0:0xa00
  stack pointer   = 0x0:0xffe
  frame pointer   = 0x0:0x0
  code segment= base 0x0, limit 0x0, type 0x0
  = DPL 0, pres 0, def32 0, gran 0
  processor eflags= interrupt enabled, vm86, IOPL = 0
  current process = 0 ()
  kernel: type 30 trap, code=0
  Stopped at  0xa00:  cli
  db> tr
  (null)(0,0,0,0,0) at 0xa00
  ...
  
  However, if I enter 'continue' at the DDB prompt it continues to boot
  and the system seems to run fine:
  
  ...
  db> continue
  ...
  Copyright (c) 1992-2003 The FreeBSD Project.
  ...
  
  
  Now why didn't I think of trying 'continue'? Hey there my old dual
  Pentium I diskless machine is running in SMP mode.
 
 Can you try this patch:
 
 http://www.FreeBSD.org/~jhb/patches/atpic.patch
 

Works here, thanks!
Btw., I also get such a stray interrupt on my Sun U60, IIRC also from the
printer port :)


Re: g++ problem

2003-11-06 Thread Marius Strobl

On Thu, Nov 06, 2003 at 11:28:28AM -0500, Alexander Kabaev wrote:
 On Thu, 6 Nov 2003 16:55:00 +0100 (CET)
 C. Kukulies [EMAIL PROTECTED] wrote:
 
  I tried to compile a virus-scanner for Linux that allows for scanning
  Windoze PCs in a network for all sorts of recent viruses (RPC/DCOM and
  such).
  
  http://www.enyo.de/fw/software/doscan
  
  Compilation fails with the following:
  
  kukuboo2k# gmake
  g++ -g -O2 -Wall -I/usr/local/include -I. -I. -I./lib \
  -MMD -MF src/doscan.d \
  -c -o src/doscan.o src/doscan.cc
  In file included from src/doscan.cc:28:
  /usr/local/include/getopt.h:115: error: declaration of C function `int
  getopt()
 ' conflicts with
  /usr/include/unistd.h:377: error: previous declaration `int
  getopt(int, char*
 const*, const char*)' here
  gmake: *** [src/doscan.o] Error 1
  
  I wonder where /usr/local/include comes from. If I remove that it
  compiles smoothly.
 
 Uhm, from you command line? What _this_ has to do with a compiler?
 

This happens with g++ 3.x when the devel/libgnugetopt port is installed
and both its getopt.h and the base unistd.h are included. There are
several ports that have workarounds for this issue.
I have a patch for devel/libgnugetopt at
ftp://ftp.zeist.de/pub/patches/devel_libgnugetopt.diff
that should fix this issue by updating to the latest sources.
In my opinion the right thing to do is however to also include
getopt_long_only() in libc and not only getopt_long() so one can get
rid of the devel/libgnugetopt port. I have a patch for this at
ftp://ftp.zeist.de/pub/patches/src_getopt_long_only.diff
When I have time I'll continue testing of both and eventually submit
PRs.
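
As a point of comparison, a program that only needs long options can use the getopt_long(3) already provided by libc, without the devel/libgnugetopt header. A minimal sketch; the function parse_args() and the option names "verbose" and "level" are made up for illustration:

```c
#include <getopt.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Parse "--verbose" and "--level N" with getopt_long(3).
 * Returns 0 on success, -1 on an unknown option or missing argument.
 */
static int
parse_args(int argc, char *argv[], int *verbose, int *level)
{
	static const struct option longopts[] = {
		{ "verbose", no_argument,       NULL, 'v' },
		{ "level",   required_argument, NULL, 'l' },
		{ NULL,      0,                 NULL, 0   }
	};
	int ch;

	while ((ch = getopt_long(argc, argv, "vl:", longopts, NULL)) != -1) {
		switch (ch) {
		case 'v':
			*verbose = 1;
			break;
		case 'l':
			*level = atoi(optarg);
			break;
		default:
			return (-1);
		}
	}
	return (0);
}
```

getopt_long_only() differs in that it also accepts long options introduced by a single dash, which is the part discussed above as missing from libc at the time.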


Re: g++ problem

2003-11-06 Thread Marius Strobl

On Thu, Nov 06, 2003 at 11:51:12AM -0500, Alexander Kabaev wrote:
 On Thu, 6 Nov 2003 17:44:59 +0100
 Marius Strobl [EMAIL PROTECTED] wrote:
 
  This happens with g++ 3.x ...
 This will happen with g++ 3.x, 2.x, 1.x and future 4.x too. I.e. the
 GCC is not at fault and the subject of the original message is
 misleading.
 

It's at least not fatal with g++ 2.95.4 on 4-stable, that's what I
meant. But yes, GCC is not at fault.


Re: panic with cdrecord-devel @ Mon Oct 20 21:28:57 EEST 2003

2003-10-21 Thread Marius Strobl

On Tue, Oct 21, 2003 at 10:27:11AM +0300, Paulius Bulotas wrote:
 Hello,
 
 5.1-CURRENT #0: Mon Oct 20 21:28:57 EEST 2003
 
 % cdrecord -scanbus
 panics, and trace looks like:
 vmapbuf
 cam_periph_mapmem
 xptioctl
 spec_ioctl
 spec_vnoperate
 vn_ioctl
 ioctl
 syscall(2f,2f,2f,,3)
 Xint0x80_syscall
 
 Any ideas?
 

I was told that a `camcontrol devlist` triggers a panic in
sys/kern/vfs_bio.c:3729 on recent -current and that turning off
INVARIANTS and WITNESS avoids it. This may be related.


Re: Random signals in {build,install}world recently?

2003-10-20 Thread Marius Strobl

On Mon, Oct 20, 2003 at 05:08:26PM +0200, Christian Brueffer wrote:
 On Mon, Oct 20, 2003 at 10:50:02AM -0400, Barney Wolff wrote:
  On Mon, Oct 20, 2003 at 03:20:56PM +0200, Mark Santcroos wrote:
   On Mon, Oct 20, 2003 at 10:27:38AM +0200, Harti Brandt wrote:
On Mon, 20 Oct 2003, Vallo Kallaste wrote:

VKHi
VK
VKIt seems to be a recent problem. The hardware is OK, both Windows XP
VK(which I use very seldom) and Gentoo Linux do not exhibit any
VKproblems.
VKBasically one will get random signals as I have got in build- and
VKinstallworld. It's impossible to complete make -j2 buildworld on my
VKmachine, but sometimes non-parallel buildworld will do, only to die
VKlater in installworld.
VKThis is on two-processor AMD 2400+ MP system, ASUS A7M-266D mobo and
VK1GB ECC memory, ATA disks and CD/RW-DVD only. 4BSD scheduler if it
VKmatters.

I have the same MB just with 1800+ processors. I had to reduce the CPU
frequency by about 10% in the BIOS setup to get the machine stable. I
assume the problem is actually the memory.
   
   Couldn't the following be of help here?
   
   options DISABLE_PSE
   options DISABLE_PG_G
  
  I don't think so.  I tried that on my A7M266D with no effect.  I believe
  something in recent pmap code doesn't like this mobo, or maybe dual
  athlons in general.  I can run RELENG_5_1 rock solid, and -current from
  9/24/03 rock solid, but -current from 10/3 or later gets random sigs
  and eventually panics.  I have scsi disks so it's not ata.
  
 
 I have the same experiences.  Also AMD A7M-266D with two 1800+ Athlons here.
 Used to work fine, but got random signals with my latest builds.
 

Here, too. However, on a Tyan Thunder K7 with MP 1200 and a Tyan Tiger MPX
with MP 1600+. In addition to random signals I also get ICEs from GCC at
random places and freezes with `make -jX buildworld` but no panics.


Re: panic with cdrecord -- anybody else seeing this? [backtrace obtained]

2003-10-12 Thread Marius Strobl

On Sun, Oct 12, 2003 at 09:18:08PM -0700, Kris Kennaway wrote:
 On Sun, Oct 12, 2003 at 06:32:21PM -0700, John Reynolds wrote:
  Hi all, forgive me if I give incomplete information. This is the first time
  I've created a debugging kernel and gotten a dump after a panic, so I might not
  have done everything right.
  
  Ever since the tail end of July it seems, any time I've tried to burn a CD with
  cdrecord (cdrtools 2.0.3 from ports) I get a panic
  
vm_fault_copy_wired: page missing
  
  General busy-ness and the thought that somebody will see it too and fix it
  has prevented me from caring too much about it until now, but it seems it's
  still there in the kernel from Oct 11th, and I figured I might as well try to
  provide somebody some information ..
 
 Thanks..Alan made a commit which he thought might have fixed this, but
 someone else also claimed it did not.
 
 See also
 
   http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/56380
 

And
http://www.freebsd.org/cgi/query-pr.cgi?pr=kern/57611

The latter is a bit more detailed and correct (it's not limited to ATAPI
burners). It also doesn't seem to be limited to cdrecord, the latest
ntpd also causes a panic when using mlockall(2) as reported on this list,
however the backtrace looks different.
Btw., the cdrtools-devel port contains a workaround.


Re: X does not work on todays current with ATI

2003-10-05 Thread Marius Strobl

On Sun, Oct 05, 2003 at 04:56:53PM +0200, Matt Douhan wrote:
 Hello
 
 I am unable to start X with a current as of today 08.00 CEST, it crashes to
 the debugger with a fatal trap 12, I poked around to see if anything useful
 was in the logs but I could not find anything, can you please advice what
 log I could send to aid in hunting down this problem?
 

Me, too

GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as i386-undermydesk-freebsd...
panic: from debugger
panic messages:
---
Fatal trap 12: page fault while in kernel mode
cpuid = 0; lapic.id = 0100
fault virtual address   = 0x1c
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc0503499
stack pointer   = 0x10:0xdc91ab9c
frame pointer   = 0x10:0xdc91abb8
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 582 (XFree86)
panic: from debugger
cpuid = 0; lapic.id = 0100
boot() called on cpu#0
Uptime: 28m17s
Dumping 512 MB
 16 32 48 64 80 96 112 128 144 160 176 192 208 224 240 256 272 288 304 320 336 352 368 
384 400 416 432 448 464 480 496
---
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc050c3a3 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:372
#2  0xc050c788 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
#3  0xc043eff2 in db_panic () at /usr/src/sys/ddb/db_command.c:450
#4  0xc043ef6a in db_command (last_cmdp=0xc06d8e00, cmd_table=0x0, 
aux_cmd_tablep=0xc0693a6c, aux_cmd_tablep_end=0xc0693a70)
at /usr/src/sys/ddb/db_command.c:346
#5  0xc043f078 in db_command_loop () at /usr/src/sys/ddb/db_command.c:472
#6  0xc0441db9 in db_trap (type=12, code=0) at /usr/src/sys/ddb/db_trap.c:73
#7  0xc0623143 in kdb_trap (type=12, code=0, regs=0xdc91ab5c)
at /usr/src/sys/i386/i386/db_interface.c:171
#8  0xc063bb96 in trap_fatal (frame=0xdc91ab5c, eva=0)
at /usr/src/sys/i386/i386/trap.c:814
#9  0xc063b881 in trap_pfault (frame=0xdc91ab5c, usermode=0, eva=28)
at /usr/src/sys/i386/i386/trap.c:733
#10 0xc063b453 in trap (frame=
  {tf_fs = 24, tf_es = 16, tf_ds = 16, tf_edi = -1066886654, tf_esi = 1645, tf_ebp 
= -594433096, tf_isp = -594433144, tf_ebx = 0, tf_edx = 0, tf_ecx = 1, tf_eax = 0, 
tf_trapno = 12, tf_err = 0, tf_eip = -1068485479, tf_cs = 8, tf_eflags = 66118, tf_esp 
= 1, tf_ss = -997343232}) at /usr/src/sys/i386/i386/trap.c:418
#11 0xc0624a48 in calltrap () at {standard input}:103
#12 0xc05fb35e in vm_page_zero_invalid (m=0x66d, setvalid=1)
at /usr/src/sys/vm/vm_page.c:1645
---Type return to continue, or q return to quit--- 
#13 0xc05ebca2 in vm_fault (map=0xc19216e4, vaddr=674037760, 
fault_type=1 '\001', fault_flags=0) at /usr/src/sys/vm/vm_pager.h:131
#14 0xc063b7b6 in trap_pfault (frame=0xdc91ad48, usermode=1, eva=674041472)
at /usr/src/sys/i386/i386/trap.c:709
#15 0xc063b364 in trap (frame=
  {tf_fs = 47, tf_es = 47, tf_ds = 47, tf_edi = -1077937450, tf_esi = 674041472, 
tf_ebp = -1077937480, tf_isp = -594432652, tf_ebx = 674037760, tf_edx = 2, tf_ecx = 2, 
tf_eax = -1077937450, tf_trapno = 12, tf_err = 4, tf_eip = 673893577, tf_cs = 31, 
tf_eflags = 66050, tf_esp = -1077937532, tf_ss = 47})
at /usr/src/sys/i386/i386/trap.c:317
#16 0xc0624a48 in calltrap () at {standard input}:103
---Can't read userspace from dump, or kernel process---


Re: ATAng still problematic

2003-09-19 Thread Marius Strobl

On Thu, Sep 18, 2003 at 05:51:25PM +0200, Jan Srzednicki wrote:
 
 Anyway, here's backtrace for atapicam panic I've mentioned. It's
 triggered by:
 
 cdrecord dev=1,1,0 /some/track
 

This panic isn't ATAPICAM related. Could you try the patch below? It's
against the cdrtools-devel port but should also work with the cdrtools
port.


Index: files/patch-conf::configure
===================================================================
RCS file: files/patch-conf::configure
diff -N files/patch-conf::configure
--- /dev/null   1 Jan 1970 00:00:00 -0000
+++ files/patch-conf::configure 19 Sep 2003 16:03:35 -0000
@@ -0,0 +1,10 @@
+--- conf/configure.orig	Fri Sep 19 16:47:37 2003
++++ conf/configure	Fri Sep 19 16:49:26 2003
+@@ -5567,6 +5567,7 @@
+ int
+ main()
+ {
++  exit(1);
+   if (mlockall(MCL_CURRENT|MCL_FUTURE) < 0) {
+   if (errno == EINVAL || errno ==  ENOMEM ||
+   errno == EPERM  || errno ==  EACCES)
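
The patch above relies on the configure probe being a run test: the feature is treated as available only if the probe program exits with status 0, so an unconditional exit(1) at the top makes configure conclude that mlockall(2) is unusable. A minimal sketch of that decision logic follows, in Python rather than shell. The names `probe`, `original_probe` and `patched_probe` are hypothetical, and the Python one-liners merely stand in for the compiled C test program.

```python
import subprocess
import sys


def probe(source):
    """Run a probe program; the feature counts as present only on exit status 0."""
    result = subprocess.run([sys.executable, "-c", source])
    return result.returncode == 0


# Stand-in for the unpatched probe on a system where mlockall() succeeds.
original_probe = "import sys; sys.exit(0)"
# Stand-in for the patched probe: exit(1) runs before mlockall() is ever called.
patched_probe = "import sys; sys.exit(1)"

have_mlockall_before = probe(original_probe)
have_mlockall_after = probe(patched_probe)
```

With the patched probe the check fails without ever calling mlockall(2), so cdrecord is built without it.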

Re: ATAng still problematic

2003-09-19 Thread Marius Strobl

On Fri, Sep 19, 2003 at 04:36:32PM -0700, Kris Kennaway wrote:
> On Fri, Sep 19, 2003 at 06:21:52PM +0200, Marius Strobl wrote:
> > On Thu, Sep 18, 2003 at 05:51:25PM +0200, Jan Srzednicki wrote:
> > >
> > > Anyway, here's backtrace for atapicam panic I've mentioned. It's
> > > triggered by:
> > >
> > > cdrecord dev=1,1,0 /some/track
> > >
> >
> > This panic isn't ATAPICAM related. Could you try the patch below? It's
> > against the cdrtools-devel port but should also work with the cdrtools
> > port.
>
> Isn't it still a kernel bug if a user process can trigger a panic?
>

Yes, it seems to be a bug in the mlockall(2) implementation. Backing
it out or preventing cdrecord from using it avoids the panic. I already
wrote an email to bms@, who committed the mlockall(2) and munlockall(2)
support, regarding this issue.
The patch for the cdrtools ports is only a workaround until the real
cause is fixed. I also was not sure if it would work for Bryan as I
originally didn't get the same panic as he did.


Re: ATAng still problematic

2003-09-19 Thread Marius Strobl

On Sat, Sep 20, 2003 at 01:47:44AM +0100, Bruce M Simpson wrote:
> On Sat, Sep 20, 2003 at 02:17:21AM +0200, Marius Strobl wrote:
> > > Isn't it still a kernel bug if a user process can trigger a panic?
> >
> > Yes, it seems to be a bug in the mlockall(2) implementation. Backing
> > it out or hindering cdrecord to use it avoids the panic. I already
> > wrote an email to bms@ who commited the mlockall(2) and munlockall(2)
> > support regarding this issue.
>
> I don't think that's been conclusively established yet, so statements
> of the form above are a bit unhelpful.
>

Ok, sorry.

> The problem may well lie elsewhere in the system, as a parameter in
> vm_map_copy_entry() is being unexpectedly set to NULL in the backtrace
> which you provided me with.
>

It's just certainly not ATAng or ATAPICAM as I get this panic on a
SCSI-only box, too.

> If more people can exercise the same codepath as you appear to be
> exercising with different configurations, then I will have more to go on.
>

Re: Question about genassym, locore.s and 0-sized arrays (showstopper for an icc compiled kernel)

2003-09-05 Thread Marius Strobl

On Fri, Sep 05, 2003 at 07:34:39PM +1000, Bruce Evans wrote:
> On Fri, 5 Sep 2003, I wrote:
>
> > ...
> > If some values are unrepresentable then they need to be represented
> > using other values.  E.g., add 1 to avoid 0, or multiply by the alignment
> > size if some element of the tool chain insists on rounding up things
> > for alignment like a broken aout version used to do.  16-bit values
> > would need 17 bits to represent after adding 1.
>
> Better, add 0x10000 to avoid 0.  awk has no support for parsing hex numbers
> so subtracting the bias of 1 would take a lot more code, but ignoring
> leading hexdigits requires no changes in genassym.sh -- it already ignores
> everything except the last 4 hexdigits.
>

This works, too. Thanks for the detailed explanation Bruce!
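
The bias trick discussed above can be illustrated with a short sketch. It assumes a bias of 0x10000 (which is what makes "ignore everything except the last 4 hexdigits" work) and an 8-digit hex size column; the names `BIAS`, `biased_size` and `last_four_hexdigits` are hypothetical and do not come from genassym.sh.

```python
BIAS = 0x10000  # assumed bias: one more than the largest 16-bit value


def biased_size(word):
    """Array size used to export one 16-bit word; never zero thanks to the bias."""
    assert 0 <= word <= 0xFFFF
    return word + BIAS


def last_four_hexdigits(size):
    """Recover the word by keeping only the last 4 hex digits of the symbol size."""
    return int(format(size, "08x")[-4:], 16)


# Every 16-bit word survives the round trip, and no exported size is zero.
round_trip_ok = all(
    last_four_hexdigits(biased_size(w)) == w and biased_size(w) > 0
    for w in (0, 1, 6, 0x1234, 0xFFFF)
)
```

Because the bias only touches the fifth hex digit, truncating to four digits removes it without any hex arithmetic on the script side.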


Re: Question about genassym, locore.s and 0-sized arrays (showstopper for an icc compiled kernel)

2003-09-04 Thread Marius Strobl

On Thu, Sep 04, 2003 at 03:47:09PM -0700, Marcel Moolenaar wrote:
 
> We use the size of the symbol (ie the size of the object identified
> by the symbol) to pass around values. This we do by creating arrays.
> If we want to export a C constant 'FOOBAR' to assembly and the constant
> is defined to be 6, then we create an array for the sign, of which the
> size is 1 for negative numbers and 0 otherwise. In this case the array
> will be named FOOBARsign and its size is 0. We also create 4 arrays (*w0,
> *w1, *w2 and *w3), each with a maximum of 64K and corresponding to the
> 4 16-bit words that constitutes a single 64-bit entity.
> In this case
>   0006 C FOOBARw0
>   0000 C FOOBARw1
>   0000 C FOOBARw2
>   0000 C FOOBARw3
>
> If the compiler creates arrays of size 1 for arrays we define as a
> zero-sized array, you get exactly what you've observed.
>

Is this rather complex approach really necessary? I have successfully
generated assym.s files using genassym.sh(8) from NetBSD with both ICC
and GCC on i386, and they contain exactly the same values as one
generated with sys/kern/genassym.sh from FreeBSD. NetBSD's genassym.sh(8)
more or less exports the C constants directly, so it needs just one
symbol per constant and doesn't require zero-sized arrays. Given that
it's from NetBSD, their approach should also be very MI.
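
For reference, the size-based encoding Marcel describes above can be sketched as follows. This is only a model of the scheme as quoted (a sign array plus four 16-bit word arrays); the helper names `encode` and `decode` are hypothetical, and the sketch assumes the words carry the magnitude of the value.

```python
def encode(value):
    """Split a constant into a sign-array size and four 16-bit word-array sizes."""
    sign = 1 if value < 0 else 0
    magnitude = abs(value)
    # w0 is the least significant word, w3 the most significant.
    words = [(magnitude >> (16 * i)) & 0xFFFF for i in range(4)]
    return sign, words


def decode(sign, words):
    """Rebuild the constant from the array sizes, as the script side would."""
    magnitude = sum(word << (16 * i) for i, word in enumerate(words))
    return -magnitude if sign else magnitude


foobar_sign, foobar_words = encode(6)
```

For FOOBAR = 6 every array except FOOBARw0 has size 0, which is why a compiler that turns zero-sized arrays into arrays of size 1 corrupts the exported value.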

