from:"Masanobu SAITOH"

heartbeat panic by heavy traffic

2023-09-14 Thread Masanobu SAITOH

Hi.

I see the following heartbeat panic when a machine is forwarding
a heavy load of short packets:

[ 745.0068385] cpu14: found cpu15 heart stopped beating after 16 seconds
[ 745.0068385] panic: cpu15: softints stuck for 16 seconds
[ 745.0168386] cpu15: Begin traceback...
[ 745.0168386] cpu14: found cpu15 heart stopped beating after 16 seconds
[ 745.0268387] vpanic() at cpu14: found cpu15 heart stopped beating after 16 
seconds
[ 745.0268387] netbsd:vpanic+0x173
[ 745.0368390] cpu14: found cpu15 heart stopped beating after 16 seconds
[ 745.0368390] panic() at cpu14: found cpu15 heart stopped beating after 16 
seconds
[ 745.0468390] netbsd:panic+0x3c
[ 745.0468390] heartbeat() at netbsd:heartbeat+0x353
[ 745.0568392] hardclock() at netbsd:hardclock+0x8b
[ 745.0668393] Xresume_lapic_ltimer() at netbsd:Xresume_lapic_ltimer+0x1e
[ 745.0668393] --- interrupt ---
[ 745.0768393] psref_release() at netbsd:psref_release+0x83
[ 745.0768393] ipintr() at netbsd:ipintr+0xef
[ 745.0868396] softint_dispatch() at netbsd:softint_dispatch+0x103
[ 745.0868396] DDB lost frame for netbsd:Xsoftintr+0x4c, trying 
0x8288589fc0f0
[ 745.0968395] Xsoftintr() at netbsd:Xsoftintr+0x4c
[ 745.0968395] --- interrupt ---
[ 745.1068397] f9faeac0f5baeac4:
[ 745.1068397] cpu15: End traceback...
[ 745.1068397] fatal breakpoint trap in supervisor mode
[ 745.1168399] trap type 1 code 0 rip 0x80235425 cs 0x8 rflags 0x202 
cr2 0 ilevel 0x7 rsp 0x8288589fbc68
[ 745.1268401] curlwp 0xd8070facf6c0 pid 0.175 lowest kstack 
0x8288589f72c0
Stopped in pid 0.175 (system) at netbsd:breakpoint+0x5:  leave
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x173
panic() at netbsd:panic+0x3c
heartbeat() at netbsd:heartbeat+0x353
hardclock() at netbsd:hardclock+0x8b
Xresume_lapic_ltimer() at netbsd:Xresume_lapic_ltimer+0x1e
--- interrupt ---
psref_release() at netbsd:psref_release+0x83
ipintr() at netbsd:ipintr+0xef
softint_dispatch() at netbsd:softint_dispatch+0x103
DDB lost frame for netbsd:Xsoftintr+0x4c, trying 0x8288589fc0f0
Xsoftintr() at netbsd:Xsoftintr+0x4c
(snip)

wm and ixg have a hw.{wm,ixg}N.txrx_workqueue sysctl.
If we change it from 0 to 1, we can avoid the panic, but many drivers
have no such way to avoid the problem.

I think it would be good to change the default behavior from
panic to something else, because the GENERIC kernel enables HEARTBEAT
by default. One idea is to print a warning message at sufficient intervals.
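
The "warning at sufficient intervals" idea could look roughly like the plain C model below. It is only a sketch, not the kernel's heartbeat code: stall_warn_due(), struct stall_state and WARN_INTERVAL_SEC are hypothetical names, the 60-second interval is an assumption, and the 16-second threshold is taken only from the panic message above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical tunables; the kernel's real knobs may differ. */
#define STALL_THRESHOLD_SEC 16   /* taken from the panic message */
#define WARN_INTERVAL_SEC   60   /* assumed "sufficient interval" */

struct stall_state {
	uint64_t last_warn_sec;  /* time of the last warning */
	bool     warned;         /* false until the first warning */
};

/*
 * Return true if a warning should be printed now for a CPU whose
 * heartbeat was last seen at last_beat_sec.  Assumes that
 * now_sec >= last_beat_sec.  Never panics.
 */
static bool
stall_warn_due(struct stall_state *st, uint64_t now_sec, uint64_t last_beat_sec)
{
	if (now_sec - last_beat_sec < STALL_THRESHOLD_SEC) {
		st->warned = false;     /* CPU recovered: reset state */
		return false;
	}
	if (st->warned && now_sec - st->last_warn_sec < WARN_INTERVAL_SEC)
		return false;           /* still stuck, but warned recently */
	st->warned = true;
	st->last_warn_sec = now_sec;
	return true;
}
```

With this shape the caller prints one message when the stall is first detected and then at most one per interval while it persists, instead of calling panic().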

 Regards.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: cpu temperature readings

2023-07-07 Thread Masanobu SAITOH

Hi, all.

Could you test the following diff?

http://www.netbsd.org/~msaitoh/coretemp-20230707-0.dif

Here is the draft of the commit message:
--
coretemp(4): Change limits of Tjmax.

 - Change the lower limit from 70 to 60. At least, some BIOSes can change
   the value down to 62.
 - Change the upper limit from 110 to 120. At least, some BIOSes can change
   the value up to 115.
 - Print error message when rdmsr(TEMPERATURE_TARGET) failed.
#if 1
 - Print error message when Tjmax exceeded the limit.
#else
 - When Tjmax exceeded the limit, print warning message and use the value
   as it is.
#endif
--

In the "#if 1" variant, the default value (100) is used for Tjmax if the value
read exceeds the limit. That is the same as before except for the range of
the limit. In the "#else" variant, the value read is used as it is even if it
exceeds the limit.
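
To make the two alternatives concrete, here is a small C sketch. It is not the code in the diff: tjmax_select() and the macro names are hypothetical, and treating the limits 60 and 120 as inclusive bounds is an assumption.

```c
#include <stdbool.h>

#define TJMAX_LIMIT_LOW   60   /* proposed lower limit */
#define TJMAX_LIMIT_HIGH  120  /* proposed upper limit */
#define TJMAX_DEFAULT     100  /* default value named in the mail */

/*
 * Pick the Tjmax to use from the value read out of the MSR.
 * strict == true  models the "#if 1" variant (fall back to the default);
 * strict == false models the "#else" variant (keep the value as read).
 * *out_of_range tells the caller whether a message should be printed.
 */
static int
tjmax_select(int msr_tjmax, bool strict, bool *out_of_range)
{
	*out_of_range = (msr_tjmax < TJMAX_LIMIT_LOW ||
	    msr_tjmax > TJMAX_LIMIT_HIGH);
	if (*out_of_range && strict)
		return TJMAX_DEFAULT;
	return msr_tjmax;
}
```

The only difference between the variants is what happens to an out-of-range value; in-range values such as 62 or 115 pass through unchanged in both.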

Which one do you think is better?

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: cpu temperature readings

2023-06-28 Thread Masanobu SAITOH

Hi.

On 2023/06/28 14:24, Michael van Elst wrote:
> k...@munnari.oz.au (Robert Elz) writes:
> 
>> cpu0: "12th Gen Intel(R) Core(TM) i9-12900KS"
> 
> The chip apparently reports a Tjmax of 100 C (as for the non-selected chip)
> but actually has a real Tjmax of 115 C.

https://ark.intel.com/content/www/us/en/ark/products/225916/intel-core-i912900ks-processor-30m-cache-up-to-5-50-ghz.html

ark.intel.com often shows incorrect values. Looking at this page now,
it says Tjmax is 90 degrees.

Robert, could you show me the output of:

dmesg -t | grep Tjmax

It seems that the MSR_TEMPERATURE_TARGET's value is not fixed
on newer chips. Please test the following diff:

https://www.netbsd.org/~msaitoh/coretemp-20230628-0.dif

Thanks in advance.

> There are two caveats:
> 
> Our driver ignores Tjmax of > 110 C (and uses 100 C as default). If the
> chip would report the real value, we would ignore it.
> 
> Intel recommends that the BIOS fakes the value and configures the MSR ten
> degrees lower (so you see Tjmax of 90 C).
> 
> 
> The temperature sensor reading is relative to Tjmax.
> 
> /*
>  * The temperature is computed by
>  * subtracting the reading by Tj(max).
>  */
> edata->value_cur = sc->sc_tjmax;
> edata->value_cur -= __SHIFTOUT(msr, MSR_THERM_STATUS_READOUT);
> 
> 
> So it could be 15C lower than reality (if the default of 100 instead
> of 115 is used) or even 25C lower (if the Intel recommendation
> is followed).
> 
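
The arithmetic in the quoted text can be written out as a tiny C sketch. The function names are hypothetical, and any readout value passed in is an example, not a measurement.

```c
/* Temperature as the driver computes it: Tjmax minus the MSR readout. */
static int
coretemp_value(int tjmax, int readout)
{
	return tjmax - readout;
}

/* How far the reported value is below reality when the wrong Tjmax is used. */
static int
coretemp_error(int real_tjmax, int assumed_tjmax)
{
	return real_tjmax - assumed_tjmax;
}
```

For an example readout of 40, a real Tjmax of 115 gives 75 C, while the default of 100 gives 60 C (15 C low) and a BIOS-lowered value of 90 gives 50 C (25 C low).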

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: sdmmc_mem_enable failed with error 60

2023-01-06 Thread Masanobu SAITOH

Hi.

On 2021/01/29 17:50, Stephen Borrill wrote:
> On Tue, Mar 24, 2020 at 03:48:03PM +, Patrick Welche wrote:
>> Last time I played with my raspberry pi zero w, I couldn't see the network
>> card and saw
>>
>>   sdmmc_mem_enable failed with error 60
>>
>> Now I'm seeing the same thing on a new amd64 laptop trying to use
>> another new 32GB microsd card. I opened kern/54959 in the rpi0w
>> case.
>>
>> The laptop has a
>>
>> rtsx0 at pci6 dev 0 function 0: Realtek Semiconductor RTS525A PCI-E Card 
>> Reader (rev. 0x01)
>> rtsx0: interrupting at msi2 vec 0
>> sdmmc0 at rtsx0
> 
> I'm testing a Dell Latitude 3190 and its hard drive is on sdmmc0 so I have no 
> storage:
> 
> [ 1.016863] sdhc0 at pci0 dev 28 function 0: Intel Gemini Lake eMMC (rev. 
> 0x06)
> [ 1.016863] sdhc0: interrupting at ioapic0 pin 39
> [ 1.016863] sdhc0: SDHC 3.0, rev 16, SDMA, 20 kHz, embedded slot, HS 
> SDR50 DDR50 SDR104 HS200 1.8V, re-tuning mode 1 (128s timer), 2048 byte blocks
> [ 1.016863] sdmmc0 at sdhc0 slot 0
> [ 4.914910] sdmmc0: sdmmc_mem_enable failed with error 60
> [ 4.924910] sdmmc0: autoconfiguration error: couldn't enable card: 60
> 
> Full dmesg (without SDMMC_DEBUG):
> http://www.netbsd.org/~sborrill/dmesg.lt3190
> 
> And acpidump:
> http://www.netbsd.org/~sborrill/acpidump.lt3190
> 

Could everyone who has the same problem try the latest -current?
I added quirks to sdhc_pci.c yesterday:

http://mail-index.netbsd.org/source-changes/2023/01/05/msg142634.html

Thanks in advance.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Strange error with "netstat -i 1"

2022-11-20 Thread Masanobu SAITOH




On 2022/11/18 23:18, Takahiro Kambe wrote:
> Oh, I just noticed.

Thank you for fixing it.

> diff --git a/usr.bin/netstat/if.c b/usr.bin/netstat/if.c
> index e33f84f324..b02df5fc3b 100644
> --- a/usr.bin/netstat/if.c
> +++ b/usr.bin/netstat/if.c
> @@ -176,7 +176,7 @@ if_data_ext_get(const char *ifname, struct if_data_ext 
> *dext)
>  {
>   char namebuf[1024];
>   size_t len;
> - int drops;
> + int64_t drops;
>  
>   /* For sysctl */
>   snprintf(namebuf, sizeof(namebuf),
> 
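
For readers wondering why the type matters, here is a toy C model of a size-checked read. It assumes the sysctl node exports a 64-bit value, which the fix suggests but the mail does not state. toy_sysctl_read64() is a hypothetical stand-in and not the real sysctl(3) interface (which returns -1 and sets errno).

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Toy stand-in for a sysctl read of a 64-bit counter: copies the value
 * into the caller's buffer only if the buffer is large enough, and
 * returns ENOMEM otherwise.
 */
static int
toy_sysctl_read64(int64_t kernel_value, void *oldp, size_t *oldlenp)
{
	if (*oldlenp < sizeof(kernel_value)) {
		*oldlenp = sizeof(kernel_value);
		return ENOMEM;          /* caller's buffer is too small */
	}
	memcpy(oldp, &kernel_value, sizeof(kernel_value));
	*oldlenp = sizeof(kernel_value);
	return 0;
}
```

In this model a caller passing an int buffer gets an error, while a caller passing an int64_t buffer gets the value.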

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Weird clock behaviour with current (amd64) kernel

2022-07-18 Thread Masanobu SAITOH

Hi,

On 2022/07/15 21:38, Robert Elz wrote:
> Date:Fri, 15 Jul 2022 13:32:55 +0900
> From:    Masanobu SAITOH 
> Message-ID:  <87553319-950b-7dad-ac64-29d2c25d1...@execsw.org>
> 
>   | Could you show me the full dmesg with verbose output?
> 
> I could, but it turns out to be about a megabyte, so unless you need
> to see all that noise, perhaps just this will help:
> 
> jacaranda$ grep -i TSC ~/dmesg*cold
> [ 1.03] cpu0: Use lfence to serialize rdtsc
> [ 1.03] cpu0: TSC freq CPUID 341760 Hz
> [ 1.031552] cpu0: TSC freq CPUID 341760 Hz
> [ 1.031552] cpu0: TSC freq calibrated 10483939000 Hz
> [ 1.457151] timecounter: Timecounter "TSC" frequency 10483939000 Hz 
> quality 3000
> 
> When that was running, I did a
>   while sleep 1; do date; done
> loop, and watched it compared with time from my phone.
> The date lines showed nicely incrementing seconds, but each
> "sleep 1" lasted for (about) 3 seconds.
> 
> I switched from TSC to hpet0 and repeated the loop, the sleeps
> still slept for 3 seconds, but now time went up in steps of 3.
> (ie: time of day keeping became accurate, internal relative times
> remained broken).
> 
> It is difficult to believe that the ratio 10483939000/341760 (== 3.067)
> is not related here.

Yes, it should be related to the problem.
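
The ratio can be checked with a one-line C helper. This sketch assumes the CPUID-reported frequency is 3417600000 Hz: the archived "341760" looks truncated, and the full value is inferred from the quoted ratio of 3.067 and from the generic kernel's calibrated 3417601000 Hz. tsc_ratio() is a hypothetical name.

```c
/* Ratio between the calibrated TSC frequency and the CPUID-reported one. */
static double
tsc_ratio(double calibrated_hz, double cpuid_hz)
{
	return calibrated_hz / cpuid_hz;
}
```

Under that assumption, tsc_ratio(10483939000.0, 3417600000.0) is a little under 3.07, which lines up with each "sleep 1" taking about 3 seconds, while the generic kernel's tsc_ratio(3417601000.0, 3417600000.0) is very close to 1.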

> A different boot produced this instead
> 
> [ 1.03] cpu0: TSC freq CPUID 341760 Hz
> [ 1.031706] cpu0: TSC freq CPUID 341760 Hz
> [ 1.031706] cpu0: TSC freq calibrated 10491257000 Hz
> [ 1.345506] timecounter: Timecounter "TSC" frequency 10491257000 Hz 
> quality 3000
> 
> A similar calibrated value, but not identical.   That kind of looks like
> the PCI_CONFIG_DUMP output/scrolling/something related is interfering with
> the calibration ...

Another possibility is that the LAPIC_ICR_TIMER register is
not set correctly in lapic_calibrate_timer(). The function is called
twice and it refers to the "lapic_per_second" global variable. The code is
a little tricky.

> all the messages with dmesg timestamps of 1.03xxx are
> being produced while all of that is happening.   The 1.000 message is before
> the config dump starts, the 1.3455 or 1.457... messages appear after the
> config dump has ended.
> 
>   | i.e. add -v option to the boot command in /boot.cfg.
> 
> I can't do exactly that, as I cannot find a boot.cfg file anywhere that
> gets used (this is another issue I'm having, which I was going to ask
> about sometime) - I have been modifying the banner= strings on every one
> I can find (turns out I have a lot of them... none being used) so I know
> when one is picked.   None of the boot.cfg files I can find has the ascii
> art flag in the banner - the menu printed by efiboot does have that.
> 
> So, instead I simply hand typed a boot command, with -v at the end.   I'm
> not sure that worked (that is, the system certainly booted, but I don't
> know that the -v flag worked).

It worked. "TSC freq CPUID" is printed in cpu_tsc_freq_cpuid().

aprint_verbose_dev(ci->ci_dev, "TSC freq CPUID %" PRIu64
" Hz\n", freq);

aprint_verbose*() messages are printed when -v is set.

> If you'd like to see the complete dmesg I can make that available (either
> one from a cold boot, or the subsequent one, from a reboot, where the
> previous boot's dmesg is still in the buffer - that file is about 2MB).
> 
> For comparison, when I boot generic (built from the same kernel sources,
> but no pci config dump of course) what I see is:
> 
> [ 1.03] cpu0: Use lfence to serialize rdtsc
> [ 1.03] cpu0: TSC freq CPUID 341760 Hz
> [ 1.059545] cpu0: TSC freq CPUID 341760 Hz
> [ 1.059545] cpu0: TSC freq calibrated 3417601000 Hz
> [ 2.106529] timecounter: Timecounter "TSC" frequency 3417601000 Hz 
> quality 3000
> 
> That there is no info not there which is included above, is why I
> suspect the -v might not have worked .. of course, this is all from a
> simple grep (using -i, which is why that rdtsc line is included).
> Of course, if the additional info you are looking for doesn't contain "tsc"
> (or TSC) then I wouldn't have found it, in which case either I give you
> the whole dmesg,

I'd like to see the debug (-x) output, too.
I'll mail you privately.


> or you supply a different grep pattern (if I can find one
> line, I can easily extract any relevant looking surrounding lines).
> 
> kre
> 

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Weird clock behaviour with current (amd64) kernel

2022-07-14 Thread Masanobu SAITOH




On 2022/07/14 22:59, Robert Elz wrote:
> Hi,
> 
> I just booted a kernel that I built (from up to date at the time)
> HEAD sources about 24 hours ago.
> 
> Everything seemed to be working fine - until I noticed that all of
> my clocks (there are several, gkrellm, window manager, a dclock,
> and an xtu) were all wildly wrong (as in, were moving time forwards
> incredibly slowly).
> 
> You can see the results of that if you compare the Date header, and
> Received header from my local system (jacaranda) with the Received
> header munnari adds (there should be no more than a second or two
> between those - in this message there will be much more).
> 
> When I noticed this, I changed the clock source from TSC to hpet0,
> and since then, the system appears to be advancing time at about
> the right rate - but unlike its normal smooth motion, the second hand
> in xtu (the only one of the clocks that has seconds) looks very
> jerky, and ...
> 
> jacaranda$ while sleep 1; do date; done
> Thu Jul 14 20:43:49 +07 2022
> Thu Jul 14 20:43:52 +07 2022
> Thu Jul 14 20:43:55 +07 2022
> Thu Jul 14 20:43:58 +07 2022
> Thu Jul 14 20:44:01 +07 2022
> 
> that would be much like what it looks like.

Could you show me the full dmesg with verbose output?
i.e. add -v option to the boot command in /boot.cfg.
It will show some messages related to the TSC.

Thanks in advance.


> This is a fairly normal kernel, the most notable features of its config
> are:
> 
> options PCIVERBOSE  # verbose PCI device autoconfig messages
> options PCI_CONFIG_DUMP # verbosely dump PCI config space
> options SCSIVERBOSE # human readable SCSI error messages
> options HDAUDIOVERBOSE  # human readable HDAUDIO device names
> 
> options ACPI_SCANPCI# find PCI roots using ACPI
> options MPBIOS  # configure CPUs and APICs using 
> MPBIOS
> options MPBIOS_SCANPCI  # MPBIOS configures PCI roots
> options PCI_INTR_FIXUP  # fixup PCI interrupt routing via ACPI
> options PCI_BUS_FIXUP   # fixup PCI bus numbering
> options PCI_ADDR_FIXUP  # fixup PCI I/O addresses
> options ACPI_ACTIVATE_DEV   # If set, activate inactive devices
> options VGA_POST# in-kernel support for VGA POST
> 
> options MSGBUFSIZE=1049600
> 
> (still not big enough to hold all of what PCI_CONFIG_DUMP produces, I do have
> the dmesg.boot file from it if anyone cares).
> 
> Anyone have any ideas?   Note that the CPU is an Alder Lake - has both
> performance and economy cores which run at different rates (which NetBSD
> knows nothing about, yet, but that's OK) - core speed is always subject
> to variation anyway, so that should not matter (and has not on previous
> kernels).
> 
> My previous kernel did not have most of those options, and managed time well
> enough (I have never really trusted TSC, but it was at least close enough
> that NTP could keep it in line - this is beyond NTP's abilities).
> 
> While I am here, an unrelated matter, one other config option:
> 
> options WS_KERNEL_FG=WSCOL_CYAN
> 
> No way is that related to the time ... I've used that in kernels for decades.
> I mention it now, as it might just provide some assistance to those working
> on the graphics drivers.
> 
> When the system first boots, all the console messages appear in yellow, not
> the normal green, and not cyan.   After the system switches the console to
> graphics mode (it is an nvidia GT930 - running X on wsfb) the messages all
> switch to cyan.  This harms nothing, it's just a bit weird, and I thought
> it might provide a clue to some possible setup errors?
> 
> kre

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Crash on various Supermicro motherboards

2022-04-08 Thread Masanobu SAITOH



On 2022/04/08 15:29, 6b...@6bone.informatik.uni-leipzig.de wrote:
> Here is the CPU type:
> 
> https://speicherwolke.uni-leipzig.de/index.php/s/SM6LQqKPqKYeCqM
> 
> 
> Regards
> Uwe
> 
> On Thu, 7 Apr 2022, Christos Zoulas wrote:
> 
>> Date: Thu, 7 Apr 2022 20:36:26 - (UTC)
>> From: Christos Zoulas 
>> To: current-users@netbsd.org
>> Subject: [Extern] Re: Crash on various Supermicro motherboards
>>
>> In article 
>> ,
>> <6b...@6bone.informatik.uni-leipzig.de> wrote:
>>> Hello,
>>>
>>> I now have the backtrace:
>>>
>>> https://speicherwolke.uni-leipzig.de/index.php/s/cFXAbL6axwHpKkL

Same as kern/54489?

http://gnats.netbsd.org/54489

>> What CPUs are these? I don't see the cpu lines in the avi...
>>
>> christos
>>

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm0 panic

2020-07-06 Thread Masanobu SAITOH

Hi, all.

On 2020/06/29 12:53, Kengo NAKAHARA wrote:
> Hi,
> 
> On 2020/06/28 0:24, Patrick Welche wrote:
>> Trying a today's -current/amd64 with DIAGNOSTIC/DEBUG/LOCKDEBUG, I can
>> boot multiuser without a network. If I log in as root, as soon as I hit
>> enter:
>>
>> # ifconfig wm0 inet 10.0.0.62 netmask 0xff00
>> [ 127.5763268] Kernel lock error 127.5763268] lock address : 
>> 0x8106ab40 type :   spin
>> [ 127.5863237] initialized  : 0x80b0bbb9
>> [ 127.5863237] shared holds :  0 exclusive:  
>> 1
>> [ 127.5963238] shares wanted:  0 exclusive:  
>> 1
>> [ 127.6063236] relevant cpu :  1 last held:  >> 0
>> [ 127.6163235] relevant lwp : 0x8d419a07f20
>> [ 127.6163235] last locked* : 0x80a7d2f5 unlocked : 
>> 0x80a7d2e6
>> [ 127.6263235] curcpu holds :  0 wanted by: 
>> 0x8d419a07f200
>> [ 127.6363234] panic: LOCKDEBock,244: spinout
>> [ 127.6363234] cpu1: Begin traceback...
>> [ 127.6463233] vpanic() at netbsd:vpanic+0x152
>> [ 127.6463233] snprintf() at netbsd:snprintf
>> [ 127.6563232] lockdebug_more() at netbsd:lockdebug_more
>> [ 127.6563232] _kernel_lock() at netbsd:_kernel_lock+0x244
>> [ 127.6663231] ip_slowtimo() at netbsd:ip_slowtimo+0x1a
>> [ 127.6763231] pfslowtimo() at netbsd:pfslowtimo+0x34
>> [ 127.6763231] callout_softclock() at netbsd:callout_softclock+0x10f
>> [ 127.6863230] softint_disph+0x108
>> [ 127.6863230] DDB lost frame for netbsd:Xsoftintr+0x4f, trying 
>> 0xa4825d02eff0
>> [ 127.6963230] Xsoftintr() at netbsd:Xsoftintr+0x4f
>> [ 127.7063229] --- interrupt ---
>> [ 127.706322traceback...
> It seems some other code have held KERNEL_LOCK too long time.
> Could you show the function of last locked address?
> # e.g. addr2line -e "your kernel image" -f 0x80a7d2f5
> 
> If the panic can reappear, could you show "show all locks/t" of ddb?
> 
> 
> Thanks,

It seems this problem is the same as the one in the following mail:

http://mail-index.netbsd.org/current-users/2020/06/03/msg038785.html


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: 8.99.41 panic in nvm_poll

2019-06-10 Thread Masanobu SAITOH

On 2019/06/10 17:17, Thomas Klausner wrote:
> I tried some more stuff.
> 
> Enabling the NVME_QUIRK_DELAY_B4_CHK_RDY quirk didn't help, it panicked
> during boot. However, forcing nvme_pci_force_intx to 1 makes it boot
> successfully!

Please show the following information:

- "FULL" dmesg

- output of cpuctl list

- output of intrctl list

- output of pcictl pci0 dump -b X -d Y of the device.


> Should this variable default to 1, or how do we improve the situation
> in general?
> 
> Cheers,
>  Thomas
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Does IPv6 on athn(4) work?

2019-05-28 Thread Masanobu SAITOH

On 2019/05/26 14:33, Thomas Mueller wrote:
> from: Masanobu SAITOH:
> 
>>  While modifying Ethernet multicast's code, I noticed that
>> athn.c doesn't modify the multicast filter. Does IPv6 on
>> athn(4) work?
> 
> ---
>> Index: sys/dev/ic/athn.c
> 
> I have a motherboard from 2013 by MSI (MPOWER) that has an onboard athn 
> (Atheros 9271) wi-fi chip, (quasi-)USB.
> 
> It has worked sporadically in the past on NetBSD, but now and for some time 
> causes the boot to hang unless I disable athn through userconf.

Interesting.

I'll get USB athn(4) and will test the stability.

martin@ reported to me that IPv6 works with the current athn(4) for him
and that he didn't see any bad effects from my patch. I'll commit the
diff.

 Thanks all.


> I never thought to look in sys/dev/ic .
> 
> If this is updated, I'd like to try on amd64 and i386 current to see if this 
> wi-fi chip can be made to work under NetBSD.
> 
> Just for comparison, no support at all in FreeBSD.
> 
> Tom
> 


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Does IPv6 on athn(4) work?

2019-05-24 Thread Masanobu SAITOH

 Hi.

 While modifying Ethernet multicast's code, I noticed that
athn.c doesn't modify the multicast filter. Does IPv6 on
athn(4) work?

---
Index: sys/dev/ic/athn.c
===
RCS file: /cvsroot/src/sys/dev/ic/athn.c,v
retrieving revision 1.18
diff -u -p -r1.18 athn.c
--- sys/dev/ic/athn.c   26 Jun 2018 06:48:00 -  1.18
+++ sys/dev/ic/athn.c   24 May 2019 06:28:27 -
@@ -136,8 +136,8 @@ Static void athn_ani_lower_immunity(stru
 Static void	athn_ani_monitor(struct athn_softc *);
 Static void	athn_ani_ofdm_err_trigger(struct athn_softc *);
 Static void	athn_ani_restart(struct athn_softc *);
-Static void	athn_set_multi(struct athn_softc *);
 #endif /* notyet */
+Static void	athn_set_multi(struct athn_softc *);
 
 PUBLIC int
 athn_attach(struct athn_softc *sc)
@@ -2751,12 +2751,11 @@ athn_watchdog(struct ifnet *ifp)
ieee80211_watchdog(&sc->sc_ic);
 }
 
-#ifdef notyet
 Static void
 athn_set_multi(struct athn_softc *sc)
 {
-   struct arpcom *ac = &sc->sc_ic.ic_ac;
-   struct ifnet *ifp = &ac->ac_if;
+   struct ethercom *ec = &sc->sc_ec;
+   struct ifnet *ifp = &ec->ec_if;
struct ether_multi *enm;
struct ether_multistep step;
const uint8_t *addr;
@@ -2768,7 +2767,7 @@ athn_set_multi(struct athn_softc *sc)
goto done;
}
lo = hi = 0;
-   ETHER_FIRST_MULTI(step, ac, enm);
+   ETHER_FIRST_MULTI(step, ec, enm);
while (enm != NULL) {
if (memcmp(enm->enm_addrlo, enm->enm_addrhi, 6) != 0) {
ifp->if_flags |= IFF_ALLMULTI;
@@ -2793,7 +2792,6 @@ athn_set_multi(struct athn_softc *sc)
AR_WRITE(sc, AR_MCAST_FIL1, hi);
AR_WRITE_BARRIER(sc);
 }
-#endif /* notyet */
 
 Static int
 athn_ioctl(struct ifnet *ifp, u_long cmd, void *data)
@@ -2835,9 +2833,7 @@ athn_ioctl(struct ifnet *ifp, u_long cmd
case SIOCDELMULTI:
if ((error = ether_ioctl(ifp, cmd, data)) == ENETRESET) {
/* setup multicast filter, etc */
-#ifdef notyet
athn_set_multi(sc);
-#endif
error = 0;
}
break;

---
(The same diff is at: http://www.netbsd.org/~msaitoh/athn-20190524-0.dif)

Could someone test the above diff on athn(4)?
Is it OK to commit?
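
For context, here is a self-contained C sketch of how a 64-bit multicast hash filter split into "lo" and "hi" words can be filled. It is a model, not the driver: the hash in mc_hash6() (XOR of the eight 6-bit words of the address) is an assumption, struct mc_entry and mc_build_filter() are hypothetical names, and setting both words to all ones for a range entry is also assumed. Only the range test (enm_addrlo != enm_addrhi) and the two filter words come from the diff above.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* One multicast entry: a range [lo, hi] of Ethernet addresses. */
struct mc_entry {
	uint8_t lo[6];
	uint8_t hi[6];
};

/* ASSUMED hash: XOR of the eight 6-bit words of the address. */
static unsigned
mc_hash6(const uint8_t *addr)
{
	uint32_t val, bit;

	val = addr[0] | (uint32_t)addr[1] << 8 | (uint32_t)addr[2] << 16;
	bit = (val >> 18) ^ (val >> 12) ^ (val >> 6) ^ val;
	val = addr[3] | (uint32_t)addr[4] << 8 | (uint32_t)addr[5] << 16;
	bit ^= (val >> 18) ^ (val >> 12) ^ (val >> 6) ^ val;
	return bit & 0x3f;
}

/*
 * Build the two 32-bit filter words.  Returns true if the caller must
 * fall back to "all multicast" because an entry is a range.
 */
static bool
mc_build_filter(const struct mc_entry *e, size_t n, uint32_t *lo, uint32_t *hi)
{
	*lo = *hi = 0;
	for (size_t i = 0; i < n; i++) {
		if (memcmp(e[i].lo, e[i].hi, 6) != 0) {
			*lo = *hi = 0xffffffff;   /* accept everything */
			return true;
		}
		unsigned bit = mc_hash6(e[i].lo);
		if (bit < 32)
			*lo |= (uint32_t)1 << bit;
		else
			*hi |= (uint32_t)1 << (bit - 32);
	}
	return false;
}
```

IPv6 depends on this kind of filter because neighbor discovery uses multicast groups such as 33:33:00:00:00:01 (all nodes). If the filter is never programmed, whether such frames are received depends on what the hardware happens to accept.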

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: no options COMPAT_43

2019-04-15 Thread Masanobu SAITOH

On 2019/04/16 6:21, Paul Goyette wrote:
> On Mon, 15 Apr 2019, Paul Goyette wrote:
> 
>> There may be some other option you have enabled which requires COMPAT_43
>>
>> Try to boot your new kernel and use modstat(8) to see what other modules 
>> might require the compat_43 module.
> 
> A quick check on my recent amd64 build shows that compat_linux requires
> compat_43.

modstat said:
> NAME   CLASSSOURCE   FLAG  REFSSIZE REQUIRES
> compat_43  exec builtin  -2   - 
> compat_sysctl_09_43,compat_util,compat_60
> compat_linux   exec builtin  -1   - 
> compat_ossaudio,sysv_ipc,compat_util,compat_50,compat_43,exec_elf64
> compat_linux32 exec builtin  -0   - 
> compat_linux,sysv_ipc,compat_sysv_50,compat_netbsd32_50,compat_netbsd32_43,exec_elf32,compat_netbsd32,compat_netbsd32_sysvipc
> compat_netbsd32_43 exec builtin  -1   - 
> compat_netbsd32,compat_43

so I added the following three lines to my config file, and it worked as I expected:

no options  COMPAT_LINUX
no options  COMPAT_LINUX32
no options  COMPAT_43
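
Reading the REQUIRES column by eye is error-prone because names such as compat_netbsd32_43 and compat_sysctl_09_43 also end in "_43". The check can be sketched in C as a whole-token match. requires_module() is a hypothetical helper, not part of modstat(8).

```c
#include <stdbool.h>
#include <string.h>

/*
 * Return true if the comma-separated REQUIRES list contains exactly
 * the module name `want` (whole-token match, not substring match).
 */
static bool
requires_module(const char *list, const char *want)
{
	size_t wlen = strlen(want);
	const char *p = list;

	while (*p != '\0') {
		size_t tlen = strcspn(p, ",");   /* length of this token */
		if (tlen == wlen && strncmp(p, want, wlen) == 0)
			return true;
		p += tlen;
		if (*p == ',')
			p++;                     /* skip the separator */
	}
	return false;
}
```

Applied to the output above, the lists for compat_linux and compat_netbsd32_43 contain compat_43, while the list for compat_43 itself does not.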

Thanks!

> 
>> On Mon, 15 Apr 2019, Masanobu SAITOH wrote:
>>
>>> Hi.
>>>
>>> I tried to make a kernel without COMPAT_43 from conf/GENERIC.
>>> I added "no options COMPAT_43" at the end of conf/GENERIC or
>>> conf/GENERIC.local, but compile/GENERIC/opt_compat_43.h had:
>>>
>>> #define COMPAT_43   1
>>>
>>> Is this behavior intended? When I added "no options COMPAT_43"
>>> twice, config(8) said:
>>>
>>> GENERIC:1219: warning: options `COMPAT_43' is not defined
>>>
>>> so my "no options COMPAT_43" lines come after
>>> sys/conf/compat_netbsd.config is loaded.
>>>
>>> -- 
>>> ---
>>>    SAITOH Masanobu (msai...@execsw.org
>>>     msai...@netbsd.org)
>>>
>>>
>>>
>>
>> ++--+---+
>> | Paul Goyette   | PGP Key fingerprint: | E-mail addresses: |
>> | (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com |
>> | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
>> ++--+---+
>>
> 
> ++--+---+
> | Paul Goyette   | PGP Key fingerprint: | E-mail addresses: |
> | (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com |
> | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org   |
> ++--+---+


-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

no options COMPAT_43

2019-04-15 Thread Masanobu SAITOH

 Hi.

 I tried to make a kernel without COMPAT_43 from conf/GENERIC.
I added "no options COMPAT_43" at the end of conf/GENERIC or
conf/GENERIC.local, but compile/GENERIC/opt_compat_43.h had:

#define COMPAT_43   1

Is this behavior intended? When I added "no options COMPAT_43"
twice, config(8) said:

GENERIC:1219: warning: options `COMPAT_43' is not defined

so my "no options COMPAT_43" lines come after
sys/conf/compat_netbsd.config is loaded.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: i386/conf/ALL link error

2019-04-11 Thread Masanobu SAITOH

On 2019/04/11 18:59, Kamil Rytarowski wrote:
> On 11.04.2019 07:32, Masanobu SAITOH wrote:
>> Hi.
>>
>> i386/conf/ALL kernel can't link. See below.
>>
> 
> This used to work.
> 
> It means that __HAVE_ATOMIC64_OPS is defined but 64-bit atomics are
> unavailable.
> 
> Did someone disable them for NetBSD/i386?
> 
> 
> #if defined(_KERNEL)
> /*
>  * Processors < i586 do not have cmpxchg8b, and we compile for i486
>  * by default. The kernel tsc driver uses them though, and handles < i586
>  * by patching.  E.g. rump kernels and crash(8) and a selection of
>  * other run-in-userspace code defines _KERNEL, but is careful not to
>  * build anything using 64bit atomic ops by default.
>  */
> #define __HAVE_ATOMIC64_OPS
> #endif
> 
> -- sys/arch/i386/include/types.h
> 

It seems the problem was fixed.

Thanks!

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

i386/conf/ALL link error

2019-04-10 Thread Masanobu SAITOH

Hi.

i386/conf/ALL kernel can't link. See below.

 Thanks in advance.


#  link  ALL/netbsd
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld
 -Map netbsd.map --cref -T netbsd.ldscript -Ttext c010 -e start -X -o 
netbsd ${SYSTEM_OBJ:[@]:Nswapnetbsd.o} ${EXTRA_OBJ} vers.o swapnetbsd.o
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 subr_kcov.o: in function `trace_cmp':
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:424:
 undefined reference to `__atomic_load_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:426:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:427:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:428:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:429:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 subr_kcov.o: in function `__sanitizer_cov_trace_pc':
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:383:
 undefined reference to `__atomic_load_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:385:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:387:
 undefined reference to `__atomic_store_8'
/disk2/sources/NetBSD-current/src/obj/tooldir.NetBSD-8.99.37-amd64/bin/i486--netbsdelf-ld:
 subr_kcov.o: in function `trace_cmp':
/disk2/sources/NetBSD-current/src/sys/arch/i386/compile/ALL/../../../../kern/subr_kcov.c:430:
 undefined reference to `__atomic_store_8'
*** [netbsd] Error code 1
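
For reference, the undefined symbols are the out-of-line helpers that GCC-style compilers can call for 64-bit atomic accesses. The sketch below uses the __atomic builtins directly. Whether subr_kcov.c uses exactly these builtins is an assumption, since the log only shows the unresolved symbols, and load64()/store64() are hypothetical names.

```c
#include <stdint.h>

/*
 * A 64-bit atomic load and store via the GCC/Clang __atomic builtins.
 * On targets with native 64-bit atomics these compile to inline
 * instructions; otherwise the compiler may emit calls to
 * __atomic_load_8 / __atomic_store_8, which must then be provided
 * at link time.
 */
static uint64_t
load64(const uint64_t *p)
{
	return __atomic_load_n(p, __ATOMIC_RELAXED);
}

static void
store64(uint64_t *p, uint64_t v)
{
	__atomic_store_n(p, v, __ATOMIC_RELAXED);
}
```

When the kernel is compiled for i486, nothing in the link supplies those helpers, which would explain errors of the kind shown above.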


-- 
-
 Masanobu SAITOH(masan...@iij.ad.jp
  msai...@netbsd.org)

Re: building kernel w/ options MIIVERBOSE fails

2019-03-25 Thread Masanobu SAITOH


On 2019/03/25 18:25, K. Schreiner wrote:

Hi,

with current source cvs'upped an hour or so ago fails with:

...

 compile  vNBx64/mii_verbose.o
In file included from /u/NetBSD/src/sys/dev/mii/mii_verbose.c:62:0:
/u/NetBSD/src/sys/dev/mii/miidevs_data.h:39:21: error: array type has 
incomplete element type 'struct mii_knowndev'
struct mii_knowndev mii_knowndevs[] = {
 ^
/u/NetBSD/src/sys/dev/mii/mii_verbose.c: In function 'mii_get_descr_real':
/u/NetBSD/src/sys/dev/mii/mii_verbose.c:105:1: error: control reaches end of 
non-void function [-Werror=return-type]
}
^
cc1: all warnings being treated as errors
--- mii_verbose.o ---
*** [mii_verbose.o] Error code 1


 Fixed. Please update to the latest -current.

 Thank you for your quick report!


 - msaitoh




The source of the problem is this part of the change to mii_verbose.c,
as 'struct mii_knowndev' must be defined before miidevs_data.h is
included:


@@ -55,9 +55,11 @@
   */

  #include 
  -__KERNEL_RCSID(0, "$NetBSD: mii_verbose.c,v 1.4 2019/01/08 03:14:51 msaitoh Exp 
$");
  +__KERNEL_RCSID(0, "$NetBSD: mii_verbose.c,v 1.5 2019/03/25 07:34:13 msaitoh Exp 
$");

  #include 
+#include 
+#include 
  #include 

  struct mii_knowndev {
  @@ -65,8 +67,6 @@
  int model;
 const char *descr;
  };
-#include 
-#include 

  const char * mii_get_descr_real(int, int);
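
The ordering requirement can be shown in isolation with the C sketch below. The struct fields are the ones visible in the diff; the table contents, the "_sketch" names and the lookup body are hypothetical and only stand in for what miidevs_data.h and mii_verbose.c actually contain.

```c
#include <stddef.h>

/* The element type must be complete before any array of it is defined. */
struct mii_knowndev {
	int oui;
	int model;
	const char *descr;
};

/* Stand-in for the table that miidevs_data.h provides (made-up entry). */
static const struct mii_knowndev mii_knowndevs_sketch[] = {
	{ 0x000001, 0x0002, "Example PHY (hypothetical)" },
	{ 0, 0, NULL }
};

/* Look up a description; every path returns a value. */
static const char *
mii_get_descr_sketch(int oui, int model)
{
	for (size_t i = 0; mii_knowndevs_sketch[i].descr != NULL; i++) {
		if (mii_knowndevs_sketch[i].oui == oui &&
		    mii_knowndevs_sketch[i].model == model)
			return mii_knowndevs_sketch[i].descr;
	}
	return NULL;
}
```

If the array definition is moved above the struct definition, the compiler reports an incomplete element type, as in the first error above.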


Reverting part of the change in r1.5 like so fixes the failure:

Index: mii_verbose.c
===
RCS file: /cvsroot/src/sys/dev/mii/mii_verbose.c,v
retrieving revision 1.5
diff -u -r1.5 mii_verbose.c
--- mii_verbose.c   25 Mar 2019 07:34:13 -  1.5
+++ mii_verbose.c   25 Mar 2019 09:24:53 -
@@ -58,9 +58,8 @@
  __KERNEL_RCSID(0, "$NetBSD: mii_verbose.c,v 1.5 2019/03/25 07:34:13 msaitoh Exp 
$");
  
  #include 

-#include 
-#include 
  #include 
+#include 
  
  struct mii_knowndev {

int oui;
@@ -68,6 +67,8 @@
const char *descr;
  };
  
+#include 

+
  const char * mii_get_descr_real(int, int);
  
  MODULE(MODULE_CLASS_MISC, miiverbose, NULL);



Kurt




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: README: libstdc++.so bumped

2019-03-22 Thread Masanobu SAITOH


On 2019/03/22 17:45, matthew green wrote:

Masanobu SAITOH writes:

On 2019/03/20 14:41, matthew green wrote:

hi folks.


users of -current may notice issues with libstdc++.so major
being increased.  i've described the issues why this was
done in the commit:

 http://mail-index.netbsd.org/source-changes/2019/03/20/msg104433.html

this may break update builds but will ensure what we ship
as netbsd 9 is properly incompatible with netbsd 8, vs
there being minor ABI changes present.

please send-pr or send me email if you have a problem
related to this that rebuilding does not solve.

thanks!


Isn't it necessary to create pkgsrc/emulators/compat8 now?


we need that yes :)


 Could someone(TM) please do it? Sorry, I'm not familiar with
pkgsrc/emulators/compat_netbsd/gencompat.sh...




.mrg.




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: README: libstdc++.so bumped

2019-03-22 Thread Masanobu SAITOH


On 2019/03/20 14:41, matthew green wrote:

hi folks.


users of -current may notice issues with libstdc++.so major
being increased.  i've described the issues why this was
done in the commit:

http://mail-index.netbsd.org/source-changes/2019/03/20/msg104433.html

this may break update builds but will ensure what we ship
as netbsd 9 is properly incompatible with netbsd 8, vs
there being minor ABI changes present.

please send-pr or send me email if you have a problem
related to this that rebuilding does not solve.

thanks!


Not required to make pkgsrc/emulators/compat8 now?




.mrg.




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: varshm check in postinstall

2019-01-31 Thread Masanobu SAITOH


On 2019/01/31 18:49, Martin Husemann wrote:

On Thu, Jan 31, 2019 at 05:23:45PM +0900, Masanobu SAITOH wrote:

-   if ${GREP} -w "/var/shm" "${DEST_DIR}/etc/fstab" >/dev/null 2>&1;
+   if ${GREP} -E "^var_shm_symlink" "${DEST_DIR}/etc/rc.conf" >/dev/null 2>&1;
+   then
+   failed=0;
+   elif ${GREP} -w "/var/shm" "${DEST_DIR}/etc/fstab" >/dev/null 2>&1;
then
failed=0;
else

OK?


Sounds good!
I always left a commented out /var/shm tmpfs entry in my fstab, but your fix
is better.

Martin


 Done!

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

varshm check in postinstall

2019-01-31 Thread Masanobu SAITOH


 Hi.

 If var_shm_symlink="/tmp/.shm" is set in /etc/rc.conf and there is no
/var/shm mount in /etc/fstab, postinstall complains:

varshm check:
No /var/shm mount found in /etc/fstab

So, I propose the following change:

Index: postinstall
===
RCS file: /cvsroot/src/usr.sbin/postinstall/postinstall,v
retrieving revision 1.221
diff -u -p -r1.221 postinstall
--- postinstall 4 Dec 2018 16:53:44 -   1.221
+++ postinstall 31 Jan 2019 08:22:15 -
@@ -2229,7 +2229,10 @@ do_varshm()
failed=0
 
 	[ -f "${DEST_DIR}/etc/fstab" ] || return 0

-   if ${GREP} -w "/var/shm" "${DEST_DIR}/etc/fstab" >/dev/null 2>&1;
+   if ${GREP} -E "^var_shm_symlink" "${DEST_DIR}/etc/rc.conf" >/dev/null 2>&1;
+   then
+   failed=0;
+   elif ${GREP} -w "/var/shm" "${DEST_DIR}/etc/fstab" >/dev/null 2>&1;
then
failed=0;
else

OK?

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: tester need: MII PHY register read/write API change

2019-01-24 Thread Masanobu SAITOH


On 2019/01/24 23:47, Riccardo Mottola wrote:

Hello Masanobu,

I updated and recompiled kernel,

[ 1.049407] re0 at pci5 dev 0 function 0: RealTek 8100E/8101E/8102E/8102EL PCIe 10/100BaseTX (rev. 0x02)
[ 1.049407] re0: interrupting at msix1 vec 0
[ 1.049407] re0: Ethernet address 64:31:50:7b:8f:55
[ 1.049407] re0: using 256 tx descriptors
[ 1.049407] rlphy0 at re0 phy 7: RTL8201L 10/100 media interface, rev. 1

it would be re+rlphy I suppose, not listed in your list?


Yes, that combination is not listed, but the particular MAC and PHY
combination is not important. Both re (as re+rgephy) and rlphy (as rtk+rlphy)
are listed.


Just want to confirm that it compiled and that it works fine.


 Anyway, thank you for your report!


Thank you,
Riccardo



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: tester need: MII PHY register read/write API change

2019-01-22 Thread Masanobu SAITOH




 Hi, all.

 This change was committed yesterday. Please report any new problem
related to PHY access that you see after the change.

 Thanks.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: tester need: MII PHY register read/write API change

2019-01-20 Thread Masanobu SAITOH


On 2019/01/17 14:26, Masanobu SAITOH wrote:

  Hi.

  I'll commit the following diff to -current this weekend:

 http://www.netbsd.org/~msaitoh/miiphy-20190117-0.dif

This diff changes all Ethernet drivers that use mii(4) and all mii drivers.
The change is not complicated, but more than a hundred files are modified,
so it might introduce bugs.

  The following drivers are tested:
axe+ukphy
axe+rgephy
axen+rgephy
wm+atphy
wm+ukphy
wm+igphy
wm+ihphy
wm+makphy
sk+makphy
sk+brgphy
sk+gentbi
msk+makphy
sip+icsphy
sip+ukphy
re+rgephy
bge+brgphy
bnx+brgphy
gsip+gphyter
rtk+rlphy
fxp+inphy
tlp+acphy
epic+qsphy
(MAC & PHY combination is not important)


Tested += vge+ciphy



  The following drivers are not tested yet:
For MAC side:
arm:at91emac
arm:cemac
arm:epe
arm:geminigmac
arm:enet
arm:cpsw
arm:emac(omac)
arm:emac(sunxi)
arm:npe
evbppc:temac
macppc:bm
macppc:gm
mips:aumac
mips:ae
mips:cnmac
mips:reth
mips:sbmac
playstation2:smap
powerpc:tsec
powerpc:emac(ibm4xx)
sgimips:mec
sparc:be
sf
ne(ax88190, dl10019)
awge
ep
gem
hme
smsh
mtd
sm
age
alc
ale
bce
cas
et
jme
lii
nfe
pcn
ste
stge
tl
vge
vr
vte
xi
aue
mue
smsc
udav
url

MII PHY side:
amhphy
bmtphy
ciphy
dmphy
etphy
glxtphy
ikphy
iophy
lxtphy
nsphyter
pnaphy
rdcphy*
sqphy
tlphy
tqphy
urlphy

If you have time and can test any of the untested devices above,
please test and report the result. It is enough to check:

 that the dmesg output is unchanged
 that ifconfig shows the media status correctly when the link is up.


See also:
 http://mail-index.netbsd.org/tech-net/2018/12/25/msg007201.html




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: tester need: MII PHY register read/write API change

2019-01-20 Thread Masanobu SAITOH


On 2019/01/20 6:10, Andrius V wrote:

Hi,

Tested vr+ukphy, vge+ciphy, vte+ukphy, fxp+inphy, axen+rgephy. All of
them work same way as before the patch, no dmesg changes.


 Thank you for your report!

- msaitoh



On Fri, Jan 18, 2019 at 12:24 PM Andrius V  wrote:


Hi,

I can test vge, vr on my EPIA board and partially vte (dmesg only,
since media status is incorrect now and network is not working for it
(PR/53494)).


On Thu, Jan 17, 2019 at 7:27 AM Masanobu SAITOH  wrote:


   Hi.

   I'll commit the following diff to -current this weekend:

 http://www.netbsd.org/~msaitoh/miiphy-20190117-0.dif

This diff changes all Ethernet drivers that use mii(4) and all mii drivers.
The change is not complicated, but more than a hundred files are modified,
so it might introduce bugs.

   The following drivers are tested:
axe+ukphy
axe+rgephy
axen+rgephy
wm+atphy
wm+ukphy
wm+igphy
wm+ihphy
wm+makphy
sk+makphy
sk+brgphy
sk+gentbi
msk+makphy
sip+icsphy
sip+ukphy
re+rgephy
bge+brgphy
bnx+brgphy
gsip+gphyter
rtk+rlphy
fxp+inphy
tlp+acphy
epic+qsphy
(MAC & PHY combination is not important)

   The following drivers are not tested yet:
For MAC side:
arm:at91emac
arm:cemac
arm:epe
arm:geminigmac
arm:enet
arm:cpsw
arm:emac(omac)
arm:emac(sunxi)
arm:npe
evbppc:temac
macppc:bm
macppc:gm
mips:aumac
mips:ae
mips:cnmac
mips:reth
mips:sbmac
playstation2:smap
powerpc:tsec
powerpc:emac(ibm4xx)
sgimips:mec
sparc:be
sf
ne(ax88190, dl10019)
awge
ep
gem
hme
smsh
mtd
sm
age
alc
ale
bce
cas
et
jme
lii
nfe
pcn
ste
stge
tl
vge
vr
vte
xi
aue
mue
smsc
udav
url

MII PHY side:
amhphy
bmtphy
ciphy
dmphy
etphy
glxtphy
ikphy
iophy
lxtphy
nsphyter
pnaphy
rdcphy*
sqphy
tlphy
tqphy
urlphy

If you have time and can test any of the untested devices above,
please test and report the result. It is enough to check:

 that the dmesg output is unchanged
 that ifconfig shows the media status correctly when the link is up.


See also:
 http://mail-index.netbsd.org/tech-net/2018/12/25/msg007201.html

--
---
  SAITOH Masanobu (msai...@execsw.org
   msai...@netbsd.org)



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

tester need: MII PHY register read/write API change

2019-01-16 Thread Masanobu SAITOH


 Hi.

 I'll commit the following diff to -current this weekend:

http://www.netbsd.org/~msaitoh/miiphy-20190117-0.dif

This diff changes all Ethernet drivers that use mii(4) and all mii drivers.
The change is not complicated, but more than a hundred files are modified,
so it might introduce bugs.

 The following drivers are tested:
axe+ukphy
axe+rgephy
axen+rgephy
wm+atphy
wm+ukphy
wm+igphy
wm+ihphy
wm+makphy
sk+makphy
sk+brgphy
sk+gentbi
msk+makphy
sip+icsphy
sip+ukphy
re+rgephy
bge+brgphy
bnx+brgphy
gsip+gphyter
rtk+rlphy
fxp+inphy
tlp+acphy
epic+qsphy
(MAC & PHY combination is not important)

 The following drivers are not tested yet:
For MAC side:
arm:at91emac
arm:cemac
arm:epe
arm:geminigmac
arm:enet
arm:cpsw
arm:emac(omac)
arm:emac(sunxi)
arm:npe
evbppc:temac
macppc:bm
macppc:gm
mips:aumac
mips:ae
mips:cnmac
mips:reth
mips:sbmac
playstation2:smap
powerpc:tsec
powerpc:emac(ibm4xx)
sgimips:mec
sparc:be
sf
ne(ax88190, dl10019)
awge
ep
gem
hme
smsh
mtd
sm
age
alc
ale
bce
cas
et
jme
lii
nfe
pcn
ste
stge
tl
vge
vr
vte
xi
aue
mue
smsc
udav
url

MII PHY side:
amhphy
bmtphy
ciphy
dmphy
etphy
glxtphy
ikphy
iophy
lxtphy
nsphyter
pnaphy
rdcphy*
sqphy
tlphy
tqphy
urlphy

If you have time and can test any of the untested devices above,
please test and report the result. It is enough to check:

that the dmesg output is unchanged
that ifconfig shows the media status correctly when the link is up.


See also:
http://mail-index.netbsd.org/tech-net/2018/12/25/msg007201.html

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Panic on a -current from 13/12/2018

2018-12-18 Thread Masanobu SAITOH


On 2018/12/18 20:13, Masanobu SAITOH wrote:

Hi!

On 2018/12/17 19:38, Chavdar Ivanov wrote:

I went through a series of tests. It is indeed that point the panic
takes place, the two parts of the screendump are in

http://ci4ic4.tx0.org/nb-panic-wm-03.png and
http://ci4ic4.tx0.org/nb-panic-wm-04.png .


  Thanks. This is the workaround code for broken lapic timer
counter which was added in:

 http://mail-index.netbsd.org/source-changes/2017/11/23/msg089946.html
 
 http://cvsweb.netbsd.org/bsdweb.cgi/src/sys/arch/x86/x86/lapic.c.diff?r1=1.63&r2=1.64&f=h

Your VM is configured to act as KVM
(see the system->acceleration(L) tab, or "Paravirt provider=" in the .box file).

I set up my vm to KVM and


VirtualBox gives three Intel NIC options:

Intel PRO/1000 MT Desktop (82540EM)
Intel PRO/1000 T Server   (82543GC)
Intel PRO/1000 MT Server  (82545EM)

I was able to get a panic with the same kernel from 13/12/2018 only
when I select the second option:


  I changed my VM's setting to use 82543GC and tried hibernation
three times, but I couldn't reproduce the problem. Still, the
problem must exist because you hit it.

  The possibilities are:
 a) VirtualBox's lapic is not good.
 b) Our workaround code is not perfect or somewhere is not good.
 c) any others

I suspect this problem is not from if_wm.c. but from

There was a VirtualBox upgrade a few weeks ago, perhaps the problem is there.



  I read vbox/src/VBox/Devices/Network/DevE1000.cpp. One of the
differences between the 82543GC emulation and the other two is that
it generates an interrupt when a chip reset occurs. If the other network
device emulations work well, I suspect that the reset timing in vbox
is wrong and that it keeps the lapic timer from being updated.

  Workarounds are:
 a) Don't use KVM mode; use "Default" or another provider.
    With VirtualBox on my Windows 7 host, "Default" leaves the
    CPUID2_RAZ bit unset, so NetBSD does not treat the guest
    as running on KVM.


 If the lapic timer also stops in "Default" mode, that workaround
isn't used and delay() won't work. If so, b) is the best way
to avoid the problem.


 b) Use a NIC emulation other than 82543GC.
 c) any others

BTW, when I use 82543GC emulation, I got the following bug:

makphy0 at wm0 phy 0: Marvell 88E1000 Gigabit PHY, rev. 0
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
makphy1 at wm0 phy 1: Marvell 88E1000 Gigabit PHY, rev. 0
makphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto

(snip)

makphy31 at wm0 phy 31: Marvell 88E1000 Gigabit PHY, rev. 0
makphy31: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
ifmedia_match: multiple match for 0x20/0xfbff9ff, selected instance 0


This _IS_ a bug of VirtualBox's 82543GC emulation.
DevE1000Phy.cpp line 568 says:

 /* Note: A single PHY is supported, ignore PHYADR */

So I recommend all users not to use 82543GC emulation until this PHY
bug is fixed.
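
To illustrate why an emulator that ignores PHYADR makes 32 PHYs appear,
here is a small user-space sketch. It is not the mii(4) probe code: the
function names, the 0x7949 register value and the "0x0000 or 0xffff means
no PHY" test are simplifying assumptions made for this example only.

```c
#include <stdint.h>

#define MII_NPHY	32	/* MII management addresses 0..31 */
#define MII_BMSR	0x01	/* basic mode status register */

/* Hypothetical register-read callback: (PHY address, register) -> value. */
typedef uint16_t (*mii_read_fn)(int phy, int reg);

/* Model of an emulator that honours the address: only PHY 1 answers. */
static uint16_t
read_honours_addr(int phy, int reg)
{
	(void)reg;
	return phy == 1 ? 0x7949 : 0xffff;	/* 0xffff: nothing on the bus */
}

/* Model of an emulator that ignores PHYADR: every address answers. */
static uint16_t
read_ignores_addr(int phy, int reg)
{
	(void)phy;
	(void)reg;
	return 0x7949;
}

/* Count the addresses at which something that looks like a PHY responds. */
static int
count_phys(mii_read_fn rd)
{
	int n = 0;

	for (int phy = 0; phy < MII_NPHY; phy++) {
		uint16_t bmsr = rd(phy, MII_BMSR);

		if (bmsr != 0x0000 && bmsr != 0xffff)
			n++;
	}
	return n;
}
```

With the first model count_phys() finds one PHY; with the second it finds
32, which matches the makphy0 ... makphy31 attach messages above.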


..
-rw--- 1 root wheel   2199810 Dec 17 09:24 netbsd.9
-rw--- 1 root wheel 147348504 Dec 17 09:24 netbsd.9.core
/var/crash # gdb netbsd.9
GNU gdb (GDB) 8.0.1
Copyright (C) 2017 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from netbsd.9...(no debugging symbols found)...done.
(gdb) target kvm netbsd.9.core
0x80222d75 in cpu_reboot ()
(gdb) bt
#0  0x80222d75 in cpu_reboot ()
#1  0x8076e6f7 in db_reboot_cmd ()
#2  0x8076ee92 in db_command ()
#3  0x8076f20c in db_command_loop ()
#4  0x80772b80 in db_trap ()
#5  0x8021f5c2 in kdb_trap ()
#6  0x802244b1 in trap ()
#7  0x8021d568 in alltraps ()
#8  0x8021de45 in breakpoint ()
#9  0x809d54b0 in vpanic ()
#10 0x809d5550 in panic ()
#11 0x802514f0 in lapic_delay ()
#12 0x80353270 in wm_gmii_i82543_readreg ()
#13 0x807b1aa5 in makphy_status ()
#14 0x807b1cf7 in makphy_service ()
#15 0x807a826c in mii_tick ()
#16 0x80360926 in wm_tick ()
#17 0x809b6b96 in callout_softclock ()
#18 0x809aaa55 in softint_dispatch ()
#19 0x8021d21f in Xsoftintr ()


  I rebuilt the kernel (on a different physical host, but there may
have been an update on the 14th there) a

Re: Panic on a -current from 13/12/2018

2018-12-18 Thread Masanobu SAITOH

st,
but I reported the panic thinking it may be relevant in other use
cases.


 Thank you for your report!




On Mon, 17 Dec 2018 at 07:49, Masanobu SAITOH  wrote:


On 2018/12/17 1:09, Chavdar Ivanov wrote:

I have no idea. As I said, it is running under VirtualBox on a Windows
10 host; I put the host in hibernation whilst the NetBSD guest is
running.


I tested today's -current on VirtualBox 5.2.22 on Windows 7 64bit
(on Core i7-2600). I tried hibernating (shutdown -> hibernate(H)) a few times
but I couldn't reproduce the problem yet.


	while (deltat > 0) {
		xtick = lapic_gettick();
		if (lapic_broken_periodic && xtick == 0 && otick == 0) {
			lapic_initclocks();
			xtick = lapic_gettick();
			if (xtick == 0)
				panic("lapic timer stopped ticking");   <=== here!
		}


If that panic is from this, lapic_broken_periodic must be true, but it's set
only when the VM is KVM:

	/*
	 * Apply workaround for broken periodic timer under KVM
	 */
	if (vm_guest == VM_GUEST_KVM) {
		lapic_broken_periodic = true;
		lapic_timecounter.tc_quality = -100;
		aprint_debug_dev(ci->ci_dev,
		    "applying KVM timer workaround\n");
	}


   Could you try to reproduce the problem and note the panic message?
ci4ic4-panic-01.png shows the backtrace, which pushed the panic message
off the screen.

   Regards.


Previously it survived this, using the Intel Desktop NIC
emulation within VirtualBox, even my ssh connections (from the host to
the guest) remained active. I switched the NIC emulation for the
NetBSD guest to virtio-net, now it behaves as before, surviving a
hibernation.

There was a VirtualBox upgrade a few weeks ago, perhaps the problem is there.
On Sun, 16 Dec 2018 at 15:55, SAITOH Masanobu  wrote:


Hi.

On 2018/12/16 18:09, Chavdar Ivanov wrote:

Repeated this morning. Happens when the host hibernates when the
machine is running. The initial trace is slightly different, but the
lines with wm_gmii are the same, so for now I will switch to a
different NIC emulator.



In your .png:

vpanic()
lapic_delay()
wm_gmii_mdic_readreg()
.
.
.


There is no panic message itself, but I suspect it's:

static void
lapic_delay(unsigned int usec)
{
	int32_t xtick, otick;
	int64_t deltat; /* XXX may want to be 64bit */

	otick = lapic_gettick();

	if (usec <= 0)
		return;
	if (usec <= 25)
		deltat = lapic_delaytab[usec];
	else
		deltat = (lapic_frac_cycle_per_usec * usec) >> 32;

	while (deltat > 0) {
		xtick = lapic_gettick();
		if (lapic_broken_periodic && xtick == 0 && otick == 0) {
			lapic_initclocks();
			xtick = lapic_gettick();
			if (xtick == 0)
				panic("lapic timer stopped ticking");   <=== here!
		}
		if (xtick > otick)
			deltat -= lapic_tval - (xtick - otick);
		else
			deltat -= otick - xtick;
		otick = xtick;

		x86_pause();
	}
}
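
The tick arithmetic in that loop can be tried outside the kernel. The
sketch below is only an illustration: elapsed_ticks() and looks_stopped()
are made-up names, "reload" stands in for lapic_tval, and it assumes the
counter wraps at most once between two reads.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Ticks elapsed between two reads of a periodic down-counter that is
 * reloaded with `reload` after reaching zero. Same arithmetic as the
 * two deltat updates in lapic_delay().
 */
static int64_t
elapsed_ticks(int32_t otick, int32_t xtick, int32_t reload)
{
	if (xtick > otick)	/* the counter wrapped and was reloaded */
		return (int64_t)reload - (xtick - otick);
	return (int64_t)otick - xtick;
}

/*
 * The condition that triggers the KVM workaround path: two
 * consecutive reads of zero.
 */
static bool
looks_stopped(int32_t otick, int32_t xtick)
{
	return otick == 0 && xtick == 0;
}
```

For example, with reload = 1000, reading 100 and then 40 gives 60 elapsed
ticks, and reading 40 and then 900 gives 140 (40 down to zero, then 100
after the reload). If the counter never moves off zero, the elapsed value
stays 0 and deltat never decreases, which is why the loop re-initializes
the timer and panics if the counter still reads zero.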


Why does this happen?



And yes, it used to survive many hibernations of the hosts before. I
only had to adjust the time after waking the host up.
On Sat, 15 Dec 2018 at 10:59, Chavdar Ivanov  wrote:


Hi,

On 8.99.27 AMD64 running under VirtualBox I got this morning the panic
in http://ci4ic4.tx0.org/ci4ic4-panic-01.png

I have the  coredump, if it is of interest. I thought it might be
useful, as it is apparently in the wm driver.

Chavdar

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Panic on a -current from 13/12/2018

2018-12-16 Thread Masanobu SAITOH


On 2018/12/17 1:09, Chavdar Ivanov wrote:

I have no idea. As I said, it is running under VirtualBox on a Windows
10 host; I put the host in hibernation whilst the NetBSD guest is
running.


I tested today's -current on VirtualBox 5.2.22 on Windows 7 64bit
(on Core i7-2600). I tried hibernating (shutdown -> hibernate(H)) a few times
but I couldn't reproduce the problem yet.


	while (deltat > 0) {
		xtick = lapic_gettick();
		if (lapic_broken_periodic && xtick == 0 && otick == 0) {
			lapic_initclocks();
			xtick = lapic_gettick();
			if (xtick == 0)
				panic("lapic timer stopped ticking");   <=== here!
		}


If that panic is from this, lapic_broken_periodic must be true, but it's set
only when the VM is KVM:

	/*
	 * Apply workaround for broken periodic timer under KVM
	 */
	if (vm_guest == VM_GUEST_KVM) {
		lapic_broken_periodic = true;
		lapic_timecounter.tc_quality = -100;
		aprint_debug_dev(ci->ci_dev,
		    "applying KVM timer workaround\n");
	}


 Could you try to reproduce the problem and note the panic message?
ci4ic4-panic-01.png shows the backtrace, which pushed the panic message
off the screen.

 Regards.


Previously it survived this, using the Intel Desktop NIC
emulation within VirtualBox, even my ssh connections (from the host to
the guest) remained active. I switched the NIC emulation for the
NetBSD guest to virtio-net, now it behaves as before, surviving a
hibernation.

There was a VirtualBox upgrade a few weeks ago, perhaps the problem is there.
On Sun, 16 Dec 2018 at 15:55, SAITOH Masanobu  wrote:


Hi.

On 2018/12/16 18:09, Chavdar Ivanov wrote:

Repeated this morning. Happens when the host hibernates when the
machine is running. The initial trace is slightly different, but the
lines with wm_gmii are the same, so for now I will switch to a
different NIC emulator.



In your .png:

vpanic()
lapic_delay()
wm_gmii_mdic_readreg()
.
.
.


There is no panic message itself, but I suspect it's:

static void
lapic_delay(unsigned int usec)
{
	int32_t xtick, otick;
	int64_t deltat; /* XXX may want to be 64bit */

	otick = lapic_gettick();

	if (usec <= 0)
		return;
	if (usec <= 25)
		deltat = lapic_delaytab[usec];
	else
		deltat = (lapic_frac_cycle_per_usec * usec) >> 32;

	while (deltat > 0) {
		xtick = lapic_gettick();
		if (lapic_broken_periodic && xtick == 0 && otick == 0) {
			lapic_initclocks();
			xtick = lapic_gettick();
			if (xtick == 0)
				panic("lapic timer stopped ticking");   <=== here!
		}
		if (xtick > otick)
			deltat -= lapic_tval - (xtick - otick);
		else
			deltat -= otick - xtick;
		otick = xtick;

		x86_pause();
	}
}


Why does this happen?



And yes, it used to survive many hibernations of the hosts before. I
only had to adjust the time after waking the host up.
On Sat, 15 Dec 2018 at 10:59, Chavdar Ivanov  wrote:


Hi,

On 8.99.27 AMD64 running under VirtualBox I got this morning the panic
in http://ci4ic4.tx0.org/ci4ic4-panic-01.png

I have the  coredump, if it is of interest. I thought it might be
useful, as it is apparently in the wm driver.

Chavdar

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: dependency of magic.h

2018-12-11 Thread Masanobu SAITOH


On 2018/12/11 21:40, Christos Zoulas wrote:

In article <944c79fd-4414-1936-67f4-03ec228bd...@execsw.org>,
Masanobu SAITOH   wrote:

  Hi.

  While doing "./build.sh -j16 distribution", I got an "unterminated ifdef"
error in magic.h (src/external/bsd/file/lib). After the compile error,
I checked the file and its #ifdef...#endif pairs were balanced. It seems the
compiler saw an incomplete magic.h. Have you ever seen this error before?

  src/external/bsd/file/lib/Makefile has "${ALLOBJS}: magic.h".
I'm not familiar with Makefiles, but I suspect it should be replaced with
"DPSRCS+=magic.h".

  Right?

  The problem is hard to reproduce, so I can't verify whether the change
works...


Why don't you change it anyway. I think I've seen this before too.

christo


 Committed now!

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

dependency of magic.h

2018-12-11 Thread Masanobu SAITOH


 Hi.

 While doing "./build.sh -j16 distribution", I got an "unterminated ifdef"
error in magic.h (src/external/bsd/file/lib). After the compile error,
I checked the file and its #ifdef...#endif pairs were balanced. It seems the
compiler saw an incomplete magic.h. Have you ever seen this error before?

 src/external/bsd/file/lib/Makefile has "${ALLOBJS}: magic.h".
I'm not familiar with Makefiles, but I suspect it should be replaced with
"DPSRCS+=magic.h".

 Right?

 The problem is hard to reproduce, so I can't verify whether the change
works...

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-12-05 Thread Masanobu SAITOH


On 2018/12/06 0:22, David Brownlee wrote:

On Sat, 1 Dec 2018 at 01:27, SAITOH Masanobu  wrote:


Committed.



Hi - is this worth pulling up into netbsd-8?


Of course yes!

 

I ran a quick test and my T420s suspends/resumes fine with a netbsd-8
kernel and no X11 running. x11 looks to have issues which seem
resolved in current, but it feels like a worthwhile fix to have in -8.

It needed two other small changes pulled up (git hashes from
https://github.com/NetBSD/src)
# 5fca1fb34425590e514f0d745f936998fded9c18 PCIE_HAS_LINKREGS
# 3fe2d92356e154aef689fc05111ac422c89c0785 PCIE_HAS_ROOTREGS


The above two changes were pulled up two days ago.


# 2e6fa7a8bbdd4df652e4cfb605385ff915176cbe suspend/resume


I'll send the pullup request today.


Sample kernel at
http://ftp.netbsd.org/pub/NetBSD/misc/abs/netbsd-8-amd64-pci-resume/netbsd.gz
if anyone wants to give it a quick go.

(Many thanks again to msaitoh@ for all the work in analysing and


You're welcome and thank you. If you hadn't found the MSI change,
we couldn't have fixed this problem.


fixing this - it's really nice to have some fully functional
suspend/resume use cases (e.g. T420s with external mouse :)

David




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-28 Thread Masanobu SAITOH


On 2018/11/28 22:12, SAITOH Masanobu wrote:

On 2018/11/28 14:18, Masanobu SAITOH wrote:

Hi, David.

On 2018/11/28 6:09, David Brownlee wrote:

On Tue, 27 Nov 2018 at 18:10, David Brownlee  wrote:


On Tue, 27 Nov 2018 at 08:27, Masanobu SAITOH  wrote:


    Hi, David.

On 2018/11/26 6:11, David Brownlee wrote:

I've bisected the changes against the github src copy, and it looks like the 
suspend/resume issue is related to the following commit:

commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
Author: jdolecek 
Date:   Mon Oct 22 20:57:07 2018 +

   enable MSI support where available, blatantly copied from jmcneill's 
msk(4)

I tried building from HEAD with just that one commit reverted, and my T420s 
suspends and resumes again!

iwn0 is still non responsive after resume and wm0 will not pick up an IP via 
dhcpcd, but the disk responds :-p


    (Note that I'm not familiar with suspend/resume though...)

    Our pci_suspend()/pci_resume() copy only first 16 bytes of each PCI
config space. Other OSes copy some other control registers and
MSI/MSI-X capability area.

    Could you dump all PCI config space both before and after suspend with:

  http://www.netbsd.org/~msaitoh/pcidump

and put the two outputs somewhere? Diffing the two outputs will show
us what we have to do.

    Thanks in advance.


Let me just install to a USB stick to give me a working filesystem
from which to run pcidump after resume :-p


Collecting a pre-suspend dump was easy, but getting post-resume turned
out to be a little more involved :)
- root on wd0 on ahcisata - times out on resume
- root on sd0 on usb on xhci - times out on resume
- root on sd0 on usb on uhci - loses the root filesystem mount point on resume
- install image - doesn't have the libs to run pcictl
- install image, then chroot to mfs with extracted base - suspends but
video does not come back (no drm)
- root on wd0, then chroot to mfs with extracted base, suspend &
resume, then mount sd0 on usb on uhci to save data - \o/

After all that it occurred to me I could have probably run the
suspend/resume with an older NetBSD version where MSI was not being
used. Still, interesting puzzle to try, and useful technique to stash.

Files for the ThinkPad T420s:

http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.pre
http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.post


The diff shows that we should save/restore the MSI table.
We should also save/restore some other registers.

  Give me one or two days to resolve the problem.


  Please try the following diff:

http://www.netbsd.org/~msaitoh/pci-resume-20181118-0.dif

Even with this change, my ThinkPad X220 doesn't recover from
suspend...


 But my X61 survived a suspend with this patch!






  Thanks.



Thanks for looking at this!

David










--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-27 Thread Masanobu SAITOH


Hi, David.

On 2018/11/28 6:09, David Brownlee wrote:

On Tue, 27 Nov 2018 at 18:10, David Brownlee  wrote:


On Tue, 27 Nov 2018 at 08:27, Masanobu SAITOH  wrote:


   Hi, David.

On 2018/11/26 6:11, David Brownlee wrote:

I've bisected the changes against the github src copy, and it looks like the 
suspend/resume issue is related to the following commit:

commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
Author: jdolecek 
Date:   Mon Oct 22 20:57:07 2018 +

  enable MSI support where available, blatantly copied from jmcneill's 
msk(4)

I tried building from HEAD with just that one commit reverted, and my T420s 
suspends and resumes again!

iwn0 is still non responsive after resume and wm0 will not pick up an IP via 
dhcpcd, but the disk responds :-p


   (Note that I'm not familiar with suspend/resume though...)

   Our pci_suspend()/pci_resume() copy only first 16 bytes of each PCI
config space. Other OSes copy some other control registers and
MSI/MSI-X capability area.

   Could you dump all PCI config space both before and after suspend with:

 http://www.netbsd.org/~msaitoh/pcidump

and put the two outputs somewhere? Diffing the two outputs will show
us what we have to do.

   Thanks in advance.


Let me just install to a USB stick to give me a working filesystem
from which to run pcidump after resume :-p


Collecting a pre-suspend dump was easy, but getting post-resume turned
out to be a little more involved :)
- root on wd0 on ahcisata - times out on resume
- root on sd0 on usb on xhci - times out on resume
- root on sd0 on usb on uhci - loses the root filesystem mount point on resume
- install image - doesn't have the libs to run pcictl
- install image, then chroot to mfs with extracted base - suspends but
video does not come back (no drm)
- root on wd0, then chroot to mfs with extracted base, suspend &
resume, then mount sd0 on usb on uhci to save data - \o/

After all that it occurred to me I could have probably run the
suspend/resume with an older NetBSD version where MSI was not being
used. Still, interesting puzzle to try, and useful technique to stash.

Files for the ThinkPad T420s:

http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.pre
http://ftp.netbsd.org/pub/NetBSD/misc/abs/acpi-suspend-resume/pcidump.post


The diff shows that we should save/restore the MSI table.
We should also save/restore some other registers.

 Give me one or two days to resolve the problem.
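
As a rough illustration of what comparing the two dumps involves, the
user-space sketch below walks the PCI capability list of a 256-byte
config-space image to locate the MSI capability, and reports the first
byte beyond the 64-byte standard header that differs between a
pre-suspend and a post-resume image. This is not the eventual fix and
not pcidump itself: the function and macro names are invented for the
example, and the images are plain byte arrays.

```c
#include <stdint.h>

#define PCI_CFG_SIZE	256	/* conventional PCI config space */
#define PCI_HDR_SIZE	64	/* standard header */
#define PCI_STATUS_OFF	0x06	/* 16-bit status register */
#define PCI_STATUS_CAPS	0x0010	/* capability list present */
#define PCI_CAPPTR_OFF	0x34	/* pointer to first capability */
#define PCI_CAP_ID_MSI	0x05
#define PCI_CAP_ID_MSIX	0x11

/* Return the offset of capability `id`, or 0 if it is not present. */
static uint8_t
find_capability(const uint8_t cfg[PCI_CFG_SIZE], uint8_t id)
{
	uint16_t status = (uint16_t)(cfg[PCI_STATUS_OFF] |
	    (cfg[PCI_STATUS_OFF + 1] << 8));
	uint8_t off;
	int guard = 48;		/* bounds a malformed, looping list */

	if ((status & PCI_STATUS_CAPS) == 0)
		return 0;

	off = cfg[PCI_CAPPTR_OFF] & 0xfc;
	while (off >= PCI_HDR_SIZE && guard-- > 0) {
		if (cfg[off] == id)		/* byte 0: capability ID */
			return off;
		off = cfg[off + 1] & 0xfc;	/* byte 1: next pointer */
	}
	return 0;
}

/*
 * Return the first offset past the standard header where the two
 * images differ, or -1 if they are identical there.
 */
static int
first_changed_offset(const uint8_t pre[PCI_CFG_SIZE],
    const uint8_t post[PCI_CFG_SIZE])
{
	for (int off = PCI_HDR_SIZE; off < PCI_CFG_SIZE; off++)
		if (pre[off] != post[off])
			return off;
	return -1;
}
```

If first_changed_offset() lands inside the range that starts at
find_capability(cfg, PCI_CAP_ID_MSI), the MSI registers were lost across
suspend and would need to be saved and restored.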


 Thanks.



Thanks for looking at this!

David




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-27 Thread Masanobu SAITOH


On 2018/11/27 17:27, Masanobu SAITOH wrote:

  Hi, David.

On 2018/11/26 6:11, David Brownlee wrote:

I've bisected the changes against the github src copy, and it looks like the 
suspend/resume issue is related to the following commit:

commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
Author: jdolecek 
Date:   Mon Oct 22 20:57:07 2018 +

     enable MSI support where available, blatantly copied from jmcneill's msk(4)

I tried building from HEAD with just that one commit reverted, and my T420s 
suspends and resumes again!

iwn0 is still non responsive after resume and wm0 will not pick up an IP via 
dhcpcd, but the disk responds :-p


  (Note that I'm not familiar with suspend/resume though...)

  Our pci_suspend()/pci_resume() copy only first 16 bytes of each PCI


s/16 bytes/64 bytes/


config space. Other OSes copy some other control registers and
MSI/MSI-X capability area.

  Could you dump all PCI config space both before and after suspend with:

 http://www.netbsd.org/~msaitoh/pcidump

and put the two outputs somewhere? Diffing the two outputs will show
us what we have to do.

  Thanks in advance.



David

On Sat, 24 Nov 2018 at 22:47, David Brownlee wrote:

    On Sat, 24 Nov 2018 at 18:52, David H. Gutteridge wrote:
 >
 > On Fri, 2018-11-23 at 21:42 +, David Brownlee wrote:
 > > Another couple of data points in case it helps
 > >
 > > Tested on Thinkpad T420s and T530 with NetBSD/amd64 - both have
 > > similar behaviour
 > >
 > > 8.99.25 Single user:
 > > - Suspends and seems to resume but hangs on first disk access "wd0a:
 > > device timeout reading fsbn ..."
 >
 > Yes, I get that too. pgoyette@ suggested I follow up with jdolecek@
 > about it, but I haven't had time yet to look for more details. There
 > are a number of PRs that jdolecek@ was working on fixing that
 > reference "clearing WDCTL_RST failed for drive" in the dmesg. In my
 > case, I get that error on both 8.0_STABLE and 8.99.26 (after his
 > latest changes), but it seems like it's a red herring or there's more
 > to it, because 8 still resumes reliably regardless of that warning,
 > while HEAD behaves as you've seen. I just keep getting continuous
 > output with "wd0a: device timeout writing fsbn X of X..."

    I asked jdolecek@ if it might be worth bisecting to find out when the
    hang was introduced, and he replied it was.
    I've just started using the github copy of src. Mon Oct 22 2018 was "good"

 > > netbsd-8 Single user:
 > > - Suspend (hw.acpi.sleep.state=3) and resume appears to work reliably
 > > many times in a row
 > > - Booting multi user after suspend/resume: wireless iwn0 does not
 > > appear to work "iwn0: could not load firmware .text section"
 >
 > I see that too. I haven't looked into it yet, but wondered if it was
 > as simple as forcing it to reload its firmware after resumption.

    Mmm, the man page indicates "iwn0: could not load firmware .text
    section" is reported when it attempted to
    load the firmware from disk into the device but failed, so it may be a
    little more than that :/

 > (Actually, my iwn didn't work at all, originally, because it requires
 > a different firmware file than any that are distributed by NetBSD at
 > present, and needed an addition in the driver to target that firmware.
 > I made those changes in my tree and have been testing with them on
 > both 8 and HEAD.)
 >
 > > netbsd-8 Multi user no x11:
 > > - Suspends, keyboard *usually* non responsive on resume (but can
 > > switch virtual terminals)
 >
 > I've never had this problem, I've found my T420 consistently responsive
 > whether I'm at a console or have suspended with X running (typically
 > with an Xfce4 session). When it comes back, no issues there (aside from
 > iwn).

    That's definitely encouraging!

    David







--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ThinkPad - suspend-to-RAM intel-x86 issues and tests

2018-11-27 Thread Masanobu SAITOH


 Hi, David.

On 2018/11/26 6:11, David Brownlee wrote:

I've bisected the changes against the github src copy, and it looks like the 
suspend/resume issue is related to the following commit:

commit 0fe469276f49bf0dc003300e0b8a35a80b7b246d (HEAD)
Author: jdolecek 
Date:   Mon Oct 22 20:57:07 2018 +

     enable MSI support where available, blatantly copied from jmcneill's msk(4)

I tried building from HEAD with just that one commit reverted, and my T420s 
suspends and resumes again!

iwn0 is still non responsive after resume and wm0 will not pick up an IP via 
dhcpcd, but the disk responds :-p


 (Note that I'm not familiar with suspend/resume though...)

 Our pci_suspend()/pci_resume() copy only the first 16 bytes of each PCI
config space. Other OSes also copy some other control registers and the
MSI/MSI-X capability area.

 Could you dump all PCI config space both before and after suspend with:

http://www.netbsd.org/~msaitoh/pcidump

and put the two outputs somewhere? Diffing the two outputs will show
us what we have to do.
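
The comparison asked for above could be sketched roughly as follows. This is only an illustration: it assumes the two dumps have already been parsed into arrays of 32-bit registers (the pcidump output format is not shown in this thread, so parsing is left out), and the names pci_conf_diff and PCI_CONF_SIZE are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define PCI_CONF_SIZE 256	/* conventional PCI config space, in bytes */

/*
 * Compare two config-space snapshots 32 bits at a time and print every
 * register that changed.  Returns the number of differing registers.
 * Hypothetical helper, not part of any NetBSD API.
 */
static int
pci_conf_diff(const uint32_t *before, const uint32_t *after, size_t nbytes)
{
	int ndiff = 0;

	assert(nbytes % 4 == 0);
	for (size_t off = 0; off < nbytes; off += 4) {
		uint32_t b = before[off / 4];
		uint32_t a = after[off / 4];

		if (b != a) {
			printf("0x%02zx: 0x%08x -> 0x%08x\n", off,
			    (unsigned)b, (unsigned)a);
			ndiff++;
		}
	}
	return ndiff;
}
```

Registers that show up in such a diff (for example inside an MSI capability) would be candidates for saving and restoring across suspend.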

 Thanks in advance.



David

On Sat, 24 Nov 2018 at 22:47, David Brownlee <a...@absd.org> wrote:

On Sat, 24 Nov 2018 at 18:52, David H. Gutteridge <da...@gutteridge.ca> wrote:
 >
 > On Fri, 2018-11-23 at 21:42 +, David Brownlee wrote:
 > > Another couple of data points in case it helps
 > >
 > > Tested on Thinkpad T420s and T530 with NetBSD/amd64 - both have
 > > similar behaviour
 > >
 > > 8.99.25 Single user:
 > > - Suspends and seems to resume but hangs on first disk access "wd0a:
 > > device timeout reading fsbn ..."
 >
 > Yes, I get that too. pgoyette@ suggested I follow up with jdolecek@
 > about it, but I haven't had time yet to look for more details. There
 > are a number of PRs that jdolecek@ was working on fixing that
 > reference "clearing WDCTL_RST failed for drive" in the dmesg. In my
 > case, I get that error on both 8.0_STABLE and 8.99.26 (after his
 > latest changes), but it seems like it's a red herring or there's more
 > to it, because 8 still resumes reliably regardless of that warning,
 > while HEAD behaves as you've seen. I just keep getting continuous
 > output with "wd0a: device timeout writing fsbn X of X..."

I asked jdolecek@ if it might be worth bisecting to find out when the
hang was introduced, and he replied it was.
I've just started using the github copy of src. Mon Oct 22 2018 was "good"

 > > netbsd-8 Single user:
 > > - Suspend (hw.acpi.sleep.state=3) and resume appears to work reliably
 > > many times in a row
 > > - Booting multi user after suspend/resume: wireless iwn0 does not
 > > appear to work "iwn0: could not load firmware .text section"
 >
 > I see that too. I haven't looked into it yet, but wondered if it was
 > as simple as forcing it to reload its firmware after resumption.

Mmm, the man page indicates "iwn0: could not load firmware .text
section" is reported when it attempted to
load the firmware from disk into the device but failed, so it may be a
little more than that :/

 > (Actually, my iwn didn't work at all, originally, because it requires
 > a different firmware file than any that are distributed by NetBSD at
 > present, and needed an addition in the driver to target that firmware.
 > I made those changes in my tree and have been testing with them on
 > both 8 and HEAD.)
 >
 > > netbsd-8 Multi user no x11:
 > > - Suspends, keyboard *usually* non responsive on resume (but can
 > > switch virtual terminals)
 >
 > I've never had this problem, I've found my T420 consistently responsive
 > whether I'm at a console or have suspended with X running (typically
 > with an Xfce4 session). When it comes back, no issues there (aside from
 > iwn).

That's definitely encouraging!

David




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Panic in ahci_detach

2018-11-02 Thread Masanobu SAITOH


On 2018/11/03 6:28, Jaromír Doleček wrote:

On Thu, 1 Nov 2018 at 06:38, Masanobu SAITOH wrote:

Did the meaning of atac_nchannels change, or did the channel
numbering change?


ahci_detach() counted improperly. Can you confirm rev. 1.66 of
dev/ic/ahcisata_core.c fixes the problem?

Jaromir


I've confirmed that this problem is now fixed.

Thanks!

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Panic in ahci_detach

2018-10-31 Thread Masanobu SAITOH


 Hi.

 One of my machines panics in ahci_detach during shutdown.

dmesg related to ahci:

pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 8086 product 1980 (rev. 0x10)
pchb1 at pci0 dev 4 function 0: vendor 8086 product 19a1 (rev. 0x10)
ppb5 at pci0 dev 14 function 0: vendor 8086 product 19a8 (rev. 0x10)
ppb5: PCI Express capability version 2  x2 @ 8.0GT/s
pci6 at ppb5 bus 6
pci6: i/o space, memory space enabled, rd/line, wr/inv ok
allocated pic msi5 type edge pin 0 level 6 to cpu0 slot 21 idt entry 110
ahcisata0: interrupting at msi5 vec 0
ahcisata0: 64-bit DMA
ahcisata0: AHCI revision 1.31, 1 port, 32 slots, CAP 0xc3369f40
atabus0 at ahcisata0 channel 3
ahcisata1 at pci0 dev 20 function 0: vendor 8086 product 19c2 (rev. 0x10)
allocated pic msi6 type edge pin 0 level 6 to cpu0 slot 22 idt entry 111
ahcisata1: interrupting at msi6 vec 0
ahcisata1: 64-bit DMA
ahcisata1: AHCI revision 1.31, 2 ports, 32 slots, CAP 0xc3369f41
atabus1 at ahcisata1 channel 4
atabus2 at ahcisata1 channel 5


ahcisata1 has two channels: channel 4 and channel 5.
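
Why this layout can trip a detach loop may be easier to see in a small model. The sketch below is not the driver code: it assumes a port bitmap of 0x30 (inferred from the channel numbers 4 and 5) and a channel count of 2 (from "2 ports" in the dmesg), and the function names and MAX_PORTS are hypothetical. Whether rev. 1.66 fixed the problem in exactly this way is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_PORTS 32	/* stand-in for AHCI_MAX_PORTS */

/*
 * Compare the hardware port number itself against the channel count.
 * Returns how many ports would be detached before the loop bails out.
 */
static int
detach_by_port_index(uint32_t ports, int nchannels)
{
	int detached = 0;

	assert(nchannels >= 0 && nchannels <= MAX_PORTS);
	for (int i = 0; i < MAX_PORTS; i++) {
		if ((ports & (1U << i)) == 0)
			continue;
		if (i >= nchannels)
			break;		/* "more ports than announced" */
		detached++;
	}
	return detached;
}

/* Same walk, but count implemented ports separately from the bit index. */
static int
detach_by_port_count(uint32_t ports, int nchannels)
{
	int detached = 0;

	assert(nchannels >= 0 && nchannels <= MAX_PORTS);
	for (int i = 0; i < MAX_PORTS; i++) {
		if ((ports & (1U << i)) == 0)
			continue;
		if (detached >= nchannels)
			break;
		detached++;
	}
	return detached;
}
```

With the assumed values, the index-based check stops at port 4 without detaching anything, while the count-based check handles both ports.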


ahcisata1 port 5: device present, speed: 6.0Gb/s
wd0 at atabus2 drive 0
wd0: 
wd0: drive supports 16-sector PIO transfers, LBA48 addressing
wd0: 223 GB, 465141 cyl, 16 head, 63 sec, 512 bytes/sect x 468862128 sectors
wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133), WRITE DMA FUA, NCQ (32 tags)
wd0(ahcisata1:5:0): using PIO mode 4, DMA mode 2, Ultra-DMA mode 6 (Ultra/133) (using DMA), NCQ (31 tags)



diff for debugging:
Index: sys/kern/subr_autoconf.c
===
RCS file: /cvsroot/src/sys/kern/subr_autoconf.c,v
retrieving revision 1.263
diff -u -p -r1.263 subr_autoconf.c
--- sys/kern/subr_autoconf.c	18 Sep 2018 01:25:09 -	1.263
+++ sys/kern/subr_autoconf.c	1 Nov 2018 05:33:15 -
@@ -1711,6 +1711,9 @@ config_detach(device_t dev, int flags)
 	device_t d __diagused;
 	int rv = 0;
 
+	if (dev->dv_cfdata != NULL && (flags & DETACH_QUIET) == 0)
+		aprint_normal_dev(dev, "detaching\n");
+
 	cf = dev->dv_cfdata;
 	KASSERTMSG((cf == NULL || cf->cf_fstate == FSTATE_FOUND ||
 		cf->cf_fstate == FSTATE_STAR),
Index: sys/dev/ic/ahcisata_core.c
===
RCS file: /cvsroot/src/sys/dev/ic/ahcisata_core.c,v
retrieving revision 1.65
diff -u -p -r1.65 ahcisata_core.c
--- sys/dev/ic/ahcisata_core.c  24 Oct 2018 19:38:00 -  1.65
+++ sys/dev/ic/ahcisata_core.c  1 Nov 2018 05:33:15 -
@@ -508,15 +508,21 @@ ahci_detach(struct ahci_softc *sc, int f
 	atac = &sc->sc_atac;
 	adapt = &atac->atac_atapi_adapter._generic;
 
+	printf("%s: sc_ahci_ports = %08x\n", __func__, sc->sc_ahci_ports);
+	printf("%s: sc_atac.atac_nchannels = %08x\n", __func__,
+	    sc->sc_atac.atac_nchannels);
 	for (i = 0; i < AHCI_MAX_PORTS; i++) {
 		achp = &sc->sc_channels[i];
 		chp = &achp->ata_channel;
 
+		printf("%s: port %d\n", __func__, i);
 		if ((sc->sc_ahci_ports & (1U << i)) == 0)
 			continue;
+		printf("%s: detach %d (atac_nchannels = %d)\n", __func__, i,
+		    sc->sc_atac.atac_nchannels);
 		if (i >= sc->sc_atac.atac_nchannels) {
-			aprint_error("%s: more ports than announced\n",
-			    AHCINAME(sc));
+			aprint_error("%s: %s: more ports than announced\n",
+			    __func__, AHCINAME(sc));
 			break;
 		}
 



console output:

[ 837.8647946] pci8: detaching
[ 837.8647946] pci8: detached
[ 837.8647946] pci7: detaching
[ 837.8748051] pci7: detached
[ 837.8748051] usb1: detaching
[ 837.8748051] usb1: detached
[ 837.8748051] usb0: detaching
[ 837.8848152] usb0: detached
[ 837.8848152] atabus2: detaching
[ 837.8848152] wd0: detaching
[ 837.8948259] atabus1: detaching
[ 837.8948259] atabus1: detached
[ 837.8948259] atabus0: detaching
[ 837.9048363] atabus0: detached
[ 837.9048363] pci6: detaching
[ 837.9048363] pci6: detached
[ 837.9048363] pci5: detaching
[ 837.9148468] pci5: detached
[ 837.9148468] pci4: detaching
[ 837.9148468] pci4: detached
[ 837.9148468] pci3: detaching
[ 837.9248570] pci3: detached
[ 837.9248570] pci2: detaching
[ 837.9248570] pci2: detached
[ 837.9348674] pci1: detaching
[ 837.9348674] pci1: detached
[ 837.9348674] coretemp3: detaching
[ 837.9348674] acpicpu3: detaching
[ 837.9448781] coretemp2: detaching
[ 837.9448781] acpicpu2: detaching
[ 837.9448781] coretemp1: detaching
[ 837.9548884] acpicpu1: detaching
[ 837.9548884] coretemp0: detaching
[ 837.9548884] acpicpu0: detaching
[ 837.9648986] ichsmb0: detaching
[ 837.9648986] pcib0: detaching
[ 837.9648986] isa0: detaching
[ 837.9648986] sdhc0: detaching
[ 837.9749091] ppb7: detaching
[ 837.9749091]

Re: Panic with recent -current with interrupt setup

2018-10-02 Thread Masanobu SAITOH


On 2018/10/03 5:47, Brad Spencer wrote:

m...@netbsd.org writes:


On Tue, Oct 02, 2018 at 06:55:48AM -0400, Brad Spencer wrote:


Just wondering if anyone else has seen this, but I am getting panics on
boot during probe with sources after 2018-09-23 [at some point, at least
2018-09-29 and 2018-10-01 panic, but 2018-09-23 doesn't].  This is with
trying to use the stock XEN3_DOM0 kernel on a new system I am setting
up.  The panics seem related to setting up interrupts or printing
interrupt information in the intel wm(4) driver.  The system in question
does not have a serial port on it in any form, but I can probably
capture a screen shot of the panic.  The keyboard works and ddb seems
usable.


I assume this is related to cherry's recent xen interrupt work.
If you're unable to type at the ddb prompt but can reproduce the crash,
but can see the output, it'd be interesting to see if it gives more
info with

  options   DDB_COMMANDONENTER="bt"

As a kernel option.

But having the panic string would be nice too.


Here is more information:

Screen shot of the panic:
http://www.netbsd.org/~brad/PANIC_1.jpg
http://www.netbsd.org/~brad/PANIC_2.jpg

Screen shot of the ddb bt command, sorry for the quality:
http://www.netbsd.org/~brad/BT_1.jpg
http://www.netbsd.org/~brad/BT_2.jpg
http://www.netbsd.org/~brad/BT_3.jpg

Hopefully I also managed to attach a couple of files that are of a
working NetBSD dmesg and a working xl dmesg.  The Xen version I am using
is 4.8.3 built from source pulled from HEAD on Saturday or so.

This system is pretty new.  It has a 4 core Ryzen CPU, 16GB memory.  A
two port Intel NIC is also present, that would be wm0 and wm1 in the
dmesg.  The motherboard has an Intel NIC on it as well, which is wm2.

If I do a dmesg in ddb I can get the reason for the panic:

panic: kernel diagnostic assertion "irq2vect[irq] == 0" failed: file "/usr/src/sys/arch/xen/x86/pintr.c", line 202

It looks like this may have triggered on the onboard wm2 interface.



Try reverting x86/pci/pci_intr_machdep.c rev. 1.45:

http://mail-index.netbsd.org/source-changes/2018/09/23/msg099361.html


--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

SIOC[GZ]IFDATA fix

2018-09-11 Thread Masanobu SAITOH


 Hi.

 I noticed that -current's SIOC[GZ]IFDATA doesn't work correctly.
It's OK on netbsd-8.


static int
doifioctl(struct socket *so, u_long cmd, void *data, struct lwp *l)
{
	struct ifnet *ifp;
	struct ifreq *ifr;
	int error = 0;
#if defined(COMPAT_OSOCK) || defined(COMPAT_OIFREQ)
	u_long ocmd = cmd;
#endif
	short oif_flags;
#ifdef COMPAT_OIFREQ
	struct ifreq ifrb;
	struct oifreq *oifr = NULL;
#endif
	int r;
	struct psref psref;
	int bound;

	switch (cmd) {
	case SIOCGIFCONF:
		return ifconf(cmd, data);
	case SIOCINITIFADDR:
		return EPERM;
	default:
		error = (*vec_compat_ifconf)(l, cmd, data);
		if (error != ENOSYS)
			return error;
		error = (*vec_compat_ifdatareq)(l, cmd, data);
		if (error != ENOSYS) <<==
			return error;
		break;
	}


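The call to vec_compat_ifdatareq() always returns something other than
ENOSYS for SIOC[GZ]IFDATA, so doifioctl() returns at the marked line and
never reaches the native handling.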
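
The convention at work here is that a compat hook must return ENOSYS for every command it does not own, so that the caller falls through to the native code. A minimal stand-alone model of that convention follows; the command numbers and function names are invented for illustration and are not the kernel's.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical command numbers; the real ioctl values differ. */
enum { CMD_GIFDATA = 1, CMD_OGIFDATA = 2 };

/* Buggy hook: it matches the native command, so it never says ENOSYS. */
static int
compat_buggy(int cmd)
{
	switch (cmd) {
	case CMD_GIFDATA:	/* wrong: this is the native command */
		return 0;	/* pretend the old-format copy succeeded */
	default:
		return ENOSYS;
	}
}

/* Fixed hook: it only claims the old command. */
static int
compat_fixed(int cmd)
{
	switch (cmd) {
	case CMD_OGIFDATA:
		return 0;
	default:
		return ENOSYS;
	}
}

/* Returns 1 if the native handler would be reached, 0 if compat took it. */
static int
reaches_native(int (*compat)(int), int cmd)
{
	int error;

	assert(compat != NULL);
	error = compat(cmd);
	if (error != ENOSYS)
		return 0;	/* early return, as in the marked line */
	return 1;
}
```

In this model the buggy hook swallows the native command, while the fixed hook lets it through and still handles the old one.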


Patch:

Index: uipc_syscalls_50.c
===
RCS file: /cvsroot/src/sys/compat/common/uipc_syscalls_50.c,v
retrieving revision 1.5
diff -u -p -r1.5 uipc_syscalls_50.c
--- uipc_syscalls_50.c  26 Apr 2018 08:11:18 -  1.5
+++ uipc_syscalls_50.c  11 Sep 2018 09:45:31 -
@@ -65,23 +65,24 @@ compat_ifdatareq(struct lwp *l, u_long c
 
 	/* Validate arguments. */
 	switch (cmd) {
-	case SIOCGIFDATA:
-	case SIOCZIFDATA:
-		ifp = ifunit(ifdr->ifdr_name);
-		if (ifp == NULL)
-			return ENXIO;
+	case OSIOCGIFDATA:
+	case OSIOCZIFDATA:
 		break;
 	default:
 		return ENOSYS;
 	}
 
+	ifp = ifunit(ifdr->ifdr_name);
+	if (ifp == NULL)
+		return ENXIO;
+
 	/* Do work. */
 	switch (cmd) {
-	case SIOCGIFDATA:
+	case OSIOCGIFDATA:
 		ifdatan2o(&ifdr->ifdr_data, &ifp->if_data);
 		return 0;
 
-	case SIOCZIFDATA:
+	case OSIOCZIFDATA:
 		if (l != NULL) {
 			error = kauth_authorize_network(l->l_cred,
 			    KAUTH_NETWORK_INTERFACE,


OK?


--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: CTL_*_NAMES macros

2018-08-21 Thread Masanobu SAITOH


rjs@ wrote:


Masanobu SAITOH  wrote:

% egrep -r 'CTL_.*_NAMES' .
./sys/arch/m68k/include/sysctl.h:#ifndef CTL_MACHDEP_NAMES
./sys/arch/m68k/include/sysctl.h:#endif /* CTL_MACHDEP_NAMES */


These are not examples of sysctl counter names, they are just the
multiple inclusion prevention symbols for that file.


In reality, CTL_MACHDEP_NAMES was not defined anywhere :(
I cleaned up the m68k's CPU_* macros for CTL_MACHDEP sysctl
by moving them into m68k/include/cpu.h


On 2018/08/11 17:39, Christos Zoulas wrote:

In article <41393cac-bf42-08dd-da98-4d4fabbd5...@execsw.org>,
Masanobu SAITOH   wrote:


Are these required?
Are these maintained?
Can we remove these macros?

The following changes are related:
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138991.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138992.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138993.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138994.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138995.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138996.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138997.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138998.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138999.html


It seems FreeBSD removed those macros in the following commit:

https://svnweb.freebsd.org/base?view=revision=254925



Let's remove them.


 Done!


christos




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

error of makerumpdefs.sh

2018-08-21 Thread Masanobu SAITOH


 Hi.

 While cleaning up some header files, I noticed that
sys/rump/include/rump/makerumpdefs.sh complains about some sed lines:


% sh -f makerumpdefs.sh
Generating rumpdefs.h
sed: 1: "/#define[  ]*_FCN/{:t;N ...": unexpected EOF (pending }'s)
sed: 1: "/#define[  ]*_IO.*\\$/{ ...": unexpected EOF (pending }'s)
sed: 1: "/#define[  ]*_IO.*[^\]$ ...": bad flag in substitute command: '}'
Generating rumperr.h
Generating rumperrno2host.h


 Which is wrong: makerumpdefs.sh, sed, or the header files?

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: if_addrflags6: Can't assign requested address

2018-08-17 Thread Masanobu SAITOH


On 2018/08/12 0:11, Roy Marples wrote:

Hi

On 08/08/2018 03:13, Masanobu SAITOH wrote:

  Hi.

  While testing netbsd-7, I've noticed dhcpcd put the following
message:


Configuring network interfaces: wm0wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address


  Can we ignore this message, or is it a real problem?

/etc/dhcpcd.conf is the default.


I just got back and cannot replicate this issue with the latest netbsd-7 
sources which ship with dhcpcd-7.0.7.

I use a XEN DOMU for testing. Can you provide more information please?


In /etc/rc.conf:
ip6mode=autohost
ifconfig_wm2=dhcp
dhcpcd_flags="-d"

No /etc/ifconfig.wm2.

At boot:


Starting network.
Hostname: amd64-n7.execsw.org
IPv6 mode: autoconfigured host
Configuring network interfaces: wm2dhcpcd-7.0.7 starting
wm2: executing `/libexec/dhcpcd-run-hooks' PREINIT
wm2: executing `/libexec/dhcpcd-run-hooks' CARRIER
DUID 00:01:00:01:1c:4b:d9:02:00:26:55:57:20:ac
wm2: IAID 90:82:9b:a4
wm2: adding address fe80::1392:4012:56d8:a7a2
wm2: pltime infinity, vltime infinity
wm2: delaying IPv6 router solicitation for 0.5 seconds
wm2: delaying IPv4 for 0.9 seconds
wm2: carrier lost
wm2: executing `/libexec/dhcpcd-run-hooks' NOCARRIER
wm2: deleting address fe80::1392:4012:56d8:a7a2
wm2: carrier acquired
wm2: executing `/libexec/dhcpcd-run-hooks' CARRIER
wm2: IAID 90:82:9b:a4
wm2: adding address fe80::1392:4012:56d8:a7a2
wm2: pltime infinity, vltime infinity
wm2: delaying IPv6 router solicitation for 0.9 seconds
wm2: delaying IPv4 for 0.2 seconds
wm2: carrier lost
wm2: executing `/libexec/dhcpcd-run-hooks' NOCARRIER
wm2: deleting address fe80::1392:4012:56d8:a7a2
wm2: if_addrflags6: Can't assign requested address
wm2: if_addrflags6: Can't assign requested address
wm2: if_addrflags6: Can't assign requested address
wm2: if_addrflags6: Can't assign requested address
wm2: carrier acquired
wm2: executing `/libexec/dhcpcd-run-hooks' CARRIER
wm2: IAID 90:82:9b:a4
wm2: adding address fe80::1392:4012:56d8:a7a2
wm2: pltime infinity, vltime infinity
wm2: delaying IPv6 router solicitation for 0.4 seconds
wm2: delaying IPv4 for 0.2 seconds
wm2: reading lease `/var/db/dhcpcd/wm2.lease'
wm2: rebinding lease of 192.168.0.178
wm2: sending REQUEST (xid 0x63705096), next in 4.4 seconds
wm2: acknowledged 192.168.0.178 from 192.168.0.2
wm2: probing address 192.168.0.178/24
wm2: probing for 192.168.0.178
wm2: ARP probing 192.168.0.178 (1 of 3), next in 1.6 seconds
wm2: soliciting an IPv6 router
wm2: delaying Router Solicitation for LL address
wm2: sending Router Solicitation
wm2: ARP probing 192.168.0.178 (2 of 3), next in 1.8 seconds
wm2: ARP probing 192.168.0.178 (3 of 3), next in 2.0 seconds
wm2: sending Router Solicitation
wm2: DAD completed for 192.168.0.178
wm2: leased 192.168.0.178 for 43200 seconds
wm2: renew in 21600 seconds, rebind in 37800 seconds
wm2: writing lease `/var/db/dhcpcd/wm2.lease'
wm2: adding IP address 192.168.0.178/24 broadcast 192.168.0.255
wm2: adding route to 192.168.0.0/24
wm2: adding default route via 192.168.0.2
wm2: ARP announcing 192.168.0.178 (1 of 2), next in 2.0 seconds
wm2: executing `/libexec/dhcpcd-run-hooks' BOUND
forking to background
forked to background, child pid 250
.
Adding interface aliases:.
Waiting for DAD completion for statically configured addresses...


Does this log help?



Roy





--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: CTL_*_NAMES macros

2018-08-10 Thread Masanobu SAITOH


On 2018/08/10 17:22, Masanobu SAITOH wrote:

  Hi.

  While debugging some counters which can be taken from sysctl,
I've noticed that some macros are not used from anywhere.


% egrep -r 'CTL_.*_NAMES' .
./sys/arch/m68k/include/sysctl.h:#ifndef CTL_MACHDEP_NAMES
./sys/arch/m68k/include/sysctl.h:#endif /* CTL_MACHDEP_NAMES */
./sys/net/if.h:#define CTL_IFQ_NAMES  { \
./sys/netinet/in.h:#define  CTL_IPPROTO_NAMES { \
./sys/netinet6/in6.h:#define CTL_IPV6PROTO_NAMES { \
./sys/sys/mbuf.h:#define    CTL_MBUF_NAMES { \
./sys/sys/mount.h:#define   CTL_VFS_NAMES { \
./sys/sys/mount.h:#define   CTL_VFSGENCTL_NAMES { \
./sys/sys/pipe.h:#define    CTL_PIPE_NAMES { \
./sys/sys/socket.h:#define CTL_NET_NAMES { \
./sys/sys/socket.h:#define CTL_NET_RT_NAMES { \
./sys/sys/sysctl.h:#define  CTL_KERN_NAMES { \
./sys/sys/sysctl.h:#define  CTL_HW_NAMES { \
./sys/sys/sysctl.h:#define  CTL_USER_NAMES { \
./sys/sys/sysctl.h:#define  CTL_DDB_NAMES { \
./sys/uvm/uvm_param.h:#define   CTL_VM_NAMES { \
./usr.bin/nfsstat/nfsstat.c:    mib[1] = 2; /* XXX from CTL_VFS_NAMES in  */
./external/bsd/libpcap/dist/ieee80211.h:#define IEEE80211_CTL_SUBTYPE_NAMES { \


Are these required?
Are these maintained?
Can we remove these macros?

The following changes are related:
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138991.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138992.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138993.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138994.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138995.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138996.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138997.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138998.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138999.html


It seems FreeBSD removed those macros in the following commit:

https://svnweb.freebsd.org/base?view=revision=254925


--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

CTL_*_NAMES macros

2018-08-10 Thread Masanobu SAITOH


 Hi.

 While debugging some counters which can be read via sysctl,
I noticed that some macros are not used anywhere.


% egrep -r 'CTL_.*_NAMES' .
./sys/arch/m68k/include/sysctl.h:#ifndef CTL_MACHDEP_NAMES
./sys/arch/m68k/include/sysctl.h:#endif /* CTL_MACHDEP_NAMES */
./sys/net/if.h:#define CTL_IFQ_NAMES  { \
./sys/netinet/in.h:#define  CTL_IPPROTO_NAMES { \
./sys/netinet6/in6.h:#define CTL_IPV6PROTO_NAMES { \
./sys/sys/mbuf.h:#define    CTL_MBUF_NAMES { \
./sys/sys/mount.h:#define   CTL_VFS_NAMES { \
./sys/sys/mount.h:#define   CTL_VFSGENCTL_NAMES { \
./sys/sys/pipe.h:#define    CTL_PIPE_NAMES { \
./sys/sys/socket.h:#define CTL_NET_NAMES { \
./sys/sys/socket.h:#define CTL_NET_RT_NAMES { \
./sys/sys/sysctl.h:#define  CTL_KERN_NAMES { \
./sys/sys/sysctl.h:#define  CTL_HW_NAMES { \
./sys/sys/sysctl.h:#define  CTL_USER_NAMES { \
./sys/sys/sysctl.h:#define  CTL_DDB_NAMES { \
./sys/uvm/uvm_param.h:#define   CTL_VM_NAMES { \
./usr.bin/nfsstat/nfsstat.c:    mib[1] = 2; /* XXX from CTL_VFS_NAMES in  */
./external/bsd/libpcap/dist/ieee80211.h:#define IEEE80211_CTL_SUBTYPE_NAMES { \


Are these required?
Are these maintained?
Can we remove these macros?

The following changes are related:
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138991.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138992.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138993.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138994.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138995.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138996.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138997.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138998.html
http://mail-index.netbsd.org/source-changes/2003/12/04/msg138999.html

--
-
 Masanobu SAITOH(masan...@iij.ad.jp
  msai...@netbsd.org)

if_addrflags6: Can't assign requested address

2018-08-07 Thread Masanobu SAITOH


 Hi.

 While testing netbsd-7, I noticed that dhcpcd prints the following
message:


Configuring network interfaces: wm0wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address
wm0: if_addrflags6: Can't assign requested address


 Can we ignore this message, or is it a real problem?

/etc/dhcpcd.conf is the default.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: kernels with "pseudo-device pfsync" fail to build

2018-06-26 Thread Masanobu SAITOH


On 2018/06/27 11:58, John D. Baker wrote:

After the recent bpf_tap/bpf_mtap change, kernels which include:

   pseudo-device pfsync

fail to build:

[...]
--- if_pfsync.o ---
/x/current/src/sys/dist/pf/net/if_pfsync.c: In function 'pfsync_tdb_sendout':
/x/current/src/sys/dist/pf/net/if_pfsync.c:1559:2: error: too few arguments to 
function 'bpf_mtap'
   bpf_mtap(ifp, m);
   ^~~~
In file included from /x/current/src/sys/dist/pf/net/if_pfsync.c:51:0:
/x/current/src/sys/net/bpf.h:458:1: note: declared here
  bpf_mtap(struct ifnet *_ifp, struct mbuf *_m, u_int _direction)
  ^~~~
*** [if_pfsync.o] Error code 1

nbmake: stopped in /r0/build/current/obj/amd64/sys/arch/amd64/compile/PLEX760
--- if_plip.o ---
/r0/build/current/tools/amd64/bin/nbctfconvert -g -L VERSION -g if_plip.o
1 error

nbmake: stopped in /r0/build/current/obj/amd64/sys/arch/amd64/compile/PLEX760

ERROR: Failed to make all in 
"/r0/build/current/obj/amd64/sys/arch/amd64/compile/PLEX760"
*** BUILD ABORTED ***




Fixed. Thank you for the report.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))

2018-06-03 Thread Masanobu SAITOH


On 2018/06/03 18:20, 6b...@6bone.informatik.uni-leipzig.de wrote:

Hello,

I have applied

 http://www.netbsd.org/~msaitoh/ixgbe-eitr-20180522-0.dif
and
 http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180530-0.dif

to netbsd-8 RC1. With these patches the problem seems to be solved.


 Thanks. I've committed the latest patch now!




Thank you for your efforts

Regards
Uwe


On Fri, 1 Jun 2018, Masanobu SAITOH wrote:


Date: Fri, 1 Jun 2018 12:47:32 +0900
From: Masanobu SAITOH 
To: 6b...@6bone.informatik.uni-leipzig.de
Cc: msai...@execsw.org, Martin Husemann ,
    current-users@netbsd.org
Subject: Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg
    drivers (?))




  The same diff is at:

 http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180530-0.dif



Updated patch (Fix compile error and ixv patch):

--
Don't call ixgbe_rearm_queues() in ixgbe_local_timer1(). ixgbe_enable_queue()
and ixgbe_disable_queue() try to enable/disable the queue interrupt safely,
using an internal counter. When a queue's MSI-X interrupt is received,
ixgbe_msix_que() is called (IPL_NET). This function disables the queue's
interrupt with ixgbe_disable_queue() and schedules a softint. ixgbe_handle()
queue is called by the softint (IPL_SOFTNET), processes TX and RX, and calls
ixgbe_enable_queue() at the end.

ixgbe_local_timer1() is a callout and always runs on CPU 0 (IPL_SOFTCLOCK).
When ixgbe_rearm_queues() is called, an MSI-X interrupt is issued for a
specific queue. It may not be handled on CPU 0. If this interrupt's
ixgbe_msix_que() runs and calls softint_schedule() before the previous
softint's softint_execute() has run, the softint_schedule() fails because of
SOFTINT_PENDING. This breaks ixgbe_{enable,disable}_queue()'s internal
counter.

ixgbe_local_timer1() is written not to call ixgbe_rearm_queues() if
the interrupt is disabled, but it is called anyway because of an unknown bug
or a race.

One solution is not to use the internal counter, but that is a little
difficult. Another solution is to stop using ixgbe_rearm_queues() altogether.
Essentially, ixgbe_rearm_queues() is not required (it was added in ixgbe.c
rev. 1.43 (2016/12/01)). ixgbe_rearm_queues() helps with the lost interrupt
problem, but I've never seen that problem apart from the one caused by
ixgbe_rearm_queues() itself.


Index: ixgbe.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.158
diff -u -p -r1.158 ixgbe.c
--- ixgbe.c    30 May 2018 09:17:17 -    1.158
+++ ixgbe.c    1 Jun 2018 03:22:05 -
@@ -4411,6 +4411,7 @@ ixgbe_local_timer1(void *arg)
/* Only truely watchdog if all queues show hung */
if (hung == adapter->num_queues)
    goto watchdog;
+#if 0 /* XXX Avoid unexpectedly disabling interrupt forever (PR#53294) */
else if (queues != 0) { /* Force an IRQ on queues with work */
    que = adapter->queues;
    for (i = 0; i < adapter->num_queues; i++, que++) {
@@ -4421,6 +4422,7 @@ ixgbe_local_timer1(void *arg)
 			mutex_exit(&que->dc_mtx);
 		}
 	}
+#endif
 
 out:
 	callout_reset(&adapter->timer, hz, ixgbe_local_timer, adapter);
@@ -6643,7 +6645,7 @@ ixgbe_handle_link(void *context)
/
 * ixgbe_rearm_queues
 /
-static void
+static __inline void
ixgbe_rearm_queues(struct adapter *adapter, u64 queues)
{
u32 mask;
Index: ixv.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixv.c,v
retrieving revision 1.102
diff -u -p -r1.102 ixv.c
--- ixv.c    30 May 2018 08:35:26 -    1.102
+++ ixv.c    1 Jun 2018 03:22:05 -
@@ -1266,9 +1266,11 @@ ixv_local_timer_locked(void *arg)
/* Only truly watchdog if all queues show hung */
if (hung == adapter->num_queues)
    goto watchdog;
+#if 0
else if (queues != 0) { /* Force an IRQ on queues with work */
    ixv_rearm_queues(adapter, queues);
}
+#endif
 	callout_reset(&adapter->timer, hz, ixv_local_timer, adapter);
--

The same diff is at:

http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180531-0.dif

--
---
   SAITOH Masanobu (msai...@execsw.org
    msai...@netbsd.org)




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))

2018-05-31 Thread Masanobu SAITOH






  The same diff is at:

 http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180530-0.dif



Updated patch (Fix compile error and ixv patch):

--
 Don't call ixgbe_rearm_queues() in ixgbe_local_timer1(). ixgbe_enable_queue()
and ixgbe_disable_queue() try to enable/disable the queue interrupt safely,
using an internal counter. When a queue's MSI-X interrupt is received,
ixgbe_msix_que() is called (IPL_NET). This function disables the queue's
interrupt with ixgbe_disable_queue() and schedules a softint. ixgbe_handle()
queue is called by the softint (IPL_SOFTNET), processes TX and RX, and calls
ixgbe_enable_queue() at the end.

 ixgbe_local_timer1() is a callout and always runs on CPU 0 (IPL_SOFTCLOCK).
When ixgbe_rearm_queues() is called, an MSI-X interrupt is issued for a
specific queue. It may not be handled on CPU 0. If this interrupt's
ixgbe_msix_que() runs and calls softint_schedule() before the previous
softint's softint_execute() has run, the softint_schedule() fails because of
SOFTINT_PENDING. This breaks ixgbe_{enable,disable}_queue()'s internal
counter.

 ixgbe_local_timer1() is written not to call ixgbe_rearm_queues() if
the interrupt is disabled, but it is called anyway because of an unknown bug
or a race.

 One solution is not to use the internal counter, but that is a little
difficult. Another solution is to stop using ixgbe_rearm_queues() altogether.
Essentially, ixgbe_rearm_queues() is not required (it was added in ixgbe.c
rev. 1.43 (2016/12/01)). ixgbe_rearm_queues() helps with the lost interrupt
problem, but I've never seen that problem apart from the one caused by
ixgbe_rearm_queues() itself.
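
 The counter imbalance described above can be modelled in a few lines. This is a deliberately simplified sketch, not the driver: the struct and function names are hypothetical, there is no locking, and the real ixgbe_{enable,disable}_queue() nesting rules may differ in detail.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-queue state; field names are illustrative only. */
struct queue_model {
	int  disabled_count;	/* nesting counter kept by disable/enable */
	bool masked;		/* true while the queue interrupt is off */
	bool softint_pending;	/* a softint is scheduled but has not run */
};

static void
model_disable(struct queue_model *q)
{
	q->disabled_count++;
	q->masked = true;
}

static void
model_enable(struct queue_model *q)
{
	assert(q->disabled_count > 0);
	if (--q->disabled_count == 0)
		q->masked = false;
}

/* Hard interrupt: mask the queue, then try to schedule the softint. */
static void
model_hardintr(struct queue_model *q)
{
	model_disable(q);
	if (!q->softint_pending)
		q->softint_pending = true;
	/* else: scheduling is a no-op, so this disable is never paired */
}

/* Softint: do the work once, then re-enable once. */
static void
model_softint(struct queue_model *q)
{
	q->softint_pending = false;
	model_enable(q);
}
```

 If a second (forced) interrupt arrives before the first softint has run, the model ends with disabled_count at 1 and the queue still masked, which corresponds to the "interrupt disabled forever" symptom.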


Index: ixgbe.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.158
diff -u -p -r1.158 ixgbe.c
--- ixgbe.c 30 May 2018 09:17:17 -  1.158
+++ ixgbe.c 1 Jun 2018 03:22:05 -
@@ -4411,6 +4411,7 @@ ixgbe_local_timer1(void *arg)
/* Only truely watchdog if all queues show hung */
if (hung == adapter->num_queues)
goto watchdog;
+#if 0 /* XXX Avoid unexpectedly disabling interrupt forever (PR#53294) */
else if (queues != 0) { /* Force an IRQ on queues with work */
que = adapter->queues;
for (i = 0; i < adapter->num_queues; i++, que++) {
@@ -4421,6 +4422,7 @@ ixgbe_local_timer1(void *arg)
 			mutex_exit(&que->dc_mtx);
 		}
 	}
+#endif
 
 out:
 	callout_reset(&adapter->timer, hz, ixgbe_local_timer, adapter);
@@ -6643,7 +6645,7 @@ ixgbe_handle_link(void *context)
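 /************************************************************************
  * ixgbe_rearm_queues
  ************************************************************************/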
-static void
+static __inline void
 ixgbe_rearm_queues(struct adapter *adapter, u64 queues)
 {
u32 mask;
Index: ixv.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixv.c,v
retrieving revision 1.102
diff -u -p -r1.102 ixv.c
--- ixv.c   30 May 2018 08:35:26 -  1.102
+++ ixv.c   1 Jun 2018 03:22:05 -
@@ -1266,9 +1266,11 @@ ixv_local_timer_locked(void *arg)
/* Only truly watchdog if all queues show hung */
if (hung == adapter->num_queues)
goto watchdog;
+#if 0
else if (queues != 0) { /* Force an IRQ on queues with work */
ixv_rearm_queues(adapter, queues);
}
+#endif
 
 	callout_reset(&adapter->timer, hz, ixv_local_timer, adapter);
 
--


The same diff is at:

http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180531-0.dif

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))

2018-05-31 Thread Masanobu SAITOH


 Hi, all.

 New patch:

---
 Don't call ixgbe_rearm_queues() in ixgbe_local_timer1(). ixgbe_enable_queue()
and ixgbe_disable_queue() try to enable/disable a queue's interrupt safely,
using an internal counter. When a queue's MSI-X interrupt is received,
ixgbe_msix_que() is called (IPL_NET). This function disables the queue's
interrupt with ixgbe_disable_queue() and schedules a softint. The
ixgbe_handle() queue handler is then called by the softint (IPL_SOFTNET),
processes TX and RX, and calls ixgbe_enable_queue() at the end.

 ixgbe_local_timer1() is a callout and always runs on CPU 0 (IPL_SOFTCLOCK).
When ixgbe_rearm_queues() is called, an MSI-X interrupt is issued for a
specific queue, which may not be handled on CPU 0. If this interrupt's
ixgbe_msix_que() runs and calls softint_schedule() before the previous
softint's softint_execute() has been called, the softint_schedule() fails
because of SOFTINT_PENDING. This breaks ixgbe_{enable,disable}_queue()'s
internal counter.

 ixgbe_local_timer1() is written not to call ixgbe_rearm_queues() if
the interrupt is disabled, but it is called anyway because of an unknown bug
or a race.

 One solution is to stop using the internal counter, but that is a little
difficult. Another solution is to stop using ixgbe_rearm_queues() altogether.
Essentially, ixgbe_rearm_queues() is not required (it was added in ixgbe.c
rev. 1.43 (2016/12/01)). ixgbe_rearm_queues() helps with the lost interrupt
problem, but I've never seen lost interrupts other than those caused by
ixgbe_rearm_queues() itself.


Index: ixgbe.c
===
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.158
diff -u -p -r1.158 ixgbe.c
--- ixgbe.c 30 May 2018 09:17:17 -  1.158
+++ ixgbe.c 31 May 2018 09:51:19 -
@@ -4411,6 +4411,7 @@ ixgbe_local_timer1(void *arg)
/* Only truely watchdog if all queues show hung */
if (hung == adapter->num_queues)
goto watchdog;
+#if 0 /* XXX Avoid unexpectedly disabling interrupt forever (PR#53294) */
else if (queues != 0) { /* Force an IRQ on queues with work */
que = adapter->queues;
for (i = 0; i < adapter->num_queues; i++, que++) {
@@ -4421,6 +4422,7 @@ ixgbe_local_timer1(void *arg)
mutex_exit(&que->dc_mtx);
}
}
+#endif
 
 out:

callout_reset(&adapter->timer, hz, ixgbe_local_timer, adapter);
---

 The same diff is at:

http://www.netbsd.org/~msaitoh/ixgbe-norearm-20180530-0.dif


--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))

2018-05-29 Thread Masanobu SAITOH


Hello, Uwe.

On 2018/05/29 14:59, 6b...@6bone.informatik.uni-leipzig.de wrote:

Hello,

I have tested the patch with netbsd-8. The problem is not solved.


Thanks.

 It seems that the occurrence of this problem depends on the
hardware configuration. I've never seen this problem on some
machines.

 Today, I was able to set up a system on which this RX stall problem
occurs quickly (in a few minutes). I don't know if I can fix this problem soon.


 Thanks.





Regards
Uwe


On Mon, 28 May 2018, Masanobu SAITOH wrote:


Date: Mon, 28 May 2018 17:10:02 +0900
From: Masanobu SAITOH 
To: Martin Husemann ,
    6b...@6bone.informatik.uni-leipzig.de, current-users@netbsd.org
Cc: msai...@execsw.org
Subject: ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers
 (?))

On 2018/05/28 16:51, Martin Husemann wrote:

On Mon, May 28, 2018 at 09:46:21AM +0200, 6b...@6bone.informatik.uni-leipzig.de 
wrote:

Hello,

At the weekend I tried to update to a current version of netbsd-8 rc1.

After the restart, the kernel will work for a few hours. After that, no
packets will arrive at the network card.



Those who are using ixg(4) on netbsd-8 or -current, please try the following patch:

http://www.netbsd.org/~msaitoh/ixgbe-eitr-20180522-0.dif

This change might fix the RX stall problem. If you get a TX device timeout
or an RX stall, please report it with the output of:

sysctl hw |grep ixg


Regards.




The server is running normally. No
hints in dmesg.

Some network programs report issues:

zebra[371]: rtm_write: write : No buffer space available (55)

syslogd[541]: recvfrom() unix `/var/run/log': No buffer space available

gate zebra[1423]: routing socket error: No buffer space available


You are seeing two different issues here. The "No buffer space" is considered
harmless (it used to be silent, but the lossage should be the same).

The problem of ixg(4) no longer receiving packets is under investigation;
RC2 is waiting for a proposed patch to be tested.

Martin




--
---
   SAITOH Masanobu (msai...@execsw.org
    msai...@netbsd.org)




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

ixg tester needed (was Re: Problems with netbsd-8 RC1 and ixg drivers (?))

2018-05-28 Thread Masanobu SAITOH


On 2018/05/28 16:51, Martin Husemann wrote:

On Mon, May 28, 2018 at 09:46:21AM +0200, 6b...@6bone.informatik.uni-leipzig.de 
wrote:

Hello,

At the weekend I tried to update to a current version of netbsd-8 rc1.

After the restart, the kernel will work for a few hours. After that, no
packets will arrive at the network card.



 Those who are using ixg(4) on netbsd-8 or -current, please try the following patch:

http://www.netbsd.org/~msaitoh/ixgbe-eitr-20180522-0.dif

This change might fix the RX stall problem. If you get a TX device timeout
or an RX stall, please report it with the output of:

sysctl hw |grep ixg


 Regards.




The server is running normally. No
hints in dmesg.

Some network programs report issues:

zebra[371]: rtm_write: write : No buffer space available (55)

syslogd[541]: recvfrom() unix `/var/run/log': No buffer space available

gate zebra[1423]: routing socket error: No buffer space available


You are seeing two different issues here. The "No buffer space" is considered
harmless (it used to be silent, but the lossage should be the same).

The problem of ixg(4) no longer receiving packets is under investigation;
RC2 is waiting for a proposed patch to be tested.

Martin




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: adding devices to puc(4)

2018-05-07 Thread Masanobu SAITOH


Hi.

On 2018/05/07 16:38, John Nemeth wrote:

  I'm trying to add an Oxford Semiconductor 4-port serial card
to puc(4).  Using the datasheet, I've gotten to the point where
all four serial ports are probed and attached.  However, I don't
seem to be able to communicate through the ports.  And, there
doesn't seem to be any real documentation on puc(4).  I'm trying
to figure out what the various flags mean.  A URL for the datasheet
is:

https://www.semiconductorstore.com/pdf/newsite/oxford/OXPCIe954_ds.pdf


 It seems that the legacy mode (e.g. I/O mapped) of the OXPCIe952 was removed
in the OXPCIe954. It also seems to have no INTx support, right? Do the device's
com registers use a 1-byte stride rather than a 4-byte stride?



Here are the patches that I used to get it started:

Index: pcidevs
===
RCS file: /cvsroot/src/sys/dev/pci/pcidevs,v
retrieving revision 1.1333
diff -u -r1.1333 pcidevs
--- pcidevs 3 May 2018 04:21:10 -   1.1333
+++ pcidevs 7 May 2018 07:58:28 -
@@ -6280,6 +6280,7 @@
  product OXFORDSEMI OXPCIE952_4 0xc141  OXPCIe952
  product OXFORDSEMI OXPCIE952_5 0xc144  OXPCIe952
  product OXFORDSEMI OXPCIE952_6 0xc145  OXPCIe952
+product OXFORDSEMI OXPCIE952_7 0xc208  OXPCIe952


Not 952 but 954

  
  /* Packet Engines products */

  product PACKETENGINES GNICII  0x0911  G-NIC II Ethernet

Index: pucdata.c
===
RCS file: /cvsroot/src/sys/dev/pci/pucdata.c,v
retrieving revision 1.101
diff -u -r1.101 pucdata.c
--- pucdata.c   13 Apr 2018 07:57:04 -  1.101
+++ pucdata.c   7 May 2018 07:58:53 -
@@ -1108,6 +1108,19 @@
},
},
  
+	/* Oxford Semiconductor OXPCIe952 PCIe UARTs */

+   {   "Oxford Semiconductor OXPCIe952 UART",
+   {   PCI_VENDOR_OXFORDSEMI, PCI_PRODUCT_OXFORDSEMI_OXPCIE952_7,
+   0, 0 },
+   {   0xffff, 0xffff, 0,  0   },
+   {
+   { PUC_PORT_TYPE_COM, PCI_BAR0, 0x1000, COM_FREQ },
+   { PUC_PORT_TYPE_COM, PCI_BAR0, 0x1200, COM_FREQ },
+   { PUC_PORT_TYPE_COM, PCI_BAR0, 0x1400, COM_FREQ },
+   { PUC_PORT_TYPE_COM, PCI_BAR0, 0x1600, COM_FREQ },
+   },
+   },
+
/* Oxford Semiconductor OXmPCI952 PCI UARTs */
{   "Oxford Semiconductor OXmPCI952 UARTs",
{   PCI_VENDOR_OXFORDSEMI,  PCI_PRODUCT_OXFORDSEMI_EXSYS_EX41092,



 It seems FreeBSD's puc(4) supports the OXPCIe954, so looking at
FreeBSD's sys/dev/puc/pucdata.c will help.

 If it's a PCIe add-in card, could you tell me the product name?
(I'm not saying I'll work on supporting it :))

FYI:
http://mail-index.netbsd.org/tech-kern/2014/02/09/msg016616.html

http://mail-index.netbsd.org/tech-kern/2014/01/23/msg016459.html
(If you're trying to use the device as the console, you will need to modify
the #if 0'd code in the diff.)

sys/dev/ic/com.c has a lot of #ifdefs. Some of them should not be determined
at compile time (e.g. COM16650 and COM_REGMAP), but the change is not easy.
Someone(TM), please do it.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: dmesg | grep -c "not configured" = 240...

2018-02-25 Thread Masanobu SAITOH


On 2018/02/24 18:55, Patrick Welche wrote:

On Mon, Feb 19, 2018 at 03:17:48PM +, Stephen Borrill wrote:

So I've just got a Lenovo ThinkSystem SR630 and:
# dmesg | grep -c "not configured"
240

http://www.netbsd.org/~sborrill/sr630.dmesg.txt

Main issues are missing Ethernet (Intel X722) and RAID controller:
vendor 8086 product 37d2 (ethernet network, revision 0x09) at pci7 dev 0 
function 0 not configured
vendor 8086 product 37d2 (ethernet network, revision 0x09) at pci7 dev 0 
function 1 not configured
vendor 8086 product 37d2 (ethernet network, revision 0x09) at pci7 dev 0 
function 2 not configured
vendor 8086 product 37d2 (ethernet network, revision 0x09) at pci7 dev 0 
function 3 not configured
vendor 1000 product 0016 (RAID mass storage, revision 0x01) at pci11 dev 0 
function 0 not configured

msaitoh@ - have you looked at the Intel X722 gigabit controllers?


 X722 is a 10G device which is based on a 40G controller. {Free,Open}BSD
have ixl(4), but NetBSD has no device driver yet :(

 And I don't have any ixl(4) devices :)







For the second part:


As for the RAID controller, we are missing support for all recent
LSI/Symbios/Avago/Broadcom controllers meaning no support for lots of
servers from Lenovo/HP, etc. OpenBSD's mfii supports most of these:

https://www.precedence.co.uk/wiki/Support-KB-IBM/PCIIDs

NetBSD has extended mfi to support a few variants, but OpenBSD has split the
driver into mfi and mfii which makes porting more tricky.


I had a first stab, for which feedback would have been nice:

https://mail-index.netbsd.org/current-users/2015/07/08/msg027701.html

(Development might be easier now that several USB keyboard bugs have
been fixed since then.)


Cheers,

Patrick




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Possible regression in wm(4)?

2017-12-06 Thread Masanobu SAITOH


On 2017/12/06 22:26, Bert Kiers wrote:

On Fri, Dec 01, 2017 at 04:40:37PM +0900, Masanobu SAITOH wrote:

Hi, all

On 2017/11/22 0:21, Bert Kiers wrote:

Hi,

A different computer with the same type motherboard has the same
problem.  A quad I350 (also wm(4)) works fine (with GENERIC netbsd-8
kernel).

Still wondering what queue drops are.

Grtnx,


  Could you test the following diff?


Yes!  Works!
Thank you!


 Thanks. The diff has been committed now and will be pulled
up to netbsd-8.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Possible regression in wm(4)?

2017-11-30 Thread Masanobu SAITOH


Hi, all

On 2017/11/22 0:21, Bert Kiers wrote:

Hi,

A different computer with the same type motherboard has the same
problem.  A quad I350 (also wm(4)) works fine (with GENERIC netbsd-8
kernel).

Still wondering what queue drops are.

Grtnx,


 Could you test the following diff?

---
 Fix a bug where 8257[56] can't receive packets. For 82575 and 82576, the RX
descriptors must be initialized after RCTL.EN is set in
wm_set_filter(). This bug was added in if_wm.c rev. 1.515.

Index: if_wm.c
===
RCS file: /cvsroot/src/sys/dev/pci/if_wm.c,v
retrieving revision 1.546
diff -u -p -r1.546 if_wm.c
--- if_wm.c 30 Nov 2017 09:24:18 -  1.546
+++ if_wm.c 1 Dec 2017 07:36:26 -
@@ -5814,6 +5814,14 @@ wm_init_locked(struct ifnet *ifp)
break;
}
 
+	/*

+* Set the receive filter.
+*
+* For 82575 and 82576, the RX descriptors must be initialized after
+* the setting of RCTL.EN in wm_set_filter()
+*/
+   wm_set_filter(sc);
+
/* On 575 and later set RDT only if RX enabled */
if ((sc->sc_flags & WM_F_NEWQUEUE) != 0) {
int qidx;
@@ -5828,9 +5836,6 @@ wm_init_locked(struct ifnet *ifp)
}
}
 
-	/* Set the receive filter. */

-   wm_set_filter(sc);
-
wm_unset_stopping_flags(sc);
 
 	/* Start the one second link check clock. */

@@ -6688,13 +6693,13 @@ wm_init_rx_buffer(struct wm_softc *sc, s
return ENOMEM;
}
} else {
-   if ((sc->sc_flags & WM_F_NEWQUEUE) == 0)
-   wm_init_rxdesc(rxq, i);
/*
-* For 82575 and newer device, the RX descriptors
-* must be initialized after the setting of RCTL.EN in
+* For 82575 and 82576, the RX descriptors must be
+* initialized after the setting of RCTL.EN in
 * wm_set_filter()
 */
+   if ((sc->sc_flags & WM_F_NEWQUEUE) == 0)
+   wm_init_rxdesc(rxq, i);
}
}
rxq->rxq_ptr = 0;
---


The same diff is at:

http://www.netbsd.org/~msaitoh/wm-825756-20171201-0.dif

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: netbsd-8 crash in ixg driver during booting

2017-11-15 Thread Masanobu SAITOH


Hi, Uwe.

On 2017/11/15 15:41, 6b...@6bone.informatik.uni-leipzig.de wrote:


Does your machine boot with the latest -current?


I have tested the current sources from tonight.

https://suse.uni-leipzig.de/crash/crash-current1.jpg
https://suse.uni-leipzig.de/crash/crash-current2.jpg

Regards
Uwe


 Thank you for the report.

 This problem is different from the ixg(4) problem. I'm now
working on a fix for this softint-related problem.

 This problem is caused by devices which use a lot of
softints. Could you tell me the machine's spec? e.g.:

number of ports of wm(4) and/or ixg(4)
number of nvme(4) devices
etc.

Thanks in advance.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: netbsd 8 (beta) failing to load ixg device

2017-11-14 Thread Masanobu SAITOH

 Hi, all.

On 2017/11/14 7:01, Jaromír Doleček wrote:

I had a very brief look on the crashing function ixgbe_update_stats_count(). The 
only division there is in the one using adapter->num_queue.

Looking at ixgbe_configure_interrups(), seems that one can happily set it to 0 
if number of MSI vectors is 1, as is the case according to your dmesg.

SAITOH Masanobu, could you please have a closer look? I wonder if this could 
have been introduced around rev. 1.96/1.97/1.98 of dev/pci/ixgbe/ixgbe.c, 
that's when the code related to this changed.
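
The kind of guard that avoids such a crash can be sketched in C as follows.
This is only an illustration of protecting a division by a queue count that
may be zero: example_per_queue_average() and its parameters are hypothetical,
and the exact expression used in ixgbe_update_stats_count() is not shown in
this thread.

```c
#include <stdint.h>

/*
 * Hypothetical helper: spread a counter total over the queues,
 * tolerating a queue count of zero instead of trapping.
 */
uint64_t
example_per_queue_average(uint64_t total, unsigned int num_queues)
{
	if (num_queues == 0)
		return 0;	/* nothing to divide by; avoid the trap */
	return total / num_queues;
}
```

Whether returning 0 is the right fallback depends on the caller; the other
obvious option is to make sure the queue count can never be set to 0 in the
first place.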

 I could reproduce the problem. Yes, it's my fault.

 I sent a new pullup request:

http://releng.netbsd.org/cgi-bin/req-8.cgi?show=361

It's a little hard to fix that problem with a small patch. I'm going to
send a jumbo patch for ixg(4) to fix a lot of bugs, and it'll fix this
problem cleanly.

 Sorry and thank you all.

Jaromir

2017-11-13 16:03 GMT+01:00 Derrick Lobo >:
 >
 > HI Thor
 >
 > I have attached the logs..  I am able to use the server in 7.99 but cannot
 > upgrade to 8.0 beta. Theres a kernel panic when the driver is loaded.. its
 > not like the bootup progresses and marks the driver as unconfigured.
 >
 > -Original Message-
 > From: Thor Lancelot Simon [mailto:t...@panix.com ]
 > Sent: Sunday, November 12, 2017 10:30 PM
 > To: Derrick Lobo
 > Cc: port-am...@netbsd.org ; current-users@netbsd.org 

 > Subject: Re: netbsd 8 (beta) failing to load ixg device
 >
 > On Thu, Nov 09, 2017 at 09:15:53AM -0500, Derrick Lobo wrote:
 > > The daily beta version of nebtsd 8 does not support ixg 5gb NIC's, the
 > > support was enabled in 7.99
 >
 > That doesn't make sense - if it's in 7.anything, it's in 8.  When we cut
 > the 8 branch, we move the version number on HEAD to 8.99.
 >
 > Thor

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

UEFI status? (was Re: GPT/UEFI booting)

2017-08-04 Thread Masanobu SAITOH


 Hi, all.

 A few days ago, I successfully booted on a UP board.

http://www.up-board.org/up/comparison-up-versus-edisongalileo-joule/

The UP board's UEFI doesn't support legacy boot. Currently,
our sysinst doesn't support creating a UEFI-bootable disk,
so I tried creating the disk by following the first mail of
this thread. I couldn't make a bootable disk that way either. But
I noticed something in the output of "gpt show -a sd0"
on a disk which contains images/NetBSD-8.xx.y-1-amd64-uefi-install.img.
A bootable image has the attribute "bootme":


uefi# gpt show -a sd0
      start       size  index  contents
          0          1         PMBR
          1          1         Pri GPT header
          2         32         Pri GPT table
         34      65536      1  GPT part - EFI System
                               Type: efi
                               TypeID: c12a7328-f81f-11d2-ba4b-00a0c93ec93b
                               GUID: 64a9158f-099d-442d-9ae4-ddbb6b604114
                               Size: 32768 K
                               Label: EFI System
                               Attributes: None
      65570  122486717      2  GPT part - NetBSD FFSv1/FFSv2
                               Type: ffs
                               TypeID: 49f48d5a-b10e-11dc-b99b-0019d1879648
                               GUID: 63313eee-af70-4e00-90a1-e0443c02ce4d
                               Size: 59808 M
                               Label: amd64-current
                               Attributes: bootme < This!
  122552287         32         Sec GPT table
  122552319          1         Sec GPT header


so,


This is what I’ve done so far:
#
# Create the GPT segments
#
gpt create -f ld0
gpt add -s 65536 -t efi -l "EFI System" ld0
gpt add -s 525168 -t ffs -l "NetBSD-root" ld0


"gpt set -a bootme -i N xx0" (N is perhaps 2) is required.


gpt add -s 256000 -t swap -l "NetBSD-swap" ld0
gpt add -s 1024000 -t ffs -l "NetBSD-var" ld0
gpt add -s 16264934 -t ffs -l "NetBSD-usr" ld0
gpt add -t ffs -l "NetBSD-home" ld0
#
# Initialize the wedges
#


 To document this flag, I modified gpt.8 yesterday
(and thanks to kre@ for fixing a typo!)

 After setting up my environment, I noticed that neither
acpidump nor pkgsrc/dmidecode worked. The reason is that, in
a UEFI environment, the information they need is not near the
beginning of the physical address space. For acpidump,
the problem has been fixed. Not yet for pkgsrc/dmidecode.

 I've not used grub2. If someone(TM) has succeeded in booting
with grub2, it would be good to summarize the howto.

 Making sysinst UEFI friendly is important. Could someone
do it by 8.0-RELEASE?

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: lm0 at isa reboot

2017-08-03 Thread Masanobu SAITOH


Hi, Patrick

On 2017/07/27 18:28, Patrick Welche wrote:

On Fri, Jul 21, 2017 at 10:59:51PM +0100, Patrick Welche wrote:

On Fri, Jul 21, 2017 at 08:17:44AM +0100, Patrick Welche wrote:

On Fri, Jul 21, 2017 at 07:53:09AM +0100, Patrick Welche wrote:

On Thu, Jul 20, 2017 at 09:06:16PM +, co...@sdf.org wrote:

On Thu, Jul 13, 2017 at 09:49:32AM +0100, Patrick Welche wrote:

or "new" reboot: just updated a working 3rd July amd64
kernel with this morning's source, and the computer reboots after
printing nouveau, but before drm. Haven't had a chance to dig (won't
until tonight) - any first guesses?


Must be unrelated to nouveau. no changes in sys/external/bsd/drm2 since
June 1.


In the meantime it is getting even more confusing: I bisected to

http://mail-index.netbsd.org/source-changes/2017/07/11/msg086253.html

 lm(4): Add support for NCT5174D, NCT6775F, NCT6779D and NCT679[1235]D.
 wbsio(4): Add support for NCT6795D.

but then rebooting with disable wbsio (which didn't switch anything off)
and disable lm still failed. Now moving the modules which I haven't kept
in synch during the bisection away.


Moving the modules out of the way now gets consistent results:
- unsuccessful boot with the above patch
- successful boot with the above patch and "disable lm"

I have no idea what chip is in this box(!)


End conclusion: if I comment out

   #lm0    at isa? port 0x290 flags 0x0    # other common ports: 0x280, 0x310

i.e., have it the way it is in GENERIC, the kernel boots. (With it in, I
get a reboot with nothing printed on the serial console.)

Leaving

   lm*     at wbsio?

is fine.

Adding DEBUG, DIAGNOSTIC, LOCKDEBUG, and leaving lm0 at isa? also appears
to be fine (?!) (and the locking I had in earlier posts also doesn't seem
to tie up with lm, but I did have to have it uncommented.)

Turns out that on 0x290, this Ryzen box has an ITE IT8665E.



 Driver for ITE's super I/O is itesio(4)



Why the bisection hit that commit remains a mystery, as that chip
won't have attached to the driver.


 The default I/O port of ITE's super I/O and Winbond (Nuvoton)'s super I/O
is the same. With only a one- or two-byte super I/O ID, the chips are hard to
tell apart. Our Winbond support checks only 8 bits of the ID instead of the
real 12 bits, so it can be improved. Another problem is that nslm7x.c's
xxx_match() acts not as a match function but as an attach function, so it
should be improved as well.

 Until the above problems are solved, don't enable "lm* at wbsio?" on
your machine.
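
 To make the 8-bit versus 12-bit point concrete, here is a small sketch in C.
It is an illustration under stated assumptions, not the wbsio(4) code: the
example_*() names are hypothetical, and the layout (high ID byte followed by
a low byte whose upper nibble belongs to the ID and whose lower nibble is a
revision) is an assumption that should be checked against the datasheet.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Build a 12-bit chip ID from two ID register bytes, dropping the
 * 4-bit revision in the low nibble of the second byte (assumed layout).
 */
uint16_t
example_sio_id12(uint8_t id_hi, uint8_t id_lo)
{
	return (uint16_t)((id_hi << 4) | (id_lo >> 4));
}

/* Weak check: compares only the high 8 bits of the ID. */
bool
example_match8(uint8_t id_hi, uint8_t want_hi)
{
	return id_hi == want_hi;
}

/* Stricter check: compares all 12 ID bits. */
bool
example_match12(uint8_t id_hi, uint8_t id_lo, uint16_t want12)
{
	return example_sio_id12(id_hi, id_lo) == want12;
}
```

 With made-up register values, two chips reading (0xd1, 0x21) and
(0xd1, 0x35) both pass example_match8() against 0xd1, but only the first one
passes example_match12() against 0xd12.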




Cheers,

Patrick




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: -current boot failure @ wm(4)

2017-07-20 Thread Masanobu SAITOH


On 2017/07/20 0:33, Lars Reichardt wrote:



Masanobu SAITOH <msai...@execsw.org> wrote on 18 July 2017 at 12:04:


On 2017/07/14 13:34, Masanobu SAITOH wrote:

Hi, Brad.

On 2017/07/14 3:12, bch wrote:

Hello NetBSD.

I think this maybe related to msaitoh@ work in ./sys/dev/pci/*wm*.
The latest kernel begins boot, then just hangs, last 3  lines are
(transcribed):

wm0 at pci0 dev 25 function 0: PCH2 LAN (82579LM) controller (rev. 0x04)
wm0: interrupting at msii vec 0
wm0: wm_init_lcd_from_nvm: need write_smbus()


-bch


   Thanks. I could reproduce the same problem. It's not easy to fix this
problem, so I've committed a change that disables wm_init_lcd_from_nvm() for now.
Please cvs update.


   I re-enabled the code with some bugfixes. Could you test with
the latest -current?

And, if you don't mind, could you tell me what your machine is (e.g. product
name)? I have some PCH and newer machines, but only the ThinkPad X220 enters
this code path.

   


Hi, I've seen that problem on my Thinkpad X201s as well and it's gone now.


 Sorry and thank you!


Thanks,
Lars




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: -current boot failure @ wm(4)

2017-07-18 Thread Masanobu SAITOH


On 2017/07/14 13:34, Masanobu SAITOH wrote:

Hi, Brad.

On 2017/07/14 3:12, bch wrote:

Hello NetBSD.

I think this maybe related to msaitoh@ work in ./sys/dev/pci/*wm*.
The latest kernel begins boot, then just hangs, last 3  lines are
(transcribed):

wm0 at pci0 dev 25 function 0: PCH2 LAN (82579LM) controller (rev. 0x04)
wm0: interrupting at msii vec 0
wm0: wm_init_lcd_from_nvm: need write_smbus()


-bch


  Thanks. I could reproduce the same problem. It's not easy to fix this
problem, so I've committed a change that disables wm_init_lcd_from_nvm() for now.
Please cvs update.


 I re-enabled the code with some bugfixes. Could you test with
the latest -current?

And, if you don't mind, could you tell me what your machine is (e.g. product
name)? I have some PCH and newer machines, but only the ThinkPad X220 enters
this code path.

 Thanks in advance.


  Thanks.




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: -current boot failure @ wm(4)

2017-07-13 Thread Masanobu SAITOH


Hi, Brad.

On 2017/07/14 3:12, bch wrote:

Hello NetBSD.

I think this maybe related to msaitoh@ work in ./sys/dev/pci/*wm*.
The latest kernel begins boot, then just hangs, last 3  lines are
(transcribed):

wm0 at pci0 dev 25 function 0: PCH2 LAN (82579LM) controller (rev. 0x04)
wm0: interrupting at msii vec 0
wm0: wm_init_lcd_from_nvm: need write_smbus()


-bch


 Thanks. I could reproduce the same problem. It's not easy to fix this
problem, so I've committed a change that disables wm_init_lcd_from_nvm() for now.
Please cvs update.

 Thanks.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: bad counter for ixg* interfaces

2017-05-08 Thread Masanobu SAITOH


Hi, Uwe.

On 2017/04/28 21:11, 6b...@6bone.informatik.uni-leipzig.de wrote:

Hello,

ifconfig -v ixg0 shows:

ixg0: flags=8843 mtu 1500
 capabilities=fff80
 capabilities=fff80
 capabilities=fff80
 enabled=0
 ec_capabilities=7
 ec_enabled=7
 address: a0:36:9f:d4:3c:08
 media: Ethernet autoselect (10GbaseSR full-duplex,rxpause,txpause)
 status: active
 input: 22429456 packets, 12404771626 bytes, 132266 multicasts
 output: 10573554 packets, 0 bytes
 ...


The outgoing packets are counted correctly. The outgoing bytes remain at 0.

Kernel version: NetBSD 7.99.70. Older Versions have the same problem.

Programs like snmpd that evaluate the counter report wrong data.

Maybe someone can take a look at the code.



Thank you for your efforts


Regards
Uwe


Thank you for the report. I've committed the fix.
Please test with ix_txrx.c rev. 1.23.

 netbsd-7 pullup request for this fix will be sent with
other changes.

 Thanks.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

panic in vfs_wapbl.c

2017-03-15 Thread Masanobu SAITOH


 I updated my machine's kernel to one built from -current source as of
one hour ago. It panicked. It's reproducible.


/dev/rwd0a: file system is clean; not checking
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed: file 
"../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2 
0x770e1f2ae190 ilevel 0 rsp 0xfe8120956b00
curlwp 0xfe847b8820a0 pid 30.1 lowest kstack 0xfe81209532c0
Stopped in pid 30.1 (mount_ffs) at  netbsd:breakpoint+0x5:  leave
db{15}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x133
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x177
ffs_mount() at netbsd:ffs_mount+0x4e9
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
770e1f28989a:
db{15}>


 At least the kernel from five days ago worked without this problem.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: panic: kernel diagnostic assertion "next != _PSLIST_POISON"

2017-03-14 Thread Masanobu SAITOH


On 2017/03/15 14:11, Masanobu SAITOH wrote:

On 2017/03/15 0:30, Jaromír Doleček wrote:

Yes, this panic is already fixed in -current:

panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)"
failed: file "../../../../kern/vfs_wapbl.c", line 1142

Jaromir


 It still panics for me.

New output with two printf()s:


boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
kern.module.path=/stand/amd64/7.99.66/modules
Wed Mar 15 14:05:38 JST 2017
Starting root file system check:
/dev/rwd0a: file system is clean; not checking
ffs_mount: path "/" flags 0x6015040
ffs_wapbl_start: sbupdate
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed: file 
"../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2 
0x75db2c768548 ilevel 0 rsp 0xfe812099eaf0
curlwp 0xfe847b7ba0a0 pid 30.1 lowest kstack 0xfe812099b2c0
Stopped in pid 30.1 (mount_ffs) at  netbsd:breakpoint+0x5:  leave
db{0}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x134
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x188
ffs_mount() at netbsd:ffs_mount+0x50a
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
7938d028989a:
db{0}>


And,


Please attach your /etc/fstab.


My root filesystem has both log and async option.


/dev/wd0a   /   ffs rw,async,noatime 1 1   
< don't panic
#/dev/wd0a  /   ffs rw,log,async,noatime 1 1   
< do panic


I've noticed that the log option alone doesn't cause a panic.
If both log and async are set, it panics.


mount(8) says:


 log (FFS only) Mount the file system with wapbl(4) meta-
 data journaling, also known simply as logging.  It
 provides rapid metadata updates and eliminates the
 need to check file system consistency after a system
 outage.  A file system mounted with log can not be
 mounted with async.


 Isn't it checked when mounting?
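
 The sort of check being asked about could look like the following sketch
in C. It is not the kernel's mount code: example_check_mount_flags() and the
EX_MNT_* names are hypothetical, and the bit values are assumptions chosen so
that the flags value 0x6015040 printed in the log above has both bits set.
Please verify them against sys/mount.h before relying on them.

```c
#include <errno.h>
#include <stdint.h>

/* Assumed bit values for illustration; see the note above. */
#define EX_MNT_ASYNC	0x00000040u
#define EX_MNT_LOG	0x02000000u

/*
 * Reject a mount request that asks for both journaling (log) and
 * async at once; return 0 if the combination is acceptable.
 */
int
example_check_mount_flags(uint32_t flags)
{
	if ((flags & EX_MNT_LOG) != 0 && (flags & EX_MNT_ASYNC) != 0)
		return EINVAL;
	return 0;
}
```

 With these assumed values, 0x6015040 is rejected, while log alone or async
alone is accepted, which matches the behaviour the man page text describes.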





2017-03-14 9:04 GMT+01:00 Frank Kardel <kar...@netbsd.org>:

Hmm, I think ch_voltag_convert_in() is a red herring,

Both panics contextually match the higher parts of the stack traces. So I
would disregard the ch_voltag_convert_in() part here and
conclude it is two distinct panics. One relates to psref corruption in
network code and the other to wapbl and possibly
recent mount update (-u) changes,

Other ideas ?

Frank


On 03/14/17 08:56, Masanobu SAITOH wrote:


Hi.

On 2017/03/14 16:36, Frank Kardel wrote:


Has anyone seen this panic recently?

Seen in -current-20170311, i386, Soekris 6501.

panic: kernel diagnostic assertion "next != _PSLIST_POISON" failed: file
"/fs/raid2a/src/NetBSD/cur/src/sys/sys/pslist.h", line 270
cpu0: Begin traceback...

vpanic(c0cb1784,dba43dac,dba43e2c,c09e0d1e,c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8)
at netbsd:vpanic+0x121

ch_voltag_convert_in(c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8,0,c3d70578,c09e0988,c3d70348)
at netbsd:ch_voltag_convert_in

sysctl_iflist(4,cbd8cf60,c7,cbd8cff9,c33c06c0,c7,c090f986,0,cbd8cf60,a43e90)
at c09e0d1e

sysctl_rtable(dba43f0c,3,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,3)
at c09e129c

sysctl_dispatch(dba43f00,6,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,dba43efc)
at netbsd:sysctl_dispatch+0xbd

sys___sysctl(c3de1560,dba43f68,dba43f60,7dd51000,c3de1560,dba43f60,dba43f68,0,0,b0094fb0)
at netbsd:sys___sysctl+0xe3
syscall() at netbsd:syscall+0x257
--- syscall (number 202) ---
b00736f7:
cpu0: End traceback...

Frank



Yesterday I sent the following mail to current-users@ but it hasn't been
delivered yet...


 I updated my machine's kernel to one built from -current source as of
one hour ago. It panicked. It's reproducible.


/dev/rwd0a: file system is clean; not checking
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed:
file "../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2
0x770e1f2ae190 ilevel 0 rsp 0xfe8120956b00
curlwp 0xfe847b8820a0 pid 30.1 lowest kstack 0xfe81209532c0
Stopped in pid 30.1 (mount_ffs) at netbsd:breakpoint+0x5:  leave
db{15}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_b

Re: panic: kernel diagnostic assertion "next != _PSLIST_POISON"

2017-03-14 Thread Masanobu SAITOH


On 2017/03/15 0:30, Jaromír Doleček wrote:

Yes, this panic is already fixed in -current:

panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)"
failed: file "../../../../kern/vfs_wapbl.c", line 1142

Jaromir


 It still panics for me.

New output with two printf()s:


boot device: wd0
root on wd0a dumps on wd0b
root file system type: ffs
kern.module.path=/stand/amd64/7.99.66/modules
Wed Mar 15 14:05:38 JST 2017
Starting root file system check:
/dev/rwd0a: file system is clean; not checking
ffs_mount: path "/" flags 0x6015040
ffs_wapbl_start: sbupdate
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed: file 
"../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2 
0x75db2c768548 ilevel 0 rsp 0xfe812099eaf0
curlwp 0xfe847b7ba0a0 pid 30.1 lowest kstack 0xfe812099b2c0
Stopped in pid 30.1 (mount_ffs) at  netbsd:breakpoint+0x5:  leave
db{0}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x134
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x188
ffs_mount() at netbsd:ffs_mount+0x50a
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
7938d028989a:
db{0}>


And,


Please attach your /etc/fstab.


My root filesystem has both the log and async options.


/dev/wd0a   /   ffs rw,async,noatime 1 1        < don't panic
#/dev/wd0a  /   ffs rw,log,async,noatime 1 1    < do panic


I've noticed that the log option alone doesn't cause the panic.
If both log and async are set, it panics.
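
The observation above suggests a quick way to spot affected entries. The following is only a minimal sketch: the helper name `risky_ffs_entries` is made up for illustration, and it assumes the usual whitespace-separated fstab fields with a comma-separated option list in the fourth column.

```python
def risky_ffs_entries(fstab_text):
    """Return (device, mount point) pairs for ffs entries using both log and async."""
    hits = []
    for raw in fstab_text.splitlines():
        # Drop comments, which also skips commented-out entries.
        line = raw.split("#", 1)[0].strip()
        fields = line.split()
        if len(fields) < 4 or fields[2] != "ffs":
            continue
        options = set(fields[3].split(","))
        if {"log", "async"} <= options:
            hits.append((fields[0], fields[1]))
    return hits


# The two root entries quoted in this message (the second one un-commented).
ok_line = "/dev/wd0a   /   ffs rw,async,noatime 1 1"
panic_line = "/dev/wd0a  /   ffs rw,log,async,noatime 1 1"

print(risky_ffs_entries(ok_line))
print(risky_ffs_entries(panic_line))
```

This only flags the option combination reported here; it says nothing about whether a given kernel revision actually panics.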



2017-03-14 9:04 GMT+01:00 Frank Kardel <kar...@netbsd.org>:

Hmm, I think ch_voltag_convert_in() is a red herring.

Both panics contextually match the higher parts of the stack traces. So I
would disregard the ch_voltag_convert_in() part here and
conclude these are two distinct panics. One relates to psref corruption in
network code and the other to wapbl and possibly the
recent mount update (-u) changes.

Other ideas?

Frank
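
A rough way to pick out such frames in a pasted trace is to look for entries whose symbol resolves with no offset at all. This is a sketch of a heuristic, not something confirmed in the thread: it assumes that a frame printed as `func() at netbsd:func` (no `+0x...`) may just be a return address that lands on the first instruction of whatever function follows a call that never returns. The name `suspicious_frames` and the regular expression are illustrative.

```python
import re

# Matches single-line ddb/crash frames such as
#   "vpanic() at netbsd:vpanic+0x140"  or  "snprintf() at snprintf"
FRAME = re.compile(
    r"^(?P<func>\w+)\(.*\) at (?:netbsd:)?(?P<sym>\w+)(?P<off>\+0x[0-9a-f]+)?$"
)


def suspicious_frames(trace_lines):
    """Return names of frames that resolve to a symbol with no offset."""
    found = []
    for line in trace_lines:
        m = FRAME.match(line.strip())
        if m and m.group("off") is None and m.group("sym") == m.group("func"):
            found.append(m.group("func"))
    return found


trace = [
    "breakpoint() at netbsd:breakpoint+0x5",
    "vpanic() at netbsd:vpanic+0x140",
    "ch_voltag_convert_in() at netbsd:ch_voltag_convert_in",
    "wapbl_add_buf() at netbsd:wapbl_add_buf+0x134",
    "--- syscall (number 410) ---",
]
print(suspicious_frames(trace))
```

A frame flagged this way is only a candidate for being ignored; the surrounding frames still need to make sense on their own, as they do in both traces here.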


On 03/14/17 08:56, Masanobu SAITOH wrote:


Hi.

On 2017/03/14 16:36, Frank Kardel wrote:


Has anyone seen this panic recently?

Seen in -current-20170311, i386, Soekris 6501.

panic: kernel diagnostic assertion "next != _PSLIST_POISON" failed: file
"/fs/raid2a/src/NetBSD/cur/src/sys/sys/pslist.h", line 270
cpu0: Begin traceback...

vpanic(c0cb1784,dba43dac,dba43e2c,c09e0d1e,c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8)
at netbsd:vpanic+0x121

ch_voltag_convert_in(c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8,0,c3d70578,c09e0988,c3d70348)
at netbsd:ch_voltag_convert_in

sysctl_iflist(4,cbd8cf60,c7,cbd8cff9,c33c06c0,c7,c090f986,0,cbd8cf60,a43e90)
at c09e0d1e

sysctl_rtable(dba43f0c,3,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,3)
at c09e129c

sysctl_dispatch(dba43f00,6,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,dba43efc)
at netbsd:sysctl_dispatch+0xbd

sys___sysctl(c3de1560,dba43f68,dba43f60,7dd51000,c3de1560,dba43f60,dba43f68,0,0,b0094fb0)
at netbsd:sys___sysctl+0xe3
syscall() at netbsd:syscall+0x257
--- syscall (number 202) ---
b00736f7:
cpu0: End traceback...

Frank



Yesterday I sent the following mail to current-users@ but it hasn't been
delivered yet...


 I updated my machine's kernel to one built from -current source from
one hour ago. It panicked. It's reproducible.


/dev/rwd0a: file system is clean; not checking
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed:
file "../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2
0x770e1f2ae190 ilevel 0 rsp 0xfe8120956b00
curlwp 0xfe847b8820a0 pid 30.1 lowest kstack 0xfe81209532c0
Stopped in pid 30.1 (mount_ffs) at netbsd:breakpoint+0x5:  leave
db{15}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x133
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x177
ffs_mount() at netbsd:ffs_mount+0x4e9
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
770e1f28989a:
db{15}>



 At least, the kernel from five days ago worked without this problem.



Both panics include ch_voltag_convert_in()






--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: panic: kernel diagnostic assertion "next != _PSLIST_POISON"

2017-03-14 Thread Masanobu SAITOH


On 2017/03/14 17:15, Ryota Ozaki wrote:

Hi,

On Tue, Mar 14, 2017 at 4:36 PM, Frank Kardel  wrote:

Has anyone seen this panic recently?

Seen in -current-20170311, i386, Soekris 6501.

panic: kernel diagnostic assertion "next != _PSLIST_POISON" failed: file
"/fs/raid2a/src/NetBSD/cur/src/sys/sys/pslist.h", line 270
cpu0: Begin traceback...
vpanic(c0cb1784,dba43dac,dba43e2c,c09e0d1e,c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8)
at netbsd:vpanic+0x121
ch_voltag_convert_in(c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8,0,c3d70578,c09e0988,c3d70348)
at netbsd:ch_voltag_convert_in
sysctl_iflist(4,cbd8cf60,c7,cbd8cff9,c33c06c0,c7,c090f986,0,cbd8cf60,a43e90)
at c09e0d1e
sysctl_rtable(dba43f0c,3,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,3)
at c09e129c
sysctl_dispatch(dba43f00,6,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,dba43efc)
at netbsd:sysctl_dispatch+0xbd
sys___sysctl(c3de1560,dba43f68,dba43f60,7dd51000,c3de1560,dba43f60,dba43f68,0,0,b0094fb0)
at netbsd:sys___sysctl+0xe3
syscall() at netbsd:syscall+0x257
--- syscall (number 202) ---
b00736f7:
cpu0: End traceback...


I committed a possible fix. Could you update your kernel and see
if the panic is fixed?

Thanks,
  ozaki-r


 For me, it still panics. Trace is the same as before:


Starting root file system check:
/dev/rwd0a: file system is clean; not checking
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed: file 
"../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2 
0x7caa0deae190 ilevel 0 rsp 0xfe8120997b00
curlwp 0xfe847b8180a0 pid 30.1 lowest kstack 0xfe81209942c0
Stopped in pid 30.1 (mount_ffs) at  netbsd:breakpoint+0x5:  leave
db{0}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x134
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x177
ffs_mount() at netbsd:ffs_mount+0x4e9
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
7caa0de8989a:
db{0}>




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: panic: kernel diagnostic assertion "next != _PSLIST_POISON"

2017-03-14 Thread Masanobu SAITOH


Hi.

On 2017/03/14 16:36, Frank Kardel wrote:

Has anyone seen this panic recently?

Seen in -current-20170311, i386, Soekris 6501.

panic: kernel diagnostic assertion "next != _PSLIST_POISON" failed: file 
"/fs/raid2a/src/NetBSD/cur/src/sys/sys/pslist.h", line 270
cpu0: Begin traceback...
vpanic(c0cb1784,dba43dac,dba43e2c,c09e0d1e,c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8)
 at netbsd:vpanic+0x121
ch_voltag_convert_in(c0cb1784,c0cb16d3,c0cb681b,c0cb6458,10e,a8,0,c3d70578,c09e0988,c3d70348)
 at netbsd:ch_voltag_convert_in
sysctl_iflist(4,cbd8cf60,c7,cbd8cff9,c33c06c0,c7,c090f986,0,cbd8cf60,a43e90) at 
c09e0d1e
sysctl_rtable(dba43f0c,3,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,3) at 
c09e129c
sysctl_dispatch(dba43f00,6,afe01000,dba43efc,0,0,dba43f00,c3de1560,c3c11c0c,dba43efc)
 at netbsd:sysctl_dispatch+0xbd
sys___sysctl(c3de1560,dba43f68,dba43f60,7dd51000,c3de1560,dba43f60,dba43f68,0,0,b0094fb0)
 at netbsd:sys___sysctl+0xe3
syscall() at netbsd:syscall+0x257
--- syscall (number 202) ---
b00736f7:
cpu0: End traceback...

Frank


Yesterday I sent the following mail to current-users@ but it hasn't been
delivered yet...


 I updated my machine's kernel to one built from -current source from
one hour ago. It panicked. It's reproducible.


/dev/rwd0a: file system is clean; not checking
panic: kernel diagnostic assertion "!(bp->b_oflags & BO_DELWRI)" failed: file 
"../../../../kern/vfs_wapbl.c", line 1142
fatal breakpoint trap in supervisor mode
trap type 1 code 0 rip 0x80215455 cs 0x8 rflags 0x246 cr2 
0x770e1f2ae190 ilevel 0 rsp 0xfe8120956b00
curlwp 0xfe847b8820a0 pid 30.1 lowest kstack 0xfe81209532c0
Stopped in pid 30.1 (mount_ffs) at  netbsd:breakpoint+0x5:  leave
db{15}> trace
breakpoint() at netbsd:breakpoint+0x5
vpanic() at netbsd:vpanic+0x140
ch_voltag_convert_in() at netbsd:ch_voltag_convert_in
wapbl_add_buf() at netbsd:wapbl_add_buf+0x133
bdwrite() at netbsd:bdwrite+0xbd
bwrite() at netbsd:bwrite+0x95
ffs_sbupdate() at netbsd:ffs_sbupdate+0x1b9
ffs_wapbl_start() at netbsd:ffs_wapbl_start+0x177
ffs_mount() at netbsd:ffs_mount+0x4e9
VFS_MOUNT() at netbsd:VFS_MOUNT+0x34
do_sys_mount() at netbsd:do_sys_mount+0x5ee
sys___mount50() at netbsd:sys___mount50+0x33
syscall() at netbsd:syscall+0x1ed
--- syscall (number 410) ---
770e1f28989a:
db{15}>


 At least, the kernel from five days ago worked without this problem.


Both panics include ch_voltag_convert_in()

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2017-01-15 Thread Masanobu SAITOH


On 2017/01/14 2:03, Tom Ivar Helbekkmo wrote:

Masanobu SAITOH <msai...@execsw.org> writes:


 Please test the latest -current. knakahara found a problem:


That worked fine!  No longer any need for the tcpdump hack.  :)

(I didn't get the latest -current; I just added those patches to 7.99.39.)


Good :)


-tih




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2017-01-15 Thread Masanobu SAITOH


On 2017/01/14 6:43, Jarle Greipsland wrote:

Masanobu SAITOH <msai...@execsw.org> writes:

On 2016/11/28 17:16, Masanobu SAITOH wrote:

Hello, Jarle.

On 2016/11/27 0:45, Jarle Greipsland wrote:

[ ... ]

Was this problem ever fixed?


 Perhaps not. I've made a lot of changes to if_wm.c, but I've not
touched the vlan-related code.



 Please test the latest -current. knakahara found a problem:

[ ... ]

cvs rdiff -u -r1.234 -r1.235 src/sys/net/if_ethersubr.c
cvs rdiff -u -r1.93 -r1.94 src/sys/net/if_vlan.c

As far as I can tell, with these changes, the problem is gone.


Thank you for the verification.


-jarle




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2017-01-12 Thread Masanobu SAITOH


On 2016/11/28 17:16, Masanobu SAITOH wrote:

Hello, Jarle.

On 2016/11/27 0:45, Jarle Greipsland wrote:

Masanobu SAITOH <msai...@execsw.org> writes:

Hi.

On 2016/03/07 21:12, Tobias Nygren wrote:

On Mon, 7 Mar 2016 20:57:02 +0900
Masanobu SAITOH <msai...@execsw.org> wrote:


One possibility is that the multicast filter table and broadcast
bit in a register aren't set correctly on ICH9.


I'm not sure if this is relevant to the discussion, but I have a wm(4)
device (8086:1502) on -current that does not work after boot. It comes
to life only after running "tcpdump -n -i wm0" once. I am using vlan(4),
but haven't checked if that makes any difference. I usually run the
tcpdump command then forget about it until the next reboot.


 It must be a bug! Could you tell me how you set up the network interfaces,
including vlan? (e.g. part of /etc/rc.conf, /etc/ifconfig.xxx, and the output of
"ifconfig -a")


Was this problem ever fixed?


 Perhaps not. I've made a lot of changes to if_wm.c, but I've not
touched the vlan-related code.



 Please test the latest -current. knakahara found a problem:


Module Name:    src
Committed By:   msaitoh
Date:           Fri Jan 13 06:11:56 UTC 2017

Modified Files:
        src/sys/net: if_ethersubr.c if_vlan.c

Log Message:
 Fix a bug that the parent interface's callback wasn't called when the vlan
interface is configured. A callback function uses VLAN_ATTACHED() function
which check ec->ec_nvlans, the value should be incremented before calling the
callback. This bug was added in if_vlan.c rev. 1.83 (2015/11/19).
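
The ordering problem described in the log message can be illustrated with a toy model. This is only a sketch of one reading of the log message, not the actual kernel code: `EtherCom`, `parent_callback`, `attach_buggy` and `attach_fixed` are invented names, and the callback body is a stand-in for whatever the parent driver really does.

```python
class EtherCom:
    """Toy stand-in for struct ethercom; only the vlan count is modelled."""

    def __init__(self):
        self.ec_nvlans = 0
        self.hw_vlan_enabled = False


def vlan_attached(ec):
    # Mirrors the check named in the log message: VLAN_ATTACHED() looks at ec_nvlans.
    return ec.ec_nvlans > 0


def parent_callback(ec):
    # Hypothetical driver callback: it only acts if a vlan appears to be attached.
    ec.hw_vlan_enabled = vlan_attached(ec)


def attach_buggy(ec):
    parent_callback(ec)  # runs while ec_nvlans is still 0, so it sees no vlan
    ec.ec_nvlans += 1


def attach_fixed(ec):
    ec.ec_nvlans += 1    # increment first, as the commit describes
    parent_callback(ec)


before, after = EtherCom(), EtherCom()
attach_buggy(before)
attach_fixed(after)
print(before.hw_vlan_enabled, after.hw_vlan_enabled)
```

In this model both variants end with the same count, and only the fixed ordering lets the callback observe the newly attached vlan.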


To generate a diff of this commit:
cvs rdiff -u -r1.234 -r1.235 src/sys/net/if_ethersubr.c
cvs rdiff -u -r1.93 -r1.94 src/sys/net/if_vlan.c



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-11-28 Thread Masanobu SAITOH


Hello, Jarle.

On 2016/11/27 0:45, Jarle Greipsland wrote:

Masanobu SAITOH <msai...@execsw.org> writes:

Hi.

On 2016/03/07 21:12, Tobias Nygren wrote:

On Mon, 7 Mar 2016 20:57:02 +0900
Masanobu SAITOH <msai...@execsw.org> wrote:


One possibility is that the multicast filter table and broadcast
bit in a register aren't set correctly on ICH9.


I'm not sure if this is relevant to the discussion, but I have a wm(4)
device (8086:1502) on -current that does not work after boot. It comes
to life only after running "tcpdump -n -i wm0" once. I am using vlan(4),
but haven't checked if that makes any difference. I usually run the
tcpdump command then forget about it until the next reboot.


 It must be a bug! Could you tell me how you set up the network interfaces,
including vlan? (e.g. part of /etc/rc.conf, /etc/ifconfig.xxx, and the output of
"ifconfig -a")


Was this problem ever fixed?


 Perhaps not. I've made a lot of changes to if_wm.c, but I've not
touched the vlan-related code.

 I've struggled for months to make I219 work, but the work is not
finished yet.

 I also resumed work on syncing ixg(4) with FreeBSD last week, and it'll
take a few weeks or more to finish. Until that work is finished, I won't
come back to any wm bugfixes. So if someone can spend time fixing the vlan
and I219 problems of wm(4), please do!

 Regards.



 I am experiencing very similar
problems with -current as of yesterday.  My system is a
SuperMicro X7SPA-HF used as a router with a non-vlan interface
towards my ISP (wm1), and a vlan'ed interface for a number of
internal networks (wm0).

An old kernel 7.99.21 from October last year works fine:
[ ... ]
ppb0 at pci0 dev 28 function 0: vendor 8086 product 2940 (rev. 0x02)
ppb0: PCI Express capability version 1  x4 @ 2.5GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci0 dev 28 function 4: vendor 8086 product 2948 (rev. 0x02)
ppb1: PCI Express capability version 1  x1 @ 2.5GT/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci2 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: for TX interrupting at msix0 vec 0 affinity to 0
wm0: for RX interrupting at msix0 vec 1 affinity to 1
wm0: for RX interrupting at msix0 vec 2 affinity to 2
wm0: for LINK interrupting at msix0 vec 3
wm0: PCI-Express bus
wm0: 2048 words (8 address bits) SPI EEPROM, version 1.9.0, Image Unique ID 

wm0: Ethernet address 00:xx:xx:xx:xx:xx
makphy0 at wm0 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb2 at pci0 dev 28 function 5: vendor 8086 product 294a (rev. 0x02)
ppb2: PCI Express capability version 1  x1 @ 2.5GT/s
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
wm1 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm1: for TX interrupting at msix1 vec 0 affinity to 0
wm1: for RX interrupting at msix1 vec 1 affinity to 1
wm1: for RX interrupting at msix1 vec 2 affinity to 2
wm1: for LINK interrupting at msix1 vec 3
wm1: PCI-Express bus
wm1: 512 words (8 address bits) SPI EEPROM, version 1.9.0, Image Unique ID 

wm1: Ethernet address 00:yy:yy:yy:yy:yy
makphy1 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
makphy1: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

With a current kernel from yesterday (7.99.42) however, the vlan
interfaces on wm0 do not work.  The dmesg also looks slightly different
(the interrupt routing seems different):

ppb0 at pci0 dev 28 function 0: vendor 8086 product 2940 (rev. 0x02)
ppb0: PCI Express capability version 1  x4 @ 2.5GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
ppb1 at pci0 dev 28 function 4: vendor 8086 product 2948 (rev. 0x02)
ppb1: PCI Express capability version 1  x1 @ 2.5GT/s
pci2 at ppb1 bus 2
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
wm0 at pci2 dev 0 function 0: Intel i82574L (rev. 0x00)
wm0: for TX and RX interrupting at msix0 vec 0 affinity to 1
wm0: for TX and RX interrupting at msix0 vec 1 affinity to 2
wm0: for LINK interrupting at msix0 vec 2
wm0: PCI-Express bus
wm0: 2048 words (8 address bits) SPI EEPROM, version 1.9.0, Image Unique ID 

wm0: Ethernet address 00:xx:xx:xx:xx:xx
makphy0 at wm0 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
ppb2 at pci0 dev 28 function 5: vendor 8086 product 294a (rev. 0x02)
ppb2: PCI Express capability version 1  x1 @ 2.5GT/s
pci3 at ppb2 bus 3
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
wm1 at pci3 dev 0 function 0: Intel i82574L (rev. 0x00)
wm1: for TX and RX interrupting at msix1 vec 0 affinity to 1
wm1: for TX and RX interrupting at msix1 vec 1 affinity to 2
wm1: for LINK interrupting at msix1 vec 2
wm1: PCI-Express bus
wm1: 512 words (8 address bits) SPI EEPROM, version 1.9.0, Image Unique ID 
ff
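
The difference in interrupt layout between the two kernels can be pulled out of the dmesg text mechanically. The sketch below assumes the line format shown in the two listings ("wmN: for ROLE interrupting at msixN vec V affinity to C"); the function name `vectors_by_device` and the regular expression are illustrative only.

```python
import re
from collections import defaultdict

INTR = re.compile(
    r"^(?P<dev>\w+): for (?P<role>.+?) interrupting at (?P<msix>msix\d+) "
    r"vec (?P<vec>\d+)(?: affinity to (?P<cpu>\d+))?$"
)


def vectors_by_device(dmesg_text):
    """Map each device name to a list of (vector number, role) tuples."""
    table = defaultdict(list)
    for line in dmesg_text.splitlines():
        m = INTR.match(line.strip())
        if m:
            table[m.group("dev")].append((int(m.group("vec")), m.group("role")))
    return dict(table)


old_dmesg = """\
wm0: for TX interrupting at msix0 vec 0 affinity to 0
wm0: for RX interrupting at msix0 vec 1 affinity to 1
wm0: for RX interrupting at msix0 vec 2 affinity to 2
wm0: for LINK interrupting at msix0 vec 3
"""
new_dmesg = """\
wm0: for TX and RX interrupting at msix0 vec 0 affinity to 1
wm0: for TX and RX interrupting at msix0 vec 1 affinity to 2
wm0: for LINK interrupting at msix0 vec 2
"""
print(vectors_by_device(old_dmesg))
print(vectors_by_device(new_dmesg))
```

Comparing the two results shows the older kernel using four vectors per device with separate TX and RX handlers, and the newer one using three with combined TX/RX handlers. Whether that change is related to the vlan problem is not established here.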

Re: wm WOL not working anymore

2016-11-08 Thread Masanobu SAITOH


Hi, Frank.

On 2016/10/24 14:56, Masanobu SAITOH wrote:

Hello.

On 2016/10/22 19:32, Frank Kardel wrote:

Hi !

There has been quite some work going on for wm interfaces.

When testing current kernels I found that some time after
if_wm.c:1.347 the WOL functionality has stopped working
on my ASRock 990FX Extreme 9 wm interfaces (PHYs are down
after "shutdown -p").

Compiling if_wm.c with "options WM_WOL" leads to
compilation errors (defined, but not used).

So currently I gather that WOL on wm is work in
progress - am I right ?


 Yes, it's a work in progress...



dmesg snippets:
wm0 at pci6 dev 0 function 0: Intel i82572EI 1000baseT Ethernet (rev. 0x06)
wm0: interrupting at ioapic1 pin 23
wm0: PCI-Express bus
wm0: 2048 words (16 address bits) SPI EEPROM, version 5.11.8, Image Unique ID 

wm0: Ethernet address 00:1b:21:xx:yy:zz
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

wm1 at pci13 dev 0 function 0: Intel i82583V (rev. 0x00)
wm1: interrupting at ioapic0 pin 18
wm1: PCI-Express bus
wm1: 2048 words FLASH, version 1.10.0, Image Unique ID 
wm1: Ethernet address bc:5f:f4:xx:yy:zz
makphy0 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

Frank


Could you test with if_wm.c rev. 1.439?

If it doesn't wake up on a WOL packet, could you show me the
output of the following code in wm_get_wakeup()?

#ifdef WM_DEBUG
	printf("\n");
	if ((sc->sc_flags & WM_F_HAS_AMT) != 0)
		printf("HAS_AMT,");
	if ((sc->sc_flags & WM_F_ARC_SUBSYS_VALID) != 0)
		printf("ARC_SUBSYS_VALID,");
	if ((sc->sc_flags & WM_F_ASF_FIRMWARE_PRES) != 0)
		printf("ASF_FIRMWARE_PRES,");
	if ((sc->sc_flags & WM_F_HAS_MANAGE) != 0)
		printf("HAS_MANAGE,");
	printf("\n");
#endif

Regards.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm WOL not working anymore

2016-10-23 Thread Masanobu SAITOH


Hello.

On 2016/10/22 19:32, Frank Kardel wrote:

Hi !

There has been quite some work going on for wm interfaces.

When testing current kernels I found that some time after
if_wm.c:1.347 the WOL functionality has stopped working
on my ASRock 990FX Extreme 9 wm interfaces (PHYs are down
after "shutdown -p").

Compiling if_wm.c with "options WM_WOL" leads to
compilation errors (defined, but not used).

So currently I gather that WOL on wm is work in
progress - am I right ?


 Yes, it's a work in progress...



dmesg snippets:
wm0 at pci6 dev 0 function 0: Intel i82572EI 1000baseT Ethernet (rev. 0x06)
wm0: interrupting at ioapic1 pin 23
wm0: PCI-Express bus
wm0: 2048 words (16 address bits) SPI EEPROM, version 5.11.8, Image Unique ID 

wm0: Ethernet address 00:1b:21:xx:yy:zz
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

wm1 at pci13 dev 0 function 0: Intel i82583V (rev. 0x00)
wm1: interrupting at ioapic0 pin 18
wm1: PCI-Express bus
wm1: 2048 words FLASH, version 1.10.0, Image Unique ID 
wm1: Ethernet address bc:5f:f4:xx:yy:zz
makphy0 at wm1 phy 1: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

Frank



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: WANTED: nvme(4) driver testing on MP systems on -current

2016-10-18 Thread Masanobu SAITOH


On 2016/09/22 5:54, Jaromír Doleček wrote:

Hello,

NVMe driver in NetBSD-current was recently tweaked to fix several MP and locking
issues, and the driver is now marked as MPSAFE by default.

Most of this work was done on emulators since I lack the hardware,
so it's not clear if everything would work properly on real systems too.

Anyone having the hardware, I'd appreciate it if you could check the
driver out, try to punish the drive with some heavy I/O test with parallel
load if possible, and report results.

The driver should work on i386 and amd64, and is enabled in
INSTALL/GENERIC kernels there, so you could just try to boot the install
iso from NetBSD daily builds, and send-pr any issues.

I'd also especially welcome it if someone with a sparc64 system could test
the driver out, too. The driver originates from OpenBSD where nvme(4) is
enabled in the GENERIC sparc64 kernel, so it should work. But it has not
been confirmed yet on NetBSD/sparc64. Note you might need a fairly modern
system; at least some Intel NVMe cards require PCIe Generation 3 to
actually work, so this rules out e.g. T1s.

I'd also very much welcome any benchmark results; it would be very
interesting to share some IOPS figures.

Let me know the results, I'd like to update the driver manpage to list
known working hardware.

In any reports, please include the attachment fragment from dmesg, as there
is quite a significant difference between attachment via apic/INTx and
MSI/MSI-X. Also useful would be intrctl(8) output, to confirm interrupt
handlers are dispatched properly to individual available CPUs.

Thank you.

Jaromir



With nvme.c rev. 1.16:


Oct 18 17:14:02 five savecore: reboot after panic: panic: ioWsAtRNatI_NWG:Au 
nRSNPILN GbNuO:Ts  SLPOyLW E RN


and,


five# crash -M netbsd.36.core -N /netbsd
Crash version 7.99.39, image version 7.99.39.
System panicked: iostat_unbusy
Backtrace from time of crash is available.
crash> trace
_KERNEL_OPT_NVGA_RASTERCONSOLE() at 0
?() at 80008f0e5240
vpanic() at vpanic+0x149
snprintf() at snprintf
iostat_isbusy() at iostat_isbusy
dk_done1() at dk_done1+0xab
lddone() at lddone+0xf
nvme_q_complete() at nvme_q_complete+0xc6
softint_dispatch() at softint_dispatch+0xd3
DDB lost frame for Xsoftintr+0x4f, trying 0xfe810e919ff0
Xsoftintr() at Xsoftintr+0x4f
--- interrupt ---
0:


Again, the panic message was:


Oct 18 17:14:02 five savecore: reboot after panic: panic: ioWsAtRNatI_NWG:Au 
nRSNPILN GbNuO:Ts  SLPOyLW E RN


-> panic: iostat_unbusy
-> WARNINWG:A RSNPILN GNO:T  SLPOLW E RN

  -> WARNING: SPL NOT LOWER
  -> WARNING: SPL N

The full dmesg is at:

http://www.netbsd.org/~msaitoh/nvme-20161018-0.log
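
The manual de-interleaving of the panic message above can be reproduced with a few lines of code. This is a minimal sketch that relies on an accident of this particular message: one writer's text is all lowercase letters and underscores, while the other's is uppercase, punctuation and spaces. The helper name `split_by_case` is made up, and the garbled string is the savecore text with the mail line wrap joined.

```python
def split_by_case(garbled):
    """Separate lowercase/underscore characters from everything else, keeping order."""
    lower = "".join(c for c in garbled if c.islower() or c == "_")
    rest = "".join(c for c in garbled if not (c.islower() or c == "_"))
    return lower, rest


# Text after "panic: " in the savecore line quoted above.
garbled = "ioWsAtRNatI_NWG:Au nRSNPILN GbNuO:Ts  SLPOyLW E RN"

lower, rest = split_by_case(garbled)
print(lower)
print(rest)
```

The first string comes out as the panic string reported by crash(8); the second is still a mix of more than one WARNING message and would need further untangling by hand.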

Any test code is welcome!

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-07-06 Thread Masanobu SAITOH


On 2016/07/06 18:06, Tom Ivar Helbekkmo wrote:

Masanobu SAITOH <msai...@execsw.org> writes:


 I got a Latitude E6400 via an auction. I tried -current and it
worked with MSI. While checking your dmesg, I noticed that you
didn't use ACPI. I tried without ACPI and I could reproduce the
problem. Without ACPI, no ioapic is attached. knakahara said
that might be the cause of the problem. Perhaps the problem is
not specific to the Latitude E6400 but affects all systems which don't
use ACPI and use MSI/MSI-X. It can be fixed.


That's very interesting - but if I understand correctly, you're booting
that E6400 with ACPI enabled in the kernel.  When I try that, it crashes
very early in the boot process, dropping into KDB.  Could it have one or
more corrupt ACPI data structures in the BIOS -- and is there any way to
fix this, if so?


 I tried the latest -current and it didn't crash. Could you retry?


-tih




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-07-06 Thread Masanobu SAITOH


Hi.

 Sorry for the long delay.

On 2016/03/10 4:26, SAITOH Masanobu wrote:

Hi, Tom.

On 2016/03/10 4:12, Tom Ivar Helbekkmo wrote:

SAITOH Masanobu  writes:


 You mean your machine works with INTx but it doesn't work with MSI, right?


That is correct.


If so, could you show the full dmesg of the machine?


Appended below.


 And, did you test if your machine's problem does occur "without" vlan?


This is the laptop, which doesn't use vlans.  The other machine, the
Poweredge 2650, is my main server, and does all its networking over a
vlan trunk on its wm0 interface.  I suspect that its problem is
different, since it works with a -current from October 10th, whereas the
laptop doesn't.

dmesg output from the laptop after making its wm0 use INTx instead of MSI:


 Thank you for your quick reply. I had two ICH9 motherboards but I discarded
them because both of them were broken... Now I don't have any ICH9 machine.
I have some ICH8s and one ICH10. All of them worked, so I had thought that
ICH9 worked.

 I'm sorry, but I'm busy because AsiaBSDCon starts today and I'll be away
from Tokyo for the next week. I would be happy if someone(TM) could debug
and test with a variety of ICH9 machines.

 Regards.



Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016
The NetBSD Foundation, Inc.  All rights reserved.
Copyright (c) 1982, 1986, 1989, 1991, 1993
The Regents of the University of California.  All rights reserved.

NetBSD 7.99.26 (DEJAH) #5: Wed Mar  9 17:36:37 CET 2016

r...@barsoom.hamartun.priv.no:/usr/obj/sys/arch/amd64/compile.amd64/DEJAH
total memory = 4083 MB
avail memory = 3945 MB
rnd: seeded with 128 bits
timecounter: Timecounters tick every 10.000 msec
Kernelized RAIDframe activated
timecounter: Timecounter "i8254" frequency 1193182 Hz quality 100
Dell Inc. Latitude E6400
mainbus0 (root)
cpu0 at mainbus0
cpu0: Intel(R) Core(TM)2 Duo CPU T9600  @ 2.80GHz, id 0x1067a
pci0 at mainbus0 bus 0: configuration mode 1
pci0: i/o space, memory space enabled, rd/line, rd/mult, wr/inv ok
pchb0 at pci0 dev 0 function 0: vendor 8086 product 2a40 (rev. 0x07)
agp0 at pchb0: can't find internal VGA config space
ppb0 at pci0 dev 1 function 0: vendor 8086 product 2a41 (rev. 0x07)
ppb0: PCI Express capability version 1  x16 @ 2.5GT/s
pci1 at ppb0 bus 1
pci1: i/o space, memory space enabled, rd/line, wr/inv ok
vga0 at pci1 dev 0 function 0: vendor 10de product 06eb (rev. 0xa1)
wsdisplay0 at vga0 kbdmux 1: console (80x25, vt100 emulation)
wsmux1: connecting to wsdisplay0
drm at vga0 not configured
wm0 at pci0 dev 25 function 0: 82801I mobile (AMT) LAN Controller (rev. 0x03)
wm0: interrupting at irq 11
wm0: PCI-Express bus
wm0: 2048 words FLASH
wm0: Ethernet address 00:26:b9:cd:21:c2
makphy0 at wm0 phy 2: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto
uhci0 at pci0 dev 26 function 0: vendor 8086 product 2937 (rev. 0x03)
uhci0: interrupting at irq 10
usb0 at uhci0: USB revision 1.0
uhci1 at pci0 dev 26 function 1: vendor 8086 product 2938 (rev. 0x03)
uhci1: interrupting at irq 3
usb1 at uhci1: USB revision 1.0
uhci2 at pci0 dev 26 function 2: vendor 8086 product 2939 (rev. 0x03)
uhci2: interrupting at irq 11
usb2 at uhci2: USB revision 1.0
ehci0 at pci0 dev 26 function 7: vendor 8086 product 293c (rev. 0x03)
ehci0: interrupting at irq 11
ehci0: BIOS has given up ownership
ehci0: EHCI version 1.0
ehci0: companion controllers, 2 ports each: uhci0 uhci1 uhci2
usb3 at ehci0: USB revision 2.0
hdaudio0 at pci0 dev 27 function 0: HD Audio Controller
hdaudio0: interrupting at irq 3
hdafg0 at hdaudio0: vendor 111d product 76b2
hdafg0: DAC00 2ch: Speaker [Built-In], HP Out [Jack]
hdafg0: DAC01 2ch: Speaker [Jack]
hdafg0: DIG02 2ch: SPDIF Out [Jack]
hdafg0: 2ch/0ch 44100Hz 48000Hz 88200Hz 96000Hz 192000Hz PCM16 PCM20 PCM24 AC3
audio0 at hdafg0: full duplex, playback, capture, mmap, independent
ppb1 at pci0 dev 28 function 0: vendor 8086 product 2940 (rev. 0x03)
ppb1: PCI Express capability version 1  x1 @ 2.5GT/s
pci2 at ppb1 bus 11
pci2: i/o space, memory space enabled, rd/line, wr/inv ok
ppb2 at pci0 dev 28 function 1: vendor 8086 product 2942 (rev. 0x03)
ppb2: PCI Express capability version 1  x1 @ 2.5GT/s
pci3 at ppb2 bus 12
pci3: i/o space, memory space enabled, rd/line, wr/inv ok
iwn0 at pci3 dev 0 function 0: vendor 8086 product 4235 (rev. 0x00)
iwn0: interrupting at irq 10
iwn0: MIMO 3T3R, MoW, address 00:21:6a:ba:79:e2
iwn0: 11a rates: 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 36Mbps 48Mbps 54Mbps
iwn0: 11b rates: 1Mbps 2Mbps 5.5Mbps 11Mbps
iwn0: 11g rates: 1Mbps 2Mbps 5.5Mbps 11Mbps 6Mbps 9Mbps 12Mbps 18Mbps 24Mbps 
36Mbps 48Mbps 54Mbps
ppb3 at pci0 dev 28 function 2: vendor 8086 product 2944 (rev. 0x03)
ppb3: PCI Express capability version 1  x1 @ 2.5GT/s
pci4 at ppb3 bus 13
pci4: i/o space, memory space enabled, rd/line, wr/inv ok

nbctfconvert and /etc/malloc.conf

2016-03-28 Thread Masanobu SAITOH

 Hi.

 While compiling amd64's GENERIC kernel, I found a problem:

> --- kern_sig_43.o ---
> /disk/sources/NetBSD-current/src/obj/tooldir.NetBSD-7.99.26-amd64/bin/nbctfconvert
>  -g -L VERSION -g kern_sig_43.o
> ERROR: kern_sig_43.c: die 20756: failed to get ref: Invalid attribute form 
> [dwarf_attrval_unsigned(216)]
> *** [kern_sig_43.o] Error code 1
> 

My machine's /etc/malloc.conf points to 'J'

> % ls -l /etc/malloc.conf
> lrwxr-xr-x  1 root  wheel  1 Nov  7  2014 /etc/malloc.conf -> J

Without "/etc/malloc.conf->J", I can compile kern_sig_43.o without any errors.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-03-07 Thread Masanobu SAITOH


Hi, Tom.

On 2016/03/07 21:34, Tom Ivar Helbekkmo wrote:

Masanobu SAITOH <msai...@execsw.org> writes:


  Is the port connected to a 100BaseT switch or a gigabit switch?


It's connected to a Cisco 2924 VLAN switch, and both the switch port and
the wm0 device on the laptop are explicitly configured for 100/full.


I see.


  Are you using dhcpcd? Have you tried with static IPv4 address?


Yes, and yes.  When I manually configure it for an address on the VLAN
that's actually on that port, and try to ping a neighbor, I can see this
(using tcpdump on the neighbor):

13:24:21.596308 ARP, Request who-has hamartun-gw.hamartun.priv.no tell 
172.27.202.50, length 46
13:24:21.596326 ARP, Reply hamartun-gw.hamartun.priv.no is-at 00:13:72:f7:00:06 
(oui Unknown), length 28

However, the ARP response is never received, so it just keeps trying.


And, could you try "ping6 ff02::1%wm0" and check if any reply
comes from other machines?


dejah# ping6 ff02::1%wm0
PING6(56=40+8+8 bytes) fe80::226:b9ff:fecd:21c2%wm0 --> ff02::1%wm0
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=0 hlim=64 time=0.037 ms
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=1 hlim=64 time=0.017 ms
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=2 hlim=64 time=0.017 ms
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=3 hlim=64 time=0.017 ms
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=4 hlim=64 time=0.017 ms
16 bytes from fe80::226:b9ff:fecd:21c2%wm0, icmp_seq=5 hlim=64 time=0.016 ms
^C
--- ff02::1%wm0 ping6 statistics ---
6 packets transmitted, 6 packets received, 0.0% packet loss
round-trip min/avg/max/std-dev = 0.016/0.020/0.037/0.008 ms

That's just the host itself answering, right?


Yes. Only the host itself.


If I do this on the
(working) iwn0 WiFi interface, I get more responses, with much longer
RTTs, and "(DUP!)" messages.  I assume that's neighbours answering?


 Yes, those replies are from neighbors.

 There must be a bug. sborrill@ reported a VLAN-related problem before. One
difficulty is that I can't reproduce the problem on my machines...
If I could reproduce it on my own machine, I could fix it...


-tih




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-03-07 Thread Masanobu SAITOH


Hi.

On 2016/03/07 21:12, Tobias Nygren wrote:

On Mon, 7 Mar 2016 20:57:02 +0900
Masanobu SAITOH <msai...@execsw.org> wrote:


One possibility is that the multicast filter table and the broadcast
bit in a register aren't set correctly on ICH9.


I'm not sure if this is relevant to the discussion, but I have a wm(4)
device (8086:1502) on -current that does not work after boot. It comes
to life only after running "tcpdump -n -i wm0" once. I am using vlan(4),
but haven't checked if that makes any difference. I usually run the
tcpdump command then forget about it until the next reboot.


 It must be a bug! Could you tell me how you set up the network interfaces,
including vlan? (e.g. the relevant part of /etc/rc.conf, /etc/ifconfig.xxx, and
the output of "ifconfig -a")


wm0 at pci0 dev 25 function 0: PCH2 LAN (82579LM) Controller (rev. 0x06)
wm0: interrupting at msi0 vec 0
wm0: PCI-Express bus
wm0: 2048 words FLASH
wm0: Ethernet address xx:xx:xx:xx:xx:xx
ihphy0 at wm0 phy 2: i82579 10/100/1000 media interface, rev. 3
ihphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT,
1000baseT-FDX, auto




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-03-07 Thread Masanobu SAITOH


Hi, Tom.

On 2016/03/07 20:42, Tom Ivar Helbekkmo wrote:

Masanobu SAITOH <msai...@execsw.org> writes:


  0) Did you check /var/log/messages to see whether device timeouts occurred?


No timeouts.  Everything behaves as if there were no incoming traffic.


  1) Is Intel AMT enabled in the BIOS?


No, it's not.


  2) Could you show me the output of "ifconfig -v wm0"?


wm0: flags=0x8843<UP,BROADCAST,RUNNING,SIMPLEX,MULTICAST> mtu 1500
capabilities=7ff80<TSO4,IP4CSUM_Rx,IP4CSUM_Tx,TCP4CSUM_Rx>
capabilities=7ff80<TCP4CSUM_Tx,UDP4CSUM_Rx,UDP4CSUM_Tx,TCP6CSUM_Rx>
capabilities=7ff80<TCP6CSUM_Tx,UDP6CSUM_Rx,UDP6CSUM_Tx,TSO6>
enabled=0
ec_capabilities=7<VLAN_MTU,VLAN_HWTAGGING,JUMBO_MTU>
ec_enabled=0
address: 00:26:b9:cd:21:c2
media: Ethernet 100baseTX full-duplex


 Is the port connected to a 100BaseT switch or a gigabit switch?


status: active
input: 0 packets, 0 bytes


 Strange.


output: 20 packets, 2608 bytes, 9 multicasts
inet 169.254.22.67 netmask 0x broadcast 169.254.255.255


 Are you using dhcpcd? Have you tried with a static IPv4 address?
Also, could you try "ping6 ff02::1%wm0" and check whether any reply
comes from another machine?

 One possibility is that the multicast filter table and the broadcast
bit in a register aren't set correctly on ICH9.

 Regards.


inet6 fe80::226:b9ff:fecd:21c2%wm0 prefixlen 64 scopeid 0x1


  Is it possible to test with 7.0 (RELEASE) and the latest snapshot
of the netbsd-7 branch?


I'll burn a couple of CDs, and try those tonight.

-tih




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm devices don't work under current amd64

2016-03-07 Thread Masanobu SAITOH


Hi.

On 2016/03/07 19:35, Tom Ivar Helbekkmo wrote:

I've recently set up a Dell E6400 laptop with NetBSD, and it's working
great - over WiFi.  The built-in wm ethernet interface doesn't work.  It
can send packets out, but can't receive anything.


 0) Did you check /var/log/messages to see whether device timeouts occurred?
 1) Is Intel AMT enabled in the BIOS?
 2) Could you show me the output of "ifconfig -v wm0"?


 I tried booting Linux
on the laptop, and it has no trouble with it.  Here's the device:

wm0 at pci0 dev 25 function 0: 82801I mobile (AMT) LAN Controller (rev. 0x03)
wm0: interrupting at msi0 vec 0
wm0: PCI-Express bus
wm0: 2048 words FLASH
wm0: Ethernet address 00:26:b9:cd:21:c2
makphy0 at wm0 phy 2: Marvell 88E1149 Gigabit PHY, rev. 1
makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

I first tried installing from a CD containing the NetBSD/amd64 install
image from October 10th (7.99.21).  After that failed to work for the wm
device, I built a new image, using a current from March 4th (7.99.26).
That also failed to work, but it doesn't get kernel panics relating to
arp on the WiFi, so the laptop is working nicely enough without wm0.


 Is it possible to test with 7.0 (RELEASE) and the latest snapshot
of the netbsd-7 branch?

 Thanks in advance.



However, since I'd built a newer current anyway, I tried upgrading
another Dell box, a 2650 that I use as a server.  It also has wm type
networking devices, which work just fine with 7.99.21.  Surprisingly,
upgrading to 7.99.26 broke them.  The hardware is different, of course:

wm0 at pci6 dev 7 function 0: Intel i82541GI 1000BASE-T Ethernet
(rev. 0x05)
wm0: interrupting at ioapic2 pin 0
wm0: 32-bit 66MHz PCI bus
wm0: 512 words (16 address bits) SPI EEPROM
wm0: Ethernet address 00:13:72:f7:00:06
igphy0 at wm0 phy 1: Intel IGP01E1000 Gigabit PHY, rev. 0
igphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseT, 
1000baseT-FDX, auto

I also use this wm0 differently from the plain use I've attempted with
the laptop: this server runs 802.1q VLAN trunking on wm0, and acts as a
router (with pf firewall) between a number of VLANs.  The failures of
the wm devices on the two machines may thus have different causes,
related to either the different hardware, or the difference in use.

Grateful for any hints, things to try, etc.  :)

-tih




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

x86's /proc/cpuinfo (was Re: libcrypto: Illegal instruction ``pshufb'' on non-sse3 CPU)

2016-01-12 Thread Masanobu SAITOH


On 2015/05/19 22:02, Masanobu SAITOH wrote:

Hi.

On 2015/05/19 10:45, Timo Buhrmester wrote:

As of late, when building (and installing) -head I end up with a libcrypto 
causing SIGILL, apparently due to using the ``pshufb'' instruction (which I 
believe is part of the SSE3 extension).

My CPU is, according to /proc/cpuinfo:


  For x86, /proc/cpuinfo has not been maintained for many years...
To avoid this problem, use "cpuctl identify 0" instead of /proc/cpuinfo
to check CPU features (PR#49246).

  Regards.


 I've committed a change to x86/procfs_machdep.c so that it now prints much
more information about CPU features!

Example:

 before:


processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 12
model name  : Intel(R) Xeon(R) CPU E3-1240L v3 @ 2.00GHz
stepping: 3
cpu MHz : 2000.24
fdiv_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm


 after:


processor   : 0
vendor_id   : GenuineIntel
cpu family  : 6
model   : 12
model name  : Intel(R) Xeon(R) CPU E3-1240L v3 @ 2.00GHz
stepping: 3
cpu MHz : 2000.27
fdiv_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 13
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb 
rdtscp lm pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma 
cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes 
xsave avx f16c rdrand lahf_lm abm fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 
erms invpcid rtm



 It's not perfect but it's better than before
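
 When checking a flags line like the ones above for a particular feature,
the match has to be on whole tokens: "sse" is a substring of "sse2", and
"sse3" is a substring of "ssse3" (pshufb is an SSSE3 instruction, while
"pni" in the list above is the name used for SSE3). The following is only a
minimal user-space sketch; the function name cpuinfo_has_flag() is made up
for illustration and is not part of procfs or cpuctl.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if c separates two tokens in a flags line. */
static bool
is_separator(char c)
{
	return c == ' ' || c == '\t' || c == '\n';
}

/*
 * Return true if `name` appears as a whole whitespace-separated token
 * in `flags`.  A bare strstr() is not enough, because "sse3" also
 * occurs inside "ssse3".
 */
bool
cpuinfo_has_flag(const char *flags, const char *name)
{
	size_t len = strlen(name);
	const char *p = flags;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == flags) || is_separator(p[-1]);
		bool ends = (p[len] == '\0') || is_separator(p[len]);

		if (starts && ends)
			return true;
		p++;	/* keep searching after this partial match */
	}
	return false;
}
```

 With the "before" flags of the Athlon II quoted in this thread, a lookup
for "ssse3" fails, which is consistent with the SIGILL on pshufb.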


 Thanks.










processor: 0
vendor_id: AuthenticAMD
cpu family: 15
model: 6
model name: AMD Athlon(tm) II X2 265 Processor
stepping: 3
cpu MHz: 3311.46
fdiv_bug: no
fpu: yes
fpu_exception: yes
cpuid level: 5
wp: yes
flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht

processor: 1
[another core]


(No sse3)


As far as gdb is concerned, in libcrypto.so.8.4, this is the point where it 
blows up:
[...]
   >0x7f7ff6f3ba87  pshufb %xmm6,%xmm0   ;bang
0x7f7ff6f3ba8c  add    $0x40,%r9 ;for context
0x7f7ff6f3ba90  pshufb %xmm6,%xmm1
0x7f7ff6f3ba95  pshufb %xmm6,%xmm2
0x7f7ff6f3ba9a  pshufb %xmm6,%xmm3
0x7f7ff6f3ba9f  paddd  %xmm9,%xmm0

(Core dump available on request)

The assembly code originates from 
crypto/external/bsd/openssl/dist/crypto/sha/asm/sha1-x86_64.pl (around line 
346), but seems to have been untouched for too long to be the culprit (a -head 
build in March didn't provoke the problem yet).

Any ideas?







--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: -current kernel on KVM with virtio disk fails to boot

2015-08-26 Thread Masanobu SAITOH



On 2015/08/24 15:34, Ryota Ozaki wrote:

On Mon, Aug 24, 2015 at 2:12 PM, Michael van Elst mlel...@serpens.de wrote:

ozak...@netbsd.org (Ryota Ozaki) writes:


Hi,



I got the following panic on bootup. It seems the recent
IPL_VM => IPL_NONE change in dk_attach causes it.


Yes. Unfortunately the ld drivers differ very much in
what context the start and iodone routines are called.


Should we apply it?


It would only help virtio. Either all drivers must
be adjusted or a common solution must be found.


I thought the panic happens only on the virtio disk driver.
Feel free to apply the patch if we choose the former.


 We have some virtio related PRs:

http://gnats.netbsd.org/48739
http://gnats.netbsd.org/49295



   ozaki-r




--
--
 Michael van Elst
Internet: mlel...@serpens.de
 A potential Snark may lurk in every tree.



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: wm0: wm0: ROM image version 3.11 is older than 3.25

2015-06-28 Thread Masanobu SAITOH


On 2015/06/28 0:33, Thomas Klausner wrote:

Hi!

I noted this message today:

wm0: ROM image version 3.11 is older than 3.25

What does it want to tell me?


 Perhaps there is nothing you need to do yourself :-)


  Thomas


 I210 and I211 have an erratum in which the PLL can be misconfigured,
leaving the system clock very slow. NVM image version 3.25 and newer
contain a workaround for the problem, but older images do not.
See "25. Slow System Clock" in the
Intel Ethernet Controller I210 Specification Update.

 Even if your card's NVM image is old, our driver has workaround
code to recover from it (wm_pll_workaround_i21()). If the code
detects the wrong state, the following message will be shown:


	if (wa_done)
		aprint_verbose_dev(sc->sc_dev, "I210 workaround done\n");
}


 The document says the problem occurs about once per 1000 power cycles.
I wrote the above aprint_verbose_dev() to check whether the code really
works, but I've never seen the message yet.

 One month ago, I wrote code to print the NVM image version
(I don't know whether that code is correct, because the documents
don't explain the format explicitly). If you find a new erratum
in a specification update document that mentions an NVM version, you
can compare it with the output of NetBSD's dmesg.

Note that it's different from the BOOT ROM version, which is printed
by the boot ROM.
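
 The comparison behind a message such as "ROM image version 3.11 is older
than 3.25" can be sketched as below. This is only an illustration under the
assumption that the version is a major.minor pair compared numerically; the
function name nvm_version_older() is hypothetical and this is not the code
in the wm(4) driver.

```c
#include <stdbool.h>

/*
 * Return true if version have_major.have_minor is older than
 * want_major.want_minor.  The major number is compared first; the
 * minor number only matters when the majors are equal.
 */
bool
nvm_version_older(unsigned have_major, unsigned have_minor,
    unsigned want_major, unsigned want_minor)
{
	if (have_major != want_major)
		return have_major < want_major;
	return have_minor < want_minor;
}
```

 For the message in this thread, nvm_version_older(3, 11, 3, 25) is true,
so the image predates the one that carries the PLL workaround.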

Regards.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: libcrypto: Illegal instruction ``pshufb'' on non-sse3 CPU

2015-05-19 Thread Masanobu SAITOH


Hi.

On 2015/05/19 10:45, Timo Buhrmester wrote:

As of late, when building (and installing) -head I end up with a libcrypto 
causing SIGILL, apparently due to using the ``pshufb'' instruction (which I 
believe is part of the SSE3 extension).

My CPU is, according to /proc/cpuinfo:


 For x86, /proc/cpuinfo has not been maintained for many years...
To avoid this problem, use "cpuctl identify 0" instead of /proc/cpuinfo
to check CPU features (PR#49246).

 Regards.




processor   : 0
vendor_id   : AuthenticAMD
cpu family  : 15
model   : 6
model name  : AMD Athlon(tm) II X2 265 Processor
stepping: 3
cpu MHz : 3311.46
fdiv_bug: no
fpu : yes
fpu_exception   : yes
cpuid level : 5
wp  : yes
flags   : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov 
pat pse36 clflush mmx fxsr sse sse2 ht

processor   : 1
[another core]


(No sse3)


As far as gdb is concerned, in libcrypto.so.8.4, this is the point where it 
blows up:
[...]
   0x7f7ff6f3ba87  pshufb %xmm6,%xmm0   ;bang
0x7f7ff6f3ba8c  add    $0x40,%r9 ;for context
0x7f7ff6f3ba90  pshufb %xmm6,%xmm1
0x7f7ff6f3ba95  pshufb %xmm6,%xmm2
0x7f7ff6f3ba9a  pshufb %xmm6,%xmm3
0x7f7ff6f3ba9f  paddd  %xmm9,%xmm0

(Core dump available on request)

The assembly code originates from 
crypto/external/bsd/openssl/dist/crypto/sha/asm/sha1-x86_64.pl (around line 
346), but seems to have been untouched for too long to be the culprit (a -head 
build in March didn't provoke the problem yet).

Any ideas?




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Currently: build failure

2015-05-12 Thread Masanobu SAITOH


On 2015/05/13 9:11, Paul Goyette wrote:

Change all of the FALSE --> false and TRUE --> true

:)


 Fixed.

 Thanks!



On Tue, 12 May 2015, bch wrote:


[...]

--- dependall-usr.sbin ---
/usr/src/usr.sbin/crash/../../sys/arch/amd64/amd64/db_disasm.c:216:19:
error: 'FALSE' undeclared here (not in a function)
/*10*/ { "",  FALSE, NONE,  0,   0 },
  ^
/usr/src/usr.sbin/crash/../../sys/arch/amd64/amd64/db_disasm.c:232:19:
error: 'TRUE' undeclared here (not in a function)
/*1f*/ { "nopl",  TRUE,  SDEP,  0,   "nopw" },

[...]



-
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:   |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee.com|
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd.org  |
-



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-04-24 Thread Masanobu SAITOH


On 2015/04/14 17:22, Masanobu SAITOH wrote:

On 2015/04/10 4:26, 6b...@6bone.informatik.uni-leipzig.de wrote:

On Wed, 8 Apr 2015, SAITOH Masanobu wrote:


Use new one:

http://www.netbsd.org/~msaitoh/ixg-20150407-1.dif



After a first test, it looks as if the interrupt throttling now works (better).


  Thanks. I committed the diff. I also committed a fix for
ifconfig -z not working.


  Regards
  Uwe


New one:

 http://www.netbsd.org/~msaitoh/ixg-20150414-0.dif

-
Sync ixg(4) up to FreeBSD r250108:
  - Cleanup some unused counters and some unused code.
  - Improve performance.
  - Fix flow control - don't override user value on re-init
  - Fix to make 1G optics work correctly
  - Change to interrupt enabling - some bits were incorrect
for certain hardware.
  - Certain stats fixes, remove a duplicate increment of
ierror, thanks to Scott Long for pointing these out.
  - Fix the setting of RX which related to multicast.
  - Some netmap related fixes.
-


Committed.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-04-14 Thread Masanobu SAITOH


On 2015/04/10 4:26, 6b...@6bone.informatik.uni-leipzig.de wrote:

On Wed, 8 Apr 2015, SAITOH Masanobu wrote:


Use new one:

http://www.netbsd.org/~msaitoh/ixg-20150407-1.dif



After a first test, it looks as if the interrupt throttling now works (better).


 Thanks. I committed the diff. I also committed a fix for
ifconfig -z not working.


  Regards
  Uwe


New one:

http://www.netbsd.org/~msaitoh/ixg-20150414-0.dif

-
Sync ixg(4) up to FreeBSD r250108:
 - Cleanup some unused counters and some unused code.
 - Improve performance.
 - Fix flow control - don't override user value on re-init
 - Fix to make 1G optics work correctly
 - Change to interrupt enabling - some bits were incorrect
   for certain hardware.
 - Certain stats fixes, remove a duplicate increment of
   ierror, thanks to Scott Long for pointing these out.
 - Fix the setting of RX which related to multicast.
 - Some netmap related fixes.
-

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-04-02 Thread Masanobu SAITOH


On 2015/04/02 18:13, Masanobu SAITOH wrote:

On 2015/04/02 0:01, Thor Lancelot Simon wrote:

On Tue, Mar 31, 2015 at 03:38:45PM +0200, 6b...@6bone.informatik.uni-leipzig.de 
wrote:

On Fri, 27 Mar 2015, Masanobu SAITOH wrote:


This change has been committed now.

New patch:

http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif



I have tested the patch and found no problems.


  Thanks. I'll commit that diff.


My server (HP G5) can handle packet rates of up to 200,000 packets per second
with the new driver. At that point CPU0 is running at 100% in interrupts.

If I have not miscalculated, 200,000 packets per second at an MTU of 1500 bytes
comes to a maximum of 2.4 Gbit/s.
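
As a quick check of the figure above: 200,000 packets per second at 1500
bytes each is 300,000,000 bytes per second, or 2.4 Gbit/s. The small sketch
below does the same arithmetic; the function name is made up for
illustration, and it deliberately ignores Ethernet framing overhead and
assumes every packet is a full 1500 bytes.

```c
#include <stdint.h>

/*
 * Payload throughput in bits per second for a given packet rate and
 * packet size in bytes.  Framing overhead (preamble, header, FCS,
 * inter-frame gap) is not included.
 */
uint64_t
line_rate_bits_per_sec(uint64_t packets_per_sec, uint64_t bytes_per_packet)
{
	return packets_per_sec * bytes_per_packet * 8;
}
```

With short packets the same packet rate moves far less data, which is why
the packet rate rather than the bit rate is the limit being discussed here.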

Is it possible to optimize some parameters for the interrupt throttling?


I think for your workload you may actually want interrupt distribution over
multiple CPUs.  But we have some support for that in -current now, right?

The next thing you're likely to run into is contention for the kernel lock
in ip_input.  But if I recall correctly how this driver's built, I don't
think you should be seeing that time charged to the interrupt handler, so
I *think* you are not there yet.

Thor


  What tls@ said is correct.

  ixg(4) has multiqueue support, but it's not enabled because
NetBSD-current doesn't have MSI/MSI-X support yet. The remaining work is to
merge knakahara@'s MSI/MSI-X support and use the API. That change should be
merged in a few weeks.

  For layer 2, NET_MPSAFE enables the MP capability.

  Layer 3 is a work in progress.

  For details about NET_MPSAFE and MSI/MSI-X, see Ryota Ozaki and
Kengo Nakahara's paper and presentations at AsiaBSDCon 2015:

 http://www.netbsd.org/gallery/presentations/


 There are some graphs from page 60 to 63 in their slides.



Other than NET_MPSAFE and MSI/MSI-X, ixg(4) still has room to
improve its performance. I have not merged all of FreeBSD's changes
yet. I checked all of them and found that there are some that
I have not merged yet.

  Thanks.




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-04-02 Thread Masanobu SAITOH


On 2015/04/02 0:01, Thor Lancelot Simon wrote:

On Tue, Mar 31, 2015 at 03:38:45PM +0200, 6b...@6bone.informatik.uni-leipzig.de 
wrote:

On Fri, 27 Mar 2015, Masanobu SAITOH wrote:


This change has been committed now.

New patch:

http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif



I have tested the patch and found no problems.


 Thanks. I'll commit that diff.


My server (HP G5) can handle packet rates of up to 200,000 packets per second
with the new driver. At that point CPU0 is running at 100% in interrupts.

If I have not miscalculated, 200,000 packets per second at an MTU of 1500 bytes
comes to a maximum of 2.4 Gbit/s.

Is it possible to optimize some parameters for the interrupt throttling?


I think for your workload you may actually want interrupt distribution over
multiple CPUs.  But we have some support for that in -current now, right?

The next thing you're likely to run into is contention for the kernel lock
in ip_input.  But if I recall correctly how this driver's built, I don't
think you should be seeing that time charged to the interrupt handler, so
I *think* you are not there yet.

Thor


 What tls@ said is correct.

 ixg(4) has multiqueue support, but it's not enabled because
NetBSD-current doesn't have MSI/MSI-X support yet. The remaining work is to
merge knakahara@'s MSI/MSI-X support and use the API. That change should be
merged in a few weeks.

 For layer 2, NET_MPSAFE enables the MP capability.

 Layer 3 is a work in progress.

 For details about NET_MPSAFE and MSI/MSI-X, see Ryota Ozaki and
Kengo Nakahara's paper and presentations at AsiaBSDCon 2015:

http://www.netbsd.org/gallery/presentations/

Other than NET_MPSAFE and MSI/MSI-X, ixg(4) still has room to
improve its performance. I have not merged all of FreeBSD's changes
yet. I checked all of them and found that there are some that
I have not merged yet.

 Thanks.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-03-27 Thread Masanobu SAITOH




New patch:

http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif


 This change has been committed now.

New patch:

http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif

This change synchronizes our ixg(4) driver up to FreeBSD r38149:

- Add TSO6 support.
- The max size in dma tag is changed from 65535 to 262140 (IXGBE_TSO_SIZE).
  The value is the same as other *BSDs. The change might cause an address
  space shortage (ixgbe_dmamap_create() might fail) on some machines.
- Fix a lot of bugs.
- Improve performance.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-03-27 Thread Masanobu SAITOH


On 2015/03/27 16:03, Masanobu SAITOH wrote:



New patch:

http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif


  This change has been committed now.

New patch:

 http://www.netbsd.org/~msaitoh/ixg-20150327-0.dif

This change synchronizes our ixg(4) driver up to FreeBSD r38149:


s/38149/238149/


- Add TSO6 support.
- The max size in dma tag is changed from 65535 to 262140 (IXGBE_TSO_SIZE).
   The value is the same as other *BSDs. The change might cause an address
   space shortage (ixgbe_dmamap_create() might fail) on some machines.
- Fix a lot of bugs.
- Improve performance.




--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-03-23 Thread Masanobu SAITOH


Hi, Uwe.


but I found a problem with vlan interfaces.

You can create a vlan interface with:

ifconfig vlan8 create
ifconfig vlan8 vlan 8 vlanif ixg0 up

The interface is created, but no packets arrive. ifconfig -v shows no 
inbound packets, but tcpdump on ixg0 shows tagged packets for VLAN 8.

The problem also existed with the previous ixg driver. With wm interfaces, the 
problem does not seem to exist.


Has a PR been filed for this problem? If not, could you file one?


Could you test with this patch?

Index: ixgbe.c
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/ixgbe/ixgbe.c,v
retrieving revision 1.14.2.2
diff -u -p -r1.14.2.2 ixgbe.c
--- ixgbe.c	24 Feb 2015 10:41:09 -0000	1.14.2.2
+++ ixgbe.c	23 Mar 2015 07:32:50 -0000
@@ -1064,6 +1064,9 @@ ixgbe_ifflags_cb(struct ethercom *ec)
 	else if ((change & (IFF_PROMISC | IFF_ALLMULTI)) != 0)
 		ixgbe_set_promisc(adapter);
 
+	/* Set up VLAN support and filter */
+	ixgbe_setup_vlan_hw_support(adapter);
+
 	IXGBE_CORE_UNLOCK(adapter);
 
 	return rc;



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: current status of ixg(4)

2015-03-22 Thread Masanobu SAITOH


On 2015/03/22 2:20, 6b...@6bone.informatik.uni-leipzig.de wrote:

On Sat, 21 Mar 2015, SAITOH Masanobu wrote:


New patch:

http://www.netbsd.org/~msaitoh/ixg-20150321-0.dif

Could you try with this patch again?


Now the patch works,


 BTW, what card (or chip) does your machine have?
X540, 82599 or 82598?


but I found a problem with vlan interfaces.

You can create a vlan interface with:

ifconfig vlan8 create
ifconfig vlan8 vlan 8 vlanif ixg0 up

The interface is created, but no packets arrive. ifconfig -v shows no 
inbound packets, but tcpdump on ixg0 shows tagged packets for VLAN 8.

The problem also existed with the previous ixg driver. With wm interfaces, the 
problem does not seem to exist.


Has a PR been filed for this problem? If not, could you file one?

 Thanks.


Regards
Uwe



--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

current status of ixg(4)

2015-03-20 Thread Masanobu SAITOH

 Hello.

 Yesterday, I committed some changes to ixg(4) on -current.

http://mail-index.netbsd.org/source-changes/2015/03/19/msg064110.html

I'll wait a few days for feedback on this change, and then
I'll send a pullup request to pullup-7@.


 Also, I made a patch to support Intel X540:

http://www.netbsd.org/~msaitoh/ixg-20150320-0.dif

 This change synchronizes our ixg(4) driver up to FreeBSD r230775.
It's still very old though... It's not well tested, and it'll take
a few days or more to sync with the latest FreeBSD driver.

 Any feedback is welcome!

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: amd64 radeondrmkms near-invisble text

2015-02-22 Thread Masanobu SAITOH


On 2015/02/21 0:12, John D. Baker wrote:

I was recently gifted with a Dell PowerEdge 2850 (which might end up
being a white elephant, judging by this thread:

   http://mail-index.netbsd.org/tech-kern/2015/02/11/msg018393.html

but that's for later concern).

I set up an NFS root and netboot arrangement for it (its disks were
decommissioned with extreme prejudice before I got it, so it has no
drive trays either).  First booting i386 7.99.4, I hit the problem with
pcdisplay0/vga0 at isa0.  Rebooting with -c and disabling these, it
panicked with cnopen: no console device.  This I tracked down to not
linking the firmware file into /libdata/firmware/radeon/R100_cp.bin
(/usr is a separate NFS mount shared by all machines of the same
architecture).

When I next rebooted, I thought I'd forgotten to disable pcdisplay/vga0
again as the screen looked black except for the bright white cursor.
It was in the wrong place, though.  (The pcdisplay/vga0 hang puts the
cursor at the top left, but this time it was at the bottom, and some
ways away from the left edge.)

Looking closer, I could just BARELY see the text emitted by the various
rc scripts.  For some reason, when radeondrmkms attaches on this machine,
the framebuffer console white text is exceptionally dim--the green kernel
messages are invisible.

After confirming that this machine could run the amd64 port, I switched
it over to that (7.99.5), but the radeondrmkms dim-ness persists.  The
cursor is bright white, however, as I would expect normal text to be.

VGA text mode/BIOS/etc. and operation prior to radeondrmkms attachment is
normal.

NetBSD 7.99.5 (GENERIC) #221: Wed Feb 18 04:46:21 CST 2015
 
sy...@yggdrasil.technoskunk.fur:/r0/build/current/obj/amd64/sys/arch/amd
64/compile/GENERIC
total memory = 6143 MB
avail memory = 5946 MB
[...]
Dell Computer Corporation PowerEdge 2850
[...]
acpivga0 at acpi0 (EVGA): ACPI Display Adapter
[...]
radeon0 at pci11 dev 13 function 0: ATI Technologies Radeon 7000/VE QY (rev. 
0x00)
[...]
drm: initializing kernel modesetting (RV100 0x1002:0x5159 0x1028:0x016D).
drm: register mmio base: 0xdf5e
drm: register mmio size: 65536
radeon0: info: VRAM: 128M 0xC800 - 0xCFFF (16M used)
radeon0: info: GTT: 512M 0xA800 - 0xC7FF
drm: Detected VRAM RAM=80M, BAR=128M
drm: RAM width 32bits DDR
Zone  kernel: Available graphics memory: 2134630 kiB
Zone   dma32: Available graphics memory: 2097152 kiB
drm: radeon: 16M of VRAM memory ready
drm: radeon: 512M of GTT memory ready.
drm: GART: num cpu pages 131072, num gpu pages 131072
drm: PCI GART of 512M enabled (table at 0x42A86000).
radeon0: info: WB disabled
radeon0: info: fence driver on ring 0 use gpu addr 0xa800 and cpu 
addr 0x0x80006f53f000
drm: Supports vblank timestamp caching Rev 2 (21.10.2013).
drm: Driver supports precise vblank timestamp query.
radeon0: interrupting at ioapic0 pin 18 (radeon)
drm: radeon: irq initialized.
drm: Loading R100 Microcode
drm: radeon: ring at 0xA8001000
drm: ring test succeeded in 1 usecs
drm: ib test succeeded in 0 usecs
drm: No TV DAC info found in BIOS
drm: Radeon Display Connectors
drm: Connector 0:
drm:   VGA-1
drm:   DDC: 0x60 0x60 0x60 0x60 0x60 0x60 0x60 0x60
drm:   Encoders:
drm: CRT1: INTERNAL_DAC1
drm: Connector 1:
drm:   VGA-2
drm:   DDC: 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c 0x6c
drm:   Encoders:
drm: CRT2: INTERNAL_DAC2
drm: Connector 2:
drm:   DVI-I-1
drm:   HPD1
drm:   DDC: 0x64 0x64 0x64 0x64 0x64 0x64 0x64 0x64
drm:   Encoders:
drm: CRT2: INTERNAL_DAC2
drm: DFP1: INTERNAL_TMDS1
radeondrmkmsfb0 at radeon0
radeon0: info: registered panic notifier
radeondrmkmsfb0: framebuffer at 0x80006f94, size 1024x768, depth 8, 
stride 1024
wsdisplay0 at radeondrmkmsfb0 kbdmux 1: console (default, vt100 emulation), 
using wskbd0
wsmux1: connecting to wsdisplay0
wskbd1: connecting to wsdisplay0


Has anyone else experienced anything like this?  I have only one other
radeon-equipped amd64 machine I've tried with a drmkms-enabled kernel
and it worked fine (although it refused to use resolutions above 1024x768
in X, when it's perfectly capable of 2048x1536 in UMS mode).


 Perhaps the problem is the same as my machines' problem reported in:

http://mail-index.netbsd.org/tech-x11/2015/01/16/msg001471.html

 The cursor is bright but everything else is very dark. It's too dark
to read even if I stare at the text.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Re: Fwd: Re: kern/48960: ichlpcib maps I/O space from ACPI BAR which ichsmb wants

2014-12-25 Thread Masanobu SAITOH


On 2014/12/25 11:56, Masanobu SAITOH wrote:

  Hi, all.

  This change fixes a problem where ichlpcib can't correctly map the GPIO
area. It will also fix some ACPI problems. Could you test it if you
have a problem with ACPI?

  Regards.


 The change was committed.

 Please report if you notice that the change fixes any problems
other than ichlpcib's GPIO.

--
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)

Fwd: Re: kern/48960: ichlpcib maps I/O space from ACPI BAR which ichsmb wants

2014-12-24 Thread Masanobu SAITOH


 Hi, all.

 This change fixes a problem where ichlpcib can't correctly map the GPIO
area. It will also fix some ACPI problems. Could you test it if you
have a problem with ACPI?

 Regards.


-------- Forwarded Message --------
Return-Path: bounces-netbsd-bugs-owner-msaitoh=execsw@netbsd.org
X-Original-To: msai...@execsw.org
Delivered-To: msai...@execsw.org
Received: from mail.netbsd.org (mail.NetBSD.org [IPv6:2001:4f8:3:7::25]) by 
vslock.execsw.org (Postfix) with ESMTPS id F2E5831A56B for 
msai...@execsw.org; Sat, 20 Dec 2014 12:14:00 +0900 (JST)
Received: by mail.netbsd.org (Postfix, from userid 605) id 9FA9F14A1F2; Sat, 20 
Dec 2014 03:13:50 + (UTC)
Delivered-To: netbsd-b...@netbsd.org
Received: from localhost (localhost [127.0.0.1]) by mail.netbsd.org (Postfix) 
with ESMTP id 2878F14A1EC; Sat, 20 Dec 2014 03:13:37 + (UTC)
X-Virus-Scanned: amavisd-new at NetBSD.org
Received: from mail.netbsd.org ([127.0.0.1]) by localhost (mail.NetBSD.org 
[127.0.0.1]) (amavisd-new, port 10025) with ESMTP id OfjewqTZ6r-4; Sat, 20 Dec 
2014 03:13:36 + (UTC)
Received: from vslock.execsw.org (unknown [IPv6:2001:240:694:1::2]) by 
mail.netbsd.org (Postfix) with ESMTP id 0761314A1BA; Sat, 20 Dec 2014 03:13:32 
+ (UTC)
Received: from [127.0.0.1] (unknown [192.168.1.11]) by vslock.execsw.org 
(Postfix) with ESMTPS id D36BF31A56B; Sat, 20 Dec 2014 12:13:29 +0900 (JST)
Message-ID: 5494e959.4020...@execsw.org
Date: Sat, 20 Dec 2014 12:13:29 +0900
From: SAITOH Masanobu msai...@execsw.org
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:31.0) Gecko/20100101 
Thunderbird/31.3.0
MIME-Version: 1.0
To: gnats-b...@netbsd.org, kern-bug-peo...@netbsd.org, gnats-ad...@netbsd.org, 
netbsd-b...@netbsd.org
CC: msai...@execsw.org
Subject: Re: kern/48960: ichlpcib maps I/O space from ACPI BAR which ichsmb 
wants
References: pr-kern-48...@gnats.netbsd.org 
20140702194215.9643d60...@jupiter.mumble.net 
20140702195000.a99e2a6...@mollari.netbsd.org
In-Reply-To: 20140702195000.a99e2a6...@mollari.netbsd.org
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: netbsd-bugs-ow...@netbsd.org
List-Id: netbsd-bugs.NetBSD.org
Precedence: bulk

On 2014/07/03 4:50, Taylor R Campbell wrote:

Number:         48960
Category:       kern
Synopsis:       ichlpcib maps I/O space from ACPI BAR which ichsmb wants
Confidential:   no
Severity:       serious
Priority:       medium
Responsible:    kern-bug-people
State:          open
Class:          sw-bug
Submitter-Id:   net
Arrival-Date:   Wed Jul 02 19:50:00 +0000 2014
Originator:     Taylor R Campbell campbell+net...@mumble.net
Release:        NetBSD-current as of 2014-07-02
Organization:
Environment:

Architecture: x86_64
Machine: amd64

Description:


ichlpcib(4) maps the space in PCI register 0x40, ACPI PMBASE,
for acpipmtimer, tcotimer, and speedstep support.  ichsmb(4)
tries to map space which is sometimes in the middle of this
region.


How-To-Repeat:


Boot a system with ichlpcib and ichsmb.  Observe that ichsmb0
fails to map its I/O space because ichlpcib0 already mapped
it.


Fix:


matt@ and jakllsch@ say it is wrong for ichlpcib(4) to map the
ACPI PMBASE region, which it does for acpipmtimer, tcotimer,
and speedstep support.

I imagine the Right Thing is to attach these only if ACPI is
disabled, and to write proper ACPI drivers under sys/dev/acpi
for them.  Disabling the ad hoc drivers until proper ACPI
drivers are written is probably not a good idea, however, and I
don't have time or the datasheets or hardware necessary to port
the ad hoc drivers to the ACPI subsystem.



 How about the following patch?

 This fixes a bug where ichlpcib(4) maps the I/O area incorrectly.
- The LPCIB_PCI_PMBASE and LPCIB_PCI_GPIO registers resemble PCI BARs but
  are not completely compatible with them. They define the base address
  and the type but do not describe the size, which is fixed at 128 bytes.
  So use pci_mapreg_submap().
- Make pci_mapreg_submap() extern again.
- Fix the calculation of the map size in pci_mapreg_submap().
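
 To illustrate the first point, here is a rough standalone sketch (not
 kernel code and not part of the patch) of decoding a BAR-like register
 that carries only a base address and a type bit, with the size supplied
 by the caller as a fixed 128 bytes. All ex_* names, the 0xff80 base mask
 and the bit-0 I/O indicator are assumptions made for this example. The
 overlap helper shows why a second driver cannot map a window that lies
 inside an already-mapped range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical names and values for illustration only. */
#define EX_IO_INDICATOR	0x00000001u	/* assumed: bit 0 set means I/O space */
#define EX_BASE_MASK	0x0000ff80u	/* assumed: 128-byte aligned base */
#define EX_REGION_SIZE	128u		/* fixed size stated in the mail */

struct ex_region {
	uint32_t base;
	uint32_t size;
};

/*
 * Decode a PMBASE/GPIO-style register. The register gives the base and
 * the type only, so the size is filled in from the fixed constant
 * instead of being probed as it would be for a real BAR.
 * Returns false if the register does not point at I/O space.
 */
static bool
ex_decode(uint32_t reg, struct ex_region *r)
{
	assert(r != NULL);
	if ((reg & EX_IO_INDICATOR) == 0)
		return false;
	r->base = reg & EX_BASE_MASK;
	r->size = EX_REGION_SIZE;
	return true;
}

/* True if [a, a + alen) and [b, b + blen) share at least one address. */
static bool
ex_overlaps(uint32_t a, uint32_t alen, uint32_t b, uint32_t blen)
{
	return a < b + blen && b < a + alen;
}
```

 With a made-up register value of 0x401, ex_decode() yields base 0x400 and
 size 128, so a 32-byte window at 0x460 overlaps the region while one at
 0x480 does not.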



Index: sys/dev/pci/pcivar.h
===================================================================
RCS file: /cvsroot/src/sys/dev/pci/pcivar.h,v
retrieving revision 1.100
diff -u -p -r1.100 pcivar.h
--- sys/dev/pci/pcivar.h	16 Oct 2014 12:31:23 -0000	1.100
+++ sys/dev/pci/pcivar.h	19 Dec 2014 17:08:14 -0000
@@ -269,6 +269,10 @@ int	pci_mapreg_info(pci_chipset_tag_t, p
 int	pci_mapreg_map(const struct pci_attach_args *, int, pcireg_t, int,
 	    bus_space_tag_t *, bus_space_handle_t *, bus_addr_t *,
 	    bus_size_t *);
+int	pci_mapreg_submap(const struct pci_attach_args *, int, pcireg_t, int,
+	    bus_size_t, bus_size_t, bus_space_tag_t *, bus_space_handle_t *,
+	    bus_addr_t *, bus_size_t *);
+

 int pci_find_rom(const struct pci_attach_args *,
