environment corrupt; missing value for QT_IM_MO [Was: [VirtualBox 4.3.26] Fail with assertion VERR_INVALID_POINTER (-6) - Invalid memory pointer.]

2016-01-05 Thread Andriy Gapon

Very weird: this suddenly started happening to me, but with LibreOffice.  I
cannot correlate the problem with any actions or events.

stderr:
soffice.bin: environment corrupt; missing value for QT_IM_MO

gdb:
Core was generated by `soffice.bin'.
Program terminated with signal SIGABRT, Aborted.
#0  thr_kill () at thr_kill.S:3
3   RSYSCALL(thr_kill)
[Current thread is 2 (Thread 816615000 (LWP 102134))]
(gdb) bt
#0  thr_kill () at thr_kill.S:3
#1  0x000800dc5ddb in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
#2  0x000800dc5d49 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
#3  0x000805231318 in tools::extendApplicationEnvironment() () from
/usr/local/lib/libreoffice/program/libtllo.so

Smells like a possible bug in libc...

On 27/03/2015 07:08, Yuri wrote:
> After ports update today, now both VirtualBox and VBoxManage fail right away
> with this message.
> This happens also without module vboxdrv.ko loaded.
> I rebuilt whole dependency tree of virtualbox-ose-4.3.26. but still have the
> same problem.
> 
> Yuri
> 
> 
> --- message ---
> $ VirtualBox
> VirtualBox: environment corrupt; missing value for QT_IM_MO
> 
> !!Assertion Failed!!
> Expression: RT_SUCCESS_NP(vrc)
> Location  :
> /usr/ports/emulators/virtualbox-ose/work/VirtualBox-4.3.26/src/VBox/Main/glue/initterm.cpp(466)
> nsresult com::Initialize(bool)
> VERR_INVALID_POINTER (-6) - Invalid memory pointer.
> Trace/BPT trap
> [yuri@yuri ~/proj/app-llvm-update/libwikiparser/web-demo]$ VBoxManage list vms
> VBoxManage: environment corrupt; missing value for QT_IM_MO
> 
> !!Assertion Failed!!
> Expression: RT_SUCCESS_NP(vrc)
> Location  :
> /usr/ports/emulators/virtualbox-ose/work/VirtualBox-4.3.26/src/VBox/Main/glue/initterm.cpp(466)
> nsresult com::Initialize(bool)
> VERR_INVALID_POINTER (-6) - Invalid memory pointer.
> Trace/BPT trap

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: environment corrupt; missing value for QT_IM_MO

2016-01-05 Thread Andriy Gapon
On 05/01/2016 10:45, Andriy Gapon wrote:
> 
> Very weird, this suddenly started happening to me but with libreoffice.  I can
> not correlate the problem with any actions /  events.
> 
> stderr:
> soffice.bin: environment corrupt; missing value for QT_IM_MO
> 
> gdb:
> Core was generated by `soffice.bin'.
> Program terminated with signal SIGABRT, Aborted.
> #0  thr_kill () at thr_kill.S:3
> 3   RSYSCALL(thr_kill)
> [Current thread is 2 (Thread 816615000 (LWP 102134))]
> (gdb) bt
> #0  thr_kill () at thr_kill.S:3
> #1  0x000800dc5ddb in __raise (s=6) at /usr/src/lib/libc/gen/raise.c:52
> #2  0x000800dc5d49 in abort () at /usr/src/lib/libc/stdlib/abort.c:65
> #3  0x000805231318 in tools::extendApplicationEnvironment() () from
> /usr/local/lib/libreoffice/program/libtllo.so
> 
> Smells like a possible bug in libc...

Is there a limit on the environment's size?
QT_IM_MODULE is reported by ps as the last variable.

ps axwwlee -p 4629
 UID  PID PPID CPU PRI NI    VSZ   RSS MWCHAN STAT TT     TIME COMMAND
1001 4629    1   0  21  0 351744 28428 select Ss    -  0:09.37
KDE_SESSION_VERSION=4 VENDOR=amd GS_LIB=/home/avg/.fonts GTK_IM_MODULE=xim
LOGNAME=avg LC_CTYPE=uk_UA.UTF-8 LC_MESSAGES=C LSCOLORS=Exfxcxdxbxegedabagacad
JAVA_VERSION=1.5 LANG=uk_UA.UTF-8 PAGER=more XDM_MANAGED=method=classic
OSTYPE=FreeBSD LC_TIME=en_GB.US-ASCII CDIFFCOLORS=1:36:31:35
XDG_DATA_DIRS=/usr/local/share::/usr/share:/usr/local/share:/usr/local/share/gnome
DESKTOP_SESSION=custom MACHTYPE=x86_64 CLICOLOR= MAIL=/var/mail/avg
PATH=/bin:/sbin:/usr/bin:/usr/sbin:/usr/local/bin:/usr/local/sbin:/usr/local/gnu-autotools/bin:/usr/games:/home/avg/bin:.
QT_PLUGIN_PATH=/usr/home/avg/.kde4/lib/kde4/plugins/:/usr/local/lib/kde4/plugins/
 EDITOR=vim
HOST=trant GTK2_RC_FILES=/home/avg/.gtkrc-2.0-kde4 JAVA_OS=native
KDE_SESSION_UID=1001 DISPLAY=:0 DM_CONTROL=/var/run/xdmctl OLDPWD=/home/avg
SSH_AUTH_SOCK=/tmp/ssh-CI9wwUUaf762/agent.4579 PWD=/home/avg
XDG_CURRENT_DESKTOP=KDE _=/home/avg/.xsession GROUP=staff
DBUS_SESSION_BUS_ADDRESS=unix:path=/tmp/dbus-11hLaxhk4u,guid=fe710affe5d1ea53e5034d0d56898723
USER=avg HOME=/home/avg EXINIT=set autoindent LC_COLLATE=uk_UA.UTF-8
LC_NUMERIC=C CHARSET=UTF-8 LC_MONETARY=C SHELL=/usr/local/bin/zsh
HOSTTYPE=FreeBSD IBM_NOLDT=1 MM_CHARSET=UTF-8 LD_BIND_NOW=true
KDE_FULL_SESSION=true MORE=-e -R -Pm?f%f:stdin .?lbLine %lb:?pb%pb\\%:?bbByte
%bb:-... ?eEND CDROM=/dev/cd0
XDG_CONFIG_DIRS=:/etc/xdg:/usr/local/etc/xdg:/usr/local/etc/xdg/xfce4
XDG_SESSION_COOKIE=3440bb84087c22a5d5d65b192c69-1451853601.307276-216584541
SSH_AGENT_PID=4580 BLOCKSIZE=K QT_IM_MODULE=xim kdeinit4: kdeinit4 Running...
(kdeinit4)


-- 
Andriy Gapon


New i915 graphic and WXGA

2016-01-05 Thread Andrey Fesenko
Hello, I'm testing the new driver:
https://wiki.freebsd.org/Graphics/Update%20i915%20GPU%20driver%20to%20Linux%203.8

My monitor is WXGA+ (16:10, 1440x900), but it is not detected correctly; only
4:3 modelines are offered. Full log: https://bsdnir.info/files/Xorg.i915_01-16.log

CPU: Intel(R) Core(TM) i5-4570 CPU @ 3.20GHz (3192.68-MHz K8-class CPU)
vgapci0@pci0:0:2:0: class=0x03 card=0xd0001458 chip=0x04128086 rev=0x06 hdr=0x00
    vendor     = 'Intel Corporation'
    device     = 'Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller'
    class      = display
    subclass   = VGA


Re: environment corrupt; missing value for QT_IM_MO

2016-01-05 Thread Ryan Stone
On Tue, Jan 5, 2016 at 3:54 AM, Andriy Gapon  wrote:

> Is there a limit on the environment's size?
>

If memory serves, this is bounded by ARG_MAX in sys/syslimits.h.  The value
is not tunable as far as I know, so if you want to experiment with changing
it you will have to change syslimits.h and recompile your kernel.
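
To compare an environment against that limit from userland, something like the following could be used. It is only a sketch: the helper names are made up for illustration, the byte accounting (strings plus pointers) is an assumption about how the kernel counts and not taken from the exec code, and ARG_MAX bounds the arguments and the environment together. The second helper looks for an entry with no '=', which, as far as I recall, is the condition under which libc prints "environment corrupt; missing value for ...".

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

/*
 * Approximate number of bytes an environment vector occupies: every string
 * with its NUL terminator, one pointer per entry, and the terminating NULL
 * pointer.  (Assumed accounting; the kernel's exact bookkeeping may differ.)
 */
size_t env_bytes(char *const envp[])
{
    size_t total = sizeof(char *);  /* terminating NULL pointer */

    for (size_t i = 0; envp[i] != NULL; i++)
        total += strlen(envp[i]) + 1 + sizeof(char *);
    return total;
}

/* First entry that has a name but no '=' (and so no value), or NULL. */
const char *first_entry_without_value(char *const envp[])
{
    for (size_t i = 0; envp[i] != NULL; i++)
        if (strchr(envp[i], '=') == NULL)
            return envp[i];
    return NULL;
}

/* Run-time value of ARG_MAX, or -1 if the limit is indeterminate. */
long arg_max(void)
{
    return sysconf(_SC_ARG_MAX);
}
```

Calling env_bytes(environ) (with "extern char **environ;" declared) and comparing the result with arg_max() would show how close a session is to the limit; first_entry_without_value(environ) would return a truncated entry such as "QT_IM_MO" if one is present.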


Re: Possible bug in or around posix_fadvise after r292326

2016-01-05 Thread Konstantin Belousov
On Mon, Jan 04, 2016 at 10:05:21PM -0800, Benno Rice wrote:
> Hi Konstantin,
> 
> I recently updated my dev box to r292962. After doing this I attempted to set 
> up PostgreSQL 9.4. When I ran initdb the last phase hung. Using procstat -kk 
> I found it appeared to be stuck in a loop inside a posix_fadvise syscall. I 
> could not ^C or ^Z the initdb process. I could kill it but a subsequent 
> attempt to rm -rf the /usr/local/pgsql/data directory also got stuck and was 
> unkillable by any means. Rebooting allowed me to remove the directory but the 
> initdb process still hung when I re-ran it.
> 
> I tried PostgreSQL 9.3 with similar results.
> 
> Looking at the source code for initdb I found that it calls posix_fadvise 
> like so[1]:
> 
>  /*
>   * We do what pg_flush_data() would do in the backend: prefer to use
>   * sync_file_range, but fall back to posix_fadvise.  We ignore errors
>   * because this is only a hint.
>   */
>  #if defined(HAVE_SYNC_FILE_RANGE)
>  (void) sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
>  #elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
>  (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
>  #else
>  #error PG_FLUSH_DATA_WORKS should not have been defined
>  #endif
> 
> Looking for recent commits involving POSIX_FADV_DONTNEED I found r292326:
> 
> https://svnweb.freebsd.org/changeset/base/292326 
> 
> 
> Backing this revision out allowed the initdb process to complete.
> 
> My current theory is that some how we're getting ENOLCK or EAGAIN from the 
> BUF_TIMELOCK call in bnoreuselist:
> 
> https://svnweb.freebsd.org/base/head/sys/kern/vfs_subr.c?view=annotate#l1676 
> 
> 
> Leading to an infinite loop in vop_stdadvise:
> 
> https://svnweb.freebsd.org/base/head/sys/kern/vfs_default.c?annotate=292373#l1083
>  
> 
> 
> I haven't managed to dig any deeper than that yet.
> 
> Is there any other information I could give you to help narrow this down?

I do not see this issue locally.

When the hang in initdb occurs, what is the state of the initdb thread
which performs advise()?  Is it in the "brlsfl" sleep, or is the thread running?

If buffer lock is not available, and this is the cause of the ENOLCK/EAGAIN,
then the question is who is the owner of the corresponding buffer lock.
You could overview the state of the system with 'ps' command in ddb, and
'show alllocks' would list owner, unless buffer was async.

Also, I do not quite understand the behaviour of SIGINT/SIGKILL.  Could
it be that the process was not killed by SIGKILL as well ?  It would be
consistent with the vnode lock still owned and preventing the accesses.


Re: Possible bug in or around posix_fadvise after r292326

2016-01-05 Thread Konstantin Belousov
On Tue, Jan 05, 2016 at 01:07:40PM +0200, Konstantin Belousov wrote:
> On Mon, Jan 04, 2016 at 10:05:21PM -0800, Benno Rice wrote:
> > Hi Konstantin,
> > 
> > I recently updated my dev box to r292962. After doing this I attempted to 
> > set up PostgreSQL 9.4. When I ran initdb the last phase hung. Using 
> > procstat -kk I found it appeared to be stuck in a loop inside a 
> > posix_fadvise syscall. I could not ^C or ^Z the initdb process. I could 
> > kill it but a subsequent attempt to rm -rf the /usr/local/pgsql/data 
> > directory also got stuck and was unkillable by any means. Rebooting allowed 
> > me to remove the directory but the initdb process still hung when I re-ran 
> > it.
> > 
> > I tried PostgreSQL 9.3 with similar results.
> > 
> > Looking at the source code for initdb I found that it calls posix_fadvise 
> > like so[1]:
> > 
> >  /*
> >   * We do what pg_flush_data() would do in the backend: prefer to use
> >   * sync_file_range, but fall back to posix_fadvise.  We ignore errors
> >   * because this is only a hint.
> >   */
> >  #if defined(HAVE_SYNC_FILE_RANGE)
> >  (void) sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
> >  #elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
> >  (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
> >  #else
> >  #error PG_FLUSH_DATA_WORKS should not have been defined
> >  #endif
> > 
> > Looking for recent commits involving POSIX_FADV_DONTNEED I found r292326:
> > 
> > https://svnweb.freebsd.org/changeset/base/292326 
> > 
> > 
> > Backing this revision out allowed the initdb process to complete.
> > 
> > My current theory is that some how we're getting ENOLCK or EAGAIN from 
> > the BUF_TIMELOCK call in bnoreuselist:
> > 
> > https://svnweb.freebsd.org/base/head/sys/kern/vfs_subr.c?view=annotate#l1676
> >  
> > 
> > 
> > Leading to an infinite loop in vop_stdadvise:
> > 
> > https://svnweb.freebsd.org/base/head/sys/kern/vfs_default.c?annotate=292373#l1083
> >  
> > 
> > 
> > I haven't managed to dig any deeper than that yet.
> > 
> > Is there any other information I could give you to help narrow this down?
> 
> I do not see this issue locally.
> 
> When the hang in initdb occur, what is the state of the initdb thread
> which performs advise() ?  Is it "brlsfl" sleep, or is the thread running ?
> 
> If buffer lock is not available, and this is the cause of the ENOLCK/EAGAIN,
> then the question is who is the owner of the corresponding buffer lock.
> You could overview the state of the system with 'ps' command in ddb, and
> 'show alllocks' would list owner, unless buffer was async.
> 
> Also, I do not quite understand the behaviour of SIGINT/SIGKILL.  Could
> it be that the process was not killed by SIGKILL as well ?  It would be
> consistent with the vnode lock still owned and preventing the accesses.

Just in case, if this is due to the quadratic loop behaviour, there
is no need to restart from the very start in the new stdadvise(DONTNEED)
implementation.  So regardless of the answers to the questions above,
you might also try this patch.

diff --git a/sys/kern/vfs_default.c b/sys/kern/vfs_default.c
index fd83f87..3da8618 100644
--- a/sys/kern/vfs_default.c
+++ b/sys/kern/vfs_default.c
@@ -1080,15 +1080,9 @@ vop_stdadvise(struct vop_advise_args *ap)
         bsize = vp->v_bufobj.bo_bsize;
         startn = ap->a_start / bsize;
         endn = ap->a_end / bsize;
-        for (;;) {
-                error = bnoreuselist(&bo->bo_clean, bo, startn, endn);
-                if (error == EAGAIN)
-                        continue;
+        error = bnoreuselist(&bo->bo_clean, bo, startn, endn);
+        if (error == 0)
                 error = bnoreuselist(&bo->bo_dirty, bo, startn, endn);
-                if (error == EAGAIN)
-                        continue;
-                break;
-        }
         BO_RUNLOCK(bo);
         VOP_UNLOCK(vp, 0);
         break;
diff --git a/sys/kern/vfs_subr.c b/sys/kern/vfs_subr.c
index ace97e8..8cac32f 100644
--- a/sys/kern/vfs_subr.c
+++ b/sys/kern/vfs_subr.c
@@ -1670,6 +1670,7 @@ bnoreuselist(struct bufv *bufv, struct bufobj *bo, daddr_t startn, daddr_t endn)
         ASSERT_BO_LOCKED(bo);
 
         for (lblkno = startn;; lblkno++) {
+again:
                 bp = BUF_PCTRIE_LOOKUP_GE(&bufv->bv_root, lblkno);
                 if (bp == NULL || bp->b_lblkno >= endn)
                         break;
@@ -1677,7 +1678,9 @@ bnoreuselist(struct bufv *bufv, struct bufobj *bo, daddr_t startn, daddr_t endn)
                     LK_INTERLOCK, BO_LOCKPTR(bo), "brlsfl", 0, 0);
                 if (error != 0) {
                         BO_RLOCK(bo);
-                        return 
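
To illustrate the "quadratic loop behaviour" remark, here is a small userland model. It is not the kernel code: the function names are invented, and it assumes, purely for illustration, that the lock attempt on every block fails exactly once. It counts lock attempts when each failure restarts the scan at the first block versus when the scan retries the same block.

```c
#include <stdbool.h>
#include <stddef.h>

#define NBLOCKS_MAX 64

/* Hypothetical contention model: the first attempt on a block fails,
 * every later attempt on that block succeeds. */
static bool try_lock(bool contended[], size_t i)
{
    if (contended[i]) {
        contended[i] = false;
        return false;
    }
    return true;
}

/* Count lock attempts over n blocks (capped at NBLOCKS_MAX). */
static size_t count_attempts(size_t n, bool restart_from_start)
{
    bool contended[NBLOCKS_MAX];
    size_t attempts = 0;
    size_t i;

    if (n > NBLOCKS_MAX)
        n = NBLOCKS_MAX;
    for (i = 0; i < n; i++)
        contended[i] = true;

    i = 0;
    while (i < n) {
        attempts++;
        if (try_lock(contended, i))
            i++;                /* block handled, move on */
        else if (restart_from_start)
            i = 0;              /* rescan everything handled so far */
        /* otherwise: retry the same block on the next iteration */
    }
    return attempts;
}

/* Every failure restarts at block 0: n * (n + 3) / 2 attempts. */
size_t attempts_with_restart(size_t n)
{
    return count_attempts(n, true);
}

/* Every failure retries in place: 2 * n attempts. */
size_t attempts_with_resume(size_t n)
{
    return count_attempts(n, false);
}
```

Under this model 32 blocks cost 560 attempts with restarts and 64 without, which is the kind of difference the patch is meant to avoid. It does not by itself explain a hang that never terminates.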

Re: Can't run `make universe` on universe11a.freebsd.org and ref11-amd64.freebsd.org (anymore); [shell] globbing is broken [there]

2016-01-05 Thread Jilles Tjoelker
On Mon, Jan 04, 2016 at 12:08:43PM -0800, NGie Cooper wrote:
> > It looks like a libc problem though, because I’ve run into this
> > issue with both /bin/sh and bash (my default login shell). I’m not
> > sure why this isn’t reproing on my VM (yet).

> > This doesn’t repro in universe10a.freebsd.org (another jail on the
> > same machine I think…).

> > It was working yesterday on ref11-amd64.freebsd.org before the
> > system was upgraded (it was running October sources), and wasn’t
> > working on universe11a.freebsd.org yesterday (it was running
> > December sources yesterday).

> delphij@ pointed me in the right direction (thanks :)..). Globbing
> expressions seems extremely broken with LANG set to en_US.UTF-8 [at
> least].

> $ echo $LANG
> en_US.UTF-8
> $ hostname
> universe11a.freebsd.org
> $ (unset LANG; cd sys/amd64/conf && echo [A-Z0-9]*[A-Z0-9])
> DEFAULTS GENERIC GENERIC-NODEBUG MINIMAL NOTES

Traditionally, ranges in bracket expressions like A-Z meant characters
that collate between A and Z, inclusive. Although this used to be in
POSIX and is widely implemented, it does not make much sense.
POSIX.1-2008 leaves ranges undefined in locales other than the POSIX
locale.

Therefore, it is an option to disable collation for ranges and just
compare character codes.

The problem started to occur more often with the new collation code,
which supports UTF-8 and uses CLDR's different collation order, but
strange results from [A-Z] can be encountered in much older versions.

Bash behaves similarly to sh, but supports 'shopt -s globasciiranges' to
disable collation. In some sense this is strange, an option that needs
to be enabled to provide the behaviour most people expect.

Alternatively, the pattern could be rewritten to be locale-sensitive:
[[:upper:][:digit:]]*[[:upper:][:digit:]]

-- 
Jilles Tjoelker
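
The character-class form of the pattern can also be tried outside the shell with fnmatch(3). This is a minimal sketch: the function name is made up, and the file names in the usage note are just examples of what a kernel conf directory might contain. No setlocale() call is made, so the program runs in the "C" locale.

```c
#include <fnmatch.h>
#include <stdbool.h>

/* Same pattern as above: POSIX character classes instead of the
 * A-Z and 0-9 ranges, so the result does not depend on collation order. */
static const char config_pattern[] =
    "[[:upper:][:digit:]]*[[:upper:][:digit:]]";

/* True if name starts and ends with an upper-case letter or a digit. */
bool looks_like_kernel_config(const char *name)
{
    return fnmatch(config_pattern, name, 0) == 0;
}
```

With this, looks_like_kernel_config("GENERIC-NODEBUG") is true, while a name such as "GENERIC.hints" is rejected because it ends in a lower-case letter.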

Re: TCP Stack: Challenge ACKs and Timestamps

2016-01-05 Thread hiren panchasara
bcc: current@
Moving the discussion to transport@.

On 12/28/15 at 12:25P, Sam Kumar wrote:
> Hello,
> I am working with the code for the TCP Stack. I noticed that support for
> Challenge Acks have been added since the latest release (10.2.0), and I
> think I may have found a bug in how they relate to TCP timestamps. All line
> numbers below are those in commit e66e064c45687b5d294565dbd829b419848f7992.
> 
> Looking at tcp_input.c, at lines 1594 to 1604, I see code that expects a
> timestamp to be in every segment during the session, if they were
> negotiated when the connection was being established.
> (
> https://github.com/freebsd/freebsd/blob/master/sys/netinet/tcp_input.c#L1595
> )
> 
> Looking at tcp_input.c, at lines 2161 and 2188, I see that Challenge ACKs
> are sent via calls to tcp_respond().
> (
> https://github.com/freebsd/freebsd/blob/master/sys/netinet/tcp_input.c#L2161
> and
> https://github.com/freebsd/freebsd/blob/master/sys/netinet/tcp_input.c#L2188
> )
> 
> Looking at tcp_subr.c, at line 978, I see that the segment sent by
> tcp_respond() never contains TCP options.
> (https://github.com/freebsd/freebsd/blob/master/sys/netinet/tcp_subr.c#L978)
> 
> Therefore, it seems to me that Challenge ACKs will never contain any TCP
> options. This violates the condition that once timestamps are negotiated,
> they must be present in every segment.
> 
> Please let me know if I am mistaken, or if this is actually a bug.
> 
> Sam Kumar
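
One way to check the claim on captured segments is to walk the option bytes of the TCP header and look for the Timestamps option (kind 8, length 10). The sketch below is not stack code; the function name and the buffer-based interface are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
    OPT_EOL = 0,            /* end of option list */
    OPT_NOP = 1,            /* one-byte padding */
    OPT_TIMESTAMP = 8,      /* Timestamps option kind */
    OPT_TIMESTAMP_LEN = 10  /* kind + length + two 4-byte values */
};

/*
 * Walk the option area of a TCP header (the bytes after the fixed 20-byte
 * header) and report whether a well-formed Timestamps option is present.
 */
bool has_timestamp_option(const uint8_t *opts, size_t len)
{
    size_t i = 0;

    while (i < len) {
        uint8_t kind = opts[i];

        if (kind == OPT_EOL)
            break;
        if (kind == OPT_NOP) {
            i++;
            continue;
        }
        if (i + 1 >= len)
            break;                          /* no room for a length byte */

        uint8_t optlen = opts[i + 1];
        if (optlen < 2 || i + optlen > len)
            break;                          /* malformed option */
        if (kind == OPT_TIMESTAMP && optlen == OPT_TIMESTAMP_LEN)
            return true;
        i += optlen;
    }
    return false;
}
```

A segment with an empty option area, which is what the message above describes for tcp_respond(), makes this return false.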




RE: FreeBsd MCA Panic Crash !!

2016-01-05 Thread shahzaibcb
> Are those the only MCA errors you're seeing? The reason I ask is that
> there's an errata in the X5600 series which can cause an "internal timer
> error" MCA to be logged after another uncorrectable MCA occurs.

About 90% of the incidents are these MCA errors. For the remaining 10% there
is no log at all. For example, one of the Supermicro servers rebooted two days
ago but did not generate a crash dump under /var/crash, even though dumps are
enabled in rc.conf:

dumpdev="AUTO"
dumpdir="/var/crash"

> This seems to me like it would be a CPU failure.  Can you try replacing
> the CPU itself?  I've seen this exact message on a different board, and
> the cause was a failing CPU.

We are thinking of replacing the X5690 CPUs with X5675s.

> Well, mcelog has this hardcoded and prints this for every MCA just as a
> matter of course.  It isn't selective but assumes every machine check is
> a hardware error (which they are, though some are warnings for corrected
> events that you can ignore as the hardware hasn't degraded enough to
> warrant replacement.  However, corrected events don't generate panics,
> just messages in the logs, and only a subset of corrected events include
> the "yellow / green" indicators for which you can ignore "green" events.
> Even corrected ECC errors I would ignore if you get a few events with
> a count of 1 that don't recur).

Each time the MCA error occurs, the server goes down. Please advise how we
should tackle this issue.

> Depending on the CPU model, you can determine more info about the
> error using the CPU manuals (for Intel the SDM).

The CPU is an X5690 in a Supermicro server. Is there a link where we can get
the manual for this CPU?
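
As a rough aid for reading a logged machine-check status value against the SDM, the architectural flag bits of an IA32_MCi_STATUS register can be pulled apart as sketched below. The macro, struct and function names are local to this sketch, and no status value from the affected servers appears in this thread, so any value passed in is the reader's own.

```c
#include <stdbool.h>
#include <stdint.h>

/* Architectural bits of IA32_MCi_STATUS (names local to this sketch). */
#define MCI_VAL   (UINT64_C(1) << 63)  /* register contents are valid */
#define MCI_OVER  (UINT64_C(1) << 62)  /* an earlier error was overwritten */
#define MCI_UC    (UINT64_C(1) << 61)  /* error was not corrected */
#define MCI_EN    (UINT64_C(1) << 60)  /* reporting was enabled */
#define MCI_PCC   (UINT64_C(1) << 57)  /* processor context corrupt */

struct mc_status {
    bool valid;
    bool overflow;
    bool uncorrected;
    bool enabled;
    bool context_corrupt;
    uint16_t mca_code;    /* bits 15:0, architectural MCA error code */
    uint16_t model_code;  /* bits 31:16, model-specific error code */
};

/* Split a raw 64-bit status value into its flag bits and error codes. */
struct mc_status decode_mc_status(uint64_t status)
{
    struct mc_status s;

    s.valid = (status & MCI_VAL) != 0;
    s.overflow = (status & MCI_OVER) != 0;
    s.uncorrected = (status & MCI_UC) != 0;
    s.enabled = (status & MCI_EN) != 0;
    s.context_corrupt = (status & MCI_PCC) != 0;
    s.mca_code = (uint16_t)(status & 0xffff);
    s.model_code = (uint16_t)((status >> 16) & 0xffff);
    return s;
}
```

An event with "uncorrected" set is the kind that can bring the machine down; events with it clear are the corrected ones described in the quoted text. The meaning of mca_code still has to be looked up in the SDM.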



--
View this message in context: 
http://freebsd.1045724.n5.nabble.com/FreeBsd-MCA-Panic-Crash-tp6064691p6065043.html
Sent from the freebsd-current mailing list archive at Nabble.com.


Re: Possible bug in or around posix_fadvise after r292326

2016-01-05 Thread Peter Holm
On Tue, Jan 05, 2016 at 01:07:40PM +0200, Konstantin Belousov wrote:
> On Mon, Jan 04, 2016 at 10:05:21PM -0800, Benno Rice wrote:
> > Hi Konstantin,
> > 
> > I recently updated my dev box to r292962. After doing this I attempted to 
> > set up PostgreSQL 9.4. When I ran initdb the last phase hung. Using 
> > procstat -kk I found it appeared to be stuck in a loop inside a 
> > posix_fadvise syscall. I could not ^C or ^Z the initdb process. I could 
> > kill it but a subsequent attempt to rm -rf the /usr/local/pgsql/data 
> > directory also got stuck and was unkillable by any means. Rebooting allowed 
> > me to remove the directory but the initdb process still hung when I re-ran 
> > it.
> > 
> > I tried PostgreSQL 9.3 with similar results.
> > 
> > Looking at the source code for initdb I found that it calls posix_fadvise 
> > like so[1]:
> > 
> >  /*
> >   * We do what pg_flush_data() would do in the backend: prefer to use
> >   * sync_file_range, but fall back to posix_fadvise.  We ignore errors
> >   * because this is only a hint.
> >   */
> >  #if defined(HAVE_SYNC_FILE_RANGE)
> >  (void) sync_file_range(fd, 0, 0, SYNC_FILE_RANGE_WRITE);
> >  #elif defined(USE_POSIX_FADVISE) && defined(POSIX_FADV_DONTNEED)
> >  (void) posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
> >  #else
> >  #error PG_FLUSH_DATA_WORKS should not have been defined
> >  #endif
> > 
> > Looking for recent commits involving POSIX_FADV_DONTNEED I found r292326:
> > 
> > https://svnweb.freebsd.org/changeset/base/292326 
> > 
> > 
> > Backing this revision out allowed the initdb process to complete.
> > 
> > My current theory is that some how we're getting ENOLCK or EAGAIN from 
> > the BUF_TIMELOCK call in bnoreuselist:
> > 
> > https://svnweb.freebsd.org/base/head/sys/kern/vfs_subr.c?view=annotate#l1676
> >  
> > 
> > 
> > Leading to an infinite loop in vop_stdadvise:
> > 
> > https://svnweb.freebsd.org/base/head/sys/kern/vfs_default.c?annotate=292373#l1083
> >  
> > 
> > 
> > I haven't managed to dig any deeper than that yet.
> > 
> > Is there any other information I could give you to help narrow this down?
> 
> I do not see this issue locally.
> 

I do:

(kgdb) f 9
#9  0x80ac7956 in vop_stdadvise (ap=0xfe081dc6d930) at ../../../kern/vfs_default.c:1087
1087                    error = bnoreuselist(&bo->bo_dirty, bo, startn, endn);
(kgdb) l
1082            endn = ap->a_end / bsize;
1083            for (;;) {
1084                    error = bnoreuselist(&bo->bo_clean, bo, startn, endn);
1085                    if (error == EAGAIN)
1086                            continue;
1087                    error = bnoreuselist(&bo->bo_dirty, bo, startn, endn);
1088                    if (error == EAGAIN)
1089                            continue;
1090                    break;
1091            }
(kgdb) info loc
vp = (struct vnode *) 0xf8008bdaa9c0
bo = (struct bufobj *) 0xf8008bdaab28
startn = 0x0
endn = 0x
start = 0x0
end = 0x8000
bsize = 0x8000
error = 0x0
(kgdb)

https://people.freebsd.org/~pho/stress/log/kostik855.txt

- Peter


Re: FreeBsd MCA Panic Crash !!

2016-01-05 Thread shahzaibcb
We are thinking of replacing the X5690 CPUs with X5675s.



--
View this message in context: 
http://freebsd.1045724.n5.nabble.com/FreeBsd-MCA-Panic-Crash-tp6064691p6065039.html


Re: FreeBsd MCA Panic Crash !!

2016-01-05 Thread shahzaibcb
Each time the MCA error occurs, the server goes down. Please advise how we
should tackle this issue.



--
View this message in context: 
http://freebsd.1045724.n5.nabble.com/FreeBsd-MCA-Panic-Crash-tp6064691p6065041.html


RE: FreeBsd MCA Panic Crash !!

2016-01-05 Thread shahzaibcb
About 90% of the incidents are these MCA errors. For the remaining 10% there
is no log at all. For example, one of the Supermicro servers rebooted two days
ago but did not generate a crash dump under /var/crash, even though dumps are
enabled in rc.conf, and we have no idea what went wrong:

dumpdev="AUTO"
dumpdir="/var/crash"



--
View this message in context: 
http://freebsd.1045724.n5.nabble.com/FreeBsd-MCA-Panic-Crash-tp6064691p6065042.html


kernel panic by enabling net.inet.ip.random_id

2016-01-05 Thread Shawn Webb
Hey All,

Here's a kernel panic I'm experiencing by enabling net.inet.ip.random_id
at boot.

I'm on latest HEAD on amd64 in bhyve. I'll soon-ish be testing on native
hardware with VIMAGE enabled.

=== Begin Log ===
Kernel page fault with the following non-sleepable locks held:
exclusive sleep mutex ip_id_mtx (ip_id_mtx) r = 0 (0x81c54830) locked @ 
/usr/src/sys/netinet/ip_id.c:227
stack backtrace:
#0 0x80a79620 at witness_debugger+0x70
#1 0x80a7a937 at witness_warn+0x3d7
#2 0x80e6b887 at trap_pfault+0x57
#3 0x80e6b15f at trap+0x4bf
#4 0x80e4af97 at calltrap+0x8
#5 0x80b6c41b at ip_output+0x16b
#6 0x80b68e82 at icmp_reflect+0x5b2
#7 0x80b6883f at icmp_error+0x46f
#8 0x80beeb12 at udp_input+0x982
#9 0x80b69d1d at ip_input+0x17d
#10 0x80b08ba1 at netisr_dispatch_src+0x81
#11 0x80afecce at ether_demux+0x15e
#12 0x80affa14 at ether_nh_input+0x344
#13 0x80b08ba1 at netisr_dispatch_src+0x81
#14 0x80afefcf at ether_input+0x4f
#15 0x8089a5c3 at vtnet_rxq_eof+0x823
#16 0x8089b2ce at vtnet_rx_vq_intr+0x4e
#17 0x809e9ba6 at intr_event_execute_handlers+0x96


Fatal trap 12: page fault while in kernel mode
cpuid = 6; apic id = 06
fault virtual address   = 0x5bd
fault code  = supervisor read data, page not present
instruction pointer = 0x20:0x80b5de9e
stack pointer   = 0x28:0xfe02b8d483e0
frame pointer   = 0x28:0xfe02b8d48410
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 12 (irq265: virtio_pci0)
[ thread pid 12 tid 100040 ]
Stopped at  ip_fillid+0x8e: movzbl  (%rax,%rcx,1),%esi
=== End Log ===

Thanks,

-- 
Shawn Webb
HardenedBSD

GPG Key ID:  0x6A84658F52456EEE
GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE




Re: kernel panic by enabling net.inet.ip.random_id

2016-01-05 Thread Adrian Chadd
Looks like a null pointer dereference. What does kgdb show at that IP?


-a


On 5 January 2016 at 17:57, Shawn Webb  wrote:
> Hey All,
>
> Here's a kernel panic I'm experiencing by enabling net.inet.ip.random_id
> at boot.
>
> I'm on latest HEAD on amd64 in bhyve. I'll soon-ish be testing on native
> hardware with VIMAGE enabled.
>
> === Begin Log ===
> Kernel page fault with the following non-sleepable locks held:
> exclusive sleep mutex ip_id_mtx (ip_id_mtx) r = 0 (0x81c54830) locked 
> @ /usr/src/sys/netinet/ip_id.c:227
> stack backtrace:
> #0 0x80a79620 at witness_debugger+0x70
> #1 0x80a7a937 at witness_warn+0x3d7
> #2 0x80e6b887 at trap_pfault+0x57
> #3 0x80e6b15f at trap+0x4bf
> #4 0x80e4af97 at calltrap+0x8
> #5 0x80b6c41b at ip_output+0x16b
> #6 0x80b68e82 at icmp_reflect+0x5b2
> #7 0x80b6883f at icmp_error+0x46f
> #8 0x80beeb12 at udp_input+0x982
> #9 0x80b69d1d at ip_input+0x17d
> #10 0x80b08ba1 at netisr_dispatch_src+0x81
> #11 0x80afecce at ether_demux+0x15e
> #12 0x80affa14 at ether_nh_input+0x344
> #13 0x80b08ba1 at netisr_dispatch_src+0x81
> #14 0x80afefcf at ether_input+0x4f
> #15 0x8089a5c3 at vtnet_rxq_eof+0x823
> #16 0x8089b2ce at vtnet_rx_vq_intr+0x4e
> #17 0x809e9ba6 at intr_event_execute_handlers+0x96
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 6; apic id = 06
> fault virtual address   = 0x5bd
> fault code  = supervisor read data, page not present
> instruction pointer = 0x20:0x80b5de9e
> stack pointer   = 0x28:0xfe02b8d483e0
> frame pointer   = 0x28:0xfe02b8d48410
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags= interrupt enabled, resume, IOPL = 0
> current process = 12 (irq265: virtio_pci0)
> [ thread pid 12 tid 100040 ]
> Stopped at  ip_fillid+0x8e: movzbl  (%rax,%rcx,1),%esi
> === End Log ===
>
> Thanks,
>
> --
> Shawn Webb
> HardenedBSD
>
> GPG Key ID:  0x6A84658F52456EEE
> GPG Key Fingerprint: 2ABA B6BD EF6A F486 BE89  3D9E 6A84 658F 5245 6EEE


Re: kernel panic by enabling net.inet.ip.random_id

2016-01-05 Thread Shawn Webb
Thanks for the quick reply! Here's some more debugging output:

=== Begin Log ===
(kgdb) bt
#0  doadump (textdump=0) at pcpu.h:221
#1  0x8037c78b in db_dump (dummy=<value optimized out>, dummy2=false, 
dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
#2  0x8037c57e in db_command (cmd_table=0x0) at 
/usr/src/sys/ddb/db_command.c:440
#3  0x8037c314 in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:493
#4  0x8037edab in db_trap (type=<value optimized out>, code=0) at 
/usr/src/sys/ddb/db_main.c:251
#5  0x80a5c563 in kdb_trap (type=12, code=0, tf=<value optimized out>) 
at /usr/src/sys/kern/subr_kdb.c:654
#6  0x80e6b7e1 in trap_fatal (frame=0xfe02c33894d0, 
eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:829
#7  0x80e6ba2d in trap_pfault (frame=0xfe02c33894d0, 
usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:684
#8  0x80e6b15f in trap (frame=0xfe02c33894d0) at 
/usr/src/sys/amd64/amd64/trap.c:435
#9  0x80e4af97 in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:234
#10 0x80b5de9e in ip_fillid (ip=0xf8000ef8cb88) at 
/usr/src/sys/netinet/ip_id.c:237
#11 0x80b6c41b in ip_output (m=<value optimized out>, 
opt=<value optimized out>, ro=<value optimized out>, flags=0, imo=0x0, 
inp=0xf8000e66e960) at /usr/src/sys/netinet/ip_output.c:268
#12 0x80bf0612 in udp_send (so=<value optimized out>, 
flags=<value optimized out>, m=<value optimized out>, addr=0x0, 
control=<value optimized out>, td=0xf8000ef8cb88) at 
/usr/src/sys/netinet/udp_usrreq.c:1517
#13 0x80aa3872 in sosend_dgram (so=0xf8000e6422e8, addr=0x0, 
uio=<value optimized out>, top=0xf8000ef8cb00, control=0x0, 
flags=<value optimized out>, td=0x81bef2ec) at 
/usr/src/sys/kern/uipc_socket.c:1164
#14 0x80aaa03b in kern_sendit (td=0xf8000e4cd9c0, s=6, 
mp=<value optimized out>, flags=0, control=0x0, segflg=UIO_USERSPACE) at 
/usr/src/sys/kern/uipc_syscalls.c:906
#15 0x80aaa336 in sendit (td=0xf8000e4cd9c0, s=<value optimized out>, 
mp=0xfe02c3389970, flags=3980) at 
/usr/src/sys/kern/uipc_syscalls.c:833
#16 0x80aaa1fd in sys_sendto (td=0x0, uap=<value optimized out>) at 
/usr/src/sys/kern/uipc_syscalls.c:957
#17 0x80e6bfdb in amd64_syscall (td=0xf8000e4cd9c0, traced=0) at 
subr_syscall.c:135
#18 0x80e4b27b in Xfast_syscall () at 
/usr/src/sys/amd64/amd64/exception.S:394
#19 0x03e339782e8a in ?? ()
(kgdb) x/i 0x80b5de9e
0x80b5de9e <ip_fillid+142>:     movzbl (%rax,%rcx,1),%esi
(kgdb) info reg
rax            0x0      0
rbx            0x0      0
rcx            0x0      0
rdx            0x0      0
rsi            0x0      0
rdi            0x0      0
rbp            0xfe02c3388fe0   0xfe02c3388fe0
rsp            0xfe02c3388fc8   0xfe02c3388fc8
r8             0x0      0
r9             0x0      0
r10            0x0      0
r11            0x0      0
r12            0x817c0b80       -2122577024
r13            0x817c1470       -2122574736
r14            0x1      1
r15            0x4      4
rip            0x80a1fae3       0x80a1fae3
eflags         0x0      0
cs             0x0      0
ss             0x0      0
ds             0x0      0
es             0x0      0
fs             0x0      0
gs             0x0      0
=== End Log ===

Thanks,

Shawn

On Tue, Jan 05, 2016 at 06:06:41PM -0800, Adrian Chadd wrote:
> Looks like a null pointer dereference. What does kgdb show at that IP?
> 
> 
> -a
> 
> 
> On 5 January 2016 at 17:57, Shawn Webb  wrote:
> > Hey All,
> >
> > Here's a kernel panic I'm experiencing by enabling net.inet.ip.random_id
> > at boot.
> >
> > I'm on latest HEAD on amd64 in bhyve. I'll soon-ish be testing on native
> > hardware with VIMAGE enabled.
> >
> > === Begin Log ===
> > Kernel page fault with the following non-sleepable locks held:
> > exclusive sleep mutex ip_id_mtx (ip_id_mtx) r = 0 (0x81c54830) 
> > locked @ /usr/src/sys/netinet/ip_id.c:227
> > stack backtrace:
> > #0 0x80a79620 at witness_debugger+0x70
> > #1 0x80a7a937 at witness_warn+0x3d7
> > #2 0x80e6b887 at trap_pfault+0x57
> > #3 0x80e6b15f at trap+0x4bf
> > #4 0x80e4af97 at calltrap+0x8
> > #5 0x80b6c41b at ip_output+0x16b
> > #6 0x80b68e82 at icmp_reflect+0x5b2
> > #7 0x80b6883f at icmp_error+0x46f
> > #8 0x80beeb12 at udp_input+0x982
> > #9 0x80b69d1d at ip_input+0x17d
> > #10 0x80b08ba1 at netisr_dispatch_src+0x81
> > #11 0x80afecce at ether_demux+0x15e
> > #12 0x80affa14 at ether_nh_input+0x344
> > #13 0x80b08ba1 at netisr_dispatch_src+0x81
> > #14 0x80afefcf at ether_input+0x4f
> > #15 0x8089a5c3 at vtnet_rxq_eof+0x823
> > #16 0x8089b2ce at vtnet_rx_vq_intr+0x4e
> > #17 0x809e9ba6 at intr_event_execute_handlers+0x96
> >
> >
> > Fatal trap 12: page fault while in kernel mode
> > cpuid = 6; apic id = 06
> > fault virtual address   = 0x5bd
> > fault code  

Re: kernel panic by enabling net.inet.ip.random_id

2016-01-05 Thread Adrian Chadd
Try list *(0x[address]) in kgdb.

That line is mtx_unlock(), which makes no sense, since mtx_lock() succeeded fine.
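
For anyone following the disassembly: the instruction kgdb shows at that address, movzbl (%rax,%rcx,1),%esi, loads one byte from base + index * scale. The sketch below is only an illustration of that address arithmetic, not an analysis of the real crash. The function name effective_address and the register values are assumptions; the thread does not confirm what rax and rcx held at the time of the fault. It shows that a NULL base combined with an index of 0x5bd would produce the fault virtual address 0x5bd reported in the panic message.

```python
MASK_64 = 0xFFFFFFFFFFFFFFFF


def effective_address(base, index, scale=1, disp=0):
    """Return the x86-64 effective address base + index*scale + disp.

    This mirrors the AT&T operand form disp(base,index,scale) and wraps
    the result to 64 bits, as the hardware would.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 scale must be 1, 2, 4 or 8")
    return (base + index * scale + disp) & MASK_64


# "fault virtual address" taken from the panic message in this thread.
FAULT_ADDRESS = 0x5BD

# Hypothetical register contents (assumption, not from the dump):
# rax = 0 models a NULL base pointer, rcx is the byte index.
rax = 0x0
rcx = 0x5BD

# Operand (%rax,%rcx,1): base=rax, index=rcx, scale=1, no displacement.
addr = effective_address(rax, rcx, scale=1)
print(hex(addr))
```

If the base really was NULL, the small nonzero fault address is just the index, which would fit the "null pointer dereference" reading above.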


-a


On 5 January 2016 at 18:13, Shawn Webb wrote:
> Thanks for the quick reply! Here's some more debugging output:
>
> === Begin Log ===
> (kgdb) bt
> #0  doadump (textdump=0) at pcpu.h:221
> #1  0x8037c78b in db_dump (dummy=<value optimized out>, dummy2=false, 
> dummy3=0, dummy4=0x0) at /usr/src/sys/ddb/db_command.c:533
> #2  0x8037c57e in db_command (cmd_table=0x0) at 
> /usr/src/sys/ddb/db_command.c:440
> #3  0x8037c314 in db_command_loop () at 
> /usr/src/sys/ddb/db_command.c:493
> #4  0x8037edab in db_trap (type=<value optimized out>, code=0) at 
> /usr/src/sys/ddb/db_main.c:251
> #5  0x80a5c563 in kdb_trap (type=12, code=0, tf=<value optimized out>) at /usr/src/sys/kern/subr_kdb.c:654
> #6  0x80e6b7e1 in trap_fatal (frame=0xfe02c33894d0, eva=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:829
> #7  0x80e6ba2d in trap_pfault (frame=0xfe02c33894d0, 
> usermode=<value optimized out>) at /usr/src/sys/amd64/amd64/trap.c:684
> #8  0x80e6b15f in trap (frame=0xfe02c33894d0) at 
> /usr/src/sys/amd64/amd64/trap.c:435
> #9  0x80e4af97 in calltrap () at 
> /usr/src/sys/amd64/amd64/exception.S:234
> #10 0x80b5de9e in ip_fillid (ip=0xf8000ef8cb88) at 
> /usr/src/sys/netinet/ip_id.c:237
> #11 0x80b6c41b in ip_output (m=<value optimized out>, opt=<value optimized out>, ro=<value optimized out>, flags=0, imo=0x0, 
> inp=0xf8000e66e960) at /usr/src/sys/netinet/ip_output.c:268
> #12 0x80bf0612 in udp_send (so=<value optimized out>, flags=<value optimized out>, m=<value optimized out>, addr=0x0, control=<value optimized out>, td=0xf8000ef8cb88) at /usr/src/sys/netinet/udp_usrreq.c:1517
> #13 0x80aa3872 in sosend_dgram (so=0xf8000e6422e8, addr=0x0, 
> uio=<value optimized out>, top=0xf8000ef8cb00, control=0x0, flags=<value optimized out>, td=0x81bef2ec) at /usr/src/sys/kern/uipc_socket.c:1164
> #14 0x80aaa03b in kern_sendit (td=0xf8000e4cd9c0, s=6, mp=<value optimized out>, flags=0, control=0x0, segflg=UIO_USERSPACE) at 
> /usr/src/sys/kern/uipc_syscalls.c:906
> #15 0x80aaa336 in sendit (td=0xf8000e4cd9c0, s=<value optimized out>, mp=0xfe02c3389970, flags=3980) at 
> /usr/src/sys/kern/uipc_syscalls.c:833
> #16 0x80aaa1fd in sys_sendto (td=0x0, uap=<value optimized out>) at 
> /usr/src/sys/kern/uipc_syscalls.c:957
> #17 0x80e6bfdb in amd64_syscall (td=0xf8000e4cd9c0, traced=0) at 
> subr_syscall.c:135
> #18 0x80e4b27b in Xfast_syscall () at 
> /usr/src/sys/amd64/amd64/exception.S:394
> #19 0x03e339782e8a in ?? ()
> (kgdb) x/i 0x80b5de9e
> 0x80b5de9e : movzbl (%rax,%rcx,1),%esi
> (kgdb) info reg
> rax            0x0                 0
> rbx            0x0                 0
> rcx            0x0                 0
> rdx            0x0                 0
> rsi            0x0                 0
> rdi            0x0                 0
> rbp            0xfe02c3388fe0      0xfe02c3388fe0
> rsp            0xfe02c3388fc8      0xfe02c3388fc8
> r8             0x0                 0
> r9             0x0                 0
> r10            0x0                 0
> r11            0x0                 0
> r12            0x817c0b80          -2122577024
> r13            0x817c1470          -2122574736
> r14            0x1                 1
> r15            0x4                 4
> rip            0x80a1fae3          0x80a1fae3
> eflags         0x0                 0
> cs             0x0                 0
> ss             0x0                 0
> ds             0x0                 0
> es             0x0                 0
> fs             0x0                 0
> gs             0x0                 0
> === End Log ===
>
> Thanks,
>
> Shawn
>
> On Tue, Jan 05, 2016 at 06:06:41PM -0800, Adrian Chadd wrote:
>> Looks like a null pointer dereference. What does kgdb show at that IP?
>>
>>
>> -a
>>
>>
>> On 5 January 2016 at 17:57, Shawn Webb wrote:
>> > Hey All,
>> >
>> > Here's a kernel panic I'm experiencing by enabling net.inet.ip.random_id
>> > at boot.
>> >
>> > I'm on latest HEAD on amd64 in bhyve. I'll soon-ish be testing on native
>> > hardware with VIMAGE enabled.
>> >
>> > === Begin Log ===
>> > Kernel page fault with the following non-sleepable locks held:
>> > exclusive sleep mutex ip_id_mtx (ip_id_mtx) r = 0 (0x81c54830) 
>> > locked @ /usr/src/sys/netinet/ip_id.c:227
>> > stack backtrace:
>> > #0 0x80a79620 at witness_debugger+0x70
>> > #1 0x80a7a937 at witness_warn+0x3d7
>> > #2 0x80e6b887 at trap_pfault+0x57
>> > #3 0x80e6b15f at trap+0x4bf
>> > #4 0x80e4af97 at calltrap+0x8
>> > #5 0x80b6c41b at ip_output+0x16b
>> > #6 0x80b68e82 at icmp_reflect+0x5b2
>> > #7 0x80b6883f at icmp_error+0x46f
>> > #8 0x80beeb12 at udp_input+0x982
>> > #9 0x80b69d1d at ip_input+0x17d
>> > #10 0x80b08ba1 at netisr_dispatch_src+0x81
>> > #11 0x80afecce at ether_demux+0x15e