Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Michael Rasmussen
On Mon, 19 Sep 2016 13:49:16 -0400 Dan McDonald wrote:
>
> Something corrupted the anon cache entry, making it point to something in
> unused/unallocated/un-something hyperspace. The big question is what. This
> is why I asked about your hardware. It wasn't obvious, but

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Michael Rasmussen
On Mon, 19 Sep 2016 13:33:02 -0400 Dan McDonald wrote:
>
> Generally speaking, this is memory corruption. What caused it I'm not sure,
> however.
>
A bit flip caused by radiation from Cosmos? ;-)

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Michael Rasmussen
On Mon, 19 Sep 2016 13:33:02 -0400 Dan McDonald wrote:
> This process:
>
> R 15776 2671 10 10 0 0x4a004000 ff075dadd0a0 perl
> T 0xff072006ec60
>
>
Only perl running here:
# ps -ef |grep perl
root 2684 2674 3 Sep 17 ?

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Dan McDonald
> On Sep 19, 2016, at 11:51 AM, Michael Rasmussen wrote:
>
> On Mon, 19 Sep 2016 11:35:33 -0400
> Dan McDonald wrote:
>
>>
>> I see you have a vmdump.2 present. Can you make that downloadable
>> somewhere? I'd like to take a closer look.
>>
>

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Michael Rasmussen
On Mon, 19 Sep 2016 17:51:28 +0200 Michael Rasmussen wrote:
> ftp://ftp.datanom.net/pub/vmdump.2
>
Also via http: http://ftp.datanom.net/vmdump.2

--
Hilsen/Regards
Michael Rasmussen

Get my public GnuPG keys:
michael rasmussen cc

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Michael Rasmussen
On Mon, 19 Sep 2016 11:35:33 -0400 Dan McDonald wrote:
>
> I see you have a vmdump.2 present. Can you make that downloadable somewhere?
> I'd like to take a closer look.
>
ftp://ftp.datanom.net/pub/vmdump.2

> The stack trace looks... almost random. It could be a

Re: [OmniOS-discuss] kernel panic

2016-09-19 Thread Dan McDonald
> On Sep 17, 2016, at 6:08 PM, Michael Rasmussen wrote:
>
> And after reboot:
> Sep 17 21:51:07 nas savecore: [ID 833026 auth.error] Decompress the crash dump with
> Sep 17 21:51:07 nas 'savecore -vf /var/crash/unknown/vmdump.2'
> Sep 17 21:51:08 nas fmd: [ID 377184

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-30 Thread Richard Elling
On Mar 26, 2015, at 11:24 PM, wuffers m...@wuffers.net wrote:
So here's what I will attempt to test:
- Create thin vmdk @ 10 TB with vSphere fat client: PASS
- Create lazy zeroed vmdk @ 10 TB with vSphere fat client: PASS
- Create eager zeroed vmdk @ 10 TB with vSphere web client: PASS!

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-30 Thread wuffers
On Mar 30, 2015, at 4:10 PM, Richard Elling richard.ell...@richardelling.com wrote: is compression enabled? -- richard Yes, LZ4. Dedupe off. ___ OmniOS-discuss mailing list OmniOS-discuss@lists.omniti.com

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-30 Thread Richard Elling
On Mar 30, 2015, at 1:16 PM, wuffers m...@wuffers.net wrote: On Mar 30, 2015, at 4:10 PM, Richard Elling richard.ell...@richardelling.com wrote: is compression enabled? -- richard Yes, LZ4. Dedupe off. Ironically, WRITE_SAME is the perfect workload for dedup :-) --

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-26 Thread Dan McDonald
On Mar 26, 2015, at 11:47 AM, wuffers m...@wuffers.net wrote: It looks like I'll have to make do with lazy zeroed or thin provisioned disks of 10TB+ for my Veeam tests, if it doesn't cause another kernel panic. I'm hesitant to create these now during business hours (and I shouldn't be..

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-26 Thread wuffers
On Thu, Mar 26, 2015 at 1:05 PM, Dan McDonald dan...@omniti.com wrote: WRITE_SAME is one of the four VAAI primitives. Nexenta wrote this code for NS, and upstreamed two of them: WRITE_SAME is hardware assisted erase. UNMAP is hardware assisted freeing. Those are in upstream illumos.

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-26 Thread Dan McDonald
Just remember that only WRITE_SAME and UNMAP are on stock illumos. If you want the other two, you either get NexentaStor or you start an effort to upstream them from illumos-nexenta. Dan

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-26 Thread wuffers
On Thu, Mar 26, 2015 at 11:37 AM, Dan McDonald dan...@omniti.com wrote: I mentioned earlier: I know Nexenta's done a LOT of improvements on this in illumos-nexenta. It might be time to upstream some of what they've done. I know it's a moving target (COMSTAR is not a well-written

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread Dan McDonald
On Mar 25, 2015, at 11:51 AM, wuffers m...@wuffers.net wrote:
Going to do this as soon as I can. Solaris docs say to put the following line in /etc/system and reboot:
set kmem_flags=0xf

That's correct.

Can't I just set this dynamically like so (so I can potentially skip 2 reboots)?
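For readers wondering what 0xf actually turns on: the sketch below decodes a kmem_flags value into debugging bit names. The bit names and values (KMF_AUDIT, KMF_DEADBEEF, KMF_REDZONE, KMF_CONTENTS) are my recollection of illumos's sys/kmem_impl.h and are not stated anywhere in this thread, so treat them as an assumption and check the header on your release. The function name decode_kmem_flags is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: decode a kmem_flags value into the debug bits it sets.
# ASSUMPTION: bit values as in illumos sys/kmem_impl.h (not given in the thread).
decode_kmem_flags() {
    flags=$(( $1 ))          # accepts decimal or 0x-prefixed hex
    out=""
    for pair in 1:KMF_AUDIT 2:KMF_DEADBEEF 4:KMF_REDZONE 8:KMF_CONTENTS; do
        bit=${pair%%:*}      # numeric bit value before the colon
        name=${pair#*:}      # flag name after the colon
        if [ $(( flags & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    printf '%s\n' "$out"
}

# The value suggested for /etc/system in this thread.
decode_kmem_flags 0xf
```

If those bit values hold, 0xf enables all four of the basic checks at once, which is presumably why it is the usual value suggested when chasing heap corruption.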

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread wuffers
On Tue, Mar 24, 2015 at 7:44 PM, Dan McDonald dan...@omniti.com wrote: On Mar 24, 2015, at 7:44 PM, wuffers m...@wuffers.net wrote: On r151012 since Nov. And yes, the LUs are exposed via COMSTAR. If it helps to have some kmem_flags set, I can do that and try to reproduce it in the

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread Nate Smith
, 2015 7:45 PM
To: wuffers
Cc: omnios-discuss
Subject: Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

On Mar 24, 2015, at 7:44 PM, wuffers m...@wuffers.net wrote:
On r151012 since Nov. And yes, the LUs are exposed via COMSTAR. If it helps

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread wuffers
On Wed, Mar 25, 2015 at 12:09 PM, Dan McDonald dan...@omniti.com wrote:
Can't I just set this dynamically like so (so I can potentially skip 2 reboots)?
echo kmem_flags/W0xf | mdb -kw

No, because those are read at kmem cache creation time at the system's start.

Ahh, if I RTFM'd the

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread Dan McDonald
On Mar 25, 2015, at 2:17 PM, wuffers m...@wuffers.net wrote: You reproduce this bug by configuring things a specific way, right? I ask because you seem to have been running okay until you fell down this particular panic rabbit hole with a particular set of things, correct? The panic

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread Dan McDonald
On Mar 26, 2015, at 12:58 AM, wuffers m...@wuffers.net wrote:
| genunix:kmem_free+1c8 ()
| stmf_sbd:sbd_handle_write_same_xfer_completion+14d ()
| stmf_sbd:sbd_dbuf_xfer_done+b1 ()
| stmf:stmf_worker_task+376 ()
| unix:thread_start+8 ()
|

Hmmph. The WRITE_SAME code, huh? I know

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-25 Thread wuffers
On Wed, Mar 25, 2015 at 2:21 PM, Dan McDonald dan...@omniti.com wrote: A with-kmem-flags coredump will be very useful. Here we go. I reproduced this with no load on the SAN, just a DC and vcenter server up, then created my 10TB disk in the vSphere fat client. As expected, I got the kernel

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-24 Thread Dan McDonald
Here's a good place to start. It may need to be kicked to the illumos developer's list, but let's see what we can figure out first.

1.) What revision of OmniOS are you running?
2.) I notice a lot of STMF threads. COMSTAR (a.k.a. STMF) is not the most stable piece of software in illumos,

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-24 Thread wuffers
On r151012 since Nov. And yes, the LUs are exposed via COMSTAR. If it helps to have some kmem_flags set, I can do that and try to reproduce it in the same way, and have the dump accessible. On Tue, Mar 24, 2015 at 6:41 PM, Dan McDonald dan...@omniti.com wrote: Here's a good place to start. It

Re: [OmniOS-discuss] kernel panic kernel heap corruption detected when creating zero eager disks

2015-03-24 Thread Dan McDonald
On Mar 24, 2015, at 7:44 PM, wuffers m...@wuffers.net wrote: On r151012 since Nov. And yes, the LUs are exposed via COMSTAR. If it helps to have some kmem_flags set, I can do that and try to reproduce it in the same way, and have the dump accessible. kmem_flags=0xf + the actual coredump

Re: [OmniOS-discuss] Kernel Panic - ZFS on iSCSI target and transferring data.

2014-05-21 Thread Dan McDonald
On May 21, 2014, at 5:53 AM, Svavar Örn Eysteinsson sva...@januar.is wrote:
SNIP!
panicstr = BAD TRAP: type=8 (#df Double fault) rp=ff04e3069f10 addr=0
panicstack = unix:real_mode_stop_cpu_stage2_end+9de3 () | unix:trap+ca5 () |

Re: [OmniOS-discuss] kernel panic

2014-04-16 Thread Kevin Swab
Any thoughts on this one? I can provide some more info if that helps. The system is all desktop-grade hardware, with a Core i3-540 CPU and 8 GB of (non-ECC) RAM. The pool in question is a 3-disk raidz built on Toshiba DT01ACA3 3 TB SATA drives attached to the motherboard SATA ports. The pool was

Re: [OmniOS-discuss] kernel panic

2014-04-16 Thread Dan McDonald
On Apr 16, 2014, at 12:39 PM, Kevin Swab kevin.s...@colostate.edu wrote:
SNIP!
Traversing all blocks to verify checksums ...
assertion failed for thread 0xfd7fff162a40, thread-id 1: c < SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, file ../../../uts/common/fs/zfs/zio.c, line 226
Abort (core
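To make the failed assertion a bit more concrete, here is a rough arithmetic sketch. It assumes the check compares a size-class index c, derived from the buffer size as (size - 1) >> SPA_MINBLOCKSHIFT, against SPA_MAXBLOCKSIZE >> SPA_MINBLOCKSHIFT, and it assumes the constants were 512-byte minimum (shift 9) and 128K maximum (shift 17) blocks on releases of that era. None of these values appear in the thread; they are my assumptions, and size_class_ok is a made-up name.

```shell
#!/bin/sh
# ASSUMED constants for ZFS of that era (not stated in the thread).
SPA_MINBLOCKSHIFT=9     # 512-byte minimum block
SPA_MAXBLOCKSHIFT=17    # 128K maximum block

# Hypothetical helper: succeed if a buffer of the given size maps to a
# size-class index below the limit, i.e. the assertion would hold.
size_class_ok() {
    size=$1
    [ "$size" -gt 0 ] || return 1   # sizes of zero or less are never valid
    c=$(( (size - 1) >> SPA_MINBLOCKSHIFT ))
    limit=$(( (1 << SPA_MAXBLOCKSHIFT) >> SPA_MINBLOCKSHIFT ))
    [ "$c" -lt "$limit" ]
}

size_class_ok 131072 && echo "128K: within range"
size_class_ok 131073 || echo "128K + 1 byte: would trip the assertion"
```

Under those assumptions, any block pointer whose size field decodes to more than 128K fails the check, which fits the idea that a corrupt on-disk block (rather than the code path that happened to read it) is the underlying cause.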

Re: [OmniOS-discuss] kernel panic

2014-04-16 Thread Dan McDonald
Doesn't matter where the panic is from -- it's caused by a corrupt block on the disk. A vmdump.N would be nice. You're running 008, I see, so I can use an 008 box to examine the dump. Dan

Re: [OmniOS-discuss] Kernel panic on using iSCSI target on same host, differences between OmniOS r151006 and r151008 ?

2014-02-10 Thread Franz Schober
Although the creation of the zpool worked on OmniOS r151006y, the zfs send I wanted to test on the pool also failed with a kernel panic on r151006y.

zfs create test/testds
dd if=/dev/zero of=/test/testds/file1 bs=1M count=50
dd if=/dev/zero of=/test/testds/file2 bs=1M count=50
zfs snapshot

Re: [OmniOS-discuss] kernel panic - anon_decref

2013-11-15 Thread Saso Kiselkov
On 11/15/13, 5:39 AM, wuffers wrote: So I'm adding VMware hosts (ESXi 5.5) to my OmniOS ZFS SAN, which are already hosting some volumes for our Windows 2012 Hyper-V infrastructure, running over SRP and Infiniband. In VMware, I had uninstalled the default Mellanox 1.9.7 drivers and installed

Re: [OmniOS-discuss] kernel panic - anon_decref

2013-11-15 Thread wuffers
When it pours, it rains. With r151006y, I had two kernel panics in quick succession while trying to create some zero thick eager disks (4 at the same time) in ESXi. They are now "kernel heap corruption detected" panics instead of anon_decref. Kernel panic 2 (dump info: