On 10/24/2014 17:48, Alexander Pyhalov via illumos-discuss wrote:
Hello.
I was moving an OI Hipster installation from a physical server to a KVM VM.


So I continued my attempts to move the ZFS pool to another host. Funny enough, it has already taken about a week.
I decided to move the ZFS filesystems one by one to find out what was wrong.

While sending some snapshots I got a ZFS I/O error.
I hoped the errors were confined to files that existed only in the old snapshots. As I don't need the history much, I destroyed all snapshots, created a new one and tried to send it. Same effect.
But this time zpool status finally noticed the data errors.

# zpool  status -v  data
  pool: data
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
        entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: scrub repaired 0 in 4h5m with 0 errors on Tue Oct 28 13:58:26 2014
config:

        NAME                                     STATE     READ WRITE CKSUM
        data                                     ONLINE       0     0     7
          c4t6005076802808844B000000000000032d0  ONLINE       0     0    14

errors: Permanent errors have been detected in the following files:

        <0xb04>:<0x4832f>
        <0x1a35>:<0x4832f>
        <0x2045>:<0x4832f>
        <0x1596>:<0x4832f>
        <0x25fd>:<0x4832f>
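For reference, those numeric entries are <dataset object number>:<file object number> pairs, and zdb can translate them back to paths roughly like this (the dataset name below is a guess; substitute your own):

```shell
# zpool status -v prints the IDs in hex; zdb -d lists dataset IDs in decimal.
DSID=$(printf '%d' 0xb04)      # dataset object number -> 2820
OBJID=$(printf '%d' 0x4832f)   # file object number    -> 295727

# Only attempt the lookups where zdb is actually available:
if command -v zdb >/dev/null 2>&1; then
    # Find which dataset in the pool carries that ID...
    zdb -d data | grep -w "ID $DSID"
    # ...then dump the object to see its path (dataset name is hypothetical):
    zdb -ddddd data/zones/build "$OBJID"
fi
```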

I wanted to identify the affected files, so I tarred the whole zone.
tar gave one I/O error:

# gtar -C /zones/build/root/ -cpf /export/test.tar  .
gtar: ./var/samba/locks/winbindd_privileged/pipe: socket ignored
gtar: ./var/postgres/9.3/data_64/base/16385/106833: Read error at byte 931653120, while reading 10240 bytes: I/O error
gtar: ./var/tmp/orbit-alp/linc-2197-0-5347df3ce3cb8: socket ignored
gtar: ./var/tmp/orbit-alp/linc-53fb-0-534682603ad6b: socket ignored
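In the PostgreSQL data directory layout that damaged path is base/<database OID>/<relfilenode>, so it can be mapped back to a concrete table, and the byte offset to a concrete 8 kB page (the database name in the second query is a placeholder; the arithmetic assumes the default 8 kB block size):

```shell
# Which 8 kB page did the read error hit?
BLOCK=$(( 931653120 / 8192 ))
echo "bad block: $BLOCK"

if command -v psql >/dev/null 2>&1; then
    # Which database has OID 16385?
    psql -Atc "SELECT datname FROM pg_database WHERE oid = 16385;"
    # Which relation in that database owns file 106833? ("mydb" is a placeholder.)
    psql -At -d mydb -c "SELECT relname FROM pg_class WHERE relfilenode = 106833;"
fi
```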

I dropped the affected table from the PostgreSQL database (luckily, it was just a test database) and ran VACUUM FULL, so the file was removed.

After that I could zfs send the snapshot...
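For completeness, the send pipeline I mean is of this shape (snapshot, dataset and host names are all placeholders, not the real ones):

```shell
SNAP="data/zones/build@migrate"   # hypothetical dataset/snapshot name

if command -v zfs >/dev/null 2>&1; then
    zfs snapshot "$SNAP"
    # -v reports progress; the stream is piped over ssh to the new KVM host
    # and received unmounted (-u) so it can be inspected first.
    zfs send -v "$SNAP" | ssh newhost zfs receive -u data/zones/build
fi
```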

The question that worries me is: why could this happen? We use IBM Storwize as the backend. It shouldn't lie about data having been written to disk (at least, we haven't found such issues). On the other hand, we have rather frequent power outages (about once a month) long enough for our UPSes to die...

--
Best regards,
Alexander Pyhalov,
system administrator of Southern Federal University IT department

