Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-06 Thread Adam Tygart
Would it be beneficial for anyone to have an archive copy of an osd that took more than 4 days to export? All but an hour of that time was spent exporting 1 pg (that ended up being 197MB). I can even send along the extracted pg for analysis... -- Adam On Fri, Jun 3, 2016 at 2:39 PM, Adam Tygart

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-03 Thread Adam Tygart
With regards to this export/import process, I've been exporting a pg from an osd for more than 24 hours now. The entire OSD only has 8.6GB of data. 3GB of that is in omap. The export for this particular PG is only 108MB in size right now, after more than 24 hours. How is it possible that a fragment
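
For reference, a rough way to see how much of a filestore OSD's footprint is omap (leveldb) versus object data, assuming the default /var/lib/ceph/osd layout (the paths here are illustrative, not taken from the thread):

    # total on-disk data for the OSD
    du -sh /var/lib/ceph/osd/ceph-16/current
    # the leveldb omap store alone
    du -sh /var/lib/ceph/osd/ceph-16/current/omap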

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-03 Thread Brandon Morris, PMP
Nice catch. That was a copy-paste error. Sorry, it should have read: 3. Flush the journal and export the primary version of the PG. This took 1 minute on a well-behaved PG and 4 hours on the misbehaving PG, i.e. ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 --journal-path /va
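
The quoted command is cut off above; a full invocation would look roughly like the sketch below, where the journal path, pgid and output file are assumptions rather than values from the original message:

    # flush the journal, then export the PG to a file
    ceph-osd -i 16 --flush-journal
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --pgid 20.35 --op export --file /root/pg.20.35.export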

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-03 Thread Adam Tygart
Is there any way we could have a "leveldb_defrag_on_mount" option for the osds similar to the "leveldb_compact_on_mount" option? Also, I've got at least one user that is creating and deleting thousands of files at a time in some of their directories (keeping 1-2% of them). Could that cause this fr
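
For context, leveldb_compact_on_mount is an existing option, while a defrag-on-mount counterpart is only being proposed here. A minimal sketch of enabling compaction for one problem OSD (scoping it to osd.16 is an assumption for illustration):

    # ceph.conf -- compact the omap leveldb when this OSD starts
    [osd.16]
        leveldb_compact_on_mount = true

Restarting the daemon (systemctl restart ceph-osd@16 on CentOS 7) then triggers the compaction at mount time.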

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-02 Thread Adam Tygart
I'm still exporting pgs out of some of the downed osds, but things are definitely looking promising. Marginally related to this thread, as these seem to be most of the hanging objects when exporting pgs, what are inodes in the 600 range used for within the metadata pool? I know the 200 range is us
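
One way to see what actually lives in those name ranges is to list the metadata pool's objects directly; the pool name and patterns below are illustrative only:

    # metadata-pool objects whose names start with 6xx
    rados -p cephfs_metadata ls | grep '^6' | head
    # the MDS journal objects, by comparison, sit in the 200.* range
    rados -p cephfs_metadata ls | grep '^200\.' | head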

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-02 Thread Brad Hubbard
On Thu, Jun 2, 2016 at 9:07 AM, Brandon Morris, PMP wrote: > The only way that I was able to get back to Health_OK was to export/import. > * Please note, any time you use the ceph_objectstore_tool you risk data > loss if not done carefully. Never remove a PG until you have a known good
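
As a sketch of the kind of sanity check that warning implies before removing anything (the pgid, paths and file name below are hypothetical):

    # confirm the PG is present on this OSD
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --op list-pgs | grep '^20\.35$'
    # confirm the export file exists and is a plausible size
    ls -lh /root/pg.20.35.export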

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-02 Thread Samuel Just
I suspect the problem is that ReplicatedBackend::build_push_op assumes that osd_recovery_max_chunk (defaults to 8MB) of omap entries is about the same amount of work to get as 8MB of normal object data. The fix would be to add another config osd_recovery_max_omap_entries_per_chunk with a sane defa
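
Until such an option exists, a workaround in the same spirit would be to shrink the recovery chunk size so each push reads fewer omap entries; the value below is an arbitrary example, and osd_recovery_max_omap_entries_per_chunk itself is only proposed at this point:

    # drop the per-chunk recovery size from the 8 MB default to 1 MB
    ceph tell osd.* injectargs '--osd-recovery-max-chunk 1048576'

Whether that alone keeps the op threads under the heartbeat/suicide timeout is not established in this thread.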

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-02 Thread Gregory Farnum
On Thu, Jun 2, 2016 at 9:49 AM, Adam Tygart wrote: > Okay, > > Exporting, removing and importing the pgs seems to be working > (slowly). The question now becomes, why does an export/import work? > That would make me think there is a bug in there somewhere in the pg > loading code. Or does it have

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-02 Thread Adam Tygart
Okay, Exporting, removing and importing the pgs seems to be working (slowly). The question now becomes, why does an export/import work? That would make me think there is a bug in there somewhere in the pg loading code. Or does it have to do with re-creating the leveldb databases? The same number

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
I concur with Greg. The only way that I was able to get back to Health_OK was to export/import. * Please note, any time you use the ceph_objectstore_tool you risk data loss if not done carefully. Never remove a PG until you have a known good export * Here are the steps I used: 1. set
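
The numbered steps are truncated above; as a rough sketch of the overall export/remove/import cycle with ceph-objectstore-tool (the flags, OSD id, pgid and paths are illustrative, not the exact steps from this message):

    ceph osd set norecover; ceph osd set nobackfill    # quiesce recovery first
    systemctl stop ceph-osd@16                         # stop the OSD holding the PG
    ceph-osd -i 16 --flush-journal
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --pgid 20.35 --op export --file /root/pg.20.35.export
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --pgid 20.35 --op remove
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-16 \
        --journal-path /var/lib/ceph/osd/ceph-16/journal \
        --op import --file /root/pg.20.35.export
    systemctl start ceph-osd@16
    ceph osd unset norecover; ceph osd unset nobackfill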

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Gregory Farnum
On Wed, Jun 1, 2016 at 2:47 PM, Adam Tygart wrote: > I tried to compact the leveldb on osd 16 and the osd is still hitting > the suicide timeout. I know I've got some users with more than 1 > million files in single directories. > > Now that I'm in this situation, can I get some pointers on how ca

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Mike Lovell
On Wed, Jun 1, 2016 at 9:13 AM, Adam Tygart wrote: > Hello all, > > I'm running into an issue with ceph osds crashing over the last 4 > days. I'm running Jewel (10.2.1) on CentOS 7.2.1511. > > A little setup information: > 26 hosts > 2x 400GB Intel DC P3700 SSDs > 12x6TB spinning disks > 4x4TB spi

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
I tried to compact the leveldb on osd 16 and the osd is still hitting the suicide timeout. I know I've got some users with more than 1 million files in single directories. Now that I'm in this situation, can I get some pointers on how I can use either of your options? Thanks, Adam On Wed, Jun 1,

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Gregory Farnum
If that pool is your metadata pool, it looks at a quick glance like it's timing out somewhere while reading and building up the omap contents (ie, the contents of a directory). Which might make sense if, say, you have very fragmented leveldb stores combined with very large CephFS directories. Tryin
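
One way to gauge how large a single directory's omap has grown is to count its keys directly; the metadata pool name and object name below are purely illustrative:

    # count omap keys (directory entries) on one directory object
    rados -p cephfs_metadata listomapkeys 10000000000.00000000 | wc -l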

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
I've been attempting to work through this, finding the pgs that are causing hangs, determining if they are "safe" to remove, and removing them with ceph-objectstore-tool on osd 16. I'm now getting hangs (followed by suicide timeouts) referencing pgs that I've just removed, so this doesn't seem to

Re: [ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Brandon Morris, PMP
Adam, We ran into similar issues when we got too many objects in a bucket (around 300 million). The .rgw.buckets.index pool became unable to complete backfill operations. The only way we were able to get past it was to export the offending placement group with the ceph-objectstore-tool and
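
For reference, narrowing down which placement groups of a pool are stuck can be done from the monitor side before reaching for ceph-objectstore-tool; the pool name is taken from the message, the rest is generic:

    # PGs that have been stuck unclean for a while
    ceph pg dump_stuck unclean
    # PGs of the index pool and their current states
    ceph pg ls-by-pool .rgw.buckets.index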

[ceph-users] Crashing OSDs (suicide timeout, following a single pool)

2016-06-01 Thread Adam Tygart
Hello all, I'm running into an issue with ceph osds crashing over the last 4 days. I'm running Jewel (10.2.1) on CentOS 7.2.1511. A little setup information: 26 hosts 2x 400GB Intel DC P3700 SSDs 12x6TB spinning disks 4x4TB spinning disks. The SSDs are used for both journals and as an OSD (for t
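
Since the crashes follow a single pool, one quick confirmation is that a PG id's numeric prefix is its pool id; for example (the pgid below is hypothetical):

    ceph osd lspools       # map pool ids to pool names
    ceph pg 20.35 query    # 20.35 belongs to the pool with id 20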