[ceph-users] Unable to create new pool in cluster

2015-07-25 Thread Daleep Bais
Hi All,

I am unable to create a new pool in my cluster. I have some existing pools.

I get this error:

ceph osd pool create fullpool 128 128
Error EINVAL: crushtool: exec failed: (2) No such file or directory


Existing pools are:

cluster# ceph osd lspools
0 rbd,1 data,3 pspl,
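The "crushtool: exec failed: (2) No such file or directory" error usually means the monitor could not exec the crushtool binary while validating the new pool's CRUSH rule. A quick sanity check, as a sketch (package names vary by distro, and the `crushtool` ceph.conf option may not exist on older releases):

```shell
# Look for crushtool on the mon host; pool creation shells out to it.
CRUSHTOOL=$(command -v crushtool || true)
if [ -n "$CRUSHTOOL" ]; then
    echo "crushtool found at $CRUSHTOOL"
else
    echo "crushtool not found in PATH; install the package shipping it" \
         "(often ceph or ceph-base), or point the mon at it via the" \
         "'crushtool' option in ceph.conf if your release supports it"
fi
```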

Please suggest..

Thanks..
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-mon cpu usage

2015-07-25 Thread Luis Periquito
I think I figured it out! All 4 OSDs on one host (OSDs 107-110) were
sending massive numbers of auth requests to the monitors, seemingly
overwhelming them.

The weird bit is that I removed them (osd crush remove, auth del, osd rm),
dd'd the box and all of the disks, reinstalled, and guess what? They are
still making a lot of requests to the MONs... this will require some
further investigation.

As this is happening during my holidays, I just disabled them, and will
investigate further when I get back.
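The removal sequence Luis lists, spelled out as a dry-run sketch (the OSD id is hypothetical; commands are printed rather than executed so they can be reviewed first):

```shell
ID=107                    # hypothetical id of one of the misbehaving OSDs
run() { echo "+ $*"; }    # dry-run; replace the echo with "$@" on a live cluster

run ceph osd crush remove osd.$ID   # drop it from the CRUSH map
run ceph auth del osd.$ID           # delete its cephx key
run ceph osd rm $ID                 # remove it from the OSD map
```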


On Fri, Jul 24, 2015 at 11:11 PM, Kjetil Jørgensen kje...@medallia.com
wrote:

 It sounds slightly similar to what I just experienced.

 I had one monitor out of three which seemed to essentially run one core
 at full tilt continuously, and had its virtual address space grown to the
 point where top started quoting it in terabytes. Requests hitting this
 monitor did not get very timely responses (although I don't know whether
 this was happening consistently or arbitrarily).

 I ended up re-building the monitor from the two healthy ones I had, which
 made the problem go away for me.
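A sketch of the rebuild Kjetil describes, under stated assumptions: the sick monitor's id ("c") and the keyring path are placeholders, and the commands are printed rather than executed.

```shell
MON=c                     # hypothetical id of the sick monitor
run() { echo "+ $*"; }    # dry-run; swap the echo for "$@" on a real cluster

run ceph mon remove $MON             # drop it from the monmap (2 healthy mons keep quorum)
run ceph mon getmap -o /tmp/monmap   # fetch the current monmap from the survivors
# re-create the mon's store; keyring path is a placeholder
run ceph-mon -i $MON --mkfs --monmap /tmp/monmap --keyring /path/to/mon.keyring
run ceph-mon -i $MON                 # start it; the store syncs from the peers
```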

 After-the-fact inspection of the monitor I ripped out clocked its store
 in at 1.3 GB, compared to the 250 MB of the other two; after the rebuild
 they're all comparable in size.

 In my case this started out on firefly and persisted after upgrading to
 hammer, which prompted the rebuild, since I suspected it was related to
 something persistent on this monitor.

 I do not have that much more useful to contribute to this discussion,
 since I've more-or-less destroyed any evidence by re-building the monitor.

 Cheers,
 KJ

 On Fri, Jul 24, 2015 at 1:55 PM, Luis Periquito periqu...@gmail.com
 wrote:

 The leveldb is smallish: around 70 MB.

 I ran with debug mon = 10 for a while, but couldn't find any interesting
 information. I would run out of space quite quickly though, as the log
 partition only has 10 GB.
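For reference, the logging level Luis describes can be set persistently with a ceph.conf fragment like this (a sketch; at runtime, `ceph tell mon.* injectargs '--debug-mon 10'` achieves the same without a restart):

```
[mon]
    debug mon = 10
```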
 On 24 Jul 2015 21:13, Mark Nelson mnel...@redhat.com wrote:

 On 07/24/2015 02:31 PM, Luis Periquito wrote:

 Now it's official, I have a weird one!

 Restarted one of the ceph-mons with jemalloc and it didn't make any
 difference. It's still using a lot of cpu and still not freeing up
 memory...

 The issue is that the cluster almost stops responding to requests, and
 if I restart the primary mon (that had almost no memory usage nor cpu)
 the cluster goes back to its merry way responding to requests.

 Does anyone have any idea what may be going on? The worst bit is that I
 have several clusters just like this (well they are smaller), and as we
 do everything with puppet, they should be very similar... and all the
 other clusters are just working fine, without any issues whatsoever...


 We've seen cases where leveldb can't compact fast enough and memory
 balloons, but it's usually associated with extreme CPU usage as well. It
 would be showing up in perf though if that were the case...
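If leveldb growth is the suspect, the mon's store can be compacted on demand, or at every daemon start. A dry-run sketch (the mon id is hypothetical; the command is printed rather than executed):

```shell
run() { echo "+ $*"; }    # dry-run; swap the echo for "$@" on a live cluster

run ceph tell mon.a compact   # on-demand leveldb compaction (mon.a is hypothetical)
# or persistently, in ceph.conf:
#   [mon]
#       mon compact on start = true
```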


  On 24 Jul 2015 10:11, Jan Schermer j...@schermer.cz wrote:

 You don’t (shouldn’t) need to rebuild the binary to use jemalloc. It
 should be possible to do something like

 LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1 ceph-osd …

 The last time we tried it segfaulted after a few minutes, so YMMV
 and be careful.
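Jan's LD_PRELOAD approach as a guarded sketch: the library path is the one from his mail and is distro-specific, the osd id is hypothetical, and the command is printed rather than executed (given his segfault warning, reviewing first seems wise).

```shell
JEMALLOC=/usr/lib/x86_64-linux-gnu/libjemalloc.so.1   # path from Jan's mail; adjust per distro
CMD="LD_PRELOAD=$JEMALLOC ceph-osd -i 0 -f"           # hypothetical osd id
echo "$CMD"   # review, then run it (or put LD_PRELOAD in the daemon's environment)
```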

 Jan

   On 23 Jul 2015, at 18:18, Luis Periquito periqu...@gmail.com wrote:

 Hi Greg,

  I've been looking at the tcmalloc issues, but those seemed to affect
  OSDs, and I do notice them in heavy read workloads (even after the patch
  and after increasing TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728).
  This is affecting the mon process, though.

  Looking at perf top, I'm getting most of the CPU usage in mutex
  lock/unlock:

    5.02%  libpthread-2.19.so  [.] pthread_mutex_unlock
    3.82%  libsoftokn3.so      [.] 0x0001e7cb
    3.46%  libpthread-2.19.so  [.] pthread_mutex_lock

 I could try to use jemalloc, are you aware of any built binaries?
 Can I mix a cluster with different malloc binaries?
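The thread-cache bump Luis mentions is just an environment variable that must be set before the daemon starts; 134217728 bytes is 128 MiB:

```shell
# Must be in the ceph-osd/ceph-mon environment at startup to take effect.
export TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=134217728   # 128 * 1024 * 1024
echo "thread cache set to $TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES bytes"
```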


  On Thu, Jul 23, 2015 at 10:50 AM, Gregory Farnum g...@gregs42.com wrote:

  On Thu, Jul 23, 2015 at 8:39 AM, Luis Periquito periqu...@gmail.com wrote:
   The ceph-mon is already taking a lot of memory, and I ran a heap stats:

   MALLOC:        32391696 (   30.9 MiB) Bytes in use by application
   MALLOC: +   27597135872 (26318.7 MiB) Bytes in page heap freelist
   MALLOC: +      16598552 (   15.8 MiB) Bytes in central cache freelist
   MALLOC: +      14693536 (   14.0 MiB) Bytes in transfer cache freelist
   MALLOC: +      17441592 (   16.6 MiB) Bytes in thread cache freelists
   MALLOC: +     116387992 (  111.0 MiB) Bytes in