Thanks again, and btw, besides it being Friday I'm also on vacation - so double
the joy of troubleshooting performance problems :)))

Thx :)


On 8 August 2014 16:01, Dan Van Der Ster <daniel.vanders...@cern.ch> wrote:

>  Hi,
>
>  On 08 Aug 2014, at 15:55, Andrija Panic <andrija.pa...@gmail.com> wrote:
>
>  Hi Dan,
>
>  thank you very much for the script, will check it out... no throttling so
> far, but I guess it will have to be done...
>
>  This seems to read only gzipped logs?
>
>
>  Well it’s pretty simple, and it zcat’s each input file. So yes, only gz
> files in the current script. But you can change that pretty trivially ;)
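>
>  For example (just a sketch of that trivial change): swapping zcat for
> "zcat -f" lets it read plain-text logs too, since -f passes uncompressed
> input through unchanged:
>
>      # reads .log and .log.gz alike (default OSD log path assumed)
>      zcat -f /var/log/ceph/ceph-osd.0.log* | wc -l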
>
>  so since it's read-only, I guess it is safe to run it on the production
> cluster now…?
>
>
>  I personally don’t do anything new on a Friday just before leaving ;)
>
>  But it's just grepping the log files, so start with one, then two, then...
>
>   The script will also handle multiple OSDs as far as I can
> understand, not just osd.0 as given in the script comment?
>
>
>  Yup, what I do is gather all of the OSD logs for a single day in a
> single directory (in CephFS ;), then run that script on all of the OSD
> logs. It takes a while, but it will give you the overall daily totals
> for the whole cluster.
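>
>  A sketch of that (hostnames, paths and rotated-log names below are just
> placeholders, adjust to your logrotate setup):
>
>      # collect one day's OSD logs from each host into a CephFS directory
>      mkdir -p /cephfs/osd-logs/2014-08-07
>      scp 'osd-host1:/var/log/ceph/ceph-osd.*.log*.gz' /cephfs/osd-logs/2014-08-07/
>      # ...repeat (or loop) over the other OSD hosts...
>
>      # then summarize them all in one run
>      ./rbd-io-stats.pl /cephfs/osd-logs/2014-08-07/ceph-osd.*.gz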
>
>  If you are only trying to find the top users, then it is sufficient to
> check a subset of OSDs, since by their nature the client IOs are spread
> across most/all OSDs.
>
>  Cheers, Dan
>
>  Thanks a lot.
> Andrija
>
>
>
>
> On 8 August 2014 15:44, Dan Van Der Ster <daniel.vanders...@cern.ch>
> wrote:
>
>> Hi,
>> Here’s what we do to identify our top RBD users.
>>
>>  First, enable log level 10 for the filestore so you can see all the IOs
>> coming from the VMs. Then use a script like this (used on a dumpling
>> cluster):
>>
>>
>> https://github.com/cernceph/ceph-scripts/blob/master/tools/rbd-io-stats.pl
>>
>>  to summarize the osd logs and identify the top clients.
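>>
>>  One way to flip that on for a running cluster, just as an example
>> (remember to drop it back afterwards, level 10 makes the OSD logs grow
>> quickly):
>>
>>      ceph tell osd.* injectargs '--debug-filestore 10'
>>      # ...let it log for a while, then restore your usual level, e.g.
>>      ceph tell osd.* injectargs '--debug-filestore 1'
>>
>>  or set "debug filestore = 10" under [osd] in ceph.conf if you want it
>> to persist across restarts.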
>>
>>  Then it's just a matter of scripting to figure out the ops/sec per
>> volume, but for us at least the main use-case has been to identify who is
>> responsible for a new peak in overall ops — and daily-granular statistics
>> from the above script tend to suffice.
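>>
>>  As a sketch only (check the script's actual output format first; the
>> "<ops> <volume>" layout below is just an assumption), a daily total
>> divided by 86400 gives the rough average rate per volume:
>>
>>      # assumes lines like "<ops> <volume>" in the summary output
>>      ./rbd-io-stats.pl ceph-osd.*.log.gz | sort -rn | head -20 | \
>>          awk '{printf "%-40s %8.2f ops/s\n", $2, $1/86400}'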
>>
>>  BTW, do you throttle your clients? We found that it's absolutely
>> necessary, since without a throttle just a few active VMs can eat up the
>> entire iops capacity of the cluster.
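>>
>>  For what it's worth, since you're on CloudStack/libvirt, one way to cap
>> a running guest (the device name and the 200 limit below are just
>> placeholders) is QEMU's block I/O throttling via libvirt:
>>
>>      virsh blkdeviotune <guest> vda --total-iops-sec 200 --live
>>
>>  or the equivalent <iotune> settings in the disk XML; newer CloudStack
>> versions can also carry such limits in the disk offering.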
>>
>>  Cheers, Dan
>>
>> -- Dan van der Ster || Data & Storage Services || CERN IT Department --
>>
>>
>>   On 08 Aug 2014, at 13:51, Andrija Panic <andrija.pa...@gmail.com>
>> wrote:
>>
>>    Hi,
>>
>>  we just got some new clients, and have suffered a very big degradation in
>> CEPH performance for some reason (we are using CloudStack).
>>
>>  I'm wondering if there is a way to monitor op/s or similar usage per
>> connected client, so we can isolate the heavy client?
>>
>>  Also, what is the general best practice for monitoring this kind of
>> change in CEPH? I'm talking about R/W or op/s changes or similar...
>>
>>  Thanks,
>> --
>>
>> Andrija Panić
>>
>>
>>
>
>
>  --
>
> Andrija Panić
> --------------------------------------
>   http://admintweets.com
> --------------------------------------
>
>
>


-- 

Andrija Panić
--------------------------------------
  http://admintweets.com
--------------------------------------
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
