On 15/06/2015 17:10, Robert LeBlanc wrote:

> John, let me see if I understand what you are saying...
>
> When a person runs `rbd top`, each OSD would receive a message saying
> please capture all the performance data, grouped by RBD image, and
> limit it to 'X'. That way the OSD doesn't have to constantly track
> performance for each object, but starts tracking only when it is
> requested?

Right, initially the OSD isn't collecting anything; it starts collecting as soon as it sees a query get loaded (published via the OSDMap or some other mechanism).

That said, in practice I can see people having some set of queries that they always keep loaded, feeding into graphite in the background.
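
As a rough illustration of the kind of thing I mean (all the names below are made up for the sketch; none of this exists in Ceph today), the query an OSD loads could be little more than a group-by field plus a top-N limit, and the accumulation just a plain in-memory map:

// Hypothetical sketch only -- none of these types exist in Ceph today.
#include <algorithm>
#include <cstdint>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// What `rbd top` (or whatever loads the query) would push to each OSD.
struct StatsQuery {
  enum class GroupBy { RbdImage, ClientId, PoolId };
  GroupBy group_by = GroupBy::RbdImage;  // "grouped by RBD"
  size_t limit = 10;                     // the 'X' in "limit it to X"
};

// Per-OSD, purely in-memory accumulation: empty until a query is loaded,
// and lost on daemon restart.
using OpCounters = std::map<std::string, uint64_t>;

// Called per op (or from a side thread) only while a query is active.
void record_op(OpCounters& counters, const std::string& group_key,
               uint64_t bytes) {
  counters[group_key] += bytes;
}

// What the OSD hands back when the central component asks for results.
std::vector<std::pair<std::string, uint64_t>>
top_n(const OpCounters& counters, size_t limit) {
  std::vector<std::pair<std::string, uint64_t>> rows(counters.begin(),
                                                     counters.end());
  size_t n = std::min(limit, rows.size());
  std::partial_sort(rows.begin(), rows.begin() + n, rows.end(),
                    [](const auto& a, const auto& b) {
                      return a.second > b.second;
                    });
  rows.resize(n);
  return rows;
}

int main() {
  StatsQuery q;         // in reality, loaded via OSDMap or similar
  OpCounters counters;
  record_op(counters, "rbd/vm-101", 4096);
  record_op(counters, "rbd/vm-101", 8192);
  record_op(counters, "rbd/vm-202", 4096);
  for (const auto& [image, bytes] : top_n(counters, q.limit))
    std::cout << image << " " << bytes << " bytes\n";
  return 0;
}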

> If so, that is an interesting idea. I wonder if that would be simpler
> than tracking the performance of each (or the most recently used)
> object in some format like /proc/diskstats, where it is in memory and
> not necessarily consistent. The benefit is that you could have
> "lifelong" stats that show up like iostat, and it would be a simple
> operation.

Hmm, not sure we're on the same page about this part: what I'm talking about is all in memory and would be lost across daemon restarts. Some other component would be responsible for gathering the stats from all the daemons in one place (that central part could persist stats if desired).

> Each object should be able to reference back to RBD/CephFS upon
> request, and the client could even be responsible for that load.
> Client performance data would need stats in addition to the object
> stats.

You could extend the mechanism to clients. However, as much as possible it's a good thing to keep this server side: servers are generally fewer (we still have to reduce these stats across N servers to present them to the user), and we have multiple client implementations (kernel and userspace). What kind of thing do you want to get from clients?
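
To make that "reduce across N servers" step concrete, here is a similarly hedged sketch (reusing the made-up OpCounters type from the sketch above) of how a central collector might merge the per-OSD maps before re-taking the top X for the user:

// Hypothetical sketch: OpCounters is the made-up per-OSD map from the
// sketch above, not real Ceph code.
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using OpCounters = std::map<std::string, uint64_t>;

// The central collector (a mgr-style daemon, a graphite feeder, ...) pulls
// one OpCounters map from each OSD and sums matching keys; each OSD only
// ever sees its own slice of the traffic, so the final top-X is taken here.
OpCounters reduce(const std::vector<OpCounters>& per_osd_results) {
  OpCounters merged;
  for (const auto& osd_result : per_osd_results)
    for (const auto& [key, bytes] : osd_result)
      merged[key] += bytes;
  return merged;
}
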
> My concern is that adding additional SQL-like logic to each op is
> going to get very expensive. I guess if we could push that to another
> thread early in the op, then it might not be too bad. I'm enjoying the
> discussion and new ideas.

Hopefully in most cases the query can be applied very cheaply, for operations like comparing pool IDs or grouping by client ID. However, I would also envisage an optional sampling number, such that e.g. only 1 in every 100 ops would go through the query processing. That's useful for systems where keeping throughput as high as possible is paramount, and the numbers will still be useful if clients are doing many thousands of ops per second.
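
For the sampling part, the gate could be as cheap as a per-query counter checked before any of the matching/grouping work runs; a rough sketch, where sample_every is an imagined knob rather than anything real:

// Hypothetical sketch: sample_every is an imagined knob, not a real
// Ceph option.
#include <atomic>
#include <cstdint>

struct SampledQuery {
  uint32_t sample_every = 100;  // run the query logic on 1 op in 100
};

// Cheap gate on the op path, checked before any matching/grouping work.
// The collected numbers then count sampled ops, which the presenter can
// scale back up by sample_every if an absolute estimate is wanted.
bool should_sample(const SampledQuery& q, std::atomic<uint64_t>& op_seq) {
  return q.sample_every <= 1 ||
         op_seq.fetch_add(1, std::memory_order_relaxed) % q.sample_every == 0;
}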

Cheers,
John