Hi there,

Sorry for reviving this old thread, but I'd like to share our own experience, FWIW.

We also agree that PostgreSQL is nowadays better suited to handling terabytes of job data. But instead of writing a new, dedicated accounting storage plugin (a quick look at the MySQL plugin code is enough to be convinced that it would be painful), we took another approach.

We consider the Slurm database to be just a temporary, application-specific storage backend used only for accounting purposes, and we simply live with it. We then enable slurmdbd automatic purging to keep the database from growing forever. With MariaDB, this has worked pretty well so far.

But since we do care about job metadata over the lifetime of our supercomputers, we have developed software that crawls the Slurm database to incrementally fill a PostgreSQL database:

http://edf-hpc.github.io/hpcstats/ [*]
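
To give a rough idea of what "incrementally" means here, below is a minimal sketch of a watermark-based sync. It is not the actual hpcstats code. Everything in it is an assumption made for illustration: the simplified `job_table` and `jobs` layouts, the `sync_jobs` helper, and the use of in-memory SQLite connections as stand-ins for both the MariaDB source and the PostgreSQL archive (so that it runs with the Python standard library alone).

```python
import sqlite3


def sync_jobs(src, dst):
    """Copy new or modified job rows from src into dst; safe to run repeatedly."""
    # Hypothetical archive table, far simpler than a real schema.
    dst.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "src_inx INTEGER PRIMARY KEY, id_job INTEGER, "
        "time_end INTEGER, mod_time INTEGER)"
    )
    # Watermark: newest modification time already archived (0 on first run).
    (last,) = dst.execute("SELECT COALESCE(MAX(mod_time), 0) FROM jobs").fetchone()
    # ">=" re-reads rows that share the watermark timestamp, so that none is
    # missed; the REPLACE below makes re-reading them harmless.
    rows = src.execute(
        "SELECT job_db_inx, id_job, time_end, mod_time FROM job_table "
        "WHERE mod_time >= ?",
        (last,),
    ).fetchall()
    # With a real PostgreSQL target this would be INSERT ... ON CONFLICT DO UPDATE.
    dst.executemany("INSERT OR REPLACE INTO jobs VALUES (?, ?, ?, ?)", rows)
    dst.commit()
    return len(rows)


# Stand-in for the accounting database (assumed, simplified columns).
src = sqlite3.connect(":memory:")
src.execute(
    "CREATE TABLE job_table (job_db_inx INTEGER PRIMARY KEY, "
    "id_job INTEGER, time_end INTEGER, mod_time INTEGER)"
)
src.executemany(
    "INSERT INTO job_table VALUES (?, ?, ?, ?)",
    [(1, 100, 50, 50), (2, 101, 0, 40)],  # job 101 is still running
)
dst = sqlite3.connect(":memory:")  # stand-in for the long-term archive

sync_jobs(src, dst)  # first run copies everything

# Later: job 101 finishes and a new job shows up.
src.execute("UPDATE job_table SET time_end = 90, mod_time = 90 WHERE job_db_inx = 2")
src.execute("INSERT INTO job_table VALUES (3, 102, 0, 95)")
sync_jobs(src, dst)  # second run only reads rows at or past the watermark
```

Since the archive is only ever added to or updated, rows purged later from the source simply stay in the archive, which is the whole point of the separation.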

This software can also pull data from monitoring software, LDAP directories, and so on. This way, we have all our precious data in PostgreSQL for reporting and statistics purposes. This has the following advantages:

- It's a separate DB, so running complex queries does not disturb slurmdbd;
- It's a mashup of various data sources, so we can extract metrics with advanced correlations;
- It's generic and not tied to any particular technology, so we keep the flexibility to change whenever we want.

We are happy with this approach so far :)

[*] The software is open source, but it may be hard to make it work in your information system without a tough integration effort. It is designed as a generic framework with plugins, but the current plugins are quite specific to our needs. Feel free to contact me if you feel brave and would like any help though :)

Best,
Rémi
