No snapshots. Yes, most files are large (a few GB), but there are also lots of small ones. Large files occupy about 90% of the used space.

It may be worth mentioning that about 100-200 TB of large files are written and removed daily. We also use a cache space for batch jobs: inputs are stored in common directories, hardlinks to them are created in the job directories, and the hardlinks are removed when the job is done. This caused some problems a couple of years ago, but not any more.

Best,
Andrej

On 9. 6. 2026 18:26, Gregory Farnum wrote:
Do you have any snapshots?

There's also generally some amount of inflation with EC and small
files, but I don't think it should reach (7/5.6=)25%, and since you're
using 16MB objects I presume most of your files are large.
-Greg
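
To put a rough number on the small-file inflation mentioned above, here is a minimal sketch. It is a simplified model, not how Ceph accounts for space internally: it assumes each RADOS object is split evenly across the k data shards, that every one of the k+m shards is rounded up to an allocation unit, and that the allocation unit is 4 KiB (an assumption; the real value depends on how the OSDs were created). The function name and parameters are hypothetical.

```python
import math

def raw_bytes(file_size, k=8, m=3, object_size=16 * 2**20, alloc_unit=4096):
    """Estimate raw bytes allocated for one file on a k+m EC pool.

    Simplified model: alloc_unit is an assumed value, and EC stripe
    padding beyond the allocation unit is ignored.
    """
    total = 0
    remaining = file_size
    while remaining > 0:
        obj = min(remaining, object_size)        # one RADOS object of the file
        shard = math.ceil(obj / k)               # data bytes per data shard
        shard = math.ceil(shard / alloc_unit) * alloc_unit  # round up to allocation unit
        total += (k + m) * shard                 # k data shards plus m coding shards
        remaining -= obj
    return total

small = raw_bytes(1024)          # a 1 KiB file
large = raw_bytes(4 * 2**30)     # a 4 GiB file
print(small / 1024)              # raw-to-logical ratio for the small file
print(large / (4 * 2**30))       # raw-to-logical ratio for the large file
```

Under these assumptions a 1 KiB file costs 11 allocation units of raw space, while a multi-GB file stays at the nominal 11/8 overhead, so a population dominated by large files should see little inflation from this effect.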

On Tue, Jun 9, 2026 at 7:02 AM Andrej Filipčič via ceph-users
<[email protected]> wrote:

thanks for this info, but our purge queue is more or less empty
      "purge_queue": {
          "pq_executing_ops": 2,
          "pq_executing_ops_high_water": 13659,
          "pq_executing": 1,
          "pq_executing_high_water": 500,
          "pq_executed_ops": 127095,
          "pq_executed": 50471,
          "pq_item_in_journal": 2
      },

anyway, I raised the parameters to see if it helps.
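
For anyone who wants to check the same thing on their own cluster, a minimal sketch of reading these counters is below. It only parses the "purge_queue" fragment quoted above; the helper name and the threshold of 10 items are hypothetical choices for illustration, not recommended values.

```python
import json

# The "purge_queue" section quoted above, wrapped in braces to make valid JSON.
fragment = """
{
  "purge_queue": {
    "pq_executing_ops": 2,
    "pq_executing_ops_high_water": 13659,
    "pq_executing": 1,
    "pq_executing_high_water": 500,
    "pq_executed_ops": 127095,
    "pq_executed": 50471,
    "pq_item_in_journal": 2
  }
}
"""

def purge_queue_is_idle(pq, max_backlog=10):
    """Return True if few items are waiting or in flight.

    max_backlog is an arbitrary illustrative threshold.
    """
    return pq["pq_item_in_journal"] <= max_backlog and pq["pq_executing"] <= max_backlog

pq = json.loads(fragment)["purge_queue"]
print("items left in journal:", pq["pq_item_in_journal"])
print("idle:", purge_queue_is_idle(pq))
```

With the numbers above, only 2 items are left in the journal, which matches the statement that the queue is more or less empty. The high-water marks show it has been much busier at some point in the past.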

Best,
Andrej


On 9. 6. 2026 14:57, Loïc Tortay via ceph-users wrote:
On 09/06/2026 14:10, Andrej Filipcic via ceph-users wrote:
Hi,

I have a problem with large amounts of dark data/orphans on a CephFS
data pool. Data is stored mostly on an 8+3 EC pool (cephfs_data_echdd).

There is ~5.6PiB stored on /ceph, shown by ceph.dir.rbytes with 132M
files and 139M rentries. The pool shows 7PiB stored and 9.7PiB used
consistent with 8+3 EC.
The layout for most files:
ceph.dir.layout="stripe_unit=16777216 stripe_count=1
object_size=16777216 pool=cephfs_data_echdd"

But there is a 1.4PiB discrepancy between the pool and the filesystem
which I cannot explain, and I suspect there are a lot of orphan
objects there. I have run an MDS scrub on / and ~mdsdir as well. There
is some MDS damage on some old small files (~400 files), which I do
not think is relevant here.
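
The figures quoted above can be cross-checked with a few lines of arithmetic. This is only a sketch using the rounded numbers from the message; the variable names are made up for illustration.

```python
# Figures quoted in the message, all in PiB (rounded as reported).
fs_rbytes = 5.6     # ceph.dir.rbytes on /ceph
pool_stored = 7.0   # "stored" for cephfs_data_echdd
pool_used = 9.7     # "used" for cephfs_data_echdd
k, m = 8, 3         # EC profile 8+3

ec_overhead = (k + m) / k                  # raw bytes per logical byte
expected_used = pool_stored * ec_overhead  # raw usage implied by "stored"
unexplained = pool_stored - fs_rbytes      # pool data not accounted for by the filesystem
inflation = pool_stored / fs_rbytes - 1    # same gap as a fraction of the filesystem size

print(f"EC overhead:   {ec_overhead:.3f}")
print(f"expected used: {expected_used:.3f} PiB (reported: {pool_used} PiB)")
print(f"unexplained:   {unexplained:.1f} PiB ({inflation:.0%} of rbytes)")
```

So "used" is within rounding of 7 PiB times 11/8, and the gap is between "stored" and rbytes: about 1.4 PiB, or 25% on top of what the filesystem reports.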

Hello,
We had a similar issue last year with a group of users that created
and removed files at a very high rate.

Have you read
https://docs.clyso.com/docs/kb/cephfs/#cephfs-pool-data-usage-growth-without-explanation
?

We increased the purge rate parameters (very) aggressively to get back
to a comfortable situation (i.e. not a pool w/ near full warnings).


Loïc.

--
_____________________________________________________________
     prof. dr. Andrej Filipcic,   E-mail: [email protected]
     Department of Experimental High Energy Physics - F9
     Jozef Stefan Institute, Jamova 39, P.o.Box 3000
     SI-1001 Ljubljana, Slovenia
     Tel.: +386-1-477-3674    Fax: +386-1-477-3166
-------------------------------------------------------------
_______________________________________________
ceph-users mailing list -- [email protected]
To unsubscribe send an email to [email protected]


