Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Janne Johansson
On Tue, 8 Jan 2019 at 16:05, Yoann Moulin wrote:
> The best thing you can do here is to add two disks to pf-us1-dfs3.

After that, get a fourth host with 4 OSDs on it and add it to the cluster.
If you have 3 replicas (which is good!) on only 3 hosts, any downtime means
the cluster stays in a degraded state. With 4 or more hosts, the cluster can
repair itself and get back into a decent state when you lose a server for
whatever reason.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin

> root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule dump
> [
>    {
>    "rule_id": 0,
>    "rule_name": "replicated_rule",
>    "ruleset": 0,
>    "type": 1,
>    "min_size": 1,
>    "max_size": 10,
>    "steps": [
>    {
>    "op": "take",
>    "item": -1,
>    "item_name": "default"
>    },
>    {
>    "op": "chooseleaf_firstn",
>    "num": 0,
>    "type": "host"
>    }

This means the failure domain is set to "host": the cluster will place the
replicas of each object on different hosts so that it can lose one host and
still keep the data online.

You can change this to "osd" (per disk), but in that case your cluster will only
tolerate the failure of one disk, not of a whole server: you lose the
guarantee that all replicas of an object end up on different hosts.
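
(For reference only, and not what I'd recommend here: on a Luminous-era cluster
the change would look roughly like this. The commands are standard, but please
double-check them against your Ceph version before running anything:

ceph osd crush rule create-replicated replicated_osd default osd
ceph osd pool set cephfs_data crush_rule replicated_osd

and the same crush_rule change for the other pools.)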

The best thing you can do here is to add two disks to pf-us1-dfs3.

The second option would be to move one disk from one of the 2 other servers to
pf-us1-dfs3 if you can't quickly get new disks. I don't know the best way to do
that; I have never had this case on my cluster.
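
If you do end up moving a disk, a rough sketch (I have not tested this exact
sequence; /dev/sdX is only a placeholder for the drive, and osd.9 on
pf-us1-dfs2 is used purely as an example) would be: mark the OSD out, let the
backfill finish, purge it, move the drive, then recreate the OSD on pf-us1-dfs3:

ceph osd out osd.9                           # on an admin node
# wait until "ceph -s" shows the backfill has finished
systemctl stop ceph-osd@9                    # on the host that currently holds it
ceph osd purge osd.9 --yes-i-really-mean-it
# physically move the drive, then on pf-us1-dfs3:
ceph-volume lvm zap /dev/sdX --destroy
ceph-volume lvm create --data /dev/sdX

On a cluster that is already hitting backfill_toofull this shuffles a lot of
data, so getting new disks really is the safer path.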

Best regards,

Yoann

> On Tue, Jan 8, 2019 at 11:35 AM Yoann Moulin wrote:
> 
> Hello,
> 
> > Hi Yoann, thanks for your response.
> > Here are the results of the commands.
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> > ID CLASS WEIGHT  REWEIGHT SIZE    USE AVAIL   %USE  VAR  PGS  
> > 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310  
> > 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271  
> > 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49  
> > 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.03    0  42  
> > 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285  
> > 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296  
> > 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53  
> > 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38  
> > 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321  
> > 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351  
> >    TOTAL  73 TiB  39 TiB  34 TiB 53.13   
> > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
> 
> It looks like you don't have a good balance between your OSD, what is 
> your failure domain ?
> 
> could you provide your crush map 
> http://docs.ceph.com/docs/luminous/rados/operations/crush-map/
> 
> ceph osd crush tree
> ceph osd crush rule ls
> ceph osd crush rule dump
> 
> 
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 fla
> > gs hashpspool,full stripe_width 0
> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lf
> > or 0/439 flags hashpspool,full stripe_width 0 application cephfs
> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
> > 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
> > shpspool,full stripe_width 0 application rgw
> > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
> > 1 flags hashpspool,full stripe_width 0 application rgw
> > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
> > lags hashpspool,full stripe_width 0 application rgw
> > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
> > ags hashpspool,full stripe_width 0 application rgw
> 
> You may need to increase the pg num for cephfs_data pool. But before, you 
> must understand what is the impact https://ceph.com/pgcalc/
> you can't decrease pg_num, if it set too high you may have trouble in 
> your cluster.
> 
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> > ID CLASS WEIGHT   TYPE NAME    STATUS REWEIGHT PRI-AFF  
> > -1   72.77390 root default  
> > -3   29.10956 host pf-us1-dfs1  
> > 0   hdd  7.27739 osd.0    up  1.0 1.0  
> > 5   hdd  7.27739 osd.5    up  1.0 1.0  
> > 6   hdd  7.27739 osd.6    up  1.0 1.0  
> > 8   hdd  7.27739 osd.8    up  1.0 1.0  
> > -5   29.10956 host 

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Yoann, thanks a lot for your help.


root@pf-us1-dfs3:/home/rodrigo# ceph osd crush tree
ID CLASS WEIGHT   TYPE NAME
-1   72.77390 root default
-3   29.10956 host pf-us1-dfs1
0   hdd  7.27739 osd.0
5   hdd  7.27739 osd.5
6   hdd  7.27739 osd.6
8   hdd  7.27739 osd.8
-5   29.10956 host pf-us1-dfs2
1   hdd  7.27739 osd.1
3   hdd  7.27739 osd.3
7   hdd  7.27739 osd.7
9   hdd  7.27739 osd.9
-7   14.55478 host pf-us1-dfs3
2   hdd  7.27739 osd.2
4   hdd  7.27739 osd.4

root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule ls
replicated_rule

root@pf-us1-dfs3:/home/rodrigo# ceph osd crush rule dump
[
   {
   "rule_id": 0,
   "rule_name": "replicated_rule",
   "ruleset": 0,
   "type": 1,
   "min_size": 1,
   "max_size": 10,
   "steps": [
   {
   "op": "take",
   "item": -1,
   "item_name": "default"
   },
   {
   "op": "chooseleaf_firstn",
   "num": 0,
   "type": "host"
   },
   {
   "op": "emit"
   }
   ]
   }
]


On Tue, Jan 8, 2019 at 11:35 AM Yoann Moulin  wrote:

> Hello,
>
> > Hi Yoann, thanks for your response.
> > Here are the results of the commands.
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> > ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
> > 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
> > 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
> > 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
> > 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.030  42
> > 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
> > 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
> > 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
> > 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
> > 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
> > 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
> >TOTAL  73 TiB  39 TiB  34 TiB 53.13
> > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
>
> It looks like you don't have a good balance between your OSD, what is your
> failure domain ?
>
> could you provide your crush map
> http://docs.ceph.com/docs/luminous/rados/operations/crush-map/
>
> ceph osd crush tree
> ceph osd crush rule ls
> ceph osd crush rule dump
>
>
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 fla
> > gs hashpspool,full stripe_width 0
> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lf
> > or 0/439 flags hashpspool,full stripe_width 0 application cephfs
> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
> > 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
> rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
> > shpspool,full stripe_width 0 application rgw
> > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
> > 1 flags hashpspool,full stripe_width 0 application rgw
> > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
> > lags hashpspool,full stripe_width 0 application rgw
> > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
> > ags hashpspool,full stripe_width 0 application rgw
>
> You may need to increase the pg num for cephfs_data pool. But before, you
> must understand what is the impact https://ceph.com/pgcalc/
> you can't decrease pg_num, if it set too high you may have trouble in your
> cluster.
>
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> > ID CLASS WEIGHT   TYPE NAMESTATUS REWEIGHT PRI-AFF
> > -1   72.77390 root default
> > -3   29.10956 host pf-us1-dfs1
> > 0   hdd  7.27739 osd.0up  1.0 1.0
> > 5   hdd  7.27739 osd.5up  1.0 1.0
> > 6   hdd  7.27739 osd.6up  1.0 1.0
> > 8   hdd  7.27739 osd.8up  1.0 1.0
> > -5   29.10956 host pf-us1-dfs2
> > 1   hdd  7.27739 osd.1up  1.0 1.0
> > 3   hdd  7.27739 osd.3up  1.0 1.0
> > 7   hdd  7.27739 osd.7up  1.0 1.0
> > 9   hdd  7.27739 osd.9up  1.0 1.0
> > -7   14.55478 host pf-us1-dfs3
> > 2   hdd  7.27739 osd.2up  

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
It would, but you should not:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-December/014846.html
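
(For reference, the change itself would only be something like
"ceph osd pool set cephfs_data size 2" per pool, but with only two copies a
single failed or inconsistent disk during recovery can already mean data loss,
which is why the thread above argues against size=2.)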

Kevin

On Tue, 8 Jan 2019 at 15:35, Rodrigo Embeita wrote:
>
> Thanks again Kevin.
> If I reduce the size flag to a value of 2, that should fix the problem?
>
> Regards
>
> On Tue, Jan 8, 2019 at 11:28 AM Kevin Olbrich  wrote:
>>
>> You use replication 3 failure-domain host.
>> OSD 2 and 4 are full, that's why your pool is also full.
>> You need to add two disks to pf-us1-dfs3 or swap one from the larger
>> nodes to this one.
>>
>> Kevin
>>
>> On Tue, 8 Jan 2019 at 15:20, Rodrigo Embeita wrote:
>> >
>> > Hi Yoann, thanks for your response.
>> > Here are the results of the commands.
>> >
>> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
>> > ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
>> > 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
>> > 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
>> > 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
>> > 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.030  42
>> > 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
>> > 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
>> > 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
>> > 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
>> > 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
>> > 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
>> >TOTAL  73 TiB  39 TiB  34 TiB 53.13
>> > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
>> >
>> >
>> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
>> > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash 
>> > rjenkins pg_num 128 pgp_num 128 last_change 471 fla
>> > gs hashpspool,full stripe_width 0
>> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash 
>> > rjenkins pg_num 256 pgp_num 256 last_change 471 lf
>> > or 0/439 flags hashpspool,full stripe_width 0 application cephfs
>> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 
>> > object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
>> > 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
>> > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash 
>> > rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
>> > shpspool,full stripe_width 0 application rgw
>> > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 
>> > object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
>> > 1 flags hashpspool,full stripe_width 0 application rgw
>> > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 
>> > object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
>> > lags hashpspool,full stripe_width 0 application rgw
>> > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 
>> > object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
>> > ags hashpspool,full stripe_width 0 application rgw
>> >
>> >
>> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
>> > ID CLASS WEIGHT   TYPE NAMESTATUS REWEIGHT PRI-AFF
>> > -1   72.77390 root default
>> > -3   29.10956 host pf-us1-dfs1
>> > 0   hdd  7.27739 osd.0up  1.0 1.0
>> > 5   hdd  7.27739 osd.5up  1.0 1.0
>> > 6   hdd  7.27739 osd.6up  1.0 1.0
>> > 8   hdd  7.27739 osd.8up  1.0 1.0
>> > -5   29.10956 host pf-us1-dfs2
>> > 1   hdd  7.27739 osd.1up  1.0 1.0
>> > 3   hdd  7.27739 osd.3up  1.0 1.0
>> > 7   hdd  7.27739 osd.7up  1.0 1.0
>> > 9   hdd  7.27739 osd.9up  1.0 1.0
>> > -7   14.55478 host pf-us1-dfs3
>> > 2   hdd  7.27739 osd.2up  1.0 1.0
>> > 4   hdd  7.27739 osd.4up  1.0 1.0
>> >
>> >
>> > Thanks for your help guys.
>> >
>> >
>> > On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin  wrote:
>> >>
>> >> Hello,
>> >>
>> >> > Hi guys, I need your help.
>> >> > I'm new with Cephfs and we started using it as file storage.
>> >> > Today we are getting no space left on device but I'm seeing that we 
>> >> > have plenty space on the filesystem.
>> >> > Filesystem  Size  Used Avail Use% Mounted on
>> >> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   
>> >> > 73T   39T   35T  54% /mnt/cephfs
>> >> >
>> >> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB 
>> >> > each but I'm getting the error "No space left on device" every time that
>> >> > I want to add a new file.
>> >> > After adding the 2 additional OSD disks I'm seeing that the load is 
>> >> > being distributed among the cluster.
>> >> > Please I need your help.
>> >>
>> >> Could you give us the output of
>> >>
>> >> ceph 

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Thanks again Kevin.
If I reduce the size flag to a value of 2, would that fix the problem?

Regards

On Tue, Jan 8, 2019 at 11:28 AM Kevin Olbrich  wrote:

> You use replication 3 failure-domain host.
> OSD 2 and 4 are full, that's why your pool is also full.
> You need to add two disks to pf-us1-dfs3 or swap one from the larger
> nodes to this one.
>
> Kevin
>
> On Tue, 8 Jan 2019 at 15:20, Rodrigo Embeita wrote:
> >
> > Hi Yoann, thanks for your response.
> > Here are the results of the commands.
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> > ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
> > 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
> > 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
> > 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
> > 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.030  42
> > 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
> > 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
> > 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
> > 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
> > 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
> > 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
> >TOTAL  73 TiB  39 TiB  34 TiB 53.13
> > MIN/MAX VAR: 0/1.79  STDDEV: 41.15
> >
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> > pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 fla
> > gs hashpspool,full stripe_width 0
> > pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lf
> > or 0/439 flags hashpspool,full stripe_width 0 application cephfs
> > pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
> > 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> > pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash
> rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
> > shpspool,full stripe_width 0 application rgw
> > pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
> > 1 flags hashpspool,full stripe_width 0 application rgw
> > pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
> > lags hashpspool,full stripe_width 0 application rgw
> > pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
> > ags hashpspool,full stripe_width 0 application rgw
> >
> >
> > root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> > ID CLASS WEIGHT   TYPE NAMESTATUS REWEIGHT PRI-AFF
> > -1   72.77390 root default
> > -3   29.10956 host pf-us1-dfs1
> > 0   hdd  7.27739 osd.0up  1.0 1.0
> > 5   hdd  7.27739 osd.5up  1.0 1.0
> > 6   hdd  7.27739 osd.6up  1.0 1.0
> > 8   hdd  7.27739 osd.8up  1.0 1.0
> > -5   29.10956 host pf-us1-dfs2
> > 1   hdd  7.27739 osd.1up  1.0 1.0
> > 3   hdd  7.27739 osd.3up  1.0 1.0
> > 7   hdd  7.27739 osd.7up  1.0 1.0
> > 9   hdd  7.27739 osd.9up  1.0 1.0
> > -7   14.55478 host pf-us1-dfs3
> > 2   hdd  7.27739 osd.2up  1.0 1.0
> > 4   hdd  7.27739 osd.4up  1.0 1.0
> >
> >
> > Thanks for your help guys.
> >
> >
> > On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin 
> wrote:
> >>
> >> Hello,
> >>
> >> > Hi guys, I need your help.
> >> > I'm new with Cephfs and we started using it as file storage.
> >> > Today we are getting no space left on device but I'm seeing that we
> have plenty space on the filesystem.
> >> > Filesystem  Size  Used Avail Use% Mounted on
> >> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts
>  73T   39T   35T  54% /mnt/cephfs
> >> >
> >> > We have 35TB of disk space. I've added 2 additional OSD disks with
> 7TB each but I'm getting the error "No space left on device" every time that
> >> > I want to add a new file.
> >> > After adding the 2 additional OSD disks I'm seeing that the load is
> being distributed among the cluster.
> >> > Please I need your help.
> >>
> >> Could you give us the output of
> >>
> >> ceph osd df
> >> ceph osd pool ls detail
> >> ceph osd tree
> >>
> >> Best regards,
> >>
> >> --
> >> Yoann Moulin
> >> EPFL IC-IT
> >> ___
> >> ceph-users mailing list
> >> ceph-users@lists.ceph.com
> >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
> > 

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin
Hello,

> Hi Yoann, thanks for your response.
> Here are the results of the commands.
> 
> root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZE    USE AVAIL   %USE  VAR  PGS  
> 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310  
> 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271  
> 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49  
> 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.03    0  42  
> 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285  
> 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296  
> 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53  
> 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38  
> 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321  
> 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351  
>    TOTAL  73 TiB  39 TiB  34 TiB 53.13   
> MIN/MAX VAR: 0/1.79  STDDEV: 41.15

It looks like you don't have a good balance between your OSDs. What is your
failure domain?

Could you provide your CRUSH map? (See
http://docs.ceph.com/docs/luminous/rados/operations/crush-map/)

ceph osd crush tree
ceph osd crush rule ls
ceph osd crush rule dump
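
If it is easier, you can also dump the full decompiled CRUSH map (as described
in the doc above), for example:

ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt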


> root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 128 pgp_num 128 last_change 471 fla
> gs hashpspool,full stripe_width 0
> pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 256 pgp_num 256 last_change 471 lf
> or 0/439 flags hashpspool,full stripe_width 0 application cephfs
> pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
> 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
> shpspool,full stripe_width 0 application rgw
> pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
> 1 flags hashpspool,full stripe_width 0 application rgw
> pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
> lags hashpspool,full stripe_width 0 application rgw
> pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
> ags hashpspool,full stripe_width 0 application rgw

You may need to increase pg_num for the cephfs_data pool. But before doing so,
you must understand the impact (see https://ceph.com/pgcalc/): you can't
decrease pg_num, so if it is set too high you may have trouble in your cluster.
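
If, after checking pgcalc, you do decide to raise it, on Luminous it is a
two-step change, for example (512 is only a placeholder, take the value pgcalc
gives you for your pool and OSD count):

ceph osd pool set cephfs_data pg_num 512
ceph osd pool set cephfs_data pgp_num 512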

> root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> ID CLASS WEIGHT   TYPE NAME    STATUS REWEIGHT PRI-AFF  
> -1   72.77390 root default  
> -3   29.10956 host pf-us1-dfs1  
> 0   hdd  7.27739 osd.0    up  1.0 1.0  
> 5   hdd  7.27739 osd.5    up  1.0 1.0  
> 6   hdd  7.27739 osd.6    up  1.0 1.0  
> 8   hdd  7.27739 osd.8    up  1.0 1.0  
> -5   29.10956 host pf-us1-dfs2  
> 1   hdd  7.27739 osd.1    up  1.0 1.0  
> 3   hdd  7.27739 osd.3    up  1.0 1.0  
> 7   hdd  7.27739 osd.7    up  1.0 1.0  
> 9   hdd  7.27739 osd.9    up  1.0 1.0  
> -7   14.55478 host pf-us1-dfs3  
> 2   hdd  7.27739 osd.2    up  1.0 1.0  
> 4   hdd  7.27739 osd.4    up  1.0 1.0

You really should add 2 disks to pf-us1-dfs3. Currently the cluster tries to
balance data between the 3 hosts (replica 3, failure domain set to 'host', I
guess), so each host stores one full replica (1/3 of the raw data). Since
pf-us1-dfs3 only has half the capacity of the 2 others, you won't be able to
store more than 3x (osd.2 + osd.4) in total, even though there is free space
on the other OSDs.
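
As a rough back-of-the-envelope check: with size 3 and failure domain 'host',
usable capacity is capped by the smallest host. pf-us1-dfs3 has 2 x 7.28 TiB,
about 14.6 TiB raw, so the cluster can store at most about 14.6 TiB of user
data (about 43.7 TiB raw across the 3 replicas), whatever free space the two
bigger hosts still show, and that is before the full/backfillfull ratios kick in.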

Best regards,

Yoann

> On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin wrote:
> 
> Hello,
> 
> > Hi guys, I need your help.
> > I'm new with Cephfs and we started using it as file storage.
> > Today we are getting no space left on device but I'm seeing that we 
> have plenty space on the filesystem.
> > Filesystem              Size  Used Avail Use% Mounted on
> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   
> 73T   39T   35T  54% /mnt/cephfs
> >
> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB 
> each but I'm getting the error "No space left on device" every time
> that
> > I want to add a new file.
> > After adding the 2 

Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Kevin, thanks for your answer.
How can I check the (re-)weights?

On Tue, Jan 8, 2019 at 10:36 AM Kevin Olbrich  wrote:

> Looks like the same problem like mine:
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html
>
> The free space is total while Ceph uses the smallest free space (worst
> OSD).
> Please check your (re-)weights.
>
> Kevin
>
> On Tue, 8 Jan 2019 at 14:32, Rodrigo Embeita wrote:
> >
> > Hi guys, I need your help.
> > I'm new with Cephfs and we started using it as file storage.
> > Today we are getting no space left on device but I'm seeing that we have
> plenty space on the filesystem.
> > Filesystem  Size  Used Avail Use% Mounted on
> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts
>  73T   39T   35T  54% /mnt/cephfs
> >
> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> each but I'm getting the error "No space left on device" every time that I
> want to add a new file.
> > After adding the 2 additional OSD disks I'm seeing that the load is
> being distributed among the cluster.
> > Please I need your help.
> >
> > root@pf-us1-dfs1:/etc/ceph# ceph -s
> >  cluster:
> >id: 609e9313-bdd3-449e-a23f-3db8382e71fb
> >health: HEALTH_ERR
> >2 backfillfull osd(s)
> >1 full osd(s)
> >7 pool(s) full
> >197313040/508449063 objects misplaced (38.807%)
> >Degraded data redundancy: 2/508449063 objects degraded
> (0.000%), 2 pgs degraded
> >Degraded data redundancy (low space): 16 pgs
> backfill_toofull, 3 pgs recovery_toofull
> >
> >  services:
> >mon: 3 daemons, quorum pf-us1-dfs2,pf-us1-dfs1,pf-us1-dfs3
> >mgr: pf-us1-dfs3(active), standbys: pf-us1-dfs2
> >mds: pagefs-2/2/2 up
> {0=pf-us1-dfs3=up:active,1=pf-us1-dfs1=up:active}, 1 up:standby
> >osd: 10 osds: 10 up, 10 in; 189 remapped pgs
> >rgw: 1 daemon active
> >
> >  data:
> >pools:   7 pools, 416 pgs
> >objects: 169.5 M objects, 3.6 TiB
> >usage:   39 TiB used, 34 TiB / 73 TiB avail
> >pgs: 2/508449063 objects degraded (0.000%)
> > 197313040/508449063 objects misplaced (38.807%)
> > 224 active+clean
> > 168 active+remapped+backfill_wait
> > 16  active+remapped+backfill_wait+backfill_toofull
> > 5   active+remapped+backfilling
> > 2   active+recovery_toofull+degraded
> > 1   active+recovery_toofull
> >
> >  io:
> >recovery: 1.1 MiB/s, 31 objects/s
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
You use replication 3 with failure domain "host".
OSDs 2 and 4 are full, and that's why your pool is also full.
You need to add two disks to pf-us1-dfs3, or move one disk from one of the
larger nodes to this one.
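
Adding the disks is the same procedure as for the existing ones, for example
with ceph-volume directly on pf-us1-dfs3 (assuming a plain bluestore setup;
/dev/sdX is a placeholder for the new drive):

ceph-volume lvm create --data /dev/sdX

or, if you deployed with ceph-deploy, from the admin node:

ceph-deploy osd create --data /dev/sdX pf-us1-dfs3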

Kevin

On Tue, 8 Jan 2019 at 15:20, Rodrigo Embeita wrote:
>
> Hi Yoann, thanks for your response.
> Here are the results of the commands.
>
> root@pf-us1-dfs2:/var/log/ceph# ceph osd df
> ID CLASS WEIGHT  REWEIGHT SIZEUSE AVAIL   %USE  VAR  PGS
> 0   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
> 5   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
> 6   hdd 7.27739  1.0 7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
> 8   hdd 7.27739  1.0 7.3 TiB 2.5 GiB 7.3 TiB  0.030  42
> 1   hdd 7.27739  1.0 7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
> 3   hdd 7.27739  1.0 7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
> 7   hdd 7.27739  1.0 7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
> 9   hdd 7.27739  1.0 7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
> 2   hdd 7.27739  1.0 7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
> 4   hdd 7.27739  1.0 7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
>TOTAL  73 TiB  39 TiB  34 TiB 53.13
> MIN/MAX VAR: 0/1.79  STDDEV: 41.15
>
>
> root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
> pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 128 pgp_num 128 last_change 471 fla
> gs hashpspool,full stripe_width 0
> pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 256 pgp_num 256 last_change 471 lf
> or 0/439 flags hashpspool,full stripe_width 0 application cephfs
> pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 256 pgp_num 256 last_change 47
> 1 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
> pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash 
> rjenkins pg_num 8 pgp_num 8 last_change 471 flags ha
> shpspool,full stripe_width 0 application rgw
> pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 47
> 1 flags hashpspool,full stripe_width 0 application rgw
> pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 f
> lags hashpspool,full stripe_width 0 application rgw
> pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 
> object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 fl
> ags hashpspool,full stripe_width 0 application rgw
>
>
> root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
> ID CLASS WEIGHT   TYPE NAMESTATUS REWEIGHT PRI-AFF
> -1   72.77390 root default
> -3   29.10956 host pf-us1-dfs1
> 0   hdd  7.27739 osd.0up  1.0 1.0
> 5   hdd  7.27739 osd.5up  1.0 1.0
> 6   hdd  7.27739 osd.6up  1.0 1.0
> 8   hdd  7.27739 osd.8up  1.0 1.0
> -5   29.10956 host pf-us1-dfs2
> 1   hdd  7.27739 osd.1up  1.0 1.0
> 3   hdd  7.27739 osd.3up  1.0 1.0
> 7   hdd  7.27739 osd.7up  1.0 1.0
> 9   hdd  7.27739 osd.9up  1.0 1.0
> -7   14.55478 host pf-us1-dfs3
> 2   hdd  7.27739 osd.2up  1.0 1.0
> 4   hdd  7.27739 osd.4up  1.0 1.0
>
>
> Thanks for your help guys.
>
>
> On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin  wrote:
>>
>> Hello,
>>
>> > Hi guys, I need your help.
>> > I'm new with Cephfs and we started using it as file storage.
>> > Today we are getting no space left on device but I'm seeing that we have 
>> > plenty space on the filesystem.
>> > Filesystem  Size  Used Avail Use% Mounted on
>> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T   
>> > 39T   35T  54% /mnt/cephfs
>> >
>> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB 
>> > each but I'm getting the error "No space left on device" every time that
>> > I want to add a new file.
>> > After adding the 2 additional OSD disks I'm seeing that the load is being
>> > distributed among the cluster.
>> > Please I need your help.
>>
>> Could you give us the output of
>>
>> ceph osd df
>> ceph osd pool ls detail
>> ceph osd tree
>>
>> Best regards,
>>
>> --
>> Yoann Moulin
>> EPFL IC-IT
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi Yoann, thanks for your response.
Here are the results of the commands.

root@pf-us1-dfs2:/var/log/ceph# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS
 0   hdd 7.27739  1.0     7.3 TiB 6.7 TiB 571 GiB 92.33 1.74 310
 5   hdd 7.27739  1.0     7.3 TiB 5.6 TiB 1.7 TiB 77.18 1.45 271
 6   hdd 7.27739  1.0     7.3 TiB 609 GiB 6.7 TiB  8.17 0.15  49
 8   hdd 7.27739  1.0     7.3 TiB 2.5 GiB 7.3 TiB  0.03 0     42
 1   hdd 7.27739  1.0     7.3 TiB 5.6 TiB 1.7 TiB 77.28 1.45 285
 3   hdd 7.27739  1.0     7.3 TiB 6.9 TiB 371 GiB 95.02 1.79 296
 7   hdd 7.27739  1.0     7.3 TiB 360 GiB 6.9 TiB  4.84 0.09  53
 9   hdd 7.27739  1.0     7.3 TiB 4.1 GiB 7.3 TiB  0.06 0.00  38
 2   hdd 7.27739  1.0     7.3 TiB 6.7 TiB 576 GiB 92.27 1.74 321
 4   hdd 7.27739  1.0     7.3 TiB 6.1 TiB 1.2 TiB 84.10 1.58 351
                  TOTAL   73 TiB  39 TiB  34 TiB  53.13
MIN/MAX VAR: 0/1.79  STDDEV: 41.15


root@pf-us1-dfs2:/var/log/ceph# ceph osd pool ls detail
pool 1 'poolcephfs' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 471 flags hashpspool,full stripe_width 0
pool 2 'cephfs_data' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/439 flags hashpspool,full stripe_width 0 application cephfs
pool 3 'cephfs_metadata' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 471 lfor 0/448 flags hashpspool,full stripe_width 0 application cephfs
pool 4 '.rgw.root' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 5 'default.rgw.control' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 6 'default.rgw.meta' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw
pool 7 'default.rgw.log' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 471 flags hashpspool,full stripe_width 0 application rgw


root@pf-us1-dfs2:/var/log/ceph# ceph osd tree
ID CLASS WEIGHT   TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       72.77390 root default
-3       29.10956     host pf-us1-dfs1
 0   hdd  7.27739         osd.0             up  1.0     1.0
 5   hdd  7.27739         osd.5             up  1.0     1.0
 6   hdd  7.27739         osd.6             up  1.0     1.0
 8   hdd  7.27739         osd.8             up  1.0     1.0
-5       29.10956     host pf-us1-dfs2
 1   hdd  7.27739         osd.1             up  1.0     1.0
 3   hdd  7.27739         osd.3             up  1.0     1.0
 7   hdd  7.27739         osd.7             up  1.0     1.0
 9   hdd  7.27739         osd.9             up  1.0     1.0
-7       14.55478     host pf-us1-dfs3
 2   hdd  7.27739         osd.2             up  1.0     1.0
 4   hdd  7.27739         osd.4             up  1.0     1.0


Thanks for your help guys.


On Tue, Jan 8, 2019 at 10:36 AM Yoann Moulin  wrote:

> Hello,
>
> > Hi guys, I need your help.
> > I'm new with Cephfs and we started using it as file storage.
> > Today we are getting no space left on device but I'm seeing that we have
> plenty space on the filesystem.
> > Filesystem  Size  Used Avail Use% Mounted on
> > 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts
>   73T   39T   35T  54% /mnt/cephfs
> >
> > We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> each but I'm getting the error "No space left on device" every time that
> > I want to add a new file.
> > After adding the 2 additional OSD disks I'm seeing that the load is
> being distributed among the cluster.
> > Please I need your help.
>
> Could you give us the output of
>
> ceph osd df
> ceph osd pool ls detail
> ceph osd tree
>
> Best regards,
>
> --
> Yoann Moulin
> EPFL IC-IT
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
I believe I found something, but I don't know how to fix it.
I ran "ceph df" and I'm seeing that cephfs_data and cephfs_metadata are at
100% USED.
How can I increase the cephfs_data and cephfs_metadata pools?
Sorry, I'm new to Ceph.

root@pf-us1-dfs1:/etc/ceph# ceph df
GLOBAL:
    SIZE   AVAIL  RAW USED %RAW USED
    73 TiB 34 TiB 39 TiB   53.12
POOLS:
    NAME                ID USED    %USED  MAX AVAIL OBJECTS
    poolcephfs          1  0 B     0      0 B       0
    cephfs_data         2  3.6 TiB 100.00 0 B       169273821
    cephfs_metadata     3  1.0 GiB 100.00 0 B       208981
    .rgw.root           4  1.1 KiB 100.00 0 B       4
    default.rgw.control 5  0 B     0      0 B       8
    default.rgw.meta    6  0 B     0      0 B       0
    default.rgw.log     7  0 B     0      0 B       207


On Tue, Jan 8, 2019 at 10:30 AM Rodrigo Embeita 
wrote:

> Hi guys, I need your help.
> I'm new with Cephfs and we started using it as file storage.
> Today we are getting no space left on device but I'm seeing that we have
> plenty space on the filesystem.
> Filesystem  Size  Used Avail Use% Mounted on
> 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T
>   39T   35T  54% /mnt/cephfs
>
> We have 35TB of disk space. I've added 2 additional OSD disks with 7TB
> each but I'm getting the error "No space left on device" every time that I
> want to add a new file.
> After adding the 2 additional OSD disks I'm seeing that the load is being
> distributed among the cluster.
> Please I need your help.
>
> root@pf-us1-dfs1:/etc/ceph# ceph -s
>  cluster:
>id: 609e9313-bdd3-449e-a23f-3db8382e71fb
>health: HEALTH_ERR
>2 backfillfull osd(s)
>1 full osd(s)
>7 pool(s) full
>197313040/508449063 objects misplaced (38.807%)
>Degraded data redundancy: 2/508449063 objects degraded
> (0.000%), 2 pgs degraded
>Degraded data redundancy (low space): 16 pgs backfill_toofull,
> 3 pgs recovery_toofull
>
>  services:
>mon: 3 daemons, quorum pf-us1-dfs2,pf-us1-dfs1,pf-us1-dfs3
>mgr: pf-us1-dfs3(active), standbys: pf-us1-dfs2
>mds: pagefs-2/2/2 up
>  {0=pf-us1-dfs3=up:active,1=pf-us1-dfs1=up:active}, 1 up:standby
>osd: 10 osds: 10 up, 10 in; 189 remapped pgs
>rgw: 1 daemon active
>
>  data:
>pools:   7 pools, 416 pgs
>objects: 169.5 M objects, 3.6 TiB
>usage:   39 TiB used, 34 TiB / 73 TiB avail
>pgs: 2/508449063 objects degraded (0.000%)
> 197313040/508449063 objects misplaced (38.807%)
> 224 active+clean
> 168 active+remapped+backfill_wait
> 16  active+remapped+backfill_wait+backfill_toofull
> 5   active+remapped+backfilling
> 2   active+recovery_toofull+degraded
> 1   active+recovery_toofull
>
>  io:
>recovery: 1.1 MiB/s, 31 objects/s
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Yoann Moulin
Hello,

> Hi guys, I need your help.
> I'm new with Cephfs and we started using it as file storage.
> Today we are getting no space left on device but I'm seeing that we have 
> plenty space on the filesystem.
> Filesystem              Size  Used Avail Use% Mounted on
> 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T   
> 39T   35T  54% /mnt/cephfs
> 
> We have 35TB of disk space. I've added 2 additional OSD disks with 7TB each 
> but I'm getting the error "No space left on device" every time that
> I want to add a new file.
> After adding the 2 additional OSD disks I'm seeing that the load is being
> distributed among the cluster.
> Please I need your help.

Could you give us the output of

ceph osd df
ceph osd pool ls detail
ceph osd tree

Best regards,

-- 
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Kevin Olbrich
Looks like the same problem as mine:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html

The free space shown is the cluster-wide total, while Ceph is effectively
limited by the OSD with the least free space (the worst OSD).
Please check your (re-)weights.
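
The weights and reweights show up in the WEIGHT/REWEIGHT columns of:

ceph osd df
ceph osd tree

If a single OSD is much fuller than the rest, "ceph osd reweight-by-utilization"
(or a manual "ceph osd reweight <osd-id> 0.9", both standard commands) can push
PGs away from it, but be careful: both of those move data around.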

Kevin

On Tue, 8 Jan 2019 at 14:32, Rodrigo Embeita wrote:
>
> Hi guys, I need your help.
> I'm new with Cephfs and we started using it as file storage.
> Today we are getting no space left on device but I'm seeing that we have 
> plenty space on the filesystem.
> Filesystem  Size  Used Avail Use% Mounted on
> 192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T   
> 39T   35T  54% /mnt/cephfs
>
> We have 35TB of disk space. I've added 2 additional OSD disks with 7TB each 
> but I'm getting the error "No space left on device" every time that I want to 
> add a new file.
> After adding the 2 additional OSD disks I'm seeing that the load is being
> distributed among the cluster.
> Please I need your help.
>
> root@pf-us1-dfs1:/etc/ceph# ceph -s
>  cluster:
>id: 609e9313-bdd3-449e-a23f-3db8382e71fb
>health: HEALTH_ERR
>2 backfillfull osd(s)
>1 full osd(s)
>7 pool(s) full
>197313040/508449063 objects misplaced (38.807%)
>Degraded data redundancy: 2/508449063 objects degraded (0.000%), 2 
> pgs degraded
>Degraded data redundancy (low space): 16 pgs backfill_toofull, 3 
> pgs recovery_toofull
>
>  services:
>mon: 3 daemons, quorum pf-us1-dfs2,pf-us1-dfs1,pf-us1-dfs3
>mgr: pf-us1-dfs3(active), standbys: pf-us1-dfs2
>mds: pagefs-2/2/2 up  {0=pf-us1-dfs3=up:active,1=pf-us1-dfs1=up:active}, 1 
> up:standby
>osd: 10 osds: 10 up, 10 in; 189 remapped pgs
>rgw: 1 daemon active
>
>  data:
>pools:   7 pools, 416 pgs
>objects: 169.5 M objects, 3.6 TiB
>usage:   39 TiB used, 34 TiB / 73 TiB avail
>pgs: 2/508449063 objects degraded (0.000%)
> 197313040/508449063 objects misplaced (38.807%)
> 224 active+clean
> 168 active+remapped+backfill_wait
> 16  active+remapped+backfill_wait+backfill_toofull
> 5   active+remapped+backfilling
> 2   active+recovery_toofull+degraded
> 1   active+recovery_toofull
>
>  io:
>recovery: 1.1 MiB/s, 31 objects/s
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Problem with CephFS - No space left on device

2019-01-08 Thread Rodrigo Embeita
Hi guys, I need your help.
I'm new with Cephfs and we started using it as file storage.
Today we are getting "No space left on device" errors, but I'm seeing that we
have plenty of space on the filesystem.
Filesystem  Size  Used Avail Use% Mounted on
192.168.51.8,192.168.51.6,192.168.51.118:6789:/pagefreezer/smhosts   73T
  39T   35T  54% /mnt/cephfs

We have 35 TB of disk space. I've added 2 additional OSD disks with 7 TB each,
but I'm still getting the error "No space left on device" every time I want
to add a new file.
After adding the 2 additional OSD disks I'm seeing that the load is being
distributed among the cluster.
Please, I need your help.

root@pf-us1-dfs1:/etc/ceph# ceph -s
 cluster:
   id: 609e9313-bdd3-449e-a23f-3db8382e71fb
   health: HEALTH_ERR
   2 backfillfull osd(s)
   1 full osd(s)
   7 pool(s) full
   197313040/508449063 objects misplaced (38.807%)
   Degraded data redundancy: 2/508449063 objects degraded (0.000%),
2 pgs degraded
   Degraded data redundancy (low space): 16 pgs backfill_toofull, 3
pgs recovery_toofull

 services:
   mon: 3 daemons, quorum pf-us1-dfs2,pf-us1-dfs1,pf-us1-dfs3
   mgr: pf-us1-dfs3(active), standbys: pf-us1-dfs2
   mds: pagefs-2/2/2 up  {0=pf-us1-dfs3=up:active,1=pf-us1-dfs1=up:active},
1 up:standby
   osd: 10 osds: 10 up, 10 in; 189 remapped pgs
   rgw: 1 daemon active

 data:
   pools:   7 pools, 416 pgs
   objects: 169.5 M objects, 3.6 TiB
   usage:   39 TiB used, 34 TiB / 73 TiB avail
   pgs: 2/508449063 objects degraded (0.000%)
197313040/508449063 objects misplaced (38.807%)
224 active+clean
168 active+remapped+backfill_wait
16  active+remapped+backfill_wait+backfill_toofull
5   active+remapped+backfilling
2   active+recovery_toofull+degraded
1   active+recovery_toofull

 io:
   recovery: 1.1 MiB/s, 31 objects/s
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS

2018-11-23 Thread Rodrigo Embeita
Hi Daniel, thanks a lot for your help.
Do you know how I can recover the data in this scenario, given that I lost
1 node with 6 OSDs?
My configuration had 12 OSDs (6 per host).

Regards

On Wed, Nov 21, 2018 at 3:16 PM Daniel Baumann 
wrote:

> Hi,
>
> On 11/21/2018 07:04 PM, Rodrigo Embeita wrote:
> > Reduced data availability: 7 pgs inactive, 7 pgs down
>
> this is your first problem: unless you have all data available again,
> cephfs will not be back.
>
> after that, I would take care about the redundancy next, and get the one
> missing monitor back online.
>
> once that is done, get the mds working again and your cephfs should be
> back in service.
>
> if you encounter problems with any of the steps, send all the necessary
> commands and outputs to the list and I (or others) can try to help.
>
> Regards,
> Daniel
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problem with CephFS

2018-11-21 Thread Daniel Baumann
Hi,

On 11/21/2018 07:04 PM, Rodrigo Embeita wrote:
>             Reduced data availability: 7 pgs inactive, 7 pgs down

this is your first problem: unless you have all data available again,
cephfs will not be back.

after that, I would take care of the redundancy next, and get the one
missing monitor back online.

once that is done, get the mds working again and your cephfs should be
back in service.

if you encounter problems with any of the steps, send all the necessary
commands and outputs to the list and I (or others) can try to help.
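
typically that means at least the output of:

ceph -s
ceph health detail
ceph osd tree
ceph mds stat

plus the relevant mon/mds log excerpts.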

Regards,
Daniel
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Problem with CephFS

2018-11-21 Thread Rodrigo Embeita
Hi guys, maybe someone can help me.
I'm new to CephFS and I was testing the installation of Ceph Mimic with
ceph-deploy on 2 Ubuntu 16.04 nodes.
These two nodes have 6 OSD disks each.
I've installed CephFS and 2 MDS services.
The problem is that I copied a lot of data (15 million small files) and then
I lost one of these 2 nodes.
The lost node was running the active MDS service, which is supposed to fail
over to the other Ceph host, but the MDS got stuck in the rejoin state.
As a result the cluster seems to be down and I'm not able to connect to CephFS.

root@pf-us1-dfs1:/var/log/ceph# ceph status
  cluster:
id: 459cdedc-488e-49ed-8b16-36cf843cef76
health: HEALTH_WARN
1 filesystem is degraded
1 MDSs report slow metadata IOs
3 osds down
1 host (6 osds) down
5313/50445780 objects misplaced (0.011%)
Reduced data availability: 7 pgs inactive, 7 pgs down
Degraded data redundancy: 25192943/50445780 objects degraded
(49.941%), 265 pgs degraded, 283 pgs undersized
1/3 mons down, quorum pf-us1-dfs3,pf-us1-dfs1

  services:
mon: 3 daemons, quorum pf-us1-dfs3,pf-us1-dfs1, out of quorum:
pf-us1-dfs2
mgr: pf-us1-dfs3(active)
mds: cephfs-1/1/1 up  {0=pf-us1-dfs1=up:rejoin}, 1 up:standby
osd: 13 osds: 6 up, 9 in; 6 remapped pgs
rgw: 1 daemon active

  data:
pools:   7 pools, 296 pgs
objects: 25.22 M objects, 644 GiB
usage:   2.0 TiB used, 42 TiB / 44 TiB avail
pgs: 2.365% pgs not active
 25192943/50445780 objects degraded (49.941%)
 5313/50445780 objects misplaced (0.011%)
 265 active+undersized+degraded
 18  active+undersized
 7   down
 6   active+clean+remapped


And the MDS service has been writing the following to the log for over 14
hours without stopping.

2018-11-21 10:06:12.585 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18421 from mon.2
2018-11-21 10:06:16.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18422 from mon.2
2018-11-21 10:06:20.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18423 from mon.2
2018-11-21 10:06:24.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18424 from mon.2
2018-11-21 10:06:32.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18425 from mon.2
2018-11-21 10:06:36.594 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18426 from mon.2
2018-11-21 10:06:40.606 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18427 from mon.2
2018-11-21 10:06:44.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18428 from mon.2
2018-11-21 10:06:52.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18429 from mon.2
2018-11-21 10:06:56.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18430 from mon.2
2018-11-21 10:07:00.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18431 from mon.2
2018-11-21 10:07:04.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18432 from mon.2
2018-11-21 10:07:12.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18433 from mon.2
2018-11-21 10:07:16.602 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18434 from mon.2
2018-11-21 10:07:20.602 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18435 from mon.2
2018-11-21 10:07:24.586 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18436 from mon.2
2018-11-21 10:07:32.590 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18437 from mon.2
2018-11-21 10:07:36.614 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18438 from mon.2
2018-11-21 10:07:40.626 7f1b80873700  1 mds.pf-us1-dfs1 Updating MDS map to
version 18439 from mon.2

Please someone help me.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread John Spray

On 03/03/2015 14:07, Daniel Takatori Ohara wrote:

*$ls test-daniel-old/*
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../

*$rm -rf test-daniel-old/*
rm: cannot remove ‘test-daniel-old/’: Directory not empty

*$ls test-daniel-old/*
ls: cannot access 
test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such 
file or directory
ls: cannot access 
test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No 
such file or directory
ls: cannot access 
test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such 
file or directory
ls: cannot access 
test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No 
such file or directory
ls: cannot access 
test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such 
file or directory
ls: cannot access 
test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No 
such file or directory
ls: cannot access 
test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such 
file or directory
ls: cannot access 
test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No 
such file or directory

total 0
drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../
l? ? ?  ?   ?  ? 
M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ?  ? 
M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
You don't say what version of the client (version of kernel, if it's the 
kernel client) this is.  It would appear that the client thinks there 
are some dentries that don't really exist.  You should enable verbose 
debug logs (with fuse client, debug client = 20) and reproduce this.  
It looks like you had similar issues (subject: problem for remove files 
in cephfs) a while back, when Yan Zheng also advised you to get some 
debug logs.
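
(With the fuse client that is just a matter of adding, for example,

[client]
    debug client = 20

to ceph.conf on the client, or setting the equivalent option on the ceph-fuse
command line, and then reproducing the ls/rm.)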


John
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Gregory Farnum
On Tue, Mar 3, 2015 at 9:24 AM, John Spray john.sp...@redhat.com wrote:
 On 03/03/2015 14:07, Daniel Takatori Ohara wrote:

 $ls test-daniel-old/
 total 0
 drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
 drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../

 $rm -rf test-daniel-old/
 rm: cannot remove ‘test-daniel-old/’: Directory not empty

 $ls test-daniel-old/
 ls: cannot access
 test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
 or directory
 ls: cannot access
 test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
 file or directory
 ls: cannot access
 test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
 or directory
 ls: cannot access
 test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
 file or directory
 ls: cannot access
 test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
 or directory
 ls: cannot access
 test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
 file or directory
 ls: cannot access
 test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
 or directory
 ls: cannot access
 test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
 file or directory
 total 0
 drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
 drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../
 l? ? ?  ?   ??
 M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam
 l? ? ?  ?   ??
 M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam

 You don't say what version of the client (version of kernel, if it's the
 kernel client) this is.  It would appear that the client thinks there are
 some dentries that don't really exist.  You should enable verbose debug logs
 (with fuse client, debug client = 20) and reproduce this.  It looks like
 you had similar issues (subject: problem for remove files in cephfs) a
 while back, when Yan Zheng also advised you to get some debug logs.

In particular this is a known bug in older kernels and is fixed in new
enough ones. Unfortunately I don't have the bug link handy though. :(
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Daniel Takatori Ohara
Hi John and Gregory,

The version of ceph client is 0.87 and the kernel is 3.13.

The debug logs are attached.

I have seen this problem reported with an older kernel, but I didn't find the
solution in the tracker.

Thanks,

Att.

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060
http://www.bioinfo.mochsl.org.br


On Tue, Mar 3, 2015 at 2:26 PM, Gregory Farnum g...@gregs42.com wrote:

 On Tue, Mar 3, 2015 at 9:24 AM, John Spray john.sp...@redhat.com wrote:
  On 03/03/2015 14:07, Daniel Takatori Ohara wrote:
 
  $ls test-daniel-old/
  total 0
  drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
  drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../
 
  $rm -rf test-daniel-old/
  rm: cannot remove ‘test-daniel-old/’: Directory not empty
 
  $ls test-daniel-old/
  ls: cannot access
  test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such
 file
  or directory
  ls: cannot access
  test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
  file or directory
  ls: cannot access
  test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such
 file
  or directory
  ls: cannot access
  test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
  file or directory
  ls: cannot access
  test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such
 file
  or directory
  ls: cannot access
  test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
  file or directory
  ls: cannot access
  test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such
 file
  or directory
  ls: cannot access
  test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
  file or directory
  total 0
  drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
  drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../
  l? ? ?  ?   ??
  M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam
  l? ? ?  ?   ??
  M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
 
  You don't say what version of the client (version of kernel, if it's the
  kernel client) this is.  It would appear that the client thinks there are
  some dentries that don't really exist.  You should enable verbose debug
 logs
  (with fuse client, debug client = 20) and reproduce this.  It looks
 like
  you had similar issues (subject: problem for remove files in cephfs) a
  while back, when Yan Zheng also advised you to get some debug logs.

 In particular this is a known bug in older kernels and is fixed in new
 enough ones. Unfortunately I don't have the bug link handy though. :(
 -Greg



log_mds.gz
Description: GNU Zip compressed data
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] problem in cephfs for remove empty directory

2015-03-03 Thread Daniel Takatori Ohara
Hi,

I have a problem when I try to remove an empty directory in CephFS. The
directory is empty, but it seems to still have broken file entries in the MDS.

*$ls test-daniel-old/*
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../

*$rm -rf test-daniel-old/*
rm: cannot remove ‘test-daniel-old/’: Directory not empty

*$ls test-daniel-old/*
ls: cannot access
test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
or directory
ls: cannot access
test-daniel-old/M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
file or directory
ls: cannot access
test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
or directory
ls: cannot access
test-daniel-old/M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
file or directory
ls: cannot access
test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
or directory
ls: cannot access
test-daniel-old/M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
file or directory
ls: cannot access
test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam: No such file
or directory
ls: cannot access
test-daniel-old/M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam: No such
file or directory
total 0
drwx-- 1 rmagalhaes BioInfoHSL Users0 Mar  2 10:52 ./
drwx-- 1 rmagalhaes BioInfoHSL Users 773099838313 Mar  2 11:41 ../
l? ? ?  ?   ??
M_S8_L001_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L001_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L002_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L002_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L003_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L003_R1-2_001.fastq.gz_sylvio.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L004_R1-2_001.fastq.gz_ref.sam_fixed.bam
l? ? ?  ?   ??
M_S8_L004_R1-2_001.fastq.gz_sylvio.sam_fixed.bam


Att.

---
Daniel Takatori Ohara.
System Administrator - Lab. of Bioinformatics
Molecular Oncology Center
Instituto Sírio-Libanês de Ensino e Pesquisa
Hospital Sírio-Libanês
Phone: +55 11 3155-0200 (extension 1927)
R: Cel. Nicolau dos Santos, 69
São Paulo-SP. 01308-060
http://www.bioinfo.mochsl.org.br
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com