Re: [ceph-users] Omap issues - metadata creating too many

2019-01-03 Thread Josef Zelenka
Hi, I had the default - so it was on (according to the Ceph KB). I turned it 
off, but the issue persists. I noticed Bryan Stillwell (cc-ing him) had 
the same issue (he reported it yesterday) - I tried his tips about 
compacting, but they don't do anything. However, I have to add to his 
last point: this happens even with BlueStore. Is there anything we can 
do to clean up the omap manually?
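A rough way to see which bucket index objects actually hold the omap keys, as a 
starting point (just a sketch - the pool name default.rgw.buckets.index is an 
assumption, substitute whatever index pool your zone uses):

# count omap keys per index object, biggest offenders last
for obj in $(rados -p default.rgw.buckets.index ls); do
    echo "$(rados -p default.rgw.buckets.index listomapkeys "$obj" | wc -l) $obj"
done | sort -n | tail -20

Index objects whose bucket instance marker no longer shows up in radosgw-admin 
bucket stats are likely leftovers (e.g. from aborted resharding) and would be 
the candidates for manual cleanup.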


Josef

On 18/12/2018 23:19, J. Eric Ivancich wrote:

On 12/17/18 9:18 AM, Josef Zelenka wrote:

Hi everyone, I'm running a Luminous 12.2.5 cluster with 6 hosts on
Ubuntu 16.04 - 12 HDDs for data each, plus 2 SSD metadata OSDs (three
nodes have an additional SSD I added to have more space to rebalance the
metadata). Currently, the cluster is used mainly as radosgw storage,
with 28 TB of data in total and 2x replication for both the metadata and data
pools (a CephFS instance is running alongside it, but I don't think
it's the perpetrator - this likely happened before we had it). All
pools aside from the CephFS data pool and the radosgw data pool
are located on the SSDs. Now, the interesting thing: at random
times, the metadata OSDs fill up their entire capacity with OMAP data
and go into r/o mode, and we currently have no option other than deleting
and re-creating them. The fill-up comes at a random time, doesn't seem
to be triggered by anything and isn't caused by some data influx. It
seems like some kind of a bug to me, to be honest, but I'm not certain -
has anyone else seen this behavior with their radosgw? Thanks a lot

Hi Josef,

Do you have rgw_dynamic_resharding turned on? Try turning it off and see
if the behavior continues.

One theory is that dynamic resharding is triggered and possibly not
completing. This could add a lot of data to omap for the incomplete
bucket index shards. After a delay it tries resharding again, possibly
failing again, and adding more data to the omap. This continues.

If this is the ultimate issue we have some commits on the upstream
luminous branch that are designed to address this set of issues.

But we should first see if this is the cause.
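If you want to rule it out on Luminous 12.2.x, something along these lines 
should do (a sketch - the [client.rgw.<instance>] section name is a placeholder 
for whatever your rgw instances are called, and the radosgw services need a 
restart afterwards):

# ceph.conf on the rgw hosts
[client.rgw.<instance>]
    rgw dynamic resharding = false

# check whether any reshard jobs are queued or stuck
radosgw-admin reshard list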

Eric

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Omap issues - metadata creating too many

2018-12-17 Thread Josef Zelenka
Hi everyone, I'm running a Luminous 12.2.5 cluster with 6 hosts on 
Ubuntu 16.04 - 12 HDDs for data each, plus 2 SSD metadata OSDs (three 
nodes have an additional SSD I added to have more space to rebalance the 
metadata). Currently, the cluster is used mainly as radosgw storage, 
with 28 TB of data in total and 2x replication for both the metadata and data 
pools (a CephFS instance is running alongside it, but I don't think 
it's the perpetrator - this likely happened before we had it). All 
pools aside from the CephFS data pool and the radosgw data pool 
are located on the SSDs. Now, the interesting thing: at random 
times, the metadata OSDs fill up their entire capacity with OMAP data 
and go into r/o mode, and we currently have no option other than deleting 
and re-creating them. The fill-up comes at a random time, doesn't seem 
to be triggered by anything and isn't caused by some data influx. It 
seems like some kind of a bug to me, to be honest, but I'm not certain - 
has anyone else seen this behavior with their radosgw? Thanks a lot


Josef Zelenka

Cloudevelops

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pgs incomplete and inactive

2018-08-27 Thread Josef Zelenka
The full ratio was ignored, which is most likely why that happened. I 
can't delete PGs, because it's only KBs worth of space - the OSD is 
40 GB and 39.8 GB is taken up by omap - that's why I can't move/extract. Any 
clue on how to compact or move away the omap dir?
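Two things that can be tried with the OSD stopped, for a Filestore OSD whose 
omap has filled the disk (a sketch, not a guaranteed fix - paths assume the 
usual /var/lib/ceph layout, osd.5 from the log above, and a spare disk mounted 
at /mnt/spare; the omap backend here looks like rocksdb judging by the .sst 
files):

# 1) offline compaction of the omap store
ceph-kvstore-tool rocksdb /var/lib/ceph/osd/ceph-5/current/omap compact

# 2) or move the omap dir to a bigger device and symlink it back
mv /var/lib/ceph/osd/ceph-5/current/omap /mnt/spare/omap.osd5
ln -s /mnt/spare/omap.osd5 /var/lib/ceph/osd/ceph-5/current/omap
chown -h ceph:ceph /var/lib/ceph/osd/ceph-5/current/omap

Note that compaction itself needs some free space to write new files, so on a 
completely full disk option 2 may have to come first.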




On 27/08/18 12:34, Paul Emmerich wrote:

Don't ever let an OSD run 100% full, that's usually bad news.
Two ways to salvage this:

1. You can try to extract the PGs with ceph-objectstore-tool and
inject them into another OSD; Ceph will find them and recover
2. You seem to be using Filestore, so you should easily be able to
just delete a whole PG on the full OSD's file system to make space
(preferably one that is already recovered and active+clean even
without the dead OSD)
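For option 1, the commands are roughly the following on a Filestore OSD (only a 
sketch - the OSD ids, the pg id 14.6 and the dump path are made up; both OSDs 
have to be stopped while you run this, and the export needs free space on some 
other disk, which is exactly what the full OSD lacks):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-5 \
    --journal-path /var/lib/ceph/osd/ceph-5/journal \
    --pgid 14.6 --op export --file /mnt/spare/pg.14.6.export

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 \
    --journal-path /var/lib/ceph/osd/ceph-12/journal \
    --op import --file /mnt/spare/pg.14.6.export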


Paul

2018-08-27 10:44 GMT+02:00 Josef Zelenka :

Hi, i've had a very ugly thing happen to me over the weekend. Some of my
OSDs in a root that handles metadata pools overflowed to 100% disk usage due
to omap size(even though i had 97% full ratio, which is odd) and refused to
start. There were some pgs on those OSDs that went away with them. I have
tried compacting the omap, moving files away etc, but nothing  - i can't
export the pgs, i get errors like this:

2018-08-27 04:42:33.436182 7fcb53382580  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1535359353436170, "job": 1, "event": "recovery_started",
"log_files": [5504, 5507]}
2018-08-27 04:42:33.436194 7fcb53382580  4 rocksdb:
[/build/ceph-12.2.5/src/rocksdb/db/db_impl_open.cc:482] Recovering log #5504
mode 2
2018-08-27 04:42:35.422502 7fcb53382580  4 rocksdb:
[/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling all
background work
2018-08-27 04:42:35.431613 7fcb53382580  4 rocksdb:
[/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-08-27 04:42:35.431716 7fcb53382580 -1 rocksdb: IO error: No space left
on device/var/lib/ceph/osd/ceph-5//current/omap/005507.sst: No space left on
device
Mount failed with '(1) Operation not permitted'
2018-08-27 04:42:35.432945 7fcb53382580 -1
filestore(/var/lib/ceph/osd/ceph-5/) mount(1723): Error initializing rocksdb
:

I decided to take the loss and mark the osds as lost and remove them from
the cluster, however, it left 4 pgs hanging in incomplete + inactive state,
which apparently prevents my radosgw from starting. Is there another way to
export/import the pgs into their new osds/recreate them? I'm running
Luminous 12.2.5 on Ubuntu 16.04.

Thanks

Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] pgs incomplete and inactive

2018-08-27 Thread Josef Zelenka
Hi, i've had a very ugly thing happen to me over the weekend. Some of 
my  OSDs in a root that handles metadata pools overflowed to 100% disk 
usage due to omap size(even though i had 97% full ratio, which is odd) 
and refused to start. There were some pgs on those OSDs that went away 
with them. I have tried compacting the omap, moving files away etc, but 
nothing  - i can't export the pgs, i get errors like this:


2018-08-27 04:42:33.436182 7fcb53382580  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1535359353436170, "job": 1, "event": "recovery_started", 
"log_files": [5504, 5507]}
2018-08-27 04:42:33.436194 7fcb53382580  4 rocksdb: 
[/build/ceph-12.2.5/src/rocksdb/db/db_impl_open.cc:482] Recovering log 
#5504 mode 2
2018-08-27 04:42:35.422502 7fcb53382580  4 rocksdb: 
[/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:217] Shutdown: canceling 
all background work
2018-08-27 04:42:35.431613 7fcb53382580  4 rocksdb: 
[/build/ceph-12.2.5/src/rocksdb/db/db_impl.cc:343] Shutdown complete
2018-08-27 04:42:35.431716 7fcb53382580 -1 rocksdb: IO error: No space 
left on device/var/lib/ceph/osd/ceph-5//current/omap/005507.sst: No 
space left on device

Mount failed with '(1) Operation not permitted'
2018-08-27 04:42:35.432945 7fcb53382580 -1 
filestore(/var/lib/ceph/osd/ceph-5/) mount(1723): Error initializing 
rocksdb :


I decided to take the loss and mark the osds as lost and remove them 
from the cluster, however, it left 4 pgs hanging in incomplete + 
inactive state, which apparently prevents my radosgw from starting. Is 
there another way to export/import the pgs into their new osds/recreate 
them? I'm running Luminous 12.2.5 on Ubuntu 16.04.


Thanks

Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD had suicide timed out

2018-08-09 Thread Josef Zelenka
The only reason I can think of is some kind of a network issue, 
even though different clusters run on the same switch with the same 
settings and we don't register any issues there. One thing I recall - 
one of my colleagues was testing something out on this cluster and after 
he finished, he deleted a big (few million objects) bucket. Is it 
possible that there are some orphaned files from that action that break 
our OSDs somehow? I can't think of anything else.
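If you want to check for leftovers from that deleted bucket, radosgw-admin has 
an orphan scan (a sketch - the pool name and job id below are placeholders; the 
scan only builds a list, it does not delete anything by itself):

radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphan-check
radosgw-admin orphans list-jobs
radosgw-admin orphans finish --job-id=orphan-check   # removes the scan's own bookkeeping when done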


Josef


On 09/08/18 04:07, Brad Hubbard wrote:

If, in the above case, osd 13 was not too busy to respond (resource
shortage) then you need to find out why else osd 5, etc. could not
contact it.

On Wed, Aug 8, 2018 at 6:47 PM, Josef Zelenka
 wrote:

Checked the system load on the host with the OSD that is suiciding currently
and it's fine, however i can see a noticeably higher IO (around 700), though
that seems rather like a symptom of the constant flapping/attempting to come
up to me(it's an SSD based Ceph so this shouldn't cause much harm to it).
Had a look at one of the osds sending the you_died messages and it seems
it's attempting to contact osd.13, but ultimately fails.

8/0 13574/13574/13574) [5,11] r=0 lpr=13574 crt=13592'3654839 lcod
13592'3654838 mlcod 13592'3654838 active+clean] publish_stats_to_osd
13593:9552151
2018-08-08 10:45:16.112344 7effa1d8c700 15 osd.5 pg_epoch: 13593 pg[14.6( v
13592'3654839 (13294'3653334,13592'3654839] local-lis/les=13574/13575 n=945
ec=126/126 lis/c 13574/13574 les/c/f 13575/13578/0 13574/13574/13574) [5,11]
r=0 lpr=13574 crt=13592'3654839 lcod 13592'3654838 mlcod 13592'3654838
active+clean] publish_stats_to_osd 13593:9552152
2018-08-08 10:45:16.679484 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
active+clean] publish_stats_to_osd 13593:2966967
2018-08-08 10:45:17.818135 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
osd.13 10.12.125.3:0/735946 18  osd_ping(ping e13589 stamp 2018-08-08
10:45:17.817238) v4  2004+0+0 (4218069135 0 0) 0x55bb638ba800 con
0x55bb65e79800
2018-08-08 10:45:17.818176 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
osd.13 10.12.3.17:0/735946 18  osd_ping(ping e13589 stamp 2018-08-08
10:45:17.817238) v4  2004+0+0 (4218069135 0 0) 0x55bb63cd8c00 con
0x55bb65e7b000
2018-08-08 10:45:18.919053 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
osd.13 10.12.125.3:0/735946 19  osd_ping(ping e13589 stamp 2018-08-08
10:45:18.918149) v4  2004+0+0 (1428835292 0 0) 0x55bb638bb200 con
0x55bb65e79800
2018-08-08 10:45:18.919598 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
osd.13 10.12.3.17:0/735946 19  osd_ping(ping e13589 stamp 2018-08-08
10:45:18.918149) v4  2004+0+0 (1428835292 0 0) 0x55bb63cd8a00 con
0x55bb65e7b000
2018-08-08 10:45:21.679563 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
active+clean] publish_stats_to_osd 13593:2966968
2018-08-08 10:45:23.020715 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
osd.13 10.12.125.3:0/735946 20  osd_ping(ping e13589 stamp 2018-08-08
10:45:23.018994) v4  2004+0+0 (1018071233 0 0) 0x55bb63bb7200 con
0x55bb65e79800
2018-08-08 10:45:23.020837 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
osd.13 10.12.3.17:0/735946 20  osd_ping(ping e13589 stamp 2018-08-08
10:45:23.018994) v4  2004+0+0 (1018071233 0 0) 0x55bb63cd8c00 con
0x55bb65e7b000
2018-08-08 10:45:26.679513 7eff8e565700 15 osd.5 pg_epoch: 13593 pg[11.15( v
13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
active+clean] publish_stats_to_osd 13593:2966969
2018-08-08 10:45:28.921091 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
osd.13 10.12.125.3:0/735946 21  osd_ping(ping e13589 stamp 2018-08-08
10:45:28.920140) v4  2004+0+0 (2459835898 0 0) 0x55bb638ba800 con
0x55bb65e79800
2018-08-08 10:45:28.922026 7effb9da5700  1 -- 10.12.3.15:6809/1319081 <==
osd.13 10.12.3.17:0/735946 21  osd_ping(ping e13589 stamp 2018-08-08
10:45:28.920140) v4  2004+0+0 (2459835898 0 0) 0x55bb63cd8c00 con
0x55bb65e7b000
2018-08-08 10:45:31.679828 7eff9a57d700 15 osd.5 pg_epoch: 13593 pg[11.15( v
13575'34486 (9987'32956,13575'34486] local-lis/les=13574/13575 n=1
ec=115/115 lis/c 13574/13574 les/c/f 13575/13575/0 13574/13574/13574) [5,10]
r=0 lpr=13574 crt=13572'34485 lcod 13572'34485 mlcod 13572'34485
active+clean] publish_stats_to_osd 13593:2966970
2018-08-08 10:45:33.022697 7effb95a4700  1 -- 10.12.125.1:6803/1319081 <==
osd.13 10.12.125.3:0/7

Re: [ceph-users] OSD had suicide timed out

2018-08-08 Thread Josef Zelenka
Thank you for your suggestion - I tried it, and it really seems like the other 
OSDs think the OSD is dead (if I understand this right), however the 
networking seems absolutely fine between the nodes (no issues in graphs 
etc.).


   -13> 2018-08-08 09:13:58.466119 7fe053d41700  1 -- 
10.12.3.17:0/706864 <== osd.12 10.12.3.17:6807/4624236 81  
osd_ping(ping_reply e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (687351303 0 0) 0x55731eb73e00 con 0x55731e7d4800
   -12> 2018-08-08 09:13:58.466140 7fe054542700  1 -- 
10.12.3.17:0/706864 <== osd.11 10.12.3.16:6812/19232 81  
osd_ping(ping_reply e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (687351303 0 0) 0x55733c391200 con 0x55731e7a5800
   -11> 2018-08-08 09:13:58.466147 7fe053540700  1 -- 
10.12.125.3:0/706864 <== osd.11 10.12.125.2:6811/19232 82  
osd_ping(you_died e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (3111562112 0 0) 0x55731eb66800 con 0x55731e7a4000
   -10> 2018-08-08 09:13:58.466164 7fe054542700  1 -- 
10.12.3.17:0/706864 <== osd.11 10.12.3.16:6812/19232 82  
osd_ping(you_died e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (3111562112 0 0) 0x55733c391200 con 0x55731e7a5800
    -9> 2018-08-08 09:13:58.466164 7fe053d41700  1 -- 
10.12.3.17:0/706864 <== osd.12 10.12.3.17:6807/4624236 82  
osd_ping(you_died e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (3111562112 0 0) 0x55731eb73e00 con 0x55731e7d4800
    -8> 2018-08-08 09:13:58.466176 7fe053540700  1 -- 
10.12.3.17:0/706864 <== osd.9 10.12.3.16:6813/10016600 81  
osd_ping(ping_reply e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (687351303 0 0) 0x55731eb66800 con 0x55731e732000
    -7> 2018-08-08 09:13:58.466200 7fe053d41700  1 -- 
10.12.3.17:0/706864 <== osd.10 10.12.3.16:6810/2017908 81  
osd_ping(ping_reply e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (687351303 0 0) 0x55731eb73e00 con 0x55731e796800
    -6> 2018-08-08 09:13:58.466208 7fe053540700  1 -- 
10.12.3.17:0/706864 <== osd.9 10.12.3.16:6813/10016600 82  
osd_ping(you_died e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (3111562112 0 0) 0x55731eb66800 con 0x55731e732000
    -5> 2018-08-08 09:13:58.466222 7fe053d41700  1 -- 
10.12.3.17:0/706864 <== osd.10 10.12.3.16:6810/2017908 82  
osd_ping(you_died e13452 stamp 2018-08-08 09:13:58.464608) v4  
2004+0+0 (3111562112 0 0) 0x55731eb73e00 con 0x55731e796800
    -4> 2018-08-08 09:13:59.748336 7fe040531700  1 -- 
10.12.3.17:6802/706864 --> 10.12.3.16:6800/1677830 -- 
mgrreport(unknown.13 +0-0 packed 742 osd_metrics=1) v5 -- 0x55731fa4af00 
con 0
    -3> 2018-08-08 09:13:59.748538 7fe040531700  1 -- 
10.12.3.17:6802/706864 --> 10.12.3.16:6800/1677830 -- pg_stats(64 pgs 
tid 0 v 0) v1 -- 0x55733cbf4c00 con 0
    -2> 2018-08-08 09:14:00.953804 7fe0525a1700  1 heartbeat_map 
is_healthy 'OSD::peering_tp thread 0x7fe03f52f700' had timed out after 15
    -1> 2018-08-08 09:14:00.953857 7fe0525a1700  1 heartbeat_map 
is_healthy 'OSD::peering_tp thread 0x7fe03f52f700' had suicide timed out 
after 150
 0> 2018-08-08 09:14:00.970742 7fe03f52f700 -1 *** Caught signal 
(Aborted) **



Could it be that the suiciding OSDs are rejecting the ping somehow? I'm 
quite confused as to what's really going on here; it seems completely 
random to me.
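One way to get more detail while an OSD is flapping, along the lines of Brad's 
earlier suggestion (a sketch - osd.13 is only the example id from the log 
above; since the daemon keeps restarting, putting the same options into 
ceph.conf under [osd] may be more reliable than injectargs):

ceph tell osd.13 injectargs '--debug_ms 1 --debug_osd 10'

# then watch both ends of the heartbeat exchange
tail -f /var/log/ceph/ceph-osd.13.log | grep -E 'osd_ping|you_died|heartbeat_map'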


On 08/08/18 01:51, Brad Hubbard wrote:

Try to work out why the other osds are saying this one is down. Is it
because this osd is too busy to respond or something else.

debug_ms = 1 will show you some message debugging which may help.

On Tue, Aug 7, 2018 at 10:34 PM, Josef Zelenka
 wrote:

To follow up, I did some further digging with debug_osd=20/20 and it appears
as if there's no traffic to the OSD, even though it comes UP for the cluster
(this started happening on another OSD in the cluster today, same stuff):

-27> 2018-08-07 14:10:55.146531 7f9fce3cd700 10 osd.0 12560
handle_osd_ping osd.17 10.12.3.17:6811/19661 says i am down in 12566
-26> 2018-08-07 14:10:55.146542 7f9fcebce700 10 osd.0 12560
handle_osd_ping osd.12 10.12.125.3:6807/4624236 says i am down in 12566
-25> 2018-08-07 14:10:55.146551 7f9fcf3cf700 10 osd.0 12560
handle_osd_ping osd.13 10.12.3.17:6805/186262 says i am down in 12566
-24> 2018-08-07 14:10:55.146564 7f9fce3cd700 20 osd.0 12559
share_map_peer 0x56308a9d already has epoch 12566
-23> 2018-08-07 14:10:55.146576 7f9fcebce700 20 osd.0 12559
share_map_peer 0x56308abb9800 already has epoch 12566
-22> 2018-08-07 14:10:55.146590 7f9fcf3cf700 20 osd.0 12559
share_map_peer 0x56308abb1000 already has epoch 12566
-21> 2018-08-07 14:10:55.146600 7f9fce3cd700 10 osd.0 12560
handle_osd_ping osd.15 10.12.125.3:6813/49064793 says i am down in 12566
-20> 2018-08-07 14:10:55.146609 7f9fcebce700 10 osd.0 12560
handle_osd_ping osd.16 10.12.3.17:6801/1018363 says i am down in 12566
-19> 20

Re: [ceph-users] OSD had suicide timed out

2018-08-07 Thread Josef Zelenka
To follow up, I did some further digging with debug_osd=20/20 and it 
appears as if there's no traffic to the OSD, even though it comes UP for 
the cluster (this started happening on another OSD in the cluster today, 
same stuff):


   -27> 2018-08-07 14:10:55.146531 7f9fce3cd700 10 osd.0 12560 
handle_osd_ping osd.17 10.12.3.17:6811/19661 says i am down in 12566
   -26> 2018-08-07 14:10:55.146542 7f9fcebce700 10 osd.0 12560 
handle_osd_ping osd.12 10.12.125.3:6807/4624236 says i am down in 12566
   -25> 2018-08-07 14:10:55.146551 7f9fcf3cf700 10 osd.0 12560 
handle_osd_ping osd.13 10.12.3.17:6805/186262 says i am down in 12566
   -24> 2018-08-07 14:10:55.146564 7f9fce3cd700 20 osd.0 12559 
share_map_peer 0x56308a9d already has epoch 12566
   -23> 2018-08-07 14:10:55.146576 7f9fcebce700 20 osd.0 12559 
share_map_peer 0x56308abb9800 already has epoch 12566
   -22> 2018-08-07 14:10:55.146590 7f9fcf3cf700 20 osd.0 12559 
share_map_peer 0x56308abb1000 already has epoch 12566
   -21> 2018-08-07 14:10:55.146600 7f9fce3cd700 10 osd.0 12560 
handle_osd_ping osd.15 10.12.125.3:6813/49064793 says i am down in 12566
   -20> 2018-08-07 14:10:55.146609 7f9fcebce700 10 osd.0 12560 
handle_osd_ping osd.16 10.12.3.17:6801/1018363 says i am down in 12566
   -19> 2018-08-07 14:10:55.146619 7f9fcf3cf700 10 osd.0 12560 
handle_osd_ping osd.11 10.12.3.16:6812/19232 says i am down in 12566
   -18> 2018-08-07 14:10:55.146643 7f9fcf3cf700 20 osd.0 12559 
share_map_peer 0x56308a9d already has epoch 12566
   -17> 2018-08-07 14:10:55.146653 7f9fcf3cf700 10 osd.0 12560 
handle_osd_ping osd.15 10.12.3.17:6812/49064793 says i am down in 12566
   -16> 2018-08-07 14:10:55.448468 7f9fcabdd700 10 osd.0 12560 
tick_without_osd_lock
   -15> 2018-08-07 14:10:55.448491 7f9fcabdd700 20 osd.0 12559 
can_inc_scrubs_pending 0 -> 1 (max 1, active 0)
   -14> 2018-08-07 14:10:55.448497 7f9fcabdd700 20 osd.0 12560 
scrub_time_permit should run between 0 - 24 now 14 = yes
   -13> 2018-08-07 14:10:55.448525 7f9fcabdd700 20 osd.0 12560 
scrub_load_below_threshold loadavg 2.31 < daily_loadavg 2.68855 and < 
15m avg 2.63 = yes
   -12> 2018-08-07 14:10:55.448535 7f9fcabdd700 20 osd.0 12560 
sched_scrub load_is_low=1
   -11> 2018-08-07 14:10:55.448555 7f9fcabdd700 10 osd.0 12560 
sched_scrub 15.112 scheduled at 2018-08-07 15:03:15.052952 > 2018-08-07 
14:10:55.448494
   -10> 2018-08-07 14:10:55.448563 7f9fcabdd700 20 osd.0 12560 
sched_scrub done
    -9> 2018-08-07 14:10:55.448565 7f9fcabdd700 10 osd.0 12559 
promote_throttle_recalibrate 0 attempts, promoted 0 objects and 0 bytes; 
target 25 obj/sec or 5120 k bytes/sec
    -8> 2018-08-07 14:10:55.448568 7f9fcabdd700 20 osd.0 12559 
promote_throttle_recalibrate  new_prob 1000
    -7> 2018-08-07 14:10:55.448569 7f9fcabdd700 10 osd.0 12559 
promote_throttle_recalibrate  actual 0, actual/prob ratio 1, adjusted 
new_prob 1000, prob 1000 -> 1000
    -6> 2018-08-07 14:10:55.507159 7f9faab9d700 20 osd.0 op_wq(5) 
_process empty q, waiting
    -5> 2018-08-07 14:10:55.812434 7f9fb5bb3700 20 osd.0 op_wq(7) 
_process empty q, waiting
    -4> 2018-08-07 14:10:56.236584 7f9fcd42e700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7f9fa7396700' had timed out after 60
    -3> 2018-08-07 14:10:56.236618 7f9fcd42e700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7f9fb33ae700' had timed out after 60
    -2> 2018-08-07 14:10:56.236621 7f9fcd42e700  1 heartbeat_map 
is_healthy 'OSD::peering_tp thread 0x7f9fba3bc700' had timed out after 15
    -1> 2018-08-07 14:10:56.236640 7f9fcd42e700  1 heartbeat_map 
is_healthy 'OSD::peering_tp thread 0x7f9fba3bc700' had suicide timed out 
after 150
 0> 2018-08-07 14:10:56.245420 7f9fba3bc700 -1 *** Caught signal 
(Aborted) **

 in thread 7f9fba3bc700 thread_name:tp_peering

The OSD cyclically crashes and comes back up. I tried modifying the 
recovery etc. timeouts, but no luck - the situation is still the same. 
Regarding the radosgw, across all nodes, after starting the rgw process, 
I only get this:


2018-08-07 14:32:17.852785 7f482dcaf700  2 
RGWDataChangesLog::ChangesRenewThread: start


I found this thread in the ceph mailing list 
(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-June/018956.html) 
but I'm not sure if this is the same thing(albeit, it's the same error), 
as I don't use s3 acls/expiration in my cluster(if it's set to a 
default, I'm not aware of it)




On 06/08/18 16:30, Josef Zelenka wrote:

Hi,

i'm running a cluster on Luminous(12.2.5), Ubuntu 16.04 - 
configuration is 3 nodes, 6 drives each(though i have encountered this 
on a different cluster, similar hardware, only the drives were HDD 
instead of SSD - same usage). I have recently seen a bug(?) where one 
of the OSDs suddenly spikes in iops and constantly restarts(trying to 
load the journal/filemap apparently) which renders the radosgw(primary 
usage of this cluster) unable t

[ceph-users] OSD had suicide timed out

2018-08-06 Thread Josef Zelenka

Hi,

i'm running a cluster on Luminous(12.2.5), Ubuntu 16.04 - configuration 
is 3 nodes, 6 drives each(though i have encountered this on a different 
cluster, similar hardware, only the drives were HDD instead of SSD - 
same usage). I have recently seen a bug(?) where one of the OSDs 
suddenly spikes in iops and constantly restarts(trying to load the 
journal/filemap apparently) which renders the radosgw(primary usage of 
this cluster) unable to write. The only thing that helps here is 
stopping the OSD, but that helps only until another one does the same 
thing. Any clue on the cause of this? Logs of the OSD when it crashes are 
below. Thanks


Josef

 -9920> 2018-08-06 12:12:10.588227 7f8e7afcb700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had timed out after 60
 -9919> 2018-08-06 12:12:10.607070 7f8e7a7ca700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had timed out after 60

--
    -1> 2018-08-06 14:12:52.428994 7f8e7982b700  1 heartbeat_map 
is_healthy 'OSD::osd_op_tp thread 0x7f8e56f9a700' had suicide timed out 
after 150
 0> 2018-08-06 14:12:52.432088 7f8e56f9a700 -1 *** Caught signal 
(Aborted) **

 in thread 7f8e56f9a700 thread_name:tp_osd_tp

 ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) 
luminous (stable)

 1: (()+0xa7cab4) [0x55868269aab4]
 2: (()+0x11390) [0x7f8e7e51d390]
 3: (()+0x1026d) [0x7f8e7e51c26d]
 4: (pthread_mutex_lock()+0x7d) [0x7f8e7e515dbd]
 5: (Mutex::Lock(bool)+0x49) [0x5586826bb899]
 6: (PG::lock(bool) const+0x33) [0x55868216ace3]
 7: (OSD::ShardedOpWQ::_process(unsigned int, 
ceph::heartbeat_handle_d*)+0x844) [0x558682101044]
 8: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) 
[0x5586826e27f4]

 9: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5586826e5830]
 10: (()+0x76ba) [0x7f8e7e5136ba]
 11: (clone()+0x6d) [0x7f8e7d58a41d]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.


--- logging levels ---
   0/ 5 none
   0/ 0 lockdep
   0/ 0 context
   0/ 0 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 0 buffer
   0/ 0 timer
   0/ 0 filer
   0/ 1 striper
   0/ 0 objecter
   0/ 0 rados
   0/ 0 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 0 journaler
   0/ 0 objectcacher
   0/ 0 client
   0/ 0 osd
   0/ 0 optracker
   0/ 0 objclass
   0/ 0 filestore
   0/ 0 journal
   0/ 0 ms
   0/ 0 mon
   0/ 0 monc
   0/ 0 paxos
   0/ 0 tp
   0/ 0 auth
   1/ 5 crypto
   0/ 0 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   0/ 0 perfcounter
   0/ 0 rgw
   1/10 civetweb
   1/ 5 javaclient
   0/ 0 asok
   0/ 0 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 kinetic
   1/ 5 fuse
   1/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent 1
  max_new 1000
  log_file /var/log/ceph/ceph-osd.7.log
--- end dump of recent events ---

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Best way to replace OSD

2018-08-06 Thread Josef Zelenka
Hi, our procedure is usually (assuming the cluster was OK before the 
failure, with 2 replicas as the crush rule):


1.Stop the OSD process(to keep it from coming up and down and putting 
load on the cluster)


2. Wait for the "Reweight" to come to 0(happens after 5 min i think - 
can be set manually but i let it happen by itself)


3. remove the osd from cluster(ceph auth del, ceph osd crush remove, 
ceph osd rm)


4. note down the journal partitions if needed

5. umount drive, replace the disk with new one

6. ensure permissions are set to ceph:ceph in /dev

7. mklabel gpt on the new drive

8. create the new osd with ceph-disk prepare(automatically adds it to 
the crushmap)
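For reference, the same procedure expressed as commands, for a Filestore OSD 
with an external journal (a sketch only - osd.21 and the device names are made 
up; on Luminous, ceph-disk prepare defaults to BlueStore, so add --filestore if 
you want to keep the journal layout):

systemctl stop ceph-osd@21                  # 1. stop the OSD
ceph osd out 21                             # 2. or wait for the automatic mark-out
ceph auth del osd.21                        # 3. remove it from the cluster
ceph osd crush remove osd.21
ceph osd rm 21
ceph-disk list                              # 4. note the journal partition
umount /var/lib/ceph/osd/ceph-21            # 5. unmount, swap the disk
parted /dev/sdX mklabel gpt                 # 7. label the new disk
ceph-disk prepare /dev/sdX /dev/sdY         # 8. second device/partition = journal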



Your procedure sounds reasonable to me; as far as I'm concerned, you 
shouldn't have to wait for rebalancing after you remove the OSD. All 
this might not be 100% by the Ceph book, but it works for us :)


Josef


On 06/08/18 16:15, Iztok Gregori wrote:

Hi Everyone,

Which is the best way to replace a failing (SMART Health Status: 
HARDWARE IMPENDING FAILURE) OSD hard disk?


Normally I will:

1. set the OSD as out
2. wait for rebalancing
3. stop the OSD on the osd-server (unmount if needed)
4. purge the OSD from CEPH
5. physically replace the disk with the new one
6. with ceph-deploy:
6a   zap the new disk (just in case)
6b   create the new OSD
7. add the new osd to the crush map.
8. wait for rebalancing.

My questions are:

- Is my procedure reasonable?
- What if I skip #2 and, instead of waiting for rebalancing, directly 
purge the OSD?

- Is it better to reweight the OSD before taking it out?

I'm running a Luminous (12.2.2) cluster with 332 OSDs, failure domain 
is host.


Thanks,
Iztok



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Erasure coded pools - overhead, data distribution

2018-07-26 Thread Josef Zelenka
(tail of a "ceph osd df" listing; columns are ID, CLASS, WEIGHT, REWEIGHT, SIZE, USE, AVAIL, %USE, VAR, PGS)
     hdd  3.63860  1.0  3725G 2906G  819G 78.01 1.30 119
 69   hdd  1.81929  1.0  1862G 1316G  546G 70.66 1.18  53
 70   hdd  1.81929  0.95000  1862G 1224G  638G 65.73 1.10  49
120   hdd  5.45789  1.0  5588G 3900G 1687G 69.80 1.17 151
147   hdd  5.45789  1.0  5588G  869G 4719G 15.56 0.26   8
 11   hdd  5.45799  1.0  5588G 4174G 1414G 74.70 1.25 162
 20   hdd  7.27699  1.0  7451G 5430G 2021G 72.88 1.22 221
 71   hdd  1.81929  1.0  1862G 1300G  562G 69.83 1.17  55
 72   hdd  1.81929  1.0  1862G 1093G  769G 58.70 0.98  44
 73   hdd  3.63860  1.0  3725G 2706G 1019G 72.63 1.21 112
 74   hdd  1.81929  1.0  1862G 1295G  567G 69.54 1.16  50
 75   hdd  1.81929  1.0  1862G 1127G  735G 60.53 1.01  45
116   hdd  7.27730  1.0  7451G 4775G 2676G 64.09 1.07 181
121   hdd  3.63860  1.0  3725G 2163G 1562G 58.06 0.97  84
148   hdd  5.45789  1.0  5588G  832G 4756G 14.90 0.25  14
149   hdd  5.45789  1.0  5588G  776G 4812G 13.89 0.23  13
 19   hdd  7.27699  1.0  7451G 5664G 1787G 76.01 1.27 206
 76   hdd  1.81929  1.0  1862G 1476G  386G 79.26 1.33  59
 77   hdd  1.81929  0.95000  1862G 1513G  349G 81.24 1.36  60
 78   hdd  1.81929  1.0  1862G 1503G  359G 80.70 1.35  65
 79   hdd  3.63860  1.0  3725G 2705G 1020G 72.62 1.21 104
 81   hdd  1.81929  1.0  1862G 1315G  547G 70.63 1.18  50
108   hdd  1.81929  1.0  1862G 1706G  156G 91.59 1.53  61
140   hdd 10.91399  1.0 11175G 8090G 3085G 72.39 1.21 313
150   hdd  5.45789  1.0  5588G  939G 4649G 16.81 0.28  16
151   hdd  5.45789  1.0  5588G  900G 4688G 16.10 0.27  12
122   hdd  1.81929  1.0  1862G 1731G  131G 92.93 1.55  73
135   hdd  1.81929  1.0  1862G 1605G  256G 86.21 1.44  65
136   hdd  1.81929  1.0  1862G 1441G  421G 77.36 1.29  58
137   hdd  1.81929  1.0  1862G 1693G  169G 90.93 1.52  70
138   hdd  1.81929  1.0  1862G 1275G  587G 68.46 1.14  49
139   hdd  1.81929  1.0  1862G 1705G  157G 91.54 1.53  66
141   hdd 10.91399  1.0 11175G 8657G 2518G 77.47 1.30 299
152   hdd  5.45789  1.0  5588G  999G 4589G 17.88 0.30  11
 10   hdd  5.45799  1.0  5588G 3825G 1763G 68.45 1.14 156
 82   hdd  5.45789  1.0  5588G 3839G 1749G 68.69 1.15 152
 83   hdd  1.81929  1.0  1862G 1231G  631G 66.11 1.11  49
 84   hdd  1.81929  1.0  1862G 1273G  589G 68.37 1.14  49
 85   hdd  1.81929  1.0  1862G 1429G  432G 76.76 1.28  59
114   hdd  5.45789  1.0  5588G 3455G 2133G 61.83 1.03 138
123   hdd  5.45789  1.0  5588G 3678G 1910G 65.82 1.10 146
142   hdd 10.91399  1.0 11175G 7359G 3816G 65.85 1.10 298
153   hdd  5.45789  1.0  5588G  986G 4602G 17.64 0.30  17
144   hdd    0  1.0  5588G 1454M 5587G  0.03    0   0
145   hdd    0  1.0  5588G 1446M 5587G  0.03    0   0
146   hdd    0  1.0  5588G 1455M 5587G  0.03    0   0
  TOTAL   579T  346T  232T 59.80
MIN/MAX VAR: 0/1.55  STDDEV: 23.04

Thanks in advance for any help - I find it very hard to wrap my head 
around this.


Josef Zelenka

Cloudevelops

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] NFS-ganesha with RGW

2018-05-30 Thread Josef Zelenka
Hi, thanks for the quick reply. As for 1., I mentioned that I'm running 
Ubuntu 16.04, kernel 4.4.0-121 - it seems the platform 
package (nfs-ganesha-ceph) does not include the RGW FSAL.


2. nfsd was running - after rebooting I managed to get Ganesha to bind, 
and rpcbind is running, though I still can't mount the RGW export due to 
timeouts. I suspect my config might be wrong, but I'm not sure how to 
verify it. I've set up my ganesha.conf with the FSAL and RGW blocks - do I 
need anything else?


EXPORT
{
 Export_ID=1;
 Path = "/";
 Pseudo = "/";
 Access_Type = RW;
 SecType = "sys";
 NFS_Protocols = 4;
 Transport_Protocols = TCP;

 # optional, permit unsquashed access by client "root" user
 #Squash = No_Root_Squash;

    FSAL {
 Name = RGW;
 User_Id = "<rgw user id for the access key/secret below>";

 Access_Key_Id = "";
 Secret_Access_Key = "";
 }

    RGW {
    cluster = "ceph";
    name = "client.radosgw.radosgw-s2";
    ceph_conf = "/etc/ceph/ceph.conf";
    init_args = "-d --debug-rgw=16";
    }
}
Josef




On 30/05/18 13:18, Matt Benjamin wrote:

Hi Josef,

1. You do need the Ganesha fsal driver to be present;  I don't know
your platform and os version, so I couldn't look up what packages you
might need to install (or if the platform package does not build the
RGW fsal)
2. The most common reason for ganesha.nfsd to fail to bind to a port
is that a Linux kernel nfsd is already running--can you make sure
that's not the case;  meanwhile you -do- need rpcbind to be running
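A quick way to check both of those on Ubuntu 16.04 (a sketch - service names as
shipped by the distro packages):

systemctl status nfs-kernel-server     # the kernel nfsd that would hold the port
systemctl stop nfs-kernel-server
systemctl disable nfs-kernel-server
rpcinfo -p | grep -w nfs               # see what is still registered
systemctl status rpcbind               # this one does need to be running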

Matt

On Wed, May 30, 2018 at 6:03 AM, Josef Zelenka
 wrote:

Hi everyone, I'm currently trying to set up an NFS-Ganesha instance that
mounts an RGW storage, however I'm not successful in this. I'm running Ceph
Luminous 12.2.4 and ubuntu 16.04. I tried compiling ganesha from
source(latest version), however i didn't manage to get the mount running
with that, as ganesha refused to bind to the ipv6 interface - i assume this
is a ganesha issue, but i didn't find any relevant info on what might cause
this - my network setup should allow for that. Then i installed ganesha-2.6
from the official repos, set up the config for RGW as per the official howto
http://docs.ceph.com/docs/master/radosgw/nfs/, but i'm getting:
Could not dlopen module:/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so
Error:/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so: cannot open shared
object file: No such file or directory
and lo and behold, the libfsalrgw.so isn't present in the folder. I
installed the nfs-ganesha and nfs-ganesha-fsal packages. I tried googling
around, but i didn't find any relevant info or walkthroughs for this setup,
so I'm asking - was anyone successful in setting this up? I can see that even
the redhat solution is still in progress, so i'm not sure if this even
works. Thanks for any help,

Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] NFS-ganesha with RGW

2018-05-30 Thread Josef Zelenka
Hi everyone, I'm currently trying to set up an NFS-Ganesha instance that 
mounts an RGW storage, however I'm not successful in this. I'm running 
Ceph Luminous 12.2.4 and ubuntu 16.04. I tried compiling ganesha from 
source(latest version), however i didn't manage to get the mount running 
with that, as ganesha refused to bind to the ipv6 interface - i assume 
this is a ganesha issue, but i didn't find any relevant info on what 
might cause this - my network setup should allow for that. Then i 
installed ganesha-2.6 from the official repos, set up the config for RGW 
as per the official howto http://docs.ceph.com/docs/master/radosgw/nfs/, 
but i'm getting:
Could not dlopen module:/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so 
Error:/usr/lib/x86_64-linux-gnu/ganesha/libfsalrgw.so: cannot open 
shared object file: No such file or directory
and lo and behold, the libfsalrgw.so isn't present in the folder. I 
installed the nfs-ganesha and nfs-ganesha-fsal packages. I tried 
googling around, but i didn't find any relevant info or walkthroughs for 
this setup, so I'm asking - was anyone successful in setting this up? I 
can see that even the redhat solution is still in progress, so i'm not 
sure if this even works. Thanks for any help,


Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Issues with RBD when rebooting

2018-05-25 Thread Josef Zelenka
Hi, we are running a Jewel cluster (54 OSDs, six nodes, Ubuntu 16.04) 
that serves as a backend for OpenStack (Newton) VMs. Today we had to 
reboot one of the nodes (replicated pool, 2x) and some of our VMs oopsed 
with issues with their FS (mainly database VMs, PostgreSQL) - is there a 
reason for this to happen? If data is replicated, the VMs shouldn't even 
notice we rebooted one of the nodes, right? Maybe I just don't 
understand how this works correctly, but I hope someone around here can 
either tell me why this is happening or how to fix it.


Thanks

Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs write fail when node goes down

2018-05-15 Thread Josef Zelenka
Client's kernel is 4.4.0. Regarding the hung osd request, i'll have to 
check, the issue is gone now, so i'm not sure if i'll find what you are 
suggesting. It's rather odd, because Ceph's failover worked for us every 
time, so i'm trying to figure out whether it is a ceph or app issue.



On 15/05/18 02:57, Yan, Zheng wrote:

On Mon, May 14, 2018 at 5:37 PM, Josef Zelenka
<josef.zele...@cloudevelops.com> wrote:

Hi everyone, we've encountered an unusual thing in our setup(4 nodes, 48
OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). Yesterday,
we were doing a HW upgrade of the nodes, so they went down one by one - the
cluster was in good shape during the upgrade, as we've done this numerous
times and we're quite sure that the redundancy wasn't screwed up while doing
this. However, during this upgrade one of the clients that does backups to
cephfs(mounted via the kernel driver) failed to write the backup file
correctly to the cluster with the following trace after we turned off one of
the nodes:

[2585732.529412]  8800baa279a8 813fb2df 880236230e00
8802339c
[2585732.529414]  8800baa28000 88023fc96e00 7fff
8800baa27b20
[2585732.529415]  81840ed0 8800baa279c0 818406d5

[2585732.529417] Call Trace:
[2585732.529505]  [] ? cpumask_next_and+0x2f/0x40
[2585732.529558]  [] ? bit_wait+0x60/0x60
[2585732.529560]  [] schedule+0x35/0x80
[2585732.529562]  [] schedule_timeout+0x1b5/0x270
[2585732.529607]  [] ? kvm_clock_get_cycles+0x1e/0x20
[2585732.529609]  [] ? bit_wait+0x60/0x60
[2585732.529611]  [] io_schedule_timeout+0xa4/0x110
[2585732.529613]  [] bit_wait_io+0x1b/0x70
[2585732.529614]  [] __wait_on_bit_lock+0x4e/0xb0
[2585732.529652]  [] __lock_page+0xbb/0xe0
[2585732.529674]  [] ? autoremove_wake_function+0x40/0x40
[2585732.529676]  [] pagecache_get_page+0x17d/0x1c0
[2585732.529730]  [] ? ceph_pool_perm_check+0x48/0x700
[ceph]
[2585732.529732]  [] grab_cache_page_write_begin+0x26/0x40
[2585732.529738]  [] ceph_write_begin+0x48/0xe0 [ceph]
[2585732.529739]  [] generic_perform_write+0xce/0x1c0
[2585732.529763]  [] ? file_update_time+0xc9/0x110
[2585732.529769]  [] ceph_write_iter+0xf89/0x1040 [ceph]
[2585732.529792]  [] ? __alloc_pages_nodemask+0x159/0x2a0
[2585732.529808]  [] new_sync_write+0x9b/0xe0
[2585732.529811]  [] __vfs_write+0x26/0x40
[2585732.529812]  [] vfs_write+0xa9/0x1a0
[2585732.529814]  [] SyS_write+0x55/0xc0
[2585732.529817]  [] entry_SYSCALL_64_fastpath+0x16/0x71



is there any hung osd request in /sys/kernel/debug/ceph//osdc?


I have encountered this behavior on Luminous, but not on Jewel. Does anyone
have a clue why the write fails? As far as I'm concerned, it should always
work if all the PGs are available. Thanks
Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cephfs write fail when node goes down

2018-05-14 Thread Josef Zelenka
Hi everyone, we've encountered an unusual thing in our setup(4 nodes, 48 
OSDs, 3 monitors - ceph Jewel, Ubuntu 16.04 with kernel 4.4.0). 
Yesterday, we were doing a HW upgrade of the nodes, so they went down 
one by one - the cluster was in good shape during the upgrade, as we've 
done this numerous times and we're quite sure that the redundancy wasn't 
screwed up while doing this. However, during this upgrade one of the 
clients that does backups to cephfs(mounted via the kernel driver) 
failed to write the backup file correctly to the cluster with the 
following trace after we turned off one of the nodes:


[2585732.529412]  8800baa279a8 813fb2df 880236230e00 
8802339c
[2585732.529414]  8800baa28000 88023fc96e00 7fff 
8800baa27b20
[2585732.529415]  81840ed0 8800baa279c0 818406d5 

[2585732.529417] Call Trace:
[2585732.529505]  [] ? cpumask_next_and+0x2f/0x40
[2585732.529558]  [] ? bit_wait+0x60/0x60
[2585732.529560]  [] schedule+0x35/0x80
[2585732.529562]  [] schedule_timeout+0x1b5/0x270
[2585732.529607]  [] ? kvm_clock_get_cycles+0x1e/0x20
[2585732.529609]  [] ? bit_wait+0x60/0x60
[2585732.529611]  [] io_schedule_timeout+0xa4/0x110
[2585732.529613]  [] bit_wait_io+0x1b/0x70
[2585732.529614]  [] __wait_on_bit_lock+0x4e/0xb0
[2585732.529652]  [] __lock_page+0xbb/0xe0
[2585732.529674]  [] ? autoremove_wake_function+0x40/0x40
[2585732.529676]  [] pagecache_get_page+0x17d/0x1c0
[2585732.529730]  [] ? ceph_pool_perm_check+0x48/0x700 [ceph]
[2585732.529732]  [] grab_cache_page_write_begin+0x26/0x40
[2585732.529738]  [] ceph_write_begin+0x48/0xe0 [ceph]
[2585732.529739]  [] generic_perform_write+0xce/0x1c0
[2585732.529763]  [] ? file_update_time+0xc9/0x110
[2585732.529769]  [] ceph_write_iter+0xf89/0x1040 [ceph]
[2585732.529792]  [] ? __alloc_pages_nodemask+0x159/0x2a0
[2585732.529808]  [] new_sync_write+0x9b/0xe0
[2585732.529811]  [] __vfs_write+0x26/0x40
[2585732.529812]  [] vfs_write+0xa9/0x1a0
[2585732.529814]  [] SyS_write+0x55/0xc0
[2585732.529817]  [] entry_SYSCALL_64_fastpath+0x16/0x71


I have encountered this behavior on Luminous, but not on Jewel. Does anyone have 
a clue why the write fails? As far as I'm concerned, it should always work if 
all the PGs are available. Thanks
Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RGW multisite sync issues

2018-04-06 Thread Josef Zelenka

Hi everyone,

I'm currently setting up RGW multisite (one cluster is Jewel (primary), 
the other is Luminous - this is only for testing, in production we will have 
the same version - Jewel on both), but I can't get bucket 
synchronization to work. Data gets synchronized fine when I upload it, 
but when I delete it from the primary cluster, it only deletes the 
metadata of the file on the secondary one; the files are still there (I can 
see it in rados df - the pool stats stay the same). Also, none of the older 
buckets start synchronizing to the secondary cluster. It's been quite a 
headache so far. Does anyone know what might be wrong? I can supply any 
needed info. Thanks


Josef Zelenka

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Radosgw halts writes during recovery, recovery info issues

2018-03-26 Thread Josef Zelenka

forgot to mention - we are running jewel, 10.2.10


On 26/03/18 11:30, Josef Zelenka wrote:
Hi everyone, I'm currently fighting an issue in a cluster we run for 
a customer. It's used for a lot of small files (113M currently) that 
are pulled via radosgw. We have 3 nodes, 24 OSDs in total. The index 
etc. pools are migrated to a separate root called "ssd"; that root is 
on SSD drives only - each node has one SSD in this root. We did this 
because we had an issue where if a normal OSD (an HDD) crashed, the 
entire RGW stopped working. Today, one of the SSDs crashed and, after 
changing the drive, putting a new one in and starting recovery, RGW 
halted writes. Reads worked OK, but we couldn't upload any more files 
to it. The non-data pools all have size set to 3, so there should 
still be 2 healthy copies of the index data. Also, when recovery 
started, no recovery I/O was shown in the ceph -s output, so we 
checked it through df; after the SSD backfilled, ceph -s went from X 
degraded PGs back to OK instantly. Does anyone know how to fix this? 
I don't think writes should be halted during recovery.


Thanks

Josef Z

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mapping faulty pg to file on cephfs

2018-02-13 Thread Josef Zelenka

Oh, sorry, forgot to mention - this cluster is running jewel :(
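On Jewel one manual option is to use the fact that the object name itself
encodes the inode: the part before the dot (10002110e12 in the log above) is
the inode number in hex, so you can search a mounted cephfs for it (a sketch -
/mnt/cephfs is an assumed mountpoint, and find has to walk the whole tree, so
it will be slow on 230TB):

ino=$(printf '%d' 0x10002110e12)
find /mnt/cephfs -inum "$ino"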


On 13/02/18 12:10, John Spray wrote:

On Tue, Feb 13, 2018 at 10:38 AM, Josef Zelenka
<josef.zele...@cloudevelops.com> wrote:

Hi everyone, one of the clusters we are running for a client recently had a
power outage, it's currently in a working state, however 3 pgs were left
inconsistent atm, with this type of error in the log(when i attempt to ceph
pg repair it)

2018-02-13 09:47:17.534912 7f3735626700 -1 log_channel(cluster) log [ERR] :
repair 15.1e32 15:4c7eed31:::10002110e12.004b:head on disk size (0) does
not match object info size (4194304) adjusted for ondisk to (4194304)

i know this can be fixed by truncating the ondisk object to the expected
size, but it clearly means we've lost some data. This cluster is used for
cephfs only, so i'd like to find which files on the cephfs were affected. I
know the OSDs for that pg, i know which pg and which object was affected, so
i hope it's possible. I found a 2015 entry in the mailing list, that does
the reverse thing
(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005384.html),
as in - map file to pg/object. I have 230TB of data in that cluster in a lot
of files, so mapping them all would take a long time. I hope there is a way
to do this, if people here have any idea/experience with this, it'd be
great.

We added a tool in luminous that does this:
http://docs.ceph.com/docs/master/cephfs/disaster-recovery/#finding-files-affected-by-lost-data-pgs
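For reference, the luminous tool from that doc boils down to something like
this (a sketch - the path is a path inside cephfs, here the root, and the pg id
is the one from your log):

cephfs-data-scan pg_files / 15.1e32

It prints the files that have at least one object in the given PG(s), so expect
it to take a while on a 230TB tree.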

John




Thanks

Josef Zelenka

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mapping faulty pg to file on cephfs

2018-02-13 Thread Josef Zelenka
Hi everyone, one of the clusters we are running for a client recently 
had a power outage, it's currently in a working state, however 3 pgs 
were left inconsistent atm, with this type of error in the log(when i 
attempt to ceph pg repair it)


2018-02-13 09:47:17.534912 7f3735626700 -1 log_channel(cluster) log 
[ERR] : repair 15.1e32 15:4c7eed31:::10002110e12.004b:head on disk 
size (0) does not match object info size (4194304) adjusted for ondisk 
to (4194304)


i know this can be fixed by truncating the ondisk object to the expected 
size, but it clearly means we've lost some data. This cluster is used 
for cephfs only, so i'd like to find which files on the cephfs were 
affected. I know the OSDs for that pg, i know which pg and which object 
was affected, so i hope it's possible. I found a 2015 entry in the 
mailing list, that does the reverse thing 
(http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-October/005384.html), 
as in - map file to pg/object. I have 230TB of data in that cluster in a 
lot of files, so mapping them all would take a long time. I hope there 
is a way to do this, if people here have any idea/experience with this, 
it'd be great.


Thanks

Josef Zelenka

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Inconsistent PG - failed to pick suitable auth object

2018-01-29 Thread Josef Zelenka
Hi everyone, i'm having issues with one of our clusters, regarding a 
seemingly unfixable inconsistent pg. We are running ubuntu 16.04, ceph 
10.2.7, 96 osds on 8 nodes. After a power outage, we had some 
inconsistent pgs, i managed to fix all of them but this one, here's an 
excerpt from the logs (it outputs this every time I issue a ceph pg 
repair command):


2018-01-29 12:49:35.126066 7f09ffd1e700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 44: soid 
3:203d2906:::benchmark_data_mon3_3417_object685:head data_digest 
0x8d3f3b5b != data_digest 0xdbdd31f0 from auth oi 
3:203d2906:::benchmark_data_mon3_3417_object685:head(112873'834220 
client.79854137.0:686 dirty|data_digest|omap_digest s 65536 uv 834220 dd 
dbdd31f0 od )
2018-01-29 12:49:35.126087 7f09ffd1e700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 97: soid 
3:203d2906:::benchmark_data_mon3_3417_object685:head data_digest 
0x8d3f3b5b != data_digest 0xdbdd31f0 from auth oi 
3:203d2906:::benchmark_data_mon3_3417_object685:head(112873'834220 
client.79854137.0:686 dirty|data_digest|omap_digest s 65536 uv 834220 dd 
dbdd31f0 od ), attr name mismatch '_', attr name mismatch 'snapset'
2018-01-29 12:49:35.126091 7f09ffd1e700 -1 log_channel(cluster) log 
[ERR] : 3.c04 soid 3:203d2906:::benchmark_data_mon3_3417_object685:head: 
failed to pick suitable auth object
2018-01-29 12:49:35.126164 7f09ffd1e700 -1 log_channel(cluster) log 
[ERR] : deep-scrub 3.c04 
3:203d2906:::benchmark_data_mon3_3417_object685:head no '_' attr
2018-01-29 12:49:35.126170 7f09ffd1e700 -1 log_channel(cluster) log 
[ERR] : deep-scrub 3.c04 
3:203d2906:::benchmark_data_mon3_3417_object685:head no 'snapset' attr
2018-01-29 12:50:11.670123 7f09f3d06700 -1 log_channel(cluster) log 
[ERR] : 3.c04 deep-scrub 5 errors
2018-01-29 13:30:13.839317 7f596c5d2700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 44: soid 
3:203d2906:::benchmark_data_mon3_3417_object685:head data_digest 
0x8d3f3b5b != data_digest 0xdbdd31f0 from auth oi 
3:203d2906:::benchmark_data_mon3_3417_object685:head(112873'834220 
client.79854137.0:686 dirty|data_digest|omap_digest s 65536 uv 834220 dd 
dbdd31f0 od )
2018-01-29 13:30:13.839335 7f596c5d2700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 97 missing 
3:203d2906:::benchmark_data_mon3_3417_object685:head
2018-01-29 13:30:13.839339 7f596c5d2700 -1 log_channel(cluster) log 
[ERR] : 3.c04 soid 3:203d2906:::benchmark_data_mon3_3417_object685:head: 
failed to pick suitable auth object
2018-01-29 13:30:52.850323 7f596c5d2700 -1 log_channel(cluster) log 
[ERR] : 3.c04 repair stat mismatch, got 4084/4085 objects, 0/0 clones, 
4084/4084 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 
whiteouts, 16824119169/16824119169 bytes, 0/0 hit_set_archive bytes.
2018-01-29 13:30:52.850379 7f596c5d2700 -1 log_channel(cluster) log 
[ERR] : 3.c04 repair 3 errors, 1 fixed
2018-01-29 13:51:33.138881 7f59605ba700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 44: soid 
3:203d2906:::benchmark_data_mon3_3417_object685:head data_digest 
0x8d3f3b5b != data_digest 0xdbdd31f0 from auth oi 
3:203d2906:::benchmark_data_mon3_3417_object685:head(112873'834220 
client.79854137.0:686 dirty|data_digest|omap_digest s 65536 uv 834220 dd 
dbdd31f0 od )
2018-01-29 13:51:33.138895 7f59605ba700 -1 log_channel(cluster) log 
[ERR] : 3.c04 shard 97 missing 
3:203d2906:::benchmark_data_mon3_3417_object685:head
2018-01-29 13:51:33.138898 7f59605ba700 -1 log_channel(cluster) log 
[ERR] : 3.c04 soid 3:203d2906:::benchmark_data_mon3_3417_object685:head: 
failed to pick suitable auth object


when i try to find info about the object itself, i get this(after a deep 
scrub)


 rados list-inconsistent-obj 3.c04 --format=json-pretty
{
    "epoch": 114466,
    "inconsistents": []
}

I tried deleting the object from the primary and repairing, truncating 
the object to the same size on both primary and secondary, and even 
copying the identical object from the secondary to the primary, but 
nothing seems to work. Any pointers regarding this?
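Since the broken object is leftover rados bench data anyway, one more thing
that could be tried is removing the bad copy directly on one OSD with the
objectstore tool and re-running the repair so it recovers from the other shard
(only a sketch - the osd and pg ids are taken from the log above, double-check
them, and export the PG first so you have a fallback):

systemctl stop ceph-osd@44
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-44 \
    --journal-path /var/lib/ceph/osd/ceph-44/journal \
    --pgid 3.c04 benchmark_data_mon3_3417_object685 remove
systemctl start ceph-osd@44
ceph pg repair 3.c04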


thanks

Josef Zelenka

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] how to get bucket or object's ACL?

2018-01-29 Thread Josef Zelenka

hi, this should be possible via the s3cmd tool.

s3cmd info s3://<bucket>/<object>
s3cmd info s3://PP-2015-Tut/

Here is more info - https://kunallillaney.github.io/s3cmd-tutorial/ - I have 
successfully used this tool in the past for ACL management, so I hope 
it's gonna work for you too. JZ
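For completeness, a minimal set of invocations against an RGW endpoint looks
roughly like this (a sketch - bucket and object names are placeholders):

s3cmd --configure                      # point it at the RGW host, enter the access/secret key
s3cmd info s3://mybucket               # bucket ACL, policy, ownership
s3cmd info s3://mybucket/some/object   # per-object ACL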



On 29/01/18 11:23, 13605702...@163.com wrote:

hi
    how to get the bucket or object's ACL on the command line?

thanks


13605702...@163.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-11 Thread Josef Zelenka
I have posted logs/strace from our osds with details to a ticket in the 
ceph bug tracker - see here http://tracker.ceph.com/issues/21142. You 
can see where exactly the OSDs crash etc, this can be of help if someone 
decides to debug it.


JZ


On 10/01/18 22:05, Josef Zelenka wrote:


Hi, today we had a disastrous crash - we are running a 3 node, 24 osd
in total cluster (8 each) with SSDs for blockdb, HDD for bluestore 
data. This cluster is used as a radosgw backend, for storing a big 
number of thumbnails for a file hosting site - around 110m files in 
total. We were adding an interface to the nodes which required a 
restart, but after restarting one of the nodes, a lot of the OSDs were 
kicked out of the cluster and rgw stopped working. We have a lot of 
pgs down and unfound atm. OSDs can't be started(aside from some, 
that's a mystery) with this error - FAILED assert ( interval.last > 
last) - they just periodically restart. So far, the cluster is broken 
and we can't seem to bring it back up. We tried fscking the osds via 
the ceph objectstore tool, but it was no good. The root of all this 
seems to be in the FAILED assert(interval.last > last) error, however 
i can't find any info regarding this or how to fix it. Did someone 
here also encounter it? We're running luminous on ubuntu 16.04.


Thanks

Josef Zelenka

Cloudevelops



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Josef Zelenka
Hi, our recovery slowed down significantly towards the end, however it 
was still about five times faster than the original speed. We suspected 
that this is caused somehow by threading (more objects transferred - 
more threads used), but this is only an assumption.



On 11/01/18 05:02, shadow_lin wrote:

Hi,
I have tried these two methods, and for backfilling it seems only 
osd-max-backfills works.

How was your recovery speed when it comes to the last few pgs or objects?
2018-01-11

shadow_lin


*From:* Josef Zelenka <josef.zele...@cloudevelops.com>
*Sent:* 2018-01-11 04:53
*Subject:* Re: [ceph-users] How to speed up backfill
*To:* "shadow_lin" <shadow_...@163.com>
*Cc:*

Hi, i had the same issue a few days back, i tried playing around
with these two:

ceph tell 'osd.*' injectargs '--osd-max-backfills '
ceph tell 'osd.*' injectargs '--osd-recovery-max-active  '
  and it helped greatly(increased our recovery speed 20x), but be careful 
to not overload your systems.
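A concrete example of the above with made-up values (the Luminous defaults are
osd_max_backfills=1 and osd_recovery_max_active=3; raise them gradually and
watch client latency):

ceph tell 'osd.*' injectargs '--osd-max-backfills 4'
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 8'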


On 10/01/18 17:50, shadow_lin wrote:

Hi all,
I am playing with the backfill settings to try to find out how to
control the speed of backfill.
So far I have only found that "osd max backfills" has an effect on the backfill
speed. But once all PGs that need backfilling have begun backfilling, I
can't find any way to speed up backfills.
Especially when it comes to the last PG to recover, the speed is
only a few MB/s (when multiple PGs are being backfilled the speed
can be more than 600 MB/s in my tests).
I am a little confused about the backfill and recovery settings.
Though backfilling is a kind of recovery, it seems the
recovery settings only apply to replaying PG logs to recover a PG.
Would changing "osd recovery max active" or other recovery settings
have any effect on backfilling?
I tried "osd recovery op priority" and "osd recovery max
active" with no luck.
Any advice would be greatly appreciated. Thanks
2018-01-11

lin.yunfan


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How to speed up backfill

2018-01-10 Thread Josef Zelenka



On 10/01/18 21:53, Josef Zelenka wrote:


Hi, i had the same issue a few days back, i tried playing around with 
these two:


ceph tell 'osd.*' injectargs '--osd-max-backfills '
ceph tell 'osd.*' injectargs '--osd-recovery-max-active  '
  and it helped greatly(increased our recovery speed 20x), but be careful to 
not overload your systems.

On 10/01/18 17:50, shadow_lin wrote:

Hi all,
I am playing with the backfill settings to try to find out how to control 
the speed of backfill.
So far I have only found that "osd max backfills" has an effect on the backfill 
speed. But once all PGs that need backfilling have begun backfilling, I 
can't find any way to speed up backfills.
Especially when it comes to the last PG to recover, the speed is only 
a few MB/s (when multiple PGs are being backfilled the speed can be 
more than 600 MB/s in my tests).
I am a little confused about the backfill and recovery settings. 
Though backfilling is a kind of recovery, it seems the 
recovery settings only apply to replaying PG logs to recover a PG.
Would changing "osd recovery max active" or other recovery settings have 
any effect on backfilling?
I tried "osd recovery op priority" and "osd recovery max active" 
with no luck.

Any advice would be greatly appreciated. Thanks
2018-01-11

lin.yunfan


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cluster crash - FAILED assert(interval.last > last)

2018-01-10 Thread Josef Zelenka
Hi, today we had a disastrous crash - we are running a 3-node cluster with
24 OSDs in total (8 per node), with SSDs for the blockdb and HDDs for the
bluestore data. This cluster is used as a radosgw backend, for storing a
large number of thumbnails for a file hosting site - around 110m files in
total. We were adding an interface to the nodes which required a restart,
but after restarting one of the nodes, a lot of the OSDs were kicked out
of the cluster and rgw stopped working. We have a lot of pgs down and
unfound atm. OSDs can't be started(aside from some, that's a mystery) with
this error - FAILED assert(interval.last > last) - they just periodically
restart. So far, the cluster is broken and we can't seem to bring it
back up. We tried fscking the osds via the ceph objectstore tool, but it
was no good. The root of all this seems to be in the FAILED
assert(interval.last > last) error, however i can't find any info
regarding this or how to fix it. Did someone here also encounter it?
We're running luminous on ubuntu 16.04.
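
(For completeness, the fsck attempt looked roughly like the sketch below -
the OSD id and data path are only examples for osd.0, and the OSD must be
stopped first:)

systemctl stop ceph-osd@0
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-0 --op fsck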


Thanks

Josef Zelenka

Cloudevelops

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] determining the source of io in the cluster

2017-12-18 Thread Josef Zelenka

Hi everyone,

we have recently deployed a Luminous(12.2.1) cluster on Ubuntu - three
osd nodes and three monitors; every OSD node has 3x 2TB SSDs + an NVMe
drive for the blockdb. We use it as a backend for our Openstack cluster,
so we store volumes there. In the last few days, the read op/s rose to
around 10k-25k constantly(it fluctuates between those two) and it doesn't
seem to go down. I can see that the io/read ops come from the pool where
we store VM volumes, but i can't trace this back to a particular volume.
Is that even possible? Any experience with debugging this? Any info or
advice is greatly appreciated.
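
(A rough sketch of one way this could be narrowed down, in case it is
useful - "volumes" is an assumed pool name and osd.0 just an example
daemon; the daemon command has to run on the host carrying that OSD:)

# sample recently completed ops on one OSD and count which RBD image data
# prefixes (rbd_data.<image_id>.<object#>) show up most often
ceph daemon osd.0 dump_historic_ops \
  | grep -o 'rbd_data\.[0-9a-f]\+' | sort | uniq -c | sort -rn | head
# map the busiest image id back to a volume name via its block_name_prefix
BUSY_ID=1f9b2ae8944a   # example id taken from the output above
for img in $(rbd ls volumes); do
  rbd info volumes/"$img" | grep -q "$BUSY_ID" && echo "$img"
done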


Thanks

Josef Zelenka

Cloudevelops

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] A new SSD for journals - everything sucks?

2017-10-11 Thread Josef Zelenka

Hello everyone,
lately, we've had issues buying the SSDs we use for journaling - the
Kingston V300 (Kingston stopped making them) - so we decided to start
using a different model and started researching which one would be the
best price/value for us. We compared five models to check if they are
compatible with our needs - SSDNow V300, HyperX Fury, SSDNow KC400,
SSDNow UV400 and SSDNow A400. The best one is still the V300, with the
highest iops of 59,001. Second best and still usable was the HyperX Fury
with 45,000 iops. The other three had terrible results; the max iops we
got were around 13,000 with the dsync and direct flags. We also tested
Samsung SSDs(the EVO series) and we got similarly bad results. To get to
the root of my question - i am pretty sure we are not the only ones
affected by the V300's death. Is there anyone else out there with some
benchmarking data/knowledge about good price/performance SSDs for ceph
journaling? I can also share the complete benchmarking data my coworker
made, if someone is interested.
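
(For anyone wondering what "the dsync and direct flags" look like in
practice, the usual test is along these lines - an illustrative sketch
only, not necessarily our exact benchmark; /dev/sdX is a placeholder and
writing to a raw device destroys its data:)

# single-job 4k synchronous direct writes - the usual journal-SSD test
dd if=/dev/zero of=/dev/sdX bs=4k count=100000 oflag=direct,dsync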

Thanks
Josef Zelenka
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Large amount of files - cephfs?

2017-09-29 Thread Josef Zelenka

Hi everyone,

thanks for the advice, we talked it over and we're going to test it out
with cephfs first. Object storage is a possibility if it misbehaves.
Hopefully it will go well :)



On 28/09/17 08:20, Henrik Korkuc wrote:

On 17-09-27 14:57, Josef Zelenka wrote:

Hi,

we are currently working on a ceph solution for one of our customers.
They run a file hosting service and need to store approximately 100
million pictures(thumbnails). Their current code works with FTP, which
they use as storage. We thought that we could use cephfs for
this, but i am not sure how it would behave with that many files, how
the performance would be affected, etc. Is cephfs usable in this
scenario, or would radosgw+swift be better(they'd likely have to
rewrite some of the code, so we'd prefer not to do this)? We already
have some experience with cephfs for storing bigger files, streaming
etc, so i'm not completely new to this, but i thought it'd be better
to ask more experienced users. Some advice on this would be greatly
appreciated, thanks,


Josef

Depending on your OSD count, you should be able to put 100 million files
there. As others mentioned, depending on your workload, metadata may
be a bottleneck.


If metadata is not a concern, then you just need to have enough OSDs
to distribute the RADOS objects. You should be fine with a few million
objects per OSD; going with tens of millions per OSD may be more
problematic, as you get higher memory usage, slower OSDs, and slow
backfill/recovery.
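
(A quick way to watch where the object counts end up, using only standard
commands - nothing cluster-specific is assumed here:)

ceph df detail    # per-pool object counts
ceph osd df tree  # per-OSD utilization and distribution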



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Large amount of files - cephfs?

2017-09-27 Thread Josef Zelenka

Hi,

we are currently working on a ceph solution for one of our customers.
They run a file hosting service and need to store approximately 100 million
pictures(thumbnails). Their current code works with FTP, which they
use as storage. We thought that we could use cephfs for this, but i am
not sure how it would behave with that many files, how the
performance would be affected, etc. Is cephfs usable in this scenario, or
would radosgw+swift be better(they'd likely have to rewrite some of the
code, so we'd prefer not to do this)? We already have some experience
with cephfs for storing bigger files, streaming etc, so i'm not
completely new to this, but i thought it'd be better to ask more
experienced users. Some advice on this would be greatly appreciated, thanks,


Josef

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] RADOSGW S3 api ACLs

2017-02-16 Thread Josef Zelenka

Hello everyone,
i've been struggling for the past few days with setting up ACLs for 
buckets on my radosgw. I want to use the buckets with the S3 API and i
want them to have the ACL set up like this:
every file that gets pushed into the bucket is automatically readable by
everyone and writeable only by a specific user. Currently i was able to
set the ACLs i want on existing files, but i want them to be set up in a
way that will do this automatically, i.e. for the entire bucket. Can anyone
shed some light on ACLs in the S3 API and RGW?
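
(For context, setting ACLs on the existing files can be done along these
lines with s3cmd - a sketch with example names, not necessarily the exact
commands i used; as far as i know plain S3 ACLs don't inherit from the
bucket to new objects, so the uploader has to set a canned ACL per object,
and newer RGW releases offer bucket policies as an alternative:)

# make everything already in the bucket world-readable
s3cmd setacl --acl-public --recursive s3://mybucket
# upload new files with a public-read canned ACL
s3cmd put --acl-public thumbnail.jpg s3://mybucket/thumbnail.jpg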

Thanks
Josef Zelenka
Cloudevelops
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com