Re: [ceph-users] osd backfills and recovery limit issue

2017-08-09 Thread Hyun Ha
Thank you for the comment.

I understand what you mean.
When one osd goes down, its PGs are spread across the whole ceph cluster,
so each node can run one backfill/recovery per osd and the cluster shows
many backfills/recoveries at once.
On the other side, when one osd comes back up, it needs to copy PGs one by
one from the other nodes, so the cluster shows only 1 backfill/recovery.
Is that right?

When a host or osd goes down, it can have a bigger performance impact than
when a host or osd comes up.
So, is there any configuration to limit the number of osds involved per PG
while ceph is doing recovery/backfills?
Or is it possible to force more recovery/backfills when system resource
usage (cpu, memory, network throughput, etc.) is low, like recovery
scheduling?
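(For context, I know the throttles can be changed at runtime by hand; a rough
sketch, with example values:

# ceph tell osd.* injectargs '--osd_max_backfills 2 --osd_recovery_max_active 2'
# ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'

raising them while resource usage is low and lowering them again when client
load picks up. What I am asking is whether anything can do this automatically.)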

Thank you.

2017-08-10 13:31 GMT+09:00 David Turner :

> osd_max_backfills is a setting per osd.  With that set to 1, each osd will
> only be involved in a single backfill/recovery at the same time.  However
> the cluster as a whole will have as many backfills as it can while each osd
> is only involved in 1 each.
>
> On Wed, Aug 9, 2017 at 10:58 PM 하현  wrote:
>
>> Hi ceph experts.
>>
>> I confused when set limitation of osd max backfills.
>> When osd down recovery occuerred, and osd up is same.
>>
>> I want to set limitation for backfills to 1.
>> So, I set config as below.
>>
>>
>> # ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show|egrep
>> "osd_max_backfills|osd_recovery_threads|osd_recovery_
>> max_active|osd_recovery_op_priority"
>> "osd_max_backfills": "1",
>> "osd_recovery_threads": "1",
>> "osd_recovery_max_active": "1",
>> "osd_recovery_op_priority": "3",
>>
>> When osd up it seemed works good but when osd down it seemed not works as
>> I thinks.
>> Please see the ceph watch logs.
>>
>> osd down>
>> pgmap v898158: 2048 pgs: 20 remapped+peering, 106
>> active+undersized+degraded, 1922 active+clean; 641 B/s rd, 253 kB/s wr, 36
>> op/s; 45807/1807242 objects degraded (2.535%)
>> pgmap v898159: 2048 pgs: *5
>> active+undersized+degraded+remapped+backfilling*, 9
>> activating+undersized+degraded+remapped, 24 
>> active+undersized+degraded+remapped+wait_backfill,
>> 20 remapped+peering, 68 active+undersized+degraded, 1922 active+clean; 510
>> B/s rd, 498 kB/s wr, 42 op/s; 41619/1812733 objects degraded (2.296%);
>> 21029/1812733 objects misplaced (1.160%); 149 MB/s, 37 objects/s recovering
>> pgmap v898168: 2048 pgs: *16
>> active+undersized+degraded+remapped+backfilling*, 110
>> active+undersized+degraded+remapped+wait_backfill, 1922 active+clean;
>> 508 B/s rd, 562 kB/s wr, 61 op/s; 54118/1823939 objects degraded (2.967%);
>> 86984/1823939 objects misplaced (4.769%); 4025 MB/s, 1006 objects/s
>> recovering
>> pgmap v898192: 2048 pgs: 3 peering, 1 activating, 13
>> active+undersized+degraded+remapped+backfilling, 106
>> active+undersized+degraded+remapped+wait_backfill, 1925 active+clean;
>> 10184 B/s rd, 362 kB/s wr, 47 op/s; 49724/1823312 objects degraded
>> (2.727%); 79709/1823312 objects misplaced (4.372%); 1949 MB/s, 487
>> objects/s recovering
>> pgmap v898216: 2048 pgs: 1 active+undersized+remapped, 11
>> active+undersized+degraded+remapped+backfilling, 98
>> active+undersized+degraded+remapped+wait_backfill, 1938 active+clean;
>> 10164 B/s rd, 251 kB/s wr, 37 op/s; 44429/1823312 objects degraded
>> (2.437%); 74037/1823312 objects misplaced (4.061%); 2751 MB/s, 687
>> objects/s recovering
>> pgmap v898541: 2048 pgs: 1 active+undersized+degraded+remapped+backfilling,
>> 2047 active+clean; 218 kB/s wr, 39 op/s; 261/1806097 objects degraded
>> (0.014%); 543/1806097 objects misplaced (0.030%); 677 MB/s, 9 keys/s, 176
>> objects/s recovering
>>
>> osd up>
>> pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
>> 2020 active+clean; 5594 B/s rd, 452 kB/s wr, 54 op/s
>> pgmap v899277: 2048 pgs: *1 active+remapped+backfilling*, 41
>> active+remapped+wait_backfill, 2 activating, 14 peering, 1990 active+clean;
>> 595 kB/s wr, 23 op/s; 36111/1823939 objects misplaced (1.980%); 380 MB/s,
>> 95 objects/s recovering
>> pgmap v899298: 2048 pgs: 1 peering, *1 active+remapped+backfilling*, 40
>> active+remapped+wait_backfill, 2006 active+clean; 723 kB/s wr, 13 op/s;
>> 34903/1823294 objects misplaced (1.914%); 1113 MB/s, 278 objects/s
>> recovering
>> pgmap v899342: 2048 pgs: 1 active+remapped+backfilling, 39
>> active+remapped+wait_backfill, 2008 active+clean; 5615 B/s rd, 291 kB/s wr,
>> 41 op/s; 33150/1822666 objects misplaced (1.819%)
>> pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
>> 2020 active+clean;5594 B/s rd, 452 kB/s wr, 54 op/s
>> pgmap v899796: 2048 pgs: 1 activating, 1 active+remapped+backfilling, 10
>> active+remapped+wait_backfill, 2036 active+clean; 235 kB/s wr, 22 op/s;
>> 6423/1809085 objects misplaced (0.355%)
>>
>> in osd down> logs,  we can see 16 backfills, and in osd up> logs, we can
>> see only one backfill. Is that correct? If not, what config should I set?

Re: [ceph-users] osd backfills and recovery limit issue

2017-08-09 Thread David Turner
osd_max_backfills is a per-osd setting.  With that set to 1, each osd will
only be involved in a single backfill/recovery at a time.  However, the
cluster as a whole will run as many backfills as it can while each osd is
only involved in 1.
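You can watch this while recovery is running; a quick sketch (the exact output
format varies a bit between releases):

# ceph pg dump pgs_brief | grep backfilling

Each backfilling PG lists its up/acting osd sets, so you can see how many
backfills any single osd is participating in at a given moment.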

On Wed, Aug 9, 2017 at 10:58 PM 하현  wrote:

> Hi ceph experts.
>
> I confused when set limitation of osd max backfills.
> When osd down recovery occuerred, and osd up is same.
>
> I want to set limitation for backfills to 1.
> So, I set config as below.
>
>
> # ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show|egrep
> "osd_max_backfills|osd_recovery_threads|osd_recovery_max_active|osd_recovery_op_priority"
> "osd_max_backfills": "1",
> "osd_recovery_threads": "1",
> "osd_recovery_max_active": "1",
> "osd_recovery_op_priority": "3",
>
> When osd up it seemed works good but when osd down it seemed not works as
> I thinks.
> Please see the ceph watch logs.
>
> osd down>
> pgmap v898158: 2048 pgs: 20 remapped+peering, 106
> active+undersized+degraded, 1922 active+clean; 641 B/s rd, 253 kB/s wr, 36
> op/s; 45807/1807242 objects degraded (2.535%)
> pgmap v898159: 2048 pgs: *5
> active+undersized+degraded+remapped+backfilling*, 9
> activating+undersized+degraded+remapped, 24
> active+undersized+degraded+remapped+wait_backfill, 20 remapped+peering, 68
> active+undersized+degraded, 1922 active+clean; 510 B/s rd, 498 kB/s wr, 42
> op/s; 41619/1812733 objects degraded (2.296%); 21029/1812733 objects
> misplaced (1.160%); 149 MB/s, 37 objects/s recovering
> pgmap v898168: 2048 pgs: *16
> active+undersized+degraded+remapped+backfilling*, 110
> active+undersized+degraded+remapped+wait_backfill, 1922 active+clean; 508
> B/s rd, 562 kB/s wr, 61 op/s; 54118/1823939 objects degraded (2.967%);
> 86984/1823939 objects misplaced (4.769%); 4025 MB/s, 1006 objects/s
> recovering
> pgmap v898192: 2048 pgs: 3 peering, 1 activating, 13
> active+undersized+degraded+remapped+backfilling, 106
> active+undersized+degraded+remapped+wait_backfill, 1925 active+clean; 10184
> B/s rd, 362 kB/s wr, 47 op/s; 49724/1823312 objects degraded (2.727%);
> 79709/1823312 objects misplaced (4.372%); 1949 MB/s, 487 objects/s
> recovering
> pgmap v898216: 2048 pgs: 1 active+undersized+remapped, 11
> active+undersized+degraded+remapped+backfilling, 98
> active+undersized+degraded+remapped+wait_backfill, 1938 active+clean; 10164
> B/s rd, 251 kB/s wr, 37 op/s; 44429/1823312 objects degraded (2.437%);
> 74037/1823312 objects misplaced (4.061%); 2751 MB/s, 687 objects/s
> recovering
> pgmap v898541: 2048 pgs: 1
> active+undersized+degraded+remapped+backfilling, 2047 active+clean; 218
> kB/s wr, 39 op/s; 261/1806097 objects degraded (0.014%); 543/1806097
> objects misplaced (0.030%); 677 MB/s, 9 keys/s, 176 objects/s recovering
>
> osd up>
> pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
> 2020 active+clean; 5594 B/s rd, 452 kB/s wr, 54 op/s
> pgmap v899277: 2048 pgs: *1 active+remapped+backfilling*, 41
> active+remapped+wait_backfill, 2 activating, 14 peering, 1990 active+clean;
> 595 kB/s wr, 23 op/s; 36111/1823939 objects misplaced (1.980%); 380 MB/s,
> 95 objects/s recovering
> pgmap v899298: 2048 pgs: 1 peering, *1 active+remapped+backfilling*, 40
> active+remapped+wait_backfill, 2006 active+clean; 723 kB/s wr, 13 op/s;
> 34903/1823294 objects misplaced (1.914%); 1113 MB/s, 278 objects/s
> recovering
> pgmap v899342: 2048 pgs: 1 active+remapped+backfilling, 39
> active+remapped+wait_backfill, 2008 active+clean; 5615 B/s rd, 291 kB/s wr,
> 41 op/s; 33150/1822666 objects misplaced (1.819%)
> pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
> 2020 active+clean;5594 B/s rd, 452 kB/s wr, 54 op/s
> pgmap v899796: 2048 pgs: 1 activating, 1 active+remapped+backfilling, 10
> active+remapped+wait_backfill, 2036 active+clean; 235 kB/s wr, 22 op/s;
> 6423/1809085 objects misplaced (0.355%)
>
> in osd down> logs,  we can see 16 backfills, and in osd up> logs, we can
> see only one backfills. Is that correct? If not, what config should I set ?
> Thank you in advance.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Cephfs IO monitoring

2017-08-09 Thread Brady Deetz
Curious if there is a way I could see, in near real-time, the io patterns
for an fs. For instance, what files are currently being read/written and at
what block sizes. I suspect this is a big ask. The only thing I know of
that can provide that level of detail for a filesystem is dtrace with zfs.
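As a rough starting point (not per-file block sizes), the MDS admin socket
exposes client sessions and in-flight metadata requests; a sketch, assuming an
MDS daemon named mds.a and shell access to its host:

# ceph daemon mds.a session ls
# ceph daemon mds.a dump_ops_in_flight
# ceph daemonperf mds.a

File data goes directly between clients and the osds though, so actual
per-file read/write sizes are not visible from the MDS side.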
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] osd backfills and recovery limit issue

2017-08-09 Thread 하현
Hi ceph experts.

I am confused about setting a limit on osd max backfills.
Recovery occurred when an osd went down, and the same when an osd came up.

I want to limit backfills to 1.
So, I set the config as below.


# ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show|egrep
"osd_max_backfills|osd_recovery_threads|osd_recovery_max_active|osd_recovery_op_priority"
"osd_max_backfills": "1",
"osd_recovery_threads": "1",
"osd_recovery_max_active": "1",
"osd_recovery_op_priority": "3",

When an osd comes up it seems to work as expected, but when an osd goes down
it does not work as I think it should.
Please see the ceph watch logs below.

osd down>
pgmap v898158: 2048 pgs: 20 remapped+peering, 106
active+undersized+degraded, 1922 active+clean; 641 B/s rd, 253 kB/s wr, 36
op/s; 45807/1807242 objects degraded (2.535%)
pgmap v898159: 2048 pgs: *5 active+undersized+degraded+remapped+backfilling*,
9 activating+undersized+degraded+remapped, 24
active+undersized+degraded+remapped+wait_backfill,
20 remapped+peering, 68 active+undersized+degraded, 1922 active+clean; 510
B/s rd, 498 kB/s wr, 42 op/s; 41619/1812733 objects degraded (2.296%);
21029/1812733 objects misplaced (1.160%); 149 MB/s, 37 objects/s recovering
pgmap v898168: 2048 pgs: *16
active+undersized+degraded+remapped+backfilling*, 110
active+undersized+degraded+remapped+wait_backfill, 1922 active+clean; 508
B/s rd, 562 kB/s wr, 61 op/s; 54118/1823939 objects degraded (2.967%);
86984/1823939 objects misplaced (4.769%); 4025 MB/s, 1006 objects/s
recovering
pgmap v898192: 2048 pgs: 3 peering, 1 activating, 13
active+undersized+degraded+remapped+backfilling, 106
active+undersized+degraded+remapped+wait_backfill, 1925 active+clean; 10184
B/s rd, 362 kB/s wr, 47 op/s; 49724/1823312 objects degraded (2.727%);
79709/1823312 objects misplaced (4.372%); 1949 MB/s, 487 objects/s
recovering
pgmap v898216: 2048 pgs: 1 active+undersized+remapped, 11
active+undersized+degraded+remapped+backfilling, 98
active+undersized+degraded+remapped+wait_backfill, 1938 active+clean; 10164
B/s rd, 251 kB/s wr, 37 op/s; 44429/1823312 objects degraded (2.437%);
74037/1823312 objects misplaced (4.061%); 2751 MB/s, 687 objects/s
recovering
pgmap v898541: 2048 pgs: 1 active+undersized+degraded+remapped+backfilling,
2047 active+clean; 218 kB/s wr, 39 op/s; 261/1806097 objects degraded
(0.014%); 543/1806097 objects misplaced (0.030%); 677 MB/s, 9 keys/s, 176
objects/s recovering

osd up>
pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
2020 active+clean; 5594 B/s rd, 452 kB/s wr, 54 op/s
pgmap v899277: 2048 pgs: *1 active+remapped+backfilling*, 41
active+remapped+wait_backfill, 2 activating, 14 peering, 1990 active+clean;
595 kB/s wr, 23 op/s; 36111/1823939 objects misplaced (1.980%); 380 MB/s,
95 objects/s recovering
pgmap v899298: 2048 pgs: 1 peering, *1 active+remapped+backfilling*, 40
active+remapped+wait_backfill, 2006 active+clean; 723 kB/s wr, 13 op/s;
34903/1823294 objects misplaced (1.914%); 1113 MB/s, 278 objects/s
recovering
pgmap v899342: 2048 pgs: 1 active+remapped+backfilling, 39
active+remapped+wait_backfill, 2008 active+clean; 5615 B/s rd, 291 kB/s wr,
41 op/s; 33150/1822666 objects misplaced (1.819%)
pgmap v899274: 2048 pgs: 2 activating, 14 peering, 12 remapped+peering,
2020 active+clean;5594 B/s rd, 452 kB/s wr, 54 op/s
pgmap v899796: 2048 pgs: 1 activating, 1 active+remapped+backfilling, 10
active+remapped+wait_backfill, 2036 active+clean; 235 kB/s wr, 22 op/s;
6423/1809085 objects misplaced (0.355%)

In the osd down> logs we can see 16 backfills, and in the osd up> logs we can
see only one backfill. Is that correct? If not, what config should I set?
Thank you in advance.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] New install error

2017-08-09 Thread Brad Hubbard


On Wed, Aug 9, 2017 at 11:42 PM, Timothy Wolgemuth  
wrote:
> Here is the output:
>
> [ceph-deploy@ceph01 my-cluster]$ sudo /usr/bin/ceph --connect-timeout=25
> --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring
> auth get client.admin
> 2017-08-09 09:07:00.519683 7f389700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efffc0617c0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efffc05d670).fault
> 2017-08-09 09:07:03.520486 7f288700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0001f90).fault
> 2017-08-09 09:07:06.521091 7f389700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0006570).fault
> 2017-08-09 09:07:09.521483 7f288700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0002410).fault
> 2017-08-09 09:07:12.522027 7f389700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0002f60).fault
> 2017-08-09 09:07:15.522433 7f288700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff00036d0).fault
> 2017-08-09 09:07:18.523025 7f389700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0002a10).fault
> 2017-08-09 09:07:21.523332 7f288700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0008d40).fault
> 2017-08-09 09:07:24.523353 7f389700  0 -- :/1582396262 >>
> 192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
> c=0x7efff0003df0).fault
> Traceback (most recent call last):  File "/usr/bin/ceph", line 948, in
> 
> retval = main()
>   File "/usr/bin/ceph", line 852, in main
> prefix='get_command_descriptions')

What is your public and cluster network configuration, and which interface is it
using to try to connect to 192.168.100.11:6789? You can use wireshark or similar
to find out. ceph01 needs to be able to communicate with 192.168.100.11 on port
6789, so you need to find out why it currently can't.
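A few quick checks along those lines (a sketch; 192.168.100.11 is taken from
the fault messages above, adjust to your setup):

From ceph01, is the monitor address reachable at all?

# ping -c 3 192.168.100.11
# telnet 192.168.100.11 6789

On the monitor host, is ceph-mon actually listening on port 6789, and does
ceph.conf point at the right network?

# ss -tlnp | grep 6789
# grep -E 'mon_host|public_network' /etc/ceph/ceph.conf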

>
>
>
> On Wed, Aug 9, 2017 at 12:15 AM, Brad Hubbard  wrote:
>>
>> On ceph01 if you login as ceph-deploy and run the following command
>> what output do you get?
>>
>> $ sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon.
>> --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring auth get client.admin
>>
>> On Tue, Aug 8, 2017 at 11:41 PM, Timothy Wolgemuth
>>  wrote:
>> > I have a new installation and following the quick start guide at:
>> >
>> > http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
>> >
>> > Running into the following error in the create-initial step.  See below:
>> >
>> >
>> >
>> > $ ceph-deploy --username ceph-deploy mon create-initial
>> > [ceph_deploy.conf][DEBUG ] found configuration file at:
>> > /home/ceph-deploy/.cephdeploy.conf
>> > [ceph_deploy.cli][INFO  ] Invoked (1.5.37): /bin/ceph-deploy --username
>> > ceph-deploy mon create-initial
>> > [ceph_deploy.cli][INFO  ] ceph-deploy options:
>> > [ceph_deploy.cli][INFO  ]  username  : ceph-deploy
>> > [ceph_deploy.cli][INFO  ]  verbose   : False
>> > [ceph_deploy.cli][INFO  ]  overwrite_conf: False
>> > [ceph_deploy.cli][INFO  ]  subcommand:
>> > create-initial
>> > [ceph_deploy.cli][INFO  ]  quiet : False
>> > [ceph_deploy.cli][INFO  ]  cd_conf   :
>> > 
>> > [ceph_deploy.cli][INFO  ]  cluster   : ceph
>> > [ceph_deploy.cli][INFO  ]  func  : > > at
>> > 0x275e320>
>> > [ceph_deploy.cli][INFO  ]  ceph_conf : None
>> > [ceph_deploy.cli][INFO  ]  default_release   : False
>> > [ceph_deploy.cli][INFO  ]  keyrings  : None
>> > [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph01
>> > [ceph_deploy.mon][DEBUG ] detecting platform for host ceph01 ...
>> > [ceph01][DEBUG ] connection detected need for sudo
>> > [ceph01][DEBUG ] connected to host: ceph-deploy@ceph01
>> > [ceph01][DEBUG ] detect platform information from remote host
>> > [ceph01][DEBUG ] detect machine type
>> > [ceph01][DEBUG ] find the location of an executable
>> > [ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.3.1611 Core
>> > [ceph01][DEBUG ] determining if provided host has same hostname in
>> > remote
>> > [ceph01][DEBUG ] get remote short hostname
>> > [ceph01][DEBUG ] deploying mon to ceph01
>> > [ceph01][DEBUG ] get remote short hostname
>> > [ceph01][DEBUG ] remote hostname: ceph01
>> > [ceph01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
>> > [ceph01][DEBUG ] create the mon path if it does not exist
>> > [ceph01][DEBUG ] checking for done path:
>> > /var/lib/ceph/mon/ceph-ceph01/done

Re: [ceph-users] jewel - radosgw-admin bucket limit check broken?

2017-08-09 Thread Robin H. Johnson
I just hit this too, and found it was fixed in master, so generated a
backport issue & PR:
http://tracker.ceph.com/issues/20966
https://github.com/ceph/ceph/pull/16952

-- 
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Trustee & Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
GnuPG FP : 7D0B3CEB E9B85B1F 825BCECF EE05E6F6 A48F6136


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Iscsi configuration

2017-08-09 Thread Samuel Soulard
Hi Jason,

Thank you so much for all of the information.  This really provides some
good insight into the integration of iSCSI with LIO.  Let's hope the kernel
folks can work fast, haha.

Sam

On Wed, Aug 9, 2017 at 12:48 PM, Jason Dillaman  wrote:

> Yeah -- the issue is that if nodeA is the active path and Windows
> issues some PRs, then if nodeA fails and nodeB is promoted to the
> active path, those PRs won't exist and Windows will balk and fail the
> device. I've seen some posts online w/ folks writing custom pacemaker
> resource scripts to try to manually distribute the PRs to other nodes.
>
> I believe the LIO kernel folks are working on integrating corosync /
> dlm to store the PRs for the near-term solution. I have been pushing
> to allow the PR state to be pushed down to tcmu-runner since for Ceph,
> we already have a distributed system and don't need nor want an added
> layer of complexity in the long term.
>
> On Wed, Aug 9, 2017 at 12:42 PM, Samuel Soulard
>  wrote:
> > Hmm :(  Even for an Active/Passive configuration?  I'm guessing we will
> need
> > to do something with Pacemaker in the meantime?
> >
> > On Wed, Aug 9, 2017 at 12:37 PM, Jason Dillaman 
> wrote:
> >>
> >> I can probably say that it won't work out-of-the-gate for Hyper-V
> >> since it most likely will require iSCSI persistent reservations. That
> >> support is still being added to the kernel because right now it isn't
> >> being distributed to all the target portal group nodes.
> >>
> >> On Wed, Aug 9, 2017 at 12:30 PM, Samuel Soulard
> >>  wrote:
> >> > Thanks! we'll visit back this subject once it is released.  Waiting on
> >> > this
> >> > to perform some tests for Hyper-V/VMware ISCSI LUNs :)
> >> >
> >> > Sam
> >> >
> >> > On Wed, Aug 9, 2017 at 10:35 AM, Jason Dillaman 
> >> > wrote:
> >> >>
> >> >> Yes, RHEL/CentOS 7.4 or kernel 4.13 (once it's released).
> >> >>
> >> >> On Wed, Aug 9, 2017 at 6:56 AM, Samuel Soulard
> >> >> 
> >> >> wrote:
> >> >> > Hi Jason,
> >> >> >
> >> >> > Oh the documentation is awesome:
> >> >> >
> >> >> >
> >> >> > https://github.com/ritz303/ceph/blob/6ab7bc887b265127510c3c3fde6dbad0e047955d/doc/rbd/iscsi-target-cli.rst
> >> >> >
> >> >> > So I assume that this is not yet available for CentOS and requires
> us
> >> >> > to
> >> >> > wait until CentOS 7.4 is released?
> >> >> >
> >> >> > Thanks for the documentation, it makes everything more clear.
> >> >> >
> >> >> > On Tue, Aug 8, 2017 at 9:37 PM, Jason Dillaman <
> jdill...@redhat.com>
> >> >> > wrote:
> >> >> >>
> >> >> >> We are working hard to formalize active/passive iSCSI
> configuration
> >> >> >> across Linux/Windows/ESX via LIO. We have integrated librbd into
> >> >> >> LIO's
> >> >> >> tcmu-runner and have developed a set of support applications to
> >> >> >> managing the clustered configuration of your iSCSI targets. There
> is
> >> >> >> some preliminary documentation here [1] that will be merged once
> we
> >> >> >> can finish our testing.
> >> >> >>
> >> >> >> [1] https://github.com/ceph/ceph/pull/16182
> >> >> >>
> >> >> >> On Tue, Aug 8, 2017 at 4:45 PM, Samuel Soulard
> >> >> >> 
> >> >> >> wrote:
> >> >> >> > Hi all,
> >> >> >> >
> >> >> >> > Platform : Centos 7 Luminous 12.1.2
> >> >> >> >
> >> >> >> > First time here but, are there any guides or guidelines out
> there
> >> >> >> > on
> >> >> >> > how
> >> >> >> > to
> >> >> >> > configure ISCSI gateways in HA so that if one gateway fails, IO
> >> >> >> > can
> >> >> >> > continue
> >> >> >> > on the passive node?
> >> >> >> >
> >> >> >> > What I've done so far
> >> >> >> > -ISCSI node with Ceph client map rbd on boot
> >> >> >> > -Rbd has exclusive-lock feature enabled and layering
> >> >> >> > -Targetd service dependent on rbdmap.service
> >> >> >> > -rbd exported through LUN ISCSI
> >> >> >> > -Windows ISCSI imitator can map the lun and format / write to it
> >> >> >> > (awesome)
> >> >> >> >
> >> >> >> > Now I have no idea where to start to have an active /passive
> >> >> >> > scenario
> >> >> >> > for
> >> >> >> > luns exported with LIO.  Any ideas?
> >> >> >> >
> >> >> >> > Also the web dashboard seem to hint that it can get stats for
> >> >> >> > various
> >> >> >> > clients made on ISCSI gateways, I'm not sure where it pulls that
> >> >> >> > information. Is Luminous now shipping a ISCSI daemon of some
> sort?
> >> >> >> >
> >> >> >> > Thanks all!
> >> >> >> >
> >> >> >> > ___
> >> >> >> > ceph-users mailing list
> >> >> >> > ceph-users@lists.ceph.com
> >> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >> >> >
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> --
> >> >> >> Jason
> >> >> >
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Jason
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jason
> >
> >
>
>
>
> --
> Jason
>

Re: [ceph-users] Iscsi configuration

2017-08-09 Thread Maged Mokhtar
Hi Sam, 

Pacemaker will take care of HA failover, but you will need to propagate
the PR data yourself.
If you are interested in a solution that works out of the box with
Windows, have a look at PetaSAN 
www.petasan.org
It works well with MS hyper-v/storage spaces/Scale Out File Server. 

Cheers
/Maged 

On 2017-08-09 18:42, Samuel Soulard wrote:

> Hmm :(  Even for an Active/Passive configuration?  I'm guessing we will need 
> to do something with Pacemaker in the meantime? 
> 
> On Wed, Aug 9, 2017 at 12:37 PM, Jason Dillaman  wrote:
> 
>> I can probably say that it won't work out-of-the-gate for Hyper-V
>> since it most likely will require iSCSI persistent reservations. That
>> support is still being added to the kernel because right now it isn't
>> being distributed to all the target portal group nodes.
>> 
>> On Wed, Aug 9, 2017 at 12:30 PM, Samuel Soulard
>> 
>>  wrote:
>>> Thanks! we'll visit back this subject once it is released.  Waiting on this
>>> to perform some tests for Hyper-V/VMware ISCSI LUNs :)
>>> 
>>> Sam
>>> 
>>> On Wed, Aug 9, 2017 at 10:35 AM, Jason Dillaman  wrote:
 
 Yes, RHEL/CentOS 7.4 or kernel 4.13 (once it's released).
 
 On Wed, Aug 9, 2017 at 6:56 AM, Samuel Soulard 
 wrote:
> Hi Jason,
> 
> Oh the documentation is awesome:
> 
> https://github.com/ritz303/ceph/blob/6ab7bc887b265127510c3c3fde6dbad0e047955d/doc/rbd/iscsi-target-cli.rst
>  [1]
> 
> So I assume that this is not yet available for CentOS and requires us to
> wait until CentOS 7.4 is released?
> 
> Thanks for the documentation, it makes everything more clear.
> 
> On Tue, Aug 8, 2017 at 9:37 PM, Jason Dillaman 
> wrote:
>> 
>> We are working hard to formalize active/passive iSCSI configuration
>> across Linux/Windows/ESX via LIO. We have integrated librbd into LIO's
>> tcmu-runner and have developed a set of support applications to
>> managing the clustered configuration of your iSCSI targets. There is
>> some preliminary documentation here [1] that will be merged once we
>> can finish our testing.
>> 
>> [1] https://github.com/ceph/ceph/pull/16182 [2]
>> 
>> On Tue, Aug 8, 2017 at 4:45 PM, Samuel Soulard
>> 
>> wrote:
>>> Hi all,
>>>
>>> Platform : Centos 7 Luminous 12.1.2
>>>
>>> First time here but, are there any guides or guidelines out there on
>>> how
>>> to
>>> configure ISCSI gateways in HA so that if one gateway fails, IO can
>>> continue
>>> on the passive node?
>>>
>>> What I've done so far
>>> -ISCSI node with Ceph client map rbd on boot
>>> -Rbd has exclusive-lock feature enabled and layering
>>> -Targetd service dependent on rbdmap.service
>>> -rbd exported through LUN ISCSI
>>> -Windows ISCSI imitator can map the lun and format / write to it
>>> (awesome)
>>>
>>> Now I have no idea where to start to have an active /passive scenario
>>> for
>>> luns exported with LIO.  Any ideas?
>>>
>>> Also the web dashboard seem to hint that it can get stats for various
>>> clients made on ISCSI gateways, I'm not sure where it pulls that
>>> information. Is Luminous now shipping a ISCSI daemon of some sort?
>>>
>>> Thanks all!
>>>
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com [3]
>>>
>> 
>> 
>> 
>> --
>> Jason
> 
> 
 
 
 
 --
 Jason
>>> 
>>> 
>> 
>> --
>> Jason
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

  

Links:
--
[1]
https://github.com/ritz303/ceph/blob/6ab7bc887b265127510c3c3fde6dbad0e047955d/doc/rbd/iscsi-target-cli.rst
[2] https://github.com/ceph/ceph/pull/16182
[3] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Iscsi configuration

2017-08-09 Thread Jason Dillaman
Yeah -- the issue is that if nodeA is the active path and Windows
issues some PRs, then if nodeA fails and nodeB is promoted to the
active path, those PRs won't exist and Windows will balk and fail the
device. I've seen some posts online w/ folks writing custom pacemaker
resource scripts to try to manually distribute the PRs to other nodes.

I believe the LIO kernel folks are working on integrating corosync /
dlm to store the PRs for the near-term solution. I have been pushing
to allow the PR state to be pushed down to tcmu-runner since for Ceph,
we already have a distributed system and don't need nor want an added
layer of complexity in the long term.

On Wed, Aug 9, 2017 at 12:42 PM, Samuel Soulard
 wrote:
> Hmm :(  Even for an Active/Passive configuration?  I'm guessing we will need
> to do something with Pacemaker in the meantime?
>
> On Wed, Aug 9, 2017 at 12:37 PM, Jason Dillaman  wrote:
>>
>> I can probably say that it won't work out-of-the-gate for Hyper-V
>> since it most likely will require iSCSI persistent reservations. That
>> support is still being added to the kernel because right now it isn't
>> being distributed to all the target portal group nodes.
>>
>> On Wed, Aug 9, 2017 at 12:30 PM, Samuel Soulard
>>  wrote:
>> > Thanks! we'll visit back this subject once it is released.  Waiting on
>> > this
>> > to perform some tests for Hyper-V/VMware ISCSI LUNs :)
>> >
>> > Sam
>> >
>> > On Wed, Aug 9, 2017 at 10:35 AM, Jason Dillaman 
>> > wrote:
>> >>
>> >> Yes, RHEL/CentOS 7.4 or kernel 4.13 (once it's released).
>> >>
>> >> On Wed, Aug 9, 2017 at 6:56 AM, Samuel Soulard
>> >> 
>> >> wrote:
>> >> > Hi Jason,
>> >> >
>> >> > Oh the documentation is awesome:
>> >> >
>> >> >
>> >> > https://github.com/ritz303/ceph/blob/6ab7bc887b265127510c3c3fde6dbad0e047955d/doc/rbd/iscsi-target-cli.rst
>> >> >
>> >> > So I assume that this is not yet available for CentOS and requires us
>> >> > to
>> >> > wait until CentOS 7.4 is released?
>> >> >
>> >> > Thanks for the documentation, it makes everything more clear.
>> >> >
>> >> > On Tue, Aug 8, 2017 at 9:37 PM, Jason Dillaman 
>> >> > wrote:
>> >> >>
>> >> >> We are working hard to formalize active/passive iSCSI configuration
>> >> >> across Linux/Windows/ESX via LIO. We have integrated librbd into
>> >> >> LIO's
>> >> >> tcmu-runner and have developed a set of support applications to
>> >> >> managing the clustered configuration of your iSCSI targets. There is
>> >> >> some preliminary documentation here [1] that will be merged once we
>> >> >> can finish our testing.
>> >> >>
>> >> >> [1] https://github.com/ceph/ceph/pull/16182
>> >> >>
>> >> >> On Tue, Aug 8, 2017 at 4:45 PM, Samuel Soulard
>> >> >> 
>> >> >> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > Platform : Centos 7 Luminous 12.1.2
>> >> >> >
>> >> >> > First time here but, are there any guides or guidelines out there
>> >> >> > on
>> >> >> > how
>> >> >> > to
>> >> >> > configure ISCSI gateways in HA so that if one gateway fails, IO
>> >> >> > can
>> >> >> > continue
>> >> >> > on the passive node?
>> >> >> >
>> >> >> > What I've done so far
>> >> >> > -ISCSI node with Ceph client map rbd on boot
>> >> >> > -Rbd has exclusive-lock feature enabled and layering
>> >> >> > -Targetd service dependent on rbdmap.service
>> >> >> > -rbd exported through LUN ISCSI
>> >> >> > -Windows ISCSI imitator can map the lun and format / write to it
>> >> >> > (awesome)
>> >> >> >
>> >> >> > Now I have no idea where to start to have an active /passive
>> >> >> > scenario
>> >> >> > for
>> >> >> > luns exported with LIO.  Any ideas?
>> >> >> >
>> >> >> > Also the web dashboard seem to hint that it can get stats for
>> >> >> > various
>> >> >> > clients made on ISCSI gateways, I'm not sure where it pulls that
>> >> >> > information. Is Luminous now shipping a ISCSI daemon of some sort?
>> >> >> >
>> >> >> > Thanks all!
>> >> >> >
>> >> >> > ___
>> >> >> > ceph-users mailing list
>> >> >> > ceph-users@lists.ceph.com
>> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>> >> >> >
>> >> >>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Jason
>> >> >
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Jason
>> >
>> >
>>
>>
>>
>> --
>> Jason
>
>



-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Iscsi configuration

2017-08-09 Thread Samuel Soulard
Hmm :(  Even for an Active/Passive configuration?  I'm guessing we will
need to do something with Pacemaker in the meantime?

On Wed, Aug 9, 2017 at 12:37 PM, Jason Dillaman  wrote:

> I can probably say that it won't work out-of-the-gate for Hyper-V
> since it most likely will require iSCSI persistent reservations. That
> support is still being added to the kernel because right now it isn't
> being distributed to all the target portal group nodes.
>
> On Wed, Aug 9, 2017 at 12:30 PM, Samuel Soulard
>  wrote:
> > Thanks! we'll visit back this subject once it is released.  Waiting on
> this
> > to perform some tests for Hyper-V/VMware ISCSI LUNs :)
> >
> > Sam
> >
> > On Wed, Aug 9, 2017 at 10:35 AM, Jason Dillaman 
> wrote:
> >>
> >> Yes, RHEL/CentOS 7.4 or kernel 4.13 (once it's released).
> >>
> >> On Wed, Aug 9, 2017 at 6:56 AM, Samuel Soulard <
> samuel.soul...@gmail.com>
> >> wrote:
> >> > Hi Jason,
> >> >
> >> > Oh the documentation is awesome:
> >> >
> >> > https://github.com/ritz303/ceph/blob/6ab7bc887b265127510c3c3fde6dbad0e047955d/doc/rbd/iscsi-target-cli.rst
> >> >
> >> > So I assume that this is not yet available for CentOS and requires us
> to
> >> > wait until CentOS 7.4 is released?
> >> >
> >> > Thanks for the documentation, it makes everything more clear.
> >> >
> >> > On Tue, Aug 8, 2017 at 9:37 PM, Jason Dillaman 
> >> > wrote:
> >> >>
> >> >> We are working hard to formalize active/passive iSCSI configuration
> >> >> across Linux/Windows/ESX via LIO. We have integrated librbd into
> LIO's
> >> >> tcmu-runner and have developed a set of support applications to
> >> >> managing the clustered configuration of your iSCSI targets. There is
> >> >> some preliminary documentation here [1] that will be merged once we
> >> >> can finish our testing.
> >> >>
> >> >> [1] https://github.com/ceph/ceph/pull/16182
> >> >>
> >> >> On Tue, Aug 8, 2017 at 4:45 PM, Samuel Soulard
> >> >> 
> >> >> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > Platform : Centos 7 Luminous 12.1.2
> >> >> >
> >> >> > First time here but, are there any guides or guidelines out there
> on
> >> >> > how
> >> >> > to
> >> >> > configure ISCSI gateways in HA so that if one gateway fails, IO can
> >> >> > continue
> >> >> > on the passive node?
> >> >> >
> >> >> > What I've done so far
> >> >> > -ISCSI node with Ceph client map rbd on boot
> >> >> > -Rbd has exclusive-lock feature enabled and layering
> >> >> > -Targetd service dependent on rbdmap.service
> >> >> > -rbd exported through LUN ISCSI
> >> >> > -Windows ISCSI imitator can map the lun and format / write to it
> >> >> > (awesome)
> >> >> >
> >> >> > Now I have no idea where to start to have an active /passive
> scenario
> >> >> > for
> >> >> > luns exported with LIO.  Any ideas?
> >> >> >
> >> >> > Also the web dashboard seem to hint that it can get stats for
> various
> >> >> > clients made on ISCSI gateways, I'm not sure where it pulls that
> >> >> > information. Is Luminous now shipping a ISCSI daemon of some sort?
> >> >> >
> >> >> > Thanks all!
> >> >> >
> >> >> > ___
> >> >> > ceph-users mailing list
> >> >> > ceph-users@lists.ceph.com
> >> >> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >> >> >
> >> >>
> >> >>
> >> >>
> >> >> --
> >> >> Jason
> >> >
> >> >
> >>
> >>
> >>
> >> --
> >> Jason
> >
> >
>
>
>
> --
> Jason
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-09 Thread Marc Roos

This is for osd.0 (more below)
bluestore(/var/lib/ceph/osd/ceph-0) _verify_csum bad crc32c/0x1000 
checksum at blob offset 0x0, got 0x1a128a93, expected 0x90407f75, device 
location [0x5826c1~1000], logical extent 0x0~1000
bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 
checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, device 
location [0x15a017~1000], logical extent 0x0~1000,
bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 
checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, device 
location [0x2daea~1000], logical extent 0x0~1000,


I could not get this to work.
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --log-file out 
--debug-bluestore 30 --no-log-to-stderr
too many positional options have been specified on the command line
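Maybe the extra debug options trip the positional-argument parsing; the tool
seems to have its own --log-file and --log-level options (to be verified with
ceph-bluestore-tool --help), so something like this might work instead:

ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --log-file /tmp/fsck.log --log-level 30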

ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --out-dir /tmp 
 > /tmp/fsck.out 2>&1 

2017-08-09 11:01:34.208832 7fafd0171b80 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-08-09 11:01:34.208927 7fafd0171b80 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
action fsck
2017-08-09 11:01:34.238734 7fafd0171b80 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-08-09 11:01:34.238937 7fafd0171b80  1 
bluestore(/var/lib/ceph/osd/ceph-12) fsck (shallow) start
2017-08-09 11:01:34.239001 7fafd0171b80  1 bdev create path 
/var/lib/ceph/osd/ceph-12/block type kernel
2017-08-09 11:01:34.239013 7fafd0171b80  1 bdev(0x7fafd2ac9200 
/var/lib/ceph/osd/ceph-12/block) open path 
/var/lib/ceph/osd/ceph-12/block
2017-08-09 11:01:34.239263 7fafd0171b80  1 bdev(0x7fafd2ac9200 
/var/lib/ceph/osd/ceph-12/block) open size 4000681103360 (0x3a37b2d1000, 
3725 GB) block_size 4096 (4096 B) rotational
2017-08-09 11:01:34.239549 7fafd0171b80  1 
bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes max 0.5 < ratio 
0.99
2017-08-09 11:01:34.239577 7fafd0171b80  1 
bluestore(/var/lib/ceph/osd/ceph-12) _set_cache_sizes cache_size 
1073741824 meta 0.5 kv 0.5 data 0
2017-08-09 11:01:34.239659 7fafd0171b80  1 bdev create path 
/var/lib/ceph/osd/ceph-12/block type kernel
2017-08-09 11:01:34.239664 7fafd0171b80  1 bdev(0x7fafd2ac8e00 
/var/lib/ceph/osd/ceph-12/block) open path 
/var/lib/ceph/osd/ceph-12/block
2017-08-09 11:01:34.239815 7fafd0171b80  1 bdev(0x7fafd2ac8e00 
/var/lib/ceph/osd/ceph-12/block) open size 4000681103360 (0x3a37b2d1000, 
3725 GB) block_size 4096 (4096 B) rotational
2017-08-09 11:01:34.239827 7fafd0171b80  1 bluefs add_block_device bdev 
1 path /var/lib/ceph/osd/ceph-12/block size 3725 GB
2017-08-09 11:01:34.239857 7fafd0171b80  1 bluefs mount
2017-08-09 11:01:35.239635 7fafd0171b80  0  set rocksdb option 
compaction_readahead_size = 2097152
2017-08-09 11:01:35.239654 7fafd0171b80  0  set rocksdb option 
compression = kNoCompression
2017-08-09 11:01:35.239677 7fafd0171b80  0  set rocksdb option 
max_write_buffer_number = 4
2017-08-09 11:01:35.239683 7fafd0171b80  0  set rocksdb option 
min_write_buffer_number_to_merge = 1
2017-08-09 11:01:35.239686 7fafd0171b80  0  set rocksdb option 
recycle_log_file_num = 4
2017-08-09 11:01:35.239690 7fafd0171b80  0  set rocksdb option 
writable_file_max_buffer_size = 0
2017-08-09 11:01:35.239694 7fafd0171b80  0  set rocksdb option 
write_buffer_size = 268435456
2017-08-09 11:01:35.239725 7fafd0171b80  0  set rocksdb option 
compaction_readahead_size = 2097152
2017-08-09 11:01:35.239730 7fafd0171b80  0  set rocksdb option 
compression = kNoCompression
2017-08-09 11:01:35.239735 7fafd0171b80  0  set rocksdb option 
max_write_buffer_number = 4
2017-08-09 11:01:35.239739 7fafd0171b80  0  set rocksdb option 
min_write_buffer_number_to_merge = 1
2017-08-09 11:01:35.239742 7fafd0171b80  0  set rocksdb option 
recycle_log_file_num = 4
2017-08-09 11:01:35.239746 7fafd0171b80  0  set rocksdb option 
writable_file_max_buffer_size = 0
2017-08-09 11:01:35.239749 7fafd0171b80  0  set rocksdb option 
write_buffer_size = 268435456
2017-08-09 11:01:35.239948 7fafd0171b80  4 rocksdb: RocksDB version: 
5.4.0

2017-08-09 11:01:35.239959 7fafd0171b80  4 rocksdb: Git sha 
rocksdb_build_git_sha:@0@
2017-08-09 11:01:35.239961 7fafd0171b80  4 rocksdb: Compile date Jul 17 
2017
2017-08-09 11:01:35.239964 7fafd0171b80  4 rocksdb: DB SUMMARY

2017-08-09 11:01:35.240004 7fafd0171b80  4 rocksdb: CURRENT file:  
CURRENT

2017-08-09 11:01:35.240006 7fafd0171b80  4 rocksdb: IDENTITY file:  
IDENTITY

2017-08-09 11:01:35.240010 7fafd0171b80  4 rocksdb: MANIFEST file:  
MANIFEST-001083 size: 1868 Bytes

2017-08-09 11:01:35.240013 7fafd0171b80  4 rocksdb: SST files in db dir, 
Total Num: 15, files: 000529.sst 000733.sst 000904.sst 000905.sst 
000906.sst 000907.sst 000908.sst 000909.sst 000910.sst 

2017-08-09 11:01:35.240015 7fafd0171b80  4 rocksdb: Write Ahead Log file 
in db: 001084.log size: 160853870 ; 

2017-08-09 11:01:35.240018 7fafd0171b80  4 rocksdb:   

Re: [ceph-users] lease_timeout - new election

2017-08-09 Thread Webert de Souza Lima
Hi David,

thanks for your feedback.

With that in mind, I did remove a 15TB RBD pool about an hour or so before
this happened.
I wouldn't have thought it was related, because there was nothing different
going on after I removed it. Not even high system load.

But considering what you said, I think it could have been due to OSD
operations related to that pool removal.






Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*

On Wed, Aug 9, 2017 at 10:15 AM, David Turner  wrote:

> I just want to point out that there are many different types of network
> issues that don't involve entire networks. Bad nic, bad/loose cable, a
> service on a server restarting or modifying the network stack, etc.
>
> That said there are other things that can prevent an mds service, or any
> service, from responding to the mons and being wrongly marked down. It happens
> to osds enough that they even have the ability to write in their logs that
> they were wrongly marked down. That usually happens when the service is so
> busy with an operation that it can't get to the request from the mon fast
> enough and it gets marked down. This could also be environmental on the mds
> server. If something else on the host is using too many resources,
> preventing the mds service from having what it needs, this could easily
> happen.
>
> What level of granularity do you have in your monitoring to tell what your
> system state was when this happened? Is there a time of day it is more
> likely to happen (expect to find a Cron at that time)?
>
> On Wed, Aug 9, 2017, 8:37 AM Webert de Souza Lima 
> wrote:
>
>> Hi,
>>
>> I recently had a mds outage beucase the mds suicided due to "dne in the
>> mds map".
>> I've asked it here before and I know that happens because the monitors
>> took out this mds from the mds map even though it was alive.
>>
>> Weird thing, there was no network related issues happening at the time,
>> which if there was, it would impact many other systems.
>>
>> I found this in the mon logs, and i'd like to understand it better:
>>  lease_timeout -- calling new election
>>
>> full logs:
>>
>> 2017-08-08 23:12:33.286908 7f2b8398d700  1 leveldb: Manual compaction at
>> level-1 from 'pgmap_pg\x009.a' @ 1830392430 : 1 .. 'paxos\x0057687834' @ 0
>> : 0; will stop at (end)
>>
>> 2017-08-08 23:12:36.885087 7f2b86f9a700  0 
>> mon.bhs1-mail02-ds03@2(peon).data_health(3524)
>> update_stats avail 81% total 19555 MB, used 2632 MB, avail 15907 MB
>> 2017-08-08 23:13:29.357625 7f2b86f9a700  1 
>> mon.bhs1-mail02-ds03@2(peon).paxos(paxos
>> updating c 57687834..57688383) lease_timeout -- calling new election
>> 2017-08-08 23:13:29.358965 7f2b86799700  0 log_channel(cluster) log [INF]
>> : mon.bhs1-mail02-ds03 calling new monitor election
>> 2017-08-08 23:13:29.359128 7f2b86799700  1 
>> mon.bhs1-mail02-ds03@2(electing).elector(3524)
>> init, last seen epoch 3524
>> 2017-08-08 23:13:35.383530 7f2b86799700  1 mon.bhs1-mail02-ds03@2(peon).osd
>> e12617 e12617: 19 osds: 19 up, 19 in
>> 2017-08-08 23:13:35.605839 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
>> e18460 print_map
>> e18460
>> enable_multiple, ever_enabled_multiple: 0,0
>> compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
>> uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
>>
>> Filesystem 'cephfs' (2)
>> fs_name cephfs
>> epoch   18460
>> flags   0
>> created 2016-08-01 11:07:47.592124
>> modified2017-07-03 10:32:44.426431
>> tableserver 0
>> root0
>> session_timeout 60
>> session_autoclose   300
>> max_file_size   1099511627776
>> last_failure0
>> last_failure_osd_epoch  12617
>> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
>> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
>> uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
>> max_mds 1
>> in  0
>> up  {0=1574278}
>> failed
>> damaged
>> stopped
>> data_pools  8,9
>> metadata_pool   7
>> inline_data disabled
>> 1574278:10.0.2.4:6800/2556733458 'd' mds.0.18460 up:replay seq 1
>> laggy since 2017-08-08 23:13:35.174109 (standby for rank 0)
>>
>>
>>
>> 2017-08-08 23:13:35.606303 7f2b86799700  0 log_channel(cluster) log [INF]
>> : mon.bhs1-mail02-ds03 calling new monitor election
>> 2017-08-08 23:13:35.606361 7f2b86799700  1 
>> mon.bhs1-mail02-ds03@2(electing).elector(3526)
>> init, last seen epoch 3526
>> 2017-08-08 23:13:36.885540 7f2b86f9a700  0 
>> mon.bhs1-mail02-ds03@2(peon).data_health(3528)
>> update_stats avail 81% total 19555 MB, used 2636 MB, avail 15903 MB
>> 2017-08-08 23:13:38.311777 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
>> e18461 print_map
>>
>>
>> Regards,
>>
>> Webert Lima
>> DevOps Engineer at MAV Tecnologia
>> *Belo Horizonte - Brasil*
>> ___

Re: [ceph-users] New install error

2017-08-09 Thread Timothy Wolgemuth
Here is the output:

[ceph-deploy@ceph01 my-cluster]$ sudo /usr/bin/ceph --connect-timeout=25
--cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring
auth get client.admin
2017-08-09 09:07:00.519683 7f389700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efffc0617c0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efffc05d670).fault
2017-08-09 09:07:03.520486 7f288700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0001f90).fault
2017-08-09 09:07:06.521091 7f389700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0006570).fault
2017-08-09 09:07:09.521483 7f288700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0002410).fault
2017-08-09 09:07:12.522027 7f389700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0002f60).fault
2017-08-09 09:07:15.522433 7f288700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff00036d0).fault
2017-08-09 09:07:18.523025 7f389700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0002a10).fault
2017-08-09 09:07:21.523332 7f288700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efffc80 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0008d40).fault
2017-08-09 09:07:24.523353 7f389700  0 -- :/1582396262 >>
192.168.100.11:6789/0 pipe(0x7efff00052b0 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7efff0003df0).fault
Traceback (most recent call last):  File "/usr/bin/ceph", line 948, in

retval = main()
  File "/usr/bin/ceph", line 852, in main
prefix='get_command_descriptions')



On Wed, Aug 9, 2017 at 12:15 AM, Brad Hubbard  wrote:

> On ceph01 if you login as ceph-deploy and run the following command
> what output do you get?
>
> $ sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon.
> --keyring=/var/lib/ceph/mon/ceph-ceph01/keyring auth get client.admin
>
> On Tue, Aug 8, 2017 at 11:41 PM, Timothy Wolgemuth
>  wrote:
> > I have a new installation and following the quick start guide at:
> >
> > http://docs.ceph.com/docs/master/start/quick-ceph-deploy/
> >
> > Running into the following error in the create-initial step.  See below:
> >
> >
> >
> > $ ceph-deploy --username ceph-deploy mon create-initial
> > [ceph_deploy.conf][DEBUG ] found configuration file at:
> > /home/ceph-deploy/.cephdeploy.conf
> > [ceph_deploy.cli][INFO  ] Invoked (1.5.37): /bin/ceph-deploy --username
> > ceph-deploy mon create-initial
> > [ceph_deploy.cli][INFO  ] ceph-deploy options:
> > [ceph_deploy.cli][INFO  ]  username  : ceph-deploy
> > [ceph_deploy.cli][INFO  ]  verbose   : False
> > [ceph_deploy.cli][INFO  ]  overwrite_conf: False
> > [ceph_deploy.cli][INFO  ]  subcommand: create-initial
> > [ceph_deploy.cli][INFO  ]  quiet : False
> > [ceph_deploy.cli][INFO  ]  cd_conf   :
> > 
> > [ceph_deploy.cli][INFO  ]  cluster   : ceph
> > [ceph_deploy.cli][INFO  ]  func  :  at
> > 0x275e320>
> > [ceph_deploy.cli][INFO  ]  ceph_conf : None
> > [ceph_deploy.cli][INFO  ]  default_release   : False
> > [ceph_deploy.cli][INFO  ]  keyrings  : None
> > [ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph01
> > [ceph_deploy.mon][DEBUG ] detecting platform for host ceph01 ...
> > [ceph01][DEBUG ] connection detected need for sudo
> > [ceph01][DEBUG ] connected to host: ceph-deploy@ceph01
> > [ceph01][DEBUG ] detect platform information from remote host
> > [ceph01][DEBUG ] detect machine type
> > [ceph01][DEBUG ] find the location of an executable
> > [ceph_deploy.mon][INFO  ] distro info: CentOS Linux 7.3.1611 Core
> > [ceph01][DEBUG ] determining if provided host has same hostname in remote
> > [ceph01][DEBUG ] get remote short hostname
> > [ceph01][DEBUG ] deploying mon to ceph01
> > [ceph01][DEBUG ] get remote short hostname
> > [ceph01][DEBUG ] remote hostname: ceph01
> > [ceph01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
> > [ceph01][DEBUG ] create the mon path if it does not exist
> > [ceph01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph01/
> done
> > [ceph01][DEBUG ] create a done file to avoid re-doing the mon deployment
> > [ceph01][DEBUG ] create the init path if it does not exist
> > [ceph01][INFO  ] Running command: sudo systemctl enable ceph.target
> > [ceph01][INFO  ] Running command: sudo systemctl enable ceph-mon@ceph01
> > [ceph01][INFO  ] Running command: sudo systemctl start ceph-mon@ceph01
> > [ceph01][INFO  ] Running command: sudo ceph --cluster=ceph --admin-daemon
> > /var/run/ceph/ceph-mon.ceph01.asok mon_status
> > [ceph01][DEBUG ]
> > 

Re: [ceph-users] IO Error reaching client when primary osd get funky but secondaries are ok

2017-08-09 Thread Peter Gervai
Hello David,

On Wed, Aug 9, 2017 at 3:08 PM, David Turner  wrote:

> When exactly is the timeline of when the io error happened?

The timeline was included in the email, at hour:min:sec resolution. I
left out the milliseconds since they don't really change anything.

> If the primary
> osd was dead, but not marked down in the cluster yet,

The email showed when the osd went up, so before that it was supposed
to be down, as far as I can tell from the logs, unless there was an
up-down somewhere I have missed. I believe a boot-failed osd won't
come up.

> then the cluster would
> sit there and expect that osd too respond.

Suppose the osd had been up and in (which I believe it wasn't) and it
failed to respond; what is supposed to happen? I thought librados would
see the failure or timeout and would try to contact the secondaries, and
definitely not send an IO error upwards unless all possibilities failed.

> If this definitely happened after
> the primary osd was marked down, then it's a different story.

Seems so, based on the logs I was able to correlate, but I cannot be
absolutely sure.

> I'm confused about you saying 1 osd was down/out and 2 other osds we're down
> but not out.

Okay, that may have been a mistake on my part: there were 2 failed osds
and one was about to be replaced first, and since that failed we
kind of hesitated to replace the other one. :-/ The email was heavily
trimmed to remove fluff, so this info may have been lost.

> Was this on the same host while you were replacing the disk?

The logs were gathered from many hosts, osds and mons, since events
happened simultaneously. The replacement happened on the same host, which I
believe is expected.

> Is your failure domain host or osd?

Host (and datacenter).

> What version of ceph are you running?

See the first line of my mail: version 0.94.10 (hammer)

Thanks,
Peter
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] lease_timeout - new election

2017-08-09 Thread David Turner
I just want to point out that there are many different types of network
issues that don't involve entire networks. Bad nic, bad/loose cable, a
service on a server restarting or modifying the network stack, etc.

That said there are other things that can prevent an mds service, or any
service, from responding to the mons and being wrongly marked down. It happens
to osds enough that they even have the ability to write in their logs that
they were wrongly marked down. That usually happens when the service is so
busy with an operation that it can't get to the request from the mon fast
enough and it gets marked down. This could also be environmental on the mds
server. If something else on the host is using too many resources,
preventing the mds service from having what it needs, this could easily
happen.

What level of granularity do you have in your monitoring to tell what your
system state was when this happened? Is there a time of day it is more
likely to happen (expect to find a Cron at that time)?
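If it does line up with a time of day, listing the scheduled jobs on the mds
and mon hosts is a quick way to find the culprit; roughly (standard locations,
adjust per distro):

crontab -l
ls /etc/cron.d /etc/cron.hourly /etc/cron.daily
cat /etc/crontab
systemctl list-timers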

On Wed, Aug 9, 2017, 8:37 AM Webert de Souza Lima 
wrote:

> Hi,
>
> I recently had a mds outage beucase the mds suicided due to "dne in the
> mds map".
> I've asked it here before and I know that happens because the monitors
> took out this mds from the mds map even though it was alive.
>
> Weird thing, there was no network related issues happening at the time,
> which if there was, it would impact many other systems.
>
> I found this in the mon logs, and i'd like to understand it better:
>  lease_timeout -- calling new election
>
> full logs:
>
> 2017-08-08 23:12:33.286908 7f2b8398d700  1 leveldb: Manual compaction at
> level-1 from 'pgmap_pg\x009.a' @ 1830392430 : 1 .. 'paxos\x0057687834' @ 0
> : 0; will stop at (end)
>
> 2017-08-08 23:12:36.885087 7f2b86f9a700  0 
> mon.bhs1-mail02-ds03@2(peon).data_health(3524)
> update_stats avail 81% total 19555 MB, used 2632 MB, avail 15907 MB
> 2017-08-08 23:13:29.357625 7f2b86f9a700  1 
> mon.bhs1-mail02-ds03@2(peon).paxos(paxos
> updating c 57687834..57688383) lease_timeout -- calling new election
> 2017-08-08 23:13:29.358965 7f2b86799700  0 log_channel(cluster) log [INF]
> : mon.bhs1-mail02-ds03 calling new monitor election
> 2017-08-08 23:13:29.359128 7f2b86799700  1 
> mon.bhs1-mail02-ds03@2(electing).elector(3524)
> init, last seen epoch 3524
> 2017-08-08 23:13:35.383530 7f2b86799700  1 mon.bhs1-mail02-ds03@2(peon).osd
> e12617 e12617: 19 osds: 19 up, 19 in
> 2017-08-08 23:13:35.605839 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
> e18460 print_map
> e18460
> enable_multiple, ever_enabled_multiple: 0,0
> compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
> uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
>
> Filesystem 'cephfs' (2)
> fs_name cephfs
> epoch   18460
> flags   0
> created 2016-08-01 11:07:47.592124
> modified2017-07-03 10:32:44.426431
> tableserver 0
> root0
> session_timeout 60
> session_autoclose   300
> max_file_size   1099511627776
> last_failure0
> last_failure_osd_epoch  12617
> compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
> ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
> uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
> max_mds 1
> in  0
> up  {0=1574278}
> failed
> damaged
> stopped
> data_pools  8,9
> metadata_pool   7
> inline_data disabled
> 1574278:10.0.2.4:6800/2556733458 'd' mds.0.18460 up:replay seq 1
> laggy since 2017-08-08 23:13:35.174109 (standby for rank 0)
>
>
>
> 2017-08-08 23:13:35.606303 7f2b86799700  0 log_channel(cluster) log [INF]
> : mon.bhs1-mail02-ds03 calling new monitor election
> 2017-08-08 23:13:35.606361 7f2b86799700  1 
> mon.bhs1-mail02-ds03@2(electing).elector(3526)
> init, last seen epoch 3526
> 2017-08-08 23:13:36.885540 7f2b86f9a700  0 
> mon.bhs1-mail02-ds03@2(peon).data_health(3528)
> update_stats avail 81% total 19555 MB, used 2636 MB, avail 15903 MB
> 2017-08-08 23:13:38.311777 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
> e18461 print_map
>
>
> Regards,
>
> Webert Lima
> DevOps Engineer at MAV Tecnologia
> *Belo Horizonte - Brasil*
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] IO Error reaching client when primary osd get funky but secondaries are ok

2017-08-09 Thread David Turner
What exactly was the timeline of when the IO error happened? If the primary
osd was dead, but not yet marked down in the cluster, then the cluster would
sit there and expect that osd to respond. If this definitely happened after
the primary osd was marked down, then it's a different story.

I'm confused about you saying 1 osd was down/out and 2 other osds were down
but not out. Were these on the same host while you were replacing the disk?
Is your failure domain host or osd? What version of ceph are you running?
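
For reference, a quick way to check those last two points (standard ceph CLI;
nothing here is specific to your cluster):

# the failure domain shows up as "type host" vs "type osd" in the rule steps
ceph osd crush rule dump

# version on the client node and on the osds
ceph --version
ceph tell osd.* version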

On Wed, Aug 9, 2017, 7:32 AM Peter Gervai  wrote:

> Hello,
>
> ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)
>
> We had a few problems related to the simple operation of replacing a
> failed OSD, and some clarification would be appreciated. It is not
> very simple to observe what specifically happened (the timeline was
> gathered from half a dozen logs), so apologies for any vagueness,
> which could be fixed by looking at further specific logs on request.
>
> The main problem was that we had to replace a failed OSD. There were 2
> down+out but otherwise known (not deleted) OSDs. We removed (deleted) one.
> That changes the CRUSH map, and rebalancing starts (noout does not matter
> since the osd was already out; it could only have been stopped by the scary
> norecover, but that was not flagged at the time; I will try
> nobackfill/norebalance next time, which looks safer). Rebalancing finished
> fine (25% of objects were reported as misplaced, which is a PITA, but there
> were not many objects on that cluster). This is the prologue; so far it's
> all fine.
>
> We plugged in (and created) the new OSD, but due to the environment
> and some admin errors [wasn't me! :)] the OSD at start was not able
> to umount its temporary filesystem, which seems to be used for initial
> creation. What I observed [from the logs] is:
>
> - 14:12:00, osd6 created, enters the osdmap, down+out
> - 14:12:02, replaced osd6 started, boots, tries to create initial osd
> layout
> - 14:12:03, osd6 crash due to failed umount / file not found
> - 14:12:07, some other osds are logging warnings like (may not be
> important):
>misdirected client (some that osd not in the set, others just logged
> the pg)
> - 14:12:07, one of the clients get IO error (this one was actually
> pretty fatal):
>rbd: rbd1: write 1000 at 40779000 (379000)
>rbd: rbd1:   result -6 xferred 1000
>blk_update_request: I/O error, dev rbd1, sector 2112456
>EXT4-fs warning (device rbd1): ext4_end_bio:329: I/O error -6
> writing to inode 399502 (offset 0 size 0 starting block 264058)
>   Buffer I/O error on device rbd1, logical block 264057
> - 14:12:17, other client gets IO error (this one's been lucky):
>rbd: rbd1: write 1000 at c84795000 (395000)
>rbd: rbd1:   result -6 xferred 1000
>blk_update_request: I/O error, dev rbd1, sector 105004200
> - 14:12:27, libceph: osd6 weight 0x1 (in); in+down: the osd6 is
> crashed at that point and hasn't been restarted yet
>
> - 14:13:19, osd6 started again
> - 14:13:22, libceph: osd6 up
> - from this on everything's fine, apart from the crashed VM :-/
>
> The main problem is of course the IO error that reached the client and
> knocked out the FS while there were 2 replica osds active. I haven't found
> the specifics of how this is handled when the primary fails or acts funky,
> which is my guess as to what happened here.
>
> I would like to understand why the IO error occurred, how to prevent it
> if possible, and whether this is something that has already been taken
> care of in later ceph versions.
>
> Your shared wisdom would be appreciated.
>
> Thanks,
> Peter
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] lease_timeout - new election

2017-08-09 Thread Webert de Souza Lima
Hi,

I recently had an mds outage because the mds suicided due to "dne in the mds
map".
I've asked about it here before and I know it happens because the monitors
took this mds out of the mds map even though it was alive.

Weird thing: there were no network-related issues happening at the time;
if there had been, they would have impacted many other systems.

I found this in the mon logs, and I'd like to understand it better:
 lease_timeout -- calling new election

full logs:

2017-08-08 23:12:33.286908 7f2b8398d700  1 leveldb: Manual compaction at
level-1 from 'pgmap_pg\x009.a' @ 1830392430 : 1 .. 'paxos\x0057687834' @ 0
: 0; will stop at (end)

2017-08-08 23:12:36.885087 7f2b86f9a700  0
mon.bhs1-mail02-ds03@2(peon).data_health(3524)
update_stats avail 81% total 19555 MB, used 2632 MB, avail 15907 MB
2017-08-08 23:13:29.357625 7f2b86f9a700  1
mon.bhs1-mail02-ds03@2(peon).paxos(paxos
updating c 57687834..57688383) lease_timeout -- calling new election
2017-08-08 23:13:29.358965 7f2b86799700  0 log_channel(cluster) log [INF] :
mon.bhs1-mail02-ds03 calling new monitor election
2017-08-08 23:13:29.359128 7f2b86799700  1
mon.bhs1-mail02-ds03@2(electing).elector(3524)
init, last seen epoch 3524
2017-08-08 23:13:35.383530 7f2b86799700  1 mon.bhs1-mail02-ds03@2(peon).osd
e12617 e12617: 19 osds: 19 up, 19 in
2017-08-08 23:13:35.605839 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
e18460 print_map
e18460
enable_multiple, ever_enabled_multiple: 0,0
compat: compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}

Filesystem 'cephfs' (2)
fs_name cephfs
epoch   18460
flags   0
created 2016-08-01 11:07:47.592124
modified    2017-07-03 10:32:44.426431
tableserver 0
root0
session_timeout 60
session_autoclose   300
max_file_size   1099511627776
last_failure0
last_failure_osd_epoch  12617
compat  compat={},rocompat={},incompat={1=base v0.20,2=client writeable
ranges,3=default file layouts on dirs,4=dir inode in separate object,5=mds
uses versioned encoding,6=dirfrag is stored in omap,8=file layout v2}
max_mds 1
in  0
up  {0=1574278}
failed
damaged
stopped
data_pools  8,9
metadata_pool   7
inline_data disabled
1574278:10.0.2.4:6800/2556733458 'd' mds.0.18460 up:replay seq 1
laggy since 2017-08-08 23:13:35.174109 (standby for rank 0)



2017-08-08 23:13:35.606303 7f2b86799700  0 log_channel(cluster) log [INF] :
mon.bhs1-mail02-ds03 calling new monitor election
2017-08-08 23:13:35.606361 7f2b86799700  1
mon.bhs1-mail02-ds03@2(electing).elector(3526)
init, last seen epoch 3526
2017-08-08 23:13:36.885540 7f2b86f9a700  0
mon.bhs1-mail02-ds03@2(peon).data_health(3528)
update_stats avail 81% total 19555 MB, used 2636 MB, avail 15903 MB
2017-08-08 23:13:38.311777 7f2b86799700  0 mon.bhs1-mail02-ds03@2(peon).mds
e18461 print_map
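
For reference, the lease_timeout message is the peon monitor noticing that the
lease granted by the leader expired without renewal, and an mds is typically
dropped from the map when its beacons are not acknowledged within the grace
period. A quick way to inspect the values in effect (admin socket name below
is only assumed from the mon id in the logs):

ceph --admin-daemon /var/run/ceph/ceph-mon.bhs1-mail02-ds03.asok config show | egrep "mon_lease"
ceph --admin-daemon /var/run/ceph/ceph-mon.bhs1-mail02-ds03.asok config get mds_beacon_grace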


Regards,

Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] IO Error reaching client when primary osd get funky but secondaries are ok

2017-08-09 Thread Peter Gervai
Hello,

ceph version 0.94.10 (b1e0532418e4631af01acbc0cedd426f1905f4af)

We had a few problems related to the simple operation of replacing a
failed OSD, and some clarification would be appreciated. It is not
very simple to observe what specifically happened (the timeline was
gathered from half a dozen logs), so apologies for any vagueness,
which could be fixed by looking at further specific logs on request.

The main problem was that we had to replace a failed OSD. There were 2
down+out but otherwise known (not deleted) OSDs. We removed (deleted) one.
That changes the CRUSH map, and rebalancing starts (noout does not matter
since the osd was already out; it could only have been stopped by the scary
norecover, but that was not flagged at the time; I will try
nobackfill/norebalance next time, which looks safer). Rebalancing finished
fine (25% of objects were reported as misplaced, which is a PITA, but there
were not many objects on that cluster). This is the prologue; so far it's
all fine.
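
A minimal sketch of that flag juggling, in case it helps (standard ceph CLI;
the osd id is taken from the timeline below purely for illustration):

# pause data movement before removing the dead OSD from the crush map
ceph osd set norebalance
ceph osd set nobackfill

# remove the failed OSD (id assumed for illustration)
ceph osd crush remove osd.6
ceph auth del osd.6
ceph osd rm osd.6

# ... replace the disk and recreate the OSD, then let data movement resume
ceph osd unset nobackfill
ceph osd unset norebalance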

We plugged in (and created) the new OSD, but due to the environment
and some admin errors [wasn't me! :)] the OSD at start was not able
to umount its temporary filesystem, which seems to be used for initial
creation. What I observed [from the logs] is:

- 14:12:00, osd6 created, enters the osdmap, down+out
- 14:12:02, replaced osd6 started, boots, tries to create initial osd layout
- 14:12:03, osd6 crash due to failed umount / file not found
- 14:12:07, some other osds are logging warnings like (may not be important):
   misdirected client (some that osd not in the set, others just logged the pg)
- 14:12:07, one of the clients get IO error (this one was actually
pretty fatal):
   rbd: rbd1: write 1000 at 40779000 (379000)
   rbd: rbd1:   result -6 xferred 1000
   blk_update_request: I/O error, dev rbd1, sector 2112456
   EXT4-fs warning (device rbd1): ext4_end_bio:329: I/O error -6
writing to inode 399502 (offset 0 size 0 starting block 264058)
  Buffer I/O error on device rbd1, logical block 264057
- 14:12:17, other client gets IO error (this one's been lucky):
   rbd: rbd1: write 1000 at c84795000 (395000)
   rbd: rbd1:   result -6 xferred 1000
   blk_update_request: I/O error, dev rbd1, sector 105004200
- 14:12:27, libceph: osd6 weight 0x1 (in); in+down: the osd6 is
crashed at that point and hasn't been restarted yet

- 14:13:19, osd6 started again
- 14:13:22, libceph: osd6 up
- from this on everything's fine, apart from the crashed VM :-/

The main problem is of course the IO error that reached the client and
knocked out the FS while there were 2 replica osds active. I haven't found
the specifics of how this is handled when the primary fails or acts funky,
which is my guess as to what happened here.
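
For context, a quick way to check the replication settings involved and which
OSDs serve a given object (the pool name 'rbd' is only an assumption here):

# replica count and the minimum replicas required to keep serving IO
ceph osd pool get rbd size
ceph osd pool get rbd min_size

# which OSDs (and which primary) a given object maps to
ceph osd map rbd <object-name>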

I would like to understand why the IO error occurred, how to prevent it
if possible, and whether this is something that has already been taken
care of in later ceph versions.

Your shared wisdom would be appreciated.

Thanks,
Peter
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Container deployment

2017-08-09 Thread 徐蕴
Hi,

I’m trying to deploy OpenStack with OpenStack Kolla. With Kolla I can easily
deploy most OpenStack components and ceph in containers. I wonder if there are
any reliability or performance issues with containers/docker?

Thank you!

Xu Yun
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com