Re: [ceph-users] Implement replication network with live cluster
If I remember right, someone has done this on a live cluster without any issues. I seem to remember that there is a fallback mechanism: if the OSDs can't be reached on the cluster network, they are contacted on the public network.

You could test it pretty easily without much impact. Take one OSD that has both networks, configure it, and restart the process. If all the nodes (specifically the old ones with only one network) are able to connect to it, then you are good to go by restarting one OSD at a time.

On Wed, Mar 4, 2015 at 4:17 AM, Andrija Panic andrija.pa...@gmail.com wrote:

> Hi,
>
> I have a live cluster with only a public network (so no explicit network configuration in the ceph.conf file).
>
> I'm wondering what the procedure is to implement dedicated Replication/Private and Public networks. I've read the manual and know how to do it in ceph.conf, but since this is an already running cluster, what should I do after I change ceph.conf on all nodes? Restart the OSDs one by one, or...? Is any downtime expected before the replication network is actually implemented completely?
>
> Another related question:
>
> I'm also demoting some old OSDs on old servers. I will have them all stopped, but would like to implement the replication network before actually removing the old OSDs from the CRUSH map, since a lot of data will be moved around.
>
> My old nodes/OSDs (which will be stopped before I implement the replication network) do NOT have a dedicated NIC for the replication network, in contrast to the new nodes/OSDs. So there will still be references to these old OSDs in the CRUSH map. Will it be a problem if I implement a replication network that WILL work on the new nodes/OSDs, but not on the old ones since they don't have a dedicated NIC? I guess not, since the old OSDs are stopped anyway, but I would like an opinion.
>
> Or perhaps I might remove the OSDs from the CRUSH map after first setting nobackfill and norecover (so no rebalancing happens) and then implement the replication network?
>
> Sorry for the old post, but...
>
> Thanks,
> --
> Andrija Panić

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
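For reference, the ceph.conf change discussed above could be sketched roughly as follows. This is only an illustration, not something posted in the thread: the helper name `render_network_section` and both subnets are hypothetical, and treating overlapping subnets as an error is a choice made for this sketch. The option names `public network` and `cluster network` are the ones used in the [global] section of ceph.conf.

```python
import configparser
import io
import ipaddress


def render_network_section(public_cidr: str, cluster_cidr: str) -> str:
    """Return a [global] ceph.conf fragment with both networks set.

    Hypothetical helper for illustration; it only renders text and
    does not touch any file or running daemon.
    """
    # ip_network raises ValueError on malformed CIDR input.
    public = ipaddress.ip_network(public_cidr)
    cluster = ipaddress.ip_network(cluster_cidr)

    # Assumption of this sketch: the two networks are meant to be separate.
    if public.overlaps(cluster):
        raise ValueError("public and cluster networks overlap")

    parser = configparser.ConfigParser()
    parser["global"] = {
        "public network": str(public),
        "cluster network": str(cluster),
    }
    buf = io.StringIO()
    parser.write(buf)
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical subnets, not taken from the thread.
    print(render_network_section("192.168.10.0/24", "192.168.20.0/24"))
```

The rendered fragment would still have to be merged into the existing ceph.conf on each node, and it only takes effect for an OSD once that OSD is restarted.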
Re: [ceph-users] Implement replication network with live cluster
That was my thought, yes - I found this blog post that confirms what you are saying, I guess:
http://www.sebastien-han.fr/blog/2012/07/29/tip-ceph-public-slash-private-network-configuration/

I will do that... Thx

I guess it doesn't matter, since my CRUSH map will still reference old OSDs that are stopped (and the cluster resynced after that)?

Thx again for the help

On 4 March 2015 at 17:44, Robert LeBlanc rob...@leblancnet.us wrote:

> If I remember right, someone has done this on a live cluster without any issues. [...]

--
Andrija Panić
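The "one OSD first, then the rest one at a time" approach could be written down as an ordered plan like the one below. This is a sketch under assumptions: the function name `rolling_restart_plan` is hypothetical, the default restart command assumes sysvinit-managed hosts (other init systems need a different template), and the code only builds a list of command strings without running anything.

```python
from typing import Iterable, List


def rolling_restart_plan(
    dual_homed_osds: Iterable[int],
    restart_template: str = "service ceph restart osd.{osd_id}",
) -> List[str]:
    """Build an ordered list of commands: restart one OSD, check health, repeat.

    Hypothetical helper: it returns strings for an operator to review
    and run by hand; it does not execute them.
    """
    plan: List[str] = []
    # Lowest-numbered OSD acts as the test OSD; duplicates are dropped.
    for osd_id in sorted(set(dual_homed_osds)):
        plan.append(restart_template.format(osd_id=osd_id))
        # Pause here and confirm the cluster has settled before moving on.
        plan.append("ceph health")
    return plan


if __name__ == "__main__":
    # Hypothetical OSD ids on the new, dual-network nodes.
    for step in rolling_restart_plan([12, 10, 11]):
        print(step)
```

After the first restart it is worth stopping to confirm that the single-network nodes can still reach the restarted OSD before continuing with the others.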
Re: [ceph-users] Implement replication network with live cluster
On 03/04/2015 05:44 PM, Robert LeBlanc wrote:

> If I remember right, someone has done this on a live cluster without any issues. I seem to remember that it had a fallback mechanism if the OSDs couldn't be reached on the cluster network to contact them on the public network. [...]

In the OSDMap each OSD has a public and a cluster network address. If the cluster network address is not set, replication to that OSD will be done over the public network.

So you can push a new configuration to all OSDs and restart them one by one. Make sure, of course, that the network is up and running, and it should work.

--
Wido den Hollander
42on B.V.
Ceph trainer and consultant
Phone: +31 (0)20 700 9902
Skype: contact42on
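One way to see which OSDs have picked up a cluster address after their restart is to look at the per-OSD addresses in the OSD map. The sketch below assumes the map has been exported as JSON (for example with `ceph osd dump --format json`) and parsed into a dict; the key names `osds`, `osd` and `cluster_addr`, the IPv4 `ip:port/nonce` address format, and the sample data are assumptions of this example and should be checked against the output of your own cluster.

```python
import ipaddress
from typing import Dict, List, Tuple


def split_by_cluster_network(
    osd_dump: Dict, cluster_cidr: str
) -> Tuple[List[int], List[int]]:
    """Split OSD ids into (inside cluster subnet, outside cluster subnet).

    Hypothetical helper; assumes IPv4 addresses shaped like
    'ip:port/nonce' under the 'cluster_addr' key of each OSD entry.
    """
    net = ipaddress.ip_network(cluster_cidr)
    on_cluster: List[int] = []
    elsewhere: List[int] = []
    for osd in osd_dump.get("osds", []):
        # Keep only the IP part in front of the first colon.
        host = osd["cluster_addr"].split(":", 1)[0]
        if ipaddress.ip_address(host) in net:
            on_cluster.append(osd["osd"])
        else:
            elsewhere.append(osd["osd"])
    return on_cluster, elsewhere


if __name__ == "__main__":
    # Hand-written sample data, not real cluster output.
    sample = {
        "osds": [
            {"osd": 0, "cluster_addr": "192.168.20.11:6801/1234"},
            {"osd": 1, "cluster_addr": "192.168.10.12:6801/2222"},
        ]
    }
    print(split_by_cluster_network(sample, "192.168.20.0/24"))
```

OSDs in the second list are the ones whose replication traffic would still be going over the public network.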
Re: [ceph-users] Implement replication network with live cluster
If the data has been replicated to the new OSDs, it will be able to function properly even with them down or only on the public network.

On Wed, Mar 4, 2015 at 9:49 AM, Andrija Panic andrija.pa...@gmail.com wrote:

> I wanted to say: it doesn't matter (I guess?) that my CRUSH map is still referencing old OSD nodes that are already stopped. [...]
Re: [ceph-users] Implement replication network with live cluster
Thx Wido, I needed this confirmation - thanks!

On 4 March 2015 at 17:49, Wido den Hollander w...@42on.com wrote:

> In the OSDMap each OSD has a public and a cluster network address. If the cluster network address is not set, replication to that OSD will be done over the public network. [...]

--
Andrija Panić
Re: [ceph-users] Implement replication network with live cluster
Thx again - I really appreciate the help, guys!

On 4 March 2015 at 17:51, Robert LeBlanc rob...@leblancnet.us wrote:

> If the data has been replicated to the new OSDs, it will be able to function properly even with them down or only on the public network.

--
Andrija Panić
Re: [ceph-users] Implement replication network with live cluster
> I guess it doesn't matter, since my CRUSH map will still reference old OSDs that are stopped (and the cluster resynced after that)?

I wanted to say: it doesn't matter (I guess?) that my CRUSH map is still referencing old OSD nodes that are already stopped. Tired, sorry...

On 4 March 2015 at 17:48, Andrija Panic andrija.pa...@gmail.com wrote:

> That was my thought, yes - I found this blog post that confirms what you are saying, I guess: [...]

--
Andrija Panić