Glad you figured it out Cody.

On Tue, Nov 27, 2018 at 1:48 PM Cody <[email protected]> wrote:
>
> Hi there,
>
> The performance issue was caused by a failed OS drive on one of the
> storage nodes. Here is a link [1] to the thread at ceph-ansible ML
> with useful tips on using the 'fio' to test storage devices, in case
> anyone is interested.
>
> Thank you very much to all.
>
> Best regards,
> Cody
>
> [1]
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-November/031547.html
>
>
> On Mon, Nov 26, 2018 at 1:13 PM Cody <[email protected]> wrote:
> >
> > Hi Donny,
> >
> > Thank you for the reply.
> >
> > > What kind of images are you using?
> > The image used is CentOS 7 cloud image in RAW format (approx. 8GB in size).
> >
> > >Also how are you uploading the images?
> > I was uploading the image file from the undercloud node.
> >
> > Thank you very much.
> >
> > Best regards,
> > Cody
> > On Mon, Nov 26, 2018 at 10:57 AM Donny Davis <[email protected]> wrote:
> > >
> > > Also how are you uploading the images?
> > >
> > > On Mon, Nov 26, 2018 at 10:54 AM Donny Davis <[email protected]> wrote:
> > >>
> > >> What kind of images are you using?
> > >>
> > >> On Mon, Nov 26, 2018 at 9:14 AM John Fulton <[email protected]> wrote:
> > >>>
> > >>> On Sun, Nov 25, 2018 at 11:29 PM Cody <[email protected]> wrote:
> > >>> >
> > >>> > Hello,
> > >>> >
> > >>> > My tripleO cluster is deployed with Ceph. Both Cinder and Nova use RBD
> > >>> > as backend. While all essential functions work, services involving
> > >>> > Ceph are getting very poor performance. E.g., it takes several hours
> > >>> > to upload an 8GB image into Cinder and about 20 minutes to completely
> > >>> > boot up an instance (from launch to ssh ready).
> > >>> >
> > >>> > Running 'ceph -s' shows a top write speed at 6~700 KiB/s during image
> > >>> > upload and read speed 2 MiB/s during instance launch.
> > >>> >
> > >>> > I used the default scheme for network isolation and a single 1G port
> > >>> > for all VLAN traffics on each overcloud node. I haven't set jumbo
> > >>> > frame on the storage network VLAN yet, but think the performance
> > >>> > should not be this bad with MTU 1500. Something must be wrong. Any
> > >>> > suggestions for debugging?
> > >>>
> > >>> Hi Cody,
> > >>>
> > >>> If you're using queens or rocky, then ceph luminous was deployed in
> > >>> containers. Though tripleo did the overall deployment, ceph-ansible
> > >>> would have done the actual ceph deployment and configuration and you
> > >>> can determine the ceph-ansible version via 'rpm -q ceph-ansible' on
> > >>> your undercloud. It probably makes sense for you to pass along what
> > >>> you mentioned above in addition to some other info, which I'll note
> > >>> below, to the ceph-users list
> > >>> (http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com), who will be
> > >>> focused on ceph itself. When you contact them (I'm on the list too)
> > >>> also let them know the following:
> > >>>
> > >>> 1. How many OSD servers you have and how many OSDs per server
> > >>> 2. What type of disks you're using per OSD and how you set up journaling
> > >>> 3. Specs of your servers themselves (OpenStack controller servers w/
> > >>> CPU X and Ram Y for Ceph monitors and Ceph Storage servers RAM/CPU
> > >>> info)
> > >>> 4. Did you override the RAM/CPU for the Mon, Mgr, and OSD containers?
> > >>> If so, what did you override them to?
> > >>>
> > >>> TripleO can pass any parameter you would normally pass to ceph-ansible
> > >>> as described in the following:
> > >>>
> > >>>
> > >>> https://docs.openstack.org/tripleo-docs/latest/install/advanced_deployment/ceph_config.html#customizing-ceph-conf-with-ceph-ansible
> > >>>
> > >>> So if you let them know things in terms of a containerized
> > >>> ceph-ansible luminous deployment and the ceph.conf and they have
> > >>> suggestions, then you can apply the suggestions back to ceph-ansible
> > >>> through tripleo as described above. If you start troubleshooting the
> > >>> cluster as per this troubleshooting guide [2] and share the results
> > >>> that would also help.
> > >>>
> > >>> I've gotten better performance than you describe on a completely
> > >>> virtualized deployment using my PC [1] using quickstart with the
> > >>> defaults that TripleO passes using queens and rocky. Though, TripleO
> > >>> tends to favor the defaults which ceph-ansible uses. However, with a
> > >>> single 1G port for all network traffic I don't expect great
> > >>> performance.
> > >>>
> > >>> Feel free to CC me when you email ceph-users and feel free to share on
> > >>> rdo-users a link to the thread you started there in case anyone else
> > >>> on this list is interested.
> > >>>
> > >>> John
> > >>>
> > >>> [1]
> > >>> http://blog.johnlikesopenstack.com/2018/08/pc-for-tripleo-quickstart.html
> > >>> [2]
> > >>> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/pdf/troubleshooting_guide/Red_Hat_Ceph_Storage-3-Troubleshooting_Guide-en-US.pdf
> > >>>
> > >>> > Thank you very much.
> > >>> >
> > >>> > Best regards,
> > >>> > Cody
> > >>> > _______________________________________________
> > >>> > users mailing list
> > >>> > [email protected]
> > >>> > http://lists.rdoproject.org/mailman/listinfo/users
> > >>> >
> > >>> > To unsubscribe: [email protected]
_______________________________________________
users mailing list
[email protected]
http://lists.rdoproject.org/mailman/listinfo/users
To unsubscribe: [email protected]
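
For a sense of scale, the figures reported in the original message can be checked with some quick arithmetic. The sketch below is illustrative only. It assumes the 8GB image is 8 GiB, takes the upper 700 KiB/s write figure from 'ceph -s', and treats the single 1G port as an ideal 1 Gbit/s link, ignoring Ceph replication traffic and protocol overhead. The helper and variable names are made up for this example and do not come from the thread.

```python
# Rough sanity check of the throughput figures quoted in the thread.
# Assumptions (not from the thread): "8GB" is treated as 8 GiB, and the
# 1G port is treated as an ideal 1 Gbit/s link with no protocol overhead
# and no Ceph replication traffic sharing it.

KIB = 1024
GIB = 1024 ** 3


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time needed to move size_bytes at a constant rate."""
    return size_bytes / rate_bytes_per_s


image_bytes = 8 * GIB
observed_rate = 700 * KIB        # upper end of the "6~700 KiB/s" write speed
link_rate = 1_000_000_000 / 8    # 1 Gbit/s expressed in bytes per second

observed_hours = transfer_seconds(image_bytes, observed_rate) / 3600
ideal_seconds = transfer_seconds(image_bytes, link_rate)
link_utilization = observed_rate / link_rate

print(f"at 700 KiB/s: {observed_hours:.1f} hours")
print(f"at 1 Gbit/s line rate: {ideal_seconds:.0f} seconds")
print(f"share of link used: {link_utilization:.2%}")
```

Under these assumptions the upload takes about 3.3 hours at the observed rate versus roughly 69 seconds at line rate, using well under 1% of the link. That matches the "several hours" reported and supports the suspicion that something other than MTU 1500 on a 1G port was at fault, which turned out to be the failed OS drive.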
