Hi Somnath,
Yes, we will analyze whether there is any bottleneck. Is there a useful
command we can run to analyze this bottleneck?
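As a rough starting point, a few commands commonly used to look for such
bottlenecks (a sketch only; the pool name 'rbd' is an assumption based on the
default setup):
ceph -s                          # overall cluster health and current client IO
ceph osd perf                    # per-OSD commit/apply latencies
rados bench -p rbd 30 write      # 30-second raw write benchmark against the pool
iostat -x 2                      # run on each OSD host to watch disk utilisation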
>> 1. What is your backend cluster configuration like how many OSDs,
PGs/pool, HW details etc
We are using 2 OSDs, and no PGs/pools were created explicitly; everything is
at the defaults. The hardware is a physical machine with more than 2 GB of RAM.
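To double-check what the defaults gave us, something like the following should
list the pool and show its PG count and replica count (assuming the default
pool is named 'rbd', as the rbd output below suggests):
ceph osd lspools
ceph osd pool get rbd pg_num    # number of placement groups in the pool
ceph osd pool get rbd size      # replica count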
>>2. Is it a single big rbd image you mounted from different hosts and
running OCFS2 on top ? Please give some details on that front.
Yes, it is a single rbd image that we map on different hosts, running
OCFS2 on top:
rbd ls
newinteg
rbd showmapped
id pool image snap device
1 rbd newinteg - /dev/rbd1
rbd info newinteg
rbd image 'newinteg':
size 70000 MB in 17500 objects
order 22 (4096 kB objects)
block_name_prefix: rb.0.1149.74b0dc51
format: 1
>> 3. Also, is this HDD or SSD setup ? If HDD, hope you have journals on
SSD.
We believe this is HDD; below is the output for the disk.
*-ide
description: IDE interface
product: 82371SB PIIX3 IDE [Natoma/Triton II]
vendor: Intel Corporation
physical id: 1.1
bus info: pci@0000:00:01.1
version: 00
width: 32 bits
clock: 33MHz
capabilities: ide bus_master
configuration: driver=ata_piix latency=0
resources: irq:0 ioport:1f0(size=8) ioport:3f6 ioport:170(size=8)
ioport:376 ioport:c000(size=16)
*-scsi
description: SCSI storage controller
product: Virtio block device
vendor: Red Hat, Inc
physical id: 4
bus info: pci@0000:00:04.0
version: 00
width: 32 bits
clock: 33MHz
capabilities: scsi msix bus_master cap_list
configuration: driver=virtio-pci latency=0
resources: irq:11 ioport:c080(size=64) memory:f2040000-f2040fff
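Whether the disks are actually rotational (HDD) can be confirmed on each OSD
host with something like the following (the device name is only an example):
lsblk -d -o NAME,ROTA                   # ROTA=1 means rotational (HDD), 0 means SSD
cat /sys/block/sda/queue/rotational     # same information per device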
Regards
Prabu GJ
---- On Tue, 16 Jun 2015 21:50:29 +0530 Somnath Roy
<[email protected]> wrote ----
Okay… I think the extra layers you have will add some delay, but ~1 minute is
probably high (I have never tested Ceph on HDD, though).
We can probably minimize it by optimizing the cluster setup.
Please monitor your backend cluster, and even the rbd nodes, to see whether anything
is a bottleneck there.
Also, check whether there is any delay between when you issue a request on OCFS2,
when rbd receives it, and when the cluster receives it.
Could you please share the following details ?
1. What is your backend cluster configuration like how many OSDs, PGs/pool, HW
details etc.
2. Is it a single big rbd image you mounted from different hosts and running
OCFS2 on top ? Please give some details on that front.
3. Also, is this HDD or SSD setup ? If HDD, hope you have journals on SSD.
Thanks & Regards
Somnath
From: gjprabu [mailto:[email protected]]
Sent: Tuesday, June 16, 2015 1:57 AM
To: Somnath Roy
Cc: Kamala Subramani ; [email protected]; Siva Sokkumuthu
Subject: RE: Re: [ceph-users] Ceph OSD with OCFS2
Somnath,
Yes, we are cloning the repository in the Ceph client shared directory.
Please see the timing comparison below.
Ceph Client Shared Directory :
[[email protected]~/ceph/]$ time git clone
https://github.com/elastic/elasticsearch.git
Cloning into 'elasticsearch'...
remote: Counting objects: 373468, done.
remote: Compressing objects: 100% (76/76), done.
remote: Total 373468 (delta 66), reused 20 (delta 20), pack-reused 373371
Receiving objects: 100% (373468/373468), 137.56 MiB | 7.10 MiB/s, done.
Resolving deltas: 100% (210489/210489), done.
Checking connectivity... done.
Checking out files: 100% (5531/5531), done.
real 1m2.154s
user 0m26.315s
sys 0m20.017s
Not Shared Directory :
[[email protected]~/test]$ time git clone
https://github.com/elastic/elasticsearch.git
Cloning into 'elasticsearch'...
remote: Counting objects: 373594, done.
remote: Compressing objects: 100% (172/172), done.
remote: Total 373594 (delta 104), reused 20 (delta 20), pack-reused 373399
Receiving objects: 100% (373594/373594), 137.65 MiB | 8.30 MiB/s, done.
Resolving deltas: 100% (210550/210550), done.
real 0m37.878s
user 0m20.061s
sys 0m3.049s
FYI:
Not only clone; repository operations like checkout, update, and fetch are
also delayed.
Regards
Prabu
---- On Tue, 16 Jun 2015 10:35:32 +0530 Somnath
Roy<[email protected]> wrote ----
Prabu,
I am still not clear: you are cloning a git source repository on top of RBD + OCFS2,
and that is taking extra time?
Thanks & Regards
Somnath
From: gjprabu [mailto:[email protected]]
Sent: Monday, June 15, 2015 9:39 PM
To: gjprabu
Cc: Somnath Roy; Kamala Subramani ; [email protected]; Siva
Sokkumuthu
Subject: Re: Re: [ceph-users] Ceph OSD with OCFS2
Hi Somnath,
Is there any fine-tuning for the below issues?
<< Also, please let us know the reason (an extra 2-3 minutes is taken for hg/git
repository operations like clone, pull, checkout, and update).
<< Could you please explain a bit what you are trying to do here ?
In the Ceph shared directory, we will clone the source repository and then
access it from the Ceph clients.
Regards
Prabu
---- On Mon, 15 Jun 2015 17:16:11 +0530 gjprabu <[email protected]>
wrote ----
Hi,
The size-difference issue is solved. It was related to the OCFS2 format options:
the cluster size (-C) should be 4K. We had formatted with
(mkfs.ocfs2 /dev/mapper/mpatha -N 64 -b 4K -C 256K -T mail
--fs-features=extended-slotmap --fs-feature-level=max-features -L )
and it needs to be changed like below:
(mkfs.ocfs2 -b 4K -C 4K -L label -T mail -N 2 /dev/sdX)
<< Also, please let us know the reason (an extra 2-3 minutes is taken for hg/git
repository operations like clone, pull, checkout, and update).
<< Could you please explain a bit what you are trying to do here ?
In the Ceph shared directory, we will clone the source repository and then
access it from the Ceph clients.
Regards
Prabu
---- On Fri, 12 Jun 2015 21:12:03 +0530 Somnath Roy
<[email protected]> wrote ----
Sorry, it was a typo; I meant to say 1 GB only.
I would break the problem down like the following (a rough sketch follows the list).
1. Run some fio workload, say 1 GB, on RBD and run a Ceph command like 'ceph df'
to see how much data was written. I am sure you will see the same amount of data.
Remember, by default the Ceph RADOS object size is 4 MB, so it should write 1GB/4MB
objects.
2. Also, you can use the 'rados' utility to directly put/get, say, a 1 GB file to the
cluster and check it in a similar way.
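A minimal sketch of both checks, assuming the default 'rbd' pool, an OCFS2 mount at
/mnt/ocfs2, and throwaway file/object names (all of these are placeholders):
fio --name=rbdtest --directory=/mnt/ocfs2 --rw=write --bs=4M --size=1G \
    --ioengine=libaio --direct=1          # step 1: write ~1 GB through the mounted image
ceph df                                   # pool usage should grow by roughly 1 GB
dd if=/dev/zero of=/tmp/testfile bs=1M count=1024
rados -p rbd put testobj /tmp/testfile    # step 2: push 1 GB straight into the cluster
rados -p rbd stat testobj                 # confirm the object size
ceph df                                   # compare the reported usage again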
As I said, if your journal is on the same device and you measure the space consumed
by the entire OSD mount point, it will be larger because of the write amplification
induced by Ceph. But the size of an individual file you transferred should not differ.
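If the journals do move to a separate SSD later, a minimal sketch of what that
usually looks like in ceph.conf (the partition path and size below are only
placeholders, not your setup):
[osd]
osd journal = /dev/disk/by-partlabel/ceph-journal-$id   # dedicated SSD partition per OSD
osd journal size = 5120                                 # journal size in MB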
<< Also, please let us know the reason (an extra 2-3 minutes is taken for hg/git
repository operations like clone, pull, checkout, and update).
Could you please explain a bit what you are trying to do here ?
Thanks & Regards
Somnath
From: gjprabu [mailto:[email protected]]
Sent: Friday, June 12, 2015 12:34 AM
To: Somnath Roy
Cc: [email protected]; Kamala Subramani ; Siva Sokkumuthu
Subject: Re: RE: RE: [ceph-users] Ceph OSD with OCFS2
Hi,
I measured only the data that I transferred from the client. For example, for a
500 MB file, after the transfer completes the measured size is 1 GB, not 10 GB.
Our Configuration is :-
=============================================================================================
ceph -w
cluster f428f5d6-7323-4254-9f66-56a21b099c1a
health HEALTH_OK
monmap e1: 3 mons at
{cephadmin=172.20.19.235:6789/0,cephnode1=172.20.7.168:6789/0,cephnode2=172.20.9.41:6789/0},
election epoch 114, quorum 0,1,2 cephnode1,cephnode2,cephadmin
osdmap e9: 2 osds: 2 up, 2 in
pgmap v1022: 64 pgs, 1 pools, 7507 MB data, 1952 objects
26139 MB used, 277 GB / 302 GB avail
64 active+clean
===============================================================================================
ceph.conf
[global]
osd pool default size = 2
auth_service_required = cephx
filestore_xattr_use_omap = true
auth_client_required = cephx
auth_cluster_required = cephx
mon_host = 172.20.7.168,172.20.9.41,172.20.19.235
mon_initial_members = zoho-cephnode1, zoho-cephnode2, zoho-cephadmin
fsid = f428f5d6-7323-4254-9f66-56a21b099c1a
================================================================================================
What is the replication policy you are using ?
We are using the defaults with 2 replicas; we are not using a custom CRUSH map,
PG num, erasure coding, etc.
What interface you used to store the data ?
We are using RBD to store the data, and it has been mounted with OCFS2 on the
client side.
How are you removing data ? Are you removing a rbd image ?
We are not removing the rbd image, only removing existing data using the rm
command from the client. We did not set up any async way to transfer or
remove data.
Also, please let us know the reason (an extra 2-3 minutes is taken for hg/git
repository operations like clone, pull, checkout, and update).
Regards
Prabu GJ
---- On Fri, 12 Jun 2015 00:21:24 +0530 Somnath Roy
<[email protected]> wrote ----
Hi,
The Ceph journal works in a different way. It is a write-ahead journal: all data
is persisted to the journal first and then written to its actual place. Journal data
is encoded. The journal is a fixed-size partition/file and is written sequentially.
So, if you place the journal on HDDs, it will be overwritten; in the SSD case, it will
be garbage-collected later. So, if you measure the amount of data written to the
device, it will be double. But if you are saying you wrote a 500 MB file to the
cluster and are seeing an actual file size of 10 GB, that should not be the case.
How are you seeing this size, by the way?
Could you please tell us more about your configuration ?
What is the replication policy you are using ?
What interface you used to store the data ?
Regarding your other query:
<< If I transfer 1 GB of data, what will the size be on the server (OSD)? Will it
be written in a compressed format?
No, the actual data is not compressed. You don't want to fill up the OSD disk, and
there are some limits you can set. Check the following link:
http://ceph.com/docs/master/rados/troubleshooting/troubleshooting-osd/
It will stop working when the disk is 95% full by default.
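For reference, those thresholds come from the mon full/nearfull ratios; a minimal
sketch of how they would appear in ceph.conf (the values shown are simply the defaults):
[global]
mon osd full ratio = .95        # OSDs are marked full and writes are blocked
mon osd nearfull ratio = .85    # cluster goes to HEALTH_WARN as an early warning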
<< Is it possible to take a backup of the compressed data from the server and copy
it to another machine as Server_Backup, then start a new client using Server_Backup?
For backup, check the following link to see if it works for you:
https://ceph.com/community/blog/tag/backup/
Also, you can use an RGW federated configuration for backup.
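If an image-level backup is enough, one common approach is a snapshot plus
export/import; a rough sketch only (the image name 'newinteg' is from the 'rbd ls'
output at the top of this thread, and the paths and target cluster are hypothetical):
rbd snap create rbd/newinteg@backup1                  # point-in-time snapshot of the image
rbd export rbd/newinteg@backup1 /backup/newinteg.img  # dump the snapshot to a regular file
# copy /backup/newinteg.img to the backup machine, then on that cluster:
rbd import /backup/newinteg.img rbd/newinteg          # recreate the image there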
<< Data removal is very slow
How are you removing data ? Are you removing a rbd image ?
If you are removing an entire pool, that should be fast; it deletes the data
asynchronously, I guess.
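By contrast, removing an image or a whole pool happens on the Ceph side rather than
with rm inside the OCFS2 mount; a rough sketch (pool and image names as used in this
thread):
rbd rm rbd/newinteg                                          # delete the rbd image itself
ceph osd pool delete rbd rbd --yes-i-really-really-mean-it   # drop the entire pool; data is cleaned up asynchronously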
Thanks & Regards
Somnath
From: gjprabu [mailto:[email protected]]
Sent: Thursday, June 11, 2015 6:38 AM
To: Somnath Roy
Cc: [email protected]; Kamala Subramani; Siva Sokkumuthu
Subject: Re: RE: [ceph-users] Ceph OSD with OCFS2
Hi Team,
Once the data transfer is completed, the journal should flush all the data to its
real place, but in our case it shows double the size after the transfer completes,
so everyone here is confused about the real file and folder sizes. Also, what will
happen if I move the monitor from that OSD server to a separate machine? Might that
solve the double-size issue?
We also have the queries below.
1. An extra 2-3 minutes is taken for hg/git repository operations like clone,
pull, checkout, and update.
2. If I transfer 1 GB of data, what will the size be on the server (OSD)? Will it
be written in a compressed format?
3. Is it possible to take a backup of the compressed data from the server and copy
it to another machine as Server_Backup, then start a new client using Server_Backup?
4. Data removal is very slow.
Regards
Prabu
---- On Fri, 05 Jun 2015 21:55:28 +0530 Somnath Roy
<[email protected]> wrote ----
Yes, Ceph will be writing twice: once for the journal and once for the actual data.
Considering you configured the journal on the same device, this is what you end up
seeing if you are monitoring the device bandwidth.
Thanks & Regards
Somnath
From: ceph-users [mailto:[email protected]] On Behalf Of
gjprabu
Sent: Friday, June 05, 2015 3:07 AM
To: [email protected]
Cc: Kamala Subramani; Siva Sokkumuthu
Subject: [ceph-users] Ceph OSD with OCFS2
Dear Team,
We are new to Ceph, using it with two OSDs and two clients. Both clients mount an
OCFS2 file system. If I transfer 500 MB of data on a client, it shows double the
size (1 GB) after the data transfer finishes. Is this behavior correct, or is there
a solution for this?
Regards
Prabu
_______________________________________________
ceph-users mailing list
[email protected]
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com