[ceph-users] questions about rgw, multiple zones

2014-10-30 Thread yuelongguang
hi, all
 
1.
How do I set a region's endpoints, and how can I tell how many endpoints a region has?
(A hedged sketch follows question 3 below.)
 
2.
I followed the 'create a region' steps; after that I can list the new region, but the
default region is always there.
 
3.
There is one RGW for each zone. After the RGW starts up, should I find that the pools
related to that zone have been created?
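For question 1, a minimal sketch of the usual workflow with the Firefly-era federated config (the region name 'us' and the endpoint URLs are made-up values, not taken from any real cluster). The endpoints live in the region's JSON, so dumping it also shows how many endpoints are currently defined:

# dump the region definition; its "endpoints" array lists the current endpoints
radosgw-admin region get --rgw-region=us > us.json
# edit the array in us.json, e.g.
#   "endpoints": ["http://rgw1.example.com:80/", "http://rgw2.example.com:80/"],
# then load it back and refresh the region map
radosgw-admin region set --infile us.json
radosgw-admin regionmap update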
 
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw issues

2014-10-29 Thread yuelongguang
I had overlooked a detail: FastCgiWrapper needs to be set to off.
thanks






At 2014-10-30 10:01:19, "yuelongguang"  wrote:

Hi list,
How did you solve this issue? I ran into it when I tried to deploy 2 RGWs on one
Ceph cluster in the default region and default zone.
 
 
thanks








At 2014-07-01 09:06:24, "Brian Rak"  wrote:
>That sounds like you have some kind of odd situation going on.  We only 
>use radosgw with nginx/tengine so I can't comment on the apache part of it.
>
>My understanding is this:
>
>You start ceph-radosgw, this creates a fastcgi socket somewhere (verify 
>this is created with lsof, there are some permission problems that will 
>result in radosgw running, but not opening the socket).
>
>Apache is configured to connect to this socket, and forward any incoming 
>requests.  Apache should not be launching things.
>
>I did set Apache up once to test a bug, so I took a look at my config.  
>I do *not* have a s3gw.fcgi file on disk.  Have you tried removing 
>that?  I think that with FastCgiExternalServer, you don't need 
>s3gw.fcgi.  The other thing that got me was socket path in 
>FastCgiExternalServer is relative to whatever you have FastCgiIpcDir set 
>to (which the Ceph docs don't seem to take into account).
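A minimal sketch of the directives described above (socket name, paths and the rewrite rule are assumptions, adjust to the local setup; the socket must match the rgw socket path in ceph.conf):

# mod_fastcgi: declare radosgw as an external FastCGI server; no s3gw.fcgi file needs to exist on disk
FastCgiIpcDir /var/run/ceph
FastCgiExternalServer /var/www/s3gw.fcgi -socket ceph.radosgw.gateway.fastcgi.sock
# forward every request to it, preserving the Authorization header for S3 auth
RewriteEngine On
RewriteRule ^/(.*) /s3gw.fcgi?%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]

With FastCgiIpcDir set, the -socket argument is taken relative to /var/run/ceph, which is the relative-path behaviour mentioned above.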
>
>
>On 6/30/2014 8:40 PM, lists+c...@deksai.com wrote:
>> On 2014-06-16 13:16, lists+c...@deksai.com wrote:
>>> I've just tried setting up the radosgw on centos6 according to
>>> http://ceph.com/docs/master/radosgw/config/
>>
>>> While I can run the admin commands just fine to create users etc.,
>>> making a simple wget request to the domain I set up returns a 500 due
>>> to a timeout.  Every request I make results in another radosgw process
>>> being created, which seems to start even more processes itself. I
>>> only have to make a few requests to have about 60 radosgw processes.
>>>
>>
>> Guess I'll try again.  I gave this another shot, following the 
>> documentation, and still end up with basically a fork bomb rather than 
>> the nice ListAllMyBucketsResult output that the docs say I should 
>> get.  Everything else about the cluster works fine, and I see others 
>> talking about the gateway as if it just worked, so I'm led to believe 
>> that I'm probably doing something stupid.  Has anybody else run into 
>> the situation where apache times out while fastcgi just launches more 
>> and more processes?
>>
>> The init script launches a process, and the webserver seems to launch 
>> the same thing, so I'm not clear on what should be happening here.  
>> Either way, I get nothing back when making a simple GET request to the 
>> domain.
>>
>> If anybody has suggestions, even if they are "You nincompoop! 
>> Everybody knows that you need to do such and such", that would be 
>> helpful.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] radosgw issues

2014-10-29 Thread yuelongguang
Hi list,
How did you solve this issue? I ran into it when I tried to deploy 2 RGWs on one
Ceph cluster in the default region and default zone.
 
 
thanks








At 2014-07-01 09:06:24, "Brian Rak"  wrote:
>That sounds like you have some kind of odd situation going on.  We only 
>use radosgw with nginx/tengine so I can't comment on the apache part of it.
>
>My understanding is this:
>
>You start ceph-radosgw, this creates a fastcgi socket somewhere (verify 
>this is created with lsof, there are some permission problems that will 
>result in radosgw running, but not opening the socket).
>
>Apache is configured to connect to this socket, and forward any incoming 
>requests.  Apache should not be launching things.
>
>I did set Apache up once to test a bug, so I took a look at my config.  
>I do *not* have a s3gw.fcgi file on disk.  Have you tried removing 
>that?  I think that with FastCgiExternalServer, you don't need 
>s3gw.fcgi.  The other thing that got me was socket path in 
>FastCgiExternalServer is relative to whatever you have FastCgiIpcDir set 
>to (which the Ceph docs don't seem to take into account).
>
>
>On 6/30/2014 8:40 PM, lists+c...@deksai.com wrote:
>> On 2014-06-16 13:16, lists+c...@deksai.com wrote:
>>> I've just tried setting up the radosgw on centos6 according to
>>> http://ceph.com/docs/master/radosgw/config/
>>
>>> While I can run the admin commands just fine to create users etc.,
>>> making a simple wget request to the domain I set up returns a 500 due
>>> to a timeout.  Every request I make results in another radosgw process
>>> being created, which seems to start even more processes itself. I
>>> only have to make a few requests to have about 60 radosgw processes.
>>>
>>
>> Guess I'll try again.  I gave this another shot, following the 
>> documentation, and still end up with basically a fork bomb rather than 
>> the nice ListAllMyBucketsResult output that the docs say I should 
>> get.  Everything else about the cluster works fine, and I see others 
>> talking about the gateway as if it just worked, so I'm led to believe 
>> that I'm probably doing something stupid.  Has anybody else run into 
>> the situation where apache times out while fastcgi just launches more 
>> and more processes?
>>
>> The init script launches a process, and the webserver seems to launch 
>> the same thing, so I'm not clear on what should be happening here.  
>> Either way, I get nothing back when making a simple GET request to the 
>> domain.
>>
>> If anybody has suggestions, even if they are "You nincompoop! 
>> Everybody knows that you need to do such and such", that would be 
>> helpful.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] fail to add another rgw

2014-10-29 Thread yuelongguang
hi, clewis:
my environment:
One Ceph cluster, 3 nodes; each node has one monitor and one OSD. There is one
rgw (rgw1) on one of them (osd1). Before I deployed the second rgw (rgw2),
the first rgw worked well.
After I deployed the second rgw, it could not start:
the number of radosgw processes increases constantly.
The configuration files of rgw1 and rgw2 are almost the same, except for the ServerName
and the host option in the [client.radosgw.gateway] section of ceph.conf.
Default region, default zone.
 
Another test:
I shut down rgw1, then tried to start rgw2; as before, rgw2 still could not start.
Then I tried to restart rgw1 and it failed, with errors almost the same as rgw2's.
I am trying to deploy multiple RGWs on one Ceph cluster in the default zone and default
region.
 
thanks.
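For reference, a hedged sketch of how two gateway instances are usually given separate sections in ceph.conf (the instance names, hostnames and paths below are assumptions, not copied from this cluster); each instance is then started with its own -n client.radosgw.NAME:

[client.radosgw.rgw1]
host = cephosd1-mona
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.rgw1.fastcgi.sock
log file = /var/log/ceph/client.radosgw.rgw1.log

[client.radosgw.rgw2]
host = cephosd2-monb
keyring = /etc/ceph/ceph.client.radosgw.keyring
rgw socket path = /var/run/ceph/ceph.radosgw.rgw2.fastcgi.sock
log file = /var/log/ceph/client.radosgw.rgw2.log

The 'File exists' error on the admin socket in the log below usually means another radosgw process with the same client name is already running on that host, or a stale .asok file was left behind.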
---log---
[root@cephosd2-monb ceph]# /usr/bin/radosgw -d -c /etc/ceph/ceph.conf 
--debug-rgw=10 -n client.radosgw.gateway
2014-10-29 21:59:10.763921 7f32d24cf820  0 ceph version 0.80.7 
(6c0127fcb58008793d3c8b62d925bc91963672a3), process radosgw, pid 11887
2014-10-29 21:59:10.767922 7f32d24cf820 -1 asok(0xaa3110) 
AdminSocketConfigObs::init: failed: AdminSocket::bind_and_listen: failed to 
bind the UNIX domain socket to 
'/var/run/ceph/ceph-client.radosgw.gateway.asok': (17) File exists
2014-10-29 21:59:10.776185 7f32c17fb700  2 
RGWDataChangesLog::ChangesRenewThread: start
2014-10-29 21:59:10.777282 7f32d24cf820 10 cache get: 
name=.rgw.root+default.region : miss
2014-10-29 21:59:10.780609 7f32d24cf820 10 cache put: 
name=.rgw.root+default.region
2014-10-29 21:59:10.780617 7f32d24cf820 10 adding .rgw.root+default.region to 
cache LRU end
2014-10-29 21:59:10.780634 7f32d24cf820 10 cache get: 
name=.rgw.root+default.region : type miss (requested=1, cached=6)
2014-10-29 21:59:10.780658 7f32d24cf820 10 cache get: 
name=.rgw.root+default.region : hit
2014-10-29 21:59:10.781820 7f32d24cf820 10 cache put: 
name=.rgw.root+default.region
2014-10-29 21:59:10.781825 7f32d24cf820 10 moving .rgw.root+default.region to 
cache LRU end
2014-10-29 21:59:10.781883 7f32d24cf820 10 cache get: 
name=.rgw.root+region_info.default : miss
2014-10-29 21:59:10.783149 7f32d24cf820 10 cache put: 
name=.rgw.root+region_info.default
2014-10-29 21:59:10.783156 7f32d24cf820 10 adding .rgw.root+region_info.default 
to cache LRU end
2014-10-29 21:59:10.783168 7f32d24cf820 10 cache get: 
name=.rgw.root+region_info.default : type miss (requested=1, cached=6)
2014-10-29 21:59:10.783187 7f32d24cf820 10 cache get: 
name=.rgw.root+region_info.default : hit
2014-10-29 21:59:10.784622 7f32d24cf820 10 cache put: 
name=.rgw.root+region_info.default
2014-10-29 21:59:10.784627 7f32d24cf820 10 moving .rgw.root+region_info.default 
to cache LRU end
2014-10-29 21:59:10.784671 7f32d24cf820 10 cache get: 
name=.rgw.root+zone_info.default : miss
2014-10-29 21:59:10.788050 7f32d24cf820 10 cache put: 
name=.rgw.root+zone_info.default
2014-10-29 21:59:10.788071 7f32d24cf820 10 adding .rgw.root+zone_info.default 
to cache LRU end
2014-10-29 21:59:10.788091 7f32d24cf820 10 cache get: 
name=.rgw.root+zone_info.default : type miss (requested=1, cached=6)
2014-10-29 21:59:10.788125 7f32d24cf820 10 cache get: 
name=.rgw.root+zone_info.default : hit
2014-10-29 21:59:10.789630 7f32d24cf820 10 cache put: 
name=.rgw.root+zone_info.default
2014-10-29 21:59:10.789645 7f32d24cf820 10 moving .rgw.root+zone_info.default 
to cache LRU end
2014-10-29 21:59:10.789695 7f32d24cf820  2 zone default is master
2014-10-29 21:59:10.789742 7f32d24cf820 10 cache get: name=.rgw.root+region_map 
: miss
2014-10-29 21:59:10.791929 7f32d24cf820 10 cache put: name=.rgw.root+region_map
2014-10-29 21:59:10.791958 7f32d24cf820 10 adding .rgw.root+region_map to cache 
LRU end
2014-10-29 21:59:10.898679 7f32c0af7700  2 garbage collection: start
2014-10-29 21:59:10.899114 7f32a35fe700  0 ERROR: can't get key: ret=-2
2014-10-29 21:59:10.899663 7f32a35fe700  0 ERROR: sync_all_users() returned 
ret=-2
2014-10-29 21:59:10.900019 7f32d24cf820  0 framework: fastcgi
2014-10-29 21:59:10.900046 7f32d24cf820  0 starting handler: fastcgi
2014-10-29 21:59:10.909479 7f32a20fb700 10 allocated request req=0x7f329400b7c0
2014-10-29 21:59:10.926163 7f32c0af7700  0 RGWGC::process() failed to acquire 
lock on gc.89
2014-10-29 21:59:10.927823 7f32c0af7700  0 RGWGC::process() failed to acquire 
lock on gc.90
2014-10-29 21:59:10.958487 7f32c0af7700  0 RGWGC::process() failed to acquire 
lock on gc.93
2014-10-29 21:59:11.002497 7f32c0af7700  0 RGWGC::process() failed to acquire 
lock on gc.97
2014-10-29 21:59:11.032245 7f32c0af7700  0 RGWGC::process() failed to acquire 
lock on gc.0
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-28 Thread yuelongguang
clewis:
1.
I do not understand the use case for multiple regions, multiple zones and
multiple RGWs.
In actual use (upload, download), users interact with the RGW directly; in
that process, what do the regions and zones do?
 
2.
radosgw-agent syncs data and metadata. Is data synchronization finished right
after the metadata is synced, or the other way round?
If not, what happens when a user accesses a file through another region/zone?
 
 
3. When using multiple regions/zones/RGWs, do they have any relation to a
CDN (Content Delivery Network)? Or how would one use multiple regions/zones/RGWs with
a CDN?
 
 
Thanks, looking forward to your reply.
 





 


At 2014-10-28 02:29:03, "Craig Lewis" wrote:





On Sun, Oct 26, 2014 at 9:08 AM, yuelongguang  wrote:

hi,
1. Does one radosgw daemon correspond to one zone? Is the ratio 1:1?


Not necessarily.  You need at least one radosgw daemon per zone, but you can 
have more.  I have two small clusters.  The primary has 5 nodes, and the 
secondary has 4 nodes.  Every node in the clusters runs an apache and radosgw.


It's possible (and confusing) to run multiple radosgw daemons on a single node 
for different clusters.  You can either use Apache VHosts, or have CivetWeb 
listening on different ports.  I won't recommend this though, as it introduces 
a common failure mode to both zones.
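As a hedged illustration of the CivetWeb variant just mentioned (instance names, zone names and ports are assumptions), two radosgw instances on one node could listen on different ports via ceph.conf:

[client.radosgw.zone1]
rgw zone = zone1
rgw frontends = civetweb port=7480

[client.radosgw.zone2]
rgw zone = zone2
rgw frontends = civetweb port=7481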




 
2. It seems that we can deploy any number of RGWs in a single Ceph cluster;
those RGWs can work separately or cooperate by using radosgw-agent to sync data
and metadata. Am I right?


You can deploy as many zones as you want in a single cluster.  Each zone needs 
a set of pools and a radosgw daemon.  They can be completely independent, or 
have a master-slave replication setup using radosgw-agent.


Keep in mind that radosgw-agent is not bi-directional replication, and the 
secondary zone is read-only.


 
3. Do you know how to set up load balancing for RGWs? Is nginx a good choice,
and how do I make nginx work with RGW?


Any Load Balancer should work, since the protocol is just HTTP/HTTPS.  Some 
people on the list had issues with nginx.  Search the list archive for radosgw 
and tengine.


I'm using HAProxy, and it's working for me.  I have a slight issue in my 
secondary cluster, with locking during replication.  I believe I need to enable 
some kind of stickiness, but I haven't gotten around to investigating.  In the 
mean time, I've configured that cluster with a single node in the active 
backend, and the other nodes in a backup backend.  It's not a setup that can 
work for everybody, but it meets my needs until I fix the real issue.
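A hedged sketch of the kind of haproxy.cfg backend described above (names and addresses are made up). With only one non-backup server, all traffic goes to the first node unless its health check fails:

frontend rgw_in
    mode http
    bind *:80
    default_backend rgw_nodes

backend rgw_nodes
    mode http
    balance roundrobin
    option httpchk GET /
    server rgw-node1 10.0.0.11:80 check
    server rgw-node2 10.0.0.12:80 check backup
    server rgw-node3 10.0.0.13:80 check backup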

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-26 Thread yuelongguang
hi,
1. Does one radosgw daemon correspond to one zone? Is the ratio 1:1?
 
2. It seems that we can deploy any number of RGWs in a single Ceph cluster;
those RGWs can work separately or cooperate by using radosgw-agent to sync data
and metadata. Am I right?
 
3. Do you know how to set up load balancing for RGWs? Is nginx a good choice,
and how do I make nginx work with RGW?
 
 
thanks







At 2014-10-25 05:53:27, "Craig Lewis"  wrote:

You can deploy multiple RadosGW in a single cluster.  You'll need to set up 
zones (see http://ceph.com/docs/master/radosgw/federated-config/).  Most people 
seem to be using zones for geo-replication, but local replication works even 
better.  Multiple zones don't have to be replicated either.  For example, you 
could use multiple zones for tiered services: say, a service with 4x 
replication on pure SSDs, and a cheaper service with 2x replication on HDDs.
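A hedged sketch of how that tiering could be expressed for the bucket data pools of two such zones (pool names and ruleset numbers are assumptions; CRUSH rules separating SSDs from HDDs must already exist):

# premium zone: 4 copies on an SSD-only CRUSH rule
ceph osd pool create .ssd-zone.rgw.buckets 128 128
ceph osd pool set .ssd-zone.rgw.buckets crush_ruleset 1
ceph osd pool set .ssd-zone.rgw.buckets size 4

# budget zone: 2 copies on an HDD-only CRUSH rule
ceph osd pool create .hdd-zone.rgw.buckets 128 128
ceph osd pool set .hdd-zone.rgw.buckets crush_ruleset 2
ceph osd pool set .hdd-zone.rgw.buckets size 2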


If you do have separate zones in a single cluster, you'll want to configure 
different OSDs to serve the different zones.  You want fault isolation between 
the zones. The problems this brings are mostly management of the extra 
complexity.




CivetWeb is embedded into the RadosGW daemon, whereas Apache talks to RadosGW 
using FastCGI.  Overall, CivetWeb should be simpler to set up and manage, since 
it doesn't require Apache, its configuration, or the overhead.


I don't know if Civetweb is considered production ready.  Giant has a bunch of 
fixes for Civetweb, so I'm leaning towards "not on Firefly" unless somebody 
more knowledgeable tells me otherwise.




On Thu, Oct 23, 2014 at 11:04 PM, yuelongguang  wrote:

hi, yehuda
 
1.
Can we deploy multiple RGWs on one Ceph cluster?
If so, does it bring us any problems? 
 
2. What is the major difference between apache and civetweb? 
What is civetweb's advantage? 
 
thanks






___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-23 Thread yuelongguang
hi, yehuda
 
1.
Can we deploy multiple RGWs on one Ceph cluster?
If so, does it bring us any problems? 
 
2. What is the major difference between apache and civetweb? 
What is civetweb's advantage? 
 
thanks


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can we deploy multi-rgw on one ceph cluster?

2014-10-23 Thread yuelongguang
hi, all
 
Can we deploy multiple RGWs on one Ceph cluster?
If so, does it bring us any problems? 
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about erasure coded pool and rados

2014-10-17 Thread yuelongguang
1. Why does an erasure-coded pool not work with rbd? (See the sketch after question 5 below.)
2. I used the rados command to put a file into an erasure-coded pool and then removed it. Why
does the file remain on the OSD's backing filesystem all the time?
3. What is the best use case for an erasure-coded pool?


4. The 'rados ls' command lists objects; where are the object names stored?
5. When an rbd image is put on an erasure-coded pool, where is the information (the rbd
name) about the image stored?
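On points 1 and 3: rbd needs operations (partial overwrites, omap) that erasure-coded pools do not support directly, so the usual pattern is an erasure-coded base pool behind a replicated cache tier. A hedged sketch (pool names and the k/m values are assumptions):

# an erasure-code profile and an EC base pool
ceph osd erasure-code-profile set ec21profile k=2 m=1
ceph osd pool create ecpool 128 128 erasure ec21profile
# a replicated cache pool in front of it, in writeback mode
ceph osd pool create cachepool 128 128
ceph osd tier add ecpool cachepool
ceph osd tier cache-mode cachepool writeback
ceph osd tier set-overlay ecpool cachepool
# rbd then reaches ecpool through the cache tier
rbd create --pool ecpool --size 10240 testimage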


thanks

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] pool size/min_size does not make any effect on erasure-coded pool, right?

2014-10-16 Thread yuelongguang

 
hi, all
 
pool size/min_size has no effect on an erasure-coded pool, right?
And does an erasure-coded pool support rbd?
 
thanks


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] pool size/min_size does not make any effect on erasure-coded pool, right?

2014-10-16 Thread yuelongguang
hi, all
 
pool size/min_size has no effect on an erasure-coded pool, right?
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] what does osd's ms_objecter do? and who will connect it?

2014-09-29 Thread yuelongguang
Thanks, Sage Weil.

Writing a filesystem is a serious matter; we should make things clear, including coding style.
There are other places we should fix.
thanks









At 2014-09-29 12:10:52, "Sage Weil"  wrote:
>On Mon, 29 Sep 2014, yuelongguang wrote:
>> hi, Sage Weil
>> 1.
>> you mean if i use cache tiering, client's objecter can know how to connect
>> to osd daemon's objecter?
>> if i can see it  throught osdmap?
>> 2.
>> i know if rbd use cache, it use objecter, so i thought objecter is a cache
>> for IO scatter/gather.
>> i do not know 'COPYFROM operation ', you mention.
>>  
>> 3.
>> i search radosclient.cc, i know objecter is  a client for osd .
>> but for osd daemon, the code tells me  osd's objecter is a listen socket.
>> ceph-osd.cc : main:
>> ms_objecter->bind(g_conf->public_addr)
>>  
>> OSD:OSD  :: objecter_messenger(ms_objecter)
>>  
>> objecter_messenger->add_dispatcher_head(&service.objecter_dispatcher) , this
>> tells me all messages received from pipes(which is accepted by
>> objecter_messenger) are handled by  Objecter::dispatch. right?
>>  
>> no code tells me  this objecter connect other osds actively.
>
>Yeah.  I mean that the bind() call for ms_objecter is a mistake.  It only 
>initiates client-side connections, so it doesn't need to bind/listen.  
>I'll fix it shortly.  (It's pretty harmless, though.. just confusing.)
>
>sage
>
>
>>  
>> maybe i   miss something.
>> thanks
>> 
>> 
>> 
>> At 2014-09-29 11:23:37, "Sage Weil"  wrote:
>> >On Mon, 29 Sep 2014, yuelongguang wrote:
>> >> hi,all
>> >> 1.
>> >>  
>> >> and who will connect it? as for osd, this ms_objecter is listen socket.
>> >> it  is not included in osdmap. so how to know ms_objecter's  listen addre
>> ss
>> >> and connect it.
>> >
>> >Ah, that's a mistake.  It is only used to connect to other OSDs as a 
>> >client for the COPYFROM operation and for cache tiering.
>> >
>> >sage
>> 
>> 
>> 
>> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] what does osd's ms_objecter do? and who will connect it?

2014-09-28 Thread yuelongguang
hi, Sage Weil
1.
You mean that if I use cache tiering, the client's objecter knows how to connect to
the osd daemon's objecter?
Can I see that through the osdmap?
2.
I know that if rbd uses a cache it uses the objecter, so I thought the objecter was a cache for
IO scatter/gather.
I do not know the 'COPYFROM operation' you mention.
 
3.
I searched radosclient.cc, and I see the objecter is a client for the osd.
But for the osd daemon, the code tells me the osd's objecter is a listening socket.
ceph-osd.cc : main:
ms_objecter->bind(g_conf->public_addr)
 
OSD::OSD : objecter_messenger(ms_objecter)
 
objecter_messenger->add_dispatcher_head(&service.objecter_dispatcher); this
tells me all messages received from pipes (which are accepted by
objecter_messenger) are handled by Objecter::dispatch. Right?
 
No code tells me that this objecter connects to other osds actively.
 
Maybe I am missing something.

thanks






At 2014-09-29 11:23:37, "Sage Weil"  wrote:
>On Mon, 29 Sep 2014, yuelongguang wrote:
>> hi,all
>> 1.
>>  
>> and who will connect it? as for osd, this ms_objecter is listen socket.
>> it  is not included in osdmap. so how to know ms_objecter's  listen address
>> and connect it.
>
>Ah, that's a mistake.  It is only used to connect to other OSDs as a 
>client for the COPYFROM operation and for cache tiering.
>
>sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] what does osd's ms_objecter do? and who will connect it?

2014-09-28 Thread yuelongguang
hi, all
1.
 
What does the osd's ms_objecter do, and who connects to it? For the osd, this ms_objecter is a listening socket,
but it is not included in the osdmap, so how can one learn ms_objecter's listen address and
connect to it?
 
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bug: ceph-deploy does not support jumbo frame

2014-09-25 Thread yuelongguang
Thanks. I had not configured the switch.

I only just learned about that.









At 2014-09-25 12:38:48, "Irek Fasikhov" wrote:

Have you configured the switch?


2014-09-25 5:07 GMT+04:00 yuelongguang :

hi, all
After I set mtu=9000, ceph-deploy waits for a reply all the time at 'detecting
platform for host.'
 
How can I find out what commands ceph-deploy needs that OSD node to run?
 
thanks



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--

Best regards, Irek Fasikhov (Фасихов Ирек Нургаязович)
Mob.: +79229045757
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] bug: ceph-deploy does not support jumbo frame

2014-09-24 Thread yuelongguang
hi, all
After I set mtu=9000, ceph-deploy waits for a reply all the time at 'detecting
platform for host.'
 
How can I find out what commands ceph-deploy needs that OSD node to run?
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about client's cluster aware

2014-09-23 Thread yuelongguang
hi, all
 
My question comes from a test of mine.
Let's take an example: object1 (4MB) --> pg 0.1 --> osd 1,2,3, p1.
 
While the client is writing object1, osd1 goes down during the write. Suppose
2MB has been written.
1.
   When the connection to osd1 is down, what does the client do? Ask the monitor for
a new osdmap, or only the pg map?
 
2.
  Now the client gets a newer map and continues the write; the primary osd should now be
osd2, and the remaining 2MB is written out.
 Now what does Ceph do to integrate the two parts of the data, and to guarantee that there
are enough replicas?
 
3.
 Where is the code? Please be sure to tell me where the code is.
 
It is a very difficult question.
 
Thanks so much
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about object replication theory

2014-09-23 Thread yuelongguang
hi, all
   Take a look at this link:
http://www.ceph.com/docs/master/architecture/#smart-daemons-enable-hyperscale
Could you explain points 2 and 3 in that picture?
 
1.
At points 2 and 3, before the primary writes the data to the next osd, where is the data? Is it
in memory or on disk already?
 
2. Where is the code for points 2 and 3, where the primary distributes the data to
the others?
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] confusion when kill 3 osds that store the same pg

2014-09-18 Thread yuelongguang
hi, all
To test Ceph's stability, I am trying to kill OSDs.
In this case, I killed the 3 OSDs (osd.3, osd.2, osd.0) that store the same pg, 2.30.
 
---crush---
osdmap e1342 pool 'rbd' (2) object 'rbd_data.19d92ae8944a.' -> 
pg 2.c59a45b0 (2.30) -> up ([3,2,0], p3) acting ([3,2,0], p3)
[root@cephosd5-gw current]# ceph osd tree
# id    weight  type name          up/down reweight
-1  0.09995 root default
-2  0.01999 host cephosd1-mona
0   0.01999 osd.0   down0
-3  0.01999 host cephosd2-monb
1   0.01999 osd.1   up  1
-4  0.01999 host cephosd3-monc
2   0.01999 osd.2   down0
-5  0.01999 host cephosd4-mdsa
3   0.01999 osd.3   down0
-6  0.01999 host cephosd5-gw
4   0.01999 osd.4   up  1
-
According to the test results, I have some points of confusion.
 
1.
[root@cephosd5-gw current]# ceph pg 2.30 query
Error ENOENT: i don't have pgid 2.30
 
Why can I not query information about this pg? How can I dump this pg?
 
2.
#ceph osd map rbd rbd_data.19d92ae8944a.
osdmap e1451 pool 'rbd' (2) object 'rbd_data.19d92ae8944a.' -> 
pg 2.c59a45b0 (2.30) -> up ([4,1], p4) acting ([4,1], p4)
 
Does the 'ceph osd map' command just calculate the mapping without checking the real pg
state? I do not find 2.30 on osd.1 or osd.4.
Now that the client will get the new map, why does the client hang?
 
 
thanks very much
 
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] do you have any test case that lost data mostlikely

2014-09-18 Thread yuelongguang
hi, all
 
I want to test some cases that are most likely to lose data.
For now I am just testing by killing OSDs.
 
Do you have any such test cases?
 
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] question about librbd io(fio paramenters)

2014-09-11 Thread yuelongguang
fio parameters 
--fio
[global]
ioengine=libaio
direct=1
rw=randwrite
filename=/dev/vdb
time_based
runtime=300
stonewall
 
[iodepth32]
iodepth=32
bs=4k








At 2014-09-11 05:04:09, "yuelongguang"  wrote:

hi, Josh Durgin:
 
Please look at my test: inside the vm, I am using fio to test rbd performance.
fio parameters: direct io, bs=4k, iodepth >> 4.
The information below does not match what I expect:
avgrq-sz is not approximately 8 (with bs=4k and 512-byte sectors, 4096/512 = 8 sectors per request),
and avgqu-sz is small and irregular, less than 32. Why?
Which part of Ceph might gather/scatter io requests? Why is avgqu-sz so
small?
 
Let's work it out. haha
 
thanks
 
iostat-iodepth=32-- blocksize=4k--
Linux 2.6.32-358.el6.x86_64 (cephosd4-mdsa) 2014年09月11日  _x86_64_(2 
CPU)
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.12 5.818.19   35.39   132.09   670.6518.42 
0.317.06   0.55   2.41
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   291.500.00 1151.00 0.00 13091.5011.37 
5.064.40   0.23  26.35
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   208.500.00 1020.00 0.00  8294.50 8.13 
2.522.47   0.39  39.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0036.000.00 1076.00 0.00 17560.0016.32 
0.600.56   0.30  32.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   242.500.00 1143.00 0.00 22402.0019.60 
3.783.31   0.25  28.90
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0031.000.00  906.50 0.00  5351.50 5.90 
0.370.40   0.28  25.70
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   294.500.00 1148.50 0.00 16620.5014.47 
4.493.91   0.21  24.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0026.500.00  810.50 0.00  4922.50 6.07 
0.370.45   0.35  28.35
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0045.500.00 1022.00 0.00  6117.00 5.99 
0.380.37   0.28  28.15
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   300.000.00 1155.00 0.00 16997.5014.72 
3.583.10   0.21  24.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0027.000.00  962.50 0.00  6846.50 7.11 
0.440.46   0.35  33.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   270.000.00 1249.50 0.00 14400.0011.52 
4.613.69   0.25  31.25
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0015.003.00  660.0024.00  4247.00 6.44 
0.380.57   0.45  29.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0017.00   24.50  592.50   196.00  8039.0013.35 
0.580.94   0.83  51.05
 









At 2014-09-10 08:37:23, "Josh Durgin" wrote:
>On 09/09/2014 07:06 AM, yuelongguang wrote:
>> hi, josh.durgin:
>> i want to know how librbd launch io request.
>> use case:
>> inside vm, i use fio to test rbd-disk's io performance.
>> fio's pramaters are  bs=4k, direct io, qemu cache=none.
>> in this case, if librbd just send what it gets from vm, i mean  no
>> gather/scatter. the rate , io inside vm : io at librbd: io at osd
>> filestore = 1:1:1?
>
>If the rbd image is not a clone, the io issued from the vm's block
>driver will match the io issued by librbd. With caching disabled
>as you have it, the io from the OSDs will be similar, with some
>small amount extra for OSD bookkeeping.
>



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] question about librbd io

2014-09-11 Thread yuelongguang
hi, Josh Durgin:
 
Please look at my test: inside the vm, I am using fio to test rbd performance.
fio parameters: direct io, bs=4k, iodepth >> 4.
The information below does not match what I expect:
avgrq-sz is not approximately 8 (with bs=4k and 512-byte sectors, 4096/512 = 8 sectors per request),
and avgqu-sz is small and irregular, less than 32. Why?
Which part of Ceph might gather/scatter io requests? Why is avgqu-sz so
small?
 
Let's work it out. haha
 
thanks
 
iostat-iodepth=32-- blocksize=4k--
Linux 2.6.32-358.el6.x86_64 (cephosd4-mdsa) 2014年09月11日  _x86_64_(2 
CPU)
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.12 5.818.19   35.39   132.09   670.6518.42 
0.317.06   0.55   2.41
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   291.500.00 1151.00 0.00 13091.5011.37 
5.064.40   0.23  26.35
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   208.500.00 1020.00 0.00  8294.50 8.13 
2.522.47   0.39  39.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0036.000.00 1076.00 0.00 17560.0016.32 
0.600.56   0.30  32.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   242.500.00 1143.00 0.00 22402.0019.60 
3.783.31   0.25  28.90
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0031.000.00  906.50 0.00  5351.50 5.90 
0.370.40   0.28  25.70
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   294.500.00 1148.50 0.00 16620.5014.47 
4.493.91   0.21  24.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0026.500.00  810.50 0.00  4922.50 6.07 
0.370.45   0.35  28.35
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0045.500.00 1022.00 0.00  6117.00 5.99 
0.380.37   0.28  28.15
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   300.000.00 1155.00 0.00 16997.5014.72 
3.583.10   0.21  24.30
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0027.000.00  962.50 0.00  6846.50 7.11 
0.440.46   0.35  33.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.00   270.000.00 1249.50 0.00 14400.0011.52 
4.613.69   0.25  31.25
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0015.003.00  660.0024.00  4247.00 6.44 
0.380.57   0.45  29.60
Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   0.0017.00   24.50  592.50   196.00  8039.0013.35 
0.580.94   0.83  51.05
 









At 2014-09-10 08:37:23, "Josh Durgin" wrote:
>On 09/09/2014 07:06 AM, yuelongguang wrote:
>> hi, josh.durgin:
>> i want to know how librbd launch io request.
>> use case:
>> inside vm, i use fio to test rbd-disk's io performance.
>> fio's pramaters are  bs=4k, direct io, qemu cache=none.
>> in this case, if librbd just send what it gets from vm, i mean  no
>> gather/scatter. the rate , io inside vm : io at librbd: io at osd
>> filestore = 1:1:1?
>
>If the rbd image is not a clone, the io issued from the vm's block
>driver will match the io issued by librbd. With caching disabled
>as you have it, the io from the OSDs will be similar, with some
>small amount extra for OSD bookkeeping.
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] why one osd-op from client can get two osd-op-reply?

2014-09-11 Thread yuelongguang
As for the second question, could you tell me where the code is?

How does Ceph make size/min_size copies?
 
thanks









At 2014-09-11 12:19:18, "Gregory Farnum"  wrote:
>On Wed, Sep 10, 2014 at 8:29 PM, yuelongguang  wrote:
>>
>>
>>
>>
>> as for ack and ondisk, ceph has size and min_size to decide there are  how
>> many replications.
>> if client receive ack or ondisk, which means there are at least min_size
>> osds  have  done the ops?
>>
>> i am reading the cource code, could you help me with the two questions.
>>
>> 1.
>>  on osd, where is the code that reply ops  separately  according to ack or
>> ondisk.
>>  i check the code, but i thought they always are replied together.
>
>It depends on what journaling mode you're in, but generally they're
>triggered separately (unless it goes on disk first, in which case it
>will skip the ack — this is the mode it uses for non-btrfs
>filesystems). The places where it actually replies are pretty clear
>about doing one or the other, though...
>
>>
>> 2.
>>  now i just know how client write ops to primary osd, inside osd cluster,
>> how it promises min_size copy are reached.
>> i mean  when primary osd receives ops , how it spreads ops to others, and
>> how it processes other's reply.
>
>That's not how it works. The primary for a PG will not go "active"
>with it until it has at least min_size copies that it knows about.
>Once the OSD is doing any processing of the PG, it requires all
>participating members to respond before it sends any messages back to
>the client.
>-Greg
>Software Engineer #42 @ http://inktank.com | http://ceph.com
>
>>
>>
>> greg, thanks very much
>>
>>
>>
>>
>>
>> At 2014-09-11 01:36:39, "Gregory Farnum" wrote:
>>
>> The important bit there is actually near the end of the message output line,
>> where the first says "ack" and the second says "ondisk".
>>
>> I assume you're using btrfs; the ack is returned after the write is applied
>> in-memory and readable by clients. The ondisk (commit) message is returned
>> after it's durable to the journal or the backing filesystem.
>> -Greg
>>
>> On Wednesday, September 10, 2014, yuelongguang  wrote:
>>>
>>> hi,all
>>> i recently debug ceph rbd, the log tells that  one write to osd can get
>>> two if its reply.
>>> the difference between them is seq.
>>> why?
>>>
>>> thanks
>>> ---log-
>>> reader got message 6 0x7f58900010a0 osd_op_reply(15
>>> rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304
>>> write_size 4194304,write 0~3145728] v211'518 uv518 ack = 0) v6
>>> 2014-09-10 08:47:32.348213 7f58bc16b700 20 -- 10.58.100.92:0/1047669 queue
>>> 0x7f58900010a0 prio 127
>>> 2014-09-10 08:47:32.348230 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader reading tag...
>>> 2014-09-10 08:47:32.348245 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader got MSG
>>> 2014-09-10 08:47:32.348257 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader got envelope type=43 src osd.1 front=247 data=0 off 0
>>> 2014-09-10 08:47:32.348269 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader wants 247 from dispatch throttler 247/104857600
>>> 2014-09-10 08:47:32.348286 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader got front 247
>>> 2014-09-10 08:47:32.348303 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).aborted = 0
>>> 2014-09-10 08:47:32.348312 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader got 247 + 0 + 0 byte message
>>> 2014-09-10 08:47:32.348332 7f58bc16b700 10 check_message_signature: seq #
>>> = 7 front_crc_ = 3699418201 middle_crc = 0 data_crc = 0
>>> 2014-09-10 08:47:32.348369 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >>
>>> 10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1
>>> c=0xfae940).reader got message 7 0x7f5890003660 osd_op_reply(15
>>> rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304
>>> write_size 4194304,write 0~3145728] v211'518 uv518 ondisk = 0) v6
>>>
>>>
>>
>>
>> --
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] osd cpu usage is bigger than 100%

2014-09-11 Thread yuelongguang
hi, all
I am testing rbd performance. Right now there is only one vm using rbd as
its disk, and inside it fio is doing r/w.
The big difference is that I set a large iodepth rather than iodepth=1.
According to my test, the bigger the iodepth, the higher the cpu usage. 
 
Analysing the output of the top command:
1. 
12% wa: does that mean the disk is not fast enough?
 
2. From where can we tell whether Ceph's number of threads is enough or
not? (See the sketch below.)
 
 
What do you think? Which part is using up the cpu? I want to find the root
cause of why a big iodepth leads to high cpu usage.
 
 
---default options
  "osd_op_threads": "2",
  "osd_disk_threads": "1",
  "osd_recovery_threads": "1",
  "filestore_op_threads": "2",
 
 
thanks
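For question 2, a hedged way to inspect this on an OSD node is the admin socket (the socket path below follows the default naming; adjust the osd id):

# the thread-related settings the daemon is actually running with
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok config show | grep threads
# internal performance counters and recent slow ops, to see where time is spent
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok perf dump
ceph --admin-daemon /var/run/ceph/ceph-osd.0.asok dump_historic_ops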
 
--top---iodepth=16-
top - 15:27:34 up 2 days,  6:03,  2 users,  load average: 0.49, 0.56, 0.62
Tasks:  97 total,   1 running,  96 sleeping,   0 stopped,   0 zombie
Cpu(s): 19.0%us,  8.1%sy,  0.0%ni, 59.3%id, 12.1%wa,  0.0%hi,  0.8%si,  0.7%st
Mem:   1922540k total,  1853180k used,69360k free, 7012k buffers
Swap:  1048568k total,76796k used,   971772k free,  1034272k cached
  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND

   
 2763 root  20   0 1112m 386m 5028 S 60.8 20.6 200:43.47 ceph-osd 
 
 -top
top - 19:50:08 up 1 day, 10:26,  2 users,  load average: 1.55, 0.97, 0.81
Tasks:  97 total,   1 running,  96 sleeping,   0 stopped,   0 zombie
Cpu(s): 37.6%us, 14.2%sy,  0.0%ni, 37.0%id,  9.4%wa,  0.0%hi,  1.3%si,  0.5%st
Mem:   1922540k total,  1820196k used,   102344k free,23100k buffers
Swap:  1048568k total,91724k used,   956844k free,  1052292k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND

   
 4312 root  20   0 1100m 337m 5192 S 107.3 18.0  88:33.27 ceph-osd  

   
 1704 root  20   0  514m 272m 3648 S  0.7 14.5   3:27.19 ceph-mon  

 

--iostat--

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   5.50   137.50  247.00  782.00  2896.00  8773.0011.34 
7.083.55   0.63  65.05

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   9.50   119.00  327.50  458.50  3940.00  4733.5011.03
12.03   19.66   0.70  55.40

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd  15.5010.50  324.00  559.50  3784.00  3398.00 8.13 
1.982.22   0.81  71.25

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   4.50   253.50  273.50  803.00  3056.00 12155.0014.13 
4.704.32   0.55  59.55

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd  10.00 6.00  294.00  488.00  3200.00  2933.50 7.84 
1.101.49   0.70  54.85

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd  10.0014.00  333.00  645.00  3780.00  3846.00 7.80 
2.132.15   0.90  87.55

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd  11.00   240.50  259.00  579.00  3144.00 10035.5015.73 
8.51   10.18   0.84  70.20

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd  10.5017.00  318.50  707.00  3876.00  4084.50 7.76 
1.321.30   0.61  62.65

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   4.50   208.00  233.50  918.00  2648.00 19214.5018.99 
5.434.71   0.55  63.20

Device: rrqm/s   wrqm/s r/s w/s   rsec/s   wsec/s avgrq-sz 
avgqu-sz   await  svctm  %util
vdd   7.00 1.50  306.00  212.00  3376.00  2176.5010.72 
1.031.83   0.96  49.70




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] why one osd-op from client can get two osd-op-reply?

2014-09-10 Thread yuelongguang
 

 

As for ack and ondisk: ceph has size and min_size to decide how many
replicas there are.
If the client receives ack or ondisk, does that mean at least min_size OSDs
have completed the ops?
 
I am reading the source code; could you help me with two questions?
 
1.
 On the OSD, where is the code that replies to ops separately according to ack or
ondisk?
 I checked the code, but I thought they were always replied to together.
 
2.
 So far I only know how the client writes ops to the primary OSD. Inside the OSD cluster, how
does it guarantee that min_size copies are reached?
I mean: when the primary OSD receives ops, how does it spread the ops to the others, and how
does it process the others' replies?
 
 
Greg, thanks very much 







At 2014-09-11 01:36:39, "Gregory Farnum" wrote:
The important bit there is actually near the end of the message output line, 
where the first says "ack" and the second says "ondisk".


I assume you're using btrfs; the ack is returned after the write is applied 
in-memory and readable by clients. The ondisk (commit) message is returned 
after it's durable to the journal or the backing filesystem.
-Greg

On Wednesday, September 10, 2014, yuelongguang  wrote:

hi, all
I am recently debugging ceph rbd; the log shows that one write to an OSD can get two
replies.
The difference between them is the seq.
Why?
 
thanks
---log-
reader got message 6 0x7f58900010a0 osd_op_reply(15 
rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304 
write_size 4194304,write 0~3145728] v211'518 uv518 ack = 0) v6
2014-09-10 08:47:32.348213 7f58bc16b700 20 -- 10.58.100.92:0/1047669 queue 
0x7f58900010a0 prio 127
2014-09-10 08:47:32.348230 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader reading tag...
2014-09-10 08:47:32.348245 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got MSG
2014-09-10 08:47:32.348257 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got envelope type=43 src osd.1 front=247 data=0 off 0
2014-09-10 08:47:32.348269 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader wants 247 from dispatch throttler 247/104857600
2014-09-10 08:47:32.348286 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got front 247
2014-09-10 08:47:32.348303 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).aborted = 0
2014-09-10 08:47:32.348312 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got 247 + 0 + 0 byte message
2014-09-10 08:47:32.348332 7f58bc16b700 10 check_message_signature: seq # = 7 
front_crc_ = 3699418201 middle_crc = 0 data_crc = 0
2014-09-10 08:47:32.348369 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got message 7 0x7f5890003660 osd_op_reply(15 
rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304 
write_size 4194304,write 0~3145728] v211'518 uv518 ondisk = 0) v6






--
Software Engineer #42 @ http://inktank.com | http://ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] osd cpu usage is bigger than 100%

2014-09-10 Thread yuelongguang
hi, all
I am testing rbd performance. Right now there is only one vm using rbd as
its disk, and inside it fio is doing r/w.
The big difference is that I set a large iodepth rather than iodepth=1.
 
What do you think? Which part is using up the cpu? I want to find the root
cause.
 
 
---default options
  "osd_op_threads": "2",
  "osd_disk_threads": "1",
  "osd_recovery_threads": "1",
  "filestore_op_threads": "2",
 
 
thanks
 
 
 
top - 19:50:08 up 1 day, 10:26,  2 users,  load average: 1.55, 0.97, 0.81
Tasks:  97 total,   1 running,  96 sleeping,   0 stopped,   0 zombie
Cpu(s): 37.6%us, 14.2%sy,  0.0%ni, 37.0%id,  9.4%wa,  0.0%hi,  1.3%si,  0.5%st
Mem:   1922540k total,  1820196k used,   102344k free,23100k buffers
Swap:  1048568k total,91724k used,   956844k free,  1052292k cached

  PID USER  PR  NI  VIRT  RES  SHR S %CPU %MEMTIME+  COMMAND

   
 4312 root  20   0 1100m 337m 5192 S 107.3 18.0  88:33.27 ceph-osd  

   
 1704 root  20   0  514m 272m 3648 S  0.7 14.5   3:27.19 ceph-mon
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] why one osd-op from client can get two osd-op-reply?

2014-09-10 Thread yuelongguang
hi, all
I am recently debugging ceph rbd; the log shows that one write to an OSD can get two
replies.
The difference between them is the seq.
Why?
 
thanks
---log-
reader got message 6 0x7f58900010a0 osd_op_reply(15 
rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304 
write_size 4194304,write 0~3145728] v211'518 uv518 ack = 0) v6
2014-09-10 08:47:32.348213 7f58bc16b700 20 -- 10.58.100.92:0/1047669 queue 
0x7f58900010a0 prio 127
2014-09-10 08:47:32.348230 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader reading tag...
2014-09-10 08:47:32.348245 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got MSG
2014-09-10 08:47:32.348257 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got envelope type=43 src osd.1 front=247 data=0 off 0
2014-09-10 08:47:32.348269 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader wants 247 from dispatch throttler 247/104857600
2014-09-10 08:47:32.348286 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got front 247
2014-09-10 08:47:32.348303 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).aborted = 0
2014-09-10 08:47:32.348312 7f58bc16b700 20 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got 247 + 0 + 0 byte message
2014-09-10 08:47:32.348332 7f58bc16b700 10 check_message_signature: seq # = 7 
front_crc_ = 3699418201 middle_crc = 0 data_crc = 0
2014-09-10 08:47:32.348369 7f58bc16b700 10 -- 10.58.100.92:0/1047669 >> 
10.154.249.4:6800/2473 pipe(0xfae6d0 sd=6 :64407 s=2 pgs=133 cs=1 l=1 
c=0xfae940).reader got message 7 0x7f5890003660 osd_op_reply(15 
rbd_data.19d92ae8944a.0001 [set-alloc-hint object_size 4194304 
write_size 4194304,write 0~3145728] v211'518 uv518 ondisk = 0) v6
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about librbd io

2014-09-09 Thread yuelongguang
hi, josh.durgin:
 
I want to know how librbd launches io requests.
 
Use case:
Inside the vm, I use fio to test the rbd disk's io performance.
fio's parameters are bs=4k, direct io, qemu cache=none.
In this case, does librbd just send on what it gets from the vm, i.e. no
gather/scatter, so the ratio of io inside the vm : io at librbd : io at the osd filestore is
1:1:1?
 
 
 
thanks
 
fio
[global]
ioengine=libaio
buffered=0
rw=randrw
#size=3g
#directory=/data1
filename=/dev/vdb

[file0]
iodepth=1
bs=4k
time_based
runtime=300
stonewall
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] all my osds are down, but ceph -s tells they are up and in.

2014-09-08 Thread yuelongguang
hi, all
 
This is crazy.
1.
All my OSDs are down, but ceph -s says they are up and in. Why?
2.
With all OSDs down, a vm is using rbd as its disk, and inside the vm fio is
reading/writing the disk, but it hangs and cannot be killed. Why?
 
thanks
 
[root@cephosd2-monb ~]# ceph -v
ceph version 0.81 (8de9501df275a5fe29f2c64cb44f195130e4a8fc)
 
 [root@cephosd2-monb ~]# ceph -s
cluster 508634f6-20c9-43bb-bc6f-b777f4bb1651
 health HEALTH_WARN mds 0 is laggy
 monmap e13: 3 mons at 
{cephosd1-mona=10.154.249.3:6789/0,cephosd2-monb=10.154.249.4:6789/0,cephosd3-monc=10.154.249.5:6789/0},
 election epoch 154, quorum 0,1,2 cephosd1-mona,cephosd2-monb,cephosd3-monc
 mdsmap e21: 1/1/1 up {0=0=up:active(laggy or crashed)}
 osdmap e196: 5 osds: 5 up, 5 in
  pgmap v21836: 512 pgs, 5 pools, 3115 MB data, 805 objects
9623 MB used, 92721 MB / 102344 MB avail
 512 active+clean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] what does monitor data directory include?

2014-08-28 Thread yuelongguang
hi Joao, Mark Nelson, both of you:

Where is the monmap stored?
How do I dump the monitor's data in /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/?
 
thanks
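On the monmap question, a hedged sketch that goes through the monitors rather than reading store.db directly:

# fetch the current monmap from the quorum and print it
ceph mon getmap -o /tmp/monmap
monmaptool --print /tmp/monmap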









At 2014-08-28 09:00:41, "Mark Nelson"  wrote:
>On 08/28/2014 07:48 AM, yuelongguang wrote:
>> hi,all
>> what is in directory, /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/
>> how to dump?
>> where monmap is stored?
>
>That directory is typically a leveldb store, though potentially could be 
>rocksdb or maybe something else after firefly.  You can use the leveldb 
>api to access it.  There may be other convenience tools to extract the 
>data out of it too.  Joao may know more.
>
>
>> thanks
>>
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] what does monitor data directory include?

2014-08-28 Thread yuelongguang
hi, all
What is in the directory /var/lib/ceph/mon/ceph-cephosd1-mona/store.db/,
and how do I dump it?
Where is the monmap stored?
 
thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph can not repair itself after accidental power down, half of pgs are peering

2014-08-28 Thread yuelongguang

The next day it returned to normal.
I have no idea why.






At 2014-08-27 00:38:29, "Michael"  wrote:

How far out are your clocks? It's showing a clock skew; if they're too far out 
it can cause issues with cephx.
Otherwise you're probably going to need to check your cephx auth keys.

-Michael

On 26/08/2014 12:26, yuelongguang wrote:

hi, all
 
I have 5 OSDs and 3 mons; the cluster's status was OK before this.
 
It should be mentioned that this cluster holds no data; I just deployed it to get
familiar with some command lines.
What is the problem, and how do I fix it?
 
thanks
 
 
---environment-
ceph-release-1-0.el6.noarch
ceph-deploy-1.5.11-0.noarch
ceph-0.81.0-5.el6.x86_64
ceph-libs-0.81.0-5.el6.x86_64
-ceph -s --
[root@cephosd1-mona ~]# ceph -s
cluster 508634f6-20c9-43bb-bc6f-b777f4bb1651
 health HEALTH_WARN 183 pgs peering; 183 pgs stuck inactive; 183 pgs stuck 
unclean; clock skew detected on mon.cephosd2-monb, mon.cephosd3-monc
 monmap e13: 3 mons at 
{cephosd1-mona=10.154.249.3:6789/0,cephosd2-monb=10.154.249.4:6789/0,cephosd3-monc=10.154.249.5:6789/0},
 election epoch 74, quorum 0,1,2 cephosd1-mona,cephosd2-monb,cephosd3-monc
 osdmap e151: 5 osds: 5 up, 5 in
  pgmap v499: 384 pgs, 4 pools, 0 bytes data, 0 objects
201 MB used, 102143 MB / 102344 MB avail
 167 peering
 201 active+clean
  16 remapped+peering
 
 
--log--osd.0
2014-08-26 19:16:13.926345 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:13.926355 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2a80 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d5960).accept: got bad authorizer
2014-08-26 19:16:28.928023 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:28.928050 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2800 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d56a0).accept: got bad authorizer
2014-08-26 19:16:28.929139 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:28.929237 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38071 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:16:43.930846 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:43.930899 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2580 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d0b00).accept: got bad authorizer
2014-08-26 19:16:43.932204 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:43.932230 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38073 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:16:58.933526 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:58.935094 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2300 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d0840).accept: got bad authorizer
2014-08-26 19:16:58.936239 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:58.936261 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38074 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:17:13.937335 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:17:13.937368 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2080 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d1b80).accept: got bad authorizer
2014-08-26 19:17:13.937923 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:17:13.937933 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38075 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:17:28.939439 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block paddi

[ceph-users] ceph can not repair itself after accidental power down, half of pgs are peering

2014-08-26 Thread yuelongguang
hi,all
 
I have 5 OSDs and 3 mons. Its status was OK before.
 
To be mentioned, this cluster has no data; I just deployed it to become
familiar with some command lines.
What is the problem and how do I fix it?
 
thanks
 
 
---environment-
ceph-release-1-0.el6.noarch
ceph-deploy-1.5.11-0.noarch
ceph-0.81.0-5.el6.x86_64
ceph-libs-0.81.0-5.el6.x86_64
-ceph -s --
[root@cephosd1-mona ~]# ceph -s
cluster 508634f6-20c9-43bb-bc6f-b777f4bb1651
 health HEALTH_WARN 183 pgs peering; 183 pgs stuck inactive; 183 pgs stuck 
unclean; clock skew detected on mon.cephosd2-monb, mon.cephosd3-monc
 monmap e13: 3 mons at 
{cephosd1-mona=10.154.249.3:6789/0,cephosd2-monb=10.154.249.4:6789/0,cephosd3-monc=10.154.249.5:6789/0},
 election epoch 74, quorum 0,1,2 cephosd1-mona,cephosd2-monb,cephosd3-monc
 osdmap e151: 5 osds: 5 up, 5 in
  pgmap v499: 384 pgs, 4 pools, 0 bytes data, 0 objects
201 MB used, 102143 MB / 102344 MB avail
 167 peering
 201 active+clean
  16 remapped+peering
 
 
--log--osd.0
2014-08-26 19:16:13.926345 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:13.926355 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2a80 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d5960).accept: got bad authorizer
2014-08-26 19:16:28.928023 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:28.928050 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2800 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d56a0).accept: got bad authorizer
2014-08-26 19:16:28.929139 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:28.929237 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38071 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:16:43.930846 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:43.930899 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2580 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d0b00).accept: got bad authorizer
2014-08-26 19:16:43.932204 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:43.932230 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38073 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:16:58.933526 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:16:58.935094 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2300 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d0840).accept: got bad authorizer
2014-08-26 19:16:58.936239 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:16:58.936261 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38074 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:17:13.937335 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:17:13.937368 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc2080 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d1b80).accept: got bad authorizer
2014-08-26 19:17:13.937923 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:17:13.937933 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :38075 s=1 pgs=0 cs=0 l=0 
c=0x45d23c0).failed verifying authorize reply
2014-08-26 19:17:28.939439 7f114a8d2700  0 cephx: verify_authorizer could not 
decrypt ticket info: error: decryptor.MessageEnd::Exception: 
StreamTransformationFilter: invalid PKCS #7 block padding found
2014-08-26 19:17:28.939455 7f114a8d2700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x4dc1e00 sd=25 :6800 s=0 pgs=0 cs=0 l=0 
c=0x45d5540).accept: got bad authorizer
2014-08-26 19:17:28.939716 7f114c009700  0 cephx: verify_reply couldn't decrypt 
with error: error decoding block for decryption
2014-08-26 19:17:28.939731 7f114c009700  0 -- 11.154.249.2:6800/1667 >> 
11.154.249.7:6800/1599 pipe(0x3edb700 sd=24 :3807

Re: [ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread yuelongguang

Thanks, Irek Fasikhov.
Is it the only way to test ceph-rbd? An important aim of the test is to find
where the bottleneck is: qemu, librbd, or ceph.
could you share your test result with me?
 
 
 
thanks




 


At 2014-08-26 04:22:22, "Irek Fasikhov"  wrote:

Hi.
I, like many people, use fio.
There is a special fio engine for ceph rbd:
https://telekomcloud.github.io/ceph/2014/02/26/ceph-performance-analysis_fio_rbd.html
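 
A minimal fio job sketch for that rbd engine (requires a fio build with rbd 
support); the pool name, image name, and run parameters below are placeholders, 
and the image must already exist:
 
cat > rbd-test.fio <<'EOF'
[rbd-randwrite]
ioengine=rbd
clientname=admin
pool=rbd
rbdname=test-image
rw=randwrite
bs=4k
iodepth=32
runtime=60
EOF
fio rbd-test.fio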



2014-08-26 12:15 GMT+04:00 yuelongguang :

hi,all
 
I am planning to do a test on Ceph, covering performance, throughput,
scalability, and availability.
In order to get a full test result, I hope you all can give me some advice.
Meanwhile, I can send the results to you, if you like.
As for each test category (performance, throughput, scalability, availability),
do you have some test ideas and test tools?
Basically, I know some tools to test throughput and IOPS, but please tell me
the tools you prefer and the results you expect.
 
thanks very much
 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







--

Best regards, Irek Nurgayazovich Fasikhov
Mob.: +79229045757___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] enrich ceph test methods, what is your concern about ceph. thanks

2014-08-26 Thread yuelongguang
hi,all
 
I am planning to do a test on Ceph, covering performance, throughput,
scalability, and availability.
In order to get a full test result, I hope you all can give me some advice.
Meanwhile, I can send the results to you, if you like.
As for each test category (performance, throughput, scalability, availability),
do you have some test ideas and test tools?
Basically, I know some tools to test throughput and IOPS, but please tell me
the tools you prefer and the results you expect.
 
thanks very much
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about getting rbd.ko and ceph.ko

2014-08-26 Thread yuelongguang
hi,all
 
Is there a way to get rbd.ko and ceph.ko for CentOS 6.x?
 
Or do I have to build them from source code? What is the minimum kernel
version?
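 
A quick sketch for checking the running kernel; the stock CentOS 6 kernel
(2.6.32) does not ship rbd.ko/ceph.ko, so a newer kernel is the usual route.
The ELRepo repository below is only one option and is assumed to be configured:
 
modinfo rbd ceph || echo "rbd.ko/ceph.ko not present in this kernel"
yum --enablerepo=elrepo-kernel install kernel-ml   # then reboot into the new kernel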
 
thanks___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] help to confirm if journal includes everything a OP has

2014-08-14 Thread yuelongguang
hi,all
 
By reading the code, I notice that everything in an OP is encoded into a
Transaction, which is written into the journal later.
Does the journal record everything (meta, xattrs, file data, ...) of an OP?
If so, everything is written to disk twice, and the journal always reaches a
full state, right?
 
 
thanks ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] can osd start up if journal is lost and it has not been replayed?

2014-08-14 Thread yuelongguang
hi

Could you explain the reason why 'if the journal is lost, the OSD is lost'? If
the journal is lost, actually only the part that has not yet been replayed is
lost.
Let's take a similar case as an example: an OSD is down for some time, so its
journal is out of date (it is missing part of the journal), yet it can still
catch up with the other OSDs. Why?
That example suggests that either an outdated OSD can get the missing journal
entries from the others, or 'catching up' works on a different principle than
the journal.
Could you explain?
 
 
 
thanks








At 2014-08-14 05:21:20, "Craig Lewis"  wrote:

If the journal is lost, the OSD is lost.  This can be a problem if you use 1 
SSD for journals for many OSDs.


There has been some discussion about making the OSDs able to recover from a 
lost journal, but I haven't heard anything else about it.  I haven't been 
paying much attention to the developer mailing list though.




For your second question, I'd start by looking at the source code in 
src/osd/ReplicatedPG.cc (for standard replication), or src/osd/ECBackend.cc 
(for Erasure Coding).  I'm not a Ceph developer though, so that might not be 
the right place to start.





On Tue, Aug 12, 2014 at 7:08 PM, yuelongguang  wrote:

hi,all
 
1.
Can an OSD start up if its journal is lost and has not been replayed?
 
2.
How does it catch up to the latest epoch? Take an OSD as an example: where is
the code? It would be better to consider both cases, journal lost or not.
In my mind the journal only includes meta/R/W operations and does not include
data (file data).
 
 
thanks



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] could you tell the call flow of pg state migration from log

2014-08-13 Thread yuelongguang
2014-08-11 10:17:04.591497 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[5.63( empty 
local-les=153 n=0 ec=81 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 
mlcod 0'0 inactive] null
2014-08-11 10:17:04.591501 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] enter Started
2014-08-11 10:17:04.591501 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[2.65( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] null
2014-08-11 10:17:04.591505 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] enter Start
2014-08-11 10:17:04.591508 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[1.66( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] null
2014-08-11 10:17:04.591509 7f0eb2b8f700  1 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] state: transitioning to Primary
2014-08-11 10:17:04.591513 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[3.64( empty 
local-les=153 n=0 ec=76 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 
mlcod 0'0 inactive] null
2014-08-11 10:17:04.591517 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[4.63( empty 
local-les=153 n=0 ec=79 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 
mlcod 0'0 inactive] null
2014-08-11 10:17:04.591518 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] exit Start 0.13 0 0.00
2014-08-11 10:17:04.591521 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[4.6c( empty 
local-les=153 n=0 ec=79 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 
mlcod 0'0 inactive] null
2014-08-11 10:17:04.591524 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] enter Started/Primary
2014-08-11 10:17:04.591526 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[1.68( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] null
2014-08-11 10:17:04.591529 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] enter Started/Primary/Peering
2014-08-11 10:17:04.591531 7f0ec9b4f7a0 10 osd.0 pg_epoch: 153 pg[1.6b( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 inactive] null
2014-08-11 10:17:04.591535 7f0eb2b8f700  5 osd.0 pg_epoch: 155 pg[0.10( empty 
local-les=153 n=0 ec=1 les/c 153/153 152/152/152) [0] r=0 lpr=153 crt=0'0 mlcod 
0'0 peering] enter Started/Primary/Peering/GetInfo
 
hi,all
The PG's state changes: Start --> Started/Primary -->
Started/Primary/Peering/GetInfo.
Why is that? What triggers such a change, and which thread handles it? Where
is the code?
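 
A small sketch for locating the code behind those transitions; the paths are
from the 0.8x source tree and may differ between releases:
 
grep -rn "boost::statechart" src/osd/PG.h    # the PG peering/recovery state machine
grep -rn "struct GetInfo" src/osd/PG.h       # the states printed in the log above
 
As far as I can tell, the transitions are driven by peering events that the
OSD queues to its peering work queue, so the handling thread is one of the
OSD's worker threads rather than the messenger thread that logged the event.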
 
 
thanks___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] can osd start up if journal is lost and it has not been replayed?

2014-08-12 Thread yuelongguang
hi,all
 
1.
Can an OSD start up if its journal is lost and has not been replayed?
 
2.
How does it catch up to the latest epoch? Take an OSD as an example: where is
the code? It would be better to consider both cases, journal lost or not.
In my mind the journal only includes meta/R/W operations and does not include
data (file data).
 
 
thanks___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph network

2014-08-11 Thread yuelongguang
hi,all
I know Ceph differentiates networks; mostly it uses the public, cluster, and
heartbeat networks.
Do the mon and mds have those networks? I only know the OSD does.
Is there a document that introduces Ceph's networks?
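 
A minimal ceph.conf sketch of the two network options; the public subnet
matches the cluster shown earlier in this archive and the cluster subnet is a
placeholder:
 
cat >> /etc/ceph/ceph.conf <<'EOF'
[global]
public network = 10.154.249.0/24
cluster network = 192.168.100.0/24
EOF
# mons and mds listen only on the public network; when a cluster network is
# defined, OSDs use it for replication and heartbeat traffic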
 
 
thanks.
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] best practice of installing ceph(large-scale deployment)

2014-08-11 Thread yuelongguang
hi,all
I am using ceph-rbd with OpenStack as its backend storage.
Is there a best practice?
1.
At least how many OSDs and mons does it need, and in what proportion?
 
2. How do you deploy the network? Public, cluster network...
 
3. As for performance, what do you do? Journal...
 
4. Anything else that improves Ceph performance.
thanks.
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] what is collection(COLL) and cid

2014-08-03 Thread yuelongguang
hi,all
Look at the code:
 
case Transaction::OP_MKCOLL:
  {
    coll_t cid = i.get_cid();
    ...
  }
 
1.
What are COLL and cid?
Is a coll a PG, and is a cid a PG id?
 
2. What is the relation between cid and 'current/meta'? Or, what is in
current/meta?
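 
For illustration, a sketch of how collections appear on a FileStore OSD's data
directory; the path is from a default deployment and the listing is only an
example:
 
ls /var/lib/ceph/osd/ceph-0/current/
# 0.10_head  2.65_head  ...  meta  omap
# each <pgid>_head directory is one collection holding that PG's objects;
# meta is a special collection for OSD-wide objects such as osdmaps, and
# omap is the leveldb instance backing the omap_* operations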
 
 
thanks very much.___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] question about ApplyManager, SubmitManager and FileJournal

2014-07-31 Thread yuelongguang
hi,all
Recently I dove into the source code, and I am a little confused about them,
maybe because of the many threads, waits, and seqs.
 
1. What does ApplyManager do? It is related to FileStore and FileJournal.
2. What does SubmitManager do?
3. How do they interact and work together?
 
What a big question :). Thanks very much.
 ___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] how ceph store xattr

2014-07-30 Thread yuelongguang
hi,all
1.
It seems that there are two kinds of functions that get/set xattrs.
One kind starts with collection_*, and the other starts with omap_*.
What are the differences between them, and which xattrs use which kind of
function?
 
2.
There is an xattr that tells whether xattrs are stored in leveldb; what is
that xattr?
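 
A sketch of inspecting the xattrs on one object file of a FileStore OSD; the
path and object name are placeholders, and the attribute names are what I
would expect to see rather than a guarantee:
 
getfattr -d -m '.*' /var/lib/ceph/osd/ceph-0/current/0.10_head/<object-file>
# typical entries include user.ceph._ (the object info) and
# user.cephos.spill_out, a marker recording whether additional xattrs have
# spilled over into the leveldb/omap store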
 
 
thanks.___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com