[ceph-users] Why is the extent size observed by "rbd diff" much larger than the size observed by "du" for the object file on the OSD's machine?

2016-12-29 Thread xxhdx1985126
Hi, everyone.

Recently, I needed to check the real size of an rbd image. I followed 
the instructions at "http://ceph.com/planet/rbd-image-real-size/". The result 
is as follows:

[xuxuehan@localhost ~]$ rbd diff xxh_pool/clone_test_img2
Offset    Length  Type 
0 4194304 data 
4194304   4190208 data 
8388608   4182016 data 
12582912  4194304 data 
16777216  4194304 data 
20971520  4194304 data 
25165824  4186112 data 
29360128  4190208 data 
33554432  4194304 data 
37748736  4190208 data 
41943040  4194304 data 
46137344  4186112 data 
50331648  4186112 data 
54525952  4194304 data 
58720256  4190208 data 
62914560  4194304 data 


However, I checked the file size of the object 
"rbd_data.1bfad6b8b4567.0001" which belongs to clone_test_img2, and 
the result is as follows:

[xuxuehan@hdp2384 ~]$ du /home/ceph/software/ceph/var/lib/ceph/osd/ceph-2/current/1.154_head/rbd\\udata.1bfad6b8b4567.0001*
2020    /home/ceph/software/ceph/var/lib/ceph/osd/ceph-2/current/1.154_head/rbd\udata.1bfad6b8b4567.0001__head_A2511954__1

As shown above, "rbd diff" reports about 4MB of changed data for each extent, 
while the real disk space usage of the object reported by "du" is only about 
2MB. Why are they so different?
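
To put the two numbers side by side, one can sum the Length column of the "rbd diff" output and convert the "du" figure to bytes. The sketch below is only an illustration: the helper names are made up for this example, only the first three extents from the listing above are included, and it assumes that "du" printed its default 1 KiB blocks (GNU du without -h or -B).

```python
# Hypothetical helpers for comparing "rbd diff" extents with "du" output.

DIFF_SAMPLE = """\
0         4194304 data
4194304   4190208 data
8388608   4182016 data
"""


def parse_rbd_diff(text):
    """Return (offset, length, type) tuples from plain "rbd diff" output."""
    extents = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header line and anything that is not "offset length type".
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        extents.append((int(fields[0]), int(fields[1]), fields[2]))
    return extents


def du_blocks_to_bytes(blocks, block_size=1024):
    """Convert a du block count to bytes (assumes 1 KiB blocks)."""
    return blocks * block_size


extents = parse_rbd_diff(DIFF_SAMPLE)
logical_bytes = sum(length for _, length, _ in extents)
object_on_disk = du_blocks_to_bytes(2020)

print("bytes reported by rbd diff (3 extents):", logical_bytes)
print("bytes allocated for one object per du:", object_on_disk)
```

Note that "du" reports allocated disk blocks rather than the apparent file size, so sparse regions in the object file are one possible, unconfirmed, source of the gap.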

Please help me, thanks:-)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] What is replay_version used for?

2016-12-26 Thread xxhdx1985126
Hi, everyone.


What is Objecter::Op::replay_version used for?


Thanks:-)


[ceph-users] How to know the address of ceph clients from OSD?

2016-12-21 Thread xxhdx1985126


Hi, everyone.


Sometimes I need to know the IP addresses of the ceph clients that are 
connected at a given time. Is there any way to list those IP addresses in a 
ceph cluster? I'm using ceph rbd with kvm servers.


Thank you:-)


[ceph-users] How to know the ceph client's ip address?

2016-12-21 Thread xxhdx1985126
Hi, everyone.


Sometimes I need to know the IP addresses of the ceph clients that are 
connected at a given time. Is there any way to list those IP addresses in a 
ceph cluster?


Thank you:-)


[ceph-users] Assertion "needs_recovery" fails when balance_read reaches a replica OSD where the target object is not recovered yet.

2016-11-25 Thread xxhdx1985126
Hi, everyone.

In our online system, some OSDs keep failing with the following error:

2016-10-25 19:00:00.626567 7f9a63bff700 -1 error_msg osd/ReplicatedPG.cc: In 
function 'void ReplicatedPG::wait_for_unreadable_object(const hobject_t&, 
OpRequestRef)' thread 7f9a63bff700 time 2016-10-25 19:00:00.624499
osd/ReplicatedPG.cc: 387: FAILED assert(needs_recovery)

ceph version 0.94.5-12-g83f56a1 (83f56a1c84e3dbd95a4c394335a7b1dc926dd1c4)
 1: (ReplicatedPG::wait_for_unreadable_object(hobject_t const&, std::tr1::shared_ptr<OpRequest>)+0x3f5) [0x8b5a65]
 2: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>&)+0x5e9) [0x8f0c79]
 3: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x4e3) [0x87fdc3]
 4: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x178) [0x66b3f8]
 5: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x59e) [0x66f8ee]
 6: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x795) [0xa76d85]
 7: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0xa7a610]
 8: /lib64/libpthread.so.0() [0x3471407a51]
 9: (clone()+0x6d) [0x34710e893d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Our version of ceph is 0.94.5. 
After reading the source code and analyzing our online scenarios, we came up 
with the following conjecture:
   When handling a large number of "balance_reads", the OSDs can become so 
busy that they can't send heartbeats in time, which can lead the monitors to 
wrongly mark them down and trigger other OSDs to go through the peering and 
recovery process, during which, on the replica OSDs, the assertion 
"needs_recovery" at ReplicatedPG.cc:387 is very likely to fail.

To confirm this guess, we ran some targeted tests. If I add extra code to 
make the recovery of an object wait for the ops of type "CEPH_MSG_OSD_OP" 
that target that object to finish, the assertion "needs_recovery" at 
ReplicatedPG.cc:387 always fails. On the other hand, if I make the ops of 
type "CEPH_MSG_OSD_OP" that target an object wait for the corresponding 
recovery to finish, the assertion is not triggered.

Can we conclude that the cause of the assertion failure is what we 
suspected? Also, it seems that the purpose of the failed assertion is to 
make sure that "missing_loc.needs_recovery_map" does contain the unreadable 
object. However, "missing_loc.needs_recovery_map" seems to be always empty on 
replica OSDs. Can we fix this problem simply by bypassing the assertion on 
replicas, in some way like:
  if ( is_primary() ){
  bool needs_recovery = missing_loc.needs_recovery(soid, 
);
  assert(needs_recovery);
   }

I've also submitted a new issue: BUG #18021. Please help me. Thank you:-)


[ceph-users] Question about last_backfill

2016-11-06 Thread xxhdx1985126
Hi, everyone.


In the PGLog::merge_log method, pg log entries in "olog" are inserted into the 
current PGLog's "missing" structure only when their "version" is larger than 
the current PGLog's head and the "soid" of their target object is less than the 
current pg info's last_backfill. What does "last_backfill" in pg_info_t mean? 
Is it the max object id that the pg possessed after the last recovery_backfill 
process? If so, why are only objects with "soid" less than "last_backfill" 
considered missing? What if new objects were created by the current osd and 
modified by other osds while the current osd was "out" or "down"?
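
To make the condition in question concrete, here is a small model of it. This is only a sketch of the rule as worded above, not the actual merge_log code: versions are modeled as (epoch, version) tuples, object ids as plain strings, the comparison is strict as in the question (the real source may differ), and all names are made up for this example.

```python
from collections import namedtuple

# Simplified stand-ins: versions are (epoch, version) tuples and object ids
# are plain strings. Real Ceph uses eversion_t and hobject_t.
LogEntry = namedtuple("LogEntry", ["soid", "version"])


def entries_added_to_missing(olog, local_head, last_backfill):
    """Return the olog entries that the rule described above marks missing."""
    missing = []
    for entry in olog:
        newer_than_head = entry.version > local_head
        already_backfilled = entry.soid < last_backfill
        if newer_than_head and already_backfilled:
            missing.append(entry)
    return missing


olog = [
    LogEntry("obj_a", (10, 5)),   # not newer than the local head
    LogEntry("obj_b", (12, 7)),   # newer, object below last_backfill
    LogEntry("obj_z", (12, 8)),   # newer, object beyond last_backfill
]
result = entries_added_to_missing(olog, local_head=(10, 5),
                                  last_backfill="obj_m")
print([e.soid for e in result])
```

In this model "obj_z" is skipped even though its entry is newer than the local head, which is exactly the case the question is about.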


I'm really confused about this. Please help me, thank you:-)


[ceph-users] Question about PG class

2016-11-02 Thread xxhdx1985126
Hi, everyone.


What are the meanings of the fields  actingbackfill, want_acting and 
backfill_targets  of the PG class?
Thank you:-)


[ceph-users] Question about writing a program that transfer snapshot diffs between ceph clusters

2016-10-31 Thread xxhdx1985126
Hi, everyone.


I'm trying to write a program based on the librbd API that transfers snapshot 
diffs between ceph clusters without the temporary storage that is required 
when using the "rbd export-diff" and "rbd import-diff" pair. I found that the 
configuration object "g_conf" and the ceph context object "g_ceph_context" are 
global variables that are used almost everywhere in the source code, while 
what I need to do in the first place is to construct two or more configuration 
objects, each corresponding to a ceph cluster, and make the operations 
intended for a ceph cluster use the corresponding configuration object. 
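
One possible direction, as far as I know: librados lets a single process hold several independent cluster handles, each created from its own configuration file, which may be enough here without touching g_conf directly. The sketch below uses the Python bindings (rados and rbd). The configuration paths, pool and image names in the trailing comments are placeholders, copy_diff and open_image are hypothetical helpers, and error handling and snapshot creation on the destination are left out. It has not been run against a real cluster.

```python
def open_image(conffile, pool, image_name, snapshot=None):
    """Open an RBD image through a cluster handle built from its own conf."""
    # Imported here so that copy_diff() can be exercised without Ceph.
    import rados
    import rbd
    cluster = rados.Rados(conffile=conffile)
    cluster.connect()
    ioctx = cluster.open_ioctx(pool)
    image = rbd.Image(ioctx, image_name, snapshot=snapshot)
    return cluster, ioctx, image


def copy_diff(src_image, dst_image, from_snapshot=None):
    """Replay the extents changed since from_snapshot from src onto dst."""
    extents = []

    def record(offset, length, exists):
        extents.append((offset, length, exists))

    # Collect the changed extents first, then copy them one by one.
    src_image.diff_iterate(0, src_image.size(), from_snapshot, record)
    copied = 0
    for offset, length, exists in extents:
        if exists:
            # Reads the whole extent into memory; fine for a sketch.
            dst_image.write(src_image.read(offset, length), offset)
            copied += length
        else:
            # The extent was discarded on the source; mirror that.
            dst_image.discard(offset, length)
    return copied


# Example wiring (placeholders, needs two reachable clusters):
#   _, _, src = open_image("/etc/ceph/site-a.conf", "rbd", "img",
#                          snapshot="snap2")
#   _, _, dst = open_image("/etc/ceph/site-b.conf", "rbd", "img")
#   copy_diff(src, dst, from_snapshot="snap1")
```

Because the data goes straight from one image handle to the other, no intermediate diff file is needed; the cost is that both clusters must be reachable from the same process.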


How can I accomplish this task? Or is it just not viable? Thank you:-)


[ceph-users] log file owner not right

2016-10-30 Thread xxhdx1985126
Hi, everyone.


Recently, I deployed a ceph cluster manually. I found that, after I start 
the ceph osd through "/etc/init.d/ceph -a start osd", the size of the log file 
"ceph-osd.log" is 0, and its owner is not "ceph", which I configured in 
/etc/ceph/ceph.conf, but the user who actually ran the /etc/init.d/ceph script. 
I read the /etc/init.d/ceph script and found that the command "ceph-conf" is 
run by the current user with the arguments "-n $type.$id", which makes it 
create a ceph-osd.log owned by the current user.


How should I deal with this problem? Thank you:-)


Re: [ceph-users] Question about OSDSuperblock

2016-10-22 Thread xxhdx1985126
Sorry, sir. I don't quite follow you. I agree that the osd must get the current map to know whom to contact so it can catch up. But it looks to me that the osd is getting the current map through get_map(superblock.current_epoch), where superblock.current_epoch is read from the disk by OSD::read_superblock and hasn't been updated by a monitor at boot time, which means it is not the real current epoch but an old one. How can the OSD get the current map using an old epoch?



On Oct 23, 2016 12:34 AM, David Turner <david.tur...@storagecraft.com> wrote:

The osd needs to know where it thought data was, in particular so it knows what it has. Then it gets the current map so it knows who to talk to so it can catch back up.



Re: [ceph-users] Question about OSDSuperblock

2016-10-22 Thread xxhdx1985126
Sorry, sir. I don't quite follow you. I agree that the osd must get the current map to know whom to contact so it can catch up. But it looks to me that the osd is getting the current map through get_map(superblock.current_epoch), where superblock.current_epoch is read from the disk by OSD::read_superblock at boot time and hasn't been updated by a monitor, which means it is not the real current epoch. How can the OSD get the current map using an old epoch?





[ceph-users] Question about OSDSuperblock

2016-10-22 Thread xxhdx1985126
Hi, everyone.


I'm trying to read the source code that boots an OSD instance, and I found 
something that really confuses me.
In the OSD::init() method, it reads the OSDSuperblock by calling 
OSD::read_superblock(), and then it tries to get the "current" map: "osdmap = 
get_map(superblock.current_epoch)". Then the OSD uses this osdmap to calculate 
the acting and up set of each pg. 
I really don't understand this! Since the OSDSuperblock is read from the disk, 
superblock.current_epoch should be an old epoch that was recorded by the 
last OSD instance that ran on this directory. Why use an old "current_epoch" to 
calculate the acting and up set of each pg?


Please help me, thank you:-)


[ceph-users] Does marking OSD "down" trigger "AdvMap" event in other OSD?

2016-10-16 Thread xxhdx1985126
Hi, everyone.


If one OSD's state changes from up to down, by "kill -i" for example, will 
an "AdvMap" event be triggered on other related OSDs?


[ceph-users] Fw:PG go "incomplete" after setting min_size

2016-10-09 Thread xxhdx1985126
Sorry, I forgot to mention that the pgs assigned to the killed osd were still 
writable after I raised min_size from 1 to 2 but before I restarted the killed 
osd.





[ceph-users] PG go "incomplete" after setting min_size

2016-10-09 Thread xxhdx1985126
Hi, everyone.


I'm a newbie to ceph and am running some tests to understand its behavior. 
The following situation really confused me:


I first killed an osd, which reduced the size of the acting set of some pgs to 
1. Then I raised min_size from 1 to 2, after which I started the killed osd. 
After that, all the pgs previously assigned to the killed osd went 
"incomplete". My cluster contains only 2 hosts, each running 10 osds. And 
I configured it so that the replicas of pgs are assigned to both osds.


Is it supposed to be this way? What is the philosophy about this?


Thank you:-)


[ceph-users] Is it possible to recover the data of which all replicas are lost?

2016-09-27 Thread xxhdx1985126
Hi, everyone.


I've got a problem here. Due to some operational mistakes, I deleted all three 
replicas of my data. Is there any way to recover it?
This is a very urgent problem.


Please help me, Thanks.


[ceph-users] Does the journal of a single OSD roll itself automatically?

2016-09-27 Thread xxhdx1985126
Hi, everyone.


After the file system synchronization, does the OSD delete the journal entries 
that correspond to operations before the synchronization point?


[ceph-users] How is RBD image implemented?

2016-09-19 Thread xxhdx1985126
Hi, everyone.


On the "Block Storage" page, I found this: "Ceph RBD interfaces with the same 
Ceph object storage system that provides the librados interface and the Ceph FS 
file system, and it stores block device images as objects.".
Does it literally mean that an RBD image is implemented as an object, not a 
file, on Ceph? If so, wouldn't that be a problem when creating a very large 
image?
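
For what it is worth, an RBD image is normally not one huge object: it is striped over many fixed-size RADOS objects, which is also what the rbd_data.* object name and the 4194304-byte extents in the "rbd diff" thread in this archive suggest. The sketch below shows the arithmetic only, assuming the default 4 MiB object size and no custom striping; the function names are made up for this example.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed default RBD object size (4 MiB)


def locate(image_offset, object_size=OBJECT_SIZE):
    """Map a byte offset in the image to (object number, offset in object)."""
    return divmod(image_offset, object_size)


def objects_needed(image_size, object_size=OBJECT_SIZE):
    """Upper bound on the number of objects backing an image of this size."""
    return -(-image_size // object_size)  # ceiling division


print(locate(4194304))
print(locate(10 * 1024 * 1024))
print(objects_needed(64 * 1024 * 1024))
```

As far as I understand, the backing objects are only created once data is written to them, so a very large but mostly empty image costs little space.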


Thank you:-)


Re: [ceph-users] What file system does ceph use for an individual OSD, is it still EBOFS?

2016-09-18 Thread xxhdx1985126
Thanks, sir:-)


At 2016-09-19 13:00:18, "Ian Colle" <ico...@redhat.com> wrote:
Some use xfs, others btrfs, and still others use (gasp) zfs and ext4.  


Upstream automated testing currently only runs on xfs, if that gives you a 
sense of the community's comfort level, but there are strong advocates for each 
of the others I listed initially. 


Caveat emptor.


Ian



[ceph-users] What file system does ceph use for an individual OSD, is it still EBOFS?

2016-09-18 Thread xxhdx1985126
Hi, everyone.

I'm a newbie to Ceph. According to Sage A. Weil's paper, Ceph was using EBOFS as 
the file system for its OSDs. However, I looked into the source code of Ceph 
and could hardly find any EBOFS code. Is Ceph still using EBOFS, or has it 
switched to other types of file system for a single OSD?

Thank you:-)