Re: [ceph-users] EU mirror now supports rsync

2014-11-04 Thread Florent Bautista
Hi Wido,

I have a connection refused for a few days on your mirror using rsync
(via IPv4).

Is it a blacklist or an issue?

Thank you
Florent


On 04/09/2014 08:04 AM, Wido den Hollander wrote:
 Hi,

 I just enabled rsync on the eu.ceph.com mirror.

 eu.ceph.com mirrors from Ceph.com every 3 hours.

 Feel free to rsync all the contents to your local environment, might be useful
 for some large deployments where you want to save external bandwidth by not
 having each machine fetch the Deb/RPM packages from the internet.

 Rsync is available over IPv4 and IPv6, simply sync with this command:
 $ mkdir cephmirror
 $ rsync -avr --stats --progress eu.ceph.com::ceph cephmirror

 I ask you all to be gentle. It's a free service, so don't start hammering the
 server by setting your Cron to sync every 5 minutes. Once every couple of hours
 should be sufficient.

 Also, please don't all start syncing at the first minute of the hour. When
 setting up the Cron, select a random minute from the hour. This way the load on
 the system can be spread out.
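 A sketch of what such a staggered cron entry could look like (the mirror
 path, the 4-hour interval, and the cron.d layout are illustrative choices,
 not part of Wido's announcement):

```shell
# Pick a random minute once, at setup time, so every mirror host syncs
# at a different offset within the hour (per the advice above).
MINUTE=$((RANDOM % 60))

# Illustrative /etc/cron.d entry: sync every 4 hours at that minute.
echo "${MINUTE} */4 * * * root rsync -avr --stats eu.ceph.com::ceph /srv/cephmirror" \
    > /tmp/ceph-mirror.cron
cat /tmp/ceph-mirror.cron
```

 Generating the minute once and baking it into the crontab keeps each
 host's offset stable, which is what actually spreads the load.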

 Should you have any questions or issues, let me know!

 --
 Wido den Hollander
 42on B.V.
 Ceph trainer and consultant
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Sage Weil
On Tue, 4 Nov 2014, Blair Bethwaite wrote:
 On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:
  In the Ceph session at the OpenStack summit someone asked what the CephFS
  survey results looked like.
 
 Thanks Sage, that was me!
 
   Here's the link:
 
  https://www.surveymonkey.com/results/SM-L5JV7WXL/
 
  In short, people want
 
  fsck
  multimds
  snapshots
  quotas
 
 TBH I'm a bit surprised by a couple of these and hope maybe you guys
 will apply a certain amount of filtering on this...
 
 fsck and quotas were there for me, but multimds and snapshots are what
 I'd consider icing features - they're nice to have but not on the
 critical path to using cephfs instead of e.g. nfs in a production
 setting. I'd have thought stuff like small file performance and
 gateway support was much more relevant to uptake and
 positive/pain-free UX. Interested to hear others rationale here.

Yeah, I agree, and am taking the results with a grain of salt.  I
think the results are heavily influenced by the order they were
originally listed (I wish SurveyMonkey would randomize it for each
person or something).

fsck is a clear #1.  Everybody wants multimds, but I think very few 
actually need it at this point.  We'll be merging a soft quota patch 
shortly, and things like performance (adding the inline data support to 
the kernel client, for instance) will probably compete with getting 
snapshots working (as part of a larger subvolume infrastructure).  That's 
my guess at least; for now, we're really focused on fsck and hard 
usability edges and haven't set priorities beyond that.

We're definitely interested in hearing feedback on this strategy, and on 
peoples' experiences with giant so far...

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Giant not fixed ReplicatedPG::NotTrimming?

2014-11-04 Thread Ta Ba Tuan

Hi Sam,
I am resending the logs with the debug options enabled:
http://123.30.41.138/ceph-osd.21.log
(Sorry about my spam :D)

I saw many missing objects :|

2014-11-04 15:26:02.205607 7f3ab11a8700 10 osd.21 pg_epoch: 106407
pg[24.7d7( v 106407'491583 lc 106401'491579
(105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c
106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402
pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097
active+recovering+degraded m=2 snaptrimq=[306~1,312~1]]
recover_primary 675ea7d7/rbd_data.4930222ae8944a.0001/head//24
106401'491580 (missing) (missing head) (recovering) (recovering head)
2014-11-04 15:26:02.205642 7f3ab11a8700 10 osd.21 pg_epoch: 106407 
pg[24.7d7( v 106407'491583 lc 106401'491579 
(105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 
106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 
pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 106393'491097 
active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] recover_primary 
d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 106401'491581 
(missing) (missing head)
2014-11-04 15:26:02.237994 7f3ab29ab700 10 osd.21 pg_epoch: 106407 
pg[24.7d7( v 106407'491583 lc 106401'491579 
(105805'487042,106407'491583] local-les=106403 n=179 ec=25000 les/c 
106403/106390 106402/106402/106402) [21,28,4] r=0 lpr=106402 
pi=106377-106401/4 rops=2 crt=106401'491581 mlcod 106393'491097 
active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] got missing
d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 v 106401'491581


Thanks Sam and All,
--
Tuan
HaNoi-Vietnam

On 11/04/2014 04:54 AM, Samuel Just wrote:

Can you reproduce with

debug osd = 20
debug filestore = 20
debug ms = 1

In the [osd] section of that osd's ceph.conf?
-Sam
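
For reference, a sketch of the resulting ceph.conf fragment on the node
hosting the affected OSD (osd.21 in this thread):

```ini
; ceph.conf on the node hosting the affected OSD (osd.21 here).
; Restart the OSD daemon after editing so the new debug levels take effect.
[osd]
debug osd = 20
debug filestore = 20
debug ms = 1
```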

On Sun, Nov 2, 2014 at 9:10 PM, Ta Ba Tuan tua...@vccloud.vn wrote:

Hi Sage, Samuel & All,

I upgraded to Giant, but those errors still appear :|
I'm trying to delete the related objects/volumes, but it is very hard to
verify the missing objects :(.

Guide me to resolve it, please! (I send attached detail log).

2014-11-03 11:37:57.730820 7f28fb812700  0 osd.21 105950 do_command r=0
2014-11-03 11:37:57.856578 7f28fc013700 -1 *** Caught signal (Segmentation
fault) **
  in thread 7f28fc013700

  ceph version 0.87-6-gdba7def (dba7defc623474ad17263c9fccfec60fe7a439f0)
  1: /usr/bin/ceph-osd() [0x9b6725]
  2: (()+0xfcb0) [0x7f291fc2acb0]
  3: (ReplicatedPG::trim_object(hobject_t const&)+0x395) [0x811b55]
  4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim
const&)+0x43e) [0x82b9be]
  5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects,
ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na,
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
mpl_::na, mpl_::na, mpl_::na>,
(boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base
const&, void const*)+0xc0) [0x870ce0]
  6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer,
ReplicatedPG::NotTrimming, std::allocator<void>,
boost::statechart::null_exception_translator>::process_queued_events()+0xfb)
[0x85618b]
  7: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer,
ReplicatedPG::NotTrimming, std::allocator<void>,
boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base
const&)+0x1e) [0x85633e]
  8: (ReplicatedPG::snap_trimmer()+0x4f8) [0x7d5ef8]
  9: (OSD::SnapTrimWQ::_process(PG*)+0x14) [0x673ab4]
  10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x48e) [0xa8fade]
  11: (ThreadPool::WorkThread::entry()+0x10) [0xa92870]
  12: (()+0x7e9a) [0x7f291fc22e9a]
  13: (clone()+0x6d) [0x7f291e5ed31d]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to
interpret this.

  -9993 2014-11-03 11:37:47.689335 7f28fc814700  1 -- 172.30.5.2:6803/7606
--> 172.30.5.1:6886/3511 -- MOSDPGPull(6.58e 105950
[PullOp(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6,
recovery_info:
ObjectRecoveryInfo(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6@105938'11622009,
copy_subset: [0~18446744073709551615], clone_subset: {}), recovery_progress:
ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false,
omap_recovered_to:, omap_complete:false))]) v2 -- ?+0 0x26c59000 con
0x22fbc420

 -2 2014-11-03 11:37:57.853585 7f2902820700  5 osd.21 pg_epoch: 105950
pg[24.9e4( v 105946'113392 lc 105946'113391 (103622'109598,105946'113392]
local-les=105948 n=88 ec=25000 les/c 105948/105943 105947/105947/105947)
[21,112,33] r=0 lpr=105947 pi=105933-105946/4 crt=105946'113392 lcod 0'0
mlcod 0'0 active+recovery_wait+degraded m=1 snaptrimq=[303~3,307~1]] enter
Started/Primary/Active/Recovering
 -1 2014-11-03 11:37:57.853735 7f28fc814700  1 -- 172.30.5.2:6803/7606
--> 172.30.5.9:6806/24552 -- MOSDPGPull(24.9e4 105950
[PullOp(5abb99e4/rbd_data.5dd32
f2ae8944a.0165/head//24, recovery_info:

Re: [ceph-users] 0.87 rados df fault

2014-11-04 Thread Thomas Lemarchand
Thanks for your answer Greg.

Unfortunately, the three monitors were working perfectly for at least 30
minutes after the upgrade.

I don't know their memory usage at the time.
What I did was: upgrade mons, upgrade OSDs, upgrade the MDS (single MDS),
upgrade fuse clients. I checked that everything was OK (health OK and
data available). Then I started an rsync of around 7TB of data, mostly
files between 100KB and 10MB, with 6TB of data already in CephFS.

Currently the memory usage of my mons is around 110MB (on 1GB of memory
and 1GB of swap).

I'll keep an eye on this.

On another matter (maybe I should start another thread), sometimes I
have: health HEALTH_WARN mds0: Client
wimi-recette-files-nginx:recette-files-rw failing to respond to cache
pressure; mds0: Client wimi-prod-backupmanager:files-rw failing to
respond to cache pressure

And two minutes later :
health HEALTH_OK

Cephfs fuse clients only. But everything is working well, so I'm not so
worried.

Regards,

-- 
Thomas Lemarchand
Cloud Solutions SAS - Head of Information Systems



On Mon, 2014-11-03 at 09:57 -0800, Gregory Farnum wrote:
 On Mon, Nov 3, 2014 at 4:40 AM, Thomas Lemarchand
 thomas.lemarch...@cloud-solutions.fr wrote:
  Update :
 
  /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746084]
  [21787] 0 21780   492110   185044 920   240143 0
  ceph-mon
  /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746115]
  [13136] 0 1313652172 1753  590 0
  ceph
  /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746126] Out
  of memory: Kill process 21787 (ceph-mon) score 827 or sacrifice child
  /var/log/kern.log.1:Oct 31 17:19:17 c-mon kernel: [17289149.746262]
  Killed process 21787 (ceph-mon) total-vm:1968440kB, anon-rss:740176kB,
  file-rss:0kB
 
  OOM kill.
  I have 1GB memory on my mons, and 1GB swap.
  It's the only mon that crashed. Is there a change in memory requirement
  from Firefly ?
 
 There generally shouldn't be, but I don't think it's something we
 monitored closely.
 More likely your monitor was running near its memory limit already and
 restarting all the OSDs (and servicing the resulting changes) pushed
 it over the edge.
 -Greg
 



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Wido den Hollander
On 11/04/2014 10:02 AM, Sage Weil wrote:
 On Tue, 4 Nov 2014, Blair Bethwaite wrote:
 On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:
 In the Ceph session at the OpenStack summit someone asked what the CephFS
 survey results looked like.

 Thanks Sage, that was me!

  Here's the link:

 https://www.surveymonkey.com/results/SM-L5JV7WXL/

 In short, people want

 fsck
 multimds
 snapshots
 quotas

 TBH I'm a bit surprised by a couple of these and hope maybe you guys
 will apply a certain amount of filtering on this...

 fsck and quotas were there for me, but multimds and snapshots are what
 I'd consider icing features - they're nice to have but not on the
 critical path to using cephfs instead of e.g. nfs in a production
 setting. I'd have thought stuff like small file performance and
 gateway support was much more relevant to uptake and
 positive/pain-free UX. Interested to hear others rationale here.
 
 Yeah, I agree, and am taking the results with a grain of salt.  I
 think the results are heavily influenced by the order they were
 originally listed (I wish SurveyMonkey would randomize it for each
 person or something).
 
 fsck is a clear #1.  Everybody wants multimds, but I think very few 
 actually need it at this point.  We'll be merging a soft quota patch 
 shortly, and things like performance (adding the inline data support to 
 the kernel client, for instance) will probably compete with getting 
 snapshots working (as part of a larger subvolume infrastructure).  That's 
 my guess at least; for now, we're really focused on fsck and hard 
 usability edges and haven't set priorities beyond that.
 
 We're definitely interested in hearing feedback on this strategy, and on 
 peoples' experiences with giant so far...
 

I think the approach is correct. Everybody I talk to wants to kick out
their NFS server, but you don't need multi-MDS for that. Active/standby
is just fine.

Wido

 sage
 --
 To unsubscribe from this list: send the line unsubscribe ceph-devel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 


-- 
Wido den Hollander
42on B.V.
Ceph trainer and consultant

Phone: +31 (0)20 700 9902
Skype: contact42on


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Mariusz Gronczewski
On Tue, 4 Nov 2014 10:36:07 +1100, Blair Bethwaite
blair.bethwa...@gmail.com wrote:

 
 TBH I'm a bit surprised by a couple of these and hope maybe you guys
 will apply a certain amount of filtering on this...
 
 fsck and quotas were there for me, but multimds and snapshots are what
 I'd consider icing features - they're nice to have but not on the
 critical path to using cephfs instead of e.g. nfs in a production
 setting. I'd have thought stuff like small file performance and
 gateway support was much more relevant to uptake and
 positive/pain-free UX. Interested to hear others rationale here.
 

Those are related: if single-MDS small-file performance is good enough
to handle a high load with lots of small files (the typical webserver
case), then having multiple active MDSes becomes less of a priority.

And someone who currently runs OSDs on a bunch of relatively weak nodes
will find an active-active MDS setup more interesting than someone who
can simply buy a new, fast machine for it.


-- 
Mariusz Gronczewski, Administrator

Efigence S. A.
ul. Wołoska 9a, 02-583 Warszawa
T: [+48] 22 380 13 13
F: [+48] 22 380 13 14
E: mariusz.gronczew...@efigence.com
mailto:mariusz.gronczew...@efigence.com




Re: [ceph-users] emperor - firefly 0.80.7 upgrade problem

2014-11-04 Thread Chad Seys
On Monday, November 03, 2014 17:34:06 you wrote:
 If you have osds that are close to full, you may be hitting 9626.  I
 pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
 -Sam

Thanks Sam, I may have been hitting that as well.  I certainly hit too_full
conditions often.  I am able to squeeze PGs off of the too-full OSD by
reweighting, and then eventually all PGs get to where they want to be.  It is
kind of silly that I have to do this manually, though.  Could Ceph order the PG
movements better? (Is this what your bug fix does, in effect?)
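
Until then, the manual dance can at least be scripted; a rough sketch whose
output would feed `ceph osd reweight` (the 85% cutoff and the 0.85 weight
are arbitrary illustrative values, not Ceph defaults):

```shell
# Hypothetical helper: map an OSD's utilization percentage to a temporary
# override weight for "ceph osd reweight" (range 0.0-1.0).
suggest_weight() {
    local util=$1                  # utilization in percent, e.g. 92
    if [ "$util" -ge 85 ]; then
        echo "0.85"                # push PGs off a too-full OSD
    else
        echo "1.00"                # leave healthy OSDs at full weight
    fi
}

suggest_weight 92                  # prints: 0.85
# e.g.: ceph osd reweight 15 "$(suggest_weight 92)"
```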


So, at the moment there are no PGs moving around the cluster, but not all are
in active+clean. Also, there is one OSD which has blocked requests.  The OSD
seems idle, and restarting the OSD just results in a younger blocked request.

~# ceph -s
cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
 health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck
inactive; 210 pgs stuck unclean; 1 requests are blocked > 32 sec
 monmap e3: 3 mons at
{mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0},
election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
 osdmap e115306: 24 osds: 24 up, 24 in
  pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
12747 GB used, 7848 GB / 20596 GB avail
   2 inactive
8494 active+clean
 173 incomplete
  35 down+incomplete

# ceph health detail
...
1 ops are blocked > 8388.61 sec
1 ops are blocked > 8388.61 sec on osd.15
1 osds have slow requests

from the log of the osd with the blocked request (osd.15):
2014-11-04 08:57:26.851583 7f7686331700  0 log [WRN] : 1 slow requests, 1
included below; oldest blocked for > 3840.430247 secs
2014-11-04 08:57:26.851593 7f7686331700  0 log [WRN] : slow request
3840.430247 seconds old, received at 2014-11-04 07:53:26.421301:
osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg


Other requests (like PG scrubs) are happening without taking a long time on 
this OSD.
Also, this was one of the OSDs which I completely drained, removed from ceph, 
reformatted, and created again using ceph-deploy.  So it is completely created 
by firefly 0.80.7 code.


As Greg requested, output of ceph scrub:

2014-11-04 09:25:58.761602 7f6c0e20b700  0 mon.mon01@0(leader) e3
handle_command mon_command({"prefix": "scrub"} v 0) v1
2014-11-04 09:26:21.320043 7f6c0ea0c700  1 mon.mon01@0(leader).paxos(paxos
updating c 11563072..11563575) accept timeout, calling fresh election
2014-11-04 09:26:31.264873 7f6c0ea0c700  0
mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572
used 3891232 avail 2681328
2014-11-04 09:26:33.529403 7f6c0e20b700  0 log [INF] : mon.mon01 calling new 
monitor election
2014-11-04 09:26:33.538286 7f6c0e20b700  1 mon.mon01@0(electing).elector(2996) 
init, last seen epoch 2996
2014-11-04 09:26:38.809212 7f6c0ea0c700  0 log [INF] : mon.mon01@0 won leader 
election with quorum 0,2
2014-11-04 09:26:40.215095 7f6c0e20b700  0 log [INF] : monmap e3: 3 mons at
{mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}
2014-11-04 09:26:40.215754 7f6c0e20b700  0 log [INF] : pgmap v6630201: 8704
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:40.215913 7f6c0e20b700  0 log [INF] : mdsmap e1: 0/0/1 up
2014-11-04 09:26:40.216621 7f6c0e20b700  0 log [INF] : osdmap e115306: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.227010 7f6c0e20b700  0 log [INF] : pgmap v6630202: 8704
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.367373 7f6c0e20b700  1 mon.mon01@0(leader).osd e115307 
e115307: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.437706 7f6c0e20b700  0 log [INF] : osdmap e115307: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.471558 7f6c0e20b700  0 log [INF] : pgmap v6630203: 8704
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:41.497318 7f6c0e20b700  1 mon.mon01@0(leader).osd e115308 
e115308: 24 osds: 24 up, 24 in
2014-11-04 09:26:41.533965 7f6c0e20b700  0 log [INF] : osdmap e115308: 24 
osds: 24 up, 24 in
2014-11-04 09:26:41.553161 7f6c0e20b700  0 log [INF] : pgmap v6630204: 8704
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
2014-11-04 09:26:42.701720 7f6c0e20b700  1 mon.mon01@0(leader).osd e115309 
e115309: 24 osds: 24 up, 24 in
2014-11-04 09:26:42.953977 7f6c0e20b700  0 log [INF] : osdmap e115309: 24 
osds: 24 up, 24 in
2014-11-04 09:26:45.776411 7f6c0e20b700  0 log [INF] : pgmap v6630205: 8704
pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
6344 GB data, 

[ceph-users] RBD - possible to query used space of images/clones ?

2014-11-04 Thread Daniel Schwager
Hi,

is there a way to query the used space of an RBD image created with format 2
(used for KVM)?
Also, if I create a linked clone based on this image, how do I get the
additional, individual used space of this clone?

In ZFS, I can query this kind of information by calling zfs get (1);
rbd info (2) shows not that much information about the image.

best regards
Danny

---

(1) Output of zfs get from a Solaris system

root@storage19:~# zfs get all pool5/w2k8.dsk
NAMEPROPERTY  VALUE  SOURCE
pool5/w2k8.dsk  available 75,3G  -
pool5/w2k8.dsk  checksum  on default
pool5/w2k8.dsk  compression   offdefault
pool5/w2k8.dsk  compressratio 1.00x  -
pool5/w2k8.dsk  copies1  default
pool5/w2k8.dsk  creation  Di. Mai 10 14:44 2011  -
pool5/w2k8.dsk  dedup offdefault
pool5/w2k8.dsk  encryptionoff-
pool5/w2k8.dsk  keychangedate -  default
pool5/w2k8.dsk  keysource none   default
pool5/w2k8.dsk  keystatus none   -
pool5/w2k8.dsk  logbias   latencydefault
pool5/w2k8.dsk  primarycache  alldefault
pool5/w2k8.dsk  readonly  offdefault
pool5/w2k8.dsk  referenced17,4G  -
pool5/w2k8.dsk  refreservationnone   default
pool5/w2k8.dsk  rekeydate -  default
pool5/w2k8.dsk  reservation   none   default
pool5/w2k8.dsk  secondarycachealldefault
pool5/w2k8.dsk  sync  standard   default
pool5/w2k8.dsk  type  volume -
pool5/w2k8.dsk  used  18,5G  -
pool5/w2k8.dsk  usedbychildren0  -
pool5/w2k8.dsk  usedbydataset 17,4G  -
pool5/w2k8.dsk  usedbyrefreservation  0  -
pool5/w2k8.dsk  usedbysnapshots   1,15G  -
pool5/w2k8.dsk  volblocksize  8K -
pool5/w2k8.dsk  volsize   25Glocal
pool5/w2k8.dsk  zoned offdefault


(2) Output of rbd info

[root@ceph-admin2 ~]# rbd info rbd/myimage-1
rbd image 'myimage-1':
size 50000 MB in 12500 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.11e82ae8944a
format: 2
features: layering






Re: [ceph-users] cephfs survey results

2014-11-04 Thread Scottix
Agreed, multi-MDS is a nice-to-have but not required for full production use.
TBH, stability and recovery will win over any IT person dealing with filesystems.

On Tue, Nov 4, 2014 at 7:33 AM, Mariusz Gronczewski
mariusz.gronczew...@efigence.com wrote:
 On Tue, 4 Nov 2014 10:36:07 +1100, Blair Bethwaite
 blair.bethwa...@gmail.com wrote:


 TBH I'm a bit surprised by a couple of these and hope maybe you guys
 will apply a certain amount of filtering on this...

 fsck and quotas were there for me, but multimds and snapshots are what
 I'd consider icing features - they're nice to have but not on the
 critical path to using cephfs instead of e.g. nfs in a production
 setting. I'd have thought stuff like small file performance and
 gateway support was much more relevant to uptake and
 positive/pain-free UX. Interested to hear others rationale here.


 Those are related: if single-MDS small-file performance is good enough
 to handle a high load with lots of small files (the typical webserver
 case), then having multiple active MDSes becomes less of a priority.

 And someone who currently runs OSDs on a bunch of relatively weak nodes
 will find an active-active MDS setup more interesting than someone who
 can simply buy a new, fast machine for it.


 --
 Mariusz Gronczewski, Administrator

 Efigence S. A.
 ul. Wołoska 9a, 02-583 Warszawa
 T: [+48] 22 380 13 13
 F: [+48] 22 380 13 14
 E: mariusz.gronczew...@efigence.com
 mailto:mariusz.gronczew...@efigence.com





-- 
Follow Me: @Scottix
http://about.me/scottix
scot...@gmail.com


Re: [ceph-users] Is there an negative relationship between storage utilization and ceph performance?

2014-11-04 Thread Mark Nelson
I'd say it's storage in general, though Ceph can be especially harsh on 
file systems (RBD can invoke particularly bad fragmentation in btrfs for 
example due to how COW works).


So generally there are a lot of things that can cause slowdowns as your
disks get full:


1) More objects spread across deeper PG directory trees
2) More disk fragmentation in general
3) fragmentation that generates even more fragmentation during writes 
once you no longer have contiguous space to store objects.

4) A higher data/pagecache ratio (and more dentries/inodes to cache)
5) disk heads move farther across the disk during random IO.
6) differences between outer and inner track performance on some disks.

There's probably other things I'm missing.
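
On points 2) and 3), XFS exposes a fragmentation factor via
`xfs_db -r -c frag <device>`; a small sketch of the formula behind that
number (actual vs. ideal extent counts), useful when interpreting its output:

```shell
# xfs_db's fragmentation factor: (actual_extents - ideal_extents) / actual,
# expressed as a percentage. 0% means every file is a single extent.
frag_factor() {
    awk -v actual="$1" -v ideal="$2" \
        'BEGIN { printf "%.2f%%\n", (actual - ideal) * 100 / actual }'
}

frag_factor 1000 250   # prints: 75.00%
```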

Mark

On 11/04/2014 01:56 PM, Andrey Korolyov wrote:

On Tue, Nov 4, 2014 at 10:49 PM, Udo Lembke ulem...@polarzone.de wrote:

Hi,
for a long time I have been looking for performance improvements for our
Ceph cluster.
The last expansion brought better performance because we added another node
(with 12 OSDs). Storage utilization after that was 60%.

Now we have reached 69% again (the next nodes are waiting for installation)
and the performance has dropped! OK, we also changed the Ceph version from
0.72.x to Firefly.
But I wonder if there is a relationship between utilization and performance?!
The OSDs are XFS disks, but now I am starting to use ext4, because of the bad
fragmentation on an XFS filesystem (yes, I already use the mount option
allocsize=4M).

Has anybody the same effect?

Udo



AFAIR there is a specific point somewhere in the Ceph user guide advising
not to exceed a commit ratio of roughly 70% due to the heavy performance
impact. In practice, hot storage feels even a fifty-percent commit on XFS
with default mount parameters, so as a rule of thumb you may want to keep
the commit ratio below 60 percent. For mixed or cold storage the numbers
will vary, as average clat and write throughput matter less.
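
As a worked example of that guideline, using the capacity figures from the
`ceph -s` output quoted earlier in this digest purely for illustration:

```shell
# 12747 GB used out of 20596 GB total => ~62% utilization, still under
# the ~70% ceiling mentioned above.
awk 'BEGIN { printf "%.1f%%\n", 12747 * 100 / 20596 }'
# prints: 61.9%
```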




Re: [ceph-users] cephfs survey results

2014-11-04 Thread Mark Kirkwood

On 04/11/14 22:02, Sage Weil wrote:

On Tue, 4 Nov 2014, Blair Bethwaite wrote:

On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:

In the Ceph session at the OpenStack summit someone asked what the CephFS
survey results looked like.


Thanks Sage, that was me!


  Here's the link:

 https://www.surveymonkey.com/results/SM-L5JV7WXL/

In short, people want

fsck
multimds
snapshots
quotas


TBH I'm a bit surprised by a couple of these and hope maybe you guys
will apply a certain amount of filtering on this...

fsck and quotas were there for me, but multimds and snapshots are what
I'd consider icing features - they're nice to have but not on the
critical path to using cephfs instead of e.g. nfs in a production
setting. I'd have thought stuff like small file performance and
gateway support was much more relevant to uptake and
positive/pain-free UX. Interested to hear others rationale here.


Yeah, I agree, and am taking the results with a grain of salt.  I
think the results are heavily influenced by the order they were
originally listed (I wish SurveyMonkey would randomize it for each
person or something).

fsck is a clear #1.  Everybody wants multimds, but I think very few
actually need it at this point.  We'll be merging a soft quota patch
shortly, and things like performance (adding the inline data support to
the kernel client, for instance) will probably compete with getting
snapshots working (as part of a larger subvolume infrastructure).  That's
my guess at least; for now, we're really focused on fsck and hard
usability edges and haven't set priorities beyond that.

We're definitely interested in hearing feedback on this strategy, and on
peoples' experiences with giant so far...



Heh, not necessarily - I put multi-MDS in there, as we want the CephFS
part to be similar to the rest of Ceph in its availability.


Maybe it's because we are looking at plugging it in with an OpenStack
setup, and for that you want everything to 'just look after itself'. If
on the other hand we merely wanted an NFS replacement, then sure,
multi-MDS is not so important there.


regards

Mark



Re: [ceph-users] RBD - possible to query used space of images/clones ?

2014-11-04 Thread Sébastien Han
$ rbd diff rbd/myimage-1 | awk '{ SUM += $2 } END { print SUM/1024/1024 " MB" }'
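
For the second part of Daniel's question (a clone's individual usage), a
hedged sketch: if the clone was created from a parent snapshot,
`rbd diff --from-snap <parent-snap> rbd/myclone` piped into the same awk sum
should approximate what the clone itself has written. The summation can be
demonstrated offline on sample diff-style output (offset, length, type):

```shell
# Same summation as above, fed with fake "rbd diff" output so the
# arithmetic is visible without a cluster: two 4 MiB data extents.
printf '0 4194304 data\n8388608 4194304 data\n' \
    | awk '{ sum += $2 } END { printf "%.0f MB\n", sum / 1024 / 1024 }'
# prints: 8 MB
```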

--
Regards,
Sébastien Han.

 On 04 Nov 2014, at 16:57, Daniel Schwager daniel.schwa...@dtnet.de wrote:
 
 Hi,
 
 is there a way to query the used space of a RBD image created with format 2 
 (used for kvm)?
 Also, if I create a linked clone base on this image, how do I get the 
 additional, individual used space of this clone?
 
 In ZFS, I can query this kind of information by calling zfs get (1);
 rbd info (2) shows not that much information about the image.
 
 best regards
 Danny
 
 ---
 
 (1) Output of zfs get from a Solaris system
 
 root@storage19:~# zfs get all pool5/w2k8.dsk
 NAMEPROPERTY  VALUE  SOURCE
 pool5/w2k8.dsk  available 75,3G  -
 pool5/w2k8.dsk  checksum  on default
 pool5/w2k8.dsk  compression   offdefault
 pool5/w2k8.dsk  compressratio 1.00x  -
 pool5/w2k8.dsk  copies1  default
 pool5/w2k8.dsk  creation  Di. Mai 10 14:44 2011  -
 pool5/w2k8.dsk  dedup offdefault
 pool5/w2k8.dsk  encryptionoff-
 pool5/w2k8.dsk  keychangedate -  default
 pool5/w2k8.dsk  keysource none   default
 pool5/w2k8.dsk  keystatus none   -
 pool5/w2k8.dsk  logbias   latencydefault
 pool5/w2k8.dsk  primarycache  alldefault
 pool5/w2k8.dsk  readonly  offdefault
 pool5/w2k8.dsk  referenced17,4G  -
 pool5/w2k8.dsk  refreservationnone   default
 pool5/w2k8.dsk  rekeydate -  default
 pool5/w2k8.dsk  reservation   none   default
 pool5/w2k8.dsk  secondarycachealldefault
 pool5/w2k8.dsk  sync  standard   default
 pool5/w2k8.dsk  type  volume -
 pool5/w2k8.dsk  used  18,5G  -
 pool5/w2k8.dsk  usedbychildren0  -
 pool5/w2k8.dsk  usedbydataset 17,4G  -
 pool5/w2k8.dsk  usedbyrefreservation  0  -
 pool5/w2k8.dsk  usedbysnapshots   1,15G  -
 pool5/w2k8.dsk  volblocksize  8K -
 pool5/w2k8.dsk  volsize   25Glocal
 pool5/w2k8.dsk  zoned offdefault
 
 
 (2) Output of rbd info
 
 [root@ceph-admin2 ~]# rbd info rbd/myimage-1
 rbd image 'myimage-1':
size 50000 MB in 12500 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.11e82ae8944a
format: 2
features: layering
 
 


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Mark Nelson

On 11/04/2014 03:11 PM, Mark Kirkwood wrote:

On 04/11/14 22:02, Sage Weil wrote:

On Tue, 4 Nov 2014, Blair Bethwaite wrote:

On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:

In the Ceph session at the OpenStack summit someone asked what the
CephFS
survey results looked like.


Thanks Sage, that was me!


  Here's the link:

 https://www.surveymonkey.com/results/SM-L5JV7WXL/

In short, people want

fsck
multimds
snapshots
quotas


TBH I'm a bit surprised by a couple of these and hope maybe you guys
will apply a certain amount of filtering on this...

fsck and quotas were there for me, but multimds and snapshots are what
I'd consider icing features - they're nice to have but not on the
critical path to using cephfs instead of e.g. nfs in a production
setting. I'd have thought stuff like small file performance and
gateway support was much more relevant to uptake and
positive/pain-free UX. Interested to hear others rationale here.


Yeah, I agree, and am taking the results with a grain of salt.  I
think the results are heavily influenced by the order they were
originally listed (I wish SurveyMonkey would randomize it for each
person or something).

fsck is a clear #1.  Everybody wants multimds, but I think very few
actually need it at this point.  We'll be merging a soft quota patch
shortly, and things like performance (adding the inline data support to
the kernel client, for instance) will probably compete with getting
snapshots working (as part of a larger subvolume infrastructure).  That's
my guess at least; for now, we're really focused on fsck and hard
usability edges and haven't set priorities beyond that.

We're definitely interested in hearing feedback on this strategy, and on
peoples' experiences with giant so far...



Heh, not necessarily - I put multi mds in there, as we want the cephfs
part to be similar to the rest of ceph in its availability.

Maybe it's because we are looking at plugging it in with an OpenStack
setup, and for that you want everything to 'just look after itself'. If
on the other hand we merely wanted an nfs replacement, then sure, multi
mds is not so important there.


Do you need active/active or is active/passive good enough?



regards

Mark

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Mark Kirkwood

On 05/11/14 10:58, Mark Nelson wrote:

On 11/04/2014 03:11 PM, Mark Kirkwood wrote:

Heh, not necessarily - I put multi mds in there, as we want the cephfs
part to be similar to the rest of ceph in its availability.

Maybe it's because we are looking at plugging it in with an OpenStack
setup, and for that you want everything to 'just look after itself'. If
on the other hand we merely wanted an nfs replacement, then sure, multi
mds is not so important there.


Do you need active/active or is active/passive good enough?



That is of course a good question. We are certainly seeing active/active 
as much better - essentially because all the other bits are, and it 
avoids the need to wake people up to change things. Does that make it 
essential? I'm not 100% sure, it might just be a nice to have that is so 
nice that we'll wait for it to be there!


Cheers

Mark
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Sage Weil
On Wed, 5 Nov 2014, Mark Kirkwood wrote:
 On 04/11/14 22:02, Sage Weil wrote:
  On Tue, 4 Nov 2014, Blair Bethwaite wrote:
   On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:
In the Ceph session at the OpenStack summit someone asked what the
CephFS
survey results looked like.
   
   Thanks Sage, that was me!
   
  Here's the link:

 https://www.surveymonkey.com/results/SM-L5JV7WXL/

In short, people want

fsck
multimds
snapshots
quotas
   
   TBH I'm a bit surprised by a couple of these and hope maybe you guys
   will apply a certain amount of filtering on this...
   
   fsck and quotas were there for me, but multimds and snapshots are what
   I'd consider icing features - they're nice to have but not on the
   critical path to using cephfs instead of e.g. nfs in a production
   setting. I'd have thought stuff like small file performance and
   gateway support was much more relevant to uptake and
   positive/pain-free UX. Interested to hear others rationale here.
  
  Yeah, I agree, and am taking the results with a grain of salt.  I
  think the results are heavily influenced by the order they were
  originally listed (I wish surveymonkey would randomize it for each
  person or something).
  
  fsck is a clear #1.  Everybody wants multimds, but I think very few
  actually need it at this point.  We'll be merging a soft quota patch
  shortly, and things like performance (adding the inline data support to
  the kernel client, for instance) will probably compete with getting
  snapshots working (as part of a larger subvolume infrastructure).  That's
  my guess at least; for now, we're really focused on fsck and hard
  usability edges and haven't set priorities beyond that.
  
  We're definitely interested in hearing feedback on this strategy, and on
  peoples' experiences with giant so far...
  
 
 Heh, not necessarily - I put multi mds in there, as we want the cephfs part to
 be similar to the rest of ceph in its availability.
 
 Maybe it's because we are looking at plugging it in with an OpenStack setup and
 for that you want everything to 'just look after itself'. If on the other hand
 we merely wanted an nfs replacement, then sure, multi mds is not so
 important there.

Important clarification: multimds == multiple *active* MDSs.  Single 
mds means 1 active MDS and N standbys.  One perfectly valid strategy, 
for example, is to run a ceph-mds on *every* node and let the mon pick 
whichever one is active.  (That works as long as you have sufficient 
memory on all nodes.)
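As a rough sketch of that layout (hostnames hypothetical), a ceph.conf fragment along these lines starts one ceph-mds per node; with max_mds left at its default of 1, the monitors keep one MDS active and treat the rest as standbys:

```ini
; Run an MDS daemon on every node; only one becomes active,
; the others act as standbys the mon can promote automatically.
[mds.a]
host = node1

[mds.b]
host = node2

[mds.c]
host = node3
```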

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Mark Kirkwood

On 05/11/14 11:47, Sage Weil wrote:

On Wed, 5 Nov 2014, Mark Kirkwood wrote:

On 04/11/14 22:02, Sage Weil wrote:

On Tue, 4 Nov 2014, Blair Bethwaite wrote:

On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:

In the Ceph session at the OpenStack summit someone asked what the
CephFS
survey results looked like.


Thanks Sage, that was me!


   Here's the link:

  https://www.surveymonkey.com/results/SM-L5JV7WXL/

In short, people want

fsck
multimds
snapshots
quotas


TBH I'm a bit surprised by a couple of these and hope maybe you guys
will apply a certain amount of filtering on this...

fsck and quotas were there for me, but multimds and snapshots are what
I'd consider icing features - they're nice to have but not on the
critical path to using cephfs instead of e.g. nfs in a production
setting. I'd have thought stuff like small file performance and
gateway support was much more relevant to uptake and
positive/pain-free UX. Interested to hear others rationale here.


Yeah, I agree, and am taking the results with a grain of salt.  I
think the results are heavily influenced by the order they were
originally listed (I wish surveymonkey would randomize it for each
person or something).

fsck is a clear #1.  Everybody wants multimds, but I think very few
actually need it at this point.  We'll be merging a soft quota patch
shortly, and things like performance (adding the inline data support to
the kernel client, for instance) will probably compete with getting
snapshots working (as part of a larger subvolume infrastructure).  That's
my guess at least; for now, we're really focused on fsck and hard
usability edges and haven't set priorities beyond that.

We're definitely interested in hearing feedback on this strategy, and on
peoples' experiences with giant so far...



Heh, not necessarily - I put multi mds in there, as we want the cephfs part to
be similar to the rest of ceph in its availability.

Maybe it's because we are looking at plugging it in with an OpenStack setup and
for that you want everything to 'just look after itself'. If on the other hand
we merely wanted an nfs replacement, then sure, multi mds is not so
important there.


Important clarification: multimds == multiple *active* MDSs.  Single
mds means 1 active MDS and N standbys.  One perfectly valid strategy,
for example, is to run a ceph-mds on *every* node and let the mon pick
whichever one is active.  (That works as long as you have sufficient
memory on all nodes.)



Righty, so I think I've (plus a few others perhaps) misunderstood the 
nature of the 'promotion mechanism' for 1 active several standby design 
- I was under the (possibly wrong) impression that you needed to 'do 
something' to make a standby active? If not then yeah it would be fine, 
sorry!


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs survey results

2014-11-04 Thread Shain Miley
+1 for fsck and snapshots, being able to have snapshot backups and 
protect against accidental deletion, etc is something we are really 
looking forward to.


Thanks,

Shain


On 11/04/2014 04:02 AM, Sage Weil wrote:

On Tue, 4 Nov 2014, Blair Bethwaite wrote:

On 4 November 2014 01:50, Sage Weil s...@newdream.net wrote:

In the Ceph session at the OpenStack summit someone asked what the CephFS
survey results looked like.

Thanks Sage, that was me!


  Here's the link:

 https://www.surveymonkey.com/results/SM-L5JV7WXL/

In short, people want

fsck
multimds
snapshots
quotas

TBH I'm a bit surprised by a couple of these and hope maybe you guys
will apply a certain amount of filtering on this...

fsck and quotas were there for me, but multimds and snapshots are what
I'd consider icing features - they're nice to have but not on the
critical path to using cephfs instead of e.g. nfs in a production
setting. I'd have thought stuff like small file performance and
gateway support was much more relevant to uptake and
positive/pain-free UX. Interested to hear others rationale here.

Yeah, I agree, and am taking the results with a grain of salt.  I
think the results are heavily influenced by the order they were
originally listed (I wish surveymonkey would randomize it for each
person or something).

fsck is a clear #1.  Everybody wants multimds, but I think very few
actually need it at this point.  We'll be merging a soft quota patch
shortly, and things like performance (adding the inline data support to
the kernel client, for instance) will probably compete with getting
snapshots working (as part of a larger subvolume infrastructure).  That's
my guess at least; for now, we're really focused on fsck and hard
usability edges and haven't set priorities beyond that.

We're definitely interested in hearing feedback on this strategy, and on
peoples' experiences with giant so far...

sage
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


--
Shain Miley | Manager of Systems and Infrastructure, Digital Media | 
smi...@npr.org | 202.513.3649

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] osd down question

2014-11-04 Thread Craig Lewis
The OSDs will heartbeat each other, and report back to the monitors if
any other OSD fails to respond.

An OSD that fails to respond is effectively down, since it's not doing
the things that it's supposed to do.  It is possible for this process
to cause problems.  For example, I've had some OSDs on an overloaded
node mark all of the other OSDs in the cluster down, because the
overloaded node wasn't processing the heartbeat responses quickly
enough.  The solution there was to adjust mon osd min down reporters
and mon osd min down reports so that a single node can't do that.
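For reference, those options go in the [mon] section of ceph.conf; a sketch with illustrative values only (the idea is to require a reporter count larger than the number of OSDs on any single host):

```ini
[mon]
; Example values, not recommendations: require down reports from enough
; distinct OSDs that one overloaded host cannot mark the rest of the
; cluster down on its own.
mon osd min down reporters = 9
mon osd min down reports = 12
```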

You might see something like "epoch XXX wrongly marked me down",
followed by the OSD rejoining the cluster.  That's a sign that the OSD
was overloaded, but not down.  Once it was kicked out of the cluster,
it caught up with the backlog and was able to rejoin.  This shouldn't
cause a chain reaction though.

If you're not seeing that, then the OSD really is unresponsive, and
needs to be restarted.  The other OSDs will start replicating its
data automatically to make the cluster healthy again.  This should not
cause a chain reaction.  If your cluster is overloaded (very close to
running out of CPU, RAM, or Disk IO), then a failed OSD can cause a
chain reaction as other OSDs pick up the failed OSD's workload.



On Mon, Nov 3, 2014 at 11:12 PM, 飞 duron...@qq.com wrote:
 hello, I have been running ceph v0.87 for one week. This week
 many osds have been marked down, but when I run ps -ef | grep osd I can
 see the osd processes, so the osds are not really down. Then I checked the
 osd logs and saw many lines like "osd.XX from dead osd.YY, marking down".
 Does 0.87 check other osd processes? If some osd is down, will the mon
 mark the current one down?
 This could cause a chain reaction, leading to failure of the entire
 cluster. Is it a bug?

 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] emperor - firefly 0.80.7 upgrade problem

2014-11-04 Thread Samuel Just
Incomplete usually means the pgs do not have any complete copies.  Did
you previously have more osds?
-Sam

On Tue, Nov 4, 2014 at 7:37 AM, Chad Seys cws...@physics.wisc.edu wrote:
 On Monday, November 03, 2014 17:34:06 you wrote:
 If you have osds that are close to full, you may be hitting 9626.  I
 pushed a branch based on v0.80.7 with the fix, wip-v0.80.7-9626.
 -Sam

 Thanks Sam.  I may have been hitting that as well.  I certainly hit too_full
 conditions often.  I am able to squeeze PGs off of the too_full OSD by
 reweighting and then eventually all PGs get to where they want to be.  Kind of
 silly that I have to do this manually though.  Could Ceph order the PG
 movements better? (Is this what your bug fix does in effect?)
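The manual reweighting Chad describes is roughly what `ceph osd reweight-by-utilization` automates: OSDs whose utilization sits well above the cluster average get their override weight scaled down proportionally. A simplified sketch of that idea (not the actual Ceph implementation):

```python
def reweight_by_utilization(utils, weights, threshold=1.2):
    """Suggest new override weights for OSDs whose utilization exceeds
    threshold * average utilization (simplified sketch of the logic
    behind `ceph osd reweight-by-utilization`)."""
    avg = sum(utils) / len(utils)
    new = {}
    for osd, (u, w) in enumerate(zip(utils, weights)):
        if u > threshold * avg:
            # Scale the weight down in proportion to the overshoot.
            new[osd] = round(w * avg / u, 4)
    return new

# OSD 2 is far fuller than its peers, so only it gets reweighted.
print(reweight_by_utilization([0.50, 0.55, 0.90], [1.0, 1.0, 1.0]))
# {2: 0.7222}
```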


 So, at the moment there are no PG moving around the cluster, but all are not
 in active+clean. Also, there is one OSD which has blocked requests.  The OSD
 seems idle and restarting the OSD just results in a younger blocked request.

 ~# ceph -s
 cluster 7797e50e-f4b3-42f6-8454-2e2b19fa41d6
  health HEALTH_WARN 35 pgs down; 208 pgs incomplete; 210 pgs stuck
 inactive; 210 pgs stuck unclean; 1 requests are blocked > 32 sec
  monmap e3: 3 mons at
 {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0},
 election epoch 2996, quorum 0,1,2 mon01,mon02,mon03
  osdmap e115306: 24 osds: 24 up, 24 in
   pgmap v6630195: 8704 pgs, 7 pools, 6344 GB data, 1587 kobjects
 12747 GB used, 7848 GB / 20596 GB avail
2 inactive
 8494 active+clean
  173 incomplete
   35 down+incomplete

 # ceph health detail
 ...
 1 ops are blocked > 8388.61 sec
 1 ops are blocked > 8388.61 sec on osd.15
 1 osds have slow requests

 from the log of the osd with the blocked request (osd.15):
 2014-11-04 08:57:26.851583 7f7686331700  0 log [WRN] : 1 slow requests, 1
 included below; oldest blocked for > 3840.430247 secs
 2014-11-04 08:57:26.851593 7f7686331700  0 log [WRN] : slow request
 3840.430247 seconds old, received at 2014-11-04 07:53:26.421301:
 osd_op(client.11334078.1:592 rb.0.206609.238e1f29.000752e8 [read 512~512]
 4.17df39a7 RETRY=1 retry+read e115304) v4 currently reached pg


 Other requests (like PG scrubs) are happening without taking a long time on
 this OSD.
 Also, this was one of the OSDs which I completely drained, removed from ceph,
 reformatted, and created again using ceph-deploy.  So it is completely created
 by firefly 0.80.7 code.


 As Greg requested, output of ceph scrub:

 2014-11-04 09:25:58.761602 7f6c0e20b700  0 mon.mon01@0(leader) e3
 handle_command mon_command({prefix: scrub} v 0) v1
 2014-11-04 09:26:21.320043 7f6c0ea0c700  1 mon.mon01@0(leader).paxos(paxos
 updating c 11563072..11563575) accept timeout, calling fresh election
 2014-11-04 09:26:31.264873 7f6c0ea0c700  0
 mon.mon01@0(probing).data_health(2996) update_stats avail 38% total 6948572
 used 3891232 avail 2681328
 2014-11-04 09:26:33.529403 7f6c0e20b700  0 log [INF] : mon.mon01 calling new
 monitor election
 2014-11-04 09:26:33.538286 7f6c0e20b700  1 mon.mon01@0(electing).elector(2996)
 init, last seen epoch 2996
 2014-11-04 09:26:38.809212 7f6c0ea0c700  0 log [INF] : mon.mon01@0 won leader
 election with quorum 0,2
 2014-11-04 09:26:40.215095 7f6c0e20b700  0 log [INF] : monmap e3: 3 mons at
 {mon01=128.104.164.197:6789/0,mon02=128.104.164.198:6789/0,mon03=144.92.180.139:6789/0}
 2014-11-04 09:26:40.215754 7f6c0e20b700  0 log [INF] : pgmap v6630201: 8704
 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
 2014-11-04 09:26:40.215913 7f6c0e20b700  0 log [INF] : mdsmap e1: 0/0/1 up
 2014-11-04 09:26:40.216621 7f6c0e20b700  0 log [INF] : osdmap e115306: 24
 osds: 24 up, 24 in
 2014-11-04 09:26:41.227010 7f6c0e20b700  0 log [INF] : pgmap v6630202: 8704
 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
 2014-11-04 09:26:41.367373 7f6c0e20b700  1 mon.mon01@0(leader).osd e115307
 e115307: 24 osds: 24 up, 24 in
 2014-11-04 09:26:41.437706 7f6c0e20b700  0 log [INF] : osdmap e115307: 24
 osds: 24 up, 24 in
 2014-11-04 09:26:41.471558 7f6c0e20b700  0 log [INF] : pgmap v6630203: 8704
 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
 2014-11-04 09:26:41.497318 7f6c0e20b700  1 mon.mon01@0(leader).osd e115308
 e115308: 24 osds: 24 up, 24 in
 2014-11-04 09:26:41.533965 7f6c0e20b700  0 log [INF] : osdmap e115308: 24
 osds: 24 up, 24 in
 2014-11-04 09:26:41.553161 7f6c0e20b700  0 log [INF] : pgmap v6630204: 8704
 pgs: 2 inactive, 8494 active+clean, 173 incomplete, 35 down+incomplete;
 6344 GB data, 12747 GB used, 7848 GB / 20596 GB avail
 2014-11-04 09:26:42.701720 7f6c0e20b700  1 mon.mon01@0(leader).osd e115309
 e115309: 24 osds: 24 up, 24 in
 2014-11-04 09:26:42.953977 

[ceph-users] osd troubleshooting

2014-11-04 Thread shiva rkreddy
Hi,
I'm trying to run osd troubleshooting commands.

*Use case: Stopping osd without re-balancing.*

# ceph osd set noout  // this command works.
But, neither of the following work:
#stop ceph-osd id=1
(Error message: *no valid command found; 10 closest matches:* ...)
 or
# ceph osd stop osd.1
( Error message: *stop: Unknown job: ceph-osd* )

Environment:
ceph: 0.80.7
OS: RHEL6.5
upstart-0.6.5-13.el6_5.3.x86_64
ceph-0.80.7-0.el6.x86_64
ceph-common-0.80.7-0.el6.x86_64

Thanks,
shiva
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Giant not fixed RepllicatedPG:NotStrimming?

2014-11-04 Thread David Zafman

Can you upload the entire log file?

David

 On Nov 4, 2014, at 1:03 AM, Ta Ba Tuan tua...@vccloud.vn wrote:
 
 Hi Sam,
 I resend logs with debug options: http://123.30.41.138/ceph-osd.21.log
 (Sorry about my spam :D)
 
 I saw many missing objects :|
 
 2014-11-04 15:26:02.205607 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( 
 v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] loca
 l-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) 
 [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 
 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] 
 recover_primary 675ea7d7/rbd_data.4930222ae8944a.0001/head//24 
 106401'491580 (missing) (missing head) (recovering) (recovering head)
 2014-11-04 15:26:02.205642 7f3ab11a8700 10 osd.21 pg_epoch: 106407 pg[24.7d7( 
 v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] 
 local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) 
 [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=1 crt=106401'491581 mlcod 
 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] 
 recover_primary 
 d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 106401'491581 
 (missing) (missing head)
 2014-11-04 15:26:02.237994 7f3ab29ab700 10 osd.21 pg_epoch: 106407 pg[24.7d7( 
 v 106407'491583 lc 106401'491579 (105805'487042,106407'491583] 
 local-les=106403 n=179 ec=25000 les/c 106403/106390 106402/106402/106402) 
 [21,28,4] r=0 lpr=106402 pi=106377-106401/4 rops=2 crt=106401'491581 mlcod 
 106393'491097 active+recovering+degraded m=2 snaptrimq=[306~1,312~1]] got 
 missing d4d4bfd7/rbd_data.c6964d30a28220.035f/head//24 v 
 106401'491581
 
 Thanks Sam and All,
 --
 Tuan
 HaNoi-Vietnam
 
 On 11/04/2014 04:54 AM, Samuel Just wrote:
 Can you reproduce with
 
 debug osd = 20
 debug filestore = 20
 debug ms = 1
 
 In the [osd] section of that osd's ceph.conf?
 -Sam
 
 On Sun, Nov 2, 2014 at 9:10 PM, Ta Ba Tuan tua...@vccloud.vn wrote:
 Hi Sage, Samuel  All,
 
 I upgraded to Giant, but these errors still appear :|
 I'm trying to delete the related objects/volumes, but it is very hard to
 verify the missing objects :(.
 
 Please guide me to resolve it! (I attached a detailed log).
 
 2014-11-03 11:37:57.730820 7f28fb812700  0 osd.21 105950 do_command r=0
 2014-11-03 11:37:57.856578 7f28fc013700 -1 *** Caught signal (Segmentation
 fault) **
  in thread 7f28fc013700
 
  ceph version 0.87-6-gdba7def (dba7defc623474ad17263c9fccfec60fe7a439f0)
  1: /usr/bin/ceph-osd() [0x9b6725]
  2: (()+0xfcb0) [0x7f291fc2acb0]
  3: (ReplicatedPG::trim_object(hobject_t const&)+0x395) [0x811b55]
  4: (ReplicatedPG::TrimmingObjects::react(ReplicatedPG::SnapTrim const&)+0x43e) [0x82b9be]
  5: (boost::statechart::simple_state<ReplicatedPG::TrimmingObjects, ReplicatedPG::SnapTrimmer, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0xc0) [0x870ce0]
  6: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_queued_events()+0xfb) [0x85618b]
  7: (boost::statechart::state_machine<ReplicatedPG::SnapTrimmer, ReplicatedPG::NotTrimming, std::allocator<void>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x1e) [0x85633e]
  8: (ReplicatedPG::snap_trimmer()+0x4f8) [0x7d5ef8]
  9: (OSD::SnapTrimWQ::_process(PG*)+0x14) [0x673ab4]
  10: (ThreadPool::worker(ThreadPool::WorkThread*)+0x48e) [0xa8fade]
  11: (ThreadPool::WorkThread::entry()+0x10) [0xa92870]
  12: (()+0x7e9a) [0x7f291fc22e9a]
  13: (clone()+0x6d) [0x7f291e5ed31d]
  NOTE: a copy of the executable, or `objdump -rdS executable` is needed to
 interpret this.
 
  -9993 2014-11-03 11:37:47.689335 7f28fc814700  1 -- 172.30.5.2:6803/7606
 --> 172.30.5.1:6886/3511 -- MOSDPGPull(6.58e 105950
 [PullOp(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6,
 recovery_info:
 ObjectRecoveryInfo(87f82d8e/rbd_data.45e62779c99cf1.22b5/head//6@105938'11622009,
 copy_subset: [0~18446744073709551615], clone_subset: {}), recovery_progress:
 ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false,
 omap_recovered_to:, omap_complete:false))]) v2 -- ?+0 0x26c59000 con
 0x22fbc420
 
 -2 2014-11-03 11:37:57.853585 7f2902820700  5 osd.21 pg_epoch: 105950
 pg[24.9e4( v 105946'113392 lc 105946'113391 (103622'109598,105946'113392]
 local-les=105948 n=88 ec=25000 les/c 105948/105943 105947/105947/105947)
 [21,112,33] r=0 lpr=105947 pi=105933-105946/4 crt=105946'113392 lcod 0'0
 mlcod 0'0 active+recovery_wait+degraded m=1 

Re: [ceph-users] osd troubleshooting

2014-11-04 Thread Steve Anthony
Shiva,

You need to connect to the host where the OSD is located and stop it by
invoking:

service ceph stop osd.1

I don't think there's a way to stop and start OSDs from an admin node,
unless I missed a change that provides this functionality.
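For completeness, a typical maintenance sequence on a sysvinit host such as RHEL 6 looks roughly like this (a sketch; adjust the OSD id to your environment, and on an upstart-managed install use `stop ceph-osd id=1` instead):

```shell
# Keep CRUSH from marking the OSD out and rebalancing while it is offline
ceph osd set noout

# On the host that runs the daemon (sysvinit syntax):
service ceph stop osd.1
# ... perform maintenance ...
service ceph start osd.1

# Restore normal mark-out behaviour
ceph osd unset noout
```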

-Steve

On 11/04/2014 10:59 PM, shiva rkreddy wrote:
 Hi,
 I'm trying to run osd troubleshooting commands.

 *Use case: Stopping osd without re-balancing.*

  # ceph osd set noout  // this command works.
 But, neither of the following work:
 #stop ceph-osd id=1
 (Error message: /*no valid command found; 10 closest matches:*/ ...)
  or
 # ceph osd stop osd.1
 ( Error message: /*stop: Unknown job: ceph-osd*/ )

 Environment:
 ceph: 0.80.7
 OS: RHEL6.5
 upstart-0.6.5-13.el6_5.3.x86_64
 ceph-0.80.7-0.el6.x86_64
 ceph-common-0.80.7-0.el6.x86_64

 Thanks,
 shiva



 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

-- 
Steve Anthony
LTS HPC Support Specialist
Lehigh University
sma...@lehigh.edu

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Full backup/restore of Ceph cluster?

2014-11-04 Thread Christopher Armstrong
Hi folks,

I was wondering if anyone has a solution for performing a complete backup
and restore of a Ceph cluster. A Google search came up with some
articles/blog posts, some of which are old, and I don't really have a great
idea of the feasibility of this.

Here's what I've found:

http://ceph.com/community/blog/tag/backup/
http://ceph.com/docs/giant/rbd/rbd-snapshot/
http://t3491.file-systems-ceph-user.file-systemstalk.us/backups-t3491.html

Is RBD snapshotting what I'm looking for? Is this even possible? Any info
is much appreciated!
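For what it's worth, the approach in the snapshot docs linked above covers RBD images specifically: snapshot each image, export it once in full, then ship incremental diffs. A hedged sketch (pool, image, and snapshot names are hypothetical):

```shell
# Full backup of one image: snapshot it, then export the snapshot
rbd snap create pool/image@backup1
rbd export pool/image@backup1 /backups/image-backup1.img

# Later, ship only the changes since the previous snapshot
rbd snap create pool/image@backup2
rbd export-diff --from-snap backup1 pool/image@backup2 /backups/image-1to2.diff
# ...and apply it on another cluster with:
#   rbd import-diff /backups/image-1to2.diff pool/image
```

Note this backs up RBD data only; there is no built-in whole-cluster backup, so CephFS and RGW data would need their own strategies.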

Thanks,

Chris


*Chris Armstrong*
Head of Services
OpDemand / Deis.io

GitHub: https://github.com/deis/deis -- Docs: http://docs.deis.io/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com