Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Loic Dachary wrote: > Hi, > > On 23/09/2015 12:29, wangsongbo wrote: > > 64.90.32.37 apt-mirror.front.sepia.ceph.com > > It works for me. Could you send a traceroute > apt-mirror.front.sepia.ceph.com ? This is a private IP internal to the sepia lab. Anything outside the

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Jason Dillaman
> In this case the commands look a little confusing to me, as from their > names I would rather think they enable/disable mirror for existent > images too. Also, I don't see a command to check what current > behaviour is. And, I suppose it would be useful if we could configure > other default

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Igor Fedotov wrote: > Hi Sage, > thanks a lot for your feedback. > > Regarding issues with offset mapping and stripe size exposure. > What's about the idea to apply compression in two-tier (cache+backing storage) > model only ? I'm not sure we win anything by making it a

Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread Loic Dachary
Hi, On 23/09/2015 12:29, wangsongbo wrote: > 64.90.32.37 apt-mirror.front.sepia.ceph.com It works for me. Could you send a traceroute apt-mirror.front.sepia.ceph.com ? Cheers -- Loïc Dachary, Artisan Logiciel Libre

Re: [Ceph-announce] Important security notice regarding release signing key

2015-09-23 Thread Gaudenz Steinlin
Sage Weil writes: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA1 > > Last week, Red Hat investigated an intrusion on the sites of both the Ceph > community project (ceph.com) and Inktank (download.inktank.com), which > were hosted on a computer system outside of Red Hat

Re: ceph-mon always election when change crushmap in firefly

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Alexander Yang wrote: > hello, > We use Ceph+Openstack in our private cloud. In our cluster, we have > 5 mons and 800 osds, the Capacity is about 1Pb. And run about 700 vms and > 1100 volumes, > recently, we increase our pg_num , now the cluster have about

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Jason Dillaman
> > > * rbd mirror pool add > > > This will register a remote cluster/pool as a peer to the > > > current, > > > local pool. All existing mirrored images and all future mirrored > > > images will have this peer registered as a journal client. > > > > > > * rbd

Re: [Ceph-announce] Important security notice regarding release signing key

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Gaudenz Steinlin wrote: > Sage Weil writes: > > > -BEGIN PGP SIGNED MESSAGE- > > Hash: SHA1 > > > > Last week, Red Hat investigated an intrusion on the sites of both the Ceph > > community project (ceph.com) and Inktank (download.inktank.com),

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
Hi Sage, thanks a lot for your feedback. Regarding issues with offset mapping and stripe size exposure: what about the idea of applying compression in the two-tier (cache+backing storage) model only? I doubt the single-tier one is widely used for EC pools since there is no random write support in such

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Jason Dillaman
> > In this case the commands look a little confusing to me, as from their > > names I would rather think they enable/disable mirror for existent > > images too. Also, I don't see a command to check what current > > behaviour is. And, I suppose it would be useful if we could configure > > other

failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread wangsongbo
Hi Loic and other Cephers, I am running teuthology-suites in our testing. Because the connection to "apt-mirror.front.sepia.ceph.com" timed out, "ceph-cm-ansible" failed. And from a web browser, I got a response like this: "502 Bad Gateway". "64.90.32.37 apt-mirror.front.sepia.ceph.com"

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
On 23.09.2015 17:05, Gregory Farnum wrote: On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: On Wed, 23 Sep 2015, Igor Fedotov wrote: Hi Sage, thanks a lot for your feedback. Regarding issues with offset mapping and stripe size exposure. What's about the idea to apply

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: > On Wed, 23 Sep 2015, Igor Fedotov wrote: >> Hi Sage, >> thanks a lot for your feedback. >> >> Regarding issues with offset mapping and stripe size exposure. >> What's about the idea to apply compression in two-tier

Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread Loic Dachary
On 23/09/2015 15:11, Sage Weil wrote: > On Wed, 23 Sep 2015, Loic Dachary wrote: >> Hi, >> >> On 23/09/2015 12:29, wangsongbo wrote: >>> 64.90.32.37 apt-mirror.front.sepia.ceph.com >> >> It works for me. Could you send a traceroute >> apt-mirror.front.sepia.ceph.com ? > > This is a private IP

09/23/2015 Weekly Ceph Performance Meeting IS ON!

2015-09-23 Thread Mark Nelson
8AM PST as usual! Discussion topics include an update on transparent huge pages testing and I think Ben would like to talk a bit about CBT PRs. Please feel free to add your own! Here's the links: Etherpad URL: http://pad.ceph.com/p/performance_weekly To join the Meeting:

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Igor Fedotov
Sage, so you are saying that radosgw tends to use EC pools directly without caching, right? I agree that we need offset mapping anyway. And the difference between cache writes and direct writes is mainly in block size granularity: 8 MB vs. 4 KB. In the latter case we have higher overhead

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Igor Fedotov wrote: > Sage, > > so you are saying that radosgw tend to use EC pools directly without caching, > right? > > I agree that we need offset mapping anyway. > > And the difference between cache writes and direct writes is mainly in block > size granularity: 8 Mb

Re: 09/23/2015 Weekly Ceph Performance Meeting IS ON!

2015-09-23 Thread Alexandre DERUMIER
Hi Mark, can you post the video recordings of previous meetings? Thanks Alexandre - Original Message - From: "Mark Nelson" To: "ceph-devel" Sent: Wednesday, September 23, 2015 15:51:21 Subject: 09/23/2015 Weekly Ceph Performance Meeting IS ON!

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Samuel Just
I think before moving forward with any sort of implementation, the design would need to be pretty much completely mapped out -- particularly how the offset mapping will be handled and stored. The right thing to do would be to produce a blueprint and submit it to the list. I also would vastly
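The offset-mapping question running through this thread — how a random read at a logical offset locates its data once stripes are compressed to variable sizes — can be sketched roughly as follows. This is an illustration only, not Ceph code; the stripe size, map layout, and function names are assumptions.

```python
import zlib

STRIPE = 4096  # logical stripe size (illustrative; real stripe sizes differ)

def compress_stripes(data):
    """Compress each logical stripe independently and record where each
    compressed stripe starts, so logical offsets can be mapped back."""
    blob = bytearray()
    offset_map = []  # entries: (logical_offset, physical_offset, compressed_len)
    for logical_off in range(0, len(data), STRIPE):
        chunk = zlib.compress(data[logical_off:logical_off + STRIPE])
        offset_map.append((logical_off, len(blob), len(chunk)))
        blob += chunk
    return bytes(blob), offset_map

def read_at(blob, offset_map, logical_off, length):
    """Serve a random read by decompressing only the stripes it touches."""
    out = bytearray()
    for log_off, phys_off, clen in offset_map:
        if log_off + STRIPE <= logical_off or log_off >= logical_off + length:
            continue  # stripe does not overlap the requested range
        stripe = zlib.decompress(blob[phys_off:phys_off + clen])
        lo = max(logical_off - log_off, 0)
        hi = min(logical_off + length - log_off, len(stripe))
        out += stripe[lo:hi]
    return bytes(out)
```

The point of contention is exactly this `offset_map`: it is extra per-object metadata that must be stored and kept consistent, which is why the thread keeps returning to how (and where) it would be persisted.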

Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread wangsongbo
Sage and Loic, Thanks for your reply. I am running teuthology in our testing. I can send a traceroute to 64.90.32.37, but when ceph-cm-ansible ran the "yum-complete-transaction --cleanup-only" command, it got such a response

perf counters from a performance discrepancy

2015-09-23 Thread Deneau, Tom
Hi all -- Looking for guidance with perf counters... I am trying to see whether the perf counters can tell me anything about the following discrepancy I populate a number of 40k size objects in each of two pools, poolA and poolB. Both pools cover osds on a single node, 5 osds total. *

Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread Loic Dachary
On 23/09/2015 18:50, wangsongbo wrote: > Sage and Loic, > Thanks for your reply. > I am running teuthology in our testing.I can send a traceroute to 64.90.32.37. > but when ceph-cm-ansible run the " yum-complete-transaction --cleanup-only" > command, > it got such a response >

Re: [ceph-users] Potential OSD deadlock?

2015-09-23 Thread Mark Nelson
FWIW, we've got some 40GbE Intel cards in the community performance cluster on a Mellanox 40GbE switch that appear (knock on wood) to be running fine with 3.10.0-229.7.2.el7.x86_64. We did get feedback from Intel that older drivers might cause problems though. Here's ifconfig from one of the

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 4:08 PM, Jason Dillaman wrote: >> > In this case the commands look a little confusing to me, as from their >> > names I would rather think they enable/disable mirror for existent >> > images too. Also, I don't see a command to check what current >> >

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Jason Dillaman
> So a pool policy is just a set of feature bits? It would have to store additional details as well. > I think Cinder at least creates images with rbd_default_features from > ceph.conf and adds in layering if it's not set, meaning there is no > interface for passing through feature bits (or

Re: [ceph-users] Potential OSD deadlock?

2015-09-23 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 We were able to only get ~17Gb out of the XL710 (heavily tweaked) until we went to the 4.x kernel where we got ~36Gb (no tweaking). It seems that there were some major reworks in the network handling in the kernel to efficiently handle that network

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 9:28 PM, Jason Dillaman wrote: >> So a pool policy is just a set of feature bits? > > It would have to store additional details as well. > >> I think Cinder at least creates images with rbd_default_features from >> ceph.conf and adds in layering if
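The "pool policy as a set of feature bits" idea being debated here can be modeled minimally as below. The numeric values follow librbd's RBD_FEATURE_* constants, but the function name and policy shape are hypothetical — the thread's point is precisely that a real policy would store more than bits.

```python
from enum import IntFlag

class RbdFeature(IntFlag):
    # numeric values follow librbd's RBD_FEATURE_* constants
    LAYERING = 1
    EXCLUSIVE_LOCK = 4
    JOURNALING = 64

# mirroring requires exclusive-lock + journaling on every mirrored image
MIRROR_REQUIRED = RbdFeature.EXCLUSIVE_LOCK | RbdFeature.JOURNALING

def effective_features(default_features, pool_mirroring_enabled):
    """OR the pool's mirroring requirements into the image's default
    feature bits, as 'rbd mirror pool enable' is proposed to do."""
    feats = RbdFeature(default_features)
    if pool_mirroring_enabled:
        feats |= MIRROR_REQUIRED
    return feats
```

This also illustrates Ilya's concern about Cinder: a client that only passes `rbd_default_features` through has no hook for the pool to add bits unless the backend applies something like the OR step above.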

Re: Adding Data-At-Rest compression support to Ceph

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 8:26 AM, Igor Fedotov wrote: > > > On 23.09.2015 17:05, Gregory Farnum wrote: >> >> On Wed, Sep 23, 2015 at 6:15 AM, Sage Weil wrote: >>> >>> On Wed, 23 Sep 2015, Igor Fedotov wrote: Hi Sage, thanks a lot for your

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 11:19 AM, Sage Weil wrote: > On Wed, 23 Sep 2015, Deneau, Tom wrote: >> Hi all -- >> >> Looking for guidance with perf counters... >> I am trying to see whether the perf counters can tell me anything about the >> following discrepancy >> >> I populate a

Re: perf counters from a performance discrepancy

2015-09-23 Thread Mark Nelson
On 09/23/2015 01:25 PM, Gregory Farnum wrote: On Wed, Sep 23, 2015 at 11:19 AM, Sage Weil wrote: On Wed, 23 Sep 2015, Deneau, Tom wrote: Hi all -- Looking for guidance with perf counters... I am trying to see whether the perf counters can tell me anything about the

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Mykola Golub
On Tue, Sep 22, 2015 at 01:32:49PM -0400, Jason Dillaman wrote: > > > * rbd mirror pool enable > > > This will, by default, ensure that all images created in this > > > pool have exclusive lock, journaling, and mirroring feature bits > > > enabled. > > > > > > * rbd

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Mykola Golub
On Wed, Sep 23, 2015 at 09:33:14AM +0300, Mykola Golub wrote: > Also, I am not sure we should specify this way, as it is > not consistent with other rbd commands. By default rbd operates on > 'rbd' pool, which can be changed by --pool option. The same reasoning for these commands: > > * rbd

Re: RBD mirroring CLI proposal ...

2015-09-23 Thread Ilya Dryomov
On Wed, Sep 23, 2015 at 9:33 AM, Mykola Golub wrote: > On Tue, Sep 22, 2015 at 01:32:49PM -0400, Jason Dillaman wrote: > >> > > * rbd mirror pool enable >> > > This will, by default, ensure that all images created in this >> > > pool have exclusive lock,

Re: Ceph problem

2015-09-23 Thread zhao.ming...@h3c.com
Dear ceph-devel, Our cluster environment is composed of three hosts; each host runs a monitor process and ten OSD processes. If one of the hosts in the cluster is restarted, running the 'rbd create ...' command blocks for 120 seconds, whereas normally it blocks for 20 seconds. When a host is

Re: perf counters from a performance discrepancy

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Deneau, Tom wrote: > Hi all -- > > Looking for guidance with perf counters... > I am trying to see whether the perf counters can tell me anything about the > following discrepancy > > I populate a number of 40k size objects in each of two pools, poolA and poolB. > Both

Re: [ceph-users] Potential OSD deadlock?

2015-09-23 Thread Robert LeBlanc
-BEGIN PGP SIGNED MESSAGE- Hash: SHA256 OK, here is the update on the saga... I traced some more of blocked I/Os and it seems that communication between two hosts seemed worse than others. I did a two way ping flood between the two hosts using max packet sizes (1500). After 1.5M packets,

ceph pg query - num_objects_missing_on_primary

2015-09-23 Thread GuangYang
Hello, While doing a 'ceph pg {id} query', it dumps the info from all peers; however, for all peers, it only shows 'num_objects_missing_on_primary', which is the same across all peers. Isn't it better to show the 'num_objects_missing' for the peer rather than the primary? Thanks, Guang

RE: perf counters from a performance discrepancy

2015-09-23 Thread Deneau, Tom
I will be out of office for a week but will put this on the list of things to try when I get back. -- Tom > -Original Message- > From: Samuel Just [mailto:sj...@redhat.com] > Sent: Wednesday, September 23, 2015 3:28 PM > To: Deneau, Tom > Cc: Mark Nelson; Gregory Farnum; Sage Weil;

RE: perf counters from a performance discrepancy

2015-09-23 Thread Deneau, Tom
> -Original Message- > From: Gregory Farnum [mailto:gfar...@redhat.com] > Sent: Wednesday, September 23, 2015 3:39 PM > To: Deneau, Tom > Cc: ceph-devel@vger.kernel.org > Subject: Re: perf counters from a performance discrepancy > > On Wed, Sep 23, 2015 at 9:33 AM, Deneau, Tom

Re: perf counters from a performance discrepancy

2015-09-23 Thread Samuel Just
Just to eliminate a variable, can you reproduce this on master, first with the simple messenger, and then with the async messenger? (make sure to switch the messengers on all daemons and clients, just put it in the [global] section on all configs). -Sam On Wed, Sep 23, 2015 at 1:05 PM, Deneau,
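For reference, switching messengers as Sam describes would look something like this in ceph.conf on every daemon and client (a sketch; check the exact option name against your build):

```ini
[global]
# async messenger under test; "simple" restores the default SimpleMessenger
ms type = async
```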

RE: perf counters from a performance discrepancy

2015-09-23 Thread Deneau, Tom
> -Original Message- > From: Mark Nelson [mailto:mnel...@redhat.com] > Sent: Wednesday, September 23, 2015 1:43 PM > To: Gregory Farnum; Sage Weil > Cc: Deneau, Tom; ceph-devel@vger.kernel.org > Subject: Re: perf counters from a performance discrepancy > > > > On 09/23/2015 01:25 PM,

async messenger peering hang

2015-09-23 Thread Samuel Just
I'm seeing some rados runs stuck on peering messages not getting sent by the async messenger: http://tracker.ceph.com/issues/13213. Can you take a look? -Sam

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 9:33 AM, Deneau, Tom wrote: > Hi all -- > > Looking for guidance with perf counters... > I am trying to see whether the perf counters can tell me anything about the > following discrepancy > > I populate a number of 40k size objects in each of two

Re: perf counters from a performance discrepancy

2015-09-23 Thread Gregory Farnum
On Wed, Sep 23, 2015 at 1:51 PM, Deneau, Tom wrote: > > >> -Original Message- >> From: Gregory Farnum [mailto:gfar...@redhat.com] >> So if you've got 20k objects and 5 OSDs then each OSD is getting ~4k reads >> during this test. Which if I'm reading these properly
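The latency counters being compared in this thread come out of a `ceph daemon osd.N perf dump`-style JSON as sum/avgcount pairs. A minimal sketch of turning them into an average per-op latency (the counter name is a real OSD counter, but the sample numbers here are illustrative, not Tom's data):

```python
def avg_latency(perf_dump, section, counter):
    """Average latency in seconds from a perf-dump-style dict: latency
    counters are stored as {'avgcount': n, 'sum': total_seconds}."""
    c = perf_dump[section][counter]
    return c["sum"] / c["avgcount"] if c["avgcount"] else 0.0

# comparing the same counter across two pools' OSDs (made-up numbers)
poolA = {"osd": {"op_r_latency": {"avgcount": 4000, "sum": 12.0}}}
poolB = {"osd": {"op_r_latency": {"avgcount": 4000, "sum": 30.0}}}
```

Comparing such per-OSD averages between poolA and poolB is one way to see whether the discrepancy lives inside the OSD op path or somewhere else (network, client).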

Re: failed to open http://apt-mirror.front.sepia.ceph.com

2015-09-23 Thread wangsongbo
Loic, It's my fault. The DNS server I set was unreachable; when I fixed that, everything was OK. Thanks and Regards, WangSongbo On 15/9/24 1:01 AM, Loic Dachary wrote: On 23/09/2015 18:50, wangsongbo wrote: Sage and Loic, Thanks for your reply. I am running teuthology in our testing.I can

Re: [ceph-users] Important security notice regarding release signing key

2015-09-23 Thread wangsongbo
Hi Ken, Just now I ran teuthology-suites in our testing; it failed for lack of these packages, such as qemu-kvm-0.12.1.2-2.415.el6.3ceph.x86_64, qemu-kvm-tools-0.12.1.2-2.415.el6.3ceph, etc. The change "rm ceph-extras repository config#137" only removes the repository, but did not

Re: Very slow recovery/peering with latest master

2015-09-23 Thread Samuel Just
Wow. Why would that take so long? I think you are correct that it's only used for metadata, we could just add a config value to disable it. -Sam On Wed, Sep 23, 2015 at 3:48 PM, Somnath Roy wrote: > Sam/Sage, > I debugged it down and found out that the >

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy
I am not sure why it is taking so long; I installed the latest libblkid as well, but got the same result. Yeah, a config option will be better; I will add that along with my write-path pull request. Thanks & Regards Somnath -Original Message- From: Samuel Just [mailto:sj...@redhat.com] Sent:

Re: Copyright header

2015-09-23 Thread Handzik, Joe
Yes...HP corporate open-source contribution standards require me to submit that copyright. Such additions exist all over the place in Linux and OpenStack too. > On Sep 23, 2015, at 7:30 PM, Somnath Roy wrote: > > Hi Sage, > In the latest master, I am seeing a new

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy
Sam/Sage, I debugged it down and found out that the get_device_by_uuid->blkid_find_dev_with_tag() call within FileStore::collect_metadata() is hanging for ~3 mins before returning a EINVAL. I saw this portion is newly added after hammer. Commenting it out resolves the issue. BTW, I saw this
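Besides a config option to disable the probe entirely, another way to keep a hung blkid lookup from stalling startup for minutes is a timeout guard around the metadata probe. A rough Python sketch of the idea only — the actual FileStore code is C++, and the names here are hypothetical:

```python
import concurrent.futures
from concurrent.futures import ThreadPoolExecutor

_pool = ThreadPoolExecutor(max_workers=1)

def collect_metadata(probe, timeout_s=5.0, default="unknown"):
    """Run a potentially slow device probe (any callable standing in for
    blkid_find_dev_with_tag) with a timeout, falling back to a default
    value instead of blocking OSD startup."""
    fut = _pool.submit(probe)
    try:
        return fut.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError:
        return default  # don't let a hung blkid lookup stall startup
```

The trade-off is that the metadata field is best-effort: consumers up the stack (e.g. the disk-path reporting Joe mentions) have to tolerate the default value.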

RE: Very slow recovery/peering with latest master

2015-09-23 Thread Somnath Roy

Re: Very slow recovery/peering with latest master

2015-09-23 Thread Handzik, Joe
Ok. When configuring with ceph-disk, it does something nifty and actually gives the OSD the uuid of the disk's partition as its fsid. I bootstrap off that to get an argument to pass into the function you have identified as the bottleneck. I ran it by Sage and we both realized there would be

Re: Very slow recovery/peering with latest master

2015-09-23 Thread Handzik, Joe
I added that, there is code up the stack in calamari that consumes the path provided, which is intended in the future to facilitate disk monitoring and management. Somnath, what does your disk configuration look like (filesystem, SSD/HDD, anything else you think could be relevant)? Did you

Copyright header

2015-09-23 Thread Somnath Roy
Hi Sage, In the latest master, I am seeing a new Copyright header entry for HP in the file Filestore.cc. Is this incidental? * Copyright (c) 2015 Hewlett-Packard Development Company, L.P. Thanks & Regards Somnath

Re: Very slow recovery/peering with latest master

2015-09-23 Thread Sage Weil
On Wed, 23 Sep 2015, Handzik, Joe wrote: > Ok. When configuring with ceph-disk, it does something nifty and > actually gives the OSD the uuid of the disk's partition as its fsid. I > bootstrap off that to get an argument to pass into the function you have > identified as the bottleneck. I ran