On Wed, Mar 22, 2017 at 8:24 AM, Marcus Furlong wrote:
> Hi,
>
> I'm experiencing the same issue as outlined in this post:
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-September/013330.html
>
> I have also deployed this jewel cluster using ceph-deploy.
>
> This
Can't help, but just wanted to say that the upgrade worked for us:
# ceph health
HEALTH_OK
# ceph tell mon.* version
mon.p01001532077488: ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)
mon.p01001532149022: ceph version 10.2.7
(50e863e0f4bc8f4b9e31156de690d765af245185)
Dear ceph-*,
A couple weeks ago I wrote this simple tool to measure the round-trip
latency of a shared filesystem.
https://github.com/dvanders/fsping
In our case, the tool is to be run from two clients that mount the same
CephFS.
First, start the server (a.k.a. the ping reflector) on one
Hi,
This sounds familiar: http://tracker.ceph.com/issues/17939
I found that you can get the updated quota on node2 by touching the
base dir. In your case:
touch /shares/share0
-- Dan
On Tue, Mar 14, 2017 at 10:52 AM, yu2xiangyang wrote:
> Dear cephers,
> I met
On Mon, Mar 13, 2017 at 10:35 AM, Florian Haas wrote:
> On Sun, Mar 12, 2017 at 9:07 PM, Laszlo Budai wrote:
>> Hi Florian,
>>
>> thank you for your answer.
>>
>> We have already set the IO scheduler to cfq in order to be able to lower the
>>
On Sat, Mar 11, 2017 at 12:21 PM, wrote:
>
> The next and biggest problem we encountered had to do with the CRC errors on
> the OSD map. On every map update, the OSDs that were not upgraded yet, got
> that CRC error and asked the monitor for a full OSD map instead of
Hi John,
Last week we updated our prod CephFS cluster to 10.2.6 (clients and
server side), and for the first time today we've got an object info
size mismatch:
I found this ticket you created in the tracker, which is why I've
emailed you: http://tracker.ceph.com/issues/18240
Here's the detail
On Mon, Mar 13, 2017 at 1:35 PM, John Spray <jsp...@redhat.com> wrote:
> On Mon, Mar 13, 2017 at 10:28 AM, Dan van der Ster <d...@vanderster.com>
> wrote:
>> Hi John,
>>
>> Last week we updated our prod CephFS cluster to 10.2.6 (clients and
>> server sid
Hi all,
We are trying to outsource the disk replacement process for our ceph
clusters to some non-expert sysadmins.
We could really use a tool that reports if a Ceph OSD *would* or
*would not* be safe to stop, e.g.
# ceph-osd-safe-to-stop osd.X
Yes it would be OK to stop osd.X
(which of course
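To make that concrete, here is a rough, untested sketch of the kind of
check we have in mind. It assumes jq is installed and that ceph pg
ls-by-osd ... -f json returns a list with "pgid" and "state" fields
(field names from memory, so verify on your release); the idea is just
that every PG hosted by the OSD must currently be active+clean:

#!/bin/bash
# ceph-osd-safe-to-stop (sketch): succeed only if all PGs on the OSD are active+clean
osd="$1"    # e.g. osd.X
not_clean=$(ceph pg ls-by-osd "$osd" -f json |
            jq -r '.[] | select(.state != "active+clean") | .pgid' | wc -l)
if [ "$not_clean" -eq 0 ]; then
    echo "Yes it would be OK to stop $osd"
else
    echo "NOT safe to stop $osd: $not_clean PGs are not active+clean"
    exit 1
fi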
r
> recovery is complete, respectively. (the magic that made my reweight script
> efficient compared to the official reweight script)
>
> And I have not used such a method in the past... my cluster is small, so I
> have always just let recovery completely finish instead. I hope y
On Fri, Jul 28, 2017 at 9:39 PM, Alexandre Germain
<germain.alexan...@gmail.com> wrote:
> Hello Dan,
>
> Something like this maybe?
>
> https://github.com/CanonicalLtd/ceph_safe_disk
>
> Cheers,
>
> Alex
>
> 2017-07-28 9:36 GMT-04:00 Dan van der Ster <d.
On Thu, Aug 3, 2017 at 11:42 AM, Peter Maloney
<peter.malo...@brockmann-consult.de> wrote:
> On 08/03/17 11:05, Dan van der Ster wrote:
>
> On Fri, Jul 28, 2017 at 9:42 PM, Peter Maloney
> <peter.malo...@brockmann-consult.de> wrote:
>
> Hello Dan,
>
> Based on
minimal impact. Reading previous
> threads on this topic from the list I've found the ceph-gentle-reweight
> script
> (https://github.com/cernceph/ceph-scripts/blob/master/tools/ceph-gentle-reweight)
> created by Dan van der Ster (Thank you Dan for sharing the script with us!).
>
Hi,
I also noticed this and finally tracked it down:
http://tracker.ceph.com/issues/20972
Cheers, Dan
On Mon, Jul 10, 2017 at 3:58 PM, Florent B wrote:
> Hi,
>
> Since 10.2.8 Jewel update, when ceph-fuse is mounting a file system, it
> returns 255 instead of 0 !
>
> $
On Thu, Jul 13, 2017 at 4:23 PM, Aaron Bassett
wrote:
> Because it was a read error I check SMART stats for that osd's disk and sure
> enough, it had some uncorrected read errors. In order to stop it from causing
> more problems > I stopped the daemon to let ceph
Hi,
Occasionally we want to change the scrub schedule for a pool or whole
cluster, but we want to do this by injecting new settings without
restarting every daemon.
I've noticed that in jewel, changes to scrub_min/max_interval and
deep_scrub_interval do not take immediate effect, presumably
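For concreteness, the kind of injection I mean is simply (values here
are just examples, in seconds):

ceph tell osd.* injectargs '--osd_scrub_min_interval 86400'
ceph tell osd.* injectargs '--osd_scrub_max_interval 604800'
ceph tell osd.* injectargs '--osd_deep_scrub_interval 1209600'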
On Tue, Jul 11, 2017 at 5:40 PM, Sage Weil wrote:
> On Tue, 11 Jul 2017, Haomai Wang wrote:
>> On Tue, Jul 11, 2017 at 11:11 PM, Sage Weil wrote:
>> > On Tue, 11 Jul 2017, Sage Weil wrote:
>> >> Hi all,
>> >>
>> >> Luminous features a new 'service map' that
On Wed, Jul 12, 2017 at 5:51 PM, Abhishek L
wrote:
> On Wed, Jul 12, 2017 at 9:13 PM, Xiaoxi Chen wrote:
>> +However, it also introduced a regression that could cause MDS damage.
>> +Therefore, we do *not* recommend that Jewel users upgrade
Jul 13, 2017, at 10:29 AM, Aaron Bassett
>> > <aaron.bass...@nantomics.com> wrote:
>> >
>> > Ok good to hear, I just kicked one off on the acting primary so I guess
>> > I'll be patient now...
>> >
>> > Thanks,
>> > Aaron
>&g
orry about.
(Btw, we just upgraded our biggest prod clusters to jewel -- that also
went totally smooth!)
-- Dan
> sage
>
>
>>
>>
>>
>> On Mon, Jul 10, 2017 at 3:17 PM, Dan van der Ster <d...@vanderster.com>
>> wrote:
>> > Hi all,
>> >
>&g
On Fri, Jul 14, 2017 at 10:40 PM, Gregory Farnum <gfar...@redhat.com> wrote:
> On Fri, Jul 14, 2017 at 5:41 AM Dan van der Ster <d...@vanderster.com> wrote:
>>
>> Hi,
>>
>> Occasionally we want to change the scrub schedule for a pool or whole
>> clu
On Tue, Jul 18, 2017 at 6:08 AM, Marcus Furlong <furlo...@gmail.com> wrote:
> On 22 March 2017 at 05:51, Dan van der Ster <d...@vanderster.com> wrote:
>> On Wed, Mar 22, 2017 at 8:24 AM, Marcus Furlong <furlo...@gmail.com>
>> wrote:
>>> Hi,
>>&
o be set if we have a cluster running OSDs on
> 10.2.6 and some OSDs on 10.2.9? Or should we wait that all OSDs are on
> 10.2.9?
>
> Monitor nodes are already on 10.2.9.
>
> Best,
> Martin
>
> On Fri, Jul 14, 2017 at 1:16 PM, Dan van der Ster <d...@vanderster.com>
Hi all,
With 10.2.8, ceph will now warn if you haven't yet set sortbitwise.
I just updated a test cluster, saw that warning, then did the necessary
ceph osd set sortbitwise
I noticed a short re-peering which took around 10s on this small
cluster with very little data.
Has anyone done this
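(For anyone else doing this: you can check whether the flag is already
set with something like

ceph osd dump | grep flags

and watch ceph -s during the brief re-peering.)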
Hi Wido,
Quick question about IPv6 clusters; you may have already noticed this.
We have an IPv6 cluster and clients use this as the ceph.conf:
[global]
mon host = cephv6.cern.ch
cephv6 is an alias to our three mons, which are listening on their v6
addrs (ms bind ipv6 = true). But those mon
Hi,
The mons on my test luminous cluster do not start after upgrading
from 12.0.1 to 12.0.2. Here is the backtrace:
0> 2017-04-25 11:06:02.897941 7f467ddd7880 -1 *** Caught signal
(Aborted) **
in thread 7f467ddd7880 thread_name:ceph-mon
ceph version 12.0.2
out(7) << __func__ << " loading creating_pgs e" <<
creating_pgs.last_scan_epoch << dendl;
}
...
Cheers, Dan
On Tue, Apr 25, 2017 at 11:15 AM, Dan van der Ster <d...@vanderster.com> wrote:
> Hi,
>
> The mon's on my test luminous cluster do not start
Created ticket to follow up: http://tracker.ceph.com/issues/19769
On Tue, Apr 25, 2017 at 11:34 AM, Dan van der Ster <d...@vanderster.com> wrote:
> Could this change be the culprit?
>
> commit 973829132bf7206eff6c2cf30dd0aa32fb0ce706
> Author: Sage Weil <s...@redhat.com>
SH weight by 1.0 each time which
> seemed to reduce the extra data movement we were seeing with smaller weight
> increases. Maybe something to try out next time?
>
> Bryan
>
> From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Dan van der
> Ster <d...@
On Thu, May 18, 2017 at 3:11 AM, Christian Balzer wrote:
> On Wed, 17 May 2017 18:02:06 -0700 Ben Hines wrote:
>
>> Well, ceph journals are of course going away with the imminent bluestore.
> Not really, in many senses.
>
But we should expect far fewer writes to pass through the
On Wed, May 17, 2017 at 11:29 AM, Dan van der Ster <d...@vanderster.com> wrote:
> I am currently pricing out some DCS3520's, for OSDs. Word is that the
> price is going up, but I don't have specifics, yet.
>
> I'm curious, does your real usage show that the 3500 series do
I am currently pricing out some DCS3520's, for OSDs. Word is that the
price is going up, but I don't have specifics, yet.
I'm curious, does your real usage show that the 3500 series don't
offer enough endurance?
Here's one of our DCS3700's after 2.5 years of RBD + a bit of S3:
Model Family:
Hi Bryan,
On Fri, Jun 9, 2017 at 1:55 AM, Bryan Stillwell wrote:
> This has come up quite a few times before, but since I was only working with
> RBD before I didn't pay too close attention to the conversation. I'm
> looking
> for the best way to handle existing clusters
On Thu, Jun 15, 2017 at 7:56 PM, Casey Bodley <cbod...@redhat.com> wrote:
>
> On 06/14/2017 05:59 AM, Dan van der Ster wrote:
>>
>> Dear ceph users,
>>
>> Today we had O(100) slow requests which were caused by deep-scrubbing
>> of the metadata log:
>>
osd
> config and restarting.
>
> Casey
>
>
> On 06/19/2017 11:01 AM, Dan van der Ster wrote:
>>
>> On Thu, Jun 15, 2017 at 7:56 PM, Casey Bodley <cbod...@redhat.com> wrote:
>>>
>>> On 06/14/2017 05:59 AM, Dan van der Ster wrote:
>>>>
at osd in order to trim more at a time.
>
>
> On 06/21/2017 09:27 AM, Dan van der Ster wrote:
>>
>> Hi Casey,
>>
>> I managed to trim up all shards except for that big #54. The others
>> all trimmed within a few seconds.
>>
>> But 54 is provin
On Wed, Jun 21, 2017 at 4:16 PM, Peter Maloney
<peter.malo...@brockmann-consult.de> wrote:
> On 06/14/17 11:59, Dan van der Ster wrote:
>> Dear ceph users,
>>
>> Today we had O(100) slow requests which were caused by deep-scrubbing
>> of the metadata log:
>>
On Thu, Jun 22, 2017 at 5:31 PM, Casey Bodley <cbod...@redhat.com> wrote:
>
> On 06/22/2017 10:40 AM, Dan van der Ster wrote:
>>
>> On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley <cbod...@redhat.com> wrote:
>>>
>>> On 06/22/2017 04:00 AM, Dan van der
On Thu, Jun 22, 2017 at 4:25 PM, Casey Bodley <cbod...@redhat.com> wrote:
>
> On 06/22/2017 04:00 AM, Dan van der Ster wrote:
>>
>> I'm now running the three relevant OSDs with that patch. (Recompiled,
>> replaced /usr/lib64/rados-classes/libcls_log.so with the
Hi Sage,
We need named clusters on the client side. RBD or CephFS clients, or
monitoring/admin machines all need to be able to access several clusters.
Internally, each cluster is indeed called "ceph", but the clients use
distinct names to differentiate their configs/keyrings.
Cheers, Dan
On
On Fri, Jun 9, 2017 at 5:58 PM, Vasu Kulkarni wrote:
> On Fri, Jun 9, 2017 at 6:11 AM, Wes Dillingham
> wrote:
>> Similar to Dan's situation we utilize the --cluster name concept for our
>> operations. Primarily for "datamover" nodes which do
Hi Patrick,
We've just discussed this internally and I wanted to share some notes.
First, there are at least three separate efforts in our IT dept to
collect and analyse SMART data -- it's clearly a popular idea and
simple to implement, but this leads to repetition and begs for a
common, good
Dear ceph users,
Today we had O(100) slow requests which were caused by deep-scrubbing
of the metadata log:
2017-06-14 11:07:55.373184 osd.155
[2001:1458:301:24::100:d]:6837/3817268 7387 : cluster [INF] 24.1d
deep-scrub starts
...
2017-06-14 11:22:04.143903 osd.155
On Wed, May 3, 2017 at 10:32 AM, Blair Bethwaite
<blair.bethwa...@gmail.com> wrote:
> On 3 May 2017 at 18:15, Dan van der Ster <d...@vanderster.com> wrote:
>> It looks like el7's tuned natively supports the pmqos interface in
>> plugins/plugin_cpu.py.
>
> Ahha, you
On Wed, May 3, 2017 at 9:13 AM, Blair Bethwaite
wrote:
> We did the latter using the pmqos_static.py, which was previously part of
> the RHEL6 tuned latency-performance profile, but seems to have been dropped
> in RHEL7 (don't yet know why),
It looks like el7's tuned
On Wed, May 3, 2017 at 10:52 AM, Blair Bethwaite
<blair.bethwa...@gmail.com> wrote:
> On 3 May 2017 at 18:38, Dan van der Ster <d...@vanderster.com> wrote:
>> Seems to work for me, or?
>
> Yeah now that I read the code more I see it is opening and
> manipulating /d
On Tue, Jun 27, 2017 at 1:56 PM, Christian Balzer wrote:
> On Tue, 27 Jun 2017 13:24:45 +0200 (CEST) Wido den Hollander wrote:
>
>> > Op 27 juni 2017 om 13:05 schreef Christian Balzer :
>> >
>> >
>> > On Tue, 27 Jun 2017 11:24:54 +0200 (CEST) Wido den Hollander
On Wed, Oct 4, 2017 at 9:08 AM, Piotr Dałek wrote:
> On 17-10-04 08:51 AM, lists wrote:
>>
>> Hi,
>>
>> Yesterday I chowned our /var/lib/ceph ceph, to completely finalize our
>> jewel migration, and noticed something interesting.
>>
>> After I brought back up the OSDs I
On Fri, Oct 6, 2017 at 6:56 PM, Alfredo Deza wrote:
> Hi,
>
> Now that ceph-volume is part of the Luminous release, we've been able
> to provide filestore support for LVM-based OSDs. We are making use of
> LVM's powerful mechanisms to store metadata which allows the process
> to
Hi Thomas,
Yes, we set it to a million.
From our puppet manifest:
# need to increase aio-max-nr to allow many bluestore devs
sysctl { 'fs.aio-max-nr': val => '1048576' }
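If you don't use puppet, the equivalent by hand would be something like
(file name is just an example):

sysctl -w fs.aio-max-nr=1048576
echo 'fs.aio-max-nr = 1048576' > /etc/sysctl.d/90-ceph-aio.conf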
Cheers, Dan
On Aug 30, 2017 9:53 AM, "Thomas Bennett" wrote:
>
> Hi,
>
> I've
Hi Blair,
You can add/remove mons on the fly -- connected clients will learn
about all of the mons as the monmap changes and there won't be any
downtime as long as the quorum is maintained.
There is one catch when it comes to OpenStack, however.
Unfortunately, OpenStack persists the mon IP
On Wed, Sep 13, 2017 at 11:04 AM, Dan van der Ster <d...@vanderster.com> wrote:
> On Wed, Sep 13, 2017 at 10:54 AM, Wido den Hollander <w...@42on.com> wrote:
>>
>>> Op 13 september 2017 om 10:38 schreef Dan van der Ster
>>> <d...@vanderster.com>:
>
On Wed, Sep 13, 2017 at 10:54 AM, Wido den Hollander <w...@42on.com> wrote:
>
>> Op 13 september 2017 om 10:38 schreef Dan van der Ster <d...@vanderster.com>:
>>
>>
>> Hi Blair,
>>
>> You can add/remove mons on the fly -- connected clients will l
Hi,
How big is your cluster and what is your use case?
For us, we'll likely never enable the recent tunables that need to
remap *all* PGs -- it would simply be too disruptive for marginal
benefit.
Cheers, Dan
On Thu, Sep 28, 2017 at 9:21 AM, mj wrote:
> Hi,
>
> We have
Hi,
I see the same with jewel on el7 -- it started in one of the recent point
releases, around 10.2.5 IIRC.
Problem seems to be the same -- daemon is started before the osd is
mounted... then the service waits several seconds before trying again.
Aug 31 15:41:47 ceph-osd: 2017-08-31
ceph-osd@84.service
● │ ├─ceph-osd@89.service
● │ ├─ceph-osd@90.service
● │ ├─ceph-osd@91.service
● │ └─ceph-osd@92.service
● ├─getty.target
...
On Thu, Aug 31, 2017 at 4:57 PM, Dan van der Ster <d...@vanderster.com> wrote:
> Hi,
>
> I see the same with jewel on el7
Hi all,
As we are starting to ramp up our internal rgw service, I am wondering if
someone has already developed some "open source" high-level admin tools for
rgw. On the one hand, we're looking for a web UI for users to create and
see their credentials, quota, usage, and maybe a web bucket browser.
On Tue, Nov 7, 2017 at 12:57 PM, John Spray wrote:
> On Sun, Nov 5, 2017 at 4:19 PM, Brady Deetz wrote:
>> My organization has a production cluster primarily used for cephfs upgraded
>> from jewel to luminous. We would very much like to have snapshots on
On Tue, Nov 7, 2017 at 4:15 PM, John Spray <jsp...@redhat.com> wrote:
> On Tue, Nov 7, 2017 at 3:01 PM, Dan van der Ster <d...@vanderster.com> wrote:
>> On Tue, Nov 7, 2017 at 12:57 PM, John Spray <jsp...@redhat.com> wrote:
>>> On Sun, Nov 5, 2017 at 4:19 PM,
Hi all,
I'm playing with the dashboard module in 12.2.2 (and it's very cool!) but I
noticed that some OSDs do not have metadata, e.g. this page:
http://xxx:7000/osd/perf/74
Has empty metadata. I *am* able to see all the info with `ceph osd metadata 74`.
I noticed in the mgr log we have:
Doh!
The activate command needs the *osd* fsid, not the cluster fsid.
So this works:
ceph-volume lvm activate 0 6608c0cf-3827-4967-94fd-5a3336f604c3
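(In case it helps others: the osd fsid is listed per OSD in the output of

ceph-volume lvm list

so that's the easiest place to copy it from.)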
Is an "activate-all" equivalent planned?
-- Dan
On Tue, Dec 12, 2017 at 11:35 AM, Dan van der Ster <d...@vanderster.com>
Hi all,
Did anyone successfully prepare a new OSD with ceph-volume in 12.2.2?
We are trying the simplest thing possible and not succeeding :(
# ceph-volume lvm prepare --bluestore --data /dev/sdb
# ceph-volume lvm list
== osd.0 ===
[block]
:
> Hi Dan,
>
> We agreed in upstream RGW to make this change. Do you intend to
> submit this as a PR?
>
> regards
>
> Matt
>
> On Fri, May 4, 2018 at 10:57 AM, Dan van der Ster <d...@vanderster.com> wrote:
>> Hi Valery,
>>
>> Did you eventual
On Tue, May 8, 2018 at 7:35 PM, Vasu Kulkarni wrote:
> On Mon, May 7, 2018 at 2:26 PM, Maciej Puzio wrote:
>> I am an admin in a research lab looking for a cluster storage
>> solution, and a newbie to ceph. I have setup a mini toy cluster on
>> some VMs,
Hi Adrian,
Is there a strict reason why you *must* upgrade the tunables?
It is normally OK to run with old (e.g. hammer) tunables on a luminous
cluster. The crush placement won't be state of the art, but that's not
a huge problem.
We have a lot of data in a jewel cluster with hammer tunables.
Hi Scott,
Multi MDS just assigns different parts of the namespace to different
"ranks". Each rank (0, 1, 2, ...) is handled by one of the active
MDSs. (You can query which parts of the namespace are assigned to
each rank using the jq tricks in [1]). If a rank is down and there are
no more
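For example, the gist of those jq tricks is something like this (a sketch;
field names from memory, so double-check on your release):

ceph daemon mds.<name> get subtrees | jq '.[] | [.dir.path, .auth_first]'

which prints each subtree and the rank that is authoritative for it.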
Hi Valery,
Did you eventually find a workaround for this? I *think* we'd also
prefer rgw to fall back to external plugins, rather than checking them
before local. But I never understood the reasoning behind the change
from jewel to luminous.
I saw that there is work towards a cache for ldap [1]
Hi,
It still isn't clear if you're using the fuse or kernel client.
Do you `mount -t ceph` or something else?
-- Dan
On Wed, May 16, 2018 at 8:28 PM Donald "Mac" McCarthy
wrote:
> CephFS. 8 core atom C2758, 16 GB ram, 256GB ssd, 2.5 GB NIC (supermicro
microblade
On Thu, Jun 7, 2018 at 4:31 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 10:23 AM, Dan van der Ster wrote:
> > Hi all,
> >
> > We have an intermittent issue where bluestore osds sometimes fail to
> > start after a reboot.
> > The osds all fail t
On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > Hi all,
> >
> > We have an intermittent issue where bluestore osds sometimes fail to
> > start after a reboot.
> > The osds all fail the same way [see 2], fai
Hi all,
We have an intermittent issue where bluestore osds sometimes fail to
start after a reboot.
The osds all fail the same way [see 2], failing to open the superblock.
On one particular host, there are 24 osds and 4 SSDs partitioned for
the block.db's. The affected non-starting OSDs all have
On Thu, Jun 7, 2018 at 6:09 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
> > >
> > > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> > > >
> > > > On Thu
On Thu, Jun 7, 2018 at 6:01 PM Dan van der Ster wrote:
>
> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
> >
> > On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > >
On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster wrote:
>
> On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
> >
> > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> > > >
> > > > On Thu, 7 Jun
On Thu, Jun 7, 2018 at 5:16 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 10:54 AM, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> >>
> >> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> >> > On Th
On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > Hi all,
> > > >
> > >
On Thu, Jun 7, 2018 at 5:34 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 4:41 PM Sage Weil wrote:
> > >
> > > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > On Thu, Jun 7, 2018 at 4:33 PM Sage Weil
On Thu, Jun 7, 2018 at 6:33 PM Sage Weil wrote:
>
> On Thu, 7 Jun 2018, Dan van der Ster wrote:
> > > > Wait, we found something!!!
> > > >
> > > > In the 1st 4k on the block we found the block.db pointing at the wrong
> > > > device (/dev/sd
On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote:
> > On Thu, 7 Jun 2018, Dan van der Ster wrote:
> >> On Thu, Jun 7, 2018 at 5:36 PM Dan van der Ster
> >> wrote:
> >> >
> >>
On Thu, Jun 7, 2018 at 8:58 PM Alfredo Deza wrote:
>
> On Thu, Jun 7, 2018 at 2:45 PM, Dan van der Ster wrote:
> > On Thu, Jun 7, 2018 at 6:58 PM Alfredo Deza wrote:
> >>
> >> On Thu, Jun 7, 2018 at 12:09 PM, Sage Weil wrote:
> >> > On Thu, 7 Jun 20
8=128.55.xxx.xx:6789/0}
>
> election epoch 4, quorum 0,1 ngfdv076,ngfdv078
>
> osdmap e280: 48 osds: 48 up, 48 in
>
> flags sortbitwise,require_jewel_osds
>
> pgmap v117283: 3136 pgs, 11 pools, 25600 MB data, 510 objects
>
>
See this thread:
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-April/000106.html
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-June/000113.html
(Wido -- should we kill the ceph-large list??)
-- dan
On Wed, Jun 13, 2018 at 12:27 PM Marc Roos wrote:
>
>
> Shit, I added
See this thread:
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-April/000106.html
http://lists.ceph.com/pipermail/ceph-large-ceph.com/2018-June/000113.html
(Wido -- should we kill the ceph-large list??)
On Wed, Jun 13, 2018 at 1:14 PM Marc Roos wrote:
>
>
> I wonder if this is not a
Hi,
One way you can see exactly what is happening when you write an object
is with --debug_ms=1.
For example, I write a 100MB object to a test pool: rados
--debug_ms=1 -p test put 100M.dat 100M.dat
I pasted the output of this here: https://pastebin.com/Zg8rjaTV
In this case, it first gets the
virtio-blk vs
virtio-scsi: the latter has a timeout but blk blocks forever.
On 5000 attached volumes we saw around 12 of these IO errors, and this
was the first time in 5 years of upgrades that an IO error happened...
-- dan
> -Greg
>
>>
>>
>> On 22.06.2018, at 16:16, Dan
Hi Nick,
Our latency probe results (4kB rados bench) didn't change noticeably
after converting a test cluster from FileStore (sata SSD journal) to
BlueStore (sata SSD db). Those 4kB writes take 3-4ms on average from a
random VM in our data centre. (So bluestore DB seems equivalent to
FileStore
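For reference, a latency probe like that can be as simple as (pool name
is just an example):

rados -p test bench 30 write -b 4096 -t 1

i.e. 30 seconds of 4 KiB writes with a single op in flight, watching the
average latency it reports.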
Hi all,
Is anyone getting useful results with your benchmarking? I've prepared
two test machines/pools and don't see any definitive slowdown with
patched kernels from CentOS [1].
I wonder if Ceph will be somewhat tolerant of these patches, similarly
to what's described here:
On Mon, Jan 8, 2018 at 4:37 PM, Alfredo Deza <ad...@redhat.com> wrote:
> On Thu, Dec 21, 2017 at 11:35 AM, Stefan Kooman <ste...@bit.nl> wrote:
>> Quoting Dan van der Ster (d...@vanderster.com):
>>> Thanks Stefan. But isn't there also some vgremove or lvremove magi
Hi all,
We just saw an example of one single down OSD taking down a whole
(small) luminous 12.2.2 cluster.
The cluster has only 5 OSDs, on 5 different servers. Three of those
servers also run a mon/mgr combo.
First, we had one server (mon+osd) go down legitimately [1] -- I can
tell when it went
r the help solving this puzzle,
Dan
On Mon, Jan 22, 2018 at 8:07 PM, Dan van der Ster <d...@vanderster.com> wrote:
> Hi all,
>
> We just saw an example of one single down OSD taking down a whole
> (small) luminous 12.2.2 cluster.
>
> The cluster has only 5 OSDs, on 5 diff
Hi Caspar,
I've been trying the mgr balancer for a couple weeks now and can share
some experience.
Currently there are two modes implemented: upmap and crush-compat.
Upmap requires all clients to be running luminous -- it uses this new
pg-upmap mechanism to precisely move PGs one by one to a
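For reference, turning it on is roughly this sequence (check ceph
balancer status before leaving it running):

ceph mgr module enable balancer
ceph osd set-require-min-compat-client luminous   # required for upmap mode
ceph balancer mode upmap
ceph balancer on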
On Wed, Feb 21, 2018 at 2:24 PM, Alfredo Deza wrote:
> On Tue, Feb 20, 2018 at 9:05 PM, Oliver Freyermuth
> wrote:
>> Many thanks for your replies!
>>
>> Are there plans to have something like
>> "ceph-volume discover-and-activate"
>> which would
On Wed, Feb 21, 2018 at 11:56 PM, Oliver Freyermuth
<freyerm...@physik.uni-bonn.de> wrote:
> Am 21.02.2018 um 15:58 schrieb Alfredo Deza:
>> On Wed, Feb 21, 2018 at 9:40 AM, Dan van der Ster <d...@vanderster.com>
>> wrote:
>>> On Wed, Feb 21, 2018 at 2:24 PM, A
Hi Wido,
We have used a few racks of Wiwynn OCP servers in a Ceph cluster for a
couple of years.
The machines are dual Xeon [1] and use some of those 2U 30-disk "Knox"
enclosures.
Other than that, I have nothing particularly interesting to say about
these. Our data centre procurement team have
Hi,
For someone who is not an lvm expert, does anyone have a recipe for
destroying a ceph-volume lvm osd?
(I have a failed disk which I want to deactivate / wipe before
physically removing from the host, and the tooling for this doesn't
exist yet http://tracker.ceph.com/issues/22287)
>
Hi,
We've used double the defaults for around 6 months now and haven't had any
"Behind on trimming" errors in that time.
mds log max segments = 60
mds log max expiring = 40
Should be simple to try.
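(Those go in the [mds] section of ceph.conf. If you want to try it
without a restart, something like

ceph daemon mds.<name> config set mds_log_max_segments 60
ceph daemon mds.<name> config set mds_log_max_expiring 40

on the active MDS should do it, though I'd still persist it in the conf.)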
-- dan
On Thu, Dec 21, 2017 at 2:32 PM, Stefan Kooman wrote:
> Hi,
>
>
On Thu, Dec 21, 2017 at 3:59 PM, Stefan Kooman <ste...@bit.nl> wrote:
> Quoting Dan van der Ster (d...@vanderster.com):
>> Hi,
>>
>> For someone who is not an lvm expert, does anyone have a recipe for
>> destroying a ceph-volume lvm osd?
>> (I have a failed
On Thu, Jun 21, 2018 at 2:41 PM Kai Wagner wrote:
>
> On 20.06.2018 17:39, Dan van der Ster wrote:
> > And BTW, if you can't make it to this event we're in the early days of
> > planning a dedicated Ceph + OpenStack Days at CERN around May/June
> > 2019.
>
And BTW, if you can't make it to this event we're in the early days of
planning a dedicated Ceph + OpenStack Days at CERN around May/June
2019.
More news on that later...
-- Dan @ CERN
On Tue, Jun 19, 2018 at 10:23 PM Leonardo Vaz wrote:
>
> Hey Cephers,
>
> We will join our friends from
It's here https://ceph-storage.slack.com/ but for some reason the list of
accepted email domains is limited. I have no idea who is maintaining this.
Anyway, the slack is just mirroring #ceph and #ceph-devel on IRC, so it's
better to connect there directly.
Cheers, Dan
On Sat, Jul 28, 2018, 6:59 PM