Just to update, this is still an issue as of the latest Git commit (
64bcf92e87f9fbb3045de49b7deb53aca1989123).
On Fri, Nov 11, 2016 at 1:31 PM, bobobo1...@gmail.com
wrote:
> Here's another: http://termbin.com/smnm
>
> On Fri, Nov 11, 2016 at 1:28 PM, Sage Weil
Hello Bruno,
I don't understand your outputs.
The first 'ceph -s' says one mon is down, but your 'ceph health detail'
does not report it further.
In your crush map I count 7 osds (0,1,2,3,4,6,7), but ceph -s says only 6 are
active.
Can you send the output of 'ceph osd tree' and 'ceph osd df'?
Hi, thanks.
# begin crush map
tunable choose_local_tries 0
tunable choose_local_fallback_tries 0
tunable choose_total_tries 50
tunable chooseleaf_descend_once 1
tunable straw_calc_version 1
# devices
device 0 osd.0
device 1 osd.1
device 2 osd.2
device 3 osd.3
device 4 osd.4
device 5 device5
Many reasons:
1) You will eventually get a DC-wide power event anyway, at which point
probably most of the OSDs will have hopelessly corrupted internal XFS
structures (yes, I have seen this happen to a poor soul in a DC with
redundant power).
2) Even in the case of a single rack/node power
Yes, because these things happen
http://www.theregister.co.uk/2016/11/15/memset_power_cut_service_interruption/
We had customers who had kit in this DC.
To use your analogy, it's like crossing the road at traffic lights but
not checking that cars have stopped. You might be OK 99% of the time, but
This is like your mother telling you not to cross the road when you were 4
years of age, but not telling you it was because you could be flattened
by a car :)
Can you expand on your answer? If you are in a DC with AB power,
redundant UPS, dual feed from the electric company, onsite generators,
dual
On Sat, Nov 19, 2016 at 6:59 AM, Brad Hubbard wrote:
> +ceph-devel
>
> On Fri, Nov 18, 2016 at 8:45 PM, Nick Fisk wrote:
>> Hi All,
>>
>> I want to submit a PR to include a fix for this tracker bug, as I have just
>> realised I've been experiencing it.
>>
>>
+ceph-devel
On Fri, Nov 18, 2016 at 8:45 PM, Nick Fisk wrote:
> Hi All,
>
> I want to submit a PR to include a fix for this tracker bug, as I have just
> realised I've been experiencing it.
>
> http://tracker.ceph.com/issues/9860
>
> I understand that I would also need to update
On Fri, Nov 18, 2016 at 1:14 PM, Jeffrey McDonald wrote:
> Hi,
>
> MSI has an erasure coded ceph pool accessible by the radosgw interface.
> We recently upgraded to Jewel from Hammer. Several days ago, we
> experienced issues with a couple of the rados gateway servers and
>
Hi,
MSI has an erasure coded ceph pool accessible by the radosgw interface.
We recently upgraded to Jewel from Hammer. Several days ago, we
experienced issues with a couple of the rados gateway servers and
inadvertently deployed older Hammer versions of the radosgw instances.
This
Hi Nick and other Cephers,
Thanks for your reply.
> 2) Config Errors
> This can be an easy one to say you are safe from. But I would say most
> outages and data loss incidents I have seen on the mailing lists have
> been due to poor hardware choice or configuring options such as size=2,
> min_size=1
Hi,
On 15/11/16 11:55, Craig Chi wrote:
> You can try to manually fix this by adding the
> /lib/systemd/system/ceph-mon.target file, which contains:
> and then execute the following command to tell systemd to start this
> target on bootup
> systemctl enable ceph-mon.target
This worked a
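For reference, a ceph-mon.target along these lines should work; a sketch, assuming systemd and the standard Ceph packaging layout:

```ini
# /lib/systemd/system/ceph-mon.target
[Unit]
Description=ceph target allowing to start/stop all ceph-mon@.service instances at once
PartOf=ceph.target

[Install]
WantedBy=multi-user.target ceph.target
```

After creating the file, `systemctl enable ceph-mon.target` as noted above.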
We use the 800GB version as journal devices with up to a 1:18 ratio and have
had good experiences, with no bottleneck on the journal side. These also feature
good endurance characteristics. I would think that higher capacities are hard to
justify as journals
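The 1:18 ratio above is easy to sanity-check with the standard rule of thumb from the Ceph docs (journal size >= 2 x expected throughput x filestore max sync interval); the drive size and ratio are taken from this thread, the rest is illustrative arithmetic:

```python
# Rough journal sizing math for an 800GB device shared by 18 OSD journals.
drive_gb = 800            # Intel P3700 800GB used as the shared journal device
osds_per_journal = 18     # the 1:18 ratio mentioned above

per_journal_gb = drive_gb / osds_per_journal
print(round(per_journal_gb, 1))   # space available per journal partition

# With the default filestore_max_sync_interval of 5 s, a 10 GB journal
# already covers 10 / (2 * 5) = 1 GB/s of sustained writes per OSD:
sync_interval_s = 5
journal_gb = 10
max_throughput_gb_s = journal_gb / (2 * sync_interval_s)
print(max_throughput_gb_s)
```

So even generously sized journal partitions are unlikely to be the bottleneck at this ratio, which matches the experience reported above.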
I often read that small write IO and RBD work better with a bigger
filestore_max_sync_interval than the default value.
The default is 5 sec and I have seen many posts saying they use 30 sec.
The slow request symptom is also often linked to this parameter.
My journals are 10GB ( collocated with
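A minimal ceph.conf sketch of the change being discussed; the 30 s value is the one commonly reported on the list, not a recommendation:

```ini
[osd]
# default is 5 s; many posts report using 30 s for small-write/RBD workloads
filestore max sync interval = 30
# note: the journal must be able to absorb roughly
# 2 * interval * sustained throughput, so check journal size before raising this
```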
- On Nov 4, 2016, at 21:17, Andrey Ptashnik wrote:
> Hello Ceph team!
> I’m trying to create different pools in Ceph in order to have different tiers
> (some are fast, small and expensive and others are plain big and cheap), so
> certain users will be tied to one pool
- On Nov 3, 2016, at 5:18, Thomas wrote:
> Hi guys,
Hi Thomas,
This is a question I also asked myself ...
Maybe something like :
radosgw-admin zonegroup get
radosgw-admin zone get
And for each user :
radosgw-admin metadata get user:uid
Anyone ?
Stephane.
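Stephane's per-user suggestion can be scripted; a sketch, assuming `radosgw-admin metadata list user` emits a JSON array of uids and that `jq` is available:

```shell
# Dump zonegroup and zone configuration
radosgw-admin zonegroup get
radosgw-admin zone get

# Dump metadata for every user
for uid in $(radosgw-admin metadata list user | jq -r '.[]'); do
    radosgw-admin metadata get "user:${uid}"
done
```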
>
Hey Cephers,
Due to DreamHost shutting down the old DreamCompute cluster in their
US-East 1 region, we are beginning the migration of Ceph
infrastructure. We will need to move download.ceph.com,
tracker.ceph.com, and docs.ceph.com to their US-East 2 region.
The current plan is
> I was wondering how exactly you accomplish that?
> Can you do this with a "ceph-deploy create" with "noin" or "noup" flags
> set, or does one need to follow the manual steps of adding an osd?
You can do it either way (manual or with ceph-deploy). Here are the
steps using ceph-deploy:
1. Add
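The steps above (truncated here) might look roughly like this; a sketch, assuming a ceph-deploy workflow, with host and device names as examples:

```shell
# Keep new OSDs from being marked "in" (and taking data) as soon as they boot
ceph osd set noin

# Create the OSD on the new disk (example host/device)
ceph-deploy osd create node1:sdb

# ...verify the OSD registered, adjust crush weight/location if needed...
ceph osd tree

# Allow it to be marked "in" when you are ready for backfill to start
ceph osd unset noin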
Yes Nick, you're right, I can now see on page 16 here
www.intel.com/content/www/xa/en/solid-state-drives/ssd-dc-p3700-spec.html
there is a difference in durability.
However, I think 7.3 PBW isn't much worse than the Intel S3610, which is
much slower. thx will
400GB: 7.3 PBW
800GB: 14.6 PBW (10 drive
I'm using the 400GB models as a journal for 12 drives. I know this is probably
pushing it a little bit, but it seems to work fine. I'm
guessing the reason may relate to the TBW figure being higher on the more
expensive models; maybe they don't want to have to
replace worn NVMe's on warranty?
Hi Corin. We run latest Hammer on CentOS 7.2 with 3 mons and have not
seen this problem. I'm not sure if there are any other possible
differences between the healthy nodes and the one with excessive
memory consumption? thx will
On Fri, Nov 18, 2016 at 6:35 PM, Corin Langosch
Thanks Yehuda and Brian. I'm not sure if you have ever seen this error
with radosgw (latest Hammer, CentOS 7), or can advise whether this is a
critical error? Appreciate any hints here. thx will
2016-11-12 13:49:08.905114 7fbba7fff700 20 RGWUserStatsCache: sync
user=myuserid1
2016-11-12
We've had this for a while. We just monitor memory usage and restart the mon
services when 1 or more reach 80%.
Sent from my iPhone
> On Nov 18, 2016, at 3:35 AM, Corin Langosch
> wrote:
>
> Hi,
>
> about 2 weeks ago I upgraded a rather small cluster from ceph
Hi list, I wonder if there is anyone who has experience with Intel
P3700 SSD drives as journals and can share it?
I was thinking of using the 400GB P3700 as a journal in my Ceph
deployment. It is benchmarked on Sébastien Han's SSD page as well.
However a vendor I spoke to didn't
On Fri, Nov 18, 2016 at 1:04 PM, Iain Buclaw wrote:
> On 18 November 2016 at 13:14, John Spray wrote:
>> On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
>>> Hi,
>>>
>>> Follow up from the suggestion to use any of the following
On 18 November 2016 at 13:14, John Spray wrote:
> On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
>> Hi,
>>
>> Follow up from the suggestion to use any of the following options:
>>
>> - client_mount_timeout
>> - rados_mon_op_timeout
>> -
On Fri, Nov 18, 2016 at 11:53 AM, Iain Buclaw wrote:
> Hi,
>
> Follow up from the suggestion to use any of the following options:
>
> - client_mount_timeout
> - rados_mon_op_timeout
> - rados_osd_op_timeout
>
> To mitigate the time spent blocked waiting on requests. Is there
>
Hi,
Follow up from the suggestion to use any of the following options:
- client_mount_timeout
- rados_mon_op_timeout
- rados_osd_op_timeout
To mitigate the time spent blocked waiting on requests. Is there
really no other way around this?
If two OSDs go down that between them have the both
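For reference, the options listed above can be set in the client section of ceph.conf; the values here are illustrative (the two rados op timeouts default to 0, i.e. wait forever):

```ini
[client]
# seconds to wait for the initial mount/connect
client mount timeout  = 300
# seconds before giving up on a mon or osd operation (default 0 = no timeout)
rados mon op timeout  = 30
rados osd op timeout  = 30
```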
Hi Nick,
Here are some logs. The system is in the IST TZ and I have filtered the logs to
only the last 2 hours, during which we can observe the issue.
In that particular case, issue is illustrated with the following OSDs
Primary:
ID:607
PID:2962227
HOST:10.137.81.18
Secondary1
ID:528
PID:3721728
Hi Sam,
Updated with some more info.
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Samuel Just
> Sent: 17 November 2016 19:02
> To: Nick Fisk
> Cc: Ceph Users
> Subject: Re: [ceph-users]
Hi All,
I want to submit a PR to include a fix for this tracker bug, as I have just
realised I've been experiencing it.
http://tracker.ceph.com/issues/9860
I understand that I would also need to update the debian/ceph-osd.* to get the
file copied, however I'm not quite sure where this
new file
Hi,
We have support for offline bucket resharding admin command:
https://github.com/ceph/ceph/pull/11230.
It will be available in Jewel 10.2.5.
Orit
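For reference, a sketch of the offline reshard workflow added by that PR; the bucket name and shard count here are examples:

```shell
# Reshard the bucket index offline (requires Jewel 10.2.5 or later)
radosgw-admin bucket reshard --bucket=mybucket --num-shards=64

# Verify the new shard count afterwards
radosgw-admin bucket stats --bucket=mybucket
```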
On Thu, Nov 17, 2016 at 9:11 PM, Yoann Moulin wrote:
> Hello,
>
> is it possible to shard the index of existing buckets
Hi,
about 2 weeks ago I upgraded a rather small cluster from ceph 0.94.2 to 0.94.9. The upgrade went fine, the cluster is
running stable. But I just noticed that one monitor is already eating 20 GB of memory, growing slowly over time. The
other 2 mons look fine. The disk space used by the
Thanks! I solved it with the ceph-osd command.
So... there is no script to install Upstart, is there?
Jae
On Fri, Nov 18, 2016 at 3:26 PM 钟佳佳 wrote:
> if you built from the git repo tag v10.2.3,
> refer to the links below from ceph.com
>