On 7/25/19 11:55 AM, Konstantin Shalygin wrote:
>> we just recently upgraded our cluster from luminous 12.2.10 to nautilus
>> 14.2.1 and I noticed a massive increase of the space used on the cephfs
>> metadata pool although the used space in the 2 data pools basically did
>> not change. See the
On 7/24/19 10:05 PM, Wido den Hollander wrote:
>
>
> On 7/24/19 9:38 PM, dhils...@performair.com wrote:
>> All;
>>
>> There's been a lot of discussion of various kernel versions on this list
>> lately, so I thought I'd seek some clarification.
>>
>> I prefer to run CentOS, and I prefer to keep
08 AM, Yan, Zheng wrote:
> Check if there is any hang request in 'ceph daemon mds.xxx objecter_requests'
>
> On Tue, Jul 16, 2019 at 11:51 PM Dietmar Rieder
> wrote:
>>
>> On 7/16/19 4:11 PM, Dietmar Rieder wrote:
>>> Hi,
>>>
>>> We are running
On 7/16/19 5:34 PM, Dietmar Rieder wrote:
> On 7/16/19 4:11 PM, Dietmar Rieder wrote:
>> Hi,
>>
>> We are running ceph version 14.1.2 with cephfs only.
>>
>> I just noticed that one of our pgs had scrub errors which I could repair
>>
>> # ceph h
On 7/16/19 4:11 PM, Dietmar Rieder wrote:
> Hi,
>
> We are running ceph version 14.1.2 with cephfs only.
>
> I just noticed that one of our pgs had scrub errors which I could repair
>
> # ceph health detail
> HEALTH_ERR 1 MDSs report slow metadata IOs; 1 MDSs report
Hi,
We are running ceph version 14.1.2 with cephfs only.
I just noticed that one of our pgs had scrub errors which I could repair
# ceph health detail
HEALTH_ERR 1 MDSs report slow metadata IOs; 1 MDSs report slow requests;
1 scrub errors; Possible data damage: 1 pg inconsistent
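For the record, a minimal sketch of the repair steps behind the sentence
above (the pg id 5.1f is a hypothetical placeholder, and the last command is
the objecter_requests check suggested by Yan, Zheng further up):

# ceph health detail
# ceph pg repair 5.1f
# ceph daemon mds.<name> objecter_requests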
+1
Operator's view: a 12-month cycle is definitely better than 9 months. March
seems to be a reasonable compromise.
Best
Dietmar
On 6/6/19 2:31 AM, Linh Vu wrote:
> I think 12 months cycle is much better from the cluster operations
> perspective. I also like March as a release month as well.
>
On 5/8/19 10:52 PM, Gregory Farnum wrote:
> On Wed, May 8, 2019 at 5:33 AM Dietmar Rieder
> wrote:
>>
>> On 5/8/19 1:55 PM, Paul Emmerich wrote:
>>> Nautilus properly accounts metadata usage, so nothing changed it just
>>> shows up correctly now ;)
>>
>
On 5/8/19 1:55 PM, Paul Emmerich wrote:
> Nautilus properly accounts metadata usage, so nothing changed it just
> shows up correctly now ;)
OK, but then I'm not sure I understand why the increase was not sudden
(with the update) but kept growing steadily over days.
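For comparing such numbers before and after an upgrade, the per-pool
accounting can be inspected directly; a minimal sketch (pool names depend on
your setup):

# ceph df detail
# ceph osd df tree

The first shows per-pool usage, the second the per-OSD breakdown.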
~Dietmar
--
Hi,
we just recently upgraded our cluster from luminous 12.2.10 to nautilus
14.2.1 and I noticed a massive increase of the space used on the cephfs
metadata pool although the used space in the 2 data pools basically did
not change. See the attached graph (NOTE: log10 scale on y-axis)
Is there
ow I could start the three OSDs again and the cluster is HEALTHY.
I hope this gets fixed soon, meanwhile one should keep this in mind and
be careful when trying the ceph device monitoring and deleting the
device_health_metrics pool.
Best
Dietmar
On 5/3/19 10:09 PM, Dietmar Rieder wrote:
> HI,
Hi,
I think I just hit the same problem on Nautilus 14.2.1.
I tested the ceph device monitoring, which created a new pool
(device_health_metrics). After looking into the monitoring feature, I
turned it off again and removed the pool. This resulted in 3 OSDs down
which cannot be started again.
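For context, roughly the sequence described above, reconstructed as a sketch
(not necessarily the exact commands used; deleting a pool also requires
mon_allow_pool_delete=true):

# ceph device monitoring on
# ceph device monitoring off
# ceph osd pool rm device_health_metrics device_health_metrics --yes-i-really-really-mean-it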
On 1/23/19 3:05 PM, Alfredo Deza wrote:
> On Wed, Jan 23, 2019 at 8:25 AM Jan Fajerski wrote:
>>
>> On Wed, Jan 23, 2019 at 10:01:05AM +0100, Manuel Lausch wrote:
>>> Hi,
>>>
>>> that's bad news.
>>>
>>> Roughly 5000 OSDs are affected by this issue. It's not really a
>>> solution to
On 12/14/18 1:44 AM, Christian Balzer wrote:
> On Thu, 13 Dec 2018 19:44:30 +0100 Ronny Aasen wrote:
>
>> On 13.12.2018 18:19, Alex Gorbachev wrote:
>>> On Thu, Dec 13, 2018 at 10:48 AM Dietmar Rieder
>>> wrote:
>>>> Hi Cephers,
>>>>
, Matthew Vernon wrote:
> Hi,
>
> On 13/12/2018 15:48, Dietmar Rieder wrote:
>
>> one of our OSD nodes is experiencing a Disk controller problem/failure
>> (frequent resetting), so the OSDs on this controller are flapping
>> (up/down in/out).
>
> Ah, hardware...
>
Hi Cephers,
one of our OSD nodes is experiencing a Disk controller problem/failure
(frequent resetting), so the OSDs on this controller are flapping
(up/down in/out).
I will hopefully get the replacement part soon.
I have some simple questions, what are the best steps to take now before
an
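(For completeness, a sketch of the usual first step in this situation, not
necessarily what was done here: keep the flapping from triggering constant
recovery until the controller is replaced.)

# ceph osd set noout
# ceph osd set norebalance
... replace the controller, bring the OSDs back up ...
# ceph osd unset norebalance
# ceph osd unset noout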
On 11/7/18 11:59 AM, Konstantin Shalygin wrote:
>> I wonder if there is any release announcement for ceph 12.2.9 that I missed.
>> I just found the new packages on download.ceph.com, is this an official
>> release?
>
> This is because 12.2.9 has several bugs. You should avoid using this
>
Hi,
I wonder if there is any release announcement for ceph 12.2.9 that I missed.
I just found the new packages on download.ceph.com, is this an official
release?
~ Dietmar
--
D i e t m a r R i e d e r, Mag.Dr.
Innsbruck Medical University
Biocenter -
On 10/15/18 1:17 PM, jes...@krogh.cc wrote:
>> On 10/15/18 12:41 PM, Dietmar Rieder wrote:
>>> No big difference here.
>>> all CentOS 7.5 official kernel 3.10.0-862.11.6.el7.x86_64
>>
>> ...forgot to mention: all is luminous ceph-12.2.7
>
> Thanks for yo
On 10/15/18 12:41 PM, Dietmar Rieder wrote:
> On 10/15/18 12:02 PM, jes...@krogh.cc wrote:
>>>> On Sun, Oct 14, 2018 at 8:21 PM wrote:
>>>> how many cephfs mounts access the file? Is it possible that some
>>>> program opens that file i
On 10/15/18 12:02 PM, jes...@krogh.cc wrote:
>>> On Sun, Oct 14, 2018 at 8:21 PM wrote:
>>> how many cephfs mounts access the file? Is it possible that some
>>> program opens that file in RW mode (even they just read the file)?
>>
>>
>> The nature of the program is that it is "prepped" by
Try updating to kernel-3.10.0-862.11.6.el7.x86_64.rpm; that should solve the
problem.
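A sketch of what that looks like on CentOS (assuming the package is
available in your configured repos):

# yum install kernel-3.10.0-862.11.6.el7.x86_64
# reboot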
Best
Dietmar
On 28 August 2018 11:50:31 CEST, Marc Roos wrote:
>
>I have an idle test cluster (centos7.5, Linux c04
>3.10.0-862.9.1.el7.x86_64), and a client kernel mount cephfs.
>
>I tested reading a few
On 08/21/2018 02:22 PM, Ilya Dryomov wrote:
> On Tue, Aug 21, 2018 at 9:12 AM Dietmar Rieder
> wrote:
>>
>> On 08/20/2018 05:36 PM, Ilya Dryomov wrote:
>>> On Mon, Aug 20, 2018 at 4:52 PM Dietmar Rieder
>>> wrote:
>>>>
>>>> Hi Cep
On 08/20/2018 05:36 PM, Ilya Dryomov wrote:
> On Mon, Aug 20, 2018 at 4:52 PM Dietmar Rieder
> wrote:
>>
>> Hi Cephers,
>>
>>
>> I wonder if the cephfs client in RedHat/CentOS 7.5 will be updated to
>> luminous?
>> As far as I see there is so
Hi Cephers,
I wonder if the cephfs client in RedHat/CentOS 7.5 will be updated to
luminous?
As far as I can see, there is some luminous-related stuff that was
backported; however, the "ceph features" command just reports "jewel" as
the release of my cephfs
clients running CentOS 7.5 (kernel
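For reference, the check referred to above is simply (a sketch; the MDS name
is a placeholder):

# ceph features
# ceph daemon mds.<name> session ls

'ceph features' groups connected clients by the feature release they
advertise, which is where the "jewel" label comes from; 'session ls' lists
the individual client sessions.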
Hi Caspar,
did you have a chance yet to proceed with switching from crush-compat to
upmap?
If yes, would you mind sharing your experience?
Best
Dietmar
On 07/18/2018 11:07 AM, Caspar Smit wrote:
> Hi Xavier,
>
> Not yet, i got a little anxious in changing anything major in the
> cluster
On 07/19/2018 04:44 AM, Satish Patel wrote:
> If I have 8 OSD drives in a server with a P410i RAID controller (HP), and I
> want to use this server as an OSD node, how should I
> configure the RAID?
>
> 1. Put all drives in RAID-0?
> 2. Put individual HDD in RAID-0 and create 8 individual RAID-0
+1 for supporting both!
Disclosure: Prometheus user
Dietmar
On 05/07/2018 04:53 PM, Reed Dier wrote:
> I’ll +1 on InfluxDB rather than Prometheus, though I think having a version
> for each infrastructure path would be best.
> I’m sure plenty here have existing InfluxDB infrastructure as their
Hi Wido,
thanks for the tool. Here are some stats from our cluster:
Ceph 12.2.4, 240 OSDs, CephFS only
        onodes   db_used_bytes   avg_obj_size   overhead_per_obj
Mean    214871   1574830080      2082298        7607
Max     309855   3018850304      3349799        17753
On 04/04/2018 08:58 PM, Robert Stanford wrote:
>
> I read a couple of versions ago that ceph-deploy was not recommended
> for production clusters. Why was that? Is this still the case? We
> have a lot of problems automating deployment without ceph-deploy.
>
We are using it in production on
a while. '' in this
> case is ceph-osd of course.
>
> Alternatively, if you can upload a coredump and an sosreport (so I can
> validate exact versions of all packages installed) I can try and take
> a look.
>
> On Fri, Mar 23, 2018 at 9:20 PM, Dietmar Rieder
> <dietmar.ri
Cheers,
> Oliver
>
> On 08.03.2018 at 15:00, Dietmar Rieder wrote:
>> Hi,
>>
>> I noticed in my client (using cephfs) logs that an osd was unexpectedly
>> going down.
>> While checking the osd logs for the affected OSD I found that the osd
>> was seg faul
On 03/14/2018 01:48 PM, Lars Marowsky-Bree wrote:
> On 2018-02-28T02:38:34, Patrick Donnelly wrote:
>
>> I think it will be necessary to reduce the actives to 1 (max_mds -> 1;
>> deactivate other ranks), shutdown standbys, upgrade the single active,
>> then upgrade/start the
Hi,
See:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-February/025092.html
Might be of interest.
Dietmar
On 12 March 2018 18:19:51 CET, Reed Dier wrote:
>Figured I would see if anyone has seen this or can see something I am
>doing wrong.
>
>Upgrading all
anyone capture a core file? Please feel free to open a tracker on this.
I've no core file available, it was not dumped, and so far I've noticed just
that single segfault.
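In case it helps others hitting this: a hedged sketch of enabling core dumps
for the OSD daemons on CentOS 7 with systemd (where the dump ends up also
depends on kernel.core_pattern/abrt):

# mkdir -p /etc/systemd/system/ceph-osd@.service.d
# cat > /etc/systemd/system/ceph-osd@.service.d/coredump.conf <<EOF
[Service]
LimitCORE=infinity
EOF
# systemctl daemon-reload
# systemctl restart ceph-osd@<id>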
Dietmar
>
>>
>>
>> Thanks
>>
>> Subhachandra
>>
>>
>>
>> On Thu, Mar 8, 2018
Hi,
I noticed in my client (using cephfs) logs that an osd was unexpectedly
going down.
While checking the osd logs for the affected OSD I found that the osd
was seg faulting:
[]
2018-03-07 06:01:28.873049 7fd9af370700 -1 *** Caught signal
(Segmentation fault) **
in thread 7fd9af370700
Thanks for making this clear.
Dietmar
On 02/27/2018 05:29 PM, Alfredo Deza wrote:
> On Tue, Feb 27, 2018 at 11:13 AM, Dietmar Rieder
> <dietmar.rie...@i-med.ac.at> wrote:
>> ... however, it would be nice if ceph-volume would also create the
>> partitions for the
... however, it would be nice if ceph-volume would also create the
partitions for the WAL and/or DB if needed. Is there a special reason
why this is not implemented?
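For illustration, the manual step being asked about looks roughly like this
today (a sketch; sizes and device names are placeholders):

# sgdisk -n 0:0:+1G /dev/nvme0n1    # WAL partition
# sgdisk -n 0:0:+30G /dev/nvme0n1   # DB partition
# ceph-volume lvm create --bluestore --data /dev/sdb \
    --block.wal /dev/nvme0n1p1 --block.db /dev/nvme0n1p2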
Dietmar
On 02/27/2018 04:25 PM, David Turner wrote:
> Gotcha. As a side note, that setting is only used by ceph-disk as
>
Anyone?
On 9 February 2018 09:59:54 CET, Dietmar Rieder
<dietmar.rie...@i-med.ac.at> wrote:
>Hi,
>
>we are running ceph version 12.2.2 (10 nodes, 240 OSDs, 3 mon). While
>monitoring the WAL db used bytes we noticed that there are some OSDs
>that use proportionally more WAL
Hi,
we are running ceph version 12.2.2 (10 nodes, 240 OSDs, 3 mon). While
monitoring the WAL db used bytes we noticed that there are some OSDs
that use proportionally more WAL db bytes than others (800MB vs 300MB).
These OSDs eventually exceed the WAL db size (1GB in our case) and spill
over to
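For reference, the per-OSD WAL/DB usage can be read from the OSD admin
socket; a sketch (the osd id is a placeholder):

# ceph daemon osd.0 perf dump

The bluefs section contains db_total_bytes/db_used_bytes and
wal_total_bytes/wal_used_bytes.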
Hi,
just for the record:
A reboot of the OSD node solved the issue; now the WAL is fully purged
and the extra 790MB are gone.
Sorry for the noise.
Dietmar
On 01/27/2018 11:08 AM, Dietmar Rieder wrote:
> Hi,
>
> replying to my own message.
>
> After I restarted the OS
flushed.
Is it somehow possible to reinitialize the WAL for the OSD in question?
Thanks
Dietmar
On 01/26/2018 05:11 PM, Dietmar Rieder wrote:
> Hi all,
>
> I've a question regarding bluestore wal.db:
>
>
> We are running a 10 OSD node + 3 MON/MDS node cluster (luminous 12.2.2
Hi all,
I've a question regarding bluestore wal.db:
We are running a 10 OSD node + 3 MON/MDS node cluster (luminous 12.2.2).
Each OSD node has 22xHDD (8TB) OSDs, 2xSSD (1.6TB) OSDs and 2xNVME (800
GB) for bluestore wal and db.
We have separated wal and db partitions
wal partitions are 1GB
db
Hi,
I finally found a working way to replace the failed OSD. Everything looks
fine again.
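For the archive, a sketch of one common removal sequence on a
Luminous/ceph-volume cluster (not necessarily the exact steps taken here;
osd id and device are placeholders):

# ceph osd out 12
# systemctl stop ceph-osd@12
# ceph osd purge 12 --yes-i-really-mean-it
# ceph-volume lvm zap /dev/sdX

followed by recreating the OSD on the new disk with ceph-volume lvm create.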
Thanks again for your comments and suggestions.
Dietmar
On 01/12/2018 04:08 PM, Dietmar Rieder wrote:
> Hi,
>
> can someone, comment/confirm my planned OSD replacement procedure?
>
> I
Hi,
can someone comment on/confirm my planned OSD replacement procedure?
It would be very helpful for me.
Dietmar
On 11 January 2018 17:47:50 CET, Dietmar Rieder
<dietmar.rie...@i-med.ac.at> wrote:
>Hi Alfredo,
>
>thanks for your coments, see my answers inline.
>
>O
Hi Konstantin,
thanks for your answer, see my answer to Alfredo which includes your
suggestions.
~Dietmar
On 01/11/2018 12:57 PM, Konstantin Shalygin wrote:
>> Now I wonder what is the correct way to replace a failed OSD block disk?
>
> Generic way for maintenance (e.g. disk replace) is
Hi Alfredo,
thanks for your comments, see my answers inline.
On 01/11/2018 01:47 PM, Alfredo Deza wrote:
> On Thu, Jan 11, 2018 at 4:30 AM, Dietmar Rieder
> <dietmar.rie...@i-med.ac.at> wrote:
>> Hello,
>>
>> we have failed OSD disk in our Luminous v12.2.2 cluster
Hello,
we have a failed OSD disk in our Luminous v12.2.2 cluster that needs to
be replaced.
The cluster was initially deployed using ceph-deploy on Luminous
v12.2.0. The OSDs were created using
ceph-deploy osd create --bluestore cephosd-${osd}:/dev/sd${disk}
--block-wal /dev/nvme0n1 --block-db
On 12/01/2017 01:45 PM, Alfredo Deza wrote:
> On Fri, Dec 1, 2017 at 3:28 AM, Stefan Kooman wrote:
>> Quoting Fabian Grünbichler (f.gruenbich...@proxmox.com):
>>> I think the above roadmap is a good compromise for all involved parties,
>>> and I hope we can use the remainder of
ver (or if it does that you're ok with the
> degraded performance while the db partition is full). I haven't come
> across an equation to judge what size should be used for either
> partition yet.
>
> On Mon, Sep 25, 2017 at 10:53 AM Dietmar Rieder
> <dietmar.rie...@i-med.a
me. Is the statement below correct?
>>>>
>>>> "The BlueStore journal will always be placed on the fastest device
>>>> available, so using a DB device will provide the same benefit that the
>>>> WAL device would while also allowing additional m
Hmm...
not sure what happens if you lose 2 disks in 2 different rooms, isn't
there a risk that you lose data?
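To make the concern concrete: whether that can lose data depends on the
crush-failure-domain of the profile. A sketch of a profile that spreads
chunks across rooms (names and k/m are illustrative; this placement needs at
least k+m rooms):

# ceph osd erasure-code-profile set ec-room k=4 m=2 crush-failure-domain=room
# ceph osd pool create ecpool 1024 1024 erasure ec-room

With failure domain 'room' and m=2, losing disks in two different rooms (or
even two whole rooms) still leaves enough chunks to recover.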
Dietmar
On 09/22/2017 10:39 AM, Luis Periquito wrote:
> Hi all,
>
> I've been trying to think what will be the best erasure code profile,
> but I don't really like the one I
On 09/21/2017 05:03 PM, Mark Nelson wrote:
>
>
> On 09/21/2017 03:17 AM, Dietmar Rieder wrote:
>> On 09/21/2017 09:45 AM, Maged Mokhtar wrote:
>>> On 2017-09-21 07:56, Lazuardi Nasution wrote:
>>>
>>>> Hi,
>>>>
>>>> I
On 09/21/2017 09:45 AM, Maged Mokhtar wrote:
> On 2017-09-21 07:56, Lazuardi Nasution wrote:
>
>> Hi,
>>
>> I'm still looking for the answer of these questions. Maybe someone can
>> share their thought on these. Any comment will be helpful too.
>>
>> Best regards,
>>
>> On Sat, Sep 16, 2017
On 02/16/2017 09:47 AM, John Spray wrote:
> On Thu, Feb 16, 2017 at 8:37 AM, Dietmar Rieder
> <dietmar.rie...@i-med.ac.at> wrote:
>> Hi,
>>
>> On 12/13/2016 12:35 PM, John Spray wrote:
>>> On Tue, Dec 13, 2016 at 7:35 AM, Dietmar Rieder
>>
Hi,
On 12/13/2016 12:35 PM, John Spray wrote:
> On Tue, Dec 13, 2016 at 7:35 AM, Dietmar Rieder
> <dietmar.rie...@i-med.ac.at> wrote:
>> Hi,
>>
>> this is good news! Thanks.
>>
>> As far as I see the RBD supports (experimentally) now EC d
Hi John,
Thanks for your answer.
The mentioned modification of the pool validation would then allow for
CephFS having the data pools on EC while keeping the metadata on a
replicated pool, right?
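As a sketch of what that would look like once allowed (pool names, pg counts
and the profile are placeholders; on BlueStore the EC data pool also needs
allow_ec_overwrites):

# ceph osd pool create cephfs_data_ec 1024 1024 erasure myprofile
# ceph osd pool set cephfs_data_ec allow_ec_overwrites true
# ceph fs add_data_pool cephfs cephfs_data_ec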
Dietmar
On 12/13/2016 12:35 PM, John Spray wrote:
> On Tue, Dec 13, 2016 at 7:35 AM, Dietmar Rie
Hi,
this is good news! Thanks.
As far as I can see, RBD now (experimentally) supports EC data pools. Is
this also true for CephFS? It is not stated in the announcement, so I wonder
if and when EC pools are planned to be supported by CephFS.
~regards
Dietmar
On 12/13/2016 03:28 AM, Abhishek L
On 10/24/2016 03:10 AM, Christian Balzer wrote:
[...]
> There are several items here and I very much would welcome a response from
> a Ceph/RH representative.
>
> 1. Is that deprecation only in regards to RHCS, as Nick seems to hope?
> Because I very much doubt that, why develop code you just
Hello,
On 05/19/2016 03:36 AM, Christian Balzer wrote:
>
> Hello again,
>
> On Wed, 18 May 2016 15:32:50 +0200 Dietmar Rieder wrote:
>
>> Hello Christian,
>>
>>> Hello,
>>>
>>> On Wed, 18 May 2016 13:57:59 +0200 Dietmar Rieder wrot
Hello Christian,
> Hello,
>
> On Wed, 18 May 2016 13:57:59 +0200 Dietmar Rieder wrote:
>
>> Dear Ceph users,
>>
>> I've a question regarding the memory recommendations for an OSD node.
>>
>> The official Ceph hardware recommendations say that an O
Dear Ceph users,
I've a question regarding the memory recommendations for an OSD node.
The official Ceph hardware recommendations say that an OSD node should
have 1GB of RAM per TB of OSD capacity [1].
The "Reference Architecture" whitpaper from Red Hat & Supermicro says
that "typically" 2GB of memory per OSD on
Dear ceph users,
I'm in the very initial phase of planning a ceph cluster an have a
question regarding the RAM recommendation for an MDS.
According to the ceph docs the minimum amount of RAM should be "1 GB
minimum per daemon". Is this per OSD in the cluster or per MDS in the
cluster?
I plan