On 12/06/2019 22.33, Yan, Zheng wrote:
> I have tracked down the bug. Thank you for reporting this. 'echo 2 >
> /proc/sys/vm/drop_caches' should fix the hang. If you can compile ceph
> from source, please try the following patch.
>
> diff --git a/src/mds/Locker.cc b/src/mds/Locker.cc
> index
Hi
How can we enable bluestore_default_buffered_write using the ceph-conf utility?
Any pointers would be appreciated.
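(Not from the thread; a minimal sketch of the usual ways to set a BlueStore option. The option name comes from the question above; everything else is an assumption, untested:)
# in /etc/ceph/ceph.conf on the OSD hosts, then restart the OSDs:
[osd]
bluestore_default_buffered_write = true
# or at runtime (Mimic/Nautilus) via the mon config store:
ceph config set osd bluestore_default_buffered_write true
# ceph-conf itself only reads config files; to check what an OSD would parse:
ceph-conf --name osd.0 --lookup bluestore_default_buffered_write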
I'm following the bluestore config reference guide and trying to change
the value for osd_memory_target. I added the following entry in the
/etc/ceph/ceph.conf file:
[osd]
osd_memory_target = 2147483648
and restarted the osd daemons doing "systemctl restart ceph-osd.target".
Now, how do
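(Not from the thread; a sketch of how to verify the running value, assuming an osd.0 on the local host with its admin socket available:)
# ask the running daemon directly:
ceph daemon osd.0 config get osd_memory_target
# or, on Mimic/Nautilus, via the monitors:
ceph config show osd.0 osd_memory_target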
On Wed, 12 Jun 2019, Sage Weil wrote:
> On Thu, 13 Jun 2019, Simon Leinen wrote:
> > Sage Weil writes:
> > >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column
> > >> families: [default]
> > >> Unrecognized command: stats
> > >> ceph-kvstore-tool:
[Sorry for the piecemeal information... it's getting late here]
> Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the
> overall size of that directory increased(!) from 3.9GB to 12GB. The
> compaction seems to have eaten two .log files, but created many more
> .sst files.
...and
On Thu, 13 Jun 2019, Simon Leinen wrote:
> Sage Weil writes:
> >> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families:
> >> [default]
> >> Unrecognized command: stats
> >> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356:
> >>
Simon Leinen writes:
> Sage Weil writes:
>> Try 'compact' instead of 'stats'?
> That ran for a while and then crashed, also in the destructor for
> rocksdb::Version, but with an otherwise different backtrace. [...]
Oops, I forgot: Before it crashed, it did modify /mnt/ceph/db; the
overall size
Sage Weil writes:
>> 2019-06-12 23:40:43.555 7f724b27f0c0 1 rocksdb: do_open column families:
>> [default]
>> Unrecognized command: stats
>> ceph-kvstore-tool: /build/ceph-14.2.1/src/rocksdb/db/version_set.cc:356:
>> rocksdb::Version::~Version(): Assertion `path_id <
>>
On Wed, 12 Jun 2019, Simon Leinen wrote:
> We hope that we can get some access to S3 bucket indexes back, possibly
> by somehow dropping and re-creating those indexes.
Are all 3 OSDs crashing in the same way?
My guess is that the reshard process triggered some massive rocksdb
transaction that
On Wed, 12 Jun 2019, Simon Leinen wrote:
> Sage Weil writes:
> > What happens if you do
>
> > ceph-kvstore-tool rocksdb /mnt/ceph/db stats
>
> (I'm afraid that our ceph-kvstore-tool doesn't know about a "stats"
> command; but it still tries to open the database.)
>
> That aborts after
Sage Weil writes:
> What happens if you do
> ceph-kvstore-tool rocksdb /mnt/ceph/db stats
(I'm afraid that our ceph-kvstore-tool doesn't know about a "stats"
command; but it still tries to open the database.)
That aborts after complaining about many missing files in /mnt/ceph/db.
When I ( cd
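(For reference, a sketch of ceph-kvstore-tool invocations that should exist in 14.2.x, assuming /mnt/ceph/db holds the exported RocksDB files as above:)
ceph-kvstore-tool rocksdb /mnt/ceph/db list      # dump the keys
ceph-kvstore-tool rocksdb /mnt/ceph/db compact   # force a compaction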
On Wed, 12 Jun 2019, Simon Leinen wrote:
> Dear Sage,
>
> > Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm
> > pretty sure it'll crash in the same spot, but just want to confirm
> > it's a bluefs issue.
>
> To my surprise, this actually seems to have worked:
>
> $ time
Dear Sage,
> Also, can you try ceph-bluestore-tool bluefs-export on this osd? I'm
> pretty sure it'll crash in the same spot, but just want to confirm
> it's a bluefs issue.
To my surprise, this actually seems to have worked:
$ time sudo ceph-bluestore-tool --out-dir /mnt/ceph bluefs-export
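(The command above is cut off; a hedged reconstruction of the usual invocation, where the OSD data path is an assumption:)
time sudo ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-$OSD_ID \
    --out-dir /mnt/ceph bluefs-export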
On Wed, 12 Jun 2019, Harald Staub wrote:
> On 12.06.19 17:40, Sage Weil wrote:
> > On Wed, 12 Jun 2019, Harald Staub wrote:
> > > Also opened an issue about the rocksdb problem:
> > > https://tracker.ceph.com/issues/40300
> >
> > Thanks!
> >
> > The 'rocksdb: Corruption: file is too short' is the
All;
I'm testing and evaluating Ceph for the next generation of storage architecture
for our company, and so far I'm fairly impressed, but I've got a couple of
questions around cluster replication and disaster recovery.
First; intended uses.
Ceph Object Gateway will be used to support new
Correct, it was pre-jewel. I believe we toyed with multisite replication back
then so it may have gotten baked into the zonegroup inadvertently. Thanks for
the info!
> On Jun 12, 2019, at 11:08 AM, Casey Bodley wrote:
>
> Hi Aaron,
>
> The data_log objects are storing logs for multisite
Hi,
Could someone point me to a blog or documentation page which helps
me resolve the issues noted below?
All nodes are Luminous, 12.2.12; one realm, one zonegroup (clustered haproxies
fronting), two zones (three rgw in each); all endpoint references to each zone
go through an haproxy.
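(Not from the thread; a sketch of the usual first diagnostic for multisite sync issues, run against each zone; the zone name is a placeholder:)
radosgw-admin sync status
radosgw-admin sync status --rgw-zone=$OTHER_ZONE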
On Wed, 12 Jun 2019, Harald Staub wrote:
> Also opened an issue about the rocksdb problem:
> https://tracker.ceph.com/issues/40300
Thanks!
The 'rocksdb: Corruption: file is too short' is the root of the problem
here. Can you try starting the OSD with 'debug_bluestore=20' and
'debug_bluefs=20'?
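(A sketch of how those debug levels are usually applied, assuming the failing OSD's id is $OSD_ID; either persistently in ceph.conf or one-off on the command line:)
[osd]
debug bluestore = 20
debug bluefs = 20
# or, running the daemon in the foreground for a single attempt:
ceph-osd -f -i $OSD_ID --debug_bluestore 20 --debug_bluefs 20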
Hi Harald,
If the bucket reshard didn't complete, it's most likely one of the new
bucket index shards that got corrupted here and the original index shard
should still be intact. Does $BAD_BUCKET_ID correspond to the
new/resharded instance id? If so, once the rocksdb/osd issues are
resolved,
Hi Aaron,
The data_log objects are storing logs for multisite replication. Judging
by the pool name '.us-phx2.log', this cluster was created before jewel.
Are you (or were you) using multisite or radosgw-agent?
If not, you'll want to turn off the logging (log_meta and log_data ->
false) in
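(The message is truncated here; a hedged sketch of the usual zonegroup edit:)
radosgw-admin zonegroup get > zonegroup.json
# set "log_meta" and "log_data" to "false" in zonegroup.json, then:
radosgw-admin zonegroup set < zonegroup.json
radosgw-admin period update --commit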
On Fri, 10 May 2019, Sage Weil wrote:
> Hi everyone,
>
> -- What --
>
> The Ceph Leadership Team[1] is proposing a change of license from
> *LGPL-2.1* to *LGPL-2.1 or LGPL-3.0* (dual license). The specific changes
> are described by this pull request:
Hi,
If there is nothing special about the defined “initial monitors” on the cluster, we’ll try to
remove mon01 from the cluster.
I mention the “initial monitor” because in our ceph implementation there is
only one monitor listed as “initial”:
[root@mon01 ceph]# cat /etc/ceph/ceph.conf
[global]
fsid =
Also opened an issue about the rocksdb problem:
https://tracker.ceph.com/issues/40300
On 12.06.19 16:06, Harald Staub wrote:
We ended in a bad situation with our RadosGW (Cluster is Nautilus
14.2.1, 350 OSDs with BlueStore):
1. There is a bucket with about 60 million objects, without shards.
I have run into a similar hang on 'ls .snap' recently:
https://tracker.ceph.com/issues/40101#note-2
On Wed, Jun 12, 2019 at 9:33 AM Yan, Zheng wrote:
>
> On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote:
> >
> > Hi list,
> >
> > I have a setup where two clients mount the same filesystem and
We ended in a bad situation with our RadosGW (Cluster is Nautilus
14.2.1, 350 OSDs with BlueStore):
1. There is a bucket with about 60 million objects, without shards.
2. radosgw-admin bucket reshard --bucket $BIG_BUCKET --num-shards 1024
3. Resharding looked fine first, it counted up to the
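(Not from the thread; a sketch of commands for inspecting a reshard on Nautilus, reusing $BIG_BUCKET from above; stale-instances needs a recent enough build:)
radosgw-admin reshard status --bucket $BIG_BUCKET
radosgw-admin reshard stale-instances list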
If there was an optimal setting, then it would be the default.
Also, both of these options were removed in Luminous, ~2 years ago.
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel:
On Wed, Jun 12, 2019 at 9:50 AM Rafael Diaz Maurin
wrote:
>
> Hello Jason,
>
> On 11/06/2019 at 15:31, Jason Dillaman wrote:
> >> 4- I export the snapshot from the source pool and I import the snapshot
> >> towards the destination pool (in the pipe)
> >> rbd export-diff --from-snap ${LAST-SNAP}
On both larger and smaller clusters I have never had problems with the default
values.
So I guess that's a pretty good start.
- Original Message -
From: "tim taler"
To: "Paul Emmerich"
Cc: "ceph-users"
Sent: Wednesday, June 12, 2019 3:51:43 PM
Subject: Re: [ceph-users] ceph threads
I will look into that, but:
Is there a rule of thumb to determine the optimal setting for
osd disk threads
and
osd op threads
?
TIA
On Wed, Jun 12, 2019 at 3:22 PM Paul Emmerich wrote:
>
>
>
> On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote:
>>
>> We experience absurdly slow I/O in the VMs and I
Hello Jason,
On 11/06/2019 at 15:31, Jason Dillaman wrote:
4- I export the snapshot from the source pool and I import the snapshot
towards the destination pool (in the pipe)
rbd export-diff --from-snap ${LAST-SNAP}
${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - | rbd -c ${BACKUP-CLUSTER}
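(The pipeline above is cut off; a hedged reconstruction of the common export-diff/import-diff idiom, keeping the poster's variable names:)
rbd export-diff --from-snap ${LAST-SNAP} \
    ${POOL-SOURCE}/${KVM-IMAGE}@${TODAY-SNAP} - \
  | rbd -c ${BACKUP-CLUSTER} import-diff - ${POOL-DEST}/${KVM-IMAGE}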
On Wed, Jun 12, 2019 at 3:26 PM Hector Martin wrote:
>
> Hi list,
>
> I have a setup where two clients mount the same filesystem and
> read/write from mostly non-overlapping subsets of files (Dovecot mail
> storage/indices). There is a third client that takes backups by
> snapshotting the
Hi all,
we have a 5 node ceph cluster with 44 OSDs
where all nodes also serve as virtualization hosts,
running about 22 virtual machines with all in all about 75 RBDs
(158 including snapshots).
We experience absurdly slow I/O in the VMs and I suspect
our thread settings in ceph.conf to be one of
On Wed, Jun 12, 2019 at 10:57 AM tim taler wrote:
> We experience absurdly slow I/O in the VMs and I suspect
> our thread settings in ceph.conf to be one of the culprits.
>
This is probably not the cause. But someone might be able to help you if you
share details on your setup (hardware,
On Wed, Jun 12, 2019 at 11:45 AM Lluis Arasanz i Nonell - Adam <
lluis.aras...@adam.es> wrote:
> - Be careful adding or removing monitors in an unhealthy monitor cluster:
> If they lose quorum you will run into problems.
>
Safe procedure: remove the dead monitor before adding a new one.
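(Not from the thread; a sketch of that sequence, with a placeholder monitor name:)
ceph mon remove mon01   # drop the dead monitor from the monmap first
# then add the replacement monitor and let quorum re-form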
On Wed, Jun 12, 2019 at 6:48 AM Glen Baars
wrote:
> Interesting performance increase! I'm using iSCSI at a few installations and
> now wonder what version of CentOS is required to improve performance! Did
> the cluster go from Luminous to Mimic?
>
Wild guess: probably related to updating
Hi all,
Here is our story; perhaps some day it could help someone. Bear in mind that English is
not my native language, so sorry if I make mistakes.
Our system is: Ceph 0.87.2 (Giant), with 5 OSD servers (116 1TB OSDs total) and
3 monitors.
After a nightmare time, we initially "corrected" the ceph monitor
Hi Felix,
Better use fio.
Like: fio -ioengine=rbd -direct=1 -invalidate=1 -name=test -bs=4k -iodepth=128
-rw=randwrite -pool=rpool_hdd -runtime=60 -rbdname=testimg (for peak parallel
random iops).
Or the same with -iodepth=1 for the latency test. Here you usually get
Hi list,
I have a setup where two clients mount the same filesystem and
read/write from mostly non-overlapping subsets of files (Dovecot mail
storage/indices). There is a third client that takes backups by
snapshotting the top-level directory, then rsyncing the snapshot over to
another location.
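(Not from the thread; a sketch of that backup pattern using CephFS's .snap directories, with hypothetical paths:)
SNAP=backup-$(date +%F)
mkdir /mnt/cephfs/.snap/$SNAP               # snapshot via the .snap dir
rsync -a /mnt/cephfs/.snap/$SNAP/ /backups/cephfs/
rmdir /mnt/cephfs/.snap/$SNAP               # remove the snapshot afterwards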
On 6/11/19 9:48 PM, J. Eric Ivancich wrote:
> Hi Wido,
>
> Interleaving below
>
> On 6/11/19 3:10 AM, Wido den Hollander wrote:
>>
>> I thought it was resolved, but it isn't.
>>
>> I counted all the OMAP values for the GC objects and I got back:
>>
>> gc.0: 0
>> gc.11: 0
>> gc.14: 0
>>
Quoting Patrick Donnelly (pdonn...@redhat.com):
> Hi Stefan,
>
> Sorry I couldn't get back to you sooner.
NP.
> Looks like you hit the infinite loop bug in OpTracker. It was fixed in
> 12.2.11: https://tracker.ceph.com/issues/37977
>
> The problem was introduced in 12.2.8.
We've been quite
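(Not from the thread; a quick way to confirm every daemon runs a fixed release, Luminous and later:)
ceph versions   # per-daemon release breakdown; look for >= 12.2.11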