and was not successful, or the monitors did not react correctly in this
situation and didn't complete the key exchange with the OSDs.
After replacing the system disk on the problematic mon, the verify_authorizer
problem no longer appeared in the log.
With regards
Jan Pekar
On 01/05/2019 13.58, Jan Pekař - Imatic wrote:
Today problem reappeared.
Restarting the mon helps, but it does not solve the issue.
Is there any way to debug this? Can I dump these keys from the MON, the OSDs,
or other components? Can I debug the key exchange? (See the sketch below.)
Thank you
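A minimal sketch of one way to debug this, assuming mon.a and osd.0 as placeholder daemon names (not taken from the thread): raise the auth and messenger debug levels at runtime and watch the logs during the key exchange.

# on the monitor host, via the admin socket (mon.a is a placeholder id)
ceph daemon mon.a config set debug_auth 20/20
ceph daemon mon.a config set debug_ms 1/5

# on an OSD, via injectargs (runtime only, reverts on restart)
ceph tell osd.0 injectargs '--debug_auth 20 --debug_ms 1'

# list the keys and capabilities the cluster currently knows about
ceph auth list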
On 27/04/2019 10.56, Jan Pekař - Imatic wrote:
On 26/04/2019 21.50, Gregory Farnum wrote:
On Fri, Apr 26, 2019 at 10:55 AM Jan Pekař - Imatic wrote:
Hi,
yesterday my cluster reported slow requests for minutes, and after I restarted
the OSDs reporting the slow requests it got stuck with peering PGs. The whole
cluster was not responding and IO stopped.
Is there some timeout or grace period for old key usage before the keys are
invalidated?
Thank you
With regards
Jan Pekar
--
====
Ing. Jan Pekař
jan.pe...@imatic.cz
Imatic | Jagellonská 14 | Praha 3 | 130 00
http://www.imatic.cz
The problem appeared before I tried to re-balance my cluster and was invisible
to me. But it never happened before, and scrub and deep-scrub run regularly.
I don't know where to continue with debugging this problem.
JP
On 3.10.2018 08:47, Jan Pekař - Imatic wrote:
Hi all,
I'm playing with my testi
"snapid": -2,
"hash": 586898362,
"max": 0,
"pool": 10,
"namespace": ""
},
"need": "13528'6795",
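This fragment appears to be per-object missing-object output, with "need" holding the object version the PG requires — plausibly from ceph pg list_missing or ceph pg query on a Luminous-era release (an assumption; the originating command is truncated). A sketch, with <pgid> as a placeholder:

ceph pg <pgid> list_missing    # objects the PG needs but cannot find
ceph pg <pgid> query           # full peering/recovery state, including "need"/"have"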
On 6.3.2018 22:28, Gregory Farnum wrote:
On Sat, Mar 3, 2018 at 2:28 AM Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi all,
I have a few problems on my cluster that may be linked together and have
now caused an OSD to go down during pg repair.
First, a few notes about my cluster:
4 nodes, 15 OSDs, installed on Luminous (no upgrade).
Replicated pools, with one pool (pool 6) cached by SSD disks.
I don't detect any hardware problems.
On 3.3.2018 11:12, Yan, Zheng wrote:
On Tue, Feb 27, 2018 at 2:29 PM, Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
I think I hit the same issue.
I have corrupted data on CephFS, and I don't remember this issue before
Luminous (I ran the same tests before).
It is on my test 1-node cluster
and let you
know.
With regards
Jan Pekar
On 28.2.2018 15:14, David C wrote:
On 27 Feb 2018 06:46, "Jan Pekař - Imatic" <jan.pe...@imatic.cz> wrote:
I think I hit the same issue.
I have corrupted data on cephfs and I don't r
there is any change.
Hi all,
yesterday I got an OSD down with this error:
2018-01-04 06:47:25.304513 7fe6eda51700 -1 log_channel(cluster) log
[ERR] : 6.20 repair 1 missing, 0 inconsistent objects
2018-01-04 06:47:25.312861 7fe6eda51700 -1 log_channel(cluster) log
[ERR] : 6.20 repair 3 errors, 2 fixed
2018-01-04
With regards
Jan Pekar
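A sketch of the usual inspect-then-repair sequence for an inconsistent PG like 6.20 above; the inconsistency listing is only populated after a recent scrub of that PG.

rados list-inconsistent-obj 6.20 --format=json-pretty   # details of the damaged objects
ceph pg deep-scrub 6.20                                 # refresh scrub results if stale
ceph pg repair 6.20                                     # then re-check with ceph -s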
On 6.12.2017 23:58, David Turner wrote:
Do you have the FS mounted with a trimming ability? What are your mount
options?
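A sketch of how one might check this, assuming a filesystem on an RBD-backed device mounted at /mnt/rbd (a hypothetical mount point, not from the thread):

findmnt -o TARGET,OPTIONS /mnt/rbd      # is "discard" among the mount options?
fstrim -v /mnt/rbd                      # one-shot trim if the device supports it
mount -o remount,discard /mnt/rbd       # enable continuous discard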
On Wed, Dec 6, 2017 at 5:30 PM Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi,
On 6.12.2017 15:24, Jason Dillaman wrote:
at setting up an mgr daemon.
On Mon, Dec 11, 2017, 2:07 PM Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi,
thank you for the response. I started the MDS manually and accessed CephFS;
I'm not running mgr yet, it is not necessary.
I just responded
On Mon, Dec 11, 2017 at 1:08 PM Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi all,
hope that somebody can help me. I have a home Ceph installation.
After a power failure (it can happen in a datacenter too) my Ceph booted
into an inconsistent state.
that pg data from osd's?
In the OSD logs I can see that backfilling is continuing etc., so either they
have correct information or they are replaying operations from before the
power failure.
With regards
Jan Pekar
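A sketch of the usual first steps for inspecting PGs stuck after an event like this (generic commands, not from the thread; <pgid> is a placeholder):

ceph health detail               # which PGs are stuck, and why
ceph pg dump_stuck unclean       # also accepts: inactive, stale, degraded, undersized
ceph pg <pgid> query             # "recovery_state" explains what the PG is waiting for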
On 11.12.2017 19:07, Jan Pekař - Imatic wrote:
Hi all,
hope that somebody can help me. I have a home Ceph installation.
After a power failure (it can happen in a datacenter too) my Ceph booted
into an inconsistent state.
I was backfilling data onto one new disk during the power failure. The first
time it booted without some OSDs, but I fixed that. Now I
Hi,
On 6.12.2017 15:24, Jason Dillaman wrote:
On Wed, Dec 6, 2017 at 3:46 AM, Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi,
I ran into an overloaded cluster (deep-scrub running) for a few seconds, the
rbd-nbd client timed out, and the device became unavailable.
block nbd0: Connection timed out
Hi,
I ran into an overloaded cluster (deep-scrub running) for a few seconds, the
rbd-nbd client timed out, and the device became unavailable.
block nbd0: Connection timed out
block nbd0: shutting down sockets
block nbd0: Connection timed out
print_req_error: I/O error, dev nbd0, sector 2131833856
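A sketch of remapping with a longer NBD timeout so brief cluster stalls don't kill the device — assuming the rbd-nbd build supports the --timeout option, and with rbd/myimage as a placeholder image:

rbd-nbd unmap /dev/nbd0                  # clean up the dead mapping first
rbd-nbd map --timeout 120 rbd/myimage    # timeout in seconds instead of the kernel default
rbd-nbd list-mapped                      # verify the device came back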
There are ways to flush all objects (like turning off the VMs, or setting a
short evict time or a target size) and remove the overlay after that — see
the sketch below.
With regards
Jan Pekar
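A sketch of that flush-and-remove sequence, assuming hot-pool is tiered on a base pool called base-pool (the base pool name is a placeholder):

ceph osd tier cache-mode hot-pool forward --yes-i-really-mean-it   # stop new dirty objects
rados -p hot-pool cache-flush-evict-all                            # retry until it drains
ceph osd tier remove-overlay base-pool
ceph osd tier remove base-pool hot-pool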
On 1.12.2017 03:43, Jan Pekař - Imatic wrote:
Hi all,
today I tested adding an SSD cache tier to a pool.
Everything worked, but when I tried to remove it and ran
rados
Hi all,
today I tested adding an SSD cache tier to a pool.
Everything worked, but when I tried to remove it and ran
rados -p hot-pool cache-flush-evict-all
I got
rbd_data.9c000238e1f29.
failed to flush /rbd_data.9c000238e1f29.: (2) No such
file or directory
Wido den Hollander wrote:
On 7 November 2017 at 10:14, Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Additional info: it is not librbd related. I mapped the disk through
rbd map and it was the same; the virtuals were stuck/frozen.
It happened exactly when the error appeared in my log.
Why aren't you just trying a
different version of QEMU and/or a different host OS, since loss of a disk
shouldn't hang it -- only potentially the guest OS.
On Tue, Nov 7, 2017 at 5:17 AM, Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
I'm calling kill -STOP to simulat
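A sketch of this kind of simulation — pausing a ceph-osd process with SIGSTOP and resuming it later. Assumptions: it is an OSD daemon being stopped, and the pgrep pattern matches the systemd-style ceph-osd command line.

pid=$(pgrep -f 'ceph-osd .*--id 0')   # pick one OSD daemon
kill -STOP "$pid"                     # freeze it: I/O to this OSD now stalls
sleep 60                              # observe guest/client behaviour meanwhile
kill -CONT "$pid"                     # let it continue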
attached inside
QEMU/KVM virtuals.
JP
On 7.11.2017 10:57, Piotr Dałek wrote:
On 17-11-07 12:02 AM, Jan Pekař - Imatic wrote:
Hi,
I'm using debian stretch with ceph 12.2.1-1~bpo80+1 and qemu
1:2.8+dfsg-6+deb9u3
I'm running 3 nodes with 3 monitors and 8 OSDs, all on IPv6.
When I
was deadlocked, the worst case that I would expect would
be your guest OS complaining about hung kernel tasks related to disk IO
(since the disk wouldn't be responding).
On Mon, Nov 6, 2017 at 6:02 PM, Jan Pekař - Imatic <jan.pe...@imatic.cz> wrote:
Hi,
I'm using debian stretch with ceph 12.2.1-1~bpo80+1 and qemu
1:2.8+dfsg-6+deb9u3
I'm running 3 nodes with 3 monitors and 8 OSDs, all on IPv6.
When I tested the cluster, I detected a strange and severe problem.
On first node I'm running qemu hosts with librados disk connection to
On 2015-07-13 12:01, Gregory Farnum wrote:
On Mon, Jul 13, 2015 at 9:49 AM, Ilya Dryomov idryo...@gmail.com wrote:
On Fri, Jul 10, 2015 at 9:36 PM, Jan Pekař jan.pe...@imatic.cz wrote:
Hi all,
I think I found a bug in cephfs kernel client.
When I create directory in cephfs and set layout
Hi all,
I think I found a bug in the CephFS kernel client.
When I create a directory in CephFS and set its layout to
ceph.dir.layout=stripe_unit=1073741824 stripe_count=1
object_size=1073741824 pool=somepool
attempts to write a larger file cause a kernel hang or reboot.
When I'm using cephfs client
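A sketch of reproducing such a layout, assuming a kernel-mounted CephFS at /mnt/cephfs (a hypothetical mount point); layouts are set via extended attributes on an empty directory, using the same combined value form as quoted above:

mkdir /mnt/cephfs/bigdir
setfattr -n ceph.dir.layout \
  -v "stripe_unit=1073741824 stripe_count=1 object_size=1073741824 pool=somepool" \
  /mnt/cephfs/bigdir
getfattr -n ceph.dir.layout /mnt/cephfs/bigdir   # verify what was stored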
On 2014-11-10 20:53, Craig Lewis wrote:
"nothing to send, going to standby" isn't necessarily bad; I see it from
time to time. It shouldn't stay like that for long though. If it's
been 5 minutes and the cluster still isn't doing anything, I'd restart
that OSD.
On Fri, Nov 7, 2014 at 1:55 PM, Jan Pekař
Hi,
is there any possibility to change erasure-coded pool parameters, i.e. the k
and m values, on the fly? I want to add more disks to an existing erasure
pool and change the redundancy level. I cannot find it in the docs.
Changing the erasure-code-profile is not working, so I assume it is only a
template for
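Indeed, k and m cannot be changed on an existing pool; the usual sketch (with placeholder pool/profile names, and option names per recent releases — older releases used ruleset-failure-domain) is to create a new profile and pool and migrate:

ceph osd erasure-code-profile set newprofile k=6 m=2 crush-failure-domain=host
ceph osd pool create newpool 128 128 erasure newprofile
rados cppool oldpool newpool   # simple but slow; caveats, e.g. snapshots are not copied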
Hi,
I was testing Ceph cluster map changes and I got into a stuck state which
seems to be indefinite.
First, a description of what I have done.
I'm testing a special case with only one copy of each pg (pool size = 1).
All pgs were on one osd.0. I created a second osd.1 and modified the cluster
map to
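For reference, a sketch of setting up that single-copy test pool (the pool name is a placeholder; newer releases may additionally require --yes-i-really-mean-it or mon_allow_pool_size_one for size 1):

ceph osd pool create testpool 64 64 replicated
ceph osd pool set testpool size 1
ceph osd pool set testpool min_size 1
ceph pg dump pgs_brief | head    # shows which OSD each PG maps to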