I have to admit that it's probably buried deep within the backlog [1].
Immediate-term alternative solutions for presenting an RBD-backed block
device that supports journaling are available via rbd-nbd (creates
/dev/nbdX devices) and also via LIO's tcmu-runner and a loopback
target (creates /dev/sdX devices).
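For illustration, a minimal sketch of the rbd-nbd path (pool/image names
are placeholders, not from the thread):

  # map the image through the librbd-backed NBD driver (journaling-capable)
  rbd-nbd map rbd/myimage        # prints the created device, e.g. /dev/nbd0
  # ... use /dev/nbd0 as a normal block device ...
  rbd-nbd unmap /dev/nbd0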
On Tue, Sep 26, 2017 at 9:36 AM, Yoann Moulin wrote:
>
>>> ok, I don't know where I read about the -o option to write the key, but the file
>>> was empty; I used a ">" redirect instead and it seems to work to list or create rbd now.
>>>
>>> and from what I have tested so far, the correct syntax is « mon 'profile rbd' osd
>>> 'prof
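For reference, a hedged example of the Luminous-style caps being discussed
(the client name and pool name are placeholders):

  ceph auth get-or-create client.foo mon 'profile rbd' osd 'profile rbd pool=rbd'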
On Tue, Sep 26, 2017 at 4:52 AM, Yoann Moulin wrote:
> Hello,
>
>> I try to give access to a rbd to a client on a fresh Luminous cluster
>>
>> http://docs.ceph.com/docs/luminous/rados/operations/user-management/
>>
>> first of all, I'd like to know the exact syntax for auth caps
>>
>> the result o
; export/import workaround mentioned here without *first* racing to this
>>> ML or IRC and crying out for help. This is a valuable resource, made
>>> more so by people sharing issues.
>>>
>>> Cheers,
>>>
>>> On 12 September 2017 at 07:22, Jason
wrong in the
> upgrade process and if not, should it be documented somewhere how to
> setup the permissions correctly on upgrade?
>
> Or should the documentation on the side of the cloud infrastructure
> software be updated?
>
>
>
> Jason Dillaman writes:
>
>> Since yo
>
> p.s.: I am also available for online chat on
> https://brandnewchat.ungleich.ch/ in case you need more information quickly.
>
> Jason Dillaman writes:
>
>> I see the following which is most likely the issue:
>>
>> 2017-09-11 22:26:38.945776 7efd677fe700 -1
I see the following which is most likely the issue:
2017-09-11 22:26:38.945776 7efd677fe700 -1
librbd::managed_lock::BreakRequest: 0x7efd58020e70 handle_blacklist:
failed to blacklist lock owner: (13) Permission denied
2017-09-11 22:26:38.945795 7efd677fe700 10
librbd::managed_lock::BreakRequest:
... also, do you have any logs from the OS associated w/ this log file? I
am specifically looking for anything to indicate which sector was
considered corrupt.
On Mon, Sep 11, 2017 at 4:41 PM, Jason Dillaman wrote:
> Thanks -- I'll take a look to see if anything else stands out. That
Thanks -- I'll take a look to see if anything else stands out. That
"Exec format error" isn't actually an issue -- but now that I know
about it, we can prevent it from happening in the future [1]
[1] http://tracker.ceph.com/issues/21360
On Mon, Sep 11, 2017 at 4:32 PM, Nico Schottelius
wrote:
>
Definitely would love to see some debug-level logs (debug rbd = 20 and
debug objecter = 20) for any VM that experiences this issue. The only
thing I can think of is something to do with sparse object handling
since (1) krbd doesn't perform sparse reads and (2) re-importing the
file would eliminate
does librbd also read ceph.conf and will that cause qemu to output
> debug messages?
>
> Best,
>
> Nico
>
> Jason Dillaman writes:
>
>> I presume QEMU is using librbd instead of a mapped krbd block device,
>> correct? If that is the case, can you add "debug-rb
I presume QEMU is using librbd instead of a mapped krbd block device,
correct? If that is the case, can you add "debug-rbd=20" and "debug
objecter=20" to your ceph.conf and boot up your last remaining broken
VM?
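As a sketch, the relevant ceph.conf additions on the client side might look
like this (the log file path is an assumption, adjust as needed):

  [client]
  debug rbd = 20
  debug objecter = 20
  log file = /var/log/ceph/qemu-client.$pid.log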
On Sun, Sep 10, 2017 at 8:23 AM, Nico Schottelius
wrote:
>
> Good morning,
>
> yeste
No support for that yet -- it's being tracked by a backlog ticket [1].
[1] https://trello.com/c/npmsOgM5
On Wed, Sep 6, 2017 at 12:27 PM, Christoph Adomeit
wrote:
> Now that we are 2 years and some ceph releases further along and have bluestore:
>
> Are there meanwhile any better ways to find out the m
The rbd CLI's "lock"-related commands are advisory locks that require
an outside process to manage. The exclusive-lock feature replaces the
advisory locks (and purposely conflicts with them so you cannot use both
concurrently). I'd imagine at some point those CLI commands should be
deprecated, but th
It's up for me as well -- but for me the master branch docs are
missing the table of contents on the nav pane on the left.
On Thu, Aug 17, 2017 at 3:32 PM, David Turner wrote:
> I've been using docs.ceph.com all day and just double checked that it's up.
> Make sure that your DNS, router, firewall
I'm not sure what's going on w/ the master branch docs today, but in
the meantime you can use the luminous docs [1] until this is sorted
out since they should be nearly identical.
[1] http://docs.ceph.com/docs/luminous/
On Thu, Aug 17, 2017 at 2:52 PM, wrote:
> ... or at least since yesterday!
You should be able to set a CEPH_ARGS='--id rbd' environment variable.
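For example (client name assumed):

  export CEPH_ARGS='--id rbd'
  rbd ls    # now authenticates as client.rbd without extra flags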
On Thu, Aug 17, 2017 at 2:25 PM, David Turner wrote:
> I already tested putting name, user, and id in the global section with
> client.rbd and rbd as the value (one at a time, testing in between). None of
> them had any effect
You should be able to utilize image-meta to override the configuration
on a particular image:
# rbd image-meta set <pool>/<image> conf_rbd_clone_copy_on_read true
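To verify the override took effect, something like the following should work
(image spec is a placeholder):

  # rbd image-meta list <pool>/<image>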
On Wed, Aug 16, 2017 at 8:36 PM, Xavier Trilla
wrote:
> Hi,
>
>
>
> Is it possible to enable copy on read for a rbd child image? I’ve been
> chec
I believe you are thinking of the "exclusive-lock" feature which has
been supported since kernel v4.9. The latest kernel only supports
layering, exclusive-lock, and data-pool features. There is also
support for tolerating the striping feature when it's (erroneously)
enabled on an image but doesn't
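As a hedged example, an image restricted to the features the kernel client can
handle could be created like this (size and names are placeholders):

  rbd create --size 10G --image-feature layering,exclusive-lock rbd/krbd-image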
I believe this is a known issue [1] and that there will potentially be
a new 12.1.4 RC released because of it. The tracker ticket has a link
to a set of development packages that should resolve the issue in the
meantime.
[1] http://tracker.ceph.com/issues/20985
On Tue, Aug 15, 2017 at 9:08 AM, A
Personally, I didn't quite understand your use-case. You only have a
single host and two drives (one for live data and the other for DR)?
On Mon, Aug 14, 2017 at 4:09 PM, Oscar Segarra wrote:
> Hi,
>
> Has anybody been able to get mirroring working?
>
> does the scenario I'm proposing make any sense
system and don't need nor want an added
layer of complexity in the long term.
On Wed, Aug 9, 2017 at 12:42 PM, Samuel Soulard
wrote:
> Hmm :( Even for an Active/Passive configuration? I'm guessing we will need
> to do something with Pacemaker in the meantime?
>
> On Wed, Aug 9,
We are working hard to formalize active/passive iSCSI configuration
across Linux/Windows/ESX via LIO. We have integrated librbd into LIO's
tcmu-runner and have developed a set of support applications to
manage the clustered configuration of your iSCSI targets. There is
some preliminary documentat
client_mount_timeout = 75
>
> -----Original Message-----
> From: Jason Dillaman [mailto:jdill...@redhat.com]
> Sent: August 8, 2017 7:58
> To: shilu 09816 (RD)
> Cc: ceph-users
> Subject: Re: hammer(0.94.5) librbd dead lock,i want to how to resolve
>
> I am not sure what you mean by "I stop
I am not sure what you mean by "I stop ceph" (stopped all the OSDs?)
-- and I am not sure how you are seeing ETIMEDOUT errors on a
"rbd_write" call since it should just block assuming you are referring
to stopping the OSDs. What is your use-case? Are you developing your
own application on top of li
ned on for that image.
>
> Deep-flatten cannot be added to an rbd after creation, correct? What are my
> options here?
>
> On Mon, Aug 7, 2017 at 3:32 PM Jason Dillaman wrote:
>>
>> Does the image "tyr-p0/a56eae5f-fd35-4299-bcdc-65839a00f14c" have
>> snapsh
Does the image "tyr-p0/a56eae5f-fd35-4299-bcdc-65839a00f14c" have
snapshots? If the deep-flatten feature isn't enabled, the flatten
operation is not able to dissociate child images from parents when
those child images have one or more snapshots.
On Fri, Aug 4, 2017 at 2:30 PM, Shawn Edwards wrote
e its configuration within ceph-ansible.
On Wed, Aug 2, 2017 at 12:02 PM, Дмитрий Глушенок wrote:
> Will it be a separate project? There is a third RC for Luminous without a
> word about iSCSI Gateway.
>
> On 17 July 2017 at 14:54, Jason Dillaman wrote:
>
> On Sat, Jul 15, 2
You could just use the "rbd du" command to calculate the real disk
usage of images / snapshots and compare that to the thin-provisioned
size of the images.
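For example (pool/image names are placeholders):

  rbd du rbd/myimage     # per-snapshot and HEAD usage vs. provisioned size
  rbd du -p rbd          # summary for every image in the pool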
On Mon, Jul 31, 2017 at 11:28 PM, Italo Santos wrote:
> Hello everyone,
>
> As we know the Openstack ceph integration uses ceph rbd snapshot
While I cannot reproduce what you are seeing, I can see how it could
theoretically be possible for this to deadlock on a thread shutdown if
the process was being shut down before the service thread had a chance
to actually start executing. I've opened a tracker ticket for the
issue [1].
[1] http://
oVirt 3.6 added Cinder/RBD integration [1] and it looks like they are
currently working on integrating Cinder within a container to simplify
the integration [2].
[1]
http://www.ovirt.org/develop/release-management/features/storage/cinder-integration/
[2]
http://www.ovirt.org/develop/release-mana
"exclusive" map option that is only available starting
with kernel 4.12.
> 2 The command with exclusive
> would be this?
>
> rbd map --exclusive test-xlock3
Yes, that should be it.
> Thanks a Lot,
>
> Marcelo
>
>
> On 24/07/2017, Jason Dillaman wrote:
>
Your google-fu hasn't failed -- that is a missing feature. I've opened
a new feature-request tracker ticket to get support for that.
[1] http://tracker.ceph.com/issues/20762
On Fri, Jul 21, 2017 at 5:04 PM, Daniel K wrote:
> Once again my google-fu has failed me and I can't find the 'correct' wa
Increasing the size of an image only issues a single write to update
the image size metadata in the image header. That operation is atomic
and really shouldn't be able to do what you are saying. Regardless,
since this is a grow operation, just re-run the resize to update the
metadata again.
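A sketch of re-running the grow operation (size and image spec are
placeholders):

  rbd resize --size 200G rbd/myimage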
On Mo
You will need to pass the "exclusive" option when running "rbd map"
(and be running kernel >= 4.12).
On Mon, Jul 24, 2017 at 8:42 AM, wrote:
> I'm testing ceph in my environment, but the exclusive-lock feature doesn't
> work for me, or maybe I'm doing something wrong.
>
> I testing in two mach
Nothing is built-in for this yet, but it's on the roadmap for a future
release [1].
[1] http://pad.ceph.com/p/ceph-top
On Thu, Jul 20, 2017 at 9:52 AM, Stéphane Klein
wrote:
> Hi,
>
> is it possible to get IO stats (read / write bandwidth) by client or image?
>
> I see this thread
> http://lists
>
> Parameters                                       Time taken
> -t writeback                                     38 mins
> -t none                                          38 mins
> -S 4k                                            38 mins
> With client options mentioned by Irek Fasikhov   40 mins
> The time taken is almost the same.
>
> On Thu, Jul 13, 2017 at 6:40 PM, Jason Dillaman
> wrote:
>
>> On Thu, Jul 13
On Sat, Jul 15, 2017 at 8:00 PM, Ruben Rodriguez wrote:
>
>
> On 14/07/17 18:43, Ruben Rodriguez wrote:
>> How to reproduce...
>
> I'll provide more concise details on how to test this behavior:
>
> Ceph config:
>
> [client]
> rbd readahead max bytes = 0 # we don't want forced readahead to fool us
Are you 100% positive that your files are actually stored sequentially
on the block device? I would recommend running blktrace to verify the
IO pattern from your use-case.
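For example, a rough sketch of capturing the trace on the mapped device
(device name is a placeholder):

  blktrace -d /dev/rbd0 -o - | blkparse -i -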
On Sat, Jul 15, 2017 at 5:42 PM, Ruben Rodriguez wrote:
>
>
> On 15/07/17 09:43, Nick Fisk wrote:
>>> -Original Message--
On Sat, Jul 15, 2017 at 5:35 PM, Ruben Rodriguez wrote:
>
>
> On 15/07/17 15:33, Jason Dillaman wrote:
>> On Sat, Jul 15, 2017 at 9:43 AM, Nick Fisk wrote:
>>> Unless you tell the rbd client to not disable readahead after reading the
>>> 1st x number of bytes (
On Sat, Jul 15, 2017 at 11:01 PM, Alvaro Soto wrote:
> Hi guys,
> does anyone know any news about in what release iSCSI interface is going to
> be production ready, if not yet?
There are several flavors of RBD iSCSI implementations that are in-use
by the community. We are working to solidify the
On Sat, Jul 15, 2017 at 9:43 AM, Nick Fisk wrote:
> Unless you tell the rbd client to not disable readahead after reading the 1st
> x number of bytes (rbd readahead disable after bytes=0), it will stop reading
> ahead and will only cache exactly what is requested by the client.
The default is t
> > At the moment, in my environment testing with ceph, using kernel version
>> > 4.10, I mount the filesystem on two machines at the same
>> > time; in a production environment I could have serious problems with this
>> > behavior.
>> >
>> > How c
at 3:02 AM, 许雪寒 wrote:
> Yes, I believe so. Is there any workarounds?
>
> -----Original Message-----
> From: Jason Dillaman [mailto:jdill...@redhat.com]
> Sent: July 13, 2017 21:13
> To: 许雪寒
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Re: No "snapset" attribute for c
On Thu, Jul 13, 2017 at 10:58 AM, Maged Mokhtar wrote:
> The case also applies to active/passive iSCSI.. you still have many
> initiators/hypervisors writing concurrently to the same rbd image using a
> clustered file system (csv/vmfs).
Except from that point-of-view, there is only a single RBD c
Quite possibly the same as this issue? [1]
[1] http://tracker.ceph.com/issues/17445
On Thu, Jul 13, 2017 at 8:13 AM, 许雪寒 wrote:
> By the way, we are using hammer version's rbd command to export-diff rbd
> images on Jewel version's cluster.
>
> -----Original Message-----
> From: ceph-users [mailto:ceph-users
On Thu, Jul 13, 2017 at 8:57 AM, Irek Fasikhov wrote:
> rbd readahead disable after bytes = 0
There isn't any reading from an RBD image in this example -- plus
readahead disables itself automatically after the first 50MBs of IO
(i.e. after the OS should have had enough time to start its own
I'll refer you to the original thread about this [1] that was awaiting
an answer. I would recommend dropping the "-t none" option since that
might severely slow down sequential write operations if "qemu-img
convert" is performing 512 byte IO operations. You might also want to
consider adding the "-
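As a hedged sketch of the convert invocation without "-t none" (source and
destination names are placeholders):

  qemu-img convert -p -f qcow2 -O raw source.qcow2 rbd:rbd/destimage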
On Mon, Jul 10, 2017 at 3:41 PM, Maged Mokhtar wrote:
> On 2017-07-10 20:06, Mohamad Gebai wrote:
>
>
> On 07/10/2017 01:51 PM, Jason Dillaman wrote:
>
> On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar wrote:
>
> These are significant differences, to the point where it may
On Mon, Jul 10, 2017 at 1:39 PM, Maged Mokhtar wrote:
> These are significant differences, to the point where it may not make sense
> to use rbd journaling / mirroring unless there is only 1 active client.
I interpreted the results as meaning that the same RBD image was being
concurrently used by two fio jobs
On Fri, Jul 7, 2017 at 2:48 AM, Piotr Dałek wrote:
> Is this:
> https://github.com/yuyuyu101/ceph/commit/794b49b5b860c538a349bdadb16bb6ae97ad9c20#commitcomment-15707924
> the issue you mention? Because at this point I'm considering switching to
> C++ API and passing static bufferptr buried in my b
There are no immediate plans to support the RBD journaling in krbd.
The journaling feature requires a lot of code and, with limited
resources, the priority has been to provide alternative block device
options that pass-through to librbd for such use-cases and to optimize
the performance of librbd /
On Thu, Jul 6, 2017 at 3:25 PM, Piotr Dałek wrote:
> Is that deep copy an equivalent of what
> Jewel librbd did at unspecified point of time, or extra one?
It's equivalent / replacement -- not an additional copy. This was
changed to support scatter/gather IO API methods which the latest
version o
On Thu, Jul 6, 2017 at 11:46 AM, Piotr Dałek wrote:
> How about a hybrid solution? Keep the old rbd_aio_write contract (don't copy
> the buffer with the assumption that it won't change) and instead of
> constructing bufferlist containing bufferptr to copied data, construct a
> bufferlist containin
On Thu, Jul 6, 2017 at 10:22 AM, Piotr Dałek wrote:
> So I really see two problems here: lack of API docs and
> backwards-incompatible change in API behavior.
Docs are always in need of update, so any pull requests would be
greatly appreciated.
However, I disagree that the behavior has substanti
On Thu, Jul 6, 2017 at 9:33 AM, Piotr Dałek wrote:
> On 17-07-06 03:03 PM, Jason Dillaman wrote:
>>
>> On Thu, Jul 6, 2017 at 8:26 AM, Piotr Dałek
>> wrote:
>>>
>>> Hi,
>>>
>>> If you're using "rbd_aio_write()" in your co
Pre-Luminous also copies the provided buffer when using the C API --
it just copies it at a later point and not immediately. The eventual
goal is to eliminate the copy completely, but that requires some
additional plumbing work deep down within the librados messenger
layer.
On Thu, Jul 6, 2017 at
On Thu, Jun 29, 2017 at 1:33 PM, Gregory Farnum wrote:
> I'm not sure if there are built-in tunable commands available (check the
> manpages? Or Jason, do you know?), but if not you can use any generic
> tooling which limits how much network traffic the RBD command can run.
Long-running RBD actio
On Wed, Jun 28, 2017 at 11:42 PM, YuShengzuo wrote:
> Hi Jason Dillaman,
>
>
>
> I am using rbd-mirror now (release Jewel).
>
>
>
> 1.
>
> Many websites and other sources introducing rbd-mirror note that the two
> ceph clusters should have the 'same fsid'.
... additionally, the forthcoming 4.12 kernel release will support
non-cooperative exclusive locking. By default, since 4.9, when the
exclusive-lock feature is enabled, only a single client can write to the
block device at a time -- but they will cooperatively pass the lock back
and forth upon writ
working fine. We have not changed any queue depth setting
> on that setup either. If it turns out to be queue depth how can we set queue
> setting for qemu-img convert operation?
>
> Thank you.
>
> Sent from my iPhone
>
>> On Jun 28, 2017, at 7:56 PM, Jason Dillaman wrot
Given that your time difference is roughly 10x, best guess is that
qemu-img is sending the IO operations synchronously (queue depth = 1),
whereas, by default, "rbd import" will send up to 10 write requests in
parallel to the backing OSDs. This assumes that you have
really high latency
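As a rough sketch, the import parallelism can be tuned; the option name below
is an assumption on my part, so verify it against your rbd release:

  rbd import --rbd-concurrent-management-ops 20 source.img rbd/destimage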
On Tue, Jun 27, 2017 at 7:17 PM, Daniel K wrote:
> I'm trying to find a good way to mount ceph rbd images for export by
> LIO/targetcli
I would eventually recommend just directly serving the RBD images via
LIO/TCMU. This is still a work-in-progress but it's being actively
worked on with the goal
Have you tried blktrace to determine if there are differences in the
IO patterns to the rbd-backed virtio-scsi block device (direct vs
indirect through loop)?
On Tue, Jun 27, 2017 at 3:17 PM, Ruben Rodriguez wrote:
>
> We are setting a new set of servers to run the FSF/GNU infrastructure,
> and w
lots of place in the
> centos7.3,
> are they fixed for something?
> Tks and Rgds.
>
>
>
> ------ Original Message ------
> From: "Jason Dillaman";;
> Sent: Tuesday, June 27, 2017, 7:28 AM
> To: "码云";
> Cc: "ceph-users";
> Subject: Re: [cep
May I ask why you are using krbd with QEMU instead of librbd?
On Fri, Jun 16, 2017 at 12:18 PM, 码云 wrote:
> Hi All,
> Recently I ran into a question and I didn't find anything to explain it.
>
> The operation process was like below:
> ceph 10.2.5 jewel, qemu 2.5.0 centos 7.2 x86_64
> create pool rbd_vms 3
On Mon, Jun 26, 2017 at 2:55 PM, Mayank Kumar wrote:
> Thanks David, few more questions:-
> - Is there a way to limit the capability of the keyring which is used to
> map/unmap/lock to only allow those operations and nothing else using that
> specific keyring
Since RBD is basically just a collect
Restoring a snapshot involves copying the entire image from the
snapshot revision to the HEAD revision. The faster approach would be
to just create a clone from the snapshot.
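For example, a minimal sketch (image/snapshot names are placeholders):

  rbd snap protect rbd/myimage@snap1
  rbd clone rbd/myimage@snap1 rbd/myclone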
2017-06-26 10:59 GMT-04:00 Marco Gaiarin :
> Hi! Lindsay Mathieson
> wrote, in that day's message...
>
>> Have you tried re
Are you using librbd via QEMU or krbd? If librbd, what errors are
noted in the instance's librbd log file?
On Sun, Jun 25, 2017 at 4:30 AM, Massimiliano Cuttini
wrote:
> After 4 months of testing we decided to go live and store real VDI in
> production.
> However just the same day something went su
On Fri, Jun 23, 2017 at 8:47 AM, Hall, Eric wrote:
> I have debug logs. Should I open a RBD tracker ticket at
> http://tracker.ceph.com/projects/rbd/issues for this?
Yes, please. You might need to use the "ceph-post-file" utility if the
logs are too large to attach to the ticket. In that case,
CentOS 7.3's krbd supports Jewel tunables (CRUSH_TUNABLES5) and does
not support NBD since that driver is disabled out-of-the-box. As an
alternative for NBD, the goal is to also offer LIO/TCMU starting with
Luminous and the next point release of CentOS (or a vanilla >=4.12-ish
kernel).
On Fri, Jun
methodology are welcome.
>
>
> We occasionally see blocked requests in a running log (ceph -w > log),
> but not correlated with hung VM IO. Scrubbing doesn’t seem correlated either.
>
> --
> Eric
>
> On 6/21/17, 2:55 PM, "Jason Dillaman" wrote:
ges so yes, there is the default
> "/sbin/fstrim --all" in /etc/cron.weekly/fstrim.
>
> --
> Eric
>
> On 6/21/17, 1:58 PM, "Jason Dillaman" wrote:
>
> Are some or many of your VMs issuing periodic fstrims to discard
> unused extents?
>
> On Wed,
Are some or many of your VMs issuing periodic fstrims to discard
unused extents?
On Wed, Jun 21, 2017 at 2:36 PM, Hall, Eric wrote:
> After following/changing all suggested items (turning off exclusive-lock
> (and associated object-map and fast-diff), changing host cache behavior,
> etc.) this is
On Wed, Jun 21, 2017 at 3:05 AM, Piotr Dałek wrote:
> I saw that RBD (librbd) does that - replacing writes with discards when
> buffer contains only zeros. Some code that does the same in librados could
> be added and it shouldn't impact performance much, current implementation of
> mem_is_zero is
s/troubleshooting/log-and-debug/
On Thu, Jun 15, 2017 at 3:06 AM, YuShengzuo wrote:
> Hi Jason Dillaman,
>
>
>
> I have a question about rbd-mirror :
>
> Recently, I began to use the feature, but I can't find logs
> about it (it is running well).
>
>
Assuming the RBD object-map feature is *not* enabled, if the
associated backing object was not overwritten in rbd2 nor rbd3, every
read operation to that object would involve first attempting to read
from rbd3's object, then rbd2's, followed by rbd1's, which would
introduce extra latency. The first
ult of those two writes is as expected? Does it merge those two
> operations, or synchronously issue those writes to the disk? If the latter,
> does the file system insert some other operations, like an io barrier, between
> those two writes so that the underlying storage system is aw
Just like a regular block device, re-orders are permitted between
write barriers/flushes. For example, if I had a HDD with 512 byte
sectors and I attempted to write 4K, there is no guarantee what the
disk will look like if you had a crash mid-write or if you
concurrently issued an overlapping write
2017-05-19 23:40:14.150887 451238'20721095
>2017-05-17 21:25:09.174598
>
> Greets,
> Stefan
>
> On 22.05.2017 at 15:32, Jason Dillaman wrote:
>> If you have the debug symbols installed, I'd say "thread apply all bt"
>> in addition to a "
But if I don't create an ext4 filesystem first, the snap can roll back with
> exclusive-lock enabled successfully.
>
> So can you explain it?
>
> -----Original Message-----
> From: Jason Dillaman [mailto:jdill...@redhat.com]
> Sent: May 22, 2017 19:46
> To: lijie 11803 (RD)
> Cc: Sage Weil; ce
Hello Jason,
>
> should I do a coredump or a thread apply all bt?
>
> Don't know what is better.
>
> Greets,
> Stefan
>
> On 22.05.2017 at 15:19, Jason Dillaman wrote:
>> If you cannot recreate with debug logging enabled, that might be the
>> next best opt
after an osd restart. Any further ideas?
>
> Coredump of the OSD with hanging scrub?
>
> Greets,
> Stefan
>
> On 18.05.2017 at 17:26, Jason Dillaman wrote:
>> I'm unfortunately out of ideas at the moment. I think the best chance
>> of figuring out what is wrong is to
That's by design -- it doesn't make sense to live-rollback a block
device when you have a running VM actively accessing the device. Once
you shut down the VM, rollback should work just fine.
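For example, once the VM is stopped (image/snapshot names are placeholders):

  rbd snap rollback rbd/myimage@snap1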
On Mon, May 22, 2017 at 5:29 AM, Lijie wrote:
> Hi All,
>
> When I do a snap rollback command with exclus
> Greets,
> Stefan
>
> On 17.05.2017 at 21:26, Stefan Priebe - Profihost AG wrote:
>> On 17.05.2017 at 21:21, Jason Dillaman wrote:
>>> Any chance you still have debug logs enabled on OSD 23 after you
>>> restarted it and the scrub froze again?
>>
>> N
226 2017-05-10 03:43:20.849784 171715'10548192
>2017-05-04 14:27:39.210713
>
> So it seems the same scrub is stuck again... even after restarting the
> osd. It just took some time until the scrub of this pg happened again.
>
> Greets,
> Stefan
> Am 17.05.2017 um
Stefan Priebe - Profihost AG:
>> Hi,
>>
>> that command does not exist.
>>
>> But at least ceph -s permanently reports 1 pg in scrubbing with no change.
>>
>> Log attached as well.
>>
>> Greets,
>> Stefan
>> Am 17.05.2017 um 20:20
b Stefan Priebe - Profihost AG:
>> Hello Jason,
>>
>> the command
>> # rados -p cephstor6 rm rbd_data.21aafa6b8b4567.0aaa
>>
>> hangs as well. Doing absolutely nothing... waiting forever.
>>
>> Greets,
>> Stefan
>>
>> Am
"linger_id": 1,
> "pg": "2.f0709c34",
> "osd": 23,
> "object_id": "rbd_header.21aafa6b8b4567",
> "object_locator": "@2",
> "targe
On Wed, May 17, 2017 at 10:25 AM, Stefan Priebe - Profihost AG
wrote:
> issue the delete request and send you the log?
Yes, please.
--
Jason
On Wed, May 17, 2017 at 10:21 AM, Stefan Priebe - Profihost AG
wrote:
> You mean the request no matter if it is successful or not? Which log
> level should be set to 20?
I'm hoping you can re-create the hung remove op when OSD logging is
increased -- "debug osd = 20" would be nice if you can tur
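A sketch of raising the log level without a restart (the OSD id is taken from
earlier in the thread; adjust as needed):

  ceph tell osd.23 injectargs '--debug-osd 20'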
all debug symbols and issue a gdb: "thread apply all bt full"?
>
> Does it help?
>
> Greets,
> Stefan
>
> On 17.05.2017 at 15:12, Jason Dillaman wrote:
>> Perfect librbd log capture. I can see that a remove request to object
>> rbd_data.e10ca56b8b4567.
The VM was running until 2017-05-17 12:10 but there was no I/O for 10 min.
>
> Greets,
> Stefan
>
> On 16.05.2017 at 22:54, Jason Dillaman wrote:
>> It looks like it's just a ping message in that capture.
>>
>> Are you saying that you restarted OSD 46 and the
"target_object_locator": "@5",
> "paused": 0,
> "used_replica": 0,
> "precalc_pgid": 0,
> "snapid": "head",
> "registered": "1"
On Tue, May 16, 2017 at 3:37 PM, Stefan Priebe - Profihost AG
wrote:
> We've enabled the op tracker for performance reasons while using SSD
> only storage ;-(
Disabled you mean?
> Can I enable the op tracker using ceph osd tell? Then reproduce the
> problem and check what is stuck again? Or should
> ]
> }
> ],
> "linger_ops": [
> {
> "linger_id": 1,
> "pg": "5.5f3bd635",
> "osd": 17,
> "object_id": "rbd_header.e10ca56b
On Tue, May 16, 2017 at 2:12 AM, Stefan Priebe - Profihost AG
wrote:
> 3.) it still happens on pre jewel images even when they got restarted /
> killed and reinitialized. In that case they have the asok socket available
> for now. Should I issue any command to the socket to get logs out of the
> hang
>> > [iometer_just_write]
>> > stonewall
>> > bs=4M
>> > rw=write
>> >
>> > [iometer_just_read]
>> > stonewall
>> > bs=4M
>> > rw=read
>> > """
>> >
>> > Then let it run:
>> > $
On Mon, May 15, 2017 at 3:54 PM, Stefan Priebe - Profihost AG
wrote:
> Would it be possible that the problem is the same you fixed?
No, I would not expect it to be related to the other issues you are
seeing. The issue I just posted a fix against only occurs when a
client requests the lock from th
>>>
>>> The problem only seems to occur at all if a client has connected to
>>> hammer without exclusive lock. Than got upgraded to jewel and exclusive
>>> lock gets enabled.
>>>
>>> Greets,
>>> Stefan
>>>
>>> Am 14.05.2017 um