plicated across all three, with
the hope that this sort of thing would not be fatal. It's a Jewel system with
that version's default of 1 for "mon osd min down reporters".
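For reference, raising that threshold is a one-line ceph.conf change (a sketch; 2 is an illustrative value, the Jewel default being the 1 mentioned above):

```ini
[mon]
; Require failure reports from at least this many distinct OSDs before
; the monitor marks a peer OSD down (Jewel default: 1).
mon osd min down reporters = 2
```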
--
Bryan Henderson San Jose, California
the people who buy those are not aware that it's designed never to go down
and if something breaks while the system is coming up, a repair action may be
necessary before data is accessible again.
hat possible?
A related question: If I mark an OSD down administratively, does it stay down
until I give a command to mark it back up, or will the monitor detect signs of
life and declare it up again on its own?
that
to the monitor, which would believe it within about a minute and mark the OSDs
down. ("osd heartbeat interval", "mon osd min down reports", "mon osd min down
reporters", "mon osd reporter subtree level").
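For anyone tuning these, a ceph.conf fragment with what I believe are the Jewel-era defaults (check your version's documentation; these are from memory):

```ini
[osd]
osd heartbeat interval = 6            ; seconds between peer-to-peer pings

[mon]
mon osd min down reports = 3          ; failure reports needed before marking down
mon osd min down reporters = 1        ; distinct reporters required
mon osd reporter subtree level = host ; CRUSH level at which reporters are counted
```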
d down, which is
pretty complicated, at
http://docs.ceph.com/docs/master/rados/configuration/mon-osd-interaction/
It just doesn't seem to match the implementation.
of mon_osd_report_timeout),
it marks it down. But it didn't. I did "osd down" commands for the dead OSDs
and the status changed to down and I/O started working.
And wouldn't even 15 minutes of grace be unacceptable if it means I/Os have to
wait that long before falling back to a redundant OSD?
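(For context, mon_osd_report_timeout defaults to 900 seconds in Jewel, which is where the 15 minutes comes from. Lowering it is a sketch like the following; 300 is an arbitrary illustrative value:)

```ini
[mon]
; Mark an OSD down if the monitor receives no report from it for this many
; seconds (Jewel default: 900, i.e. the 15-minute grace period).
mon osd report timeout = 300
```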
ons. Without such options, in
the one case I tried, Linux 4.9, blocksize was 32K. Maybe it's affected by
the server or by the filesystem the NFS server is serving. This was NFS 3.
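A quick way to check what any given mount reports is to look at st_blksize directly, since that is what stat(2) hands back to applications (run it against a file on the filesystem of interest; the mktemp file here is just a stand-in):

```shell
# Print the preferred I/O block size stat() reports for a file.
# On the NFS 3 mount described above this came back as 32768 (32K);
# local filesystems typically report 4096.
f=$(mktemp)
stat --format='st_blksize = %o' "$f"
rm -f "$f"
```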
> This patch should address this issue [massive reads of e.g. /dev/urandom]:
Thanks!
> mount option should wor
's layout. In the default layout,
stripe unit size is 4 MiB.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
stat block size been discussed much? Is there a good reason that it's
the RADOS object size?
I'm thinking of modifying the cephfs filesystem driver to add a mount option
to specify a fixed block size to be reported for all files, and using 4K or
64K. Would that break something?
Is it possible to search the mailing list archives?
http://lists.ceph.com/pipermail/ceph-users-ceph.com/
seems to have a search function, but in my experience never finds anything.
empty, while also giving an empty list of its
contents.
tem driver is the same way - I have
to tell it how big a write it can do; it can't figure it out from the OSDs.
So maybe it's a fundamental architecture thing.
? Is this a job for
cephfs-journal-tool event recover_dentries
cephfs-journal-tool journal reset
?
This is Jewel.
filesystem is offline)
> If the active MDS is connected to a monitor and they fail at the same time,
> the monitors can't replace the mds until they've been through their own
> election and a full mds timeout window.
So how long are we talking?
er happened.
This failure to restart happened after the MDS crashed, and I lost any
messages that would tell me why it crashed. I'll fix that and turn up
verbosity and if it happens again, I'll have a better idea how the zeroes got
there.
rectly written?
I'm looking at this because I have an MDS that will not start because there
is junk (zeroes) in that space after where the log header says the log ends,
so replay of the log fails there.
belong on another
OSD (which I guess it ought to, since the OSD is out), ceph-objectstore-tool is
what you would use to move them over there manually, since ordinary peering
can't do it.
system for
these clients.
that there are many more bugs in the 3.16 cephfs
filesystem driver waiting for me. Indeed, I've seen panics not yet explained.
So what are other people using? A less stable kernel? An out-of-tree driver?
FUSE? Is there a working process for getting known bugs fixed in 3.16?
> Kill all MDS first, create a new fs with the old pools, then run ‘fs reset’
> before starting any MDS.
Brilliant! I can't wait to try it.
Thanks.
it takes along with 'ceph-objectstore-tool --op update-mon-db' to recover
from a lost cluster map.
u've recovered access to the OSDs?
inside the cluster), and
the requests aren't just blocked for a long time; they're blocked
indefinitely. The only time I've seen it is when I brought the cluster up in
a different order than I usually do. So I'm just trying to understand the
inner workings in case I nee
I recently had some requests blocked indefinitely; I eventually cleared it
up by recycling the OSDs, but I'd like some help interpreting the log messages
that supposedly give a clue as to what caused the blockage:
(I reformatted for easy email reading)
2018-05-03 01:56:35.248623 osd.0
My cluster got stuck somehow, and at one point in trying to recycle things to
unstick it, I ended up shutting down everything, then bringing up just the
monitors. At that point, the cluster reported the status below.
With nothing but the monitors running, I don't see how the status can say
there
would I be taking if I just haphazardly killed everything instead of
orchestrating a shutdown?
than that, and what happens if the maximum I set is too low
to cover those necessary old pgmaps?
h_kvstore_tool after shutting down the monitor, I see hundreds of
keys.
So what does the monitor have to store to do a "status" command?
I've seen clues that the activity has to do with Paxos elections, but I'm
fuzzy on why elections would be happening or why they would need a persistent
someone searching the archives for
memory usage information.
o real memory
or paging rate rlimit. As it stands, any normal shell on my systems has an
address space limit of 256M, which has never been a problem before, but is
majorly inconvenient now.
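A workaround I'd consider in the meantime (a sketch; 1 GiB is an arbitrary value, and raising the soft limit only works if the hard limit permits it):

```shell
# Run one command under a larger (1 GiB, expressed in KiB) address-space
# limit inside a subshell, leaving the invoking shell's 256M limit alone.
# Put the real command (e.g. a ceph CLI invocation) where `ulimit -v` is
# queried below; here it just prints the raised limit.
( ulimit -v 1048576; ulimit -v )
```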
specific command I'm doing, and it does this even when
there is no ceph cluster running, so it must be something pretty basic.