I have just deployed a cluster and started messing with it, which I
think has two replicas. However, when I have a metadata server and mount
via fuse, it is reporting its full size. With two replicas, I thought it
would be reporting only half of that. Did I make a mistake, or is there
something
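(For reference: CephFS clients report the cluster's raw capacity via
statfs, not the raw figure divided by the replication factor, so a
full-size df is expected. To see both views:

    ceph df

The GLOBAL section shows raw capacity; each pool's MAX AVAIL already
accounts for that pool's replica count.)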
When I add my ceph system to fstab, I can mount by referencing it,
but when I restart the system it stops during boot because the mount
failed. I am guessing it is because fstab is run before the network
starts? Using CentOS 7.
thanks for the help,
Dan
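(A common fix here, sketched with placeholder paths and assuming a
ceph-fuse mount, is to mark the entry as a network filesystem so
CentOS 7 / systemd waits for the network before mounting:

    none  /mnt/ceph  fuse.ceph  ceph.id=admin,_netdev,defaults  0 0

or, for a kernel-client mount:

    ceph-0:6789:/  /mnt/ceph  ceph  name=admin,secretfile=/etc/ceph/admin.secret,_netdev,noatime  0 2

The _netdev option is what defers the mount until networking is up; the
id, mount point, and secretfile path above are examples, not your
actual values.)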
Thanks John,
I just wanted to make sure I wasn't doing anything wrong; that should
work fine.
Dan
On 06/14/2016 03:24 PM, John Spray wrote:
On Tue, Jun 14, 2016 at 7:45 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
I have just deployed a cluster and started messing with it, w
I am starting to work with and benchmark our ceph cluster. While
throughput is so far looking good, metadata performance looks to
be suffering. Is there anything that can be done to speed up the
response time of looking through a lot of small files and folders?
Right now, I am
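(If the MDS turns out to be the bottleneck, one Jewel-era knob is the
inode/dentry cache size; a larger cache avoids repeated metadata
fetches when walking many small files. A sketch, with the daemon name
and value as examples only:

    ceph daemon mds.ceph-0 config set mds_cache_size 1000000

The default is 100000 inodes; each cached inode costs memory on the MDS
host, so raise it only as far as RAM allows.)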
I just added two nodes to our cluster, for the first time since we have
really had any data on it to speak of. Each node has two rather large
RAID arrays. Rebalancing is taking a very long time, estimated in
weeks, to complete. Are there any tunables/procedures to speed this
up? Network
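(The usual levers for rebalance speed are the backfill/recovery
throttles, at the cost of client I/O while it runs. Values below are
examples only:

    ceph tell osd.* injectargs '--osd-max-backfills 4 --osd-recovery-max-active 8'

The defaults are deliberately conservative; settings this high can make
client traffic noticeably slower until the rebalance finishes.)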
ceph daemonperf mds.ceph-0
-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
rlat inos caps|hsr  hcs  hcr |writ read actv|recd recy stry purg|segs evts subm|
  0  336k  97k|  0    0    0 |  0    0   20 |  0    0  246k   0 | 31   27k   0
  0  336k  97k|  0    0    0
Thanks John,
I think that has resolved the problems.
Dan
On 03/04/2017 09:08 AM, John Spray wrote:
On Fri, Mar 3, 2017 at 9:48 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
ceph daemonperf mds.ceph-0
-----mds------ --mds_server-- ---objecter--- -----mds_cache----- ---mds_log----
Thanks for the suggestion; however, I think my more immediate problem is
the ms_handle_reset messages. I do not think the mds are getting the
updates when I send them.
Dan
On 03/04/2017 09:08 AM, John Spray wrote:
On Fri, Mar 3, 2017 at 9:48 PM, Daniel Davidson
<dani...@igb.illinois.
incorrectly somewhere, but I do not
know where to look.
Dan
On 03/06/2017 09:05 AM, John Spray wrote:
On Mon, Mar 6, 2017 at 3:03 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
Thanks for the suggestion; however, I think my more immediate problem is the
ms_handle_reset messages
We have a weird issue. Whenever we compile Ruby, and only Ruby, in a
location served by cephfs, the node in our cluster (not the ceph node)
will crash. This always happens, even if we do not use a PXE-bootable
node like the head/management node. If we compile to local disk, it
will succeed.
the right number for your environment.
Good Luck :)
On Mon, May 8, 2017 at 5:43 PM Daniel Davidson
<dani...@igb.illinois.edu> wrote:
Our ceph system performs very poorly, or not at all, while the
remapping procedure is underway. We are u
the daemon and stopping any operations it's working
on. Also, while it's down, the secondary OSDs for the PG should be
able to handle the requests that are blocked. Check its log to see
what it's doing.
You didn't answer what your size and min_size are for your 2 pools.
On Fri, J
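(For reference, size and min_size can be read per pool; "data" here is
a placeholder pool name:

    ceph osd pool get data size
    ceph osd pool get data min_size

or `ceph osd pool ls detail` to see every pool at once.)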
e 2 inactive PGs. Not sure yet if that is
anything of concern, but didn't want to ignore it.
On Fri, Jun 23, 2017 at 1:16 PM Daniel Davidson
<dani...@igb.illinois.edu> wrote:
Two of our OSD systems hit 75% disk utilization, so I added another
system to try and bring that back down. The system was usable for a day
while the data was being migrated, but now the system is not responding
when I try to mount it:
mount -t ceph ceph-0,ceph-1,ceph-2,ceph-3:6789:/ /home
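(When a mount hangs like this mid-backfill, a first check from a
monitor node, assuming admin access, is whether requests are blocked:

    ceph -s
    ceph health detail

Slow/blocked requests on the OSDs doing the data movement will stall
CephFS clients waiting on those PGs.)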
Our ceph system performs very poorly, or not at all, while the
remapping procedure is underway. We are using replica 2 and the
following ceph tweaks while it is in process:
ceph tell osd.* injectargs '--osd-recovery-max-active 20'
ceph tell osd.* injectargs
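(Worth noting: raising osd-recovery-max-active pushes recovery harder
and usually makes client I/O worse, not better. If the goal is keeping
the cluster usable during the remap, the usual move is the opposite
direction; values below are examples:

    ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

Recovery then takes longer, but clients keep more of the I/O budget.)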
may be lost (~mds0/stray7)
Dan
On 10/25/2017 08:54 AM, Daniel Davidson wrote:
Thanks for the information.
I did:
# ceph daemon mds.ceph-0 scrub_path / repair recursive
Saw in the logs it finished
# ceph daemon mds.ceph-0 flush journal
Saw in the logs it finished
# ceph mds fail 0
# ceph mds
ame is how
you would refer to the daemon from systemd; it's often set to the
hostname where the daemon is running by default.
John
On Wed, Oct 25, 2017 at 2:30 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
I do have a problem with running the commands you mentioned to repair the
mds.
Dan
On 10/25/2017 03:55 AM, John Spray wrote:
On Tue, Oct 24, 2017 at 7:14 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
Our ceph system is having a problem.
A few days ago we had a pg that was marked as inconsistent, and today I
fixed it with a:
# ceph pg repair 1.37c
then
Any idea why that is not working?
Dan
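(One way to see whether a repair actually addressed an inconsistency is
to dump what scrub recorded for that PG; the pg id here is the one from
the repair above:

    rados list-inconsistent-obj 1.37c --format=json-pretty

This lists the objects and shards scrub flagged, and the error on
each.)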
On 10/25/2017 06:45 AM, Daniel Davidson wrote:
John, thank you so much. After doing the initial rados command you
mentioned, it is back up and running. It did complain about a bunch of
files, which frankly are not important, having duplicate inodes, but I
may be lost
(~mds0/stray7)
2017-10-26 05:03:17.661711 7f1c598a6700 -1 mds.0.damage notify_dirfrag
Damage to fragment * of ino 607 is fatal because it is a system
directory for this rank
I would be grateful for any help with the repair,
Dan
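(For reference, the damage table the MDS is acting on can be inspected
and, once the underlying objects are fixed, cleared per entry via the
admin socket; daemon name as used earlier in the thread:

    ceph daemon mds.ceph-0 damage ls
    ceph daemon mds.ceph-0 damage rm <damage_id>

damage rm only removes the bookkeeping entry; it repairs nothing by
itself.)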
On 10/25/2017 04:17 PM, Daniel Davidson wrote:
A bit more
, Daniel Davidson wrote:
I increased the logging of the mds to try and get some more
information. I think the relevant lines are:
2017-10-26 05:03:17.661683 7f1c598a6700 0 mds.0.cache.dir(607)
_fetched missing object for [dir 607 ~mds0/stray7/ [2,head] auth
v=108918871 cv=0/0 ap=1+0+0 state
5643 mon.0 [INF] fsmap e121619: 0/1/1 up, 1 damaged
2017-10-25 00:02:10.182101 mon.0 [INF] mds.? 172.16.31.1:6809/2991612296
up:boot
2017-10-25 00:02:10.182189 mon.0 [INF] fsmap e121620: 0/1/1 up, 1
up:standby, 1 damaged
What should I do next? ceph fs reset igbhome scares me.
Dan
On 10/24/
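(A less drastic step than `ceph fs reset` when the fsmap shows a
damaged rank is to clear the damaged flag so a standby can take the
rank again, once the metadata problem has actually been dealt with;
0 is the rank shown damaged above:

    ceph mds repaired 0)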
a lot of messages like:
2017-10-24 21:24:10.910489 7f775e539bc0 1 scavenge_dentries: frag
607. is corrupt, overwriting
The frag number is the same for every line and there have been thousands.
I really could use some assistance,
Dan
On 10/24/2017 12:14 PM, Daniel Davidson wrote:
Hello,
Today we had a node crash, and looking at it, it seems there is a
problem with the RAID controller, so it is not coming back up, maybe
ever. It corrupted the local filesystem for the ceph storage there.
The remainder of our storage (10.2.10) cluster is running, and it looks
to be
ate of your
cluster. Most notable is `ceph status`, but `ceph osd tree` would be
helpful. What are the sizes of the pools in your cluster? Are they all
size=3 min_size=2?
On Fri, May 11, 2018 at 12:05 PM Daniel Davidson
<dani...@igb.illinois.edu> wro
and then make the system go down?
thanks again for all of your help,
Dan
On 10/26/2017 09:23 AM, John Spray wrote:
On Thu, Oct 26, 2017 at 12:40 PM, Daniel Davidson
<dani...@igb.illinois.edu> wrote:
And at the risk of bombing the mailing list, I can also see that the
stray7_head o