Is there any known connection with the previous discussions 'Hit suicide
timeout after adding new osd' or 'Ceph unstable on XFS'?
-Original Message-
From: ceph-devel-ow...@vger.kernel.org
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sage Weil
Sent: January 22, 2013 14:06
To:
On 01/22/2013 07:12 AM, Yehuda Sadeh wrote:
On Mon, Jan 21, 2013 at 10:05 PM, Sage Weil s...@inktank.com wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
minutes. After 3 minutes (180s),
On 01/22/2013 12:15 AM, John Nielsen wrote:
Thanks all for your responses! Some comments inline.
On Jan 20, 2013, at 10:16 AM, Wido den Hollander w...@widodh.nl wrote:
On 01/19/2013 12:34 AM, John Nielsen wrote:
I'm planning a Ceph deployment which will include:
10Gbit/s
Wido den Hollander wrote:
One thing is still having multiple Varnish caches and object banning. I
proposed something for this some time ago, some hook in RGW you could
use to inform an upstream cache to purge something from its cache.
Hopefully not Varnish-specific; something like the
Hi Guys,
We've got an article up looking at performance of CFQ, Deadline, and
NOOP IO schedulers with Ceph on the SAS2208. I won't claim that these
results are universally applicable to other controllers and disk setups,
but they might be interesting if you've been trying to determine what
Hi,
(http://permalink.gmane.org/gmane.comp.file-systems.ceph.devel/12316)
Hopefully not Varnish-specific; something like the Last-Modified header
would be good.
Also there are tricks you can do with queries; see for instance
http://forum.nginx.org/read.php?2,1047,1052
It seems like a good
Assuming that the clone is atomic so that the client only ever grabbed
a complete old or new version of the file, that method really seems
ideal. How much work/time would that be?
The objects will likely average around 10-20MB, but it's possible that
in some cases they may grow to a few hundred
I had thought about doing something like that, but I'm not sure how to
do it in a race-free way. For example, if I were to set 'done=yes' on a
file and then check that before trying to download the file, the instant
I try to download the file the writer of the file could remove the
xattr and start
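The window is easy to see in code. Below is a minimal sketch of that racy
check-then-read pattern using the librados C API (the object name, xattr
name and error handling are illustrative assumptions, not code from this
thread); the comment notes the clone-based fix discussed here.

  /* Racy check-then-read, sketched with the librados C API. */
  #include <rados/librados.h>
  #include <string.h>

  int read_if_done(rados_ioctx_t io, const char *oid, char *buf, size_t len)
  {
      char flag[8] = "";

      /* Step 1: check the 'done' xattr. */
      if (rados_getxattr(io, oid, "done", flag, sizeof(flag)) < 0 ||
          strncmp(flag, "yes", 3) != 0)
          return -1;  /* not marked complete */

      /* RACE: between the check above and the read below, the writer can
       * remove the xattr and start rewriting the object, so a partial new
       * version may still be read.  The fix discussed in this thread: the
       * writer fills a temporary object and atomically clones it over the
       * final name, so readers only ever see a complete old or new copy. */
      return rados_read(io, oid, buf, len, 0);
  }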
Wido den Hollander wrote:
Now, running just one Varnish instance which does load balancing
over multiple RGW instances is not a real problem. When it sees a PUT
operation it can purge (called banning in Varnish) the object from
its cache.
When looking at the scenario where you have
On Tue, 22 Jan 2013, Nick Bartos wrote:
Assuming that the clone is atomic so that the client only ever grabbed
a complete old or new version of the file, that method really seems
ideal. How much work/time would that be?
The objects will likely average around 10-20MB, but it's possible that
On Tue, 22 Jan 2013, Sage Weil wrote:
On Tue, 22 Jan 2013, Nick Bartos wrote:
Assuming that the clone is atomic so that the client only ever grabbed
a complete old or new version of the file, that method really seems
ideal. How much work/time would that be?
The objects will likely
On Tuesday, January 22, 2013 at 5:12 AM, Wido den Hollander wrote:
On 01/22/2013 07:12 AM, Yehuda Sadeh wrote:
On Mon, Jan 21, 2013 at 10:05 PM, Sage Weil s...@inktank.com
(mailto:s...@inktank.com) wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd
On Monday, January 21, 2013 at 5:44 AM, Loic Dachary wrote:
On 01/21/2013 12:02 AM, Gregory Farnum wrote:
On Sunday, January 20, 2013 at 5:39 AM, Loic Dachary wrote:
Hi,
While working on unit tests for Throttle.{cc,h} I tried to
On 01/22/2013 12:05 AM, Sage Weil wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
minutes.
...
FWIW I see this often enough on cheap sata drives: they have a failure
mode that makes the sata driver
Hi All
On 20/01/13 11:13, Constantinos Venetsanopoulos wrote:
Hello Loic, Sebastien, Patrick,
that's great news! I'm sure we'll have some very interesting stuff
to talk about. Saturday 14:00 @ K.3.201 also seems fine.
I'm attending FOSDEM as
On Tuesday, January 22, 2013 at 10:24 AM, Gandalf Corvotempesta wrote:
Hi all,
I'm trying my very first ceph installation following the 5-minute quickstart:
http://ceph.com/docs/master/start/quick-start/#install-debian-ubuntu
just a question: why is ceph asking me for an SSH password? Is ceph
If it is the command 'mkcephfs' that asked you for an ssh password, then
that is probably because that script needs to push some files
(e.g., ceph.conf) to other hosts. If we open that script, we can see that
it uses 'scp' to send some files. If I remember correctly, for every osd
at other hosts,
Thanks! Is it safe to just apply that last commit to 0.56.1? Also,
is the rados command 'clonedata' instead of 'clone'? That's what it
looked like in the code.
On Tue, Jan 22, 2013 at 9:27 AM, Sage Weil s...@inktank.com wrote:
On Tue, 22 Jan 2013, Nick Bartos wrote:
Assuming that the clone
The variable str is used as both the source and the destination in a call
to snprintf(), which is undefined behavior according to C11. The original
description in C11 is:
If copying takes place between objects that
overlap, the behavior is undefined.
And, the function of
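A minimal standalone example of the issue (an assumed illustration, not
the patched code itself): the commented-out call passes str as both the
destination and a source, which is exactly the overlap C11 forbids; the
safe variant appends by tracking a write offset instead.

  #include <stdio.h>

  int main(void)
  {
      char str[64];
      size_t off = 0;

      /* BROKEN: snprintf(str, sizeof(str), "%s.0", str);
       * str is both the destination and a source, so copying takes
       * place between overlapping objects -- undefined behavior. */

      /* Safe: build the string by appending at a tracked offset. */
      off += snprintf(str + off, sizeof(str) - off, "%s", "osd");
      off += snprintf(str + off, sizeof(str) - off, ".%d", 0);

      printf("%s\n", str);  /* prints "osd.0" */
      return 0;
  }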
Out of interest, would people prefer that the Ceph deployment script
didn't try to handle server-server file copy and just did the local
setup only, or is it useful that it tries to be a mini-config
management tool at the same time?
Neil
On Tue, Jan 22, 2013 at 10:46 AM, Xing Lin
Hi,
Since I have ceph in prod, I experienced a memory leak in the OSDs,
forcing me to restart them every 5 or 6 days. Without that the OSD
process just grows infinitely and eventually gets killed by the OOM
killer. (To make sure it wasn't legitimate, I left one grow up to 4G
of RSS ...).
Here's for
Hi list,
In a mixed SSD SATA setup (5 or 8 nodes each holding 8x SATA and 4x
SSD) would it make sense to skip having journals on SSD or is the
advantage of doing so just too great? We're looking into having 2 pools,
sata and ssd and will be creating guests belonging into either of these
On Tue, 22 Jan 2013, Nick Bartos wrote:
Thanks! Is it safe to just apply that last commit to 0.56.1? Also,
is the rados command 'clonedata' instead of 'clone'? That's what it
looked like in the code.
Yep, and yep!
s
On Tue, Jan 22, 2013 at 9:27 AM, Sage Weil s...@inktank.com wrote:
I like the current approach. I think it is more convenient to run
commands once at one host to do all the setup work. The first time
I deployed a ceph cluster with 4 hosts, I thought 'service ceph start'
would start the whole ceph cluster. But as it turns out, it only starts
the local osd,
On 01/21/2013 12:19 AM, Gandalf Corvotempesta wrote:
2013/1/21 Gregory Farnum g...@inktank.com:
I'm not quite sure what you mean…the use of the cluster network and public
network is really just intended as a convenience for people with multiple NICs on their box.
There's nothing preventing
On Jan 17, 2013, at 11:19 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2013/1/17 Atchley, Scott atchle...@ornl.gov:
10GbE should get close to 1.2 GB/s compared to 1 GB/s for IB SDR. Latency
again depends on the Ethernet driver.
10GbE faster than IB SDR? Really ?
On Jan 22, 2013, at 4:06 PM, Atchley, Scott atchle...@ornl.gov wrote:
On Jan 17, 2013, at 11:19 AM, Gandalf Corvotempesta
gandalf.corvotempe...@gmail.com wrote:
2013/1/17 Atchley, Scott atchle...@ornl.gov:
10GbE should get close to 1.2 GB/s compared to 1 GB/s for IB SDR. Latency
again
The '-a/--allhosts' parameter is to spread the command across the
cluster...that is, service ceph -a start will start across the cluster.
On 01/22/2013 01:01 PM, Xing Lin wrote:
I like the current approach. I think it is more convenient to run
commands once at one host to do all the setup
On 01/22/2013 01:59 PM, martin wrote:
Hi list,
In a mixed SSD SATA setup (5 or 8 nodes each holding 8x SATA and 4x
SSD) would it make sense to skip having journals on SSD or is the
advantage of doing so just too great? We're looking into having 2 pools,
sata and ssd and will be creating guests
I did not notice that there exists such a parameter. Thanks, Dan!
Xing
On 01/22/2013 02:11 PM, Dan Mick wrote:
The '-a/--allhosts' parameter is to spread the command across the
cluster...that is, service ceph -a start will start across the cluster.
Hi,
I originally started a thread around these memory leaks problems here:
http://www.mail-archive.com/ceph-devel@vger.kernel.org/msg11000.html
I'm happy to see that someone supports my theory about the scrubbing
process leaking the memory. I only use RBD from Ceph, so your theory
makes sense as
Mark Nelson wrote:
It may (or may not) help to use a power-of-2 number of PGs. It's
generally a good idea to do this anyway, so if you haven't set up your
production cluster yet, you may want to play around with this. Basically
just take whatever number you were planning on using and round it up
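Rounding up is a one-liner. Here is a small sketch (mine, not from the
article) of the usual bit-smearing trick, fed with a hypothetical PG
count from the common OSDs * 100 / replicas heuristic:

  #include <stdint.h>
  #include <stdio.h>

  /* Round n up to the next power of two. */
  static uint32_t next_pow2(uint32_t n)
  {
      if (n == 0)
          return 1;
      n--;
      n |= n >> 1;  n |= n >> 2;  n |= n >> 4;
      n |= n >> 8;  n |= n >> 16;
      return n + 1;
  }

  int main(void)
  {
      /* e.g. 100 OSDs * 100 PGs / 3 replicas ~= 3333 planned PGs */
      uint32_t planned = 3333;
      printf("%u -> %u\n", planned, next_pow2(planned));  /* 3333 -> 4096 */
      return 0;
  }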
On 10/12/12 09:53, Gregory Farnum wrote:
[...]
I love the idea of btrfs supporting encryption natively
much like it does compression. It may be some time before
that happens, so in the meantime, I'd love to see Ceph
support dm-crypt and/or
Hi,
I don't really want to try the mem profiler, I had quite a bad
experience with it on a test cluster. While running the profiler some
OSD crashed...
The only way to fix this is to provide a heap dump. Could you provide one?
I just did:
ceph osd tell 0 heap start_profiler
ceph osd tell 0
Well ideally you want to run the profiler during the scrubbing process
when the memory leaks appear :-).
--
Regards,
Sébastien Han.
On Tue, Jan 22, 2013 at 10:32 PM, Sylvain Munaut
s.mun...@whatever-company.com wrote:
Hi,
I don't really want to try the mem profiler, I had quite a bad
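For reference, the OSD's 'heap' commands drive tcmalloc's heap profiler,
and the same entry points can be exercised in a standalone program. A
sketch assuming gperftools is installed (the file prefix is hypothetical;
link with -ltcmalloc):

  #include <gperftools/heap-profiler.h>
  #include <stdlib.h>

  int main(void)
  {
      /* Roughly what 'heap start_profiler' triggers inside the daemon:
       * tcmalloc begins writing /tmp/osd.0.profile.NNNN.heap files. */
      HeapProfilerStart("/tmp/osd.0.profile");

      void *grow = malloc(1 << 20);  /* stand-in for scrub-time growth */
      (void)grow;

      HeapProfilerDump("after-scrub");  /* like 'heap dump' */
      HeapProfilerStop();               /* like 'heap stop_profiler' */
      return 0;
  }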
On 01/22/2013 07:32 PM, James Page wrote:
Hi All
On 20/01/13 11:13, Constantinos Venetsanopoulos wrote:
Hello Loic, Sebastien, Patrick,
that's great news! I'm sure we'll have some very interesting stuff
to talk about. Saturday 14:00 @ K.3.201 also seems fine.
I'm attending FOSDEM as
Stefan Priebe wrote:
Hi,
Am 22.01.2013 22:26, schrieb Jeff Mitchell:
Mark Nelson wrote:
It may (or may not) help to use a power-of-2 number of PGs. It's
generally a good idea to do this anyway, so if you haven't set up your
production cluster yet, you may want to play around with this.
On 01/22/2013 03:50 PM, Stefan Priebe wrote:
Hi,
Am 22.01.2013 22:26, schrieb Jeff Mitchell:
Mark Nelson wrote:
It may (or may not) help to use a power-of-2 number of PGs. It's
generally a good idea to do this anyway, so if you haven't set up your
production cluster yet, you may want to play
A few very minor changes to the rbd code:
- RBD_MAX_OPT_LEN is unused, so get rid of it
- Consolidate rbd options definitions
- Make rbd_segment_name() return pointer to const char
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 17 -
1 file
The return type of rbd_get_num_segments() is int, but the values it
operates on are u64. Although it's not likely, there's no guarantee
the result won't exceed what can be represented in an int. The
function is already designed to return -ERANGE on error, so just add
this possible overflow as
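A simplified sketch of that guard (an illustration, not the actual kernel
code): keep the segment arithmetic in u64 and refuse to narrow any count
that an int cannot hold.

  #include <errno.h>
  #include <limits.h>
  #include <stdint.h>

  /* Assumes len > 0 and seg_size > 0. */
  static int get_num_segments(uint64_t ofs, uint64_t len, uint64_t seg_size)
  {
      uint64_t start_seg = ofs / seg_size;
      uint64_t end_seg = (ofs + len - 1) / seg_size;
      uint64_t count = end_seg - start_seg + 1;

      if (count > (uint64_t)INT_MAX)
          return -ERANGE;  /* not representable in an int */
      return (int)count;
  }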
When an rbd image is initially mapped a watch event is registered so
we can do something if the header object changes. Right now if that
returns ERANGE we loop back and try to initiate it again. However
the code that sets up the watch event doesn't clean up after itself
very well, and doing that
rbd_req_sync_read() is no longer used, so get rid of it.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 24
1 file changed, 24 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 5a8fef4..6193c69 100644
---
Implement a new function to set up or tear down a watch event
for a mapped rbd image header using the new request code.
Create a new object request type nodata to handle this. And
define rbd_osd_trivial_callback() which simply marks a request done.
Signed-off-by: Alex Elder el...@inktank.com
Get rid of rbd_req_sync_watch(), because it is no longer used.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 42 --
1 file changed, 42 deletions(-)
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 3c110b3..7dedd18
Use the new object request tracking mechanism for handling a
notify_ack request.
Move the callback function below the definition of this so we don't
have to do a pre-declaration.
This resolves:
http://tracker.newdream.net/issues/3754
Signed-off-by: Alex Elder el...@inktank.com
---
Get rid of rbd_req_sync_notify_ack() because it is no longer used.
As a result rbd_simple_req_cb() becomes unreferenced, so get rid
of that too.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 33 -
1 file changed, 33 deletions(-)
diff --git
When we receive notification of a change to an rbd image's header
object we need to refresh our information about the image (its
size and snapshot context). Once we have refreshed our rbd image
we need to acknowledge the notification.
This acknowledgement was previously done synchronously, but
Get rid of rbd_req_sync_exec() because it is no longer used. That
eliminates the last use of rbd_req_sync_op(), so get rid of that
too. And finally, that leaves rbd_do_request() unreferenced, so get
rid of that.
Signed-off-by: Alex Elder el...@inktank.com
---
drivers/block/rbd.c | 160
Reviewed-by: Dan Mick dan.m...@inktank.com
On 01/22/2013 01:57 PM, Alex Elder wrote:
A few very minor changes to the rbd code:
- RBD_MAX_OPT_LEN is unused, so get rid of it
- Consolidate rbd options definitions
- Make rbd_segment_name() return pointer to const char
Reviewed-by: Dan Mick dan.m...@inktank.com
On 01/22/2013 01:58 PM, Alex Elder wrote:
The return type of rbd_get_num_segments() is int, but the values it
operates on are u64. Although it's not likely, there's no guarantee
the result won't exceed what can be represented in an int. The
function
On Wed, 23 Jan 2013, Andrey Korolyov wrote:
On Tue, Jan 22, 2013 at 10:05 AM, Sage Weil s...@inktank.com wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
minutes. After 3 minutes (180s),
On Tue, 22 Jan 2013, Dimitri Maziuk wrote:
On 01/22/2013 12:05 AM, Sage Weil wrote:
We observed an interesting situation over the weekend. The XFS volume
ceph-osd locked up (hung in xfs_ilock) for somewhere between 2 and 4
minutes.
...
FWIW I see this often enough on cheap sata
On Tue, 22 Jan 2013, Neil Levine wrote:
Out of interest, would people prefer that the Ceph deployment script
didn't try to handle server-server file copy and just did the local
setup only, or is it useful that it tries to be a mini-config
management tool at the same time?
BTW, you can also
On Tue, Jan 22, 2013 at 6:14 PM, Sage Weil s...@inktank.com wrote:
On Tue, 22 Jan 2013, Neil Levine wrote:
Out of interest, would people prefer that the Ceph deployment script
didn't try to handle server-server file copy and just did the local
setup only, or is it useful that it tries to be a
On Tue, 22 Jan 2013, Asghar Riahi wrote:
Are you familiar with Seagate's Self Encrypting Disk (SED)?
Here are some links which might be useful:
http://smb.media.seagate.com/tag/seagate-sed/
http://csrc.nist.gov/groups/STM/cmvp/documents/140-1/140sp/140sp1299.pdf
Yeah! It would be
On Tue, 22 Jan 2013, James Page wrote:
On 10/12/12 09:53, Gregory Farnum wrote:
[...]
I love the idea of btrfs supporting encryption natively
much like it does compression. It may be some time before
that happens, so in the meantime, I'd
We're having a chat about ceph-deploy tomorrow. We need to strike a
balance between its being a useful tool for standing up a quick
cluster and its ignoring the UNIX philosophy and trying to do too much.
My assumption is that for most production operations, or at the point
where people decide to
On 01/22/2013 01:58 PM, Jeff Mitchell wrote:
I'd be interested in figuring out the right way to migrate an RBD from
one pool to another regardless.
Each way involves copying data, since by definition a different pool
will use different placement groups.
You could export/import with the rbd
On Tue, Jan 22, 2013 at 7:25 PM, Josh Durgin josh.dur...@inktank.com wrote:
On 01/22/2013 01:58 PM, Jeff Mitchell wrote:
I'd be interested in figuring out the right way to migrate an RBD from
one pool to another regardless.
Each way involves copying data, since by definition a different
From: Yan, Zheng zheng.z@intel.com
Patch 1 fixes a readdir bug I introduced; I think it should be included in the
next release.
Patch 2 and patch 3 are non-critical fixes for my previous patches.
Patch 4 modifies the EMetaBlob format to support journalling multiple root
inodes.
The rest
From: Yan, Zheng zheng.z@intel.com
commit 1174dd3188 (don't retry readdir request after issuing caps)
introduced a bug that wrongly marks 'end' in the readdir reply.
The code that touches existing dentries re-uses an iterator, and the
iterator is used for checking whether the readdir has ended.
From: Yan, Zheng zheng.z@intel.com
Commit b03eab22e4 (mds: forbid creating file in deleted directory)
is not complete: mknod, mkdir and symlink are missing. Moving the check
into Server::rdlock_path_xlock_dentry() fixes the issue.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
From: Yan, Zheng zheng.z@intel.com
commit 1203cd2110 (mds: allow open_remote_ino() to open xlocked dentry)
makes Server::handle_client_rename() xlock remote inodes' primary
dentries so the witness MDS can open xlocked dentries. But I added remote
inodes' projected primary dentries to the xlock list.
From: Yan, Zheng zheng.z@intel.com
In some cases (rename, rmdir, subtree map), we may need to journal multiple
root inodes (/, mdsdir) in one EMetaBlob. This patch modifies the EMetaBlob
format to support journaling multiple root inodes.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
From: Yan, Zheng zheng.z@intel.com
If MDCache::handle_discover() receives a 'discover path' request but
cannot find the base inode, it should properly set the 'error_dentry'
to make sure MDCache::handle_discover_reply() checks the correct object's
wait queue.
Signed-off-by: Yan, Zheng
From: Yan, Zheng zheng.z@intel.com
The function will be used by a later patch that fixes rename rollback.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/Server.cc | 74 +--
src/mds/Server.h | 1 +
2 files changed, 46
From: Yan, Zheng zheng.z@intel.com
The reason for the 'had dentry linked to wrong inode' warning is that
Server::_rename_prepare() adds the destdir to the EMetaBlob before
adding the straydir. So when the MDS recovers, the destdir is replayed
first. The old inode is directly replaced by the source
From: Yan, Zheng zheng.z@intel.com
After replaying a slave rename, the non-auth directory that we rename out of
will be trimmed, so there is no need to journal it.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/Server.cc | 26 ++
1 file changed, 10
From: Yan, Zheng zheng.z@intel.com
The MDS should not trim objects in a non-auth subtree immediately after
replaying a slave rename, because the slave rename may require rollback
later and these objects are needed for the rollback.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
From: Yan, Zheng zheng.z@intel.com
The resolve stage serves to disambiguate the fate of uncommitted slave
updates and to resolve subtree authority. The MDS sends a resolve message
that claims subtree authority immediately when the resolve stage is entered.
When receiving a resolve message, the MDS
From: Yan, Zheng zheng.z@intel.com
The main issue with the old slave rename rollback code is that it assumes
all affected objects are in the cache. The assumption is not true
when the MDS does rollback in the resolve stage. This patch removes the
assumption and makes Server::do_rename_rollback() check
From: Yan, Zheng zheng.z@intel.com
The current code sends resolve messages when the resolving MDS set changes.
There is no need to send resolve messages when some MDS leaves the
resolve stage. Sending messages while some MDS are replaying is also
not very useful.
Signed-off-by: Yan, Zheng
From: Yan, Zheng zheng.z@intel.com
A rename may overwrite an empty directory inode and move it into the stray
directory. An MDS that has an auth subtree beneath the overwritten directory
needs to journal the stray dentry when handling the rename slave request.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
From: Yan, Zheng zheng.z@intel.com
_rename_finish() does not send dentry link/unlink messages to replicas.
We should prevent dentries that are modified by the rename operation
from getting new replicas while the rename operation is committing. So
don't mark xlocks done and early reply for
From: Yan, Zheng zheng.z@intel.com
Otherwise the journal entry will revert the effect of any on-going
rename operation for the inode.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/Server.cc | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git
From: Yan, Zheng zheng.z@intel.com
If we journal an opened non-auth inode, then during journal replay the
corresponding entry will add non-auth objects to the cache. But the MDS
does not journal all subsequent modifications (rmdir, rename) to these
non-auth objects, so the code that manages cache and
From: Yan, Zheng zheng.z@intel.com
when replaying EImportStart, we should set/clear the directory's COMPLETE
flag according to the flag in the journal entry.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/MDCache.cc | 5 +++--
src/mds/Migrator.cc| 4 +---
From: Yan, Zheng zheng.z@intel.com
Include remote wrlocks and frozen authpins in the cache rejoin strong message.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/Locker.cc | 4 +--
src/mds/MDCache.cc | 56 +++---
From: Yan, Zheng zheng.z@intel.com
My previous patches add two pointers (ambiguous_auth_inode and
auth_pin_freeze) to class Mutation. They are both used by cross-
authority rename, and both point to the renamed inode. Later patches
need to add more rename-specific state to MDRequest, so just move them
From: Yan, Zheng zheng.z@intel.com
In the resolve stage, if no MDS claims another MDS's ambiguous subtree
import, the subtree's dir_auth is undefined.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/MDCache.cc | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git
From: Yan, Zheng zheng.z@intel.com
The problem with fetching missing inodes from replicas is that replicated
inodes do not have up-to-date rstat and fragstat. So just fetch missing
inodes from disk.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/MDCache.cc | 83
From: Yan, Zheng zheng.z@intel.com
After swallowing extra subtrees, subtree bounds may change, so they
should be re-checked.
Signed-off-by: Yan, Zheng zheng.z@intel.com
---
src/mds/MDCache.cc | 24 +---
1 file changed, 13 insertions(+), 11 deletions(-)
diff --git
From: Yan, Zheng zheng.z@intel.com
The MDS may receive a client request, but find there is an existing
slave request. That means another MDS is handling the same request, so
we should not replace the slave request with a new client request;
just forward the request.
The client request may
From: Yan, Zheng zheng.z@intel.com
Current code skips using {push,pop}_projected_linkage to modify a replica
dentry's linkage. This confuses EMetaBlob::add_dir_context() and makes
it record an out-of-date path when TO_ROOT mode is used. This patch changes
the code to always use
From: Yan, Zheng zheng.z@intel.com
If a lock is in the XSYN state, Locker::simple_sync() first tries changing
the lock state to EXCL. If it fails to change the lock state to EXCL, it
just returns. So Locker::simple_sync() does not guarantee that the lock
state eventually changes to SYNC. This issue can cause
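A heavily simplified sketch of that behavior (hypothetical types, not the
MDS Locker code): the early return on a blocked XSYN->EXCL transition
means nothing ever drives the lock on to SYNC.

  enum lock_state { LOCK_XSYN, LOCK_EXCL, LOCK_SYNC };

  struct simple_lock {
      enum lock_state state;
      int can_go_excl;  /* stand-in for "no conflicting caps/locks" */
  };

  static void simple_sync(struct simple_lock *l)
  {
      if (l->state == LOCK_XSYN) {
          if (!l->can_go_excl)
              return;        /* early return: SYNC is never reached */
          l->state = LOCK_EXCL;
      }
      l->state = LOCK_SYNC;  /* reached only if XSYN was escaped */
  }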
Since you are chatting about ceph-deploy tomorrow, I'll chime in with
a bit more.
I'm interested in ceph-deploy since it can be a lightweight,
production-appropriate installer. The docs repeatedly warn that
mkcephfs is not intended for production clusters, and Neil reminds us
that the
From my perspective, I want to ensure that we have a script that helps
users get Ceph up and running as quickly as possible so they can play,
explore and evaluate it. With this goal in mind, I would prefer to
lean towards the KISS principle to reduce the potential failure
scenarios which a) deter
Hi list,
The first time I start my ceph cluster, it takes more than 15 minutes
to get all the PGs active+clean. It's fast at first (say 100 PGs/s) but quite
slow when only hundreds of PGs are left peering.
Is this a common situation? Since there is quite a bit of disk IO and
network IO
Hi List,
Here is part of the /etc/init.d/ceph script:
case $command in
    start)
        # Increase max_open_files, if the configuration calls for it.
        get_conf max_open_files "8192" "max open files"
        if [ "$max_open_files" != "0" ]; then
            # Note: Don't try