David,

You will need to raise the limit of open files in the Linux system. Check
/etc/security/limits.conf. It is explained somewhere in the docs, and the
autostart scripts 'fix' the issue for most people. When I did a manual
deploy for the same reasons you are, I ran into this too.
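For anyone hitting the same wall, a quick way to see the limit currently in effect, plus the sort of limits.conf lines usually suggested. The 65536 value and the root user are illustrative assumptions, not something from this thread; the snippet only prints the candidate lines rather than writing them anywhere:

```shell
# Show the soft open-file limit for the current shell; an OSD started
# from this shell inherits it unless an init system overrides it.
ulimit -n

# Raising it persistently is done with lines like these in
# /etc/security/limits.conf (values are illustrative). Printed only:
cat <<'EOF'
root  soft  nofile  65536
root  hard  nofile  65536
EOF
```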
Robert LeBlanc

Sent from a mobile device, please excuse any typos.

On Mar 5, 2015 3:14 AM, "Datatone Lists" <li...@datatone.co.uk> wrote:
>
> Thank you all for such wonderful feedback.
>
> Thank you to John Spray for putting me on the right track. I now see
> that the cephfs aspect of the project is being de-emphasised, so that
> the manual deployment instructions tell how to set up the object
> store, and then cephfs is a separate issue that needs to be explicitly
> set up and configured in its own right. So that explains why the
> cephfs pools are not created by default, and why the required cephfs
> pools are now referred to, not as 'data' and 'metadata', but
> 'cephfs_data' and 'cephfs_metadata'. I have created these pools, and
> created a new cephfs filesystem, and I can mount it without problem.
>
> This confirms my suspicion that the manual deployment pages are in
> need of review and revision. They still refer to three default pools.
> I am happy that this section should deal with the object store setup
> only, but I still think that the osd part is a bit confused and
> confusing, particularly with respect to what is done on which machine.
> It would then be useful to say something like "this completes the
> configuration of the basic store. If you wish to use cephfs, you must
> set up a metadata server, appropriate pools, and a cephfs filesystem.
> (See http://...)".
>
> I was not trying to be smart or obscure when I made a brief and
> apparently dismissive reference to ceph-deploy. I railed against it
> and the demise of mkcephfs on this list at the point that mkcephfs was
> discontinued in the releases. That caused a few supportive responses
> at the time, so I know that I'm not alone. I did not wish to trawl
> over those arguments again unnecessarily.
>
> There is a principle that is being missed. The 'ceph' code contains
> everything required to set up and operate a ceph cluster. There should
> be documentation detailing how this is done.
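The pool-and-filesystem setup described above amounts to a few commands. A sketch follows, with a pg count of 64 as an assumption; since the commands need a live cluster, this dry run only assembles and prints each step rather than executing it:

```shell
# Hammer-era CephFS setup steps (pg_num of 64 is an assumption).
# A live cluster is required to run these, so this dry run only
# assembles and prints each command.
pg_num=64
cmds=(
  "ceph osd pool create cephfs_data $pg_num"
  "ceph osd pool create cephfs_metadata $pg_num"
  "ceph fs new cephfs cephfs_metadata cephfs_data"
)
printf '%s\n' "${cmds[@]}"
```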
>
> 'Ceph-deploy' is a separate thing. It is one of several tools that
> promise to make setting things up easy. However, my resistance is
> based on two factors. If I recall correctly, it is one of those
> projects in which the configuration needs to know what 'distribution'
> is being used. (Presumably, this is to try to deduce where various
> things are located.) So if one is not using one of these
> 'distributions', one is stuffed right from the start. Secondly, the
> challenge that we are trying to overcome is learning what the various
> ceph components need, and how they need to be set up and configured.
> I don't think that the "don't worry your pretty little head about
> that, we have a natty tool to do it for you" approach is particularly
> useful.
>
> So I am not knocking ceph-deploy, Travis, it is just that I do not
> believe that it is relevant or useful to me at this point in time.
>
> I see that Lionel Bouton seems to share my views here.
>
> In general, the ceph documentation (in my humble opinion) needs to be
> drafted with a keen eye on the required scope. Deal with ceph; don't
> let it get contaminated with 'ceph-deploy', 'upstart', 'systemd', or
> anything else that is not actually part of ceph.
>
> As an example, once you have configured your osd, you start it with:
>
> ceph-osd -i {osd-number}
>
> It is as simple as that!
>
> If it is required to start the osd automatically, then that will be
> done using sysvinit, upstart, systemd, or whatever else is being used
> to bring the system up in the first place. It is unnecessary and
> confusing to try to second-guess the environment in which ceph may be
> being used, and to contaminate the documentation with such details.
> (Having said that, I see no problem with adding separate, helpful
> sections such as "Suggestions for starting using 'upstart'" or
> "Suggestions for starting using 'systemd'".)
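In the spirit of the "suggestions for systemd" sections proposed above, such a snippet might look like the following. This is a sketch only: the unit name, binary path, %i instance convention, and the LimitNOFILE value are all assumptions, not anything shipped with ceph at the time. LimitNOFILE is included because it also addresses the "too many open files" failures discussed elsewhere in the thread:

```ini
# /etc/systemd/system/ceph-osd@.service (hypothetical sketch).
# Started per-OSD with: systemctl start ceph-osd@0
[Unit]
Description=Ceph object storage daemon %i
After=network-online.target

[Service]
# -f keeps the daemon in the foreground so systemd can supervise it.
ExecStart=/usr/bin/ceph-osd -f -i %i
LimitNOFILE=65536
Restart=on-failure

[Install]
WantedBy=multi-user.target
```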
>
> So I would reiterate the point that the really important documentation
> is probably quite simple for an expert to produce. Just spell out what
> each component needs in terms of keys, access to keys, files, and so
> on. Spell out how to set everything up. Also how to change things
> after the event, so that 'trial and error' does not have to contain
> really expensive errors. Once we understand the fundamentals, getting
> fancy and efficient is a completely separate further goal, and is not
> really a responsibility of core ceph development.
>
> I have an inexplicable emotional desire to see ceph working well with
> btrfs, which I like very much and have been using since the very early
> days. Despite all the 'not ready for production' warnings, I adopted
> it with enthusiasm, and have never had cause to regret it, and only
> once or twice experienced a failure that was painful to me. However,
> as I have experimented with ceph over the years, it has been very
> clear that ceph seems to be the most ruthless stress test for it, and
> it has always broken quite quickly (I also used xfs for comparison).
> I have seen evidence of much work going into btrfs in the kernel
> development now that the lead developer has moved from Oracle to, I
> think, Facebook.
>
> I now share the view that I think Robert LeBlanc has, that maybe btrfs
> will now stand the ceph test.
>
> Thanks, Lincoln Bryant, for confirming that I can increase the size of
> pools in line with increasing osd numbers. I felt that this had to be
> the case, otherwise the 'scalable' claim becomes a bit limited.
>
> Returning from these digressions to my own experience: I set up my
> cephfs filesystem as illuminated by John Spray. I mounted it and
> started to rsync a multi-terabyte filesystem to it. This is my test;
> if cephfs handles this without grinding to a snail's pace or failing,
> I will be ready to start to commit my data to it.
> My osd disk lights started to flash and flicker, and a comforting
> sound of drive activity issued forth. I checked the osd logs, and to
> my dismay, there were crash reports in them all. However, a closer
> look revealed that I am getting the "too many open files" messages
> that precede the failures.
>
> I can see that this is not an osd failure, but a resource limit issue.
>
> I completely acknowledge that I must now RTFM, but I will ask whether
> anybody can give any guidance, based on experience, with respect to
> this issue.
>
> Thank you again to all for the previous prompt and invaluable advice
> and information.
>
> David
>
>
> On Wed, 4 Mar 2015 20:27:51 +0000
> Datatone Lists <li...@datatone.co.uk> wrote:
>
> > I have been following ceph for a long time. I have yet to put it
> > into service, and I keep coming back as btrfs improves and ceph
> > reaches higher version numbers.
> >
> > I am now trying ceph 0.93 and kernel 4.0-rc1.
> >
> > Q1) Is it still considered that btrfs is not robust enough, and that
> > xfs should be used instead? [I am trying with btrfs.]
> >
> > I followed the manual deployment instructions on the web site
> > (http://ceph.com/docs/master/install/manual-deployment/) and I
> > managed to get a monitor and several osds running and apparently
> > working. The instructions fizzle out without explaining how to set
> > up the mds. I went back to mkcephfs and got things set up that way.
> > The mds starts.
> >
> > [Please don't mention ceph-deploy]
> >
> > The first thing that I noticed is that (whether I set up the mon and
> > osds by following the manual deployment, or using mkcephfs), the
> > correct default pools were not created.
> >
> > bash-4.3# ceph osd lspools
> > 0 rbd,
> > bash-4.3#
> >
> > I get only 'rbd' created automatically. I deleted this pool, and
> > re-created data, metadata and rbd manually. When doing this, I had
> > to juggle with the pg-num in order to avoid the 'too many pgs for
> > osd' warning.
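As an aside on the pg-num juggling just described: with hypothetical figures (three pools of pg_num 64, two replicas, three osds), the per-osd count works out as below. The `ceph osd pool set` commands, shown only as comments because they need a live cluster, reflect Lincoln Bryant's confirmation earlier in the thread that pool sizes can be grown as osds are added:

```shell
# Hypothetical figures for a pgs-per-osd calculation; the replica
# count multiplies the copies each osd carries.
pools=3 pg_num=64 size=2 osds=3
pgs_per_osd=$(( pools * pg_num * size / osds ))
echo "$pgs_per_osd"   # prints 128 -- near the commonly cited ~100/osd

# pg_num can later be raised (never lowered) as osds are added, e.g.:
#   ceph osd pool set rbd pg_num 256
#   ceph osd pool set rbd pgp_num 256
```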
> > I have three osds running at the moment, but intend to add to these
> > when I have some experience of things working reliably. I am
> > puzzled, because I seem to have to set the pg-num for the pool to a
> > number that makes (N-pools x pg-num)/N-osds come to the right kind
> > of number. So this implies that I can't really expand a set of
> > pools by adding osds at a later date.
> >
> > Q2) Is there any obvious reason why my default pools are not getting
> > created automatically as expected?
> >
> > Q3) Can pg-num be modified for a pool later? (If the number of osds
> > is increased dramatically.)
> >
> > Finally, when I try to mount cephfs, I get a mount 5 error.
> >
> > "A mount 5 error typically occurs if a MDS server is laggy or if it
> > crashed. Ensure at least one MDS is up and running, and the cluster
> > is active + healthy."
> >
> > My mds is running, but its log is not terribly active:
> >
> > 2015-03-04 17:47:43.177349 7f42da2c47c0 0 ceph version 0.93
> > (bebf8e9a830d998eeaab55f86bb256d4360dd3c4), process ceph-mds,
> > pid 4110
> > 2015-03-04 17:47:43.182716 7f42da2c47c0 -1 mds.-1.0 log_to_monitors
> > {default=true}
> >
> > (This is all there is in the log.)
> >
> > I think that a key indicator of the problem must be this from the
> > monitor log:
> >
> > 2015-03-04 16:53:20.715132 7f3cd0014700 1
> > mon.ceph-mon-00@0(leader).mds e1 warning, MDS mds.?
> > [2001:8b0:xxxx:5fb3:xxxx:1fff:xxxx:9054]:6800/4036 up but filesystem
> > disabled
> >
> > (I have added the 'xxxx' sections to obscure my ip address.)
> >
> > Q4) Can you give me an idea of what is wrong that causes the mds to
> > not play properly?
> >
> > I think that there are some typos on the manual deployment pages,
> > for example:
> >
> > ceph-osd id={osd-num}
> >
> > This is not right. As far as I am aware, it should be:
> >
> > ceph-osd -i {osd-num}
> >
> > An observation.
> > In principle, setting things up manually is not all that
> > complicated, provided that clear and unambiguous instructions are
> > provided. This simple piece of documentation is very important. My
> > view is that the existing manual deployment instructions get a bit
> > confused and confusing when they reach the osd setup, and the mds
> > setup is completely absent.
> >
> > For someone who knows, it would be a fairly simple and fairly quick
> > operation to review and revise this part of the documentation. I
> > suspect that this part suffers from being really obvious stuff to
> > the well initiated. For those of us closer to the start, this forms
> > the ends of the threads that have to be picked up before the journey
> > can be made.
> >
> > Very best regards,
> > David
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com