Yeah, I couldn't argue against LVMs when talking with the system admins. In terms of speed it's noise, because the CPUs are pretty efficient, and unless you have more than 1 drive per physical core, you will end up saturating your disk I/O.
In terms of MapR, you want the raw disk. (But we're talking Apache)

On Dec 19, 2012, at 4:59 PM, Jean-Marc Spaggiari <[email protected]> wrote:

> Finally, it took me a while to run those tests because it was way
> longer than expected, but here are the results:
>
> http://www.spaggiari.org/bonnie.html
>
> LVM is not really slower than JBOD and not really taking more CPU. So
> I will say, if you have to choose between the 2, take the one you
> prefer. Personally, I prefer LVM because it's easy to configure.
>
> The big winner here is RAID0. It's WAY faster than anything else. But
> it's using twice the space... Your choice.
>
> I did not get a chance to test with the Ubuntu tool because it's not
> working with LVM drives.
>
> JM
>
> 2012/11/28, Michael Segel <[email protected]>:
>> Ok, just a caveat.
>>
>> I am discussing MapR as part of a complete response. As Mohit posted MapR
>> takes the raw device for their MapR File System.
>> They do stripe on their own within what they call a volume.
>>
>> But going back to Apache...
>> You can stripe drives, however I wouldn't recommend it. I don't think the
>> performance gains would really matter.
>> You're going to end up getting blocked first by disk i/o, then your
>> controller card, then your network... assuming 10GBe.
>>
>> With only 2 disks on an 8 core system, you will hit disk i/o first and then
>> you'll watch your CPU Wait I/O climb.
>>
>> HTH
>>
>> -Mike
>>
>> On Nov 28, 2012, at 7:28 PM, Jean-Marc Spaggiari <[email protected]>
>> wrote:
>>
>>> Hi Mike,
>>>
>>> Why not using LVM with MapR? Since LVM is reading from 2 drives almost
>>> at the same time, it should be better than RAID0 or a single drive,
>>> no?
>>>
>>> 2012/11/28, Michael Segel <[email protected]>:
>>>> Just a couple of things.
>>>>
>>>> I'm neutral on the use of LVMs. Some would point out that there's some
>>>> overhead, but on the flip side, it can make managing the machines
>>>> easier.
>>>> If you're using MapR, you don't want to use LVMs but raw devices.
>>>>
>>>> In terms of GC, it's going to depend on the heap size and not the total
>>>> memory. With respect to HBase. ... MSLABS is the way to go.
>>>>
>>>>
>>>> On Nov 28, 2012, at 12:05 PM, Jean-Marc Spaggiari
>>>> <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Gregory,
>>>>>
>>>>> I found this about LVM:
>>>>> -> http://blog.andrew.net.au/2006/08/09
>>>>> -> http://www.phoronix.com/scan.php?page=article&item=fedora_15_lvm&num=2
>>>>>
>>>>> Seems that performance is still OK with it. I will most
>>>>> probably give it a try and bench that too... I have one new hard drive
>>>>> which should arrive tomorrow. Perfect timing ;)
>>>>>
>>>>> JM
>>>>>
>>>>> 2012/11/28, Mohit Anchlia <[email protected]>:
>>>>>>
>>>>>> On Nov 28, 2012, at 9:07 AM, Adrien Mogenet <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Does HBase really benefit from 64 GB of RAM since allocating too
>>>>>>> large a heap might increase GC time ?
>>>>>>>
>>>>>> Benefit you get is from OS cache
>>>>>>> Another question : why not RAID 0, in order to aggregate disk
>>>>>>> bandwidth ? (and thus keep 3x replication factor)
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Nov 28, 2012 at 5:58 PM, Michael Segel
>>>>>>> <[email protected]> wrote:
>>>>>>>
>>>>>>>> Sorry,
>>>>>>>>
>>>>>>>> I need to clarify.
>>>>>>>>
>>>>>>>> 4GB per physical core is a good starting point.
>>>>>>>> So with 2 quad core chips, that is going to be 32GB.
>>>>>>>>
>>>>>>>> IMHO that's a minimum. If you go with HBase, you will want more.
>>>>>>>> (Actually you will need more.) The next logical jump would be to
>>>>>>>> 48 or 64GB.
>>>>>>>>
>>>>>>>> If we start to price out memory, depending on vendor, your company's
>>>>>>>> procurement, there really isn't much of a price difference in terms
>>>>>>>> of 32, 48, or 64 GB.
>>>>>>>> Note that it also depends on the chips themselves.
>>>>>>>> Also you need to see how many memory channels exist in the
>>>>>>>> motherboard. You may need to buy in pairs or triplets. Your
>>>>>>>> hardware vendor can help you. (Also you need to keep an eye on
>>>>>>>> your hardware vendor. Sometimes they will give you higher
>>>>>>>> density chips that are going to be more expensive...) ;-)
>>>>>>>>
>>>>>>>> I tend to like having extra memory from the start.
>>>>>>>> It gives you a bit more freedom and also protects you from 'fat'
>>>>>>>> code.
>>>>>>>>
>>>>>>>> Looking at YARN... you will need more memory too.
>>>>>>>>
>>>>>>>>
>>>>>>>> With respect to the hard drives...
>>>>>>>>
>>>>>>>> The best recommendation is to keep the drives as JBOD and then use
>>>>>>>> 3x replication.
>>>>>>>> In this case, make sure that the disk controller cards can handle
>>>>>>>> JBOD. (Some don't support JBOD out of the box)
>>>>>>>>
>>>>>>>> With respect to RAID...
>>>>>>>>
>>>>>>>> If you are running MapR, no need for RAID.
>>>>>>>> If you are running an Apache derivative, you could use RAID 1. Then
>>>>>>>> cut your replication to 2X. This makes it easier to manage drive
>>>>>>>> failures. (It's not the norm, but it works...) In some clusters,
>>>>>>>> they are using appliances like NetApp's E-Series where the machines
>>>>>>>> see the drives as local attached storage and I think the appliances
>>>>>>>> themselves are using RAID. I haven't played with this
>>>>>>>> configuration, however it could make sense and it's a valid design.
>>>>>>>>
>>>>>>>> HTH
>>>>>>>>
>>>>>>>> -Mike
>>>>>>>>
>>>>>>>> On Nov 28, 2012, at 10:33 AM, Jean-Marc Spaggiari
>>>>>>>> <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Mike,
>>>>>>>>>
>>>>>>>>> Thanks for all those details!
>>>>>>>>>
>>>>>>>>> So to simplify the equation, for 16 virtual cores we need 48 to
>>>>>>>>> 64GB. Which means 3 to 4GB per core.
>>>>>>>>> So with quad cores, 12GB to 16GB are a good start? Or did I
>>>>>>>>> simplify it too much?
>>>>>>>>>
>>>>>>>>> Regarding the hard drives. If you add more than one drive, do you
>>>>>>>>> need to build them on RAID or similar systems? Or can Hadoop/HBase
>>>>>>>>> be configured to use more than one drive?
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> JM
>>>>>>>>>
>>>>>>>>> 2012/11/27, Michael Segel <[email protected]>:
>>>>>>>>>>
>>>>>>>>>> OK... I don't know why Cloudera is so hung up on 32GB. ;-) [It's an
>>>>>>>>>> inside joke ...]
>>>>>>>>>>
>>>>>>>>>> So here's the problem...
>>>>>>>>>>
>>>>>>>>>> By default, your child processes in a map/reduce job get a default
>>>>>>>>>> 512MB. The majority of the time, this gets raised to 1GB.
>>>>>>>>>>
>>>>>>>>>> 8 cores (dual quad cores) shows up as 16 virtual processors in
>>>>>>>>>> Linux. (Note: This is why when people talk about the number of
>>>>>>>>>> cores, you have to specify physical cores or logical cores....)
>>>>>>>>>>
>>>>>>>>>> So if you were to oversubscribe and have let's say 12 mappers and
>>>>>>>>>> 12 reducers, that's 24 slots. Which means that you would need 24GB
>>>>>>>>>> of memory reserved just for the child processes. This would leave
>>>>>>>>>> 8GB for DN, TT and the rest of the Linux OS processes.
>>>>>>>>>>
>>>>>>>>>> Can you live with that? Sure.
>>>>>>>>>> Now add in R, HBase, Impala, or some other set of tools on top of
>>>>>>>>>> the cluster.
>>>>>>>>>>
>>>>>>>>>> Ooops! Now you are in trouble because you will swap.
>>>>>>>>>> Also adding in R, you may want to bump up those child procs from
>>>>>>>>>> 1GB to 2GB. That means the 24 slots would now require 48GB. Now
>>>>>>>>>> you have swap and if that happens you will see HBase in a
>>>>>>>>>> cascading failure.
>>>>>>>>>>
>>>>>>>>>> So while you can do a rolling restart with the changed
>>>>>>>>>> configuration (reducing the number of mappers and reducers) you
>>>>>>>>>> end up with fewer slots, which will mean longer run times for your
>>>>>>>>>> jobs. (Fewer slots == less parallelism)
>>>>>>>>>>
>>>>>>>>>> Looking at the price of memory... you can get 48GB or even 64GB
>>>>>>>>>> for around the same price point. (8GB chips)
>>>>>>>>>>
>>>>>>>>>> And I didn't even talk about adding SOLR either, again a memory
>>>>>>>>>> hog... ;-)
>>>>>>>>>>
>>>>>>>>>> Note that I matched the number of mappers w reducers. You could go
>>>>>>>>>> with fewer reducers if you want. I tend to recommend a ratio of
>>>>>>>>>> 2:1 mappers to reducers, depending on the work flow....
>>>>>>>>>>
>>>>>>>>>> As to the disks... no, 7200 SATA III drives are fine. SATA III
>>>>>>>>>> interface is pretty much available in the new kit being shipped.
>>>>>>>>>> It's just that you don't have enough drives. 8 cores should be 8
>>>>>>>>>> spindles if available.
>>>>>>>>>> Otherwise you end up seeing your CPU load climb on wait states as
>>>>>>>>>> the processes wait for the disk i/o to catch up.
>>>>>>>>>>
>>>>>>>>>> I mean you could build out a cluster w 4 x 3 3.5" 2TB drives in a
>>>>>>>>>> 1U chassis based on price. You're making a trade off and you
>>>>>>>>>> should be aware of the performance hit you will take.
>>>>>>>>>>
>>>>>>>>>> HTH
>>>>>>>>>>
>>>>>>>>>> -Mike
>>>>>>>>>>
>>>>>>>>>> On Nov 27, 2012, at 1:52 PM, Jean-Marc Spaggiari
>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Michael,
>>>>>>>>>>>
>>>>>>>>>>> so are you recommending 32GB per node?
>>>>>>>>>>>
>>>>>>>>>>> What about the disks? SATA drives are too slow?
>>>>>>>>>>>
>>>>>>>>>>> JM
>>>>>>>>>>>
>>>>>>>>>>> 2012/11/26, Michael Segel <[email protected]>:
>>>>>>>>>>>> Uhm, those specs are actually now out of date.
>>>>>>>>>>>>
>>>>>>>>>>>> If you're running HBase, or want to also run R on top of Hadoop,
>>>>>>>>>>>> you will need to add more memory.
>>>>>>>>>>>> Also forget 1GBe, go 10GBe, and w 2 SATA drives, you will be
>>>>>>>>>>>> disk i/o bound way too quickly.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Nov 26, 2012, at 8:05 AM, Marcos Ortiz <[email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are you asking about hardware recommendations?
>>>>>>>>>>>>> Eric Sammer, in his "Hadoop Operations" book, did a great job
>>>>>>>>>>>>> on this:
>>>>>>>>>>>>> For mid-size clusters (up to 300 nodes):
>>>>>>>>>>>>> Processor: A dual quad-core 2.6 GHz
>>>>>>>>>>>>> RAM: 24 GB DDR3
>>>>>>>>>>>>> Dual 1 Gb Ethernet NICs
>>>>>>>>>>>>> a SAS drive controller
>>>>>>>>>>>>> at least two SATA II drives in a JBOD configuration
>>>>>>>>>>>>>
>>>>>>>>>>>>> The replication factor depends heavily on the primary use of
>>>>>>>>>>>>> your cluster.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 11/26/2012 08:53 AM, David Charle wrote:
>>>>>>>>>>>>>> hi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> what's the recommended nodes for NN, hmaster and zk nodes for
>>>>>>>>>>>>>> a larger cluster, let's say 50-100+
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> also, what would be the ideal replication factor for larger
>>>>>>>>>>>>>> clusters when u have 3-4 racks ?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> David
>>>>>>>>>>>>>> 10mo. ANIVERSARIO DE LA CREACION DE LA UNIVERSIDAD DE LAS
>>>>>>>>>>>>>> CIENCIAS INFORMATICAS...
>>>>>>>>>>>>>> CONECTADOS AL FUTURO, CONECTADOS A LA REVOLUCION
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://www.uci.cu
>>>>>>>>>>>>>> http://www.facebook.com/universidad.uci
>>>>>>>>>>>>>> http://www.flickr.com/photos/universidad_uci
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>> Marcos Luis Ortíz Valmaseda
>>>>>>>>>>>>> about.me/marcosortiz <http://about.me/marcosortiz>
>>>>>>>>>>>>> @marcosluis2186 <http://twitter.com/marcosluis2186>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Adrien Mogenet
>>>>>>> 06.59.16.64.22
>>>>>>> http://www.mogenet.me
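
For anyone who wants to redo the memory arithmetic from the quoted mails with their own numbers, here is a rough sketch in Python. It only encodes the two rules of thumb from the thread (slots times child heap, and 4GB per physical core). The function names are made up for illustration and are not Hadoop settings, and the result is a back-of-the-envelope estimate, not a sizing tool.

```python
def child_memory_gb(mappers, reducers, child_heap_gb):
    """Memory reserved for map/reduce child processes on one node.

    Every mapper or reducer slot is assumed to use one child heap.
    """
    return (mappers + reducers) * child_heap_gb


def headroom_gb(node_ram_gb, mappers, reducers, child_heap_gb):
    """RAM left for DN, TT, the OS and anything else (HBase, R, ...).

    A negative value means the node is oversubscribed and will swap.
    """
    return node_ram_gb - child_memory_gb(mappers, reducers, child_heap_gb)


def starting_ram_gb(physical_cores, gb_per_core=4):
    """Rule of thumb from the thread: 4GB per physical core as a minimum."""
    return physical_cores * gb_per_core


# Numbers from the thread: dual quad-core node with 32GB,
# 12 mappers + 12 reducers.
print(starting_ram_gb(8))          # 32
print(headroom_gb(32, 12, 12, 1))  # 8   (1GB children: 24GB used)
print(headroom_gb(32, 12, 12, 2))  # -16 (2GB children: 48GB needed)
```

With 48GB or 64GB per node the 1GB case leaves 24GB or 40GB of headroom, which is where HBase and the OS cache would live.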
