Haven't had a chance to run netperf, but spotted messages in syslog of the form:
Oct 25 21:03:22 ... kernel: [107058.190743] net_ratelimit: 136 callbacks suppressed
Oct 25 21:03:22 ... kernel: [107058.190746] nf_conntrack: table full, dropping packet.

which suggests RPC requests may indeed be being dropped. There are ~16000 connections for port 50060, i.e. the tasktracker. I'll try raising the conntrack max and see what effect that has.

On 24 October 2013 23:02, Harry Waye <[email protected]> wrote:

> Got it! Re. 50% utilisation, I forgot to mention that 6 cores does not
> include hyper-threading. Foolish I know, but that would explain CPU0 being
> at 50%. The nodes are as stated in
> http://www.hetzner.de/en/hosting/produkte_rootserver/ex10 bar the RAID1.
>
> On 24 October 2013 22:50, Jean-Marc Spaggiari <[email protected]> wrote:
>
>> Remote calls to a server. Just forget about it ;) Please verify the
>> network bandwidth between your nodes.
>>
>> 2013/10/24 Harry Waye <[email protected]>
>>
>>> Excuse the ignorance, RCP?
>>>
>>> On 24 October 2013 22:28, Jean-Marc Spaggiari <[email protected]> wrote:
>>>
>>>> Your nodes are almost 50% idle... Might be something else. Sounds like
>>>> it's not your disks nor your CPU... Maybe too many RCPs?
>>>>
>>>> Have you investigated on your network side? netperf might be a good
>>>> help for you.
>>>>
>>>> JM
>>>>
>>>> 2013/10/24 Harry Waye <[email protected]>
>>>>
>>>>> p.s. I guess this is more turning into a general hadoop issue, but
>>>>> I'll keep the discussion here seeing that I have an audience, unless
>>>>> there are objections.
>>>>>
>>>>> On 24 October 2013 22:02, Harry Waye <[email protected]> wrote:
>>>>>
>>>>>> So just a short update, I'll read into it a little more tomorrow.
>>>>>> This is from three of the nodes:
>>>>>> https://gist.github.com/hazzadous/1264af7c674e1b3cf867
>>>>>>
>>>>>> The first is the grey guy.
>>>>>> Just glancing at it, it looks to fluctuate more than the others. I
>>>>>> guess that could suggest that there are some issues with reading from
>>>>>> the disks. Interestingly, it's the only one that doesn't have smartd
>>>>>> installed, which alerts us on changes for the other nodes. I suspect
>>>>>> there's probably some mileage in checking its SMART attributes. Will
>>>>>> do that tomorrow though.
>>>>>>
>>>>>> Out of curiosity, how do people normally monitor disk issues? I'm
>>>>>> going to set up collectd to push various things from smartctl
>>>>>> tomorrow; at the moment all we do is receive emails, which is mostly
>>>>>> noise about problem sector counts increasing +1.
>>>>>>
>>>>>> On 24 October 2013 19:40, Jean-Marc Spaggiari <[email protected]> wrote:
>>>>>>
>>>>>>> Can you try vmstat 2? 2 is the interval in seconds at which it will
>>>>>>> display the disk usage. On the extract here, nothing is running;
>>>>>>> only 8% is used (1% disk IO, 6% user, 1% sys).
>>>>>>>
>>>>>>> Run it on 2 or 3 different nodes while you are putting the load on
>>>>>>> the cluster, take a look at the 4 last numbers, and see what the
>>>>>>> value of the last one is.
>>>>>>>
>>>>>>> On the usercpu0 graph, who is the gray guy showing high?
>>>>>>>
>>>>>>> JM
>>>>>>>
>>>>>>> 2013/10/24 Harry Waye <[email protected]>
>>>>>>>
>>>>>>>> Ok I'm running a load job atm, I've added some possibly
>>>>>>>> incomprehensible coloured lines to the graph: http://goo.gl/cUGCGG
>>>>>>>>
>>>>>>>> This is actually with one fewer node due to decommissioning to
>>>>>>>> replace a disk, hence I guess the reason for one squiggly line
>>>>>>>> showing no disk activity.
>>>>>>>> I've included only the cpu stats for CPU0 from each node. The last
>>>>>>>> graph should read "Memory Used". vmstat from one of the nodes:
>>>>>>>>
>>>>>>>> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>>>>>>>>  r  b   swpd   free   buff    cache   si   so    bi    bo   in   cs us sy id wa
>>>>>>>>  6  0      0 392448 524668 43823900    0    0   501  1044    0    0  6  1 91  1
>>>>>>>>
>>>>>>>> To me the wait doesn't seem that high. Job stats are
>>>>>>>> http://goo.gl/ZYdUKp, the job setup is
>>>>>>>> https://gist.github.com/hazzadous/ac57a384f2ab685f07f6
>>>>>>>>
>>>>>>>> Does anything jump out at you?
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>> H
>>>>>>>>
>>>>>>>> On 24 October 2013 16:16, Harry Waye <[email protected]> wrote:
>>>>>>>>
>>>>>>>>> Hi JM
>>>>>>>>>
>>>>>>>>> I took a snapshot on the initial run, before the changes:
>>>>>>>>> https://www.evernote.com/shard/s95/sh/b8e1516d-7c49-43f0-8b5f-d16bbdd3fe13/00d7c6cd6dd9fba92d6f00f90fb54fc1/res/4f0e20a2-1ecb-4085-8bc8-b3263c23afb5/screenshot.png
>>>>>>>>>
>>>>>>>>> Good timing, disks appear to be exploding (ATA errors) atm, thus
>>>>>>>>> I'm decommissioning and reprovisioning with new disks. I'll be
>>>>>>>>> reprovisioning without RAID (it's software RAID, just to compound
>>>>>>>>> the issue), although I'm not sure how I'll go about migrating all
>>>>>>>>> nodes. I guess I'd need to put more correctly specced nodes in the
>>>>>>>>> rack and decommission the existing. Makes diff. to
>>>>>>>>>
>>>>>>>>> We're using hetzner at the moment, which may not have been a good
>>>>>>>>> choice.
>>>>>>>>> Has anyone had any experience with them wrt. Hadoop? They offer 7
>>>>>>>>> and 15 disk options, but are low on the cpu front (quad core). Our
>>>>>>>>> workload will be, I assume, on the high side. There's also an
>>>>>>>>> 8-disk Dell PowerEdge which is a little more powerful. What
>>>>>>>>> hosting providers would people recommend? (And what would be the
>>>>>>>>> strategy for migrating?)
>>>>>>>>>
>>>>>>>>> Anyhow, when I have things more stable I'll have a look at
>>>>>>>>> checking out what's using the cpu. In the meantime, can anything
>>>>>>>>> be gleaned from the above snap?
>>>>>>>>>
>>>>>>>>> Cheers
>>>>>>>>> H
>>>>>>>>>
>>>>>>>>> On 24 October 2013 15:14, Jean-Marc Spaggiari <[email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Harry,
>>>>>>>>>>
>>>>>>>>>> Do you have more details on the exact load? Can you run vmstat
>>>>>>>>>> and see what kind of load it is? Is it user? cpu? wio?
>>>>>>>>>>
>>>>>>>>>> I suspect your disks to be the issue. There are two things here.
>>>>>>>>>>
>>>>>>>>>> First, we don't recommend RAID for the HDFS/HBase disks. The best
>>>>>>>>>> is to simply mount the disks on 2 mount points and give them to
>>>>>>>>>> HDFS. Second, 2 disks per node is very low. Even on a dev cluster
>>>>>>>>>> it's not recommended. In production, you should go with 12 or
>>>>>>>>>> more.
>>>>>>>>>>
>>>>>>>>>> So with only 2 disks in RAID, I suspect your WIO to be high,
>>>>>>>>>> which is what might slow your process.
>>>>>>>>>>
>>>>>>>>>> Can you take a look in that direction?
>>>>>>>>>> If it's not that, we will continue to investigate ;)
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>>
>>>>>>>>>> JM
>>>>>>>>>>
>>>>>>>>>> 2013/10/23 Harry Waye <[email protected]>
>>>>>>>>>>
>>>>>>>>>>> I'm trying to load data into hbase using HFileOutputFormat and
>>>>>>>>>>> incremental bulk load but am getting rather lackluster
>>>>>>>>>>> performance: 10h for ~0.5TB of data, ~50000 blocks. This is
>>>>>>>>>>> being loaded into a table that has 2 families, 9 columns, 2500
>>>>>>>>>>> regions and is ~10TB in size. Keys are md5 hashes and regions
>>>>>>>>>>> are pretty evenly spread. The majority of time appears to be
>>>>>>>>>>> spent in the reduce phase, with the map phase completing very
>>>>>>>>>>> quickly. The network doesn't appear to be saturated, but the
>>>>>>>>>>> load is consistently at 6, which is the number of reduce tasks
>>>>>>>>>>> per node.
>>>>>>>>>>>
>>>>>>>>>>> 12 hosts (6 cores, 2 disks as RAID0, 1GB eth, no one else on the
>>>>>>>>>>> rack).
>>>>>>>>>>>
>>>>>>>>>>> MR conf: 6 mappers, 6 reducers per node.
>>>>>>>>>>>
>>>>>>>>>>> I spoke to someone on IRC and they recommended reducing job
>>>>>>>>>>> output replication to 1, and reducing the number of mappers,
>>>>>>>>>>> which I reduced to 2. Reducing replication appeared not to make
>>>>>>>>>>> any difference; reducing reducers appeared just to slow the job
>>>>>>>>>>> down. I'm going to have a look at running the benchmarks
>>>>>>>>>>> mentioned on Michael Noll's blog and see what that turns up.
>>>>>>>>>>> I guess some questions I have are:
>>>>>>>>>>>
>>>>>>>>>>> How does the global number/size of blocks affect perf.? (I have
>>>>>>>>>>> a lot of 10mb files, which are the input files.)
>>>>>>>>>>>
>>>>>>>>>>> How does the job-local number/size of input blocks affect perf.?
>>>>>>>>>>>
>>>>>>>>>>> What is actually happening in the reduce phase that requires so
>>>>>>>>>>> much CPU? I assume the actual construction of HFiles isn't
>>>>>>>>>>> intensive.
>>>>>>>>>>>
>>>>>>>>>>> Ultimately, how can I improve performance?
>>>>>>>>>>>
>>>>>>>>>>> Thanks
--
Harry Waye, Co-founder/CTO
[email protected]
+44 7890 734289

Follow us on Twitter: @arachnys <https://twitter.com/#!/arachnys>

---
Arachnys Information Services Limited is a company registered in England &
Wales. Company number: 7269723. Registered office: 40 Clarendon St,
Cambridge, CB1 1JX.
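[For reference: the "nf_conntrack: table full, dropping packet" messages at the top of the thread mean the kernel's connection-tracking table has hit its limit. A minimal sketch of checking and raising it, assuming a Linux box with the nf_conntrack module loaded; the 262144 value is illustrative, not something recommended in the thread:]

```shell
# Current usage vs. the table limit (read-only, no root needed):
sysctl net.netfilter.nf_conntrack_count net.netfilter.nf_conntrack_max

# Raise the limit as root; size the value to available memory:
sysctl -w net.netfilter.nf_conntrack_max=262144

# Persist across reboots:
echo 'net.netfilter.nf_conntrack_max = 262144' >> /etc/sysctl.conf
```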
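[JM's "take a look at the 4 last numbers" refers to vmstat's us/sy/id/wa CPU columns. A small sketch pulling them out of the sample line Harry posted; the column positions assume the default 16-column vmstat output, with no -a or -w flags:]

```shell
# The vmstat data line quoted in the thread:
line=" 6  0      0 392448 524668 43823900    0    0   501  1044    0    0  6  1 91  1"

# Fields 13-16 of the default layout are user, system, idle, and IO-wait CPU %:
echo "$line" | awk '{printf "user=%s%% sys=%s%% idle=%s%% wait=%s%%\n", $13, $14, $15, $16}'
# -> user=6% sys=1% idle=91% wait=1%
```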
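[On the disk-monitoring question: one common approach is a periodic smartctl sweep whose output feeds alerting or collectd. A hedged sketch of the parsing step; the attribute line here is a canned sample so the snippet is self-contained, and real runs would need smartmontools installed and root access:]

```shell
#!/bin/sh
# Canned sample of a `smartctl -A` attribute table line (a live check would
# run `smartctl -A /dev/sda` as root, per device):
sample="  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       12"

# Column 2 is the attribute name, column 10 its raw value; alert when non-zero.
echo "$sample" \
  | awk '$2 == "Reallocated_Sector_Ct" && $10+0 > 0 {print "reallocated sectors: " $10}'
# -> reallocated sectors: 12
```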
