Sorry Andy,

The resulting data set is 94 GB for both.
Rich

On Thu, Jun 16, 2011 at 8:45 AM, Richard Francis <[email protected]> wrote:
> Hi Andy,
>
> Both file systems are the same - we're using the ephemeral storage on the
> ec2 node - both machines are ext3.
>
> Ubuntu:
>
> df -Th /mnt
> Filesystem    Type  Size  Used Avail Use% Mounted on
> /dev/xvdb     ext3  827G  240G  545G  31% /mnt
>
> Centos:
>
> df -Th /mnt
> Filesystem    Type  Size  Used Avail Use% Mounted on
> /dev/sdb      ext3  827G  191G  595G  25% /mnt
>
> Both the input ntriples and the output indexes are written to this
> partition.
>
> meminfo does show some differences - I believe mainly because the Ubuntu
> instance is a later kernel (2.6.38-8-virtual vs. 2.6.16.33-xenU). There
> does seem to be a difference between the Mapped values, and I think I
> should investigate the HugePages & DirectMap settings.
>
> Ubuntu:
> cat /proc/meminfo
> MemTotal:       35129364 kB
> MemFree:          817100 kB
> Buffers:           70780 kB
> Cached:         32674868 kB
> SwapCached:            0 kB
> Active:         17471436 kB
> Inactive:       15297084 kB
> Active(anon):      25752 kB
> Inactive(anon):       44 kB
> Active(file):   17445684 kB
> Inactive(file): 15297040 kB
> Unevictable:        3800 kB
> Mlocked:            3800 kB
> SwapTotal:             0 kB
> SwapFree:              0 kB
> Dirty:             10664 kB
> Writeback:             0 kB
> AnonPages:         26808 kB
> Mapped:             7012 kB
> Shmem:               176 kB
> Slab:             855516 kB
> SReclaimable:     847652 kB
> SUnreclaim:         7864 kB
> KernelStack:         680 kB
> PageTables:         2044 kB
> NFS_Unstable:          0 kB
> Bounce:                0 kB
> WritebackTmp:          0 kB
> CommitLimit:    17564680 kB
> Committed_AS:      39488 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:      114504 kB
> VmallocChunk:   34359623800 kB
> HardwareCorrupted:     0 kB
> HugePages_Total:       0
> HugePages_Free:        0
> HugePages_Rsvd:        0
> HugePages_Surp:        0
> Hugepagesize:       2048 kB
> DirectMap4k:    35848192 kB
> DirectMap2M:           0 kB
>
> ***************************************************
> Centos:
> cat /proc/meminfo
> MemTotal:       35840000 kB
> MemFree:           31424 kB
> Buffers:          166428 kB
> Cached:         34658344 kB
> SwapCached:            0 kB
> Active:          1033384 kB
> Inactive:       33803304 kB
> HighTotal:             0 kB
> HighFree:              0 kB
> LowTotal:       35840000 kB
> LowFree:           31424 kB
> SwapTotal:             0 kB
> SwapFree:              0 kB
> Dirty:               220 kB
> Writeback:             0 kB
> Mapped:            17976 kB
> Slab:             223256 kB
> CommitLimit:    17920000 kB
> Committed_AS:      38020 kB
> PageTables:         1528 kB
> VmallocTotal:   34359738367 kB
> VmallocUsed:         164 kB
> VmallocChunk:   34359738203 kB
>
> Thanks,
> Rich
>
> On Wed, Jun 15, 2011 at 10:07 PM, Andy Seaborne
> <[email protected]> wrote:
> >
> > > So my questions are, has anyone else observed this? - can anyone
> > > suggest any further improvements - or things to try? - what is the
> > > best OS to perform a tdbload on?
> >
> > Richard - very useful feedback, thank you.
> >
> > I haven't come across this before - and the difference is quite
> > surprising.
> >
> > What is the "mapped" value on each machine?
> > Could you "cat /proc/meminfo"?
> >
> > TDB is using memory-mapped files - I'm wondering if the amount of RAM
> > available to the processes is different in some way. Together with the
> > parameters you have found to have an effect, this might matter
> > (speculation I'm afraid).
> >
> > Is the filesystem the same?
> > How big is the resulting dataset?
> >
> > (sorry for all the questions!)
> >
> > tdbloader2 works differently from tdbloader even during the data
> > phase. It seems like it is the B+trees slowing down: there is only one
> > in tdbloader2 phase one, but two in tdbloader phase one. That might
> > explain the roughly 80 -> 150 million (or x2).
> >
> > Andy
> >
> > On 15/06/11 16:23, Richard Francis wrote:
> > >
> > > Hi,
> > >
> > > I'm using two identical machines in ec2 running tdbloader on Centos
> > > (CentOS release 5 (Final)) and Ubuntu 11.04 (natty).
> > >
> > > I've observed an issue where Centos will run happily at a consistent
> > > speed and complete a load of 650 million triples in around 12 hours,
> > > whereas the load on Ubuntu tails off after just 15 million triples
> > > and then runs ever more slowly.
> > > On initial observation of the Ubuntu machine I noticed that the
> > > flush-202 process was running quite high; running iostat also showed
> > > that I/O was the real bottleneck, with the Ubuntu machine showing
> > > constant use of the disk for both reads and writes (the Centos
> > > machine had periods of no usage followed by periods of writes). This
> > > led me to investigate how memory was being used by the Ubuntu
> > > machine - and a few blog posts / tutorials later I found a couple of
> > > settings to tweak. The first I tried was dirty_writeback_centisecs -
> > > setting this to 0 had an immediate positive effect on the load I was
> > > performing - but after some more testing I found that the problem
> > > was just pushed back to around 80 million triples before I saw a
> > > drop-off in performance.
> > >
> > > This led me to investigate whether there was the same issue with
> > > tdbloader2. From my observations I got the same problem - but this
> > > time at around 150 million triples.
> > >
> > > Again I focused on the "dirty" settings - this time tweaking
> > > dirty_bytes = 30000000000 and dirty_background_bytes = 15000000000
> > > gave a massive performance increase, and for most of the add phase
> > > of tdbloader the Ubuntu machine kept up with the Centos machine.
> > >
> > > Finally, last night I stopped all loads and raced the Centos machine
> > > against the Ubuntu machine. Both have now completed - but the Centos
> > > machine (around 12 hours) was still far quicker than the Ubuntu
> > > machine (20 hours).
> > >
> > > So my questions are: has anyone else observed this? Can anyone
> > > suggest any further improvements - or things to try? What is the
> > > best OS to perform a tdbload on?
> > >
> > > Rich
> > >
> > >
> > > Tests were performed on three different machines, 1x Centos and 2x
> > > Ubuntu - to rule out EC2 being a bottleneck. All were (from
> > > http://aws.amazon.com/ec2/instance-types/):
> > >
> > > High-Memory Double Extra Large Instance
> > > 34.2 GB of memory
> > > 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
> > > 850 GB of instance storage
> > > 64-bit platform
> > > I/O Performance: High
> > > API name: m2.2xlarge
> > >
> > > All machines are configured with no swap.
> > >
> > > Here's the summary from the only completed load on Ubuntu:
> > >
> > > ** Index SPO->OSP: 685,552,449 slots indexed in 18,337.75 seconds
> > >    [Rate: 37,384.76 per second]
> > > -- Finish triples index phase
> > > ** 685,552,449 triples indexed in 37,063.51 seconds
> > >    [Rate: 18,496.69 per second]
> > > -- Finish triples load
> > > ** Completed: 685,552,449 triples loaded in 78,626.27 seconds
> > >    [Rate: 8,719.13 per second]
> > > -- Finish quads load
> > >
> > > Some resources I used:
> > > http://www.westnet.com/~gsmith/content/linux-pdflush.htm
> > > http://arighi.blogspot.com/2008/10/fine-grained-dirtyratio-and.html
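For reference, the two rounds of VM tuning described in the thread can be collected into one small shell sketch. This is an illustration only, not a recommendation: the byte values are just the ones from Richard's tests, and applying them needs root, so the script only prints the sysctl commands.

```shell
# The vm dirty-page settings tried in the thread.  Values are the ones
# from the tests above: dirty_writeback_centisecs=0 was the first
# attempt, and the absolute byte limits (which override the default
# percentage-based dirty_ratio behaviour) were the later, more
# effective tweak.
DIRTY_WRITEBACK_CENTISECS=0         # disable periodic writeback wakeups
DIRTY_BYTES=30000000000             # ~30 GB dirty before writers must flush
DIRTY_BACKGROUND_BYTES=15000000000  # ~15 GB dirty before background flushing

# Applying these needs root, so only print the commands; pipe the
# output to "sudo sh", or put the equivalent lines in /etc/sysctl.conf.
echo "sysctl -w vm.dirty_writeback_centisecs=${DIRTY_WRITEBACK_CENTISECS}"
echo "sysctl -w vm.dirty_bytes=${DIRTY_BYTES}"
echo "sysctl -w vm.dirty_background_bytes=${DIRTY_BACKGROUND_BYTES}"
```

Note that vm.dirty_bytes and vm.dirty_ratio are counterparts (likewise the background pair): the kernel honours whichever of the two was written last, so setting the _bytes variant disables the _ratio one.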
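The /proc/meminfo comparison earlier in the thread only turns on a handful of fields (Mapped, Cached, Dirty, HugePages, DirectMap). A minimal sketch of pulling just those out; it greps a saved copy of the Ubuntu figures quoted above so it is self-contained, but on a live machine you would point the grep at /proc/meminfo directly:

```shell
# Pull out just the /proc/meminfo fields discussed in the thread instead
# of eyeballing the whole dump.  MEMINFO here is a temp file holding the
# Ubuntu figures quoted above; on a live machine, set MEMINFO=/proc/meminfo.
MEMINFO=$(mktemp)
cat > "$MEMINFO" <<'EOF'
MemTotal:       35129364 kB
MemFree:          817100 kB
Cached:         32674868 kB
Dirty:             10664 kB
Mapped:             7012 kB
Slab:             855516 kB
HugePages_Total:       0
Hugepagesize:       2048 kB
DirectMap4k:    35848192 kB
DirectMap2M:           0 kB
EOF

# Keep only the fields under discussion; MemTotal, MemFree and Slab are
# filtered out by the pattern.
summary=$(grep -E '^(Mapped|Cached|Dirty|HugePages_Total|Hugepagesize|DirectMap)' "$MEMINFO")
echo "$summary"
rm -f "$MEMINFO"
```

Running the same grep on each box side by side makes the Mapped and DirectMap differences Richard mentions easy to compare.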
