Hi Andy,

Just a quick update:

I tried playing with huge pages and I could see that they weren't getting
used by the load - the load slowed to a crawl at c. 60 million triples
again. Turning them off (echo 0 > /proc/sys/vm/nr_hugepages) promptly
sped the load back up. It appears that enabling hugepages prevents the
Mapped value from growing larger than the amount of memory left over
after the hugepage allocation (4Gb in my case) - I had set nr_hugepages
to 15360 (15360 * 2048 kB = 31457280 kB = 30Gb of RAM) ... If only I had
read the man pages first :).
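
For anyone following along, this is roughly what I did (as root; the grep
is just a convenient way to eyeball the hugepage counters, and the change
doesn't survive a reboot):

# check how much memory is reserved for huge pages
grep -i huge /proc/meminfo
# release the reservation so the page cache gets the memory back
echo 0 > /proc/sys/vm/nr_hugepages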

I suspect that further down the line the Mapped values will hit another
limit - both machines are at around 6.5Gb Mapped at 100 million triples
at the moment.
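
For the record, I'm keeping an eye on it with something along these
lines (nothing clever - any interval would do):

watch -n 60 'grep -E "Mapped|Dirty" /proc/meminfo'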

Rich

On Thu, Jun 16, 2011 at 8:50 AM, Richard Francis <[email protected]> wrote:

> Sorry Andy,
>
> The resulting data set is 94Gb for both.
>
> Rich
>
>
> On Thu, Jun 16, 2011 at 8:45 AM, Richard Francis <[email protected]> wrote:
>
>> Hi Andy,
>>
>> Both file systems are the same - we're using the ephemeral storage on
>> the EC2 nodes - both are ext3:
>>
>> Ubuntu:
>>
>> df -Th /mnt
>> Filesystem    Type    Size  Used Avail Use% Mounted on
>> /dev/xvdb     ext3    827G  240G  545G  31% /mnt
>>
>> Centos:
>>
>> df -Th /mnt
>> Filesystem    Type    Size  Used Avail Use% Mounted on
>> /dev/sdb      ext3    827G  191G  595G  25% /mnt
>>
>> Both the input ntriples and the output indexes are written to this
>> partition.
>> meminfo does show some differences - I believe mainly because the
>> Ubuntu instance runs a later kernel (2.6.38-8-virtual vs.
>> 2.6.16.33-xenU). There does seem to be a difference between the Mapped
>> values, and I think I should investigate the HugePages & DirectMap
>> settings.
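>>
>> A quick way to pull out just the lines in question on each box (plain
>> grep; note the CentOS output below has no HugePages_* or DirectMap
>> lines at all):
>>
>> grep -E 'HugePages|DirectMap|Mapped' /proc/meminfo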
>>
>> Ubuntu:
>> cat /proc/meminfo
>> MemTotal:       35129364 kB
>> MemFree:          817100 kB
>> Buffers:           70780 kB
>> Cached:         32674868 kB
>> SwapCached:            0 kB
>> Active:         17471436 kB
>> Inactive:       15297084 kB
>> Active(anon):      25752 kB
>> Inactive(anon):       44 kB
>> Active(file):   17445684 kB
>> Inactive(file): 15297040 kB
>> Unevictable:        3800 kB
>> Mlocked:            3800 kB
>> SwapTotal:             0 kB
>> SwapFree:              0 kB
>> Dirty:             10664 kB
>> Writeback:             0 kB
>> AnonPages:         26808 kB
>> Mapped:             7012 kB
>> Shmem:               176 kB
>> Slab:             855516 kB
>> SReclaimable:     847652 kB
>> SUnreclaim:         7864 kB
>> KernelStack:         680 kB
>> PageTables:         2044 kB
>> NFS_Unstable:          0 kB
>> Bounce:                0 kB
>> WritebackTmp:          0 kB
>> CommitLimit:    17564680 kB
>> Committed_AS:      39488 kB
>> VmallocTotal:   34359738367 kB
>> VmallocUsed:      114504 kB
>> VmallocChunk:   34359623800 kB
>> HardwareCorrupted:     0 kB
>> HugePages_Total:       0
>> HugePages_Free:        0
>> HugePages_Rsvd:        0
>> HugePages_Surp:        0
>> Hugepagesize:       2048 kB
>> DirectMap4k:    35848192 kB
>> DirectMap2M:           0 kB
>>
>> ***************************************************
>> Centos:
>> cat /proc/meminfo
>> MemTotal:     35840000 kB
>> MemFree:         31424 kB
>> Buffers:        166428 kB
>> Cached:       34658344 kB
>> SwapCached:          0 kB
>> Active:        1033384 kB
>> Inactive:     33803304 kB
>> HighTotal:           0 kB
>> HighFree:            0 kB
>> LowTotal:     35840000 kB
>> LowFree:         31424 kB
>> SwapTotal:           0 kB
>> SwapFree:            0 kB
>> Dirty:             220 kB
>> Writeback:           0 kB
>> Mapped:          17976 kB
>> Slab:           223256 kB
>> CommitLimit:  17920000 kB
>> Committed_AS:    38020 kB
>> PageTables:       1528 kB
>> VmallocTotal: 34359738367 kB
>> VmallocUsed:       164 kB
>> VmallocChunk: 34359738203 kB
>>
>> Thanks,
>> Rich
>>
>> On Wed, Jun 15, 2011 at 10:07 PM, Andy Seaborne <
>> [email protected]> wrote:
>> >
>> > > So my questions are: has anyone else observed this? Can anyone
>> > > suggest any further improvements - or things to try? What is the
>> > > best OS to perform a tdbload on?
>> >
>> > Richard - very useful feedback, thank you.
>> >
>> > I haven't come across this before - and the difference is quite
>> > surprising.
>> >
>> > What is the "mapped" value on each machine?
>> > Could you "cat /proc/meminfo"?
>> >
>> > TDB is using memory mapped files - I'm wondering if the amount of
>> > RAM available to the processes is different in some way.  Together
>> > with the parameters you have found to have an effect, this might
>> > explain it (speculation I'm afraid).
>> >
>> > Is the filesystem the same?
>> > How big is the resulting dataset?
>> >
>> > (sorry for all the questions!)
>> >
>> > tdbloader2 works differently from tdbloader even during the data
>> > phase.  It seems like it is the B+trees slowing down; there is only
>> > one in tdbloader2 phase one, but two in tdbloader phase one.  That
>> > might explain the roughly 80 -> 150 million difference (i.e. x2).
>> >
>> >        Andy
>> >
>> > On 15/06/11 16:23, Richard Francis wrote:
>> >>
>> >> Hi,
>> >>
>> >> I'm using two identical machines in EC2 running tdbloader on CentOS
>> >> (CentOS release 5 (Final)) and Ubuntu 11.04 (Natty).
>> >>
>> >> I've observed an issue where CentOS will run happily at a consistent
>> >> speed and complete a load of 650 million triples in around 12 hours,
>> >> whereas the load on Ubuntu tails off after just 15 million triples
>> >> and runs ever more slowly from there.
>> >>
>> >> On initial observation of the Ubuntu machine I noticed that the
>> >> flush-202 process was running quite high; running iostat also showed
>> >> that IO was the real bottleneck - the Ubuntu machine showed constant
>> >> use of the disk for both reads and writes (the CentOS machine had
>> >> periods of no usage followed by periods of writes). This led me to
>> >> investigate how memory was being used by the Ubuntu machine - and a
>> >> few blog posts / tutorials later I found a couple of settings to
>> >> tweak. The first I tried was dirty_writeback_centisecs - setting
>> >> this to 0 had an immediate positive effect on the load I was
>> >> performing - but after some more testing I found that the problem
>> >> was just pushed back to around 80 million triples before the
>> >> performance dropped off.
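>> >>
>> >> For reference, the tweak was along these lines (as root; the change
>> >> doesn't persist across a reboot):
>> >>
>> >> echo 0 > /proc/sys/vm/dirty_writeback_centisecs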
>> >>
>> >> This led me to investigate whether there was the same issue with
>> >> tdbloader2 - from my observations I got the same problem, but this
>> >> time at around 150 million triples.
>> >>
>> >> Again I focused on the "dirty" settings - this time tweaking
>> >> dirty_bytes = 30000000000 and dirty_background_bytes = 15000000000
>> >> gave a massive performance increase, and for most of the add phase
>> >> of tdbloader it kept up with the CentOS machine.
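>> >>
>> >> Roughly what I set (sysctl -w vm.dirty_bytes=... would do the same;
>> >> note the kernel zeroes the corresponding *_ratio settings once the
>> >> *_bytes ones are set):
>> >>
>> >> echo 30000000000 > /proc/sys/vm/dirty_bytes
>> >> echo 15000000000 > /proc/sys/vm/dirty_background_bytes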
>> >>
>> >> Finally, last night I stopped all loads and raced the CentOS machine
>> >> against the Ubuntu machine - both have completed - but the CentOS
>> >> machine (around 12 hours) was still far quicker than the Ubuntu
>> >> machine (20 hours).
>> >>
>> >> So my questions are: has anyone else observed this? Can anyone
>> >> suggest any further improvements - or things to try? What is the
>> >> best OS to perform a tdbload on?
>> >>
>> >> Rich
>> >>
>> >>
>> >> Tests were performed on three different machines, 1x CentOS and 2x
>> >> Ubuntu, to rule out EC2 being a bottleneck - all were (from
>> >> http://aws.amazon.com/ec2/instance-types/):
>> >>
>> >> High-Memory Double Extra Large Instance
>> >>
>> >> 34.2 GB of memory
>> >> 13 EC2 Compute Units (4 virtual cores with 3.25 EC2 Compute Units each)
>> >> 850 GB of instance storage
>> >> 64-bit platform
>> >> I/O Performance: High
>> >> API name: m2.2xlarge
>> >>
>> >> All machines are configured with no swap.
>> >>
>> >> Here's the summary from the only completed load on Ubuntu:
>> >>
>> >> ** Index SPO->OSP: 685,552,449 slots indexed in 18,337.75 seconds
>> >> [Rate: 37,384.76 per second]
>> >> -- Finish triples index phase
>> >> ** 685,552,449 triples indexed in 37,063.51 seconds
>> >> [Rate: 18,496.69 per second]
>> >> -- Finish triples load
>> >> ** Completed: 685,552,449 triples loaded in 78,626.27 seconds
>> >> [Rate: 8,719.13 per second]
>> >> -- Finish quads load
>> >>
>> >> Some resources I used:
>> >> http://www.westnet.com/~gsmith/content/linux-pdflush.htm
>> >> http://arighi.blogspot.com/2008/10/fine-grained-dirtyratio-and.html
>> >>
>>
>>
>
