Re: LOD Instance Update re. new Data Sets Loaded.

Yrjänä Rankka Fri, 06 Mar 2009 02:07:52 -0800

Dan Brickley wrote:

On 6/3/09 10:21, Yrjänä Rankka wrote:
Georgi Kobilarov wrote:
Hi Kingsley,
DESCRIBE <http://dbpedia.org/resource/London> takes 3 minutes to
execute on lod.openlinksw.com ...
It took only a few seconds when I tried it. Takes time to warm up a pan
of this size, as is the case with any DBMS. As the working set
stabilizes in memory, results will come faster.
What's the granularity of the warmup? If eg /resource/Paris hasn't been directly viewed, will it benefit much from general warmup of related resources that are mentioned in the queries for that entity?

Very likely so. Also in case of DESCRIBE <http://dbpedia.org/resource/London> the result of ~ 13MB takes a while to transfer as well. Though not quite 3 minutes - at least not through the pipe I'm connected to.

Here's the explanation of how the read-ahead works straight from the horse's mouth:

In general, looking for resources in a data set improves the working set for that data set. There is some locality based on load order etc.

The disk format is 8K pages, 256 pages per extent of 2MB. It is 8 disks and 16 server processes, so disk is too narrow. Disk reads are in general in parallel on all disks.
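
To make the page and extent figures concrete, here is a minimal sketch of the arithmetic. It is not Virtuoso code: the constant and function names are illustrative assumptions, and only the 8K page size and 256 pages per extent come from the description above.

```python
# Illustrative only: names are assumptions, not Virtuoso internals.
PAGE_SIZE = 8 * 1024        # 8K pages, as stated above
PAGES_PER_EXTENT = 256      # 256 pages per extent, as stated above
EXTENT_SIZE = PAGE_SIZE * PAGES_PER_EXTENT  # works out to 2MB


def extent_of(page_no: int) -> int:
    """Return the number of the extent that contains the given page."""
    return page_no // PAGES_PER_EXTENT


def extent_page_range(extent_no: int) -> range:
    """Return the page numbers covered by one extent."""
    start = extent_no * PAGES_PER_EXTENT
    return range(start, start + PAGES_PER_EXTENT)
```

Two pages "hit the same extent" when extent_of() gives the same value for both, which is the condition the read-ahead described next depends on.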

The random access transfer unit is 8K but if you get two reads hitting the same extent within a second of each other, the whole extent is read sequentially instead of the second single page request. So frequency of access drives bulk prefetching. Then there are cache maintenance policies that differ between just prefetched and actually requested pages. This is a tunable tradeoff between disk throughput and cache pollution.
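
The read-ahead rule described above can be sketched as a toy model. This is a simplification under stated assumptions, not the actual Virtuoso implementation: the class name, the "requested"/"prefetched" labels and the exact one-second comparison are all hypothetical, and no eviction policy is modelled.

```python
# Toy model of the rule above; every name here is an assumption.
PAGES_PER_EXTENT = 256
PREFETCH_WINDOW_S = 1.0  # "within a second of each other"


class ExtentPrefetcher:
    def __init__(self):
        self.last_read = {}  # extent number -> time of last disk read
        self.cache = {}      # page number -> "requested" or "prefetched"

    def read(self, page_no, now):
        """Return "hit", "page" (single 8K read) or "extent" (bulk read)."""
        if page_no in self.cache:
            # A prefetched page that is actually used becomes "requested".
            self.cache[page_no] = "requested"
            return "hit"
        extent = page_no // PAGES_PER_EXTENT
        prev = self.last_read.get(extent)
        self.last_read[extent] = now
        if prev is not None and now - prev <= PREFETCH_WINDOW_S:
            # Second miss on this extent within the window:
            # read the whole extent instead of the single page.
            start = extent * PAGES_PER_EXTENT
            for p in range(start, start + PAGES_PER_EXTENT):
                self.cache.setdefault(p, "prefetched")
            self.cache[page_no] = "requested"
            return "extent"
        self.cache[page_no] = "requested"
        return "page"
```

Keeping the "prefetched" label separate from "requested" is what would let a cache policy drop speculative pages first, which is the throughput versus cache pollution tradeoff mentioned above.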

Virtuoso IO is clever enough. But the fact is that running from memory is 1000+ times faster than from disk on a random access workload and RDF is the very essence of random access.

cheers

Dan

Yrjänä

--
Yrjana Rankka            | [email protected]
Developer, Virtuoso Team | http://www.openlinksw.com

| Making Technology Work For You
