Re: [Owlim-discussion] BigOWLIM configuration

Peter Kostelnik, PhD.
Mon, 31 May 2010 05:47:58 -0700

well .. for the default config (+ build-pcsot=true), the run was:
./import <opts> -Xmx1400m

the memory consumption chart showed a slow increase, which then levelled
off between approx. 1200m and 1400m until the end of the process ..
after approx. 8 hours, all the data were loaded ..

for the configuration below, the run was:
./import <opts> -Dcache-memory=7000m -Xmx9000m

after half an hour, the memory consumption chart showed an increase to
nearly 9000m, and then the consumption jumped between X and 9000m, where X
kept increasing up to 9000m, and then came the heap-space error :) ..
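
as a rough sanity check on the numbers for this second run, here is a small
Java sketch of the heap arithmetic. It assumes the cache really takes the
full 7000m (as suggested in the reply quoted below) and simply subtracts
that from -Xmx. The class and method names are made up for illustration;
nothing here is part of BigOWLIM or Sesame.

```java
import java.util.Locale;

/** Back-of-the-envelope heap arithmetic; hypothetical helper, not a BigOWLIM API. */
final class HeapHeadroom {

    /** Parses sizes such as "384M", "7000m" or "2g" into megabytes. */
    static long parseMegabytes(String size) {
        String s = size.trim().toLowerCase(Locale.ROOT);
        if (s.isEmpty()) {
            throw new IllegalArgumentException("empty size");
        }
        char unit = s.charAt(s.length() - 1);
        long value = Long.parseLong(s.substring(0, s.length() - 1));
        switch (unit) {
            case 'm':
                return value;
            case 'g':
                return value * 1024;
            default:
                throw new IllegalArgumentException("unsupported unit in: " + size);
        }
    }

    /** Heap left for everything that is not cache: -Xmx minus cache-memory. */
    static long headroomMegabytes(String xmx, String cacheMemory) {
        return parseMegabytes(xmx) - parseMegabytes(cacheMemory);
    }

    public static void main(String[] args) {
        // Second run: -Dcache-memory=7000m -Xmx9000m
        System.out.println("headroom: " + headroomMegabytes("9000m", "7000m") + "m");
        // prints "headroom: 2000m"
    }
}
```

so under that assumption about 2000m of heap is left for everything outside
the cache during the import.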

what do you think? .. cheers,
                Peter K.


> So which one of the configurations seems to consume too much memory?
> When you specify -Dcache-memory=7000m this means that 7000m will be
> allocated for cache usage (different cache purposes). Do you see a lot
> more than that?
>
>
> On Monday 31 May 2010 15:24:20 Peter Kostelnik, PhD. wrote:
>> hi,
>>
>> thanks, ivan, the current configuration is as follows (right now we are
>> experimenting with various configs) ::
>>
>> "ruleset" -> "empty",
>> "build-pcsot" -> "true",
>> "build-ptsoc" -> "true", // we'll try to turn it off
>> "cache-memory" -> "384M", // the point for experiments
>> "predicate-memory" -> "56M", // queries don't use wildcard predicates
>> "enable-optimization" -> "true",
>> "entity-index-size" -> "20000000",
>> "fts-memory" -> "0M", // we are building our own lucene index
>> "storage-folder" -> "bigowlim-store",
>> "repository-type" -> "file-repository",
>> "console-thread" -> "false"
>>
>> last time, the run was something like:
>> ./import <options> -Dcache-memory=7000m -Xmx9000m -Xms256m
>>
>> all data are imported through conn.add(x, y, z, ctx1, ctx2) in a single
>> transaction committed when all data are inside ..
>>
>> (btw, on 3.2.a7 snapshot we were able to get them in with practically
>> default configuration)
>>
>> thanks in advance,
>>                            Peter K.
>>
>> > Hi Peter,
>> >
>> > The tripleset component is a fifth dimension of the RDF space (subject,
>> > predicate, object and context being the first four). This component
>> > cannot be used through the Sesame API, which is why you never knew it
>> > existed. In short, you don't need to enable this index; it won't help.
>> >
>> > You need to tune the memory parameters (e.g. tuple-index-memory,
>> > predicate-memory, cache-memory, etc.) if you experience memory issues.
>> > Can you post here your current configuration and also how much memory
>> > it requires to load 100M statements?
>> >
>> >
>> > Cheers,
>> > Ivan
>> >
>> > On Monday 31 May 2010 15:03:32 Peter Kostelnik, PhD. wrote:
>> >> hi there ..
>> >>
>> >> I'd like to ask what exactly the tripleset is, for which the ptsoc
>> >> index can be built? .. we are wondering which indices to build when
>> >> loading the data (as we often use context-based querying, I know
>> >> that we have to build pcsot for sure .. )
>> >>
>> >> btw, which configuration parameters can affect the loading of data
>> >> (e.g. enable-optimization?)? .. we've run into strange memory
>> >> consumption behaviour when loading approx. 100 million statements ..
>> >>
>> >> thanks in advance,
>> >>                       Peter K.
>> >>
>> >> _______________________________________________
>> >> OWLIM-discussion mailing list
>> >> OWLIM-discussion@ontotext.com
>> >> http://ontotext.com/mailman/listinfo/owlim-discussion
>

