Hi!

Sure, the 5.6M titles in a HashMap take about 1.3-1.5 GB of RAM, so I run
the whole Stanbol instance with -Xmx2500M without issues.

In earlier iterations I used ehcache with sophisticated custom hit
and miss handlers to save memory, but I had to realize that this creates
more performance issues than it solves in everyday setups, so I gave
up on that.
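
For readers who want a concrete picture of the plain in-memory approach,
here is a minimal sketch. It is not the actual BookSpotter code: the class
name TitleIndexSketch, the normalization rule, the word-window limit and
the example URN are all assumptions made for illustration, and the author
check described in the announcement is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch only; names and behaviour are assumptions,
// not taken from the BookSpotter sources.
class TitleIndexSketch {
    // normalized title -> resource URI, all held on the heap
    private final Map<String, String> titleToUri = new HashMap<>();
    // longest title, in words, that spot() will try to match
    private final int maxTitleWords;

    TitleIndexSketch(int maxTitleWords) {
        this.maxTitleWords = maxTitleWords;
    }

    void add(String title, String uri) {
        titleToUri.put(normalize(title), uri);
    }

    // Lowercase and collapse every run of non-letter, non-digit
    // characters into a single space.
    static String normalize(String s) {
        return s.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", " ")
                .trim();
    }

    // Returns the URIs of all indexed titles that occur in the text
    // as a sequence of consecutive words.
    List<String> spot(String text) {
        List<String> hits = new ArrayList<>();
        String norm = normalize(text);
        if (norm.isEmpty()) {
            return hits;
        }
        String[] words = norm.split(" ");
        for (int start = 0; start < words.length; start++) {
            StringBuilder candidate = new StringBuilder();
            for (int len = 1;
                 len <= maxTitleWords && start + len <= words.length;
                 len++) {
                if (len > 1) {
                    candidate.append(' ');
                }
                candidate.append(words[start + len - 1]);
                String uri = titleToUri.get(candidate.toString());
                if (uri != null) {
                    hits.add(uri);
                }
            }
        }
        return hits;
    }
}
```

Every lookup here is a single HashMap.get() with no cache layer in
between, which is the trade-off described above: more heap, but no hit
and miss handling to tune.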

Cheers
Mihály

On 3 September 2012 15:58, Anuj Kumar <anujs...@gmail.com> wrote:
> Hi Mihály,
>
> Thanks a lot for sharing this. Looks good.
>
> I was curious to know the memory requirements to load the 5.6 million titles
> and to run the whole system. If you have any stats, can you please share
> them?
>
> Regards,
> Anuj
>
> On Mon, Sep 3, 2012 at 7:14 PM, Mihály Héder <hederm...@gmail.com> wrote:
>
>> Hi!
>>
>> Let me introduce the BookSpotter Enhancement Engine by Sztaki:
>>
>> http://blog.iks-project.eu/introducing-bookspotter-enhancement-engine-by-sztaki/
>>
>> BookSpotter uses a selection of 5.6M titles from the British National
>> Bibliography (BNB) and the Open Library (OL).
>> It scans the incoming text, looking for titles, and if the author
>> is also mentioned, it produces the corresponding entity annotations
>> that refer to the proper resource URIs of either BNB or OL.
>>
>> You can check the system out here:
>> http://pedia2.sztaki.hu:9090/enhancer/chain/bookspotter
>>
>> Thanks to the Early Adopter Program, I was able to buy some student
>> work hours for data cleaning and for some basic testing.
>> You might want to read the report on our test set of 25 tests:
>> http://pedia2.sztaki.hu/stanbol/bookspotter/Bookspotter_tests.pdf
>>
>> For details, see the blog post!
>>
>> Any comments are much appreciated!
>> Cheers,
>> Mihály
>>
