That's great! Thanks for the info.

Regards,
Anuj

On Mon, Sep 3, 2012 at 8:49 PM, Mihály Héder <hederm...@gmail.com> wrote:

> Hi!
>
> Sure, the 5.6M titles in a HashMap take about 1.3-1.5 GB of RAM, so I run
> the whole Stanbol with -Xmx2500M without issues.
>
> In earlier iterations I used ehcache + sophisticated custom hit
> and miss handlers to save memory, but I had to realize that it creates
> more performance issues than it solves in everyday setups, so I gave
> up on that.
>
> Cheers
> Mihály
>
> On 3 September 2012 15:58, Anuj Kumar <anujs...@gmail.com> wrote:
> > Hi Mihály,
> >
> > Thanks a lot for sharing this. Looks good.
> >
> > I was curious to know the memory requirements to load the 5.6 million
> > titles and to run the whole system. If you have any stats, can you
> > please share them?
> >
> > Regards,
> > Anuj
> >
> > On Mon, Sep 3, 2012 at 7:14 PM, Mihály Héder <hederm...@gmail.com> wrote:
> >
> >> Hi!
> >>
> >> Let me introduce the BookSpotter Enhancement Engine by Sztaki:
> >>
> >> http://blog.iks-project.eu/introducing-bookspotter-enhancement-engine-by-sztaki/
> >>
> >> Bookspotter uses a selection of 5.6M titles from the British National
> >> Bibliography and the Open Library.
> >> It scans the incoming text, looking for titles, and in case the author
> >> is also mentioned, it produces the corresponding entity annotations
> >> that refer to the proper resource URIs of either BNB or OL.
> >>
> >> You can check the system out here:
> >> http://pedia2.sztaki.hu:9090/enhancer/chain/bookspotter
> >>
> >> Thanks to the Early Adopter Program, I was able to buy some student
> >> work hours for data cleaning and for some basic testing.
> >> You might want to read the report on our test set of 25 tests:
> >> http://pedia2.sztaki.hu/stanbol/bookspotter/Bookspotter_tests.pdf
> >>
> >> For details, see the blog post!
> >>
> >> Any comments are much appreciated!
> >> Cheers,
> >> Mihály
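
For anyone reading this thread later, here is a rough sketch of the idea described in the announcement: keep the titles in a HashMap, scan the text for word sequences that match a title, and only emit a resource URI when the author is mentioned in the same text. This is not the actual BookSpotter code. The class name `TitleSpotterSketch`, the `Book` holder, the `spot` method, the `maxWords` limit and the example.org URI are all made up for illustration, and the normalization (lowercasing, splitting on non-letter characters) is just an assumption about how such matching could be done.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Illustrative sketch only; not the BookSpotter implementation.
final class TitleSpotterSketch {

    /** Hypothetical catalogue entry: the author's name and a resource URI. */
    static final class Book {
        final String author;
        final String uri;

        Book(String author, String uri) {
            this.author = author;
            this.uri = uri;
        }
    }

    /**
     * Returns the URIs of all titles found in the text whose author is also
     * mentioned somewhere in the text. Keys of the map are assumed to be
     * lowercase titles with words separated by single spaces.
     */
    static List<String> spot(String text, Map<String, Book> titles, int maxWords) {
        String lowerText = text.toLowerCase(Locale.ROOT);
        // Split on anything that is not a letter or a digit (Unicode-aware).
        String[] words = lowerText.split("[^\\p{L}\\p{N}]+");
        List<String> uris = new ArrayList<>();

        for (int start = 0; start < words.length; start++) {
            StringBuilder candidate = new StringBuilder();
            // Grow the candidate phrase one word at a time, up to maxWords.
            for (int len = 1; len <= maxWords && start + len <= words.length; len++) {
                if (len > 1) {
                    candidate.append(' ');
                }
                candidate.append(words[start + len - 1]);

                Book book = titles.get(candidate.toString());
                // Only report the title when its author also appears in the text.
                if (book != null
                        && lowerText.contains(book.author.toLowerCase(Locale.ROOT))) {
                    uris.add(book.uri);
                }
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        Map<String, Book> titles = new HashMap<>();
        // Placeholder URI; the real engine refers to BNB or Open Library resources.
        titles.put("moby dick", new Book("Herman Melville", "http://example.org/book/1"));

        System.out.println(
                spot("I just finished Moby Dick by Herman Melville.", titles, 5));
    }
}
```

With a map of 5.6M entries, each lookup is still a single hash probe, which is presumably why the plain in-memory HashMap works well once the heap is large enough (the -Xmx2500M mentioned above).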