Hi,

Nice work, and thanks for sharing. You have quite a large store of book titles, around 5.6 million. Why is the recall only around 50%? Are the dropped titles (60-28-13=19) missing from the book bank? Are you trying any more heuristics to reduce the false positives?

Thanks,
Harish

On Wed, Sep 5, 2012 at 2:22 AM, Fabian Christ <christ.fab...@googlemail.com> wrote:

> Hi,
>
> nice engine ;) Thanks for sharing!
>
> Best,
> - Fabian
>
> 2012/9/3 Anuj Kumar <anujs...@gmail.com>:
> > That's great! Thanks for the info.
> >
> > Regards,
> > Anuj
> >
> > On Mon, Sep 3, 2012 at 8:49 PM, Mihály Héder <hederm...@gmail.com> wrote:
> >
> >> Hi!
> >>
> >> Sure, the 5.6M titles in a HashMap take about 1.3-1.5 GB of RAM, so I run
> >> the whole Stanbol with -Xmx2500M without issues.
> >>
> >> In earlier iterations I used ehcache + sophisticated custom hit
> >> and miss handlers to save memory, but I had to realize that it creates
> >> more performance issues than it solves in everyday setups, so I gave
> >> up on that.
> >>
> >> Cheers
> >> Mihály
> >>
> >> On 3 September 2012 15:58, Anuj Kumar <anujs...@gmail.com> wrote:
> >> > Hi Mihály,
> >> >
> >> > Thanks a lot for sharing this. Looks good.
> >> >
> >> > I was curious to know the memory requirements to load the 5.6 million
> >> > titles and to run the whole system. If you have any stats, can you
> >> > please share them?
> >> >
> >> > Regards,
> >> > Anuj
> >> >
> >> > On Mon, Sep 3, 2012 at 7:14 PM, Mihály Héder <hederm...@gmail.com> wrote:
> >> >
> >> >> Hi!
> >> >>
> >> >> Let me introduce the BookSpotter Enhancement Engine by Sztaki:
> >> >>
> >> >> http://blog.iks-project.eu/introducing-bookspotter-enhancement-engine-by-sztaki/
> >> >>
> >> >> Bookspotter uses a selection of 5.6M titles from the British National
> >> >> Bibliography and the Open Library.
> >> >> It scans the incoming text, looking for titles, and in case the author
> >> >> is also mentioned, it produces the corresponding entity annotations
> >> >> that refer to the proper resource URIs of either BNB or OL.
> >> >>
> >> >> You can check the system out here:
> >> >> http://pedia2.sztaki.hu:9090/enhancer/chain/bookspotter
> >> >>
> >> >> Thanks to the Early Adopter Program, I was able to buy some student
> >> >> work hours for data cleaning and for some basic testing.
> >> >> You might want to read the report on our test set of 25 tests:
> >> >> http://pedia2.sztaki.hu/stanbol/bookspotter/Bookspotter_tests.pdf
> >> >>
> >> >> For details, see the blog post!
> >> >>
> >> >> Any comments are much appreciated!
> >> >> Cheers,
> >> >> Mihály
> >> >>
> >> >
> >
>
> --
> Fabian
> http://twitter.com/fctwitt
>

--
Thanks
Harish
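
For readers following the thread: the announcement describes titles held in an in-memory HashMap, with an annotation produced only when the author is also mentioned in the text. A minimal sketch of that idea might look like the following. It is an illustration only, not BookSpotter's actual code: the class name, method names, the word-window scan, the normalisation rule and the urn:example URIs are all assumptions made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch of title spotting with an author check; not the real engine. */
final class TitleSpotterSketch {

    /** Catalogue entry: author name and resource URI (illustrative values only). */
    private static final class Book {
        final String author;
        final String uri;
        Book(String author, String uri) { this.author = author; this.uri = uri; }
    }

    // Normalised title -> book. The thread mentions ~5.6M titles held this way.
    private final Map<String, Book> titles = new HashMap<>();
    private final int maxTitleWords; // assumed upper bound on title length, in words

    public TitleSpotterSketch(int maxTitleWords) {
        this.maxTitleWords = maxTitleWords;
    }

    public void add(String title, String author, String uri) {
        titles.put(normalise(title), new Book(author, uri));
    }

    // Lower-case and collapse every run of non-letter, non-digit characters to one space.
    private static String normalise(String s) {
        return s.toLowerCase(Locale.ROOT).replaceAll("[^\\p{L}\\p{N}]+", " ").trim();
    }

    /** Returns the URIs of books whose title and author both occur in the text. */
    public List<String> spot(String text) {
        String norm = normalise(text);
        String[] words = norm.split(" ");
        List<String> uris = new ArrayList<>();
        for (int start = 0; start < words.length; start++) {
            StringBuilder candidate = new StringBuilder();
            // Grow the candidate one word at a time and look each prefix up in the map.
            for (int len = 1; len <= maxTitleWords && start + len <= words.length; len++) {
                if (len > 1) candidate.append(' ');
                candidate.append(words[start + len - 1]);
                Book book = titles.get(candidate.toString());
                // Naive author check: plain substring match anywhere in the text.
                if (book != null && norm.contains(normalise(book.author))) {
                    uris.add(book.uri);
                }
            }
        }
        return uris;
    }

    public static void main(String[] args) {
        TitleSpotterSketch spotter = new TitleSpotterSketch(5);
        spotter.add("Brave New World", "Aldous Huxley", "urn:example:book-1");
        spotter.add("Emma", "Jane Austen", "urn:example:book-2");
        System.out.println(spotter.spot("I re-read Brave New World by Aldous Huxley. Emma came along."));
    }
}
```

Requiring the author next to the title is what keeps short, common titles (a single word such as "Emma") from firing on ordinary text, at the cost of missing titles that are mentioned without their author, so a check of this kind trades recall for fewer false positives.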