Hi David, I managed to find the kore50 corpus but not the milne-witten one. Do you know if it's still publicly available?
In order to test the time performance of each phase, I was thinking of using the available endpoints:

1. spot
2. candidates
3. disambiguate
4. annotate

Because using the disambiguate endpoint would require me to provide NE annotations in my call, I was thinking of using the annotate endpoint instead and subtracting the time consumed by the candidates endpoint, in order to get the time consumed by the disambiguation phase. Would such logic be correct with respect to the implementation? Is there any other phase in the pipeline (between disambiguation and annotation) which might affect this logic?

If I understood correctly, the pipeline consists of the processing done by each of the endpoints, in the order I've listed them above. Please let me know if that is not the case.

Thank you in advance,
Pajolma

----- Original Message -----
From: "David Przybilla" <[email protected]>
To: "Pajolma Rupi" <[email protected]>
Cc: [email protected]
Sent: Tuesday, June 2, 2015 6:45:19 PM
Subject: Re: [Dbp-spotlight-users] Time performance for each phase

Hi Pajolma,

As far as I know there are no separate evaluations out of the box, but you could use the milne-witten corpus to evaluate the spotter and the disambiguation separately. In my experience problems are usually related to spotting: surface forms which are not in the models, or surface forms without enough probability.
There is also a specific corpus for evaluating disambiguation (kore50).

On Tue, Jun 2, 2015 at 1:58 PM, Pajolma Rupi <[email protected]> wrote:

<blockquote>
Dear all,

I was not able to find any information regarding the time performance of the Spotlight service for each of the phases separately: phrase spotting (candidate generation, candidate selection), disambiguation, indexing. There are some numbers in the paper "Improving efficiency and accuracy in multilingual entity extraction", but they are calculated in the context of the whole annotation process, whereas I'm interested in knowing during which specific phase the service performs better and during which phase it performs worse.

Could you please let me know if such information already exists? I would also be interested in knowing if I can produce such information by running my own local instance of Spotlight (I'm using Java to annotate text).

Thank you in advance,
Pajolma

------------------------------------------------------------------------------
_______________________________________________
Dbp-spotlight-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbp-spotlight-users
</blockquote>
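
The subtraction idea discussed in this thread could be sketched roughly as follows. This is only an illustration under several assumptions, not a confirmed description of how Spotlight works internally: it assumes a local instance reachable at http://localhost:2222/rest (adjust to your setup), that the spot, candidates and annotate endpoints accept a form-encoded `text` parameter and can return JSON, and above all that each endpoint repeats the work of the previous one, which is exactly the open question raised above. The helper names (`call_endpoint`, `median_seconds`, `estimate_phase_times`) are made up for this sketch.

```python
import statistics
import time
import urllib.parse
import urllib.request

# Assumption: a local Spotlight instance at this address; adjust to your setup.
BASE_URL = "http://localhost:2222/rest"


def call_endpoint(endpoint, text, base_url=BASE_URL):
    """POST `text` to one endpoint (e.g. "spot") and return the raw body."""
    data = urllib.parse.urlencode({"text": text}).encode("utf-8")
    request = urllib.request.Request(
        f"{base_url}/{endpoint}",
        data=data,
        headers={"Accept": "application/json"},  # assumed to be supported
    )
    with urllib.request.urlopen(request) as response:
        return response.read()


def median_seconds(fn, repeats=5):
    """Call `fn` several times and return the median wall-clock time."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def estimate_phase_times(spot_s, candidates_s, annotate_s):
    """Subtract cumulative endpoint times to estimate per-phase times.

    Only meaningful if annotate = candidates + disambiguation and
    candidates = spot + candidate lookup, which is an assumption here.
    """
    return {
        "spotting": spot_s,
        "candidates": candidates_s - spot_s,
        "disambiguation": annotate_s - candidates_s,
    }


if __name__ == "__main__":
    text = "Berlin is the capital of Germany."
    timings = {
        name: median_seconds(lambda name=name: call_endpoint(name, text))
        for name in ("spot", "candidates", "annotate")
    }
    print(estimate_phase_times(
        timings["spot"], timings["candidates"], timings["annotate"]
    ))
```

Note that these are wall-clock times for the whole HTTP round trip, so request overhead and response serialization are included in every measurement and only partly cancel out in the subtraction. Using the median over several runs is meant to reduce the effect of warm-up and outliers.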
