Hey Zhiwei,
you can read more details here [1], but totalCount is the total number of
times a surface form occurs in Wikipedia, whether or not the occurrence is
annotated. Since it's computationally intensive to get this number, we use an
approximation, which can mean that some surface forms have no count or that
their count is too low. That is why annotatedCount can end up larger than
totalCount for some of them.
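
If you need consistent counts on your side, one possible workaround is to treat annotatedCount as a lower bound for totalCount when reading the file. The following is only a minimal sketch, not what the Spotlight code itself does: it assumes tab-separated lines of the form "surface form, annotatedCount, totalCount", and the helper names parse_sf_line and repaired_total are hypothetical.

```python
def parse_sf_line(line, sep="\t"):
    """Parse one line: surface form, annotatedCount, totalCount.

    Assumption: fields are tab-separated and the third field may be
    missing or empty when no approximate total was found.
    """
    fields = line.rstrip("\n").split(sep)
    surface_form = fields[0]
    annotated = int(fields[1])
    # A missing or empty third field is treated as "no count available".
    has_total = len(fields) > 2 and fields[2].strip() != ""
    total = int(fields[2]) if has_total else None
    return surface_form, annotated, total


def repaired_total(annotated, total):
    """Return a total that is never smaller than the annotated count.

    Every annotated occurrence is also an occurrence, so the true total
    is at least annotatedCount; an approximate total below that (or no
    total at all) is replaced by annotatedCount.
    """
    if total is None or total < annotated:
        return annotated
    return total


if __name__ == "__main__":
    # Made-up example lines, not real data from the file.
    for raw in ["Berlin\t120\t95\n", "Foo\t3\t\n", "Bar\t2\t10\n"]:
        sf, annotated, total = parse_sf_line(raw)
        print(sf, annotated, repaired_total(annotated, total))
```

Clamping like this keeps annotatedCount / totalCount at or below 1, at the cost of overestimating that ratio for the surface forms whose total was undercounted.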
Jo
[1] http://jodaiber.de/doc/entity.pdf
On Wed, Apr 24, 2013 at 9:38 AM, Cai Zhiwei <[email protected]> wrote:
> Hi Jo,
>
> I'm working on CreateSpotlightModel on google corpus. I've finished
> extending the DBpediaResourceSource and CandidateMapSource to read from
> google corpus. But I have been stuck in SurfaceFormSource. What's the
> meaning of "totalCounts" (the third field) in the "sfAndTotalCount" file? Why
> would some of them have annotatedCount > totalCount?
>
> Thanks for your time,
> Zhiwei
>
_______________________________________________
Dbpedia-gsoc mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dbpedia-gsoc