OpenRelevance Viewer (Orev)

Itamar Syn-Hershko Thu, 08 Jul 2010 08:51:15 -0700

Hi all,

Following a discussion with Robert, I have started working on a viewer application intended to make viewing and judgment of corpora and topics as easy as possible. The intention is to make this development as rapid as it can possibly be. I'm building this with .NET (NHibernate / ASP.NET MVC).

Following are several remarks and a high-level description. I'm interested in capturing some early feedback and ideas, but please note my intention is to start with something functional first.

While FILEFORMATS.txt defines file structures, since the viewer is working against a DB those will only be honored via export functions. See attached image for a domain model.

A corpus DB entry points to a FS path (could also be remote, via HTTP for example). The viewer, in turn, will load the files one by one, and the judgment will be saved with the Corpus ID, Topic ID and a string representation of the document filename. The first two are integers, and the document ID is defined as a string, so document file-names can use a base-24 ID representation for generated corpora (e.g. when exporting from a wiki-dump).
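As a rough illustration of that key structure, here is a minimal sketch. It is written in Python purely for brevity (the viewer itself is .NET), and the names `Judgment`, `relevant` and `judgment_key` are hypothetical, not part of the actual domain model.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Judgment:
    corpus_id: int    # integer ID of the corpus DB entry
    topic_id: int     # integer ID of the topic
    document_id: str  # string representation of the document file name
    relevant: bool    # hypothetical field: the judge's decision


def judgment_key(j: Judgment) -> Tuple[int, int, str]:
    """Return the (corpus, topic, document) triple that identifies a judgment."""
    return (j.corpus_id, j.topic_id, j.document_id)
```

Keeping the document ID as a string means the same triple works whether file names are human-readable or generated.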

Unlike what was stated in FILEFORMATS.txt, a corpus will not reside in a gzipped file.

The above approach may allow more than one person to judge the same document for the same topic at once, which is bad since it could waste the users' time (no need for double-judgment). I'll probably have to resolve this by implementing a HiLo-like mechanism (or pooling), but I'm leaving this for later.
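To make the pooling idea a bit more concrete, here is one possible sketch, again in Python for illustration only. It is a simple in-memory claim pool, not HiLo itself and not a decided design; the class `JudgmentPool` and its methods are assumptions. A real version would have to keep the claims in the DB.

```python
import threading
from typing import Dict, Iterable, List, Optional, Tuple

Key = Tuple[int, str]  # (topic_id, document_id)


class JudgmentPool:
    """Hands each pending (topic, document) pair to at most one judge at a time."""

    def __init__(self, pending: Iterable[Key]) -> None:
        self._lock = threading.Lock()
        self._pending: List[Key] = list(pending)
        self._claimed: Dict[Key, str] = {}

    def claim(self, user: str) -> Optional[Key]:
        """Reserve the next pending pair for `user`, or return None if none is left."""
        with self._lock:
            if not self._pending:
                return None
            key = self._pending.pop(0)
            self._claimed[key] = user
            return key

    def release(self, key: Key, user: str) -> None:
        """Put a skipped or abandoned pair back at the end of the pool."""
        with self._lock:
            if self._claimed.get(key) == user:
                del self._claimed[key]
                self._pending.append(key)

    def complete(self, key: Key, user: str) -> bool:
        """Mark a claimed pair as judged; returns False if `user` does not hold it."""
        with self._lock:
            if self._claimed.get(key) == user:
                del self._claimed[key]
                return True
            return False
```

With this shape, two judges calling `claim` at the same moment never receive the same pair, and a skipped document simply goes back into the pool via `release`.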

The web application will allow for submitting new topics per language, and for judging documents for a topic. The Judgment screen will show the topic at the top, navigation at the left, and the document in the rest of the screen. The user can choose "Relevant", "Irrelevant" or "Skip".

A user can filter by language, so he sees only topics relevant to him. Language filtering can be applied using a language string ("en-US") per topic and corpus.
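A minimal sketch of such a filter, in Python for illustration only. The `Topic` class and `topics_for_language` function are hypothetical names, and the case-insensitive comparison is my assumption (language tags such as "en-US" are conventionally compared without regard to case).

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Topic:
    topic_id: int
    title: str
    language: str  # language string such as "en-US"


def topics_for_language(topics: Iterable[Topic], language: str) -> List[Topic]:
    """Keep only the topics whose language string matches, ignoring case."""
    wanted = language.lower()
    return [t for t in topics if t.language.lower() == wanted]
```

The same comparison could be applied to corpora, since each corpus would carry a language string as well.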


That's about it for now. Looking forward to some feedback.


Itamar.
