The BFO team of ICube is offering a post-doctoral position in Strasbourg (France)
for 12 months starting in January 2015 within the NERD (Named Entities in
Relational Databases) research project.
Study of techniques coming from natural language processing, machine learning
and knowledge engineering for the extraction of named entities in sparse texts
in databases
Project Description
The overall goal of this research is to derive knowledge from unstructured and,
more importantly, unlabelled data in an unsupervised manner. For example, this
would allow the content of a database table to be categorised automatically,
so that higher-level processes can be built on top of that knowledge. Many
enterprise-level databases are very large; deriving such knowledge manually
would be prohibitively expensive, and the information needed would be
difficult to specify.
First we define the key difference between natural language documents and a
typical enterprise database. A webpage, or book, is generally characterised by
paragraphs of text, each containing natural language that follows specific
topics. In contrast to this, databases often contain fragmented information
such as addresses, descriptions, colours, labels, telephone numbers, etc. As
such, they contain information that is not accompanied by the additional
context that would be present within a natural language document and would be
exploited by current state-of-the-art algorithms (for example the contextual
information that would be derived from a sentence's structure or surrounding
words). Of course, databases may also contain natural language documents, but
this is the exception (in commercial databases) rather than the norm.
Nevertheless, the developed algorithms should be applicable to the spectrum of
documents that range between sparse (traditional business) databases and dense
(natural language) databases. Experiments on real-world databases are expected,
and access to an extensive corpus of data will be provided.
Expected Outcomes:
Proposal of a new method for identifying named entities in sparse text.
This method will be validated through publication in a high-level international
conference.
Final report including a technical description of the method and results of
experimental validation.
The partners of the project are:
the BFO group of ICube, specialized in data mining and knowledge
engineering;
the FDT group of LiLPa, specialized in natural language processing;
the Laboratoire Quantup, experienced in the application of machine learning
and pattern recognition methods to commercial database systems.
Candidates applying for this position should have a PhD in Computer Science
with a good background in Natural Language Processing, Machine Learning and
formal Knowledge Representation. Experience in Java programming and distributed
systems (for example using the MapReduce framework and/or Spark) is desirable. A
good knowledge of English is required, along with an intermediate level of
French.
Candidates should send an academic curriculum vitae (including a list of
publications and the names and contact details of two referees), along with a
cover letter.
Deadline for application: 31st October 2014
Expected starting date: January 2015 (flexible)
Contact: Delphine Bernhard ([email protected]) and Tom Lampert
([email protected])
Salary: 2200 € per month (net, not including income tax).
Location
The University of Strasbourg traces its roots back to 1538 and is the second
largest university in France. It is a member of the League of European
Research Universities, consistently features in world university rankings, and
is well known for its international-level research output. The ICube research
group brings together researchers of the University of Strasbourg, the CNRS
(Centre National de la Recherche Scientifique), the ENGEES and the INSA of
Strasbourg in the fields of engineering science, computer science and medical
science. With around 500 members and 14 research groups, ICube is a major
driving force for research in Strasbourg. The work will take place at ICube’s
offices in Illkirch, approximately fifteen minutes by public transport from the
centre of Strasbourg, a historic university city that is well connected to Paris,
Switzerland and Germany.