[Dbpedia-discussion] Two fully funded PhD positions on Answering Questions using Web Data

2015-02-08 Thread Saeedeh Shekarpour
Fraunhofer IAIS is pleased to announce two PhD positions, fully funded
by the EU research project “WDAqua: Answering questions using Web Data”,
which started in January 2015.


Research Area:


The project will undertake advanced fundamental and applied research into
models, methods, and tools for data-driven question answering on the Web,
spanning over a diverse range of areas and disciplines (data analytics,
data mining, information retrieval, social computing, cloud computing,
large-scale distributed computing, Linked Data, and Web science). Potential
topics for a PhD dissertation include, but are not limited to:


● Design of a cloud-based system architecture for question answering (QA),
extensible by plugins for all stages of the process of QA and Web data

● High-quality interpretation of voice input and natural language text as
database queries for question answering.

● Leveraging Web Data for advanced entity disambiguation and
contextualisation of queries given as natural language.

● Question answering methods using ecosystems of heterogeneous data sets
(structured, unstructured, linked, stream-like, uncertain).


Institution


The approximately 200 employees of the Fraunhofer Institute for Intelligent
Analysis and Information Systems (IAIS; http://www.iais.fraunhofer.de)
investigate and develop innovative systems for data analysis and
information management. Specific areas of competence include information
integration (represented by the IAIS department Organized Knowledge), big
data (department Knowledge Discovery), and multimedia technologies
(department NetMedia).


Requirements:

1. Master's degree in Computer Science (or equivalent).

2. You must not have resided or worked for more than 12 months in Germany
in the 3 years before starting to work.

3. Proficiency in spoken and written English. Proficiency in German is a
plus but not required.

4. Proficiency in programming languages such as Java/Scala or JavaScript, and
modern software engineering methodology.

5. Familiarity with Semantic Web technologies, Natural Language Processing,
Speech Recognition, Indexing Technologies, Distributed Systems and Cloud
Computing is an asset.


As a successful candidate for this award, you will:

1. Spend the majority of your time at Fraunhofer IAIS, where you will
research and write a dissertation leading to a PhD (awarded by the
University of Bonn).

2. Have a minimum of two academic supervisors from the WDAqua project.

3. Receive a full salary and a support grant to attend conferences, summer
schools, and other events related to your research each year.

4. Engage with other researchers and participate in the training program
offered by the WDAqua project, including internships at other partners in
the project.


Further Information


For further information, please see the WDAqua homepage at
http://www.iai.uni-bonn.de/~langec/wdaqua/.


How to apply


Applications should include a CV and a letter of motivation. Applicants
should list two referees that may be contacted by the Department and are
moreover invited to submit a research sample (publication or research
paper). Applications will be evaluated on a rolling basis. For full
consideration, please apply by 27.02.2015.


Applications should be sent to Dr. Christoph Lange-Bever.

E-Mail: christoph.lange-be...@iais.fraunhofer.de

Tel.: +49 2241/14-2428

-- 

Best Regards




Saeedeh Shekarpour

Postdoctoral Researcher

Enterprise Information Systems Department, University of Bonn
--
Dbpedia-discussion mailing list
Dbpedia-discussion@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dbpedia-discussion


Re: [Dbpedia-discussion] URIs vs. other IDs (Was: New user interface for dbpedia.org)

2015-02-08 Thread Kingsley Idehen

On 2/7/15 6:07 PM, Markus Kroetzsch wrote:

Hi Kingsley,

We are getting a bit off-topic here, but let me answer briefly ...

On 07.02.2015 21:36, Kingsley Idehen wrote:
...


No, it isn't duplication. Wikipedia HTTP URLs identify Wikipedia
documents. DBpedia URIs identify entities associated with Wikipedia
documents. There's a world of difference here!


That's not my point (I know the difference, of course). Wikidata 
stores neither Wikipedia URLs nor DBpedia URIs. It just stores 
Wikipedia article names together with Wikimedia site (project) 
identifiers. The work to get from there to the URL is the same as the 
work to get to the URI. Storing either explicitly in another property 
value would only introduce redundancy (and potential inconsistencies). 
In a Linked Data export you could easily include one or both of these 
URIs, depending on the application, but it's not so clear that doing 
this in a data viewer would make much sense. Surely it would not be 
useful if people would have to enter all of this data manually three 
times.
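
The point that the URL and the URI are equally derivable from a stored (site identifier, article name) pair can be sketched as follows. This is a minimal illustration, not how Wikidata or DBpedia actually implement it: the two lookup tables, the function names, and the simplified encoding rule are all assumptions made for the example, and only the English edition is covered.

```python
from urllib.parse import quote

# Hypothetical lookup tables from a Wikimedia site identifier to a base URL.
# Only "enwiki" is listed; other sites would need their own entries.
WIKIPEDIA_BASE = {"enwiki": "https://en.wikipedia.org/wiki/"}
DBPEDIA_BASE = {"enwiki": "http://dbpedia.org/resource/"}


def local_name(title: str) -> str:
    """Turn an article name into a URL path segment (simplified rule)."""
    return quote(title.replace(" ", "_"), safe="_(),:")


def wikipedia_url(site: str, title: str) -> str:
    """Address of the Wikipedia document about the topic."""
    return WIKIPEDIA_BASE[site] + local_name(title)


def dbpedia_uri(site: str, title: str) -> str:
    """Identifier of the entity described by that document."""
    return DBPEDIA_BASE[site] + local_name(title)


# The stored value is just the pair; both identifiers are derived on demand.
sitelink = ("enwiki", "Douglas Adams")
print(wikipedia_url(*sitelink))
print(dbpedia_uri(*sitelink))
```

Both functions do the same amount of work on the same input, which is why storing either result explicitly would add nothing but a second copy to keep consistent.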


On that note, is it the current best practice that all linked data 
exports include links to all other datasets that contain related 
information (exhaustive two-way linking)? That seems like a lot of 
triples and not very feasible if the LOD Web grows (a bit like two-way 
HTML linking ... ;-). Wouldn't it be more practical to integrate via 
shared key values? In this case, Wikipedia URLs might be a sensible 
choice to indicate the topic of a resource, rather than requiring all 
resources that have a Wikipedia article as their topic to cross link 
to all (quadratically many) other such resources directly. I would be 
curious to hear your take on this.
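
To make the scaling argument concrete, here is a rough sketch comparing exhaustive cross-linking with integration via a shared key. The dataset names, local IDs, and helper functions are hypothetical; the Wikipedia URL is used as the shared key only because it is the candidate suggested above.

```python
from collections import defaultdict


def pairwise_link_count(n_datasets: int) -> int:
    """Directed links needed per topic if every dataset links to every other."""
    return n_datasets * (n_datasets - 1)


def shared_key_link_count(n_datasets: int) -> int:
    """Statements needed per topic if each dataset records one shared key."""
    return n_datasets


def group_by_shared_key(records):
    """Group (local_id, shared_key) pairs by key to find matching resources."""
    groups = defaultdict(list)
    for local_id, key in records:
        groups[key].append(local_id)
    return dict(groups)


# Hypothetical resources from three datasets, each tagged with its topic's
# Wikipedia URL instead of links to the other datasets.
records = [
    ("datasetA:42", "https://en.wikipedia.org/wiki/Douglas_Adams"),
    ("datasetB:xyz", "https://en.wikipedia.org/wiki/Douglas_Adams"),
    ("datasetC:7", "https://en.wikipedia.org/wiki/Bonn"),
]

print(pairwise_link_count(10), shared_key_link_count(10))
print(group_by_shared_key(records))
```

With ten datasets, a single topic needs 90 directed links under exhaustive linking but only 10 key statements, and a consumer recovers the cross-references with one grouping pass.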






There are similar issues with most of the other identifiers: they are
usually the main IDs of the database, not the URIs of the
corresponding RDF data (if available).


Hmm.. if you look at the identifiers on the viewer's right hand side,
you will find out (depending on your understanding of Linked Open Data
concepts) that they too identify entities that are associated with Web
pages, rather than web pages themselves.


Sure, but you are confusing the purpose of URIs with the underlying 
technical standard here. People use identifiers to refer to entities, 
of course, yet they do not use identifiers that are based on the URI 
standard. We both know about the limitations of this approach, but 
that does not change the shape of the IDs people use to refer to 
things (e.g., on Freebase, but it is the same elsewhere). Usually, if 
you want to interface with such data collections (be it via UIs or via 
APIs), you need to use their official IDs, while URIs are not supported.


This is also the answer to your other comment. You are only seeing the 
purpose of the identifier, and you rightly say that there should be no 
big technical issue to use a URI instead. I agree, yet it has to be 
done, and it has to be done differently for each case. There is no 
general rule for how to construct URIs from the official IDs used by open 
data collections on today's Web.
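
As a rough illustration of "done differently for each case", the sketch below keeps one hand-written builder per data collection. The two templates and the sample IDs are assumptions for the example rather than an authoritative mapping, and the IMDb result is a page address, not necessarily an entity URI.

```python
def freebase_uri(mid: str) -> str:
    """Assumed rule: '/m/0abc12' becomes '<namespace>m.0abc12'."""
    return "http://rdf.freebase.com/ns/" + mid.strip("/").replace("/", ".")


def imdb_url(title_id: str) -> str:
    """Assumed rule: the title ID is placed in a fixed page path."""
    return f"https://www.imdb.com/title/{title_id}/"


# One entry per collection; nothing here generalises to a new source.
URI_BUILDERS = {
    "freebase": freebase_uri,
    "imdb": imdb_url,
}


def to_uri(source: str, official_id: str) -> str:
    """Look up the source-specific rule, failing loudly if none exists."""
    builder = URI_BUILDERS.get(source)
    if builder is None:
        raise KeyError(f"no URI rule known for source {source!r}")
    return builder(official_id)


# Placeholder IDs, not real records.
print(to_uri("freebase", "/m/0abc12"))
print(to_uri("imdb", "tt0000000"))
```

Every new collection means another entry in the table, which is the per-case effort described above.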


A related problem is that most online data sets have UIs that are 
much more user friendly than any LOD browser could be based on the RDF 
they export. There is no incentive for users to click on a LOD-based 
view of, say, IMDB, if they can just go to the IMDB page instead. This 
should be taken into account when building a DBpedia LOD view (back on 
topic! ;-): people who want to learn about something will usually be 
better served by going to Wikipedia; the target audience of the viewer 
is probably a different group who wants to inspect the DBpedia data 
set. This should probably affect how the UI is built, and maybe will 
lead to different design decisions than in the Wikidata browser I 
mentioned.


Markus


Markus,

To cut a long story short: yes, you have industry-standard 
identifiers, and likewise HTTP URIs that identify things in line with 
Linked Open Data principles.
You simply use relations such as dcterms:identifier (and the like) to 
incorporate industry standard identifiers into an entity description. 
Even better, those relations should be inverse-functional in nature. 
That's really it.
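
A minimal sketch of that idea, using plain tuples rather than an RDF library: each description carries its industry-standard identifier as a dcterms:identifier literal, and treating the relation as inverse-functional means that subjects sharing a value are taken to denote the same entity. The subject URIs and identifier values are made up, and inverse-functionality here is an assumption the consumer applies, not something the Dublin Core vocabulary itself declares.

```python
from collections import defaultdict

DCTERMS_IDENTIFIER = "http://purl.org/dc/terms/identifier"

# Hypothetical entity descriptions: (subject URI, predicate, literal value).
triples = [
    ("http://dbpedia.org/resource/Example_Film", DCTERMS_IDENTIFIER, "tt0000000"),
    ("http://example.org/films/123", DCTERMS_IDENTIFIER, "tt0000000"),
    ("http://example.org/films/456", DCTERMS_IDENTIFIER, "tt9999999"),
]


def same_entity_groups(triples, predicate):
    """Group subjects by identifier value, as an inverse-functional
    reading of the predicate would: same value, same entity."""
    by_value = defaultdict(set)
    for subject, pred, value in triples:
        if pred == predicate:
            by_value[value].add(subject)
    # Keep only values shared by more than one subject.
    return {v: sorted(s) for v, s in by_value.items() if len(s) > 1}


print(same_entity_groups(triples, DCTERMS_IDENTIFIER))
```

The URI and the literal identifier sit side by side in the same description, so neither replaces the other.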


DBpedia Identifiers (HTTP URI based References) and Industry Standard 
Identifiers (typically literal in nature) aren't mutually exclusive.


Getting back on topic, Reasonator is a nice UI. What it lacks, from a 
DBpedia perspective, is incorporation of DBpedia URIs, an issue the 
author of the tool has assured me he will address as a high 
priority.



--
Regards,

Kingsley Idehen 
Founder & CEO
OpenLink Software
Company Web: http://www.openlinksw.com
Personal Weblog 1: http://kidehen.blogspot.com
Personal Weblog 2: http://www.openlinksw.com/blog/~kidehen
Twitter Profile: https://twitter.com/kidehen
Google+ Profile: https://plus.google.com/+KingsleyIdehen/about
LinkedIn Profile: http://www.linkedin.com/in/kidehen
Personal WebID: