Re: Final CFP: In-Use Track ISWC 2013

Sebastian Hellmann Thu, 02 May 2013 14:20:22 -0700

I am not completely familiar with DOI. Am I right that it provides more or less the same service as http://purl.org? DOI links at the resource level. You would still need fragment IDs to link to parts.

Firefox can actually handle this:
http://dx.doi.org/10.1038%2Fscientificamerican1210-80#atl

If I am right, DOI also wouldn't be able to provide links to the 40 million mentions contained in the Wikilinks corpus:

http://techcrunch.com/2013/03/08/google-research-releases-wikilinks-corpus-with-40m-mentions-and-3m-entities/
That's 40 million DOIs ....

See the data excerpt below.

All the best,
Sebastian

URL ftp://217.219.170.14/Computer%20Group/Faani/vaset%20fani/second/sattari/word/2007/source/s%20crt.docx

MENTION vacuum tube 421 http://en.wikipedia.org/wiki/Vacuum_tube
MENTION vacuum tubes 10838 http://en.wikipedia.org/wiki/Vacuum_tube
MENTION electron gun 598 http://en.wikipedia.org/wiki/Electron_gun
MENTION fluorescent 790 http://en.wikipedia.org/wiki/Fluorescent
MENTION oscilloscope 1307 http://en.wikipedia.org/wiki/Oscilloscope
MENTION computer monitor 1503 http://en.wikipedia.org/wiki/Computer_monitor
MENTION computer monitors 3066 http://en.wikipedia.org/wiki/Computer_monitor
MENTION radar 1657 http://en.wikipedia.org/wiki/Radar
MENTION plasma screens 2162 http://en.wikipedia.org/wiki/Plasma_screen

Each file is in the following format:

-------

URL\t<url>\n
MENTION\t<mention>\t<byte_offset>\t<target_url>\n
MENTION\t<mention>\t<byte_offset>\t<target_url>\n
MENTION\t<mention>\t<byte_offset>\t<target_url>\n
...
TOKEN\t<token>\t<byte_offset>\n
TOKEN\t<token>\t<byte_offset>\n
TOKEN\t<token>\t<byte_offset>\n
...
\n\n
URL\t<url>\n
...
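
As a minimal sketch of how the format above could be read, assuming fields are tab-separated exactly as described and records are separated by blank lines, the following Python turns a file's text into one record per URL. The names Mention, Page, and parse_records are hypothetical and are not part of any corpus tooling; lines that do not match the described layout are simply skipped.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class Mention:
    text: str
    byte_offset: int
    target_url: str


@dataclass
class Page:
    url: str
    mentions: List[Mention] = field(default_factory=list)
    tokens: List[Tuple[str, int]] = field(default_factory=list)  # (token, byte_offset)


def parse_records(text: str) -> Iterator[Page]:
    """Yield one Page per URL line; later lines attach to the most recent URL."""
    page: Optional[Page] = None
    for line in text.splitlines():
        if not line.strip():
            continue  # blank separator lines carry no data
        fields = line.split("\t")
        kind = fields[0]
        if kind == "URL" and len(fields) >= 2:
            if page is not None:
                yield page  # a new URL line closes the previous record
            page = Page(url=fields[1])
        elif kind == "MENTION" and len(fields) == 4 and page is not None:
            page.mentions.append(Mention(fields[1], int(fields[2]), fields[3]))
        elif kind == "TOKEN" and len(fields) == 3 and page is not None:
            page.tokens.append((fields[1], int(fields[2])))
        # anything else does not match the described layout and is ignored
    if page is not None:
        yield page


if __name__ == "__main__":
    # Hypothetical two-record sample, not taken from the corpus.
    sample = (
        "URL\thttp://example.org/doc\n"
        "MENTION\tvacuum tube\t421\thttp://en.wikipedia.org/wiki/Vacuum_tube\n"
        "TOKEN\tvacuum\t421\n"
        "\n\n"
        "URL\thttp://example.org/other\n"
    )
    for p in parse_records(sample):
        print(p.url, len(p.mentions), len(p.tokens))
```

Note that the excerpt as displayed in this mail has spaces where the format specifies tabs, so it would need the tabs restored before this sketch could read it.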

On 02.05.2013 22:36, Dawson, Laura wrote:

Short DOIs for fragment IDs?

From: Sebastian Hellmann <[email protected]>
Date: Thursday, May 2, 2013 4:33 PM
To: Paul Groth <[email protected]>
Cc: Steve Pettifer <[email protected]>, Sarven Capadisli <[email protected]>, "[email protected]" <[email protected]>
Subject: Re: Final CFP: In-Use Track ISWC 2013
Resent-From: "[email protected]" <[email protected]>
Resent-Date: Thursday, May 2, 2013 4:34 PM

Open Annotation is great: a really powerful and well-designed ontology and model. It doesn't replace fragment IDs, however. Both are necessary: fragment IDs to link with in simple use cases (e.g. HTML), and Open Annotation to annotate properly.

A bridge between them would be nice.

All the best,
Sebastian

On 02.05.2013 18:00, Paul Groth wrote:

Hi Sebastian,

I use LaTeX as well. Utopia is a PDF reader.

But Utopia does support referencing bits of the PDF. As I understand it, they are moving to extending the Open Annotation ontology. I've cc'd Steve Pettifer, who created Utopia and who will know the ins and outs.


Currently, they store all the annotations separately.

Thanks
Paul

On Thu, May 2, 2013 at 5:21 PM, Sebastian Hellmann <[email protected]> wrote:


    Hi Paul,
    personally, LaTeX works best for me, because it has good editors
    and support for description logic formulas. Plus, it is widely
    used and quite good for PDF typesetting.

    It would be really swell to be able to address content within PDF
    with identifiers. Did Utopia solve that problem?

    I am asking along the lines of
    - mediafragments [1]
    - RFC 5147 text fragment identifier (see the example at the
    bottom of [2])
    - xpointer/xpath [3]

    If yes, I would like to use it immediately. There are plans to
    convert the Google Mention corpus (which includes PDFs) to NIF [2].
    The PDF Open Parameters described in [4] are far too simple.

    All the best,
    Sebastian

    [1] http://www.w3.org/TR/media-frags/
    [2] http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core
    (the example is at the bottom of the .ttl file)
    [3] e.g. http://example.com/exampledoc.html#xpath(/html/body[1]/h2[1]/span[1]/text()[1])
    [4] http://partners.adobe.com/public/developer/en/acrobat/PDFOpenParameters.pdf#page=7

    On 02.05.2013 12:55, Paul Groth wrote:

    Hi Sarven,

    Beyond the PDF for me is moving beyond the current research
    communication system as highlighted by the Force 11 manifesto
    (http://www.force11.org/white_paper). This includes adopting
    technologies that augment/extend (i.e. go beyond) existing
    technologies. For example, making data easily accessible and
    citable, providing links to online content, making multiple
    perspectives on content available, exposing provenance, using
    altmetrics. I'm very influenced by the work on Utopia
    (http://utopiadocs.com), which is why I think using PDFs is
    fine: you can do a lot with them as they stand, and for a
    certain form of communication (written long-form text) they work
    well. As technologists we need to make sure that these new
    technologies work well in the environment and connect to other
    things.

    cheers
    Paul


    On Thu, May 2, 2013 at 12:32 PM, Sarven Capadisli
    <[email protected]> wrote:

        On 05/02/2013 12:23 PM, Paul Groth wrote:

            I think Harry makes the point better than I can.


        Paul, I have one last question for you if you don't mind,
        because it seems like you are not interested in playing this
        out and I don't want to bother you further: what does
        "beyond the PDF" mean to you?

        -Sarven

-------------------------------------------------------------------------------------

    Dr. Paul Groth ([email protected])
    http://www.few.vu.nl/~pgroth/
    Assistant Professor
    - Web & Media Group | Department of Computer Science
    - The Network Institute
    VU University Amsterdam

    --
    Dipl. Inf. Sebastian Hellmann
    Department of Computer Science, University of Leipzig
    Events: NLP & DBpedia 2013
    (http://nlp-dbpedia2013.blogs.aksw.org, Deadline: *July 8th*)
    Come to Germany as a PhD student:
    http://bis.informatik.uni-leipzig.de/csf
    Projects: http://nlp2rdf.org , http://linguistics.okfn.org ,
    http://dbpedia.org/Wiktionary , http://dbpedia.org
    Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
    Research Group: http://aksw.org



