Re: May 2016 Newsletter – LDC

Mattmann, Chris A (3980) Fri, 20 May 2016 08:47:33 -0700

Thanks Lewis. I’m also an org rep for NASA at LDC, and also via my
USC hat. Good show.


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++










On 5/20/16, 8:45 AM, "Lewis John Mcgibbney" <[email protected]> wrote:

>Hi Folks,
>I've ended up as the primary JPL organizational rep for the Linguistic Data
>Consortium. They produce monthly newsletters (see below for most
>recent) which I will be forwarding to dev@ Joshua from now on.
>They are pretty cool, especially the new datasets they publish.
>Lewis
>
>---------- Forwarded message ----------
>From: *Mcgibbney, Lewis J (398M)* <[email protected]>
>Date: Friday, May 20, 2016
>Subject: Fwd: May 2016 Newsletter – LDC
>To: "[email protected]" <[email protected]>
>
>
>
>
>Sent from my iPhone
>
>Begin forwarded message:
>
>*From:* Linguistic Data Consortium <[email protected]>
>*Date:* May 16, 2016 at 8:20:33 AM PDT
>*To:* Linguistic Data Consortium <[email protected]>
>*Subject:* *May 2016 Newsletter – LDC*
>
>*In this newsletter:*
>
>*LDC at LREC 2016*
>
>
>
>*New publications:*
>
>SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing
>
>GALE Phase 4 Chinese Broadcast Conversation Speech
>
>GALE Phase 4 Chinese Broadcast Conversation Transcripts
>
>
>*LDC at LREC 2016*
>
>
>
>LDC will attend the 10th Language Resources and Evaluation Conference
>(LREC2016), hosted by ELRA, the European Language Resources Association. The
>conference will be held in Portorož, Slovenia from May 23-28 and features a
>broad range of sessions on language resources and human language
>technologies research. Seven LDC staff members will be presenting current
>work on topics including trends in HLT research, building language
>resources for autism spectrum disorders, data management plans, rapid
>development of morphological analyzers for typologically diverse languages,
>selection criteria for low resource language programs, multi-language
>speech collection for NIST LRE, novel incentives for collecting data and
>annotation from people, and more.
>
>
>
>Following the conference, LDC’s presented papers and posters will be
>available on LDC’s Papers Page
><https://www.ldc.upenn.edu/language-resources/papers/ldc-papers>.
>
>
>
>
>
>New Corpora
>
>
>
>(1) SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing
><https://catalog.ldc.upenn.edu/LDC2016T10> consists of data, tools, system
>results, and publications associated with the 2014 and 2015 tasks on
>Broad-Coverage Semantic Dependency Parsing (SDP <http://sdp.delph-in.net/>)
>conducted in conjunction with the International Workshop on Semantic
>Evaluation (SemEval <http://alt.qcri.org/semeval2015/>) and was developed
>by the SDP task organizers.
>
>SemEval is an ongoing series of evaluations of computational semantic
>analysis systems intended to explore the nature of meaning in language. It
>evolved from the Senseval <http://www.senseval.org/> word sense
>disambiguation series to include semantic analysis tasks outside of word
>sense disambiguation.
>
>This release is based on English, Chinese and Czech data from the following
>resources: Treebank-2 LDC95T7 <https://catalog.ldc.upenn.edu/LDC95T7>,
>Proposition Bank I LDC2004T14 <https://catalog.ldc.upenn.edu/LDC2004T14>,
>NomBank v 1.0 LDC2008T23 <https://catalog.ldc.upenn.edu/LDC2008T23> and
>CCGbank LDC2005T13 <https://catalog.ldc.upenn.edu/LDC2005T13> (English);
>Chinese Treebank (e.g., Chinese Treebank 8.0 LDC2013T21
><https://catalog.ldc.upenn.edu/LDC2013T21>) (Chinese); and Prague
>Dependency Treebank (e.g., Prague Dependency Treebank 2.0, LDC2006T01
><https://catalog.ldc.upenn.edu/LDC2006T01>) (Czech).
>
>The results are presented as graphs in three target representations:
>MRS-Derived Semantic Dependencies (DM), Enju Predicate–Argument Structures
>(PAS), and Prague Semantic Dependencies (PSD). As a fourth, additional
>target representation, CCGbank was converted to semantic dependency graphs
>(in the subdirectory ‘ccd’).
>
>SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing is distributed
>via web download.
>
>2016 Subscription Members will automatically receive two copies of this
>corpus. 2016 Standard Members may request a copy as part of their 16 free
>membership corpora. Non-members may license this data for US $400.
>
>
>
>*
>
>(2) GALE Phase 4 Chinese Broadcast Conversation Speech
><https://catalog.ldc.upenn.edu/LDC2016S03> was developed by LDC and
>comprises approximately 172 hours of Mandarin Chinese broadcast
>conversation speech collected in 2008 by LDC and Hong Kong University of
>Science and Technology during Phase 4 of the DARPA GALE (Global Autonomous
>Language Exploitation) Program.
>
>Corresponding transcripts are released as GALE Phase 4 Chinese Broadcast
>Conversation Transcripts (LDC2016T12
><http://catalog.ldc.upenn.edu/LDC2016T12>).
>
>The broadcast conversation recordings in this release feature interviews,
>call-in programs and roundtable discussions focusing principally on current
>events and are contained in 236 audio files presented in FLAC
><http://flac.sourceforge.net/>-compressed Waveform Audio File format
>(.flac), 16000 Hz single-channel 16-bit PCM. Each file was audited by a
>native Chinese speaker following Audit Procedure Specification Version 2.0,
>which is included in this release.
>
>GALE Phase 4 Chinese Broadcast Conversation Speech is distributed via web
>download.
>
>
>
>2016 Subscription Members will automatically receive two copies of this
>corpus. 2016 Standard Members may request a copy as part of their 16 free
>membership corpora. Non-members may license this data for US $2000.
>
>
>
>*
>
>(3) GALE Phase 4 Chinese Broadcast Conversation Transcripts
><https://catalog.ldc.upenn.edu/LDC2016T12> was developed by LDC and
>contains transcriptions of approximately 172 hours of Chinese broadcast
>conversation speech collected in 2008 by LDC and Hong Kong University of
>Science and Technology during Phase 4 of the DARPA GALE (Global Autonomous
>Language Exploitation) Program.
>
>Corresponding audio data is released as GALE Phase 4 Chinese Broadcast
>Conversation Speech (LDC2016S03 <https://catalog.ldc.upenn.edu/LDC2016S03>).
>
>The transcript files are in plain-text, tab-delimited format (TDF) with
>UTF-8 encoding, and the transcribed data totals 2,259,952 tokens.
>
>The files in this corpus were transcribed by LDC staff and/or by
>transcription vendors under contract to LDC. Transcribers followed LDC’s
>quick transcription guidelines (QTR) and quick rich transcription
>specification (QRTR). QTR transcription consists of quick (near-) verbatim,
>time-aligned transcripts plus speaker identification with minimal
>additional mark-up. QRTR adds structural information such as
>topic boundaries and manual sentence unit annotation.
>
>GALE Phase 4 Chinese Broadcast Conversation Transcripts is distributed via
>web download.
>
>2016 Subscription Members will automatically receive two copies of this
>corpus. 2016 Standard Members may request a copy as part of their 16 free
>membership corpora. Non-members may license this data for US $1500.
>
>
>-- 
>Membership Office
>Linguistic Data Consortium
>University of Pennsylvania
>3600 Market St. Suite 810
>Philadelphia, PA 19104
>Tel: 215-573-1275
>email: [email protected]
>Fax: 215-573-2175
>
>
>
>
>-- 
>*Lewis*
