[ 
https://issues.apache.org/jira/browse/STANBOL-1291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13922864#comment-13922864
 ] 

ELOUMBAT ASSOUA ALBERT commented on STANBOL-1291:
-------------------------------------------------

Hi,
Could you kindly forward the changes to my email address?

> Phonetic Linking
> ----------------
>
>                 Key: STANBOL-1291
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1291
>             Project: Stanbol
>          Issue Type: New Feature
>          Components: Enhancement Engines
>            Reporter: Rupert Westenthaler
>              Labels: gsoc2014, mentoring
>
> Add phonetic-based EntityLinking support to Apache Stanbol.
> The idea is to
> 1. start off with a sound file
> 2. use a speech-to-text engine like STANBOL-1007 to get the transcript
> 3. use NLP processing
> 4. use the FST Linking Engine (STANBOL-1128) to link against a SolrIndex
> configured for phonetic linking [1].
> 5. correct the text transcript based on the labels of linked entities.
> The main question to be answered is whether the phonetic matching (step 4)
> can correctly link entities even if the spellings in the text transcript are
> incorrect.
> Additional things to validate are
> * is the quality of the text transcript good enough
> * does NLP processing still work sufficiently well on text transcripts
> This will definitely also require adaptations to the FST Linking Engine, as
> the score is currently calculated based on the Levenshtein distance of the
> mention to the best matching label of an entity, which does not make sense
> for this specific use case.
> [1] 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PhoneticFilterFactory
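
To illustrate why a Levenshtein-based score fits this use case poorly, here
is a minimal sketch in Python. It is not Stanbol or Solr code. It assumes
classic Soundex as the phonetic encoding, since the issue does not say which
encoder would be configured, and the function names `soundex` and
`levenshtein` are made up for this example. A transcript that writes "Smith"
where the entity label is "Schmidt" produces the same phonetic key as the
label, while the edit distance between the two spellings stays well above
zero.

```python
def soundex(word):
    """Classic four-character Soundex key (illustrative implementation)."""
    codes = {}
    for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                           ("l", "4"), ("mn", "5"), ("r", "6")):
        for ch in letters:
            codes[ch] = digit

    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""

    key = word[0].upper()
    prev = codes.get(word[0], "")
    for ch in word[1:]:
        if ch in "hw":
            # h and w are skipped and do not separate equal codes
            continue
        digit = codes.get(ch, "")
        if digit and digit != prev:
            key += digit
        # vowels have no code, so they reset prev and allow repeated digits
        prev = digit
    return (key + "000")[:4]


def levenshtein(a, b):
    """Edit distance with unit-cost insert, delete and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


label = "Schmidt"   # label of the entity in the index (made-up example)
mention = "Smith"   # spelling produced by the speech-to-text step

same_key = soundex(label) == soundex(mention)
distance = levenshtein(label.lower(), mention.lower())
print(same_key, distance)
```

In this sketch the phonetic keys match even though the spellings differ, so
a score derived from the edit distance would penalise a link that the
phonetic index got right.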



--
This message was sent by Atlassian JIRA
(v6.2#6252)
