[ 
https://issues.apache.org/jira/browse/STANBOL-1380?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14095413#comment-14095413
 ] 

Chalitha_Perera commented on STANBOL-1380:
------------------------------------------

Some of the YAGO dumps (yagoTransitiveType.ttl, 
yagoMultilingualInstanceLabels.ttl, etc.) contain characters that are 
incompatible with the Jena RDF parser. When it encounters these characters, 
the indexer throws RIOT exceptions and indexing does not complete correctly.

A script has been created to fix these errors by removing the characters that 
cause the exceptions.
The script can be found at the following link:
https://github.com/ChalithaUdara/Stanbol-Yago-Site/blob/master/dumps_fix.sh
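
For readers who want to see the idea without opening the link, here is a 
minimal Python sketch of this kind of cleanup. It is not the linked 
dumps_fix.sh and may differ from it. It assumes the offending characters are 
the ones the Turtle grammar forbids inside <...> (control characters and 
space, plus < > " { } | ^ ` and a backslash that does not start a \uXXXX or 
\UXXXXXXXX escape). The names clean_line and clean_file are hypothetical, and 
multi-line literals are not handled.

```python
import re

# Assumption: the characters to drop are those the Turtle grammar does not
# allow inside <...>: U+0000-U+0020 and < > " { } | ^ `, plus any backslash
# that is not the start of a \uXXXX or \UXXXXXXXX escape.
_ILLEGAL_IN_IRI = re.compile(
    r'[\x00-\x20<>"{}|^`]|\\(?!u[0-9A-Fa-f]{4}|U[0-9A-Fa-f]{8})'
)

# Match a double-quoted literal first (left untouched), else an <...> token.
_TOKEN = re.compile(r'"(?:[^"\\]|\\.)*"|<([^<>]*)>')


def clean_line(line: str) -> str:
    """Remove characters that are not allowed inside IRIs on a single line."""
    def fix(match: re.Match) -> str:
        if match.group(1) is None:
            # A quoted literal: keep it exactly as it is.
            return match.group(0)
        return "<" + _ILLEGAL_IN_IRI.sub("", match.group(1)) + ">"
    return _TOKEN.sub(fix, line)


def clean_file(src_path: str, dst_path: str) -> None:
    """Stream src_path line by line and write the cleaned copy to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(clean_line(line))
```

A call such as clean_file("yagoTransitiveType.ttl", 
"yagoTransitiveType.fixed.ttl") would write a cleaned copy next to the 
original (the output file name is only an example). Removing characters 
changes the affected URIs, so the same cleanup should be applied to every dump 
that refers to them.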


> Provides a script for fixing corrupted YAGO files
> -------------------------------------------------
>
>                 Key: STANBOL-1380
>                 URL: https://issues.apache.org/jira/browse/STANBOL-1380
>             Project: Stanbol
>          Issue Type: Sub-task
>          Components: Entityhub
>            Reporter: Rafa Haro
>             Fix For: 1.0.0
>
>
> Some YAGO URIs contain characters that the RIOT parser can't handle. Create a 
> shell script to escape those characters.



--
This message was sent by Atlassian JIRA
(v6.2#6252)
