OK, solved. Thanks, Dave and Andy, for the responses, and sorry for writing
the constructor badly; I was in a rush. In my specific case the problem was
the ISO-8859-1 encoding, as Andy said.
Thanks, all.
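
For anyone hitting the same problem, a minimal sketch of re-encoding such a file using only the JDK. It assumes the input really is ISO-8859-1 throughout (running it on data that is already UTF-8 would double-encode the non-ASCII characters) and that the file fits in memory. The class and method names are made up for illustration and are not part of Jena.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Jena.
final class Latin1ToUtf8 {

    /** Interprets the bytes as ISO-8859-1 text and re-encodes that text as UTF-8. */
    static byte[] transcode(byte[] latin1Bytes) {
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Usage (assumed arguments): java Latin1ToUtf8.java input.nt output.nt
    public static void main(String[] args) throws IOException {
        byte[] input = Files.readAllBytes(Paths.get(args[0]));
        Files.write(Paths.get(args[1]), transcode(input));
    }
}
```

Writing to a separate output file keeps the original intact, so the result can be checked before the data is loaded.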

2015-02-20 10:35 GMT+01:00 Andy Seaborne <[email protected]>:

> On 20/02/15 08:01, Marco Tenti wrote:
>
>> Hi everyone, I'm loading millions of triples, split across hundreds of
>> files, onto the same server. It is very simple, but during the process I
>> sometimes fail to read some files with the .nt extension. These files are
>> generated with SILK
>> (http://wifo5-03.informatik.uni-mannheim.de/bizer/silk/).
>> Specifically, I get these errors:
>>
>> InputStream in = FileManager.get().open("filename");
>>
>> //1)
>> model.read(in, "NT");
>>
>
> That call is:
> model.read(in, baseURI)
>
> so it does not set the language.
>
>
>> Console: [line: 1, col: 7 ] Element or attribute do not match QName
>> production: QName::=(NCName':')?NCName.
>> *Exception in thread "main" org.apache.jena.riot.RiotException: [line: 1,
>> col: 7 ] Element or attribute do not match QName production:
>> QName::=(NCName':')?NCName. *
>>
>
> It thinks it's RDF/XML because you set the base URI to "NT" and the default
> language is RDF/XML.
>
> Better:
>
> RDFDataMgr.read(model, in, Lang.NT) ;
>
> as it uses typed constants.
>
> RDFDataMgr.read(model, "filename") ;
>
> will work, picking the language from the file extension (.nt, .ttl, etc.).
>
> (actually, model.read("filename") works nowadays)
>
>  :
>> //2)
>> org.apache.jena.riot.RDFDataMgr.read(model,in,"NT");
>> *Exception  org.apache.jena.atlas.AtlasException:
>> java.nio.charset.MalformedInputException: Input length = 1*
>>
>> Any idea why Jena throws these exceptions?
>>
>
> Bad data.
>
> If you get
>
>   java.nio.charset.MalformedInputException
>
> it means the file is not valid UTF-8.  Exactly where is hard to determine
> from the error, because Jena reads blocks of 128K bytes for efficiency
> reasons (it's a major cost of N-Triples parsing) and the Java bytes-to-chars
> conversion for UTF-8 does not say where the error occurs.
>
> A common cause is ISO-8859-1 data.  N-Triples is UTF-8 only.
>
> There is a utility in Jena, "riotcmd.utf8", that does a careful UTF-8 read of
> the file character by character.
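
Independently of Jena, a strict decoding pass with the JDK alone can also locate the first offending byte. A minimal sketch, assuming the file fits in memory; the class and method names (Utf8Check, firstInvalidOffset, lineOf) are made up for illustration and say nothing about how riotcmd.utf8 is implemented.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Jena.
final class Utf8Check {

    /** Byte offset of the first invalid UTF-8 sequence, or -1 if all bytes decode cleanly. */
    static int firstInvalidOffset(byte[] data) {
        // REPORT makes the decoder stop at bad input instead of silently replacing it.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer in = ByteBuffer.wrap(data);
        CharBuffer out = CharBuffer.allocate(8192); // scratch space; decoded chars are discarded
        while (true) {
            CoderResult result = decoder.decode(in, out, true);
            if (result.isError()) {
                return in.position(); // position is left at the start of the bad sequence
            }
            if (result.isUnderflow()) {
                return -1; // all input consumed without error
            }
            out.clear(); // overflow: empty the scratch buffer and continue
        }
    }

    /** 1-based line number containing the given byte offset (lines end with '\n'). */
    static int lineOf(byte[] data, int offset) {
        int line = 1;
        for (int i = 0; i < offset; i++) {
            if (data[i] == '\n') {
                line++;
            }
        }
        return line;
    }

    // Usage (assumed argument): java Utf8Check.java data.nt
    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int offset = firstInvalidOffset(data);
        if (offset < 0) {
            System.out.println("valid UTF-8");
        } else {
            System.out.println("invalid UTF-8 at byte " + offset + ", line " + lineOf(data, offset));
        }
    }
}
```

Since N-Triples is line-based, the reported line number points at the triple that contains the bad bytes.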
>
> Look at your data and very carefully check how the program you are using
> is set up.  It's all too easy to accidentally view a file in the platform's
> native character set.
>
>> Thanks in advance. Greetings.
>>
>>
>         Andy
>
>
