Dear Andy,

Thanks for pointing it out! IRILib.encodeUriComponent(String) works well.

Is there a tool in the Jena code base for parsing a String into a typed
Jena Literal, choosing the datatype from the value itself? For example:
1) "65000" -> "65000"^^<http://www.w3.org/2001/XMLSchema#integer>
2) "65000.123" -> "65000.123"^^<http://www.w3.org/2001/XMLSchema#double>
3) "true" -> "true"^^<http://www.w3.org/2001/XMLSchema#boolean>
More advanced features would include parsing date/time values in CSV
cells to xsd:date or xsd:dateTime.
The NodeFactory.createLiteral() method does not do this kind of datatype
detection.
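
To make the question concrete, here is a rough sketch of the kind of helper I have in mind. It is plain Java (java.time, so Java 8) with no Jena calls at all; the class name LiteralGuesser and its methods are hypothetical, the mapping simply follows the three examples above, and the dateTime check only accepts values without a time zone.

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not part of Jena: guesses an XSD datatype for a CSV cell.
class LiteralGuesser {
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Returns the XSD datatype IRI guessed from the lexical form. */
    static String guessDatatype(String value) {
        String v = value.trim();
        if (v.equals("true") || v.equals("false")) return XSD + "boolean";
        if (v.matches("[+-]?[0-9]+")) return XSD + "integer";
        // Following the examples above, a decimal point maps to xsd:double.
        if (v.matches("[+-]?[0-9]+\\.[0-9]+([eE][+-]?[0-9]+)?")) return XSD + "double";
        if (isDate(v)) return XSD + "date";
        if (isDateTime(v)) return XSD + "dateTime";
        return XSD + "string";
    }

    /** True for ISO dates such as 2014-06-22. */
    static boolean isDate(String v) {
        try { LocalDate.parse(v); return true; }
        catch (DateTimeParseException e) { return false; }
    }

    /** True for ISO date-times without a zone, such as 2014-06-22T02:16:00. */
    static boolean isDateTime(String v) {
        try { LocalDateTime.parse(v); return true; }
        catch (DateTimeParseException e) { return false; }
    }

    /** Renders the value as "lexical"^^<datatype>, escaping backslashes and quotes. */
    static String toTypedLiteral(String value) {
        String lex = value.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + lex + "\"^^<" + guessDatatype(value) + ">";
    }

    public static void main(String[] args) {
        for (String cell : new String[] {"65000", "65000.123", "true", "2014-06-22"}) {
            System.out.println(cell + " -> " + toTypedLiteral(cell));
        }
    }
}
```

Anything that matches none of the patterns falls back to xsd:string, so an unrecognized CSV value would still produce a literal.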

Best regards,
Ying Jiang





On Sun, Jun 22, 2014 at 2:16 AM, Andy Seaborne <a...@apache.org> wrote:
> I completely forgot about:
>
> IRILib.encodeUriComponent(String)
>
>         Andy
>
>
> On 19/06/14 11:09, Andy Seaborne wrote:
>>
>> On 15/06/14 18:18, Ying Jiang wrote:
>>>
>>> 1) space and other non-URI characters in column name
>>> I introduce the LangCSV.encodeURIComponent() method borrowed from [1].
>>> However it does not strictly conform to RFC 3986 [2].
>>> TestLangCSV.testNonURICharacters() [7] shows the escaping result.
>>> There's also another related standard of RFC 2396 [3]. I'm confused by
>>> them.
>>
>>
>> RFC 2396 is superseded by RFC 3986.
>>
>>> Which one is Jena URI supposed to stick to?
>>> There are other escaping methods in libraries such as spring-web [4],
>>> guava [5] and the old commons-httpclient [6]. Is it OK to make Jena
>>> (jena-arq) depend on one of these libs?
>>
>>
>>
>> Jena has some IRI code that may be useful to you:
>>
>> // Includes punycode for host names!
>> IRI iri = IRIFactory.iriImplementation()
>>                      .create("http://examplé/foo bar?query=a b") ;
>> System.out.println(iri.toASCIIString()) ;
>>
>> iri = IRIFactory.iriImplementation()
>>                      .create("foo bar?query=a b") ;
>> System.out.println(iri.toASCIIString()) ;
>>
>> It's not query-string sensitive, "a b" becomes "a%20b" and not "a+b",
>> but for producing URIs in the CSV case that does not matter (?).
>>
>> You'll need to be careful about '?' anyway as you'll need to specially
>> %-encode it.
>>
>> Jena already depends on org.apache.httpcomponents.httpclient so that is
>> no extra dependency.
>>
>>      Andy
>
>
