Dear Andy,

Thanks for pointing it out! IRILib.encodeUriComponent(String) works well.

Is there a tool in the Jena code base for parsing a String into a Jena Literal of the appropriate datatype? For example:

1) "65000" -> "65000"^^<http://www.w3.org/2001/XMLSchema#integer>
2) "65000.123" -> "65000.123"^^<http://www.w3.org/2001/XMLSchema#double>
3) "true" -> "true"^^<http://www.w3.org/2001/XMLSchema#boolean>

More advanced features, such as parsing date/time values to xsd:date or xsd:dateTime, would also be useful for CSV values. The NodeFactory.createLiteral() method does not provide these features.

Best regards,
Ying Jiang

On Sun, Jun 22, 2014 at 2:16 AM, Andy Seaborne <a...@apache.org> wrote:
> I completely forgot about:
>
> IRILib.encodeUriComponent(String)
>
> Andy
>
>
> On 19/06/14 11:09, Andy Seaborne wrote:
>>
>> On 15/06/14 18:18, Ying Jiang wrote:
>>>
>>> 1) space and other non-URI characters in column name
>>> I introduce the LangCSV.encodeURIComponent() method borrowed from [1].
>>> However it does not strictly conform to RFC 3986 [2].
>>> TestLangCSV.testNonURICharacters() [7] shows the escaping result.
>>> There's also another related standard of RFC 2396 [3]. I'm confused by
>>> them.
>>
>>
>> RFC 2396 is superseded by RFC 3986.
>>
>>> Which one is Jena URI supposed to stick to?
>>> There're other escaping method from libs, such as spring-web [4],
>>> guava [5] and the old commons-httpclient [6]. Is it OK to make Jena
>>> (jena-arq) depending on one of these libs?
>>
>>
>>
>> Jena has some IRI code that may be useful to you:
>>
>> // Includes punycode for host names!
>> IRI iri = IRIFactory.iriImplementation()
>>     .create("http://examplé/foo bar?query=a b") ;
>> System.out.println(iri.toASCIIString()) ;
>>
>> iri = IRIFactory.iriImplementation()
>>     .create("foo bar?query=a b") ;
>> System.out.println(iri.toASCIIString()) ;
>>
>> It's not query-string sensitive, "a b" becomes "a%20b" and not "a+b",
>> but for producing URIs in the CSV case that does not matter (?).
>>
>> You'll need to be careful about '?' anyway as you'll need to specially
>> %-encode it.
>>
>> Jena already depends on org.apache.httpcomponents.httpclient so that is
>> no extra dependency.
>>
>> Andy
>
>
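
For the typed-literal question at the top of this message, the kind of inference being asked for could be sketched roughly as follows. This is only an illustration under stated assumptions, not an existing Jena API: the class name TypedLiteralSketch and its methods are hypothetical, the checks are purely syntactic (a date such as "2014-13-45" would still be tagged xsd:date), and the order of the checks is one possible choice. It uses only the JDK and renders the result as a string; in Jena the chosen datatype would presumably be passed along with the lexical form when creating the literal.

```java
final class TypedLiteralSketch {
    // XSD namespace, as used in the examples in this thread.
    static final String XSD = "http://www.w3.org/2001/XMLSchema#";

    /** Guess an XSD datatype IRI for a raw CSV cell; falls back to xsd:string. */
    static String guessDatatype(String cell) {
        String s = cell.trim();
        if (s.equals("true") || s.equals("false")) {
            return XSD + "boolean";
        }
        if (s.matches("[+-]?[0-9]+")) {
            return XSD + "integer";
        }
        // Decimal point and/or exponent: treated as xsd:double, as in example 2.
        if (s.matches("[+-]?([0-9]+\\.[0-9]*|\\.[0-9]+)([eE][+-]?[0-9]+)?")
                || s.matches("[+-]?[0-9]+[eE][+-]?[0-9]+")) {
            return XSD + "double";
        }
        // Shape check only: month and day ranges are not validated here.
        if (s.matches("[0-9]{4}-[0-9]{2}-[0-9]{2}")) {
            return XSD + "date";
        }
        if (s.matches("[0-9]{4}-[0-9]{2}-[0-9]{2}T[0-9]{2}:[0-9]{2}:[0-9]{2}"
                + "(\\.[0-9]+)?(Z|[+-][0-9]{2}:[0-9]{2})?")) {
            return XSD + "dateTime";
        }
        return XSD + "string";
    }

    /** Render the cell as "lexical form"^^<datatype IRI>, escaping \ and ". */
    static String toTypedLiteral(String cell) {
        String lex = cell.trim().replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + lex + "\"^^<" + guessDatatype(cell) + ">";
    }

    public static void main(String[] args) {
        // The three examples from the question, plus a date.
        for (String cell : new String[] {"65000", "65000.123", "true", "2014-06-22"}) {
            System.out.println(cell + " -> " + toTypedLiteral(cell));
        }
    }
}
```

Checking boolean and integer before double keeps "65000" from being widened to a floating-point type, and anything unrecognised stays a plain string, which seems the safest default for arbitrary CSV cells.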