[ 
https://issues.apache.org/jira/browse/JENA-601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13850371#comment-13850371
 ] 

Claude Warren commented on JENA-601:
------------------------------------

I was thinking about this just last night.

If it helps, I have a regular expression URI pattern match/editor library at 
https://github.com/Claudenw/URIEditor that would be able to match and transform 
the file names.  It might be handy when trying to get the base URI. (I am 
assuming foo.ttl.gz and foo.ttl would have the same base URI.)
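
As a rough illustration of the base URI point, here is a sketch that uses only 
the JDK regex classes. It is not the URIEditor API; the class name 
BaseUriSketch, the method name, and the list of suffixes are all made up for 
the example.

```java
import java.util.regex.Pattern;

public class BaseUriSketch {
    // Hypothetical suffix list; a real registry could hold others.
    private static final Pattern COMPRESSION_SUFFIX =
        Pattern.compile("\\.(gz|bz2|deflate)$", Pattern.CASE_INSENSITIVE);

    // Drop one trailing compression suffix so that the compressed and
    // uncompressed names yield the same base URI.
    public static String stripCompressionSuffix(String uri) {
        return COMPRESSION_SUFFIX.matcher(uri).replaceFirst("");
    }

    public static void main(String[] args) {
        System.out.println(stripCompressionSuffix("file:///data/foo.ttl.gz"));
        System.out.println(stripCompressionSuffix("file:///data/foo.ttl"));
    }
}
```

Under that assumption both names map to file:///data/foo.ttl.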



> Provide better support for compressed input formats
> ---------------------------------------------------
>
>                 Key: JENA-601
>                 URL: https://issues.apache.org/jira/browse/JENA-601
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: RIOT
>    Affects Versions: Jena 2.11.0
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>
> Currently Jena has little or no support for compressed input formats.  There 
> are the odd cases where some consideration is given e.g.
> - {{RDFLanguages.filenameToLang()}} strips off {{.gz}} extensions to help it 
> correctly detect file types
> - HTTP responses can deal with compressed responses by virtue of Apache 
> HttpClient
> What would be nice is to have a better strategy for handling compressed 
> inputs.  For example having a registry of known compression extensions e.g. 
> {{.gz}}, {{.bz2}}, {{.deflate}} which ARQ would strip off when trying to 
> deduce format from the filename.
> It would also be useful if the various locator implementations took 
> compression into account when opening input streams as I'm fairly sure if you 
> asked ARQ to open a {{foo.nt.gz}} file it would just open a raw input stream 
> and then the reading would fail.
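
The registry idea in the description could be sketched roughly as follows. 
This is a JDK-only illustration, not Jena or ARQ code: the class 
CompressionRegistrySketch, the Decoder interface, and the methods 
stripCompression and open are hypothetical names. It covers only .gz and 
.deflate (assumed here to be zlib-wrapped), because the JDK has no bzip2 
decoder; .bz2 would need a third-party library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import java.util.zip.InflaterInputStream;

public class CompressionRegistrySketch {
    // Hypothetical hook: turns a raw stream into a decompressing one.
    interface Decoder {
        InputStream wrap(InputStream raw) throws IOException;
    }

    // Registry of known compression extensions and their decoders.
    private static final Map<String, Decoder> DECODERS = new LinkedHashMap<>();
    static {
        DECODERS.put(".gz", GZIPInputStream::new);
        DECODERS.put(".deflate", InflaterInputStream::new);
    }

    // Strip a registered extension so the format can be deduced from
    // the remaining name, e.g. foo.nt.gz -> foo.nt.
    public static String stripCompression(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        for (String ext : DECODERS.keySet()) {
            if (lower.endsWith(ext)) {
                return filename.substring(0, filename.length() - ext.length());
            }
        }
        return filename;
    }

    // Wrap the raw stream in a decoder when the name has a registered
    // extension; otherwise return the raw stream unchanged.
    public static InputStream open(String filename, InputStream raw) throws IOException {
        String lower = filename.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, Decoder> e : DECODERS.entrySet()) {
            if (lower.endsWith(e.getKey())) {
                return e.getValue().wrap(raw);
            }
        }
        return raw;
    }

    // Helper for the demo only: gzip a string in memory.
    static byte[] gzip(String text) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buf)) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Helper for the demo only: read a stream fully as UTF-8 text.
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buf.write(chunk, 0, n);
        }
        return new String(buf.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = gzip("<urn:s> <urn:p> <urn:o> .\n");
        try (InputStream in = open("foo.nt.gz", new ByteArrayInputStream(data))) {
            System.out.println(stripCompression("foo.nt.gz"));
            System.out.print(readAll(in));
        }
    }
}
```

Keeping the extension lookup and the stream wrapping behind one registry means 
that format detection and the locators would agree on which extensions count 
as compression.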



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
