Rob Vesse created JENA-601:
------------------------------

             Summary: Provide better support for compressed input formats
                 Key: JENA-601
                 URL: https://issues.apache.org/jira/browse/JENA-601
             Project: Apache Jena
          Issue Type: Improvement
          Components: RIOT
    Affects Versions: Jena 2.11.0
            Reporter: Rob Vesse


Currently Jena has little or no support for compressed input formats.  There 
are a few cases where some consideration is given, e.g.

- {{RDFLanguages.filenameToLang()}} strips off {{.gz}} extensions to help it 
correctly detect file types
- HTTP responses can deal with compressed responses by virtue of Apache 
HttpClient

What would be nice is to have a better strategy for handling compressed inputs. 
For example, a registry of known compression extensions, e.g. {{.gz}}, 
{{.bz2}}, {{.deflate}}, which ARQ would strip off when trying to deduce the 
format from the filename.
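
To make the registry idea concrete, here is a minimal sketch using only the Java standard library. The class name {{CompressionExtensions}} and its {{register}}/{{strip}} methods are hypothetical, invented for illustration; they are not existing Jena or ARQ API.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

/** Hypothetical registry of compression extensions; NOT part of Jena. */
final class CompressionExtensions {
    // Known extensions, stored lower-case with the leading dot.
    private static final Set<String> KNOWN =
        new LinkedHashSet<String>(Arrays.asList(".gz", ".bz2", ".deflate"));

    private CompressionExtensions() {}

    /** Adds another extension, e.g. ".xz", to the registry. */
    static void register(String extension) {
        KNOWN.add(extension.toLowerCase(Locale.ROOT));
    }

    /**
     * Returns the filename with one trailing compression extension removed,
     * or the filename unchanged if it has no registered extension.
     */
    static String strip(String filename) {
        String lower = filename.toLowerCase(Locale.ROOT);
        for (String ext : KNOWN) {
            // Require something before the extension so ".gz" alone is untouched.
            if (lower.endsWith(ext) && lower.length() > ext.length()) {
                return filename.substring(0, filename.length() - ext.length());
            }
        }
        return filename;
    }
}
```

With this sketch, {{CompressionExtensions.strip("foo.nt.gz")}} yields {{foo.nt}}, which could then be handed to the existing language detection.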

It would also be useful if the various locator implementations took compression 
into account when opening input streams. I'm fairly sure that if you asked ARQ 
to open a {{foo.nt.gz}} file, it would just open a raw input stream and the 
reading would then fail.
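
As a rough sketch of what a compression-aware locator could do, the following wraps a raw stream in a decompressing stream chosen by file extension. The class {{CompressedStreams}} and its methods are hypothetical and not Jena API. Only {{.gz}} and {{.deflate}} are handled because the JDK ships decoders for those; {{.bz2}} would need a third-party library, so it is left out here. The sketch also assumes {{.deflate}} files are zlib-wrapped, which is what {{InflaterInputStream}} expects by default.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical helper for opening possibly-compressed inputs; NOT part of Jena. */
final class CompressedStreams {
    private CompressedStreams() {}

    /** Wraps raw in a decompressing stream selected by the filename's extension. */
    static InputStream wrap(String filename, InputStream raw) throws IOException {
        String lower = filename.toLowerCase(Locale.ROOT);
        if (lower.endsWith(".gz")) {
            return new GZIPInputStream(raw);      // reads and checks the gzip header
        }
        if (lower.endsWith(".deflate")) {
            return new InflaterInputStream(raw);  // assumes zlib-wrapped data
        }
        return raw;                               // unknown extension: pass through
    }

    /** Opens a local file, decompressing transparently where supported. */
    static InputStream open(String filename) throws IOException {
        InputStream raw = new BufferedInputStream(Files.newInputStream(Paths.get(filename)));
        try {
            return wrap(filename, raw);
        } catch (IOException e) {
            raw.close();                          // do not leak the file handle
            throw e;
        }
    }
}
```

A caller would then read N-Triples from {{CompressedStreams.open("foo.nt.gz")}} exactly as it would from an uncompressed {{foo.nt}}.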



--
This message was sent by Atlassian JIRA
(v6.1#6144)
