[ 
https://issues.apache.org/jira/browse/COMPRESS-325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14954501#comment-14954501
 ] 

Frédérik Bilhaut commented on COMPRESS-325:
-------------------------------------------

OK, it works that way!

Thank you very much, and sorry for the irrelevant ticket! Entirely my fault; I 
should have read the documentation more carefully.

However, it may not be possible to know in advance whether a given bzip2 file 
is concatenated, and I suppose that, by default, one expects to get the full 
content of the compressed file, as happens with most tools and APIs I know. So:

- Either there is no circumstance in which setting this parameter to {{true}} 
is inappropriate, in which case why not make it {{true}} by default?

- Or setting it to {{true}} may be harmful in some circumstances, in which 
case there should be a test to detect that the stream is concatenated.

I understand the backward-compatibility concerns, but I think that changing the 
default behavior would make it more consistent with the general InputStream 
contract, where there is only one EOF. Just my two cents, and maybe there are 
other problems I have not thought of...
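
For comparison, here is a minimal sketch of the "single EOF" expectation 
described above. It deliberately uses the JDK's own 
java.util.zip.GZIPInputStream rather than Commons Compress, so it only 
illustrates the expectation and says nothing about how bzip2 is handled. The 
class name ConcatenatedGzipDemo and its helper methods are made up for the 
example. With Commons Compress, the corresponding opt-in is the two-argument 
constructor BZip2CompressorInputStream(in, true).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class; not part of Commons Compress.
class ConcatenatedGzipDemo {

    // Compress one string into a complete, standalone gzip member.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
                out.write(text.getBytes(StandardCharsets.US_ASCII));
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Append the two members byte for byte, as "cat a.gz b.gz" would.
    static byte[] concat(byte[] first, byte[] second) {
        byte[] joined = new byte[first.length + second.length];
        System.arraycopy(first, 0, joined, 0, first.length);
        System.arraycopy(second, 0, joined, first.length, second.length);
        return joined;
    }

    // Decompress until the stream reports EOF and return the decoded text.
    static String readAll(byte[] compressed) {
        try (InputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            ByteArrayOutputStream text = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            for (int n = in.read(chunk); n != -1; n = in.read(chunk)) {
                text.write(chunk, 0, n);
            }
            return new String(text.toByteArray(), StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Two independent gzip members stored back to back in one "file".
        byte[] data = concat(gzip("first\n"), gzip("second\n"));
        System.out.print(readAll(data));
    }
}
```

Here the caller sees the content of both members before the one and only EOF, 
which is the behavior I expected from the bzip2 stream as well.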

Anyway, thanks again for your help!

> Unable to uncompress bzip2 dbPedia files
> ----------------------------------------
>
>                 Key: COMPRESS-325
>                 URL: https://issues.apache.org/jira/browse/COMPRESS-325
>             Project: Commons Compress
>          Issue Type: Bug
>    Affects Versions: 1.10
>            Reporter: Frédérik Bilhaut
>
> Sample code:
> {code:java}
> URL url = new URL("http://downloads.dbpedia.org/current/core-i18n/en/labels_en.nt.bz2");
> InputStream input = new BZip2CompressorInputStream(url.openConnection().getInputStream());
> BufferedReader reader = new BufferedReader(new InputStreamReader(input, "US-ASCII"));
>                       
> int count = 0;
> for(String line = reader.readLine(); line != null; line = reader.readLine()) {
>       if(++count > 10000) break;
>       else System.out.println(count + ": " + line);
> }
> {code}
> It stops at line 7801 (EOF):
> {code}
> 7799: <http://dbpedia.org/resource/Gamemaster> <http://www.w3.org/2000/01/rdf-schema#label> "Gamemaster"@en .
> 7800: <http://dbpedia.org/resource/Genetic_engineering> <http://www.w3.org/2000/01/rdf-schema#label> "Genetic engineering"@en .
> 7801: <http://dbpedia.org/resource/Gradius_(video_game)> <http://www.w3.org/2000/01/rdf-s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
