Hi Martin,
Thanks for your response. The concatenated file itself is a perfectly valid
gzip file. Quote from RFC 1952 (http://www.ietf.org/rfc/rfc1952.txt):
    A gzip file consists of a series of "members" (compressed data
    sets). The format of each member is specified in the following
    section. The members simply appear one after another in the file,
    with no additional information before, between, or after them.
It doesn't make sense that I can't extract a valid gzip file unless I know
the structure of that file beforehand.
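
To make the RFC's point concrete, here is a minimal sketch in Go. It builds two gzip members in memory, concatenates them the way `cat` would, and decompresses the result as a single stream. The helper names gzipMember and gunzipAll are made up for this illustration, and the sketch assumes Go 1.16 or later for io.ReadAll.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMember (hypothetical helper) compresses s into one complete gzip
// member: header, deflate data, and trailer.
func gzipMember(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s)) // writes to a bytes.Buffer do not fail
	zw.Close()          // flushes the data and writes the trailer
	return buf.Bytes()
}

// gunzipAll (hypothetical helper) decompresses every member in data and
// returns the joined output.
func gunzipAll(data []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	// In-memory equivalent of: cat 1.txt.gz 2.txt.gz > 3.txt.gz
	combined := append(gzipMember("1\n"), gzipMember("2\n")...)
	out, err := gunzipAll(combined)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out) // prints "1\n2\n"
}
```

No knowledge of the member boundaries is needed here; the reader simply continues with the next member header after each trailer.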
I am not sure why Node's zlib module assumes the second piece is not a
gzipped data set. It also seems hard to tell from its API how many bytes
it has decompressed.
To check whether this is common practice, I tested Go's compress/gzip
package, which extracts the second member correctly (code attached below).
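// gzip-test.go
package main

import (
	"bufio"
	"compress/gzip"
	"fmt"
	"io"
	"log"
	"os"
)

func main() {
	file, err := os.Open("3.txt.gz")
	if err != nil {
		log.Fatal(err)
	}
	defer file.Close()

	// By default, gzip.Reader reads concatenated members as one stream.
	fileGzip, err := gzip.NewReader(file)
	if err != nil {
		log.Fatal(err)
	}
	defer fileGzip.Close()

	fileRead := bufio.NewReader(fileGzip)
	for i := 0; ; i++ {
		line, err := fileRead.ReadString('\n')
		if len(line) > 0 {
			fmt.Printf("Line %v: %s", i, line)
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
	}
}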
---------- Forwarded message ----------
From: "Martin Cooper" <[email protected]>
Date: Aug 10, 2013 7:44 AM
Subject: Re: [nodejs] zlib fails to extract concatenated files
To: <[email protected]>
Cc:
This isn't a bug. The zlib module decompressed the raw gzipped buffer and
stopped. It isn't the job of zlib to try to infer any structure on the
buffer you gave it, or to attempt to look for more work to do after
finishing what it was asked to do. Subsequent data in the buffer could be
anything, after all, not just gzipped data. In contrast, the gunzip command
is being told "this is a gzipped file (and nothing else)", so it's
reasonable for it to continue after processing one compressed buffer to see
if it can do more.
Note that, in your case, you can gunzip the second chunk by slicing it from
the original buffer. But remember that only you know the structure of the
file you are reading; there is no structure embedded within the file for
some piece of code to interpret.
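
The slicing approach described above can be sketched as follows. The example is in Go, to match the test program elsewhere in this thread, rather than Node. It assumes the caller already knows the length of the first chunk, which is exactly the out-of-band knowledge being discussed. The names gzipMember and gunzipFirst are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
)

// gzipMember (hypothetical helper) compresses s into one gzip member.
func gzipMember(s string) []byte {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	zw.Write([]byte(s))
	zw.Close()
	return buf.Bytes()
}

// gunzipFirst (hypothetical helper) decompresses only the first gzip member
// in data and ignores anything that follows it.
func gunzipFirst(data []byte) (string, error) {
	zr, err := gzip.NewReader(bytes.NewReader(data))
	if err != nil {
		return "", err
	}
	defer zr.Close()
	zr.Multistream(false) // do not continue into later members
	out, err := io.ReadAll(zr)
	return string(out), err
}

func main() {
	first, second := gzipMember("1\n"), gzipMember("2\n")
	combined := append(append([]byte{}, first...), second...)

	// The caller knows the second chunk starts right after the first one.
	out, err := gunzipFirst(combined[len(first):])
	if err != nil {
		panic(err)
	}
	fmt.Printf("%q\n", out) // prints "2\n"
}
```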
--
Martin Cooper
On Fri, Aug 9, 2013 at 11:45 PM, ribao wei <[email protected]> wrote:
> Hi,
>
> I just encountered this and am not sure whether it is a bug in the zlib module.
>
> echo "1" > 1.txt
> echo "2" > 2.txt
> gzip 1.txt
> gzip 2.txt
> cat 1.txt.gz 2.txt.gz > 3.txt.gz
>
> Now,
>
> gunzip -c 3.txt.gz
>
> prints:
> 1
> 2
>
>
> However,
>
> zlib.gunzip(fs.readFileSync("c.gz"), function (err, buffer) {
>   console.log(buffer.toString());
> });
>
> prints:
>
> 1
>
> (empty line)
>
>
> Am I missing something?
>
>
> Thanks,
>
> Wei
>
> --
> --
> Job Board: http://jobs.nodejs.org/
> Posting guidelines:
> https://github.com/joyent/node/wiki/Mailing-List-Posting-Guidelines
> You received this message because you are subscribed to the Google
> Groups "nodejs" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/nodejs?hl=en?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "nodejs" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.
>
>
>