Re: Read header and data from a binary file [LONG]
On Tue, 22 Sep 2009 18:18:16 -0300, Jose Rafael Pacheco wrote:

> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)
> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)
> a = struct.unpack(chain, s)

Easier:

    fmt = struct.Struct(chain)
    s = f.read(fmt.size)
    a = fmt.unpack(s)

> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

The chunk module (http://docs.python.org/library/chunk.html) is designed to work with this kind of file format.

-- 
Gabriel Genellina
-- 
http://mail.python.org/mailman/listinfo/python-list
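The struct.Struct suggestion above can be sketched as a small, self-contained script. This is only an illustration written for Python 3: the helper name read_header is hypothetical, and the in-memory bytes stand in for myaudio.dat. The field values are copied from the output quoted in the thread, except the comment text, the padding byte, and the last length, which are made up for the demo.

```python
import io
import struct

# Same layout as the format string in the original post (15 fields, little-endian).
HEADER = struct.Struct("< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I")


def read_header(f):
    """Read exactly HEADER.size bytes from f and unpack them into a tuple."""
    raw = f.read(HEADER.size)
    if len(raw) != HEADER.size:
        raise ValueError("file too short for header: got %d bytes" % len(raw))
    return HEADER.unpack(raw)


# Hypothetical stand-in for myaudio.dat, built in memory for illustration only.
sample = HEADER.pack(
    b"FORM", b"DS16", 300126,
    b"HEDR", 32, b"Jul 13 11:57:41 1994", 5, 150001, -44076,
    b"NOTE", 68, b"x" * 67, b"\x00",   # made-up comment text and padding byte
    b"SDA_", 300002,                   # made-up length, not taken from the thread
)

fields = read_header(io.BytesIO(sample))
print(HEADER.size)            # 136
print(fields[0], fields[2])   # b'FORM' 300126
```

Compiling the format once keeps the size and the unpacking in one object, so the byte count never has to be added up by hand.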
Re: Read header and data from a binary file [LONG]
> char is 1 bytes long on Python (as per struct modules' definition)

Also, xstruct is another option you can use instead of the built-in struct module:
http://www.sis.nl/python/xstruct/xstruct.shtml

-- 
Regards,
Ishwor Gurung
Re: Read header and data from a binary file [LONG]
Jose,

Hi

Note: I worked with struct a while ago, so I might be a bit rusty.

Also, this sounds a bit like homework. If it is, please do it yourself (or at least try), as otherwise you'd never learn how this works in a real-world scenario :-)

Having said that, I am giving you an example below.

> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)

The f.seek(0) line is not needed, afaik, but you can keep it if you want to.

> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

which is 136 bytes.

> a = struct.unpack(chain, s)

Yep. This unpacks the 136 bytes of `s' into `a' according to chain, using little-endian byte order.

> header = {'identifier' : a[0],
>           'cid'        : a[1],
>           'clength'    : a[2],
>           'hident'     : a[3],
>           'hcid32'     : a[4],
>           'hdate'      : a[5],
>           'sampling'   : a[6],
>           'length_B'   : a[7],
>           'max_cA'     : a[8],
>           'max_cA1'    : a[9],
>           'identNOTE'  : a[10],
>           'c2len'      : a[11],}
>
> It produces:
>
> {'length_B': 150001, 'sampling': 5, 'max_cA1': 'NOTE', 'hident': 'HEDR',
> 'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
> MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
> 'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}
>
> So far when I run f.tell()
> >>>f.tell()
> 136L

tell() gives you the current position in the file. You have read 136 bytes so far, so tell() reports 136 as the current position in the binary file.

> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

clength above is 300126. Maybe you can use that to get Data? :-)

SDA_'s format: does it mean that Data runs from offset 8 to EOF? If it starts 8 bytes after the header, then what is stored between lengthOf(header) and lengthOf(header)+8?

In any case, as I understand it, to get all the values from offset 8 (called `Data' in your protocol spec), you can do:

    reading_after_136_file_pos_to_eof = f.read()  # continue from 136L above
    clen_fs = '<%ds' % clength  # I assume here that it is character data
    x = struct.unpack(clen_fs, reading_after_136_file_pos_to_eof[8:])  # start at index 8 onwards

Now `x' will hold the unpacked value of reading_after_136_file_pos_to_eof starting from the 8th byte, and will only store 300126 characters (1 byte each, so 300126 bytes long), assuming each char is 1 byte long in Python (as per the struct module's definition).

[ ... ]

-- 
Regards,
Ishwor Gurung
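One caveat about the snippet above: struct.unpack raises struct.error unless the buffer is exactly as long as the format requires, so the length used has to match the bytes actually passed in. Going by the quoted format description, the trailing "4s I" of the 136-byte format would already be the SDA_ identifier and its chunk length, so the samples may well start right at position 136. A minimal sketch of reading one such chunk into an array follows, written for Python 3. The function name read_samples is hypothetical, and the sample type (signed 16-bit, little-endian) is an assumption suggested by the "DS16" identifier, not something the thread confirms.

```python
import array
import io
import struct
import sys


def read_samples(f):
    """Read one chunk header (id, length) at the current position, then its samples."""
    chunk_id, length = struct.unpack("<4sI", f.read(8))
    data = f.read(length)
    if len(data) != length:
        raise ValueError("chunk %r is truncated" % chunk_id)
    samples = array.array("h")                      # signed 16-bit: ASSUMED sample type
    samples.frombytes(data[:length - length % 2])   # drop a trailing odd byte, if any
    if sys.byteorder == "big":                      # file is ASSUMED to be little-endian
        samples.byteswap()
    return chunk_id, samples


# Synthetic chunk for illustration: id, length 6, then three 16-bit samples.
demo = io.BytesIO(b"SDA_" + struct.pack("<I", 6) + struct.pack("<3h", 1, -2, 300))
chunk_id, samples = read_samples(demo)
print(chunk_id, samples.tolist())   # b'SDA_' [1, -2, 300]
```

If the identifier and length have already been consumed as part of a larger header format, only the f.read(length) and array steps would be needed, with the length taken from the unpacked header.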
Re: Read header and data from a binary file
Jose Rafael Pacheco wrote:

> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)
> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

[snip]

FYI, the struct module has a function called 'calcsize', so:

    s = f.read(struct.calcsize(chain))
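As a quick check of the calcsize suggestion, the sketch below compares it with the hand-written sum from the original post. The variable name by_hand is introduced here for illustration. The last two lines show why the "<" prefix matters: it disables alignment padding, whereas the result for native mode ("@") depends on the platform, so no value is claimed for it.

```python
import struct

chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"

# The sum written out by hand in the original post.
by_hand = 4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1

print(struct.calcsize(chain), by_hand)   # 136 136

# With "<" there is no alignment padding: 1 byte + 4 bytes.
print(struct.calcsize("<sI"))            # 5

# Native mode may insert padding before the integer; the result is platform-dependent.
print(struct.calcsize("@sI"))
```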
Re: Read header and data from a binary file
On Tue, Sep 22, 2009 at 4:30 PM, Jose Rafael Pacheco wrote:

> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)

Don't bother to seek(0) on a file you just opened.

> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

Instead of calculating the size of the data represented by the format by hand, use the struct.calcsize() function:

    s = f.read(struct.calcsize(chain))

> a = struct.unpack(chain, s)
> header = {'identifier' : a[0],
>           'cid'        : a[1],
>           'clength'    : a[2],
>           'hident'     : a[3],
>           'hcid32'     : a[4],
>           'hdate'      : a[5],
>           'sampling'   : a[6],
>           'length_B'   : a[7],
>           'max_cA'     : a[8],
>           'max_cA1'    : a[9],
>           'identNOTE'  : a[10],
>           'c2len'      : a[11],}
>
> It produces:
>
> {'length_B': 150001, 'sampling': 5, 'max_cA1': 'NOTE', 'hident': 'HEDR',
> 'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
> MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
> 'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}
>
> So far when I run f.tell()
> >>>f.tell()
> 136L
>
> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

Read the chunk ID and length, and then use the length to read the rest of the chunk data.

> Thanks
>
> The file format is:
>
> Offset   Length   Type               Contents
> 0        4        character          Identifier: "FORM"
> 4        4        character          Chunk identifier: "DS16"
> 8        4        integer            Chunk length
> 12       -        -                  Chunk data
>
> Header 2
>
> Offset   Length   Type               Contents
> 0        4        character          Identifier: "HEDR" or "HDR8"
> 4        4        integer            Chunk length (32)
> 8        20       character          Date, e.g. "May 26 23:57:43 1995"
> 28       4        integer            Sampling rate
> 32       4        integer            Data length (bytes)
> 36       2        unsigned integer   Maximum absolute value for channel A:
>                                      0x if not defined
> 38       2        unsigned integer   Maximum absolute value for channel A:
>                                      0x if not defined
>
> NOTE Chunk
>
> Offset   Length   Type               Contents
> 0        4        character          Identifier: "NOTE"
> 4        4        integer            Chunk length
> 8        -        character          Comment string
>
> SDA_, SD_A or SDAB Chunk
>
> Offset   Length   Type               Contents
> 0        4        character          Identifier: "SDA_", "SD_B", or "SDAB"
> 4        4        integer            Chunk length
> 8        -        -                  Data