Re: Read header and data from a binary file [LONG]

2009-09-25 Thread Gabriel Genellina
On Tue, 22 Sep 2009 18:18:16 -0300, Jose Rafael Pacheco wrote:



> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)
> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)
> a = struct.unpack(chain, s)


Easier:
fmt = struct.Struct(chain)
s = f.read(fmt.size)
a = fmt.unpack(s)
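
As a rough illustration of the shortcut above, here is a small sketch that runs on made-up header bytes rather than the real myaudio.dat. It is written for Python 3, where the `s` fields come back as bytes. The name `fake_header` and every value packed into it are placeholders, not the contents of the actual file.

```python
import struct

chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
fmt = struct.Struct(chain)

# Build a fake header with placeholder values (NOT the real file contents).
fake_header = fmt.pack(
    b"FORM", b"DS16", 0,
    b"HEDR", 32, b"Jan 01 00:00:00 2000", 0, 0, 0,
    b"NOTE", 68, b"comment", b"\x00",
    b"SDA_", 0,
)

header_size = fmt.size            # computed from the format, no manual sum
fields = fmt.unpack(fake_header)  # one tuple entry per format item
```

The compiled Struct object carries its own size, so the read length and the unpack format cannot drift apart.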


> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?


The chunk module (http://docs.python.org/library/chunk.html) is designed
to work with this kind of file format.


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: Read header and data from a binary file [LONG]

2009-09-23 Thread Ishwor
> char is 1 bytes long on Python (as per struct modules' definition)

Also, here is another option you could use instead of the built-in struct module:
http://www.sis.nl/python/xstruct/xstruct.shtml

-- 
Regards,
Ishwor Gurung


Re: Read header and data from a binary file [LONG]

2009-09-23 Thread Ishwor
Hi Jose,

Note: I've worked with struct, but that was a while ago, so I might be a
bit rusty. Also, this sounds a bit like homework. If it is homework, please
do it yourself (or at least try), as otherwise you'd never learn the
knowledge behind it for real-world scenarios :-)

Having said that, I'm giving you an example below as part of my reply.

> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)

The f.seek(0) line is not explicitly needed, AFAIK, but you can keep it if you want to.

> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*
> 1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

which is 136 bytes.

> a = struct.unpack(chain, s)

Yep. This unpacks the 136 bytes of `s' into the tuple `a', using
little-endian byte order, according to chain.

> header = {'identifier' : a[0],
>           'cid'        : a[1],
>           'clength'    : a[2],
>           'hident'     : a[3],
>           'hcid32'     : a[4],
>           'hdate'      : a[5],
>           'sampling'   : a[6],
>           'length_B'   : a[7],
>           'max_cA'     : a[8],
>           'max_cA1'    : a[9],
>           'identNOTE'  : a[10],
>           'c2len'      : a[11],}
>
> It produces:
>
> {'length_B': 150001, 'sampling': 5, 'max_cA1': 'NOTE', 'hident': 'HEDR',
> 'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
> MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
> 'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}
>
> So far when I run f.tell()
>>>f.tell()
> 136L

tell() gives you the current position in the file (you have read 136
bytes so far, so tell() reports 136 as the current position in the
binary file).
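
A tiny sketch of how the position moves, using an in-memory io.BytesIO as a stand-in for the real file (the 200 zero bytes are placeholder content, and the 8-byte skip is only there to show a relative seek):

```python
import io

f = io.BytesIO(bytes(200))   # stand-in for the open file: 200 zero bytes
start = f.tell()             # position right after opening
f.read(136)                  # read the header
after_header = f.tell()      # position after the 136-byte header
f.seek(8, io.SEEK_CUR)       # move 8 bytes forward from the current position
after_skip = f.tell()
```

Every read() or seek() moves the position, and tell() only reports it.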

> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

clength above is 300126. Maybe you can use that to get Data? :-)

SDA_'s format: does it mean the data runs from offset 8 to EOF?

If it starts 8 bytes after the header, then what is stored between
lengthOf(header) and lengthOf(header)+8?

In any case, as I understand it, to get all the values from offset 8
onwards (called `Data' in your protocol spec), you can do:

reading_after_136_file_pos_to_eof = f.read()  # continue from 136L above
data = reading_after_136_file_pos_to_eof[8:]  # start at index 8 onwards
clen_fs = '<%ds' % len(data)  # I assume here that each item is a character
x = struct.unpack(clen_fs, data)

Now `x' holds the unpacked value of reading_after_136_file_pos_to_eof
starting from the 8th byte, as a single string of characters, assuming
each char is 1 byte long in Python (as per the struct module's
definition). struct.unpack needs the format size to match the buffer
length exactly, so the count is taken from len(data) rather than from
clength.
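
If the data chunk holds 16-bit samples (an assumption suggested by the "DS16" identifier, not something confirmed in this thread), the raw bytes could be turned into an array of integers roughly like this. The sketch is for Python 3, and `raw` is a made-up three-sample placeholder, not real audio data.

```python
import array
import struct
import sys

# Placeholder sample data: three signed 16-bit little-endian values.
raw = struct.pack("<3h", 100, -200, 300)

samples = array.array("h")   # 'h' = signed short, 2 bytes on common platforms
samples.frombytes(raw)       # reinterpret the raw bytes as 16-bit integers
if sys.byteorder == "big":
    samples.byteswap()       # the file data is assumed to be little-endian
```

The same result is available as a tuple through struct.unpack("<%dh" % (len(raw) // 2), raw), but an array is more compact for long recordings.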

[ ... ]

-- 
Regards,
Ishwor Gurung


Re: Read header and data from a binary file

2009-09-22 Thread MRAB

Jose Rafael Pacheco wrote:

> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)
> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

[snip]
FYI, the struct module has a function called 'calcsize', so:

s = f.read(struct.calcsize(chain))


Re: Read header and data from a binary file

2009-09-22 Thread Simon Forman
On Tue, Sep 22, 2009 at 4:30 PM, Jose Rafael Pacheco wrote:
> Hello,
>
> I want to read from a binary file called myaudio.dat
> Then I've tried the next code:
>
> import struct
> name = "myaudio.dat"
> f = open(name,'rb')
> f.seek(0)

Don't bother to seek(0) on a file you just opened.

> chain = "< 4s 4s I 4s I 20s I I i 4s I 67s s 4s I"
> s = f.read(4*1+4*1+4*1+4*1+4*1+20*1+4*1+4*1+4*1+4*1+4*1+67*1+1+4*1+4*1)

Instead of calculating the size of the data represented by the format
by hand, use the struct.calcsize() function:

s = f.read(struct.calcsize(chain))

> a = struct.unpack(chain, s)
> header = {'identifier' : a[0],
>           'cid'        : a[1],
>           'clength'    : a[2],
>           'hident'     : a[3],
>           'hcid32'     : a[4],
>           'hdate'      : a[5],
>           'sampling'   : a[6],
>           'length_B'   : a[7],
>           'max_cA'     : a[8],
>           'max_cA1'    : a[9],
>           'identNOTE'  : a[10],
>           'c2len'      : a[11],}
>
> It produces:
>
> {'length_B': 150001, 'sampling': 5, 'max_cA1': 'NOTE', 'hident': 'HEDR',
> 'c2len': "Normal Sustained Vowel 'A', Voice and Speech Lab., MEEI, Boston,
> MA", 'hdate': 'Jul 13 11:57:41 1994', 'identNOTE': 68, 'max_cA': -44076,
> 'cid': 'DS16', 'hcid32': 32, 'identifier': 'FORM', 'clength': 300126}
>
> So far when I run f.tell()
>>>f.tell()
> 136L
>
> The audio data length is 300126, now I need a clue to build an array with
> the audio data (The Chunk SDA_), would it possible with struct?, any help ?

Read the chunk ID and length and then use the length to read the rest
of the chunk data.
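
A minimal sketch of that loop, assuming each chunk is a 4-byte ID plus a 4-byte little-endian length followed directly by the data, with no padding between chunks (this matches the header layout quoted in this thread, but it is still an assumption about the format). The helper name `iter_chunks` and the tiny in-memory file are made up for illustration, and the code targets Python 3.

```python
import io
import struct

def iter_chunks(f):
    """Yield (chunk_id, data) pairs from a stream of ID/length/data chunks."""
    while True:
        head = f.read(8)
        if len(head) < 8:        # end of file (or a truncated chunk header)
            return
        chunk_id, length = struct.unpack("<4sI", head)
        yield chunk_id, f.read(length)

# Placeholder file: the 12-byte FORM/DS16 prefix followed by two tiny chunks.
note = b"hi"
sda = struct.pack("<2h", 1, -1)
body = (struct.pack("<4sI", b"NOTE", len(note)) + note +
        struct.pack("<4sI", b"SDA_", len(sda)) + sda)
fake = io.BytesIO(struct.pack("<4s4sI", b"FORM", b"DS16", len(body)) + body)

# Read the outer FORM header first, then walk the chunks inside it.
form_id, form_type, form_len = struct.unpack("<4s4sI", fake.read(12))
chunks = dict(iter_chunks(fake))
```

With the real file you would open it in 'rb' mode instead of using io.BytesIO, and the audio bytes would be whatever is stored under the data chunk's ID.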



> Thanks
>
> The file format is:
>
>
> Offset  Length  Type               Contents
> 0       4       character          Identifier: "FORM"
> 4       4       character          Chunk identifier: "DS16"
> 8       4       integer            Chunk length
> 12      -       -                  Chunk data
>
> Header 2
>
> Offset  Length  Type               Contents
> 0       4       character          Identifier: "HEDR" or "HDR8"
> 4       4       integer            Chunk length (32)
> 8       20      character          Date, e.g. "May 26 23:57:43 1995"
> 28      4       integer            Sampling rate
> 32      4       integer            Data length (bytes)
> 36      2       unsigned integer   Maximum absolute value for channel A: 0x if not defined
> 38      2       unsigned integer   Maximum absolute value for channel A: 0x if not defined
>
> NOTE Chunk
>
> Offset  Length  Type               Contents
> 0       4       character          Identifier: "NOTE"
> 4       4       integer            Chunk length
> 8       -       character          Comment string
>
> SDA_, SD_B or SDAB Chunk
>
> Offset  Length  Type               Contents
> 0       4       character          Identifier: "SDA_", "SD_B", or "SDAB"
> 4       4       integer            Chunk length
> 8       -       -                  Data
>
>