Dear all,
I am new to audio processing research and am using the pydub module in Python 3
to manipulate audio files in “m4a” format. I can read the original m4a files
with pydub without problems, but after a few processing steps (such as VAD and
data augmentation), I am unable to read the frames of the resulting m4a files
as a numpy.ndarray and get the error shown below:
>>> np.array(frames["music_no_silence"].get_array_of_samples())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/audio_segment.py", line 272, in get_array_of_samples
    array_type_override = self.array_type
  File "/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/audio_segment.py", line 277, in array_type
    return get_array_type(self.sample_width * 8)
  File "/home/user/miniconda/envs/py36/lib/python3.6/site-packages/pydub/utils.py", line 43, in get_array_type
    t = ARRAY_TYPES[bit_depth]
KeyError: 64
It is weird that all the m4a files, whether original inputs or final outputs,
can be opened successfully in audio applications and sound reasonable through
speakers. Investigating further, I noticed that each frame in the final outputs
is 8 bytes long, while each frame in the original inputs is 2 bytes long.
When the original input and final output files are opened in Audacity, both
are displayed as “mono 16000Hz 32-bit float”. Since 2-byte frames cannot be
interpreted as 32-bit floats, I guess the 32-bit float format is the result of
a normalization operation in Audacity.
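To illustrate the kind of normalization I am guessing at, here is a minimal sketch of the common convention of scaling 16-bit integer samples into the range [-1.0, 1.0) as 32-bit floats. The helper name int16_to_float32 is hypothetical, and I do not know whether this is what Audacity actually does.

```python
import numpy as np

def int16_to_float32(samples):
    """Scale signed 16-bit PCM samples to float32 values in [-1.0, 1.0).

    Assumption: dividing by 32768 (2**15) is a common convention; it is
    not confirmed to be the operation Audacity uses.
    """
    samples = np.asarray(samples, dtype=np.int16)
    return samples.astype(np.float32) / np.float32(32768.0)

if __name__ == "__main__":
    pcm = np.array([-32768, 0, 16384], dtype=np.int16)
    print(int16_to_float32(pcm))
```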
My question is: for frames of 2, 4, or 8 bytes, what NumPy data type should
they be converted to?
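To make the question concrete, this is a minimal sketch of the conversion I have in mind. The mapping CANDIDATE_DTYPES and the helper raw_to_array are hypothetical names. The 1-, 2- and 4-byte entries assume signed-integer PCM; the 8-byte entry is only a guess (float64, though it could equally be int64), since the traceback above shows that pydub has no entry for a 64-bit depth.

```python
import numpy as np

# Candidate mapping from bytes per sample to a NumPy dtype.
# Assumption: 1, 2 and 4 bytes are signed-integer PCM.
# Assumption: 8 bytes is float64; it might be int64 instead, depending on
# which processing step produced the data.
CANDIDATE_DTYPES = {
    1: np.int8,
    2: np.int16,
    4: np.int32,
    8: np.float64,
}

def raw_to_array(raw, sample_width, channels=1):
    """Interpret raw sample bytes as an array of shape (frames, channels)."""
    if sample_width not in CANDIDATE_DTYPES:
        raise ValueError("unsupported sample width: %d bytes" % sample_width)
    dtype = np.dtype(CANDIDATE_DTYPES[sample_width])
    # np.frombuffer reads the bytes in native byte order without copying.
    samples = np.frombuffer(raw, dtype=dtype)
    return samples.reshape(-1, channels)

if __name__ == "__main__":
    raw = np.array([0, 1, -1, 32767], dtype=np.int16).tobytes()
    print(raw_to_array(raw, sample_width=2))
```

With pydub, the raw bytes and the width would presumably come from the segment's raw_data and sample_width attributes, but which dtype is right for the 8-byte case is exactly what I am unsure about.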
Also, does anyone know which normalization operation Audacity uses?
Thanks a lot!
buddhainside