Hello Roman,
* Roman Kempa [01/30/09 22:37]:
> Hi,
> first of all let me introduce myself. My name is Roman Kempa and I'm a
> student in the last semester of Computer Science at the Silesian
> University of Technology in Poland. For my diploma project I need to
> implement 3D sound in an audio conference (auralization using
> headphones with HRTFs) using SEMS.

I find that very interesting. I was recently shown a demonstration of
spatial audio, and I think it could greatly enhance intelligibility in
conferences.
> As far as I have seen, there are no examples in SEMS that use stereo
> sound. Can you please give me any tips on how to create a plug-in
> that uses stereo sound (with the L16 codec, for example)?
To start, I would use the wideband branch
(https://svn.berlios.de/svnroot/repos/sems/branches/wb), as it makes
sense to use wideband anyway (wideband is to be merged into trunk when
it is considered stable; I have had good experience with it so far).
Now, the audio engine in the core basically supports stereo, but, as
with wideband, it has not been used so far.
If you are going to hack on it, I think it is useful to know a little
about how audio is processed in SEMS.
So, in SEMS every session has the property rtp_str, of type
AmRtpAudio, which binds together an AmRtpStream and an AmAudio (it is
a subclass of both). Additionally, every session has one "input"
AmAudio and one "output" AmAudio. What the AmMediaProcessor basically
does (omitting stuff like local audio) is this (look at
AmMediaProcessorThread::processAudio):
while (true) {
  // receiving: put audio from received packets into the
  // dejitter/playout buffer
  session->rtp_str.receive(ts);
  session->rtp_str.get(ts, buffer, frame_length);
  session->input->put(ts, buffer, rcvd_audio_len);

  // sending
  session->output->get(ts, buffer, frame_length);
  session->rtp_str.put(ts, buffer, size);

  ts += x ms;
}
It is important to see that both AmSession::input/AmSession::output and
AmSession::rtp_str are of the type AmAudio (the basic audio interface in
SEMS). Now, a simplified look at AmAudio::get and AmAudio::put:
AmAudio::get {
  read(user_ts, size);
  size = decode(size);
  size = downMix(size);
}

AmAudio::put {
  upMix(size);
  size = encode(size);
  write();
}
AmAudio::read and AmAudio::write are implemented by the audio devices
(e.g. AmAudioFile, AmRtpStream, AmPlaylist etc.). AmAudio::encode and
AmAudio::decode basically call the encode/decode functions of the
audio format.
So, when AmRtpStream::get is called, all audio is converted into the
internal format (downMix), which includes resampling to the internal
rate and stereo2mono.
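For illustration, the stereo2mono part of such a downmix is typically
just averaging each interleaved L/R sample pair. This is a generic
sketch, not the actual SEMS downMix code, and the function name is made
up:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical sketch of a stereo2mono step: average each pair of
// interleaved L/R 16-bit samples into one mono sample.
std::vector<int16_t> stereo2mono(const std::vector<int16_t>& interleaved) {
  std::vector<int16_t> mono(interleaved.size() / 2);
  for (std::size_t i = 0; i < mono.size(); ++i) {
    // widen to 32 bit so the sum cannot overflow before dividing
    int32_t l = interleaved[2 * i];
    int32_t r = interleaved[2 * i + 1];
    mono[i] = static_cast<int16_t>((l + r) / 2);
  }
  return mono;
}
```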
For simplicity, we could just set the internal format to stereo (like
I did with wideband: the internal sample rate is set to
SYSTEM_SAMPLERATE, so on AmRtpStream::get everything is converted to
e.g. 16 kHz if it is not already at 16 kHz). Alternatively, we would
have to downMix the audio in put - i.e. if the audio is stereo, but
the audio format of the RTP stream is mono, convert stereo to mono.
This is of course better, but then the other functions can no longer
assume that the audio is in one format - they would have to check what
they got.
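The second alternative could look roughly like this. The Format struct
and function names here are illustrative stand-ins, not the real SEMS
classes; the point is only the "check what you got, convert only if
needed" pattern:

```cpp
#include <cstdint>
#include <vector>

// Illustrative stand-in for an audio format descriptor (not the
// actual SEMS AmAudioFormat class).
struct Format {
  int channels;
  int rate;
};

// Convert `in` (described by `src`) to the channel layout `dst`
// expects; pass through unchanged when the layouts already match.
std::vector<int16_t> adapt(const std::vector<int16_t>& in,
                           const Format& src, const Format& dst) {
  if (src.channels == 2 && dst.channels == 1) {
    // stereo -> mono: average interleaved L/R sample pairs
    std::vector<int16_t> mono(in.size() / 2);
    for (std::size_t i = 0; i < mono.size(); ++i)
      mono[i] = static_cast<int16_t>(
          (static_cast<int32_t>(in[2 * i]) + in[2 * i + 1]) / 2);
    return mono;
  }
  // same channel count (resampling etc. omitted here)
  return in;
}
```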
You can get stereo L16 into SEMS by adding some stereo payloads, e.g.
like this:
Index: plug-in/l16/l16.c
===================================================================
--- plug-in/l16/l16.c (revision 1082)
+++ plug-in/l16/l16.c (working copy)
@@ -62,12 +62,16 @@
BEGIN_PAYLOADS
PAYLOAD( -1, "L16", 8000, 8000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 8000, 8000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
#if SYSTEM_SAMPLERATE >=16000
PAYLOAD( -1, "L16", 16000, 16000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 16000, 16000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
#if SYSTEM_SAMPLERATE >=32000
PAYLOAD( -1, "L16", 32000, 32000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 32000, 32000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
#if SYSTEM_SAMPLERATE >=48000
PAYLOAD( -1, "L16", 48000, 48000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 48000, 48000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
#endif
#endif
#endif
Now, if I understand correctly, an HRTF gives you the impulse response
depending on the position, from which you calculate the filter. So you
will have to filter each channel of a conference according to its
current value, but then all channels are mixed linearly, independently
of the other channels' filters. So what you can simply do is implement
a filter and feed the audio through it before mixing it with the
conference mixer. For this, there are some useful tools in
AmAdvancedAudio: you can use the AmAudioQueue, with a filter in it,
and the conference channel. Search the list for AmAudioQueue; there
were some posts about it, e.g. this one:
http://www.mail-archive.com/[email protected]/msg00838.html
The example code would almost work; you only need to set the entry in
the queue to write and read (you want to write to and read from the
filter). The other thing that will have to be done is to make the
conference mixer able to mix stereo, which should be fairly easy.
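To make the "filter each channel, then mix linearly" idea concrete,
here is a very rough, self-contained sketch in plain C++. It is not
the SEMS API - all names (convolve, spatializeAndMix, StereoFrame) are
made up, and it uses a naive time-domain FIR convolution of each
participant's mono signal with a left/right HRIR pair:

```cpp
#include <vector>

// Naive time-domain FIR convolution: y = x * h.
static std::vector<float> convolve(const std::vector<float>& x,
                                   const std::vector<float>& h) {
  std::vector<float> y(x.size() + h.size() - 1, 0.0f);
  for (std::size_t n = 0; n < x.size(); ++n)
    for (std::size_t k = 0; k < h.size(); ++k)
      y[n + k] += x[n] * h[k];
  return y;
}

struct StereoFrame {
  std::vector<float> left, right;
};

// Filter each participant independently with its HRIR pair, then sum
// sample by sample (linear mixing) - so the filters of the other
// channels do not interact, as described above. In a real conference
// each participant would have its own position-dependent HRIR pair;
// a single shared pair is used here only to keep the sketch short.
StereoFrame spatializeAndMix(
    const std::vector<std::vector<float>>& participants,
    const std::vector<float>& hrirLeft,
    const std::vector<float>& hrirRight) {
  StereoFrame out;
  for (const auto& p : participants) {
    std::vector<float> l = convolve(p, hrirLeft);
    std::vector<float> r = convolve(p, hrirRight);
    if (out.left.size() < l.size()) out.left.resize(l.size(), 0.0f);
    if (out.right.size() < r.size()) out.right.resize(r.size(), 0.0f);
    for (std::size_t i = 0; i < l.size(); ++i) out.left[i] += l[i];
    for (std::size_t i = 0; i < r.size(); ++i) out.right[i] += r[i];
  }
  return out;
}
```

In SEMS terms, the convolution step would live in the filter entry of
the AmAudioQueue, and the summation is what the stereo-capable
conference mixer would do.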
So, I hope this will get you started. Feel free to ask any further
questions.
Stefan
> Any help will be much appreciated :-)
> Best regards,
> Roman Kempa
------------------------------------------------------------------------
_______________________________________________
Semsdev mailing list
[email protected]
http://lists.iptel.org/mailman/listinfo/semsdev
--
Stefan Sayer
VoIP Services
[email protected]
www.iptego.com
IPTEGO GmbH
Am Borsigturm 40
13507 Berlin
Germany
Amtsgericht Charlottenburg, HRB 101010
Geschaeftsfuehrer: Alexander Hoffmann