Hello Roman,

* Roman Kempa [01/30/09 22:37]:
Hi,

first of all, let me introduce myself. My name is Roman Kempa and I am a final-semester student of Computer Science at the Silesian University of Technology in Poland. For my diploma project I need to implement 3D sound in an audio conference (auralization over headphones using HRTFs) with SEMS.

I find that very interesting. I was recently shown a demonstration of spatial audio, and I think it could greatly enhance intelligibility in conferences.

As far as I have seen, there are no examples in SEMS that use stereo sound. Can you please give me some tips on how to create a plug-in that uses stereo sound (with the L16 codec, for example)?
To start, I would use the wideband branch (https://svn.berlios.de/svnroot/repos/sems/branches/wb), as it makes sense to use wideband anyway (wideband is to be merged into trunk once it is considered stable; I have actually had good experience with it). Now, the audio engine in the core basically supports stereo, but, as was the case with wideband, that support has not been used so far.

If you are going to hack on it, I think it is useful to know a little about how audio is processed in SEMS.

So, in SEMS every session has the property rtp_str of type AmRtpAudio, which binds together an AmRtpStream and an AmAudio (it is a subclass of both). Additionally, every session has one "input" AmAudio and one "output" AmAudio. What the AmMediaProcessor basically does (omitting things like local audio) is the following (see AmMediaProcessorThread::processAudio):
while (true) {
  // receiving
  session->rtp_str.receive(ts); // puts audio from received packets into dejitter/playout buffer
  session->rtp_str.get(ts, buffer, frame_length);
  session->input->put(ts, buffer, rcvd_audio_len);

  // sending
  session->output->get(ts, buffer, frame_length);
  session->rtp_str.put(ts, buffer, size);

  ts += x ms; // advance by one frame period
}

It is important to see that AmSession::input, AmSession::output, and AmSession::rtp_str are all of type AmAudio (the basic audio interface in SEMS). Now, a simplified look at AmAudio::get and AmAudio::put:

AmAudio::get {
  size = read(user_ts, size);
  size = decode(size);
  size = downMix(size);
}

AmAudio::put {
  size = upMix(size);
  size = encode(size);
  write(size);
}

AmAudio::read and AmAudio::write are implemented by the audio devices (e.g. AmAudioFile, AmRtpStream, AmPlaylist, etc.). AmAudio::encode and AmAudio::decode basically call the encode/decode functions of the audio format.

So, when AmRtpStream::get is called, all audio is converted into the internal format (downMix), which includes resampling to the internal rate and stereo2mono conversion.
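As an illustration of the stereo2mono step (a sketch only, not the actual SEMS implementation; the function name and buffer layout are assumptions), downmixing interleaved 16-bit stereo to mono just averages each left/right sample pair:

```cpp
#include <cassert>
#include <cstdint>
#include <cstddef>
#include <vector>

// Sketch: downmix interleaved 16-bit stereo PCM (L/R/L/R...) to mono.
// Each output sample is the average of one L/R pair.
std::vector<int16_t> stereo2mono(const std::vector<int16_t>& in) {
    assert(in.size() % 2 == 0);
    std::vector<int16_t> out(in.size() / 2);
    for (size_t i = 0; i < out.size(); ++i) {
        // widen to int32_t so the sum cannot overflow before dividing
        int32_t l = in[2 * i];
        int32_t r = in[2 * i + 1];
        out[i] = static_cast<int16_t>((l + r) / 2);
    }
    return out;
}
```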

For simplicity, we could just set the internal format to stereo (like I did with wideband: the internal sample rate is set to SYSTEM_SAMPLERATE, so on AmRtpStream::get everything is converted to e.g. 16 kHz if it is not already). Alternatively, we would have to downMix the audio in put, i.e. if the audio is stereo but the audio format of the RTP stream is mono, convert stereo to mono. This is of course better, but then the other functions can no longer assume the audio is in one fixed format; they would have to check what they got.
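For that second variant, the put path would have to inspect the channel count it got and convert only when it differs from the target payload's. A minimal sketch of such a helper (the name and signature are assumptions for illustration, not the SEMS API):

```cpp
#include <cstdint>
#include <cstddef>
#include <vector>

// Sketch: before encoding, make the buffer match the RTP payload's
// channel count. Mono -> stereo duplicates each sample into L and R;
// stereo -> mono averages each L/R pair. Hypothetical helper only.
std::vector<int16_t> matchChannels(const std::vector<int16_t>& in,
                                   int in_ch, int out_ch) {
    if (in_ch == out_ch) return in; // formats already agree
    std::vector<int16_t> out;
    if (in_ch == 1 && out_ch == 2) {
        out.reserve(in.size() * 2);
        for (int16_t s : in) { out.push_back(s); out.push_back(s); }
    } else if (in_ch == 2 && out_ch == 1) {
        out.resize(in.size() / 2);
        for (size_t i = 0; i < out.size(); ++i)
            out[i] = static_cast<int16_t>(
                (int32_t(in[2 * i]) + in[2 * i + 1]) / 2);
    }
    return out;
}
```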

You can get stereo L16 into SEMS by adding some stereo payloads, e.g. like this:

Index: plug-in/l16/l16.c
===================================================================
--- plug-in/l16/l16.c   (revision 1082)
+++ plug-in/l16/l16.c   (working copy)
@@ -62,12 +62,16 @@

 BEGIN_PAYLOADS
 PAYLOAD( -1, "L16",  8000,  8000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16",  8000,  8000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
 #if SYSTEM_SAMPLERATE >=16000
 PAYLOAD( -1, "L16", 16000, 16000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 16000, 16000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
 #if SYSTEM_SAMPLERATE >=32000
 PAYLOAD( -1, "L16", 32000, 32000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 32000, 32000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
 #if SYSTEM_SAMPLERATE >=48000
 PAYLOAD( -1, "L16", 48000, 48000, 1, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
+PAYLOAD( -1, "L16", 48000, 48000, 2, CODEC_L16, AMCI_PT_AUDIO_LINEAR )
 #endif
 #endif
 #endif

Now, if I understand correctly, HRTF gives you the impulse response depending on the position, from which you calculate the filter. So you will have to filter each channel of a conference according to the current position, but all channels are then composited linearly, independently of the other channels' filters. What you can simply do is implement a filter and feed the audio through it before mixing it in the conference mixer. For this there are some useful tools in AmAdvancedAudio: you can use an AmAudioQueue, with a filter in it, together with the conference channel. Search the list for AmAudioQueue; there were some posts about it, e.g. this one: http://www.mail-archive.com/[email protected]/msg00838.html . The example code would almost work; you only need to set the entry in the queue to both write and read (you want to write to and read from the filter). The other thing that will have to be done is to make the conference mixer able to mix stereo, which should be fairly easy.
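The per-channel filtering described above boils down to convolving each participant's mono signal with the left and right head-related impulse responses (HRIRs), then summing the resulting stereo signals sample by sample. A float sketch under those assumptions (direct-form FIR convolution; names and HRIR values are invented for illustration):

```cpp
#include <cstddef>
#include <vector>

// Sketch: FIR-filter a mono signal with left/right impulse responses
// (HRIRs) to produce interleaved stereo. Direct-form convolution,
// output truncated to the input length.
std::vector<float> spatialize(const std::vector<float>& mono,
                              const std::vector<float>& hrir_l,
                              const std::vector<float>& hrir_r) {
    std::vector<float> out(mono.size() * 2, 0.0f);
    for (size_t n = 0; n < mono.size(); ++n) {
        float l = 0.0f, r = 0.0f;
        for (size_t k = 0; k < hrir_l.size() && k <= n; ++k)
            l += hrir_l[k] * mono[n - k];
        for (size_t k = 0; k < hrir_r.size() && k <= n; ++k)
            r += hrir_r[k] * mono[n - k];
        out[2 * n]     = l; // left channel
        out[2 * n + 1] = r; // right channel
    }
    return out;
}

// Mixing the filtered participants is then plain per-sample addition,
// independent of each participant's filter:
void mixInto(std::vector<float>& mix, const std::vector<float>& src) {
    for (size_t i = 0; i < mix.size() && i < src.size(); ++i)
        mix[i] += src[i];
}
```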

So, I hope this gets you started. Feel free to ask any further questions.

Stefan


Any help will be much appreciated :-)

Best regards
Roman Kempa


------------------------------------------------------------------------

_______________________________________________
Semsdev mailing list
[email protected]
http://lists.iptel.org/mailman/listinfo/semsdev

--
Stefan Sayer
VoIP Services

[email protected]
www.iptego.com

IPTEGO GmbH
Am Borsigturm 40
13507 Berlin
Germany

Amtsgericht Charlottenburg, HRB 101010
Geschaeftsfuehrer: Alexander Hoffmann
