Let's assume an AI can hear you curse. Or laugh. Or hit the table. I got the architecture figured out:
User's mic -> Web Audio API -> OpusMediaEncoder -> WebSocket -> Buffer -> OpusDecoder -> Analysis -> Image -> WebSocket -> JavaScript Eval -> User

I spent all day making the Opus decoder stream-ready. There is just one little bug somewhere in the Buffer area. Come on. Come on!

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T544a263bb189f7c2-Mc5b9ffdabae0fcd026e9d3fb
Delivery options: https://agi.topicbox.com/groups/agi/subscription
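For the Buffer stage between the WebSocket and the decoder, here is a minimal sketch of one common way to do it. It is not the code from this post. It assumes the server side is Python and that each encoded packet is sent with a 2-byte big-endian length prefix; the post specifies neither. The names PacketBuffer and frame are hypothetical. The point it illustrates is that a stream buffer has to hold on to a partial packet until the rest arrives, and must drop only the bytes it has actually consumed, which is where off-by-one bugs tend to hide.

```python
import struct


class PacketBuffer:
    """Reassembles length-prefixed packets from arbitrarily split chunks.

    Assumed (hypothetical) wire format: 2-byte big-endian length, then payload.
    """

    HEADER = struct.Struct(">H")

    def __init__(self):
        self._data = bytearray()

    def feed(self, chunk):
        """Append a chunk and return every packet that is now complete."""
        self._data.extend(chunk)
        packets = []
        offset = 0
        while len(self._data) - offset >= self.HEADER.size:
            (length,) = self.HEADER.unpack_from(self._data, offset)
            end = offset + self.HEADER.size + length
            if end > len(self._data):
                break  # payload not fully received yet; keep the partial bytes
            packets.append(bytes(self._data[offset + self.HEADER.size:end]))
            offset = end
        del self._data[:offset]  # drop only the bytes that were consumed
        return packets


def frame(payload):
    """Hypothetical sender-side helper: prefix a payload with its length."""
    return PacketBuffer.HEADER.pack(len(payload)) + payload
```

Each packet returned by feed() would then be handed to the decoder one at a time. A chunk boundary can fall anywhere, including in the middle of the length prefix, and the buffer simply waits for more data in that case.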
