I didn't intend to say that we should not have server-based conferencing at
all. All I originally meant to object to was putting that into a client
rather than using a dedicated server, and that in cases where you can,
you should use p2p. Now I can see that p2p won't cover all cases, but I am
also unconvinced that server-based will cover all uses either, so what I
suggested at the bottom of the email is a hybrid c/s-p2p mechanism that can
take the best of both worlds, cover more situations than either one of them
alone, and be more flexible.
I would suggest still using the c/s model as the basis for the spec, since it can serve as the basis for person-to-person over a direct link, person-to-person via a server, conferencing over direct links (with each "peer" acting as a server, as described in an additional spec) and, of course, conferencing on a server.
However, it should also allow for an additional spec where there is more than one server hosting the conference.
There are different approaches you can take for this: you could create a persistent network of "public" "supernodes", but this of course brings its own security issues (since you can't encrypt if there has to be mixing on the supernodes), or you could create a network of "supernodes" for each conversation.
Ideally a client that only implements the "base" c/s spec should be able to work transparently with such an "upgraded" network. For example, I can imagine the "clients" will have to disco the "server" for which server address to connect their audio stream to (whether this is a JID or an IP I wouldn't know yet). The server could also provide them with fallback servers (which could of course be updated during the conference as well, as new "clients" connect that are able/willing to take the role of such a "supernode"). This distributes bandwidth and CPU load amongst servers. The spec would mostly focus on how "supernodes" regulate the network amongst themselves (including things like in-sync mixing).
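A minimal sketch of the fallback behaviour described above, assuming the server simply hands the client a primary stream host plus an ordered fallback list; all names here (pick_stream_host, try_connect, the node addresses) are invented for illustration, not part of any existing spec:

```python
def pick_stream_host(primary, fallbacks, try_connect):
    """Walk the primary host and then the fallback list in order,
    returning the first host that accepts a connection (or None)."""
    for host in [primary, *fallbacks]:
        if try_connect(host):
            return host
    return None

# Example: the primary supernode is unreachable, so the client
# transparently falls back to the next one on the list.
reachable = {"node2.example.org"}
chosen = pick_stream_host(
    "node1.example.org",
    ["node2.example.org", "node3.example.org"],
    lambda host: host in reachable,
)
```

The point of the sketch is only that a "base"-spec client needs no extra logic beyond trying addresses in the order the server gives them.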
However, I think there are still a lot of cases where at least I myself would just prefer to host everything on my own connection (if that's the fastest one available). It's also important that person-to-person conversations over direct links will work. This still leaves the problem of NAT traversal, which is really what Skype is all about.
Skype uses UDP NAT traversal based on getting its IP from someone outside
the NAT (at least, so it was suggested either here or on SJIG), which is
currently being rejected by the Jabber server folks.
I was under the impression it was asking your server for the IP address
that they are objecting to, not other methods of obtaining it.
What I heard suggested was UPnP and SOCKS settings. I'm just pointing out the difference between that approach (relying on the client to figure it out) and Skype's (whatever way will work, use it; and the Jabber server telling you will cover a lot of cases to start with).
SOCKS5 is hardly integrated with an existing mechanism.
Well, it is using an existing protocol rather than baking our own.
As I understand it, it is basically just used for the connection initiation part, so it can pass some metadata.
I was suggesting that maybe we could do the same in this case and see if
there is an existing protocol/system that we could integrate with, rather
than possibly duplicating effort, saving us time.
I already named JXTA a few times. Though I have to admit it's probably been years since I looked at it, it will probably support opening binary streams over its decentralized network (whether it supports UDP I don't know).
But I think all existing solutions will depend on running a decentralized peer network independent of Jabber. Whether this is something we want I don't know :)
Well, I agree that, just like with the bandwidth requirements, the demands
on the server will be higher than on a node in a direct-link conference.
Just not THAT much higher.
Sorry, but it is much higher, because in p2p clients do not need to do any
re-encoding at all, plus the fact that you need to individually mix the
streams before re-encoding them to send out. This IS much more of a
requirement, since using p2p you don't necessarily even have to do the
mixing step, let alone the re-encoding step (which will be the most
CPU-intensive part). Plus, remember that to prevent echoing you really need
to mix/encode each of the outgoing streams individually.
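That per-recipient mixing (an "N-1 mix": everyone hears everybody but themselves) can be sketched roughly like this, assuming raw integer samples and ignoring re-encoding, clipping and timing entirely; the function name and sample values are invented for illustration:

```python
def mix_outgoing(streams):
    """Build one outgoing frame per participant containing every *other*
    participant's audio, so nobody hears their own voice echoed back.
    streams maps participant -> list of raw samples for one frame."""
    # Sum all frames once, then subtract each recipient's own frame,
    # instead of re-summing N-1 streams for each of N recipients.
    total = [sum(samples) for samples in zip(*streams.values())]
    return {
        who: [t - s for t, s in zip(total, frame)]
        for who, frame in streams.items()
    }

mixes = mix_outgoing({"alice": [1, 2], "bob": [3, 4], "carol": [5, 6]})
```

The mixing itself is cheap; the cost argument in the thread is that each of these per-recipient mixes must then be encoded separately before being sent out.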
Unless you use client-side echo removal (which of course will put some extra burden on the client, which is indeed what I'm trying to avoid all along, but it's still a compromise). But agreed, this is one of the main advantages of using a direct-link based conference. Again, you'll benefit the most (and the disadvantages will be the least obvious) if all clients have somewhat equal specs, while in most cases I doubt this is true for Joe Consumer.
Of course you still have to mix when you use DirectX ;)
Playing two (or more) streams simultaneously is not mixing as far as I am concerned, so no, you don't necessarily have to mix at all.
Of course the streams are mixed, just by DirectX/ALSA/whatever instead of by you. If your sound card has some DirectX-compatible hardware for this, the sound card can do the mixing. (Which is still true in many cases if you use a server that uses DirectX for mixing too.)
Servers can use existing technology too, of course. Servers (components) specializing in hosting this kind of thing for companies or paying customers could even use DSP hardware and such.
Sure, but using hardware such as that will be out of reach for the vast majority of people.
The vast majority of people won't need to mix several thousand streams at once either ;)
I doubt it would be very clean, and it would be pretty hard to do; the only
way I can see it really working is by individually encoding each outgoing
stream on the server.
Then neither of us knows ;) But most implementations probably won't be so advanced, if it's even possible (and you made a good point about re-encoding, which I seriously doubt you can optimize much).
OK, but isn't that really an issue of you consuming so much of your outgoing
bandwidth that the TCP replies have a hard time getting back, which slows
down that particular TCP socket? I would expect that we would use a UDP
transport like RTP, which wouldn't have this problem, as there are no
replies to return in UDP; the packets will just get dropped if there is a
problem, and the audio transport should be able to handle this
transparently and easily.
No, packets will just be sent out more slowly, which means latency will increase.
If you want to send 100 bytes and you have 20 bytes/sec available, it will take 5 seconds. If you have 10 bytes/sec, it will take 10 seconds, which means your latency just increased by 5 seconds, whether you use UDP or TCP. UDP is more effective for audio (especially if you're willing to tolerate lost packets), so you can build a more effective stream control mechanism (one is already built in for TCP). Especially in the case of IP packet loss on the connection this will indeed help decrease latency a lot, but it doesn't change the fact that as available bandwidth shrinks, latency will jump... a lot!
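The arithmetic behind that claim, as a trivial transport-agnostic sketch (ignoring per-packet overhead; the numbers are the ones from the example above):

```python
def send_time(payload_bytes, bandwidth_bytes_per_sec):
    """Seconds needed to push a payload through the link. This floor is
    set by available bandwidth, whether the transport is UDP or TCP."""
    return payload_bytes / bandwidth_bytes_per_sec

fast = send_time(100, 20)    # 100 bytes at 20 bytes/sec
slow = send_time(100, 10)    # same payload at half the bandwidth
added_latency = slow - fast  # the extra delay from the shrunken link
```

Dropping lost packets (UDP) avoids retransmission delay, but it cannot change this division.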
This problem doesn't occur when you make long-distance phone calls...??? How
could it? It doesn't even happen in a long-distance *conference* call!
Because you will hear your own echo from the other end; that is what I
thought you meant. But overall, I doubt syncing is something we really need
to concentrate on or worry about too much.
Out-of-sync mixing is *the* biggest annoyance of direct-link based conferencing if you ask me, especially when participants have severely different connections. I find it unbearable to work with.
If you think supernodes are good then you must like my compromise solution
below, since in the supernode model each supernode communicates p2p with
the other supernodes, and clients talk to the supernodes.
True, and I think the idea is interesting. But I still think that in most cases one "supernode" will do (the host), and in many cases it won't be a problem to find one (broadband with a somewhat recent budget CPU will do, I think).
In a previous email I already briefly touched on the subject, and some more
in this email. I definitely think most of this could be handled in the SI
layer though (with a little cheating); a c/s based spec will not rule this
out at all.
I'm not sure how my solution above can be handled in the SI layer,
I wasn't exactly clear: I was talking about bandwidth here (where you could use some kind of peer network you connect to via SI that acts as a proxy/multicaster for your stream in a direct-link conference). Since then, you've pointed out that CPU requirements might be a bigger issue than I assumed at first.
it's a modified version of your model where, instead of having a single
server, you could have as many as possible which communicate with each
other via p2p, but anyone who is incapable, due to platform issues,
bandwidth, CPU, firewall etc., of acting as a server itself connects to one
of these servers. This provides the benefit of more evenly load-balancing
the CPU/bandwidth use of the chat over several nodes rather than
concentrating it on one, provides instant fallback to another server if one
has problems or leaves, and provides your primary benefit of being able to
support dial-up users, simple clients like Pocket PCs, or people who can't
go p2p because of firewalls.
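A toy sketch of the load-balancing and instant-fallback behaviour described above, assuming each supernode merely reports a client count; the function names, thresholds and node names are all invented for illustration:

```python
def assign_client(loads):
    """Pick the least-loaded supernode; loads maps node -> client count."""
    return min(loads, key=loads.get)

def reassign_on_failure(loads, failed):
    """Instant fallback: drop the failed supernode and pick the next
    least-loaded one for its clients."""
    remaining = {node: n for node, n in loads.items() if node != failed}
    return assign_client(remaining)

loads = {"node-a": 12, "node-b": 3, "node-c": 7}
first = assign_client(loads)                 # least loaded node
backup = reassign_on_failure(loads, first)   # where its clients go next
```

Real placement would presumably also weigh bandwidth, CPU and network distance, but the shape of the mechanism is the same.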
I think my idea as described near the beginning of this email and this one are pretty much alike. Are you suggesting a persistent network of these servers (which is p2p, I suppose) or a per-conference network (which I'd rather just call clustering of the servers)?
Do you feel such a system should be part of the "base" spec for (audio) conferencing, or an extension? And what do you feel should be done about NAT traversal in person-to-person calls? In case you suggest a persistent network of these nodes, is that what should be used for that?
_______________________________________________
jdev mailing list
[EMAIL PROTECTED]
http://mailman.jabber.org/listinfo/jdev
