On Mon, 1 Dec 2003 10:55:20 -0000, Richard Dobson <[EMAIL PROTECTED]> wrote:
Having one user assume the role of server, and one of client, is really no
harder than a model in which you assume both are equal peers. It's simply a
matter of different roles. If you can think of any reason why this is not
true, please share it with the rest of us!
I don't dispute that it is no harder (for one-to-one), simply that using a
client/server model when a p2p model is more appropriate IMO can create more
problems than it solves.
Then please point out those problems for me.
I doubt you can think of any for person to person. Not at all compared to a p2p solution. And *just* implementing that is enough to participate in conferences with more than 2 persons. You get this *for free*, so to speak.
Whether you'd choose to make an extension on this for conferencing over direct links is up to you. I won't stop you. I'd encourage you if I could :)
However, using a client/server model will allow you to participate in a
conference on a server with more people *with no extra effort at all*. Yet
you still state you don't believe it will be easier?
Yes, it is easier to implement because you don't need extra p2p, but IMO it's not
really that much more to implement it, as you will already have a large
amount of the necessary code in place once you have created a client with
inbuilt server.
So why not implement a c/s based solution for person to person and server conferencing (which will take about the same effort as implementing a p2p based solution for person to person), and then implement a direct link based conferencing solution, where each node is a server as defined in the c/s spec (which will take about the same amount of effort as doing it based on a p2p spec)?
Unless for some reason, you think the c/s spec would bring up issues, which you seemed to imply a bit back in this email.
What I *am* saying is that an entirely p2p based conferencing model (with
more than 2 persons involved) is a lot more complex than a client/server
model. Even more so, if you only have to implement the client portion.
That's why this allows "thin" clients to still participate. It was you
yourself who argued against mixing and bandwidth req. on thin clients such
as a pocket PC.
Yes if you only implement the client portion
You actually make a good point here. Implementing client portion + server portion (that's just suitable for talking to one person) takes about the same effort as implementing a p2p solution. But I suppose you could go for the solution of only implementing the client portion in extreme cases where resources are very limited :)
it will be a lot more work to
add server or p2p, but if everyone does that (to save time and effort) your
proposed system will fall apart because there will be no servers for people
to connect to.
If I work on client X, of course I'll implement the server portion (for single person to person chat at the least). Else client X can't talk with client X! That'd be kind of dumb. But I can imagine that if I have an assignment for a company to build a client that's capable of conferencing on the company servers (so they can log, etc.) I could drop the server part.
More advanced clients are likely to also implement a server that supports hosting a conference with more than 2 people. Or they'll implement a direct link conferencing extension (still based on the same protocol of course). Those two are complementary, not competitive. But as pointed out by you, direct link person to person is definitely needed most, and as pointed out by others, server based conferencing is needed most too.
That doesn't mean there are no use cases left for direct link based conferencing, but IMHO not enough to justify a spec that will miss out on server based conferencing when you can get that practically for free, and will complicate the spec and raise the requirements for conferencing. Again, it's not impossible.
As Mats Bengtsson suggests I think you should take a look at this http://www.skype.com/skype_p2pexplained.html their solution looks rather good (although goes further than I have been suggesting)
Skype uses UDP NAT traversal based on getting its IP from someone outside the NAT (at least, so it was suggested either here or on SJIG), which is currently being rejected by the Jabber server folks, and if that doesn't work it uses proxies on a p2p network. Those peers on the network basically act as a proxy. So I don't quite see the relationship with your proposal.
This sort of functionality can be used with SI. For example, you could make an SI bytestream over a JXTA network. With just a bit of cheating you could probably even use the Skype network itself with SI.
Maybe what we really need to do, rather than concocting our own solution, is defer to the even greater experience of someone else and just try to integrate with an existing mechanism, just like we did with SOCKS5 for the bytestreams mechanism.
SOCKS5 is hardly integrated with an existing mechanism, it just uses part of the same spec. Using SI you can integrate other solutions, almost transparently, and fall back on others if they don't work. That doesn't eliminate the need for a spec for setting up these things, and I see no good reason not to use a c/s architecture there.
I think from the discussion it's pretty obvious what's needed/wanted most
are 2 things:
- person to person over a direct link
- conferencing with multiple persons on a server
As you realise I don't think you need to use a server to talk with a small group of people.
You're turning a blind eye to the issues with p2p then. Other people have pointed them out, and I have. I'm not ruling out direct links conferencing at all, but after direct link person to person second most needed is server based conferencing. Both as a fall back for direct link person to person, and because in many cases (not ALL, I'm not suggesting that) that's the only *quality* way of having a conference with multiple persons. So why throw this away if we can get it, almost for free?
Again, this does not rule out what you want at all.
Both can be handled, without overlap, with a simple JEP based on a c/s model. P2P won't cover this, nor will it be any simpler.
Sorry but it can handle it as I have clearly shown,
What you're talking about is simply a *different* problem. It's a solution, and a good one, but for a *different* problem. It can't handle it, and it doesn't cover it.
it won't be any simpler
but IMO it's not much harder if you already have client/server code in place,
and is far more reliable.
Well exactly, if you have a c/s spec with c/s code in place, you can use that to implement your solution. You won't need anything p2p, it's about the direct links.
Conferencing over individual direct links between persons is interesting too, but too complex to be included in the basic JEP if you ask me.
I don't think it's really all that much harder as you know.
Well, with a c/s spec, clients (and servers for person to person) have it very easy. Bandwidth reqs are low, CPU reqs are low, and you can talk to as many persons at once as you want. Of course, the reqs for the server are higher (when more than 2 persons are involved). But as I pointed out, not THAT much higher than for a node in a direct link conference. In many use cases there WILL be more advanced implementations, on more advanced platforms with more resources, that can support being a server, and many clients that couldn't be a server. But it's no issue for those clients, since they only have to be a client. In many, MANY cases p2p/direct link style conferencing isn't an alternative. You too have pointed to the dialups and the Pocket PCs etc., I'm sure.
Conferencing over direct links doesn't have to be p2p either. You can base
it on the c/s JEP with every individual participant acting as a server.
Not that much more complex than doing this on a p2p based model.
But that is p2p is it not?
Any node (JID) in the network can be a server. This is a role in the protocol. By having this role, you can support both direct link person to person conversations and on-the-server conferences. That's my point. If instead the protocol uses the role of two equal "peers", this is disruptive.
[cut out some stuff where we pretty much agree I think]
So let's apply this to some real world situations. In how many cases do
all the clients have about the same available bandwidth, CPU, etc.? With Joe
Consumer this is unlikely.. it's a mix of dialup and broadband users. If
I'd want to talk to my mother, sister and brother at the same time, I have
a 1 mbit link, 1 will have a cheap DSL account, and the other 2 will be on
dialup most likely.
I can see on dialup this is a problem, but as I detail below it can be
complex determining the correct machine to run the server from (bandwidth
available, CPU speed etc), this really needs to be automatic or we will make
it that much harder for normal users to use they might well not bother and
continue using MSN etc instead, we must make sure we offer something that is
at least as easy as MSN Messenger and the like to use, so whichever way we
go, be it client server or p2p or both all that needs to be hidden from the
user, and all they should need to do is select the people they wish to chat
to and click "chat".
Does MSN even *do* conferencing with more than one person? (I don't know)
I think in most cases users will know who has the fastest connection, but I can imagine you'd prefer an automatic solution for this. That would be rather neat. Of course when you host all this on a server component the choice is clear.
Again I don't think direct-link style conferencing is uninteresting or
unneeded, but it's a much more specific application than c/s conferencing.
And *again*, a c/s style approach will not prevent this from being an
extension.
Good, but once we have a client server system in clients we will have 90% of
the code needed to implement it, it would be a mistake IMO and could prove
to create a messy protocol if we don't consider how to include p2p function
into the protocol we create from day one, otherwise when we extend it
later it could end up either messy or we will end up duplicating lots of
effort.
Agreed, when creating such a spec based on c/s, attention should be paid to allowing a direct-link conference style solution from the start. For that matter, it should also allow for things such as distributed hosting of a conference (a sort of hybrid between direct links and c/s) or any other things people can come up with. It should just be as generic as possible.
And how's that? When 4 people talk at once, *all* clients will have to mix
4 streams in the case of direct links. In the case of c/s only the server
will have to mix 4 streams. Explain..
Yes but the server has to do more than simply mix the streams, it also has
to re-encode the mixed streams. Also, if you want to remove echoes as you
suggest below, or be able to ignore participants (which someone has already
suggested as useful functionality), you need to re-mix and re-encode all
outgoing streams individually, which would I expect be quite a CPU drain.
But in p2p mode, if clients use available technologies (DirectX or the
equivalent), you don't even need to mix the streams as you can play
simultaneous WAVE streams at the same time; also the client doesn't need to
re-encode the stream to send out again.
Well, I agree that, just like with the bandwidth requirements, demands on the server will be higher than on a node in a direct link conference. Just not THAT much higher, unless you want some more advanced features. There are always trade-offs between the two solutions, and at times you could prefer yours over the other. But the point I'm making is that we can have *all* of them, relatively simply, with a c/s based architecture, even if a p2p spec might be just a *little* easier to work with in your case, or at least sound more logical when reading the spec.
Of course you still have to mix when you use DirectX ;) Servers can use existing technology too of course.. Servers (components) specializing in hosting this kind of thing for companies or paying customers could even use DSP hardware and such.
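To put rough numbers on the stream counts being argued about here, a minimal sketch follows. It assumes a full mesh for the direct link case (every node sends to and receives from every other node) and, for the c/s case, a dedicated host that is not itself a participant and returns one mixed stream per client. The function names and the 13 kbit/s codec figure are made up for illustration; nothing here comes from a JEP.

```python
def mesh_node_load(participants):
    """Streams one node handles in a full-mesh (direct link) conference."""
    others = participants - 1
    return {"sent": others, "received": others, "mixed": others}


def cs_client_load():
    """Streams one client handles when a server does the mixing."""
    return {"sent": 1, "received": 1, "mixed": 0}


def cs_server_load(participants):
    """Streams a dedicated (non-participating) server handles."""
    return {"sent": participants, "received": participants, "mixed": participants}


def upstream_kbps(load, codec_kbps):
    """Upstream bandwidth needed, ignoring packet overhead (an assumption)."""
    return load["sent"] * codec_kbps


if __name__ == "__main__":
    n = 6
    codec = 13  # illustrative codec bitrate in kbit/s, not a recommendation
    print("mesh node upstream:", upstream_kbps(mesh_node_load(n), codec))
    print("c/s client upstream:", upstream_kbps(cs_client_load(), codec))
    print("c/s server upstream:", upstream_kbps(cs_server_load(n), codec))
```

The per-client numbers stay flat as the conference grows, while the mesh node's grow with every participant. The server's grow too, which is the trade-off under discussion.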
(only thing I could think of is if you want to create a separate mix for
each client, without their own channel in it to prevent echo. Rather than
mixing new streams for each client you should just suppress echo for each
client. Admittedly, it increases demands on the server if you want this,
but not as bad as having to mix a new stream for each client)
Not sure how you would suppress the echo of what someone said without re-coding the streams individually to exclude that person on their own incoming listening stream.
Well, aside from that you can suppress it client side... (which would raise the requirements for our poor Pocket PC clients a little too much) I'm not an expert on audio technology but I'd imagine there are some heavy optimizations possible when making different mixes based on the same streams? I could be wrong of course..
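One optimization of that kind can be sketched as follows, assuming the streams are already decoded to equal-length lists of linear PCM samples: sum all streams once, then subtract each participant's own samples to get a mix without their own voice. That is roughly 2N operations per sample rather than roughly N squared for N separate mixes. It does nothing about the cost of re-encoding each outgoing mix, and it ignores clipping. The name `mix_minus` is hypothetical.

```python
def mix_minus(streams):
    """Return, per participant, the sum of everyone else's samples.

    streams maps a participant name to a list of integer PCM samples.
    All lists are assumed to have the same length.
    """
    if not streams:
        return {}
    length = len(next(iter(streams.values())))
    # One full mix, computed once for the whole conference.
    total = [sum(samples[i] for samples in streams.values()) for i in range(length)]
    # Each participant's mix is the full mix minus their own contribution.
    return {
        name: [total[i] - samples[i] for i in range(length)]
        for name, samples in streams.items()
    }


if __name__ == "__main__":
    decoded = {"a": [1, 2], "b": [10, 20], "c": [100, 200]}
    for name, mix in mix_minus(decoded).items():
        print(name, mix)
```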
Yes, when the server quits the conference the others will get booted. If
this is a big issue for you, you could devise a fallback system to another
server (one of the clients for example) and still have a massively less
complex system than direct-link based conferencing. Since servers are most
likely to be the best machines with the best connections this isn't such a
big problem, but it's still easily solved if you want.
Good this would have to be if I were to support this, problem is tho, adding
in this sort of thing brings us even closer to the requirements of just
using a p2p system,
Switching to a fallback server is *definitely* something different from using your direct links system. Again, c/s and direct links based conferencing are two different solutions to two different problems, except perhaps in the most general sense.
People on the list made it very clear direct-link style conferencing with multiple persons will not fill their most basic needs. If your only problem would be worrying about whether the host dies, I'd recommend you go with the solution I proposed rather than go direct link style. But I doubt that's your only problem :)
also would have to make it easy to start chats for
normal users so the system needs to automatically determine which machine in
the group is best suited to be the server and set it up as it without the
user needing to do that themselves. There is also a problem with falling
back in this situation in that what if there is not a machine with enough
bandwidth etc left to maintain the chat?
Of course this is a problem. If it won't work it won't work. If your solution *would* work in that case, well that's why I think it would be great to have. However, don't overestimate how often this will be the case. But it's definitely so on XBox Live.. which is still a brilliant example :)
It will go down, which it shouldn't in p2p because all nodes will require the same amount of bandwidth to maintain it and it should keep going.
When there are a few clients with bad connections in the conversation reliability will probably improve a bit too. Bad connection <-> good connection <-> bad connection is generally more reliable than bad connection <-> bad connection. Esp. when you consider bandwidth usage drops too.
Yup but there is no real way without user intervention to make sure the server is on a reliable connection, but we need to make it as easy as possible otherwise normal people would not know what to do.
You could automate this.. (and use a remote control protocol to set everything up transparently) but I don't think user intervention is *necessarily* a bad thing here. Even the most oblivious of users know broadband is better than dialup..
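A minimal sketch of what such automation might look like, including the fallback to another host mentioned earlier. It assumes every participant somehow advertises an upstream bandwidth figure; how that figure is measured or exchanged is not addressed, and the JIDs, numbers and function names are all hypothetical.

```python
def rank_hosts(upstream_kbps):
    """Order candidate JIDs from best to worst advertised upstream.

    Ties are broken by JID so every participant computes the same order.
    """
    return sorted(upstream_kbps, key=lambda jid: (-upstream_kbps[jid], jid))


def pick_host(upstream_kbps, unavailable=()):
    """Pick the best candidate that has not dropped out, or None."""
    for jid in rank_hosts(upstream_kbps):
        if jid not in unavailable:
            return jid
    return None


if __name__ == "__main__":
    family = {
        "me@example.org": 1000,      # the 1 mbit link
        "sister@example.org": 256,   # cheap DSL
        "mother@example.org": 56,    # dialup
        "brother@example.org": 56,   # dialup
    }
    print(pick_host(family))
    # If the first host quits, everyone falls back to the next best one.
    print(pick_host(family, unavailable={"me@example.org"}))
```

Because the ranking is deterministic, every client arrives at the same fallback host without extra negotiation, as long as they all hold the same figures.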
Latency is an interesting case, but in practice the results would probably
surprise you. Because on low-bandwidth nodes the bandwidth requirements
drop dramatically when they act as a client rather than a node in the
direct link conference, latency will actually improve in a
lot of cases!
That's good but do you have any real evidence of this?
I assume you have no problems with the idea that latency is lower on low-bandwidth connections when the bandwidth used is lower too? If not.. just play an online game, then exit it, turn on some filesharing network, and play the game again ;) That's just simple maths!
Even on my old "broadband" connection, where I had 15 KB/s upstream available, latency would jump from about 25-40ms to 50-400ms if I used only 10KB/s of it for different purposes.
Gaming provides another example.. in the old days when I played Quake, it'd be a lot faster to play on my ISP's server with someone than for either of us to host the server (latency would be higher and less reliable there). Experience in using the old ICQ protocol gave me the same idea, even though the amounts of data are *very* limited there.
If latency is your main point for choosing direct link conferencing, I'd be very careful if I were you, because the result might disappoint in many cases.
So you can have the situation where a node in a direct-link
conference with 3 persons talking is barely able to keep up, with horrible
latency. While a client with the exact same quality connection is enjoying
a conference where 6 people are talking with lower latency! (it wouldn't
even be able to participate when 6 people are talking in a direct link
conference).
You would have to have very low bandwidth to not be able to talk to those 6
people tho in p2p, but yea that could be a problem, but one of the people
still needs to be on a good connection.
If you dedicate all your bandwidth to voice chat, use low quality codecs, and have at least a 56k6 on a decent ISP, then you can probably talk to more than just a few. But that's hardly always the case. And still, the more streams are active, the higher latency will get, and the lower reliability in some cases.
Now lets talk about out-of-sync mixing. With direct-link based conferences
every client will produce a different "mix" based on the latency /
bandwidth of their connections, and that of the other nodes. This means
when we're in a meeting, for me it can sound like 3 people were talking at
once, while for you it can sound like they didn't at all. (that means I
didn't hear what they said and I'll ask them to repeat, while you'll be
annoyed with me (even more ;) cause for you it sounded like I could have
heard perfectly).
Sure that could be a problem, but it's a problem people will be used to if they have ever made long distance phone calls, this sort of thing is the least of our worries IMO.
This problem doesn't occur when you make long distance phonecalls..??? How could it? It doesn't even happen in a long distance *conference* call!
With a serverside solution *everyone* will receive the same audio stream (with perhaps only their own stream omitted). With direct links every client makes their own "mix".
Let's pretend you and I are in a conversation with persons A, B and C. We're using direct links for conferencing. Person A asks a question. His stream is broadcast individually to all nodes. Person B then starts to answer, and so does person C. When person B notices person C also wants to answer (they both have a fast connection so little latency) person B shuts up, and C answers. I however am on a bad link. I receive the question from A, my link with person B just went bad a little, so him starting to answer didn't make it to me yet, but already I can hear C start to give his answer. Then the link with B clears up, and in the middle of what C is saying (way after he noticed B was gonna let him do the talking) I suddenly hear B start to answer, and then stop. So I ask if C can repeat himself. But your link with B and C is just fine, you didn't hear B talk through C when he was answering at all. So you ask yourself whether I was sleeping during the meeting or something.
The more diverse your different types of connection are (unlike with XBox Live where they are all pretty much the same) the more of a problem this will be. Esp. if you use TCP sockets over an unreliable connection. This does not happen *at all* with server based conferencing.
Of course there is a solution for this, syncing the mixes between nodes.
But then you lose all latency advantages, you'll be as slow as the
"weakest link". (and the weakest link will be a lot more stressed than it
would be in a c/s model). Of course compromises are possible..
Sure
That doesn't mean doing in-sync mixing in a direct links conference isn't still a bitch to pull off.. how do you determine what delay the faster nodes should add? You'll need control channels at least, and of course you don't want *those* to depend on a c/s architecture either. Good luck with that ;)
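The arithmetic of the delay itself is the easy part. Here is a sketch, assuming the one-way latency of every link is somehow already known to everyone (which is exactly the control channel problem): each receiver holds back each incoming stream until it is as late as the slowest link in the conference. The function name and the latency figures are invented for illustration.

```python
def playout_delays(link_latency_ms):
    """Extra delay to add per link so all streams play out in sync.

    link_latency_ms maps (sender, receiver) to a one-way latency in ms.
    Every stream is padded up to the latency of the slowest link.
    """
    if not link_latency_ms:
        return {}
    slowest = max(link_latency_ms.values())
    return {link: slowest - ms for link, ms in link_latency_ms.items()}


if __name__ == "__main__":
    links = {
        ("B", "me"): 300,   # my bad link with B
        ("C", "me"): 40,
        ("B", "you"): 30,
        ("C", "you"): 35,
    }
    for link, extra in playout_delays(links).items():
        print(link, "add", extra, "ms")
```

In this made-up case your 30 ms link to B gets 270 ms of padding because of my one bad link, which is the "as slow as the weakest link" effect.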
But p2p chats should not need
a server IMO because they are short lived sessions for which you will have
already located the other members of the chat via another means (your Jabber
session). Please bear in mind that client server systems are not always the
best solution, just think if the file sharing systems all went through
central servers the bandwidth use would be unsustainable for the server
admins.
That's not anything like what I am proposing. To start with, practically all person to person communication would be over direct links. Secondly, conferences would not be held on some gigantic server, rather there will be small clusters spread all over the place.
As you might know many p2p networks have made this same change, relying more on the stronger, better clients and letting them take some roles that traditionally were meant for servers: peer caches, supernodes etc. At first this was just with control info, but Skype is the next step, using "peers" as proxy servers for data. (One could argue Skype is not the first one to do it though, there's Freenet for example)
I think it'd be great if we could take the same route with Jabber (I already named an SI/JXTA based solution as an example), but without ruling out the reliable and needed c/s model either. And I think I pointed out fairly decently how we could.
Although there is the fact that current audio chat systems are mostly p2p,
e.g. XBox Live, MSN Messenger, AIM, Yahoo Messenger, H.323, SIP. We need to
be careful not to dismiss all that research development and reasoning that
went into the decision for these people to go p2p.
With the exception of XBox Live perhaps, I wouldn't want to rely on any of them for conferencing with more than one person.
SIP and H.323 *can* depend on direct links for conferencing (as far as I know), but that doesn't mean they have to, or even do so in a lot of cases (esp. SIP, which is often used just for replacing CSD channels!). If you're under the impression that SIP and H.323 are never used in conjunction with a "classic" phone conference you'd be very wrong I'm afraid. (as far as AIM, Yahoo, MSN go, I didn't even know they support conferencing, let alone how or what architecture they use for it on the protocol level)
Solutions like Net2Phone definitely connect to some server implementation, even for non-conferencing.
Maybe what we actually need to solve the low bandwidth problem of dial up
users and the reliability problem of having a single point of failure is to
have a hybrid client server and p2p system where the people with sufficient
bandwidth run as both servers and p2p between each other (like the idea of a
supernode) and the low bandwidth users connect to one of those servers, it
solves the low bandwidth user problem and the reliability problem by having
multiple servers users can switch to if one goes down, and also the CPU
usage problem by not having too many people all connected to one server.
In a previous email I already briefly touched on the subject, and some in this email. I definitely think most of this could be handled in the SI layer though (with a little cheating), a c/s based spec will not rule this out at all.
_______________________________________________
jdev mailing list
[EMAIL PROTECTED]
http://mailman.jabber.org/listinfo/jdev
