On Thu, 2011-05-26 at 12:33 +0300, Jussi Pakkanen wrote:
> One of the issues we have been examining is the question of gesture
> combinatorics. The basic problem is that with, say, 5 touches you get
> the global gesture, 10 two finger gestures, 10 three finger gestures and
> 5 four finger gestures. The problem is which one of these should be
> evaluated and provided to the client.

I'm not clear on what "the global gesture" is here.

It's true that in theory you get a combinatorial explosion on the number
of touches. The reality is that applications only subscribe to a subset
of gestures and the number of touch combinations involved is quite
tractable. For example, Unity is only interested in 3- and 4-touch
gestures. With 5 possible touches, that's 15 possible gestures. 15 is
hardly an unwieldy number.

> The preliminary plan is that each application is always provided with
> the global gesture and all two touch pairs. Or maybe all pairs, I don't
> know these details. Could someone please provide the current facts on
> this? The client then filters the incoming events and only takes the
> ones he cares about.

No, the preliminary plan is that applications are provided with every
possible subscribed and recognized gesture and its component touches.
If an application asks for a particular gesture, it needs to get the
data. No second guessing what the developer's intent was.

There is no concept of "the global gesture" that I am aware of, unless
you are referring to simply the current touch frame.

> The advantages of this scheme is that the client does not need to
> communicate with the server producing zero round trips.

The _point_ of that scheme is that round trips are not required.

> The disadvantage is that this produces a combinatorial explosion. If the
> amount of touches is low this is still manageable. However suppose there
> are 15 touches (Apple Magic Trackpad supports up to 32 I think), this
> means 105 two pair touches (and 455 three pair touches and 1365 four
> pair touches, but let's ignore those). The hardware produces
> measurements every 10 ms and assuming events are not combined (are
> they?), this means up to 10 500 gesture events every second. Assuming
> one event takes 20 bytes, that gives roughly 205 kB/sec data rate.

The hardware is producing those data regardless.
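
For what it's worth, the figures quoted above are easy to check. A
minimal sketch in Python follows; the function names and default
parameters are illustrative assumptions taken from the numbers in the
quoted text, not part of any existing API.

```python
from math import comb


def gesture_counts(num_touches, sizes):
    """Count the distinct touch subsets of each requested size."""
    return {k: comb(num_touches, k) for k in sizes}


def event_rate(num_gestures, frame_interval_ms=10, bytes_per_event=20):
    """Worst case: every candidate gesture emits one event per frame.

    Returns (events per second, bytes per second). The 10 ms frame
    interval and 20-byte event size are the assumptions from the
    quoted mail, not measured values.
    """
    frames_per_second = 1000 // frame_interval_ms
    events = num_gestures * frames_per_second
    return events, events * bytes_per_event


counts = gesture_counts(15, (2, 3, 4))
events, rate = event_rate(counts[2])
print(counts)        # subsets of size 2, 3 and 4 out of 15 touches
print(events, rate)  # two-touch events per second, bytes per second
```

The 210 000 bytes per second this gives for the two-touch case is the
"roughly 205 kB/sec" in the quoted text (dividing by 1024).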
If the bottom half of the device driver is eating too much bandwidth, it
needs to be fixed. If the evdev layer in the kernel needs to consolidate
events, it should be fixed. If the gesture recognizer should be
consolidating frames or gesture events, it should. None of that should
affect the data transfer design, since those concerns are orthogonal to
it.

The gesture recognizer has to process all combinations of gestures in
parallel anyway, since it doesn't know what it's going to recognize
until it has recognized a gesture.

The amount of data transferred between the recognizer and the
application(s) in each gesture event is not significant. The limiting
factor is the number of data transfers, not the size of the data. Round
trips are inherently racy and time consuming.

> Is that a lot? I'm not sure. Anyone with mobile experience want to weigh
> in?
>
> I thought about this issue and came up with the following. It is more of
> a explorative evaluation and not a concrete plan. It also ignores most
> or all of what the implementation currently does. so some parts might
> not be feasible. Consider this a nudge to start the ball rolling.
>
>
> Design goal
>
> The system should do common case automatically. Uncommon cases should be
> possible and mostly straightforward.
>
>
> Assumptions
>
> In order to keep this analysis down to earth I make some assumptions on
> usage.
>
> Most applications only have one gesture they care about. This is the
> common case where only the (window-)global gesture matters. These sorts
> of applications include EoG, Evince etc. Pairwise gestures have no
> semantic meaning on these applications.

This is a big assumption and mostly incorrect. The assumption is that
all applications will care about all two-touch gestures within their
bounding area (windows). Unity cares about all 3- and 4-touch gestures.
Some applications may care about all one-touch gestures (i.e. just MT
events).
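
To illustrate how subscriptions keep the numbers tractable, here is a
rough sketch in Python. `candidate_gestures` and its arguments are
hypothetical names for illustration only, not the recognizer's actual
interface. It enumerates only the touch combinations whose size matches
what a client subscribed to.

```python
from itertools import combinations


def candidate_gestures(touch_ids, subscribed_sizes):
    """Yield every subset of the active touches whose size was subscribed to.

    touch_ids: identifiers of the currently active touches (assumed unique).
    subscribed_sizes: touch counts the client asked for, e.g. {3, 4}.
    """
    for size in sorted(subscribed_sizes):
        # combinations() yields nothing when size exceeds the touch count.
        yield from combinations(touch_ids, size)


touches = [0, 1, 2, 3, 4]
unity = list(candidate_gestures(touches, {3, 4}))
two_touch_app = list(candidate_gestures(touches, {2}))
print(len(unity), len(two_touch_app))
```

With 5 active touches, a client subscribed to 3- and 4-touch gestures
(as Unity is) has 15 candidates to consider, and a client subscribed
only to 2-touch gestures has 10.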
> In applications that do want pairwise gestures, only a small subset of
> all pairs is meaningful. Suppose an application that has four
> independent pinch-to-zoom areas. That means up to eight touch points,
> with a total of 28 combinations. Only 4 of these (14%) are used. The
> others are meaningless. In mathematic terms this means that of the
> O(N^2) combinations only O(N) are used.

But the gesture recognizer does not know which combinations are going to
be used. It has to send them all. The application can then be a good
citizen and reject the unused ones, but the recognizer can not rely on
all applications being good citizens. Developers are free to write bad
applications that can bring the system to its knees.

Don't forget that on mobile devices, you're not likely to get 8
simultaneous touchpoints on an input device. On a larger surface where
you could conceivably have that many touchpoints, you generally enjoy
the luxury of more and faster processors and no power consumption
constraint. It turns out the requirements scale with the available
technology.

> The only thing that knows which touch pairs are meaningful in an
> application is the application itself. There is no reliable heuristic.

Therefore all subscribed combinations must be sent.

> Individual touches are almost always important to the application.

Are they? I think few applications are interested in individual touches.
Just like few applications are interested in which keys on a keyboard
are pressed: they're interested in what text gets entered instead. Some
applications are interested in individual touches, and we support the
Touch gesture.

> The common case
>
> Based on the discussion above, it seems that most applications' gesture
> needs can be fulfilled with just two pieces of information: the
> individual touches and the global gesture that those touches form.

I need a definition of 'global gesture'.
It sounds like what you're saying is to pass the raw MT data to the
application and have the application perform the gesture recognition.
That design does not work under the requirement of having Unity grab the
3- and 4-touch gestures and providing a consistent feel across all
applications.

> The complicated case
>
> This is the big one. The basic case of transferring all pairs is
> relatively simple but computing all the pairs always is a bit wasteful,
> especially since usually only a small subset is required. (In fact the
> most common case is the one above, meaning that none of the pairs is
> ever used.)
>
> Since the important pairs can not be reliable determined the only way to
> cut down on processing is that the app tells which pairs it cares about
> as they come and go. Something along the lins of gest_id =
> AddGestureDetection(num_touches, touch_point_array, gesture_mask). This
> adds a round trip to the server.

The app can't tell which pairs it wants until it receives them. Ergo,
the existing design of sending all pairs and rejecting the unused ones
via a round trip. Unless you're proposing to have the application do all
the recognition using the raw MT data, which is a nonstarter because it
does not support Unity grabbing 3- and 4-touch gestures and does not
provide the consistent feel across all applications that we are aiming
for.

Applications (or toolkits used to build applications) already subscribe
to only a subset of all possible gestures, selected by number of
touches, window, device, and gesture class. They do not need to get all
combinations of touches for all gestures all of the time, just all those
gestures that could possibly satisfy what they asked for.

> If the app wants to add all gesture pairs or other crazy things, it can
> do that. At it also gets the blame when the system grinds to a halt.

Yes.

> Conclusions
>
> There are a quite a lot of issues, such as how Unity, the X server and
> apps work together.
> But this mail is long enough already.

How Unity and the apps work together, and consistency of interaction
across the user experience, is the driving requirement.

--
Stephen M. Webb <stephen.w...@canonical.com>
Canonical Ltd.