Hi Carlos, thanks for reviewing!
On Tue, 17 Jul 2018 19:18:36 +0200
Carlos Garnacho <[email protected]> wrote:

> Hi!,
>
> (Way way late, trying to revive the conversation...)
>
> On Thu, May 3, 2018 at 9:22 PM, Dorota Czaplejewicz
> <[email protected]> wrote:
> > On Thu, 3 May 2018 20:47:27 +0200
> > Silvan Jegen <[email protected]> wrote:
> >
> >> Hi Dorota
> >>
> >> Some comments and typo fixes below.
> >>
> >> On Thu, May 03, 2018 at 05:41:21PM +0200, Dorota Czaplejewicz wrote:
> >> > This new protocol description is a simplification over v2.
> >> >
> >> > - All pre-edit text styling is gone.
> >> > - Pre-edit cursor can span characters.
> >> > - No events regarding input panel (OSK) state nor covered rectangle.
> >> >   Compositors are still free to handle situations where the keyboard
> >> >   focus rectangle is covered by the input panel.
> >> > - No set_preferred_language request for clients.
> >> > - There is no event to send keysyms. Compositors can use wl_keyboard
> >> >   interface instead.
> >> > - All state is double-buffered, with specified state.
> >> > - Use Unicode codepoints to measure strings.
> >> >
> >> > Signed-off-by: Dorota Czaplejewicz <[email protected]>
> >> > Signed-off-by: Carlos Garnacho <[email protected]>
> >> > ---
> >> > This is the next update coming from Purism to perfect the text input
> >> > protocol.
> >> >
> >> > The following changes added on top of PATCHv3:
> >> >
> >> > - Fixed whitespaces.
> >> > - Removed enable flags - the same information can be gathered from the
> >> >   first requests after enter.
> >> > - Changed offsets inside UTF-8 strings to use Unicode character counts
> >> >   in order to remove the possibility of communicating invalid state.
> >> > - Specified the exact lifetime of double-buffered state, and initial
> >> >   values.
> >> > - Made changes requested by the IM double-buffered.
> >> >
> >> > Some questions remain open. One is: how to specify how much text to
> >> > capture in set_surrounding_text, and how often to update?
>
> IMHO the only reason to state it here is that it's more likely that a
> lazy implementation will try to squeeze a full book here, than eg. an
> application setting an insanely long title. But certainly other
> messages across protocols may hit this limit (the long title issue
> wasn't made up :).
>
> As for how much, I think it ultimately depends on the IM behind. Text
> correction probably just wants the current word, any sort of
> prediction will probably require phrases to paragraphs, char
> composition can probably do without. Sounds like this could be some
> sort of hint, but I don't think IMs can tell you today how much text
> do they want...
>
> >> >
> >> > A possible change that I decided against for now is to replace
> >> > enable/disable events by create/destroy of a new object, which would
> >> > make more state lifetimes encoded in the protocol.
> >> >
> >> > After reading a blog post on fcitx [0], I got the impression that
> >> > letting the compositor know some persistent ID of a text edit instance
> >> > could be useful, however I'm not sure what the use cases are.
> >> >
> >> > As always, I'm happy to hear feedback.
> >> >
> >> > Cheers,
> >> > Dorota Czaplejewicz
> >> >
> >> > [0]
> >> > https://www.csslayer.info/wordpress/fcitx-dev/gaps-between-wayland-and-fcitx-or-all-input-methods/
> >> >
> >> >  Makefile.am                                    |   1 +
> >> >  unstable/text-input/text-input-unstable-v3.xml | 362 +++++++++++++++++++++++++
> >> >  2 files changed, 363 insertions(+)
> >> >  create mode 100644 unstable/text-input/text-input-unstable-v3.xml
> >> >
> >> > diff --git a/Makefile.am b/Makefile.am
> >> > index 4b9a901..86d7ca9 100644
> >> > --- a/Makefile.am
> >> > +++ b/Makefile.am
> >> > @@ -3,6 +3,7 @@ unstable_protocols = \
> >> >   unstable/fullscreen-shell/fullscreen-shell-unstable-v1.xml \
> >> >   unstable/linux-dmabuf/linux-dmabuf-unstable-v1.xml \
> >> >   unstable/text-input/text-input-unstable-v1.xml \
> >> > + unstable/text-input/text-input-unstable-v3.xml \
> >> >   unstable/input-method/input-method-unstable-v1.xml \
> >> >   unstable/xdg-shell/xdg-shell-unstable-v5.xml \
> >> >   unstable/xdg-shell/xdg-shell-unstable-v6.xml \
> >> > diff --git a/unstable/text-input/text-input-unstable-v3.xml b/unstable/text-input/text-input-unstable-v3.xml
> >> > new file mode 100644
> >> > index 0000000..ed5204f
> >> > --- /dev/null
> >> > +++ b/unstable/text-input/text-input-unstable-v3.xml
> >> > @@ -0,0 +1,362 @@
> >> > +<?xml version="1.0" encoding="UTF-8"?>
> >> > +
> >> > +<protocol name="text_input_unstable_v3">
> >> > +  <copyright>
> >> > +    Copyright © 2012, 2013 Intel Corporation
> >> > +    Copyright © 2015, 2016 Jan Arne Petersen
> >> > +    Copyright © 2017, 2018 Red Hat, Inc.
> >> > +    Copyright © 2018 Purism SPC
> >> > +
> >> > +    Permission to use, copy, modify, distribute, and sell this
> >> > +    software and its documentation for any purpose is hereby granted
> >> > +    without fee, provided that the above copyright notice appear in
> >> > +    all copies and that both that copyright notice and this permission
> >> > +    notice appear in supporting documentation, and that the name of
> >> > +    the copyright holders not be used in advertising or publicity
> >> > +    pertaining to distribution of the software without specific,
> >> > +    written prior permission. The copyright holders make no
> >> > +    representations about the suitability of this software for any
> >> > +    purpose. It is provided "as is" without express or implied
> >> > +    warranty.
> >> > +
> >> > +    THE COPYRIGHT HOLDERS DISCLAIM ALL WARRANTIES WITH REGARD TO THIS
> >> > +    SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
> >> > +    FITNESS, IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY
> >> > +    SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
> >> > +    WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
> >> > +    AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION,
> >> > +    ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF
> >> > +    THIS SOFTWARE.
> >> > +  </copyright>
> >> > +
> >> > +  <interface name="zwp_text_input_v3" version="1">
> >> > +    <description summary="text input">
> >> > +      The zwp_text_input_v3 interface represents text input and input methods
> >> > +      associated with a seat. It provides enter/leave events to follow the
> >> > +      text input focus for a seat.
> >> > +
> >> > +      Requests are used to enable/disable the text-input object and set
> >> > +      state information like surrounding and selected text or the content type.
> >> > +      The information about the entered text is sent to the text-input object
> >> > +      via the pre-edit and commit_string events.
> >> > +
> >> > +      Text is valid UTF-8 encoded, indices and lengths are in code points. If a
> >> > +      grapheme is made up of multiple code points, an index pointing to any of
> >> > +      them should be interpreted as pointing to the first one.
> >>
> >> That way we make sure we don't put the cursor/anchor between bytes that
> >> belong to the same UTF-8 encoded Unicode code point which is nice. It
> >> also means that the client has to parse all the UTF-8 encoded strings
> >> into Unicode code points up to the desired cursor/anchor position
> >> on each "preedit_string" event. For each "delete_surrounding_text" event
> >> the client has to parse the UTF-8 sequences before and after the cursor
> >> position up to the requested Unicode code point.
> >>
> >> I feel like we are processing the UTF-8 string already in the
> >> input-method. So I am not sure that we should parse it again on the
> >> client side. Parsing it again would also mean that the client would need
> >> to know about UTF-8 which would be nice to avoid.
> >>
> >> Thoughts?
> >
> > The client needs to know about Unicode, but not necessarily about UTF-8.
> > Specifying code points is actually an advantage here, because byte offsets
> > are inherently expressed relative to UTF-8. By counting with code points,
> > client's internal representation can be UTF-16 or maybe even something
> > else.
>
> I personally think byte offsets are more handy than codepoints:
> pointer math is O(1) and str*() functions are "sensible" (on UTF-8 at
> least, and past the bytes!=chars gotchas), it's relatively simple to
> find out whether you are in the middle of a UTF-8 char, it seems
> simpler to deal with than the other way around if utf16/codepoints are
> used in either side; and this might even be moot as all parties are
> interested in chopping strings between word/char boundaries.
>
> As for using UTF-8 specifically, other protocols do use it for
> exchange of strings (eg. xdg_surface.set_title).
> It's the perfect fit for glib/pango/etc, so it wouldn't be me who
> objects, either :).
>
> Cheers,
>   Carlos

I think you're tipping the scales here. In the interest of having the
protocol move forward I'm changing code points to bytes, since I don't
think they make a huge difference in practice.

v5 incoming!

Cheers,
Dorota
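
For reference, the trade-off between code point indices and byte offsets
discussed above can be illustrated with a minimal Python sketch. It is not
part of the patch or of any Wayland library; the helper names
codepoint_to_byte_offset and is_char_boundary are made up for this
illustration. Mapping a code point index to a byte offset requires scanning
the UTF-8 string, while checking whether a byte offset falls inside a
multi-byte character only needs a look at one byte.

```python
def codepoint_to_byte_offset(data: bytes, index: int) -> int:
    """Return the byte offset where code point number `index` starts.

    `data` is assumed to be valid UTF-8. An index equal to the number of
    code points maps to len(data), i.e. the position just past the end.
    """
    if index < 0:
        raise IndexError("code point index must be non-negative")
    seen = 0
    for offset, byte in enumerate(data):
        # UTF-8 continuation bytes have the form 0b10xxxxxx; any other
        # byte starts a new code point.
        if (byte & 0xC0) != 0x80:
            if seen == index:
                return offset
            seen += 1
    if seen == index:
        return len(data)
    raise IndexError("code point index out of range")


def is_char_boundary(data: bytes, offset: int) -> bool:
    """Return True if `offset` does not split a multi-byte UTF-8 sequence."""
    if offset < 0 or offset > len(data):
        return False
    if offset == len(data):
        return True
    # A single byte inspection is enough: continuation bytes are never
    # valid starting positions.
    return (data[offset] & 0xC0) != 0x80


# Hypothetical example text: "zażółć", where every letter after "za"
# takes two bytes in UTF-8.
text = "za\u017c\u00f3\u0142\u0107".encode("utf-8")
cursor = codepoint_to_byte_offset(text, 3)  # O(n) scan, gives 4
inside = is_char_boundary(text, 3)          # O(1) check, gives False
```

The scan in the first helper is the per-event parsing cost mentioned in the
review, and the second helper is the "am I in the middle of a UTF-8 char"
check that byte offsets make cheap.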
