On Thu, 04 Apr 2013 11:49:59 +0100 Tom Hacohen <[email protected]> said:
> On 04/04/13 11:43, Carsten Haitzler (The Rasterman) wrote:
> > On Thu, 04 Apr 2013 08:39:24 +0100 Tom Hacohen <[email protected]>
> > said:
> >
> >> On 04/04/13 00:52, Carsten Haitzler (The Rasterman) wrote:
> >>> On Wed, 03 Apr 2013 17:26:42 +0100 Tom Hacohen <[email protected]>
> >>> said:
> >>>
> >>>> On 28/03/13 10:49, Carsten Haitzler (The Rasterman) wrote:
> >>>>> On Thu, 28 Mar 2013 09:56:40 +0000 Michael Blumenkrantz
> >>>>> <[email protected]> said:
> >>>>>
> >>>>> thats cool. i just had to be grumpy about not having a bug report that
> >>>>> told me what to look at instantly. i have found another bug. single
> >>>>> letter words dont find word end markers.
> >>>>
> >>>> I just checked it, and it works for me:
> >>>> #include <stdlib.h>
> >>>> #include <wchar.h>
> >>>> #include <stdio.h>
> >>>> #include <wordbreak.h>
> >>>>
> >>>> int main()
> >>>> {
> >>>>    {
> >>>>       const char *lang = "";
> >>>>       wchar_t *text = L"This is a test";
> >>>>       size_t len = wcslen(text);
> >>>>       char *breaks = malloc(len);
> >>>>       size_t i;
> >>>>
> >>>>       printf("%ls\n", text);
> >>>>
> >>>>       set_wordbreaks_utf32((const utf32_t *) text, len, lang,
> >>>>                            breaks);
> >>>>       for (i = 0 ; i < len ; i++)
> >>>>          printf("%d", (int) breaks[i]);
> >>>>       printf("\n");
> >>>>    }
> >>>>    return 0;
> >>>> }
> >>>>
> >>>> The output is:
> >>>> This is a test
> >>>> 11100100001110
> >>>>
> >>>> 1s meaning no break, 0s meaning break here. It does break correctly
> >>>> around the "a". Could you elaborate more on the bug you were seeing?
> >>>
> >>> no NON-breaks around "a". you can't tell that there is a word there at
> >>> all. it may as well be " " (all spaces). :)
> >>>
> >>>> Cheers,
> >>>> Tom.
> >>>>
> >>>
> >>>
> >>
> >> Yeah, well, you know it using other means. Unfortunately it's beyond the
> >> scope of the word breaking algorithm... There are no word breaks there,
> >> thus the algorithm produces none. You probably need to just skip whites
> >> in your code, not only rely on the wordbreak data when "merging" the
> >> whites.
> >
> > and that is a problem as the word next/prev stuff relies on this.. and what
> > are "whites" then? (from the word breaking point of view)... eg ' is NOT
> > white. ( is. etc....
> >
> >
>
> Well, word breaking has nothing to do with whites (well, they happen to
> be in a class that separates words, but that's it). I wouldn't change
> the word next/prev functions themselves, I'd just change the way they
> are used in edje/elm. I.e, something like:
>
> If (is_white(cur_char))
> {
>    skip_whites;
>    skip_word;
> }
> else
> {
>    skip_word;
>    skip_whites;
> }

and therein lies the rub. "what is a white" when it comes to word separation?
assuming white == separator, that'd be eg " ", "\t", "\n", ")", "(", "." etc.

-- 
------------- Codito, ergo sum - "I code, therefore I am" --------------
The Rasterman (Carsten Haitzler)    [email protected]

------------------------------------------------------------------------------
Minimize network downtime and maximize team effectiveness.
Reduce network management and security costs. Learn how to hire
the most talented Cisco Certified professionals.
Visit the Employer Resources Portal
http://www.cisco.com/web/learning/employer_resources/index.html
_______________________________________________
enlightenment-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/enlightenment-devel
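For reference, the skip order proposed in the quoted pseudocode could be sketched roughly as below. This is only an illustration under stated assumptions, not edje/elm or libunibreak code: is_separator(), skip_separators(), skip_word() and word_next() are hypothetical names, and the separator set is just the handful of examples named in the thread (space, tab, newline, parentheses, full stop). The apostrophe is left out on purpose, so "don't" would stay one word. Which characters really belong in that set is exactly the open question above.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical separator test. The set holds only the examples from the
 * thread; a real implementation would need a properly defined class. */
static int is_separator(wchar_t c)
{
    /* Guard against L'\0': wcschr() would otherwise match the terminator. */
    return c != L'\0' && wcschr(L" \t\n().", c) != NULL;
}

/* Advance pos past a run of separators. */
static size_t skip_separators(const wchar_t *text, size_t len, size_t pos)
{
    while (pos < len && is_separator(text[pos]))
        pos++;
    return pos;
}

/* Advance pos past a run of non-separators (a "word" in this sketch). */
static size_t skip_word(const wchar_t *text, size_t len, size_t pos)
{
    while (pos < len && !is_separator(text[pos]))
        pos++;
    return pos;
}

/* Move the cursor forward by one word, in the order from the pseudocode:
 * separators then word when starting on a separator, otherwise word then
 * separators. Single-letter words need no special handling here. */
static size_t word_next(const wchar_t *text, size_t len, size_t pos)
{
    if (pos < len && is_separator(text[pos])) {
        pos = skip_separators(text, len, pos);
        pos = skip_word(text, len, pos);
    } else {
        pos = skip_word(text, len, pos);
        pos = skip_separators(text, len, pos);
    }
    return pos;
}
```

With the test string from the thread, L"This is a test", calling word_next() at offset 8 (the lone "a") moves the cursor to offset 10, the start of "test", without consulting the break array at all.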
