Re: [Tracker] nie:plainTextContent, Unicode normalization and Word breaks

Aleksander Morgado Sun, 25 Apr 2010 13:34:52 -0700

Hi Jamie,

> > I think it makes sense to fix this. Just to be clear, does this mean we 
> > don't need Pango in libtracker-fts/tracker-parser.c to determine word 
> > breaks for CJK?
> 
> That's not broken, so I would not recommend trying to "fix" that.
> 
> IMHO, the tracker_text_normalize() in the extractor should just do UTF-8
> validation. It should not attempt word breaking, as that's CPU-expensive
> and is already being done by the parser.
>


But then how can we limit the extracted text based on the number of
words?
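
To make the question concrete, here is a rough sketch of the cheapest
word limit I can think of: count runs of bytes separated by ASCII
whitespace and cut the text after N of them. The helper name
word_limited_length() is hypothetical and not part of Tracker. It is
safe on valid UTF-8 because ASCII whitespace bytes never occur inside a
multibyte sequence, but it is only an approximation: CJK text has no
spaces between words, so a whole CJK run counts as a single word here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of Tracker. */
static int is_ascii_space(unsigned char c)
{
    return c == ' ' || c == '\t' || c == '\n' ||
           c == '\r' || c == '\f' || c == '\v';
}

/* Returns the byte length of the longest prefix of `text` that holds at
 * most `max_words` words, where a word is a run of bytes delimited by
 * ASCII whitespace. Trailing whitespace is not included in the prefix. */
size_t word_limited_length(const char *text, size_t max_words)
{
    size_t i = 0, words = 0, end = 0;

    assert(text != NULL);

    while (text[i] != '\0' && words < max_words) {
        /* Skip the whitespace in front of the next word. */
        while (is_ascii_space((unsigned char)text[i]))
            i++;
        if (text[i] == '\0')
            break;
        /* Consume one word. */
        while (text[i] != '\0' && !is_ascii_space((unsigned char)text[i]))
            i++;
        words++;
        end = i; /* the prefix ends right after this word */
    }
    return end;
}
```

For example, word_limited_length("one two three", 2) returns 7, the
byte length of "one two". Anything more accurate than this would need
real word-break detection, which is the expensive part.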

Cheers,
-- 
Aleksander

