Hi Jamie,

> > I think it makes sense to fix this. Just to be clear, does this mean we
> > don't need Pango in libtracker-fts/tracker-parser.c to determine word
> > breaks for CJK?
>
> That's not broken, so I would not recommend trying to "fix" that.
>
> IMHO, the tracker_text_normalize() in the extractor should just do UTF-8
> validation. It should not attempt word breaking, as that's CPU-expensive
> and is already being done by the parser.
>

But then how can we limit the extracted text based on the number of words?

Cheers,

-- 
Aleksander
_______________________________________________
tracker-list mailing list
http://mail.gnome.org/mailman/listinfo/tracker-list
