On Fri, 2012-04-06 at 15:43 +0100, Karl Relton wrote:
> Hi all
>
> Tracker's odt extraction doesn't seem to cover the full text. From what I
> can see, it is only storing words in paragraphs using 'heading' styles
> (any of 'heading' or 'heading1', 'heading2' etc.). Text in other
> paragraphs (e.g. using 'Default' or 'Text body') is not being indexed.
>
> Is this intentional?
>
Looking at the code in tracker-extract-oasis.c, it seems unnecessarily picky to me. For the main text content (the ODF elements text:p and text:h) it checks whether styles are 'present', or are of certain types.

Why this effort? Why not just accept all content in text:p or text:h elements, regardless of style? That would seem to me to be more in line with the ODF spec.

Also, as an aside, it seems pointless capturing the content of text:s elements, since these by definition represent only white space.
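
To illustrate what I mean by accepting everything, here is a minimal sketch in Python. It is not Tracker's actual C code and makes no claim about how tracker-extract-oasis.c is structured. The function names (extract_text, extract_odt_text) are made up for this example. The only things taken from ODF itself are the text namespace URI and the fact that the document body lives in content.xml inside the .odt zip archive.

```python
import xml.etree.ElementTree as ET
import zipfile

# Standard ODF namespace for the text: prefix.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def _tag(local):
    """Build the ElementTree-style qualified name for a text: element."""
    return "{%s}%s" % (TEXT_NS, local)


# Elements that only stand for white space; each is treated as one space.
_WHITESPACE_TAGS = (_tag("s"), _tag("tab"), _tag("line-break"))


def _flatten(elem):
    """Collect the text of an element and all of its descendants."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag in _WHITESPACE_TAGS:
            parts.append(" ")
        else:
            parts.append(_flatten(child))
        parts.append(child.tail or "")
    return "".join(parts)


def extract_text(content_xml):
    """Return the text of every text:p and text:h, ignoring style names.

    Simplification: a paragraph nested inside another paragraph (for
    example in a footnote or text box) is emitted twice by this walk.
    """
    root = ET.fromstring(content_xml)
    wanted = (_tag("p"), _tag("h"))
    chunks = []
    for elem in root.iter():
        if elem.tag in wanted:
            # Collapse runs of white space so the index sees clean words.
            text = " ".join(_flatten(elem).split())
            if text:
                chunks.append(text)
    return "\n".join(chunks)


def extract_odt_text(path):
    """Read content.xml out of an .odt archive and extract its text."""
    with zipfile.ZipFile(path) as archive:
        return extract_text(archive.read("content.xml"))
```

Note that the style attribute is never looked at, so 'Default' and 'Text body' paragraphs are picked up along with headings, and text:s contributes nothing beyond a word separator.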
