On Thu, 2012-10-04 at 10:54 +0100, Martyn Russell wrote:
> On 04/10/12 09:17, Ivan Frade wrote:
> > I think python script contents are indexed because the mimetype is
> > "text/x-python" and it falls back to the "text/*" extractor. PHP files
> > have the mimetype "application/x-php" and there is no default option
> > for that.
> >
> > This can be solved by adding "application/x-php" in the .rule file of
> > the text extractor (check
> > /usr/local/share/tracker/extract-rules/90-text-generic.rule and other
> > rule files in the same folder).
> >
> > Note that generic text indexing means that the python code is treated
> > as plain text, a bunch of words. You could always write a specialized
> > extractor that takes into account the semantics of the file. For
> > example ignoring __init__.py files, or import statements, maybe
> > ignoring the code and indexing only function names.... depends on what
> > you want. Same applies to PHP.
> >
> > Writing an extractor module is not difficult with some rudiments of
> > programming in C and we can help via mailing list or IRC. Patches are
> > welcome ;)
>
> I should add, you can use:
>
>   tracker-control -m $MIME
>
> or
>
>   tracker-control --reindex-mime-type=$MIME
>
> if you change the rules file, so you do not have to reindex all content
> again.
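The fallback described in the quoted text can be pictured as a glob match of a file's MIME type against the MimeTypes list of a rule file. The following is only a rough sketch of that idea, assuming the entries are simple glob patterns separated by ';'. It is not Tracker's actual matching code, and mime_matches is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper (not part of Tracker): succeed if the MIME type in $1
# matches any of the ';'-separated glob patterns in $2.
# The function body runs in a subshell, so the IFS and 'set -f' changes
# do not leak into the calling shell.
mime_matches() (
    mime=$1
    IFS=';'
    set -f                      # do not expand '*' against file names
    for pattern in $2; do
        case $mime in
            $pattern) exit 0 ;; # unquoted, so it is treated as a glob
        esac
    done
    exit 1
)

# The two MIME types mentioned in the quoted text, checked against "text/*".
for mime in text/x-python application/x-php; do
    if mime_matches "$mime" "text/*"; then
        echo "$mime: matched by the generic text rule"
    else
        echo "$mime: no matching pattern"
    fi
done
```

Under these assumptions "text/x-python" is caught by "text/*", while "application/x-php" only matches once a pattern equal to the MIME type that tracker-info reports for the file is in the list.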
awilliam@linux-nysu:~> cat /usr/share/tracker/extract-rules/90-text-generic.rule
[ExtractorRule]
ModulePath=/usr/lib64/tracker-0.14/extract-modules/libextract-text.so
MimeTypes=text/*;application/php
awilliam@linux-nysu:~> tracker-control --reindex-mime-type="application/php"
Reindexing mime types was successful
awilliam@linux-nysu:~> grep -c Vaccaro /home/awilliam/Documents/Development/PHP/jsonRPCClient.php
1
awilliam@linux-nysu:~> tracker-search Vaccaro
Results:

Nope. :(

awilliam@linux-nysu:~> tracker-info /home/awilliam/Documents/Development/PHP/jsonRPCClient.php
Querying information for entity:'/home/awilliam/Documents/Development/PHP/jsonRPCClient.php'
  'urn:uuid:ccd602ad-60ea-faee-f4d3-6c8e54274fe3'
Results:
  'http://purl.org/dc/elements/1.1/date' = '2012-04-18T21:25:19Z'
  'http://purl.org/dc/elements/1.1/source' = 'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'tracker:added' = '2012-09-22T00:07:48Z'
  'tracker:modified' = '441117'
  'rdf:type' = 'http://www.w3.org/2000/01/rdf-schema#Resource'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#DataObject'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement'
  'rdf:type' = 'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject'
  'nie:byteSize' = '3977'
  'nie:dataSource' = 'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'nie:isPartOf' = 'urn:uuid:44f7deca-c907-266e-ee4c-1850b641f8a7'
  'nie:url' = 'file:///home/awilliam/Documents/Development/PHP/jsonRPCClient.php'
  'nfo:belongsToContainer' = 'urn:uuid:44f7deca-c907-266e-ee4c-1850b641f8a7'
  'tracker:available' = 'true'
  'nie:isStoredAs' = 'urn:uuid:ccd602ad-60ea-faee-f4d3-6c8e54274fe3'
  'nie:mimeType' = 'application/x-php'
  'nfo:fileLastAccessed' = '2012-04-18T21:25:19Z'
  'nfo:fileLastModified' = '2012-04-18T21:25:19Z'
  'nfo:fileName' = 'jsonRPCClient.php'
  'nfo:fileSize' = '3977'

Doing a touch on the file seems to cause nfo:fileLastModified to change,
but it still doesn't show up in a search.

_______________________________________________
tracker-list mailing list
https://mail.gnome.org/mailman/listinfo/tracker-list
