[Zeitgeist] [Merge] lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist
Mikkel Kamstrup Erlandsen has proposed merging lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)
Related bugs:
  #566898 Log DB file should be versioned
  https://bugs.launchpad.net/bugs/566898
  #580643 conversion script for the database and database versioning
  https://bugs.launchpad.net/bugs/580643

It's not very thoroughly tested yet, but upgrades <= 0.3.3 to 0.3.4 seem to work.

But what is this? Versioning of the core DB schema (it also adds the possibility to version other schemas if we ever have any).

On startup we check whether the schema version for the 'core' schema is what we expect. If that is the case, we assume the schema is good and no further setup is needed.

If the schema version is not what we want, we look for a module called _zeitgeist.engine.upgrades.core_$oldversion_$newversion and execute its run() method if it's there. In our case we are talking about upgrading from core schema 0 to 1, so that would be _zeitgeist.engine.upgrades.core_0_1.py.

Note that I did it this way in order to minimize the number of .py files we need to stat and/or parse at startup. If no upgrades are necessary, none of the upgrade .py files are read from disk, let alone parsed.

--
https://code.launchpad.net/~kamstrup/zeitgeist/schema_versions/+merge/26231
Your team Zeitgeist Framework Team is requested to review the proposed merge of lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist.
=== modified file '_zeitgeist/engine/__init__.py'
--- _zeitgeist/engine/__init__.py 2010-05-25 20:14:11 +
+++ _zeitgeist/engine/__init__.py 2010-05-27 19:21:32 +
@@ -80,6 +80,17 @@
     SIG_EVENT = "asaasay"

     # Extensions
+<<<<<<< TREE
     DEFAULT_EXTENSIONS = _get_extensions()
+=======
+    DEFAULT_EXTENSIONS = [
+        "_zeitgeist.engine.extensions.blacklist.Blacklist",
+        "_zeitgeist.engine.extensions.datasource_registry.DataSourceRegistry",
+        ]
+
+    # Required version of DB schema
+    CORE_SCHEMA="core"
+    CORE_SCHEMA_VERSION = 1
+>>>>>>> MERGE-SOURCE

 constants = _Constants()

=== modified file '_zeitgeist/engine/sql.py'
--- _zeitgeist/engine/sql.py 2010-05-14 16:59:04 +
+++ _zeitgeist/engine/sql.py 2010-05-27 19:21:32 +
@@ -22,6 +22,7 @@

 import sqlite3
 import logging
+import time

 from _zeitgeist.engine import constants

@@ -49,14 +50,88 @@
         else:
             return super(UnicodeCursor, self).execute(statement)

+def _get_schema_version (cursor, schema_name):
+    """
+    Returns the schema version for schema_name or returns 0 in case
+    the schema doesn't exist.
+    """
+    try:
+        schema_version_result = cursor.execute("""
+            SELECT version FROM schema_version WHERE schema='core'
+        """)
+        result = schema_version_result.fetchone()
+        return result[0] if result else 0
+    except sqlite3.OperationalError, e:
+        # The schema isn't there...
+        log.debug ("Schema '%s' not found: %s" % (schema_name, e))
+        return 0
+
+def _set_schema_version (cursor, schema_name, version):
+    """
+    Sets the version of `schema_name` to `version`
+    """
+    cursor.execute("""
+        CREATE TABLE IF NOT EXISTS schema_version
+            (schema VARCHAR PRIMARY KEY ON CONFLICT REPLACE, version INT)
+    """)
+
+    # The 'ON CONFLICT REPLACE' on the PK converts INSERT to UPDATE
+    # when appriopriate
+    cursor.execute("""
+        INSERT INTO schema_version VALUES (?, ?)
+    """, (schema_name, version))
+    cursor.connection.commit()
+
+def _do_schema_upgrade (cursor, schema_name, old_version, new_version):
+    """
+    Try and upgrade schema `schema_name` from version `old_version` to
+    `new_version`. This is done by checking for an upgrade module named
+    '_zeitgeist.engine.upgrades.$schema_name_$old_version_$new_version'
+    and executing the run(cursor) method of that module
+    """
+    # Fire of the right upgrade module
+    log.info("Upgrading database '%s' from version %s to %s. This may take a while" %
+             (schema_name, old_version, new_version))
+    upgrader_name = "%s_%s_%s" % (schema_name, old_version, new_version)
+    module = __import__ ("_zeitgeist.engine.upgrades.%s" % upgrader_name)
+    eval("module.engine.upgrades.%s.run(cursor)" % upgrader_name)
+
+    # Update the schema version
+    _set_schema_version(cursor, schema_name, new_version)
+
+    log.info("Upgrade succesful")
+
 def create_db(file_path):
     """Create the database and return a default cursor for it"""
-
+    start = time.time()
     log.info("Using database: %s" % file_path)
     conn = sqlite3.connect(file_path)
     conn.row_factory = sqlite3.Row
     cursor = conn.cursor(UnicodeCursor)

+    # See if we have the right schema version, and try an upgrade if needed
+    core_schema_version = _get_schema_version(cursor, constants.CORE_SCHEMA)
+    if core_schema_version is not None:
+        if core_schema_version == constants.CORE_SCHEMA_VERSION:
+            _time = (time.time() - start)*1000
+            log.debug("Core schema is good. DB loaded in %sms" % _time)
+            return cursor
+        else:
+            try:
+                _do_schema_upgrade (cursor,
+                    constants.CORE_SCHEMA,
+                    core_schema_version,
[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist
Michal Hruby has proposed merging lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)

--
https://code.launchpad.net/~mhr3/zeitgeist/mimetypes/+merge/26233
Your team Zeitgeist Framework Team is requested to review the proposed merge of lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist.

=== modified file '_zeitgeist/loggers/datasources/recent.py'
--- _zeitgeist/loggers/datasources/recent.py 2010-04-29 11:33:01 +
+++ _zeitgeist/loggers/datasources/recent.py 2010-05-27 19:34:36 +
@@ -24,8 +24,6 @@

 from __future__ import with_statement
 import os
-import re
-import fnmatch
 import urllib
 import time
 import logging
@@ -34,6 +32,7 @@
 from zeitgeist import _config
 from zeitgeist.datamodel import Event, Subject, Interpretation, Manifestation, \
     DataSource, get_timestamp_for_now
+from zeitgeist.mimetypes import get_interpretation_for_mimetype
 from _zeitgeist.loggers.zeitgeist_base import DataProvider

 log = logging.getLogger("zeitgeist.logger.datasources.recent")
@@ -51,166 +50,9 @@
 else:
     enabled = True

-class SimpleMatch(object):
-    """ Wrapper around fnmatch.fnmatch which allows to define mimetype
-    patterns by using shell-style wildcards.
-    """
-
-    def __init__(self, pattern):
-        self.__pattern = pattern
-
-    def match(self, text):
-        return fnmatch.fnmatch(text, self.__pattern)
-
-    def __repr__(self):
-        return "%s(%r)" %(self.__class__.__name__, self.__pattern)
-
-DOCUMENT_MIMETYPES = [
-    # Covers:
-    #   vnd.corel-draw
-    #   vnd.ms-powerpoint
-    #   vnd.ms-excel
-    #   vnd.oasis.opendocument.*
-    #   vnd.stardivision.*
-    #   vnd.sun.xml.*
-    SimpleMatch(u"application/vnd.*"),
-    # Covers: x-applix-word, x-applix-spreadsheet, x-applix-presents
-    SimpleMatch(u"application/x-applix-*"),
-    # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-    re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-    u"application/ms-powerpoint",
-    u"application/msword",
-    u"application/pdf",
-    u"application/postscript",
-    u"application/ps",
-    u"application/rtf",
-    u"application/x-abiword",
-    u"application/x-gnucash",
-    u"application/x-gnumeric",
-    SimpleMatch(u"application/x-java*"),
-    SimpleMatch(u"*/x-tex"),
-    SimpleMatch(u"*/x-latex"),
-    SimpleMatch(u"*/x-dvi"),
-    u"text/plain"
-]
-
-IMAGE_MIMETYPES = [
-    # Covers:
-    #   vnd.corel-draw
-    u"application/vnd.corel-draw",
-    # Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-    re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-    SimpleMatch(u"image/*"),
-]
-
-AUDIO_MIMETYPES = [
-    SimpleMatch(u"audio/*"),
-    u"application/ogg"
-]
-
-VIDEO_MIMETYPES = [
-    SimpleMatch(u"video/*"),
-    u"application/ogg"
-]
-
-DEVELOPMENT_MIMETYPES = [
-    u"application/ecmascript",
-    u"application/javascript",
-    u"application/x-csh",
-    u"application/x-designer",
-    u"application/x-desktop",
-    u"application/x-dia-diagram",
-    u"application/x-fluid",
-    u"application/x-glade",
-    u"application/xhtml+xml",
-    u"application/x-java-archive",
-    u"application/x-m4",
-    u"application/xml",
-    u"application/x-object",
-    u"application/x-perl",
-    u"application/x-php",
-    u"application/x-ruby",
-    u"application/x-shellscript",
-    u"application/x-sql",
-    u"text/css",
-    u"text/html",
-    u"text/x-c",
-    u"text/x-c++",
-    u"text/x-chdr",
-    u"text/x-copying",
-    u"text/x-credits",
-    u"text/x-csharp",
-    u"text/x-c++src",
-    u"text/x-csrc",
-    u"text/x-dsrc",
-    u"text/x-eiffel",
-    u"text/x-gettext-translation",
-    u"text/x-gettext-translation-template",
-    u"text/x-haskell",
-    u"text/x-idl",
-    u"text/x-java",
-    u"text/x-lisp",
-    u"text/x-lua",
-    u"text/x-makefile",
-    u"text/x-objcsrc",
-    u"text/x-ocaml",
-    u"text/x-pascal",
-    u"text/x-patch",
-    u"text/x-python",
-    u"text/x-sql",
-    u"text/x-tcl",
-    u"text/x-troff",
-    u"text/x-vala",
-    u"text/x-vhdl",
-]
-
-ALL_MIMETYPES = DOCUMENT_MIMETYPES + IMAGE_MIMETYPES + AUDIO_MIMETYPES + \
-    VIDEO_MIMETYPES + DEVELOPMENT_MIMETYPES
-
-class MimeTypeSet(set):
-    """ Set which allows to match against a string or an object with a
-    match() method.
-    """
-
-    def __init__(self, *items):
-        super(MimeTypeSet, self).__init__()
-        self.__pattern = set()
-        for item in items:
-            if isinstance(item, (str, unicode)):
-                self.add(item)
-            elif hasattr(item, "match"):
-                self.__pattern.add(item)
-            else:
-                raise ValueError("Bad mimetype '%s'" %item)
-
-    def __contains__(self, mimetype):
-        result = super(MimeTypeSet, self).__contains__(mimetype)
-        if not result:
-            for pattern in self.__pattern:
-                if pattern.match(mimetype):
-                    return True
-        return result
-
-    def __len__(self):
-        return super(MimeTypeSet, self).__len__() + len(self.__pattern)
-
-    def __repr__(self):
-        items = ", ".join(sorted(map(repr, self | self.__pattern)))
-        return "%s(%s)" %(self.__class__.__name__, items)
-
 class RecentlyUsedManagerGtk(DataProvider):
-    FILTERS = {
-        # dict of name as key and the matching mimetypes as value
-        # if the value is None this filter matches all mimetypes
-        "DOCUMENT": MimeTypeSet(*DOCUMENT_MIMETYPES),
-        "IMAGE": MimeTypeSet(*IMAGE_MIMETYPES),
-        "MUSIC":
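
The wildcard matching that the removed `SimpleMatch` and `MimeTypeSet` classes provided can be illustrated with a small lookup table. The sketch below is only a guess at the general shape of such a helper; it is not the `get_interpretation_for_mimetype` function imported in the diff, whose implementation is not shown in this message. The function name `interpretation_for`, the string labels and the shortened pattern lists are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical pattern table; labels and patterns are illustrative only and
# cover just a handful of the mimetypes listed in the removed code.
INTERPRETATION_PATTERNS = [
    ("DOCUMENT", ["application/pdf", "application/vnd.*", "text/plain"]),
    ("IMAGE", ["image/*"]),
    ("MUSIC", ["audio/*"]),
    ("VIDEO", ["video/*"]),
    ("SOURCE_CODE", ["text/x-python", "text/x-c*"]),
]


def interpretation_for(mimetype):
    """Return the first label whose shell-style patterns match, or None."""
    for label, patterns in INTERPRETATION_PATTERNS:
        # fnmatchcase compares case-sensitively on every platform.
        if any(fnmatchcase(mimetype, pattern) for pattern in patterns):
            return label
    return None


print(interpretation_for("application/vnd.ms-excel"))
print(interpretation_for("image/png"))
print(interpretation_for("application/x-unknown"))
```

Because the table is an ordered list, a mimetype that matches several groups resolves to the first one. The removed code instead allowed overlaps; `application/ogg`, for example, appears in both the audio and video lists.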
Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist
Looks good on a quick glance over the diff. Nice work.

--
https://code.launchpad.net/~mhr3/zeitgeist/mimetypes/+merge/26233
Your team Zeitgeist Framework Team is requested to review the proposed merge of lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist.

___
Mailing list: https://launchpad.net/~zeitgeist
Post to     : zeitgeist@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zeitgeist
More help   : https://help.launchpad.net/ListHelp
[Zeitgeist] [Merge] lp:~seif/zeitgeist/tests-fix into lp:zeitgeist
Seif Lotfy has proposed merging lp:~seif/zeitgeist/tests-fix into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)

fixing some benchmark tests :)

--
https://code.launchpad.net/~seif/zeitgeist/tests-fix/+merge/26236
Your team Zeitgeist Framework Team is requested to review the proposed merge of lp:~seif/zeitgeist/tests-fix into lp:zeitgeist.

=== modified file '_zeitgeist/engine/main.py'
--- _zeitgeist/engine/main.py 2010-05-14 17:16:49 +
+++ _zeitgeist/engine/main.py 2010-05-27 19:48:36 +
@@ -393,7 +393,6 @@
         Only URIs for subjects matching the indicated `result_event_templates`
         and `result_storage_state` are returned.
         """
-
         if result_type == 0 or result_type == 1:

             t1 = time.time()
@@ -411,6 +410,7 @@
                 result_storage_state, 0, 4)
             landmarks = set([unicode(event[0]) for event in landmarks])

+
             latest_uris = dict(uris)
             events = [unicode(u[0]) for u in uris]

@@ -467,7 +467,6 @@
             sets.sort(reverse = True)
             sets = map(lambda result: result[1], sets[:num_results])
-
             return sets
         else:
             raise NotImplementedError, "Unsupported ResultType."

=== modified file 'test/benchmarks.py'
--- test/benchmarks.py 2010-01-04 20:50:13 +
+++ test/benchmarks.py 2010-05-27 19:48:36 +
@@ -29,12 +29,10 @@
 # range (0, randonmess)!
 #

-CONTENTS = [Interpretation.DOCUMENT, Interpretation.TAG, Interpretation.BOOKMARK, Interpretation.MUSIC,
-    Interpretation.EMAIL, Interpretation.IMAGE]
-SOURCES = [Manifestation.FILE, Manifestation.WEB_HISTORY, Manifestation.SYSTEM_RESOURCE,
-    Manifestation.USER_ACTIVITY]
+INTERPRETATIONS = list(Interpretation.get_all_children())
+MANIFESTATIONS = list(Manifestation.get_all_children())

-USES = [Manifestation.USER_ACTIVITY, Manifestation.USER_NOTIFICATION]
+USES = list(Manifestation.get_all_children())

 APPS = ["foo.desktop", "bar.desktop", "bleh.desktop"]

@@ -43,22 +41,22 @@
 MIMES = ["application/pdf", "application/xml", "text/plain", "image/png", "image/jpeg"]

-TAGS = ["puppies", "", "kittens", "", "ponies", "", "", ""]

-def new_dummy_item(uri, randomness=0, timestamp=0):
-    return {
-        "uri" : uri,
-        "content" : CONTENTS[randint(0, randomness) % len(CONTENTS)],
-        "source" : SOURCES[randint(0, randomness) % len(SOURCES)],
-        "app" : APPS[randint(0, randomness) % len(APPS)],
-        "timestamp" : timestamp,
-        "text" : "Text",
-        "mimetype" : MIMES[randint(0, randomness) % len(MIMES)],
-        "use" : USES[randint(0, randomness) % len(USES)],
-        "origin" : ORIGINS[randint(0, randomness) % len(ORIGINS)],
-        "bookmark" : 0 if randomness == 0 else randint(0,1),
-        "tags" : TAGS[randint(0, randomness) % len(TAGS)]
-    }
+def new_dummy_item(uri, randomness=0, timestamp=0):
+    event = Event()
+    subject = Subject()
+    subject.uri = uri
+    subject.interpretation = INTERPRETATIONS[randint(0, randomness) % len(INTERPRETATIONS)]
+    subject.manifestation = MANIFESTATIONS[randint(0, randomness) % len(MANIFESTATIONS)]
+    event.actor = APPS[randint(0, randomness) % len(APPS)],
+    event.timestamp = timestamp
+    subject.text = "Text",
+    subject.mimetype = MIMES[randint(0, randomness) % len(MIMES)],
+    subject.origin = ORIGINS[randint(0, randomness) % len(ORIGINS)],
+    event.interpretation = INTERPRETATIONS[randint(0, randomness) % len(INTERPRETATIONS)]
+    event.manifestation = MANIFESTATIONS[randint(0, randomness) % len(MANIFESTATIONS)]
+    event.set_subjects([subject])
+    return event

 class EngineBenchmark (unittest.TestCase):
@@ -83,12 +81,14 @@
     def do5ChunksOf200(self, randomness):
         batch = []
         full_start = time()
+        #batch_time = time()
         for i in range(1,1001):
             batch.append(new_dummy_item("test://item%s" % i, randomness=randomness, timestamp=i))
-            if len(batch) % 200 == 0:
+            if len(batch) % 200 == 0:
+                #log.info("Finished batch of 200 in: %ss" % (time() - batch_time))
+                #batch_time = time()
                 start = time()
                 self.engine.insert_events(batch)
-                log.info("Inserted 200 items in: %ss" % (time()-start))
                 batch = []
         log.info("Total insertion time for 1000 items: %ss" % (time()-full_start))

@@ -115,9 +115,12 @@
         inserted_items = []
         batch = []
         full_start = time()
+        #batch_time = time()
         for i in range(1,num+1):
             batch.append(new_dummy_item("test://item%s" % i, randomness=randomness, timestamp=i))
             if len(batch) % 400 == 0:
+                #log.info("Finished batch of 400 in: %ss" % (time() - batch_time))
+                #batch_time = time()
                 self.engine.insert_events(batch)
                 inserted_items.extend(batch)
                 batch = []
@@ -128,7 +131,7 @@
         log.info("Total insertion time for %s items: %ss" % (num, time()-full_start))
         return inserted_items

-    def do_find(self, expected_items, page_size, **kwargs):
+    def do_find(self, expected_items, page_size, filters = [], **kwargs):
         """
         Helper method to find a set of items with page size of 'page_size'
         passin 'kwargs' directly to the engine.find_events() method.
@@ -138,12 +141,14 @@
         next_timestamp = 0
         page_time = time()
         full_time = page_time
-        page =
[Zeitgeist] [Bug 566898] Re: Log DB file should be versioned
** Changed in: zeitgeist
       Status: In Progress => Fix Committed

--
Log DB file should be versioned
https://bugs.launchpad.net/bugs/566898
You received this bug notification because you are a member of Zeitgeist Framework Team, which is subscribed to Zeitgeist Framework.

Status in Zeitgeist Framework: Fix Committed

Bug description:
We should really store the log DB schema version in our sqlite DB. That way we can do smooth upgrades without hacks. Not only that, but we can also shave off a lot of SQL grinding at startup if we just check the DB schema version...

I am thinking of a new table:

  CREATE TABLE IF NOT EXISTS version_info (schema_name VARCHAR, version INT)

This table will have one row for our initial use case, but we may add more rows in the future. The schema_name of our core log DB could be "main_log", and on startup we'd do:

  SELECT version FROM version_info WHERE schema_name='main_log';
  if version != expected_version :
      do stuff
  else :
      no need to run all our initial sql
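
The version table and startup check from the bug description might look roughly like this with Python's `sqlite3` module. It is a minimal sketch rather than the committed fix: the function names are hypothetical, the table and column names follow the bug description, and the `PRIMARY KEY ON CONFLICT REPLACE` clause is an assumption borrowed from the schema_versions merge proposal so that a second INSERT for the same schema overwrites the old row.

```python
import sqlite3


def get_schema_version(conn, schema_name):
    """Return the stored version, or 0 if the table or the row is missing."""
    try:
        row = conn.execute(
            "SELECT version FROM version_info WHERE schema_name = ?",
            (schema_name,),
        ).fetchone()
    except sqlite3.OperationalError:
        return 0  # the version_info table does not exist yet
    return row[0] if row else 0


def set_schema_version(conn, schema_name, version):
    """Record `version` for `schema_name`, replacing any earlier row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS version_info "
        "(schema_name VARCHAR PRIMARY KEY ON CONFLICT REPLACE, version INT)"
    )
    conn.execute("INSERT INTO version_info VALUES (?, ?)", (schema_name, version))
    conn.commit()


conn = sqlite3.connect(":memory:")
print(get_schema_version(conn, "main_log"))  # 0: fresh database, no table yet
set_schema_version(conn, "main_log", 1)
set_schema_version(conn, "main_log", 2)      # replaces the row for main_log
print(get_schema_version(conn, "main_log"))  # 2
```

Treating a missing table as version 0 means a brand-new database and a pre-versioning database take the same code path, which is what lets the startup check decide whether any setup or upgrade SQL needs to run at all.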