[Zeitgeist] [Merge] lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist

2010-05-27 Thread Mikkel Kamstrup Erlandsen
Mikkel Kamstrup Erlandsen has proposed merging 
lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)
Related bugs:
  #566898 Log DB file should be versioned
  https://bugs.launchpad.net/bugs/566898
  #580643 conversion script for the database and database versioning
  https://bugs.launchpad.net/bugs/580643


It's not very thoroughly tested yet, but upgrades from <= 0.3.3 to 0.3.4 seem
to work.

But what is this?
Versioning of the core DB schema. It also makes it possible to version other
schemas, should we ever have any.

On startup we check whether the version of the 'core' schema is the one we
expect. If it is, we assume the schema is good and no further setup is needed.

If the schema version is not what we want, we look for a module called
_zeitgeist.engine.upgrades.core_$oldversion_$newversion and execute its run()
method if it's there.
In our case we are upgrading from core schema 0 to 1, so that would be the
module _zeitgeist.engine.upgrades.core_0_1 (file core_0_1.py).

Note that I did it this way in order to minimize the number of .py files we
need to stat and/or parse at startup. If no upgrades are necessary, none of the
upgrade .py files are read from disk, let alone parsed.
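
For illustration, the lookup described above could be sketched roughly as
follows. This is written for Python 3, whereas the branch itself is Python 2.
The helper name find_upgrader and the stand-in package demo_upgrades are made
up for this example; only the $schema_$old_$new module naming and the
run(cursor) entry point come from the proposal.

```python
import importlib
import sys
import types


def find_upgrader(package, schema_name, old_version, new_version):
    """Import '<package>.<schema>_<old>_<new>' on demand.

    Returns the module's run() callable, or None if no such upgrade
    module exists. Nothing is imported unless this function is called.
    """
    module_name = "%s.%s_%s_%s" % (package, schema_name, old_version, new_version)
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "run", None)


# Stand-in package so the sketch runs without any files on disk
# (hypothetical names, not part of Zeitgeist).
_pkg = types.ModuleType("demo_upgrades")
_pkg.__path__ = []  # an empty __path__ marks it as a package with no files
_mod = types.ModuleType("demo_upgrades.core_0_1")
_mod.run = lambda cursor: "upgraded"
sys.modules["demo_upgrades"] = _pkg
sys.modules["demo_upgrades.core_0_1"] = _mod

upgrade = find_upgrader("demo_upgrades", "core", 0, 1)
missing = find_upgrader("demo_upgrades", "core", 1, 2)
```

Here upgrade is the run() callable of the stand-in module, while missing is
None because no core_1_2 module exists.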
-- 
https://code.launchpad.net/~kamstrup/zeitgeist/schema_versions/+merge/26231
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~kamstrup/zeitgeist/schema_versions into lp:zeitgeist.
=== modified file '_zeitgeist/engine/__init__.py'
--- _zeitgeist/engine/__init__.py	2010-05-25 20:14:11 +
+++ _zeitgeist/engine/__init__.py	2010-05-27 19:21:32 +
@@ -80,6 +80,17 @@
 	SIG_EVENT = "asaasay"
 	
 	# Extensions
+<<<<<<< TREE
 	DEFAULT_EXTENSIONS = _get_extensions()
+=======
+	DEFAULT_EXTENSIONS = [
+		_zeitgeist.engine.extensions.blacklist.Blacklist,
+		_zeitgeist.engine.extensions.datasource_registry.DataSourceRegistry,
+		]
+	
+	# Required version of DB schema
+	CORE_SCHEMA = "core"
+	CORE_SCHEMA_VERSION = 1
+>>>>>>> MERGE-SOURCE
 
 constants = _Constants()

=== modified file '_zeitgeist/engine/sql.py'
--- _zeitgeist/engine/sql.py	2010-05-14 16:59:04 +
+++ _zeitgeist/engine/sql.py	2010-05-27 19:21:32 +
@@ -22,6 +22,7 @@
 
 import sqlite3
 import logging
+import time
 
 from _zeitgeist.engine import constants
 
@@ -49,14 +50,88 @@
 		else:
 			return super(UnicodeCursor, self).execute(statement)
 
+def _get_schema_version (cursor, schema_name):
+	"""
+	Returns the schema version for schema_name or returns 0 in case
+	the schema doesn't exist.
+	"""
+	try:
+		schema_version_result = cursor.execute("""
+			SELECT version FROM schema_version WHERE schema=?
+		""", (schema_name,))
+		result = schema_version_result.fetchone()
+		return result[0] if result else 0
+	except sqlite3.OperationalError, e:
+		# The schema isn't there...
+		log.debug ("Schema '%s' not found: %s" % (schema_name, e))
+		return 0
+
+def _set_schema_version (cursor, schema_name, version):
+	"""
+	Sets the version of `schema_name` to `version`
+	"""
+	cursor.execute("""
+		CREATE TABLE IF NOT EXISTS schema_version
+			(schema VARCHAR PRIMARY KEY ON CONFLICT REPLACE, version INT)
+	""")
+	
+	# The 'ON CONFLICT REPLACE' on the PK converts INSERT to UPDATE
+	# when appropriate
+	cursor.execute("""
+		INSERT INTO schema_version VALUES (?, ?)
+	""", (schema_name, version))
+	cursor.connection.commit()
+
+def _do_schema_upgrade (cursor, schema_name, old_version, new_version):
+	"""
+	Try and upgrade schema `schema_name` from version `old_version` to
+	`new_version`. This is done by checking for an upgrade module named
+	'_zeitgeist.engine.upgrades.$schema_name_$old_version_$new_version'
+	and executing the run(cursor) method of that module
+	"""
+	# Fire off the right upgrade module
+	log.info("Upgrading database '%s' from version %s to %s. This may take a while" %
+	         (schema_name, old_version, new_version))
+	upgrader_name = "%s_%s_%s" % (schema_name, old_version, new_version)
+	module = __import__ ("_zeitgeist.engine.upgrades.%s" % upgrader_name)
+	eval("module.engine.upgrades.%s.run(cursor)" % upgrader_name)
+	
+	# Update the schema version
+	_set_schema_version(cursor, schema_name, new_version)
+	
+	log.info("Upgrade successful")
+
 def create_db(file_path):
 	"""Create the database and return a default cursor for it"""
-	
+	start = time.time()
 	log.info("Using database: %s" % file_path)
 	conn = sqlite3.connect(file_path)
 	conn.row_factory = sqlite3.Row
 	cursor = conn.cursor(UnicodeCursor)
 	
+	# See if we have the right schema version, and try an upgrade if needed
+	core_schema_version = _get_schema_version(cursor, constants.CORE_SCHEMA)
+	if core_schema_version is not None:
+		if core_schema_version == constants.CORE_SCHEMA_VERSION:
+			_time = (time.time() - start)*1000
+			log.debug("Core schema is good. DB loaded in %sms" % _time)
+			return cursor
+		else:
+			try:
+				_do_schema_upgrade (cursor,
+				                    constants.CORE_SCHEMA,
+				                    core_schema_version,
+
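
The 'ON CONFLICT REPLACE' behaviour that the version table relies on can be
checked in isolation. The following is only a minimal sketch (Python 3, plain
sqlite3 cursor, in-memory database), not the code from the branch; the function
name mirrors the diff but drops the leading underscore to make clear it is a
separate example.

```python
import sqlite3


def set_schema_version(cursor, schema_name, version):
    """Record `version` for `schema_name`, creating the table on first use."""
    cursor.execute(
        "CREATE TABLE IF NOT EXISTS schema_version "
        "(schema VARCHAR PRIMARY KEY ON CONFLICT REPLACE, version INT)"
    )
    # Because of ON CONFLICT REPLACE on the primary key, inserting an
    # existing schema name replaces the old row instead of failing.
    cursor.execute(
        "INSERT INTO schema_version VALUES (?, ?)", (schema_name, version)
    )
    cursor.connection.commit()


conn = sqlite3.connect(":memory:")
cursor = conn.cursor()

set_schema_version(cursor, "core", 0)
set_schema_version(cursor, "core", 1)  # same key: replaces the row above

rows = cursor.execute("SELECT schema, version FROM schema_version").fetchall()
```

After the two calls the table holds a single row for 'core' with version 1.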

[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist

2010-05-27 Thread Michal Hruby
Michal Hruby has proposed merging lp:~mhr3/zeitgeist/mimetypes into 
lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)

-- 
https://code.launchpad.net/~mhr3/zeitgeist/mimetypes/+merge/26233
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist.
=== modified file '_zeitgeist/loggers/datasources/recent.py'
--- _zeitgeist/loggers/datasources/recent.py	2010-04-29 11:33:01 +
+++ _zeitgeist/loggers/datasources/recent.py	2010-05-27 19:34:36 +
@@ -24,8 +24,6 @@
 
 from __future__ import with_statement
 import os
-import re
-import fnmatch
 import urllib
 import time
 import logging
@@ -34,6 +32,7 @@
 from zeitgeist import _config
 from zeitgeist.datamodel import Event, Subject, Interpretation, Manifestation, \
 	DataSource, get_timestamp_for_now
+from zeitgeist.mimetypes import get_interpretation_for_mimetype
 from _zeitgeist.loggers.zeitgeist_base import DataProvider
 
 log = logging.getLogger("zeitgeist.logger.datasources.recent")
@@ -51,166 +50,9 @@
 else:
 	enabled = True
 
-class SimpleMatch(object):
-	""" Wrapper around fnmatch.fnmatch which allows to define mimetype
-	patterns by using shell-style wildcards.
-	"""
-
-	def __init__(self, pattern):
-		self.__pattern = pattern
-
-	def match(self, text):
-		return fnmatch.fnmatch(text, self.__pattern)
-
-	def __repr__(self):
-		return "%s(%r)" %(self.__class__.__name__, self.__pattern)
-
-DOCUMENT_MIMETYPES = [
-		# Covers:
-		#	 vnd.corel-draw
-		#	 vnd.ms-powerpoint
-		#	 vnd.ms-excel
-		#	 vnd.oasis.opendocument.*
-		#	 vnd.stardivision.*
-		#	 vnd.sun.xml.*
-		SimpleMatch(u"application/vnd.*"),
-		# Covers: x-applix-word, x-applix-spreadsheet, x-applix-presents
-		SimpleMatch(u"application/x-applix-*"),
-		# Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-		re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-		u"application/ms-powerpoint",
-		u"application/msword",
-		u"application/pdf",
-		u"application/postscript",
-		u"application/ps",
-		u"application/rtf",
-		u"application/x-abiword",
-		u"application/x-gnucash",
-		u"application/x-gnumeric",
-		SimpleMatch(u"application/x-java*"),
-		SimpleMatch(u"*/x-tex"),
-		SimpleMatch(u"*/x-latex"),
-		SimpleMatch(u"*/x-dvi"),
-		u"text/plain"
-]
-
-IMAGE_MIMETYPES = [
-		# Covers:
-		#	 vnd.corel-draw
-		u"application/vnd.corel-draw",
-		# Covers: x-kword, x-kspread, x-kpresenter, x-killustrator
-		re.compile(u"application/x-k(word|spread|presenter|illustrator)"),
-		SimpleMatch(u"image/*"),
-]
-
-AUDIO_MIMETYPES = [
-		SimpleMatch(u"audio/*"),
-		u"application/ogg"
-]
-
-VIDEO_MIMETYPES = [
-		SimpleMatch(u"video/*"),
-		u"application/ogg"
-]
-
-DEVELOPMENT_MIMETYPES = [
-		u"application/ecmascript",
-		u"application/javascript",
-		u"application/x-csh",
-		u"application/x-designer",
-		u"application/x-desktop",
-		u"application/x-dia-diagram",
-		u"application/x-fluid",
-		u"application/x-glade",
-		u"application/xhtml+xml",
-		u"application/x-java-archive",
-		u"application/x-m4",
-		u"application/xml",
-		u"application/x-object",
-		u"application/x-perl",
-		u"application/x-php",
-		u"application/x-ruby",
-		u"application/x-shellscript",
-		u"application/x-sql",
-		u"text/css",
-		u"text/html",
-		u"text/x-c",
-		u"text/x-c++",
-		u"text/x-chdr",
-		u"text/x-copying",
-		u"text/x-credits",
-		u"text/x-csharp",
-		u"text/x-c++src",
-		u"text/x-csrc",
-		u"text/x-dsrc",
-		u"text/x-eiffel",
-		u"text/x-gettext-translation",
-		u"text/x-gettext-translation-template",
-		u"text/x-haskell",
-		u"text/x-idl",
-		u"text/x-java",
-		u"text/x-lisp",
-		u"text/x-lua",
-		u"text/x-makefile",
-		u"text/x-objcsrc",
-		u"text/x-ocaml",
-		u"text/x-pascal",
-		u"text/x-patch",
-		u"text/x-python",
-		u"text/x-sql",
-		u"text/x-tcl",
-		u"text/x-troff",
-		u"text/x-vala",
-		u"text/x-vhdl",
-]
-
-ALL_MIMETYPES = DOCUMENT_MIMETYPES + IMAGE_MIMETYPES + AUDIO_MIMETYPES + \
-VIDEO_MIMETYPES + DEVELOPMENT_MIMETYPES
-
-class MimeTypeSet(set):
-	""" Set which allows to match against a string or an object with a
-	match() method.
-	"""
-
-	def __init__(self, *items):
-		super(MimeTypeSet, self).__init__()
-		self.__pattern = set()
-		for item in items:
-			if isinstance(item, (str, unicode)):
-				self.add(item)
-			elif hasattr(item, "match"):
-				self.__pattern.add(item)
-			else:
-				raise ValueError("Bad mimetype '%s'" %item)
-
-	def __contains__(self, mimetype):
-		result = super(MimeTypeSet, self).__contains__(mimetype)
-		if not result:
-			for pattern in self.__pattern:
-				if pattern.match(mimetype):
-					return True
-		return result
-		
-	def __len__(self):
-		return super(MimeTypeSet, self).__len__() + len(self.__pattern)
-
-	def __repr__(self):
-		items = ", ".join(sorted(map(repr, self | self.__pattern)))
-		return "%s(%s)" %(self.__class__.__name__, items)
-
 
 class RecentlyUsedManagerGtk(DataProvider):
 	
-	FILTERS = {
-		# dict of name as key and the matching mimetypes as value
-		# if the value is None this filter matches all mimetypes
-		"DOCUMENT": MimeTypeSet(*DOCUMENT_MIMETYPES),
-		"IMAGE": MimeTypeSet(*IMAGE_MIMETYPES),
-		"MUSIC": 
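
For readers who want to see the idea behind the removed pattern tables in a
compact form, here is a minimal Python 3 sketch of matching a mimetype against
shell-style wildcards. The table PATTERNS and the function categorize_mimetype
are hypothetical and written only for this example; this is not the API of
zeitgeist.mimetypes, whose implementation is not shown in this diff.

```python
import fnmatch

# Hypothetical pattern table: (category, shell-style patterns).
# Order matters, the first category with a matching pattern wins.
PATTERNS = [
    ("image", ["image/*", "application/vnd.corel-draw"]),
    ("audio", ["audio/*"]),
    ("video", ["video/*"]),
    ("document", ["application/pdf", "application/vnd.*", "text/plain"]),
]


def categorize_mimetype(mimetype):
    """Return the first category whose patterns match, or None."""
    for category, patterns in PATTERNS:
        # fnmatchcase does a case-sensitive shell-style match
        if any(fnmatch.fnmatchcase(mimetype, p) for p in patterns):
            return category
    return None


png = categorize_mimetype("image/png")
corel = categorize_mimetype("application/vnd.corel-draw")
excel = categorize_mimetype("application/vnd.ms-excel")
unknown = categorize_mimetype("text/x-python")
```

Keeping literal names and wildcards in one ordered table avoids the need for a
separate set subclass, at the cost of a linear scan per lookup.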

Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist

2010-05-27 Thread Siegfried Gevatter
Looks good on a quick glance over the diff. Nice work.
-- 
https://code.launchpad.net/~mhr3/zeitgeist/mimetypes/+merge/26233
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/mimetypes into lp:zeitgeist.

___
Mailing list: https://launchpad.net/~zeitgeist
Post to : zeitgeist@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zeitgeist
More help   : https://help.launchpad.net/ListHelp


[Zeitgeist] [Merge] lp:~seif/zeitgeist/tests-fix into lp:zeitgeist

2010-05-27 Thread Seif Lotfy
Seif Lotfy has proposed merging lp:~seif/zeitgeist/tests-fix into lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)


fixing some benchmark tests :)
-- 
https://code.launchpad.net/~seif/zeitgeist/tests-fix/+merge/26236
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~seif/zeitgeist/tests-fix into lp:zeitgeist.
=== modified file '_zeitgeist/engine/main.py'
--- _zeitgeist/engine/main.py	2010-05-14 17:16:49 +
+++ _zeitgeist/engine/main.py	2010-05-27 19:48:36 +
@@ -393,7 +393,6 @@
 		Only URIs for subjects matching the indicated `result_event_templates`
 		and `result_storage_state` are returned.
 		
-		
 		if result_type == 0 or result_type == 1:
 			
 			t1 = time.time()
@@ -411,6 +410,7 @@
 	result_storage_state, 0, 4)
 			landmarks = set([unicode(event[0]) for event in landmarks])
 			
+			
 			latest_uris = dict(uris)
 			events = [unicode(u[0]) for u in uris]
 
@@ -467,7 +467,6 @@
 
 			sets.sort(reverse = True)
 			sets = map(lambda result: result[1], sets[:num_results])
-			
 			return sets
 		else:
 			raise NotImplementedError, "Unsupported ResultType."

=== modified file 'test/benchmarks.py'
--- test/benchmarks.py	2010-01-04 20:50:13 +
+++ test/benchmarks.py	2010-05-27 19:48:36 +
@@ -29,12 +29,10 @@
 # range (0, randomness)!
 #
 
-CONTENTS = [Interpretation.DOCUMENT, Interpretation.TAG, Interpretation.BOOKMARK, Interpretation.MUSIC,
-			Interpretation.EMAIL, Interpretation.IMAGE]
-SOURCES = [Manifestation.FILE, Manifestation.WEB_HISTORY, Manifestation.SYSTEM_RESOURCE,
-			Manifestation.USER_ACTIVITY]
+INTERPRETATIONS = list(Interpretation.get_all_children())
+MANIFESTATIONS = list(Manifestation.get_all_children())
 
-USES = [Manifestation.USER_ACTIVITY, Manifestation.USER_NOTIFICATION]
+USES = list(Manifestation.get_all_children())
 
 APPS = ["foo.desktop", "bar.desktop", "bleh.desktop"]
 
@@ -43,22 +41,22 @@
 MIMES = ["application/pdf", "application/xml", "text/plain",
 			"image/png", "image/jpeg"]
 
-TAGS = ["puppies", "", "kittens", "", "ponies", "", "", ""]
 
-def new_dummy_item(uri, randomness=0, timestamp=0):		
-	return {
-		"uri" : uri,
-		"content" : CONTENTS[randint(0, randomness) % len(CONTENTS)],
-		"source" : SOURCES[randint(0, randomness) % len(SOURCES)],
-		"app" : APPS[randint(0, randomness) % len(APPS)],
-		"timestamp" : timestamp,
-		"text" : "Text",
-		"mimetype" : MIMES[randint(0, randomness) % len(MIMES)],
-		"use" : USES[randint(0, randomness) % len(USES)],
-		"origin" : ORIGINS[randint(0, randomness) % len(ORIGINS)],
-		"bookmark" : 0 if randomness == 0 else randint(0,1),
-		"tags" : TAGS[randint(0, randomness) % len(TAGS)]
-	}
+def new_dummy_item(uri, randomness=0, timestamp=0):	
+	event = Event()
+	subject = Subject()	
+	subject.uri = uri
+	subject.interpretation = INTERPRETATIONS[randint(0, randomness) % len(INTERPRETATIONS)]
+	subject.manifestation = MANIFESTATIONS[randint(0, randomness) % len(MANIFESTATIONS)]
+	event.actor = APPS[randint(0, randomness) % len(APPS)]
+	event.timestamp = timestamp
+	subject.text = "Text"
+	subject.mimetype = MIMES[randint(0, randomness) % len(MIMES)]
+	subject.origin = ORIGINS[randint(0, randomness) % len(ORIGINS)]
+	event.interpretation = INTERPRETATIONS[randint(0, randomness) % len(INTERPRETATIONS)]
+	event.manifestation = MANIFESTATIONS[randint(0, randomness) % len(MANIFESTATIONS)]
+	event.set_subjects([subject])
+	return event
 
 class EngineBenchmark (unittest.TestCase):
 	
@@ -83,12 +81,14 @@
 	def do5ChunksOf200(self, randomness):
 		batch = []
 		full_start = time()
+		#batch_time = time()
 		for i in range(1,1001):
 			batch.append(new_dummy_item("test://item%s" % i, randomness=randomness, timestamp=i))
-			if len(batch) % 200 == 0:
+			if len(batch) % 200 == 0:
+				#log.info("Finished batch of 200 in: %ss" % (time() - batch_time))
+				#batch_time = time()
 				start = time()
 				self.engine.insert_events(batch)
-				log.info("Inserted 200 items in: %ss" % (time()-start))
 				batch = []
 		log.info("Total insertion time for 1000 items: %ss" % (time()-full_start))
 	
@@ -115,9 +115,12 @@
 		inserted_items = []
 		batch = []
 		full_start = time()
+		#batch_time = time()
 		for i in range(1,num+1):
 			batch.append(new_dummy_item("test://item%s" % i, randomness=randomness, timestamp=i))
 			if len(batch) % 400 == 0:
+				#log.info("Finished batch of 400 in: %ss" % (time() - batch_time))
+				#batch_time = time()
 				self.engine.insert_events(batch)
 				inserted_items.extend(batch)
 				batch = []
@@ -128,7 +131,7 @@
 		log.info("Total insertion time for %s items: %ss" % (num, time()-full_start))
 		return inserted_items		
 	
-	def do_find(self, expected_items, page_size, **kwargs):
+	def do_find(self, expected_items, page_size, filters = [], **kwargs):
 		"""
 		Helper method to find a set of items with page size of 'page_size'
 		passing 'kwargs' directly to the engine.find_events() method.
@@ -138,12 +141,14 @@
 		next_timestamp = 0
 		page_time = time()
 		full_time = page_time	
-		page = 

[Zeitgeist] [Bug 566898] Re: Log DB file should be versioned

2010-05-27 Thread Mikkel Kamstrup Erlandsen
** Changed in: zeitgeist
   Status: In Progress => Fix Committed

-- 
Log DB file should be versioned
https://bugs.launchpad.net/bugs/566898
You received this bug notification because you are a member of Zeitgeist
Framework Team, which is subscribed to Zeitgeist Framework.

Status in Zeitgeist Framework: Fix Committed

Bug description:
We should really store the log DB schema version in our sqlite database. That
way we can do smooth upgrades without hacks. Not only that, but we can also
shave off a lot of SQL grinding at startup if we just check the db schema
version...

I am thinking a new table:

  CREATE TABLE IF NOT EXISTS version_info (schema_name VARCHAR, version INT)

This table will have one row for our initial use case, but we may add more rows
in the future. The schema_name of our core log db could be 'main_log' and on
startup we'd do:

  SELECT version FROM version_info WHERE schema_name='main_log';

  if version != expected_version : do stuff
  else : no need to run all our initial sql
  else : no need to run all our initial sql


