FYI,

The diff below is a first look at the tracker-db-sqlite.c file. I added
some comments that illustrate how a journal table, "Events", will be
filled.

Note that the table will most likely become an in-memory SQLite table.

The reason I don't think a GHashTable in the C code is as good is that
we want to repeat the query of the TrackerXesamLiveSearch against this
"Events" table (for example with an INNER JOIN with Services).

If it were a GHashTable, that query would either need a lot of OR
clauses (one per ServiceID) or we'd need to run a query for each item
in the table to check whether it affects a live search.

/me is the master of pseudo code, here I go again! 

For each query in live-search-queries do

  // This one sounds like the best to me. It requires an in-memory
  // SQLite table called "Events"

  SELECT ... FROM Events, Services ... 
        WHERE   Events.ServiceID = Services.ID 
        AND     the live-search-query 
        AND     (ServiceID is in the table)

  // Pro: short arguments list, easy query
  // Con: JOIN (although the cartesian product is relatively small)

or

  // This one doesn't need an "Events" table in SQLite but does need an
  // in-memory GHashTable in C holding all the affected ServiceIDs

  SELECT ... FROM Services ... 
        WHERE   the live-search-query 
        AND     (
                           ServiceID = hashtable[0].key
                        OR ServiceID = hashtable[1].key 
                        OR ServiceID = hashtable[2].key
                        ...
                        OR ServiceID = hashtable[n].key
                )

  // Pro: no JOIN
  // Con: long arguments list


done
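
To make the "long arguments list" con of the second variant concrete,
here is a minimal sketch of how the OR clause could be assembled from
the affected ServiceIDs. The helper name build_service_id_clause is
hypothetical (nothing like it exists in tracker), and a plain array
stands in for the keys of the GHashTable:

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of tracker: builds
 * "(ServiceID = 3 OR ServiceID = 7 ...)" from an array of IDs.
 * Returns a malloc'ed string that the caller must free, or NULL if
 * the allocation fails. An empty array yields "(0)", a clause that
 * matches nothing. */
char *
build_service_id_clause (const unsigned int *ids, size_t n_ids)
{
	/* Each term needs at most 26 bytes (" OR " + "ServiceID = " +
	 * 10 digits); 32 per term plus 4 for "(", ")" and NUL is ample. */
	size_t      cap = 4 + n_ids * 32;
	size_t      len = 0;
	size_t      i;
	const char *sep = "";
	char       *clause = malloc (cap);

	if (clause == NULL)
		return NULL;

	if (n_ids == 0) {
		strcpy (clause, "(0)");
		return clause;
	}

	clause[len++] = '(';
	for (i = 0; i < n_ids; i++) {
		int written = snprintf (clause + len, cap - len,
					"%sServiceID = %u", sep, ids[i]);

		/* Cannot fail given the capacity computed above. */
		assert (written > 0 && (size_t) written < cap - len);
		len += (size_t) written;
		sep = " OR ";
	}
	clause[len++] = ')';
	clause[len] = '\0';

	return clause;
}
```

The clause grows linearly with the number of journaled events and has to
be rebuilt and appended to every live-search query, which is what the
JOIN against an "Events" table avoids.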

On Tue, 2008-04-29 at 17:56 +0200, Philip Van Hoof wrote:
> Pre note: 
> 
> This is about the Xesam support being done (since this week) in the
> indexer-split.
> 
> About:
> 
> Xesam requires notifying live searches about changes that affect them.
> We plan to implement this with an "events" table that journals all
> creates, deletes and updates that the indexer causes.
> 
> Periodically we will handle and then flush the items in that events
> table.
> 
> I made a cracktasty diagram that contains the from-a-high-distance
> abstract proposal that we have in mind for this.
> 
> 
> This is pseudo code that illustrates the periodic handler:
> 
> bool periodic_handler (...) 
> 
> {
> 
>   lock indexer
>   update eventstable set beinghandled=1 where 1=1 (all items)
>   unlock indexer
> 
>   foreach query in all livequeries
>      added, modified, removed = query.execute-on (eventstable)
>      query.emit_added (added)
>      query.emit_removed (removed)
>      query.emit_modified (modified)
>   done
> 
>   lock indexer
>   delete from eventstable where beinghandled = 1
>   unlock indexer
> 
>   return (!stopping)
> 
> }
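
The two short critical sections in the pseudo code above can be sketched
in plain C as follows. This is only an illustration under assumptions:
a fixed-size array stands in for the events table, and the names
EventsTable, events_add, events_mark_all and events_purge_handled are
hypothetical, not tracker API. The locking itself is left out; the
comments mark where the indexer would be locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_EVENTS 64

typedef enum { EVENT_CREATE, EVENT_UPDATE, EVENT_DELETE } EventType;

typedef struct {
	unsigned int service_id;
	EventType    type;
	bool         being_handled;
} Event;

typedef struct {
	Event  items[MAX_EVENTS];
	size_t len;
} EventsTable;

/* Called by the indexer: journal one event.
 * Returns false if the table is full. */
bool
events_add (EventsTable *t, unsigned int service_id, EventType type)
{
	if (t->len == MAX_EVENTS)
		return false;

	t->items[t->len++] = (Event) { service_id, type, false };
	return true;
}

/* First critical section (indexer locked):
 * update eventstable set beinghandled=1.
 * Returns the number of events marked. */
size_t
events_mark_all (EventsTable *t)
{
	size_t i;

	for (i = 0; i < t->len; i++)
		t->items[i].being_handled = true;

	return t->len;
}

/* Second critical section (indexer locked):
 * delete from eventstable where beinghandled = 1.
 * Returns the number of events that remain. */
size_t
events_purge_handled (EventsTable *t)
{
	size_t i, kept = 0;

	for (i = 0; i < t->len; i++) {
		if (!t->items[i].being_handled)
			t->items[kept++] = t->items[i];
	}
	t->len = kept;

	return kept;
}
```

An event that the indexer journals between the two calls is not marked,
so the purge leaves it in place for the next run of the handler. That is
what allows the live queries to be evaluated without holding the lock.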
> 
> 
> Here's a piece of IRC log between me and jamiemcc about the proposal:
> 
> pvanhoof ping jamiemcc 
> pvanhoof same thing
> pvanhoof I'll make a pdf
> jamiemcc oh ok
> pvanhoof Sending
> pvanhoof ok
> pvanhoof so
> pvanhoof it's about the hitsadded, hitsremoved and hitsmodified signals for 
> xesam
> pvanhoof What we have in mind is using an "events" table that is a journal for 
> all creates, deletes and updates
> pvanhoof Periodically we will flush that table, each create (insert), update 
> and each delete we add a record in that table
> pvanhoof We'll make sure the table is queryable in a similar fashion as how 
> the Xesam query will execute
> pvanhoof In the periodical handler we'll for each live search check whether 
> it got affected by the items in the events table
> pvanhoof In pseudo, the handler:
> jamiemcc sounds feasible
> pvanhoof gboolean periodic_handler (void data) {
> pvanhoof   lock indexer
> pvanhoof   update eventstable set beinghandled=1 where 1=1 (all items)
> pvanhoof   unlock indexer
> pvanhoof   foreach query in all live queries
> pvanhoof      added, modified, removed = query.execute-on (eventstable)
> pvanhoof      query.emit_added (added)
> pvanhoof      query.emit_removed (removed)
> pvanhoof      query.emit_modified (modified)
> pvanhoof   done
> pvanhoof   lock indexer
> pvanhoof   delete from eventstable where beinghandled = 1
> pvanhoof   unlock indexer
> pvanhoof }
> pvanhoof I've sent you a diagram that you can look at as if it's a 
> state-activity one, an ERD and a class diagram :) now how cool is that?? :)
> pvanhoof it's just three columns, although the ERD is quite simplistic of 
> course
> jamiemcc yeah just got it
> pvanhoof so, the current idea is to adapt those stored procedures into 
> transactions that will also add this record to the "events" table
> pvanhoof Which might not be sufficient, and we kinda lack the in-depth 
> know-how of all the db handling of tracker
> pvanhoof So that's a first issue we want to discuss with you
> pvanhoof The other is stopping the indexing, restarting it (locking it, in 
> the pseudo code): what you think about that
> jamiemcc ok I will need to think about it - I will probably reply later 
> tonight and we can discuss tomorrow
> pvanhoof I adapted my initial proposal to have two short critical sections 
> rather than letting the entire periodic handler be one critical section
> pvanhoof that way the lock is smaller
> jamiemcc the indexer will be a separate process so will need to be locked via 
> dbus signals
> pvanhoof by just adding a column to the events table
> pvanhoof yes but I guess we want any such locking to be short
> jamiemcc well yes 
> pvanhoof then once the items that are to be handled are identified, we for 
> each live-search check whether the live-search is affected
> pvanhoof and we perform the necessary hitsadded, hitsremoved and hitsmodified 
> signals if needed
> pvanhoof if all is done, we simply purge the handled items from the events 
> table
> jamiemcc the query results will be stored in temp tables
> pvanhoof which is the second location where we want the indexer to be 
> locked-out
> jamiemcc remember a query may be a cursor so wont include entire result set
> pvanhoof No okay, but that's something the check needs to worry about 
> pvanhoof so ottela is working on a query for the live-search
> jamiemcc ok cool
> pvanhoof and if we only want to update if the client has the affected item 
> visible, due to cursor-usage
> pvanhoof then i guess we'll somehow need to get that info into trackerd
> jamiemcc any reason we don't store what's changed in memory rather than in a sqlite 
> table?
> pvanhoof oh, that's abstract right now
> jamiemcc o
> jamiemcc ok
> pvanhoof "tracker's event table" can also be a hashtable for me ..
> jamiemcc yeah fine
> pvanhoof implementation detail
> pvanhoof since it doesn't need to be persistent ...
> pvanhoof difference is that either we use a memory table and still a 
> transaction for the three stored procedures
> pvanhoof or we adapt code
> jamiemcc prefer hashtable as amount of data will be small
> jamiemcc can even be a list
> pvanhoof ok, your comments/ideas on this would of course be very useful btw
> jamiemcc yeah I will think about it more tonight and get back to you
> pvanhoof sounds great
> pvanhoof I'll make a mail about this to the mailing list? or I await your 
> ideas tomorrow?
> pvanhoof I'll just wait for now
> jamiemcc you can mail if you like
> jamiemcc I will reply to it
> 
> 
> _______________________________________________
> tracker-list mailing list
> tracker-list@gnome.org
> http://mail.gnome.org/mailman/listinfo/tracker-list
-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be



Index: src/trackerd/tracker-db-sqlite.c
===================================================================
--- src/trackerd/tracker-db-sqlite.c	(revision 1330)
+++ src/trackerd/tracker-db-sqlite.c	(working copy)
@@ -3346,6 +3346,9 @@
 		}
 		id = tracker_db_interface_sqlite_get_last_insert_id (TRACKER_DB_INTERFACE_SQLITE (db_con->db));
 
+		// XESAM TODO:
+		// INSERT INTO Events (ServiceID, ..., EventType) VALUES (sid, ..., 'Create')
+
 		if (info->is_hidden) {
 			tracker_db_exec_no_reply (db_con,
 						  "Update services set Enabled = 0 where ID = %d",
@@ -3549,6 +3552,9 @@
 			tracker_exec_proc (db_con->common, "DeleteService7", path, name, NULL);
 			tracker_exec_proc (db_con->common, "DeleteService9", path, name, NULL);
 
+			// XESAM TODO:
+			// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Delete')
+
 			g_free (name);
 			g_free (path);
 		}
@@ -3637,8 +3643,16 @@
 	name = g_path_get_basename (info->uri);
 	path = g_path_get_dirname (info->uri);
 
+	// Comment by Philip Van Hoof:
+	// Please verify that str_service_type_id must be the first argument.
+	// Reading the file sqlite-stored-procs.sql this doesn't seem to be
+	// true
+
 	tracker_exec_proc (db_con->index, "UpdateFile", str_service_type_id, path, name, info->mime, str_size, str_mtime, str_offset, str_file_id, NULL);
-	
+
+	// XESAM TODO:
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
 	g_free (str_service_type_id);
 	g_free (str_size);
 	g_free (str_offset);
@@ -4248,6 +4262,9 @@
 	/* update db so that fileID reflects new uri */
 	tracker_exec_proc (db_con, "UpdateFileMove", path, name, str_file_id, NULL);
 
+	// XESAM TODO:
+	// INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
 	/* update File:Path and File:Filename metadata */
 	tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Path", path, FALSE);
 	tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Name", name, FALSE);