FYI,

The diff contains a first look at the tracker-db-sqlite.c file. I added
some comments that illustrate how a journal table "Events" will be
filled up.

Note that the table will most likely become a sqlite memory table.

The reason why I don't think a GHashTable in the C code is as good is
that we want to repeat the query in the TrackerXesamLiveSearch on
this "Events" table (for example with an INNER JOIN with Services).

If it were a GHashTable, that query would either need a lot of OR
clauses (one per ServiceID) or we'd need to do a query for each
item in the table to check whether the items affect a live search.

/me is the master of pseudo code, here I go again!

For each query in live-search-queries do

  // This one sounds like the best to me. It requires an In-Sqlite
  // In-Memory table called "Events"

  SELECT ... FROM Events, Services ...
  WHERE Events.ServiceID = Services.ID
  AND the live-search-query
  AND (ServiceID is in the table)

  // Pro: short arguments list, easy query
  // Con: JOIN (although the cartesian product is relatively small)

or

  // This one doesn't need an "Events" table in sqlite but does need an
  // In-C In-Memory GHashTable holding all the affected ServiceIDs

  SELECT ... FROM Services ...
  WHERE the live-search-query
  AND (
    ServiceID = hashtable[0].key
    OR ServiceID = hashtable[1].key
    OR ServiceID = hashtable[2].key
    ...
    OR ServiceID = hashtable[n].key
  )

  // Pro: no JOIN
  // Con: long arguments list

done
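
To make the "long arguments list" con of the second variant a bit more
concrete, here is a rough C sketch of assembling that OR fragment. It is
not Tracker code: build_service_id_clause is a name made up for this mail,
and a plain array of IDs stands in for the keys of the GHashTable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of Tracker. Writes
 * "(ServiceID = a OR ServiceID = b ...)" for the n given ids into buf.
 * Returns the number of characters written (excluding the NUL), or -1
 * when there are no ids or buf is too small. */
static int
build_service_id_clause (const unsigned int *ids, size_t n,
                         char *buf, size_t buf_len)
{
	size_t used = 0;
	size_t i;

	assert (buf != NULL);

	if (n == 0 || buf_len == 0)
		return -1;

	for (i = 0; i < n; i++) {
		/* The first term opens the parenthesis, later ones add " OR " */
		int w = snprintf (buf + used, buf_len - used, "%sServiceID = %u",
		                  i == 0 ? "(" : " OR ", ids[i]);

		/* snprintf reports the untruncated length: detect overflow */
		if (w < 0 || (size_t) w >= buf_len - used)
			return -1;

		used += (size_t) w;
	}

	/* Room is needed for the closing ")" and the terminating NUL */
	if (used + 2 > buf_len)
		return -1;

	buf[used++] = ')';
	buf[used] = '\0';

	return (int) used;
}
```

The fragment grows by one term per affected ServiceID, so its length
depends on how many items the indexer touched since the last flush. With
the "Events" table variant the query text stays the same size no matter
how many events are pending.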
On Tue, 2008-04-29 at 17:56 +0200, Philip Van Hoof wrote:
> Pre note:
>
> This is about the Xesam support being done (since this week) in the
> indexer-split.
>
> About:
>
> Xesam requires notifying live searches about changes that affect them.
> We plan to implement this with a "events" table that journals all
> creates, deletes and updates that the indexer causes.
>
> Periodically we will handle and then flush the items in that events
> table.
>
> I made a cracktasty diagram that contains the from-a-high-distance
> abstract proposal that we have in mind for this.
>
>
> This is pseudo code that illustrates the periodic handler:
>
> bool periodic_handler (...)
> {
>   lock indexer
>   update eventstable set beinghandled=1 where 1=1 (all items)
>   unlock indexer
>
>   foreach query in all livequeries
>     added, modified, removed = query.execute-on (eventstable)
>     query.emit_added (added)
>     query.emit_removed (removed)
>     query.emit_modified (modified)
>   done
>
>   lock indexer
>   delete from eventstable where beinghandled = 1
>   unlock indexer
>
>   return (!stopping)
> }
>
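
A small C sketch of the mark-then-purge idea in the handler above, in case
it helps the discussion. Everything in it is an assumption made for
illustration: the Journal and Event types, the fixed capacity and the
journal_* function names are invented here, a plain array stands in for
the events table, and the actual locking of the indexer is left out
entirely.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_EVENTS 64   /* arbitrary capacity for this sketch */

typedef enum { EVENT_CREATE, EVENT_UPDATE, EVENT_DELETE } EventType;

typedef struct {
	unsigned int service_id;
	EventType    type;
	int          being_handled;   /* the "beinghandled" column */
} Event;

typedef struct {
	Event  items[MAX_EVENTS];
	size_t count;
} Journal;

/* Indexer side: journal one create, update or delete.
 * Returns 0 on success, -1 when the journal is full. */
static int
journal_add (Journal *j, unsigned int service_id, EventType type)
{
	assert (j != NULL);

	if (j->count >= MAX_EVENTS)
		return -1;

	j->items[j->count].service_id = service_id;
	j->items[j->count].type = type;
	j->items[j->count].being_handled = 0;
	j->count++;

	return 0;
}

/* First critical section: "update eventstable set beinghandled=1".
 * Returns the number of events marked. */
static size_t
journal_mark_all (Journal *j)
{
	size_t i;

	for (i = 0; i < j->count; i++)
		j->items[i].being_handled = 1;

	return j->count;
}

/* Second critical section: "delete from eventstable where
 * beinghandled = 1". Keeps unmarked events in order and returns the
 * number of events removed. */
static size_t
journal_purge_handled (Journal *j)
{
	size_t i, kept = 0, removed;

	for (i = 0; i < j->count; i++) {
		if (!j->items[i].being_handled)
			j->items[kept++] = j->items[i];
	}

	removed = j->count - kept;
	j->count = kept;

	return removed;
}
```

The point of the flag is that an event journaled between the two critical
sections is not marked, so the purge leaves it in place for the next run
of the handler.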
>
> Here's a piece of IRC log between me and jamiemcc about the proposal:
>
> pvanhoof ping jamiemcc
> pvanhoof same thing
> pvanhoof I'll make a pdf
> jamiemcc oh ok
> pvanhoof Sending
> pvanhoof ok
> pvanhoof so
> pvanhoof it's about the hitsadded, hitsremoved and hitsmodified signals for
> xesam
> pvanhoof What we have in mind is using a "events" table that is a journal for
> all creates, deletes and updates
> pvanhoof Periodically we will flush that table, each create (insert), update
> and each delete we add a record in that table
> pvanhoof We'll make sure the table is queryable in a similar fashion as how
> the Xesam query will execute
> pvanhoof In the periodical handler we'll for each live search check whether
> it got affected by the items in the events table
> pvanhoof In pseudo, the handler:
> jamiemcc sounds feasible
> pvanhoof gboolean periodic_handler (void *data) {
> pvanhoof lock indexer
> pvanhoof update eventstable set beinghandled=1 where 1=1 (all items)
> pvanhoof unlock indexer
> pvanhoof foreach query in all live queries
> pvanhoof added, modified, removed = query.execute-on (eventstable)
> pvanhoof query.emit_added (added)
> pvanhoof query.emit_removed (removed)
> pvanhoof query.emit_modified (modified)
> pvanhoof done
> pvanhoof lock indexer
> pvanhoof delete from eventstable where beinghandled = 1
> pvanhoof unlock indexer
> pvanhoof }
> pvanhoof I've sent you a diagram that you can look at as if it's a
> state-activity one, an ERD and a class diagram :) now how cool is that?? :)
> pvanhoof it's just three columns, although the ERD is quite simplistic of
> course
> jamiemcc yeah just got it
> * fritschy ([EMAIL PROTECTED]) has left #tracker
> pvanhoof so, the current idea is to adapt those stored procedures into
> transactions that will also add this record to the "events" table
> * fritschy ([EMAIL PROTECTED]) has joined #tracker
> pvanhoof Which might not be sufficient, and we kinda lack the in-depth
> know-how of all the db handling of tracker
> pvanhoof So that's a first issue we want to discuss with you
> pvanhoof The other is stopping the indexing, restarting it (locking it, in
> the pseudo code): what you think about that
> jamiemcc ok I will need to think about it - I will probably reply later
> tonight and we can discuss tomorrow
> pvanhoof I adapted my initial proposal to have two short critical sections
> rather than letting the entire periodic handler be one critical section
> pvanhoof that way the lock is smaller
> jamiemcc the indexer will be a separate process so will need to be locked via
> dbus signals
> pvanhoof by just adding a column to the events table
> pvanhoof yes but I guess we want any such locking to be short
> jamiemcc well yes
> pvanhoof then once the items that are to be handled are identified, we for
> each live-search check whether the live-search is affected
> pvanhoof and we perform the necessary hitsadded, hitsremoved and hitsmodified
> signals if needed
> pvanhoof if all is done, we simply purge the handled items from the events
> table
> jamiemcc the query results will be stored in temp tables
> pvanhoof which is the second location where we want the indexer to be
> locked-out
> jamiemcc remember a query may be a cursor so wont include entire result set
> pvanhoof No okay, but that's something the check needs to worry about
> pvanhoof so ottela is working on a query for the live-search
> jamiemcc ok cool
> pvanhoof and if we only want to update if the client has the affected item
> visible, due to cursor-usage
> pvanhoof then i guess we'll somehow need to get that info into trackerd
> jamiemcc any reason we don't store what's changed in memory rather than sqlite
> table?
> pvanhoof oh, that's abstract right now
> jamiemcc o
> jamiemcc ok
> pvanhoof "tracker's event table" can also be a hashtable for me ..
> jamiemcc yeah fine
> pvanhoof implementation detail
> pvanhoof since it doesn't need to be persistent ...
> pvanhoof difference is that either we use a memory table and still a
> transaction for the three stored procedures
> pvanhoof or we adapt code
> jamiemcc prefer hashtable as amount of data will be small
> jamiemcc can even be a list
> pvanhoof ok, your comments/ideas on this would of course be very useful btw
> jamiemcc yeah I will think about it more tonight and get back to you
> pvanhoof sounds great
> pvanhoof I'll make a mail about this to the mailing list? or I await your
> ideas tomorrow?
> pvanhoof I'll just wait for now
> jamiemcc you can mail if you like
> jamiemcc I will reply to it
>
>
> _______________________________________________
> tracker-list mailing list
> [email protected]
> http://mail.gnome.org/mailman/listinfo/tracker-list
--
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be
gnome: pvanhoof at gnome dot org
http://pvanhoof.be/blog
http://codeminded.be
Index: src/trackerd/tracker-db-sqlite.c
===================================================================
--- src/trackerd/tracker-db-sqlite.c (revision 1330)
+++ src/trackerd/tracker-db-sqlite.c (working copy)
@@ -3346,6 +3346,9 @@
}
id = tracker_db_interface_sqlite_get_last_insert_id (TRACKER_DB_INTERFACE_SQLITE (db_con->db));
+ // XESAM TODO
+ // INSERT INTO Events (ServiceID, ..., EventType) VALUES (sid, ..., 'Create')
+
if (info->is_hidden) {
tracker_db_exec_no_reply (db_con,
"Update services set Enabled = 0 where ID = %d",
@@ -3549,6 +3552,9 @@
tracker_exec_proc (db_con->common, "DeleteService7", path, name, NULL);
tracker_exec_proc (db_con->common, "DeleteService9", path, name, NULL);
+ // XESAM TODO:
+ // INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Delete')
+
g_free (name);
g_free (path);
}
@@ -3637,8 +3643,16 @@
name = g_path_get_basename (info->uri);
path = g_path_get_dirname (info->uri);
+ // Comment by Philip Van Hoof:
+ // Please verify that str_service_type_id must be the first argument.
+ // Reading the file sqlite-stored-procs.sql this doesn't seem to be
+ // true
+
tracker_exec_proc (db_con->index, "UpdateFile", str_service_type_id, path, name, info->mime, str_size, str_mtime, str_offset, str_file_id, NULL);
-
+
+ // XESAM TODO:
+ // INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
g_free (str_service_type_id);
g_free (str_size);
g_free (str_offset);
@@ -4248,6 +4262,9 @@
/* update db so that fileID reflects new uri */
tracker_exec_proc (db_con, "UpdateFileMove", path, name, str_file_id, NULL);
+ // XESAM TODO:
+ // INSERT INTO Events (ServiceID, ..., EventType) VALUES (str_file_id, ..., 'Update')
+
/* update File:Path and File:Filename metadata */
tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Path", path, FALSE);
tracker_db_set_single_metadata (db_con, "Files", str_file_id, "File:Name", name, FALSE);