[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-14 Thread noreply
The proposal to merge lp:~mhr3/zeitgeist/fts-secondary-sorting into 
lp:zeitgeist has been updated.

Status: Needs review => Merged

For more details, see:
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is subscribed to branch lp:zeitgeist.

___
Mailing list: https://launchpad.net/~zeitgeist
Post to : zeitgeist@lists.launchpad.net
Unsubscribe : https://launchpad.net/~zeitgeist
More help   : https://help.launchpad.net/ListHelp


Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-11 Thread Michal Hruby
So, how about now?
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist.



Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-11 Thread Michal Hruby
I wouldn't really like to change the semantics of the Search() method; somehow 
magically it approximates the results quite ok (plus it'd be weird if _SUBJECTS 
groupings worked perfectly and the others just ok-ish).
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is subscribed to branch lp:zeitgeist.



Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-08 Thread Michal Hruby
I'm starting to think that doing the secondary sorting in FTS isn't a good 
idea. We're sending the relevancies to the client, so we should keep the full 
Zeitgeist sorting, and since the client has the relevancies, it can do this 
kind of sort itself (or not).
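
A minimal sketch of what such a client-side sort could look like, assuming the 
client receives a list of events plus a parallel list of relevancies. The 
function name and the rounding precision are hypothetical, not part of any 
Zeitgeist API. It relies on Python's sort being stable, so events whose 
relevancies fall into the same bucket keep the order Zeitgeist returned them in.

```python
def sort_by_relevancy(events, relevancies, precision=1):
    """Order events by relevancy, most relevant first.

    Relevancies are rounded to `precision` decimals so that near-equal
    scores count as ties; ties keep their original (Zeitgeist) order
    because list.sort() is stable.
    """
    paired = list(zip(events, relevancies))
    paired.sort(key=lambda pair: -round(pair[1], precision))
    return [event for event, _ in paired]


# Hypothetical data: "b" and "c" tie at 0.9 and stay in their original order.
print(sort_by_relevancy(["a", "b", "c", "d"], [0.52, 0.91, 0.93, 0.12]))
```

Rounding into buckets keeps the comparison transitive, which a pairwise 
"difference below a threshold" check does not guarantee.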
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist.



Re: [Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-08 Thread Siegfried Gevatter
As discussed on IRC, I don't really like how this is ending up. We should look 
into re-architecting FTS at some point, starting from the assumption that it's 
only for searching current documents (so it may change from storing all events 
to storing one of each subject + event information, or whatever). But since 
this stuff is supposed to be working in Precise, I guess it's fine to go with 
the workaround for now.

The main problem I see with the MP right now is that it's just looking at the 
URI when re-requesting the events, so if the request was for something 
particular (especially event interpretation, event manifestation or actor, 
since the subject data could be seen as somewhat more constant) it's likely to 
end up giving a wrong sort of event.

A possible way of fixing this would be merging the uri templates with the 
request templates (the ones used in CompileEventFilterQuery). The trivial 
implementation would go something like:

tmpls = []
for template in templates:
  for uri in uris:
    tmpl = copy(template)
    tmpl.subject_uri = ...
    tmpls.append(tmpl)

However, it may end up generating really big SQL queries (e.g. consider just two 
templates for subject_interpretation={Music,Video} and a limit of 100 events; 
that becomes 200 templates with subject_interpretation and 
subject_manifestation, which is 400 conditions in the generated SQL).
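
To make the size estimate concrete, here is a runnable sketch of the merge 
described above. The Template class is a hypothetical stand-in, not the 
Zeitgeist one, and it assumes each merged template simply gets the URI 
assigned as its subject_uri. It counts one condition per field that is set, 
so each merged template here contributes two (the interpretation plus the URI).

```python
from copy import copy
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Template:
    """Hypothetical stand-in for an event template (not the Zeitgeist class)."""
    subject_interpretation: Optional[str] = None
    subject_uri: Optional[str] = None


def merge_templates(templates: List[Template], uris: List[str]) -> List[Template]:
    """Cross every request template with every URI returned by the search."""
    merged = []
    for template in templates:
        for uri in uris:
            tmpl = copy(template)
            tmpl.subject_uri = uri  # assumption: the URI goes into subject_uri
            merged.append(tmpl)
    return merged


def count_conditions(templates: List[Template]) -> int:
    """Count one condition per field that is set on each template."""
    return sum(1 for t in templates for v in vars(t).values() if v is not None)


templates = [Template(subject_interpretation="Music"),
             Template(subject_interpretation="Video")]
uris = ["file:///tmp/doc%d" % i for i in range(100)]  # made-up URIs

merged = merge_templates(templates, uris)
print(len(merged), count_conditions(merged))
```

With two templates and 100 URIs this yields 200 merged templates and 400 
conditions, which is the growth the estimate above warns about.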
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist.



[Zeitgeist] [Merge] lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist

2012-03-07 Thread Michal Hruby
Michal Hruby has proposed merging lp:~mhr3/zeitgeist/fts-secondary-sorting into 
lp:zeitgeist.

Requested reviews:
  Zeitgeist Framework Team (zeitgeist)

For more details, see:
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479

Implements secondary sorting based on ResultType in the SearchWithRelevancies 
method.
-- 
https://code.launchpad.net/~mhr3/zeitgeist/fts-secondary-sorting/+merge/96479
Your team Zeitgeist Framework Team is requested to review the proposed merge of 
lp:~mhr3/zeitgeist/fts-secondary-sorting into lp:zeitgeist.
=== modified file 'extensions/fts++/indexer.cpp'
--- extensions/fts++/indexer.cpp	2012-03-07 16:08:26 +
+++ extensions/fts++/indexer.cpp	2012-03-07 22:37:19 +
@@ -23,6 +23,7 @@
 #include <xapian.h>
 #include <queue>
 #include <vector>
+#include <cmath>
 
 #include <gio/gio.h>
 #include <gio/gdesktopappinfo.h>
@@ -804,7 +805,6 @@
 
   if (event_templates->len > 0)
   {
-    ZeitgeistTimeRange *time_range = zeitgeist_time_range_new_anytime ();
     results = zeitgeist_db_reader_find_events (zg_reader,
                                                time_range,
                                                event_templates,
@@ -813,8 +813,6 @@
                                                result_type,
                                                NULL,
                                                error);
-
-    g_object_unref (time_range);
   }
   else
   {
@@ -841,6 +839,34 @@
   return results;
 }
 
+static gint
+sort_events_by_relevance (gconstpointer a, gconstpointer b, gpointer user_data)
+{
+  gdouble rel1 = 0.0;
+  gdouble rel2 = 0.0;
+  std::map<unsigned, gdouble>::const_iterator it;
+  ZeitgeistEvent **e1 = (ZeitgeistEvent**) a;
+  ZeitgeistEvent **e2 = (ZeitgeistEvent**) b;
+  std::map<unsigned, gdouble> const relevancy_map =
+    *(static_cast<std::map<unsigned, gdouble>*> (user_data));
+
+  it = relevancy_map.find (zeitgeist_event_get_id (*e1));
+  if (it != relevancy_map.end ()) rel1 = it->second;
+
+  it = relevancy_map.find (zeitgeist_event_get_id (*e2));
+  if (it != relevancy_map.end ()) rel2 = it->second;
+
+  gdouble delta = rel1 - rel2;
+  if (fabs (delta) < 0.1)
+  {
+    // relevancy of both items is the same, let's make use of stable sort
+    return e1 > e2 ? 1 : -1;
+  }
+
+  // we want the higher ranked events first
+  return (delta < 0) ? 1 : -1;
+}
+
 GPtrArray* Indexer::SearchWithRelevancies (const gchar *search,
ZeitgeistTimeRange *time_range,
GPtrArray *templates,
@@ -860,24 +886,51 @@
 
     guint maxhits = count;
 
-    if (result_type == RELEVANCY_RESULT_TYPE)
-    {
-      enquire->set_sort_by_relevance ();
-    }
-    else
-    {
-      enquire->set_sort_by_value (VALUE_TIMESTAMP, true);
-    }
-
     if (storage_state != ZEITGEIST_STORAGE_STATE_ANY)
     {
       g_set_error_literal (error,
                            ZEITGEIST_ENGINE_ERROR,
                            ZEITGEIST_ENGINE_ERROR_INVALID_ARGUMENT,
-                           "Only ANY stogate state is supported");
+                           "Only ANY storage state is supported");
       return NULL;
     }
 
+    if (result_type == RELEVANCY_RESULT_TYPE)
+    {
+      enquire->set_sort_by_relevance ();
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_EVENTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_EVENTS)
+    {
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_EVENT_ID);
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_MOST_POPULAR_SUBJECTS ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_POPULAR_SUBJECTS)
+    {
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_URI_HASH);
+    }
+    else if (result_type == ZEITGEIST_RESULT_TYPE_MOST_RECENT_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_RECENT_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_MOST_POPULAR_ORIGIN ||
+             result_type == ZEITGEIST_RESULT_TYPE_LEAST_POPULAR_ORIGIN)
+    {
+      // FIXME: not really correct but close :)
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_URI_HASH);
+      maxhits *= 3;
+    }
+    else
+    {
+      // throw an error for these?
+      enquire->set_sort_by_relevance_then_value (VALUE_TIMESTAMP, true);
+      enquire->set_collapse_key (VALUE_EVENT_ID);
+      maxhits *= 3;
+    }
+
     Xapian::Query q(query_parser->parse_query (query_string, QUERY_PARSER_FLAGS));
     enquire->set_query (q);
     Xapian::MSet hits (enquire->get_mset (offset, maxhits));
@@ -906,6 +959,8 @@
 NULL