Re: Beagle-query and dates
I definitely like this idea. I think we need to address a few quick things about how it would be done.

1) We need an advanced query interface; that's the best way to do it. I know this has been on a wish list forever, maybe next SoC we can ask for it... (I think there are enough new features in the works as is, and if we get the cool metadata/RDF stuff figured out in the next few months, it would be a definite boon to anyone looking to help build an interface for more advanced queries.)

a) A sub-question here: adding just date spans to queries wouldn't be too hard (from an interface perspective) if we had a calendar control. I know Windows Forms has one, and I seem to remember Gtk::Calendar being an autocomplete option during a short stint of gtkmm work. I dunno if it's utility stuff or a calendar control, but if anyone knows whether it's available in the current (and stable) gtk#, then I wouldn't mind looking into it. (I know I'm building up quite a todo list; I'm hoping to just hammer through it one night in the near future once I get net.)

2) We already have the most recent modification and/or creation time stored in most cases; however, it would be cool to somehow keep a record of all modification times (say, to the nearest minute, not every one of a thousand events in a 30-second timeframe). I'm gonna put some thought into this, since just adding fields for every event is far too expensive for a variety of actions. But it's just something that could be useful for the metadata stuff I'm working on.

3) Date stuff doesn't always seem to be 100% reliable in beagle; we have lots of backends that are prone to wildly inaccurate dates etc. While we can get the FileSystemInfo or DirectoryInfo for file hits (and can in turn get modification/creation times in most cases), I think it wouldn't be out of the question to provide 2 new properties on Hit (LastModifiedTime and CreationTime) for these values.
While we do find these values in properties sometimes, I think we should try to make them more universal and stabilize our date information across beagle's backends. (This is based on some conversations had over 6 months ago; it's possible all this has been fixed, but at the very least the convention of making the creation/mod time part of the indexables and hits (not properties) is worth consideration.)

4) A small technicality when it comes to allowing searches with terms like 'yesterday': would we still require date:? That is, date:yesterday vs. yesterday (assuming we want to be able to search for documents containing the word yesterday, the bare form doesn't exactly work). And for multi-term phrases like '2 days ago', would we require quotes? date:"2 days ago" vs date:2 days ago. The first is a little harder to discover, so we would probably need to add it to our hint page. With the second, it is just impossible to intelligently work out what the user wants to do. (I think)

Cheers,
Kevin Kubasik
http://kubasik.net/blog

On 10/17/07, D Bera [EMAIL PROTECTED] wrote:

Beagle does support date queries, though. So if you knew the date, you could programmatically construct the extra query parameter, which would be something like: date:20071017

It was added post 0.2.16, which unfortunately means it's only in the svn trunk and not in 0.2.16.x, 0.2.17, 0.2.18 *sigh*.

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user

___ Dashboard-hackers mailing list Dashboard-hackers@gnome.org http://mail.gnome.org/mailman/listinfo/dashboard-hackers
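To make point 4 a bit more concrete, here is a rough Python sketch of how a front end might rewrite relative date terms into the date:YYYYMMDD form that D Bera mentions. This is not how Beagle parses queries; the function names, the small set of recognised phrases ('today', 'yesterday', 'N days ago') and the shell-style tokenising are all assumptions made for illustration.

```python
import re
import shlex
from datetime import date, timedelta


def relative_date_to_query(term, today=None):
    """Turn 'today', 'yesterday' or 'N days ago' into a date:YYYYMMDD term.

    Hypothetical helper: the accepted phrases are an assumption, not Beagle syntax.
    """
    today = today or date.today()
    text = term.strip().lower()
    if text == "today":
        delta = 0
    elif text == "yesterday":
        delta = 1
    else:
        match = re.fullmatch(r"(\d+)\s+days?\s+ago", text)
        if match is None:
            raise ValueError(f"unrecognised relative date: {term!r}")
        delta = int(match.group(1))
    return "date:" + (today - timedelta(days=delta)).strftime("%Y%m%d")


def rewrite_query(query, today=None):
    """Split a query shell-style and rewrite only the date: terms it understands."""
    rewritten = []
    for token in shlex.split(query):
        if token.startswith("date:"):
            try:
                token = relative_date_to_query(token[len("date:"):], today)
            except ValueError:
                pass  # leave absolute dates such as date:20071017 untouched
        rewritten.append(token)
    return rewritten
```

With the quotes, date:"2 days ago" arrives as a single token and can be rewritten; without them the tokeniser only sees date:2 followed by the ordinary words days and ago, which is the discoverability problem described above. A bare yesterday is left alone, so it still matches documents containing that word.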
Re: GSoC Weekly Report
Hi,

On 10/16/07, D Bera [EMAIL PROTECTED] wrote:

A followup question: I did not find any API documentation for Mono.Data.Sqlite :( #mono was also sleeping when I asked the question there.

My understanding is that both M.D.SqliteClient and M.D.Sqlite follow the general ADO.Net API patterns and that the latter is more or less a drop-in replacement for the former. A few things may need to be tweaked, but in general just changing the using statements at the top of each source file should be all that's needed.

If M.D.Sqlite does not have a way to return rows on demand, I am against the migration. In the worst case, we can ship with a modified copy of M.D.Sqlite, but I am not sure what that will buy us.

You've always been able to get rows on demand via ADO.Net; it's just a matter of the implementation underneath. The old one (not modified by us) would load all of them into memory. I'm not sure how the new one performs memory-wise. If the Mono guys don't have any idea, the right thing to do here would be to create a large test database (or use an existing TextCache or FAStore db), do a SELECT * using the 3 implementations and walk the results, using heap-buddy and/or heap-shot to analyze their memory usage.

In the same breath, what is the benefit of M.D.Sqlite over M.D.SqliteClient for beagle? I figured out there are some ADO.Net advantages but other than that ... ?

It's maintained, for one, which our modified one essentially isn't. It has the backing of the Mono team. The code is much cleaner and easier to understand, largely because it doesn't have two separate codepaths (one for v2 and one for v3). I am sure the Mono guys have other good reasons too. :)

Joe
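The shape of the experiment Joe describes (big table, SELECT *, walk the results, compare memory) can be sketched in a few lines. Note the substitutions: this uses Python's sqlite3 module and tracemalloc as stand-ins for the Mono providers and heap-buddy/heap-shot, and the table name, row count and payload size are made up. It only shows the difference between materialising every row and pulling rows on demand; it says nothing about how either Mono implementation actually behaves.

```python
import sqlite3
import tracemalloc


def build_test_db(rows=20000):
    """Create an in-memory database with a single large-ish table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE cache (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany(
        "INSERT INTO cache (payload) VALUES (?)",
        (("x" * 200 + str(i),) for i in range(rows)),
    )
    conn.commit()
    return conn


def walk_all_at_once(conn):
    """Load every row into a list first, then count them."""
    rows = conn.execute("SELECT * FROM cache").fetchall()
    return len(rows)


def walk_on_demand(conn):
    """Iterate the cursor so only one row object is alive at a time."""
    count = 0
    for _row in conn.execute("SELECT * FROM cache"):
        count += 1
    return count


def peak_memory(func, conn):
    """Run func(conn) and return (result, peak bytes of Python allocations)."""
    tracemalloc.start()
    try:
        result = func(conn)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak
```

Calling peak_memory(walk_all_at_once, conn) and peak_memory(walk_on_demand, conn) on the same connection should return the same row count with a much lower peak for the on-demand walk; the real test would do the equivalent with each of the three implementations.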
Re: System.InvalidOperationException: Invalid connection string
This is probably related. Now my beagle log is filling up with:

20071018 00:42:49.9363 09041 Beagle DEBUG: Unable to determine account name for [EMAIL PROTECTED]:993

Presumably one for each of the bogus folders under /home/brian/.evolution/mail/imap/[EMAIL PROTECTED]:993/folders/cur/subfolders/

Any ideas on how to clean this mess up? I've asked on the evolution list but nobody has responded.

It's something to do with the account_names for those folders as stored in gconf. I don't know much about these things ... maybe you can check the list at gconf:/apps/evolution/mail/accounts and see if there is any suspicious entry. Could be some bug in the Evolution backend too ...

- dBera

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user
Re: GSoC Weekly Report
A followup question: I did not find any API documentation for Mono.Data.Sqlite :( #mono was also sleeping when I asked the question there.

My understanding is that both M.D.SqliteClient and M.D.Sqlite follow the general ADO.Net API patterns and that the latter is more or less a drop-in replacement for the former. A few things may need to be tweaked, but in general just changing the using statements at the top of each source file should be all that's needed.

I was looking more for some method for row-by-row retrieval, on demand. Real on-demand, where the implementation does not retrieve all the rows at once but returns them one by one.

You've always been able to get rows on demand via ADO.Net; it's just a matter of the implementation underneath. The old one (not modified by us) would load all of them into memory. I'm not sure how the new one performs memory-wise. If the Mono guys don't have any idea, the right

I checked the source out of curiosity:
http://anonsvn.mono-project.com/viewcvs/trunk/mcs/class/Mono.Data.Sqlite/Mono.Data.Sqlite/
The code for DataReader looks exactly the same (I didn't do a diff, just compared visually) as the one in Mono.Data.SqliteClient. So even if we migrate (the migration would be easy), we still have to ship with a modified in-house M.D.Sqlite and keep syncing with upstream. *sigh*

- dBera

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user
Re: GSoC Weekly Report
Ignore my previous email ... I was looking at the wrong place :( This is the right place for the new M.D.Sqlite:
http://anonsvn.mono-project.com/viewcvs/trunk/mcs/class/Mono.Data.Sqlite/Mono.Data.Sqlite_2.0/SQLiteDataReader.cs

- dBera

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user
Re: Best way of using Beagle to index data CDs
On 10/17/07, D Bera [EMAIL PROTECTED] wrote:

Hi Nikolai,

Please direct me to the right mailing list if this isn't the appropriate one for things like this. There doesn't seem to be a dashboard-users mailing list.

This is the right place. Welcome aboard.

I would like to use Beagle to index my data CDs. I figured that static indexes were the way to go, but I haven't quite determined the best way of doing it. My initial idea was to create an index for each CD, so that I could simply remove the index associated with a CD if I threw out the CD. But then I started worrying about using hundreds of static indexes. My idea then was to merge the static indexes into a master index that I could rebuild whenever an index was added or removed, using $(beagle-manage-index merge). Does this make sense? Does anyone have a better suggestion?

Also, I figured that I wanted to add an attribute to each file indexed in this way that says what CD it is on, for example, 'disc:N', where N is an integer. There doesn't seem to be a way of doing this yet with beagle-build-index. Is this something that would be interesting to see as a patch, or are user-defined attributes outside the scope of Beagle?

You are more or less on the right track. As Kevin pointed out, one way is to leverage static-indexes. Due to the way static indexes work, it isn't directly possible to use that for a removable index. There is a --tag option to static indexes, which can be used to tag files when using beagle-build-index. You can use that to identify files from each medium. If you merge several indexes, there would be two kinds of problems:
1) Files that are not in the filesystem would not be reported (happens for any static index)
2) If there are files in different removable media but with same absolute path, then only one of them will be returned. And there might be more weirdness.

OK. Both of these are real problems.
Problem 2 can be solved by making sure that one uses --remap correctly to make each prefix unique, for example, /media/<disc id>/.

Problem 1 is a complete bummer. That makes beagle more or less unusable to this end. How do we solve this? It seems that you've basically solved both of these problems in BuildRemovableIndex.cs by introducing a new URI protocol (removable) for solving problem 1 and using media_name for solving problem 2. However, BuildRemovableIndex.cs hasn't been completed. It doesn't seem to be missing that much, though. How would one tell Beagle to report any removable:///* URI? I guess I'm not familiar enough with the structure of Beagle to know where to begin resolving these issues.
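Problem 2 is easy to see in miniature. The Python sketch below is only a model of the path collision: it does not call beagle-build-index or beagle-manage-index, and the function names and the /media/disc<N>/ prefix layout are assumptions chosen for the example.

```python
def merge_by_path(indexes):
    """Merge per-disc file lists keyed by absolute path; later discs overwrite earlier ones."""
    merged = {}
    for disc_id, paths in indexes.items():
        for path in paths:
            merged[path] = disc_id
    return merged


def merge_with_prefix(indexes, root="/media"):
    """Merge per-disc file lists after giving every disc its own (hypothetical) prefix."""
    merged = {}
    for disc_id, paths in indexes.items():
        for path in paths:
            unique_path = f"{root}/disc{disc_id}/{path.lstrip('/')}"
            merged[unique_path] = disc_id
    return merged
```

With two discs that both contain /photos/a.jpg, the first merge keeps only one entry for that path, while the prefixed merge keeps both because the keys no longer clash.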
HTML mimetype
Hey all,

I recently noticed that *.html files are getting detected as application/x-mozilla-bookmarks instead of the correct text/html! This is due to a weirdness in the xdgmime mime database (shared-mime-info), which recognizes *.html files as application/x-mozilla-bookmarks. Just for consolation, gnomevfs-info also makes the same mistake. I wonder what nautilus does?

And no, beagle's HTML filter does not index application/x-mozilla-bookmarks files. It's trivial to add the mimetype to the HTML filter, but I wonder if that is the right thing to do. Till this issue is resolved, don't be surprised if your html files are not indexed! The problem is partly due to shared-mime-info, so anybody with shared-mime-info-0.22 [1] will face the same problem.

Does anyone know anything?

- dBera

[1] http://webcvs.freedesktop.org/mime/shared-mime-info/freedesktop.org.xml.in?revision=1.246&view=markup

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user
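For anyone wondering why the misdetection means "not indexed at all", here is a toy Python model of a filter lookup keyed by mimetype. It is not Beagle's filter code: the registry shape and the filter name "FilterHtml" are assumptions used purely to illustrate the "trivial" option of adding the extra mimetype.

```python
def build_registry(html_mimetypes):
    """Map each supported mimetype to a (hypothetical) filter name."""
    return {mimetype: "FilterHtml" for mimetype in html_mimetypes}


def pick_filter(registry, detected_mimetype):
    """Return the filter registered for the detected mimetype, or None if nothing handles it."""
    return registry.get(detected_mimetype)
```

With only text/html registered, a file detected as application/x-mozilla-bookmarks gets no filter and is skipped; registering that mimetype as well makes the lookup succeed. Whether that is the right fix, as opposed to correcting the detection, is exactly the open question above.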
Migrate to Mono.Data.Sqlite (Was: Re: GSoC Weekly Report)
Ignore my previous email ... I was looking at the wrong place :( This is the right place for the new M.D.Sqlite:
http://anonsvn.mono-project.com/viewcvs/trunk/mcs/class/Mono.Data.Sqlite/Mono.Data.Sqlite_2.0/SQLiteDataReader.cs

Migration from Mono.Data.SqliteClient to Mono.Data.Sqlite completed (rev 4061).

-- - Debajyoti Bera @ http://dtecht.blogspot.com beagle / KDE fan Mandriva / Inspiron-1100 user
Re: Best way of using Beagle to index data CDs
You are more or less on the right track. As Kevin pointed out, one way is to leverage static-indexes. Due to the way static indexes work, it ... medium. If you merge several indexes, there would be two kinds of problems:
1) Files that are not in the filesystem would not be reported (happens for any static index)
2) If there are files in different removable media but with same absolute path, then only one of them will be returned. And there might be more weirdness.

Problem 2 can be solved by making sure that one uses --remap correctly to make each prefix unique, for example, /media/<disc id>/.

--remap doesn't work. I even think it was removed from svn trunk.

Problem 1 is a complete bummer. That makes beagle more or less unusable to this end. How do we solve this? It seems that you've basically solved both of these problems in BuildRemovableIndex.cs by introducing a new URI protocol (removable) for solving problem 1 and using media_name for solving problem 2. However, BuildRemovableIndex.cs hasn't been completed. It doesn't seem to be missing that much, though. How would one tell Beagle to report any removable:///* URI? I guess I'm not familiar enough with the structure of Beagle to know where to begin resolving these issues.

None of these issues requires too much internal detail of beagle, so I am trying to describe what's there and what's missing. Stop me if you miss something.

By nature, URIs should be unique. So the URI should be changed to use the media_name as well, e.g. removable://media_name/relative/path/to/file

Using the removable: scheme is just a way to capture the path and media name in a different kind of URL. I don't think it's standard, and beagle clients should interpret it as a removable media URL where the host of the URL is the name of the media and the path of the URL is the path of the file relative to where the media is mounted.
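To pin down the URL layout just described (media name as host, relative path as path), here is a small Python sketch that builds and splits such a URI. The helper names are hypothetical, and percent-encoding the media name and path is an assumption; the thread does not say how Beagle escapes these values.

```python
from urllib.parse import quote, unquote, urlsplit


def make_removable_uri(media_name, relative_path):
    """Build removable://media_name/relative/path, percent-encoding both parts."""
    return (
        "removable://"
        + quote(media_name, safe="")
        + "/"
        + quote(relative_path.lstrip("/"))
    )


def split_removable_uri(uri):
    """Return (media_name, relative_path) for a removable:// URI."""
    parts = urlsplit(uri)
    if parts.scheme != "removable":
        raise ValueError(f"not a removable URI: {uri!r}")
    # netloc is used rather than hostname so the media name keeps its case.
    return unquote(parts.netloc), unquote(parts.path.lstrip("/"))
```

A client that knows where the medium is mounted can then join its mount point with the relative path returned by split_removable_uri to get a local file path.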
Note: beagle-search and other clients out there don't yet know about removable media and would probably ignore such results. They need to be patched too.

BuildRemovableIndex.cs is just a smart wrapper around BuildIndex.cs which makes the above-mentioned changes. The static backends are handled by StaticQueryable.cs; http://svn.gnome.org/viewvc/beagle?view=revision&revision=3108 contained a modified StaticQueryable.cs which knew about removable media. When started, the backend would load the possible mappings from the config file and store them in a mapping_table. Every result passes through the backend just before it is returned. At that time, the modified StaticQueryable would take a removable media URL, extract the media_name and the relative path, use the mapping_table to get the mounted path for that media_name, append the relative path to the end of the mounted path and return the correct file:/// URL. If the media is not mounted, it would just report the original removable:// URL and set a flag saying media not found. The client can then suitably interact with the user. E.g. the client can either drop all the un-mounted URLs, or display them all and, when a user clicks on an unmounted URL, request the user to mount that medium and then open the file. The client interaction needs to be added to beagle-search.

Another major part that needs to be completed is deciding where/how to store the media_name info for the medium. I was thinking beagle-removable-index would work like this:

$ beagle-removable-index --build --medium medium_name [--config /path/to/new/config] --target /path/to/index/ ...

would create an index at the path pointed to by target (as it happens now). It would also store a removableconfig.xml file with the name of the medium (and other possible configuration values) at /path/to/new/config.
If --config is absent, the location will default to /path/to/index/removableconfig.xml

$ beagle-removable-index --mount [--config /path/to/config] --target /path/to/index

will inform the running beagled that a removable index at /path/to/index has been added. If --config ... is present, the name and other information are read from there; otherwise beagle tries to read the config information from /path/to/index/removableconfig.xml. The running beagled will inform StaticQueryable about the new medium being inserted, which will in turn store the medium_name and /path/to/index in its mapping table.

$ beagle-removable-index --unmount ...

similarly.

It doesn't _have_ to work this way. This is just what I thought would make everybody happy.

The last major piece which wasn't done (I think, I don't remember completely) is the real-time loading of new indexes. When StaticQueryable is informed about a new index and a mapping, and the index at /path/to/index is not already loaded, it should load the new static_index. This should not be too difficult; just call into QueryDriver.cs (see LoadRemovableMediaQueryables). If the index at /path/to/index is already loaded, then just update the (medium_name, path) mapping. Lucene allows beagle to silently update the index in the
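Putting the mount/unmount bookkeeping and the result rewriting together, a rough Python model of the mapping table might look like this. It is not the StaticQueryable.cs code: the class and method names are made up, the table here maps a medium name to the path where the medium is mounted (an assumption, chosen so that results can be turned into file:/// URLs), and the "media not found" flag is modelled as a boolean in the return value.

```python
import posixpath
from urllib.parse import quote, unquote, urlsplit


class MediaMappingTable:
    """Hypothetical model of the medium_name -> mounted path table."""

    def __init__(self):
        self._mounts = {}

    def mount(self, media_name, mounted_path):
        """Record (or update) where a medium is currently mounted."""
        self._mounts[media_name] = mounted_path

    def unmount(self, media_name):
        """Forget a medium; unknown names are ignored."""
        self._mounts.pop(media_name, None)

    def resolve(self, uri):
        """Return (uri_to_report, media_found) for a result URI."""
        parts = urlsplit(uri)
        if parts.scheme != "removable":
            return uri, True  # ordinary results pass through unchanged
        mounted_path = self._mounts.get(unquote(parts.netloc))
        if mounted_path is None:
            return uri, False  # report the original URI and flag it as not mounted
        relative = unquote(parts.path).lstrip("/")
        return "file://" + quote(posixpath.join(mounted_path, relative)), True
```

A client receiving (uri, False) could either drop the hit or show it and ask the user to insert the medium, which is the interaction described above as still missing from beagle-search.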
Re: HTML mimetype
On 19/10/2007, Debajyoti Bera [EMAIL PROTECTED] wrote:

<snip>

And no, beagle's HTML filter does not index application/x-mozilla-bookmarks files. It's trivial to add the mimetype to the HTML filter, but I wonder if that is the right thing to do. Till this issue is resolved, don't be surprised if your html files are not indexed! The problem is partly due to shared-mime-info, so anybody with shared-mime-info-0.22 [1] will face the same problem. Does anyone know anything?

Found this 2-month-old bug: https://bugs.freedesktop.org/show_bug.cgi?id=11843.

-- Arun Raghavan (http://nemesis.accosted.net) v2sw5Chw4+5ln4pr6$OFck2ma4+9u8w3+1!m?l7+9GSCKi056 e6+9i4b8/9HTAen4+5g4/8APa2Xs8r1/2p5-8 hackerkey.com