Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread Chenthill
On Wed, 2011-05-04 at 13:04 +0200, sean finney wrote:
 Hi Everyone,
 
 I spoke with chen on IRC this morning and he pointed me to a preliminary
 implementation of EBookBackendSqliteDB sitting in -ews.  Since there
 would be some benefit in something like this making its way to
 a common place that could be used by -mapi as well, I thought I'd do
 a quick feasibility review to see what problems there might be.
 
 Questions/comments/suggestions follow.  Please let me know what you
 think!
 
  * No backend _get_contact/_get_contacts equivalent.  Should be
easily implemented.
_get_vcard_string == _get_contact. I have not added an API that returns
an EContact, so that callers can decide whether they want to parse the
string into an EContact.

I have not seen any use case where the backends need get_contacts.
_book_backend_sqlitedb_search would serve the _get_contact_list API in
the backend, and would also handle querying with a search query to fetch
the contact list.

  * _add_contact/_remove_contact should be renamed to 
_add_contacts/_remove_contacts to be consistent with other backend
methods that take lists.
Makes sense as it already acts on multiple contacts.

  * but also having a _add_contact/_remove_contact that takes just a uid
(similar to other backends) would be useful
remove_contacts already takes only a uid. I am not sure how helpful an
_add_contact that takes just the uid would be. Which backend would
need it?

  * -mapi seems to use one cache per-profile-per-folder, but the sqlitedb
backend takes these as calling parameters.  Not really a problem and
I think there may be reasons to have one cache db anyway, so this is
just more of an observation.

  * _get/_set/_delete interfaces are needed for cache metadata (last modified,
etc).
Am working on it atm.

  * if folder metadata is going to be free-form, it could be better to have
a key-value table ( folder_id_id int, key_name text, value text ) rather
than arbitrarily numbered text/binary fields.
I was thinking of allowing the backends to store key-value pairs using a
bdata column which could be populated with XML key-value data. Would
that be a good idea?

  * not sure of this one: given there may be multithreaded access to the db,
do we need to provide any external big locks on reads/writes?  maybe
the built in sqlite stuff is sufficient.
  * not sure of this one: beyond the COMMIT statements, should there be
something to periodically sync the db beyond the backend finalize 
 method?  
AFAIK the sqlite VFS takes care of that, and commit should be enough.

Unsure whether commit is sufficient to get a consistent on-disk state in
case of a crash, etc.
  * do we need a set_populated/is_populated equivalent?  or maybe that could
be solved in the cases it's needed with metadata.
I think I added it and later removed it, thinking it might be redundant
with the sync_data column, but on reflection the two are clearly
independent. Will get that added...

  * do we need a set_time/get_time equivalent?  or maybe that could
be solved in the cases it's needed with metadata.
There is a sync_data column which can be used for that, holding either a
last_modified date, a sequence number, or some other synchronization text.
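
On the question above of whether COMMIT alone is enough after a crash: as
far as I know, sqlite decides how hard to flush at commit time through its
"synchronous" pragma, so the cache could ask for the stricter setting
explicitly instead of adding a periodic sync. A minimal sketch using Python's
sqlite3 module; the file name, the table and the open_cache helper are made
up for illustration and are not the EBookBackendSqliteDB schema.

```python
import os
import sqlite3
import tempfile


def open_cache(path):
    """Open a cache db and ask sqlite to flush to disk on every commit."""
    conn = sqlite3.connect(path)
    # FULL makes sqlite wait for the OS to report the data as written
    # before a COMMIT returns (reported back as the numeric value 2).
    conn.execute("PRAGMA synchronous = FULL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contacts (uid TEXT PRIMARY KEY, vcard TEXT)"
    )
    return conn


# Hypothetical cache location, only for this sketch.
db_path = os.path.join(tempfile.mkdtemp(), "contacts.db")
conn = open_cache(db_path)

with conn:  # the context manager commits on success, rolls back on error
    conn.execute(
        "INSERT INTO contacts VALUES (?, ?)",
        ("uid-1", "BEGIN:VCARD\nFN:Test\nEND:VCARD"),
    )

sync_level = conn.execute("PRAGMA synchronous").fetchone()[0]
row_count = conn.execute("SELECT COUNT(*) FROM contacts").fetchone()[0]
conn.close()
```

Whether FULL is worth its cost for a cache that can be refetched from the
server is a separate judgment call.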

 
 @chen: I don't know how active you plan to be on this, but if you're looking
 to offload any work, I can pick up anything that results from the above if
 you like.  Just let me know!
The work is almost done. I will let you know once I finish the testing,
and you can then make changes directly if you need anything more there :)

- Chenthill.
 
 
   Sean
 


___
evolution-hackers mailing list
evolution-hackers@gnome.org
To change your list options or unsubscribe, visit ...
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread Chenthill
On Thu, 2011-05-05 at 11:20 +0530, Chenthill wrote:
 
   * not sure of this one: given there may be multithreaded access to
 the db,
 do we need to provide any external big locks on reads/writes?
 maybe 
Though sqlite has it, I have read in the FAQ that it recommends
applications not perform any read operation while a write is in
progress. I did not want our app to loop on the _BUSY message, so I have
added a RW lock to avoid that.
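
A rough sketch of the read/write lock idea, in Python rather than the C/GLib
of the actual backend. The RWLock class and its method names are
hypothetical; it only illustrates letting readers share the db while a
writer gets it exclusively, so that nobody has to retry on _BUSY.

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Many concurrent readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writing = False

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:          # wait for the writer to finish
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:    # last reader wakes waiting writers
                    self._cond.notify_all()

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()


lock = RWLock()
cache = {}   # stand-in for the sqlite file
seen = []


def writer(n):
    with lock.write():
        cache["count"] = cache.get("count", 0) + n


def reader():
    with lock.read():
        seen.append(cache.get("count", 0))


threads = [threading.Thread(target=writer, args=(1,)) for _ in range(4)]
threads += [threading.Thread(target=reader) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

This simple version can starve writers under a steady stream of readers; a
real implementation would probably want some writer preference.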

- Chenthill.




Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread sean finney
Hi!

On Thu, May 05, 2011 at 11:20:45AM +0530, Chenthill wrote:
   * No backend _get_contact/_get_contacts equivalent.  Should be
 easily implemented.
 _get_vcard_string == _get_contact. I have not added an API that returns
 an EContact, so that callers can decide whether they want to parse the
 string into an EContact.

Ah, yes, I think that would work fine.

 I have not seen any use case where the backends need get_contacts.
 _book_backend_sqlitedb_search would serve the _get_contact_list API in
 the backend, and would also handle querying with a search query to fetch
 the contact list.

Right, so I think that whole bullet point could be discarded.

   * _add_contact/_remove_contact should be renamed to 
 _add_contacts/_remove_contacts to be consistent with other backend
 methods that take lists.
 Makes sense as it already acts on multiple contacts.
 
   * but also having a _add_contact/_remove_contact that takes just a uid
 (similar to other backends) would be useful
 remove_contacts already takes only a uid. I am not sure how helpful an
 _add_contact that takes just the uid would be. Which backend would
 need it?

Okay, I think I worded this one poorly.  What I meant was having the
singular form of _add_contacts/_remove_contacts (that doesn't use
a GSList but instead a single contact).  So that the calling application
doesn't need to make a 1-item list every time some async callback
acts on a single contact.
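
The singular forms could be thin wrappers over the list-taking calls, so
callers never build the one-item list themselves. A sketch in Python;
SketchCache and its methods are invented stand-ins, not the real
EBookBackendSqliteDB functions.

```python
class SketchCache:
    """Toy stand-in for the cache; maps uid -> vcard string."""

    def __init__(self):
        self._vcards = {}

    def add_contacts(self, contacts):
        # contacts: iterable of (uid, vcard) pairs
        for uid, vcard in contacts:
            self._vcards[uid] = vcard

    def remove_contacts(self, uids):
        for uid in uids:
            self._vcards.pop(uid, None)

    # Singular conveniences: wrap the argument in a one-item list.
    def add_contact(self, uid, vcard):
        self.add_contacts([(uid, vcard)])

    def remove_contact(self, uid):
        self.remove_contacts([uid])

    def uids(self):
        return sorted(self._vcards)


cache = SketchCache()
cache.add_contacts([("a", "BEGIN:VCARD..."), ("b", "BEGIN:VCARD...")])
cache.add_contact("c", "BEGIN:VCARD...")   # e.g. from an async callback
cache.remove_contact("a")
```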

   * if folder metadata is going to be free-form, it could be better to have
 a key-value table ( folder_id_id int, key_name text, value text ) rather
 than arbitrarily numbered text/binary fields.
 I was thinking of allowing the backends to store key-value pairs using a
 bdata column which could be populated with XML key-value data. Would
 that be a good idea?

My own preference would be for something leaner that does not require XML,
since it would embed one structured/serialized format (XML) within
another (an sqlite column), which I suspect would result in code more
complicated than it needs to be (getting/setting plus
serializing/unserializing vs. just getting/setting, especially with
multiple threads, is what jumps to mind).

But I don't have a particularly strong feeling on this, and it's probably
never going to be enough on the critical path to matter.
It's more of a gut feeling about how the metadata would be used,
and how it might be simpler/safer/cleaner/faster on the implementation
side if key/value storage were used to mirror the key/value API, unless
there's a pressing reason to have XML.

But I will defer to what you and the other evo folks think, since ultimately
the caller shouldn't be too concerned with the implementation details,
as long as the API provides the key/value functionality.
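
To make the extra steps concrete, here is a sketch of what setting one key
looks like when the pairs live as XML inside a single bdata column. Python's
sqlite3 and xml.etree are used for brevity; the table layout, the element
names and the function names are assumptions for illustration only.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE folders (folder_id TEXT PRIMARY KEY, bdata TEXT)")
conn.execute("INSERT INTO folders VALUES ('inbox', '<keys/>')")


def set_key_xml(conn, folder_id, key, value):
    """Read the blob, parse it, change one key, serialize, write it back."""
    (blob,) = conn.execute(
        "SELECT bdata FROM folders WHERE folder_id = ?", (folder_id,)
    ).fetchone()
    root = ET.fromstring(blob)
    for node in root.findall("key"):
        if node.get("name") == key:
            node.text = value
            break
    else:
        # Key not present yet: append a new <key name="..."> element.
        ET.SubElement(root, "key", name=key).text = value
    conn.execute(
        "UPDATE folders SET bdata = ? WHERE folder_id = ?",
        (ET.tostring(root, encoding="unicode"), folder_id),
    )


def get_key_xml(conn, folder_id, key):
    (blob,) = conn.execute(
        "SELECT bdata FROM folders WHERE folder_id = ?", (folder_id,)
    ).fetchone()
    for node in ET.fromstring(blob).findall("key"):
        if node.get("name") == key:
            return node.text
    return None


set_key_xml(conn, "inbox", "etag", "abc")
set_key_xml(conn, "inbox", "etag", "def")   # whole blob rewritten again
value = get_key_xml(conn, "inbox", "etag")
missing = get_key_xml(conn, "inbox", "nope")
```

The read-modify-write in set_key_xml would also have to be guarded so that
two threads updating different keys do not overwrite each other's changes.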

  @chen: I don't know how active you plan to be on this, but if you're looking
  to offload any work, I can pick up anything that results from the above if
  you like.  Just let me know!
 The work is almost done. I will let you know once I finish the testing,
 and you can then make changes directly if you need anything more there :)

Okay, sounds like a plan!



sean



Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread Milan Crha
On Thu, 2011-05-05 at 11:20 +0530, Chenthill wrote:
   * if folder metadata is going to be free-form, it could be better
 to have a key-value table ( folder_id_id int, key_name text,
 value text  ) rather than arbitrarily numbered text/binary
 fields.
 I was thinking of allowing the backends to store key-value pairs using
 a bdata column which could be populated with XML key-value data. Would
 that be a good idea?

Hi,
you scare me. Could you point me to where the design you chose for this
is written down, how it relates to the existing backend cache(s) (we do
not want to lose functionality here), and maybe why it was done that
way?

As in the text quoted above, is that meant to replace the keys.xml file
(it's from calendar, I know, but you know what I mean)? Or what do you
call meta-data? I want to be able to store my own keys per account (not
per item; another thing which scares me: one addressbook cache file per
account, really?). Be assured that parsing bdata is a pain, and always
will be, especially when you are already in a database world, where
tables and relations between them are common and natural.

If I recall correctly, populated and last_modified were also stored as
keys in the background, but a backend could drop them accidentally when
accessing them through the keys directly. That can sometimes be
considered a benefit, but usually it isn't. If I have a specialized API
to access these keys, then I should use it exclusively, I think.

I recall us chatting about this on IRC or somewhere one day, and one
point was that the contacts would not be stored in a binary form, but
rather as separate files. What Sean wrote earlier sounds like you
changed your mind on this point. I do not think that's a good idea; see
how often the sqlite folders.db file in camel gets broken, and users are
advised to delete it. Will they lose all their contacts in such a
situation?
Bye,
Milan

P.S.: I confess I didn't open your code, I only read this thread.



Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread Chenthill
On Thu, 2011-05-05 at 13:53 +0200, Milan Crha wrote:
 On Thu, 2011-05-05 at 11:20 +0530, Chenthill wrote:
* if folder metadata is going to be free-form, it could be better
  to have a key-value table ( folder_id_id int, key_name text,
  value text  ) rather than arbitrarily numbered text/binary
  fields.
  I was thinking of allowing the backends to store key-value pairs using
  a bdata column which could be populated with XML key-value data. Would
  that be a good idea?
 
   Hi,
 you scare me. Could you point me to where the design you chose for this
 is written down, how it relates to the existing backend cache(s) (we do
 not want to lose functionality here), and maybe why it was done that
 way?
Well, I have not started on the meta-data storage yet :) Just have a
table for it. There is no specific design for it. 

 
 As in the text quoted above, is that meant to replace the keys.xml file
 (it's from calendar, I know, but you know what I mean)? Or what do you
 call meta-data?
In calendar terms, yes.

 I want to be able to store my own keys per account (not per item;
 another thing which scares me: one addressbook cache file per
 account, really?).
There is one meta-data table per account, and each folder's meta-data
forms a row in it.

 Be assured that parsing bdata is a pain, and always will be,
 especially when you are already in a database world, where tables
 and relations between them are common and natural.
This is why I was wondering whether it would be a good idea to have
an abstract API to store extended key-value pairs (apart from the
sync_data and populated columns etc.) if the backend needs them. It could
build the XML and store it as bdata, so the bdata would not be exposed to
the callers. Is there a better way to do this?

 
 If I recall correctly, populated and last_modified were also stored as
 keys in the background, but a backend could drop them accidentally when
 accessing them through the keys directly. That can sometimes be
 considered a benefit, but usually it isn't. If I have a specialized API
 to access these keys, then I should use it exclusively, I think.
For commonly used keys such as the above we would have specialized
APIs, and they would have separate columns on a per-folder basis.
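
One way to read this: the common keys get their own columns and their own
accessors, kept apart from any free-form pairs. A sketch with Python's
sqlite3; the column names is_populated and sync_data follow the thread,
while the table name and the function names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE folders (
           folder_id    TEXT PRIMARY KEY,
           is_populated INTEGER NOT NULL DEFAULT 0,
           sync_data    TEXT
       )"""
)
conn.execute("INSERT INTO folders (folder_id) VALUES ('gal')")


def set_is_populated(conn, folder_id, populated):
    conn.execute(
        "UPDATE folders SET is_populated = ? WHERE folder_id = ?",
        (1 if populated else 0, folder_id),
    )


def get_is_populated(conn, folder_id):
    row = conn.execute(
        "SELECT is_populated FROM folders WHERE folder_id = ?", (folder_id,)
    ).fetchone()
    return bool(row[0]) if row else False


def set_sync_data(conn, folder_id, sync_data):
    # sync_data is opaque text: a timestamp, a sequence number, a token...
    conn.execute(
        "UPDATE folders SET sync_data = ? WHERE folder_id = ?",
        (sync_data, folder_id),
    )


def get_sync_data(conn, folder_id):
    row = conn.execute(
        "SELECT sync_data FROM folders WHERE folder_id = ?", (folder_id,)
    ).fetchone()
    return row[0] if row else None


before = get_is_populated(conn, "gal")
set_is_populated(conn, "gal", True)
set_sync_data(conn, "gal", "2011-05-05T11:20:00")
after = get_is_populated(conn, "gal")
stamp = get_sync_data(conn, "gal")
```

Because these values never pass through a generic key/value call, a backend
cannot drop them by accident while cleaning up its own keys.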

 I recall us chatting about this on IRC or somewhere one day, and one
 point was that the contacts would not be stored in a binary form, but
 rather as separate files. What Sean wrote earlier sounds like you
 changed your mind on this point. I do not think that's a good idea; see
 how often the sqlite folders.db file in camel gets broken, and users are
 advised to delete it. Will they lose all their contacts in such a
 situation?
As I already told seanus on IRC, I will be evaluating the performance of
having vcards as files vs. having them in the db, and then choose
whichever is best. So the code for both will be there and we can
choose between them after testing. I was also thinking of offering
it as an option for the backends to choose once I complete the testing.
So what we discussed stays the same :)

- Chenthill.
   Bye,
   Milan
 
 P.S.: I confess I didn't open your code, I only read this thread.
 




Re: [Evolution-hackers] EBookBackendSqliteDB comments

2011-05-05 Thread sean finney
On Thu, May 05, 2011 at 12:23:01PM +0530, Chenthill wrote:
  Be assured that parsing bdata is a pain, and always will be,
  especially when you are already in a database world, where tables
  and relations between them are common and natural.
 This is why I was wondering whether it would be a good idea to have
 an abstract API to store extended key-value pairs (apart from the
 sync_data and populated columns etc.) if the backend needs them. It could
 build the XML and store it as bdata, so the bdata would not be exposed to
 the callers. Is there a better way to do this?

Forgive the rusty SQL, but assuming you have a single db with
multiple folders in it, something like:

create table folder_kvdata (
    -- column-level constraint: "references" alone, no "foreign key"
    folder_id_id int references folders(folder_id),
    keyname text,
    keyval text
);

?  With this it would be pretty trivial to fetch single values
as well as enumerate/update/delete all keys/values for a folder.
If the caller needed something more complicated than a single value, 
an xml object or whatever else could be embedded on an as-needed basis.
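
A runnable sketch of that idea, using Python's sqlite3 module. It adds a
primary key on (folder_id_id, keyname) so that a set can be an
insert-or-replace; that key is an assumption added here and not part of the
schema above, and the helper names are likewise hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Foreign key enforcement is off by default in sqlite.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript(
    """
    CREATE TABLE folders (folder_id INTEGER PRIMARY KEY);
    CREATE TABLE folder_kvdata (
        folder_id_id INTEGER REFERENCES folders(folder_id) ON DELETE CASCADE,
        keyname TEXT,
        keyval  TEXT,
        PRIMARY KEY (folder_id_id, keyname)  -- assumption: one value per key
    );
    INSERT INTO folders VALUES (1);
    """
)


def set_key(conn, folder, key, val):
    conn.execute(
        "INSERT OR REPLACE INTO folder_kvdata VALUES (?, ?, ?)",
        (folder, key, val),
    )


def get_key(conn, folder, key):
    row = conn.execute(
        "SELECT keyval FROM folder_kvdata"
        " WHERE folder_id_id = ? AND keyname = ?",
        (folder, key),
    ).fetchone()
    return row[0] if row else None


def list_keys(conn, folder):
    # Enumerate every key/value pair stored for one folder.
    return dict(
        conn.execute(
            "SELECT keyname, keyval FROM folder_kvdata WHERE folder_id_id = ?",
            (folder,),
        )
    )


def delete_key(conn, folder, key):
    conn.execute(
        "DELETE FROM folder_kvdata WHERE folder_id_id = ? AND keyname = ?",
        (folder, key),
    )


set_key(conn, 1, "last_sync", "2011-05-05")
set_key(conn, 1, "etag", "abc")
set_key(conn, 1, "etag", "def")      # replaces the earlier value
delete_key(conn, 1, "last_sync")
etag = get_key(conn, 1, "etag")
remaining = list_keys(conn, 1)
```

Each operation is a single statement on a single row, with no parsing step
in between.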

  If I recall correctly, populated and last_modified were also stored as
  keys in the background, but a backend could drop them accidentally when
  accessing them through the keys directly. That can sometimes be
  considered a benefit, but usually it isn't. If I have a specialized API
  to access these keys, then I should use it exclusively, I think.
 For commonly used keys such as the above we would have specialized
 APIs, and they would have separate columns on a per-folder basis.

Yeah, I think it would be a good idea to clearly break them out from
the general k/v pairs, to avoid conflicts and special-casing in the code.

  I recall us chatting about this on IRC or somewhere one day, and one
  point was that the contacts would not be stored in a binary form, but
  rather as separate files. What Sean wrote earlier sounds like you
  changed your mind on this point. I do not think that's a good idea; see
  how often the sqlite folders.db file in camel gets broken, and users are
  advised to delete it. Will they lose all their contacts in such a
  situation?
 As I already told seanus on IRC, I will be evaluating the performance of
 having vcards as files vs. having them in the db, and then choose
 whichever is best. So the code for both will be there and we can
 choose between them after testing. I was also thinking of offering
 it as an option for the backends to choose once I complete the testing.
 So what we discussed stays the same :)

From a performance standpoint, I will be testing against a Global
Address List of somewhere around 60k entries, so that should give a
pretty good idea :)
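
A harness for that comparison could look roughly like the following. It is
only a sketch in Python with made-up vcards, a made-up table and a small
default size, and it measures bulk writes only; lookups and searches would
need their own timings.

```python
import os
import sqlite3
import tempfile
import time


def make_vcard(i):
    return f"BEGIN:VCARD\nVERSION:3.0\nUID:uid-{i}\nFN:Person {i}\nEND:VCARD\n"


def store_as_files(directory, n):
    # One .vcf file per contact.
    for i in range(n):
        with open(os.path.join(directory, f"uid-{i}.vcf"), "w") as fh:
            fh.write(make_vcard(i))


def store_in_db(path, n):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE contacts (uid TEXT PRIMARY KEY, vcard TEXT)")
    with conn:  # one transaction for the whole batch
        conn.executemany(
            "INSERT INTO contacts VALUES (?, ?)",
            ((f"uid-{i}", make_vcard(i)) for i in range(n)),
        )
    conn.close()


def timed(func, *args):
    start = time.perf_counter()
    func(*args)
    return time.perf_counter() - start


n = 1000  # raise towards 60000 to approximate the GAL case
workdir = tempfile.mkdtemp()
files_dir = os.path.join(workdir, "vcards")
os.mkdir(files_dir)
db_path = os.path.join(workdir, "contacts.db")

files_seconds = timed(store_as_files, files_dir, n)
db_seconds = timed(store_in_db, db_path, n)
print(f"files: {files_seconds:.3f}s  db: {db_seconds:.3f}s")
```

The numbers will depend heavily on the filesystem and on how the inserts are
batched into transactions, so they are only meaningful relative to each
other on the same machine.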

I think Milan also had concerns regarding stability/fragility,
corrupted databases, etc.  But I don't think the split-out
option is immune to these types of problems either (and there may be
even further problems, since we would be home-rolling that solution as
opposed to relying on a well-tested API/DB).


sean