Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-21 Thread Adam Tauno Williams
On Mon, 2011-03-21 at 14:15 +0100, Matthias Braun wrote:
> On Monday, 14.03.2011, at 08:40 -0400, Adam Tauno Williams wrote:
> > On Mon, 2011-03-14 at 18:57 +0530, Chenthill Palanisamy wrote:
> > > On Mon, Mar 14, 2011 at 3:53 PM, Adam Tauno Williams wrote:
> > > > I have a CardDAV/GroupDAV collection of ~21,000 contacts I'd love to
> > > > have access to via Evolution's WebDAV address book.  But anything more
> > > > than a thousand or so gets to be unbearably slow.
> > > AFAIR, there are some UI issues involved here which should be dealt
> > > with separately.
> > True,  most importantly [at least for WebDAV address books] why the &@^
> > $*&@ it issues a PROPFIND to the server to enumerate the collection at
> > every search?!  Just search the data you have;  it really seems like
> > updating / synchronizing the collection and searching the collection
> > should be independent events.
> > I suppose I should get around to filing a bug about that.
> The problem here is that obviously contacts on the server could have
> changed since the last search.

Yes.

> How else can you detect this? If the
> PROPFIND results carry an ETag, that would be a good way to speed
> things up.

Please support ctags! 

<D:propfind xmlns:D="DAV:" xmlns:CS="http://calendarserver.org/ns/">
  <D:prop>
    <CS:getctag/>
  </D:prop>
</D:propfind>

The CalDAV backend now supports this [a recent addition].  If the ctag
matches the value seen on the previous sync, nothing in the collection
has changed and the client can stop right there.
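
To make the ctag idea concrete, here is a minimal sketch of the client side, assuming the server exposes the getctag property in the http://calendarserver.org/ns/ namespace. The function names are hypothetical and not taken from any Evolution backend; the HTTP transport is left out, so the sketch only builds the request body and inspects a response body.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DAV_NS = "DAV:"
CS_NS = "http://calendarserver.org/ns/"  # namespace that defines getctag


def build_ctag_propfind() -> bytes:
    """Build a PROPFIND body that asks only for the collection's ctag."""
    ET.register_namespace("D", DAV_NS)
    ET.register_namespace("CS", CS_NS)
    propfind = ET.Element(f"{{{DAV_NS}}}propfind")
    prop = ET.SubElement(propfind, f"{{{DAV_NS}}}prop")
    ET.SubElement(prop, f"{{{CS_NS}}}getctag")
    return ET.tostring(propfind, encoding="utf-8")


def extract_ctag(multistatus_xml: bytes) -> Optional[str]:
    """Return the getctag value from a Multi-Status body, or None if absent."""
    root = ET.fromstring(multistatus_xml)
    node = root.find(f".//{{{CS_NS}}}getctag")
    if node is None or not node.text:
        return None
    return node.text.strip()


def collection_changed(cached_ctag: Optional[str], multistatus_xml: bytes) -> bool:
    """True when the server ctag is missing or differs from the cached one."""
    current = extract_ctag(multistatus_xml)
    return current is None or current != cached_ctag
```

A backend would send the body with Depth: 0 against the collection URL and skip the per-contact enumeration whenever collection_changed() returns False. Treating a missing ctag as "changed" is a deliberate choice here, so that servers without ctag support still get synchronized.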

Reporting ETags is pretty fast, but even then the ETags for >10,000
objects make for quite a large response.  And it can still be fairly
expensive for the server, depending on what security descriptors it
needs to process.
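
When the ctag has changed, the ETag listing can be diffed against the cache so that only new or modified contacts are fetched. The following is a rough sketch under the assumption that both sides are available as href-to-ETag mappings; SyncPlan and plan_sync are hypothetical names.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    fetch: List[str]   # hrefs that are new or whose ETag changed
    delete: List[str]  # hrefs cached locally but gone from the server


def plan_sync(cached: Dict[str, str], server: Dict[str, str]) -> SyncPlan:
    """Compare cached href->ETag pairs with the server's current listing."""
    fetch = sorted(href for href, etag in server.items() if cached.get(href) != etag)
    delete = sorted(href for href in cached if href not in server)
    return SyncPlan(fetch, delete)
```

With ~21,000 contacts the listing itself is still one large response, which is why checking the ctag first is worthwhile.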

> Apart from that I could obviously add some timeout, so that the server
> is only queried every N minutes and searches in between are answered
> faster, from the cache only...

That would be awesome.

> Anyway I'd be happy to get more input from people on the server side 

I am on the server side. :)


> - I
> wrote and used it only for my own contact collection, which is ~200
> contacts on an Apache mod_dav server, because that was enough for me and
> I never managed (or wanted) to set up one of the "big" groupware suites
> just for myself.


___
evolution-hackers mailing list
evolution-hackers@gnome.org
To change your list options or unsubscribe, visit ...
http://mail.gnome.org/mailman/listinfo/evolution-hackers


Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-21 Thread Matthias Braun
On Monday, 14.03.2011, at 08:40 -0400, Adam Tauno Williams wrote:
> On Mon, 2011-03-14 at 18:57 +0530, Chenthill Palanisamy wrote:
> > On Mon, Mar 14, 2011 at 3:53 PM, Adam Tauno Williams wrote:
> > > On Mon, 2011-03-14 at 10:09 +0530, Chenthill Palanisamy wrote:
> > >> On Thu, Mar 10, 2011 at 6:54 PM, Matthew Barnes wrote:
> > >> > Okay, this might be a long shot but I'm gonna throw it out there anyway:
> > >> > would it make sense to look at using Xapian to index a directory of raw
> > >> > vCards?
> > >> I'm not sure it's worth doing this for the address book. I'm just
> > >> assuming that address-book content may not be as huge as mail data.
> > >> The only address-book data that would be large enough would be GAL
> > >> (Exchange) and SystemAdressBook (GroupWise).
> > > This is a self-fulfilling prophecy;  I and others have tried to have
> > > large address books... which doesn't work... so address books remain
> > > "small".
> > I agree, the *only* should be removed from my third sentence;
> > there could be other large address books.
> > Thinking about Xapian for the address book, I'm still not convinced.
> > In the mailer, one often searches on various fields such as sender,
> > subject and recipients, or does a full-text search, and Xapian is said
> > to work much better there.
> > I don't have any profiling data for that, though; it's just what I've
> > heard from multiple people.
> > But for address books, the most common searches would be on name and
> > email. Even if the address book has 21k or more entries, a DB with
> > good indexing should perform better. The information stored will be
> > small compared to mail content. Well, these are just my observations;
> > are there any other cases I'm missing?
> 
> This makes sense to me [I've no idea really how it is currently
> implemented or what the practical alternatives are].  But funny side
> note: if I just walk the DAV collection and save all the vcf files to a
> directory ... a simple Python script can parse each file [using the
> vobject module], compare the values against some criteria, and report
> which items match... an order of magnitude faster than Evolution.  But the reason
> for this is mentioned below.
> 
> > > I have a CardDAV/GroupDAV collection of ~21,000 contacts I'd love to
> > > have access to via Evolution's WebDAV address book.  But anything more
> > > than a thousand or so gets to be unbearably slow.
> > AFAIR, there are some UI issues involved here which should be dealt
> > with separately.
> 
> True,  most importantly [at least for WebDAV address books] why the &@^
> $*&@ it issues a PROPFIND to the server to enumerate the collection at
> every search?!  Just search the data you have;  it really seems like
> updating / synchronizing the collection and searching the collection
> should be independent events.
> 
> I suppose I should get around to filing a bug about that.

The problem here is that obviously contacts on the server could have
changed since the last search. How else can you detect this? If the
PROPFIND results carry an ETag, that would be a good way to speed
things up.

Apart from that I could obviously add some timeout, so that the server
is only queried every N minutes and searches in between are answered
faster, from the cache only...
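
A minimal sketch of such a timeout, assuming a simple "has the interval elapsed since the last sync" gate. ThrottledSync is a hypothetical name; the injectable clock is only there so the behaviour can be checked without waiting.

```python
import time
from typing import Callable, Optional


class ThrottledSync:
    """Decide whether a search should hit the server or use the cache only."""

    def __init__(self, interval_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = interval_seconds
        self._clock = clock
        self._last_sync: Optional[float] = None

    def should_sync(self) -> bool:
        """True on the first call and once the interval has elapsed."""
        if self._last_sync is None:
            return True
        return self._clock() - self._last_sync >= self._interval

    def mark_synced(self) -> None:
        """Record that a server round trip has just completed."""
        self._last_sync = self._clock()
```

A search handler would call should_sync() first, refresh the cache and call mark_synced() when it returns True, and in either case answer the query from the local cache.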

Anyway I'd be happy to get more input from people on the server side - I
wrote and used it only for my own contact collection, which is ~200
contacts on an Apache mod_dav server, because that was enough for me and
I never managed (or wanted) to set up one of the "big" groupware suites
just for myself.



Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-14 Thread Adam Tauno Williams
On Mon, 2011-03-14 at 18:57 +0530, Chenthill Palanisamy wrote:
> On Mon, Mar 14, 2011 at 3:53 PM, Adam Tauno Williams wrote:
> > On Mon, 2011-03-14 at 10:09 +0530, Chenthill Palanisamy wrote:
> >> On Thu, Mar 10, 2011 at 6:54 PM, Matthew Barnes wrote:
> >> > Okay, this might be a long shot but I'm gonna throw it out there anyway:
> >> > would it make sense to look at using Xapian to index a directory of raw
> >> > vCards?
> >> I'm not sure it's worth doing this for the address book. I'm just
> >> assuming that address-book content may not be as huge as mail data.
> >> The only address-book data that would be large enough would be GAL
> >> (Exchange) and SystemAdressBook (GroupWise).
> > This is a self-fulfilling prophecy;  I and others have tried to have
> > large address books... which doesn't work... so address books remain
> > "small".
> I agree, the *only* should be removed from my third sentence;
> there could be other large address books.
> Thinking about Xapian for the address book, I'm still not convinced.
> In the mailer, one often searches on various fields such as sender,
> subject and recipients, or does a full-text search, and Xapian is said
> to work much better there.
> I don't have any profiling data for that, though; it's just what I've
> heard from multiple people.
> But for address books, the most common searches would be on name and
> email. Even if the address book has 21k or more entries, a DB with
> good indexing should perform better. The information stored will be
> small compared to mail content. Well, these are just my observations;
> are there any other cases I'm missing?

This makes sense to me [I've no idea really how it is currently
implemented or what the practical alternatives are].  But funny side
note: if I just walk the DAV collection and save all the vcf files to a
directory ... a simple Python script can parse each file [using the
vobject module], compare the values against some criteria, and report
which items match... an order of magnitude faster than Evolution.  But the reason
for this is mentioned below.
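
For illustration, a rough sketch of that kind of directory scan. To stay dependency-free it swaps the vobject module for a deliberately naive line-based reader (it unfolds continuation lines and ignores property parameters and groups), so it is not the script described above; parse_vcard and find_matches are hypothetical names.

```python
from pathlib import Path
from typing import Dict, Iterator, List


def parse_vcard(text: str) -> Dict[str, List[str]]:
    """Naive vCard reader: unfold continuation lines, drop parameters."""
    unfolded: List[str] = []
    for line in text.splitlines():
        if line[:1] in (" ", "\t") and unfolded:
            unfolded[-1] += line[1:]      # folded continuation of previous line
        elif line.strip():
            unfolded.append(line)
    fields: Dict[str, List[str]] = {}
    for line in unfolded:
        name, sep, value = line.partition(":")
        if not sep:
            continue                      # not a "NAME:value" content line
        key = name.split(";", 1)[0].upper()
        fields.setdefault(key, []).append(value)
    return fields


def find_matches(directory: Path, field: str, needle: str) -> Iterator[Path]:
    """Yield .vcf files whose given field contains needle (case-insensitive)."""
    needle = needle.lower()
    for path in sorted(directory.glob("*.vcf")):
        card = parse_vcard(path.read_text(encoding="utf-8", errors="replace"))
        if any(needle in value.lower() for value in card.get(field.upper(), [])):
            yield path
```

Calling find_matches(Path("contacts"), "EMAIL", "@example.org") walks every file on each search, with no index at all; the directory name here is only an example.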

> > I have a CardDAV/GroupDAV collection of ~21,000 contacts I'd love to
> > have access to via Evolution's WebDAV address book.  But anything more
> > than a thousand or so gets to be unbearably slow.
> AFAIR, there are some UI issues involved here which should be dealt
> with separately.

True,  most importantly [at least for WebDAV address books] why the &@^
$*&@ it issues a PROPFIND to the server to enumerate the collection at
every search?!  Just search the data you have;  it really seems like
updating / synchronizing the collection and searching the collection
should be independent events.

I suppose I should get around to filing a bug about that.



Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-14 Thread Chenthill Palanisamy
On Mon, Mar 14, 2011 at 3:53 PM, Adam Tauno Williams wrote:
> On Mon, 2011-03-14 at 10:09 +0530, Chenthill Palanisamy wrote:
>> On Thu, Mar 10, 2011 at 6:54 PM, Matthew Barnes wrote:
>> > Okay, this might be a long shot but I'm gonna throw it out there anyway:
>> > would it make sense to look at using Xapian to index a directory of raw
>> > vCards?
>> I'm not sure it's worth doing this for the address book. I'm just
>> assuming that address-book content may not be as huge as mail data.
>> The only address-book data that would be large enough would be GAL
>> (Exchange) and SystemAdressBook (GroupWise).
>
> This is a self-fulfilling prophecy;  I and others have tried to have
> large address books... which doesn't work... so address books remain
> "small".
I agree, the *only* should be removed from my third sentence;
there could be other large address books.
Thinking about Xapian for the address book, I'm still not convinced.

In the mailer, one often searches on various fields such as sender,
subject and recipients, or does a full-text search, and Xapian is said
to work much better there.
I don't have any profiling data for that, though; it's just what I've
heard from multiple people.

But for address books, the most common searches would be on name and
email. Even if the address book has 21k or more entries, a DB with
good indexing should perform better. The information stored will be
small compared to mail content. Well, these are just my observations;
are there any other cases I'm missing?
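
As a rough sketch of what "a db with good indexing" could look like for name and email searches, here is an SQLite example. The table, column and function names are assumptions made for illustration, not an existing EDS schema.

```python
import sqlite3
from typing import List


def open_summary(path: str = ":memory:") -> sqlite3.Connection:
    """Create a small contact summary with indexes on name and email."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS contact_summary (
            uid       TEXT PRIMARY KEY,
            full_name TEXT COLLATE NOCASE,
            email     TEXT COLLATE NOCASE
        );
        CREATE INDEX IF NOT EXISTS idx_summary_name  ON contact_summary(full_name);
        CREATE INDEX IF NOT EXISTS idx_summary_email ON contact_summary(email);
    """)
    return conn


def search_prefix(conn: sqlite3.Connection, prefix: str) -> List[str]:
    """Return UIDs whose name or email starts with prefix (ASCII case-insensitive)."""
    # Written as a range comparison so an ordinary B-tree index can serve it.
    upper = prefix + "\uffff"
    rows = conn.execute(
        "SELECT uid FROM contact_summary "
        "WHERE (full_name >= ? AND full_name < ?) "
        "   OR (email >= ? AND email < ?) "
        "ORDER BY uid",
        (prefix, upper, prefix, upper),
    )
    return [row[0] for row in rows]
```

Prefix matching is only one plausible query shape for autocompletion; substring searches would need a different approach, such as a full-text index.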

>
> I have a CardDAV/GroupDAV collection of ~21,000 contacts I'd love to
> have access to via Evolution's WebDAV address book.  But anything more
> than a thousand or so gets to be unbearably slow.
AFAIR, there are some UI issues involved here which should be dealt
with separately.

- Chenthill.


Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-14 Thread Adam Tauno Williams
On Mon, 2011-03-14 at 10:09 +0530, Chenthill Palanisamy wrote:
> On Thu, Mar 10, 2011 at 6:54 PM, Matthew Barnes wrote:
> > Okay, this might be a long shot but I'm gonna throw it out there anyway:
> > would it make sense to look at using Xapian to index a directory of raw
> > vCards?
> I'm not sure it's worth doing this for the address book. I'm just
> assuming that address-book content may not be as huge as mail data.
> The only address-book data that would be large enough would be GAL
> (Exchange) and SystemAdressBook (GroupWise).

This is a self-fulfilling prophecy;  I and others have tried to have
large address books... which doesn't work... so address books remain
"small".

I have a CardDAV/GroupDAV collection of ~21,000 contacts I'd love to
have access to via Evolution's WebDAV address book.  But anything more
than a thousand or so gets to be unbearably slow.



Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-13 Thread Chenthill Palanisamy
On Thu, Mar 10, 2011 at 6:54 PM, Matthew Barnes wrote:
>
> On Thu, 2011-03-10 at 08:13 +0100, Milan Crha wrote:
> > do not forget that the DB cache is compiled conditionally, because some
> > distros do not ship libdb. Using SQLite for this was mentioned months
> > ago, only no-one got time to actually do it, so go for it.
>
> Also, as far as I know there are still licensing issues between Berkeley
> DB's Sleepcat license and [L]GPL, which is how libebackend was born.
>
> https://bugzilla.gnome.org/show_bug.cgi?id=465374
>
> I'm +1 on dumping Berkeley DB.
>
>
> > Just keep two things in mind:
> > - using binary storage for this kind of data is bad for cases where
> >   the binary file breaks, either due to an update/downgrade of the
> >   library providing access to it, or just by a crash. It's less of an
> >   issue with Camel, as SQLite holds only summary data there, but if you
> >   also want to store the real data in it, then it can be a problem.
> >   There are already people having trouble recovering their data from
> >   address book storage, so if you are going to change it, it would be
> >   good to think of that from the beginning. It would be good to store
> >   raw vCards in plain text file(s) which are "indexed" by an SQLite
> >   summary. The plain text file(s) will then be easy to import into
> >   Evolution if something goes wrong, and erasing the SQLite file will
> >   not lose the user any valuable data. (I'm thinking of a flat maildir
> >   approach here.)
>
> Milan raises a good point about binary formats versus text.  Would be
> good for the raw data to remain human readable.
Yes, it makes sense to store it that way. If we can index the data in
an SQLite summary and store vCards the way we store individual mail
messages, that should be sufficient.

>
> Okay, this might be a long shot but I'm gonna throw it out there anyway:
> would it make sense to look at using Xapian to index a directory of raw
> vCards?
I'm not sure it's worth doing this for the address book. I'm just
assuming that address-book content may not be as huge as mail data.
The only address-book data that would be large enough would be GAL
(Exchange) and SystemAdressBook (GroupWise).
I think SQLite should suffice for indexing this.

>
> We've been talking about moving to "notmuch" [1] for mail indexing, and
> "notmuch" is built on Xapian.  Trying out Xapian for address books might
> be a good test drive for using it with mail.
To be honest, I won't have much time to test this for the address
book. Jony was evaluating the performance of SQLite versus notmuch for
mail indexing; any updates there, Jony?

- Chenthill.
>
> The catch is, Xapian is written in C++.  So we'd likely have to
> hand-write our own GObject bindings for it in C.  That's what makes it a
> long shot.  But we could look to "notmuch" or even WebKit/GTK+ for
> examples of binding C++ to C.  My C++ is rusty but I still have my
> Stroustrup textbook.
>
>
> [1] http://notmuchmail.org/


Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-10 Thread Matthew Barnes
On Thu, 2011-03-10 at 08:13 +0100, Milan Crha wrote:
> do not forget that the DB cache is compiled conditionally, because some
> distros do not ship libdb. Using SQLite for this was mentioned months
> ago, only no-one got time to actually do it, so go for it.

Also, as far as I know there are still licensing issues between Berkeley
DB's Sleepcat license and [L]GPL, which is how libebackend was born.

https://bugzilla.gnome.org/show_bug.cgi?id=465374

I'm +1 on dumping Berkeley DB.


> Just keep two things in mind:
> - using binary storage for this kind of data is bad for cases where
>   the binary file breaks, either due to an update/downgrade of the
>   library providing access to it, or just by a crash. It's less of an
>   issue with Camel, as SQLite holds only summary data there, but if you
>   also want to store the real data in it, then it can be a problem.
>   There are already people having trouble recovering their data from
>   address book storage, so if you are going to change it, it would be
>   good to think of that from the beginning. It would be good to store
>   raw vCards in plain text file(s) which are "indexed" by an SQLite
>   summary. The plain text file(s) will then be easy to import into
>   Evolution if something goes wrong, and erasing the SQLite file will
>   not lose the user any valuable data. (I'm thinking of a flat maildir
>   approach here.)

Milan raises a good point about binary formats versus text.  Would be
good for the raw data to remain human readable.

Okay, this might be a long shot but I'm gonna throw it out there anyway:
would it make sense to look at using Xapian to index a directory of raw
vCards?

We've been talking about moving to "notmuch" [1] for mail indexing, and
"notmuch" is built on Xapian.  Trying out Xapian for address books might
be a good test drive for using it with mail.

The catch is, Xapian is written in C++.  So we'd likely have to
hand-write our own GObject bindings for it in C.  That's what makes it a
long shot.  But we could look to "notmuch" or even WebKit/GTK+ for
examples of binding C++ to C.  My C++ is rusty but I still have my
Stroustrup textbook.


[1] http://notmuchmail.org/



Re: [Evolution-hackers] Sqlite cache for address-book storage in EDS

2011-03-09 Thread Milan Crha
On Thu, 2011-03-10 at 12:09 +0530, Chenthill Palanisamy wrote:
> file, groupwise, exchange uses EBookBackendDBCache.

Hi,
do not forget that the DB cache is compiled conditionally, because some
distros do not ship libdb. Using SQLite for this was mentioned months
ago, only no-one got time to actually do it, so go for it.

Just keep two things in mind:
- using binary storage for this kind of data is bad for cases where
  the binary file breaks, either due to an update/downgrade of the
  library providing access to it, or just by a crash. It's less of an
  issue with Camel, as SQLite holds only summary data there, but if you
  also want to store the real data in it, then it can be a problem.
  There are already people having trouble recovering their data from
  address book storage, so if you are going to change it, it would be
  good to think of that from the beginning. It would be good to store
  raw vCards in plain text file(s) which are "indexed" by an SQLite
  summary. The plain text file(s) will then be easy to import into
  Evolution if something goes wrong, and erasing the SQLite file will
  not lose the user any valuable data. (I'm thinking of a flat maildir
  approach here.)

- be able to store custom values in the summary - a backend may need
  to keep its own notes there, so make that possible. As these are not
  as critical as the contact information itself, it should be fine to
  store them in the summary only.
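
Both points can be sketched together: one plain-text .vcf file per contact as the real data, a disposable SQLite summary rebuilt from those files, and a key/value table for backend notes. This is only an illustration under assumed names (the summary and backend_keys tables and all function names are hypothetical), with a deliberately naive property lookup instead of a real vCard parser.

```python
import sqlite3
from pathlib import Path
from typing import Optional

SCHEMA = """
CREATE TABLE IF NOT EXISTS summary (
    uid TEXT PRIMARY KEY, full_name TEXT, email TEXT, file_name TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS backend_keys (
    key TEXT PRIMARY KEY, value TEXT
);
"""


def first_value(vcard_text: str, prop: str) -> Optional[str]:
    """Return the first value of a property, ignoring parameters (naive)."""
    for line in vcard_text.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.split(";", 1)[0].upper() == prop:
            return value.strip()
    return None


def rebuild_summary(card_dir: Path, db_path: str) -> sqlite3.Connection:
    """Recreate the summary purely from the vCard files on disk."""
    conn = sqlite3.connect(db_path)
    conn.executescript(SCHEMA)
    with conn:  # one transaction for the whole rebuild
        conn.execute("DELETE FROM summary")
        for path in sorted(card_dir.glob("*.vcf")):
            text = path.read_text(encoding="utf-8", errors="replace")
            uid = first_value(text, "UID") or path.stem
            conn.execute(
                "INSERT OR REPLACE INTO summary VALUES (?, ?, ?, ?)",
                (uid, first_value(text, "FN"), first_value(text, "EMAIL"), path.name),
            )
    return conn


def set_backend_key(conn: sqlite3.Connection, key: str, value: str) -> None:
    """Store a backend-specific note, e.g. a sync token, in the summary only."""
    with conn:
        conn.execute("INSERT OR REPLACE INTO backend_keys VALUES (?, ?)", (key, value))


def get_backend_key(conn: sqlite3.Connection, key: str) -> Optional[str]:
    row = conn.execute("SELECT value FROM backend_keys WHERE key = ?", (key,)).fetchone()
    return row[0] if row else None
```

Deleting the SQLite file loses only the backend notes; the contacts themselves come back with another rebuild_summary() call.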
Bye,
Milan
