Re: New module proposal: tracker

2009-08-19 Thread Martyn Russell

On 18/08/09 23:54, Vincent Untz wrote:

On Tuesday, 18 August 2009, at 20:26 +0200, Vincent Untz wrote:

On Tuesday, 18 August 2009, at 20:19 +0200, Philip Van Hoof wrote:

We'll do our best and are committed to formulating our answers in a
non-vague way and to improving the project members' communication
about the project towards the community.


Maybe just clearly state what tracker (or tracker-store, the thread
already lost me :/) will bring to GNOME if integrated. I don't want to
hear about ontology, sparql, data store, indexer, or whatever. I want to
know what it will bring me as a user, and what opportunity it gives me as a
hacker, for my modules.

So, yeah. Just list use cases. (Somebody already gave a few examples in
a mail, iirc, but it got lost in the noise for me).


Thanks for listing some use cases. But I guess I should also have
asked: which of those use cases are ready to be integrated in GNOME now?
(ie, assuming we accept tracker today, which patches can we merge?)


The use cases are interesting for applications. Tracker doesn't really
give anything to users directly; it gives applications a framework to
query data and provide users with more ways to connect their data.


The only thing it gives users directly is the UI we ship with Tracker,
if people decide to fire it up, but this is a smaller part of the whole
project and less important than tracker-store.



Note that I'm not for/against tracker; I'm just trying to understand
what accepting it in the desktop for 2.30 will change for 2.30. If the
reply is "nothing, it's a chicken-and-egg problem", then I would think
that proposing it as an external dependency first could make more sense.


I think ultimately, not a lot. It will make the technology available for 
applications to integrate with. I think it is partially a 
chicken-and-egg problem, but right now we are integrated with Nokia apps 
on the Maemo platform, so we are not just a standalone entity with no
testing in that respect.


I too think being an external dependency might well be a more logical 
first step. Until it is more established at least.



(and on a general note, I agree with Matthias' mail about finding what
we're trying to achieve)


Well, to some extent that's down to the applications, but I think, 
overall, we are trying to give users a more connected experience with 
their data and other applications on the desktop.


--
Regards,
Martyn
___
desktop-devel-list mailing list
desktop-devel-list@gnome.org
http://mail.gnome.org/mailman/listinfo/desktop-devel-list


Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Florian Ludwig
On Tue, 2009-08-18 at 17:40 -0400, David Zeuthen wrote:
  1. what problems we want to solve for the user

Here are some use cases that I think could be solved with some of those
shiny new technologies - but I'm not involved with any of tracker,
zeitgeist, couchdb, whatever, so I can't tell if they actually try to
solve them.


1. Tim is doing a video collage for his friend Foo. Therefore he wants
to browse all media files (audio, video and photos) related to his
friends Foo or Bar or himself.

1.1 Tim wants the oldest photo of Foo he has
1.2 Tim searches for a specific photo he shot on his holidays, which were
between date X and Y
1.3 He found a photo he did some funny things with in the GIMP, but now he
would like to get the original photo back (and let's assume it's
still there, somewhere on his hard disk)
1.4 After showing the collage to his friend, his friend asks him about some
movie clips Tim used. His computer could easily tell him which
other video clips have been used to create this one.


2. Kim is a computer science student and his professors at the
university provide a pile of pdf files including presentations, scripts
and exercises. While working on a paper he is not sure about an algorithm
he needs but recalls one of his professors gave a really good
explanation. But who was it? And in which of those hundred pdf files on
which page is it?


3. Tim sends (via mail or IM) Kim a good script that Kim didn't know
about. A day later Kim wants to look into it again but has forgotten
where he put it and what it was called.

[I've had conversations like this a lot after sending someone a file via IM:
 "What was the name of the file you sent me [yesterday/..]? I can't find it
 anymore."]


The problem to solve here is effectiveness. You have to interact with
several different programs and collect information on your own, but
your computer could do it for you.

-- 
Florian Ludwig d...@phidev.org



Module dependencies

2009-08-19 Thread Uros Nedic

 Hi all,

 I'm observing your conversation and learning the GNOME code
 since I plan to contribute something in the future.

 I would like to ask whether someone could write down
 the module dependencies in GNOME, from the top
 down. For example - top modules are . -
 module depends on modules .

 Also, I saw that the dominant idea is to consolidate
 dependencies in GNOME 2.30 = GNOME 3.0. Is this the main
 target for GNOME 3.0?

 Thanks,
 Uros Nedic, MSc
 Belgrade, Serbia




Re: New module proposal: tracker

2009-08-19 Thread Alan Cox
 at low I/O priority, without unpleasantly degrading system performance.
 I imagine the sheer seek cost of pulling all those dentries, inodes into
 memory, and evicting all the other useful data you had around - is a big
 part of the plague. Hopefully btrfs will improve the situation somewhat
 here, but wrt. inode / dentry management I suspect there is no really
 good solution.

On rotating media it's seek and access times. This is amplified on most
older systems by the fact ATA devices had no queueing interface so the
drive couldn't do any smart re-ordering to extract further parallelism.
SSD is more important here than btrfs. Filesystems can try to be clever
and hide the fact rotating media sucks for latency versus processing
power, but only SSD actually fixes the problem properly.

   Unfortunately, as soon as we have this, it is only a small
 feature-creep step to "let's index all .c/.h files to extract comments in
 the API documentation" - which (I suspect) then commits you to the
 disaster of irritating a lot of developers - so they turn it off, and
 getting bogged down indexing things no-one is ever going to want indexed
 by tracker (?).

I think there lies a mistaken assumption. The actual indexing has a fairly
high cost. The cost of extracting metadata while indexing ought to be
relatively low in comparison. That argues that allowing stuff to plug
into the indexing based on file type is useful. It's not really function
creep either, given that the only interface the indexer needs is:

- who is associated with this file type (which exists)
- give me your metadata for this file content

and if there is nobody wanting to do so then who cares. If apps provide
the interface for metadata extraction (into a tag soup or something) then
if you don't have the app installed you won't index for it. Document to
tag ought to be fast.
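
To make that two-call interface concrete, here is a minimal sketch in Python. Every name in it (`register`, `extract`, the MIME-type keyed registry, the toy C source extractor) is hypothetical and made up for illustration; it is not Tracker's actual plugin API. It only shows that an indexer built this way does no extra work for file types nobody has claimed.

```python
from typing import Callable, Dict, List

# Hypothetical extractor signature: file content in, flat "tag soup" out.
Extractor = Callable[[bytes], Dict[str, str]]

# Hypothetical registry: MIME type -> extractors supplied by installed apps.
_extractors: Dict[str, List[Extractor]] = {}


def register(mime_type: str, extractor: Extractor) -> None:
    """'Who is associated with this file type': an installed app claims it."""
    _extractors.setdefault(mime_type, []).append(extractor)


def extract(mime_type: str, content: bytes) -> Dict[str, str]:
    """'Give me your metadata for this file content'.

    If no application registered for the type, nothing is extracted.
    """
    tags: Dict[str, str] = {}
    for extractor in _extractors.get(mime_type, []):
        tags.update(extractor(content))
    return tags


def c_source_extractor(content: bytes) -> Dict[str, str]:
    """Toy extractor: count lines and '//' comment lines in C source."""
    lines = content.decode("utf-8", errors="replace").splitlines()
    comments = sum(1 for line in lines if line.strip().startswith("//"))
    return {"lines": str(len(lines)), "comment_lines": str(comments)}


# Only happens if the (hypothetical) developer tooling is installed.
register("text/x-csrc", c_source_extractor)
```

Several applications can register for the same type; in this sketch later extractors simply overwrite earlier tags with the same key, which a real design would have to think about more carefully.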

   Personally, I'd start by ignoring any directory tree with a configure*
 script in the top-level, or perhaps a .git / .svn directory - that
 should reduce the inotify pain :-)
 
   So - my point is: are the devs fetching source code at the console -
 that you are concerned about above, really in the target audience for
 tracker ? and if so why ? 

How about who sent that patch, what are the related emails and when were
they last on irc - a classic developer query. Possibly bundled in with
do I have a picture of them (conferences) and who are their close
friends (other ways to get hold of and see connections), where are they
right now (irc connecting address, email headers and geodata for IP
addresses). Or in short - developers are not different. A lawyer wants to
do the same thing within a firm for a case note, a CAD designer for a
design change, a secretary for letters, etc.

Physical indexing (the file walking side), extracting meaning and query
processing are three unrelated tasks. In your developer case if I've got
various git helpers installed it would be nice that the indexer bothered
to talk to the git plugins about source code and git trees. If I don't
have them installed it doesn't need to - it's a modular problem.

Maybe you also need to learn what types of metadata people use the most
for presentation (eg by what links they follow) but that's another story
in the UI anyway.

Alan


Re: New module proposal: tracker

2009-08-19 Thread Alan Cox
On Tue, 18 Aug 2009 19:31:04 +0200
 The tracker-store is a desktop service that offers the application
 developer a query capability against data that it stores. The data that
 it stores must be strictly defined by a schema (which is what in RDF is
 called an ontology). The schemas that we ship by default are the Nepomuk
 ones. The query language is SPARQL. The service provides the opportunity
 to the application developer to store. The application developer uses
 an extension to SPARQL, SPARQL Update, which we support too. The
 communication between application and tracker-store happens over DBus.
 
 Nepomuk's ontologies:
 http://www.semanticdesktop.org/ontologies/

Broken link btw at: http://www.semanticdesktop.org/ontologies/
for:
http://dev.nepomuk.semanticdesktop.org/repos/trunk/ontologies/pimo/latex/pimo.pdf

 Let me know if that was a helpful description for you. I tried hard not
 to sound like an old German philosopher ;-).

One thing I couldn't quickly tell is whether you are always remembering
the source of external information, particularly any externally acquired
personal information about someone that is stored in the database. That
may be important for business users who have to meet data
protection/personal information rights legislation. Ditto that tracker
doesn't start extracting and organising by anything like religious,
medical or ethnic data whose processing is controlled in many countries.

Alan


Re: New module proposal: tracker

2009-08-19 Thread Philip Van Hoof
On Wed, 2009-08-19 at 11:25 +0100, Alan Cox wrote:

Hey Alan, thanks for your questions.

  The tracker-store is a desktop service that offers the application
  developer a query capability against data that it stores. The data that
  it stores must be strictly defined by a schema (which is what in RDF is
  called an ontology). The schemas that we ship by default are the Nepomuk
  ones. The query language is SPARQL. The service provides the opportunity
  to the application developer to store. The application developer uses
  an extension to SPARQL, SPARQL Update, which we support too. The
  communication between application and tracker-store happens over DBus.
  
  Nepomuk's ontologies:
  http://www.semanticdesktop.org/ontologies/
 
 Broken link btw at: http://www.semanticdesktop.org/ontologies/
 for:
 http://dev.nepomuk.semanticdesktop.org/repos/trunk/ontologies/pimo/latex/pimo.pdf

I will ping upstream Nepomuk maintainers about this problem. Thanks.

Note that Tracker at this moment doesn't do the PIMO ontology: we think
it's too complicated. This has not been finally decided yet, though.

  Let me know if that was a helpful description for you. I tried hard not
  to sound like an old German philosopher ;-).
 
 One thing I couldn't quickly tell is whether you are always remembering
 the source of external information, particularly any externally acquired
 personal information about someone that is stored in the database.

To store the source of external information we need so-called
named graphs (which is an RDF buzzword for namespaces).

Right now we don't do named graphs. We are in discussion about adding
support for them (actually, a few minutes ago we were), but given the
direction of the discussion I don't think this feature will make it into
the first 0.7.x development releases.

More information about using named graphs with SPARQL queries can be
found here:

http://www.w3.org/TR/rdf-sparql-query/#namedGraphs

Our current implementation simply ignores any 'FROM NAMED' parts.

 That may be important for business users who have to meet data
 protection/personal information rights legislation. Ditto that tracker
 doesn't start extracting and organising by anything like religious,
 medical or ethnic data whose processing is controlled in many countries.

Tracker will store this if the applications request storage of it. The
issue of protecting the user's personal data is left to the applications
using it and the underlying operating system's security features.

The database's file will of course have the right UNIX permissions set.

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be



Re: New module proposal: tracker

2009-08-19 Thread Rob Taylor
Alan Cox wrote:
 On Tue, 18 Aug 2009 19:31:04 +0200

snip

 Let me know if that was a helpful description for you. I tried hard not
 to sound like an old German philosopher ;-).
 
 One thing I couldn't quickly tell is whether you are always remembering
 the source of external information, particularly any externally acquired
 personal information about someone that is stored in the database. That
 may be important for business users who have to meet data
 protection/personal information rights legislation. Ditto that tracker
 doesn't start extracting and organising by anything like religious,
 medical or ethnic data whose processing is controlled in many countries.

So we're getting pretty deep here! I believe the current thought on this
is to use named graphs to tag statements with their provenance, which
then allows you to do access control and easily remove sets of
statements of a certain provenance.
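
For readers who haven't met named graphs, a minimal sketch of the idea in Python, assuming a toy in-memory store: each statement carries a fourth field naming the graph (its provenance), so you can ask where a fact came from and drop everything from one source in a single step. The `QuadStore` class, its methods and the `urn:` identifiers are all hypothetical; this is not tracker-store's implementation or API.

```python
from typing import List, NamedTuple, Set


class Quad(NamedTuple):
    graph: str      # provenance label, e.g. which service supplied the data
    subject: str
    predicate: str
    obj: str


class QuadStore:
    """Toy statement store where every statement is tagged with a graph."""

    def __init__(self) -> None:
        self._quads: Set[Quad] = set()

    def add(self, graph: str, subject: str, predicate: str, obj: str) -> None:
        self._quads.add(Quad(graph, subject, predicate, obj))

    def sources_of(self, subject: str, predicate: str) -> List[str]:
        """Which graphs (sources) assert something about subject/predicate?"""
        return sorted({q.graph for q in self._quads
                       if q.subject == subject and q.predicate == predicate})

    def drop_graph(self, graph: str) -> int:
        """Remove every statement of one provenance; return how many went."""
        doomed = {q for q in self._quads if q.graph == graph}
        self._quads -= doomed
        return len(doomed)
```

The extra field on every statement is also where the storage cost comes from.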

A very academic overview of this technique can be found at
http://tw.rpi.edu/proj/portal.wiki/images/5/59/Data_Usage_Control.pdf

On a more practical level, there is a branch of tracker-store with named
graph support, but currently some uncertainty about its current
usefulness vs. extra storage costs. For our web service data pulling
we'd love to use named graphs to allow us to easily identify and remove
data that we've pulled.


Thanks,
Rob

 Alan


-- 
Rob Taylor, Codethink Ltd. - http://codethink.co.uk


Re: New module proposal: tracker

2009-08-19 Thread Alan Cox
 Tracker will store this if the applications request storage of it. The
 issue of protecting the user's personal data is left to the applications
 using it and the underlying operating system's security features.

To a business deploying systems with this feature there are multiple
issues

- Need to be able to keep personal data secure (OS problem mostly)
- Need to be able to search it (probably remotely) for data access
  requests (not really different to the situation now with pulling out
  emails and the like). Also the point of it is to index such data so it
  makes the job easier !
- Need to be able to identify the source of incorrect data
- Need not to be processing sensitive data (race etc) without appropriate
  authority.

The last one is very much a social (company policy) issue but at the same
time having tools that are a bit too good at their job and collect
this by default would be particularly bad.



Re: New module proposal: tracker

2009-08-19 Thread Sankar P
Not contributing to the core discussion.

On Wed, Aug 19, 2009 at 3:26 PM, Alan Cox a...@lxorguk.ukuu.org.uk wrote:
 I think there lies a misassumption. The actual indexing has a fairly high
 cost. The cost of extracting metadata while indexing ought to be
 relatively low in comparison. That argues that allowing stuff to plug
 into the indexing based on file type is useful. It's not really function
 creep either given the only interface the indexer needs is

        - who is associated with this file type (which exists)
        - give me your metadata for this file content

One shortcoming of this approach is that multiple applications can be
associated with a file type over a period of time. For instance, for
.mbox files, the applications could vary: Evolution, Mutt, Pine, Claws,
Thunderbird, etc. And it is common for some people to switch between
applications once every few months; not so much for email, but for other
applications such as PDF viewers.

All these different applications may have different interpretations
of what valid metadata is. For instance, Evolution will consider
anything within ** to be metadata, whereas mutt might consider the
subject to be the metadata, etc. So every time the user switches
applications, the earlier collected metadata might need some brushing up.

In my personal, biased opinion, the need for metadata and desktop
search is overrated. The Internet is extraordinarily mammoth and is
impossible to navigate without a search engine, as you won't even know how
many sites exist. On the desktop the scale of things is smaller;
individual application-provided search is enough and will satisfy the
needs of most users. ctags, mairix etc. can provide specialized
and more effective searching.

-- 
Sankar P
http://psankar.blogspot.com


Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Rodrigo Moya
On Tue, 2009-08-18 at 18:38 -0400, Jamie McCracken wrote:
 On Wed, 2009-08-19 at 00:27 +0200, Rodrigo Moya wrote:
  On Tue, 2009-08-18 at 16:48 -0400, Matthias Clasen wrote:
   I think this recent discussion about tracker as a gnome module is
   somewhat backwards. I don't think it is leading us anywhere to talk
   about ontologies and rdf and events and timelines and metadata stores
   and kernel apis before we answer the first question:
   
   What is the user problem that we are solving here ?
   Can that be described in a paragraph ?
   And if it can, is it something that a 'regular' user would recognize
   as a problem he has on his computer ?
   
   Once we have the problem scoped out, we need to look at the user
   experience we want to aim for in solving it. Will it be a single
   search-for-everything dialog ? A query language ? Tagging everywhere ?
   
   After that, it might be possible to evaluate whether tracker,
   zeitgeist, couchdb or something else can be part of the
   implementation...
   
  couchdb provides just the storage of any kind of data, no indexing,
  searching, etc, so I think they solve different problems. In fact,
  tracker could just use local files as storage or a couchdb database. If
  using couchdb, it would get replication and synchronization for free,
  but it would still provide the indexing
 
 
 For your interest, I did want to use CouchDb for our backup of user
 metadata in tracker precisely for that reason. Currently we use turtle
 files which is not optimal. 
 
 However I suspect CouchDb is big and probably too big a dependency for
 nokia's smaller devices so it might not happen or would have to be
 optional in tracker.

yes, it might be too big. At canonical, for ubuntu karmic, we have
reduced the dependencies (erlang runtime) to the minimum (5-6 MB IIRC),
which is ok for a desktop machine, but I guess it's still too big for
nokia's smaller devices?

And of course, couchdb should always be optional. It makes sense to use
it as a storage for sharing data between applications (evolution and
akonadi are both using it to store contacts, which gives us shared data
storage for both GNOME and KDE users. ditto for firefox/epiphany for
bookmarks, evo/tomboy for notes, etc, etc), but the big point about it
is to allow replication of the data to other machines, which might not
be what some users want. So yeah, should still be optional

In fact, I see the tracker integration, if it happens in a GNOME-wide
aspect, like this: applications use tracker to store data, and there is
a global setting to allow users to specify where to store data, locally,
or on couchdb, which would give the replication/synchronization feature,
without changing any application, which would still use tracker to store
their data.
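
A rough sketch of what I mean, in Python. Everything here is hypothetical: the `Backend` interface, the two backend classes and the `open_store` setting lookup are invented for illustration, and `ReplicatingBackend` is only a stand-in that queues keys for replication rather than talking to a real couchdb. The point is just that the application code (`save_note`) is identical whichever backend the global setting selects.

```python
from typing import Dict, List, Protocol


class Backend(Protocol):
    """Hypothetical storage interface sitting behind the store."""

    def save(self, key: str, value: str) -> None: ...

    def load(self, key: str) -> str: ...


class LocalBackend:
    """Keeps data on this machine only."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]


class ReplicatingBackend(LocalBackend):
    """Stand-in for a couchdb-like store: writes are queued for replication."""

    def __init__(self) -> None:
        super().__init__()
        self.outbox: List[str] = []  # keys waiting to be sent to other machines

    def save(self, key: str, value: str) -> None:
        super().save(key, value)
        self.outbox.append(key)


BACKENDS = {"local": LocalBackend, "replicated": ReplicatingBackend}


def open_store(setting: str) -> Backend:
    """Pick the backend from the (hypothetical) global user setting."""
    return BACKENDS[setting]()


def save_note(store: Backend, title: str, body: str) -> None:
    """Application code: does not know or care where the data ends up."""
    store.save("note:" + title, body)
```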

  I don't know a great deal about CouchDb but feel
 free to sell it to Nokia if you can :)
 
well, I'm a simple developer, I'll leave the selling to the sales
people :D Although technically it would make a lot of sense, since data
saved on the nokia devices would get replicated to whatever remote
couchdb instance the user has, and appear automatically on the user's
desktop, without having to synchronize the nokia device with the PC by
hand. I think that's a good way to sell it to Nokia :D




Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Rodrigo Moya
On Wed, 2009-08-19 at 00:36 +0200, Philip Van Hoof wrote:
 
 We evaluated CouchDB as a primary store over sqlite, but CouchDB lacked
 *very* important features. This makes it undoable. Feel free to get in
 touch with us to discuss which precise features I mean.
 
I talked to some tracker people at GCDS about it, so I think I know what you're
talking about, and indeed the CouchDB guys are aware of that (one of
them was at GCDS), and since some of those features are demanded by many
couchdb users (outside of the desktop also), I think they will be added
sooner or later.




Re: New module proposal: tracker

2009-08-19 Thread Alan Cox
 One shortcoming of this approach is that multiple applications can be
 associated with a file type over a period of time. For instance, for
 .mbox files, the applications could vary: Evolution, Mutt, Pine, Claws,
 Thunderbird, etc. And it is common for some people to switch between
 applications once every few months; not so much for email, but for other
 applications such as PDF viewers.

This requires some commonality about indexing and the meaning of
concepts. There isn't anything wrong with several apps indexing the one
file (preferably at the same time so we walk the filestore once).  A more
interesting problem is hierarchical breakdowns (a multipart mime email of
a zip holding a pdf and a jpg file) or xml documents with multiple
namespaces in use.

 subject to be the metadata, etc. So every time the user switches
 applications, the earlier collected metadata might need some brushing up.

That assumes that the old meta data is somehow wrong. When an office
changes staff the way stuff is indexed may change a bit but the old index
doesn't become invalid or useless.

 many sites exist. On the desktop the scale of things is smaller;
 individual application-provided search is enough and will satisfy the
 needs of most users. ctags, mairix etc. can provide specialized
 and more effective searching.

The notion that the internet and personal file store are separate is one
I would question. Why for example would I not be running a query across
my personal email and a company wide accumulated metadata source of all
the internal public mailing lists. Specialized searching is also very
different to general contexts. It is better at the one job but cannot
answer random queries or associations.

/home is a place where you keep stuff nobody else needs, or you
want fast access to, or you particularly don't want other people to have
access to. Indeed if you back up to an internet-connected server it's not
unreasonable to argue that user filestore is simply a cache, nothing more.

Alan


Re: New module proposal: tracker

2009-08-19 Thread Rodrigo Moya
On Tue, 2009-08-18 at 21:21 +0200, Lennart Poettering wrote:
 On Tue, 18.08.09 21:09, Patryk Zawadzki (pat...@pld-linux.org) wrote:
 
  
  On Tue, Aug 18, 2009 at 8:57 PM, Lennart Poettering mzta...@0pointer.de
  wrote:
   (I don't want to create the impression that I am opposed to the idea
   of a desktop search engine. I actually do believe it makes sense, but
   really, you need to do a better job selling the specific technology
   tracker does.)
  
  Don't think of the RDF store as Google where you enter two words and
  get back top 10 results. Think of it as a database that has all kinds
  of weird relations for different objects. You could ask it for the
  last.fm track that was playing while you were looking at lolcat images
  on the second day of GCDS while chatting with people whose last name
  contained a lowercase n :)
  
  More real life examples:
  
  - show me all the party pics
  - give me files and data related to gnome bug #123
  - list all the files I received from Lennart during the last week
  (over Jabber, e-mail etc.)
 
 Nice idea, but is this even realistic? What's a UI for this supposed to
 look like? I mean, Google is so awesome because you type stuff in a
 text field with only minimal syntax requirements and it will spit out
 useful stuff.
 
 But how would you expose in the UI a search mask that allows you to
 formulate queries like "give me files and data related to gnome bug
 #123"? Are you planning to duplicate the bugzilla search form in the
 GNOME Search interface? If that's the case, then www, stop right
 there!
 
Nat's dashboard could make sense for this, where you have related items
to the stuff you're doing, like when you're writing a mail to Lennart,
see all IM logs, mails, commits, etc related to that?



Re: New module proposal: tracker

2009-08-19 Thread Philip Van Hoof
On Wed, 2009-08-19 at 13:07 +0100, Alan Cox wrote:
  One shortcoming of this approach is that multiple applications can be
  associated with a file type over a period of time. For instance, for
  .mbox files, the applications could vary: Evolution, Mutt, Pine, Claws,
  Thunderbird, etc. And it is common for some people to switch between
  applications once every few months; not so much for email, but for other
  applications such as PDF viewers.
 
 This requires some commonality about indexing and the meaning of
 concepts. There isn't anything wrong with several apps indexing the one
 file (preferably at the same time so we walk the filestore once).

 A more interesting problem is hierarchical breakdowns (a multipart
 mime email of a zip holding a pdf and a jpg file) or xml documents 
 with multiple namespaces in use.

The libstreamanalyzer library is ideal for this. It opens a file and then
starts reading it using what they call a stream. When it reaches a point
in the file that can be recursed into (like a zip file in a zip file), it
opens a stream on top of the root stream and recurses into it.

Tracker's FS miner is integrating with libstreamanalyzer for extraction.
The libstreamanalyzer library was originally developed by and for KDE's
Strigi project by Jos Vandenoever. We're of course in discussion with
Jos about various things.

We realized that their method of extracting metadata is far superior
to our simpler FILE and fread() based extractors.

The MBox example is a good one for this: a Base64 encoded image/png
attachment in an E-mail that can be found somewhere deep in a large MBox
file ... can have Exif tags that are indexable (when Base64 decoded).

Surely you don't want to Base64 decode all the attachments in the MBox
file to files in /tmp, and then extract those files' Exif tags? (well,
that's what Tracker did for its Evolution support).

Instead, you want to lay a Base64 decoder stream over the root stream
for the MBox file, and then analyze the image/png image that comes out
of the Base64 decoder stream.

That's what libstreamanalyzer does.
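
To illustrate the layering (this is not libstreamanalyzer's API, just a minimal Python sketch of the same idea), the class below is a read-only stream that decodes Base64 on the fly from an underlying stream, so an analyzer can read the decoded image without anything being written to /tmp. The names `Base64DecodingStream` and `png_dimensions` are made up for this example, reading the PNG width and height stands in for real Exif extraction, and the decoder assumes the usual MIME layout where every encoded line is a multiple of four characters and a blank line ends the body.

```python
import base64
import io
import struct
from typing import Tuple


class Base64DecodingStream(io.RawIOBase):
    """Read-only stream layered over another stream holding Base64 text."""

    def __init__(self, source: io.BufferedIOBase) -> None:
        self._source = source
        self._pending = b""  # decoded bytes not yet handed to the caller

    def readable(self) -> bool:
        return True

    def readinto(self, buffer) -> int:
        # Decode one encoded line at a time, only when the caller asks for it.
        while not self._pending:
            line = self._source.readline().strip()
            if not line:  # end of file or blank line: encoded body is over
                return 0
            self._pending = base64.b64decode(line)
        count = min(len(buffer), len(self._pending))
        buffer[:count] = self._pending[:count]
        self._pending = self._pending[count:]
        return count


def png_dimensions(stream: io.BufferedIOBase) -> Tuple[int, int]:
    """Toy analyzer: read width and height from a PNG's IHDR chunk."""
    header = stream.read(24)
    if header[:8] != b"\x89PNG\r\n\x1a\n":
        raise ValueError("not a PNG stream")
    width, height = struct.unpack(">II", header[16:24])
    return width, height
```

With the root stream positioned at the start of an attachment body, `png_dimensions(io.BufferedReader(Base64DecodingStream(root)))` reads only as many encoded lines as the analyzer actually needs.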

  subject to be the metadata, etc. So every time the user switches
  applications, the earlier collected metadata might need some brushing up.
 
 That assumes that the old meta data is somehow wrong. When an office
 changes staff the way stuff is indexed may change a bit but the old index
 doesn't become invalid or useless.

Right

  many sites exist. On the desktop the scale of things is smaller;
  individual application-provided search is enough and will satisfy the
  needs of most users. ctags, mairix etc. can provide specialized
  and more effective searching.
 
 The notion that the internet and personal file store are separate is one
 I would question.

Exactly. I think RDF metadata stores can be instrumental in bringing the
web and the desktop closer together. Which is among our goals.

 Why for example would I not be running a query across my personal email
 and a company wide accumulated metadata source of all the internal public
 mailing lists.

This would be possible if we'd first develop a protocol for doing remote
queries. Again a long term goal that might not even be part of Tracker
itself (we can easily proxy DBus over some TCP/IP or even UDP service).

You can probably imagine that this would require things like security
policies too? ;-) We wouldn't want random people accessing your data.

Of course.

 Specialized searching is also very different to general contexts. It is
 better at the one job but cannot answer random queries or associations.
 
 /home is a place where you keep stuff nobody else needs, or you
 want fast access to, or you particularly don't want other people to have
 access to. Indeed if you back up to an internet-connected server it's not
 unreasonable to argue that user filestore is simply a cache, nothing more.

Thanks for your input, Alan. It's very helpful.

-- 
Philip Van Hoof, freelance software developer
home: me at pvanhoof dot be 
gnome: pvanhoof at gnome dot org 
http://pvanhoof.be/blog
http://codeminded.be



Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Ivan Frade
Hi,

On Wed, Aug 19, 2009 at 2:11 AM, Thomas H.P. Andersen pho...@gmail.com wrote:

 On Tue, Aug 18, 2009 at 22:48, Matthias Clasen matthias.cla...@gmail.com
 wrote:
  I think this recent discussion about tracker as a gnome module is
  somewhat backwards. I don't think it is leading us anywhere to talk
  about ontologies and rdf and events and timelines and metadata stores
  and kernel apis before we answer the first question:
 
  What is the user problem that we are solving here ?
  Can that be described in a paragraph ?
  And if it can, is it something that a 'regular' user would recognize
  as a problem he has on his computer ?


 The basic problem is to link information and make it available outside the
application silos. The desktop is full of data, but there is no way to mix
that data. IM, contacts and email are very close to each other, but
we cannot build a Google Wave-like window.

 The classic search engines (tracker 0.6, beagle) were fighting against tons
of formats to reconstruct information. The application knows that some
information is a contact and saves it in a vCard; tracker then reads the vCard
and tries to reconstruct the contact. This approach tends to fail: the
reconstruction is never complete, and a lot of guessing is needed.

 So the new approach in tracker 0.7 is to offer a common schema, and let the
application push the information directly. Skip the file step, so no
information is lost in the file round trip.
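
To make the "push into a common schema" idea concrete, here is a minimal sketch. It is only an illustration, not Tracker's actual API: the TripleStore class, the predicate names ("type", "fullname", "sentBy") and the identifiers are all made up for this example. Two applications write facts into one shared store, and a third asks a question that spans both.

```python
class TripleStore:
    """Toy shared metadata store (hypothetical; not Tracker's real interface)."""

    def __init__(self):
        self._triples = set()

    def insert(self, subject, predicate, obj):
        """Record one fact as a (subject, predicate, object) triple."""
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return all triples matching the given fields; None is a wildcard."""
        return sorted(
            t for t in self._triples
            if (subject is None or t[0] == subject)
            and (predicate is None or t[1] == predicate)
            and (obj is None or t[2] == obj)
        )


store = TripleStore()

# An address-book application pushes a contact directly, with no vCard step.
store.insert("contact:alice", "type", "Contact")
store.insert("contact:alice", "fullname", "Alice Example")

# A mail client pushes what it knows about a received attachment.
store.insert("file:///home/me/photo.jpg", "type", "Photo")
store.insert("file:///home/me/photo.jpg", "sentBy", "contact:alice")

# An image viewer can now ask for "photos sent by Alice" without parsing files.
photos_from_alice = [s for s, _, _ in store.match(predicate="sentBy",
                                                  obj="contact:alice")]
```

Because both writers agree on the identifier "contact:alice", nothing has to be guessed or reconstructed on the reading side.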

 What does this mean to the user? That he can see related information
everywhere.

For instance, take EOG:
* you could filter by photos sent by...
* you could open an IM conversation with the person who sent you the picture
* you could have a tag cloud, including your F-Spot tags
* If zeitgeist sets relations between photos, it could suggest related
documents
* In EOG you can ask for photos tagged as GCDS and find
local/flickr/facebook results with that tag

This could apply also to totem or rhythmbox. s/pictures/songs/

The browser:
* Bookmark a page in epiphany, and it is available in firefox.

Another example:
* Somebody writes: "Hey, have you taken a look at the document I sent you
yesterday?" and your dashboard shows the last document sent by that contact
and his latest blog posts. It can show that not because it understands the
"have you taken a look..." but because it knows you are talking with a
certain contact. I.e. no magic or wonderful inference: just queries to a
well-structured database.
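
As a rough sketch of what "just queries" could look like, assuming the data already sits in one structured store: the example below uses plain SQLite as a stand-in, and the table layout, column names and sample rows are invented for illustration (they are not Tracker's schema or query language). The dashboard only needs to know which contact you are talking to.

```python
import sqlite3

# In-memory database standing in for the shared store (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contacts (
    id INTEGER PRIMARY KEY,
    name TEXT,
    im_account TEXT
);
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    title TEXT,
    sender_id INTEGER REFERENCES contacts(id),
    received TEXT  -- ISO 8601 timestamp, so text order is time order
);
""")
conn.execute("INSERT INTO contacts VALUES (1, 'Alice Example', 'alice@example.org')")
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "budget.ods", 1, "2009-08-17T10:00:00"),
        (2, "proposal.odt", 1, "2009-08-18T16:30:00"),
    ],
)


def last_document_from(conn, im_account):
    """Return the title of the newest document sent by the given IM contact."""
    row = conn.execute(
        """
        SELECT d.title
        FROM documents AS d
        JOIN contacts AS c ON c.id = d.sender_id
        WHERE c.im_account = ?
        ORDER BY d.received DESC
        LIMIT 1
        """,
        (im_account,),
    ).fetchone()
    return row[0] if row else None
```

An IM client would call last_document_from(conn, "alice@example.org") when a conversation with that account gets focus; no understanding of the message text is involved.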

Right now, all the information is already available somewhere (evo,
telepathy, pidgin, ...), but these use cases are impossible to implement.




 Once we have the problem scoped out, we need to look at the user
 experience we want to aim for in solving it. Will it be a single
 search-for-everything dialog ? A query language ? Tagging everywhere ?

 The dedicated search window is dead. The applications are the clients of
this central storage: the application knows what it is showing, so it knows
what can be related to it!

 After that, it might be possible to evaluate whether tracker,
 zeitgeist, couchdb or something else can be part of the
 implementation...

 Zeitgeist + tracker are complementary (and a nice team together).

 CouchDB is also a store, but with a different philosophy. The nicest part
is the synchronization... but maybe we could wrap tracker in similar
code to allow online replication. This is just a wild guess.

 Regards,

Ivan

Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread John Carr
On Wed, Aug 19, 2009 at 2:53 PM, Ivan Frade ivan.fr...@gmail.com wrote:
 Hi,

 On Wed, Aug 19, 2009 at 2:11 AM, Thomas H.P. Andersen pho...@gmail.com
 wrote:

 On Tue, Aug 18, 2009 at 22:48, Matthias Clasen matthias.cla...@gmail.com
 wrote:
  I think this recent discussion about tracker as a gnome module is
  somewhat backwards. I don't think it is leading us anywhere to talk
  about ontologies and rdf and events and timelines and metadata stores
  and kernel apis before we answer the first question:
 
  What is the user problem that we are solving here ?
  Can that be described in a paragraph ?
  And if it can, is it something that a 'regular' user would recognize
  as a problem he has on his computer ?

  The basic problem is to link information and make it available outside the
 application silos. The desktop is full of data, but there is no way to combine
 it. IM conversations, contacts and email are closely related, but
 we cannot build a Google Wave-like window.

  The classic search engines (tracker 0.6, beagle) were fighting against tons
 of formats to reconstruct information. The application knows that some
 information is a contact and saves it in a vCard; tracker then reads the vCard
 and tries to reconstruct the contact. This approach tends to fail: the
 reconstruction is never complete, and a lot of guessing is needed.

  So the new approach in tracker 0.7 is to offer a common schema, and let the
 application push the information directly. Skip the file step, so no
 information is lost in the file round trip.

  What does this mean to the user? That he can see related information
 everywhere.

 For instance, take EOG:
 * you could filter by photos sent by...
 * you could open an IM conversation with the person who sent you the picture
 * you could have a tag cloud, including your F-Spot tags
 * If zeitgeist sets relations between photos, it could suggest related
 documents
 * In EOG you can ask for photos tagged as GCDS and find
 local/flickr/facebook results with that tag

 This could apply also to totem or rhythmbox. s/pictures/songs/

 The browser:
 * Bookmark a page in epiphany, and it is available in firefox.

 Another example:
 * Somebody writes: "Hey, have you taken a look at the document I sent you
 yesterday?" and your dashboard shows the last document sent by that contact
 and his latest blog posts. It can show that not because it understands the
 "have you taken a look..." but because it knows you are talking with a
 certain contact. I.e. no magic or wonderful inference: just queries to a
 well-structured database.

 Right now, all the information is already available somewhere (evo,
 telepathy, pidgin, ...), but these use cases are impossible to implement.


Thanks for this explanation - it's one of the best ones I've read :)


 Once we have the problem scoped out, we need to look at the user
 experience we want to aim for in solving it. Will it be a single
 search-for-everything dialog ? A query language ? Tagging everywhere ?
  The dedicated search window is dead. The applications are the clients of
 this central storage: the application knows what it is showing, so it knows
 what can be related to it!

 After that, it might be possible to evaluate whether tracker,
 zeitgeist, couchdb or something else can be part of the
 implementation...

  Zeitgeist + tracker are complementary (and a nice team together).

  CouchDB is also a store, but with a different philosophy. The nicest part
 is the synchronization... but maybe we could wrap tracker in similar
 code to allow online replication. This is just a wild guess.

I think it's possible - here is a little something I've been hacking up
on my train to work for couch pulling from tracker:

http://github.com/Jc2k/tracker-replicator/tree/master

And I've been talking to Rodrigo - couchdb-glib might make
implementing a client of this even easier :)

  Regards,

 Ivan

John


Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Jamie McCracken
On Wed, 2009-08-19 at 15:50 +0100, John Carr wrote:

   CouchDB is also a store, but with a different philosophy. The nicest part
  is the synchronization... but maybe we could wrap tracker in similar
  code to allow online replication. This is just a wild guess.
 
 I think it's possible - here is a little something I've been hacking up
 on my train to work for couch pulling from tracker:
 
 http://github.com/Jc2k/tracker-replicator/tree/master
 

very nice work

it's great to see we can take advantage of all three technologies and
integrate them seamlessly

i also think there is some mileage in using couchdb as the primary store
and just have tracker index that. one of the strengths of couchdb is its
mvcc architecture which means it's completely corruption proof as updates
are always appended to the db and therefore cannot cause corruption of
existing data. that in itself makes it an excellent storage medium and
with all querying left to tracker and sqlite you can get the best of
both
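
The append-only idea can be sketched in a few lines. This is a toy model under stated assumptions, not CouchDB's storage format or API: the AppendOnlyStore class and its method names are invented here, and on-disk details and compaction are ignored. The point is only that an update adds a new revision and never touches what was written before.

```python
import copy


class AppendOnlyStore:
    """Toy append-only document store (hypothetical; not CouchDB's API)."""

    def __init__(self):
        self._log = []      # every revision ever written, oldest first
        self._latest = {}   # doc_id -> index of the newest revision in _log

    def put(self, doc_id, body):
        """Append a new revision; existing log entries are never modified."""
        revision = sum(1 for d, _, _ in self._log if d == doc_id) + 1
        self._log.append((doc_id, revision, copy.deepcopy(body)))
        self._latest[doc_id] = len(self._log) - 1
        return revision

    def get(self, doc_id):
        """Return a copy of the newest revision of a document."""
        return copy.deepcopy(self._log[self._latest[doc_id]][2])

    def history(self, doc_id):
        """Return all (revision, body) pairs for a document, oldest first."""
        return [(rev, copy.deepcopy(body))
                for d, rev, body in self._log if d == doc_id]


store = AppendOnlyStore()
store.put("contact:alice", {"name": "Alice", "im": "alice@example.org"})
store.put("contact:alice", {"name": "Alice Example", "im": "alice@example.org"})

current = store.get("contact:alice")
revisions = store.history("contact:alice")
```

In this model an interrupted or bad write can at worst add a faulty entry at the end of the log; earlier revisions stay readable, which is the property an indexer layered on top would rely on.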

jamie



Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Ross Burton
On Wed, 2009-08-19 at 11:23 -0400, Jamie McCracken wrote:
 i also think there is some mileage in using couchdb as the primary store
 and just have tracker index that. one of the strengths of couchdb is its
 mvcc architecture which means it's completely corruption proof as updates
 are always appended to the db and therefore cannot cause corruption of
 existing data. that in itself makes it an excellent storage medium and
 with all querying left to tracker and sqlite you can get the best of
 both

As couchdb only appends, would it be trivial to layer CouchDB on top of
a VCS such as Monotone to get historical data for free?  I can't think
of any use-cases for this straight away but I'm sure someone can.

Ross
-- 
Ross Burton mail: r...@burtonini.com
  jabber: r...@burtonini.com
   www: http://burtonini.com



Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Alberto Ruiz
2009/8/19 Ross Burton r...@burtonini.com:
 On Wed, 2009-08-19 at 11:23 -0400, Jamie McCracken wrote:
 i also think there is some mileage in using couchdb as the primary store
 and just have tracker index that. one of the strengths of couchdb is its
 mvcc architecture which means it's completely corruption proof as updates
 are always appended to the db and therefore cannot cause corruption of
 existing data. that in itself makes it an excellent storage medium and
 with all querying left to tracker and sqlite you can get the best of
 both

 As couchdb only appends, would it be trivial to layer CouchDB on top of
 a VCS such as Monotone to get historical data for free?  I can't think
 of any use-cases for this straight away but I'm sure someone can.

Oh the irony, this thread is turning into a bigger crack than the one
it was complaining about at the beginning.

 Ross
 --
 Ross Burton                                 mail: r...@burtonini.com
                                          jabber: r...@burtonini.com
                                           www: http://burtonini.com





-- 
Regards,
Alberto Ruiz


Re: Tracker, Zeitgeist, Couchdb...where is the problem ?

2009-08-19 Thread Rob Taylor
Ross Burton wrote:
 On Wed, 2009-08-19 at 11:23 -0400, Jamie McCracken wrote:
 i also think there is some mileage in using couchdb as the primary store
 and just have tracker index that. one of the strengths of couchdb is its
 mvcc architecture which means it's completely corruption proof as updates
 are always appended to the db and therefore cannot cause corruption of
 existing data. that in itself makes it an excellent storage medium and
 with all querying left to tracker and sqlite you can get the best of
 both
 
 As couchdb only appends, would it be trivial to layer CouchDB on top of
 a VCS such as Monotone to get historical data for free?  I can't think
 of any use-cases for this straight away but I'm sure someone can.

That's starting to sound a lot like Wizbit [1], right? I've kinda given
up on that particular pipe of crack ;) .. at least for now..

Rob

[1] http://www.wizbit.org/drupal/node/3

 Ross
 
 
 
 


-- 
Rob Taylor, Codethink Ltd. - http://codethink.co.uk


Re: Bumping/Dropping the PolicyKit external dep

2009-08-19 Thread Milan Bouchet-Valat
Frederic Peters wrote:
  - gnome-system-tools still depends on the old version, and I don't
see it listed in the Fedora feature page
As of 2.27.3, released today, the gnome-system-tools use PolicyKit1
(well, actually, polkit-gtk-1). They require the system-tools-backends
2.8 or above for that.

They are not listed on the Fedora wiki since Fedora doesn't use them
(sadly...).


That said, I've just checked the external dependency list, and PolicyKit
is still listed at 0.9. Is the list outdated?


Cheers




Re: New module proposal: tracker

2009-08-19 Thread Andre Klapper
On Tuesday, 18.08.2009, at 13:05 +0100, Martyn Russell wrote:
 So I would like to propose Tracker as a new GNOME module.

Getting this back to inclusion requirements:
- GNOME3 readiness:
  Please fix http://bugzilla.gnome.org/show_bug.cgi?id=581984
- I18N:
  Please fix http://bugzilla.gnome.org/show_bug.cgi?id=592400

Thanks,
andre
-- 
 mailto:ak...@gmx.net | failed
 http://www.iomc.de/  | http://blogs.gnome.org/aklapper
