On Wed, 2007-03-14 at 17:27 +0100, Andrea Caccin wrote:
>
> 2007/3/14, jamie <[EMAIL PROTECTED]>:
>         On Mon, 2007-03-12 at 11:51 +0100, Andrea Caccin wrote:
>         > Hi all,
>         > I'm Andrea Caccin, a student of Computer Science at the
>         > University of Padua, Italy. I have some questions for you:
>         >
>         > 1. I'd like to know where I can find the source code of the
>         > ranking (frequency, weight) algorithm.
>
>         we will be moving towards IDF. At the moment it's just
>         weight * freq
>
>         > 2. What's the maximum size in bytes of a collection indexed
>         > by Tracker?
>
>         ?
>
>         > 3. What's the maximum number of documents indexed by
>         > Tracker?
>
>         2 billion
>
>         > Thanks in advance
>         > Andrea Caccin
>         > _______________________________________________
>         > tracker-list mailing list
>         > [email protected]
>         > http://mail.gnome.org/mailman/listinfo/tracker-list
>
> Thanks a lot for the answers, but for the second one I would like to
> know if there's a bound (in MB or GB) on the size of the documents
> indexed.

There is no per-document size limit as such.

There is an overall size limitation in SQLite, which can be found here:

http://www.sqlite.org/faq.html#q10

We compress the full text with zlib in SQLite, so bear in mind that the
size limits apply to the compressed size. The uncompressed text can be
up to 10x bigger than that (up to 10:1 compression).

> I would also like to know, if possible, the specific file that runs
> the ranking.

The rank as stored in the index can be found in tracker_parse_text:

    g_hash_table_insert (word_table, index_word, GINT_TO_POINTER (count + weight));

in file:

http://svn.gnome.org/viewcvs/tracker/trunk/src/trackerd/tracker-parser.c?revision=539&view=markup
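
To connect "weight * freq" with the "count + weight" insert, here is a minimal sketch in plain C. It is not Tracker's code and does not use GLib: WordTable, word_table_add, word_table_lookup and the two size constants are hypothetical names made up for illustration, and a small fixed array stands in for the hash table. The assumption is that each occurrence of a word adds that occurrence's weight to the stored value, so a word seen freq times with a constant weight ends up with weight * freq.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORDS 64     /* hypothetical capacity for this sketch */
#define MAX_WORD_LEN 32  /* hypothetical maximum word length, including NUL */

/* One entry per distinct word; score stands in for the hash table value. */
typedef struct {
    char word[MAX_WORD_LEN];
    int score;
} WordScore;

typedef struct {
    WordScore entries[MAX_WORDS];
    size_t used;
} WordTable;

/* Linear search; returns the entry for word, or NULL if it is not present. */
static WordScore *word_table_find(WordTable *table, const char *word)
{
    for (size_t i = 0; i < table->used; i++) {
        if (strcmp(table->entries[i].word, word) == 0)
            return &table->entries[i];
    }
    return NULL;
}

/* Returns the accumulated score for word, or 0 if it was never added. */
int word_table_lookup(WordTable *table, const char *word)
{
    assert(table != NULL && word != NULL);
    WordScore *entry = word_table_find(table, word);
    return entry ? entry->score : 0;
}

/* Adds weight to the score of word (the "count + weight" step).
 * Returns 0 on success, -1 if the table is full or the word is too long. */
int word_table_add(WordTable *table, const char *word, int weight)
{
    assert(table != NULL && word != NULL);
    WordScore *entry = word_table_find(table, word);
    if (entry == NULL) {
        size_t len = strlen(word);
        if (table->used == MAX_WORDS || len >= MAX_WORD_LEN)
            return -1;
        entry = &table->entries[table->used++];
        memcpy(entry->word, word, len + 1);  /* copies the terminating NUL too */
        entry->score = 0;
    }
    entry->score += weight;
    return 0;
}
```

With a zero-initialised WordTable t, calling word_table_add(&t, "tracker", 5) three times leaves word_table_lookup(&t, "tracker") at 15, i.e. weight 5 times frequency 3.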
