This is an automated notification sent by LCG Savannah.
It relates to:
                task #7875, project CDS Invenio

==============================================================================
 LATEST MODIFICATIONS of task #7875:
==============================================================================

Follow-up Comment #1, task #7875 (project cdsware):

Going further on this we can imagine a 64-bit tree implementation where a
long long int of 64 bit can be a map where every bit represent an allocated
word of 64 bit, and this with a recursive implementation.

Imaging an Invenio installation with ~8,000,000 recids. To represent the set
with just the last recid would need 1MBytes of memory, to be compared with
the teoretical 32Bytes needed by the above tree structure.

Set operations on sparse set would also gain because it would not require
anymore to scroll the set entirely.

e.g.: assuming words of 4 bits (and two levels of tree)

set1 = 0010   set2 = 0100    set1 & set2 = 0010 & 0100 = 0000
         |            |
         1010         1001   no need to descend the leaves

where currently (where trailing 0s are trimmed out):
set1 = 000000001010 set2 = 00001001  set1 & set2 = 00000000





==============================================================================
 OVERVIEW of task #7875:
==============================================================================

URL:
  <http://savannah.cern.ch/task/?7875>

                 Summary: Possible improvements to intbitset
                 Project: CDS Invenio
            Submitted by: skaplun
            Submitted on: 2008-09-22 09:16
         Should Start On: 2008-09-22 00:00
   Should be Finished on: 2008-09-22 00:00
                Category: BibIndex
                Priority: 2
                  Status: None
                 Privacy: Public
        Percent Complete: 0%
             Assigned to: skaplun
             Open/Closed: Open
         Discussion Lock: Any
                  Effort: 0.00

    _______________________________________________________


Currently intbitset is not optimized for boolean AND operations and for
storing few big integers.

AND operations are currently performed by aligning the two intbitsets to the
larger one. Actually aligning to the shortest one should also be possible
(provided that they're not both infinite sets).

If intbitset is storing e.g. only one big integer, it will store all the
zeros representing all the previous big integers before the one currently
stored.
000000000000000000000000...000000000000000000000000001
This implementation is not very scalable and can be improved by adding the
notion of an offset, so that intbitset will keep in memory what the first bit
set to 1 is really representing.



    _______________________________________________________

Follow-up Comments:


-------------------------------------------------------
Date: 2009-07-29 15:46              By: Samuele Kaplun <skaplun>
Going further on this we can imagine a 64-bit tree implementation where a
long long int of 64 bit can be a map where every bit represent an allocated
word of 64 bit, and this with a recursive implementation.

Imaging an Invenio installation with ~8,000,000 recids. To represent the set
with just the last recid would need 1MBytes of memory, to be compared with
the teoretical 32Bytes needed by the above tree structure.

Set operations on sparse set would also gain because it would not require
anymore to scroll the set entirely.

e.g.: assuming words of 4 bits (and two levels of tree)

set1 = 0010   set2 = 0100    set1 & set2 = 0010 & 0100 = 0000
         |            |
         1010         1001   no need to descend the leaves

where currently (where trailing 0s are trimmed out):
set1 = 000000001010 set2 = 00001001  set1 & set2 = 00000000









    _______________________________________________________

Carbon-Copy List:

CC Address                          | Comment
------------------------------------+-----------------------------
2195                                | -SUB-




==============================================================================

This item URL is:
  <http://savannah.cern.ch/task/?7875>

_______________________________________________
  Message sent via/by LCG Savannah
  http://savannah.cern.ch/

Reply via email to