This is an automated notification sent by LCG Savannah.
It relates to:
task #7875, project CDS Invenio
==============================================================================
LATEST MODIFICATIONS of task #7875:
==============================================================================
Follow-up Comment #1, task #7875 (project cdsware):
Going further on this we can imagine a 64-bit tree implementation where a
long long int of 64 bit can be a map where every bit represent an allocated
word of 64 bit, and this with a recursive implementation.
Imaging an Invenio installation with ~8,000,000 recids. To represent the set
with just the last recid would need 1MBytes of memory, to be compared with
the teoretical 32Bytes needed by the above tree structure.
Set operations on sparse set would also gain because it would not require
anymore to scroll the set entirely.
e.g.: assuming words of 4 bits (and two levels of tree)
set1 = 0010 set2 = 0100 set1 & set2 = 0010 & 0100 = 0000
| |
1010 1001 no need to descend the leaves
where currently (where trailing 0s are trimmed out):
set1 = 000000001010 set2 = 00001001 set1 & set2 = 00000000
==============================================================================
OVERVIEW of task #7875:
==============================================================================
URL:
<http://savannah.cern.ch/task/?7875>
Summary: Possible improvements to intbitset
Project: CDS Invenio
Submitted by: skaplun
Submitted on: 2008-09-22 09:16
Should Start On: 2008-09-22 00:00
Should be Finished on: 2008-09-22 00:00
Category: BibIndex
Priority: 2
Status: None
Privacy: Public
Percent Complete: 0%
Assigned to: skaplun
Open/Closed: Open
Discussion Lock: Any
Effort: 0.00
_______________________________________________________
Currently intbitset is not optimized for boolean AND operations and for
storing few big integers.
AND operations are currently performed by aligning the two intbitsets to the
larger one. Actually aligning to the shortest one should also be possible
(provided that they're not both infinite sets).
If intbitset is storing e.g. only one big integer, it will store all the
zeros representing all the previous big integers before the one currently
stored.
000000000000000000000000...000000000000000000000000001
This implementation is not very scalable and can be improved by adding the
notion of an offset, so that intbitset will keep in memory what the first bit
set to 1 is really representing.
_______________________________________________________
Follow-up Comments:
-------------------------------------------------------
Date: 2009-07-29 15:46 By: Samuele Kaplun <skaplun>
Going further on this we can imagine a 64-bit tree implementation where a
long long int of 64 bit can be a map where every bit represent an allocated
word of 64 bit, and this with a recursive implementation.
Imaging an Invenio installation with ~8,000,000 recids. To represent the set
with just the last recid would need 1MBytes of memory, to be compared with
the teoretical 32Bytes needed by the above tree structure.
Set operations on sparse set would also gain because it would not require
anymore to scroll the set entirely.
e.g.: assuming words of 4 bits (and two levels of tree)
set1 = 0010 set2 = 0100 set1 & set2 = 0010 & 0100 = 0000
| |
1010 1001 no need to descend the leaves
where currently (where trailing 0s are trimmed out):
set1 = 000000001010 set2 = 00001001 set1 & set2 = 00000000
_______________________________________________________
Carbon-Copy List:
CC Address | Comment
------------------------------------+-----------------------------
2195 | -SUB-
==============================================================================
This item URL is:
<http://savannah.cern.ch/task/?7875>
_______________________________________________
Message sent via/by LCG Savannah
http://savannah.cern.ch/