> On 18 Mar 2020, at 14:39, Claude Warren <cla...@xenei.com> wrote:
> 
>>> Shape Discussion:
>>> 
>>> as for getNumberOfBytes() it should return the maximum number of bytes
>>> returned by a getBits() call to a filter with this shape.  So yes, if there
>>> is a compressed internal representation, no it won't be that.  It is a
>>> method on Shape so it should literally be
>>> Math.ceil( getNumberOfBits() / 8.0 )
>>> 
>>> Basically, if you want to create an array that will fit all the bits
>>> returned by BloomFilter.iterator() you need an array of
>>> Shape.getNumberOfBytes().  And that is actually what I use it for.
> 
>> Then you are also mapping the index to a byte index and a bit within the
>> byte. So if you are doing these two actions then this is something that you
>> should control.
> 
> BloomFilter.getBits returns a long[].  That long[] may be shorter than the
> absolute number of bytes specified by Shape.  It also may be longer.
> 
> If you want to create a copy of the byte[] you have to know how long it
> should be.  The only way to determine that is from Shape, and currently
> only if you do the Ceil() method noted above.  There is a convenience in
> knowing how long (in bytes) the buffer can be.

Copy of what byte[]?

There is no method to create a byte[] for a BloomFilter. So no need for 
getNumberOfBytes().

Are you talking about compressing the long[] to a byte[] by truncating the 
final long into 1-8 bytes?

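    // Requires java.nio.ByteBuffer, java.nio.ByteOrder and java.util.Arrays.
    BloomFilter bf; // assumed to refer to an existing filter
    long[] bits = bf.getBits();
    // Little-endian order puts bit indexes 8k..8k+7 into byte k.
    ByteBuffer bb = ByteBuffer.allocate(bits.length * Long.BYTES)
                              .order(ByteOrder.LITTLE_ENDIAN);
    Arrays.stream(bits).forEachOrdered(bb::putLong);
    byte[] bytes = bb.array();
    // Trim (or zero-pad) to the minimum number of bytes for the shape.
    int expected = (int) Math.ceil(bf.getShape().getNumberOfBits() / 8.0);
    if (bytes.length != expected) {
        bytes = Arrays.copyOf(bytes, expected);
    }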

Truncating the final long saves at most 7 bytes, so for a BloomFilter of any 
reasonable number of bits the storage saving will be small.
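
To put rough numbers on that, here is a minimal sketch of the arithmetic. It 
assumes the long[] holds exactly ceil(numberOfBits / 64) entries, which 
getBits() does not guarantee. The class and method names (PackedSize, 
bytesAsLongs, bytesTruncated) are made up for illustration and are not part 
of the library.

```java
public class PackedSize {
    /** Bytes used when the bits are stored in whole 64-bit longs (assumed layout). */
    static long bytesAsLongs(long numberOfBits) {
        long longs = (numberOfBits + 63) / 64; // integer form of ceil(numberOfBits / 64)
        return longs * Long.BYTES;
    }

    /** Bytes used when the final long is truncated: ceil(numberOfBits / 8). */
    static long bytesTruncated(long numberOfBits) {
        return (numberOfBits + 7) / 8;
    }

    public static void main(String[] args) {
        // Print both sizes and the difference for a few filter sizes.
        for (long m : new long[] {1, 64, 65, 1000, 1_000_000}) {
            long saved = bytesAsLongs(m) - bytesTruncated(m);
            System.out.println(m + " bits: " + bytesAsLongs(m) + " -> "
                + bytesTruncated(m) + " bytes, saving " + saved);
        }
    }
}
```

Under that assumption the difference never exceeds Long.BYTES - 1, whatever 
the number of bits.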

Is this for serialisation? This is outside of the scope of the library.

