Hi David,
Thanks for your review! Here are the clarifications:
1. You are right about the value range: the closing parenthesis excludes the
upper bound, so the largest supported value is 2^32 - 1.
2. Yes, we support sparsely populated bitmaps. We use RoaringBitmap[1] for the
internal implementation, which handles both sparse and dense data efficiently
via adaptive compression (Array/Bitmap/Run containers).
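To make the adaptive part a bit more concrete, here is a rough Python sketch of
how the Roaring format picks a container for each chunk of 2^16 values, as I
read the spec in [1]. The function name `describe_containers` is made up for
illustration, and run containers are left out to keep it short, so please treat
it as a simplified model rather than the actual Flink implementation.

```python
from collections import defaultdict

ARRAY_MAX = 4096  # a Roaring array container holds at most 4096 values


def describe_containers(values):
    """Group 32-bit values by their high 16 bits and pick a container kind.

    Simplified model: only array and bitmap containers are considered.
    """
    chunks = defaultdict(set)
    for v in values:
        if not 0 <= v < 2**32:
            raise ValueError(f"{v} is outside [0, 2^32)")
        # Chunk key = high 16 bits, stored payload = low 16 bits.
        chunks[v >> 16].add(v & 0xFFFF)

    result = {}
    for key, lows in sorted(chunks.items()):
        kind = "array" if len(lows) <= ARRAY_MAX else "bitmap"
        result[key] = (kind, len(lows))
    return result


sparse = describe_containers([1, 70_000, 4_000_000_000])
dense = describe_containers(range(10_000))
print(sparse)  # three chunks, each a one-element array container
print(dense)   # one chunk with 10000 values, stored as a bitmap container
```

In this model a few widely scattered values cost only a handful of small array
containers, while a dense chunk switches to a fixed-size bitmap container.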
3.1 Operations like BITMAP_AND and BITMAP_OR are set-algebra operations
(intersection/union) on the set of integers held by each bitmap, not bit-level
operations on the individual integers. Here is an example:
```
SELECT BITMAP_OR(BITMAP_BUILD(ARRAY[1,2,3]), BITMAP_BUILD(ARRAY[3,6,9]))
> {1,2,3,6,9}
```
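The same semantics can be mimicked with plain Python sets, which may help to
show the difference from bit-level operations on the integers themselves. This
is only an analogy: the sets stand in for BITMAP values, and nothing here is
Flink API.

```python
# Plain Python sets stand in for BITMAP values (an analogy, not Flink API).
a = {1, 2, 3}
b = {3, 6, 9}

union = a | b         # like BITMAP_OR: values present in either bitmap
intersection = a & b  # like BITMAP_AND: values present in both bitmaps
sym_diff = a ^ b      # like XOR: values present in exactly one bitmap

print(sorted(union))         # [1, 2, 3, 6, 9]
print(sorted(intersection))  # [3]
print(sorted(sym_diff))      # [1, 2, 6, 9]

# Contrast: a bit-level AND on two integers is a different operation.
print(3 & 6)                 # 2
```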
3.2 Bitmap-level null semantics are described in the section “BITMAP
Semantics”. Any integer value is either in the bitmap (bit 1) or not in it
(bit 0); there is no “null” state.
3.3 A bitmap stores integers only, so non-integer values have to be mapped to
integers before they are stored.
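For 3.3, one common way to do the mapping is a dictionary that assigns each
distinct value a dense integer ID. The sketch below is in Python, with a
hypothetical `IdDictionary` class that is not part of the FLIP; a plain set
stands in for the bitmap.

```python
class IdDictionary:
    """Assigns each distinct value a stable integer ID, starting at 0."""

    MAX_IDS = 2**32  # IDs must stay inside the supported range [0, 2^32)

    def __init__(self):
        self._ids = {}

    def id_for(self, value):
        # Hand out the next unused ID the first time a value is seen.
        if value not in self._ids:
            if len(self._ids) >= self.MAX_IDS:
                raise OverflowError("no IDs left in [0, 2^32)")
            self._ids[value] = len(self._ids)
        return self._ids[value]


users = IdDictionary()
bitmap = set()  # stand-in for a BITMAP value
for name in ["alice", "bob", "alice", "carol"]:
    bitmap.add(users.id_for(name))

print(sorted(bitmap))                 # [0, 1, 2]
print(users.id_for("bob") in bitmap)  # True
```

Where the dictionary lives (for example state or an external lookup table) is
up to the job; the bitmap only ever sees the integer IDs.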
As for the timestamp example, I’m not sure why or how you would store
timestamps in a bitmap. Could you explain a little more?
[1] https://github.com/RoaringBitmap/RoaringFormatSpec
Best regards,
dylanhz
> On 10 December 2025 at 20:21, David Radley <[email protected]> wrote:
>
> Hi,
> Looks good. A couple of thoughts :
>
> * It says “Only integers in the logical range [0, 2^32) are supported”. I
>   assume the top value should be 2^32-1.
> * I assume we can have sparsely populated bitmaps.
> * When we do an AND or OR, I assume this is at the bit level, not at the
>   integer level. Would a null value at position 6 be the same as 32 0’s for
>   the purposes of these operations? It would be useful to show some examples
>   around this for example around timestamps
>
> Kind regards, David.
>
>
> From: dylanhz <[email protected]>
> Date: Wednesday, 26 November 2025 at 12:09
> To: [email protected] <[email protected]>
> Subject: [EXTERNAL] Re: [DISCUSS] FLIP-556 Introduce BITMAP Data Type
>
> You are right that there is a binding, but I want to clarify: Bitmap is bound
> to Roaring as the external serialization format, not the internal
> implementation. The internal implementation can be changed independently
> without affecting users.
>
> If users need other serialization formats in the future, we can add a format
> parameter to Bitmap#toBytes and Bitmap#fromBytes methods, as well as their
> corresponding built-in functions. Users can then work with the BYTES type
> directly to serialize/deserialize in their preferred format when exchanging
> data with external systems.
>
>
> Best regards,
> dylanhz
>
>
>> On 26 November 2025 at 19:16, Xuyang <[email protected]> wrote:
>>
>> +1 for this feature.
>> It is good to see support for bitmaps, which lets Flink handle
>> computations in extremely high-dimensional scenarios. After reviewing the
>> entire FLIP, I have one question:
>> Regarding the Bitmap#toBytes interface, I noticed that it will output bytes
>> in RoaringBitmap format by default. Does this imply a strong binding to the
>> internal implementation of RoaringBitmap? For the writers on the sink table,
>> they need to be aware that the bytes are in RoaringBitmap format, right?
>>
>>
>>
>> --
>>
>> Best!
>> Xuyang
>>
>>
>>
>> At 2025-11-24 18:09:25, "Lincoln Lee" <[email protected]> wrote:
>>> +1 for this feature! Expanding the bitmap type will help users unlock more
>>> computation scenarios and integrate more easily with external systems.
>>>
>>>
>>> Best,
>>> Lincoln Lee
>>>
>>>
>>> On Fri, 21 Nov 2025 at 11:07, dylanhz <[email protected]> wrote:
>>>
>>>> Hi everyone,
>>>>
>>>>
>>>> I would like to start a discussion about FLIP-556 Introduce BITMAP Data
>>>> Type[1].
>>>>
>>>>
>>>> Flink currently has no native, compressed data type for large integer
>>>> sets, forcing users to rely on external libraries like RoaringBitmap via
>>>> UDFs.
>>>> This limits performance, maintainability, and integration with Flink’s
>>>> type system and SQL engine.
>>>> We propose adding a built‑in BITMAP type based on RoaringBitmap to provide
>>>> compact storage, exact deduplication, and efficient set operations (AND,
>>>> OR, XOR) directly within Flink.
>>>>
>>>>
>>>> I have had some initial discussions with @Lincoln Lee and @Jark Wu
>>>> regarding this FLIP.
>>>> Looking forward to your feedback and suggestions.
>>>>
>>>>
>>>> [1]
>>>> https://docs.google.com/document/d/1YNgIt93iFboogHMoKbDD4LjP5UrfqtF65hitGKRtKMs/edit?usp=sharing
>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Best regards,
>>>> dylanhz