Hi,
Looks good. A couple of thoughts:

  * It says “Only integers in the logical range [0, 2^32) are supported”. I
    assume that means the top value is 2^32-1.
  * I assume we can have sparsely populated bitmaps.
  * When we do an AND or OR, I assume this is at the bit level, not at the
    integer level. Would a null value at position 6 be the same as 32 0’s for
    the purposes of these operations? It would be useful to show some examples
    of this, for example with timestamps.
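
To make the questions above concrete, here is a minimal Python sketch of the semantics I would expect, assuming the bitmap behaves like a set of unsigned 32-bit integers (as RoaringBitmap does). It uses plain Python sets, not any Flink or RoaringBitmap API, and the names make_bitmap and MAX_VALUE are made up for illustration.

```python
# Illustrative model only: a bitmap as a set of integers in [0, 2**32).
# This is not the Flink or RoaringBitmap API.
MAX_VALUE = 2**32 - 1  # largest member of the half-open range [0, 2**32)


def make_bitmap(values):
    """Build a model bitmap, rejecting values outside [0, 2**32)."""
    bitmap = set()
    for v in values:
        if not 0 <= v <= MAX_VALUE:
            raise ValueError(f"{v} is outside [0, 2**32)")
        bitmap.add(v)
    return frozenset(bitmap)


# Sparse bitmaps: only the listed integers are present.
a = make_bitmap([1, 6, 1_000_000, MAX_VALUE])
b = make_bitmap([6, 7, MAX_VALUE])

# AND / OR act on membership (bit i of the bitmap),
# not on the binary digits of each stored integer.
and_result = a & b  # members present in both bitmaps
or_result = a | b   # members present in either bitmap

# An integer that is absent from one side does not survive AND.
assert 7 not in and_result
```

Under this reading, an integer that is missing from a bitmap is just an unset bit, so there are no per-position nulls. Whether the FLIP intends exactly this is the thing I would like confirmed.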

Kind regards, David.


From: dylanhz <[email protected]>
Date: Wednesday, 26 November 2025 at 12:09
To: [email protected] <[email protected]>
Subject: [EXTERNAL] Re: [DISCUSS] FLIP-556 Introduce BITMAP Data Type

You are right that there is a binding, but I want to clarify: Bitmap is bound 
to Roaring as the external serialization format, not the internal 
implementation. The internal implementation can be changed independently 
without affecting users.

If users need other serialization formats in the future, we can add a format 
parameter to the Bitmap#toBytes and Bitmap#fromBytes methods, as well as their 
corresponding built-in functions. Users can then work with the BYTES type 
directly to serialize/deserialize in their preferred format when exchanging 
data with external systems.
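
As a rough illustration of the format-parameter idea, here is a minimal Python sketch. Everything in it is hypothetical: the function names, the registry, and the "u32le" format (a sorted list of little-endian unsigned 32-bit integers) are stand-ins chosen to keep the example self-contained. A real implementation would default to the Roaring format, which is not reproduced here.

```python
import struct

# Hypothetical format registry; names and formats are illustrative only.
_SERIALIZERS = {}


def register_format(name, to_bytes, from_bytes):
    """Associate a format name with a (serialize, deserialize) pair."""
    _SERIALIZERS[name] = (to_bytes, from_bytes)


def _u32le_to_bytes(values):
    # Stand-in format: sorted members packed as little-endian uint32.
    ordered = sorted(values)
    return struct.pack(f"<{len(ordered)}I", *ordered)


def _u32le_from_bytes(data):
    count = len(data) // 4
    return set(struct.unpack(f"<{count}I", data))


register_format("u32le", _u32le_to_bytes, _u32le_from_bytes)


def bitmap_to_bytes(values, fmt="u32le"):
    """Serialize a set of integers using the named format."""
    to_bytes, _ = _SERIALIZERS[fmt]
    return to_bytes(values)


def bitmap_from_bytes(data, fmt="u32le"):
    """Deserialize bytes produced by bitmap_to_bytes with the same format."""
    _, from_bytes = _SERIALIZERS[fmt]
    return from_bytes(data)


payload = bitmap_to_bytes({1, 6, 4294967295})
restored = bitmap_from_bytes(payload)
```

The point of the sketch is only that the format is chosen at the serialization boundary, so the in-memory representation can change without affecting the bytes that users exchange with external systems.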


Best regards,
dylanhz


> On 26 November 2025 at 19:16, Xuyang <[email protected]> wrote:
>
> +1 for this feature.
> It is good to see bitmap support that enables Flink to handle computations 
> in extremely high-dimensional scenarios. After reviewing the entire FLIP, I 
> have one question:
> Regarding the Bitmap#toBytes interface, I noticed that it will output bytes 
> in RoaringBitmap format by default. Does this imply a strong binding to the 
> internal implementation of RoaringBitmap? Writers for the sink table would 
> need to be aware that the bytes are in RoaringBitmap format, right?
>
>
>
> --
>
>    Best!
>    Xuyang
>
>
>
> On 2025-11-24 18:09:25, "Lincoln Lee" <[email protected]> wrote:
>> +1 for this feature! Expanding the bitmap type will help users unlock more
>> computation scenarios and integrate more easily with external systems.
>>
>>
>> Best,
>> Lincoln Lee
>>
>>
>> On Friday, 21 November 2025 at 11:07, dylanhz <[email protected]> wrote:
>>
>>> Hi everyone,
>>>
>>>
>>> I would like to start a discussion about FLIP-556 Introduce BITMAP Data
>>> Type[1].
>>>
>>>
>>> Flink currently has no native, compressed data type for large integer
>>> sets, forcing users to rely on external libraries like RoaringBitmap via
>>> UDFs.
>>> This limits performance, maintainability, and integration with Flink’s
>>> type system and SQL engine.
>>> We propose adding a built‑in BITMAP type based on RoaringBitmap to provide
>>> compact storage, exact deduplication, and efficient set operations (AND,
>>> OR, XOR) directly within Flink.
>>>
>>>
>>> I have had some initial discussions with @Lincoln Lee and @Jark Wu
>>> regarding this FLIP.
>>> Looking forward to your feedback and suggestions.
>>>
>>>
>>> [1]
>>> https://docs.google.com/document/d/1YNgIt93iFboogHMoKbDD4LjP5UrfqtF65hitGKRtKMs/edit?usp=sharing
>>>
>>>
>>>
>>> --
>>>
>>> Best regards,
>>> dylanhz


