davidwendt commented on code in PR #43234:
URL: https://github.com/apache/arrow/pull/43234#discussion_r1683324340


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -283,6 +283,28 @@ UUID
    A specific UUID version is not required or guaranteed. This extension 
represents
    UUIDs as FixedSizeBinary(16) with big-endian notation and does not 
interpret the bytes in any way.
 
+8-bit Boolean
+====
+
+Bool8 represents a boolean value using 1 byte (8 bits) to store each value 
instead of only 1 bit as in
+the native Arrow Boolean type. Although less compact than the native 
representation, Bool8 may have
+better zero-copy compatibility with various systems that also store booleans 
using 1 byte.
+
+* Extension name: ``arrow.bool8``.
+
+* The storage type of this extension is ``Int8`` where:
+
+  * **false** is denoted by the value ``0``.
+  * **true** can be specified using any non-zero value.

Review Comment:
   To clarify the libcudf behavior: the referenced comparator converts each 
BOOL8 value to a C/C++ `bool` before comparing, so any non-zero value compares 
equal to any other non-zero value (nicely illustrated in the godbolt link here).
   The conversion happens in `column_device_view::element()`: when called 
through the type-dispatcher, a `BOOL8` column type maps to the native `bool` 
type as the column row data is read from memory.
   So libcudf does not require bool8 values to be normalized to 0 or 1; it 
relies on the underlying behavior of the C++ `bool` type.
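
   As a rough illustration of the semantics described above (this is not libcudf code; `read_bool8`, `equal_as_bool`, and `equal_as_int8` are hypothetical names made up for this sketch), converting each stored byte to `bool` before comparing makes all non-zero values equal, whereas comparing the raw `int8_t` storage would not:

```cpp
#include <cstdint>

// Hypothetical helper, not part of libcudf: read one stored Bool8 byte as a
// C++ bool. The integral-to-bool conversion maps 0 to false and every
// non-zero value to true.
constexpr bool read_bool8(std::int8_t stored) {
    return static_cast<bool>(stored);
}

// Compare two stored bytes after converting each one to bool.
constexpr bool equal_as_bool(std::int8_t lhs, std::int8_t rhs) {
    return read_bool8(lhs) == read_bool8(rhs);
}

// Compare the raw storage bytes without any conversion.
constexpr bool equal_as_int8(std::int8_t lhs, std::int8_t rhs) {
    return lhs == rhs;
}

// Two different non-zero bytes are both "true" once read as bool...
static_assert(equal_as_bool(1, 2), "non-zero values compare equal as bool");
static_assert(equal_as_bool(-1, 127), "sign does not matter, only non-zero");
// ...but they differ when compared as raw Int8 storage.
static_assert(!equal_as_int8(1, 2), "raw bytes 1 and 2 are different");
// Zero stays distinct from every non-zero value in both views.
static_assert(!equal_as_bool(0, 5), "false never equals true");
static_assert(equal_as_bool(0, 0), "false equals false");
```

   The checks are `static_assert`s, so compiling the snippet is enough to confirm them. It only models the conversion step; how libcudf actually dispatches on the column type is not shown here.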
   


