joellubi commented on code in PR #43234:
URL: https://github.com/apache/arrow/pull/43234#discussion_r1682707778


##########
docs/source/format/CanonicalExtensions.rst:
##########
@@ -283,6 +283,28 @@ UUID
   A specific UUID version is not required or guaranteed. This extension represents
   UUIDs as FixedSizeBinary(16) with big-endian notation and does not interpret the bytes in any way.
 
+8-bit Boolean
+=============
+
+Bool8 represents a boolean value using 1 byte (8 bits) to store each value instead of only 1 bit as in
+the native Arrow Boolean type. Although less compact that the native representation, Bool8 may have
+better zero-copy compatibility with various systems that also store booleans using 1 byte.
+
+* Extension name: ``arrow.bool8``.
+
+* The storage type of this extension is ``Int8`` where:

Review Comment:
   Hi @AlenkaF, either would work, but `int8` was chosen for easier 
multi-language support. A few places in the Arrow docs recommend signed 
integers for this reason, such as the entry for [dictionary 
indices](https://arrow.apache.org/docs/format/Columnar.html#dictionary-encoded-layout):
   > Since unsigned integers can be more difficult to work with in some cases 
(e.g. in the JVM), we recommend preferring signed integers over unsigned 
integers for representing dictionary indices. Additionally, we recommend 
avoiding using 64-bit unsigned integer indices unless they are required by an 
application.
   
   The same principles will likely make `Bool8` work better on the JVM if it's 
represented with signed integers.
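
A minimal sketch of the JVM point, not taken from the PR: Java's `byte` is always signed, so an `int8` storage value maps onto it directly, while a `uint8` value needs an explicit widening step before it can be read as 0..255. The class and method names below are hypothetical and exist only for illustration.

```java
// Hypothetical illustration; not part of the Arrow Java API.
public class Bool8SignednessSketch {

    // int8 storage: the raw byte already is the value, no conversion needed.
    static int readAsInt8(byte raw) {
        return raw;
    }

    // uint8 storage: Java has no unsigned byte type, so the value must be
    // widened to int and masked to recover the 0..255 range.
    static int readAsUint8(byte raw) {
        return Byte.toUnsignedInt(raw);
    }

    public static void main(String[] args) {
        byte raw = (byte) 0xFF; // bit pattern 1111_1111

        System.out.println(readAsInt8(raw));  // prints -1
        System.out.println(readAsUint8(raw)); // prints 255
    }
}
```

With signed storage, the value a Java caller sees matches the value the format defines, without the extra conversion.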



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
