lidavidm commented on code in PR #14176:
URL: https://github.com/apache/arrow/pull/14176#discussion_r1061739228
##########
format/Schema.fbs:
##########
@@ -178,6 +178,9 @@ table FixedSizeBinary {
table Bool {
}
+table RunEndEncoded {
+}
Review Comment:
Hmm, shouldn't this table encode the run-end index size (16, 32, or 64 bits), the way Int and Decimal above carry a bit width?
##########
docs/source/format/Columnar.rst:
##########
@@ -765,6 +765,71 @@ application.
We discuss dictionary encoding as it relates to serialization further
below.
+.. _run-end-encoded-layout:
+
+Run-End Encoded Layout
+-------------------------
+
+Run-end encoding (REE) is a variation of run-length encoding (RLE). These
+encodings are well-suited for representing data containing sequences of the
+same value, called runs. In run-end encoding, each run is represented as a
+value and an integer giving the index in the array where the run ends.
+
+Any array can be run-end encoded. A run-end encoded array has no buffers
+by itself, but has two child arrays. The first one holds a signed integer
+called a "run end" for each run. The run ends array can hold either 16, 32, or
+64-bit integers. The actual values of each run are held
+the second child array.
Review Comment:
```suggestion
64-bit integers. The actual values of each run are held in
the second child array.
```
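
A small worked example after this paragraph might also help readers. Below is a minimal Python sketch of the layout as described in this hunk, using plain lists to stand in for the two child arrays. The function names (`ree_encode`, `ree_decode`, `ree_get`) are hypothetical and not part of any Arrow API, and it assumes each run end is the exclusive logical index at which the run stops, which the text should probably state explicitly.

```python
from bisect import bisect_right


def ree_encode(values):
    """Return (run_ends, run_values) for a sequence of logical values.

    Assumption: run_ends[k] is the exclusive end index of run k, so the
    last run end equals the logical length of the array.
    """
    run_ends, run_values = [], []
    for i, value in enumerate(values):
        if run_values and run_values[-1] == value:
            # Same value as the previous element: extend the current run.
            run_ends[-1] = i + 1
        else:
            # A new value starts a new run.
            run_ends.append(i + 1)
            run_values.append(value)
    return run_ends, run_values


def ree_decode(run_ends, run_values):
    """Expand the two child lists back into the logical values."""
    out, start = [], 0
    for end, value in zip(run_ends, run_values):
        out.extend([value] * (end - start))
        start = end
    return out


def ree_get(run_ends, run_values, i):
    """Look up logical index i with a binary search over the run ends."""
    length = run_ends[-1] if run_ends else 0
    if not 0 <= i < length:
        raise IndexError(i)
    # The first run whose end is strictly greater than i contains index i.
    return run_values[bisect_right(run_ends, i)]


run_ends, run_values = ree_encode([1, 1, 1, 2, 2, 3])
```

Because the run ends are strictly increasing, random access is a binary search rather than a linear scan over run lengths, which is the main practical difference from plain RLE.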