GitHub user felipecrv added a comment to the discussion: Run end encoding and 
constant random access

> This allows relatively efficient random access from a logical index using 
> binary search.

This is said in comparison to run-length encoding which stores the lengths of 
each run instead of the index of the run-end (i.e. the cumulative sum of the 
run lengths).

> However wouldn’t this be O(log N) in the worst case and violate the 
> constant-time random access guarantee?

Yes, what is this guarantee mentioned so we can point to the exception?

Even though `for i in range(n): get(i)` is `O(n log n)`, it's possible to scan 
sequentially without having to run any binary-search. Every time you increment 
the index, you can check if it still belongs to the current run or the 
immediate next one.

GitHub link: 
https://github.com/apache/arrow/discussions/46577#discussioncomment-13273484

----
This is an automatically sent email for user@arrow.apache.org.
To unsubscribe, please send an email to: user-unsubscr...@arrow.apache.org

Reply via email to