[
https://issues.apache.org/jira/browse/ARROW-8543?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Antoine Pitrou reassigned ARROW-8543:
-------------------------------------
Assignee: Mayur Srivastava
> [C++] IO: single pass coalescing algorithm
> ------------------------------------------
>
> Key: ARROW-8543
> URL: https://issues.apache.org/jira/browse/ARROW-8543
> Project: Apache Arrow
> Issue Type: Improvement
> Components: C++
> Reporter: Mayur Srivastava
> Assignee: Mayur Srivastava
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.0.0
>
>
> The current coalescing algorithm is a two pass algorithm (where N is number
> of ranges) (first implemented in
> https://issues.apache.org/jira/browse/ARROW-7995). In the first pass, the
> Coalesce() function finds the begin and end of a candidate range that can be
> coalesced. In the second, pass the CoalesceUntilLargeEnough() function goes
> over the ranges from begin to end and adds coalesced range to the result
> (out).
> The proposal is to convert the algorithm to an single pass algorithm that
> computes coalesced ranges while making the first pass over the ranges in the
> list. This algorithm is also shorter in lines of code and hence (hopefully)
> more maintainable in long term.
> Correction: Post sorting, the current algorithm is O(N) and the improvement
> is O(N). I called the current algo O(N^2) due to an oversight.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)