[ https://issues.apache.org/jira/browse/ARROW-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liya Fan updated ARROW-6307:
----------------------------
    Description: 
Run-length encoding (RLE) is a widely used encoding technique. Compared with
other encodings, RLE-encoded data is easier to work with directly.
  
We want to provide an RLE vector implementation in Arrow. The design details
include:

1. RleVector implements ValueVector.
2. The data structure of RleVector consists of an inner vector plus a buffer
storing the end indices of the runs.
3. Random access is supported, but it takes O(log(n)) time, so it should be
used sparingly.
4. In the future, we will provide iterators to access the vector in sequence.
5. RleVector does not support updates, but it supports appending.
6. In the future, we will provide an encoder/decoder to efficiently convert
between encoded and decoded vectors.
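
To make the design points above concrete, here is a minimal sketch in plain
Java. It is not the proposed RleVector and it does not use any Arrow API: the
class name RleSketch and all of its methods are hypothetical, and two int lists
stand in for the inner vector and the end-index buffer. It assumes that each
stored end index is exclusive and that random access uses a binary search over
the end indices, which is one way to get logarithmic lookup in the number of
runs.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the proposed RleVector; not part of Arrow. */
final class RleSketch {
    // Stand-in for the inner vector: one value per run.
    private final List<Integer> runValues = new ArrayList<>();
    // Stand-in for the end-index buffer: exclusive end index of each run.
    private final List<Integer> runEnds = new ArrayList<>();

    /** Appends one logical value, extending the last run if the value repeats. */
    void append(int value) {
        int last = runValues.size() - 1;
        if (last >= 0 && runValues.get(last) == value) {
            runEnds.set(last, runEnds.get(last) + 1);
        } else {
            int newEnd = size() + 1;
            runValues.add(value);
            runEnds.add(newEnd);
        }
    }

    /** Number of logical (decoded) values. */
    int size() {
        return runEnds.isEmpty() ? 0 : runEnds.get(runEnds.size() - 1);
    }

    /** Number of runs actually stored. */
    int runCount() {
        return runValues.size();
    }

    /** Random access: binary search for the first run whose end exceeds index. */
    int get(int index) {
        if (index < 0 || index >= size()) {
            throw new IndexOutOfBoundsException("index: " + index);
        }
        int lo = 0;
        int hi = runEnds.size() - 1;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (runEnds.get(mid) > index) {
                hi = mid;
            } else {
                lo = mid + 1;
            }
        }
        return runValues.get(lo);
    }
}
```

Appending 7, 7, 7, 9, 4, 4 to this sketch stores three runs with end indices
3, 4 and 6. There is deliberately no method that overwrites a value in place,
matching the append-only design in point 5.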
  

  was:
RLE (run length encoding) is a widely used encoding/decoding technique. 
Compared with other encoding/decoding techniques, it is easier to work with the 
encoded data. 
 
We want to provide an RLE vector implementation in Arrow. The design details 
include:
 
1. RleVector implements ValueVector.
2. the data structure of RleVector includes an inner vector, plus a repetition 
buffer. 
3. we do not provide random access over the RleVector
4. In the future, we will provide iterators to access the vector in sequence.
5. RleVector does not support update, but supports appending.
6. In the future, we will provide encoder/decoder to efficiently transform 
encoded/decoded vectors.
 


> [Java] Provide RLE vector
> -------------------------
>
>                 Key: ARROW-6307
>                 URL: https://issues.apache.org/jira/browse/ARROW-6307
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Java
>            Reporter: Liya Fan
>            Assignee: Liya Fan
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Run-length encoding (RLE) is a widely used encoding technique. Compared with
> other encodings, RLE-encoded data is easier to work with directly.
>
> We want to provide an RLE vector implementation in Arrow. The design details
> include:
>
> 1. RleVector implements ValueVector.
> 2. The data structure of RleVector consists of an inner vector plus a buffer
> storing the end indices of the runs.
> 3. Random access is supported, but it takes O(log(n)) time, so it should be
> used sparingly.
> 4. In the future, we will provide iterators to access the vector in sequence.
> 5. RleVector does not support updates, but it supports appending.
> 6. In the future, we will provide an encoder/decoder to efficiently convert
> between encoded and decoded vectors.
>   



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
