dirtysalt opened a new pull request #676:
URL: https://github.com/apache/orc/pull/676


   The first 5 prime numbers are [2,3,5,7,11], and the encoded data has three parts:
   - 2 [0xff, 0x02]
   - 3,5,7 (delta=2,length=3,base=3) [0x00, 0x02, 0x03]
   - 11 [0xff, 0x0b]
   
   BTW: if the numbers are [2,3,4,7,11], the encoded data is also wrong. It should be:
   - 2,3,4 (delta=1, length=3, base=2) [0x00, 0x01, 0x02]
   - 7,11 (literals, length=2) [0xfe, 0x07, 0x0b]
   
   I verified the encoded data with the following C++ code, which uses `RleEncoderV1` to encode the 5 prime numbers and dumps the result to a local file called `rle.data`.
   
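   ```c++
   // Header paths are assumed from the ORC C++ source tree; RLEv1.hh and
   // io/OutputStream.hh are internal headers (c++/src), so build inside it.
   #include <memory>
   #include "orc/MemoryPool.hh"
   #include "orc/OrcFile.hh"
   #include "io/OutputStream.hh"
   #include "RLEv1.hh"

   using namespace orc;

   int main() {
       auto pool = getDefaultPool();
       auto outputStream = writeLocalFile("rle.data");
       auto buf = new BufferedOutputStream(*pool, outputStream.get(), 1024, 32);
       std::unique_ptr<BufferedOutputStream> pbuf(buf);
       // false: encode the values as unsigned integers
       std::unique_ptr<RleEncoderV1> encoder(new RleEncoderV1(std::move(pbuf), false));
       // Write the first 5 prime numbers
       encoder->write(2);
       encoder->write(3);
       encoder->write(5);
       encoder->write(7);
       encoder->write(11);
       // Flush the pending runs and literals through to the output stream
       encoder->flush();
       return 0;
   }
   ```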
   
   I then ran `od -A n -t x1 rle.data` to view the data in hex. The result is as follows:
   
   ```
    ff 02 00 02 03 ff 0b
   ```
   



