GitHub user yuruiz opened a pull request:
https://github.com/apache/orc/pull/273
ORC-343 Enable C++ writer to support RleV2
1. Port the RleV2 implementation from Java to C++
2. Add RleV2-related tests for C++
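For readers unfamiliar with RleV2, here is a minimal sketch of the run detection behind its Short Repeat sub-encoding. It is not the code in this pull request. The lower bound of 3 mirrors the `MIN_REPEAT` constant that appears in the RLEv2.hh diff, and the upper bound of 10 is the Short Repeat limit in the ORC specification. The names `kMinRepeat`, `kMaxShortRepeat`, `leadingRunLength` and `fitsShortRepeat` are hypothetical and chosen only for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Illustrative constants (assumptions, not taken from the PR's sources):
// kMinRepeat mirrors MIN_REPEAT from the RLEv2.hh diff; kMaxShortRepeat is
// the longest run the ORC spec allows for the Short Repeat sub-encoding.
constexpr std::size_t kMinRepeat = 3;
constexpr std::size_t kMaxShortRepeat = 10;

// Hypothetical helper: length of the run of identical values that starts
// at data[0]. Returns 0 for an empty input.
std::size_t leadingRunLength(const std::int64_t* data, std::size_t numValues) {
  if (numValues == 0) {
    return 0;
  }
  std::size_t run = 1;
  while (run < numValues && data[run] == data[0]) {
    ++run;
  }
  return run;
}

// Hypothetical helper: true when the leading run is long enough to be worth
// run-length encoding and short enough to fit a single Short Repeat header.
bool fitsShortRepeat(const std::int64_t* data, std::size_t numValues) {
  const std::size_t run = leadingRunLength(data, numValues);
  return run >= kMinRepeat && run <= kMaxShortRepeat;
}
```

A real encoder would also choose among the Direct, Patched Base and Delta sub-encodings for data that does not repeat. The sketch covers only the repeat check.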
You can merge this pull request into a Git repository by running
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191973999
--- Diff: c++/src/Writer.cc ---
@@ -38,9 +38,10 @@ namespace orc {
FileVersion fileVersion;
double dictionaryKeySizeThreshold;
bool
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191995946
--- Diff: c++/src/Writer.cc ---
@@ -122,9 +127,17 @@ namespace orc {
}
WriterOptions& WriterOptions::setFileVersion(const FileVer
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191995987
--- Diff: c++/src/RLE.hh ---
@@ -68,7 +76,24 @@ namespace orc {
* record current position
* @param recorder use the recorder to record
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191337994
--- Diff: c++/src/CMakeLists.txt ---
@@ -179,15 +179,15 @@ set(SOURCE_FILES
OrcFile.cc
Reader.cc
RLEv1.cc
- RLEv2.cc
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191340586
--- Diff: c++/src/RleEncoderV2.cc ---
@@ -0,0 +1,768 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor
GitHub user yuruiz commented on the issue:
https://github.com/apache/orc/pull/273
[benchmark.xlsx](https://github.com/apache/orc/files/2047317/benchmark.xlsx)
RleV2 benchmark
---
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191337473
--- Diff: c++/src/Writer.cc ---
@@ -38,9 +38,10 @@ namespace orc {
FileVersion fileVersion;
double dictionaryKeySizeThreshold;
bool
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191672633
--- Diff: c++/src/RleEncoderV2.cc ---
@@ -0,0 +1,768 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r192614780
--- Diff: c++/src/RLEv2.hh ---
@@ -25,13 +25,89 @@
#include
+#define MIN_REPEAT 3
+#define HIST_LEN 32
namespace orc
GitHub user yuruiz commented on a diff in the pull request:
https://github.com/apache/orc/pull/273#discussion_r191102909
--- Diff: c++/src/RLE.cc ---
@@ -64,4 +66,55 @@ namespace orc {
}
}
+ void RleEncoder::add(const int64_t* data, uint64_t numValues