[GitHub] orc pull request #277: ORC-372: Enable valgrind for C++ travis-ci tests
Github user majetideepak commented on a diff in the pull request: https://github.com/apache/orc/pull/277#discussion_r192919379 --- Diff: c++/src/TypeImpl.cc --- @@ -258,31 +258,34 @@ namespace orc { case STRUCT: { StructVectorBatch *result = new StructVectorBatch(capacity, memoryPool); + std::unique_ptr return_value = std::unique_ptr(result); --- End diff -- we need `result` of type `StructVectorBatch` to access fields on line 263 ---
[GitHub] orc pull request #277: ORC-372: Enable valgrind for C++ travis-ci tests
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/orc/pull/277#discussion_r192909966 --- Diff: c++/src/TypeImpl.cc --- @@ -498,54 +506,54 @@ namespace orc { } ORC_UNIQUE_PTR Type::buildTypeFromString(const std::string& input) { -std::vector > res = +std::vector > > res = TypeImpl::parseType(input, 0, input.size()); if (res.size() != 1) { throw std::logic_error("Invalid type string."); } -return ORC_UNIQUE_PTR(res[0].second); +return std::move(res[0].second); } Type* TypeImpl::parseArrayType(const std::string , --- End diff -- is it better to change return type to unique_ptr as well to be consistent? same for below. ---
[GitHub] orc pull request #277: ORC-372: Enable valgrind for C++ travis-ci tests
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/orc/pull/277#discussion_r192907281 --- Diff: c++/src/TypeImpl.cc --- @@ -258,31 +258,34 @@ namespace orc { case STRUCT: { StructVectorBatch *result = new StructVectorBatch(capacity, memoryPool); + std::unique_ptr return_value = std::unique_ptr(result); --- End diff -- maybe we can directly merge line 260 and 261 to a single statement. same for below. ---
[GitHub] orc pull request #277: ORC-372: Enable valgrind for C++ travis-ci tests
Github user wgtmac commented on a diff in the pull request: https://github.com/apache/orc/pull/277#discussion_r192910566 --- Diff: c++/src/Compression.cc --- @@ -282,6 +285,11 @@ DIAGNOSTIC_PUSH } } + void ZlibCompressionStream::end() { +(void)deflateEnd(); + } + + --- End diff -- remove this line ---
[GitHub] orc pull request #279: ORC-373: Option to disable dictionary encoding
Github user asfgit closed the pull request at: https://github.com/apache/orc/pull/279 ---
[GitHub] orc issue #279: ORC-373: Option to disable dictionary encoding
Github user prasanthj commented on the issue: https://github.com/apache/orc/pull/279 lgtm ---
[GitHub] orc issue #279: ORC-373: Option to disable dictionary encoding
Github user omalley commented on the issue: https://github.com/apache/orc/pull/279 Prasanth, rather than adding the flush count, which is really test code in the main source, how about creating a StringTreeWriter and catching the stream directly. I'd propose this - https://github.com/omalley/orc/blob/8a782d204f071e2bd16bcfd591a54319fadff132/java/core/src/test/org/apache/orc/TestStringDictionary.java#L251 ---
[GitHub] orc issue #277: ORC-372: Enable valgrind for C++ travis-ci tests
Github user majetideepak commented on the issue: https://github.com/apache/orc/pull/277 @wgtmac, @xndai can you take a look at this patch? ZLIB compression code seems to be leaking memory. Thanks! ---
[GitHub] orc issue #276: ORC-373: Option to disable dictionary encoding
Github user omalley commented on the issue: https://github.com/apache/orc/pull/276 @prasanthj yeah, in general you should make a dev branch for the pull request (eg. prasanthj:orc-373). Then you can do a force-push to the branch to update the PR. ---
[GitHub] orc pull request #273: ORC-343 Enable C++ writer to support RleV2
Github user majetideepak commented on a diff in the pull request: https://github.com/apache/orc/pull/273#discussion_r192729045 --- Diff: c++/test/TestRleEncoder.cc --- @@ -0,0 +1,243 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +#include + +#include "MemoryOutputStream.hh" +#include "RLEv1.hh" + +#include "wrap/orc-proto-wrapper.hh" +#include "wrap/gtest-wrapper.h" + +namespace orc { + + const int DEFAULT_MEM_STREAM_SIZE = 1024 * 1024; // 1M + + void generateData( + uint64_t numValues, + int64_t start, + int64_t delta, + bool random, + int64_t* data, + uint64_t numNulls = 0, + char* notNull = nullptr) { +if (numNulls != 0 && notNull != nullptr) { + memset(notNull, 1, numValues); + while (numNulls > 0) { +uint64_t pos = static_cast(std::rand()) % numValues; +if (notNull[pos]) { + notNull[pos] = static_cast(0); + --numNulls; +} + } +} + +for (uint64_t i = 0; i < numValues; ++i) { + if (notNull == nullptr || notNull[i]) + { +if (!random) { + data[i] = start + delta * static_cast(i); +} else { + data[i] = std::rand(); +} + } +} + } + + void decodeAndVerify( + RleVersion version, + const MemoryOutputStream& memStream, + int64_t * data, + uint64_t numValues, + const char* notNull, + bool isSinged) { +std::unique_ptr decoder = createRleDecoder( +std::unique_ptr(new SeekableArrayInputStream( +memStream.getData(), +memStream.getLength())), +isSinged, version, *getDefaultPool()); + +int64_t* decodedData = new int64_t[numValues]; +decoder->next(decodedData, numValues, notNull); + +for (uint64_t i = 0; i < numValues; ++i) { + if (!notNull || notNull[i]) { +EXPECT_EQ(data[i], decodedData[i]); + } +} + +delete [] decodedData; + } + + std::unique_ptr getEncoder(RleVersion version, +MemoryOutputStream& memStream, +bool isSigned) + { +MemoryPool * pool = getDefaultPool(); + +return createRleEncoder( +std::unique_ptr( +new BufferedOutputStream(*pool, , 500 * 1024, 1024)), +isSigned, version, *pool, true); --- End diff -- can we template these tests for `alignedBitpacking = false`? ---
[GitHub] orc issue #276: ORC-373: Option to disable dictionary encoding
Github user prasanthj commented on the issue: https://github.com/apache/orc/pull/276 @damiencarol added junit tests. my branches got messed up. So had to give another PR. #279 ---
[GitHub] orc pull request #279: ORC-373: Option to disable dictionary encoding
GitHub user prasanthj opened a pull request: https://github.com/apache/orc/pull/279 ORC-373: Option to disable dictionary encoding disables dictionary encoding at the time of string tree writer initialization. You can merge this pull request into a Git repository by running: $ git pull https://github.com/prasanthj/orc ORC-373 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/orc/pull/279.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #279 commit a2a091b23972523b2c55448808dbb695f19ee4ef Author: Prasanth Jayachandran Date: 2018-06-02T02:24:22Z ORC-373: Option to disable dictionary encoding commit 7b8f90223b3d1021eff17b52ce92417ed561548c Author: Prasanth Jayachandran Date: 2018-06-04T06:22:38Z Added unit test ---
[GitHub] orc pull request #276: ORC-373: Option to disable dictionary encoding
Github user prasanthj closed the pull request at: https://github.com/apache/orc/pull/276 ---