[
https://issues.apache.org/jira/browse/ORC-1288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17618940#comment-17618940
]
noirello commented on ORC-1288:
-------------------------------
This is really odd. I first noticed this kind of error a couple of weeks ago, when
conda-forge tried to rebuild the pyorc module with ORC 1.8.0. It's not
exactly the best example, because I have other errors in that pipeline (a missing
zlib library), but the Linux PyPy builds also failed with double free
errors on the [Azure
pipeline|https://dev.azure.com/conda-forge/feedstock-builds/_build/results?buildId=578425&view=logs&j=bb1c2637-64c6-57bd-9ea6-93823b2df951&t=350df31b-3291-5209-0bb7-031395f0baa1&l=479%C2%A0].
Then I tried to update the main pyorc repository to 1.8.0 and hit the same
error locally (on WSL with Ubuntu 20.04). However, I've just pushed
a new branch and none of the
[tests|https://noirello.visualstudio.com/pyorc/_build/results?buildId=547&view=logs&jobId=75d2528f-034b-5da0-f30d-411defbb2b02&j=f6fb4924-a9db-5bb0-8b06-98c3591fc924&t=584832f6-6be3-5103-aaa0-3d02b9106a45]
fail with a memory error on the CI pipeline with 1.8.0.
At this point I'm completely lost as to what's happening here (or, more
precisely, on my computer). I think we can probably close this issue.
> [C++] Invalid memory freeing with ZLIB compression
> --------------------------------------------------
>
> Key: ORC-1288
> URL: https://issues.apache.org/jira/browse/ORC-1288
> Project: ORC
> Issue Type: Bug
> Affects Versions: 1.8.0
> Reporter: noirello
> Priority: Major
>
> The following simple example ends with a segfault / munmap_chunk(): invalid pointer:
> {code:cpp}
> #include "orc/Common.hh"
> #include "orc/OrcFile.hh"
> using namespace orc;
> int main(void) {
>   ORC_UNIQUE_PTR<OutputStream> outStream = writeLocalFile("test_file.orc");
>   ORC_UNIQUE_PTR<Type> schema(Type::buildTypeFromString("struct<c0:int>"));
>   WriterOptions options;
>   options.setCompression(orc::CompressionKind_ZLIB);
>   options.setStripeSize(4096);
>   options.setCompressionBlockSize(4096);
>   ORC_UNIQUE_PTR<Writer> writer =
>       createWriter(*schema, outStream.get(), options);
>   uint64_t batchSize = 65535, rowCount = 10000000;
>   ORC_UNIQUE_PTR<ColumnVectorBatch> batch =
>       writer->createRowBatch(batchSize);
>   StructVectorBatch *root = dynamic_cast<StructVectorBatch *>(batch.get());
>   LongVectorBatch *c0 = dynamic_cast<LongVectorBatch *>(root->fields[0]);
>   uint64_t rows = 0;
>
>   for (uint64_t i = 0; i < rowCount; ++i) {
>     c0->data[rows] = i;
>     rows++;
>     if (rows == batchSize) {
>       root->numElements = rows;
>       c0->numElements = rows;
>       writer->add(*batch);
>       rows = 0;
>     }
>   }
>   if (rows != 0) {
>     root->numElements = rows;
>     c0->numElements = rows;
>     writer->add(*batch);
>     rows = 0;
>   }
>   writer->close();
>   return 0;
> }
> {code}
> The bug depends on the stripe size, the compression block size, and the number
> of records written to the file. I wasn't able to reproduce the error with
> compression kinds other than ZLIB.
> It looks to me like it's related to
> [ORC-1130|https://issues.apache.org/jira/projects/ORC/issues/ORC-1130]
> somehow, but I couldn't figure out how. (Reverting that change on the
> main branch resolved the issue.)
--
This message was sent by Atlassian Jira
(v8.20.10#820010)