Thank you for the tip, Micah. It turns out I hadn’t built Arrow with any
compression support, and my file uses compression. After rebuilding with
compression enabled, I was able to read the file. The next problem is how to
read a Parquet date column (format: 2023-11-18) in C++. The rest of my code:
    parquet::StreamReader stream{ parquet::ParquetFileReader::Open(infile) };
    int date;
    //int32_t date;
    std::string symbol;
    while (!stream.eof())
    {
        stream >> date >> parquet::EndRow;
        // ...
    }
Error occurred: Column converted type mismatch. Column 'date' has converted
type 'DATE' not 'INT_32'.
I tried int, int32_t (same as int), and time_t (long); all give the same error.
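For reference, Parquet's DATE type is stored as a 32-bit integer counting days since the Unix epoch (1970-01-01), so even once the raw value can be read it still has to be converted to a calendar date. Below is a minimal sketch of that conversion only, assuming the raw int32 value is already available. The names CivilDate, days_to_ymd and format_date are hypothetical helpers and not part of Arrow or Parquet; the arithmetic is the well-known days-to-civil-date algorithm for the proleptic Gregorian calendar.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper type: a plain year/month/day triple.
struct CivilDate {
    int year;
    unsigned month;  // 1..12
    unsigned day;    // 1..31
};

// Converts a count of days since 1970-01-01 (the raw Parquet DATE value)
// to a proleptic Gregorian calendar date.
CivilDate days_to_ymd(std::int32_t days_since_epoch) {
    // Shift the epoch to 0000-03-01 so leap days fall at the end of a "year".
    const std::int64_t z = static_cast<std::int64_t>(days_since_epoch) + 719468;
    // A 400-year era always contains exactly 146097 days.
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const std::int64_t doe = z - era * 146097;                      // day of era
    const std::int64_t yoe =
        (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;      // year of era
    const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);  // day of year, March-based
    const std::int64_t mp = (5 * doy + 2) / 153;                    // month index, 0 = March
    const unsigned day = static_cast<unsigned>(doy - (153 * mp + 2) / 5 + 1);
    const unsigned month = static_cast<unsigned>(mp < 10 ? mp + 3 : mp - 9);
    // January and February belong to the following civil year.
    const int year = static_cast<int>(yoe + era * 400 + (month <= 2 ? 1 : 0));
    return {year, month, day};
}

// Formats the raw DATE value as YYYY-MM-DD.
std::string format_date(std::int32_t days_since_epoch) {
    const CivilDate d = days_to_ymd(days_since_epoch);
    char buf[32];
    std::snprintf(buf, sizeof(buf), "%04d-%02u-%02u", d.year, d.month, d.day);
    return std::string(buf);
}
```

With this sketch, format_date(19679) gives "2023-11-18" and format_date(0) gives "1970-01-01".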
Kind regards,
Nick
From: Micah Kornfield <[email protected]>
Sent: Friday, 17 November 2023 17:11
To: [email protected]
Subject: Re: [C++][Parquet] Unable to read memory??
I read this error as the program crashing because the code throws an
exception that isn't caught. Can you add code to catch the exception and
print the error message, which might be more informative?
Thanks,
Micah
On Friday, November 17, 2023, <[email protected]> wrote:
These were the steps I followed:
1. Download from Github - https://github.com/apache/arrow - unzip it
2. Open Developer PowerShell for VS 2022 as Administrator
3. cd D:\path_to_arrow\14.0.1
4. cd .\cpp
5. mkdir build
6. cd build
7. cmake .. -G "Visual Studio 17 2022" -A x64 -DARROW_BUILD_TESTS=ON
-DARROW_PARQUET=ON
8. Open arrow.sln files in build folder in VS
9. Build ALL_BUILD
10. Copy arrow.dll, arrow.pdb, parquet.dll, parquet.pdb files to Debug folder
of project
In VS project Solution Explorer > Properties:
1. C/C++ > General > Additional Include Directories: add src directory
2. Linker > General > Additional Library Directories: add build/release/Debug
directory
3. Linker > Input > Additional Dependencies: arrow.lib;parquet.lib
The CMake version is 3.27.7.
-----Original Message-----
From: Bryce Mecum <[email protected]>
Sent: Thursday, 16 November 2023 20:43
To: [email protected]
Subject: Re: [C++][Parquet] Unable to read memory??
Your code is correct so I think something else is going on. Can you give us
more details about your environment, such as how you're getting the Arrow C++
DLLs (nuget, conda, building from source) and how you're compiling your program?
On Thu, Nov 16, 2023 at 4:27 AM <[email protected]> wrote:
>
> Hi,
>
> I’m trying to get Parquet to work in C++. I have the following code:
>
> #include "arrow/io/api.h"
> #include "parquet/arrow/reader.h"
> #include "arrow/io/file.h"
> #include "parquet/stream_reader.h"
>
> int main()
> {
>     std::shared_ptr<arrow::io::ReadableFile> infile;
>
>     PARQUET_ASSIGN_OR_THROW(
>         infile,
>         arrow::io::ReadableFile::Open("D:/path_to_parquet_file/file.parquet"));
> }
>
> I get an error on PARQUET_ASSIGN_OR_THROW. It seems to be unable to read
> memory. Exception that I’m getting:
>
> Unhandled exception at 0x00007FFE2866CF19 in cpp.exe: Microsoft C++
> exception: parquet::ParquetStatusException at memory location
> 0x000000648DCFFC60.: parquet::ParquetStatusException at memory location
> 0x000000648DCFFC60.
>
> What is wrong with this code? I’m using VS Community 2022 and Windows 10
> 64bit.
>
> Kind regards,
>
> Nick