poppgs opened a new issue, #45090:
URL: https://github.com/apache/arrow/issues/45090

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   I have been given a parquet file with the following schema:
   
   ```
   Name: string
   Ident: string
   Date: date32[day]
   EventTimestamp: timestamp[us]
   ```
   I dumped the schema using a simple Python script:
   
   ```
   #! /usr/bin/env python3
   import pyarrow as pa
   import pyarrow.parquet as pq
   import sys
   
   schema = pq.read_schema(sys.argv[1])
   print(schema)
   ```
   I am trying to find out which C++ type corresponds to date32[day]. I've looked all 
over, and as far as I can tell the underlying data type is int32_t. But if I try to 
read the column using that type, I get:
   ```
   terminate called after throwing an instance of 'parquet::ParquetException'
     what():  Column converted type mismatch.  Column 'Date' has converted type 'DATE' not 'INT_32'
   ```
   I've also tried using a string but I get this error:
   
   ```
   terminate called after throwing an instance of 'parquet::ParquetException'
     what():  Column physical type mismatch.  Column 'Date' has physical type 'INT32' not 'BYTE_ARRAY'
   ```
   Here's my code, taken almost verbatim from the Parquet example page. 
Nothing I try allows me to read the DATE type:
   ```
   #include <cstdint>
   #include <cstdlib>
   #include <iostream>
   #include <memory>
   #include <string>
   #include "arrow/io/file.h"
   #include "parquet/exception.h"
   #include "parquet/stream_reader.h"
   #include "readfile.h"
   
   int main(int argc, char **argv)
   {
     if (argc < 2)
     {
       std::cout << "Usage: " << argv[0] << " <input filename>" << std::endl;
       exit(2);
     }
     char *input_filename = argv[1];
     std::cout << "Input filename is " << input_filename << std::endl;
     std::shared_ptr<arrow::io::ReadableFile> infile;
   
     PARQUET_ASSIGN_OR_THROW(
         infile,
         arrow::io::ReadableFile::Open(input_filename));
   
     parquet::StreamReader os{parquet::ParquetFileReader::Open(infile)};
     std::string name;
     std::string ident;
     int32_t date;
     std::string ts_event_utc;
   
     while (!os.eof())
     {
       // Reading 'Date' into an int32_t throws the "converted type mismatch" error shown above.
       os >> name >> ident >> date >> ts_event_utc >> parquet::EndRow;
     }
     return 0;
   }
   ```
   Looking at the code in stream_reader.cc, I see no reference to a DATE type in 
either the exceptions at the top of the file or the operators below.
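
   For context, Arrow defines date32[day] as a signed 32-bit count of days since the UNIX epoch (1970-01-01), so once the raw value is available as an int32_t, turning it into a calendar date could look like the sketch below. `FormatDate32` is a hypothetical helper written for illustration; it is not part of Arrow or Parquet, and it does not work around the StreamReader type check itself.

   ```cpp
   #include <cstdint>
   #include <ctime>
   #include <string>

   // Hypothetical helper (not an Arrow/Parquet API): formats a date32 value,
   // i.e. a signed 32-bit count of days since 1970-01-01, as YYYY-MM-DD.
   std::string FormatDate32(int32_t days_since_epoch) {
     // Widen before multiplying so the seconds count is not computed in 32 bits.
     const std::time_t seconds =
         static_cast<std::time_t>(days_since_epoch) * 86400;

     // std::gmtime interprets the value as UTC, so no time zone shift is applied.
     // It returns a pointer to shared static storage, so copy the result.
     const std::tm* utc_ptr = std::gmtime(&seconds);
     if (utc_ptr == nullptr) {
       return std::string();  // value not representable on this platform
     }
     const std::tm utc = *utc_ptr;

     char buf[32];
     const std::size_t written =
         std::strftime(buf, sizeof(buf), "%Y-%m-%d", &utc);
     return std::string(buf, written);
   }
   ```

   With this helper, `FormatDate32(0)` gives `1970-01-01` and `FormatDate32(19723)` gives `2024-01-01`.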
   
   ### Component(s)
   
   C++

