setop commented on issue #4799:
URL: https://github.com/apache/arrow-rs/issues/4799#issuecomment-2132836977

   Same issue with version 51.0.0
   
   From one big CSV split in two, I created the first Parquet file with 
parquet-cpp-arrow:
   
   ```
   Metadata for file: E2021.parquet
   
   version: 1
   num of rows: 4283692
   created by: parquet-cpp-arrow version 5.0.0
   metadata:
     ARROW:schema: 
/////4ABAAAQAAAAAAAKAAwABgAFAAgACgAAAAABBAAMAAAACAAIAAAABAAIAAAABAAAAAYAAAAUAQAA0AAAAJwAAABoAAAAOAAAAAQAAAAU////AAABAxAAAAAcAAAABAAAAAAAAAAFAAAAcHJpY2UABgAIAAYABgAAAAAAAgBE////AAABAhAAAAAUAAAABAAAAAAAAAADAAAAZGF5ADD///8AAAABQAAAAHD///8AAAECEAAAABgAAAAEAAAAAAAAAAUAAABtb250aAAAAGD///8AAAABQAAAAKD///8AAAECEAAAABgAAAAEAAAAAAAAAAQAAAB5ZWFyAAAAAJD///8AAAABQAAAAND///8AAAECEAAAABgAAAAEAAAAAAAAAAQAAABmdWVsAAAAAMD///8AAAABQAAAABAAFAAIAAYABwAMAAAAEAAQAAAAAAABAhAAAAAgAAAABAAAAAAAAAAFAAAAcGR2aWQAAAAIAAwACAAHAAgAAAAAAAABQAAAAAAAAAA=
   message schema {
     OPTIONAL INT64 pdvid;
     OPTIONAL INT64 fuel;
     OPTIONAL INT64 year;
     OPTIONAL INT64 month;
     OPTIONAL INT64 day;
     OPTIONAL DOUBLE price;
   }
   ```
   
   Then, using the same schema (`message schema { ... }` in a file), I created 
the second half with parquet-rs:
   
   ```
   Metadata for file: E2022.parquet
   
   version: 1
   num of rows: 5044596
   created by: parquet-rs version 51.0.0
   metadata:
     ARROW:schema: 
/////5ABAAAQAAAAAAAKAAwACgAJAAQACgAAABAAAAAAAQQACAAIAAAABAAIAAAABAAAAAYAAAAoAQAA5AAAALAAAAB8AAAATAAAABQAAAAQABYAEAAOAA8ABAAAAAgAEAAAABgAAAAcAAAAAAABAxgAAAAAAAYACAAGAAYAAAAAAAIAAAAAAAUAAABwcmljZQAAAET///8QAAAAGAAAAAAAAQIUAAAANP///0AAAAAAAAABAAAAAAMAAABkYXkAcP///xAAAAAYAAAAAAABAhQAAABg////QAAAAAAAAAEAAAAABQAAAG1vbnRoAAAAoP///xAAAAAYAAAAAAABAhQAAACQ////QAAAAAAAAAEAAAAABAAAAHllYXIAAAAA0P///xAAAAAYAAAAAAABAhQAAADA////QAAAAAAAAAEAAAAABAAAAGZ1ZWwAAAAAEAAUABAADgAPAAQAAAAIABAAAAAYAAAAIAAAAAAAAQIcAAAACAAMAAQACwAIAAAAQAAAAAAAAAEAAAAABQAAAHBkdmlkAAAA
   message arrow_schema {
     OPTIONAL INT64 pdvid;
     OPTIONAL INT64 fuel;
     OPTIONAL INT64 year;
     OPTIONAL INT64 month;
     OPTIONAL INT64 day;
     OPTIONAL DOUBLE price;
   }
   ```
   
   When I try to concat them, I get `Error: General("inputs must have the same 
schema ...`
   
   The only difference is the name of the root schema element: `schema` vs `arrow_schema`.
   
   ```diff
   --- a.txt       2024-05-27 09:32:48.409232203 +0200
   +++ b.txt       2024-05-27 09:32:55.073572763 +0200
   @@ -1,6 +1,6 @@
    GroupType {
        basic_info: BasicTypeInfo {
   -        name: \"schema\",
   +        name: \"arrow_schema\",
            repetition: None,
            converted_type: NONE,
            logical_type: None,
   ```
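
To make the failure mode concrete, here is a minimal sketch of why a strict equality check rejects two files like these. It is a toy model using only the standard library, not the `parquet` crate's types: `Column`, `Schema`, `same_columns` and `fuel_schema` are hypothetical names made up for illustration, and the assumption is that the concat check compares the whole schema, including the root name.

```rust
// Toy model of a Parquet schema. These are NOT the `parquet` crate's types;
// every name here is hypothetical and exists only for illustration.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
    physical_type: String, // e.g. "INT64" or "DOUBLE"
    optional: bool,
}

#[derive(Debug, Clone, PartialEq)]
struct Schema {
    // "schema" in the parquet-cpp-arrow file, "arrow_schema" in the parquet-rs file
    root_name: String,
    columns: Vec<Column>,
}

/// Relaxed comparison (an assumption, not an existing API): two schemas are
/// compatible when their columns match, whatever the root element is called.
fn same_columns(a: &Schema, b: &Schema) -> bool {
    a.columns == b.columns
}

/// Builds the six-column schema shown in the dumps above under a given root name.
fn fuel_schema(root_name: &str) -> Schema {
    let cols = [
        ("pdvid", "INT64"),
        ("fuel", "INT64"),
        ("year", "INT64"),
        ("month", "INT64"),
        ("day", "INT64"),
        ("price", "DOUBLE"),
    ];
    Schema {
        root_name: root_name.to_string(),
        columns: cols
            .iter()
            .map(|(name, physical_type)| Column {
                name: name.to_string(),
                physical_type: physical_type.to_string(),
                optional: true,
            })
            .collect(),
    }
}
```

With `let a = fuel_schema("schema");` and `let b = fuel_schema("arrow_schema");`, the derived `a == b` is false because of the root name alone, while `same_columns(&a, &b)` is true.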
   
   

