[ https://issues.apache.org/jira/browse/ARROW-6737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joris Van den Bossche updated ARROW-6737:
-----------------------------------------
    Description:
{code}
from pyarrow import json
import pyarrow.parquet as pq

# Read a line-delimited JSON file into an Arrow table
r = json.read_json('example.jl')
# Writing the table to Parquet raises ArrowInvalid
pq.write_table(r, 'example.parquet')
{code}

The above operation results in {{ArrowInvalid: Nested column branch had multiple children}}.

Posting it here as per the request from https://github.com/apache/arrow/issues/4045#issuecomment-535867640

The sample schema looks like this:
{code}
package_version: string
source_version: string
uuid: string
_type: string
position: struct<ais_type: string, course: double, draught: double, draught_raw: null, heading: double, lat: double, lon: double, nav_state: int64, received_time: timestamp[s], speed: double>
  child 0, ais_type: string
  child 1, course: double
  child 2, draught: double
  child 3, draught_raw: null
  child 4, heading: double
  child 5, lat: double
  child 6, lon: double
  child 7, nav_state: int64
  child 8, received_time: timestamp[s]
  child 9, speed: double
provider_name: string
vessel: struct<beam: null, build_year: null, call_sign: string, dead_weight: null, dwt: null, flag_code: null, flag_name: string, gross_tonnage: null, imo: string, length: null, mmsi: string, name: string, type: null, vessel_type: string>
  child 0, beam: null
  child 1, build_year: null
  child 2, call_sign: string
  child 3, dead_weight: null
  child 4, dwt: null
  child 5, flag_code: null
  child 6, flag_name: string
  child 7, gross_tonnage: null
  child 8, imo: string
  child 9, length: null
  child 10, mmsi: string
  child 11, name: string
  child 12, type: null
  child 13, vessel_type: string
source_provider: string
{code}

> Nested column branch had multiple children
> ------------------------------------------
>
>                 Key: ARROW-6737
>                 URL: https://issues.apache.org/jira/browse/ARROW-6737
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: harikrishnan
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
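
A possible workaround, sketched under assumptions: if the error is triggered by the struct columns ({{position}}, {{vessel}}) having several children, one option may be to flatten the nested objects in each JSON line into top-level keys before loading the file. The helper names {{flatten_record}} and {{flatten_json_lines}} and the sample record below are hypothetical, and this has not been checked against the reporter's data or a specific pyarrow version.

```python
import json


def flatten_record(record, parent_key="", sep="."):
    """Expand nested dicts into dotted top-level keys (hypothetical helper)."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict) and value:
            # Recurse into non-empty nested objects such as "position" or "vessel"
            flat.update(flatten_record(value, full_key, sep))
        else:
            # Scalars, lists, nulls, and empty objects are kept as-is
            flat[full_key] = value
    return flat


def flatten_json_lines(lines):
    """Flatten every non-blank JSON line and return the re-serialized lines."""
    return [
        json.dumps(flatten_record(json.loads(line)))
        for line in lines
        if line.strip()
    ]


# Made-up record that only mimics the shape of the reported schema
sample = (
    '{"uuid": "abc", '
    '"position": {"lat": 1.5, "lon": 2.5}, '
    '"vessel": {"name": "X", "beam": null}}'
)
flat_lines = flatten_json_lines([sample])
print(flat_lines[0])
```

The flattened lines could then be written to a new {{.jl}} file and passed to {{json.read_json}} as in the report, so that the resulting table contains only top-level columns such as {{position.lat}} and {{vessel.name}}.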