chadbrewbaker edited a comment on issue #703: URL: https://github.com/apache/arrow-rs/issues/703#issuecomment-991809095
This line of JSON is barfing in json2parquet with:

```bash
thread 'main' panicked at 'Cannot filter indices on a non-primitive array, found List(true)'
```

https://github.com/apache/arrow-rs/blob/e0abda2c178be0c38d4257d22de2e4a3bfafde82/parquet/src/arrow/levels.rs#L757

```json
{"ts":1331901001.88,"fuid":"Fd3cGk2agqUftBeFx4","tx_hosts":["192.168.229.251"],"rx_hosts":["192.168.202.79"],"conn_uids":["CaJMZy195M8cuXfxn4"],"source":"HTTP","depth":0,"analyzers":[],"mime_type":"text/html","duration":0.0,"is_orig":false,"seen_bytes":1433,"total_bytes":1433,"missing_bytes":0,"overflow_bytes":0,"timedout":false}
```

The Python bindings handle this just fine.

```python
from pyarrow import json

fn = 'mini.json'
table = json.read_json(fn)
print(table)
```

```bash
pyarrow.Table
ts: double
fuid: string
tx_hosts: list<item: string>
  child 0, item: string
rx_hosts: list<item: string>
  child 0, item: string
conn_uids: list<item: string>
  child 0, item: string
source: string
depth: int64
analyzers: list<item: null>
  child 0, item: null
mime_type: string
duration: double
is_orig: bool
seen_bytes: int64
total_bytes: int64
missing_bytes: int64
overflow_bytes: int64
timedout: bool
----
ts: [[1331901001.88]]
fuid: [["Fd3cGk2agqUftBeFx4"]]
tx_hosts: [[["192.168.229.251"]]]
rx_hosts: [[["192.168.202.79"]]]
conn_uids: [[["CaJMZy195M8cuXfxn4"]]]
source: [["HTTP"]]
depth: [[0]]
analyzers: [[0 nulls]]
mime_type: [["text/html"]]
duration: [[0]]
...
```
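To help narrow this down, here is a stdlib-only sketch (not part of json2parquet or arrow-rs) that reports which fields of the record are JSON arrays, since those are the columns that end up with a `List` type. The helper name `list_fields` is hypothetical. The sketch only identifies candidates; it does not establish whether the string lists or the empty `analyzers` array is what triggers the panic.

```python
import json

# The JSON line from the report, reproduced verbatim.
LINE = '{"ts":1331901001.88,"fuid":"Fd3cGk2agqUftBeFx4","tx_hosts":["192.168.229.251"],"rx_hosts":["192.168.202.79"],"conn_uids":["CaJMZy195M8cuXfxn4"],"source":"HTTP","depth":0,"analyzers":[],"mime_type":"text/html","duration":0.0,"is_orig":false,"seen_bytes":1433,"total_bytes":1433,"missing_bytes":0,"overflow_bytes":0,"timedout":false}'


def list_fields(record):
    """Return {field name: element count} for every list-valued field."""
    return {name: len(value) for name, value in record.items() if isinstance(value, list)}


record = json.loads(LINE)
lists = list_fields(record)

# Empty arrays carry no element type information, so they are worth checking separately.
empty = [name for name, count in lists.items() if count == 0]

print("list-valued fields:", lists)
# list-valued fields: {'tx_hosts': 1, 'rx_hosts': 1, 'conn_uids': 1, 'analyzers': 0}
print("empty lists:", empty)
# empty lists: ['analyzers']
```

Dropping the reported fields one at a time from the input and rerunning json2parquet would then show which of them is needed to reproduce the failure.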
