Vectorrent commented on issue #43275:
URL: https://github.com/apache/arrow/issues/43275#issuecomment-2237947651
After doing some more testing, I can confirm that this bug almost certainly
exists in `apache-arrow`, not in `parquet-wasm`. If you execute the above
script in Node.js, it quickly fails and writes the failing buffer to disk;
the failure occurs at the point where `apache-arrow` cannot convert that
buffer into a table correctly.
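For illustration, here is a minimal sketch of that failing step, assuming the buffer is an Arrow IPC stream handed over from `parquet-wasm`; the function name `tryParse` and the variable `arrowIpcBuffer` are hypothetical stand-ins for the actual script's code:
```js
// Minimal sketch (not the original script): parse an Arrow IPC stream
// buffer with apache-arrow and dump the bytes to disk if parsing fails.
import { writeFileSync } from 'node:fs';
import { tableFromIPC } from 'apache-arrow';

function tryParse(arrowIpcBuffer) {
  try {
    // This is the step that fails on certain buffers.
    const table = tableFromIPC(arrowIpcBuffer);
    console.log(`rows: ${table.numRows}, cols: ${table.numCols}`);
  } catch (err) {
    // Persist the exact bytes apache-arrow rejected so they can be
    // re-read with pyarrow (see the Python script below).
    writeFileSync('./arrowStreamBuffer.txt', arrowIpcBuffer);
    throw err;
  }
}
```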
However, if you run the following script, which reads that "failed" buffer
from disk with `pyarrow`, it parses correctly.
Thus, the bug is in `apache-arrow`, not in `parquet-wasm`.
```py
import pyarrow as pa
import pyarrow.ipc as ipc


def main():
    # Path to the failing buffer written to disk by the Node.js script
    arrow_file_on_disk = './arrowStreamBuffer.txt'

    try:
        # Open the Arrow IPC stream
        with ipc.open_stream(arrow_file_on_disk) as reader:
            # Read all the data into a table
            table = reader.read_all()

            # Print information about the table
            print(f"Table schema: {table.schema}")
            print(f"Number of columns: {table.num_columns}")
            print(f"Number of rows: {table.num_rows}")

            # Print the first 5 rows of the table
            print("\nFirst 5 rows of data:")
            print(table.to_pandas().head())
    except FileNotFoundError:
        print(f"Error: The file '{arrow_file_on_disk}' was not found.")
    except pa.ArrowInvalid:
        print(f"Error: '{arrow_file_on_disk}' is not a valid Arrow file or is corrupted.")
    except Exception as e:
        print(f"An error occurred: {str(e)}")


if __name__ == "__main__":
    main()
```