Maxsparrow opened a new issue, #941:
URL: https://github.com/apache/arrow-ballista/issues/941

   **Describe the bug**
   Various errors occur when trying to get flight info with the pyarrow Flight 
client against a Ballista deployment built from the latest code.
   
   Query1:
   ```
   create external table sample stored as CSV with header row location '/mnt/sample.csv';
   ```
   Error1 after calling `get_flight_info`:
   ```
   ArrowInvalid: Flight returned invalid argument error, with message: DecodeError { description: "buffer underflow", stack: [("Any", "type_url")] }
   ```
   
   Query2:
   ```
   select 'Hello from Arrow Ballista!';
   ```
   Error2:
   ```
   ArrowInvalid: Flight returned invalid argument error, with message: DecodeError { description: "unexpected end group tag", stack: [] }
   ```
   
   I also tried with the arrow-ballista-python repo, installing its latest 
code, and I'm unable to connect:
   ```python
   In [3]: ctx = ballista.BallistaContext(hostname, client_port)
   ---------------------------------------------------------------------------
   Exception                                 Traceback (most recent call last)
   Cell In[3], line 1
   ----> 1 ctx = ballista.BallistaContext(hostname, client_port)
   
   Exception: Ballista error: DataFusionError(Execution("Status { code: Internal, message: \"Error parsing request\", metadata: MetadataMap { headers: {\"content-type\": \"application/grpc\", \"date\": \"Tue, 19 Dec 2023 21:27:04 GMT\", \"content-length\": \"0\"} }, source: None }"))
   ```
   
   **To Reproduce**
   Steps to reproduce the behavior:
   * Deploy the Ballista scheduler and executors using the latest code, built 
from the repo at commit 934b32fd
   * Install the latest pyarrow (`14.0.2`) in a Python 3.10 environment
   
   Run against your service:
   ```python
   from pyarrow import flight

   # hostname and port are placeholders for your deployed Ballista service
   client = flight.FlightClient(f'grpc://{hostname}:{port}')
   client.authenticate_basic_token("admin", "password")
   query = "select 'Hello from Arrow Ballista!';"
   descriptor = flight.FlightDescriptor.for_command(query)
   info = client.get_flight_info(descriptor)
   # Errors here
   ```
   
   **Expected behavior**
   `get_flight_info` returns a flight info object without raising an error.
   
   **Additional context**
   I deployed Ballista in Kubernetes, so this could still be a networking or 
setup issue. However, the Ballista scheduler and executor logs suggest they 
started up correctly, and they show no errors. The Ballista UI for my 
deployment also works, and the `client.authenticate_basic_token` call succeeds 
in Python, which suggests the server is running and that I can reach it.
   
   I'm new to Rust and the whole DataFusion ecosystem, so I don't know whether 
there's an easier way to test that my deployment is working. Any advice would 
be appreciated.
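
For the networking side specifically, a minimal sketch of a plain TCP reachability check is below. It only shows whether the port accepts connections, not whether the Flight endpoint decodes requests correctly, so it cannot explain the `DecodeError` messages above. The function name `can_connect` and the 3-second default timeout are illustrative assumptions, not part of Ballista or pyarrow.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; it checks reachability only.
    """
    try:
        # create_connection resolves the host and opens a TCP socket;
        # the context manager closes the socket again straight away.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

Calling `can_connect(hostname, port)` with the same values used in the reproduction would separate "port unreachable" from "port reachable but request rejected".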
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

