nitesh-sinha opened a new issue, #37925:
URL: https://github.com/apache/arrow/issues/37925

   ### Describe the usage question you have. Please include as many useful details as possible.
   
   
   Hello,
   
   I'm trying to build an Arrow Flight SQL server (which wraps DuckDB querying Parquet files) in Python. I've implemented the handler methods defined in the pyarrow [FlightServerBase class](https://arrow.apache.org/docs/python/generated/pyarrow.flight.FlightServerBase.html#pyarrow.flight.FlightServerBase) and am testing it with a DBeaver client (loaded with the [JDBC driver for Arrow Flight SQL](https://www.dremio.com/drivers/jdbc/)). However, even though the client connects successfully to the server, it is unable to read any of the data sent back from the server. I suspect it might be due to the RecordBatch structure. After a lot of reading through the docs, I've tried various ways of creating the RecordBatch, with no luck.
   
   For debugging simplicity, I hand-wrote the following RecordBatch to be sent for a DoGet RPC call (with a CommandGetSqlInfo command in the Ticket). Can someone help point out any errors in this?
   
   ```
   def do_get_sql_info(self, context: flight.ServerCallContext,
                       cmd: sqlPb.CommandGetSqlInfo) -> flight.FlightDataStream:
       sql_info_metadata = [
           {"info_name": "0", "value": "db_name"},
           {"info_name": "1", "value": "duckdb"},
       ]
   
       schema = pa.schema([
           pa.field("info_name", pa.uint32()),
           pa.field("value", pa.dense_union([
               pa.field("string_value", pa.string()),
               pa.field("bool_value", pa.bool_()),
               pa.field("bigint_value", pa.int64()),
               pa.field("int32_bitmask", pa.int32()),
               pa.field("string_list", pa.list_(pa.string())),
               pa.field("int32_to_int32_list_map",
                        pa.map_(pa.int32(), pa.list_(pa.int32())))
           ]))
       ])
       batch = pa.RecordBatch.from_pandas(pd.DataFrame(sql_info_metadata),
                                          schema=schema)
       return flight.FlightDataStream(batch)
   ```
   
   The client is unable to read the DB name as `duckdb`; instead, it just prints `??`.
   
   Notes:
   - I'm using the [C++ Flight SQL server](https://github.com/apache/arrow/blob/15a8ac3ce4e3ac31f9f361770ad4a38c69102aa1/cpp/src/arrow/flight/sql/server.cc#L956) as a reference. It seems to use Builders to build the SqlInfoResult, but I could not find an equivalent in PyArrow.
   - I have checked the Arrow Flight Python example server [here](https://github.com/apache/arrow/blob/aca1d3eeed3775c2f02e9f5d59d62478267950b1/python/examples/flight/server.py), but it feels too simplistic and does not cover the Flight SQL use case.
   - I also tried to check what the client driver code expects [here](https://github.com/apache/arrow/blob/aca1d3eeed3775c2f02e9f5d59d62478267950b1/java/flight/flight-sql-jdbc-core/src/main/java/org/apache/arrow/driver/jdbc/client/ArrowFlightSqlClientHandler.java#L98), but it's not very clear to me.
   
   Appreciate some pointers on this. Thanks!
   
   ### Component(s)
   
   FlightRPC, Python

