lidavidm commented on issue #14089:
URL: https://github.com/apache/arrow/issues/14089#issuecomment-1243033923

   Well, yes of course: the server implementation isn't actually making the 
data available until the call finishes:
   
   ``` scala
     override def acceptPut(context: CallContext, flightStream: FlightStream, 
ackStream: StreamListener[PutResult]): Runnable = {
       val batches = new util.ArrayList[ArrowRecordBatch]()
       () => {
         var rows: Long = 0
         while (flightStream.next()) {
           val unloader = new VectorUnloader(flightStream.getRoot)
           val arb = unloader.getRecordBatch
           batches.add(arb)
           rows = rows + flightStream.getRoot.getRowCount
         }
   
         val dataset = ArrowFlightDataset(batches, flightStream.getSchema, rows)
         datasets.put(flightStream.getDescriptor, dataset)
         ackStream.onCompleted()
       }
     }
   ```
   This buffers the data in memory and does not update the member variable 
`datasets` until the call completes. How would Flight be able make the data 
available to the `getStream` implementation for you? 
   
   Flight is just a framework for building services, so if you want to be able 
to read data while writing, Flight will not get in your way of implementing 
that, but neither will it do that for you automatically. In this case, you'll 
have to update `datasets` as soon as the call starts, mutate the dataset as the 
call progresses, and have some way of coordinating between `acceptPut` and 
`getStream` so the latter knows when all data has been uploaded. And you'll 
want to handle the client disconnecting, etc.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to