lidavidm commented on issue #14089:
URL: https://github.com/apache/arrow/issues/14089#issuecomment-1243033923
Well, yes of course: the server implementation isn't actually making the
data available until the call finishes:
``` scala
override def acceptPut(context: CallContext, flightStream: FlightStream,
ackStream: StreamListener[PutResult]): Runnable = {
val batches = new util.ArrayList[ArrowRecordBatch]()
() => {
var rows: Long = 0
while (flightStream.next()) {
val unloader = new VectorUnloader(flightStream.getRoot)
val arb = unloader.getRecordBatch
batches.add(arb)
rows = rows + flightStream.getRoot.getRowCount
}
val dataset = ArrowFlightDataset(batches, flightStream.getSchema, rows)
datasets.put(flightStream.getDescriptor, dataset)
ackStream.onCompleted()
}
}
```
This buffers the data in memory and does not update the member variable
`datasets` until the call completes. How would Flight be able make the data
available to the `getStream` implementation for you?
Flight is just a framework for building services, so if you want to be able
to read data while writing, Flight will not get in your way of implementing
that, but neither will it do that for you automatically. In this case, you'll
have to update `datasets` as soon as the call starts, mutate the dataset as the
call progresses, and have some way of coordinating between `acceptPut` and
`getStream` so the latter knows when all data has been uploaded. And you'll
want to handle the client disconnecting, etc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]