felipecrv commented on code in PR #31:
URL: https://github.com/apache/arrow-experiments/pull/31#discussion_r1727560521
##########
http/get_simple/python/server/server.py:
##########
@@ -56,24 +56,44 @@ def GetPutData():
return batches
-def make_reader(schema, batches):
- return pa.RecordBatchReader.from_batches(schema, batches)
-
-def generate_batches(schema, reader):
+def generate_buffers(schema, source):
with io.BytesIO() as sink, pa.ipc.new_stream(sink, schema) as writer:
- for batch in reader:
+ for batch in source:
sink.seek(0)
- sink.truncate(0)
writer.write_batch(batch)
+ sink.truncate()
yield sink.getvalue()
Review Comment:
To avoid the `del buffer` which I think is kinda ugly and confusing, I ended
up with this:
```python
def write_chunk(buffer):
    if chunked:
        # chunked transfer coding: hex size line, then the data
        self.wfile.write('{:X}\r\n'.format(len(buffer)).encode('utf-8'))
    self.wfile.write(buffer)
    if chunked:
        self.wfile.write('\r\n'.encode('utf-8'))
    self.wfile.flush()

foreach_batch_buffer(schema, source, write_chunk)
```
This means I can pass `sink.getbuffer()` to the `write_chunk` callback
without having to `del` it when it goes out of scope. This also lets the
version that chunks the buffers itself avoid the need for more `del`
statements.
Do you think this is acceptable, or is the `yield` solution with `del`
statements preferred?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]