cozos commented on issue #23932:
URL: https://github.com/apache/beam/issues/23932#issuecomment-1340370074

   > Where is the inference code executed? Is it executed in the SDK harness service?
   Yes, it is executed in the SDK harness, which in your case is a Docker container.
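
   For context, a minimal sketch of what such a pipeline looks like (the model class, weights path, and input below are illustrative placeholders, not details from this issue):

   ```python
   import torch
   import apache_beam as beam
   from apache_beam.ml.inference.base import RunInference
   from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

   # Placeholder model - substitute your own torch.nn.Module.
   class ToyModel(torch.nn.Module):
       def __init__(self):
           super().__init__()
           self.linear = torch.nn.Linear(2, 1)

       def forward(self, x):
           return self.linear(x)

   model_handler = PytorchModelHandlerTensor(
       state_dict_path='gs://my-bucket/toy_model.pth',  # assumed path
       model_class=ToyModel,
       model_params={},
       device='GPU',  # falls back to CPU if no GPU is visible to the harness
   )

   with beam.Pipeline() as p:
       _ = (
           p
           | beam.Create([torch.tensor([1.0, 2.0])])
           # The model is loaded and invoked inside the SDK harness container:
           | RunInference(model_handler)
           | beam.Map(print)  # emits PredictionResult(example, inference)
       )
   ```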
   
   > If so, can that service use the underlying GPUs? Also, can I run any PyTorch and HuggingFace Transformers model using RunInference?
   You need to make dependencies such as `pytorch` available in the Docker container. The same goes for GPUs - you need to install CUDA drivers and so on, and you also need to make the GPUs accessible from Docker (I don't know the details - probably covered here: https://docs.docker.com/config/containers/resource_constraints/#gpu). See the instructions on how to build a custom container here (a minimal sketch follows below): https://beam.apache.org/documentation/runtime/environments/#custom-containers
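
   A minimal sketch of such a container (the base image tag and package list are assumptions - match the tag to the Beam SDK and Python version your pipeline uses, and pick the torch build that matches the CUDA drivers on your workers):

   ```dockerfile
   # Base image must match your Beam and Python versions (assumed here).
   FROM apache/beam_python3.10_sdk:2.43.0

   # Install inference dependencies into the harness image.
   RUN pip install --no-cache-dir torch transformers

   # The base image's ENTRYPOINT (/opt/apache/beam/boot) is inherited,
   # so nothing more is needed for a minimal image.
   ```

   Build and push that image, then point the pipeline at it (depending on runner/version, via `--sdk_container_image=<image>` or `--environment_type=DOCKER --environment_config=<image>`).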
   
   > Seems like converting data and sending it back and forth between the Spark worker and the SDK service may involve a lot of overhead
   Sending data back and forth through the Fn API involves serialization/deserialization and moving your data through the transport/network layer, so yes, there is overhead.
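
   One way to amortize that per-element cost is to batch elements before heavy processing; a sketch using `BatchElements` (the batch sizes are illustrative, and as I understand it RunInference already applies similar batching internally):

   ```python
   import apache_beam as beam

   with beam.Pipeline() as p:
       _ = (
           p
           | beam.Create(range(1000))
           # Group elements into batches so per-call overhead is paid
           # once per batch instead of once per element.
           | beam.BatchElements(min_batch_size=64, max_batch_size=512)
           | beam.Map(lambda batch: sum(batch))  # one call per batch (a list)
       )
   ```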
   
   > Will a native Spark job be faster than a Beam job? Is there a performance hit when we write jobs in Beam instead of native Spark?
   You'd have to benchmark your pipeline, but my guess is that native Spark would be faster. In addition to the Fn API overhead, if you can use the higher-level Spark APIs (Spark SQL, DataFrame/Dataset), Spark can apply additional optimizations (vectorization, code generation) to your transforms - whereas Beam transforms/DoFns are a black box to Spark and cannot be optimized that way.
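
   To make that concrete, an illustrative contrast (not from this thread): the Spark DataFrame expression below is a structured expression tree that Catalyst can optimize, while the Beam equivalent is an opaque Python callable the runner can only invoke element by element:

   ```python
   from pyspark.sql import SparkSession, functions as F

   spark = SparkSession.builder.getOrCreate()
   df = spark.range(1_000_000)

   # Catalyst sees this expression tree and can apply whole-stage
   # codegen / vectorization to it.
   doubled = df.select((F.col('id') * 2).alias('doubled'))

   # The Beam counterpart is a black-box UDF to the runner:
   #   pcoll | beam.Map(lambda x: x * 2)
   ```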

