yeandy opened a new issue, #22240:
URL: https://github.com/apache/beam/issues/22240

   ### What happened?
   For many models, the return value of the `forward` call is a dictionary containing 
the predictions along with additional metadata, i.e. `Dict[str, Tensor]`. However, 
RunInference currently expects outputs to be an `Iterable[Any]`, e.g. 
`Iterable[Tensor]` or `Iterable[Dict[str, Tensor]]`. So when RunInference [zips 
the inputs with the 
predictions](https://github.com/apache/beam/blob/9ffeced5f246b3d72eedf8c55aa20574ae9d07cb/sdks/python/apache_beam/ml/inference/pytorch_inference.py#L131),
 iterating over the predictions yields the dictionary keys instead of the batch 
elements. We end up with only the key names, and the actual prediction tensors are 
discarded. This results in an unusable `PredictionResult`.
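   The failure mode can be reproduced with plain Python (lists stand in for `torch.Tensor` here, and the names are illustrative, not Beam's actual variables):

```python
# A model whose forward() returns a dict of batched outputs.
batch = ["input_0", "input_1"]
predictions = {
    "logits": [[0.1, 0.9], [0.8, 0.2]],  # batched per-key outputs
    "hidden": [[1.0], [2.0]],
}

# RunInference zips inputs with predictions; iterating a dict yields its
# keys, so each input is paired with a key name and the tensors are lost.
paired = list(zip(batch, predictions))
# paired == [("input_0", "logits"), ("input_1", "hidden")]
```

   With more inputs than keys, the extra batch elements would be dropped entirely, since `zip` stops at the shorter iterable.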
   
   The `PredictionResult` challenge is a known quirk of the current 
design/implementation of RunInference. We have [a 
sample](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/pytorch_language_modeling.py#L49)
 that works around this by creating a wrapper class. We should support 
both tensors and dictionaries of tensors. Currently, only the former works, 
and that assumption is baked into the logic. We should figure out how to natively 
support results that are a dict type.
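   One possible shape for native support (a minimal sketch, not Beam's actual API; `unbatch` is a hypothetical helper, and lists again stand in for tensors) is to re-slice a dict of batched outputs into one dict per batch element before zipping:

```python
from typing import Any, Dict, Iterable, List, Union


def unbatch(
    batch_size: int,
    predictions: Union[List[Any], Dict[str, List[Any]]],
) -> Iterable[Union[Any, Dict[str, Any]]]:
    """Hypothetical helper: yield one prediction per batch element, whether
    the model returned a batched tensor or a dict of batched tensors."""
    if isinstance(predictions, dict):
        # Re-slice the dict of batched outputs into per-element dicts.
        for i in range(batch_size):
            yield {key: value[i] for key, value in predictions.items()}
    else:
        # Already an iterable of per-element predictions.
        yield from predictions


# Usage with a dict-shaped model output:
preds = {"logits": [[0.1, 0.9], [0.8, 0.2]], "hidden": [[1.0], [2.0]]}
per_element = list(unbatch(2, preds))
# per_element[0] == {"logits": [0.1, 0.9], "hidden": [1.0]}
```

   Zipping the inputs with `unbatch(len(batch), predictions)` instead of the raw dict would then pair each input with its own per-key slice.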
   
   ### Issue Priority
   
   Priority: 2
   
   ### Issue Component
   
   Component: sdk-py-core

