Sergei Lilichenko created BEAM-10692:
----------------------------------------

             Summary: Change org.apache.beam.sdk.extensions.ml.CloudVision to 
associate the AnnotateImageResponses with the image data used for the annotation
                 Key: BEAM-10692
                 URL: https://issues.apache.org/jira/browse/BEAM-10692
             Project: Beam
          Issue Type: New Feature
          Components: extensions-java-gcp
    Affects Versions: 2.22.0
            Reporter: Sergei Lilichenko


There is a problem with the design of the CloudVision transform. It takes a 
PCollection<String> as input (in the case of GCS URIs) and outputs a 
PCollection<List<AnnotateImageResponse>>, so there is no way to associate the 
responses with the original file URIs. 
[ImageAnnotationContext|https://cloud.google.com/vision/docs/reference/rest/v1/AnnotateImageResponse#ImageAnnotationContext]
 is returned as part of the response, but its "uri" field is empty for the 
majority of annotations (it appears to be populated only for file annotations, 
not for image annotations).

One approach is to return KV<String, List<AnnotateImageResponse>> for images 
referenced by URI, where the key is the GCS URI. For images passed as bytes, 
the caller would supply an id of any type and the transform would return 
KV<IDTYPE, List<AnnotateImageResponse>>.
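
To illustrate the shape of the proposed output, here is a minimal plain-Java 
sketch. It does not use the Beam or Cloud Vision APIs: Map.Entry stands in for 
Beam's KV, FakeResponse stands in for AnnotateImageResponse, and the class and 
method names (KeyedAnnotationSketch, annotateUris, annotateBytes) are 
hypothetical names chosen for this example only. The annotator function is 
also an assumption; in the real transform it would be the call to the Vision 
service.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class KeyedAnnotationSketch {

  /** Hypothetical stand-in for AnnotateImageResponse; not the real Vision class. */
  static final class FakeResponse {
    final String label;

    FakeResponse(String label) {
      this.label = label;
    }
  }

  /**
   * URI case: each result keeps the GCS URI as its key, mirroring the proposed
   * KV<String, List<AnnotateImageResponse>> output.
   */
  static List<Map.Entry<String, List<FakeResponse>>> annotateUris(
      List<String> uris, Function<String, List<FakeResponse>> annotator) {
    List<Map.Entry<String, List<FakeResponse>>> out = new ArrayList<>();
    for (String uri : uris) {
      out.add(new SimpleImmutableEntry<>(uri, annotator.apply(uri)));
    }
    return out;
  }

  /**
   * Bytes case: the caller supplies an id of any type K with each image, and
   * the id is carried through, mirroring KV<IDTYPE, List<AnnotateImageResponse>>.
   */
  static <K> List<Map.Entry<K, List<FakeResponse>>> annotateBytes(
      List<Map.Entry<K, byte[]>> images,
      Function<byte[], List<FakeResponse>> annotator) {
    List<Map.Entry<K, List<FakeResponse>>> out = new ArrayList<>();
    for (Map.Entry<K, byte[]> image : images) {
      out.add(new SimpleImmutableEntry<>(image.getKey(), annotator.apply(image.getValue())));
    }
    return out;
  }
}
```

With this shape, a downstream step can read the key of each element to find 
the URI or id that the responses belong to, instead of relying on the "uri" 
field of ImageAnnotationContext.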



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
