I'm experimenting with cross-language (xlang) portable pipelines on Dataflow and noticed that the job tries to pull the Java SDK harness image (beam_java8_sdk:2.25.0) from docker.io. This is a problem for us because our VPC blocks access to external hosts. I was able to work around it by passing
--sdk_harness_container_image_overrides=.*java.*,gcr.io/cloud-dataflow/v1beta3/beam_java8_sdk:2.25.0 to my job, but having to do this is not ideal. Is there a reason the default location is docker.io rather than GCR, especially given that Docker is going to start substantially limiting pulls per hour in the near future?
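
For anyone trying the same workaround, here is a rough sketch of how I understand the override entry to behave: the part before the comma is a regex tested against the default image name, and the part after it is the replacement image. The helper names (build_flag, apply_override) and the default image name apache/beam_java8_sdk:2.25.0 are my own assumptions for illustration, not taken from the Beam source, and the real matching logic in the runner may differ.

```python
import re

# Override entry as used in the workaround: "<regex>,<replacement image>".
OVERRIDE = ".*java.*,gcr.io/cloud-dataflow/v1beta3/beam_java8_sdk:2.25.0"


def build_flag(override: str) -> str:
    """Hypothetical helper: format the override as a single command-line flag."""
    return f"--sdk_harness_container_image_overrides={override}"


def apply_override(image: str, override: str) -> str:
    """Hypothetical helper: return the replacement image if the regex matches.

    Assumes the regex is matched from the start of the image name.
    """
    pattern, replacement = (part.strip() for part in override.split(",", 1))
    return replacement if re.match(pattern, image) else image


if __name__ == "__main__":
    # Assumed default image name for the Java harness (illustrative only).
    default_java_image = "apache/beam_java8_sdk:2.25.0"
    print(build_flag(OVERRIDE))
    print(apply_override(default_java_image, OVERRIDE))
```

Quoting the whole flag in the shell is probably wise, since the regex contains characters such as * that the shell may otherwise try to expand.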
