Hi Ankur,
Thanks for the reply! Please find my responses below:
1. The pipeline does not fail for this. It shows a warning message: 'WARN
org.apache.beam.runners.fnexecution.environment.DockerCommand - Unable to pull
docker image $userId-docker-apache.bintray.io/beam/python:latest, cause:
1. I see, the warning is harmless: it tries to update the image in case it is
hosted on a remote repository. In this case, the remote repository does not
have the image, hence the warning. It's not an error because the local image
is still used.
On Wed, May 29, 2019 at 5:32 PM Anjana Pydi
wrote:
> Hi
+user@beam.apache.org
1. Docker images created are only added to the local Docker repository and are
not pushed to bintray.io. I have not seen this exact error before, so my
best guess is that the request to fetch the remote Docker image
fails when you try to do "docker pull -
I was wondering if there was an error earlier in the logs, e.g. at startup,
about this missing parameter (given that it was linked in, I'd assume it at
least tried to load).
As for the other question, if the Python worker is doing the read, then
yes, it needs access to HDFS as well.
On Wed, May
The environment for your Python function also needs to be configured with
access to HDFS. Unfortunately, this is separate from Java. I haven't used HDFS
with Python myself, but it looks like you have to configure these options in
HadoopFileSystemOptions:
-hdfs_host
-hdfs_port
-hdfs_user
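To illustrate the options above, here is a minimal sketch of how those three
flags might be assembled and passed on the pipeline's command line. The host,
port, and user values are assumptions for illustration; in a real pipeline the
list would be handed to Beam's PipelineOptions (commented out here), which
registers them via HadoopFileSystemOptions.

```python
# Hypothetical values -- substitute your cluster's namenode and user.
hdfs_args = [
    "--hdfs_host=namenode.example.com",
    "--hdfs_port=8020",
    "--hdfs_user=beamuser",
]

def parse_hdfs_flags(argv):
    """Collect the --hdfs_* flags that Beam's HadoopFileSystemOptions reads."""
    opts = {}
    for arg in argv:
        if arg.startswith("--hdfs_"):
            key, _, value = arg[2:].partition("=")
            opts[key] = value
    return opts

# With apache_beam installed, the same list would be used as:
#   from apache_beam.options.pipeline_options import PipelineOptions
#   options = PipelineOptions(hdfs_args)
print(parse_hdfs_flags(hdfs_args))
```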
Now, I
Was there any indication in
the logs that the hadoop file system attempted to load but failed?
Nope, the same message “No filesystem found for scheme hdfs” appears when
HADOOP_CONF_DIR is not set.
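For reference, a minimal sketch of setting HADOOP_CONF_DIR so the filesystem
can load its configuration; the path is an assumption, so point it at your
cluster's actual Hadoop config directory.

```shell
# Assumed location of core-site.xml / hdfs-site.xml on this machine.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# When the Flink taskmanager runs in Docker, both the variable and the files
# must be visible inside the container, e.g. (illustrative flags only):
#   docker run -e HADOOP_CONF_DIR=/etc/hadoop/conf \
#     -v /etc/hadoop/conf:/etc/hadoop/conf flink-taskmanager
```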
I guess I hit one last problem. When I load input data from HDFS, the Python
SDK worker fails. It complains about
Hello all,
I have a Beam job that I use to read messages from RabbitMQ and write them
to Kafka.
As of now, messages are read/written as JSON.
Obviously, that is not optimal storage, so I would like to transform
the messages to Avro prior to writing them to Kafka. I have the URL of a
schema
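Since the thread doesn't show the conversion step itself, here is a minimal
sketch of the JSON-to-record half of that transform, under assumptions: the
two-field schema below is hypothetical (the real one would be fetched from the
schema URL mentioned above), and the returned record would then be handed to an
actual Avro encoder such as fastavro's schemaless_writer before writing to
Kafka.

```python
import json

# Hypothetical Avro record schema; a real pipeline would fetch this
# from the schema URL instead of hard-coding it.
SCHEMA = {
    "type": "record",
    "name": "Message",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "body", "type": "string"},
    ],
}

def json_to_record(raw_bytes, schema=SCHEMA):
    """Parse a JSON message and keep only the fields the schema declares.

    The dict returned here is what an Avro encoder (e.g.
    fastavro.schemaless_writer) would serialize before the Kafka write.
    """
    message = json.loads(raw_bytes)
    names = {f["name"] for f in schema["fields"]}
    return {k: v for k, v in message.items() if k in names}
```

Fields not present in the schema are dropped rather than failing, which keeps
the transform tolerant of extra JSON keys; stricter validation could be added
if the schema is authoritative.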
Glad you were able to figure it out!
Agree the error message was suboptimal. Was there any indication in
the logs that the hadoop file system attempted to load but failed?
On Wed, May 29, 2019 at 4:41 AM 青雉(祁明良) wrote:
>
> Thanks guys, I got it. It was because Flink taskmanager docker missing