Re: Question about --environment_type argument

Maximilian Michels Tue, 28 May 2019 08:47:04 -0700

Recent versions of Flink do not bundle Hadoop anymore, but they are still "Hadoop compatible". You just need to include the Hadoop jars in the classpath.

Beam's Hadoop filesystem module does not bundle Hadoop either; it just provides Beam filesystem abstractions, which are similar to Flink's "Hadoop compatibility".


You probably want to add this to the job server:
  shadow library.java.hadoop_client
  shadow library.java.hadoop_common

Cheers,
Max

On 28.05.19 15:41, 青雉（祁明良） wrote:

Thanks Robert, I had one, “qmlmoon”.
It looks like I have the job server working now: I added a shadow dependency on /beam-sdks-java-io-hadoop-file-system/ to /beam-runners-flink_2.11-job-server/ and rebuilt the job server. However, the Flink taskmanager also complains about the same issue while the job is running.
So how does the Flink taskmanager find this HDFS filesystem dependency?
-------
2019-05-28 13:15:57,695 INFO org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService - GetManifest for hdfs://myhdfs/algo-emr/k8s_flink/beam/job_87fa794e-9cd7-4c20-b95c-086f11abfaa4/MANIFEST
2019-05-28 13:15:57,696 INFO org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService - Loading manifest for retrieval token hdfs://myhdfs/algo-emr/k8s_flink/beam/job_87fa794e-9cd7-4c20-b95c-086f11abfaa4/MANIFEST
2019-05-28 13:15:57,698 INFO org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService - GetManifest for hdfs://myhdfs/algo-emr/k8s_flink/beam/job_87fa794e-9cd7-4c20-b95c-086f11abfaa4/MANIFEST failed
org.apache.beam.vendor.guava.v20_0.com.google.common.util.concurrent.UncheckedExecutionException: java.lang.IllegalArgumentException: No filesystem found for scheme hdfs
    at org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2214)
    at org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.get(LocalCache.java:4053)
    at org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4057)
    at org.apache.beam.vendor.guava.v20_0.com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4986)
    at org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactRetrievalService.getManifest(BeamFileSystemArtifactRetrievalService.java:80)
    at org.apache.beam.model.jobmanagement.v1.ArtifactRetrievalServiceGrpc$MethodHandlers.invoke(ArtifactRetrievalServiceGrpc.java:298)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.stub.ServerCalls$UnaryServerCallHandler$UnaryServerCallListener.onHalfClose(ServerCalls.java:171)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.PartialForwardingServerCallListener.onHalfClose(PartialForwardingServerCallListener.java:35)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:23)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.ForwardingServerCallListener$SimpleForwardingServerCallListener.onHalfClose(ForwardingServerCallListener.java:40)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.Contexts$ContextualizedServerCallListener.onHalfClose(Contexts.java:86)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerCallImpl$ServerStreamListenerImpl.halfClosed(ServerCallImpl.java:283)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ServerImpl$JumpToApplicationThreadServerStreamListener$1HalfClosed.runInContext(ServerImpl.java:707)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.ContextRunnable.run(ContextRunnable.java:37)
    at org.apache.beam.vendor.grpc.v1p13p1.io.grpc.internal.SerializingExecutor.run(SerializingExecutor.java:123)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
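
[Editor's sketch, not part of the original message] One way to check whether a given JVM (job server or taskmanager) can see a Beam filesystem for the hdfs scheme is to list the filesystem registrars visible on its classpath. The sketch below assumes that Beam discovers filesystems through service files named after org.apache.beam.sdk.io.FileSystemRegistrar and that the Hadoop filesystem module contributes an entry there (most likely org.apache.beam.sdk.io.hdfs.HadoopFileSystemRegistrar; treat that name as an assumption). The class name ListFileSystemRegistrars is made up for illustration, and the code uses only the Java standard library.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: lists the registrar class names declared in service
// files on the classpath. It does not load or instantiate any Beam classes.
class ListFileSystemRegistrars {

  // Assumption: Beam filesystem modules ship a service file with this name.
  static final String SERVICE_FILE =
      "META-INF/services/org.apache.beam.sdk.io.FileSystemRegistrar";

  static List<String> registrars(ClassLoader loader) {
    List<String> names = new ArrayList<>();
    try {
      // Every jar on the classpath may contribute its own copy of the file.
      Enumeration<URL> urls = loader.getResources(SERVICE_FILE);
      while (urls.hasMoreElements()) {
        URL url = urls.nextElement();
        try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
          String line;
          while ((line = reader.readLine()) != null) {
            // Service files allow '#' comments and blank lines; skip both.
            int hash = line.indexOf('#');
            if (hash >= 0) {
              line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty()) {
              names.add(line);
            }
          }
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return names;
  }

  public static void main(String[] args) {
    List<String> names =
        registrars(ListFileSystemRegistrars.class.getClassLoader());
    if (names.isEmpty()) {
      System.out.println("No FileSystemRegistrar service entries on this classpath.");
    }
    for (String name : names) {
      System.out.println(name);
    }
  }
}
```

Run it with the same classpath as the process that logs "No filesystem found for scheme hdfs". If no Hadoop-related registrar is listed, the filesystem module is probably missing from that process. A listed registrar is likely necessary but may not be sufficient, since the Hadoop jars and configuration also have to be available.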
On 28 May 2019, at 9:31 PM, Robert Bradshaw <rober...@google.com> wrote:
The easiest would probably be to create a project that depends on both
the job server and the hadoop filesystem and then build that as a fat
jar.
