Thanks JB,
I copied the data file to all nodes and applied TextIO() in the source code.
I started the Flink cluster and deployed the Beam fat file to it. Interestingly
enough, the dashboard shows that out of the 2048 slots I have configured for
parallelism, only ONE slot is being used in the Flink cluster. This is not the
case when I use KafkaIO(): all 2040 get consumed like below.
Have a great eve.

From: Jean-Baptiste Onofré <[email protected]>
To: [email protected]
Cc: [email protected]
Sent: Tuesday, October 4, 2016 10:29 PM
Subject: Re: Regarding TextIO()

Hi,
It all depends on the filesystem you are using. If you want all nodes to share
the same files, then you need a shared or distributed filesystem like hdfs (not
yet supported by TextIO) or gs (supported by TextIO). If you use file (local
filesystem), then the files will be local to each node.
Regards
JB

On Oct 5, 2016, at 02:12, amir bahmanyari <[email protected]> wrote:
Hi Colleagues,
When you are using TextIO() in a Beam (Java) app executing in a Flink cluster,
do you have to have a copy of the file on every Flink cluster node?
Also, if you want to read the file from only one node (probably a remote node),
what would the TextIO.Read.from("....") look like?
What are the best TextIO() practices in situations like this?
Thanks + regards,
Amir
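
A minimal sketch of the rule in JB's reply, in plain Java: the scheme of the
path string given to TextIO.Read.from(...) decides whether each node reads its
own copy of the file or all nodes read one shared copy. The class
TextIoPathCheck, the method describe, and the example paths, host, and bucket
name are hypothetical and are not part of Beam or Flink. The mapping reflects
only what the reply states as of October 2016.

```java
import java.net.URI;

// Hypothetical helper for illustration only; it is not part of Beam or Flink.
class TextIoPathCheck {

    // Describes where a path handed to TextIO.Read.from(...) would be resolved,
    // following the rules given in JB's reply.
    static String describe(String path) {
        String scheme = URI.create(path).getScheme();
        if (scheme == null || scheme.equals("file")) {
            return "node-local";   // each Flink node reads its own copy
        }
        if (scheme.equals("gs")) {
            return "shared";       // one copy visible to every node
        }
        if (scheme.equals("hdfs")) {
            return "unsupported";  // distributed, but not yet handled by TextIO
        }
        return "unknown";          // the reply does not cover other schemes
    }

    public static void main(String[] args) {
        // All paths below are placeholders.
        System.out.println(describe("/tmp/data/input.txt"));
        System.out.println(describe("gs://example-bucket/data/input.txt"));
        System.out.println(describe("hdfs://namenode/data/input.txt"));
    }
}
```

Under these assumptions, a plain path such as /tmp/data/input.txt has to exist
on every node, while a gs path names a single shared copy.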