Site-to-Site is a direct connection between NiFi instances/clusters over a
socket, so it is TCP based.
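
For reference, the receiving side exposes Site-to-Site through a couple of
entries in nifi.properties (the values below are only illustrative, not
defaults you should copy as-is):

```properties
# Port on which this instance listens for incoming Site-to-Site connections
nifi.remote.input.socket.port=10000
# Whether Site-to-Site connections require TLS
nifi.remote.input.secure=false
```

The sending side then points a Remote Process Group at the receiver's URL,
and the framework handles the socket transfer itself.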

There will always be at least one local machine involved. When NiFi
pulls/receives data from somewhere, it takes that data under its control and
stores it in the NiFi content repository on disk (configured in
nifi.properties). As a FlowFile moves through the flow, a pointer to this
content is passed around until the content itself needs to be accessed. So
when PutHDFS needs to send to the other cluster, it reads the content and
sends it to the other HDFS. The data then eventually ages off from the NiFi
content repository, depending on how it is configured. So NiFi does not have
to hold all of the data on the local machine, but it will always have some
portion of the most recent data that has been moved across.
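
The age-off behavior mentioned above is driven by the content repository
settings in nifi.properties; a sketch with illustrative values:

```properties
# Where FlowFile content is stored on disk
nifi.content.repository.directory.default=./content_repository
# Archive content after it is no longer referenced, so it can age off later
nifi.content.repository.archive.enabled=true
# Age off archived content after this period, or once disk usage
# crosses the percentage threshold, whichever comes first
nifi.content.repository.archive.max.retention.period=12 hours
nifi.content.repository.archive.max.usage.percentage=50%
```

Tuning the retention period and usage percentage controls how much of that
"most recent data" lingers on the local disk.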

Let us know if this doesn't make sense.

-Bryan

On Thu, Dec 10, 2015 at 1:52 AM, digvijayp <digvijay.pisal1...@gmail.com>
wrote:

> Hi Bryan,
> So in the edge node approach, how is data sent in Site-to-Site? I mean, is
> it using any protocol to transfer it, like FTP/SFTP?
> As you are saying, if both clusters can fully talk to each other then you
> don't need this edge node approach; you could just have a NiFi instance, or
> cluster, that pulls from one HDFS and pushes to the other.
> So my query is: we have to use a FetchHDFS/GetHDFS processor, which gets
> data from HDFS to the local machine, and a PutHDFS processor, which loads
> data from the local machine to HDFS. I don't want to have to use the local
> machine in between. So how can we manage the data transfer without using a
> local machine? Where can we do such configuration in NiFi?
>
> Thanks in advance.
>
> Digvijay P.
>
>
>
> --
> View this message in context:
> http://apache-nifi-developer-list.39713.n7.nabble.com/Facing-Issue-while-connecting-with-HDFS-tp5684p5712.html
> Sent from the Apache NiFi Developer List mailing list archive at
> Nabble.com.
>