That makes perfect sense, Joe.  We figured out that we like the content 
repository and queuing.  For huge files, we can develop our own processor that 
breaks big files into smaller chunks (or streams, if required), so that should 
not actually be an issue.  It's just a limitation of the out-of-the-box GetFile 
processor today.  (By the way, I ran into some issues with recursive 
subdirectories not being picked up and transferred.  Should I open a bug report 
on that one?)
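
To make the chunking idea concrete, here is a minimal sketch in plain Python. 
It is not NiFi processor code, and the function name iter_chunks and the 
default chunk size are assumptions chosen only for illustration.  It reads a 
file in fixed-size pieces so that only one chunk is in memory at a time.

```python
from typing import Iterator, Tuple


def iter_chunks(path: str, chunk_size: int = 64 * 1024 * 1024) -> Iterator[Tuple[int, bytes]]:
    """Yield (offset, data) pairs; only one chunk is held in memory at a time.

    Hypothetical helper for illustration, not part of NiFi.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    offset = 0
    with open(path, "rb") as src:
        while True:
            data = src.read(chunk_size)
            if not data:  # empty bytes means end of file
                break
            yield offset, data
            offset += len(data)
```

Each (offset, data) pair could then be handed downstream as its own smaller 
unit of work, with the offset kept so the pieces can be reassembled in order.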

Rick

-----Original Message-----
From: Joe Witt [mailto:[email protected]] 
Sent: Sunday, August 16, 2015 8:02 PM
To: [email protected]
Subject: Re: New to NiFi - Remote Process Group failing to connect due to 
"magic header" not present

Rick

The content repository is bounded by the disk space available to it.  Data is 
not held resident in memory (RAM/heap) unless a processor brings it into 
memory, and the vast majority of processors only ever hold a small buffer.  
Thus you can truly handle objects that are extremely large, but they certainly 
cannot be larger than the disk space you have available.
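
As a rough illustration of that principle (plain Python, not NiFi internals; 
the function name stream_copy and the 64 KB buffer are assumptions), a copy 
that moves data through a small fixed-size buffer can handle a file of any 
size that fits on disk, because memory use depends on the buffer rather than 
on the file:

```python
import shutil


def stream_copy(src_path: str, dst_path: str, buffer_size: int = 64 * 1024) -> None:
    """Copy src_path to dst_path through a fixed-size buffer.

    Hypothetical helper for illustration. Memory use is bounded by
    buffer_size, so the limit on file size is disk space, not RAM.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        shutil.copyfileobj(src, dst, buffer_size)
```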

Streaming performance in NiFi would compare favorably with other systems that 
are transactional and durable or which delegate to some remotely accessible 
messaging bus.  It would likely not compare favorably with systems that are 
non-transactional and in-memory.

Thanks
Joe

On Sun, Aug 16, 2015 at 5:56 PM, Rick Braddy <[email protected]> wrote:
> Yeah. There's a trade off between pure transfer speed of a specialized 
> utility vs. the flexibility and power of NiFi.
>
> Also concerned over very large files that won't fit in content 
> repository memory.
>
> How does streaming performance compare thru NiFi?
>
>
>
> On Aug 16, 2015, at 12:32 PM, Joe Witt <[email protected]> wrote:
>
> Rick,
>
> "much slower than basic "scp" across nodes, which doesn't incur the 
> extra data copying"
>
> That is certainly true.  If what you need is precisely what scp does 
> then scp is the perfect tool.
>
> Thanks
> Joe
>
> On Sun, Aug 16, 2015 at 1:19 PM, Rick Braddy <[email protected]> wrote:
>>
>> Indeed. I did increase the concurrent tasks which helped for sure. 
>> Still much slower than basic "scp" across nodes, which doesn't incur 
>> the extra data copying.
>>
>>
>>
>> On Aug 16, 2015, at 11:49 AM, Aldrin Piri <[email protected]> wrote:
>>
>> Rick,
>>
>> Thanks for the logs. I did not see anything particularly out of the 
>> ordinary and would be inclined to believe there may have been some 
>> network hiccups in the process.
>>
>> NiFi has flowfiles queued on connections until the file is 
>> transferred to another relationship. You would be correct in that 
>> they are enqueued for the duration of transfer until the successful 
>> transmission occurs. To help throughput, and allow multiple files to 
>> traverse the network, you can allocate additional concurrent tasks to 
>> the input port receiving these files.
>>
>>
>> On Sat, Aug 15, 2015 at 18:20 Rick Braddy <[email protected]> wrote:
>>>
>>> Sure.  Attached zip file contains log files on the target node.
>>>
>>>
>>>
>>> I have also observed some occasional Putty disconnects from the 
>>> sending side’s terminal connection (Remote Process Group’s VM), 
>>> which makes me wonder if there may be a networking issue with it, so 
>>> it may not be a problem with NiFi at all.
>>>
>>>
>>>
>>> One other question I have.  When the network connection stayed up, 
>>> it was able to transfer the two 1 GB files and one 10 GB file from 
>>> source node to target node; however, it appears these files get 
>>> “queued” for a long period of time (showing up in the connection 
>>> between the Input Connector processor and the PutFile processor).  As 
>>> these are not “streamed”, I assume it’s just taking time to copy all that 
>>> data around and is to be expected.
>>>
>>>
>>>
>>> Is there a better way to “stream” from GetFile -> Remote Process 
>>> Group -> Input Connector -> PutFile (or something equivalent)?
>>>
>>>
>>>
>>> From: Aldrin Piri [mailto:[email protected]]
>>> Sent: Saturday, August 15, 2015 4:37 PM
>>>
>>>
>>> To: [email protected]
>>> Subject: Re: New to NiFi - Remote Process Group failing to connect 
>>> due to "magic header" not present
>>>
>>>
>>>
>>> Rick,
>>>
>>>
>>>
>>> Timeouts certainly aren't an expected behavior.  Might you have some 
>>> logs from your remote receiver that is receiving the files that we 
>>> could take a look at?
>>>
>>>
>>>
>>> It looks like the connection is functional in part, as one item did 
>>> at least make the transfer.
>>>
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>>> On Sat, Aug 15, 2015 at 5:21 PM, Rick Braddy <[email protected]> wrote:
>>>
>>> That appeared at first to resolve the connection problem, as I could 
>>> then see and connect to the remote input connector via the Remote 
>>> Process Group and my basic file transfer flow worked.
>>>
>>>
>>>
>>> However, now there are timeout warnings – assume this is not normal.
>>>
>>>
>>>
>>> <image001.png>
>>>
>>>
>>>
>>> From: Aldrin Piri [mailto:[email protected]]
>>> Sent: Saturday, August 15, 2015 4:06 PM
>>>
>>>
>>> To: [email protected]
>>> Subject: Re: New to NiFi - Remote Process Group failing to connect 
>>> due to "magic header" not present
>>>
>>>
>>>
>>> Rick,
>>>
>>>
>>>
>>> Site to Site works by talking to the port that the NiFi web tier is 
>>> running on and not the configured "nifi.remote.input.socket.port" 
>>> which is used after the initial handshaking and connection.  This is 
>>> likely why you are receiving the messages about the error of the 
>>> magic header.  It is receiving something from that socket, but not the 
>>> desired input.
>>>
>>>
>>>
>>> Make a Remote Process Group that points to port 8080 (assuming 
>>> this was left as the default) of your other instance and you should 
>>> be good to go.
>>>
>>>
>>>
>>> Please let us know if that is not the case.
>>>
>>>
>>>
>>> On Sat, Aug 15, 2015 at 4:57 PM, Rick Braddy <[email protected]> wrote:
>>>
>>> Hi Aldrin,
>>>
>>>
>>>
>>> Here are the property settings:
>>>
>>>
>>>
>>> # Site to Site properties
>>>
>>> nifi.remote.input.socket.port=8081
>>>
>>> nifi.remote.input.secure=false
>>>
>>>
>>>
>>> Referencing remote node via http://<IP>:8081/nifi (not the UI 
>>> address, but the separate site-to-site listener, which I see via 
>>> netstat on target node)
>>>
>>>
>>>
>>> Rick
>>>
>>>
>>>
>>> From: Aldrin Piri [mailto:[email protected]]
>>> Sent: Saturday, August 15, 2015 3:48 PM
>>> To: [email protected]
>>> Subject: Re: New to NiFi - Remote Process Group failing to connect 
>>> due to "magic header" not present
>>>
>>>
>>>
>>> Rick,
>>>
>>>
>>>
>>> Welcome to the community!
>>>
>>>
>>>
>>> We seem to be a little short in the documentation department for how 
>>> to make use of Remote Process Groups, but will look to remedy that.
>>>
>>>
>>>
>>> Just to confirm a few settings, both your nodes have a 
>>> nifi.remote.input.socket.port set and each has 
>>> nifi.remote.input.secure set to false within your nifi.properties.
>>>
>>>
>>>
>>> From here, are you referencing the remote node via its UI address?  
>>> Out of the box, this would be <server FQDN/IP>:8080/nifi
>>>
>>>
>>>
>>> Thanks,
>>>
>>> Aldrin
>>>
>>>
>>>
>>> On Sat, Aug 15, 2015 at 4:11 PM, Rick Braddy <[email protected]> wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> I’m new to NiFi, trying to get my first Remote Process Group 
>>> configured and working between two CentOS nodes.  For expediency, I 
>>> have configured site-to-site port to 8081 and set secure to false 
>>> (to avoid dealing with SSL certificate setup for now – will get to that 
>>> later).
>>>
>>>
>>>
>>> Trying to get two nodes to communicate using Remote Process Group.
>>> Google is not finding any useful examples of how to set this up and 
>>> get it to work, so kind of fumbling through it today (learning a 
>>> lot, but slow going).
>>>
>>>
>>>
>>> I found the nifi-app.log file and why the Remote Process Group is 
>>> failing to connect to the second node.  Not sure why the “Magic Header”
>>> isn’t right, but connections are being closed and getting Read 
>>> Timeouts on the RPG sending node – the receiving NiFi node is 
>>> closing the connection because it thinks the sender isn’t a valid 
>>> NiFi node due to missing/incorrect magic header.
>>>
>>>
>>>
>>> 2015-08-15 13:28:08,939 ERROR [Site-to-Site Worker Thread-27] 
>>> o.a.nifi.remote.SocketRemoteSiteListener Unable to communicate with 
>>> remote instance null due to 
>>> org.apache.nifi.remote.exception.HandshakeException:
>>> Handshake with nifi://SoftNAS-RGB1:57336 failed because the Magic 
>>> Header was not present; closing connection
>>>
>>>
>>>
>>> Not sure where to go from here to resolve this.
>>>
>>>
>>>
>>> Rick
>>>
>>>
>>>
>>>
>>>
>>>
>
>
