Rick,

Yes, please do open a bug report on the GetFile recursion finding.

And yes, you can certainly have processors at the edges of the flow
(Get* and Put*) that are streaming oriented.  It always comes down to
how you demarcate objects within those streams, since those
objects/things/events/data/whatever are what NiFi operates on.
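
To make "demarcating objects within a stream" concrete, here is a minimal
sketch in plain Python. It does not use the NiFi processor API. The function
name `demarcate` and the fixed-size chunking rule are assumptions made up for
illustration; a real processor could just as well split on record or message
boundaries.

```python
import io
from typing import BinaryIO, Iterator


def demarcate(stream: BinaryIO, chunk_size: int) -> Iterator[bytes]:
    """Yield successive chunks of at most chunk_size bytes from stream.

    Hypothetical helper: each yielded chunk stands in for one discrete
    object that a flow could then queue and route on its own.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # an empty read means end of stream
            break
        yield chunk


# Usage: a 10-byte stream cut into objects of at most 4 bytes each.
chunks = list(demarcate(io.BytesIO(b"0123456789"), 4))
print(chunks)
```

Only one chunk is held in memory at a time, so the size of the source stream
is not bounded by available heap.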

Happy to talk through these things as you progress, and if there are
features you think make sense for a variety of use cases, please do let
us know.

Thanks
Joe

On Mon, Aug 17, 2015 at 6:27 PM, Rick Braddy <[email protected]> wrote:
> That makes perfect sense, Joe.  We figured out that we like the content 
> repository and queuing.  For huge files, we can develop our own processor 
> that breaks big files into smaller chunks (or streams, if required), so it
> should not actually be an issue - it's just a limitation of the
> out-of-the-box GetFile processor today (by the way, I ran into some issues
> with recursive subdirectories not being picked up and transferred - should I
> open a bug report on that one?)
>
> Rick
>
> -----Original Message-----
> From: Joe Witt [mailto:[email protected]]
> Sent: Sunday, August 16, 2015 8:02 PM
> To: [email protected]
> Subject: Re: New to NiFi - Remote Process Group failing to connect due to 
> "magic header" not present
>
> Rick
>
> The content repository is limited by the disk space available to it.  Data
> is not held resident in memory (RAM/heap) unless a
> processor brings it into memory and the vast majority of them would only ever 
> have some small buffer size held.  Thus you can truly handle objects that are 
> extremely large but they certainly cannot be larger than the disk space you 
> have available.
>
> Streaming performance in NiFi would compare favorably with other systems that
> are transactional and durable or which delegate to some remotely accessible 
> messaging bus.  It would likely not compare favorably with systems that are 
> non-transactional and in-memory.
>
> Thanks
> Joe
>
> On Sun, Aug 16, 2015 at 5:56 PM, Rick Braddy <[email protected]> wrote:
>> Yeah. There's a trade off between pure transfer speed of a specialized
>> utility vs. the flexibility and power of NiFi.
>>
>> Also concerned over very large files that won't fit in content
>> repository memory.
>>
>> How does streaming performance compare thru NiFi?
>>
>>
>>
>> On Aug 16, 2015, at 12:32 PM, Joe Witt <[email protected]> wrote:
>>
>> Rick,
>>
>> "much slower than basic "scp" across nodes, which doesn't incur the
>> extra data copying"
>>
>> That is certainly true.  If what you need is precisely what scp does
>> then scp is the perfect tool.
>>
>> Thanks
>> Joe
>>
>> On Sun, Aug 16, 2015 at 1:19 PM, Rick Braddy <[email protected]> wrote:
>>>
>>> Indeed. I did increase the concurrent tasks which helped for sure.
>>> Still much slower than basic "scp" across nodes, which doesn't incur
>>> the extra data copying.
>>>
>>>
>>>
>>> On Aug 16, 2015, at 11:49 AM, Aldrin Piri <[email protected]> wrote:
>>>
>>> Rick,
>>>
>>> Thanks for the logs. I did not see anything particularly out of the
>>> ordinary and would be inclined to believe there may have been some
>>> network hiccups in the process.
>>>
>>> NiFi has flowfiles queued on connections until the file is
>>> transferred to another relationship. You would be correct in that
>>> they are enqueued for the duration of transfer until the successful
>>> transmission occurs. To help throughput, and allow multiple files to
>>> traverse the network, you can allocate additional concurrent tasks to
>>> the input port receiving these files.
>>>
>>>
>>> On Sat, Aug 15, 2015 at 18:20 Rick Braddy <[email protected]> wrote:
>>>>
>>>> Sure.  Attached zip file contains log files on the target node.
>>>>
>>>>
>>>>
>>>> I have also observed some occasional Putty disconnects from the
>>>> sending side’s terminal connection (Remote Process Group’s VM),
>>>> which makes me wonder if there may be a networking issue with it, so
>>>> may not be a problem with NiFi at all.
>>>>
>>>>
>>>>
>>>> One other question I have.  When the network connection stayed up,
>>>> it was able to transfer the two 1 GB files and one 10 GB file from
>>>> source node to target node; however, it appears these files get
>>>> “queued” for a long period of time (showing up in the connection
>>>> between the Input Connector processor and the PutFile processor).  As
>>>> these are not “streamed”, I assume it’s just taking time to copy all that 
>>>> data around and is to be expected.
>>>>
>>>>
>>>>
>>>> Is there a better way to “stream” from GetFile -> Remote Process
>>>> Group -> Input Connector -> PutFile (or something equivalent)?
>>>>
>>>>
>>>>
>>>> From: Aldrin Piri [mailto:[email protected]]
>>>> Sent: Saturday, August 15, 2015 4:37 PM
>>>>
>>>>
>>>> To: [email protected]
>>>> Subject: Re: New to NiFi - Remote Process Group failing to connect
>>>> due to "magic header" not present
>>>>
>>>>
>>>>
>>>> Rick,
>>>>
>>>>
>>>>
>>>> Timeouts certainly aren't an expected behavior.  Might you have some
>>>> logs from your remote receiver that is receiving the files that we
>>>> could take a look at?
>>>>
>>>>
>>>>
>>>> It looks like the connection is functional in part, as one item did
>>>> at least make the transfer.
>>>>
>>>>
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> On Sat, Aug 15, 2015 at 5:21 PM, Rick Braddy <[email protected]> wrote:
>>>>
>>>> That appeared at first to resolve the connection problem, as I could
>>>> then see and connect to the remote input connector via the Remote
>>>> Process Group and my basic file transfer flow worked.
>>>>
>>>>
>>>>
>>>> However, now there are timeout warnings – assume this is not normal.
>>>>
>>>>
>>>>
>>>> <image001.png>
>>>>
>>>>
>>>>
>>>> From: Aldrin Piri [mailto:[email protected]]
>>>> Sent: Saturday, August 15, 2015 4:06 PM
>>>>
>>>>
>>>> To: [email protected]
>>>> Subject: Re: New to NiFi - Remote Process Group failing to connect
>>>> due to "magic header" not present
>>>>
>>>>
>>>>
>>>> Rick,
>>>>
>>>>
>>>>
>>>> Site to Site works by talking to the port that the NiFi web tier is
>>>> running on and not the configured "nifi.remote.input.socket.port"
>>>> which is used after the initial handshaking and connection.  This is
>>>> likely why you are seeing the magic header error messages: the listener
>>>> is receiving something on that socket, but not the desired input.
>>>>
>>>>
>>>>
>>>> Create a Remote Process Group that points to port 8080 (assuming
>>>> this was left as the default) of your other instance and you should
>>>> be good to go.
>>>>
>>>>
>>>>
>>>> Please let us know if that is not the case.
>>>>
>>>>
>>>>
>>>> On Sat, Aug 15, 2015 at 4:57 PM, Rick Braddy <[email protected]> wrote:
>>>>
>>>> Hi Aldrin,
>>>>
>>>>
>>>>
>>>> Here are the property settings:
>>>>
>>>>
>>>>
>>>> # Site to Site properties
>>>>
>>>> nifi.remote.input.socket.port=8081
>>>>
>>>> nifi.remote.input.secure=false
>>>>
>>>>
>>>>
>>>> Referencing remote node via http://<IP>:8081/nifi (not the UI
>>>> address, but the separate site-to-site listener, which I see via
>>>> netstat on target node)
>>>>
>>>>
>>>>
>>>> Rick
>>>>
>>>>
>>>>
>>>> From: Aldrin Piri [mailto:[email protected]]
>>>> Sent: Saturday, August 15, 2015 3:48 PM
>>>> To: [email protected]
>>>> Subject: Re: New to NiFi - Remote Process Group failing to connect
>>>> due to "magic header" not present
>>>>
>>>>
>>>>
>>>> Rick,
>>>>
>>>>
>>>>
>>>> Welcome to the community!
>>>>
>>>>
>>>>
>>>> We seem to be a little short in the documentation department for how
>>>> to make use of Remote Process Groups, but will look to remedy that.
>>>>
>>>>
>>>>
>>>> Just to confirm a few settings: do both of your nodes have
>>>> nifi.remote.input.socket.port set, and does each have
>>>> nifi.remote.input.secure set to false within its nifi.properties?
>>>>
>>>>
>>>>
>>>> From here, are you referencing the remote node via its UI address?
>>>> Out of the box, this would be <server FQDN/IP>:8080/nifi
>>>>
>>>>
>>>>
>>>> Thanks,
>>>>
>>>> Aldrin
>>>>
>>>>
>>>>
>>>> On Sat, Aug 15, 2015 at 4:11 PM, Rick Braddy <[email protected]> wrote:
>>>>
>>>> Hi,
>>>>
>>>>
>>>>
>>>> I’m new to NiFi, trying to get my first Remote Process Group
>>>> configured and working between two CentOS nodes.  For expediency, I
>>>> have configured site-to-site port to 8081 and set secure to false
>>>> (to avoid dealing with SSL certificate setup for now – will get to that 
>>>> later).
>>>>
>>>>
>>>>
>>>> Trying to get two nodes to communicate using Remote Process Group.
>>>> Google is not finding any useful examples of how to set this up and
>>>> get it to work, so kind of fumbling through it today (learning a
>>>> lot, but slow going).
>>>>
>>>>
>>>>
>>>> I found the nifi-app.log file and why the Remote Process Group is
>>>> failing to connect to the second node.  Not sure why the “Magic Header”
>>>> isn’t right, but connections are being closed and getting Read
>>>> Timeouts on the RPG sending node – the receiving NiFi node is
>>>> closing the connection because it thinks the sender isn’t a valid
>>>> NiFi node due to missing/incorrect magic header.
>>>>
>>>>
>>>>
>>>> 2015-08-15 13:28:08,939 ERROR [Site-to-Site Worker Thread-27]
>>>> o.a.nifi.remote.SocketRemoteSiteListener Unable to communicate with
>>>> remote instance null due to 
>>>> org.apache.nifi.remote.exception.HandshakeException:
>>>> Handshake with nifi://SoftNAS-RGB1:57336 failed because the Magic
>>>> Header was not present; closing connection
>>>>
>>>>
>>>>
>>>> Not sure where to go from here to resolve this.
>>>>
>>>>
>>>>
>>>> Rick
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>
