The list-fetch approach sounds correct, and the micro acquisition
cluster (if necessary) also sounds like a good idea.
Regarding multiple hosts, the connection pooling in FetchSFTP does
account for that. It's basically a map from the hostname string to a
holder of connections for that hostname.
-B
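For anyone following along, here is a rough sketch of the idea described above: a map from the hostname string to a holder of idle connections for that hostname. This is not the actual FetchSFTP code. The names HostConnectionPool, Connection, acquire, release and openedCount are all made up for illustration, and Connection is only a stand-in for a real SFTP session.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a hypothetical pool keyed by hostname, not NiFi's implementation.
public class HostConnectionPool {

    // Hypothetical stand-in for a real SFTP connection.
    public static final class Connection {
        public final String host;

        Connection(String host) {
            this.host = host;
        }
    }

    // Hostname -> idle connections that can be reused for that hostname.
    private final Map<String, Deque<Connection>> idleByHost = new HashMap<>();

    // Counts how many connections had to be newly opened.
    private int opened = 0;

    // Reuse an idle connection for this host if one exists, otherwise open a new one.
    public synchronized Connection acquire(String host) {
        Deque<Connection> idle = idleByHost.get(host);
        if (idle != null && !idle.isEmpty()) {
            return idle.pop();
        }
        opened++;
        return new Connection(host);
    }

    // Return a connection to the holder for its own host so a later fetch can reuse it.
    public synchronized void release(Connection connection) {
        idleByHost.computeIfAbsent(connection.host, h -> new ArrayDeque<>()).push(connection);
    }

    public synchronized int openedCount() {
        return opened;
    }

    public static void main(String[] args) {
        HostConnectionPool pool = new HostConnectionPool();
        Connection first = pool.acquire("sftp-a.example.com");
        pool.release(first);
        Connection second = pool.acquire("sftp-a.example.com");
        Connection other = pool.acquire("sftp-b.example.com");
        System.out.println("reused for same host: " + (first == second));
        System.out.println("separate for other host: " + (other != first));
        System.out.println("connections opened: " + pool.openedCount());
    }
}
```

With a structure like this, flow files that point at different hosts simply land in different holders, so mixing hosts in one queue does not stop connections from being reused per host.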
Yep, that's exactly how I have it set up, with a push to RPG. Is that
preferred? I just started playing with it, to be honest. I can see how it
could be tricky if you have to pull from multiple servers, since each flow
file could potentially have a different SFTP host address in the queues.
All together we
Ryan,
The 10 seconds appears to be a hard-coded rule in the processor,
although it seems like it could be turned into a configurable
property.
It would require a code change to make it grab a batch of flow files
during a single execution. In theory it shouldn't provide that much of
a difference, b
Joe/Bryan Thanks!
I believe the one specific file per concurrent task/connection (and too
many threads) is the issue I have. We have a lot of small files and are
oftentimes backed up. I'm going to drop the task count to take advantage of
the pooling. Is it possible to have Fetch do batches vs a singl
Ryan,
Personally I don't have experience running these processors at scale,
but from a code perspective they are fundamentally different...
GetSFTP is a source processor, meaning it is not being fed by an upstream
connection, so when it executes it can create a connection and
retrieve up to max-sele
Ryan - don't know the code specifics behind FetchSFTP off-hand but I
can confirm there are users at that range for it.
Thanks
On Tue, Oct 31, 2017 at 11:38 AM, Ryan Ward wrote:
> I've found that on a single node GetSFTP is able to pull more files off a
> remote server than Fetch in a cluster. I n
I've found that on a single node GetSFTP is able to pull more files off a
remote server than Fetch in a cluster. I noticed Fetch doesn't have a max
selects so it is requiring way more connections (one per file?) and
concurrent threads to keep up.
Was wondering if anyone is using List/Fetch at scal