Shweta,

While this may deviate from your initial requirements, NiFi offers the ability 
to compress, resize, and extract metadata from your images. You can use NiFi 
to build an image-processing pipeline that prioritizes and routes the ~10% of 
incoming image data that needs to arrive within 4 seconds. The rest of the 
images will show up shortly after. Resizing and compression, where 
applicable, can also help move you towards your goal.
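
For reference, the arithmetic behind these figures can be sketched in a few 
lines of Python. This is only a rough check, assuming decimal units 
(1 GB = 8 Gb), a single 10GbE link, and zero protocol overhead; the function 
names are made up for illustration, and the 50 GB batch size is the upper end 
of the range in the original question.

```python
# Back-of-the-envelope transfer times.
# Assumptions: decimal units (1 GB = 8 Gb), no protocol or disk overhead.

def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Ideal time to move size_gb gigabytes over a link of link_gbps gigabits/s."""
    return size_gb * 8 / link_gbps


def links_needed(size_gb: float, link_gbps: float, deadline_s: float) -> float:
    """Parallel links of link_gbps needed to move size_gb within deadline_s."""
    return size_gb * 8 / (link_gbps * deadline_s)


batch_gb = 50.0    # upper end of the 40-50 GB range in the question
link_gbps = 10.0   # one 10GbE link

full_batch_s = transfer_seconds(batch_gb, link_gbps)
priority_s = transfer_seconds(batch_gb * 0.10, link_gbps)  # ~10% priority subset
links_for_1s = links_needed(batch_gb, link_gbps, 1.0)

print(f"full batch over one link: {full_batch_s:.1f} s")
print(f"10% priority subset:      {priority_s:.1f} s")
print(f"links for 1 s delivery:   {links_for_1s:.0f}")
```

Under those assumptions, the full batch needs about 40 seconds on one link, 
while a 10% priority subset fits in roughly 4 seconds, which is why routing 
the urgent images first helps.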

Have fun, 
Lee

On Nov 25, 2016, at 6:37 PM, Andy LoPresto <[email protected]> wrote:

> Unless my back of the envelope math is way off, to transfer 50GB (400Gb) per 
> second, you would need 40 parallel 10GbE connections, assuming absolutely no 
> overhead. Your precision for "a few seconds" would need to be 40+ seconds 
> using a single 10 GbE link and optimal transmission speed. 
> 
> From the Apache NiFi Overview document: 
> 
> "for something concrete and broadly applicable, consider the out-of-the-box 
> default implementations. These are all persistent with guaranteed delivery 
> and do so using local disk. So being conservative, assume roughly 50MB per 
> second read/write rate on modest disks or RAID volumes within a typical 
> server. NiFi for a large class of dataflows then should be able to 
> efficiently reach 100MB per second or more of throughput. "
> 
> Those numbers are at least 18 months old, so with a robust cluster of 8 
> high-performance machines and an optimized flow to balance computation across 
> all the boxes, I would ballpark a perfect world estimate at 1Gbps. My last 
> knowledge of HDFS write speeds was around 10-20Gbps. Again, if your tolerance 
> for the full process is 40-50 seconds, NiFi should be able to keep up, but 
> your uplink will probably be the long pole in the tent here. 
> 
> Feel free to correct any poor assumptions or bad math above. 
> 
> Andy LoPresto
> [email protected]
> [email protected]
> PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69
> 
>> On Nov 24, 2016, at 20:48, shweta Aggarwal <[email protected]> wrote:
>> 
>> Hi folks,
>> 
>> We have a requirement in one of our time-critical applications wherein we
>> are looking to transfer up to 40-50 GB worth of images
>> within a few seconds between a remote machine and HDFS.
>> 
>> Assuming network connectivity between the two is on a 10GbE link, with NIC
>> and socket buffers tuned optimally to give the best performance, does NiFi
>> have the capability to support the desired performance using a combination of
>> "getFile" and "putHDFS" on a high-end cluster of >8 nodes?
>> 
>> We are also exploring a combination of HDFS+GridFTP for fast transfer of
>> images from the remote machine to the HDFS cluster.
>> 
>> Any thoughts or pointers would be helpful.
>> 
>> Thanks!!
