Hi, are there any performance numbers on how fast HDFS can ingest data?
I am assuming a case where multiple processes outside Hadoop write into Hadoop in parallel. I understand that it probably depends on various hardware constraints, but any existing numbers would be interesting. In particular: Is the ingest rate tied to the number of nodes in the cluster? Does it scale roughly linearly? What is the impact of ingesting data at a high rate on the map/reduce jobs that are running?
Thanks, S.
