Sir, I am presently working on the following project. I think your project is similar to this one:
High Performance Distributed Computing for Big Data using the Hadoop framework, running applications on large clusters (supercomputing). The project includes:

- A distributed file system (HDFS), programming support for MapReduce, and infrastructure software for grid computing.
- A framework for capturing workload statistics and replaying workload simulations, to allow assessment of framework improvements.
- A benchmark suite for data-intensive supercomputing: a suite of application benchmarks that presents a target that Hadoop (and other MapReduce implementations) should be optimized for.
- A scalable Internet anomaly detector over a very high-throughput event stream; the goal is low latency as well as high throughput. It could be used for all sorts of things, such as intrusion detection.
- Open source data management software that helps organizations analyze massive volumes of structured and unstructured data.
- Deployment of a Hadoop cluster consisting of a number of server nodes; these will be used to store data and process it in parallel in a distributed fashion. To automate the setup, I use Python.
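To make the MapReduce part concrete, here is a minimal local sketch of the mapper/reducer model in the style of Hadoop Streaming, using word count as the example. This is illustrative only; the function names (`mapper`, `reducer`, `run_local`) are my own and not part of Hadoop, and a real job would read stdin/stdout under the Streaming API rather than run in-process.

```python
from itertools import groupby

def mapper(lines):
    """Emit (word, 1) pairs, as a Hadoop Streaming mapper would."""
    for line in lines:
        for word in line.strip().split():
            yield (word.lower(), 1)

def reducer(pairs):
    """Sum counts per word; pairs must arrive sorted by key,
    which is what Hadoop's shuffle/sort phase guarantees."""
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield (word, sum(count for _, count in group))

def run_local(lines):
    """Simulate the map -> shuffle/sort -> reduce pipeline locally."""
    mapped = sorted(mapper(lines))  # stands in for the shuffle/sort phase
    return dict(reducer(mapped))

counts = run_local(["big data needs big clusters", "data data everywhere"])
# counts["data"] == 3, counts["big"] == 2
```

In the real framework the same mapper and reducer logic runs in parallel across the cluster's nodes; only the shuffle/sort between them is what Hadoop provides.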
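For the anomaly-detector idea, a toy sketch of a low-latency detector over an event stream might flag any event that deviates from a sliding window of recent values by more than k standard deviations. The window size, threshold, and warm-up count below are arbitrary assumptions, not values from the project.

```python
import math
from collections import deque

class StreamAnomalyDetector:
    """Flag stream values more than k sigma from a sliding-window mean.
    A deliberately simple baseline; real intrusion detection would use
    richer features than a single scalar per event."""

    def __init__(self, window=100, k=3.0):
        self.window = deque(maxlen=window)
        self.k = k

    def observe(self, value):
        """Return True if value looks anomalous, then add it to history."""
        anomalous = False
        if len(self.window) >= 10:  # require a minimal baseline first
            mean = sum(self.window) / len(self.window)
            var = sum((x - mean) ** 2 for x in self.window) / len(self.window)
            std = math.sqrt(var)
            if std > 0 and abs(value - mean) > self.k * std:
                anomalous = True
        self.window.append(value)
        return anomalous

det = StreamAnomalyDetector(window=50, k=3.0)
flags = [det.observe(v) for v in [9.5, 10.5] * 15 + [100.0]]
# only the final spike (100.0) is flagged
```

Each `observe` call is O(window) here; for genuinely high throughput one would maintain running sums instead of rescanning the window.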
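And for the Python automation step, one minimal approach is to generate the per-node SSH commands for a basic HDFS bring-up. The hostnames and install path below are placeholders I invented for illustration; a real setup would execute these via `subprocess` or a tool like Ansible rather than just building strings.

```python
HADOOP_HOME = "/opt/hadoop"  # assumed install path, adjust per cluster

def setup_commands(namenode, datanodes):
    """Return the ordered list of SSH commands for a basic HDFS bring-up:
    format the NameNode metadata once, start the NameNode daemon, then
    start a DataNode daemon on every worker."""
    cmds = [
        f"ssh {namenode} {HADOOP_HOME}/bin/hdfs namenode -format -force",
        f"ssh {namenode} {HADOOP_HOME}/bin/hdfs --daemon start namenode",
    ]
    for node in datanodes:
        cmds.append(f"ssh {node} {HADOOP_HOME}/bin/hdfs --daemon start datanode")
    return cmds

plan = setup_commands("master01", ["worker01", "worker02"])
# one format command, one namenode start, one datanode start per worker
```

Keeping the plan as data before executing it makes the automation easy to test and to dry-run against a new cluster layout.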
