[
https://issues.apache.org/jira/browse/VXQUERY-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369808#comment-14369808
]
AASHEESH RANJAN commented on VXQUERY-131:
-----------------------------------------
Sir, I am presently working on a project that I think is similar to yours:
High-performance distributed computing for big data, using the Hadoop
framework and running applications on large clusters (supercomputing).
It includes a distributed file system (HDFS), programming support for
MapReduce, and infrastructure software for grid computing. The work covers:
- Designing a framework for capturing workload statistics and replaying
  workload simulations, to allow the assessment of framework improvements.
- A benchmark suite for data-intensive supercomputing: a set of application
  benchmarks that would present a target that Hadoop (and other MapReduce
  implementations) should be optimized for.
- Designing and building a scalable Internet anomaly detector over a very
  high-throughput event stream, where the goal is low latency as well as high
  throughput. It could be used for all sorts of things, such as intrusion
  detection.
Hadoop is open source data management software that helps organizations
analyze massive volumes of structured and unstructured data. I deploy a Hadoop
cluster consisting of a number of server nodes, which are used to store data
and process it in a parallel, distributed manner.
To automate the setup, I use Python.
Sir, I have a problem: I do not know how to join your mailing list. Please
help.
> Supporting Hadoop data and cluster management
> ---------------------------------------------
>
> Key: VXQUERY-131
> URL: https://issues.apache.org/jira/browse/VXQUERY-131
> Project: VXQuery
> Issue Type: Improvement
> Reporter: Preston Carman
> Assignee: Preston Carman
> Labels: gsoc, gsoc2015, hadoop, java, mentor, xml
>
> Many organizations support Hadoop. It would be nice to be able to read data
> from this source. The project will include creating a strategy (with the
> mentor's guidance) for reading XML data from HDFS and implementing it. When
> connecting VXQuery to HDFS, the strategy may need to consider how to read
> sections of an XML file.
> In addition, we could use YARN as our cluster manager. Apache Hadoop YARN
> (Yet Another Resource Negotiator) would be a good cluster management tool for
> VXQuery. If VXQuery can read data from HDFS, then why not also manage the
> cluster with a tool provided by Hadoop? The solution would replace the
> current custom Python scripts for cluster management.
> Goal
> - Read XML from HDFS
> - Manage the VXQuery cluster with YARN
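
The issue notes that the strategy may need to consider how to read sections of
an XML file. One common approach, sketched here purely as an illustration and
not as VXQuery's actual design, is for each reader to own every record whose
opening tag starts inside its byte range, reading past the end of the range if
needed to finish the last record. The function name `records_in_split` and the
`book` tag are hypothetical, the data is an in-memory byte string rather than
an HDFS stream, and the sketch assumes opening tags without attributes and no
nesting of the record tag.

```python
from typing import List


def records_in_split(data: bytes, start: int, end: int, tag: str) -> List[bytes]:
    """Return complete <tag>...</tag> records whose opening tag starts in [start, end).

    Assumptions (illustrative only): the opening tag has no attributes and
    records with this tag are not nested inside each other.
    """
    open_tag = b"<" + tag.encode() + b">"
    close_tag = b"</" + tag.encode() + b">"
    records = []
    # Skip any partial record at the front: the previous split owns it.
    pos = data.find(open_tag, start)
    while pos != -1 and pos < end:
        close = data.find(close_tag, pos)
        if close == -1:
            break  # unterminated record: stop rather than guess
        stop = close + len(close_tag)
        # The record may extend past `end`; this split still owns it.
        records.append(data[pos:stop])
        pos = data.find(open_tag, stop)
    return records


if __name__ == "__main__":
    doc = b"<root><book>A</book><book>B</book><book>C</book></root>"
    # Two splits with a boundary in the middle of the second record.
    print(records_in_split(doc, 0, 25, "book"))
    print(records_in_split(doc, 25, len(doc), "book"))
```

With this ownership rule, a record that straddles a boundary is read exactly
once, by the split in which it begins.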