[jira] [Commented] (VXQUERY-131) Supporting Hadoop data and cluster management

AASHEESH RANJAN (JIRA) Thu, 19 Mar 2015 11:00:12 -0700

    [ 
https://issues.apache.org/jira/browse/VXQUERY-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369808#comment-14369808
 ]


AASHEESH RANJAN commented on VXQUERY-131:
-----------------------------------------

Sir, I am presently working on the project described below. I think your 
project is similar to it.

High-Performance Distributed Computing for Big Data using the Hadoop 
Framework, running applications on large clusters (supercomputing)

Hadoop is open-source data management software that helps organizations 
analyze massive volumes of structured and unstructured data. It includes a 
distributed file system (HDFS), programming support for MapReduce, and 
infrastructure software for grid computing. My work covers:

- Designing a framework for capturing workload statistics and replaying 
  workload simulations, to allow the assessment of framework improvements.
- A benchmark suite for data-intensive supercomputing: a set of application 
  benchmarks that would present a target that Hadoop (and other MapReduce 
  implementations) should be optimized for.
- Designing and building a scalable Internet anomaly detector over a very 
  high-throughput event stream, where the goal is low latency as well as 
  high throughput. It could be used for many things, for example intrusion 
  detection.
- Deploying a Hadoop cluster consisting of a number of server nodes, which 
  store data and process it in parallel in a distributed manner. I use 
  Python to automate the setup.

Sir, I am having trouble joining your mailing list. Please help.

> Supporting Hadoop data and cluster management
> ---------------------------------------------
>
>                 Key: VXQUERY-131
>                 URL: https://issues.apache.org/jira/browse/VXQUERY-131
>             Project: VXQuery
>          Issue Type: Improvement
>            Reporter: Preston Carman
>            Assignee: Preston Carman
>              Labels: gsoc, gsoc2015, hadoop, java, mentor, xml
>
> Many organizations support Hadoop. It would be nice to be able to read data 
> from this source. The project will include creating a strategy (with the 
> mentor's guidance) for reading XML data from HDFS and implementing it. When 
> connecting VXQuery to HDFS, the strategy may need to consider how to read 
> sections of an XML file. 
> In addition, we could use Yarn as our cluster manager. The Apache Hadoop YARN 
> (Yet Another Resource Negotiator) would be a good cluster management tool for 
> VXQuery. If VXQuery can read data from HDFS, then why not also manage the 
> cluster with a tool provided by Hadoop? The solution would replace the 
> current custom Python scripts for cluster management.
> Goal
> - Read XML from HDFS
> - Manage the VXQuery cluster with Yarn
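
The issue description notes that the strategy may need to consider how to read sections of an XML file. One common way to do this, sketched below, is to give each reader a byte range and have it emit only the records whose opening tag starts inside that range, reading past the end of the range to finish a record it has started. This is an illustration only, not VXQuery's or Hadoop's implementation: the names `split_ranges` and `read_records` are hypothetical, the data is an in-memory byte string rather than an HDFS stream, and records are assumed to be non-nested elements with no attributes on the opening tag.

```python
def split_ranges(total, n):
    """Divide `total` bytes into at most `n` contiguous (start, end) ranges."""
    size = max(1, -(-total // n))  # ceiling division, at least 1 byte
    return [(s, min(s + size, total)) for s in range(0, total, size)]


def read_records(data, start, end, tag):
    """Yield every <tag>...</tag> record whose opening tag begins in [start, end).

    A record that starts inside the range but finishes after `end` is still
    emitted here, so the reader of the next range must skip it. That happens
    naturally because the next reader only accepts opening tags at or after
    its own `start`.
    """
    open_tag = b"<" + tag.encode() + b">"
    close_tag = b"</" + tag.encode() + b">"
    pos = data.find(open_tag, start)
    while pos != -1 and pos < end:
        stop = data.find(close_tag, pos)
        if stop == -1:
            break  # truncated record: nothing complete left to emit
        stop += len(close_tag)
        yield data[pos:stop]
        pos = data.find(open_tag, stop)


if __name__ == "__main__":
    xml = b"<root><rec>a</rec><rec>bb</rec><rec>ccc</rec></root>"
    for s, e in split_ranges(len(xml), 2):
        print((s, e), list(read_records(xml, s, e, "rec")))
```

With this rule each record is produced by exactly one reader, even when a range boundary falls in the middle of a record.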



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
