[ 
https://issues.apache.org/jira/browse/VXQUERY-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14497346#comment-14497346
 ] 

Efi Kaltirimidou commented on VXQUERY-131:
------------------------------------------

Dear Preston, my name is Efi Kaltirimidou and I am one of the students who
applied for this project for this year's GSoC.

After reading the details you described for these goals and studying the
project's code, I would like your opinion: which parts of these features do
you think are the most challenging to implement?
I think it would be good to know, so that I can start reading about them and
handle them without problems.

Another question I have is about the YARN scheduling options. YARN offers some
standard scheduling policies for workload optimization, such as FIFO, but also
allows custom algorithms to be implemented. Does something like this exist in
the current project and will it be implemented in YARN, or will something
standard be used?

Thank you, 
Efi 

> Supporting Hadoop data and cluster management
> ---------------------------------------------
>
>                 Key: VXQUERY-131
>                 URL: https://issues.apache.org/jira/browse/VXQUERY-131
>             Project: VXQuery
>          Issue Type: Improvement
>            Reporter: Preston Carman
>            Assignee: Preston Carman
>              Labels: gsoc, gsoc2015, hadoop, java, mentor, xml
>
> Many organizations support Hadoop. It would be nice to be able to read data 
> from this source. The project will include creating a strategy (with the 
> mentor's guidance) for reading XML data from HDFS and implementing it. When 
> connecting VXQuery to HDFS, the strategy may need to consider how to read 
> sections of an XML file. 
> In addition, we could use YARN as our cluster manager. Apache Hadoop YARN 
> (Yet Another Resource Negotiator) would be a good cluster management tool for 
> VXQuery. If VXQuery can read data from HDFS, then why not also manage the 
> cluster with a tool provided by Hadoop? The solution would replace the 
> current custom Python scripts for cluster management.
> Goal
> - Read XML from HDFS
> - Manage the VXQuery cluster with Yarn



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
