[
https://issues.apache.org/jira/browse/VXQUERY-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940394#comment-14940394
]
Steven Jacobs commented on VXQUERY-131:
---------------------------------------
We have several small issues that could give you a feel for the codebase. For
example, https://issues.apache.org/jira/browse/VXQUERY-54 is a request for our
string functions to match the XQuery spec (http://www.w3.org/TR/xquery/); some
of the string functions are still missing. As a simple starting point, you
could use the lowercase function that we already have as a model for
implementing the missing ones. This would be a fairly small change and would
give you a chance to get familiar with our runtime codebase, after which we
could point you to a larger project. We would love to have you take a look if
you are interested.
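
As a rough illustration of the "copy the lowercase function" approach, here is
a minimal, standalone Java sketch. It does not use the actual VXQuery runtime
classes: the class and method names are hypothetical, and fn:upper-case is
picked only as an example of a sibling function (the issue text above does not
say which functions are missing). The one spec detail it models is that an
empty-sequence argument yields the zero-length string, represented here by
null.

```java
import java.util.Locale;

// Hypothetical sketch, not VXQuery code: shows how a new string function can
// mirror the structure of an existing one.
public final class StringFunctionSketch {
    private StringFunctionSketch() {}

    // Stand-in for the existing lowercase function (fn:lower-case semantics).
    // A null input models the empty sequence and maps to "".
    public static String lowerCase(String input) {
        return input == null ? "" : input.toLowerCase(Locale.ROOT);
    }

    // New function written by following the same pattern (fn:upper-case).
    public static String upperCase(String input) {
        return input == null ? "" : input.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(lowerCase("VXQuery")); // prints "vxquery"
        System.out.println(upperCase("VXQuery")); // prints "VXQUERY"
    }
}
```

In the real codebase the new function would presumably plug into whatever
evaluator interfaces the existing lowercase function uses; the sketch only
shows the shape of the change.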
> Supporting Hadoop and Yarn
> --------------------------
>
> Key: VXQUERY-131
> URL: https://issues.apache.org/jira/browse/VXQUERY-131
> Project: VXQuery
> Issue Type: Improvement
> Reporter: Preston Carman
> Assignee: Steven Jacobs
> Labels: gsoc, gsoc2015, hadoop, java, mentor, xml
>
> Many organizations support Hadoop. It would be nice to be able to read data
> from this source. The project will include creating a strategy (with the
> mentor's guidance) for reading XML data from HDFS and implementing it. When
> connecting VXQuery to HDFS, the strategy may need to consider how to read
> sections of an XML file.
> We could use YARN as our cluster manager. Apache Hadoop YARN (Yet Another
> Resource Negotiator) would be a good cluster management tool for VXQuery. If
> VXQuery can read data from HDFS, then why not also manage the cluster with a
> tool provided by Hadoop? The solution would replace the current custom Python
> scripts for cluster management.
> Goal
> - Read XML from HDFS
> - Manage cluster with YARN