[ 
https://issues.apache.org/jira/browse/VXQUERY-131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14361885#comment-14361885
 ] 

Till Westmann commented on VXQUERY-131:
---------------------------------------

Good questions :)

1. VXQuery is currently a pure query processor and doesn't support updates, so 
there's nothing that we can write while processing. However, we could certainly 
write the result of a query back to HDFS. I don't think we have called that 
out explicitly, but it would certainly be a great addition to the ability to 
read from HDFS.

2. There is a very nice solution for integrating JSON and XML processing called 
JSONiq (http://www.jsoniq.org). JSONiq extends the XQuery data model by adding 
arrays and objects, and XQuery itself by adding functions that work with these 
new instances of the data model. It would be great to extend VXQuery to also 
support JSONiq, but there's no plan to do that so far (but plans can change ...).

> Supporting Hadoop data and cluster management
> ---------------------------------------------
>
>                 Key: VXQUERY-131
>                 URL: https://issues.apache.org/jira/browse/VXQUERY-131
>             Project: VXQuery
>          Issue Type: Improvement
>            Reporter: Preston Carman
>            Assignee: Preston Carman
>              Labels: gsoc, gsoc2015, hadoop, java, mentor, xml
>
> Many organizations support Hadoop. It would be nice to be able to read data 
> from this source. The project will include creating a strategy (with the 
> mentor's guidance) for reading XML data from HDFS and implementing it. When 
> connecting VXQuery to HDFS, the strategy may need to consider how to read 
> sections of an XML file. 
> In addition, we could use YARN as our cluster manager. Apache Hadoop YARN 
> (Yet Another Resource Negotiator) would be a good cluster management tool for 
> VXQuery. If VXQuery can read data from HDFS, then why not also manage the 
> cluster with a tool provided by Hadoop? The solution would replace the 
> current custom Python scripts for cluster management.
> Goals
> - Read XML from HDFS
> - Manage the VXQuery cluster with YARN



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
