[
https://issues.apache.org/jira/browse/ACCUMULO-1804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802995#comment-13802995
]
Josh Elser commented on ACCUMULO-1804:
--------------------------------------
[~aarongmldt], thanks for the code! Digging through some of the source, it
looks like you used the C++ interface for the Thrift proxy to connect R with
Accumulo. Given that and some of the API calls, it looks like raccumulo
requires at least Accumulo 1.5.0 and a running instance of the Thrift proxy.
Is that correct?
It's been quite a while since I've used R; where do you see this fitting into
the Accumulo community? Generally speaking, integration code like this takes
one of two avenues:
# Inclusion in core Accumulo codebase
# A "contrib" project of Accumulo
For #1, this would typically require a committer to sign up to ensure that the
code stays well-maintained as Accumulo itself grows. The code is held to a
certain level of testing and can reasonably be expected to work, since it would
be released with Accumulo itself. For #2, a contrib project is a way for
Accumulo to keep related code close to the main project. These projects
typically follow their own release schedule and aren't crucial to a release of
Accumulo itself.
To me, it seems like a contrib project is the best location for it at the
moment. What do you think? Other committers? Do you intend to maintain
raccumulo and add functionality as people use it and find bugs or suggest
improvements?
Thanks again. It's awesome to see contributions like this!
> Integrate RStudio to work with data residing in Accumulo
> --------------------------------------------------------
>
> Key: ACCUMULO-1804
> URL: https://issues.apache.org/jira/browse/ACCUMULO-1804
> Project: Accumulo
> Issue Type: Improvement
> Reporter: Aaron Glahe
> Priority: Minor
> Attachments: raccumulo-release.tar.gz
>
>
> Need to be able to support users who utilize RStudio to conduct analysis of
> data residing in the Accumulo data space instead of moving data from one
> repository to a standalone system to have the analytic run in memory.
> RStudio should be able to make calls directly to the data space and provide
> the output within the RStudio interface.