Hi Robert,

Welcome to the Rya community! We are very excited about your proposed
contributions and the possibility of integrating Rya in your product. All
the contributions you proposed are of interest to us. In general,
contributions that do not break the public API will be merged faster. We
still need to discuss and decide whether to maintain multiple branches of
code when we switch from sesame to rdf4j, and it seems that now is the
time to really think about that.
Thanks a lot for contacting us and I'm looking forward to a long and
fruitful collaboration.

Best regards,
Adina

On Thu, Mar 15, 2018 at 1:36 PM, Robert David <robert.da...@semantic-web.com> wrote:

> Hi Rya dev community,
>
> we are very interested in Apache Rya and would like to contribute to the
> project, starting with some proposals for development.
>
> But first we would like to introduce ourselves. Semantic Web Company is the
> leading provider of graph-based metadata, search and analytic solutions.
> The company is the vendor of PoolParty Semantic Suite (
> https://www.poolparty.biz/), one of the most renowned semantic software
> platforms on the global market. PoolParty supports enterprise needs in
> information management, metadata management, cognitive computing, data
> analytics and content excellence.
>
> PoolParty consists of different components that all integrate triple stores
> for data storage. We use the rdf4j api (currently 2.2.4) and integrate with
> stores from different vendors. Regarding data management, the different
> components have different requirements. Some components do a lot of rather
> small reads and writes, while others bulk store large data sets or use
> complex SPARQL queries for searching.
>
> The Rya store definitely looks promising for us. We would like to integrate
> it into our components and also contribute in the process. To do so, we
> identified some issues that we would have to solve. These might also be
> interesting for you, so we want to share our list here for discussion. We
> have already run some tests regarding these issues.
>
>
> Dependency versions:
> --------------------
>
> rdf4j: we are currently using 2.2.4. I think the switch from sesame to rdf4j
> is very important for integrators.
>
> mongodb: 3.6.3 seems to work with the current implementation.
>
> Accumulo: 1.8.1
>
>    - Note that the Accumulo upgrade also required two code changes that
>      should be checked by an Accumulo expert, for example in this file [1];
>      the other file was this one [2]. This is just for information now; we
>      would of course submit pull requests or patches.
>    - Also note that the Accumulo upgrade requires a libthrift upgrade
>      (when running Rya with the current libthrift dependency inside an
>      Alpine Docker image, we ran into a segmentation fault).
>
> Hadoop: 2.9.0
>
>
> Integration:
> ------------
>
> Programmatically:
>
> To integrate with most of our components, we will use the rdf4j library. As
> already noted above we are currently using version 2.2.4.
>
> For some of our components, atomic actions are mandatory. The current
> version of MongoDB does not seem to support this (only at the document
> level). However, the upcoming version 4 will provide ACID transactions,
> which we think could benefit everyone who needs data consistency
> guarantees for their use cases. Accumulo seems to have some sort of
> transactional behaviour, but we do not know how this works in combination
> with rdf4j. Maybe someone could answer this?
>
> SPARQL:
>
> We noted some issues regarding standards compliance of the SPARQL
> endpoint:
>
> - default URL path (like ...my.domain.org/sparql)
> - return of official MIME types (i.e. application/sparql-results+xml instead
> of text/xml)
> - content negotiation via HTTP headers
>
>
> Deployment and Testing:
> -----------------------
>
> We can deploy and test all components within our default system
> environment, where our system operations team will create the required
> nodes and all components will be installed as defined in Rya's installation
> guide either on one single node or on multiple nodes. The nodes would be
> dedicated VMs. So we can help not only with functional tests, but also
> regarding performance and scalability with different setups.
>
> We are also deploying Rya within an orchestrated cluster based on
> DC/OS (https://dcos.io/), which means:
>
> - Docker images with the components are managed by Mesosphere Marathon (
> https://mesosphere.com/blog/marathon-production-ready-containers/),
> deployed as an Apache Mesos framework (http://mesos.apache.org/).
> - getting the services involved (Accumulo/MongoDB + Rya) certified as a
> DC/OS service (https://universe.dcos.io/#/packages) would surely help to
> spread the word about Rya.
>
>
> We hope this sounds interesting for you and we would like to get your
> feedback. If you have any questions, please ask.
>
> Kind regards / Beste Grüße,
>
> Robert David
>
>
>
> *Robert David*
> CTO
> Semantic Web Company GmbH
>
> EU: +43-14021235
> US: (415) 800-3776
> https://www.poolparty.biz
> https://www.semantic-web.com
>



-- 
Dr. Adina Crainiceanu
Associate Professor
Computer Science Department
United States Naval Academy
410-293-6822
ad...@usna.edu
http://www.usna.edu/Users/cs/adina/
