No, I had not seen that, thanks! Looks very interesting!


ajs6f

Phil Coates wrote on 9/5/17 11:04 AM:
Have you looked at CM-Well (https://github.com/thomsonreuters/CM-Well)?

This is based on Cassandra and ElasticSearch.


*Philip Coates*

[email protected] 
<mailto:[email protected]>
[email protected] <mailto:[email protected]>
skype:philip.coates.76
Tel: +44 (0)7711 818384

*SemanticIntegration* <http://www.semanticintegration.co.uk/>

On 5 September 2017 at 15:40, <[email protected]> wrote:

    The requirements for distributed storage are actually that DRAS-TIC (see
    that grant description) be used, and DRAS-TIC is 100% based around
    Cassandra, so effectively, the requirement is that Cassandra be used, at
    least at core. So part of what I am wondering (if it's not obvious) is
    "If we're going to have a Cassandra cluster as part of this, how can we
    get as much mileage as possible out of it?"

    I know that Cassandra offers some ordering capabilities out-of-the-box,
    although I'm not familiar with them. Maybe they could be used to support
    merge join generally.

    CumulusRDF (as shown in that paper I forwarded) uses a structure in which
    they mostly leave column values empty. The information is stored entirely
    in the keys, and use is made of prefix lookup. Does your system do
    something like that, Claude? It sounds like you are storing tuple
    components in the column values.
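A minimal sketch of the key-only idea mentioned above, in plain Python rather than Cassandra: each triple is stored as a sorted key with no value, and a pattern is answered by scanning the keys that share a prefix. The names `build_index` and `prefix_lookup` and the sample data are hypothetical, and this is not CumulusRDF's actual schema or code.

```python
from bisect import bisect_left

# Position of each triple component in an (s, p, o) tuple.
_POS = {"s": 0, "p": 1, "o": 2}

def build_index(triples, order):
    """Return a sorted list of keys; each key is a triple permuted into
    `order` (for example "spo" or "pos"). No values are stored."""
    return sorted(tuple(t[_POS[c]] for c in order) for t in triples)

def prefix_lookup(index, prefix):
    """Return every key in the sorted index that starts with `prefix`."""
    prefix = tuple(prefix)
    # A tuple that is a strict prefix sorts before its extensions, so
    # bisect_left lands on the first candidate key.
    start = bisect_left(index, prefix)
    matches = []
    for key in index[start:]:
        if key[:len(prefix)] != prefix:
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches

triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "age", "30"),
]
spo = build_index(triples, "spo")
pos = build_index(triples, "pos")
alice_rows = prefix_lookup(spo, ("alice",))          # all triples about alice
knows_carol = prefix_lookup(pos, ("knows", "carol"))  # who knows carol?
```

With one index per component ordering, any pattern whose bound components form a prefix of some ordering can be answered by a single range scan.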


    ajs6f

    Andy Seaborne wrote on 9/5/17 4:43 AM:


                On Mon, Sep 4, 2017 at 12:10 PM, <[email protected]> wrote:

                    Little of both? :grin:

                    Primarily I am interested because of a grant [1] in which
                    the Smithsonian Institution (where I work) is
                    participating in a supporting role (partly because I
                    convinced us to). That work involves using Cassandra for
                    distributed storage, and it will also involve a
                    distributed LDP implementation (the Fedora API referred
                    to in that grant description is really just a packaging
                    of Memento [2] with LDP [3]), hence my interest in
                    jena-on-cassandra.


        Turning this round - what are the requirements for the distributed
        storage?

                    As I understand the join question, the usual move with
                    Cassandra is to denormalize and store the joined data
                    together, but that's obviously nontrivial in our
                    situation, where we don't know the potential queries.
                    Have you looked at an indexing solution such as was used
                    by CumulusRDF [4]?


        (single graph example)

        If Cassandra has stored PSO and POS then parallel merge joins are
        possible.
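To illustrate the point for a single graph, here is a rough Python sketch of a merge join over two sorted scans. It assumes a POS scan for one predicate yields (object, subject) rows sorted by object, and a PSO scan for another predicate yields (subject, object) rows sorted by subject, so both inputs arrive ordered on the shared variable. The function name `merge_join` and the data are hypothetical; no Cassandra or Jena API is used here.

```python
def merge_join(left, right):
    """Join two lists of (join_key, payload) rows, each already sorted by
    join_key. Returns (join_key, left_payload, right_payload) tuples."""
    results = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, right_key = left[i][0], right[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Find the run of rows with this key on each side.
            i_end = i
            while i_end < len(left) and left[i_end][0] == left_key:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == left_key:
                j_end += 1
            # Emit every pairing within the two runs.
            for _, left_payload in left[i:i_end]:
                for _, right_payload in right[j:j_end]:
                    results.append((left_key, left_payload, right_payload))
            i, j = i_end, j_end
    return results

# Pattern: ?x knows ?y . ?y worksAt ?z   (joined on ?y)
knows_by_object = [("bob", "alice"), ("carol", "bob")]      # POS scan, P = knows
works_by_subject = [("bob", "acme"), ("dave", "initech")]   # PSO scan, P = worksAt
joined = merge_join(knows_by_object, works_by_subject)
```

Because each input is consumed in key order, the key space could in principle be split into disjoint ranges and each range joined independently, which is one way the join might be run in parallel.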

            Andy


                    ajs6f

                    [1] https://www.imls.gov/grants/awarded/lg-71-17-0159-17
                    [2] http://www.mementoweb.org/guide/quick-intro/
                    [3] https://www.w3.org/TR/ldp/
                    [4] http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Workshops/SSWS/Ladwig-et-all-SSWS2011.pdf

                    Claude Warren wrote on 9/2/17 12:44 PM:

                        Are you looking to use jena-on-cassandra, or do you
                        have ideas? What leads you to ask about it?


                        On Sat, Sep 2, 2017 at 1:21 PM, <[email protected]> wrote:

                        Hey, Claude--


                            Just curious as to where
                            https://github.com/Claudenw/jena-on-cassandra
                            has ended up. Is that still work-in-progress?

                            --

                            ajs6f







                --
                I like: Like Like - The likeliest place on the web
                <http://like-like.xenei.com>
                LinkedIn: http://www.linkedin.com/in/claudewarren




