Re: Help with designing our document for graphs. Indexing single nodes in graph with thousands of incoming edges

[email protected] Sat, 04 Oct 2014 01:53:22 -0700

Not sure if this helps, but I use a variant of graphs in ES called
Linked Data (JSON-LD).


By using JSON-LD, you can index something like

doc index: graph
doc type: relations
doc id: ...

{
   "user" : {
      "id" : "...",
      "label" : "Bob",
      "likes" : "restaurant:Duo"
   }
}

for the statement "Bob likes restaurant Duo"
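
As a rough sketch of how such a document could be produced from a statement
(Python, standard library only): the helper name statement_to_doc and the id
value "user:bob" are made up for illustration, since the real id is elided
above.

```python
import json


def statement_to_doc(subject_id, subject_label, predicate, obj):
    """Build the nested document for one 'subject predicate object' statement.

    Hypothetical helper: it mirrors the document shape shown above and is
    not part of any Elasticsearch or JSON-LD library.
    """
    return {
        "user": {
            "id": subject_id,
            "label": subject_label,
            predicate: obj,
        }
    }


# "user:bob" is an assumed id, used only for this example.
doc = statement_to_doc("user:bob", "Bob", "likes", "restaurant:Duo")
print(json.dumps(doc, indent=2))
```

The printed JSON is what would be sent as the document body when indexing
into the "graph" index with type "relations".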

You can then run ES queries on the field "likes" (or, better, "user.likes")
to find the users that like a given restaurant, and so on. Using the "id",
you can look up another document about "Bob" in another index.
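
A minimal sketch of that query and the follow-up lookup, in Python. The
query body is a standard term query on "user.likes"; it assumes the field is
mapped so that "restaurant:Duo" is indexed as a single, unanalyzed term. To
keep the example runnable without a cluster, the two indices are simulated
with in-memory data, and all sample ids and documents are invented.

```python
def likes_query(target):
    """Term query body for 'who likes <target>' on the user.likes field."""
    return {"query": {"term": {"user.likes": target}}}


def get_path(doc, dotted_field):
    """Resolve a dotted field name such as 'user.likes' in a nested dict."""
    current = doc
    for key in dotted_field.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def run_term_query(docs, query):
    """Stand-in for ES: exact-match a single-field term query in memory."""
    ((field, value),) = query["query"]["term"].items()
    return [d for d in docs if get_path(d, field) == value]


# Simulated "graph" index (type "relations"); sample data is hypothetical.
relations = [
    {"user": {"id": "user:bob", "label": "Bob", "likes": "restaurant:Duo"}},
    {"user": {"id": "user:alice", "label": "Alice", "likes": "beer:Coconut Porter"}},
]

# Simulated second index holding one document per user, keyed by id.
users = {"user:bob": {"label": "Bob"}, "user:alice": {"label": "Alice"}}

hits = run_term_query(relations, likes_query("restaurant:Duo"))
user_ids = [hit["user"]["id"] for hit in hits]

# Follow the "id" reference into the other index.
profiles = [users[uid] for uid in user_ids if uid in users]
print(user_ids, profiles)
```

Against a real cluster, the dict returned by likes_query would be the search
request body, and the second step would be a get-by-id on the other index.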

This is just to give an idea of how you can model relations in structured
ES JSON objects.

Jörg


On Fri, Oct 3, 2014 at 7:59 PM, Todd Nine <[email protected]> wrote:

> So clearly I need to RTFM.  I missed this in the documentation the first
> time.
>
>
> http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping.html#_how_types_are_implemented
>
> Will filters at this scale be fast enough?
>
>
>
> On Friday, October 3, 2014 11:48:40 AM UTC-6, Todd Nine wrote:
>>
>> Hey guys,
>>   We're currently storing entities and edges in Cassandra.  The entities
>> are JSON, and edges are directed edges with a source---type-->target.
>> We're using ElasticSearch for indexing and I could really use a hand with
>> design.
>>
>> What we're doing currently is taking an entity and turning its JSON into
>> a document.  We then create multiple copies of our document and change its
>> type to match the index.  For instance, imagine the following use case.
>>
>>
>> bob(user) -- likes --> Duo (restaurant)  ===> Document Types =
>> bob(user) + likes + restaurant ; bob(user) + likes
>>
>>
>> bob(user) -- likes --> Root Down (restaurant)  ===> Document Types =
>> bob(user) + likes + restaurant ; bob(user) + likes
>>
>> bob(user) -- likes --> Coconut Porter (beer)  ===> Document Types =
>> bob(user) + likes + beer ; bob(user) + likes
>>
>>
>> When we index using this scheme we create 3 documents based on the
>> restaurants Duo and Root Down, and the beer Coconut Porter.  We then store
>> this document 2x, one for its specific type, and one in the "all" bucket.
>>
>> Essentially, the document becomes a node in the graph.  For each incoming
>> directed edge, we're storing 2x documents and changing the type.  This
>> gives us fast seeks when we search by type, but a LOT of data bloat.  Would
>> it instead be more efficient to keep an array of incoming edges in the
>> document, then add it to our search terms?  For instance, should we instead
>> have a document like this?
>>
>>
>> docId: Duo(restaurant)
>>
>> edges: [ "bob(user) + likes + restaurant", "bob(user) + likes" ]
>>
>> When searching where edges = "bob(user) + likes + restaurant"?
>>
>>
>> I don't know internally what specifying type actually does, if it just
>> treats it as a field, or if it changes the routing of the response?  In
>> a social situation millions of people can be connected to any one entity,
>> so we have to have a scheme that won't fall over when we get to that case.
>>
>> Any help would be greatly appreciated!
>>
>> Thanks,
>> Todd
>>
>  --
> You received this message because you are subscribed to the Google Groups
> "elasticsearch" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/elasticsearch/f97c6475-f4fc-4078-b052-b497ac82dc91%40googlegroups.com
> <https://groups.google.com/d/msgid/elasticsearch/f97c6475-f4fc-4078-b052-b497ac82dc91%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
> For more options, visit https://groups.google.com/d/optout.
>

