So clearly I need to RTFM.  I missed this in the documentation the first 
time.

http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping.html#_how_types_are_implemented

Will filters at this scale be fast enough?
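
For concreteness, here is a rough Python sketch of the edge-array layout from the original message and the filter it would need. It only builds the JSON bodies and does not talk to a cluster. The helper names (edge_labels, build_document, edge_filter_query), the "name" field, and the not_analyzed mapping for "edges" are my own assumptions for illustration, and the query uses the 1.x filtered/term syntax.

```python
import json

# Assumed mapping: "edges" must not be analyzed, otherwise a term filter
# would not match the full label string exactly (Elasticsearch 1.x syntax).
EDGES_MAPPING = {
    "properties": {
        "edges": {"type": "string", "index": "not_analyzed"}
    }
}


def edge_labels(source, edge_type, target_type):
    """Return the two labels used in the thread for one incoming edge:
    the type-specific label and the "all" label."""
    return [
        "%s + %s + %s" % (source, edge_type, target_type),
        "%s + %s" % (source, edge_type),
    ]


def build_document(entity, incoming_edges, target_type):
    """Copy the entity JSON once and attach every incoming edge label,
    instead of storing one copy of the document per label."""
    doc = dict(entity)
    labels = []
    for source, edge_type in incoming_edges:
        labels.extend(edge_labels(source, edge_type, target_type))
    doc["edges"] = labels
    return doc


def edge_filter_query(label):
    """Build a search body that filters on one exact edge label."""
    return {
        "query": {
            "filtered": {
                "query": {"match_all": {}},
                "filter": {"term": {"edges": label}},
            }
        }
    }


if __name__ == "__main__":
    duo = build_document({"name": "Duo"}, [("bob(user)", "likes")], "restaurant")
    print(json.dumps(duo, indent=2))
    print(json.dumps(edge_filter_query("bob(user) + likes + restaurant"), indent=2))
```

As I read the linked section, a type is stored as a _type field and searched with a filter, so this is the same mechanism applied to an "edges" field instead. Whether it holds up with millions of edges on one document is still the open question.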



On Friday, October 3, 2014 11:48:40 AM UTC-6, Todd Nine wrote:
>
> Hey guys,
>   We're currently storing entities and edges in Cassandra.  The entities 
> are JSON, and edges are directed edges with a source---type-->target. 
>  We're using ElasticSearch for indexing and I could really use a hand with 
> design.
>
> What we're doing currently is taking an entity and turning its JSON into 
> a document.  We then create multiple copies of the document and change its 
> type to match the index.  For instance, imagine the following use case.
>
>
> bob(user) -- likes --> Duo (restaurant)  ===> Document Types = 
> bob(user) + likes + restaurant ; bob(user) + likes
>
> bob(user) -- likes --> Root Down (restaurant)  ===> Document Types = 
> bob(user) + likes + restaurant ; bob(user) + likes
>
> bob(user) -- likes --> Coconut Porter (beer)  ===> Document Types = 
> bob(user) + likes + beer ; bob(user) + likes
>
>
> When we index using this scheme we create 3 documents based on the 
> restaurants Duo and Root Down, and the beer Coconut Porter.  We then store 
> each document 2x, once under its specific type and once in the "all" bucket.  
>
> Essentially, the document becomes a node in the graph.  For each incoming 
> directed edge, we're storing 2x documents and changing the type.  This 
> gives us fast seeks when we search by type, but a LOT of data bloat.  Would 
> it instead be more efficient to keep an array of incoming edges in the 
> document, then add it to our search terms?  For instance, should we instead 
> have a document like this?
>
>
> docId: Duo(restaurant)
>
> edges: [ "bob(user) + likes + restaurant", "bob(user) + likes" ]
>
> and then search where edges = "bob(user) + likes + restaurant"?
>
>
> I don't know what specifying a type actually does internally: whether it 
> just treats it as a field, or whether it changes the routing of the response.  In 
> a social situation millions of people can be connected to any one entity, 
> so we have to have a scheme that won't fall over when we get to that case.
>
> Any help would be greatly appreciated!
>
> Thanks,
> Todd
>
