So clearly I need to RTFM. I missed this in the documentation the first time.
http://www.elasticsearch.org/guide/en/elasticsearch/guide/current/mapping.html#_how_types_are_implemented

Will filters at this scale be fast enough?

On Friday, October 3, 2014 11:48:40 AM UTC-6, Todd Nine wrote:
>
> Hey guys,
> We're currently storing entities and edges in Cassandra. The entities
> are JSON, and edges are directed edges with a source---type-->target.
> We're using ElasticSearch for indexing and I could really use a hand with
> design.
>
> What we're doing currently is taking an entity and turning its JSON into
> a document. We then create multiple copies of the document and change its
> type to match the index. For instance, imagine the following use case.
>
> bob(user) -- likes --> Duo (restaurant) ===> Document Types =
> bob(user) + likes + restaurant ; bob(user) + likes
>
> bob(user) -- likes --> Root Down (restaurant) ===> Document Types =
> bob(user) + likes + restaurant ; bob(user) + likes
>
> bob(user) -- likes --> Coconut Porter (beer) ===> Document Types =
> bob(user) + likes + beer ; bob(user) + likes
>
> When we index using this scheme we create 3 documents based on the
> restaurants Duo and Root Down, and the beer Coconut Porter. We then store
> each document 2x, once for its specific type and once in the "all" bucket.
>
> Essentially, the document becomes a node in the graph. For each incoming
> directed edge, we're storing 2x documents and changing the type. This
> gives us fast seeks when we search by type, but a LOT of data bloat. Would
> it instead be more efficient to keep an array of incoming edges in the
> document, then add it to our search terms? For instance, should we instead
> have a document like this?
>
> docId: Duo(restaurant)
>
> edges: [ "bob(user) + likes + restaurant", "bob(user) + likes" ]
>
> and then search where edges = "bob(user) + likes + restaurant"?
>
> I don't know what specifying a type actually does internally: whether it
> is just treated as a field, or whether it changes the routing of the
> response. In a social situation millions of people can be connected to
> any one entity, so we have to have a scheme that won't fall over when we
> get to that case.
>
> Any help would be greatly appreciated!
>
> Thanks,
> Todd
>
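
To make the array-of-edges alternative from the quoted message concrete, here is a rough sketch in Python that only builds the request bodies as plain dicts; it does not talk to a cluster. The helper names (edge_labels, entity_document, edge_filter_query) are made up for illustration. The `filtered` query with a `term` filter and the `not_analyzed` string mapping assume Elasticsearch 1.x-era syntax, and the mapping itself is an assumption: the label strings contain spaces and punctuation, so the `edges` field would probably need to be indexed unanalyzed for an exact match to work.

```python
import json


def edge_labels(source, edge_type, target_type):
    """Return the two labels the post attaches to each incoming edge:
    the specific one and the catch-all one."""
    return [
        f"{source} + {edge_type} + {target_type}",
        f"{source} + {edge_type}",
    ]


def entity_document(doc_id, incoming_edges):
    """Build ONE document per entity, carrying every incoming edge label
    in an array instead of storing a copy of the document per type."""
    labels = []
    for source, edge_type, target_type in incoming_edges:
        labels.extend(edge_labels(source, edge_type, target_type))
    return {"docId": doc_id, "edges": labels}


def edge_filter_query(label):
    """Request body that keeps only documents whose `edges` array
    contains the exact label (assumed ES 1.x filtered/term syntax)."""
    return {"query": {"filtered": {"filter": {"term": {"edges": label}}}}}


# Assumed mapping: store each label as a single unanalyzed token.
EDGES_MAPPING = {
    "properties": {"edges": {"type": "string", "index": "not_analyzed"}}
}

duo = entity_document("Duo(restaurant)", [("bob(user)", "likes", "restaurant")])
query = edge_filter_query("bob(user) + likes + restaurant")

print(json.dumps(duo, indent=2))
print(json.dumps(query, indent=2))
```

With this layout each entity is stored once and a new incoming edge adds two strings to its `edges` array instead of two more document copies. The sketch says nothing about how such a filter performs when millions of edges point at one entity; that would have to be measured.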
