Avi, we are in the process of getting out a nicer base framework for transactional index creation, and an index provider for Redis. Meanwhile, if you want, you could look into the BerkeleyDB index that I tried to cook together (no guarantees there), https://github.com/peterneubauer/bdb-index, and see if that is something to contemplate?
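For illustration only, here is a minimal in-memory sketch of the fixed-size, most-recent-first feed described in the question quoted below. It is not the Redis index provider mentioned above, and the names (`RecentFeeds`, `record`, `page`) are hypothetical. It assumes one capped list per tag; in Redis the same shape is commonly built from the list commands LREM, LPUSH, LTRIM and LRANGE, but that mapping is an assumption about how you would store it, not something tested here.

```python
from collections import defaultdict


class RecentFeeds:
    """Hypothetical stand-in for per-tag capped lists kept outside the graph.

    Each feed holds node ids, newest first, and silently drops the oldest
    entries once it grows past max_size (the "fixed list" case).
    """

    def __init__(self, max_size=100):
        self.max_size = max_size
        self._feeds = defaultdict(list)  # tag -> node ids, newest first

    def record(self, tag, node_id):
        """Move node_id to the front of the feed for tag, then trim the feed."""
        feed = self._feeds[tag]
        if node_id in feed:
            feed.remove(node_id)      # avoid duplicates (roughly LREM)
        feed.insert(0, node_id)       # newest first (roughly LPUSH)
        del feed[self.max_size:]      # keep the size fixed (roughly LTRIM)

    def page(self, tag, offset=0, limit=10, oldest_first=False):
        """Return one page of node ids, front to back or back to front."""
        feed = self._feeds.get(tag, [])
        if oldest_first:
            feed = feed[::-1]
        return feed[offset:offset + limit]


feeds = RecentFeeds(max_size=3)
for node_id in (1, 2, 3, 2, 4):
    feeds.record("neo4j", node_id)

print(feeds.page("neo4j"))                     # [4, 2, 3]
print(feeds.page("neo4j", oldest_first=True))  # [3, 2, 4]
```

The ids returned by `page` would then serve as entry points into the graph (looking nodes up by id) instead of traversing from a super node. Where `record` gets called from, whether the data service layer or a transaction event handler, is left open here.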
Cheers,

/peter neubauer

GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer

http://www.neo4j.org - NOSQL for the Enterprise.
http://startupbootcamp.org/ - Öresund - Innovation happens HERE.

On Mon, Nov 21, 2011 at 2:40 AM, Avi Shai <unicornrainbowche...@gmail.com> wrote:
> What is the best way to create an external index but only for certain nodes?
> Really I want something like the in-graph data structures, but instead it
> will be stored in another database(s). I am in essence indexing only a
> sub-graph or a straight list of nodes. I then want to use these indexes as
> entry points in some cases rather than traversing.
>
> I understand that there is already Lucene, but I have data that is better
> suited to other indexes. I still want to use Lucene for full-text, just not
> for anything else. I am currently taking a stab at implementing the
> blueprint index interfaces (manual, automatic), but for another purpose. If
> I am always updating these indexes, but only for certain vertex types, what
> is the best integration point? In my data service classes/lower level neo4j
> stuff, or in a server event handler to plug into the transaction? What about
> for all vertices? I guess I understand how to write the index classes but
> not about the best way of consuming them, and not if they apply well for
> lots of partial, smaller indexes.
>
> For instance, I want to store data as temporal values, with the most recent
> data first for a group of nodes. I'm not doing "Twitter" or a blog, but
> either is a good enough analogy. If I post something with a given tag, I
> want to index all the nodes that have been tagged by that tag (tag edge) in
> temporal order, for example to create a "recently tagged" feed or a "recently
> seen users" feed that contains the users that have recently tagged using
> that tag. I could store this data in Redis exactly how I want and have a
> hot set in memory that can then be used either directly in some pages in my
> app, or as an entry point into neo4j for more complex queries. These indexes
> probably require lots of writes and I wanted to also avoid locking related
> nodes on any updates.
>
> Currently part of the reason I'm doing this is I have lots of super nodes in
> my design. I've patched this some by keeping counts in node properties and
> adding proxy nodes as mini-partitions to reduce the number of relationships.
> I've also looked at things like combining common nodes together as
> junctions, but there are too many permutations to scale probably. Anyway, if
> I use in-graph indexes, I have to update my indexes on every insertion or
> update. I'm going to try out indexed relationships, and I think it will
> help, but with respect, I don't think it will scale well or fit my use
> cases, especially for indexes where data drops out because the size is fixed
> (like a fixed list).
>
> I feel that creating index structures in the graph is nice, but it will
> severely balloon the graph. Moreover, I want to save resources on the
> servers running neo for graph traversals and other graph activities and I
> would rather use other clustered servers to store huge amounts of index data
> in memory. One other idea is to use another neo4j instance as an index to
> itself, but I think the characteristics of what I am doing are better suited
> in some cases for Redis (temporal lists) or Mongo (hierarchical metrics)
> depending on the use-case. Example: pulling down linear lists of time-data by
> page and sorting front to back or back to front.
>
> I know that's a lot, but I wanted to at least give some detail beyond what
> I've already read here in all the old posts I've dug through this week. Any
> feedback? Thanks.
>
>
> --
> View this message in context:
> http://neo4j-community-discussions.438527.n3.nabble.com/Creating-and-managing-external-index-tp3523613p3523613.html
> Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user