Avi,
we are in the process of getting out a nicer base framework for
transactional index creation, and an index provider for Redis.
Meanwhile, if you want, you could look into the BerkeleyDB index that I
tried to cook together (no guarantees there),
https://github.com/peterneubauer/bdb-index and see if that is
something to contemplate?

Cheers,

/peter neubauer

GTalk:      neubauer.peter
Skype       peter.neubauer
Phone       +46 704 106975
LinkedIn   http://www.linkedin.com/in/neubauer
Twitter      http://twitter.com/peterneubauer

http://www.neo4j.org              - NOSQL for the Enterprise.
http://startupbootcamp.org/    - Öresund - Innovation happens HERE.



On Mon, Nov 21, 2011 at 2:40 AM, Avi Shai
<unicornrainbowche...@gmail.com> wrote:
> What is the best way to create an external index but only for certain nodes?
> Really I want something like the in-graph data structures, but instead it
> will be stored in another database(s). I am in essence indexing only a
> sub-graph or a straight list of nodes. I then want to use these indexes as
> entry points in some cases rather than traversing.
>
> I understand that there is already Lucene, but I have data that is better
> suited to other indexes. I still want to use Lucene for full-text, just not
> for anything else. I am currently taking a stab at implementing the
> Blueprints index interfaces (manual, automatic), but for another purpose. If
> I am always updating these indexes, but only for certain vertex types, what
> is the best integration point? In my data service classes/lower level neo4j
> stuff, or in a server event handler to plug into the transaction? What about
> for all vertices? I guess I understand how to write the index classes but
> not about the best way of consuming them, and not if they apply well for
> lots of partial, smaller indexes.
>
> For instance, I want to store data as temporal values, with the most recent
> data first for a group of nodes. I'm not doing "Twitter" or a blog, but
> either is a good enough analogy.  If I post something with a given tag, I
> want to index all the nodes that have been tagged by that tag (tag edge) in
> temporal order for example to create a "recently tagged" feed or a "recently
> seen users" feed that contains the users that have recently tagged using
> that tag. I could store this data in Redis exactly how I want and have a
> hot set in memory that can then be used either directly in some pages in my
> app, or as an entry point into neo4j for more complex queries. These indexes
> probably require lots of writes and I wanted to also avoid locking related
> nodes on any updates.
>
> Currently part of the reason I'm doing this is I have lots of super nodes in
> my design. I've patched this some by keeping counts in node properties and
> adding proxy nodes as mini-partitions to reduce the number of relationships.
> I've also looked at things like combining common nodes together as
> junctions, but there are too many permutations to scale probably. Anyway, if
> I use in-graph indexes, I have to update my indexes every insertion or
> update. I'm going to try out indexed relationships, and I think it will
> help, but with respect, I don't think it will scale well or fit my use
> cases, especially for indexes where data drops out because the size is fixed
> (like a fixed list).
>
> I feel that creating index structures in the graph is nice, but it will
> severely balloon the graph. Moreover, I want to save resources on the
> servers running neo for graph traversals and other graph activities and I
> would rather use other clustered servers to store huge amounts of index data
> in memory. One other idea is to use another neo4j instance as an index to
> itself, but I think the characteristics of what I am doing are better suited
> in some cases for Redis (temporal lists) or Mongo (hierarchical metrics)
> depending on the use-case. Example: pulling down linear lists of time-data by
> page and sorting front to back or back to front.
>
> I know that's a lot, but I wanted to at least give some detail beyond what
> I've already read here in all the old posts I've dug through this week. Any
> feedback? Thanks.
>
>
> --
> View this message in context: 
> http://neo4j-community-discussions.438527.n3.nabble.com/Creating-and-managing-external-index-tp3523613p3523613.html
> Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.
> _______________________________________________
> Neo4j mailing list
> User@lists.neo4j.org
> https://lists.neo4j.org/mailman/listinfo/user
>
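
To make the "recently tagged" feed from the quoted message concrete, here is a
minimal sketch of a capped, newest-first list per tag. It is an in-memory
stand-in written in plain Python, not a Redis client and not part of Neo4j or
the bdb-index project; the class name RecentFeedIndex and its methods are
hypothetical. In Redis the same shape would roughly correspond to LPUSH
followed by LTRIM for writes and LRANGE for paging. Moving a re-tagged node
back to the front is an assumption about the desired behaviour.

```python
from collections import defaultdict, deque


class RecentFeedIndex:
    """Hypothetical in-memory model of a fixed-size, newest-first feed per tag."""

    def __init__(self, max_size=1000):
        self.max_size = max_size
        # tag -> deque of node ids, newest entry at the left end.
        # maxlen makes the oldest entry drop off the right end when full.
        self._feeds = defaultdict(lambda: deque(maxlen=max_size))

    def record(self, tag, node_id):
        """Register that node_id was just tagged with tag."""
        feed = self._feeds[tag]
        # Assumption: a node should appear once, at its most recent position.
        try:
            feed.remove(node_id)
        except ValueError:
            pass  # node was not in the feed yet
        feed.appendleft(node_id)

    def page(self, tag, offset=0, limit=10):
        """Return one page of node ids for tag, most recent first."""
        feed = self._feeds.get(tag, ())
        return list(feed)[offset:offset + limit]


# Usage: keep only the three most recent nodes for a tag.
index = RecentFeedIndex(max_size=3)
for node_id in [1, 2, 3, 4]:
    index.record("neo4j", node_id)
print(index.page("neo4j"))  # [4, 3, 2]
```

The returned node ids can then serve as entry points into the graph, so the
feed itself never has to be stored as relationships on a super node.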
