Very good points. But I must admit that there is a demand for automatic indexing. I personally am not using it, but I would like prepared indexes: indexes that can be configured up front, after which you just add the node. I see your point that this implies more schema (in the index preparation), but I do not see that as avoidable.
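To make "configured up front, then just add the node" a bit more concrete, here is a rough sketch in plain Java. None of this is Neo4j API: `PreparedIndexSketch`, `Graph`, `Node`, `PreparedIndex`, `prepare`, `onProperty` and `nodeAdded` are all hypothetical names made up for illustration, and the in-memory map only stands in for whatever the real index would be. The second index in `main` files a node under a value that differs from the stored property (HTML tags stripped), which is the kind of case Niels describes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Hypothetical sketch: none of these types are part of Neo4j.
public class PreparedIndexSketch {

    /** Stand-in for a graph node: an id plus a property map. */
    public static final class Node {
        public final long id;
        public final Map<String, Object> properties;

        Node(long id, Map<String, Object> properties) {
            this.id = id;
            this.properties = new HashMap<>(properties);
        }
    }

    /** An index configured up front: which nodes qualify, and under which key. */
    public static final class PreparedIndex {
        private final Predicate<Node> qualifies;
        private final Function<Node, Object> keyOf;
        private final Map<Object, List<Node>> entries = new HashMap<>();

        public PreparedIndex(Predicate<Node> qualifies, Function<Node, Object> keyOf) {
            this.qualifies = qualifies;
            this.keyOf = keyOf;
        }

        /** The simple, cheap pattern: index any node that has the given property. */
        public static PreparedIndex onProperty(String property) {
            return new PreparedIndex(
                    node -> node.properties.containsKey(property),
                    node -> node.properties.get(property));
        }

        /** Called by the graph for every new node; non-matching nodes are ignored. */
        void nodeAdded(Node node) {
            if (qualifies.test(node)) {
                entries.computeIfAbsent(keyOf.apply(node), key -> new ArrayList<>()).add(node);
            }
        }

        public List<Node> get(Object key) {
            return entries.getOrDefault(key, Collections.emptyList());
        }
    }

    /** Toy graph that notifies every prepared index whenever a node is added. */
    public static final class Graph {
        private final List<PreparedIndex> indexes = new ArrayList<>();
        private long nextId = 0;

        public void prepare(PreparedIndex index) {
            indexes.add(index);
        }

        public Node addNode(Map<String, Object> properties) {
            Node node = new Node(nextId++, properties);
            for (PreparedIndex index : indexes) {
                index.nodeAdded(node);
            }
            return node;
        }
    }

    public static void main(String[] args) {
        Graph graph = new Graph();
        PreparedIndex byName = PreparedIndex.onProperty("name");
        // User-defined rule: index the text with the HTML tags stripped.
        PreparedIndex byPlainText = new PreparedIndex(
                node -> node.properties.get("html") instanceof String,
                node -> ((String) node.properties.get("html")).replaceAll("<[^>]*>", ""));
        graph.prepare(byName);
        graph.prepare(byPlainText);

        Map<String, Object> properties = new HashMap<>();
        properties.put("name", "timeline");
        properties.put("html", "<b>hello</b>");
        graph.addNode(properties);

        System.out.println(byName.get("timeline").size());
        System.out.println(byPlainText.get("hello").size());
    }
}
```

The qualifying rule and the key are both arbitrary code here, so a plain property check stays cheap while more involved rules remain possible.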
I think (or hope) that for automatic indexes, the criteria for how a node qualifies for indexing would be defined by the developer, hopefully with code, so it can be very general and flexible. For example, I guess that whenever a node is added to the graph, an event is triggered that passes the node to any listeners looking for patterns to match. For performance I guess there should be some simple patterns, like the existence of some property to index, but it would be good if the user could define the code to be called, so that more complex cases can be handled, like exploring the local sub-graph and indexing based on more complex criteria. Certainly the user will then have the power to hurt performance, but that is currently the case anyway :-)

On Mon, May 9, 2011 at 8:07 PM, Niels Hoogeveen <pd_aficion...@hotmail.com> wrote:

> Automatic indexes could be a very nice feature, though personally I would very much like to maintain the ability to manually index nodes and relationships. There are situations where I store a different value in a property than I store in the index (string properties containing html tags, but indexes that store those same values with the html tags stripped). There are also situations where the indexed node is not the node that actually contains the property being indexed (eg. in quad-store layout, a value node contains the property, but the related node is used in the index). I can also conceive of indexes where there is not even a stored property value involved.
> Having an automatic index would certainly make things easier in some scenarios, but it's not easy to create an automatic indexing mechanism that works for all possible use cases.
> I am also a little bit concerned about such a feature, because it would result in schema-creep. One of the most powerful features I find in Neo4J is how storage and schema are completely independent.
> In fact the store can be used without any schema at all, while at the same time the store can be used to persist a schema if that is needed.
> One of the things I disliked about table based databases is the mixing of storage and schema. It is impossible to define an entity without defining a table, which immediately creates a schema entity. Having strict separation of storage and schema is one of the reasons NOSQL databases are so flexible. Such separation makes it possible to invent different types of schemata for different use cases.
> When I still used relational databases, I always ended up replicating the schema facility of the underlying database to add more meta information to the database. Being able to roll my own schema facility is therefore one of the key features that made Neo4J such an attractive option. If more schema facilities would eventually creep into the kernel, those advantages would slowly dissipate.
>
> > Date: Mon, 9 May 2011 18:34:10 +0200
> > From: cr...@amanzi.com
> > To: user@lists.neo4j.org
> > Subject: Re: [Neo4j] Timeline index
> >
> > +10 for both of Niels' responses. I think both external and in-graph indexes should be supported.
> >
> > The last time I talked to Mattias about this it sounded like the only really clean option for integrating them behind one API would be once automatic indexes are supported, because at that point indexes get configured up-front (like the BTree and RTree) and then simply used (behind the scenes in automated indexes). I'm hoping automatic indexes are planned for 1.4, then all of this can come together :-)
> >
> > On Mon, May 9, 2011 at 3:14 PM, Niels Hoogeveen <pd_aficion...@hotmail.com> wrote:
> >
> > > Rick, I am looking forward to the results of your investigation. I see a need for both external search mechanisms (Lucene, and possibly Solr), as well as in-graph search mechanisms based on constrained traversals (eg.
> > > Timeline index based on a Btree and the Rtree index used in neo4j-spatial).
> > > Any progress in either direction is most welcome.
> > >
> > > > From: rick.bullo...@thingworx.com
> > > > To: matt...@neotechnology.com; user@lists.neo4j.org
> > > > Date: Mon, 9 May 2011 03:57:13 -0700
> > > > Subject: Re: [Neo4j] Timeline index
> > > >
> > > > Niels/Mattias: we are also exploring a Solr implementation for the index framework. There are some potential benefits using Solr in a large graph/HA/distributed scenario that we are investigating. The tough part is the distributed transactioning, though.
> > > >
> > > > ----- Reply message -----
> > > > From: "Mattias Persson" <matt...@neotechnology.com>
> > > > Date: Mon, May 9, 2011 6:14 am
> > > > Subject: [Neo4j] Timeline index
> > > > To: "Neo4j user discussions" <user@lists.neo4j.org>
> > > >
> > > > 2011/4/12 Niels Hoogeveen <pd_aficion...@hotmail.com>
> > > >
> > > > > Hi Mattias,
> > > > > Thank you for your response. I am currently working with the version you pointed out. My bigger concern is the possible deprecation of this component in future releases.
> > > > > As I pointed out, there are use cases where the Lucene timeline is not an appropriate choice, but the graph-based B-tree Timeline is. For example, versioning of nodes in the database requires an index per node (except for the version nodes of course), also when using a quad-store with an index on recent-entries per context, a timeline index per context is necessary. These type of scenarios can potentially create millions of indices, which can easily be stored in Neo4J, but are impossible! to store in the Lucene index component.
> > > > > So my issue is not so much how to use the graph-based B-tree Timeline index in version 1.3, but having similar functionality in future releases.
> > > > > BTW... love the work you have done on the Lucene component, which is much more flexible and usable than previous incarnations.
> > > > > Kind regards,
> > > > > Niels Hoogeveen
> > > >
> > > > Alright, I can see the problem with lucene indexes in such a case :) I think the "legacy index" will live on as it is for at least a while. It may not be valid with @Deprecated since it's now its own component, having the word "legacy" in it.
> > > >
> > > > Cool, right, it opens up the lucene functionality a bit more than the previous incarnation of the index framework.
> > > >
> > > > > > Date: Tue, 12 Apr 2011 22:37:42 +0200
> > > > > > From: matt...@neotechnology.com
> > > > > > To: user@lists.neo4j.org
> > > > > > Subject: Re: [Neo4j] Timeline index
> > > > > >
> > > > > > Hi Niels,
> > > > > >
> > > > > > I think you're right about the lucene-based timeline not being right for millions of indices, not possible even! The old index component isn't a part of the official release, but is supported and available as neo4j-legacy-index from neo4j maven repository, http://m2.neo4j.org/org/neo4j/neo4j-legacy-index/ and will have a version synchronized with the org.neo4j:neo4j artifact. Source is here: https://github.com/neo4j/legacy-index
> > > > > >
> > > > > > 2011/4/12 Niels Hoogeveen <pd_aficion...@hotmail.com>:
> > > > > >
> > > > > > > I appreciate the new indexing framework available, and noticed the addition of a Timeline based on Lucene. I was wondering if this is seen as a replacement of the original graph-based B-tree Timeline.
> > > > > > > If that is the case, I will have serious problems with the software I am developing, and which uses the B-tree based Timeline in several places where Lucene Timeline wouldn't work.
> > > > > > > I have many (potentially millions) timeline indexes in my application to maintain sort orders. Almost every node in the graph is actually the root of one or more Timeline indexes. This works well when the index is graph based, but I fear it wouldn't work when using Lucene. I don't think maintaining millions of indexes is something Lucene is particularly suited for.
> > > > > > > I didn't see neo4j-index being part of the 1.3-M5 release, so I'd like to check-up to see if this component remains to be supported.
> > > > > > > Kind regards,
> > > > > > > Niels Hoogeveen
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > Neo4j mailing list
> > > > > > > User@lists.neo4j.org
> > > > > > > https://lists.neo4j.org/mailman/listinfo/user
> > > > > >
> > > > > > --
> > > > > > Mattias Persson, [matt...@neotechnology.com]
> > > > > > Hacker, Neo Technology
> > > > > > www.neotechnology.com