Peter,
The project requires quick retrieval of data based upon certain parameters,
which would not be feasible without indexing (the relationships that hold
the data are all the same, so simple traversals won't work). As I said
earlier, I would have to extract data based upon combinations of
parameters, which I could easily do by indexing the same data on different
parameters and taking advantage of Lucene's QueryParser.
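
To make the idea concrete, here is a minimal sketch in plain Java of what I
mean by indexing the same node on several parameters and then combining
them. It is only an in-memory stand-in, not Lucene or the Neo4j index API;
the class MultiFieldIndex and the field names "city" and "type" are made up
for illustration. With Lucene's query parser, the equivalent combined query
would look something like city:Oslo AND type:post.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a Lucene-backed index:
// field -> value -> IDs of the nodes indexed under that field/value pair.
class MultiFieldIndex {
    private final Map<String, Map<String, Set<Long>>> postings = new HashMap<>();

    // Index one node under one parameter; call once per parameter of the node.
    void add(long nodeId, String field, String value) {
        postings.computeIfAbsent(field, f -> new HashMap<>())
                .computeIfAbsent(value, v -> new HashSet<>())
                .add(nodeId);
    }

    // Node IDs matching a single field/value term (empty set if none).
    Set<Long> lookup(String field, String value) {
        return postings.getOrDefault(field, Collections.emptyMap())
                       .getOrDefault(value, Collections.emptySet());
    }

    // Node IDs matching every given term (an AND query).
    // Each term is a two-element array: {field, value}.
    Set<Long> lookupAll(String[]... terms) {
        Set<Long> result = null;
        for (String[] term : terms) {
            Set<Long> hits = lookup(term[0], term[1]);
            if (result == null) {
                result = new HashSet<>(hits);   // copy so the index is not mutated
            } else {
                result.retainAll(hits);         // intersect with earlier terms
            }
        }
        return result == null ? new HashSet<>() : result;
    }
}
```

Each node is added once per parameter, and a combined lookup is just the
intersection of the per-parameter results, which is the part I would rather
leave to Lucene than write myself.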

Regarding those 50k users, they will mostly be using the C and R of CRUD
(reads will be the most common, though). I estimate that at any time 30-50%
of them, i.e. roughly 15,000 to 25,000 users, would be using the project.

Suppose a user generates 20 new nodes and 20 new relationships (not
relationship types) per day. I would not index the data that they're
posting, but the node number, so that I get to the node with less memory
usage. That seems efficient to me because, although I use a larger number
of nodes, I get a smaller index. To index a node X (with some data in it),
I can index a node Y (empty) that has a direct relationship with X. This
uses up nodes, but at least it gives me a smaller and faster index (and I
have a virtually unlimited number of nodes with Neo4j). (I know this scheme
might seem a little questionable because I am wasting nodes, and someday,
when scalability becomes a factor, I might have to rectify this!)
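
Roughly, the X/Y scheme would behave like the sketch below. This is plain
Java with maps standing in for the index and for the Y-to-X relationship; it
is not Neo4j code, and ProxyNodeIndex and its method names are invented just
to show the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the scheme: the index stores only the ID of an
// empty node Y, and Y has a direct relationship to the data node X.
class ProxyNodeIndex {
    private final Map<String, Long> keyToProxy = new HashMap<>();  // index: key -> Y
    private final Map<Long, Long> proxyToData = new HashMap<>();   // relationship: Y -> X
    private long nextNodeId = 0;

    // Allocate a fresh node ID (stands in for creating a node in the graph).
    long createNode() {
        return nextNodeId++;
    }

    // Create an empty proxy node Y for data node X and index Y under the key.
    // Returns the ID of Y.
    long indexViaProxy(String key, long dataNodeId) {
        long proxyId = createNode();
        proxyToData.put(proxyId, dataNodeId);
        keyToProxy.put(key, proxyId);
        return proxyId;
    }

    // Lookup takes two steps: the index gives Y, then one hop gives X.
    // Returns null if the key was never indexed.
    Long find(String key) {
        Long proxyId = keyToProxy.get(key);
        return proxyId == null ? null : proxyToData.get(proxyId);
    }

    // Total nodes allocated so far, data nodes and proxies together.
    long nodesCreated() {
        return nextNodeId;
    }
}
```

As the sketch shows, every indexed X costs one extra node and one extra hop
at lookup time, which is the waste I mentioned above.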

Any suggestions regarding the same?

Also, my database would need to store mostly strings. What about putting
another layer in front of my Neo4j db that maps those strings to ints, so
that I could index those easily in Neo4j? For example, this additional
layer could map an email address (which I need to index) to a unique user
id, which I can then index using Neo4j.
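
Something along these lines is what I have in mind for the mapping layer.
It is only a sketch in plain Java, kept in memory; StringIdMapper and idFor
are names I made up, and a real version would need to persist the mapping
somewhere. Trimming and lower-casing the address before lookup is my own
assumption about how emails should be normalized.

```java
import java.util.Locale;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical layer in front of the graph db: maps each email address
// to a stable numeric user ID that can then be indexed instead of the string.
class StringIdMapper {
    private final ConcurrentHashMap<String, Long> ids = new ConcurrentHashMap<>();
    private final AtomicLong nextId = new AtomicLong(1);

    // Returns the existing ID for this email, or assigns the next free one.
    // Assumption: addresses are compared after trimming and lower-casing.
    long idFor(String email) {
        String normalized = email.trim().toLowerCase(Locale.ROOT);
        // computeIfAbsent is atomic per key, so concurrent callers asking
        // for the same email all receive the same ID.
        return ids.computeIfAbsent(normalized, k -> nextId.getAndIncrement());
    }
}
```

Calling idFor twice with the same address gives back the same id, so the
numeric id is what would go into the index.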
_______________________________________________
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user
