On Fri, Sep 2, 2011 at 3:33 PM, Peter Neubauer <[email protected]> wrote:
> Hi Linan,
> trying fast stabs at answers inline before heading home :)
>
> On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang <[email protected]> wrote:
>
>> hi,
>> i've got some questions that i couldn't find simple answers to in the
>> documents. i bet some of them are pretty primitive, bear with me please.
>>
>> 1, what's the general rule for choosing between properties and
>> relationships? say a User lives in a City, which just contains a simple
>> int id value. to find the users who live in a city, i can do a simple
>> traversal of all user nodes, or find the city node first, then collect
>> all the users. seems to me both ways work and share the same level of
>> performance. (am i right here?)
>>
> Generally, if a number of properties really denote the same concept
> (like a city), you don't want to duplicate the data, and you want to be
> able to traverse or query it, I would introduce nodes. However, if the
> node would turn into a supernode (like a city node with 100K
> relationships), then consider introducing an in-graph indexing structure,
> or an out-of-graph external index like Lucene, in order to look up
> relationships or nodes when you need them, since that will be cheaper.
>
is it https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships ?
it doesn't seem to be included in the current stable version.
>
>> 2, are index operations (add/remove/modify) threadsafe, so that no
>> lock/transaction is needed?
>>
> Yes, but the index framework is transactional, just like the graph. You
> need a TX for any modifying operation, but not for reads.
>
>> 3, do simple property writing operations also need to be wrapped
>> inside a transaction? if so, in the imdb example
>> tutor/domain/MovieImpl.java, underlyingNode.setProperty is neither used
>> within a transaction nor put into a save method. do all setProperty
>> calls work inside a transaction?
>>
> See Anders' reply and above.
Got the two. thanks!
>
>> 4, what's the best practice for bulk insertion while running (not
>> seeding initial data)? i read a post saying that too many insertions
>> within a transaction may lead to memory problems. what's the proper
>> number of insertions within a transaction?
>>
> Yes, transaction data is kept in memory before calling commit and flushing
> to disk, so overly large TX might result in memory problems. OTOH small TX
> incur higher IO load.
i'll probably do it with smaller batches (~1k operations per batch) from an external queue. does that sound reasonable?
>
>> 5, is there a suggested max length for string/array properties? would it
>> be better to put them into sql?
>>
> Well, the String store block size is adjustable (and we are working on even
> better layouts there), but for big strings like documents, a file system or
> Key/Value store might be better, and just keeping the reference to the
> location makes more sense.
ok, i'll use redis for strings.
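
[Editor's sketch for question 4] The batching idea above (roughly 1k operations per transaction, pulled from an external queue) might look like the following. This is only a minimal illustration in plain Java: `BatchedWriter` and `BatchCommitter` are hypothetical names, and the committer is a stand-in for "open a transaction, apply the operations, commit". No Neo4j API is used or implied here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical helper: groups queued operations into bounded batches.
final class BatchedWriter {

    /** Assumed stand-in for one transaction: applies a whole batch as a unit. */
    interface BatchCommitter<T> {
        void commit(List<T> batch);
    }

    /**
     * Drains the queue and hands its elements to the committer in groups of
     * at most batchSize. Returns the number of batches committed.
     */
    static <T> int drain(Queue<T> queue, int batchSize, BatchCommitter<T> committer) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        List<T> batch = new ArrayList<>(batchSize);
        T op;
        while ((op = queue.poll()) != null) {
            batch.add(op);
            if (batch.size() == batchSize) {
                // Batch is full: commit it and start a fresh one, so memory
                // held per "transaction" stays bounded.
                committer.commit(batch);
                batches++;
                batch = new ArrayList<>(batchSize);
            }
        }
        if (!batch.isEmpty()) {
            // Commit the final, partially filled batch.
            committer.commit(batch);
            batches++;
        }
        return batches;
    }
}
```

The batch size is the knob for the trade-off described in the answer: larger batches hold more uncommitted data in memory, smaller ones commit (and hit the disk) more often.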
>
>> 6, say a facebook user may "like" thousands of things, and these
>> things are sparsely connected. in this case, should things be modeled
>> as nodes or as an array property?
>>
> Nodes. Sparse connections are one of the places where Neo4j shines - a
> fairly balanced graph where supernodes are rare.
>
could you give a lower bound for what qualifies as a "supernode"? say 1k connections within a graph of 1m nodes?
>
>> 7, where can i find an example of using domain models with a serverplugin?
>> i want to put my data in a standalone server and just use the
>> serverplugin or an unmanaged extension. should i just put the domain
>> models into the same serverplugin jar?
>>
> Yes, I would do that. However, if you are not expecting to return Nodes,
> Relationships or Properties, an unmanaged extension will give you the full
> API of REST services. One extension built that way is, for instance, the
> scripting extension, see https://github.com/neo4j/script-extension
thanks. seems i really should look into github instead of neo4j.org ;)
>
>> 8, the warning in the documentation about unmanaged extensions is
>> scary. what i can see is that people may use bad ways, instead of
>> Iterator/IteratorWrappers. any comment on this?
>>
> Yeah. It's just a warning, no sudden death. With that approach, you are
> inventing your own API and can do whatever you want, for good and bad.
>
>> 9, i'm not sure if it's trivial: find the users who are only 2
>> relationships away (using the twitter example: my followees' followers),
>> live in the same city, and group them by age and gender. also retrieve
>> all their followees. i want to do the traversal in java, where can i
>> find an example?
>>
> Well,
> http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html
> should get you started? Also, in the next version, the Tinkerpop fluent
> iterator API (https://github.com/tinkerpop/pipes/wiki/FluentPipeline) is
> hopefully finding its way into the Neo4j release, if QA is ok, and you will
> have more options to do this.
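
[Editor's sketch for question 9] To make the shape of the query concrete, here is a small plain-Java illustration over an in-memory follower graph. It is not the Neo4j traversal API from the tutorial linked above; `TwoHopSketch`, `User`, and `sameCityTwoHops` are hypothetical names. It also assumes that "only 2 relationships away" means followers of my followees, excluding me and the people I already follow directly.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model, used only to illustrate the query shape.
final class TwoHopSketch {

    static final class User {
        final String name;
        final String city;
        final int age;
        final String gender;
        final Set<User> followees = new LinkedHashSet<>();
        final Set<User> followers = new LinkedHashSet<>();

        User(String name, String city, int age, String gender) {
            this.name = name;
            this.city = city;
            this.age = age;
            this.gender = gender;
        }

        /** Records "this follows other" in both directions. */
        void follow(User other) {
            followees.add(other);
            other.followers.add(this);
        }
    }

    /**
     * Collects followers of my followees who live in my city (excluding me
     * and my direct followees) and groups their names by "gender/age".
     */
    static Map<String, List<String>> sameCityTwoHops(User me) {
        Set<User> found = new LinkedHashSet<>();
        for (User followee : me.followees) {          // first hop
            for (User candidate : followee.followers) { // second hop
                if (candidate != me
                        && !me.followees.contains(candidate)
                        && candidate.city.equals(me.city)) {
                    found.add(candidate);
                }
            }
        }
        Map<String, List<String>> groups = new TreeMap<>();
        for (User u : found) {
            groups.computeIfAbsent(u.gender + "/" + u.age, k -> new ArrayList<>())
                  .add(u.name);
        }
        return groups;
    }
}
```

In this model the followees of each matched user are simply the `followees` set on that `User`; in a graph database the same two-hop pattern would be expressed as a traversal rather than nested loops.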
>
thanks, will check it out.
>
>> 10, i've had horrible experiences tuning jvm options. has neo4j
>> been run on the Zing JVM or the hp nonstop jvm? are they better options?
>>
> I think there are initial tests running on Zing, but I don't know for sure.
> If you have access to such a machine, it would be great if you could give
> feedback. Michael Hunger is doing a lot of these tests for hosting.
>
i don't have access to Zing either at this stage, but hey, it's always good to have a worst-case plan (by spending money) to save my job :)
>
> Sorry for the delay, hope this helps. Let us know if you have more
> questions!
many thanks! i understand documentation is probably not your top priority at this point, but since we are all programmers, we can read code. i feel the samples on the wiki and in the downloads have not been updated to use the most recent release.
>
> /peter
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>

--
Best regards

Linan Wang

