On Fri, Sep 2, 2011 at 3:33 PM, Peter Neubauer
<[email protected]> wrote:
> Hi Linan,
> trying fast stabs at answers inline before heading home :)
>
> On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang <[email protected]> wrote:
>
>> hi,
>> i've got some questions that i couldn't find simple answers to in the
>> documentation. i bet some of them are pretty primitive, bear with me please.
>>
>> 1, what's the general rule for choosing between properties and
>> relationships? say a User lives in a City, which just contains a simple
>> int id value. to find the users who live in a city, i can do a simple
>> traversal of all user nodes, or find the city node first and then collect
>> all its users. it seems to me both ways work and share the same level of
>> performance. (am i right here?)
>>
> Generally, if a number of properties really denote the same concept
> (like a city), and you don't want to duplicate the data but do want to be
> able to traverse or query it, I would introduce nodes. However, if the node
> would turn into a supernode (like a city node with 100K relationships), then
> consider introducing an in-graph indexing structure, or an out-of-graph
> external index like Lucene, in order to look up relationships or nodes when
> you need them, since that will be cheaper.
>
is the in-graph indexing structure you mean
https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
?
it doesn't seem to be included in the current stable version.
>
>> 2, are index operations (add/remove/modify) threadsafe, so that they
>> don't need a lock/transaction?
>>
> Yes, but the index framework is transactional, just like the graph. You need
> a TX for any modifying operation, but not for reads.
>
>
>> 3, do simple property write operations also need to be wrapped
>> inside a transaction? if so, in the imdb example
>> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
>> within a transaction nor inside a save method; do all setProperty
>> calls work inside a transaction?
>>
> See Anders' reply and above.
Got the two. thanks!
>
>
>> 4, what's the best practice for bulk insertion at runtime (not
>> seeding initial data)? i read a post saying that too many insertions
>> within a transaction may lead to memory problems. what's the proper
>> number of insertions within a transaction?
>>
> Yes, transaction data is kept in memory before calling commit and flushing
> to disk, so overly large TX might result in memory problems. OTOH small TX
> incur higher IO load.
i'll probably do it with smaller batches (~1k operations per batch)
from an external queue. does that sound reasonable?
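
To make the batching idea concrete, here is a minimal sketch in plain Java. It is not tied to the Neo4j API: `BatchSketch`, `partition`, and `processInBatches` are hypothetical names, and the `commitBatch` callback stands in for whatever code would open a transaction, apply the operations, and commit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper, not part of Neo4j: groups queued operations into
// fixed-size batches so that each batch can be committed in its own transaction.
class BatchSketch {

    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    // Hands each batch to commitBatch. In a real application that callback
    // (an assumption here) would wrap the batch in one transaction.
    // Returns the number of batches processed.
    static <T> int processInBatches(List<T> items, int batchSize,
                                    Consumer<List<T>> commitBatch) {
        List<List<T>> batches = partition(items, batchSize);
        for (List<T> batch : batches) {
            commitBatch.accept(batch);
        }
        return batches.size();
    }
}
```

With 2,500 queued operations and a batch size of 1,000, this yields three batches (1,000, 1,000 and 500), which keeps the in-memory transaction state bounded without committing after every single write.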
>
>
>> 5, is there a suggested max length for string/array property? would it
>> be better to put into sql?
>>
> Well, the String store block size is adjustable (and we are working on even
> better layouts there), but for big strings like documents, a file system or
> Key/Value store might be better; just keeping a reference to the
> location in the graph makes more sense.
ok, i'll use redis for strings.

>
>> 6, say a facebook user may "like" thousands of things, and these
>> things are sparsely connected. in this case, should things be modeled
>> as nodes or as an array property?
>>
> Nodes. Sparse connections are one of the places where Neo4j shines: a
> fairly balanced graph where supernodes are rare.
>
could you give a lower bound for what qualifies as a "supernode"? say 1k
connections within a graph of 1m nodes?

>
>> 7, where can i find an example to use domain models with serverplugin?
>> i want to put my data in a standalone server and just use the
>> serverplugin, unmanaged extension. should i just put the domain models
>> into the same serverplugin jar?
>>
> Yes, I would do that. However, if you are not expecting to return Nodes,
> Relationships or Properties, an unmanaged extension will give you the full
> API of REST services. One extension built that way is, for instance, the
> scripting extension, see https://github.com/neo4j/script-extension
thanks. seems i really should look into github instead of neo4j.org ;)
>
>> 8, the warning in the documentation about unmanaged extensions is
>> scary. what i can see is that people may use bad ways, instead of
>> Iterator/IteratorWrappers. any comment on this?
>>
> Yeah. It's just a warning, no sudden death. With that approach, you are
> inventing your own API and can do whatever you want, for good and bad.
>
>
>> 9, i'm not sure if it's trivial: find the users who are only 2
>> relationships away (using the twitter example: my followees' followers)
>> and live in the same city, grouped by age and gender. also retrieve all
>> their followees. i want to do the traversal in java, where can i find an
>> example?
>>
> Well,
> http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html
> should get you started? Also, in the next version, the Tinkerpop fluent
> iterator API (https://github.com/tinkerpop/pipes/wiki/FluentPipeline) is
> hopefully finding its way into the Neo4j release, if QA is ok, and you will
> have more options to do this.
>
thanks, will check it out.
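
For reference, the shape of the query in question 9 can be sketched in plain Java over an in-memory model. This is only an illustration of the logic, not the Neo4j traversal API: `TwoHopSketch`, `User`, `follow`, and `sameCityTwoHops` are all hypothetical names, and a real implementation would walk relationships on graph nodes instead of Java sets.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory model; stands in for nodes and FOLLOWS relationships.
class TwoHopSketch {

    static class User {
        final String name;
        final String city;
        final int age;
        final String gender;
        final Set<User> followees = new LinkedHashSet<>();
        final Set<User> followers = new LinkedHashSet<>();

        User(String name, String city, int age, String gender) {
            this.name = name;
            this.city = city;
            this.age = age;
            this.gender = gender;
        }

        // Records the relationship in both directions.
        void follow(User other) {
            followees.add(other);
            other.followers.add(this);
        }
    }

    // Collects my followees' followers who live in my city (excluding me),
    // grouped by an "age/gender" key. Each user's own followees remain
    // reachable through the returned User objects.
    static Map<String, List<User>> sameCityTwoHops(User me) {
        Set<User> candidates = new LinkedHashSet<>();
        for (User followee : me.followees) {
            for (User follower : followee.followers) {
                if (follower != me && follower.city.equals(me.city)) {
                    candidates.add(follower);
                }
            }
        }
        Map<String, List<User>> groups = new TreeMap<>();
        for (User user : candidates) {
            String key = user.age + "/" + user.gender;
            groups.computeIfAbsent(key, k -> new ArrayList<>()).add(user);
        }
        return groups;
    }
}
```

The set of candidates removes duplicates, since the same user can be reached through several followees.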
>
>> 10, i've had horrible experiences tuning jvm options. has neo4j
>> been run on the Zing JVM or the hp nonstop jvm? are they better options?
>>
> I think there are initial tests running on Zing, but I don't know for sure.
> If you have access to such a machine, it would be great if you can give
> feedback. Michael Hunger is doing a lot of these tests for hosting.
>
i don't have access to Zing either at this stage, but hey, it's always
good to have a worst-case plan (by spending money) to save my job
:)

>
> Sorry for the delay, hope this helps. Let us know if you have more
> questions!
many thanks! i understand documentation is probably not your top
priority at this point, but since we are all programmers, we can read
code. i feel the samples on the wiki and in the downloads have not been
updated to use the most recent release.
>
> /peter
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
>



-- 
Best regards

Linan Wang
