Re: [Neo4j] 10 questions

2011-09-12 Thread Peter Neubauer
Linan,
that is actually exactly what Neo4j is doing. Indexing is done via the
Index Framework, which makes external indices like Lucene, Redis,
BabuDB etc. conform to transactional semantics in order to keep the
Neo4j graph engine kernel and the indices consistent. See
http://docs.neo4j.org/chunked/snapshot/indexing.html for more details
on this. Besides Lucene, there are others that we have been testing,
like BabuDB, BerkeleyDB and Redis (soon to come out).

Cheers,

/peter neubauer

On Mon, Sep 5, 2011 at 1:05 PM, Linan Wang  wrote:
> thank you very much for the detailed reply.
> i'm wondering if neo4j had considered the option to use redis or other
> key-value store for storing properties and focus upon pure graph,
> similar to the role Lucene plays. i assume the major problem is the
> performance problem due to address lookup process inside redis value
> reading and communication overhead. but the gain is significant in
> scalability. just some random thoughts.

Re: [Neo4j] 10 questions

2011-09-05 Thread Tatham Oddie
Hi Linan,

> anyone show some love ;)

Generally 10 questions have a better chance of getting answered if they are 10 
separate threads. It's much easier to follow that way.


-- Tatham
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] 10 questions

2011-09-05 Thread Linan Wang
thank you very much for the detailed reply.
i'm wondering if neo4j had considered the option to use redis or other
key-value store for storing properties and focus upon pure graph,
similar to the role Lucene plays. i assume the major problem is the
performance problem due to address lookup process inside redis value
reading and communication overhead. but the gain is significant in
scalability. just some random thoughts.

-- 
Best regards

Linan Wang


Re: [Neo4j] 10 questions

2011-09-02 Thread Peter Neubauer
Linan,
see inline ...

On Fri, Sep 2, 2011 at 9:01 PM, Linan Wang  wrote:

> is it
> https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
> ?
> seems not included in the current stable ver.
>
No, this is work in progress. We are quite strict when it comes to including
things in the official release, since we need to test and document them
better. So feel free to test it; a lot of the code in there is stable, and
expect to fork and contribute if you find things to fix!


> >> 4, what's the best practice to do bulk insertion when running (not
> >> seed initial data)? i read post says that too many insertions within a
> >> transaction may lead to memory problem? what's the proper amount of
> >> insertion within a transaction?
> >>
> > Yes, transaction data is kept in memory before calling commit and
> > flushing to disk, so overly large TX might result in memory problems.
> > OTOH small TX incur higher IO load.
> i'll probably do it with smaller batches (~1k operations per batch)
> from an external queue. does it sound reasonable?
> >
>
Yes. From experience, there seem to be a lot of cases where transactions
holding between 1K and 10K operations give a good balance of performance
vs. RAM vs. persistence, if you can afford that for your data.
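
The batching idea above can be sketched in plain Java. This is only an
illustration under stated assumptions: TxBatcher and its commit callback are
hypothetical names, not part of the Neo4j API. In an embedded setup the
callback would presumably open one transaction per batch (for example via
graphDb.beginTx()) and apply the collected operations inside it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: collects operations and commits them in fixed-size batches. */
final class TxBatcher<T> {
    private final int batchSize;
    // Assumption: each call to this callback would run exactly one transaction.
    private final Consumer<List<T>> commit;
    private final List<T> pending = new ArrayList<>();
    private int commits = 0;

    TxBatcher(int batchSize, Consumer<List<T>> commit) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.commit = commit;
    }

    /** Queues one operation and commits as soon as a full batch is available. */
    void add(T op) {
        pending.add(op);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    /** Commits whatever is pending; call once at the end for the last partial batch. */
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        commit.accept(new ArrayList<>(pending)); // hand over a copy, then reset
        pending.clear();
        commits++;
    }

    int commits() {
        return commits;
    }

    public static void main(String[] args) {
        TxBatcher<Integer> batcher =
                new TxBatcher<>(1000, batch -> System.out.println("commit " + batch.size()));
        for (int i = 0; i < 2500; i++) {
            batcher.add(i);
        }
        batcher.flush();
    }
}
```

The batch size is the only tuning knob here: a larger value means fewer
commits but more uncommitted state held in memory, which is the trade-off
described above.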


> >
> >> 5, is there a suggested max length for string/array property? would it
> >> be better to put into sql?
> >>
> > Well, the String store block size is adjustable (and we are working on
> > even better layouts there), but for big strings like documents, a file
> > system or Key/Value store might be better, and just keeping the
> > reference to the location makes more sense.
> ok, i'll use redis for strings.
>
Probably a sensible choice. There might even be a Neo4j index coming out
for Redis, making it transactional with the graph like Lucene.


> >
> >> 6, say a facebook user may "likes" thousands of things, and these
> >> things are sparsely connected. in this case, things should be modeled
> >> as nodes or array property?
> >>
> > Nodes. Sparse connections are one of the places where Neo4j shines - a
> > fairly balanced graph where supernodes are rare.
> >
> could you give a bottom number qualifies "supernode"? say 1k
> connections within a graph of 1m nodes?
>
With the current store layout, probably 1K is a good number. We are working
on store changes that require fewer reads, but don't explicitly take care
of supernodes. That is planned, in which case this number will move upwards
with good performance :)

> >
> >> 7, where can i find an example to use domain models with serverplugin?
> >> i want to put my data in a standalone server and just use the
> >> serverplugin, unmanaged extension. should i just put the domain models
> >> into the same serverplugin jar?
> >>
> > Yes, I would do that. However, if you are not expecting to return Nodes,
> > Relationships or Properties, an unmanaged extension will give you the
> > full API of REST services. One extension that works that way is, for
> > instance, the scripting extension, see
> > https://github.com/neo4j/script-extension
> thanks. seems i really should look into github instead of neo4j.org ;)
>
Well, it's hard to list everything. We are right now trying to put as much
as possible into the manual, which can be generated, tested and curated, and
can serve as a reference.


> >
> > Sorry for the delay, hope this helps. Let us know if you have more
> > questions!
> many thanks! i understand documentation is probably not your top
> priority at this point, but since we are all programmers, we can read
> codes. i feel samples on wiki and downloads are not updated to use the
> most recent release.
>
I think what you are seeing is the Wiki getting outdated over time. We are
in the process of moving the Wiki content into docs.neo4j.org, which is
MUCH more up to date - all code you see in there is generated from running
tests. For real. So I think this will change soon, to the point where we can
delete most of the Wiki pages. Sorry for the inconvenience!

/peter


Re: [Neo4j] 10 questions

2011-09-02 Thread Linan Wang
yes, it's a smart question!

On Fri, Sep 2, 2011 at 4:06 PM, Rick Otten  wrote:
> Should one make an effort to keep "node 0" from becoming a 'supernode'?

-- 
Best regards

Linan Wang


Re: [Neo4j] 10 questions

2011-09-02 Thread Linan Wang
On Fri, Sep 2, 2011 at 3:33 PM, Peter Neubauer
 wrote:
> Hi Linan,
> trying fast stabs at answers inline before heading home :)
>
> On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang  wrote:
>
>> hi,
>> got some questions not found simple answers from the documents. i bet
>> some of them are pretty primitive, bear with me  please.
>>
>> 1, what's the general rule for choosing properties or relationship?
>> say a User lives in a City, which just contains a simple int  id
>> value. to find users live in a city, i can do a simple traversal, of
>> all user nodes, or find the city node first, then collect all the
>> users. seems to me both ways work and share same level of performance.
>> (am i right here?)
>>
> Generally, if a number of properties really is denoting the same concept
> (like a city) and you don't want to duplicate the data, and be able to
> traverse or query it, I would introduce nodes. However, if the node would
> turn into a supernode (like a city node with 100K relationships), then
> consider introducing an in-graph indexing structure, or an out-of-graph
> external index like Lucene in order to look up relationships or nodes when
> you need them, since that will be cheaper.
>
is it 
https://github.com/peterneubauer/graph-collections/wiki/Indexed-relationships
?
seems not included in the current stable ver.
>
>> 2, does index operation add/remove/modify threadsafe, don't need
>> lock/transaction?
>>
> Yes, but the index framework is transactional as well as the graph. You need
> TX for any modifying operation, but not for reads.
>
>
>> 3, does it simple property writing operations also need to be wrapped
>> inside transaction? if so, in the imdb example
>> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
>> within transaction, nor put into a save method, do all setProperty
>> works inside a transaction?
>>
> See Anders reply and above.
Got the two. thanks!
>
>
>> 4, what's the best practice to do bulk insertion when running (not
>> seed initial data)? i read post says that too many insertions within a
>> transaction may lead to memory problem? what's the proper amount of
>> insertion within a transaction?
>>
> Yes, transaction data is kept in memory before calling commit and flushing
> to disk, so overly large TX might result in memory problems. OTOH small TX
> incur higher IO load.
i'll probably do it with smaller batches (~1k operations per batch)
from an external queue. does it sounds reasonable?
>
>
>> 5, is there a suggested max length for string/array property? would it
>> be better to put into sql?
>>
> Well, the String store block size is adjustable (and we are working on even
> better layouts there), but for big strings like documents, a file system or
> Key/Value store might be better, and just keeping the reference to the
> location makes more sense.
ok, i'll use redis for strings.

>
>> 6, say a facebook user may "likes" thousands of things, and these
>> things are sparsely connected. in this case, things should be modeled
>> as nodes or array property?
>>
> Nodes. Sparse connections are one of the places where Neo4j shines - a
> fairly balanced graph where supernodes are seldom.
>
could you give a bottom number qualifies "supernode"? say 1k
connections within a graph of 1m nodes?

>
>> 7, where can i find an example to use domain models with serverplugin?
>> i want to put my data in a standalone server and just use the
>> serverplugin, unmanaged extension. should i just put the domain models
>> into the same serverplugin jar?
>>
>  Yes, I would do that. However, if you are not expecting to return Nodes,
> Relationships or Properties, an unmanaged extension will give you the full
> API of REST services. One extension that way is for instance the scripting
> extension, see https://github.com/neo4j/script-extension
thanks. seems i really should look into github instead of neo4j.org ;)
>
> 8, the warning in the documentation about unmanaged extension is
>> scary. what i can see is that people may use bad ways, instead of
>> Iterator/IteratorWrappers. any comment on this?
>>
> Yeah. It's just a warning, no sudden death. With that approach, you are
> inventing your own API and can do whatever you want, for good and bad.
>
>
>> 9, i'm not sure if it's trivial: find out users who are only 2
>> relationships away (use twitter example: my followees' followers),
>> live in same city, group by age and gender. also retrieve all their
>> followees. i want to do the traversal in java, where can i find an
>> examples?
>>
> Well,
> http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html
> should get you started? Also, in the next version, the Tinkerpop fluent
> iterator API (https://github.com/tinkerpop/pipes/wiki/FluentPipeline) is
> hopefully finding its way into the Neo4j release, if QA is ok, and you will
> have more options to do this.
>
thanks, will check it out.
>
>> 10, i've had horrible experience in tuning jvm options. have neo4j
>> been running on Zing

Re: [Neo4j] 10 questions

2011-09-02 Thread Marko Rodriguez
Oh, I didn't see this:

"(use twitter example: my followees' followers)"

Then the query I provided in the previous email:

g.v(1).out('livesIn').sideEffect{city = it}.back(2).out.out.filter{it.out('livesIn').next().equals(city)}.groupCount(age){it.age}.groupCount(gender){it.gender}

would now be:

g.v(1).out('livesIn').sideEffect{city = it}.back(2).in('follows').in('follows').filter{it.out('livesIn').next().equals(city)}.groupCount(age){it.age}.groupCount(gender){it.gender}

as in('follows') gives someone's followers (i.e. the people that follow me).

Enjoy,
Marko.

http://markorodriguez.com


Re: [Neo4j] 10 questions

2011-09-02 Thread Marko Rodriguez
Hi,

>> 9, i'm not sure if it's trivial: find out users who are only 2
>> relationships away (use twitter example: my followees' followers),
>> live in same city, group by age and gender. also retrieve all their
>> followees. i want to do the traversal in java, where can i find an
>> examples?

If you use Gremlin, then you can do your query as follows:

age = [:]
gender = [:]
people = g.v(1).out('livesIn').sideEffect{city = it}.back(2).out.out.filter{it.out('livesIn').next().equals(city)}.groupCount(age){it.age}.groupCount(gender){it.gender} >> []

NOTE: This is a Gremlin 1.2+ query (so use Neo4j 1.5M01).

The query says this:
1. create an empty hash map called age
2. create an empty hash map called gender
3. main traversal
   - start from vertex 1
   - determine what city vertex 1 lives in
   - save that city vertex to the variable city
   - go back to vertex 1 (back(2) means go back 2 steps)
   - go down 2 relationships (could be both.both if you are doing
     undirected)
   - filter out those vertices that do not live in the same city as
     vertex 1
   - index those vertices' ages into the age hash map, with the values
     being the distribution of ages
   - index those vertices' genders into the gender hash map, with the
     values being the distribution of genders
   - insert those vertices into an empty list and save the reference
     to people

Thus, your results are:
people : the people that meet the traversal description
age : their ages as a distribution (count)
gender: their genders as a distribution (count)
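
For readers who prefer to see the same steps without Gremlin, here is a
rough plain-Java sketch over an in-memory structure. It is an illustration
only, not Neo4j or TinkerPop API: TwoHopGroupCount, Person and traverse are
hypothetical names, and the city is modeled as a plain string instead of a
city vertex. Like a traversal without de-duplication, it counts a person
once per path that reaches them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class TwoHopGroupCount {
    /** Hypothetical stand-in for a person vertex; 'out' holds the outgoing edges. */
    static final class Person {
        final String city;
        final int age;
        final String gender;
        final List<Person> out = new ArrayList<>();

        Person(String city, int age, String gender) {
            this.city = city;
            this.age = age;
            this.gender = gender;
        }
    }

    /**
     * Walks two outgoing steps from start, keeps only people living in the
     * same city as start, and fills the two count maps as side effects.
     */
    static List<Person> traverse(Person start,
                                 Map<Integer, Integer> age,
                                 Map<String, Integer> gender) {
        List<Person> people = new ArrayList<>();
        for (Person first : start.out) {
            for (Person second : first.out) {
                if (!second.city.equals(start.city)) {
                    continue; // different city: filtered out
                }
                age.merge(second.age, 1, Integer::sum);
                gender.merge(second.gender, 1, Integer::sum);
                people.add(second);
            }
        }
        return people;
    }

    public static void main(String[] args) {
        Person me = new Person("Malmo", 30, "m");
        Person friend = new Person("London", 41, "f");
        Person sameCity = new Person("Malmo", 25, "f");
        Person otherCity = new Person("Paris", 25, "m");
        me.out.add(friend);
        friend.out.add(sameCity);
        friend.out.add(otherCity);

        Map<Integer, Integer> age = new HashMap<>();
        Map<String, Integer> gender = new HashMap<>();
        List<Person> people = traverse(me, age, gender);
        System.out.println(people.size() + " " + age + " " + gender);
    }
}
```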

As Peter says, you can do this in native Java (instead of Groovy) if you want 
using FluentPipeline, but that is still in SNAPSHOT over at TinkerPop and will 
not be released for about a month. 

Good luck with your project,
Marko.

http://markorodriguez.com


Re: [Neo4j] 10 questions

2011-09-02 Thread Rick Otten
Should one make an effort to keep "node 0" from becoming a 'supernode'?




Re: [Neo4j] 10 questions

2011-09-02 Thread Steven Kalemkiewicz
What would you consider the lower-bound to be to classify a node as a
supernode?  I saw that you referred to a city node with 100K
relationships...

-Steve

On Fri, Sep 2, 2011 at 10:33 AM, Peter Neubauer <
peter.neuba...@neotechnology.com> wrote:

> > 1, what's the general rule for choosing properties or relationship?
> > say a User lives in a City, which just contains a simple int  id
> > value. to find users live in a city, i can do a simple traversal, of
> > all user nodes, or find the city node first, then collect all the
> > users. seems to me both ways work and share same level of performance.
> > (am i right here?)
> >
> Generally, if a number of properties really is denoting the same concept
> (like a city) and you don't want to duplicate the data, and be able to
> traverse or query it, I would introduce nodes. However, if the node would
> turn into a supernode (like a city node with 100K relationships), then
> consider introducing an in-graph indexing structure, or an out-of-graph
> external index like Lucene in order to look up relationships or nodes when
> you need them, since that will be cheaper.
>
> > 6, say a facebook user may "likes" thousands of things, and these
> > things are sparsely connected. in this case, things should be modeled
> > as nodes or array property?
> >
> Nodes. Sparse connections are one of the places where Neo4j shines - a
> fairly balanced graph where supernodes are seldom.
>
>


Re: [Neo4j] 10 questions

2011-09-02 Thread Peter Neubauer
Hi Linan,
trying fast stabs at answers inline before heading home :)

On Thu, Sep 1, 2011 at 3:29 AM, Linan Wang  wrote:

> hi,
> got some questions not found simple answers from the documents. i bet
> some of them are pretty primitive, bear with me  please.
>
> 1, what's the general rule for choosing properties or relationship?
> say a User lives in a City, which just contains a simple int  id
> value. to find users live in a city, i can do a simple traversal, of
> all user nodes, or find the city node first, then collect all the
> users. seems to me both ways work and share same level of performance.
> (am i right here?)
>
Generally, if a number of properties really is denoting the same concept
(like a city) and you don't want to duplicate the data, and be able to
traverse or query it, I would introduce nodes. However, if the node would
turn into a supernode (like a city node with 100K relationships), then
consider introducing an in-graph indexing structure, or an out-of-graph
external index like Lucene in order to look up relationships or nodes when
you need them, since that will be cheaper.
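
As a rough illustration of why a keyed lookup is cheaper than scanning,
here is a small plain-Java sketch. It does not use the Neo4j or Lucene APIs:
CityLookup, User, scanByCity, buildIndex and lookup are hypothetical names,
and the HashMap merely plays the role that an external index would play.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class CityLookup {
    /** Hypothetical stand-in for a user node carrying a city id property. */
    static final class User {
        final long id;
        final int cityId;

        User(long id, int cityId) {
            this.id = id;
            this.cityId = cityId;
        }
    }

    /** Property approach: visit every user and compare the property value. */
    static List<Long> scanByCity(List<User> users, int cityId) {
        List<Long> hits = new ArrayList<>();
        for (User u : users) {
            if (u.cityId == cityId) {
                hits.add(u.id);
            }
        }
        return hits;
    }

    /** Index approach: build the city-to-users mapping once. */
    static Map<Integer, List<Long>> buildIndex(List<User> users) {
        Map<Integer, List<Long>> index = new HashMap<>();
        for (User u : users) {
            index.computeIfAbsent(u.cityId, k -> new ArrayList<>()).add(u.id);
        }
        return index;
    }

    /** A lookup then only touches the users of the requested city. */
    static List<Long> lookup(Map<Integer, List<Long>> index, int cityId) {
        return index.getOrDefault(cityId, Collections.<Long>emptyList());
    }

    public static void main(String[] args) {
        List<User> users = new ArrayList<>();
        users.add(new User(1L, 10));
        users.add(new User(2L, 20));
        users.add(new User(3L, 10));
        System.out.println(scanByCity(users, 10));
        System.out.println(lookup(buildIndex(users), 10));
    }
}
```

Both calls return the same users; the difference is that the scan cost
grows with the total number of users, while the keyed lookup cost grows
with the number of users in that one city.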


> 2, does index operation add/remove/modify threadsafe, don't need
> lock/transaction?
>
Yes, but the index framework is transactional as well as the graph. You need
TX for any modifying operation, but not for reads.


> 3, does it simple property writing operations also need to be wrapped
> inside transaction? if so, in the imdb example
> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
> within transaction, nor put into a save method, do all setProperty
> works inside a transaction?
>
See Anders' reply and above.


> 4, what's the best practice to do bulk insertion when running (not
> seed initial data)? i read post says that too many insertions within a
> transaction may lead to memory problem? what's the proper amount of
> insertion within a transaction?
>
Yes, transaction data is kept in memory before calling commit and flushing
to disk, so overly large TX might result in memory problems. OTOH small TX
incur higher IO load.


> 5, is there a suggested max length for string/array property? would it
> be better to put into sql?
>
Well, the String store block size is adjustable (and we are working on even
better layouts there), but for big strings like documents, a file system or
Key/Value store might be better, and just keeping the reference to the
location makes more sense.

> 6, say a facebook user may "likes" thousands of things, and these
> things are sparsely connected. in this case, things should be modeled
> as nodes or array property?
>
Nodes. Sparse connections are one of the places where Neo4j shines - a
fairly balanced graph where supernodes are seldom.


> 7, where can i find an example to use domain models with serverplugin?
> i want to put my data in a standalone server and just use the
> serverplugin, unmanaged extension. should i just put the domain models
> into the same serverplugin jar?
>
 Yes, I would do that. However, if you are not expecting to return Nodes,
Relationships or Properties, an unmanaged extension will give you the full
API of REST services. One extension that way is for instance the scripting
extension, see https://github.com/neo4j/script-extension

> 8, the warning in the documentation about unmanaged extension is
> scary. what i can see is that people may use bad ways, instead of
> Iterator/IteratorWrappers. any comment on this?
>
Yeah. It's just a warning, no sudden death. With that approach, you are
inventing your own API and can do whatever you want, for good and bad.


> 9, i'm not sure if it's trivial: find out users who are only 2
> relationships away (use twitter example: my followees' followers),
> live in same city, group by age and gender. also retrieve all their
> followees. i want to do the traversal in java, where can i find an
> examples?
>
Well,
http://docs.neo4j.org/chunked/snapshot/tutorials-java-embedded-traversal.html
should get you started. Also, in the next version, the Tinkerpop fluent
iterator API (https://github.com/tinkerpop/pipes/wiki/FluentPipeline) will
hopefully find its way into the Neo4j release, if QA is ok, giving you
more options to do this.
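
Independent of which traversal API is used, the shape of the query can be
sketched in plain Java: hop out along FOLLOWS to my followees, hop back in
to their followers, keep those in my city, and group by gender and age.
The User class, its fields and the sample data are assumptions made for
this sketch; it does not use the Neo4j traversal framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TwoHopSketch {

    /** Hypothetical in-memory user with FOLLOWS links in both directions. */
    static final class User {
        final String name;
        final String city;
        final int age;
        final String gender;
        final List<User> followees = new ArrayList<>();
        final List<User> followers = new ArrayList<>();

        User(String name, String city, int age, String gender) {
            this.name = name;
            this.city = city;
            this.age = age;
            this.gender = gender;
        }

        void follow(User other) {
            followees.add(other);
            other.followers.add(this);
        }
    }

    /** Followers of my followees who live in my city, grouped by "gender/age". */
    static Map<String, Set<String>> similarUsers(User me) {
        Map<String, Set<String>> groups = new TreeMap<>();
        for (User followee : me.followees) {              // hop 1: outgoing FOLLOWS
            for (User candidate : followee.followers) {   // hop 2: incoming FOLLOWS
                if (candidate == me || !candidate.city.equals(me.city)) {
                    continue;
                }
                String key = candidate.gender + "/" + candidate.age;
                groups.computeIfAbsent(key, k -> new TreeSet<>()).add(candidate.name);
            }
        }
        return groups;
    }

    static String demo() {
        User me = new User("me", "London", 30, "f");
        User star = new User("star", "Paris", 40, "m");
        User ann = new User("ann", "London", 30, "f");
        User bob = new User("bob", "London", 25, "m");
        User cat = new User("cat", "Berlin", 30, "f");
        me.follow(star);
        ann.follow(star);
        bob.follow(star);
        cat.follow(star);
        return similarUsers(me).toString(); // "{f/30=[ann], m/25=[bob]}"
    }
}
```

Retrieving "all their followees" afterwards is one more hop: read the
followees list of each user that ended up in a group.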


> 10, i've had horrible experience in tuning jvm options. have neo4j
> been running on Zing JVM, hp nonstop jvm? are they better options?
>
I think there are initial tests running on Zing, but I don't know for sure.
If you have access to such a machine, it would be great if you can give
feedback. Michael Hunger is doing a lot of these tests for hosting.


Sorry for the delay, hope this helps. Let us know if you have more
questions!

/peter
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] 10 questions

2011-09-02 Thread Linan Wang
great. i thought transactions only applied to node operations. seems
they also include indexing. it's handy!
other 9 questions? :)

On Fri, Sep 2, 2011 at 12:55 PM, Anders Nawroth wrote:
> Hi!
>
>>> Seems like the node and index modifications belong in the same
>>> transaction, to make sure any modifications to nodes are always
>>> reflected in the indexes as well. Otherwise they could get out of sync
>>> if your application crashes after the commit of the first,
>>> node-modifying transaction.
>> it seems confusing. if indexing actions wrapped inside a transaction
>> and the transaction fails, will indexing action automatically get
>> rollback? think this example:
>
> Yes, it will be rolled back - that's the point of performing multiple
> operations in the same transaction.
>
> /anders
>
>>
>> class User{
>> public void dosomething(){
>> //node actions
>> //index actions
>> }
>> }
>>
>> class Ext extends ServerPlugin{
>> public action(){
>> // get an array of users;
>> Transaction tx = graphDb.beginTx();
>> try
>> {
>>      ... // operations that work with the graph
>>      for(User u:users){
>>        u.dosomething();
>>     }
>>      tx.success();
>> }
>> finally
>> {
>>      tx.finish();
>> }
>> }
>>
>> }
>>
>>>
>>> /anders
>>>
>>>
>
>
> /anders
>



-- 
Best regards

Linan Wang


Re: [Neo4j] 10 questions

2011-09-02 Thread Anders Nawroth
Hi!

>> Seems like the node and index modifications belong in the same
>> transaction, to make sure any modifications to nodes are always
>> reflected in the indexes as well. Otherwise they could get out of sync
>> if your application crashes after the commit of the first,
>> node-modifying transaction.
> it seems confusing. if indexing actions wrapped inside a transaction
> and the transaction fails, will indexing action automatically get
> rollback? think this example:

Yes, it will be rolled back - that's the point of performing multiple 
operations in the same transaction.
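
A toy illustration of that all-or-nothing behaviour, in plain Java. The
two maps stand in for the node store and an index, and the Tx class below
is a deliberately simplified assumption (it just buffers writes); it is
not how Neo4j implements transactions internally.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class AtomicIndexSketch {

    final Map<String, String> nodes = new HashMap<>(); // stand-in for the node store
    final Map<String, String> index = new HashMap<>(); // stand-in for an index

    /** Toy transaction: buffers writes and applies them only after success(). */
    final class Tx {
        private final List<Runnable> pending = new ArrayList<>();
        private boolean successful;

        void setNode(String id, String name) {
            pending.add(() -> nodes.put(id, name));
        }

        void indexNode(String name, String id) {
            pending.add(() -> index.put(name, id));
        }

        void success() {
            successful = true;
        }

        void finish() {
            if (successful) {
                for (Runnable write : pending) {
                    write.run();
                }
            }
            pending.clear(); // without success(), every buffered write is dropped
        }
    }

    /** Node and index writes share one transaction; fail=true simulates an error. */
    void createUser(String id, String name, boolean fail) {
        Tx tx = new Tx();
        try {
            tx.setNode(id, name);
            tx.indexNode(name, id);
            if (fail) {
                throw new IllegalStateException("simulated failure");
            }
            tx.success();
        } catch (IllegalStateException e) {
            // swallowed for the demo; success() was never reached
        } finally {
            tx.finish();
        }
    }

    static String demo() {
        AtomicIndexSketch db = new AtomicIndexSketch();
        db.createUser("1", "alice", false); // both writes applied
        db.createUser("2", "bob", true);    // both writes discarded
        return db.nodes.size() + "/" + db.index.size(); // "1/1"
    }
}
```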

/anders

>
> class User{
> public void dosomething(){
> //node actions
> //index actions
> }
> }
>
> class Ext extends ServerPlugin{
> public action(){
> // get an array of users;
> Transaction tx = graphDb.beginTx();
> try
> {
>      ... // operations that work with the graph
>      for(User u:users){
>        u.dosomething();
>     }
>  tx.success();
> }
> finally
> {
>  tx.finish();
> }
> }
>
> }
>


Re: [Neo4j] 10 questions

2011-09-02 Thread Linan Wang
hi

On Fri, Sep 2, 2011 at 10:18 AM, Anders Nawroth wrote:
> Hi!
>
>>> All modifying operations need to be performed inside a transaction. In
>>> most cases it makes sense to perform multiple operations in a single
>>> transaction. For example in a web application it may be a good fit to
>>> wrap the handling of one request in a transaction. So if a method
>>> doesn't start a transaction, it just means that it's handled at a higher
>>> level in the application.
>> got this part. then do you suggest to leave indexing action outside of
>> node operation? say i have a domain model and it handles indexing
>> actions along with underlying nodes modification, then in an web app,
>> should I wrap only nodes operations inside transaction and do indexing
>> only after it success?
>
> Seems like the node and index modifications belong in the same
> transaction, to make sure any modifications to nodes are always
> reflected in the indexes as well. Otherwise they could get out of sync
> if your application crashes after the commit of the first,
> node-modifying transaction.
it seems confusing. if indexing actions are wrapped inside a transaction
and the transaction fails, will the indexing actions automatically get
rolled back? consider this example:

class User {
    public void dosomething() {
        // node actions
        // index actions
    }
}

class Ext extends ServerPlugin {
    public void action() {
        // get an array of users;
        Transaction tx = graphDb.beginTx();
        try {
            ... // operations that work with the graph
            for (User u : users) {
                u.dosomething();
            }
            tx.success(); // mark the transaction as successful
        } finally {
            tx.finish();  // commit, or roll back if success() was not called
        }
    }
}




-- 
Best regards

Linan Wang


Re: [Neo4j] 10 questions

2011-09-02 Thread Anders Nawroth
Hi!

>> All modifying operations need to be performed inside a transaction. In
>> most cases it makes sense to perform multiple operations in a single
>> transaction. For example in a web application it may be a good fit to
>> wrap the handling of one request in a transaction. So if a method
>> doesn't start a transaction, it just means that it's handled at a higher
>> level in the application.
> got this part. then do you suggest to leave indexing action outside of
> node operation? say i have a domain model and it handles indexing
> actions along with underlying nodes modification, then in an web app,
> should I wrap only nodes operations inside transaction and do indexing
> only after it success?

Seems like the node and index modifications belong in the same 
transaction, to make sure any modifications to nodes are always 
reflected in the indexes as well. Otherwise they could get out of sync 
if your application crashes after the commit of the first, 
node-modifying transaction.

/anders




Re: [Neo4j] 10 questions

2011-09-02 Thread Linan Wang
Hi anders,
thanks for the clarification.

On Fri, Sep 2, 2011 at 8:32 AM, Anders Nawroth  wrote:
> Hi!
>
> 2011-09-01 03:29, Linan Wang:
>> 2, does index operation add/remove/modify threadsafe, don't need
>> lock/transaction?
>> 3, does it simple property writing operations also need to be wrapped
>> inside transaction? if so, in the imdb exmaple
>> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
>> within transaction, nor put into a save method, do all setProperty
>> works inside a transaction?
>
> All modifying operations need to be performed inside a transaction. In
> most cases it makes sense to perform multiple operations in a single
> transaction. For example in a web application it may be a good fit to
> wrap the handling of one request in a transaction. So if a method
> doesn't start a transaction, it just means that it's handled at a higher
> level in the application.
got this part. then do you suggest leaving indexing actions outside of
node operations? say i have a domain model that handles indexing
actions along with the underlying node modifications; then in a web app,
should i wrap only the node operations inside a transaction and do the
indexing only after it succeeds?



-- 
Best regards

Linan Wang


Re: [Neo4j] 10 questions

2011-09-02 Thread Anders Nawroth
Hi!

2011-09-01 03:29, Linan Wang:
> 2, does index operation add/remove/modify threadsafe, don't need
> lock/transaction?
> 3, does it simple property writing operations also need to be wrapped
> inside transaction? if so, in the imdb exmaple
> tutor/domain/MovieImpl.java underlyingNode.setProperty is used neither
> within transaction, nor put into a save method, do all setProperty
> works inside a transaction?

All modifying operations need to be performed inside a transaction. In 
most cases it makes sense to perform multiple operations in a single 
transaction. For example in a web application it may be a good fit to 
wrap the handling of one request in a transaction. So if a method 
doesn't start a transaction, it just means that it's handled at a higher 
level in the application.
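
A rough sketch of that layering in plain Java: the domain-style setter
modifies data but never opens a transaction, while the request handler
one level up does. The boolean flag is only a toy stand-in for "a
transaction is currently open", and all names are made up for the example.

```java
import java.util.HashMap;
import java.util.Map;

class RequestTransactionSketch {

    private final Map<String, Object> properties = new HashMap<>();
    private boolean inTransaction; // toy flag standing in for the open transaction

    /** Domain-style setter: modifies data but does not start a transaction. */
    void setTitle(String title) {
        if (!inTransaction) {
            throw new IllegalStateException("modifying operation outside a transaction");
        }
        properties.put("title", title);
    }

    /** Higher level, e.g. one web request: all work runs in one transaction. */
    void handleRequest(Runnable work) {
        inTransaction = true;      // stand-in for beginTx()
        try {
            work.run();
        } finally {
            inTransaction = false; // stand-in for finish()
        }
    }

    static String demo() {
        RequestTransactionSketch movie = new RequestTransactionSketch();
        String outside;
        try {
            movie.setTitle("The Matrix"); // no surrounding transaction
            outside = "allowed";
        } catch (IllegalStateException e) {
            outside = "rejected";
        }
        movie.handleRequest(() -> movie.setTitle("The Matrix"));
        return outside + "," + movie.properties.get("title"); // "rejected,The Matrix"
    }
}
```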


/anders



Re: [Neo4j] 10 questions

2011-09-01 Thread Linan Wang
Thanks!
The Matrix example I was referring to is at
http://docs.neo4j.org/chunked/stable/indexing-add.html

On 1 Sep 2011, at 21:21, Peter Neubauer wrote:

> Hi there,
> would love to answer but had no time today, will try tomorrow. Also, very
> valid points about the docs, will try to put as much of the answers into the
> docs as possible. What Matrix example are you looking at?
> 
> Cheers,
> 
> /peter neubauer
> 
> On Thu, Sep 1, 2011 at 10:10 PM, wangii  wrote:
> 
>> anyone show some love ;)
>> seriously, the product is great but not the documentation. e.g. about the
>> general rule for choosing property or relationship: in the matrix example,
>> the year of a movie should be modelled as relationship since every
>> year lots of movies are produced.
>> 


Re: [Neo4j] 10 questions

2011-09-01 Thread Peter Neubauer
Hi there,
would love to answer but had no time today, will try tomorrow. Also, very
valid points about the docs, will try to put as many of the answers into the
docs as possible. Which Matrix example are you looking at?

Cheers,

/peter neubauer


On Thu, Sep 1, 2011 at 10:10 PM, wangii  wrote:

> anyone show some love ;)
> seriously, the product is great but not the documentation. e.g. about the
> general rule for choosing property or relationship: in the matrix example,
> the year of a movie should be modelled as relationship since every
> year lots of movies are produced.


Re: [Neo4j] 10 questions

2011-09-01 Thread wangii
anyone show some love ;)
seriously, the product is great but not the documentation. e.g. about the
general rule for choosing property or relationship: in the matrix example,
the year of a movie should be modelled as a relationship since every year lots
of movies are produced.

--
View this message in context: 
http://neo4j-community-discussions.438527.n3.nabble.com/Neo4j-10-questions-tp3300093p3302418.html
Sent from the Neo4j Community Discussions mailing list archive at Nabble.com.


[Neo4j] 10 questions

2011-08-31 Thread Linan Wang
hi,
got some questions i couldn't find simple answers to in the documents. i bet
some of them are pretty primitive, bear with me please.

1, what's the general rule for choosing between properties and relationships?
say a User lives in a City, which just contains a simple int id
value. to find the users living in a city, i can do a simple traversal of
all user nodes, or find the city node first, then collect all the
users. seems to me both ways work and share the same level of performance.
(am i right here?)
2, are index operations (add/remove/modify) threadsafe, so they don't need a
lock/transaction?
3, do simple property writing operations also need to be wrapped
inside a transaction? if so, in the imdb example
tutor/domain/MovieImpl.java, underlyingNode.setProperty is used neither
within a transaction nor put into a save method. do all setProperty calls
work inside a transaction?
4, what's the best practice for bulk insertion at runtime (not
seeding initial data)? i read a post saying that too many insertions within a
transaction may lead to memory problems. what's the proper amount of
insertions within a transaction?
5, is there a suggested max length for a string/array property? would it
be better to put it into sql?
6, say a facebook user may "like" thousands of things, and these
things are sparsely connected. in this case, should things be modeled
as nodes or as an array property?
7, where can i find an example of using domain models with a serverplugin?
i want to put my data in a standalone server and just use the
serverplugin, unmanaged extension. should i just put the domain models
into the same serverplugin jar?
8, the warning in the documentation about unmanaged extensions is
scary. what i can see is that people may use bad ways, instead of
Iterator/IteratorWrappers. any comment on this?
9, i'm not sure if it's trivial: find the users who are only 2
relationships away (using the twitter example: my followees' followers),
live in the same city, grouped by age and gender. also retrieve all their
followees. i want to do the traversal in java, where can i find an
example?
10, i've had horrible experiences tuning jvm options. has neo4j
been run on the Zing JVM or the hp nonstop jvm? are they better options?

thanks in advance


Best regards

Linan Wang