Re: [Neo4j] Arnoldi iteration

2010-06-02 Thread Peter Neubauer
Lorenzo,

On Mon, May 31, 2010 at 2:33 PM, Lorenzo Livi lorenz.l...@gmail.com wrote:
 I'll give the 0.6-snapshot version a try and I'll let you know.
yes, please get back with feedback! Tobias and Mattias have been
looking more closely at some of the algos in the graph-algo package,
but not all of them are fully tested yet, so your feedback is greatly
appreciated.


 The last thing: the power method is not guaranteed to always converge
 ... especially if the graph is directed (Neo4j graphs are always directed).
 A method like Arnoldi iteration is therefore necessary, IMHO.
 I've implemented the simplest centrality measure, degree centrality,
 and I should also develop PageRank and/or HITS (though not now ..). Maybe
 you would be interested in these algos (for now, the degree centrality).
 Let me know.
Yes, we are, and we could include them in the graph-algo package. Just
sign the CLA and we can get going on this:
http://wiki.neo4j.org/content/About_Contributor_License_Agreement


/peter
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Node creation limit

2010-06-02 Thread Mattias Persson
Exactly, the problem is most likely that you try to insert all your
stuff in one transaction. All data for a transaction is kept in memory
until committed, so really big transactions can fill your entire
heap. Try to group 10k operations or so per transaction for big
insertions, or use the batch inserter.

Links:
http://wiki.neo4j.org/content/Transactions#Big_transactions
http://wiki.neo4j.org/content/Batch_Insert
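
A minimal sketch of such grouping, assuming an already-started
GraphDatabaseService in graphDb and a total operation count in count
(both hypothetical names):

import org.neo4j.graphdb.GraphDatabaseService;
import org.neo4j.graphdb.Transaction;

// commit every 10k operations instead of holding one huge transaction
Transaction tx = graphDb.beginTx();
try
{
    for ( int i = 0; i < count; i++ )
    {
        graphDb.createNode();
        if ( ( i + 1 ) % 10000 == 0 )
        {
            tx.success();
            tx.finish();            // commits and frees the tx state from the heap
            tx = graphDb.beginTx(); // start the next chunk
        }
    }
    tx.success();
}
finally
{
    tx.finish();
}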

2010/6/2, Laurent Laborde kerdez...@gmail.com:
 On Wed, Jun 2, 2010 at 3:50 AM, Biren Gandhi biren.gan...@gmail.com wrote:

 Is there any limit on the number of nodes that can be created in a neo4j
 instance? Any other tips?

 I created hundreds of millions of nodes without problems, but it was
 split into many transactions.

 --
 Laurent "ker2x" Laborde
 Sysadmin & DBA at http://www.over-blog.com/
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Compacting files?

2010-06-02 Thread Alex Averbuch
Hey,
Is there a way to compact the data stores (relationships, nodes, properties)
in Neo4j?
I don't mind if it's a manual operation.

I have some datasets that have had a lot of relationships removed from them
but the file is still the same size, so I'm guessing there are a lot of
holes in this file at the moment.

Would this be hurting lookup performance?

Cheers,
Alex
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] PHP REST API client

2010-06-02 Thread Anders Nawroth
Hi!

This is awesome!

I tried it out and have a suggestion: to make the semantics for storing 
NULLs consistent you could change the PropertyContainer::__set method 
to remove the property if it exists when trying to set it to NULL. This 
will make sure NULL is returned when you try to read the property. 
Something along the lines of:

public function __set($k, $v)
{
    // because neo doesn't store NULLs
    if ($v === NULL)
    {
        if (array_key_exists($k, $this->_data))
        {
            unset($this->_data[$k]);
        }
    }
    else
    {
        $this->_data[$k] = $v;
    }
}

For some reason calling Node::save twice gives me an exception, so I 
can't update a node after the first save and save it again with new 
property values. Maybe a bug?


/anders


On 06/02/2010 01:00 AM, Alastair James wrote:
 Hi there!

 Sorry, been a bit quiet on the PHP REST API front for a few weeks.

 I will be adding some features this week (traversals etc.), but in the
 meantime, I have (finally) written up a little blog post detailing how the
 current version works!

 http://onewheeledbicycle.com/2010/06/01/getting-started-with-neo4j-rest-api-and-php/

 Stay tuned for more!

 Alastair
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] [Neo] TransactionEventHandler and Spring transaction handling

2010-06-02 Thread Johan Svensson
Antonis,

Just committed some bug fixes in the event framework and hopefully
this also solves the problem you experienced when using Spring. Could
you please try the latest neo4j-kernel 1.1-SNAPSHOT to see if it
works?

To answer your other questions: the handler is called in the same thread,
and you can access node properties in the afterCommit() call (we changed
this so that reads without a running transaction are possible).

Regards,
Johan

On Thu, May 20, 2010 at 2:56 PM, Antonis Lempesis ant...@di.uoa.gr wrote:
 To further clarify, I ran 2 tests. In the first test, my objects were
 configured using Spring + I had the @Transactional
 annotation on the test method. In the second test, I configured the same
 objects manually and also started and
 committed the transaction before and after calling the test method. In
 both cases, my handler got a TransactionData
 object (not null), but in the second case
 tData.assignedNodeProperties().hasNext() returned true while in the first
 it returned false.

 thanks for your support,
 Antonis

 PS 2 questions: is the handler called in a different thread? And, in
 the afterCommit() method, can I access the node properties
 in the TransactionData object? Since the transaction is committed (I
 guess finished), shouldn't I get a NotInTransaction
 exception?

 On 5/20/10 3:38 PM, Johan Svensson wrote:
 Hi,

 I have not tried to reproduce this but just looking at the code I
 think it is a bug so thanks for reporting it!

 The synchronization hook that gathers the transaction data gets
 registered in the call to GraphDatabaseService.beginTx() but when
 using Spring (with that configuration) UserTransaction (old JTA) will
 be called directly so no events will be collected.

 Will fix this very soon.

 -Johan

 On Wed, May 19, 2010 at 5:49 PM, Antonis Lempesisant...@di.uoa.gr  wrote:

 Hello all,

    I have set up spring to handle transactions for neo4j (according to
 the imdb example) and it works fine. When
 I read about the new events framework, I checked out the latest revision
 (4421) and tried to register my
 TransactionEventHandler that simply prints the number of created nodes.
 The weird thing is that when I test
 this in a simple junit test case, the TransactionData I get contains the
 correct data. When I do the same thing
 using the spring configuration, the TransactionData is empty. Any ideas?

 Thanks,
 Antonis
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Javier de la Rosa
I'm developing traversal support for the Python REST client. The
underlying idea for me is to maintain compatibility with neo4j.py (a
really hard issue), but the traversals made me think about some
questions:
1. How can I implement support for isStopNode or isReturnable in the REST
Service? I guess that for isStopNode I can use the prune evaluator, but
what about isReturnable: must I use the return filter? Why does this
parameter have no body attribute in order to define a function?
2. If the max depth parameter is not set, is it equivalent to
STOP_AT_END_OF_GRAPH? If that's not true, how can I get a behaviour like
STOP_AT_END_OF_GRAPH?

Sorry, perhaps they are dumb questions, but I need some light, please.

Best regards.

-- 
Javier de la Rosa
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Mattias Persson
2010/6/2 Javier de la Rosa ver...@gmail.com:
 I'm developing traversal support for the Python REST client. The
 underlying idea for me is to maintain compatibility with neo4j.py (a
 really hard issue), but the traversals made me think about some
 questions:
 1. How can I implement support for isStopNode or isReturnable in the REST
 Service? I guess that for isStopNode I can use the prune evaluator, but
 what about isReturnable: must I use the return filter? Why does this
 parameter have no body attribute in order to define a function?

You can specify the return filter just as you do the prune evaluator, like:

...
"return filter": {
    "language": "javascript",
    "body": "position.node().getProperty( 'name' ).equals( 'Javier' )"
}
...

 2. If the max depth parameter is not set, is it equivalent to
 STOP_AT_END_OF_GRAPH? If that's not true, how can I get a behaviour like
 STOP_AT_END_OF_GRAPH?

If max depth isn't supplied, a max depth of 1 is assumed. To get the
STOP_AT_END_OF_GRAPH behaviour you should do:

...
"prune evaluator": {
    "language": "builtin",
    "value": "none"
}

which converts into PruneEvaluator.NONE.
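
For comparison, roughly the same thing on the embedded Java side; a
sketch, assuming the 1.x TraversalDescription API (the REST layer is
its own implementation):

import org.neo4j.graphdb.Path;
import org.neo4j.graphdb.traversal.PruneEvaluator;
import org.neo4j.graphdb.traversal.TraversalDescription;
import org.neo4j.helpers.Predicate;
import org.neo4j.kernel.Traversal;

TraversalDescription traversal = Traversal.description()
        .prune( PruneEvaluator.NONE )      // the "builtin"/"none" prune above
        .filter( new Predicate<Path>()     // the javascript return filter above
        {
            public boolean accept( Path position )
            {
                return "Javier".equals(
                        position.endNode().getProperty( "name", null ) );
            }
        } );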


 Sorry, perhaps they are dumb questions, but I need some light, please.
Not at all, I hope this helps you!

 Best regards.

 --
 Javier de la Rosa
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Johan Svensson
Alex,

You are correct about the holes in the store file and I would
suggest you export the data and then re-import it again. Neo4j is not
optimized for the use case where more data is removed than added over
time.

It would be possible to write a compacting utility, but since this is
not a very common use case I think it is better to put that time into
producing a generic export/import dump utility. The plan is to get an
export/import utility in place as soon as possible, so any input on how
that should work, what format to use etc. would be great.

-Johan

On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com wrote:
 Hey,
 Is there a way to compact the data stores (relationships, nodes, properties)
 in Neo4j?
 I don't mind if its a manual operation.

 I have some datasets that have had a lot of relationships removed from them
 but the file is still the same size, so I'm guessing there are a lot of
 holes in this file at the moment.

 Would this be hurting lookup performance?

 Cheers,
 Alex
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Alex Averbuch
Hi Johan,
Do you mean a utility that creates a new Neo4j instance and copies all
entities into it from an old Neo4j instance?
That's definitely no problem.

I've written a bit of import/export code in my graph_gen_utils branch.

I have a GraphReader interface which is generic and only contains getNodes()
& getRels() method definitions, which return iterators. The iterators are
of type NodeData, basically a HashMap of HashMaps for simplicity.
1 NodeData can contain 1 Node with Properties and all its Relationships
with Properties.
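
Roughly, the shape described is (a sketch; the names follow the
description above, the actual code in the graph_gen_utils branch may
differ):

import java.util.HashMap;
import java.util.Iterator;

// One NodeData bundles a node's properties plus the properties of each
// of its relationships, keyed by some relationship identifier.
class NodeData
{
    final HashMap<String, Object> properties = new HashMap<String, Object>();
    final HashMap<String, HashMap<String, Object>> relationships =
            new HashMap<String, HashMap<String, Object>>();
}

interface GraphReader
{
    Iterator<NodeData> getNodes();
    Iterator<NodeData> getRels();
}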

Then I implemented various readers that I needed during the thesis.
For example, ChacoParser, GMLParser, TwitterParser (proprietary format), etc.,
which all implement GraphReader.

Similarly for GraphWriter...

That made it easy for me to add any parser and use my existing methods for
buffering multiple entities into Transactions, etc.

It's far from perfect, but might give an idea or two.

Maybe some of that could be reused, although someone would definitely need
to evaluate the quality of my code first.
Blueprints has some import functionality too (.graphml format for example).

Cheers,
Alex

On Wed, Jun 2, 2010 at 2:30 PM, Johan Svensson jo...@neotechnology.comwrote:

 Alex,

 You are correct about the holes in the store file and I would
 suggest you export the data and then re-import it again. Neo4j is not
 optimized for the use case where more data is removed than added over
 time.

 It would be possible to write a compacting utility, but since this is
 not a very common use case I think it is better to put that time into
 producing a generic export/import dump utility. The plan is to get an
 export/import utility in place as soon as possible, so any input on how
 that should work, what format to use etc. would be great.

 -Johan

 On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com
 wrote:
  Hey,
  Is there a way to compact the data stores (relationships, nodes,
 properties)
  in Neo4j?
  I don't mind if its a manual operation.
 
  I have some datasets that have had a lot of relationships removed from
 them
  but the file is still the same size, so I'm guessing there are a lot of
  holes in this file at the moment.
 
  Would this be hurting lookup performance?
 
  Cheers,
  Alex
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Javier de la Rosa
Thank you for your clarification.

On 2 June 2010 13:31, Mattias Persson matt...@neotechnology.com wrote:

 "return filter": {
     "language": "javascript",
     "body": "position.node().getProperty( 'name' ).equals( 'Javier' )"
 }


Will we see "language": "python" in the near future?


-- 
Javier de la Rosa
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Javier de la Rosa
And one more question: what's the meaning of the "uniqueness": "node path"
parameter? What values does it support? What is the equivalent in neo4j.py?


-- 
Javier de la Rosa
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] PHP REST API client

2010-06-02 Thread Alastair James
Hi!

I tried it out and have a suggestion: to make the semantics for storing
 NULLs consistent you could change the PropertyContainer::__set method
 to remove the property if it exists when trying to set it to NULL.


Excellent idea! I will add ASAP.



 For some reason calling Node::save twice gives me an exception, so I
 can't update a node after the first save and save it again with new
 property values. Maybe a bug?

Looks like it. I will fix ASAP!

Alastair



 /anders


 On 06/02/2010 01:00 AM, Alastair James wrote:
  Hi there!
 
  Sorry, been a bit quiet on the PHP REST API front for a few weeks.
 
  I will be adding some features this week (traversals etc.), but in the
  meantime, I have (finally) written up a little blog post detailing how the
  current version works!
 
 
 http://onewheeledbicycle.com/2010/06/01/getting-started-with-neo4j-rest-api-and-php/
 
  Stay tuned for more!
 
  Alastair
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Dr Alastair James
CTO James Media Group
www.jamesmedia.net

www.adnet-media.net
www.worldreviewer.com
www.thehotelguru.com

'Inspiring Travel'

IATA 96012851
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Mattias Persson
I don't think the Python bindings (or any other bindings) have caught up
with the new traversal framework yet. Uniqueness is all about when to visit
a node and when not to. If the uniqueness is NODE_GLOBAL, a node
won't be visited more than once in a traversal. NODE_PATH means
that a node won't be visited again in the current path (the path from
the start node to wherever the traverser is at the moment) if that
node is already in the current path. It may well be visited again in
another path.

Also see the javadoc of Uniqueness at
http://components.neo4j.org/neo4j-kernel/apidocs/org/neo4j/graphdb/traversal/Uniqueness.html
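
As a small Java-side illustration (a sketch, assuming the 1.x kernel
Traversal factory):

import org.neo4j.graphdb.traversal.TraversalDescription;
import org.neo4j.graphdb.traversal.Uniqueness;
import org.neo4j.kernel.Traversal;

// NODE_GLOBAL: each node is visited at most once in the whole traversal
TraversalDescription global = Traversal.description()
        .uniqueness( Uniqueness.NODE_GLOBAL );

// NODE_PATH: a node may not repeat within one path, but may show up in others
TraversalDescription perPath = Traversal.description()
        .uniqueness( Uniqueness.NODE_PATH );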

2010/6/2 Javier de la Rosa ver...@gmail.com:
 And one more question: what's the meaning of the "uniqueness": "node path"
 parameter? What values does it support? What is the equivalent in neo4j.py?


 --
 Javier de la Rosa
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Mattias Persson
2010/6/2 Javier de la Rosa ver...@gmail.com:
 Thank you for your clarification.

 On 2 June 2010 13:31, Mattias Persson matt...@neotechnology.com wrote:

 "return filter": {
     "language": "javascript",
     "body": "position.node().getProperty( 'name' ).equals( 'Javier' )"
 }


 Will we see "language": "python" in the near future?
Yep, I very much hope so!


 --
 Javier de la Rosa
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


[Neo4j] neo4j-utils

2010-06-02 Thread Mattias Persson
Is anyone out there using the neo4j-utils component,
http://components.neo4j.org/neo4j-utils/ ? I'm the one responsible for
creating the (somewhat messy) utilities in there. Something just hit
me when looking at it: most of the public methods in the code
(although not all) that perform some write operation on the graph wrap
the code in their own transaction. I find that a little off, since
it's good to be explicit about the scopes of your transactions.

So I was planning to remove all such transaction wrappings and also
remove a lot of GraphDatabaseService references from constructors,
since you can now reach the graph database via
http://components.neo4j.org/neo4j-kernel/apidocs/org/neo4j/graphdb/PropertyContainer.html#getGraphDatabase()
, making that extra reference unnecessary.

Does anyone have an opinion about all this?

-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Craig Taverner
I've thought about this briefly, and somehow it actually seems easier (to
me) to consider a compacting (defragmenting) algorithm than a generic
import/export. The problem is that in both cases you have to deal with the
same issue: the node/relationship IDs are changed. For the import/export
this means you need another way to store the connectedness, so you export
the entire graph into another format that maintains the connectedness in
some way (perhaps a whole new set of IDs), and then re-import it again.
Getting a very complex, large and cyclic graph to work like this seems hard
to me because you have to maintain the complete identity map in memory
during the export (which makes the export unscalable).

But de-fragmenting can be done by changing IDs in batches, breaking the
problem down into smaller steps, and never needing to deal with the entire
graph at the same time at any point. For example, take the node table, scan
from the base collecting free IDs. Once you have a decent block, pull that
many nodes down from above in the table. Since you keep the entire set in
memory, you maintain the mapping of old to new IDs and can use that to 'fix'
the relationship table also. Rinse and repeat :-)
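
A toy model of that pass (self-contained; nothing like this exists in
Neo4j itself, it only illustrates the remapping idea):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompactionSketch
{
    public static void main( String[] args )
    {
        boolean[] used = { true, false, true, false, true, true }; // node slots
        List<long[]> rels = new ArrayList<long[]>();
        rels.add( new long[] { 4, 5 } ); // one relationship: endpoints 4 and 5

        // scan free slots from the base and pair them with the highest used
        // ids; a real pass would cap the batch size to bound the in-memory map
        Map<Long, Long> remap = new HashMap<Long, Long>();
        int free = 0, high = used.length - 1;
        while ( true )
        {
            while ( free < used.length && used[free] ) free++;
            while ( high >= 0 && !used[high] ) high--;
            if ( free >= high ) break;
            used[free] = true;   // "move" the record down into the hole
            used[high] = false;
            remap.put( (long) high, (long) free );
        }

        // 'fix' the relationship table using the in-memory mapping
        for ( long[] rel : rels )
        {
            for ( int i = 0; i < 2; i++ )
            {
                Long moved = remap.get( rel[i] );
                if ( moved != null ) rel[i] = moved;
            }
        }
        System.out.println( remap ); // prints {4=3, 5=1}
    }
}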

One option for the entire graph export that might work for most datasets
that have predominantly tree structures is to export to a common tree
format, like JSON (or XML). This maintains most of the relationships
without requiring any memory of id mappings. The less common cyclic
connections can be maintained with temporary IDs and a table of such IDs
maintained in memory (assuming it is much smaller than the total graph).
This can allow complete export of very large graphs if the temp id table
does indeed remain small. Probably true for many datasets.

On Wed, Jun 2, 2010 at 2:30 PM, Johan Svensson jo...@neotechnology.comwrote:

 Alex,

 You are correct about the holes in the store file and I would
 suggest you export the data and then re-import it again. Neo4j is not
 optimized for the use case where more data is removed than added over
 time.

 It would be possible to write a compacting utility, but since this is
 not a very common use case I think it is better to put that time into
 producing a generic export/import dump utility. The plan is to get an
 export/import utility in place as soon as possible, so any input on how
 that should work, what format to use etc. would be great.

 -Johan

 On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com
 wrote:
  Hey,
  Is there a way to compact the data stores (relationships, nodes,
 properties)
  in Neo4j?
  I don't mind if its a manual operation.
 
  I have some datasets that have had a lot of relationships removed from
 them
  but the file is still the same size, so I'm guessing there are a lot of
  holes in this file at the moment.
 
  Would this be hurting lookup performance?
 
  Cheers,
  Alex
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Traversals in REST Service vs. Traversals in neo4j.py

2010-06-02 Thread Javier de la Rosa
On 2 June 2010 16:21, Mattias Persson matt...@neotechnology.com wrote:

 I don't think the Python bindings (or any other bindings) have caught up
 with the new traversal framework yet. Uniqueness is all about when to visit
 a node and when not to. If the uniqueness is NODE_GLOBAL, a node
 won't be visited more than once in a traversal. NODE_PATH means
 that a node won't be visited again in the current path (the path from
 the start node to wherever the traverser is at the moment) if that
 node is already in the current path. It may well be visited again in
 another path.

 Also see the javadoc of Uniqueness at

 http://components.neo4j.org/neo4j-kernel/apidocs/org/neo4j/graphdb/traversal/Uniqueness.html


Great! Thank you so much.



-- 
Javier de la Rosa
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Alex Averbuch
Hi Craig,
Just a quick note about needing to keep all IDs in memory during an
import/export operation. The way I'm doing it at the moment, it's not
necessary to do so.

When exporting:
Write IDs to the exported format (this could be JSON, XML, GML, GraphML,
etc.)

When importing:
First import all Nodes; this is easy to do in most formats (all that I've
tried).
While importing Nodes, store & index 1 extra property in every Node; I call
this GID, for global ID.
Next import all Relationships, using the GID and Lucene to locate the start
Node & end Node.
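
A minimal sketch of the import side, assuming the 1.x batch-inserter
API; gid, startGid, endGid, the store path and the relationship type
name are all hypothetical values read from the exported data:

import java.util.HashMap;
import java.util.Map;
import org.neo4j.graphdb.DynamicRelationshipType;
import org.neo4j.index.lucene.LuceneIndexBatchInserter;
import org.neo4j.index.lucene.LuceneIndexBatchInserterImpl;
import org.neo4j.kernel.impl.batchinsert.BatchInserter;
import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

BatchInserter inserter = new BatchInserterImpl( "target/graphdb" );
LuceneIndexBatchInserter index = new LuceneIndexBatchInserterImpl( inserter );

// pass 1: create each node and index it by the GID from the export file
Map<String, Object> props = new HashMap<String, Object>();
props.put( "gid", gid );
long nodeId = inserter.createNode( props );
index.index( nodeId, "gid", gid );

// pass 2: resolve endpoints through the index and create the relationship
index.optimize();
long start = index.getSingleNode( "gid", startGid );
long end = index.getSingleNode( "gid", endGid );
inserter.createRelationship( start, end,
        DynamicRelationshipType.withName( "RELATED" ), null );

index.shutdown();
inserter.shutdown();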

The biggest graph I've tried with this approach had 2.5 million Nodes &
250 million Relationships.
It took quite a long time, but much of the slowness was because it was
performed on an old laptop with 2GB of RAM, I didn't give the BatchInserter
a properties file, and I used default JVM parameters.

There is at least one obvious downside to this though, and that is that you
pollute the dataset with GID properties.

Alex

On Wed, Jun 2, 2010 at 5:53 PM, Craig Taverner cr...@amanzi.com wrote:

 I've thought about this briefly, and somehow it actually seems easier (to
 me) to consider a compacting (defragmenting) algorithm than a generic
 import/export. The problem is that in both cases you have to deal with the
 same issue, the node/relationship ID's are changed. For the import/export
 this means you need another way to store the connectedness, so you export
 the entire graph into another format that maintains the connectedness in
 some way (perhaps a whole new set of IDs), and the re-import it again.
 Getting a very complex, large and cyclic graph to work like this seems hard
 to me because you have to maintain a complete table in memory of the
 identity map during the export (which makes the export unscalable).

 But de-fragmenting can be done by changing ID's in batches, breaking the
 problem down into smaller steps, and never neading to deal with the entire
 graph at the same time at any point. For example, take the node table, scan
 from the base collecting free ID's. Once you have a decent block, pull that
 many nodes down from above in the table. Since you keep the entire set in
 memory, you maintain the mapping of old-new and can use that to 'fix' the
 relationship table also. Rinse and repeat :-)

 One option for the entire graph export that might work for most datasets
 that have predominantly tree structures is to export to a common tree
 format, like JSON (or,  XML). This maintains most of the relationships
 without requiring any memory of id mappings. The less common cyclic
 connections can be maintained with temporary ID's and a table of such ID's
 maintained in memory (assuming it is much smaller than the total graph).
 This can allow complete export of very large graphs if the temp id table
 does indeed remain small. Probably true for many datasets.

 On Wed, Jun 2, 2010 at 2:30 PM, Johan Svensson jo...@neotechnology.com
 wrote:

  Alex,
 
  You are correct about the holes in the store file and I would
  suggest you export the data and then re-import it again. Neo4j is not
  optimized for the use case where more data is removed than added over
  time.
 
  It would be possible to write a compacting utility but since this is
  not a very common use case I think it is better to put that time into
  producing a generic export/import dump utility. The plan is to get an
  export/import utility in place as soon as possible so any input on how
  that should work, what format to use etc. would be great.
 
  -Johan
 
  On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com
  wrote:
   Hey,
   Is there a way to compact the data stores (relationships, nodes,
  properties)
   in Neo4j?
   I don't mind if its a manual operation.
  
   I have some datasets that have had a lot of relationships removed from
  them
   but the file is still the same size, so I'm guessing there are a lot of
   holes in this file at the moment.
  
   Would this be hurting lookup performance?
  
   Cheers,
   Alex
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Craig Taverner
Yes. I guess you cannot escape an old-to-new ID map (or in your case
ID-to-GID). I think it is possible to maintain that outside the database:

   - In memory, as I suggested, but only valid under some circumstances
   - On disk, and Lucene is a good idea here. Why not index with Lucene, but
   without storing the property on the node?

Since the index method takes the node, the property and the value, I assume
it might be possible to index the property and value without them actually
being real properties and values? I've not tried, but this way the graph is
cleaner, and we can delete the Lucene index afterwards!
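
For what it's worth, with the IndexService-style API that would look
something like the sketch below (untested, as noted above); index() is
simply handed a key and a value, which need not exist as a real
property on the node. graphDb, node and gid are hypothetical here:

import org.neo4j.graphdb.Node;
import org.neo4j.index.IndexService;
import org.neo4j.index.lucene.LuceneIndexService;

IndexService index = new LuceneIndexService( graphDb );
index.index( node, "gid", gid );                 // node.setProperty() never called
Node found = index.getSingleNode( "gid", gid );  // resolves during the re-import
// ...when the import is done, index.shutdown() and delete the index directory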

On Wed, Jun 2, 2010 at 6:12 PM, Alex Averbuch alex.averb...@gmail.comwrote:

 Hi Craig,
 Just a quick note about needing to keep all IDs in memory during an
 import/export operation. The way I'm doing it at the moment it's not
 necessary to do so.

 When exporting:
 Write IDs to the exported format (this could be JSON, XML, GML, GraphML,
 etc)

 When importing:
 First import all Nodes, this is easy to do in most formats (all that I've
 tried).
 While importing Nodes, store & index 1 extra property in every Node, I call
 this GID for global ID.
 Next import all Relationships, using the GID and Lucene to locate start
 Node & end Node.

 The biggest graph I've tried with this approach had 2.5 million Nodes &
 250 million Relationships.
 It took a quite a long time, but much of the slowness was because it was
 performed on an old laptop with 2GB of RAM, I didn't give the BatchInserter
 a properties file, and I used default JVM parameters.

 There is at least one obvious downside to this though, and that is that you
 pollute the dataset with GID properties.

 Alex

 On Wed, Jun 2, 2010 at 5:53 PM, Craig Taverner cr...@amanzi.com wrote:

  I've thought about this briefly, and somehow it actually seems easier (to
  me) to consider a compacting (defragmenting) algorithm than a generic
  import/export. The problem is that in both cases you have to deal with
 the
  same issue, the node/relationship ID's are changed. For the import/export
  this means you need another way to store the connectedness, so you export
  the entire graph into another format that maintains the connectedness in
  some way (perhaps a whole new set of IDs), and then re-import it again.
  Getting a very complex, large and cyclic graph to work like this seems
 hard
  to me because you have to maintain a complete table in memory of the
  identity map during the export (which makes the export unscalable).
 
  But de-fragmenting can be done by changing ID's in batches, breaking the
  problem down into smaller steps, and never needing to deal with the
 entire
  graph at the same time at any point. For example, take the node table,
 scan
  from the base collecting free ID's. Once you have a decent block, pull
 that
  many nodes down from above in the table. Since you keep the entire set in
  memory, you maintain the mapping of old-new and can use that to 'fix' the
  relationship table also. Rinse and repeat :-)
 
  One option for the entire graph export that might work for most datasets
  that have predominantly tree structures is to export to a common tree
  format, like JSON (or XML). This maintains most of the
 relationships
  without requiring any memory of id mappings. The less common cyclic
  connections can be maintained with temporary ID's and a table of such
 ID's
  maintained in memory (assuming it is much smaller than the total graph).
  This can allow complete export of very large graphs if the temp id table
  does indeed remain small. Probably true for many datasets.
 
  On Wed, Jun 2, 2010 at 2:30 PM, Johan Svensson jo...@neotechnology.com
  wrote:
 
   Alex,
  
   You are correct about the holes in the store file and I would
   suggest you export the data and then re-import it again. Neo4j is not
   optimized for the use case where more data is removed than added over
   time.
  
   It would be possible to write a compacting utility but since this is
   not a very common use case I think it is better to put that time into
   producing a generic export/import dump utility. The plan is to get an
   export/import utility in place as soon as possible so any input on how
   that should work, what format to use etc. would be great.
  
   -Johan
  
   On Wed, Jun 2, 2010 at 9:23 AM, Alex Averbuch alex.averb...@gmail.com
 
   wrote:
Hey,
Is there a way to compact the data stores (relationships, nodes,
   properties)
in Neo4j?
I don't mind if its a manual operation.
   
I have some datasets that have had a lot of relationships removed
 from
   them
but the file is still the same size, so I'm guessing there are a lot
 of
holes in this file at the moment.
   
Would this be hurting lookup performance?
   
Cheers,
Alex
   ___
   Neo4j mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user
  
  

Re: [Neo4j] Compacting files?

2010-06-02 Thread Alex Averbuch
 - On disk, and Lucene is a good idea here. Why not index with Lucene,
but without storing the property on the node?

I like it!

This sounds like a cleaner approach than my current one, and (I'm not sure
about how to do this either) may be no more complex than the way I'm doing
it.
Like you say, we can delete the Lucene index afterwards... or just the
Lucene folder associated with that one property.

I'm writing exams, thesis reports, and thesis opposition reports for the
next month so I won't have time to try it out.
If you give it a try I'd be interested in hearing how the Lucene-only
approach works out, though.

On Wed, Jun 2, 2010 at 6:42 PM, Craig Taverner cr...@amanzi.com wrote:

 Yes. I guess you cannot escape an old-new ID map (or in your case ID-GID).
 I
 think it is possible to maintain that outside the database:

   - In memory, as I suggested, but only valid under some circumstances
   - On disk, and Lucene is a good idea here. Why not index with Lucene, but
   without storing the property on the node?

  Since the index method takes the node, the property and the value, I assume
  it might be possible to index the property and value without them actually
  being real properties and values? I've not tried, but this way the graph is
  cleaner, and we can delete the Lucene index afterwards!

 On Wed, Jun 2, 2010 at 6:12 PM, Alex Averbuch alex.averb...@gmail.com
 wrote:

  Hi Craig,
  Just a quick note about needing to keep all IDs in memory during an
  import/export operation. The way I'm doing it at the moment it's not
  necessary to do so.
 
  When exporting:
  Write IDs to the exported format (this could be JSON, XML, GML, GraphML,
  etc)
 
  When importing:
  First import all Nodes, this is easy to do in most formats (all that I've
  tried).
  While importing Nodes, store  index 1 extra property in every Node, I
 call
  this GID for global ID.
  Next import all Relationships, using the GID and Lucene to locate start
  Node
   end Node.
 
  The biggest graph I've tried with this approach had 2.5million Nodes 
  250million Relationships.
  It took a quite a long time, but much of the slowness was because it was
  performed on an old laptop with 2GB of RAM, I didn't give the
 BatchInserter
  a properties file, and I used default JVM parameters.
 
  There is at least one obvious downside to this though, and that is that
 you
  pollute the dataset with GID properties.
 
  Alex
 
  On Wed, Jun 2, 2010 at 5:53 PM, Craig Taverner cr...@amanzi.com wrote:
 
   I've thought about this briefly, and somehow it actually seems easier
 (to
   me) to consider a compacting (defragmenting) algorithm than a generic
   import/export. The problem is that in both cases you have to deal with
  the
   same issue, the node/relationship ID's are changed. For the
 import/export
   this means you need another way to store the connectedness, so you
 export
   the entire graph into another format that maintains the connectedness
 in
    some way (perhaps a whole new set of IDs), and then re-import it again.
   Getting a very complex, large and cyclic graph to work like this seems
  hard
   to me because you have to maintain a complete table in memory of the
   identity map during the export (which makes the export unscalable).
  
   But de-fragmenting can be done by changing ID's in batches, breaking
 the
    problem down into smaller steps, and never needing to deal with the
  entire
   graph at the same time at any point. For example, take the node table,
  scan
   from the base collecting free ID's. Once you have a decent block, pull
  that
   many nodes down from above in the table. Since you keep the entire set
 in
   memory, you maintain the mapping of old-new and can use that to 'fix'
 the
   relationship table also. Rinse and repeat :-)
  
   One option for the entire graph export that might work for most
 datasets
   that have predominantly tree structures is to export to a common tree
    format, like JSON (or XML). This maintains most of the
  relationships
   without requiring any memory of id mappings. The less common cyclic
   connections can be maintained with temporary ID's and a table of such
  ID's
   maintained in memory (assuming it is much smaller than the total
 graph).
   This can allow complete export of very large graphs if the temp id
 table
   does indeed remain small. Probably true for many datasets.
  
   On Wed, Jun 2, 2010 at 2:30 PM, Johan Svensson 
 jo...@neotechnology.com
   wrote:
  
Alex,
   
You are correct about the holes in the store file and I would
suggest you export the data and then re-import it again. Neo4j is not
 optimized for the use case where more data is removed than added over
time.
   
It would be possible to write a compacting utility but since this is
not a very common use case I think it is better to put that time into
 producing a generic export/import dump utility. The plan is to get an
export/import utility in place as soon as possible so any 

[Neo4j] Tell neo to not reuse ID's

2010-06-02 Thread Martin Neumann
Hej,

Is it somehow possible to tell Neo4j not to reuse IDs at all?

I'm running some experiments on Neo4j where I add and delete nodes
and relationships. To make sure that I can repeat the same
experiment, I create a log containing the IDs of the nodes I want to
delete. For a rerun to work, each node I add
has to get the same ID in each experiment.
If IDs can be reused that is not always the case, which is why I need to
turn reuse off or work around it.

Hoping for your help,
cheers Martin
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Node creation limit

2010-06-02 Thread Biren Gandhi
Thanks. Big transactions were indeed problematic. Splitting them into
smaller chunks did the trick.

I'm still disappointed by the on-disk size of a minimal node without any
relationships or attributes. For 500K nodes, it is taking 80MB of space (160
bytes/node) and for 1M objects it is consuming 160MB (again 160 bytes/node).
Is this normal?

4.0K    active_tx_log
12K     lucene
12K     lucene-fulltext
4.0K    neostore
4.0K    neostore.id
4.4M    neostore.nodestore.db
4.0K    neostore.nodestore.db.id
12M     neostore.propertystore.db
4.0K    neostore.propertystore.db.arrays
4.0K    neostore.propertystore.db.arrays.id
4.0K    neostore.propertystore.db.id
4.0K    neostore.propertystore.db.index
4.0K    neostore.propertystore.db.index.id
4.0K    neostore.propertystore.db.index.keys
4.0K    neostore.propertystore.db.index.keys.id
64M     neostore.propertystore.db.strings
4.0K    neostore.propertystore.db.strings.id
4.0K    neostore.relationshipstore.db
4.0K    neostore.relationshipstore.db.id
4.0K    neostore.relationshiptypestore.db
4.0K    neostore.relationshiptypestore.db.id
4.0K    neostore.relationshiptypestore.db.names
4.0K    neostore.relationshiptypestore.db.names.id
4.0K    nioneo_logical.log.active
4.0K    tm_tx_log.1
80M     total


On Wed, Jun 2, 2010 at 12:17 AM, Mattias Persson
matt...@neotechnology.comwrote:

 Exactly, the problem is most likely that you try to insert all your
 stuff in one transaction. All data for a transaction is kept in memory
 until committed so for really big transactions it can fill your entire
 heap. Try to group 10k operations or so for big insertions or use the
 batch inserter.

 Links:
 http://wiki.neo4j.org/content/Transactions#Big_transactions
 http://wiki.neo4j.org/content/Batch_Insert

 2010/6/2, Laurent Laborde kerdez...@gmail.com:
  On Wed, Jun 2, 2010 at 3:50 AM, Biren Gandhi biren.gan...@gmail.com
 wrote:
 
  Is there any limit on number of nodes that can be created in a neo4j
  instance? Any other tips?
 
  I created hundreds of millions of nodes without problems, but it was
  split into many transactions.
 
  --
  Laurent ker2x Laborde
  Sysadmin & DBA at http://www.over-blog.com/
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 


 --
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Node creation limit

2010-06-02 Thread Mattias Persson
Only 4.4MB out of those 80MB is consumed by nodes, so you must be storing
some properties somewhere. Would you mind sharing your code so that it
would be easier to get better insight into your problem?

2010/6/2, Biren Gandhi biren.gan...@gmail.com:
 Thanks. Big transactions were indeed problematic. Splitting them into
 smaller chunks did the trick.

 I'm still disappointed by the on-disk size of a minimal node without any
 relationships or attributes. For 500K nodes, it is taking 80MB of space (160
 bytes/node) and for 1M objects it is consuming 160MB (again 160 bytes/node).
 Is this normal?

 4.0K    active_tx_log
 12K     lucene
 12K     lucene-fulltext
 4.0K    neostore
 4.0K    neostore.id
 4.4M    neostore.nodestore.db
 4.0K    neostore.nodestore.db.id
 12M     neostore.propertystore.db
 4.0K    neostore.propertystore.db.arrays
 4.0K    neostore.propertystore.db.arrays.id
 4.0K    neostore.propertystore.db.id
 4.0K    neostore.propertystore.db.index
 4.0K    neostore.propertystore.db.index.id
 4.0K    neostore.propertystore.db.index.keys
 4.0K    neostore.propertystore.db.index.keys.id
 64M     neostore.propertystore.db.strings
 4.0K    neostore.propertystore.db.strings.id
 4.0K    neostore.relationshipstore.db
 4.0K    neostore.relationshipstore.db.id
 4.0K    neostore.relationshiptypestore.db
 4.0K    neostore.relationshiptypestore.db.id
 4.0K    neostore.relationshiptypestore.db.names
 4.0K    neostore.relationshiptypestore.db.names.id
 4.0K    nioneo_logical.log.active
 4.0K    tm_tx_log.1
 80M     total


 On Wed, Jun 2, 2010 at 12:17 AM, Mattias Persson
 matt...@neotechnology.comwrote:

 Exactly, the problem is most likely that you try to insert all your
 stuff in one transaction. All data for a transaction is kept in memory
 until committed so for really big transactions it can fill your entire
 heap. Try to group 10k operations or so for big insertions or use the
 batch inserter.

 Links:
 http://wiki.neo4j.org/content/Transactions#Big_transactions
 http://wiki.neo4j.org/content/Batch_Insert

 2010/6/2, Laurent Laborde kerdez...@gmail.com:
  On Wed, Jun 2, 2010 at 3:50 AM, Biren Gandhi biren.gan...@gmail.com
 wrote:
 
  Is there any limit on number of nodes that can be created in a neo4j
  instance? Any other tips?
 
  I created hundreds of millions of nodes without problems, but it was
   split into many transactions.
 
  --
  Laurent ker2x Laborde
  Sysadmin & DBA at http://www.over-blog.com/
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 


 --
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user



-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Node creation limit

2010-06-02 Thread Biren Gandhi
There is only 1 property - "n" (to store the name of the node) - used as
follows:

Node node = graphDb.createNode();
node.setProperty( NAME_KEY, username );

And the values of username are "Node-1", "Node-2", etc.

On Wed, Jun 2, 2010 at 3:14 PM, Mattias Persson
matt...@neotechnology.comwrote:

 Only 4,4mb out of those 80 is consumed by nodes so you must be storing
 some properties somewhere. Would you mind sharing your code so that it
 would be easier to get a better insight into your problem?

 2010/6/2, Biren Gandhi biren.gan...@gmail.com:
  Thanks. Big transactions were indeed problematic. Splitting them down
 into
  smaller chunks did the trick.
 
  I'm still disappointed by the on-disk size of a minimal node without any
  relationships or attributes. For 500K nodes, it is taking 80MB of space (160
  bytes/node) and for 1M objects it is consuming 160MB (again 160
 bytes/node).
  Is this normal?
 
  4.0K    active_tx_log
  12K     lucene
  12K     lucene-fulltext
  4.0K    neostore
  4.0K    neostore.id
  4.4M    neostore.nodestore.db
  4.0K    neostore.nodestore.db.id
  12M     neostore.propertystore.db
  4.0K    neostore.propertystore.db.arrays
  4.0K    neostore.propertystore.db.arrays.id
  4.0K    neostore.propertystore.db.id
  4.0K    neostore.propertystore.db.index
  4.0K    neostore.propertystore.db.index.id
  4.0K    neostore.propertystore.db.index.keys
  4.0K    neostore.propertystore.db.index.keys.id
  64M     neostore.propertystore.db.strings
  4.0K    neostore.propertystore.db.strings.id
  4.0K    neostore.relationshipstore.db
  4.0K    neostore.relationshipstore.db.id
  4.0K    neostore.relationshiptypestore.db
  4.0K    neostore.relationshiptypestore.db.id
  4.0K    neostore.relationshiptypestore.db.names
  4.0K    neostore.relationshiptypestore.db.names.id
  4.0K    nioneo_logical.log.active
  4.0K    tm_tx_log.1
  80M     total
 
 
  On Wed, Jun 2, 2010 at 12:17 AM, Mattias Persson
  matt...@neotechnology.comwrote:
 
  Exactly, the problem is most likely that you try to insert all your
  stuff in one transaction. All data for a transaction is kept in memory
  until committed so for really big transactions it can fill your entire
  heap. Try to group 10k operations or so for big insertions or use the
  batch inserter.
 
  Links:
  http://wiki.neo4j.org/content/Transactions#Big_transactions
  http://wiki.neo4j.org/content/Batch_Insert
 
  2010/6/2, Laurent Laborde kerdez...@gmail.com:
   On Wed, Jun 2, 2010 at 3:50 AM, Biren Gandhi biren.gan...@gmail.com
  wrote:
  
   Is there any limit on number of nodes that can be created in a neo4j
   instance? Any other tips?
  
   I created hundreds of millions of nodes without problems, but it was
    split into many transactions.
  
   --
   Laurent ker2x Laborde
    Sysadmin & DBA at http://www.over-blog.com/
   ___
   Neo4j mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user
  
 
 
  --
  Mattias Persson, [matt...@neotechnology.com]
  Hacker, Neo Technology
  www.neotechnology.com
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 
  ___
  Neo4j mailing list
  User@lists.neo4j.org
  https://lists.neo4j.org/mailman/listinfo/user
 


 --
 Mattias Persson, [matt...@neotechnology.com]
 Hacker, Neo Technology
 www.neotechnology.com
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Node creation limit

2010-06-02 Thread Biren Gandhi
Here is some content from neostore.propertystore.db.strings - another huge
file. What is the max number of nodes/relationships that people have tried
with Neo4j so far? Can someone share disk space usage characteristics?

od -N 1000 -x -c neostore.propertystore.db.strings

000  8500      
 \0  \0  \0 205  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
020        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
200   0100    0c00 
 \0  \0  \0  \0  \0 001 377 377 377 377  \0  \0  \0  \f 377 377
220  4e00 6f00 6400 6500 2d00 3000 
377 377  \0   N  \0   o  \0   d  \0   e  \0   -  \0   0  \0  \0
240        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
400      ff01  00ff
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001 377 377 377 377  \0
420  ff0c  00ff 004e 006f 0064 0065
 \0  \0  \f 377 377 377 377  \0   N  \0   o  \0   d  \0   e  \0
440 002d 0031      
  -  \0   1  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
460        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
600        0100
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001
620    0c00   4e00 6f00
377 377 377 377  \0  \0  \0  \f 377 377 377 377  \0   N  \0   o
640 6400 6500 2d00 3200    
 \0   d  \0   e  \0   -  \0   2  \0  \0  \0  \0  \0  \0  \0  \0
660        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001020   ff01  00ff  ff0c 
 \0  \0  \0  \0 001 377 377 377 377  \0  \0  \0  \f 377 377 377
0001040 00ff 004e 006f 0064 0065 002d 0033 
377  \0   N  \0   o  \0   d  \0   e  \0   -  \0   3  \0  \0  \0
0001060        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001220     0100   
 \0  \0  \0  \0  \0  \0  \0  \0  \0 001 377 377 377 377  \0  \0
0001240 0c00   4e00 6f00 6400 6500 2d00
 \0  \f 377 377 377 377  \0   N  \0   o  \0   d  \0   e  \0   -
0001260 3400       
 \0   4  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
0001300        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*
0001420        ff01
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0 001 377
0001440  00ff  ff0c  00ff 004e 006f
377 377 377  \0  \0  \0  \f 377 377 377 377  \0   N  \0   o  \0
0001460 0064 0065 002d 0035    
  d  \0   e  \0   -  \0   5  \0  \0  \0  \0  \0  \0  \0  \0  \0
0001500        
 \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0
*


On Wed, Jun 2, 2010 at 3:50 PM, Biren Gandhi biren.gan...@gmail.com wrote:

 There is only 1 property - n (to store name of the node) - used as
 follows:

 Node node = graphDb.createNode();
 node.setProperty( NAME_KEY, username );

 And the values of username are Node-1, Node-2 etc.

 On Wed, Jun 2, 2010 at 3:14 PM, Mattias Persson matt...@neotechnology.com
  wrote:

 Only 4.4MB out of those 80MB is consumed by nodes, so you must be storing
 some properties somewhere. Would you mind sharing your code so that it
 would be easier to get better insight into your problem?

 2010/6/2, Biren Gandhi biren.gan...@gmail.com:
  Thanks. Big transactions were indeed problematic. Splitting them down
 into
  smaller chunks did the trick.
 
  I'm still disappointed by the on-disk size of a minimal node without any
  relationships or attributes. For 500K nodes, it is taking 80MB space
 (160
  bytes/node) and for 1M objects it is consuming 160MB (again 160
 bytes/node).
  Is this normal?
 
  4.0K    active_tx_log
  12K     lucene
  12K     lucene-fulltext
  4.0K    neostore
  4.0K    neostore.id
  4.4M    neostore.nodestore.db
  4.0K    neostore.nodestore.db.id
  12M     neostore.propertystore.db
  4.0K    neostore.propertystore.db.arrays
  4.0K    neostore.propertystore.db.arrays.id
  4.0K    neostore.propertystore.db.id
  4.0K    neostore.propertystore.db.index
  4.0K    neostore.propertystore.db.index.id
  4.0K    neostore.propertystore.db.index.keys
  4.0K    neostore.propertystore.db.index.keys.id
  64M     neostore.propertystore.db.strings
  4.0K    neostore.propertystore.db.strings.id
  4.0K    neostore.relationshipstore.db
  4.0K    neostore.relationshipstore.db.id
  4.0K    neostore.relationshiptypestore.db
  4.0K    neostore.relationshiptypestore.db.id
  4.0K

Re: [Neo4j] Tell neo to not reuse ID's

2010-06-02 Thread Craig Taverner
Here is a crazy idea that probably only works for nodes: don't actually
delete the nodes, just the relationships and the node properties. The
skeleton node will retain the id in the table, preventing re-use. If these
orphans are not relevant to your tests, this should have the effect you are
looking for.
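
A minimal sketch of that, assuming the standard 1.x core API and a
node variable for the node being "deleted":

import java.util.ArrayList;
import java.util.List;
import org.neo4j.graphdb.Node;
import org.neo4j.graphdb.Relationship;

// strip the node bare, but never call node.delete(), so its id stays allocated
for ( Relationship rel : node.getRelationships() )
{
    rel.delete();
}
List<String> keys = new ArrayList<String>();
for ( String key : node.getPropertyKeys() ) // copy first to avoid mutating
{                                           // the properties while iterating
    keys.add( key );
}
for ( String key : keys )
{
    node.removeProperty( key );
}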

On Wed, Jun 2, 2010 at 8:17 PM, Martin Neumann m.neumann.1...@gmail.comwrote:

 Hej,

 Is it somehow possible to tell Neo4j not to reuse IDs at all?

 I'm running some experiments on Neo4j where I add and delete nodes
 and relationships. To make sure that I can repeat the same
 experiment, I create a log containing the IDs of the nodes I want to
 delete. For a rerun to work, each node I add
 has to get the same ID in each experiment.
 If IDs can be reused that is not always the case, which is why I need to
 turn reuse off or work around it.

 Hoping for your help,
 cheers Martin
 ___
 Neo4j mailing list
 User@lists.neo4j.org
 https://lists.neo4j.org/mailman/listinfo/user

___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo4j] Compacting files?

2010-06-02 Thread Thomas Sant'ana
On Wed, Jun 2, 2010 at 9:30 AM, Johan Svensson jo...@neotechnology.comwrote:

 Alex,

 You are correct about the holes in the store file and I would
 suggest you export the data and then re-import it again. Neo4j is not
 optimized for the use case were more data is removed than added over
 time.


I like Postgres' autovacuum feature. I think it's already nice if Neo4j
reuses the holes. Some kind of compression and truncation of the files
would be great, in my opinion.

Just my 2 cents,

 Thomas
___
Neo4j mailing list
User@lists.neo4j.org
https://lists.neo4j.org/mailman/listinfo/user