Re: [Neo] meta meta classes

2010-04-08 Thread Mattias Persson
2010/3/30 Niels Hoogeveen pd_aficion...@hotmail.com


 MetaModelObject already has a getter (without using the word get) to
 access the node.

 Wrapping MetaModelObject to give it a Node-interface makes it possible to
 directly write:

 metaObject.setProperty(a, b)

 instead of

 metaObject.node.setProperty(a, b)

 If that were all, I wouldn't make a post about it. The more interesting
 part is the meta modeling of the node gotten from a MetaModelObject. This
 node represents the class, but it is not an instance of any class itself.
 That's where the reify method comes into play, which takes the node from a
 MetaModelObject and creates a class with the same name, but in a different
 namespace, and makes the node an instance of this new class.

 With that construction it becomes possible to model the
 relationships/properties of a class.

 There are many examples where this can be handy.

 I already gave the example of HTML tags, where the attributes can be
 modeled as properties and the tagname as a property of the meta class.

 Another example is:

 All countries have subdivisions. Countries and their subdivisions are all
 instances of a certain class. There is one class for country, but there is a
 set of classes for subdivisions. In some countries, a subdivision is called
 a province, in others it's a state or a district. Subdivisions of various
 countries can have different sets of properties and relations.

 How to model the fact that the instance United States of America has
 subdivisions of the class State_(US)? Using the reify method we can make a
 class United States of America, where we can model that the ClassRange of
 State_(US) is United States of America. Or if we want the relationship
 to point the other way, that the United States of America has 50 (use of
 cardinality) subdivisions of the class State_(US).

Here you take a step into modeling the actual data into the meta model which
describes the data. Is it desirable to first model exactly how the data will
look, and then add data so that it looks like that? I get a feeling that the
data is described twice here...


 Of course all this can directly be expressed with nodes and relationships,
 but that's what the meta model does anyway.

 I do have one peeve with the meta model API. The class DataRange has a
 constructor:

 DataRange(String datatype, Object... values)

 I'd much rather see a new Restrictable class Datavalue, and see a
 constructor:

 DataRange(String datatype, Datavalue... values)

 That way the possible values of a property can have additional properties and
 relationships (eg. link to Wordnet definition or Wikipedia entry).

 Kind regards,
 Niels Hoogeveen






  Date: Tue, 30 Mar 2010 09:30:10 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo] meta meta classes
 
  Would making the underlying Node publicly available (via a getter) be
  virtually the same thing? In that case the meta model classes could have
  such a getter.
 
  2010/3/26 Niels Hoogeveen pd_aficion...@hotmail.com
 
  
   Hi Peter,
  
   I added a Wiki entry in my github repo called "Reification of meta
   classes and meta properties":

   http://wiki.github.com/NielsHoogeveen/Scala-Neo4j-utils/reification-of-meta-classes-and-meta-properties

   The source code for the Scala wrappers can be found in my repo:
  
   http://github.com/NielsHoogeveen/Scala-Neo4j-utils
  
   Kind regards,
   Niels Hoogeveen
  
From: neubauer.pe...@gmail.com
Date: Fri, 26 Mar 2010 17:25:01 +0100
To: user@lists.neo4j.org
Subject: Re: [Neo] meta meta classes
   
Awesome Niels!
   
maybe you could blog or document some cool example on this?
   
Cheers,
   
/peter neubauer
   
   
   
   
On Fri, Mar 26, 2010 at 5:22 PM, Niels Hoogeveen
pd_aficion...@hotmail.com wrote:

 Using Scala, I was actually able to extend MetaModelThing to act as a
 Node and MetaModelClass to have shadowing functionality for both
 MetaModelClasses and for MetaModelProperties, without touching the
 original source code.

 To: user@lists.neo4j.org
 From: rick.bullo...@burningskysoftware.com
 Date: Fri, 26 Mar 2010 14:29:03 +
 Subject: Re: [Neo] meta meta classes

 Such are the joys and challenges of frameworks and abstractions.
Sometimes you do need to get close to the metal though, to achieve
   specific functional and performance requirements.  Thus the reason open
   source frameworks are awesome.  At least we can 

Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread rick . bullotta
   Since no one responded yesterday, I wanted to re-emphasize that there
   are probably substantial optimizations that can be made in a well-known
   problem domain such as this.   For example, by using pre-calculated
   relevance measures for tags, and by narrowing the returned set of
   posts/nodes as rapidly as possible using the least used tag(s) in
   progressive order.  It would be quite trivial (and reasonably
   performant) to maintain a pair of properties on each node in the tag
   hierarchy that count the # of relationships of the tag and all its
   children.  Each time a tagging relationship was added to a post, simply
   add 1 to this property for the tag node and all its ancestors/parents.
   Then, when you are provided with a list of tags to search upon, order
   them by the least frequently used tag by leveraging this metric and
   execute your traversals/set analysis in that order.  I also think my
   proposal for a two-directional search (first one from the direction of
   the least frequently used tag to the posts that include it, followed by
   a search from each of those posts back to its tags as described in a
   previous message) could be quite fast.
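
A minimal sketch of the counting scheme described above, assuming an in-memory map stands in for the tag nodes (in Neo4j the count would live in a node property). The class and method names are hypothetical, and only a single rolled-up counter per tag is kept rather than a pair of properties:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a tag hierarchy with usage counters. */
final class TagCounts {
    private final Map<String, String> parentOf = new HashMap<>();
    private final Map<String, Integer> usage = new HashMap<>();

    /** Registers a tag under a parent (null for a root tag). */
    void addTag(String tag, String parent) {
        parentOf.put(tag, parent);
        usage.put(tag, 0);
    }

    /** Called whenever a post is tagged: add 1 to the tag and to all its ancestors. */
    void recordTagging(String tag) {
        for (String t = tag; t != null; t = parentOf.get(t)) {
            usage.merge(t, 1, Integer::sum);
        }
    }

    int usageOf(String tag) {
        return usage.getOrDefault(tag, 0);
    }

    /** Orders the query tags so that the least frequently used one comes first. */
    List<String> leastUsedFirst(List<String> queryTags) {
        List<String> ordered = new ArrayList<>(queryTags);
        ordered.sort(Comparator.comparingInt(this::usageOf));
        return ordered;
    }
}
```

The traversal would then start from the first tag in the ordered list, which keeps the initial candidate set of posts as small as possible.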



   Another compound index approach that can be used, which is somewhat
   of a brute force method, is to maintain a property on each tag node
   that consists of its aggregate name - e.g.
   Europe.Italy.Toscana.Siena or
   Activities.Active.Cycling.MountainBiking.  When doing a search for
   Cycling activities in Italy, you could grab the aggregate names for
   Italy (Europe.Italy) and Cycling (Activities.Active.Cycling), then,
   using whatever mechanism you choose for your initial node traversal
   (exhaustive or least-frequently-used tag), you can compare the
   aggregate name for the tags assigned to a post node to the aggregate
   names for the desired nodes using a simple String.startsWith().  For
   example, if I posted regarding a mountain bike ride I took in the hills
   around Siena, tagged with the above aggregate names, it would
   successfully match.  My first thought was that this could be
   problematic if a tag term appeared multiple times in the tag hierarchy,
   but that's easily managed on the query side.
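
The prefix comparison above can be sketched as follows. This is an illustration only: the class and method names are hypothetical, and the extra "." check is an assumption added so that a name such as Europe.Italy does not accidentally match a sibling whose name merely begins with the same characters:

```java
import java.util.List;

/** Hypothetical helper for matching posts by aggregate tag names. */
final class AggregateNames {
    /** True if the tag is the wanted tag itself or lies below it in the hierarchy. */
    static boolean isAtOrBelow(String tagName, String wantedPrefix) {
        return tagName.equals(wantedPrefix) || tagName.startsWith(wantedPrefix + ".");
    }

    /** A post matches when every wanted prefix is covered by at least one of its tags. */
    static boolean matches(List<String> postTags, List<String> wantedPrefixes) {
        for (String wanted : wantedPrefixes) {
            boolean covered = false;
            for (String tag : postTags) {
                if (isAtOrBelow(tag, wanted)) {
                    covered = true;
                    break;
                }
            }
            if (!covered) {
                return false;
            }
        }
        return true;
    }
}
```

With the Siena post tagged Europe.Italy.Toscana.Siena and Activities.Active.Cycling.MountainBiking, a query for Europe.Italy and Activities.Active.Cycling would match.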



   Just trying to make the point that abstract or generic traversal
   schemes aren't always optimal and that it is often worth the
   effort to explore domain-specific approaches.



   Does that make any sense?







    Original Message 
   Subject: Re: [Neo] How to efficiently query in Neo4J?
   From: Craig Taverner cr...@amanzi.com
   Date: Wed, April 07, 2010 7:05 pm
   To: Neo user discussions user@lists.neo4j.org
   Hi Alastair,

   > I have been using what you tag the 'composite index' although in mysql.
   > It's fast, but a pain to manage (as you need to keep the index up to
   > date), so I would like to stay away from indexes *if possible*.

   I would think that you only need to take action when you add or modify a
   node, and then only to (re)connect it to the index tree (creating index
   nodes on demand, if missing). This can be embedded in your domain classes,
   so indexing is automatic. You can even synchronously 'garbage-collect'
   unused index nodes (if the node unlinked was the last node for that index
   node). I think the index-service for this needs to be well tested for all
   scenarios, but should ultimately have a very simple API, with no manual
   management requirement.

   My one concern with the composite index for your case is that all my
   thinking in this has been for numerical indexes, where I plan to query with
   inequalities (eg. return all restaurants with rating >= 4 stars). I've not
   thought about how to solve hierarchical tags like you have.

   > One further optimisation is to only store new items in the hash on the
   > first traversal. Then, in the subsequent traversals, if the key does not
   > exist, there is no need to add key with count 1, as it cannot ever be
   > emitted. This limits the memory requirements to the order of the first
   > traversal, so if you pick that well, it should be better.

   Nice idea. It makes your approach more like the 'one set intersection'
   approach in terms of memory.

   Picking a good first query seems a common need for many of the solutions. I
   presume RDBMS have a query optimization phase that figures that out. I'm
   hoping to completely avoid that kind of non-deterministic approach with the
   composite index.

   Cheers, Craig
   ___
   Neo mailing list
   User@lists.neo4j.org
   https://lists.neo4j.org/mailman/listinfo/user


Re: [Neo] Unable to memory map

2010-04-08 Thread Johan Svensson
Hi,

The read only version is not faster on reads compared to a writable
store. Internally the only difference is we open files in read only
mode.

The reason you get the error is that your OS does not support mapping a
memory region onto a file opened in read only mode when the region
extends beyond the end of the file data (in write mode the file simply
grows in size when that happens).
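
The effect can be reproduced with plain java.nio, independent of Neo4j. The following is a small sketch, assuming a current OpenJDK build, where FileChannel.map refuses with an IOException to extend a file through a channel that is not writable; the class name and the temp file are made up for the illustration:

```java
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.OpenOption;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical demo: mapping a region that extends past the end of a file. */
final class MapBeyondEof {
    /** Tries to map `length` bytes from offset 0; returns true if the mapping succeeded. */
    static boolean tryMap(Path file, boolean writable, long length) {
        OpenOption[] options = writable
                ? new OpenOption[] {StandardOpenOption.READ, StandardOpenOption.WRITE}
                : new OpenOption[] {StandardOpenOption.READ};
        FileChannel.MapMode mode = writable
                ? FileChannel.MapMode.READ_WRITE
                : FileChannel.MapMode.READ_ONLY;
        try (FileChannel channel = FileChannel.open(file, options)) {
            channel.map(mode, 0, length);
            return true;
        } catch (IOException e) {
            // Read only channel and a region larger than the file: cannot grow the file.
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("store", ".db");
        file.toFile().deleteOnExit();
        Files.write(file, new byte[16]);
        System.out.println("read only, within file: " + tryMap(file, false, 16));
        System.out.println("read only, beyond file: " + tryMap(file, false, 64));
        System.out.println("writable, beyond file:  " + tryMap(file, true, 64));
        System.out.println("file size afterwards:   " + Files.size(file));
    }
}
```

This mirrors the situation in the thread: a mapped_memory setting larger than the store file is harmless in write mode because the file grows, but in read only mode the mapping cannot be created.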

-Johan

On Mon, Mar 29, 2010 at 9:03 PM, Marc Preddie mpred...@gmail.com wrote:
 Hi,

 I've had some time to look into this issue and it seems that when using the
 ReadOnly versions of the classes,  I get the memory mapping warnings and
 when using the Writable versions of the classes, the warning does not
 occur (I'm assuming memory mapping gets enabled).

 I'm not against using the writable versions of the classes; my only
 concern is performance. Are the readonly versions faster than the
 writable versions? And if they are, then if memory mapping is not enabled,
 are they still faster than the writable versions with memory mapping?

 I'll run some tests, but I guess I would like an expert opinion.

 Regards,
 Marc

 On Mon, Mar 22, 2010 at 10:24 AM, Tobias Ivarsson 
 tobias.ivars...@neotechnology.com wrote:

 Hi,

 We have seen this message before emitted as a warning from Neo4j. Are you
 seeing this as a warning as well, or are you getting an exception thrown to
 your application code?

 It's hard to deal with these errors since nio only throws IOException, and
 not any more semantic information than that. I believe we deal with all
 cases by issuing a warning and then falling back to another method of
 performing the same operation, but if you are getting exceptions we need to
 resolve it. If you are indeed getting exceptions, some code that triggers
 it would be very helpful.

 Cheers,
 Tobias

 On Wed, Mar 17, 2010 at 1:47 PM, Marc Preddie mpred...@gmail.com wrote:

  Hi,
 
  I've looked at the mailing list and found 1 similar situation, but no real
  solution. So I was hoping someone could shed some light on this.

  I seem to have an issue with neo4j being able to use memory mapped files.
  I've run my service on Win XP 64bit, Mac OSX Snow Leopard 10.6.2 and Centos
  5.x 64bit and always get the same error when launching. I'm using APOC 1.0
  and have a DB of approx 600M. In my neo config I allocate about 5M more for
  each type of file than the actual file size (I've tried multiple different
  settings). On each machine I also leave at least 1.5G for the OS and have
  at least 2.5G heap for the Java process. I'm also using the
  classes EmbeddedReadOnlyGraphDatabase and LuceneReadOnlyIndexService to
  access and browse the DB.
 
  Neo config
 
  neostore.nodestore.db.mapped_memory=10M
  neostore.relationshipstore.db.mapped_memory=110M
  neostore.propertystore.db.mapped_memory=85M
  neostore.propertystore.db.index.mapped_memory=10M
  neostore.propertystore.db.index.keys.mapped_memory=10M
  neostore.propertystore.db.strings.mapped_memory=320M
  neostore.propertystore.db.arrays.mapped_memory=10M
 
  Here is the error
 
  org.neo4j.kernel.impl.nioneo.store.MappedMemException: Unable to map
  pos=3005872 recordSize=33 totalSize=1153416
      at org.neo4j.kernel.impl.nioneo.store.MappedPersistenceWindow.<init>(MappedPersistenceWindow.java:59)
      at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.allocateNewWindow(PersistenceWindowPool.java:530)
      at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.refreshBricks(PersistenceWindowPool.java:430)
      at org.neo4j.kernel.impl.nioneo.store.PersistenceWindowPool.acquire(PersistenceWindowPool.java:122)
      at org.neo4j.kernel.impl.nioneo.store.CommonAbstractStore.acquireWindow(CommonAbstractStore.java:459)
      at org.neo4j.kernel.impl.nioneo.store.RelationshipStore.getChainRecord(RelationshipStore.java:248)
      at org.neo4j.kernel.impl.nioneo.xa.NeoReadTransaction.getMoreRelationships(NeoReadTransaction.java:103)
      at org.neo4j.kernel.impl.nioneo.xa.NioNeoDbPersistenceSource$ReadOnlyResourceConnection.getMoreRelationships(NioNeoDbPersistenceSource.java:275)
      at org.neo4j.kernel.impl.persistence.PersistenceManager.getMoreRelationships(PersistenceManager.java:93)
      at org.neo4j.kernel.impl.core.NodeManager.getMoreRelationships(NodeManager.java:585)
      at org.neo4j.kernel.impl.core.NodeImpl.getMoreRelationships(NodeImpl.java:332)
      at org.neo4j.kernel.impl.core.NodeImpl.ensureFullRelationships(NodeImpl.java:320)
      at org.neo4j.kernel.impl.core.NodeImpl.getAllRelationshipsOfType(NodeImpl.java:129)
      at org.neo4j.kernel.impl.core.NodeImpl.getSingleRelationship(NodeImpl.java:179)
      at org.neo4j.kernel.impl.core.NodeProxy.getSingleRelationship(NodeProxy.java:98)

  Caused by: java.io.IOException: Access is denied
      at sun.nio.ch.FileChannelImpl.truncate0(Native Method)
      at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:728)
      at
 
 
 

Re: [Neo] meta meta classes

2010-04-08 Thread Niels Hoogeveen

The example of the tag library and countries/sub-divisions are not necessarily 
similar. The first shows the need to model the properties of a class.

The second example shows the need to have singleton classes, which is a 
different concept, and something that cannot be done out of the box, but I will 
show a solution that requires minimal modifications to the current software.

Suppose we want to model countries and their sub-divisions.

We do the following:

create class country
create class sub-division
create property has-sub-division
make has-sub-division a property of country

Now we'd like to populate the database. Since all countries have different types 
of sub-divisions, we need to create those:

create class french_region
create class canadian_province
create class canadian_territory
create class US_state
etc.

Populate the various classes with instances:

create node for Alsace and make it an instance of french_region
create node for Aquitaine and make it an instance of french_region
create node for Alberta and make it an instance of canadian_province
create node for Nunavut and make it an instance of canadian_territory
create node for British Columbia and make it an instance of canadian_province
create node for Alabama and make it an instance of US_state
create node for Alaska and make it an instance of US_state
etc.

We'd like to state that each country has its own restriction on the type of 
subdivision. To do that we need to create classes for each country.

create class France 
create class Canada
create class United_States_of_America
etc.

make class France a subclass of country
make class Canada a subclass of country
make class United_States_of_America a subclass of country
etc.

create restriction for France on has-sub-division with range french_region and cardinality = 26
create restriction for Canada on has-sub-division with range canadian_province and cardinality = 10
create restriction for Canada on has-sub-division with range canadian_territory and cardinality = 3
create restriction for United_States_of_America on has-sub-division with range US_state and cardinality = 50
etc.

Now we have classes for each country, but no instances. Unfortunately we cannot 
say that a class is an instance of itself; that would require a relationship 
where the end node equals the start node, which Neo4J doesn't allow. So we have 
to create separate instances for each country.

create node for France
create node for Canada
create node for United_States_of_America
etc.

make node France an instance of class France
make node Canada an instance of class Canada
make node United_States_of_America an instance of class 
United_States_of_America

And finally link the subdivisions to their countries:

France has-sub-division Alsace
France has-sub-division Aquitaine
Canada has-sub-division Alberta
Canada has-sub-division British Columbia
Canada has-sub-division Nunavut
United_States_of_America has-sub-division Alabama
United_States_of_America has-sub-division Alaska

So we end up having a class for each country and an instance for each country.
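
To make the restriction step concrete, here is a small in-memory sketch of "restriction on has-sub-division with range and cardinality". It does not use the Neo4j meta model API; all class and method names are hypothetical, and a country instance is assumed to carry the same name as its class:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of per-country restrictions on has-sub-division. */
final class SubdivisionModel {
    /** A country class must have exactly `cardinality` sub-divisions of class `range`. */
    static final class Restriction {
        final String range;
        final int cardinality;

        Restriction(String range, int cardinality) {
            this.range = range;
            this.cardinality = cardinality;
        }
    }

    private final Map<String, List<Restriction>> restrictions = new HashMap<>();
    private final Map<String, String> classOf = new HashMap<>();
    private final Map<String, List<String>> subdivisions = new HashMap<>();

    /** "create restriction for <countryClass> on has-sub-division with range ... and cardinality ..." */
    void restrict(String countryClass, String range, int cardinality) {
        restrictions.computeIfAbsent(countryClass, k -> new ArrayList<>())
                .add(new Restriction(range, cardinality));
    }

    /** "<country> has-sub-division <subdivision>", where the subdivision is an instance of subdivisionClass. */
    void hasSubdivision(String country, String subdivision, String subdivisionClass) {
        classOf.put(subdivision, subdivisionClass);
        subdivisions.computeIfAbsent(country, k -> new ArrayList<>()).add(subdivision);
    }

    /** Checks every restriction of the country's class against the linked sub-divisions. */
    boolean satisfiesRestrictions(String country) {
        List<String> linked = subdivisions.getOrDefault(country, Collections.emptyList());
        for (Restriction r : restrictions.getOrDefault(country, Collections.emptyList())) {
            long count = linked.stream().filter(s -> r.range.equals(classOf.get(s))).count();
            if (count != r.cardinality) {
                return false;
            }
        }
        return true;
    }
}
```

For Canada this would mean two restrictions (canadian_province with cardinality 10, canadian_territory with cardinality 3), and the check only passes once all thirteen sub-divisions are linked.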

Writing this down, I realize the patch I sent you a few days ago contains a 
minor flaw that needs to be fixed. In that patch I added an indexed property 
uri to the meta model, to bring it in line with what is being done in the RDF 
module, and to make certain that the same URI is not used for two different 
properties. Without uniqueness a class can have several different PropertyTypes 
with the same name. In that situation the lookup of the PropertyType of a 
property or relationship becomes impossible.

The flaw in my patch is the name of the uri property, which should be 
something like class_uri. That way the class of each country can be given the 
same URI as the instance of each country, because they live in different 
namespaces. 

This same technique is used in OWL to provide punning. An instance and a 
class can have the same URI, because instances in OWL live in a different 
namespace from classes. 

Through the use of the uri property and the class_uri property we can also 
distill that each country class is a singleton class, because there exists an 
instance with the same URI. That way we can work around the limitation that 
relationships cannot have the same start and end node. 

Furthermore, it allows for some extra restrictions to MetaModelClass with the 
following logic: If a class has exactly one instance where the uri of that 
instance equals the class_uri of the class, no more instances can be added. 
And if there is an instance of a class without a uri that equals the 
class_uri, no instances can be added where the uri of the instance equals 
the class_uri of the class. With that logic, we have proper singleton classes 
in the meta model of Neo4J.
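
The two rules can be sketched as a guard on adding instances. This is an illustration under assumptions, not the MetaModelClass implementation: the class below is a hypothetical stand-in that only tracks a class_uri and the uri values of its instances:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a meta model class that carries a class_uri. */
final class SingletonAwareClass {
    private final String classUri;
    private final List<String> instanceUris = new ArrayList<>();

    SingletonAwareClass(String classUri) {
        this.classUri = classUri;
    }

    /** A class is a singleton when its only instance has a uri equal to the class_uri. */
    boolean isSingleton() {
        return instanceUris.size() == 1 && classUri.equals(instanceUris.get(0));
    }

    /** Adds an instance identified by its uri (may be null); returns false if a rule forbids it. */
    boolean addInstance(String instanceUri) {
        if (isSingleton()) {
            return false; // rule 1: a singleton class accepts no further instances
        }
        if (classUri.equals(instanceUri) && !instanceUris.isEmpty()) {
            return false; // rule 2: a class with ordinary instances cannot become a singleton
        }
        instanceUris.add(instanceUri);
        return true;
    }
}
```

With this guard, the class France accepts the instance France once and nothing else, while the class country accepts France and Canada as instances but rejects an instance whose uri equals its own class_uri.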

Kind regards,
Niels Hoogeveen

 Date: Thu, 8 Apr 2010 11:37:37 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo] meta meta classes
 
 2010/3/30 Niels 

Re: [Neo] getNumberOfIdsInUse(Node.class)) return -1

2010-04-08 Thread Johan Svensson
Hi,

I had a look at this and can not figure out why -1 is returned.

When running the kernel in normal (write) mode the return value of
number of ids in use will only be correct if all previous shutdowns
have executed cleanly. This is an optimization to reduce the time
spent in recovery rebuilding id generators after a crash/non clean
shutdown.

After a crash/non clean shutdown the number of ids in use will always
be the highest id in use + 1. To force a full rebuild of the id
generators on each startup (on a non clean shutdown) pass in the
following configuration:

   rebuild_idgenerators_fast=false

In read only mode the return value will always be the highest id in use + 1.

You could try to delete the neostore.nodestore.db.id and pass in
rebuild_idgenerators_fast=false as configuration when starting up
(this will take a long time if the node store file is large). If you
still get incorrect results send me a compressed version of the
neostore.nodestore.db.id file and I will have a look at it.

Regards,
-Johan

On Tue, Apr 6, 2010 at 3:38 PM, Tobias Ivarsson
tobias.ivars...@neotechnology.com wrote:
 Sorry, we have not had time to look into that yet. I'll let you know when we
 have.

 On Mon, Apr 5, 2010 at 12:31 PM, Laurent Laborde kerdez...@gmail.com wrote:

 Any news ?

 --
 Ker2x

 On Fri, Mar 26, 2010 at 12:05 PM, Tobias Ivarsson
 tobias.ivars...@neotechnology.com wrote:
  Ok, thanks. We'll look into it.
 
  On Fri, Mar 26, 2010 at 11:49 AM, Laurent Laborde kerdez...@gmail.com
 wrote:
 
  something between 100 millions and 1 billions, i guess.
  the DB contain the result of my collatz code from 1 to 100 millions.
 
  --
  Ker2x
 
  On Fri, Mar 26, 2010 at 11:40 AM, Tobias Ivarsson
  tobias.ivars...@neotechnology.com wrote:
   If you have a large number of nodes it could be a truncation error from
   long to int somewhere, how many nodes do you estimate that you have?

   It is a bug so we will fix it, but if we know the approximate estimated
   size it would help in finding the cause.
  
   /Tobias
  
   On Fri, Mar 26, 2010 at 7:59 AM, Laurent Laborde kerdez...@gmail.com
  wrote:
  
   my code does a:
   System.out.println("Number of nodes : " +
       neo.getConfig().getNeoModule().getNodeManager().getNumberOfIdsInUse(Node.class));
  
   it print :
   Number of nodes : -1
  
   why does it print -1 ?
   how can i count node ?
  
   thank you :)
  


Re: [Neo] creating nodes with our own id

2010-04-08 Thread Mattias Persson
Did you make any progress with this? I could provide you with an example as
well, here goes:

GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "my/path" );
IndexService index = new LuceneIndexService( graphDb );

// This is how to create and index a UUID for a node.
...
Node node = graphDb.createNode();
node.setProperty( "uuid", java.util.UUID.randomUUID().toString() );
index.index( node, "uuid", node.getProperty( "uuid" ) );
...

// This is how to get a node for a certain UUID
...
Node node = index.getSingleNode( "uuid",
        "9cd0b5b0-7cb4-4806-8b54-39803b1a44e2" );
...


2010/4/6 Mattias Persson matt...@neotechnology.com



 2010/4/5 Niels Hoogeveen pd_aficion...@hotmail.com


 UUID's are for all practical purposes unique. so you can use those for an
 ID and have uniqueness for free.

 +1 That would also be my answer to that.


 Kind regards,
 Niels Hoogeveen

  Date: Mon, 5 Apr 2010 10:54:08 +0530
  From: sivait...@gmail.com
  To: matt...@neotechnology.com
  CC: user@lists.neo4j.org
  Subject: Re: [Neo] creating nodes with our own id
 
  Hi Mattias Persson,
  Thanks for your reply.
  But setting a property cannot give your node uniqueness.
  I want to use my own Id for unique node representation, otherwise I have to
  remember the ids of nodes when I want the information back.
 
  Thanks,
  Bujji
 
 
  On Wed, Mar 31, 2010 at 4:52 PM, Mattias Persson
  matt...@neotechnology.com wrote:
 
   So, the LuceneIndexService is in the neo4j-index component (as I referred
   to in the previous mail), http://components.neo4j.org/neo4j-index/ . It is
   a separate component which depends on the Neo4j kernel component.

   Source code links are available at the above page, for short it's
   https://svn.neo4j.org/components/index/trunk/ . Also neo4j-index in turn
   has its own dependencies, f.ex. lucene and the neo4j-commons component, so
   it's recommended to use a dependency manager, f.ex. maven, to gather all the
   dependencies, see http://wiki.neo4j.org/content/Getting_Started_Guide for
   more information about that.
  
  
   2010/3/31 Bujji sivait...@gmail.com
  
   hi Mattias Persson,
  
   Thanks for your quick response.
   What I see from release 1.0 is that there is no lucene (indexer) component
   in it. Please tell me where and how I can get the source from the repository.
  
   Thanks and Regards,
   Bujji
  
   Message: 7
   Date: Wed, 31 Mar 2010 09:58:40 +0200
   From: Mattias Persson matt...@neotechnology.com
   Subject: Re: [Neo] creating nodes with our own id
   To: Neo user discussions user@lists.neo4j.org
   Message-ID:
  
 k2kacdd47331003310058idbbbf320h956430a2e0289...@mail.gmail.com
   Content-Type: text/plain; charset=UTF-8
  
  
   The node ids shouldn't be used for such lookups. Either you traverse to
   them via relationships and other nodes, or you can use the neo4j-index
   component, http://components.neo4j.org/neo4j-index/ where you can index
   nodes and do lookups, f.ex:

   GraphDatabaseService graphDb = new EmbeddedGraphDatabase( "my/path" );
   IndexService index = new LuceneIndexService( graphDb );

   // within a transaction
   Node myNode = graphDb.createNode();
   myNode.setProperty( "uid", "abc123" );
   index.index( myNode, "uid", myNode.getProperty( "uid" ) );

   Node myNodeFoundViaIndex = index.getSingleNode( "uid", "abc123" );

   NOTE: Indexing operations automatically participate in neo4j transactions
  
   2010/3/31 Bujji sivait...@gmail.com
  
hi all,
i am not clear on how to use the nodes once we create them with
   identifiers
generated by the program.
how do i remember them
i want to have my own id for each node when i am creating a node
is that possible
what are the changes i have to made to work like that
otherwise give me any working example that uses neo4j as it is and
 how
   it
is
using its id's as well
   
plz help me
   
   
Thanks
bujji




-- 
Mattias Persson, [matt...@neotechnology.com]

Re: [Neo] Node not found using BatchInserter

2010-04-08 Thread Mattias Persson
If it's not a very large data set (or you have enough RAM) you could keep
stuff like that in a HashMap really, that's how I do it sometimes... that
way you can get rid of that extra lookup and it'll be faster. So you insert
relationship as usual and in addition store that relationship (i.e. start
node id, end node id) in f.ex. a HashMap for later lookup.
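
A minimal sketch of that idea, assuming node ids are plain long values as returned by the batch inserter. The class and method names are hypothetical, and the cache only lives in memory for the duration of the import:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory record of relationships created during a batch import. */
final class RelationshipCache {
    private final Map<Long, Set<Long>> endNodesByStart = new HashMap<>();

    /** Returns true the first time a (start, end) pair is seen, false for a duplicate. */
    boolean addIfAbsent(long startNodeId, long endNodeId) {
        return endNodesByStart
                .computeIfAbsent(startNodeId, k -> new HashSet<>())
                .add(endNodeId);
    }

    /** Node ids already connected from the given start node; no store lookup needed. */
    Set<Long> neighboursOf(long startNodeId) {
        return endNodesByStart.getOrDefault(startNodeId, Collections.emptySet());
    }
}
```

The import loop would call addIfAbsent before creating each relationship and skip the creation when it returns false, so the inserter is never asked for existing relationships.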

2010/4/1 Amir Hossein Jadidinejad amir.jad...@yahoo.com

 Ok. Thank you very much.
 I just want to load a huge graph, but during insertion I have to look up
 previous relations (in order to prevent duplicate relations). Is the
 BatchInserter not applicable for that? Is there a better idea?

 --- On Thu, 4/1/10, Mattias Persson matt...@neotechnology.com wrote:

 From: Mattias Persson matt...@neotechnology.com
 Subject: Re: [Neo] Node not found using BatchInserter
 To: Neo user discussions user@lists.neo4j.org
 Date: Thursday, April 1, 2010, 2:32 AM

 rel_itr.next() returns relationship ids, not node ids... that's why you get
 those NotFoundExceptions. What you'd need to do is to
 inserter.getRelationshipById( rel-id ) on those ids and get either start
 or end node from it.

 But, is it really right to use the batch inserter in your case?
 BatchInserter is only meant to be used if you're doing a one-time initial
 loading of a big dataset, but never in production or when you have a
 database containing data. Use the EmbeddedGraphDatabase for normal use.

 2010/3/31 Amir Hossein Jadidinejad amir.jad...@yahoo.com

  Hi,
  Check the following code:
  for (Iterator<Long> rel_itr =
          inserter.getRelationshipIds(current_node).iterator(); rel_itr.hasNext();) {
      long neighbor = rel_itr.next();
      if (neighbor != current_node && neighbor != -1) {
          try {
              exist_neighbors.add(inserter.getNodeProperties(neighbor).get("cui").toString());
          } catch (Exception e) {
              e.printStackTrace();
          }
      }
  }
 
  After running, I have a lot of this error:
  org.neo4j.graphdb.NotFoundException: id=3225225
      at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.getNodeRecord(BatchInserterImpl.java:517)
      at org.neo4j.kernel.impl.batchinsert.BatchInserterImpl.getNodeProperties(BatchInserterImpl.java:238)
      at org.qiau.wnng.build.BuildGraph.addAllNodes(BuildGraph.java:220)
      at org.qiau.wnng.build.BuildGraph.main(BuildGraph.java:288)
 
  Is it possible that a neighbor node is not found even though the
  getRelationshipIds method returned it?!
 
 
 
 




-- 
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com


Re: [Neo] Requirements for an event framework for Neo4j

2010-04-08 Thread Tobias Ivarsson
Well, these were the kind of questions I would like to get input on, what is
it that you need. But since I am a user as well as a designer of this I
guess I could go ahead and answer these questions from my perspective. I'll
do so inline.

On Wed, Mar 31, 2010 at 5:26 PM, Rick Bullotta 
rick.bullo...@burningskysoftware.com wrote:

 Hi, Tobias.

 That's awesome news.

 A few general questions regarding an event framework for Neo4J...

 - In the current implementation, there's a thread affinity for
 transactions. I am guessing that this could create big challenges for
 proactive handlers that are potentially executed on a different thread?


My thinking around this is that the event handlers would get access to some
sort of objects that represent the changes made in the transaction. These
objects would be possible to access outside of a transactional context. The
Proactive handlers would however have to be executed synchronously in the
same thread. The reactive handlers would execute on a different thread, and
for them it would be nice to be able to operate on the graph without needing
a transactional context, but I guess opening a read transaction isn't that
big of a deal here anyway, so I think it will work out.


 - Will the handlers be synchronous or asynchronous?


I answered this above...


 - Also, another consideration is whether or not you want to provide support
 for event folding for chatty changes to properties on nodes/relationships
 (e.g. you choose the quality of service - all changes or most recent
 changes only if you haven't yet processed the mutation event).


I would like to keep the number of events fired as low as possible,
meaning that an onNodePropertyChange() event is probably too chatty, and
onBeforeWriteTransactionCommit(SomeObjectWithTheChanges) is probably a
better level. But any input on what you would need is useful. So I would say
that you would only observe the changes that were present at commit, and no
events would be fired before commit.
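
To make "only the changes present at commit" concrete, here is a minimal folding sketch. It assumes a hypothetical PropertyChange record and a last-write-wins rule per node and property key; neither comes from Neo4j.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: PropertyChange is an illustrative type, not a Neo4j class.
final class FoldingSketch {

    static final class PropertyChange {
        final long nodeId;
        final String key;
        final Object newValue;
        PropertyChange(long nodeId, String key, Object newValue) {
            this.nodeId = nodeId;
            this.key = key;
            this.newValue = newValue;
        }
    }

    /** Collapses a chatty change log into one entry per (node, key): the last write wins. */
    static List<PropertyChange> fold(List<PropertyChange> changes) {
        Map<String, PropertyChange> latest = new LinkedHashMap<>();
        for (PropertyChange c : changes) {
            // nodeId is numeric, so "id/key" cannot collide across different nodes
            latest.put(c.nodeId + "/" + c.key, c);
        }
        return new ArrayList<>(latest.values());
    }
}
```

With this, three writes to the same property inside one transaction reach the handler as a single change carrying the final value.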


 - What do you envision passing along with events?  A full copy of the
 node/relationship?  Only the mutated property?


If we can keep it to only be the mutated state that would be great. If we
can limit ourselves to the node with this ID changed somehow that would be
even better. Actually I think we could limit ourselves to that since the
proactive events could be fired (in the same thread as the transaction is
executing in) while the transaction is still open, meaning that the modified
nodes and relationships are still available, and in the reactive handlers
you could open a transaction to get to the current state (the changed state
might already be stale anyway).



 - Would there be support for bucketed notifications that would allow
 notifications on multiple property changes on a node to be processed as a
 single entity?


See my answer to the folding question.



 Looking forward to seeing how this all materializes!

 Rick



 -Original Message-
 From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org]
 On
 Behalf Of Tobias Ivarsson
 Sent: Wednesday, March 31, 2010 6:39 AM
 To: Neo user discussions
 Subject: [Neo] Requirements for an event framework for Neo4j

 Fellow developers!

 The time has come to start the work on an event framework for Neo4j. In
 order to do good work on this we would like input on what requirements you
 have on an event framework. We would like to get a list of use cases for
 which you would use an event framework, along with the features you think
 the use case would need from the event framework (i.e. which events you
 would like to receive notification about, and when). We would also like you
 to motivate why these features are required by the use case. Events can
 easily degrade performance if the framework is ill designed, so we would
 like to keep things very lean.

 We have made some early analysis and arrived at the following conclusions:

 * There can be two kinds of event handlers: Proactive event handlers and
 Reactive event handlers.
 Proactive event handlers have the ability to preempt operations and
 Reactive
 event handlers simply react to an event and cannot cause the event to not
 succeed.

 * There are three kinds of events in Neo4j kernel:
  - Lifecycle events, such as shutdown.
  - Transactional events, such as start commit, commit successful, rollback,
 etc.
  - Data modification events, such as node created, property changed,
 relationship removed, etc.

 It might be possible that other components, such as the indexing component,
 would want to add more events to the event framework.

 These are of course just some initial input to get your thoughts going,
 feel
 free to think outside of the constraints above. Our ultimate goal is to
 create an event framework that is as useful as possible while maintaining

 --
 Tobias Ivarsson tobias.ivars...@neotechnology.com
 Hacker, Neo Technology
 www.neotechnology.com
 Cellphone: +46 706 534857
 

Re: [Neo] Some feedback on ZooKeeper use in Neo4j zha.

2010-04-08 Thread Patrick Hunt

On 04/08/2010 03:20 AM, Johan Svensson wrote:
 Hi Patrick,

 Thanks for the feedback. I will have a look at this and implement
 handling for disconnection and expiration of sessions.


No problem. We'll be psyched to see you roll this out.

 Regarding the GC issues we are well aware of these (hopefully the new
 garbage first or G1 GC will solve these problems). As you say the
 concurrent mark sweep GC helps a lot but more important to avoid GC
 thrashing is to make sure there is more (10-15%) available heap than
 the application ever consumes at any given moment.


Agree. HBase has tested G1 in 1.6.x but so far it is not stable enough 
for production use.

 I do have a question regarding ZooKeeper. Is there a reason why there
 is no embedded version?

 Now I have to start two JVMs on each machine when I really just want to:

 // start a zookeeper server on this machine
 ZooKeeperServer server = new ZooKeeperServer( 2181 );
 // start a client and pass in some zookeeper servers
 ZooKeeper zoo = new ZooKeeper( "localhost:2181", "otherhost:2181", ..., ... );

I'm not sure what you mean by embedded. In production you typically 
want to have dedicated hosts for the servers. See this page for some insight
http://wiki.apache.org/hadoop/ZooKeeper/Troubleshooting

HBase wraps the zk server for their quickstart type use cases (in 
their startup scripts). But in large online production serving 
environments you typically run a ZK cluster separately from the client 
application.

Patrick

 Regards,
 -Johan

 On Wed, Apr 7, 2010 at 12:08 AM, Patrick Hunt ph...@apache.org wrote:
 Hi, I'm Patrick (http://twitter.com/phunt) from the ZooKeeper team.
 Peter Neubauer brought to my attention today that you are considering
 use of ZooKeeper in Neo4j, that's great! I took a quick look at the code
 you currently have in SVN and wanted to provide a bit of feedback.

 I don't know your domain requirements but in general the mechanics of
 ZooClient use look fine.

 The use of a 5 second timeout is fine. This allows you to detect a client
 (zk client) failure after just 5 seconds. So if the node/process crashes
 you'd identify this after 5 seconds, same if a network connection fails,
 etc... One thing you may not have considered though, is that anything
 that causes the client to not be able to heartbeat to the server would
 also cause the session to be expired (sessions are expired when the zk
 cluster fails to hear from the client w/in the timeout time) - so long
 GC pauses could trigger this as well. In 1.6.x jvms we've seen that the
 GC can pause all threads for very long periods (in some cases with hbase
 we saw 4 minute pauses for gc). HBase was the first to see this, we
 worked with Solr early on to help them understand this issue as well.
 The problem can be alleviated somewhat by using the CMS/incremental GC
 options in the JVM, however it cannot be eliminated entirely (in some
 cases the GC will still drop back to parallel). You need to consider the
 impact of GC on your domain and how to best handle it.

 See this JIRA for details on our discussion with Solr, you might gain
 some good insight: http://bit.ly/d7OSQ1
 https://issues.apache.org/jira/browse/SOLR-1277

 I did notice that ZooClient is not handling disconnection and expiration
 of the session in the process method. At the very least you need to
 handle the expiration, you may need to do something for disconnection,
 but this depends on whether you have active or passive actors (masters).
 Here's a good link on session lifecycle:
 http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkSessions
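
 The distinction between disconnection and expiration can be sketched as a small state machine. This is an illustration under assumptions, not the ZooClient code: the State enum stands in for ZooKeeper's KeeperState values SyncConnected, Disconnected and Expired, and SessionSketch with its fields is hypothetical. In a real client the same decisions would be made inside Watcher.process.

```java
// Hypothetical sketch: a stand-alone model of the session handling a watcher needs.
final class SessionSketch {

    /** Stand-ins for KeeperState.SyncConnected, Disconnected and Expired. */
    enum State { CONNECTED, DISCONNECTED, EXPIRED }

    private boolean master = true;      // assume this process currently holds mastership
    private boolean paused = false;     // true while we cannot reach the cluster
    private int sessionGeneration = 1;  // bumped each time a new session is needed

    void process(State state) {
        switch (state) {
            case CONNECTED:
                paused = false;         // same session: ephemeral nodes are still there
                break;
            case DISCONNECTED:
                paused = true;          // session may or may not survive; stop acting
                break;
            case EXPIRED:
                master = false;         // ephemeral nodes are gone; mastership is lost
                paused = false;
                sessionGeneration++;    // a new client handle and re-election are required
                break;
        }
    }

    /** An active master should only act while connected and still holding its session. */
    boolean mayActAsMaster() { return master && !paused; }

    int sessionGeneration() { return sessionGeneration; }
}
```

 A passive actor can often ignore the DISCONNECTED case, while an active master should stop acting until it knows whether the session survived.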


 You might also want to setup a wiki page similar to these at some point,
 it would help us with future discussion, feedback and provide insight
 for devs/users:
 http://wiki.apache.org/hadoop/ZooKeeper/HBaseAndZooKeeper
 http://wiki.apache.org/solr/ZooKeeperIntegration

 Regards and good luck,

 Patrick


Re: [Neo] meta meta classes

2010-04-08 Thread Niels Hoogeveen

Your point about the cardinality restriction is a correct observation. 

In fact it would be better to create an is-subdivision-of PropertyType on
sub-division and give that a range of country with a cardinality of 1. Then,
for each subclass of sub-division, a restriction should be set naming the
country class this specific sub-division class applies to.

Still, it requires each country to be defined as both a class and an instance.

 Date: Thu, 8 Apr 2010 19:15:30 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo] meta meta classes
 
 So, you describe the model (with country, sub-division and
 has-sub-division) which is OK! Then you not just want to add data which
 would conform to it, you also describe the highest level of that data in the
 meta model itself with cardinality for how many sub-divisions each such
 level must contain.
 
 My first thought here is: why put actual data into the meta model (the design
 isn't intended for that)?
 The second is: why (since you put actual data into the meta model) would you
 stop there? Why only say that France has 10 subdivisions? You don't say
 exactly which subdivisions or how many/which subdivisions each subdivision
 has, and so on. Which benefits would modeling only the highest level of data get
 you? And if you could describe the entire data in the meta model, you end up
 with a meta model which describes the entire data and then, in addition, the
 data which is exactly the same as the meta model... what benefits does that
 give you?
 
 I'm just confused about the fact that you want to have the meta model
 (classes, properties, restrictions) and (some of) the actual data modeled in
 the meta model, whereas the meta model was intended to model the meta model
 (the UML diagram, so to speak) and not the actual data that would conform
 to it.
 
 Help me understand which benefits you're after by modeling the top level
 of your actual data into the meta model itself.
 
 
 Best,
 Mattias
 
 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
 
 
  The example of the tag library and countries/sub-divisions are not
  necessarily similar. The first shows the need to model the properties of a
  class.
 
  The second example shows the need to have singleton classes, which is a
  different concept, and something that cannot be done out of the box, but I
  will show a solution that requires minimal modifications to the current
  software.
 
  Suppose we want to model countries and their sub-divisions.
 
  We do the following:
 
  create class country
  create class sub-division
  create property has-sub-division
  make has-sub-division a property of country
 
  Now we like to populate the database, since all countries have different
  types of sub-divisions we need to create those:
 
  create class french_region
  create class canadian_province
  create class canadian_territory
  create class US_state
  etc.
 
  Populate the various classes with instances:
 
  create node for Alsace and make it an instance of french_region
  create node for Aquitaine and make it an instance of french_region
  create node for Alberta and make it an instance of canadian_province
  create node for Nunavut and make it an instance of canadian_territory
  create node for British Columbia and make it an instance of canadian_province
  create node for Alabama and make it an instance of US_state
  create node for Alaska and make it an instance of US_state
  etc.
 
  We'd like to state that each country has its own restriction on the type of
  subdivision. To do that we need to create classes for each country.
 
  create class France
  create class Canada
  create class United_States_of_America
  etc.
 
  make class France a subclass of country
  make class Canada a subclass of country
  make class United_States_of_America a subclass of country
  etc.
 
  create restriction for France on has-sub-division with range
  french_region and cardinality = 26
  create restriction for Canada on has-sub-division with range
  canadian_province and cardinality = 10
  create restriction for Canada on has-sub-division with range
  canadian_territory and cardinality = 3
  create restriction for United_States_of_America on has-sub-division
  with range US_state and cardinality = 50
  etc.
 
  Now we have classes for each country, but no instances. Unfortunately we
  cannot say that a class is an instance of itself; that would require a
  relationship where the end node equals the start node, which Neo4J doesn't
  allow. So we have to create separate instances for each country.
 
  create node for France
  create node for Canada
  create node for United_States_of_America
  etc.
 
  make node France an instance of class France
  make node Canada an instance of class Canada
  make node United_States_of_America an instance of class
  United_States_of_America
 
  And finally link the subdivisions to their countries:
 
   France has-sub-division Alsace
  France 

Re: [Neo] meta meta classes

2010-04-08 Thread Mattias Persson
2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com


 Your point about the cardinality restriction is a correct observation.

 In fact it would be better to create a is-subdivision-of PropertyType on
 sub-division and give that a range country with a cardinality of 1. Then
 for each subclass of sub-division a restriction should be set, naming the
 country class this specific sub-division class applies to.

 Still, it requires  each country to be defined as both a class and an
 instance.

Why just countries as classes? Why not each subdivision as a class as well?
Why countries as classes at all?



Re: [Neo] meta meta classes

2010-04-08 Thread Niels Hoogeveen

Each country needs to be modeled as a class, because I want to set the
restriction that French regions (which can have different properties from
Canadian provinces) can only have a relationship with the country France, and
Canadian provinces can only have a relationship with the country Canada. The
domain and the range of a PropertyType are classes, not instances. If countries
were simply instances of the country class, it would be possible to say that
an instance of a Canadian province is a subdivision of France.

I'd like to be able to iterate over the subdivisions of France and be
guaranteed that each instance has the property region code, a property
unknown to Canadian provinces. Without a restriction stating that a
specific sub-division belongs to a specific country, any sub-division can be
related to any country, so a user may erroneously say that Alberta is a French
region. Not only is this factually incorrect, but structurally too: Alberta,
being a Canadian province, doesn't have the region code property, which I
want French regions to have.
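
The check I have in mind can be sketched as follows. This is only an illustration under assumptions: RestrictionSketch and its methods are hypothetical and do not use the actual meta model API. Each sub-division class carries a restriction naming the one country class its instances may point to.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: plain maps stand in for the meta model graph.
final class RestrictionSketch {

    private final Map<String, String> classOfInstance = new HashMap<>(); // node -> class
    private final Map<String, String> requiredRange = new HashMap<>();   // sub-division class -> country class

    void addInstance(String node, String cls) { classOfInstance.put(node, cls); }

    /** Restricts is-subdivision-of for one sub-division class to one country class. */
    void restrict(String subdivisionClass, String countryClass) {
        requiredRange.put(subdivisionClass, countryClass);
    }

    /** True only if the country node is an instance of the class the restriction names. */
    boolean mayLink(String subdivisionNode, String countryNode) {
        String required = requiredRange.get(classOfInstance.get(subdivisionNode));
        return required != null && required.equals(classOfInstance.get(countryNode));
    }
}
```

Because the range of the restriction is a class, the node France has to be an instance of a class France for the check to have something to compare against.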


 Date: Thu, 8 Apr 2010 20:06:32 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo] meta meta classes
 
 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
 
 
  Your point about the cardinality restriction is a correct observation.
 
  In fact it would be better to create a is-subdivision-of PropertyType on
  sub-division and give that a range country with a cardinality of 1. Then
  for each subclass of sub-division a restriction should be set, naming the
  country class this specific sub-division class applies to.
 
  Still, it requires  each country to be defined as both a class and an
  instance.
 
 Why just countries as classes? why not each subdivision as classes as well?
 Why countries as classes at all?
 
 

Re: [Neo] Traversers in the REST API

2010-04-08 Thread Alastair James
What I want to avoid
is keeping state on the server while waiting for the client to request the
next page.

You are quite right. However, I think for many use cases (e.g. generating a
paginated list of results on a webpage) it would not be necessary to store
state on the server.

Keeping state would be more like a SQL cursor; what I am talking about is
simply the equivalent of SQL LIMIT, OFFSET and ORDER BY.
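
As a rough sketch of what that second stage could do (PagingSketch and page are hypothetical names, and this says nothing about how the REST API is actually built): every request carries its own ordering, offset and limit, so nothing has to be remembered between requests.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: applied to the traversal result just before output encoding.
final class PagingSketch {

    /** ORDER BY, OFFSET and LIMIT applied to one request's results; no server-side state. */
    static <T> List<T> page(List<T> results, Comparator<? super T> order, int offset, int limit) {
        List<T> sorted = new ArrayList<>(results);
        sorted.sort(order);
        int from = Math.min(Math.max(offset, 0), sorted.size());
        // long arithmetic avoids overflow when a client sends a very large limit
        int to = (int) Math.min((long) from + Math.max(limit, 0), sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }
}
```

The cost is that the full result is traversed and sorted for every page, which is the trade-off against holding a cursor open on the server.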

Cheers

Al

On 8 April 2010 17:23, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote:

 What I want to avoid
 is keeping state on the server while waiting for the client to request the
 next page.




-- 
Dr Alastair James
CTO James Publishing Ltd.
http://www.linkedin.com/pub/3/914/163

www.worldreviewer.com

WINNER Travolution Awards Best Travel Information Website 2009
WINNER IRHAS Awards, Los Angeles, Best Travel Website 2008
WINNER Travolution Awards Best New Online Travel Company 2008
WINNER Travel Weekly Magellan Award 2008
WINNER Yahoo! Finds of the Year 2007

Noli nothis permittere te terere!


Re: [Neo] Traversers in the REST API

2010-04-08 Thread Michael Ludwig
Tobias Ivarsson schrieb am 08.04.2010 um 18:23:27 (+0200)
[Re: [Neo] Traversers in the REST API]:

 On Wed, Apr 7, 2010 at 3:05 PM, Alastair James al.ja...@gmail.com
 wrote:

  when we start talking about returning 1000s of nodes in JSON over
  HTTP just to get the first 10 this is clearly sub-optimal (as I
  build websites this is a very common use case). So, as you say,
  sorting and limiting can wait, but I suspect the HTTP API would
  benefit from offering it. Limiting need not require changes to the
  core API, it could be implemented as a second stage in the HTTP API
  code prior to output encoding.
 
 For paging / limiting: yes, you are absolutely right, this would not
 affect the core API at all, only the REST API. Limiting/paging is
 something we would probably add to the REST API before sorting.

Limiting and paging usually go hand in hand with sorting, in my
experience. Why would anyone want to page through an unsorted
collection?

 Sorting might be a similar case, but I still think the client would be
 better fitted to do sorting well.

The server has indexes to support the sorting. (If it doesn't, it has a
problem anyway.) What does the client have to support sorting? So how
would it be better fitted to do sorting well?

 But once paging / limiting is added it would be quite natural / useful
 to add sorting as well. What I want to avoid is keeping state on the
 server while waiting for the client to request the next page.

If you ensure a binary tree index is used to do the sorting, you should
be fine.
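
A minimal sketch of that idea, assuming unique sort keys and using java.util.TreeMap (a red-black tree) as a stand-in for whatever index the server really has; IndexPagingSketch and pageAfter are hypothetical names. The client sends the last key it saw, and the server seeks straight to the following entry, so no state is kept between requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch: a sorted in-memory index from sort key to node id.
final class IndexPagingSketch {

    private final TreeMap<String, Long> index = new TreeMap<>();

    void put(String sortKey, long nodeId) { index.put(sortKey, nodeId); }

    /** Returns up to limit node ids whose keys sort strictly after lastKey (null = from the start). */
    List<Long> pageAfter(String lastKey, int limit) {
        Map<String, Long> rest = (lastKey == null) ? index : index.tailMap(lastKey, false);
        List<Long> page = new ArrayList<>();
        for (Long id : rest.values()) {   // values() iterates in key order
            if (page.size() >= limit) break;
            page.add(id);
        }
        return page;
    }
}
```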

-- 
Michael Ludwig


Re: [Neo] meta meta classes

2010-04-08 Thread Mattias Persson
Ok now I get your point! Thank you for clarifying.

Your singleton proposal could be a good idea then. Could it
potentially be a hindrance in some scenario? I mean should we have a
MetaModelClass#setSingleton(boolean) or something so that this
behaviour can be controlled?

2010/4/8, Niels Hoogeveen pd_aficion...@hotmail.com:

 Each country needs to be modeled as classes, because I want to set the
 restriction that French regions (which can have different properties from
 Canadian provinces) can only have a relationship with the country France,
 and Canadian provinces can only have a relationship with the country Canada.
 The domain and the range of a PropertyType are classes not instances. If
 countries were simply instances of the  country class, it would be
 possible to say that an instance of a Canadian province is a subdivision of
 France.

 I'd like to be able to iterate over the subdivision of France and have
 guaranteed that each instance has the property region code, a property
 unknown to Canadian provinces. Without having a restriction stating that a
 specific sub-division belongs to a specific country, any sub-division can be
 related to any country, so a user may erroneously say that Alberta is a
 French region. Not only is this factually incorrect, but structurally too.
 Alberta, being a Canadian province, doesn't have the region code property,
 which I want French regions to have.


 Date: Thu, 8 Apr 2010 20:06:32 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo] meta meta classes

 2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com

 
  Your point about the cardinality restriction is a correct observation.
 
  In fact it would be better to create a is-subdivision-of PropertyType
  on
  sub-division and give that a range country with a cardinality of 1.
  Then
  for each subclass of sub-division a restriction should be set, naming
  the
  country class this specific sub-division class applies to.
 
  Still, it requires  each country to be defined as both a class and an
  instance.
 
 Why just countries as classes? why not each subdivision as classes as
 well?
 Why countries as classes at all?

 

Re: [Neo] Date effectiveness (Time Variance) implementation in Neo4J

2010-04-08 Thread Michael Ludwig
suryadev vasudev schrieb am 06.04.2010 um 23:26:35 (-0700)
[[Neo] Date effectiveness (Time Variance) implementation in Neo4J]:

 We are exploring Neo4J for a resource management application.

  [ straightforward requirements list without
any discernible graph specifics snipped ]

 In Neo4J, we created Library, Book-Club, Publisher, Student and Books.
 We are finding it difficult to implement the time variance.

Oh, that ...

 The business requirements are:-
 1. The book publisher can lease books till his end registering date
 2. Publisher can specify lease start date and end date for each book
 3. Do not lend beyond end leasing date
 4. Do not lend beyond end membership date
 5. Query Student-book relationships (What books were borrowed/
 reserved, who was the publisher, what was the book club) for a given
 date range
 
 How do we model the date in Neo4J?

Heretical counter-question:

Why model the date in Neo4J if any SQL database provides full-spectrum
date-time functionality?
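
Whichever store is used, requirements 3 to 5 come down to interval comparisons. A minimal sketch with java.time, where LendingSketch and its parameter names are assumptions made for illustration (in a graph, the dates could for instance be kept as properties on the relationships involved):

```java
import java.time.LocalDate;

// Hypothetical sketch: the date rules from the requirements as plain interval checks.
final class LendingSketch {

    /** True if day lies within [start, end], both inclusive. */
    static boolean within(LocalDate day, LocalDate start, LocalDate end) {
        return !day.isBefore(start) && !day.isAfter(end);
    }

    /** Requirements 3 and 4: a loan may not run past the lease end or the membership end. */
    static boolean mayLend(LocalDate from, LocalDate until,
                           LocalDate leaseStart, LocalDate leaseEnd, LocalDate membershipEnd) {
        return !until.isBefore(from)
                && within(from, leaseStart, leaseEnd)
                && !until.isAfter(leaseEnd)
                && !until.isAfter(membershipEnd);
    }

    /** Requirement 5: does a loan period intersect the queried date range? */
    static boolean overlaps(LocalDate aStart, LocalDate aEnd, LocalDate bStart, LocalDate bEnd) {
        return !aStart.isAfter(bEnd) && !bStart.isAfter(aEnd);
    }
}
```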

-- 
Michael Ludwig


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
Alastair James schrieb am 07.04.2010 um 15:53:50 (+0100)
[[Neo] How to efficiently query in Neo4J?]:

 Briefly, the site consists of posts, each tagged with various
 attributes, e.g. (it's a travel site) location, theme, cost etc... Also
 the tags are hierarchical. So, for location we have (say) 'tuscany'
 inside 'italy' inside 'europe'. For theme we have (say) 'cycling'
 inside 'activity'.

After giving this some thought, it looks to me as if there is nothing
particularly graphy in your example. I know, most everything is a graph,
but here the data is more regular: Your hierarchical catalog of tags
immediately made me think of Joe Celko's nested sets, which is a very
efficient way to represent trees in terms of sets, as found in SQL
databases. (Heresy again, I know, but well.) And the relationship of
posts to tags is simply N-M, and that's it.
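
As a rough illustration of the nested-set idea (a minimal sketch in Python rather than SQL; the artificial "root" node, the sample posts and all function names are made up for this example): each tag gets a left and a right number from a depth-first walk, and a tag's descendants are exactly the tags whose numbers fall between its own.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical tag hierarchy (parent -> children), based on the quoted example.
TAG_TREE: Dict[str, List[str]] = {
    "root": ["europe", "activity"],
    "europe": ["italy"],
    "italy": ["tuscany"],
    "tuscany": [],
    "activity": ["cycling"],
    "cycling": [],
}

# Hypothetical N-M relationship between posts and tags.
POST_TAGS: Dict[str, Set[str]] = {
    "post-1": {"tuscany", "cycling"},
    "post-2": {"italy"},
    "post-3": {"cycling"},
}


def build_nested_sets(tree: Dict[str, List[str]], root: str) -> Dict[str, Tuple[int, int]]:
    """Assign (left, right) numbers to every node by a depth-first walk."""
    bounds: Dict[str, Tuple[int, int]] = {}
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        counter += 1
        left = counter
        for child in tree.get(node, []):
            visit(child)
        counter += 1
        bounds[node] = (left, counter)

    visit(root)
    return bounds


def descendants(bounds: Dict[str, Tuple[int, int]], tag: str) -> List[str]:
    """All tags strictly inside the (left, right) interval of `tag`."""
    left, right = bounds[tag]
    return sorted(t for t, (l, r) in bounds.items() if left < l and r < right)


def posts_under(bounds: Dict[str, Tuple[int, int]], tag: str) -> List[str]:
    """Posts carrying `tag` itself or any tag below it in the hierarchy."""
    left, right = bounds[tag]
    return sorted(
        post
        for post, tags in POST_TAGS.items()
        if any(left <= bounds[t][0] and bounds[t][1] <= right for t in tags)
    )


BOUNDS = build_nested_sets(TAG_TREE, "root")
print(descendants(BOUNDS, "europe"))  # ['italy', 'tuscany']
print(posts_under(BOUNDS, "europe"))  # ['post-1', 'post-2']
```

Reads need only interval comparisons and no recursion; the price is that inserting or moving a tag means renumbering part of the tree.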

There aren't any real links (edges) between posts, which arguably would
make your data model more graphy. In your model, related posts are
related by virtue of their attributes (they share some tags, or are
posted by the same user), and not eis ipsis. So I'd say there is not
much in the way of graphiness.

-- 
Michael Ludwig


Re: [Neo] meta meta classes

2010-04-08 Thread Niels Hoogeveen

I think the best solution here is to have an instance enumeration on
MetaModelClass. Singletons are a special case of an enumeration.

See: 
http://www.w3.org/TR/2002/WD-owl-ref-20021112/#Enumerated 
http://owl.cs.manchester.ac.uk/2007/05/api/javadoc/org/semanticweb/owl/model/OWLObjectOneOf.html
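
To make the idea concrete, here is a minimal sketch in Python, assuming a class can be defined purely by listing its permitted instances. None of these names belong to the Neo4j meta-model API; EnumeratedClass, Restriction and is_allowed are hypothetical and exist only for this illustration. A singleton is then simply an enumeration with one member, and a restriction between two classes rejects "Alberta is a subdivision of France".

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class EnumeratedClass:
    """A class defined by enumerating its instances (compare OWL oneOf)."""
    name: str
    members: FrozenSet[str]

    def has_instance(self, instance: str) -> bool:
        return instance in self.members

    @property
    def is_singleton(self) -> bool:
        # A singleton is the special case of exactly one enumerated instance.
        return len(self.members) == 1


@dataclass(frozen=True)
class Restriction:
    """Limits a property to a given domain class and range class."""
    property_name: str
    domain_class: EnumeratedClass
    range_class: EnumeratedClass

    def allows(self, subject: str, target: str) -> bool:
        return (self.domain_class.has_instance(subject)
                and self.range_class.has_instance(target))


def is_allowed(restrictions: List[Restriction], property_name: str,
               subject: str, target: str) -> bool:
    # A statement is valid if at least one restriction on that property admits it.
    return any(r.property_name == property_name and r.allows(subject, target)
               for r in restrictions)


# Sample data for illustration; regions and provinces are enumerated only for brevity.
FRANCE = EnumeratedClass("France", frozenset({"France"}))
CANADA = EnumeratedClass("Canada", frozenset({"Canada"}))
FRENCH_REGION = EnumeratedClass("french_region", frozenset({"Alsace"}))
CANADIAN_PROVINCE = EnumeratedClass("canadian_province", frozenset({"Alberta"}))

RESTRICTIONS = [
    Restriction("is-subdivision-of", FRENCH_REGION, FRANCE),
    Restriction("is-subdivision-of", CANADIAN_PROVINCE, CANADA),
]

print(is_allowed(RESTRICTIONS, "is-subdivision-of", "Alsace", "France"))   # True
print(is_allowed(RESTRICTIONS, "is-subdivision-of", "Alberta", "France"))  # False
```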



 Date: Thu, 8 Apr 2010 22:33:11 +0200
 From: matt...@neotechnology.com
 To: user@lists.neo4j.org
 Subject: Re: [Neo] meta meta classes
 
 Ok now I get your point! Thank you for clarifying.
 
 Your singleton proposal could be a good idea then. Could it
 potentially be a hindrance in some scenario? I mean should we have a
 MetaModelClass#setSingleton(boolean) or something so that this
 behaviour can be controlled?
 
 2010/4/8, Niels Hoogeveen pd_aficion...@hotmail.com:
 
  Each country needs to be modeled as a class, because I want to set the
  restriction that French regions (which can have different properties from
  Canadian provinces) can only have a relationship with the country France,
  and Canadian provinces can only have a relationship with the country Canada.
  The domain and the range of a PropertyType are classes, not instances. If
  countries were simply instances of the country class, it would be
  possible to say that an instance of a Canadian province is a subdivision of
  France.
 
  I'd like to be able to iterate over the subdivisions of France and have it
  guaranteed that each instance has the property region code, a property
  unknown to Canadian provinces. Without a restriction stating that a
  specific sub-division belongs to a specific country, any sub-division can be
  related to any country, so a user may erroneously say that Alberta is a
  French region. Not only is this factually incorrect, but structurally too.
  Alberta, being a Canadian province, doesn't have the region code property,
  which I want French regions to have.
 
 
  Date: Thu, 8 Apr 2010 20:06:32 +0200
  From: matt...@neotechnology.com
  To: user@lists.neo4j.org
  Subject: Re: [Neo] meta meta classes
 
  2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
 
  
   Your point about the cardinality restriction is a correct observation.
  
   In fact it would be better to create an is-subdivision-of PropertyType on
   sub-division and give that a range country with a cardinality of 1. Then
   for each subclass of sub-division a restriction should be set, naming the
   country class this specific sub-division class applies to.
  
   Still, it requires each country to be defined as both a class and an
   instance.
  
  Why just countries as classes? Why not each subdivision as classes as
  well? Why countries as classes at all?
 
  
Date: Thu, 8 Apr 2010 19:15:30 +0200
From: matt...@neotechnology.com
To: user@lists.neo4j.org
Subject: Re: [Neo] meta meta classes
   
So, you describe the model (with country, sub-division and
has-sub-division), which is OK! Then you not only want to add data which
would conform to it, you also describe the highest level of that data in the
meta model itself, with cardinality for how many sub-divisions each such
level must contain.

My first thought here is: why put actual data into the meta model (the design
isn't intended for that)?
The second is: why (since you put actual data into the meta model) would you
stop there? Why only say that France has 10 subdivisions? You don't say
exactly which subdivisions, or how many/which subdivisions each subdivision
has, and so on. Which benefits would modeling only the highest level of data
get you? And if you could describe the entire data in the meta model, you end
up with a meta model which describes the entire data and then, in addition,
the data which is exactly the same as the meta model... what benefits does
that give you?

I'm just confused about the fact that you want to have the meta model
(classes, properties, restrictions) and (some of) the actual data modeled in
the meta model, whereas the meta model was intended to model the meta model
(the UML diagram, so to speak) and not the actual data that would conform
to it.

Help me understand which benefits you're after by modeling the top level
of your actual data into the meta model itself.
   
   
Best,
Mattias
   
2010/4/8 Niels Hoogeveen pd_aficion...@hotmail.com
   

 The examples of the tag library and countries/sub-divisions are not
 necessarily similar. The first shows the need to model the properties of a
 class.

 The second example shows the need to have singleton classes, which is a
 different concept, and something that cannot be done out of the box, but I
 will show a solution that requires minimal modifications to the current
 software.

 Suppose we want to model countries and their sub-divisions.

 We do the 

Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
Max De Marzi Jr. wrote on 08.04.2010 at 16:48:18 (-0500)
[Re: [Neo] How to efficiently query in Neo4J?]:

 You know this is something that I think needs to be made clear...
 using just the graph is not the right way to go unless you have a very
 special application.

 Some things are better not done in the graph.  So I decided to keep
 that in tables, and just move the person relationships to the graph
 (works with, manages, knows, friends, etc).
 
 I treat the graph like a specialized index. Makes a lot more sense
 now, and I get the best of both worlds.

Exactly what I think. An iterable index, and a great one for the kind of
graphy queries that cannot be done efficiently using sets and joins.

Any thoughts on what constitutes *graphiness*, if I may venture this
term?

-- 
Michael Ludwig


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread rick . bullotta
   As always, it really isn't that simple.  Comparing cold queries is
   probably not a good indicator of steady-state performance, since
   RDBMSs and graph DBs have different models for file system access and
   caching.  Even different RDBMSs behave dramatically differently
   on common queries (ever try to use MySQL for set operations? Yuck.).
   Factor in a wide range of SLAs needed for performance vs
   availability vs affordability vs scalability vs administration costs,
   and the equation gets a whole lot more complicated.



   I'm sure there's a graphy model for the tag/post example that could be
   made smoking fast with Neo also.  Throw columnar storage, key-value,
   and document DBs into the mix, and the good news is that we have a lot
   of weapons in our arsenal now to tackle very demanding and diverse
   application challenges!







Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Alastair James
Hi...

On 8 April 2010 22:35, Michael Ludwig mil...@gmx.de wrote:

 After giving this some thought, it looks to me as if there is nothing
 particularly graphy in your example. I know, most everything is a graph,
 but here the data is more regular: Your hierarchical catalog of tags
 immediately made me think of Joe Celko's nested sets, which is a very
 efficient way to represent trees in terms of sets, as found in SQL
 databases. (Heresy again, I know, but well.) And the relationship of
 posts to tags is simply N-M, and that's it.


We are currently using something similar to model this in SQL. However,
having to maintain the nested set model is quite complex and something I
really want to avoid in user code.


 There aren't any real links (edges) between posts, which arguably would
 make your data model more graphy. In your model, related posts are
 related by virtue of their attributes (they share some tags, or are
 posted by the same user), and not eis ipsis. So I'd say there is not
 much in the way of graphiness.


It was a simplified example; in reality there are relations between posts,
between posts and authors, between tags and tags, etc.

It is exactly because we want 'anything to be relatable to anything' that
the graph database model works so well.

Al


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Alastair James

 As always, it really isn't that simple.  Comparing cold queries is
   probably not a good indicator of steady state performance, since
   RDBMS's and Graph DB's have different models for file system access and
   caching.  Even different RDBMS's have dramatically different behaviors
   in common queries (ever try to use MySQL for set operations - yuck.).
   Factor in a wide range of SLAs needed for performance vs
   availability vs affordability vs scalability vs administration costs,
   and the equation gets a whole lot more complicated.


Exactly.

From experience, it's possible to build a post/tag system in SQL that performs
very well. However, the SQL model is inherently less flexible than the graph
database model (what if I want to introduce a new relationship type? In a
traditional SQL schema that would require new join tables, etc.).


   I'm sure there's a graphy-model for the tag/post example that could be
   made smoking fast with Neo also.


Hopefully! I suppose my question is this: I would like to be able to harness a
graph database to give flexibility and eloquence to our data model. However,
can I query it efficiently without domain-specific hacks and extra layers of
code?

Al









-- 
Dr Alastair James
CTO James Publishing Ltd.
http://www.linkedin.com/pub/3/914/163

www.worldreviewer.com

WINNER Travolution Awards Best Travel Information Website 2009
WINNER IRHAS Awards, Los Angeles, Best Travel Website 2008
WINNER Travolution Awards Best New Online Travel Company 2008
WINNER Travel Weekly Magellan Award 2008
WINNER Yahoo! Finds of the Year 2007

Noli nothis permittere te terere!


Re: [Neo] Traversers in the REST API

2010-04-08 Thread Alastair James
On 8 April 2010 21:17, Michael Ludwig mil...@gmx.de wrote:

 Limiting and paging usually go hand in hand with sorting, in my
 experience. Why would anyone want to page through an unsorted
 collection?


It's quite possible that you might want the nodes in the order they were
found (e.g. the closest matching nodes first). However, I agree, sorting by
an arbitrary property is very useful!
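
A minimal sketch of the difference, in plain Python rather than the REST API (the toy graph, the rating property and all function names are assumptions made for this example): paging in discovery order can stop the traversal early, whereas paging by a property needs the complete result before the first page can be returned.

```python
from collections import deque
from itertools import islice
from typing import Dict, Iterable, Iterator, List

# Hypothetical toy graph (adjacency lists) and a hypothetical node property.
GRAPH: Dict[str, List[str]] = {
    "start": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": [],
    "d": [],
}
RATING: Dict[str, int] = {"a": 3, "b": 5, "c": 1, "d": 4}


def breadth_first(graph: Dict[str, List[str]], start: str) -> Iterator[str]:
    """Yield reachable nodes lazily, closest to `start` first."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
                yield neighbour


def page(items: Iterable[str], offset: int, limit: int) -> List[str]:
    """Return one page; consumes only as much of `items` as it needs."""
    return list(islice(items, offset, offset + limit))


# Discovery order: the traversal stops as soon as the page is full.
print(page(breadth_first(GRAPH, "start"), 0, 2))  # ['a', 'b']

# Sorted by property: the whole result must be collected before paging.
by_rating = sorted(breadth_first(GRAPH, "start"), key=RATING.__getitem__, reverse=True)
print(page(by_rating, 0, 2))  # ['b', 'd']
```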

Al


Re: [Neo] How to efficiently query in Neo4J?

2010-04-08 Thread Michael Ludwig
rick.bullotta wrote on 08.04.2010 at 15:16:11 (-0700)
[Re: [Neo] How to efficiently query in Neo4J?]:

 Factor in a wide range of SLAs needed for performance vs availability
 vs affordability vs scalability vs administration costs, and the
 equation gets a whole lot more complicated.

Granted.

 I'm sure there's a graphy-model for the tag/post example that could be
 made smoking fast with Neo also.

Sure, but there's also a way of looking at screws that might suggest you
should use a hammer ;-) and it would be wrong. Which doesn't mean it
couldn't be modeled for the tag/post example - just a general caveat to
think about both tools and problems when trying to find a good solution.

 Throw columnar storage, key-value, and document DB's into the mix, and
 the good news is that we have a lot of weapons in our arsenal now to
 tackle very demanding and diverse application challenges!

Yes, it's becoming very interesting. Lots of new high-level tools for
specialized or relaxed requirements.

SQL won't be dethroned, though.
-- 
Michael Ludwig