Re: [Neo4j] Node.hasRelationship question
2010/10/27, Walaa Eldin Moustafa wa.moust...@gmail.com:

I am looking for a method that, given a node n1, can answer whether n1 has a relationship of type t with node n2, something along the lines of:

    boolean b = n1.hasRelationship( n2, t );

I know we can answer this question by iterating over n1's getRelationships() and testing them. However, I was looking for something more efficient that does not have to iterate over the node's relationships, especially since I noticed that the RelationshipIndex interface has a get() method that does something like that; however, we have to supply it with a key and a value for an attribute of the edge between the two nodes. So I was looking for something like this method, but without having to pass a key/value pair.

You can actually leave out key/value (pass in null) if you pass in start/end node, or you can index the relationship type as a key/value pair for each relationship and then do:

    get( "type", "knows", person1, person2 );

Would that help?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
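For reference, the iteration-based check mentioned in the question could look roughly like the sketch below. It is only an illustration: GraphNode and Rel are hypothetical stand-ins for Neo4j's Node and Relationship so that the example runs without the library. With the real API the loop would go over n1.getRelationships( type ) and compare getOtherNode( n1 ) with n2.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Neo4j Node, for illustration only.
final class GraphNode {
    final String name;
    final List<Rel> rels = new ArrayList<>();

    GraphNode(String name) { this.name = name; }

    // Creates a relationship and registers it on both end nodes.
    Rel connectTo(GraphNode other, String type) {
        Rel r = new Rel(this, other, type);
        rels.add(r);
        other.rels.add(r);
        return r;
    }
}

// Hypothetical stand-in for a Neo4j Relationship.
final class Rel {
    final GraphNode start;
    final GraphNode end;
    final String type;

    Rel(GraphNode start, GraphNode end, String type) {
        this.start = start;
        this.end = end;
        this.type = type;
    }

    GraphNode otherNode(GraphNode n) { return n == start ? end : start; }
}

final class RelationshipCheck {
    // Linear scan over n1's relationships, ignoring direction.
    // The cost grows with the number of relationships attached to n1.
    static boolean hasRelationship(GraphNode n1, GraphNode n2, String type) {
        for (Rel r : n1.rels) {
            if (r.type.equals(type) && r.otherNode(n1) == n2) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        GraphNode a = new GraphNode("a");
        GraphNode b = new GraphNode("b");
        GraphNode c = new GraphNode("c");
        a.connectTo(b, "KNOWS");
        System.out.println(hasRelationship(a, b, "KNOWS"));
        System.out.println(hasRelationship(a, c, "KNOWS"));
    }
}
```

The scan is fine for nodes with few relationships; the index lookup suggested in the reply is aimed at nodes where iterating over every relationship becomes too slow.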
Re: [Neo4j] BatchInserter - No index provider 'null' error
Ah, it's a bug all right... but only when using the BatchInserterIndex. The problem goes away if you add provider:lucene to the configuration map, like:

    persons = indexProvider.nodeIndex( "persons", MapUtil.stringMap( "type", "exact", "provider", "lucene" ) );

It will be fixed in M03, where the LuceneBatchInserterIndexProvider fills that in for you.

2010/10/26 Peter Neubauer peter.neuba...@neotechnology.com

Paddy, just tested it, and it seems it only occurs when running standalone, see the other mail. Seems to be a bug, and I think it is linked to the behavior seen in the Python bindings, too.

Cheers, /peter neubauer
VP Product Management, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer
http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Tue, Oct 26, 2010 at 3:56 AM, Paddy paddyf...@gmail.com wrote:

Hi, I'm trying out the new Index Framework Batch Inserter using version 1.2.M02. I'm getting a "No index provider 'null' found" error when calling graphDb.index().forNodes( "persons" ) in my test code below:

    Exception in thread "main" java.lang.IllegalArgumentException: No index provider 'null' found
        at org.neo4j.kernel.IndexManagerImpl.getIndexProvider(IndexManagerImpl.java:65)
        at org.neo4j.kernel.IndexManagerImpl.forNodes(IndexManagerImpl.java:205)
        at ie.transportdublin.batchinserter.BatchIndexTest.main(BatchIndexTest.java:40)

From debugging it looks like config.get( KEY_INDEX_PROVIDER ) at line 205 in IndexManagerImpl is returning a null value. Or please let me know if I should be doing this a different way.

thanks
Paddy

    public class BatchIndexTest {
        static final String DB_PATH = "data/neo-db-test10";
        static BatchInserterIndexProvider indexProvider;
        static BatchInserter inserter;
        static BatchInserterIndex persons;
        static GraphDatabaseService graphDb;

        public static void main(String[] args) {
            // Phase 1: batch insert one node and add it to the "persons" index.
            inserter = new BatchInserterImpl(DB_PATH);
            indexProvider = new LuceneBatchInserterIndexProvider(inserter);
            persons = indexProvider.nodeIndex("persons", MapUtil.stringMap("type", "exact"));
            Map<String, Object> properties = MapUtil.map("name", "test");
            long node = inserter.createNode(properties);
            persons.add(node, properties);
            indexProvider.shutdown();
            inserter.shutdown();

            // Phase 2: reopen the same store as an embedded database and look up the index.
            graphDb = new EmbeddedGraphDatabase(DB_PATH);
            Transaction tx = graphDb.beginTx();
            try {
                IndexManager indexManager = graphDb.index();
                System.out.println(indexManager.forRelationships("persons"));
                System.out.println(indexManager.existsForNodes("persons")); // returns true
                // Getting an error with the following code
                System.out.println(indexManager.forNodes("persons"));
                tx.success();
            } finally {
                tx.finish();
                graphDb.shutdown();
            }
        }
    }

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Lucene index provider not found in certain cases
(As I answered in another thread) This is a bug in the LuceneBatchInserterIndexProvider class where it doesn't automatically fill in the provider parameter. It has been fixed now, so until M03 send that in yourself... just like the tests do. I'm not sure this is connected to the Python problems, since the Python bindings haven't been updated to use the new index API, and is batch insertion even possible in neo4j.py? I think it's not.

2010/10/26 Peter Neubauer peter.neuba...@neotechnology.com

Hi all, there are two cases (from Python and Paddy's code below) where a created index is not found. I have been trying to track it down, but cannot reproduce it using the code at https://trac.neo4j.org/browser/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneBatchInsert.java#L178

Somehow the index config in the main() code is getting result = ({type=exact}, false) back from IndexManagerImpl.getOrCreateIndexConfig():

    Pair<Map<String, String>, Boolean> result = findIndexConfig( cls, indexName, suppliedConfig, graphDbImpl.getConfig().getParams() );

when before (upon the first creation) it was ({provider=lucene, type=exact}, false). This is not happening in the unit test. Also, this might be linked to the Python index issue mentioned earlier on the list?

    package org.neo4j;

    import java.util.Map;

    import org.neo4j.graphdb.GraphDatabaseService;
    import org.neo4j.graphdb.Node;
    import org.neo4j.graphdb.Transaction;
    import org.neo4j.graphdb.index.BatchInserterIndex;
    import org.neo4j.graphdb.index.BatchInserterIndexProvider;
    import org.neo4j.graphdb.index.Index;
    import org.neo4j.graphdb.index.IndexManager;
    import org.neo4j.helpers.collection.MapUtil;
    import org.neo4j.index.impl.lucene.LuceneBatchInserterIndexProvider;
    import org.neo4j.kernel.EmbeddedGraphDatabase;
    import org.neo4j.kernel.impl.batchinsert.BatchInserter;
    import org.neo4j.kernel.impl.batchinsert.BatchInserterImpl;

    public class BatchIndexTest
    {
        static final String DB_PATH = "target/db";
        static BatchInserterIndexProvider indexProvider;
        static BatchInserter inserter;
        static BatchInserterIndex persons;
        static GraphDatabaseService graphDb;

        public static void main( String[] args )
        {
            // Batch insert one node and index it in "persons" (no provider given).
            inserter = new BatchInserterImpl( DB_PATH );
            indexProvider = new LuceneBatchInserterIndexProvider( inserter );
            persons = indexProvider.nodeIndex( "persons", MapUtil.stringMap( "type", "exact" ) );
            Map<String, Object> properties = MapUtil.map( "name", "test" );
            long node = inserter.createNode( properties );
            persons.add( node, properties );
            indexProvider.shutdown();
            inserter.shutdown();

            // Reopen the store as an embedded database and request the index.
            graphDb = new EmbeddedGraphDatabase( DB_PATH );
            Transaction tx = graphDb.beginTx();
            try
            {
                IndexManager indexManager = graphDb.index();
                System.out.println( indexManager.forRelationships( "persons" ) );
                System.out.println( indexManager.existsForNodes( "persons" ) ); // returns true
                // Getting an error with the following code
                Index<Node> nodes = indexManager.forNodes( "persons" );
                System.out.println( nodes );
                tx.success();
            }
            finally
            {
                tx.finish();
                graphDb.shutdown();
            }
        }
    }

Cheers, /peter neubauer

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Determining the indices associated with a graph.
It could be added... what exactly do you need it for? Is there a use case for it? Do you want the Index instances, or just the names/configurations?

2010/10/26 Marko Rodriguez okramma...@gmail.com

Hello, How do you determine what indices are associated with a graph? E.g. something along the lines of graph.index().getAllIndices()?

Thank you, Marko. http://markorodriguez.com

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Deployment Scenarios
Good visualization! One thing: in scenario #7 I'd like the App to access the User REST directly in addition to (or instead of) just the REST Servlet. Because that's what your app would do, go directly through that User REST interface. Would you agree?

2010/10/25 Andreas Kollegger andreas.kolleg...@neotechnology.com

Great catch, we should definitely capture that one. Attached...

On Oct 25, 2010, at 4:41 PM, Jan Boonen wrote:

Having read the discussion about an in-memory version of neo4j for unit testing last week, I'd propose to add that scenario as well. As far as I understood it's almost the same scenario as #1, but I would be quicker to try neo4j if such a scenario were documented.

Cheers, Jan

On 25-10-2010 15:55, Andreas Kollegger wrote:

Hi all, I've been sketching out various deployment scenarios for Neo4j, and have come up with the attached list of diagrams. Targeted for this next milestone release is packaging up scenario #7 into a standalone server install, with a bonus mechanism for adding extensions to the REST API. Does this set provide a decent base set of use cases? Are there more elaborate or simply different scenarios you'd like to see supported?

Cheers, Andreas

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs. FULLTEXT_CONFIG
I just found the problem... fulltext defaults to being case-insensitive (by converting added values as well as string queries to lower case). There's a quirk in the Lucene QueryParser where you must specifically set whether or not range/wildcard queries should have their terms converted into lower case, thereby ignoring what the analyzer has to say about it, which I feel is a poor design decision in Lucene. Because now if you specify your own custom analyzer class in the configuration you must also set the to_lower_case parameter to match how the analyzer is implemented, otherwise you cannot expect to get the correct results back.

Anyway, range/wildcard terms are now lower-cased more correctly. Your queries will work with the latest SNAPSHOT, however your second query doesn't look like a proper Lucene query. Maybe you meant Arn* w/o the brackets?

The main difference between exact and fulltext is that a fulltext index tokenizes your values into words and indexes each word individually (and also, by default, converts them into lower case).

2010/10/26 Konstanze.Lorenz konstanze.lor...@fh-zwickau.de

Hello, I'm giving the new LuceneIndexProvider a trial and trying to become acquainted with EXACT_CONFIG and FULLTEXT_CONFIG. Currently, I do not understand some differences between their query results. For example:

    String nameArnold = "Arnold Aronson";
    String key = "name";
    Node nodeArnold = neo.createNode();
    nodeArnold.setProperty( key, nameArnold );
    index.add( nodeArnold, key, nameArnold );
    --
    index.query( key, "[A TO Z]" ); // (1)
    index.query( key, "[Arn*]" );   // (2)

These queries work only with EXACT. FULLTEXT returns no matches. But a RangeQuery (1) for String would be quite interesting with FULLTEXT. It should return at least the same matches as EXACT. Furthermore, the Query Parser Syntax of Lucene should be enabled in FULLTEXT (2).

So here is the question: am I not seeing the trick to using them similarly, or are these configurations as different as they seem to be?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
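To illustrate the exact/fulltext difference described in this thread, here is a deliberately simplified sketch in plain Java. It does not use Lucene or Neo4j: splitting on whitespace stands in for a real analyzer, and the class and method names are made up for the example. It shows why a wildcard term that is not lower-cased finds nothing in a fulltext-style index whose terms were lower-cased at indexing time.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Simplified model of the two index configurations; not the Lucene implementation.
final class ExactVsFulltext {
    // "exact": the whole value is indexed as one unmodified term.
    static List<String> exactTerms(String value) {
        return Collections.singletonList(value);
    }

    // "fulltext": the value is lower-cased and split into words,
    // and each word is indexed as its own term.
    static List<String> fulltextTerms(String value) {
        List<String> terms = new ArrayList<>();
        for (String word : value.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!word.isEmpty()) {
                terms.add(word);
            }
        }
        return terms;
    }

    // A wildcard query like Arn* boils down to a prefix test against the indexed terms.
    static boolean prefixMatches(List<String> terms, String prefix) {
        for (String term : terms) {
            if (term.startsWith(prefix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String name = "Arnold Aronson";
        System.out.println(prefixMatches(exactTerms(name), "Arn"));    // exact index, original case
        System.out.println(prefixMatches(fulltextTerms(name), "Arn")); // fulltext, term not lower-cased
        System.out.println(prefixMatches(fulltextTerms(name), "arn")); // fulltext, term lower-cased
    }
}
```

The middle case corresponds to the reported behaviour: the indexed terms are "arnold" and "aronson", so a prefix of "Arn" cannot match until the query term is lower-cased as well.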
Re: [Neo4j] Determining the indices associated with a graph.
2010/10/26 Elena Pechko elenapec...@gmail.com

Hi, I have a related question: is it possible to determine a list of indexed values?

Regards, Elena

For a specific entity or for a specific index? Short answer: no, not in any efficient way at least... What would be the use case for it?

It could be added... what exactly do you need it for? Is there a use case for it? Do you want the Index instances, or just the names/configurations?

2010/10/26 Marko Rodriguez okramma...@gmail.com

Hello, How do you determine what indices are associated with a graph? E.g. something along the lines of graph.index().getAllIndices()?

Thank you, Marko. http://markorodriguez.com

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Deployment Scenarios
2010/10/26 Andreas Kollegger andreas.kolleg...@neotechnology.com

That could be clarified somehow. The User REST is providing resources that are exposed through the Servlet, so the interaction still happens through the servlet. Perhaps new wording and a dotted line. Um... how's this...

Sure, that'll work.

On Oct 26, 2010, at 12:10 PM, Mattias Persson wrote:

Good visualization! One thing: in scenario #7 I'd like the App to access the User REST directly in addition to (or instead of) just the REST Servlet. Because that's what your app would do, go directly through that User REST interface. Would you agree?

2010/10/25 Andreas Kollegger andreas.kolleg...@neotechnology.com

Great catch, we should definitely capture that one. Attached...

On Oct 25, 2010, at 4:41 PM, Jan Boonen wrote:

Having read the discussion about an in-memory version of neo4j for unit testing last week, I'd propose to add that scenario as well. As far as I understood it's almost the same scenario as #1, but I would be quicker to try neo4j if such a scenario were documented.

Cheers, Jan

On 25-10-2010 15:55, Andreas Kollegger wrote:

Hi all, I've been sketching out various deployment scenarios for Neo4j, and have come up with the attached list of diagrams. Targeted for this next milestone release is packaging up scenario #7 into a standalone server install, with a bonus mechanism for adding extensions to the REST API. Does this set provide a decent base set of use cases? Are there more elaborate or simply different scenarios you'd like to see supported?

Cheers, Andreas

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
[Neo4j] How to create an index
Hi,

I'm a bit torn about one aspect of the new index framework (http://wiki.neo4j.org/content/Index_Framework): index creation. My initial thought was to do creation just like you do with an embedded graph database, i.e. there's no explicit creation phase for it; instead you just instantiate a:

    new EmbeddedGraphDatabase( "my/dir" );

or

    new EmbeddedGraphDatabase( "my/dir", myConfigMap );

and it will be created if it doesn't exist... even taking some parameters from the map and storing them permanently the first time so they cannot be changed as long as the database is there (e.g. string block size and more). It's about usability IMO to not have to do:

    if ( dbDoesntExist( "my/dir" ) ) { createDb( "my/dir" ); }
    else { openDb( "my/dir" ); }

or similar...

Now looking at index creation, it's done in a similar fashion:

    // will create an index "persons" w/ default configuration if it doesn't exist,
    // else it will just return it w/ the config used when creating it.
    graphDb.index().forNodes( "persons" );

    // will create an index "persons" w/ supplied configuration if it doesn't exist.
    // if it does exist then configuration must match the stored config.
    // if it already existed it will be returned w/ the config used when created.
    graphDb.index().forNodes( "persons", myConfigMap );

Is it a bad thing or surprising that index creation happens as a side effect of requesting? Of course it could be. Should there instead be:

    graphDb.index().forNodes( "persons" );
    graphDb.index().createForNodes( "persons", myConfigMap );

so that creation is explicit? Then your code would potentially have to do a:

    Index<Node> personIndex = null;
    if ( indexExists( "persons" ) )
        personIndex = graphDb.index().forNodes( "persons" );
    else
        personIndex = graphDb.index().createForNodes( "persons", myConfigMap );
    doStuffWith( personIndex );

That is, if you don't have a place where such initialization occurs at each startup, making sure all the indexes you need are created if they do not exist.

Input on this, anyone?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
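The get-or-create semantics described in this message (create on first request, return the stored configuration afterwards, reject a supplied configuration that doesn't match) can be modelled in a few lines of plain Java. This is only a sketch to make the behaviour concrete: ToyIndexManager is a hypothetical class, an "index" is reduced to its configuration map, and the default of type=exact is an assumption made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of implicit index creation; not the Neo4j IndexManager.
final class ToyIndexManager {
    // Assumed default configuration, chosen for this example only.
    private static final Map<String, String> DEFAULT_CONFIG =
            Collections.singletonMap("type", "exact");

    // Index name -> configuration stored when the index was first created.
    private final Map<String, Map<String, String>> stored = new HashMap<>();

    boolean existsForNodes(String name) {
        return stored.containsKey(name);
    }

    // Creates the index with the default configuration if it doesn't exist,
    // otherwise returns the configuration it was created with.
    Map<String, String> forNodes(String name) {
        return stored.computeIfAbsent(name, n -> DEFAULT_CONFIG);
    }

    // Creates the index with the supplied configuration if it doesn't exist.
    // If it exists, the supplied configuration must match the stored one.
    Map<String, String> forNodes(String name, Map<String, String> config) {
        Map<String, String> existing = stored.get(name);
        if (existing == null) {
            Map<String, String> copy =
                    Collections.unmodifiableMap(new HashMap<String, String>(config));
            stored.put(name, copy);
            return copy;
        }
        if (!existing.equals(config)) {
            throw new IllegalArgumentException("Supplied config " + config
                    + " doesn't match stored config " + existing
                    + " for index '" + name + "'");
        }
        return existing;
    }

    public static void main(String[] args) {
        ToyIndexManager manager = new ToyIndexManager();
        System.out.println(manager.existsForNodes("persons"));
        System.out.println(manager.forNodes("persons",
                Collections.singletonMap("type", "fulltext")));
        System.out.println(manager.forNodes("persons"));
    }
}
```

With this design the caller never needs an exists/create/open branch; the price is that a typo in an index name silently creates a new index, which is the trade-off the message asks about.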
Re: [Neo4j] [Blueprints] Index API look and feel (RFC)
2010/10/26 Marko Rodriguez okramma...@gmail.com

Hello, The new Neo4j M2 release supports Edge (Relationship) indexing. See: http://components.neo4j.org/neo4j-examples/1.2.M02/apidocs/org/neo4j/graphdb/index/IndexManager.html#existsForNodes%28java.lang.String%29

Yep, the integrated index framework in Neo4j supports compound queries and indexing of relationships (edges). Even more information at http://wiki.neo4j.org/content/Index_Framework

With the next release of Blueprints, Vertex/Node and Edge/Relationship indexing will be possible through Neo4jGraph in Blueprints.

Hope that clears things up, Marko
http://markorodriguez.com http://pipes.tinkerpop.com

On Oct 26, 2010, at 11:19 AM, Walaa Eldin Moustafa wrote:

Thank you for this useful information. I understood that Neo4j implements the Lucene index which supports node indexing only but not edge indexing. Also, from what I understand from the Blueprints API, it can support both node and edge indexing, and it also supports Neo4j as an underlying implementation. Does that mean that by using the Blueprints API, we can have support for Neo4j edge indexing?

Thanks, Walaa.

On Tue, Oct 26, 2010 at 12:47 PM, Marko Rodriguez okramma...@gmail.com wrote:

Hi everyone,

*** I've included the Neo4j users group in the mailing in case they have any thoughts on the matter. ***

So these are the classes currently associated with indexing in the next release of Blueprints [ http://blueprints.tinkerpop.com ]. Much of this was inspired by the multi-indexing framework provided in the latest Neo4j. After a presentation of the Blueprints interfaces, a discussion ensues.

    public interface IndexableGraph extends Graph {
        public <T> Index<T> acquireIndex(String indexName, Class<T> indexClass);
        public Iterable<Index> getIndices();
        public void dropIndex(String indexName);
    }

    public interface Index<T> {
        public String getIndexName();
        public Class<T> getIndexClass();
        public void put(String key, Object value, T object);
        public Iterable<T> get(String key, Object value);
        public void remove(String key, Object value, T object);
    }

    public interface AutomaticPropertyIndex<T> extends Index<T> {
        public void addAutoIndexKey(String key);
        public void removeAutoIndexKey(String key);
        public Set<String> getAutoIndexKeys();
    }

A graph (e.g. Neo4j, OrientDB, TinkerGraph, etc.) can implement IndexableGraph (analogous to TransactionalGraph, but for indices as opposed to transactions). Indices can be created and retrieved (acquireIndex(), getIndices()) and dropped (dropIndex()) through the main graph class. Every index maintains standard get/put/remove methods as well as its String name and the class type that it indexes. For Neo4j, these are <? extends PropertyContainer> type indices. For OrientDB and TinkerGraph, any Java object (serializable for OrientDB) can be stored in an Index.

Finally, to maintain the user-friendly nature of Blueprints, there is an AutomaticPropertyIndex interface that can be used if so desired. This is like TransactionalGraph.Mode.AUTOMATIC, but for automatically indexing vertices/edges for the user. An AutomaticPropertyIndex will automatically maintain an index of your vertices/edges as you add, remove, or update properties on those vertices/edges.

For those familiar with Blueprints 0.2 and the past single-index model, the only difference is that now a graph can have multiple indices. This will incur some minor changes to Gremlin and will be discussed in a later email. I've already implemented this for Neo4j and TinkerGraph to get a feel for the API and so far it's feeling fine. There are a few oddities that need to be thought out (e.g. unique naming for indices, etc.). Thoughts are more than welcome.

Take care, Marko.
http://markorodriguez.com http://gremlin.tinkerpop.com

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
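To get a feel for the put/get/remove contract of the Index interface quoted above, here is a minimal in-memory sketch in plain Java. InMemoryIndex is a hypothetical class written for this illustration; it mirrors the method names from the proposal but does not implement the Blueprints interface, and it is not how Neo4j or TinkerGraph store their indices.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory index with the same method names as the proposed Index<T>.
final class InMemoryIndex<T> {
    private final String name;
    private final Class<T> indexClass;
    // key -> value -> objects indexed under that key/value pair (insertion-ordered)
    private final Map<String, Map<Object, Set<T>>> data = new HashMap<>();

    InMemoryIndex(String name, Class<T> indexClass) {
        this.name = name;
        this.indexClass = indexClass;
    }

    String getIndexName() { return name; }

    Class<T> getIndexClass() { return indexClass; }

    void put(String key, Object value, T object) {
        data.computeIfAbsent(key, k -> new HashMap<>())
            .computeIfAbsent(value, v -> new LinkedHashSet<>())
            .add(object);
    }

    // Returns a snapshot of the matches, or an empty result if nothing is indexed.
    Iterable<T> get(String key, Object value) {
        Map<Object, Set<T>> byValue = data.get(key);
        if (byValue == null) {
            return Collections.emptyList();
        }
        Set<T> hits = byValue.get(value);
        if (hits == null) {
            return Collections.emptyList();
        }
        return new ArrayList<T>(hits);
    }

    void remove(String key, Object value, T object) {
        Map<Object, Set<T>> byValue = data.get(key);
        if (byValue == null) {
            return;
        }
        Set<T> hits = byValue.get(value);
        if (hits != null) {
            hits.remove(object);
        }
    }

    public static void main(String[] args) {
        InMemoryIndex<String> index = new InMemoryIndex<>("people", String.class);
        index.put("name", "marko", "vertex-1");
        index.put("name", "marko", "vertex-2");
        for (String hit : index.get("name", "marko")) {
            System.out.println(hit);
        }
    }
}
```

An AutomaticPropertyIndex in the sense of the proposal would sit on top of something like this, calling put() and remove() whenever a property with one of the auto-index keys changes.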
Re: [Neo4j] Neo4j tutorial
I just looked at http://wiki.neo4j.org/content/Getting_Started_In_One_Minute_Guide and http://wiki.neo4j.org/content/Getting_Started_Guide and they include the tx.success() thingie. Does it need to be clarified somehow, and if so, how?

2010/10/26 Roman Uhlig roman.uh...@maxity.de

Thanks Axel, that did it. Most of the tutorials (especially the getting started guide) don't show this, so I'd recommend including it. It's a bit confusing, especially when you come from the relational database environment or even an ORM, where .commit() / .rollback() usually closes the transaction as well.

Thanks for your help, Roman

-----Original Message-----
From: Roman Uhlig
Sent: Tuesday, 26 October 2010 16:01
To: technik
Subject: Re: Neo4j tutorial

You forgot tx.success() in the try{} block, see http://wiki.neo4j.org/content/Transactions#Controlling_success.

Greetings Axel

On 26.10.2010 11:19, Roman Uhlig wrote:

Hi, I'm completely new to Neo4j and ran into an issue while trying the getting started guide. Every time I close the database (GraphDatabaseService.shutdown()) and open it again later, all data is gone (or was never saved). I did exactly as described in the guide, so maybe someone can point me in the right direction? Probably just some stupid mistake by me. ;)

My tiny test case (Windows, Java 1.6):

Writing data:

    GraphDatabaseService db = new EmbeddedGraphDatabase("/temp/neo4j/test1");
    try {
        org.neo4j.graphdb.Transaction tx = db.beginTx();
        try {
            Node rootNode = db.getReferenceNode();
            Node firstNode = db.createNode();
            rootNode.createRelationshipTo(firstNode, IdabaRelTypes.ROOT_TYPES);
            // tx.success() is missing here, so finish() rolls the changes back
            // (this is the problem pointed out in Axel's reply above)
        } finally {
            tx.finish();
        }
    } finally {
        db.shutdown();
    }

Reading data:

    GraphDatabaseService db = new EmbeddedGraphDatabase("/temp/neo4j/test1");
    try {
        org.neo4j.graphdb.Transaction tx = db.beginTx();
        try {
            Node rootNode = db.getReferenceNode();
            for (Relationship rel : rootNode.getRelationships()) {
                // output test data, but nothing in it
            }
        } finally {
            tx.finish();
        }
    } finally {
        db.shutdown();
    }

Thanks in advance, Roman

-- Axel Morgner Creative Solutions - Software Engineering GUI UX Design - Project Management c/o inxire GmbH Hanauer Landstr. 293a 60314 Frankfurt Germany Phone +49 151 40522060 E-mail axel at morgner.de Web http://morgner.de

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
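The point of confusion in this thread, that finish() only commits when success() was called first, can be shown with a toy model in plain Java. ToyStore and ToyTx are hypothetical classes invented for this sketch; they imitate the success()/finish() pair of the Neo4j Transaction interface but share no code with it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical store that only keeps what a transaction commits.
final class ToyStore {
    final List<String> committed = new ArrayList<>();

    ToyTx beginTx() {
        return new ToyTx(this);
    }
}

// Hypothetical transaction: writes are buffered until finish().
final class ToyTx {
    private final ToyStore store;
    private final List<String> pending = new ArrayList<>();
    private boolean markedSuccessful;

    ToyTx(ToyStore store) {
        this.store = store;
    }

    void write(String value) {
        pending.add(value);
    }

    // Only marks the transaction; nothing is committed yet.
    void success() {
        markedSuccessful = true;
    }

    // Commits if success() was called, otherwise discards the buffered writes.
    void finish() {
        if (markedSuccessful) {
            store.committed.addAll(pending);
        }
        pending.clear();
    }
}

final class TxDemo {
    public static void main(String[] args) {
        ToyStore store = new ToyStore();

        ToyTx forgotten = store.beginTx();
        try {
            forgotten.write("first node");
            // no success() call: the write is rolled back in finish()
        } finally {
            forgotten.finish();
        }
        System.out.println(store.committed.size());

        ToyTx marked = store.beginTx();
        try {
            marked.write("second node");
            marked.success();
        } finally {
            marked.finish();
        }
        System.out.println(store.committed.size());
    }
}
```

Keeping success() as the last statement of the try block means any exception thrown earlier skips it, so the finally block rolls back without extra error handling.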
Re: [Neo4j] Determining the indices associated with a graph.
2010/10/26 Marko Rodriguez okramma...@gmail.com

Hello,

It could be added... what exactly do you need it for? Is there a use case for it?

Let's say I connect to some graph database and need to know what the indices are so I can make intelligent queries. If I don't know what indices exist, then I'm bound to generate new indices (e.g. I called it "people" the first time and "person" the second time). In MySQL, my most used operation when coming into a new database is to see what indices exist so I know what my running time for various queries will be. However, personally, I'm providing Blueprints support for multi-indexes and graph.getIndexNames() is an UnsupportedOperation currently for the Neo4j implementation.

All right, so more from a DBA point of view, so to speak.

Do you want the Index instances, or just the names/configurations?

I would want just the names. From there I can get the instances. As, from what I can tell, you can't get the name from the instance. So, in effect, you get more information if you start with names first.

The index name could be exposed in the interface as well, and maybe even its configuration... in neo4j you can create indices with a Map of strings as config.

QUESTION: In Neo4j, are indices unique up to String[name]/Class[node/relationship]? Meaning, can two indices have the same name -- one for nodes and one for relationships?

They are name/class unique, so e.g. a "persons" Node index as well as a "persons" Relationship index can coexist. And each index can have a different implementation... e.g. one using Lucene (http://lucene.apache.org/java/docs/index.html) and another using BabuDB (http://code.google.com/p/babudb/) or whatever (controlled by the provider key in the config map at creation of the index).

Thank you for your time, Marko.
http://tinkerpop.com

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Code Swarm
That looks awesome :)

2010/10/22 Peter Neubauer peter.neuba...@neotechnology.com

Awesome, seems Anders and Tobias are the masters of big commit flashes ;) Do send over the file to Anders, we could put it up on Neo4j.org!

Cheers, /peter neubauer

On Thu, Oct 21, 2010 at 10:20 PM, Craig Taverner cr...@amanzi.com wrote:

Hi all, I was playing with code swarm for another project and thought I would see what the Neo4j development looked like. So I extracted the SVN logs from the public repository and made the following video: http://www.youtube.com/watch?v=sp2FhjH696g

The video looks much better in the original AVI than on YouTube, so if you want that, just let me know. It is only a 10MB file and the video lasts 1:48. Also, it is possible the regex I used to differentiate components is not perfect, and various other settings can be changed. But this is a fun first view :-)

Regards, Craig

P.S. If you want to see my own project (a desktop app which embeds Neo4j) take a peek at the video at http://www.youtube.com/watch?v=P5gfBzYNum8

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] GraphAlgoFactory.pathsWithLength returns paths with loops
Hi again,

I just solved the problem. It was a bug: your example found a path that contained the same relationship in two places, which isn't even a loop in the usual sense (where a node is found in more than one place in a path). So now I'd guess it's working as you'd expect. Please let me know otherwise. The fix is in the trunk of the graph-algo component, that is, the latest SNAPSHOT.

2010/10/18 Mattias Persson matt...@neotechnology.com
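The distinction drawn in this reply (a relationship used twice versus a node visited twice) can be illustrated with a small sketch. This is not the graph-algo code; it assumes a path is simply a list of node names and that at most one relationship connects any pair of nodes, so a relationship can be identified by its two end nodes. The function names are made up for the illustration.

```python
def repeated_nodes(path):
    """Return the nodes that occur more than once in a path (a list of node names)."""
    seen, repeats = set(), []
    for node in path:
        if node in seen and node not in repeats:
            repeats.append(node)
        seen.add(node)
    return repeats


def repeated_relationships(path):
    """Return the undirected relationships that the path walks more than once.

    Assumption for this sketch: a relationship is identified by its two end
    nodes, so B-C and C-B count as the same relationship.
    """
    seen, repeats = set(), []
    for a, b in zip(path, path[1:]):
        rel = frozenset((a, b))
        if rel in seen and rel not in repeats:
            repeats.append(rel)
        seen.add(rel)
    return repeats


reported = ["A", "B", "C", "B"]   # the path from the bug report
triangle = ["A", "B", "C", "A"]   # a loop that reuses no relationship

print(repeated_nodes(reported), repeated_relationships(reported))
print(repeated_nodes(triangle), repeated_relationships(triangle))
```

The reported path repeats both node B and the B-C relationship, while the triangle repeats a node without reusing any relationship, which is the ordinary kind of loop.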
Re: [Neo4j] Lucene result sorting
2010/10/20 Balazs E. Pataki pat...@dsd.sztaki.hu

Hi Andrés,

I just quickly read through the code and have an idea for an additional sorting solution via QueryContext: a user-provided sorter, which is invoked right after the Lucene search has been executed, but before the Lucene results are turned into Neo4j Nodes. This would give developers the option to sort Lucene Documents by their fields using whatever sorting method they want. The code below is just theoretical, I haven't tried it yet, but it would require only minimal additions to the current LuceneIndex#search() method and to QueryContext.

This is the current search() method in LuceneIndex.java:

    private SearchResult search( IndexSearcherRef searcher, Query query,
            QueryContext additionalParametersOrNull )
    {
        try
        {
            searcher.incRef();
            Sort sorting = additionalParametersOrNull != null ?
                    additionalParametersOrNull.sorting : null;
            Hits hits = new Hits( searcher.getSearcher(), query, null, sorting );
            return new SearchResult( new HitsIterator( hits ), hits.length() );
        }
        catch ( IOException e )
        {
            throw new RuntimeException( "Unable to query " + this + " with " + query, e );
        }
    }

I would add two things: a SortingIterator interface and a sortingIterator field in QueryContext:

    public interface SortingIterator extends Iterator<Document>
    {
        void setIterator( Iterator<Document> hitIterator );
        int length();
    }

    public class QueryContext
    {
        final Object queryOrQueryObject;
        Sort sorting;
        SortingIterator sortingIterator;
        Operator defaultOperator;
        boolean tradeCorrectnessForSpeed;
        ...
    }

And I would add these to search(): if sorting is available, it is passed as usual to the constructor of Hits, and then, if a sortingIterator is set in the QueryContext, we pass it the original HitsIterator (via setIterator()) and access the results via the sortingIterator rather than the HitsIterator directly:

    private SearchResult search( IndexSearcherRef searcher, Query query,
            QueryContext additionalParametersOrNull )
    {
        try
        {
            searcher.incRef();
            Sort sorting = additionalParametersOrNull != null ?
                    additionalParametersOrNull.sorting : null;
            SortingIterator sortingIterator = additionalParametersOrNull != null ?
                    additionalParametersOrNull.sortingIterator : null;
            Hits hits = new Hits( searcher.getSearcher(), query, null, sorting );
            Iterator<Document> hitIterator = new HitsIterator( hits );
            int hitLength = hits.length();
            if ( sortingIterator != null )
            {
                // setIterator() returns void, so hand over the hits first
                // and then read the results through the sorting iterator.
                sortingIterator.setIterator( hitIterator );
                hitIterator = sortingIterator;
                hitLength = sortingIterator.length();
            }
            return new SearchResult( hitIterator, hitLength );
        }
        catch ( IOException e )
        {
            throw new RuntimeException( "Unable to query " + this + " with " + query, e );
        }
    }

The SortingIterator could fetch the Lucene Documents via the iterator passed in setIterator() and sort them using whatever method it wants, based on whichever fields are available in the Documents. It would then provide the sorted result back via next(). This intermediate iterator could also do other things with the result, e.g. remove some Documents (that's why it provides its own length() for the result list).

Do you think such a solution is feasible?

Sure, something like that could be implemented. The point would be to be able to control the order yourself, I assume? I'm guessing it'd be hard to implement something more efficient than the internal Sort stuff in Lucene; however, the options there are quite limited: you can just specify which keys to sort on, and the order is always the natural lexical order, I think. Good input.

Regards, --- balazs

On 10/20/10 8:24 AM, Balazs E. Pataki wrote:

Hi Andrés, Thanks for the answer, looks cool :-) I'll give it a try immediately! Regards, --- balazs
Re: [Neo4j] Lucene result sorting
2010/10/20 Balazs E. Pataki pat...@dsd.sztaki.hu

Yes, the idea is to overcome the limitations of Lucene sorting. The current solution I use is to get the IndexHits from LuceneIndex and then sort the Neo4j Nodes by their properties, but this requires loading all nodes in the hit list. Sorting the Lucene Documents instead, which need to be loaded anyway, and then converting only the necessary Documents to Nodes seems more efficient to me.

Absolutely, I agree.

--- balazs
Re: [Neo4j] Lucene result sorting
2010/10/19 Andres Taylor andres.tay...@neotechnology.com

Hi Balazs, We've been working on a new lucene-index module just these last days. The new index module allows sorting through the QueryContext class. You can look in svn (https://svn.neo4j.org/components/lucene-index/trunk/) if you are so inclined, or wait for the next milestone release (Thursday).

Exactly, an example could be:

    Index<Node> myNodeIndex = ...
    for ( Node hit : myNodeIndex.query( new QueryContext( "name:Balazs" ).sort( "name" ) ) )
    {
        System.out.println( hit.getProperty( "name" ) );
    }

HTH, Andrés

On Tue, Oct 19, 2010 at 5:41 PM, Balazs E. Pataki pat...@dsd.sztaki.hu wrote:

Hi, Is it possible to get sorted results from LuceneIndex#query()? It would be really helpful if results could be sorted at Lucene time according to one or more indexed fields, rather than loading the actual Neo4j nodes and then iterating over them for sorting. Currently, it seems that sorting is not supported by LuceneIndex, but are there plans regarding this? Thanks for any hints, --- balazs

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
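The idea discussed in this thread, ordering the lightweight index documents first and only then turning the ones you need into nodes, can be sketched roughly as follows. This is not the lucene-index API: documents are assumed to be plain dicts with an "id" field, and `sorted_hits`, `load_node`, `limit` and `keep` are names invented for the illustration.

```python
def sorted_hits(documents, sort_key, load_node, limit=None, keep=None):
    """Sort lightweight index documents first, then load only the nodes needed.

    documents: iterable of dicts, each with an "id" field plus indexed fields
    sort_key:  function from a document to the value to sort by
    load_node: function from an id to the (expensive to load) node
    limit:     optional maximum number of hits to return
    keep:      optional predicate for dropping documents before sorting
    """
    docs = [d for d in documents if keep is None or keep(d)]
    docs.sort(key=sort_key)
    if limit is not None:
        docs = docs[:limit]
    # Nodes are loaded lazily, one per document that survived sorting and trimming.
    return len(docs), (load_node(d["id"]) for d in docs)


if __name__ == "__main__":
    docs = [
        {"id": 1, "name": "carol"},
        {"id": 2, "name": "Alice"},
        {"id": 3, "name": "bob"},
    ]
    # A case-insensitive order, i.e. something other than plain lexical order.
    count, hits = sorted_hits(docs, lambda d: d["name"].lower(),
                              load_node=lambda i: "node-%d" % i, limit=2)
    print(count, list(hits))
```

Returning the length together with the iterator mirrors the proposal's point that a sorter which filters documents has to report its own result size.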
Re: [Neo4j] GraphAlgoFactory.pathsWithLength returns paths with loops
2010/10/18, Yaniv Ben Yosef yani...@gmail.com:

Thanks Mattias! Do you have an expected time frame for that? Alternatively, do you have any quick tips on how I would go about implementing this myself? I very briefly scanned the code in org.neo4j.graphalgo.impl.path.ShortestPath and I suspect I should test whether a node has already been visited (in goOneStep() perhaps?). Would you say that's the right approach? Thanks again,

Unfortunately it's hard to estimate when there will be time to do it. Your suggestion sounds reasonable, but keep in mind that the algo is used for the shortest path calculation as well. So an extra argument in the constructor for ignoring loopy paths when finding paths of a certain length would be the way to go IMO.

--- Yaniv

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] GraphAlgoFactory.pathsWithLength returns paths with loops
2010/10/18 Yaniv Ben Yosef yani...@gmail.com

Thanks, I figured that. Would you be so kind as to review the code once I finish it?

Sure!

--- Yaniv

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] GraphAlgoFactory.pathsWithLength returns paths with loops
2010/10/18, Yaniv Ben Yosef yani...@gmail.com:

Hi Mattias, While taking a closer look at the code, I realized there's an AllSimplePaths class, which can easily be modified to filter out paths whose length isn't maxDepth (with an extra argument or a subclass). I think that's the simplest solution, but I'm not clear on why pathsWithLength() uses ShortestPath rather than AllPaths. Is there an efficiency gain here that I don't see?

Currently AllSimplePaths traverses in one direction only, whereas ShortestPath traverses from both directions (start and end node) interleaved, making it more efficient. Filtering paths from AllSimplePaths will do the trick, but patching ShortestPath will give better performance!

And a newbie question: I checked out the algo component (via the 0.7-1.2.M01 tag). I have no trouble building it, but Maven doesn't create a JAR package of it (the target directory only has classes). Is there a different pom.xml I should use (I used the one in the root of the component directory)? I gather it has to be automated somewhere. Thanks!

--- Yaniv

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
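To make the "both directions, interleaved" remark concrete, here is a minimal sketch of a bidirectional breadth-first search. It is not the ShortestPath implementation from graph-algo; it assumes an undirected graph given as a dict that maps each node to a list of neighbours, and it only returns the length of a shortest path. The function name is invented for the illustration.

```python
def bidirectional_distance(graph, start, end):
    """Length of a shortest path between start and end, or None if unreachable.

    Assumption for this sketch: graph is undirected, i.e. every relationship
    is listed in the neighbour lists of both of its end nodes.
    """
    if start == end:
        return 0
    dist_a, dist_b = {start: 0}, {end: 0}
    frontier_a, frontier_b = [start], [end]
    while frontier_a and frontier_b:
        # Always expand the smaller frontier; on ties the two sides alternate.
        if len(frontier_a) >= len(frontier_b):
            frontier_a, frontier_b = frontier_b, frontier_a
            dist_a, dist_b = dist_b, dist_a
        best = None
        next_frontier = []
        for node in frontier_a:
            for neighbour in graph.get(node, ()):
                if neighbour in dist_b:
                    # The two searches meet across this relationship.
                    candidate = dist_a[node] + 1 + dist_b[neighbour]
                    if best is None or candidate < best:
                        best = candidate
                if neighbour not in dist_a:
                    dist_a[neighbour] = dist_a[node] + 1
                    next_frontier.append(neighbour)
        if best is not None:
            return best  # a whole level was checked, so this is the minimum
        frontier_a = next_frontier
    return None


if __name__ == "__main__":
    line = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"],
            "D": ["C", "E"], "E": ["D"]}
    print(bidirectional_distance(line, "A", "E"))
```

Each side only has to explore roughly half the depth, which is where the efficiency gain over a one-directional traversal comes from on graphs with high branching.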
Re: [Neo4j] Listing properties on relationships?
What you currently have to do (not the optimal solution) is to cd to that relationship:

    cd -r 468

Now do your ls. From there you can then go:

    cd ..

or

    cd start

or

    cd end

2010/10/17 Peter Neubauer peter.neuba...@neotechnology.com

Hi, I am trying to list relationship properties in the neo4j shell. I am starting the shell from the latest milestone like:

    .../program/neo4j-1.2.M01/bin/neo4j-shell -path db/
    neo4j-sh (0)$ ls
    (me) --STATIONS- (2)
    (me) --TRAINS- (469)

and then list a train:

    neo4j-sh (0)$ ls -v 468
    *number =[2] (String)
    *till =[Stockholm] (String)
    (468) --469,2- (Malmö C,21)
    (468) -468,TRAIN-- (469)

Now, I would like to see the properties of the relationship (468) -TRAIN-- (469). How do I do that?

Cheers, /peter neubauer

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Listing properties on relationships?
2010/10/17 Peter Neubauer peter.neuba...@neotechnology.com

Cool, thanks! If cd -r works, could we then have an ls -r variant that lists relationships directly from ID maybe?

Sure. Currently ls -r means "list the current node's relationships", but listing a relationship's contents is maybe more useful for an ls -r function. I'll add something like this (and maybe have the current -r removed or renamed to something else)!

Cheers, /peter neubauer

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] GraphAlgoFactory.pathsWithLength returns paths with loops
I just realized (it was me who put it there) that the documentation is wrong. That method allows cyclic paths, as you obviously noticed :). I'll try to add a simplePathsWithLength method as well to take care of that...

2010/10/17 Yaniv Ben Yosef yani...@gmail.com

Sure :) Will be happy to get your feedback. --- Yaniv

On Sun, Oct 17, 2010 at 6:24 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Hi Yaniv, thanks for the report, I will take a look at it tomorrow if that is ok? Cheers, /peter neubauer

On Sun, Oct 17, 2010 at 1:28 PM, Yaniv Ben Yosef yani...@gmail.com wrote:

Hi, I am playing with Neo4j version 1.2 M1, specifically with GraphAlgoFactory.pathsWithLength(). According to the javadoc, it should never return paths with loops. However, it seems like it does. I created a simple test case to demonstrate that: http://snipt.org/kpwn/ I expect the code not to show any path, but instead it prints the following path:

    Path: A - B - C - B

Please let me know if there's any fault on my side, or if that's a bug. Thanks, Yaniv

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
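The difference between what pathsWithLength() did at the time and what a simple-paths variant would do can be sketched in a few lines. This is a plain depth-first enumeration, not the graph-algo code; the graph is assumed to be a dict of neighbour lists, and the `simple` flag is a stand-in for the proposed behaviour rather than an existing parameter.

```python
def paths_with_length(graph, start, end, length, simple=False):
    """Enumerate paths from start to end that use exactly `length` relationships.

    graph maps each node to a list of neighbours. With simple=True a node may
    appear only once in a path, which rules out results such as A - B - C - B.
    """
    results = []

    def extend(path):
        if len(path) - 1 == length:
            if path[-1] == end:
                results.append(list(path))
            return
        for neighbour in graph.get(path[-1], ()):
            if simple and neighbour in path:
                continue  # would revisit a node, so the path would not be simple
            path.append(neighbour)
            extend(path)
            path.pop()

    extend([start])
    return results


if __name__ == "__main__":
    # The graph from the report, assuming A-B and B-C are its only relationships.
    graph = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
    print(paths_with_length(graph, "A", "B", 3))
    print(paths_with_length(graph, "A", "B", 3, simple=True))
```

Without the flag the enumeration includes A - B - C - B, matching the reported output; with it no path of length 3 from A to B remains, which is what the javadoc promised.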
Re: [Neo4j] EmbeddedGraphDatabase shutdown leaves a Timer thread running
This is an identified problem and it will be fixed soon; the fix will be available in the next milestone as well as in snapshot builds. Until then System.exit( 0 ) is your friend.

2010/10/13 Adam Lehenbauer a...@fixflyer.com

I think I have the Getting Started example working, but after it creates the nodes and shuts down, my JVM won't exit because there is a non-daemon Timer thread running.

I have an example class with a main() that is copied near-verbatim from http://wiki.neo4j.org/content/Getting_Started_Guide except for the db path and a few extra System.outs. Everything seems to work normally, but after shutting down the EmbeddedGraphDatabase the JVM will not exit. A jstack shows a running java.util.TimerThread, which I assume is started by the database. Its stack is:

    ---
    Timer-0 prio=10 tid=0x7f59c0056800 nid=0xa29 in Object.wait() [0x7f59bf6d5000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on 0x7f5a1ed37270 (a java.util.TaskQueue)
        at java.util.TimerThread.mainLoop(Timer.java:509)
        - locked 0x7f5a1ed37270 (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:462)
    ---

I'm using the neo4j maven dependency: org.neo4j.neo4j, version=1.2.M01, type=pom. When I changed this to neo4j-kernel, version=1.0, the JVM exited as expected (no code change on my end).

Does anyone have any idea why this is happening?

Adam

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
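As a rough analogy in Python (the thread itself is about a java.util.Timer inside the database, and this is not how the kernel fixed it): a non-daemon timer thread keeps the process alive until it finishes, so whoever starts it has to cancel it on shutdown. The `Heartbeat` class below is purely illustrative.

```python
import threading


class Heartbeat:
    """Owns a background timer and cancels it on shutdown (illustration only)."""

    def __init__(self, interval_seconds):
        self._timer = threading.Timer(interval_seconds, lambda: None)
        # A non-daemon thread keeps the interpreter alive until it finishes,
        # much like the non-daemon java.util.TimerThread in the jstack output.
        self._timer.daemon = False
        self._timer.start()

    def shutdown(self):
        self._timer.cancel()  # stop waiting for the interval to elapse
        self._timer.join()    # the timer thread exits promptly once cancelled

    def is_running(self):
        return self._timer.is_alive()


if __name__ == "__main__":
    heartbeat = Heartbeat(60)
    print(heartbeat.is_running())
    heartbeat.shutdown()  # without this, the process would linger for a minute
    print(heartbeat.is_running())
```

Marking the thread as a daemon would be the other common way out, at the cost of the timer being killed abruptly when the main thread ends.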
Re: [Neo4j] Exception when adding a Comment property to a node
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Versioning :)
(2) I'd definitely go with synced versions for maven/non-maven stuff. (1) is a bit harder since the components don't all mature at the same rate, but maybe that doesn't matter... having synced versions for the components is rather good.

2010/10/6 Andreas Kollegger andreas.kolleg...@neotechnology.com

Hello fellow graphytes,

Today I offer for your consideration one of the classic unsolved problems of computer science: proper versioning.

Neo4j is available as individual library components and also as pre-packaged collections of components. The obvious challenge is to maintain a coherent set of tested, known-good and compatible components. As we move towards regular milestone releases, what's the best way to control and inform about the various versions that are included?

Use cases include:

1. I'm a maven developer, and want coherent dependencies
2. I develop offline, and want to know what combination of libs to download
3. I deploy neo4j as a server, and want to upgrade a component without breaking things

Assuming that zip files (or similar) will always use the corresponding release version, the versioning of the included components could vary. For a milestone release with an overall group version of 1.2-M1, permutations of an individual component (the fictional neo4j-foo) version could be:

    Opt. | mvn version    | download version
    -----+----------------+-----------------
    1    | foo-0.7        | foo-0.7
    2    | foo-0.7        | foo-1.2-M1
    3    | foo-1.2-M1     | foo-1.2-M1
    4    | foo-0.7-1.2-M1 | foo-0.7-1.2-M1
    5    | foo-0.7        | foo-0.7-1.2-M1

Questions include:

1. Should individual components keep their own versions, or defer to the grouped release version?
2. Should the maven version keep in sync with the non-maven version?

Opinions?

Cheers, Andreas

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Traversing Neo4j using python binding
There's fulltext index support in the indexing, but since I'm not by any means well travelled in the python bindings, maybe someone can chip in about how this is done?

2010/10/5 Francois Kassis francois_kas...@hotmail.com

Hi Mattias, Thx for your prompt reply. Actually I already used indexes while inserting data to check whether a particular node is already inserted or not. But what if I want to get all nodes starting with PEP or PEPSI, as in the example below? Thx. Francois.

From: Mattias Persson matt...@neotechnology.com
Sent: Tuesday, October 05, 2010 12:46 PM
To: Neo4j user discussions user@lists.neo4j.org
Subject: Re: [Neo4j] Traversing Neo4j using python binding

Yep, your use case is a good fit for using indexes... looping through 300k nodes just to find one particular node isn't very efficient. Take a look at http://components.neo4j.org/neo4j.py/ for how to use indexing in the neo4j python bindings.

2010/10/5 Francois Kassis francois_kas...@hotmail.com

Hi all, I am trying to retrieve data from a neo4j database using the python version. I have over 30 nodes all joined to a master node by OFTYPE relationships.
I used the following to traverse it:
    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    # Traversal
    import neo4j
    def get_OFTYPE(ar_node):#, ar_filter):
        return OFTYPE(ar_node)
    class OFTYPE(neo4j.Traversal):
        #my_pos_filter = ""
        #
        #def __init__(self, start, pos_filter=""):
        #    # set an internal variable
        #    self.my_pos_filter = pos_filter
        #    neo4j.Traversal.__init__(self, start)
        types = [ neo4j.Incoming.OFTYPE, ]
        order = neo4j.BREADTH_FIRST
        stop = neo4j.StopAtDepth(1)
        def isReturnable(self, position):
            return (not position.is_start
                    #and position.label == self.my_pos_filter
                    and position.last_relationship.type == 'OFTYPE')
and call the above by:
    #!/usr/bin/env python
    # -*- coding: UTF-8 -*-
    import argparse
    import neo4j
    import ConfigParser
    import os, sys, datetime, string
    import neoentity_traverse_test
    from neo4j.util import Subreference
    def main():
        ls_current_script_path = sys.path[0] # os.getcwd()
        ls_current_script_filename = sys.argv[0]
        config = ConfigParser.RawConfigParser()
        config.read(ls_current_script_path + '/' + 'neoentity.cfg')
        parser = argparse.ArgumentParser(description='Initialize neo4j database for mediasharks root enteties. NOTE: if the database '\
            'does not exists, it will simply be created.')
        parser.add_argument('--neodbpath', dest='neodbpath', metavar='NEODB-PATH', default=config.get('database', 'database_path'), help='a directory where neodb should be created or opened.')
        parser.add_argument('--classpath', dest='kernalclasspath', metavar='KERNAL-CLASSPATH', default=config.get('neo4j', 'kernalclasspath'), help='the path toneo4j kernal path.')
        parser.add_argument('--jvm', dest='jvmclasspath', metavar='JVM-CLASSPATH', default=config.get('jvm', 'jvmpath'), help='the path toneo4j kernal path.')
        args = parser.parse_args()
        pytest(args.neodbpath, args.kernalclasspath, args.jvmclasspath)
    def pytest(arg_neodb_path, arg_kernal_classpath, arg_jvm_classpath):
        print "="
        print "initializing neo4j db using parameters:"
        print "database-path = " + arg_neodb_path
        print "kernel-path = " + arg_kernal_classpath
        print "jvm-path = " + arg_jvm_classpath
        print "="
        #initialize variables
        ls_message = ""
        #create new neo database
        graphdb = neo4j.GraphDatabase(arg_neodb_path, classpath=arg_kernal_classpath, jvm=arg_jvm_classpath)
        #start new transaction
        try:
            tx = graphdb.transaction.begin()
            rootindex = graphdb.index("root_index", create=True)
            subbrandnode = rootindex["subbrand"]
            referencenode = graphdb.node[0]
            li_index = 0
            ls_filter = "PEPSI"
            for node in neoentity_traverse_test.get_OFTYPE(subbrandnode):#, ls_filter):
                ls_result = node["label"]
                if ls_result.startswith(ls_filter):
                    li_index = li_index + 1
                    print ls_result
            print li_index
        except:
            tx.failure()
            print "Error occurred, exiting..."
            raise
        else:
            tx.success()
        finally:
            tx.finish()
        #saving current transactions and closing current database
        graphdb.shutdown()
    if __name__ == '__main__':
        main()
The problem is it's taking too much time. How can I improve performance and how can I use
Re: [Neo4j] Using EmbeddedGraphDatabase, possible to stop node caching eating ram?
2010/9/30 Garrett Barton garrett.bar...@gmail.com Thanks for the reply! Root nodes are found via:
    private Node getNewNode(NeoTypes entity) {
        Node n = graphDb.createNode();
        n.createRelationshipTo(getRootNode(entity), entity);
        return n;
    }
    private Node getRootNode(NeoTypes entity) {
        Node root = rootMap.get(entity);
        if (root == null) {
            if (indexService.getSingleNode("eid", entity.toString()) != null)
                root = indexService.getSingleNode("eid", entity.toString());
            else {
                Node entityNode = graphDb.createNode();
                entityNode.createRelationshipTo(graphDb.getReferenceNode(), entity);
                root = entityNode;
                indexService.index(root, "eid", entity.toString());
            }
            rootMap.put(entity, root);
        }
        return root;
    }
Where rootMap = new HashMap<NeoType,Node>(); Thus I create the root entity once, attach it to the referenceNode and then look it up via the rootMap. My loads are loading one entity at a time, which from what you said will single thread me since every relationship that attaches to one of my rootNodes (the same one per run) locks that node until the transaction completes. I was creating root nodes in order to provide entry points, but this may be undesirable now that I think about it since each root entity could easily have 500M nodes hanging off of it. Neo probably would not be able to operate through a traversal of that very well, correct? If I remove this restriction and load nodes individually I should be able to thread out again. Only when I do the relationships will I occasionally run into locks, which I can try to mitigate with more threads and smaller transaction sizes (10k maybe?) Is there any documentation on what operations will take out a lock? It depends on what kind of traversal you are doing... could you give some examples? I know it's not the db (postgres) as the same code to hit this rs also can drive my full lucene indexing layer and I can pull well over 100k/s per thread with it. 
(My lucene implementation indexes on average 400-500k/s with 4 threads and peaks once in a while over 1mil/s) HUGE hack right now, but instead of just calling tx.finish() I am also shutting down and starting the db back up again every 150k. This has brought the rates (including start/stop time) up to about 15k/s and it stays at that level now. Need to figure out why I run out of ram so I can avoid doing this. (See my answer below on batch insertion as well). If neo specific configuration is used for neo4j it will try to cache pretty much in your heap so if your other database also caches stuff your heap will pretty soon be full. You can try to set down the caching levels. You can fiddle around with the Cache settings where an example http://dist.neo4j.org/neo_default.props , look at f.ex. adaptive_cache_heap_ratio=0.77, maybe lower it a bit. In general creating nodes I assume is expensive? Can I create batch of Nodes, close the transaction, update all of their properties and then reopen a transaction to attach relationships? What is the bottleneck when doing insertions? If this is a one-time batch insertion then should really use the batch inserter http://wiki.neo4j.org/content/Batch_Insert which is optimized for these things. It's much faster for imports of this sort and you don't have to (in fact, you cannot have) multiple threads inserting your data. Message: 1 Date: Thu, 30 Sep 2010 09:54:31 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] Using EmbeddedGraphDatabase, possible to stop node caching eating ram? 
To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktinkcdufqrjszxmyosgotj_fak+jnp-638cf5...@mail.gmail.comaanlktinkcdufqrjszxmyosgotj_fak%2bjnp-638cf5...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 2010/9/29 Garrett Barton garrett.bar...@gmail.com Hey all, I have an issue similar to this post http://www.mail-archive.com/user@lists.neo4j.org/msg04942.html I am following the advice under the Big Transactions page on the wiki and my code looks something like this: public void doBigBatchJob(NeoType entity) { Transaction tx = null; try { int counter = 0; while(rs.next()) { if(tx == null) tx = graphDb.beginTx(); Node n = getNewNode(entity); for(String col: columnList) if(rs.getString(col) != null) n.setProperty(col,rs.getString(col)); counter++; if ( counter % 1 == 0 ) { tx.success(); tx.finish(); tx = null; } } } finally { if(tx != null) { tx.success(); tx.finish(); } } } It looks correct to me. Where getNewNode creates a node and gives it a relationship to the parent entity. Parent nodes are cached, that helped a whole bunch. How are you looking
Re: [Neo4j] How to check if a relation already exists in neo4j
I added some kind of example in the FAQ, http://wiki.neo4j.org/content/FAQ#Checking_if_a_relationship_exists_between_two_nodes
2010/9/29 Mattias Persson matt...@neotechnology.com I think the python bindings don't have native support for the new index framework, and it's completely separate from the normal indexing found in the python API. Maybe you could call java code directly, or write your own bindings for that... because it hasn't been done as far as my knowledge goes. An example solution for your problem (java code):
    IndexProvider provider = new LuceneIndexProvider( graphDb ); // Instantiate once
    RelationshipIndex index = provider.relationshipIndex( myIndexName, LuceneIndexProvider.EXACT_CONFIG );
    // Index a relationship
    Relationship calledRel = n2.createRelationshipTo( n1, CALLED );
    calledRel.setProperty( "on", 1234567890 );
    index.add( calledRel, "on", calledRel.getProperty( "on" ) );
    // Check if such a relationship exists
    boolean exists = index.get( "on", 1234567890, n2, n1 ).getSingle() != null;
2010/9/29 Francois Kassis francois_kas...@hotmail.com Thx Arijit. Actually I am using the latest version of the neo4j python bindings, so how do I instantiate and use the lucene-index for relationships? Currently I am indexing nodes like:
    nodesindex = graphdb.index("insertednodes", create=True)
    graphdb.node(n1, name="hello")
    nodesindex["hello"] = n1
Should the same be done with relationships? THX in advance. Francois
-- From: user-requ...@lists.neo4j.org Sent: Wednesday, September 29, 2010 1:00 PM To: user@lists.neo4j.org Subject: User Digest, Vol 42, Issue 56 
Today's Topics: 1. Re: How do I sort Lucene results by field value, and numerical range queries ? (Mattias Persson) 2. Re: How do I sort Lucene results by field value, and numerical range queries ? (Andreas Ronge) 3. Re: How to check if a relation already exists in neo4j graph (Arijit Mukherjee) 4. Re: How to check if a relation already exists in neo4j (Francois Kassis) -- Message: 1 Date: Tue, 28 Sep 2010 23:22:44 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ? To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktikop7xbn4+jjxtr4ds5f3v-sy479kh5-ugyi...@mail.gmail.comaanlktikop7xbn4%2bjjxtr4ds5f3v-sy479kh5-ugyi...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 I think there's a working version of it now... look at the tests for more information: https://svn.neo4j.org/laboratory/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java(testSortinghttps://svn.neo4j.org/laboratory/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java%28testSorting , testNumericValues). 2010/9/24 Mattias Persson matt...@neotechnology.com 2010/9/24 Andreas Ronge andreas.ro...@jayway.se On Thu, Sep 23, 2010 at 7:50 PM, Mattias Persson matt...@neotechnology.com wrote: 2010/9/23 Andreas Ronge andreas.ro...@jayway.se That's really good news ! Does it also work if it was not indexes as Strings ? ( so that we can sort integers or floats without any padding) I guess that requires that neo4j-lucene adds NumericField instances to the lucene document. Exactly, I'm just working on a solution for that... I'm not sure it should index all Integer, Long, Float, Double values implicitly as NumericField since those aren't searchable with a regular term query. Maybe do something explicit like: myIndex.add( entity, age, new ValueContext( 31 ).indexNumeric() ); ... 
// Query integer range myIndex.query( NumericRangeQuery.newIntRange( age, 30, 40, true, false ) ); // And with sorting (quite verbose though) myIndex.query( new QueryContext( NumericRangeQuery.newIntRange( age, 0, 100, true, true ) ).sort( new Sort( new SortField( age, SortField.INT ) ) ) ); So that you must know what you're doing... would that be ok or any other better idea? I think it's great that we have access to the full lucene API in the QueryContext. I don't care if it's verbose since I will wrap it in a nice JRuby API :-) I've already started
Re: [Neo4j] Using EmbeddedGraphDatabase, possible to stop node caching eating ram?
2010/9/29 Garrett Barton garrett.bar...@gmail.com Hey all, I have an issue similar to this post http://www.mail-archive.com/user@lists.neo4j.org/msg04942.html I am following the advice under the Big Transactions page on the wiki and my code looks something like this: public void doBigBatchJob(NeoType entity) { Transaction tx = null; try { int counter = 0; while(rs.next()) { if(tx == null) tx = graphDb.beginTx(); Node n = getNewNode(entity); for(String col: columnList) if(rs.getString(col) != null) n.setProperty(col,rs.getString(col)); counter++; if ( counter % 1 == 0 ) { tx.success(); tx.finish(); tx = null; } } } finally { if(tx != null) { tx.success(); tx.finish(); } } } It looks correct to me. Where getNewNode creates a node and gives it a relationship to the parent entity. Parent nodes are cached, that helped a whole bunch. How are you looking up parent nodes? I have timers throughout the code as well, I know I eat some time pulling from the db, but if i take out the node creation and to a pull test of the db I can sustain 100k/s rates easily. When I start this process up, I get an initial 12-14k/s rate that works well for the first 500k or so then the drop off is huge. By the time its done the next 500k its down to under 3k/s. What I watch with JProfiler I see the ram I gave the vm maxes out and stays there, as soon as that peaks rates tank. Current setup is: -Xms2048m -Xmx2048m -XX:+UseConcMarkSweepGC Box has about 8GB of ram free for this, its own storage for the neo db, and I have already watched nr_dirty and nr_writeback and they never get over 2k/10 respectfully. 
neo config options:
    nodestore.db.mapped_memory=500M
    relationshipstore.db.mapped_memory=1G
    propertystore.db.mapped_memory=500M
    propertystore.db.strings.mapped_memory=2G
    propertystore.db.arrays.mapped_memory=0M
I have not run through a complete initial node load as the first set of nodes is ~16M, the second set is about 20M and there's a good 30M relationships between the two I haven't gotten to yet. Am I configuring something wrong? I read that neo will cache all the nodes I create, is that what's hurting me? I do not really want to use the batch inserter because I think it's bugged (lucene part) and I will be ingesting 100's of millions of nodes live daily when this thing works anyway. (Yes I have the potential to see what the upper limits of Neo are). It might be that the SQL database you're reading from causes the slowdowns... I've seen this before a couple of times, so try to do this in two steps: 1) Extract data from your SQL database and store it in a CSV file or something. 2) Import from that file into neo4j. If you do it this way, do you experience these slowdowns? Also, is neo single write transaction based? My ingest code is actually threadable and I noticed in JProfiler that only 1 thread would be inserting at a time. It might be that you always create relationships to some parent node(s) so that locks are taken on them. That would mean that those locks are held until that thread has committed its transaction, which will make it look like only one thread at a time is committing stuff.
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
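The locking effect described in this thread can be illustrated without Neo4j at all. What follows is only a minimal sketch, assuming a plain java.util.concurrent ReentrantLock as a stand-in for the write lock on the shared parent node; ParentLockDemo and maxConcurrentWriters are hypothetical names, and this is not how Neo4j's own lock manager is implemented. Each worker takes the lock at the start of its "transaction" and releases it only at the end, so at most one worker is ever doing inserts.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

final class ParentLockDemo {
    // Runs `threads` workers and returns the highest number of workers that
    // were ever inside their "transaction" at the same time.
    static int maxConcurrentWriters(int threads, int nodesPerTx) {
        ReentrantLock parentLock = new ReentrantLock(); // stand-in for the parent node's write lock
        AtomicInteger active = new AtomicInteger();
        AtomicInteger maxActive = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int t = 0; t < threads; t++) {
            pool.submit(() -> {
                parentLock.lock(); // taken when the first relationship to the parent is created
                try {
                    int now = active.incrementAndGet();
                    maxActive.accumulateAndGet(now, Math::max);
                    for (int i = 0; i < nodesPerTx; i++) {
                        Thread.yield(); // stand-in for creating one node plus its relationship
                    }
                    active.decrementAndGet();
                } finally {
                    parentLock.unlock(); // released only at "commit"
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return maxActive.get();
    }
}
```

With four workers the returned maximum is still 1, which matches the observation in JProfiler that only one thread inserts at a time. Smaller transactions shorten how long each worker holds the lock, but they do not remove the serialization on the shared parent.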
Re: [Neo4j] traversing-functionality
2010/9/28 Joshi Hemant - hjoshi hemant.jo...@acxiom.com On second thought, you could use the new traversal framework as well. I haven't used all the features but it should be possible to do something like:
    TraversalDescription td = Traversal.description().breadthFirst()
        .relationships( relation_1, Direction.OUTGOING )
        .relationships( relation_2, Direction.OUTGOING )
        .filter( Traversal.returnAllButStartNode() );
I think it will do relation_1 nodes first and then traverse to relation_2 nodes from those in a breadth-first manner. -Hemant
That traverser will traverse both relation_1 and relation_2 on all levels, wherever it is. The kind of functionality asked for (different relationship types at different depths) is planned for the next iteration of the traversal framework. One not-so-funny solution would be to do this manually with Node#getRelationships and use different types for each level.
    Node start = ...;
    for ( Relationship relL1 : start.getRelationships( relation_1 ) ) {
        Node nodeL1 = relL1.getOtherNode( start );
        for ( Relationship relL2 : nodeL1.getRelationships( relation_2 ) ) {
            Node nodeL2 = relL2.getOtherNode( nodeL1 );
            ...
        }
    }
-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Joshi Hemant - hjoshi Sent: Tuesday, September 28, 2010 9:42 AM To: Neo4j user discussions Subject: Re: [Neo4j] traversing-functionality
Have you looked at the PatternMatch functionality? Take a look at http://components.neo4j.org/neo4j-graph-matching/ IMO, you can define a pattern for relation_1 and, from what you get as pattern nodes in relation_1, define relation_2 for them. The only thing you will need to know is where to look for such a pattern in the graph (i.e. the start node). If you do not know the start node and want a generic pattern, scan over the entire graph using (Node n : graphDb.getAllNodes()) and repeat PatternMatch with each node as a starting point. 
-Hemant
-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Konstanze.Lorenz Sent: Tuesday, September 28, 2010 8:36 AM To: user@lists.neo4j.org Subject: [Neo4j] traversing-functionality
Hi, For the project I'm working on, I need a traversal as follows: the relationship types passed to the traversal should not all be followed at the same depth. That means: first I want to get all nodes by traversing relation_1, and then, starting from that set of nodes, I want to get all nodes by traversing relation_2. BUT: Traversal.description().expand( Traversal.expanderForTypes( Relation_1, Direction.OUTGOING, Relation_2, Direction.OUTGOING ) ) traverses over all nodes with relation_1 and relation_2. So far I haven't found this functionality implemented. Does someone have a hint where I can get it (if it already exists) or an idea how to implement it in a reasonable way?
___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
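The manual, level-by-level approach suggested in this thread generalizes to any number of depths. The following is a rough sketch on a tiny in-memory graph rather than on the Neo4j API: LevelTraversal, addEdge, expand and demo are hypothetical names, and the map-of-maps graph is only an assumption made to keep the example self-contained. At depth i only relationships of the i-th type are followed, and the nodes reached become the starting set for the next depth.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class LevelTraversal {
    // Adds one outgoing relationship of the given type to the in-memory graph.
    static void addEdge(Map<String, Map<String, List<String>>> graph,
                        String from, String type, String to) {
        graph.computeIfAbsent(from, k -> new HashMap<>())
             .computeIfAbsent(type, k -> new ArrayList<>())
             .add(to);
    }

    // Follows typePerDepth.get(0) from the start node, then typePerDepth.get(1)
    // from the nodes found, and so on; returns the nodes at the last depth.
    static Set<String> expand(Map<String, Map<String, List<String>>> graph,
                              String start, List<String> typePerDepth) {
        Set<String> frontier = new LinkedHashSet<>();
        frontier.add(start);
        for (String type : typePerDepth) {
            Set<String> next = new LinkedHashSet<>();
            for (String node : frontier) {
                next.addAll(graph.getOrDefault(node, Collections.emptyMap())
                                 .getOrDefault(type, Collections.emptyList()));
            }
            frontier = next;
        }
        return frontier;
    }

    // Small example graph: relation_2 edges leaving "a" directly must be ignored.
    static Set<String> demo() {
        Map<String, Map<String, List<String>>> g = new HashMap<>();
        addEdge(g, "a", "relation_1", "b");
        addEdge(g, "a", "relation_1", "c");
        addEdge(g, "a", "relation_2", "x");
        addEdge(g, "b", "relation_2", "d");
        addEdge(g, "c", "relation_2", "e");
        addEdge(g, "d", "relation_1", "f");
        return expand(g, "a", Arrays.asList("relation_1", "relation_2"));
    }
}
```

In the demo graph, "x" is not returned because relation_2 is only followed at the second depth, and "f" is not returned because relation_1 is only followed at the first. A traverser that accepts both types at every depth would reach both of them.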
Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ?
2010/9/24 Andreas Ronge andreas.ro...@jayway.se On Thu, Sep 23, 2010 at 7:50 PM, Mattias Persson matt...@neotechnology.com wrote: 2010/9/23 Andreas Ronge andreas.ro...@jayway.se That's really good news ! Does it also work if it was not indexes as Strings ? ( so that we can sort integers or floats without any padding) I guess that requires that neo4j-lucene adds NumericField instances to the lucene document. Exactly, I'm just working on a solution for that... I'm not sure it should index all Integer, Long, Float, Double values implicitly as NumericField since those aren't searchable with a regular term query. Maybe do something explicit like: myIndex.add( entity, age, new ValueContext( 31 ).indexNumeric() ); ... // Query integer range myIndex.query( NumericRangeQuery.newIntRange( age, 30, 40, true, false ) ); // And with sorting (quite verbose though) myIndex.query( new QueryContext( NumericRangeQuery.newIntRange( age, 0, 100, true, true ) ).sort( new Sort( new SortField( age, SortField.INT ) ) ) ); So that you must know what you're doing... would that be ok or any other better idea? I think it's great that we have access to the full lucene API in the QueryContext. I don't care if it's verbose since I will wrap it in a nice JRuby API :-) I've already started to document how my lucene sorting and range query API will look like in Neo4j.rb; http://github.com/andreasronge/neo4j/wiki/Lucene (I will specify both the field will be indexed as numerical and of what type.) Great, so I'll try to get this going and make a commit when I get the time (almost working locally). -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
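The two boolean arguments in the proposed newIntRange( age, 30, 40, true, false ) call select whether the lower and upper bounds are inclusive, so that query asks for 30 <= age < 40. The following is a plain-Java sketch of the same range-then-sort semantics on a list of integers; it does not use Lucene, and RangeSketch, inRange and queryAndSort are hypothetical names chosen for illustration.

```java
import java.util.List;
import java.util.stream.Collectors;

final class RangeSketch {
    // Mirrors the (min, max, minInclusive, maxInclusive) argument order.
    static boolean inRange(int value, int min, int max,
                           boolean minInclusive, boolean maxInclusive) {
        boolean aboveMin = minInclusive ? value >= min : value > min;
        boolean belowMax = maxInclusive ? value <= max : value < max;
        return aboveMin && belowMax;
    }

    // Keeps the values inside the range and returns them in ascending order.
    static List<Integer> queryAndSort(List<Integer> values, int min, int max,
                                      boolean minInclusive, boolean maxInclusive) {
        return values.stream()
                     .filter(v -> inRange(v, min, max, minInclusive, maxInclusive))
                     .sorted()
                     .collect(Collectors.toList());
    }
}
```

For the ages 40, 31, 30, 29 and 35, the range (30, 40, true, false) keeps 30, 31 and 35: the lower bound is included and the upper bound is not.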
Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ?
It doesn't try to use NumericField... maybe that can be done somehow so that range queries can more easily be asked, I'll add that as a ticket 2010/9/23 Paddy paddyf...@gmail.com Hi Andreas, Yes it looks like you don't need to wrap it in a padded string. I tried using myIndex.add(ndOne, time,1f); it will stills work. thanks Paddy On Thu, Sep 23, 2010 at 12:37 AM, Andreas Ronge andreas.ro...@jayway.se wrote: Hi Paddy Thanks for the response. But it would be nice to avoid wrapping integer or float values in padded strings, as you described in your example: Node ndOne = gds.createNode(); ndOne.setProperty(time, 1.3); myIndex.add(ndOne, time, ndOne.getProperty(time) ); Instead I believe it's possible in Lucene 3.0 to index the time property as a float. The question is if this feature is exposed in the Neo4j Lucene API ? Cheers Andreas On Thu, Sep 23, 2010 at 8:43 AM, Paddy paddyf...@gmail.com wrote: Hi Andres, Not sure about the use of sorting but I have previously tried some numeric range queries with the new indexProvider. I've added my some test code i used below. 
Cheers Paddy
    public class indexTest {
        static GraphDatabaseService gds;
        static IndexProvider provider;
        @BeforeClass
        public static void setup() {
            gds = new EmbeddedGraphDatabase("data/neodb/neodb-tmp");
            provider = new LuceneIndexProvider( gds );
        }
        @Test
        public void fullTextIndex() {
            Transaction tx = gds.beginTx();
            try {
                Node ndOne = gds.createNode();
                ndOne.setProperty("time", 1.3);
                Node ndTwo = gds.createNode();
                ndTwo.setProperty("time", 1.5);
                Node ndThree = gds.createNode();
                ndThree.setProperty("time", 2.0);
                Node ndFour = gds.createNode();
                ndFour.setProperty("time", 3.0);
                Node ndFive = gds.createNode();
                ndFive.setProperty("time", 5.0);
                Index<Node> myIndex = provider.nodeIndex( "fulltext", LuceneIndexProvider.FULLTEXT_CONFIG );
                myIndex.add(ndOne, "time", ndOne.getProperty("time") );
                myIndex.add(ndTwo, "time", ndTwo.getProperty("time") );
                myIndex.add(ndThree, "time", ndThree.getProperty("time") );
                myIndex.add(ndFour, "time", ndFour.getProperty("time") );
                myIndex.add(ndFive, "time", ndFive.getProperty("time") );
                for ( Node searchHit : myIndex.query( "time:[1.5 TO 4.3]" ) ) {
                    System.out.println( "Found " + searchHit.toString() );
                    System.out.println("time " + searchHit.getProperty("time"));
                }
                tx.success();
            } finally {
                tx.finish();
            }
        }
    }
On Wed, Sep 22, 2010 at 11:15 PM, Andreas Ronge andreas.ro...@jayway.se wrote: Hi In the example https://svn.neo4j.org/laboratory/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java I only see how to sort by Sort.RELEVANCE and Sort.INDEXORDER. How do I sort ascending/ on different fields ? Another related question, how does neo4j work with the new improved numerical capabilities of lucene 3.0. Example if I add an integer with org.neo4j.graphdb.index.Index.add(entity, key, integer_value) will it be index as a NumericField so that I (how?) can use NumericRangeQueries ? 
(In old lucene one had to convert integers to strings and pad it with zeros) Cheers Andres ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
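The zero-padding workaround mentioned for older Lucene versions exists because a range over string terms compares text lexicographically, not numerically. Here is a small sketch of the problem and the fix in plain Java, assuming non-negative integers only; PaddedNumbers, pad and inStringRange are hypothetical helper names and nothing here calls Lucene itself.

```java
import java.util.Locale;

final class PaddedNumbers {
    // Left-pads a non-negative integer with zeros to a fixed width so that
    // string order and numeric order agree.
    static String pad(int value, int width) {
        if (value < 0) {
            throw new IllegalArgumentException("negative values need a different encoding");
        }
        return String.format(Locale.ROOT, "%0" + width + "d", value);
    }

    // True if lo <= s <= hi in lexicographic order, which is how a range
    // over plain string terms is evaluated.
    static boolean inStringRange(String s, String lo, String hi) {
        return s.compareTo(lo) >= 0 && s.compareTo(hi) <= 0;
    }
}
```

Without padding, inStringRange("300", "30", "40") is true even though 300 is not between 30 and 40, because "300" sorts after "30" and before "40" as text. With pad(300, 5), pad(30, 5) and pad(40, 5) the same check is false, while pad(35, 5) is still found. The width has to be chosen large enough for the largest value that will ever be indexed.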
Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ?
Btw I don't think lucene can do that kind of multiple-field sorting for you, or can it? -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ?
2010/9/23 Mattias Persson matt...@neotechnology.com Btw I don't think lucene can do that kind of multiple-field sorting for you, or can it? Scratch that... you can do: myNodeIndex.query( new QueryContext( name:*...@gmail.com ).sort( new Sort( new SortField( name, SortField.STRING ) ) ); that way you let Lucence sort the result for you on the key name. 2010/9/23 Mattias Persson matt...@neotechnology.com It doesn't try to use NumericField... maybe that can be done somehow so that range queries can more easily be asked, I'll add that as a ticket 2010/9/23 Paddy paddyf...@gmail.com Hi Andreas, Yes it looks like you don't need to wrap it in a padded string. I tried using myIndex.add(ndOne, time,1f); it will stills work. thanks Paddy On Thu, Sep 23, 2010 at 12:37 AM, Andreas Ronge andreas.ro...@jayway.se wrote: Hi Paddy Thanks for the response. But it would be nice to avoid wrapping integer or float values in padded strings, as you described in your example: Node ndOne = gds.createNode(); ndOne.setProperty(time, 1.3); myIndex.add(ndOne, time, ndOne.getProperty(time) ); Instead I believe it's possible in Lucene 3.0 to index the time property as a float. The question is if this feature is exposed in the Neo4j Lucene API ? Cheers Andreas On Thu, Sep 23, 2010 at 8:43 AM, Paddy paddyf...@gmail.com wrote: Hi Andres, Not sure about the use of sorting but I have previously tried some numeric range queries with the new indexProvider. I've added my some test code i used below. 
Cheers Paddy public class indexTest { static GraphDatabaseService gds; static IndexProvider provider; @BeforeClass public static void setup() { gds = new EmbeddedGraphDatabase(data/neodb/neodb-tmp); provider = new LuceneIndexProvider( gds ); } @Test public void fullTextIndex() { Transaction tx = gds.beginTx(); try { Node ndOne = gds.createNode(); ndOne.setProperty(time, 1.3); Node ndTwo = gds.createNode(); ndTwo.setProperty(time, 1.5); Node ndThree = gds.createNode(); ndThree.setProperty(time, 2.0); Node ndFour = gds.createNode(); ndFour.setProperty(time, 3.0); Node ndFive = gds.createNode(); ndFive.setProperty(time, 5.0); IndexNode myIndex = provider.nodeIndex( fulltext, LuceneIndexProvider.FULLTEXT_CONFIG ); myIndex.add(ndOne, time, ndOne.getProperty(time) ); myIndex.add(ndTwo, time, ndTwo.getProperty(time) ); myIndex.add(ndThree,time, ndThree.getProperty(time) ); myIndex.add(ndFour, time, ndFour.getProperty(time) ); myIndex.add(ndFive, time, ndFive.getProperty(time) ); for ( Node searchHit : myIndex.query( time:[1.5 TO 4.3] ) ) { System.out.println( Found + searchHit.toString() ); System.out.println(time + searchHit.getProperty(time)); } tx.success(); } finally { tx.finish(); } } } On Wed, Sep 22, 2010 at 11:15 PM, Andreas Ronge andreas.ro...@jayway.se wrote: Hi In the example https://svn.neo4j.org/laboratory/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java I only see how to sort by Sort.RELEVANCE and Sort.INDEXORDER. How do I sort ascending/ on different fields ? Another related question, how does neo4j work with the new improved numerical capabilities of lucene 3.0. Example if I add an integer with org.neo4j.graphdb.index.Index.add(entity, key, integer_value) will it be index as a NumericField so that I (how?) can use NumericRangeQueries ? 
(In old lucene one had to convert integers to strings and pad it with zeros)

Cheers
Andres

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] How do I sort Lucene results by field value, and numerical range queries ?
2010/9/23 Andreas Ronge andreas.ro...@jayway.se

That's really good news! Does it also work if the values were not indexed as Strings (so that we can sort integers or floats without any padding)? I guess that requires that neo4j-lucene adds NumericField instances to the Lucene document.

Exactly, I'm just working on a solution for that... I'm not sure it should index all Integer, Long, Float and Double values implicitly as NumericField, since those aren't searchable with a regular term query. Maybe do something explicit like:

    myIndex.add( entity, "age", new ValueContext( 31 ).indexNumeric() );
    ...
    // Query integer range
    myIndex.query( NumericRangeQuery.newIntRange( "age", 30, 40, true, false ) );
    // And with sorting (quite verbose though)
    myIndex.query( new QueryContext( NumericRangeQuery.newIntRange( "age", 0, 100, true, true ) )
            .sort( new Sort( new SortField( "age", SortField.INT ) ) ) );

So that you must know what you're doing... would that be OK, or is there any better idea?

/Andreas

On Thu, Sep 23, 2010 at 11:21 AM, Mattias Persson matt...@neotechnology.com wrote:

2010/9/23 Mattias Persson matt...@neotechnology.com

Btw I don't think Lucene can do that kind of multiple-field sorting for you, or can it?

Scratch that... you can do:

    myNodeIndex.query( new QueryContext( "name:*...@gmail.com" )
            .sort( new Sort( new SortField( "name", SortField.STRING ) ) ) );

That way you let Lucene sort the result for you on the key name.

2010/9/23 Mattias Persson matt...@neotechnology.com

It doesn't try to use NumericField... maybe that can be done somehow so that range queries can more easily be asked. I'll add that as a ticket.

2010/9/23 Paddy paddyf...@gmail.com

Hi Andreas,

Yes, it looks like you don't need to wrap it in a padded string. I tried using

    myIndex.add(ndOne, "time", 1f);

and it still works.

thanks
Paddy

On Thu, Sep 23, 2010 at 12:37 AM, Andreas Ronge andreas.ro...@jayway.se wrote:

Hi Paddy

Thanks for the response.
But it would be nice to avoid wrapping integer or float values in padded strings, as you described in your example:

    Node ndOne = gds.createNode();
    ndOne.setProperty("time", 1.3);
    myIndex.add(ndOne, "time", ndOne.getProperty("time") );

Instead I believe it's possible in Lucene 3.0 to index the time property as a float. The question is whether this feature is exposed in the Neo4j Lucene API?

Cheers
Andreas

On Thu, Sep 23, 2010 at 8:43 AM, Paddy paddyf...@gmail.com wrote:

Hi Andres,

Not sure about the use of sorting, but I have previously tried some numeric range queries with the new indexProvider. I've added some test code I used below.

Cheers
Paddy

    public class indexTest {
        static GraphDatabaseService gds;
        static IndexProvider provider;

        @BeforeClass
        public static void setup() {
            gds = new EmbeddedGraphDatabase("data/neodb/neodb-tmp");
            provider = new LuceneIndexProvider( gds );
        }

        @Test
        public void fullTextIndex() {
            Transaction tx = gds.beginTx();
            try {
                Node ndOne = gds.createNode();
                ndOne.setProperty("time", 1.3);
                Node ndTwo = gds.createNode();
                ndTwo.setProperty("time", 1.5);
                Node ndThree = gds.createNode();
                ndThree.setProperty("time", 2.0);
                Node ndFour = gds.createNode();
                ndFour.setProperty("time", 3.0);
                Node ndFive = gds.createNode();
                ndFive.setProperty("time", 5.0);

                Index<Node> myIndex = provider.nodeIndex( "fulltext", LuceneIndexProvider.FULLTEXT_CONFIG );
                myIndex.add(ndOne, "time", ndOne.getProperty("time") );
                myIndex.add(ndTwo, "time", ndTwo.getProperty("time") );
                myIndex.add(ndThree, "time", ndThree.getProperty("time") );
                myIndex.add(ndFour, "time", ndFour.getProperty("time") );
                myIndex.add(ndFive, "time", ndFive.getProperty("time") );

                for ( Node searchHit : myIndex.query( "time:[1.5 TO 4.3]" ) ) {
                    System.out.println( "Found " + searchHit.toString() );
                    System.out.println( "time " + searchHit.getProperty("time") );
                }
                tx.success();
            } finally {
                tx.finish();
            }
        }
    }

On Wed, Sep 22, 2010 at 11:15 PM, Andreas Ronge andreas.ro...@jayway.se wrote:

Hi

In the example
https://svn.neo4j.org/laboratory/components/lucene-index/trunk/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java I only see how to sort by Sort.RELEVANCE and Sort.INDEXORDER. How do I sort ascending/ on different fields ? Another related question, how does neo4j work with the new improved numerical capabilities of lucene 3.0. Example if I add an integer with org.neo4j.graphdb.index.Index.add(entity, key, integer_value) will it be index as a NumericField so that I (how?) can use NumericRangeQueries ? (In old lucene one had to convert integers to strings and pad it with zeros) Cheers
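The padding trick mentioned for older Lucene versions can be sketched in plain Java. This only illustrates why padding matters when numbers are compared as strings; the names PaddedNumbers, pad and inRangeAsStrings and the width of 10 digits are assumptions made for the example and are not part of Neo4j or Lucene.

```java
import java.util.Locale;

// Minimal sketch: zero-padding makes string order agree with numeric order
// for non-negative integers. All names here are hypothetical.
class PaddedNumbers {
    // Assumed fixed width; it must cover the largest value you index.
    static final int WIDTH = 10;

    static String pad(int value) {
        if (value < 0) {
            throw new IllegalArgumentException("sketch handles non-negative values only");
        }
        // Locale.ROOT keeps the digits ASCII regardless of the default locale.
        return String.format(Locale.ROOT, "%0" + WIDTH + "d", value);
    }

    // True if value lies in [low, high] when compared as padded strings,
    // which is how a range query over string terms would see it.
    static boolean inRangeAsStrings(int value, int low, int high) {
        String v = pad(value);
        return v.compareTo(pad(low)) >= 0 && v.compareTo(pad(high)) <= 0;
    }
}
```

Unpadded, the string "9" sorts after "10", so a string range query would give surprising results as soon as values differ in length. Padded to the same width, string order and numeric order agree.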
Re: [Neo4j] BatchInserter usage with neo4j-lucene-index
Your code doesn't use the new index framework, but there might be a CLASSPATH issue: neo4j-index 1.1 uses Lucene 2.9.2 and neo4j-lucene-index 0.1-SNAPSHOT uses Lucene 3.0.1. If you've got them both on the classpath there might be problems, so please use one or the other. Or you could use neo4j-index 1.2-SNAPSHOT, where this issue isn't a problem anymore since it's updated to Lucene 3.0.1 as well.

2010/9/24 Paddy paddyf...@gmail.com

Hi,

I got the following error when trying out neo4j-lucene-index 0.1-SNAPSHOT with the BatchInserter; it worked OK when using neo4j-index 1.1. The problem is when I call:

    indexService.getSingleNode("Id", "test");

Should I modify my code to use the new neo4j-lucene-index with the BatchInserter?

Thanks
Paddy

Exception in thread "main" java.lang.NoSuchMethodError: org.apache.lucene.search.IndexSearcher.search(Lorg/apache/lucene/search/Query;)Lorg/apache/lucene/search/Hits;
    at org.neo4j.index.lucene.LuceneIndexBatchInserterImpl.getNodes(LuceneIndexBatchInserterImpl.java:239)
    at org.neo4j.index.lucene.LuceneIndexBatchInserterImpl.getSingleNode(LuceneIndexBatchInserterImpl.java:279)
    at neo4j.indexTest2.main(indexTest2.java:42)

    public class indexTest2 {
        static GraphDatabaseService gds;
        static LuceneFulltextIndexBatchInserter indexService;
        static BatchInserter inserter;

        public static void setup() {
            inserter = new BatchInserterImpl("data/neodb/neodb-tmp",
                    BatchInserterImpl.loadProperties("neo4j_config.props"));
            indexService = new LuceneFulltextIndexBatchInserter(inserter);
        }

        public static void main(String[] args) {
            setup();
            Map<String, Object> nodeProperties = new TreeMap<String, Object>();
            nodeProperties.put("Id", "test");
            long node = inserter.createNode(nodeProperties);
            indexService.index(node, "Id", "test");
            long search = indexService.getSingleNode("Id", "test");
            System.out.println("Search " + search);
            inserter.shutdown();
            indexService.optimize();
            indexService.shutdown();
        }
    }

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
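One way to check for the kind of classpath clash described above is to list every location from which a given class file can be loaded. The following is a rough diagnostic sketch using only the standard library; the class name ClasspathCopies and its methods are made up for illustration, and it assumes the jars in question are visible to the system class loader.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Hypothetical helper: lists every classpath location that provides a class.
class ClasspathCopies {
    // Turns "a.b.C" into the resource name "a/b/C.class".
    static String resourceNameOf(String className) {
        return className.replace('.', '/') + ".class";
    }

    // More than one URL in the result means several jars ship the same class.
    static List<URL> copiesOf(String className) {
        try {
            ClassLoader loader = ClassLoader.getSystemClassLoader();
            return Collections.list(loader.getResources(resourceNameOf(className)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The class named in the NoSuchMethodError above.
        for (URL url : copiesOf("org.apache.lucene.search.IndexSearcher")) {
            System.out.println(url);
        }
    }
}
```

If this prints two different Lucene jars, the NoSuchMethodError is most likely caused by the older and newer Lucene versions both being present.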
Re: [Neo4j] Minor change to neo4j-rest for major advantage
2010/9/21 Jacob Hansson ja...@voltvoodoo.com

Hey all,

I'm looking into modifying the DatabaseLocator class in neo4j-rest to allow connecting to remote databases (i.e. letting neo4j-rest expose RemoteGraphDatabases via REST, enabling monitoring and management via webadmin of completely external projects).

The change would basically just check whether DB_PATH is a local file or an rmi:// URI, and use RemoteGraphDatabase if it is the latter. After the change, you could expose your internal database over RMI as normal (http://components.neo4j.org/neo4j-remote-graphdb/), and then specify the RMI URL as your db path to neo4j-rest (how you do that depends on how you run the project).

There would also have to be some checks put in place, since certain things won't be possible when not running an EmbeddedGraphDatabase. Some things may also need to be done a bit differently and might have to wait, like indexing.

Sure, why not!

I would also like to deprecate this in the current API:

    DatabaseLocator.getXXX(URI baseUri)

and change it to:

    DatabaseLocator.getXXX()

unless I am misunderstanding the reason for this syntax. Currently nothing is done with baseUri; it is simply ignored throughout DatabaseLocator. I assume it was meant to define the database path, but was replaced by a system setting, as it is now? Neither is really a great solution, but it's confusing that there are two apparent ways to do it, and only one that works.

Does anyone have any thoughts about this, or does it all sound OK?

I think that those were created with some sharding in mind or something... I don't know, but they are definitely not used at the moment.

--
Jacob Hansson
Phone: +46 (0) 763503395
Twitter: @jakewins

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Groovy and Unable to lock store..this is usually a result of..
The cause of this problem (the OverlappingFileLockException) is that there's another Neo4j kernel instance already running within the same JVM for that particular store. I also improved the exception to say that.

2010/9/23 Andrew Grealy iag...@yahoo.com

Hi All,

I have been on the learning path for Neo4j and came across a problem people are experiencing. If you write Groovy scripts to learn how to use Neo4j and they bomb out, you are left with the "unable to lock store" problem. I was playing with Groovy code from http://groovyconsole.appspot.com/script/245001 (to reproduce the problem, just introduce a bug, such as commenting out the tx.success() line, then run it again).

I then tried doing the same thing in Java, to confirm Neo4j isn't flaky. Neo4j was rock solid. This may explain why the people answering messages about this problem say it happens because you have two applications accessing the system. (I can tell you that I only had one Groovy script running at a time.)

I would like to make a suggestion: have a special startup mode that can clear the state on startup. (Getting it to work again was painful.) This would be useful in development; it would be an override and not normal operation.

cheers,
Andrew

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
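The JVM behaviour behind that exception can be reproduced without Neo4j. The sketch below is not Neo4j's locking code; it only shows that a second lock attempt on the same file from within one JVM is rejected with OverlappingFileLockException. The class and method names and the temporary file are assumptions made for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical demo: two lock attempts on one file from the same JVM.
class SameJvmLockDemo {
    // Returns true if the second attempt was rejected as overlapping.
    static boolean secondLockRejected() {
        Path file = null;
        try {
            file = Files.createTempFile("store", ".lock");
            try (FileChannel first = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileChannel second = FileChannel.open(file, StandardOpenOption.WRITE)) {
                FileLock held = first.tryLock();
                if (held == null) {
                    return false; // some other process holds the lock
                }
                try {
                    FileLock other = second.tryLock();
                    if (other != null) {
                        other.release();
                    }
                    return false;
                } catch (OverlappingFileLockException e) {
                    // Same JVM already holds a lock on this region of the file.
                    return true;
                }
            } // closing the channels releases the first lock
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                try {
                    Files.deleteIfExists(file);
                } catch (IOException ignored) {
                    // best-effort cleanup of the temp file
                }
            }
        }
    }
}
```

This matches the Groovy console scenario: if a script fails before shutting the database down and the console keeps the same JVM alive, the first instance still holds the lock when the script is run again.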
Re: [Neo4j] Calling org.neo4j.graphdb.index.Index#remove in a beforeCommit event, allowed ?
I know, exceptions can get swallowed at some point in there. I'll try to make that better in some way.

2010/9/20 Andreas Ronge andreas.ro...@jayway.se

Sorry, found the bug in my code. It's rather difficult to debug when using the neo4j event framework, since the stack trace doesn't give you many clues about what went wrong. It may be something that occurred earlier that causes the problem, like calling methods on a deleted node.

Cheers
Andreas

On Mon, Sep 20, 2010 at 4:12 PM, Peter Neubauer peter.neuba...@neotechnology.com wrote:

Andreas,

that is strange. In the beforeCommit, there should be a Transaction open already. Also, are you deleting the Node in the same thread or a new one? Mattias, any comment on that?

Cheers,
/peter neubauer

COO and Sales, Neo Technology
GTalk: neubauer.peter
Skype peter.neubauer
Phone +46 704 106975
LinkedIn http://www.linkedin.com/in/neubauer
Twitter http://twitter.com/peterneubauer
http://www.neo4j.org - Your high performance graph database.
http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

On Fri, Sep 17, 2010 at 6:05 AM, Andreas Ronge andreas.ro...@jayway.se wrote:

Hi

Is it not possible to call org.neo4j.graphdb.index.Index#remove when it's triggered from a beforeCommit event? I get the following exception. (The TxData object contains the removed property (removedNodeProperties) that I want Lucene to remove from its index with the remove method.)

class org.neo4j.graphdb.NotInTransactionException'; Message: null; StackTrace:
org.neo4j.graphdb.NotInTransactionException
    at org.neo4j.index.impl.lucene.ConnectionBroker.acquireResourceConnection(ConnectionBroker.java:32)
    at org.neo4j.index.impl.lucene.LuceneIndex.getConnection(LuceneIndex.java:55)
    at org.neo4j.index.impl.lucene.LuceneIndex.remove(LuceneIndex.java:108)
    at org.neo4j.kernel.impl.core.TransactionEventsSyncHook.beforeCompletion(TransactionEventsSyncHook.java:76)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.doBeforeCompletion(TransactionImpl.java:342)
    at org.neo4j.kernel.impl.transaction.TxManager.commit(TxManager.java:569)
    at org.neo4j.kernel.impl.transaction.TransactionImpl.commit(TransactionImpl.java:104)
    at org.neo4j.kernel.EmbeddedGraphDbImpl$TransactionImpl.finish(EmbeddedGraphDbImpl.java:513)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.jruby.javasupport.JavaMethod.invokeDirectWithExceptionHandling(JavaMethod.java:508)

I have probably done something stupid.

Cheers

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] IndexProvider question
2010/9/17 Honnur Vorvoi vhon...@yahoo.com

Thanks Mattias for the suggestion. What if we had a method in IndexHits<Node> as below?

    ListIterator<Node> listIterator = IndexHits<Node>.getListIterator()

The ListIterator can traverse both ways, as opposed to Iterator, which is forward only. If that solves the reverse-traversal problem for pagination, is there an easy way to get a handle to the ListIterator? Even better would be to get a handle to the underlying list (ArrayList, in this case) so we can go back and forth by the List's index value. I am interested to know if the underlying implementation can facilitate this.

I just created these iterators:

    https://svn.neo4j.org/components/kernel/trunk/src/main/java/org/neo4j/helpers/collection/CachingIterator.java
    https://svn.neo4j.org/components/kernel/trunk/src/main/java/org/neo4j/helpers/collection/PagingIterator.java

which are general purpose and do exactly what you asked for, I hope :) I haven't added anything to the IndexHits interface, but you could try using CachingIterator, or even PagingIterator (which is just a CachingIterator with convenience for handling paging). They can go back and forth through the results:

    IndexHits<Node> hits = myNodeIndex.query( "some:query" );
    PagingIterator<Node> pages = new PagingIterator<Node>( hits, 25 );
    ...
    Iterator<Node> page = pages.nextPage();
    while ( page.hasNext() ) {
        Node node = page.next();
        ...
    }

    // Go to page 10 and iterate through the items in that page
    pages.page( 10 );
    page = pages.nextPage();
    while ( page.hasNext() ) ...

There aren't methods like "go to the last page", and those may have to be added for it to be useful. Try them out and write back what you think about them.
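For readers who want to see the idea without the Neo4j sources, here is a small stand-alone sketch of a caching iterator with a paging helper. It is not the code of the CachingIterator or PagingIterator classes linked above; the class name CachingPager, the page(pageIndex, pageSize) method and its behaviour at the end of the results are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical sketch: wraps a forward-only iterator, remembers what it has
// already returned, and can therefore jump back to earlier pages.
class CachingPager<T> implements Iterator<T> {
    private final Iterator<T> source;                  // e.g. lazily loaded index hits
    private final List<T> visited = new ArrayList<T>(); // items fetched so far
    private int position = 0;                          // index of the next item to return

    CachingPager(Iterator<T> source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return position < visited.size() || source.hasNext();
    }

    @Override
    public T next() {
        if (position == visited.size()) {
            if (!source.hasNext()) {
                throw new NoSuchElementException();
            }
            visited.add(source.next()); // fetch lazily, only when needed
        }
        return visited.get(position++);
    }

    // Returns the items of the given zero-based page and leaves the cursor
    // right after it. Pages past the end of the results come back empty.
    List<T> page(int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        int start = pageIndex * pageSize;
        int end = start + pageSize;
        while (visited.size() < end && source.hasNext()) {
            visited.add(source.next());
        }
        int from = Math.min(start, visited.size());
        int to = Math.min(end, visited.size());
        position = to;
        return new ArrayList<T>(visited.subList(from, to));
    }
}
```

The trade-off is the one discussed in this thread: results up to the furthest page visited stay in memory, but nothing beyond that is fetched from the underlying iterator until it is asked for.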
Date: Thu, 16 Sep 2010 22:39:10 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktinmd6-mrjrjspp92kan+de_bjbw2dp+l3+nt...@mail.gmail.comaanlktinmd6-mrjrjspp92kan%2bde_bjbw2dp%2bl3%2bnt...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 I'd say wrapping the Iterator from IndexHits in something that you can do pagination on would be a good way to go. There could also be such a built-in iterator in the index component for convenience. So the CachingIterator (or whatever it'd be called) would use the underlying IndexHits iterator to lazily fetch new results it hasn't already gotten and remember them so that consecutive requests for a particular item could be returned instantly, and even be able to be positioned at an arbitrary position and iterate from there. It's a pretty generic iterator implementation and it'd be fun to implement... if I get the time for it. 2010/9/16 Honnur Vorvoi vhon...@yahoo.com Thanks a lot Mattias. I was wondering if there's a way in IndexHitsNode to actually traverse back to search results. I know we can traverse forward the search results. I am trying to implement pagination and I am caching the IndexHitsNode for the same. Say, I have already moved to the node #30, but want to return to node#20 to 25. Obviously I dont want to cache all the search results and I like the lazy loading feature in IndexHitsNode Date: Wed, 15 Sep 2010 14:16:09 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlkti=xmdkstyk8l49m1uiqaxroxgrq2rcv004aj...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 I just added a way to do this (not as a persistent config, since they control write behaviour), but instead as an addition to QueryContext. 
So you can do like this: myNodeIndex.query( new QueryContext( name:Mattias occupation:developer ).defaultOperator( Operator.AND ) ); I know it's a bit verbose, but it's a start at least. Grab the latest version and try it out to see if it works for you. 2010/9/10 Mattias Persson matt...@neotechnology.com 2010/9/10, Honnur Vorvoi vhon...@yahoo.com: I would like to set AND as the default operator when I create index using the new index library: Index = indexProvider.nodeIndex( fulltext, LuceneIndexProvider.FULLTEXT_CONFIG ); I didn't find setDefaultOperator (similar to the one in LuceneFulltextQueryIndexService )in any of the provider classes. Is it supported in the new index provider? if not, is there a way we can set the same? Thanks in advance. That functionality is easy to add, I just haven't gotten around to do it. I'll try to add that as soon as possible. Excellent feedback on the new IndexProvider framework, keep it coming! --- On Thu, 9/9/10, Honnur Vorvoi vhon...@yahoo.com wrote: From: Honnur Vorvoi vhon...@yahoo.com Subject: Re: [Neo4j] IndexProvider question To: user@lists.neo4j.org
Re: [Neo4j] REST server: odd behavior in jsonAddToIndex()
2010/9/20 Erick Dennis erick.den...@gmail.com

Hello there!

I stumbled upon some rather strange behavior with the jsonAddToIndex method in the REST server and was wondering whether this was intentional or not. At the moment jsonAddToIndex is the only method that does not accept content of the type application/json but takes plain old text instead. Actually, it even barfs if you give it a JSON-encoded string.

It's intentional, since there really is no point accepting JSON when the only values it will ever receive are URLs for nodes or relationships to index. But for consistency it's probably a good thing... so why not?

In agreement with Peter Neubauer I'm sending a patch to the list that fixes this. In addition to this change I've fixed the tests which I broke. I've also added a new test to check for malformed JSON and have updated the documentation of the curl examples.

Excellent, however the patch didn't make it to the list since attachments have a hard time getting here. Maybe you can send it to me directly and I'll apply it?

The one thing I'd like to note that is now different with this patch is that when attempting to index a node which doesn't exist, the stack trace is no longer propagated to the output. It still returns a 500 as before, but you don't see the org.neo4j.graphdb.NotFoundException. If someone could point me in the right direction on how to solve this, I can gladly include it in the patch.

OK, I'll have a look at the patch and see what might be the problem, apply it, and then come back to you (and the list) with what caused it, all right?

Cheers,
Erick

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Relationship type with name determined by a variable?
2010/9/19 Martin Aryee martin.ar...@gmail.com

Thanks for the responses. Since I'm new to this graph db thing I might very well just be setting up my model wrong.

Yeah, since there is more than one way to model your domain it's not always clear which solution is the best.

I'm representing a group of scientists and want to somehow capture the list of publications they have coauthored. Each publication has a set of keywords representing research areas (e.g. diabetes, clinical trial). I don't know a priori what all the research areas will be, as new ones will be added over time. I'd like to be able to do queries like "find all the people who have worked with scientist A on diabetes". For each partner I also want the list of their shared publications.

For example, I might have 3 scientists (A, B and C), 3 publications (1, 2 and 3) and 3 research areas (cancer, clinical trial and diabetes):

    A and B have coauthored publication 1 on cancer and clinical trials
    A and C have coauthored publications 2 and 3 on diabetes

I had originally planned to have a relationship type for each research area, but maybe I should just be using a single relationship type (is_coauthor_with) and then add properties where the key is a research area and the value is a list of publications? Sorry if this is a bit confused - I'm still trying to get my head around the basic concepts...

It's hard to know what would be the right thing to do here; I probably don't know enough about your whole domain. If you have a fixed (or fairly fixed) and small number of research areas, it could be an idea to have each area as a relationship type, but it may not give you any extra benefits. It would give a performance benefit if some of your nodes had many relationships to many different research areas and you wanted to query for a specific research area... then it'd be faster (once the node and its relationships had been cached into memory from disk) to find those relationships, as opposed to the case where you only had is_coauthor_with and had to look at all the relationships to sort out the ones you wanted. But if the number of research areas were very big, or if you wouldn't ask queries which would benefit from having separate relationship types, then you could go with only one relationship type. So it depends somewhat on what types of queries you're going to ask.

Martin.

On Sep 19, 2010, at 4:45 PM, Jawad - CitizenPlace wrote:

Hi,

you can easily create a relationship type with a name that is not determined from an enum. In fact, you can use any string. You have to create a DynamicRelationshipType (http://api.neo4j.org/current/org/neo4j/graphdb/DynamicRelationshipType.html). You can then write:

    relationship_type = "knows"
    n1.has_relationship(n2, DynamicRelationshipType.withName(relationship_type))

Jawad

On 19/09/10 22:12, Victor Augusto de Campos wrote:

Sorry if I'm saying something silly, I started playing with neo only a couple of weeks ago, but afaik relationships can only be created using enums because of the nature of enums (representing a fixed set of constants) and how that helps to maintain consistent relationship types in your application without the overhead of string manipulation. Is that correct? I can't see any benefit in using strings to store those relationship types; why would you like to do it?

On Sun, Sep 19, 2010 at 4:18 PM, Martin Aryee martin.ar...@gmail.com wrote:

Hi,

I was wondering if it's possible to create a new relationship type with a name that's determined by a variable. For example, when creating a relationship "knows" I'd typically write:

    n1 = graphdb.node()
    n2 = graphdb.node()
    n1.knows(n2)

Instead of the last line, I'd like to do something like:

    relationship_type = "knows"
    n1.has_relationship(n2, relationship_type)

Is that possible at all?

Thanks,
Martin.
--
Jawad, for the CitizenPlace team
Tel: +33 9 52 31 26 45
Mobile: +33 6 20 08 16 13
E-mail: ja...@citizenplace.com

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
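On the enum question raised in this thread: in the Java API a relationship type is identified by its name, which is why both enum constants and names built at run time can serve. The sketch below shows that idea with local stand-ins only; RelType, Fixed and Named are hypothetical names defined here for illustration and are not the Neo4j classes (the real ones being the RelationshipType interface and DynamicRelationshipType.withName).

```java
import java.util.Objects;

// Stand-alone sketch; all types here are local stand-ins, not Neo4j's API.
class RelTypeSketch {
    // Assumed shape: a relationship type is anything that has a name.
    interface RelType {
        String name();
    }

    // Fixed set known at compile time. Enum already provides name().
    enum Fixed implements RelType {
        KNOWS, IS_COAUTHOR_WITH
    }

    // Type whose name is only known at run time, e.g. a new research area.
    static final class Named implements RelType {
        private final String name;

        private Named(String name) {
            this.name = Objects.requireNonNull(name);
        }

        static Named withName(String name) {
            return new Named(name);
        }

        @Override
        public String name() {
            return name;
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof Named && ((Named) other).name.equals(name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    // Two types are treated as the same kind of relationship if names match.
    static boolean sameType(RelType a, RelType b) {
        return a.name().equals(b.name());
    }
}
```

With this shape, a research area that appears later, such as "diabetes", needs no code change: a type is created from the string when it is first seen.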
Re: [Neo4j] IndexProvider question
The implementation in IndexHits doesn't keep previous results in memory, which is a good thing IMHO. So your suggestion that there should be a getListIterator on IndexHits makes sense and is really the same solution (at least how I would choose to implement it) in that it would return a new ListIterator which would have the IndexHits iterator as its underlying iterator and keep visited items around for being able to position it however. 2010/9/17 Honnur Vorvoi vhon...@yahoo.com Thanks Mattias for the suggestion. What if we had a method in IndexHitsNode as below. ListIteratorNode listIterator = IndexHitsNode.getListIterator() The ListIterator can traverse in both ways as opposed to Iterator which is forward only. If that solves the reverse traverse problem for pagination, is there an easy way to get a handle to the ListIterator. Even better would be to get a handle to the underlying list (ArrayList, in this case) so we can go back and forth by List's index value. I am interested to know if the underlying implementation can facilitate this. Date: Thu, 16 Sep 2010 22:39:10 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktinmd6-mrjrjspp92kan+de_bjbw2dp+l3+nt...@mail.gmail.comaanlktinmd6-mrjrjspp92kan%2bde_bjbw2dp%2bl3%2bnt...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 I'd say wrapping the Iterator from IndexHits in something that you can do pagination on would be a good way to go. There could also be such a built-in iterator in the index component for convenience. So the CachingIterator (or whatever it'd be called) would use the underlying IndexHits iterator to lazily fetch new results it hasn't already gotten and remember them so that consecutive requests for a particular item could be returned instantly, and even be able to be positioned at an arbitrary position and iterate from there. 
It's a pretty generic iterator implementation and it'd be fun to implement... if I get the time for it. 2010/9/16 Honnur Vorvoi vhon...@yahoo.com Thanks a lot Mattias. I was wondering if there's a way in IndexHitsNode to actually traverse back to search results. I know we can traverse forward the search results. I am trying to implement pagination and I am caching the IndexHitsNode for the same. Say, I have already moved to the node #30, but want to return to node#20 to 25. Obviously I dont want to cache all the search results and I like the lazy loading feature in IndexHitsNode Date: Wed, 15 Sep 2010 14:16:09 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlkti=xmdkstyk8l49m1uiqaxroxgrq2rcv004aj...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 I just added a way to do this (not as a persistent config, since they control write behaviour), but instead as an addition to QueryContext. So you can do like this: myNodeIndex.query( new QueryContext( name:Mattias occupation:developer ).defaultOperator( Operator.AND ) ); I know it's a bit verbose, but it's a start at least. Grab the latest version and try it out to see if it works for you. 2010/9/10 Mattias Persson matt...@neotechnology.com 2010/9/10, Honnur Vorvoi vhon...@yahoo.com: I would like to set AND as the default operator when I create index using the new index library: Index = indexProvider.nodeIndex( fulltext, LuceneIndexProvider.FULLTEXT_CONFIG ); I didn't find setDefaultOperator (similar to the one in LuceneFulltextQueryIndexService )in any of the provider classes. Is it supported in the new index provider? if not, is there a way we can set the same? Thanks in advance. That functionality is easy to add, I just haven't gotten around to do it. I'll try to add that as soon as possible. Excellent feedback on the new IndexProvider framework, keep it coming! 
--- On Thu, 9/9/10, Honnur Vorvoi vhon...@yahoo.com wrote: From: Honnur Vorvoi vhon...@yahoo.com Subject: Re: [Neo4j] IndexProvider question To: user@lists.neo4j.org Date: Thursday, September 9, 2010, 10:33 PM Thanks Mattias. Since IndexProvider does all LuceneFulltextQueryIndexService can do and much more, I am going to use just IndexProvider. Date: Wed, 8 Sep 2010 16:28:56 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktin4cjw=smw00=1nlkt8ftmys6xtnvtrve_j9...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 Hi Honnur! 2010/9/6, Honnur Vorvoi vhon...@yahoo.com: Hello, I have the following questions with regard
Re: [Neo4j] Graph algos in REST
2010/9/17 Jim Webber j...@webber.name

I'm not convinced I like the way this works:

    POST /node/123/paths
    {"to": "http://localhost:/node/456", "algorithm": "shortestPath", "max depth": 100}

Isn't the intention to retrieve the shortest path? If so I'd prefer:

    GET /node/123/paths?to=http%3a%2f%2flocalhost:%2fnode%2f456&algorithm=shortestPath&maxDepth=100

Traversers work the same way (with POST)... I'm almost positive you were there, Jim, when we decided on not going with the HTTP-FORM-style-question-mark-parameters-pattern-or-whats-it-called. Or is this something different?

And I would perhaps like to see that result cached for a bit. And I want a pony.

Jim

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
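For the GET form Jim proposes, the target node URI has to be percent-encoded before it goes into the query string. The sketch below only illustrates that encoding step with the standard library; it describes the proposal, not what the REST API accepted at the time. The class PathQuery, the parameter names and the port 7474 in the usage note are placeholders, since the original message does not show a port.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for building the proposed GET request line.
class PathQuery {
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Parameter names follow the proposal in the message above.
    static String shortestPathUri(long fromNode, String toNodeUri, int maxDepth) {
        return "/node/" + fromNode + "/paths"
                + "?to=" + encode(toNodeUri)
                + "&algorithm=shortestPath"
                + "&maxDepth=" + maxDepth;
    }
}
```

Calling PathQuery.shortestPathUri(123, "http://localhost:7474/node/456", 100) gives /node/123/paths?to=http%3A%2F%2Flocalhost%3A7474%2Fnode%2F456&algorithm=shortestPath&maxDepth=100. A GET of this shape is what would let intermediaries cache the result, which is the point about caching made above.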
Re: [Neo4j] Possible functional gap in Lucene indexing?
That can't be done currently. The only way to do it would be to loop through all your indices and check whether each contains your Node. And then you have the problem of "which indices exist?". Well, that can't be answered either... the best way there would be to look at the neo4j-store-dir/lucene/ directory and see which subdirectories it contains (one for every index)... but now we're out in deep water and implementation details, and it's definitely not the RightWay to do it IMHO.

2010/9/15 Victor Augusto de Campos piv...@gmail.com

I tried something similar but hit a block when I couldn't find a way to retrieve the indexes stored for a node, so I'm wondering if Lucene can do that with decent performance... I don't know if it can retrieve a relationship like indexed fields - node. Does anyone know if that is possible?

On Wed, Sep 15, 2010 at 9:03 AM, rick.bullo...@burningskysoftware.com wrote:

Hi, all.

We're trying to use Lucene for fulltext indexing of some textual content that is stored in Neo, and we've hit a bit of a roadblock. In some cases, that content will be updated/edited and/or nodes will be removed, but the process by which index information is removed seems awkward. In particular, it would seem that a removeIndex(Node node) method would be extremely helpful for removing all indexes on a particular node. The current method requires retrieving and passing in the original textual content so that the node can be de-indexed.

Is there any solution that would allow index removal given only a Node?

Thanks,
Rick

--
Mattias Persson, [matt...@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
I previously wrote that you could override LuceneFulltextIndexService and assign your Analyzer there, but now I see that it can't be done there... it's on a lower level. So as Peter pointed out, one option would be to go with the new index framework where you can specify an analyzer at index creation time (and perhaps be able to change it later, but currently the code doesn't allow for that) and you'd be all set. 2010/9/15 rick.bullo...@burningskysoftware.com Neo uses the StandardTokenizer by default (lowercase + whitespace). We think we found a way to use StandardTokenizer with the Apache Solr HTMLStripReader reader implementation to handle it, but we're having a tough time determining how best to extend/override neo's default tokenizer implementation, since there are no public methods or constructors which provide a means for overriding them. We're working with the neo team on it and as soon as we have an answer, I'll post it up here. Thanks! ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
Or, a slightly uglier solution: if you are going to use that analyzer for all your fulltext indexing needs then you could modify the source of the index component (very small patch), build a jar and use that instead of the standard one. I attached an example patch (for neo4j-index 1.1). 2010/9/16 Mattias Persson matt...@neotechnology.com I previously wrote that you could override LuceneFulltextIndexService and assign your Analyzer there, but now I see that it can't be done there... it's on a lower level. So as Peter pointed out, one option would be to go with the new index framework where you can specify an analyzer at index creation time (and perhaps be able to change it later, but currently the code doesn't allow for that) and you'd be all set. 2010/9/15 rick.bullo...@burningskysoftware.com Neo uses the StandardTokenizer by default (lowercase + whitespace). We think we found a way to use StandardTokenizer with the Apache Solr HTMLStripReader reader implementation to handle it, but we're having a tough time determining how best to extend/override neo's default tokenizer implementation, since there are no public methods or constructors which provide a means for overriding them. We're working with the neo team on it and as soon as we have an answer, I'll post it up here. Thanks! -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
Oh, I forgot... attachments have a hard time surviving the trip to the mailing list:

Index: src/main/java/org/neo4j/index/lucene/LuceneFulltextDataSource.java
===
--- src/main/java/org/neo4j/index/lucene/LuceneFulltextDataSource.java (revision 5668)
+++ src/main/java/org/neo4j/index/lucene/LuceneFulltextDataSource.java (arbetskopia)
@@ -21,6 +21,7 @@
 
 import java.util.Map;
 
+import org.apache.lucene.analysis.Analyzer;
 import org.apache.lucene.document.Document;
 import org.apache.lucene.document.Field;
 import org.apache.lucene.document.Field.Index;
@@ -53,6 +54,12 @@
     {
         return new LuceneFulltextTransaction( identifier, logicalLog, this );
     }
+
+    @Override
+    protected Analyzer instantiateAnalyzer()
+    {
+        return new MyAnalyzer();
+    }
 
     @Override
     protected Index getIndexStrategy( String key, Object value )
Index: src/main/java/org/neo4j/index/lucene/LuceneDataSource.java
===
--- src/main/java/org/neo4j/index/lucene/LuceneDataSource.java (revision 5668)
+++ src/main/java/org/neo4j/index/lucene/LuceneDataSource.java (arbetskopia)
@@ -171,7 +171,7 @@
         return this.indexService;
     }
 
-    private Analyzer instantiateAnalyzer()
+    protected Analyzer instantiateAnalyzer()
     {
         return LOWER_CASE_WHITESPACE_ANALYZER;
     }

2010/9/16 Peter Neubauer peter.neuba...@neotechnology.com: Mattias, no patch coming through ... Cheers, /peter neubauer

On Thu, Sep 16, 2010 at 9:46 AM, Mattias Persson matt...@neotechnology.com wrote: Or, a slightly uglier solution: if you are going to use that analyzer for all your fulltext indexing needs then you could modify the source of the index component (very small patch), build a jar and use that instead of the standard one. I attached an example patch (for neo4j-index 1.1).

2010/9/16 Mattias Persson matt...@neotechnology.com: I previously wrote that you could override LuceneFulltextIndexService and assign your Analyzer there, but now I see that it can't be done there... it's on a lower level. So as Peter pointed out, one option would be to go with the new index framework where you can specify an analyzer at index creation time (and perhaps be able to change it later, but currently the code doesn't allow for that) and you'd be all set.

2010/9/15 rick.bullo...@burningskysoftware.com: Neo uses the StandardTokenizer by default (lowercase + whitespace). We think we found a way to use StandardTokenizer with the Apache Solr HTMLStripReader reader implementation to handle it, but we're having a tough time determining how best to extend/override neo's default tokenizer implementation, since there are no public methods or constructors which provide a means for overriding them. We're working with the neo team on it and as soon as we have an answer, I'll post it up here. Thanks!

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] rest component branch merge and re-org
2010/9/16 Andreas Kollegger andreas.kolleg...@neotechnology.com Hi Todd (and anyone else working with the rest component in labs), I've just merged in the GraphAlgo branch. Unit tests look good, but it'd be great if you could take a look to see what you think of the state of things. That is probably because there are not pathfinder tests... I'll see if I can add some of those and also the proposed change I wrote about earlier. (pathfinder{single:true/false} vs path / paths). WARNING: As part of the process, I've reshuffled directories around to end up with the usual trunk, branches, tags arrangement of directories. This could of course be disruptive to anyone with direct svn references to laboratory/components/rest. You'll find the GraphAlgoBranch now under laboratory/components/rest/branches/GraphAlgoFactory. Please let me know if anything looks amiss. Cheers, Andreas ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] IndexProvider question
I just added a way to do this (not as a persistent config, since those control write behaviour, but instead as an addition to QueryContext). So you can do it like this:

myNodeIndex.query( new QueryContext( "name:Mattias occupation:developer" ).defaultOperator( Operator.AND ) );

I know it's a bit verbose, but it's a start at least. Grab the latest version and try it out to see if it works for you.

2010/9/10 Mattias Persson matt...@neotechnology.com

2010/9/10, Honnur Vorvoi vhon...@yahoo.com: I would like to set AND as the default operator when I create an index using the new index library: Index = indexProvider.nodeIndex( fulltext, LuceneIndexProvider.FULLTEXT_CONFIG ); I didn't find setDefaultOperator (similar to the one in LuceneFulltextQueryIndexService) in any of the provider classes. Is it supported in the new index provider? If not, is there a way we can set the same? Thanks in advance.

That functionality is easy to add, I just haven't gotten around to doing it. I'll try to add that as soon as possible. Excellent feedback on the new IndexProvider framework, keep it coming!

--- On Thu, 9/9/10, Honnur Vorvoi vhon...@yahoo.com wrote: Thanks Mattias. Since IndexProvider does all LuceneFulltextQueryIndexService can do and much more, I am going to use just IndexProvider.

Date: Wed, 8 Sep 2010 16:28:56 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question

Hi Honnur!

2010/9/6, Honnur Vorvoi vhon...@yahoo.com: Hello, I have the following questions with regard to the IndexProvider (example below): 1. I already have LuceneFulltextQueryIndexService. Can I use IndexProvider with the same graphDb as well? Or are they mutually exclusive?
They are separate from one another so both can be used alongside of each other. Something stored in one of either LuceneIndexService/LuceneIndexProvider won't affect the other. 2. What doesn the param users in provider.nodeIndex(users) represent? The LuceneIndexService can only keep values from one key in each index, but the new LuceneIndexProvider can spawn indexes which can contain any number of keys and values (making compound queries possible). Since an index isn't tied to a property key you must name each index yourself. Each index can also be configured to be either fulltext or not, to use lower case conversion or not, a.s.o. 3. Do I need to add all the properties in IndexNode(line# 45) in order to query? (I have already index the same properties with LuceneFulltextQueryIndexService) see my answer for (1), in short: LuceneIndexProvider and the indexes it spawns has nothing to do with LuceneIndexService (or any derivative thereof) and hence can't share state. 4. Is it easy to include the query(String) method in LuceneFulltextQueryIndexService, so I can use just one indexservice otherwise I would be using LuceneIndexProvider just for query(String) method. To add compound querying the storage format (i.e. Lucene usage) needed to change in incompatible ways, so it isn't an easy fix to add that. It could however be done by querying multiple indexes in parallell and merge the results afterwards, but I don't think performance would be anywhere near using Lucene the right way for compound queries, as LuceneIndexProvider does. 
As always, appreciate your suggestions/recommendations

1  IndexProvider provider = new LuceneIndexProvider( graphDb );
2  Index<Node> myIndex = provider.nodeIndex( "users" );
3
4  myIndex.add( myNode, "type", "value1" );
5  myIndex.add( myNode, "key1", "value2" );
6
7  // Ask lucene queries directly here
8  for ( Node searchHit : myIndex.query( "type:value1 AND key1:value2" ) )
9  {
10     System.out.println( "Found " + searchHit );
11 }

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
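To make the effect of defaultOperator( Operator.AND ) concrete, here is a small stand-alone sketch. It does not use Neo4j or Lucene: it only models a query of key:value clauses against one node's properties, on the understanding that Lucene's query parser combines space-separated clauses with OR unless told otherwise. The class, enum and method names are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefaultOperatorDemo {
    public enum Operator { AND, OR }

    /** True if the properties satisfy the clauses under the given default operator. */
    public static boolean matches(Map<String, String> props,
                                  Map<String, String> clauses,
                                  Operator op) {
        long hits = clauses.entrySet().stream()
                .filter(c -> c.getValue().equals(props.get(c.getKey())))
                .count();
        // AND: every clause must match. OR: at least one clause must match.
        return op == Operator.AND ? hits == clauses.size() : hits > 0;
    }

    public static void main(String[] args) {
        Map<String, String> node = new LinkedHashMap<>();
        node.put("name", "Mattias");
        node.put("occupation", "hacker");

        // Models the query string "name:Mattias occupation:developer".
        Map<String, String> query = new LinkedHashMap<>();
        query.put("name", "Mattias");
        query.put("occupation", "developer");

        System.out.println("OR:  " + matches(node, query, Operator.OR));
        System.out.println("AND: " + matches(node, query, Operator.AND));
    }
}
```

With OR the node is a hit because the name clause matches; with AND it is not, because the occupation clause fails.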
Re: [Neo4j] Possible_functional_gap_in_Lucene_indexing?
2010/9/15 rick.bullo...@burningskysoftware.com: Doh! Seems like we just overlooked the method signature with removeIndex(Node,key), which will do exactly what we want. Excellent! Have to lay off the Duff for a while...

Original Message
Subject: [Neo4j] Possible_functional_gap_in_Lucene_indexing?
From: rick.bullo...@burningskysoftware.com
Date: Wed, September 15, 2010 8:03 am
To: u...@lists.neo4j.org

Hi, all. We're trying to use Lucene for fulltext indexing of some textual content that is stored in Neo, and we've hit a bit of a roadblock. In some cases, that content will be updated/edited and/or nodes will be removed, but the process by which index information is removed seems awkward. In particular, it would seem that a removeIndex(Node node) method would be extremely helpful for removing all indexes on a particular node. The current method requires retrieving and passing in the original textual content so that the node can be de-indexed. Is there any solution that would allow index removal given only a Node? Thanks, Rick

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
That sounds weird. Look at TestLuceneFulltextIndexService#testSimpleFulltext method, it queries for the last word and it seems to work. Could you provide more info on this? 2010/9/15 rick.bullo...@burningskysoftware.com I've noticed that when indexing full text, the last term/word is always ignored. This is a major issue, but I'm not sure if it is in the index utils or in Lucene itself. Any thoughts? Thanks, Rick ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
Couldn't it be that the sentence ends with a dot? "Cheese is good." would then index the words [Cheese, is, good.]. Observe that the last word isn't "good", it's "good." with a dot. I know that has messed up some searches for me at least. You could perhaps override the implementation and instantiate an Analyzer/Tokenizer which gets rid of such punctuation characters?

2010/9/15 rick.bullo...@burningskysoftware.com: Using neo4j-index-1.1 and lucene-core-2.9.2, by the way.

Original Message
Subject: Re: [Neo4j] Bug: LuceneFullTextQueryIndex service ignoring last word/term
From: Mattias Persson matt...@neotechnology.com
Date: Wed, September 15, 2010 10:37 am
To: Neo4j user discussions u...@lists.neo4j.org

That sounds weird. Look at the TestLuceneFulltextIndexService#testSimpleFulltext method, it queries for the last word and it seems to work. Could you provide more info on this?

2010/9/15 rick.bullo...@burningskysoftware.com: I've noticed that when indexing full text, the last term/word is always ignored. This is a major issue, but I'm not sure if it is in the index utils or in Lucene itself. Any thoughts? Thanks, Rick

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
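The trailing-dot effect is easy to reproduce without Lucene. The sketch below only mimics a lower-case plus whitespace tokenizer in plain Java (it is not the analyzer neo4j-index actually uses) and then shows one possible way of stripping leading and trailing punctuation. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class WhitespaceTokens {
    /** Splits on whitespace and lower-cases; punctuation stays attached to words. */
    public static List<String> lowerCaseWhitespace(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.trim().split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    /** Same as above, but drops punctuation at the start and end of each token. */
    public static List<String> stripPunctuation(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : lowerCaseWhitespace(text)) {
            String cleaned = t.replaceAll("^\\p{Punct}+|\\p{Punct}+$", "");
            if (!cleaned.isEmpty()) {
                tokens.add(cleaned);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(lowerCaseWhitespace("Cheese is good.")); // [cheese, is, good.]
        System.out.println(stripPunctuation("Cheese is good."));    // [cheese, is, good]
    }
}
```

A search for "good" finds nothing in the first token list, which would look exactly like "the last word is ignored".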
Re: [Neo4j] Using the REST neo4j
Re: [Neo4j] Timeline class in Neo Index
2010/8/14 Rick Bullotta rick.bullo...@burningskysoftware.com

> Hi, all. Has anyone used the Timeline capabilities for a searchable set of timestamped nodes? I was about to write my own custom linked list implementation using relationships w/a timestamp property, and came across this. I have a need to handle queries similar to those provided by the TimelineIndex with a few additions:
>
> - I want to be able to limit the # of nodes retrieved to a maximum of n

It's just a matter of iterating over n nodes... the iterating is done lazily anyway.

> - I need to be able to retrieve nodes either in increasing timestamp or decreasing timestamp order

That could be an easy patch... look into the code and see if that can be done in an easy way!

> I was planning on either building a new set of classes based on Timeline, extending Timeline, or starting from scratch. Any insights/suggestions welcomed. Best, Rick

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
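The "just iterate over n nodes" suggestion can be sketched generically. The example below assumes nothing about the Timeline API beyond it handing back a lazy Iterable; the endless stand-in source and all names are hypothetical, and the counter only demonstrates that no more than n items are ever pulled.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class FirstN {
    /** Pulls at most n items from a (possibly lazy, possibly endless) source. */
    public static <T> List<T> firstN(Iterable<T> source, int n) {
        List<T> result = new ArrayList<>();
        Iterator<T> it = source.iterator();
        while (result.size() < n && it.hasNext()) {
            result.add(it.next());
        }
        return result;
    }

    public static void main(String[] args) {
        final int[] pulled = {0};
        // Stand-in for a lazily evaluated timeline: an endless run of timestamps.
        Iterable<Long> timestamps = () -> new Iterator<Long>() {
            private long current = 1000L;
            @Override public boolean hasNext() { return true; }
            @Override public Long next() { pulled[0]++; return current++; }
        };
        System.out.println(firstN(timestamps, 3) + " pulled=" + pulled[0]);
        // prints "[1000, 1001, 1002] pulled=3"
    }
}
```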
Re: [Neo4j] neo4j REST server configuration
Is this resolved? Otherwise, take a look at http://wiki.neo4j.org/content/Getting_Started_REST#Configure_amount_of_memory

2010/8/7 Mohit Vazirani mohi...@yahoo.com: Hi, I'm running the standalone neo4j REST server on a 64 bit linux machine with 64GB RAM and am trying to configure the following memory settings through the wrapper.conf file:

wrapper.java.initmemory=16144
wrapper.java.maxmemory=16144

However, when I restart the server, JMX shows me the following VM arguments:

-Dcom.sun.management.jmxremote -Xms4096m -Xmx4096m -Djava.library.path=lib -Dwrapper.key=q8W6vP8LS9mj0ekz -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=27943 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1

Another unrelated issue is that JMX MBeans shows configuration attributes as unavailable when I attach to the REST wrapper. The reason I am looking into modifying the configuration is that my client servers seem to be timing out. The server cannot handle more than 5 concurrent transactions, so I want to tweak the heap size and see if that helps. Thanks, ~Mohit

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] IndexProvider question
2010/9/10, Honnur Vorvoi vhon...@yahoo.com: I would like to set AND as the default operator when I create index using the new index library: Index = indexProvider.nodeIndex( fulltext, LuceneIndexProvider.FULLTEXT_CONFIG ); I didn't find setDefaultOperator (similar to the one in LuceneFulltextQueryIndexService )in any of the provider classes. Is it supported in the new index provider? if not, is there a way we can set the same? Thanks in advance. That functionality is easy to add, I just haven't gotten around to do it. I'll try to add that as soon as possible. Excellent feedback on the new IndexProvider framework, keep it coming! --- On Thu, 9/9/10, Honnur Vorvoi vhon...@yahoo.com wrote: From: Honnur Vorvoi vhon...@yahoo.com Subject: Re: [Neo4j] IndexProvider question To: user@lists.neo4j.org Date: Thursday, September 9, 2010, 10:33 PM Thanks Mattias. Since IndexProvider does all LuceneFulltextQueryIndexService can do and much more, I am going to use just IndexProvider. Date: Wed, 8 Sep 2010 16:28:56 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] IndexProvider question To: Neo4j user discussions user@lists.neo4j.org Message-ID: aanlktin4cjw=smw00=1nlkt8ftmys6xtnvtrve_j9...@mail.gmail.com Content-Type: text/plain; charset=UTF-8 Hi Honnur! 2010/9/6, Honnur Vorvoi vhon...@yahoo.com: Hello, I have the following questions with regard to the IndexProvider(example below): 1. I already have LuceneFulltextQueryIndexService. Can I use IndexProvider with the same graphDb as well? or are they mutually exclusive? They are separate from one another so both can be used alongside of each other. Something stored in one of either LuceneIndexService/LuceneIndexProvider won't affect the other. 2. What doesn the param users in provider.nodeIndex(users) represent? 
The LuceneIndexService can only keep values from one key in each index, but the new LuceneIndexProvider can spawn indexes which can contain any number of keys and values (making compound queries possible). Since an index isn't tied to a property key you must name each index yourself. Each index can also be configured to be either fulltext or not, to use lower case conversion or not, a.s.o. 3. Do I need to add all the properties in IndexNode(line# 45) in order to query? (I have already index the same properties with LuceneFulltextQueryIndexService) see my answer for (1), in short: LuceneIndexProvider and the indexes it spawns has nothing to do with LuceneIndexService (or any derivative thereof) and hence can't share state. 4. Is it easy to include the query(String) method in LuceneFulltextQueryIndexService, so I can use just one indexservice otherwise I would be using LuceneIndexProvider just for query(String) method. To add compound querying the storage format (i.e. Lucene usage) needed to change in incompatible ways, so it isn't an easy fix to add that. It could however be done by querying multiple indexes in parallell and merge the results afterwards, but I don't think performance would be anywhere near using Lucene the right way for compound queries, as LuceneIndexProvider does. 
As always, appreciate your suggestions/recommendations

1  IndexProvider provider = new LuceneIndexProvider( graphDb );
2  Index<Node> myIndex = provider.nodeIndex( "users" );
3
4  myIndex.add( myNode, "type", "value1" );
5  myIndex.add( myNode, "key1", "value2" );
6
7  // Ask lucene queries directly here
8  for ( Node searchHit : myIndex.query( "type:value1 AND key1:value2" ) )
9  {
10     System.out.println( "Found " + searchHit );
11 }

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] IndexProvider question
Hi Honnur!

2010/9/6, Honnur Vorvoi vhon...@yahoo.com:

> Hello, I have the following questions with regard to the IndexProvider (example below):
>
> 1. I already have LuceneFulltextQueryIndexService. Can I use IndexProvider with the same graphDb as well? Or are they mutually exclusive?

They are separate from one another, so both can be used alongside each other. Something stored in either LuceneIndexService or LuceneIndexProvider won't affect the other.

> 2. What does the param "users" in provider.nodeIndex( "users" ) represent?

The LuceneIndexService can only keep values from one key in each index, but the new LuceneIndexProvider can spawn indexes which can contain any number of keys and values (making compound queries possible). Since an index isn't tied to a property key, you must name each index yourself. Each index can also be configured to be either fulltext or not, to use lower case conversion or not, and so on.

> 3. Do I need to add all the properties in Index<Node> (line# 45) in order to query? (I have already indexed the same properties with LuceneFulltextQueryIndexService)

See my answer for (1). In short: LuceneIndexProvider and the indexes it spawns have nothing to do with LuceneIndexService (or any derivative thereof) and hence can't share state.

> 4. Is it easy to include the query(String) method in LuceneFulltextQueryIndexService, so I can use just one index service? Otherwise I would be using LuceneIndexProvider just for the query(String) method.

To add compound querying, the storage format (i.e. Lucene usage) needed to change in incompatible ways, so it isn't an easy fix to add that. It could however be done by querying multiple indexes in parallel and merging the results afterwards, but I don't think performance would be anywhere near using Lucene the right way for compound queries, as LuceneIndexProvider does.

> As always, appreciate your suggestions/recommendations
>
> 1  IndexProvider provider = new LuceneIndexProvider( graphDb );
> 2  Index<Node> myIndex = provider.nodeIndex( "users" );
> 3
> 4  myIndex.add( myNode, "type", "value1" );
> 5  myIndex.add( myNode, "key1", "value2" );
> 6
> 7  // Ask lucene queries directly here
> 8  for ( Node searchHit : myIndex.query( "type:value1 AND key1:value2" ) )
> 9  {
> 10     System.out.println( "Found " + searchHit );
> 11 }

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
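The workaround mentioned under question 4 (query several single-key indexes, then merge) amounts to a set intersection for an AND query. Here is a minimal sketch, assuming each per-key lookup has already been collected into a set of node ids; the ids and all names are made up for illustration and no Neo4j classes are involved.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class MergeIndexHits {
    /** Intersects the hit sets of several single-key lookups (AND semantics). */
    public static Set<Long> intersect(List<Set<Long>> perKeyHits) {
        if (perKeyHits.isEmpty()) {
            return Collections.emptySet();
        }
        Set<Long> result = new TreeSet<>(perKeyHits.get(0));
        for (int i = 1; i < perKeyHits.size(); i++) {
            result.retainAll(perKeyHits.get(i)); // keep only ids present in every set
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical results of two separate single-key lookups.
        Set<Long> typeIsValue1 = new TreeSet<>(Arrays.asList(1L, 2L, 4L));
        Set<Long> key1IsValue2 = new TreeSet<>(Arrays.asList(1L, 3L));
        System.out.println(intersect(Arrays.asList(typeIsValue1, key1IsValue2))); // [1]
    }
}
```

Every per-key result set has to be fully materialised before merging, which is one reason this is likely to be slower than a single compound Lucene query.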
Re: [Neo4j] neo4j REST GraphAlgo
I think it looks great, with maybe some modifications:

- /{node}/path and /{node}/paths (and skip the single attribute), instead of /{node}/pathfinder
- add tests for it (maybe you already have, but I've missed them).

How would you feel about those things? Anyway, I'd love to see this merged into trunk! Would that be something you'd feel comfortable doing (subversion-wise)? Otherwise I'd be happy to do it. / Mattias

2010/9/1, Mattias Persson matt...@neotechnology.com: Hi, I haven't forgotten about you... I just haven't had time to look at it yet. Definitely this week though. So I'll let you know as soon as possible! Best, Mattias

2010/8/29 Peter Neubauer peter.neuba...@neotechnology.com: Todd, Mattias is trying to get the time this weekend to look at it more closely. As you mentioned, a generic way to hook more algos, maybe even Gremlin, onto the URL space of a node/relationship/property would be a good thing, but maybe it is ok to go with a graph-algo hardcoding for now and see how things develop. After all, the API is not frozen yet. Any more thoughts from you? Cheers, /peter neubauer

On Fri, Aug 27, 2010 at 1:55 PM, Todd Chaffee t...@mikamai.com wrote: What are the next steps in moving towards getting the pathfinder GraphAlgo REST APIs as part of the regular distribution?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] neo4j REST GraphAlgo
Hi, I haven't forgotten about you... I just haven't had time to look at it yet. Definitely this week though. So I'll let you know as soon as possible! Best, Mattias

2010/8/29 Peter Neubauer peter.neuba...@neotechnology.com: Todd, Mattias is trying to get the time this weekend to look at it more closely. As you mentioned, a generic way to hook more algos, maybe even Gremlin, onto the URL space of a node/relationship/property would be a good thing, but maybe it is ok to go with a graph-algo hardcoding for now and see how things develop. After all, the API is not frozen yet. Any more thoughts from you? Cheers, /peter neubauer

On Fri, Aug 27, 2010 at 1:55 PM, Todd Chaffee t...@mikamai.com wrote: What are the next steps in moving towards getting the pathfinder GraphAlgo REST APIs as part of the regular distribution?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Limiting index query results - pagination
You can do that out of the box. If the result size is above a certain threshold (100, I think), the results are gathered lazily instead of inside the getNodes method itself. So if you do a query which has lots of hits, you can keep that IndexHits instance around and do pagination and such with it. There's really no need to specify a max number of hits to return; instead just iterate over as many as you like and close the IndexHits instance when you're done. It should still perform very well. If it doesn't, please come back with feedback about it. / Mattias

2010/8/27 Honnur Vorvoi vhon...@yahoo.com: Thanks Mattias, that was very helpful. I have another related question: is there a way we can limit the number of search results from the index query? Basically I am looking to implement pagination in case the search returns, say, thousands of records. Are there any better solutions to achieve the same? Would be interested to know your thoughts. Honnur

--- On Wed, 8/25/10, user-requ...@lists.neo4j.org wrote: Date: Wed, 25 Aug 2010 10:28:10 +0200 From: Mattias Persson matt...@neotechnology.com Subject: Re: [Neo4j] Index search with more than one key

There's a prototype implementation of a new index which solves this (and some other issues as well, e.g. indexing for relationships). The code is at https://svn.neo4j.org/laboratory/components/lucene-index/ and it's built and deployed over at http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/ The new index isn't compatible with the old one, so you'll have to index your data with the new index framework to be able to use it.

IndexProvider provider = new LuceneIndexProvider( graphDb );
Index<Node> myIndex = provider.nodeIndex( "users" );

myIndex.add( myNode, "type", "value1" );
myIndex.add( myNode, "key1", "value2" );

// Ask lucene queries directly here
for ( Node searchHit : myIndex.query( "type:value1 AND key1:value2" ) )
{
    System.out.println( "Found " + searchHit );
}

2010/8/25 Honnur Vorvoi vhon...@yahoo.com: Hello, is there a way we can search nodes based on more than one property key? For example:

node1: type=value1, key1=value2
node2: type=value1, key1=value21
node3: type=value2, key1=value2
node4: type=value1, key1=value21, key2=value4

Let's say the type and key1 properties are indexed. Any suggestions on how we can get all nodes with type=value1 AND key1=value21 in one call? TIA

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
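The "keep the hits around and page through them" advice can be sketched with a plain Iterator standing in for the lazy IndexHits instance. This is only an illustration under that assumption: the HitPager class is hypothetical, and closing the real IndexHits when you are done remains the caller's job.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class HitPager<T> {
    private final Iterator<T> hits;
    private final int pageSize;

    public HitPager(Iterator<T> hits, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.hits = hits;
        this.pageSize = pageSize;
    }

    /** Returns the next page; an empty list means the hits are exhausted. */
    public List<T> nextPage() {
        List<T> page = new ArrayList<>(pageSize);
        while (page.size() < pageSize && hits.hasNext()) {
            page.add(hits.next());
        }
        return page;
    }

    public static void main(String[] args) {
        // Five fake hits, two per page.
        HitPager<Integer> pager = new HitPager<>(Arrays.asList(1, 2, 3, 4, 5).iterator(), 2);
        for (List<Integer> page = pager.nextPage(); !page.isEmpty(); page = pager.nextPage()) {
            System.out.println(page);
        }
    }
}
```

Because only one page is pulled at a time, a query with thousands of hits never has to be materialised in full.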
Re: [Neo4j] LuceneIndexProvider EXACT_CONFIG vs FULLTEXT_CONFIG
Oh yeah, that's right... you can control it via the to_lower_case property in the config. For example, you could create your own fulltext config like this:

    Map<String, String> caseSensitiveFulltextConfig = new HashMap<String, String>( FULLTEXT_CONFIG );
    caseSensitiveFulltextConfig.put( "to_lower_case", "false" );

When type=fulltext (as in the FULLTEXT_CONFIG map), to_lower_case defaults to true. All this is quite experimental, so sorry for the inconvenience.

2010/8/27 Balazs E. Pataki pat...@dsd.sztaki.hu
Thanks for the clarification. Indeed, wildcards work in both modes; however, in FULLTEXT mode it only allows lowercase search strings, while in EXACT mode the search is case sensitive. Regards, --- balazs

On 8/27/10 1:12 PM, Mattias Persson wrote:
Hi Balazs, maybe the names aren't that great... but EXACT means that it indexes your data as it is without chopping it up, whereas FULLTEXT chops up the data into words and indexes every word separately. Both support wildcards, as lucene supports wildcards for both those modes.

2010/8/27 Balazs E. Pataki pat...@dsd.sztaki.hu
Hi, could someone please explain to me when to use EXACT_CONFIG and when to use FULLTEXT_CONFIG with the nodeIndex() of the LuceneIndexProvider? It seems to me that one cannot execute wildcard searches in FULLTEXT_CONFIG mode; it only works when EXACT_CONFIG is used. But what is actually exact about this config, then? Thanks for any hints in advance! --- balazs

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
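To illustrate the EXACT versus FULLTEXT distinction described in this thread, here is a small conceptual sketch in plain Java. It is not how Lucene analyzes text; exactTerms and fulltextTerms are hypothetical helper names, and splitting on whitespace is a simplifying assumption. It only shows why a case-sensitive, whole-value index and a lower-cased, per-word index answer different queries.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

class IndexModeSketch {
    /** EXACT-like: the value is indexed as one unmodified term. */
    static List<String> exactTerms(String value) {
        return Collections.singletonList(value);
    }

    /** FULLTEXT-like: the value is split into words, optionally lower-cased. */
    static List<String> fulltextTerms(String value, boolean toLowerCase) {
        List<String> terms = new ArrayList<>();
        for (String word : value.trim().split("\\s+")) {
            if (word.isEmpty()) continue; // skip the empty token of a blank value
            terms.add(toLowerCase ? word.toLowerCase(Locale.ROOT) : word);
        }
        return terms;
    }

    public static void main(String[] args) {
        String value = "Thomas Anderson";
        System.out.println(exactTerms(value));           // one term, case kept
        System.out.println(fulltextTerms(value, true));  // per word, lower-cased
        System.out.println(fulltextTerms(value, false)); // per word, case kept
    }
}
```

With to_lower_case left at its default, a search string containing capitals has no matching term in the per-word variant, which is consistent with the behaviour Balazs reports.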
Re: [Neo4j] REST put over writing old properties
Hi Suhail, this is expected behaviour. PUT semantics in general mean "replace any existing data with the data I'm sending over right now". So if you'd like to add/set a property foo (and only foo), please use: PUT /node/123/properties/foo with "bar" as the payload/entity.

2010/8/23, Suhail Ahmed suhail...@gmail.com:
Hi David, I was just trying out the operations provided here: http://components.neo4j.org/neo4j-rest/ . I did "Set properties on node" followed by "Set property on node" followed by "Get node". The response codes are coming back correctly. In terms of what I am expecting:
1. After the first PUT I should GET back { "name": "Thomas Anderson", "profession": "Hacker" }
2. After the second PUT I should get back { "name": "Thomas Anderson", "profession": "Hacker", "foo": "bar" }; instead I get back { "foo": "bar" }
Let me know if you need more information. Thanks, cheers su./hail

On Mon, Aug 23, 2010 at 5:10 AM, David Montag david.mon...@neotechnology.com wrote:
Hi Suhail, Could you explain the REST operation you're doing, what results you would expect from that operation, and what actually happens? David

On Sun, Aug 22, 2010 at 9:11 AM, Suhail Ahmed suhail...@gmail.com wrote:
Hi, I have been trying out the Neo4j REST interface and I found that the PUT operation was replacing the existing properties of a node with the new ones. This was happening on single values as well as multiple values. Is this a bug, or am I doing something wrong here? I am using the REST plugin with Firefox. Cheers su./hail

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
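The difference between the two PUT operations discussed in this thread can be sketched with plain maps. This is only a model of the semantics, not the REST server code: putAllProperties and putOneProperty are hypothetical names, and the assumption is that a PUT on the whole property collection replaces it while a PUT on /node/123/properties/foo touches only foo.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PutSemanticsSketch {
    /** PUT on the whole property collection: the payload replaces everything. */
    static Map<String, Object> putAllProperties(Map<String, Object> existing,
                                                Map<String, Object> payload) {
        return new LinkedHashMap<>(payload); // existing properties are dropped
    }

    /** PUT on a single property: only that key is added or overwritten. */
    static Map<String, Object> putOneProperty(Map<String, Object> existing,
                                              String key, Object value) {
        Map<String, Object> updated = new LinkedHashMap<>(existing);
        updated.put(key, value);
        return updated;
    }

    public static void main(String[] args) {
        Map<String, Object> node = new LinkedHashMap<>();
        node.put("name", "Thomas Anderson");
        node.put("profession", "Hacker");

        Map<String, Object> payload = new LinkedHashMap<>();
        payload.put("foo", "bar");

        System.out.println(putAllProperties(node, payload));    // whole-set replace
        System.out.println(putOneProperty(node, "foo", "bar")); // single-key set
    }
}
```

Under this model, the result Suhail saw matches the whole-set replace, and the result he expected matches the single-property PUT that Mattias recommends.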
Re: [Neo4j] Remove Indexes during BatchInsertion
It may be an easy addition (but I'd have to dig into the code to verify that). However, as Peter pointed out, there's no time for that a.t.m. so either try it out yourself or wait patiently a couple of weeks at the very least :) /Mattias 2010/8/22, Peter Neubauer peter.neuba...@neotechnology.com: Paul, yeah, that discussion is still on, but I fear the workload constraints on Mattias and Tobias are preventing it right now. In a few weeks things are easing up again, which will open the door for a number of things that should be done. However, feel free to prototype this in the lab, dunno if any practical work already is done already ... Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Sat, Aug 21, 2010 at 8:54 PM, Paul A. Jackson paul.jack...@pb.com wrote: I have a program for loading data into a graph and would like to support the case where later records contain data for nodes that were defined in prior records. In some cases it is possible that a later record may indicate that a node's property should be null where earlier it was given a value. This causes me to wish I had the removeIndex methods that I have in the non-batch-inserting version of LuceneIndex. Am I out of luck? I recall an earlier discussion where consideration was being given to a version of the batch inserter that implemented the GraphDatabaseService and LuceneIndexService interfaces. Did anything come of that? 
Thanks, -Paul ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Another issue with Neo4j.py
Maybe you're deleting a node which still has relationships left? All relationships on deleted nodes must be deleted in the same transaction.

2010/8/19 Dan Gould d...@dangould.com
I'm trying to delete all the nodes in my graph. I iterate over all relationships and delete them successfully. Then I iterate over all the nodes and delete them (all node deletion is one transaction). Every once in a while, I get:

    in node.delete()
    File virtualenv/lib/python2.6/site-packages/neo4j/_core.py, line 297, in __exit__
    self.__tx.finish()
    java.lang.RuntimeExceptionPyRaisable: org.neo4j.kernel.impl.transaction.TransactionFailureException: Unable to commit transaction

Once this has occurred, I can't delete that node: the same thing happens each time I try to delete it. Any ideas? Thanks, Dan

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] OpenJDK vs SunJDK
Not that I'm aware of. I've been using both on and off for neo4j. 2010/8/19 Amir Hossein Jadidinejad amir.jad...@yahoo.com Hi, Regarding Neo4j project, is there any difference or priority about Sun-Java-6 or OpenJDK? ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] neo4j REST GraphAlgo
That's nice, I'll definitely try it out when I get the time for it!

2010/8/19 Todd Chaffee t...@mikamai.com
I've just checked in a draft implementation of the REST API for GraphAlgoFactory into the laboratory. It currently supports the following:
- algorithms shortestPath, allSimplePaths, and allPaths
- relationship expanders
- max depth (shortestPath seems to ignore this - there may be a bug in org.neo4j.graphalgo.impl.path.ShortestPath)

Could you supply some test case which can trigger it? That ShortestPath algo has been tested and even deployed in production environments with correct behaviour.

- return a single path or all paths.
Before moving forward I wonder if someone could take a quick look at it to see if it fits well with the existing REST API and if the java code looks acceptable. Here are some examples you could use to test.

Again, I'd be happy to whenever I get the time :)

Simple case - show all shortestPaths from Node 1 to Node 3:
    curl -H Accept:application/json -H Content-Type:application/json -d '{ "to": "http://localhost:/node/3" }' -X POST http://localhost:/node/1/pathfinder

So shortest path is the default algo if none is specified?

Specify the algorithm, max depth, and return only one path:
    curl -H Accept:application/json -H Content-Type:application/json -d '{ "to": "http://localhost:/node/3", "algorithm": "allSimplePaths", "max depth": 3, "single path": true }' -X POST http://localhost:/node/1/pathfinder
Restrict relationships:
    curl -H Accept:application/json -H Content-Type:application/json -d '{ "to": "http://localhost:/node/3", "relationships": [ { "type": "KNOWS", "direction": "out" }, { "type": "LOVES" } ] }' -X POST http://localhost:/node/1/pathfinder

My initial thought was to have .../node/1/shortestpaths, .../node/1/allpaths and so on. Or even (maybe my favourite) .../node/1/paths/shortest and .../node/1/paths/simple for multiple paths, and .../node/1/path/shortest and .../node/1/path/simple for single paths, but it's a matter of taste here I'd guess. Your solution is also quite nice. Best, Mattias

Thanks, Todd

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] check for existing relationship between two nodes
IndexProvider provider = new LuceneIndexProvider( graphDb ); RelationshipIndex index = provider.relationshipIndex( myIndex, LuceneIndexProvider.EXACT_CONFIG ); 2010/8/16 Jeff Klann jkl...@iupui.edu Hey I thought I'd try this out. Got the new kernel and downloaded the new index component snapshot but can't figure out how to use it. Perhaps I'm just dense. Looks like the only class in the Javadoc is an abstract class so I don't know how to instantiate a new relationship index. Have an example? Thanks, Jeff Klann On Fri, Jul 30, 2010 at 10:19 AM, Mattias Persson matt...@neotechnology.com wrote: The latest snapshots of things can be found at http://m2.neo4j.org/ and this component (jar-file) can be found in http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/0.1-SNAPSHOT/ 2010/7/30, Arijit Mukherjee ariji...@gmail.com: Thanx Mattias. Can I download a tar.gz or zip file from somewhere? I'm not using Maven in my projects yet...I mean I'm not very comfortable with it. Arijit On 30 July 2010 17:33, Mattias Persson matt...@neotechnology.com wrote: Looping through relatiomships manually is the way to go. However there's a new component in https://svn.neo4j.org/laboratory/components/lucene-index/ which can index relationships and do fast lookups on whether or not a relationship (with a certain attribute even) exists between two nodes. You'll need to go with the latest kernel then as well (as seen in https://svn.neo4j.org/laboratory/components/lucene-index/pom.xml). 2010/7/30, Arijit Mukherjee ariji...@gmail.com: Hi All I have a requirement where I must check if there is an already existing relationship between two nodes (say N1 and N2). 
Right now, I'm doing it as follows: boolean found = false; final IterableRelationship currentRels = N1.getRelationships(RelTypes.KNOWS, Direction.OUTGOING); for (Relationship rel : currentRels) { found = rel.getEndNode().equals(N2); if (found) { do something - like add some property to the existing relationship; break; } } if (!found) { create new relationship between N1 and N2; } This means, for a high volume of data, all the relations going out of N1 will be retrieved and checked - and this seems costly. I'm using the 1.0 API, and wasn't able to find anything that would directly check whether N1 has an outgoing relationship with N2 - like N1.hasRelationship(N2, Direction.OUTGOING) - or something similar. I think there was a similar mail sometime ago. Has there been any updates lately which allows such checks? Or, is there any other direct way to do this with the 1.0 API? Regards Arijit -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Neo4j REST
Sounds good to me! 2010/8/18 Todd Chaffee t...@mikamai.com Hi Mattias, Yes, I think that might be you I was referring to. I'll post any ideas or questions I have to the mailing list as you suggested. Todd ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Broken restrictions example code in the meta-model component
I think I fixed it now.

2010/8/18 Peter Neubauer peter.neuba...@neotechnology.com
Niels, Mattias, I just refactored the maven site to load snippets from actually running code via Tobias' awesome maven hack, see https://svn.neo4j.org/components/meta-model/trunk/src/site/apt/index.apt . However, I don't know how to fix the broken example code on the site for the restrictions, see restrictions() in https://svn.neo4j.org/components/meta-model/trunk/src/test/java/examples/SiteExamples.java . Could you please see if you have time to activate that code, so we get better information to the site? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party.

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Attributes or Relationship Check During Traversal
I'd probably go for that as well. It's harder to go with a gut feeling for the case where you have many categories... it may be better to go with relationships then, because you gain the traversal aspect of it which you don't really get if you go with properties. 2010/8/1 rick.bullo...@burningskysoftware.com rick.bullo...@burningskysoftware.com My instinct says a boolean property would be easier and faster, but ymmv. - Reply message - From: Alex D'Amour adam...@iq.harvard.edu Date: Sun, Aug 1, 2010 2:32 pm Subject: [Neo4j] Attributes or Relationship Check During Traversal To: Neo user discussions user@lists.neo4j.org Hello all, I have a question regarding traversals over a large graph when that traversal depends on a discretely valued attribute of the nodes being traversed. As a small example, the nodes in my graph can have 2 states -- on and off. I'd like to traverse over paths that only consist of active nodes. Since this state attributes can only take 2 values, I see two possible approaches to implementing this: 1) Use node properties, and have the PruneEvaluator and filter Predicate check to see whether the current endNode has a property called on. 2) Create a state node which represents the on state. Have all nodes that are in the on state have a relationship of type STATE_ON incoming from the on node. Have the PruneEvaluator and filter Predicate check whether the node has a single relationship of type STATE_ON, INCOMING. Which is closer to what we might consider best practices for Neo4j? The problem I see in implementation 1 is that that traversal has to hit the property store, which could slow things down. The problem with 2 is that there can be up to #nodes relationships coming from the on state node, and making this more efficient by setting up a tree of on state nodes seems to be manually replicating something that the indexing service has already accomplished. 
Also, how efficiently would each of these two implementations exploit caching (or is this irrelevant?)? Finally, would your answer change if we generalized this to a larger number of categories? Thanks, Alex ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
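As a rough illustration of option 1 from this thread (checking a boolean property on each node while traversing), here is a sketch on a tiny in-memory graph in plain Java. It does not use the Neo4j traversal API or PruneEvaluator; the adjacency map, the "on" flag map, and reachableThroughOn are all hypothetical names for illustration, and it says nothing about the relative performance of the two approaches.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.Arrays;

class OnOffTraversalSketch {
    /** Breadth-first walk that only continues through nodes flagged "on". */
    static Set<Integer> reachableThroughOn(Map<Integer, List<Integer>> neighbours,
                                           Map<Integer, Boolean> on, int start) {
        Set<Integer> visited = new LinkedHashSet<>();
        if (!on.getOrDefault(start, false)) return visited; // start itself is off
        Deque<Integer> queue = new ArrayDeque<>();
        queue.add(start);
        visited.add(start);
        while (!queue.isEmpty()) {
            int current = queue.remove();
            for (int next : neighbours.getOrDefault(current, Collections.emptyList())) {
                // prune: an "off" node is neither returned nor expanded
                if (on.getOrDefault(next, false) && visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Map<Integer, List<Integer>> neighbours = new HashMap<>();
        neighbours.put(1, Arrays.asList(2, 4));
        neighbours.put(2, Arrays.asList(3));
        neighbours.put(4, Arrays.asList(5));

        Map<Integer, Boolean> on = new HashMap<>();
        for (int id : new int[] {1, 2, 3, 5}) on.put(id, true);
        on.put(4, false); // node 5 is on, but only reachable through node 4

        System.out.println(reachableThroughOn(neighbours, on, 1));
    }
}
```

Generalizing to many categories would replace the boolean flag with a category value, which is where Mattias suggests relationships may start to pay off.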
Re: [Neo4j] On twice?
I have the same issue 2010/8/1, Peter Neubauer peter.neuba...@neotechnology.com: Tom, let me sort that out tomorrow ... Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Sun, Aug 1, 2010 at 10:21 AM, Tom Smith tas...@york.ac.uk wrote: Small point, somehow I seem to be on neo4j list twice, getting two of everything... twice... Tom ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] check for existing relationship between two nodes
Looping through relationships manually is the way to go. However, there's a new component in https://svn.neo4j.org/laboratory/components/lucene-index/ which can index relationships and do fast lookups on whether or not a relationship (with a certain attribute even) exists between two nodes. You'll need to go with the latest kernel then as well (as seen in https://svn.neo4j.org/laboratory/components/lucene-index/pom.xml).

2010/7/30, Arijit Mukherjee ariji...@gmail.com:
Hi All, I have a requirement where I must check if there is an already existing relationship between two nodes (say N1 and N2). Right now, I'm doing it as follows:

    boolean found = false;
    final Iterable<Relationship> currentRels = N1.getRelationships(RelTypes.KNOWS, Direction.OUTGOING);
    for (Relationship rel : currentRels) {
        found = rel.getEndNode().equals(N2);
        if (found) {
            // do something - like add some property to the existing relationship
            break;
        }
    }
    if (!found) {
        // create new relationship between N1 and N2
    }

This means that, for a high volume of data, all the relationships going out of N1 will be retrieved and checked - and this seems costly. I'm using the 1.0 API, and wasn't able to find anything that would directly check whether N1 has an outgoing relationship with N2 - like N1.hasRelationship(N2, Direction.OUTGOING) - or something similar. I think there was a similar mail some time ago. Have there been any updates lately which allow such checks? Or, is there any other direct way to do this with the 1.0 API? Regards Arijit -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be.

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
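The trade-off in this thread (scan a node's relationships versus look the pair up in an index) can be sketched on an in-memory model in plain Java. This is not the Neo4j or lucene-index API: the class, the string key format, and both method names are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RelationshipLookupSketch {
    /** Minimal relationship record: start node id, type name, end node id. */
    static final class Rel {
        final long start; final String type; final long end;
        Rel(long start, String type, long end) { this.start = start; this.type = type; this.end = end; }
    }

    private final Map<Long, List<Rel>> outgoing = new HashMap<>();
    private final Set<String> index = new HashSet<>();

    void createRelationship(long start, String type, long end) {
        outgoing.computeIfAbsent(start, k -> new ArrayList<>()).add(new Rel(start, type, end));
        index.add(key(start, type, end)); // maintained alongside the graph
    }

    /** Manual loop: cost grows with the number of relationships on the start node. */
    boolean hasRelationshipByScan(long start, String type, long end) {
        for (Rel rel : outgoing.getOrDefault(start, Collections.emptyList())) {
            if (rel.type.equals(type) && rel.end == end) return true;
        }
        return false;
    }

    /** Index lookup: one hash probe on the (start, type, end) triple. */
    boolean hasRelationshipByIndex(long start, String type, long end) {
        return index.contains(key(start, type, end));
    }

    private static String key(long start, String type, long end) {
        return start + "|" + type + "|" + end;
    }

    public static void main(String[] args) {
        RelationshipLookupSketch graph = new RelationshipLookupSketch();
        graph.createRelationship(1, "KNOWS", 2);
        System.out.println(graph.hasRelationshipByScan(1, "KNOWS", 2));
        System.out.println(graph.hasRelationshipByIndex(1, "KNOWS", 3));
    }
}
```

The index costs extra writes and storage on every relationship creation, so for nodes with few relationships the plain loop may well be good enough.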
Re: [Neo4j] check for existing relationship between two nodes
The latest snapshots of things can be found at http://m2.neo4j.org/ and this component (jar-file) can be found in http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/0.1-SNAPSHOT/ 2010/7/30, Arijit Mukherjee ariji...@gmail.com: Thanx Mattias. Can I download a tar.gz or zip file from somewhere? I'm not using Maven in my projects yet...I mean I'm not very comfortable with it. Arijit On 30 July 2010 17:33, Mattias Persson matt...@neotechnology.com wrote: Looping through relatiomships manually is the way to go. However there's a new component in https://svn.neo4j.org/laboratory/components/lucene-index/ which can index relationships and do fast lookups on whether or not a relationship (with a certain attribute even) exists between two nodes. You'll need to go with the latest kernel then as well (as seen in https://svn.neo4j.org/laboratory/components/lucene-index/pom.xml). 2010/7/30, Arijit Mukherjee ariji...@gmail.com: Hi All I have a requirement where I must check if there is an already existing relationship between two nodes (say N1 and N2). Right now, I'm doing it as follows: boolean found = false; final IterableRelationship currentRels = N1.getRelationships(RelTypes.KNOWS, Direction.OUTGOING); for (Relationship rel : currentRels) { found = rel.getEndNode().equals(N2); if (found) { do something - like add some property to the existing relationship; break; } } if (!found) { create new relationship between N1 and N2; } This means, for a high volume of data, all the relations going out of N1 will be retrieved and checked - and this seems costly. I'm using the 1.0 API, and wasn't able to find anything that would directly check whether N1 has an outgoing relationship with N2 - like N1.hasRelationship(N2, Direction.OUTGOING) - or something similar. I think there was a similar mail sometime ago. Has there been any updates lately which allows such checks? Or, is there any other direct way to do this with the 1.0 API? 
Regards Arijit -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Node attributes as multiple lucene fields
2010/7/19 Andrew Mutz andrew.m...@appfolio.com Thanks Tobias! One more quick question: I'm using Neo4J-Rest. Right now, I'm modifying the rest project to use this prototyped index component. I'm fine continuing down this road, but I wanted to ask if anyone else has been doing the same thing? Is there a (presumably prerelease) version of Neo4J-Rest that uses this for indexing? No, it hasn't been done yet. If not, I'll continue modifying neo4j-rest on my own to test it out. Also, regarding timescale, should we expect to see this new indexing component be in a releasable state in a few months? Hopefully that will be the case. Compound indexing is a quite requested feature for Neo4j Thanks very much. I've been very impressed with Neo4J so far. Glad to hear that! -Andrew. On Sat, Jul 17, 2010 at 8:43 PM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: This feature is available in the new index component that is being prototyped at https://svn.neo4j.org/laboratory/components/lucene-index/ It is being built by the buildbot, which means that prebuilt snapshots are available at http://m2.neo4j.org/org/neo4j/neo4j-lucene-index/0.1-SNAPSHOT/ Please try it out and let us know what you think! Cheers, Tobias On Sun, Jul 18, 2010 at 3:42 AM, Andrew Mutz andrew.m...@appfolio.com wrote: Hi all, I've been getting up to speed in the last few days with the Lucene indexing capabilities in Neo4J and I have a question: When Neo4J creates a Lucene Document for indexing, it only assigns it two fields, the node id and the contents to be indexed. Is it possible to write to multiple lucene document fields? What I'd like is to be able to index multiple node attributes as multiple fields in a single lucene document. My goal is to be able to search on one field (node attribute) and use the others as boost fields for sorting the relevancy of the results returned. If my understanding is correct, and this is not currently possible, is this planned in the future? 
If it is not planned, would the Neo4J community be interested in me adding this functionality? And who would I talk to about this? Thanks, Andrew. -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857 ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Lucene and sorting results
Any luck with this?

2010/7/20 Mattias Persson matt...@neotechnology.com
I copied that org.apache.lucene.Hits class into the lucene-index component, so it exists there in that package (and has existed there since the birth of this component). That's the class that LuceneIndex.search uses, not the one from lucene-core-3 (since it has been removed).

2010/7/20 Andrew Mutz andrew.m...@appfolio.com
I was changing the neo4j-rest server to use the new lucene-index framework myself, and have been very frustrated with this problem. There seems to be a lucene version conflict:
- org.neo4j.index.impl.lucene.LuceneIndex.search() uses org.apache.lucene.Hits, which was removed in lucene 3.0
- org.neo4j.index.impl.lucene.IndexType.query() seems to assume lucene version 3.0 (Version.LUCENE_30)
So I can't use lucene 3.0 or above for the first reason, and I need to use 3.0 for the second reason. How are others able to use this? Am I doing something wrong? Maybe I should just wait until your changes go in? Thanks, Andrew.

On Tue, Jul 20, 2010 at 12:38 PM, Mattias Persson matt...@neotechnology.com wrote:
Sorting by relevance is possible via http://components.neo4j.org/neo4j-index/apidocs/org/neo4j/index/lucene/LuceneIndexService.html#getNodes(java.lang.String,%20java.lang.Object,%20org.apache.lucene.search.Sort) . Exposing this sorting thingie would require you to add that in the rest code as well (as you probably could guess). But the IndexService doesn't support querying for more than one property at a time. However, there's a new indexing framework in the making over at https://svn.neo4j.org/laboratory/components/lucene-index/ which allows you to do these types of queries. This new framework will probably make its way into trunk rather soon and eventually replace the indexing found in the neo4j-index component today. So the answer is no if you use the neo4j-index component (which REST does), but it's yes if REST were to use the new framework instead. I'll commit the additions regarding sorting and all that soon (I'm experimenting with it a.t.m.). You could, for example, ask a query like:

    for ( Node node : myTitleIndex.query( new QueryContext( "+title:foo* description:bar" ).sort( Sort.RELEVANCE ) ) ) {}

2010/7/16 Andrew Mutz andrew.m...@appfolio.com
Hi all, I've been evaluating using Neo4J for a project at my company and have been consistently impressed with its capabilities. There is one thing I need to do, however, that I'm not sure is possible. I'm using the Neo4J REST server. I've been using lucene full text indexing/searching on my node attributes with great success. What I want to be able to do is to adjust the relevancy of the results returned by lucene based on attributes *other* than the one I'm searching on. Example: Nodes have attributes title and description. I want to search for all nodes, say, whose title matches foo*, but have whether or not description matches bar* affect the order of the search results. Is this possible? I'm very comfortable getting my hands dirty in the source, so if this is going to require some hacking, just point me in the right direction. I've been extensively modifying the REST server to fit my needs, so ideally my changes would be in that part of the code base. But I'm willing to dig deeper if necessary. Thanks much, Andrew. -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy.

___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Querying for nodes that have no relationhip to a specfic node
One benefit you of Neo4j is that you can get rid of these pesky background jobs and instead calculate such things on the fly quite fast, and not needing to store that calculated info at all. Tried it? 2010/7/28, Alberto Perdomo alberto.perd...@gmail.com: Hi everyone, I would have an SQL db for the app besides the graph db. I have users that I would store as nodes within the graph besides storing them in SQL as well. Within those nodes I store attributes like male/female, age or date of birth, etc. I would have one kind of relationship for friendship, which doesn't present any kind of problem and I would do the standard type of queries neo4jr-social provides (e.g. friend suggestions, degrees of separation, friends in common, ...) We want to measure the compatibility/taste match/whatever between users in background, meaning for instance how much you have in common. This is done in Ruby. The result will be an integer between 0 and 100. BTW, this value is symmetric, meaning it could be modelled as a bidirectional relationship. Let's say I have 10k users and for every user I calculate the match between him and 10 other users. If I store all the results I calculate I potentially up to 100k relationships every day / 3m relationships every month. If I store this in SQL it can turn into a bottleneck very fast. The table will grow soon too big and the queries will be slower and slower. That's when I started thinking in storing those relationships in Neo4j because it's meant to handle a very large number of nodes and relationships really efficiently. I can model that as a relationship and either store the value inside the relationship or code the relationship names as 'match_high, match_medium, match_low' Now back to step 1. Selecting the users I'll be calculating new relationships with. They must match certain criteria, e.g. female/male, similar age, etc. and it could be pseudo random. 
Now the first step, if you think in SQL, is to query for all users that match the criteria and don't have a relationship with user A. Yesterday, looking at the Neo4j docs, I thought this kind of query cannot be done. I could select all the users that match the criteria from SQL, then query all the relationships for A from Neo4j, subtract those from the array of valid users and randomly pick n users. Because n is a low value, perhaps 10, this looks to me like a very inefficient way of doing this. Also it will be fast at the beginning but it will get slower as the relationship density grows with time... Maybe I should consider a different strategy. I've also been considering only storing high or interesting values, but it would be more interesting to have the top n users for A ordered by relationship value. If I go ahead with this then I could just go and store it within SQL. This is not what we strive for, but if I don't find a better way I guess we'll have to live with that. Also the solution I find should be easily scalable. It should also apply when having for instance 100k users. Any thoughts or comments? What would you recommend? Thanks for the help guys! Alberto. -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
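The subtract-and-pick strategy described in this message can be sketched in plain Java. This is only an illustration under assumptions: the candidate IDs stand in for the SQL result, the `relatedToA` set stands in for the nodes already connected to user A in Neo4j, and `CandidatePicker` and `pickUnrelated` are hypothetical names, not part of any library.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;

// Hypothetical helper; the inputs stand in for the SQL and Neo4j results.
final class CandidatePicker {

    /** Returns up to n candidates that have no relationship to user A yet. */
    static List<Long> pickUnrelated(Collection<Long> candidates,
                                    Set<Long> relatedToA,
                                    int n,
                                    Random random) {
        List<Long> unrelated = new ArrayList<>();
        for (Long id : candidates) {
            // Hash lookup, so each check is cheap even when A has many relationships.
            if (!relatedToA.contains(id)) {
                unrelated.add(id);
            }
        }
        Collections.shuffle(unrelated, random);
        return new ArrayList<>(unrelated.subList(0, Math.min(n, unrelated.size())));
    }

    public static void main(String[] args) {
        List<Long> candidates = Arrays.asList(1L, 2L, 3L, 4L, 5L, 6L);
        Set<Long> relatedToA = new HashSet<>(Arrays.asList(2L, 4L));
        System.out.println(pickUnrelated(candidates, relatedToA, 3, new Random()));
    }
}
```

The cost of this sketch grows with the number of candidates plus the number of relationships A already has, which is the scaling concern raised in the message.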
Re: [Neo4j] TraversalDescription building hiccup
2010/7/27 Peter Neubauer peter.neuba...@neotechnology.com Hi all, I just stumbled over the immutable TraversalDescription API (http://components.neo4j.org/neo4j-kernel/apidocs/index.html), which will not modify the object if you do TraversalDescription td = new TraversalDescriptionImpl(); td.depthFirst(); Instead, one needs to reassign td, like TraversalDescription td = new TraversalDescriptionImpl(); td = td.depthFirst(); However, TraversalDescription td = new TraversalDescriptionImpl().depthFirst(); will give you the expected td. IMHO this is unexpected behaviour and hard to grasp if you just follow the common fluent API and presume a Builder pattern, especially since no errors are thrown and you just end up with strange results and unreachable code in e.g. a custom PruneEvaluator etc. True, the API says it is immutable, but still I think this is hard. WDYT? Should we think of changing this to a proper builder.modify().modify() etc. and finally builder.build(), which gives you the final, immutable instance of TraversalDescription and is clearly understandable by clients? I still think the current approach is more useful (although it'd be nice to have more input on this). One reason I think it's better is that you can half-bake descriptions as private static final or similar and then complete the descriptions in several different places in your code. You can even pass descriptions into methods and whatnot, without any risk of them being modified. I think the javadoc should explain this better, and it should be expected that developers read javadoc, right? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. 
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
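The behaviour debated in this thread can be reproduced without Neo4j. The following is a minimal sketch, assuming a hypothetical `Description` class that only mimics an immutable fluent API (it is not the real TraversalDescription): every modifier returns a new instance, so a discarded result changes nothing, while a shared constant can be completed in several places without risk.

```java
// Hypothetical stand-in classes; NOT Neo4j's TraversalDescription.
final class ImmutableFluentDemo {
    // A half-baked description that can be shared safely as a constant.
    static final Description BASE = new Description().depthFirst();

    public static void main(String[] args) {
        Description td = new Description();
        td.depthFirst();                 // result discarded, td is unchanged
        System.out.println(td.order());  // prints BREADTH_FIRST

        td = td.depthFirst();            // reassigned, td is now depth first
        System.out.println(td.order());  // prints DEPTH_FIRST

        // Completing BASE elsewhere yields a new object; BASE itself stays as it was.
        Description shallow = BASE.maxDepth(2);
        System.out.println(shallow.maxDepth()); // prints 2
    }
}

final class Description {
    private final String order;
    private final int maxDepth;

    Description() {
        this("BREADTH_FIRST", Integer.MAX_VALUE);
    }

    private Description(String order, int maxDepth) {
        this.order = order;
        this.maxDepth = maxDepth;
    }

    // Each modifier leaves this instance untouched and returns a modified copy.
    Description depthFirst() {
        return new Description("DEPTH_FIRST", maxDepth);
    }

    Description maxDepth(int depth) {
        return new Description(order, depth);
    }

    String order() {
        return order;
    }

    int maxDepth() {
        return maxDepth;
    }
}
```

A mutable builder with a final build() step would make the first call in main behave as a newcomer expects, at the price of losing the safely shareable BASE constant; that is the trade-off the two posters disagree on.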
Re: [Neo4j] Batch inserter shutdown taking forever
Since you're doing a depth 1 traversal please use something like this instead: for ( Relationship rel : graphDb.getReferenceNode().getRelationships( Relationships.ROUTE, Direction.OUTGOING ) ) { Node node = rel.getEndNode(); // Do stuff } Since a traverser keeps more memory than a simple call to getRelationships. Another thing, are you doing any write operation in that for-loop of yours? Also do you shut down the batch inserter and start a new EmbeddedGraphDatabase to traverse on, or how do you get a hold of the graphDb? 2010/7/26 Tim Jones bogol...@ymail.com OK, I found out what's taking the time. It's iterating over the result set of a traverser: // visit each Route node, and add it to the array Traverser routes = graphDb.getReferenceNode().traverse( Traverser.Order.BREADTH_FIRST, StopEvaluator.DEPTH_ONE, ReturnableEvaluator.ALL_BUT_START_NODE, Relationships.ROUTE, Direction.OUTGOING); for (Node node : routes) { // do stuff } The 'for' loop takes ages. There are probably 2m nodes being returned by that traverser at the moment, and that's only a very small subset of the data I want to add to the database. is there any way to tinker with the neo4j properties or anything to improve performance here? Thanks - Original Message From: Mattias Persson matt...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Sat, July 24, 2010 10:23:02 PM Subject: Re: [Neo4j] Batch inserter shutdown taking forever 2010/7/21 Tim Jones bogol...@ymail.com Hi, I'm using a BatchInserter and a LuceneIndexBatchInserter to insert 5m nodes and 5m relationships into a graph in one go. The insertion seems to work, but shutting down takes forever - it's been 2 hours now. At first, the JVM gave me garbage collection exception, so I've set the heap to 2gb. 
'top' tells me that the application is still running: PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 9994 tim 17 0 2620m 2.3g 238m S 99.5 39.1 115:48.84 java but checking the filesystem by running 'ls -l' a few times doesn't indicate that files are being updated. Is this normal? Is there a way to improve performance? No, it sounds quite weird. Any chance to have a look at your code? I'm loading all my data in one go to ease creating the db - it's simpler to create it from scratch each time instead of updating an existing database - so ideally I don't want to break this job down into multiple smaller jobs (actually, this would be OK if performance was good, but I ran into problems inserting data and retrieving existing nodes). What kind of problems? could you supply code and description of your problems? Problems doing something similar in relational dbs. Also, the API recommends to optimise the batch search index before using it for lookups. I just decided not to take this approach. Thanks, Tim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Enabling LRU cache with BatchInserter
Are you thinking of the method in LuceneIndexService? The batch inserter index doesn't have such a method. Do you have performance problems inserting stuff, or why do you want such a method? 2010/7/27 Mohit Vazirani mohi...@yahoo.com Hi, I'm trying to call enableCache(..) for the following example: http://wiki.neo4j.org/content/Batch_Insert#Using_batch_inserter_together_with_indexing How would I go about doing that? ~Mohit ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Performance problem inserting nodes with many short string properties
2010/7/22 Jeff Klann jkl...@iupui.edu And in the meantime I'm rewriting some code to use the batch inserter, but the LuceneIndexBatchInserterImpl is not reading an index that already exists in the db! (I'm trying to use a pre-existing index to find parent nodes for the nodes I'm inserting.) The shell verifies the index is in fact there. What? Yeah... about that: that's an issue in neo4j-kernel version 1.0 (although it's fixed in 1.1-SNAPSHOT). - Jeff Klann On Thu, Jul 22, 2010 at 11:27 AM, Jeff Klann jkl...@iupui.edu wrote: I'm stumped on this one. I'm getting the fast write performance at first that slows to a crawl issue described in the performance guide, so I increased the Linux dirty_page ratio (all the way up to 80%), turned of auto log rotation, and increased the size of the memory mapped cache. This issue is still happening exactly as before. I've narrowed my problem to this: *If I insert a lot of nodes with about 50 short string properties each, the performance slows to a crawl at about 40,000 inserts (and it stays slow)* ... however if I don't insert the properties the performance is fine. What am I doing wrong? The machine currently has a small amount of RAM, but I don't understand why that would impact pure insertion, and only after thousands of inserts. (I don't read the properties back after adding them.) I have not used BatchInserter because it is nice to have normal database access for some parts of this database builder program I'm writing, but if that's the only way I could refactor. Also all these inserts are within one transaction (about 100k nodes per transaction) - do I need to split this into smaller transactions? Thanks, Jeff Klann ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Question about labelling all connected components
2010/7/23 Arijit Mukherjee ariji...@gmail.com Thanx to both of you. Yes, I can just check whether the label exists on the node or not. In my case checking for Integer.MIN_VALUE which is what is assigned when the subscriber node is created. To assign a temporary value (or a value representing the state not assigned) seems unecessary. A better way would be to not set that property on creating a node and then use: node.getProperty( whatever key, Integer.MIN_VALUE ); when getting that property. BTW - is it ever possible to label the components while creating the graph? I can't think of any way of doing this - but I might be missing something... Regards Arijit On 22 July 2010 20:54, Vitor De Mario vitordema...@gmail.com wrote: As far as the algorithm goes, I see nothing wrong. Connected components is a well known problem in graph theory, and you're doing just fine. I second the recommendations of Tobias, specially the second one, as you would get rid of the labelled collection completely, and that improves you both in time and memory. []'s Vitor On Thu, Jul 22, 2010 at 11:35 AM, Tobias Ivarsson tobias.ivars...@neotechnology.com wrote: The first obvious thing is that labelled.contains(currentNode.getId()) is going to take more time as your dataset grows, since it's a linear search for the element in an ArrayList. A HashSet would be a much more appropriate data structure for your application. The other thing that comes to mind is the memory overhead of the labelled-collection. Eventually it is going to contain every node in the graph, and be very large. This steals some of the memory that could have been used for caching the graph, forcing Neo4j to do more I/O than it would have if it could have used that memory for cache. Would it be possible for you to replace the !labelled.contains(currentNode.getId())-check with currentNode.getProperty(componentID,null) == null? Or are there situations where the node could have that property and not be considered labeled? 
Cheers, Tobias On Thu, Jul 22, 2010 at 3:35 PM, Arijit Mukherjee ariji...@gmail.com wrote: Hi All I'm trying to label all connected components in a graph - i.e. all nodes that are connected will have a common componentID property set. I'm using the Traverser to do this. For each node in the graph (unless it is already labelled, which I track by inserting the node ID in a list), the traverser finds out all the neighbours using BFS, and then the node and all the neighbours are labelled with a certain value. The code is something like this - IterableNode allNodes = graphDbService.getAllNodes(); ArrayList labelled = new ArrayList(); for (Node currentNode : allNodes) { if (currentNode.hasProperty(number) !labelled.contains(currentNode.getId())) { Traverser traverser = currentNode.traverse(Order.BREADTH_FIRST, StopEvaluator.END_OF_GRAPH, ReturnableEvaluator.ALL_BUT_START_NODE, RelTypes.CALLS, Direction.BOTH); int currentID = initialID; initialID++; currentNode.setProperty(componentID, currentID); labelled.add(currentNode.getId()); for (Node friend : traverser) { friend.setProperty(componentID, currentID); // mark each node as labelled labelled.add(friend.getId()); } } } This works well for a small graph (2000 nodes). But for a graph of about 1 million nodes, this is taking about 45 minutes on a 64-bit Intel 2.3GHz CPU, 4GB RAM (Java 1.6 update 21 and Neo4J 1.0). Is this normal? Or is the code I'm using faulty? Is there any other way to label the connected components? Regards Arijit -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. 
-- Tobias Ivarsson tobias.ivars...@neotechnology.com Hacker, Neo Technology www.neotechnology.com Cellphone: +46 706 534857 -- And when the night is cloudy, There is still a light that shines on me, Shine on until tomorrow, let it be. -- Mattias Persson, [matt
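The labelling approach discussed in this thread can be sketched independently of Neo4j. The sketch below assumes a hypothetical in-memory adjacency map in place of the graph database. The map of component IDs plays the role of the componentID property, so no separate labelled list is needed and the "already labelled?" check is a hash lookup rather than a linear scan of an ArrayList.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory stand-in for the graph; not Neo4j API.
final class ComponentLabeller {

    /**
     * Labels every node with a component ID using breadth-first search.
     * The adjacency map must list each undirected edge in both directions.
     */
    static Map<Long, Integer> label(Map<Long, List<Long>> adjacency) {
        Map<Long, Integer> componentOf = new HashMap<>();
        int nextId = 0;
        for (Long start : adjacency.keySet()) {
            if (componentOf.containsKey(start)) {
                continue; // already labelled by an earlier traversal
            }
            int id = nextId++;
            Deque<Long> queue = new ArrayDeque<>();
            componentOf.put(start, id);
            queue.add(start);
            while (!queue.isEmpty()) {
                Long current = queue.poll();
                List<Long> neighbours =
                        adjacency.getOrDefault(current, Collections.<Long>emptyList());
                for (Long neighbour : neighbours) {
                    // putIfAbsent returns null only if the neighbour was unlabelled.
                    if (componentOf.putIfAbsent(neighbour, id) == null) {
                        queue.add(neighbour);
                    }
                }
            }
        }
        return componentOf;
    }

    public static void main(String[] args) {
        Map<Long, List<Long>> graph = new HashMap<>();
        graph.put(1L, Arrays.asList(2L));
        graph.put(2L, Arrays.asList(1L, 3L));
        graph.put(3L, Arrays.asList(2L));
        graph.put(4L, Arrays.asList(5L));
        graph.put(5L, Arrays.asList(4L));
        // Nodes 1, 2, 3 share one ID; nodes 4, 5 share another.
        System.out.println(label(graph));
    }
}
```

In the Neo4j version of this loop, the equivalent of the containsKey check would be the property test suggested by Tobias and Mattias in the replies.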
Re: [Neo4j] Batch inserter shutdown taking forever
2010/7/21 Tim Jones bogol...@ymail.com Hi, I'm using a BatchInserter and a LuceneIndexBatchInserter to insert 5m nodes and 5m relationships into a graph in one go. The insertion seems to work, but shutting down takes forever - it's been 2 hours now. At first, the JVM gave me garbage collection exception, so I've set the heap to 2gb. 'top' tells me that the application is still running: PID USER PR NI VIRT RES SHR S %CPU %MEMTIME+ COMMAND 9994 tim17 0 2620m 2.3g 238m S 99.5 39.1 115:48.84 java but checking the filesystem by running 'ls -l' a few times doesn't indicate that files are being updated. Is this normal? Is there a way to improve performance? No, it sounds quite weird. Any chance to have a look at your code? I'm loading all my data in one go to ease creating the db - it's simpler to create it from scratch each time instead of updating an existing database - so ideally I don't want to break this job down into multiple smaller jobs (actually, this would be OK if performance was good, but I ran into problems inserting data and retrieving existing nodes). What kind of problems? could you supply code and description of your problems? Thanks, Tim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] how to self define node?
IDs (type long) are generated internally and cannot be supplied from outside. However, you can use another arbitrary property key to represent such a user-defined ID... For easy lookup, also index that property for each created node and query the index to get the node for a specific ID. 2010/7/22 Hunt conan...@gmail.com How can I define a node myself, e.g. define a new node with my own ID? Can the node's ID be a string? Has anyone successfully deployed neo4j with Thrift? -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
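As a rough illustration of this advice, the sketch below assumes a hypothetical in-memory `TinyStore` (not a Neo4j class, and the property name userId is likewise an assumption). Internal IDs are generated by the store, the user-defined ID is kept as an ordinary property, and a separate index maps that property value back to the node.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory model of "internal ID plus indexed user-defined ID".
final class TinyStore {
    private long nextInternalId = 0;
    private final Map<Long, Map<String, Object>> nodes = new HashMap<>();
    private final Map<String, Long> userIdIndex = new HashMap<>();

    /** Creates a node; the internal ID is generated, never supplied by the caller. */
    long createNode(String userId) {
        if (userIdIndex.containsKey(userId)) {
            throw new IllegalArgumentException("user ID already in use: " + userId);
        }
        long id = nextInternalId++;
        Map<String, Object> properties = new HashMap<>();
        properties.put("userId", userId); // the user-defined ID is just a property
        nodes.put(id, properties);
        userIdIndex.put(userId, id);      // index the property for lookup
        return id;
    }

    /** Returns the internal ID for a user-defined ID, or null if unknown. */
    Long findByUserId(String userId) {
        return userIdIndex.get(userId);
    }

    Object getProperty(long internalId, String key) {
        return nodes.get(internalId).get(key);
    }

    public static void main(String[] args) {
        TinyStore store = new TinyStore();
        long internal = store.createNode("customer-42");
        System.out.println(store.findByUserId("customer-42") == internal); // prints true
        System.out.println(store.getProperty(internal, "userId"));         // prints customer-42
    }
}
```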
Re: [Neo4j] TimelineIndex usage
2010/7/23 Tim Jones bogol...@ymail.com Hi, I need to be able to retrieve nodes whose timestamps are greater than a particular time. I've been trying to use a TimelineIndex, but didn't realise that it was a persistent structure - I've written code that will create a new Timeline each time my application is instantiated, and it's complaining that nodes already exist in the Timeline. So you don't want to use a persistent timeline? The Timeline class stores the timeline in the graph itself so if you'd like a non-persistent timeline it would have to be implemented and there isn't such an implementation a.t.m. The nodes that I'm relating to each other in the Timeline are not themselves directly related. Is this going to cause problems using the Timeline since it won't be able to traverse a subgraph? I'm not sure I understand what you mean here. It may be (I'm not really sure on this one) that the timeline structure just refers to your indexed nodes via id, not creating relationships to them. You're worried that nodes will become related when they are added to a timeline? Any way it wouldn't be a problem if you always specify which relationships to traverse in traversals. Thanks, Tim ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
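The reply notes that a non-persistent timeline would have to be implemented. For the stated need, retrieving nodes whose timestamps are greater than a given time, a minimal in-memory sketch could look like the following. `InMemoryTimeline` is a hypothetical class built on java.util.TreeMap; it is not part of Neo4j, it stores only node IDs, and it would have to be rebuilt on every application start.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical non-persistent timeline; NOT Neo4j's Timeline class.
final class InMemoryTimeline {
    // Sorted by timestamp; several nodes may share one timestamp.
    private final TreeMap<Long, List<Long>> byTimestamp = new TreeMap<>();

    void add(long nodeId, long timestamp) {
        byTimestamp.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(nodeId);
    }

    /** Node IDs with a timestamp strictly greater than the given time, in time order. */
    List<Long> after(long timestamp) {
        List<Long> result = new ArrayList<>();
        // tailMap(key, false) is a view of all entries with a strictly greater key.
        for (List<Long> ids : byTimestamp.tailMap(timestamp, false).values()) {
            result.addAll(ids);
        }
        return result;
    }

    public static void main(String[] args) {
        InMemoryTimeline timeline = new InMemoryTimeline();
        timeline.add(10L, 100L);
        timeline.add(11L, 200L);
        timeline.add(12L, 200L);
        timeline.add(13L, 300L);
        System.out.println(timeline.after(100L)); // prints [11, 12, 13]
    }
}
```

Because the structure lives outside the graph, it adds no relationships to the indexed nodes, which sidesteps the traversal concern raised in the question.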
Re: [Neo4j] Lucene and sorting results
Sorting by relevance is possible via http://components.neo4j.org/neo4j-index/apidocs/org/neo4j/index/lucene/LuceneIndexService.html#getNodes(java.lang.String,%20java.lang.Object,%20org.apache.lucene.search.Sort). Exposing this sorting thingie would require you to add that in the rest code as well (as you probably could guess). But the IndexService doesn't support querying for more than one property at a time. However, there's a new indexing framework in the making over at https://svn.neo4j.org/laboratory/components/lucene-index/ which allows you to do these types of queries. This new framework will probably make its way into trunk rather soon and eventually replace the indexing found in neo4j-index component today. So the answer is no if you use neo4j-index component (which REST does). But it's yes if REST were to use the new framework instead. I'll commit the additions regarding sorting and all that soon (I'm laborating with it a.t.m.). You could f.ex. ask a query like: for ( Node node : myTitleIndex.query( new QueryContext( +title:foo* description:bar ).sort( Sort.RELEVANCE ) ) {} 2010/7/16 Andrew Mutz andrew.m...@appfolio.com Hi all, I've been evaluating using Neo4J for a project at my company and have been consistently impressed with it's capabilities. There is one thing I need to do, however, that I'm not sure is possible. I'm using the Neo4J REST server. I've been using lucene full text indexing/searching on my node attributes with great success. What I want to be able to do is to adjust the relevancy of the results returned by lucene based on attributes *other* than the one I'm searching on. Example: Nodes have attributes title and description. I want to search for all nodes, say, whose title matches foo*, but have whether or not description matches bar* affect the order of the search results. Is this possible? I'm very comfortable getting my hands dirty in the source, so if this is going to require some hacking, just point me in the right direction. 
I've been extensively modifying the REST server to fit my needs, so ideally my changes would be in that part of the code base. But I'm willing to dig deeper if necessary. Thanks much, Andrew. -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy. ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user -- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com ___ Neo4j mailing list User@lists.neo4j.org https://lists.neo4j.org/mailman/listinfo/user
Re: [Neo4j] Lucene and sorting results
I copied that org.apache.lucene.Hits class into the lucene-index component, so it exists there in that package (and has existed there since the birth of this component). That's the class that LuceneIndex.search uses, not the one from lucene-core-3 (since it has been removed). 2010/7/20 Andrew Mutz andrew.m...@appfolio.com I was changing the neo4j-rest server to use the new lucene-index framework myself, and have been very frustrated with this problem. There seems to be a lucene version conflict: - org.neo4j.index.impl.lucene.LuceneIndex.search() uses org.apache.lucene.Hits, which was removed in lucene 3.0 - org.neo4j.index.impl.lucene.IndexType.query() seems to assume lucene version 3.0 (Version.LUCENE_30) So I can't use lucene 3.0 or above for the first reason, and I need to use 3.0 for the second reason. How are others able to use this? Am I doing something wrong? Maybe I should just wait until your changes go in? Thanks, Andrew. On Tue, Jul 20, 2010 at 12:38 PM, Mattias Persson matt...@neotechnology.com wrote: Sorting by relevance is possible via http://components.neo4j.org/neo4j-index/apidocs/org/neo4j/index/lucene/LuceneIndexService.html#getNodes(java.lang.String,%20java.lang.Object,%20org.apache.lucene.search.Sort)http://components.neo4j.org/neo4j-index/apidocs/org/neo4j/index/lucene/LuceneIndexService.html#getNodes%28java.lang.String,%20java.lang.Object,%20org.apache.lucene.search.Sort%29 . Exposing this sorting thingie would require you to add that in the rest code as well (as you probably could guess). But the IndexService doesn't support querying for more than one property at a time. However, there's a new indexing framework in the making over at https://svn.neo4j.org/laboratory/components/lucene-index/ which allows you to do these types of queries. This new framework will probably make its way into trunk rather soon and eventually replace the indexing found in neo4j-index component today. 
So the answer is no if you use neo4j-index component (which REST does). But it's yes if REST were to use the new framework instead. I'll commit the additions regarding sorting and all that soon (I'm laborating with it a.t.m.). You could f.ex. ask a query like: for ( Node node : myTitleIndex.query( new QueryContext( +title:foo* description:bar ).sort( Sort.RELEVANCE ) ) {} 2010/7/16 Andrew Mutz andrew.m...@appfolio.com Hi all, I've been evaluating using Neo4J for a project at my company and have been consistently impressed with it's capabilities. There is one thing I need to do, however, that I'm not sure is possible. I'm using the Neo4J REST server. I've been using lucene full text indexing/searching on my node attributes with great success. What I want to be able to do is to adjust the relevancy of the results returned by lucene based on attributes *other* than the one I'm searching on. Example: Nodes have attributes title and description. I want to search for all nodes, say, whose title matches foo*, but have whether or not description matches bar* affect the order of the search results. Is this possible? I'm very comfortable getting my hands dirty in the source, so if this is going to require some hacking, just point me in the right direction. I've been extensively modifying the REST server to fit my needs, so ideally my changes would be in that part of the code base. But I'm willing to dig deeper if necessary. Thanks much, Andrew. -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy. 
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com -- Andrew Mutz Senior Software Engineer AppFolio, Inc. 55 Castilian Dr. | Goleta, CA | 93117 Phone: 805.617.2167 | Fax: 805.968.0646 andrew.m...@appfolio.com www.appfolio.com - Web-Based Property Management Software Made Easy.
Re: [Neo4j] Query for combination of properties
I just spent a little time extending the https://svn.neo4j.org/laboratory/components/lucene-index/ component so that IndexProvider#relationshipIndex(...) returns a RelationshipIndex (which extends IndexRelationship) and adds methods so that you can do (not committed yet): index.query( name, Some name, startNode, null ); The start node id and end node id of relationships are indexed in each lucene document for relationships so that you can supply start/end node in queries to narrow the result even more. This makes it feel like each node can have its own index of its relationships (at least those added to the index). And I think it's quite useful. You can, of course, also use this to have the index answer the question: is there a relationship R between node A and node B, additionally with properties X and Y in an environment where looping through all those relationships wouldn't be efficient. 2010/7/8 Balazs E. Pataki pat...@dsd.sztaki.hu A native solution would be also fine. This would practically allow, what is not really possible with the current relationship lookup implementation: to have really hundred thousands or millions of relationships to a Node and still be able to select relationships in a random access manner by some parameters (eg. relationship type, but maybe other properties as well). Would such native indexing require modifications to the current database file format, or it could be implemented as an additional service? --- balazs On 7/8/10 4:11 PM, Mattias Persson wrote: No, (lucene) indexing won't be implemented into getRelationships (it would totally break performance). However there are possibilities to create some other type of indexing (on relationship type for example/direction) natively. 2010/7/8 Balazs E. Patakipat...@dsd.sztaki.hu Great, thanks! Do you have any info on when 1.1 is expected? In the meantime we will use this laboratory version of the LuceneIndexProvider, because the multi-field search is essential in our case. 
By the way: I see that now one can also index relationships with the new API. Do you also plan to use these relationship indexes to make Node#getRelationships() and similar functions faster? So far it seems they look up relationships sequentially, which is pretty bad when you want too look for a specific type of relationships among 10.000 others. (OK, it is more of a problem with 1 million relationships, but anyway, I'm just curious ;-) ) --- balazs On 7/8/10 3:21 PM, Mattias Persson wrote: Yeah, that API isn't stable yet, but I think that it will end up similar to that... and hopefully merged into kernel trunk after 1.1 sometime. You can use it for fun, but you should expect changes in it. 2010/7/7 Peter Neubauerpeter.neuba...@neotechnology.com Balazs, Mattias is writing this component, not sure how stable it is right now, but as I perceived it the API is starting to settle ... Would be great to get some more indexes tried out, feel free to experiment with Sphinx, might be a good alternative to Lucene? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com - Scandinavia's coolest Bring-a-Thing party. On Wed, Jul 7, 2010 at 6:07 PM, Balazs E. Patakipat...@dsd.sztaki.hu wrote: That's great, works as expected. :-) Now, it seems you changed a lot of the indexing APIs. Should I use these new ones (and the neo4j sources from the SVN trunk), as these will be used in future versions, or these are still experimental? I ask this because in parallel we also investigate the possibility of integrating the shynx indexer (http://www.sphinxsearch.com/) to neo4j. If there's any experience or plans regarding sphynx, I would appreciate any info about it. 
Thanks again, --- balazs On 7/7/10 3:40 PM, Peter Neubauer wrote: Balazs, this is not explicitly possible today, but in the new Lucene-Index component in laboratory that will be integrated into trunk after Neo4j 1.1, see https://svn.neo4j.org/laboratory/components/lucene-index/src/test/java/org/neo4j/index/impl/lucene/TestLuceneIndex.java , method makeSureCompositeQueriesCanBeAsked . Sorry for the inconvenience! You could try out the component and let us know if that works for you? Cheers, /peter neubauer COO and Sales, Neo Technology GTalk: neubauer.peter Skype peter.neubauer Phone +46 704 106975 LinkedIn http://www.linkedin.com/in/neubauer Twitter http://twitter.com/peterneubauer http://www.neo4j.org - Your high performance graph database. http://www.thoughtmade.com
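The idea from the top of this thread, indexing start node, end node and type so that "is there a relationship R between node A and node B" becomes a lookup rather than a loop, can be sketched in plain Java. `RelationshipLookup` below is a hypothetical in-memory stand-in, not the lucene-index component; it only illustrates why a keyed lookup avoids iterating over all of a node's relationships.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical in-memory stand-in; NOT the lucene-index RelationshipIndex.
final class RelationshipLookup {

    /** Composite key: one entry per (start node, end node, type). */
    private static final class Key {
        final long start;
        final long end;
        final String type;

        Key(long start, long end, String type) {
            this.start = start;
            this.end = end;
            this.type = Objects.requireNonNull(type);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Key)) {
                return false;
            }
            Key other = (Key) o;
            return start == other.start && end == other.end && type.equals(other.type);
        }

        @Override
        public int hashCode() {
            return Objects.hash(start, end, type);
        }
    }

    private final Set<Key> keys = new HashSet<>();

    /** Registers a directed relationship from start to end. */
    void add(long start, long end, String type) {
        keys.add(new Key(start, end, type));
    }

    /** Hash lookup; cost does not depend on how many relationships the node has. */
    boolean exists(long start, long end, String type) {
        return keys.contains(new Key(start, end, type));
    }

    public static void main(String[] args) {
        RelationshipLookup lookup = new RelationshipLookup();
        lookup.add(1L, 2L, "KNOWS");
        System.out.println(lookup.exists(1L, 2L, "KNOWS")); // prints true
        System.out.println(lookup.exists(2L, 1L, "KNOWS")); // prints false (directed)
    }
}
```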
Re: [Neo4j] Neo4j Tuning for specific application
2010/7/16, Amir Hossein Jadidinejad amir.jad...@yahoo.com:

OK. I think it would be better if we had an InMemoryEmbeddedGraphDatabase, derived from EmbeddedGraphDatabase, that loads the whole graph into memory. It seems that the current interface is not appropriate for all applications.

Yep, an in-memory graph db would be handy in some cases (testing and such). How exactly is the interface not appropriate for all applications? It'd be great to hear more specific details about your view on that. And also, are you thinking of the GraphDatabaseService interface or the EmbeddedGraphDatabase implementation in particular?

From: Mattias Persson matt...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Fri, July 16, 2010 5:53:08 PM Subject: Re: [Neo4j] Neo4j Tuning for specific application

2010/7/15, Amir Hossein Jadidinejad amir.jad...@yahoo.com:

Hi, I have checked all the mentioned issues, but currently it's too slow! It takes 1 sec for each node to get a list of its neighbors. The disk is overloaded while the memory is free! The following is my running command:

    java -d64 -server -XX:+UseNUMA -XX:+UseConcMarkSweepGC -Xmx4096m -classpath $CLASSPATH:../lib/geronimo-jta_1.1_spec-1.1.1.jar:../lib/jline-0.9.94.jar:../lib/lucene-core-2.9.2.jar:../lib/mysql-connector-java-5.1.7-bin.jar:../lib/neo4j-commons-1.0.jar:../lib/neo4j-index-1.1-20100714.135430-157.jar:../lib/neo4j-kernel-1.1-20100714.134745-137.jar:../lib/neo4j-remote-graphdb-0.7-20100714.140411-116.jar:../lib/neo4j-shell-1.1-20100714.140808-144.jar:../lib/neo4j-utils-1.0.jar:../lib/servlet-api.jar:../lib/trove.jar:../lib/weka.jar:. org.graph.InferenceEngine

Although this has nothing to do with performance, please remove neo4j-commons (a deprecated component) and use neo4j-utils-1.1-SNAPSHOT instead of version 1.0.

And the following are the configuration parameters:

    neostore.nodestore.db.mapped_memory=120M
    neostore.relationshipstore.db.mapped_memory=5G
    neostore.propertystore.db.mapped_memory=100M
    neostore.propertystore.db.strings.mapped_memory=200M
    neostore.propertystore.db.arrays.mapped_memory=0M

What's the problem?!

From: Mattias Persson matt...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Sat, July 10, 2010 11:46:47 PM Subject: Re: [Neo4j] Neo4j Tuning for specific application

Are you using kernel/index version 1.0? Regarding the index lookups (each lookup in its own separate transaction): I think there's a bug in neo4j-index 1.0 which causes such a transaction (one that contains a call to index.getNodes) to write to and flush the logical log, which of course is completely unnecessary. That may very well be the cause of the disk being so heavily used. What you could try is to update to the latest kernel/index version, 1.1-SNAPSHOT, where this problem has been fixed; in that version you also aren't forced to wrap reads in transactions. If you cannot update to the latest 1.1-SNAPSHOT, then try to do more cuis in each transaction.

2010/7/10 Arjen van der Meijden acmmail...@tweakers.net:

Hi Amir, I'm just starting with neo4j, but saw some issues with your code from a normal Java standpoint. Please note, some of them are just micro-optimizations that may not matter much. But a lot of them are in your critical path, so perhaps they're worth a look.

On 10-7-2010 17:59 Amir Hossein Jadidinejad wrote:

Hi, I have a GraphDB with the following attributes:

Number of nodes: 3.6M
Number of relation types: 2
Total size of DB: 9GB
    lucene : 160MB
    neostore.nodestore.db : 31MB
    neostore.propertystore.db : 2GB
    neostore.propertystore.db.strings : 4GB
    neostore.relationshipstore.db : 1.5GB

Machine characteristics:

    vm.dirty_background_ratio = 50
    vm.dirty_ratio = 80
    OS: Ubuntu x64
    CPU: Corei7
    MEM: 12GB

The following is our running scenario (the source code is attached):

1. Iterate over all nodes and extract a list of node IDs (fillNodes function).
2. For each node ID, initiate a worker thread that processes the following items (8 threads are executed in parallel using a pool - walk function):
   - extract relationships of this node.
   - perform a light processing.
   - update results (in a ConcurrentHashMap).

Note that:
- The above scenario is iterative. Roughly it runs 10 times.
- No update is applied to the DB during running (read only).

After running the application:
- Less than 4GB/12GB of memory is occupied. It seems that Neo4j is using only 2GB of memory.

What JVM flags did you specify? I take it you didn't forget to include a high -Xmx, to allow more memory, and perhaps the parallel 'old generation' garbage collector to allow more throughput. Otherwise, most 64-bit JVMs start with system-dependent maximums (afaik at most 2GB).

-The hard disk
Re: [Neo4j] neo4j REST server standalone
I downloaded the July 10th build of the REST standalone server and created a neo4j.properties file with enable_remote_shell set to true. However, when I try to connect to it, I still get the locked store message and it prompts me to open as read-only. Do I also need something in the wrapper.conf file? Also, I used JConsole to see the configuration attributes, which show up as unavailable.

I think the problem is that you supply a -path at the shell client command line, am I right? When you supply a -path option, the shell will try to open that graph database in the shell client JVM. If you skip the -path option, it will try to connect via RMI to a remotely enabled shell server instead and you'll get a remote client with read/write capabilities. See more information about this over at http://wiki.neo4j.org/content/Shell#Starting_the_shell

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Question about ID indices...
In the current IndexService API you can't really query for IDs. But in a new index framework that I've been working on you can... you can also do composite queries (allowing you to issue Lucene queries directly) and all such goodies. Lucene is just an implementation whereas the API is generic. Take a look at https://svn.neo4j.org/laboratory/components/lucene-index/ for this new (and great) API which hopefully will replace IndexService pretty soon. Anyways, you could potentially ask a query like:

    Index<Node> someIndex = myIndexProviderMaybeAGraphDbService.nodeIndex( "persons" );
    for ( Node node : someIndex.query( "name:Mar?o AND skill:*aphs AND _id_:*666*" ) )
    {
    }

2010/7/15 Marko Rodriguez okramma...@gmail.com:

Hello, Question: Is the node/relationship ID space indexed by Lucene --- and, if so, in a manner analogous to properties? Thank you, Marko. http://markorodriguez.com

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] getProperty returning wrong value
Ah, great you found the problem!

2010/7/13, Tim Jones bogol...@ymail.com:

- Original Message From: Mattias Persson matt...@neotechnology.com To: Neo4j user discussions user@lists.neo4j.org Sent: Tue, July 13, 2010 2:57:33 PM Subject: Re: [Neo4j] getProperty returning wrong value

2010/7/13, Tim Jones bogol...@ymail.com:

No, however you group your read operations it returns correct values, so it feels quite odd that wrong values are returned... I think you need to verify this a bit more. Maybe look up a node with count=2 in neoclipse and then see if that node gets returned by your query and really returns count=1.

Hi Mattias, I've done this several times, and I've just done it again. This time I finally spotted the bug. It's not a problem with neo4j after all. I was setting a default value of '1' for the count property in the constructor, so when I instantiated it with an existing node, it overwrote the old value. Thanks anyway, Tim

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
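For anyone who runs into the same thing, the bug Tim describes can be sketched roughly as follows. This is only an illustration under assumptions: FakeNode is a hypothetical in-memory stand-in for a Neo4j Node, and CountedItem is a made-up wrapper class, not Tim's actual code. The point is that the constructor applies the default only when the property is missing.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Neo4j Node: just a bag of properties.
class FakeNode {
    private final Map<String, Object> props = new HashMap<>();

    boolean hasProperty(String key) { return props.containsKey(key); }
    Object getProperty(String key) { return props.get(key); }
    void setProperty(String key, Object value) { props.put(key, value); }
}

// Hypothetical wrapper around a node, similar in spirit to the class described above.
class CountedItem {
    private static final String COUNT = "count";
    private final FakeNode node;

    CountedItem(FakeNode node) {
        this.node = node;
        // Apply the default only when the node has no stored value yet.
        // An unconditional setProperty(COUNT, 1) here would overwrite the
        // value of every existing node the wrapper is instantiated with.
        if (!node.hasProperty(COUNT)) {
            node.setProperty(COUNT, 1);
        }
    }

    int getCount() { return (Integer) node.getProperty(COUNT); }

    void increment() { node.setProperty(COUNT, getCount() + 1); }
}
```

Wrapping a node that already has count=2 then keeps the 2, while a fresh node starts at 1.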
Re: [Neo4j] OutOfMemory while populating large graph
Great, so maybe neo4j-index should be updated to depend on Lucene 2.9.3.

2010/7/9 Bill Janssen jans...@parc.com:

Note that a couple of memory issues are fixed in Lucene 2.9.3: leaking when indexing big docs, and indolent reclamation of space from the FieldCache. Bill

Arijit Mukherjee ariji...@gmail.com wrote:

I have a similar problem. Although I'm not running out of memory yet, I can see the heap constantly growing, and JProfiler says most of it is due to the Lucene indexing. And even if I commit after every X transactions, once the population is finished, the final commit is done, and the graph db is closed, the heap stays like that: almost full. An explicit gc will clean up some part, but not fully. Arijit

On 9 July 2010 17:00, Mattias Persson matt...@neotechnology.com wrote:

2010/7/9 Marko Rodriguez okramma...@gmail.com:

Hi,

Would it actually be worth something to be able to begin a transaction which auto-commits stuff every X write operations, like a batch inserter mode which can be used in a normal EmbeddedGraphDatabase? Kind of like: graphDb.beginTx( Mode.BATCH_INSERT ) ...so that you can start such a transaction and then just insert data without having to care about restarting it now and then?

That's cool! Does that already exist? In my code (like others on the list, it seems) I have a counter++ that every 20,000 inserts (some made-up number that is not going to throw an OutOfMemory) commits and then reopens a new transaction. Sorta sux.

No, it doesn't, I just wrote stuff which I thought someone could think of as useful. A cool thing with just telling it to do a batch insert mode transaction (not the actual commit interval) is that it could look at how much memory it had to play around with and commit whenever it would be the most efficient, even having the ability to change the limit on the fly if the memory suddenly ran out.

Thanks, Marko.
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
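The counter-based workaround Marko mentions (commit every N inserts, then open a new transaction) can be sketched roughly like this. It is a minimal sketch under assumptions: Tx, TxSource and PeriodicCommitter are hypothetical names, not part of Neo4j. With the embedded API of that era the two calls would presumably map onto graphDb.beginTx() and tx.success()/tx.finish(), and the batch size is something you would have to tune yourself.

```java
// Hypothetical minimal transaction abstraction (not a Neo4j type).
interface Tx {
    void commit();
}

// Hypothetical factory for transactions, e.g. a thin wrapper around a graph db.
interface TxSource {
    Tx begin();
}

// Commits the running transaction every batchSize operations and opens a new one.
class PeriodicCommitter {
    private final TxSource source;
    private final int batchSize;
    private Tx current;
    private int pending;

    PeriodicCommitter(TxSource source, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.source = source;
        this.batchSize = batchSize;
        this.current = source.begin();
    }

    // Call once after each write operation.
    void operationDone() {
        pending++;
        if (pending >= batchSize) {
            current.commit();
            current = source.begin();
            pending = 0;
        }
    }

    // Commits whatever is left in the last, possibly partial, batch.
    void finish() {
        current.commit();
        current = null;
    }
}
```

The insert loop then just calls operationDone() after every write and finish() at the end, so the amount of uncommitted state held in memory is bounded by the batch size.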
Re: [Neo4j] Neo4j Tuning for specific application
sublist's.

- You're retrieving the reference node for each iteration, rather than just once outside the loop in fillNodes.
- Why are you converting the weight-property to a string, to then convert it to a Double? If it's stored as a string, perhaps it'd be a good idea to change it to a Double?
- Perhaps the cui-value can also be stored in a more efficient storage format (long?), thus saving space and memory.
- Why are you filling v_star if you're not using the result?

Best regards and good luck, Arjen

PS, shouldn't a random walk do some random stuff?

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
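To illustrate the point about the weight property, here is a small sketch of the two access paths. It assumes a plain Map as a stand-in for the relationship's properties (WeightAccess and the "weight" key are illustrative names only, the real code in this thread is not shown), so treat it as a picture of the idea rather than a measurement.

```java
import java.util.Map;

class WeightAccess {
    // The round trip questioned above: the value is turned into a String
    // and parsed back into a double on every read.
    static double viaString(Map<String, Object> props) {
        return Double.parseDouble(String.valueOf(props.get("weight")));
    }

    // Direct read when the property is stored as a number in the first place:
    // no intermediate String and no parsing in the critical path.
    static double direct(Map<String, Object> props) {
        return ((Number) props.get("weight")).doubleValue();
    }
}
```

Storing the value as a Double once, at insert time, makes the direct variant possible for all later reads.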
Re: [Neo4j] OutOfMemory while populating large graph
Modifications in a transaction are kept in memory so that there's the ability to roll back the transaction completely if something goes wrong. There could of course be a solution (I'm just speculating here) where, if a tx gets big enough, such a transaction gets converted into its own graph database or some other on-disk data structure which would then be merged into the main database on commit.

Would it actually be worth something to be able to begin a transaction which auto-commits stuff every X write operations, like a batch inserter mode which can be used in a normal EmbeddedGraphDatabase? Kind of like: graphDb.beginTx( Mode.BATCH_INSERT ) ...so that you can start such a transaction and then just insert data without having to care about restarting it now and then?

Another view of this is that such big transactions (I'm assuming here) are only really used for a first-time insertion of a big data set, where the BatchInserter can be used and does exactly that... it flushes to disk whenever it feels like it and you can just go on feeding it more and more data.

2010/7/8 Rick Bullotta rick.bullo...@burningskysoftware.com:

Paul, I also would like to see automatic swapping/paging to disk as part of Neo4J, minimally when in bulk insert mode...and ideally in every usage scenario. I don't fully understand why the in-memory logs get so large and/or aren't backed by the on-disk log, or if they are, why they need to be kept in memory as well. Perhaps it isn't the transaction stuff that is taking up memory, but the graph itself? Can any of the Neo team help provide some insight? Thanks!

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Paul A. Jackson Sent: Thursday, July 08, 2010 1:35 PM To: (User@lists.neo4j.org) Subject: [Neo4j] OutOfMemory while populating large graph

I have seen people discuss committing transactions after some microbatch of a few hundred records, but I thought this was optional. I thought Neo4J would automatically write out to disk as memory became full. Well, I encountered an OOM and want to make sure that I understand the reason. Was my understanding incorrect, is there a parameter that I need to set to some limit, or is the problem that I am indexing as I go? The stack trace, FWIW, is:

    Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
        at java.util.HashMap.<init>(HashMap.java:209)
        at java.util.HashSet.<init>(HashSet.java:86)
        at org.neo4j.index.lucene.LuceneTransaction$TxCache.add(LuceneTransaction.java:334)
        at org.neo4j.index.lucene.LuceneTransaction.insert(LuceneTransaction.java:93)
        at org.neo4j.index.lucene.LuceneTransaction.index(LuceneTransaction.java:59)
        at org.neo4j.index.lucene.LuceneXaConnection.index(LuceneXaConnection.java:94)
        at org.neo4j.index.lucene.LuceneIndexService.indexThisTx(LuceneIndexService.java:220)
        at org.neo4j.index.impl.GenericIndexService.index(GenericIndexService.java:54)
        at org.neo4j.index.lucene.LuceneIndexService.index(LuceneIndexService.java:209)
        at JiraLoader$JiraExtractor$Item.setNodeProperty(JiraLoader.java:321)
        at JiraLoader$JiraExtractor$Item.updateGraph(JiraLoader.java:240)

Thanks, Paul Jackson

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Is it possible to count common nodes when traversing?
Just to notify you guys on this... as of now (r4717) the TraversalFactory class is named Traversal instead, so code would look like:

    for ( Node currentNode : TraversalFactory.description()
            .breadthFirst().uniqueness(Uniqueness.RELATIONSHIP_GLOBAL)
            .relationships(MyRelationships.SIMILAR)
            .relationships(MyRelationships.CATEGORY)
            .prune(TraversalFactory.pruneAfterDepth(2)).traverse(node) ) {

2010/7/8 Mattias Persson matt...@neotechnology.com:

Your problem is that a node can't be visited more than once in a traversal, right? Have you looked at the new traversal framework in 1.1-SNAPSHOT? It solves that problem in that you can specify uniqueness for the traverser... you can instead say that each Relationship can't be visited more than once, but Nodes can. Your example:

    Map<Node, Integer> result = new HashMap<Node, Integer>();
    for ( Node currentNode : TraversalFactory.createTraversalDescription()
            .breadthFirst().uniqueness(Uniqueness.RELATIONSHIP_GLOBAL)
            .relationships(MyRelationships.SIMILAR)
            .relationships(MyRelationships.CATEGORY)
            .prune(TraversalFactory.pruneAfterDepth(2)).traverse(node) ) {
        if(currentNode.hasProperty("category")) {
            if(result.get(currentNode) == null) {
                result.put(currentNode, 1);
            } else {
                result.put(currentNode, result.get(currentNode) + 1);
            }
        }
    }

2010/7/8 Rick Bullotta rick.bullo...@burningskysoftware.com:

A performance improvement might be achieved by minimizing object creation/hash inserts using a counter wrapper.

- Create a simple class Counter with a single public property count of type int (not Integer) with an initial value of 1
- Tweak your code to something like:

    public Map<String, Counter> findCategoriesForWord(String word) {
        final Node node = index.getSingleNode("word", word);
        final Map<String, Counter> result = new HashMap<String, Counter>();
        if(node != null) {
            Traverser traverserWords = node.traverse(Traverser.Order.BREADTH_FIRST,
                    StopEvaluator.DEPTH_ONE, new ReturnableEvaluator() {
                @Override
                public boolean isReturnableNode(TraversalPosition traversalPosition) {
                    final Node currentNode = traversalPosition.currentNode();
                    final Iterator<Relationship> relationshipIterator =
                            currentNode.getRelationships(MyRelationships.CATEGORY).iterator();
                    while(relationshipIterator.hasNext()) {
                        final Relationship relationship = relationshipIterator.next();
                        final String categoryName = (String) relationship.getProperty("catId");
                        Counter counter = result.get(categoryName);
                        if(counter == null) {
                            result.put(categoryName, new Counter());
                        } else {
                            ++counter.count;
                        }
                    }
                    return true;
                }
            }, MyRelationships.SIMILAR, Direction.BOTH);
            traverserWords.getAllNodes();
        }
        return result;
    }

-Original Message- From: user-boun...@lists.neo4j.org [mailto:user-boun...@lists.neo4j.org] On Behalf Of Java Programmer Sent: Thursday, July 08, 2010 8:12 AM To: Neo4j user discussions Subject: Re: [Neo4j] Is it possible to count common nodes when traversing?

Hi, Thanks for your answer, but it's not exactly what I had in mind. A word can belong to several categories, and different words can share the same category, e.g.:

    word 1 : category 1, category 2, category 3
    word 2 : category 2, category 3
    word 3 : category 3

There is a relation (SIMILAR) between word 1 and word 2 and between word 2 and word 3.

As a result, when querying for word 1 with depth 1, I would like to get: category 1 - 1 (result), category 2 - 2, category 3 - 2 (not 3, because it's out of depth).

So far I have changed the previous method to use the relationship with a categoryId property, but I don't know whether there will be performance issues (I iterate over all relationships of the found node (every similar one), and store the categories in a Map). If you could look at it and tell me whether this way of thinking is good, I would very much appreciate it:

    public Map<String, Integer> findCategoriesForWord(String word) {
        final Node node = index.getSingleNode("word", word);
        final Map<String, Integer> result = new HashMap<String, Integer>();
        if(node != null) {
            Traverser traverserWords = node.traverse(Traverser.Order.BREADTH_FIRST,
                    StopEvaluator.DEPTH_ONE, new ReturnableEvaluator() {
                @Override
                public boolean isReturnableNode(TraversalPosition traversalPosition
Re: [Neo4j] Is it possible to count common nodes when traversing?
Sorry, it should be:

    for ( Node currentNode : Traversal.description()
            .breadthFirst().uniqueness( Uniqueness.RELATIONSHIP_GLOBAL )
            .relationships(MyRelationships.SIMILAR)
            .relationships(MyRelationships.CATEGORY)
            .prune(Traversal.pruneAfterDepth(2)).traverse(node) ) {

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
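The expected result from the word/category example in this thread can be checked with a small self-contained sketch. Note the assumptions: this does not use the Neo4j API at all, plain maps stand in for the SIMILAR relationships and for the category assignments, and CategoryCount is a made-up name. Only the Counter wrapper follows Rick's suggestion (mutable int starting at 1).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Mutable counter as suggested in the thread: starts at 1, so the first
// sighting of a category needs one map insert and no boxed Integer updates.
class Counter {
    int count = 1;
}

class CategoryCount {
    // Counts the categories of the start word and of its direct SIMILAR
    // neighbours (depth 1). Both maps are plain stand-ins for the graph.
    static Map<String, Counter> count(String start,
                                      Map<String, List<String>> similar,
                                      Map<String, List<String>> categories) {
        Map<String, Counter> result = new HashMap<>();
        List<String> words = new ArrayList<>();
        words.add(start);
        words.addAll(similar.getOrDefault(start, Collections.<String>emptyList()));
        for (String word : words) {
            for (String category : categories.getOrDefault(word, Collections.<String>emptyList())) {
                Counter counter = result.get(category);
                if (counter == null) {
                    result.put(category, new Counter());
                } else {
                    ++counter.count;
                }
            }
        }
        return result;
    }
}
```

With the three words from the example, starting at word 1 visits word 1 and word 2 only, which gives category 1 a count of 1 and categories 2 and 3 a count of 2 each; word 3 is out of depth and does not contribute.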
Re: [Neo4j] API request
Hi Logo, Is this to avoid having to pass both the BatchInserter and the batch inserter index around? Or is it because of some other issue?

2010/7/6 Logo Bogo bogol...@ymail.com:

Hi, I'm working with a LuceneIndexBatchInserterImpl. Would it be possible to add an accessor method to retrieve the underlying inserter associated with the LuceneIndexBatchInserterImpl, please? Thanks, Tim

-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com
Re: [Neo4j] Query for combination of properties
-- Mattias Persson, [matt...@neotechnology.com] Hacker, Neo Technology www.neotechnology.com