Hi Michael,
Apologies for the very slow reply.
The use case for indexes on a relationship property would be finding
purchases in a given time period, say 'last 7 days',
given a graph such as:
(Customer)-[:Purchased {date:<datetime as milliseconds>}]->(Product)
I have however changed the structure to be like this:
(Customer)-[Order {date}]->(Product)
I think this is a nicer model, so not having indexes on a Relationship
seems to be fine. I've not come across another case where I would need it.
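
For what it's worth, here is a minimal sketch of the 'last 7 days' check done
in application code rather than through an index. The Purchase class and the
method names are made up for illustration and only stand in for the date
property on the relationship; nothing here is Neo4j API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

public class RecentPurchases {

    /** Hypothetical stand-in for one Purchased relationship and its date property. */
    static final class Purchase {
        final long customerId;
        final long productId;
        final long dateMillis; // purchase time as epoch milliseconds

        Purchase(long customerId, long productId, long dateMillis) {
            this.customerId = customerId;
            this.productId = productId;
            this.dateMillis = dateMillis;
        }
    }

    /** Earliest timestamp that still counts as "within the last N days". */
    static long cutoffMillis(long nowMillis, int days) {
        return nowMillis - TimeUnit.DAYS.toMillis(days);
    }

    /** Keeps only the purchases dated at or after the cutoff. */
    static List<Purchase> since(List<Purchase> purchases, long cutoffMillis) {
        List<Purchase> recent = new ArrayList<>();
        for (Purchase p : purchases) {
            if (p.dateMillis >= cutoffMillis) {
                recent.add(p);
            }
        }
        return recent;
    }

    public static void main(String[] args) {
        long now = System.currentTimeMillis();
        List<Purchase> purchases = Arrays.asList(
                new Purchase(1L, 10L, now - TimeUnit.DAYS.toMillis(1)),   // yesterday
                new Purchase(1L, 11L, now - TimeUnit.DAYS.toMillis(10))); // ten days ago
        List<Purchase> recent = since(purchases, cutoffMillis(now, 7));
        System.out.println("Purchases in the last 7 days: " + recent.size());
    }
}
```

This is a plain scan over the purchases already in hand, so it is only
reasonable when starting from one customer and walking that customer's
purchases. It is not a substitute for an index over all purchases.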
Thank you,
V
On Monday, February 10, 2014 4:35:52 PM UTC, Michael Hunger wrote:
>
>
> On 10.02.2014 at 16:46, V <[email protected]> wrote:
>
> Thank you for the response Michael, very helpful.
>
> I thought it was me doing something wrong :)
>
> I've tried Option 1, using the Deferred Constraint. Query times in Cypher
> and the Java API have improved greatly.
>
> *Results from Option 1:*
> Cypher:
> MATCH (c:Customer)
> WHERE c.customerId = 279781
> RETURN c;
> // Quite variable ~100ms-150ms
>
> MATCH (c:Customer)-[]->(p:Product)
> WHERE c.customerId = 7593729
> RETURN c,p;
> // Quite variable ~150ms-200ms
>
>
> Java API:
> // Get customer node
> Total time: 00h 00m 00s 118ms
> Total time: 00h 00m 00s 02ms
> Total time: 00h 00m 00s 01ms
> Total time: 00h 00m 00s 01ms
>
> // Get customer's purchased product nodes
> Total time: 00h 00m 00s 191ms
> Total time: 00h 00m 00s 09ms
> Total time: 00h 00m 00s 06ms
> Total time: 00h 00m 00s 08ms
>
> Just a couple of questions from this:
>
> 1) I can't seem to find anything in either the documentation or the Java
> API javadoc about adding indexes to the relationships in this way. All I've
> found are notes about making indexes on *Node Labels*. Is this something
> that isn't currently available? Or have I overlooked something again?
>
>
> Not available and not planned, what is your use-case for those?
>
>
> 2) I wanted to check that the indexes were indeed created. Since I couldn't
> find a Cypher query to list the indexes, I tried the neo4j-sh command index
> --indexes, but no node indexes were listed. Is this because the
> indexes were created differently and are managed differently now? If so, is
> the Java API currently the only way to check the indexes?
>
>
> "index --indexes" is for the legacy indexes.
> The command is "schema" in the shell, ":schema" in the browser and
> db.schema().... for embedded.
>
>
> Btw. the first query is slower as the data has to be loaded from disk
> first.
>
> Cheers
>
> Michael
>
>
>
> Many thanks,
> V
>
>
>
> On Monday, February 10, 2014 9:13:50 AM UTC, Michael Hunger wrote:
>>
>> I think you ran into some misunderstanding of Neo4j indexes. Sorry for
>> the confusion.
>>
>> What you created were effectively legacy indexes that were how things
>> were done in 1.9 and before.
>>
>> With Neo4j 2.0 we have label-based indexes that are used differently.
>>
>> So what you can do (using 2.0.1):
>>
>> #1 rebuild your db without the legacy indexing and instead create unique
>>
>> But use this instead:
>> batchInserter.createDeferredSchemaIndex(label).on(property).create();
>> or
>>
>> batchInserter.createDeferredConstraint(label).assertPropertyIsUnique(property).create();
>>
>>
>> #2 keep your db but delete everything under graph.db/index
>>
>> and either create just an index like this (adapt your label and
>> property-name) in cypher:
>>
>> create index on :Customer(id)
>>
>> or even a unique constraint (for unique identifiers)
>>
>> create constraint on (c:Customer) assert c.id is unique
>>
>> the transactional Java API is:
>>
>>
>> db.schema().indexFor(DynamicLabel.label(label)).on(property).create();
>> or
>>
>> db.schema().constraintFor(DynamicLabel.label(label)).assertPropertyIsUnique(property).create();
>>
>> On 09.02.2014 at 22:57, V <[email protected]> wrote:
>>
>> Hi,
>>
>> I've spent a few hours today looking at the Neo4J docs and playing
>> around. I started to do something serious for evaluation and I'm a bit
>> frustrated with myself.
>>
>> Using the BatchInserterIndex I have created a graph with:
>>
>> 1,097,874 nodes
>> 1,097,874 properties
>> 8,104,479 relationships
>>
>> The database size is 829 MB on disk.
>> The indexes directory size is 515 MB. (du -ch data/graph.db/index | grep
>> total)
>>
>> The graph has two node types Customers and Products, the only property on
>> these nodes is an ID used to identify the entity in another datastore, and
>> a single relationship type of Purchased.
>>
>> I have created indexes using the BatchInserterIndexProvider class. If
>> required I can post my full source code but essentially this is the
>> importer code:
>>
>> // Create the db and indexes
>> BatchInserter inserter = BatchInserters.inserter("target/graph.db");
>> BatchInserterIndexProvider indexProvider = new
>> LuceneBatchInserterIndexProvider(inserter);
>> BatchInserterIndex customersIndex =
>> indexProvider.nodeIndex("customersIdx", MapUtil.stringMap("type", "exact"));
>> customersIndex.setCacheCapacity("customerId", 100000);
>> // Indexes for Product nodes and Purchased Relationship created in the
>> same way
>>
>> // Create and add node to index
>> long cId = inserter.createNode(customerProperties, customerLabel);
>> customersIndex.add(cId, customerProperties);
>>
>> long pId = inserter.createNode(productProperties, productLabel);
>> productsIndex.add(pId, productProperties);
>>
>> long purchRelId = inserter.createRelationship(cId, pId, PURCHASED, null);
>> purchasesIndex.add(purchRelId, EMPTY_MAP);
>>
>> // Flush indexes and shutdown batch inserter
>> customersIndex.flush();
>> productsIndex.flush();
>> purchasesIndex.flush();
>> indexProvider.shutdown();
>> inserter.shutdown();
>>
>>
>> Once the batch indexer completes I copy the files to the real location of
>> the database and start the Neo4J server.
>>
>>
>> *Attempt 1 with Cypher*
>>
>> When I run a cypher query such as:
>>
>> MATCH (c:Customer)
>> WHERE c.customerId = 7593729
>> RETURN c;
>>
>>
>> The response returns in around 8 seconds the first time, and then around
>> 900 ms the following times.
>>
>> So I thought perhaps it was just Cypher. Since I had read that Cypher
>> queries could be slow, I tried the Java API.
>>
>>
>>
>>
>> *Attempt 2 with the Java API*
>>
>> This is how I did the same query via the Java API:
>>
>> DateTime startTime = new DateTime();
>>
>> Transaction tx = graphDb.beginTx();
>> ResourceIterator<Node> nodes =
>> graphDb.findNodesByLabelAndProperty(customerLabel, "customerId", 7593729
>> ).iterator();
>>
>> DateTime finishTime = new DateTime();
>>
>> while(nodes.hasNext()) {
>> Node node = nodes.next();
>> System.out.println(node.getProperty("customerId"));
>> }
>>
>> Period period = new Period(startTime, finishTime);
>> System.out.println("Total time: " + HHMMSSFormater.print(period));
>>
>>
>> The query was executed 4 times in a row and this is the result:
>>
>> Total time: 00h 00m 00s 355
>> Total time: 00h 00m 00s 55
>> Total time: 00h 00m 00s 04
>> Total time: 00h 00m 00s 04
>>
>> Awesome! BUT...
>>
>> If I change the code slightly, and put the finish time after the while
>> loop and run the same test the result is:
>>
>> Total time: 00h 00m 0*6s* 494
>> Total time: 00h 00m 00s 416
>> Total time: 00h 00m 00s 294
>> Total time: 00h 00m 00s 302
>>
>>
>> So it looks like iterating over the nodes took 6 seconds the first time.
>> This seems like a long time given that there's only a single Node in the
>> query result.
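
The gap between the two measurements is what you would see if the result is
produced lazily, so that the lookup only runs once the iterator is consumed.
That is an assumption about the cause, not something confirmed in this thread.
A minimal stand-alone sketch of the effect follows; LazyResult is a made-up
class that simulates the lookup with a sleep and does not use any Neo4j API.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;

public class LazyTimingDemo {

    /** Hypothetical lazy result: the simulated lookup runs on first hasNext()/next(). */
    static final class LazyResult implements Iterator<Long> {
        private final long workMillis;
        private boolean searched = false;
        private boolean consumed = false;

        LazyResult(long workMillis) {
            this.workMillis = workMillis;
        }

        private void searchIfNeeded() {
            if (!searched) {
                try {
                    Thread.sleep(workMillis); // stands in for the real lookup cost
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                searched = true;
            }
        }

        @Override
        public boolean hasNext() {
            searchIfNeeded();
            return !consumed;
        }

        @Override
        public Long next() {
            searchIfNeeded();
            if (consumed) {
                throw new NoSuchElementException();
            }
            consumed = true;
            return 7593729L; // the single matching id
        }
    }

    /** Returns {millis until the iterator exists, millis until it is fully consumed}. */
    static long[] measure(long workMillis) {
        long start = System.nanoTime();
        Iterator<Long> it = new LazyResult(workMillis);
        long afterCreate = System.nanoTime();
        while (it.hasNext()) {
            it.next();
        }
        long afterIterate = System.nanoTime();
        return new long[] {
            (afterCreate - start) / 1_000_000,
            (afterIterate - start) / 1_000_000
        };
    }

    public static void main(String[] args) {
        long[] t = measure(200);
        System.out.println("Before iterating: " + t[0] + " ms");
        System.out.println("After iterating:  " + t[1] + " ms");
    }
}
```

Taking the finish time before the loop measures only how long it takes to
obtain the iterator. Taking it after the loop includes the work itself.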
>>
>>
>> *Questions*
>>
>> 1. Why are my Cypher and Java queries slow?
>> 2. Have I messed up and not understood how indexing works or is this
>> normal and expected?
>> 3. How can I make the queries/result reading faster?
>>
>>
>> Many thanks for any replies.
>>
>>
>>
>>
>>
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Neo4j" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> For more options, visit https://groups.google.com/groups/opt_out.
>>
>>
>>