Re: [Neo4j] [Spring Data Graph] Some questions/suggestions about cross-store persistence

Michael Hunger Tue, 06 Sep 2011 14:57:10 -0700

Hi Michel,

Sorry, I just returned from vacation.


The problem with 1) remains: you have to start defining fetch groups for
subgraphs etc., and you also have to define cascading rules for updates of
relationships.

Right now I don't have the resources to add all these things (especially as the 
devil lies in the details).

As I already said, it is preferable anyway to use attached entities within
transactional contexts. The detached state was mainly added as a convenience
for web-ui handling.
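
To make the two states concrete, here is a toy model in plain Java. It is only a sketch of the behaviour described in this thread (write-through inside a transaction, buffered changes outside of one, read-through for unmodified fields). It is not how Spring Data Graph is implemented, and every name in it (DetachedStateDemo, Entity, node, inTransaction) is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

public class DetachedStateDemo {

    /** Hypothetical stand-in for the properties of one node in the graph. */
    static final Map<String, Object> node = new HashMap<>();

    /** Hypothetical flag standing in for "a transaction is currently running". */
    static boolean inTransaction = false;

    /** Toy entity: buffers changes made outside a transaction until persist(). */
    static final class Entity {
        private final Map<String, Object> dirty = new HashMap<>();

        void set(String key, Object value) {
            if (inTransaction) {
                dirty.remove(key);      // a direct write supersedes any buffered value
                node.put(key, value);   // attached: write through immediately
            } else {
                dirty.put(key, value);  // detached: keep the change in the entity
            }
        }

        Object get(String key) {
            // Modified fields come from the buffer; all others read through.
            return dirty.containsKey(key) ? dirty.get(key) : node.get(key);
        }

        boolean isDetached() {
            return !dirty.isEmpty();
        }

        void persist() {
            node.putAll(dirty);         // write buffered changes to the "graph"
            dirty.clear();              // the entity is attached again
        }
    }

    public static void main(String[] args) {
        node.put("title", "Leon");
        Entity movie = new Entity();

        movie.set("title", "Babel");                 // outside a tx: buffered
        System.out.println(node.get("title"));       // Leon
        System.out.println(movie.get("title"));      // Babel
        System.out.println(movie.isDetached());      // true

        movie.persist();
        System.out.println(node.get("title"));       // Babel
        System.out.println(movie.isDetached());      // false
    }
}
```

In this model the entity holds data only for fields changed outside a transaction; everything else is always read from the store, which is why a "detached" entity can still show other users' updates.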

I see the one issue that you pointed out regarding updates to "detached" 
entities by other users of the graphdb which then show up due to read-through.
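
For illustration, the read-through effect and the "copy the values you need" workaround mentioned further down in this thread can be sketched in plain Java. This is a simplified model under stated assumptions, not Spring Data Graph code: the static map stands in for the graph database, and ReadThroughMovie and MovieSnapshot are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

public class ReadThroughDemo {

    /** Hypothetical stand-in for the graph database: node id -> title. */
    static final Map<Long, String> store = new HashMap<>();

    /** Models a read-through entity: it holds only the node id, never the data. */
    static final class ReadThroughMovie {
        private final long nodeId;

        ReadThroughMovie(long nodeId) {
            this.nodeId = nodeId;
        }

        String getTitle() {
            return store.get(nodeId);   // every getter call reads the current value
        }
    }

    /** Immutable copy of the values needed for output, taken at one point in time. */
    static final class MovieSnapshot {
        private final String title;

        MovieSnapshot(ReadThroughMovie movie) {
            this.title = movie.getTitle();
        }

        String getTitle() {
            return title;
        }
    }

    public static void main(String[] args) {
        store.put(1L, "Babel");
        ReadThroughMovie movie = new ReadThroughMovie(1L);
        MovieSnapshot snapshot = new MovieSnapshot(movie);

        store.put(1L, "New title");     // another user updates the node

        System.out.println(movie.getTitle());     // New title
        System.out.println(snapshot.getTitle());  // Babel
    }
}
```

The snapshot avoids fetch groups entirely because the caller decides which values to copy, at the cost of writing that copying code by hand in each application.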

Regarding your notion of domain objects - that always depends on your 
architecture.

I'll incorporate your suggestions in our planning but can't promise anything 
about schedules.

2) why is persist confusing?
3) You mean that when you persist() an entity to the JPA store, the graph store
should be persisted as well? If the entity was retrieved in a tx, all its
changes should be written to the graph store anyway without an explicit
persist.
4) We'll put some effort into the cross-store part of Spring Data Graph, but that
will probably be toward the end of the year.

Cheers

Michael


On 06.09.2011 at 23:11, Michel Domenjoud wrote:

> Hello!
> Is there anybody who can answer my previous email?
> 
> Thanks!
> Michel
> 
> 2011/9/2 Michel Domenjoud <[email protected]>
> 
>> Hi,
>> I'm currently testing Spring Data Graph, with a focus on polyglot
>> persistence use cases, in order to give a short presentation at the Spring
>> User Group in Paris in September.
>> This email follows my previous discussion with Michael Hunger (pasted
>> below), and I have some questions/suggestions:
>> 
>> 1- Add a real detached state for entities:
>> In my previous discussion, I was a bit worried about the behaviour of node
>> entities, whose getter calls all read through to the graph database, even
>> when we are not in a transaction.
>> If I understood correctly, there is indeed no real detached state for
>> node entities.
>> I think this is really an issue because it doesn't match the domain-centric
>> purpose of Spring Data. IMHO, this is a semantic problem: if my
>> NodeEntities are domain objects, I expect a getter to return a stable
>> value and not to be a read-from-database operation (at least once I'm
>> outside of a transaction).
>> 
>> => Imagine I have a big process, for example a computation engine using
>> node entities retrieved from the graph, with a long computation and output
>> to a file or another storage engine:
>> - With the current behaviour, the only way to be sure that all properties of
>> a node entity stay unchanged during processing is either to keep a
>> transaction open during the whole process, or to use clones of all
>> nodes.
>> - Keeping a very long transaction open, knowing I may use many nodes, is IMHO
>> definitely a bad idea.
>> - I can clone my entities, but I think this is not a good idea either, as I
>> would use exactly the same class without any backing Node.
>> => This can be even more confusing when using cross-store
>> persistence, as JPA entities have a real detached state.
>> To answer Michael: I don't think this always has to come with complicated
>> fetch strategies. You could implement lazy loading, which would only
>> retrieve node properties by default; the developer would then need to
>> retrieve relationships and related nodes using an explicit call.
>> 
>> 2. The persist() operation is a bit confusing and could lead to mistakes: I'd
>> suggest splitting it into two methods, save and merge.
>> 
>> 3. Cross-store persistence: allow an explicit JPA-side re-attach operation.
>> 
>> Currently, when retrieving a partial NodeEntity from the graph database, its
>> JPA part is automatically retrieved. On the other side, when retrieving an
>> entity from the relational database, I have to make an explicit call to
>> persist() to merge the graph side.
>> => I think this can lead to errors and performance problems. For example:
>> I use a traversal to retrieve some partial entities in order to update
>> them, but only for graph-side properties. This will work, but for each
>> retrieved entity an implicit JPA merge call will be made...
>> 
>> 4- Last question: what are the plans for the cross-store persistence API
>> in Spring Data Graph? Are you planning to make some enhancements to it, or
>> is it just some sugar over the Spring Data Graph API?
>> 
>> Thanks in advance for your answers!
>> Michel
>> 
>> Hi Michael,
>>> Ok, I get your point now. In fact, the thing I hadn't understood yet was
>>> that each get call on an entity is comparable to a SELECT on a relational
>>> db, even if no explicit call to the graph repository is made.
>>> 
>>> So, if I understand correctly, I'd improve the documentation by adding
>>> something like the following after this paragraph:
>>> Existing > All entities returned by library functions are initially in an
>>> attached state. Just as with any other entity, changing them outside of a
>>> transaction detaches them, and they must be reattached with persist() for
>>> the data to be saved.
>>> Add this after > However, all entities are still attached when reading
>>> fields, as all getters read through to the latest data in the graph. For
>>> people used to developing with relational databases, this means that
>>> each getter call can be treated as a SELECT operation.
>>> 
>>> Finally, I understand your point about read-through vs. fetch strategies,
>>> but this only means that developers will have to code this glue
>>> themselves in each application. I think that if SDG is intended to become a
>>> reference for using graph databases, this kind of API will have to come with
>>> it one day (but maybe I am mistaken because I'm too used to relational
>>> DBs).
>>> 
>>> Moreover, I see one point that could be really confusing with this
>>> approach in SDG: cross-store persistence. With this API, you provide the
>>> capability to manage JPA entities, which can use various fetch strategies,
>>> and graph entities, which use read-through, in the same entity class.
>>> This point is not mentioned in the documentation; I think you should add
>>> a big warning about it. Something like:
>>>> As mentioned in Chapter 18.8, Detached node entities, node entities
>>>> use read-through. On the other side, JPA entities can use various fetch
>>>> strategies. This point must be considered with caution when developing
>>>> applications.
>>> 
>>> HTH, and more questions will certainly come about cross-store persistence!
>>> :)
>>> 
>>> Michel
>>> 
>>> 2011/8/23 Michael Hunger <[email protected]>
>>> Hi Michel,
>>> 
>>> They are implicitly detached when modified outside of a transaction. But
>>> even in detached mode, the unmodified fields still read through!
>>> 
>>> 
>>> Could you point out how the docs could be improved? To make that easier to
>>> understand:
>>> 
>>> http://static.springsource.org/spring-data/data-graph/snapshot-site/reference/html/#reference:programming-model:lifecycle
>>> 
>>> They always read through, but the db uses a cache of course.
>>> 
>>> Regarding your example with different clients.
>>> 
>>> Assuming the operation persists before #4 the title will be the new one as
>>> this is the new state in the db.
>>> 
>>> It is the same as in a relational db, if you do two selects (which the
>>> read through is) then you get the value back that is current in the db.
>>> 
>>> I understand your issue though. Right now the only option would be to copy
>>> the values that are needed for the output to a separate data structure if
>>> you never want that to happen.
>>> 
>>> The problem with detaching and copying is that you get quickly into all
>>> the annoyances of fetch-depths, fetch-groups etc. again, that's a path I
>>> don't want to walk, it leads to hell :)
>>> 
>>> Michael
>>> 
>>> On 23.08.2011 at 12:55, Michel Domenjoud wrote:
>>> 
>>>> Michael,
>>>> Thanks for your quick answer.
>>>> 
>>>> This leads me to two new points:
>>>> 
>>>> - You said that an entity is attached when freshly loaded, but I found no
>>>> way to explicitly detach entities. Am I right?
>>>> If so, I think you should update the documentation, which is quite
>>>> confusing on this point, and explain clearly that detached entities should
>>>> be used in a "write-only" mode.
>>>> 
>>>> - Moreover, I think there could be some confusing side effects if entities
>>>> always use read-through:
>>>> Does this work with a cache or do the entities always read through to the
>>>> database?
>>>> 
>>>> How would this example behave with two different clients:
>>>> 
>>>> A client X does the following (let's say the title property is indexed):
>>>> 1. Movie retrievedMovie = movieRepository.findByPropertyValue("Babel");
>>>> 2. output(retrievedMovie.getTitle()) // prepare some output like a Web page
>>>> 3. ... do some other operations
>>>> 4. output(retrievedMovie.getTitle()) // for some reason, a second output is needed
>>>> 
>>>> At the same time, a client Y executes the following code:
>>>> 1. Movie retrievedMovie = movieRepository.findByPropertyValue("Babel");
>>>> 2. retrievedMovie.setTitle("New title");
>>>> 3. retrievedMovie.persist();
>>>> 4. Some other stuff we don't care about
>>>> 
>>>> What should be the value of the movie title for client X at step 4?
>>>> 
>>>> Thanks in advance for your answer.
>>>> Michel
>>>> 
>>>> 
>>>>> Date: Tue, 23 Aug 2011 11:42:13 +0200
>>>>> From: Michael Hunger <[email protected]>
>>>>> Subject: Re: [Neo4j] [Spring Data Graph] Precisions about Detached Entities and SDG under the hood
>>>>> To: Neo4j user discussions <[email protected]>
>>>>> Message-ID: <[email protected]>
>>>>> Content-Type: text/plain; charset=us-ascii
>>>>> 
>>>>> There are two states, attached and detached:
>>>>> 
>>>>> An entity is detached when it is created or when it is changed outside of
>>>>> a transaction.
>>>>> 
>>>>> Otherwise (when it is freshly loaded, or after persist) it is attached.
>>>>> 
>>>>> For detached entities: persist() writes the changed properties and
>>>>> relationships to the graph. If attached (and inside of a tx), all changes
>>>>> are written directly.
>>>>> 
>>>>> In your example you just overwrote the title with "Babel" and persisted
>>>>> that information to the graph, so the assert should say:
>>>>> 
>>>>>> assertEquals("Babel", retrievedMovie.getTitle());
>>>>> 
>>>>> The retrieved movie is attached; it is never detached, so it always refers
>>>>> to the node in the graph (read-through). The data is _not_ copied.
>>>>> 
>>>>> 
>>>>> Attached entities read their data directly from the underlying node.
>>>>> 
>>>>> HTH
>>>>> 
>>>>> Michael
>>>>> 
>>>>> The model is different from Hibernate, as Hibernate has no read-through.
>>>>> We would have loved not to support detached entities, but as they are so
>>>>> common in web frameworks we had to.
>>>>> 
>>>>> The best way of working with SDG is to use domain-level service methods
>>>>> which are transactional and do the interaction with the graph. Detached
>>>>> entities should only be used (if at all) to persist
>>>>> user input (form data) from the UI.
>>>>> 
>>>>> 
>>>>> 
>>>>> On 23.08.2011 at 10:56, Michel Domenjoud wrote:
>>>>> 
>>>>>> Hello,
>>>>>> I'm currently testing some of Spring Data Graph's features, and I have a
>>>>>> few questions about some usages.
>>>>>> 
>>>>>> Could someone explain to me how the following example works?
>>>>>> I run the following unit test:
>>>>>> 
>>>>>> @Test
>>>>>> public void testUpdatingEntitiesNotInTransaction() {
>>>>>>     Movie m = new Movie();
>>>>>>     m.setTitle("Leon");
>>>>>>     m.persist();
>>>>>>     Long id = m.getNodeId();
>>>>>>     Movie retrievedMovie = movieRepository.findOne(id);
>>>>>>     m.setTitle("Babel");
>>>>>>     m.persist();
>>>>>>     assertEquals("Leon", retrievedMovie.getTitle());
>>>>>> 
>>>>>> }
>>>>>> 
>>>>>> And the assertion at the end fails, as retrievedMovie.getTitle() equals
>>>>>> "Babel" and not "Leon".
>>>>>> This point is not really clear in the documentation:
>>>>>> Does this occur because of some cache? If so, is it the Neo4j cache? And
>>>>>> what exactly is its scope: thread, session, ...?
>>>>>> Or does any call to a getter trigger an access to the database because of
>>>>>> AspectJ?
>>>>>> 
>>>>>> Anyway, unless I misunderstood something, it's a bit confusing, especially
>>>>>> when one is used to APIs like Hibernate, which don't refresh retrieved
>>>>>> entities once we are outside of a transaction.
>>>>>> 
>>>>>> When I read this in the documentation, I don't expect any persist
>>>>>> operation to affect other retrieved entities:
>>>>>> Changing an attached entity inside a transaction will immediately write
>>>>>> through the changes to the datastore. Whenever an entity is changed
>>>>>> outside of a transaction it becomes detached. The changes are stored in
>>>>>> the entity itself until the next call to persist().
>>>>>> 
>>>>>> All entities returned by library functions are initially in an attached
>>>>>> state. Just as with any other entity, changing them outside of a
>>>>>> transaction detaches them, and they must be reattached with persist() for
>>>>>> the data to be saved.
>>>>>> Maybe I should clarify some points:
>>>>>> 
>>>>>> - I'm using an embedded database, cleaned before each test.
>>>>>> - I don't use any transaction in this test.
>>>>>> 
>>>>>> 
>>>>>> Thanks in advance for your help!
>>>>>> Michel
>>> 
>> 
>> 
>> 
>> 
> _______________________________________________
> Neo4j mailing list
> [email protected]
> https://lists.neo4j.org/mailman/listinfo/user
