Re: [Neo4j] [Spring Data Graph] Some questions/suggestions about cross-store persistence

Michel Domenjoud Tue, 06 Sep 2011 14:12:32 -0700

Hello!
Is there anybody who can answer my previous email?

Thanks!
Michel


2011/9/2 Michel Domenjoud <[email protected]>

> Hi,
> I'm currently testing Spring Data Graph, with a focus on polyglot
> persistence use cases, in order to give a short presentation at the Spring
> User Group in Paris in September.
> This email follows my previous discussion with Michael Hunger (pasted
> below), and I have some questions/suggestions:
>
> 1. Add a real detached state for entities:
> In my previous discussion, I was a bit worried about the behaviour of node
> entities, whose getter calls all read through to the graph database, even
> when we are not in a transaction.
> If I understood correctly, there is indeed no real detached state for
> node entities.
> I think this is really an issue because it doesn't match the domain-centric
> purpose of Spring Data. IMHO, this is a semantic problem: if my
> NodeEntities are domain objects, I expect a getter call to return a stable
> value, and not to be a read-from-database operation (at least once I'm
> outside a transaction).
>
> => Imagine I have a big process, for example a computation engine using
> node entities retrieved from the graph, with a long computation and output
> to a file or another storage engine:
> - With the current behaviour, the only way to be sure that all properties of
> a node entity stay unchanged during processing is either to keep a
> transaction open during the whole process, or to use clones of all
> nodes.
> - Keeping a very long transaction open, knowing I may use many nodes, is
> IMHO definitely a bad idea.
> - I can clone my entities, but I don't think this is a good idea either, as
> I would use exactly the same class without any backing node.
> => This can be even more confusing when using cross-store
> persistence, as JPA entities have a real detached state.
> To answer Michael: I don't think this must always come with complicated
> fetch strategies. You could implement lazy loading, which would only
> retrieve node properties by default; the developer would then need to
> retrieve relationships and related nodes using an explicit call.
>
> 2. The persist() operation is a bit confusing and could lead to mistakes:
> I'd suggest splitting it into two methods, save and merge.
>
> 3. Cross-store persistence: allow an explicit re-attach operation on the
> JPA side.
>
> Currently, when retrieving a partial NodeEntity from the graph database, its
> JPA part is automatically retrieved. On the other side, when retrieving an
> entity from the relational database, I have to make an explicit call to
> persist() to merge the graph side.
> => I think this can lead to errors and performance problems. For example:
> I use a traversal to retrieve some partial entities in order to update
> them, but only for graph-side properties. This will work, but for each
> retrieved entity an implicit JPA merge call will be made...
>
> 4. Last question: what are the plans for the cross-store persistence API
> in Spring Data Graph? Are you planning to make some enhancements to it, or
> is it just some sugar over the Spring Data Graph API?
>
> Thanks in advance for your answers!
> Michel
>
> Hi Michael,
>> OK, I get your point now. In fact, the thing I hadn't understood yet was
>> that each get call on an entity can be compared to a SELECT on a relational
>> db, even if no explicit call to the graph repository is made.
>>
>> So, if I understand correctly, I'd improve the documentation by adding
>> something like the following after the existing paragraph.
>> Existing > All entities returned by library functions are initially in an
>> attached state. Just as with any other entity, changing them outside of a
>> transaction detaches them, and they must be reattached with persist() for
>> the data to be saved.
>> Add this after > However, all entities are still attached when reading
>> fields, as all getters read through to the latest data in the graph. For
>> people used to developing with relational databases, this should be
>> understood as each getter call being comparable to a SELECT operation.
>>
>> Finally, I understand your point about read-through vs. fetch strategies
>> issues, but this only means that developers will have to code this glue
>> themselves in each application. I think that if SDG is intended to become a
>> reference for using graph databases, this kind of API will have to come with
>> it one day (but maybe I'm mistaken because I'm too used to relational
>> DBs).
>>
>> Moreover, I see one point that could be really confusing with this
>> approach in SDG: cross-store persistence. With this API, you provide the
>> capability to manage, in the same entity class, JPA entities, which can use
>> various fetch strategies, and graph entities, which use read-through.
>> This point is not mentioned in the documentation; I think you should add
>> a big warning about it. Something like:
>> > As mentioned in Chapter 18.8 Detached node entities, node entities
>> use read-through. On the other side, JPA entities can use various fetch
>> strategies. This point must be considered with caution when developing
>> applications.
>>
>> HTH, and more questions will certainly come about cross-store persistence!
>> :)
>>
>> Michel
>>
>> 2011/8/23 Michael Hunger <[email protected]>
>> Hi Michel,
>>
>> They are implicitly detached when modified outside of a transaction. But
>> even in detached mode, unmodified fields are still read through!
>>
>>
>> Could you point out how the docs could be improved, to make that easier to
>> understand?
>>
>> http://static.springsource.org/spring-data/data-graph/snapshot-site/reference/html/#reference:programming-model:lifecycle
>>
>> They always read through, but the db uses a cache of course.
>>
>> Regarding your example with different clients:
>>
>> Assuming the operation persists before #4, the title will be the new one,
>> as this is the new state in the db.
>>
>> It is the same as in a relational db, if you do two selects (which the
>> read through is) then you get the value back that is current in the db.
>>
>> I understand your issue though. Right now the only option would be to copy
>> the values that are needed for the output to a separate data structure if
>> you never want that to happen.
>>
>> The problem with detaching and copying is that you quickly get into all
>> the annoyances of fetch-depths, fetch-groups etc. again. That's a path I
>> don't want to walk, it leads to hell :)
>>
>> Michael
>>
>> On 23.08.2011 at 12:55, Michel Domenjoud wrote:
>>
>> > Michael,
>> > Thanks for your quick answer.
>> >
>> > This leads me to two new points:
>> >
>> > - You said that an entity is attached when freshly loaded, but I found no
>> > way to explicitly detach entities. Am I right?
>> > If so, I think you should update the documentation, which is quite
>> > confusing on this point, and explain clearly that detached entities
>> > should be used in a "write-only" mode.
>> >
>> > - Moreover, I think there could be some confusing side effects if
>> > entities always use read-through:
>> > Does this work with a cache, or do the entities always read through to
>> > the database?
>> >
>> > How would this example behave with two different clients?
>> >
>> > A client X does the following (let's say the title property is indexed):
>> > 1. Movie retrievedMovie = movieRepository.findByPropertyValue("Babel");
>> > 2. output(retrievedMovie.getTitle()) // prepare some output like a Web page
>> > 3. ... do some other operations
>> > 4. output(retrievedMovie.getTitle()) // for some reason, a second output is needed
>> >
>> > At the same time, a client Y executes the following code:
>> > 1. Movie retrievedMovie = movieRepository.findByPropertyValue("Babel");
>> > 2. retrievedMovie.setTitle("New title");
>> > 3. retrievedMovie.persist();
>> > 4. Some other stuff we don't care about
>> >
>> > What should the value of the movie title be for client X at step 4?
>> >
>> > Thanks in advance for your answer.
>> > Michel
>> >
>> >
>> >> Date: Tue, 23 Aug 2011 11:42:13 +0200
>> >> From: Michael Hunger <[email protected]>
>> >> Subject: Re: [Neo4j] [Spring Data Graph] Precisions about Detached
>> >>       Entities and SDG under the hood
>> >> To: Neo4j user discussions <[email protected]>
>> >> Message-ID: <[email protected]>
>> >> Content-Type: text/plain; charset=us-ascii
>> >>
>> >> There are two states, attached and detached:
>> >>
>> >> An entity is detached when it is created or when it is changed outside
>> >> of a transaction.
>> >>
>> >> Otherwise (when it is freshly loaded, or after persist) it is attached.
>> >>
>> >> For detached entities, persist() writes the changed properties and
>> >> relationships to the graph. If attached (and inside of a tx), all
>> >> changes are written directly.
>> >>
>> >> In your example you just overwrote the title with Babel and persisted
>> >> that information to the graph. The retrieved movie is attached, it is
>> >> never detached, so it always refers to the node in the graph
>> >> (read-through) (the data is _not_ copied). So the assert should say:
>> >>
>> >>> assertEquals("Babel", retrievedMovie.getTitle());
>> >>
>> >>
>> >> Attached entities read their data directly from the underlying node.
>> >>
>> >> HTH
>> >>
>> >> Michael
>> >>
>> >> The model is different from Hibernate, as Hibernate has no read-through.
>> >> We would have loved not to support detached entities, but as they are so
>> >> common in web frameworks we had to.
>> >>
>> >> The best way of working with SDG is to use domain-level service methods
>> >> which are transactional and do the interaction with the graph. Detached
>> >> entities should just be used (if at all) to persist
>> >> user input (form data) from the UI.
>> >>
>> >>
>> >>
>> >> On 23.08.2011 at 10:56, Michel Domenjoud wrote:
>> >>
>> >>> Hello,
>> >>> I'm currently testing some of the Spring Data Graph features, and I
>> >>> have a few questions about some usages.
>> >>>
>> >>> Could someone explain to me how the following example works?
>> >>> I run the following unit test:
>> >>>
>> >>> @Test
>> >>> public void testUpdatingEntitiesNotInTransaction() {
>> >>>      Movie m = new Movie();
>> >>>      m.setTitle("Leon");
>> >>>      m.persist();
>> >>>      Long id = m.getNodeId();
>> >>>      Movie retrievedMovie = movieRepository.findOne(id);
>> >>>      m.setTitle("Babel");
>> >>>      m.persist();
>> >>>      assertEquals("Leon", retrievedMovie.getTitle());
>> >>> }
>> >>>
>> >>> And the assertion at the end fails, as retrievedMovie.getTitle() equals
>> >>> "Babel" and not "Leon".
>> >>> This point is not really clear in the documentation:
>> >>> Does this occur because of some cache? If so, is it the Neo4j cache?
>> >>> And what exactly is its scope: thread, session, ...?
>> >>> Or does any call to a getter trigger an access to the database because
>> >>> of AspectJ?
>> >>>
>> >>> Anyway, unless I misunderstood something, it's a bit confusing,
>> >>> especially when you are used to APIs like Hibernate, which don't refresh
>> >>> retrieved entities once we are outside of a transaction.
>> >>>
>> >>> When I read this in the documentation, I don't expect any persist
>> >>> operation to affect other retrieved entities:
>> >>> Changing an attached entity inside a transaction will immediately write
>> >>> through the changes to the datastore. Whenever an entity is changed
>> >>> outside of a transaction it becomes detached. The changes are stored in
>> >>> the entity itself until the next call to persist().
>> >>>
>> >>> All entities returned by library functions are initially in an attached
>> >>> state. Just as with any other entity, changing them outside of a
>> >>> transaction detaches them, and they must be reattached with persist()
>> >>> for the data to be saved.
>> >>> Maybe I should clarify some points:
>> >>>
>> >>>  - I'm using an embedded database, with cleanup before each test.
>> >>>  - I don't use any transaction in this test.
>> >>>
>> >>>
>> >>> Thanks in advance for your help!
>> >>> Michel
>>
>
>
>
>
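
The read-through behaviour discussed in this thread, and the workaround
Michael suggests (copying the needed values into a separate data structure),
can be sketched in plain Java. This is only an illustration under stated
assumptions, not SDG code: FakeGraph, ReadThroughMovie and MovieSnapshot are
hypothetical names, and a HashMap stands in for the graph database. It replays
the client X / client Y scenario from the thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these classes belong to Spring Data Graph.
final class ReadThroughSketch {

    // Stands in for the graph database: node id -> "title" property.
    static final class FakeGraph {
        private final Map<Long, String> titles = new HashMap<>();

        void setTitle(long nodeId, String title) { titles.put(nodeId, title); }

        String getTitle(long nodeId) { return titles.get(nodeId); }
    }

    // Mimics an attached entity: every getter call reads the current stored value.
    static final class ReadThroughMovie {
        private final FakeGraph graph;
        private final long nodeId;

        ReadThroughMovie(FakeGraph graph, long nodeId) {
            this.graph = graph;
            this.nodeId = nodeId;
        }

        String getTitle() { return graph.getTitle(nodeId); }
    }

    // Plain value copy taken at construction time; later writes do not affect it.
    static final class MovieSnapshot {
        private final String title;

        MovieSnapshot(ReadThroughMovie movie) { this.title = movie.getTitle(); }

        String getTitle() { return title; }
    }

    public static void main(String[] args) {
        FakeGraph graph = new FakeGraph();
        graph.setTitle(1L, "Babel");

        // Client X loads the movie and also takes a copy for its output.
        ReadThroughMovie clientX = new ReadThroughMovie(graph, 1L);
        MovieSnapshot snapshot = new MovieSnapshot(clientX);

        // Client Y changes and persists the title in the meantime.
        graph.setTitle(1L, "New title");

        System.out.println(clientX.getTitle());  // prints "New title"
        System.out.println(snapshot.getTitle()); // prints "Babel"
    }
}
```

The snapshot is the "separate data structure" option: it costs an explicit
copy per entity, but its getters no longer depend on the state of the store.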
_______________________________________________
Neo4j mailing list
[email protected]
https://lists.neo4j.org/mailman/listinfo/user
