It seems to me that the local instance of WikiPage should indeed have
a value of "blob". But as far as the JCR is concerned, it's still
"bar", because save() hasn't been called, right?
Yes, any other Session accessing it would return "bar".
I do not understand what you mean. A call to getAttribute() would
return values from the WikiPage's field variables, no? Why would it
need to consult a cache?
The code would look like this:
Object getAttribute( String param )
{
    // Check the locally changed fields first
    Object o = m_localFields.getAttribute( param );

    // Fall back to the underlying JCR node if nothing was set locally
    if( o == null )
    {
        o = m_node.getProperty( param ).getString();
    }
    return o;
}
In this case m_localFields is a local cache. This is extraneous work
every single time we read a page. In addition, since we can store
generic metadata, we do not know in advance which properties will be
fetched.
In addition, we would need to build a disk cache, managed by WikiPage,
for Really Large Binaries, including deletion when the WikiPage is
garbage collected, etc. (unless we want to cache them in memory, which
is something we probably don't want to do). Of course a ready-made
cache library could be used to do that, but again, it feels a bit
pointless when JCR does it for us.
You've really lost me here...
Attachments are WikiPage objects with wiki:content property set to the
binary content, and wiki:contentType set to the MIME type of the
content.
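For illustration, a minimal sketch of how such an attachment node could
be written with the plain JCR API; the page path, the session handling
and the helper name are assumptions, only the two property names come
from the description above:

import java.io.InputStream;
import javax.jcr.Node;
import javax.jcr.Session;

public class AttachmentSketch
{
    // Stores binary content and its MIME type on an existing page node.
    // The property names follow the convention described above;
    // everything else here is hypothetical.
    public static void storeAttachment( Session session, String pagePath,
                                        InputStream data, String mimeType )
        throws Exception
    {
        Node page = session.getRootNode().getNode( pagePath );
        page.setProperty( "wiki:content", data );
        page.setProperty( "wiki:contentType", mimeType );
        session.save();
    }
}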
Basically, the idea is to make read-only access very cheap, but writes
more expensive... and to hide all of the state management details from
callers.
I don't think we can; imagine a situation where we have something
like this:
WikiPage page = getPage(...);
page.setAttribute( "foo", "bar" );
doThingy( page );
...
public void doThingy( WikiPage page ) throws Exception
{
    page.setAttribute( "bar", "baz" );
}
Now, if doThingy() throws an exception, it needs to do its own state
management to clean things up before it passes the exception up to the
caller, or else the state of the page is undefined when it goes up.
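Concretely, the kind of cleanup doThingy() would be forced into might
look like this; removeAttribute() is assumed here purely for
illustration:

public void doThingy( WikiPage page ) throws Exception
{
    try
    {
        page.setAttribute( "bar", "baz" );
        // ... other work that may throw ...
    }
    catch( Exception e )
    {
        // Undo the local change before rethrowing, so the caller does
        // not see a half-modified page.  Restoring a previous value
        // would be even more bookkeeping.
        page.removeAttribute( "bar" );
        throw e;
    }
}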
It seems to me that the state of the page is perfectly clear... the
save() method hasn't been called, and therefore nothing has been
committed.
Well, the methods manipulating the page are not certain what the state
of the page is. They need to manage the exception somehow...
Yeah, but you've already made that assumption anyway. It seems to me
that the save() method is already a state-management method. Its
purpose is to change the state from "unsaved" to "saved," right?
Correct.
In addition, having a proper lifecycle means that the JCR Session can
be released by the JCR engine when it is not used. If the
ContentManager just holds on to the Session, it'll just keep consuming
resources. Probably not a lot, but it's still something.
Not sure this would be a big deal if we kept references inside
ContentManager, essentially ThreadLocal copies of Sessions that were
only used for retrieving Nodes.
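A rough sketch of that idea, assuming a Repository reference inside
ContentManager; the field and method names are invented for
illustration:

import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;

public class ContentManagerSketch
{
    private final Repository m_repository;  // hypothetical repository reference

    // One Session per thread, used only for retrieving Nodes.
    private final ThreadLocal<Session> m_session = new ThreadLocal<Session>();

    public ContentManagerSketch( Repository repository )
    {
        m_repository = repository;
    }

    private Session getJCRSession() throws Exception
    {
        Session s = m_session.get();
        if( s == null || !s.isLive() )
        {
            s = m_repository.login();
            m_session.set( s );
        }
        return s;
    }

    // Callers look up Nodes by path and never hold the Session itself.
    public Node getNode( String jcrPath ) throws Exception
    {
        return getJCRSession().getRootNode().getNode( jcrPath );
    }
}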
As I said, it's not a lot. It's a deal, but not a big one.
Here's a counter-proposal for JCRNode:
0) Assume that most WikiPages will be created for read-only retrieval
1) Remove the field variable that references Node. No Node reference
means no Session, and no Session means WikiPage becomes (more)
thread-safe.
2) Store attributes, contents, title, author, etc. as JCRWikiPage fields
instead of manipulating the Node
This does not work, since it means that we have to fetch all the
metadata for the page *every single time we access the page*. You
see, we don't know what metadata there is for the page.
In the end, we will be writing a copy of Session functionality on top
of WikiPage.
3) When save() is called, start a fresh Session, retrieve the old
Node, manipulate it; commit; release Session
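A minimal sketch of what that could look like; the field names, the Map
of changed fields and the way the Repository is obtained are all
assumptions:

import java.util.Map;
import javax.jcr.Node;
import javax.jcr.Repository;
import javax.jcr.Session;

public class JCRWikiPageSketch
{
    private final Repository m_repository;            // hypothetical
    private final String m_jcrPath;                    // path of the backing node
    private final Map<String, Object> m_localFields;   // locally changed fields

    public JCRWikiPageSketch( Repository repository, String jcrPath,
                              Map<String, Object> localFields )
    {
        m_repository = repository;
        m_jcrPath = jcrPath;
        m_localFields = localFields;
    }

    // Open a fresh Session, copy the changed fields onto the old Node,
    // commit, and release the Session again.
    public void save() throws Exception
    {
        Session session = m_repository.login();
        try
        {
            Node node = session.getRootNode().getNode( m_jcrPath );
            for( Map.Entry<String, Object> e : m_localFields.entrySet() )
            {
                node.setProperty( e.getKey(), e.getValue().toString() );
            }
            session.save();
        }
        finally
        {
            session.logout();
        }
    }
}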
Unfortunately, this again means that we need to keep all changed
WikiPage fields in memory, which means that for every read access, we
need to first check the internal fields and then go to the Session.
The problem is that for large objects, such as attachments, this is a
major overhead. So we would need to build some way to stream the
content into a disk cache, and then open a Session and copy the data
back-n-forth.
Which is *doable*, but it's essentially duplicating the work that JCR
already does for us invisibly.
4) Maybe store a String or other path-like node name (WikiName?) in
JCRWikiPage so that it's obvious where it came from in the repository.
Actually I think this happens already.
Yes. There is a 1:1 mapping between WikiNames and JCR Paths.
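Concretely, the mapping could be as simple as a pair of helpers like
these; the "/pages/" prefix and the ':' space separator are invented
for illustration, the actual layout is not specified here:

public final class WikiNamePathSketch
{
    // Maps a wiki name such as "Main:TestPage" to a repository path
    // and back.  The layout shown is hypothetical.
    public static String toJcrPath( String wikiName )
    {
        return "/pages/" + wikiName.replace( ':', '/' );
    }

    public static String toWikiName( String jcrPath )
    {
        return jcrPath.substring( "/pages/".length() ).replace( '/', ':' );
    }
}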
This is a bit different than how it works now. But if JCRWikiPage
severs the reference to Node, it also gets rid of the most obvious
thread-safety issue.
Well, not really. WikiPage isn't threadsafe now (no synchronization),
so nothing would really change in that regard. In addition, if
you cache a WikiPage at the moment, there is no guarantee that the
information it contains is correct, so you're not even supposed to
keep a copy of it. When the Node is tied to it, that means that
anytime you access any info on it, you will always get the latest info.
You see, if we store it inside a WikiPage and we have the following
pattern:
WikiPage a = getPage("TestPage");
a.setAttribute("foo","bar");
a.save();
a.setAttribute("foo","blob");
WikiPage b = getPage("TestPage");
String attr = b.getAttribute("foo");
Is the value of attr "bar" or "blob"? And what would the developer
expect? How about if these two calls are in different functions, and
WikiPage is not passed around? Note that in our current architecture,
the result depends on whether the WikiPage is read from the cache or
not... So at any rate, this will be an improvement.
My belief is that they should reflect the same state, since otherwise
you can't make changes in one place visible to any other place without
calling save(). And save() does create a new version... With
WikiPage carrying the state, it would be different for each.
One option would be the following pattern:
ContentManager mgr = m_engine.getContentManager();
try
{
    WikiPage p = mgr.getPage();
    // manipulate
    p.save();
}
finally
{
    mgr.release();
}
So instead of using WikiEngine, we could use ContentManager. This
means obviously that all WikiEngine methods relating to content would
disappear, but on the other hand, that would make WikiEngine very
lean. But this would be a fairly big change in all of the plugins,
obviously.
However, storing the Node in the WikiPage is obviously just a speed
optimization. These two statements are equivalent:
m_node.getProperty("foo").getString();
and
m_engine.getContentManager().getJCRSession().getRootNode()
    .getNode(m_jcrPath).getProperty("foo").getString();
or even
m_engine.getContentManager().getJCRSession()
    .getNodeByUUID(m_uuid).getProperty("foo").getString();
Obviously the latter does have some extra overhead. It can be somewhat
optimized, but still... This does achieve thread safety at the expense
of some complexity, while keeping the content in the Session rather
than storing it in a WikiPage. The problem of course is that this does
*not* solve the issue of lifecycles, since it's possible to leave stale
stuff in the WikiPage, which might get saved by accident later on,
unless the Session is cleaned somehow.
What exactly would the advantage be for making a WikiPage threadsafe,
since it isn't threadsafe now?
/Janne