RE: Reference not resolving

Jerome Louvel Thu, 12 Apr 2007 02:28:02 -0700

Hi John,

Thanks again for the useful feedback. You made us rethink this issue more
deeply.


> > However, as said in chapter "5.1. Establishing a Base URI", of the Uri
> > reference (http://www.ietf.org/rfc/rfc3986.txt), a relative part cannot
> > be used in a context where the base reference is not defined. The fix
> > consists of throwing an IllegalArgumentException.
> 
> That doesn't seem consistent with the other behavior of e.g.,
> returning null (or even an empty string).

The Reference class already throws a couple of exceptions when an attempted
operation is not valid. We return null when an optional data element is not
present (no query part for example) and we return "" when the data element
is present but empty.
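
To make the null versus "" convention concrete, here is a minimal sketch for
the query part. The class QueryPartDemo and the helper queryOf() are
hypothetical names used only for illustration; this is not the actual code of
the Reference class.

```java
class QueryPartDemo {
    // Hypothetical helper, not Restlet code: returns null when the reference
    // has no query part at all, and "" when the '?' is present but nothing
    // follows it.
    static String queryOf(String uriRef) {
        // The fragment is cut off first so that a '?' inside it is ignored.
        int fragment = uriRef.indexOf('#');
        String withoutFragment =
                (fragment == -1) ? uriRef : uriRef.substring(0, fragment);
        int mark = withoutFragment.indexOf('?');
        return (mark == -1) ? null : withoutFragment.substring(mark + 1);
    }

    public static void main(String[] args) {
        System.out.println(queryOf("http://host.com/dir"));        // absent
        System.out.println(queryOf("http://host.com/dir?"));       // present but empty
        System.out.println(queryOf("http://host.com/dir?a=1#f"));  // present
    }
}
```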

Now, after a closer review, Thierry and I decided to change the
getRelativePart() method to act like this: 

"Returns the relative part of relative references, without the query and
fragment. If the reference is absolute, then null is returned."

We also improved the Javadocs of the Reference class to specify clearly when
exceptions are raised and to detail the behavior of getRelativePart().

Changes committed to SVN.
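
The new contract of getRelativePart() can be sketched as follows. This is only
an illustration of the rule quoted above, assuming that "absolute" means "starts
with a scheme"; the class RelativePartDemo and the helper relativePartOf() are
hypothetical and are not the committed implementation.

```java
import java.util.regex.Pattern;

class RelativePartDemo {
    // RFC 3986 scheme syntax: a letter followed by letters, digits,
    // '+', '-' or '.', terminated by ':'.
    private static final Pattern SCHEME =
            Pattern.compile("^[A-Za-z][A-Za-z0-9+.-]*:");

    // Hypothetical helper: null for absolute references, otherwise the
    // reference text without its query and fragment.
    static String relativePartOf(String uriRef) {
        if (SCHEME.matcher(uriRef).find()) {
            return null; // absolute reference: no relative part
        }
        int end = uriRef.length();
        int query = uriRef.indexOf('?');
        int fragment = uriRef.indexOf('#');
        if (fragment != -1) {
            end = fragment;
        }
        if (query != -1 && query < end) {
            end = query;
        }
        return uriRef.substring(0, end);
    }

    public static void main(String[] args) {
        System.out.println(relativePartOf("http://host.com"));  // null
        System.out.println(relativePartOf("/dir?x=1#top"));     // /dir
    }
}
```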

> I am very concerned that these "disturbing" behaviors don't seem to be
> consistent which means they will be brittle to use because different
> assumptions will be made depending on a slew of things that may or may
> not be apparent.

The design of Reference has been carefully thought out to fully support the
URI spec and provide a better alternative to the JDK's URI class. The internal
behavior is simpler than it looks; I will explain it below. I hope this
will ease your discomfort.

> Actually, no, that's not the behavior for web pages.  To get
> "http://host.com/sub" you would have to add "/sub" to e.g.,
> "http://host.com/dir".  Adding "sub" to e.g. "http://host.com/dir"
> should result in "http://host.com/dir/sub".

As there is no trailing slash after "dir", relative resolution replaces that
last path segment with "sub", so the current output is correct.
See the URI spec for more complex examples:
http://gbiv.com/protocols/uri/rfc/rfc3986.html#reference-examples
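
The trailing-slash rule can be checked independently of Restlet. The sketch
below uses the JDK's java.net.URI rather than our Reference class, simply
because it is available everywhere; ResolveDemo and resolve() are names chosen
for this example. The last case is taken from the spec examples (base
"http://a/b/c/d;p?q").

```java
import java.net.URI;

class ResolveDemo {
    // Resolves a relative reference against a base using the JDK's URI class.
    static String resolve(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        // No trailing slash: the last segment "dir" is replaced.
        System.out.println(resolve("http://host.com/dir", "sub"));
        // prints http://host.com/sub

        // Trailing slash: "sub" is appended below "dir/".
        System.out.println(resolve("http://host.com/dir/", "sub"));
        // prints http://host.com/dir/sub

        // Leading slash: the whole base path is replaced.
        System.out.println(resolve("http://host.com/dir", "/sub"));
        // prints http://host.com/sub

        // One of the spec examples.
        System.out.println(resolve("http://a/b/c/d;p?q", "../g"));
        // prints http://a/b/g
    }
}
```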

Note that we enforce all the URI spec examples with a set of unit tests.
Looking at how tricky those examples are, we feel very confident about the
quality of Reference's output :)
 
> [...]
> > **************** host = new Reference("http://host.com")
> > Scheme        http
> > Authority     host.com
> > Path          null
> > RemainingPart http://host.com
> > toString      http://host.com
> > TargetRef     http://host.com
> > RelativePart  IllegalArgumentException
> 
> Sticking to the convention of using null for missing elements, I'd say
> that the RelativePart should be null.  It's just missing (or "not
> applicable", if you will) -- just like the Path.

Agreed. We now return null in this case.

> > **************** slashdir = new Reference(host, "/dir")
> > Scheme        null
> > Authority     null
> > Path          /dir
> > RemainingPart null
> > toString      /dir
> > TargetRef     http://host.com/dir
> > RelativePart  /dir
> 
> If this is built relative to the 'host' reference, why isn't slashdir
> "inheriting" the information from the 'host'?

Because the base reference is only a property of the Reference ("baseRef").
When you use the "Reference(base, path)" constructor, it is equivalent to
doing: 

 ref = new Reference(path);
 ref.setBaseRef(base);

The base ref is not automatically resolved or "merged" with the rest of the
reference information (the path here). For example, this lets you reuse a
single reference as the base of several relative references. If you modify
the base reference, all relative references are still accurate.

Frequently you will want to resolve the reference to "merge" the base
reference with the current reference info (for example a relative path). For
this purpose you can use the getTargetRef() method, which will return a new
resolved Reference instance, an absolute URI reference with no base reference.
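
Here is a small sketch of this model. MiniRef is a hypothetical stand-in for
Reference, written only for this mail: it keeps its own text plus a pointer to
a base, and resolves on demand. Resolution is delegated to the JDK's
java.net.URI to keep the example short; that is an assumption of the sketch,
not a description of how Reference resolves.

```java
import java.net.URI;

class BaseRefDemo {
    /** Hypothetical stand-in for Reference: own text plus an optional base. */
    static final class MiniRef {
        private String text;
        private MiniRef baseRef;

        MiniRef(String text) {
            this.text = text;
        }

        MiniRef(MiniRef baseRef, String text) {
            this(text);
            this.baseRef = baseRef; // stored as a property, nothing is merged
        }

        void setText(String text) {
            this.text = text;
        }

        /** Returns a new MiniRef with no base; this instance is unchanged. */
        MiniRef getTargetRef() {
            if (baseRef == null) {
                return new MiniRef(text);
            }
            String base = baseRef.getTargetRef().text;
            return new MiniRef(URI.create(base).resolve(text).toString());
        }

        @Override
        public String toString() {
            return text;
        }
    }

    public static void main(String[] args) {
        MiniRef host = new MiniRef("http://host.com");
        MiniRef slashdir = new MiniRef(host, "/dir");

        System.out.println(slashdir);                 // /dir
        System.out.println(slashdir.getTargetRef());  // http://host.com/dir

        // Changing the base is picked up the next time we resolve.
        host.setText("http://other.org");
        System.out.println(slashdir.getTargetRef());  // http://other.org/dir
    }
}
```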
 
> I.e., the abstract model of what Reference has been created for needs
> to be clear and clearly explained.  Is it a DAG or a tree or a wacky
> string or what?  How does each Reference stand on it's own? How do all
> of the pieces compose and why?  What about other facets of a reference
> such as anchors (e.g. "#footer1") and parameters get folded in? Etc.
> Basically, how can users understand how it should work without going
> insane?

The Reference stores its data as a single string, the one passed to the
constructor. This string can always be obtained using the toString() method.
We also maintain a couple of integer indexes to improve the extraction time
of various reference properties (URI components).

When you modify a specific component of the URI reference, via the setPath()
method for example, we simply regenerate the internal string by updating
only the relevant part. We try as much as possible to preserve the bytes
given to the Reference class instead of transparently parsing and
normalizing the URI data. Our idea is to protect encodings and special
characters in all cases and to reduce the memory taken by this class while
keeping Reference instances mutable.
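
The idea can be sketched like this. StringBackedRef, its field names and its
update() helper are hypothetical and much simpler than the real class. The
sketch uses the query rather than the path because locating the query only
needs two indexes; the principle is the same: splice the new value into the
single string and leave every other character untouched.

```java
class StringBackedRefDemo {
    /** Hypothetical sketch: one string plus cached delimiter positions. */
    static final class StringBackedRef {
        private String internalRef;
        private int queryIndex;     // position of '?', or -1 if absent
        private int fragmentIndex;  // position of '#', or -1 if absent

        StringBackedRef(String ref) {
            update(ref);
        }

        // Stores the string and recomputes the cached indexes.
        private void update(String ref) {
            internalRef = ref;
            fragmentIndex = ref.indexOf('#');
            int q = ref.indexOf('?');
            boolean beforeFragment = fragmentIndex == -1 || q < fragmentIndex;
            queryIndex = (q != -1 && beforeFragment) ? q : -1;
        }

        String getQuery() {
            if (queryIndex == -1) {
                return null;
            }
            int end = (fragmentIndex == -1) ? internalRef.length() : fragmentIndex;
            return internalRef.substring(queryIndex + 1, end);
        }

        // Replaces only the query; a null argument removes the query part.
        void setQuery(String query) {
            int end = (fragmentIndex == -1) ? internalRef.length() : fragmentIndex;
            int start = (queryIndex == -1) ? end : queryIndex;
            String replacement = (query == null) ? "" : "?" + query;
            update(internalRef.substring(0, start) + replacement
                    + internalRef.substring(end));
        }

        @Override
        public String toString() {
            return internalRef;
        }
    }

    public static void main(String[] args) {
        StringBackedRef ref = new StringBackedRef("http://host.com/a%20b?x=1#top");
        ref.setQuery("y=%C3%A9");
        // The encoded path and the fragment are kept exactly as given.
        System.out.println(ref);  // http://host.com/a%20b?y=%C3%A9#top
    }
}
```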
 
> IMHO, *everything* follows from the answers to this fundamental
> question of, in essence, identity works because it's the fundamental
> pivot around which the entire notion of REST revolves.

Absolutely, and that's why we invested so much in creating this Reference
class instead of simply reusing the JDK's URI class.

The fundamental point I would like to underline here is the difference
between a URI "reference" and a URI. Contrary to a URI (the target
identifier of a REST resource), a URI reference can be relative (with or
without query and fragment part). This relative URI reference can then be
resolved against a base reference via the getTargetRef() method, which will
return a new resolved Reference instance, an absolute URI reference with no
base reference.

I hope this clarifies a bit.

Best regards,
Jerome
