fwiw, the following solves the simple problem shown by my previous
example:
    private Session wrap(final SessionImpl origSession) throws RepositoryException {
        final WorkspaceImpl workspace = (WorkspaceImpl) origSession.getWorkspace();
        final RepositoryImpl rep = (RepositoryImpl) origSession.getRepository();
        return new SessionImpl(rep, origSession.getSubject(), workspace.getConfig()) {
            public Path getQPath(String path)
                    throws MalformedPathException, IllegalNameException, NamespaceException {
                // this is the only relevant part:
                return super.getQPath(Normalizer.normalize(path, Normalizer.Form.NFC));
            }
        };
    }
If there were a way to swap the session implementation, or the
NameResolver and/or PathResolver implementations that are used by
default, I might give this a spin.
Any opinions about the whole problem?
Cheers,
-g
On Nov 4, 2009, at 6:11 PM, Grégory Joseph wrote:
Hi list,
Given the following code,
    import java.text.Normalizer;
    ...
    final Session session = ...
    final Repository rep = session.getRepository();
    System.out.println(rep.getDescriptor("jcr.repository.name") + " "
            + rep.getDescriptor("jcr.repository.version"));
    final Node root = session.getRootNode();
    final String name = "föö";
    System.out.println("Normalizer.isNormalized(name, Normalizer.Form.NFC) = "
            + Normalizer.isNormalized(name, Normalizer.Form.NFC)); // true
    System.out.println("Normalizer.isNormalized(name, Normalizer.Form.NFD) = "
            + Normalizer.isNormalized(name, Normalizer.Form.NFD)); // false
    root.addNode(name);
    session.save();
    final Node node1 = root.getNode(name);
    System.out.println("node1 = " + node1);
    final Node node2 = root.getNode(Normalizer.normalize(name, Normalizer.Form.NFC));
    System.out.println("node2 = " + node2);
    final Node node3 = root.getNode(Normalizer.normalize(name, Normalizer.Form.NFD)); // fails
    System.out.println("node3 = " + node3);
There's a good chance fetching node3 won't work. It might depend on
the underlying OS and database, but with OS X and Derby it fails.
That's not really surprising, given that
Normalizer.normalize(name, Normalizer.Form.NFC).equals(Normalizer.normalize(name, Normalizer.Form.NFD))
is NOT true: NFC uses the precomposed "ö" (U+00F6), while NFD uses "o"
followed by a combining diaeresis (U+0308), so the two strings are
different sequences of code points.
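
To illustrate the point outside of any repository, here is a minimal
standalone sketch using only java.text.Normalizer. The class name
NormalFormsDemo and the helper toNfc are made up for the example; the
name is written with Unicode escapes so that the encoding of the source
file plays no role.

```java
import java.text.Normalizer;

final class NormalFormsDemo {

    /** Brings a string to NFC so differently composed inputs compare equal. */
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) {
        // "föö" with precomposed characters (already NFC).
        String nfc = "f\u00f6\u00f6";
        // The same text decomposed: each "ö" becomes "o" + U+0308.
        String nfd = Normalizer.normalize(nfc, Normalizer.Form.NFD);

        System.out.println(nfc.length());            // 3
        System.out.println(nfd.length());            // 5
        System.out.println(nfc.equals(nfd));         // false
        System.out.println(nfc.equals(toNfc(nfd)));  // true
    }
}
```

Both strings render identically, but only once they are brought to the
same form do they compare equal.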
Now, given that different clients use different normalization forms
(Firefox seems to encode URL parameters with NFD, Safari with NFC;
Linux seems to use NFC, the OS X Finder seems to favor NFD), wouldn't
it be a safe bet to normalize all input at the repository level? Or do
you consider this something client applications should do?
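
For what it's worth, normalizing at a single entry point could look
roughly like the sketch below. This is not Jackrabbit code:
NormalizingStore is a hypothetical stand-in that uses a plain HashMap in
place of the repository, and the choice of NFC as the canonical form is
an assumption. The only point is that names are normalized both when
they are stored and when they are looked up.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a store keyed by node name; not Jackrabbit API. */
final class NormalizingStore {

    private final Map<String, String> nodes = new HashMap<>();

    /** Assumption: NFC is the canonical form used for all keys. */
    private static String nfc(String name) {
        return Normalizer.normalize(name, Normalizer.Form.NFC);
    }

    /** Normalizes the name before storing. */
    void add(String name, String value) {
        nodes.put(nfc(name), value);
    }

    /** Normalizes the name before looking up; returns null if absent. */
    String get(String name) {
        return nodes.get(nfc(name));
    }

    public static void main(String[] args) {
        NormalizingStore store = new NormalizingStore();
        String composed = "f\u00f6\u00f6";       // NFC spelling of "föö"
        String decomposed = "fo\u0308o\u0308";   // NFD spelling of "föö"

        store.add(decomposed, "node");
        System.out.println(store.get(composed));    // node
        System.out.println(store.get(decomposed));  // node
    }
}
```

Whether that entry point belongs in the repository or in each client is
exactly the open question.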
ref: http://en.wikipedia.org/wiki/Unicode_equivalence#Normal_forms
Thanks for any tip, pointer, idea, feedback or reaction!
Cheers,
-greg