Hi Alex,

First of all, thank you for the reply.

There must be some sort of Davex limitation, because Row.getPath() does trigger another roundtrip per row for the query I mentioned (JR 2.6.4 libraries, although I fetched some jars from Maven repositories). That means 50 rows, 50 roundtrips.

When operating with NodeIterator I had a roundtrip for every property fetch and every iteration. That made 300 requests easily and slowed things down quite a lot.

I can only advise everyone not to use NodeIterator via Davex. Since I switched to querying only the properties I need with JCR SQL and XPath, my code has become much cleaner. All those nasty exceptions suddenly no longer need to be handled. Overall, I think it made the system more robust and fault-tolerant.

But my system is working now and I am happy with Jackrabbit, although everything was quite a fight, from the Tomcat setup to my naive understanding of Davex remoting. :-)

I have uploaded roughly 2GB of content (production size for me) and it's really, really fast. Works like a charm.


On 19.11.2013 01:10, Alexander Klimetschek wrote:
On 17.11.2013, at 11:30, Datenheld <[email protected]> wrote:

Hi again,

here's a question about jcr:path in queries.

Consider the following query:

SELECT base.[jcr:path], content.[jcr:lastModified],
content.[jcr:lastModifiedBy] FROM [nt:base] as base LEFT OUTER JOIN
[nt:resource] as content ON ISCHILDNODE(content, base) WHERE
((base.[jcr:primaryType] = 'nt:file' OR base.[jcr:primaryType] =
'nt:folder') AND ISCHILDNODE(base, ['/some/path']))

Why is base.[jcr:path] always null? I read somewhere on the
web that jcr:path is a time-intensive calculation and that the field can
be omitted. Can I force the query to calculate the path?

The whole point of this is a performance optimization. As NodeIterator
is slow and the query is fast, I wanted to operate on paths only. Sadly,
I can now get everything very fast except the most important thing: the path.

Have you tried Row#getPath (since JCR 2.0) [0]?

What columns/properties you SELECT (including the "jcr:path") should be 
irrelevant to the performance, only the query (aka WHERE and JOINs) can make a 
difference. This is because the query will resolve to nodes either way, as JCR queries 
can only result in a list of nodes (regardless of using a NodeIterator or RowIterator, 
the row iterator just gives a column-like view on top of a list of nodes).

If you set column names in a query, the row iterator merely denies access to 
the other ones in the API, but the dynamic options (jcr:score, rep:spellcheck, 
rep:excerpt or jcr:path) are all calculated on-demand (i.e. in row.getValue()) 
and jcr:path is the same as calling node.getPath(), so the statement you found 
on the web that it is a time-intensive calculation is not true.
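
To make the Row#getPath suggestion concrete, here is a minimal Java sketch of reading the path through the selector name rather than through a jcr:path column. So that it runs without a repository, RowLike is a hypothetical stand-in that mirrors only getPath(String selectorName) of javax.jcr.query.Row; collectPaths and the sample rows are likewise made up for illustration. With the real API the rows would come from QueryResult.getRows(). The sketch says nothing about how many roundtrips each call costs over Davex.

```java
import java.util.ArrayList;
import java.util.List;

public class RowPathSketch {

    /** Hypothetical stand-in for javax.jcr.query.Row, reduced to the one call used here. */
    interface RowLike {
        String getPath(String selectorName);
    }

    /** Collects the path of the node bound to the given selector for every row. */
    static List<String> collectPaths(Iterable<RowLike> rows, String selectorName) {
        List<String> paths = new ArrayList<>();
        for (RowLike row : rows) {
            String path = row.getPath(selectorName);
            // With an outer join a row may have no node for a selector; skip those rows.
            if (path != null) {
                paths.add(path);
            }
        }
        return paths;
    }

    public static void main(String[] args) {
        // Fake rows: "base" always resolves, "content" is missing for the folder.
        List<RowLike> rows = new ArrayList<>();
        rows.add(sel -> sel.equals("base") ? "/some/path/a.txt" : "/some/path/a.txt/jcr:content");
        rows.add(sel -> sel.equals("base") ? "/some/path/docs" : null);

        System.out.println(collectPaths(rows, "base"));
        System.out.println(collectPaths(rows, "content"));
    }
}
```

The selector names "base" and "content" are the aliases from the query quoted above; in a join the selector name has to be passed explicitly, because a row then refers to more than one node.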

So, is there a way to retrieve the paths when you have a query with
joins? Or will I really have to call row.getPath("<selector>"), which
slows everything down again?

Hmm, maybe that's an SPI/remoting limitation...

Regarding the query: an nt:folder cannot have an nt:resource child, and nt:file 
defines the child as jcr:content, so your query can be simplified to this XPath 
query:

/jcr:root/some/path//element(*, nt:file)/jcr:content

(I personally think XPath is much better suited for hierarchical structures, 
and it's rare that you really need a complex JOIN that is only supported by 
JCR-SQL2. Also, XPath is deprecated in the JCR spec, but not in Jackrabbit; the 
intention of the spec writers was that, even if it is not supported through the 
JCR API, a repository would offer a generic XPath-to-AQM conversion.)
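
As a small illustration of the XPath form, here is a Java sketch that assembles the statement above from a parent path instead of hard-coding it. The class and method names are hypothetical. It does not apply the ISO 9075 name encoding that XPath needs for names containing spaces or starting with a digit, so it assumes plain path segments. The resulting string would be handed to QueryManager.createQuery(statement, Query.XPATH).

```java
public class XPathStatementSketch {

    /** Builds the XPath statement selecting jcr:content of all nt:file nodes below parentPath. */
    static String fileContentQuery(String parentPath) {
        if (parentPath == null || !parentPath.startsWith("/")) {
            throw new IllegalArgumentException("absolute path expected: " + parentPath);
        }
        // Drop a trailing slash so "/" and "/some/path/" do not produce "///".
        String base = parentPath.endsWith("/")
                ? parentPath.substring(0, parentPath.length() - 1)
                : parentPath;
        return "/jcr:root" + base + "//element(*, nt:file)/jcr:content";
    }

    public static void main(String[] args) {
        // Prints the statement for the path used in this thread.
        System.out.println(fileContentQuery("/some/path"));
    }
}
```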

[0] 
http://www.day.com/maven/javax.jcr/javadocs/jcr-2.0/javax/jcr/query/Row.html#getPath()

Cheers,
Alex


