Christoph Kiehl wrote:
Marcel Reutegger wrote:
Wouldn't it make sense to rewrite all @foo:bar!='john' queries to
not(@foo:bar!='john') by default instead of creating a
MatchAllQuery?
do you mean rewrite: @foo:bar!='john' to not(@foo:bar='john') ?
Yes, of course. My mistake. Do you th
Marcel Reutegger wrote:
Wouldn't it make sense to rewrite all @foo:bar!='john' queries to
not(@foo:bar!='john') by default instead of creating a
MatchAllQuery?
do you mean rewrite: @foo:bar!='john' to not(@foo:bar='john') ?
Yes, of course. My mistake. Do you think that's an option?
C
Christoph Kiehl wrote:
As I understand in DescendantSelfAxisQuery.DescendantSelfAxisScorer the
contextHits are used to filter the subHits result to only include nodes
of the given context. The context is something like /foo/bar//*, which
means all descendants of /foo/bar. Is that right?
yes,
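The filtering step described above can be pictured with a small sketch. This is not the Jackrabbit implementation: it assumes, for illustration only, that both hit sets are available as java.util.BitSet instances indexed by document number, and the class and method names are hypothetical.

```java
import java.util.BitSet;

// Hypothetical illustration; not the DescendantSelfAxisScorer code.
final class ContextFilterSketch {

    /**
     * Keeps only those sub-query hits that also lie inside the context
     * (e.g. the descendants of /foo/bar). Neither input is modified.
     */
    static BitSet filterByContext(BitSet subHits, BitSet contextHits) {
        BitSet result = (BitSet) subHits.clone(); // copy so the caller's set stays intact
        result.and(contextHits);                  // intersection: hit AND in context
        return result;
    }
}
```

With subHits = {1, 4, 7} and contextHits = {4, 7, 9}, only documents 4 and 7 survive the filter.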
Both of these proposals sound great - particularly the additional caching in
DescendantSelfAxisQuery. I think this would address the scenario for which I
suggested additional indexing earlier in this thread. As I mentioned, in my
query test set DescendantSelfAxisQuery.DescendantSelfAxisScorer.next()
Marcel Reutegger wrote:
Christoph Kiehl wrote:
I've created a jira issue: http://issues.apache.org/jira/browse/JCR-791
Are you working on this issue? Or should I try to implement something?
I just started working on it ;)
Great news ;)
Now that you are working on implementing this cache o
Hi Christoph,
Christoph Kiehl wrote:
I've created a jira issue: http://issues.apache.org/jira/browse/JCR-791
Are you working on this issue? Or should I try to implement something?
I just started working on it ;)
Actually it's /foo/[EMAIL PROTECTED]:bar!='john']
ah, yes. that makes sense.
Marcel Reutegger wrote:
Christoph Kiehl wrote:
Christoph Kiehl wrote:
I was digging a bit into Jackrabbit today and found another place
where some caching did provide a substantial performance gain to
queries which check one attribute for more than one value (like
/foo/[EMAIL PROTECTED]:bar=
Christoph Kiehl wrote:
Christoph Kiehl wrote:
I was digging a bit into Jackrabbit today and found another place
where some caching did provide a substantial performance gain to
queries which check one attribute for more than one value (like
/foo/[EMAIL PROTECTED]:bar='john' or foo:bar='doe'])
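As a rough picture of why caching helps when one attribute is checked against several values, here is a minimal memoization sketch. It does not reproduce Jackrabbit's calculateDocFilter(); the class name, the choice of a String field name as cache key, and the compute function are all assumptions made for illustration. A real cache would also have to be invalidated when the index changes.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical illustration of memoizing an expensive per-field BitSet.
final class DocFilterCacheSketch {

    private final Map<String, BitSet> cache = new HashMap<>();
    private final Function<String, BitSet> compute; // the expensive calculation (assumed)

    DocFilterCacheSketch(Function<String, BitSet> compute) {
        this.compute = compute;
    }

    /** Computes the filter for a field once; later calls reuse the stored result. */
    BitSet docFilterFor(String field) {
        BitSet cached = cache.get(field);
        if (cached == null) {
            cached = compute.apply(field);
            cache.put(field, cached);
        }
        return (BitSet) cached.clone(); // defensive copy so callers cannot corrupt the cache
    }
}
```

In a query such as foo:bar='john' or foo:bar='doe', both clauses would ask for the filter of the same field, so under these assumptions the expensive calculation runs once instead of twice.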
David Johnson wrote:
I can see how my suggested optimization could severely impact some use
cases. Nevertheless, "our use case" :-) is mostly querying a stable
hierarchy structure - i.e., we rarely, if ever, would move a tree with even
1000s of sub-nodes (famous last words). And we use t
David Johnson wrote:
Do you have a patch for the file? I would love to check it out and run it
against my query suite.
For the eager, I uploaded a quick and dirty patch for the calculateDocFilter()
caching:
http://download.yousendit.com/C4FC14DA01183678
I'll of course provide a complete patc
Do you have a patch for the file? I would love to check it out and run it
against my query suite.
-Dave
On 3/13/07, Christoph Kiehl <[EMAIL PROTECTED]> wrote:
Christoph Kiehl wrote:
> I was digging a bit into Jackrabbit today and found another place where
> some caching did provide a substant
Christoph Kiehl wrote:
I was digging a bit into Jackrabbit today and found another place where
some caching did provide a substantial performance gain to queries which
check one attribute for more than one value (like /foo/[EMAIL PROTECTED]:bar='john'
or foo:bar='doe']). The BitSet in calculat
David Johnson wrote:
Out of the Jackrabbit code,
DescendantSelfAxisQuery.DescendantSelfAxisScorer.next()
is now taking the most time while executing my query suite - taking 68% of
the time, within it, calls to
DescendantSelfAxisQuery.DescendantSelfAxisScorer.calculateSubHits() taking
the majorit
DescendantSelfAxisQuery is now taking the most time in the profiling that I
have recently done.
From my earlier post:
Out of the Jackrabbit code,
DescendantSelfAxisQuery.DescendantSelfAxisScorer.next()
is now taking the most time while executing my query suite - taking 68% of
the time, within
well, the problem with that approach is the following:
assume you have a tree of nodes under /a, let's say 10 million nodes. then a
user renames /a to /b. the index would have to re-index 10 million nodes. this
operation is currently very efficient and takes just a couple of milliseconds,
beca
Yeah I would +1 to that, it's something I do fairly often (there is often a
lot of info in a path that is relevant to a query - given that we have gone
ahead and nicely partitioned our content!).
On 3/13/07, David Johnson <[EMAIL PROTECTED]> wrote:
As another example, for each node, perhaps eve
As another example, for each node, perhaps every potential parent path could
be added to the index - as an example a node at /a/b/c/d/e/f/g would have
index entries:
path1: /a
path2: /a/b
path3: /a/b/c
path4: /a/b/c/d
path5: /a/b/c/d/e
path6: /a/b/c/d/e/f
so queries for specific sub-paths - e.g.
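The list of index entries above can be generated mechanically. The following is only a sketch of the idea, with a hypothetical class and method name; it is not taken from Jackrabbit and says nothing about how such entries would actually be stored in the index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the "index every parent path" idea.
final class AncestorPathsSketch {

    /**
     * Returns every proper ancestor path of an absolute path, shortest first.
     * The root path and the node's own path are not included.
     */
    static List<String> ancestorPaths(String path) {
        List<String> result = new ArrayList<>();
        int idx = path.indexOf('/', 1);           // skip the leading slash
        while (idx != -1) {
            result.add(path.substring(0, idx));   // everything before this slash
            idx = path.indexOf('/', idx + 1);
        }
        return result;
    }
}
```

For /a/b/c/d/e/f/g this yields the six entries /a through /a/b/c/d/e/f. Note the drawback raised elsewhere in this thread: if paths are stored this way, renaming /a means every node below it has to be re-indexed.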
Done:
https://issues.apache.org/jira/browse/JCR-787
I did file it as a bug - as it really is an incorrect implementation (i.e.,
missing implementation) of equals and hashCode.
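To show why a missing equals/hashCode defeats caching, here is a small sketch. The class below is hypothetical and is not the class fixed in JCR-787; it only assumes that some cache is keyed by an object describing the sort criteria.

```java
import java.util.Objects;

// Hypothetical cache key; field names are assumptions for illustration.
final class SortKeySketch {

    private final String field;
    private final boolean descending;

    SortKeySketch(String field, boolean descending) {
        this.field = field;
        this.descending = descending;
    }

    // Without this override, two keys built from the same values are
    // different objects and a HashMap-based cache never finds a hit.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SortKeySketch)) return false;
        SortKeySketch other = (SortKeySketch) o;
        return descending == other.descending && field.equals(other.field);
    }

    // Must be consistent with equals: equal keys produce equal hash codes.
    @Override
    public int hashCode() {
        return Objects.hash(field, descending);
    }
}
```

With both methods in place, a value stored under new SortKeySketch("publishDate", true) can be looked up again with a freshly created, equal key, which is what allows a cached object to be reused between two query executions.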
-Dave
On 3/12/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,
On 3/10/07, David Johnson <[EMAIL PROTECTED]> wrote:
David Johnson wrote:
I think I was again focusing on range queries and giving Lucene some way of
filtering out subsets of the document set, so that the whole document set
wouldn't have to be walked. For the date range query the from and to dates
would most likely share some set of most significa
Hi,
On 3/10/07, David Johnson <[EMAIL PROTECTED]> wrote:
Will making an associated JIRA issue speed the inclusion of the change?
From my understanding it is fixing a real bug.
I'm currently not planning to cut any more releases from the 1.2
branch, as I'd like to focus on releasing 1.3 from sv
On 3/9/07, David Johnson <[EMAIL PROTECTED]> wrote:
-- snip --
yes, this should ensure that caching in lucene is used wherever possible.
> Even
> though there might be bugs that prevent this. Just like this one:
>
> http://svn.apache.org/viewvc?view=rev&revision=506908
>
> which prevented the
Will making an associated JIRA issue speed the inclusion of the change?
From my understanding it is fixing a real bug. I can create an issue if
that will bring it into a release faster.
-Dave
On 3/9/07, Jukka Zitting <[EMAIL PROTECTED]> wrote:
Hi,
On 3/9/07, David Johnson <[EMAIL PROTECTED
Hi,
On 3/9/07, David Johnson <[EMAIL PROTECTED]> wrote:
> Even though there might be bugs that prevent this. Just like this one:
>
> http://svn.apache.org/viewvc?view=rev&revision=506908
>
> which prevented the re-use of SharedFieldSortComparator even if nothing
> changed between two query execu
-- snip --
yes, this should ensure that caching in lucene is used wherever possible. Even
though there might be bugs that prevent this. Just like this one:
http://svn.apache.org/viewvc?view=rev&revision=506908
which prevented the re-use of SharedFieldSortComparator even if nothing changed
betw
David Johnson wrote:
In my last tests, I think I have done this - through parameters in the
repository.xml file and recreating the entire repository. Nevertheless, I
did not see a significant speed change in query response. Perhaps I
wasn't using a small enough resultFetchSize (128)?
On 3/6/07, Marcel Reutegger <[EMAIL PROTECTED]> wrote:
Hi David,
David Johnson wrote:
> Yes, I am using Jackrabbit 1.2.x and I am not seeing that dramatic of a
> difference between 1.1.x and the 1.2.x, although I have not done a
direct
> comparison between the two with the same query suite.
pl
Hi David,
David Johnson wrote:
Yes, I am using Jackrabbit 1.2.x and I am not seeing that dramatic of a
difference between 1.1.x and the 1.2.x, although I have not done a direct
comparison between the two with the same query suite.
please note that you have to change the configuration to get th
Hi Jukka,
Thanks for the reply.
Yes, I am using Jackrabbit 1.2.x and I am not seeing that dramatic of a
difference between 1.1.x and the 1.2.x, although I have not done a direct
comparison between the two with the same query suite. It looks like adding
ordering and/or large range queries can si
Hi,
On 2/28/07, David Johnson <[EMAIL PROTECTED]> wrote:
"select * from Column where jcr:path like 'Gossip/ColumnName/Columns/%' and
status <> 'hidden' order by publishDate desc" takes 500 ms to execute - this
is just the execution time, I am not actually using or accessing the
NodeIterator.
Any pointers and thoughts from the developers who have worked on the
LuceneQueryBuilder would be very appreciated. As an idea, I was thinking of
running the Query AST through an optimization before it is passed to the
query builder. Perhaps in
org.apache.jackrabbit.core.query.lucene.QueryImpl.e
David Johnson wrote:
Digging into the internals of Jackrabbit, we have noticed that there is an
implementation of RangeQuery that essentially walks the results if the # of
query terms is greater than what Lucene can handle. Reading the Lucene
documentation, it looks like Filters are the recomme