Thanks for the replies,
We'll try the filters then, possibly with caching if performance
requires it.
@Karsten: We did think about simplifying permissions to just top-level
folders, which is probably suitable for 80% of our clients. If the
filter is too slow we may have to. In that case it gets a lot simpler:
we can add an extra field for what we call "zone" and use just a term
query, no need for a prefix or wildcard anymore, and thus no more max
clause count errors.
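
A minimal sketch of the "zone" idea, using plain java.util rather than any Lucene API. The class ZoneField, the method names, and the choice to give root-level documents an empty zone are assumptions made for illustration. It derives the top-level folder from a path at index time and then matches it exactly, which is what a single term query on the extra field would do:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

final class ZoneField {
    // Top-level folder of a path, e.g. "/folder B/image 1.jpg" -> "folder B".
    // Assumption: a document directly under the root gets the empty zone "".
    static String zoneOf(String path) {
        int start = path.startsWith("/") ? 1 : 0;
        int end = path.indexOf('/', start);
        return end < 0 ? "" : path.substring(start, end);
    }

    // Exact match on the zone value, standing in for one term query per
    // permitted zone. No term expansion happens, so the number of clauses
    // equals the number of zones the user may view.
    static List<String> visible(List<String> paths, Set<String> allowedZones) {
        List<String> result = new ArrayList<>();
        for (String path : paths) {
            if (allowedZones.contains(zoneOf(path))) {
                result.add(path);
            }
        }
        return result;
    }
}
```

Because the match is exact, a permission on "folder B" does not leak into "folder B2", which a careless prefix such as "/folder B" (without the trailing slash) would.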
Kind regards,
Nico Krijnen
On 5 aug 2008, at 15:31, Erick Erickson wrote:
This situation is pretty much the kind of thing PrefixFilters
were written for, so I'd certainly try those first, with or
without caching. I was surprised at how fast filters
get constructed, so I'd just try it and take a few measurements.
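
To illustrate why a prefix filter sidesteps the clause limit, here is a toy model in plain java.util. It is not Lucene's PrefixFilter; the class PrefixBits and the TreeMap standing in for the sorted term dictionary are assumptions for the sketch. The filter walks the sorted terms from the prefix onward and sets one bit per matching document, so nothing is expanded into query clauses:

```java
import java.util.BitSet;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class PrefixBits {
    // pathToDoc: sorted 'path' terms mapped to document numbers (toy index).
    static BitSet forPrefix(TreeMap<String, Integer> pathToDoc, String prefix) {
        BitSet bits = new BitSet();
        // Terms sharing a prefix are contiguous in sorted order, so start at
        // the prefix and stop at the first term that no longer matches.
        for (Map.Entry<String, Integer> e : pathToDoc.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            bits.set(e.getValue());
        }
        return bits;
    }

    // One bit set per user: the union of the bits for every permitted folder.
    static BitSet forUser(TreeMap<String, Integer> pathToDoc, List<String> prefixes) {
        BitSet allowed = new BitSet();
        for (String prefix : prefixes) {
            allowed.or(forPrefix(pathToDoc, prefix));
        }
        return allowed;
    }
}
```

The resulting bit set can be kept per user session and reused across searches, which is the role caching plays in the suggestion above.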
Best
Erick
On 5 aug 2008, at 11:11, Karsten F. wrote:
Hi Nico Krijnen,
I think it is OK to store a filter for each user session in memory.
And I think that a cached filter is the correct approach for
permissions.
(extra memory usage = one bit for each user and each document)
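
As a rough back-of-the-envelope check of that estimate, assuming a plain bit set with no object overhead (the class FilterMemory and the figures in the note below are made up for illustration):

```java
final class FilterMemory {
    // One bit per document, rounded up to whole bytes.
    static long bytesPerFilter(long docCount) {
        return (docCount + 7) / 8;
    }

    // One cached filter per user session.
    static long totalBytes(long docCount, long cachedFilters) {
        return bytesPerFilter(docCount) * cachedFilters;
    }
}
```

For a hypothetical index of 1,000,000 documents and 200 cached sessions this gives 125,000 bytes per filter and 25,000,000 bytes in total, i.e. roughly 25 MB.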
Hopefully someone with more experience will also answer your question.
But I want to ask the obvious question:
Is your permission-policy really on each file, or only on the top-most
folders?
Can't you store only the relevant path in an extra Lucene field and
set the maximum number of query terms to, e.g., 2048?
Best regards
Karsten
On Tue, Aug 5, 2008 at 3:40 AM, Nico Krijnen
<[EMAIL PROTECTED]> wrote:
Hello,
Need some help with prefix filtering...
We ran into the max clause count problem with our usage of the
wildcard query. Essentially what we are trying to do is:
One of the fields in our index contains a 'path' representing a
file system location. For example:
/folder A/subfolder/document 1.pdf
/folder B/image 1.jpg
/folder B/image 2.jpg
/folder B2/image 3.jpg
/folder C/image 4.jpg
We have a security layer in our application that filters results
based on the user's permissions. These permissions (VIEW, EDIT, ...)
can be set on 'folder paths'. To filter the results we build a bool
query with a wildcard (or prefix) query for each folder for which the
user has VIEW permissions, for example:
/folder A/subfolder/*
/folder B/*
/folder B2/*
This does exactly what we want, but because a wildcard query is
rewritten to term queries it fails when there are more than 1024
documents below a folder (max clause count of rewritten bool query).
After all, each document has a different (untokenized) term value for
the 'path' field.
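
The failure described above can be modelled in a few lines of plain Java. This is a toy model, not Lucene's rewrite code; the class RewriteDemo and the exception type are assumptions for the sketch. Each distinct 'path' term under the folder becomes one clause, and the rewrite gives up once the limit of 1024 would be exceeded:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.SortedSet;

final class RewriteDemo {
    static final int MAX_CLAUSES = 1024; // the limit mentioned in the message

    // Expands a prefix into one clause per matching term, as the rewrite does.
    static List<String> rewritePrefix(SortedSet<String> terms, String prefix) {
        List<String> clauses = new ArrayList<>();
        for (String term : terms.tailSet(prefix)) {
            if (!term.startsWith(prefix)) {
                break; // sorted order: no later term can match
            }
            if (clauses.size() == MAX_CLAUSES) {
                throw new IllegalStateException("more than " + MAX_CLAUSES + " clauses");
            }
            clauses.add(term);
        }
        return clauses;
    }
}
```

Since every document has its own path term, the clause count grows with the number of documents below the folder, not with the number of folders in the permission list.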
After searching the web we found some alternative methods, for
example using a PrefixFilter wrapped in a CachingWrapperFilter
instead of a query.
Before we start implementing, I'd like to check whether anyone here
has more experience with queries like this or has a better suggestion
on how to approach this.
Kind regards,
Nico Krijnen
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]