On Wed, 2011-02-09 at 13:03 +0200, Arno Kuhl wrote:
> On Tue, 2011-02-08 at 14:36 +0200, Arno Kuhl wrote:
> I'm hoping some clever php gurus have been here before and are
> willing to
> share some ideas.
> I have a site where articles are assigned to categories in
> containers. An
> article can be assigned to only one category per container, but one
> or more
> containers. Access permissions can be set per article, per category
> per container, for one or more users and/or user groups. If an
> article is
> assigned to 10 categories and only one of those has a permission denying
> access, then the article can't be accessed even if browsing through
> one of
> the other 9 categories. Currently everything works fine, with
> article titles
> showing when browsing through category or search result lists, and a
> message is displayed when the article is clicked if it cannot be viewed
> because of a permission restriction.
> Now there's a requirement to not display the article title in
> category lists
> and search results if it cannot be viewed. I'm stuck with how to get
> the number of results for paging at the start of the list or search. The
> site is quite large (20,000+ articles and growing) so reading the
> result set and sifting through it with permission rules for each
> request is
> not an option. But it might be an option if done once at the start
> of each
> search or list request, and then use that temporary modified result
> set for
> subsequent requests on the same set. I thought of saving the set to a
> temporary db table or file (not sure about overhead of
> serializing/unserializing large arrays). A sizing exercise based on the
> recordset returned for searches and lists shows a max of about 150MB for
> 20,000 articles and 380MB for 50,000 articles that needs to be saved
> temporarily per search or list request - in the vast majority of
> cases the
> set will be *much* smaller but it needs to cope with the worst case, and
> still do so a year down the line.
> All this extra work because I can't simply get an accurate number of
> results for paging, because of permissions!
> So my questions are:
> 1. Which is better (performance) for this situation: file or db?
> 2. How do I prepare a potentially very large data set for file or db,
> eg writing to a new table (ie I obviously don't want to write it record
> by record)?
> 3. Are there any other alternatives worth looking at?
> How are you determining (logically, not in code) when an article is allowed
> to be read?
> Assume an article on "user permissions in mysql" is in a container called
> 'databases' and in a second one called 'security' and both containers are in
> a category called 'computers'
> Now get a user called John who is in a group called 'db admins' and that
> group gives him permissions to view all articles in the 'databases'
> container and any articles in any container in the 'computers' category. Now
> assume John also has explicit user permissions revoking that right to view
> the article in any container.
> What I'm getting at is what's the order of privilege for rights? Do group
> rights for categories win out over those for containers, or do individual
> user rights trump all of them overall?
> I think once that's figured out, a lot can be done inside the query itself
> to minimise the impact on the script getting the results.
> The simple structure is articles in categories, categories in containers,
> one category per container per article, in one or more containers. If an
> article permission explicitly allows or denies access then the permission
> applies, otherwise the permissions of its container(s) and category(ies)
> are checked.
> The permission checks user access first then group. A user can belong to
> multiple groups.
> There's no query to handle this that can return a neat recordset for paging.
> Currently the complete checks are only done for an article request. The
> category list only checks access to the category and the container it
> belongs to, so the list is either displayed in its entirety (including
> titles of articles that can't be viewed) or not at all, and obviously the
> paging works perfectly because the total number of titles is known up front
> and remains constant for subsequent requests.
> If I use read-ahead to make allowance for permissions and remove paging
> (just keep prev/next) the problem goes away. Or I could use "best-guess"
> paging, which could range from 100% accurate to 99% wrong. At first glance
> that's not really acceptable, but I noticed recently Google does the same
> thing with their search results.
> First prize is to work out a proper solution that is fast and accurate and
> works on fairly large results, and I'm still hoping for some suggestions.
> But as a last resort I'll go the "best-guess" route. If Google can do it...
Well, without seeing your DB structure I can't say for definite, but
that should be OK in a single SQL statement with some IF clauses, etc.
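For illustration only - the table and column names below are invented, since the real schema isn't shown - the precedence rule can be pushed into the query with a CASE expression, so the count for paging comes straight from the database:

```php
<?php
// Sketch with a hypothetical schema: article_perms holds explicit
// per-article rules, cat_perms/cont_perms hold category/container rules
// (allow = 1/0, NULL meaning no rule). Group checks would add similar
// joins. All names here are invented for illustration.
$sql = "
SELECT COUNT(DISTINCT a.id)
FROM articles a
LEFT JOIN article_perms ap ON ap.article_id   = a.id           AND ap.user_id = :user
LEFT JOIN cat_perms     cp ON cp.category_id  = a.category_id  AND cp.user_id = :user
LEFT JOIN cont_perms    kp ON kp.container_id = a.container_id AND kp.user_id = :user
WHERE CASE
        WHEN ap.allow IS NOT NULL THEN ap.allow  -- explicit rule wins outright
        ELSE COALESCE(cp.allow, 1) AND COALESCE(kp.allow, 1)
      END = 1";
```

The point is that the same WHERE clause serves both the COUNT for paging and the paged SELECT itself, so the total stays consistent across requests.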
However, if you go with the multiple queries route and use PHP to sort,
just grab a list of article IDs a user (the currently logged-in one) has
access to read given a set of criteria, so you should end up with a list
of article IDs for explicit user permissions, user category permissions
and user container permissions.
You can then iterate each list to get the final permissions and totals.
This is very simplified, and needs fleshing out, but it should give you
the general idea.
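Something along these lines, say - the array names and sample IDs are purely illustrative, and it applies the precedence described earlier (explicit per-article rule wins outright, otherwise category and container both have to allow):

```php
<?php
// IDs with an explicit per-article rule for this user (illustrative data):
$explicitAllow = [3, 7];
$explicitDeny  = [12];

// IDs allowed via category and container permissions:
$categoryAllow  = [3, 5, 9, 12];
$containerAllow = [5, 7, 9, 12, 20];

// Candidate article IDs in the current list or search result:
$candidates = [3, 5, 7, 9, 12, 20, 31];

$visible = [];
foreach ($candidates as $id) {
    if (in_array($id, $explicitDeny, true)) {
        continue;                       // explicit deny wins outright
    }
    if (in_array($id, $explicitAllow, true)) {
        $visible[] = $id;               // explicit allow wins outright
        continue;
    }
    // No explicit rule: must be allowed by both category and container
    if (in_array($id, $categoryAllow, true)
        && in_array($id, $containerAllow, true)) {
        $visible[] = $id;
    }
}

$total = count($visible);               // accurate total for paging
```

Since only the ID lists need to be cached between paging requests, that's a few integers per article rather than the 150MB+ recordsets mentioned above.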