RE: [PHP] Paging and permissions

2011-02-09 Thread Arno Kuhl
On Tue, 2011-02-08 at 14:36 +0200, Arno Kuhl wrote: 

I'm hoping some clever php gurus have been here before and are
willing to
share some ideas.
 
I have a site where articles are assigned to categories in
containers. An
article can be assigned to only one category per container, but one
or more
containers. Access permissions can be set per article, per category
and/or
per container, for one or more users and/or user groups. If an
article is
assigned to 10 categories and only one of those has a permission
denying
access, then the article can't be accessed even if browsing through
one of
the other 9 categories. Currently everything works fine, with
article titles
showing when browsing through category or search result lists, and a
message
is displayed when the article is clicked if it cannot be viewed
because of a
permission.
 
Now there's a requirement to not display the article title in
category lists
and search results if it cannot be viewed. I'm stuck with how to
determine
the number of results for paging at the start of the list or search.
The
site is quite large (20,000+ articles and growing) so reading the
entire
result set and sifting through it with permission rules for each
request is
not an option. But it might be an option if done once at the start
of each
search or list request, and then use that temporary modified result
set for
subsequent requests on the same set. I thought of saving the set to
a
temporary db table or file (not sure about overhead of
serializing/unserializing large arrays). A sizing exercise based on
the
recordset returned for searches and lists shows a max of about 150MB
for
20,000 articles and 380MB for 50,000 articles that needs to be saved
temporarily per search or list request - in the vast majority of
cases the
set will be *much* smaller but it needs to cope with the worst case,
and
still do so a year down the line.
 
All this extra work because I can't simply get an accurate number of
results
for paging, because of permissions!
 
So my questions are:
1. Which is better (performance) for this situation: file or db?
2. How do I prepare a potentially very large data set for file or
fast
writing to a new table (ie I obviously don't want to write it record
by
record)
3. Are there any other alternatives worth looking at?
 
TIA
 
Cheers
Arno



How are you determining (logically, not in code) when an article is allowed
to be read?

Assume an article on user permissions in mysql is in a container called
'databases' and in a second one called 'security' and both containers are in
a category called 'computers'

Now get a user called John who is in a group called 'db admins' and that
group gives him permissions to view all articles in the 'databases'
container and any articles in any container in the 'computers' category. Now
assume John also has explicit user permissions revoking that right to view
the article in any container.

What I'm getting at is what's the order of privilege for rights? Do group
rights for categories win out over those for containers, or do individual
user rights trump all of them overall?

I think once that's figured out, a lot can be done inside the query itself
to minimise the impact on the script getting the results.


Thanks,
Ash
http://www.ashleysheridan.co.uk

---

The simple structure is articles in categories, categories in containers,
only one article per container/category, in one or more containers. If an
article permission explicitly allows or denies access then the permission
applies, otherwise the container/s and category/s permissions are checked.
The permission checks user access first then group. A user can belong to
multiple groups.

There's no query to handle this that can return a neat recordset for paging.
Currently the complete checks are only done for an article request. The
category list only checks access to the category and the container it
belongs to, so the list is either displayed in its entirety (including
titles of articles that can't be viewed) or not at all, and obviously the
paging works perfectly because the total number of titles is known up front
and remains constant for subsequent requests.

If I use read-ahead to make allowance for permissions and remove paging
(just keep prev/next) the problem goes away. Or I could use best-guess
paging, which could range from 100% accurate to 99% wrong. At first glance
that's not really acceptable, but I noticed recently Google does the same
thing with their search results. 

First prize is to work out a proper solution that is fast and accurate and
works on fairly large results, and I'm still hoping for some 

RE: [PHP] Paging and permissions

2011-02-09 Thread Ashley Sheridan
On Wed, 2011-02-09 at 13:03 +0200, Arno Kuhl wrote:

 On Tue, 2011-02-08 at 14:36 +0200, Arno Kuhl wrote: 
 
   I'm hoping some clever php gurus have been here before and are
 willing to
   share some ideas.

   I have a site where articles are assigned to categories in
 containers. An
   article can be assigned to only one category per container, but one
 or more
   containers. Access permissions can be set per article, per category
 and/or
   per container, for one or more users and/or user groups. If an
 article is
   assigned to 10 categories and only one of those has a permission
 denying
   access, then the article can't be accessed even if browsing through
 one of
   the other 9 categories. Currently everything works fine, with
 article titles
   showing when browsing through category or search result lists, and a
 message
   is displayed when the article is clicked if it cannot be viewed
 because of a
   permission.

   Now there's a requirement to not display the article title in
 category lists
   and search results if it cannot be viewed. I'm stuck with how to
 determine
   the number of results for paging at the start of the list or search.
 The
   site is quite large (20,000+ articles and growing) so reading the
 entire
   result set and sifting through it with permission rules for each
 request is
   not an option. But it might be an option if done once at the start
 of each
   search or list request, and then use that temporary modified result
 set for
   subsequent requests on the same set. I thought of saving the set to
 a
   temporary db table or file (not sure about overhead of
   serializing/unserializing large arrays). A sizing exercise based on
 the
   recordset returned for searches and lists shows a max of about 150MB
 for
   20,000 articles and 380MB for 50,000 articles that needs to be saved
   temporarily per search or list request - in the vast majority of
 cases the
   set will be *much* smaller but it needs to cope with the worst case,
 and
   still do so a year down the line.

   All this extra work because I can't simply get an accurate number of
 results
   for paging, because of permissions!

   So my questions are:
   1. Which is better (performance) for this situation: file or db?
   2. How do I prepare a potentially very large data set for file or
 fast
   writing to a new table (ie I obviously don't want to write it record
 by
   record)
   3. Are there any other alternatives worth looking at?

   TIA

   Cheers
   Arno
   
 
 
 How are you determining (logically, not in code) when an article is allowed
 to be read?
 
 Assume an article on user permissions in mysql is in a container called
 'databases' and in a second one called 'security' and both containers are in
 a category called 'computers'
 
 Now get a user called John who is in a group called 'db admins' and that
 group gives him permissions to view all articles in the 'databases'
 container and any articles in any container in the 'computers' category. Now
 assume John also has explicit user permissions revoking that right to view
 the article in any container.
 
 What I'm getting at is what's the order of privilege for rights? Do group
 rights for categories win out over those for containers, or do individual
 user rights trump all of them overall?
 
 I think once that's figured out, a lot can be done inside the query itself
 to minimise the impact on the script getting the results.
 
 
 Thanks,
 Ash
 http://www.ashleysheridan.co.uk
 
 ---
 
 The simple structure is articles in categories, categories in containers,
 only one article per container/category, in one or more containers. If an
 article permission explicitly allows or denies access then the permission
 applies, otherwise the container/s and category/s permissions are checked.
 The permission checks user access first then group. A user can belong to
 multiple groups.
 
 There's no query to handle this that can return a neat recordset for paging.
 Currently the complete checks are only done for an article request. The
 category list only checks access to the category and the container it
 belongs to, so the list is either displayed in its entirety (including
 titles of articles that can't be viewed) or not at all, and obviously the
 paging works perfectly because the total number of titles is known up front
 and remains constant for subsequent requests.
 
 If I use read-ahead to make allowance for permissions and remove paging
 (just keep prev/next) the problem goes away. Or I could use best-guess
 paging, which could range from 100% accurate to 99% wrong. At first glance
 that's not really acceptable, but I noticed recently Google does the same
 thing with their search results. 
 
 First prize is to work out a proper solution that is fast 

RE: [PHP] Paging and permissions

2011-02-09 Thread Arno Kuhl
Instead of serializing the articles, you only need their IDs. Using

$sql .= ' where id in (' . implode(',', $ids) . ')';

you can load the data for a page of results in a single query. Storing the
IDs is much cheaper than the articles.

If the permissions are fairly static (i.e. access for user X to article Y
doesn't change every two minutes) you could create a calculated permission
table as a many-to-many between user and article. Here's the logic flow for
a query:

1. Run the query to find matching article IDs
2. Load permissions from table for all IDs
3. For each article without a calculated permission, calculate it and insert
a row (do a batch insert to save time)

If you flag the query in the middle tier as having been processed as above,
you can join to the calculated permissions each time you need another page.
The downside is that the code that runs the queries has to operate in two
modes: raw and joined to the permissions. If most users end up querying for
all articles, the table could grow. Plus you need to purge rows any time the
permissions for an article/user changes which could get fairly complicated.

David

---
 
Storing only the IDs is way cheaper than storing the entire resultset, and
I'd been thinking along the same lines. Getting a complete list of valid IDs
in the first place is turning out to be a different matter. The permissions
for article/user aren't that straight-forward, and in fact the most common
permissions are group/category and group/container, where an article can be
assigned to one or more category/containers. Using a temporary permission
table could be the solution. Thanks.

Cheers
Arno



-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



RE: [PHP] Paging and permissions

2011-02-09 Thread Ashley Sheridan
On Wed, 2011-02-09 at 13:27 +0200, Arno Kuhl wrote:

 Instead of serializing the articles, you only need their IDs. Using
 
 $sql .= ' where id in (' . implode(',', $ids) . ')';
 
 you can load the data for a page of results in a single query. Storing the
 IDs is much cheaper than the articles.
 
 If the permissions are fairly static (i.e. access for user X to article Y
 doesn't change every two minutes) you could create a calculated permission
 table as a many-to-many between user and article. Here's the logic flow for
 a query:
 
 1. Run the query to find matching article IDs
 2. Load permissions from table for all IDs
 3. For each article without a calculated permission, calculate it and insert
 a row (do a batch insert to save time)
 
 If you flag the query in the middle tier as having been processed as above,
 you can join to the calculated permissions each time you need another page.
 The downside is that the code that runs the queries has to operate in two
 modes: raw and joined to the permissions. If most users end up querying for
 all articles, the table could grow. Plus you need to purge rows any time the
 permissions for an article/user changes which could get fairly complicated.
 
 David
 
 ---
  
 Storing only the IDs is way cheaper than storing the entire resultset, and
 I'd been thinking along the same lines. Getting a complete list of valid IDs
 in the first place is turning out to be a different matter. The permissions
 for article/user aren't that straight-forward, and in fact the most common
 permissions are group/category and group/container, where an article can be
 assigned to one or more category/containers. Using a temporary permission
 table could be the solution. Thanks.
 
 Cheers
 Arno
 
 
 


You can get the same set of results with a join betwixt the tables, and
it should be slightly faster than creating a temporary table if you've
got your indexes right.

Thanks,
Ash
http://www.ashleysheridan.co.uk




[PHP] Paging and permissions

2011-02-08 Thread Arno Kuhl
I'm hoping some clever php gurus have been here before and are willing to
share some ideas.
 
I have a site where articles are assigned to categories in containers. An
article can be assigned to only one category per container, but one or more
containers. Access permissions can be set per article, per category and/or
per container, for one or more users and/or user groups. If an article is
assigned to 10 categories and only one of those has a permission denying
access, then the article can't be accessed even if browsing through one of
the other 9 categories. Currently everything works fine, with article titles
showing when browsing through category or search result lists, and a message
is displayed when the article is clicked if it cannot be viewed because of a
permission.
 
Now there's a requirement to not display the article title in category lists
and search results if it cannot be viewed. I'm stuck with how to determine
the number of results for paging at the start of the list or search. The
site is quite large (20,000+ articles and growing) so reading the entire
result set and sifting through it with permission rules for each request is
not an option. But it might be an option if done once at the start of each
search or list request, and then use that temporary modified result set for
subsequent requests on the same set. I thought of saving the set to a
temporary db table or file (not sure about overhead of
serializing/unserializing large arrays). A sizing exercise based on the
recordset returned for searches and lists shows a max of about 150MB for
20,000 articles and 380MB for 50,000 articles that needs to be saved
temporarily per search or list request - in the vast majority of cases the
set will be *much* smaller but it needs to cope with the worst case, and
still do so a year down the line.
 
All this extra work because I can't simply get an accurate number of results
for paging, because of permissions!
 
So my questions are:
1. Which is better (performance) for this situation: file or db?
2. How do I prepare a potentially very large data set for file or fast
writing to a new table (ie I obviously don't want to write it record by
record)
3. Are there any other alternatives worth looking at?
 
TIA
 
Cheers
Arno


Re: [PHP] Paging and permissions

2011-02-08 Thread David Hutto
On Tue, Feb 8, 2011 at 7:36 AM, Arno Kuhl ak...@telkomsa.net wrote:
 I'm hoping some clever php gurus have been here before and are willing to
 share some ideas.

 I have a site where articles are assigned to categories in containers. An
 article can be assigned to only one category per container, but one or more
 containers. Access permissions can be set per article, per category and/or
 per container, for one or more users and/or user groups. If an article is
 assigned to 10 categories and only one of those has a permission denying
 access, then the article can't be accessed even if browsing through one of
 the other 9 categories. Currently everything works fine, with article titles
 showing when browsing through category or search result lists, and a message
 is displayed when the article is clicked if it cannot be viewed because of a
 permission.

 Now there's a requirement to not display the article title in category lists
 and search results if it cannot be viewed. I'm stuck with how to determine
 the number of results for paging at the start of the list or search. The
 site is quite large (20,000+ articles and growing) so reading the entire
 result set and sifting through it with permission rules for each request is
 not an option. But it might be an option if done once at the start of each
 search or list request, and then use that temporary modified result set for
 subsequent requests on the same set. I thought of saving the set to a
 temporary db table or file (not sure about overhead of
 serializing/unserializing large arrays). A sizing exercise based on the
 recordset returned for searches and lists shows a max of about 150MB for
 20,000 articles and 380MB for 50,000 articles that needs to be saved
 temporarily per search or list request - in the vast majority of cases the
 set will be *much* smaller but it needs to cope with the worst case, and
 still do so a year down the line.

 All this extra work because I can't simply get an accurate number of results
 for paging, because of permissions!

 So my questions are:
 1. Which is better (performance) for this situation: file or db?

have you timed it yourself?

 2. How do I prepare a potentially very large data set for file or fast
 writing to a new table (ie I obviously don't want to write it record by
 record)

Even the db's cant insert as fast as the function is presented to it,
and it can respond, so again...timeit

 3. Are there any other alternatives worth looking at?

This is a question for the experienced php developers. But the above
is applicable.


 TIA

 Cheers
 Arno




-- 
According to theoretical physics, the division of spatial intervals as
the universe evolves gives rise to the fact that in another timeline,
your interdimensional counterpart received helpful advice from me...so
be eternally pleased for them.

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php



Re: [PHP] Paging and permissions

2011-02-08 Thread Ashley Sheridan
On Tue, 2011-02-08 at 14:36 +0200, Arno Kuhl wrote:

 I'm hoping some clever php gurus have been here before and are willing to
 share some ideas.
  
 I have a site where articles are assigned to categories in containers. An
 article can be assigned to only one category per container, but one or more
 containers. Access permissions can be set per article, per category and/or
 per container, for one or more users and/or user groups. If an article is
 assigned to 10 categories and only one of those has a permission denying
 access, then the article can't be accessed even if browsing through one of
 the other 9 categories. Currently everything works fine, with article titles
 showing when browsing through category or search result lists, and a message
 is displayed when the article is clicked if it cannot be viewed because of a
 permission.
  
 Now there's a requirement to not display the article title in category lists
 and search results if it cannot be viewed. I'm stuck with how to determine
 the number of results for paging at the start of the list or search. The
 site is quite large (20,000+ articles and growing) so reading the entire
 result set and sifting through it with permission rules for each request is
 not an option. But it might be an option if done once at the start of each
 search or list request, and then use that temporary modified result set for
 subsequent requests on the same set. I thought of saving the set to a
 temporary db table or file (not sure about overhead of
 serializing/unserializing large arrays). A sizing exercise based on the
 recordset returned for searches and lists shows a max of about 150MB for
 20,000 articles and 380MB for 50,000 articles that needs to be saved
 temporarily per search or list request - in the vast majority of cases the
 set will be *much* smaller but it needs to cope with the worst case, and
 still do so a year down the line.
  
 All this extra work because I can't simply get an accurate number of results
 for paging, because of permissions!
  
 So my questions are:
 1. Which is better (performance) for this situation: file or db?
 2. How do I prepare a potentially very large data set for file or fast
 writing to a new table (ie I obviously don't want to write it record by
 record)
 3. Are there any other alternatives worth looking at?
  
 TIA
  
 Cheers
 Arno


How are you determining (logically, not in code) when an article is
allowed to be read?

Assume an article on user permissions in mysql is in a container
called 'databases' and in a second one called 'security' and both
containers are in a category called 'computers'

Now get a user called John who is in a group called 'db admins' and that
group gives him permissions to view all articles in the 'databases'
container and any articles in any container in the 'computers' category.
Now assume John also has explicit user permissions revoking that right
to view the article in any container.

What I'm getting at is what's the order of privilege for rights? Do
group rights for categories win out over those for containers, or do
individual user rights trump all of them overall?

I think once that's figured out, a lot can be done inside the query
itself to minimise the impact on the script getting the results.

Thanks,
Ash
http://www.ashleysheridan.co.uk




Re: [PHP] Paging and permissions

2011-02-08 Thread Tolas Anon
On Tue, Feb 8, 2011 at 1:36 PM, Arno Kuhl ak...@telkomsa.net wrote:

 I'm hoping some clever php gurus have been here before and are willing to
 share some ideas.

 I have a site where articles are assigned to categories in containers. An
 article can be assigned to only one category per container, but one or more
 containers. Access permissions can be set per article, per category and/or
 per container, for one or more users and/or user groups. If an article is
 assigned to 10 categories and only one of those has a permission denying
 access, then the article can't be accessed even if browsing through one of
 the other 9 categories. Currently everything works fine, with article
 titles
 showing when browsing through category or search result lists, and a
 message
 is displayed when the article is clicked if it cannot be viewed because of
 a
 permission.

 Now there's a requirement to not display the article title in category
 lists
 and search results if it cannot be viewed. I'm stuck with how to determine
 the number of results for paging at the start of the list or search. The
 site is quite large (20,000+ articles and growing) so reading the entire
 result set and sifting through it with permission rules for each request is
 not an option. But it might be an option if done once at the start of each
 search or list request, and then use that temporary modified result set for
 subsequent requests on the same set. I thought of saving the set to a
 temporary db table or file (not sure about overhead of
 serializing/unserializing large arrays). A sizing exercise based on the
 recordset returned for searches and lists shows a max of about 150MB for
 20,000 articles and 380MB for 50,000 articles that needs to be saved
 temporarily per search or list request - in the vast majority of cases the
 set will be *much* smaller but it needs to cope with the worst case, and
 still do so a year down the line.

 All this extra work because I can't simply get an accurate number of
 results
 for paging, because of permissions!

 So my questions are:
 1. Which is better (performance) for this situation: file or db?
 2. How do I prepare a potentially very large data set for file or fast
 writing to a new table (ie I obviously don't want to write it record by
 record)
 3. Are there any other alternatives worth looking at?

 TIA

 Cheers
 Arno


Seems to me you make your setup needlessly complicated and restrictive.

And it's bad form to display articles in search results that aren't allowed
to be viewed..

Tell us more about why you want it to be so restrictive, i just don't
understand it.


Re: [PHP] Paging and permissions

2011-02-08 Thread David Harkness
On Tue, Feb 8, 2011 at 4:36 AM, Arno Kuhl ak...@telkomsa.net wrote:

 But it might be an option if done once at the start of each
 search or list request, and then use that temporary modified result set for
 subsequent requests on the same set.


Instead of serializing the articles, you only need their IDs. Using

$sql .= ' where id in (' . implode(',', $ids) . ')';

you can load the data for a page of results in a single query. Storing the
IDs is much cheaper than the articles.

If the permissions are fairly static (i.e. access for user X to article Y
doesn't change every two minutes) you could create a calculated permission
table as a many-to-many between user and article. Here's the logic flow for
a query:

1. Run the query to find matching article IDs
2. Load permissions from table for all IDs
3. For each article without a calculated permission, calculate it and insert
a row (do a batch insert to save time)

If you flag the query in the middle tier as having been processed as above,
you can join to the calculated permissions each time you need another page.
The downside is that the code that runs the queries has to operate in two
modes: raw and joined to the permissions. If most users end up querying for
all articles, the table could grow. Plus you need to purge rows any time the
permissions for an article/user changes which could get fairly complicated.

On Tue, Feb 8, 2011 at 5:17 AM, Tolas Anon tolas...@gmail.com wrote:

 And it's bad form to display articles in search results that aren't allowed
 to be viewed.


On Tue, Feb 8, 2011 at 4:36 AM, Arno Kuhl ak...@telkomsa.net wrote:

 Now there's a requirement to not display the article title in category
 lists
 and search results if it cannot be viewed.


:)

David