Thank you again Peter for guiding me through this.  This approach is a bit more 
complicated but makes good sense.  Now, my difficulty is in creating this 
filter.  I gave it a try and did not get very far.  This is how I’m approaching 
it:  I need to create a filter that overrides the behavior of the two methods 
in Filter:

public BitSet bits(IndexReader reader) throws IOException
AND
public DocIdSet getDocIdSet(IndexReader reader) throws IOException

Now, I don’t know what to do with IndexReader.  Somehow I need to find out 
whether each indexed item is an authorized item, kind of like I did in the 
jspui interface:

        if (AuthorizeManager.authorizeActionBoolean(context, item, Constants.READ))

I don’t see how to put these two ideas together.   How can I extract item_id 
from IndexReader to create the object Item to pass on to the AuthorizeManager 
object?
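
One way the two ideas might fit together, sketched without Lucene or DSpace 
on the classpath: the filter is not handed items, it is handed a reader over 
index documents, so it can walk the document numbers, read the item id stored 
with each document, and set a bit only for documents the current user may 
read. Everything here is a stand-in and an assumption. IndexView plays the 
role of IndexReader (maxDoc and isDeleted mirror the real methods; 
storedItemId is hypothetical and stands for reading a stored id field from 
the document), and the canRead predicate stands for Item.find followed by 
AuthorizeManager.authorizeActionBoolean.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;
import java.util.function.IntPredicate;

class AuthorizedBitsSketch {

    /** Stand-in for the parts of Lucene's IndexReader this sketch needs. */
    interface IndexView {
        int maxDoc();                   // one past the highest document number
        boolean isDeleted(int docNum);
        int storedItemId(int docNum);   // hypothetical: item id stored with the document
    }

    /** Sets one bit per index document that the current user may read. */
    static BitSet bits(IndexView reader, IntPredicate canRead) {
        BitSet allowed = new BitSet(reader.maxDoc());
        for (int doc = 0; doc < reader.maxDoc(); doc++) {
            if (reader.isDeleted(doc)) {
                continue;               // deleted documents never match
            }
            if (canRead.test(reader.storedItemId(doc))) {
                allowed.set(doc);
            }
        }
        return allowed;
    }

    /** Tiny in-memory index: array position = document number, value = item id. */
    static IndexView indexOf(final int... itemIds) {
        return new IndexView() {
            public int maxDoc() { return itemIds.length; }
            public boolean isDeleted(int docNum) { return false; }
            public int storedItemId(int docNum) { return itemIds[docNum]; }
        };
    }

    public static void main(String[] args) {
        // Pretend the current user may read items 11 and 13 only.
        Set<Integer> readable = new HashSet<>(Arrays.asList(11, 13));
        BitSet allowed = bits(indexOf(10, 11, 12, 13), id -> readable.contains(id));
        System.out.println(allowed);    // prints {1, 3}
    }
}
```

The bits are indexed by Lucene document number, not by item id, which is why 
the id has to be read back out of each document before the authorization 
check can run.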

Thank you!
Jose

From: Peter Dietz [mailto:[email protected]]
Sent: Monday, April 18, 2011 11:22 AM
To: Blanco, Jose
Cc: dspace-tech
Subject: Re: [Dspace-tech] question about item indexing.

Hi Jose,

The flow for when a search is performed is that a bunch of default parameters, 
along with whatever the user submits, get sent to the search result processor. 
The part you have customized in [jspui-api]/SimpleSearchServlet to weed out 
restricted items once the results come back is what gives you fewer than 20 
results. You could try fudging things to get at least 20 hits for the first 
page of search results, perhaps by changing one of the default parameters so 
that the search requests 100 items and then having your filter weed out the 
ones that can't be shown. However, this gets messy, as page 2 will need to 
know the offset, which isn't 0 or 20, but 20+number-user-was-able-to-view.
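
To make the offset problem concrete, here is a small sketch of the over-fetch 
idea using plain lists instead of DSpace classes. The names fillPage, Page and 
nextRawOffset are hypothetical. It scans raw hits from a raw offset, keeps the 
viewable ones until the page is full, and has to hand back the raw position 
where it stopped, because the next page must resume from there rather than 
from a multiple of the page size.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

class OverFetchSketch {

    /** One page of viewable hits plus the raw position where the scan stopped. */
    static final class Page {
        final List<Integer> hits = new ArrayList<>();
        int nextRawOffset;
    }

    /** Scans rawHits from rawOffset, keeping viewable ids until the page is full. */
    static Page fillPage(List<Integer> rawHits, int rawOffset, int pageSize, IntPredicate canView) {
        Page page = new Page();
        int pos = rawOffset;
        while (pos < rawHits.size() && page.hits.size() < pageSize) {
            int id = rawHits.get(pos++);
            if (canView.test(id)) {
                page.hits.add(id);
            }
        }
        page.nextRawOffset = pos;   // the UI must carry this to the next request
        return page;
    }

    public static void main(String[] args) {
        List<Integer> raw = new ArrayList<>();
        for (int i = 1; i <= 10; i++) {
            raw.add(i);
        }
        IntPredicate evenOnly = id -> id % 2 == 0;   // pretend odd ids are restricted

        Page first = fillPage(raw, 0, 3, evenOnly);
        System.out.println(first.hits + " next raw offset " + first.nextRawOffset);

        Page second = fillPage(raw, first.nextRawOffset, 3, evenOnly);
        System.out.println(second.hits + " next raw offset " + second.nextRawOffset);
    }
}
```

With a page size of 3, the second page starts at raw offset 6, not 3, and 
jumping straight to an arbitrary page would mean rescanning from the start.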

So, to accomplish your task, you'll likely have to get one level deeper into 
the matrix of DSpace code and mess with the dspace-api, namely DSQuery 
<https://github.com/DSpace/DSpace/blob/master/dspace-api/src/main/java/org/dspace/search/DSQuery.java#L126>, 
and have your UI (SimpleSearchServlet) pass another parameter to the query, 
such as a boolean onlyAuthorizedItems=true. In DSQuery, you'll need to 
intercept the request when onlyAuthorizedItems is true and add a Lucene Filter 
<http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Filter.html> 
that restricts results to items with proper DSpace permissions. Then have all 
the places where it performs the Lucene search 
<http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Searcher.html#search(org.apache.lucene.search.Query,%20org.apache.lucene.search.Filter,%20org.apache.lucene.search.Sort)> 
add the filter. If your filter is null, the search ignores it; otherwise it 
applies the filter and weeds out the unviewable hits.

It's heavier lifting than the jspui-api changes, but still within reach.


Peter Dietz


On Fri, Apr 15, 2011 at 5:13 PM, Blanco, Jose <[email protected]> wrote:
Peter, thank you for responding to my question.  I put in the filtering you 
described, and voila, the restricted items are not showing up, but the paging 
is all wrong, as you predicted.  I have been trying to add some code that 
iterates through the list of returned items, but I don’t think the entire list 
is available.  If I’m understanding the code correctly, when someone performs 
a search (and, say, DSpace is configured to return only 20 items), Lucene 
returns only 20 items at a time.  Is that correct?  Below is the code that I 
would like to put in SimpleSearchServlet.java to compute the actual total and 
then set the total in qResults.  I think if I can do that, it will all work:

int TotalcollCount = 0;
int TotalcommCount = 0;
int TotalitemCount = 0;

for (int i = 0; i < qResults.getHitCount(); i++)
{
    // This is where I run into problems.  It looks like qResults only has 20
    // at a time until the last page, so this errors out after it tries to
    // grab more than 20.
    Integer myType = qResults.getHitTypes().get(i);

    // add the handle to the appropriate lists
    switch (myType.intValue())
    {
    case Constants.ITEM:
        // Of course this will not work either because there are only 20 in
        // qResults.
        Item item = Item.find(context, qResults.getHitIds().get(i));
        if (AuthorizeManager.authorizeActionBoolean(context, item, Constants.READ))
        {
            TotalitemCount++;
        }
        break;

    case Constants.COLLECTION:
        TotalcollCount++;
        break;

    case Constants.COMMUNITY:
        TotalcommCount++;
        break;
    }
}

int TotalCount = TotalitemCount + TotalcollCount + TotalcommCount;
qResults.setHitCount( TotalCount );
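
The failure noted in the comments appears to come from the loop bound: the 
hit count is the total number of matches, while the id and type lists hold 
only the current page. A minimal sketch with plain lists in place of 
qResults (countReadable and the sample values are hypothetical) bounds the 
loop by the list that is actually held. Note what it can and cannot do: it 
counts readable hits on the page, so it does not by itself give the true 
total across all pages.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

class PageCountSketch {

    /** Counts readable ids among the hits actually held in the list. */
    static int countReadable(List<Integer> pageIds, IntPredicate canRead) {
        int count = 0;
        // Bound by the list size, not by the total hit count of the query.
        for (int i = 0; i < pageIds.size(); i++) {
            if (canRead.test(pageIds.get(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int totalHitCount = 57;                            // what the query reports overall
        List<Integer> pageIds = Arrays.asList(4, 8, 15);   // what this page actually holds
        int readable = countReadable(pageIds, id -> id != 8);
        System.out.println(readable + " readable on this page, " + totalHitCount + " raw hits in total");
    }
}
```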

Thank you! Jose

From: Peter Dietz [mailto:[email protected]]
Sent: Tuesday, March 29, 2011 2:37 PM
To: Blanco, Jose
Cc: dspace-tech
Subject: Re: [Dspace-tech] question about item indexing.

Hi Jose,


This actually sounds like a fun project to solve with Discovery; they look 
like a good match.

Regarding the traditional Lucene search index, my understanding is that the 
index is there to provide fast queries on the data. In your case you want to 
restrict what gets returned to the user based on authentication. So Lucene 
will give you all the fish in the ocean (all hits for the query); you just 
want the few fish that are safe enough for this user to eat (just the hits 
that user has authorization to read). It's doable, and without a monumental 
restructuring; it might, however, be slightly messier than just giving the 
user results they can't view. On second thought, giving the user hits they 
can't view is also kinda messy. But it will most likely involve some 
refactoring.

You use JSPUI, so I'll look at that code.
In dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/servlet/SimpleSearchServlet.java 
<http://scm.dspace.org/svn/repo/dspace/trunk/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/servlet/SimpleSearchServlet.java>, 
when it parses through each hit result, it determines whether the hit is an 
item and then increments itemCount. Instead of that, you could perform an 
authorization check; see this quick and dirty example 
<https://gist.github.com/892895/0dcc440abd005df01cec7b40efb5538c7f992786>. 
Everywhere it handles Constants.ITEM you could do the authz check. The weird 
part is that if the user specified they want to paginate at 20 hits per page, 
the query would return fewer than 20 hits once the restricted ones are 
filtered out.
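
In outline, that per-hit check could look like the sketch below. It is not 
the gist's code. The Type enum and Hit class are stand-ins for the integer 
constants and the id and type lists in the servlet, and the canReadItem 
predicate stands for the AuthorizeManager call. Collections and communities 
pass through untouched; an item is kept only if the user may read it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.IntPredicate;

class PostFilterSketch {

    /** Stand-in for the Constants.ITEM / COLLECTION / COMMUNITY type codes. */
    enum Type { ITEM, COLLECTION, COMMUNITY }

    static final class Hit {
        final Type type;
        final int id;

        Hit(Type type, int id) {
            this.type = type;
            this.id = id;
        }
    }

    /** Keeps collections and communities; keeps an item only if the user may read it. */
    static List<Hit> viewable(List<Hit> page, IntPredicate canReadItem) {
        List<Hit> kept = new ArrayList<>();
        for (Hit hit : page) {
            if (hit.type != Type.ITEM || canReadItem.test(hit.id)) {
                kept.add(hit);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<Hit> page = Arrays.asList(
                new Hit(Type.ITEM, 1),
                new Hit(Type.ITEM, 2),        // pretend this one is restricted
                new Hit(Type.COLLECTION, 2),
                new Hit(Type.ITEM, 3));
        List<Hit> kept = viewable(page, id -> id != 2);
        System.out.println(kept.size() + " of " + page.size() + " hits shown");
    }
}
```

The page comes back shorter than what was requested, which is the pagination 
oddity described above.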

Peter Dietz

On Tue, Mar 29, 2011 at 11:57 AM, Blanco, Jose <[email protected]> wrote:
We would like the results of a search to depend on the user logged in to the 
system.  For example,

Person:  John Smith, with item: The works of John Smith
Has sole rights to the item (metadata and bitstream)

So if an anonymous user is using the system and searches for 'The works of 
John Smith', he does not get this item as part of the results, but if, say, 
John Smith logs in and searches for 'The works of John Smith', he gets the 
item in the results.  Presently, both users get the item in their results, but 
only the one who can authenticate into the system can see the work.  In 
looking at the code a bit, it seems that it would take a monumental change in 
the code to get this functionality, since the search indices are created for 
all the users of Deep Blue, not just one at a time.  I just want to verify 
that this is true.

Thank you!
Jose

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

