Re: [CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Erik Hatcher
Post-processing results as in #1 has big disadvantages: you can’t easily 
“fill back in”, because the docs that were removed may already have been 
accounted for in facet counts, for example.

#2 would be my recommendation as well.

There is an open issue to create an IP(v6) field type in Solr, with a patch 
there for IPv4 already.

Erik
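
To make the facet-count concern concrete, here is a minimal sketch in plain Python (no Solr involved). The documents, the `format` facet, and the `restricted` flag are all made up for illustration; the point is only that facets computed before the application drops documents no longer agree with what the user can see.

```python
from collections import Counter

# Hypothetical in-memory "index"; field names are illustrative only.
DOCS = [
    {"id": 1, "format": "thesis", "restricted": False},
    {"id": 2, "format": "thesis", "restricted": True},
    {"id": 3, "format": "article", "restricted": False},
    {"id": 4, "format": "article", "restricted": True},
    {"id": 5, "format": "article", "restricted": True},
]


def search(docs, allow_restricted, filter_in_query):
    """Return (hits, facet_counts) for a match-all query."""
    def visible(doc):
        return allow_restricted or not doc["restricted"]

    if filter_in_query:
        # Option 2: the restriction is part of the query, so the facets
        # are computed over the permitted documents only.
        hits = [d for d in docs if visible(d)]
        return hits, Counter(d["format"] for d in hits)

    # Option 1: facets are computed over everything that matched, and the
    # application drops the forbidden documents afterwards.
    facets = Counter(d["format"] for d in docs)
    hits = [d for d in docs if visible(d)]
    return hits, facets


post_hits, post_facets = search(DOCS, allow_restricted=False, filter_in_query=False)
query_hits, query_facets = search(DOCS, allow_restricted=False, filter_in_query=True)
```

With post-processing, the user sees two documents while the facets still report five; with the restriction inside the query, hits and facets agree.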



 On Jan 7, 2015, at 11:41 AM, Chad Mills cmmi...@rci.rutgers.edu wrote:
 
 Hello,
 
 Basically I have a Solr index where, at times, some of the results from a 
 query should be limited to a set of users based on their client's IP 
 address.  I have been thinking about accomplishing this in one of two ways.
 
 1) Post-processing the results for IP validity against an external data 
 source and dropping those results which are not valid.  That could leave 
 me with a partial result list that would need another query to fill back 
 in.  Say I want 10 results and end up dropping 2 of them; I then need to 
 fill those 2 back in by performing another query.
 
 2) Making the IP permission check part of the query.  Basically appending an 
 AND in the query on a field that stores the permissible IP addresses.  The 
 index field would be set to allow all IPs to access the result by default, 
 but at times can contain the allowable IP addresses or maybe even ranges 
 somehow.
 
 Are there some other ways to accomplish this I haven't considered?  Right now 
 #2 seems more desirable to me.
 
 Thanks in advance for your thoughts!
 
 --
 Chad Mills
 Digital Library Architect
 Ph: 848.932.5924
 Fax: 848.932.1386
 Cell: 732.309.8538
 
 Rutgers University Libraries
 Scholarly Communication Center
 Room 409D, Alexander Library
 169 College Avenue, New Brunswick, NJ 08901
 
 https://rucore.libraries.rutgers.edu/


Re: [CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Ethan Gruber
There are a few ways to do this, and yes, some version of #2 is desirable.
I think it may depend on how specific these IP addresses are. Do you
anticipate that one IP range may have access to X documents and a different
IP range may have access to Y documents, or will all IP ranges have access
to the same restricted documents (i.e., anyone on campus can access
everything)? The former scenario requires IPs to be stored in the Solr docs,
and the latter only requires a boolean field, e.g. restricted =
yes/no. In fact, in the former scenario, you'd probably want to associate
the IP range with a key of some sort, e.g.

In the schema, have field name=group

In your doc have the group field contain the value medical_school. Then
somewhere in your application (not stored and indexed in Solr), you can say
that medical_school carries the ranges 192.168.1.*, 192.168.2.*, etc.
That way, if the medical school picks up a new IP range or the range
changes, you can make a minor update to your application without having to
reindex content in Solr.

Ethan
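
Ethan's group-key suggestion could be sketched roughly as follows. This is only an illustration under stated assumptions: the `GROUP_NETWORKS` table, the extra `campus` group, and the `public` default value are hypothetical, and the wildcard ranges from the message are written as /24 networks. The only Solr-side assumptions are a `group` field on each document and the standard `fq` (filter query) request parameter.

```python
import ipaddress

# Hypothetical application-side table: group key -> networks.
# 192.168.1.* and 192.168.2.* from the message are expressed as /24s.
GROUP_NETWORKS = {
    "medical_school": ["192.168.1.0/24", "192.168.2.0/24"],
    "campus": ["192.168.0.0/16"],  # assumed extra group, for illustration
}

# Assumed value stored in the group field of unrestricted documents.
PUBLIC_GROUP = "public"


def groups_for_ip(client_ip):
    """Return every group key whose networks contain the client address."""
    addr = ipaddress.ip_address(client_ip)
    groups = [PUBLIC_GROUP]
    for group, cidrs in GROUP_NETWORKS.items():
        if any(addr in ipaddress.ip_network(cidr) for cidr in cidrs):
            groups.append(group)
    return groups


def group_filter_query(client_ip):
    """Build a filter-query string restricting results to permitted groups."""
    # Group keys here are plain identifiers; arbitrary keys would need
    # query-syntax escaping before being joined into the string.
    return "group:(" + " OR ".join(groups_for_ip(client_ip)) + ")"
```

The resulting string would be sent as an `fq` parameter alongside the user's query. When a range changes, only `GROUP_NETWORKS` is edited; nothing in the index has to be rebuilt.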




Re: [CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Terrell, Trey
This is the best way to do it in my mind, and we do pretty much exactly
this for our Hydra project. +1

Trey Terrell
Analyst Programmer
trey.terr...@oregonstate.edu
Oregon State University Libraries
Corvallis, OR 97331








Re: [CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Chad Mills
Ethan,

It could be a mixed bag, really: from on/off campus, to access from only 
certain buildings or laboratories inside buildings, down to individual IP 
addresses.

I do like the idea of abstracting the actual IP values using a key like you 
suggest.  The ranges can be managed (grown or shrunk) without having to reindex.

Thanks,
Chad




Re: [CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Erik Hatcher
I meant to include this link in my first reply, sorry: 
https://issues.apache.org/jira/browse/SOLR-6741 




[CODE4LIB] Restrict solr index results based on client IP

2015-01-07 Thread Chad Mills
Hello,

Basically I have a Solr index where, at times, some of the results from a query 
should be limited to a set of users based on their client's IP address.  I 
have been thinking about accomplishing this in one of two ways.

1) Post-processing the results for IP validity against an external data source 
and dropping those results which are not valid.  That could leave me with a 
partial result list that would need another query to fill back in.  Say I 
want 10 results and end up dropping 2 of them; I then need to fill those 2 
back in by performing another query.

2) Making the IP permission check part of the query.  Basically appending an 
AND in the query on a field that stores the permissible IP addresses.  The 
index field would be set to allow all IPs to access the result by default, but 
at times can contain the allowable IP addresses or maybe even ranges somehow.

Are there some other ways to accomplish this I haven't considered?  Right now 
#2 seems more desirable to me.

Thanks in advance for your thoughts!

--
Chad Mills
Digital Library Architect
Ph: 848.932.5924
Fax: 848.932.1386
Cell: 732.309.8538

Rutgers University Libraries
Scholarly Communication Center
Room 409D, Alexander Library
169 College Avenue, New Brunswick, NJ 08901

https://rucore.libraries.rutgers.edu/
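
For comparison, the refill step described under option 1 of the original post might look something like the sketch below. The `search_page` and `is_allowed` callables are hypothetical stand-ins for the search request and the external IP check; nothing here is a real Solr client API.

```python
def fetch_visible(search_page, is_allowed, wanted=10, page_size=10):
    """Collect up to `wanted` permitted docs, re-querying to fill gaps.

    search_page(start, rows) -> list of docs (hypothetical search call)
    is_allowed(doc) -> bool (hypothetical external permission check)
    Returns (docs, number_of_queries_issued).
    """
    visible = []
    start = 0
    queries = 0
    while len(visible) < wanted:
        page = search_page(start, page_size)
        queries += 1
        if not page:
            break  # index exhausted before the page could be filled
        visible.extend(doc for doc in page if is_allowed(doc))
        start += len(page)
    return visible[:wanted], queries


# Toy data: 30 docs, every fifth one is forbidden for this client.
docs = list(range(30))
results, issued = fetch_visible(
    search_page=lambda start, rows: docs[start:start + rows],
    is_allowed=lambda doc: doc % 5 != 0,
)
```

In this toy run the first page loses 2 of its 10 documents, so a second query is needed to fill the page. The offset of the user's next page also no longer lines up with the engine's offsets, which is part of why folding the check into the query is simpler.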