Re: [CODE4LIB] Restrict solr index results based on client IP
Post-processing results as in #1 has big disadvantages: you can’t easily “fill back in”, and the docs that were removed may already have been accounted for in facet counts, for example. #2 would be my recommendation as well.

There is an open issue to create an IP(v6) field type in Solr, with a patch there for IPv4 already.

Erik

On Jan 7, 2015, at 11:41 AM, Chad Mills cmmi...@rci.rutgers.edu wrote: [...]
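To make the facet-count problem concrete, here is a minimal Python sketch on toy data. Nothing in it talks to Solr: the documents, the `allowed` flag, and the `format` facet are all invented for illustration. It computes facet counts over everything that matched, as the search engine would, and then drops the restricted documents afterwards, as option #1 would.

```python
from collections import Counter

# Toy result set standing in for what a query matched (hypothetical data).
matched = [
    {"id": "a", "format": "thesis", "allowed": True},
    {"id": "b", "format": "thesis", "allowed": False},
    {"id": "c", "format": "article", "allowed": True},
    {"id": "d", "format": "thesis", "allowed": False},
]


def facet_counts(docs, field):
    """Count documents per value of the given field."""
    return Counter(doc[field] for doc in docs)


# The engine computes facets over every document that matched the query.
engine_facets = facet_counts(matched, "format")

# Post-processing: the application drops docs this client may not see.
visible = [doc for doc in matched if doc["allowed"]]
true_facets = facet_counts(visible, "format")

# The facet shown to the user no longer agrees with the visible results.
print(engine_facets["thesis"], true_facets["thesis"])
```

The page is also short by two documents, which is the "fill back in" problem from the original question.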
Re: [CODE4LIB] Restrict solr index results based on client IP
There are a few ways to do this, and yes, some version of #2 is desirable. I think it may depend on how specific these IP addresses are. Do you anticipate that one IP range may have access to X documents and a different IP range may have access to Y documents, or will all IP ranges have access to the same restricted documents (i.e., anyone on campus can access everything)? The former scenario requires IPs to be stored in the Solr docs, and the latter only requires a boolean field, e.g. restricted = yes/no.

In fact, in the former scenario, you'd probably want to associate the IP range with a key of some sort. E.g., in the schema, have a field with name=group, and in your doc have the group field contain the value medical_school. Then somewhere in your application (not stored and indexed in Solr), you can say that medical_school carries the ranges 192.168.1.*, 192.168.2.*, etc. That way, if the medical school picks up a new IP range or the range changes, you can make a minor update to your application without having to reindex content in Solr.

Ethan

On Wed, Jan 7, 2015 at 11:41 AM, Chad Mills cmmi...@rci.rutgers.edu wrote: [...]
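The group-key idea above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the `group` field name and `medical_school` value come from the message, while the `campus` group, the `public` default value, the range table, and both function names are hypothetical. The only Solr feature relied on is the standard `fq` (filter query) request parameter; sending the request is left out.

```python
import ipaddress

# Hypothetical group table kept in the application, not in Solr.
GROUP_RANGES = {
    "medical_school": ["192.168.1.0/24", "192.168.2.0/24"],
    "campus": ["192.168.0.0/16"],
}

# Assumed value stored in the group field of unrestricted documents.
PUBLIC_GROUP = "public"


def groups_for_ip(client_ip):
    """Return the public group plus every group whose ranges contain client_ip."""
    addr = ipaddress.ip_address(client_ip)
    groups = [PUBLIC_GROUP]
    for group, ranges in GROUP_RANGES.items():
        if any(addr in ipaddress.ip_network(r) for r in ranges):
            groups.append(group)
    return groups


def group_filter_query(client_ip, field="group"):
    """Build an fq string restricting results to the client's groups."""
    return "{}:({})".format(field, " OR ".join(groups_for_ip(client_ip)))


print(group_filter_query("192.168.1.25"))
print(group_filter_query("203.0.113.9"))
```

Passing the restriction as `fq` rather than post-filtering means Solr applies it before computing facet counts and paging, so result lists and facets stay consistent. Changing a range only means editing `GROUP_RANGES`; no document is reindexed.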
Re: [CODE4LIB] Restrict solr index results based on client IP
This is the best way to do it in my mind, and we do pretty much exactly this for our Hydra project. +1

Trey Terrell
Analyst Programmer
trey.terr...@oregonstate.edu
Oregon State University Libraries
Corvallis, OR 97331

On 1/7/15, 8:55 AM, Ethan Gruber ewg4x...@gmail.com wrote: [...]
Re: [CODE4LIB] Restrict solr index results based on client IP
Ethan,

It could be a mixed bag really, from on/off campus, to access limited to certain buildings or laboratories inside of buildings, down to individual IP addresses. I do like the idea of abstracting the actual IP values behind a key like you suggest. The ranges can then be managed, grown or shrunk, without having to reindex.

Thanks,
Chad

- Original Message -
From: Ethan Gruber ewg4x...@gmail.com
To: CODE4LIB@LISTSERV.ND.EDU
Sent: Wednesday, January 7, 2015 11:55:38 AM
Subject: Re: [CODE4LIB] Restrict solr index results based on client IP
[...]
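Since the ranges here could run from a whole campus down to a single address, the application-side table would need to accept several notations. A small Python sketch of one way to normalize them with the standard `ipaddress` module follows; the function name `parse_range` and the choice to support only trailing `*` wildcards (as in 192.168.1.*) are assumptions made for illustration.

```python
import ipaddress


def parse_range(spec):
    """Turn '192.168.1.*', CIDR notation, or a single IP into a network object."""
    if "*" in spec:
        octets = spec.split(".")
        fixed = [o for o in octets if o != "*"]
        # Only trailing wildcards on IPv4 patterns are handled in this sketch.
        if len(octets) != 4 or octets[:len(fixed)] != fixed:
            raise ValueError("unsupported wildcard pattern: " + spec)
        prefix = 8 * len(fixed)
        base = ".".join(fixed + ["0"] * (4 - len(fixed)))
        return ipaddress.ip_network("{}/{}".format(base, prefix))
    # A bare address becomes a single-host network (/32 for IPv4).
    return ipaddress.ip_network(spec)


lab = parse_range("192.168.1.*")
single_host = parse_range("10.1.2.3")
print(lab, single_host)
print(ipaddress.ip_address("192.168.1.77") in lab)
```

With every entry normalized to a network object, a building, a laboratory, and an individual workstation can all be checked with the same membership test.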
Re: [CODE4LIB] Restrict solr index results based on client IP
I meant to include this link in my first reply, sorry: https://issues.apache.org/jira/browse/SOLR-6741

On Jan 7, 2015, at 11:53 AM, Erik Hatcher erikhatc...@mac.com wrote: [...]
[CODE4LIB] Restrict solr index results based on client IP
Hello,

Basically I have a Solr index where, at times, some of the results from a query must be limited to a set of users based on their client's IP address. I have been thinking about accomplishing this in one of two ways.

1) Post-processing the results for IP validity against an external data source and dropping out those results which are not valid. That could leave me with a partial result list that would need another query to fill back in. Say I want 10 results and I end up dropping 2 of them; I then need to fill back in those 2 by performing another query.

2) Making the IP permission check part of the query. Basically appending an AND in the query on a field that stores the permissible IP addresses. The index field would be set to allow all IPs to access the result by default, but at times could contain the allowable IP addresses or maybe even ranges somehow.

Are there some other ways to accomplish this I haven't considered? Right now #2 seems more desirable to me.

Thanks in advance for your thoughts!

--
Chad Mills
Digital Library Architect
Ph: 848.932.5924
Fax: 848.932.1386
Cell: 732.309.8538
Rutgers University Libraries
Scholarly Communication Center
Room 409D, Alexander Library
169 College Avenue, New Brunswick, NJ 08901
https://rucore.libraries.rutgers.edu/