[CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Hello, list, Do you know of any digital library systems that can search within documents (e.g. PDFs) and handle access restrictions (e.g. DRM)? Have any of you compared these DL systems? Thanks for any information! Sophie
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
I would think DSpace, Fedora, and EPrints. DSpace is fairly easy to implement and has embargo support in 1.6 (https://wiki.duraspace.org/display/DSTEST/Embargo). I have an article comparing DSpace and Fedora, but it was written 6 years ago. DSpace has not changed much since then, but Fedora is a different story. Yan
In reply to: Deng, Sai, Wednesday, October 20, 2010 10:33 AM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Maybe my question was not clear. We are looking for a system that can search the full text of the deposited documents; these are licensed materials, so we'll also need access restrictions. We use DSpace, but I don't think DSpace does full text search, e.g. it doesn't search content in bitstreams (PDFs, PPTs...). Any suggestions? Thanks! Sophie
In reply to: Han, Yan [h...@u.library.arizona.edu], Wednesday, October 20, 2010 3:25 PM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Deng, Sai sai.d...@wichita.edu wrote:
> Do you know the Digital Library systems which can search within the documents (e.g. PDFs) and handle access restrictions (e.g. DRM)?
Not sure what you mean by "handle access restrictions". Do you mean it can index the documents put into it even if they have DRM encumbrances? UpLib has search within the documents -- if you search for a word or phrase, it shows you all the documents which match, but also all the pages in each document which match. It supports a wide variety of document formats, from JPEG2000 to PDF to PowerPoint. But as far as I know it doesn't deal with DRM restrictions. Bill
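The kind of search described above, where a query returns the matching documents and also the matching pages within each one, can be illustrated with a small page-level inverted index. This is only a sketch of the general technique in Python, not UpLib's actual implementation; the function names and sample documents are made up for illustration, and the page text is assumed to have been extracted already by some other tool.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map each lowercased word to the set of (doc_id, page_number) pairs
    where it occurs. `documents` maps a document id to a list of page texts."""
    index = defaultdict(set)
    for doc_id, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"\w+", text.lower()):
                index[word].add((doc_id, page_number))
    return index


def search(index, word):
    """Return {doc_id: [page numbers]} for every page containing `word`."""
    hits = defaultdict(list)
    for doc_id, page_number in sorted(index.get(word.lower(), ())):
        hits[doc_id].append(page_number)
    return dict(hits)


# Hypothetical sample data: two documents, two pages each.
docs = {
    "thesis.pdf": ["Digital libraries and embargo policy",
                   "Access control in repositories"],
    "slides.ppt": ["Full text search", "Embargo and access"],
}
index = build_index(docs)
print(search(index, "embargo"))
```

A metadata-only search would index just titles, authors, and abstracts; the difference here is that every page of the extracted text goes into the index.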
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
For access restriction, I mean we would like to have certain documents open only to certain communities (UpLib cannot do that, right?). I don't know how DRM affects file indexing. On second thought, I searched for DSpace full text search and found this: https://wiki.duraspace.org/display/DSPACE/Configure+full+text+indexing However, I haven't seen any instance that shows full text search results the way vendor databases do. Any idea which system might be good/best for search within documents and DRM? Thank you for the reply! Sophie
In reply to: Bill Janssen [jans...@parc.com], Wednesday, October 20, 2010 4:01 PM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Deng, Sai sai.d...@wichita.edu wrote:
> For access restriction, I mean we would like to have certain documents open only to certain communities (UpLib cannot do that, right?).
OK, that's not what I typically think of when I hear DRM. "Access control" is (I think) the way it's usually put. No, UpLib has no built-in access control system, though the hooks are there, and I know that some have used them to do access control. I know of one UpLib application which requires incoming connections to provide a client certificate, which it uses to give different clients different access rights. Probably overkill for most uses. You'd probably want to do an application-specific Web UI, though -- you could put the access restrictions there. I recently saw a Tomcat app which uses the UpLib Java client-side library to search for documents, then provides a completely custom UI.
> On second thought, I searched for DSpace full text search and found this: https://wiki.duraspace.org/display/DSPACE/Configure+full+text+indexing However, I haven't seen any instance which shows the full text search results as I would see from vendor databases. Any idea on what system might be good/best for search within documents and DRM?
How about Greenstone? Bill
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
DSpace does full-text search; you need to turn it on in the configuration file. See UAL http://arizona.openrepository.com/arizona/ Yan
In reply to: Deng, Sai, Wednesday, October 20, 2010 2:14 PM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
How can people tell that it searches content in bitstreams (PDFs, Word docs)? It looks like it only searches metadata. Thanks.
In reply to: Han, Yan [h...@u.library.arizona.edu], Wednesday, October 20, 2010 4:43 PM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Thanks for the information! Greenstone has full text search, but I heard that its access control is much weaker than DSpace's. Will it be able to open certain documents only to certain people or certain departments? Thanks. Sophie
In reply to: Bill Janssen [jans...@parc.com], Wednesday, October 20, 2010 4:31 PM
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Sophie, It might help some of us on the list to understand what types of access control you need if you can describe some of the ways that the allowed users (people and/or departments, to use your examples) will identify themselves. Will they have already logged into the system with a local (to the system) account, or with a campus account that knows that they are part of a specific department? Will they need to log into the system when they request to see a specific document? Will where they are sitting matter (i.e., restricted by IP address)? Mark
Mark Jordan Head of Library Systems W.A.C. Bennett Library, Simon Fraser University Burnaby, British Columbia, V5A 1S6, Canada Voice: 778.782.5753 / Fax: 778.782.3023 / Skype: mark.jordan50 mjor...@sfu.ca
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Thanks for the questions! We don't have a clear idea yet; we are still looking for a system. The basic idea is that we'll deposit some licensed materials for a department and open them only to that group. I guess a local account would be OK; of course, if a campus account can be recognized, that's better. They'll need to log in to see the document if it's not IP restricted, right? IP restriction might not be the best way, since faculty members will not always be in their departments. Sophie
In reply to: Mark Jordan [mjor...@sfu.ca], Wednesday, October 20, 2010 5:08 PM
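The two options discussed here, a login that carries group membership and an IP restriction, can be combined in a single access check. The following Python sketch is only an illustration under stated assumptions: the `Document` class, the `may_view` function, the group name, and the network range are all hypothetical and do not come from DSpace, Greenstone, or UpLib.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Document:
    title: str
    allowed_groups: set = field(default_factory=set)      # e.g. {"physics"}
    allowed_networks: list = field(default_factory=list)  # e.g. ["10.20.0.0/16"]


def may_view(doc, user_groups=None, client_ip=None):
    """Allow access if the logged-in user's groups overlap the document's
    groups, or if the request comes from one of the allowed networks."""
    if user_groups and doc.allowed_groups & set(user_groups):
        return True
    if client_ip:
        address = ipaddress.ip_address(client_ip)
        return any(address in ipaddress.ip_network(net)
                   for net in doc.allowed_networks)
    return False


# Hypothetical licensed document restricted to one department.
handbook = Document("Licensed handbook", {"physics"}, ["10.20.0.0/16"])

# A faculty member working off campus is admitted by login, not by address.
print(may_view(handbook, user_groups={"physics"}, client_ip="203.0.113.5"))
```

Checking the group first means an off-campus faculty member with a valid login is not locked out, which addresses the concern about relying on IP restriction alone.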
Re: [CODE4LIB] DL Systems (allowing search within documents and access restrictions)?
Deng, Sai sai.d...@wichita.edu wrote:
> Thanks for the questions! We don't have a clear idea yet and we are looking for a system now. The basic idea is that we'll deposit some licensed materials for some department and open them only to that group. I guess a local account would be ok, of course, if a campus account can be recognized, that's better.
In which case you'll need some access control system which can understand your campus login system.
> They'll need to log in to see the document if it's not ip restricted, right? IP restriction might not be the best way since faculty members will not always be in their departments.
Will you let them search for documents, and show the search results, even if they can't retrieve the full document, as the ACM Digital Library does? Or do search results have to be filtered, too? How many different access groups will you have? One per department? One per licensed set of material? And what's the approximate size of each of those numbers? A simple thing to do would be to install something like DocuShare, which already does all this stuff and is built on top of Autonomy, one of the better suites for extracting and indexing content from documents. Bill
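The two result-handling policies raised in this message (show every hit but block retrieval, or filter the hit list itself) differ by a single switch. A minimal Python sketch, assuming the search engine returns each hit together with the groups allowed to open it; the function name, the hit format, and the sample titles are hypothetical:

```python
def visible_results(hits, user_groups, filter_hits=True):
    """`hits` is a list of (title, allowed_groups) pairs from the search engine.

    filter_hits=True  -> drop hits the user may not open.
    filter_hits=False -> list every hit, flagging which ones can be opened.
    """
    results = []
    for title, allowed_groups in hits:
        can_open = bool(set(user_groups) & set(allowed_groups))
        if filter_hits and not can_open:
            continue
        results.append({"title": title, "can_open": can_open})
    return results


# Hypothetical hits for one query, each licensed to a different department.
hits = [
    ("Quantum optics handbook", {"physics"}),
    ("Medieval trade routes", {"history"}),
]

print(visible_results(hits, {"physics"}, filter_hits=True))
print(visible_results(hits, {"physics"}, filter_hits=False))
```

With filtering on, users never learn that restricted material exists; with it off, they see the title but would be refused at retrieval time.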