Re: [Dspace-tech] Achieving security by obscurity
Scott Phillips wrote:
> We wanted to expose DSpace's metadata in a way that can be used by other applications. They are nice RESTful URLs that are actionable and easily predictable.

In most cases, that should be seen as a good thing. I can see that in certain situations you wouldn't want to expose that publicly, though.

So, in the interests of furthering our collective knowledge, I'll pose the question for tackling it another way: is it possible to restrict what has access to the /metadata URLs (e.g. by IP address)?

If you can restrict part of the URL space so that it is only accessible from an internal Cocoon resolution and/or from specific IP addresses (e.g. 127.0.0.1), then you can prevent leakage of [sensitive] information, whilst still allowing the internal processes, your own debugging, and possibly even 'trusted partners' access to the data they need.

G
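For the "internal Cocoon resolution" half of that question, Cocoon 2.1 sitemaps let a pipeline be marked internal-only, so that it answers cocoon:/ requests from other pipelines but not requests arriving over HTTP. The fragment below is only a sketch under that assumption. The matcher and generator are copied from the pipeline fragment quoted later in this thread; whether Manakin's themes, and any client-side JavaScript that fetches these URLs, still work once the public URLs disappear has not been tested here. Restricting by IP address is not shown, since that would probably be configured in the servlet container or a front-end web server rather than in the sitemap.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: a fragment of sitemap.xmap, not a complete sitemap.
     Generators and serializers are assumed to be declared elsewhere,
     as in the stock Manakin sitemap. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <!-- internal-only="true": reachable through cocoon:/ calls made by
         other pipelines, but not directly by external HTTP clients. -->
    <map:pipeline internal-only="true">
      <map:match pattern="metadata/handle/*/*/**">
        <map:generate type="DSpaceMETSGenerator">
          <map:parameter name="handle" value="{1}/{2}"/>
          <map:parameter name="extra" value="{3}"/>
        </map:generate>
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

If any theme fetches /metadata/... from the browser, as Scott notes some do, an internal-only pipeline would break it, so this suits installations where all metadata access happens server-side.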
Re: [Dspace-tech] Achieving security by obscurity
On Wed, Jul 9, 2008 at 8:42 AM, Gary McGath [EMAIL PROTECTED] wrote:
> This still leaves the problem of other ways the metadata can be viewed. For instance, we've disabled METS data from the OAI provider because of the same problem. I'm considering patching InstallItem.java so that it never puts the provenance element into the metadata in the first place. Would this cause any problems?

Technically? Probably not. Operationally? Not unlikely.

I could be wrong about this and welcome correction, but I don't think depositor information on an item is stored anyplace but the provenance field. (The system does know who has add/remove/change rights on the said item, but that's not necessarily the same thing.) This means that if you remove DC provenance, *you don't know who clicked through the license*. Somebody check me on this, please?

Truthfully, my librarian's teeth itch at the idea of removing provenance information; we have it drummed into our heads in library school that provenance is one of the bits of information you simply do not discard if you are a responsible content manager. But that's between you and your librarians. The licensing issue strikes me as more urgent.

I think, alongside others in this thread, that the fix may be doing something more responsible with provenance/history information than leaving it in an (editable! easily falsifiable!) Dublin Core field. (I'd rather see it go into a separate table in the database, though I suspect that's a temporary fix at best.) The problems noted about exposing email addresses only strengthen this sense.

Dorothea

-- 
Dorothea Salo [EMAIL PROTECTED]
Digital Repository Librarian AIM: mindsatuw
University of Wisconsin
Rm 218, Memorial Library
(608) 262-5493
Re: [Dspace-tech] Achieving security by obscurity
> I could be wrong about this and welcome correction, but I don't think depositor information on an item is stored anyplace but the provenance field. (The system does know who has add/remove/change rights on the said item, but that's not necessarily the same thing.) This means that if you remove DC provenance, *you don't know who clicked through the license*. Somebody check me on this, please?

Never mind, I'm wrong. There's a submitter_id in the item table.

Dorothea
Re: [Dspace-tech] Achieving security by obscurity
I've always found it a bit odd to hide the provenance metadata on DSpace items. I think this metadata came into existence as a weak attempt to introduce some history on the creation of the item. Likewise, I've not been very concerned about its exposure (albeit with the submitter's email address embedded there).

Ideally, the user info in the provenance metadata should contain obfuscated email addresses, like the kind that are allowed as SHA signatures in FOAF persons. Attaching that signature as metadata to the eperson and allowing lookup via SHA signatures would allow the admins to get back to the user that submitted or approved the item from the metadata, and thus there would be no need to hide the actual metadata fields from the public.

I'd like to eventually see this metadata stored differently so that METS packages (or ORE ReMs) can easily be exposed without too much concern for private metadata.

The metadata/ space is currently being used as a catch-all for exposing various types of metadata to the user (at least in my usage). It shouldn't be difficult to block it behind authentication or drop it entirely using the Cocoon sitemap configuration in dspace/modules/xmlui/src/main/webapp/sitemap.xmap. You'll need to obtain a copy from dspace-xmlui-webapp/src/main/webapp/sitemap.xmap.

-Mark

On Jul 8, 2008, at 9:43 AM, Gary McGath wrote:
> In order to prevent provenance metadata from being easily reached by users, and to make embargoing watertight, I'd like to disable the metadata URLs in Manakin. (I've disabled METS output in the OAI provider for the same reason.) Since they're useful for debugging purposes, it would be nice to have them available, so changing the path component from "metadata" to something else seems like a useful approach.
>
> To this end, I edited dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/sitemap.xmap and changed the relevant map:match element. When I restarted DSpace, though, it hung. Changing it back let DSpace run normally.
>
> Is there something else that needs to be changed in order to disable or modify the metadata URLs?
>
> -- 
> Gary McGath
> Digital Library Software Engineer
> Harvard University Library Office for Information Systems
> http://hul.harvard.edu/~gary/index.html
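Mark's mention of SHA signatures in FOAF persons presumably refers to the foaf:mbox_sha1sum property, which holds the SHA-1 digest of a person's mailbox URI (including the "mailto:" prefix) instead of the address itself. The RDF/XML below is a minimal sketch of what such a description looks like; the digest is a placeholder, not a computed value, and how DSpace would store or look up such a value on an eperson is not something this thread specifies.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <!-- Placeholder, not a real digest. In practice this would be the
         40-character hex SHA-1 of the full mailbox URI, e.g. of the
         string "mailto:" followed by the submitter's address. -->
    <foaf:mbox_sha1sum>SHA1_HEX_DIGEST_OF_MAILTO_URI</foaf:mbox_sha1sum>
  </foaf:Person>
</rdf:RDF>
```

An unsalted hash only obscures an address: anyone who already has a candidate address can hash it and compare, so this hides addresses from casual harvesting rather than from a determined lookup.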
Re: [Dspace-tech] Achieving security by obscurity
Mark Diggory wrote:
> Likewise, I've not been very concerned about its exposure (albeit with the submitter's email address embedded there). Ideally, the user info in the provenance metadata should contain obfuscated email addresses, like the kind that are allowed as SHA signatures in FOAF persons. Attaching that signature as metadata to the eperson and allowing lookup via SHA signatures would allow the admins to get back to the user that submitted or approved the item from the metadata, and thus there would be no need to hide the actual metadata fields from the public.

Whatever the ideal may be, the reality is that there are e-mail addresses in the provenance data. I consider it the basic responsibility of any web application not to publish people's e-mail addresses without their consent.

The second problem, peculiar to us because of the embargo requirement, is the inclusion of the bitstream file names, which make it easy to access embargoed files.

On both counts, the exposure of provenance metadata is a serious problem and can't wait for 1.5.1.

> The metadata/ space is currently being used as a catch-all for exposing various types of metadata to the user (at least in my usage). It shouldn't be difficult to block it behind authentication or drop it entirely using the Cocoon sitemap configuration in dspace/modules/xmlui/src/main/webapp/sitemap.xmap. You'll need to obtain a copy from dspace-xmlui-webapp/src/main/webapp/sitemap.xmap.

As I said in my post, that was what I did. My first attempt was to change "metadata" to some other string of letters (let's say "detamata"). If I do that, Manakin hangs. I then refined the attempt to redirect metadata/** to something else. Same result.

-- 
Gary McGath
Digital Library Software Engineer
Harvard University Library Office for Information Systems
Re: [Dspace-tech] Achieving security by obscurity
On Jul 8, 2008, at 12:35 PM, Gary McGath wrote:
> Whatever the ideal may be, the reality is that there are e-mail addresses in the provenance data. I consider it the basic responsibility of any web application not to publish people's e-mail addresses without their consent. The second problem, peculiar to us because of the embargo requirement, is the inclusion of the bitstream file names, which make it easy to access embargoed files. On both counts, the exposure of provenance metadata is a serious problem and can't wait for 1.5.1.

I tend to agree.

> As I said in my post, that was what I did. My first attempt was to change "metadata" to some other string of letters (let's say "detamata"). If I do that, Manakin hangs. I then refined the attempt to redirect metadata/** to something else. Same result.

That's odd, because there's an /internal/metadata/... internal pipeline that all this is supposed to be going through; I wonder if that's not the case, then. Do your hangs eventually time out? Can you get any logging out of xmlui/WEB-INF/logs on this issue? That might expose the pipeline that's attempting to access this via /metadata rather than /internal/metadata.

-Mark
Re: [Dspace-tech] Achieving security by obscurity
On Tue, 2008-07-08 at 12:43 -0400, Gary McGath wrote:
> In order to prevent provenance metadata from being easily reached by users, and to make embargoing watertight, I'd like to disable the metadata URLs in Manakin. (I've disabled METS output in the OAI provider for the same reason.) Since they're useful for debugging purposes, it would be nice to have them available, so changing the path component from "metadata" to something else seems like a useful approach.
>
> To this end, I edited dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/sitemap.xmap and changed the relevant map:match element. When I restarted DSpace, though, it hung. Changing it back let DSpace run normally.
>
> Is there something else that needs to be changed in order to disable or modify the metadata URLs?

The problem with disabling the pipelines which handle those URLs is that other parts of Manakin depend on them. The theme XSLTs which display item metadata on item pages, for instance, dereference those URLs to obtain the item data. :-)

What you should be attempting is to patch the pipelines which handle those URLs, so that the sensitive information is filtered out. Take a look at this pipeline fragment, for instance:

    <map:match pattern="metadata/handle/*/*/**">
      <map:generate type="DSpaceMETSGenerator">
        <map:parameter name="handle" value="{1}/{2}"/>
        <map:parameter name="extra" value="{3}"/>
      </map:generate>
      <map:serialize type="xml"/>
    </map:match>

If you insert a

    <map:transform src="exclude-sensitive-metadata.xsl"/>

element between the generator and the serializer, then you can have complete control over what metadata Manakin exposes, e.g. by removing any dc.description.provenance fields which include an @ character.

-- 
Conal Tuohy
New Zealand Electronic Text Centre
www.nzetc.org
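Put together, Conal's suggestion would leave the pipeline looking roughly like the sketch below. The wrapper elements and the namespace declaration are added here only so the fragment is well-formed on its own; in a real sitemap.xmap the match would stay where it already is, and exclude-sensitive-metadata.xsl (a file that does not ship with DSpace and would have to be written) is assumed to sit next to the sitemap so that the relative src resolves.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: shows where the transform step goes, not a full sitemap. -->
<map:sitemap xmlns:map="http://apache.org/cocoon/sitemap/1.0">
  <map:pipelines>
    <map:pipeline>
      <map:match pattern="metadata/handle/*/*/**">
        <!-- 1. Generate the METS document for the requested handle. -->
        <map:generate type="DSpaceMETSGenerator">
          <map:parameter name="handle" value="{1}/{2}"/>
          <map:parameter name="extra" value="{3}"/>
        </map:generate>
        <!-- 2. New step: filter out sensitive fields before anything is
                serialized. The stylesheet name is the one Conal proposed. -->
        <map:transform src="exclude-sensitive-metadata.xsl"/>
        <!-- 3. Serialize the filtered document as XML. -->
        <map:serialize type="xml"/>
      </map:match>
    </map:pipeline>
  </map:pipelines>
</map:sitemap>
```

Because the URL pattern is unchanged, theme XSLTs that dereference metadata/handle/... keep working and simply receive the filtered document.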
Re: [Dspace-tech] Achieving security by obscurity
On Jul 8, 2008, at 3:37 PM, Conal Tuohy wrote:
> The problem with disabling the pipelines which handle those URLs is that other parts of Manakin depend on them. The theme XSLTs which display item metadata on item pages, for instance, dereference those URLs to obtain the item data. :-)
>
> What you should be attempting is to patch the pipelines which handle those URLs, so that the sensitive information is filtered out. Take a look at this pipeline fragment, for instance:
>
>     <map:match pattern="metadata/handle/*/*/**">
>       <map:generate type="DSpaceMETSGenerator">
>         <map:parameter name="handle" value="{1}/{2}"/>
>         <map:parameter name="extra" value="{3}"/>
>       </map:generate>
>       <map:serialize type="xml"/>
>     </map:match>
>
> If you insert a <map:transform src="exclude-sensitive-metadata.xsl"/> element between the generator and the serializer, then you can have complete control over what metadata Manakin exposes, e.g. by removing any dc.description.provenance fields which include an @ character.

That's a pretty slick solution, Conal!

I'm still tempted to suggest that those XSLTs should be using /internal/metadata/handle over /metadata/handle... Scott, can you comment on why this isn't always the case?

-Mark
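As an illustration of what exclude-sensitive-metadata.xsl might contain, here is a minimal XSLT 1.0 sketch. It assumes the generator emits item metadata as DIM fields, i.e. dim:field elements carrying mdschema, element and qualifier attributes in the namespace shown; check this against the actual output of a /metadata/handle/... URL before relying on it. Everything is copied through unchanged except dc.description.provenance fields whose text contains an @ character.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">

  <!-- Identity template: by default, copy every node and attribute as-is. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Assumption: provenance appears as a DIM field with these attributes.
       An empty template means matching fields produce no output. The
       contains() test limits removal to values holding an '@' character;
       delete that predicate to drop every provenance field instead. -->
  <xsl:template match="dim:field[@mdschema='dc']
                                [@element='description']
                                [@qualifier='provenance']
                                [contains(., '@')]"/>

</xsl:stylesheet>
```

For the embargo case Gary describes, where provenance text also contains bitstream file names, filtering on @ alone may not be enough, and dropping all provenance fields is the more conservative variant.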
Re: [Dspace-tech] Achieving security by obscurity
We wanted to expose DSpace's metadata in a way that can be used by other applications. They are nice RESTful URLs that are actionable and easily predictable. There are a few themes that do client-side parsing of those URLs in JavaScript, and it opens the possibility for other things to be created. That's not a big argument for handles over internal identifiers; remember, Manakin was several years in the making, and during most of the design cycle DSpace's independence from handles was nothing but a pipe dream with no one really working on it.

As a side note about the protection of metadata: the DSpace API's data model says that all metadata for any item (withdrawn or in progress) is visible to any anonymous user. In general, the DSpace model is that all objects are visible, but may not be actionable, or some parts may be restricted. In my opinion the API should return an authorization error for attempts to access metadata on restricted items, and then perhaps we could set some kind of security permissions on metadata fields, either in the registry or through some dspace.cfg parameter.

That said, the XSLT on the pipeline is a pretty easy hack that solves your problem and is way easier to implement.

Scott

On Jul 8, 2008, at 5:46 PM, Mark Diggory wrote:
> I'm still tempted to suggest that those XSLTs should be using /internal/metadata/handle over /metadata/handle... Scott, can you comment on why this isn't always the case?
>
> -Mark
Re: [Dspace-tech] Achieving security by obscurity
> In my opinion the API should return an authorization error for attempts to access metadata on restricted items, and then perhaps we could set some kind of security permissions on metadata fields, either in the registry or through some dspace.cfg parameter.

One vote from us on this. Any chance metadata authorization is on the schedule?
Re: [Dspace-tech] Achieving security by obscurity
Bruc,

It is being discussed in the DSpace 2.0 architecture roadmap and is also something of a concern in the GSoC semantic web enablement project.

Cheers,
Mark

On Jul 8, 2008, at 6:17 PM, Bruc Liong wrote:
>> In my opinion the API should return an authorization error for attempts to access metadata on restricted items, and then perhaps we could set some kind of security permissions on metadata fields, either in the registry or through some dspace.cfg parameter.
>
> One vote from us on this. Any chance metadata authorization is on the schedule?