Re: [Dspace-tech] Achieving security by obscurity

2008-07-09 Thread Graham Triggs
Scott Phillips wrote:
> We wanted to expose DSpace's metadata in a way that can be used by
> other applications. They are nice RESTful URLs that are actionable
> and easily predictable.

In most cases, that should be seen as a good thing.

I can see that in certain situations you wouldn't want to expose that
publicly, though. So, in the interest of furthering our collective
knowledge, I'll pose the question of tackling it another way:

Is it possible to restrict what has access to the /metadata URLs (i.e.
by IP address)?

If you can restrict part of the URL space so that it is accessible only
through internal Cocoon resolution and/or from specific IP addresses
(e.g. 127.0.0.1), then you can prevent leakage of [sensitive]
information whilst still allowing the internal processes, your own
debugging, and possibly even 'trusted partners' access to the data they
need.

G


-
Sponsored by: SourceForge.net Community Choice Awards: VOTE NOW!
Studies have shown that voting for your favorite open source project,
along with a healthy diet, reduces your potential for chronic lameness
and boredom. Vote Now at http://www.sourceforge.net/community/cca08
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Achieving security by obscurity

2008-07-09 Thread Dorothea Salo
On Wed, Jul 9, 2008 at 8:42 AM, Gary McGath [EMAIL PROTECTED] wrote:

> This still leaves the problem of other ways the metadata can be viewed.
> For instance, we've disabled METS data from the OAI provider because of
> the same problem. I'm considering patching InstallItem.java so that it
> never puts the provenance element into the metadata in the first place.
> Would this cause any problems?

Technically? Probably not. Operationally? Not unlikely.

I could be wrong about this and welcome correction, but I don't think
depositor information on an item is stored anyplace but the provenance
field. (The system does know who has add/remove/change rights on the
said item, but that's not necessarily the same thing.) This means that
if you remove DC provenance, *you don't know who clicked through the
license*. Somebody check me on this, please?

Truthfully, my librarian's teeth itch at the idea of removing
provenance information; we have it drummed into our heads in library
school that provenance is one of the bits of information you simply do
not discard if you are a responsible content manager. But that's
between you and your librarians. The licensing issue strikes me as
more urgent.

I think, alongside others in this thread, that the fix may be doing
something more responsible with provenance/history information than
leaving it in an (editable! easily falsifiable!) Dublin Core field.
(I'd rather see it go into a separate table in the database, though I
suspect that's a temporary fix at best.) The problems noted about
exposing email addresses only strengthen this sense.

Dorothea

-- 
Dorothea Salo [EMAIL PROTECTED]
Digital Repository Librarian AIM: mindsatuw
University of Wisconsin
Rm 218, Memorial Library
(608) 262-5493



Re: [Dspace-tech] Achieving security by obscurity

2008-07-09 Thread Dorothea Salo
> I could be wrong about this and welcome correction, but I don't think
> depositor information on an item is stored anyplace but the provenance
> field. (The system does know who has add/remove/change rights on the
> said item, but that's not necessarily the same thing.) This means that
> if you remove DC provenance, *you don't know who clicked through the
> license*. Somebody check me on this, please?

Never mind, I'm wrong. There's a submitter_id in the item table.

Dorothea

-- 
Dorothea Salo [EMAIL PROTECTED]
Digital Repository Librarian AIM: mindsatuw
University of Wisconsin
Rm 218, Memorial Library
(608) 262-5493



Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Mark Diggory
I've always found hiding the provenance metadata on DSpace items a bit
odd. I think this metadata came into existence as a weak attempt to
introduce some history on the creation of the item.

Likewise I've not been very concerned about its exposure (albeit with the
submitter's email address embedded there). Ideally this user info in
the provenance metadata should contain obfuscated email addresses,
like the kind that are allowed as SHA signatures in FOAF persons.
Attaching that signature as metadata to the eperson and allowing
lookup via SHA signatures would allow the admins to get back to the
user that submitted or approved the item from the metadata, and thus
there would be no need to hide the actual metadata fields from the
public.
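
For reference, the FOAF property for an obfuscated mailbox is foaf:mbox_sha1sum, the SHA-1 digest of the person's full mailto: URI. The fragment below is only a sketch of what such a record could look like. How DSpace would attach the hash to an eperson or look it up is not specified in this thread, the address in the comment is hypothetical, and the digest shown is a placeholder rather than a computed value.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical FOAF description of a submitter; not DSpace output. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <!-- SHA-1 of the full mailbox URI, for example
         "mailto:submitter@example.org", written as 40 hex digits.
         The value below is a placeholder, not a real digest. -->
    <foaf:mbox_sha1sum>0000000000000000000000000000000000000000</foaf:mbox_sha1sum>
  </foaf:Person>
</rdf:RDF>
```

Anyone who already knows or can guess an address can compute the same hash and confirm a match, so this keeps addresses away from harvesters rather than making them secret.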

I'd like to eventually see this metadata stored differently so that  
METS packages (or ORE ReMs) can easily be exposed without too much  
concern for private metadata.

The metadata/ space is currently being used as a catch-all for
exposing various types of metadata to the user (at least in my
usage). It shouldn't be difficult to either block it behind
authentication or drop it entirely using the Cocoon sitemap.xmap
configuration in dspace/modules/xmlui/src/main/webapp/sitemap.xmap.
You'll need to obtain a copy from
dspace-xmlui-webapp/src/main/webapp/sitemap.xmap.

-Mark

On Jul 8, 2008, at 9:43 AM, Gary McGath wrote:

> In order to prevent provenance metadata from being easily reached by
> users, and to make embargoing watertight, I'd like to disable the
> metadata URLs in Manakin. (I've disabled METS output in the OAI
> provider for the same reason.) Since they're useful for debugging
> purposes, it would be nice to have them available, so changing the
> path component from "metadata" to something else seems like a useful
> approach. To this end, I edited
> (dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/sitemap.xmap) and
> changed the relevant map:match element. When I restarted DSpace,
> though, it hung. Changing it back let DSpace run normally.
>
> Is there something else that needs to be changed in order to disable or
> modify the metadata URLs?
>
> --
> Gary McGath
> Digital Library Software Engineer
> Harvard University Library Office for Information Systems
> http://hul.harvard.edu/~gary/index.html




Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Gary McGath
Mark Diggory wrote:

> Likewise I've not been very concerned about its exposure (albeit with the
> submitter's email address embedded there). Ideally this user info in
> the provenance metadata should contain obfuscated email addresses,
> like the kind that are allowed as SHA signatures in FOAF persons.
> Attaching that signature as metadata to the eperson and allowing
> lookup via SHA signatures would allow the admins to get back to the
> user that submitted or approved the item from the metadata, and thus
> there would be no need to hide the actual metadata fields from the
> public.

Whatever the ideal may be, the reality is that there are e-mail 
addresses in the provenance data. I consider it the basic responsibility 
of any web application not to publish people's e-mail addresses without 
their consent. The second problem, peculiar to us because of the embargo 
requirement, is the inclusion of the bitstream file names, which make it 
easy to access embargoed files. On both counts, the exposure of 
provenance metadata is a serious problem and can't wait for 1.5.1.

> The metadata/ space is currently being used as a catch-all for
> exposing various types of metadata to the user (at least in my
> usage). It shouldn't be difficult to either block it behind
> authentication or drop it entirely using the Cocoon sitemap.xmap
> configuration in dspace/modules/xmlui/src/main/webapp/sitemap.xmap.
> You'll need to obtain a copy from
> dspace-xmlui-webapp/src/main/webapp/sitemap.xmap.

As I said in my post, that was what I did. My first attempt was to
change "metadata" to some other string of letters (let's say
"detamata"). If I do that, Manakin hangs. I then refined the attempt to
redirect metadata/** to something else. Same result.

-- 
Gary McGath
Digital Library Software Engineer
Harvard University Library Office for Information Systems
http://hul.harvard.edu/~gary/index.html




Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Mark Diggory

On Jul 8, 2008, at 12:35 PM, Gary McGath wrote:

> Mark Diggory wrote:
>
>> Likewise I've not been very concerned about its exposure (albeit with the
>> submitter's email address embedded there). Ideally this user info in
>> the provenance metadata should contain obfuscated email addresses,
>> like the kind that are allowed as SHA signatures in FOAF persons.
>> Attaching that signature as metadata to the eperson and allowing
>> lookup via SHA signatures would allow the admins to get back to the
>> user that submitted or approved the item from the metadata, and thus
>> there would be no need to hide the actual metadata fields from the
>> public.
>
> Whatever the ideal may be, the reality is that there are e-mail
> addresses in the provenance data. I consider it the basic responsibility
> of any web application not to publish people's e-mail addresses without
> their consent. The second problem, peculiar to us because of the embargo
> requirement, is the inclusion of the bitstream file names, which make it
> easy to access embargoed files. On both counts, the exposure of
> provenance metadata is a serious problem and can't wait for 1.5.1.

I tend to agree.

>> The metadata/ space is currently being used as a catch-all for
>> exposing various types of metadata to the user (at least in my
>> usage). It shouldn't be difficult to either block it behind
>> authentication or drop it entirely using the Cocoon sitemap.xmap
>> configuration in dspace/modules/xmlui/src/main/webapp/sitemap.xmap.
>> You'll need to obtain a copy from
>> dspace-xmlui-webapp/src/main/webapp/sitemap.xmap.
>
> As I said in my post, that was what I did. My first attempt was to
> change "metadata" to some other string of letters (let's say
> "detamata"). If I do that, Manakin hangs. I then refined the attempt to
> redirect metadata/** to something else. Same result.

That's odd, because there's an /internal/metadata/... internal
pipeline that all this is supposed to be going through; I wonder if
that's not the case, then. Do your hangs eventually time out? Can you
get any logging out of xmlui/WEB-INF/logs on this issue? That
might expose the pipeline that's attempting to access this via
/metadata rather than /internal/metadata.

-Mark




Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Conal Tuohy
On Tue, 2008-07-08 at 12:43 -0400, Gary McGath wrote: 
> In order to prevent provenance metadata from being easily reached by
> users, and to make embargoing watertight, I'd like to disable the
> metadata URLs in Manakin. (I've disabled METS output in the OAI
> provider for the same reason.) Since they're useful for debugging
> purposes, it would be nice to have them available, so changing the path
> component from "metadata" to something else seems like a useful
> approach. To this end, I edited
> (dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/sitemap.xmap) and
> changed the relevant map:match element. When I restarted DSpace,
> though, it hung. Changing it back let DSpace run normally.
>
> Is there something else that needs to be changed in order to disable or
> modify the metadata URLs?

The problem with disabling the pipelines which handle those URLs is that
other parts of Manakin depend on them. The theme XSLTs which display
item metadata on item pages, for instance, dereference those URLs to
obtain the item data. :-)

What you should be attempting is to patch the pipelines which handle
those URLs, so that the sensitive information is filtered out. Take a
look at this pipeline fragment, for instance:

<map:match pattern="metadata/handle/*/*/**">
   <map:generate type="DSpaceMETSGenerator">
      <map:parameter name="handle" value="{1}/{2}"/>
      <map:parameter name="extra" value="{3}"/>
   </map:generate>
   <map:serialize type="xml"/>
</map:match>

If you insert a <map:transform src="exclude-sensitive-metadata.xsl"/>
element between the generator and the serializer, then you can have
complete control over what metadata Manakin exposes, e.g. by removing any
dc.description.provenance fields which include an @ character.
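
A minimal sketch of what such a stylesheet might contain follows. It assumes that the generator's METS output carries the Dublin Core values as dim:field elements in the DSpace DIM namespace, with mdschema, element and qualifier attributes. Check the actual output of your metadata/handle/... URL before relying on these names; nothing here has been tested against a DSpace installation.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- exclude-sensitive-metadata.xsl: a sketch, not a tested DSpace patch. -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">

  <!-- Identity template: copy every node and attribute through unchanged. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Drop provenance fields whose text contains an @ character.
       An empty template emits nothing, so the element is removed.
       Assumption: fields appear as dim:field with these attributes. -->
  <xsl:template match="dim:field[@mdschema='dc'
                                 and @element='description'
                                 and @qualifier='provenance'
                                 and contains(., '@')]"/>

</xsl:stylesheet>
```

The more specific template wins over the identity template because a pattern with a predicate has a higher default priority. Note that a provenance field listing bitstream file names but no e-mail address would pass through this filter; removing the contains(., '@') test drops every provenance field instead.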

-- 
Conal Tuohy
New Zealand Electronic Text Centre
www.nzetc.org




Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Mark Diggory

On Jul 8, 2008, at 3:37 PM, Conal Tuohy wrote:

> On Tue, 2008-07-08 at 12:43 -0400, Gary McGath wrote:
>> In order to prevent provenance metadata from being easily reached by
>> users, and to make embargoing watertight, I'd like to disable the
>> metadata URLs in Manakin. (I've disabled METS output in the OAI
>> provider for the same reason.) Since they're useful for debugging
>> purposes, it would be nice to have them available, so changing the
>> path component from "metadata" to something else seems like a useful
>> approach. To this end, I edited
>> (dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/sitemap.xmap) and
>> changed the relevant map:match element. When I restarted DSpace,
>> though, it hung. Changing it back let DSpace run normally.
>>
>> Is there something else that needs to be changed in order to disable or
>> modify the metadata URLs?
>
> The problem with disabling the pipelines which handle those URLs is that
> other parts of Manakin depend on them. The theme XSLTs which display
> item metadata on item pages, for instance, dereference those URLs to
> obtain the item data. :-)
>
> What you should be attempting is to patch the pipelines which handle
> those URLs, so that the sensitive information is filtered out. Take a
> look at this pipeline fragment, for instance:
>
> <map:match pattern="metadata/handle/*/*/**">
>    <map:generate type="DSpaceMETSGenerator">
>       <map:parameter name="handle" value="{1}/{2}"/>
>       <map:parameter name="extra" value="{3}"/>
>    </map:generate>
>    <map:serialize type="xml"/>
> </map:match>
>
> If you insert a <map:transform src="exclude-sensitive-metadata.xsl"/>
> element between the generator and the serializer, then you can have
> complete control over what metadata Manakin exposes, e.g. by removing any
> dc.description.provenance fields which include an @ character.
>
> --
> Conal Tuohy
> New Zealand Electronic Text Centre
> www.nzetc.org

That's a pretty slick solution, Conal!

I'm still tempted to suggest that those XSLTs should be using
/internal/metadata/handle over /metadata/handle... Scott, can you
comment on why this isn't always the case?

-Mark




Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Scott Phillips

We wanted to expose DSpace's metadata in a way that can be used by
other applications. They are nice RESTful URLs that are actionable
and easily predictable. There are a few themes that do client-side
parsing of those URLs in JavaScript, and it opens the possibility for
other things to be created. That's not a big argument for handles over
internal identifiers; remember, Manakin was several years in the making,
and during most of the design cycle DSpace's independence from
handles was nothing but a pipe dream with no one really working on it.

As a side note about the protection of metadata: the DSpace API's data
model says that all metadata for any item (withdrawn or in progress)
is visible to any anonymous user. In general, the DSpace model is that
all objects are visible, but may not be actionable, or some parts may be
restricted. In my opinion, the API should return an authorization
error for attempts to access metadata on restricted items, and then
perhaps we could set some kind of security permissions on metadata
fields, either in the registry or through some dspace.cfg parameter.
That said, the XSLT on the pipeline is a pretty easy hack that solves
your problem and is way easier to implement.

Scott

On Jul 8, 2008, at 5:46 PM, Mark Diggory wrote:

> I'm still tempted to suggest that those XSLTs should be using
> /internal/metadata/handle over /metadata/handle... Scott, can you
> comment on why this isn't always the case?
>
> -Mark





Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Bruc Liong

 
> restricted. In my opinion the API should return an authorization
> error for attempting to access metadata on restricted items, and then
> perhaps we could set some kind of security permissions on metadata
> fields either in the registry or through some dspace.cfg parameter.

One vote from us on this. Any chance metadata authorization is on the schedule?


Re: [Dspace-tech] Achieving security by obscurity

2008-07-08 Thread Mark Diggory

Bruc,

It is being discussed in the DSpace 2.0 architecture roadmap and is  
also something of a concern in the GSoC semantic web enablement project.


Cheers,
Mark

On Jul 8, 2008, at 6:17 PM, Bruc Liong wrote:




>> restricted. In my opinion the API should return an authorization
>> error for attempting to access metadata on restricted items, and then
>> perhaps we could set some kind of security permissions on metadata
>> fields either in the registry or through some dspace.cfg parameter.
>
> One vote from us on this. Any chance metadata authorization is on the
> schedule?

