[Dspace-tech] HOW TO DEBUG JSP-UI WEB APPLICATION RESOURCES PROJECT

2012-08-30 Thread arjumand fatima

Dear All, I want to make a few changes to the submission process for files
that are uploaded to DSpace. For this purpose, I made a few changes in the
SubmissionController class in package org.dspace.app.webui.servlet in the JSP-UI
API and Implementation project. However, when I run the JSP-UI Web Application
Resources project, the changes are not reflected as expected (during the file
upload process).
To find the exact problem, I need to debug the JSP-UI Web Application
Resources project so that I can reach the exact Java code where the error
exists. Can anyone please help me with the debug steps, along with any specific
settings to be followed? Waiting for your kind response.

(I am using DSpace 1.8.2 JSP - User Interface with NetBeans IDE 7.1.2 and
Apache Tomcat Server 7.0.22)

 Regards
Arjumand
  --
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] HOW TO DEBUG JSP-UI WEB APPLICATION RESOURCES PROJECT

2012-08-30 Thread helix84
Hi Arjumand,

you can find information on how to turn on debugging here:

https://wiki.duraspace.org/display/DSPACE/Troubleshoot+an+error#Troubleshootanerror-TurningonDebugging%28optional%29


If you think you're seeing old code deployed instead of your modified code:

1) make sure you're using a full build, not a quick build:

https://wiki.duraspace.org/display/DSPACE/Rebuild+DSpace

2) run the clean target (mvn clean package).
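
When it is unclear whether old code is still deployed, one option is to ask
the JVM where a class was actually loaded from. The following is only a
minimal sketch: the class name CodeSourceProbe is hypothetical and not part of
DSpace, and it assumes you can temporarily call it from inside the web
application (for example from a log statement), passing the class you changed.

```java
import java.security.CodeSource;

/** Hypothetical helper (not part of DSpace): reports where a class was loaded from. */
public final class CodeSourceProbe {

    private CodeSourceProbe() { }

    /**
     * Returns the JAR or directory a class was loaded from, or a marker
     * string when the JVM does not expose one (typical for core JDK classes).
     */
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(no code source: bootstrap or JDK class)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) {
        // In a deployed webapp you would pass the class you modified, e.g.
        // org.dspace.app.webui.servlet.SubmissionController.class
        System.out.println(locationOf(String.class));
        System.out.println(locationOf(CodeSourceProbe.class));
    }
}
```

If the reported location is a JAR with an old timestamp in the deployed
webapp's WEB-INF/lib, the rebuilt artifact probably never reached Tomcat.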

Regards,
~~helix84



[Dspace-tech] opening a specific page of a pdf file with search index

2012-08-30 Thread sanjukta pradhan
Dear sir/madam,

I have a specific requirement: accessing the desired section of a single
PDF through the search index.

That is, if the PDF has a main page, an index page, a content page, a
bibliography, etc., can I open the main page with the search index "main",
access the content page with the index "content", and so on?

If this is possible without importing the PDF in multiple parts, please let
me know how to do it.

Waiting for a reply.

regards
-- 
Sanjukta Pradhan
Scientific Officer/Engineer-SB,NIC


Re: [Dspace-tech] Mirage: customising the summary item view

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 2:22 AM, Mushashu Mwansa Lumpa
<mlu...@cs.uct.ac.za> wrote:
> On the summary item view (actually even the detailed item view), I would
> like to modify the section that lists the collection(s) to which a given
> item belongs. Among other things, I would like to also add the owning
> community information. Which .xsl file should I look at that handles that? I
> have already looked at item-view.xsl, but it looks to me like it is not the
> place that holds that information.

Hi Mushashu,

please, always state which DSpace version, interface (JSPUI/XMLUI) and
XMLUI theme you're using.

You're mentioning item-view.xsl, so I'm guessing Mirage (the other
possibility is dri2xhtml-alt).

The owning community is already in the breadcrumb trail (<xsl:template
match="dri:trail"> in Mirage/lib/xsl/core/page-structure.xsl), so I
assume you mean the list under "This item appears in the following
Collection(s)". This is the template in
dri2xhtml-alt/aspect/artifactbrowser/common.xsl (bear in mind that
Mirage builds upon and overrides dri2xhtml-alt).

<xsl:template match="dri:referenceSet[@type = 'detailList']" priority="2">
    <xsl:apply-templates select="dri:head"/>
    <ul class="ds-referenceSet-list">
        <xsl:apply-templates select="*[not(name()='head')]" mode="detailList"/>
    </ul>
</xsl:template>

and

<xsl:template match="dri:reference" mode="detailList">
    <xsl:variable name="externalMetadataURL">
        <xsl:text>cocoon:/</xsl:text>
        <xsl:value-of select="@url"/>
        <!-- No options selected, render the full METS document -->
    </xsl:variable>
    <xsl:comment> External Metadata URL: <xsl:value-of select="$externalMetadataURL"/> </xsl:comment>
    <li>
        <xsl:apply-templates select="document($externalMetadataURL)" mode="detailList"/>
        <xsl:apply-templates />
    </li>
</xsl:template>

Regards,
~~helix84



Re: [Dspace-tech] opening a specific page of a pdf file with search index

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 11:00 AM, sanjukta pradhan
<sanjukta1...@gmail.com> wrote:
> Can I open the main page with the search index "main"
>
> and access content page with the index "content" and so on

Hi Sanjukta,

what do you mean by "open"? Displaying a PDF means downloading it and
using a PDF viewer configured on the client to open it. That is
completely client-side and DSpace has no influence over it. Do you
mean something else?

Regards,
~~helix84



Re: [Dspace-tech] opening a specific page of a pdf file with search index

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 12:18 PM, sanjukta pradhan
<sanjukta1...@gmail.com> wrote:
> Hi,
> Thanks for the quick response. I hope I will get a solution to my problem. The
> problem is:
> We need to open the PDF at a predefined page number on clicking the
> searched index.
> That is specific to that PDF, and the PDF is already in the DSpace repository.
> Does DSpace have such a facility, or does DSpace provide options to customise
> it programmatically?

Please, always CC dspace-tech when replying.

Opening a PDF on the client's computer simply has nothing to do with
DSpace. It's not a fault of DSpace, it's that HTTP simply doesn't work
that way.

Regards,
~~helix84



[Dspace-tech] Supress list of Communities on Front Page

2012-08-30 Thread Benjamin Ryan
Hi,
What is the easiest way to suppress the list of Communities on
the front page?
I have been looking for XSLT templates that render this but
cannot seem to find one that matches a DRI:div with n="community-browser".

Regards,
Ben

--
Dr Ben Ryan
Jorum Technical Coordinator (Services)

5.13 Roscoe Building
The University of Manchester
Oxford Road
Manchester
M13 9PL
Tel: 0160 275 6256
E-mail: benjamin.r...@manchester.ac.uk
--



Re: [Dspace-tech] Supress list of Communities on Front Page

2012-08-30 Thread helix84
Hi Ben,

<xsl:template name="disable_front-page-communities"
    match="dri:div[@id='aspect.artifactbrowser.CommunityBrowser.div.comunity-browser']">
</xsl:template>

Regards,
~~helix84



Re: [Dspace-tech] Supress list of Communities on Front Page

2012-08-30 Thread helix84
Anyway, the one you suggested works, too, I just like mine more
because it's more specific. I guess the mistake you made was writing
the dri: namespace using capital letters?

<xsl:template name="disable_front-page-communities"
    match="dri:div[@n='comunity-browser']">
</xsl:template>

Regards,
~~helix84



Re: [Dspace-tech] Problem with im4java

2012-08-30 Thread Daniel Shin
Hi Helix,

The problem is that I cannot create the images in my directory.
A registered user can upload their picture. My code needs to publish the
resized picture.


Thanks,

Daniel

2012/8/29 helix84 <heli...@centrum.sk>

> On Wed, Aug 29, 2012 at 6:55 PM, Daniel Shin <danielshin...@gmail.com>
> wrote:
> > The code creates a newtest.png picture and saves it in the images
> > directory. But it doesn't work in my application.
> > I don't get any Java exception about this.
>
> What doesn't work? If you have the image saved in
> [dspace]/webapps/xmlui/themes/Mirage/images/newtest.png, displaying it
> is just a matter of writing some simple XSL. Do you need help with
> that?
>
> Regards,
> ~~helix84



Re: [Dspace-tech] Supress list of Communities on Front Page

2012-08-30 Thread Benjamin Ryan
Helix,
I did try it in both upper and lower case, though I have just noticed that your
template looks for n="comunity-browser" and not n="community-browser".
Also, where is the best place to put this template - in
page-structure.xsl?

Regards,
Ben

--
Dr Ben Ryan
Jorum Technical Coordinator (Services)

5.13 Roscoe Building
The University of Manchester
Oxford Road
Manchester
M13 9PL
Tel: 0160 275 6256
E-mail: benjamin.r...@manchester.ac.uk
--


-----Original Message-----
From: ivan.ma...@gmail.com [mailto:ivan.ma...@gmail.com] On Behalf Of helix84
Sent: 30 August 2012 13:01
To: Benjamin Ryan
Cc: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Supress list of Communities on Front Page

Anyway, the one you suggested works, too, I just like mine more because it's
more specific. I guess the mistake you made was writing the dri: namespace using
capital letters?

<xsl:template name="disable_front-page-communities"
    match="dri:div[@n='comunity-browser']">
</xsl:template>

Regards,
~~helix84


Re: [Dspace-tech] Supress list of Communities on Front Page

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 2:14 PM, Benjamin Ryan
<benjamin.r...@manchester.ac.uk> wrote:
> I did try it both upper and lower though I have just noticed that
> your template looks for n="comunity-browser" and not n="community-browser".

Yes, I noticed. That's the actual value of the n attribute (with the typo).

> Also where is the best place to put this template - in
> page-structure.xsl?

No, the best is to create a new template importing and overriding your
parent template and place it in there. This is described here:

https://wiki.duraspace.org/display/DSPACE/Manakin+theme+tutorial

Regards,
~~helix84



Re: [Dspace-tech] Problem with im4java

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 2:08 PM, Daniel Shin <danielshin...@gmail.com> wrote:
> The problem is that I cannot create the images in my directory.
> A registered user can upload their picture. My code needs to publish the
> resized picture.

Check that the target directory has write permissions for the user
you're running DSpace/Tomcat as.

If the problem is in your code, I can hardly help you with that.
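
A quick way to rule out permissions is to run the check as the same user and
JVM that writes the images. This is a minimal sketch, assuming Java 7 or later
(for java.nio.file); the class name WriteCheck and the default path are
hypothetical and only for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: checks whether the current JVM user can write to a directory. */
public final class WriteCheck {

    private WriteCheck() { }

    /** Returns a short diagnosis for the given directory. */
    public static String diagnose(Path dir) {
        if (!Files.exists(dir)) {
            return "missing";
        }
        if (!Files.isDirectory(dir)) {
            return "not a directory";
        }
        return Files.isWritable(dir) ? "writable" : "not writable";
    }

    public static void main(String[] args) {
        // The default path is an assumption for illustration; pass your own
        // target directory as the first argument.
        Path target = Paths.get(args.length > 0 ? args[0] : System.getProperty("java.io.tmpdir"));
        System.out.println(System.getProperty("user.name") + " -> " + target + ": " + diagnose(target));
    }
}
```

The result only means something when it runs under the same account as
Tomcat; the same check run from your own shell user can give a different answer.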

Regards,
~~helix84



Re: [Dspace-tech] Problem with im4java

2012-08-30 Thread Daniel Shin
Ok!

Thanks for your help.


Regards,

Daniel

2012/8/30 helix84 <heli...@centrum.sk>

> On Thu, Aug 30, 2012 at 2:08 PM, Daniel Shin <danielshin...@gmail.com>
> wrote:
> > The problem is that I cannot create the images in my directory.
> > A registered user can upload their picture. My code needs to publish the
> > resized picture.
>
> Check that the target directory has write permissions for the user
> you're running DSpace/Tomcat as.
>
> If the problem is in your code, I can hardly help you with that.
>
> Regards,
> ~~helix84



Re: [Dspace-tech] data entry errors

2012-08-30 Thread Darren Arsenault
Hi Bram,

Cleaning up the current errors will have to be done, but I was more concerned 
with the prevention of future errors, as you deduced. (I do appreciate the 
information on the tools available for clean up though—Thanks!)

I had the same idea that you mentioned below—hit the database for near matches 
and display a list to the user, allowing them to simply select the data from 
the list if they see what they are looking for. The reason that I bring it to 
the community is two-fold:

Firstly, I highly doubt that I am the first person to come across this issue. I 
had hoped that someone had already developed a solution. There are so many 
different ideas, implementations, configurations, and patches out there that I 
would be a fool not to ask.
Secondly, I am fairly new to DSpace, having only been working with it for a few 
weeks now (and most of my time has been spent doing high-level changes), so I 
don't know the code intimately yet. While I know how to code this solution on 
its own, I am concerned about the possibility of side-effects if I simply 
start adding code/logic to the JSP without fully understanding the supporting 
code. At present, querying the database, displaying the result set, and 
[possibly] updating an input field does not seem like it would cause an issue, 
but I have been surprised in the past by making assumptions.

In any case, I thank all of you for taking the time to consider my question and 
respond to it. Of the projects that I have worked on, this one definitely has 
the most helpful community I have ever seen.

Good-day and be well.

Darren Arsenault


From: bluy...@gmail.com [bluy...@gmail.com] On Behalf Of Bram Luyten 
[b...@mire.be]
Sent: August-30-12 3:06 AM
To: DSpace @ Lyncode
Cc: Darren Arsenault; dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] data entry errors

Hi Darren,

to be very clear: are you looking for a way to clean up the current errors, or 
just interested in prevention for new ones? In terms of prevention, it might 
help if you develop an auto-complete feature that tries to match anything a 
user is entering in a particular metadata field, with those values that are 
already stored for that field in archived items.

Referring back to your example, this would mean that if someone starts typing 
AB... he or she would get suggestions for ways in which someone else has 
already entered values starting with AB for that specific metadata field.
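
The matching step of such an auto-complete could look roughly like the sketch
below. It is only an illustration: the class name ValueSuggester is
hypothetical, and a plain in-memory list stands in for the values that a real
implementation would fetch from the database or a search index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.TreeSet;

/** Hypothetical helper: suggests already stored metadata values for a typed prefix. */
public final class ValueSuggester {

    private ValueSuggester() { }

    /**
     * Returns the distinct stored values that start with the typed prefix,
     * ignoring case and surrounding whitespace, in case-insensitive order.
     * Values that differ only by case collapse to the first one seen.
     */
    public static List<String> suggest(String typed, List<String> storedValues) {
        String prefix = typed.trim().toLowerCase(Locale.ROOT);
        TreeSet<String> matches = new TreeSet<>(String.CASE_INSENSITIVE_ORDER);
        for (String value : storedValues) {
            String candidate = value.trim();
            if (candidate.toLowerCase(Locale.ROOT).startsWith(prefix)) {
                matches.add(candidate);
            }
        }
        return new ArrayList<>(matches);
    }

    public static void main(String[] args) {
        // Sample values taken from the example in this thread, typos included.
        List<String> stored = Arrays.asList(
                "ABC Statistics Inc.", "ABC Statisitcs, Inc.",
                "ABC Statistics, Inc", "Zed Corp", "ABC Statistics Inc.");
        for (String suggestion : suggest("ab", stored)) {
            System.out.println(suggestion);
        }
    }
}
```

Showing the user every existing variant, misspellings included, is a
deliberate choice in this sketch: it makes the duplicates visible at entry time
rather than hiding them.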

To deal with errors that already made it into your metadata, here are two 
suggestions, a free one, and a commercial add-on module from @mire:

- Since DSpace 1.6 you can export metadata into spreadsheets on a
per-collection basis. So download the metadata in a spreadsheet, clean it up,
and re-upload it to see the changes take effect. For the clean-up part, you
can go at it with your spreadsheet editor, but you might want to look at Google
Refine <http://code.google.com/p/google-refine/>. It's really awesome at
detecting similar values and grouping them together.

- Our Metadata quality module <http://atmire.com/website/?q=modules/mqm> has
functionality for performing batch edits straight from the DSpace web UI and
merging duplicates.

cheers,

Bram

--

Bram Luyten @mire
2888 Loker Avenue East, Suite 305, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
http://www.atmire.com/



On Wed, Aug 29, 2012 at 8:32 PM, DSpace @ Lyncode
<dsp...@lyncode.com> wrote:
Hi,

I can only think of implementing an Authority Control for that.
Anyway, the deposit workflow is meant to accomplish that task (validate/correct
metadata values).

On 29 August 2012 16:22, Darren Arsenault
<arse...@algonquincollege.com> wrote:
I posted this a week ago and no one has responded yet, so I'm trying again:

For input fields where it is not possible (or practical) to implement
controlled vocabularies or drop-down lists, is there a less labour-intensive
way of preventing data entry errors? For example: The author of several
documents is ABC Statistics Inc., but each document is added by a different
ePerson, and each of these people makes a spelling error when filling out the
AUTHOR field, so these items appear to have different authors. (ABC
Statisitcs, Inc., ABC Statistics, Inc, ABC Statistics, etc.).

Originally I thought that this would be a minor issue, easily correctable
through raw SQL queries to update the offending fields. Unfortunately, my
estimate of the number of mistakes that would be made has proven to be
extremely conservative. I do not want to be responsible for correcting so many
entries myself, nor do I want to reject so many entries asking users to match
the AUTHOR name that already exists.



Does anyone have any ideas?




Re: [Dspace-tech] data entry errors

2012-08-30 Thread helix84
On Thu, Aug 30, 2012 at 3:20 PM, Darren Arsenault
<arse...@algonquincollege.com> wrote:
> Firstly, I highly doubt that I am the first person to come across this issue.
> I had hoped that someone had already developed a solution. There are so many
> different ideas, implementations, configurations, and patches out there that
> I would be a fool not to ask.

You're right to ask first. It seems like a logical extension of
existing functionality, a feature many people would be interested in.
When you implement it, please make sure to submit your patch to our
Jira [1].

I also wanted to draw your attention to the current development of
Discovery for JSPUI. It's planned to be in DSpace 3.0, which is due
before the end of this year. If I were you, I'd prefer talking to Solr
instead of the database, it's faster and built for search (so you may
forget LIKE). You may want to develop your improvements for the
upcoming version and deploy it when it comes out. You can find the
JSPUI Discovery branch here [2] and watch when it's merged into the
master Git branch here [3]. The corresponding Jira ticket is here [4].

Another option is to use the XMLUI interface where this functionality
already exists for submission (available only for your users
internally) and the JSPUI interface for the public-facing repository
(if you prefer). XMLUI and JSPUI can be deployed just fine in parallel
on one DSpace instance, just on different URLs.

[1] https://jira.duraspace.org/browse/
[2] https://github.com/abollini/DSpace/tree/DS-1217
[3] https://github.com/DSpace/DSpace/pull/60
[4] https://jira.duraspace.org/browse/DS-1217

Regards,
~~helix84



Re: [Dspace-tech] Edit Item display Issue

2012-08-30 Thread Keith Jones


On Tue, 28 Aug 2012, helix84 wrote:

> Hi Keith,
> which version are you using? I just checked this on demo.dspace.org
> which runs the latest code (to become 3.0) from git and the problem is
> not present there.

I'm running version 1.8.2. I'm using the new messages.xml. I've checked in 
the deployed webapps area, and the entry is in the deployed messages.xml 
file.



> I had the same problem when I copied over a customized messages.xml
> file from 1.7 to 1.8 (reordering was a new feature in 1.8) and the
> messages were missing there (obviously). You're saying that they are
> in your messages.xml now. Could you 1) clean your Cocoon cache [1] and
> 2) restart Tomcat and check if the problem still persists?


I followed the procedure to clear the cache through the new interface
option. Then I shut down Tomcat and restarted it, but the problem still
persists.

Thanks

Keith



Re: [Dspace-tech] Edit Item display Issue

2012-08-30 Thread helix84
I can't confirm that on 1.8.2, but then I added things to messages.xml
so it's not a default installation.
Can anyone here who hasn't edited messages.xml confirm it?

Regards,
~~helix84



Re: [Dspace-tech] Edit Item display Issue

2012-08-30 Thread Keith Jones

Helix84,

I've edited other things in messages.xml and the edited items show up
correctly. In this instance I did not edit the entries; they are not being
displayed properly, but the matching entries exist in the file.

On Thu, 30 Aug 2012, helix84 wrote:

> I can't confirm that on 1.8.2, but then I added things to messages.xml
> so it's not a default installation.
> Can anyone here who hasn't edited messages.xml confirm it?
>
> Regards,
> ~~helix84




[Dspace-tech] Ingesting large data set

2012-08-30 Thread Ingram , William A
I apologize if a similar question has been answered in a prior thread.

We have a student needing to submit a 150 GB data set into DSpace. Is this even 
possible? Are there any tips or workarounds I should try? 

Cheers,
Bill



Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Pottinger, Hardy J.
Hi, Bill, the theoretical limit for posting data via HTTP is 1.8 GB [1].
Your only recourse for storing this particular data set in DSpace, is to
transfer to the server via FTP, SFTP, or SCP, and then either batch load,
or run the item update script [2]. However, my main question is: once it
is *in* DSpace, how do you plan on getting it *out*?

I'd love to hear more from folks who are storing large data sets, just to
hear how you're handling usability of the stored data.

[warning, I'm about to jump off the deep end here...]


One possibility worth exploring is streaming uploads and downloads. I've
come across streaming upload clients before, and in a brief bit of
googling, I discovered that Apache Commons supports streaming upload. [3]

What would be cool is if someone had a streaming dataset viewer, where you
could plug some kind of visualization into a 'thumbnail' snapshot of a
data set, and then have the full data set stream in to fill out that
visualization/analysis. I can't be the first person to have thought of
such a thing, somebody has to be working on this already, right?

[1] http://stackoverflow.com/questions/1922414/file-upload-limit-in-http
[2] https://wiki.duraspace.org/display/DSDOC18/Updating+Items+via+Simple+Archive+Format
[3] http://commons.apache.org/fileupload/streaming.html

--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
https://MOspace.umsystem.edu/
Debug only code. Comments lie.

On 8/30/12 10:53 AM, "Ingram, William A" <wingr...@illinois.edu> wrote:

> I apologize if a similar question has been answered in a prior thread.
>
> We have a student needing to submit a 150 GB data set into DSpace. Is
> this even possible? Are there any tips or workarounds I should try?
>
> Cheers,
> Bill



Re: [Dspace-tech] Choosing between CC in a single collection?

2012-08-30 Thread Darren Arsenault
Is it possible to set up a collection that allows an uploader to choose between 
multiple CC licences? For example, when they got to the GRANT LICENSE step they 
would see a list of available licenses and choose which one they would like to 
grant.

Thanks,

Darren



Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread George S Kozak
Bill:

Normally, at Cornell, we discourage files that are greater than 2GB. The
problem isn’t that DSpace can’t handle them; the problem is the time it takes
to upload the file at submission time and for a user to download it. A lot of
times, people’s browsers just time out.

That said, we do have some large files and what I usually do with them is 
submit them using the item import application.  I upload the file to DSpace 
using SFTP (as a background job).  Once it’s on the server, I create an import 
directory with the Dublin core needed and then run the batch importer.
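
The import directory described above follows DSpace's Simple Archive Format:
each item is a directory holding a dublin_core.xml file, a contents file that
lists the bitstream file names, and the files themselves. As a rough sketch of
generating the two descriptor files (the class name SafItemWriter, the item
directory name and the sample title are hypothetical, and Java 7 or later is
assumed):

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper: writes the descriptor files of one Simple Archive Format item. */
public final class SafItemWriter {

    private SafItemWriter() { }

    /**
     * Creates itemDir if needed and writes dublin_core.xml and contents.
     * The bitstream itself is assumed to be placed in itemDir separately
     * (for example via SFTP, as described in the message above).
     */
    public static void write(Path itemDir, String title, String bitstreamName) throws IOException {
        Files.createDirectories(itemDir);
        String dc = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<dublin_core>\n"
                + "  <dcvalue element=\"title\" qualifier=\"none\">" + escape(title) + "</dcvalue>\n"
                + "</dublin_core>\n";
        Files.write(itemDir.resolve("dublin_core.xml"), dc.getBytes(StandardCharsets.UTF_8));
        // The contents file lists one bitstream file name per line.
        Files.write(itemDir.resolve("contents"), (bitstreamName + "\n").getBytes(StandardCharsets.UTF_8));
    }

    /** Minimal XML escaping for element text. */
    public static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) throws IOException {
        Path item = Files.createTempDirectory("saf").resolve("item_000");
        write(item, "Large data set", "dataset.tar");
        System.out.println("Wrote descriptor files to " + item);
    }
}
```

Only the title is written here; a real item would normally carry more Dublin
Core fields, added as further dcvalue elements.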

George Kozak
Digital Library Specialist
Cornell University Library Information Technologies (CUL-IT)
501 Olin Library
Cornell University
Ithaca, NY 14853
607-255-8924

From: Ingram, William A [mailto:wingr...@illinois.edu]
Sent: Thursday, August 30, 2012 11:54 AM
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] Ingesting large data set

I apologize if a similar question has been answered in a prior thread.

We have a student needing to submit a 150 GB data set into DSpace. Is this even 
possible? Are there any tips or workarounds I should try?

Cheers,
Bill



Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Mark H. Wood
We are just setting up a data repository and will probably soon be
facing similar challenges.  This also has some relationship to longer
videos and the like.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Mark H. Wood
There's also the registration method:  put the file into the
assetstore space by some other means and then just tell DSpace it's
there, and here are the metadata.  No further copying required.

I suppose you could even carry your 2TB file in on a hot-plug disk
drive, push it into an empty slot, mount the volume over an empty
directory in the assetstore, and magick it into DSpace with *zero*
copying.  (I can just see the look on the salesman's face as we
ask for a quote on a three-rack FC SAN unit with *no* drives)
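
A sketch of how registration is expressed in practice, assuming the DSpace 1.8 convention that a line in the Simple Archive Format contents file of the form "-r -s n -f filepath" registers a file already sitting in asset store n instead of importing it. The helper name registration_line and the example path are hypothetical. Verify the line format against the documentation for your version before relying on it.

```python
def registration_line(assetstore_number, relative_path, bundle=None):
    """Build one 'contents' line that registers an existing file.

    assetstore_number -- index of the asset store that already holds the file
    relative_path     -- path of the file relative to that asset store
    bundle            -- optional bundle name (assumed tab-separated suffix)
    """
    line = f"-r -s {assetstore_number} -f {relative_path}"
    if bundle:
        line += f"\tbundle:{bundle}"
    return line


# Example: a file placed under asset store 0 by some out-of-band means.
print(registration_line(0, "bigdata/run1.tar"))
```

The appeal of this route is the one noted above: the file is never copied, only its location and metadata are recorded.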

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Richard Rodgers
Yes, as has been remarked, the bigger questions revolve around access and 
usage, rather than ingest.
We recently did a pilot with large video files where we ingested them as 
preservation masters (via ItemImport), suppressed the download link, but 
offered in its place a link to a much smaller transcoded access copy on 
YouTube. The thinking was that, as formats change, we could reuse the master, 
thereby guaranteeing access in a mediated way…


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Ryan Scherle
I agree with George. Files larger than 2GB are a serious pain. However, some 
users insist on giving us these files, and we would rather take the data than 
reject it. 

Even before you near 2GB, it is likely that something in your system will 
reject the upload. You must ensure that all of the pieces of your installation 
are configured correctly. This includes:
* Apache -- LimitRequestBody 0 
* Tomcat -- maxPostSize=0
* Cocoon -- see http://sourceforge.net/mailarchive/message.php?msg_id=28478227
* Any security software you are running

Even once everything is set correctly, users with poor bandwidth will have 
trouble transferring large files directly through HTTP.

Our normal process is for end-users to supply the metadata and upload a dummy 
file. This way, the user ensures the correct metadata is associated with the 
correct file, which is important when they are giving us several files at once. 
Once the dummy file is in place and associated with the correct metadata, the 
user sends us the large file out-of-band. Depending on the size of the file, we 
may have them upload via a third party site like WeTransfer.com, or we may open 
an FTP server.

Once we have the file in hand, we either use the import/export tools OR make 
the appropriate changes directly in the database. Both processes are ugly and 
need improvement, but here are our current instructions:
http://wiki.datadryad.org/Large_File_Technology

--- Ryan Scherle
--- Data Repository Architect
--- Dryad Digital Repository


[Dspace-tech] NOTICE: JIRA Issue Tracker Maintenance TODAY from 4-4:30PM ET

2012-08-30 Thread Tim Donohue
All,

DuraSpace will be performing maintenance and upgrading our JIRA Issue 
Tracker today (Thurs, Aug 30) from 4-4:30PM ET. To see the corresponding 
maintenance time in your area, visit:

http://www.timeanddate.com/worldclock/fixedtime.html?msg=DuraSpace+JIRA+Maintenance&iso=20120830T20&am=30

This maintenance has been necessitated by several critical security 
issues within our current version of Atlassian JIRA.

During this maintenance window the http://jira.duraspace.org site will 
be unavailable to all users.

We apologize for any inconvenience this may cause. Please feel free to 
email sysad...@duraspace.org if you have any questions or concerns.

Sincerely,

Tim Donohue (on behalf of sysad...@duraspace.org)

-- 
Tim Donohue
Technical Lead for DSpace Project
DuraSpace.org


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Mark H. Wood
On Thu, Aug 30, 2012 at 05:03:02PM +, Richard Rodgers wrote:
> Yes, as has been remarked, the bigger questions revolve around access and 
> usage, rather than ingest.
> We recently did a pilot with large video files where we ingested them as 
> preservation masters (via ItemImport), suppressed the download link, but 
> offered in its place a link to a much smaller transcoded access copy on 
> YouTube. The thinking was that, as formats change, we could reuse the master, 
> thereby guaranteeing access in a mediated way…

Interesting.

I've toyed with the idea of having DSpace accept and advertise a
huge object, but serve up a link to e.g. a video streaming service
that has access to the same storage and knows how to play nicely with
DSpace's storage layer.  But there's no code at all yet.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Pottinger, Hardy J.
This may be just me hijacking the thread, so, apologies up front, but I
followed a link [1] on the Code4Lib mail list just now, and came across
Miso Dataset [2], which looks very cool indeed.

[1] http://selection.datavisualization.ch/
[2] http://misoproject.com/dataset/

--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
https://MOspace.umsystem.edu/
Do you love it? Do you hate it? There it is, the way you made it.
--Frank Zappa





On 8/30/12 11:16 AM, Pottinger, Hardy J. pottinge...@umsystem.edu
wrote:

Hi, Bill, the theoretical limit for posting data via HTTP is 1.8 GB [1].
Your only recourse for storing this particular data set in DSpace is to
transfer it to the server via FTP, SFTP, or SCP, and then either batch load
it or run the item update script [2]. However, my main question is: once it
is *in* DSpace, how do you plan on getting it *out*?

I'd love to hear more from folks who are storing large data sets, just to
hear how you're handling usability of the stored data.

[warning, I'm about to jump off the deep end here...]


One possibility worth exploring is streaming uploads and downloads. I've
come across streaming upload clients before, and in a brief bit of
googling, I discovered that Apache Commons supports streaming upload. [3]

What would be cool is if someone had a streaming dataset viewer, where you
could plug some kind of visualization into a 'thumbnail' snapshot of a
data set, and then have the full data set stream in to fill out that
visualization/analysis. I can't be the first person to have thought of
such a thing, somebody has to be working on this already, right?

[1] http://stackoverflow.com/questions/1922414/file-upload-limit-in-http
[2] https://wiki.duraspace.org/display/DSDOC18/Updating+Items+via+Simple+Archive+Format
[3] http://commons.apache.org/fileupload/streaming.html

--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
https://MOspace.umsystem.edu/
Debug only code. Comments lie.


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Mark H. Wood
On Thu, Aug 30, 2012 at 01:17:03PM -0400, Ryan Scherle wrote:
[snip]
> Even before you near 2GB, it is likely that something in your system will 
> reject the upload. You must ensure that all of the pieces of your 
> installation are configured correctly. This includes:
> * Apache -- LimitRequestBody 0 
> * Tomcat -- maxPostSize=0
> * Cocoon -- see http://sourceforge.net/mailarchive/message.php?msg_id=28478227
> * Any security software you are running

 * Filesystem capable of representing huge files.  An easy thing to
   overlook, these days, but I just recently had a log file overflow
   2GB and start failing processes that wanted to write it.
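
The recurring 2GB figure is worth spelling out. The sketch below assumes the failures come from sizes or offsets held in signed 32-bit integers, which is a common cause but is not confirmed anywhere in this thread. The names are illustrative only.

```python
# Largest value a signed 32-bit integer can hold: 2,147,483,647,
# i.e. one byte short of 2 GiB.
MAX_SIGNED_32 = 2**31 - 1


def exceeds_signed_32(size_bytes):
    """Return True if a byte count cannot be held in a signed 32-bit integer."""
    return size_bytes > MAX_SIGNED_32


# The 150 GB data set from this thread, taking 1 GB as 10**9 bytes.
dataset_bytes = 150 * 10**9
print(exceeds_signed_32(dataset_bytes))
```

Any component along the path (filesystem, web server, servlet container, library) that still counts bytes this way will fail at the same boundary, so each one needs to be checked separately.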

Thanks for the list of limits.  That's going in my keeper file.

-- 
Mark H. Wood, Lead System Programmer   mw...@iupui.edu
Asking whether markets are efficient is like asking whether people are smart.


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Ingram, William A
Interesting discussion. I'm glad I asked. Thank you all.

I'm going to give item import a shot. Let's hope it doesn't choke on the 150 GB 
payload. It shouldn't, if it just copies the file into the asset store. My fear 
is that it tries to load the file into memory for some reason and overflows 
the heap.
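
To illustrate the difference between the two behaviours, here is a minimal sketch of a streaming copy whose memory use stays at one chunk regardless of file size. It is not DSpace's actual storage code. The function name copy_in_chunks, the 1 MiB chunk size, and the choice of MD5 as the running checksum are all assumptions made for the example.

```python
import hashlib


def copy_in_chunks(src, dst, chunk_size=1024 * 1024):
    """Copy a binary stream to another in fixed-size chunks.

    Only one chunk is held in memory at a time, so a 150 GB file needs
    no more heap than a 150 KB one. Returns (bytes_copied, md5_hex).
    """
    digest = hashlib.md5()
    total = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        dst.write(chunk)
        digest.update(chunk)
        total += len(chunk)
    return total, digest.hexdigest()
```

With real files this would be called as copy_in_chunks(open(source, "rb"), open(target, "wb")). An importer that instead reads the whole file into a byte array first would need heap proportional to the file size, which is the failure mode feared here.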

Here goes. 

Cheers, 
Bill


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Benjamin Ryan
My 2p worth,

1.  What does the 150 GB consist of: one data set, or multiple data sets (that 
may be related, e.g. by time or geographic location)?

2.  How would someone use this data set? At 150 GB I would assume (hope) 
offline processing.

3.  For GIS, geo-spatial, and time-series data there are ways to implement a 
“browse” functionality by “slicing/tiling”, but they are very dependent on 
the data and on what you want to ask about the data.

4.  Is your DSpace system a long-term archival store with a focus on 
preservation?

Sorry for no answers, but willing to discuss questions.

Regards,
   Ben

--
Dr Ben Ryan
Jorum Technical Coordinator (Services)

5.12 Roscoe Building
The University of Manchester
Oxford Road
Manchester
M13 9PL
Tel: 0160 275 6039
E-mail: benjamin.r...@manchester.ac.uk
--


Re: [Dspace-tech] Ingesting large data set

2012-08-30 Thread Ingram, William A
1.   The 150 GB consists of a few—multiple, but not many—datasets

2.   These are supplementary files for an ETD

3.   I have no idea what they actually are yet, other than that they are on a 
hard drive that the graduate college folks should be bringing over soon

4.   Storage in DSpace is mainly for preservation, but the data may be 
critical to reading and understanding the dissertation

Cheers,
Bill


Re: [Dspace-tech] XPDF to Thumbnail Preview in DSpace 1.8.2

2012-08-30 Thread Osama Alkadi
We upgraded pdftoppm from 3.0 to 3.02 and that fixed the problem.
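
For anyone chasing the same symptom, the manual check quoted below can be scripted so that the exit status and any error text are captured together. This is only a sketch: the flags are copied from the quoted command, while the helper names and the default of 62 dpi as a parameter are made up for illustration.

```python
import subprocess


def pdftoppm_first_page_command(pdf_path, out_prefix, dpi=62):
    """Build the pdftoppm argument list for rendering page 1 only."""
    return [
        "pdftoppm", "-q",
        "-f", "1", "-l", "1",   # first and last page to render
        "-r", str(dpi),         # resolution in dpi
        str(pdf_path), str(out_prefix),
    ]


def run_first_page(pdf_path, out_prefix, dpi=62):
    """Run pdftoppm and return (exit_status, stderr_text).

    Requires pdftoppm on the PATH; raises FileNotFoundError otherwise.
    """
    result = subprocess.run(
        pdftoppm_first_page_command(pdf_path, out_prefix, dpi),
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr.strip()
```

Running this against a failing PDF before and after upgrading the tool gives a quick way to confirm whether the problem lies in pdftoppm itself rather than in DSpace.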

Thanks
Osama

On 30/08/2012, at 12:57 PM, Osama Alkadi wrote:

 Just a follow up on my previous email, I ran the pdftoppm manually using this 
 command and got the error below:
 
 pdftoppm -q -f 1 -l 1 -r 62 DevelopmentBulletin-73_2009.pdf bleg2
 Bogus memory allocation size
 
 Link to the pdf file at: http://hdl.handle.net/1885/9207
 
 Has anyone seen this error?
 
 Thanks
 
 
 On 29/08/2012, at 2:52 PM, Osama Alkadi wrote:
 
 Hi all,
 
 We are running DSpace 1.8.2 on Linux and having some issues with the pdftoppm 
 tool when extracting thumbnails from some PDFs. 
 
 Some properties of the PDFs:
 
 - Encoding software includes: Adobe PDF Library, Acrobat Distiller, Acrobat 
   PDFWriter.
 - Size: varies from 1 to 10 MB.
 
 The logs (in debug mode) show this after executing filter-media:
 
 INFO  org.dspace.app.mediafilter.XPDF2Thumbnail @ XPDF2Thumbnail: outPrefix: /tmp/prevu1738144616485715914out
 ERROR org.dspace.app.mediafilter.XPDF2Thumbnail @ Unable to delete file
 ERROR org.dspace.app.mediafilter.XPDF2Thumbnail @ PDF conversion proc failed, exit status=1, file=/tmp/DSfilt2694438157933967840.pdf
 --
 Full Filter Name: org.dspace.app.mediafilter.HTMLFilter
 org.dspace.app.mediafilter.HTMLFilter
 Full Filter Name: org.dspace.app.mediafilter.WordFilter
 org.dspace.app.mediafilter.WordFilter
 Full Filter Name: org.dspace.app.mediafilter.JPEGFilter
 org.dspace.app.mediafilter.JPEGFilter
 Full Filter Name: org.dspace.app.mediafilter.XPDF2Text
 org.dspace.app.mediafilter.XPDF2Text
 Full Filter Name: org.dspace.app.mediafilter.BrandedPreviewJPEGFilter
 org.dspace.app.mediafilter.BrandedPreviewJPEGFilter
 Full Filter Name: org.dspace.app.mediafilter.XPDF2Thumbnail
 org.dspace.app.mediafilter.XPDF2Thumbnail
 Full Filter Name: org.dspace.app.mediafilter.PowerPointFilter
 org.dspace.app.mediafilter.PowerPointFilter
 FILTERED: bitstream 38802 (item: 1885/8749) and created 'DevelopmentBulletin-73_2009.pdf.txt'
 ERROR filtering, skipping bitstream:
 
  Item Handle: 1885/8749
  Bundle Name: ORIGINAL
  File Size: 1445348
  Checksum: 1a1b0472e9361c4a4a00d30846f3e211 (MD5)
  Asset Store: 0
 javax.imageio.IIOException: Can't read input file!
 javax.imageio.IIOException: Can't read input file!
     at javax.imageio.ImageIO.read(ImageIO.java:1275)
     at org.dspace.app.mediafilter.XPDF2Thumbnail.getDestinationStream(XPDF2Thumbnail.java:246)
     at org.dspace.app.mediafilter.MediaFilterManager.processBitstream(MediaFilterManager.java:746)
     at org.dspace.app.mediafilter.MediaFilterManager.filterBitstream(MediaFilterManager.java:561)
     at org.dspace.app.mediafilter.MediaFilterManager.filterItem(MediaFilterManager.java:511)
     at org.dspace.app.mediafilter.MediaFilterManager.applyFiltersItem(MediaFilterManager.java:479)
     at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:353)
     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
     at java.lang.reflect.Method.invoke(Method.java:597)
     at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
 FILTERED: bitstream 38805 (item: 1885/8749) and created '01whole_Grubb.pdf.txt'
 FILTERED: bitstream 38805 (item: 1885/8749) and created '01whole_Grubb.pdf.jpg'
 Updating search index:
 
 Strangely, even when running the pdftoppm tool manually I get a "Bogus 
 memory allocation size" error.  My JAVA_OPTS is set to -Xmx1024M -Xms128M 
 -XX:PermSize=192M -XX:MaxPermSize=384M
 
 Also, someone on the mailing list suggested changing a line in 
 XPDF2Thumbnail.java near the line reporting the error. The line was
 
 File outf = new File(outPrefix + "-01.ppm");
 
 and the suggestion was to change it to
 
 File outf = new File(outPrefix + "-001.ppm");
 
 Unfortunately, this has not worked for me. Any help would be appreciated.
 
 Thanks
 

Re: [Dspace-tech] XPDF to Thumbnail Preview in DSpace 1.8.2

2012-08-30 Thread helix84
On Fri, Aug 31, 2012 at 4:30 AM, Osama Alkadi osama.alk...@anu.edu.au wrote:
 We upgraded pdftoppm from 3.0 to 3.02 and that fixed the problem.

Thanks for reporting back!

Regards,
~~helix84
