Re: [Dspace-tech] About Java Unknown failure when filtering media with XPDF
Hi Antonio,

In your [dspace]/config/dspace.cfg file, what do you have set for the 'filter.plugins' setting? The error you sent suggests that there is no 'filter.plugins' set in your dspace.cfg. A typical value would be:

    filter.plugins = PDF Text Extractor, HTML Text Extractor, \
                     PowerPoint Text Extractor, \
                     Word Text Extractor, JPEG Thumbnail

Thanks,

Stuart Lewis
Digital Development Manager
Te Tumu Herenga The University of Auckland Library
Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
Ph: +64 (0)9 373 7599 x81928

On 26/09/2011, at 6:33 PM, Antonio Calderón wrote:

> Hi Scott,
>
> I followed the instructions, but get this:
>
>     Exception: null
>     java.lang.NullPointerException
>         at org.dspace.app.mediafilter.MediaFilterManager.main(MediaFilterManager.java:214)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.dspace.app.launcher.ScriptLauncher.main(ScriptLauncher.java:183)
>
> Thank you very much for your help,
> A.
>
> 2011/9/23 Scott Thurston scott.thurs...@noaa.gov:
>
>> Hello Antonio,
>>
>> Are you able to install your own JDK and configure your environment to use it to build, install and run DSpace? Your $JAVA_HOME should point to the new JDK.
>>
>> If you have downloaded the jai_imageio JAR file for your system, you can unzip it to extract the installer jai_imageio*.bin. The installer is a shell script that contains the binary installer image at the end of the file. Search the installer for the tail command, which looks like this in my installer:
>>
>>     tail +215 $0 > $outname
>>
>> I was able to get the installer to work by following these steps:
>>
>>     1. tail -n +215 jai_imageio-1_1-lib-linux-amd64-jdk.bin > $JAVA_HOME/jai_installer
>>     2. cd $JAVA_HOME
>>     3. ./jai_installer
>>
>> Those steps should install the JAI ImageIO library in your JDK. I hope that is helpful.
>>
>> Regards,
>> Scott
>>
>> On 9/22/2011 10:27 PM, neocalde...@gmail.com wrote:
>>
>>> Hi, how do you do? Sorry to contact you directly. About:
>>>
>>>     UPDATE: I resolved the problem this afternoon. The solution is to install the JAI ImageIO library in the JDK itself. In my case I do not have permissions to update the system's JDK, so I installed my own JDK and then installed the ImageIO library there. I rebuilt DSpace using my own JDK and verified that media filtering now produces thumbnail images for a PDF file.
>>>
>>> Do you have any guidance or howto? Thank you very much for your response.
>>
>> --
>> Scott Thurston             scott.thurs...@noaa.gov
>> NOAA / NGDC / WDC          http://www.ngdc.noaa.gov/
>> Marine Geology Geophysics  303-497-4411 (phone)
>> 325 Broadway E/GC3         303-497-6513 (fax)
>> Boulder, CO 80305-3337
>
> --
> Antonio Calderón - Calderón Cardona Ltda.
> http://calderoncardona.com | http://ventura-systems.net
> *Proudly running Debian GNU/Linux.

--
All the data continuously generated in your IT infrastructure contains a definitive record of customers, application performance, security threats, fraudulent activity and more. Splunk takes this data and makes sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
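Editorial sketch: the extraction step Scott describes (copying everything from a given line of the self-extracting installer into a new file) can also be done without `tail`. The Python below is only an illustration of that idea, under stated assumptions: `extract_payload` is a hypothetical helper name, and the line number 215 comes from Scott's own installer, so other installer versions may need a different value.

```python
from pathlib import Path


def extract_payload(installer, output, first_line):
    """Copy everything from line `first_line` (1-based) to the end of the
    file, roughly what `tail -n +N installer > output` does.

    Works on bytes, because the payload after the shell header is binary.
    Returns the number of bytes written.
    """
    data = Path(installer).read_bytes()
    offset = 0
    # Skip the first (first_line - 1) newline-terminated lines.
    for _ in range(first_line - 1):
        newline = data.find(b"\n", offset)
        if newline == -1:
            # Fewer lines than requested: nothing left to copy.
            offset = len(data)
            break
        offset = newline + 1
    Path(output).write_bytes(data[offset:])
    return len(data) - offset


# Hypothetical usage, mirroring step 1 of Scott's instructions:
# extract_payload("jai_imageio-1_1-lib-linux-amd64-jdk.bin",
#                 "jai_installer", 215)
```

A file written this way is not marked executable, so it may still need `chmod +x` before step 3 can run it.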
[Dspace-tech] DCAT Update: Collaborating on New DSpace Features
~Apologies for the cross posting~

Ithaca, NY

The success of any open-source project lies with the community contributing its collective energy, knowledge, enthusiasm, and effort. In the DSpace community, valuable contributions come not just from our numerous volunteer developers and committers, but also from a group known as the DSpace Community Advisory Team, or DCAT. The primary goals of DCAT are to help review and facilitate community discussions about new feature requests and to provide support to the DSpace committer group in producing software releases.

New Feature Review

Since the beginning of the year, DCAT has held detailed discussions on a half dozen new feature requests from JIRA. The discussions started asynchronously on the DCAT Discussion Forum, where the new feature requests were discussed and specific requirements outlined. DCAT members also recorded their vote on the priority level and how broadly they believed the feature would appeal to the larger community. Once a request was determined to be high priority/broad appeal, there would also be a discussion about it in one of the weekly developer meetings. Additional DCAT status discussions occurred during the monthly DCAT meetings, which Robin Taylor, the 1.8 Release Coordinator, and Tim Donohue, DSpace Tech Lead, attended.

DCAT/Committers/Developers 1.8 Collaboration

Thanks to everyone’s efforts, particularly the committers and developers, we are very pleased to announce that the DCAT/DSpace developer collaboration will bear fruit in the upcoming DSpace 1.8 release.
Three new features are, in part, a result of this collaboration:

  * DS-749 Reordering of bitstreams, contributed by Kevin Van de Velde from @mire, DCAT discussion leader Jennifer Laherty from Indiana University
  * DS-638 Enable virus checking during submission, contributed by Robin Taylor from the University of Edinburgh, DCAT discussion leader Elin Stangeland from Cambridge University Library

Additionally, at the request of the committers, DCAT members also consulted on the improvements to the bulk CSV editing (the feature also known as Batch Metadata Editing).

  * DS-811 Delete/withdraw items via bulk CSV editing, contributed by Stuart Lewis from the University of Auckland, feedback provided by DCAT members

For more information about these and other new features in 1.8 please visit https://wiki.duraspace.org/display/DSPACE/DSpace+Release+1.8.0+Notes.

Other DCAT Efforts

DCAT has also been working on a community survey to find out what type of improvements users would like to see for metadata support in DSpace. The survey will mark the beginning of the DCAT/committer effort to evolve the types of metadata schemas available as well as ease customization. The community metadata survey will be sent out in the next few weeks and will be used to inform efforts for improvements on future releases of DSpace.

For more information about DCAT, please visit the wiki. If you would like to learn more about how to get involved, please contact Valorie Hollister at vhollis...@duraspace.org.
[Dspace-tech] dsrun script to remove items en masse from a collection
Wolf,

There is no script or Java class in the DSpace distribution that does what you describe. ItemUpdate lets you remove metadata or bitstreams, but not the whole item. If you are running DSpace 1.7, you could write a curation task to delete items based on collection and dates. Alternatively, you could write to me off-line and I can share an ItemMover class that can easily be adapted to delete.

--Bill

> I have used dsrun org.dspace.app.itemimport etc., but is there a similar script to remove multiple items from the db based on collection and (optimally) dates ingested? Are there any docs about the possible cli scripts?
>
> Wolf

--
William Hays
Software Development Analysis
MIT Libraries
E25-131
617.324.5682 (phone)
wh...@mit.edu
Re: [Dspace-tech] Localization Nightmare
Thank you for your rant. :-) While I don't enjoy learning that something must be fixed, I do appreciate a well-thought-out complaint. I am too often dismayed to learn that someone has long been suffering in silence a problem that is easily corrected.

There is much to think about here, and I probably won't touch every point. I'm going to try to explain how things are done, as a starting point only. I do think we ought to be able to make localization easier.

First: DSpace is using localization facilities provided by the underlying software.

o  JSPUI uses Java's native PropertyResourceBundle class and is bound by its behavior. See http://download.oracle.com/javase/6/docs/api/java/util/ResourceBundle.html#getBundle(java.lang.String,%20java.util.Locale,%20java.lang.ClassLoader) for the gory details of how a particular localization is selected from those available.

o  XMLUI uses the Cocoon I18nTransformer class, which follows a search pattern similar to that of ResourceBundle, except that it does not search for .class files. It's not immediately clear to me why someone invented an XML profile which duplicates property files, but that's the way I18nTransformer was made. Cocoon documentation is in a sorry state, and I don't know of a good link for this. _Cocoon Developer's Handbook_ (Moczar, Aston 2003) p. 303-4, Configuring Message Catalogs, describes it a bit.

o  The commandline tools use PropertyResourceBundle, but have a different classpath than JSPUI and so may have access to a different set of resources.

o  I suppose that JSPUI's messages are in dspace-api.jar for historical reasons: when there was only one UI, it made sense to put the messages all in one place.

Anyway: the behavior you are seeing comes from the supporting software. Every message catalog has a parent catalog which is the next less-specific locale -- fr_CA has fr as its parent, for example, and fr has the root locale (e.g. messages.xml) as its parent.
If a given key is not found in the most-specific existing catalog, it is searched for by going up the chain of parent catalogs. So, if key X is sought, and it is found in messages_de, messages will not be consulted. To make an alteration in the smallest number of places, you need to find all of the most-specific catalogs which define that key within the set of locales for which you wish to modify the text.

Your example of keys in messages_en.xml being preferred over those in messages.xml (if the user's request is in an English locale) demonstrates this. Assume that the user's request is in the en_GB_Cockney locale. en is more specific than the root locale, so if the key exists in en then it will be used. If there were an en_GB containing the key, it would use that text, and if there were an en_GB_Cockney containing the key then it would prefer that text.

DSpace is not doing any of this; it's done by the JRE or by Cocoon. (That doesn't excuse us from trying to avoid making things even more complex and difficult, or from documenting well the complexities required by our choices.)

A number of DSpace's components have their own catalogs. Expect to see more of this -- there is activity to loosen the coupling among components to the point that they can be released on separate schedules, and this will be facilitated by providing for a separate catalog for each component.

I seem to recall that there is a way to configure XMLUI's default request locale, but it's different from JSPUI's way. I don't know the details. Apparently we could do a better job of documenting it.

Defaulting the request locale is yet another dimension of the complexity of localization. What this defaulting does is to prevent ever *starting* at the root locale. Any user who does not specify a locale gets the default, so if your site is set to insert a default de locale then that is where such requests start. DSpace could still search down to the root catalog if the key isn't in de.
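Editorial sketch: the per-key search along the chain of parent catalogs described above can be illustrated in a few lines of Python. This is only a simplified model, not the JRE's or Cocoon's actual algorithm; in particular, Java's ResourceBundle may also try the JVM's default locale before reaching the root, which this sketch leaves out. The function names and the sample catalogs are invented for illustration, with the root catalog (messages.xml in the discussion) stored under the empty string.

```python
def candidate_locales(locale):
    """Return catalog names to try, most specific first, ending with the
    root catalog (represented here by the empty string)."""
    parts = locale.split("_") if locale else []
    chain = ["_".join(parts[:i]) for i in range(len(parts), 0, -1)]
    chain.append("")  # the root catalog is always the last resort
    return chain


def lookup(key, locale, catalogs):
    """Look up one message key, broadening the locale until a catalog
    defines it. Falls back to the key itself when nothing defines it."""
    for name in candidate_locales(locale):
        catalog = catalogs.get(name)
        if catalog is not None and key in catalog:
            return catalog[key]
    return key


# Invented sample catalogs: root, en, and de.
catalogs = {
    "": {"title": "Home", "logout": "Log out"},
    "en": {"title": "Home page"},
    "de": {"title": "Startseite"},
}

print(candidate_locales("en_GB_Cockney"))
print(lookup("title", "en_GB_Cockney", catalogs))
print(lookup("logout", "de", catalogs))
```

Note that the search restarts from the most specific locale for every key, which is why a key defined in en hides the same key in the root catalog even though other keys still come from the root.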
messages.xml isn't a default catalog so much as it is a catch-all to try before giving up and presenting the key itself instead of a message text.

It's important to recall that the thing being looked up is a specific key. A given request is associated with a locale which is tried and then repeatedly broadened *for each message key presented*. The localization mechanism will search the whole path each time a message text is wanted, until it finds one or runs out of places to look.

[a rant of my own]

It's my thought that we need to ensure that there is a place, or a well-defined and well-documented sequence of places, which appear early in the classpath for *every* application within DSpace, into which one may put overriding versions of message catalogs for *any or all* DSpace components. Message texts have no DSpace-defined behavior, so it should not be necessary to rebuild or even reassemble any part of DSpace in order to provide additional localizations or site-specific rewording of any message. One should be able to
Re: [Dspace-tech] Localization Nightmare
On Mon, Sep 26, 2011 at 17:51, Mark H. Wood mw...@iupui.edu wrote:

> customization by each site. Shouldn't they all just go into:
>
>   [DSpace]
>     /config
>       /messages
>         /jspui.properties
>         /xmlui.xml
>         /discovery.xml
>         /api.properties
>         .
>         .
>         .

I'm all for it. Did you consider how getting updated localization files from the dspace repository would work in this case? Users should be able to just build a new version and get corresponding (up-to-date) message catalogs instead of remembering to update them as part of configuration.

Regards,
~~helix84
Re: [Dspace-tech] About Java Unknown failure when filtering media with XPDF
Scott, how are you?

My configuration:

    # maximum width and height of generated thumbnails
    thumbnail.maxwidth = 80
    thumbnail.maxheight = 80

    #XPDF
    xpdf.path.pdftotext = /usr/bin/pdftotext
    xpdf.path.pdftoppm = /usr/bin/pdftoppm
    xpdf.path.pdfinfo = /usr/bin/pdfinfo

    plugin.named.org.dspace.app.mediafilter.FormatFilter = \
        org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor, \
        org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail, \
        org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
        org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
        org.dspace.app.mediafilter.PowerPointFilter = PowerPoint Text Extractor, \
        org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
        org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG

    filter.org.dspace.app.mediafilter.XPDF2Text.inputFormats = Adobe PDF
    filter.org.dspace.app.mediafilter.XPDF2Thumbnail.inputFormats = Adobe PDF
    filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats = HTML, Text
    filter.org.dspace.app.mediafilter.WordFilter.inputFormats = Microsoft Word
    filter.org.dspace.app.mediafilter.PowerPointFilter.inputFormats = Microsoft Powerpoint, Microsoft Powerpoint XML
    filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats = BMP, GIF, JPEG, image/png
    filter.org.dspace.app.mediafilter.BrandedPreviewJPEGFilter.inputFormats = BMP, GIF, JPEG, image/png

    #Names of the enabled MediaFilter or FormatFilter plugins
    filter.plugins = PDF Thumbnail, PDF Text Extractor, HTML Text Extractor, PowerPoint Text Extractor, Word Text Extractor, JPEG Thumbnail

After correction, another error:

    ./dspace filter-media
    ERROR: Unknown MediaFilter specified (either from command-line or in dspace.cfg): 'PDF Thumbnail'

Thank you,
A.

2011/9/26 Stuart Lewis s.le...@auckland.ac.nz:

> Hi Antonio,
>
> In your [dspace]/config/dspace.cfg file, what do you have set for the 'filter.plugins' setting? The error you sent suggests that there is no 'filter.plugins' set in your dspace.cfg. A typical value would be:
>
>     filter.plugins = PDF Text Extractor, HTML Text Extractor, \
>                      PowerPoint Text Extractor, \
>                      Word Text Extractor, JPEG Thumbnail
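Editorial sketch: one simple cause of an "Unknown MediaFilter" error is a name listed in 'filter.plugins' that has no matching declaration under 'plugin.named.org.dspace.app.mediafilter.FormatFilter'. The Python below checks for exactly that mismatch. It is a rough illustration and not DSpace's own parser: it assumes the simplified properties syntax visible in the message (backslash continuations, '#' comments, 'class = name' entries separated by commas), and all function names are hypothetical.

```python
FILTER_KEY = "plugin.named.org.dspace.app.mediafilter.FormatFilter"


def logical_lines(text):
    """Join physical lines that end in a backslash into one logical line."""
    lines, pending = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.endswith("\\"):
            pending += line[:-1].strip() + " "
        else:
            lines.append((pending + line).strip())
            pending = ""
    if pending:
        lines.append(pending.strip())
    return lines


def parse_properties(text):
    """Parse 'key = value' pairs, skipping blank lines and '#' comments."""
    props = {}
    for line in logical_lines(text):
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def declared_names(value):
    """Collect plugin names from 'class = name, class = name, ...'."""
    names = set()
    for chunk in value.split(","):
        chunk = chunk.strip()
        if "=" in chunk:
            chunk = chunk.split("=", 1)[1].strip()
        if chunk:
            names.add(chunk)
    return names


def undeclared_filters(text):
    """Return enabled filter names that have no named-plugin declaration."""
    props = parse_properties(text)
    declared = declared_names(props.get(FILTER_KEY, ""))
    enabled = [n.strip() for n in props.get("filter.plugins", "").split(",")]
    return [n for n in enabled if n and n not in declared]


# Invented sample in which 'PDF Thumbnail' is enabled but never declared.
SAMPLE = r"""
plugin.named.org.dspace.app.mediafilter.FormatFilter = \
    org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor, \
    org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail
filter.plugins = PDF Thumbnail, PDF Text Extractor, JPEG Thumbnail
"""

print(undeclared_filters(SAMPLE))
```

As far as the flattened text shows, every name enabled in the configuration posted above does have a declaration, so a check like this would come back empty there and the cause of the error would have to be sought elsewhere.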
Re: [Dspace-tech] Accessing non-DC fields with XSLT Crosswalk
Thanks, Brian. Turns out it had nothing to do with that element at all. I had a problem further up in my stylesheet.

Jason

Jason Stirnaman
Biomedical Librarian, Digital Projects
A.R. Dykes Library, University of Kansas Medical Center
jstirna...@kumc.edu
913-588-7319

On 9/26/2011 at 11:16 AM, in message 4e8050f202010010c...@gwdomain.unm.edu, Brian Freels-Stendel bfre...@unm.edu wrote:

> Hi Jason,
>
> It looks like your match should work (although it should only be necessary to specify the mdschema if the element name also appears in other schemas).
>
> I'm wondering about the value-of select, though. Does anything come out if you use 'select="."'? Or, perhaps try a different form and use '<xsl:copy-of select="./node()"/>'? (I'm not incredible at XSLT, but I'm not seeing where 'present element' is being supplied in that select.)
>
> B--
>
> On 9/23/2011 at 3:54 PM, in message 4e7cb9d9020501558...@smtpout.kumc.edu, Jason Stirnaman jstirna...@kumc.edu wrote:
>
>> I have an XSLT crosswalk. How do I access a metadata field from a different schema in my stylesheet? For example, here's my non-DC field in mets.xml:
>>
>>     <dim:field element="spage" language="en_US" mdschema="rft">1</dim:field>
>>
>> However, in my stylesheet, applying the following template doesn't output anything:
>>
>>     <xsl:template match="dim:field[@element='spage' AND @mdschema='rft']">
>>       <xsl:element name="FirstPage">
>>         <xsl:value-of select="concat('E',text())"/>
>>       </xsl:element>
>>     </xsl:template>
>>
>> Thanks,
>> Jason
>>
>> Jason Stirnaman
>> Biomedical Librarian, Digital Projects
>> A.R. Dykes Library, University of Kansas Medical Center
>> jstirna...@kumc.edu
>> 913-588-7319
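Editorial sketch: the selection the template is after (a field whose element is 'spage' and whose mdschema is 'rft', with 'E' prefixed to its text) can be tried outside the stylesheet. The Python below uses the standard library's ElementTree, which supports only a subset of XPath, so the two attribute tests are written as chained predicates rather than with a boolean operator. In full XPath the operator is the lowercase keyword `and`; XPath keywords are case-sensitive. The sample document, the wrapper element and the DIM namespace URI are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the DIM prefix; adjust to match the real input.
DIM_NS = {"dim": "http://www.dspace.org/xmlns/dspace/dim"}

# Invented sample: one DC field and the non-DC field from the question.
SAMPLE = """<dim:dim xmlns:dim="http://www.dspace.org/xmlns/dspace/dim">
  <dim:field element="title" mdschema="dc" language="en_US">An article</dim:field>
  <dim:field element="spage" mdschema="rft" language="en_US">1</dim:field>
</dim:dim>"""


def first_page(xml_text):
    """Return 'E' + the text of the rft 'spage' field, or None if absent."""
    root = ET.fromstring(xml_text)
    # Chained predicates: both attribute tests must hold for a match.
    field = root.find("dim:field[@element='spage'][@mdschema='rft']", DIM_NS)
    if field is None:
        return None
    return "E" + (field.text or "")


print(first_page(SAMPLE))
```

A `None` result here would point at the selection itself (namespace, attribute names or values), while a correct result suggests looking elsewhere in the stylesheet, which is where the problem turned out to be in this thread.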