Re: [Dspace-tech] [SPAM] Media filter handling of .docx
Hi Trevor,

I implemented this as a new filter that can extract text from .doc, .docx, .ppt, .pptx, .xls and .xlsx files. See my commit log for details at the following URL:

https://github.com/zuki/DSpace/commit/302de5d098cf5a3914498345a0e49ba56b796181

Regards,
Keiji Suzuki
Ebetsu, Japan
鈴木敬二@江別市

2014-04-16 23:05 GMT+09:00 Trevor Wilson trevor.wil...@rmc.ca:

> The media filter doesn't currently process .docx files to enable full-text search. I've found a few mentions of this being a result of it using outdated text-mining tools instead of Apache's POI for its processing.
>
> Has anyone rewritten this to make use of POI and got it to full-text-search the contents of a .docx?
>
> Thanks!
> Trevor
>
> View this message in context: http://dspace.2283337.n4.nabble.com/Media-filter-handling-of-docx-tp4672707.html

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
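As background for this thread: a .docx file is a ZIP package whose main text lives in word/document.xml, whereas the older .doc format is a binary container, so an extractor written for .doc cannot read it. The Python sketch below only illustrates that layout. It is not Keiji's filter and does not use Apache POI; the helper names `extract_docx_text` and `make_sample_docx` are hypothetical, and the sample package is a stripped-down stand-in rather than a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml inside a .docx package.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> str:
    """Return the paragraph text of a .docx file given its raw bytes."""
    with zipfile.ZipFile(io.BytesIO(data)) as package:
        root = ET.fromstring(package.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Each paragraph holds runs whose text lives in <w:t> elements.
        paragraphs.append("".join(t.text or "" for t in para.iter(W_NS + "t")))
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs) -> bytes:
    """Build a minimal, hypothetical .docx-like package for demonstration."""
    ns = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
    body = "".join(f"<w:p><w:r><w:t>{p}</w:t></w:r></w:p>" for p in paragraphs)
    xml = f'<w:document xmlns:w="{ns}"><w:body>{body}</w:body></w:document>'
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as package:
        package.writestr("word/document.xml", xml)
    return buf.getvalue()


sample = make_sample_docx(["Full-text search", "works on .docx"])
print(extract_docx_text(sample))
```

A production filter would also need to handle headers, footers, tables and the spreadsheet and presentation formats, which is the kind of work a library such as POI takes on.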
Re: [Dspace-tech] solr statistics question
Hi Bill,

These are very likely the collection/community logos. Try comparing the ones you see occurring with the ones that come up if you query your DB as follows:

SELECT * FROM bitstream WHERE bitstream_id IN (SELECT logo_bitstream_id FROM collection);
SELECT * FROM bitstream WHERE bitstream_id IN (SELECT logo_bitstream_id FROM community);

rgds
Bram

--
Bram Luyten
+1 202 684 6365
2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010
Esperantolaan 4, Heverlee 3001, Belgium
www.atmire.com

On Thu, Apr 17, 2014 at 10:08 PM, Bill Tantzen wile...@gmail.com wrote:

> I'm on dspace 4.1. In my solr statistics data, I have noticed a small percentage (~5%) of type:0 (bitstream) records without any context -- no owningComm, owningColl, nor owningItem. Most of these have no bundleName either.
>
> Can anyone explain to me how these bitstream records are being added to solr? I have tried to reproduce this via the gui in every way I can think of, but have not been able to do it...
>
> Thanks in advance!
> Bill
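To collect the bitstream IDs to compare against the logo queries, the context-less records Bill describes can be selected with a Solr query that requires type:0 and excludes any record where an owning field is present. The Python sketch below only builds the request URL. The field names type, owningItem, owningColl and owningComm come from the question; the core URL, the `id` field used for the bitstream ID, and the function names are assumptions to adapt to your installation.

```python
from urllib.parse import urlencode

# Hypothetical location of the statistics core; adjust to your installation.
SOLR_STATISTICS_URL = "http://localhost:8080/solr/statistics/select"


def build_query() -> str:
    """Solr query for bitstream records that carry no owning context."""
    return " AND ".join([
        "type:0",                 # bitstream records, as in the question
        "-owningItem:[* TO *]",   # exclude records where the field is present
        "-owningColl:[* TO *]",
        "-owningComm:[* TO *]",
    ])


def build_url(rows: int = 100) -> str:
    """Full request URL; 'id' as the bitstream ID field is an assumption."""
    params = {"q": build_query(), "fl": "id", "rows": rows, "wt": "json"}
    return SOLR_STATISTICS_URL + "?" + urlencode(params)


print(build_query())
print(build_url())
```

The IDs returned can then be checked against the bitstream_id values produced by the two SQL queries in Bram's reply.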
Re: [Dspace-tech] Communities and Collections in search results
Thanks for this - very helpful.

Anthony

-----Original Message-----
From: zuki.ebe...@gmail.com [mailto:zuki.ebe...@gmail.com] On Behalf Of SUZUKI Keiji
Sent: Sunday, April 20, 2014 2:34 AM
To: Anthony Petryk
Cc: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Communities and Collections in search results

Hi Anthony,

2014-04-08 4:03 GMT+09:00 Anthony Petryk anthony.pet...@uottawa.ca:

> In DSpace 4.0 JSPUI, the search results page includes “hits” from Communities and Collections in addition to Items. However, some of these Communities and Collections only appear on the second and third page of results (possibly because they’re ranked lower in relevance than some Items). Is there a way of getting them all to float to the top? Is it possible to remove them from the results list altogether?

You can remove them by adding the following property element to both the default configuration settings for discovery and the homepage-specific configuration settings for discovery in [dspace]/config/spring/api/discovery.xml. In the default configuration this property is already present but commented out.

<property name="defaultFilterQueries">
<list>
<value>search.resourcetype:2</value>
</list>
</property>

This setting removes community and collection results in XMLUI, but there is a bug in JSPUI and it does not remove them there. I made a PR and created a ticket in JIRA. Please refer to the following URL:

https://jira.duraspace.org/browse/DS-1974

> Also, is it possible to configure which metadata fields are indexed for Communities and Collections? It seems that the License field is indexed, which we don’t want.

You can omit the license data by adding dc.rights.license as a value of <property name="toIgnoreMetadataFields">, also in [dspace]/config/spring/api/discovery.xml, and then reindexing the discovery index.

The following is a diff with these two settings.

--- discovery.xml.org 2014-04-19 15:55:58.082480988 +0900
+++ discovery.xml 2014-04-19 19:06:37.579309730 +0900
@@ -76,6 +76,8 @@
 <!--<value>dc.description.tableofcontents</value>-->
 <!--Copyright text-->
 <value>dc.rights</value>
+<!--License text-->
+<value>dc.rights.license</value>
 <!--Collection name-->
 <!--<value>dc.title</value>-->
 </list>
@@ -124,12 +126,12 @@
 </bean>
 </property>
 <!--Any default filter queries, these filter queries will be used for all queries done by discovery for this configuration-->
-<!--<property name="defaultFilterQueries">-->
-<!--<list>-->
+<property name="defaultFilterQueries">
+<list>
 <!--Only find items-->
-<!--<value>search.resourcetype:2</value>-->
-<!--</list>-->
-<!--</property>-->
+<value>search.resourcetype:2</value>
+</list>
+</property>
 <!--The configuration for the recent submissions-->
 <property name="recentSubmissionConfiguration">
 <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
@@ -212,6 +214,12 @@
 <ref bean="searchFilterIssued" />
 </list>
 </property>
+<property name="defaultFilterQueries">
+<list>
+<!--Only find items-->
+<value>search.resourcetype:2</value>
+</list>
+</property>
 <!--The sort filters for the discovery search (same as defaultConfiguration above)-->
 <property name="searchSortConfiguration">
 <bean class="org.dspace.discovery.configuration.DiscoverySortConfiguration">

Regards,
Keiji Suzuki
Ebetsu, Japan
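Because the filter has to be active in both configuration beans, it can be worth checking the edited file before reindexing. The Python sketch below parses a Spring beans file and reports, per discovery configuration bean, which defaultFilterQueries values are active; commented-out properties are ignored by the parser. The bean ids `defaultConfiguration` and `homepageConfiguration`, the class name used to pick out configuration beans, and the trimmed sample document are assumptions standing in for a real discovery.xml.

```python
import xml.etree.ElementTree as ET

BEANS_NS = "{http://www.springframework.org/schema/beans}"
# Assumed class of the discovery configuration beans.
CONFIG_CLASS = "org.dspace.discovery.configuration.DiscoveryConfiguration"

# Trimmed, hypothetical stand-in for [dspace]/config/spring/api/discovery.xml.
SAMPLE = """<beans xmlns="http://www.springframework.org/schema/beans">
  <bean id="defaultConfiguration"
        class="org.dspace.discovery.configuration.DiscoveryConfiguration">
    <property name="defaultFilterQueries">
      <list><value>search.resourcetype:2</value></list>
    </property>
  </bean>
  <bean id="homepageConfiguration"
        class="org.dspace.discovery.configuration.DiscoveryConfiguration">
    <!--<property name="defaultFilterQueries">-->
  </bean>
</beans>"""


def default_filter_queries(xml_text: str) -> dict:
    """Map each configuration bean id to its active defaultFilterQueries."""
    root = ET.fromstring(xml_text)  # XML comments are dropped while parsing
    result = {}
    for bean in root.findall(BEANS_NS + "bean"):
        if bean.get("class") != CONFIG_CLASS:
            continue  # skip filter, sort and other helper beans
        result[bean.get("id")] = [
            value.text
            for prop in bean.findall(BEANS_NS + "property")
            if prop.get("name") == "defaultFilterQueries"
            for value in prop.iter(BEANS_NS + "value")
        ]
    return result


for bean_id, queries in default_filter_queries(SAMPLE).items():
    print(bean_id, queries or "no active default filter query")
```

To check a real installation, read the file contents and pass them to `default_filter_queries` in place of `SAMPLE`.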
Re: [Dspace-tech] Discover search update issue
Based on your note, it sounds like you applied your update with SQL.

Try running dspace filter-media against the items or the collection containing the items, and force the text index to be updated for these items. This will update the lucene/search index. After this task is complete, update your discovery index to pull the latest changes into SOLR.

In case you have not tried it out, we have found the Import Metadata tool on the Administrative menu to be really useful (and less risky) for these types of changes. It does a good job of cascading updates to other components of the system.

Terry

--
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
202-687-7053

On Mon, Apr 21, 2014 at 7:40 AM, Bhavesh Patel bhavesh.bece...@gmail.com wrote:

> Dear Tech Team,
>
> We are facing an issue with Discover search. We made a mistake while entering an author name: in some entries we added "Patel, Bhavesh" and in others "Patel, Bhavesh R". Discover shows:
>
> Patel, Bhavesh (10)
> Patel, Bhavesh R (5)
>
> We want to merge these into one author, "Patel, Bhavesh".
>
> I made the change in the database (table: metadatavalue, field: textvalue): I searched for "Patel, Bhavesh R", changed it to "Patel, Bhavesh", and then updated the index. Discover still shows the old value, but the item page shows the updated author name.
>
> What may be the issue?
>
> Version: DSpace 4.1
> OS: Windows 7
>
> Thanks & Regards,
> Bhavesh R. Patel
[Dspace-tech] import/export of metadata registries?
Hi,

Was wondering if there is a specific file or import system for metadata registries; we were doing a test and have a lot of the other dspace customizations set up to merge easily into the relevant installation, but haven't found one re: registries.

Thanks,
Charlene

--
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey
Re: [Dspace-tech] import/export of metadata registries?
Never mind; I believe I should look here for the information:

https://wiki.duraspace.org/display/DSDOC4x/Configuration+Reference#ConfigurationReference-MetadataFormatRegistries

On Tue, Apr 22, 2014 at 4:28 PM, Charlene Chinda Barina cbar...@uw.edu wrote:

> Hi,
>
> Was wondering if there is a specific file or import system for metadata registries; we were doing a test and have a lot of the other dspace customizations set up to merge easily into the relevant installation, but haven't found one re: registries.
>
> Thanks,
> Charlene

--
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey
[Dspace-tech] Setting default home page to a single community
Hi,

We've set up dspace to have one community and 4 collections under it. Rather than list the Communities in X Repository on the home page, I'd just like to show the links to the 4 collections, like what you see when you click on the individual community.

Could someone tell me how to change that, and where? Conceptually I could see it either being changing the default url/page for the homepage, or changing what is displayed on the homepage via a particular xsl file.

Thanks,
Charlene

--
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey