Re: [Dspace-tech] [SPAM] Media filter handling of .docx

2014-04-22 Thread SUZUKI Keiji
Hi Trevor,

I implemented this as a new filter that can extract text from .doc, .docx,
.ppt, .pptx, .xls and .xlsx files.

See my commit log for details at the following URL:

https://github.com/zuki/DSpace/commit/302de5d098cf5a3914498345a0e49ba56b796181
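
As a rough illustration of why .docx needs a different extractor than legacy .doc (this is not the code in the commit above, and it does not use POI): a .docx file is a ZIP archive whose main body lives in word/document.xml, so a minimal extractor only has to collect the text runs. The sketch below does that with the Python standard library. The function names are made up for this example, and the sample archive is a stripped-down stand-in, not a complete Word document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(data: bytes) -> str:
    """Return the plain text of a .docx file, one line per paragraph."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W + "p"):
        # A paragraph is split into runs; each w:t element holds a text fragment.
        paragraphs.append("".join(t.text or "" for t in para.iter(W + "t")))
    return "\n".join(paragraphs)


def make_sample_docx(paragraphs) -> bytes:
    """Build a tiny in-memory .docx-like archive (hypothetical demo helper)."""
    body = "".join(
        "<w:p><w:r><w:t>%s</w:t></w:r></w:p>" % p for p in paragraphs
    )
    document = (
        '<w:document xmlns:w="http://schemas.openxmlformats.org/'
        'wordprocessingml/2006/main"><w:body>%s</w:body></w:document>' % body
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", document)
    return buffer.getvalue()


if __name__ == "__main__":
    sample = make_sample_docx(["full-text search", "second paragraph"])
    print(extract_docx_text(sample))
```

A real media filter would also have to cope with tables, headers, footnotes and the other Office formats, which is what a library such as POI is for.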

Regards,
Keiji Suzuki
Ebetsu, Japan


2014-04-16 23:05 GMT+09:00 Trevor Wilson trevor.wil...@rmc.ca:
 The media filter doesn't currently process .docx files to enable full-text
 search.

 I've found a few mentions of this being a result of it using outdated
 text-mining tools instead of Apache POI for its processing.

 Has anyone rewritten this to make use of POI and got it to full-text-search
 the contents of a .docx?


 Thanks!

 Trevor



 --
 View this message in context: 
 http://dspace.2283337.n4.nabble.com/Media-filter-handling-of-docx-tp4672707.html
 Sent from the DSpace - Tech mailing list archive at Nabble.com.

 --
 Learn Graph Databases - Download FREE O'Reilly Book
 Graph Databases is the definitive new guide to graph databases and their
 applications. Written by three acclaimed leaders in the field,
 this first edition is now available. Download your free book today!
 http://p.sf.net/sfu/NeoTech
 ___
 DSpace-tech mailing list
 DSpace-tech@lists.sourceforge.net
 https://lists.sourceforge.net/lists/listinfo/dspace-tech
 List Etiquette: 
 https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette



-- 
鈴木敬二@江別市

--
Start Your Social Network Today - Download eXo Platform
Build your Enterprise Intranet with eXo Platform Software
Java Based Open Source Intranet - Social, Extensible, Cloud Ready
Get Started Now And Turn Your Intranet Into A Collaboration Platform
http://p.sf.net/sfu/ExoPlatform
___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech
List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette

Re: [Dspace-tech] solr statistics question

2014-04-22 Thread Bram Luyten
Hi Bill,

These are very likely the collection/community logos.

Try comparing the ones you see occurring with the ones that come up if you
query your DB as follows:

SELECT * FROM bitstream WHERE bitstream_id IN (SELECT logo_bitstream_id
FROM collection);
SELECT * FROM bitstream WHERE bitstream_id IN (SELECT logo_bitstream_id
FROM community);
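
To get the list of IDs to compare against, something like the following could build the Solr request for the context-free bitstream records. This is only a sketch: the base URL is an assumption about a local install, and it assumes the statistics core stores the object ID in a field named id. The other field names (type, owningItem, owningColl, owningComm) are the ones mentioned in the question.

```python
from urllib.parse import urlencode


def contextless_bitstream_query(solr_base="http://localhost:8080/solr/statistics"):
    """Build a select URL that facets on the IDs of bitstream hits without context.

    solr_base is an assumed location; adjust it to your own Solr instance.
    """
    # type:0 is a bitstream; "-field:[* TO *]" matches records lacking that field.
    q = (
        "type:0 AND -owningItem:[* TO *] "
        "AND -owningColl:[* TO *] AND -owningComm:[* TO *]"
    )
    params = {
        "q": q,
        "rows": 0,              # only the facet counts are needed
        "facet": "true",
        "facet.field": "id",    # assumed: bitstream ID field in the statistics core
        "facet.mincount": 1,
        "facet.limit": -1,
        "wt": "json",
    }
    return solr_base + "/select?" + urlencode(params)


if __name__ == "__main__":
    print(contextless_bitstream_query())
```

The IDs returned in the facet can then be checked against the bitstream_id values from the two SQL queries above.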

rgds

Bram

-- 
*Bram Luyten* +1 202 684 6365
*2888 Loker Avenue East, Suite 315, Carlsbad, CA. 92010*
*Esperantolaan 4, Heverlee 3001, Belgium*
www.atmire.com



On Thu, Apr 17, 2014 at 10:08 PM, Bill Tantzen wile...@gmail.com wrote:

 I'm on dspace 4.1.

 In my solr statistics data, I have noticed a small percentage (~5%) of
 type:0 (bitstream) records without any context -- no owningComm,
 owningColl, nor owningItem.  Most of these have no bundleName either.

 Can anyone explain to me how these bitstream records are being added
 to solr?  I have tried to reproduce this via the gui in every way I
 can think of, but have not been able to do it...

 Thanks in advance!
 Bill



Re: [Dspace-tech] Communities and Collections in search results

2014-04-22 Thread Anthony Petryk
Thanks for this - very helpful.

Anthony

-----Original Message-----
From: zuki.ebe...@gmail.com [mailto:zuki.ebe...@gmail.com] On Behalf Of SUZUKI 
Keiji
Sent: Sunday, April 20, 2014 2:34 AM
To: Anthony Petryk
Cc: dspace-tech@lists.sourceforge.net
Subject: Re: [Dspace-tech] Communities and Collections in search results

Hi Anthony,

2014-04-08 4:03 GMT+09:00 Anthony Petryk anthony.pet...@uottawa.ca:
 In DSpace 4.0 JSPUI, the search results page includes “hits” from 
 Communities and Collections in addition to Items.  However, some of 
 these Communities and Collections only appear on the second and third 
 page of results (possibly because they’re ranked lower in relevance 
 than some Items).  Is there a way of getting them all to float to the 
 top?  Is it possible to remove them from the results list altogether?

You can remove them by enabling the following property element in both the
default configuration settings for discovery and the homepage-specific
configuration settings for discovery in
[dspace]/config/spring/api/discovery.xml. In the default configuration this
property is already present but commented out.

<property name="defaultFilterQueries">
  <list>
    <value>search.resourcetype:2</value>
  </list>
</property>

This setting removes community and collection results in XMLUI, but a bug in
JSPUI means they are not removed there. I made a PR and created a ticket in
JIRA; please refer to the following URL:

https://jira.duraspace.org/browse/DS-1974

 Also, is it possible to configure which metadata fields are indexed 
 for Communities and Collections?  It seems that the License field is 
 indexed, which we don’t want.

You can omit the license data by adding dc.rights.license as a value of
<property name="toIgnoreMetadataFields">, also in
[dspace]/config/spring/api/discovery.xml, and then reindexing the discovery
index.

The following is a diff with these two settings.

--- discovery.xml.org 2014-04-19 15:55:58.082480988 +0900
+++ discovery.xml 2014-04-19 19:06:37.579309730 +0900
@@ -76,6 +76,8 @@
 <!--<value>dc.description.tableofcontents</value>-->
 <!--Copyright text-->
 <value>dc.rights</value>
+<!--License text-->
+<value>dc.rights.license</value>
 <!--Collection name-->
 <!--<value>dc.title</value>-->
 </list>
@@ -124,12 +126,12 @@
 </bean>
 </property>
 <!--Any default filter queries, these filter queries will be used for all queries done by discovery for this configuration-->
-<!--<property name="defaultFilterQueries">-->
-<!--<list>-->
+<property name="defaultFilterQueries">
+<list>
 <!--Only find items-->
-<!--<value>search.resourcetype:2</value>-->
-<!--</list>-->
-<!--</property>-->
+<value>search.resourcetype:2</value>
+</list>
+</property>
 <!--The configuration for the recent submissions-->
 <property name="recentSubmissionConfiguration">
 <bean class="org.dspace.discovery.configuration.DiscoveryRecentSubmissionsConfiguration">
@@ -212,6 +214,12 @@
 <ref bean="searchFilterIssued" />
 </list>
 </property>
+<property name="defaultFilterQueries">
+<list>
+<!--Only find items-->
+<value>search.resourcetype:2</value>
+</list>
+</property>
 <!--The sort filters for the discovery search (same as defaultConfiguration above)-->
 <property name="searchSortConfiguration">
 <bean class="org.dspace.discovery.configuration.DiscoverySortConfiguration">
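
After editing the file, one way to double-check which configurations actually carry an active (uncommented) defaultFilterQueries property is a small script along these lines. It is a sketch under assumptions: the bean ids in the sample are illustrative, the file is assumed to use the standard Spring beans namespace, and the function name is invented for this example.

```python
import xml.etree.ElementTree as ET

# Standard Spring beans namespace (assumed to be the default namespace of the file).
SPRING_NS = "{http://www.springframework.org/schema/beans}"


def filter_queries_by_bean(xml_text):
    """Map each bean id to the values of its active defaultFilterQueries list."""
    root = ET.fromstring(xml_text)  # XML comments are dropped by the default parser
    found = {}
    for bean in root.iter(SPRING_NS + "bean"):
        for prop in bean.findall(SPRING_NS + "property"):
            if prop.get("name") == "defaultFilterQueries":
                values = [
                    v.text.strip()
                    for v in prop.iter(SPRING_NS + "value")
                    if v.text
                ]
                found[bean.get("id")] = values
    return found


# Illustrative sample: one bean with the property enabled, one with it commented out.
SAMPLE = """<beans xmlns="http://www.springframework.org/schema/beans">
  <bean id="defaultConfiguration">
    <property name="defaultFilterQueries">
      <list>
        <value>search.resourcetype:2</value>
      </list>
    </property>
  </bean>
  <bean id="homepageConfiguration">
    <!--<property name="defaultFilterQueries">...</property>-->
  </bean>
</beans>"""

if __name__ == "__main__":
    print(filter_queries_by_bean(SAMPLE))
```

A bean that is missing from the result still has the property commented out (or absent), so communities and collections would still show up for that configuration.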


Regards,
Keiji Suzuki
Ebetsu, Japan

Re: [Dspace-tech] Discover search update issue

2014-04-22 Thread Terry Brady
Based on your note, it sounds like you applied your update with SQL.

Try running dspace filter-media against the items, or the collection
containing the items, and force the text index to be updated for them.
This will update the Lucene search index.

After this task is complete, update your discovery index to pull the latest
changes into Solr.
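
The two steps above could be scripted roughly as follows. This is a sketch, not a tested procedure: the install path and handle are placeholders, the -f (force) and -i (identifier) options are the filter-media flags as I understand them for DSpace 4.x, and index-discovery is the 4.x name of the Discovery indexing command (earlier releases used a different name), so check the output of the dspace launcher on your own install first.

```python
import subprocess


def reindex_commands(dspace_bin, handle):
    """Return the two command lines, in the order they should run."""
    return [
        # Force the media filter to reprocess the item or collection with this handle.
        [dspace_bin, "filter-media", "-f", "-i", handle],
        # Then refresh the Discovery (Solr) index.
        [dspace_bin, "index-discovery"],
    ]


def run_reindex(dspace_bin, handle):
    """Run both commands, stopping at the first failure."""
    for command in reindex_commands(dspace_bin, handle):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Placeholder path and handle; printed rather than executed here.
    for command in reindex_commands("/dspace/bin/dspace", "123456789/10"):
        print(" ".join(command))
```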

In case you have not tried it out, we have found the Import Metadata tool
on the Administrative menu to be really useful (and less risky) for these
types of changes.  It does a good job of cascading updates to other
components of the system.

Terry


On Mon, Apr 21, 2014 at 7:40 AM, Bhavesh Patel bhavesh.bece...@gmail.com wrote:

 Dear Tech Team,

 We are facing an issue with Discover search.

 We made a mistake while entering an author name.

 In some entries we added "Patel, Bhavesh" and in others "Patel, Bhavesh R".
 The facet shows:
 Discover
 --
 Patel, Bhavesh (10)
 Patel, Bhavesh R (5)

 So we have to merge these into one author, "Patel, Bhavesh".

 I made the change in the database:
 Table : metadatavalue
 Field : textvalue

 I searched for "Patel, Bhavesh R" and changed it to "Patel, Bhavesh",

 and then updated the index, but Discover still shows the old value, while
 the item page shows the updated author name.

 What may be the issue?

 Version: DSpace 4.1, OS: Windows 7


 Thanks & Regards,
 *Bhavesh R. Patel *

- www.bhaveshpatel.info
- www.onlinequizportal.com
- www.hindisuvichar.com

 Never leave till tomorrow which you can do today
 *Please consider the environment before printing this e-mail.*






-- 
Terry Brady
Applications Programmer Analyst
Georgetown University Library Information Technology
https://www.library.georgetown.edu/lit/code
202-687-7053

[Dspace-tech] import/export of metadata registries?

2014-04-22 Thread Charlene Chinda Barina
Hi,

Was wondering if there is a specific file or import system for metadata
registries; we were doing a test and have a lot of the other dspace
customizations set up to merge easily into the relevant installation, but
haven't found one re: registries.

Thanks,
Charlene

-- 
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey

Re: [Dspace-tech] import/export of metadata registries?

2014-04-22 Thread Charlene Chinda Barina
Never mind; I believe I should look here for the information:
https://wiki.duraspace.org/display/DSDOC4x/Configuration+Reference#ConfigurationReference-MetadataFormatRegistries


On Tue, Apr 22, 2014 at 4:28 PM, Charlene Chinda Barina cbar...@uw.edu wrote:

 Hi,

 Was wondering if there is a specific file or import system for metadata
 registries; we were doing a test and have a lot of the other dspace
 customizations set up to merge easily into the relevant installation, but
 haven't found one re: registries.

 Thanks,
 Charlene

 --
 Charlene Barina, MPH
 Research Analyst 2, U.S. IMPACT Study
 The Information School
 303-359-6347 | Skype: cbarina
 facebook.com/ImpactSurvey | twitter.com/impactsurvey




-- 
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey

[Dspace-tech] Setting default home page to a single community

2014-04-22 Thread Charlene Chinda Barina
Hi,

We've set up DSpace to have one community and 4 collections under it.
Rather than list the "Communities in X Repository" on the home page, I'd
just like to show the links to the 4 collections, like what you see when
you click on the individual community.

Could someone tell me how to change that, and where? Conceptually I could
see it either being changing the default url/page for the homepage, or
changing what is displayed on the homepage via a particular xsl file.

Thanks,
Charlene

-- 
Charlene Barina, MPH
Research Analyst 2, U.S. IMPACT Study
The Information School
303-359-6347 | Skype: cbarina
facebook.com/ImpactSurvey | twitter.com/impactsurvey