Hello.

Opinions on this were requested.

I agree completely with Richard Jizba.

John

John Schumacher
Office of Library and Information Services
SUNY System Administration
SUNY Plaza
Albany, NY 12246
518-320-1477 (Note, new number!)
518-320-1554 (fax)
john.schumac...@suny.edu 
SUNY Digital Repository
http://dspace.sunyconnect.suny.edu/


==== Philosophical Discussion ==== 

I am a little surprised that the DSpace community thinks stemming like
that done by the Porter Stemming Algorithm is so important. I have been
searching bibliographic databases since the early 1980s and teach
courses to our health sciences students on search techniques. We have
always appreciated the systems that give us the power to find exactly
the terms and the combinations we want. Language is just too rich and
varied for any other approach in my experience. There have been many
times when I have needed to search for a singular form of a noun vs a
plural form or vice versa. Using truncation and wildcard operators is
not rocket science. Lucene has some really powerful search operators,
but their power is basically nullified by the Stemming operation. 
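
To illustrate the point about truncation and wildcard operators, here is a
minimal sketch in Python. It is not Lucene code: the term list is made up for
illustration, and the standard library's fnmatch module stands in for a
wildcard query parser, with * as truncation and ? as a single character.

```python
from fnmatch import fnmatchcase

# Hypothetical unstemmed index terms (illustrative only, not from a real collection).
TERMS = ["nurse", "nurses", "nursing", "nursery"]


def search(pattern, terms=TERMS):
    """Return the terms matching a pattern; * is truncation, ? is exactly one character."""
    return [t for t in terms if fnmatchcase(t, pattern)]


print(search("nurse"))    # exact term: singular form only
print(search("nurse?"))   # one extra character: picks up the plural
print(search("nurs*"))    # truncation: every term starting with "nurs"
```

Because the terms are stored unstemmed, the searcher decides how broad each
query is. Once singular and plural have been collapsed into one index term,
that choice is no longer available.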

Our DSpace instance isn't aimed primarily at a broad worldwide user
base, but at select groups of students, staff, and faculty with rather
sophisticated information needs. Besides, most of our collection can
also be discovered through Google. Why duplicate that, when I have the
option of also creating an alternative search environment that provides
for sophisticated, analytical searches of scholarly, curricular and
administrative documents?

You might be surprised at how quickly the people in our Office of
Medical Education have picked up on the nuances of how and where they
put metadata, the need for standardized vocabulary in defining lecture
objectives, and how quickly they figured out what was happening to their
attempts to search for "wellness" (stemmed to "well"). (It did not
surprise me!)
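
The "wellness" case can be sketched as follows. This is a toy, assuming a
single hypothetical suffix rule (toy_stem) in place of the full Porter
algorithm, and two made-up documents. It only shows why an exact search for
"wellness" also returns documents that merely contain "well" once both sides
of the index are stemmed.

```python
from collections import defaultdict

# Two made-up documents, keyed by a hypothetical document id.
DOCS = {1: "wellness program for students", 2: "the patient is doing well"}


def toy_stem(term):
    """Strip a trailing 'ness' when a vowel remains (one-rule stand-in, not real Porter)."""
    if term.endswith("ness"):
        stem = term[:-4]
        if any(ch in "aeiou" for ch in stem):
            return stem
    return term


def build_index(docs, normalize):
    """Map each normalized term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[normalize(term)].add(doc_id)
    return index


def search(index, query, normalize):
    """Normalize the query the same way as the index, then look it up."""
    return sorted(index.get(normalize(query.lower()), set()))


stemmed = build_index(DOCS, toy_stem)
exact = build_index(DOCS, lambda t: t)

print(search(stemmed, "wellness", toy_stem))    # both documents match
print(search(exact, "wellness", lambda t: t))   # only the wellness document matches
```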

I think the distributed community administration available with DSpace
will really help our faculty and staff take seriously the data (text)
they put into their collections. Our expertise as "consultants" and
trainers to the staff in the Office of Medical Education has really made
them appreciate the expertise of librarians, particularly my reference
librarians who have very good analytical search skills. Don't sell
people short -- they can be very sophisticated, which means we need to
provide them with powerful tools, not heavy-handed interventions (the
Porter Algorithm).

I'm planning on being at OR11 and would be happy to discuss this over a
beer.

If anybody is still with me, I would be curious to know whether there is a
LowerCaseFilter that would permit the retention of capital 'A's.
Eliminating 'A's in medical research databases is a problem. Vitamin A
is the obvious example, but there are many other occurrences of 'A' as
an important, non-trivial term in a name.
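
One way to picture the behavior being asked for is the sketch below. It is
not Lucene's LowerCaseFilter or any DSpace configuration: the analyze
function, the keep_upper parameter, and the stop list are all hypothetical.
The idea is to lowercase every token except a standalone capital "A", so the
stop-word step drops the article "a" but keeps the "A" in "vitamin A". One
known weakness: a sentence-initial article "A" would be kept as well.

```python
# Illustrative stop list (an assumption, not the list any real analyzer ships with).
STOP_WORDS = {"a", "an", "the", "of", "in", "and"}


def analyze(text, keep_upper=frozenset({"A"})):
    """Tokenize, lowercase all tokens except those in keep_upper, then drop stop words."""
    tokens = []
    for raw in text.split():
        token = raw.strip(".,;:()")
        if not token:
            continue
        if token not in keep_upper:
            token = token.lower()
        if token in STOP_WORDS:
            continue
        tokens.append(token)
    return tokens


print(analyze("Effects of vitamin A in a cohort of children."))
```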

Richard Jizba
Creighton University

