Re: [CODE4LIB] exact title searches with z39.50
To sidestep the issue of strict/relaxed and face the real world of spotty implementation of standards (which seems to apply however arcane or non-arcane they are), we provide a configurable strictness flag and the ability to have non-supported indexes and some functions mapped to supported ones on a Source-by-Source basis. Admins can choose whether or not to give users this strict/relaxed switch, and users can apply it or not. For both, the majority case is not (i.e. relaxed is used).

Peter

Dr Peter Noerr
CTO, MuseGlobal, Inc.
+1 415 896 6873 (office)
+1 415 793 6547 (mobile)
www.museglobal.com

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Jonathan Rochkind
Sent: Tuesday, April 28, 2009 08:43
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] exact title searches with z39.50

It can be a chicken-egg thing too. Maybe more users would be doing more sophisticated searches if they actually _worked_. Plus I know that I could write systems to use federated search to embed certain functionality in certain places, if more sophisticated searches worked more reliably.

Walker, David wrote:
> I'm not sure it's a _big_ mess, though, at least for metasearching. I was just looking at our metasearch logs this morning, so did a quick count: 93% of the searches were keyword searches. Not a lot of exactness required there. It's mostly in the 7% who are doing more specific searches (author, title, subject) where the bulk of the problems lie, I suspect.
>
> --Dave
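The strict/relaxed switch and the per-Source index mapping described in this message can be pictured with a small sketch. This is not MuseGlobal's implementation; `SourceConfig`, `resolve_index`, and the index names are all hypothetical, chosen only to show the idea that strict mode fails loudly while relaxed mode substitutes a supported index.

```python
from dataclasses import dataclass, field


class UnsupportedIndexError(Exception):
    """Raised when a Source cannot honour the requested index."""


@dataclass
class SourceConfig:
    # All field names here are illustrative assumptions.
    name: str
    supported: set
    fallbacks: dict = field(default_factory=dict)  # requested index -> substitute


def resolve_index(source: SourceConfig, requested: str, strict: bool) -> str:
    """Return the index to actually search on for this Source."""
    if requested in source.supported:
        return requested
    if strict:
        # Strict: be honest and refuse rather than silently degrade.
        raise UnsupportedIndexError(f"{source.name} does not support {requested}")
    # Relaxed: map the unsupported index to a supported one, if configured.
    mapped = source.fallbacks.get(requested)
    if mapped is None or mapped not in source.supported:
        raise UnsupportedIndexError(f"{source.name} has no fallback for {requested}")
    return mapped


demo = SourceConfig(
    name="demo-catalog",
    supported={"keyword", "title_words"},
    fallbacks={"title_exact": "title_words"},
)
print(resolve_index(demo, "title_exact", strict=False))
```

With `strict=True` the same call raises `UnsupportedIndexError`, which lets the caller decide what to do instead of receiving a word search dressed up as an exact match.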
Re: [CODE4LIB] exact title searches with z39.50
Bill Dueber writes:
>> What are the ways to accomplish exact title searches with z39.50? I'm looping through a list of MARC records trying to determine whether or not we own multiple copies of an item. After reading MARC field 245, subfield a I am creating the following z39.50 query:
>>
>>   @attr 1=4 foo bar
>>
>> Unfortunately my local implementation seems to interpret this in a rather regular expression sort of way -- * foo bar *. Does anybody out there know how to create a more exact query? I only want to find titles exactly equalling foo bar.
>
> Like so many library standards, Z39.50 is a syntax and a set of rough guidelines. You have no idea what's actually happening on the other end, because it's not specified, and you just have to either find someone you can ask at the target machine or reverse engineer it.

The irony is that Z39.50 actually makes _much_ more effort to specify semantics than most other standards -- and yet still finds itself in the situation where many implementations do not respond correctly to the BIB-1 attribute 6=3 (completeness=complete field), which is how Eric should be able to do what he wants here.

Not that I have any good answers to this problem ... but I DO know that inventing more and more replacement standards is NOT the answer. Everything that's come along since Z39.50 has suffered from exactly the same problem but more so.

 _/|_    ___
/o ) \/  Mike Taylor  m...@indexdata.com  http://www.miketaylor.org.uk
)_v__/\  Not raw -- cooked -- Monty Python's Flying Circus.
Re: [CODE4LIB] exact title searches with z39.50 [resolved]
On Apr 27, 2009, at 5:13 PM, Eric Lease Morgan wrote:
> What are the ways to accomplish exact title searches with z39.50?

Thank you for all the prompt and helpful replies. The most precise and complete magic incantation came from Larry Dixon of the Library of Congress:

  Exact match in Z39.50 is accomplished by using additional attributes. See the attribute table in the Bath Profile. [1] Below is a real-world example -- user is attempting to locate the journal Canadian Poetry and doesn't know any of the unique identifiers associated with it. A keyword search for Canadian poetry gets 90 hits -- exact match gets 3.

    Z> f @attr 1=4 canadian poetry
    Number of hits: 90

    Z> f @attr 1=4 @attr 2=3 @attr 3=1 @attr 4=1 @attr 5=100 @attr 6=3 canadian poetry
    Number of hits: 3

This query takes advantage of many additional attribute types/names as alluded to by Tim Shearer: use/title, relation/equal, position/first in field, structure/phrase, truncation/do not truncate, and completeness/complete field. It would have taken me a long, long time to figure this out, and luckily my server supports it.

Wow, isn't the Internet cool, and /me wonders, Did the Bath Profile come from... Bath? [2]

[1] Bath Profile - http://www.collectionscanada.gc.ca/bath/tp-bath2.9-e.htm#a
[2] Bath, London, and ancient stone circles - http://infomotions.com/gallery/bath/

--
Eric Lease Morgan
Hesburgh Libraries, University of Notre Dame
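For anyone generating these queries in a loop over MARC records, the six attributes in the exact-match query above can be assembled programmatically. The following is only a sketch: `pqf_exact_title` is a made-up helper name, and the choice to double-quote and backslash-escape the term is an assumption about how YAZ-style PQF parsers expect a multi-word term to be sent; check it against your own toolkit.

```python
# Bib-1 attribute type/value pairs taken from the exact-match query above.
EXACT_TITLE_ATTRS = [
    (1, 4),    # use: title
    (2, 3),    # relation: equal
    (3, 1),    # position: first in field
    (4, 1),    # structure: phrase
    (5, 100),  # truncation: do not truncate
    (6, 3),    # completeness: complete field
]


def pqf_exact_title(title: str) -> str:
    """Build a PQF string asking for an exact, complete-field title match."""
    attrs = " ".join(f"@attr {atype}={value}" for atype, value in EXACT_TITLE_ATTRS)
    # Assumption: the parser accepts a double-quoted term with backslash escapes.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'{attrs} "{escaped}"'


print(pqf_exact_title("canadian poetry"))
```

Whether the server honours all six attributes is a separate question; the string is only the request, not a guarantee about what comes back.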
Re: [CODE4LIB] exact title searches with z39.50
From: Mike Taylor m...@indexdata.com
> The irony is that Z39.50 actually makes _much_ more effort to specify semantics than most other standards -- and yet still finds itself in the situation where many implementations do not respond correctly to the BIB-1 attribute 6=3 (completeness=complete field), which is how Eric should be able to do what he wants here.
>
> Not that I have any good answers to this problem ... but I DO know that inventing more and more replacement standards is NOT the answer. Everything that's come along since Z39.50 has suffered from exactly the same problem but more so.

I think this remains to be seen for SRU/CQL, in particular for the example at hand, how to search for exact title. There are two related issues: one, how arcane the standard is, and two, how closely implementations conform to the intended semantics. And clearly the first has a bearing on the second. And even I would say that Z39.50 is a bit on the arcane side when it comes to formulating a query for exact title.

With SRU/CQL there is an exact relation ('exact' in 1.1, '==' in 1.2). So I would think there is less excuse for a server to apply a creative interpretation. If it cannot support exact title it should fail the search. With Z39.50 there is more perceived latitude for a server to pretend it supports something it doesn't.

--Ray
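To make the SRU/CQL side concrete, here is a minimal sketch of an exact-title request using the relation named in this message ('exact' in 1.1, '==' in 1.2). The base URL is a placeholder, the helper names are invented, and the `dc.title` index is an assumption; a given server may expose the title index under a different name.

```python
from urllib.parse import urlencode


def cql_exact_title(title: str, sru_version: str = "1.1", index: str = "dc.title") -> str:
    """Return a CQL query asking for an exact match on the title index."""
    # 'exact' in SRU/CQL 1.1, '==' in 1.2, as described above.
    relation = "exact" if sru_version == "1.1" else "=="
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'{index} {relation} "{escaped}"'


def sru_search_url(base_url: str, title: str, sru_version: str = "1.1") -> str:
    """Build a searchRetrieve URL; base_url is whatever your server publishes."""
    params = {
        "operation": "searchRetrieve",
        "version": sru_version,
        "query": cql_exact_title(title, sru_version),
        "maximumRecords": "10",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint, for illustration only.
print(sru_search_url("http://sru.example.org/catalog", "foo bar"))
```

A server that cannot do an exact match should answer this with a diagnostic rather than a word search, which is exactly the behaviour under debate in this thread.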
Re: [CODE4LIB] exact title searches with z39.50 [resolved]
On Tue, Apr 28, 2009 at 8:27 AM, Eric Lease Morgan emor...@nd.edu wrote:
> Wow, isn't the Internet cool, and /me wonders, Did the Bath Profile come from... Bath? [2]

Yes. http://www.collectionscanada.gc.ca/bath/tp-bath2.1-e.htm#c
Re: [CODE4LIB] exact title searches with z39.50
Ray Denenberg, Library of Congress writes:
>> The irony is that Z39.50 actually makes _much_ more effort to specify semantics than most other standards -- and yet still finds itself in the situation where many implementations do not respond correctly to the BIB-1 attribute 6=3 (completeness=complete field), which is how Eric should be able to do what he wants here.
>>
>> Not that I have any good answers to this problem ... but I DO know that inventing more and more replacement standards is NOT the answer. Everything that's come along since Z39.50 has suffered from exactly the same problem but more so.
>
> I think this remains to be seen for SRU/CQL, in particular for the example at hand, how to search for exact title. There are two related issues: one, how arcane the standard is, and two, how closely implementations conform to the intended semantics. And clearly the first has a bearing on the second. And even I would say that Z39.50 is a bit on the arcane side when it comes to formulating a query for exact title.
>
> With SRU/CQL there is an exact relation ('exact' in 1.1, '==' in 1.2). So I would think there is less excuse for a server to apply a creative interpretation. If it cannot support exact title it should fail the search.

IMHO, this is where it breaks down 90% of the time. Servers that can't do what they're asked should say "I can't do that", but -- for reasons that seem good at the time -- nearly no server fails requests that it can sort of fulfil. Nine out of ten Z39.50 servers that are asked to do a whole-field search and can't do it will instead do a word search, because "it's better to give the user SOMETHING". I bet the same is true of SRU servers. (I am as guilty as anyone else; I've written servers like that.)

The idea that "it's better to give the user SOMETHING" might -- might -- have been true when we mostly used Z39.50 servers for interactive sessions. Now that they are mostly used as targets in metasearching, that approach is disastrous.

 _/|_    ___
/o ) \/  Mike Taylor  m...@indexdata.com  http://www.miketaylor.org.uk
)_v__/\  I try to take one day at a time, but sometimes several days attack me at once -- Ashleigh Brilliant.
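Since a server may quietly answer a whole-field request with a word search, a client that needs exactness (such as the duplicate-copy check that started this thread) can verify the returned titles itself. The sketch below is one possible approach, not an established recipe: the normalization rules (case folding, stripping diacritics and punctuation) and the function names are assumptions you would tune to your own data.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Fold case, strip diacritics and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    no_punct = re.sub(r"[^\w\s]", " ", no_marks)
    return " ".join(no_punct.casefold().split())


def exact_title_matches(wanted: str, returned_titles: list) -> list:
    """Keep only the returned titles that equal the wanted title after normalizing."""
    target = normalize_title(wanted)
    return [t for t in returned_titles if normalize_title(t) == target]


# Titles as they might come back from 245 $a, trailing ISBD punctuation included.
hits = [
    "Canadian poetry /",
    "Canadian poetry in English",
    "An anthology of Canadian poetry",
    "CANADIAN POETRY.",
]
print(exact_title_matches("Canadian poetry", hits))
```

This costs a record fetch per hit, so it only makes sense when the hit counts are small, but it does not depend on the server telling the truth about what it searched.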
Re: [CODE4LIB] exact title searches with z39.50
Right, Mike. There is a long and rich history of the debate between loose and strict interpretation in the world at large, and within Z39.50 in particular this debate raged from the late 1980s throughout the 90s. The faction that said "If you can't give the client what it asks for, at least give them something; make them happy" was almost religious in its zeal. Those who said "If you can't give the client what it asks for, be honest about it; give them good diagnostic information, tell them a better way to formulate the request, etc. But don't pretend the transaction was a success if it wasn't" were shouted down most every time.

I can't predict, but I'm just hoping that lessons have been learned from the mess that that mentality got us into.

--Ray

----- Original Message -----
From: Mike Taylor m...@indexdata.com
To: CODE4LIB@LISTSERV.ND.EDU
Sent: Tuesday, April 28, 2009 10:43 AM
Subject: Re: [CODE4LIB] exact title searches with z39.50

> IMHO, this is where it breaks down 90% of the time. Servers that can't do what they're asked should say "I can't do that", but -- for reasons that seem good at the time -- nearly no server fails requests that it can sort of fulfil. Nine out of ten Z39.50 servers that are asked to do a whole-field search and can't do it will instead do a word search, because "it's better to give the user SOMETHING". I bet the same is true of SRU servers. (I am as guilty as anyone else; I've written servers like that.)
>
> The idea that "it's better to give the user SOMETHING" might -- might -- have been true when we mostly used Z39.50 servers for interactive sessions. Now that they are mostly used as targets in metasearching, that approach is disastrous.
Re: [CODE4LIB] exact title searches with z39.50
I'm not sure it's a _big_ mess, though, at least for metasearching. I was just looking at our metasearch logs this morning, so did a quick count: 93% of the searches were keyword searches. Not a lot of exactness required there. It's mostly in the 7% who are doing more specific searches (author, title, subject) where the bulk of the problems lie, I suspect.

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

From: Code for Libraries [code4...@listserv.nd.edu] On Behalf Of Ray Denenberg, Library of Congress [r...@loc.gov]
Sent: Tuesday, April 28, 2009 8:32 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] exact title searches with z39.50

> Right, Mike. There is a long and rich history of the debate between loose and strict interpretation in the world at large, and within Z39.50 in particular this debate raged from the late 1980s throughout the 90s. The faction that said "If you can't give the client what it asks for, at least give them something; make them happy" was almost religious in its zeal. Those who said "If you can't give the client what it asks for, be honest about it; give them good diagnostic information, tell them a better way to formulate the request, etc. But don't pretend the transaction was a success if it wasn't" were shouted down most every time.
>
> I can't predict, but I'm just hoping that lessons have been learned from the mess that that mentality got us into.
>
> --Ray
Re: [CODE4LIB] exact title searches with z39.50
From: Walker, David dwal...@calstate.edu
> I'm not sure it's a _big_ mess, though, at least for metasearching.

I wasn't thinking specifically about metasearch, but rather about bad decisions getting replicated until you end up with an installed base of bad implementations. The best illustration would be the huge mess that HTML is.

--Ray
Re: [CODE4LIB] exact title searches with z39.50
HTML works out pretty well. If our biggest failures were 'failures' like HTML, we'd be doing pretty well.

Ray Denenberg, Library of Congress wrote:
> From: Walker, David dwal...@calstate.edu
>> I'm not sure it's a _big_ mess, though, at least for metasearching.
>
> I wasn't thinking specifically about metasearch, but rather, bad decisions getting replicated and you end up with an installed base of bad implementations. The best illustration would be the huge mess that HTML is.
>
> --Ray
Re: [CODE4LIB] exact title searches with z39.50
Jonathan Rochkind writes:
>>> I'm not sure it's a _big_ mess, though, at least for metasearching.
>>
>> I wasn't thinking specifically about metasearch, but rather, bad decisions getting replicated and you end up with an installed base of bad implementations. The best illustration would be the huge mess that HTML is.
>
> HTML works out pretty well. If our biggest failures were 'failures' like HTML, we'd be doing pretty well.

Got to agree there (even though it undermines the point I was making before) -- HTML is not a good example of a system that's undermined its utility by trying too hard to be helpful.

That Clay Shirky observation again: "You cannot simultaneously have mass adoption and rigor." It seems pretty clear that it applies to something like HTML, where you want to have literally millions of people writing it. Not so much in implementing search standards, where the number of implementers is likely in double figures.

 _/|_    ___
/o ) \/  Mike Taylor  m...@indexdata.com  http://www.miketaylor.org.uk
)_v__/\  Diagnosing: it is OK. -- wonderful diagnostic from _something_ in my AUTOEXEC.BAT
Re: [CODE4LIB] exact title searches with z39.50
From: Jonathan Rochkind rochk...@jhu.edu
> HTML works out pretty well. If our biggest failures were 'failures' like HTML, we'd be doing pretty well.

HTML is a wonderful standard. And I don't mean to take the discussion off-course. My point was simply that because early browsers did not insist on clean HTML, the proliferation of unclean HTML has reached the point where, well, whether you consider it a mess or not depends on how much importance you place on clean HTML. It's important to me.

--Ray
[CODE4LIB] exact title searches with z39.50
What are the ways to accomplish exact title searches with z39.50? I'm looping through a list of MARC records trying to determine whether or not we own multiple copies of an item. After reading MARC field 245, subfield a I am creating the following z39.50 query: @attr 1=4 foo bar Unfortunately my local implementation seems to interpret this in a rather regular expression sort of way -- * foo bar *. Does anybody out there know how to create a more exact query? I only want to find titles exactly equalling foo bar. -- Eric Lease Morgan University of Notre Dame
Re: [CODE4LIB] exact title searches with z39.50
You could start with the exact title search as expressed in the Bath Profile: http://www.collectionscanada.gc.ca/bath/tp-bath2.9-e.htm#a . But you may well have to tinker to discover the combination that your server will accept and interpret the way you want it to.

All the best,
Peter

-----Original Message-----
From: Code for Libraries [mailto:code4...@listserv.nd.edu] On Behalf Of Eric Lease Morgan
Sent: Monday, April 27, 2009 3:14 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] exact title searches with z39.50

What are the ways to accomplish exact title searches with z39.50? I'm looping through a list of MARC records trying to determine whether or not we own multiple copies of an item. After reading MARC field 245, subfield a I am creating the following z39.50 query:

  @attr 1=4 foo bar

Unfortunately my local implementation seems to interpret this in a rather regular expression sort of way -- * foo bar *. Does anybody out there know how to create a more exact query? I only want to find titles exactly equalling foo bar.

--
Eric Lease Morgan
University of Notre Dame
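The "tinkering" suggested here can be partly automated by trying attribute combinations from strictest to loosest and keeping the first one the server does not reject. This is a sketch under stated assumptions: `search` stands for whatever function your Z39.50 toolkit gives you (here assumed to return a hit count and to raise `SearchRejected` when the server sends a diagnostic), and the candidate list after the first entry is an arbitrary illustration, not a recommendation from the Bath Profile.

```python
from typing import Callable, Optional, Sequence, Tuple


class SearchRejected(Exception):
    """Hypothetical error raised when the server returns a diagnostic."""


# Strictest first; only the first entry is the full exact-title combination.
CANDIDATES = [
    "@attr 1=4 @attr 2=3 @attr 3=1 @attr 4=1 @attr 5=100 @attr 6=3",
    "@attr 1=4 @attr 4=1 @attr 6=3",
    "@attr 1=4 @attr 6=3",
    "@attr 1=4 @attr 4=1",
]


def first_accepted(
    title: str,
    search: Callable[[str], int],
    candidates: Sequence[str] = CANDIDATES,
) -> Optional[Tuple[str, int]]:
    """Return (query, hit_count) for the first combination the server accepts."""
    for attrs in candidates:
        query = f'{attrs} "{title}"'
        try:
            hits = search(query)
        except SearchRejected:
            continue  # server said no; try a looser combination
        return query, hits
    return None


# Stand-in for a real server: rejects any query using the completeness attribute.
def fake_search(query: str) -> int:
    if "@attr 6=3" in query:
        raise SearchRejected("unsupported completeness attribute")
    return 3


print(first_accepted("foo bar", fake_search))
```

Note that "accepted" only means the server did not complain. A server that silently relaxes the query will be accepted on the first try, so the hit count still deserves a sanity check against a plain title search.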
Re: [CODE4LIB] exact title searches with z39.50
Like so many library standards, Z39.50 is a syntax and a set of rough guidelines. You have no idea what's actually happening on the other end, because it's not specified, and you just have to either find someone you can ask at the target machine or reverse engineer it.

On Mon, Apr 27, 2009 at 5:13 PM, Eric Lease Morgan emor...@nd.edu wrote:
> What are the ways to accomplish exact title searches with z39.50? I'm looping through a list of MARC records trying to determine whether or not we own multiple copies of an item. After reading MARC field 245, subfield a I am creating the following z39.50 query:
>
>   @attr 1=4 foo bar
>
> Unfortunately my local implementation seems to interpret this in a rather regular expression sort of way -- * foo bar *. Does anybody out there know how to create a more exact query? I only want to find titles exactly equalling foo bar.
>
> --
> Eric Lease Morgan
> University of Notre Dame

--
Bill Dueber
Library Systems Programmer
University of Michigan Library
Re: [CODE4LIB] exact title searches with z39.50
I think in order to accomplish this you'd have to send a completeness or truncation attribute:

  @attr 1=4 @attr 6=3 foo bar    # search for 'foo bar' as the complete field
  @attr 1=4 @attr 6=2 foo bar    # search for 'foo bar' as the complete subfield
  @attr 1=4 @attr 5=100 foo bar  # do not truncate

although this is probably not exactly right. The full list is here: http://www.loc.gov/z3950/agency/defns/bib1.html

Although I will bet that Aleph (which I assume you're querying) doesn't support any of this. I actually just wrote about this exact thing tonight: http://dilettantes.code4lib.org/2009/04/commoditizing-the-stack/

-Ross.

On Mon, Apr 27, 2009 at 5:13 PM, Eric Lease Morgan emor...@nd.edu wrote:
> What are the ways to accomplish exact title searches with z39.50? I'm looping through a list of MARC records trying to determine whether or not we own multiple copies of an item. After reading MARC field 245, subfield a I am creating the following z39.50 query:
>
>   @attr 1=4 foo bar
>
> Unfortunately my local implementation seems to interpret this in a rather regular expression sort of way -- * foo bar *. Does anybody out there know how to create a more exact query? I only want to find titles exactly equalling foo bar.
>
> --
> Eric Lease Morgan
> University of Notre Dame