Re: [OPEN-ILS-DEV] Introduction and Question
Brandon W Uhlman [EMAIL PROTECTED] writes:

> Hi, Larry. :) Your friendly neighbourhood British Columbia Evergreen sysadmin here. :)

Hi Brandon,

Big neighbourhood! Are you going to be the technical point man for the BC Pines implementation?

> - The patron data in an Athena system is trivially exportable in dBase format. I'm not sure whether the same functionality exists for transactional and hold data, or for item-level data. Bibliographic data in MARC format is MARC data, and that's fine as it is.

All the circulation data in Athena is held in various dBase files that can be accessed independently of the Athena interface. I guess my real question was: are those dBase filenames hard-coded into Athena or will they change with each library's implementation of Athena? Sagebrush's support team has refused to answer that question, claiming that it is their intellectual property. :( I can always just go visit another library to see for myself.

> - The database schema, mostly up-to-date, is available on the dokuwiki at http://open-ils.org/documentation/evergreen_1.1.3_erd.html.

Thanks, that helps a lot. Will that structure stay more-or-less the same as features, such as acquisitions, are added to OpenILS?

> - The Django piece is used to tickle some of the back-end settings like circ policies, what Evergreen calls org units (branches, library systems, etc.), and the like. If you've not used Django before, it's worth a look at http://www.djangoproject.com/. Pretty whiz-bang stuff.

Cool! This will replace the perl cgi scripts for basic configuration, and then some?

If I have questions about the BC Pines implementation, is it appropriate to ask them here, or should I take it to the Northern Pines list?

Cheers,
Larry

--
Larry Stamm, Network Administrator
McBride and District Public Library
Ph: 250-569-2411
http://mcbride.bclibrary.ca
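For anyone comparing Athena installations by hand, a minimal sketch of listing the fields in a dBase file might help check whether two libraries share the same layout. It assumes a plain dBase III style header (32-byte file header, then 32-byte field descriptors ending in 0x0D); the function name and the file name in the usage note are hypothetical, and nothing here is specific to Athena.

```python
import struct

def read_dbf_fields(data: bytes):
    """Return (record_count, [(name, type, length), ...]) from raw dBase III bytes."""
    # Bytes 4-7: record count; 8-9: header length; 10-11: record length (little-endian).
    record_count, header_len, _record_len = struct.unpack_from("<IHH", data, 4)
    fields = []
    offset = 32  # field descriptors start right after the 32-byte file header
    while offset < header_len and data[offset] != 0x0D:  # 0x0D ends the descriptor list
        name = data[offset:offset + 11].split(b"\x00", 1)[0].decode("ascii")
        field_type = chr(data[offset + 11])  # e.g. C = character, D = date, N = numeric
        length = data[offset + 16]
        fields.append((name, field_type, length))
        offset += 32
    return record_count, fields

# Hypothetical usage (the file name is an assumption, not a known Athena file):
# with open("PATRON.DBF", "rb") as f:
#     print(read_dbf_fields(f.read()))
```

Running this against the same file from two different libraries would show quickly whether the field names and lengths match.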
Re: [OPEN-ILS-DEV] Introduction and Question
On 10/22/07, Larry Stamm [EMAIL PROTECTED] wrote:

> Brandon W Uhlman [EMAIL PROTECTED] writes:
>
>> Hi, Larry. :) Your friendly neighbourhood British Columbia Evergreen sysadmin here. :)
>
> Hi Brandon,
>
> Big neighbourhood! Are you going to be the technical point man for the BC Pines implementation?
>
>> - The patron data in an Athena system is trivially exportable in dBase format. I'm not sure whether the same functionality exists for transactional and hold data, or for item-level data. Bibliographic data in MARC format is MARC data, and that's fine as it is.
>
> All the circulation data in Athena is held in various dBase files that can be accessed independently of the Athena interface. I guess my real question was: are those dBase filenames hard-coded into Athena or will they change with each library's implementation of Athena? Sagebrush's support team has refused to answer that question, claiming that it is their intellectual property. :( I can always just go visit another library to see for myself.
>
>> - The database schema, mostly up-to-date, is available on the dokuwiki at http://open-ils.org/documentation/evergreen_1.1.3_erd.html.

Hey! That's my IP!! ;)

> Thanks, that helps a lot. Will that structure stay more-or-less the same as features, such as acquisitions, are added to OpenILS?

They will, for the most part. I will put up a 1.2.0 ERD soon and announce it here.

>> - The Django piece is used to tickle some of the back-end settings like circ policies, what Evergreen calls org units (branches, library systems, etc.), and the like. If you've not used Django before, it's worth a look at http://www.djangoproject.com/. Pretty whiz-bang stuff.
>
> Cool! This will replace the perl cgi scripts for basic configuration, and then some?

Yes. And we're looking for pythonistas to work on that as well. (hint-hint) :)

> If I have questions about the BC Pines implementation, is it appropriate to ask them here, or should I take it to the Northern Pines list?

I'll mostly defer to Brandon/BCPines on that, but I will say that we'd appreciate anything you feel is appropriate to send here.

--
Mike Rylander
 | VP, Research and Design
 | Equinox Software, Inc. / The Evergreen Experts
 | phone: 1-877-OPEN-ILS (673-6457)
 | email: [EMAIL PROTECTED]
 | web: http://www.esilibrary.com
Re: [OPEN-ILS-DEV] Introduction and Question
Hi, Larry. :) Your friendly neighbourhood British Columbia Evergreen sysadmin here. :)

To the best of my knowledge, Georgia, the only other production Evergreen system (yet), did not migrate any Athena sites. Most systems were part of the previous Unicorn implementation for the already-extant consortium, and the incumbent ILS systems in the new members of PINES were not Athena sites.

To answer your questions as posted:

- The patron data in an Athena system is trivially exportable in dBase format. I'm not sure whether the same functionality exists for transactional and hold data, or for item-level data. Bibliographic data in MARC format is MARC data, and that's fine as it is.

- The database schema, mostly up-to-date, is available on the dokuwiki at http://open-ils.org/documentation/evergreen_1.1.3_erd.html.

- The Django piece is used to tickle some of the back-end settings like circ policies, what Evergreen calls org units (branches, library systems, etc.), and the like. If you've not used Django before, it's worth a look at http://www.djangoproject.com/. Pretty whiz-bang stuff.

Cheers,
Brandon

Quoting Larry Stamm [EMAIL PROTECTED]:

> Hello All,
>
> I am working with one of the small British Columbia rural libraries that will be migrating to the Evergreen OpenILS in the not too distant future. I am looking at getting the data out of our current Sagebrush Athena system and into the Evergreen DB, hoping to develop a somewhat automated script for all the other little libraries running Athena.
>
> I have successfully built Evergreen 1.2 in /usr/local/openils/ on a test server running Arch Linux, and am busy exploring the workings. My current questions are:
>
> 1. Is there any experience out there with migrating data out of Athena? I can get biblio data into a MARC21 format, and I think I have figured out where all the circ and patron data reside in Athena's database, but it would be nice to get some confirmation.
>
> 2. Is there any listing of the tables in the Evergreen DB where the circulation and patron data reside? I can figure it out eventually, but I'm a bit lazy.
>
> 3. Finally, where does the Django stuff residing in openils/var/admin/ils_admin fit into the scheme of things?
>
> Regards,
> --
> Larry Stamm, Network Administrator
> McBride and District Public Library
> Ph: 250-569-2411
> http://mcbride.bclibrary.ca

==
Brandon W. Uhlman, Systems Consultant
Public Library Services Branch
Ministry of Education
Government of British Columbia
605 Robson Street, 5th Floor
Vancouver, BC V6B 5J3
Phone: (604) 660-2972
E-mail: [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [OPEN-ILS-DEV] Introduction and Question
If the search term is in a field not normally displayed (say the 505 – which I think should be displayed, but that's another topic), is it possible to display that field when the record is retrieved? It'd be magic from my perspective, but I don't know how easy or hard it'd be. I do know the next major OPAC revision will be a lot more flexible in how it pulls and formats information for the Record Summary page. I imagine the catalog could just retrieve the whole record (on the details page) and search for the terms within the record and make those visible, and not care about how the database actually found that record in the first place. Would that be worthwhile? --- Jason http://esilibrary.com/
Re: [OPEN-ILS-DEV] Introduction and Question
On 9/14/07, Patrick Durusau [EMAIL PROTECTED] wrote:

> BTW, I am still curious about the relevance algorithm that returned jazz music for the search term (without quotes) np-completeness. Or does the system not react well to hyphens in names unless surrounded by quotes? Not real sure why it would parse a hyphen but I have seen odder things. (Noting that when I surrounded it with quotes, "np-completeness", I got zero hits, not jazz.)

Hi Patrick,

I believe when you quote a search term, it searches for that exact string, with no stemming or other interpretation. Without the quotes, I believe EG will strip out punctuation, so you'd basically be doing a search for "np" and "completeness", or some stemmed variants. So your first hit there found an "np" in a 300 field, and "complete" in the 245. Hrmm, is that a valid record?

For the cases where we do encounter messed up records, I imagine we could codify some cataloger sanity checking and not index certain things that look like garbage, but I don't think it'll ever be perfect.

Here's a wiki document explaining some relevance ranking stuff, though I don't know if it's still accurate: http://open-ils.org/dokuwiki/doku.php?id=scratchpad:opac_demo

The metarecords it talks about are the FRBR-like groupings you can get if you choose "Group Formats and Editions" in the Advanced Search.

> PS: One more question: Are there plans to add synonym support to further confuse users with search results? ;-) I would think it would be an advanced search option.

I know they're planning multiple thesaurus support, but I think that might manifest in the "Did you mean/Are you looking for"/spellcheck feature (another kettle of fish that needs work), and/or in the authority-based sidebars. I can't imagine loosening search results just to inflate the number of hits. I'd rather get zero hits and then a lot of suggestions.

--
Jason
http://esilibrary.com/
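The quoted versus unquoted behaviour described above can be sketched roughly as follows. This is an illustration of the general idea only, not Evergreen's actual query parser: the function name is made up, and stemming is left out entirely.

```python
import re

def parse_query(query: str):
    """Split a query into exact phrases (quoted) and normalized loose terms (unquoted)."""
    # Text inside double quotes is kept verbatim as a phrase.
    phrases = re.findall(r'"([^"]+)"', query)
    remainder = re.sub(r'"[^"]*"', " ", query)
    # Outside quotes: replace punctuation with spaces, lowercase, split on whitespace.
    terms = re.sub(r"[^\w\s]", " ", remainder).lower().split()
    return phrases, terms

print(parse_query('np-completeness'))    # ([], ['np', 'completeness'])
print(parse_query('"np-completeness"'))  # (['np-completeness'], [])
```

Under this reading, the unquoted search becomes two independent terms that can match in unrelated fields, while the quoted search needs the literal string to appear somewhere.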
RE: [OPEN-ILS-DEV] Introduction and Question
Patrick,

The added entry was not showing up in the OPAC display because the MARC record was incorrectly coded and incorrectly cataloged. In the case of this record, it had only a passing familiarity with MARC. Unfortunately, the PINES database is replete with dreadful records that affect how Evergreen functions. We can get together at some point and I can regale you with all the reasons why (I don't know if you remember me - I used to work at Newton with Carol).

Basically, the problems in the PINES catalog caused by poor cataloging by libraries (prior to joining PINES), and the number of duplicate records created as a result, make it difficult to accurately understand and illustrate how Evergreen searches and displays. There are also some problems with authority control in Evergreen that cause a few problems with searching author and subject. My understanding is that those problems will be addressed in an upcoming release. There are also some elements of the MARC record that I, as a cataloger, want displayed in the OPAC (cast lists, for example), but I don't have the final say in those kinds of local decisions.

I have merged duplicate records in this search and overlaid the records with better OCLC records. If you do the search again, you should get a result set that is 13 records rather than 20. For the title in question, you should now see added entries for all people mentioned in the record as responsible for the item.

Yesterday, I was busy juggling several different questions and problems and just answered one of your basic questions and did not follow your search. Hopefully, by cleaning up the records in the result set, some of your questions were resolved.

I gave a much too brief explanation of 700 fields. We refer to them as author fields but they are actually added entry personal name fields. When we catalog, we create added entries for people and entities responsible for the item we have in hand. They can be corporations, individuals, editors, illustrators, compilers, publishers, translators, etc. A good explanation is in OCLC's Bibliographic formats and standards (http://www.oclc.org/bibformats/default.htm):

"Use fields 700-730 to provide additional access to a bibliographic record from names and/or titles having various relationships to the item you are cataloging. Added entries are made for persons, corporate bodies and meetings having some form of responsibility for the creation of the work. This includes intellectual and publishing responsibilities." http://www.oclc.org/bibformats/en/7xx/

"Added entry" is a term from card catalogs - an additional card to provide access to the title card. It is still relevant in understanding the hierarchy of responsibility for a work. Added entries in an online environment are additional access points to the record. The main entry is the person or corporation with primary responsibility for the intellectual or artistic content of the work or that shares primary responsibility. For multiple authors, the main entry is the first author in the list on the title page.

A 700 field is for personal names, 710 for corporate, etc. You can look at the Bib formats and standards and see explanations and rules for input for these and other MARC fields.

Another aspect of this particular title (Slipcover chic): based on the record and without seeing the item, it is apparently primarily illustrated, with the text subordinate to those illustrations. When this is true of a title, the illustrator is the main entry and the author of the text is an added entry, since the person primarily responsible for the work as a whole is the illustrator.

I hope this helps clarify things.

Elaine

J. Elaine Hardy
Library Services Manager - Collections Reference
Georgia Public Library Service, A Unit of the University System of Georgia
1800 Century Place, Suite 150
Atlanta, Ga. 30345-4304
404.235-7128
404.235-7201, fax
[EMAIL PROTECTED]
www.georgialibraries.org

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Patrick Durusau
Sent: Wednesday, September 12, 2007 4:11 PM
To: open-ils-dev@list.georgialibraries.org
Subject: Re: [OPEN-ILS-DEV] Introduction and Question

OK, but that doesn't explain why the *displayed* record appears to be incoherent given the search term.

If the record in question had returned the content of the 700 field, as opposed to the 100 field for that record, the question would have never come up.

In other words, search the 700 field (I am not sure what you do about the illus. who was also listed when there is an author search) but return a *displayed* result that is meaningful in terms of the search request.

Hope you are having a great day!

Patrick
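Patrick's suggestion, search both 100 and 700 but display the heading that actually matched, could look something like the sketch below. Records are represented as plain dictionaries keyed by MARC tag purely for illustration; this layout and the function name are assumptions and have nothing to do with how Evergreen stores or indexes records.

```python
def author_hits(records, name):
    """Yield (title, matched_heading, tag) for records whose 100 or 700 fields contain name."""
    needle = name.lower()
    for rec in records:
        for tag in ("100", "700"):          # main entry first, then added entries
            for heading in rec.get(tag, []):
                if needle in heading.lower():
                    yield rec["title"], heading, tag

# Illustrative record built from the headings quoted in this thread.
records = [
    {
        "title": "Slipcover chic",
        "100": ["Revland, Catherine"],
        "700": ["Ball, Michell, ill.", "Garey, Carol Cooper"],
    },
]

for title, heading, tag in author_hits(records, "garey"):
    print(f"{title} / {heading} (matched in {tag})")
```

The point of returning the matched heading alongside the title is that the result list can show "Garey, Carol Cooper" instead of only the main entry, so the hit no longer looks unrelated to the search.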
Re: [OPEN-ILS-DEV] Introduction and Question
I do think it would be useful to give more information on how matches are being made, in some generic manner. Not as a snippet of the record with the terms highlighted without any context, but maybe something like "Search terms found in blah and blah", where blah is maybe some friendly description of the pertinent field mapped from MODS or Dublin Core, and not necessarily MARC. So you might get something like:

1) Computers and intractability : a guide to the theory of NP-completeness by Michael R Garey; David S Johnson
   Search terms found in title, table of contents, and abstract.

2) Np Completeness Comprehensive Reference, Guide And Solution Manual for Pnp.
   Search terms found in title, subject, user tags, and reviews.

Would that be too weird?

Another notion is to have a sidebar or summary of the actual match points as facets for the whole result set. Maybe something like...

Matched on:
  Title: 5 hits
  Subject: 10 hits
  Table of Contents: 3 hits
  Reviews: 2 hits

Something like that, and then choosing one of those would further constrain your search. Useful?

--
Jason
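As a rough sketch of both ideas (a per-record "found in" line and per-field facet counts), assuming each record is just a dictionary of friendly field names to text. The matching here is naive substring matching and every name is hypothetical; a real implementation would work from the index rather than rescanning records.

```python
from collections import Counter

def matched_fields(record, terms):
    """Return the names of fields in which any search term occurs (naive substring match)."""
    terms = [t.lower() for t in terms]
    return [field for field, text in record.items()
            if any(t in text.lower() for t in terms)]

def explain(record, terms):
    """Build the 'Search terms found in ...' line for one record."""
    fields = matched_fields(record, terms)
    return ("Search terms found in " + ", ".join(fields) + ".") if fields else ""

def facet_counts(records, terms):
    """Count, across a result set, how many records matched in each field."""
    return Counter(f for r in records for f in matched_fields(r, terms))

rec = {
    "title": "Computers and intractability : a guide to the theory of NP-completeness",
    "table of contents": "NP-complete problems",
    "abstract": "A survey",
}
print(explain(rec, ["np", "completeness"]))
```

The counts from `facet_counts` are what a "Matched on:" sidebar would display, and selecting a facet would amount to filtering on `matched_fields`.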
Re: [OPEN-ILS-DEV] Introduction and Question
On 13/09/2007, Jason Etheridge [EMAIL PROTECTED] wrote:

> I do think it would be useful to give more information on how matches are being made, in some generic manner. Not as a snippet of the record with the terms highlighted without any context, but maybe something like "Search terms found in blah and blah", where blah is maybe some friendly description of the pertinent field mapped from MODS or Dublin Core, and not necessarily MARC. So you might get something like:
>
> 1) Computers and intractability : a guide to the theory of NP-completeness by Michael R Garey; David S Johnson
>    Search terms found in title, table of contents, and abstract.
>
> 2) Np Completeness Comprehensive Reference, Guide And Solution Manual for Pnp.
>    Search terms found in title, subject, user tags, and reviews.
>
> Would that be too weird?
>
> Another notion is to have a sidebar or summary of the actual match points as facets for the whole result set. Maybe something like...
>
> Matched on:
>   Title: 5 hits
>   Subject: 10 hits
>   Table of Contents: 3 hits
>   Reviews: 2 hits
>
> Something like that, and then choosing one of those would further constrain your search. Useful?
>
> --
> Jason

It could be useful info, I think, for a small population: developers interested in tweaking search algorithms, librarians doing detective work who want to peek under the covers, and the atypical patron. If you're going that far, you might as well show the query after processing as well (strikeout text for stopwords, greyed-out text for stems that were removed, etc).

That being said, I don't think most people are going to care how a particular item was matched with the search string - not enough to make it a visible part of every retrieved record. That screen real estate is precious! If you could make it unobtrusive (hide it by default, surfacing it only with a deliberately set user preference, or a tiny little "How did you find me?" link), it could be nice.

--
Dan Scott
Laurentian University
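A minimal sketch of the "show the query after processing" idea: tag each word so a display layer could strike out the ones the search ignored. The stopword list is a made-up illustration, not Evergreen's, and stem handling is omitted.

```python
# Illustrative stopword list only; the real list would come from the search configuration.
STOPWORDS = {"a", "an", "and", "of", "the", "to"}

def annotate_query(query):
    """Tag each word as 'kept' or 'stopword' so a UI could strike out the dropped ones."""
    return [(word, "stopword" if word.lower() in STOPWORDS else "kept")
            for word in query.split()]

for word, status in annotate_query("a guide to the theory"):
    print(f"{word}: {status}")
```

A template could then render "stopword" entries with strikeout styling and leave the rest as typed.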
RE: [OPEN-ILS-DEV] Introduction and Question
Patrick,

The 700 field in a MARC record is an author field. It is used when there are either multiple authors for the item (since the 1xx fields are not repeatable) or when the person responsible for the work is an editor. So we do want an author search to include the 700 field.

Elaine Hardy

J. Elaine Hardy
Library Services Manager - Collections Reference
Georgia Public Library Service, A Unit of the University System of Georgia
1800 Century Place, Suite 150
Atlanta, Ga. 30345-4304
404.235-7128
404.235-7201, fax
[EMAIL PROTECTED]
www.georgialibraries.org

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Patrick Durusau
Sent: Wednesday, September 12, 2007 11:02 AM
To: open-ils-dev@list.georgialibraries.org
Subject: [OPEN-ILS-DEV] Introduction and Question

Greetings!

This is my first post so first a word or two about my background. I am currently a co-editor for the OpenDocument Format standard in OASIS and its project editor in ISO (ISO 26300). I am also chair of the US committee that is the mirror committee of SC 34, which is currently considering OpenXML (DIS 29500). When I am not involved in either of those projects, I am the convener of SC 34/WG 3, Topic Maps, as well as a co-editor of various parts of that standard. I am an independent consultant on standards (primarily markup and semantic integration) and related technologies.

My question: Where are the search and relevance sections of the Evergreen code?

I ask because I was posting an ILL for Computers and Intractability: A Guide to the Theory of NP-completeness to my local library and in an effort to be helpful, I did a keyword search in Pines for np-completeness (note the lack of quotes) thinking that is a fairly unique term. Try it with Pines. The results are rather amusing and quite definitely not relevant.

I performed the same keyword search with "np-completeness" (in quotes) and got no hits. (I would have expected to have the same results with the first search.)

That made me curious so I tried searching for author, Garey, thinking it is a fairly unusual spelling so I would not get too many hits. Ok, I get some garey authors in the first 10 hits but also:

Found objects a style and source book
Ruggiero, Joseph.

Slipcover chic : designing and sewing elegant slipcovers at home
Revland, Catherine.

As hits 9 and 10. Perfectly fine books I am sure but not what I would be looking for when searching for author = garey.

Anyway, since searching is one of my interests (topic maps and their construction) I was puzzled by the anomalous result. Looking at the MARC record for the Revland, Catherine hit it appears that the author = garey request is searching the 100 field *and* the 700 field, which for this item includes:

700 aBall, Michell, ill.
700 aGarey, Carol Cooper

Which would be understandable if I had asked for a keyword search. Not so understandable with an author search.

Well, I suppose I have two questions in addition to my first one, ;-) .

2. Where is the relevance code in particular since it was the source of the seemingly odd results on np-completeness.

3. Shouldn't author searches default to the MARC 100 field? (With keyword taking in 700 entries, etc.)

Hope everyone is having a great day!

Patrick

--
Patrick Durusau
[EMAIL PROTECTED]
Chair, V1 - US TAG to JTC 1/SC 34
Acting Convener, JTC 1/SC 34/WG 3 (Topic Maps)
Co-Editor, ISO/IEC 13250-1, 13250-5 (Topic Maps)
Co-Editor, OpenDocument Format (OASIS, ISO/IEC 26300)
Re: [OPEN-ILS-DEV] Introduction and Question
One of the problems that I have with Evergreen searching, which I've complained about before, is the way that the system will search fields that are not displayed in the search results screen to the average user.

My favourite example is an author search for Fiander in PINES. There are some books in the library field that might appear, but you also see some British videos that appear, and there's NO indication in the public display why they matched a search for Fiander, until you look at the MARC display and discover that the actor Lewis Fiander (no relation) appears in the cast list, which is not displayed to the public.

One of the first rules of cataloguing is that access points (i.e. search fields) MUST be supported by the record somehow.

- David

Hardy, Elaine wrote:

> Patrick,
>
> The 700 field in a MARC record is an author field. It is used when there are either multiple authors for the item (since the 1xx fields are not repeatable) or when the person responsible for the work is an editor. So we do want an author search to include the 700 field.
>
> Elaine Hardy
>
> J. Elaine Hardy
> Library Services Manager - Collections Reference
> Georgia Public Library Service, A Unit of the University System of Georgia
> 1800 Century Place, Suite 150
> Atlanta, Ga. 30345-4304
> 404.235-7128
> 404.235-7201, fax
> [EMAIL PROTECTED]
> www.georgialibraries.org

--
David J. Fiander
Digital Services Librarian
Re: [OPEN-ILS-DEV] Introduction and Question
On 9/12/07, David J. Fiander [EMAIL PROTECTED] wrote:

> One of the problems that I have with Evergreen searching, which I've complained about before, is the way that the system will search fields that are not displayed in the search results screen to the average user.

I agree with the point, but I'd be remiss in not mentioning that this is a local configuration decision, and something that is (relatively) easily changed if another instance requires such a change. PINES, simply by virtue of being the first, has helped define the default set of search points and displayed fields that Evergreen exposes out of the box. That doesn't mean the defaults are the only, or even the best, possible configuration (outside of PINES), both on the indexing side and the display side, but they are what's there because of effort from active stakeholders.

--
Mike Rylander
Equinox Software, Inc
[EMAIL PROTECTED]
http://esilibrary.com/

> My favourite example is an author search for Fiander in PINES. There are some books in the library field that might appear, but you also see some British videos that appear, and there's NO indication in the public display why they matched a search for Fiander, until you look at the MARC display and discover that the actor Lewis Fiander (no relation) appears in the cast list, which is not displayed to the public.
>
> One of the first rules of cataloguing is that access points (i.e. search fields) MUST be supported by the record somehow.
>
> - David
>
> --
> David J. Fiander
> Digital Services