Re: [CODE4LIB] dspace, digiTool, etd-db, and mylibrary
Hi, On 2/3/06, Eric Lease Morgan [EMAIL PROTECTED] wrote: http://dewey.library.nd.edu/morgan/demo/ Your comments regarding our initial implementation would be greatly appreciated. Could you explain what we're seeing and what we should be looking for (like, the second digital image doesn't work; bad ID?)? Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Code4Lib Code Sharing - (Re: [CODE4LIB] journal)
On 2/23/06, Ryan Eby [EMAIL PROTECTED] wrote: http://www.textualize.com/ Again I'm unsure if we would be looking at mostly small snippets and functions or full fledged classes/libraries. Thanks for pointing this out; looks good. Now, with any of this, it's not so much the actual libraries and classes that are of interest to me, but clever code to *use* them. And, to a big degree, it's the talk about application design that I fear we're overlooking. This is where all the "we did X, and here's what was good and bad about that approach" comes in really handy; how do we design for our audience, please the business side and make it look good in the process? The tools should support at least these three fundamental things, and I would hope that the journal discussed approached *design* more than code. As an example, how does it change our development infrastructure when moving to an SOA? I can write *books* about this topic, mostly positive. :) But this discussion seems to float to the edges (not geeky enough for geeks, not business enough for the business people, even though both can recognise the value of it). A journal could be a lever to push the importance of things on the fringe onto the real agenda. 2 cents worth. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Libraries that support user tagging in OPAC?
Hi, In our next generation OPAC prototype, we do typed tagging and comments. (Typed means that there is a difference between a patron tagging something and a reference librarian doing so; the tags and comments are fed back into the search engine and alter the relevance ranking.) One day it may see the light of day, but it has certainly proved to be a really powerful feature. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] At an end : when you rub against your managers
Hi Ed, On 3/9/06, Ed Summers [EMAIL PROTECTED] wrote: Lucky you! I've had similar problems in non-library settings, so I don't think that the library community is any worse at following software best practices than other communities. Ok, so what you're saying is that this is, for me, an isolated incident, and I'd be better off quitting and finding somewhere else. I can live with that. :) If they were then there wouldn't be such an appetite for the wisdom you find in Joel on Software, Paul Graham, et al. Hmm, having an appetite doesn't mean that what you're eating is healthy, but yes, I understand your point. :) I'm not sure griping in public like this will help much... I'm not so much griping in public as I'm reaching out to my fellow geeks; I'm pretty sure I can't be the only one who's pitted new things against conservative bastions before. Most of my problems come from a rather conservative mindset among my managers that I can't seem to get through. I've broken through it in other places, to great success, but the library world, to me, seems impenetrable. I guess I should have known, Z39.50 and all. :) In my experience I've found that people react best to seeing how a new development process, pattern or technology helps *in practice* rather than *in theory*. I agree, and I've done all that and more, yet nothing changes. If the management above you still doesn't get it, or fights it, then there is nothing left to do, and I think I've just concluded that. I'm sorry to leave the library world, but not sorry to leave the mentality. But everyone likes recognition for good work--I'm sorry it sounds like you aren't getting that support. Good luck--and try to focus on one thing at a time...says ADD man. With anything that goes against what you know as good, it won't be classified as 'good'. Anyways, thanks for the input. Regards, Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] At an end : when you rub against your managers
On 3/9/06, Kevin S. Clarke [EMAIL PROTECTED] wrote: That's my opinion anyway... not sure this has anything to do with code. You're right, it hasn't; it was only geek-related in the sense that we probably all face conservatism when pushing new and fancy code. Sorry for the noise, and thanks for the words. I think I know the answers now. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] compact display for marc-xml
On 3/28/06, Hickey,Thom [EMAIL PROTECTED] wrote: I've attached a compressed tar file of compact.xsl, compact.css and mudlumps.xml, a test record. After you've extracted the files to a directory you should be able to view mudlumps.xml with a browser and see the results. I'd like to have a look and help out, but could you post it non-tar'ed and possibly non-zip'ed? My gmail is barfing, and WinZip coughs and splutters. :) Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Web services for LII content?
On 3/29/06, K.G. Schneider [EMAIL PROTECTED] wrote: Develop web services (accessible by subscription) to allow a developer to include some of the LII in an application. I was going to do exactly this for the Australasian part of the world (still pending; too much to do). I think the idea is a very good one, but I'm not sure about the paid service. I think the only thing you should consider is a small flat yearly fee to be part of the system, although setting up something like this and using del.icio.us for tagging and commenting the links isn't that big of a deal. In short, I value the service, but not as a pay-service. :) Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Question re: ranking and FRBR
On 4/12/06, Jonathan Rochkind [EMAIL PROTECTED] wrote: If you are instead using a formula where an increased number of records for a given work increases your ranking, all other things being equal---I'm skeptical. Ditto; I think the answer to this is that there needs to be some serious pre-processing and analysis to come up with some real smarts in these searches. I don't think there is an easy way out once you've gone past the "ooh, shiny" stage; whatever context you bring the user, is it good or bad context? Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Question re: ranking and FRBR
On 4/12/06, K.G. Schneider [EMAIL PROTECTED] wrote: Do users actually determine relevance or do they have faith in Google to provide the best results on the first results page? I'd say people use a "click and try n times, then refine the search until relevance is fulfilled" technique. But again, this is *totally* dependent on what they're searching for; known or unknown:
- books by Frank Herbert (specific enough to get some results)
- Jung's philosophy in fiction (general enough to cause bleeds)
- good SciFi (general enough to cause bleeding)
- oil crisis metaphors (specific and general at the same time)
All of the above can lead to Dune by Frank Herbert. What is its relevance to the above searches? It's a book by Herbert, it certainly contains Jung's philosophy, it's a good SciFi book, and it indeed has the metaphors as part of its concept. And to top it all, it's still a popular book. I could say the same about The Dosadi Experiment, and all of it would be true except the popularity. Who is to say the former is preferred over the latter? Google will give us the former, never the latter. For libraries, this is an interesting problem to solve, because popularity, at least in my view, is mostly a misnomer in searching for information. Popularity in Google is measured by people actually putting in the links, which means they point to something *because* there is something interesting that way. In library catalogs there is no such thing. We've got an experiment running here which uses tags to do this last bit for us; patrons and librarians alike can tag books, which will boost their ratings. An anonymous tag denotes popularity (unless stated otherwise), while a reference librarian's tag boosts importance. Another field I'm digging into is using search term logs to do some of this as well, generating heat for items: close to popularity, but very time-based (unlike links, which stay around); if you don't feed the flame, it eventually dies out (or, in this case, gets repurposed).
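The tag-boost and search-log "heat" ideas above can be sketched like this (a Python illustration with invented weights, not our actual prototype):

```python
# Sketch of two ranking signals: typed tags boost a base relevance
# score (a librarian's tag counts more than an anonymous one), and
# search-log "heat" decays over time unless fed. All weights invented.
TAG_WEIGHTS = {"anonymous": 0.1, "patron": 0.2, "librarian": 0.5}

def tag_boost(tags):
    """tags: list of (tagger_type, tag_text); typed taggers weigh differently."""
    return sum(TAG_WEIGHTS.get(who, 0.0) for who, _ in tags)

def heat(hit_days, now, half_life=30.0):
    """hit_days: timestamps (in days) when an item matched a search.
    Each hit's contribution halves every half_life days, so heat dies
    out unless the flame is fed -- unlike inbound links, which persist."""
    return sum(0.5 ** ((now - t) / half_life) for t in hit_days)

def rank(base_score, tags, hit_days, now):
    """Combine the base engine score with both boosts."""
    return base_score * (1 + tag_boost(tags)) + 0.1 * heat(hit_days, now)

print(rank(1.0, [("librarian", "scifi")], [0.0], 30.0))
```

The decay constant and the multiplicative-vs-additive mix are pure design knobs; the point is only that the two signals stay separate and inspectable.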
Anyways, just a few thoughts and ideas. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
On 6/6/06, Michael Bowden [EMAIL PROTECTED] wrote: We need something. My ILS has decided that their next generation catalog will be a portal with its own database, etc. I already have one database with MARC data; why do I need another to hold the non-MARC data? Why isn't my ILS working to expand/create the next generation MARC record? I think the next generation catalog goes hand in hand with the next generation of MARC. Oh, this one is easy to answer; we need to get away from MARC. No, not the content of MARC, nor the idea of it, nor necessarily even the MARC format and standard itself, but we need to get away from "we need MARC" and the idea that knowledge sharing in libraries is best done through MARC and that Z39.50 must be part of our requirements. For example, MARC can hold some change control info, but never to the granularity that supports, for example, an NBD which can properly update records and work on a distributed model. But as soon as we put that info outside of MARC, the culture will choose to ignore the problem rather than try to change it. The *culture* of MARC is the problem. I don't think the OPAC will go away, nor that it absolutely must, but the very idea of an OPAC is based on knowing what our patrons want: books that we've cataloged. But all too often we have no idea what they want; all we've got are assumptions. I think we've come a long way, but the time is certainly ripe to look anew at what purpose the OPAC serves. Ok, I'll stop now. :) Regards, Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
Hi, On 6/7/06, Jonathan Rochkind [EMAIL PROTECTED] wrote: My impression is that there are LOTS of catalogers interested in discussing this topic---the future of The Catalog. As much as I would love to disagree with you, I don't. :) My stance on this is not to let hackers create applications as they see fit, dear Dog, no! I'm a die-hard user-centred design and usability guy; my life is dedicated to developing solutions fit for the user, whether that be patrons, catalogers, super-users or otherwise. I'm more talking about the politics of *actually* doing something; I find it easy to talk about innovation with my colleagues, but hard to do it in practice, although we're setting up a labs area these days in an attempt to break free of the tyranny of PRINCE2 and top-down hierarchies. But hey, I realise this is probably beside the point; if we have fruitful discussions, maybe someone can do something with it. Some coders seem to assume that the cataloging community doesn't realize the need for change, or doesn't understand the possibilities of the online catalog. I think this is more and more NOT the case. Catalogers too realize that things are broken; change is the topic of discussion. Actually, I've found the reverse to be true: catalogers overly aware of things being broken, but hackers who either can't see the problem or are too busy to do so. My feeling about all this is that we're too busy maintaining the MARC legacy to create a shining new one which may or may not solve the problem. Of course, the problem with MARC is the culture, not the technology, so in order to change the culture we need a *whopping* effort put in by *all* libraries around the world. Not very likely, but it would be fantastic if we could. But such common vision is desperately needed. I'd say such common vision is desperately needed on the management level! What drives the libraries if not management?
Sure, foot soldiers and captains can push the envelope, but only so far before it becomes political, huge, convoluted, a project with a steering committee, and so forth. For me the strategy is to create prototypes to demonstrate what we're on about, and in my case I do that *with* catalogers, reference librarians and other friends around the library world. The idea here is to unite the bottom soldiers in such a way that top management can see the light and resource and process accordingly. So we desperately need more forums for discussion involving both catalogers and developers, focused on this topic. No, we desperately need everyone to join the same forums! Not more forums, but fewer! Less is more. We don't need yet another committee; we need one stronger one. But hey, I'm dreaming. As Eric writes, an important topic for discussion is: To what degree should traditional cataloging practices be used in such a thing, or to what degree should new and upcoming practices such as FRBR be exploited? The danger here is that automated processes add a quality check to our processes, and a lot of people don't like that, especially top management, because it points out mistakes made in the past. Technically we don't have many problems; we could do pretty much anything we'd like if we really wanted to, but it's all about internal politics and the shuffling of resources, which decides whether it should be done or not. If *management* doesn't understand what hackers and catalogers and reference librarians are talking about, we're stuffed! Anyway, I don't think we disagree on this, only the part about needing yet another mailing list. Regards, Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
Hiya, On 6/7/06, Ross Singer [EMAIL PROTECTED] wrote: That by trotting out their Endeca powered catalog, they've finally gotten the tangible that we nerds have been unable to get institutional support for. Now every librarian in the country wants clustering and faceted search. Sorry, I'm in the wrong country. :) In fact, as much as that event stirred people's hearts and minds, it never shook the foundation of the OPAC in this place. But this time last year, I defy you to tell me that you could have trotted out a project like that to anybody outside the systems office (that wasn't already labelled a 'systems apologist'). Possibly not. Hmm. No, not with the OPAC, but with other systems. I think libraries have put too much faith in vendors who create crappy systems, and they continue to do so. If vendors want libraries to buy their stuff, they need to make sure they've got good stuff; it's getting easier and easier to do these things ourselves. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Photo galleries and accessibility
On 7/13/06, Amy M Ostrom [EMAIL PROTECTED] wrote: Or does anyone know about photo galleries and accessibility? There is a bigger group of people who can both see images and have accessibility needs: low-vision users (estimated at some 30% of all users). Having said that, there's really nothing stopping you from making tables perfectly accessible, and in the sense of images, they *are* presented in a tabular fashion. This is where we use common sense instead of rigid rules, so there is no reason to feel that using tables for this is somehow wrong (unless you want to go into the whole WAI 2.0 debate :). Do it the way you do, and clean up the generated code to fix the worst offenders. If you still want to be strict about it, try talking to the GAWDS community (http://www.gawds.org/) about gallery options. I seem to recall there was some discussion about this a while back, but the gist was that most gallery software was equally crap in accessibility terms. Maybe things have changed. regards, Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] OpenURL XML generation libraries?
On 10/18/06, Ross Singer [EMAIL PROTECTED] wrote: See also: http://www.textualize.com/trac/browser/ropenurl Why? What are we looking at? Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] OpenFRBR
Hi, You may be interested in OpenFRBR: http://www.openfrbr.org/ Its aim is to build a full, free implementation of FRBR, showing everything it can do, and looking for problems along the way. Everyone's welcome to get involved in whatever way they wish. I can't get to that site (is it down?), but a few words on what you're trying to do (is it a technical approach, model approach, philosophical approach?), and how you want to do it would be great. Alex -- Ultimately, all things are known because you want to believe you know. - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Getting data from Voyager into XML?
On 1/18/07, Doran, Michael D [EMAIL PROTECTED] wrote: So you may find that there is a well-founded reluctance among Voyager systems people to get too carried away with the DBA 101 stuff. ;-) We're routing around the problem by creating a webservice that is Voyager-specific and letting other apps and services use that instead. That means that if you have to do DBA stuff, you do it in one spot. It's not the ultimate solution, but it solves a great deal of legacy and flexibility problems. Alex -- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/ ---
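An illustrative sketch of that single-spot facade, stdlib Python only: the database-specific SQL lives in exactly one module, and every other app gets XML from it over HTTP. The table, column and element names here are invented for illustration, not Voyager's actual schema.

```python
# Facade sketch: DBA knowledge is confined to this one module; other
# apps and services only ever see the XML it serves. Hypothetical schema.
from xml.etree.ElementTree import Element, SubElement, tostring

# The one spot where the database-specific query is allowed to live
# (an invented example, not a real Voyager query):
BIB_SQL = "SELECT bib_id, title, author FROM bib_text WHERE bib_id = :id"

def row_to_xml(row):
    """Turn a (bib_id, title, author) row into the XML the facade serves."""
    record = Element("record", id=str(row[0]))
    SubElement(record, "title").text = row[1]
    SubElement(record, "author").text = row[2]
    return tostring(record, encoding="unicode")

print(row_to_xml((56179, "Dune", "Herbert, Frank")))
# <record id="56179"><title>Dune</title><author>Herbert, Frank</author></record>
```

If the underlying schema changes, only `BIB_SQL` and `row_to_xml` change; the XML contract with the rest of the infrastructure stays put.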
Re: [CODE4LIB] Videos?
On 3/6/07, Noel Peden [EMAIL PROTECTED] wrote: I'm finally back in the office today and the videos are in process... I'm not sure where they'll go, but they'll be up somewhere. BTW, if anybody has any ideas for royalty free title music (a short 3+ second thing), I'm open. I'll whip up something if needed. In my dark past I was a musician, and I've got stuff lying around waiting for the opportune moment to be donated. What are you looking for? Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] PHP Symfony
On 3/24/07, Michael J. Giarlo [EMAIL PROTECTED] wrote: Hmm? What's that you say? Just a sec, but in the meantime, why not sit down and have some of this delicious Kool-Aid over here? It's Ruby Red-flavored; I think you'll like it. Come, now; those who meddle in things PHP know that a lot of the goodness you get from Ruby you'll these days find in PHP as well. Things have progressed quite a bit in the last 5 years, and PHP 5.2 is quite mature and offers an OO model on par with Ruby, without the hassle of being a fringe technology. :) As for Symfony, yes, it's pretty good and complements (or answers) the RoR thing well. I personally don't use it as I'm more of an XSLT, SOA, REST freak (and Symfony is slightly tricky to push into that box, especially given the non-MVC direction of the SOA we're building). Now that RoR 1.2 has better support for REST I think Symfony may follow, but I don't like the default templating language (PHP with specials) nor the non-MVC paradigm. Having said that, I haven't used it for a few versions and things may have improved. Check it out. Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Position Available: Manager of Data Systems
On 5/18/07, Patty De Anda [EMAIL PROTECTED] wrote: MANAGER OF DATA SYSTEMS ... and not a word (that I could find) on where in the world - or where in the assumed USA - this position is held. :) Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] good web service api
On 6/30/07, Eric Lease Morgan [EMAIL PROTECTED] wrote: What are the characteristics of a good Web Service API? That you refrain from the notion of an API. :) Seriously, before you do anything, read the book RESTful Web Services by Leonard Richardson and Sam Ruby (http://www.oreilly.com/catalog/9780596529260/). I'd do it the ROA way (and have for some time; resource-oriented architecture), but I do understand it puts a certain strain on the areas of the brain responsible for learning conceptually new things. Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
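To give the ROA point some flesh, here is a minimal stdlib-Python sketch of the style the book argues for, not any real library's API: every "thing" gets its own URI, and the HTTP verbs form the whole interface. The `/records/123` URI and the toy data are made up.

```python
# Resource-oriented sketch: URIs name resources, GET fetches their
# representations, and a 404 means "no such resource". Hypothetical URIs.
from http.server import BaseHTTPRequestHandler, HTTPServer
import json
import re

RECORDS = {"123": {"title": "Dune", "author": "Frank Herbert"}}  # toy store

def resolve(path):
    """Map a URI onto a resource: ('collection', None), ('record', id),
    or (None, None) when the URI names nothing."""
    if path == "/records":
        return ("collection", None)
    m = re.fullmatch(r"/records/(\w+)", path)
    return ("record", m.group(1)) if m else (None, None)

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        kind, rid = resolve(self.path)
        if kind == "record" and rid in RECORDS:
            status, body = 200, json.dumps(RECORDS[rid])
        elif kind == "collection":
            status, body = 200, json.dumps(sorted(RECORDS))
        else:
            status, body = 404, json.dumps({"error": "no such resource"})
        payload = body.encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(payload)

# To serve for real: HTTPServer(("localhost", 8000), Handler).serve_forever()
```

Note there is no `getRecord`-style method name anywhere; the addressability of the URIs plus the uniform GET/PUT/POST/DELETE verbs *is* the "API".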
Re: [CODE4LIB] Open Source OPAC - VUFind Beta Released
On 7/20/07, Andrew Nagy [EMAIL PROTECTED] wrote: http://www.vufind.org/ Excellent stuff, and thanks for the open-source effort. Three things ; 1. Will there be efforts towards a development community outside your library? 2. http://www.vufind.org/demo/Record/56179 has serious problems in its similar items section. :) 3. If you scroll down a list of things and then do something that requires a login, only the top part of the page that's not in view has the action. The user sees nothing, and nothing happens. Apart from that, great stuff and, if you accept such, I'd love to participate in ways that I can. Kind regards, Alexander -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] [Fwd: [NGC4LIB] A Thought Experiment]
Hiya, On Nov 9, 2007 7:42 AM, Carl Grant [EMAIL PROTECTED] wrote: I'm seeking some help understanding here. From my perspective (again, that of a long time vendor of commercial software having recently moved to commercial service for OSS software) this is exactly what a number of us (LibLime, Evergreen, Index Data, CARE Affiliates) are *trying* to do. We're not only providing the services to allow libraries to adopt open source, we're also doing the marketing and selling that libraries seem to require before they'll even consider the option. I think this is extremely important for the library world right now, far more important than any current standard, model or prototyping exercise: support the vendors going Open Source. Don't think about it for too long; we must grab this opportunity *at all cost*, because, frankly, it's the only chance we've got to set ourselves straight again. The only way to get away from the suppressed, locked-down, legacy-driven world we currently live in is to embrace openness, especially when it's coming from vendors (who are, by that very token, asking us to work *with* them this time instead of just buying their stuff). There's one condition here, though, for the vendors: you *must* adopt web services for *every* part of your solutions. I know that this often goes against the grain of a proposed system (a system that holistically solves a problem space), but the truth of the matter is that you will never make your system work spot on for everyone, and we need the reassurance (even if we never use the option) of going in a different direction or using someone else's solution for a particular problem. By allowing a more open development model the library world will love you and gladly give you money for support and further development. Consider the openness a token even more than a real option.
Here's a quick list of things that crucially need to happen:
* The library world has to come together to create a common language for these web services, an ontology if you will. We must decide on a few good (and possibly already existing) protocols and dictionaries.
* Vendors must settle on a development model for web services (and I'd humbly suggest a REST model) and not be afraid of opening up or segmenting their holistic solutions into sharable / interchangeable parts.
* Get some outside experts in to handle usability and interaction design, and open source the result. Create a consortium or interest group for library systems usability and user experience.
* Make sure we've got a *clean* cut of technology between business logic and the user interface. Enforce low-key, semantically-rich XHTML and use CSS everywhere.
Here's to dreaming. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] theinfo.org: for people who work with big data sets
On Jan 16, 2008 7:08 AM, Aaron Swartz [EMAIL PROTECTED] wrote: http://theinfo.org/ Excellent initiative! Joined, and I'll forward the information around to other communities I know do this type of work. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
Let's try the litmus test for enterprisey business bullshit: porridge.
Recommendations for Users:
* Look for a sustainable community that has a critical mass of skills supporting porridge.
* Look for a cultural match between the porridge community and your internal developers and user culture, as it enhances communication and perceived user satisfaction.
* Prepare an SOA that can integrate IT services from many sources, including porridge.
* Avoid porridge that is not built on open standards.
* Make a conscious risk-based decision about whether you will depend on internal resources or external services for your porridge implementations.
In short, another template piece where [insert your favourite thing here] is wrapped around generic advice. Do they say anything that's specific to what open source is all about? Alex (without reading the darn article...) -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
On Sun, Mar 30, 2008 at 7:51 PM, K.G. Schneider [EMAIL PROTECTED] wrote: Sorry, Alexander, I disagree. What, is that allowed!? :) Gartner may sound creaky but under the starchy language, this is pretty revolutionary advice. I can't agree with the "revolutionary advice" part; business leaders, firms, advisers and abusers have been saying this for years already. That Gartner is now on the field saying it too shows nothing except how conservative they are; this is an old message, and certainly not aimed at the people doing the actual work in their organisations. I've been in the enterprise for most of my life as a high-flying consultant (except for my non-enterprise last few years in the library world), and currently work as manager, developer and advisor to some of the largest enterprise organisations around. We've always recommended and / or used OSS, and integrated that very ideal into the fabric of enterprise software development. The only people Gartner is now playing to are the business people, who will be surprised to learn that their organisations already use (and many fully embrace) OSS, and have done so for years. (How they'll cope with that news is another story, and maybe Gartner is their coming safety blanket.) Even the big guys who think that only the Oracle business stack is good enough for them will be surprised to find the odd OSS project supporting their infrastructure. OSS is already successful, and it's already working great even if the MBAs don't know it. And because Gartner is now playing to those people, that's why the porridge litmus test works so well; in reality, nothing will change, which for many is the perfect advice. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
On Mon, Mar 31, 2008 at 2:45 AM, D Chudnov [EMAIL PROTECTED] wrote: ...at the risk of upsetting *everybody*... It's a bit depressing that once we get an interesting discussion going on this list (which normally has such low volume), a discussion which is *definitely* on-topic, someone comes along and tries to kill it because it doesn't fit *their* ideal of what the topics should be. Allow me to vent a few seconds: sorry, but OSS is *all* about code and often about business models, and rest assured Karen and all the rest of us *definitely* are defining the enterprise in question as the library world, so this is *all* about code for libraries. We aren't writing code in the posts, but we certainly are talking about code. Nitpicking about such *tiny* semantic differences is just one of those things which drives me up the wall! Of *course* this topic has a place on this list, and of *course* we're not going to create Yet Another MailingListForSomethingJustBecauseWeAreBloodyLibraries, and of *course* we should talk about these things, *especially* here where coders talk about code. Code is more than syntax. But I guess this thread is dead now, and so is at least *my* ideal of what this list is, so take care. Grumpy, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions
On Thu, May 22, 2008 at 5:06 PM, K.G. Schneider [EMAIL PROTECTED] wrote: I feel self-conscious about seeing posts reflected in the planet that are not related to library technology, only because I'm not willing to break up my blog into sub-blogs and don't know if oysters and pace layering really go together for the planet. Ouch, I suspect a conversation next about what fits the code4lib planet moniker. Do my technology rants that don't bash MARC fit? Do Topic Maps fit, even if libraries don't use them but they are a perfect fit? Posts about philosophical aspects of the code we make? Or epistemological musings on workflows? Let's not forget that the human aspect of the library profession is what makes librarians so great ... It's a tough one. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] PHP5 Help
On Tue, Jul 1, 2008 at 13:42, Nicole Engard [EMAIL PROTECTED] wrote: I am missing something right in front of my eyes. I'm rusty on my PHP, I'm wondering if someone can help me with this error: Warning: gmmktime() expects parameter 3 to be long, string given in /public_html/magpierss-0.72/rss_utils.inc on line 35 Well, it's a bit puzzling in the sense that the parameters should all be ints, but hey. :) Try casting the values: gmmktime( (int) $hours, (int) $minutes, (int) $seconds, (int) $month, (int) $day, (int) $year ); note that (int) is the cast you want here; PHP has no separate (long) cast, since PHP integers are platform longs under the hood. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
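The underlying problem is generic: regex captures come back as strings, and time functions want integers. For illustration, the same fix in Python (a hypothetical date parser, not magpierss itself), with `calendar.timegm` playing the role of `gmmktime`:

```python
# Illustration of the cast-before-the-time-call fix: regex groups are
# strings, so coerce them to int before handing them to timegm.
import calendar
import re

def parse_utc(stamp):
    """Parse 'YYYY-MM-DDTHH:MM:SS' and return a Unix timestamp (UTC)."""
    m = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2}):(\d{2})", stamp)
    y, mo, d, h, mi, s = (int(g) for g in m.groups())  # the cast IS the fix
    return calendar.timegm((y, mo, d, h, mi, s))

print(parse_utc("1970-01-02T00:00:00"))  # 86400
```

Dropping the `int(...)` reproduces the same class of "expected int, got string" trouble the warning above complains about.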
Re: [CODE4LIB] marc21 and usmarc
On Tue, Jan 27, 2009 at 17:04, Eric Lease Morgan emor...@nd.edu wrote: Can somebody say MARCXML or MODS complete with a schema? Well, we can say it, and I think we *have* said it for a very long time, but it doesn't seem to change anything. Damn those words. Such solutions offer at least syntactic validation if not also semantic validation. Oh well. I would say a little bit more than "oh well" (but I don't really have more to add; you know how I feel :), but I would love to hear what the vendors are thinking about all this. They seem to be very, very quiet about it all (without speculating as to why ...) regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] marc21 and usmarc (fwd)
On Tue, Jan 27, 2009 at 17:09, Ardie Bausenbach a...@loc.gov wrote: Since that time, many other national libraries have moved from their national formats to MARC 21, including (among others), the UK, Germany, Finland, and Spain. I know a few more, but another point worth, er, screaming about, is the various AACR2 / RDA / other rule changes that aren't linked to MARC at all. I know a lot of it is covered in MARC documentation, but there are hidden gems, like punctuation, symbols, character encodings, etc. which aren't always specified. If the library world embraced XML as a minimum, a lot could be fixed in that area (and no, XMLMARC does not qualify :). Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] marc21 and usmarc
On Tue, Jan 27, 2009 at 18:56, Kyle Banerjee kyle.baner...@gmail.com wrote: There are arguments to do so, but the business case is not strong. Well, I'd say the future of the library world is a good business case, and I know several people (high and low) fully aware of it, but I think it's hard to take any step in either direction that would be deemed worth it. Tough one, indeed. That data providers won't send MODS until libraries demand it. Libraries won't demand it until their systems use it. Systems won't use it until libraries demand it because that's what their data providers require. Well, I've been yelling for vendors to get more involved for a long time, but there's a lot of blankness coming from them. I guess they're happy with the current tie to MARC (binding the libraries to them forever) until the business is gone ... It's a vicious circle, so we're stuck with MARC. The only people who aren't happy with this arrangement are those who are trying to create something new. Many librarians who think they use MARC every day have no idea that it is a binary format that is unfriendly to eyes and machines. MARC may be MAchine Readable, but not MAchine Understandable or even MAchine Usable. I had an idea some time ago to create a dummy / fake MARC record with much more to it (like extensions and special tags systems can react to, such as validation) and pass it around the infrastructure to see what in it survives (the golden rule is to ignore what you don't understand, although I know a few MARC systems that filter out what they don't understand (!!!) because, well, these systems were mostly built back when a megabyte of storage and / or memory had a price of about a cataloger or two. Friggin' crazies!). Anyone in? :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MARC 21 and MODS
On Wed, Jan 28, 2009 at 20:29, Rebecca S Guenther r...@loc.gov wrote: It is interesting though that a study of different metadata formats at Los Alamos National Labs a few years ago concluded that MARCXML was the richest and most robust. http://www.dlib.org/dlib/september06/goldsmith/09goldsmith.html Umm, I just have to add that all those compared wouldn't make it onto my top 10 list of good formats, so, er, comparing library formats against each other is a bit like comparing all the wonderful juicy fruit in the world where your selection is limited to what can grow in Alaska. It still amazes me that RDF and / or DC hidden in SRDF or Topic Maps haven't gotten any traction when they seriously match what you want. We are also working on modeling MODS as RDF -- some work has already been done on this. That is good news, albeit a little late and certainly a little slow. But I hear good things about Talis moving into this arena, and hopefully they can pull a few other vendors with them. I guess the first thing that is needed is a basic MARC / RDF vocabulary we can all participate in and extend, and then cross-pollinate vocabularies as we move away from AACR2 to more RDA / FRBR friendly stuff (although, me personally, I would jump way ahead of RDA, but that's not going to happen). In terms of MARC, we are planning for its evolution and streamlining to get rid of some of its problems and plan for a future where the transition to new cataloging rules will work well with existing records and cataloging infrastructure. Are you talking about RDA here? And when will these changes happen, in what form, how do you build momentum and expertise, etc.? Whatever the format of the future is, the transition will need to be evolutionary because of the billions of records that are out there and the need to satisfy a lot of the user tasks required of library (and other) metadata.
I agree fully, although I'd stress the poor infrastructure as a reason more than the records available (they can always be converted into something else, but you can't easily change how systems require MARC21). It is also worth noting that despite some calls for a MARC replacement, we have a number of national libraries throughout the world that are abandoning their national formats and just now adopting MARC 21. They also need to be considered in this transition. I find it a bit scary that it's taken this long, but I certainly welcome the change as it makes it easier to move from one format to the other once we all agree on a fundamental platform. But I still don't think a clear direction forward is set. Any docos you can point to about the future direction of LoC-approved metadata exchange? regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MARC 21 and MODS
Hi there, On Thu, Jan 29, 2009 at 15:55, Rebecca S Guenther r...@loc.gov wrote: Yes, better late than never (we're a small office and stretched thin). You're not *that* small, no? :) Also we want to explore MARC/RDF. We also have to keep in mind that MARC is also used by non-AACR2 users (and when RDA is implemented non-RDA users). Shouldn't the library world slowly work towards a common set of rules, backed by technology, to make it easier for us all to move forward with less pain? As a starting point in exploring semantic web types of technologies we are establishing a registry for controlled values used in various standards-- MARC, MODS, PREMIS. See the text at: http://id.loc.gov Ah, I like! This is very close to the Topic Maps concept of Published Subject Indicators. Could the identifiers within have a certain degree of persistence and resolvability? If so, both the SemWeb and TM communities could use this out of the box. I also think the DC RDA working-group has something similar. Karen? And should you work together? In the meantime we have a prototype at: http://www.loc.gov:8081/standards/registry/lists.html Can't make much of it work there. Must be in alpha. :) But I like this direction. If you now can get the vendors on board, or better, make more SemWeb systems yourselves, that's a *huge* step forward. I'm *very* excited to see this coming from LoC. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
On Thu, Feb 12, 2009 at 21:43, Rebecca S Guenther r...@loc.gov wrote: Patrick is right that an XML schema such as MODS or MARCXML would be text/xml. I would strongly advise against text/xml, as it is an oxymoron (XML is not text, even if it is delivered over a text protocol), and more and more are switching away from the generic text types (which make little sense for structured data). Hence, a more correct MIME type for XMLMARC would be application/marc+xml, although until registered it should be application/x-marc+xml. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
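A tiny sketch of the suggestion above, in Python; the function name is mine, and as far as I know application/marc+xml is not actually registered with IANA, hence the x- prefixed experimental form as the default:

```python
def marcxml_content_type(registered=False):
    # Until the type is registered, convention is an x- prefixed
    # experimental media type; afterwards the unprefixed form applies.
    mime = "application/marc+xml" if registered else "application/x-marc+xml"
    # XML responses should carry an explicit charset rather than rely on
    # text/*'s historical us-ascii default.
    return mime + "; charset=utf-8"
```

Whatever server framework you use, the point is simply to emit this value as the Content-Type header instead of text/xml.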
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
On Thu, Feb 12, 2009 at 22:32, Jonathan Rochkind rochk...@jhu.edu wrote: Didn't we finish having this conversation last week? We talked about all this stuff being brought up now last week. We did indeed, and your summary is better than what my retort could have been; spot on. I guess it's hard to understand why text/xml is such a waste of MIME and time as long as we've still got text/html as the originally understood MIME type for HTML pages, but luckily the internet has moved on and evolved. :) One question we haven't asked is if we really need a MIME type for MARCXML. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
One question we haven't asked is if we really need a MIME type for MARCXML. :) On Thu, Feb 12, 2009 at 23:28, Jonathan Rochkind rochk...@jhu.edu wrote: PPS: Yes, it has been asked, and it's pretty obvious to me that we do. I wasn't asking for technical reasons; I was more having a stab at how many people use and need MARCXML specifically as compared to a number of other more used formats. I mean, seriously, you can use MARCXML embedded in Atom and get the best of both worlds instead. Don't worry about it; it's not a serious _enough_ question. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] points of failure (was Re: [CODE4LIB] resolution and identification )
On Fri, Apr 3, 2009 at 10:44, Mike Taylor m...@indexdata.com wrote: Going back to someone's point about living in the real world (sorry, I forget who), the Inconvenient Truth is that 90% of programs and 99% of users, on seeing an http: URL, will try to treat it as a link. They don't know any better. What on earth is this about? URIs *are* links; it's in their design, it's what they're supposed to be. Don't design systems where they are treated any differently. Again we're seeing the all-we-need-are-URIs poor judgement of SemWeb enthusiasts muddying the waters. The short of it is, if you're using URIs as identifiers, having the choice to dereference them is a *feature*; if one resolves to a 404 then tough (and I'd say you designed your system poorly), but if it resolves to an information snippet about the semantic meaning of that URI, then yay. This is how us Topic Mappers see this whole debacle and flaw in the SemWeb structure, and we call it Published Subject Indicators, where Published means it resolves to something (just like WikiPedia URIs resolve to some text that explains what it is representing), Subjects are anything in the world (but distinct from Topics, which are software representations), and Indicators as they indicate (rather than absolutely identify) things. In other words, if you use URIs as identifiers (which is a *good* thing), then resolvability is a feature to be promoted, not something to be shunned. If you can't make good systems design, use URNs. You can treat URI identifiers as both identifiers and subject indicators, while URNs are evil. Let's make our identifiers look like identifiers. What does that even mean? :) (By the way, note that this is NOT what I was saying back at the start of the thread. This means that I have -- *gasp* -- changed my mind! Is this a first on the Internet? :-) Maybe, but it surely will be the last ... Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson azar...@liverpool.ac.uk wrote: I would encourage looking at rdf triplestores seriously, if the graph approach is the direction that you want to go in. Or, Topic Maps, which is *not* a triplestore, closer to the OO model (basically a meta-data model), and doesn't carry the stack overflow of RDF (RDF, RDFS, OWL 1-2-3) nor anonymous nodes. :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 14:33, stuart yeates stuart.yea...@vuw.ac.nz wrote: That's not an entirely useful comparison of topic maps and RDF. If I intended to be useful I'd write something substantial, backed up with stuff other than humour. I'll give that a go the next time. :) We currently use topic maps, a lot, in our infrastructure. If we were starting again tomorrow, I'd advocate using RDF instead, mainly because of the much better tool support and take-up. Hmm, not a good thing at all. Could you elaborate, though, as I use it as part of our infrastructure too, and wouldn't touch RDF / SemWeb without a long stick? I'm into application semantics and shared knowledge-bases. What are you guys doing where you feel the support and tools are lacking? And what are the RDF alternatives? Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
On Tue, Apr 14, 2009 at 23:34, Jonathan Rochkind rochk...@jhu.edu wrote: The difference between URIs and URLs? I don't believe that URL is something that exists any more in any standard, it's all URIs. Correct me if I'm wrong. Sure it exists: URLs are a subset of URIs. URLs are locators as opposed to just identifiers (which is an important distinction, much used in SemWeb lingo), where URLs are closer to the protocol-like things Ray describes (or so I think). I don't entirely agree with either dogmatic side here, but I do think that we've arrived at an awfully confusing (for developers) environment. But what about it is confusing (apart from us having this discussion :) ? Is it that we have IDs that happen to *also* resolve? And why is that confusing? Re-reading the various semantic web TAG position papers people keep referencing, I actually don't entirely agree with all of their principles in practice. Well, let me just say that there's more to SemWeb than what comes out of the W3C. :) Kind regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
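For the record, the subset relationship is easy to poke at with Python's standard urlparse; both of the values below are URIs, but only the second one also describes an access mechanism (the example URN is made up):

```python
from urllib.parse import urlparse

# A urn-scheme URI: a pure name, with no access mechanism attached.
name_only = urlparse("urn:example:some-subject")

# An http URI: identifies a resource *and* describes how to locate it,
# which is what makes it a URL as well as a URI.
locator = urlparse("http://psi.ontopedia.net/Alexander_Johannesen")
```

Syntactically the parser treats both the same way; whether retrieval is possible is a property of the scheme and the naming authority, not of the string.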
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 07:10, stuart yeates stuart.yea...@vuw.ac.nz wrote: RDF, unlike topic maps, is being used by substantial numbers of people who we interact with in the real world and would like to interoperate with. If we used RDF rather than topic maps internally, that interoperability would be much, much cheaper. It's tempting to say it's free, but it's not quite, because it does impose some constraints. But it's not that hard to create a bridge from RDF to Topic Maps and back, no? Or is your interop story different? In my eyes, the core thing that RDF supports that topic maps don't seem to is seamless reuse by people you don't care about. Yes, this has been brought up on several occasions, including by me at TMRA 2008. But then, it's not so much that RDF does something that Topic Maps doesn't *support*, it's that it's packaged differently. So, where RDF has got five standard ontology levels (RDF, RDFS, OWL DL/Lite/Full), Topic Maps has got one simpler one (TMDM), yet neither can express anything better or differently than the other. My theory here is that people *like* the 5 layers of RDF, because it gives the false sensation of choice. But it's all ontological definitions. However, the 5 levels of RDF do indeed create a defined platform for sharing (if not cast in iron), whereas in the TM world you need to include it / create it yourself. Oh, and of course the academics seem to have embraced the W3C and anything by the authority of TBL, and its effect is trickling down. For example the people at http://lcsubjects.org have never heard of us (that I know of), but we can use their URLs like http://lcsubjects.org/subjects/sh90005545#concept to represent our roles. Not sure I understand your example. Here's my Topic Map identifier in a Topic Map; http://psi.ontopedia.net/Alexander_Johannesen Identifier and locator, and resolvable, and can be used by anyone.
Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
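And to show that the bridge claim above isn't pure hand-waving, here's a toy Python sketch of folding RDF triples into topic-like structures keyed by subject identifier. This is a toy data model, not the TMDM and not any real RDF library; the predicate/value pairs stand in for occurrences:

```python
def triples_to_topics(triples):
    # Each distinct subject URI becomes a topic whose PSI set starts with
    # that URI; (predicate, object) pairs become simple occurrences.
    topics = {}
    for s, p, o in triples:
        topic = topics.setdefault(s, {"psis": {s}, "occurrences": []})
        topic["occurrences"].append((p, o))
    return topics

topics = triples_to_topics([
    ("http://psi.ontopedia.net/Alexander_Johannesen", "name", "Alex"),
    ("http://psi.ontopedia.net/Alexander_Johannesen", "likes", "Topic Maps"),
])
```

Going the other way is just as mechanical: emit one triple per occurrence, using any of the topic's PSIs as the subject URI.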
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 10:32, stuart yeates stuart.yea...@vuw.ac.nz wrote: Yes, we mint something very similar (see http://authority.nzetc.org/52969/ for mine), but none of our interoperability partners do. None of our local libraries, none of our local archives and only one of our local museums (by virtue of some work we did with them). All of them publish and most consume some form of RDF. Hmm, RDF resources are just URIs, so I'm still a bit unsure about what you mean. Are you talking about the fact that the RDF definitions (and not the RDF vocabs themselves) aren't encoded in your TM engine? Additionally many of the taxonomies we're interested in are available in RDF but not topic maps. Converting them to a Topic Map isn't that hard to do, but I guess there is *a* cost there. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
On Wed, Apr 15, 2009 at 00:20, Jonathan Rochkind rochk...@jhu.edu wrote: Can you show me where this definition of a URL vs. a URI is made in any RFC or standard-like document? From http://www.faqs.org/rfcs/rfc3986.html ; 1.1.3. URI, URL, and URN A URI can be further classified as a locator, a name, or both. The term Uniform Resource Locator (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network location). The term Uniform Resource Name (URN) has been used historically to refer to both URIs under the urn scheme [RFC2141], which are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable, and to any other URI with the properties of a name. An individual scheme does not have to be classified as being just one of name or locator. Instances of URIs from any given scheme may have the characteristics of names or locators or both, often depending on the persistence and care in the assignment of identifiers by the naming authority, rather than on any quality of the scheme. Future specifications and related documentation should use the general term URI rather than the more restrictive terms URL and URN [RFC3305]. As you can see, an URI is an identifier, and a URL is a locator (mechanism for retrieval), and since a URL is a subset of an URI, you _can_ resolve URIs as well. Sure, we have a _sense_ of how the connotation is different, but I don't think that sense is actually formalized anywhere. 
It is, and the same stuff is documented in WikiPedia as well; http://en.wikipedia.org/wiki/Uniform_Resource_Identifier http://en.wikipedia.org/wiki/Uniform_Resource_Locator I think the sem web crowd actually embraces this confusingness, No, I think they take it at face value; they (the URIs) are identifiers for things, and can be used for just that purpose, but they are also URLs, which means they resolve to something. What I think you're getting at is that the thing it resolves to is what has no definition. But then, if you go from RDF to Topic Maps PSIs (PSIs are URIs with an extended meaning), *that* thing it resolves to indeed has a definition; it's the prose explaining what the identifier identifies, and this is the most important difference between RDF and Topic Maps (and a very subtle but important difference, too). they want to have it both ways: Oh, a URI doesn't need to resolve, it's just an opaque identifier; but you really should use http URIs for all URIs; why? because it's important that they resolve. I smell straw-man. :) But yes, they do want both, as both is in fact a friggin' smart thing to have. We all deal with identifiers all the time, in internal as well as external applications, so why not use an identifier scheme that has the added bonus of a resolver mechanism? If you want to be stupid and lock yourself in your limited world, then using them as just identifiers is fine but perhaps a bit, well, stupid. But if you want to be smart about it, realizing that without ontological work there will *never* be proper interop, you use those identifiers and let them resolve to something. And if you're really smart, you let them resolve to either more RDF statements, or, if you're seriously Einsteinly smart, use PSIs (as in Topic Maps) :). In general, combining two functions in one mechanism is a dangerous and confusing thing to do in data design, in my opinion. Because ... ? By analogy, it's what gets a lot of MARC/AACR2 into trouble.
Hmm, and I thought it was crap design that did that, coupled with poor metadata constraints and validation channels, untyped fields, poor tooling, the lack of machine understandability, and the general library idiom of not invented here. But correct me if I'm wrong. :) Over in: http://www.w3.org/2001/tag/doc/URNsAndRegistries-50-2006-08-17.html Umm, I'd be wary of taking as canon a draft with editorial notes going back 4 to 5 years that still aren't resolved. In other words, this document isn't relevant to the real world. Yet. They suggest: URI opacity 'Agents making use of URIs SHOULD NOT attempt to infer properties of the referenced resource.' Well, as a RESTafarian I understand this argument quite well. It's about not assuming too much from the internal structure of the URI. Again, it's an identifier, not a scheme such as a URL where structure is defined. Again, for URIs, don't assume structure, because at this point it isn't a URL. If I get a URI representing (eg) a Sudoc (or an ISSN, or an LCCN), I need to be able to tell from the URI alone that it IS a Sudoc, AND I need to be able to extract the actual SuDoc identifier from it. That completely violates their Opacity requirement I think you are quite mistaken on this, but before we leap into whether the web is suitable for SuDoc I'd
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
Hiya, On Thu, Apr 16, 2009 at 01:10, Jonathan Rochkind rochk...@jhu.edu wrote: It stands in the way of using them in the fully realized sem web vision. Ok, I'm puzzled. How? As the SemWeb vision is all about first-order logic over triplets, and the triplets are defined as URIs, if you can pop something into a URI you're good to go. So how is it that SuDoc doesn't fit into this, as you *can* chuck it in a URI? I said it was unfriendly to the Web, not impossible. It does NOT stand in the way of using them in many useful ways that I can and want to use them _right now_. Ah, but then go fix it. Ways which having a URI to refer to them are MUCH helped by. Whether it can resolve or not (YOU just made the point that a URI doesn't actually need to resolve, right? I'm still confused by this having it both ways -- URIs don't need to resolve, but if your URIs don't resolve then you're doing it wrong. Huh?) C'mon, it ain't *that* hard. :) URIs as identifiers is fine; having them resolve as well is great. What's so confusing about that? , if you have a URI for a SuDoc you can use it in any infrastructure set up to accept, store, and relate URIs. Like an OpenURL rft_id, and, yeah, like RDF even. You can make statements about a SuDoc if it has a URI, whether or not it resolves, whether or not SuDoc itself is 'web friendly'. One step at a time. This is my frustration with semantic web stuff, making it harder to do things that we _could_ do right here and now, because it violates a fantasy of an ideal infrastructure that we may never actually have. Huh? The people who made SuDoc didn't make it web friendly, and thus the SemWeb stuff is harder to do because it lives on the web? (And chucking your meta data into HTML as MF or RDF snippets ain't that hard, it just requires a minimum of knowledge) There are business costs, as well as technical problems, to be solved to create that ideal fantasy infrastructure.
The business costs are _real_ No more real than the costs currently in place. The thing is that a lot of people see the traditional costs disappear with the advent of SemWeb and the new costs heavily reduced. Also, having a unified resolver for SuDoc isn't hard, can sit at a fixed URL, and use a parameter for identifiers. You don't need to snoop the non-parameterized section of a URI to get the IDs; Okay, Alex, why don't you set this up for us then? Why? I don't give a rat's bottom about SuDoc, don't need it, think it's poorly designed, and it gives me nothing in life. Why should I bother? (Unless I'm given money for it, then I'll start caring ... :) And commit to providing it persistently indefinitely? Because I don't have the resources to do that. Who's behind SuDoc, and are they serious about their creation? That's where you should send your anger instead. And for the use cases I am confronted with, I don't _need_ it, any old URI, even not resolvable, will do--yes, as long as I can recognize it as a SuDoc and extract the bare SuDoc out of it. So what's the problem with just making some stuff up? If you can do your thing in a vacuum I don't fully understand your problem with the SemWeb stuff? If you don't want it, don't use it. Which you say I shouldn't be doing (while others say that's a mis-reading of those docs to think I shouldn't be doing it) No, I think this one is the subtle difference between a URL and a URI. but avoiding doing that would raise the costs of my software quite a bit, and make the feature infeasible in the first place. Business costs and resources _matter_. As with anything on the Web, you work with what you've got, and if you can fix and share your fix, we'll all love you for it. I seriously don't think I understand what you're getting at here; it's been this way since the Web popped into existence, and I don't really want it to be any other way.
No it's not; if you design your system RESTfully (which, indeed, HTTP is) then the discovery part can be fast, cached, and, using URI templates embedded in HTTP responses, fully flexible and fit for your purposes. These URIs are _external_ URIs from third parties, I have no control over whether they are designed RESTfully or not. Not sure I follow this one. There are no good or bad RESTful URIs, just URIs. REST is how your framework works with the URIs. In the meantime, I'll continue trying to balance functionality, maintainability, future expansion, and the programming and hardware resources available to me, same as I always do, here in the real world when we're building production apps, not R&D experiments My day job is to balance functionality, maintainability, future expansion, and the programming and hardware resources available to me, same as I always do, here in the real world when we're building production apps ... and I'm using Topic Maps and SemWeb technologies. Is there something I'm doing which degrades my work to an R&D experiment, something I should let my customers
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
With Topic Maps it's been solved years and years ago, and it's the part of it that the RDF world didn't think of until recently (and applied their kludges). I'm not going to bang my gong on this, just urge you to read up on PSIs. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
[CODE4LIB] Another nail in the coffin
Another nail in the library coffin, especially the academic ones ; http://www.youtube.com/watch?v=5TIOH80Qg7Q Organisations and people are slowly turning into data producers, not book producers. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Another nail in the coffin
On Mon, May 4, 2009 at 23:25, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: You're forgetting the 5th Law: The library is a growing organism. http://en.wikipedia.org/wiki/Five_laws_of_library_science Not forgotten, I just don't believe it anymore. And, taken to its natural consequence, organisms through evolution come and go. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Another nail in the coffin
On Mon, May 4, 2009 at 22:44, Andreas Orphanides andreas_orphani...@ncsu.edu wrote: You say that as though libraries are all about books. Libraries still have the word biblio at their root, and it certainly is the written word on paper that occupies most of our time, no? Sure, libraries around the world are trying to play catch-up in the digital and modern world with all sorts of things, but the primary directive is still books for most librarians. Not sure what you mean by what they're *really* into? Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Wed, May 6, 2009 at 18:44, Mike Taylor m...@indexdata.com wrote: Can't you just tell us? Sorry, but surely you must be tired of me banging on this gong by now? It's not that I don't want to seem helpful, but I've been writing a bit on this here already and don't want to be marked as spam for Topic Maps. In the Topic Maps world our global identifiers are called PSIs, for Published Subject Indicators. There's a few subtleties within this, but they are not so different from any other identifier you'll find elsewhere (RDF, library world, etc.) except of course they are *always* URIs. Now, the thing here is that they should *always* be published somewhere, whether as part of a list or elsewhere. The next thing is that they should always resolve to something (although the standard doesn't require this; however, I'd say you're doing it wrong if you couldn't do this, even if it sometimes is a necessary evil). This last part is really the important bit, where any PSI will act as 1) a global identifier, and 2) resolve to a human-readable text explaining what it represents. Systems can just use it while at the same time people can choose the right ones for their uses. And, yes, the identifiers can be done any way you slice them. Some might think that e.g. a PSI set for all dates is crazy, as you'd need to produce identifiers for all dates (or times), and that would be just way too much to deal with, but again, that's not an identification problem, that's a resolver problem. If I can browse to a PSI and get the text that this is the 3rd of June, 1971, using the whatsnot calendar style, then that's safe for me to use for my birthday. Let's pretend the PSI is http://iso.org/datetime/03061971. By releasing a URI template, computers can work with this automatically, no frills.
Now a bit more technical; any topic (which is a Topic Map representation of any subject, where subject is defined as anything you can ever hope to think of) can have more than one PSI, because I might use the PSI http://someother.org/time/date/3/6/1971 for my date. If my application only understands the former set of PSIs, I can't merge and find similar cross-semantics (which really is the core of the problem this thread has been talking about). But simply attach the second PSI to the same Topic, and you can. In fact, both parties will understand perfectly what you're talking about. More complex is that the definitions of PSI sets don't have to happen on the subject level, i.e. the Topic called Alex to which I tried to attach my birthday. It can be moved up to a meta-model level, where you say the Topic for Time and dates has the PSIs for both organisations, and all Topics just use one or the other; we're shifting the explicitness of identification up a notch. Having multiple PSIs might seem a bit unordered, but it's based on the notion of organic growth, just like the web. People will gravitate towards using PSIs from the most trusted sources (or most accurate, or most whatever), shifting identification schemes around. This is a good thing (organic growth) at the price of multiple identifiers, but if the library world started creating PSIs, I betcha humanity and the library world both could be saved in one fell swoop! (That's another gong I like to bang) I'm kinda anticipating Jonathan saying this is all so complex now. :) But it's not really; your application only has to have complexity in the small meta model you set up, *not* for every single Topic you've got in your map. And they're mergeable and shareable, and as such can be merged and fixed (or cleaned or sobered or made less complex) for all your various needs also. Anyway, that's the basics. Let me know if you want me to bang on.
:) For me, the problem the library world faces isn't really the mechanics of this (because this is solvable, and I guess you just have to trust that the Topic Maps community has been doing this for the last 10 years or so already :), but how you're going to fit existing resources into FRBR and RDA. That's a separate discussion, though. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Sat, May 9, 2009 at 00:32, Jonathan Rochkind rochk...@jhu.edu wrote: I don't understand from your description how Topic Maps solve the identifying multiple versions of a standard problem. It's the mechanism of having multiple identifiers for Topics, so, in pseudo notation ;

Topic MARC21
  psi info:ofi/fmt:xml:xsd:MARC21
  psi http://loc.org/stuff/marc21
  property #mime-type whatever

Topic MARC 1.1
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.1
  psi http://loc.org/stuff/marcxml-v1.1
  property #mime-type whatever 1.1

Topic MARC 1.2
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.2
  psi http://bingo.com/psi/marcxml
  property #mime-type whatever 1.2

Or, if MARC 1.2 is backwards compatible with 1.1 ;

Topic MARC 1.2
  is_a MARC 1.1
  psi info:srw/schema/1/marcxml-v1.2

Or, if I make my own unofficial version ;

Topic MARC 2.0
  is_a MARC 1.2
  psi http://alex.com/psi/marc-2.0

This is enough to cobble together what is and isn't compatible in types of formats, so if your application is Topic Maps aware, this should be trivial (including which formats to ignore or react to). The point is that you don't need *one* identifier for things; Topics are proxies for knowledge, and part of the notion of knowledge is what identifies that knowledge. Multiple PSIs help us leverage both rigid and fuzzy systems. As to the identifiers themselves (as in, their formatting), is that important? Anyway, I suspect I just don't see what the problem is. Creating the best identifier for things seems a bit of a strange notion to me; is this based on the idea that there is only (or rather, that you're trying to create) one identifier for any one thing? Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
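[Editorial note: the pseudo-map in the message above can be sketched as running code. This is a toy, collapsing the MARC/MARC21 naming into one topic and taking the backwards-compatible variant of MARC 1.2; the dict layout and function names are mine, not any Topic Maps API.]

```python
# Toy topic map mirroring the pseudo-map above: each topic carries
# multiple PSIs and an is_a link up the compatibility chain.
TOPICS = {
    "MARC21":   {"is_a": None,       "psis": {"info:ofi/fmt:xml:xsd:MARC21",
                                              "http://loc.org/stuff/marc21"}},
    "MARC 1.1": {"is_a": "MARC21",   "psis": {"info:srw/schema/1/marcxml-v1.1",
                                              "http://loc.org/stuff/marcxml-v1.1"}},
    "MARC 1.2": {"is_a": "MARC 1.1", "psis": {"info:srw/schema/1/marcxml-v1.2"}},
    "MARC 2.0": {"is_a": "MARC 1.2", "psis": {"http://alex.com/psi/marc-2.0"}},
}

def topic_for(psi):
    """Find the topic carrying a given PSI, whichever scheme it came from."""
    for name, topic in TOPICS.items():
        if psi in topic["psis"]:
            return name
    return None

def compatible_with(psi, wanted):
    """Walk the is_a chain: does this PSI's format degrade to `wanted`?"""
    name = topic_for(psi)
    while name is not None:
        if name == wanted:
            return True
        name = TOPICS[name]["is_a"]
    return False

# Two different identifier schemes resolve to the same topic:
assert topic_for("info:srw/schema/1/marcxml-v1.1") == \
       topic_for("http://loc.org/stuff/marcxml-v1.1")
# My unofficial 2.0 is still usable wherever 1.1 is accepted:
assert compatible_with("http://alex.com/psi/marc-2.0", "MARC 1.1")
```

The point is that the application never needs *the* identifier, only *a* PSI it recognises somewhere on the chain.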
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Mon, May 11, 2009 at 16:04, Rob Sanderson azar...@liverpool.ac.uk wrote: * One namespace is used to define two _totally_ separate sets of elements. There's no reason why this can't be done. As opposed to all the reasons for not doing it. :) This is crap design of the highest magnitude, and the designers should either a) be whipped in public and thrown out in shame, or b) repent and be made to fix the problem. Even I would opt for the latter, but such a simple task not being done seems to suggest that perhaps the former needs to be put in place. * One namespace defines so many elements that it's meaningless to call it a format at all. Even though the top level tag might be the same, the contents are so varied that you're unable to realistically process it. Yeah, don't use MODS in general; it's a hack. Crazier still, many versions have the same namespace. What were they thinking?! Anyway, even if the namespace is botched, you can still (if I dare go by the Topic Maps moniker) have multiple namespaces for the same subject (the format in question), and simply publish and use your own and let the TM mechanics handle the ambiguity for you. If enough people do this, and perhaps even use your unofficial identifiers, maybe LOC will see the error of its ways and repent. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Mon, May 11, 2009 at 19:34, Jonathan Rochkind rochk...@jhu.edu wrote: In the real world, we use things when they solve the problem in front of us in as easy a way as possible And somehow you're suggesting that I don't live in the real world? :) Good try, but as far as I've experienced, people in the library world live quite a distance away from the real one. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Thu, May 14, 2009 at 17:35, Rob Sanderson azar...@liverpool.ac.uk wrote: For example, the owl:sameAs predicate is used to express that the subject and object are the same 'thing'. Then the application can infer that if a owl:sameAs b, and a x y, then b x y. Yes, but there's a snag; as RDF works only on the URI resource level (no added semantics in the typification of the URI resource), if someone does an owl:sameAs between an identifier of a thing and a locator of a thing (a locator being the resource itself as opposed to an identifier of it; for example, are you talking about Sun Corp (http://sun.com/) or are you talking about their website (http://sun.com/)?) you can get a nasty case of integrity rot, and I've not seen any proposals to address this issue (the RDF world is essentially assuming modeling from the viewpoint of everything being true). I guess Mike doesn't like RDF *or* Topic Maps now. :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Thu, May 14, 2009 at 17:45, Rob Sanderson azar...@liverpool.ac.uk wrote: I'll quote Mike (and most common approaches to the problem): Don't Do That Then. :) Oh, for sure. :) But these are very subtle things that are hard to understand, especially their long-term implications, so people *will* do this, and they *will* put rot into the SemWeb chains people create. It's unavoidable, but I know lots of people are trying to work out some kind of solution. Unfortunately, this one is being routed to software frameworks rather than the RDF core itself. Oh well. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] A Book Grab by Google
On Thu, May 21, 2009 at 10:07, Karen Coyle li...@kcoyle.net wrote: - without competition, Google (with the agreement of the registry, whose purpose is to garner as much income as possible for rights holders) will charge a price that is more than some institutions will be able to afford; others will subscribe, but to the detriment of other resource subscriptions. How is this different from what's already in place in terms of electronic resources? This is not uniquely Google, nor has it even been proven to happen. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] HTML mark-up in MARC records
Hiya, I guess I'm the one who's got to step up to the sacrificial altar, but the fact that a lot of our systems break or don't know how to handle HTML is despicable. I'm sure you guys are familiar with RSS / Atom, and because in there we *expect* HTML and therefore make sure our back-ends can grok it, it enhances the metadata *greatly*. Don't think for a second that purity of the data format in any shape or form is the definition of its usefulness. Mixed content models might be complex to work with, but their value is immense. I can fully understand *why* people say don't do it, because, yes, it ups the complexity, and dinosaur technologies like MARC and our ILSes breaking under the pressure of more modern technologies perhaps enforce that view, but I don't think we should shun it because of that. If your back-end can't grok HTML, I'd suggest you fix it immediately! If your ILS chokes on XML and / or HTML snippets, I suggest you replace it. You seriously shouldn't allow this rigidity into your infrastructure, and it's depressing to watch how we as complex users of MARC don't dare to extend it to become a format that does what it should and needs to do. Even *if* HTML in MARC records is probably a bad idea. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Library Linked Data
Hiya, On Thu, Oct 29, 2009 at 15:16, Roy Tennant tenna...@oclc.org wrote: Could you elaborate a bit? In my mind, the only semantic web technology of any note is linked data. What do you mean by linked data? I work in fields of semantic web technology where there's very little linked data (ie. data on the web you can link to and use), yet I feel all our work is very valuable and certainly worthy of note ... Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Library Linked Data
Hiya, On Thu, Oct 29, 2009 at 16:19, stuart yeates stuart.yea...@vuw.ac.nz wrote: I'm guessing that Roy meant linked data in the sense of http://www.w3.org/DesignIssues/LinkedData.html and http://linkeddata.org/ I'm pretty sure he did, too. I guess I was trying to smoke out his reasoning for choosing linked data as the only worthwhile semantic web technology. Let me clarify; have a look at this ; http://en.wikipedia.org/wiki/Semantic_Web_Stack Linked data is the bottom four boxes out of a total of 12 (13 if you count the top one), where the ones missing are things like Trust, Proof, Logic, Querying, Ontologies and Taxonomies, all things that I thought evidently belonged at the core of what library science is all about. The lack of understanding from the library world on these things simply astounds me; it's so sad to see that these things aren't linked up; you *are* what these things are about! Sure, linked data is easier; that's why everyone is doing it, and has been doing it for years. But you're missing out on fields that should be second nature to you. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Twitter annotations and library software
Hi, On Thu, Apr 29, 2010 at 22:47, Walker, David dwal...@calstate.edu wrote: I would suggest it's more because, once you step outside of the primary use case for OpenURL, you end-up bumping into *other* standards. These issues were raised all the way back when it was created, as well. I guess it's easy to be clever in hindsight. :) Here's what I wrote about it 5 years ago (http://shelter.nu/blog-159.html) ; So let's talk about 'Not invented here' first, because surely, we're all guilty of this one from time to time. For example, lately I dug into the ANSI/NISO Z39.88-2004 standard, better known as OpenURL. I was looking at it critically, I have to admit, comparing it to what I already knew about Web Services, SOA, HTTP, Google/Amazon/Flickr/Del.icio.us APIs, and various Topic Maps and semantic web technologies (I was the technical editor of Explorers Guide to the Semantic Web). I think I can sum up my experiences with OpenURL as such; why? Why has the library world invented a new way of doing things that can already be done quite well? Now, there is absolutely nothing wrong with the standard per se (except a pretty darn awful choice of name!!), so I'm not here criticising the technical merits and the work put into it. No, it's a simple 'why' that I have yet to get a decent answer to, even after talking to the OpenURL bigwigs about it. I mean, come on; convince me! I'm not unreasonable, no, truly, really; I just want to be convinced that we need this over anything else. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
On Fri, Apr 30, 2010 at 04:17, Jakob Voss jakob.v...@gbv.de wrote: But all the flaws of XML can be traced back to SGML which is why we now use JSON despite all of its limitations. Hmm, this is wrong on so many levels. First, SGML was pretty darn good for its *purpose*, but it was a geek's dream and pretty scary for anyone who hacked at it without fully getting it (like most normal developers). As with many things where the learning curve is steep, it fell into the not good for normal consumption category, and they (well, people who cared, and made decisions about the web) were forced to make XML. But JSON? Are you sure you've got this figured out? JSON as an object serialization format is good for a number of things (small footprint, embedded typing, etc.), but sucks for most information management tasks. However, I'd like to add here that I happen to love XML, even from an integration perspective, but maybe that stems from understanding all those tedious bits no one really cares about, like IDs and IDREFs (and all the indexing goodness that comes from them), canonical datasets, character sets and Unicode, all that schema craziness (including Schematron and RelaxNG), XPath and XQuery (and all the sub-standards), XSLT and so on. I love it all, and not because of the generic simplicity itself (simple in the default mode of operation, I might add), but because of a) modeling advantages, b) cross-environment language and schema support, and c) ease of creation. (I don't like how easily well-formedness breaks, though. That sucks) But I mention all this for a specific reason ; MARCXML is the work of the devil! There's a certain dedication needed to do it right, by paying attention in XML class and playing well with your playmates. This is how you build a community and understanding around standards; the standards themselves are not enough. 
The library world did nothing of the kind ; http://shelter.nu/blog/2008/09/marcxml-beast-of-burden.html The flaws of XML can most likely be traced back to people not playing well with their playmates, and not to the format itself. May brother Ted Nelson enlighten all of us - he not only hates XML [1] and similar formats but also proposed an alternative way to structure information even before the invention of hierarchical file systems and operating systems [2]. Bah. For someone who doesn't see the SGML - XML - HTML progression towards an inherited and more rigid structure (or, in popular language, a more schematic document model) as a good thing, I'm not impressed. Any implied structure can be criticized, including pretty much any corner of Xanadu as well. (I mean, seriously; taking hypermedia one step closer to a file system does *not* solve the problems of the web's paper-based document model, it just shifts the focus) In his vision of Xanadu every piece of published information had a unique ID that was reused every time the publication was referenced - which would solve our problem. *Having* an identifier doesn't mean that identifier is a *good* one, nor that it solves your problem. There's plenty of systems out there where everything has an identifier (and, if you dig deeper into XML, you'll find identification models in there as well, but people don't use them because the early onset of XML didn't understand nor need them). Have a look at the failed XLink brouhaha for something that worked and filled a niche, but people didn't get it, nor did tool-makers see the point of implementing it, and the thing died a premature death. The current model of document structure and XQuery is somewhat of an alternative, but people are also switching to CSS3 styles as well. 
The thing is, just because you've got persistence in a system of identifiers, it does not follow that the information is persisted; the problem of change is *not* solved in either system, and so we work with the one we've got and make the best of it. One thing I always found intriguing about librarians was their commitment to persistent URIs for information resources, and their use of 303s if need be (although I see this mindset dwindling). I think you're the only ones in the entire world who give a monkey's bottom about these issues, as the rest of the world simply uses Google as a resolver. I can see where this is going. :) Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
On Fri, Apr 30, 2010 at 10:54, Eric Hellman e...@hellman.net wrote: May I just add here that of all the things we've talked about in these threads, perhaps the only thing that will still be in use a hundred years from now will be Unicode. إن شاء الله May I remind you that we're still using MARC. Maybe you didn't mean in the library world ... *rimshot* Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Twitter annotations and library software
On Fri, Apr 30, 2010 at 18:47, Owen Stephens o...@ostephens.com wrote: Could you expand on how you think the problem that OpenURL tackles would have been better approached with existing mechanisms? As we all know, it's pretty much a spec for a way to template incoming and outgoing URLs, defining some functionality along the way. As such, URLs with basic URI templates and rewriting have been around for a long time. Even longer than that is just the basics of HTTP, which has status codes and functionality to do exactly the same. We've been doing link resolving since the mid 90's, either as CGI scripts or as Apache modules, so none of this was new. A URI comes in, you look it up in a database, you cross-check with other REQUEST parameters (or sessions, if you must, as well as IP addresses) and pop out a 303 (with some possible rewriting of the outgoing URL) (with the hack we needed at the time to also create dummy pages with META tags *shudder*). So the idea was to standardize on a way to do this, and it was a good idea as such. OpenURL *could* have had great potential if it actually defined something tangible, something concrete like a model of interaction or basic rules for fishing and catching tokens and the like, and as someone else mentioned, the 0.1 version was quite a good start. But by the time 1.0 came out, all the goodness had turned so generic and flexible, in such a complex way, that handling it turned you right off it. The standard was also written in very difficult language, and more specifically didn't use enough of the normal geeky language sysadmins are used to. The more I tried to wrap my head around it, the more I felt like just going back to CGI scripts that looked stuff up in a database. It was easier to hack legacy code, which, well, defeats the purpose, no? Also, forgive me if I've forgotten important details; I've suppressed this part of my life. 
:) Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
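[Editorial note: the mid-90s link resolver workflow described above (identifier in, database lookup, cross-check a request parameter, 303 out) fits in a dozen lines. This is a sketch; the knowledge-base contents, the 'who' parameter, and all URLs are invented for illustration.]

```python
# Toy knowledge base: identifier -> candidate targets.
KNOWLEDGE_BASE = {
    "doi:10.1000/example": {
        "default": "http://publisher.example/article/1",
        "campus":  "http://proxy.example/login?url=http://publisher.example/article/1",
    },
}

def resolve(request_params):
    """Return (status, headers) the way a CGI script or Apache module would."""
    targets = KNOWLEDGE_BASE.get(request_params.get("id"))
    if targets is None:
        return 404, {}
    # Cross-check other REQUEST parameters (here: a made-up 'who' flag
    # standing in for session / IP-range checks).
    key = "campus" if request_params.get("who") == "campus" else "default"
    return 303, {"Location": targets[key]}

status, headers = resolve({"id": "doi:10.1000/example", "who": "campus"})
assert status == 303 and "proxy.example" in headers["Location"]
```

Everything OpenURL standardised sits in the shape of `request_params` and the knowledge base; the redirect machinery itself was already plain HTTP.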
Re: [CODE4LIB] Twitter annotations and library software
On Fri, Apr 30, 2010 at 20:29, Owen Stephens o...@ostephens.com wrote: However I'd argue that actually OpenURL 'succeeded' because it did manage to get some level of acceptance (ignoring the question of whether it is v0.1 or v1.0) - the cost of developing 'link resolvers' would have been much higher if we'd been doing something different for each publisher/platform. In this sense (I'd argue) sometimes crappy standards are better than none. Well, perhaps. I see OpenURL as the natural progression from PURL, in which both have their degree of success, however I'm careful using that word as I live on the outside of the library world. It may well be a success on the inside. :) I think the point about Link Resolvers doing stuff that Apache and CGI scripts were already doing is a good one - and I've argued before that what we actually should do is separate some of this out (a bit like Jonathan did with Umlaut) into an application that can answer questions about location (what is generally called the KnowledgeBase in link resolvers) and the applications that deal with analysing the context and the redirection Yes, splitting it into smaller chunks is always smart, especially with complex issues. For example, in the Topic Maps world, the whole standard (reference model, data model, query language, constraint language, XML exchange language, various notational languages) is wrapped up with a guide in the middle. Make them into smaller parcels, and make your flexibility point there. If you pop it all into one, no one will read it and fully understand it. (And don't get me started on the WS-* set of standards on the same issues ...) (To introduce another tangent in a tangential thread, interestingly (I think!) I'm having a not dissimilar debate about Linked Data at the moment - there are many who argue that it is too complex and that as long as you have a nice RESTful interface you don't need to get bogged down in ontologies and RDF etc. 
I'm still struggling with this one - my instinct is that it will pay to standardise but so far I've not managed to convince even myself this is more than wishful thinking at the moment) Ah, now this is certainly up my alley. As you might have seen, I'm a Topic Maps guy, and we have in our model a distinction between three different kinds of identities; internal identifiers, external indicators, and published subject identifiers. The RDF world only had rdf:about, so when you used www.somewhere.org, were you talking about that thing, or does that thing represent something you're talking about? Tricky stuff, which has these days become a *huge* problem with Linked Data. And yes, they're trying to solve that by issuing a HTTP 303 status code as a means of declaring the identifiers imperative, which is a *lot* of resolving to do on any substantial set of data, and in my eyes a huge ugly hack. (And what if your Internet connection falls down? Tough.) Anyway, here's more on these identity problems ; http://www.ontopia.net/topicmaps/materials/identitycrisis.html As to the RESTful notions, they only take you as far as content-types can take you. Sure, you can glean semantics from them, but I reckon there's an impedance mismatch with just the thing librarians have got down pat ; metadata vs. data. CRUD or, in this example, GPPD (get/post/put/delete), which aren't a dichotomy btw, can only determine behavior that enables certain semantic paradigms, but cannot speak about more complex relationships or even modest models. (Very often models aren't actionable :) The funny thing is that after all these years of working with Topic Maps I find that these hard issues were solved years ago, and the rest of the world is slowly catching up. I blame the lame DAML+OIL background of RDF and OWL, to be honest; a model too simple to be elegantly advanced and too complex to be easily useful. 
Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
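[Editorial note: the identifier-vs-locator distinction in the message above (is http://sun.com/ the company, or the company's website?) can be sketched in a few lines. This loosely follows the Topic Maps split between subject identifiers and subject locators; the class layout and merge rule are a simplification of mine, not the actual Topic Maps data model.]

```python
from dataclasses import dataclass, field

@dataclass
class Topic:
    name: str
    subject_identifiers: set = field(default_factory=set)  # URI *stands for* the subject
    subject_locators: set = field(default_factory=set)     # URI *is* the subject

company = Topic("Sun Corp", subject_identifiers={"http://sun.com/"})
website = Topic("Sun's website", subject_locators={"http://sun.com/"})

def same_subject(a, b):
    """Merge rule: shared identifier or shared locator, never across kinds."""
    return bool(a.subject_identifiers & b.subject_identifiers or
                a.subject_locators & b.subject_locators)

# Same URI, different roles: a bare owl:sameAs would conflate these two,
# but keeping the roles apart prevents the "integrity rot" described above.
assert not same_subject(company, website)
```

With only one property (rdf:about), the two topics above would collapse into one; keeping two properties is the whole trick.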
Re: [CODE4LIB] OCLC Service Outage Update
On Tue, May 11, 2010 at 06:59, stuart yeates stuart.yea...@vuw.ac.nz wrote: No, the real problem is with trolls sending flamebait. Friggin' AMEEN! Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] OCLC Service Outage Update
Michael J. Giarlo leftw...@alumni.rutgers.edu wrote: ... people took Simon's comment seriously? Language is a funny thing ; sometimes the things that are said are taken seriously. And the script-haters are spread far and wide, so there was no reason not to take him seriously. Should the default be not to take anyone seriously? Srsly? Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Hiya, On Tue, Oct 26, 2010 at 6:26 AM, Nate Vack njv...@wisc.edu wrote: Switching to an XML format doesn't help with that at all. I'm willing to take it further and say that MARCXML was the worst thing the library world ever did. Some might argue it was a good first step, and that it was better with something rather than nothing, to which I respond ; Poppycock! MARCXML is nothing short of evil. Not only does it go against every principle of good XML anywhere (don't rely on whitespace, structure over code, namespace conventions, identity management, document control, separation of entities and properties, and on and on), it breaks the ontological commitment that a better treatment of the MARC data could bring, deterring people from a) using the darn thing as anything but a bare minimal crutch, and b) expanding it to be actually useful and interesting. The quicker the library world can get rid of this monstrosity, the better, although I doubt that will ever happen; it will hang around like a foul stench for as long as there is MARC in the world. A long time. A long sad time. A few extra notes; http://shelterit.blogspot.com/2008/09/marcxml-beast-of-burden.html Can you tell I'm not a fan? :) Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Ray Denenberg, Library of Congress r...@loc.gov wrote: It really is possible to make your point without being quite so obnoxious. Obnoxious? Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
On Tue, Oct 26, 2010 at 11:56 AM, Walker, David dwal...@calstate.edu wrote: Your criticisms of MARC-XML all seem to presume that MARC-XML is the goal, the end point in the process. But MARC-XML is really better seen as a utility, a middle step between binary MARC and the real goal, which is some other useful and interesting XML schema. How do you create an ontological commitment in a community to an expanding and useful set of tools and vocabularies? I think I need to remind people of what MARCXML is supposed to be ; a framework for working with MARC data in an XML environment. This framework is intended to be flexible and extensible to allow users to work with MARC data in ways specific to their needs. The framework itself includes many components such as schemas, stylesheets, and software tools. I'm not assuming MARCXML is a goal, no matter how we define that. I'm poo-pooing MARCXML for the semantics we, as a community, have been given by a process I suspect had goals very different from reality. Very few people would work with MARC through MARCXML; they would use it to convert it, filter it, hack around it into something else entirely. And I'm afraid lots of people are missing how you stunt development in a community by embracing tools that push a package that inhibits innovation. So, here's the point, paraphrased; Here's our new thing. And we did it by simply converting all our MARC into MARCXML that runs on a cron job every midnight, and a bit of horrendous XSLT that's impossible to maintain. But it looks just like the old thing using MARC and some templates? Ah yes, but now we're doing it in XML! (Yeah, yeah, your mileage will vary) I'm sorry if I'm overly pessimistic about the XML goodness in the world, not for the XML itself, but for the consequences of the named entities involved. 
I've been a die-hard XML wonk for far too many years, and the tools in that tool-chest don't automatically solve hard problems better by wrapping stuff up in angle brackets, and - dare I say it? - perhaps introduce a whole fleet of other problems rarely talked about when XML is the latest buzz-word, like using a document model on what's traditionally a records model, character encodings, whitespace issues, Unicode, size and efficiency (the other part of this thread), and so on. But let me also be a bit more specific about that hard semantic problem I'm talking about; lots of people around the library world infrastructure will think that since your data is now in XML it has taken some important step towards being inter-operable with the rest of the world, that library data is now part of the real world in *any* meaningful way, but this is simply, demonstrably, not true. Having our data in XML has killed a few good projects, where people have gone A new project to convert our MARC into useful XML? Aha! LoC has already solved that problem for us. Btw, to those who find me so obnoxious: at no point do I say it was intentionally evil, just evil all the same. The road to hell is, as always, paved with good intentions. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
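[Editorial note: to make concrete what "working with MARC data in an XML environment" tends to amount to in practice, here is a small sketch pulling a title string out of a MARCXML record. The record snippet is invented; the namespace (http://www.loc.gov/MARC21/slim) and the datafield/subfield element names are MARCXML's own.]

```python
import xml.etree.ElementTree as ET

# A made-up minimal MARCXML record with a 245 (title statement) field.
MARCXML = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title :</subfield>
    <subfield code="b">with a subtitle.</subfield>
  </datafield>
</record>"""

NS = {"m": "http://www.loc.gov/MARC21/slim"}

def title(record_xml):
    """Join the subfields of the 245 field into one display string."""
    root = ET.fromstring(record_xml)
    parts = root.findall('m:datafield[@tag="245"]/m:subfield', NS)
    return " ".join(sf.text for sf in parts)

assert title(MARCXML) == "An example title : with a subtitle."
```

Note that all the MARC semantics still live in opaque tag and code attribute values; the XML machinery only carries them, which is the thinness being complained about above.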
Re: [CODE4LIB] MARCXML - What is it for?
On Tue, Oct 26, 2010 at 12:48 PM, Bill Dueber b...@dueber.com wrote: Here, I think you're guilty of radically underestimating lots of people around the library world. No one thinks MARC is a good solution to our modern problems, and no one who actually knows what MARC is has trouble understanding MARC-XML as an XML serialization of the same old data -- certainly not anyone capable of meaningful contribution to work on an alternative. Slow down, Tex. Lots of people in the library world are not the same as developers, or even good developers, or even good XML developers, or even good XML developers who know what the document model imposes on a data-centric approach. The problem we're dealing with is *hard*. Mind-numbingly hard. This is no justification for not doing things better. (And I'd love to know what the hard bits are; it's always interesting to hear what various people think the *real* problems of library development are, as opposed to any other problems they have) The library world has several generations of infrastructure built around MARC (by which I mean AACR2), and devising data structures and standards that are a big enough improvement over MARC to warrant replacing all that infrastructure is an engineering and political nightmare. Political? For sure. Engineering? Not so much. This is just that whole blinded by MARC issue that keeps cropping up from time to time, and rightly so; it is truly a beast - at least the way we have come to know it through AACR2 and all its friends and their death-defying focus on all things bibliographic - that has paralyzed library innovation, probably to the point of making libraries almost irrelevant to the world. I'm happy to take potshots at the RDA stuff from the sidelines, but I never forget that I'm on the sidelines, and that the people active in the game are among the best and brightest we have to offer, working on a problem that invariably seems more intractable the deeper in you go. 
Well, that's a pretty scary sentence, for all sorts of reasons, but I think I shall not go there. If you think MARC-XML is some sort of an actual problem What, because you don't agree with me the problem doesn't exist? :) and that people just need to be shouted at to realize that and do something about it, then, well, I think you're just plain wrong. Fair enough, although you seem to be under the assumption that all of the stuff I'm saying is a figment of my imagination (I've been involved in several projects lambasted because managers think MARCXML is solving some imaginary problem; this is not bullshit, but pain and suffering from the battlefields of library development), that I'm not one of those developers (or one of you, although judging from this discussion it's clear that I am not), and that the things I say somehow don't apply because you don't agree with, umm, what I'm assuming is my somewhat direct approach to stating my heretic opinions. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Hi, On Tue, Oct 26, 2010 at 1:23 PM, Bill Dueber b...@dueber.com wrote: Sorry. That was rude, and uncalled for. I disagree that the problem is easily solved, even without the politics. There've been lots of attempts to try to come up with a sufficiently expressive toolset for dealing with biblio data, and we're still working on it. If you do think you've got some insight, I'm sure we're all ears, but try to frame it in terms of the existing work if you can (RDA, some of the dublin core stuff, etc.) so we have a frame of reference. Well, I've whined enough both here and on NGC4LIB, and I'm kinda over it, just like I'm sure most people are over my whining. But suffice it to say that FRBR is a 15-year-old model that has still not been proven in the Real World[TM] in any meaningful way (the prototypes work fine until you dig a bit) and probably never will as long as MARC21 runs the show, and trying to stick RDA on top with rules that have use-cases old enough to be my kids, well, I'm not very positive about that either. The direction of going ontological is a good one, and in the absence of anything else, RDF-infused FRBR / RDA is probably the way to go (except I'd ditch RDA and, uh, perhaps even FRBR, or at least seriously modify it), but the community is decidedly not talking about ontological interoperability nor extensions nor the semantics involved to solve actual problems in the bibliographic world (including the fact that it is inherently bibliographic). There needs to be much more involvement by library geeks and managers in defining semantic reuse and extensibility, to properly define those things that are almost absent from AACR2 and friends; the relationships between entities themselves. In other words, you need to get away from the record-centered view, and embrace the subject-centric view. Anyway, enough from this old grumpy bum. Sorry to stir up the dust. 
Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Political? For sure. Engineering? Not so much. Ok. Solve it. Let us know when you're done. Wow, lamest reply so far. Surely you could muster a tad bit better? I was excited about getting a list of the hardest problems, for example; I'd love to see that. Then perhaps you could explain what this insurmountably hard, mind-boggling problem actually is, because, you know, you never actually said. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote: Can that limit threshold be raised? If so, are there reasons why it should not be raised? Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for livening things up (especially given my reasonable follow-up was where the throttling began). Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Django
On Wed, Oct 27, 2010 at 3:09 AM, Elliot Hallmark permafact...@gmail.com wrote: However, I switched to this other scripting language, python, because it could do things php can't. Not to start a flame, but that's a rather big statement which I think A) needs backing up, and B) is probably untrue. For instance, my first project in python involved capturing keyboard input before windows heard about it. Then I kept discovering amazing things python can do that php can't. For instance, PHP can do this fine. Was there something in particular you're thinking of that PHP can't do? I helped write a non-sequential optical ray tracer in python. When it needed to be faster there were several libraries for writing C code directly in a pythonic syntax. Python has hooks into everything, like optical character recognition, electronic music sequencing/generation, serial port i/o. Again, PHP the same. For the sophisticated hacker, most languages can be tweaked to solve almost any problem. And I'm not even suggesting that you use PHP. Happy hacking. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 6:53 AM, Jonathan Rochkind rochk...@jhu.edu wrote: Pretty sure it wasn't depressing to the vast majority of the listserv audience. That was/is a discussion that benefited from a timeout period, like you give the pre-schoolers. Given we're adults, and not in pre-school, I disagree. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 6:58 AM, Chris Fitzpatrick cf...@stanford.edu wrote: +1 to the this discussion is really depressing me camp. Ok, ok, I get the message. This is no place to voice strong opinions about bad library tech, and neither my (different, but not bad) language nor my stance (contrarian, but not accusatory) is acceptable. I'm outta here. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
[CODE4LIB] PHP vs. Python [was: Re: Django]
Hola, compadre, Elliot Hallmark permafact...@gmail.com wrote: Other things beyond that seemed awkward, difficult, or impossible from what I knew. python immediately jumped out to me as a tool more suited to these tasks. The fact that Python has a looping run-time environment is, of course, a give-away to why most people think this, and perhaps to some degree, rightly so, but PHP has got the same, it's just that *most* people use PHP through some Apache module as a request/response module. Indeed, that's where it started, and that's its forte. From my experience, it seemed php was a server side scripting language. Strictly speaking, so is Python. Can you write a php script that gets key presses and doesn't pass them along to windows to process? I thought the OS would have to process the key press, pass it along to the php server and then php could process it. (pyhook) A couple of obvious candidates;
- http://gtk.php.net/
- http://winbinder.org/
Also, how would you go about using a GPU from a graphics card in php? (python cuda in google gives many results) PHP is just a C program with various bindings, so I suspect in the same way Python would do it. Whether anyone has done it, though, is a different question. Has anyone written a scientific computing package along the lines of matlab in php (scipy, numpy, matplotlib)? Or a non-sequential optical raytracer? Not seen any scientific packages, but I've seen a few ray-tracers, although they're all demo apps and fun toys (though I think that applies to Python, too). It's not so much about whether you can do it or not (you can), but whether it makes sense to do so (it mostly doesn't). Having said that, there's nothing stopping me from making a local run-time PHP program to do either, it's just that it's PHP and hence slower than C. Python, too, is slower than C, except when it runs some C module, which, uh, is C, the same as if PHP runs some C module. 
For example, among the fastest and best XSLT 1.0 processors and XML libraries out there are libxml2 and libxslt (from the Gnome project), written in C, and they are the de facto PHP XML and XSLT modules used. Whatever you've got that runs in C, you can run in PHP; it's not really a big deal, it just depends on whether it makes sense to patch it up with the way you use your PHP. if you wanted to write a web interface for GNU cash or another well established accounting program, could you do it? Sure. Here's someone who'dunnit back in 2008; http://web.archiveorange.com/archive/v/LJV4vT1u2IqE3LstFA1V please feel free to point me to the php equivalents of pyhook, pycuda, scipy, numpy and some examples of widely used programs with php bindings. You can bind PHP and Python the same; it's just a matter of doing it, and whether it makes sense to do so. It's *not* a question of /if/ you can do it, but if you /should/ do it. Your mileage *will* vary. For the sophisticated hacker, most languages can be tweaked to solve almost any problem. I am sure that is true. Though I feel for many tasks php would require quite a bit more tweaking than python, with much less community support behind it (I mean, google comes up with fewer helpful links to the problems I cited above). Maybe your Google-fu is weak. :) My impression, based on very little experience with php, is that if you asked in a forum about using php for advanced scientific computing, or writing music generation/sequencing software, knowledgeable folks would first ask: are you sure you want to do this in php? how about java or python? Again, probably because they don't realize it can be done in a non-request/response kinda way with PHP as well. But then, PHP itself isn't all that fast if you have little knowledge of how to do proper PHP, but this is a pitfall in any language. That said, php may be superior for generating websites from databases. 
Not really, but the installations you'll find in the wild are readily configured for it, so it's easy to get going. However, this has little to do with the language itself, and more to do with the default packaging of it. Anyway, I wasn't meaning to promote PHP over Python, just pointing out that PHP is a lot more (and more often still, a lot better) than what most people think it is. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
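The "PHP is just a C program with various bindings" point can be made concrete from the Python side with the standard library's ctypes module. This is a minimal sketch, assuming a Unix-like system where `ctypes.CDLL(None)` exposes the C library already loaded into the process; PHP reaches the same C code through compiled extensions instead, but the entry point is the same C symbol.

```python
import ctypes

# Load the symbols already linked into the running process
# (on a Unix-like system this includes libc).
libc = ctypes.CDLL(None)

# Declare the C signature so ctypes converts arguments correctly,
# then call C's abs() directly -- no wrapper extension needed.
libc.abs.restype = ctypes.c_int
libc.abs.argtypes = [ctypes.c_int]

result = libc.abs(-42)
```

Either language ends up at the same C entry point; the difference is packaging and defaults, not capability.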
Re: [CODE4LIB] PHP vs. Python [was: Re: Django]
Olá, como vai? Luciano Ramalho luci...@ramalho.org wrote: Actually, Python is a general purpose programming language. It was not created specifically for server side scripting like PHP was. But it is very suitable to that task. I'm not sure talking about what something used to be is as interesting as talking about what it is. Both Python and PHP can share whatever moniker we choose (scripting-language, programming language, real-time, half-time, bytecoded, virtual, etc.). Not seen any scientific packages, but I've seen a few ray-tracers, although they're all demo apps and fun toys (although I think that applies to Python, too). No, that does not apply to Python. Python is widely used for hardcore scientific computing. I was referring to the ray-tracing part. It is also the most important scripting language in large scale CGI settings Yes, Python is widely used for scripting up interfaces into other more complex systems. But rarely is the core of the thing written entirely in Python. Maybe your Google-foo is weak. :) Or maybe he's just realizing that outside of server side web scripting, PHP is just not so widely used. Absolutely, and fair enough. Having used both languages, I discovered that Python is easier for most tasks, and one reason is that the libraries that come with Python are extremely robust, well tested and consistent. Hmm. PHP is extremely robust and well-tested, but yes, it's not all that consistent, especially not before version 5.2+. However, things have moved on, and with release 6 around the corner things will be tighter still. Just like the first versions of Python were interesting, so were PHP's, but the biggest problem with the evolution of PHP was the very fact that it was the most popular language for rapid web development by far. PHP is very practical for server-side web scripting, but its libraries are unfortunately full of gotchas, traps and unexpected behaviour. There are gotchas in every language, even Python. 
A key reason for that is the fact that Python has always had an exception-handling mechanism while PHP has grown something like that only a few years ago True enough. But earlier versions of any language are less desirable than the latest versions, so I'm not sure this is a prevailing argument for the horribleness of PHP or any language. These things evolve. PHP 5.3+ and soon 6 are looking very good, indeed, but yes, we will just have to live with a poor reputation brought on by the big number of users and the pre-5.2 era. So, in my opinion, PHP is great at what it does best: enabling quick server-side Web scripting on almost any hosting service on Earth. I'm fairly sure you can say that because you haven't done much other kind of PHP work. :) For everything else, it is very worthwhile to learn and use a general purpose dynamic language such as Python, Ruby or Perl. Of course. Developers should learn many languages, and choose wisely the language best suited to the problem at hand. Sorry for the rant. I must confess I am a founder of the Brazilian Python Association and was its first president, so you can call me a Python advocate. No bias at all, really. :) Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] PHP vs. Python [was: Re: Django]
On Sat, Oct 30, 2010 at 7:49 AM, Bradley Allen bradley.p.al...@gmail.com wrote: Mark- I would highly recommend looking at Tornado (http://www.tornadoweb.org) as an alternative to using Django without the ORM. I'd second that one. I've used it for a couple of projects, and it seriously cut down on prerequisite clutter and is super fast. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Let's go somewhere [was PHP vs. Python...]
On Tue, Nov 2, 2010 at 5:03 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I would be very unlikely to use someone's homegrown library specific scripting language. However, if you want to make a library for an existing popular scripting language that handles your specific domain well, I'd be quite likely to use that if I had a problem with your domain and I was comfortable with the existing popular scripting language, i'd use it for sure. Hmm. The balance between the old and tried, and the new and experimental will, forever, cause these kinds of discussions. Now, I agree with the basic sentiment of what you're saying, but ... Odds are your domain is not really libraries (that's not really a software problem domain), but perhaps as Patrick suggests dealing with relationships among semantic objects, and then odds are libraries are not the only people interested in this problem domain. I've worked in the three basic tiers of the library development world; the plain vanilla programming world, the semantic web world, and the dark dungeons of the Cult of MARC. Is the domain of library IT solved by the generic technologies used? No. There's nothing bad about a DSL, in fact, I encourage it. If you want to get away from MARC, say, then having a DSL that approaches meta data on the programmatic level directly is a wonderful abstraction. But yes, we have to separate API from language. An API is, mostly these days, simply a function/method call on top of an abstraction, and it processes your request with your input. A language, on the other hand, will let you deal directly with that problem. Most DSLs are functional abstractions, pre-compiled. 
The line between a library and a language is perhaps more blurred these days than ever before, however there are certain things that I think justify a library DSL;

* focus on identity management
* mergability of entities
* large distributed sets
* a more defined line between data and meta data
* controlled vocabularies and structures

There are generic tools for all of these, however no platform binds them all together in a seamless way, elegant or otherwise. Perhaps such a thing would be beneficial to the library community: a language that tries to build a bridge between computer programming and what you learn in library school. But even if we all concede that a library DSL perhaps is not a practical solution, I'd still like to see us work on it, if for nothing more than sussing out our actual needs and wants in the process. Don't underestimate the process of doing something that will never eventuate, even knowingly. Some people like ruby because of its support for creating what they call domain specific languages, which I think is a silly phrase, which really just means a library API at the right level of abstraction for the tasks at hand, so you can accomplish the tasks at hand concisely and without repeated code. Depends on the language. Perhaps this doesn't make sense in Ruby, but it certainly does in Scala, Haskell, and perhaps more than any, Rebol. Even Lisp and its derivatives, which can create custom structures on the fly, are well suited to creating actual languages that redefine the language's original syntax and structure. You can redefine the hell out of C to create any language imaginable, too, even when you shouldn't. A well-defined API is not a bad thing, though, but an API is basically semantic entities in a language to parse structures. A language, however, redefines the syntax used by that language. 
Sure you can create a record object in an API that mimics, say, a MARC record, but the interesting part is when you redefine the syntax to work *with* that semantic concept, like;

    external_repository {
        baseURI: 'http://example.com/',
        type: OAI-PMH
    }

    my_repository {
        baseURI: 'http://example.com/',
        type: RIF-CS
    }

    some_vocabulary {
        baseURI: 'http://example.com/vocab',
        type: thesauri
    }

    foreach record in external_repository [without tag 850] {
        inject into my_repository {
            with: exploded words ( tag 245 )
            when: match words in some_vocabulary ( NT 2 )
            merge into: tag 850
        }
    }

Creating classes that deal with record merging based on identity management and various standards would be trivial to script together super-fast, because the underlying concepts are rather well-known to us. Hacking this together in Java or otherwise is a test of patience and sanity, because they are generic tools, even when known library-type APIs are used. Of course lots of stuff is assumed in the example, but these are well-understood assumptions (about merging subject headings (like multiple-tag handling, LCSH lookup, etc.), about identity control, about word lookup (for example, I'm assuming some form of stemming before matching), and on and on). A language that's half text manipulation and lookup, and
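To pin down the semantics of a sketch like the one above, here is a minimal, hypothetical Python rendering of the same merge rule. The record and vocabulary shapes are invented for illustration (dicts keyed by MARC-ish tags, a plain set of terms); real OAI-PMH harvesting, stemming, and LCSH lookup are all deliberately elided.

```python
def explode_words(text):
    """Split a field value into lowercase words (naive, no stemming)."""
    return [w.strip(".,;:").lower() for w in text.split()]

def merge_missing_850(external_records, vocabulary):
    """For records lacking tag 850, derive one from the tag-245 words
    that match the controlled vocabulary -- the 'inject into' rule."""
    merged = []
    for record in external_records:
        record = dict(record)  # never mutate the source repository
        if "850" not in record:
            hits = [w for w in explode_words(record.get("245", ""))
                    if w in vocabulary]
            if hits:
                record["850"] = " ".join(sorted(set(hits)))
        merged.append(record)
    return merged

records = [
    {"245": "Introduction to Topic Maps"},
    {"245": "Cataloguing rules", "850": "already-set"},
]
vocab = {"topic", "maps", "cataloguing"}
result = merge_missing_850(records, vocab)
```

The point stands either way: in a generic language this is plumbing you write and maintain; in a DSL the `foreach ... inject into ... merge into` line *is* the program.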
Re: [CODE4LIB] Seth Godin on The future of the library
Hi, On Thu, Jun 2, 2011 at 9:11 AM, Jonathan Rochkind rochk...@jhu.edu wrote: There are some unanswered questions about what the purpose of the catalog is or should be in our users research workflow, and it's not obvious to me whether that purpose will involve putting any possible book or article that exists for free on the internet in the catalog. I personally think that libraries in general still have some fundamental issues of just getting their head around the two-headed problem of free web resources. Not only are these free, but they don't physically exist. This has certain implications for libraries ; Free: as has been pointed out, sometimes this means not being peer reviewed, or not having the quality seal of a publisher, and as such there is no process for libraries to really understand how that knowledge fits into the rest of their collection. (I don't think it's a price issue; it's more a fundamental model issue) It's sometimes hard to wrap your head around the concept of anything free being of much *worth*, where in the past worth and often quality was measured in the name of publishers and the amount of peer-review or the reputation of the author. The Internet has *changed* this to the core; it's all gone or going, and new models are coming through the haze of confusion which I think the library world is both unprepared for and seriously underfunded to deal with. Links: The whole concept of web resources, of what a link (or a link to a mirror or cache) is all about confuses libraries who are deeply rooted in all things being physical. I know this is a doozy, but I still find this an issue when talking to librarians even today. The concept of virtual things in the library world really only exists with the notion of meta data, and I don't think the transition to the resource itself *also* being virtual has worked out well. 
Libraries *like* physical objects, they *like* shelves, they *like* their buildings, and I don't blame them; we are physical beings who love the smell of paper, however books are not actually important, buildings are not actually important, that smell is definitely not important : Ideas, knowledge and concepts are, and that's what we all try to pry from the books. (As an aside, if ideas and concepts were valued more, why couldn't LCSH morph into something far, far more important and useful? The mind boggles at the lost opportunities!) You cannot pry anything from a link except the possible resource at the other end, but it is a few traceroutes away in a virtual place, and in need of technological interpretation on arrival, and then comes the next level of trouble; These are just the conceptual problems. The next real problem of technology and the library world is - despite the hard and excellent work put in by people like us on this very list! - that they are still slow-pokes in the realm of using and developing technology. Most ILS are charmingly quaint in dealing with these things. OPACs are mostly dreadful. Backend infrastructure is never powerful or big enough for the growing digital stuff coming in. Systems always running a bunch of features away from being what we need, only getting by on a barely useful set of features (that far too often the vendors dictate) to do the minimum we have to do. Yes, yes, exceptions here and there, I would never deny that, but look at library land as a whole; you're lagging behind and you cannot really compete in a world that needs you to not only run, but win. And frankly, you *cannot* win, not on technology. There's just no way. Winning this one requires not technology as such, but paradigm shifts in thinking, both from inside and especially from the outside, coupled with proper resourcing by people who understand the value libraries truly bring to the world. And this latter thing is becoming a real problem, I think. 
One reason that libraries may not prioritize putting free ebooks in the catalog is because there are other places users can search for free ebooks on the internet -- but there aren't other places users can search for non-free ebooks that they know will be licensed to them as library patrons, or for that matter to search for physical things on the shelves that they know are available from their library. Seems like an odd argument to me. Why are we talking about the price and the format of the information rather than the *quality* of it? I thought a curated collection was the bee's knees, regardless of what formats are used. Hmm. Maybe I'm thinking more like a knowledge customer than a librarian these days, and I've lost my touch or my way. :) Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] iPads as Kiosks
Just my two bobs ; We're going through various stages of testing out tablets for both kiosks *and* portable workstations (for nurses and staff), and have tried out iPads and various Androids, and our current favorite is actually the Asus Eee Pad Transformer, a vanilla (but good quality) Honeycomb Android during the day, but with a snap-on keyboard with extra ports and batteries for some netbook action at night, so it satisfies both our criteria. As with all things, it also depends on what software you want to run. If you go with iPad you need to go through Apple's various restrictions, while on Android you can use whatever you want. For a you are here tablet a cheap $150 Android seems like a good option, too. Regards, Alex On Wed, Aug 24, 2011 at 11:51 PM, Madrigal, Juan A j.madrig...@miami.edu wrote: That's the equivalent to $25/month and includes support for your whole development team/institution. If your employer can't afford that then I suggest you look for a new job! ;) Juan Madrigal Web Developer Web and Emerging Technologies University of Miami Richter Library On 8/23/11 2:21 PM, Dan Funk daniel.h.f...@gmail.com wrote: Wow, just $300/year and you can run your own software on your own hardware? What a deal. On Tue, Aug 23, 2011 at 2:13 PM, David Uspal david.us...@villanova.edu wrote: Thanks for the update. This definitely solves that issue -- its unfortunate this wasn't in place in 2009, or I'd be into year two of a five year contract... David K. Uspal Technology Development Specialist Falvey Memorial Library Phone: 610-519-8954 Email: david.us...@villanova.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andrew Hankinson Sent: Tuesday, August 23, 2011 2:00 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] iPads as Kiosks You can distribute apps via an internal web server, with no need to go out to Apple. 
http://developer.apple.com/library/ios/#featuredarticles/FA_Wireless_Enterprise_App_Distribution/Introduction/Introduction.html You need to be a registered business to do this, and it costs $299/yr. You get a digital certificate, but that doesn't mean your code needs to be seen by anyone outside of your org. On 2011-08-23, at 1:47 PM, David Uspal wrote: When I did my iPhone work, it was back in 2009 before this document even existed, so it's good they've come some distance on this issue since then. Still, the document below doesn't break the dependency on the iTunes store and/or a digital certificate issued by Apple to download applications (if I'm reading page 63 right), which was the big sticking point of the contract. Not only did the user not want the network controlled by Apple (which this document does handle), they also didn't want the code seen by any outside source at all (aka via uploading it to the store) David K. Uspal Technology Development Specialist Falvey Memorial Library Phone: 610-519-8954 Email: david.us...@villanova.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andrew Hankinson Sent: Tuesday, August 23, 2011 1:34 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] iPads as Kiosks They now have an enterprise app deployment mechanism. http://www.apple.com/support/iphone/enterprise/ On 2011-08-23, at 12:54 PM, David Uspal wrote: Then again, by selecting the iPad you're essentially tethered to Apple's iron grip of the iWorld via its iTunes vetting process and strict control of Apple hardware. YMMV on this depending on what you're doing, but it should definitely be a consideration when choosing between Android tablets and the iPad. Quick side story -- we had to drop a contract one time at my old job due to the customer proprietary requirements. 
The customer didn't want to release its developed software outside of house (minus the developers of course) and Apple wouldn't give them a waiver from using the iTunes store. Mind you, this was a very big company with resources, so Apple probably lost a 5000 unit sale due to this David K. Uspal Technology Development Specialist Falvey Memorial Library Phone: 610-519-8954 Email: david.us...@villanova.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Stephen X. Flynn Sent: Tuesday, August 23, 2011 9:01 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] iPads as Kiosks Let's not forget a far superior user experience. Stephen X. Flynn Emerging Technologies Librarian Andrews Library, College of Wooster 1140 Beall Ave. Wooster, OH 44691 (330) 263-2154 http://www.sxflynn.net On Aug 22, 2011, at 12:56 PM, Madrigal, Juan A wrote: I would definitely go with the iPad. More accessories, better support and consistency. Juan Madrigal Web Developer Web and Emerging Technologies University of Miami Richter Library On 8/22/11 11:19 AM, Dan
Re: [CODE4LIB] Ontology Question
Hiya, Is it okay to just use the classes I need or should I include the super classes which they belong to? I think we also need to define a few concepts here. What do you mean, include? As far as I can tell, you want to say something like Here's a few concepts we're using, and their definition is based off this other ontology over *there* (pointing), but that's not always the case, so just asking. Now, Karen is of course right in her take on it, but there's a little thing that requires a bit of focus, and that's how this new ontology is going to be used. Is it one of these manual labour things where it doesn't actually require formal definitions as much as human ones, or is it (however you use the ontology) to be passed through a tool, or more formally passed through an inferencer? Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
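For what it's worth, the two readings of include look roughly like this in OWL/Turtle. This is a sketch only, and every URI in it is a hypothetical placeholder:

```turtle
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ex:   <http://example.com/my-ontology#> .
@prefix them: <http://example.org/their-ontology#> .

# Option A: formally import the whole external ontology;
# a reasoner will pull in all of its classes and axioms.
<http://example.com/my-ontology>
    a owl:Ontology ;
    owl:imports <http://example.org/their-ontology> .

# Option B: just point at the classes you actually use,
# without committing to the rest of their hierarchy.
ex:Report a owl:Class ;
    rdfs:subClassOf them:Document .
```

Which option is right depends on exactly the question above: a human-readable vocabulary can get away with option B, while anything fed through an inferencer usually wants the full owl:imports closure.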
Re: [CODE4LIB] Professional development advice?
I could give you tons of advice, most of it specific to some technological domain or another, but over the years I've more or less settled on one thing that beats out all the others ; Data models. Once you grok data models, what they are, how they work, and all the extended family (schemas, ontologies, persistent identification, querying, de-duplication, layered models, LUT/transcripts, stored procedures [and why they are evil], RDBMS vs. NoSQL vs. whatever, and so on), everything else is miscellaneous. The way we humans use computers as tools is all rooted in a data model at the bottom of some program or database, and the rest of the time is spent interacting with the data model, trying to make it do the things we need it to do, and so on. Everything is about and around that data model, so getting it right is a lot more important than any amount of beautiful coding against it. So, that's my big tip; all that technology we muck about with is really trying to work well with a data model. Your task should rather be to understand the why, who, how, when and the thenceforth of data models, and everything else will follow. Now, this tip could under normal circumstances be applied to any part of the IT industry, but it especially makes sense in the library world. Most of the time is spent converting data between data models (whatever MARC whatever), or making sense of the one (MARC21/FRBR) or other (AACR2/RDA and that third one I can never remember the name of, those extension rules to AACR2?) or three (LCSH/DDC). We're all battling against the original thought and implementation of data models, and very often you'll find better technological solutions when you understand the underlying human efforts of ... 
data modeling (and by extension, you might discover my pet peeve, how all bad software and systems in the world come from bad data modeling, and *not* from bad programming [even if there's plenty of that, too]) Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
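As a contrived illustration of that pet peeve (every name and shape here is made up): the same bibliographic fact modelled record-centrically versus entity-centrically. Only the second model makes identity, sharing, and de-duplication cheap, and no amount of beautiful code on top of the first one fixes that.

```python
# Record-centric: one flat blob per item, strings all the way down.
# De-duplicating authors here means fuzzy string matching across records.
record = {"title": "Dune", "author": "Herbert, Frank", "subject": "Ecology"}

# Entity-centric: authors and subjects are first-class, identifiable
# entities that many works can point at and share.
authors  = {"p1": {"name": "Herbert, Frank"}}
subjects = {"s1": {"label": "Ecology"}}
works    = {"w1": {"title": "Dune", "author": "p1", "subject": "s1"}}

def rename_author(person_id, new_name):
    """One update fixes every work that points at this person --
    a consequence of the model, not of clever code."""
    authors[person_id]["name"] = new_name

rename_author("p1", "Frank Herbert")
```

The code in both halves is trivial; the difference in what each model can support cheaply is the whole point.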
Re: [CODE4LIB] Professional development advice?
Hiya, On Tue, Nov 29, 2011 at 10:06 AM, Nate Vack njv...@wisc.edu wrote: A more productive task is to understand the who, how, when, and thenceforth of what tasks actual people want to accomplish with their computers Understanding this is not disconnected from designing data models *right*. It's the same thing. By extension I should mention that people are terrible at telling you what they want or need, but they're good at telling you what they hate. If nothing else, I'd suggest tapping into that wonderful hate. But an 'all flows from data modeling' thought process leads to FOAF, FOAF leads to hate, and hate leads to suffering. This sounds suspiciously like someone who doesn't understand the perils of data models and how they affect all the FOAF and hate that's built up around its faults. FOAF and suffering is a symptom of shitty data models, not shitty code. Unless you've got a little more meat on that argument? :) Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Professional development advice?
Kyle Banerjee wrote: Starting with data modeling is like trying to learn a new spoken language by focusing on grammar [...] Hmm. It seems that a lot of people are, shall we say, somewhat misguided about what data modelling is, even mighty Wikipedia, which makes it into a formal process of sorts, and I can see it repeated ad nauseam wherever you go, giving us the idea that it is all about the schema of columns and the nuts and bolts of tables and relations in an RDBMS. That's confusing data modelling tools or processes with the generic, open-ended category of data modelling. Data modelling is simply the act of exploring data-oriented structures. Over time I've learned that everything we do, every little problem you battle with in your daily life, revolves around some data structure, the names of such, and their internal and external relationships. The simplest web form has a model, simple and complex applications do as well, enterprise systems, library systems, formats, databases, documents, spreadsheets, this conversation, your bicycle, your morning routine, *everything*. There is, in my strong opinion, a horrible conflation of the concept of modelling data and implementations pinning down data types; it's an evil so strong it blinds us, cripples us, and I feel like screaming out in terrifying agony the horrors within! The wrongly applied indices! The labels on columns! The semantic binding of one sub-structure to another! The optimising tricks used! The stored procedures! The conceptual semantics of labels in n-ary graphs!! *aaarghhh!!* The wretched *name* of a single field and how it quietly eats up any disambiguation we put in place, through the many well-meaning but afflicting layers of abstraction and implementation, it drives me insane! Name!? What does that mean in the context of an email address? What does comment mean when it reaches my ORB? What were they thinking when the model they designed resulted in SQL statements 1K long? 
There's so much information written on the topic of data modelling, and most of it ignores the very thing it should embrace and focus heavily on: good semantic design. (Granted, this has become far more of a focus in the last 10 years, and I'm extremely happy for that) Put some heavy thought into your tables, because what you perceive as a simple table of users becomes an overwhelming problem when you add special users to the system. Have any of you ever created an ILS with a table book in it? (C'mon, raise your hand, I know you have!) Yeah, that's the sort of evil I'm talking about! Libraries don't deal with books, they deal with bibliographic meta data of objects, and sometimes those objects are called a book, which has certain constraints and properties that link to special meta data that isn't static. Version 1.0 of any system is famously rubbish because of the learning process of getting all this stuff wrong. Version 2.0 is famous for being overly abstracted and incomprehensible. Version 3.0 is getting there, but you're bogged down in the middleware, translating between good but incompatible models. By the time you get to version 4.0 you realize that the underlying concepts which drove versions 1 through 3 are flawed, and you need to work in terms of FRBR sub-graphs instead of MARC records. Version 5.0 is so re-written and re-conceptualized, you decide to call it something else: version 1.0. And we repeat the cycle. If your software isn't like this, consider yourself lucky (or at worst, self-deluded :). Data modeling is extremely useful, but mistaking drips and drabs of it early on for reality can poison your thinking. Sorry, you got that back to front. We all agree that understanding what users want and / or need is king, but unless you've got that understanding of not only what the users want but how systems can deliver this without creating constraints that will screw things up when you extend that original delivery idea, you're going to suffer. Badly. 
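The "table called book" trap can be sketched in a few lines. The class names and fields below are my own illustration, not any real ILS schema:

```python
from dataclasses import dataclass, field

@dataclass
class Book:  # version 1.0 thinking: book-ness baked into the model
    title: str
    isbn: str  # breaks as soon as you acquire a map or a DVD

@dataclass
class Item:  # a later, more general model: "book" is just one carrier type
    carrier_type: str                             # "book", "map", "dvd", ...
    metadata: dict = field(default_factory=dict)  # identifiers live here

book = Item("book", {"title": "Dune", "isbn": "9780441172719"})
dvd = Item("dvd", {"title": "Dune", "ean": "0000000000000"})  # fake EAN; no ISBN needed
print(book.carrier_type, dvd.metadata["title"])
```

The second shape is roughly where version 2.0 systems land, and (true to the cycle described above) it trades the version 1.0 rigidity for an open `metadata` dict that the application now has to police itself.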
It's easy; take great care with what you call things in your system (no matter whether it's in the database, your objects / classes / instances / interfaces, user interface, buttons, messages, windows, data types, loops ...); they're all data models that need to be as cooperative as possible, speaking the *same language*, to be compatible in the meaning they give the concepts used. If your Wheels API has different semantics from your Steering API, making that car is going to be a really crappy experience, for you as a developer, for testers, for maintenance guys, for service people, and most of all don't think for a second that the driver won't notice. These semantics are far more important than our industry has traditionally treated them, and in my opinion that neglect is our biggest flaw. Trust me, I've stared at data models up and down so many systems over the years (10 of them in a high-flying big consultancy where we came in when projects otherwise failed) it's amazing I'm still
Re: [CODE4LIB] Models of MARC in RDF
On Wed, Dec 7, 2011 at 1:49 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote: As much as I have nothing against anyone on this list, isn't it a little US-centric? Didn't we make that mistake before? I wouldn't worry. A dream-team has no basis in reality, hence the dream part. I'd like to see a Real Team instead, an international collaboration of people, including international smarts and non-librarians. (Realistically, an international [or semi] library conference should have a three-day session with smart people first on this very issue, and that would make a fine place to get this thing working, even to some degree of speed) Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
Richard Wallis richard.wal...@talis.com wrote: You are not the only one who is looking for a better term for what is being created - maybe we should hold a competition to come up with one. A named graph gets thrown around a lot, and even though this is technically correct, it's neither nice nor sexy. In my past a bucket was much used, as you can easily throw things in or take them out (as opposed to the more terminal record being set), however people have a problem with the conceptual size of said bucket, which more or less summarizes why this term is so hard to pin down. I have, however, seen some revert to the old RDBMS world of rows, as they talk about properties on the same line, just thinking the line to be more flexible than it used to be, but we'll see if it sticks around. Personally I think the problem is that people *like* the idea of a closed little silo that is perfectly contained, no matter if that is technically true or not, and therefore futile. This is also why, I think, it's been so hard to explain to more traditional developers the amazing advantages you get through true semantic modelling; people find it hard to let go of a pattern that has served them so well in the past. Breaking the meta data out of the wonderful constraints of a MARC record? FRBR/RDA will never fly, at least not until they all realize that the constraints are real and that they truly and utterly constrain not just the meta data but the future field of librarying ... :) Regards, Alex
Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
Richard Wallis richard.wal...@talis.com wrote: Collection of triples? Yes, no baggage there ... :) Some of us are doing this completely without a single triplet, so I'm not sure it is accurate or even politically correct. *hehe* A classic example of only being able to describe/understand the future in the terms of your past experience. Yes, exactly. Although, having said that, I'm excited that the library world is finally taking the semantic challenge seriously. It's taken quite a number of years, but slowly there's a few drips and drabs happening. Here's to hoping that there's a sluice somewhere about to open fully, and maybe the RDA vehicle has proper wheels? (It didn't the last time I checked, but that's admittedly a couple of years back. I hear they at least got new suspension?) Regards, Alex
[CODE4LIB] Open datasets
Hiya, I'm in the middle of creating a meta data management system (including merging and persistent identifier management) for a somewhat different domain (intranets and business integration), but it's based on Topic Maps and so is well suited to other means of meta data handling / mangling. It's also going to be open-source, and it might be well-suited to library tasks as well. So in order to test the integrity and performance of my system so far I'm wondering if there's a suitable open dataset of bibliographic records that aren't too obscure (meaning, I can find the titles at amazon or Open Library) that you could recommend? More than 1000 records, but less than a million, maybe? Regards, Alex
Re: [CODE4LIB] Open datasets
Hiya, Thanks for all the pointers; just what I wanted, and it gives me plenty of ways to test the generic meta data handling. Great! Regards, Alex On Jan 12, 2012 3:19 AM, Simon Spero s...@unc.edu wrote: You can get anything you want At Brewster Kahle's restaurant. http://openlibrary.org/data#bulk_download Simon On Wed, Jan 11, 2012 at 10:55 AM, LeVan,Ralph le...@oclc.org wrote: http://staff.oclc.org/~levan/PearsTraining/scifi.usmarc has 10,000 marc records in it. They are part of the old SiteSearch system that OCLC released as open source. They date back to 2002 and will not contain any Unicode, if you were hoping to include that as part of your testing. Ralph
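As a stop-gap before pulling a real dataset like the ones pointed to above, a synthetic batch with deliberate near-duplicates can exercise merging and persistent-identifier handling. Everything here (record shape, field names) is made up for illustration:

```python
import random

def fake_records(n, seed=42):
    """Generate n bibliographic-ish records, with deliberate dirty variants
    so that de-duplication and merge code has something to chew on."""
    random.seed(seed)  # fixed seed keeps runs reproducible
    titles = ["Dune", "Foundation", "Neuromancer", "Hyperion"]
    records = []
    for i in range(n):
        title = random.choice(titles)
        records.append({
            "id": f"rec-{i:05d}",                        # stable identifier
            "title": title if i % 3 else title.upper(),  # every 3rd is a dirty variant
            "year": random.randint(1950, 2000),
        })
    return records

batch = fake_records(1000)
print(len(batch), batch[0]["id"])
```

The fixed seed matters: integrity and performance tests are only comparable between runs if the input is identical each time.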
Re: [CODE4LIB] Project Management Software Question
Hiya, --What project management software are you using? Semantic MediaWiki, xSiteable --What made you choose the system? Most project management software is written by geeks, not for humans. They all propose some methodology to go with their model, but either their model is inflexible (and clashing with yours), or it is so flexible that any tool might do the trick. Also, they are notoriously hard to configure on a cumulative scale across the people involved. Also, people hate putting in their data, so most software, even if it might just do the trick, fails for human reasons. So, a simple wiki with some added ontology cruft, with xSiteable delivering semantics and widgets across all people, is enough. Simple todos beat complex task management every time. --Has the system met all of your needs? If not, where does it fail? It only fails when we need an average to higher degree of data; again, a human problem. Oh, and it sometimes fails because the MediaWiki GUI sucks for non-geeks. I think Confluence is better and overall pretty good. --Overall opinions? I could write you a sonnet or two, but I have very little trust in software helping much in project management (after having tried them all over a span of 20 years). A joint platform for documentation helps (and for heaven's sake, choose a wiki that has a usable interface!). In fact, you'd be *far* better off getting Making Things Happen by Scott Berkun ( http://www.amazon.com/dp/0596517718?tag=scottberkunco-20camp=14573creative=327641linkCode=as1creativeASIN=0596517718adid=1B6JF6HWHDT0S5RYZNNM), the best book I ever got. Honest, I'm not affiliated. :) --What systems did you evaluate and decide not to recommend? Hmm, I think I've tried too many. I'm sure there's software out there that doesn't suck (ie. I hear good things about a few here and there), but far too often do I see this usability problem, paired with the human engagement problem, crop up and ruin the best of software packages. Any information would be great! Sorry to be so glum. 
I'm more happy with simpler approaches such as project on a page (ie. one wiki page with short description, people, contacts, goals, and progress) and more agile ways of dealing with requirements and development (reduces the need for approved paper, easier to roll back bad decisions, etc.). The closest I get to a Gantt chart is that one of our vendors insists on sending me one every now and then, despite the fact that he has to come into the office and explain it to people every single time. In other words; use software to document and drive forward, never use software to measure progress and estimates. Regards, Alex (disgruntled ex-believer in project management software)
Re: [CODE4LIB] Seeking examples of outstanding discovery layers
I love Trove from the National Library of Australia: http://trove.nla.gov.au/ Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] rdf serialization
Hiya, On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle li...@kcoyle.net wrote: Eric, I really don't see how RDF or linked data is any more difficult to grasp than a database design Well, there's at least one thing which makes people tilt; the flexible structures for semantics (or, ontologies), where things aren't as solid as in a data model. A framework where there are endless options (on the surface of it) for relationships between things is daunting to people who come from a world where the options are cast in iron. There's also a shift away from things' identities being tied down in a model somewhere into a world where identities are a bit more, hmm, flexible? And less rigid? That can make some people cringe, as well. A master chef understands the chemistry of his famous dessert - the rest of us just eat and enjoy. Hmm. Some of us will try to make that dessert again, for sure. :) Alex
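The "endless options for relationships" point can be shown in a few lines: a graph of statements accepts a predicate nobody planned for, with no schema migration, unlike a table with fixed columns. All names here are illustrative:

```python
# A bare-bones statement store: subject/predicate/object tuples in a set.
statements = set()

def assert_fact(subject, predicate, obj):
    statements.add((subject, predicate, obj))

assert_fact("dune", "title", "Dune")
assert_fact("dune", "author", "herbert")
# A relationship nobody planned for; no ALTER TABLE required:
assert_fact("dune", "inspiredTheSoundtrackOf", "some-album")

def about(subject):
    """Everything asserted about a subject, as (predicate, object) pairs."""
    return {(p, o) for s, p, o in statements if s == subject}

print(len(about("dune")))
```

This flexibility is exactly what makes people from the fixed-schema world cringe: nothing stops the third assertion, and nothing validates it either, which is where ontologies come in.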
Re: [CODE4LIB] rdf serialization
Ross Singer rossfsin...@gmail.com wrote: This is definitely where RDF outclasses almost every alternative*, because each serialization (besides RDF/XML) works extremely well for specific purposes [...] Hmm. That depends on what you mean by alternative to RDF serialisation. I can think of a few, amongst them obviously (for me) Topic Maps, which doesn't go down the evil triplet way with conversion back and to an underlying data model. Having said that, there's tuples of many kinds, it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies even further, for example, even though that may or may not be a better way of dealing with it. In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be. It's hard to express what we perceive as knowledge in any notational form. The models and languages we propose are far inferior to what is needed for a world as complex as it is. But as you quoted George Box, some models are more useful than others. My personal experience is that I've got a hatred for RDF and triplets for many of the same reasons Eric touches on, and as many know, I prefer the more direct meta model of Topic Maps. However, these two different serialisation and meta model frameworks are - lo and behold! - compatible; there's canonical lossless conversion between the two. So the argument at this point comes down to personal taste for what makes more sense to you. As to more on the problems of RDF, read this excellent (but slightly dated) Bray article: http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet But wait, there's more! We haven't touched upon the next layer of the cake; OWL, which is, more or less, an ontology for dealing with all things knowledge and web. And it kinda puzzles me that it is not more often mentioned (or used) in the systems we make. 
A lot of OWL was tailored towards being a better language for expressing knowledge (which in itself comes from the DAML and OIL ontologies), and then there's RDFS, and OWL in various formats, and then ... Complexity. The problem, as far as I see it, is that there's not enough expression and rigor for the things we want to talk about in RDF, but we don't want to complicate things with OWL or RDFS either. And then there's that tedious distinction between a web resource and something that represents the thing in reality, which RDF skipped (and hacked a 303 solution to). It's all a bit messy. * Unless you're writing a parser, then having a kajillion serializations seriously sucks. Some of us do. And yes, it sucks. I wonder about non-political solutions ever being possible again ... Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps http://shelter.nu/blog | google.com/+AlexanderJohannesen | http://xsiteable.org http://www.linkedin.com/in/shelterit
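The "variation over frames theory (or bundles)" observation up-thread can be sketched as folding a pile of tuples back into one key/value bundle per subject, which is usually what consuming code wants regardless of serialisation. The data is invented for illustration:

```python
from collections import defaultdict

# Triples as they might arrive from any serialisation, already parsed.
triples = [
    ("dune", "title", "Dune"),
    ("dune", "author", "herbert"),
    ("herbert", "name", "Frank Herbert"),
]

def to_frames(triples):
    """Fold (subject, predicate, object) tuples into per-subject frames."""
    frames = defaultdict(dict)
    for s, p, o in triples:
        frames[s][p] = o  # naive: last value per predicate wins
    return dict(frames)

frames = to_frames(triples)
print(frames["dune"]["title"], frames["herbert"]["name"])
```

The naive "last value wins" line is where the real pain hides: multi-valued predicates, language tags, and provenance are exactly the things that push people from triples to quads and beyond.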
Re: [CODE4LIB] rdf serialization
Hi, Robert Sanderson azarot...@gmail.com wrote: c) I've never used a Topic Maps application. (and see (a)) How do you know? There /are/ challenges with RDF [...] But for the vast majority of cases, the problems are solved (JSON-LD) or no one cares any more (httpRange14). What are you trying to say here? That httpRange14 somehow solves some issue, and we no longer need to worry about it? Having said that, there's tuples of many kinds, it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies even further, for example, even though that ad hominem? really? Your argument ceased to be valid right about here. I think you're a touch sensitive, mate. Crazies as in, few and knowledgeable (most RDF users these days don't know what tuples are, and how they fit into the representation of data) but not mainstream. I'm one of those crazies. It was meant in jest. may or may not be a better way of dealing with it. In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be. Except that RDF follows the web architecture through the use of URIs for everything. That is not to be under-estimated in terms of scalability and long term usage. So does Topic Maps. Not sure I get your point? This is just semantics of the key denominator in tuple serialisation; there's nothing revolutionary about that, it's just an ontological commitment used by systems. URIs don't give you some magic advantage; they're still a string of characters as far as representation is concerned, and I dare say this points out the flaw in httpRange14 right there; in order to know the representation you need to resolve the identifier, ie. there's a movable, dynamic part to what in most cases needs to be static. 
Not saying I have the answer, mind you, but there are some fundamental problems with knowledge representation in RDF that a lot of people don't care about, which I do feel people of a library bent should care about. But wait, there's more! [big snip] Your point? You don't like an ontology? #DDTT My point was the very first word of the following paragraph; Complexity. And of course I like ontologies. I've bandied them around these parts for the last 10 years or so, and I'm very happy with the RDA/FRBR directions of late, taking at least RDF/Linked Data seriously. I'm thus not convinced you understood what I wrote, and if nothing else, my bad. I'll try again. That's no more a problem of RDF than any other system. Yes, it is. RDF is promoted as a solution to a big problem of findable and shareable meta data, however until you understand and use the full RDF cake you're scratching the surface and doing things sloppily (and I'd argue, badly). The whole idea of strict ontologies is rigor, consistency and better means of normalising the meta data so we all can use it to represent the same things we're talking about. But the question to every piece of meta data is *authority*, which is the part of RDF that sucks. Currently it's all balanced on Wikipedia and DBpedia, which isn't a bad thing in itself, but neither of those two is static or authoritative in the same way that, say, a global library organisation might be. With RDF, people are slowly being trained to accept all manner of crap meta data, and we as librarians should not be so eager to accept that. We can say what we like about the current library tools and models (and, of course, we do; they're not perfect), but there's a whole missing chunk of what makes RDF 'work' that is, well, sub-par for *knowledge representation*. And that's our game, no? 
The shorter version; the RDF cake with its myriad layers and standards is too complex for most people to get right, so Linked Data comes along and tries to be simpler, making the long-term goal harder to achieve. I'm not, however, *against* RDF. But I am for pointing out that RDF is neither easy to work with, nor ideal for any long-term goals we might have in knowledge representation. RDF could have been made a lot better (there are better solutions upstream), but most of this RDF talk is stuck in 1.0 territory, suffering the sins of former versions. And then there's that tedious distinction between a web resource and something that represents the thing in reality that RDF skipped (and hacked a 303 solution to). It's all a bit messy. That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303* solution. You can use URIs to identify anything. I think my point was that since representation is so important to any goal you have for RDF (and the rest of the stack), it was a mistake to not get it right *first*. OWL has better means of dealing with it, but then, complexity, yadda, yadda. http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14 And it's not messy, it's very clean. Subjective, of course. Have you ever played with an
Re: [CODE4LIB] Protagonists
Hmm. So, I'm a big fan of Wikipedia and would still go that way even if the data can be haphazard. Wikipedia has a lot of classics with a section called Lead characters (Pride and Prejudice included) where the focus is the novel first, which should be easy to pull and then trim with some simple text parsing to get basic characterizations, like gender, possibly age, place and purpose to the story (main protagonist, antagonist, support character, etc.) I'd start with a page like Le Monde's 100 Books of the Century (http://en.wikipedia.org/wiki/Le_Monde%27s_100_Books_of_the_Century) and give each of them a visit, scraping for Main characters or Characters headings, and devise a small set of parsing rules to grab the top ones and their properties. Sounds like a fun day or so. Cheers, Alex On Tue, Apr 14, 2015 at 3:35 PM, McAulay, Elizabeth emcau...@library.ucla.edu wrote: Cool set of questions! Here's a funny cheat -- what about querying Amazon or the like for a list of Cliff's Notes and calling the subjects of the Cliff's Notes the Canon? That could serve as the canon list. Another idea would be to consult a reference work, but I can't think of a good source offhand. One example that's not perfect is the Dictionary of Literary Biography. The Canon is created by what is included in the reference work. As for finding lead character names, that's something I don't have an immediate answer for. Good luck! 
Best, Lisa - Elizabeth Lisa McAulay Librarian for Digital Collection Development UCLA Digital Library Program http://digital.library.ucla.edu/ email: emcaulay [at] library.ucla.edu From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of davesgonechina davesgonech...@gmail.com Sent: Monday, April 13, 2015 7:12 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Protagonists So I have this idea I'd like to do for a hobby project, but it requires finding a table that lists a classic novel, a Gutenberg.org link to an instance of that work (first listed, one with most downloads, whichever), the lead female character, and the lead male character (can be null). E.g. Pride and Prejudice, http://www.gutenberg.org/ebooks/42671, Elizabeth Bennet, Mr. Darcy. Even leaving the Gutenberg part for another day, this has been really difficult to find. I've had no success with Dbpedia/Wikidata since there's no real standardized format for novels, characters often are associated more strongly with films or video games than original works (Cheshire Cat), and when characters are listed they are neither prioritized nor link to a record that clearly states gender. And then there's how to select some sort of Western Canon list. ISBNs are nowhere to be found, nor any other identifier that might help to corral a fair chunk of results. I looked at OCLC, but WorldCat Works is still an experiment and frankly looks like too much work to query for too little return even if it had good coverage. Amazon? Librarything? Goodreads? No luck yet. I raise this partly because a) I would like to make some toys with that list, and b) I feel this is a good test case for what developers might want from library data, linked or otherwise. It is the sort of request that includes many unspoken assumptions (that there is a canon, and it is well-defined) that app users, product managers, and developers typically want even if it is woefully incomplete or imperfect, so long as it matches expectations. 
While I appreciate what it takes to make such a list, I feel like this really ought to be a solved problem in the library space. Not in the process of being solved, hopefully, by new emerging standards solved, but like we solved this ages ago, here ya go solved. I'm posting this basically in the hopes that someone will say No, doofus, there's an easy way to do this, you just aren't very good at this - look: and show me where I'm wrong. D -- Project Wrangler, SOA, Info Alchemist, UX, RESTafarian, Topic Maps http://shelter.nu/blog | google.com/+AlexanderJohannesen http://xsiteable.org | http://www.linkedin.com/in/shelterit
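The Wikipedia-scraping approach suggested up-thread could start roughly like this. The regexes and the inline sample (standing in for wikitext fetched via the MediaWiki API) are assumptions, and real articles vary their section headings considerably:

```python
import re

# Stand-in for wikitext fetched from a novel's article; fetching is omitted.
sample = """
==Main characters==
* [[Elizabeth Bennet]] - the protagonist
* [[Mr. Darcy|Fitzwilliam Darcy]] - wealthy and proud
==Plot==
...
"""

def characters(wikitext):
    """Pull linked names out of a 'Characters' / 'Main characters' section."""
    section = re.search(r"==\s*(?:Main )?[Cc]haracters\s*==(.*?)(?:\n==|$)",
                        wikitext, re.S)
    if not section:
        return []
    # First target of each [[link]] or [[target|label]] bullet.
    return re.findall(r"\*\s*\[\[([^\]|]+)", section.group(1))

print(characters(sample))
```

This only handles one heading pattern and bulleted, wiki-linked names; the "small set of parsing rules" Alex mentions would grow from exactly these kinds of cases as more articles are visited.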