Re: [CODE4LIB] Protagonists
Hmm. So, I'm a big fan of WikiPedia and would still go that way even if the data can be haphazard. WikiPedia has a lot of classics with a section called "Lead characters" (Pride and Prejudice included) where the focus is the novel first, which should be easy to call and then trim with some simple text parsing to get basic characterizations, like gender, possibly age, place and purpose to the story (main protagonist, antagonist, support character, etc.) I'd start with a page like "Le Monde's 100 Books of the Century" (http://en.wikipedia.org/wiki/Le_Monde%27s_100_Books_of_the_Century) and give each of them a visit, scraping for "main characters" or "characters" headings, and devise a small set of parsing rules to grab the top ones and their properties. Sounds like a fun day or so. Cheers, Alex On Tue, Apr 14, 2015 at 3:35 PM, McAulay, Elizabeth wrote: > Cool set of questions! Here's a funny "cheat" -- what about querying Amazon > or the like for a list of "Cliff's Notes" and call the subjects of the > Cliff's Notes "the Canon"? That could serve as a the canon list. Another idea > would be to consult a reference work, but I can't think of a good source > offhand. One example that's not perfect is the "Dictionary of Literary > Biography." The Canon is created by what is included in the reference work. > > As for finding lead character names, that's something I don't have an > immediate answer for. > > Good luck! 
> > Best, > Lisa > > - > Elizabeth "Lisa" McAulay > Librarian for Digital Collection Development > UCLA Digital Library Program > http://digital.library.ucla.edu/ > email: emcaulay [at] library.ucla.edu > > > From: Code for Libraries on behalf of > davesgonechina > Sent: Monday, April 13, 2015 7:12 PM > To: CODE4LIB@LISTSERV.ND.EDU > Subject: [CODE4LIB] Protagonists > > So I have this idea I'd like to do for a hobby project, but it requires > finding a table that lists a classic novel, a Gutenberg.org link to an > instance of that work (first listed, one with most downloads, whichever), > the lead female character, and the lead male character (can be null). E.g. > Pride and Prejudice, http://www.gutenberg.org/ebooks/42671, Elizabeth > Bennet, Mr. Darcy. Even leaving the Gutenberg part for another day, this > has been really difficult to find. > > I've had no success with Dbpedia/Wikidata since there's no real > standardized format for novels, characters often are associated more > strongly with films or video games than original works (Cheshire Cat), and > when characters are listed they are neither prioritized nor link to a > record that clearly states gender. And then there's how to select some sort > of "Western Canon" list. ISBNs are nowhere to be found, nor any other > identifier that might help to corral a fair chunk of results. > > I looked at OCLC, but WorldCat Works is still an experiment and frankly > looks like too much work to query for too little return even if it had good > coverage. Amazon? Librarything? Goodreads? No luck yet. > > I raise this partly because a) I would like to make some toys with that > list, and b) I feel this is a good test case for "what developers might > want" from library data, linked or otherwise. 
It is the sort of request > that includes many unspoken assumptions (that there is a canon, and it is > well-defined) that app users, product managers, and developers typically > want even if it is woefully incomplete or imperfect, so long as it matches > expectations. While I appreciate what it takes to make such a list, I feel > like this really ought to be a solved problem in the library space. Not "in > the process of being solved, hopefully, by new emerging standards" solved, > but like "we solved this ages ago, here ya go" solved. > > I'm posting this basically in the hopes that someone will say "No, doofus, > there's an easy way to do this, you just aren't very good at this - look:" > and show me where I'm wrong. > > D -- Project Wrangler, SOA, Info Alchemist, UX, RESTafarian, Topic Maps http://shelter.nu/blog | google.com/+AlexanderJohannesen http://xsiteable.org | http://www.linkedin.com/in/shelterit
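The scraping idea in the reply at the top of this thread (look for a "characters" heading, then apply a small set of parsing rules) could be sketched roughly as below. This is only an illustration under assumptions: the sample wikitext, the function names, and the rule that a character name is the bold text at the start of a bullet are all hypothetical, and real pages vary a lot more than this.

```python
import re

# Hypothetical wikitext sample; a real page would have to be fetched separately.
SAMPLE_WIKITEXT = """\
== Plot summary ==
Some plot text.

== Main characters ==
* '''Elizabeth Bennet''', the second of the Bennet daughters.
* '''Mr. Fitzwilliam Darcy''', a wealthy gentleman.

== Themes ==
More text.
"""

# Matches wikitext headings such as "== Characters ==" (levels 2 to 6).
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.*?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Assumed rule: a character is the bold text that opens a bullet item.
BULLET_NAME_RE = re.compile(r"^\*+\s*'''(.+?)'''", re.MULTILINE)


def extract_section(wikitext, wanted=("main characters", "characters")):
    """Return the body of the first section whose heading is in `wanted`."""
    headings = list(HEADING_RE.finditer(wikitext))
    for i, match in enumerate(headings):
        if match.group(2).strip().lower() in wanted:
            level = len(match.group(1))
            end = len(wikitext)
            # The section stops at the next heading of the same or a higher level.
            for following in headings[i + 1:]:
                if len(following.group(1)) <= level:
                    end = following.start()
                    break
            return wikitext[match.end():end].strip()
    return ""


def list_characters(section):
    """Return the bold names that start each bullet in the section."""
    return [name.strip() for name in BULLET_NAME_RE.findall(section)]


if __name__ == "__main__":
    print(list_characters(extract_section(SAMPLE_WIKITEXT)))
```

Gender, age and role would need further rules on the text after each name, which is where the haphazard data mentioned above would start to bite.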
Re: [CODE4LIB] rdf serialization
Hi, Robert Sanderson wrote: > c) I've never used a Topic Maps application. (and see (a)) How do you know? > There /are/ challenges with RDF [...] > But for the vast majority of cases, the problems are solved (JSON-LD) or no > one cares any more (httpRange14). What are you trying to say here? That httpRange14 somehow solves some issue, and we no longer need to worry about it? >> Having said that, there's tuples of many kinds, it's only that the >> triplet is the most used under the W3C banner. Many are using to a >> more expressive quad, a few crazies , for example, even though that > > ad hominem? really? Your argument ceased to be valid right about here. I think you're a touch sensitive, mate. "Crazies" as in, few and knowledgeable (most RDF users these days don't know what tuples are, and how they fit into the representation of data) but not mainstream. I'm one of those crazies. It was meant in jest. >> may or may not be a better way of dealing with it. In the end, it all >> comes down to some variation over frames theory (or bundles); a >> serialisation of key/value pairs with some ontological denotation for >> what the semantics of that might be. > > Except that RDF follows the web architecture through the use of URIs for > everything. That is not to be under-estimated in terms of scalability and > long term usage. So does Topic Maps. Not sure I get your point? This is just semantics of the key dominator in tuple serialisation, there's nothing revolutionary about that, it's just an ontological commitment used by systems. URIs don't give you some magic advantage; they're still a string of characters as far as representation is concerned, and I dare say, this points out the flaw in httpRange14 right there; in order to know representation you need to resolve the identifier, ie. there's a movable dynamic part to what in most cases needs to be static. 
Not saying I have the answer, mind you, but there are some fundamental problems with knowledge representation in RDF that a lot of people don't "care about" which I do feel people of a library bent should care about. >> But wait, there's more! [big snip] > > Your point? You don't like an ontology? #DDTT My point was the very first words in the following paragraph; >> Complexity. And of course I like ontologies. I've bandied them around these parts for the last 10 years or so, and I'm very happy with RDA/FRBR directions of late, taking at least RDF/Linked Data seriously. I'm thus not convinced you understood what I wrote, and if nothing else, my bad. I'll try again. > That's no more a problem of RDF than any other system. Yes, it is. RDF is promoted as a solution to a big problem of findable and shareable meta data, however until you understand and use the full RDF cake, you're scratching the surface and doing things sloppy (and I'd argue, badly). The whole idea of strict ontologies is rigor, consistency and better means of normalising the meta data so we all can use it to represent the same things we're talking about. But the question to every piece of meta data is *authority*, which is the part of RDF that sucks. Currently it's all balanced on WikiPedia and dbPedia, which isn't a bad thing all in itself, but neither of those two are static nor authoritative in the same way, say, a global library organisation might be. With RDF, people are slowly being trained to accept all manners of crap meta data, and we as librarians should not be so eager to accept that. We can say what we like about the current library tools and models (and, of course, we do; they're not perfect), but there's a whole missing chunk of what makes RDF 'work' that is, well, sub-par for *knowledge representation*. And that's our game, no? 
The shorter version: the RDF cake, with its myriad of layers and standards, is too complex for most people to get right, so Linked Data comes along and tries to be simpler by making the long-term goal harder to achieve. I'm not, however, *against* RDF. But I am for pointing out that RDF is neither easy to work with, nor ideal for any long-term goals we might have in knowledge representation. RDF could have been made a lot better, and better solutions exist upstream, but most of this RDF talk is stuck in 1.0 territory, suffering the sins of former versions.

>> And then there's that tedious distinction between a web resource and
>> something that represents the thing "in reality" that RDF skipped (and
>> hacked a 304 "solution" to). It's all a bit messy.
>
> That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
> solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any goal you have for RDF (and the rest of the stack), it was a mistake not to get it right *first*. OWL has better means of dealing with it, but then, complexity, yadda, yadda.

> http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
> And it's not messy, it's very clean.

Subjective, of course. H
Re: [CODE4LIB] rdf serialization
Ross Singer wrote:
> This is definitely where RDF outclasses almost every alternative*, because
> each serialization (besides RDF/XML) works extremely well for specific
> purposes [...]

Hmm. That depends on what you mean by "alternative to RDF serialisation". I can think of a few; amongst them, obviously (for me), is Topic Maps, which doesn't go down the evil triplet way with conversion to and from an underlying data model. Having said that, there are tuples of many kinds; it's only that the triplet is the most used under the W3C banner. Many are moving to a more expressive quad, a few crazies , for example, even though that may or may not be a better way of dealing with it. In the end, it all comes down to some variation over frames theory (or bundles); a serialisation of key/value pairs with some ontological denotation for what the semantics of that might be.

It's hard to express what we perceive as knowledge in any notational form. The models and languages we propose are far inferior to what is needed for a world as complex as it is. But as you quoted George Box, some models are more useful than others.

My personal experience is that I've got a hatred for RDF and triplets for many of the same reasons Eric touches on, and as many know, I prefer the more direct meta model of Topic Maps. However, these two different serialisation and meta model frameworks are - lo and behold! - compatible; there's canonical lossless conversion between the two. So the argument at this point comes down to personal taste for what makes more sense to you. As to more on the problems of RDF, read this excellent (but slightly dated) Bray article: http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet

But wait, there's more! We haven't touched upon the next layer of the cake: OWL, which is, more or less, an ontology for dealing with all things knowledge and web. And it kinda puzzles me that it is not more often mentioned (or used) in the systems we make.
A lot of OWL was tailored towards being a better language for expressing knowledge (which in itself comes from DAML and OIL ontologies), and then there's RDFs, and OWL in various formats, and then ... Complexity. The problem, as far as I see it, is that there's not enough expression and rigor for the things we want to talk about in RDF, but we don't want to complicate things with OWL or RDFs either. And then there's that tedious distinction between a web resource and something that represents the thing "in reality" that RDF skipped (and hacked a 304 "solution" to). It's all a bit messy. > * Unless you're writing a parser, then having a kajillion serializations > seriously sucks. Some of us do. And yes, it sucks. I wonder about non-political solutions ever being possible again ... Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps http://shelter.nu/blog | google.com/+AlexanderJohannesen | http://xsiteable.org http://www.linkedin.com/in/shelterit
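To make the tuple talk above a little more concrete, here is a minimal sketch of triples, quads, and the "frames" view (key/value pairs bundled per subject). It is plain Python and not any RDF or Topic Maps library; the `ex:` identifiers and the function names are made up for illustration.

```python
from collections import namedtuple

# A triple is a 3-tuple; a quad adds a fourth slot, here a named graph.
Triple = namedtuple("Triple", "subject predicate object")
Quad = namedtuple("Quad", "subject predicate object graph")


def to_frames(statements):
    """Group statements by subject into key/value 'frames'."""
    frames = {}
    for s in statements:
        frames.setdefault(s.subject, {}).setdefault(s.predicate, []).append(s.object)
    return frames


def in_graph(quads, graph):
    """Project the quads of one named graph down to plain triples."""
    return [Triple(q.subject, q.predicate, q.object) for q in quads if q.graph == graph]


# Hypothetical data: the same subject described in two different graphs.
QUADS = [
    Quad("ex:book1", "ex:title", "Pride and Prejudice", "ex:catalogueA"),
    Quad("ex:book1", "ex:creator", "ex:austen", "ex:catalogueA"),
    Quad("ex:book1", "ex:title", "Pride & Prejudice", "ex:catalogueB"),
]

if __name__ == "__main__":
    print(to_frames(in_graph(QUADS, "ex:catalogueA")))
```

The fourth slot is what lets you say who asserted a statement, which is one reason people reach for quads; whether that is worth the extra complexity is exactly the matter of taste discussed above.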
Re: [CODE4LIB] rdf serialization
Hiya,

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle wrote:
> Eric, I really don't see how RDF or linked data is any more difficult to
> grasp than a database design

Well, there's at least one thing which makes people tilt: the flexible structures for semantics (or ontologies), where things aren't as solid as in a data model. A framework where there are endless options (on the surface of it) for relationships between things is daunting to people who come from a world where the options are cast in iron. There's also a shift away from things' identities being tied down in a model somewhere, into a world where identities are a bit more, hmm, flexible? And less rigid? That can make some people cringe, as well.

> A master chef understands the chemistry of his famous dessert - the rest of
> us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)

Alex
Re: [CODE4LIB] Seeking examples of outstanding discovery layers
I love the Trove from the National Library of Australia ; http://trove.nla.gov.au/ Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Project Management Software Question
Hiya,

> --What project management software are you using?

Semantic MediaWiki, xSiteable

> --What made you choose the system?

Most project management software is written by geeks, not for humans. They all propose some methodology to go with their model, but either their model is inflexible (and clashes with yours), or it is so flexible that any tool might do the trick. Also, they are notoriously hard to configure, and it gets harder with every person involved. Also, people hate putting in their data, so most software, even if it might just do the trick, fails for human reasons.

So, a simple wiki with some added ontology cruft, and xSiteable delivering semantics and widgets across all people, is enough. Simple todos beat complex task management every time.

> --Has the system met all of your needs? If not, where does it fail?

It only fails when we need an average to higher degree of data; again, a human problem. Oh, and it sometimes fails because the MediaWiki GUI sucks for non-geeks. I think Confluence is better and overall pretty good.

> --Overall opinions?

I could write you a sonnet or two, but I have very little trust in software helping much in project management (after having tried them all over a span of 20 years). A joint platform for documentation is about as far as I'd go (and for heaven's sake, choose a Wiki that has a usable interface!). In fact, you'd be *far* better off getting "Making Things Happen" by Scott Berkun ( http://www.amazon.com/dp/0596517718?tag=scottberkunco-20&camp=14573&creative=327641&linkCode=as1&creativeASIN=0596517718&adid=1B6JF6HWHDT0S5RYZNNM), the best book I ever got. Honest, I'm not affiliated. :)

> --What systems did you evaluate and decide not to recommend?

Hmm, I think I've tried too many. I'm sure there's software out there that doesn't suck (i.e. I hear good things about a few here and there), but far too often do I see this usability problem, paired with the human engagement problem, crop up and ruin the best of software packages.

> Any information would be great!

Sorry to be so glum. I'm happier with simpler approaches such as "project on a page" (i.e. one Wiki page with short description, people, contacts, goals, and progress) and more agile ways of dealing with requirements and development (reduces the need for approved paper, easier to roll back bad decisions, etc.). The closest I get to a Gantt chart is that one of our vendors insists on sending me one every now and then, despite the fact that he has to come into the office and explain it to people every single time.

In other words: use software to document and drive forward, never use software to measure progress and estimates.

Regards,

Alex
(disgruntled ex-believer in project management software)
Re: [CODE4LIB] Open datasets
Hiya,

Thanks for all the pointers; just what I wanted, and it gives me plenty of ways to test the generic meta data handling. Great!

Regards,

Alex

On Jan 12, 2012 3:19 AM, "Simon Spero" wrote:
> You can get anything you want
> At Brewster Kahle's restaurant.
> http://openlibrary.org/data#bulk_download
>
> Simon
>
> On Wed, Jan 11, 2012 at 10:55 AM, LeVan,Ralph wrote:
>
> > http://staff.oclc.org/~levan/PearsTraining/scifi.usmarc has 10,000 marc
> > records in it. They are part of the old SiteSearch system that OCLC
> > released as open source. They date back to 2002 and will not contain
> > any Unicode, if you were hoping to include that as part of your testing.
> >
> > Ralph
> >
> > -Original Message-
> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> > Alexander Johannesen
> > Sent: Wednesday, January 11, 2012 5:36 AM
> > To: CODE4LIB@LISTSERV.ND.EDU
> > Subject: Open datasets
> >
> > Hiya,
> >
> > I'm in the middle of creating a meta data management system (including
> > merging and persistent identifier management) for a somewhat different
> > domain (intranets and business integration), but it's based on Topic Maps
> > and so is well suited to other means of meta data handling / mangling.
> > It's also going to be open-source, and it might be well-suited to library
> > tasks as well.
> >
> > So in order to test the integrity and performance of my system so far I'm
> > wondering if there's a suitable open dataset of bibliographic records that
> > aren't too obscure (meaning, I can find the titles at amazon or Open
> > Library) that you could recommend? More than 1000 records, but less than a
> > million, maybe?
> >
> > Regards,
> >
> > Alex
[CODE4LIB] Open datasets
Hiya, I'm in the middle of creating a meta data management system (including merging and persistent identifier management) for a somewhat different domain (intranets and business integration), but it's based on Topic Maps and so is well suited to other means of meta data handling / mangling. It's also going to be open-source, and it might be well-suited to library tasks as well. So in order to test the integrity and performance of my system so far I'm wondering if there's a suitable open dataset of bibliographic records that aren't too obscure (meaning, I can find the titles at amazon or Open Library) that you could recommend? More than 1000 records, but less than a million, maybe? Regards, Alex
Re: [CODE4LIB] Linux Laptop
MJ Ray wrote:
> I humbly suggest that long futz times are only necessary these days
> when most of the following combine:

Hmm.

> 1. unsupported/hard-to-support hardware (maybe bought for compatibility
> with another even-fussier operating system?);

Yes, this is the big offender; however, I've never met an Ubuntu first install that didn't work well on the first try. It's only when you start tweaking stuff that it seems to fall down a little.

> 2. control-freakery ("it must work/look exactly THIS way RIGHT NOW
> without me doing much");

Yes, hackers tweak, it's in their nature. They also know the consequences of hacking and tweaking, so I'm not sure this is a bad thing per se. I personally went Linux *because* I like tweaking and then fixing my messes (my blog is full of angry anecdotes and stories about just this, some sillier than others), and there is one difference between (at least) the Windows world and the Linux world: fixing a broken Linux is tons easier than fixing a broken Windows, so even if we do talk about stuff getting broken, the fixes are not even comparable.

> 3. not good at asking for technical help online or being patient with
> LUGs;

Hardly ever used this.

> 4. not willing to find and/or pay local experts;

I pay myself all the time.

> 5. not willing to search/read the copious fine manuals or debug logs.

The amount of fragmented and irrelevant information out there is inversely proportional to the time you thought it would take to fix your problem.

> I guess newcomers still have to get used to
> basics like having 5 or more useful mouse buttons instead of 1...

With the (reasonably) few mishaps I've had while updating and installing Ubuntu versions, I'm still a happy hacker who never regretted the move, even if the journey has been bumpy at times. However, a word of warning about Ubuntu is that it is moving in a direction that, to me, is completely wrong, so I'm switching to Mint (with that Gnome 3 layer that makes it Gnome 2 compatible). Unity is a travesty, and the people who hate it the most are ... the tweakers and hackers. Just sayin'

Regards,

Alex
--
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
"Richard Wallis" wrote: > Collection of triples? Yes, no baggage there ... :) Some of us are doing this completely without a single triplet, so I'm not sure it is accurate or even politically correct. *hehe* > A classic example of only being able to describe/understand the future in > the terms of your past experience. Yes, exactly. Although, having said that, I'm excited that the library world is finally taking the semantic challenge seriously. It's taken quite a number of years, but slowly there's a few drips and draps happening. Here's to hoping that there's a fluse somewhere about to open fully, and maybe the RDA vehicle have proper wheels? (Didn't the last time I checked, but that's admittedly a couple of years back. I hear they at least got new suspension?) Regards, Alex
Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
"Richard Wallis" wrote: > Your are not the only one who is looking for a better term for what is > being created - maybe we should hold a competition to come up with one. A "named graph" gets thrown around a lot, and even though this is technically correct, it's neither nice nor sexy. In my past a "bucket" was much used, as you can easily thrown things in or take it out (as opposed to the more terminal record being set), however people have a problem with the conceptual size of said bucket, which more or less summarizes why this term is so hard to pin down. I have, however, seen some revert the old RDBMS world of "rows", as they talk about properties on the same line, just thinking the line to be more flexible than what it used to be, but we'll see if it sticks around. Personally I think the problem is that people *like* the idea of a closed little silo that is perfectly contained, no matter if it is technically true or not, and therefore futile. This is also why, I think, it's been so hard to explain to more traditional developers the amazing advantages you get through true semantic modelling; people find it hard to let go of a pattern that has helped them so in the past. Breaking the meta data out of the wonderful constraints of a MARC record? FRBR/RDA will never fly, at least not until they all realize that the constraints are real and that they truly and utterly constrain not just the meta data but the future field of librarying ... :) Regards, Alex
Re: [CODE4LIB] Models of MARC in RDF
On Wed, Dec 7, 2011 at 1:49 PM, stuart yeates wrote:
> As much as I have nothing against anyone on this list, isn't it a little
> US-centric? Didn't we make that mistake before?

I wouldn't worry. A dream-team has no basis in reality, hence the "dream" part. I'd like to see a Real Team instead, an international collaboration of people, including international smarts and non-librarians. (Realistically, an international [or semi] library conference should first have a three-day session with smart people on this very issue, and that would make a fine place to get this thing working, even with some degree of speed.)

Alex
--
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Namespace management, was Models of MARC in RDF
Hiya,

Karen Coyle wrote:
> I wonder how easy it will be to
> manage a metadata scheme that has cherry-picked from existing ones, so
> something like:
>
> dc:title
> bibo:chapter
> foaf:depiction

Yes, you're right in pointing this out as a problem. And my answer is: it's complicated. My previous "rant" on this list was about data models*, and dangnabbit if this isn't related as well.

What your example is doing is pointing out a new model based on bits of other models. This works fine, for the most part, when the concepts are simple; simple to understand, simple to extend. Often you'll find that what used to be unclear has grown clear over time (as more and more have used FOAF, you'll find some things are more used and better understood, while other parts of it fade into 'we don't really use that anymore').

But when things get complicated, it *can* render your model unusable. Mixed data models can be good, but can also lead directly to meta data hell. For example:

dc:title
foaf:title

Ouch. Although not a biggie, I see this kind of discrepancy all the time, so the argument against mixed models is of course that the power of definition lies with you rather than some third party that might change their mind (albeit rarely) or have similar terms that differ (more often).

I personally would say that the library world should define RDA as you need it to be, and worry less about reuse at this stage unless you know for sure that the external models do bibliographic meta data well.

HOWEVER!

When we're done talking about ontologies and vocabularies, we need to talk about identifiers, and there I would swing the other way and let reuse govern, because it is when you reuse an identifier that you start thinking about what that identifier means to *both* parties. Or, put differently: it's remarkably easier to get this right if the identifier is a number, rather than some word.
And for that reason I'd say reuse identifiers (subject proxies) as they are easier to get right and bring a lot of benefits, but not ontologies (model proxies) as they can be very difficult to get right and don't necessarily give you what you want. Just my .2 AUD. Alex * https://plus.google.com/u/0/111886865967199209050/posts/QLx3LLeseeD -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
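The dc:title / foaf:title clash above is easy to show once the prefixes are expanded. A small sketch, assuming the usual namespace URIs for Dublin Core elements, FOAF and BIBO; the helper names are made up, and this is plain string handling rather than any RDF toolkit.

```python
# Commonly used namespace URIs for the three vocabularies in the example.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "foaf": "http://xmlns.com/foaf/0.1/",
    "bibo": "http://purl.org/ontology/bibo/",
}


def expand(curie):
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = curie.partition(":")
    return PREFIXES[prefix] + local


def local_name_clashes(curies):
    """Return local names that occur under more than one namespace."""
    by_local = {}
    for curie in curies:
        local = curie.partition(":")[2]
        by_local.setdefault(local, set()).add(expand(curie))
    return {name: sorted(uris) for name, uris in by_local.items() if len(uris) > 1}


if __name__ == "__main__":
    mixed = ["dc:title", "bibo:chapter", "foaf:depiction", "foaf:title"]
    print(local_name_clashes(mixed))
```

The two "title" properties are different identifiers with different meanings (the title of a resource versus a person's honorific), which the shared word hides. That is the kind of discrepancy a mixed model has to keep track of.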
Re: [CODE4LIB] Professional development advice?
Kyle Banerjee wrote:
> Starting with data modeling is like trying to learn a new spoken language
> by focusing on grammar [...]

Hmm. It seems that a lot of people are, shall we say, somewhat misguided as to what data modelling is, even mighty Wikipedia, which makes it into a formal process of sorts, and I can see it repeated ad nauseam wherever you go, giving us the idea that it is all about the schema of columns and the nuts and bolts of tables and relations in an RDBMS. That's confusing data modelling tools or processes with the generic open-ended category of data modelling.

Data modelling is simply the act of exploring data-oriented structures. Over time I've learned that everything we do, every little problem you battle with in your daily life, revolves around some data structure, the names of such, and their internal and external relationships. The simplest web form has a model, simple and complex applications do as well, enterprise systems, library systems, formats, databases, documents, spreadsheets, this conversation, your bicycle, your morning routine, *everything*.

There is, in my strong opinion, a horrible conflation of the concept of modelling data and implementations pinning down data types; it's an evil so strong it blinds us, cripples us, and I feel like screaming out in terrifying agony the horrors within! The wrongly applied indices! The labels on columns! The semantic binding of one sub-structure to another! The optimising tricks used! The stored procedures! The conceptual semantics of labels in n-ary graphs!! *aaarghhh!!* The wretched *name* of a single field and how it quietly eats up any disambiguating notion we put in place, through the many well-meaning but afflicting layers of abstraction and implementation, it drives me insane! "Name"!? What does that mean in the context of an email address? What does "comment" mean when it reaches my ORB? What were they thinking when the model they designed resulted in SQL statements 1K long?
There's so much information written on the topic of data modelling, and most of it ignores the very thing that it should embrace and focus heavily on: good semantic design. (Granted, it has become far more focused on in the last 10 years, and I'm extremely happy for that.) Put some heavy thought into your tables, because what you perceive as a simple table of users becomes an overwhelming problem when you add special users to the system. Have any of you ever created an ILS with a table "book" in it? (C'mon, raise your hand, I know you have!) Yeah, that's the sort of evil I'm talking about! Libraries don't deal with "books", they deal with bibliographic meta data of objects, and sometimes those objects are called a "book" which has certain constraints and properties that link to special meta data that isn't static.

Version 1.0 of any system is famously rubbish because of the learning process of getting all this stuff wrong. Version 2.0 is famous for being overly abstracted and incomprehensible. Version 3.0 is getting there, but you're bogged down in the middleware, translating between good but incompatible models. By the time you get to version 4.0 you realize that the underlying concepts which drove versions 1 through 3 are flawed, and you need to work in terms of FRBR sub-graphs instead of MARC records. Version 5.0 is so re-written and re-conceptualized, you decide to call it something else, version 1.0. And we repeat the cycle. If your software isn't like this, consider yourself lucky (or at worst, self-deluded :).

> Data modeling is extremely useful, but
> mistaking drips and drabs of it early on for reality can poison your
> thinking.

Sorry, you got that back to front. We all agree that understanding what users want and / or need is King, but unless you've got that understanding of not only what the users want but how systems can deliver this without creating constraints that will screw things up when you extend that original delivery idea, you're going to suffer.
Badly. It's easy; take great care to what you call things in your system (no matter whether it's in the database, your objects / classes / instances / interfaces, user interface, buttons, messages, windows, data types, loops ... they're all data models that need to be as cooperative as possible, speaking the *same language*, to be compatible in the meaning they give the concepts used. If your Wheels API has different semantics from your Steering API, making that car is going to be a really crappy experience, for you as a developer, for testers, for maintenance guys, for service people, and most of all don't think for a second that the driver won't notice. These semantics are far more important than what our industry traditionally have given them, and in my opinion it is our biggest flaw. Trust me, I've stared at data models up and down so many systems over the years (10 of them in a high-flying big consultant agency where we came in when projects otherwise failed) it's amazi
Re: [CODE4LIB] Professional development advice?
Hiya,

On Tue, Nov 29, 2011 at 10:06 AM, Nate Vack wrote:
> A more productive task is to understand the who, how, when, and
> thenceforth of what tasks actual people want to accomplish with their
> computers

Understanding this is not disconnected from designing data models *right*. It's the same thing. By extension I should mention that people are terrible at telling you what they want or need, but they're good at telling you what they hate. If nothing else, I'd suggest tapping into that wonderful hate.

> But an 'all flows from data modeling' thought process leads to FOAF,
> FOAF leads to hate, and hate leads to suffering.

This sounds suspiciously like someone who doesn't understand the perils of data models and how they affect all the FOAF and hate that's built up around its faults. FOAF and suffering are symptoms of shitty data models, not shitty code. Unless you've got a little more meat on that argument? :)

Regards,

Alex
--
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Professional development advice?
I could give you tons of advice, most of it specific to some technological domain or another, but over the years I've more or less settled on one thing that beats out all the others: data models. Once you grok data models, what they are, how they work, and all the extended family (schemas, ontologies, persistent identification, querying, de-duplication, layered models, LUT/transcripts, stored procedures [and why they are evil], RDBMS vs. NoSQL vs. whatever, and so on), everything else is miscellaneous.

The way we humans use computers as tools is rooted in a data model at the bottom of some program or database, and the rest of the time is spent interacting with the data model, trying to make it do the things we need it to do, and so on. Everything is about and around that data model, so getting it right is a lot more important than any amount of beautiful coding against it.

So, that's my big tip: all that technology we muck about with is really trying to work well with a data model. Your task should rather be to understand the why, who, how, when and the thenceforth of data models, and everything else will follow.

Now, this tip could under normal circumstances be applied to any part of the IT industry, but it makes particular sense in the library world. Most of the time is spent converting data between data models (whatever > MARC > whatever), or making sense of the one (MARC21/FRBR) or other (AACR2/RDA and that third one I can never remember the name of, those extension rules to AACR2?) or three (LCSH/DDC). We're all battling against the original thought and implementation of data models, and very often you'll find better technological solutions when you understand the underlying human efforts of ... data modeling (and by extension, you might discover my pet peeve, how all bad software and systems in the world come from bad data modeling, and *not* from bad programming [even if there's plenty of that, too]).

Regards,
Alex
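The remark above about time spent converting between data models (whatever > MARC > whatever) can be made concrete with a small sketch. This is only an illustration in Python under loud assumptions: the record layout, the `TAG_TO_FIELD` mapping and the `convert` function are all hypothetical, and the record is far simpler than real MARC21.

```python
# Hypothetical, heavily simplified MARC-like record: tag -> list of subfield dicts.
marc_like = {
    "100": [{"a": "Austen, Jane"}],
    "245": [{"a": "Pride and prejudice"}],
    "590": [{"a": "Local note: donated copy"}],
    "650": [{"a": "Courtship"}, {"a": "Sisters"}],
}

# Assumed mapping from tags to field names in the target model (illustrative only).
TAG_TO_FIELD = {"100": "creator", "245": "title", "650": "subject"}


def convert(record, mapping=TAG_TO_FIELD):
    """Translate a tag-keyed record into a name-keyed one, skipping unmapped tags."""
    out = {}
    for tag, occurrences in record.items():
        name = mapping.get(tag)
        if name is None:
            # The target model has no slot for this tag, so the data is dropped.
            continue
        out.setdefault(name, []).extend(
            occ["a"] for occ in occurrences if "a" in occ
        )
    return out


converted = convert(marc_like)
print(converted)
```

The code itself is trivial; what matters is the mapping table. The local note in tag 590 disappears without any error, simply because the target model has nowhere to put it, which is a modeling decision rather than a programming one.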
Re: [CODE4LIB] Ontology Question
Hiya,

> Is it okay to just use the classes I need or should I include the super
> classes which they belong to?

I think we also need to define a few concepts here. What do you mean, "include"? As far as I can tell, you want to say something like "Here's a few concepts we're using, and their definition is based off this other ontology over *there* (pointing)", but that's not always the case, so just asking.

Now, Karen is of course right in her take on it, but there's a little thing that requires a bit of focus, and that's how this new ontology is going to be used. Is it one of these manual labour things where it doesn't actually require formal definitions as much as a human one, or is it (however you use the ontology) to be passed through a tool, or more formally passed through an inferencer?

Regards,
Alex
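Why the intended use matters can be seen in a toy example. The sketch below is in Python and rests on assumptions: the class names, the `SUBCLASS_OF` table and the `ancestors` and `is_a` helpers are hypothetical and stand in for a real ontology and a real inferencer, which would do far more.

```python
# Hypothetical subclass declarations, child -> parent (not from any real ontology).
SUBCLASS_OF = {
    "Novel": "Book",
    "Book": "Work",
}


def ancestors(cls, subclass_of):
    """Return every superclass reachable from cls, nearest first.

    Assumes the declarations contain no cycles.
    """
    found = []
    while cls in subclass_of:
        cls = subclass_of[cls]
        found.append(cls)
    return found


def is_a(cls, target, subclass_of):
    """Answer 'is every cls also a target?' the way a very naive inferencer might."""
    return cls == target or target in ancestors(cls, subclass_of)


# With the superclass chain included, the relationship can be derived.
full = is_a("Novel", "Work", SUBCLASS_OF)

# With only the class that was "needed", the chain is broken and nothing is derived.
pruned = is_a("Novel", "Work", {"Novel": "Book"})

print(full, pruned)
```

A human reader can fill in the missing link from Book to Work without being told; a tool can only follow what has been declared.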
Re: [CODE4LIB] two open positions at Stanford
On Sat, Oct 15, 2011 at 3:40 AM, Cindy Harper wrote:
> I mean - Bieber??? You
> mean he has a beard?

Unless they put a Phillips in a Tardis ... (and seriously, if you get that joke with its three somewhat obscure references, and the one insane premise, there's something wrong / right with you, and I'm almost tempted to give out prizes ... )

Alex
Re: [CODE4LIB] iPads as Kiosks
Just my two bob: we're going through various stages of testing out tablets for both kiosks *and* portable workstations (for nurses and staff), and have tried out iPads and various Androids. Our current favorite is actually the Asus Eee Pad Transformer, a vanilla (but good quality) Honeycomb Android during the day, but with a snap-on keyboard with extra ports and batteries for some netbook action at night, so it satisfies both our criteria.

As with all things, it also depends on what software you want to run. If you go with iPad you need to go through Apple's various restrictions, while on Android you can use whatever you want. For a "you are here" tablet a cheap $150 Android seems like a good option, too.

Regards,
Alex

On Wed, Aug 24, 2011 at 11:51 PM, Madrigal, Juan A wrote:
> That's the equivalent to $25/month and includes support for your whole
> development team/institution.
>
> If your employer can't afford that then I suggest you look for a new job!
> ;)
>
> Juan Madrigal
> Web Developer
> Web and Emerging Technologies
> University of Miami
> Richter Library
>
> On 8/23/11 2:21 PM, "Dan Funk" wrote:
>
>>Wow, just $300/year and you can run your own software on your own
>>hardware? What a deal.
>>
>>On Tue, Aug 23, 2011 at 2:13 PM, David Uspal wrote:
>>> Thanks for the update. This definitely solves that issue -- it's
>>>unfortunate this wasn't in place in 2009, or I'd be into year two of a
>>>five year contract...
>>>
>>> David K. Uspal
>>> Technology Development Specialist
>>> Falvey Memorial Library
>>> Phone: 610-519-8954
>>> Email: david.us...@villanova.edu
>>>
>>> -----Original Message-----
>>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andrew Hankinson
>>> Sent: Tuesday, August 23, 2011 2:00 PM
>>> To: CODE4LIB@LISTSERV.ND.EDU
>>> Subject: Re: [CODE4LIB] iPads as Kiosks
>>>
>>> You can distribute apps via an internal web server, with no need to go
>>>out to Apple.
>>>
>>>http://developer.apple.com/library/ios/#featuredarticles/FA_Wireless_Enterprise_App_Distribution/Introduction/Introduction.html
>>>
>>> You need to be a registered business to do this, and it costs $299/yr.
>>>You get a digital certificate, but that doesn't mean your code needs to
>>>be "seen" by anyone outside of your org.
>>>
>>> On 2011-08-23, at 1:47 PM, David Uspal wrote:
>>>

When I did my iPhone work, it was back in 2009 before this document even existed, so it's good they've come some distance on this issue since then. Still, the document below doesn't break the dependency on the iTunes store and/or a digital certificate issued by Apple to download applications (if I'm reading page 63 right), which was the big sticking point of the contract. Not only did the user not want the network controlled by Apple (which this document does handle), they also didn't want the code seen by any outside source at all (aka via uploading it to the store)

David K. Uspal

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Andrew Hankinson
Sent: Tuesday, August 23, 2011 1:34 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] iPads as Kiosks

They now have an enterprise app deployment mechanism.

http://www.apple.com/support/iphone/enterprise/

On 2011-08-23, at 12:54 PM, David Uspal wrote:

> Then again, by selecting the iPad you're essentially tethered to
>Apple's iron grip of the iWorld via its iTunes vetting process and
>strict control of Apple hardware. YMMV on this depending on what
>you're doing, but it should definitely be a consideration when
>choosing between Android tablets and the iPad.
>
> Quick side story -- we had to drop a contract one time at my old job
>due to the customer proprietary requirements. The customer didn't
>want to release its developed software outside of house (minus the
>developers of course) and Apple wouldn't give them a waiver from using
>the iTunes store. Mind you, this was a very big company with
>resources, so Apple probably lost a 5000 unit sale due to this
>
> David K. Uspal
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
>Of Stephen X. Flynn
> Sent: Tuesday, August 23, 2011 9:01 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] iPads as Kiosks
>
> Let's not forget a far superior user experience.
>
> Stephen X. Flynn
> Emerging Technologies Librarian
> Andr
Re: [CODE4LIB] Seth Godin on The future of the library
Hi,

On Thu, Jun 2, 2011 at 9:11 AM, Jonathan Rochkind wrote:
> There are some unanswered questions about what the purpose of the catalog is
> or should be in our users research workflow, and it's not obvious to me
> whether that purpose will involve putting any possible book or article that
> exists for free on the internet in the catalog.

I personally think that libraries in general still have some fundamental issues of just getting their heads around the two-headed problem of free web resources. Not only are these free, but they don't physically exist. This has certain implications for libraries:

Free: as has been pointed out, sometimes this means not being peer reviewed, or not having the quality seal of a publisher, and as such there is no process for libraries to really understand how that knowledge fits into the rest of their collection. (I don't think it's a price issue; it's more a fundamental model issue.) It's sometimes hard to wrap your head around the concept of anything free being of much *worth*, where in the past worth and often quality were measured by the name of the publisher, the amount of peer review or the reputation of the author. The Internet has *changed* this to the core; it's all gone or going, and new models are coming through the haze of confusion, which I think the library world is both unprepared for and seriously underfunded to deal with.

Links: the whole concept of web resources, of what a link (or a link to a mirror or cache) is all about, confuses libraries, who are deeply rooted in all things being physical. I know this is a doozy, but I still find this an issue when talking to librarians even today. The concept of virtual things in the library world really only exists with the notion of meta data, and I don't think the transition to the resource itself *also* being virtual has worked out well.
Libraries *like* physical objects, they *like* shelves, they *like* their buildings, and I don't blame them; we are physical beings who love the smell of paper. However, books are not actually important, buildings are not actually important, that smell is definitely not important: ideas, knowledge and concepts are, and that's what we all try to pry from the books. (As an aside, if ideas and concepts were valued more, why couldn't LCSH morph into something far, far more important and useful? The mind boggles at the lost opportunities!) You cannot pry anything from a link except the possible resource at the other end, but it is a few traceroutes away in a virtual place, and in need of technological interpretation on arrival, and then comes the next level of trouble.

These are just the conceptual problems. The next real problem of technology and the library world is - despite the hard and excellent work put in by people like us on this very list! - that libraries are still slowpokes in the realm of using and developing technology. Most ILSs are charmingly quaint in dealing with these things. OPACs are mostly dreadful. Backend infrastructure is never powerful or big enough for the growing digital stuff coming in. Systems are always a bunch of features away from being what we need, only getting by on a barely useful set of features (that far too often the vendors dictate) to do the minimum we have to do.

Yes, yes, exceptions here and there, I would never deny that, but look at library land as a whole; you're lagging behind and you cannot really compete in a world that needs you to not only run, but win. And frankly, you *cannot* win, not on technology. There's just no way. Winning this one requires not technology as such, but paradigm shifts in thinking, both from inside and especially from the outside, coupled with proper resourcing by people who understand the value libraries truly bring to the world. And this latter thing is becoming a real problem, I think.
> One reason that libraries may not prioritize putting free ebooks in the
> catalog is because there are other places users can search for free ebooks
> on the internet -- but there aren't other places users can search for
> non-free ebooks that they know will be licensed to them as library patrons,
> or for that matter to search for physical things on the shelves that they
> know are available from their library.

Seems like an odd argument to me. Why are we talking about the price and the format of the information rather than the *quality* of it? I thought a curated collection was the bee's knees, regardless of the formats used. Hmm. Maybe I'm thinking more like a knowledge customer than a librarian these days, and I've lost my touch or my way. :)

Regards,
Alex
Re: [CODE4LIB] Let's go somewhere [was PHP vs. Python...]
On Tue, Nov 2, 2010 at 5:03 AM, Jonathan Rochkind wrote:
> I would be very unlikely to use someone's homegrown library specific
> scripting language. However, if you want to make a library for an existing
> popular scripting language that handles your specific domain well, I'd be
> quite likely to use that if I had a problem with your domain and I was
> comfortable with the existing popular scripting language, i'd use it for sure.

Hmm. The balance between the old and tried, and the new and experimental will, forever, cause these kinds of discussions. Now, I agree with the basic sentiment of what you're saying, but ...

> Odds are your
> domain is not really "libraries" (that's not really a software problem
> domain), but perhaps as Patrick suggests "dealing with relationships among
> semantic objects", and then odds are libraries are not the only people
> interested in this problem domain.

I've worked in the three basic tiers of the library development world: the plain vanilla programming world, the semantic web world, and the dark dungeons of the Cult of MARC. Is the domain of library IT solved by the generic technologies used? No.

There's nothing bad about a DSL; in fact, I encourage it. If you want to get away from MARC, say, then having a DSL that approaches meta data on the programmatic level directly is a wonderful abstraction. But yes, we have to separate API from language. An API is, mostly these days, simply a function/method call on top of an abstraction, and it processes your request with your input. A language, on the other hand, will let you deal directly with that problem. Most DSLs are functional abstractions, pre-compiled.
The line between a library and a language is perhaps more blurred these days than ever before, however there are certain things that I think justify a library DSL:

* focus on identity management
* mergeability of entities
* large distributed sets
* more defined line between data and meta data
* controlled vocabularies and structures

There are generic tools for all of these, but no platform binds them together in an easy or elegant way, and perhaps such a thing would be beneficial to the library community: a language that tries to create a bridge between computer programming and what you learn in library school.

But even if we all concede that a library DSL perhaps is not a practical solution, I'd still like to see us work on it, for nothing more than sussing out our actual needs and wants in the process. Don't underestimate the process of doing something that will never eventuate, even knowingly.

> Some people like ruby because of it's support for creating what they call
> "domain specific languages", which I think is a silly phrase, which really
> just means "a library API at the right level of abstraction for the tasks at
> hand, so you can accomplish the tasks at hand concisely and without repeated
> code."

Depends on the language. Perhaps this doesn't make sense in Ruby, but it certainly does in Scala, Haskell, and perhaps more than any, Rebol. Even Lisp and its derivatives, which can create custom structures on the fly, are well suited to creating actual languages that redefine the language's original syntax and structure. You can redefine the hell out of C to create any language imaginable, too, even when you shouldn't.

A well-defined API is not a bad thing, though, but an API is basically a set of semantic entities in a language to parse structures. However, a language redefines the syntax used by that language.
Sure you can create a word "record" in an API that mimics, say, a MARC record, but the interesting part is when you redefine the syntax to work *with* that semantic concept, like:

    external_repository { baseURI: 'http://example.com/', type: OAI-PMH }
    my_repository { baseURI: 'http://example.com/', type: RIF-CS }
    some_vocabulary { baseURI: 'http://example.com/vocab', type: thesauri }

    foreach record in external_repository [without tag 850] {
        inject into my_repository {
            with: exploded words ( tag 245 )
            when: match words in some_vocabulary ( NT > 2 )
            merge into: tag 850
        }
    }

Creating classes that deal with record merging based on identity management and various standards would be trivial to script together super-fast, because the underlying concepts are rather well known to us. Hacking this together in Java or otherwise is a test of patience and sanity, because they are generic tools, even when known library-type APIs are used.

Of course lots of stuff is assumed in the example, but these are well-understood assumptions: about merging subject headings (like multiple tags handling, LCSH lookup, etc.), about identity control, about word lookup (for example, I'm assuming some form of stemming before matching), and on and on. A language that half text manipulation and looku
Re: [CODE4LIB] PHP vs. Python [was: Re: Django]
On Sat, Oct 30, 2010 at 7:49 AM, Bradley Allen wrote:
> Mark- I would highly recommend looking at Tornado
> (http://www.tornadoweb.org) as an alternative to using Django without
> the ORM.

I'd second that one. I've used it for a couple of projects, and it seriously cut down on prerequisite clutter and is super fast.

Alex
Re: [CODE4LIB] PHP vs. Python [was: Re: Django]
Olá, como vai?

Luciano Ramalho wrote:
> Actually, Python is a general purpose programming language. It was not
> created specifically for server side scripting like PHP was. But it is
> very suitable to that task.

I'm not sure talking about what something used to be is as interesting as talking about what it is. Both Python and PHP can share whatever moniker we choose (scripting-language, programming language, real-time, half-time, bytecoded, virtual, etc.).

>> Not seen any scientific packages, but I've seen a few ray-tracers,
>> although they're all demo apps and fun toys (although I think that
>> applies to Python, too).
>
> No, that does not apply to Python. Python is widely used for hardcore
> scientific computing.

I was referring to the ray-tracing part.

> It is also the most important scripting language in large scale CGI
> settings

Yes, Python is widely used for scripting up interfaces into other more complex systems. But rarely is the core of the thing written entirely in Python.

>> Maybe your Google-fu is weak. :)
>
> Or maybe he's just realizing that outside of server side web
> scripting, PHP is just not so widely used.

Absolutely, and fair enough.

> Having used both languages, I discovered that Python is easier for
> most tasks, and one reason is that the libraries that come with Python
> are extremely robust, well tested and consistent.

Hmm. PHP is extremely robust and well-tested, but yes, it's not all that consistent, especially before version 5.2. However, things have moved on, and with release 6 around the corner things will be tighter still. Just as the first versions of Python were interesting, so were PHP's, but the biggest problem with the evolution of PHP was the very fact that it was by far the most popular language for rapid web development.

> PHP is very
> practical for server-side web scripting, but its libraries are
> unfortunately full of gotchas, traps and unexpected behaviour.
There are gotchas in every language, even Python.

> A key reason for that is the fact that Python has always had an
> exception-handling mechanism while PHP has grown something like that
> only a few years ago

True enough. But earlier versions of any language are less desirable than the latest versions, so I'm not sure this is a compelling argument for the horribleness of PHP or any language. These things evolve. PHP 5.3+ and soon 6 are looking very good, indeed, but yes, we will just have to live with a poor reputation brought on by the large number of users and the pre-5.2 era.

> So, in my opinion, PHP is great at what it does best: enabling quick
> server-side Web scripting on almost any hosting service on Earth.

I'm fairly sure you can say that because you haven't done much other kind of PHP work. :)

> For everything else, it is very worthwhile to learn and use a general
> purpose dynamic language such as Python, Ruby or Perl.

Of course. Developers should learn many languages, and choose wisely the language best suited to the problem at hand.

> Sorry for the rant. I must confess I am a founder of the Brazilian
> Python Association and was its first president, so you can call me a
> Python advocate.

No bias at all, really. :)

Kind regards,
Alex
[CODE4LIB] PHP vs. Python [was: Re: Django]
Hola, compadre,

Elliot Hallmark wrote:
> Other things beyond that seemed
> awkward, difficult, or impossible from what I knew. python immediately
> jumped out to me as a tool more suited to these tasks.

The fact that Python has a looping run-time environment is, of course, a give-away to why most people think this, and perhaps to some degree, rightly so, but PHP has got the same, it's just that *most* people use PHP through some Apache module as a request/response module. Indeed, that's where it started, and that's its forte.

> From my experience, it seemed php was a server side
> scripting language.

Strictly speaking, so is Python.

> Can you write a php script that gets key presses
> and doesn't pass them along to windows to process? I thought the OS
> would have to process the key press, pass it along to the php server
> and then php could process it. (pyhook)

A couple of obvious candidates:

- http://gtk.php.net/
- http://winbinder.org/

> Also, how would you go about using a GPU from a graphics card in php?
> (python cuda in google gives many results)

PHP is just a C program with various bindings, so I suspect in the same way Python would do it. Whether anyone has done it, though, is a different question.

> Has anyone written a scientific computing package along the lines of
> matlab in php (scipy, numpy, matplotlib)? Or a non-sequential optical
> raytracer?

Not seen any scientific packages, but I've seen a few ray-tracers, although they're all demo apps and fun toys (although I think that applies to Python, too). It's not so much about whether you can do it or not (you can), but whether it makes sense to do so (it mostly doesn't). Having said that, there's nothing stopping me making a local run-time PHP program to do either, it's just that it's PHP and hence slower than C. Python, too, is slower than C, except when it runs some C module, which, uh, is C, the same as if PHP runs some C module.
For example, among the fastest and best XML libraries and XSLT 1.0 processors out there are libxml2 and libxslt (from the GNOME project), written in C, and they are what the de facto PHP XML and XSLT modules are built on. Whatever you've got that runs in C, you can run in PHP, it's not really a big deal, it just depends on whether it makes sense to patch it up with the way you use your PHP.

> if you wanted to write a web interface for GNU cash or another well
> established accounting program, could you do it?

Sure. Here's someone who dunnit back in 2008: http://web.archiveorange.com/archive/v/LJV4vT1u2IqE3LstFA1V

> please feel free to point me to the php equivalents of pyhook, pycuda,
> scipy, numpy and some examples of widely used programs with php
> bindings.

You can bind PHP and Python the same, it's just a matter of doing it and whether it makes sense to do so. It's *not* a question of /if/ you can do it, but if you /should/ do it. Your mileage *will* vary.

>> For the sophisticated hacker, most languages can
>> be tweaked to solve almost any problem.
>
> I am sure that is true. Though, I feel for many tasks php would
> require quite a bit more tweaking than python, with much less
> community support behind it (I mean, google comes up with fewer
> helpful links to the problems I cited above).

Maybe your Google-fu is weak. :)

> My impression, based on very little experience with php, is that if
> you asked in a forum about using php for advanced scientific
> computing, or writing music generation/sequencing software,
> knowledgeable folks would first ask: "are you sure you want to do this
> in php? how about java or python?"

Again, probably because they don't realize it can be done in a non-request/response kinda way with PHP as well. But then, PHP itself isn't all that fast if you have little knowledge of how to do proper PHP, but this is a pitfall in any language.

> That said, php may be superior for generating websites from databases.
Not really, but the installations you'll find in the wild are readily configured for it, so it's easy to get going. However, this has little to do with the language itself, and more to do with the default packaging of it.

Anyway, I wasn't meaning to promote PHP over Python, just pointing out that PHP is a lot more (and more often still, a lot better) than what most people think it is.

Alex
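The point made in this message, that an interpreted language is slower than C except when it hands the work to a C module, can be tried out with a small sketch. It assumes CPython, where the built-in `sum` is implemented in C; the helper `py_sum` and the data size are illustrative choices, and the timings will differ from machine to machine.

```python
import timeit


def py_sum(values):
    """Sum with an explicit loop that the interpreter executes step by step."""
    total = 0
    for v in values:
        total += v
    return total


data = list(range(100_000))

# Both approaches give the same answer; only where the loop runs differs.
same_result = py_sum(data) == sum(data)

# Time 20 runs of each. No particular ratio is claimed here.
loop_time = timeit.timeit(lambda: py_sum(data), number=20)
builtin_time = timeit.timeit(lambda: sum(data), number=20)

print(f"same result: {same_result}")
print(f"interpreted loop: {loop_time:.4f}s, built-in sum: {builtin_time:.4f}s")
```

The same reasoning applies to a PHP script calling into an extension written in C: the scripting language does the orchestration, and the heavy lifting happens in compiled code.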
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 6:58 AM, Chris Fitzpatrick wrote:
> +1 to the "this discussion is really depressing me" camp.

Ok, ok, I get the message. This is no place to voice strong opinions about bad library tech, and neither my language (different, but not bad) nor my stance (contrarian, but not accusatory) is acceptable. I'm outta here.

Alex
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 6:53 AM, Jonathan Rochkind wrote:
> Pretty sure it wasn't depressing to the vast majority of the listserv
> audience. That was/is a discussion that benefited from a "timeout period",
> like you give the pre-schoolers.

Given we're adults, and not in pre-school, I disagree.

Alex
Re: [CODE4LIB] Django
On Wed, Oct 27, 2010 at 3:09 AM, Elliot Hallmark wrote:
> However, I switched to this other scripting
> language, python, because it could do things php can't.

Not to start a flame war, but that's a rather big statement which I think A) needs backing up, and B) is probably untrue.

> For instance,
> my first project in python involved capturing keyboard input before
> windows heard about it. Then I kept discovering amazing things python
> can do that php can't.

PHP can do this fine. Was there something in particular you're thinking of that PHP can't do?

> I helped write a non-sequential optical ray tracer in python. When it
> needed to be faster there were several libraries for writing C code
> directly in a pythonic syntax. Python has hooks into everything, like
> optical character recognition, electronic music
> sequencing/generation, serial port i/o.

Again, the same goes for PHP. For the sophisticated hacker, most languages can be tweaked to solve almost any problem. And I'm not even suggesting that you use PHP. Happy hacking.

Alex
Re: [CODE4LIB] mailing list administratativia
On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D wrote:
> Can that limit threshold be raised? If so, are there reasons why it should
> not be raised?

Is it to throttle spam or something? 50 seems rather low, and it's rather depressing to have a lively discussion throttled like that. Not to mention I thought I was simply kicked out for livening things up (especially given my reasonable follow-up was where the throttling began).

Alex
Re: [CODE4LIB] MARCXML - What is it for?
>> Political? For sure. Engineering? Not so much.
>
> Ok. Solve it. Let us know when you're done.

Wow, lamest reply so far. Surely you could muster a tad bit better? I was excited about getting a list of the hardest problems, for example; I'd love to see that. Then perhaps you could explain what this insurmountably hard, mind-boggling problem actually is, because, you know, you never actually said.

Alex
Re: [CODE4LIB] MARCXML - What is it for?
Hi,

On Tue, Oct 26, 2010 at 1:23 PM, Bill Dueber wrote:
> Sorry. That was rude, and uncalled for. I disagree that the problem is
> easily solved, even without the politics. There've been lots of attempts to
> try to come up with a sufficiently expressive toolset for dealing with
> biblio data, and we're still working on it. If you do think you've got some
> insight, I'm sure we're all ears, but try to frame it terms of the existing
> work if you can (RDA, some of the dublin core stuff, etc.) so we have a
> frame of reference.

Well, I've whined enough both here and on NGC4LIB, and I'm kinda over it, just like I'm sure most people are over my whining. But suffice it to say that FRBR is a 15 year old model that has still not been proven in the Real World[TM] in any meaningful way (the prototypes work fine until you dig a bit) and probably never will be as long as MARC21 runs the show, and trying to stick RDA on top, with rules whose use-cases are old enough to be my kids, well, I'm not very positive about that either.

The direction of going ontological is a good one, and in the absence of anything else, RDF-infused FRBR / RDA is probably the way to go (except I'd ditch RDA and, uh, perhaps even FRBR, or at least seriously modify it), but the community is decidedly not talking about ontological interoperability, extensions, or the semantics involved in solving actual problems in the bibliographic world (including the fact that it is inherently bibliographic). There needs to be much more involvement by library geeks and managers in defining semantic reuse and extensibility, to properly define those things that are almost absent from AACR2 and friends: the relationships between entities themselves. In other words, you need to get away from the record-centered view, and embrace the subject-centric view.

Anyway, enough from this old grumpy bum. Sorry to stir up the dust.

Regards,
Alex
Re: [CODE4LIB] MARCXML - What is it for?
On Tue, Oct 26, 2010 at 12:48 PM, Bill Dueber wrote: > Here, I think you're guilty of radically underestimating "lots of people > around the library world." No one thinks MARC is a good solution to > our modern problems, and no one who actually knows what MARC > is has trouble understanding MARC-XML as an XML serialization of > the same old data -- certainly not anyone capable of meaningful > contribution to work on an alternative. Slow down, Tex. "Lots of people in the library world" is not the same as developers, or even good developers, or even good XML developers, or even good XML developers who knows what the document model imposes to a data-centric approach. > The problem we're dealing with is *hard*. Mind-numbingly hard. This is no justification for not doing things better. (And I'd love to know what the hard bits are; always interesting to hear from various people as to what they think are the *real* problems of library problems, as opposed to any other problem they have) > The library world has several generations of infrastructure built > around MARC (by which I mean AACR2), and devising data > structures and standards that are a big enough improvement over > MARC to warrant replacing all that infrastructure is an engineering > and political nightmare. Political? For sure. Engineering? Not so much. This is just that whole "blinded by MARC" issue that keeps cropping up from time to time, and rightly so; it is truly a beast - at least the way we have come to know it through AACR2 and all its friends and its death-defying focus on all things bibliographic - that has paralyzed library innovation, probably to the point of making libraries almost irrelevant to the world. > I'm happy to take potshots at the RDA stuff from the sidelines, but I never > forget that I'm on the sidelines, and that the people active in the game are > among the best and brightest we have to offer, working on a problem that > invariably seems more intractable the deeper in you go. 
Well, that's a pretty scary sentence, for all sorts of reasons, but I think I shall not go there. > If you think MARC-XML is some sort of an actual problem What, because you don't agree with me the problem doesn't exist? :) > and that people > just need to be shouted at to realize that and do something about it, then, > well, I think you're just plain wrong. Fair enough, although you seem to be under the assumption that all of the stuff I'm saying is a figment of my imagination (I've been involved in several projects lambasted because managers think MARCXML is solving some imaginary problem; this is not bullshit, but pain and suffering from the battlefields of library development), that I'm not one of those developers (or one of you, although judging from this discussion it's clear that I am not), that the things I say somehow doesn't apply because you don't agree with, umm, what I'm assuming is my somewhat direct approach to stating my heretic opinions. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
On Tue, Oct 26, 2010 at 11:56 AM, Walker, David wrote: > Your criticisms of MARC-XML all seem to presume that MARC-XML is the > goal, the end point in the process. But MARC-XML is really better seen as a > utility, a middle step between binary MARC and the real goal, which is some > other "useful and interesting" XML schema. How do you create an ontological commitment in a community to an expanding and useful set of tools and vocabularies? I think I need to remind people of what MARCXML is supposed to be ; "a framework for working with MARC data in a XML environment. This framework is intended to be flexible and extensible to allow users to work with MARC data in ways specific to their needs. The framework itself includes many components such as schemas, stylesheets, and software tools." I'm not assuming MARCXML is a goal, no matter how we define that. I'm poo-pooing MARCXML for the semantics we, as a community, have been given by a process I suspect had goals very different from reality. Very few people would "work with MARC through MARCXML", they would use it to convert it, filter it, hack around it to something else entirely. And I'm afraid lots of people are missing the point of stubbing the developments in a community by embracing tools that pushes a packet that inhibits innovation. So, here's the point, in paraphrased point; "Here's our new thing. And we did it by simply converting all our MARC into MARCXML that runs on a cron job every midnight, and a bit of horrendous XSLT that's impossible to maintain." "But it looks just like the old thing using MARC and some templates?" "Ah yes, but now we're doing it in XML!" (Yeah, yeah, your mileage will vary) I'm sorry if I'm overly pessimistic about the XML goodness in the world, not for the XML itself, but the consequences of the named entities involved. 
I've been a die-hard XML wonk for far too many years, and the tools in that tool-chest doesn't automatically solve hard problems better by wrapping stuff up in angle brackets, and - dare I say it? - perhaps introduces a whole fleet of other problems rarely talked about when XML is the latest buzz-word, like using a document model on what's a traditional records model, character encodings, whitespace issues, unicode, size and efficiencies (the other part of this thread), and so on. But let me also be a bit more specific about that hard semantic problem I'm talking about; Lots of people around the library world infra-structure will think that since your data is now in XML it has taken some important step towards being inter-operable with the rest of the world, that library data now is part of the real world in *any* meaningful way, but this is simply demonstrably deceivingly not true. By having our data in XML has killed a few good projects where people have gone "A new project to convert our MARC into useful XML? Aha! LoC has already solved that problem for us." Btw, to those who find me so obnoxious, at no point do I say it was intentionally evil, just evil none the same. The road to hell is, as always, paved with good intentions. Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Ray Denenberg, Library of Congress wrote: > It really is possible to make your point without being quite so obnoxious. Obnoxious? Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] MARCXML - What is it for?
Hiya, On Tue, Oct 26, 2010 at 6:26 AM, Nate Vack wrote: > Switching to an XML format doesn't help with that at all. I'm willing to take it further and say that MARCXML was the worst thing the library world ever did. Some might argue it was a good first step, and that something was better than nothing, to which I respond ; Poppycock! MARCXML is nothing short of evil. Not only does it go against every principle of good XML anywhere (don't rely on whitespace, structure over code, namespace conventions, identity management, document control, separation of entities and properties, and on and on), it breaks the ontological commitment that a better treatment of the MARC data could bring, deterring people from actually a) using the darn thing as anything but a bare minimal crutch, and b) expanding it to be actually useful and interesting. The quicker the library world can get rid of this monstrosity, the better, although I doubt that will ever happen; it will hang around like a foul stench for as long as there is MARC in the world. A long time. A long sad time. A few extra notes; http://shelterit.blogspot.com/2008/09/marcxml-beast-of-burden.html Can you tell I'm not a fan? :) Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] OCLC Service Outage Update
Michael J. Giarlo wrote: > ... people took Simon's comment seriously? Language is a funny thing ; some times the things that are being said is taken seriously. And the script-haters are spread far and wide, so there was no reason not to take him seriously. Should the default be not to take anyone seriously? Srsly? Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] OCLC Service Outage Update
On Tue, May 11, 2010 at 06:59, stuart yeates wrote: > No, the real problem is with trolls sending flamebait. Friggin' AMEEN! Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Twitter annotations and library software
On Fri, Apr 30, 2010 at 20:29, Owen Stephens wrote: > However I'd argue that actually OpenURL 'succeeded' because it did manage to > get some level of acceptance (ignoring the question of whether it is v0.1 or > v1.0) - the cost of developing 'link resolvers' would have been much higher > if we'd been doing something different for each publisher/platform. In this > sense (I'd argue) sometimes crappy standards are better than none. Well, perhaps. I see OpenURL as the natural progression from PURL, and both have their degree of "success"; however, I'm careful using that word as I live on the outside of the library world. It may well be a success on the inside. :) > I think the point about Link Resolvers doing stuff that Apache and CGI > scripts were already doing is a good one - and I've argued before that what > we actually should do is separate some of this out (a bit like Johnathan did > with Umlaut) into an application that can answer questions about location > (what is generally called the KnowledgeBase in link resolvers) and the > applications that deal with analysing the context and the redirection Yes, splitting it into smaller chunks is always smart, especially with complex issues. For example, in the Topic Maps world, the whole standard (reference model, data model, query language, constraint language, XML exchange language, various notational languages) is wrapped up with a guide in the middle. Make them into smaller parcels, and make your flexible point there. If you pop it all into one, no one will read it and fully understand it. (And don't get me started on the WS-* set of standards on the same issues ...) > (To introduce another tangent in a tangential thread, interestingly (I > think!) I'm having a not dissimilar debate about Linked Data at the moment - > there are many who argue that it is too complex and that as long as you have > a nice RESTful interface you don't need to get bogged down in ontologies and > RDF etc. 
I'm still struggling with this one - my instinct is that it will > pay to standardise but so far I've not managed to convince even myself this > is more than wishful thinking at the moment) Ah, now this is certainly up my alley. As you might have seen, I'm a Topic Maps guy, and we have in our model a distinction between three different kinds of identities; internal, external indicators and published subject identifiers. The RDF world only had rdf:about, so when you used "www.somewhere.org", are you talking about that thing, or does that thing represent something you're talking about? Tricky stuff which has these days become a *huge* problem with Linked Data. And yes, they're trying to solve that by issuing an HTTP 303 status code as a means of declaring the identifier's imperative, which is a *lot* of resolving to do on any substantial set of data, and in my eyes a huge ugly hack. (And what if your Internet falls down? Tough.) Anyway, here's more on these identity problems ; http://www.ontopia.net/topicmaps/materials/identitycrisis.html As to the RESTful notions, they only take you as far as content-types can take you. Sure, you can glean semantics from it, but I reckon there's an impedance mismatch between just the things librarians have got down pat ; meta data vs. data. CRUD or, in this example, GPPD (get/post/put/delete), which aren't in a dichotomy btw, can only determine behavior that enables certain semantic paradigms, but cannot speak about more complex relationships or even modest models. (Very often models aren't actionable :) The funny thing is that after all these years of working with Topic Maps I find that these hard issues were solved years ago, and the rest of the world is slowly catching up to it. I blame the lame DAML+OIL background of RDF and OWL, to be honest; a model too simple to be elegantly advanced and too complex to be easily useful. 
Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Twitter annotations and library software
On Fri, Apr 30, 2010 at 18:47, Owen Stephens wrote: > Could you expand on how you think the problem that OpenURL tackles would > have been better approached with existing mechanisms? As we all know, it's pretty much a spec for a way to template incoming and outgoing URLs, defining some functionality along the way. As such, URLs with basic URI templates and rewriting have been around for a long time. Even older than that are the basics of HTTP, which has status codes and functionality to do exactly the same. We've been doing link resolving since the mid-90s, either as CGI scripts or as Apache modules, so none of this was new. URI comes in, you look it up in a database, you cross-check with other REQUEST parameters (or sessions, if you must, as well as IP addresses) and pop out a 303 (with some possible rewriting of the outgoing URL) (with the hack we needed at the time to also create dummy pages with META tags *shudder*). So the idea was to standardize on a way to do this, and it was a good idea as such. OpenURL *could* have had great potential if it actually defined something tangible, something concrete like a model of interaction or basic rules for fishing and catching tokens and the like, and as someone else mentioned, the 0.1 version was quite a good start. But by the time 1.0 came out, all the goodness had turned so generic and flexible in such a complex way that handling it turned you right off it. The standard was also written in very difficult language, and more specifically didn't use enough of the normal geeky language used by sysadmins. The more I tried to wrap my head around it, the more I felt like just going back to CGI scripts that looked stuff up in a database. It was easier to hack legacy code, which, well, defeats the purpose, no? Also, forgive me if I've forgotten important details; I've suppressed this part of my life. 
:) Kind regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
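The pre-OpenURL link resolving described in the message above (URI comes in, look it up, cross-check request parameters, answer with a 303) can be sketched roughly as follows. This is only an illustration under stated assumptions: the lookup table, the `issue` parameter and the example.org URLs are all hypothetical, and a dictionary stands in for the database.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical lookup table standing in for the database in the message.
TARGETS = {
    "/journal/1234": "http://publisher.example.org/content/1234",
}


def resolve(path, query_string="", targets=TARGETS):
    """Return (status, headers) for an incoming request path and query string."""
    target = targets.get(path)
    if target is None:
        # Nothing known about this incoming URI.
        return 404, {}
    params = parse_qs(query_string)
    # Cross-check a request parameter (hypothetical "issue") and use it
    # to rewrite the outgoing URL.
    if "issue" in params:
        target = target + "?" + urlencode({"issue": params["issue"][0]})
    # 303 See Other: send the client on to the resolved location.
    return 303, {"Location": target}
```

A CGI script or Apache module of the kind mentioned would wrap the same three steps around a real request object; nothing here is specific to OpenURL.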
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
On Fri, Apr 30, 2010 at 10:54, Eric Hellman wrote: > May I just add here that of all the things we've talked about in these > threads, perhaps the only thing that will still be in use a hundred years > from now will be Unicode. إن شاء الله May I remind you that we're still using MARC. Maybe you didn't mean in the library world ... *rimshot* Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)
On Fri, Apr 30, 2010 at 04:17, Jakob Voss wrote: > But all the flaws of XML can be traced back to SGML which is why we now use > JSON despite all of its limitations. Hmm, this is wrong on so many levels. First, SGML was pretty darn good for its *purpose*, but it was a geeks dream and pretty scary for anyone who hacked at it not fully getting it (like most normal developers). As with many things where the learning curve is steep, it fell into the "not good for normal consumption" category and they (well, people who cared, and made decisions about the web) were "forced" to make XML. But JSON? Are you sure you've got this figured out? JSON as a object serializing format is good for a number of things (small footprint, embedded type, etc.), but sucks for most information management tasks. However, I'd like to add here that I happen to love XML, even from an integration perspective, but maybe that stems from understanding all those tedious bits no one really cares about about it, like id(s) and refid(s) (and all the indexing goodness that comes from it), canonical datasets, character sets and Unicode, all that schema craziness (including Schematron and RelaxNG), XPath and XQuery (and all the sub-standards), XSLT and so on. I love it all, and not because of the generic simplicity itself (simple in the default mode of operation, I might add), but because of a) modeling advantages, b) cross-environment language and schema support, and c) ease of creation. (I don't like how easy well-formedness breaks, though. That sucks) But I mention all this for a specific reason ; MARCXML is the work of the devil! There's a certain dedication needed for "doing it right", by paying attention in XML class, and play well with your playmates. This is how you build a community and understanding around standards; the standards themselves are not enough. 
The library world did nothing of the kind ; http://shelter.nu/blog/2008/09/marcxml-beast-of-burden.html The flaws of XML can most likely be traced back to people not playing well with playmates, and not the format itself. > May brother Ted Nelson enlighten all of > us - he not only hates XML [1] and similar formats but also proposed an > alternative way to structure information even before the invention of > hierarchical file systems and operating systems [2]. Bah. For someone who don't see the SGML -> XML -> HTML transgression as an inherited and more rigid structure (or, by popular language, more schematic) as a document model as a good thing, I'm not impressed. Any implied structure can be criticized, including pretty much any corner of Xanadu as well. (I mean, seriously; taking hypermedia one step closer to a file system does *not* solve problems with the paper-based document model of HTTP, it just shifts the focus) > In his vision of Xanadu > every piece of published information had a unique ID that was reused > everytimes the publication was referenced - which would solve our problem. *Having* an identifier doesn't mean that identifier is a *good* one, nor that it solves your problem. There's plenty of systems out there where everything has an identifier (and, if you knew XML deeper, you'll find identification models as well in there, but people don't use them because the early onset of XML didn't understand nor need them). Have a look at the failed XLink brooha for something that worked and filled the niche, but people didn't get nor did tool-makers see the point of implementation, and the thing died a premature death. The current model of document structure and XQuery is somewhat of an alternative, but people are also switching to CSS3 styles as well. 
The thing is, just because you've got persistence in a system of identifiers, it does not follow that the information is persisted; the problem of change is *not* solved in neither systems, and so we work with the one we got and make the best of it. One thing I always found intriguing about librarians were their commitment to persistent URIs for information resources, and use of 303 if need be (although I see this mindset dwindling). I think you're the only ones in the entire world who gives a monkeys bottom about these issues, as the rest of the world simply use Google as a resolver. I can see where this is going. :) Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Twitter annotations and library software
Hi, On Thu, Apr 29, 2010 at 22:47, Walker, David wrote: > I would suggest it's more because, once you step outside of the > primary use case for OpenURL, you end-up bumping into *other* standards. These issues were raised all the way back when it was created, as well. I guess it's easy to be clever in hindsight. :) Here's what I wrote about it 5 years ago (http://shelter.nu/blog-159.html) ; So let's talk about 'Not invented here' first, because surely, we're all guilty of this one from time to time. For example, lately I dug into the ANSI/NISO Z39.88-2004 standard, better known as OpenURL. I was looking at it critically, I have to admit, comparing it to what I already knew about Web Services, SOA, HTTP, Google/Amazon/Flickr/Del.icio.us APIs, and various Topic Maps and semantic web technologies (I was the technical editor of Explorers Guide to the Semantic Web). I think I can sum up my experiences with OpenURL as such; why? Why has the library world invented a new way of doing things that can already be done quite well? Now, there is absolutely nothing wrong with the standard per se (except a pretty darn awful choice of name!!), so I'm not here criticising the technical merits and the work put into it. No, it's a simple 'why' that I have yet to get a decent answer to, even after talking to the OpenURL bigwigs about it. I mean, come on; convince me! I'm not unreasonable, no truly, really, I just want to be convinced that we need this over anything else. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Library Linked Data
Hiya, On Thu, Oct 29, 2009 at 16:19, stuart yeates wrote: > I'm guessing that Roy meant linked data in the sense of > http://www.w3.org/DesignIssues/LinkedData.html and http://linkeddata.org/ I'm pretty sure he did, too. I guess I was trying to smoke out his reasoning for choosing "linked data" as the only worthwhile semantic web technology. Let me clarify, and have a look at this ; http://en.wikipedia.org/wiki/Semantic_Web_Stack Linked data is the bottom four boxes out of a total of 12 (13 if you count the top one), where the ones missing is things like Trust, Proof, Logic, Querying, Ontologies and Taxonomies, all things that I thought it was evident belonged at the core of what library science is all about. It simply astounds me the lack of understanding from the library world on these things, so sad to see that these things aren't linked up; you *are* what these things are about! Sure, linked data is easier; that's why everyone is doing it, have been doing it for years. But you're missing out in fields that should be second-nature to you. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] Library Linked Data
Hiya, On Thu, Oct 29, 2009 at 15:16, Roy Tennant wrote: > Could you elaborate a bit? In my mind, the only "semantic web technology" of > any note is "linked data". What do you mean by linked data? I work in fields of semantic web technology where there's very little linked data (ie. data on the web you can link to and use), yet I feel all our work is very valuable and certainly worthy of note ... Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] HTML mark-up in MARC records
Hiya, I guess I'm the one who's got to step up to the self-slaughtering altar, but the fact that a lot of our systems break or don't know how to handle HTML is despicable. I'm sure you guys are familiar with RSS / Atom, and because in there we *expect* HTML and therefore make sure our back-ends can grok it, it enhances the meta data *greatly*. Don't think for a second that purity of the data format in any shape or form is the definition of its usefulness. Mixed content models might be complex to work with, but their value is immense. I can fully understand *why* people say "don't do it", because, yes, it ups the complexity, and perhaps with these dinosaur technologies like MARC and our ILS's breaking under the pressure of more modern technologies enforces it, I don't think we should shun it because of it. If your back-end can't grok HTML, I'd suggest you fix it immediately! If your ILS chokes on XML and / or HTML snippets, I suggest you replace it. You seriously shouldn't allow this rigidity into your infra-structure, and it's depressing to watch how we as complex users of MARC don't dare to extend it to become a format that does what it should and need to do. Even *if* HTML in MARC records probably is a bad idea. Regards, Alex -- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps --- http://shelter.nu/blog/ -- -- http://www.google.com/profiles/alexander.johannesen ---
Re: [CODE4LIB] A Book Grab by Google
On Thu, May 21, 2009 at 10:07, Karen Coyle wrote: > - without competition, Google (with the agreement of the registry, whose > purpose is to garner as much income as possible for rights holders) will > charge a price that is more than some institutions will be able to afford; > others will subscribe, but to the detriment of other resource subscriptions. How is this different from what's already in place in terms of electronic resources? This is not uniquely Google, nor has it even been proven to happen. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Thu, May 14, 2009 at 17:45, Rob Sanderson wrote: > I'll quote Mike (and most common approaches to the problem): > Don't Do That Then. > :) Oh, for sure. :) But these are very subtle things that are hard to understand, and certainly the long-term implications, so people *will* do this, and they *will* put rot into the SemWeb chains people create. It's unavoidable, but I know lots are trying to work out some kind of solution. Unfortunately, this one is being routed to software frameworks rather than the RDF core itself. Oh well. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Thu, May 14, 2009 at 17:35, Rob Sanderson wrote: > For example, the owl:sameAs predicate is used to express that the > subject and object are the same 'thing'. Then the application can infer > that if a owl:sameAs b, and a x y, then b x y. Yes, but there's a snag; as RDF works only on the URI resource level (no added semantics to the typification of the URI resource), if someone does an owl:sameAs between an identifier of a thing and a locator of a thing (a locator being the resource itself as opposed to being an identifier; for example, are you talking about Sun Corp (http://sun.com/) or are you talking about their website (http://sun.com/)?) you can get a nasty case of integrity rot, and I've not seen any proposals to address this issue (the RDF world is essentially assuming modeling from the viewpoint of everything being true). I guess Mike doesn't like RDF *nor* Topic Maps now. :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
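To make the "integrity rot" concern concrete, here is a minimal sketch of the inference rule quoted above (if a owl:sameAs b, and a x y, then b x y), applied in a single pass over plain tuples. It is not how any particular RDF toolkit implements OWL reasoning; the `ex:` names and the triples are hypothetical.

```python
SAME_AS = "owl:sameAs"


def infer_same_as(triples):
    """Single pass of the quoted rule: if a owl:sameAs b and a x y, then b x y."""
    inferred = set(triples)
    for (a, predicate, b) in triples:
        if predicate != SAME_AS:
            continue
        for (subject, x, y) in triples:
            # Copy every other statement about a over to b.
            if subject == a and x != SAME_AS:
                inferred.add((b, x, y))
    return inferred


# Hypothetical data: someone equates the company (an identifier) with the
# website URL (a locator).
triples = {
    ("ex:SunCorp", "rdf:type", "ex:Company"),
    ("ex:SunCorp", SAME_AS, "http://sun.com/"),
}
result = infer_same_as(triples)
```

After the pass, `result` also contains the statement that `http://sun.com/` is of type `ex:Company`, so the website has silently become a company. That is the kind of rot the message describes.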
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Mon, May 11, 2009 at 19:34, Jonathan Rochkind wrote: > In the real world, we use things when they solve the problem in front of us > in as easy a way as possible And somehow you're suggesting that I don't live in the real world? :) Good try, but as far as I've experienced, people in the library world live quite a distance away from the real one. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Mon, May 11, 2009 at 16:04, Rob Sanderson wrote: > * One namespace is used to define two _totally_ separate sets of > elements. There's no reason why this can't be done. As opposed to all the reasons for not doing it. :) This is crap design of a higher magnitude, and the designers should be either a) whipped in public and thrown out in shame, or b) repent and made to fix the problem. Even I would opt for the latter, but such a simple task not being done seems to suggest that perhaps the former needs to be put in place. > * One namespace defines so many elements that it's meaningless to call > it a format at all. Even though the top level tag might be the same, > the contents are so varied that you're unable to realistically process > it. Yeah, don't use MODS in general; it's a hack. It's even crazier still that many versions have the same namespace. What were they thinking?! Anyway, even if the namespace is botched, you can still (if I'll dare go by the Topic Maps moniker) have multiple namespaces for the same subject (the format in question), and simply publish and use your own and let the TM mechanics handle the ambiguity for you. If enough people do this, and perhaps even use your unofficial identifiers, maybe LOC will see the errors of their ways and repent. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Sat, May 9, 2009 at 00:32, Jonathan Rochkind wrote: > I don't understand from your description how Topic Maps solve the > "identifying multiple versions of a standard" problem. It's the mechanism of having multiple identifiers for Topics, so, in pseudo ;

  Topic "MARC21"
    psi "info:ofi/fmt:xml:xsd:MARC21"
    psi "http://loc.org/stuff/marc21";
    property #mime-type "whatever for the binary"

  Topic "MARC 1.1"
    is_a "MARC"
    psi "info:srw/schema/1/marcxml-v1.1"
    psi "http://loc.org/stuff/marcxml-v1.1";
    property #mime-type "whatever 1.1"

  Topic "MARC 1.2"
    is_a "MARC"
    psi "info:srw/schema/1/marcxml-v1.2"
    psi "http://bingo.com/psi/marcxml";
    property #mime-type "whatever 1.2"

Or, if "MARC 1.2" is backwards compatible with 1.1 ;

  Topic "MARC 1.2"
    is_a "MARC 1.1"
    psi "info:srw/schema/1/marcxml-v1.2"

Or, if I make my own unofficial version ;

  Topic "MARC 2.0"
    is_a "MARC 1.2"
    psi "http://alex.com/psi/marc-2.0";

This is enough to cobble together what is and isn't compatible in types of formats, so if your application is Topic Maps aware, this should be trivial (including what format to ignore or react to). The point is that you don't need *one* identifier for things; Topics are proxies for knowledge, and part of the notion of "knowledge" is what identifies that knowledge. Multiple PSIs help us leverage both rigid and fuzzy systems. As to the identifiers themselves (as in, the formatting), is that important? Anyway, I suspect I don't see what the problem is. Creating "the best identifier" for things seems a bit of a strange notion to me, but is this based on the idea that there is only (or rather, that you're trying to create) one identifier for any one thing? Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
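As a rough illustration of how an application might use the pseudo-notation above, the sketch below keeps topics with several PSIs each and walks the is_a chain to decide compatibility. It is not a Topic Maps engine; the `Topic` class and both helper functions are hypothetical, and only the topic names and PSI strings are taken from the message.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    name: str
    psis: set = field(default_factory=set)
    is_a: Optional[str] = None  # name of the topic this one specialises


# Topics and PSIs as given in the message (backwards-compatible variant).
TOPICS = {
    t.name: t
    for t in [
        Topic("MARC 1.1", {"info:srw/schema/1/marcxml-v1.1",
                           "http://loc.org/stuff/marcxml-v1.1"}),
        Topic("MARC 1.2", {"info:srw/schema/1/marcxml-v1.2",
                           "http://bingo.com/psi/marcxml"}, is_a="MARC 1.1"),
        Topic("MARC 2.0", {"http://alex.com/psi/marc-2.0"}, is_a="MARC 1.2"),
    ]
}


def find_by_psi(psi, topics=TOPICS):
    """Return the topic carrying this PSI, or None if no topic has it."""
    for topic in topics.values():
        if psi in topic.psis:
            return topic
    return None


def is_compatible(name, ancestor, topics=TOPICS):
    """True if `name` equals `ancestor` or reaches it by following is_a links."""
    seen = set()
    current = name
    while current is not None and current not in seen:
        if current == ancestor:
            return True
        seen.add(current)  # guard against accidental is_a cycles
        topic = topics.get(current)
        current = topic.is_a if topic else None
    return False
```

With this data, an application handed the unofficial PSI `http://alex.com/psi/marc-2.0` could find the "MARC 2.0" topic and conclude from the is_a chain that a "MARC 1.1" handler is acceptable for it.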
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
On Wed, May 6, 2009 at 18:44, Mike Taylor wrote: > Can't you just tell us? Sorry, but surely you must be tired of me banging on this gong by now? It's not that I don't want to be helpful, but I've been writing a bit on this here already and don't want to be marked as spam for Topic Maps. In the Topic Maps world our global identifiers are called PSIs, for Published Subject Indicators. There are a few subtleties within this, but they are not so different from any other identifier you'll find elsewhere (RDF, library world, etc.) except of course they are *always* URIs. Now, the thing here is that they should *always* be published somewhere, whether as part of a list or elsewhere. The next thing is that they always should resolve to something (the standard doesn't require this, but I'd say you're doing it wrong if they don't, even if that sometimes is an evil necessity). This last part is really the important bit, where any PSI will 1) act as a global identifier, and 2) resolve to a human-readable text explaining what it represents. Systems can "just use it" while at the same time people can choose the right ones for their uses. And, yes, the identifiers can be done any way you slice them. Some might think that, e.g., a PSI set for all dates is crazy as you need to produce identifiers for all dates (or times), and that would be just way too much to deal with, but again, that's not an identification problem, that's a resolver problem. If I can browse to a PSI and get the text that "this is 3rd of June, 1971, using the whatnot calendar style", then that's safe for me to use for my birthday. Let's pretend the PSI is http://iso.org/datetime/03061971. By releasing a URI template computers can work with this automatically, no frills. 
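A rough sketch of what "releasing a URI template" could look like, assuming the pretend PSI above follows a day-month-year pattern. The template string and both function names are invented for illustration; iso.org publishes no such PSI set.

```python
from datetime import date

# Assumed template behind the pretend PSI http://iso.org/datetime/03061971.
PSI_TEMPLATE = "http://iso.org/datetime/{day:02d}{month:02d}{year:04d}"


def date_psi(d: date) -> str:
    """Expand the template into the PSI for one calendar date."""
    return PSI_TEMPLATE.format(day=d.day, month=d.month, year=d.year)


def parse_date_psi(psi: str) -> date:
    """Reverse the expansion; only legitimate because the template is published."""
    prefix = PSI_TEMPLATE.split("{", 1)[0]
    if not psi.startswith(prefix):
        raise ValueError(f"not a PSI from this date set: {psi}")
    digits = psi[len(prefix):]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"malformed date PSI: {psi}")
    return date(int(digits[4:]), int(digits[2:4]), int(digits[:2]))


birthday_psi = date_psi(date(1971, 6, 3))
```

Nobody has to mint every date up front; the template covers identification, and the resolver only has to answer for the dates people actually ask about.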
Now a bit more technical; any topic (which is a Topic Map representation of any subject, where "subject" is defined as "anything you can ever hope to think of") can have more than one PSI, because I might use the PSI http://someother.org/time/date/3/6/1971 for my date. If my application only understands the former set of PSIs, I can't merge and find similar cross-semantics (which really is the core of the problem this thread has been talking about). But simply attach the second PSI to the same Topic, and you can. In fact, both parties will understand perfectly what you're talking about. More complex is that the definitions of PSI sets don't have to happen on the subject level, i.e. the Topic called "Alex" to which I tried to attach my birthday. It can be moved to a meta model level, where you say the Topic for "Time and dates" has the PSI for both organisations, and all Topics just use one or the other; we're shifting the explicitness of identification up a notch. Having multiple PSIs might seem a bit unordered, but it's based on the notion of organic growth, just like the web. People will gravitate towards using PSIs from the most trusted sources (or most accurate or most whatever), shifting identification schemes around. This is a good thing (organic growth) at the price of multiple identifiers, but if the library world started creating PSIs, I betcha humanity and the library world both could be saved in one fell swoop! (That's another gong I like to bang) I'm kinda anticipating Jonathan saying this is all so complex now. :) But it's not really; your application only has to have complexity in the small meta model you set up, *not* for every single Topic you've got in your map. And they're mergeable and shareable, and as such can be merged and "fixed" (or cleaned or sobered or made less complex) for all your various needs also. Anyway, that's the basics. Let me know if you want me to bang on. 
:) For me, the problem libraries face isn't really the mechanisms of this (because this is solvable, and I guess you just have to trust that the Topic Maps community has been doing this for the last 10 years or so already :), but how you're going to fit existing resources into FRBR and RDA; that's a separate discussion, though. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
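The merging described in the message above (two parties using different PSIs for the same date, joined once one topic carries both) can be sketched like this. It is only an illustration: the dict shape with "names" and "psis" and the `merge_topics` function are assumptions, not how any particular Topic Maps engine stores or merges things.

```python
def merge_topics(topics):
    """Merge topics that share at least one PSI.

    Each topic is a dict with a "names" set and a "psis" set (a made-up shape).
    A new topic absorbs every already-merged topic it overlaps with, so one
    bridging topic can pull two previously separate topics together.
    """
    merged = []
    for topic in topics:
        current = {"names": set(topic["names"]), "psis": set(topic["psis"])}
        remaining = []
        for other in merged:
            if current["psis"] & other["psis"]:
                current["names"] |= other["names"]
                current["psis"] |= other["psis"]
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


mine = {"names": {"Alex's birthday"},
        "psis": {"http://iso.org/datetime/03061971"}}
theirs = {"names": {"3 June 1971"},
          "psis": {"http://someother.org/time/date/3/6/1971"}}
# One topic that carries both PSIs is enough to join the two worlds.
bridge = {"names": {"1971-06-03"},
          "psis": {"http://iso.org/datetime/03061971",
                   "http://someother.org/time/date/3/6/1971"}}

separate = merge_topics([mine, theirs])
joined = merge_topics([mine, theirs, bridge])
```

Without the bridge the two topics stay apart; with it everything collapses into a single topic carrying all three names.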
Re: [CODE4LIB] Another nail in the coffin
On Mon, May 4, 2009 at 23:25, Joe Hourcle wrote: > You're forgetting the 5th Law: > The library is a growing organism. > http://en.wikipedia.org/wiki/Five_laws_of_library_science Not forgotten, I just don't believe it anymore. And, taken to its natural consequence, organisms come and go through evolution. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Another nail in the coffin
On Mon, May 4, 2009 at 22:44, Andreas Orphanides wrote: > You say that as though libraries are all about books. Libraries still have the word "biblio" as their primer, and it certainly is the written word on paper that occupies most of our time, no? Sure libraries around the world are trying to play catch-up in the digital and modern world with all sorts of things, but the primary directive is still "books" for most librarians. Not sure what you mean they're *really* into? Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
[CODE4LIB] Another nail in the coffin
Another nail in the library coffin, especially the academic ones ; http://www.youtube.com/watch?v=5TIOH80Qg7Q Organisations and people are slowly turning into data producers, not book producers. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All
With Topic Maps it's been solved years and years ago, and it's the part of it that the RDF world didn't think of until recently (and applied their kludges). I'm not going to bang my gong on this, just urge you to read up on PSIs. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
Hiya, On Thu, Apr 16, 2009 at 01:10, Jonathan Rochkind wrote: > It stands in the way of using them in the fully realized sem web vision. Ok, I'm puzzled. How? As the SemWeb vision is all about first-order logic over triplets, and the triplets are defined as URIs, if you can pop something into a URI you're good to go. So how is it that SuDoc doesn't fit into this, as you *can* chuck it in a URI? I said it was unfriendly to the Web, not impossible. > It does NOT stand in the way of using them in many useful ways that I can > and want to use them _right now_. Ah, but then go fix it. > Ways which having a URI to refer to them > are MUCH helped by. Whether it can resolve or not (YOU just made the point > that a URI doesn't actually need to resolve, right? I'm still confused by > this having it both ways -- URIs don't need to resolve, but if you're URIs > don't resolve than you're doing it wrong. Huh?) C'mon, it ain't *that* hard. :) URIs as identifiers is fine, having them resolve as well is great. What's so confusing about that? > , if you have a URI for a > SuDoc you can use it in any infrastructure set up to accept, store, and > relate URIs. Like an OpenURL rft_id, and, yeah, like RDF even. You can make > statements about a SuDoc if it has a URI, whether or not it resolves, > whether or not SuDoc itself is 'web friendly'. One step at a time. > > This is my frustration with semantic web stuff, making it harder to do > things that we _could_ do right here and now, because it violates a fantasy > of an ideal infrastructure that we may never actually have. Huh? The people who made SuDoc didn't make it web friendly, and thus the SemWeb stuff is harder to do because it lives on the web? (And chucking your meta data into HTML as MF or RDF snippets ain't that hard, it just require a minimum of knowledge) > There are business costs, as well as technical problems, to be solved to > create that ideal fantasy infrastructure. 
The business costs are _real_ No more real than the costs currently in place. The thing is that a lot of people see the traditional costs disappear with the advent of SemWeb and the new costs heavily reduced. >> Also, having a unified resolver for >> SuDoc isn't hard, can be at a fixed URL, and use a parameter for >> identifiers. You don't need to snoop the non-parameterized section of >> an URI to get the ID's ; > > Okay, Alex, why don't you set this up for us then? Why? I don't give a rat's bottom about SuDoc, don't need it, think it's poorly designed, and it gives me nothing in life. Why should I bother? (Unless I'm given money for it, then I'll start caring ... :) > And commit to providing > it persistently indefinitely? Because I don't have the resources to do that. Who's behind SuDoc, and are they serious about their creation? Those are the people you should direct your anger at instead. > And for the use cases I am confronted with, I don't _need_ it, any old URI, > even not resolvable, will do--yes, as long as I can recognize it as a SuDoc > and extract the bare SuDoc out of it. So what's the problem with just making some stuff up? If you can do your thing in a vacuum, I don't fully understand your problem with the SemWeb stuff. If you don't want it, don't use it. > Which you say I shouldn't be doing > (while others say that's a mis-reading of those docs to think I shouldn't be > doing it) No, I think this one is the subtle difference between a URL and a URI. > but avoiding doing that would raise the costs of my software > quite a bit, and make the feature infeasible in the first place. Business > costs and resources _matter_. As with anything on the Web, you work with what you've got, and if you can fix and share your fix, we all will love you for it. I seriously don't think I understand what you're getting at here; it's been this way since the Web popped into existence, and I don't really want it to be any other way. 
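To make the "fixed URL plus a parameter" idea concrete, here is a small sketch using Python's standard `urllib.parse`. The resolver address `http://example.org/resolve`, the `sudoc` parameter name and the sample SuDoc-looking string are all invented; nobody runs such a resolver as far as this thread is concerned.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical fixed resolver URL; the identifier travels as a query parameter.
RESOLVER = "http://example.org/resolve"


def sudoc_uri(sudoc: str) -> str:
    """Put a SuDoc number into a URI by percent-encoding it as a parameter."""
    return f"{RESOLVER}?{urlencode({'sudoc': sudoc})}"


def sudoc_from_uri(uri: str):
    """Get the SuDoc back out of the parameter, or None for foreign URIs."""
    parts = urlsplit(uri)
    if f"{parts.scheme}://{parts.netloc}{parts.path}" != RESOLVER:
        return None
    values = parse_qs(parts.query).get("sudoc")
    return values[0] if values else None


# A made-up string in SuDoc style, with the spaces, slash and colon that
# make the scheme awkward on the Web.
example = "Y 4.G 74/7:G 21/10"
uri = sudoc_uri(example)
```

The identifier is read from the declared parameter rather than guessed from the path, so the rest of the URI can stay opaque to the client.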
>> No it's not; if you design your system RESTfully (which, indeed, HTTP >> is) then the discovery part can be fast, cached, and using URI >> templates embedded in HTTP responses, fully flexible and fit for your >> purposes. > > These URIs are > _external_ URIs from third parties, I have no control over whether they are > designed RESTfully or not. Not sure I follow this one. There are no good or bad RESTful URIs, just URIs. REST is how your framework work with the URIs. > In the meantime, I'll continue trying to balance functionality, > maintainability, future expansion, and the programming and hardware > resources available to me, same as I always do, here in the real world when > we're building production apps, not R&D experiments My day job is to balance functionality, maintainability, future expansion, and the programming and hardware resources available to me, same as I always do, here in the real world when we're building production apps ... and I'm using Topic Maps and SemWeb technologies. Is there something I'm doing which degrades my work to an "R&D exper
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
On Wed, Apr 15, 2009 at 00:20, Jonathan Rochkind wrote: > Can you show me where this definition of a "URL" vs. a "URI" is made in any > RFC or standard-like document? From http://www.faqs.org/rfcs/rfc3986.html ; 1.1.3. URI, URL, and URN A URI can be further classified as a locator, a name, or both. The term "Uniform Resource Locator" (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location"). The term "Uniform Resource Name" (URN) has been used historically to refer to both URIs under the "urn" scheme [RFC2141], which are required to remain globally unique and persistent even when the resource ceases to exist or becomes unavailable, and to any other URI with the properties of a name. An individual scheme does not have to be classified as being just one of "name" or "locator". Instances of URIs from any given scheme may have the characteristics of names or locators or both, often depending on the persistence and care in the assignment of identifiers by the naming authority, rather than on any quality of the scheme. Future specifications and related documentation should use the general term "URI" rather than the more restrictive terms "URL" and "URN" [RFC3305]. As you can see, a URI is an identifier, and a URL is a locator (mechanism for retrieval), and since URLs are a subset of URIs, you _can_ resolve URIs as well. > Sure, we have a _sense_ of how the connotation is different, but > I don't think that sense is actually formalized anywhere. 
It is, and the same stuff is documented in WikiPedia as well ; http://en.wikipedia.org/wiki/Uniform_Resource_Identifier http://en.wikipedia.org/wiki/Uniform_Resource_Locator > I think the sem web crowd actually embraces this confusingness, No, I think they take it at face value; they (the URIs) are identifiers for things, and can be used for just that purpose, but they are also URLs, which means they resolve to something. What I think you're getting at is that "something" it resolves to, as *that* has no definition. But then, if you go from RDF to Topic Maps PSIs (PSIs are URIs with an extended meaning), *that* thing it resolves to indeed has a definition; it's the prose explaining what the identifier identifies, and this is the most important difference between RDF and Topic Maps (and a very subtle but important difference, too). > they want to have it both ways: Oh, a URI doesn't need to resolve, > it's just an opaque identifier; but you really should use http URIs > for all URIs; why? because it's important that they resolve. I smell straw-man. :) But yes, they do want both, as both is in fact a friggin' smart thing to have. We all deal with identifiers all the time, in internal as well as external applications, so why not use an identifier scheme that has the added bonus of a resolver mechanism? If you want to be stupid and lock yourself in your limited world, then using them as just identifiers is fine but perhaps a bit, well, stupid. But if you want to be smart about it, realizing that without ontological work there will *never* be proper interop, you use those identifiers and let them resolve to something. And if you're really smart, you let them resolve to either more RDF statements, or, if you're seriously Einsteinly smart, use PSIs (as in Topic Maps) :). > In general, combining two functions in one mechanism is a > dangerous and confusing thing to do in data design, in my opinion. Because ... ? 
> By analogy, it's what gets a lot of MARC/AACR2 into trouble. Hmm, and I thought it was crap design that did that, coupled with poor metadata constraints and validation channels, untyped fields, poor tooling, the lack of machine understandability, and the general library idiom of "not invented here". But correct me if I'm wrong. :) > Over in: http://www.w3.org/2001/tag/doc/URNsAndRegistries-50-2006-08-17.html Umm, I'd be wary to take as canon a draft with editorial notes going back 4 to 5 years that still aren't resolved. In other words, this document isn't relevant to the real world. Yet. > They suggest: "URI opacity 'Agents making use of URIs SHOULD NOT attempt > to infer properties of the referenced resource.'" Well, as a RESTafarian I understand this argument quite well. It's about not assuming too much from the internal structure of the URI. Again, it's an identifier, not a scheme such as an URL where structure is defined. Again, for URIs, don't assume structure because at this point it isn't an URL. > If I get a URI representing (eg) a Sudoc (or an ISSN, or an LCCN), I need to > be able to tell from the URI alone that it IS a Sudoc, AND I need to be able > to extract the actual SuDoc identifier from it. That completely violates > their > Opacity requirement I think you are quite mistaken on this, but before we leap into wheter the we
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 10:32, stuart yeates wrote: > Yes, we mint something very similar (see http://authority.nzetc.org/52969/ > for mine), but none of our interoperability partners do. None of our local > libraries, none of our local archives and only one of our local museums (by > virtue of some work we did with them). > All of them publish and most consume some form RDF. Hmm, RDF resources are just URIs, so I'm still a bit unsure about what you mean. Are you talking about the fact that the RDF definitions (and not the RDF vocabs themselves) aren't encoded in your TM engine? > Additionally many of the taxonomies we're interested in are available in RDF > but not topic maps. Converting them to a Topic Map isn't that hard to do, but I guess there is *a* cost there. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
On Wed, Apr 15, 2009 at 07:10, stuart yeates wrote: > RDF, unlike topic maps, is being used by substantial numbers of people who > we interact with in the real world and would like to interoperate with. If > we used RDF rather than topic maps internally, that interoperability would > be much, much cheaper. It's tempting to say it's free, but it's not quite, > because it does impose some constraints. But it's not that hard to create a bridge from RDF to Topic Maps and back, no? Or is your interop story different? > In my eyes, the core thing that RDF supports that topic maps don't seem to > is seamless reuse by people you don't care about. Yes, this has been brought up on several occasions, including by me at the TMRA 2008. But then, it's not so much that RDF does something that Topic Maps doesn't *support*, it's that it's packaged differently. So, where RDF has got five standard ontology levels (RDF, RDFS, OWL DL/Lite/Full) Topic Maps got one simpler one (TMDM), yet neither can express anything better or differently than the other. My theory here is that people *like* 5 layers of RDF, because it gives the false sensation of choice. But it's all ontological definitions. However, the 5 levels of RDF does indeed create a defined platform for sharing (if not cast in iron), in which in the TM world you need to include it / create it. Oh, and of course the academics seem to have embraced W3C and anything by the authority of TBL, and its effect is trickling down. > For example the people at http://lcsubjects.org have never heard of us (that > I know of), but we can use their URLs like > http://lcsubjects.org/subjects/sh90005545#concept to represent our roles. Not sure I understand your example. Here's my Topic Map identifier in a Topic Map ; http://psi.ontopedia.net/Alexander_Johannesen Identifier and locator, and resolvable, and can be used by anyone. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
On Tue, Apr 14, 2009 at 23:34, Jonathan Rochkind wrote: > The difference between URIs and URLs? I don't believe that "URL" is > something that exists any more in any standard, it's all URIs. Correct me if > I'm wrong. Sure it exists: URLs are a subset of URIs. URLs are locators as opposed to "just" identifiers (which is an important distinction, much used in SemWeb lingo), where URLs are closer to the "protocol like" things Ray describes (or so I think). > I don't entirely agree with either dogmatic side here, but I do think that > we've arrived at an > awfully confusing (for developers) environment. But what about it is confusing (apart from us having this discussion :) ? Is it that we have IDs that happen to *also* resolve? And why is that confusing? > Re-reading the various semantic web TAG position papers people keep > referencing, I actually don't entirely agree with all of their principles in > practice. Well, let me just say that there's more to SemWeb than what comes out of W3C. :) Kind regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)
Hiya, Been meaning to jump into this discussion for a while, but I've been off to an alternative universe and I can't even say it's good to be back. :) Anywhoo ... On Fri, Apr 3, 2009 at 03:48, Ray Denenberg, Library of Congress wrote: > You're right, if there were a "web:" URI scheme, the world would be a > better place. But it's not, and the world is worse off for it. I'm rather confused by this statement. The "web:" URI scheme? The Web *is* the URI scheme; they are all identifiers to resources (ftp: http: gopher: https: etc.), and together they make up, the, um, web of things. What am I missing? > Back in the old days, URIs (or URLs) were protocol based. No, which one do you mean, URIs or URLs? > The ftp scheme > was for retrieving documents via ftp. The telnet scheme was for telnet. And > so on. Again, have I missed something? This has changed, as opposed to the good old days? > A few years later the semantic web was conceived and alot of SW people began > coining all manner of http URIs that had nothing to do with the http > protocol. I've been browsing back and forth through this discussion, and couldn't find much to back this up. What do you mean by this? > Instead, they should have bit the bullet and coined a new scheme. They > didn't, and that's why we're in the mess we're in. I'm sorry, but "mess"? Did you know the messiness of the web is probably what made it successful? Not to mention that having URIs be identifiers *and* have the ability to resolve them is a bonus; they're identifiers of things (as they've always been, as I'm sure you know URI stands for Uniform Resource Identifier, right? :), as in they consist of a string of characters used to identify or name a resource on the Internet. And then, if you so choose, you can use the protocol level to *resolve* them. Not sure how anyone can consider this to be bad, though. Or is this just a misunderstanding of the difference between URIs and URLs? 
Kind regards, Alexander -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
On Thu, Apr 9, 2009 at 14:33, stuart yeates wrote: > That's not an entirely useful comparison on topic maps and RDF. If I intended to be useful I'd write something substantial, backed up with stuff other than humour. I'll give that a go the next time. :) > We currently use topic maps, alot, in our infrastructure. If we were > starting again tomorrow, I'd advocate using RDF instead, mainly because of > the much better tool support and take-up. Hmm, not a good thing at all. Could you elaborate, though, as I use it as part of infrastructure too, and wouldn't touch RDF / SemWeb without a long stick? I'm into application semantics and shared knowledge-bases. What are you guys doing where you feel the support and tools are lacking? And what are the RDF alternatives? Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Something completely different
On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson wrote: > I would encourage looking at rdf triplestores seriously, if the graph > approach is the direction that you want to go in. Or Topic Maps, which is *not* a triplestore, is closer to the OO model (basically a meta data model), and doesn't carry the stack "overflow" of RDF (RDF, RDFS, OWL 1-2-3) nor anonymous nodes. :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] points of failure (was Re: [CODE4LIB] resolution and identification )
On Fri, Apr 3, 2009 at 10:44, Mike Taylor wrote: > Going back to someone's point about living in the real > world (sorry, I forget who), the Inconvenient Truth is that 90% of > programs and 99% of users, on seeing an http: URL, will try to treat > it as a link. They don't know any better. What on earth is this about? URIs *are* links; it's in their design, it's what they're supposed to be. Don't design systems where they are treated any differently. Again we're seeing the "all we need are URIs" poor judgement of SemWeb enthusiasts muddying the waters. The short of it is, if you're using URIs as identifiers, having the choice to dereference them is a *feature*; if one resolves to 404 then tough (and I'd say you designed your system poorly), but if it resolves to an information snippet about the semantic meaning of that URI, then yay. This is how we Topic Mappers see this whole debacle and flaw in the SemWeb structure, and we call it Published Subject Indicators, where "Published" means it resolves to something (just like WikiPedia URIs resolve to some text that explains what it is representing), "Subjects" are anything in the world (but distinct from Topics which are software representations), and "Indicators" as they indicate (rather than absolutely identify) things. In other words, if you use URIs as identifiers (which is a *good* thing), then resolvability is a feature to be promoted, not something to be shunned. If you can't make good systems design, use URNs. You can treat URI identifiers as both identifiers and subject indicators, while URNs are evil. > Let's make our identifiers look like identifiers. What does that even mean? :) > (By the way, note that this is NOT what I was saying back at the start > of the thread. This means that I have -- *gasp* -- changed my mind! > Is this a first on the Internet? :-) Maybe, but it surely will be the last ... Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
>> One question we haven't asked is if we really need a MIME type for >> MARCXML. :) On Thu, Feb 12, 2009 at 23:28, Jonathan Rochkind wrote: > PPS: Yes, it has been asked, and it's pretty obvious to me that we do. I wasn't asking for technical reasons; I was more having a stab at how many people use and need MARCXML specifically as compared to a number of other more used formats. I mean, seriously, you can use MARCXML embedded in Atom and get the best of both worlds instead. Don't worry about it; it's not a serious _enough_ question. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
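As a rough illustration of "MARCXML embedded in Atom", the sketch below wraps a tiny MARCXML record in an Atom entry's content element using only the standard library. The two namespace URIs are the published Atom and MARCXML ones; the entry id, the timestamps, the single 245 field and the function name are placeholders, and the result is neither a complete Atom entry nor a complete MARC record.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
MARC = "http://www.loc.gov/MARC21/slim"

# Readable prefixes in the serialised output instead of ns0/ns1.
ET.register_namespace("atom", ATOM)
ET.register_namespace("marc", MARC)


def atom_entry_with_marc(entry_id, title, updated, marc_title):
    """Build an Atom entry whose content carries one small MARCXML record."""
    entry = ET.Element(f"{{{ATOM}}}entry")
    ET.SubElement(entry, f"{{{ATOM}}}id").text = entry_id
    ET.SubElement(entry, f"{{{ATOM}}}title").text = title
    ET.SubElement(entry, f"{{{ATOM}}}updated").text = updated
    # An XML media type on content signals that the child is foreign XML.
    content = ET.SubElement(entry, f"{{{ATOM}}}content", {"type": "application/xml"})
    record = ET.SubElement(content, f"{{{MARC}}}record")
    field = ET.SubElement(record, f"{{{MARC}}}datafield",
                          {"tag": "245", "ind1": "0", "ind2": "0"})
    ET.SubElement(field, f"{{{MARC}}}subfield", {"code": "a"}).text = marc_title
    return entry


entry = atom_entry_with_marc("urn:example:record:1", "A dummy title",
                             "2009-02-12T23:28:00Z", "A dummy title")
xml_text = ET.tostring(entry, encoding="unicode")
```

Feed readers get the Atom wrapper they already understand, and anything that knows the MARCXML namespace can dig the record out of the content element.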
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
On Thu, Feb 12, 2009 at 22:32, Jonathan Rochkind wrote: > Didn't we finish having this conversation last week? We talked about all > this stuff being brought up now last week. We did indeed, and your summary is better than what my retort could have been; spot on. I guess it's hard to understand why text/xml is such a waste of MIME and time as long as we still got text/html as the original understood MIME for HTML pages, but luckily the internet has moved on and evolved. :) One question we haven't asked is if we really need a MIME type for MARCXML. :) Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?
On Thu, Feb 12, 2009 at 21:43, Rebecca S Guenther wrote: > Patrick is right that an XML schema such as MODS or MARCXML would be text/xml. I would strongly advise against text/xml, as it is an oxymoron (text is not XML, and XML is not text even if it is delivered through a text protocol), and more and more people are switching away from the generic text types (which make little sense for structured data). Hence, a more correct MIME type for MARCXML would be application/marc+xml, although until it is registered it should be application/x-marc+xml. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
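A small sketch of why a specific type with a +xml suffix is more useful to a client than text/xml: the suffix still says "this is XML", while the subtype says which XML. The functions are illustrative only, and application/marc+xml and application/x-marc+xml are the names proposed in this message, not types assumed to be registered.

```python
def parse_media_type(value):
    """Split 'type/subtype+suffix; params' into (type, subtype, suffix).

    Parameters such as charset are ignored; suffix is None when there is no '+'.
    """
    essence = value.split(";", 1)[0].strip().lower()
    main, _, sub = essence.partition("/")
    base, plus, suffix = sub.rpartition("+")
    if not plus:
        return main, sub, None
    return main, base, suffix


def is_xml(value):
    """True for the generic XML types and for anything with a +xml suffix."""
    main, sub, suffix = parse_media_type(value)
    return suffix == "xml" or (main in ("text", "application") and sub == "xml")


def is_marc_xml(value):
    """True only for the (proposed, hypothetical) MARCXML types named above."""
    main, sub, suffix = parse_media_type(value)
    return main == "application" and suffix == "xml" and sub in ("marc", "x-marc")


generic = "text/xml"
specific = "application/x-marc+xml; charset=utf-8"
```

Both values pass `is_xml`, but only the specific one lets a client decide to hand the payload to a MARC-aware handler without opening it first.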
Re: [CODE4LIB] MARC 21 and MODS
Hi there, On Thu, Jan 29, 2009 at 15:55, Rebecca S Guenther wrote: > Yes, better late than never (we're a small office and stretched thin). You're not *that* small, no? :) > Also we want to explore MARC/RDF. We also have to keep in mind > that MARC is also used by non-AACR2 users (and when RDA is > implemented non-RDA users). Shouldn't the library world slowly work towards a common set of rules, backed by technology, to make it easier for us all to move forward with less pain? > As a starting point in exploring semantic web types > of technologies we are establishing a registry for controlled values > used in various standards-- MARC, MODS, PREMIS. See the text at: > http://id.loc.gov Ah, I like! This is very close to the concept in Topic Maps of Published Subject Indicators. Could the identifiers within have a certain degree of persistence and resolvability? If so, both the SemWeb and TM communities could use this out of the box. I also think the DC RDA working-group has something similar. Karen? And should you work together? > In the meantime we have a prototype at: > http://www.loc.gov:8081/standards/registry/lists.html Can't make much work there. Must be in alpha. :) But I like this direction. If you now can get the vendors on-board, or better, make more SemWeb systems yourselves, then you've taken a *huge* step forward. I'm *very* excited to see this coming from LoC. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] MARC 21 and MODS
On Wed, Jan 28, 2009 at 20:29, Rebecca S Guenther wrote: > It is interesting though that a study of different metadata > formats at Los Alamos National Labs a few years ago > concluded that MARCXML was the richest and most robust. > http://www.dlib.org/dlib/september06/goldsmith/09goldsmith.html Umm, I just have to add that none of those compared would make it to my top 10 list of good formats, so, er, comparing library formats against each other is a bit like comparing all the wonderful juicy fruit in the world where your selection is limited to what can grow in Alaska. It still amazes me that RDF and / or DC hidden in SRDF or Topic Maps haven't gotten any traction when they seriously match what you want. > We are also working on modeling MODS as RDF-- some > work has already been done on this. That is good news, albeit a little late and certainly a little slow. But I hear good things about Talis moving into this arena, and hopefully they can pull a few other vendors with them. I guess the first thing that is needed is a basic MARC / RDF vocabulary we can all participate in and extend, and then cross-pollinate vocabularies as we move away from AACR2 to more RDA / FRBR friendly stuff (although, me personally, I would jump way ahead of RDA, but that's not going to happen). > In terms of MARC, we are planning for its evolution and streamlining to > get rid of some of its problems and plan for a future where the transition > to new cataloging rules will work well with existing records and cataloging > infrastructure. Are you talking about RDA here? And when will these changes happen, in what form, how do you build momentum and expertise, etc.? > Whatever the format of the future is, the transition will need > to be evolutionary because of the billions of records that are > out there and the need to satisfy a lot of the user tasks > required of library (and other) metadata. 
I agree fully, although I'd stress the poor infrastructure as a reason more than the records available (they can always be converted into something else, but you can't easily change how systems require MARC21). > It is also worth noting that despite some calls for a MARC > replacement, we have a number of national libraries > throughout the world that are abandoning their national > formats and just now adopting MARC 21. They also need > to be considered in this transition. I find it a bit scary it's taken this long, but I certainly welcome the change as it makes it easier to move from one format to the other once we all agree on a fundamental platform. But I still don't think a clear direction forward is set. Any docos you can point to about the future direction of LoC approved meta data exchange? Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] marc21 and usmarc
On Tue, Jan 27, 2009 at 18:56, Kyle Banerjee wrote: > There are arguments to do so, but the business case is not strong. Well, I'd say the future of the library world is a good business case, and I know several people (high and low) fully aware of it, but I think it's hard to take any step in either direction that would be deemed worth it. Tough one, indeed. > That data providers won't send MODS until libraries demand it. > Libraries won't demand it until their systems use it. Systems won't > use it until libraries demand it because that's what their data > providers require. Well, I've been yelling for vendors to get more involved for a long time, but there's a lot of blankness coming from them. I guess they're happy with the current tie to MARC (binding the libraries to them forever) until the business is gone ... > It's a vicious circle, so we're stuck with MARC. The only people who > aren't happy with this arrangement are those who are trying to create > something new. Many librarians who think they use MARC every day > have no idea that it is a binary format that is unfriendly to eyes and > machines. MARC may be MAchine Readable, but not MAchine Understandable or even MAchine Usable. I had an idea some time ago to create a dummy / fake MARC record with much more to it (like extensions and special tags systems can react to, such as validation) and pass it around the infrastructure to see what in it survives (the golden rule is to ignore what you don't understand, although I know a few MARC systems that filter out what they don't understand (!!!) because, well, these systems were mostly built back when a megabyte of storage and / or memory had a price of about a cataloger or two. Friggin' crazies!). Anyone in? :) Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
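The "pass a dummy record around and see what survives" idea could be sketched roughly like this. It is only an illustration under loud assumptions: a record is modelled as a plain Python dict of tag to value (real MARC has repeatable fields, indicators and subfields), the tag "9X1" and its value are made up, and `filtering_system` is a hypothetical stand-in for a system that drops what it does not understand.

```python
def surviving_tags(sent, received):
    """Compare the record we sent with the one that came back.

    Returns two sorted lists: tags that came back unchanged,
    and tags that were dropped or altered along the way.
    """
    kept = sorted(t for t in sent if received.get(t) == sent[t])
    lost = sorted(t for t in sent if received.get(t) != sent[t])
    return kept, lost


def filtering_system(record, known_tags):
    """Hypothetical system that filters out every tag it does not know."""
    return {tag: value for tag, value in record.items() if tag in known_tags}


# Dummy record: two ordinary fields plus one made-up extension tag.
dummy = {"100": "Test, Author", "245": "Test title", "9X1": "validate:strict"}

echoed = filtering_system(dummy, known_tags={"100", "245"})
kept, lost = surviving_tags(dummy, echoed)
print("kept:", kept)
print("lost:", lost)
```

A system following the golden rule (ignore, but pass along, what you don't understand) would leave `lost` empty; the filtering kind would show the extension tag there.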
Re: [CODE4LIB] marc21 and usmarc
On Tue, Jan 27, 2009 at 18:06, Jonathan Rochkind wrote: > Because their customers are not demanding it, and they > often don't have the technical expertise to understand > why it matters anyway. But mainly because > their customers are not demanding it. So, um, could librarians everywhere start being just a tad bit more demanding about this stuff? You know, before your profession becomes obsoleted from this planet? Actually, I was wondering what areas MODS can't handle which MARC does, hijack and / or change MODS to fit it (what I know of it seems a bit limiting, but through XML certainly extensible). Shouldn't folks start by demanding at least MODS (or XOBIS if we're *really* crazy :)? Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] marc21 and usmarc (fwd)
On Tue, Jan 27, 2009 at 17:09, Ardie Bausenbach wrote: > Since that time, many other national libraries have moved from > their national formats to MARC 21, including (among others), > the UK, Germany, Finland, and Spain. I know a few more, but another point worth, er, screaming about, is the various AACR2 / RDA / other rule changes that aren't linked to MARC at all. I know a lot of it is covered in MARC documentation, but there are hidden gems, like punctuation, symbols, character encodings, etc. which aren't always specified. If the library world embraced XML as a minimum a lot could be fixed in that area (and no, XMLMARC does not qualify :). Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] marc21 and usmarc
On Tue, Jan 27, 2009 at 17:04, Eric Lease Morgan wrote: > Can somebody say "MARCXML or MODS complete with a schema"? Well, we can say it, and I think we *have* said it for a very long time, but it doesn't seem to change anything. Damn those words. > Such solutions offer at least syntactic validation if not also > semantic validation. Oh well. I would say a little bit more than "oh well" (but I don't really have to; you know how I feel :), but I would love to hear what the vendors are thinking about this all. They seem to be very, very quiet about it all (without speculating as to why ...) regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] PHP Frameworks
On Mon, Oct 27, 2008 at 21:50, Susan Teague Rector <[EMAIL PROTECTED]> wrote: > We're exploring Zend as a framework for php based Web applications. I'm > curious to see if anyone out there is using this framework (or another MVC > framework). Also, I wondering how many full-time developers you have on > staff programming. Back when I was in the library world I used Zend Framework, albeit not as MVC (I needed a more RESTful paradigm), but the components themselves are fantastic, and I hear and see good things about the MVC as well. You can't fail with it as it brings easy OO to the PHP world. As to staff, I guess three or four would be up to ZF scratch at that time (about 1 year ago). Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] PHP5 Help
On Tue, Jul 1, 2008 at 13:42, Nicole Engard <[EMAIL PROTECTED]> wrote: > I am missing something right in front of my eyes. I'm rusty on my > PHP, I'm wondering if someone can help me with this error: > > Warning: gmmktime() expects parameter 3 to be long, string given in > /public_html/magpierss-0.72/rss_utils.inc on line 35 Well, it's a bit puzzling in the sense that the parameters should all be ints, but hey. :) Try casting the values: gmmktime( (int) $hours, (int) $minutes, (int) $seconds, (int) $month, (int) $day, (int) $year ); or try the same with intval(). Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions
On Thu, May 22, 2008 at 5:06 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote: > I feel self-conscious about seeing posts reflected in the "planet" that > are not related to library technology, only because I'm not willing to > break up my blog into sub-blogs and don't know if oysters and pace > layering really go together for the "planet." Ouch, I suspect a conversation next about what fits the code4lib planet moniker. Do my technology rants that don't bash MARC fit? Does Topic Maps fit, even if libraries don't use them but they are a perfect fit? Posts about philosophical aspects of the code we make? Or the epistemological musings of workflows? Let's not forget that the human aspect of the library profession is what makes librarians so great ... It's a tough one. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
On Mon, Mar 31, 2008 at 2:45 AM, D Chudnov <[EMAIL PROTECTED]> wrote: > ...at the risk of upsetting *everybody*... It's a bit depressing that once we get an interesting discussion going on this list which normally has such low volume, and which is *definitely* on-topic, someone comes along and tries to kill it because it doesn't fit *their* ideal of what the topics should be. Allow me to vent for a few seconds; Sorry, but OSS is *all* about code and often about business models, and rest assured Karen and all the rest of us *definitely* are defining the "enterprise" in question as the library world, so this is *all* about code for libraries. We aren't writing code in the posts, but we certainly are talking about code. Nitpicking about such *tiny* semantic differences is just one of those things which drive me up the wall! Of *course* this topic has a place on this list, and of *course* we're not going to create Yet Another MailingListForSomethingJustBecauseWeAreBloodyLibraries, and of *course* we should talk about these things, and *especially* here where coders talk about code. Code is more than syntax. But I guess this thread is dead now, and so is at least *my* ideal of what this list is, so take care. Grumpy, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
On Sun, Mar 30, 2008 at 9:40 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote: > For those of us in the field pushing for new approaches, the Gartner report > does represent positive change. It's not that OSS isn't successful. It's > that some of us would really like it to be much more successful... Fair enough. I certainly understand the significance for OSS passionadas in organisations under MBA and committee rule, it's just infuriating that these things have to be spelled out in childish ways (which the litmus test really is all about) by conservatives for "approved benefits to the enterprise." This is partly why I left the library world, mind you, so if that report can fix up some of the glaring things that made my experience there so painful (a constant struggle of spelling things out), I might think of coming back. :) Hehehe. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
On Sun, Mar 30, 2008 at 7:51 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote: > Sorry, Alexander, I disagree. What, is that allowed!? :) > Gartner may sound creaky but under the starchy > language, this is pretty revolutionary advice. I can't agree with the "revolutionary advice" part; business leaders, firms, advisers and abusers have been saying this already for years. That Gartner now is on the field saying it too shows nothing except how conservative they are; this is an old message, and certainly not aimed at people who are doing the actual work in their organisations. I've been in the "enterprise" for most of my life as a high-flying consultant (except my non-enterprise last few years in the library world), and currently work as manager, developer and advisor to the largest enterprise organisations around. We've always recommended and / or used OSS, integrated the very ideal into the fabric of enterprise software development. The only people that Gartner now is playing to are the business people, who will be surprised to learn that their organisations already use (and many fully embrace) OSS, and have done so for years. (How they'll cope with that news is another story, and maybe Gartner is their coming safety blanket) Even big guys who think that only the Oracle business stack is good enough for them will be surprised to find the odd OSS project supporting their infrastructure. OSS is already successful, and it's already working great even if the MBAs don't know it. And because Gartner now is playing to those people, that's why the porridge litmus test works so great; in reality, nothing will change, which for many is the perfect advice. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Gartner on OSS
Let's try the litmus test for enterprisey business bullshit : porridge ; "Recommendations for Users * Look for a sustainable community that has a critical mass of skills supporting porridge. * Look for a cultural match between the porridge community and your internal developers and user culture as it enhances communication and perceived user satisfaction. * Prepare an SOA that can integrate IT services from many sources, including porridge. * Avoid porridge that is not built on open standards. * Make a conscious risk-based decision about whether you will depend on internal resources or external services for your porridge implementations." In short, another template piece where [insert your favourite thing here] is wrapped around generic advice. Do they say anything that's specific to what open-source is all about? Alex (without reading the darn article...) -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] theinfo.org: for people who work with big data sets
On Jan 16, 2008 7:08 AM, Aaron Swartz <[EMAIL PROTECTED]> wrote: > http://theinfo.org/ Excellent initiative! Joined, and I'll forward the information around to other communities I know do this type of work. Regards, Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] [Fwd: [NGC4LIB] A Thought Experiment]
Hiya, On Nov 9, 2007 7:42 AM, Carl Grant <[EMAIL PROTECTED]> wrote: > I'm seeking some help understanding here. From my perspective > (again, that of a long time vendor of "commercial software" having > recently moved to "commercial service for OSS software") this is > exactly what a number of us (LibLime, Evergreen, Index Data, CARE > Affiliates) are *trying* to do. We're not only providing the > services to allow libraries to adopt open source, we're also doing > the marketing and selling that libraries seem to require before > they'll even consider the option. I think this is extremely important for the library world right now, far more important than any current standard, model or prototyping exercise ; support the vendors going Open Source. Don't think about it for too long ; we must grab this opportunity *at all cost*, because, frankly, it's the only chance we've got to set ourselves straight again. The only way to get away from the suppressed and locked-down legacy-driven world we currently live in is to embrace openness, especially when it's coming from vendors (who's by that very token asking us to work *with* them this time instead of just buying their stuff). There's a slight clause here, though, for the vendors ; you *must* adopt web services for *every* part of your solutions. I know that this often goes against the grain of a "proposed system" (a system that holistically solves a problem space) but the truth of the matter is that you will never make your system work spot on for everyone, and we need the reassurance (even if we never use the option) of going in a different direction or using someone else's solution for a particular problem. By allowing a more open development model the library world will love you and gladly give you money for support and further development. Consider the openness even a token more than a reality option. 
Here's a quick list of things I see crucially happening ; * The library world has to come together to create a common language for these web services, an ontology if you will. We must decide on a few good (and possibly already existing) protocols and dictionaries. * Vendors must settle on a development model for web services (and I'd humbly suggest a REST model) and not be afraid of opening up or segmenting their holistic solutions into sharable / interchangeable parts. * Get some outside experts in to handle usability and interaction design, and open source the result. Create a consortium or interest-group for library systems usability and user experience. * Make sure we've got a *clean* cut of technology between business logic and the user interface. Enforce low-key semantically-rich XHTML and use CSS everywhere. Here's to dreaming. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] mylibrary web services
On 8/7/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote: > In summary, RESTful Web Services using ROA (Resource Oriented > Architecture) appears to be a "purist" approach to using the Web. Purist? No, it's not the purist way, but the right way. You can use a hammer to put in a screw, but are you going to call me a purist if I suggest you use a screwdriver? "Purist" as a word has a lot of negative connotations. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] executing a cgi script in the middle of a url
On 7/31/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote: > What am I doing wrong? How do I need to configure Apache accordingly? We use a bunch of URL rewrite rules to solve this issue. We have a host of backend technologies like Perl, PHP, CGI, Java, but all the URL are equally clean. We set up one set of rules per service point, where a 'point' is defined in a rather technological fashion in addition to the semantic value, so soa.nla.gov.au/users is redirected to ws.nla.gov.au/services/usermanagement/index.cgi, including all sub-paths from this point. Alex -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
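The prefix-to-backend mapping described above can be illustrated with a small sketch. This is Python rather than actual Apache rewrite configuration, and the rule table and the `rewrite` function are hypothetical; only the two example URLs are taken from the message.

```python
# Hypothetical rule table: clean public prefix -> backend script.
# Only this single mapping comes from the message; the structure is assumed.
RULES = [
    ("soa.nla.gov.au/users", "ws.nla.gov.au/services/usermanagement/index.cgi"),
]


def rewrite(url):
    """Map a clean public URL to its backend URL, keeping any sub-path."""
    for prefix, backend in RULES:
        # Match the service point itself or anything below it,
        # but not a longer name that merely starts with the same letters.
        if url == prefix or url.startswith(prefix + "/"):
            return backend + url[len(prefix):]
    return url  # no rule matched: leave the URL untouched


print(rewrite("soa.nla.gov.au/users/42/loans"))
```

One rule per service point keeps the public URLs stable while the backend technology behind each point (Perl, PHP, CGI, Java) can change freely.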
Re: [CODE4LIB] Open Source OPAC - VUFind Beta Released
On 7/20/07, Andrew Nagy <[EMAIL PROTECTED]> wrote: http://www.vufind.org/ Excellent stuff, and thanks for the open-source effort. Three things ; 1. Will there be efforts towards a development community outside your library? 2. http://www.vufind.org/demo/Record/56179 has serious problems in its "similar items" section. :) 3. If you scroll down a list of things and then do something that requires a login, only the top part of the page that's not in view has the action. The user sees nothing, and nothing happens. Apart from that, great stuff and, if you accept such, I'd love to participate in ways that I can. Kind regards, Alexander -- --- Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] "good" web service api
On 6/30/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote: > What are the characteristics of a "good" Web Service API? That you refrain from the notion of an API. :) Seriously, before you do anything, read the book "RESTful Web Services" by Leonard Richardson and Sam Ruby (http://www.oreilly.com/catalog/9780596529260/). I'd do it the ROA way (resource oriented architecture; and have for some time), but I do understand it puts a certain strain on the areas of the brain responsible for learning conceptually new things. Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Position Available: Manager of Data Systems
On 5/18/07, Patty De Anda <[EMAIL PROTECTED]> wrote: MANAGER OF DATA SYSTEMS ... and not a word (that I could find) on where in the world - or where in the assumed USA - this position is held. :) Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] PHP Symfony
On 3/24/07, Michael J. Giarlo <[EMAIL PROTECTED]> wrote: > Hmm? What's that you say? Just a sec, but in the meantime, why not sit down and have some of this delicious Kool-Aid over here? It's Ruby Red-flavored; I think you'll like it. Come, now; those who meddle in things PHP know that a lot of the goodness you get from Ruby you'll these days find in PHP as well. Things have progressed quite a bit in the last 5 years, and PHP 5.2 is quite mature and offers an OO model on par with Ruby, without the hassle of being a fringe technology. :) As to Symfony, yes, it's pretty good and complements (or answers) the RoR thing well. I personally don't use it as I'm more of an XSLT, SOA, REST freak (and Symfony is slightly tricky to push into that box, especially given the non-MVC direction of the SOA we're building). Now that RoR 1.2 has better support for REST I think Symfony may follow, but I don't like the default templating language (PHP with "specials") nor the non-MVC paradigm. Having said that, I haven't used it for a few versions and things may have improved. Check it out. Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Using OpenID in libraries
On 3/23/07, Jeremy Frumkin <[EMAIL PROTECTED]> wrote: > While OpenID has potential within certain contexts, I have difficulty seeing it being quickly adopted by libraries, universities, or other entities that need to relate real identities to an OpenID. OpenID doesn't do trust; it explicitly says it is not a trust system. For libraries to adopt OpenID, they need to somehow link OpenID to a trust system. It isn't clear to me that there is enough added value to libraries at this point to adopt OpenID. Of course, I'd be glad to buy someone a beer if they provide a use case to convince me otherwise ;-) I can only offer you a beer of agreement; OpenID is fantastic for geeks who can control their online environment, but hopeless for normal people. The only trust given in the system is based on the trust of the ID source, and in many cases that's just as hard to come by in new shapes as it has been in the past. For *me* OpenID is fantastic, but for my wife it means nothing. I suspect most of our patrons are in the latter category, but hey, we're going to implement OpenID cross-system soon so at least we're trying. :) Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Videos?
On 3/6/07, Noel Peden <[EMAIL PROTECTED]> wrote: I'm finally back in the office today and the videos are in process... I'm not sure where they'll go, but they'll be up somewhere. BTW, if anybody has any ideas for royalty free title music (a short 3+ second thing), I'm open. I'll whip up something if needed. In my dark past I was a musician, and I've got stuff lying around waiting for the opportune moment to be donated. What are you looking for? Alex -- --- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/
Re: [CODE4LIB] Getting data from Voyager into XML?
On 1/18/07, Doran, Michael D <[EMAIL PROTECTED]> wrote: So you may find that there is a well-founded reluctance among Voyager systems people to get too carried away with the DBA 101 stuff. ;-) We're routing around the problem by creating a webservice that is Voyager specific and let other apps and services use this one. That means that if you have to do DBA stuff, you do it in one spot. It's not the ultimate solution, but it solves a great deal of legacy and flexibility problems. Alex -- Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps -- http://shelter.nu/blog/ ---
Re: [CODE4LIB] OpenFRBR
Hi, You may be interested in OpenFRBR: http://www.openfrbr.org/ Its aim is to build a full, free implementation of FRBR, showing everything it can do, and looking for problems along the way. Everyone's welcome to get involved in whatever way they wish. I can't get to that site (is it down?), but a few words on what you're trying to do (is it a technical approach, model approach, philosophical approach?), and how you want to do it would be great. Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] OpenURL XML generation libraries?
On 10/18/06, Ross Singer <[EMAIL PROTECTED]> wrote: I respond with an SVN repository for a ruby OpenURL library (that doesn't currently have any documentation). Not sure what's completely out of context about this. Because you didn't say "a ruby OpenURL library"? :) I had no idea what I was looking at. An SVN to some code doesn't mean everyone groks that code by listing generic directories. A few lines of what it is and what it can do would be fantastic. Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] OpenURL XML generation libraries?
On 10/18/06, Ross Singer <[EMAIL PROTECTED]> wrote: See also: http://www.textualize.com/trac/browser/ropenurl Why? What are we looking at? Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Open Source Seel?
On 9/13/06, Smith,Devon <[EMAIL PROTECTED]> wrote: So, I'm wondering how many people think they might actually work on it. Well, if I were you I'd spend some time writing up what the advantages are. The current docos are too wishywashy to tell anything more than that you've got a good idea you've tried out. Right now you state that it *is* better than XSLT, but I know both XSLT and semantic data modelling extremely well and need some convincing. Also, what languages and technologies are used? What would be the proposed license? Does it solve real issues or is it a nice to have? Real interest can be gained by showing us real benefits using real technology solving real issues. If not, then it was an interesting research project. :) Kind regards, Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] native xml databases and/or XQuery?
On 8/17/06, Kevin S. Clarke <[EMAIL PROTECTED]> wrote: I'm curious in finding out how many libraries out there are using or experimenting with native xml databases. Asked here, answered here. :) We've got a few eXist's lying about serving mostly experimental stuff, although one is semi-experimental / quasi-production quality. We're also drooling over the latest release of DB2 with native XML support (I think it's good enough to count as native :), but that's just the next step, I think. I'm also interested in learning of libraries who are using XQuery as a primary development language. Primary? Over my dead body. :) Although do a search for "XqueryP" (notice the 'P') for something that just might solve some of the bigger issues I can think of with the proposition. Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] Photo galleries and accessibility
On 7/13/06, Amy M Ostrom <[EMAIL PROTECTED]> wrote: > Or does anyone know about photo galleries and accessibility? There is a bigger group of people who can both see images and have accessibility needs; low-vision users (estimated some 30% of all users). Having said that, there's really nothing stopping you making tables perfectly accessible, and in the case of a gallery the images *are* presented in a tabular fashion. This is where we use common sense instead of rigid rules, so there is no reason to feel that using tables for this is somehow wrong (unless you want to go into the whole WAI 2.0 debate :). Do it the way you do, and clean up the generated code to fix the worst offenders. If you still want to be strict on it, try talking to the GAWDS community (http://www.gawds.org/) about gallery options. I seem to recall there was some discussion about this a while back, but the gist was that most gallery software was equally crap in accessibility regards. Maybe things have changed. regards, Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
Hiya, On 6/7/06, Ross Singer <[EMAIL PROTECTED]> wrote: > That by trotting out their Endeca powered catalog, they've finally gotten the tangible that we nerds have been unable to get institutional support for. Now every librarian in the country wants clustering and faceted search. Sorry, I'm in the wrong country. :) In fact, as much as that event triggered people's hearts and minds, it never shook the foundation of the OPAC in this place. > But this time last year, I defy you to tell me that you could have trotted out a project like that to anybody outside the systems office (that wasn't already labelled a 'systems apologist'). Possibly not. Hmm. No, not with the OPAC, but other systems. I think libraries have put too much faith in vendors who create crappy systems, and continue to do so. If vendors want libraries to buy their stuff, they need to make sure they've got good stuff; it's getting easier and easier to do these things ourselves. Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
Hi, On 6/7/06, Jonathan Rochkind <[EMAIL PROTECTED]> wrote: > My impression is that there are LOTS of catalogers interested in discussing this topic---the future of The Catalog. As much as I would love to disagree with you, I don't. :) My stance on this is not to let hackers create applications as they see fit, dear Dog, no! I'm a die-hard user-centred design and usability guy; my life is dedicated to developing solutions fit for the user, whether that be patrons, catalogers, super-users or otherwise. I'm more talking about the politics of *actually* doing something; I find it easy to talk about innovation with my colleagues, but hard to do in practice, although we're setting up a "labs" area these days in an attempt to break free of the tyranny of PRINCE2 and top-down hierarchies. But hey, I realise this is probably beside the point; if we have fruitful discussions, maybe someone can do something with it. > Some coders seem to assume that the cataloging community doesn't realize the need for change, or doesn't understand the possibilities of the online catalog. I think this is more and more NOT the case. Catalogers too realize that things are broken, change is the topic of discussion. Actually, I've found the reverse to be true; catalogers overly aware of things being broken, but having hackers that either can't see the problem or are too busy to do so. My feeling about this all is that we're too busy maintaining the MARC Legacy to create a shining new one which may or may not solve the problem. Of course, the problem with MARC is the culture not the technology, so in order to change the culture we need a *whopping* effort put in by *all* libraries around the world. Not very likely, but it would be fantastic if we could. > But such common vision is desperately needed. I'd say such common vision is desperately needed on the management level! What drives the libraries if not management? 
Sure, footsoldiers and captains can push the envelope, but only so far before it becomes political, huge, convoluted, a project with a steering committee, and so forth. For me the strategy is to create prototypes to demonstrate what we're on about, and in my case I do that *with* catalogers, reference librarians and other friends around the library / library world. The idea here is to unite the bottom soldiers in such a way that the top management can see the light and resource and process accordingly. > So we desperately need more forums for discussion involving both catalogers and developers, focused on this topic. No, we desperately need everyone to join the same forums! Not more forums, but fewer! Less is more. We don't need yet another committee; we need one stronger one. But hey, I'm dreaming. > As Eric writes, an important topic for discussion is: "To what degree should traditional cataloging practices be used in such a thing, or to what degree should new and upcoming practices such as FRBR be exploited?" The danger here is that automated processes add a quality check to our processes, and a lot of people don't like that, especially top management, because it points out mistakes made in the past. Technically we don't have many problems, we can do pretty much anything we'd like to do if we really wanted to, but it's all about internal politics and shuffling of resources which decides whether it should be done or not. If *management* doesn't understand what hackers and catalogers and reference librarians are talking about, we're stuffed! Anyway, I don't think we disagree on this, only the part about needing yet another mailing-list. Regards, Alex -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __
Re: [CODE4LIB] next generation opac mailing list
Hi, You can thank NCSU for bringing the catalogers, reference types, administrators, vendors, etc. to the table. Hmm, how so? I've been at the table with many of them for many years already and know them quite well. :) Are you referring to something specific? Regards, Alexander -- "Ultimately, all things are known because you want to believe you know." - Frank Herbert __ http://shelter.nu/ __