Re: [CODE4LIB] Protagonists

2015-04-13 Thread Alexander Johannesen
Hmm. So, I'm a big fan of Wikipedia and would still go that way even
if the data can be haphazard. Wikipedia has a lot of classics with a
section called "Lead characters" (Pride and Prejudice included) where
the focus is the novel first, which should be easy to call and then
trim with some simple text parsing to get basic characterizations,
like gender, possibly age, place and purpose to the story (main
protagonist, antagonist, support character, etc.)

I'd start with a page like "Le Monde's 100 Books of the Century"
and give each of them a visit, scraping for "main characters" or
"characters" headings, and devise a small set of parsing rules to grab
the top ones and their properties. Sounds like a fun day or so.
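A minimal sketch of the "scrape for characters headings" idea, using only the Python standard library. It assumes the page is already downloaded as HTML and that characters appear as list items under an h2/h3 heading whose text contains "characters"; real Wikipedia markup varies, and the class name, function name and sample markup are made up for illustration.

```python
from html.parser import HTMLParser


class CharacterSectionParser(HTMLParser):
    """Collect <li> text found under headings that mention 'characters'."""

    HEADINGS = {"h2", "h3"}

    def __init__(self):
        super().__init__()
        self.in_heading = False   # currently inside an <h2>/<h3>
        self.in_section = False   # the last heading seen mentioned "characters"
        self.in_item = False      # currently inside an <li> of such a section
        self.heading_text = []
        self.item_text = []
        self.characters = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADINGS:
            self.in_heading = True
            self.heading_text = []
        elif tag == "li" and self.in_section:
            self.in_item = True
            self.item_text = []

    def handle_endtag(self, tag):
        if tag in self.HEADINGS and self.in_heading:
            self.in_heading = False
            title = "".join(self.heading_text).lower()
            self.in_section = "characters" in title
        elif tag == "li" and self.in_item:
            self.in_item = False
            # Collapse runs of whitespace left over from the markup.
            self.characters.append(" ".join("".join(self.item_text).split()))

    def handle_data(self, data):
        if self.in_heading:
            self.heading_text.append(data)
        elif self.in_item:
            self.item_text.append(data)


def extract_characters(html):
    """Return the list-item texts found under any 'characters' heading."""
    parser = CharacterSectionParser()
    parser.feed(html)
    parser.close()
    return parser.characters


# Hypothetical sample markup, not copied from any real page.
SAMPLE = """
<h2>Plot</h2><ul><li>Not a character</li></ul>
<h2>Main characters</h2>
<ul>
  <li><b>Elizabeth Bennet</b>, the second of the Bennet daughters</li>
  <li><b>Mr. Darcy</b>, a wealthy gentleman</li>
</ul>
<h2>Themes</h2><ul><li>Marriage</li></ul>
"""

print(extract_characters(SAMPLE))
```

Gender, age and role would still need a second pass of text rules over each returned string; this only gets you the candidate lines.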



On Tue, Apr 14, 2015 at 3:35 PM, McAulay, Elizabeth
> Cool set of questions! Here's a funny "cheat" -- what about querying Amazon 
> or the like for a list of "Cliff's Notes" and call the subjects of the 
> Cliff's Notes "the Canon"? That could serve as the canon list. Another idea 
> would be to consult a reference work, but I can't think of a good source 
> offhand. One example that's not perfect is the "Dictionary of Literary 
> Biography." The Canon is created by what is included in the reference work.
> As for finding lead character names, that's something I don't have an 
> immediate answer for.
> Good luck!
> Best,
> Lisa
> -
> Elizabeth "Lisa" McAulay
> Librarian for Digital Collection Development
> UCLA Digital Library Program
> email: emcaulay [at]
> From: Code for Libraries  on behalf of 
> davesgonechina 
> Sent: Monday, April 13, 2015 7:12 PM
> Subject: [CODE4LIB] Protagonists
> So I have this idea I'd like to do for a hobby project, but it requires
> finding a table that lists a classic novel, a link to an
> instance of that work (first listed, one with most downloads, whichever),
> the lead female character, and the lead male character (can be null). E.g.
> Pride and Prejudice,, Elizabeth
> Bennet, Mr. Darcy. Even leaving the Gutenberg part for another day, this
> has been really difficult to find.
> I've had no success with Dbpedia/Wikidata since there's no real
> standardized format for novels, characters often are associated more
> strongly with films or video games than original works (Cheshire Cat), and
> when characters are listed they are neither prioritized nor link to a
> record that clearly states gender. And then there's how to select some sort
> of "Western Canon" list. ISBNs are nowhere to be found, nor any other
> identifier that might help to corral a fair chunk of results.
> I looked at OCLC, but WorldCat Works is still an experiment and frankly
> looks like too much work to query for too little return even if it had good
> coverage. Amazon? Librarything? Goodreads? No luck yet.
> I raise this partly because a) I would like to make some toys with that
> list, and b) I feel this is a good test case for "what developers might
> want" from library data, linked or otherwise. It is the sort of request
> that includes many unspoken assumptions (that there is a canon, and it is
> well-defined) that app users, product managers, and developers typically
> want even if it is woefully incomplete or imperfect, so long as it matches
> expectations. While I appreciate what it takes to make such a list, I feel
> like this really ought to be a solved problem in the library space. Not "in
> the process of being solved, hopefully, by new emerging standards" solved,
> but like "we solved this ages ago, here ya go" solved.
> I'm posting this basically in the hopes that someone will say "No, doofus,
> there's an easy way to do this, you just aren't very good at this - look:"
> and show me where I'm wrong.
> D

 Project Wrangler, SOA, Info Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen

Robert Sanderson  wrote:
> c) I've never used a Topic Maps application. (and see (a))

How do you know?

> There /are/ challenges with RDF [...]
> But for the vast majority of cases, the problems are solved (JSON-LD) or no
> one cares any more (httpRange14).

What are you trying to say here? That httpRange14 somehow solves some
issue, and we no longer need to worry about it?

>> Having said that, there's tuples of many kinds, it's only that the
>> triplet is the most used under the W3C banner. Many are using to a
>> more expressive quad, a few crazies , for example, even though that
> ad hominem? really? Your argument ceased to be valid right about here.

I think you're a touch sensitive, mate. "Crazies" as in, few and
knowledgeable (most RDF users these days don't know what tuples are,
and how they fit into the representation of data) but not mainstream.
I'm one of those crazies. It was meant in jest.

>> may or may not be a better way of dealing with it. In the end, it all
>> comes down to some variation over frames theory (or bundles); a
>> serialisation of key/value pairs with some ontological denotation for
>> what the semantics of that might be.
> Except that RDF follows the web architecture through the use of URIs for
> everything. That is not to be under-estimated in terms of scalability and
> long term usage.

So does Topic Maps. Not sure I get your point? This is just semantics
of the key dominator in tuple serialisation, there's nothing
revolutionary about that, it's just an ontological commitment used by
systems. URIs don't give you some magic advantage; they're still a
string of characters as far as representation is concerned, and I dare
say, this points out the flaw in httpRange14 right there; in order to
know representation you need to resolve the identifier, ie. there's a
movable dynamic part to what in most cases needs to be static. Not
saying I have the answer, mind you, but there are some fundamental
problems with knowledge representation in RDF that a lot of people
don't "care about" which I do feel people of a library bent should
care about.

>> But wait, there's more! [big snip]
> Your point? You don't like an ontology? #DDTT

My point was the very first words in the following paragraph;

>> Complexity.

And of course I like ontologies. I've bandied them around these parts
for the last 10 years or so, and I'm very happy with RDA/FRBR
directions of late, taking at least RDF/Linked Data seriously. I'm
thus not convinced you understood what I wrote, and if nothing else,
my bad. I'll try again.

> That's no more a problem of RDF than any other system.

Yes, it is. RDF is promoted as a solution to a big problem of findable
and shareable meta data, however until you understand and use the full
RDF cake, you're scratching the surface and doing things sloppily (and
I'd argue, badly). The whole idea of strict ontologies is rigor,
consistency and better means of normalising the meta data so we all
can use it to represent the same things we're talking about. But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks. Currently it's all balanced on Wikipedia and
DBpedia, which isn't a bad thing all in itself, but neither of those
two are static nor authoritative in the same way, say, a global
library organisation might be. With RDF, people are slowly being
trained to accept all manner of crap meta data, and we as librarians
should not be so eager to accept that. We can say what we like about
the current library tools and models (and, of course, we do; they're
not perfect), but there's a whole missing chunk of what makes RDF
'work' that is, well, sub-par for *knowledge representation*. And
that's our game, no?

The shorter version; the RDF cake with its myriad of layers and
standards is too complex for most people to get right, so Linked Data
comes along and tries to be simpler by making the long goal harder to

I'm not, however, *against* RDF. But I am for pointing out that RDF is
neither easy to work with, nor ideal for any long-term goals we might
have in knowledge representation. RDF could have been made a lot
better which has better solutions upstream, but most of this RDF talk
is stuck in 1.0 territory, suffering the sins of former versions.

>> And then there's that tedious distinction between a web resource and
>> something that represents the thing "in reality" that RDF skipped (and
>> hacked a 304 "solution" to). It's all a bit messy.
> That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
> solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any
goal you have for RDF (and the rest of the stack) it was a mistake to
not get it right *first*. OWL has better means of dealing with it, but
then, complexity, yadda, yadda.

> And it's not messy, it's very clean.

Subjective, of course. H

Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Ross Singer  wrote:
> This is definitely where RDF outclasses almost every alternative*, because
> each serialization (besides RDF/XML) works extremely well for specific
> purposes [...]

Hmm. That depends on what you mean by "alternative to RDF
serialisation". I can think of a few, amongst them obviously (for me)
is Topic Maps which don't go down the evil triplet way with conversion
back and forth to an underlying data model.

Having said that, there's tuples of many kinds, it's only that the
triplet is the most used under the W3C banner. Many are using to a
more expressive quad, a few crazies , for example, even though that
may or may not be a better way of dealing with it. In the end, it all
comes down to some variation over frames theory (or bundles); a
serialisation of key/value pairs with some ontological denotation for
what the semantics of that might be.
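To make the "frames" point a little more concrete, here is a rough sketch of how a pile of triples (or quads) regroups into per-subject key/value bundles. The identifiers and values are made-up sample data, and `to_frames` / `to_quads` are illustrative names, not part of any RDF or Topic Maps library.

```python
from collections import defaultdict

# Hypothetical statements: (subject, predicate, value).
triples = [
    ("ex:book1", "dc:title", "Pride and Prejudice"),
    ("ex:book1", "dc:creator", "ex:austen"),
    ("ex:austen", "foaf:name", "Jane Austen"),
]


def to_quads(statements, graph):
    """Add a fourth 'context' slot to each triple, naming the graph it came from."""
    return [(s, p, o, graph) for s, p, o in statements]


def to_frames(statements):
    """Group triples or quads into one key/value bundle per subject.

    Anything after the third slot (such as a graph name) is ignored here.
    """
    frames = defaultdict(lambda: defaultdict(list))
    for subject, predicate, value, *_context in statements:
        frames[subject][predicate].append(value)
    return {subject: dict(slots) for subject, slots in frames.items()}


frames = to_frames(triples)
print(frames["ex:book1"])
```

Whether you start from triples or quads, the bundle per subject comes out the same; the tuple width only changes how much context travels with each statement.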

It's hard to express what we perceive as knowledge in any notational
form. The models and languages we propose are far inferior to what is
needed for a world as complex as it is. But as you quoted George Box,
some models are more useful than others.

My personal experience is that I've got a hatred for RDF and triplets
for many of the same reasons Eric touches on, and as many know, I prefer
the more direct meta model of Topic Maps. However, these two different
serialisation and meta model frameworks are - lo and behold! -
compatible; there's canonical lossless conversion between the two. So
the argument at this point comes down to personal taste for what makes
more sense to you.

As to more on problems of RDF, read this excellent (but slightly dated)
Bray article;

But wait, there's more! We haven't touched upon the next layer of the
cake; OWL, which is, more or less, an ontology for dealing with all
things knowledge and web. And it kinda puzzles me that it is not more
often mentioned (or used) in the systems we make. A lot of OWL was
tailored towards being a better language for expressing knowledge
(which in itself comes from DAML and OIL ontologies), and then there's
RDFS, and OWL in various formats, and then ...

Complexity. The problem, as far as I see it, is that there's not
enough expression and rigor for the things we want to talk about in
RDF, but we don't want to complicate things with OWL or RDFS either.
And then there's that tedious distinction between a web resource and
something that represents the thing "in reality" that RDF skipped (and
hacked a 304 "solution" to). It's all a bit messy.

> * Unless you're writing a parser, then having a kajillion serializations
> seriously sucks.

Some of us do. And yes, it sucks. I wonder about non-political
solutions ever being possible again ...



Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Alexander Johannesen

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle  wrote:
> Eric, I really don't see how RDF or linked data is any more difficult to
> grasp than a database design

Well, there's at least one thing which makes people tilt; the flexible
structures for semantics (or, ontologies) where things aren't as
solid as in a data model. A framework where there are endless options
(on the surface of it) for relationships between things is daunting to
people who come from a world where the options are cast in iron.
There's also a shift away from things' identities being tied down in a
model somewhere into a world where identities are a bit more, hmm,
flexible? And less rigid? That can make some people cringe, as well.

> A master chef understands the chemistry of his famous dessert - the rest of
> us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)


Re: [CODE4LIB] Seeking examples of outstanding discovery layers

2012-09-19 Thread Alexander Johannesen
I love the Trove from the National Library of Australia ;


Re: [CODE4LIB] Project Management Software Question

2012-02-23 Thread Alexander Johannesen

> --What project management software are you using?

Semantic MediaWiki, xSiteable

> --What made you choose the system?

Most project management software is written by geeks, not for humans. They
all propose some methodology to go with their model, but either their model
is inflexible (and crashing with yours), or it is so flexible that any tool
might do the trick. Also, they are notoriously hard to configure on a
cumulative scale of the people involved. Also, people hate putting in their
data, so most software, even if they might just do the trick, fails for
human reasons.

So, a simple wiki with some added ontology cruft, and xSiteable delivering
semantics and widgets across all people is enough. Simple todo's beat
complex task management every time.

> --Has the system met all of your needs? If not, where does it fail?

It only fails when we need an average to higher degree of data, again, a human
problem. Oh, and it sometimes fails because the MediaWiki GUI sucks for
non-geeks. I think Confluence is better and overall pretty good.

> --Overall opinions?

I could write you a sonnet or two, but I have very little trust in
software helping much in project management (after having tried them all
over a span of 20 years). A joint platform for documentation (and for
heaven's sake, choose a Wiki that has a usable interface!)

In fact, you'd be *far* better off getting "Making Things Happen" by Scott
Berkun, the best book I ever got. Honest, I'm not affiliated. :)

> --What systems did you evaluate and decide not to recommend?

Hmm, I think I've tried too many. I'm sure there's software out there that
doesn't suck (ie. I hear good things about a few here and there), but far
too often do I see this usability paired with human engagement problem crop
up and ruin the best of software packages.

> Any information would be great!

Sorry to be so glum. I'm more happy with simpler approaches such as
"project on a page" (ie. one Wiki page with short description, people,
contacts, goals, and progress) and more agile ways of dealing with
requirements and development (reduces the need for approved paper, easier
to roll back bad decisions, etc.). The closest I get to a Gantt chart is
that one of our vendors insists on sending me one every now and then,
even though he has to come into the office and explain it to people every
single time.

In other words; use software to document and drive forward, never use
software to measure progress and estimates.


Alex (disgruntled ex-believer in project management software)

Re: [CODE4LIB] Open datasets

2012-01-12 Thread Alexander Johannesen

Thanks for all the pointers; just what I wanted, and gives me plenty of
ways to test the generic meta data handling. Great!


On Jan 12, 2012 3:19 AM, "Simon Spero"  wrote:

> You can get anything you want
> At Brewster Kahle's restaurant.
> Simon
> On Wed, Jan 11, 2012 at 10:55 AM, LeVan,Ralph  wrote:
> > has 10,000 marc
> > records in it.  They are part of the old SiteSearch system that OCLC
> > released as open source.  They date back to 2002 and will not contain
> > any Unicode, if you were hoping to include that as part of your testing.
> >
> > Ralph
> >
> > -Original Message-
> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
> > Alexander Johannesen
> > Sent: Wednesday, January 11, 2012 5:36 AM
> > Subject: Open datasets
> >
> > Hiya,
> >
> > I'm in the middle of creating a meta data management system (including
> > merging and persistent identifier management) for a somewhat different
> > domain (intranets and business integration), but it's based on Topic Maps
> > and so is well suited to other means of meta data handling / mangling.
> > It's also going to be open-source, and it might be well-suited to library
> > tasks as well.
> >
> > So in order to test the integrity and performance of my system so far I'm
> > wondering if there's a suitable open dataset of bibliographic records that
> > aren't too obscure (meaning, I can find the titles at amazon or Open
> > Library) that you could recommend? More than 1000 records, but less than a
> > million, maybe?
> >
> > Regards,
> >
> > Alex
> >

[CODE4LIB] Open datasets

2012-01-11 Thread Alexander Johannesen

I'm in the middle of creating a meta data management system (including
merging and persistent identifier management) for a somewhat different
domain (intranets and business integration), but it's based on Topic Maps
and so is well suited to other means of meta data handling / mangling. It's
also going to be open-source, and it might be well-suited to library tasks
as well.

So in order to test the integrity and performance of my system so far I'm
wondering if there's a suitable open dataset of bibliographic records that
aren't too obscure (meaning, I can find the titles at amazon or Open
Library) that you could recommend? More than 1000 records, but less than a
million, maybe?



Re: [CODE4LIB] Linux Laptop

2011-12-14 Thread Alexander Johannesen
MJ Ray  wrote:

> I humbly suggest that long futz times are only necessary these days
> when most of the following combine:


>  1. unsupported/hard-to-support hardware (maybe bought for compatibility
> with another even-fussier operating system?);

Yes, this is the big offender, however I've never met an Ubuntu first
install that didn't work well on the first try. It's only when you start
tweaking stuff it seems it falls down a little.

>  2. control-freakery ("it must work/look exactly THIS way RIGHT NOW
> without me doing much");

Yes, hackers tweak, it's in their nature. They also know the consequences
of hacking and tweaking, so I'm not sure this is a bad thing per se. I
personally went Linux *because* I like tweaking and then fixing my messes
(my blog is full of angry anecdotes and stories about just this, some
sillier than others), and there is one difference between (at least) the
Windows world and the Linux world; fixing a broken Linux is tons easier
than fixing a broken Windows, so even if we do talk about stuff getting
broken the fixes are not even comparable.

>  3. not good at asking for technical help online or being patient with
> LUGs;

Hardly ever used this.

>  4. not willing to find and/or pay local experts;

I pay myself all the time.

>  5. not willing to search/read the copious fine manuals or debug logs.

The amount of fragmented and irrelevant information out there is inversely
proportional to the time you thought it would take to fix your problem.

>  I guess newcomers still have to get used to
> basics like having 5 or more useful mouse buttons instead of 1...

With the (reasonably) few mishaps I've had while updating and installing
Ubuntu versions, I'm still a happy hacker that never regretted the move,
even if the journey has been bumpy at times. However, a word of warning
about Ubuntu is that it is moving in a direction that, to me, is completely
wrong, so I'm switching to Mint (with that Gnome 3 layer that makes it
Gnome 2 compatible). Unity is a travesty, and the people who hate it the
most are ... the tweakers and hackers. Just sayin'



Re: [CODE4LIB] Namespace management, was Models of MARC in RDF

2011-12-12 Thread Alexander Johannesen
"Richard Wallis"  wrote:
> Collection of triples?

Yes, no baggage there ... :) Some of us are doing this completely without a
single triplet, so I'm not sure it is accurate or even politically correct.

> A classic example of only being able to describe/understand the future in
> the terms of your past experience.

Yes, exactly. Although, having said that, I'm excited that the library
world is finally taking the semantic challenge seriously. It's taken quite
a number of years, but slowly there's a few drips and drabs happening.
Here's to hoping that there's a sluice somewhere about to open fully, and
maybe the RDA vehicle has proper wheels? (Didn't the last time I checked,
but that's admittedly a couple of years back. I hear they at least got new



Re: [CODE4LIB] Namespace management, was Models of MARC in RDF

2011-12-12 Thread Alexander Johannesen
"Richard Wallis"  wrote:
> You are not the only one who is looking for a better term for what is
> being created - maybe we should hold a competition to come up with one.

A "named graph" gets thrown around a lot, and even though this is
technically correct, it's neither nice nor sexy.

In my past a "bucket" was much used, as you can easily throw things in or
take them out (as opposed to the more terminal record being set), however
people have a problem with the conceptual size of said bucket, which more
or less summarizes why this term is so hard to pin down.

I have, however, seen some revert to the old RDBMS world of "rows", as they
talk about properties on the same line, just thinking the line to be more
flexible than what it used to be, but we'll see if it sticks around.
Personally I think the problem is that people *like* the idea of a closed
little silo that is perfectly contained, no matter if it is technically
true or not, and therefore futile. This is also why, I think, it's been so
hard to explain to more traditional developers the amazing advantages you
get through true semantic modelling; people find it hard to let go of a
pattern that has helped them so in the past.

Breaking the meta data out of the wonderful constraints of a MARC record?
FRBR/RDA will never fly, at least not until they all realize that the
constraints are real and that they truly and utterly constrain not just the
meta data but the future field of librarying ... :)



Re: [CODE4LIB] Models of MARC in RDF

2011-12-06 Thread Alexander Johannesen
On Wed, Dec 7, 2011 at 1:49 PM, stuart yeates  wrote:
> As much as I have nothing against anyone on this list, isn't it a little
> US-centric? Didn't we make that mistake before?

I wouldn't worry. A dream-team has no basis in reality, hence the
"dream" part. I'd like to see a Real Team instead, an international
collaboration of people, including international smarts and
non-librarians. (Realistically, an international [or semi] library
conference should have a three-day session with smart people first on
this very issue, and that would make a fine place to get this thing
working, even to some degree of speed)


Re: [CODE4LIB] Namespace management, was Models of MARC in RDF

2011-12-06 Thread Alexander Johannesen

Karen Coyle  wrote:
> I wonder how easy it will be to
> manage a metadata scheme that has cherry-picked from existing ones, so
> something like:
> dc:title
> bibo:chapter
> foaf:depiction

Yes, you're right in pointing out this as a problem. And my answer is;
it's complicated. My previous "rant" on this list was about data
models*, and dangnabbit if this isn't related as well.

What your example is doing is pointing out a new model based on bits
of other models. This works fine, for the most part, when the concepts
are simple; simple to understand, simple to extend. Often you'll find
that what used to be unclear has grown clear over time (as more and
more have used FOAF, you'll find some things are more used and better
understood, while other parts of it fade into 'we don't really use
that anymore')
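For readers who haven't met the prefixed names in Karen's example, a small sketch of what "cherry-picking" means mechanically: each prefix stands for a namespace owned by somebody else, so one record ends up depending on several outside parties. The namespace URIs are the published ones for Dublin Core elements, BIBO and FOAF as far as I know; the record and the helper names are made up for illustration.

```python
# Prefix-to-namespace table for the three vocabularies in the example.
PREFIXES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "bibo": "http://purl.org/ontology/bibo/",
    "foaf": "http://xmlns.com/foaf/0.1/",
}


def expand(name):
    """Turn a prefixed name such as 'dc:title' into a full URI."""
    prefix, _, local = name.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix!r}")
    return PREFIXES[prefix] + local


def vocabularies_used(record):
    """List the external vocabularies a record depends on."""
    return sorted({key.partition(":")[0] for key in record})


# Hypothetical record mixing the three vocabularies.
record = {
    "dc:title": "Pride and Prejudice",
    "bibo:chapter": "1",
    "foaf:depiction": "ex:cover-image",
}

for key in record:
    print(key, "->", expand(key))
print(vocabularies_used(record))
```

Three properties, three owners; that dependency list is what you are signing up to track when you mix models.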

But when things get complicated, it *can* render your model unusable.
Mixed data models can be good, but can also lead directly to meta data
hell. For example ;


Ouch. Although not a biggie, I see this kind of discrepancy all the
time, so the argument against mixed models is of course that the power
of definition lies with you rather than some third-party that might
change their mind (albeit rare) or have similar terms that differ
(more often).

I personally would say that the library world should define RDA as you
need it to be, and worry less about reuse at this stage unless you
know for sure that the external models do bibliographic meta data


When we're done talking about ontologies and vocabularies, we need to
talk about identifiers, and there I would swing the other way and let
reuse govern, because it is when you reuse an identifier you start
thinking about what that identifier means to *both* parties. Or, put
differently ;

It's remarkably easier to get this right if the identifier is a
number, rather than some word. And for that reason I'd say reuse
identifiers (subject proxies) as they are easier to get right and
bring a lot of benefits, but not ontologies (model proxies) as they
can be very difficult to get right and don't necessarily give you what
you want.
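A small sketch of why shared identifiers pay off even when the models differ: two sources with their own property names can still be brought together, as long as they agree on what identifies the thing. The identifiers, source names and properties below are invented sample data, and `merge_by_identifier` is an illustrative helper, not an existing API.

```python
from collections import defaultdict


def merge_by_identifier(*sources):
    """Merge records from several sources that share identifiers.

    Each source is a (name, records) pair; property keys are prefixed with
    the source name so the differing models stay visible instead of being
    silently flattened into one.
    """
    merged = defaultdict(dict)
    for source_name, records in sources:
        for record in records:
            for key, value in record["properties"].items():
                merged[record["id"]][f"{source_name}:{key}"] = value
    return dict(merged)


# Hypothetical data: same identifier, different vocabularies.
library = ("library", [
    {"id": "id:1001", "properties": {"title": "Pride and Prejudice"}},
])
shop = ("shop", [
    {"id": "id:1001", "properties": {"name": "Pride & Prejudice (paperback)"}},
    {"id": "id:2002", "properties": {"name": "Something else"}},
])

merged = merge_by_identifier(library, shop)
print(merged["id:1001"])
```

Note that nothing here tries to decide whether "title" and "name" mean the same thing; that is the ontology problem, and the identifier lets you postpone it.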

Just my .2 AUD.




Re: [CODE4LIB] Professional development advice?

2011-11-29 Thread Alexander Johannesen
Kyle Banerjee wrote:
> Starting with data modeling is like trying to learn a new spoken language
> by focusing on grammar [...]

Hmm. It seems that a lot of people are, shall we say, somewhat
misguided as to what data modelling is, even mighty Wikipedia, which makes
it into a formal process of sorts, and I can see it repeated ad
nauseam wherever you go, giving us the idea that it is all about the
schema of columns and the nuts and bolts of tables and relations in a
RDBMS. That's confusing data modelling tools or processes with the
generic open-ended category of data modelling.

Data modelling is simply the act of exploring data-oriented
structures. Over time I've learned that everything we do, every little
problem you battle with in your everyday life, revolves around some
data structure, the names of such, and their internal and external
relationships. The simplest web form has a model, simple and complex
applications do as well, enterprise systems, library systems, formats,
databases, documents, spreadsheets, this conversation, your bicycle,
your morning routine, *everything*.

There is, in my strong opinion, a horrible conflation of the concept
of modelling data and implementations pinning down data types; it's an
evil so strong it blinds us, cripples us, and I feel like screaming out
in terrifying agony the horrors within! The wrongly applied indices!
The labels on columns! The semantic binding of one sub-structure to
another! The optimising tricks used! The stored procedures! The
conceptual semantics of labels in n-ary graphs!!

The wretched *name* of a single field and how it quietly eats up any
disambiguating notion we put in place, through the many well-meaning but
afflicting layers of abstraction and implementation, it drives me
insane! "Name"!? What does that mean in the context of an email
address? What does "comment" mean when it reaches my ORB? What were
they thinking when the model designed resulted in SQL statements 1K

There's so much information written on the topic of data modelling,
and most of it ignores that very thing that it should embrace and focus
heavily on; good semantic design. (Granted, it has become far more
focused on in the last 10 years, and I'm extremely happy for that) Put
some heavy thought into your tables, because what you perceive as a
simple table of users becomes an overwhelming problem when you add
special users to the system. Have any of you ever created an ILS with
a table "book" in it? (C'mon, raise your hand, I know you have!) Yeah,
that's the sort of evil I'm talking about! Libraries don't deal with
"books", they deal with bibliographic meta data of objects, and
sometimes those objects are called a "book" which has certain
constraints and properties that link to special meta data that isn't
static. Version 1.0 of any system is famously rubbish because of the
learning process of getting all this stuff wrong. Version 2.0 is
famous for being overly abstracted and incomprehensible. Version 3.0
is getting there, but you're bogged down in the middleware,
translating between good but incompatible models. By the time you get
to version 4.0 you realize that the underlying concepts which drove
versions 1 through 3 are flawed, and you need to work in terms of FRBR
sub-graphs instead of MARC records. Version 5.0 is so re-written and
re-conceptualized, you decide to call it something else, version 1.0.
And we repeat the cycle. If your software isn't like this, consider
yourself lucky (or at worst, self-deluded :).
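A rough sketch of the "book" table point, under my own assumptions about what a less brittle shape might look like: the kind of object becomes data on a generic description rather than the name of a table or class, so a new kind of object doesn't force a schema change. The class, function and field names and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Description:
    """Bibliographic meta data about some object; 'book' is just one kind."""
    identifier: str
    kind: str                                  # "book", "map", "serial", ...
    properties: dict = field(default_factory=dict)


def describe(identifier, kind, **properties):
    """Build a description with whatever properties that kind of object needs."""
    return Description(identifier, kind, dict(properties))


def of_kind(descriptions, kind):
    """Select descriptions by kind, the query a 'book' table hard-codes."""
    return [d for d in descriptions if d.kind == kind]


# Made-up sample data for illustration only.
items = [
    describe("item:1", "book", title="Pride and Prejudice", binding="paperback"),
    describe("item:2", "map", title="Hypothetical map", scale="1:50000"),
]

print([d.identifier for d in of_kind(items, "book")])
```

This is only one of many possible shapes, and it trades schema rigidity for the need to be disciplined about what goes into `kind` and `properties`.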

> Data modeling is extremely useful, but
> mistaking drips and drabs of it early on for reality can poison your
> thinking.

Sorry, you got that back to front. We all agree that understanding
what users want and / or need is King, but unless you've got that
understanding of not only what the users want but how systems can
deliver this without creating constraints that will screw things up
when you extend that original delivery idea, you're going to suffer.

It's easy; take great care with what you call things in your system (no
matter whether it's in the database, your objects / classes /
instances / interfaces, user interface, buttons, messages, windows,
data types, loops ...); they're all data models that need to be as
cooperative as possible, speaking the *same language*, to be
compatible in the meaning they give the concepts used. If your Wheels
API has different semantics from your Steering API, making that car is
going to be a really crappy experience, for you as a developer, for
testers, for maintenance guys, for service people, and most of all
don't think for a second that the driver won't notice. These semantics
are far more important than what our industry traditionally has given
them, and in my opinion it is our biggest flaw.

Trust me, I've stared at data models up and down so many systems over
the years (10 of them in a high-flying big consultant agency where we
came in when projects otherwise failed) it's amazi

Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Alexander Johannesen

On Tue, Nov 29, 2011 at 10:06 AM, Nate Vack  wrote:
> A more productive task is to understand the who, how, when, and
> thenceforth of what tasks actual people want to accomplish with their
> computers

Understanding this is not disconnected from designing data models
*right*. It's the same thing. By extension I should mention that
people are terrible at telling you what they want or need, but they're
good at telling you what they hate. If nothing else, I'd suggest to
tap into that wonderful hate.

> But an 'all flows from data modeling' thought process leads to FOAF,
> FOAF leads to hate, and hate leads to suffering.

This sounds suspiciously like someone who doesn't understand the perils
of data models and how they affect all the FOAF and hate that's built
up around its faults. FOAF and suffering are symptoms of shitty data
models, not shitty code. Unless you've got a little more meat on that
argument? :)



Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Alexander Johannesen
I could give you tons of advice, most of it specific to some
technological domain or another, but over the years I've more or less
settled on one thing that beats out all the others:

Data models.

Once you grok data models, what they are, how they work, and all the
extended family (schemas, ontologies, persistent identification,
querying, de-duplication, layered models, LUT/transcripts, stored
procedures [and why they are evil], RDBMS vs. NoSQL vs. whatever, and
so on), everything else is miscellaneous. The way we humans use
computers as tools is always rooted in a data model at the bottom of
some program or database, and the rest of the time is spent
interacting with the data model, trying to make it do the things we
need it to do, and so on. Everything is about and around that data
model, so getting it right is a lot more important than any amount of
beautiful coding against it.

So, that's my big tip; all that technology we muck about with is
really trying to work well with a data model. Your task should rather
be to understand the why, who, how, when and the thenceforth of data
models, and everything else will follow.

Now, this tip could under normal circumstances be applied to any part
of the IT industry, but it makes particular sense in the library
world. Most of the time is spent converting data between data models
(whatever > MARC > whatever), or making sense of the one (MARC21/FRBR)
or other (AACR2/RDA and that third one I can never remember the name
of, those extension rules to AACR2?) or three (LCSH/DDC). We're all
battling against the original thought and implementation of data
models, and very often you'll find better technological solutions when
you understand the underlying human efforts of ... data modeling (and
by extension, you might discover my pet peeve: how all bad software
and systems in the world come from bad data modeling, and *not* from
bad programming [even if there's plenty of that, too]).



Re: [CODE4LIB] Ontology Question

2011-11-11 Thread Alexander Johannesen

> Is it okay to just use the classes I need or should I include the super 
> classes which they belong to?

I think we also need to define a few concepts here. What do you mean,
"include"? As far as I can tell, you want to say something like
"Here's a few concepts we're using, and their definition is based off
this other ontology over *there* (pointing)", but that's not always
the case, so just asking.

Now, Karen is of course right in her take on it, but there's a little
thing that requires a bit of focus, and that's how this new ontology is
going to be used. Is it one of these manual labour things where it
doesn't actually require formal definitions as much as a human one, or
is it (however you use the ontology) to be passed through a tool, or
more formally passed through an inferencer?



Re: [CODE4LIB] two open positions at Stanford

2011-10-14 Thread Alexander Johannesen
On Sat, Oct 15, 2011 at 3:40 AM, Cindy Harper  wrote:
> I mean - Bieber???  You
> mean he has a beard?

Unless they put a Phillips in a Tardis ...

(and seriously, if you get that joke with its three somewhat obscure
references, and the one insane premise, there's something wrong /
right with you, and I'm almost tempted to give out prizes ... )


Re: [CODE4LIB] iPads as Kiosks

2011-08-24 Thread Alexander Johannesen
Just my two bob:

We're going through various stages of testing out tablets for both
kiosks *and* portable workstations (for nurses and staff), and have
tried out iPads and various Androids, and our current favorite is
actually the Asus Eee Pad Transformer, a vanilla (but good quality)
Honeycomb Android during day, but with a snap-on keyboard with extra
ports and batteries for some netbook action at night, so it satisfies
both our criteria.

As with all things, it also depends on what software you want to run.
If you go with iPad you need to go through Apple's various
restrictions, while on Android you can use whatever you want. For a
"you are here" tablet a cheap $150 Android seems like a good option,



On Wed, Aug 24, 2011 at 11:51 PM, Madrigal, Juan A
> That's the equivalent to $25/month and includes support for your whole
> development team/institution.
> If your employer can't afford that then I suggest you look for a new job!
> ;)
> Juan Madrigal
> Web Developer
> Web and Emerging Technologies
> University of Miami
> Richter Library
> On 8/23/11 2:21 PM, "Dan Funk"  wrote:
>>Wow, just $300/year and you can run your own software on your own
>>hardware? What a deal.
>>On Tue, Aug 23, 2011 at 2:13 PM, David Uspal 
>>> Thanks for the update.   This definitely solves that issue -- it's
>>>unfortunate this wasn't in place in 2009, or I'd be into year two of a
>>>five year contract...
>>> David K. Uspal
>>> Technology Development Specialist
>>> Falvey Memorial Library
>>> Phone: 610-519-8954
>>> Email:
>>> -Original Message-
>>> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
>>>Andrew Hankinson
>>> Sent: Tuesday, August 23, 2011 2:00 PM
>>> Subject: Re: [CODE4LIB] iPads as Kiosks
>>> You can distribute apps via an internal web server, with no need to go
>>>out to Apple.
>>> You need to be a registered business to do this, and it costs $299/yr.
>>>You get a digital certificate, but that doesn't mean your code needs to
>>>be "seen" by anyone outside of your org.
>>> On 2011-08-23, at 1:47 PM, David Uspal wrote:
 When I did my iPhone work, it was back in 2009 before this document
even existed, so it's good they've come some distance on this issue
since then.  Still, the document below doesn't break the dependency on
the iTunes store and/or a digital certificate issued by Apple to
download applications (if I'm reading page 63 right), which was the big
sticking point of the contract.  Not only did the user not want the
network controlled by Apple (which this document does handle), they
also didn't want the code seen by any outside source at all (aka via
uploading it to the store)

 David K. Uspal
 Technology Development Specialist
 Falvey Memorial Library
 Phone: 610-519-8954

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
Of Andrew Hankinson
 Sent: Tuesday, August 23, 2011 1:34 PM
 Subject: Re: [CODE4LIB] iPads as Kiosks

 They now have an enterprise app deployment mechanism.

 On 2011-08-23, at 12:54 PM, David Uspal wrote:

> Then again, by selecting the iPad you're essentially tethered to
>Apple's iron grip of the iWorld via its iTunes vetting process and
>strict control of Apple hardware.   YMMV on this depending on what
>you're doing, but it should definitely be a consideration when
>choosing between Android tablets and the iPad.
> Quick side story -- we had to drop a contract one time at my old job
>due to the customer proprietary requirements.  The customer didn't
>want to release its developed software outside of house (minus the
>developers of course) and Apple wouldn't give them a waiver from using
>the iTunes store.  Mind you, this was a very big company with
>resources, so Apple probably lost a 5000 unit sale due to this
> David K. Uspal
> Technology Development Specialist
> Falvey Memorial Library
> Phone: 610-519-8954
> Email:
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
>Of Stephen X. Flynn
> Sent: Tuesday, August 23, 2011 9:01 AM
> Subject: Re: [CODE4LIB] iPads as Kiosks
> Let's not forget a far superior user experience.
> Stephen X. Flynn
> Emerging Technologies Librarian
> Andr

Re: [CODE4LIB] Seth Godin on The future of the library

2011-06-01 Thread Alexander Johannesen

On Thu, Jun 2, 2011 at 9:11 AM, Jonathan Rochkind  wrote:
> There are some unanswered questions about what the purpose of the catalog is
> or should be in our users research workflow, and it's not obvious to me 
> whether
> that purpose will involve putting any possible book or article that exists 
> for free
> on the internet in the catalog.

I personally think that libraries in general still have some
fundamental issues of just getting their head around the two-headed
problem of free web resources. Not only are these free, but they don't
physically exist. This has certain implications for libraries:

Free: as has been pointed out, sometimes this means not being peer
reviewed, or not having the quality seal of a publisher, and as such
there is no process for libraries to really understand how that
knowledge fits into the rest of their collection. (I don't think it's
a price issue; it's more a fundamental model issue.) It's sometimes
hard to wrap your head around the concept of anything free being of
much *worth* where in the past worth and often quality was measured in
the name of publishers and the amount of peer-review or the reputation
of the author. The Internet has *changed* this to the core; it's all
gone or going, and new models are coming through the haze of confusion
which I think the library world is both unprepared for and seriously
underfunded to deal with.

Links: The whole concept of web resources, of what a link (or a link
to a mirror or cache) is all about, confuses libraries, which are deeply
rooted in all things being physical. I know this is a doozy, but I
still find this an issue when talking to librarians even today. The
concept of virtual things in the library world really only exists with
the notion of meta data, and I don't think the transition to the
resource itself *also* being virtual has worked out well. Libraries
*like* physical objects, they *like* shelves, they *like* their
buildings, and I don't blame them; we are physical beings who love the
smell of paper. However, books are not actually important, buildings
are not actually important, that smell is definitely not important:
Ideas, knowledge and concepts are, and that's what we all try to pry
from the books. (As an aside, if ideas and concepts were valued more,
why couldn't LCSH morph into something far, far more important and
useful? The mind boggles at the lost opportunities!) You cannot pry
anything from a link except the possible resource at the other end,
but it is a few traceroutes away in a virtual place, and in need of
technological interpretation on arrival, and then comes the next level
of trouble;

These are just the conceptual problems. The next real problem of
technology and the library world is - despite the hard and excellent
work put in by people like us on this very list! - that libraries are
still slow-pokes in the realm of using and developing technology. Most
ILSes are charmingly quaint in dealing with these things. OPACs are
mostly dreadful. Backend infrastructure is never powerful or big enough
for the growing digital stuff coming in. Systems are always a bunch of
features away from being what we need, only getting by on a barely
useful set of features (that far too often the vendors dictate) to do
the minimum we have to do. Yes, yes, exceptions here and there, I
would never deny that, but look at library land as a whole; you're
lagging behind and you cannot really compete in a world that needs you
to not only run, but win. And frankly, you *cannot* win, not on
technology. There's just no way. Winning this one requires not
technology as such, but paradigm shifts in thinking, both from inside
and especially from the outside, coupled with proper resourcing by
people who understand the value libraries truly bring to the world.
And this latter thing is becoming a real problem, I think.

> One reason that libraries may not prioritize putting free ebooks in the 
> catalog is because
> there are other places users can search for free ebooks on the internet -- 
> but there
> aren't other places users can search for non-free ebooks that they know will 
> be licensed
> to them as library patrons, or for that matter to search for physical things 
> on the shelves
> that they know are available from their library.

Seems like an odd argument to me. Why are we talking about the price
and the format of the information rather than the *quality* of it? I
thought a curated collection was the bee's knees, regardless of what
formats are used. Hmm. Maybe I'm thinking more like a knowledge
customer than a librarian these days, and I've lost my touch or my
way. :)



Re: [CODE4LIB] Let's go somewhere [was PHP vs. Python...]

2010-11-01 Thread Alexander Johannesen
On Tue, Nov 2, 2010 at 5:03 AM, Jonathan Rochkind  wrote:
> I would be very unlikely to use someone's homegrown library specific
> scripting language. However, if you want to make a library for an existing 
> popular scripting
> language that handles your specific domain well, I'd be quite likely to use
> that if I had a problem with your domain and I was comfortable with the
> existing popular scripting language, i'd use it for sure.

Hmm. The balance between the old and tried, and the new and
experimental will, forever, cause these kinds of discussions. Now, I
agree with the basic sentiment of what you're saying, but ...

> Odds are your
> domain is not really "libraries" (that's not really a software problem
> domain), but perhaps as Patrick suggests "dealing with relationships among
> semantic objects", and then odds are libraries are not the only people
> interested in this problem domain.

I've worked in the three basic tiers of the library development world:
the plain vanilla programming world, the semantic web world, and the
dark dungeons of the Cult of MARC. Is the domain of library IT solved
by the generic technologies used? No.

There's nothing bad about a DSL, in fact, I encourage it. If you want
to get away from MARC, say, then having a DSL that approaches meta
data on the programmatic level directly is a wonderful abstraction.
But yes, we have to separate API from language. An API is, mostly
these days, simply a function/method call on top of an abstraction,
and it processes your request with your input. A language, on the
other hand, will let you deal directly with that problem. Most DSLs
are functional abstraction pre-compiled.

The line between a library and a language is perhaps more blurred
these days than ever before, however there are certain things that I
think justify a library DSL:

 * focus on identity management
 * mergeability of entities
 * large distributed sets
 * more defined line between data and meta data
 * controlled vocabularies and structures

There are generic tools for all of these, but no one central thing
that binds them all together in a seamless way, elegant or otherwise.
No platform binds these together in an easy or elegant way, and
perhaps such a thing would be beneficial to the library community, to
create a language that tries to create a bridge between computer
programming and what you learn in library school.

But even if we all concede that a library DSL perhaps is not a
practical solution, I'd still like to see us work on it, for nothing
more than sussing out our actual needs and wants in the process. Don't
underestimate the process of doing something that will never
eventuate, even knowingly.

> Some people like ruby because of it's support for creating what they call
> "domain specific languages", which I think is a silly phrase, which really
> just means "a libraryAPI at the right level of abstraction for the tasks at
> hand, so you can accomplish the tasks at hand concisely and without repeated
> code."

Depends on the language. Perhaps this doesn't make sense in Ruby, but
it certainly does in Scala, Haskell, and perhaps more than any, Rebol.
Even Lisp and its derivatives, which can create custom structures on the
fly, are well suited to create actual languages that redefine the
language's original syntax and structure. You can redefine the hell
out of C to create any language imaginable, too, even when you

A well-defined API is not a bad thing, though, but an API is
basically a set of semantic entities in a language to parse structures.
However, a DSL redefines the syntax itself. Sure
you can create a word "record" in an API that mimics, say, a MARC
record, but the interesting part is when you redefine the syntax to
work *with* that semantic concept, like:

  external_repository {
    baseURI: '',
    type: OAI-PMH
  }

  my_repository {
    baseURI: '',
    type: RIF-CS
  }

  some_vocabulary {
    baseURI: '',
    type: thesauri
  }

  foreach record in external_repository [without tag 850] {
    inject into my_repository {
      with: exploded words ( tag 245 )
      when: match words in some_vocabulary ( NT > 2 )
      merge into: tag 850
    }
  }
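
For contrast, here is a rough sketch of what that last loop might look
like in a general-purpose language (Python here). Everything in it is
an assumption made for illustration: the Record class, the
exploded_words and harvest helpers, a vocabulary held as a plain dict
of term -> narrower terms, and the reading of "NT > 2" as "has more
than two narrower terms". It does no stemming and talks to no real
repository; it only shows the shape of the plumbing the DSL would hide.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A toy MARC-like record: tag -> list of field values (hypothetical shape)."""
    identifier: str
    fields: dict = field(default_factory=dict)


def exploded_words(record, tag):
    """Split every value of `tag` into lowercase words, trimming punctuation."""
    words = []
    for value in record.fields.get(tag, []):
        for raw in value.split():
            word = raw.strip(".,;:/()[]").lower()
            if word:
                words.append(word)
    return words


def harvest(external, mine, vocabulary, min_narrower=2):
    """Copy records lacking tag 850 into `mine`, adding matched terms as tag 850.

    `vocabulary` maps a term to its list of narrower terms (an assumption);
    a word matches when it has more than `min_narrower` narrower terms.
    """
    for record in external:
        if "850" in record.fields:
            continue  # the "[without tag 850]" filter
        matches = sorted(
            {w for w in exploded_words(record, "245")
             if len(vocabulary.get(w, [])) > min_narrower}
        )
        if not matches:
            continue  # the "when:" clause did not fire
        # Copy the record so the external repository is left untouched.
        merged = Record(record.identifier,
                        {tag: list(vals) for tag, vals in record.fields.items()})
        merged.fields.setdefault("850", []).extend(matches)
        mine[merged.identifier] = merged
    return mine
```

Even this toy version spends most of its lines on plumbing rather than
on the library concepts themselves, which is the point of the
comparison.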

Creating classes that deal with record merging based on identity
management and various standards would be trivial to script together
super-fast, because the underlying concepts are, for us, rather
well known. Hacking this together in Java or otherwise is a test of
patience and sanity, because they are generic tools, even when known
library-type APIs are used. Of course lots of stuff is assumed in the
example, but these are well-understood assumptions (about merging
subject headings (like multiple tags handling, LCSH lookup, etc.),
about identity control, about word lookup (for example, I'm assuming
some form of stemming before matching), and on and on). A language that
half text manipulation and looku

Re: [CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-29 Thread Alexander Johannesen
On Sat, Oct 30, 2010 at 7:49 AM, Bradley Allen
> Mark- I would highly recommend looking at Tornado
> ( as an alternative to using Django without
> the ORM.

I'd second that one. Have used it for a couple of projects, and it
seriously cut down on prerequisite clutter and is super fast.


Re: [CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-27 Thread Alexander Johannesen
Olá, como vai?

Luciano Ramalho  wrote:
> Actually, Python is a general purpose programming language. It was not
> created specifically for server side scripting like PHP was. But it is
> very suitable to that task.

I'm not sure talking about what something used to be is as interesting
as talking about what it is. Both Python and PHP can share whatever
moniker we choose (scripting-language, programming language,
real-time, half-time, bytecoded, virtual, etc.).

>> Not seen any scientific packages, but I've seen a few ray-tracers,
>> although they're all demo apps and fun toys (although I think that
>> applies to Python, too).
> No, that does not apply to Python. Python is widely used for hardcore
> scientific computing.

I was referring to the ray-tracing part.

> It is also the most important scripting language in large scale CGI
> settings

Yes, Python is widely used for scripting up interfaces into other more
complex systems. But rarely is the core of the thing written entirely
in Python.

>> Maybe your Google-fu is weak. :)
> Or maybe he's just realizing that outside of server side web
> scripting, PHP is just not so widely used.

Absolutely, and fair enough.

> Having used both languages, I discovered that Python is easier for
> most tasks, and one reason is that the libraries that come with Python
> are extremely robust, well tested and consistent.

Hmm. PHP is extremely robust and well-tested, but yes, it's not all
that consistent, especially not before version 5.2+. However, things
have moved on, and with release 6 around the corner things will be
tighter still. Just like the first versions of Python were
interesting, so were PHP's, but the biggest problem with the
evolution of PHP was the very fact that it was by far the most
popular language for rapid web development.

> PHP is very
> practical for server-side web scripting, but its libraries are
> unfortunately full of gotchas, traps and unexpected behaviour.

There are gotchas in every language, even Python.

> A key reason for that is the fact that Python has always had an
> exception-handling mechanism while PHP has grown something like that
> only a few years ago

True enough. But earlier versions of any language are less desirable
than the latest versions, so I'm not sure this is a prevailing
argument for the horribleness of PHP or any language. These things
evolve. PHP 5.3+ and soon 6 are looking very good, indeed, but yes, we
will just have to live with a poor reputation brought on by the big
number of users and the pre 5.2+ era.

> So, in my opinion, PHP is great at what it does best: enabling quick
> server-side Web scripting on almost any hosting service on Earth.

I'm fairly sure you can say that because you haven't done much of any
other kind of PHP work. :)

> For everything else, it is very worthwhile to learn and use a general
> purpose dynamic language such as Python, Ruby or Perl.

Of course. Developers should learn many languages, and choose
wisely the language best suited to the problem at hand.

> Sorry for the rant. I must confess I am a founder of the Brazilian
> Python Association and was its first president, so you can call me a
> Python advocate.

No bias at all, really. :)

Kind regards,

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

[CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-27 Thread Alexander Johannesen
Hola, compadre,

Elliot Hallmark  wrote:
> Other things beyond that seemed
> awkward, difficult, or impossible from what I knew. python immediately
> jumped out to me as a tool more suited to these tasks.

The fact that Python has a looping run-time environment is, of course,
a give-away to why most people think this, and perhaps to some degree,
rightly so, but PHP has got the same, it's just that *most* people use
PHP through some Apache module as a request/response module. Indeed,
that's where it started, and that's its forte.

> From my experience, it seemed php was a server side
> scripting language.

Strictly speaking, so is Python.

> Can you write a php script that gets key presses
> and doesn't pass them along to windows to process?  I thought the OS
> would have to process the key press, pass it along to the php server
> and then php could process it. (pyhook)

A couple of obvious candidates;

> Also, how would you go about using a GPU from a graphics card in php?
> (python cuda in google gives many results)

PHP is just a C program with various bindings, so I suspect in the
same way Python would do it. Whether anyone has done it, though, is a
different question.

> Has anyone written a scientific computing package along the lines of
> matlab in php (scipy, numpy, matplotlib)?  Or a non-sequential optical
> raytracer?

Not seen any scientific packages, but I've seen a few ray-tracers,
although they're all demo apps and fun toys (although I think that
applies to Python, too). It's not so much about whether you can do it
or not (you can), but whether it makes sense to do so (it mostly
doesn't). Having said that, there's nothing stopping me making a local
run-time PHP program to do either, it's just that it's PHP and hence
slower than C. Python, too, is slower than C, except when it runs some
C module, which, uh, is C, the same as if PHP runs some C module. For
example, some of the fastest and best XSLT 1.0 processors and XML
libraries out there are libxslt and libxml2 (from the GNOME project),
written in C, and they are what the de facto PHP XML and XSLT modules
are built on. Whatever you've got that runs in C, you can run in PHP,
it's not really a big deal, it just depends on whether it makes sense
to patch it up with the way you use your PHP.
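
To make the "slower than C, except when it runs some C module" point
concrete on the Python side, here is a small sketch, not a benchmark.
It assumes CPython, where the standard zlib module is a wrapper around
the C zlib library. The crc32_pure function is my own illustrative
helper: it computes the same CRC-32 checksum in plain Python, so the
two can be checked against each other and timed.

```python
import timeit
import zlib


def crc32_pure(data: bytes) -> int:
    """Bit-by-bit CRC-32 (the same checksum zlib computes), in plain Python."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right one bit; fold in the reflected polynomial if a 1 fell off.
            crc = (crc >> 1) ^ (0xEDB88320 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    payload = bytes(range(256)) * 64  # 16 KiB of sample data
    # Same answer either way; only the speed differs.
    assert crc32_pure(payload) == zlib.crc32(payload)
    slow = timeit.timeit(lambda: crc32_pure(payload), number=5)
    fast = timeit.timeit(lambda: zlib.crc32(payload), number=5)
    print(f"pure Python: {slow:.4f}s, C-backed zlib: {fast:.6f}s")
```

The exact timings will differ from machine to machine, but the C-backed
call should be the faster of the two by a wide margin, and the same
reasoning applies to a PHP script calling into a C extension.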

> if you wanted to write a web interface for GNU cash or another well
> established accounting program, could you do it?

Sure. Here's someone who'dunnit back in 2008;

> please feel free to point me to the php equivalents of pyhook, pycuda,
> scipy, numpy and some examples of widely used programs with php
> bindings.

You can bind PHP and Python the same, it's just a matter of doing it and
whether it makes sense to do so. It's *not* a question of /if/ you can
do it, but if you /should/ do it. Your mileage *will* vary.

>>  For the sophisticated hacker, most languages can
>> be tweaked to solve almost any problem.
> I am sure that is true. Though, I feel for many tasks php would
> require quite a bit more tweaking than python, with much less
> community support behind it (I mean, google comes up with fewer
> helpful links to the problems I cited above).

Maybe your Google-fu is weak. :)

> My impression, based on very little experience with php, is that if
> you asked in a forum about using php for advanced scientific
> computing, or writing music generation/sequencing software,
> knowledgeable folks would first ask: "are you sure you want to do this
> in php?  how about java or python?"

Again, probably because they don't realize it can be done in a
non-request/response kinda way with PHP as well. But then, PHP itself
isn't all that fast if you have little knowledge of how to do proper
PHP, but this is a pitfall in any language.

> That said, php may be superior for generating websites from databases.

Not really, but the installations you'll find in the wild are readily
configured for it, so it's easy to get going. However, this has little
to do with the language itself, and more to do with default packaging
of it.

Anyway, I wasn't meaning to promote PHP over Python, just pointing out
that PHP is a lot more (and more often still, a lot better) than what
most people think it is.


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 6:58 AM, Chris Fitzpatrick  wrote:
> +1 to the  "this discussion is really depressing me"  camp.

Ok, ok, I get the message. This is no place to voice strong opinions
about bad library tech, and my (different, but not bad) language and
stance (contrarian, but not accusatory) are simply not acceptable. I'm
outta here.


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 6:53 AM, Jonathan Rochkind  wrote:
> Pretty sure it wasn't depressing to the vast majority of the listserv
> audience.  That was/is a discussion that benefited from a "timeout period",
> like you give the pre-schoolers.

Given we're adults, and not in pre-school, I disagree.


Re: [CODE4LIB] Django

2010-10-27 Thread Alexander Johannesen
On Wed, Oct 27, 2010 at 3:09 AM, Elliot Hallmark  wrote:
> However, I switched to this other scripting
> language, python, because it could do things php can't.

Not to start a flame, but that's a rather big statement which I think
A) needs backing up, and B) is probably untrue.

>  For instance,
> my first project in python involved capturing keyboard input before
> windows heard about it.  Then I kept discovering amazing things python
> can do that php can't.

For instance, PHP can do this fine. Was there something in particular
you're thinking of that PHP can't do?

> I helped write a non-sequential optical ray tracer in python.  When it
> needed to be faster there were several libraries for writing C code
> directly in a pythonic syntax.  Python has hooks into everything, like
> optical character recognition, electronic music
> sequencing/generation, serial port i/o.

Again, PHP the same. For the sophisticated hacker, most languages can
be tweaked to solve almost any problem.

And I'm not even suggesting that you use PHP. Happy hacking.


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D  wrote:
> Can that limit threshold be raised?  If so, are there reasons why it should 
> not be raised?

Is it to throttle spam or something? 50 seems rather low, and it's
rather depressing to have a lively discussion throttled like that. Not
to mention I thought I was simply kicked out for livening things up
(especially given my reasonable follow-up was where the throttling


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-27 Thread Alexander Johannesen
>> Political? For sure. Engineering? Not so much.
> Ok. Solve it. Let us know when you're done.

Wow, lamest reply so far. Surely you could muster a tad bit better? I
was excited about getting a list of the hardest problems, for example,
I'd love to see that. Then by that perhaps you could explain what this
insurmountably hard, mind-boggling problem actually is, because, you
know, you never actually said.


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-27 Thread Alexander Johannesen

On Tue, Oct 26, 2010 at 1:23 PM, Bill Dueber  wrote:
> Sorry. That was rude, and uncalled for. I disagree that the problem is
> easily solved, even without the politics. There've been lots of attempts to
> try to come up with a sufficiently expressive toolset for dealing with
> biblio data, and we're still working on it. If you do think you've got some
> insight, I'm sure we're all ears, but try to frame it terms of the existing
> work if you can (RDA, some of the dublin core stuff, etc.) so we have a
> frame of reference.

Well, I've whined enough both here and on NGC4LIB, and I'm kinda over
it, just like I'm sure most people are over my whining. But suffice
it to say that FRBR is a 15 year old model that has still not been
proven in the Real World[TM] in any meaningful way (the prototypes
work fine until you dig a bit) and probably never will be as long as
MARC21 runs the show, and trying to stick RDA on top with rules that
have got use-cases that are old enough to be my kids, well, I'm not
very positive about that either.

The direction of going ontological is a good one, and in the absence of
anything else, RDF-infused FRBR / RDA is probably the way to go
(except I'd ditch RDA and, uh, perhaps even FRBR, or at least
seriously modify it), but the community is decidedly not talking about
ontological interoperability nor extensions nor the semantics involved
to solve actual problems in the bibliographic world (including the
fact that it is inherently bibliographic). There needs to be much more
involvement by library geeks and managers in defining semantic reuse
and extensibility, to properly define those things that are almost
absent from the AACR2 and friends; the relationships between entities
themselves. In other words, you need to get away from the
record-centered view, and embrace the subject-centric view.
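
As a rough illustration of the difference (a sketch only, with made-up
subject headings and relationship names, and not how any real
cataloguing system stores things), here is the same information first
as flat records and then turned inside out so that every entity becomes
something you can start from and follow relationships out of:

```python
from collections import defaultdict

# Record-centric: each record is an island; authors and subjects are just strings.
records = [
    {"title": "Pride and Prejudice", "author": "Austen, Jane",
     "subjects": ["Courtship", "Social classes"]},
    {"title": "Emma", "author": "Austen, Jane",
     "subjects": ["Courtship", "Matchmaking"]},
]


def subject_centric(records):
    """Invert records into: entity -> {relationship -> set of related entities}."""
    topics = defaultdict(lambda: defaultdict(set))
    for rec in records:
        work, author = rec["title"], rec["author"]
        # Every relationship is recorded from both ends.
        topics[work]["written by"].add(author)
        topics[author]["wrote"].add(work)
        for subject in rec["subjects"]:
            topics[work]["is about"].add(subject)
            topics[subject]["is the subject of"].add(work)
    return topics


if __name__ == "__main__":
    topics = subject_centric(records)
    # Start from a person or a subject instead of from a record.
    print(sorted(topics["Austen, Jane"]["wrote"]))
    print(sorted(topics["Courtship"]["is the subject of"]))
```

In the second shape the author and the subject are first-class things
with their own relationships, rather than text buried in a field.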

Anyway, enough from this old grumpy bum. Sorry to stir up the dust.



Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
On Tue, Oct 26, 2010 at 12:48 PM, Bill Dueber  wrote:
> Here, I think you're guilty of radically underestimating "lots of people
> around the library world." No one thinks MARC is a good solution to
> our modern problems, and no one who actually knows what MARC
> is has trouble understanding MARC-XML as an XML serialization of
> the same old data -- certainly not anyone capable of meaningful
> contribution to work on an alternative.

Slow down, Tex. "Lots of people in the library world" is not the same
as developers, or even good developers, or even good XML developers,
or even good XML developers who know what the document model imposes
on a data-centric approach.

> The problem we're dealing with is *hard*. Mind-numbingly hard.

This is no justification for not doing things better. (And I'd love to
know what the hard bits are; always interesting to hear from various
people as to what they think are the *real* problems of libraries,
as opposed to any other problems they have)

> The library world has several generations of infrastructure built
> around MARC (by which I mean AACR2), and devising data
> structures and standards that are a big enough improvement over
>  MARC to warrant replacing all that infrastructure is an engineering
>  and political nightmare.

Political? For sure. Engineering? Not so much. This is just that whole
"blinded by MARC" issue that keeps cropping up from time to time, and
rightly so; it is truly a beast - at least the way we have come to
know it through AACR2 and all its friends and its death-defying focus
on all things bibliographic - that has paralyzed library innovation,
probably to the point of making libraries almost irrelevant to the

> I'm happy to take potshots at the RDA stuff from the sidelines, but I never
> forget that I'm on the sidelines, and that the people active in the game are
> among the best and brightest we have to offer, working on a problem that
>  invariably seems more intractable the deeper in you go.

Well, that's a pretty scary sentence, for all sorts of reasons, but I
think I shall not go there.

> If you think MARC-XML is some sort of an actual problem

What, because you don't agree with me the problem doesn't exist? :)

> and that people
> just need to be shouted at to realize that and do something about it, then,
> well, I think you're just plain wrong.

Fair enough, although you seem to be under the assumption that all of
the stuff I'm saying is a figment of my imagination (I've been
involved in several projects lambasted because managers think MARCXML
is solving some imaginary problem; this is not bullshit, but pain and
suffering from the battlefields of library development), that I'm not
one of those developers (or one of you, although judging from this
discussion it's clear that I am not), that the things I say somehow
don't apply because you don't agree with, umm, what I'm assuming is
my somewhat direct approach to stating my heretic opinions.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
On Tue, Oct 26, 2010 at 11:56 AM, Walker, David  wrote:
> Your criticisms of MARC-XML all seem to presume that MARC-XML is the
> goal, the end point in the process.  But MARC-XML is really better seen as a
> utility, a middle step between binary MARC and the real goal, which is some
> other "useful and interesting" XML schema.

How do you create an ontological commitment in a community to an
expanding and useful set of tools and vocabularies? I think I need to
remind people of what MARCXML is supposed to be ;

"a framework for working with MARC data in a XML environment. This
framework is intended to be flexible and extensible to allow users to
work with MARC data in ways specific to their needs. The framework
itself includes many components such as schemas, stylesheets, and
software tools."

I'm not assuming MARCXML is a goal, no matter how we define that. I'm
pooh-poohing MARCXML for the semantics we, as a community, have been
given by a process I suspect had goals very different from reality.
Very few people would "work with MARC through MARCXML"; they would use
it to convert it, filter it, hack around it to something else
entirely. And I'm afraid lots of people are missing the point about
stunting the developments in a community by embracing tools that
push a packet that inhibits innovation. So, here's the point, in
paraphrased form;

   "Here's our new thing. And we did it by simply converting all our
MARC into MARCXML that runs on a cron job every midnight, and a bit of
horrendous XSLT that's impossible to maintain."

   "But it looks just like the old thing using MARC and some templates?"

   "Ah yes, but now we're doing it in XML!"

   (Yeah, yeah, your mileage will vary)

I'm sorry if I'm overly pessimistic about the XML goodness in the
world, not for the XML itself, but the consequences of the named
entities involved. I've been a die-hard XML wonk for far too many
years, and the tools in that tool-chest don't automatically solve
hard problems better by wrapping stuff up in angle brackets, and -
dare I say it? - perhaps introduce a whole fleet of other problems
rarely talked about when XML is the latest buzz-word, like using a
document model on what's a traditional records model, character
encodings, whitespace issues, Unicode, size and efficiencies (the
other part of this thread), and so on.

But let me also be a bit more specific about that hard semantic
problem I'm talking about;

Lots of people around the library world infra-structure will think
that since your data is now in XML it has taken some important step
towards being inter-operable with the rest of the world, that library
data now is part of the real world in *any* meaningful way, but this
is simply, demonstrably, deceivingly not true. Having our data in XML
has killed a few good projects where people have gone "A new project
to convert our MARC into useful XML? Aha! LoC has already solved that
problem for us."

Btw, to those who find me so obnoxious, at no point do I say it was
intentionally evil, just evil all the same. The road to hell is, as
always, paved with good intentions.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
Ray Denenberg, Library of Congress  wrote:
> It really is possible to make your point without being quite so obnoxious.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen

On Tue, Oct 26, 2010 at 6:26 AM, Nate Vack  wrote:
> Switching to an XML format doesn't help with that at all.

I'm willing to take it further and say that MARCXML was the worst
thing the library world ever did. Some might argue it was a good first
step, and that something was better than nothing, to
which I respond ;


MARCXML is nothing short of evil. Not only does it go against every
principle of good XML anywhere (don't rely on whitespace, structure
over code, namespace conventions, identity management, document
control, separation of entities and properties, and on and on), it
breaks the ontological commitment that a better treatment of the MARC
data could bring, deterring people from actually a) using the darn
thing as anything but a bare minimal crutch, and b) expanding it to be
actually useful and interesting.

The quicker the library world can get rid of this monstrosity, the
better, although I doubt that will ever happen; it will hang around
like a foul stench for as long as there is MARC in the world. A long
time. A long sad time.

A few extra notes;

Can you tell I'm not a fan? :)

Kind regards,

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] OCLC Service Outage Update

2010-05-10 Thread Alexander Johannesen
Michael J. Giarlo  wrote:
> ... people took Simon's comment seriously?

Language is a funny thing ; sometimes the things that are being said
are taken seriously. And the script-haters are spread far and wide, so
there was no reason not to take him seriously. Should the default be
not to take anyone seriously? Srsly?

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] OCLC Service Outage Update

2010-05-10 Thread Alexander Johannesen
On Tue, May 11, 2010 at 06:59, stuart yeates  wrote:
> No, the real problem is with trolls sending flamebait.

Friggin' AMEEN!

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 20:29, Owen Stephens  wrote:
> However I'd argue that actually OpenURL 'succeeded' because it did manage to
> get some level of acceptance (ignoring the question of whether it is v0.1 or
> v1.0) - the cost of developing 'link resolvers' would have been much higher
> if we'd been doing something different for each publisher/platform. In this
> sense (I'd argue) sometimes crappy standards are better than none.

Well, perhaps. I see OpenURL as the natural progression from PURL, in
which both have their degree of "success", however I'm careful using
that word as I live on the outside of the library world. It may well
be a success on the inside. :)

> I think the point about Link Resolvers doing stuff that Apache and CGI
> scripts were already doing is a good one - and I've argued before that what
> we actually should do is separate some of this out (a bit like Johnathan did
> with Umlaut) into an application that can answer questions about location
> (what is generally called the KnowledgeBase in link resolvers) and the
> applications that deal with analysing the context and the redirection

Yes, splitting it into smaller chunks is always smart, especially with
complex issues. For example, in the Topic Maps world, the whole standard
(reference model, data model, query language, constraint language, XML
exchange language, various notational languages) is wrapped up with a
guide in the middle. Make them into smaller parcels, and make your
flexible point there. If you pop it all into one, no one will read it
and fully understand it. (And don't get me started on the WS-* set of
standards on the same issues ...)

> (To introduce another tangent in a tangential thread, interestingly (I
> think!) I'm having a not dissimilar debate about Linked Data at the moment -
> there are many who argue that it is too complex and that as long as you have
> a nice RESTful interface you don't need to get bogged down in ontologies and
> RDF etc. I'm still struggling with this one - my instinct is that it will
> pay to standardise but so far I've not managed to convince even myself this
> is more than wishful thinking at the moment)

Ah, now this is certainly up my alley. As you might have seen, I'm a
Topic Maps guy, and we have in our model a distinction between three
different kinds of identities; internal, external indicators and
published subject identifiers. The RDF world only had rdf:about, so
when you used "", are you talking about that thing,
or does that thing represent something you're talking about? Tricky
stuff which has these days become a *huge* problem with Linked Data.
And yes, they're trying to solve that by issuing an HTTP 303 status
code as a means of declaring the identifiers imperative, which is a
*lot* of resolving to do on any substantial set of data, and in my
eyes a huge ugly hack. (And what if your Internet falls down? Tough.)

Anyway, here's more on these identity problems ;

As to the RESTful notions, they only take you as far as content-types
can take you. Sure, you can glean semantics from it, but I reckon
there's an impedance mismatch between just the things librarians have
got down pat ; meta data vs. data. CRUD or, in this example, GPPD
(get/post/put/delete), which aren't in a dichotomy btw, can only
determine behavior that enables certain semantic paradigms, but cannot
speak about more complex relationships or even modest models. (Very
often models aren't actionable :)

The funny thing is that after all these years of working with Topic
Maps I find that these hard issues have been solved years ago, and the
rest of the world is slowly catching up to it. I blame the lame
DAML+OIL background of RDF and OWL, to be honest; a model too simple
to be elegantly advanced and too complex to be easily useful.

Kind regards,

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 18:47, Owen Stephens  wrote:
> Could you expand on how you think the problem that OpenURL tackles would
> have been better approached with existing mechanisms?

As we all know, it's pretty much a spec for a way to template incoming
and outgoing URLs, defining some functionality along the way. As such,
URLs with basic URI templates and rewriting have been around for a
long time. Even longer than that is just the basics of HTTP, which has
status codes and functionality to do exactly the same. We've been
doing link resolving since the mid-90s, either as CGI scripts, or as
Apache modules, so none of this was new. URI comes in, you look it up
in a database, you cross-check with other REQUEST parameters (or
sessions, if you must, as well as IP addresses) and pop out a 303
(with some possible rewriting of the outgoing URL) (with the hack we
needed at the time to also create dummy pages with META tags
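
To make the "URI comes in, look it up, pop out a 303" flow concrete,
here is a minimal sketch in Python. Everything named in it is an
assumption for illustration only: the lookup table stands in for the
database, the example.org addresses and the 192.0.2.0/24 network are
made up, and resolve() is a hypothetical helper, not anything taken
from OpenURL or from a real link resolver.

```python
import ipaddress
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for the database: (parameter, value) -> target template.
KNOWLEDGE_BASE = {
    ("issn", "1234-5678"): "https://publisher.example.org/journal/vol/{volume}",
}

# Hypothetical network used for the IP address cross-check.
ALLOWED_NETWORK = ipaddress.ip_network("192.0.2.0/24")


def resolve(incoming_url, client_ip):
    """Return (status, headers) for one incoming link-resolver request."""
    # Cross-check the request against something other than the URL itself.
    if ipaddress.ip_address(client_ip) not in ALLOWED_NETWORK:
        return 403, {}

    # Look the incoming parameters up in the "database".
    params = parse_qs(urlsplit(incoming_url).query)
    for (name, value), template in KNOWLEDGE_BASE.items():
        if value in params.get(name, []):
            # Rewrite the outgoing URL from the remaining request parameters.
            volume = params.get("volume", [""])[0]
            return 303, {"Location": template.format(volume=volume)}

    return 404, {}
```

A CGI script or Apache module of the era would do the same three
things: check who is asking, look up the target, and answer with a
redirect.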

So the idea was to standardize on a way to do this, and it was a good
idea as such. OpenURL *could* have had a great potential if it
actually defined something tangible, something concrete like a model
of interaction or basic rules for fishing and catching tokens and the
like, and as someone else mentioned, the 0.1 version was quite a good
start. But by the time 1.0 came out, all the goodness had turned
so generic and flexible in such a complex way that handling it turned
you right off it. The standard also had a very difficult language, and
more specifically didn't use enough of the normal geeky language used
by sysadmins around. The more I tried to wrap my head around it, the
more I felt like just going back to CGI scripts that looked stuff up
in a database. It was easier to hack legacy code, which, well, defeats
the purpose, no?

Also, forgive me if I've forgotten important details; I've suppressed
this part of my life. :)

Kind regards,

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)

2010-04-29 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 10:54, Eric Hellman  wrote:
> May I just add here that of all the things we've talked about in these 
> threads, perhaps the only thing that will still be in use a hundred years 
> from now will be Unicode. إن شاء الله

May I remind you that we're still using MARC. Maybe you didn't mean in
the library world ... *rimshot*

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)

2010-04-29 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 04:17, Jakob Voss  wrote:
> But all the flaws of XML can be traced back to SGML which is why we now use
> JSON despite all of its limitations.

Hmm, this is wrong on so many levels. First, SGML was pretty darn good
for its *purpose*, but it was a geek's dream and pretty scary for
anyone who hacked at it not fully getting it (like most normal
developers). As with many things where the learning curve is steep, it
fell into the "not good for normal consumption" category and they
(well, people who cared, and made decisions about the web) were
"forced" to make XML. But JSON? Are you sure you've got this figured
out? JSON as an object serialization format is good for a number of
things (small footprint, embedded type, etc.), but sucks for most
information management tasks.

However, I'd like to add here that I happen to love XML, even from an
integration perspective, but maybe that stems from understanding all
those tedious bits no one really cares about, like id(s) and
refid(s) (and all the indexing goodness that comes from it), canonical
datasets, character sets and Unicode, all that schema craziness
(including Schematron and RelaxNG), XPath and XQuery (and all the
sub-standards), XSLT and so on. I love it all, and not because of the
generic simplicity itself (simple in the default mode of operation, I
might add), but because of a) modeling advantages, b)
cross-environment language and schema support, and c) ease of
creation. (I don't like how easy well-formedness breaks, though. That

But I mention all this for a specific reason ; MARCXML is the work of
the devil! There's a certain dedication needed for "doing it right",
by paying attention in XML class, and playing well with your playmates.
This is how you build a community and understanding around standards;
the standards themselves are not enough. The library world did nothing
of the kind ;

The flaws of XML can most likely be traced back to people not playing
well with playmates, and not the format itself.

> May brother Ted Nelson enlighten all of
> us - he not only hates XML [1] and similar formats but also  proposed an
> alternative way to structure information even before the invention of
> hierarchical file systems and operating systems [2].

Bah. I'm not impressed by someone who doesn't see the SGML -> XML ->
HTML progression, an inherited and more rigid structure (or, in
popular language, more schematic) as a document model, as a good
thing. Any implied structure can be criticized, including pretty
much any corner of Xanadu as well. (I mean, seriously; taking
hypermedia one step closer to a file system does *not* solve problems
with the paper-based document model of HTTP, it just shifts the focus)

> In his vision of Xanadu
> every piece of published information had a unique ID that was reused
> everytimes the publication was referenced - which would solve our problem.

*Having* an identifier doesn't mean that identifier is a *good* one,
nor that it solves your problem. There are plenty of systems out there
where everything has an identifier (and, if you knew XML more deeply,
you'd find identification models in there as well, but people don't
use them because the early days of XML neither understood nor needed
them). Have a look at the failed XLink brouhaha for something that
worked and filled the niche, but people didn't get it, nor did
tool-makers see the point of implementing it, and the thing died a
premature death. The current model of document structure and XQuery is
somewhat of an alternative, but people are switching to CSS3 styles as
well. The thing is, just because you've got persistence in a system of
identifiers, it does not follow that the information is persisted; the
problem of change is *not* solved in either system, and so we work
with the one we've got and make the best of it.

One thing I always found intriguing about librarians was their
commitment to persistent URIs for information resources, and use of
303 if need be (although I see this mindset dwindling). I think you're
the only ones in the entire world who give a monkey's bottom about
these issues, as the rest of the world simply uses Google as a
resolver. I can see where this is going. :)


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Alexander Johannesen

On Thu, Apr 29, 2010 at 22:47, Walker, David  wrote:
> I would suggest it's more because, once you step outside of the
> primary use case for OpenURL, you end-up bumping into *other* standards.

These issues were raised all the way back when it was created, as well. I
guess it's easy to be clever in hindsight. :) Here's what I wrote
about it 5 years ago ( ;

So let's talk about 'Not invented here' first, because surely, we're
all guilty of this one from time to time. For example, lately I dug
into the ANSI/NISO Z39.88-2004 standard, better known as OpenURL. I
was looking at it critically, I have to admit, comparing it to what I
already knew about Web Services, SOA, http,
Google/Amazon/Flickr/ API's, and various Topic Maps and
semantic web technologies (I was the technical editor of Explorer's
Guide to the Semantic Web).

I think I can sum up my experiences with OpenURL as such; why? Why
has the library world invented a new way of doing things that can
already be done quite well? Now, there is absolutely nothing wrong
with the standard per se (except a pretty darn awful choice of
name!!), so I'm not here criticising the technical merits and the work
put into it. No, it's a simple 'why' that I have yet to get a decent
answer to, even after talking to the OpenURL bigwigs about it. I mean,
come on; convince me! I'm not unreasonable, no truly, really, I just
want to be convinced that we need this over anything else.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Library Linked Data

2009-10-28 Thread Alexander Johannesen

On Thu, Oct 29, 2009 at 16:19, stuart yeates  wrote:
> I'm guessing that Roy meant linked data in the sense of
> and

I'm pretty sure he did, too. I guess I was trying to smoke out his
reasoning for choosing "linked data" as the only worthwhile semantic
web technology. Let me clarify, and have a look at this ;

Linked data is the bottom four boxes out of a total of 12 (13 if you
count the top one), where the ones missing are things like Trust,
Proof, Logic, Querying, Ontologies and Taxonomies, all things that I
thought it was evident belonged at the core of what library science is
all about. It simply astounds me the lack of understanding from the
library world on these things, so sad to see that these things aren't
linked up; you *are* what these things are about! Sure, linked data is
easier; that's why everyone is doing it, and has been doing it for years.
But you're missing out in fields that should be second-nature to you.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Library Linked Data

2009-10-28 Thread Alexander Johannesen

On Thu, Oct 29, 2009 at 15:16, Roy Tennant  wrote:
> Could you elaborate a bit? In my mind, the only "semantic web technology" of
> any note is "linked data".

What do you mean by linked data? I work in fields of semantic web
technology where there's very little linked data (ie. data on the web
you can link to and use), yet I feel all our work is very valuable and
certainly worthy of note ...


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Alexander Johannesen

I guess I'm the one who's got to step up to the self-slaughtering
altar, but the fact that a lot of our systems break or don't know how
to handle HTML is despicable. I'm sure you guys are familiar with RSS
/ Atom, and because in there we *expect* HTML and therefore make sure
our back-ends can grok it, it enhances the meta data *greatly*.

Don't think for a second that purity of the data format in any shape
or form is the definition of its usefulness. Mixed content models
might be complex to work with, but their value is immense. I can fully
understand *why* people say "don't do it", because, yes, it ups the
complexity, and perhaps dinosaur technologies like MARC and our ILSs
breaking under the pressure of more modern technologies reinforce that
view, but I don't think we should shun it because of that.

If your back-end can't grok HTML, I'd suggest you fix it immediately!
If your ILS chokes on XML and / or HTML snippets, I suggest you
replace it. You seriously shouldn't allow this rigidity into your
infra-structure, and it's depressing to watch how we as complex users
of MARC don't dare to extend it to become a format that does what it
should and needs to do.

Even *if* HTML in MARC records probably is a bad idea.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] A Book Grab by Google

2009-05-20 Thread Alexander Johannesen
On Thu, May 21, 2009 at 10:07, Karen Coyle  wrote:
> - without competition, Google (with the agreement of the registry, whose
> purpose is to garner as much income as possible for rights holders) will
> charge a price that is more than some institutions will be able to afford;
> others will subscribe, but to the detriment of other resource subscriptions.

How is this different from what's already in place in terms of
electronic resources? This is not uniquely Google, nor has it even
been proven to happen.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-14 Thread Alexander Johannesen
On Thu, May 14, 2009 at 17:45, Rob Sanderson  wrote:
> I'll quote Mike (and most common approaches to the problem):
>        Don't Do That Then.
> :)

Oh, for sure. :) But these are very subtle things that are hard to
understand, and certainly the long-term implications, so people *will*
do this, and they *will* put rot into the SemWeb chains people create.
It's unavoidable, but I know lots are trying to work out some kind of
solution. Unfortunately, this one is being routed to software
frameworks rather than the RDF core itself. Oh well.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-14 Thread Alexander Johannesen
On Thu, May 14, 2009 at 17:35, Rob Sanderson  wrote:
> For example, the owl:sameAs predicate is used to express that the
> subject and object are the same 'thing'.  Then the application can infer
> that if a owl:sameAs b, and a x y, then b x y.

Yes, but there's a snag; as RDF works only on the URI resource level
(no added semantics to the typification of the URI resource) if
someone does an owl:sameAs between an identifier of a thing and a
locator of a thing (a locator being the resource itself as opposed to
being an identifier; example are you talking about Sun Corp
( or are you talking about their website
( you can get a nasty case of integrity rot, and I've
not seen any proposals to address this issue (the RDF world is
essentially assuming modeling from the viewpoint of everything being
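
A small sketch of how that rot creeps in, under loud assumptions: the
triples are plain Python tuples rather than a real RDF store, the ex:
names are invented, and infer_same_as() is a hypothetical helper that
naively applies the rule quoted above (if a owl:sameAs b, and a x y,
then b x y), treating sameAs as symmetric.

```python
def infer_same_as(triples):
    """Naively propagate statements across owl:sameAs, in both directions."""
    triples = set(triples)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for s, p, o in triples if p == "owl:sameAs"]
        for a, b in pairs:
            for s, p, o in list(triples):
                if p == "owl:sameAs":
                    continue
                # Copy every statement about one side over to the other side.
                for source, target in ((a, b), (b, a)):
                    if s == source and (target, p, o) not in triples:
                        triples.add((target, p, o))
                        changed = True
    return triples


# One URI is meant as an identifier for the company, the other as a
# locator for its website. Both names are made up for the example.
facts = {
    ("ex:SunCorp", "ex:type", "ex:Company"),
    ("ex:SunWebsite", "ex:type", "ex:WebPage"),
    ("ex:SunCorp", "owl:sameAs", "ex:SunWebsite"),
}

inferred = infer_same_as(facts)
```

After inference the company is also typed as a web page and the web
page as a company; nothing at the URI level says that one of the two
was only ever meant as a locator.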

I guess Mike doesn't like RDF *nor* Topic Maps now. :)


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-11 Thread Alexander Johannesen
On Mon, May 11, 2009 at 19:34, Jonathan Rochkind  wrote:
> In the real world, we use things when they solve the problem in front of us
> in as easy a way as possible

And somehow you're suggesting that I don't live in the real-world? :)
Good try, but as far as I've experienced, people in the library world
live quite a distance away from the real one.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-11 Thread Alexander Johannesen
On Mon, May 11, 2009 at 16:04, Rob Sanderson  wrote:
> * One namespace is used to define two _totally_ separate sets of
> elements.  There's no reason why this can't be done.

As opposed to all the reasons for not doing it. :) This is crap design
of a higher magnitude, and the designers should be either a) whipped
in public and thrown out in shame, or b) repent and made to fix the
problem. Even I would opt for the latter, but such a simple task not
being done seems to suggest that perhaps the former needs to be put in

> * One namespace defines so many elements that it's meaningless to call
> it a format at all.  Even though the top level tag might be the same,
> the contents are so varied that you're unable to realistically process
> it.

Yeah, don't use MODS in general; it's a hack. It's even crazier still
that many versions have the same namespace. What were they thinking?!

Anyway, even if the namespace is botched, you can still (if I dare
go by the Topic Maps moniker) have multiple namespaces for the same
subject (the format in question), and simply publish and use your own
and let the TM mechanics handle the ambiguity for you. If enough
people do this, and perhaps even use your unofficial identifiers,
maybe LOC will see the errors of their ways and repent.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-08 Thread Alexander Johannesen
On Sat, May 9, 2009 at 00:32, Jonathan Rochkind  wrote:
> I don't understand from your description how Topic Maps solve the
> "identifying multiple versions of a standard" problem.

It's the mechanism of having multiple identifiers for Topics, so, in pseudo ;

Topic "MARC21"
  psi "info:ofi/fmt:xml:xsd:MARC21"
  psi "";
  property #mime-type "whatever for the binary"

Topic "MARC 1.1"
  is_a "MARC"
  psi "info:srw/schema/1/marcxml-v1.1"
  psi "";
  property #mime-type "whatever 1.1"

Topic "MARC 1.2"
  is_a "MARC"
  psi "info:srw/schema/1/marcxml-v1.2"
  psi "";
  property #mime-type "whatever 1.2"

Or, if "MARC 1.2" is backwards compatible with 1.1 ;

Topic "MARC 1.2"
  is_a "MARC 1.1"
  psi "info:srw/schema/1/marcxml-v1.2"

Or, if I make my own unofficial version ;

Topic "MARC 2.0"
  is_a "MARC 1.2"
  psi "";

This is enough to cobble together what is and isn't compatible in
types of formats, so if your application is Topic Maps aware, this
should be trivial (including what format to ignore or react to). The
point is that you don't need *one* identifier for things; Topics are
proxies for knowledge, and part of the notion of "knowledge" is what
identifies that knowledge. Multiple PSIs help us leverage both rigid
and fuzzy systems.
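
For those who prefer something runnable to my pseudo notation, here is
a rough Python sketch of the same idea. It is not a Topic Maps engine;
Topic, find_topic() and accepts() are names made up for the example,
the example.org PSI for the unofficial "MARC 2.0" is hypothetical, and
I'm assuming the backwards-compatible variant where each version is_a
the previous one, with "MARC" read as the "MARC21" topic.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    name: str
    psis: set = field(default_factory=set)
    is_a: Optional[str] = None  # name of the parent topic, if any


TOPICS = {
    topic.name: topic
    for topic in [
        Topic("MARC21", {"info:ofi/fmt:xml:xsd:MARC21"}),
        Topic("MARC 1.1", {"info:srw/schema/1/marcxml-v1.1"}, is_a="MARC21"),
        Topic("MARC 1.2", {"info:srw/schema/1/marcxml-v1.2"}, is_a="MARC 1.1"),
        # Unofficial version with a made-up identifier.
        Topic("MARC 2.0", {"http://example.org/psi/marc-2.0"}, is_a="MARC 1.2"),
    ]
}


def find_topic(psi):
    """Return the topic carrying this PSI, or None if nobody has published it."""
    for topic in TOPICS.values():
        if psi in topic.psis:
            return topic
    return None


def accepts(supported_psi, incoming_psi):
    """True if the incoming format is the supported one or descends from it via is_a."""
    supported = find_topic(supported_psi)
    topic = find_topic(incoming_psi)
    if supported is None:
        return False
    while topic is not None:
        if topic.name == supported.name:
            return True
        topic = TOPICS.get(topic.is_a)
    return False
```

An application that only knows the 1.1 identifier can then call
accepts("info:srw/schema/1/marcxml-v1.1", incoming) and react to
anything declared compatible with it, while ignoring the rest.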

As to the identifiers themselves (as in, the formatting), is that important?

Anyway, I suspect I don't see what the problem is. To create "the best
identifier" for things seems a bit of a strange notion to me, but is
this based on the assumption that there is only (or rather, that
you're trying to create) one identifier for any one thing?

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-07 Thread Alexander Johannesen
On Wed, May 6, 2009 at 18:44, Mike Taylor  wrote:
> Can't you just tell us?

Sorry, but surely you must be tired of me banging on this gong by now?
It's not that I don't want to seem helpful, but I've been writing a
bit on this here already and don't want to be marked as spam for Topic

In the Topic Maps world our global identifiers are called PSIs, for
Published Subject Indicators. There are a few subtleties within this,
but they are not so different from any other identifier you'll find
elsewhere (RDF, library world, etc.) except of course they are
*always* URIs. Now, the thing here is that they should *always* be
published somewhere, whether as part of a list or elsewhere. The
next thing is that they should always resolve to something (although
the standard doesn't require this, I'd say you're doing it wrong
if they don't, even if that is sometimes a necessary evil).

This last part is really the important bit, where any PSI will 1) act
as a global identifier, and 2) resolve to human-readable text explaining
what it represents. Systems can "just use it" while at the same time
people can choose the right ones for their uses.

And, yes, the identifiers can be done any way you slice them. Some
might think that e.g. a PSI set for all dates is crazy as you need to
produce identifiers for all dates (or times), and that would be
just way too much to deal with, but again, that's not an identification
problem, that's a resolver problem. If I can browse to a PSI and get
the text that "this is 3rd of June, 19971, using the whatsnot calendar
style", then that's safe for me to use for my birthday. Let's pretend
the PSI is By releasing a URI
template computers can work with this automatically, no frills.
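
As a sketch of what "releasing a URI template" could look like in
practice, assuming a made-up publisher at psi.example.org and ISO
dates in the path (neither is from any real PSI set), the whole
"resolver problem" fits in a few lines of Python. All three function
names are hypothetical.

```python
from datetime import date

# Hypothetical published template; only the date part varies.
PSI_PREFIX = "http://psi.example.org/date/"


def date_to_psi(day):
    """Build the PSI for a date by filling in the template."""
    return PSI_PREFIX + day.isoformat()


def psi_to_date(psi):
    """Recover the date from a PSI minted with the template above."""
    if not psi.startswith(PSI_PREFIX):
        raise ValueError("not a PSI from this set: " + psi)
    return date.fromisoformat(psi[len(PSI_PREFIX):])


def describe(psi):
    """The human-readable text a resolver could serve for any PSI in the set."""
    return "This identifier stands for the date " + psi_to_date(psi).isoformat() + "."
```

Nobody has to mint the identifiers ahead of time; the resolver works
out the answer from the URI itself, so a PSI exists for every date the
moment the template is published.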

Now a bit more technical; any topic (which is a Topic Map
representation of any subject, where "subject" is defined as "anything
you can ever hope to think of") can have more than one PSI, because I
might use the PSI for my date.
If my application only understands this former set of PSIs, I can't
merge and find similar cross-semantics (which really is the core of
the problem this thread has been talking about). But simply attach the
second PSI to the same Topic, and you do. In fact, both parties will
understand perfectly what you're talking about.
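
The merging idea can be sketched in a few lines of Python. This is an illustration under assumptions, not how any particular Topic Maps engine does it: the `Topic` class, the `merge_topics` helper and the example.org PSIs are all made up for the example. Two topics are treated as the same subject as soon as they share at least one PSI.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """A toy topic: a set of names and a set of PSIs (hypothetical model)."""
    names: set = field(default_factory=set)
    psis: set = field(default_factory=set)


def merge_topics(topics):
    """Merge any topics that share at least one PSI."""
    merged = []
    for topic in topics:
        current = Topic(set(topic.names), set(topic.psis))
        remaining = []
        for other in merged:
            if other.psis & current.psis:
                # Shared PSI: same subject, so fold the two together.
                current.names |= other.names
                current.psis |= other.psis
            else:
                remaining.append(other)
        remaining.append(current)
        merged = remaining
    return merged


if __name__ == "__main__":
    a = "http://psi.example.org/org-a/date"  # hypothetical PSI from one party
    b = "http://psi.example.org/org-b/date"  # hypothetical PSI from another
    separate = [Topic({"my date"}, {a}), Topic({"same date, other scheme"}, {b})]
    print(len(merge_topics(separate)))                        # two topics
    print(len(merge_topics(separate + [Topic(set(), {a, b})])))  # one topic
```

Attaching the second PSI to a topic that already carries the first is what bridges the two vocabularies; neither party has to change the identifiers it already uses.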

More complex is that the definitions of PSI sets don't have to
happen on the subject level, i.e. the Topic called "Alex" to which I
tried to attach my birthday. It can be moved to a meta model level,
where you say the Topic for "Time and dates" has the PSIs for both
organisations, and all Topics just use one or the other; we're
shifting the explicitness of identification up a notch.

Having multiple PSIs might seem a bit unordered, but it's based on the
notion of organic growth, just like the web. People will gravitate
towards using PSIs from the most trusted sources (or most accurate or
most whatever), shifting identification schemes around. This is a good
thing (organic growth) at the price of multiple identifiers, but if
the library world started creating PSIs, I betcha humanity and the
library world both could be saved in one fell swoop! (That's another
gong I like to bang)

I'm kinda anticipating Jonathan saying this is all so complex now. :)
But it's not really; your application only has to have complexity in
the small meta model you set up, *not* for every single Topic you've
got in your map. And they're mergable and shareable, and as such can
be merged and "fixed" (or cleaned or sobered or made less complex) for
all your various needs also.

Anyway, that's the basics. Let me know if you want me to bang on. :)
For me, the problem the library world faces isn't really the mechanisms of
this (because this is solvable, and I guess you just have to trust
that the Topic Maps community has been doing this for the last 10
years or so already :), but how you're going to fit existing
resources into FRBR and RDA. But that's a separate discussion.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Another nail in the coffin

2009-05-04 Thread Alexander Johannesen
On Mon, May 4, 2009 at 23:25, Joe Hourcle  wrote:
> You're forgetting the 5th Law:
>        The library is a growing organism.

Not forgotten, I just don't believe it anymore. And, taken to its
natural consequence, organisms through evolution come and go. :)


Re: [CODE4LIB] Another nail in the coffin

2009-05-04 Thread Alexander Johannesen
On Mon, May 4, 2009 at 22:44, Andreas Orphanides
> You say that as though libraries are all about books.

Libraries still have the word "biblio" as their primer, and it
certainly is the written word on paper that occupies most of our time,
no? Sure libraries around the world are trying to play catch-up in the
digital and modern world with all sorts of things, but the primary
directive is still "books" for most librarians. Not sure what you mean
they're *really* into?


[CODE4LIB] Another nail in the coffin

2009-05-03 Thread Alexander Johannesen
Another nail in the library coffin, especially the academic ones ;

Organisations and people are slowly turning into data producers, not
book producers.


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-03 Thread Alexander Johannesen
With Topic Maps it's been solved years and years ago, and it's the
part of it that the RDF world didn't think of until recently (and
applied their kludges). I'm not going to bang my gong on this, just
urge you to read up on PSIs.


Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-15 Thread Alexander Johannesen

On Thu, Apr 16, 2009 at 01:10, Jonathan Rochkind  wrote:
> It stands in the way of using them in the fully realized sem web vision.

Ok, I'm puzzled. How? As the SemWeb vision is all about first-order
logic over triplets, and the triplets are defined as URIs, if you can
pop something into a URI you're good to go. So how is it that SuDoc
doesn't fit into this, as you *can* chuck it in a URI? I said it was
unfriendly to the Web, not impossible.

> It does NOT stand in the way of using them in many useful ways that I can
> and want to use them _right now_.

Ah, but then go fix it.

> Ways which having a URI to refer to them
> are MUCH helped by. Whether it can resolve or not (YOU just made the point
> that a URI doesn't actually need to resolve, right? I'm still confused by
> this having it both ways -- URIs don't need to resolve, but if your URIs
> don't resolve then you're doing it wrong. Huh?)

C'mon, it ain't *that* hard. :) URIs as identifiers is fine, having
them resolve as well is great. What's so confusing about that?

> , if you have a URI for a
> SuDoc you can use it in any infrastructure set up to accept, store, and
> relate URIs. Like an OpenURL rft_id, and, yeah, like RDF even.  You can make
> statements about a SuDoc if it has a URI, whether or not it resolves,
> whether or not SuDoc itself is 'web friendly'.  One step at a time.
> This is my frustration with semantic web stuff, making it harder to do
> things that we _could_ do right here and now, because it violates a fantasy
> of an ideal infrastructure that we may never actually have.

Huh? The people who made SuDoc didn't make it web friendly, and thus
the SemWeb stuff is harder to do because it lives on the web? (And
chucking your meta data into HTML as MF or RDF snippets ain't that
hard, it just requires a minimum of knowledge)

> There are business costs, as well as technical problems, to be solved to
> create that ideal fantasy infrastructure. The business costs are _real_

No more real than the cost currently in place. The thing is that a lot
of people see the traditional cost disappear with the advent of SemWeb
and the new costs heavily reduced.

>>  Also, having a unified resolver for
>> SuDoc isn't hard, can be at a fixed URL, and use a parameter for
>> identifiers. You don't need to snoop the non-parameterized section of
>> an URI to get the ID's ;
> Okay, Alex, why don't you set this up for us then?

Why? I don't give a rat's bottom about SuDoc, don't need it, think it's
poorly designed, and gives me nothing in life. Why should I bother?
(Unless I'm given money for it, then I'll start caring ... :)
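
For what it's worth, the "fixed URL plus a parameter" idea quoted above is small enough to sketch in Python. Everything specific here is an assumption: resolver.example.org is a hypothetical endpoint (no such service is named in this thread), the `id` parameter name is invented, and the SuDoc-style string in the usage lines is only illustrative. The sketch shows the identifier travelling in a query parameter, so a client never has to snoop the rest of the URI to get it back out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical resolver endpoint; an assumption for illustration only.
RESOLVER = "http://resolver.example.org/sudoc"


def sudoc_uri(sudoc: str) -> str:
    """Wrap a SuDoc number in a URI, form-encoding the awkward characters."""
    return f"{RESOLVER}?{urlencode({'id': sudoc})}"


def sudoc_from_uri(uri: str):
    """Return the SuDoc number carried in the 'id' parameter, or None."""
    parts = urlsplit(uri)
    if f"{parts.scheme}://{parts.netloc}{parts.path}" != RESOLVER:
        return None  # not a URI minted by this (hypothetical) resolver
    values = parse_qs(parts.query).get("id")
    return values[0] if values else None


if __name__ == "__main__":
    uri = sudoc_uri("Y 1.1/7:109-118")  # illustrative SuDoc-style string
    print(uri)
    print(sudoc_from_uri(uri))
```

The spaces, slash and colon that make SuDoc unfriendly to the Web are handled by the encoding, which is the sense in which it is unfriendly rather than impossible.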

> And commit to providing
> it persistently indefinitely? Because I don't have the resources to do that.

Who's behind SuDoc, and are they serious about their creation? Those are
the people you should direct your anger at instead.

>  And for the use cases I am confronted with, I don't _need_ it, any old URI,
> even not resolvable, will do--yes, as long as I can recognize it as a SuDoc
> and extract the bare SuDoc out of it.

So what's the problem with just making some stuff up? If you can do
your thing in a vacuum I don't fully understand your problem with the
SemWeb stuff? If you don't want it, don't use it.

> Which you say I shouldn't be doing
> (while others say that's a mis-reading of those docs to think I shouldn't be
> doing it)

No, I think this one is the subtle difference between a URL and a URI.

> but avoiding doing that would raise the costs of my software
> quite a bit, and make the feature infeasible in the first place. Business
> costs and resources _matter_.

As with anything on the Web, you work with what you got, and if you
can fix and share your fix, we all will love you for it. I seriously
don't think I understand what you're getting at here; it's been this
way since the Web popped into existence, and I don't really want it to
be any other way.

>> No it's not; if you design your system RESTfully (which, indeed, HTTP
>> is) then the discovery part can be fast, cached, and using URI
>> templates embedded in HTTP responses, fully flexible and fit for your
>> purposes.
> These URIs are
> _external_ URIs from third parties, I have no control over whether they are
> designed RESTfully or not.

Not sure I follow this one. There are no good or bad RESTful URIs,
just URIs. REST is how your framework works with the URIs.

> In the meantime, I'll continue trying to balance functionality,
> maintainability, future expansion, and the programming and hardware
> resources available to me, same as I always do, here in the real world when
> we're building production apps, not R&D experiments

My day job is to balance functionality, maintainability, future
expansion, and the programming and hardware resources available to me,
same as I always do, here in the real world when we're building
production apps ... and I'm using Topic Maps and SemWeb technologies.
Is there something I'm doing which degrades my work to an "R&D

Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 00:20, Jonathan Rochkind  wrote:
> Can you show me where this definition of a "URL" vs. a "URI" is made in any 
> RFC or standard-like document?

From ;

1.1.3.  URI, URL, and URN

   A URI can be further classified as a locator, a name, or both.  The
   term "Uniform Resource Locator" (URL) refers to the subset of URIs
   that, in addition to identifying a resource, provide a means of
   locating the resource by describing its primary access mechanism
   (e.g., its network "location").  The term "Uniform Resource Name"
   (URN) has been used historically to refer to both URIs under the
   "urn" scheme [RFC2141], which are required to remain globally unique
   and persistent even when the resource ceases to exist or becomes
   unavailable, and to any other URI with the properties of a name.

   An individual scheme does not have to be classified as being just one
   of "name" or "locator".  Instances of URIs from any given scheme may
   have the characteristics of names or locators or both, often
   depending on the persistence and care in the assignment of
   identifiers by the naming authority, rather than on any quality of
   the scheme.  Future specifications and related documentation should
   use the general term "URI" rather than the more restrictive terms
   "URL" and "URN" [RFC3305].

As you can see, a URI is an identifier, and a URL is a locator
(a mechanism for retrieval), and since URLs are a subset of URIs, you
_can_ resolve URIs as well.
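
The classification in the quoted section can be illustrated with a rough Python sketch. The `classify` helper and the `LOCATOR_SCHEMES` set are assumptions made for the example (the set is simply "schemes this sketch would know how to fetch"); as the quoted text says, a scheme does not strictly have to be one or the other, so treat the labels as a simplification.

```python
from urllib.parse import urlsplit

# Assumption: the schemes this sketch knows an access mechanism for.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify(uri: str) -> str:
    """Label a URI by what its scheme lets a client do with it."""
    scheme = urlsplit(uri).scheme
    if not scheme:
        return "no scheme"
    if scheme == "urn":
        return "name"
    if scheme in LOCATOR_SCHEMES:
        return "locator"
    return "identifier only"


if __name__ == "__main__":
    for uri in ("http://example.org/psi/alex",
                "urn:isbn:0451450523",
                "info:lccn/2002022641"):
        print(uri, "->", classify(uri))
```

All three strings are URIs and can be used as identifiers; only the first also tells a client how to go and fetch something.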

> Sure, we have a _sense_ of how the connotation is different, but
> I don't think that sense is actually formalized anywhere.

It is, and the same stuff is documented in WikiPedia as well ;

> I think the sem web crowd actually embraces this confusingness,

No, I think they take it at face value; they (the URIs) are
identifiers for things, and can be used for just that purpose, but
they are also URLs, which means they resolve to something. What I think
you're getting at is the "something" it resolves to, as *that*
has no definition. But then, if you go from RDF to Topic Maps PSIs
(PSIs are URIs with an extended meaning), *that* thing it resolves to
indeed has a definition; it's the prose explaining what the identifier
identifies, and this is the most important difference between RDF and
Topic Maps (and a very subtle but important difference, too).

> they want to have it both ways: Oh, a URI doesn't need to resolve,
> it's just an opaque identifier; but you really should use http URIs
> for all URIs; why? because it's important that they resolve.

I smell straw-man. :) But yes, they do want both, as both is in fact a
friggin' smart thing to have. We all deal with identifiers all the
time, in internal as well as external applications, so why not use an
identifier scheme that has the added bonus of a resolver
mechanism? If you want to be stupid and lock yourself in your limited
world, then using them as just identifiers is fine but perhaps a bit,
well, stupid. But if you want to be smart about it, realizing that
without ontological work there will *never* be proper interop, you use
those identifiers and let them resolve to something. And if you're
really smart, you let them resolve to either more RDF statements, or,
if you're seriously Einsteinly smart, use PSIs (as in Topic Maps) :).

> In general, combining two functions in one mechanism is a
> dangerous and confusing thing to do in data design, in my opinion.

Because ... ?

> By analogy, it's what gets a lot of MARC/AACR2 into trouble.

Hmm, and I thought it was crap design that did that, coupled with poor
metadata constraints and validation channels, untyped fields, poor
tooling, the lack of machine understandability, and the general
library idiom of "not invented here". But correct me if I'm wrong. :)

> Over in:

Umm, I'd be wary of taking as canon a draft with editorial notes going
back 4 to 5 years that still aren't resolved. In other words, this
document isn't relevant to the real world. Yet.

> They suggest: "URI opacity    'Agents making use of URIs SHOULD NOT attempt 
> to infer properties of the referenced resource.'"

Well, as a RESTafarian I understand this argument quite well. It's
about not assuming too much from the internal structure of the URI.
Again, it's an identifier, not a scheme such as a URL where structure
is defined. Again, for URIs, don't assume structure because at this
point it isn't a URL.

> If I get a URI representing (eg) a Sudoc (or an ISSN, or an LCCN), I need to
> be able to tell from the URI alone that it IS a Sudoc, AND I need to be able
> to extract the actual SuDoc identifier from it.  That completely violates 
> their
> Opacity requirement

I think you are quite mistaken on this, but before we leap into whether
the we

Re: [CODE4LIB] Something completely different

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 10:32, stuart yeates  wrote:
> Yes, we mint something very similar (see
> for mine), but none of our interoperability partners do. None of our local
> libraries, none of our local archives and only one of our local museums (by
> virtue of some work we did with them).
> All of them publish and most consume some form RDF.

Hmm, RDF resources are just URIs, so I'm still a bit unsure about what
you mean. Are you talking about the fact that the RDF definitions (and
not the RDF vocabs themselves) aren't encoded in your TM engine?

> Additionally many of the taxonomies we're interested in are available in RDF
> but not topic maps.

Converting them to a Topic Map isn't that hard to do, but I guess
there is *a* cost there.



Re: [CODE4LIB] Something completely different

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 07:10, stuart yeates  wrote:
> RDF, unlike topic maps, is being used by substantial numbers of people who
> we interact with in the real world and would like to interoperate with. If
> we used RDF rather than topic maps internally, that interoperability would
> be much, much cheaper. It's tempting to say it's free, but it's not quite,
> because it does impose some constraints.

But it's not that hard to create a bridge from RDF to Topic Maps and
back, no? Or is your interop story different?
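
As a rough idea of what such a bridge involves, here is a deliberately simplified Python sketch. It is an assumption-laden illustration, not a standard mapping: the dictionary layout, the `triples_to_topics` helper and the example.org URIs are all hypothetical. Every URI becomes a topic carrying that URI as its subject identifier, literal-valued statements become occurrences, and URI-valued statements become associations. A real bridge would also have to deal with names, association roles and reification.

```python
def triples_to_topics(triples):
    """Map (subject, predicate, object, object_is_uri) tuples to a toy topic map.

    Returns (topics, associations); both structures are hypothetical.
    """
    topics = {}
    associations = []

    def topic_for(uri):
        # One topic per URI, with the URI as its subject identifier.
        return topics.setdefault(
            uri, {"subject_identifiers": {uri}, "occurrences": []}
        )

    for subject, predicate, obj, obj_is_uri in triples:
        subject_topic = topic_for(subject)
        topic_for(predicate)  # the predicate becomes a typing topic
        if obj_is_uri:
            topic_for(obj)
            associations.append({"type": predicate, "players": (subject, obj)})
        else:
            subject_topic["occurrences"].append({"type": predicate, "value": obj})
    return topics, associations


if __name__ == "__main__":
    ex = "http://example.org/"
    topics, associations = triples_to_topics([
        (ex + "alex", ex + "knows", ex + "stuart", True),
        (ex + "alex", ex + "name", "Alex", False),
    ])
    print(sorted(topics))
    print(associations)
```

Going the other way (topics back to triples) is the mirror image, which is why the cost of a bridge is mostly in agreeing on the mapping rather than in the code.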

> In my eyes, the core thing that RDF supports that topic maps don't seem to
> is seamless reuse by people you don't care about.

Yes, this has been brought up on several occasions, including by me at
the TMRA 2008. But then, it's not so much that RDF does something that
Topic Maps doesn't *support*, it's that it's packaged differently. So,
where RDF has got five standard ontology levels (RDF, RDFS, OWL
DL/Lite/Full) Topic Maps has one simpler one (TMDM), yet neither can
express anything better or differently than the other.

My theory here is that people *like* 5 layers of RDF, because it gives
the false sensation of choice. But it's all ontological definitions.
However, the 5 levels of RDF do indeed create a defined platform for
sharing (if not cast in iron), whereas in the TM world you need to
include it / create it yourself.

Oh, and of course the academics seem to have embraced W3C and anything
by the authority of TBL, and its effect is trickling down.

> For example the people at have never heard of us (that
> I know of), but we can use their URLs like
> to represent our roles.

Not sure I understand your example. Here's my Topic Map identifier in
a Topic Map ;

Identifier and locator, and resolvable, and can be used by anyone.



Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-14 Thread Alexander Johannesen
On Tue, Apr 14, 2009 at 23:34, Jonathan Rochkind  wrote:
> The difference between URIs and URLs?  I don't believe that "URL" is 
> something that exists any more in any standard, it's all URIs. Correct me if 
> I'm wrong.

Sure it exists: URLs are a subset of URIs. URLs are locators as
opposed to "just" identifiers (which is an important distinction, much
used in SemWeb lingo), where URLs are closer to the "protocol like"
things Ray describes (or so I think).

> I don't entirely agree with either dogmatic side here, but I do think that 
> we've arrived at an
> awfully confusing (for developers) environment.

But what about it is confusing (apart from us having this discussion
:) ? Is it that we have IDs that happen to *also* resolve? And why is
that confusing?

> Re-reading the various semantic web TAG position papers people keep
> referencing, I actually don't entirely agree with all of their principles in 
> practice.

Well, let me just say that there's more to SemWeb than what comes out of W3C. :)

Kind regards,


Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-14 Thread Alexander Johannesen

Been meaning to jump into this discussion for a while, but I've been
off to an alternative universe and I can't even say it's good to be
back. :) Anywhoo ...

On Fri, Apr 3, 2009 at 03:48, Ray Denenberg, Library of Congress
> You're right, if there were a "web:"  URI scheme, the world would be a
> better place.   But it's not, and the world is worse off for it.

I'm rather confused by this statement. The "web:" URI scheme? The Web
*is* the URI scheme; they are all identifiers to resources (ftp: http:
gopher: https: etc.), and together they make up, the, um, web of
things. What am I missing?

> Back in the old days, URIs (or URLs)  were protocol based.

No, which one do you mean, URIs or URLs?

> The ftp scheme
> was for retrieving documents via ftp. The telnet scheme was for telnet. And
> so on.

Again, have I missed something? This has changed, as opposed to the
good old days?

> A few years later the semantic web was conceived and alot of SW people began
> coining all manner of http URIs that had nothing to do with the http
> protocol.

I've been browsing back and forth through this discussion, and couldn't find
much to back this up. What do you mean by this?

> Instead, they should have bit the bullet and coined a new scheme.  They
> didn't, and that's why we're in the mess we're in.

I'm sorry, but "mess"? Did you know the messiness of the web is
probably what made it successful? Not to mention that having URIs be
identifiers *and* have the ability to resolve them is a bonus; they're
identifiers of things (as they've always been, as I'm sure you know
URI stands for Uniform Resource Identifier, right? :), as in they
consist of a string of characters used to identify or name a resource
on the Internet. And then, if you so choose, you can use the protocol
level to *resolve* them. Not sure how anyone can consider this to be
bad, though.

Or is this just a misunderstanding of the difference between URIs and URLs?

Kind regards,


Re: [CODE4LIB] Something completely different

2009-04-08 Thread Alexander Johannesen
On Thu, Apr 9, 2009 at 14:33, stuart yeates  wrote:
> That's not an entirely useful comparison on topic maps and RDF.

If I intended to be useful I'd write something substantial, backed up
with stuff other than humour. I'll give that a go the next time. :)

> We currently use topic maps, alot, in our infrastructure. If we were
> starting again tomorrow, I'd advocate using RDF instead, mainly because of
> the much better tool support and take-up.

Hmm, not a good thing at all. Could you elaborate, though, as I use it
as part of infrastructure too, and wouldn't touch RDF / SemWeb
without a long stick? I'm into application semantics and shared
knowledge-bases. What are you guys doing where you feel the support
and tools are lacking? And what are the RDF alternatives?



Re: [CODE4LIB] Something completely different

2009-04-08 Thread Alexander Johannesen
On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson  wrote:
> I would encourage looking at rdf triplestores seriously, if the graph
> approach is the direction that you want to go in.

Or, Topic Maps, which is *not* a triplestore, is closer to the OO model
(basically a meta data model), and doesn't carry the stack "overflow" of
RDF (RDF, RDFS, OWL 1-2-3) nor anonymous nodes. :)



Re: [CODE4LIB] points of failure (was Re: [CODE4LIB] resolution and identification )

2009-04-02 Thread Alexander Johannesen
On Fri, Apr 3, 2009 at 10:44, Mike Taylor  wrote:
> Going back to someone's point about living in the real
> world (sorry, I forget who), the Inconvenient Truth is that 90% of
> programs and 99% of users, on seeing an http: URL, will try to treat
> it as a link.  They don't know any better.

What on earth is this about? URIs *are* links; it's in their design, it's
what they're supposed to be. Don't design systems where they are treated
any differently. Again we're seeing that "all we need are URIs" poor
judgement of SemWeb enthusiasts muddling the waters. The short of it
is, if you're using URIs as identifiers, having the choice to
dereference it is a *feature*; if it resolves to 404 then tough (and
I'd say you designed your system poorly), but if it resolves to an
information snippet about the semantic meaning of that URI, then yay.
This is how us Topic Mappers see this whole debacle and flaw in the
SemWeb structure, and we call it Published Subject Indicators, where
"Published" means it resolves to something (just like WikiPedia URIs
resolve to some text that explains what it is representing),
"Subjects" are anything in the world (but distinct from Topics which
are software representations), and "Indicators" as they indicate
(rather than absolutely identify) things.

In other words, if you use URIs as identifiers (which is a *good*
thing), then resolvability is a feature to be promoted, not something
to be shunned. If you can't make good systems design, use URNs. You
can treat URI identifiers as both identifiers and subject indicators,
while URNs are evil.

> Let's make our identifiers look like identifiers.

What does that even mean? :)

> (By the way, note that this is NOT what I was saying back at the start
> of the thread.  This means that I have -- *gasp* -- changed my mind!
> Is this a first on the Internet?  :-)

Maybe, but it surely will be the last ...


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-13 Thread Alexander Johannesen
>> One question we haven't asked is if we really need a MIME type for
>> MARCXML. :)

On Thu, Feb 12, 2009 at 23:28, Jonathan Rochkind  wrote:
> PPS: Yes, it has been asked, and it's pretty obvious to me that we do.

I wasn't asking for technical reasons; I was more having a stab at how
many people use and need MARCXML specifically as compared to a number
of other more used formats. I mean, seriously, you can use MARCXML
embedded in Atom and get the best of both worlds instead.

Don't worry about it; it's not a serious _enough_ question. :)


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-12 Thread Alexander Johannesen
On Thu, Feb 12, 2009 at 22:32, Jonathan Rochkind  wrote:
> Didn't we finish having this conversation last week? We talked about all
> this stuff being brought up now last week.

We did indeed, and your summary is better than what my retort could
have been; spot on.

I guess it's hard to understand why text/xml is such a waste of MIME
and time as long as we've still got text/html as the original understood
MIME for HTML pages, but luckily the internet has moved on and
evolved. :)

One question we haven't asked is if we really need a MIME type for MARCXML. :)


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-12 Thread Alexander Johannesen
On Thu, Feb 12, 2009 at 21:43, Rebecca S Guenther  wrote:
> Patrick is right that an XML schema such as MODS or MARCXML would be text/xml.

I would strongly advise against text/xml, as it is an oxymoron (text
is not XML; XML is not text even if it is delivered through a text
protocol), and more and more are switching away from the generic text
protocol (which makes little sense for structured data).

Hence, a more correct MIME type for MARCXML would be
application/marc+xml, although until registered should be
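
One practical upside of a "+xml" style name is that generic tooling can tell the payload is XML without knowing the specific format. A small Python sketch of that check follows; the helper names are made up for the example, and application/marc+xml is used only because it is the name proposed in this message, not because it is claimed to be registered.

```python
def parse_media_type(value: str):
    """Split 'type/subtype+suffix; params' into (type, subtype, suffix)."""
    essence = value.split(";", 1)[0].strip().lower()
    main, _, sub = essence.partition("/")
    base, plus, suffix = sub.rpartition("+")
    if not plus:
        return main, sub, None  # no structured-syntax suffix present
    return main, base, suffix


def is_xml(value: str) -> bool:
    """True for the generic XML types and for anything with a +xml suffix."""
    main, sub, suffix = parse_media_type(value)
    return suffix == "xml" or (main in {"text", "application"} and sub == "xml")


if __name__ == "__main__":
    for mime in ("application/marc+xml", "text/xml; charset=utf-8",
                 "application/atom+xml", "application/marc"):
        print(mime, "->", is_xml(mime))
```

A client that has never heard of MARC can still hand such a payload to an XML parser, while one that has can dispatch on the part before the "+".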



2009-01-29 Thread Alexander Johannesen
Hi there,

On Thu, Jan 29, 2009 at 15:55, Rebecca S Guenther  wrote:
> Yes, better late than never (we're a small office and stretched thin).

You're not *that* small, no? :)

> Also we want to explore MARC/RDF. We also have to keep in mind
> that MARC is also used by non-AACR2 users (and when RDA is
> implemented non-RDA users).

Shouldn't the library world slowly work towards a common set of rules,
backed by technology, to make it easier for us all to move forward
with less pain?

> As a starting point in exploring semantic web types
> of technologies we are establishing a registry for controlled values
> used in various standards-- MARC, MODS, PREMIS. See the text at:

Ah, I like! This is very close to the concept in Topic Maps of
Published Subject Indicators. Could the identifiers within have a
certain degree of persistence and resolvability? If so, both the
SemWeb and TM communities could use this out of the box. I also think
the DC RDA working-group has something similar. Karen? And should you
work together?

> In the meantime we have a prototype at:

Can't make much work there. Must be in alpha. :) But I like this
direction. If you now can get the vendors on-board, or better, make
more SemWeb systems yourselves, that's a *huge* step forward. I'm
*very* excited to see this coming from LoC.




2009-01-28 Thread Alexander Johannesen
On Wed, Jan 28, 2009 at 20:29, Rebecca S Guenther  wrote:
> It is interesting though that a study of different metadata
> formats at Los Alamos National Labs a few years ago
> concluded that MARCXML was the richest and most robust.

Umm, I just have to add that all those compared won't make it to my
top 10 list of good formats, so, er, comparing library formats against
each other is a bit like comparing all the wonderful juicy fruit in
the world where your selection is limited to what can grow in Alaska.

It still amazes me that RDF and / or DC hidden in SRDF or Topic Maps
haven't gotten any traction when it seriously matches what you want.

> We are also working on modeling MODS as RDF-- some
> work has already been done on this.

That is good news, albeit a little late and certainly a little slow.
But I hear good things about Talis moving into this arena, and
hopefully they can pull a few other vendors with them. I guess the
first thing that is needed is a basic MARC / RDF vocabulary we can all
participate in and extend, and then cross-pollinate vocabularies as we
move away from AACR2 to more RDA / FRBR friendly stuff (although, me
personally, I would jump way ahead of RDA, but that's not going to

> In terms of MARC, we are planning for its evolution and streamlining to
> get rid of some of its problems and plan for a future where the transition
> to new cataloging rules will work well with existing records and cataloging
> infrastructure.

Are you talking about RDA here? And when will these changes happen, in
what form, how do you build momentum and expertise, etc.?

> Whatever the format of the future is, the transition will need
> to be evolutionary because of the billions of records that are
> out there and the need to satisfy a lot of the user tasks
> required of library (and other) metadata.

I agree fully, although I'd stress the poor infrastructure as a
reason more than records available (they can always be converted into
something else, but you can't easily change how systems require

> It is also worth noting that despite some calls for a MARC
> replacement, we have a number of national libraries
> throughout the world that are abandoning their national
> formats and just now adopting MARC 21. They also need
> to be considered in this transition.

I find it a bit scary it's taken this long, but I certainly welcome
the change as it makes it easier to move from one format to the other
once we all agree on a fundamental platform. But I still don't think a
clear direction forward is set. Any docos you can point to about the
future direction of LoC approved meta data exchange?



Re: [CODE4LIB] marc21 and usmarc

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 18:56, Kyle Banerjee  wrote:
> There are arguments to do so, but the business case is not strong.

Well, I'd say the future of the library world is a good business case,
and I know several people (high and low) fully aware of it, but I
think it's hard to take any step in either direction that would be
deemed worth it. Tough one, indeed.

> That data providers won't send MODS until libraries demand it.
> Libraries won't demand it until their systems use it. Systems won't
> use it until libraries demand it because that's what their data
> providers require.

Well, I've been yelling for vendors to get more involved for a long
time, but there's a lot of blankness coming from them. I guess they're
happy with the current tie to MARC (binding the libraries to them
forever) until the business is gone ...

> It's a vicious circle, so we're stuck with MARC. The only people who
> aren't happy with this arrangement are those who are trying to create
> something new. Many librarians who think they use MARC every day
> have no idea that it is a binary format that is unfriendly to eyes and
> machines.

MARC may be MAchine Readable, but not MAchine Understandable or even
MAchine Usable.

I had an idea some time ago to create a dummy / fake MARC record with
much more to it (like extensions and special tags systems can react
to, such as validation) and pass it around the infrastructure to see
what in it survives (the golden rule is to ignore what you don't
understand, although I know a few MARC systems who filter out what
they don't understand (!!!) because, well, these systems were mostly
built back when a megabyte of storage and / or memory had a price of
about a cataloger or two. Friggin' crazies!). Anyone in? :)
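
The experiment could be prototyped before bothering any real system. The Python sketch below simulates it under heavy assumptions: records are flattened to a tag-to-value dictionary (real MARC fields repeat and carry indicators and subfields), the two "systems" are invented stand-ins, and the set of tags a system understands is hypothetical. The 999 field stands in for a local extension.

```python
# Hypothetical: the tags the simulated filtering system understands.
KNOWN_TAGS = {"001", "100", "245"}


def lenient_system(record):
    """Ignores what it doesn't understand: every field passes through."""
    return dict(record)


def filtering_system(record):
    """Drops every field whose tag it doesn't understand."""
    return {tag: value for tag, value in record.items() if tag in KNOWN_TAGS}


def round_trip(record, pipeline):
    """Pass the probe record through each system in turn."""
    for system in pipeline:
        record = system(record)
    return record


def lost_fields(original, result):
    """Tags present in the probe but missing after the round trip."""
    return sorted(set(original) - set(result))


if __name__ == "__main__":
    probe = {
        "001": "probe-0001",
        "245": "Dummy title for the survival test",
        "999": "local extension payload",  # stand-in for an extension field
    }
    print(lost_fields(probe, round_trip(probe, [lenient_system])))
    print(lost_fields(probe, round_trip(probe, [lenient_system, filtering_system])))
```

With real systems the pipeline would be export and import steps rather than functions, but the diff at the end is the interesting part: it names exactly which extensions a given path through the infrastructure destroys.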



Re: [CODE4LIB] marc21 and usmarc

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 18:06, Jonathan Rochkind  wrote:
> Because their customers are not demanding it, and they
> often don't have the technical expertise to understand
> why it matters anyway.  But mainly because
> their customers are not demanding it.

So, um, could librarians everywhere start being just a tad bit more
demanding about this stuff? You know, before your profession becomes
obsoleted from this planet?

Actually, I was wondering what areas MODS can't handle which MARC
does, and whether we could hijack and / or change MODS to fit them
(what I know of it seems a bit limiting, but through XML it's certainly
extensible). Shouldn't folks start by demanding at least MODS (or XOBIS
if we're *really* crazy :)?


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] marc21 and usmarc (fwd)

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 17:09, Ardie Bausenbach  wrote:
> Since that time, many other national libraries have moved from
> their national formats to MARC 21, including (among others),
> the UK, Germany, Finland, and Spain.

I know a few more, but another point worth, er, screaming about, is
the various AACR2 / RDA / other rule changes that aren't linked to
MARC at all. I know a lot of it is covered in MARC documentation, but
there are hidden gems, like punctuation, symbols, character encodings,
etc., which aren't always specified.

If the library world embraced XML as a minimum a lot could be fixed in
that area (and no, XMLMARC does not qualify :).


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] marc21 and usmarc

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 17:04, Eric Lease Morgan  wrote:
> Can somebody say "MARCXML or MODS complete with a schema"?

Well, we can say it, and I think we *have* said it for a very long
time, but it doesn't seem to change anything. Damn those words.

> Such solutions offer at least syntactic validation if not also
> semantic validation. Oh well.

I would say a little bit more than "oh well" (but I don't really have
to; you know how I feel :), but I would love to hear what the vendors
are thinking about this all. They seem to be very, very quiet about it
all (without speculating as to why ...)


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] PHP Frameworks

2008-10-27 Thread Alexander Johannesen
On Mon, Oct 27, 2008 at 21:50, Susan Teague Rector <[EMAIL PROTECTED]> wrote:
> We're exploring Zend as a framework for php based Web applications. I'm
> curious to see if anyone out there is using this framework (or another MVC
> framework). Also, I'm wondering how many full-time developers you have on
> staff programming.

Back when I was in the library world I used Zend Framework, albeit not
as MVC (I needed a more RESTful paradigm), but the components
themselves are fantastic, and I hear and see good things about the MVC
as well. You can't fail with it as it brings easy OO to the PHP world.

As to staff, I guess three or four would be up to ZF scratch at that
time (about 1 year ago).


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] PHP5 Help

2008-07-01 Thread Alexander Johannesen
On Tue, Jul 1, 2008 at 13:42, Nicole Engard <[EMAIL PROTECTED]> wrote:
> I am missing something right in front of my eyes.  I'm rusty on my
> PHP, I'm wondering if someone can help me with this error:
> Warning: gmmktime() expects parameter 3 to be long, string given in
> /public_html/magpierss-0.72/ on line 35

Well, it's a bit puzzling in the sense that the parameters are all
ints, but hey. :) Try casting the values ;
   gmmktime( (int) $hours, (int) $minutes, (int) $seconds,
             (int) $month, (int) $day, (int) $year ) ;

or try the same with (long).

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] -- 3 suggestions

2008-05-22 Thread Alexander Johannesen
On Thu, May 22, 2008 at 5:06 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote:
> I feel self-conscious about seeing posts reflected in the "planet" that
> are not related to library technology, only because I'm not willing to
> break up my blog into sub-blogs and don't know if oysters and pace
> layering really go together for the "planet."

Ouch, I suspect a conversation next about what fits the code4lib
planet moniker. Do my technology rants that don't bash MARC fit?
Does Topic Maps fit, even if libraries don't use them but they are a
perfect fit? Posts about philosophical aspects of the code we make? Or
the epistemological musings of workflows? Let's not forget that the
human aspect of the library profession is what makes librarians so
great ...

It's a tough one.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Gartner on OSS

2008-03-31 Thread Alexander Johannesen
On Mon, Mar 31, 2008 at 2:45 AM, D Chudnov <[EMAIL PROTECTED]> wrote:
> the risk of upsetting *everybody*...

It's a bit depressing that once we get an interesting discussion going
on this list which normally has such low volume, and which is
*definitely* on-topic, someone comes along and tries to kill it
because it doesn't fit *their* ideal of what the topics should be.

Allow me to vent a few seconds; Sorry, but OSS is *all* about code and
often about business models, and rest assured Karen and all the rest
of us *definitely* are defining the "enterprise" in question as the
library world, so this is *all* about code for libraries. We aren't
writing code in the posts, but we certainly are talking about code.
Nitpicking about such *tiny* semantic differences is just one of those
things which drive me up the wall! Of *course* this topic has a place
on this list, and of *course* we're not going to create Yet Another
MailingListForSomethingJustBecauseWeAreBloodyLibraries, and of
*course* we should talk about these things, and *especially* here
where coders talk about code. Code is more than syntax.

But I guess this thread is dead now, and so is at least *my* ideal of
what this list is, so take care.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Gartner on OSS

2008-03-30 Thread Alexander Johannesen
On Sun, Mar 30, 2008 at 9:40 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote:
>  For those of us in the field pushing for new approaches, the Gartner report
>  does represent positive change. It's not that OSS isn't successful. It's
>  that some of us would really like it to be much more successful...

Fair enough. I certainly understand the significance for OSS
passionadas in organisations under MBA and committee rule, it's just
infuriating that these things have to be spelled out in childish ways
(which the litmus test really is all about) by conservatives for
"approved benefits to the enterprise." This is partly why I left the
library world, mind you, so if that report can fix up some of the
glaring things that made my experience there so painful (a constant
struggle of spelling things out), I might think of coming back. :)

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Gartner on OSS

2008-03-30 Thread Alexander Johannesen
On Sun, Mar 30, 2008 at 7:51 PM, K.G. Schneider <[EMAIL PROTECTED]> wrote:
> Sorry, Alexander, I disagree.

What, is that allowed!? :)

> Gartner may sound creaky but under the starchy
>  language, this is pretty revolutionary advice.

I can't agree with the "revolutionary advice" part; business leaders,
firms, advisers and abusers have been saying this already for years.
That Gartner now is on the field saying it too shows nothing except
how conservative they are; this is an old message, and certainly not
aimed at people who are doing the actual work in their organisations.

I've been in the "enterprise" for most of my life as a high-flying
consultant (except my non-enterprise last few years in the library
world), and currently work as manager, developer and advisor to
the largest enterprise organisations around. We've always recommended
and / or used OSS, and integrated the very ideal into the fabric of
enterprise software development.

The only people that Gartner now is playing to are the business
people, who will be surprised to learn that their organisations
already use (and many fully embrace) OSS, and have done so for years.
(How they'll cope with that news is another story, and maybe Gartner
is their coming safety blanket.) Even big guys who think that only the
Oracle business stack is good enough for them will be surprised to
find the odd OSS project supporting their infrastructure.

OSS is already successful, and it's already working great even if the
MBAs don't know it. And because Gartner now is playing to those
people, that's why the porridge litmus test works so great; in
reality, nothing will change, which for many is the perfect advice.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Gartner on OSS

2008-03-30 Thread Alexander Johannesen
Let's try the litmus test for enterprisey business bullshit : porridge ;

"Recommendations for Users
 * Look for a sustainable community that has a critical mass of skills
   supporting porridge.
 * Look for a cultural match between the porridge community and
   your internal developers and user culture as it enhances communication
   and perceived user satisfaction.
 * Prepare an SOA that can integrate IT services from many sources,
   including porridge.
 * Avoid porridge that is not built on open standards.
 * Make a conscious risk-based decision about whether you will depend on
   internal resources or external services for your porridge implementations."

In short, another template piece where [insert your favourite thing
here] is wrapped around generic advice. Do they say anything that's
specific to what open-source is all about?

Alex (without reading the darn article...)
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] for people who work with big data sets

2008-01-15 Thread Alexander Johannesen
On Jan 16, 2008 7:08 AM, Aaron Swartz <[EMAIL PROTECTED]> wrote:

Excellent initiative! Joined, and I'll forward the information around
to other communities I know do this type of work.


 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] [Fwd: [NGC4LIB] A Thought Experiment]

2007-11-08 Thread Alexander Johannesen

On Nov 9, 2007 7:42 AM, Carl Grant <[EMAIL PROTECTED]> wrote:
> I'm seeking some help understanding here.   From my perspective
> (again, that of a long time vendor of "commercial software" having
> recently moved to "commercial service for OSS software") this is
> exactly what a number of us (LibLime, Evergreen, Index Data, CARE
> Affiliates) are *trying* to do.   We're not only providing the
> services to allow libraries to adopt open source, we're also doing
> the marketing and selling that libraries seem to require before
> they'll even consider the option.

I think this is extremely important for the library world right now,
far more important than any current standard, model or prototyping
exercise ; support the vendors going Open Source. Don't think about it
for too long ; we must grab this opportunity *at all cost*, because,
frankly, it's the only chance we've got to set ourselves straight
again. The only way to get away from the suppressed and locked-down
legacy-driven world we currently live in is to embrace openness,
especially when it's coming from vendors (who are by that very token
asking us to work *with* them this time instead of just buying their

There's a slight clause here, though, for the vendors ; you *must*
adopt web services for *every* part of your solutions. I know that
this often goes against the grain of a "proposed system" (a system
that holistically solves a problem space) but the truth of the matter
is that you will never make your system work spot on for everyone, and
we need the reassurance (even if we never use the option) of going in
a different direction or using someone else's solution for a
particular problem. By allowing a more open development model the
library world will love you and gladly give you money for support and
further development. Consider the openness even a token more than a
reality option.

Here's a quick list of things I see crucially happening ;

* The library world has to come together to create a common language
for these web services, an ontology if you will. We must decide on a
few good (and possibly already existing) protocols and dictionaries.

* Vendors must settle on a development model for web services (and I'd
humbly suggest a REST model) and not be afraid of opening up or
segmenting their holistic solutions into sharable / interchangeable

* Get some outside experts in to handle usability and interaction
design, and open source the result. Create a consortium or
interest-group for library systems usability and user experience.

* Make sure we've got a *clean* cut of technology between business
logic and the user interface. Enforce low-key semantically-rich XHTML
and use CSS everywhere.
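
The last point in the list above, a clean cut between business logic and user interface, can be sketched roughly like this in Python. The function names, the "results" class and the sample data are all hypothetical; the only point is that the lookup returns plain data and a separate function turns it into plain, semantic markup that CSS can style.

```python
from html import escape
from typing import Iterable, List


def find_titles(catalogue: Iterable[str], term: str) -> List[str]:
    """Business logic: plain data in, plain data out, no markup anywhere."""
    needle = term.lower()
    return sorted(title for title in catalogue if needle in title.lower())


def render_xhtml(titles: Iterable[str]) -> str:
    """Presentation: a semantic list; all styling is left to CSS via the class."""
    items = "".join(f"<li>{escape(title)}</li>" for title in titles)
    return f'<ul class="results">{items}</ul>'
```

Because `find_titles` knows nothing about markup, the same result could just as well be serialised for a web service instead of being rendered for a browser.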

Here's to dreaming.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] mylibrary web services

2007-08-06 Thread Alexander Johannesen
On 8/7/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote:
> In summary, RESTful Web Services using ROA (Resource Oriented
> Architecture) appears to be a "purist" approach to using the Web.

Purist? No, it's not the purist way, but the right way. You can use a
hammer to put in a screw, but are you going to call me a purist if I
suggest you use a screwdriver?

"Purist" as a word has a lot of negative connotations.

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] executing a cgi script in the middle of a url

2007-07-30 Thread Alexander Johannesen
On 7/31/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote:
> What am I doing wrong? How do I need to configure Apache accordingly?

We use a bunch of URL rewrite rules to solve this issue. We have a
host of backend technologies like Perl, PHP, CGI, Java, but all the
URLs are equally clean.

We set up one set of rules per service point, where a 'point' is
defined in a rather technological fashion in addition to the semantic
value, so is redirected to, including all
sub-paths from this point.
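
To illustrate the "one set of rules per service point" idea, here is a small Python sketch of the mapping logic only. It is not Apache configuration, and the prefixes and backend paths are invented for the example; in practice the same mapping would live in the web server's rewrite rules.

```python
from typing import Optional

# Hypothetical service points: clean public prefix -> internal backend path.
SERVICE_POINTS = {
    "/catalogue": "/cgi-bin/catalogue.pl",
    "/search": "/php/search.php",
}


def rewrite(path: str) -> Optional[str]:
    """Map a clean public path to its backend, carrying all sub-paths along."""
    # Try longer prefixes first so a more specific service point wins.
    for prefix in sorted(SERVICE_POINTS, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return SERVICE_POINTS[prefix] + path[len(prefix):]
    return None  # not a known service point
```

Whatever technology sits behind a service point, the public URL stays the same, so a backend can be swapped without anyone outside noticing.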

 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Open Source OPAC - VUFind Beta Released

2007-07-20 Thread Alexander Johannesen

On 7/20/07, Andrew Nagy <[EMAIL PROTECTED]> wrote:

Excellent stuff, and thanks for the open-source effort.

Three things ;

1. Will there be efforts towards a development community outside your library?

2. has serious problems in its
"similar items" section. :)

3. If you scroll down a list of things and then do something that
requires a login, only the top part of the page that's not in view has
the action. The user sees nothing, and nothing happens.

Apart from that, great stuff and, if you accept such, I'd love to
participate in ways that I can.

Kind regards,

Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] "good" web service api

2007-06-30 Thread Alexander Johannesen

On 6/30/07, Eric Lease Morgan <[EMAIL PROTECTED]> wrote:

> What are the characteristics of a "good" Web Service API?

That you refrain from the notion of an API. :)

Seriously, before you do anything, read the book "RESTful Web Services"
by Sam Ruby and Leonard Richardson. I'd do it the ROA way
(and have for some time; resource oriented architecture), but I do
understand it puts a certain strain on the areas of the brain
responsible for learning conceptually new things.
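
For anyone wondering what "resources instead of an API" looks like in code, here is a toy Python sketch. It is an assumption-laden illustration, not taken from the book: an in-memory store where every thing has a URI and is handled through the same small set of HTTP-style methods, rather than through custom-named calls.

```python
from typing import Any, Dict, Optional, Tuple


class ResourceStore:
    """Toy in-memory store: every thing is a resource addressed by a URI path."""

    def __init__(self) -> None:
        self._resources: Dict[str, Any] = {}

    def handle(self, method: str, uri: str, body: Optional[Any] = None) -> Tuple[int, Any]:
        """Return (status code, representation) for a uniform-interface request."""
        if method == "GET":
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":
            created = uri not in self._resources
            self._resources[uri] = body
            return (201 if created else 200), body
        if method == "DELETE":
            if uri in self._resources:
                del self._resources[uri]
                return 204, None
            return 404, None
        return 405, None  # method not part of this sketch's uniform interface
```

There is no `getBook()` or `removeBook()` here; a client that knows the URI and the handful of methods can work with any resource the store holds.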

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Position Available: Manager of Data Systems

2007-05-17 Thread Alexander Johannesen

On 5/18/07, Patty De Anda <[EMAIL PROTECTED]> wrote:


... and not a word (that I could find) on where in the world - or
where in the assumed USA - this position is held. :)

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] PHP Symfony

2007-03-24 Thread Alexander Johannesen

On 3/24/07, Michael J. Giarlo <[EMAIL PROTECTED]> wrote:

> Hmm?  What's that you say?  Just a sec, but in the meantime, why not sit
> down and have some of this delicious Kool-Aid over here?  It's Ruby
> Red-flavored; I think you'll like it.

Come, now; those who meddle in things PHP know that a lot of the
goodness you get from Ruby you'll these days find in PHP as well.
Things have progressed quite a bit in the last 5 years, and PHP 5.2 is
quite mature and offers an OO model on par with Ruby, without the
hassle of being a fringe technology. :)

As to Symfony, yes, it's pretty good and complements (or
answers) the RoR thing well. I personally don't use it as I'm more of
a XSLT, SOA, REST freak (and Symfony is slightly tricky to push into
that box, especially given the non-MVC direction of the SOA we're
building). Now that RoR 1.2 has better support for REST I think
Symfony may follow, but I don't like the default templating language
(PHP with "specials") nor the non-MVC paradigm. Having said that, I
haven't used it for a few versions and things may have improved. Check
it out.

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Using OpenID in libraries

2007-03-23 Thread Alexander Johannesen

On 3/23/07, Jeremy Frumkin <[EMAIL PROTECTED]> wrote:

> While OpenID has potential within certain contexts, I have difficulty seeing
> it being quickly adopted by libraries, universities, or other entities that
> need to relate real identities to an OpenID. OpenID doesn't do trust; it
> explicitly says it is not a trust system. For libraries to adopt OpenID,
> they need to somehow link OpenID to a trust system. It isn't clear to me
> that there is enough added value to libraries at this point to adopt OpenID
> - of course, I'd be glad to buy someone a beer if they provide a use case to
> convince me otherwise ;-)

I can only offer you a beer of agreement; OpenID is fantastic for
geeks who can control their online environment, but hopeless for
normal people. The only trust given in the system is based on the
trust of the ID source, and in many cases that's just as hard to come
by in new shapes as it has been in the past. For *me* OpenID is
fantastic, but for my wife it means nothing. I suspect most of our
patrons are in the latter category, but hey, we're going to implement
OpenID cross-system soon so at least we're trying. :)

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Videos?

2007-03-05 Thread Alexander Johannesen

On 3/6/07, Noel Peden <[EMAIL PROTECTED]> wrote:

> I'm finally back in the office today and the videos are in process...  I'm
> not sure where they'll go, but they'll be up somewhere.
> BTW, if anybody has any ideas for royalty free title music (a short 3+
> second thing), I'm open.  I'll whip up something if needed.

In my dark past I was a musician, and I've got stuff lying around
waiting for the opportune moment to be donated. What are you looking

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-17 Thread Alexander Johannesen

On 1/18/07, Doran, Michael D <[EMAIL PROTECTED]> wrote:

> So you may find that there is a well-founded reluctance among
> Voyager systems people to get too carried away with the DBA 101 stuff.  ;-)

We're routing around the problem by creating a webservice that is
Voyager specific and let other apps and services use this one. That
means that if you have to do DBA stuff, you do it in one spot. It's
not the ultimate solution, but it solves a great deal of legacy and
flexibility problems.
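
As a rough sketch of that "do the DBA stuff in one spot" approach, in Python: one small service class owns the database access and hands out XML, and every other app talks only to it. The table, the columns and the XML element names are entirely made up, and an in-memory SQLite database stands in for the real thing; nothing here reflects the actual Voyager schema.

```python
import sqlite3
import xml.etree.ElementTree as ET
from typing import Optional


class CatalogueService:
    """The single place that knows about the (here: made-up) database schema."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._db = connection

    def record_as_xml(self, record_id: int) -> Optional[str]:
        """Fetch one record and return it as an XML string, or None if missing."""
        row = self._db.execute(
            "SELECT id, title, author FROM bib WHERE id = ?", (record_id,)
        ).fetchone()
        if row is None:
            return None
        record = ET.Element("record", id=str(row[0]))
        ET.SubElement(record, "title").text = row[1]
        ET.SubElement(record, "author").text = row[2]
        return ET.tostring(record, encoding="unicode")


def demo_connection() -> sqlite3.Connection:
    """Build a throwaway in-memory database with one hypothetical record."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE bib (id INTEGER PRIMARY KEY, title TEXT, author TEXT)")
    db.execute("INSERT INTO bib VALUES (1, 'Example Title', 'Example, Author')")
    return db
```

If the underlying tables change, only `CatalogueService` has to follow; consumers keep reading the same XML.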

Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps


2006-11-01 Thread Alexander Johannesen


> You may be interested in OpenFRBR:
>
> Its aim is to build a full, free implementation of FRBR, showing
> everything it can do, and looking for problems along the way.  Everyone's
> welcome to get involved in whatever way they wish.

I can't get to that site (is it down?), but a few words on what you're
trying to do (is it a technical approach, model approach,
philosophical approach?), and how you want to do it would be great.

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] OpenURL XML generation libraries?

2006-10-17 Thread Alexander Johannesen

On 10/18/06, Ross Singer <[EMAIL PROTECTED]> wrote:

> I respond with an SVN repository for a ruby OpenURL library (that
> doesn't currently have any documentation).
>
> Not sure what's completely out of context about this.

Because you didn't say "a ruby OpenURL library"? :) I had no idea what
I was looking at. An SVN to some code doesn't mean everyone groks that
code by listing generic directories. A few lines of what it is and
what it can do would be fantastic.

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] OpenURL XML generation libraries?

2006-10-17 Thread Alexander Johannesen

On 10/18/06, Ross Singer <[EMAIL PROTECTED]> wrote:

> See also:

Why? What are we looking at?

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] Open Source Seel?

2006-09-12 Thread Alexander Johannesen

On 9/13/06, Smith,Devon <[EMAIL PROTECTED]> wrote:

> So, I'm wondering how many people think they might actually work on it.

Well, if I were you I'd spend some time writing up what the advantages
are. The current docos are too wishy-washy to tell anything more than
that you've got a good idea you've tried out. Right now you state that
it *is* better than XSLT, but I know both XSLT and semantic data
modelling extremely well and need some convincing. Also, what
languages and technologies are used? What would be the proposed
license? Does it solve real issues or is it a nice-to-have?

Real interest can be gained by showing us real benefits using real
technology solving real issues. If not, then it was an interesting
research project. :)

Kind regards,

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] native xml databases and/or XQuery?

2006-08-16 Thread Alexander Johannesen

On 8/17/06, Kevin S. Clarke <[EMAIL PROTECTED]> wrote:

> I'm curious in finding out how many libraries out there are using or
> experimenting with native xml databases.

Asked here, answered here. :)

We've got a few eXist's lying about serving mostly experimental stuff,
although one is semi-experimental / quasi-production quality. We're
also drooling over the latest release of DB2 with native XML support
(I think it's good enough to count as native :), but that's just the
next step, I think.

> I'm also interested in learning of libraries who are using XQuery as a
> primary development language.

Primary? Over my dead body. :) Although do a search for "XqueryP"
(notice the 'P') for something that just might solve some of the
bigger issues I can think of with the proposition.

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] Photo galleries and accessibility

2006-07-12 Thread Alexander Johannesen

On 7/13/06, Amy M Ostrom <[EMAIL PROTECTED]> wrote:

> Or does anyone know about photo galleries and accessibility?

There is a bigger group of people which can both see images and have
accessibility needs; low-vision users (estimated some 30% of all

Having said that, there's really nothing stopping you making tables
perfectly accessible, and in the sense of images they *are* presented
in a tabular fashion. This is where we use common sense instead of
rigid rules, so there is no reason to feel that using tables for this
is somehow wrong (unless you want to go into the whole WAI 2.0 debate

Do it the way you do, and clean up the generated code to fix the worst
offenders. If you still want to be strict on it, try talking to the
GAWDS community about gallery options. I seem
to recall there was some discussion about this a while back, but the
gist was that most gallery software was equally crap in accessibility
regards. Maybe things have changed.
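
As a hedged illustration of a gallery laid out as a table without giving up on accessibility, here is a small Python sketch that generates the markup. The function name and sample values are hypothetical, and it covers only two basics (a caption for the table and alt text for every image), not a full accessibility audit.

```python
from html import escape
from typing import List, Sequence, Tuple


def gallery_table(photos: Sequence[Tuple[str, str]], columns: int = 3,
                  caption: str = "Photo gallery") -> str:
    """Render (src, alt text) pairs as a captioned table of images."""
    rows: List[Sequence[Tuple[str, str]]] = [
        photos[i:i + columns] for i in range(0, len(photos), columns)
    ]
    lines = ["<table>", f"  <caption>{escape(caption)}</caption>"]
    for row in rows:
        lines.append("  <tr>")
        for src, alt in row:
            # Every image carries alt text so non-visual users get a description.
            lines.append(f'    <td><img src="{escape(src)}" alt="{escape(alt)}"></td>')
        lines.append("  </tr>")
    lines.append("</table>")
    return "\n".join(lines)
```

Cleaning up what an existing gallery package generates would aim for the same things: a described table and meaningful alt text on each thumbnail.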


"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Alexander Johannesen


On 6/7/06, Ross Singer <[EMAIL PROTECTED]> wrote:

> That by trotting out their Endeca powered catalog, they've finally
> gotten the tangible that we nerds have been unable to get
> institutional support for.  Now every librarian in the country wants
> clustering and faceted search.

Sorry, I'm in the wrong country. :) In fact, as much as that event
stirred people's hearts and minds, it never shook the foundation of
the OPAC in this place.

> But this time last year, I defy you to tell me that you could have
> trotted out a project like that to anybody outside the systems office
> (that wasn't already labelled a 'systems apologist').

Possibly not. Hmm. No, not with the OPAC, but other systems. I think
libraries have put too much faith in vendors who create crappy systems
and continue to do so. If vendors want libraries to buy their stuff,
they need to make sure they've got good stuff; it's getting easier and
easier to do these things ourselves.

"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Alexander Johannesen


On 6/7/06, Jonathan Rochkind <[EMAIL PROTECTED]> wrote:

> My impression is that there are LOTS of catalogers interested in
> discussing this topic---the future of The Catalog.

As much as I would love to disagree with you, I don't. :) My stance on
this is not to let hackers create applications as they see fit, dear
Dog, no! I'm a die-hard user-centred design and usability guy; my life
is dedicated to developing solutions fit for the user, whether that be
patrons, catalogers, super-users or others.

I'm more talking about politics of *actually* doing something; I find
it easy to talk about innovation with my colleagues, but hard to do in
practice, although we're setting up a "labs" area these days in an
attempt to break free of the tyranny of PRINCE2 and top-down
hierarchies. But hey, I realise this is probably beside the point; if
we have fruitful discussions, maybe someone can do something with it.

> Some coders seem to assume
> that the cataloging community doesn't realize the need for change, or
> doesn't understand the possibilities of the online catalog. I think
> this is more and more NOT the case. Catalogers too realize that
> things are broken, change is the topic of discussion.

Actually, I've found the reverse to be true; catalogers are overly aware
of things being broken, but have hackers who either can't see the
problem or are too busy to do so. My feeling about this all is that
we're too busy maintaining the MARC Legacy to create a shining new
one which may or may not solve the problem. Of course, the problem
with MARC is the culture not the technology, so in order to change the
culture we need a *whopping* effort put in by *all* libraries around
the world. Not very likely, but it would be fantastic if we could.

> But such common vision is desperately needed.

I'd say such common vision is desperately needed on the management
level! What drives the libraries if not management? Sure, footsoldiers
and captains can push the envelope, but only so far before it becomes
political, huge, convoluted, a project with a steering committee, and
so forth. For me the strategy is to create prototypes to demonstrate
what we're on about, and in my case I do that *with* catalogers,
reference librarians and other friends around the library / library
world. The idea here is to unite the bottom soldiers in such a way
that the top management can see the light and resource and process

> So we desperately need more forums for discussion involving both
> catalogers and developers, focused on this topic.

No, we desperately need everyone to join the same forums! Not more
forums, but less! Less is more. We don't need yet another committee; we
need one stronger one. But hey, I'm dreaming.

> As Eric writes, an important topic for discussion is: "To what degree
> should traditional cataloging practices be used in such a thing, or
> to what degree should new and upcoming practices such as FRBR be

The danger here is that automated processes add a quality check to
our processes, and a lot of people don't like that, especially top
management, because it points out mistakes made in the past.
Technically we don't have many problems, we can do pretty much
anything we'd like to do if we really wanted to, but it's all about
internal politics and shuffling of resources which decides whether it
should be done or not. If *management* don't understand what hackers
and catalogers and reference librarians are talking about, we're

Anyway, I don't think we disagree on this, only the part about needing
yet another mailing-list.


"Ultimately, all things are known because you want to believe you know."
- Frank Herbert

Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Alexander Johannesen


> You can thank NCSU for bringing the catalogers, reference types,
> administrators, vendors, etc. to the table.

Hmm, how so? I've been at the table with many of them for many years
already and know them quite well. :) Are you referring to something


"Ultimately, all things are known because you want to believe you know."
- Frank Herbert
