Re: [CODE4LIB] dspace, digiTool, etd-db, and mylibrary

2006-02-02 Thread Alexander Johannesen
Hi,

On 2/3/06, Eric Lease Morgan [EMAIL PROTECTED] wrote:

http://dewey.library.nd.edu/morgan/demo/

 Your comments regarding our initial implementation would be greatly
 appreciated.

Could you explain what we're seeing and what we should be looking for
(like, the second digital image doesn't work; bad ID?) ?


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Code4Lib Code Sharing - (Re: [CODE4LIB] journal)

2006-02-22 Thread Alexander Johannesen
On 2/23/06, Ryan Eby [EMAIL PROTECTED] wrote:
 http://www.textualize.com/

 Again I'm unsure if we would be looking at mostly small snippets and
 functions or full fledged classes/libraries.

Thanks for pointing this out; looks good.

Now with any of this, it's not so much the actual libraries and classes
that are of interest to me, but clever code to *use* them. And, to a
big degree, it is talking about application design that I fear
we're overlooking. This is where all that "we did X, and here's what's
good and what's bad about that approach" comes in really handy; how do we
design for our audience, please the business side and make it look
good in the process? The tools should support at least these three
fundamental things, and I would hope that the journal discussed
approached *design* more than code.

As an example, how does it change our development infrastructure when
moving to a SOA? I can write *books* about this topic, mostly
positive. :) But this discussion seems to float to the edges (not
geeky enough for geeks, not business enough for the business people,
even though both can recognise the value of it). A journal could be a
lever to push the importance of things that are on the fringe
onto the real agenda.

2 cents worth.


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Libraries that support user tagging in OPAC?

2006-03-08 Thread Alexander Johannesen
Hi,

In our next generation OPAC prototype, we do typed tagging and
comments. (Typed means that there is a difference between a patron
tagging something and a reference librarian doing so; the tags and
comments are fed back into the search engine and alter relevance
ranking.) One day it may see the light of day, but it has certainly
proved to be a really powerful feature.
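
A minimal sketch of how typed tags could feed back into relevance ranking. It assumes each record already has a base text-match score and that every tag carries the type of the person who added it. The weights, the names (`TAG_WEIGHTS`, `Tag`, `boosted_score`) and the multiplicative formula are illustrative assumptions, not a description of the prototype.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical weights per tagger type; the post gives no actual values.
TAG_WEIGHTS = {"patron": 0.1, "reference_librarian": 0.5}


@dataclass
class Tag:
    term: str
    tagger_type: str  # "patron" or "reference_librarian"


def boosted_score(base_score: float, tags: List[Tag], query_terms: Set[str]) -> float:
    """Scale the base score by the summed weights of tags that match the query."""
    boost = sum(
        TAG_WEIGHTS.get(tag.tagger_type, 0.0)
        for tag in tags
        if tag.term.lower() in query_terms
    )
    return base_score * (1.0 + boost)


tags = [
    Tag("scifi", "patron"),
    Tag("scifi", "reference_librarian"),
    Tag("ecology", "patron"),
]
score = boosted_score(2.0, tags, {"scifi"})
```

With these assumed weights, a librarian's tag moves a record five times as far as a patron's, which is one way to express the difference between the two tag types.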


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] At an end : when you rub against your managers

2006-03-08 Thread Alexander Johannesen
Hi Ed,

On 3/9/06, Ed Summers [EMAIL PROTECTED] wrote:
 Lucky you! I've had similar problems in non-library settings, so I
 don't think that the library community is any worse at following
 software best practices than other communities.

Ok, so what you're saying is that this is, for me, an isolated incident,
and I'd be better off to quit and find somewhere else. I can live with
that. :)

 If they were then
 there wouldn't be such an appetite for the wisdom you find in Joel on
 Software, Paul Graham, et al.

Hmm, having an appetite doesn't mean that what you're eating is
healthy, but yes, I understand your point. :)

 I'm not sure griping in public like this will help much...

I'm not so much griping in public as I'm reaching out to my fellow
geeks; I'm pretty sure that I can't be the only one who's battled new
things against conservative bastions before. Most of my problems are
located within a rather conservative mindset of my managers that I
can't seem to get through. I've broken through it in other places, to
great success, but the library world, to me, seems impenetrable. I
guess I should have known, Z39.50 and all. :)

 In my experience I've found that people react best to seeing how a new
 development process, pattern or technology helps *in practice* rather
 than *in theory*.

I agree, and I've done all that and more, yet nothing changes. If
management above you still don't get it, or fight it, then there is
nothing left to do, and as such I think I've just concluded that. I'm
sorry to leave the library world, but not sorry to leave the
mentality.

 But everyone likes recognition for good work--I'm sorry it sounds like
 you aren't getting that support.  Good luck--and try to focus on one
 thing at a time...says ADD man.

With anything that goes against what you know as good, it won't be
classified as 'good'.

Anyways, thanks for the input.


Regards,

Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] At an end : when you rub against your managers

2006-03-08 Thread Alexander Johannesen
On 3/9/06, Kevin S. Clarke [EMAIL PROTECTED] wrote:
 That's my opinion anyway... not sure this has anything to do with code.

You're right, it hasn't; it was only geek related in the sense that we
probably all face conservatism in lieu of new and fancy code. Sorry
for the noise, and thanks for the words. I think I know the answers
now.


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] compact display for marc-xml

2006-03-27 Thread Alexander Johannesen
On 3/28/06, Hickey,Thom [EMAIL PROTECTED] wrote:
 I've attached a compressed tar file of compact.xsl, compact.css and
 mudlumps.xml, a test record.  After you've extracted the files to a
 directory you should be able to view mudlumps.xml with a browser and see
 the results.

I'd like to have a look and help out, but could you post it non-tar'ed
and possibly non-zip'ed? My gmail is barfing, and WinZip coughs and
splutters. :)


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Web services for LII content?

2006-03-28 Thread Alexander Johannesen
On 3/29/06, K.G. Schneider [EMAIL PROTECTED] wrote:
 Develop web services (accessible by subscription) to allow a developer to
 include some of the LII in an application.

I was going to do exactly this for the Australasian part of the world
(still pending; too much to do). I think the idea is a very good one,
but I'm not sure about the paid service. I think the only thing you
should consider is a small flat yearly fee to be part of the system,
although setting something like this up and using del.icio.us for
tagging and commenting on the links isn't that big a deal. In short,
I value the service, but not as a pay-service. :)


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Question re: ranking and FRBR

2006-04-11 Thread Alexander Johannesen
On 4/12/06, Jonathan Rochkind [EMAIL PROTECTED] wrote:
 If you are instead using a formula where an increased
 number of records for a given work increases your ranking, all other
 things being equal---I'm skeptical.

Ditto; I think the answer to this is that there needs to be some
serious pre-processing and analysis to come up with some real smarts
in terms of these searches. I don't think there is an easy way out
once you've gone past the "ooh, shiny" stage of whatever context you
bring the user; good or bad context?


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Question re: ranking and FRBR

2006-04-12 Thread Alexander Johannesen
On 4/12/06, K.G. Schneider [EMAIL PROTECTED] wrote:
 Do users actually determine relevance or do they have faith in Google to
 provide the best results on the first results page?

I'd say people use a "click and try n times, then refine the search
until relevance is fulfilled" technique. But again, this is *totally*
dependent on what they're searching for; known or unknown ;

 - books by Frank Herbert (specific enough to get some results)
 - Jung's philosophy in fiction (general enough to cause bleeds)
 - good SciFi (general enough to cause bleeding)
 - oil crisis metaphors (specific and general at the same time)

All of the above can lead to Dune by Frank Herbert. What is its
relevance to the above searches? It's a book by Herbert, it certainly
contains Jung's philosophy, it's a good SciFi book, and it does indeed
have the metaphors as part of its concept. And to top it all, it's still a
popular book. I could say The Dosadi Experiment and all the same
is true, except the popularity. Who is to say that the former is preferred
over the latter? Google will give us the former, never the latter.

For libraries, this is an interesting problem to solve, because
popularity, at least in my view, is mostly a misnomer in searching for
information. Popularity in Google is measured by people actually
putting in the links, which means they point to something *because*
there is something interesting that way. In the library catalogs there
is no such thing.

We've got an experiment running here which uses tags to do this last
bit for us; people and librarians alike can tag books, which will boost
their ratings. An anonymous tag denotes popularity (unless stated
otherwise), while a reference librarian's tag boosts importance. Another
field I'm digging into is using search term logs to do some of this
as well, generating "heat" for items ... close to popularity, but it can be
very time-based (unlike links, which stay around); if you don't feed
the flame, it will eventually die out (or, in this case, be repurposed).
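
The time-based "heat" idea can be sketched as an exponentially decaying count of searches. This is only an illustration under stated assumptions: the half-life value, the function name `heat` and the choice of exponential decay are mine, and the post does not say how the experiment actually computes it.

```python
import math
from typing import List

HALF_LIFE_DAYS = 30.0  # assumed figure; the post does not give one


def heat(search_ages_days: List[float], half_life_days: float = HALF_LIFE_DAYS) -> float:
    """Sum of decayed hits: a search counts 1.0 when fresh and
    half as much for every half-life that has passed since."""
    decay = math.log(2) / half_life_days
    return sum(math.exp(-decay * age) for age in search_ages_days)


fresh = heat([0.0, 0.0])    # two searches made today
stale = heat([30.0, 30.0])  # the same two searches, one half-life ago
```

Unlike a link count, this score drops on its own unless new searches keep arriving, which matches the "feed the flame" behaviour described above.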

Anyways, just a few thoughts and ideas.


Alex
--
Ultimately, all things are known because you want to believe you know.
 - Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] next generation opac mailing list

2006-06-05 Thread Alexander Johannesen

On 6/6/06, Michael Bowden [EMAIL PROTECTED] wrote:

We need something.  My ILS has decided that their next generation
catalog will be a portal with its own database, etc.  I already have one
database with MARC data why do I need another to hold the non-MARC data.
 Why isn't my ILS working to expand/create the next generation MARC
record?  I think the next generation catalog goes hand and hand with the
next generation of MARC.


Oh, this one is easy to answer; we need to get away from MARC. No, not
the content of MARC, nor the idea of it, nor necessarily even the MARC
format and standard itself, but we need to get away from "we need
MARC" and the idea that knowledge sharing in libraries is best done
through MARC and that Z39.50 must be part of our requirements.

For example, MARC can hold some change control info, but never to the
granularity that supports, for example, an NBD which can properly update
records and work on a distributed model. But as soon as we put that
info outside of MARC, the culture will choose to ignore the problem
rather than try to change it. The *culture* of MARC is the problem.

I don't think the OPAC will go away, nor that it absolutely must, but
the very idea of an OPAC is based on knowing what our patrons want;
books that we've cataloged. But all too often we have no idea what
they want; all we've got are assumptions. I think we've come a long
way, but the time to look anew to what purpose the OPAC serves
certainly is ripe.

Ok, I'll stop now. :)


Regards,

Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Alexander Johannesen

Hi,

On 6/7/06, Jonathan Rochkind [EMAIL PROTECTED] wrote:

My impression is that there are LOTS of catalogers interested in
discussing this topic---the future of The Catalog.


As much as I would love to disagree with you, I don't. :) My stance on
this is not to let hackers create applications as they see fit, dear
Dog, no! I'm a die-hard user-centred design and usability guy; my life
is dedicated to developing solutions fit for the user, whether that be
patrons, catalogers, super-users or otherwise.

I'm talking more about the politics of *actually* doing something; I find
it easy to talk about innovation with my colleagues, but hard to do in
practice, although we're setting up a labs area these days in an
attempt to break free of the tyranny of PRINCE2 and top-down
hierarchies. But hey, I realise this is probably beside the point; if
we have fruitful discussions, maybe someone can do something with it.


Some coders seem to assume
that the cataloging community doesn't realize the need for change, or
doesn't understand the possibilities of the online catalog. I think
this is more and more NOT the case. Catalogers too realize that
things are broken, change is the topic of discussion.


Actually, I've found the reverse to be true; catalogers are overly aware
of things being broken, while the hackers either can't see the
problem or are too busy to do so. My feeling about all this is that
we're too busy maintaining the MARC Legacy to create a shining new
one which may or may not solve the problem. Of course, the problem
with MARC is the culture, not the technology, so in order to change the
culture we need a *whopping* effort put in by *all* libraries around
the world. Not very likely, but it would be fantastic if we could.


But such common vision is desperately needed.


I'd say such common vision is desperately needed on the management
level! What drives the libraries if not management? Sure, footsoldiers
and captains can push the envelope, but only so far before it becomes
political, huge, convoluted, a project with a steering committee, and
so forth. For me the strategy is to create prototypes to demonstrate
what we're on about, and in my case I do that *with* catalogers,
reference librarians and other friends around the library / library
world. The idea here is to unite the bottom soldiers in such a way
that the top management can see the light and resource and process
accordingly.


So we desperately need more forums for discussion involving both
catalogers and developers, focused on this topic.


No, we desperately need everyone to join the same forums! Not more
forums, but fewer! Less is more. We don't need yet another committee; we
need one stronger one. But hey, I'm dreaming.


As Eric writes, an important topic for discussion is: To what degree
should traditional cataloging practices be used in such a thing, or
to what degree should new and upcoming practices such as FRBR be
exploited?


The danger here is that automated processes add a quality check to
our processes, and a lot of people don't like that, especially top
management, because it points out mistakes made in the past.
Technically we don't have many problems, we can do pretty much
anything we'd like to do if we really wanted to, but it's all about
internal politics and shuffling of resources which decides whether it
should be done or not. If *management* don't understand what hackers
and catalogers and reference librarians are talking about, we're
stuffed!

Anyway, I don't think we disagree on this, only on the part about needing
yet another mailing-list.


Regards,

Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Alexander Johannesen

Hiya,

On 6/7/06, Ross Singer [EMAIL PROTECTED] wrote:

That by trotting out their Endeca powered catalog, they've finally
gotten the tangible that we nerds have been unable to get
institutional support for.  Now every librarian in the country wants
clustering and faceted search.


Sorry, I'm in the wrong country. :) In fact, that event, as much as it
triggered people's hearts and minds, never shook the foundation of
the OPAC in this place.


But this time last year, I defy you to tell me that you could have
trotted out a project like that to anybody outside the systems office
(that wasn't already labelled a 'systems apologist').


Possibly not. Hmm. No, not with the OPAC, but other systems. I think
libraries have put too much faith in vendors who create crappy systems
and continue to do so. If vendors want libraries to buy their stuff,
they need to make sure they've got good stuff; it's getting easier and
easier to do these things ourselves.


Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Photo galleries and accessibility

2006-07-12 Thread Alexander Johannesen

On 7/13/06, Amy M Ostrom [EMAIL PROTECTED] wrote:

Or does anyone know about photo galleries and accessibility?


There is a bigger group of people which can both see images and have
accessibility needs; low-vision users (estimated some 30% of all
users).

Having said that, there's really nothing stopping you making tables
perfectly accessible, and in the case of images they *are* presented
in a tabular fashion. This is where we use common sense instead of
rigid rules, so there is no reason to feel that using tables for this
is somehow wrong (unless you want to go into the whole WAI 2.0 debate
:).

Do it the way you do, and clean up the generated code to fix the worst
offenders. If you still want to be strict on it, try talking to the
GAWDS community (http://www.gawds.org/) about gallery options. I seem
to recall there was some discussion about this a while back, but the
gist was that most gallery software was equally crap in accessibility
regards. Maybe things have changed.


regards,

Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] OpenURL XML generation libraries?

2006-10-17 Thread Alexander Johannesen

On 10/18/06, Ross Singer [EMAIL PROTECTED] wrote:

See also: http://www.textualize.com/trac/browser/ropenurl


Why? What are we looking at?


Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] OpenFRBR

2006-11-01 Thread Alexander Johannesen

Hi,


You may be interested in OpenFRBR:
http://www.openfrbr.org/

Its aim is to build a full, free implementation of FRBR, showing
everything it can do, and looking for problems along the way.  Everyone's
welcome to get involved in whatever way they wish.


I can't get to that site (is it down?), but a few words on what you're
trying to do (is it a technical approach, model approach,
philosophical approach?), and how you want to do it would be great.


Alex
--
Ultimately, all things are known because you want to believe you know.
- Frank Herbert
__ http://shelter.nu/ __


Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-17 Thread Alexander Johannesen

On 1/18/07, Doran, Michael D [EMAIL PROTECTED] wrote:

So you may find that there is a well-founded reluctance among
Voyager systems people to get too carried away with the DBA 101 stuff.  ;-)


We're routing around the problem by creating a webservice that is
Voyager specific and let other apps and services use this one. That
means that if you have to do DBA stuff, you do it in one spot. It's
not the ultimate solution, but it solves a great deal of legacy and
flexibility problems.
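
A rough sketch of the "do the DBA stuff in one spot" idea. Nothing here reflects the real Voyager schema or the actual service: the in-memory SQLite table, the column names and the functions `open_catalogue` and `get_record` are hypothetical stand-ins, chosen only to show a single function owning the SQL while callers see plain JSON.

```python
import json
import sqlite3


def open_catalogue() -> sqlite3.Connection:
    """Stand-in for the ILS database: an in-memory table with a made-up schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bib (bib_id INTEGER PRIMARY KEY, title TEXT, author TEXT)"
    )
    conn.execute("INSERT INTO bib VALUES (1, 'Dune', 'Herbert, Frank')")
    return conn


def get_record(conn: sqlite3.Connection, bib_id: int) -> str:
    """The one place that knows the schema; callers only ever receive JSON."""
    row = conn.execute(
        "SELECT bib_id, title, author FROM bib WHERE bib_id = ?", (bib_id,)
    ).fetchone()
    if row is None:
        return json.dumps({"error": "not found", "bib_id": bib_id})
    return json.dumps({"bib_id": row[0], "title": row[1], "author": row[2]})


conn = open_catalogue()
record = json.loads(get_record(conn, 1))
```

If the underlying tables change, only `get_record` needs to follow; the other apps and services keep consuming the same JSON shape.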


Alex
--
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ ---


Re: [CODE4LIB] Videos?

2007-03-05 Thread Alexander Johannesen

On 3/6/07, Noel Peden [EMAIL PROTECTED] wrote:

I'm finally back the office today and the videos are in process...  I'm
not sure where they'll go, but they'll be up somewhere.
BTW, if anybody has any ideas for royalty free title music (a short 3+
second thing), I'm open.  I'll whip up something if needed.


In my dark past I was a musician, and I've got stuff lying around
waiting for the opportune moment to be donated. What are you looking
for?


Alex
--
---
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] PHP Symfony

2007-03-24 Thread Alexander Johannesen

On 3/24/07, Michael J. Giarlo [EMAIL PROTECTED] wrote:

Hmm?  What's that you say?  Just a sec, but in the meantime, why not sit
down and have some of this delicious Kool-Aid over here?  It's Ruby
Red-flavored; I think you'll like it.


Come, now; those who meddle in things PHP know that a lot of the
goodness you get from Ruby you'll these days find in PHP as well.
Things have progressed quite a bit in the last 5 years, and PHP 5.2 is
quite mature and offers an OO model on par with Ruby, without the
hassle of being a fringe technology. :)

As for Symfony, yes, it's pretty good and complements (or
answers) the RoR thing well. I personally don't use it as I'm more of
an XSLT, SOA, REST freak (and Symfony is slightly tricky to push into
that box, especially given the non-MVC direction of the SOA we're
building). Now that RoR 1.2 has better support for REST I think
Symfony may follow, but I don't like the default templating language
(PHP with specials) nor the non-MVC paradigm. Having said that, I
haven't used it for a few versions and things may have improved. Check
it out.


Alex
--
---
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Position Available: Manager of Data Systems

2007-05-17 Thread Alexander Johannesen

On 5/18/07, Patty De Anda [EMAIL PROTECTED] wrote:

MANAGER OF DATA SYSTEMS


... and not a word (that I could find) on where in the world - or
where in the assumed USA - this position is held. :)


Alex
--
---
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] good web service api

2007-06-30 Thread Alexander Johannesen

On 6/30/07, Eric Lease Morgan [EMAIL PROTECTED] wrote:

What are the characteristics of a good Web Service API?


That you refrain from the notion of an API. :)

Seriously, before you do anything, read the book RESTful Web Services
by Leonard Richardson and Sam Ruby
(http://www.oreilly.com/catalog/9780596529260/). I'd do it the ROA way
(resource-oriented architecture; and have for some time), but I do
understand it puts a certain strain on the areas of the brain
responsible for learning conceptually new things.
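
For a feel of what "resources instead of an API" means, here is a toy sketch: every thing has a URI, and the same few verbs apply to all of them. The `handle` function, the in-memory `resources` store and the example path are illustrative assumptions, not taken from the book or from any real service.

```python
from typing import Dict, Optional, Tuple

# In-memory resource store keyed by URI path (stand-in for real storage).
resources: Dict[str, dict] = {}


def handle(method: str, path: str, body: Optional[dict] = None) -> Tuple[int, Optional[dict]]:
    """Apply a uniform-interface verb to the resource at `path`.

    Returns (HTTP-style status code, representation or None).
    """
    if method == "GET":
        if path in resources:
            return 200, resources[path]
        return 404, None
    if method == "PUT":
        if body is None:
            return 400, None
        status = 200 if path in resources else 201  # updated vs. created
        resources[path] = body
        return status, body
    if method == "DELETE":
        if resources.pop(path, None) is not None:
            return 204, None
        return 404, None
    return 405, None  # verb not supported


created = handle("PUT", "/books/dune", {"title": "Dune"})
fetched = handle("GET", "/books/dune")
```

There is no `getBook()` or `deleteBook()` to learn; a client that knows the URI and the verbs already knows how to use the service.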


Alex
--
---
Project Wrangler, SOA, Information Alchymist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Open Source OPAC - VUFind Beta Released

2007-07-20 Thread Alexander Johannesen

On 7/20/07, Andrew Nagy [EMAIL PROTECTED] wrote:

http://www.vufind.org/


Excellent stuff, and thanks for the open-source effort.

Three things ;

1. Will there be efforts towards a development community outside your library?

2. http://www.vufind.org/demo/Record/56179 has serious problems in its
similar items section. :)

3. If you scroll down a list of things and then do something that
requires a login, the action shows up only in the top part of the page,
which is no longer in view. The user sees nothing, so it looks like
nothing happens.

Apart from that, great stuff and, if you accept such, I'd love to
participate in ways that I can.


Kind regards,

Alexander
--
---
Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] [Fwd: [NGC4LIB] A Thought Experiment]

2007-11-08 Thread Alexander Johannesen
Hiya,

On Nov 9, 2007 7:42 AM, Carl Grant [EMAIL PROTECTED] wrote:
 I'm seeking some help understanding here.   From my perspective
 (again, that of a long time vendor of commercial software having
 recently moved to commercial service for OSS software) this is
 exactly what a number of us (LibLime, Evergreen, Index Data, CARE
 Affiliates) are *trying* to do.   We're not only providing the
 services to allow libraries to adopt open source, we're also doing
 the marketing and selling that libraries seem to require before
 they'll even consider the option.

I think this is extremely important for the library world right now,
far more important than any current standard, model or prototyping
exercise ; support the vendors going Open Source. Don't think about it
for too long ; we must grab this opportunity *at all cost*, because,
frankly, it's the only chance we've got to set ourselves straight
again. The only way to get away from the suppressed and locked-down
legacy-driven world we currently live in is to embrace openness,
especially when it's coming from vendors (who are by that very token
asking us to work *with* them this time instead of just buying their
stuff).

There's a slight clause here, though, for the vendors ; you *must*
adopt web services for *every* part of your solutions. I know that
this often goes against the grain of a proposed system (a system
that holistically solves a problem space) but the truth of the matter
is that you will never make your system work spot on for everyone, and
we need the reassurance (even if we never use the option) of going in
a different direction or using someone else's solution for a
particular problem. By allowing a more open development model the
library world will love you and gladly give you money for support and
further development. Consider the openness more of a token, even, than
a real option.

Here's a quick list of things I see crucially happening ;

* The library world has to come together to create a common language
for these web services, an ontology if you will. We must decide on a
few good (and possibly already existing) protocols and dictionaries.

* Vendors must settle on a development model for web services (and I'd
humbly suggest a REST model) and not be afraid of opening up or
segmenting their holistic solutions into sharable / interchangeable
parts.

* Get some outside experts in to handle usability and interaction
design, and open source the result. Create a consortium or
interest-group for library systems usability and user experience.

* Make sure we've got a *clean* cut of technology between business
logic and the user interface. Enforce low-key semantically-rich XHTML
and use CSS everywhere.

Here's to dreaming.


Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] theinfo.org: for people who work with big data sets

2008-01-15 Thread Alexander Johannesen
On Jan 16, 2008 7:08 AM, Aaron Swartz [EMAIL PROTECTED] wrote:
 http://theinfo.org/

Excellent initiative! Joined, and I'll forward the information around
to other communities I know do this type of work.


Regards,

Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Gartner on OSS

2008-03-30 Thread Alexander Johannesen
Let's try the litmus test for enterprisey business bullshit : porridge ;

Recommendations for Users
 * Look for a sustainable community that has a critical mass of skills
   supporting porridge.
 * Look for a cultural match between the porridge community and
   your internal developers and user culture as it enhances communication
   and perceived user satisfaction.
 * Prepare an SOA that can integrate IT services from many sources,
   including porridge.
 * Avoid porridge that is not built on open standards.
 * Make a conscious risk-based decision about whether you will depend on
   internal resources or external services for your porridge implementations.

In short, another template piece where [insert your favourite thing
here] is wrapped around generic advice. Do they say anything that's
specific to what open-source is all about?


Alex (without reading the darn article...)
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Gartner on OSS

2008-03-30 Thread Alexander Johannesen
On Sun, Mar 30, 2008 at 7:51 PM, K.G. Schneider [EMAIL PROTECTED] wrote:
 Sorry, Alexander, I disagree.

What, is that allowed!? :)

 Gartner may sound creaky but under the starchy
  language, this is pretty revolutionary advice.

I can't agree with the revolutionary advice part; business leaders,
firms, advisers and abusers have been saying this already for years.
That Gartner now is on the field saying it too shows nothing except
how conservative they are; this is an old message, and certainly not
aimed at people who are doing the actual work in their organisations.

I've been in the enterprise for most of my life as a high-flying
consultant (except my non-enterprise last few years in the library
world), and currently work as manager, developer and advisor to
the largest enterprise organisations around. We've always recommended
and / or used OSS, and integrated the very ideal into the fabric of
enterprise software development.

The only people that Gartner now is playing to are the business
people, who will be surprised to learn that their organisations
already use (and many fully embrace) OSS, and have done so for years.
(How they'll cope with that news is another story, and maybe Gartner
is their coming safety blanket.) Even big guys who think that only the
Oracle business stack is good enough for them will be surprised to
find the odd OSS project supporting their infrastructure.

OSS is already successful, and it's already working great even if the
MBAs don't know it. And because Gartner now is playing to those
people, that's why the porridge litmus test works so great; in
reality, nothing will change, which for many is the perfect advice.


Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Gartner on OSS

2008-03-31 Thread Alexander Johannesen
On Mon, Mar 31, 2008 at 2:45 AM, D Chudnov [EMAIL PROTECTED] wrote:
  ...at the risk of upsetting *everybody*...

It's a bit depressing that once we get an interesting discussion going
on this list which normally has such low volume, and which is
*definitely* on-topic, someone comes along and tries to kill it
because it doesn't fit *their* ideal of what the topics should be.

Allow me to vent a few seconds; Sorry, but OSS is *all* about code and
often about business models, and rest assured Karen and all the rest
of us *definitely* are defining the enterprise in question as the
library world, so this is *all* about code for libraries. We aren't
writing code in the posts, but we certainly are talking about code.
Nitpicking about such *tiny* semantic differences is just one of those
things which drive me up the wall! Of *course* this topic has a place
on this list, and of *course* we're not going to create Yet Another
MailingListForSomethingJustBecauseWeAreBloodyLibraries, and of
*course* we should talk about these things, and *especially* here
where coders talk about code. Code is more than syntax.

But I guess this thread is dead now, and so is at least *my* ideal of
what this list is, so take care.


Grumpy,

Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions

2008-05-22 Thread Alexander Johannesen
On Thu, May 22, 2008 at 5:06 PM, K.G. Schneider [EMAIL PROTECTED] wrote:
 I feel self-conscious about seeing posts reflected in the planet that
 are not related to library technology, only because I'm not willing to
 break up my blog into sub-blogs and don't know if oysters and pace
 layering really go together for the planet.

Ouch, I suspect a conversation next about what fits the code4lib
planet moniker. Do my technology rants that don't bash MARC fit?
Do Topic Maps fit, even if libraries don't use them but they are a
perfect fit? Posts about philosophical aspects of the code we make? Or
the epistemological musings of workflows? Let's not forget that the
human aspect of the library profession is what makes librarians so
great ...

It's a tough one.


Alex
--
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] PHP5 Help

2008-07-01 Thread Alexander Johannesen
On Tue, Jul 1, 2008 at 13:42, Nicole Engard [EMAIL PROTECTED] wrote:
 I am missing something right in front of my eyes.  I'm rusty on my
 PHP, I'm wondering if someone can help me with this error:

 Warning: gmmktime() expects parameter 3 to be long, string given in
 /public_html/magpierss-0.72/rss_utils.inc on line 35

Well, it's a bit puzzling in the sense that the parameters are all
ints, but hey. :) Try casting the values:

   gmmktime( (int) $hours, (int) $minutes, (int) $seconds,
             (int) $month, (int) $day, (int) $year );

or try the same with (long).


Alex


Re: [CODE4LIB] marc21 and usmarc

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 17:04, Eric Lease Morgan emor...@nd.edu wrote:
 Can somebody say MARCXML or MODS complete with a schema?

Well, we can say it, and I think we *have* said it for a very long
time, but it doesn't seem to change anything. Damn those words.

 Such solutions offer at least syntactic validation if not also
 semantic validation. Oh well.

I would say a little bit more than oh well (but I don't really have
to; you know how I feel :), but I would love to hear what the vendors
are thinking about all this. They seem to be very, very quiet about it
all (without speculating as to why ...)


regards,

Alex


Re: [CODE4LIB] marc21 and usmarc (fwd)

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 17:09, Ardie Bausenbach a...@loc.gov wrote:
 Since that time, many other national libraries have moved from
 their national formats to MARC 21, including (among others),
 the UK, Germany, Finland, and Spain.

I know a few more, but another point worth, er, screaming about, is
the various AACR2 / RDA / other rules changes that aren't linked to
MARC at all. I know a lot of it is covered in MARC documentation, but
there are hidden gems, like punctuation, symbols, character encodings,
etc. which aren't always specified.

If the library world embraced XML as a minimum a lot could be fixed in
that area (and no, MARCXML does not qualify :).


Regards,

Alex


Re: [CODE4LIB] marc21 and usmarc

2009-01-27 Thread Alexander Johannesen
On Tue, Jan 27, 2009 at 18:56, Kyle Banerjee kyle.baner...@gmail.com wrote:
 There are arguments to do so, but the business case is not strong.

Well, I'd say the future of the library world is a good business case,
and I know several people (high and low) fully aware of it, but I
think it's hard to take any step in either direction that would be
deemed worth it. Tough one, indeed.

 That data providers won't send MODS until libraries demand it.
 Libraries won't demand it until their systems use it. Systems won't
 use it until libraries demand it because that's what their data
 providers require.

Well, I've been yelling for vendors to get more involved for a long
time, but there's a lot of blankness coming from them. I guess they're
happy with the current tie to MARC (binding the libraries to them
forever) until the business is gone ...

 It's a vicious circle, so we're stuck with MARC. The only people who
 aren't happy with this arrangement are those who are trying to create
 something new. Many librarians who think they use MARC every day
 have no idea that it is a binary format that is unfriendly to eyes and
 machines.

MARC may be MAchine Readable, but not MAchine Understandable or even
MAchine Usable.

I had an idea some time ago to create a dummy / fake MARC record with
much more to it (like extensions and special tags systems can react
to, such as validation) and pass it around the infrastructure to see
what in it survives (the golden rule is to ignore what you don't
understand, although I know a few MARC systems that filter out what
they don't understand (!!!) because, well, these systems were mostly
built back when a megabyte of storage and / or memory had a price of
about a cataloger or two. Friggin' crazies!). Anyone in? :)
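
The probe-record idea above can be sketched roughly as follows. This is
only an illustration in Python: the tags, the KNOWN_TAGS set and both
"systems" are invented here, and a real test would push an actual MARC
record through actual systems rather than dicts through functions.

```python
# Hypothetical sketch: the tags and the two "systems" are made up for illustration.
KNOWN_TAGS = {"100", "245", "260"}


def tolerant_system(record):
    """Follows the golden rule: passes on fields it does not understand."""
    return dict(record)


def filtering_system(record):
    """Drops every field it does not recognise."""
    return {tag: value for tag, value in record.items() if tag in KNOWN_TAGS}


def surviving_tags(record, pipeline):
    """Push a record through each system in turn and list the tags that survive."""
    for system in pipeline:
        record = system(record)
    return sorted(record)


probe = {
    "100": "Herbert, Frank",
    "245": "Dune",
    "9X1": "validation-hint",   # made-up extension tag
    "9X2": "extra-semantics",   # made-up extension tag
}

print(surviving_tags(probe, [tolerant_system]))
print(surviving_tags(probe, [tolerant_system, filtering_system]))
```

Comparing the two lists shows which hop in the chain threw the
extension fields away.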


Regards,

Alex


Re: [CODE4LIB] MARC 21 and MODS

2009-01-28 Thread Alexander Johannesen
On Wed, Jan 28, 2009 at 20:29, Rebecca S Guenther r...@loc.gov wrote:
 It is interesting though that a study of different metadata
 formats at Los Alamos National Labs a few years ago
 concluded that MARCXML was the richest and most robust.
 http://www.dlib.org/dlib/september06/goldsmith/09goldsmith.html

Umm, I just have to add that all those compared won't make it to my
top 10 list of good formats, so, er, comparing library formats against
each other is a bit like comparing all the wonderful juicy fruit in
the world where your selection is limited to what can grow in Alaska.

It still amazes me that RDF and / or DC hidden in SRDF or Topic Maps
haven't gotten any traction when it seriously matches what you want.

 We are also working on modeling MODS as RDF-- some
 work has already been done on this.

That is good news, albeit a little late and certainly a little slow.
But I hear good things about Talis moving into this arena, and
hopefully they can pull a few other vendors with them. I guess the
first thing that is needed is a basic MARC / RDF vocabulary we can all
participate in and extend, and then cross-pollinate vocabularies as we
move away from AACR2 to more RDA / FRBR friendly stuff (although, me
personally, I would jump way ahead of RDA, but that's not going to
happen).

 In terms of MARC, we are planning for its evolution and streamlining to
 get rid of some of its problems and plan for a future where the transition
 to new cataloging rules will work well with existing records and cataloging
 infrastructure.

Are you talking about RDA here? And when will these changes happen, in
what form, how do you build momentum and expertise, etc.?

 Whatever the format of the future is, the transition will need
 to be evolutionary because of the billions of records that are
 out there and the need to satisfy a lot of the user tasks
 required of library (and other) metadata.

I agree fully, although I'd stress the poor infrastructure as a
reason more than records available (they can always be converted into
something else, but you can't easily change how systems require
MARC21).

 It is also worth noting that despite some calls for a MARC
 replacement, we have a number of national libraries
 throughout the world that are abandoning their national
 formats and just now adopting MARC 21. They also need
 to be considered in this transition.

I find it a bit scary it's taken this long, but I certainly welcome
the change as it makes it easier to move from one format to the other
once we all agree on a fundamental platform. But I still don't think a
clear direction forward is set. Any docos you can point to about the
future direction of LoC approved meta data exchange?


regards,

Alex


Re: [CODE4LIB] MARC 21 and MODS

2009-01-29 Thread Alexander Johannesen
Hi there,

On Thu, Jan 29, 2009 at 15:55, Rebecca S Guenther r...@loc.gov wrote:
 Yes, better late than never (we're a small office and stretched thin).

You're not *that* small, no? :)

 Also we want to explore MARC/RDF. We also have to keep in mind
 that MARC is also used by non-AACR2 users (and when RDA is
 implemented non-RDA users).

Shouldn't the library world slowly work towards a common set of rules,
backed by technology, to make it easier for us all to move forward
with less pain?

 As a starting point in exploring semantic web types
 of technologies we are establishing a registry for controlled values
 used in various standards-- MARC, MODS, PREMIS. See the text at:
 http://id.loc.gov

Ah, I like! This is very close to the concept in Topic Maps of
Published Subject Indicators. Could the identifiers within have a
certain degree of persistence and resolvability? If so, both the
SemWeb and TM communities could use this out of the box. I also think
the DC RDA working-group has something similar. Karen? And should you
work together?

 In the meantime we have a prototype at:
 http://www.loc.gov:8081/standards/registry/lists.html

Can't make much work there. Must be in alpha. :) But I like this
direction. If you can now get the vendors on board, or better, make
more SemWeb systems yourselves, you're a *huge* step forward. I'm
*very* excited to see this coming from LoC.


Regards,

Alex


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-12 Thread Alexander Johannesen
On Thu, Feb 12, 2009 at 21:43, Rebecca S Guenther r...@loc.gov wrote:
 Patrick is right that an XML schema such as MODS or MARCXML would be text/xml.

I would strongly advise against text/xml, as it is an oxymoron (text
is not XML and XML is not text, even if it is delivered through a text
protocol), and more and more people are switching away from the
generic text protocol (which makes little sense for structured data).

Hence, a more correct MIME type for MARCXML would be
application/marc+xml, although until it is registered it should be
application/x-marc+xml.
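
As a rough sketch of what that means when serving a record, the Python
below builds response headers with the type suggested above. The
function names and the "registered" switch are assumptions made for
this example only; nothing here checks any registry.

```python
# Sketch only: the function names and the "registered" flag are hypothetical.
def marcxml_content_type(registered=False, charset="utf-8"):
    """Pick the media type argued for above, with the x- prefix until registered."""
    media_type = "application/marc+xml" if registered else "application/x-marc+xml"
    return f"{media_type}; charset={charset}"


def response_headers(body, registered=False):
    """Build minimal HTTP response headers for a MARCXML payload given as bytes."""
    return {
        "Content-Type": marcxml_content_type(registered),
        "Content-Length": str(len(body)),
    }


print(response_headers(b"<record/>"))
```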


Alex


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-12 Thread Alexander Johannesen
On Thu, Feb 12, 2009 at 22:32, Jonathan Rochkind rochk...@jhu.edu wrote:
 Didn't we finish having this conversation last week? We talked about all
 this stuff being brought up now last week.

We did indeed, and your summary is better than what my retort could
have been; spot on.

I guess it's hard to understand why text/xml is such a waste of MIME
and time as long as we've still got text/html as the originally
understood MIME type for HTML pages, but luckily the internet has moved
on and evolved. :)

One question we haven't asked is if we really need a MIME type for MARCXML. :)


Alex


Re: [CODE4LIB] MIME Type for MARC, Mods, etc.?

2009-02-13 Thread Alexander Johannesen
 One question we haven't asked is if we really need a MIME type for
 MARCXML. :)

On Thu, Feb 12, 2009 at 23:28, Jonathan Rochkind rochk...@jhu.edu wrote:
 PPS: Yes, it has been asked, and it's pretty obvious to me that we do.

I wasn't asking for technical reasons; I was more having a stab at how
many people use and need MARCXML specifically as compared to a number
of other more used formats. I mean, seriously, you can use MARCXML
embedded in Atom and get the best of both worlds instead.

Don't worry about it; it's not a serious _enough_ question. :)


Alex


Re: [CODE4LIB] points of failure (was Re: [CODE4LIB] resolution and identification )

2009-04-02 Thread Alexander Johannesen
On Fri, Apr 3, 2009 at 10:44, Mike Taylor m...@indexdata.com wrote:
 Going back to someone's point about living in the real
 world (sorry, I forget who), the Inconvenient Truth is that 90% of
 programs and 99% of users, on seeing an http: URL, will try to treat
 it as a link.  They don't know any better.

What on earth is this about? URIs *are* links; it's in their design,
it's what they're supposed to be. Don't design systems where they are
treated any differently. Again we're seeing the "all we need are URIs"
poor judgement of SemWeb enthusiasts muddying the waters. The short of
it is, if you're using URIs as identifiers, having the choice to
dereference them is a *feature*; if one resolves to 404 then tough (and
I'd say you designed your system poorly), but if it resolves to an
information snippet about the semantic meaning of that URI, then yay.
This is how us Topic Mappers see this whole debacle and flaw in the
SemWeb structure, and we call it Published Subject Indicators, where
"Published" means it resolves to something (just like Wikipedia URIs
resolve to some text that explains what it is representing),
"Subjects" are anything in the world (but distinct from "Topics" which
are software representations), and "Indicators" as they indicate
(rather than absolutely identify) things.

In other words, if you use URIs as identifiers (which is a *good*
thing), then resolvability is a feature to be promoted, not something
to be shunned. If you can't make good systems design, use URNs. You
can treat URI identifiers as both identifiers and subject indicators,
while URNs are evil.

 Let's make our identifiers look like identifiers.

What does that even mean? :)

 (By the way, note that this is NOT what I was saying back at the start
 of the thread.  This means that I have -- *gasp* -- changed my mind!
 Is this a first on the Internet?  :-)

Maybe, but it surely will be the last ...


Alex


Re: [CODE4LIB] Something completely different

2009-04-08 Thread Alexander Johannesen
On Wed, Apr 8, 2009 at 22:38, Dr R. Sanderson azar...@liverpool.ac.uk wrote:
 I would encourage looking at rdf triplestores seriously, if the graph
 approach is the direction that you want to go in.

Or, Topic Maps which is *not* a triplestore, closer to the OO model
(basically a meta data model), and don't carry the stack overflow of
RDF (RDF, RDFs, OWL 1-2-3) nor anonymous nodes. :)


Regards,

Alex


Re: [CODE4LIB] Something completely different

2009-04-08 Thread Alexander Johannesen
On Thu, Apr 9, 2009 at 14:33, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 That's not an entirely useful comparison on topic maps and RDF.

If I intended to be useful I'd write something substantial, backed up
with stuff other than humour. I'll give that a go the next time. :)

 We currently use topic maps, alot, in our infrastructure. If we were
 starting again tomorrow, I'd advocate using RDF instead, mainly because of
 the much better tool support and take-up.

Hmm, not a good thing at all. Could you elaborate, though, as I use it
as part of infrastructure too, and wouldn't touch RDF / SemWeb
without a long stick? I'm into application semantics and shared
knowledge-bases. What are you guys doing where you feel the support
and tools are lacking? And what are the RDF alternatives?


Regards,

Alex


Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-14 Thread Alexander Johannesen
On Tue, Apr 14, 2009 at 23:34, Jonathan Rochkind rochk...@jhu.edu wrote:
 The difference between URIs and URLs?  I don't believe that URL is 
 something that exists any more in any standard, it's all URIs. Correct me if 
 I'm wrong.

Sure it exists: URLs are a subset of URIs. URLs are locators as
opposed to just identifiers (which is an important distinction, much
used in SemWeb lingo), where URLs are closer to the protocol-like
things Ray describes (or so I think).

 I don't entirely agree with either dogmatic side here, but I do think that 
 we've arrived at an
 awfully confusing (for developers) environment.

But what about it is confusing (apart from us having this discussion
:) ? Is it that we have IDs that happen to *also* resolve? And why is
that confusing?

 Re-reading the various semantic web TAG position papers people keep
 referencing, I actually don't entirely agree with all of their principles in 
 practice.

Well, let me just say that there's more to SemWeb than what comes out of W3C. :)


Kind regards,

Alex


Re: [CODE4LIB] Something completely different

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 07:10, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 RDF, unlike topic maps, is being used by substantial numbers of people who
 we interact with in the real world and would like to interoperate with. If
 we used RDF rather than topic maps internally, that interoperability would
 be much, much cheaper. It's tempting to say it's free, but it's not quite,
 because it does impose some constraints.

But it's not that hard to create a bridge from RDF to Topic Maps and
back, no? Or is your interop story different?

 In my eyes, the core thing that RDF supports that topic maps don't seem to
 is seamless reuse by people you don't care about.

Yes, this has been brought up on several occasions, including by me at
the TMRA 2008. But then, it's not so much that RDF does something that
Topic Maps doesn't *support*, it's that it's packaged differently. So,
where RDF has got five standard ontology levels (RDF, RDFS, OWL
DL/Lite/Full) Topic Maps has one simpler one (TMDM), yet neither can
express anything better or differently than the other.

My theory here is that people *like* 5 layers of RDF, because it gives
the false sensation of choice. But it's all ontological definitions.
However, the 5 levels of RDF do indeed create a defined platform for
sharing (if not cast in iron), whereas in the TM world you need to
include it / create it.

Oh, and of course the academics seem to have embraced W3C and anything
by the authority of TBL, and its effect is trickling down.

 For example the people at http://lcsubjects.org have never heard of us (that
 I know of), but we can use their URLs like
 http://lcsubjects.org/subjects/sh90005545#concept to represent our roles.

Not sure I understand your example. Here's my Topic Map identifier in
a Topic Map ;

   http://psi.ontopedia.net/Alexander_Johannesen

Identifier and locator, and resolvable, and can be used by anyone.


Regards,

Alex


Re: [CODE4LIB] Something completely different

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 10:32, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 Yes, we mint something very similar (see http://authority.nzetc.org/52969/
 for mine), but none of our interoperability partners do. None of our local
 libraries, none of our local archives and only one of our local museums (by
 virtue of some work we did with them).
 All of them publish and most consume some form RDF.

Hmm, RDF resources are just URIs, so I'm still a bit unsure about what
you mean. Are you talking about the fact that the RDF definitions (and
not the RDF vocabs themselves) aren't encoded in your TM engine?

 Additionally many of the taxonomies we're interested in are available in RDF
 but not topic maps.

Converting them to a Topic Map isn't that hard to do, but I guess
there is *a* cost there.


Regards,

Alex


Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-14 Thread Alexander Johannesen
On Wed, Apr 15, 2009 at 00:20, Jonathan Rochkind rochk...@jhu.edu wrote:
 Can you show me where this definition of a URL vs. a URI is made in any 
 RFC or standard-like document?

From http://www.faqs.org/rfcs/rfc3986.html ;

1.1.3.  URI, URL, and URN

   A URI can be further classified as a locator, a name, or both.  The
   term Uniform Resource Locator (URL) refers to the subset of URIs
   that, in addition to identifying a resource, provide a means of
   locating the resource by describing its primary access mechanism
   (e.g., its network location).  The term Uniform Resource Name
   (URN) has been used historically to refer to both URIs under the
   urn scheme [RFC2141], which are required to remain globally unique
   and persistent even when the resource ceases to exist or becomes
   unavailable, and to any other URI with the properties of a name.

   An individual scheme does not have to be classified as being just one
   of name or locator.  Instances of URIs from any given scheme may
   have the characteristics of names or locators or both, often
   depending on the persistence and care in the assignment of
   identifiers by the naming authority, rather than on any quality of
   the scheme.  Future specifications and related documentation should
   use the general term URI rather than the more restrictive terms
   URL and URN [RFC3305].

As you can see, a URI is an identifier, and a URL is a locator
(mechanism for retrieval), and since URLs are a subset of URIs, you
_can_ resolve URIs as well.
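
To make the distinction a little more concrete, here is a small Python
sketch that sorts URIs by scheme. It is a deliberately crude heuristic:
the LOCATOR_SCHEMES set is an assumption made for this example, and the
RFC text quoted above is clear that a scheme alone does not settle
whether a URI acts as a name, a locator or both.

```python
from urllib.parse import urlsplit

# Assumption for this sketch: schemes that come with an obvious access mechanism.
LOCATOR_SCHEMES = {"http", "https", "ftp"}


def classify(uri):
    """Roughly label a URI as a name, a plain identifier, or identifier plus locator."""
    scheme = urlsplit(uri).scheme.lower()
    if scheme == "urn":
        return "name"
    if scheme in LOCATOR_SCHEMES:
        return "identifier and locator"
    return "identifier"


for example in (
    "http://psi.ontopedia.net/Alexander_Johannesen",
    "urn:isbn:0451450523",
    "info:lccn/2002022641",
):
    print(example, "->", classify(example))
```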

 Sure, we have a _sense_ of how the connotation is different, but
 I don't think that sense is actually formalized anywhere.

It is, and the same stuff is documented in Wikipedia as well ;

   http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
   http://en.wikipedia.org/wiki/Uniform_Resource_Locator

 I think the sem web crowd actually embraces this confusingness,

No, I think they take it at face value; they (the URIs) are
identifiers for things, and can be used for just that purpose, but
they are also URLs, which means they resolve to something. What I
think you're getting at is the thing it resolves to, as *that*
has no definition. But then, if you go from RDF to Topic Maps PSIs
(PSIs are URIs with an extended meaning), *that* thing it resolves to
indeed has a definition; it's the prose explaining what the identifier
identifies, and this is the most important difference between RDF and
Topic Maps (and a very subtle but important difference, too).

 they want to have it both ways: Oh, a URI doesn't need to resolve,
 it's just an opaque identifier; but you really should use http URIs
 for all URIs; why? because it's important that they resolve.

I smell straw-man. :) But yes, they do want both, as both is in fact a
friggin' smart thing to have. We all deal with identifiers all the
time, in internal as well as external applications, so why not use an
identifier scheme that has the added bonus of a resolver
mechanism? If you want to be stupid and lock yourself in your limited
world, then using them as just identifiers is fine but perhaps a bit,
well, stupid. But if you want to be smart about it, realizing that
without ontological work there will *never* be proper interop, you use
those identifiers and let them resolve to something. And if you're
really smart, you let them resolve to either more RDF statements, or,
if you're seriously Einsteinly smart, use PSIs (as in Topic Maps) :).

 In general, combining two functions in one mechanism is a
 dangerous and confusing thing to do in data design, in my opinion.

Because ... ?

 By analogy, it's what gets a lot of MARC/AACR2 into trouble.

Hmm, and I thought it was crap design that did that, coupled with poor
metadata constraints and validation channels, untyped fields, poor
tooling, the lack of machine understandability, and the general
library idiom of not invented here. But correct me if I'm wrong. :)

 Over in: http://www.w3.org/2001/tag/doc/URNsAndRegistries-50-2006-08-17.html

Umm, I'd be wary to take as canon a draft with editorial notes going
back 4 to 5 years that still aren't resolved. In other words, this
document isn't relevant to the real world. Yet.

 They suggest: URI opacity    'Agents making use of URIs SHOULD NOT attempt 
 to infer properties of the referenced resource.'

Well, as a RESTafarian I understand this argument quite well. It's
about not assuming too much from the internal structure of the URI.
Again, it's an identifier, not a scheme such as a URL where structure
is defined. Again, for URIs, don't assume structure because at this
point it isn't a URL.

 If I get a URI representing (eg) a Sudoc (or an ISSN, or an LCCN), I need to
 be able to tell from the URI alone that it IS a Sudoc, AND I need to be able
 to extract the actual SuDoc identifier from it.  That completely violates 
 their
 Opacity requirement

I think you are quite mistaken on this, but before we leap into whether
the web is suitable for SuDoc I'd

Re: [CODE4LIB] resolution and identification (was Re: [CODE4LIB] registering info: uris?)

2009-04-15 Thread Alexander Johannesen
Hiya,

On Thu, Apr 16, 2009 at 01:10, Jonathan Rochkind rochk...@jhu.edu wrote:
 It stands in the way of using them in the fully realized sem web vision.

Ok, I'm puzzled. How? As the SemWeb vision is all about first-order
logic over triplets, and the triplets are defined as URIs, if you can
pop something into a URI you're good to go. So how is it that SuDoc
doesn't fit into this, as you *can* chuck it in a URI? I said it was
unfriendly to the Web, not impossible.

 It does NOT stand in the way of using them in many useful ways that I can
 and want to use them _right now_.

Ah, but then go fix it.

 Ways which having a URI to refer to them
 are MUCH helped by. Whether it can resolve or not (YOU just made the point
 that a URI doesn't actually need to resolve, right? I'm still confused by
 this having it both ways -- URIs don't need to resolve, but if your URIs
 don't resolve then you're doing it wrong. Huh?)

C'mon, it ain't *that* hard. :) URIs as identifiers is fine, having
them resolve as well is great. What's so confusing about that?

 , if you have a URI for a
 SuDoc you can use it in any infrastructure set up to accept, store, and
 relate URIs. Like an OpenURL rft_id, and, yeah, like RDF even.  You can make
 statements about a SuDoc if it has a URI, whether or not it resolves,
 whether or not SuDoc itself is 'web friendly'.  One step at a time.

 This is my frustration with semantic web stuff, making it harder to do
 things that we _could_ do right here and now, because it violates a fantasy
 of an ideal infrastructure that we may never actually have.

Huh? The people who made SuDoc didn't make it web friendly, and thus
the SemWeb stuff is harder to do because it lives on the web? (And
chucking your meta data into HTML as MF or RDF snippets ain't that
hard, it just requires a minimum of knowledge)

 There are business costs, as well as technical problems, to be solved to
 create that ideal fantasy infrastructure. The business costs are _real_

No more real than the cost currently in place. The thing is that a lot
of people see the traditional cost disappear with the advent of SemWeb
and the new costs heavily reduced.

  Also, having a unified resolver for
 SuDoc isn't hard, can be at a fixed URL, and use a parameter for
 identifiers. You don't need to snoop the non-parameterized section of
 an URI to get the ID's ;
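
A minimal Python sketch of that resolver idea follows. The resolver
URL, the "id" parameter name and the SuDoc-shaped string are all
invented for illustration; no such service is known to exist.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical fixed resolver URL; not a real service.
RESOLVER = "http://resolver.example.org/sudoc"


def sudoc_uri(sudoc):
    """Wrap a SuDoc number in a URI by passing it as a query parameter."""
    return f"{RESOLVER}?{urlencode({'id': sudoc})}"


def sudoc_from_uri(uri):
    """Read the identifier back from the parameter; None if it is not ours."""
    parts = urlsplit(uri)
    if f"{parts.scheme}://{parts.netloc}{parts.path}" != RESOLVER:
        return None
    values = parse_qs(parts.query).get("id")
    return values[0] if values else None


uri = sudoc_uri("Y 4.J 89/1:109/33")  # made-up, SuDoc-shaped string
print(uri)
print(sudoc_from_uri(uri))
```

The identifier travels in an explicit parameter, so a client only
compares the fixed resolver prefix and never has to guess at meaning
from the rest of the path.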

 Okay, Alex, why don't you set this up for us then?

Why? I don't give a rat's bottom about SuDoc, don't need it, think it's
poorly designed, and gives me nothing in life. Why should I bother?
(Unless I'm given money for it, then I'll start caring ... :)

 And commit to providing
 it persistently indefinitely? Because I don't have the resources to do that.

Who's behind SuDoc, and are they serious about their creation? Those
are the people you should direct your anger at instead.

  And for the use cases I am confronted with, I don't _need_ it, any old URI,
 even not resolvable, will do--yes, as long as I can recognize it as a SuDoc
 and extract the bare SuDoc out of it.

So what's the problem with just making some stuff up? If you can do
your thing in a vacuum I don't fully understand your problem with the
SemWeb stuff? If you don't want it, don't use it.

 Which you say I shouldn't be doing
 (while others say that's a mis-reading of those docs to think I shouldn't be
 doing it)

No, I think this one is the subtle difference between a URL and a URI.

 but avoiding doing that would raise the costs of my software
 quite a bit, and make the feature infeasible in the first place. Business
 costs and resources _matter_.

As with anything on the Web, you work with what you got, and if you
can fix and share your fix, we all will love you for it. I seriously
don't think I understand what you're getting at here; it's been this
way since the Web popped into existence, and I don't really want it to
be any other way.

 No it's not; if you design your system RESTfully (which, indeed, HTTP
 is) then the discovery part can be fast, cached, and using URI
 templates embedded in HTTP responses, fully flexible and fit for your
 purposes.

 These URIs are
 _external_ URIs from third parties, I have no control over whether they are
 designed RESTfully or not.

Not sure I follow this one. There are no good or bad RESTful URIs,
just URIs. REST is how your framework works with the URIs.

 In the meantime, I'll continue trying to balance functionality,
 maintainability, future expansion, and the programming and hardware
 resources available to me, same as I always do, here in the real world when
 we're building production apps, not R&D experiments

My day job is to balance functionality, maintainability, future
expansion, and the programming and hardware resources available to me,
same as I always do, here in the real world when we're building
production apps ... and I'm using Topic Maps and SemWeb technologies.
Is there something I'm doing which degrades my work to an R&D
experiment, something I should let my customers 

Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-03 Thread Alexander Johannesen
With Topic Maps it's been solved years and years ago, and it's the
part of it that the RDF world didn't think of until recently (and
applied their kludges). I'm not going to bang my gong on this, just
urge you to read up on PSIs.

Alex


[CODE4LIB] Another nail in the coffin

2009-05-03 Thread Alexander Johannesen
Another nail in the library coffin, especially the academic ones ;

   http://www.youtube.com/watch?v=5TIOH80Qg7Q

Organisations and people are slowly turning into data producers, not
book producers.


Alex


Re: [CODE4LIB] Another nail in the coffin

2009-05-04 Thread Alexander Johannesen
On Mon, May 4, 2009 at 23:25, Joe Hourcle onei...@grace.nascom.nasa.gov wrote:
 You're forgetting the 5th Law:
        The library is a growing organism.
 http://en.wikipedia.org/wiki/Five_laws_of_library_science

Not forgotten, I just don't believe it anymore. And, taken to its
natural consequence, organisms through evolution come and go. :)


Alex


Re: [CODE4LIB] Another nail in the coffin

2009-05-04 Thread Alexander Johannesen
On Mon, May 4, 2009 at 22:44, Andreas Orphanides
andreas_orphani...@ncsu.edu wrote:
 You say that as though libraries are all about books.

Libraries still have the word biblio as their primer, and it
certainly is the written word on paper that occupies most of our time,
no? Sure libraries around the world are trying to play catch-up in the
digital and modern world with all sorts of things, but the primary
directive is still books for most librarians. Not sure what you mean
they're *really* into?


Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-07 Thread Alexander Johannesen
On Wed, May 6, 2009 at 18:44, Mike Taylor m...@indexdata.com wrote:
 Can't you just tell us?

Sorry, but surely you must be tired of me banging on this gong by now?
It's not that I don't want to seem helpful, but I've been writing a
bit on this here already and don't want to be marked as spam for Topic
Maps.

In the Topic Maps world our global identifiers are called PSIs, for
Published Subject Indicators. There are a few subtleties within this,
but they are not so different from any other identifier you'll find
elsewhere (RDF, library world, etc.) except of course they are
*always* URIs. Now, the thing here is that they should *always* be
published somewhere, whether as part of a list or elsewhere. The
next thing is that they always should resolve to something (although
the standard doesn't require this, I'd say you're doing it wrong
if yours don't resolve, even if that sometimes is a necessary evil).

This last part is really the important bit, where any PSI will 1) act
as a global identifier, and 2) resolve to human-readable text
explaining what it represents. Systems can just use it while at the
same time people can choose the right ones for their uses.

And, yes, the identifiers can be done any way you slice them. Some
might think that e.g. a PSI set for all dates is crazy as you need to
produce identifiers for all dates (or times), and that would be
just way too much to deal with, but again, that's not an identification
problem, that's a resolver problem. If I can browse to a PSI and get
the text that this is 3rd of June, 1971, using the whatnot calendar
style, then that's safe for me to use for my birthday. Let's pretend
the PSI is http://iso.org/datetime/03061971. By releasing a URI
template computers can work with this automatically, no frills.
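
A minimal sketch of that URI-template idea, in Python. The layout
(ddmmyyyy appended to http://iso.org/datetime/) is an assumption read
off the pretend PSI above, and the function names are hypothetical;
no real ISO service is implied.

```python
from datetime import date

# Hypothetical URI template, modelled on the pretend PSI in the message
# (http://iso.org/datetime/03061971). The ddmmyyyy layout is an assumption.
PSI_PREFIX = "http://iso.org/datetime/"


def date_psi(d: date) -> str:
    """Build the PSI for a calendar date by filling in the template."""
    return f"{PSI_PREFIX}{d.day:02d}{d.month:02d}{d.year:04d}"


def parse_date_psi(psi: str) -> date:
    """Recover the date from a PSI; raise ValueError if it doesn't fit."""
    if not psi.startswith(PSI_PREFIX):
        raise ValueError(f"not a date PSI: {psi}")
    digits = psi[len(PSI_PREFIX):]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"malformed date PSI: {psi}")
    # Slices follow the assumed layout: dd, mm, yyyy.
    return date(int(digits[4:]), int(digits[2:4]), int(digits[:2]))
```

Nobody has to mint a PSI per date up front; both sides only need to
agree on the template, and the resolver can answer for any date on
demand.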

Now a bit more technical; any topic (which is a Topic Map
representation of any subject, where subject is defined as anything
you can ever hope to think of) can have more than one PSI, because I
might use the PSI http://someother.org/time/date/3/6/1971 for my date.
If my application only understands the former set of PSIs, I can't
merge and find similar cross-semantics (which really is the core of
the problem this thread has been talking about). But simply attach the
second PSI to the same Topic, and you can. In fact, both parties will
understand perfectly what you're talking about.
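
As a rough illustration of why the extra PSI matters, here is a small
Python sketch. The Topic class and the merge rule (two topics are the
same subject if they share at least one PSI) are simplifications I am
assuming for the example, not an implementation of the Topic Maps
standard.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """Hypothetical, stripped-down topic: a label plus its set of PSIs."""
    name: str
    psis: set = field(default_factory=set)


def same_subject(a: Topic, b: Topic) -> bool:
    """Assumed merge rule: topics sharing any PSI are about the same subject."""
    return bool(a.psis & b.psis)


def merge(a: Topic, b: Topic) -> Topic:
    """Merge two topics about the same subject, keeping every PSI."""
    if not same_subject(a, b):
        raise ValueError("no shared PSI, refusing to merge")
    return Topic(a.name, a.psis | b.psis)


mine = Topic("Alex's birthday", {"http://someother.org/time/date/3/6/1971"})
theirs = Topic("3 June 1971", {"http://iso.org/datetime/03061971"})

# With one PSI each, the two applications cannot tell these are the same date.
unrelated = same_subject(mine, theirs)

# Attach the second PSI to my topic, and both parties can now match it.
mine.psis.add("http://iso.org/datetime/03061971")
merged = merge(mine, theirs)
```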

More complex is that the definitions of PSI sets don't have to
happen on the subject level, i.e. the Topic called Alex to which I
tried to attach my birthday. It can be moved to a meta model level,
where you say the Topic for Time and dates has the PSIs for both
organisations, and all Topics just use one or the other; we're
shifting the explicitness of identification up a notch.

Having multiple PSIs might seem a bit unordered, but it's based on the
notion of organic growth, just like the web. People will gravitate
towards using PSIs from the most trusted sources (or most accurate or
most whatever), shifting identification schemes around. This is a good
thing (organic growth) at the price of multiple identifiers, but if
the library world started creating PSIs, I betcha humanity and the
library world both could be saved in one fell swoop! (That's another
gong I like to bang)

I'm kinda anticipating Jonathan saying this is all so complex now. :)
But it's not really; your application only has to have complexity in
the small meta model you set up, *not* for every single Topic you've
got in your map. And they're mergable and shareable, and as such can
be merged and fixed (or cleaned or sobered or made less complex) for
all your various needs also.

Anyway, that's the basics. Let me know if you want me to bang on. :)
For me, the problem libraries face isn't really the mechanisms of
this (because this is solvable, and I guess you just have to trust
that the Topic Maps community has been doing this for the last 10
years or so already :), but how you're going to fit existing
resources into FRBR and RDA, and that's a separate discussion.


Regards,

Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-08 Thread Alexander Johannesen
On Sat, May 9, 2009 at 00:32, Jonathan Rochkind rochk...@jhu.edu wrote:
 I don't understand from your description how Topic Maps solve the
 identifying multiple versions of a standard problem.

It's the mechanism of having multiple identifiers for Topics, so, in pseudo ;

Topic MARC21
  psi info:ofi/fmt:xml:xsd:MARC21
  psi http://loc.org/stuff/marc21;
  property #mime-type whatever for the binary

Topic MARC 1.1
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.1
  psi http://loc.org/stuff/marcxml-v1.1;
  property #mime-type whatever 1.1

Topic MARC 1.2
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.2
  psi http://bingo.com/psi/marcxml;
  property #mime-type whatever 1.2

Or, if MARC 1.2 is backwards compatible with 1.1 ;

Topic MARC 1.2
  is_a MARC 1.1
  psi info:srw/schema/1/marcxml-v1.2

Or, if I make my own unofficial version ;

Topic MARC 2.0
  is_a MARC 1.2
  psi http://alex.com/psi/marc-2.0;

This is enough to cobble together what is and isn't compatible in
types of formats, so if your application is Topic Maps aware, this
should be trivial (including what format to ignore or react to). The
point is that you don't need *one* identifier for things; Topics are
proxies for knowledge, and part of the notion of knowledge is what
identifies that knowledge. Multiple PSIs help us leverage both rigid
and fuzzy systems.
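
To make the pseudo above a bit more concrete, here is a hedged Python
sketch of walking the is_a chain from one PSI to another. The dict
layout and function names are hypothetical, the PSIs are the made-up
ones from the pseudo, I use the backwards-compatible variant (1.2 is_a
1.1), and I assume the first topic is the "MARC" that the is_a lines
refer to.

```python
# Hypothetical in-memory topic map, transcribed from the pseudo above.
TOPICS = {
    "MARC": {
        "is_a": None,
        "psis": {"info:ofi/fmt:xml:xsd:MARC21", "http://loc.org/stuff/marc21"},
    },
    "MARC 1.1": {
        "is_a": "MARC",
        "psis": {"info:srw/schema/1/marcxml-v1.1",
                 "http://loc.org/stuff/marcxml-v1.1"},
    },
    "MARC 1.2": {
        "is_a": "MARC 1.1",
        "psis": {"info:srw/schema/1/marcxml-v1.2",
                 "http://bingo.com/psi/marcxml"},
    },
    "MARC 2.0": {
        "is_a": "MARC 1.2",
        "psis": {"http://alex.com/psi/marc-2.0"},
    },
}


def topic_for(psi):
    """Return the name of the topic carrying this PSI, or None if unknown."""
    for name, topic in TOPICS.items():
        if psi in topic["psis"]:
            return name
    return None


def is_kind_of(psi, ancestor_psi):
    """True if the topic for psi is, via is_a links, the topic for ancestor_psi."""
    target = topic_for(ancestor_psi)
    name = topic_for(psi)
    if target is None:
        return False
    while name is not None:
        if name == target:
            return True
        name = TOPICS[name]["is_a"]
    return False
```

Either PSI of a topic works as the way in, which is the point: an
application that only knows the info: URIs and one that only knows the
http: ones still end up at the same Topic.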

As to the identifiers themselves (as in, the formatting), is that important?

Anyway, I'm suspecting I don't see what the problem seems to be. To
create the best identifier for things seems a bit of a strange
notion to me, but is this based on that there is only (or rather, that
you're trying to create) one identifier for any one thing?


Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-11 Thread Alexander Johannesen
On Mon, May 11, 2009 at 16:04, Rob Sanderson azar...@liverpool.ac.uk wrote:
 * One namespace is used to define two _totally_ separate sets of
 elements.  There's no reason why this can't be done.

As opposed to all the reasons for not doing it. :) This is crap design
of a higher magnitude, and the designers should be either a) whipped
in public and thrown out in shame, or b) repent and made to fix the
problem. Even I would opt for the latter, but such a simple task not
being done seems to suggest that perhaps the former needs to be put in
place.

 * One namespace defines so many elements that it's meaningless to call
 it a format at all.  Even though the top level tag might be the same,
 the contents are so varied that you're unable to realistically process
 it.

Yeah, don't use MODS in general; it's a hack. It's even crazier still
that many versions have the same namespace. What were they thinking?!

Anyway, even if the namespace is botched, you can still (if I'll dare
go by the Topic Maps moniker) have multiple namespaces for the same
subject (the format in question), and simply publish and use your own
and let the TM mechanics handle the ambiguity for you. If enough
people do this, and perhaps even use your unofficial identifiers,
maybe LOC will see the errors of their ways and repent.


Regards,

Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-11 Thread Alexander Johannesen
On Mon, May 11, 2009 at 19:34, Jonathan Rochkind rochk...@jhu.edu wrote:
 In the real world, we use things when they solve the problem in front of us
 in as easy a way as possible

And somehow you're suggesting that I don't live in the real world? :)
Good try, but as far as I've experienced, people in the library world
live quite a distance away from the real one.


Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-14 Thread Alexander Johannesen
On Thu, May 14, 2009 at 17:35, Rob Sanderson azar...@liverpool.ac.uk wrote:
 For example, the owl:sameAs predicate is used to express that the
 subject and object are the same 'thing'.  Then the application can infer
 that if a owl:sameAs b, and a x y, then b x y.

Yes, but there's a snag; as RDF works only on the URI resource level
(no added semantics to the typification of the URI resource), if
someone does an owl:sameAs between an identifier of a thing and a
locator of a thing (a locator being the resource itself as opposed to
being an identifier; for example, are you talking about Sun Corp
(http://sun.com/) or are you talking about their website
(http://sun.com/)?) you can get a nasty case of integrity rot, and I've
not seen any proposals to address this issue (the RDF world is
essentially assuming modeling from the viewpoint of everything being
true).
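
A toy illustration of that rot, in Python with plain tuples rather
than a real RDF library. The ex: names and predicates are invented for
the example, and the inference only covers the rule Rob quoted (if a
owl:sameAs b, and a x y, then b x y), applied in both directions.

```python
SAME_AS = "owl:sameAs"


def infer(triples):
    """Apply 'a sameAs b, a x y => b x y' (symmetrically) until nothing new appears."""
    facts = set(triples)
    while True:
        pairs = {(s, o) for s, p, o in facts if p == SAME_AS}
        pairs |= {(o, s) for s, o in pairs}  # sameAs is symmetric
        derived = {
            (b, p, o)
            for a, b in pairs
            for s, p, o in facts
            if s == a and p != SAME_AS
        }
        if derived <= facts:
            return facts
        facts |= derived


# Hypothetical data: ex:SunCorp identifies the company, the URL locates the site.
facts = infer({
    ("ex:SunCorp", SAME_AS, "http://sun.com/"),
    ("ex:SunCorp", "ex:employs", "ex:Engineers"),
    ("http://sun.com/", "ex:servedBy", "ex:WebServer"),
})
```

After inference the web page employs engineers and the company is
served by a web server; neither statement was ever asserted, and
nothing in the model flags either as suspect.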

I guess Mike doesn't like RDF *or* Topic Maps now. :)


Regards,

Alex


Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-14 Thread Alexander Johannesen
On Thu, May 14, 2009 at 17:45, Rob Sanderson azar...@liverpool.ac.uk wrote:
 I'll quote Mike (and most common approaches to the problem):
        Don't Do That Then.
 :)

Oh, for sure. :) But these are very subtle things that are hard to
understand, and certainly the long-term implications, so people *will*
do this, and they *will* put rot into the SemWeb chains people create.
It's unavoidable, but I know lots are trying to work out some kind of
solution. Unfortunately, this one is being routed to software
frameworks rather than the RDF core itself. Oh well.


Regards,

Alex


Re: [CODE4LIB] A Book Grab by Google

2009-05-20 Thread Alexander Johannesen
On Thu, May 21, 2009 at 10:07, Karen Coyle li...@kcoyle.net wrote:
 - without competition, Google (with the agreement of the registry, whose
 purpose is to garner as much income as possible for rights holders) will
 charge a price that is more than some institutions will be able to afford;
 others will subscribe, but to the detriment of other resource subscriptions.

How is this different from what's already in place in terms of
electronic resources? This is not uniquely Google, nor has it even
been proven to happen.


Alex


Re: [CODE4LIB] HTML mark-up in MARC records

2009-06-22 Thread Alexander Johannesen
Hiya,

I guess I'm the one who's got to step up to the self-slaughtering
altar, but the fact that a lot of our systems break or don't know how
to handle HTML is despicable. I'm sure you guys are familiar with RSS
/ Atom, and because in there we *expect* HTML and therefore make sure
our back-ends can grok it, it enhances the meta data *greatly*.

Don't think for a second that purity of the data format in any shape
or form is the definition of its usefulness. Mixed content models
might be complex to work with, but their value is immense. I can fully
understand *why* people say don't do it, because, yes, it ups the
complexity, and perhaps dinosaur technologies like MARC and our ILSs
breaking under the pressure of more modern technologies reinforce
that, but I don't think we should shun it because of it.

If your back-end can't grok HTML, I'd suggest you fix it immediately!
If your ILS chokes on XML and / or HTML snippets, I suggest you
replace it. You seriously shouldn't allow this rigidity into your
infrastructure, and it's depressing to watch how we as complex users
of MARC don't dare to extend it to become a format that does what it
should and needs to do.

Even *if* HTML in MARC records probably is a bad idea.


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Library Linked Data

2009-10-28 Thread Alexander Johannesen
Hiya,

On Thu, Oct 29, 2009 at 15:16, Roy Tennant tenna...@oclc.org wrote:
 Could you elaborate a bit? In my mind, the only semantic web technology of
 any note is linked data.

What do you mean by linked data? I work in fields of semantic web
technology where there's very little linked data (ie. data on the web
you can link to and use), yet I feel all our work is very valuable and
certainly worthy of note ...


Regards,

Alex


Re: [CODE4LIB] Library Linked Data

2009-10-28 Thread Alexander Johannesen
Hiya,

On Thu, Oct 29, 2009 at 16:19, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 I'm guessing that Roy meant linked data in the sense of
 http://www.w3.org/DesignIssues/LinkedData.html and http://linkeddata.org/

I'm pretty sure he did, too. I guess I was trying to smoke out his
reasoning for choosing linked data as the only worthwhile semantic
web technology. Let me clarify, and have a look at this ;

   http://en.wikipedia.org/wiki/Semantic_Web_Stack

Linked data is the bottom four boxes out of a total of 12 (13 if you
count the top one), where the ones missing are things like Trust,
Proof, Logic, Querying, Ontologies and Taxonomies, all things that I
thought it was evident belonged at the core of what library science is
all about. It simply astounds me the lack of understanding from the
library world on these things, so sad to see that these things aren't
linked up; you *are* what these things are about! Sure, linked data is
easier; that's why everyone is doing it, and has been doing it for years.
But you're missing out in fields that should be second-nature to you.


Regards,

Alex


Re: [CODE4LIB] Twitter annotations and library software

2010-04-29 Thread Alexander Johannesen
Hi,

On Thu, Apr 29, 2010 at 22:47, Walker, David dwal...@calstate.edu wrote:
 I would suggest it's more because, once you step outside of the
 primary use case for OpenURL, you end-up bumping into *other* standards.

These issues were raised all the way back when it was created, as well. I
guess it's easy to be clever in hindsight. :) Here's what I wrote
about it 5 years ago (http://shelter.nu/blog-159.html) ;

So let's talk about 'Not invented here' first, because surely, we're
all guilty of this one from time to time. For example, lately I dug
into the ANSI/NISO Z39.88 -2004 standard, better known as OpenURL. I
was looking at it critically, I have to admit, comparing it to what I
already knew about Web Services, SOA, http,
Google/Amazon/Flickr/Del.icio.us API's, and various Topic Maps and
semantic web technologies (I was the technical editor of Explorers
Guide to the Semantic Web)

I think I can sum up my experiences with OpenURL as such; why? Why
has the library world invented a new way of doing things that can
already be done quite well? Now, there is absolutely nothing wrong
with the standard per se (except a pretty darn awful choice of
name!!), so I'm not here criticising the technical merits and the work
put into it. No, it's a simple 'why' that I have yet to get a decent
answer to, even after talking to the OpenURL bigwigs about it. I mean,
come on; convince me! I'm not unreasonable, no truly, really, I just
want to be convinced that we need this over anything else.


Regards,

Alex


Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)

2010-04-29 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 04:17, Jakob Voss jakob.v...@gbv.de wrote:
 But all the flaws of XML can be traced back to SGML which is why we now use
 JSON despite all of its limitations.

Hmm, this is wrong on so many levels. First, SGML was pretty darn good
for its *purpose*, but it was a geek's dream and pretty scary for
anyone who hacked at it not fully getting it (like most normal
developers). As with many things where the learning curve is steep, it
fell into the "not good for normal consumption" category and they
(well, people who cared, and made decisions about the web) were
forced to make XML. But JSON? Are you sure you've got this figured
out? JSON as an object serializing format is good for a number of
things (small footprint, embedded type, etc.), but sucks for most
information management tasks.

However, I'd like to add here that I happen to love XML, even from an
integration perspective, but maybe that stems from understanding all
those tedious bits about it that no one really cares about, like id(s)
and refid(s) (and all the indexing goodness that comes from it), canonical
datasets, character sets and Unicode, all that schema craziness
(including Schematron and RelaxNG), XPath and XQuery (and all the
sub-standards), XSLT and so on. I love it all, and not because of the
generic simplicity itself (simple in the default mode of operation, I
might add), but because of a) modeling advantages, b)
cross-environment language and schema support, and c) ease of
creation. (I don't like how easy well-formedness breaks, though. That
sucks)

But I mention all this for a specific reason ; MARCXML is the work of
the devil! There's a certain dedication needed for doing it right,
by paying attention in XML class, and play well with your playmates.
This is how you build a community and understanding around standards;
the standards themselves are not enough. The library world did nothing
of the kind ;
http://shelter.nu/blog/2008/09/marcxml-beast-of-burden.html

The flaws of XML can most likely be traced back to people not playing
well with playmates, and not the format itself.

 May brother Ted Nelson enlighten all of
 us - he not only hates XML [1] and similar formats but also  proposed an
 alternative way to structure information even before the invention of
 hierarchical file systems and operating systems [2].

Bah. For someone who doesn't see the SGML -> XML -> HTML transgression
as an inherited and more rigid structure (or, by popular language,
more schematic) as a document model as a good thing, I'm not
impressed. Any implied structure can be criticized, including pretty
much any corner of Xanadu as well. (I mean, seriously; taking
hypermedia one step closer to a file system does *not* solve problems
with the paper-based document model of HTTP, it just shifts the focus)

 In his vision of Xanadu
 every piece of published information had a unique ID that was reused
 everytimes the publication was referenced - which would solve our problem.

*Having* an identifier doesn't mean that identifier is a *good* one,
nor that it solves your problem. There are plenty of systems out there
where everything has an identifier (and, if you knew XML more deeply,
you'd find identification models in there as well, but people don't
use them because the early onset of XML didn't understand nor need
them). Have a look at the failed XLink brouhaha for something that
worked and filled the niche, but people didn't get it, nor did
tool-makers see the point of implementing it, and the thing died a
premature death.
The current model of document structure and XQuery is somewhat of an
alternative, but people are also switching to CSS3 styles as well. The
thing is, just because you've got persistence in a system of
identifiers, it does not follow that the information is persisted; the
problem of change is *not* solved in either system, and so we work
with the one we've got and make the best of it.

One thing I always found intriguing about librarians was their
commitment to persistent URIs for information resources, and use of
303 if need be (although I see this mindset dwindling). I think you're
the only ones in the entire world who give a monkey's bottom about
these issues, as the rest of the world simply uses Google as a
resolver. I can see where this is going. :)


Regards,

Alex


Re: [CODE4LIB] it's cool to hate on OpenURL (was: Twitter annotations...)

2010-04-29 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 10:54, Eric Hellman e...@hellman.net wrote:
 May I just add here that of all the things we've talked about in these 
 threads, perhaps the only thing that will still be in use a hundred years 
 from now will be Unicode. إن شاء الله

May I remind you that we're still using MARC. Maybe you didn't mean in
the library world ... *rimshot*


Alex


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 18:47, Owen Stephens o...@ostephens.com wrote:
 Could you expand on how you think the problem that OpenURL tackles would
 have been better approached with existing mechanisms?

As we all know, it's pretty much a spec for a way to template incoming
and outgoing URLs, defining some functionality along the way. As such,
URLs with basic URI templates and rewriting have been around for a
long time. Around even longer are the basics of HTTP, which has
status codes and functionality to do exactly the same. We've been
doing link resolving since the mid-90s, either as CGI scripts, or as
Apache modules, so none of this was new. URI comes in, you look it up
in a database, you cross-check with other REQUEST parameters (or
sessions, if you must, as well as IP addresses) and pop out a 303
(with some possible rewriting of the outgoing URL) (with the hack we
needed at the time to also create dummy pages with META tags
*shudder*).
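
For the sake of illustration, here is roughly what such a homegrown
resolver boils down to, sketched in Python instead of the CGI of the
day. The lookup table, the "id" parameter, the target URL and the
choice of 404/400 for failures are all made up for the example; a dict
stands in for the database, and sessions and IP checks are left out.

```python
from urllib.parse import parse_qs, quote, urlsplit

# Hypothetical lookup table standing in for the database.
TARGETS = {
    "issn:1234-5678": "http://publisher.example/journal/{volume}/{page}",
}


def resolve(url, targets=TARGETS):
    """Map an incoming request URL to a (status, headers) pair."""
    query = parse_qs(urlsplit(url).query)
    key = query.get("id", [None])[0]
    template = targets.get(key)
    if template is None:
        return 404, {}  # nothing known about this identifier
    # Other request parameters get rewritten into the outgoing URL.
    params = {name: quote(values[0]) for name, values in query.items()}
    try:
        location = template.format(**params)
    except KeyError:
        return 400, {}  # the target needs a parameter we were not given
    return 303, {"Location": location}
```

A request for /resolve?id=issn:1234-5678&volume=12&page=34 comes back
as a 303 pointing at the rewritten publisher URL; everything else is
table maintenance.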

So the idea was to standardize on a way to do this, and it was a good
idea as such. OpenURL *could* have had a great potential if it
actually defined something tangible, something concrete like a model
of interaction or basic rules for fishing and catching tokens and the
like, and as someone else mentioned, the 0.1 version was quite a good
start. But by the time 1.0 came out, all the goodness had turned
so generic and flexible in such a complex way that handling it turned
you right off it. The standard also had a very difficult language, and
more specifically didn't use enough of the normal geeky language used
by sysadmins around. The more I tried to wrap my head around it, the
more I felt like just going back to CGI scripts that looked stuff up
in a database. It was easier to hack legacy code, which, well, defeats
the purpose, no?

Also, forgive me if I've forgotten important details; I've suppressed
this part of my life. :)


Kind regards,

Alex


Re: [CODE4LIB] Twitter annotations and library software

2010-04-30 Thread Alexander Johannesen
On Fri, Apr 30, 2010 at 20:29, Owen Stephens o...@ostephens.com wrote:
 However I'd argue that actually OpenURL 'succeeded' because it did manage to
 get some level of acceptance (ignoring the question of whether it is v0.1 or
 v1.0) - the cost of developing 'link resolvers' would have been much higher
 if we'd been doing something different for each publisher/platform. In this
 sense (I'd argue) sometimes crappy standards are better than none.

Well, perhaps. I see OpenURL as the natural progression from PURL, in
which both have their degree of success, however I'm careful using
that word as I live on the outside of the library world. It may well
be a success on the inside. :)

 I think the point about Link Resolvers doing stuff that Apache and CGI
 scripts were already doing is a good one - and I've argued before that what
 we actually should do is separate some of this out (a bit like Johnathan did
 with Umlaut) into an application that can answer questions about location
 (what is generally called the KnowledgeBase in link resolvers) and the
 applications that deal with analysing the context and the redirection

Yes, splitting it into smaller chunks is always smart, especially with
complex issues. For example, in the Topic Maps world, the whole standard
(reference model, data model, query language, constraint language, XML
exchange language, various notational languages) is wrapped up with a
guide in the middle. Make them into smaller parcels, and make your
flexible point there. If you pop it all into one, no one will read it
and fully understand it. (And don't get me started on the WS-* set of
standards on the same issues ...)

 (To introduce another tangent in a tangential thread, interestingly (I
 think!) I'm having a not dissimilar debate about Linked Data at the moment -
 there are many who argue that it is too complex and that as long as you have
 a nice RESTful interface you don't need to get bogged down in ontologies and
 RDF etc. I'm still struggling with this one - my instinct is that it will
 pay to standardise but so far I've not managed to convince even myself this
 is more than wishful thinking at the moment)

Ah, now this is certainly up my alley. As you might have seen, I'm a
Topic Maps guy, and we have in our model a distinction between three
different kinds of identities; internal, external indicators and
published subject identifiers. The RDF world only had rdf:about, so
when you used www.somewhere.org, are you talking about that thing,
or does that thing represent something you're talking about? Tricky
stuff which has these days become a *huge* problem with Linked Data.
And yes, they're trying to solve that by issuing an HTTP 303 status
code as a means of declaring the identifiers imperative, which is a
*lot* of resolving to do on any substantial set of data, and in my
eyes a huge ugly hack. (And what if your Internet falls down? Tough.)

Anyway, here's more on these identity problems ;
   http://www.ontopia.net/topicmaps/materials/identitycrisis.html

As to the RESTful notions, they only take you as far as content-types
can take you. Sure, you can glean semantics from it, but I reckon
there's an impedance mismatch between just the things librarians have
got down pat ; meta data vs. data. CRUD or, in this example, GPPD
(get/post/put/delete), which aren't in a dichotomy btw, can only
determine behavior that enables certain semantic paradigms, but cannot
speak about more complex relationships or even modest models. (Very
often models aren't actionable :)

The funny thing is that after all these years of working with Topic
Maps I find that these hard issues have been solved years ago, and the
rest of the world is slowly catching up to it. I blame the lame
DAML+OIL background of RDF and OWL, to be honest; a model too simple
to be elegantly advanced and too complex to be easily useful.


Kind regards,

Alex


Re: [CODE4LIB] OCLC Service Outage Update

2010-05-10 Thread Alexander Johannesen
On Tue, May 11, 2010 at 06:59, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 No, the real problem is with trolls sending flamebait.

Friggin' AMEEN!

Alex


Re: [CODE4LIB] OCLC Service Outage Update

2010-05-10 Thread Alexander Johannesen
Michael J. Giarlo leftw...@alumni.rutgers.edu wrote:
 ... people took Simon's comment seriously?

Language is a funny thing ; sometimes the things that are being said
are taken seriously. And the script-haters are spread far and wide, so
there was no reason not to take him seriously. Should the default be
not to take anyone seriously? Srsly?


Alex


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
Hiya,

On Tue, Oct 26, 2010 at 6:26 AM, Nate Vack njv...@wisc.edu wrote:
 Switching to an XML format doesn't help with that at all.

I'm willing to take it further and say that MARCXML was the worst
thing the library world ever did. Some might argue it was a good first
step, and that it was better with something rather than nothing, to
which I respond ;

Poppycock!

MARCXML is nothing short of evil. Not only does it go against every
principle of good XML anywhere (don't rely on whitespace, structure
over code, namespace conventions, identity management, document
control, separation of entities and properties, and on and on), it
breaks the ontological commitment that a better treatment of the MARC
data could bring, deterring people from actually a) using the darn
thing as anything but a bare minimal crutch, and b) expanding it to be
actually useful and interesting.

The quicker the library world can get rid of this monstrosity, the
better, although I doubt that will ever happen; it will hang around
like a foul stench for as long as there is MARC in the world. A long
time. A long sad time.

A few extra notes;
   http://shelterit.blogspot.com/2008/09/marcxml-beast-of-burden.html

Can you tell I'm not a fan? :)


Kind regards,

Alex


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
Ray Denenberg, Library of Congress r...@loc.gov wrote:
 It really is possible to make your point without being quite so obnoxious.

Obnoxious?


Alex


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
On Tue, Oct 26, 2010 at 11:56 AM, Walker, David dwal...@calstate.edu wrote:
 Your criticisms of MARC-XML all seem to presume that MARC-XML is the
 goal, the end point in the process.  But MARC-XML is really better seen as a
 utility, a middle step between binary MARC and the real goal, which is some
 other useful and interesting XML schema.

How do you create an ontological commitment in a community to an
expanding and useful set of tools and vocabularies? I think I need to
remind people of what MARCXML is supposed to be ;

a framework for working with MARC data in a XML environment. This
framework is intended to be flexible and extensible to allow users to
work with MARC data in ways specific to their needs. The framework
itself includes many components such as schemas, stylesheets, and
software tools.

I'm not assuming MARCXML is a goal, no matter how we define that. I'm
poo-pooing MARCXML for the semantics we, as a community, have been
given by a process I suspect had goals very different from reality.
Very few people would work with MARC through MARCXML; they would use
it to convert it, filter it, hack around it to something else
entirely. And I'm afraid lots of people are missing the point about
stunting the developments in a community by embracing tools that
push a package that inhibits innovation. So, here's the point,
paraphrased;

   "Here's our new thing. And we did it by simply converting all our
MARC into MARCXML that runs on a cron job every midnight, and a bit of
horrendous XSLT that's impossible to maintain."

   "But it looks just like the old thing using MARC and some templates?"

   "Ah yes, but now we're doing it in XML!"

   (Yeah, yeah, your mileage will vary)

I'm sorry if I'm overly pessimistic about the XML goodness in the
world, not for the XML itself, but the consequences of the named
entities involved. I've been a die-hard XML wonk for far too many
years, and the tools in that tool-chest don't automatically solve
hard problems better by wrapping stuff up in angle brackets, and -
dare I say it? - perhaps introduce a whole fleet of other problems
rarely talked about when XML is the latest buzz-word, like using a
document model on what's a traditional records model, character
encodings, whitespace issues, Unicode, size and efficiencies (the
other part of this thread), and so on.

But let me also be a bit more specific about that hard semantic
problem I'm talking about;

Lots of people around the library world infrastructure will think
that since your data is now in XML it has taken some important step
towards being interoperable with the rest of the world, that library
data now is part of the real world in *any* meaningful way, but this
is simply, demonstrably and deceivingly, not true. Having our data in
XML has killed a few good projects where people have gone "A new project
to convert our MARC into useful XML? Aha! LoC has already solved that
problem for us."
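
To make that concrete, here is a minimal sketch of what reading MARCXML
looks like with nothing but the Python standard library. The sample
record and the helper name subfields() are made up for illustration;
only the namespace URI is the real MARCXML one. The thing to notice is
that the code still has to know that "245" means title and that "a" and
"c" are its parts, exactly as it would with binary MARC.

```python
import xml.etree.ElementTree as ET

# A made-up, minimal MARCXML record, for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
  <controlfield tag="001">demo-0001</controlfield>
  <datafield tag="245" ind1="1" ind2="0">
    <subfield code="a">An example title /</subfield>
    <subfield code="c">by an example author.</subfield>
  </datafield>
</record>"""

NS = {"marc": "http://www.loc.gov/MARC21/slim"}


def subfields(record, tag):
    """Return (code, text) pairs for every subfield of the given MARC tag."""
    pairs = []
    for field in record.findall(f"marc:datafield[@tag='{tag}']", NS):
        for sub in field.findall("marc:subfield", NS):
            pairs.append((sub.get("code"), sub.text))
    return pairs


record = ET.fromstring(SAMPLE)
title_parts = subfields(record, "245")
print(title_parts)
```

The angle brackets change the parser, not the semantics: the numeric
tags and one-letter codes come through untouched.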

Btw, to those who find me so obnoxious, at no point do I say it was
intentionally evil, just evil all the same. The road to hell is, as
always, paved with good intentions.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-25 Thread Alexander Johannesen
On Tue, Oct 26, 2010 at 12:48 PM, Bill Dueber b...@dueber.com wrote:
 Here, I think you're guilty of radically underestimating lots of people
 around the library world. No one thinks MARC is a good solution to
 our modern problems, and no one who actually knows what MARC
 is has trouble understanding MARC-XML as an XML serialization of
 the same old data -- certainly not anyone capable of meaningful
 contribution to work on an alternative.

Slow down, Tex. "Lots of people in the library world" is not the same
as developers, or even good developers, or even good XML developers,
or even good XML developers who know what the document model imposes
on a data-centric approach.

 The problem we're dealing with is *hard*. Mind-numbingly hard.

This is no justification for not doing things better. (And I'd love to
know what the hard bits are; always interesting to hear from various
people as to what they think are the *real* library problems, as
opposed to any other problems they have.)

 The library world has several generations of infrastructure built
 around MARC (by which I mean AACR2), and devising data
 structures and standards that are a big enough improvement over
  MARC to warrant replacing all that infrastructure is an engineering
  and political nightmare.

Political? For sure. Engineering? Not so much. This is just that whole
blinded by MARC issue that keeps cropping up from time to time, and
rightly so; it is truly a beast - at least the way we have come to
know it through AACR2 and all its friends and its death-defying focus
on all things bibliographic - that has paralyzed library innovation,
probably to the point of making libraries almost irrelevant to the
world.

 I'm happy to take potshots at the RDA stuff from the sidelines, but I never
 forget that I'm on the sidelines, and that the people active in the game are
 among the best and brightest we have to offer, working on a problem that
  invariably seems more intractable the deeper in you go.

Well, that's a pretty scary sentence, for all sorts of reasons, but I
think I shall not go there.

 If you think MARC-XML is some sort of an actual problem

What, because you don't agree with me the problem doesn't exist? :)

 and that people
 just need to be shouted at to realize that and do something about it, then,
 well, I think you're just plain wrong.

Fair enough, although you seem to be under the assumption that all of
the stuff I'm saying is a figment of my imagination (I've been
involved in several projects lambasted because managers think MARCXML
is solving some imaginary problem; this is not bullshit, but pain and
suffering from the battlefields of library development), that I'm not
one of those developers (or one of you, although judging from this
discussion it's clear that I am not), that the things I say somehow
don't apply because you don't agree with, umm, what I'm assuming is
my somewhat direct approach to stating my heretical opinions.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-27 Thread Alexander Johannesen
Hi,

On Tue, Oct 26, 2010 at 1:23 PM, Bill Dueber b...@dueber.com wrote:
 Sorry. That was rude, and uncalled for. I disagree that the problem is
 easily solved, even without the politics. There've been lots of attempts to
 try to come up with a sufficiently expressive toolset for dealing with
 biblio data, and we're still working on it. If you do think you've got some
 insight, I'm sure we're all ears, but try to frame it terms of the existing
 work if you can (RDA, some of the dublin core stuff, etc.) so we have a
 frame of reference.

Well, I've whined enough both here and on NGC4LIB, and I'm kinda over
it, just like I'm sure most people are over my whining. But suffice
it to say that FRBR is a 15 year old model that has still not been
proven in the Real World[TM] in any meaningful way (the prototypes
work fine until you dig a bit) and probably never will be as long as
MARC21 runs the show, and trying to stick RDA on top with rules that
have got use-cases that are old enough to be my kids, well, I'm not
very positive about that either.

The direction of going ontological is a good one, and in the absence of
anything else, RDF-infused FRBR / RDA is probably the way to go
(except I'd ditch RDA and, uh, perhaps even FRBR, or at least
seriously modify it), but the community is decidedly not talking about
ontological interoperability nor extensions nor the semantics involved
to solve actual problems in the bibliographic world (including the
fact that it is inherently bibliographic). There needs to be much more
involvement by library geeks and managers in defining semantic reuse
and extensibility, to properly define those things that are almost
absent from AACR2 and friends: the relationships between the entities
themselves. In other words, you need to get away from the
record-centered view, and embrace the subject-centric view.
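
A rough sketch of the difference, in Python, with made-up data: the
"ex:" identifiers, the predicate names and the helper about() are all
hypothetical, and this is not any particular standard's model. In the
record-centric list the author is just a string repeated in each
record; in the subject-centric list the person is a thing in its own
right and the relationships are statements you can follow.

```python
from collections import defaultdict

# Record-centric: each record is an island of strings (made-up data).
records = [
    {"id": "rec1", "title": "Dune", "author": "Herbert, Frank"},
    {"id": "rec2", "title": "Dune Messiah", "author": "Herbert, Frank"},
]

# Subject-centric: every thing (work, person) has its own identifier, and
# relationships are first-class statements between those identifiers.
statements = [
    ("ex:work/dune", "title", "Dune"),
    ("ex:work/dune-messiah", "title", "Dune Messiah"),
    ("ex:work/dune", "createdBy", "ex:person/frank-herbert"),
    ("ex:work/dune-messiah", "createdBy", "ex:person/frank-herbert"),
    ("ex:work/dune-messiah", "sequelTo", "ex:work/dune"),
    ("ex:person/frank-herbert", "name", "Frank Herbert"),
]


def about(subject, statements):
    """Collect everything said about a subject, in either direction."""
    view = defaultdict(list)
    for s, p, o in statements:
        if s == subject:
            view[p].append(o)
        elif o == subject:
            view["inverse:" + p].append(s)
    return dict(view)


herbert = about("ex:person/frank-herbert", statements)
print(herbert)
```

Asking "what do we know about this person?" is one lookup in the second
form; in the first it is string matching across records and hoping the
headings agree.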

Anyway, enough from this old grumpy bum. Sorry to stir up the dust.


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] MARCXML - What is it for?

2010-10-27 Thread Alexander Johannesen
 Political? For sure. Engineering? Not so much.

 Ok. Solve it. Let us know when you're done.

Wow, lamest reply so far. Surely you could muster a tad bit better? I
was excited about getting a list of the hardest problems, for example;
I'd love to see that. Then perhaps you could explain what this
insurmountably hard, mind-boggling problem actually is, because, you
know, you never actually said.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 2:44 AM, Doran, Michael D do...@uta.edu wrote:
 Can that limit threshold be raised?  If so, are there reasons why it should 
 not be raised?

Is it to throttle spam or something? 50 seems rather low, and it's
rather depressing to have a lively discussion throttled like that. Not
to mention I thought I was simply kicked out for livening things up
(especially given my reasonable follow-up was where the throttling
began).

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Django

2010-10-27 Thread Alexander Johannesen
On Wed, Oct 27, 2010 at 3:09 AM, Elliot Hallmark permafact...@gmail.com wrote:
 However, I switched to this other scripting
 language, python, because it could do things php can't.

Not to start a flame war, but that's a rather big statement which I
think A) needs backing up, and B) is probably untrue.

  For instance,
 my first project in python involved capturing keyboard input before
 windows heard about it.  Then I kept discovering amazing things python
 can do that php can't.

For instance, PHP can do this fine. Was there something in particular
you're thinking of that PHP can't do?

 I helped write a non-sequential optical ray tracer in python.  When it
 needed to be faster there were several libraries for writing C code
 directly in a pythonic syntax.  Python has hooks into everything, like
 optical character recognition, electronic music
 sequencing/generation, serial port i/o.

Again, PHP the same. For the sophisticated hacker, most languages can
be tweaked to solve almost any problem.

And I'm not even suggesting that you use PHP. Happy hacking.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 6:53 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 Pretty sure it wasn't depressing to the vast majority of the listserv
 audience.  That was/is a discussion that benefited from a timeout period,
 like you give the pre-schoolers.

Given we're adults, and not in pre-school, I disagree.

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] mailing list administratativia

2010-10-27 Thread Alexander Johannesen
On Thu, Oct 28, 2010 at 6:58 AM, Chris Fitzpatrick cf...@stanford.edu wrote:
 +1 to the  this discussion is really depressing me  camp.

Ok, ok, I get the message. This is no place to voice strong opinions
about bad library tech, and neither my language (different, but not
bad) nor my stance (contrarian, but not accusatory) is acceptable. I'm
outta here.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


[CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-27 Thread Alexander Johannesen
Hola, compadre,

Elliot Hallmark permafact...@gmail.com wrote:
 Other things beyond that seemed
 awkward, difficult, or impossible from what I knew. python immediately
 jumped out to me as a tool more suited to these tasks.

The fact that Python has a looping run-time environment is, of course,
a give-away to why most people think this, and perhaps to some degree,
rightly so, but PHP has got the same, it's just that *most* people use
PHP through some Apache module as a request/response module. Indeed,
that's where it started, and that's its forte.

 From my experience, it seemed php was a server side
 scripting language.

Strictly speaking, so is Python.

 Can you write a php script that gets key presses
 and doesn't pass them along to windows to process?  I thought the OS
 would have to process the key press, pass it along to the php server
 and then php could process it. (pyhook)

A couple of obvious candidates;
 - http://gtk.php.net/
 - http://winbinder.org/

 Also, how would you go about using a GPU from a graphics card in php?
 (python cuda in google gives many results)

PHP is just a C program with various bindings, so I suspect in the
same way Python would do it. Whether anyone has done it, though, is a
different question.

 Has anyone written a scientific computing package along the lines of
 matlab in php (scipy, numpy, matplotlib)?  Or a non-sequential optical
 raytracer?

Not seen any scientific packages, but I've seen a few ray-tracers,
although they're all demo apps and fun toys (although I think that
applies to Python, too). It's not so much about whether you can do it
or not (you can), but whether it makes sense to do so (it mostly
doesn't). Having said that, there's nothing stopping me making a local
run-time PHP program to do either, it's just that it's PHP and hence
slower than C. Python, too, is slower than C, except when it runs some
C module, which, uh, is C, the same as if PHP runs some C module. For
example, among the fastest and best XML libraries and XSLT 1.0
processors out there are libxml2 and libxslt (from the GNOME project),
written in C, and they are the de facto PHP XML and XSLT modules used.
Whatever you've got that runs in C, you can run in PHP, it's not really
a big deal, it just depends on whether it makes sense to patch it up
with the way you use your PHP.

 if you wanted to write a web interface for GNU cash or another well
 established accounting program, could you do it?

Sure. Here's someone who'dunnit back in 2008;
   http://web.archiveorange.com/archive/v/LJV4vT1u2IqE3LstFA1V

 please feel free to point me to the php equivalents of pyhook, pycuda,
 scipy, numpy and some examples of widely used programs with php
 bindings.

You can bind PHP and Python the same, it's just a matter of doing it
and whether it makes sense to do so. It's *not* a question of /if/ you
can do it, but if you /should/ do it. Your mileage *will* vary.

  For the sophisticated hacker, most languages can
 be tweaked to solve almost any problem.

 I am sure that is true. Though, I feel that for many tasks php would
 require quite a bit more tweaking than python, with much less
 community support behind it (I mean, google comes up with fewer
 helpful links to the problems I cited above).

Maybe your Google-fu is weak. :)

 My impression, based on very little experience with php, is that if
 you asked in a forum about using php for advanced scientific
 computing, or writing music generation/sequencing software,
 knowledgeable folks would first ask: are you sure you want to do this
 in php?  how about java or python?

Again, probably because they don't realize it can be done in a
non-request/response kinda way with PHP as well. But then, PHP itself
isn't all that fast if you have little knowledge of how to do proper
PHP, but this is a pitfall in any language.

 That said, php may be superior for generating websites from databases.

Not really, but the installations you'll find in the wild are readily
configured for it, so it's easy to get going. However, this has little
to do with the language itself, and more to do with the default
packaging of it.

Anyway, I wasn't meaning to promote PHP over Python, just pointing out
that PHP is a lot more (and more often still, a lot better) than what
most people think it is.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-27 Thread Alexander Johannesen
Olá, como vai?

Luciano Ramalho luci...@ramalho.org wrote:
 Actually, Python is a general purpose programming language. It was not
 created specifically for server side scripting like PHP was. But it is
 very suitable to that task.

I'm not sure talking about what something used to be is as interesting
as talking about what it is. Both Python and PHP can share whatever
moniker we choose (scripting-language, programming language,
real-time, half-time, bytecoded, virtual, etc.).

 Not seen any scientific packages, but I've seen a few ray-tracers,
 although they're all demo apps and fun toys (although I think that
 applies to Python, too).

 No, that does not apply to Python. Python is widely used for hardcore
 scientific computing.

I was referring to the ray-tracing part.

 It is also the most important scripting language in large scale CGI
 settings

Yes, Python is widely used for scripting up interfaces into other more
complex systems. But rarely is the core of the thing written entirely
in Python.

 Maybe your Google-fu is weak. :)

 Or maybe he's just realizing that outside of server side web
 scripting, PHP is just not so widely used.

Absolutely, and fair enough.

 Having used both languages, I discovered that Python is easier for
 most tasks, and one reason is that the libraries that come with Python
 are extremely robust, well tested and consistent.

Hmm. PHP is extremely robust and well-tested, but yes, it's not all
that consistent, especially not before version 5.2+. However, things
have moved on, and with release 6 around the corner things will be
tighter still. Just as the first versions of Python were
interesting, so were PHP's, but the biggest problem with the
evolution of PHP was the very fact that it was by far the most popular
language for rapid web development.

 PHP is very
 practical for server-side web scripting, but its libraries are
 unfortunately full of gotchas, traps and unexpected behaviour.

There are gotchas in every language, even Python.

 A key reason for that is the fact that Python has always had an
 exception-handling mechanism while PHP has grown something like that
 only a few years ago

True enough. But earlier versions of any language are less desirable
than the latest versions, so I'm not sure this is a prevailing
argument for the horribleness of PHP or any language. These things
evolve. PHP 5.3+ and soon 6 are looking very good, indeed, but yes, we
will just have to live with a poor reputation brought on by the big
number of users and the pre 5.2+ era.

 So, in my opinion, PHP is great at what it does best: enabling quick
 server-side Web scripting on almost any hosting service on Earth.

I'm fairly sure you can say that because you haven't done much other
kind of PHP work. :)

 For everything else, it is very worthwhile to learn and use a general
 purpose dynamic language such as Python, Ruby or Perl.

Of course. Developers should learn many languages, and choose
wisely the language best suited to the problem at hand.

 Sorry for the rant. I must confess I am a founder of the Brazilian
 Python Association and was its first president, so you can call me a
 Python advocate.

No bias at all, really. :)


Kind regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] PHP vs. Python [was: Re: Django]

2010-10-29 Thread Alexander Johannesen
On Sat, Oct 30, 2010 at 7:49 AM, Bradley Allen
bradley.p.al...@gmail.com wrote:
 Mark- I would highly recommend looking at Tornado
 (http://www.tornadoweb.org) as an alternative to using Django without
 the ORM.

I'd second that one. I have used it for a couple of projects, and it
seriously cuts down on prerequisite clutter and is super fast.


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Let's go somewhere [was PHP vs. Python...]

2010-11-01 Thread Alexander Johannesen
On Tue, Nov 2, 2010 at 5:03 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 I would be very unlikely to use someone's homegrown library specific
 scripting language. However, if you want to make a library for an existing 
 popular scripting
 language that handles your specific domain well, I'd be quite likely to use
 that if I had a problem with your domain and I was comfortable with the
 existing popular scripting language, i'd use it for sure.

Hmm. The balance between the old and tried, and the new and
experimental will, forever, cause these kinds of discussions. Now, I
agree with the basic sentiment of what you're saying, but ...

 Odds are your
 domain is not really libraries (that's not really a software problem
 domain), but perhaps as Patrick suggests dealing with relationships among
 semantic objects, and then odds are libraries are not the only people
 interested in this problem domain.

I've worked in the three basic tiers of the library development world:
the plain vanilla programming world, the semantic web world, and the
dark dungeons of the Cult of MARC. Is the domain of library IT solved
by the generic technologies used? No.

There's nothing bad about a DSL; in fact, I encourage it. If you want
to get away from MARC, say, then having a DSL that approaches meta
data on the programmatic level directly is a wonderful abstraction.
But yes, we have to separate API from language. An API is, mostly
these days, simply a function/method call on top of an abstraction,
and it processes your request with your input. A language, on the
other hand, will let you deal directly with that problem. Most DSLs
are functional abstractions, pre-compiled.

The line between a library and a language is perhaps more blurred
these days than ever before; however, there are certain things that I
think justify a library DSL:

 * focus on identity management
 * mergeability of entities
 * large distributed sets
 * more defined line between data and meta data
 * controlled vocabularies and structures

There are generic tools for all of these, but no one central thing
that binds them all together in a seamless way. No platform binds them
together in an easy or elegant way, and perhaps such a thing would be
beneficial to the library community: a language that tries to create a
bridge between computer programming and what you learn in library
school.

But even if we all concede that a library DSL perhaps is not a
practical solution, I'd still like to see us work on it, for nothing
more than sussing out our actual needs and wants in the process. Don't
underestimate the process of doing something that will never
eventuate, even knowingly.

 Some people like ruby because of its support for creating what they call
 domain specific languages, which I think is a silly phrase, which really
 just means a library API at the right level of abstraction for the tasks at
 hand, so you can accomplish the tasks at hand concisely and without repeated
 code.

Depends on the language. Perhaps this doesn't make sense in Ruby, but
it certainly does in Scala, Haskell, and perhaps more than any, Rebol.
Even Lisp and its derivatives, which can create custom structures on
the fly, are well suited to creating actual languages that redefine the
language's original syntax and structure. You can redefine the hell
out of C to create any language imaginable, too, even when you
shouldn't.

A well-defined API is not a bad thing, though, but an API is
basically a set of semantic entities in a language to parse structures.
A language, however, redefines the syntax used to work with them. Sure,
you can create a word "record" in an API that mimics, say, a MARC
record, but the interesting part is when you redefine the syntax to
work *with* that semantic concept, like:

  external_repository {
     baseURI: 'http://example.com/',
     type: OAI-PMH
  }

  my_repository {
     baseURI: 'http://example.com/',
     type: RIF-CS
  }

  some_vocabulary {
     baseURI: 'http://example.com/vocab',
     type: thesauri
  }

  foreach record in external_repository [without tag 850] {
     inject into my_repository {
        with: exploded words ( tag 245 )
        when: match words in some_vocabulary ( NT  2 )
        merge into: tag 850
     }
  }
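
For comparison, here is a rough Python sketch of what the foreach block
above has to spell out in a general-purpose language. Everything in it
is an assumption for illustration: a record is modelled as a plain dict
of tag to list of values, the function names explode_words() and
harvest() are invented, the vocabulary is a flat set of words, and the
narrower-term (NT) condition and any stemming are left out entirely.

```python
def explode_words(text):
    """Split a field value into lowercase words, stripping punctuation."""
    words = (w.strip(".,:;/()[]").lower() for w in text.split())
    return [w for w in words if w]


def harvest(external_records, vocabulary):
    """Mimic the 'foreach ... inject' block of the DSL example.

    For each external record without an 850 tag, match the words of its
    245 field(s) against the vocabulary and store the matches in 850.
    Records with no matching words are skipped.
    """
    injected = []
    for record in external_records:
        if "850" in record:
            continue
        words = [w for value in record.get("245", []) for w in explode_words(value)]
        matches = sorted({w for w in words if w in vocabulary})
        if matches:
            merged = dict(record)  # shallow copy; the source record is untouched
            merged["850"] = matches
            injected.append(merged)
    return injected


# Made-up sample data standing in for the two repositories and the vocabulary.
external_repository = [
    {"001": ["a1"], "245": ["Topic maps and ontologies /"]},
    {"001": ["a2"], "245": ["Gardening for beginners"]},
    {"001": ["a3"], "245": ["Ontologies in practice"], "850": ["existing"]},
]
some_vocabulary = {"ontologies", "thesauri"}

my_repository = harvest(external_repository, some_vocabulary)
print(my_repository)
```

Even this toy version has to decide by hand what a record is, what a
"word" is and what "merge" means, which is exactly the domain knowledge
the DSL version takes as given.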

Creating classes that deal with record merging based on identity
management and various standards would be trivial to script together
super-fast, because the underlying concepts are, for us, rather
well-known. Hacking this together in Java or otherwise is a test of
patience and sanity, because they are generic tools, even when known
library-type APIs are used. Of course lots of stuff is assumed in the
example, but these are well-understood assumptions (about merging
subject headings (like multiple tags handling, LCSH lookup, etc.),
about identity control, about word lookup (for example, I'm assuming
some form of stemming before matching), and on and on). A language that
half text manipulation and lookup, and 

Re: [CODE4LIB] Seth Godin on The future of the library

2011-06-01 Thread Alexander Johannesen
Hi,

On Thu, Jun 2, 2011 at 9:11 AM, Jonathan Rochkind rochk...@jhu.edu wrote:
 There are some unanswered questions about what the purpose of the catalog is
 or should be in our users research workflow, and it's not obvious to me 
 whether
 that purpose will involve putting any possible book or article that exists 
 for free
 on the internet in the catalog.

I personally think that libraries in general still have some
fundamental issues of just getting their head around the two-headed
problem of free web resources. Not only are these free, but they don't
physically exist. This has certain implications for libraries:

Free: as has been pointed out, sometimes this means not being peer
reviewed, or not having the quality seal of a publisher, and as such
there is no process for libraries to really understand how that
knowledge fits into the rest of their collection. (I don't think it's
a price issue; it's more a fundamental model issue.) It's sometimes
hard to wrap your head around the concept of anything free being of
much *worth* when in the past worth, and often quality, was measured by
the name of the publisher, the amount of peer review or the reputation
of the author. The Internet has *changed* this to the core; it's all
gone or going, and new models are coming through the haze of confusion
which I think the library world is both unprepared for and seriously
underfunded to deal with.

Links: The whole concept of web resources, of what a link (or a link
to a mirror or cache) is all about, confuses libraries, which are deeply
rooted in all things being physical. I know this is a doozy, but I
still find this an issue when talking to librarians even today. The
concept of virtual things in the library world really only exists with
the notion of meta data, and I don't think the transition to the
resource itself *also* being virtual has worked out well. Libraries
*like* physical objects, they *like* shelves, they *like* their
buildings, and I don't blame them; we are physical beings who love the
smell of paper. However, books are not actually important, buildings
are not actually important, that smell is definitely not important:
ideas, knowledge and concepts are, and that's what we all try to pry
from the books. (As an aside, if ideas and concepts were valued more,
why couldn't LCSH morph into something far, far more important and
useful? The mind boggles at the lost opportunities!) You cannot pry
anything from a link except the possible resource at the other end,
but it is a few traceroutes away in a virtual place, and in need of
technological interpretation on arrival, and then comes the next level
of trouble;

These are just the conceptual problems. The next real problem of
technology and the library world is - despite the hard and excellent
work put in by people like us on this very list! - that libraries are
still slow-pokes in the realm of using and developing technology. Most
ILSes are charmingly quaint in dealing with these things. OPACs are
mostly dreadful. Backend infrastructure is never powerful or big enough
for the growing digital stuff coming in. Systems are always a bunch of
features away from being what we need, only getting by on a barely
useful set of features (that far too often the vendors dictate) to do
the minimum we have to do. Yes, yes, exceptions here and there, I
would never deny that, but look at library land as a whole; you're
lagging behind and you cannot really compete in a world that needs you
to not only run, but win. And frankly, you *cannot* win, not on
technology. There's just no way. Winning this one requires not
technology as such, but paradigm shifts in thinking, both from inside
and especially from the outside, coupled with proper resourcing by
people who understand the value libraries truly bring to the world.
And this latter thing is becoming a real problem, I think.

 One reason that libraries may not prioritize putting free ebooks in the 
 catalog is because
 there are other places users can search for free ebooks on the internet -- 
 but there
 aren't other places users can search for non-free ebooks that they know will 
 be licensed
 to them as library patrons, or for that matter to search for physical things 
 on the shelves
 that they know are available from their library.

Seems like an odd argument to me. Why are we talking about the price
and the format of the information rather than the *quality* of it? I
thought a curated collection was the bee's knees, regardless of the
formats used. Hmm. Maybe I'm thinking more like a knowledge
customer than a librarian these days, and I've lost my touch or my
way. :)


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] iPads as Kiosks

2011-08-24 Thread Alexander Johannesen
Just my two bob:

We're going through various stages of testing out tablets for both
kiosks *and* portable workstations (for nurses and staff), and have
tried out iPads and various Androids, and our current favorite is
actually the Asus Eee Pad Transformer, a vanilla (but good quality)
Honeycomb Android during day, but with a snap-on keyboard with extra
ports and batteries for some netbook action at night, so it satisfies
both our criteria.

As with all things, it also depends on what software you want to run.
If you go with iPad you need to go through Apple's various
restrictions, while on Android you can use whatever you want. For a
"you are here" tablet, a cheap $150 Android seems like a good option,
too.


Regards,

Alex



On Wed, Aug 24, 2011 at 11:51 PM, Madrigal, Juan A
j.madrig...@miami.edu wrote:
 That's the equivalent to $25/month and includes support for your whole
 development team/institution.

 If your employer can't afford that then I suggest you look for a new job!
 ;)

 Juan Madrigal

 Web Developer
 Web and Emerging Technologies
 University of Miami
 Richter Library



 On 8/23/11 2:21 PM, Dan Funk daniel.h.f...@gmail.com wrote:

Wow, just $300/year and you can run your own software on your own
hardware? What a deal.

On Tue, Aug 23, 2011 at 2:13 PM, David Uspal david.us...@villanova.edu
wrote:
 Thanks for the update.   This definitely solves that issue -- it's
unfortunate this wasn't in place in 2009, or I'd be into year two of a
five year contract...

 David K. Uspal
 Technology Development Specialist
 Falvey Memorial Library
 Phone: 610-519-8954
 Email: david.us...@villanova.edu




 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Andrew Hankinson
 Sent: Tuesday, August 23, 2011 2:00 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] iPads as Kiosks

 You can distribute apps via an internal web server, with no need to go
out to Apple.


http://developer.apple.com/library/ios/#featuredarticles/FA_Wireless_Enterprise_App_Distribution/Introduction/Introduction.html

 You need to be a registered business to do this, and it costs $299/yr.
You get a digital certificate, but that doesn't mean your code needs to
be seen by anyone outside of your org.


 On 2011-08-23, at 1:47 PM, David Uspal wrote:

 When I did my iPhone work, it was back in 2009 before this document
even existed, so it's good they've come some distance on this issue
since then.  Still, the document below doesn't break the dependency on
the iTunes store and/or a digital certificate issued by Apple to
download applications (if I'm reading page 63 right), which was the big
sticking point of the contract.  Not only did the user not want the
network controlled by Apple (which this document does handle), they
also didn't want the code seen by any outside source at all (aka via
uploading it to the store)


 David K. Uspal
 Technology Development Specialist
 Falvey Memorial Library
 Phone: 610-519-8954
 Email: david.us...@villanova.edu


 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
Of Andrew Hankinson
 Sent: Tuesday, August 23, 2011 1:34 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] iPads as Kiosks

 They now have an enterprise app deployment mechanism.

 http://www.apple.com/support/iphone/enterprise/


 On 2011-08-23, at 12:54 PM, David Uspal wrote:

 Then again, by selecting the iPad you're essentially tethered to
Apple's iron grip of the iWorld via its iTunes vetting process and
strict control of Apple hardware.   YMMV on this depending on what
you're doing, but it should definitely be a consideration when
choosing between Android tablets and the iPad.

 Quick side story -- we had to drop a contract one time at my old job
due to the customer's proprietary requirements.  The customer didn't
want to release its developed software outside of house (minus the
developers of course) and Apple wouldn't give them a waiver from using
the iTunes store.  Mind you, this was a very big company with
resources, so Apple probably lost a 5000 unit sale due to this.


 David K. Uspal
 Technology Development Specialist
 Falvey Memorial Library
 Phone: 610-519-8954
 Email: david.us...@villanova.edu






 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
Of Stephen X. Flynn
 Sent: Tuesday, August 23, 2011 9:01 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] iPads as Kiosks

 Let's not forget a far superior user experience.


 
 Stephen X. Flynn
 Emerging Technologies Librarian
 Andrews Library, College of Wooster
 1140 Beall Ave.
 Wooster, OH 44691
 (330) 263-2154
 http://www.sxflynn.net



 On Aug 22, 2011, at 12:56 PM, Madrigal, Juan A wrote:

 I would definitely go with the iPad. More accessories, better
support and
 consistency.


 Juan Madrigal

 Web Developer
 Web and Emerging Technologies
 University of Miami
 Richter Library



 On 8/22/11 11:19 AM, Dan 

Re: [CODE4LIB] Ontology Question

2011-11-11 Thread Alexander Johannesen
Hiya,

 Is it okay to just use the classes I need or should I include the super 
 classes which they belong to?

I think we also need to define a few concepts here. What do you mean by
"include"? As far as I can tell, you want to say something like
"Here's a few concepts we're using, and their definition is based off
this other ontology over *there*" (pointing), but that's not always
the case, so just asking.

Now, Karen is of course right in her take on it, but there's a little
thing that requires a bit of focus, and that's how this new ontology is
going to be used. Is it one of these manual labour things where it
doesn't actually require formal definitions as much as a human one, or
is it (however you use the ontology) to be passed through a tool, or
more formally passed through an inferencer?


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Alexander Johannesen
I could give you tons of advice, most of it specific to some
technological domain or another, but over the years I've more or less
settled on one thing that beats out all the others:

Data models.

Once you grok data models, what they are, how they work, and all the
extended family (schemas, ontologies, persistent identification,
querying, de-duplication, layered models, LUT/transcripts, stored
procedures [and why they are evil], RDBMS vs. NoSQL vs. whatever, and
so on), everything else is miscellaneous. The way we humans use
computers as tools are all rooted in a data model at the bottom of
some program or database, and the rest of the time is spent
interacting with the data model, trying to make it do the things we
need it to do, and so on. Everything is about and around that data
model, so getting it right is a lot more important than any amount of
beautiful coding against it.

So, that's my big tip; all that technology we muck about with is
really trying to work well with a data model. Your task should rather
be to understand the why, who, how, when and the thenceforth of data
models, and everything else will follow.

Now, this tip could under normal circumstances be applied to any part
of the IT industry, but it especially makes sense in the library
world. Most of the time is spent converting data between data models
(whatever <-> MARC <-> whatever), or making sense of the one (MARC21/FRBR)
or other (AACR2/RDA and that third one I can never remember the name
of, that extension rules to AACR2?) or three (LCSH/DDC). We're all
battling against the original thought and implementation of data
models, and very often you'll find better technological solutions when
you understand the underlying human efforts of ... data modeling (and
by extension, you might discover my pet peeve, how all bad software
and systems in the world come from bad data modeling, and *not* from
bad programming [even if there's plenty of that, too]).


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Professional development advice?

2011-11-28 Thread Alexander Johannesen
Hiya,

On Tue, Nov 29, 2011 at 10:06 AM, Nate Vack njv...@wisc.edu wrote:
 A more productive task is to understand the who, how, when, and
 thenceforth of what tasks actual people want to accomplish with their
 computers

Understanding this is not disconnected from designing data models
*right*. It's the same thing. By extension I should mention that
people are terrible at telling you what they want or need, but they're
good at telling you what they hate. If nothing else, I'd suggest to
tap into that wonderful hate.

 But an 'all flows from data modeling' thought process leads to FOAF,
 FOAF leads to hate, and hate leads to suffering.

This sounds suspiciously like someone who doesn't understand the perils
of data models and how they affect all the FOAF and hate that's built
up around its faults. FOAF and suffering is a symptom of shitty data
models, not shitty code. Unless you've got a little more meat on that
argument? :)


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Professional development advice?

2011-11-29 Thread Alexander Johannesen
Kyle Banerjee wrote:
 Starting with data modeling is like trying to learn a new spoken language
 by focusing on grammar [...]

Hmm. It seems that a lot of people are, shall we say, somewhat
misguided as to what data modelling is, even mighty WikiPedia which makes
it into a formal process of sorts, and I can see it repeated ad
nauseam wherever you go, giving us the idea that it is all about the
schema of columns and the nuts and bolts of tables and relations in a
RDBMS. That's confusing data modelling tools or processes with the
generic open-ended category of data modelling.

Data modelling is simply the act of exploring data-oriented
structures. Over time I've learned that everything we do, every little
problem you battle with in your every daily life, revolves around some
data structure, the names of such, and their internal and external
relationships. The simplest web form has a model, simple and complex
applications do as well, enterprise systems, library systems, formats,
databases, documents, spreadsheets, this conversation, your bicycle,
your morning routine, *everything*.

There is, in my strong opinion, a horrible conflation of the concept
of modelling data and implementations pinning down data types; it's an
evil so strong it blinds us, cripples us, and I feel like screaming out
in terrifying agony the horrors within! The wrongly applied indices!
The labels on columns! The semantic binding of one sub-structure to
another! The optimising tricks used! The stored procedures! The
conceptual semantics of labels in n-ary graphs!!
*aaarghhh!!*

The wretched *name* of a single field and how it quietly eats up any
unambiguous notion we put in place, through the many well-meaning but
afflicting layers of abstraction and implementation, it drives me
insane! "Name"!? What does that mean in the context of an email
address? What does "comment" mean when it reaches my ORB? What were
they thinking when the model designed resulted in SQL statements 1K
long?

There's so much information written on the topic of data modelling,
and most of it ignores that very thing that it should embrace and focus
heavily on; good semantic design. (Granted, it has become far more
focused on in the last 10 years, and I'm extremely happy for that) Put
some heavy thought into your tables, because what you perceive as a
simple table of users becomes an overwhelming problem when you add
special users to the system. Have any of you ever created an ILS with
a table "book" in it? (C'mon, raise your hand, I know you have!) Yeah,
that's the sort of evil I'm talking about! Libraries don't deal with
books, they deal with bibliographic meta data of objects, and
sometimes those objects are called a "book" which has certain
constraints and properties that link to special meta data that isn't
static. Version 1.0 of any system is famously rubbish because of the
learning process of getting all this stuff wrong. Version 2.0 is
famous for being overly abstracted and incomprehensible. Version 3.0
is getting there, but you're bogged down in the middleware,
translating between good but incompatible models. By the time you get
to version 4.0 you realize that the underlying concepts which drove
versions 1 through 3 are flawed, and you need to work in terms of FRBR
sub-graphs instead of MARC records. Version 5.0 is so re-written and
re-conceptualized, you decide to call it something else, version 1.0.
And we repeat the cycle. If your software isn't like this, consider
yourself lucky (or at worst, self-deluded :).
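
To make the "book" point a little more concrete, here is a minimal
sketch of the alternative, not a recommendation for any real ILS. The
names BibliographicItem, kind, metadata and items_of_kind are all made
up for illustration; the only idea taken from the above is that "book"
is a value attached to a described object, not the name of a table.

```python
from dataclasses import dataclass, field


@dataclass
class BibliographicItem:
    """One described object; 'book' is a value here, not a table name."""
    identifier: str                      # hypothetical persistent identifier
    kind: str                            # e.g. "book", "map", "recording"
    metadata: dict = field(default_factory=dict)  # open-ended properties


def items_of_kind(items, kind):
    """Return the items of the given kind, whatever else they carry."""
    return [item for item in items if item.kind == kind]


# Two objects with different metadata shapes living in the same model.
catalogue = [
    BibliographicItem("item:1", "book", {"title": "Pride and Prejudice"}),
    BibliographicItem("item:2", "map", {"title": "Harbour chart", "scale": "1:25000"}),
]
books = items_of_kind(catalogue, "book")
```

Adding a new kind of object later means adding a value, not a table,
which is the sort of extension that tends to hurt in version 1.0.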

 Data modeling is extremely useful, but
 mistaking drips and drabs of it early on for reality can poison your
 thinking.

Sorry, you got that back to front. We all agree that understanding
what users want and / or need is King, but unless you've got that
understanding of not only what the users want but how systems can
deliver this without creating constraints that will screw things up
when you extend that original delivery idea, you're going to suffer.
Badly.

It's easy; take great care with what you call things in your system, no
matter whether it's in the database, your objects / classes /
instances / interfaces, user interface, buttons, messages, windows,
data types, loops ... they're all data models that need to be as
cooperative as possible, speaking the *same language*, to be
compatible in the meaning they give the concepts used. If your Wheels
API has different semantics from your Steering API, making that car is
going to be a really crappy experience, for you as a developer, for
testers, for maintenance guys, for service people, and most of all
don't think for a second that the driver won't notice. These semantics
are far more important than our industry has traditionally given
them credit for, and in my opinion it is our biggest flaw.

Trust me, I've stared at data models up and down so many systems over
the years (10 of them in a high-flying big consultant agency where we
came in when projects otherwise failed) it's amazing I'm still 

Re: [CODE4LIB] Models of MARC in RDF

2011-12-06 Thread Alexander Johannesen
On Wed, Dec 7, 2011 at 1:49 PM, stuart yeates stuart.yea...@vuw.ac.nz wrote:
 As much as I have nothing against anyone on this list, isn't it a little
 US-centric? Didn't we make that mistake before?

I wouldn't worry. A dream-team has no basis in reality, hence the
"dream" part. I'd like to see a Real Team instead, an international
collaboration of people, including international smarts and
non-librarians. (Realistically, an international [or semi] library
conference should have a three-day session with smart people first on
this very issue, and that would make a fine place to get this thing
working, even to some degree of speed)


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] Namespace management, was Models of MARC in RDF

2011-12-12 Thread Alexander Johannesen
Richard Wallis richard.wal...@talis.com wrote:
 You are not the only one who is looking for a better term for what is
 being created - maybe we should hold a competition to come up with one.

A named graph gets thrown around a lot, and even though this is
technically correct, it's neither nice nor sexy.

In my past a bucket was much used, as you can easily throw things in or
take it out (as opposed to the more terminal record being set), however
people have a problem with the conceptual size of said bucket, which more
or less summarizes why this term is so hard to pin down.

I have, however, seen some revert to the old RDBMS world of rows, as they
talk about properties on the same line, just thinking the line to be more
flexible than what it used to be, but we'll see if it sticks around.
Personally I think the problem is that people *like* the idea of a closed
little silo that is perfectly contained, no matter if it is technically
true or not, and therefore futile. This is also why, I think, it's been so
hard to explain to more traditional developers the amazing advantages you
get through true semantic modelling; people find it hard to let go of a
pattern that has helped them so in the past.

Breaking the meta data out of the wonderful constraints of a MARC record?
FRBR/RDA will never fly, at least not until they all realize that the
constraints are real and that they truly and utterly constrain not just the
meta data but the future field of librarying ... :)

Regards,

Alex


Re: [CODE4LIB] Namespace management, was Models of MARC in RDF

2011-12-12 Thread Alexander Johannesen
Richard Wallis richard.wal...@talis.com wrote:
 Collection of triples?

Yes, no baggage there ... :) Some of us are doing this completely without a
single triplet, so I'm not sure it is accurate or even politically correct.
*hehe*

 A classic example of only being able to describe/understand the future in
 the terms of your past experience.

Yes, exactly. Although, having said that, I'm excited that the library
world is finally taking the semantic challenge seriously. It's taken quite
a number of years, but slowly there's a few drips and drabs happening.
Here's to hoping that there's a sluice somewhere about to open fully, and
maybe the RDA vehicle has proper wheels? (Didn't the last time I checked,
but that's admittedly a couple of years back. I hear they at least got new
suspension?)

Regards,

Alex


[CODE4LIB] Open datasets

2012-01-11 Thread Alexander Johannesen
Hiya,

I'm in the middle of creating a meta data management system (including
merging and persistent identifier management) for a somewhat different
domain (intranets and business integration), but it's based on Topic Maps
and so is well suited to other means of meta data handling / mangling. It's
also going to be open-source, and it might be well-suited to library tasks
as well.

So in order to test the integrity and performance of my system so far I'm
wondering if there's a suitable open dataset of bibliographic records that
aren't too obscure (meaning, I can find the titles at amazon or Open
Library) that you could recommend? More than 1000 records, but less than a
million, maybe?

Regards,

Alex


Re: [CODE4LIB] Open datasets

2012-01-12 Thread Alexander Johannesen
Hiya,

Thanks for all the pointers; just what I wanted, and gives me plenty of
ways to test the generic meta data handling. Great!

Regards,

Alex
On Jan 12, 2012 3:19 AM, Simon Spero s...@unc.edu wrote:

 You can get anything you want
 At Brewster Kahle's restaurant.
 http://openlibrary.org/data#bulk_download

 Simon

 On Wed, Jan 11, 2012 at 10:55 AM, LeVan,Ralph le...@oclc.org wrote:

  http://staff.oclc.org/~levan/PearsTraining/scifi.usmarc has 10,000 marc
  records in it.  They are part of the old SiteSearch system that OCLC
  released as open source.  They date back to 2002 and will not contain
  any Unicode, if you were hoping to include that as part of your testing.
 
  Ralph
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
  Alexander Johannesen
  Sent: Wednesday, January 11, 2012 5:36 AM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Open datasets
 
  Hiya,
 
  I'm in the middle of creating a meta data management system (including
  merging and persistent identifier management) for a somewhat different
  domain (intranets and business integration), but it's based on Topic
  Maps
  and so is well suited to other means of meta data handling / mangling.
  It's
  also going to be open-source, and it might be well-suited to library
  tasks
  as well.
 
  So in order to test the integrity and performance of my system so far
  I'm
  wondering if there's a suitable open dataset of bibliographic records
  that
  aren't too obscure (meaning, I can find the titles at amazon or Open
  Library) that you could recommend? More than 1000 records, but less than
  a
  million, maybe?
 
  Regards,
 
  Alex
 



Re: [CODE4LIB] Project Management Software Question

2012-02-23 Thread Alexander Johannesen
Hiya,

 --What project management software are you using?

Semantic MediaWiki, xSiteable

 --What made you choose the system?

Most project management software is written by geeks, not for humans. They
all propose some methodology to go with their model, but either their model
is inflexible (and crashing with yours), or it is so flexible that any tool
might do the trick. Also, they are notoriously hard to configure on a
cumulative scale of the people involved. Also, people hate putting in their
data, so most software, even if they might just do the trick, fails for
human reasons.

So, a simple wiki with some added ontology cruft, and xSiteable delivering
semantics and widgets across all people is enough. Simple todo's beat
complex task management every time.

 --Has the system met all of your needs? If not, where does it fail?

It only fails when we need an average to higher degree of data, again, a human
problem. Oh, and it sometimes fails because the MediaWiki GUI sucks for
non-geeks. I think Confluence is better and overall pretty good.

 --Overall opinions?

I could write you a sonnet or two, but I have very little trust in
software helping much in project management (after having tried them all
over a span of 20 years). A joint platform for documentation is about as
far as I'd go (and for heaven's sake, choose a Wiki that has a usable
interface!)

In fact, you'd be *far* better off getting "Making Things Happen" by Scott
Berkun (
http://www.amazon.com/dp/0596517718?tag=scottberkunco-20&camp=14573&creative=327641&linkCode=as1&creativeASIN=0596517718&adid=1B6JF6HWHDT0S5RYZNNM),
the best book I ever got. Honest, I'm not affiliated. :)

 --What systems did you evaluate and decide not to recommend?

Hmm, I think I've tried too many. I'm sure there's software out there that
doesn't suck (ie. I hear good things about a few here and there), but far
too often do I see this usability paired with human engagement problem crop
up and ruin the best of software packages.

 Any information would be great!

Sorry to be so glum. I'm more happy with simpler approaches such as
"project on a page" (ie. one Wiki page with short description, people,
contacts, goals, and progress) and more agile ways of dealing with
requirements and development (reduces the need for approved paper, easier
to roll back bad decisions, etc.). The closest I get to a Gantt chart is
that one of our vendors insists on sending me one every now and then,
despite that he has to come into the office and explain it to people every
single time.

In other words; use software to document and drive forward, never use
software to measure progress and estimates.

Regards,

Alex (disgruntled ex-believer in project management software)


Re: [CODE4LIB] Seeking examples of outstanding discovery layers

2012-09-19 Thread Alexander Johannesen
I love the Trove from the National Library of Australia ;

   http://trove.nla.gov.au/


Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
--- http://shelter.nu/blog/ --
-- http://www.google.com/profiles/alexander.johannesen ---


Re: [CODE4LIB] rdf serialization

2013-11-04 Thread Alexander Johannesen
Hiya,

On Tue, Nov 5, 2013 at 1:59 AM, Karen Coyle li...@kcoyle.net wrote:
 Eric, I really don't see how RDF or linked data is any more difficult to
 grasp than a database design

Well, there's at least one thing which makes people tilt; the flexible
structures for semantics (or, ontologies) where things aren't as
solid as in a data model. A framework where there are endless options
(on the surface of it) for relationships between things is daunting to
people who come from a world where the options are cast in iron.
There's also a shift away from things' identities being tied down in a
model somewhere into a world where identities are a bit more, hmm,
flexible? And less rigid? That can make some people cringe, as well.

 A master chef understands the chemistry of his famous dessert - the rest of
 us just eat and enjoy.

Hmm. Some of us will try to make that dessert again, for sure. :)


Alex


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Ross Singer rossfsin...@gmail.com wrote:
 This is definitely where RDF outclasses almost every alternative*, because
 each serialization (besides RDF/XML) works extremely well for specific
 purposes [...]

Hmm. That depends on what you mean by alternative to RDF
serialisation. I can think of a few, amongst them obviously (for me)
is Topic Maps which don't go down the evil triplet way with conversion
back and forth to an underlying data model.

Having said that, there's tuples of many kinds, it's only that the
triplet is the most used under the W3C banner. Many are using to a
more expressive quad, a few crazies , for example, even though that
may or may not be a better way of dealing with it. In the end, it all
comes down to some variation over frames theory (or bundles); a
serialisation of key/value pairs with some ontological denotation for
what the semantics of that might be.
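
For the tuple-shy, a rough sketch of what I mean, using plain Python
tuples and no RDF or Topic Maps library at all. The identifiers
("book:1", "graph:catalogue") and the helper names to_triples and
to_frames are invented for the example; a quad is taken here to be a
triple plus a fourth slot naming the graph or context it came from.

```python
from collections import defaultdict

# Hypothetical statements: (subject, key, value, graph).
quads = [
    ("book:1", "title", "Dune", "graph:catalogue"),
    ("book:1", "author", "Frank Herbert", "graph:catalogue"),
    ("book:1", "rating", "5", "graph:reviews"),
]


def to_triples(quads):
    """Drop the fourth slot to get plain subject/key/value triples."""
    return [(s, k, v) for s, k, v, _graph in quads]


def to_frames(triples):
    """Bundle triples into one key/value frame per subject."""
    frames = defaultdict(dict)
    for subject, key, value in triples:
        frames[subject][key] = value
    return dict(frames)


frames = to_frames(to_triples(quads))
```

Going from quads to triples to frames loses the context slot on the
way, which is exactly the kind of trade-off the different
serialisations argue about.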

It's hard to express what we perceive as knowledge in any notational
form. The models and languages we propose are far inferior to what is
needed for a world as complex as it is. But as you quoted George Box,
some models are more useful than others.

My personal experience is that I've got a hatred for RDF and triplets
for many of the same reasons Eric touches on, and as many know, I prefer
the more direct meta model of Topic Maps. However, these two different
serialisation and meta model frameworks are - lo and behold! -
compatible; there's canonical lossless conversion between the two. So
the argument at this point comes down to personal taste for what makes
more sense to you.

As to more on problems of RDF, read this excellent (but slightly dated)
Bray article;
   http://www.tbray.org/ongoing/When/200x/2003/05/21/RDFNet

But wait, there's more! We haven't touched upon the next layer of the
cake; OWL, which is, more or less, an ontology for dealing with all
things knowledge and web. And it kinda puzzles me that it is not more
often mentioned (or used) in the systems we make. A lot of OWL was
tailored towards being a better language for expressing knowledge
(which in itself comes from DAML and OIL ontologies), and then there's
RDFS, and OWL in various formats, and then ...

Complexity. The problem, as far as I see it, is that there's not
enough expression and rigor for the things we want to talk about in
RDF, but we don't want to complicate things with OWL or RDFS either.
And then there's that tedious distinction between a web resource and
something that represents the thing in reality that RDF skipped (and
hacked a 304 solution to). It's all a bit messy.

 * Unless you're writing a parser, then having a kajillion serializations
 seriously sucks.

Some of us do. And yes, it sucks. I wonder about non-political
solutions ever being possible again ...


Regards,

Alex
-- 
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
 http://shelter.nu/blog  |  google.com/+AlexanderJohannesen  |
http://xsiteable.org
 http://www.linkedin.com/in/shelterit


Re: [CODE4LIB] rdf serialization

2013-11-05 Thread Alexander Johannesen
Hi,

Robert Sanderson azarot...@gmail.com wrote:
 c) I've never used a Topic Maps application. (and see (a))

How do you know?

 There /are/ challenges with RDF [...]
 But for the vast majority of cases, the problems are solved (JSON-LD) or no
 one cares any more (httpRange14).

What are you trying to say here? That httpRange14 somehow solves some
issue, and we no longer need to worry about it?

 Having said that, there's tuples of many kinds, it's only that the
 triplet is the most used under the W3C banner. Many are using to a
 more expressive quad, a few crazies , for example, even though that

 ad hominem? really? Your argument ceased to be valid right about here.

I think you're a touch sensitive, mate. Crazies as in, few and
knowledgeable (most RDF users these days don't know what tuples are,
and how they fit into the representation of data) but not mainstream.
I'm one of those crazies. It was meant in jest.

 may or may not be a better way of dealing with it. In the end, it all
 comes down to some variation over frames theory (or bundles); a
 serialisation of key/value pairs with some ontological denotation for
 what the semantics of that might be.

 Except that RDF follows the web architecture through the use of URIs for
 everything. That is not to be under-estimated in terms of scalability and
 long term usage.

So does Topic Maps. Not sure I get your point? This is just semantics
of the key dominator in tuple serialisation, there's nothing
revolutionary about that, it's just an ontological commitment used by
systems. URIs don't give you some magic advantage; they're still a
string of characters as far as representation is concerned, and I dare
say, this points out the flaw in httpRange14 right there; in order to
know representation you need to resolve the identifier, ie. there's a
movable dynamic part to what in most cases needs to be static. Not
saying I have the answer, mind you, but there are some fundamental
problems with knowledge representation in RDF that a lot of people
don't care about which I do feel people of a library bent should
care about.

 But wait, there's more! [big snip]

 Your point? You don't like an ontology? #DDTT

My point was the very first words in the following paragraph;

 Complexity.

And of course I like ontologies. I've bandied them around these parts
for the last 10 years or so, and I'm very happy with RDA/FRBR
directions of late, taking at least RDF/Linked Data seriously. I'm
thus not convinced you understood what I wrote, and if nothing else,
my bad. I'll try again.

 That's no more a problem of RDF than any other system.

Yes, it is. RDF is promoted as a solution to a big problem of findable
and shareable meta data, however until you understand and use the full
RDF cake, you're scratching the surface and doing things sloppily (and
I'd argue, badly). The whole idea of strict ontologies is rigor,
consistency and better means of normalising the meta data so we all
can use it to represent the same things we're talking about. But the
question to every piece of meta data is *authority*, which is the part
of RDF that sucks. Currently it's all balanced on WikiPedia and
dbPedia, which isn't a bad thing all in itself, but neither of those
two are static nor authoritative in the same way, say, a global
library organisation might be. With RDF, people are slowly being
trained to accept all manners of crap meta data, and we as librarians
should not be so eager to accept that. We can say what we like about
the current library tools and models (and, of course, we do; they're
not perfect), but there's a whole missing chunk of what makes RDF
'work' that is, well, sub-par for *knowledge representation*. And
that's our game, no?

The shorter version; the RDF cake with its myriad of layers and
standards is too complex for most people to get right, so Linked Data
comes along and tries to be simpler by making the long goal harder to
achieve.

I'm not, however, *against* RDF. But I am for pointing out that RDF is
neither easy to work with, nor ideal for any long-term goals we might
have in knowledge representation. RDF could have been made a lot
better, with better solutions upstream, but most of this RDF talk
is stuck in 1.0 territory, suffering the sins of former versions.

 And then there's that tedious distinction between a web resource and
 something that represents the thing in reality that RDF skipped (and
 hacked a 304 solution to). It's all a bit messy.

 That RDF skipped? No, *RDF* didn't skip it nor did RDF propose the *303*
 solution. You can use URIs to identify anything.

I think my point was that since representation is so important to any
goal you have for RDF (and the rest of the stack) it was a mistake to
not get it right *first*. OWL has better means of dealing with it, but
then, complexity, yadda, yadda.

 http://www.w3.org/2001/tag/doc/httpRange-14/2007-05-31/HttpRange-14
 And it's not messy, it's very clean.

Subjective, of course. Have you ever played with an 

Re: [CODE4LIB] Protagonists

2015-04-13 Thread Alexander Johannesen
Hmm. So, I'm a big fan of WikiPedia and would still go that way even
if the data can be haphazard. WikiPedia has a lot of classics with a
section called Lead characters (Pride and Prejudice included) where
the focus is the novel first, which should be easy to call and then
trim with some simple text parsing to get basic characterizations,
like gender, possibly age, place and purpose to the story (main
protagonist, antagonist, support character, etc.)

I'd start with a page like Le Monde's 100 Books of the Century
(http://en.wikipedia.org/wiki/Le_Monde%27s_100_Books_of_the_Century)
and give each of them a visit, scraping for "main characters" or
"characters" headings, and devise a small set of parsing rules to grab
the top ones and their properties. Sounds like a fun day or so.
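
A rough sketch of the parsing-rule part, with no network access in it.
It assumes you have already fetched the article's wikitext somehow, that
section headings look like "== Main characters ==", and that character
names are bolded at the start of bullet lines. The SAMPLE text is made
up in that shape and is not copied from WikiPedia; character_names is a
hypothetical helper, and real articles will need more rules than this.

```python
import re

# Made-up wikitext in the assumed shape of a novel article.
SAMPLE = """\
== Plot ==
Some summary text.

== Main characters ==
* '''Elizabeth Bennet''': the protagonist.
* '''Mr. Darcy''': a wealthy gentleman.

== Themes ==
More text.
"""

HEADING = re.compile(r"^==\s*(.+?)\s*==\s*$", re.MULTILINE)
BOLD_NAME = re.compile(r"^\*\s*'''(.+?)'''", re.MULTILINE)


def character_names(wikitext):
    """Return bolded bullet names from the first 'characters' section."""
    headings = list(HEADING.finditer(wikitext))
    for i, match in enumerate(headings):
        if "characters" in match.group(1).lower():
            # The section runs until the next heading, or to the end.
            end = headings[i + 1].start() if i + 1 < len(headings) else len(wikitext)
            return BOLD_NAME.findall(wikitext[match.end():end])
    return []


names = character_names(SAMPLE)
```

Gender, age and role would need a second pass over the text following
each name, and that is where the haphazard data will start to bite.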


Cheers,

Alex



On Tue, Apr 14, 2015 at 3:35 PM, McAulay, Elizabeth
emcau...@library.ucla.edu wrote:
 Cool set of questions! Here's a funny cheat -- what about querying Amazon 
 or the like for a list of Cliff's Notes and call the subjects of the 
 Cliff's Notes the Canon? That could serve as the canon list. Another idea 
 would be to consult a reference work, but I can't think of a good source 
 offhand. One example that's not perfect is the Dictionary of Literary 
 Biography. The Canon is created by what is included in the reference work.

 As for finding lead character names, that's something I don't have an 
 immediate answer for.

 Good luck!

 Best,
 Lisa

 -
 Elizabeth Lisa McAulay
 Librarian for Digital Collection Development
 UCLA Digital Library Program
 http://digital.library.ucla.edu/
 email: emcaulay [at] library.ucla.edu

 
 From: Code for Libraries CODE4LIB@LISTSERV.ND.EDU on behalf of 
 davesgonechina davesgonech...@gmail.com
 Sent: Monday, April 13, 2015 7:12 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: [CODE4LIB] Protagonists

 So I have this idea I'd like to do for a hobby project, but it requires
 finding a table that lists a classic novel, a Gutenberg.org link to an
 instance of that work (first listed, one with most downloads, whichever),
 the lead female character, and the lead male character (can be null). E.g.
 Pride and Prejudice, http://www.gutenberg.org/ebooks/42671, Elizabeth
 Bennet, Mr. Darcy. Even leaving the Gutenberg part for another day, this
 has been really difficult to find.

 I've had no success with Dbpedia/Wikidata since there's no real
 standardized format for novels, characters often are associated more
 strongly with films or video games than original works (Cheshire Cat), and
 when characters are listed they are neither prioritized nor link to a
 record that clearly states gender. And then there's how to select some sort
 of Western Canon list. ISBNs are nowhere to be found, nor any other
 identifier that might help to corral a fair chunk of results.

 I looked at OCLC, but WorldCat Works is still an experiment and frankly
 looks like too much work to query for too little return even if it had good
 coverage. Amazon? Librarything? Goodreads? No luck yet.

 I raise this partly because a) I would like to make some toys with that
 list, and b) I feel this is a good test case for what developers might
 want from library data, linked or otherwise. It is the sort of request
 that includes many unspoken assumptions (that there is a canon, and it is
 well-defined) that app users, product managers, and developers typically
 want even if it is woefully incomplete or imperfect, so long as it matches
 expectations. While I appreciate what it takes to make such a list, I feel
 like this really ought to be a solved problem in the library space. Not
 "in the process of being solved, hopefully, by new emerging standards"
 solved, but like "we solved this ages ago, here ya go" solved.

 I'm posting this basically in the hopes that someone will say "No, doofus,
 there's an easy way to do this, you just aren't very good at this - look:"
 and show me where I'm wrong.

 D



-- 
 Project Wrangler, SOA, Info Alchemist, UX, RESTafarian, Topic Maps
 http://shelter.nu/blog  |  google.com/+AlexanderJohannesen
 http://xsiteable.org  |  http://www.linkedin.com/in/shelterit