Yeah, I thought of the URI encoding issue, that's easy enough to deal with, makes sense.

I have no idea how to tell if SuDocs are case sensitive or not. But they ARE all assigned by the GPO, and look-up-able in the GPO catalog. Yeah, they have to be URL encoded, certainly, but can't we just say "must be a valid SuDoc class (including book number) assigned by the GPO, but [url encode it]." This can't be the only use case for essentially arbitrary strings assigned by a third party controlling authority, that you want to make into an info: uri, right? But maybe I'll try doing the best I can, with or without GPO assistance (Ed Summers said he thought he might know somebody at GPO interested in identifiers), and maybe run it by you? If this ends up being a huge time sink -- I'm probably going to give up, and just use my own illegal info:sudoc identifiers that aren't really registered at all, which would be bad, but I need a sudoc URI and don't have a huge amount of time to sink into doing it 'right'.

Believe me, I have already spent quite a bit of time with that document you reference. It was written for an earlier era, clearly.

Jonathan

Ray Denenberg, Library of Congress wrote:
Pointing to the documentation and saying "one of these" isn't going to work, I'm afraid. Most important is to make sure that the syntax is consistent with URI syntax. Where the syntax of the identifier you're representing is potentially at odds with URI syntax, you might have to make adjustments, like percent-encode. So if you're going to register sudoc, you're going to have to understand the syntax to some degree, there's really no way around it. (I didn't know the lccn syntax, registering it forced me to learn it, and I'm a better man for it.)

I don't know much about SuDoc, and most everything seems to point to http://www.gpo.gov/su_docs/fdlp/pubs/explain.html which doesn't really explain their syntax. (Though if you look a bit harder maybe you'll find something better.)

But I see this example:    Y 3.C 76/3:2 K 54

That's apparently a sudoc. It immediately raises the following flags: spaces, slash, colon, and case (sensitivity). For your purposes I don't think that colon or slash is a problem. (They become a problem when you are using them as special characters for delimitation, but you're not doing that.) Spaces, though, have to be percent-encoded. (That simply means replace each occurrence of a space with "%20".)
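
A minimal sketch of that encoding step in Python, using the example above. The info:sudoc/ prefix is the namespace under discussion here, not a registered one, and the function names are made up for illustration. The choice to leave "/" and ":" unencoded simply follows the reasoning above; a real registration might decide otherwise.

```python
from urllib.parse import quote, unquote

# Hypothetical prefix: info:sudoc is the namespace being discussed, not a registered one.
INFO_SUDOC_PREFIX = "info:sudoc/"


def sudoc_to_info_uri(sudoc: str) -> str:
    """Percent-encode a SuDoc string and prepend the (hypothetical) info:sudoc/ prefix."""
    # Keep "/" and ":" literal, as suggested above; spaces become %20.
    return INFO_SUDOC_PREFIX + quote(sudoc, safe="/:")


def info_uri_to_sudoc(uri: str) -> str:
    """Reverse the encoding to recover the original SuDoc string."""
    if not uri.startswith(INFO_SUDOC_PREFIX):
        raise ValueError("not an info:sudoc URI")
    return unquote(uri[len(INFO_SUDOC_PREFIX):])


print(sudoc_to_info_uri("Y 3.C 76/3:2 K 54"))
# info:sudoc/Y%203.C%2076/3:2%20K%2054
```

Decoding the result with info_uri_to_sudoc gives back the original string, so nothing is lost by the encoding.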

You also need to look at case-sensitivity. If sudocs are case-sensitive, no problem; if not, you may want to normalize to either upper or lower case.

There may not be any normalization issues (other than case sensitivity, if that). Normalization is an issue only if a particular sudoc can be represented by more than one string. If so you have two choices:
1. prescribe a canonical form (which is the approach we took for LCCNs).
2. simply describe the rules for determining when two strings represent the same sudoc (there is no rule that says that two different info URIs can't refer to the same resource).
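
The two options can be sketched in Python roughly as follows. The rules used here (treat case as insignificant, collapse runs of whitespace) are assumptions made purely for illustration; nothing in this thread establishes what the actual SuDoc equivalence rules are, and the function names are hypothetical.

```python
def canonical_sudoc(sudoc: str) -> str:
    """Option 1: map a SuDoc string to a single canonical form.

    ASSUMED rules, for illustration only: case is not significant and
    any run of whitespace is equivalent to one space.
    """
    return " ".join(sudoc.split()).upper()


def same_sudoc(a: str, b: str) -> bool:
    """Option 2: decide whether two strings denote the same SuDoc.

    Here the comparison just reuses the assumed canonical form, but the
    strings themselves are left as they were written.
    """
    return canonical_sudoc(a) == canonical_sudoc(b)


print(canonical_sudoc("  y 3.c  76/3:2 k 54 "))
# Y 3.C 76/3:2 K 54
```

With option 1 only the canonical string would be allowed in a URI; with option 2 both spellings could appear, and consumers would be expected to apply the comparison themselves.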

You can contact me privately if you have problems.

No, sorry, I don't know anyone at GPO. I worked the graveyard shift there part time during college. (I had to load mailing machines with junk mail. Several junk items loaded into a machine which would combine them into one mailing item. The machine would jam about every tenth time. Worst job I ever had.) But that was many years ago and that's the last contact I've had with GPO.

Good luck.

-Ray

----- Original Message -----
From: "Jonathan Rochkind" <rochk...@jhu.edu>
To: <CODE4LIB@LISTSERV.ND.EDU>
Sent: Friday, March 27, 2009 3:36 PM
Subject: Re: [CODE4LIB] registering info: uris?


Thanks Ray.

Oh boy, I don't know enough about SuDoc to describe the syntax rules fully. I can spend some more time with the SuDoc documentation (written for a pre-computer era) and try to figure it out, or do the best I can. I mean, the info registration can clearly point to the existing SuDoc documentation and say "one of these" -- but actually describing the syntax formally may or may not be possible/easy/possible-for-me-personally.

I can't even tell if normalization would be required or not. I don't think so. I don't think SuDocs suffer from the problem that made normalization necessary for LCCNs; I think they already have a consistent form, but I'm not certain.

I'll see what I can do with it.
But Ray, you work for 'the government'. Do you have a relationship with a counterpart at GPO who might be interested in getting involved with this?

Jonathan

Ray Denenberg, Library of Congress wrote:
It's a fairly straightforward process. See:
http://info-uri.info/registry/register.html

You should look at a few examples first, go to http://info-uri.info/registry/ and click on a few of those listed in the left column.

I think registering one for SuDocs would be fairly easy.

The info folks are most concerned that the syntax rules are well-described. I had registered a few of these before they started cracking the whip on that (and rightly so), and when I registered info:lc it became more difficult; you might want to look at that for an example:
http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lc/

Also, normalization - I suggest looking at the info:lccn normalization rules:
http://info-uri.info/registry/OAIHandler?verb=GetRecord&metadataPrefix=reg&identifier=info:lccn/

--Ray


----- Original Message -----
From: "Jonathan Rochkind" <rochk...@jhu.edu>
To: <CODE4LIB@LISTSERV.ND.EDU>
Sent: Friday, March 27, 2009 3:12 PM
Subject: [CODE4LIB] registering info: uris?



Does anyone know the process for registering a sub-scheme for info: uris?

I'd like to have one for SuDoc classification numbers, info:sudoc/.

I'm not sure if I can register that on my own, without working with the US Government Printing Office, which actually maintains sudocs. But if I have to get GPO to do it, or if the registration process is really long and onerous, I'll probably give up quicker (unless it turns out easier than I thought to find the right person at GPO and get them to sign on -- I doubt it!).

But if it's easy enough to just fill out a form and get info:sudoc registered, I'd rather it be legal than use things that look like an info uri but really aren't a legally registered sub-scheme.

Anyone know?

Jonathan
