Re: [CODE4LIB] Question abt the code4libwomen idea
There being no rules about who can form a group does not mean there are no opinions about it, or that nobody should share an opinion. Just the opposite: the community defines itself by sharing opinions and discussing them, not by rules. There is no contradiction between thinking something is a bad idea and thinking it is not prohibited by any rules; I am surprised to find you astonished by it. Yes, you don't need permission, you can just do it. But people will have opinions about what you do, and they'll share them. That's how a community functions, no? People are encouraged to float their ideas by the community, get community feedback, and take that feedback into account -- but taking it into account doesn't mean you have to refrain from doing something if some people don't like it (especially when other people do); you can make your own decision.

I'm not even going to talk about the particular plan here, because I think this general point is much more important. The idea that rules are the only thing that can or should guide one's course of action is absolutely antithetical to a well-functioning community, online or offline. Thinking that either there should be a rule against something, or else nobody should resist or express opposition to anything that lacks a rule against it, is a recipe for stultifying bureaucracy, not community.

From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen Coyle [li...@kcoyle.net]
Sent: Friday, December 07, 2012 12:50 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Question abt the code4libwomen idea

Code4lib appears to have no rules about who can and cannot form a group. Therefore, if there are some folks who want a group, they should create that group. If it's successful, it's successful. If not, it'll fade away like so many start-up groups. I'm astonished at the resistance to the formation of a group on the part of people who also insist that there are no rules about forming groups.
I don't recall that any other proposal to set up a group has met this kind of resistance. In fact, we were recently reminded that if you want something done in c4l you should just do it. There is no need to ask permission. So, do it. I think the only open question is: where? e.g. what platform?

kc

On 12/7/12 9:25 AM, Salazar, Christina wrote:

Hi Bohyun, Thank you so much for raising this again. I'm still interested in such a group. I found the terminology "separate but equal" (that some on this list chose to use as a reason not to do this) offensive; it was not at all the spirit that I'd originally proposed, and no one had suggested either separate OR equal other than detractors. In fact I said that anyone would be welcome. I completely agree with what you're saying about there not being any reason why we women couldn't do both (I think we're versatile that way). I'm pretty sure I vaguely recall (maybe) there being some (similar) concerns about the local c4ls, and I would say it's very similar -- no one says that just because a person finds, say, Appalachia.c4l useful, it detracts from the global c4l. If I can find other women who are willing to work together as a women in library technology/coder/whatever support group, I will work to make something like this happen. As someone pointed out, we don't need blessing from anyone. If you will be there, I will look for you at the conference and we can discuss further. If there are other women who are interested, go us.

Christina Salazar
Systems Librarian
John Spoor Broome Library
California State University, Channel Islands
805/437-3198

p.s. Usual disclaimer about these opinions being my own and not reflecting those of my workplace/employers.
-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Bohyun Kim
Sent: Friday, December 07, 2012 8:14 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Question abt the code4libwomen idea

Hi all, I might upset some people with this, but I wanted to bring up this question. First, let me say that I think it is a terrific idea to have a code4lib learning group with or without a mentoring program. But from what I read from the listserv, it seemed to me that there were interests in a space for women, NOT as a separate group from code4lib BUT more as just a small support and discussion group for just women, INSIDE the c4l community not OUTSIDE of it. (Like an IG inside LITA or something like that...). I just wanted to know if there are still women in code4lib who are interested in this idea, because gender-specific issues won't be addressed by a code4lib learning group. (If this is the case, I am still interested in participating, and I already set up the #code4libwomen IRC channel.) Or, do we think that the initial needs that led to the talk of code4libwomen will be sufficiently met by having a
Re: [CODE4LIB] Code4lib Chicago 2013 poster
I like the picture a lot, but I'd take the male/female symbols out of it; I think they're cheesy, and the point is better made more subtly and implicitly by the image itself, rather than beating people over the head with the gender symbols. But I also have no idea why "open up the door" is apropos.

On 12/6/2012 6:24 PM, Doran, Michael D wrote:

I have come up with an unofficial Code4lib 2013 conference poster. It was inspired by the recent discussions exploring ways to be more gender inclusive in our community, to "open up the door." Although often unacknowledged, women have been coders since the beginning. The photo is from the Computer History Museum website, which states: "In 1952, mathematician Grace Hopper completed what is considered to be the first compiler, a program that allows a computer user to use English-like words instead of numbers." [1] Props there! The photo was actually taken in 1961 and shows Ms. Hopper in front of UNIVAC magnetic tape drives and holding a COBOL programming manual [2]. Bonus points for knowing additional reasons why "open up the door" is apropos.

-- Michael

[1] http://www.computerhistory.org/timeline/?year=1952
[2] http://www.computerhistory.org/collections/accession/102635875 Also see terms of use: http://www.computerhistory.org/terms/

# Michael Doran, Systems Librarian
# University of Texas at Arlington
# 817-272-5326 office
# 817-688-1926 mobile
# do...@uta.edu
# http://rocky.uta.edu/doran/
Re: [CODE4LIB] Gender Survey Summary and Results
Hmm, it's quite possible you know more about statistics than me, but... Usually equations for calculating confidence level are based on the assumption of a random sample, not a volunteering self-selected sample. If you have a self-selected sample, then the equations for "how likely is this to be a fluke" are only accurate if your self-selected sample is representative; and there aren't really any equations that can tell you how likely your self-selected sample is to be representative -- it depends on the circumstances (which is why, for the statistical equations to be completely valid, you need a random sample). That is my understanding, anyway.

On 12/5/2012 2:18 PM, Rosalyn Metz wrote:

Ross, I totally get what you're saying -- I thought of all of that too -- but according to everything I was reading through, the likelihood that the survey's results are a fluke is extremely low. It's actually the reason I put information in the write-up about the sample size (378), population size (2,250), response rate (16.8%), confidence level (95%), and confidence interval (+/- 4.6%).

Rosalyn

On Wed, Dec 5, 2012 at 1:52 PM, Ross Singer rossfsin...@gmail.com wrote:

Thanks, Rosalyn, for setting this up and compiling the results! While it doesn't change my default position ("yes, we need more diversity among Code4lib presenters!"), I'm not sure, statistically speaking, that you can draw the conclusions you have based on the sample size, especially given the survey's topic (note, I am not saying that women aren't underrepresented in the Code4lib program). If 83% of the mailing list didn't respond, we simply know nothing about their demographics. They could be 95% male, they could be 99% female; we have no idea. I think it is safe to say that the breakdown of the 16% is probably biased towards females, simply given the subject matter and the dialogue that surrounded it. We simply cannot project that the mailing list is 57/42 from this, I don't think.
What is interesting, however, is that the number roughly corresponds to the number of seats in the conference. I think it would be interesting to see how this compares to the gender breakdown at the conference. This doesn't diminish how awesome it is that you put this together, though. Thanks again to you and Karen! -Ross.

On Dec 5, 2012, at 1:28 PM, Rosalyn Metz rosalynm...@gmail.com wrote:

Hi Friends, I put together the data and a summary for the gender survey. Now that conference and hotel registration has subsided, it's a perfect time for you to kick back and read through.

[Code4Lib] Gender Survey Data
https://docs.google.com/spreadsheet/ccc?key=0AqfFxMd8RTVhdFVQSWlPaFJ2UTh1Nmo0akNhZlVDTlE
Gender Survey Data is the raw data for the survey. Not very interesting, but you can use it to view my Pivot Tables and charts.

[Code4Lib] Gender Survey Summary
https://docs.google.com/document/d/1Hbofh63-5F9MWEk8y8C83heOkNodttASWF5juqGLQ1E/edit
Gender Survey Summary is an easy-to-read version of the above -- it's the summary I wrote about the results. Included are a brief intro, charts (from above), and a summary of the results.

Let the discussion begin,
Rosalyn

P.S. Much thanks to Karen Coyle for reviewing the summary for me before I sent it out. Also, if there are any typos or grammar mistakes, please blame my friend Abigail, who acted as my editor.
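The margin-of-error arithmetic behind the figures cited in the thread above (sample 378, population 2,250, +/- 4.6% at 95% confidence) can be reproduced with the standard formula for a proportion, including the finite population correction. The formula itself is textbook material, not something spelled out in the thread, and -- as Jonathan's point goes -- it is only valid under the assumption of a simple random sample, which a self-selected survey is not:

```python
import math

def margin_of_error(n, N, p=0.5, z=1.96):
    """Margin of error at ~95% confidence (z=1.96) for a proportion p,
    with the finite population correction for sampling n out of N.
    Only valid if the n respondents are a simple random sample of N."""
    se = math.sqrt(p * (1 - p) / n)       # standard error of the proportion
    fpc = math.sqrt((N - n) / (N - 1))    # finite population correction
    return z * se * fpc

# Survey numbers from the thread: 378 responses from a list of ~2,250.
moe = margin_of_error(378, 2250)
print(f"+/- {moe * 100:.1f}%")  # matches the reported +/- 4.6%
```

The arithmetic checks out; the disagreement in the thread is entirely about whether the random-sample assumption baked into `margin_of_error` holds for a volunteer survey.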
Re: [CODE4LIB] Help with WordPress for Code4Lib Journal
We've looked at OJS in the past and not been happy with it; we're pretty happy with WordPress, and not really looking to migrate all our operations to different software. But thanks for the suggestion. (I do think there are probably ways we could keep using WP without a custom codebase, which I personally would prefer, but it's all tradeoffs.)

On 12/5/2012 5:05 PM, Ed Sperr wrote:

Instead of maintaining a custom codebase to try and force WP to do what you want, why not just use a tool purpose-built for this kind of job? The open-source Open Journal Systems from PKP might be a good fit: http://pkp.sfu.ca/?q=ojs

Ed Sperr, M.L.I.S.
Copyright and Electronic Resources Officer
St. George's University
esp...@sgu.edu
Re: [CODE4LIB] Help with WordPress for Code4Lib Journal
While I agree with Ross in general about suggesting technical solutions without suggesting how they are going to be maintained -- agree very strongly -- and would further re-emphasize that it's important to remember that ALL software installations are living organisms (Ranganathan represent!) and need ongoing labor, not just initial install labor, I don't agree with the conclusion that the _only_ way to do this is with a central organization or "my organization, which has shown commitment through z."

I think it IS possible to run things sustainably with volunteer, decentralized, not-formal-organization labor. But my experience shows that it _isn't_ likely to work with ONE PERSON volunteering. It IS more likely to work with an actual defined collective, which feels collective responsibility for replacing individual members when they leave and maintaining its collective persistence. Is that foolproof? No. But it doesn't make it foolproof to incorporate and have a 'central organization' (you still need labor, paid or unpaid), or to have an existing organization that commits to it (it can always change its mind, or not fulfill its commitments even without actually changing its mind). There are pluses and minuses to both.

I am a firm believer in code4lib's decentralized, volunteer, community-not-organization nature. I may be becoming a minority; it seems like everyone else wants code4lib to be Official? There are pluses and minuses to both. But either way, I don't think officiality is either necessary or sufficient to ensure sustainability of tech projects (or anything else). But I fully agree with rsinger that setting up a new tech project _without_ thinking about ongoing sustainability is foolhardy, unless it's just a toy you don't mind seeing disappear when the originator loses interest.

On 12/4/2012 11:08 AM, Ross Singer wrote:

Shaun, I think you missed my point.
Our Drupal (and, per Tom's reply, WordPress -- and I'm going to take a stab in the dark and throw our MediaWiki instance into the pile) is, for all intents and purposes, unmaintained, because we have no one in charge of maintaining it. Oregon State hosts it, but that's it. Every year -- every year -- somebody proposes we ditch the diebold-o-tron for something else (Drupal modules, MediaWiki plugins, OCS, ... and most recently EasyChair), yet nobody has ever bothered to do anything besides send an email about what we should use instead. Because that requires work and commitment.

What I'm saying is, we don't have any central organization, and thus we have no real sustainable way to implement locally hosted services. The Drupal instance, the diebold-o-tron (and maybe MediaWiki) are legacies from when several of us ran a shared server in a colocation facility. We had skin in the game. And then our server got hacked because Drupal was unpatched (which sucked), and we realized we probably needed to take this a little more seriously. The problem was, though, when we moved to OSU for our hosting, we lost any power to do anything for ourselves, and since we no longer had to (nor could) maintain anything, all impetus to do so was lost. To be clear, when we ran all these services on anvil, that wasn't sustainable either! We simply don't have the organization or resources to effectively run this stuff by ourselves. That's why I'm really not interested in hearing about "some x we can run for y" if it's not backed up with "and my organization, which has shown commitment through z, will take on the task of doing all the work on this."

-Ross.

On Dec 4, 2012, at 10:41 AM, Shaun Ellis sha...@princeton.edu wrote:

Tom, can you post the plugin to Code4Lib's github so we can have a crack at it? Ross, I'm not sure how many folks on this list were aware of the Drupal upgrade troubles. Regardless, I don't think it's constructive to put new ideas on halt until it gets done.
Not everyone's a Drupal developer, but they could contribute in other ways. -Shaun

On 12/4/12 10:27 AM, Tom Keays wrote:

On Tue, Dec 4, 2012 at 9:53 AM, Ross Singer rossfsin...@gmail.com wrote:

Seriously, folks, if we can't even figure out how to upgrade our Drupal instance to a version that was released this decade, we shouldn't be discussing *new* implementations of *anything* that we have to host ourselves.

Not being one to waste a perfectly good segue... The Code4Lib Journal runs on WordPress. This was a decision made by the editorial board at the time (2007), and by and large it was a good one. Over time, one of the board members offered his technical expertise to build a few custom plugins that would streamline the workflow for publishing the journal. Out of the box, WordPress is designed to publish a string of individual articles, but we wanted to publish issues in a more traditional model, with all the articles published at one time and arranged in the issue in a specific order. We could do (and have done) all this manually, but having the plugin has
Re: [CODE4LIB] Choosing fora. was: Proliferation of Code4Lib Channels
On 12/4/2012 12:10 PM, MJ Ray wrote:

Really? I hoped if I wanted to do serious hacking, I could clone it on git.software.coop and send a pull request. If you use github *and insist everyone else does* then you lose all the decentralised networked collaboration benefits of git and it becomes a worse-and-better CVS.

A "pull request" is a feature of github.com. There is no feature of git-the-software called a "pull request." Which of course doesn't stop you from sending an email requesting a pull. A pull, including from decentralized third-party repos, is a feature of git. But yes, if you get used to the features of a particular free service, you get locked into that particular free service. This is certainly part of the overall cost/benefit of using free hosted services.
Re: [CODE4LIB] Choosing fora. was: Proliferation of Code4Lib Channels
Okay, I guess that is a feature. It generates a plain-text file you can send to someone else via email; the recipient can respond by taking manual action on their git command line. Definitely not the github pull requests people are used to.

On 12/4/2012 1:16 PM, MJ Ray wrote:

Jonathan Rochkind rochk...@jhu.edu:

On 12/4/2012 12:10 PM, MJ Ray wrote:

Really? I hoped if I wanted to do serious hacking, I could clone it on git.software.coop and send a pull request. If you use github *and insist everyone else does* then you lose all the decentralised networked collaboration benefits of git and it becomes a worse-and-better CVS.

A "pull request" is a feature of github.com. There is no feature of git-the-software called a "pull request."

I don't think that's correct. GitHub was only launched in April 2008, but here's a pull request from 2005: http://lkml.indiana.edu/hypermail/linux/kernel/0507.3/0869.html

Here's the start of the relevant page in the git software manual:

[quote]
NAME
git-request-pull - Generates a summary of pending changes

SYNOPSIS
git request-pull [-p] start url [end]

DESCRIPTION
Summarizes the changes between two commits to the standard output, and includes the given URL in the generated summary.
[/quote]

Which of course doesn't stop you from sending an email requesting a pull. A pull, including from decentralized third-party repos, is a feature of git.

It sucks that github doesn't accept emails of such git pull requests and do anything useful with them. Ignoring the huge potential of email coordination seems like missing a big feature of git.

But yes, if you get used to the features of a particular free service, you get locked into that particular free service. [...]

If one is locked in, that means it has an exit cost, so it's no longer a free service. The piper might just not need payment yet. Hope that explains,
Re: [CODE4LIB] Help with WordPress for Code4Lib Journal
I'd check out the links under "Bootcamp" here: https://help.github.com/

On 12/4/2012 5:18 PM, Mark Pernotto wrote:

As I'm clearly not well-versed in the goings-on of GitHub, I've 'forked' a response, but am not sure it worked correctly. I've zipped up and sent updates to Tom. If anyone could point me in the direction of a good GitHub tutorial (for contributing to projects such as these -- the 'creating an account' part I think I have down), I'd appreciate it. Thanks, Mark

On Tue, Dec 4, 2012 at 1:43 PM, Tom Keays tomke...@gmail.com wrote:

Let's have mine be the canonical version for now. It will be too confusing to have two versions that don't have an explicit fork relationship. https://github.com/tomkeays/issue-manager Tom

On Tue, Dec 4, 2012 at 1:56 PM, Chad Nelson chadbnel...@gmail.com wrote:

Beat me by one minute, Tom! And here it is in the code4lib github: https://github.com/code4lib/IssueManager

On Tue, Dec 4, 2012 at 1:47 PM, Tom Keays tomke...@gmail.com wrote:

On Tue, Dec 4, 2012 at 1:01 PM, Shaun Ellis sha...@princeton.edu wrote:

You can upload it to your account and then someone with admin rights to Code4Lib can fork it if they think our Code4Lib Journal custom code should be a repo there. Doesn't really matter if they do, actually. I think for debugging, it's best to point folks to the actual code the journal is running, which was forked from the official one on the Codex, right?

It was written for the Journal and originally kept in a Google Code repo (this is before Github became the de facto standard). After the author left the journal, he did a couple of updates which he uploaded to the WP Codex, but nothing for a few years. Anyway, here it is: https://github.com/tomkeays/issue-manager
Re: [CODE4LIB] Library event systems and using your API talents for good
On this thread in general, people may be interested in a previous Code4Lib Journal article on using Google Calendar via its API to embed library open hours information on a website. (Sorry if this has already been mentioned in this thread!) http://journal.code4lib.org/articles/46

It occurs to me that this could also potentially be used for library events. You'd essentially be using Google Calendar for its UI for entering and managing events (and perhaps taking advantage of its iCal feed for end-users who want such a thing), while building your own actual display UI on the Google Calendar API. It'd be free, which would be one advantage.

On 12/2/2012 10:51 AM, Michael Schofield wrote:

This will be brief and full of typos (on my phone during breakfast). I've only been with my current library for the last year, but they/we have been using an event calendar called Helios. It is cheap, and working with it is similar to WordPress. Since I've been here, we purchased Program Registration (an III product). Our public and reference staff really didn't like using it (can't blame them), so we hacked up Helios to be the front-end for our program registration backend (which only really matters IF an event requires actual registration). Anyway, just a simple plug for Helios, if only because we found it to be super malleable. Also, the support from the main guy has been super. I think the URL is refreshmy.com, but I'm on my phone and that's from memory. Sent from my iPhone

On Dec 2, 2012, at 10:35 AM, Tom Keays tomke...@gmail.com wrote:

I've been disappointed by event management/calendaring systems in general. I think there are a number of common needs that libraries all share.

Calendar systems -- scheduling single-instance or repeating-instance events seems to be the one thing you can find in a system. Basic metadata/filtering parameters should (and usually do) include: date, time, location, description.
There's variation in how rich this metadata is; some include permutations on address, campus information, mapping options, etc.; some include HTML options for the description, such as allowing links or images.

Event registration -- an added feature is the ability to allow users to register for an event and for event organizers to process that data. You don't want to have to maintain a separate registration system. Outside the scope of LibraryThing's Event API, except possibly to replicate registration links so users can sign up from within LT.

Syndication -- Jon Udell spent much of 2009 and 2010 documenting his efforts to find and then build a calendaring system that would aggregate existing sources of calendar data, the goal being reuse rather than replication. [1] His specific objective was to create a shared community calendar [2], and along the way he explored the limitations of RSS and iCal data. Once such data was captured by a calendar aggregator, it could then be resyndicated, giving users a single source for the entire community. (Udell has been less public since 2010, so I lost track of where this has been going.)

[1] http://radar.oreilly.com/2010/08/lessons-learned-building-the-e.html
[2] http://elmcity.cloudapp.net/

Embedded calendar data -- also related to syndication is the idea of including calendar metadata in a format on a web page that can be indexed by search engines and directly consumed by users via browser plugins and the like. The hCalendar microformat [3] was an attempt to embed iCal calendar data into event listings. When RDFa had its brief ascendancy a couple of years ago, it looked like hCalendar might merge into it or be replaced by similar systems, such as Schema.org's Event property [4]. However, now it looks like the HTML5 time attribute [5] might edge out Schema.org and hCalendar. Unfortunately, it seems to be impossible to encode hCalendar microformats as HTML5 microdata.
[3] http://microformats.org/wiki/hcalendar
[4] http://schema.org/Event
[5] http://html5doctor.com/the-time-element/

Ongoing events -- much of library event data doesn't fit neatly into regular calendar systems. Whereas calendaring systems only seem to be good at scheduling events with a specified time and date of occurrence, I'd also like to see a calendar system that can handle scheduling of events that are ongoing -- e.g., exhibits, art shows, library week announcements, etc. A defining feature of a good event system would be the ability to schedule both the publication and expiration dates of the event, along with a mechanism to archive expired events. From the public's point of view, an ongoing event would appear once on the calendar -- i.e., as a single event spanning several days, rather than as a series of individual listings strung over the course of several days or weeks. On a day calendar, it would show as an all-day event or announcement. On a week or month calendar, it might be a bar spanning the days or weeks for which it was in effect. My observation has been that
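Tom's "ongoing event" requirement -- one entry spanning its whole date range on a month view, plus separate publication/expiration dates that move a listing into an archive -- can be sketched as a small data model. This is an illustrative sketch of the idea, not any particular product's API; the `Event` class and its field names are hypothetical:

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class Event:
    """Hypothetical event record with separate occurrence and publication windows."""
    title: str
    start: date      # first day the event is in effect
    end: date        # last day (same as start for a one-day event)
    publish: date    # when the listing should appear on the site
    expire: date     # when it should move to the archive

def month_view_entries(events, year, month):
    """Return one (title, span_start, span_end) bar per event overlapping
    the month -- a single spanning entry, not one listing per day."""
    first = date(year, month, 1)
    # Last day of the month: jump past the 28th into next month, then back up.
    last = (first.replace(day=28) + timedelta(days=4)).replace(day=1) - timedelta(days=1)
    bars = []
    for ev in events:
        if ev.start <= last and ev.end >= first:  # overlaps this month
            bars.append((ev.title, max(ev.start, first), min(ev.end, last)))
    return bars

def visible(events, today):
    """Split listings into live vs. archived by publication/expiration dates."""
    live = [ev for ev in events if ev.publish <= today <= ev.expire]
    archived = [ev for ev in events if today > ev.expire]
    return live, archived
```

Under this model, a three-week exhibit yields exactly one bar on the month view, and flips automatically from the live calendar to the archive once its expiration date passes -- the behavior Tom describes wanting from a good event system.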
Re: [CODE4LIB] Choosing fora. was: Proliferation of Code4Lib Channels
Reddit tends to be a pretty segmented place; there are many subreddits that exist, IMO, as more or less 'culturally autonomous' from the rest of reddit, with little interaction with other parts of reddit -- just people taking advantage of reddit to do their own thing. Reddit's UI makes it easy for these subreddits to stay completely separate; there's really little in the UI that brings people from one area of reddit to another or makes them end up 'combined'. I believe that there are many sub-communities on reddit that do not have this misogyny problem, even if reddit's brand has sadly become known for misogyny. I could be wrong, but I'd suggest finding out by asking friends of yours who are redditors (or finding out if friends of yours are redditors, heh), rather than assuming based on media reports that anything on reddit is doomed. Mainstream media is not very good at covering virtual communities, even still.

That said, I still don't think a Code4Lib subreddit is likely to become particularly useful; I think it's unlikely to ever achieve 'critical mass'. (It has been tried before -- there are both a code4lib and a libraries subreddit that have existed for quite a while without significant uptake, aren't there?)

On 12/2/2012 1:44 PM, Karen Coyle wrote:

*sigh* From an article about sexual harassment on reddit:

Reddit is a notoriously male-dominated forum. According to Google's DoubleClick Ad Planner, Reddit users in the U.S. https://www.google.com/adplanner/site_profile#siteDetails?uid=domain%253A%2520Reddit.comgeo=001lp=false are 72 percent male. Reddit subgroups include r/mensrights and the misogynistic r/chokeabitch, perhaps in part prompting another popular thread that asked recently, Why is Reddit so anti-women?
http://www.reddit.com/r/AskReddit/comments/x5oac/why_is_reddit_so_antiwomen_outside_of_rgonewild/ In April, a confused 14-year-old user took to the site in a desperate attempt to seek advice after she had been sexually assaulted http://www.reddit.com/r/AskReddit/comments/smbgv/i_think_i_might_have_been_raped_on_420please_help/. Jezebel chronicled the backlash, as commenters attacked the young victim for overreacting http://jezebel.com/5904323/reddit-is-officially-the-worst-possible-place-for-rape-victims-to-seek-advice. Given its reputation, the site may seem less than appropriate as a forum for effective dialogue. [1]

Which doesn't mean that we should boycott reddit, but it is good to know the make-up and culture of the tools that you use. And I think I have yet to find a thread on ANY TOPIC on slashdot that doesn't have the word tits in it somewhere. I just read the post about the possible move to a $1 coin in the US, and the first post is about strippers. FIRST POST. *sigh* Although perhaps the question now is: which will happen first -- acceptance of a $1 coin in the US, or a Slashdot thread that isn't sexist?

kc

[1] http://www.huffingtonpost.com/2012/07/30/reddit-rapists_n_1714854.html

On 11/30/12 9:51 AM, Shaun Ellis wrote:

Mark and Karen, yes, the DIY and take-initiative ethos of Code4Lib leads to a lot of channels. I think this is a good thing, as each has its strengths. But it creates chaos without more clarity on which platforms are best for certain types of communication. We have similar issues when it comes to our own internal documentation attempts at Princeton. Wiki? Git? Git Wiki? IRC? Blogosphere? Reddit? Listserv? Twitter? Why should I use any of them?!? I will say that I like Reddit for potentially controversial or philosophical discussions. It's built to keep the conversation on track and reward the most insightful/best comments with more visibility.
So, anyway, I've posted this discussion on the subreddit: http://www.reddit.com/r/code4lib/comments/1426fn/the_diy_and_takeinitiative_ethos_of_code4lib/ I also added a post on mentorship to the subreddit, since I'm particularly interested in that. Karen, while I think your comments on promotion and giving credit are important, I'm not sure how they are related to mentorship. Would love to hear more about that in the subreddit. -Shaun

On 11/30/12 12:30 PM, Mark A. Matienzo wrote:

On Fri, Nov 30, 2012 at 12:07 PM, Karen Coyle li...@kcoyle.net wrote:

Wow. We could not have gotten a better follow-up to our long thread about coders and non-coders. I don't git. I've used it to read code, but never contributed. I even downloaded a gui with a cute icon that is supposed to make it easy, and it still is going to take some learning. So I'm afraid that it either needs to be on a different platform for editing, OR someone (you know, the famed someone) is going to have to do updates for us non-gitters.

Karen, I've added instructions about how to add contributions without knowing Git to the README file: https://github.com/code4lib/antiharassment-policy/blob/master/README.md If you'd like, I'm happy to have feedback as to changes here. A small handful of people have also asked if we could move
Re: [CODE4LIB] Choosing fora. was: Proliferation of Code4Lib Channels
On 12/2/2012 9:19 PM, Esmé Cowles wrote:

I think this raises some interesting questions about community and appropriate use of the code4lib name. I just took a look at the code4lib reddit and there were comments from a handful of people. If a handful of people want to create some new channel and call it code4lib, is that OK?

It always has been up to now; it's how every single part of code4lib was created. So it's how we got here.

Who decides that?

That handful of people do.

Does it matter if it's part of something like reddit, that is seriously at odds with our budding anti-harassment policy?

I think it's far from clear that a code4lib subreddit is inherently at odds with an anti-harassment policy (OR, more importantly, at odds with our desire to be a comfortable place for all sorts of people, including people from disadvantaged groups, which is more important than any particular policy). But of course not everyone will agree on this; perhaps I am wrong. I'd suggest that if you think someone is doing something with the code4lib name that you find harmful to code4lib, you bring it up with them, either in private or in public, whichever you prefer. I think it's more productive to discuss this in the concrete than in the abstract.

I don't think we need some general policy or bureaucracy on who can use the code4lib name; we've never had one before. What we have instead is the ability to discuss _any particular use_ that people don't like -- so if you don't like the group on reddit, let's talk about THAT, specifically. If the general consensus seems to be that there shouldn't be a code4lib reddit area, then I suspect the people who created it will get rid of it. That's always happened before. If they don't, then the community can decide what we should do to distance it from code4lib (which we'd have to do anyway with non-compliant folks, even if we had a policy and bureaucracy over who was allowed to use the name).
So if this is not just hypothetical but you actually are concerned about it, please do bring it up in a separate thread on the list, or start by contacting the folks who created the reddit thing off-list, whatever you prefer.
Re: [CODE4LIB] Choosing fora. was: Proliferation of Code4Lib Channels
I don't think running one's own Hacker News or Reddit is a particularly sustainable thing to do. I say this as someone who's looked into both, for daydreams of improving the planet.code4lib stuff. They're both fairly complicated codebases, with multiple components that need to be installed, and not a lot of documentation (as they are mainly developed for their own sites, the code is made available open source, but is not really documented/supported for other people). Really, I don't think running virtually ANY software of our own for 'code4lib' is particularly sustainable; we're already having trouble sufficiently maintaining what we've already got. This stuff ends up being a lot more work than expected to maintain, and after the initial novelty of implementing a new thing! wears off (if not before :) ), it's difficult to find volunteer labor to maintain it. Especially without knowing if people are going to use the thing anyway. If there's a free service that already does what you want, why not just use it, and see if it catches on? Well, in this case, because some people are objecting to www.reddit.com as a service, I guess. Personally, I think those objections are at least in part misplaced; reddit is just a big place where lots of stuff happens (like youtube, or the internet): check out, for instance, http://www.reddit.com/r/feminism and http://www.reddit.com/r/transgender . But maybe I'm wrong on this. Either way though, I kind of suspect nobody would be using a /r/Code4Lib anyway, honestly. On the other hand, maybe I'm wrong about that too; I just went to look up the 'libraries' reddit some folks created a while ago, to show that it didn't get much use -- but found it actually IS getting some use! http://www.reddit.com/r/libraries On 12/3/2012 11:34 AM, Shaun Ellis wrote: I'm not particularly sold on Reddit.
I just think that there are some types of discussions that might be more constructive with a threaded forum than a listserv, just like there are some types of communication that are more suited to IRC or the wiki. In line with Jonathan's comments, we're not going to stop using YouTube just because it's filled with trolls, right? I only suggested and created the subreddit because it's easy to set up and requires very little maintenance. I, for one, am open to suggestions for tools with similar functionality, so long as they don't require too much maintenance. Looking at the Hacker News source code... anyone know Arc? :) -Shaun On 12/3/12 11:23 AM, Jonathan Rochkind wrote: Reddit tends to be a pretty segmented place, there are many subreddits that exist, IMO, as more or less 'culturally autonomous' from the rest of the reddit, with little interaction with other parts of reddit. Just people taking advantage of reddit to do their own thing. Reddit's UI makes it easy for these subreddits to stay completely separate, there's really little in the UI that brings people from one area of reddit to another or makes them end up 'combined'. I believe that there are many sub-communities on reddit that do not have this misogyny problem, even if reddit's brand has sadly become known for misogyny. I could be wrong, but I'd suggest finding out by asking friends of yours that are redditors (or finding out if friends of yours are redditors, heh), rather than assuming based on media reports that anything on reddit is doomed. Mainstream media is not very good at covering virtual communities, even still. That said, I still don't think a Code4Lib subreddit is likely to become a particularly useful idea, I think it's unlikely to ever achieve 'critical mass' (It has been tried before, there's both a code4lib and a libraries subreddit that have existed for quite a while without significant uptake, aren't there?) 
On 12/2/2012 1:44 PM, Karen Coyle wrote: *sigh* From an article about sexual harassment on reddit: Reddit is a notoriously male-dominated forum. According to Google's DoubleClick Ad Planner, Reddit users in the U.S. https://www.google.com/adplanner/site_profile#siteDetails?uid=domain%253A%2520Reddit.comgeo=001lp=false are 72 percent male. Reddit subgroups include r/mensrights and the misogynistic r/chokeabitch, perhaps in part prompting another popular thread that asked recently, Why is Reddit so anti-women? http://www.reddit.com/r/AskReddit/comments/x5oac/why_is_reddit_so_antiwomen_outside_of_rgonewild/ In April, a confused 14-year-old user took to the site in a desperate attempt to seek advice after she had been sexually assaulted http://www.reddit.com/r/AskReddit/comments/smbgv/i_think_i_might_have_been_raped_on_420please_help/. Jezebel chronicled the backlash, as commenters attacked the young victim for overreacting http://jezebel.com/5904323/reddit-is-officially-the-worst-possible-place-for-rape-victims-to-seek-advice. Given its reputation, the site may seem less than appropriate as a forum for effective dialogue.[1] Which doesn't mean that we should boycott reddit
Re: [CODE4LIB] Proliferation of Code4Lib Channels
A final note is that Reddit's source code is up on github. I'm not a python expert, but it could probably be set up in isolation from reddit if that's seen as a problem. It could use whatever authentication the C4L wiki uses. It has a RESTful API as well, so we could integrate it into the listserv as Ed Summers did with the jobs site. I believe you're talking about a fairly major development/maintenance project there. Installing and running the reddit software is not something I think anyone should plan on doing as a minimal part of their 'spare time', let alone modifying it and running a forked version. Nothing wrong with major development/maintenance projects done by volunteers, if someone's interested. And nothing wrong with experimenting with it to see if you can prove me wrong and it really is a trivial task. But I'd be cautious of assuming that code4lib has a bottomless reserve of volunteer labor for non-trivial tasks; we have trouble continuing to maintain the tech infrastructure we've already got. If it were me, I'd be considering cost/benefit, and not assuming something will be used just because 'if you build it they will come'. And if someone IS looking to do some self-directed development and maintenance work for the code4lib community, they should of course do it where they feel most called to do it -- but if you have an interest in helping out the Code4Lib Journal, we could use it; we're having trouble maintaining and developing our tech infrastructure there at the level we'd like, with currently available interested volunteer labor.
Re: [CODE4LIB] What is a coder?
The mission statement on the code4lib website says The Code4Lib Journal exists to foster community and share information among those interested I want to clarify that the Code4Lib Journal is a specific project with a specific list of people on its editorial board. In this way, it's unlike the broader Code4Lib community of which it's a part, which really is a community in the ordinary sense of the word, not a formal organization or project. The Journal only speaks for the Journal, not for Code4Lib. That mission statement is on the Journal website, and is the Journal's mission, as agreed upon by the Journal's founding editorial board; it is not on the code4lib website, it was agreed upon by nobody other than the Journal's founding editorial board, and it applies to nothing other than the Journal. (But I don't think I've ever heard ANYONE say that only coders are welcome at code4lib; I think it's a straw man, and I'm not sure why it's being 'debated'. I just wanted to clear up the relationship of the Code4Lib Journal and its website to code4lib. Perhaps the Journal website needs some more clarifying language? I think it probably does, hmm.)
Re: [CODE4LIB] What is a coder?
Dude, I'm positive I'm a coder because I spend a whole lot of time coding, and I think I do it pretty decently -- and search in Google is a key part of my workflow! So is debugging. Hopefully copy-and-paste-coding-without-knowing-what-i'm-doing is not, however, true. But no need to be elitist about it. From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Friscia, Michael [michael.fris...@yale.edu] Sent: Thursday, November 29, 2012 8:45 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] What is a coder? Thought process of a coder: 1- I need to open a file in my program 2- ok, I'll import IO into my application and read the definition 3- i create methods and functions around the definition and open my file Total time to deliver code: 5 mins Thought process of a non-coder 1- I need to open a file in my program 2- I open up a web browser and go to google 3- search open file in java 4- copy/paste the code I find 5- can't figure out why it doesn't work, go back to step 3 and try a different person's code 6- really stuck, contemplates changing the programming language 7- runs some searches on easier programming languages 8- goes back to Google and tries new search terms and gets different results 9- finally get it working 10- remove all comments from the copy/paste code so it looks like I wrote it. Total time to deliver code: 5 hours ___ Michael Friscia Manager, Digital Library Programming Services Yale University Library (203) 432-1856 -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mark A. Matienzo Sent: Wednesday, November 28, 2012 10:03 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] What is a coder? Some discussion (both on-list and otherwise) has referred to coders, and some discussion as such has raised the question whether non-coders are welcome at code4lib. What's a coder? I'm not trying to be difficult - I want to make code4lib as inclusive as possible. Mark A. 
Matienzo m...@matienzo.org Digital Archivist, Manuscripts and Archives, Yale University Library Technical Architect, ArchivesSpace
Re: [CODE4LIB] What is a coder?
The statement on the actual code4lib website (not the Journal's website) can be found here: http://code4lib.org/about I have no idea how old that statement is, or how often it's been changed -- it looks like it's got some stuff added to it at least as a result of recent discussion? But at any rate, it probably wasn't arrived at by consensus of any large group of people; it's probably something somebody at some point thought made sense and put there, and it's stayed there because nobody found it objectionable (possibly because nobody noticed it). I don't think there's anything wrong with that, I think that's how our community works! But it means it's not set in stone or anything, or representative of 'everybody', or representative of everyone's thinking. Particular projects done by code4lib people have particular missions and goals and organizational structures -- code4lib in general has none of these things, it's just a bunch of people, nothing more or less. (With regard to that 'about' statement particularly, if you want to change the 'about' there: draw up a draft, get feedback from others on it, install it when general consensus seems to be reached. It sounds like some people may have been doing that recently, although perhaps they skipped the tell folks you're changing it and get feedback step. :) ) But anyway, here's the 'about' statement on the actual code4lib website. (Personally, I would not refer to code4lib as a collective, as 'collective' to me means more of a cohesive organization with defined membership; I'd call it a 'community'.) code4lib isn't entirely about code or libraries. It is a volunteer-driven collective of hackers, designers, architects, curators, catalogers, artists and instigators from around the world, who largely work for and with libraries, archives and museums on technology stuff.
It started in the fall of 2003 as a mailing list when a group of library programmers decided to create an overarching community agnostic towards any particular language or technology. Code4Lib is dedicated to providing a harassment-free community experience for everyone regardless of gender, sexual orientation, disability, physical appearance, body size, race, or religion. For more information, please see our emerging CodeofConduct4Lib. code4lib grew out of other efforts such as the Access Conference, web4lib, perl4lib, /usr/lib/info (2003-2005, see archive.org) and oss4lib which allow technology folks in libraries, archives and museums to informally share approaches, techniques, and code across institutional and project divides. Soon after the mailing list was created, the community decided to set up a #code4lib IRC channel (chat room) on freenode. The first face-to-face meeting was held in 2005 in Chicago, Illinois, USA and the now-annual conference started in 2006 in Corvallis, Oregon, USA, and has continued since. Local meetings have also sprung up from time to time and are encouraged. A volunteer effort manages an edited online journal that publishes relevant articles from the field in a timely fashion. Things get done because people share ideas, step up to lead, and work together, not because anyone is in charge. We prefer to make community decisions by holding open votes, e.g. on who gets to present at our conferences, where to host them, etc. If you've got an idea or an itch to scratch, please join in; we welcome your participation! If you are interested in joining the community: sign up to the discussion list; join the Facebook or LinkedIn groups; follow us on Twitter; subscribe to our blogs; or get right to the heart of it in the chat room on IRC. From: Jonathan Rochkind Sent: Thursday, November 29, 2012 9:02 AM To: Code for Libraries Subject: RE: [CODE4LIB] What is a coder?
Re: [CODE4LIB] What is a coder?
I think that _everyone_ who finds our topics and discussions interesting and useful is welcome at the conference, on the listserv, in IRC, etc. However, at the same time, I will confess that I personally find the proliferation of archival/repository topics at the conference disappointing. I feel like there are many, many venues for discussing institutional repositories and digital archiving. Many other venues (journals, conferences, listservs, organizations) that purport to be about library technology in general or digital libraries really end up being focused almost exclusively on archival/repository matters. When I first found code4lib, what was exciting to me is that finally there was a venue for people discussing and trying to DO technological innovation in actual 'ordinary' library user services, in helping our patrons do all the things that libraries have traditionally tried to help them do, and which need an upgraded tech infrastructure to continue helping them do in the 21st century. But that's just me. I don't think there's _anyone_ that's interested in drawing lines around _who_ can participate in 'code4lib'. But I think almost _everyone_ has an interest in _what_ the topics and discussions at code4lib are. Because that's what makes it code4lib: there's already a web4lib listserv, there's already a D-Lib Magazine, there's already DLF gatherings, there's already LITA, etc -- those who are fans of code4lib like it because of something unique about it, and want it to change in some ways and not in other ways. And we probably don't all agree on those ways. But it would be disingenuous to pretend that everyone in code4lib has no opinion about what sorts of topics and discussions should take place at confs or on the listserv etc. But I've still never seen anyone say that any person or type of person is unwelcome! Yeah, there is some tension here, because of course what ends up creating the 'what' is the 'who' who are there.
I am not afraid to say that code4lib would not be able to remain code4lib unless the _majority_ of participants were coders, broadly understood (writing HTML is writing code; writing anything to be interpreted by a computer is writing code). But either that will happen or it won't, there's no way to force it. (And personally, I'm not afraid to say that code4lib would not be able to remain code4lib for ME, if the _majority_ of participants become people who work mostly on digital repository or archival areas, as is true of so many other library technology venues.) From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Christie Peterson [cpeter...@jhu.edu] Sent: Thursday, November 29, 2012 9:13 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] What is a coder? I think my tweet yesterday may have been partially responsible for raising this question in Mark's mind. I wrote: Debating registering for c4l since I'll be getting -- at most -- 50% reimbursement for costs , well, I'm not a coder. Thoughts? When I wrote this, I was using coder in the sense that Jonathan used it: A coder is someone who writes code, naturally. :) and also in the sense that Henry mentioned: sysadmin types who do a minimal amount of literal coding but self-identify as technologists. I profess to be neither, yet many of the topics on this year's lineup are directly relevant to my work. My professional identity is, first, as an archivist. This belies a lot of tech-heavy activities that I'm involved with, however: management of born-digital materials, digital preservation, designing/building a digital repository, metadata management, interface design, process improvement and probably a few other things that just don't happen to be what I'm thinking about at this particular moment.
So although I'm not a coder in the sense that I defined above, it's essential for my work that I understand a lot about the technical work of libraries and that I can communicate and collaborate with the true coders. As my tweet hinted at, this puts me in an odd place in terms of library financial support for attendance at technology-focused conferences. While the coders I work with (hi guys!) get fully funded to attend code4lib and similar conferences, I don't. If this were training in the sense of a seminar or a formal class on the exact same topics, I would be eligible for full funding, but since it's a conference, it's funded at a significantly lower level. I'll gladly take suggestions anyone has for arguments about why attendance at these types of events is critical to successfully doing my work in a way that, say, attending ALA isn't -- and why, therefore, they should be supported at a higher funding rate than typical library conferences. Any non-coders successfully made this argument before? Cheers, Christie S. Peterson Records Management Archivist Johns Hopkins University The Sheridan Libraries
Re: [CODE4LIB] tech vs. nursing
On 11/29/2012 4:19 PM, Chris Fitzpatrick wrote: departments in kinda interesting ways. There now seems to be things like Metadata or Systems groups that are distinct from Digital Repository or Applications groups. Catalogers and the people who work on the ILS are often completely segregated from the people who work on the new flashy grant-funded projects. Yes, this isn't new, and it is a problem. The former, it kinda seems to me, tends to have more women members while the latter is often lacking. Code4Lib draws mostly from people working in these new-ish groups, Code4Lib didn't use to; when I attended the second code4lib conf, the vast majority of the presentations and presenters were NOT about grant-funded work or digital repository work, and the majority of people I met at Code4Lib were not working on such things. I miss that. Code4Lib was in fact the only place I knew of for people working on traditional library use cases, not on grant-funded projects, trying to innovate with technology and keep libraries relevant.
Re: [CODE4LIB] What about Code4Lib4Women?
Sounds possibly interesting. Other than a word, what would that be exactly, and what would be the goals of it? Do you mean a different conference, or listserv, or what? On 11/28/2012 3:34 PM, Salazar, Christina wrote: And/or Code4Lib4[I hate that word minority, but cannot think of another for here, but maybe you get what I mean] Not trying to splinter, but that might be one way to encourage diversity but again, without implication that ANYONE would be excluded. (Inspired by http://www.meetup.com/Los-Angeles-Womens-Ruby-on-Rails-Group/ ) Christina Salazar Systems Librarian John Spoor Broome Library California State University, Channel Islands 805/437-3198 [Description: Description: CI Formal Logo_1B grad_em signature]
Re: [CODE4LIB] What is a coder?
A coder is someone who writes code, naturally. :) Code is something intended to be interpreted or executed by a computer or a computer program. I think everyone agrees that anyone is welcome at code4lib. However, many want to keep code4lib conference presentations and community focused on technical matters and matters of interest to coders. These things are not necessarily contradictory. From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Mark A. Matienzo [mark.matie...@gmail.com] Sent: Wednesday, November 28, 2012 10:02 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] What is a coder? Some discussion (both on-list and otherwise) has referred to coders, and some discussion as such has raised the question whether non-coders are welcome at code4lib. What's a coder? I'm not trying to be difficult - I want to make code4lib as inclusive as possible. Mark A. Matienzo m...@matienzo.org Digital Archivist, Manuscripts and Archives, Yale University Library Technical Architect, ArchivesSpace
[CODE4LIB]
On 11/27/2012 4:46 PM, Shaun Ellis wrote: I agree with Tom. If you look at the links Andromeda sent earlier in this thread, both conference organizers reported dramatic increases in the number of under-represented presenters simply by 1) making the proposal authors anonymous during voting Hmm, is the proposal author a legitimate (or illegitimate) criterion to judge proposals on? I tend to think it's actually legitimate; there are some people I know will give a valuable presentation because of who they are, and others whose expertise I might trust on some topics but not others. I don't think this is illegitimate, and I wouldn't want to take this information away from voters. We are, after all, voting not just on a topic, but on a topic to be presented by a certain person or people. (I would be quite fine with having some of the program decided upon by the program committee rather than by the voters at large, though! Using a variety of criteria. In addition to addressing issues of diversity in presenters, I think it could also improve the quality of presentations, and topical diversity as well.)
[CODE4LIB] Your proposal wasn't accepted? Consider submitting to the Code4Lib Journal?
Are you sad your proposal wasn't accepted to the Code4Lib Conference? Please consider submitting it as an article to the Code4Lib Journal instead! In fact, you can submit something as an article even if you are presenting at the conf too -- but especially if you aren't, getting an article published in the Journal can be an alternate way to get your ideas out to the Code4Lib audience -- maybe get them out to even more people than would see them at the conference, and in something that stays on the web for future generations of systems librarians too! It isn't necessarily that much harder to prepare an article than to prepare a presentation. Whatever you would have included in your presentation, you just need to set it down in narrative text instead. It needs to be clear and legible, and we won't just take a slide deck as an article -- but we do accept articles written informally, if they are clear and convey good information, and articles can include screenshot and screencast components. Share what you're doing with your peers, at the Code4Lib Journal! There are a number of proposed presentations that didn't make the cut that I would have liked learning about -- I hope you submit them as articles to the Journal instead! We accept submissions at any time, on a rolling basis, as either abstracts or first drafts: http://journal.code4lib.org/call-for-submissions The next proposal cut-off date is Jan 7th, for the 20th issue to be published in the spring. But don't procrastinate and wait: avoid the rush, get your proposal in now while it's fresh in your head!
Re: [CODE4LIB] COinS
On 11/20/2012 8:25 PM, Godmar Back wrote: Could you elaborate on your belief that COinS is actually illegal in HTML5? Why would that be so? Yeah, thanks for calling me on that, I was wrong! Not sure where I got that idea, but it does not seem to be illegal. (Did some earlier version of HTML5 get rid of the 'title attribute on every element'? Or was I just confused?) Perhaps what I was thinking of is that some people see an accessibility issue in using the 'title' attribute for non-human-readable data, like COinS does. As the title attribute theoretically provides extra human-readable content that a user-agent can display in some cases, filling it with non-human-readable data may confuse people. I seem to recall _someone_ complaining about a COinS title attribute on these grounds in some app I develop, but I can't remember the details. Here's others mentioning that potential problem: * http://en.wikipedia.org/wiki/Microformat#Accessibility * http://www.bbc.co.uk/blogs/radiolabs/2008/06/removing_microformats_from_bbc.shtml However, in practice, that seems to be a problem more likely, if at all, with title attributes on abbr elements, not span elements like COinS uses. If you google around, you find a lot of people complaining about the reverse problem -- don't assume that adding a title attribute to your span provides an accessible description (say, to visually impaired users), because most assistive user-agents in fact ignore the title attribute! Still, it's kind of messy to use a title attribute for non-human-readable purposes, and that messiness is a large part of the motivation for HTML5 microdata. - Godmar On Tue, Nov 20, 2012 at 5:20 PM, Jonathan Rochkind rochk...@jhu.edu wrote: It _IS_ an old unused metadata format that should be replaced by something else (among other reasons because it's actually illegal in HTML5), but I'm not sure there is a something else with the right balance of flexibility, simplicity, and actual adoption by consuming software.
But COinS didn't have a whole lot of adoption by consuming software either. Can you say what you think the COinS you've been adding are useful for, what they are getting used for? And what sorts of 'citations' you were adding them for? For my own curiosity, and because it might help answer if there's another solution that would still meet those needs. But if you want to keep using COinS -- creating a COinS generator like OCLC's no-longer-existing one is a pretty easy thing to do; perhaps some code4libber reading this will be persuaded to find the time to create one for you and others. If you have a server that could host it, you could offer that. :) On 11/20/2012 4:47 PM, Bigwood, David wrote: I've used the COinS Generator at OCLC for years. Now it is gone. Any suggestions on how I can get an occasional COinS for use in our bibliography? Do any of the citation managers generate COinS? Or is this just an old unused metadata format that should be replaced by something else? Thanks, Dave Bigwood dbigw...@hou.usra.edu Lunar and Planetary Institute
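To give a sense of why a COinS generator is "a pretty easy thing to do": a COinS is just an empty span with class Z3988 whose title attribute carries a URL-encoded OpenURL ContextObject key/value string. Here's a minimal Ruby sketch (the method name and the book-format fields chosen are illustrative, not any particular generator's API):

```ruby
require "uri"
require "cgi"

# Minimal COinS generator sketch: build the Z39.88 ContextObject
# key/value string and wrap it in the empty span that COinS consumers
# (e.g. Zotero) look for. Fields follow the OpenURL book format.
def coins_span(btitle:, aulast:, date: nil, isbn: nil)
  pairs = {
    "ctx_ver"     => "Z39.88-2004",
    "rft_val_fmt" => "info:ofi/fmt:kev:mtx:book",
    "rft.btitle"  => btitle,
    "rft.aulast"  => aulast,
  }
  pairs["rft.date"] = date if date
  pairs["rft.isbn"] = isbn if isbn
  kev = URI.encode_www_form(pairs)               # percent-encode the values
  # HTML-escape the whole string (turns '&' into '&amp;') so it is a
  # valid attribute value:
  %(<span class="Z3988" title="#{CGI.escapeHTML(kev)}"></span>)
end

puts coins_span(btitle: "Moby-Dick", aulast: "Melville", date: "1851")
```

Wrapping something like this in a tiny web form would reproduce what the OCLC generator did.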
[CODE4LIB] ruby gem for testing IP addresses for inclusion in sets of non-contiguous address ranges
Something we university library folks often need to do, even though it's kind of a ridiculous design. I wrote a ruby convenience gem for it that some may find useful, basically just a convenience method around the ruby IPAddr stdlib, which does the heavy lifting. https://github.com/jrochkind/ipaddr_range_set
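For anyone curious, here is the underlying stdlib pattern the gem wraps (this is the idea, not the gem's actual API -- see its README for that; the network ranges below are made up for illustration): an IPAddr built with a netmask responds to #include?, so a "range set" is just a list of networks checked in turn.

```ruby
require "ipaddr"

# Hypothetical licensed IP ranges -- e.g. a campus network and a VPN pool.
campus_ranges = [
  IPAddr.new("128.220.0.0/16"),   # illustrative campus network
  IPAddr.new("10.8.0.0/24"),      # illustrative VPN pool
]

# True if the given address falls inside any of the (non-contiguous) ranges.
def in_ranges?(ranges, ip_string)
  ip = IPAddr.new(ip_string)
  ranges.any? { |range| range.include?(ip) }
end

puts in_ranges?(campus_ranges, "128.220.4.21")  # => true
puts in_ranges?(campus_ranges, "192.168.1.1")   # => false
```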
Re: [CODE4LIB] Code4Lib Mid-Atlantic Google Group
All it takes is doing it. You can create a wiki page on the code4lib wiki if you want, next to the other regional ones. The wiki is editable by anyone. Then you just have to find other people who live around you, and get them to do code4lib-like activities with you using the code4lib name. That's all there is. On 11/8/2012 3:12 PM, Akerman, Laura wrote: Another newbie (can't say how innocent) is interested in the answer to this and seeing that there isn't one for the Southeast, wonders what it would take to create one? Or, if that's out of reach for now, whether a visitor from below the Mason-Dixon line would be unwelcome or not to one of the other regions' meetings? We do do code down here sometimes... Laura Laura Akerman Technology and Metadata Librarian Room 128, Robert W. Woodruff Library Emory University, Atlanta, Ga. 30322 (404) 727-6888 lib...@emory.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Michael Schofield Sent: Thursday, November 08, 2012 8:45 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Code4Lib Mid-Atlantic Google Group Hi David [and all], Innocent newbie question: I see there is a Code4Lib NE and Mid-Atlantic - does the latter descend so far into Florida? Is there a Code4Lib SE? A better question: is there a more appropriate place for me to have looked this up? Michael -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mark Wilhelm Sent: Thursday, November 08, 2012 6:09 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Code4Lib Mid-Atlantic Google Group David, When I access this group I get a you cannot view topics in this forum message. Thanks once again for hosting the conference a few weeks back. --Mark On Wed, Oct 24, 2012 at 11:44 AM, David Uspal david.us...@villanova.edu wrote: All, Thanks to everyone who made the Code4Lib Mid-Atlantic kick-off meeting a success! 
To keep the ball rolling, I've set up a temporary home base at Google Groups so we can talk about local issues (our next informal meetup, listservs, etc) without flooding inboxes. You can join the growing list here: https://groups.google.com/forum/#!forum/code4lib-mid-atlantic David K. Uspal Technology Development Specialist Falvey Memorial Library Phone: 610-519-8954 Email: david.us...@villanova.edu -- Mark Wilhelm E-Mail: markc...@gmail.com Twitter: @markcwil Facebook: facebook.com/markcwil Read the Information Science News Blog at: http://infoscinews.blogspot.com/
Re: [CODE4LIB] A [Wordpress-based] Alerts Dashboard - Library Closings, etc.
That's a really cool idea Jason! I highly encourage you to write it up for the Code4Lib Journal, sounds like a great (possibly short) article for the journal. Do you do anything with dates, so 'old' alerts/notices aren't shown anymore? Sounds like no, you just display the last 3, in case people want to look back at history too? Would love to see some screenshots or webcasts or examples of it in action -- or write a code4lib journal article to share with everyone! On 11/7/2012 11:31 AM, Jason Griffey wrote: We aren't right now...all posts just go where they go. But it's trivial to break out a category-specific RSS feed in Wordpress, so that would be easily done. We typically update the notice instead of taking it down. Good blog form, and all that. For most alert items (Database down, etc) the display just shows the last 3-5 items, and so stuff rolls off quickly. If not, the update generally takes care of it. Jason On Wed, Nov 7, 2012 at 9:37 AM, Michael Schofield mschofi...@nova.edu wrote: Hey Jason, Are you watching for different categories--closings, emergencies, weather - etc.--and, also, how are you determining when to take down the notice (if at all)? Sent from my iPhone On Nov 7, 2012, at 10:26 AM, Jason Griffey grif...@gmail.com wrote: We run a Wordpress multisite setup here at MPOW, and have two different blogs that we use for this type of purpose: an Alerts blog for in-house alert needs, and a News blog for public-facing announcements. We just use the RSS feed to push the alerts where needed, and there's certainly no shortage of RSS collection/parsing libraries. I'm partial to Magpie (http://magpierss.sourceforge.net/) but only because I've had years of using it. We even recently moved to using Growl for Windows with an RSS plugin to do heads up alerts on staff/faculty PCs, so that when something is posted to the Alerts blog, all staff machines get an impossible-to-ignore alert overlay on their screens. 
We will likely be doing a similar thing for Emergency use and the public machines. Jason On Wed, Nov 7, 2012 at 9:12 AM, Michael Schofield mschofi...@nova.edu wrote: Hey everyone, I've been toying with the idea of making something because I can't seem to find a free alternative, but I thought I'd do my due diligence and pick your brains. I'm open to any alternatives to the following, but I'm specifically looking for a free option with an API. Scenario: our main website lives on the university's server, which turns out to be a very dull playground: HTML/CSS/JS only. This means there are about 150 static files that I'm presently rolling into a WP Network living on our own boxes, and our own domain (we've been waiting for the last year for a university-wide CMS, but we just don't want to hold our breath any longer :-)), but the main site, the landing page, will always be static. This means that whenever there's an early closure, a hurricane watch, or some other announcement, someone has to submit a ticket and then I have to make a change. My goal is to cut me, the middleman, out of the process. My potential project: So what I was thinking was jury-rigging a Wordpress theme into an alerts dashboard for managers, directors, and so on. I want to empower the Circulation manager to log in, make an announcement, and be done with it. For all the departmental and other sites that live on the WP Network, I'd write and install a corresponding alerts plugin that watches the JSON API for an alert and, if one exists, displays it. For our static sites, I'd toss in a jQuery plugin that did the same. My question: this seems like something that's been done before! Has it? If not, anyone want to collaborate on github? All the best, Michael Schofield(@nova.edu) | Web Services Librarian | (954) 262-4536 Alvin Sherman Library, Research, and Information Technology Center Hi!
Hit me up any time, but I'd really appreciate it if you report broken links, bugs, your meeting minutes, or request an awesome web app over on the Library Web Services http://staff.library.nova.edu/pm site.
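Consuming such an alerts feed from a script is straightforward with Ruby's stdlib RSS parser. A minimal sketch, with an invented feed, mirroring the "show the last few items so stuff rolls off" display described above:

```ruby
require "rss"

# Invented stand-in for a WordPress alerts feed; in practice you would
# fetch the feed URL with open-uri or Net::HTTP.
xml = <<~XML
  <?xml version="1.0"?>
  <rss version="2.0">
    <channel>
      <title>Library Alerts</title>
      <link>https://alerts.example.edu/</link>
      <description>Service alerts</description>
      <item><title>Database X is down</title></item>
      <item><title>Early closing Friday</title></item>
    </channel>
  </rss>
XML

# Second argument disables validation, handy for sloppy real-world feeds.
feed = RSS::Parser.parse(xml, false)

# Display only the most recent handful of items, as described in the thread.
alerts = feed.items.first(3).map(&:title)
# => ["Database X is down", "Early closing Friday"]
```

The same few lines work whether the feed comes from an Alerts blog, a News blog, or any other WordPress category-specific RSS feed.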
Re: [CODE4LIB] one tool and/or resource that you recommend to newbie coders in a library?
http://journal.code4lib.org On 11/1/2012 4:24 PM, Bohyun Kim wrote: Hi all code4lib-bers, As coders and coding librarians, what is ONE tool and/or resource that you recommend to newbie coders in a library (and why)? I promise I will create and circulate the list and make it into a Code4Lib wiki page for collective wisdom. =) Thanks in advance! Bohyun --- Bohyun Kim, MA, MSLIS Digital Access Librarian bohyun@fiu.edu 305-348-1471 Medical Library, College of Medicine Florida International University http://medlib.fiu.edu http://medlib.fiu.edu/m (Mobile)
[CODE4LIB] Q: Discovery products and authentication (esp Summon)
Looking at the major 'discovery' products, Summon, Primo, EDS ...all three will provide some results to un-authenticated users (the general public), but have some portions of the corpus that are restricted and won't show up in your results unless you have an authenticated user affiliated with the customer's organization. So when we look around on the web for Summon and Primo examples, we can for instance do some sample searches there even without logging in or being affiliated with the particular institution. But we are only seeing a subset of results there, not actually seeing everything, since we didn't auth. But most of these examples I look at don't, in their UI, make this particularly clear. This leads me to wonder whether, in actual use, even at customers who _could_ log in to see complete results, anyone ever does. So I'm very curious to get an answer from any existing customers as to this issue. Do the end-users realize they will get more complete results if they log in? Do you have any numbers (or other info, even if not cold stats) on how many end-users choose to log in to see more complete results? If nobody ever authenticates to see more complete results, then the subset available to un-authenticated users essentially _is_ the product; the extra stuff that nobody ever sees is kinda irrelevant, no? Anyone who is a current customer of Summon/Primo/EDS want to say anything on this topic? Would be helpful.
Re: [CODE4LIB] Q: Discovery products and authentication (esp Summon)
Right, thanks, but you're missing my point/question. A significant portion of all of our libraries' use these days is by patrons that are off-campus and will not be IP-authenticated (Unless you have all patrons use a VPN or something before using library services?) Those off-campus patrons at Dartmouth, do they just always get the limited results available to non-auth end-users, or do you encourage them to log in (and if so, any idea how many do?) On 10/24/2012 1:54 PM, Mark Mounts wrote: We have Summon at Dartmouth College. Authentication is IP based so with a Dartmouth IP address the user will see all our licensed content. There is also the option to see all the content Summon has beyond what we license by selecting the option "Add results beyond your library's collection" Mark -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Wednesday, October 24, 2012 12:16 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Q: Discovery products and authentication (esp Summon) Looking at the major 'discovery' products, Summon, Primo, EDS ...all three will provide some results to un-authenticated users (the general public), but have some portions of the corpus that are restricted and won't show up in your results unless you have an authenticated user affiliated with the customer's organization. So when we look around on the web for Summon and Primo examples, we can for instance do some sample searches there even without logging in or being affiliated with the particular institution. But we are only seeing a subset of results there, not actually seeing everything, since we didn't auth. But most of these examples I look at don't, in their UI, make this particularly clear. This leads me to wonder whether, in actual use, even at customers who _could_ log in to see complete results, anyone ever does. So I'm very curious to get an answer from any existing customers as to this issue.
Do the end-users realize they will get more complete results if they log in? Do you have any numbers (or other info, even if not cold stats) on how many end-users choose to log in to see more complete results? If nobody ever authenticates to see more complete results then the subset available to un-authenticated users essentially _is_ the product, the extra stuff that nobody ever sees is kinda irrelevant, no? Anyone who is a current customer of Summon/Primo/EDS want to say anything on this topic? Would be helpful.
Re: [CODE4LIB] Q: Discovery products and authentication (esp Summon)
On 10/24/2012 2:04 PM, Ben Florin wrote: We use Primo, but we've never bothered with their restricted search scopes. Apparently the answer to my question is that nobody has thought about this before, heh. Primo, by default, will suppress some content from end-users unless they are authenticated, no? Maybe that's what restricted search scopes are? I'm not talking about your locally indexed content, but about the PrimoCentral index of scholarly articles. At least I know the Primo API requires you to tell it if end-users are authenticated or not, and suppresses some results if they are not. I assume Primo 'default' interface must have the same restrictions? Perhaps the answer to my question is that at most discovery customers, off-campus users always get the 'restricted' search results, have no real way to authenticate, and nobody's noticed yet!
Re: [CODE4LIB] Q: Discovery products and authentication (esp Summon)
Good to have some numbers, thanks! Even taking your largest number, 25% + 12% == 37% coming from on-campus is definitely less than half, and not 'most' use being from on-campus -- which does not surprise me at all; it's what I would expect. This is an interesting discussion, I think. Thanks all. (Except for Ross and that other guy having a flamewar about things entirely unrelated to the topic! Just kidding, we love you Ross and that other guy. But yeah, unrelated to the topic.) From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of David Friggens [frigg...@waikato.ac.nz] Sent: Wednesday, October 24, 2012 9:15 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Q: Discovery products and authentication (esp Summon) a) most queries come from on-campus Really? Are people just assuming this, or do they actually have data? That would surprise me for most contemporary American places of higher education. For the last two months, 25.4% of our Summon traffic has come from the IP addresses we've given as on campus, according to the stats Serials Solutions provides. Note that another 11.8% came from the local ISP that provides wireless for our students, so most of that would be on campus at other institutions. But it may very well be the extra restricted content is not important and nobody minds its absence. (Which would make one wonder why the vendor bothers to spend resources putting it in there!) That's been our view (though you're making me think we should perhaps try and understand better what the difference is). The A&I results are interesting. EDS seems to promote results from their own A&I databases more highly than I would expect, and they're certainly noticeable when blanked out with "cannot be displayed to guests". When Summon started showing A&I results there was some interesting discussion on the mailing list - they're not immediately accessible, so they're arguably not in the library's collection.
And Summon (as does Primo) has an option to "add results beyond your library's collection". There was some argument on the other side, that A&I results are important to include, so it seems that there is librarian pressure as well as commercial/licence pressure. David
Re: [CODE4LIB] VPN EZ Proxy
VPN does what EZProxy does already -- makes web access appear to come from an on-campus address -- but for ALL web access, not just access that follows links from your web pages using EZProxy. This assumes outgoing traffic from users on the VPN will be on an IP address recognized as 'licensed' by your vendors, which it typically is. If people are using VPNs and are happy with them, that's actually a MORE reliable solution than EZProxy. Rather than tell them to turn off their VPN, I would ensure you have EZProxy configured not to interfere with it, if possible. If they are on a VPN, and then ALSO go through EZProxy... it should _work_, but it'll hurt performance, as all traffic is effectively being sent through two proxies, for no reason. You should instead configure your EZProxy so that when a client is on an IP address recognized as the VPN, EZProxy simply redirects without proxying instead of double-proxying the traffic. On 10/18/2012 1:46 PM, Joselito Dela Cruz wrote: Hi All, We use EZ Proxy for authentication and we always tell the staff who use VPN to turn their VPN off so they can access our databases. Is this the right way? I looked around for answers and could not find any, so I thought I would throw this in here. Thanks for any feedback. Jay Dela Cruz
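EZProxy's ExcludeIP directive is the usual way to get the pass-through behavior described above: clients arriving from an excluded address range are redirected straight to the vendor's site rather than proxied. A hypothetical ezproxy.cfg fragment (the VPN address pool below is invented for illustration):

```
# Treat clients arriving from the VPN pool as already-licensed:
# EZProxy redirects them directly to the database instead of proxying.
ExcludeIP 10.8.0.0-10.8.255.255
```

Check your own VPN's outbound address range with your network folks before adding it; if the VPN exits through addresses your vendors don't recognize, excluding it would break access rather than streamline it.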
Re: [CODE4LIB] Q.: software for vendor title list processing
I've always been a fan of ONIX for SOH, although I've never had the chance to use it -- the spec is written nicely and, based on my experience with this stuff, it actually accomplishes the goal of a machine-readable statement of serial holdings (theoretically useful for print or online holdings) well. KBART, I have some concerns about, when it comes to holdings. Is there a place to send feedback to KBART? Just on a quick skim of the parts of interest to me, I am filled with alarm at how much this misses the point: "we recommend that the ISO 8601 date syntax should be used... For simplicity, '365D' will always be equivalent to one year, and '30D' will always be equivalent to one month, even in leap years and months that do not have 30 days." It totally misses the point of ISO 8601 to allow/encourage this when '1Y' and '1M' are available -- dealing with calendar dates is harder than one might naively think, and by trying to 'improve' on ISO 8601 like this, you just create a mess of ambiguous and difficult-to-deal-with data. On 10/17/2012 5:11 AM, Owen Stephens wrote: Are there any examples of data in this format in the wild we can look at? Also given KBART and ONIX for Serials Online Holdings have NISO involvement, is there any view on how these two activities complement each other? Thanks, Owen Owen Stephens Owen Stephens Consulting Web: http://www.ostephens.com Email: o...@ostephens.com Telephone: 0121 288 6936 On 17 Oct 2012, at 09:47, Michael Hopwood mich...@editeur.org wrote: Hi Godmar, There is also ONIX for Serials Online Holdings (http://www.editeur.org/120/ONIX-SOH/). I'm copying in Tim Devenport who might say more. Best wishes, Michael -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Owen Stephens Sent: 16 October 2012 23:09 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Q.: software for vendor title list processing I'm working on the JISC KB+ project that Tom mentioned.
As part of the project we've been collating journal title lists from various sources. We've been working with members of the KBART steering group and have used KBART where possible, although we've been collecting data not covered by KBART. All the data we have at this level is published under a CC0 licence at http://www.kbplus.ac.uk/kbplus/publicExport - including a csv that uses the KBART data elements. The focus so far has been on packages negotiated by JISC in the UK - although in many cases the title lists may be the same as are made available in other markets. We also include what we call 'Master lists' which are an attempt to capture the complete list of titles and coverage offered by a content provider. We'd very much welcome any feedback on these exports, and of course be interested to know if anyone makes use of them. So far a lot of the work on collating/converting/standardising the data has been done by hand - which is clearly not ideal. In the next phase of the project the KB+ project is going to work with the GoKB project http://gokb.org - as part of this collaboration we are currently working on ways of streamlining the data processing from publisher files or other sources, to standardised data. While we are still working on how this is going to be implemented, we are currently investigating the possibility of using Google/Open Refine to capture and re-run sets of rules across data sets from specific sources. We should be making progress on this in the next couple of months. Hope that's helpful Owen Owen Stephens Owen Stephens Consulting Web: http://www.ostephens.com Email: o...@ostephens.com Telephone: 0121 288 6936 On 16 Oct 2012, at 20:23, Tom Pasley tom.pas...@gmail.com wrote: You might also be interested in the work at http://www.kbplus.ac.uk . The site is up at the moment, but I can't reach it for some reason...
they have a public export page which you might want to know about http://www.kbplus.ac.uk/kbplus/publicExport Tom On Wed, Oct 17, 2012 at 8:12 AM, Jonathan Rochkind rochk...@jhu.edu wrote: I think KBART is such an effort. As with most library standards groups, there may not be online documentation of their most recent efforts or successes, but: http://www.uksg.org/kbart http://www.uksg.org/kbart/s5/guidelines/data_format On 10/16/2012 2:16 PM, Godmar Back wrote: Hi, at our library, there's an emerging need to process title lists from vendors for various purposes, such as checking that the titles purchased can be discovered via discovery system and/or OPAC. It appears that the formats in which those lists are provided are non-uniform, as is the process of obtaining them. For example, one vendor - let's call them Expedition Scrolls - provides title lists for download to Excel, but which upon closer inspection turn out to be HTML tables. They are encoded using an odd mixture of CP1250 and HTML entities. Other vendors use entirely different formats.
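The leap-year pitfall called out above is easy to demonstrate with Ruby's Date arithmetic, where `+ 365` adds fixed days while `>> 12` adds calendar months:

```ruby
require "date"

# "365D" is not a calendar year: 2012 is a leap year, so a fixed 365-day
# duration lands one day short of the same calendar date a year later.
start = Date.new(2012, 1, 15)

start + 365   # => 2013-01-14 (fixed-day duration, the KBART "365D" reading)
start >> 12   # => 2013-01-15 (true calendar year, the ISO 8601 'P1Y' reading)
```

This is exactly the ambiguity that redefining '365D' as "one year" bakes into the data: consumers can no longer tell whether a stated coverage span means fixed days or calendar dates.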
Re: [CODE4LIB] Q.: software for vendor title list processing
I think KBART is such an effort. As with most library standards groups, there may not be online documentation of their most recent efforts or successes, but: http://www.uksg.org/kbart http://www.uksg.org/kbart/s5/guidelines/data_format On 10/16/2012 2:16 PM, Godmar Back wrote: Hi, at our library, there's an emerging need to process title lists from vendors for various purposes, such as checking that the titles purchased can be discovered via discovery system and/or OPAC. It appears that the formats in which those lists are provided are non-uniform, as is the process of obtaining them. For example, one vendor - let's call them Expedition Scrolls - provides title lists for download to Excel, but which upon closer inspection turn out to be HTML tables. They are encoded using an odd mixture of CP1250 and HTML entities. Other vendors use entirely different formats. My question is whether there are efforts, software, or anything related to streamlining the acquisition and processing of vendor title lists in software systems that aid in the collection development and maintenance process. Any pointers would be appreciated. - Godmar
Re: [CODE4LIB] formatting citation output programmatically
There are a billion different citation formats with their own rules. I don't think there is any simple answer to the question you ask. On 10/11/2012 2:45 PM, William Gunn wrote: Hi list! I have a technical question about formatting citation output which some of you may have dealt with in the past. I see journal names and their abbreviations listed three different ways: ALL CAPS no periods: http://images.webofknowledge.com/WOK46/help/WOS/A_abrvjt.html Proper Case, with periods: http://www.lib.berkeley.edu/BIOS/j_abbr.html Proper Case, no periods: http://home.ncifcrf.gov/research/bja/journams_a.html As far as I'm aware, citations in published papers should always be proper case, but are there any cases where a journal should be cited without periods in the abbreviated form? I'm aware of the edge cases like PLOS, JAMA, BMJ, but what I'm wondering is if anyone knows of any instances where a journal which is normally abbreviated as Anal. Biochem. would instead be formatted as Anal Biochem (without periods) in the references list/bibliography for a paper? If anyone has dealt with this issue in the past, I'd love to hear what you came up with. Thanks! William Gunn +1 646 755 9862 http://synthesis.williamgunn.org/about/ Support free access to scientific journal articles arising from taxpayer-funded research: http://wh.gov/6TH
Re: [CODE4LIB] Seeking examples of outstanding discovery layers
On 9/20/2012 1:39 PM, Karen Coyle wrote: So, given this, and given that in a decent-sized catalog users regularly retrieve hundreds or thousands of items, what is the best way to help them grok that set given that the number of records is too large for the user to look at them one-by-one to make a decision? Can the fact that the data is in a database help users get a feel for what they have retrieved without having to look at every record? I've often felt that, if they can be properly presented, facets are a really great way to do this. Facets (with hit counts next to every value) give you a 'profile' of a result set that is too large to get a sense of otherwise; they give you a sort of descriptive statistical summary of it. When the facets are 'actionable', as they usually are, they also let you drill down to particular aspects of the giant result set you are interested in, and get a _different_ 2.5 screens of results you'll look at. Of course, library studies also often show that our users don't use the facets, heh. But there are a few conflicting studies that show they are used a significant minority of the time. I think it may have to do with UI issues of how the facets are presented. It's also important to remember that it doesn't necessarily represent a failure if users don't engage with the results beyond the first 2.5 screens -- it may mean they got what they wanted/needed in those first 2.5 screens. And likewise, that it's okay for us libraries to develop features which are used only by significant minorities of our users (important to remember that what our logs show is really significant minorities of _uses_: all users using a feature 1% of the time can show up the same as 1% of users using a feature 100% of the time).
We are not lowest common denominator: while we need to make our interfaces _usable_ by everyone (lowest common denominator, perhaps), it's part of our mission to provide functionality in those interfaces for especially sophisticated uses that won't be used by everyone all the time.
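The "descriptive statistical summary" idea amounts to a group-and-count over the result set. A toy Ruby illustration (records and facet names invented), just to make the notion of a facet with hit counts concrete:

```ruby
# A tiny stand-in for a large search result set; real facets would be
# computed by the search engine (e.g. Solr) over thousands of records.
results = [
  { format: "Book",    language: "English" },
  { format: "Book",    language: "German"  },
  { format: "Journal", language: "English" },
]

# A facet is just each distinct value with its hit count.
facet = results.group_by { |r| r[:format] }.transform_values(&:size)
# => {"Book"=>2, "Journal"=>1}
```

Making the facet 'actionable' then just means turning each key into a filter link that narrows the query to that value.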
Re: [CODE4LIB] Displaying TGN terms
From the examples you've given, how about: 1. Start with the first (most detailed) element in the hierarchy. 2. Moving up the hierarchy, add on the first inhabited place found, if any. 3. Continuing to move up the hierarchy, add on the first nation found, if any. On 9/17/2012 3:12 PM, ddwigg...@historicnewengland.org wrote: We use the Getty Thesaurus of Geographic Names for coding place names in our museum and archival cataloguing systems. We're currently struggling with the best way to display and make these terms searchable in our online database. Currently we're just displaying the term itself, which is flawed, because just seeing Springfield or Florence doesn't give the user enough information to figure out where something was really made. But we're finding that the number of variant place types in TGN makes it hard to figure out a concise way of indicating a more detailed place name that will work consistently across all entries in the thesaurus. For example, the full hierarchy for Florence (the one in Italy) is Florence (inhabited place), Firenze (province), Tuscany (region), Italy (nation), Europe (continent), World (facet) Neighborhoods and other local subdivisions can be even more of a mouthful: Notting Hill (neighborhood), Kensington and Chelsea (borough), London (inhabited place), Greater London (metropolitan area), England (country), United Kingdom (nation), Europe (continent), World (facet) Ideally I'd probably like to show the above as Florence, Italy and Notting Hill, London, England But I'm having trouble coming up with an algorithm that can consistently spit these out in the form we'd want to display given the data available in TGN. Would welcome any ideas or feedback on this. Thanks, David __ David Dwiggins Systems Librarian/Archivist, Historic New England 141 Cambridge Street, Boston, MA 02114 (617) 994-5948 ddwigg...@historicnewengland.org http://www.historicnewengland.org
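The three-step rule above can be sketched in Ruby (the data structure here is invented for illustration). Note that, as written, the rule yields "United Kingdom" rather than "England" for the Notting Hill example, since TGN types England as a country, not a nation, so it would need a tweak to match the ideal display exactly:

```ruby
# Each hierarchy is a list of [name, place_type] pairs, most specific first,
# as in the TGN examples quoted in the thread.
def display_name(hierarchy)
  parts = [hierarchy.first[0]]                        # 1. most detailed element
  rest  = hierarchy.drop(1)
  inhabited = rest.find { |_, type| type == "inhabited place" }
  parts << inhabited[0] if inhabited                  # 2. first inhabited place above it
  nation = rest.find { |_, type| type == "nation" }
  parts << nation[0] if nation                        # 3. first nation above it
  parts.join(", ")
end

florence = [["Florence", "inhabited place"], ["Firenze", "province"],
            ["Tuscany", "region"], ["Italy", "nation"],
            ["Europe", "continent"], ["World", "facet"]]
notting  = [["Notting Hill", "neighborhood"], ["Kensington and Chelsea", "borough"],
            ["London", "inhabited place"], ["Greater London", "metropolitan area"],
            ["England", "country"], ["United Kingdom", "nation"],
            ["Europe", "continent"], ["World", "facet"]]

display_name(florence) # => "Florence, Italy"
display_name(notting)  # => "Notting Hill, London, United Kingdom"
```

Treating 'country' as an acceptable substitute for 'nation' in step 3 (preferring whichever appears first moving up) would produce the "Notting Hill, London, England" form David asked for.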
Re: [CODE4LIB] Random Casual Poll: What abt. Web Services Should You Know?
Okay, here's my own reverse survey for you. :) What is 'web services'? What job description, role, or responsibilities does 'a librarian planning to work in web services' mean to you? Because I'm not sure myself, nor am I sure everyone else who uses that term agrees. My answer to your survey depends on what we're talking about. And I have NO idea what w/e means or stands for. On 9/10/2012 4:12 PM, Michael Schofield wrote: Hi everyone, Every so often in the library blogosphere I see posts dedicated to whether librarians should know how to code. The answer I usually give is awful - something like, Um. Probably. Anyway, since you all work with the web and/or library systems, I'm curious about your wizened answers. Here's the scenario: if a LIS student intending to work in web services (or w/e) asked your advice, what code / platforms / other skills would you recommend for success? I'll compile and share the results in a couple of weeks. All the best, Michael Schofield(@nova.edu) | Web Services Librarian Alvin Sherman Library, Research, and Information Technology Center Hi! Hit me up any time, but I'd really appreciate it if you report broken links, bugs, your meeting minutes, or request an awesome web app over on the Library Web Services http://staff.library.nova.edu/pm site.
Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
not report on Encore in the final analysis. The study (and chapter) does offer findings on the other three discovery tools. There were six student groups in the course; each group studied two tools with the same user population (undergrad, graduate and faculty) so that each tool was compared against the other three with each user population overall. The .pdf that you found was the final report of one of those six groups, so it only addresses two of the four tools. The chapter is the only document that pulls the six portions of the study together. I would be happy to discuss this with any of you individually if you need more information. Thanks for your interest in the study. Lucy Holman, DCD Director, Langsdale Library University of Baltimore 1420 Maryland Avenue Baltimore, MD 21201 410-837-4333 - end insert Jonathan LeBreton Sr. Associate University Librarian Temple University Libraries Paley M138, 1210 Polett Walk, Philadelphia PA 19122 voice: 215-204-8231 fax: 215-204-5201 mobile: 215-284-5070 email: lebre...@temple.edu email: jonat...@temple.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of karim boughida Sent: Tuesday, September 04, 2012 5:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA? Hi Tom, Top players are EDS, Primo and Summon... the only reason I see Encore in the mix is if you have other III products, which is not the case at the UBalt library. They now have WorldCat? Encore vs Summon is an easy win for Summon. Let's wait for Jonathan LeBreton (thanks BTW). Karim Boughida On Tue, Sep 4, 2012 at 4:26 PM, Tom Pasley tom.pas...@gmail.com wrote: Yes, I'm curious to know too! Due to database/resource matching or coverage perhaps (anyone's guess). Tom On Wed, Sep 5, 2012 at 7:50 AM, karim boughida kbough...@gmail.com wrote: Hi All, Initially EDS, Primo, Summon, and Encore were considered but only Encore and Summon were tested. Do we know why?
Thanks Karim Boughida On Tue, Sep 4, 2012 at 10:44 AM, Jonathan Rochkind rochk...@jhu.edu wrote: Hi helpful code4lib community, at one point there was a report online at: http://student-iat.ubalt.edu/students/kerber_n/idia642/Final_Usability_Report.pdf David Walker tells me the report at that location included findings about SFX and/or other link resolvers. I'm really interested in reading it. But it's gone from that location, and I'm not sure if it's somewhere else (I don't have a title/author to search for other than that URL, which is not in google cache or internet archive). Is anyone reading this familiar with the report? Perhaps one of the authors is reading this, or someone reading it knows one of the authors and can put me in touch? Or knows someone likely in the relevant dept at ubalt and can put me in touch? Or has any other information about this report or ways to get it? Thanks! Jonathan -- Karim B Boughida kbough...@gmail.com kbough...@library.gwu.edu
Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
On 9/5/2012 9:04 AM, Emily Lynema wrote: Yes, there were (we used 360 Link during the testing). This is one of the reasons we turned on 1-Click about 6 months ago and have been fairly pleased with the results. What does "turn on 1-Click" mean with regard to Summon? This has turned into a somewhat interesting conversation. We all need to talk about this stuff more!
Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
Ah, thanks. If you are thinking of using Summon with a different link resolver, you'd have to see if they provide a similar pass-through type service. I *think* that SFX does. SFX indeed does, but I think on the same basis as 360Link -- turn it on or off globally. Umlaut, the open source link resolver front-end that I develop, provides this feature with even more control -- a URL query param can be provided to turn it on for certain requests while it stays off globally. So for instance, if your Summon interface could send that special URL param with its OpenURLs, you could have it on for links from Summon but off for default links. Or if you were creating your own Summon interface with the Summon API, you could even have one link (say, off of the title) that did 1-Click, but another link (say, below the record) that gave you the full menu. Alternately, if you have no control of the linking like this, a feature could easily be added to Umlaut to turn 1-Click behavior on or off based on source ID (rfr_id in the OpenURL, or even HTTP referrer). I've been investigating Summon a bit with some trial access, and I have to say, just from a very basic surface investigation of this particular feature so far -- I am indeed quite impressed with Summon's index-enhanced linking. Some things will just NEVER work well with OpenURL (say, digital video or audio links), and Summon's index-enhanced linking also sometimes gets you to open access online copies that existing OpenURL link resolver products do a very poor job of discovering.
[CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
Hi helpful code4lib community, at one point there was a report online at: http://student-iat.ubalt.edu/students/kerber_n/idia642/Final_Usability_Report.pdf David Walker tells me the report at that location included findings about SFX and/or other link resolvers. I'm really interested in reading it. But it's gone from that location, and I'm not sure if it's somewhere else (I don't have a title/author to search for other than that URL, which is not in google cache or internet archive). Is anyone reading this familiar with the report? Perhaps one of the authors is reading this, or someone reading this knows one of the authors and can put me in touch? Or knows someone likely in the relevant dept at ubalt who can put me in touch? Or has any other information about this report or ways to get it? Thanks! Jonathan
Re: [CODE4LIB] U of Baltimore, Final Usability Report, link resolvers -- MIA?
Ha, I'm terrible at google searching apparently, thanks Matt and Joe, I believe that is what I was looking for. code4lib++ On 9/4/2012 10:48 AM, Matthew LeVan wrote: It's like a google search challenge! Looks like they changed their student home link patterns... http://home.ubalt.edu/nicole.kerber/idia642/Final_Usability_Report.pdf Thanks, matt On Tue, Sep 4, 2012 at 10:44 AM, Jonathan Rochkind rochk...@jhu.edu wrote: Hi helpful code4lib community, at one point there was a report online at: http://student-iat.ubalt.edu/students/kerber_n/idia642/Final_Usability_Report.pdf David Walker tells me the report at that location included findings about SFX and/or other link resolvers. I'm really interested in reading it. But it's gone from that location, and I'm not sure if it's somewhere else (I don't have a title/author to search for other than that URL, which is not in google cache or internet archive). Is anyone reading this familiar with the report? Perhaps one of the authors is reading this, or someone reading this knows one of the authors and can put me in touch? Or knows someone likely in the relevant dept at ubalt who can put me in touch? Or has any other information about this report or ways to get it? Thanks! Jonathan
Re: [CODE4LIB] haititrust
There is a HathiTrust search API that you can use, in addition to RSS/OpenSearch. I can look up the details when I'm back at work next week if you can't find 'em googling. In fact, I think there are two separate HT APIs, one that searches HT fulltext and one that just searches metadata. I use the metadata searching one in production, and indeed use it to look up HT records by ISBN, LCCN, and OCLCnum. I am not sure if you can limit to just items your library owns using this API though. At a minimum (this may be obvious) your library would probably need to be a HT member, and have shared holdings information with HT -- otherwise HT has no idea which items your library owns. (My library is a HT member but has not yet shared holdings information with HT, because, well, we aren't able to identify our holdings reliably with OCLC numbers, which is how HT (reasonably) wants it.) The support/question link at the top right of all HT pages, contrary to usual expectations (heh), actually does usually get directed to the right person and get a response, even for technical questions. I'd give a shot to asking them directly. Jonathan From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ford, Kevin [k...@loc.gov] Sent: Friday, August 03, 2012 12:20 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] haititrust Ideally, you shouldn't need the hathifiles. The HathiTrust search page links to an OpenSearch document [1], which promisingly identifies an RSS feed and a JSON serialization of the search results. Neither appears to work. In theory, doing as Jon says and then appending view=rss would get you an RSS feed. There is a contact email in the OpenSearch document you might try. FWIW, if you look at the search page HTML, there is a fixme note in an HTML comment, the same comment, incidentally, that also comments out the RSS feed link in the HTML. 
Yours, Kevin [1] http://catalog.hathitrust.org/Search/OpenSearch?method=describe -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jon Stroop Sent: Friday, August 03, 2012 11:15 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] haititrust You can do an empty query in their catalog, and use the Original Location facet to filter to a holding library. Programmatically, I'm not sure, but you'd probably need to use the Hathi files: http://www.hathitrust.org/hathifiles. -Jon On 08/03/2012 11:07 AM, Eric Lease Morgan wrote: If I needed/wanted to know what materials held by my library were also in the HathiTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HathiTrust and limit the results to items my library owns? --Eric Lease Morgan
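For reference, the "metadata searching" API mentioned above is HathiTrust's Bib API, which answers lookups by standard identifier and returns JSON. This helper just builds the documented URL shape; as noted in the thread, limiting results to your own library's holdings is a separate question it doesn't answer:

```ruby
require "uri"

# HathiTrust Bib API URL shape (per HT's public docs):
#   https://catalog.hathitrust.org/api/volumes/{brief|full}/{idtype}/{value}.json
ID_TYPES = %w[oclc lccn isbn issn htid recordnumber].freeze

def hathi_bib_url(id_type, value, detail: "brief")
  raise ArgumentError, "unknown id type: #{id_type}" unless ID_TYPES.include?(id_type)
  "https://catalog.hathitrust.org/api/volumes/#{detail}/" \
    "#{id_type}/#{URI.encode_www_form_component(value)}.json"
end

puts hathi_bib_url("oclc", "424023")
```

Fetch the resulting URL with any HTTP client; the response lists HT records and items for that identifier.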
Re: [CODE4LIB] haititrust
Not an answer to your question, but if you want to share, I'm curious what your use case is where you want to limit to items your library owns. If HathiTrust has 'em in fulltext -- why would it matter to your patrons if your library has a print copy or not? And if HT does not have them in fulltext, still, why would it matter to your patrons if your library has a print copy or not? From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Eric Lease Morgan [emor...@nd.edu] Sent: Friday, August 03, 2012 11:07 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] haititrust If I needed/wanted to know what materials held by my library were also in the HathiTrust, then programmatically how could I figure this out? In other words, do you know of a way to query the HathiTrust and limit the results to items my library owns? --Eric Lease Morgan
Re: [CODE4LIB] code4lib.org down?
Yeah, the whole server seems to be down, including planet.code4lib.org hosted there, etc. Anyone know whose attention we should bring this to? On 6/25/2012 8:30 AM, Ed Summers wrote: Paging Oregon State: do we know why code4lib.org isn't responding? http://code4lib.org/ HTTP requests currently seem to timeout. //Ed PS. Thanks to Carol Bean for noticing it, and bringing it up in #code4lib :-)
Re: [CODE4LIB] Academic libraries - Will dev for pay models?
It seems odd to me for the library to charge individual departments for special projects. Although I realize it can make sense and be reasonable in some cases, I think there are some dangers. I mean, the library is already funded to provide services to the rest of the university, right? EVERYTHING we do serves other schools and departments, that's what we do; almost all our customers are internal. Different universities have different ways of accounting for this -- the individual schools or departments may already have budget line items moving cash from their budget to the library, or the university may just take care of it. But either way, it's usually flat-rate funding for the library's budget. The Business School doesn't get better service than the Philosophy Dept because they've got a bigger budget; nor are schools/departments usually 'charged back' because their undergrads use the reference librarians more than other depts/schools do. Likewise, some features we develop serve some departments/schools more than others. If we realized there was a need to search/facet by MeSH (NLM Medical Subject) headings, and we weren't doing that yet, but we had the capability to do it -- would we only add that feature if the Medical School paid us? I realize that all of our universities are increasingly trying to subject their components to market discipline, making everything a fee-based transaction. I think our professional ethics should be to resist this -- it's true we can't do everything we might want and we need to prioritize -- but I think our professional ethics in a university library should be against giving better service to those parts of the university that can pay more. But, really, I just put this out as something to think about. I realize that in some cases it can make sense, and be reasonable and ethical. But I think care is warranted. 
Another thing to beware of with software development in particular -- is that software going to be running on your servers? Are you expected to maintain it as well? We who develop software realize that software is hardly ever one-and-done; software (like the library, per Ranganathan's last law) is a growing organism, and it takes constant care and feeding. Even if no features are ever added (and certainly people WILL ask for changes), it takes constant operational care just to keep the thing running, including patching dependencies for security vulnerabilities, as well as simple operational/hardware expenses, etc. If you charge once per project, but are responsible for maintaining the software indefinitely, that doesn't work even from a strictly budgetary perspective. With digital collections, for instance, if possible I think it'd make a lot more sense to support, as part of the library's mission and general budget, say, a general Omeka installation that anyone can use to create their own 'exhibition', and/or a general repository that anyone can use to store their digital artifacts, rather than charge individual projects per-project to develop (and then charge more per-year to maintain/support?). Even just on basic financial sustainability grounds. On 6/6/2012 4:24 PM, Eric Larson wrote: Hi Rosy, Thanks for your reply. I would greatly appreciate seeing your spreadsheets. We do an honorable amount of project estimation and time-tracking here, too. We always draft a Memorandum of Understanding -- an agreement for what work the library will provide on the project and a timetable for completing said work -- with our digital collection project clients. We try hard to stay focused on the deliverables in that document, but there's always some feature creep in development work. We do not have plans to charge back for development services, but wondered if other schools worked in such a way. 
The recent success of our new library catalog launch and future digital collection platform (Hi Blacklight folk) has momentarily increased interest in our born-digital digital collection efforts. There's also a campus-wide effort here at UW-Madison to raise awareness for Educational Innovation opportunities that might generate new revenue streams for the university. We're not used to charging for our services in the library, but some hypothetical partnerships could present the opportunity. I'm sure other public institutions are doing similar what-if revenue exercises: http://edinnovation.wisc.edu/ Thanks again and I'll ping you off list to chat more. Cheers, - Eric On Jun 6, 2012, at 11:28 AM, Rosalyn Metz wrote: Hey Eric, At GW we've been doing some cost estimates for projects. Essentially we pull together the team, figure out the different tasks that need to be accomplished, determine who will be working on those tasks, estimate hours necessary to do the work, and then use salaries to calculate the cost. Right now we're primarily doing this for digitization projects, but I've had experience doing this at other jobs (not in
Re: [CODE4LIB] MARC Magic for file
I have become recently unpleasantly aquainted with the world of Marc that is not Marc21, but is ISO 2709. What'll it do on ISO 2709? I might be able to dig up an example. I wonder if it'll claim it's Marc21 (not), or if it's Marc21 Non-confirming (no, it's not quite that either. It's ISO-2709 MARC that's not Marc21). If it just doens't know anything about it and says 'data', that's just fine, if it knows about Marc21 but not non-Marc21 ISO 2709. On 5/23/2012 3:48 PM, Ford, Kevin wrote: Does it work for bulk files? -- It passed on a file containing 215 MARC Bibs and on a file containing 2,574 MARC Auth records. Don't know if you consider these bulk, but there is more than 1 record in each file (caveat: file stops after evaluating the first line, so of the 2,574 Auth records, the last 2,573 could be invalid). It failed on a file containing all of LC Classification. I need to figure out why. Kevin, do you have examples of the output? -- I received MARC21 Bibliography and MARC21 Authority respectively. In theory, if Leader 20-23 are not 4500 then (non-conforming) should be appended to the identification. If requested, the mimetype - application/marc - should also be outputted. Rgds, Kevin -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ross Singer Sent: Wednesday, May 23, 2012 3:29 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] MARC Magic for file Wow, this is pretty cool. Kevin, do you have examples of the output? Does it work for bulk files? I mean, I could just try this on my Ubuntu machine, but it's all the way downstairs... -Ross. On May 23, 2012, at 3:14 PM, Ford, Kevin wrote: I finally had occasion today (read: remembered) to see if the *nix file command would recognize a MARC record file. I haven't tested extensively, but it did identify the file as MARC21 Bibliographic record. It also correctly identified a MARC21 Authority Record. I'm running the most recent version of Ubuntu (12.04 - precise pangolin). 
I write because the inclusion of a file MARC21 specification rule in the magic.db stems from a Code4lib exchange that started in March 2011 [1] (it ends in April if you want to go crawling for the entire thread). Rgds, Kevin [1] https://listserv.nd.edu/cgi-bin/wa?A2=ind1103L=CODE4LIBT=0F=S=P=112728 -- Kevin Ford Network Development and MARC Standards Office Library of Congress Washington, DC
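For anyone curious what the rule keys on, here's a rough pure-Ruby rendering of the check as I understand it from this thread -- my reading, not the actual magic.db source:

```ruby
# Sketch of a MARC21 leader check: positions 10-11 should be "22"
# (indicator and subfield code counts), positions 20-23 the entry map
# "4500", and leader/06 = 'z' marks an authority record.
def identify_marc21(data)
  leader = data[0, 24]
  return "data" unless leader && leader.length == 24 && leader[10, 2] == "22"
  kind = leader[6] == "z" ? "MARC21 Authority" : "MARC21 Bibliographic"
  leader[20, 4] == "4500" ? kind : "#{kind} (non-conforming)"
end

puts identify_marc21("00714cam a2200205 a 4500")  # a bibliographic leader
```

Like the magic rule, this only looks at the first record's leader, so a bulk file with a bad 2,573rd record would still pass.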
[CODE4LIB] ruby-marc 0.5.0 released
v0.5.0 - Extensive rewrite of MARC::Reader (ISO 2709 binary reader) to provide fairly complete and consistent handling of char encoding issues in ruby 1.9. - This code is well covered by automated tests, but ends up complex; there may be bugs, please report them. - May not work properly under jruby with non-unicode source encodings. - Still can't handle Marc8 encoding. - May not have entirely backwards compatible behavior with regard to char encodings under ruby 1.9.x as previous 0.4.x versions did. Test your code. In particular, previous versions may have automatically _transcoded_ non-unicode encodings to UTF-8 for you. This version will not do so unless you ask it to with the correct arguments. `gem install ruby-marc -v 0.5.0` https://github.com/ruby-marc/ruby-marc
Re: [CODE4LIB] crowdsourced book scanning
ILL at most institutions does not keep scanned copies for future patrons, not even in a database that's not publicly searchable. To do so would be of highly questionable legality with regard to copyright. As would be this plan, alas. You can easily violate copyright just sharing within the (e.g.) university community, or even just among librarians; it does not need to be 'publicly searchable' to violate copyright. On 4/25/2012 2:20 PM, Ross Singer wrote: I am not sure this would be as much of a problem as long as it's not a publicly searchable database (that is, people can't browse which scans are there and choose them). Of course, this restriction makes it difficult to envision how the UI would work, but something triggered by an exact match should work. Then again, I am not a lawyer. -Ross. On Apr 25, 2012, at 2:05 PM, Andrew Shuping wrote: What type of pages from books are you talking about? Like reference materials, histories, biographies, fiction? Because while my first thought is that would be an interesting idea, my immediate second thought is that publishers and authors would never allow it to happen because of copyright. Even in ILL land we can't keep scanned pages for a long period of time due to copyright restrictions. Also this sounds a lot like the Google Books project... Andrew Shuping Interlibrary Loan/Emerging Technologies Services Librarian Jack Tarver Library Robert Frost - In three words I can sum up everything I've learned about life: it goes on. On Wed, Apr 25, 2012 at 1:36 PM, Michael Lindsey mlind...@law.berkeley.edu wrote: A colleague posed an interesting idea: patrons scan book pages to deliver to themselves by email, flash drive, etc. What if the scans didn't disappear from memory, but went into a repository so the next patron looking for that passage didn't have to jockey the flatbed scanner? * Patron scans library barcode at the scanner * The system says, I have these pages available in cache. 
o Patron's project overlaps with the cache and saves time in the scanning, or o Patron needs different pages, scans them and contributes to the cache Now imagine a consortium of some sort where when the patron scans the barcode, the system takes a hop via the ISBN number in the record to reach out to a cache developed between a number of libraries. I know there are a number of cases where this may not apply, like loose-leaf publications in binders that get updated, etc. And I'm sure there are discussions around how to handle copyright, fair use, etc. Do we as a community already have a similar endeavor in place? Michael Lindsey UC Berkeley Law Library
Re: [CODE4LIB] more on MARC char encoding
Ah, thanks Terry. That canned cleaner in MarcEdit sounds potentially useful -- I'm in a continuing battle to keep the character encoding in our local marc corpus clean. (The real blame here is on cataloger interfaces that let catalogers save data containing bytes that are illegal for the character set it's being saved as. And/or that display the data back to the cataloger using a translation that lets it show up as expected even though it is _wrong_ for the character set being saved as. Connexion is theoretically the Rolls-Royce of cataloger interfaces; does it do this? Gosh, I hope not.) On 4/19/2012 2:20 PM, Reese, Terry wrote: Actually -- the issue isn't one of MARC8 versus UTF8 (since this data is being harvested from DSpace and is UTF8 encoded). It's actually an issue with user-entered data -- specifically, smart quotes and the like. These values obviously are not in the MARC8 character set and cause problems for many who transform user-entered data (smart quotes tend to be used by default on Windows) from XML to MARC. If you are sticking with a strictly UTF8-based system, there generally are not issues because these are valid characters. If you move them into a system where the data needs to be represented in MARC -- then you have more problems. We do a lot of harvesting, and because of that, we run into these types of issues moving data that is in UTF8, but has characters not represented in MARC8, into Connexion and having some of that data flattened. Given the wide range of data not in the MARC8 set that can show up in UTF8, it's not a surprise that this would happen. My guess is that you could add a template to your XSLT translation that attempted to filter the most common forms of these smart quotes/values and replace them with the more standard values. Likewise, if there was a great enough need, I could provide a canned cleaner in MarcEdit that could fix many of the most common varieties of these smart quotes/values. 
--TR -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Thursday, April 19, 2012 11:13 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] more on MARC char encoding If your records are really in MARC8 not UTF8, your best bet is to use a tool to convert them to UTF8 before hitting your XSLT. The open source 'yaz' command line tools can do it for Marc21. The Marc4J package can do it in java, and probably work for any MARC variant not just Marc21. Char encoding issues are tricky. You might want to first figure out if your records are really in Marc8, thus the problems, or if instead they illegally contain bad data or data in some other encoding (Latin1). Char encoding is a tricky topic, you might want to do some reading on it in general. The Unicode docs are pretty decent. On 4/19/2012 11:06 AM, Deng, Sai wrote: Hi list, I am a Metadata librarian but not a programmer, sorry if my question seems naïve. We use XSLT stylesheet to transform some harvested DC records from DSpace to MARC in MarcEdit, and then export them to OCLC. Some characters do not display correctly and need manual editing, for example: In MarcEditor Transferred to OCLC Edit in OCLC Bayes’ theorem Bayes⁰́₉ theorem Bayes' theorem ―it won‘t happen here‖ attitude ⁰́₅it won⁰́₈t happen here⁰́₆ attitude it won't happen here attitude “Generation Y” ⁰́₋Generation Y⁰́₊ Generation Y listeners‟ evaluationslisteners⁰́ evaluations listeners' evaluations high school – from high school ⁰́₃ from high school – from Co₀․₅Zn₀․₅Fe₂O₄ Co²́⁰⁰́Þ²́⁵Zn²́⁰⁰́Þ²́⁵Fe²́²O²́⁴ Co0.5Zn0.5Fe2O4? μ Îơ μ Nafion®Nafion℗ʼ Nafion® Lévy L©♭vy Lévy 43±13.20 years 43℗ł13.20 years 43±13.20 years 12.6 ± 7.05 ft∙lbs 12.6 ℗ł 7.05 ft⁸́₉lbs 12.6 ± 7.05 ft•lbs ‘Pouring on the Pounds' ⁰́₈Pouring on the Pounds
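Terry's "canned cleaner" idea above can be sketched in a few lines. The mapping here is illustrative, not MarcEdit's actual table:

```ruby
# Flatten common Windows smart punctuation to plain ASCII equivalents
# that survive a trip through MARC-8.
SMART_MAP = {
  "\u2018" => "'", "\u2019" => "'",  # curly single quotes
  "\u201C" => '"', "\u201D" => '"',  # curly double quotes
  "\u2013" => "-",                   # en dash
  "\u2026" => "..."                  # ellipsis
}.freeze

def flatten_smart_punctuation(str)
  str.gsub(Regexp.union(SMART_MAP.keys), SMART_MAP)
end

puts flatten_smart_punctuation("Bayes\u2019 theorem, \u201CGeneration Y\u201D")
```

The same substitutions could equally live in an XSLT template, as Terry suggests; a gsub with a hash is just the compact Ruby form.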
Re: [CODE4LIB] more on MARC char encoding
On 4/19/2012 3:23 PM, LeVan,Ralph wrote: We see Unicode data pasted into MARC8 records all the time. It happens enough that my MARC8-Unicode converter takes a second look at illegal MARC8 bytes and tries a UTF-8 encoding as well. Right. I see it too. I'm arguing that means cataloger entry tools, the tools which catalogers are using when they paste that stuff in, are not giving the cataloger sufficient feedback on their entry. Flag completely illegal byte sequences in the output encoding and don't let them be saved; make sure cataloger input is displayed back _as appropriate for the current encoding_, so catalogers get immediate visual feedback if they're entering bytes that don't mean what they think in the operative output encoding. I think it's possible that _no_ cataloger interfaces actually do this (although if any do, I bet it's MarcEdit). If Connexion doesn't, for interactive cataloger entry, it'd be awfully nice if it did.
[CODE4LIB] ruby-marc, better ruby 1.9 char encoding support, testers wanted
I have implemented fairly complete and robust support for character encodings in ruby-marc when reading 'binary' marc under ruby 1.9. It's currently in a git branch, not yet released, and not yet in git master. https://github.com/ruby-marc/ruby-marc/tree/char_encodings If anyone who uses this (or doesn't) has a chance to beta test it, it would be appreciated. One way to test: check out with git, switch to the 'char_encodings' branch, and `rake install` to install it as a gem on your system. These changes should _only_ affect use under ruby 1.9, and only affect reading 'binary' (ISO 2709) marc. The new functionality is pretty extensively covered by automated tests, but there are some weird and complex interactions that can occur depending on exactly what you're doing, so bugs are possible. It was somewhat more complicated than one might expect to implement a complete solution here, in part because we _do_ have international users of ruby-marc, with encodings that are neither MARC8 nor UTF8, and in fact non-MARC21. If any of the other committers (or anyone else) wants to code review, you are welcome to. POSSIBLE BACKWARDS INCOMPAT Some previous 0.4.x versions, when running under ruby 1.9 only, would automatically _transcode_ non-unicode encodings to UTF-8 for you under the hood. The new version no longer does so automatically (although you can ask it to). It was not tenable to support that backwards compatibly. Everything else _ought_ to be backwards compatible with previous 0.4.x ruby-marc under ruby 1.9, fixing many problems. NEW FEATURES All applying to ruby 1.9 only, and to reading binary MARC only. * Do a pretty good job of setting encodings properly for your ruby environment, especially under standard UTF-8 usage. * You _can_ and _do have to_ provide an argument for reading non-UTF8 encodings (but sadly there is no support for marc8). * You can ask MARC::Reader to transcode to a different encoding when loading marc. 
* You can ask MARC::Reader to replace bytes that are illegal in the believed source encoding with a replacement character (or the empty string) to avoid ruby invalid UTF-8 byte exceptions later, and sanitize your input. New features documented in inline comments, see at: http://rubydoc.info/github/ruby-marc/ruby-marc/MARC/Reader I had trouble making the docs concise, sorry, I think I've been pounding my head against this stuff so much realizing how complicated it ends up being that I wasn't sure what to leave out.
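For the curious, the two behaviors listed above rest on plain Ruby string mechanics. This is not ruby-marc's API, just the stdlib machinery underneath -- and note String#scrub arrived in ruby 2.1, after this post; on 1.9 the same effect takes encode-based workarounds:

```ruby
# Transcoding a known non-UTF-8 source encoding:
latin1 = "caf\xE9".dup.force_encoding("ISO-8859-1")
puts latin1.encode("UTF-8")        # explicit transcode: "café"

# Replacing bytes that are illegal in the believed source encoding:
bad = "caf\xE9".dup.force_encoding("UTF-8")  # \xE9 is not valid UTF-8
puts bad.valid_encoding?           # false
puts bad.scrub("")                 # illegal byte dropped: "caf"
```

Without the scrub/replace step, the invalid byte lurks in your data until some later operation raises an invalid-byte-sequence exception, which is exactly the failure mode the reader option is meant to head off.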
Re: [CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21
On 4/18/2012 6:04 AM, Tod Olson wrote: It has to mean UTF-8. ISO 2709 is very byte-oriented, from the directory structure to the byte-offsets in the fixed fields. The values in these places all assume 8-bit character data; it's completely baked in to the file format. I'm not sure that follows. One could certainly have UTF-16 in a Marc record, and still count bytes to get a directory structure and byte offsets. (In some ways it'd be easier, since every char would be two bytes.) In fact, I worry that the standard may pre-date UTF-8, with its reference to UCS -- if I understand things right, at one point there was only one unicode encoding, called UCS, which is basically a backwards-compatible subset of what became UTF-16. So I worry the standard really means UCS/UTF-16. But if in fact records in the wild with the 'a' value are far more likely to be UTF-8... well, it's certainly not the first time the MARC21 standard was useless/ignored as a standard in answering such questions.
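To make the byte-counting point concrete: directory offsets count bytes either way, and only the bytes-per-character differ by encoding:

```ruby
# ISO 2709 offsets are bytes, not characters, so either encoding can be
# directory-addressed -- the counts just differ.
title = "L\u00E9vy"                      # 4 characters
puts title.encode("UTF-8").bytesize      # 5 -- the é takes two bytes
puts title.encode("UTF-16BE").bytesize   # 8 -- every char takes two bytes
```
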
[CODE4LIB] MarcXML and char encodings
I know how char encodings work in MARC ISO binary -- the encoding can legally be either Marc8 or UTF8 (nothing else). The encoding of a record is specified in its header. In the wild, specified encodings are frequently wrong, or data includes weird mixed encodings. Okay! But what's going on with MarcXML? What are the legal encodings for MarcXML? Only Marc8 and UTF8, or anything that can be expressed in XML? The MARC header is (or can be) present in MarcXML -- trust the MARC header, or trust the XML doctype char encoding? What's the legal thing to do? What's actually found 'in the wild' with MarcXML? Can anyone advise? Jonathan
Re: [CODE4LIB] MarcXML and char encodings
So what if the ?xml? declaration says one charset encoding, but the MARC header included in the MarcXML says a different encoding... which one is the 'legal' one to believe? Is it legal to have MarcXML that is not UTF-8 _or_ Marc8, that is, an entirely different charset that is legal in XML? If you did that, what should the MARC header included in the XML say? I know how char encodings work in XML. I don't understand what the standards say about how that interacts with the MARC data in MarcXML. Jonathan On 4/17/2012 1:51 PM, LeVan,Ralph wrote: There are probably a couple of answers to that. XML rules define what characterset is used. The encoding attribute on the ?xml? header is where you find out what characterset is being used. I've always gone under the assumption that if an encoding wasn't specified, then UTF-8 is in effect, and that has always worked for me. It turns out the standard says US-ASCII is the default encoding. But, ignoring the encoding, the original MarcXML rules were the same as the MARC-21 rules for character repertoire, and you were supposed to restrict yourself to characters that could be mapped back into MARC-8. I don't know if that rule is still in force, but everyone ignores it. I hope that helps! Ralph -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jonathan Rochkind Sent: Tuesday, April 17, 2012 12:35 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: MarcXML and char encodings I know how char encodings work in MARC ISO binary -- the encoding can legally be either Marc8 or UTF8 (nothing else). The encoding of a record is specified in its header. In the wild, specified encodings are frequently wrong, or data includes weird mixed encodings. Okay! But what's going on with MarcXML? What are the legal encodings for MarcXML? Only Marc8 and UTF8, or anything that can be expressed in XML? The MARC header is (or can be) present in MarcXML -- trust the MARC header, or trust the XML doctype char encoding? 
What's the legal thing to do? What's actually found 'in the wild' with MarcXML? Can anyone advise? Jonathan
Re: [CODE4LIB] MarcXML and char encodings
On 4/17/2012 1:57 PM, Kyle Banerjee wrote: In some cases, invalid XML. In an ideal world, the encoding should be included in the declaration. But I wouldn't trust it. kyle So would you use the Marc header payload instead? Or are you just saying you wouldn't trust _any_ encoding declarations you find anywhere? When writing a library to handle marc, I think the baseline should be making it do the official, legal, standards-compliant right thing. Extra heuristics to deal with invalid data can be added on top. But my trouble here is I can't even figure out what the official, legal, standards-compliant thing is. Maybe that's because the MarcXML standard simply doesn't address it, and it's all implementation dependent. Sigh. The problem is how the XML document's own char encoding is supposed to interact with the MARC header; especially because there's no way to put Marc8 in an XML char encoding doctype (is there?); and whether encodings other than Marc8 or UTF8 are legal in MarcXML, even though they aren't in MARC ISO binary. I think the answer might be that nobody knows, and there is no standard right way to do it. Which is unfortunate.
Re: [CODE4LIB] MarcXML and char encodings
Okay, maybe here's another way to approach the question. If I want to have a MarcXML document encoded in Marc8 -- what should it look like? What should be in the XML declaration? What should be in the MARC header embedded in the XML? Or is it not in fact legal at all? If I want to have a MarcXML document encoded in UTF8, what should it look like? What should be in the XML declaration? What should be in the MARC header embedded in the XML? If I want to have a MarcXML document with a char encoding that is _neither_ Marc8 nor UTF8, but something else generally legal for XML -- is this legal at all? And if so, what should it look like? What should be in the XML declaration? What should be in the MARC header embedded in the XML? On 4/17/2012 1:57 PM, Kyle Banerjee wrote: What's the legal thing to do? What's actually found 'in the wild' with MarcXML? In some cases, invalid XML. In an ideal world, the encoding should be included in the declaration. But I wouldn't trust it. kyle
Re: [CODE4LIB] MarcXML and char encodings
Thanks, this is helpful feedback at least. I think it's completely irrelevant, when determining what is legal under standards, to talk about what certain Java tools happen to do, though; I don't care too much what some tool you happen to use does. In this case, I'm _writing_ the tools. I want to make them do 'the right thing', with some mix of what's actually officially, legally correct and what's practically useful. What your Java tools do is more or less irrelevant to me. I certainly _could_ make my tool respect the Marc leader encoded in MarcXML over the XML declaration if I wanted to. I could even make it assume the data is Marc8 in XML, even though there's no XML charset type for it, if the leader says it's Marc8. But do others agree that there is in fact no legal way to have Marc8 in MarcXML? Do others agree that you can use non-UTF8 encodings in MarcXML, so long as they are legal XML? I won't even ask someone to cite standards documents, because it's pretty clear that LC forgot to consider this when establishing MarcXML. (And I have no faith that one could get LC to make a call on this and publish it any time this century.) Has anyone seen any Marc8-encoded MarcXML in the wild? Is it common? How is it represented with regard to the XML declaration and the Marc header? Has anyone seen any MarcXML with char encodings that are neither Marc8 nor UTF8 in the wild? Are they common? How are they represented with regard to the XML declaration and Marc header? On 4/17/2012 2:32 PM, LeVan,Ralph wrote: If I want to have a MarcXML document encoded in Marc8 -- what should it look like? What should be in the XML declaration? What should be in the MARC header embedded in the XML? Or is it not in fact legal at all? I'm going out on a limb here, but I don't think it is legal. There is no formal encoding that corresponds to MARC-8, so there's no way to tell XML tools how to interpret the bytes. If I want to have a MarcXML document encoded in UTF8, what should it look like? 
What should be in the XML declaration? What should be in the MARC header embedded in the XML? <?xml version="1.0" encoding="UTF-8"?> I suppose you'll want to set the leader to UTF-8 as well, but it doesn't really matter to any XML tools. If I want to have a MarcXML document with a char encoding that is _neither_ Marc8 nor UTF8, but something else generally legal for XML
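For what it's worth, the two places an encoding can be advertised -- the XML declaration and leader byte 09 -- can be inspected with a few lines of Python. This is only an illustrative sketch using the standard library; the sample record and the helper name are invented:

```python
import io
import xml.etree.ElementTree as ET

# A made-up one-record MARCXML document; element names follow the
# MARC21 slim schema, but the record content is invented.
MARCXML = b"""<?xml version="1.0" encoding="UTF-8"?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam a2200000 a 4500</leader>
</record>"""

def declared_encodings(xml_bytes):
    """Return (encoding named in the XML declaration, leader byte 09).

    These are the two places an encoding can be advertised, and as the
    thread notes, nothing forces them to agree.
    """
    # Crude scan of the XML declaration for its encoding pseudo-attribute.
    prolog = xml_bytes.split(b"?>", 1)[0].decode("ascii", "replace")
    xml_enc = None
    if "encoding=" in prolog:
        xml_enc = prolog.split("encoding=")[1].split('"')[1]
    # Leader/09: 'a' = Unicode, blank = MARC-8 (per MARC21).
    root = ET.parse(io.BytesIO(xml_bytes)).getroot()
    leader = root.find("{http://www.loc.gov/MARC21/slim}leader").text
    return xml_enc, leader[9]
```

A tool along the lines Jonathan describes could compare the two values and decide which one to trust when they disagree.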
Re: [CODE4LIB] MarcXML and char encodings
On 4/17/2012 3:01 PM, Sheila M. Morrissey wrote: No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 in the XML prolog. Wait, how can you declare a Marc8 encoding in an XML declaration/prolog/whatever it's called? The encoding names that appear there need to come from a specific list, and I didn't think Marc8 was on that list? Can you give me an example? And, if you happen to have it, a link to the XML standard that says this is legal?
[CODE4LIB] more on MARC char encoding: Now we're about ISO_2709 and MARC21
Okay, forget XML for a moment; let's just look at MARC 'binary'. First, for Anglophone-centric MARC21. The LC docs don't actually say quite what I thought about leader byte 09, used to advertise encoding: "a - UCS/Unicode: Character coding in the record makes use of characters from the Universal Coded Character Set (UCS) (ISO 10646), or Unicode™, an industry subset." That doesn't say UTF-8. It says UCS or Unicode. What does that actually mean? Does it mean UTF-8, or does it mean UTF-16 (closer to what used to be called UCS, I think)? Whatever it actually means, do people violate it in the wild? Now we get to non-Anglophone-centric MARC, all of which is ISO 2709, I think? A standard which of course is not open access, so I can't get it to see what it says. But leader byte 09 being used for encoding -- is that MARC21-specific, or is it true of any ISO 2709? Marc8 and Unicode being the only valid encodings can't be true of any ISO 2709, right? Is there a generic ISO 2709 way to deal with this, or not so much?
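A sketch of reading that leader byte from a binary record, assuming the MARC21 convention that blank means MARC-8 and 'a' means "UCS/Unicode" (whether that means UTF-8 or UTF-16 is exactly the ambiguity raised above); the sample leaders are invented:

```python
def marc21_declared_encoding(record: bytes) -> str:
    """Read leader byte 09 from a binary MARC21 record.

    Per the MARC21 spec quoted above: blank means MARC-8, 'a' means
    'UCS/Unicode' (which, as noted, doesn't pin down UTF-8 vs UTF-16).
    """
    flag = chr(record[9])
    return {" ": "MARC-8", "a": "UCS/Unicode"}.get(flag, "undefined: %r" % flag)

# Two invented 24-byte leaders, differing only in byte 09:
unicode_leader = b"00024nam a2200000 a 4500"
marc8_leader   = b"00024nam  2200000 a 4500"
```

Whether this flag can be trusted in real-world data is, of course, the subject of the pymarc thread further down.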
Re: [CODE4LIB] Silently print (no GUI) in Windows
If you had PDFs, you could probably do it. But if you have a bunch of different proprietary application files, each one is different and needs software that can interpret the file and turn it into a print job (PostScript, or whatever). Normally this software is the 'full application' that owns the format, say Microsoft Word. The particular application may come with software to 'silently' print, but most probably don't. The particular format may have a competitor that can open it (say, OpenOffice for Microsoft Word), and an open source competitor is perhaps more likely to have such 'silent printing' ability -- but it would still need to be done on a format-by-format basis. I don't know if anyone's selling software that tries to do what you're talking about for a multitude of popular formats, but it's pretty much impossible for there to be software that can do it for every/any format. I think you're not going to have much luck. Perhaps you could figure out a way to use some kind of Windows 'macro' program to actually open up each document in the 'full application' and choose File/Print, but to do this unattended. I am not familiar with such software. On 4/3/2012 2:48 PM, Kozlowski, Brendon wrote: Not a dumb question at all. In this particular case, the receiving PC that is to be storing/printing the documents will be taking jobs from multiple networks, buildings, etc., by either piping an email account or downloading via a user's upload from a webpage. We already have a solution for catching jobs in the print spooler (not ours), but need to automate the sending of the documents to the spooler itself. The only way I've ever sent documents to the spooler was by opening up the full application (ex: Microsoft Word) and using the GUI to send the print job. Since the PC housing and releasing these files is expected to be un-manned and sit in a back room, we just need to be able to silently print the jobs in the background.
Opening multiple applications over and over again would use up a lot of resources, so a silent, no-GUI option would be the best, from my very limited understanding -- if it's even possible. Brendon Kozlowski Web Administrator Saratoga Springs Public Library 49 Henry Street Saratoga Springs, NY, 12866 [518] 584-7860 x217 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Kyle Banerjee [baner...@uoregon.edu] Sent: Tuesday, April 03, 2012 1:25 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Silently print (no GUI) in Windows At the risk of asking a dumb question, why wouldn't a print server meet your use case if the print jobs come from elsewhere? kyle On Tue, Apr 3, 2012 at 9:15 AM, Kozlowski, Brendon bkozlow...@sals.edu wrote: I'm curious to know if anyone has discovered ways of silently printing documents from such Windows applications as: - Acrobat Reader (current version) - Microsoft Office 2007 (Word, Excel, PowerPoint, Visio, etc...) - Windows Picture and Fax Viewer I unfortunately haven't had much luck finding any resources on this. I'd like to be able to receive documents in a queue-like fashion on a single PC and simply print them off as they arrive. However, automating the loading/exiting of the full-blown application each time, and on demand, seems a little too cumbersome and unnecessary. I have not yet decided whether I'd be scripting it (PHP, AutoIt, batch files, VBS, PowerShell, etc...) or learning and then writing a .NET application. If .NET solutions use the COM object, the scripting becomes a potential candidate. Unfortunately I need to know how, or even if, it's possible to do first. Thank you for any and all feedback or assistance. Brendon Kozlowski Web Administrator Saratoga Springs Public Library 49 Henry Street Saratoga Springs, NY, 12866 [518] 584-7860 x217 Please consider the environment before printing this message.
To report this message as spam, offensive, or if you feel you have received this in error, please send e-mail to ab...@sals.edu including the entire contents and subject of the message. It will be reviewed by staff and acted upon appropriately. -- Kyle Banerjee Digital Services Program Manager Orbis Cascade Alliance baner...@uoregon.edu / 503.999.9787
Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?
Older 3.x versions of Blacklight may have put a solrmarc.jar inside your app's ./config/SolrMarc; that may not be caught by your slug ignore. This was an error -- it was never meant to do that -- and if you have one in a BL 3.x app you should be safe to remove it. Other than that, I'm curious what's making a BL app so large! Incidentally, you don't need a ./jetty in your local app _at all_, unless you actually want to keep a jetty Solr there. BL will optionally install one there, but it's not required. (Does slug size include your gem dependencies? I am not familiar with Heroku. Because the BL gem itself _does_ also include a SolrMarc.jar; if that's a problem, we'd have to refactor things on the BL side to make it an optional dependency instead of baked into BL.) On 3/29/2012 12:37 PM, Chris Fitzpatrick wrote: Hey Sean, Jah, I did that... my .slugignore is: tmp/* log/* coverage/* spec/* koha/* jetty/* That dropped it down to 30 from ~50mb, so that's good. (koha has some scripts I wrote to pull from our ILS.) I think the slug size is a really minor issue. Heroku says under 25mb is good, but over 50mb is not so good. Not Good, but not Chaotic Evil. Neutral Good. On Thu, Mar 29, 2012 at 6:26 PM, Sean Hannan shan...@jhu.edu wrote: If you already have everything indexed in Solr elsewhere, a way to cut down the BL slug size is to remove/ignore the SolrMarc.jar. It's pretty sizable. -Sean On 3/29/12 12:16 PM, Chris Fitzpatrick chrisfitz...@gmail.com wrote: Hi, I've deployed Blacklight on both Heroku and Elastic Beanstalk. Heroku is still a much better choice. The only issue I had was I needed to make sure the sass-rails gem is installed in the :production gem group and not just development. I still have an issue getting Heroku to compile all my sass/coffeescript/etc. assets on update, but it actually doesn't seem to make much of an impact on performance. The minor issue is that it would be nice to figure out a way to slim down BL's slug size.
The lowest I've been able to get it is about 30mb, and Heroku recommends having it be below 25mb. I have not used Heroku's Solr service (I still use EC2 for my Solr deployments). EngineYard would be another option. There is also an AMI for DSpace, so deploying that to EC2 should be pretty easy. b, chris. On Thu, Mar 29, 2012 at 3:55 PM, Rosalyn Metz rosalynm...@gmail.com wrote: Erik, I haven't tried it (recently) on PaaS providers, but I have on IaaS. The AMIs I've created in association with start-up scripts (if you're interested in seeing those let me know, I'd have to look for them somewhere or other) mean that the application automagically starts up on its own; all you need to do is go to the URL. I've used this as a back-up method in the past and I think it would be a great way for people to be able to play with the different apps before committing. To this end, I created an AMI for Blacklight a while back: http://www.rosalynmetz.com/ami-3c10f255/ I guarantee you it is grossly out of date. I also have instructions on creating an EBS-backed AMI: http://rosalynmetz.com/ideas/2011/04/14/creating-an-ebs-backed-ami/ which is the method I used for creating the Blacklight AMI. These instructions are also fairly old, but I still get comments on my blog now and then that the method works. I also played around with it on Heroku, but that was so long ago I don't think any of the things I learned still apply (this was when Heroku was fairly new to the scene). Hope some of this helps. Rosalyn On Thu, Mar 29, 2012 at 8:34 AM, Seth van Hooland svhoo...@ulb.ac.be wrote: Dear Erik, Bram Wiercx and myself have given a talk on how to put together a package to install CollectiveAccess on Red Hat's OpenShift: http://www.dish2011.nl/sessions/open-source-software-platform-collectiveaccess-as-a-service-solution . My students are currently happily playing around with CollectiveAccess, which they have installed on OpenShift.
My teaching assistant Max De Wilde has developed clear guidelines on how to run the installation procedure: http://homepages.ulb.ac.be/~svhoolan/redhat_ca_install.pdf. It would be wonderful to aggregate these kinds of installation procedures for other types of LIS applications... Kind regards and looking forward to your book! Seth van Hooland Président du Master en Sciences et Technologies de l'Information et de la Communication (MaSTIC) Université Libre de Bruxelles Av. F.D. Roosevelt, 50 CP 123 | 1050 Bruxelles http://homepages.ulb.ac.be/~svhoolan/ http://twitter.com/#!/sethvanhooland http://mastic.ulb.ac.be 0032 2 650 4765 Office: DC11.113 On 29 March 2012 at 14:10, Erik Mitchell wrote: Hi all, I have been toying with the process of implementing common LIS applications (e.g. VuFind, DSpace, Blacklight...) on PaaS providers like Heroku and Amazon Elastic Beanstalk. I have just tried out-of-the-box distributions so far and have not made much progress, but was wondering if someone else had tried this or had ideas
Re: [CODE4LIB] Anyone implementing common LIS applications on PaaS providers?
On 3/29/2012 5:05 PM, Chris Fitzpatrick wrote: locally and push them rather than rely on Heroku to precompile them (currently when I push, Heroku's precompile fails, so it reverts to compile-at-runtime mode) if anyone has insight into this, please lemme know... I believe having them compile at runtime does slow down the application... I have no idea why it's not working on Heroku -- no experience with Heroku (although I'm familiar with the concept). But compile-at-runtime _will_ slow down your app, yeah. Here's a Stack Overflow question I asked about it myself: http://stackoverflow.com/questions/8821864/config-assets-compile-true-in-rails-production-why-not Compiling locally and then pushing should work, and is arguably better in some ways (why waste cycles on the production machine compiling assets?). But if you choose to compile and check the assets into your source control repo, here's a trick that will keep your on-disk compiled assets from driving you crazy in development... eh, I can't find the blog post on Google now, but it's something like setting config.assets.prefix = "/dev-assets" in environments/development.rb, so in development Rails will ignore your on-disk compiled assets.
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
a) Mis-characterized MARC char encodings are common amongst many of our corpuses and ILSs. It is a common problem. It can be very inconvenient. Not only Marc8 that says it's UTF8 and vice versa, but also data that says it's MARC8 or UTF8 but is actually neither. b) One solution would be having the MARC tool pass the char stream through as-is without complaining, like Godmar suggested; another would be trying to heuristically guess the 'real' encoding, like Gabe suggests; personally I favor a different solution: the thing that's encoding as Unicode on the way out? Instead of raising on an invalid char, it should have the option of silently eating it, replacing it with either the empty string or the Unicode replacement character ("used to replace an incoming character whose value is unknown or unrepresentable in Unicode" [http://www.fileformat.info/info/unicode/char/fffd/index.htm]). I have worked with character encoding libraries before that have this option: replace messed-up bytes with the Unicode replacement char. I don't know what's available in Python, though. Jonathan On 3/8/2012 3:19 PM, Gabriel Farrell wrote: Sounds like what you do, Terry, and what we need in PyMARC, is something like UnicodeDammit [0]. Actually handling all of these esoteric encodings would be quite the chore, though. I also used to think it would be cool if we could get MARC8 encoding/decoding into the Python standard library, but then I realized I'd rather work on other stuff while MARC8 withers and dies. [0] https://github.com/bdoms/beautifulsoup/blob/master/BeautifulSoup.py#L1753 On Thu, Mar 8, 2012 at 2:36 PM, Reese, Terry terry.re...@oregonstate.edu wrote: This is one of the reasons you really can't trust the information found in position 9.
This is one of the reasons why, when I wrote MarcEdit, I utilized a mixed process when working with data and determining character set -- a process that reads this byte and takes the information under advisement, but in the end treats it more as a suggestion, one part of a larger heuristic analysis of the record data to determine whether the information is in UTF8 or not. Fortunately, determining if a set of data is in UTF8 or something else is a fairly easy process. Determining the something else is much more difficult, but generally not necessary. For that reason, if I were advising other people working on MARC processing libraries, I'd advocate having a process for recognizing that certain informational data may not be set correctly, and essentially utilizing a compatibility process to read and correct it. Because unfortunately, while the number of vendors and systems that set this encoding byte correctly has increased dramatically (it used to be pretty much no one), it's still so uneven that I generally consider this information unreliable. --TR -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Godmar Back Sent: Thursday, March 08, 2012 11:01 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records On Thu, Mar 8, 2012 at 1:46 PM, Terray, James james.ter...@yale.edu wrote: Hi Godmar, UnicodeDecodeError: 'ascii' codec can't decode byte 0xe8 in position 9: ordinal not in range(128) Having seen my fair share of these kinds of encoding errors in Python, I can speculate (without seeing the pymarc source code, so please don't hold me to this) that it's the Python code that's not set up to handle the UTF-8 strings from your data source. In fact, the error indicates it's using the default 'ascii' codec rather than 'utf-8'. If it said "'utf-8' codec can't decode...", then I'd suspect a problem with the data.
If you were to send the full traceback (all the gobbledy-gook that Python spews when it encounters an error) and the version of pymarc you're using to the program's author(s), they may be able to help you out further. My question is less about the Python error, which I understand, than about the MARC record causing the error and about how others deal with this issue (if it's a common issue, which I do not know.) But, here's the long story from pymarc's perspective. The record has leader[9] == 'a', but really, truly contains ANSEL-encoded data. When reading the record with a MARCReader(to_unicode = False) instance, the record reads ok since no decoding is attempted, but attempts at writing the record fail with the above error since pymarc attempts to utf8 encode the ANSEL-encoded string which contains non-ascii chars such as 0xe8 (the ANSEL Umlaut prefix). It does so because leader[9] == 'a' (see [1]). When reading the record with a MARCReader(to_unicode=True) instance, it'll throw an exception during marc_decode when trying to utf8-decode the ANSEL-encoded string. Rightly so. I don't blame pymarc for this behavior; to me, the record looks wrong. - Godmar (ps: that said, what pymarc does fails
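For what it's worth, the "eat the bad byte and emit the replacement character" option Jonathan asks about is available in Python's built-in codec machinery via errors="replace". A minimal sketch, using Godmar's 0xE8 ANSEL umlaut prefix as the offending byte (the sample bytes are invented):

```python
# 0xE8 is the ANSEL umlaut prefix Godmar mentions; as a lone byte it is
# not valid UTF-8, so a strict decode raises UnicodeDecodeError.
raw = b"M\xe8uller"  # invented sample: MARC-8 bytes mislabeled as UTF-8

try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# errors="replace" substitutes U+FFFD for each undecodable byte instead of
# raising: the damage stays visible, but processing continues.
repaired = raw.decode("utf-8", errors="replace")
```

Here `strict_ok` ends up False and `repaired` contains a visible U+FFFD where the bad byte was, which is exactly the "make the error visible without stopping the process" behavior argued for below.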
Re: [CODE4LIB] Q.: MARC8 vs. MARC/Unicode and pymarc and misencoded III records
Oh, and why do I favor this solution? Compared to passing input through as-is: you're just prolonging the pain; something downstream is still going to have a problem with it, and outputting known-illegal data is not a good idea. Compared to heuristically guessing the encoding: heuristic guessing is okay, but obviously a good deal harder than just replacing bad data with the Unicode replacement glyph. But honestly, I don't _want_ this kind of mis-encoded data to be completely transparent -- I want it to do something to make the error visible (without stopping the app or data transformation process in its tracks), so catalogers can't possibly think that the data is just fine. If you use heuristics to guess, sometimes those heuristics will fail -- and when they do, the catalogers will think there's something wrong with your logic: but it works fine for all the other records that you say have the same problem, why can't it work fine for this one? But this is partially a result of my general conclusion, from experience, about trying to heuristically 'autocorrect' bad MARC data -- I try to do it as minimally as possible. It's too easy to get into a long battle of trying to make your heuristics better, instead of focusing on, you know, actually fixing the data. Now, a place where I'd be willing to use heuristics: a bulk process to try to actually fix the data in your ILS. Something that goes through all your MARC and flags records that aren't legal for the encoding they claim to be. If you want to add heuristics there to try to guess what encoding they really are and automatically fix 'em, that doesn't seem a terrible idea to me. But working around the problem with heuristics at higher levels does; spend time on actually fixing the bad data instead.
Bad MARC data, including illegal char encodings, is a continual inconvenience: you work around it in your pymarc-based software, and eventually you'll have some other software in a different language that you have to duplicate your workarounds in. On 3/8/2012 3:45 PM, Jonathan Rochkind wrote: [...]
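Terry's point that determining whether a set of data is in UTF8 is "a fairly easy process" can be sketched in Python; a strict decode either succeeds or raises. This is only an illustrative helper, not MarcEdit's actual logic:

```python
def looks_like_utf8(raw: bytes) -> bool:
    """A strict UTF-8 decode either succeeds or raises; stray MARC-8
    multi-byte sequences almost never form valid UTF-8 by accident."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

One caveat worth noting: pure-ASCII data is valid in both UTF-8 and MARC-8, so this test only discriminates for records that actually contain multi-byte sequences -- which also matches Terry's observation that "determining the something else is much more difficult."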
Re: [CODE4LIB] Microsoft Transact-SQL
Then you might be best off starting with a really good book on SQL in general, or 'standard' SQL. On 3/6/2012 1:42 PM, Wilfred Drew wrote: It is actually for a job I am interested in. I have no in-depth SQL experience at all, just some using Access. -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jon Gorman Sent: Tuesday, March 06, 2012 1:39 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Microsoft Transact-SQL On Tue, Mar 6, 2012 at 11:05 AM, Wilfred Drew dr...@tc3.edu wrote: I did mean Transact-SQL!! Sorry. I am after book recommendations. Right, sorry, should have made myself clearer. Do you have previous experience with creating database queries? I can't say I have any real recommendations, but it might help others. (And you might be able to get away with a more general book on SQL and then look through the online documentation for specific problems.) Jon Gorman
Re: [CODE4LIB] Repositories, OAI-PMH and web crawling
IF your HTML includes embedded semantic data using HTML5 microdata or RDFa or something similar (using a standard vocabulary -- the standard for repositories seems to be DC-based, since that's often all you can get out of OAI-PMH anyway) -- then web crawling combined with sitemaps probably provides about as much functionality as OAI-PMH. But embedded semantic metadata is key. However, even in the current OAI-PMH-considered-standard-best-practice world, the document-level metadata from repositories is often _extremely_ basic, as well as often unreliable. This severely limits the uses harvesters can put harvests to. So it's not necessarily really about OAI-PMH vs. web crawling; it's about sufficient, and sufficiently reliable, metadata. And even in the OAI-PMH world, we rarely have it. Note for instance that OAIster and similar harvesters are _unable to know_ whether a harvested document is open-access full text or not. That seems like something you'd want to tell people in their search results, right? They might only want stuff that they can actually access. But it's not really possible, because most (all?) repos do not expose any standard metadata in their OAI-PMH that would specify this. On 3/1/2012 9:38 AM, Ian Ibbotson wrote: Owen... Just wanted to say that, whilst I've been silent since my initial response, I'm not sure I agree with all the viewpoints presented here. From the point of view of (for example) CultureGrid, I'm not sure what has been done could have been pragmatically achieved solely with web crawling as it's described in this thread. I don't have a problem with anything that's been written here. It certainly represents a great cross-section of viewpoints. However, from a JISC Discovery perspective, I don't want to contribute to any confirmation bias that we could dispose of pesky old OAI. I'd be interested in providing a counter-point to any best practice document that suggested we could. Ian.
On Thu, Mar 1, 2012 at 12:36 PM, Owen Stephenso...@ostephens.com wrote: Thanks Jason and Ed, I suspect within this project we'll keep using OAI-PMH because we've got tight deadlines and the other project strands (which do stuff with the harvested content) need time from the developer. At the moment it looks like we will probably combine OAI-PMH with web crawling (using nutch) - so use data from the However, that said, one of the things we are meant to be doing is offering recommendations or good practice guidelines back to the (repository) community based on our experience. If we have time I would love to tackle the questions (a)-(d) that you highlight here - perhaps especially (a) and (c). Since this particular project is part of the wider JISC 'Discovery' programme (http://discovery.ac.uk and tech principles at http://technicalfoundations.ukoln.info/guidance/technical-principles-discovery-ecosystem) - from which one of the main themes might be summarised as 'work with the web' these questions are definitely relevant. I need to look at Jason's stuff again as I think this definitely has parallels with some of the Discovery work, as, of course, does some of the recent discussion on here about the question of the indexing of library catalogues by search engines. Thanks again to all who have contributed to the discussion - very useful Owen Owen Stephens Owen Stephens Consulting Web: http://www.ostephens.com Email: o...@ostephens.com Telephone: 0121 288 6936 On 1 Mar 2012, at 11:42, Ed Summers wrote: On Mon, Feb 27, 2012 at 12:15 PM, Jason Ronallojrona...@gmail.com wrote: I'd like to bring this back to your suggestion to just forget OAI-PMH and crawl the web. I think that's probably the long-term way forward. I definitely had the same thoughts while reading this thread. Owen, are you forced to stay within the context of OAI-PMH because you are working with existing institutional repositories? 
I don't know if it's appropriate, or if it has been done before, but as part of your work it would be interesting to determine: a) how many IRs allow crawling (robots.txt or lack thereof) b) how many IRs support crawling with a sitemap c) how many IR HTML splashpages use the rel-license [1] pattern d) how many IRs support syndication (RSS/Atom) to publish changes If you could do this in a semi-automated way for the UK it would be great if you could then apply it to IRs around the world. It would also align really nicely with the sort of work that Jason has been doing around CAPS [2]. It seems to me that there might be an opportunity to educate digital repository managers about better aligning their content w/ the Web ... instead of trying to cook up new standards. I imagine this is way out of scope for what you are currently doing--if so, maybe this can be your next grant :-) //Ed [1] http://microformats.org/wiki/rel-license [2] https://github.com/jronallo/capsys
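Point (a) in Ed's list is easy to check mechanically. A minimal sketch using Python's standard-library robots.txt parser; the repository URL and policy below are invented examples:

```python
import urllib.robotparser

def crawl_allowed(robots_txt: str, url: str, agent: str = "*") -> bool:
    """Does this robots.txt permit fetching the given URL? (Point (a) above.)"""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

# Invented example repository policy:
robots = """User-agent: *
Disallow: /admin/
"""
```

A semi-automated survey of IRs could fetch each repository's /robots.txt and run this check against a sample item URL; points (b)-(d) would similarly reduce to fetching the sitemap, scanning splash pages for rel-license, and looking for feed autodiscovery links.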
Re: [CODE4LIB] Local catalog records and Google, Bing, Yahoo!
On 2/23/2012 1:37 PM, Sean Hannan wrote: Anecdotally, it would appear that Bing (and Bing-using Yahoo) seems to drastically play down catalog records in their results. We're not doing anything to favor a particular search engine; we have a completely open robots.txt file. I think they're probably right to play down catalog records! I wonder how many people searching on Google and ending up at our catalog are actually satisfied with what they get there -- info on how to check the book out if they were affiliated with our university, or where to find it on the shelves if they come to Baltimore? An electronic copy that most of the time they can't access without being affiliated with our university?
Re: [CODE4LIB] Local catalog records and Google, Bing, Yahoo!
On 2/23/2012 2:45 PM, Karen Coyle wrote: This links to thoughts I've had about linked data and finding a way to use library holdings over the Web. Obviously, bibliographic data alone is not a full service: people want to get the stuff once they've found out that such stuff exists. So how do we get users from the retrieval of a bibliographic record to a place where they have access to the stuff? I see two options: the WorldCat model, where people get sent to a central database where they input their zip code, or a URL-like model where they get a link on retrievals that has knowledge about their preferred institution and access. I think we need both of those, and mixtures between the two, and more. OCLC is trying to do the second one too, for instance with their link resolver redirector. But it requires link resolvers being registered, link resolvers working, link resolvers working for print materials, etc. Of course "get a link on retrievals" begs the question of from where they are retrieving and who is generating this link. But in theory, anyone with a retrieval system could give you a link through OCLC's link resolver redirector. Which isn't quite fleshed out yet, but theoretically could then redirect you to the link resolver of your choice based on preferences or proximity. Except, well, it doesn't work that well, for a variety of reasons both under and not under OCLC's control. But it's the sort of architecture we're talking about, I think. (Now, if there were a common machine-readable response format for link-resolver-type requests, an OCLC-like service could even aggregate the responses from _several_ preferred institutions on one page. Umlaut originally tried to do that with SFX link resolvers, but it never really went anywhere.) Anyhow, yeah, both of those, and more. They definitely aren't mutually exclusive, and the sorts of technologies and metadata ecologies needed to support each one have a whole lot of overlap.
Incidentally, my Umlaut software, mostly targeted at academic libraries, is really focused on that exact problem: people want to get the stuff once they've found out that such stuff exists. So how do we get users from the retrieval of a bibliographic record to a place where they have access to the stuff? But it's definitely not done yet; it's my goal with Umlaut, but there's still a lot left to do to get there. (Ultimately, you need some kind of LibX-type approach, browser plugin or javascript bookmarklet, to get people to a place where they have access from third parties that have absolutely no interest in collaborating on this plan. Amazon doesn't want to help you go anywhere other than Amazon to acquire a book.) Definitely a work in progress, but the goal it's oriented to is exactly what you say. https://github.com/team-umlaut/umlaut Jonathan I have no idea if the latter is feasible on a true web scale, but it would be my ideal solution. We know that search engines keep track of your location and tailor retrievals based on that. Could libraries get into that loop? kc On 2/23/12 11:35 AM, Eoghan Ó Carragáin wrote: That's true, but since Blacklight/VuFind often sit over digital/institutional repositories as well as ILS systems and subscription resources, at least some public domain content gets found that otherwise wouldn't be. As you said, even if the item isn't available digitally, for Special Collections libraries unique materials are exposed to potential researchers who'd never have known about them. Eoghan On 23 February 2012 19:25, Sean Hannan shan...@jhu.edu wrote: It's hard to say. Going off of the numbers that I have, I'd say that they do find what they are looking for, but unless they are a JHU affiliate, they are unable to access it. Our bounce rate for Google searches is 76%. Which is not necessarily bad, because we put a lot of information on our item record pages -- we don't make you dig for anything.
On the other hand, 9% of visits coming to us through Google searches are return visits. To me, that says that the other 91% are not JHU affiliates, and that's 91% of Google searchers that won't have access to materials. I know from monitoring our feedback form, we have gotten an increase in requests from far-flung places for access to things we have in special collections from non-affiliates. So, we get lots of exposure via searches, but due to the nature of how libraries work with subscriptions, licensing, membership and such, we close lots of doors once they get there. -Sean On 2/23/12 1:55 PM, Schneider, Wayne <wschnei...@hclib.org> wrote: This is really interesting. Do you have evidence (anecdotal or otherwise) that the people coming to you via search engines found what they were looking for? Sorry, I don't know exactly how to phrase this. To put it another way - are your patrons finding you this way? wayne
[CODE4LIB] How to get from what you've found to access:
Changing the subject line, 'cause this is an interesting topic on its own. On 2/23/2012 2:45 PM, Karen Coyle wrote: This links to thoughts I've had about linked data and finding a way to use library holdings over the Web. Obviously, bibliographic data alone is not a full service: people want to get the stuff once they've found out that such stuff exists. So how do we get users from the retrieval of a bibliographic record to a place where they have access to the stuff? I think this is exactly right as a problem libraries (which provide various methods of access to items people may find out about elsewhere) should be focusing on more than they do. I mentioned in a previous reply that this is in fact exactly the mission of Umlaut. A better direct link for people interested in Umlaut than the one I pasted before: https://github.com/team-umlaut/umlaut/wiki It's definitely a work in progress, like I said; I'm not saying Umlaut solves this problem. But the thinking behind Umlaut is that you've got to have software which can take a machine-described thing someone is interested in and tell them everything that your institution (which they presumably are affiliated with) can do for them for that item. That's exactly what Umlaut tries to do, providing a platform that you can use to piece together information from your various silo'd knowledge bases and third-party resources, etc. And including ALL your stuff, monographs etc., not just journal articles like typical link resolvers. After that (and even that is hard), you've got to figure out how to _get_ people from where they've found out about something to your service for telling them what they can do with it via your institution. That's not an entirely solved problem. One reason Umlaut speaks OpenURL is that there is already a substantial infrastructure of things in the academic market that can express a thing someone knows about in OpenURL and send it to your local software (including Google Scholar). But it's still not enough. 
Ultimately some kind of LibX approach is probably required -- whether a browser plugin, or a javascript bookmarklet (same sort of thing, different technology), a way to get someone from a third party to your 'list of services', even when that third party is completely uninterested in helping them get there (Amazon doesn't particularly want to help someone who starts at Amazon get somewhere _else_ to buy the book! Others may be less hostile, but just not all that interested in spending any energy on it). Jonathan
Re: [CODE4LIB] Local catalog records and Google, Bing, Yahoo!
On 2/23/2012 3:53 PM, Karen Coyle wrote: Jonathan, while having these thoughts your Umlaut service did come to mind. If you ever have time to expand on how it could work in a wide open web environment, I'd love to hear it. (I know you explain below, but I don't know enough about link resolvers to understand what it really means from a short explanation. Diagrams are always welcome!) I'm not entirely sure what is meant by 'wide open web environment.' I mean, part of the current environment is that there's lots of stuff on the web that is NOT free/open access, it's only available to certain licensed people. AND that libraries license a lot of this stuff on behalf of their user group. (Not just content, but sometimes services too). It's really that environment Umlaut is focused on; if that changed, what would be required would have little to do with Umlaut as it is now, I think. But I don't think anyone anticipates that changing anytime soon, and I don't think that's what Karen means by 'wide open web environment.' So if that continues to be the case, I think Umlaut has a role working pretty much as it does now. (Maybe I'm not sufficiently forward-thinking). I will admit that, while I come across lots of barriers in implementing Umlaut, I have yet to come across anything that makes me think this would be a lot easier if only there was more RDF. Maybe it's a failure of imagination on my part. More downloadable data, sure. More http APIs, even more so. And Umlaut already takes advantage of such things, especially the APIs more than the downloadable data (it turns out it's a lot more 'expensive' to try to download data and do something with it yourself, compared to using an API someone else provides to do the heavy lifting for you). But has it ever been much of a problem that the data is in some format other than RDF, such that it would be easier in RDF? Not from my perspective, not really. 
(In some ways, RDF is harder to deal with than other formats, from where I'm coming from. If a service does offer data in RDF triples as well as something else, I'm likely to choose the something else). This may be ironic because Umlaut is very concerned with 'linking data', in the sense of figuring out whether this record from the local catalog represents 'the same thing' as this record from Amazon, as this record from Google Books, or HathiTrust. Or whether this citation that came in as an OpenURL represents the 'same thing' as a record in a vendor database, or Mendeley, or whatever. There are real barriers in making this determination; they wouldn't be solved if everything was just in RDF, but they _would_ be solved if there were more consistent use of identifiers, for sure. I DO think this would be easier if only there were more consistent use of identifiers all the time. That experience with Umlaut is also what leads me to believe that the WEMI ontology is not only not contradictory to linked data applications, but _crucial_ for them. Without it, it's very hard to tell when something is the same thing. There are lots of times Umlaut ends up saying "Okay, I found something that I _think_ is at least an edition of the same thing you care about, but I really can't tell you if it's the _same_ edition you are interested in or not." So, yeah, Umlaut would work _better_ with more widespread use of identifiers, and even better with consistent use of common identifiers. I guess that's maybe where RDF could come in, in expressing determinations people have made that this identifier in system X represents the same 'thing' as this other identifier in system Y (someone would still have to MAKE those determinations; RDF would just be one way to then convey that determination, and I wouldn't particularly care if it was conveyed in RDF or something else). So anyway, it would work better with some of that stuff, but would it work substantially _differently_? Not so much. 
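For what it's worth, one way such a cross-system equivalence assertion could be conveyed is an owl:sameAs triple -- all the URIs below are invented for illustration, and somebody would still have to make the determination the triple records:

```
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# Assertion that an identifier in system X and a local catalog record
# identify the same thing (both URIs are hypothetical examples).
<http://example.org/systemX/id/12345678>
    owl:sameAs <https://catalog.example.edu/record/99999> .
```

The hard part is producing that assertion in the first place; the serialization, RDF or otherwise, is the easy part.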
Ah, if web pages started having more embedded machine-readable data with citations and identifiers of what is being looked at (microdata, RDFa, whatever), that would make it easier to get a user from some random web page _to_ an institution's Umlaut; that's one thing that would be nice. You may (or may not) find the "What is Umlaut, Anyway?" article on the Umlaut wiki helpful. https://github.com/team-umlaut/umlaut/wiki/What-is-Umlaut-anyway And there's really not much to understand about 'link resolvers' for these purposes, except that there's this thing called OpenURL (really bad name), which is really just a way for one website to hyperlink to another website and pass a machine-readable citation to it. The application receiving the machine-readable citation then tries to get the user to appropriate access or services for it, with regard to institutional entitlements. That's about it; if you understand that, you understand enough. Except that most
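To make "a way for one website to hyperlink to another website and pass a machine-readable citation" concrete, here's a minimal sketch in ruby of what constructing such a link amounts to. The resolver base URL and the citation are invented for illustration; real OpenURL 1.0 has a lot more ceremony than this:

```ruby
require "uri"

# An OpenURL is, in practice, an ordinary hyperlink to an institution's
# link resolver with a citation encoded in the query string.
# This resolver base URL is a hypothetical example.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

def openurl_link(citation)
  # KEV ("key encoded value") style parameters, following the OpenURL 1.0
  # convention of prefixing referent metadata keys with "rft."
  params = { "url_ver" => "Z39.88-2004" }
  citation.each { |key, value| params["rft.#{key}"] = value }
  "#{RESOLVER_BASE}?#{URI.encode_www_form(params)}"
end

link = openurl_link("genre"  => "book",
                    "btitle" => "Example Title",
                    "isbn"   => "9780000000002")
```

Any site that can build a link like that can hand the user off to the institution's resolver, which then figures out entitlements and services.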
Re: [CODE4LIB] Local catalog records and Google, Bing, Yahoo!
On 2/23/2012 5:35 PM, Stephen Hearn wrote: But there's a catch--when WorldCat redirects a search to the selected local library catalog, it targets the OCLC record number. If the holding library has included the OCLC record number in its indexed data, the user goes right to the desired record. If not, the user is left wondering why the title of interest turned into some mysterious number and the search failed. I've been wishing OCLC would change this for a while. When specifying WorldCat's redirects for your local catalog, it's already possible to NOT specify an OCLCnum-based search, but only specify an ISBN, ISSN, etc search. If you do this, and the record HAS an (e.g.) ISBN, it'll redirect to an ISBN search in your catalog. But if the record doesn't have an ISBN, ISSN, etc, I think it'll just redirect to your catalog home page. So WorldCat is already capable of redirecting to an ISBN search. But if you config the OCLCnum search, it seems it'll always use it instead. I wish WorldCat instead would do the ISBN search if there is an ISBN, do an ISSN search if there's an ISSN, and only resort to the OCLCnum search if there's no ISBN or ISSN to search on. Or at least that could be a configurable option. It would result in a greater proportion of successful 'hits' when redirecting to the local catalog, which may not have an OCLCnum in it for every single record that it possibly could. (For that matter, what about when there are multiple OCLCnums, multiple records, for the same manifestation? For instance, a German-language cataloging record and an English-language cataloging record, for the exact same manifestation, have different OCLCnums. Will OCLC ever send the German-language cataloging record's OCLCnum and miss because you had the English-language one? I dunno). Anyhow, I've tried making this suggestion before to relevant OCLC people, but it's possible I never found the relevant OCLC people. 
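The fallback order I'm wishing for is simple enough to write down as plain logic. This is a sketch of what I wish WorldCat's redirect did, not what it does; the record hash and the local catalog's search-URL shapes are invented:

```ruby
# A hypothetical local catalog base URL.
CATALOG_BASE = "https://catalog.example.edu"

# Prefer identifiers the local catalog is most likely to have indexed,
# and only fall back to OCLC number as a last resort.
def redirect_url(record)
  if record[:isbn]
    "#{CATALOG_BASE}/search?isbn=#{record[:isbn]}"
  elsif record[:issn]
    "#{CATALOG_BASE}/search?issn=#{record[:issn]}"
  elsif record[:oclcnum]
    "#{CATALOG_BASE}/search?oclc=#{record[:oclcnum]}"
  else
    CATALOG_BASE # nothing usable to search on; send them to the catalog home page
  end
end

url = redirect_url(isbn: "9780000000002", oclcnum: "12345678")
```

Even with an OCLCnum present, the ISBN wins, which is the whole point.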
It's kind of hard to figure out how to make such feature suggestions to OCLC in a way that won't just be dropped on the floor (not sure it's possible, in fact). Jonathan Stephen On Thu, Feb 23, 2012 at 4:11 PM, David Friggens <frigg...@waikato.ac.nz> wrote: why local library catalog records do not show up in search results? Basically, most OPACs are crap. :-) There are still some that don't provide persistent links to record pages, and most are designed so that the user has a session and gets kicked out after 10 minutes or so. These issues were part of Tim Spalding's message that as well as joining web 2.0, libraries also need to join web 1.0. http://vimeo.com/user2734401 We don't allow crawlers because it has caused serious performance issues in the past. Specifically (in our case at least), each request creates a new session on the server which doesn't time out for about 10 minutes, thus a crawler would fill up the system's RAM pretty quickly. You can use Crawl-delay: http://en.wikipedia.org/wiki/Robots_exclusion_standard#Crawl-delay_directive You can set Google's crawl rate in Webmaster Tools as well. I've had this suggested before and thought about it, but never had it high up enough in my list to test it out. Has anyone actually used the above to get a similar OPAC crawled successfully and not brought down on its knees? David
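For what it's worth, the Crawl-delay suggestion amounts to a couple of lines of robots.txt. A sketch with invented paths -- and note that Crawl-delay is a non-standard directive that Google ignores (as David says, Google's rate is set in Webmaster Tools instead):

```
User-agent: *
# Ask compliant crawlers to wait 10 seconds between requests,
# to keep session build-up on the OPAC under control.
Crawl-delay: 10
# Keep crawlers out of session-spawning search URLs entirely
# (the path is hypothetical; every OPAC differs).
Disallow: /search
```

That still leaves the question of whether the record pages themselves can survive being crawled at all.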
Re: [CODE4LIB] Issue Tracker Recommendations
On 2/22/2012 5:10 PM, Sebastian Karcher wrote: Because Trac and Git have come up: Zotero has switched from Trac/SVN to Git and I (and I think everyone else involved) much prefer git, not least because of its better issue handling. I found Trac slow, clumsy, and ugly. I'm confused. Git is a source control repo, like svn. Trac is an issue tracker. Okay, you switched from svn to git (which to me seems somewhat orthogonal to issue trackers, although it's true that certain issue tracker software integrates with some version control systems and not others, like trac does with svn). But what are you using for issue tracking now, instead of Trac? git is not an issue tracker, so I'm not sure what you mean by git's better issue handling; git doesn't do issue handling (any more than svn does). Do you mean you're using github.com as your git host, and their issue tracker? Or something else? Jonathan If, as you say, the code repository function isn't important, there may very well be better products for issue tracking only, but between Trac and github the latter is imho much superior. On Wed, Feb 22, 2012 at 1:52 PM, Sarr, Nathan <ns...@library.rochester.edu> wrote: You might want to take a look at asana: http://asana.com/ -Nate Nathan Sarr Senior Software Engineer River Campus Libraries University of Rochester Rochester, NY 14627 (585) 275-0692 ns...@library.rochester.edu -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cynthia Ng Sent: Wednesday, February 22, 2012 3:46 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Issue Tracker Recommendations Thanks for all the responses, everyone. If there are any more, I'd still like to hear them. Should probably add that 4) it's more for issue tracking/documentation i.e. 
code versioning/repository is not a priority right now (though it's great if it has that feature) There will be discussions with the rest of the team and we'll have to talk to the programmer/server admin to see what he thinks is easier to implement, but we're likely to go with Redmine or Trac based on recommendations/needs.
[CODE4LIB] more on returning partial HTML to javascript
A while ago we had a big debate/argument about whether it makes sense to return partial HTML snippets from ajax (or really, um, ajah, in this case?) requests from javascript; or whether instead modern apps should all move toward javascript MVC models with most logic in the js layer; or something in between; or maybe it depends on the app, heh. Anyway, I ran across (on reddit I think) this interesting blog post by the developers of the ruby Basecamp app, explaining how they use partial HTML returns to js, and avoid javascript MVC except in areas where the UI really requires it; and in particular how they then have to pay a lot of attention to server-side caching to get the very snappy performance they want. Some may find it interesting. http://37signals.com/svn/posts/3112-how-basecamp-next-got-to-be-so-damn-fast-without-using-much-client-side-ui
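For anyone who missed the earlier debate, the "partial HTML to js" approach is tiny in code terms: the server renders the same template partial it would use in a full page, and the ajax response body is just that snippet, which the client inserts into the DOM verbatim. A minimal sketch with ruby's stdlib ERB -- the template and data are invented, not from Basecamp:

```ruby
require "erb"

# A template partial the server might also use when rendering the full page.
# (Invented example; a real Rails app would keep this in a .erb file.)
PARTIAL = <<~ERB
  <li class="todo" id="todo-<%= todo[:id] %>"><%= todo[:title] %></li>
ERB

# The ajax endpoint's entire job: render the partial and return the HTML.
def render_todo_partial(todo)
  ERB.new(PARTIAL).result(binding)
end

snippet = render_todo_partial(id: 42, title: "Write the partial")
# The client side is then one line of jQuery, e.g.:
#   $.get("/todos/42/partial", function(html) { $("#todos").append(html); });
```

The interesting part of the Basecamp post is not this mechanism but the aggressive server-side caching wrapped around it.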
Re: [CODE4LIB] www.code4lib.org down?
On 2/20/2012 12:54 PM, Cary Gordon wrote: I could also put it on one of my servers. It needs a simple LAMP stack. I think that it requires PHP 5.2.x and might throw errors on 5.3.x. There are some other things running on that server besides Drupal. Including the 'planet' aggregator, the wiki, the custom dieboldotron voting software, possibly others. I don't know of anything that requires anything particularly complicated or non-standard, but when making 'requirements' keep in mind it's not just the Drupal.
[CODE4LIB] Pre confirm when where?
Is there some obvious way I'm not seeing to figure out when and where the preconference sessions are tomorrow? I can't seem to find it anywhere. I don't even know what time to wake up and go looking for them. I'm not even positive they are at the conference hotel?
Re: [CODE4LIB] code4lib 2012 streaming
Is the video also being recorded for putting up on the web later? On 2/1/2012 11:48 AM, Corey A Harper wrote: Dear All, I'll be managing our attempts to ensure code4lib 2012 is streamed. The plan is to stream all plenary portions of the conference via livestream, and I'll post the channel link to IRC, Twitter and this list before the event begins. If all goes well, we'll have a stream for the following (PST) times: * Tues: 9am-12pm, 1pm-2.40, 4-5.20 * Wed: 9am-12pm, 1pm-2.20, 3.50-5.15 * Thu: 9am-12pm The streaming committee has some concerns about the equipment we have access to, so if there is anyone in the community who would volunteer a digital camcorder with a firewire connection known to be compatible with Livestream, we would be in your debt. (Which means I would buy you beer from time to time throughout the conference...) Alternately, I have leads on rental equipment, so please let me know (offlist) if virtual attendees would be willing to donate toward the stream or if onsite attendees would be willing to make a donation at the door. :) Thanks in advance. I will post a link to the livestream channel no later than Monday. Best, -Corey On Mon, Jan 30, 2012 at 11:21 AM, Julia Bauder <julia.bau...@gmail.com> wrote: Speaking of video streaming, is there any information yet about the streaming? E.g., what will be streamed, and where will the links to the stream appear? Julia (who is also eagerly awaiting her streaming + IRC Code4Lib fix) On Mon, Jan 30, 2012 at 9:50 AM, Ranti Junus <ranti.ju...@gmail.com> wrote: Hello All, For those who might not realize it, the code4lib 2012 schedule is up. http://code4lib.org/conference/2012/schedule Once the conference is over, we'll work on adding the links to the presentations. Better yet, those of you who do the presentation can add the link to your own presentation (slides, screencast, code examples, etc.) You'd need to register for an account first, if you haven't done that. Have a great time, everyone! 
I'm looking forward to watching the video streaming and participating in the #code4lib IRC. thanks, ranti on behalf of code4lib 2012 program committee -- Bulk mail. Postage paid.
Re: [CODE4LIB] Koha in the Running
The only thing I can say is be careful of PTFS/LibLime as a vendor. There are other vendors that provide Koha support in the US, however. http://bibwild.wordpress.com/2011/09/20/koha-support-or-hosting-options/ On 1/12/2012 1:19 PM, todd.d.robb...@gmail.com wrote: Hello all, I'm curious to know this list's current thoughts on Koha as an ILS. Where would you rank it among the various options, open source and vendor? Cheers, Tod PS: If this has been addressed recently and I just happened to miss it in the archives: my apologies.
Re: [CODE4LIB] Obvious answer to registration limitations
On 1/11/2012 11:31 AM, Jim Safley wrote: I happen to know that Amanda French, THATCamp Coordinator, is interested in talking with the code4lib coordinators about the distributed conference model. Ah, but if you haven't figured it out yet, there's pretty much no such thing as 'code4lib coordinators'. If some people are interested in this, they should investigate; there's pretty much nobody who has authority to do it, or to tell you that you have authority.
[CODE4LIB] http://openurl.code4lib.org/ MIA
there used to be an http://openurl.code4lib.org/ . It's even linked to from a Wikipedia article on OpenURL. I seem to recall it had some useful stuff rsinger put there. It is now MIA. Anyone know what happened to it, and if it's easy to bring it back? rsinger? No big deal, just curious. Jonathan
[CODE4LIB] re-introducing Umlaut, again
An alpha release of Umlaut 3.0 is now available. Umlaut is an open source front-end for a link resolver, or: Umlaut is a just-in-time aggregator of last mile specific citation services, taking input as OpenURL, and providing an HTML UI as well as an api suite for embedding Umlaut services in other applications. What the heck does this mean? Read more: https://github.com/team-umlaut/umlaut/wiki/What-is-Umlaut-anyway The 3.0 release of Umlaut will not add any new features, but instead modernizes Umlaut's architecture to be based on Rails 3.1+ as an engine gem, and work on modern ruby versions. Lots of unsupported cruft was also removed from the codebase. (Umlaut actually began as a Rails 1.x application!). Why this matters to you is that Umlaut should be easier to install and maintain than it ever was before. See Installation/Getting Started instructions: https://github.com/team-umlaut/umlaut/wiki/Installation This is still an alpha release at present. It likely has some not yet discovered bugs, missing features, or performance issues. But it should be much easier to work with than Umlaut 2.x; if you are looking to get started with Umlaut, definitely start with the 3.x alpha. Alpha tester feedback very welcome; please let me know of any difficulties you have with it, suggestions, questions, etc. Umlaut 3.x source code is available in the umlaut3dev branch in the github project: https://github.com/team-umlaut/umlaut/tree/umlaut3dev (eventually it will move to master).
Re: [CODE4LIB] Q: best practices for *simple* contributor IP/licensing management for open source?
Thanks! I wasn't wanting to invent something new, I was just having trouble finding any lightweight processes via googling, thus I figured I'd ask you all. I'll definitely spend some time checking out the DCO process. Hopefully the documents used in it are licensed (creative commons or something?) such that other projects can re-use 'em? On 12/14/2011 9:56 PM, Dan Scott wrote: Trying to post inline in GroupWise, apologies if it ends up looking like crap... I'm imagining something where each contributor/accepted-pull-request-submitter basically just puts a digital file in the repo, once, that says something like "All the code I've contributed to this repo in past or future, I have the legal ability to release under license X, and I have done so." And then I guess in the License file, instead of saying "copyright Original Author", it would be like "copyright by various contributors, see files in ./contributors to see who." I wouldn't suggest imagining new things when it comes to legal issues ;) I would suggest considering the Developer Certificate of Origin (DCO) process as adopted by the Linux project and others (including Evergreen). When Evergreen was in the process of joining the Software Freedom Conservancy, that process was considered acceptable practice (IIRC, the Software Freedom Law Center did take a glance) - no doubt in part because it is a well-established practice. And talk about lightweight; using the git Signed-off-by tag indicates that you've read the DCO and agree to its terms. For a recent discussion and description of the DCO (in the context of the Project Harmony discussions, which were focused primarily on the much heavier-weight CLA processes), see http://lists.harmonyagreements.org/pipermail/harmony-drafting/2011-August/99.html for example.
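For reference, the entire 'lightweight' part of the DCO workflow Dan describes is one trailer line in the commit message, which `git commit -s` appends automatically from your git config (the name and address here are invented, reusing the hypothetical contributor from my blog post):

```
Signed-off-by: Carrie Coder <carrie@example.org>
```

That line, plus the DCO text sitting in the repo, is the whole paper trail.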
Re: [CODE4LIB] conference voting and registration
On 12/15/2011 6:07 PM, Francis Kayiwa wrote: Perhaps it has reached a point where regional ones will be the way to go as more and more people get left out. I say if you get left out, plan to run your $local code4lib to make up for it. Yep, that'd be the party line. You know Code4Lib was started only, what, 6 years ago, by a bunch of random coders who just said Hey, let's put on a conference, why not? It's gotten harder to put on since then, but the first one was pretty seat-of-the-pants (I understand; I wasn't there, although I was at the 2nd). If you're unhappy that you can't get into code4lib, start your own that you can get into!
Re: [CODE4LIB] conference voting and registration
On 12/15/2011 6:32 PM, Cary Gordon wrote: Pretty much any volunteer position guarantees you a spot. It is up to the organizers to figure out what they need help with. I do not think this is true. Pretty sure Kyle just said as much for this year. I don't think it's been true in past years either. But I think the old record for selling out was 4 days, not one hour, so anyone involved in volunteering probably just signed up the usual way and got in in the past.
[CODE4LIB] Q: best practices for *simple* contributor IP/licensing management for open source?
Also posted on my blog at: http://bibwild.wordpress.com/2011/12/14/practices-for-simple-contributor-management/ So, like many non-huge non-corporate-supported open source projects, many of the open source projects I contribute to go something like this (some of which I was original author, others not): * Someone starts the project in a publicly accessible repo. * If she works for a company, in the best case she got permission from her employer (who may or may not own copyright to code she writes) to release it as open source. * She sticks some open source License file in the repo saying “copyright Carrie Coder” and/or the name of the employer. Okay, so far so good, but then: * She adds someone else as a committer, who starts committing code. And/or accepts pull requests on github etc, committing code by other authors. * Never even thinks about licensing/intellectual property issues. What can go wrong? * Well, the license file probably still says ‘copyright Carrie Coder’ or ‘copyright Acme Inc’, even though the code by other authors has copyright held by them (or their employers). So right away something seems not all on the up and up. * One of those contributors can later be like “Wait, I didn’t mean to release that open source, and I own the copyright, you don’t have my permission to use it, take it out.” * Or worse, one of the contributors’ employers can assert they own the copyright and did not give permission for it to be released open source, and you don’t have permission to use it (and neither does anyone else that’s copied or forked it from you). == Heavy weight solutions So there’s a really heavy-weight solution to this, like the Apache Foundation uses in their Contributor License Agreement. This is something people have to actually print out and sign and mail in. Some agreements like this actually transfer the copyright to some corporate entity, presumably so the project can easily re-license under a different license later. 
(I thought Apache did this, but apparently not). This is kind of too much overhead for a simple non-corporate-sponsored open source project. Who’s going to receive all this mail, and where are they going to keep the contracts? There is no corporate entity to be granted a non-exclusive license to do anything. (And the hypothetical project isn’t nearly so important or popular as to justify trying to get umbrella stewardship from Apache or the Software Freedom Conservancy or whatever. If it were, the Software Freedom Conservancy is a good option, but it’s still too much overhead for the dozens of different tiny-to-medium sized projects anyone may be involved in.) Even as far as individuals go, over the life of the project who the committers are may very well change, and not include the original author(s) anymore. And you don’t want to make someone print out, sign, and wait for you to receive something before accepting their commits; that’s not internet-speed. == Best practices for a simpler solution that’s not nothing? So doing it ‘right’ with that heavy-weight solution is just way too much trouble, so most of us just keep ignoring it. But is there some lighter-weight better-than-nothing probably-good-enough approach? I am curious if anyone can provide examples, ideally lawyer-vetted examples, of doing this much simpler. Most of my projects are MIT-style licensed, which already says “do whatever the heck you want with this code”, so I don’t really care about being able to re-license under a different license later (I don’t think I do? Or maybe even the MIT license would already allow anyone to do that). So I definitely don’t need and really can’t handle paper print-outs. 
I’m imagining something where each contributor/accepted-pull-request-submitter basically just puts a digital file in the repo, once, that says something like “All the code I’ve contributed to this repo in past or future, I have the legal ability to release under license X, and I have done so.” And then I guess in the License file, instead of saying ‘copyright Original Author’, it would be like ‘copyright by various contributors, see files in ./contributors to see who.’ Does something along those lines end up working legally, or is it worthless, no better than just continuing to ignore the problem, so you might as well just continue to ignore the problem? Or if it is potentially workable, does anyone have examples of projects using such a system, ideally with some evidence some lawyer has said it’s worthwhile, including a lawyer-vetted digital contributor agreement? Any ideas?
Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)
On 12/8/2011 9:27 AM, Bill Dueber wrote: To these I would add: * Reuse. The call you're making may be providing data that would be useful in other contexts as well. If you're generating application-specific html, that can't happen. Well, if the other contexts are Javascript, and your HTML is nicely semantically structured with good classes and IDs, it actually ends up being just about as easy getting the data out of HTML with JQuery selectors as it would be with JSON. This is kind of the direction of HTML5 microdata/schema.org --- realizing that properly structured semantic HTML can be pretty damn machine readable, so if you do that you can get human HTML and machine readability with only one representation, instead of having to maintain multiple representations. (In some of the scenarios we're talking about, there are potentially THREE representations to maintain -- server-side generated HTML, server-side generated JSON, AND js to turn the JSON into HTML). But yeah, there are always lots of trade-offs. This particular question I think ends up depending a lot on what choices you've made for the REST of your software stack. Each choice at each level has different trade-offs, but the most important thing is probably to reduce the 'impedance' of making inconsistent choices in different places. That is -- if you're heavy into client-side JS app framework rendering of HTML already, then sure, you've made your choice, stick to it.
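To illustrate the "semantic HTML is itself machine-readable" point: given markup with meaningful classes, pulling the data back out is one selector query. Client-side you'd do this with jQuery; here stdlib REXML stands in for the same idea, on an invented holdings snippet:

```ruby
require "rexml/document"

# An invented, well-structured HTML fragment of library holdings.
html = <<~HTML
  <ul id="holdings">
    <li class="holding"><span class="location">Main Stacks</span>
        <span class="call-number">Z678.9 .A345</span></li>
    <li class="holding"><span class="location">Offsite</span>
        <span class="call-number">QA76.76 .H94</span></li>
  </ul>
HTML

doc = REXML::Document.new(html)
# One XPath query recovers the data, no separate JSON representation needed.
# (jQuery equivalent: $("#holdings .location").map(...))
locations = REXML::XPath.match(doc, "//span[@class='location']").map(&:text)
```

The same single HTML representation serves the human reader and the scraper.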
Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)
On 12/8/2011 11:19 AM, Robert Sanderson wrote: If you blindly include whatever you get back directly into the page, it might include either badly performing, out of date, or potentially malicious script tags that subsequently destroy the page. It's the equivalent of blindly accepting web form input into an SQL query and then wondering where your tables all disappeared off to. Hmm, I'm not sure it's the _equivalent_ -- isn't JS (especially JS you wrote) only going to be getting HTML from servers running software you wrote/controlled? Even if a server is just adding HTML to a page (no JS involved), it COULD be subject to an HTML injection attack, if the server is basing the HTML on user input without properly sanitizing it. I don't think the fact that you've split the logic between the server and the JS necessarily changes things. It's essentially just a 'remote procedure call'. The server is STILL responsible for delivering secure HTML -- exactly as it was when there was no JS involved at all, no? Now, granted, it is a more complicated environment when there's JS involved, so there is more chance for a security bug. But I wouldn't say it's the equivalent of blindly accepting web form input: whether JS is involved or not, if the server is generating HTML, it's the server's job to _not_ blindly accept web form input and stick it into HTML. If you have your JS asking _untrusted sources_ (instead of your own server) for HTML, then that might be a different story.
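The server-side duty described here can be sketched in a few lines of JavaScript (escapeHtml and renderSnippet are hypothetical names; a minimal illustration, not a complete sanitization library -- a real app should use its framework's built-in escaping):

```javascript
// Hypothetical sketch: whichever component generates the HTML fragment
// must escape untrusted input before embedding it -- the same duty it
// has for a full-page render, AJAX or not.
function escapeHtml(untrusted) {
  return String(untrusted)
    .replace(/&/g, '&amp;')   // must come first, so later entities survive
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// The fragment returned to the AJAX caller is built only from escaped input.
function renderSnippet(userQuery) {
  return '<div class="result">' + escapeHtml(userQuery) + '</div>';
}
```

The point is that the escaping lives wherever the HTML is generated; moving that generation behind an AJAX call changes nothing about the responsibility.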
Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)
On Thu, Dec 8, 2011 at 9:11 AM, Godmar Back god...@gmail.com wrote: If we tell newbies (no offense meant by that term) that AJAX means send a request and then insert a chunk of HTML in your DOM, we're short-changing their view of the type of Rich Internet Application (RIA) AJAX today is equated with. Sure, fair point -- I just don't think there's anything wrong with that approach, but I would not want to tell newbies it's all AJAX is, either. But if we tell newbies that javascript communication with the server should _always_ mean sending JSON, and that sending HTML is unfashionable and they should never do it, I also think we're short-changing their view, and giving them cargo-cult trend-following approaches. I think there are plenty of scenarios where either approach is justified and appropriate. It depends on the context, it depends on the rest of your stack, it depends on what's going on. There is no substitute for actual thought and analysis and decision.
Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)
A fair number? Anyone but Godmar? On 12/7/2011 5:02 PM, Nate Vack wrote: OK. So we have a fair number of very smart people saying, in essence, it's better to build your HTML in javascript than send it via ajax and insert it. So, I'm wondering: Why? Is it an issue of data transfer size? Is there a security issue lurking? Is it tedious to bind events to the new / updated code? Something else? I've thought about it a lot and can't think of anything hugely compelling... Thanks! -Nate
Re: [CODE4LIB] marc in json
The reason some of us want marc in JSON has absolutely nothing to do with sending json mime type over http and viewing it in a browser with jsonovich or whatever. (In fact, marc-in-json is possibly LESS human readable than marcxml, or at any rate no more so). It's to escape the limits and ease of corruption of ISO Marc21 binary (maximum length, directory/headers that can easily become corrupt, unpredictable char encoding issues), but with a more compact (in bytes) and quicker/easier-to-parse representation than XML. But yes, it's awesome that we have parsers in several languages that will now read compatible marc-in-json formats. But sometimes you need to send more than one MARC record at once ('at once' can mean a file, or a network/inter-process stream). 'More than one' can range from 2 to dchud's 'a ton'. So the next step is a common format for more than one of these things. It would be useful, yes. From: Daniel Chudnov [daniel.chud...@gmail.com] Sent: Wednesday, December 07, 2011 7:27 PM To: Code for Libraries Cc: Jonathan Rochkind Subject: Re: [CODE4LIB] marc in json On 12/1/2011 3:24 PM, Jonathan Rochkind wrote: newline-delimited is certainly one simple solution, even though the aggregate file is not valid JSON. Does it matter? Not sure if there are any simple solutions that still give you valid JSON, but if there aren't, I'd rather sacrifice valid JSON (that it's unclear if there's any important use case for anyway), than sacrifice simplicity. That's the same question - does it matter? - that I had reading this thread. If you have a ton of records to pack into a file, are the advantages of sending a json mime type over http and viewing it in a browser with jsonovich or whatever worth it when it's a really big file anyway? Seems that having 3-4 parsers that share the exact same idea of how to read/write individual records is the main story, and a great step forward. +1 to y'all for getting this done.
-1 to me for never following through with my half-done pymarc implementation at c4lc '09 or whenever it was. -Dan
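The newline-delimited idea discussed above can be sketched quickly (a hedged JavaScript illustration; the toy records in the test only gesture at a real marc-in-json field layout):

```javascript
// Sketch of newline-delimited marc-in-json: one JSON document per line.
// The aggregate file is not itself valid JSON, but each line is, and a
// streaming reader never needs the whole file in memory at once.
function writeRecords(records) {
  return records.map(function (r) { return JSON.stringify(r); }).join('\n');
}

function readRecords(text) {
  return text.split('\n')
    .filter(function (line) { return line.trim().length > 0; })
    .map(function (line) { return JSON.parse(line); });
}
```

Because JSON.stringify never emits raw newlines inside a document, splitting on '\n' is a safe framing for any number of records, from 2 up to 'a ton'.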
Re: [CODE4LIB] Sending html via ajax -vs- building html in js (was: jQuery Ajax request to update a PHP variable)
Also, I've thought of a good reason myself: performance. If I'm adding an item to a list, it's a better user experience to update the display immediately rather than waiting for the server to send back a 200 OK, and handle the error or timeout case specially. While in general I tend toward the other thing you said -- Does it make sense to replicate the server-side functionality on the client? -- I think what you propose above is legit. MOST people don't write interfaces like that, even in js. That is, an interface that will update the user interface even before/without receiving _anything_ back from the server. (But, in the best cases, produce an error message and/or 'undo' the user interface action if the server does later get back with an error/failure message). So if you're going to do that, then -- it kind of doesn't matter if the server sends back HTML or JSON or anything else, the user interface is updating before/without getting _anything_ from the server. But to the extent the server's response then serves pretty much only as a notification-of-failure or whatever, yeah, JSON is the way to go. So, yeah, if you're going to go all the way there, that's a pretty cool thing (if you can make sure the failure conditions are handled acceptably) -- sure, go for it.
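That update-first pattern can be sketched with the display and server abstracted into plain callbacks (all names here are hypothetical; a real version would also have to guard against the server response racing the local update):

```javascript
// Sketch of an optimistic UI update: change the "display" immediately,
// then undo and report if the server later signals failure.
function optimisticAdd(list, item, saveOnServer, showError) {
  list.push(item);                          // update right away, before any response
  saveOnServer(item, function (err) {
    if (err) {                              // server failed: undo and notify
      list.splice(list.indexOf(item), 1);
      showError(err);
    }
  });
}
```

Note that the server's reply is used only as a success/failure signal, which is exactly why a small JSON status is the natural response format in this style.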
Re: [CODE4LIB] jQuery Ajax request to update a PHP variable
Is it too late to dedicate a presentation slot to a performance? (Whoa, actually, seriously, a Code4Lib talent show would be AWESOME.) The rails conf in baltimore a couple years ago had an evening jam session slot. Sadly, it's really a pain bringing the accordion on an airplane.
Re: [CODE4LIB] jQuery Ajax request to update a PHP variable
I'll admit I haven't spent a lot of time investigating/analyzing this particular application -- it's quite possible an all-JS app is the right choice here. I was just responding to the suggestion that returning HTML to AJAX was out of style and shouldn't be done anymore; with the implication I picked up that (nearly) ALL apps should be almost all JS, use JS templating engines, etc., that this is the right new way to write web apps. I think this sends the wrong message to newbies. It's true that it is very trendy these days to write all-JS apps, which, if they function at all without JS, do so with a completely separate codepath (this is NOT progressive enhancement, although it is a way of ensuring non-JS accessibility). Yeah, it's trendy, but I think it's frequently (though not always, true) the wrong choice when it's done. If you do provide a completely separate codepath for non-JS, this can be harder to maintain than actual progressive enhancement. And pure JS either way can easily make your app a poor citizen of the web -- harder to screen-scrape or spider, harder to find URLs to link to certain parts of the app, etc. (eg, http://www.tbray.org/ongoing/When/201x/2011/02/09/Hash-Blecch ) But, sure, maybe in this particular case pure-JS is a good way to go, I haven't spent enough time looking at or thinking about it to have an opinion. Sure, if you've already started down the path of using a JS templating/view-rendering engine, and that's something you want/need to do anyway, you might as well stick to it, I guess. I just reacted to the suggestion that doing anything _but_ this is out of style, or an old bad way of doing things. If writing apps that produce HTML with progressive enhancement is out of style, then I don't want to be fashionable!
From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Godmar Back [god...@gmail.com] Sent: Tuesday, December 06, 2011 9:34 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] jQuery Ajax request to update a PHP variable On Tue, Dec 6, 2011 at 8:38 AM, Erik Hatcher erikhatc...@mac.com wrote: I'm with jrock on this one. But maybe I'm a luddite that didn't get the memo either (but I am credited for being one of the instrumental folks in the Ajax world, heh - in one or more of the Ajax books out there, us old-timers called it remote scripting). On the in-jest rhetorical front, I'm wondering if referring to oneself as an old-timer helps in defending against insinuations that opposing technological change makes one a defender of the old ;-) But: What I hate hate hate about seeing JSON being returned from a server for the browser to generate the view is stuff like: string = div + some_data_from_JSON + /div; That embodies everything that is wrong about Ajax + JSON. That's exactly why you use new libraries such as knockout.js, to avoid just that. Client-side template engines with automatic data-bindings. Alternatively, AJAX frameworks use JSON and then interpret the returned objects as code. Take a look at the client/server traffic produced by ZK, for instance. As Jonathan said, the server is already generating dynamic HTML... why have it return ... It isn't. There is no server already generating anything; it's a new app Nate is writing. (Unless you count his work of the past two days). The dynamic HTML he's generating is heavily tailored to his JS. There's extremely tight coupling, which now exists across multiple files written in multiple languages. Simply avoidable bad software engineering. That's not even making the computational cost argument that avoiding template processing on the server is cheaper.
And with respect to Jonathan's argument of degradation, a degraded version of his app (presumably) would use a table - or something like that; it'd look nothing like what he showed us yesterday. Heh - the proof of the pudding is in the eating. Why don't we create 2 versions of Nate's app, one with mixed server/client - like the one he's completing now - and I create the client-side based one, and then we compare side by side? I'll work with Nate on that. - Godmar [ I hope it's ok to snip off the rest of the email trail in my reply. ]
Re: [CODE4LIB] jQuery Ajax request to update a PHP variable
On 12/6/2011 1:42 PM, Godmar Back wrote: Current trends certainly go in the opposite direction, look at jQuery Mobile. Hmm, jQuery Mobile still operates on valid and functional HTML delivered by the server. In fact, one of the design goals of jQuery Mobile is indeed to degrade to a non-JS version on feature phones (you know, eg, flip phones with a web browser but probably no javascript). The non-JS version it degrades to is the same HTML that was delivered to the browser either way, just not enhanced by jQuery Mobile. If I were writing AJAX requests for an application targeted mainly at jQuery Mobile, I'd be likely to still have the server deliver HTML to the AJAX request, then have js insert it into the page and trigger jQuery Mobile enhancements on it.
Re: [CODE4LIB] Models of MARC in RDF
On 12/5/2011 1:40 PM, Karen Coyle wrote: This brings up another point that I haven't fully grokked yet: the use of MARC kept library data consistent across the many thousands of libraries that had MARC-based systems. Well, only somewhat consistent, but, yeah. What happens if we move to RDF without a standard? Can we rely on linking to provide interoperability without that rigid consistency of data models? Definitely not. I think this is a real issue. There is no magic to linking or RDF that provides interoperability for free; it's all about the vocabularies/schemata -- whether in MARC or in anything else. (Note that different national/regional library communities used different schemata in MARC, which made interoperability infeasible there. Some still do, although gradually people have moved to Marc21 precisely for this reason, even when Marc21 was less powerful than the MARC variant they started with). That is to say, if we just used MARC's own implicit vocabularies, but output them as RDF, sure, we'd still have consistency, although we wouldn't really _gain_ much. On the other hand, if we switch to a new better vocabulary -- we've got to actually switch to a new better vocabulary. If it's just whatever anyone wants to use, we've made it VERY difficult to share data, which is something pretty darn important to us. Of course, the goal of the RDA process (or one of 'em) was to create a new schema for us to consistently use. That's the library community effort to maintain a common schema that is more powerful and flexible than MARC. If people are using other things instead, apparently that failed, or at least has not yet succeeded.
Re: [CODE4LIB] jQuery Ajax request to update a PHP variable
I'm not sure what you're trying to do makes sense. You'd have to write some PHP code to receive the AJAX request and use it to update the variable. There's nothing in PHP that will do this automatically. However, since, I believe, PHP variables are usually only 'in scope' for the context of the request, I'm not sure _what_ variable you are trying to update. I suppose you could update a session variable, and that might make sense. But it doesn't sound like that's what you're trying to do; it sounds like what you're trying to do is something fundamentally impossible. If you have a PHP script with $searchterm = 'drawing'; in it, then that statement gets executed (setting $searchterm to 'drawing') every time the PHP script gets executed -- which is every time a request is received that executes that PHP script. It doesn't matter what some _other_ request did, and an AJAX request is just some other request. You can't use AJAX to change your source code. (Or, I suppose, there would be SOME crazy way to do that, but you definitely definitely wouldn't want to!). On 12/5/2011 5:08 PM, Nate Hill wrote: If I have in my PHP script a variable... $searchterm = 'Drawing'; And I want to update 'Drawing' to be 'Cooking' w/ a jQuery hover effect on the client side then I need to make an Ajax request, correct? What I can't figure out is what that is supposed to look like... something like... $.ajax({ type: "POST", url: "myfile.php", data: ...not sure how to write what goes here to make it 'Cooking'... }); Any ideas?
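The scoping point can be illustrated (in JavaScript rather than PHP, with a hypothetical per-user session object standing in for PHP's session): every request re-runs the handler and re-initializes its locals, so only explicitly persisted state survives between requests.

```javascript
// Each request re-executes this handler, so `searchterm` is reset every
// time -- just like $searchterm in a PHP script. Only the (hypothetical)
// session object carries state across requests.
function handleRequest(session, params) {
  var searchterm = 'drawing';              // re-initialized on every request
  if (params.newterm) {
    session.searchterm = params.newterm;   // this assignment persists
  }
  return { local: searchterm, stored: session.searchterm };
}
```

So an AJAX request can change what the session stores, but it can never change what the next execution of the script assigns to its own local variable.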
Re: [CODE4LIB] jQuery Ajax request to update a PHP variable
I still like sending HTML back from my server. I guess I never got the message that that was out of style, heh. My server application already has logic for creating HTML from templates, and quite possibly already creates this exact same piece of HTML in some other place, possibly for use with non-AJAX fallbacks, or some other context where that snippet of HTML needs to be rendered. I prefer to re-use this logic that's already on the server, rather than have a duplicate HTML generating/templating system in the javascript too. It's working fine for me, in my use patterns. Now, certainly, if you could eliminate any PHP generation of HTML at all, as I think Godmar is suggesting, and basically have a pure Javascript app -- that would be another approach that avoids duplication of HTML generating logic in both JS and PHP. That sounds fine too. But I'm still writing apps that degrade if you have no JS (including for web spiders that have no JS, for instance), and have nice REST-ish URLs, etc. If that's not a requirement and you can go all JS, then sure. But I wouldn't say that making apps that use progressive enhancement with regard to JS, and degrade fine if you don't have JS, is out of style -- or if it is, it ought not to be! Jonathan On 12/5/2011 6:31 PM, Godmar Back wrote: FWIW, I would not send HTML back to the client in an AJAX request - that style of AJAX fell out of favor years ago. Send back JSON instead and keep the view logic client-side. Consider using a library such as knockout.js. Instead of your current (difficult to maintain) mix of PHP and client-side JavaScript, you'll end up with a static HTML page, a couple of clean JSON services (one for checked-out per subject, and one for the syndetics ids of the first 4 covers), and clean HTML templates. You had earlier asked the question whether to do things client- or server-side -- well, in this example, the correct answer is to do it client-side.
(Yours is a read-only application, where none of the advantages of server-side processing applies.) - Godmar On Mon, Dec 5, 2011 at 6:18 PM, Nate Hill nathanielh...@gmail.com wrote: Something quite like that, my friend! Cheers N On Mon, Dec 5, 2011 at 3:10 PM, Walker, David dwal...@calstate.edu wrote: I gotcha. More information is, indeed, better. ;-) So, on the PHP side, you just need to grab the term from the query string, like this: $searchterm = $_GET['query']; And then in your JavaScript code, you'll send an AJAX request, like: http://www.natehill.net/vizstuff/catscrape.php?query=Cooking Is that what you're looking for? --Dave - David Walker Library Web Services Manager California State University -----Original Message----- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nate Hill Sent: Monday, December 05, 2011 3:00 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] jQuery Ajax request to update a PHP variable As always, I provided too little information. Dave, it's much more involved than that. I'm trying to make a kind of visual browser of popular materials from one of our branches from a .csv file. In order to display book covers for a series of searches by keyword, I query the catalog, scrape out only the syndetics images, and then display 4 of them. The problem is that I've hardcoded in a search for 'Drawing', rather than dynamically pulling the correct term and putting it into the catalog query. Here's the work in process, and I believe it will only work in Chrome right now. http://www.natehill.net/vizstuff/donerightclasses.php I may have a solution; Jason's idea got me part way there. I looked all over the place for that little snippet he sent over! Thanks! On Mon, Dec 5, 2011 at 2:44 PM, Walker, David dwal...@calstate.edu wrote: And I want to update 'Drawing' to be 'Cooking' w/ a jQuery hover effect on the client side then I need to make an Ajax request, correct?
What you probably want to do here, Nate, is simply output the PHP variable in your HTML response, like this: <h1 id="foo"><?php echo $searchterm ?></h1> And then in your JavaScript code, you can manipulate the text through the DOM like this: $('#foo').html('Cooking'); --Dave - David Walker Library Web Services Manager California State University -----Original Message----- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nate Hill Sent: Monday, December 05, 2011 2:09 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] jQuery Ajax request to update a PHP variable If I have in my PHP script a variable... $searchterm = 'Drawing'; And I want to update 'Drawing' to be 'Cooking' w/ a jQuery hover effect on the client side then I need to make an Ajax request, correct? What I can't figure out is what that is supposed to look like... something like... $.ajax({ type: "POST", url: "myfile.php", data: ...not sure how to write what goes here to make it 'Cooking'... }); Any ideas? -- Nate Hill
Re: [CODE4LIB] Pandering for votes for code4lib sessions
I would also mention that we generally expect people voting to either plan to at least potentially attend the conference, or have a prior participation/affiliation/interest in the Code4Lib community. We're not expecting random people to be voting just for the hell of it, or to help out a friend with a proposal. (I also don't think the 'incident' of 'vote pandering' is all that awful, or that there was much reason for the 'perpetrator' to have expected anyone would have a problem with it. I do think that when we have a system of open voting like we have, we should have a statement of what we expect from voters that they have to read before voting, which will keep people from accidentally violating community standards they didn't even know existed.) On 12/1/2011 10:40 AM, Joe Hourcle wrote: On Dec 1, 2011, at 10:29 AM, Ross Singer wrote: On Thu, Dec 1, 2011 at 10:09 AM, Richard, Joel M richar...@si.edu wrote: I feel this whole situation has tainted things somewhat. :( Let's not blow things out of proportion. The aforementioned wrong-doing actually seems pretty innocent (there is backstory in the IRC channel, I'm not going to bring it up here). There is a valid case for advertising interest in your talks (or location, or t-shirt design, etc.), especially in an extremely crowded field, and we've never explicitly set a policy around what is appropriate and what isn't. I think a simple edit on the part of the accused would clear up any ambiguity of intention. Our one known incident was handled privately, but didn't really cause us to address the potential for impropriety. We seem to have quite a bit of support for the splash page. If people will help me draft up the wording -- ideally something we can point to when we want to guide people in the right direction in other forums -- I think we can put this issue to bed. It depends on how harsh you want to be ...
I mean, if you're on the fence about ballot stuffing, you could go with something like: When voting, we expect you to actually read through the list, and pick the best ones. So yes, go ahead and vote for your friends and colleagues, but also read through the others to find other equally good proposals. -Joe