Re: [CODE4LIB] looking for free hosting for html code
On Fri, 22 May 2015, Sarles Patricia (18K500) wrote: [trimmed] I plan to teach coding to my 6th and 12th grade students next school year, and our lab has a mixture of old (2008) and new Macs (2015), so I want to make all the Macs functional for writing code in an editor. My next question is this: I am familiar with free Web creation and hosting sites like Weebly, Wix, Google Sites, Wikispaces, WordPress, and Blogger, but do you know of any free hosting sites that will allow you to plug in your own code, i.e., host your own HTML files?

If it's straight HTML, and doesn't need any sort of text pre-processing (SSI, ASP, JSP, PHP, ColdFusion, etc.), I think that you can use Google Drive. This help page seems to suggest that's true:

https://support.google.com/drive/answer/2881970?hl=en

With all static files, it might also be possible to lay things out so that you could serve it through GitHub or similar. (And teaching them about version control isn't a bad idea, either.)

-Joe
Re: [CODE4LIB] free html editors
On Sat, 16 May 2015, Nathan Rogers wrote: If you do not need all the bells and whistles I would recommend TextWrangler. Free versions should still be available online and its bigger brother BBEdit is overkill for basic web editing.

Actually, the significant difference between TextWrangler and BBEdit is that BBEdit has a number of features that are specifically for web design that don't exist in TextWrangler. Looking at the version of BBEdit 9.1 that I have installed, the majority of it is in the 'Markup' menu:

* Close current tag / Balance tags
* Check syntax
* Check links
* Check accessibility
* Cleaners for GoLive/PageMill/HomePage/DreamWeaver
* Convert to HTML / XHTML
* Menu items to insert tags (which then give what attributes are allowed)
* Menu item to insert CSS
* Preview in ... (gives a list of installed web browsers)

... That said, TextWrangler is still a good free editor -- and I personally rarely ever use the insert tags/CSS items (as I've been writing HTML for ... crap ... I feel old ... 20+ years). But to say that BBEdit is overkill for web editing is just wrong -- the majority of the feature differences are *specifically* for web editing.

-Joe

(disclaimer: for a decade or so, I was a beta tester for Bare Bones. I haven't been using the latest-and-greatest version in a while, as I prefer not to install newer versions of Mac OS X on my personal systems ... basically, since Apple decided to bring all of the iOS annoyances into the desktop. As such, I can't install BBEdit 10 or 11 to see what the differences are in more recent versions.)

-----Original Message-----
From: Sarles Patricia (18K500) psar...@schools.nyc.gov
Sent: 5/16/2015 10:21 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] free html editors

I just this minute subscribed to this list after reading Andromeda Yelton's column in American Libraries from yesterday with great interest, since I would like to teach coding in my high school library next year.
I purchased Andy Harris' HTML5 and CSS3 All-in-One For Dummies for my summer reading, and the free HTML editors he mentions in the book are either not really free or are not compatible with my lab's 2008 Macs. Can anyone recommend a free HTML editor for older Macs? Many thanks and happy to be on this list,

Patricia

Patricia Sarles, MA (Anthropology), MLS
Librarian, Jerome Parker Campus Library
100 Essex Drive, Staten Island, NY 10314
718-370-6900 x1322
psar...@schools.nyc.gov
http://jeromeparkercampus.libguides.com/home

"You can tell whether a man is clever by his answers. You can tell whether a man is wise by his questions." - Naguib Mahfouz

"As a general rule the most successful man in life is the man who has the best information." - Benjamin Disraeli
Re: [CODE4LIB] pdf and web publishing question
On Wed, 29 Apr 2015, Sergio Letuche wrote: Dear all, we have a pdf, that is taken from a to-be-printed pdf, full of tables. The text is split in two columns. How would you suggest we uploaded this pdf to the web? We would like to keep the structure, and split each section taken from the table of contents as a page, but also keep the format, and if possible, serve the content both in an html view, and in a pdf view, based on the preference of the user.

The last time I spoke to someone from AAS about how they extracted their 'Data behind the Table' (aka 'DbT'), it was mostly dependent upon getting something from the author when it was still in a useful format.

The document is made with InDesign CS6, and i do not know in which format i could transform it into

There are a few ways to do tables in InDesign, as it's page layout software. If it's in a single table within a text block, and there's nothing strange within each cell, you should be able to just select the table, copy it, and paste it out into a text editor. You'll get line returns between each row, and tabs between each cell.

If they've placed line returns within the cells, those will get pasted in the middle of the cell, which can really screw you up. For cases like that, it's sometimes easiest to go through the file and paste HTML elements at the beginning of each cell to mark table cells (<td>), so when you export, you have markers as to which are legitimate changes in cells, and which are line returns in the file. I then do post-processing to add in the close cells and the row markers. If I were using BBEdit, I'd do:

Find    : \t<td>
Replace : </td><td>

Find    : \r<td>
Replace : </td></tr>\r<tr><td>

If you're doing it in some other editor that supports search/replace, you should be able to do similar, but you might need to figure out how to specify tabs and line returns in your program.

... and then fix the initial and final lines. (And maybe convert some of the <td>s into <th>s.)

-Joe

ps.
after getting in trouble last week, I should mention that all statements are my own, and I don't represent NASA or any other organizations in this matter.
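The tab/return post-processing described above can also be scripted outside an editor. Here's a rough JavaScript sketch of the same idea (the sample cell contents are invented): tabs become cell boundaries and carriage returns become row boundaries.

```javascript
// Convert text copied out of an InDesign table (tabs between cells,
// carriage returns between rows) into HTML table rows.
function tableFromPasted(text) {
  return text
    .split('\r')                                     // one row per carriage return
    .map(row => '<tr><td>' +
                row.split('\t').join('</td><td>') +  // tabs separate cells
                '</td></tr>')
    .join('\n');
}

// Example: two rows of two cells each
const pasted = 'Name\tValue\rAlpha\t1';
console.log(tableFromPasted(pasted));
// →
// <tr><td>Name</td><td>Value</td></tr>
// <tr><td>Alpha</td><td>1</td></tr>
```

Like the editor approach, this assumes line returns inside cells have already been dealt with; you'd still fix up `<th>`s and any table/row attributes by hand.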
Re: [CODE4LIB] Data Lifecycle Tracking Documentation Tools
On Wed, 11 Mar 2015, davesgonechina wrote: Hi John, Good question - we're taking in XLS, CSV, JSON, XML, and on a bad day PDF of varying file sizes, each requiring different transformation and audit strategies, on both regular and irregular schedules. New batches often feature schema changes requiring modification to ingest procedures, which we're trying to automate as much as possible but obviously require a human chaperone. Mediawiki is our default choice at the moment, but then I would still be looking for a good workflow management model for the structure of the wiki, especially since in my experience wikis are often a graveyard for the best intentions.

A few places that you might try asking this question again, to see if you can find a solution that better answers your question:

The American Society for Information Science & Technology's Research Data Access and Preservation (RDAP) group. It has a lot of librarians and archivists in it, as well as people from various research disciplines:

http://mail.asis.org/mailman/listinfo/rdap
http://www.asis.org/rdap/

... The Research Data Alliance has a number of groups that might be relevant. Here are a few that I suspect are the best fit:

Libraries for Research Data IG
https://rd-alliance.org/groups/libraries-research-data.html

Reproducibility IG
https://rd-alliance.org/groups/reproducibility-ig.html

Research Data Provenance IG
https://rd-alliance.org/groups/research-data-provenance.html

Data Citation WG (as this fits into their 'dynamic data' problem)
https://rd-alliance.org/groups/data-citation-wg.html

('IG' is 'Interest Group', which are long-lived.
'WG' is 'Working Group', which are formed to solve a specific problem and then disband.)

The group 'Publishing Data Workflows' might seem to be appropriate, but it's actually 'Workflows for Publishing Data', not 'Publishing of Data Workflows' (which falls under 'Data Provenance' and 'Data Citation'). There was a presentation at the meeting earlier this week by Andreas Rauber in the Data Citation group on workflows using git or SQL databases to be able to track appending or modification for CSV and similar ASCII files.

... Also, I would consider this to be on-topic for Stack Exchange's Open Data site (and I'm one of the moderators for the site):

http://opendata.stackexchange.com/

-Joe

On Tue, Mar 10, 2015 at 8:10 PM, Scancella, John j...@loc.gov wrote: Dave, How are you getting the metadata streams? Are they actual stream objects, or files, or database dumps, etc? As for the tools, I have used a number of the ones you listed below. I personally prefer JIRA (and it is free for non-profits). If you are OK with editing in wiki syntax I would recommend MediaWiki (it is what powers Wikipedia). You could also take a look at continuous deployment technologies like virtual machines (VirtualBox), Linux containers (Docker), and rapid deployment tools (Ansible, Salt). Of course if you are doing lots of code changes you will want to test all of this continually (Jenkins). John Scancella Library of Congress, OSI

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of davesgonechina
Sent: Tuesday, March 10, 2015 6:05 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Data Lifecycle Tracking Documentation Tools

Hi all, One of my projects involves harvesting, cleaning and transforming steady streams of metadata from numerous publishers. It's an infinite loop, but every cycle can be a little bit or significantly different.
Many issue tracking tools are designed for a linear progression that ends in deployment, not a circular workflow, and I've not hit upon a tool or use strategy that really fits. The best illustration I've found so far of the type of workflow I'm talking about is the DCC Curation Lifecycle Model:

http://www.dcc.ac.uk/sites/default/files/documents/publications/DCCLifecycle.pdf

Here are some things I've tried or thought about trying:

- Git comments
- GitHub Issues
- MySQL comments
- Bash script logs
- JIRA
- Trac
- Trello
- Wiki
- Unfuddle
- Redmine
- Zendesk
- Request Tracker
- Basecamp
- Asana

Thoughts? Dave
Re: [CODE4LIB] Get It Services / Cart
On Fri, 6 Mar 2015, Smith, Steelsen wrote: Hi All, I'm new to this list, so if there are any conventions I'm ignoring I'd appreciate someone letting me know. I'm working on a project to allow requests that will go to multiple systems to be aggregated in a requesting interface. It would be implemented as an independent application, allow a shopping list of items to be added, and be able to perform some back end business logic (availability checking, metadata enrichment, etc.). This seems like a very common use case, so I'm surprised that I've had trouble finding anyone who has published an application that works like this - the closest I've found being Umlaut, which doesn't seem to support multiple simultaneous requesting (although I couldn't get as far as a request in any sample system to be certain). Is anyone on the list aware of such a project?

I'm aware of such a project. And it's been the bane of my existence for 5+ years. I've actually asked my boss to fire me a few times so that I don't have to support it, as it's more like babysitting than anything else. However, it's for science archives, not libraries, and it only really supports objects that are stored in FITS (Flexible Image Transport System). I cannot in good faith recommend that anyone use it. I've even started up a mailing list for IT people in solar physics archives so that I can try to make sure that we fight against implementing it for any new scientific missions.

-Joe

ps. it's not an independent application ... it's the service that does the 'metadata enrichment', because they store all of the data without any metadata, so that no one who isn't running their custom software can actually make use of it ... and then I manage the system that does the aggregation, and someone else wrote the logic for availability checking (which seems to have decided to crap itself last month, shortly after the programmer who wrote it 5+ years ago moved on to another job).

pps.
if you're going to implement something like this, I'd recommend using Metalink for the 'shopping cart' sort of stuff, and hand off to some dedicated download manager. For our community, an even better option would be BagIt with a fetch.txt file, but the client-side tool support just isn't out there.
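For reference, a BagIt fetch.txt is just one "URL LENGTH FILENAME" line per remote file, with "-" where the length is unknown. A quick JavaScript sketch of generating one for a 'cart' of selected items -- the URLs and paths below are invented for illustration:

```javascript
// Build the body of a BagIt fetch.txt: one "URL LENGTH FILENAME" line
// per remote file; length is "-" when unknown.
function fetchTxt(entries) {
  return entries
    .map(e => [e.url, e.length == null ? '-' : e.length, e.path].join(' '))
    .join('\n') + '\n';
}

// Hypothetical cart contents for a science archive
const body = fetchTxt([
  { url: 'https://example.org/data/image001.fits', length: 2048, path: 'data/image001.fits' },
  { url: 'https://example.org/data/image002.fits', path: 'data/image002.fits' },
]);
console.log(body);
```

A client with BagIt support could then fill the bag by downloading each listed file, which is exactly the hand-off-to-a-download-manager pattern described above.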
[CODE4LIB] Fwd: [CNI-ANNOUNCE] Call for Participation: Security and Privacy Agenda Workshop, March 3, 2015
I saw 'hardening OAI-PMH', and thought this might be of interest to this group.

-Joe

Begin forwarded message:

From: Clifford Lynch cni-annou...@cni.org
Date: February 6, 2015 4:16:15 PM EST
To: CNI-ANNOUNCE -- News from the Coalition cni-annou...@cni.org
Subject: [CNI-ANNOUNCE] Call for Participation: Security and Privacy Agenda Workshop, March 3, 2015

On March 3, CNI is going to host a small workshop to develop a near-term agenda for work needed to improve security and privacy in systems related to scholarly communication and access to scholarly information resources. The focus will be largely technical, and will emphasize setting an agenda for various groups to address needs and problems, rather than details of how to solve specific problems. I've deliberately left the agenda scoped rather broadly, and I want to look at everything from encouraging wider and more routine use of HTTPS to hardening some popular protocols like OAI-PMH. Technical identity management related issues are also in scope, as are some discussions about appropriate levels of assurance.

We'll meet in Washington DC from 10AM-3PM on Tuesday, March 3. CNI will provide refreshments and lunch, but we will not cover travel expenses. This will be a small workshop, and we will do our best to balance for different perspectives. If you are interested in attending, please send an email to Joan Lippincott (j...@cni.org) with a brief summary of the expertise and perspective you would bring to the meeting. Given that the meeting is only about a month away, I'll send out a first batch of acceptances by Feb 13, and after that respond to later applications as they come in. I'll provide more detailed logistical information with acceptances.

There will be a public report from the meeting, and for those who cannot attend, suggestions and comments are welcome going into the meeting. If you have questions, please be in touch with me by email.
Clifford Lynch
Director, CNI
cl...@cni.org
Re: [CODE4LIB] Plagiarism checker
On Jan 23, 2015, at 9:44 AM, Mark A. Matienzo wrote: I believe Turnitin and SafeAssign both compare the text of submissions against external sources (e.g., SafeAssign uses ABI/INFORM, among others). I am not certain if they compare submissions against each other.

My understanding of TurnItIn, at least initially, was that they built their corpus on existing submissions. (They had some deals with universities back when they started up to use their service for free or cheap, so that they could build up their corpus.)

However, if you're looking for something along the lines of what Dre suggests, you could use ssdeep, which is an implementation of a piecewise hashing algorithm [0]. The issue with that is that you would have to assume that all students would probably be using the same file format. You could also use something like Tika to extract the text content from all the submissions, and then compare them against each other.

I'd agree on extracting the text. MS Word used to store documents as strings of edits, making it difficult to compare two documents for similarity without parsing the format. (I don't know if they still do this in .docx)

-Joe
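As a rough illustration of the extract-text-then-compare step (this is not ssdeep's piecewise hashing, just a crude stand-in), here's a sketch that scores two extracted texts by Jaccard similarity over their word sets; the sample strings are invented:

```javascript
// Crude similarity check between two extracted texts: Jaccard index
// over lowercased word sets. Enough to flag near-identical submissions,
// though real fuzzy hashing (ssdeep) is order-sensitive and more robust.
function jaccard(a, b) {
  const words = s => new Set(s.toLowerCase().match(/[a-z0-9]+/g) || []);
  const A = words(a), B = words(b);
  const inter = [...A].filter(w => B.has(w)).length;
  const union = new Set([...A, ...B]).size;
  return union === 0 ? 1 : inter / union;
}

// Identical word sets score 1; disjoint texts score 0.
console.log(jaccard('The quick brown fox', 'the QUICK brown fox')); // 1
```

You'd run every pair of submissions through something like this after pulling plain text out with Tika, and flag pairs above some threshold for human review.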
Re: [CODE4LIB] Lost thread - centrally hosted global navbar
On Jan 10, 2015, at 8:37 PM, Jason Bengtson wrote: Do you have access to the server-side? Server-side scripting languages (and the frameworks and CMSes built with them) have provisions for just this sort of thing. Include statements in PHP and cfinclude tags in ColdFusion, for example. Every Content Management System I've used has had a provision to create reusable content that can be added to multiple pages as blocks or via shortcodes. If you can use server-side script I recommend it; that's really the cleaner way to do this sort of thing. Another option that avoids something like iframes is to create a javascript file that dynamically creates the navbar in your pages. Just include the javascript file in any page you want the toolbar to appear in. That method adds some overhead to your pages, but it's perfectly workable if server-side script is out of reach.

The javascript trick works pretty well when you have people mirroring your site via wget (as they won't run the js, and thus won't try to retrieve all of the images that are used to make the page pretty every time they run their mirror job). You can see it in action at:

http://stereo-ssc.nascom.nasa.gov/data/ins_data/

The drawback is that some browsers have a bit of a flash when they first hit the page. It might be possible to mitigate the problem by having the HTML set the background to whatever color the background will be changed to, but I don't quite have the flexibility to do that in my case, due to how the page is being generated.

-Joe

ps. It's been years since I've done ColdFusion, but I remember there being a file that you could set that would automatically get inserted into every page in that directory, or in sub-directories. I want to say it was often used for authentication and such, but it might be possible to use it for this. If nothing else, you could load the header into a variable, and have the pages just print the variable in the right location.
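A minimal sketch of that kind of shared-navbar script, with placeholder link targets: the markup is built as a string in one file, and each page injects it on load (so non-JS mirroring tools simply skip it).

```javascript
// navbar.js -- one shared navbar definition for every page.
// The link targets below are placeholders, not a real site's structure.
const NAV_LINKS = [
  { href: '/',       label: 'Home' },
  { href: '/data/',  label: 'Data' },
  { href: '/about/', label: 'About' },
];

// Build the navbar markup as a string, so pages can drop it in with
// innerHTML (or document.write for very old setups).
function navbarHtml(links) {
  const items = links.map(({ href, label }) => `<a href="${href}">${label}</a>`);
  return '<nav class="site-nav">' + items.join(' | ') + '</nav>';
}

// In the browser, each page would include this file and run something like:
//   document.getElementById('nav-slot').innerHTML = navbarHtml(NAV_LINKS);
```

To reduce the flash-of-unstyled-page mentioned above, the placeholder element can reserve the navbar's height in CSS so the injection doesn't reflow the page.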
Re: [CODE4LIB] linked data and open access
On Dec 19, 2014, at 9:48 AM, Eric Lease Morgan wrote: I don't know about y'all, but it seems to me that things like linked data and open access are larger trends in Europe than here in the United States. Is there a larger commitment to sharing in Europe when compared to the United States? If so, is this a factor based on the nonexistence of a national library in the United States? Is this your perception too? --Eric Morgan

I can't comment on the linked data side of things so much, but in following all of the comments from the US's push for opening up access to federally funded research, I'd have to say that capitalism and protectionist attitudes from 'publishers' seem to be a major factor in the fight against open access.

I've placed 'publishers' in quotes, because groups that I would've considered to have been 'scientific societies' submitted comments against the opening up of the research, and in the case of AGU, referred to themselves multiple times as a 'publisher' and never as a 'society'.[1] I dropped my membership when I realized that.

Statements from the 2011 RFI from OSTP:
http://www.whitehouse.gov/administration/eop/ostp/library/publicaccess

Statements from the 2013 NAS meetings:
http://sites.nationalacademies.org/DBASSE/CurrentProjects/DBASSE_082378
(note that I made statements at the National Academies meeting on opening access to federally funded research data)

[1] http://www.whitehouse.gov/sites/default/files/microsites/ostp/scholarly-pubs-(%23065).pdf

-Joe

ps. I still haven't seen what any of the official policies are (last year's government shutdown delayed the White House response to their submissions, and I have no idea if they've finally publicized anything) ... but I hosted a session at the AGU last year, where we had representatives from NOAA, NASA and USGS speak about what they were doing, and the NASA policy seemed to be heavily influenced by the more senior scientists ... who were more likely to be editors of journals.
They haven't updated their 'Data Information Policy' (http://science.nasa.gov/earth-science/earth-science-data/data-information-policy/) page in over three years.
Re: [CODE4LIB] linked data and open access
On Dec 19, 2014, at 12:28 PM, Kyle Banerjee wrote: On Fri, Dec 19, 2014 at 7:57 AM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: I can't comment on the linked data side of things so much, but in following all of the comments from the US's push for opening up access to federally funded research, I'd have to say that capitalism and protectionist attitudes from 'publishers' seem to be a major factor in the fight against open access.

That definitely doesn't help. But quite a few players own this problem. Pockets where there is a culture of openness can be found, but at least in my neck of the woods, researchers as a group fear being scooped and face incentive structures that discourage openness. You get brownie points for driving your metrics up as well as being first and novel, not for investing huge amounts of time structuring your data so that everyone else can look great using what you created.

There's been a lot of discussion of this problem over the last ~5 years or so. The general consensus is that:

1. We need better ways for people to acknowledge data being re-used.
   a. The need for standards for citation, so that we can use bibliometric tools to extract the relationships
   b. The need for a citation specifically to the data, and not a proxy (e.g., the first results or instrument papers), to show that maintaining the data is still important.
   c. Shift the work in determining how to acknowledge the data from the re-user back to the distributor of the data.

2. We need standards to make it easier for researchers to re-use data. Findability, accessibility of the file formats, documentation of data, etc.

3. We need institutions to change their culture to acknowledge that producing really good data is as important for the research ecosystem as writing papers. This includes decisions regarding awarding grants, tenure, promotion, etc.
Much of this is covered by the Joint Declaration of Data Citation Principles:

https://force11.org/datacitation

There are currently two sub-groups; one working on dissemination, to make groups aware of the issues and the principles, and another (that I'm on) working on issues of implementation. We actually just submitted something to PeerJ this week, on how to deal with 'machine actionable' landing pages:

https://peerj.com/preprints/697/

(I've been pushing for one of the sections to be clarified, so feel free to comment ... if enough other people agree w/ me, maybe I can get my changes into the final paper)

Libraries face their own challenges in this regard. Even if we ignore that many libraries and library organizations are pretty tight with what they consider their intellectual property, there is still the issue that most of us are also under pressure to demonstrate impact, originality, etc. As a practical matter, this means we are rewarded for contributing to churn, imposing branding, keeping things siloed and local, etc., so that we can generate metrics that show how relevant we are to those who pay our bills, even if we could do much more good by contributing to community initiatives.

But ... one of the other things that libraries do is make stuff available to the public. So as most aren't dealing with data yet, getting that into their IRs means that they've then got more stuff that they can serve to possibly help push up their metrics. (Not that I think those metrics are good ... I'd rather *not* transfer data that people aren't going to use, but the bean counters like those graphs of data transfer going up ... we just don't mention that it's groups in China attempting to mirror our entire holdings.)

With regards to our local data initiatives, we don't push the open data aspect, because this has practically no traction with researchers.
What does interest them is meeting funder and publisher requirements, as well as being able to transport their own research from one environment to another so that they can use it. The takeaway from this is that leadership from the top does matter.

The current strategy is to push for the scientific societies to implement policies requiring that the data be opened if it's to be used as evidence in a journal article. There are some exceptions*, but the recommendations so far are to still set up the landing page to make the data citable, but instead of linking directly to the data, provide an explanation of what the procedures are to request access. Through this, we have the requirement be that if the researcher wants to publish their paper ... they have to provide the data, too.

We've run into a few interesting snags, though. For instance, some are only requiring the data that directly supports the paper to be published; this means that we have no way of knowing if they cherry-picked their data and the larger collection might have evidence to refute their findings. The 'publishers' seem
Re: [CODE4LIB] looking for a good PHP table-manipulating class
On Dec 11, 2014, at 4:32 PM, Ken Irwin wrote: Hi folks, I'm hoping to find a PHP class that is designed to display data in tables, preferably able to do two things: 1. Swap the x- and y-axis, so you could arbitrarily show the table with y=Puppies, x=Kittens or y=Kittens, x=Puppies 2. Display the table either using plain text columns or formatted html I feel confident that in a world of 7 billion people, someone must have wanted this before.

There's much more work being done in javascript tables these days than in backend software. Unfortunately, I've never found a good matrix to compare features between the various 'data table' or 'data grid' implementations. I did start evaluating a lot a while back, but the problem is that you have to go through them all to figure out what the different features might be, and then go back through a second time to see which ones might implement those features. The second problem is that some are implemented as part of a given JS framework (eg, ExtJS), while other toolkits might have a dozen different 'data table' implementations (eg, jQuery).

-Joe

ps. and as this wasn't a feature that I was looking for, this wasn't something that I tracked when I did my analysis. I was looking for things like scaling to a thousand rows w/ 20 columns, rearranging/hiding columns, etc.
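The two requested behaviors are simple enough to sketch directly. Ken asked for PHP; this is the same idea in JavaScript (where the thread points anyway), with invented data: transpose the rows to swap the axes, then render either as plain text or as HTML.

```javascript
// Swap a table's x- and y-axes: row r, column c becomes row c, column r.
function transpose(rows) {
  return rows[0].map((_, c) => rows.map(row => row[c]));
}

// Render as plain-text columns (tab-separated).
function renderText(rows) {
  return rows.map(row => row.join('\t')).join('\n');
}

// Render as a formatted HTML table.
function renderHtml(rows) {
  return '<table>' +
    rows.map(row => '<tr><td>' + row.join('</td><td>') + '</td></tr>').join('') +
    '</table>';
}

// Invented example data
const pets = [
  ['',        'Puppies', 'Kittens'],
  ['Adopted', 12,        9],
];
console.log(renderText(pets));
console.log(renderHtml(transpose(pets)));  // same data, axes swapped
```

A PHP version would be nearly line-for-line the same (array_map plus implode), so rolling your own may be quicker than hunting for a class.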
[CODE4LIB] Fwd: [Rdap] Call for Editors for IMLS-funded DataQ Project (Due 1/30/15)
A few months ago, there was a discussion about trying to make a libraries site on Stack Exchange. For those that were interested, this might be an interesting project to participate in, although their scope isn't necessarily all library questions.

-Joe

Begin forwarded message:

From: Andrew Johnson andrew.m.john...@colorado.edu
Date: December 11, 2014 5:34:37 PM EST
To: r...@mail.asis.org
Subject: [Rdap] Call for Editors for IMLS-funded DataQ Project (Due 1/30/15)

Call for Editors for the DataQ Project

The University of Colorado Boulder Libraries, the Greater Western Library Alliance, and the Great Plains Network are excited to announce that we have received funding from the Institute of Museum and Library Services to develop an online resource called DataQ, which will function as a collaborative knowledge-base of research data questions and answers curated for and by the library community. Library staff from any institution may submit questions on research data topics to the DataQ website, where questions will then be both crowd-sourced and reviewed by an Editorial Team of experts. Answers to these questions, from both the community and the Editorial Team, will be posted to the DataQ website and will include links to resources and tools, best practices, and practical approaches to working with researchers to address specific research data issues.

We are currently seeking applications for our Editorial Team. If you are interested in becoming a DataQ Editor, please fill out the application form here by January 30, 2015: http://bit.ly/DataQApp. DataQ Editors will be responsible for helping to identify initial content, providing expert feedback on questions from DataQ users, and developing policies and procedures for answering questions. The Editorial Team will participate in regular virtual meetings and attend one in-person meeting in Kansas City, MO in late May.
Each Editor will receive a $1000 stipend to help cover travel costs and time contributed to the project. The initial term for each Editor will last until October 31, 2015, when the grant period ends, but there may be opportunities to continue serving beyond the life of the grant based on the outcome of the project. Additional opportunities to contribute to DataQ will be announced soon. For all of the latest information about DataQ, please follow @ResearchDataQ (http://twitter.com/researchdataq) on Twitter. Please send any questions about DataQ to the project Co-PIs, Andrew Johnson (andrew.m.john...@colorado.edu) and Megan Bresnahan (megan.bresna...@colorado.edu).

Andrew Johnson
Assistant Professor; Research Data Librarian
University of Colorado Boulder Libraries
Phone: 303-492-6102
Website: https://data.colorado.edu/
ORCID iD: http://orcid.org/-0002-7952-6536
Impactstory Profile: https://impactstory.org/AndrewJohnson

___
Rdap mailing list
r...@mail.asis.org
http://mail.asis.org/mailman/listinfo/rdap
Re: [CODE4LIB] Balancing security and privacy with EZproxy
On Nov 19, 2014, at 11:47 PM, Dan Scott wrote: On Wed, Nov 19, 2014 at 4:06 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: There are a number of technical approaches that could be used to identify which accounts have been compromised. But it's easier to just make the problem go away by setting usage limits so EZP locks the account out after it downloads too much. But EZProxy still doesn't let you set limits based on the type of download. You therefore have two very blunt sledge hammers with UsageLimit: - # of downloads (-transfers) - # of megabytes downloaded (-MB) [trimmed]

I'm not familiar with EZProxy, but if it's running on an OS that you have control of (and not some vendor-locked appliance), you likely have other tools that you can use for rate limiting.

For instance, I have a CGI on a webserver that's horribly resource intensive and takes quite a while to run. Most people wonder what's taking so long, and reload multiple times, thinking the process is stuck ... or they know what's going on, and will open up multiple instances in different tabs to reduce their wait. So I have the following iptables rule:

-A INPUT -p tcp -m tcp --dport 80 --tcp-flags FIN,SYN,RST,ACK SYN -m connlimit --connlimit-above 5 --connlimit-mask 32 -j REJECT --reject-with tcp-reset

I can't remember if it starts blocking the 5th connection, or once they're above 5, but it keeps us from having one IP address with 20+ copies running at once.

... And back from my days of managing directory servers -- brute forcing was a horrible problem with single sign-on. We didn't have a good way to temporarily lock accounts for repeatedly failed passwords at the directory server (which would also cause a denial of service, as you could lock someone else out) ... so it had to be up to each application to implement ... which of course, they didn't. ... so you'd have something like a webpage that required authentication that someone could brute force ...
and then they'd also get access to a shell account and whatever else that person had authorization for. -Joe (and on that 'wow, I feel old' note ... it's been 10+ years since I've had to manage an LDAP server ... it's possible that they've gotten better about that issue since then)
Re: [CODE4LIB] Wednesday afternoon reverie
On Nov 6, 2014, at 5:17 PM, Karen Coyle wrote: Cynthia, it's been a while but I wanted to give you feedback... Ranking on importance based on library ownership and/or circulation is something that I've seen discussed but not implemented -- mainly due to the difficulty of gathering the data from library systems. But it seems like an obvious way to rank results, IMO. Too bad that one has to pay for BISAC headings. They tend to mirror the headings in bookstores (and ebook stores) that people might be familiar with. They capture fiction topics, especially, in a way that resonates with some users (topics like Teen Paranormal Romance).

I believe that they were created specifically for bookstores. The problem is that the publishers (likely with support of the authors) get to decide where stuff should be filed. As I help manage the Friends' bookstore at my local library branch, I've seen Creation Science (on an 'E' book with archaeologists and dinosaur bones on the cover) and a few others that made me cringe.

-Joe

ps. I haven't seen Teen Paranormal Romance specifically as a heading (although yes, I've seen those books) ... I'm waiting for Amish Paranormal Romance (although I don't know if Amish Romance is an official BISAC heading).

pps. The nature of the BISAC headings makes them less useful for determining if a book's actually on the shelves. It's fine for general browsing, but it reminds me of the filing system from Black Books (from 0:40 to ~1:45): https://www.youtube.com/watch?v=RZVDr4r9HEw

On 10/22/14 1:25 PM, Harper, Cynthia wrote: So I'm deleting all the Bisac subject headings (650_7|2bisacsh) from our ebook records - they were deemed not to be useful, especially as it would entail a for-fee indexing change to make them clickable. But I'm thinking if we someday have a discovery system, they'll be useful as a means for broader-to-narrower term browsing that won't require translation to English, as would call number ranges.
As I watch the system slowly chunk through them, I think about how library collections and catalogs facilitate jumping to the most specific subjects, but browsing is something of an afterthought. What if we could set a ranking score for the importance of an item in browsing, based on circulation data - authors ranked by the relative circulation of all their works, same for series, latest edition of a multi-edition work given higher ranking, etc.? Then have a means to set the threshold importance value you want to look at, and browse through these general Bisac terms, or the classification? Or have a facet for importance threshold. I see Bisac sometimes has a broadness/narrowness facet (overview) - wonder how consistently that's applied, enough to be useful? Guess those rankings would be very expensive in compute time. Well, back to the deletions. Cindy Harper Electronic Services and Serials Librarian Virginia Theological Seminary 3737 Seminary Road Alexandria VA 22304 703-461-1794 char...@vts.edu -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: +1-510-435-8234 skype: kcoylenet/+1-510-984-3600
Re: [CODE4LIB] Past Conference T-Shirts?
On Nov 6, 2014, at 8:11 PM, Josh Wilson wrote: The Code4Lib version is clearly of superior quality, design, and provenance, but I actually thought this was an internet thing of unknown origin? e.g., http://www.cafepress.com/mf/17182533/metadata_tshirt http://www.redbubble.com/people/charlizeart/works/1280530-metadata?p=t-shirt Perhaps a case of multiple discovery.

I don't know when I first saw it, but I know variations of the Helvetica shirt were first: http://welovetypography.com/post/10993/

-Joe

ps. being a font snob ... the Cafe Press shirt just has horrible kerning between the 'T' and 'A'. The Code4Lib one is better, but misses the little bevel on the 'T' to have it mate up tight to the 'A', and the kerning between 'A' and 'D' could be a bit tighter. The Helvetica shirt I linked to clearly slid the letters around (as the 'T' has the beveled edge to mate up to the now-missing 'A'.)

On Thu, Nov 6, 2014 at 5:37 PM, Goben, Abigail ago...@uic.edu wrote: Joshua Gomez did the original, correct? http://wiki.code4lib.org/index.php/2013_t-shirt_design_proposals Thanks for working on this Riley! I know several people who will be very happy to be able to purchase it.

On 11/6/2014 2:48 PM, Riley Childs wrote: Someone sent me the design, if you did it please let me know so I can give attribution. //Riley Sent from my Windows Phone -- Riley Childs Senior Charlotte United Christian Academy Library Services Administrator IT Services Administrator (704) 537-0331x101 (704) 497-2086 rileychilds.net @rowdychildren I use Lync (select External Contact on any XMPP chat client)

From: todd.d.robb...@gmail.com Sent: 11/6/2014 3:41 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Past Conference T-Shirts? Joshua, That is so gnarly!!!
On Thu, Nov 6, 2014 at 1:13 PM, Riley Childs rchi...@cucawarriors.com wrote: Ok, will do. I didn't actually design it; it may take a little time while I dig through download folders from my backups, I will try and get to it next week //Riley -- Riley Childs Senior Charlotte United Christian Academy IT Services Administrator Library Services Administrator https://rileychilds.net cell: +1 (704) 497-2086 office: +1 (704) 537-0331x101 twitter: @rowdychildren Checkout our new Online Library Catalog: https://catalog.cucawarriors.com Proudly sent in plain text

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Jason Stirnaman Sent: Thursday, November 06, 2014 2:46 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Past Conference T-Shirts? Riley, Could you fix the spelling on More then just books in the store? Should be More than just books Thanks, Jason Jason Stirnaman Lead, Library Technology Services University of Kansas Medical Center jstirna...@kumc.edu 913-588-7319

On Nov 6, 2014, at 1:04 PM, Riley Childs rchi...@cucawarriors.com wrote: Yes, but I have been unsuccessful thus far in getting a vector file/high res transparent image. If you have one and can send please do so and I will put it up on the code4lib store (code4lib.spreadshirt.com). Thanks //Riley -- Riley Childs Senior Charlotte United Christian Academy IT Services Administrator Library Services Administrator https://rileychilds.net cell: +1 (704) 497-2086 office: +1 (704) 537-0331x101 twitter: @rowdychildren Checkout our new Online Library Catalog: https://catalog.cucawarriors.com Proudly sent in plain text

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Goben, Abigail Sent: Thursday, November 06, 2014 1:10 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Past Conference T-Shirts? My Metadata t-shirt, from C4L 2013, has been getting some interest/requests of where others can purchase.
I thought we'd talked about that here. Was there a store ever finally set up that I could refer people to? Abigail -- Abigail Goben, MLS Assistant Information Services Librarian and Assistant Professor Library of the Health Sciences University of Illinois at Chicago 1750 W. Polk (MC 763) Chicago, IL 60612 ago...@uic.edu -- Tod Robbins Digital Asset Manager, MLIS todrobbins.com | @todrobbins http://www.twitter.com/#!/todrobbins
Re: [CODE4LIB] Stack Overflow
On Nov 4, 2014, at 9:12 AM, Schulkins, Joe wrote: Presumably I'm not alone in this, but I find Stack Overflow a valuable resource for various bits of web development and I was wondering whether anyone has given any thought about proposing a Library Technology site to Stack Exchange's Area 51 (http://area51.stackexchange.com/)? Doing a search of the proposals shows there was one for 'Libraries and Information Science' but this closed 2 years ago as it didn't reach the required levels during the beta phase.

Some history on the Stack Exchange site:

1. Before 'Stack Exchange 2.0', they used to let other sites pay them to host QA sites. There had been a library-focused site on Unshelved: http://www.unshelved.com/2010-7-15

2. We got *hundreds* of people from Unshelved Answers to sign up on Area 51 ... but they wouldn't start up the site unless enough people with high enough reputation on existing 'Stack Exchange 2.0' sites expressed interest, claiming that they needed sufficient people with knowledge of the system. I tried lobbying for them to count people w/ experience from Unshelved Answers, but they wouldn't do it.

3. It took over a year for the 'Libraries' proposal to get enough support to be accepted; by then, I assume most library folks had moved on.

4. They then named the site 'Library and Information Science', not 'Libraries'. http://discuss.area51.stackexchange.com/q/3846/5710 After my complaining, they changed it to 'Libraries and Information Science', but there was still a major problem:

5. As if all of the rest wasn't bad enough, we then had a bunch of non-library people closing questions because there wasn't a single definite answer, which was a large number of the questions on Unshelved Answers ...
and most of the 'example' questions were in that category as well: https://web.archive.org/web/20120325030045/http://area51.stackexchange.com/proposals/12432/libraries-information-science

The reason I think this might be useful is that instead of individual places to go for help or raise questions (i.e. various mailing lists) there could be a 'one-stop shop' approach from which we could get help with LMSs, discovery layers, repository software etc. I appreciate though that certain vendors aren't particularly open (yes, Innovative I'm looking at you here) and might not like these things being discussed on an open forum. Does anybody else think this might be useful? Would such a forum be shot down by all the vendors' legalese wrapped up in their Terms and Conditions? Or are you happy with the way you go about getting help?

I think that the Stack Exchange culture and policies make it a bad fit for our community. Yes, there is a need for such a site, but the practice of immediately closing questions that lack a single clear answer is a *huge* problem. If questions were easily answered, we'd have done the research and answered them ourselves (most of us have LIS degrees and know how to research things!).

You might also be able to get support from Unshelved again, and if we as a community can put together a site, have them brand it as 'Unshelved Answers' again.

-Joe

ps. I'm currently the moderator of OpenData.StackExchange.com; I was previously the moderator of Seasoned Advice (aka cooking.stackexchange.com)

pps. I also objected when they changed the name of the 'databases' proposal to 'database administrators', which many of us felt narrowed the scope dramatically ( http://meta.dba.stackexchange.com/q/1/51 ; http://meta.dba.stackexchange.com/q/11/51 ). I don't even bother with the site these days.
Re: [CODE4LIB] Stack Overflow
On Nov 4, 2014, at 1:33 PM, Mark Pernotto wrote: I think all of this is really useful. I'd be lying if I said I didn't get a lot of great ideas and results from StackOverflow. However, I've been burned quite a bit as well - deprecated code, inaccurate results, or just the wrong answer gets accepted. There seems to be such a push to 'accept as answer' that no one gives a second thought to alternative solutions. Because one size doesn't fit all - I think we all know that.

I hate it when I answer something 15-20 min after someone posts a question, and they flag it as the 'correct' answer. Someone else might have some better response.

* I made the mistake of accepting an answer without fully testing it: http://dba.stackexchange.com/q/30/51 Notice how no one else gave an alternative, as it works ... but I just added the comment that the performance was much, much worse than when I started.

... and we run into issues where what might have once been the correct answer no longer is (because there's a new, better alternative, or because some tool's no longer available, or no longer recommended because of a horrible security flaw).

I guess I'm trying to advocate not to rely on this type of resource completely when resolving your coding challenges. While it can certainly be a tremendous learning tool, keep an objective mind for what tool best fits your institution's purpose.

What I'd like to see is some place where we can have a summary of recommended practices for various problems ... lots of people can contribute, and it can be kept up-to-date. Basically, a crowd-sourced FAQ. The problem is, you can't just set up a wiki and expect people to contribute. Say what you will about StackExchange's herd mentality about the 'right type of questions'**, their system gets people to contribute.

* for the people who complain about the grubbing for reputation: it's gamification.
I just hate the people who can manage to pop out reasonable-sounding responses 10 min after the question was asked that are clearly just internet research, because I *know* the answer is wrong. ... one person on Seasoned Advice was gaming the system; if you started downvoting their questions, they'd just delete them, but they were getting almost all upvotes due to their 'early and plausible' strategy.

** Yet, I still have the 4th-rated question on Seasoned Advice for Translating cooking terms between US / UK / AU / CA / NZ ( http://cooking.stackexchange.com/q/784/67 ), simply because I got it in back when 'community wiki' was considered an option. Lots of other interesting questions have gotten closed as their community cracked down on 'em, though. (eg, cookbook recommendations)

-Joe

ps. Nothing frustrates me more than scouring the internet due to a problem you've run into ... and you *finally* find a 2-year-old post on some forum that has the *exact* symptoms you have ... and you scroll through all of the replies of things you've already tried ... and get to the last post, from the person with the problem, and they've posted 'nevermind, I fixed it'.
Re: [CODE4LIB] Terrible Drupal vulnerability
On Oct 31, 2014, at 11:46 AM, Lin, Kun wrote: Hi Cary, I don't know from whom. But for the Heartbleed vulnerability earlier this year, they as well as some other big providers like Google and Amazon were notified and patched before it was announced.

If they have an employee who contributes to the project, it's possible that this was discussed on development lists before it was sent down to user-level mailing lists. Odds are, there's also some network of people who are willing to give things a cursory review / beta test in a more controlled manner before they're officially released (and might break thousands of websites). It would make sense that companies who derive a good deal of their profits from supporting software would participate in those programs, as well. I could see categorizing either of those as 'ahead of the *general* public', which was Kun's assertion.

-Joe

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cary Gordon Sent: Friday, October 31, 2014 11:10 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Terrible Drupal vulnerability How do they receive vulnerability report ahead of general public? From whom? Cary

On Friday, October 31, 2014, Lin, Kun l...@cua.edu wrote: If you are using drupal as main website, consider using Cloudflare Pro. It's just $20 a month and worth it. They'll help block most attacks. And they usually receive vulnerability report ahead of general public. Kun

-Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Cary Gordon Sent: Friday, October 31, 2014 9:59 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Terrible Drupal vulnerability This is what I posted to the Drupal4Lib list: By now, you should have seen https://www.drupal.org/PSA-2014-003 and heard about the Drupageddon exploits, and you may be wondering if you were vulnerable or, if you were hit by this, how you can tell and what you should do.
Drupageddon affects Drupal 7, Drupal 8 and, if you use the DBTNG module, Drupal 6. The general recommendation is that if you do not know or are unsure of your server's security and you did not either update to Drupal 7.32 or apply the patch within a few hours of the notice, you should assume that your site (and server) was hacked, and you should restore everything to a backup from before October 15th or earlier. If you manage your server and you have any doubts about your file security, you should restore that to a pre-10/15 image as well, or do a reinstall of your server software. I know this sounds drastic, and I know that not everyone will do that. There are some tests you can run on your server, but they can only verify the hacks that have been identified.

At MPOW, we enforce file security on our production servers. Our deployments are scripted in our continuous integration system, and only that system can write files outside of the temporary files directory (e.g. /sites/site-name/files). We also forbid executables in the temporary files directory. This prevents many exploits related to this issue. Of course, the attack itself is on the database, so even if the file system is not compromised, the attacker could, for example, get admin access to the site by creating an account, making it an admin, and sending themselves a password. While they need a valid email address to set the password, they would likely change that as soon as they were in.

Some resources:
https://www.drupal.org/PSA-2014-003
https://www.acquia.com/blog/learning-hackers-week-after-drupal-sql-injection-announcement
http://drupal.stackexchange.com/questions/133996/drupal-sa-core-2014-005-how-to-tell-if-my-server-sites-were-compromised

I won't attempt to outline every audit technique here, but if you have any questions, please ask them. The takeaway from this incident is that while Drupal has a great security team and community, it is incumbent upon site owners and admins to pay attention.
Most Drupal security issues are only exploitable by privileged users, and admins need to be careful and read every security notice. If a vulnerability is publicly exploitable, you must take action immediately. Thanks, Cary

On Thu, Oct 30, 2014 at 5:24 PM, Dan Scott deni...@gmail.com wrote: Via lwn.net, I came across https://www.drupal.org/PSA-2014-003 and my heart sank: Automated attacks began compromising Drupal 7 websites that were not patched or updated to Drupal 7.32 within hours of the announcement of SA-CORE-2014-005 - Drupal core - SQL injection (https://www.drupal.org/SA-CORE-2014-005). You should proceed under the assumption that every Drupal 7 website was compromised unless updated or patched.
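One cheap audit step in this vein (my sketch, not from Cary's checklist; the demo path is invented) is scanning Drupal's public files directory for PHP files, since that directory should only ever hold uploads, never executables:

```shell
# Build a throwaway tree standing in for a Drupal docroot (invented path);
# on a real site you'd point find at sites/default/files directly.
mkdir -p /tmp/drupal-demo/sites/default/files
touch /tmp/drupal-demo/sites/default/files/photo.png
touch /tmp/drupal-demo/sites/default/files/backdoor.php   # planted for the demo

# The files directory should contain no PHP at all, so any hit is suspect:
find /tmp/drupal-demo/sites/default/files -name '*.php'
```

Attackers commonly drop PHP shells into writable upload directories, so a non-empty result here is worth investigating even outside an incident.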
Re: [CODE4LIB] Why learn Unix?
On Oct 28, 2014, at 10:07 AM, Joshua Welker wrote: There are 2 reasons I have learned/am learning Linux: 1. It is cheaper as a web hosting platform. Not substantially, but enough to make a difference. This is a big deal when you are a library with a barebones budget or an indie developer (I am both). Note that if you are looking for enterprise-level support, the picture is quite different. 1a. A less significant reason is that Linux is much less resource-intensive on computers and works well on old/underpowered computers and embedded systems. If you want to hack an Android device or Chromebook to expand its functionality, Linux is what you want. I am running Ubuntu on my Acer C720 Chromebook using Crouton, and now it has all the functionality of a full-fledged laptop at $200.

When I worked for an ISP in the late 1990s, our two FreeBSD servers that handled *everything* were 75MHz Pentiums that another company had discarded. Our network admin bridged my apartment to his using a 386 w/ a picoBSD install that booted from a 3.5" floppy (to drive a WaveLAN card, before the days of 802.11). I think one of the P75s was running fark.com for a while before they added all of the commenting functionality. It's amazing just how much functionality you can get out of hardware that people have discarded by putting a system on it that doesn't have a lot of cruft.

2. Many scripting languages and application servers were born in *nix and have struggled to port over to non-*nix platforms. For example, Python and Ruby both are a major pain to set up in Windows. Setting up a production-level Rails or Django server is stupidly overcomplicated in Windows to the point where it is probably easier just to use Linux. It's much easier to sudo apt-get install in Ubuntu than to spend hours tweaking environment variables and config files in Windows to achieve the same effect.
If you're going to run Python on Windows, it used to be easier to download a full 'WAMP' build (Windows, Apache, MySQL, Perl/PHP/Python). I don't know what the current state of Python installers is ... except on the Mac, where they're still a bit of a pain. I have no idea on Ruby.

I will go out on a limb here and say that *nix isn't inherently better than Windows except perhaps the fact that it is less resource-intensive (which doesn't apply to OSX, the most popular *nix variant). #1 and #2 above are really based on historical circumstances rather than any inherent superiority in Linux. Back when the popular scripting languages, database servers, and application servers were first developed in the 90s, Windows had a very sucktastic security model and was generally not up to the task of running a server. Windows has cleaned up its act quite a bit, but the ship has sailed, at this point. If you compare Windows today to Linux today, they are on very equal footing in terms of server features. The only real advantage Linux has at this point is that the big distros like Ubuntu have a much more robust package ecosystem that makes it much easier to install common server-side applications through the command line. But when you look at actually using and managing the OS, Linux is at a clear disadvantage. And if you compare the two as desktop environments, Windows wins hands-down except for a very few niche use cases. I say this as someone who uses a Ubuntu laptop every day.

For managing OSes, I admit that I haven't played with Windows 8, but I'm still in the FreeBSD camp for servers (and not what Apple's done to it). Windows might have an advantage if you're doing Active Directory w/ group policies, but I've heard horror stories from my neighbor about his co-worker who would 'hide' his changes by applying them to individual people (eg, blocking what websites they can get to), making it difficult for someone else to go in and clear them out when he was too over-zealous.
(Anyone who has read this far might be interested to know that Windows 10 is going to include an official MS-supported command line package management suite called OneGet that will build on the package ecosystem of the third-party Chocolatey suite.) Very interesting. -Joe
Re: [CODE4LIB] Subject: Re: Why learn Unix?
On Oct 28, 2014, at 8:11 PM, Alex Berry wrote: And that is why alias rm='rm -I' was invented.

Do not *ever* set this to be a default for new users.

During my undergrad, I worked at the helpdesk for the group that managed the computer labs, the general-use unix systems, and CMS (not a content management system ... an IBM mainframe ... one of the last bitnet-to-internet gateways). The engineering school set up a bunch of default aliases for their systems ... including rm='rm -i'. This meant that when people came to *our* servers ... they'd decide to interactively clean out their home directory by typing: rm * ... and then wonder why it didn't prompt them. ... and then come to the computer lab to complain. ... and then complain some more when we wouldn't immediately restore their files for them. (our policy was technically disaster recovery only, but it was effectively disaster recovery, upper-level management, or members of the faculty senate ... because restores from tape really, really sucked.)

-Joe
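A minimal reproduction of that failure mode (a sketch: throwaway files in a temp directory, and the alias is simulated with an explicit -i, since aliases don't expand inside scripts):

```shell
# Sandbox with two throwaway files
cd "$(mktemp -d)"
touch a.txt b.txt

# On hosts where rm is aliased to 'rm -i', a habitual "rm *" always prompts.
# Simulated here by spelling out the -i; answering "n" leaves the file alone:
echo n | rm -i a.txt

# On a host WITHOUT the alias, the exact same habit deletes with no prompt:
rm b.txt

ls    # a.txt survived the prompted delete; b.txt is simply gone
```

Which is the whole point of the story: the alias trains people to expect a safety net that isn't there on other machines.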
Re: [CODE4LIB] Why learn Unix?
On Oct 27, 2014, at 12:38 PM, Bigwood, David wrote: Learning UNIX is fine. However, I do think learning SQL might be a better investment. So many of our resources are in databases. Understanding indexing, sorting and relevancy ranking of our databases is also crucial. With linked data being all the rage knowing about sparql endpoints is important. The presentation of the information from databases under our control needs work. Is the information we present actionable or just strings?

Quite likely. I wouldn't teach people SQL (and I've done plenty of PL/SQL and T-SQL programming) unless:

1. They had data they wanted to use that's already on an SQL server.
2. They had a (read-only) account on that server, so they could actually use it.

If they had to go about setting up a server (even if it's an installable application) and ingesting their data before they could analyze anything, they can get frustrated before they even start to see any useful results. If they have some scenario where they need multiple tables and joins, then sure, teach them SQL ... but over the years, I've had weeks of SQL-related training*, and I don't know that I'd want to make anyone go through all of that if they're just trying to do some simple reports that could be done in other ways. I wouldn't even suggest teaching people about indexing until they've tried doing stuff in SQL and wondered why it's so slow.

Likewise, if there were some sort of non-SQL database for them to play with (even an LDAP server) that might have information of use to them, I'd teach them that first ... but I'd likely start w/ unix command line stuff (see below).

Or maybe I just like those topics better and find the work being done there fascinating? Quite likely. I still haven't found a good reason to wrap my head around sparql ... I guess in part because the stuff I'm dealing with isn't served as linked data. ...
On Oct 27, 2014, at 11:15 AM, Tod Olson wrote: There's also something to be said for the Unix pipeline/filter model of processing. That way of breaking down a task into small steps, wiring little programs to filter the data for each step, building up the solution iteratively, is essentially a form of function composition. Immediately, you can do a lot of powerful one-off or scripting tasks right from the command line. More generally, it's a very powerful model to have in your head, and can transform your thinking.

I 100% agree. If I were to try to teach unix to a group, I'd come up with some scenarios where command-line tools can actually help them, and show them how to automate things that they'd have to do anyway (or tried to do, and gave up on). For instance, if there's some sort of metric that they need, you can show how simple `cut | sort | uniq | wc` can be used ... eg, if I have a 'common' or 'common+' webserver log file, I can get a quick count of today's unique hosts via:

cut -d' ' -f1 /var/log/httpd/access_log-2014.10.27 | sort | uniq | wc -l

If I wanted to see the top 10 hosts hitting us:

cut -d' ' -f1 /var/log/httpd/access_log-2014.10.27 | sort | uniq -c | sort -rn | head -10

If you're lazy, and want to alias this so it doesn't have to hard-code today's date:

cut -d' ' -f1 `ls -1t /var/log/httpd/access_log* | head -1` | sort | uniq | wc -l

If your log files are rolled weekly, and we need to extract just today (note that it's easier if you're sure that something looking like today's date won't show up in requests):

cut -d' ' -f1,4 `ls -1t /var/log/httpd/access_log* | head -1` | grep `date '+%d/%b/%Y'` | cut -d' ' -f1 | sort | uniq | wc -l

If you just wanted a quick report of hits per day, and your log files aren't rolled and compressed:

cat `ls -1tr /var/log/httpd/access_log*` | cut -d\[ -f2 | cut -d: -f1 | uniq -c | more

(note that that last one isn't always clean ...
the dates logged are when the request started, but they're logged when the script finishes, so sometimes you'll get something strange like:

  12354 23/Oct/2014
      3 24/Oct/2014
      1 23/Oct/2014
  14593 24/Oct/2014

... but if you try to use `sort`, and you cross months, it'll sort alphabetically, not chronologically)

You could probably dedicate another full day to sed & awk, if you wanted ... or teach them enough perl to be dangerous.

-Joe

* I've taken all of the Oracle DBA classes back in the 8i days (normally 4 weeks if taken as full-day classes), plus Oracle's data modeling and sql tuning classes (4-5 days each?)
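To make the first unique-hosts pipeline above concrete, here it is run against a tiny fabricated log (hosts, paths, and timestamps are all invented for the demo):

```shell
# Three common-log-format requests from two distinct hosts (fabricated data):
cat > /tmp/access_log.demo <<'EOF'
10.0.0.1 - - [27/Oct/2014:10:00:00 -0400] "GET / HTTP/1.1" 200 512
10.0.0.2 - - [27/Oct/2014:10:00:01 -0400] "GET /a HTTP/1.1" 200 100
10.0.0.1 - - [27/Oct/2014:10:00:02 -0400] "GET /b HTTP/1.1" 200 200
EOF

# Field 1 is the client address; sort | uniq collapses repeats, wc -l counts:
cut -d' ' -f1 /tmp/access_log.demo | sort | uniq | wc -l    # prints 2
```

Swapping `uniq` for `uniq -c | sort -rn | head` turns the same pipeline into the top-talkers report.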
Re: [CODE4LIB] CrossRef/DOI content-negotiation for metadata lookup?
On Oct 23, 2014, at 11:19 AM, Jonathan Rochkind wrote: Hi, the DOI system supports some metadata lookup via HTTP content-negotiation. I found this blog post talking about CrossRef's support: http://www.crossref.org/CrossTech/2011/04/content_negotiation_for_crossr.html But I know DataCite supports it to some extent too. Does anyone know if there's overall registrar-agnostic documentation from DOI for this service?

None that I'm aware of. We've actually been discussing this issue in a breakout from the 'Data Citation Implementors Group', and I think we're currently leaning towards not relying solely on content negotiation, but also using HTTP Link headers or HTML link elements to make it possible to discover the other formats that the metadata may be available in. If you dig into the OAI-ORE documentation, they specifically mention that one of the problems of using Content Negotiation is that you can't tell exactly what someone's asking for solely based on the Accept header ... do they want a resource map to the content, or just the metadata from the splash / landing page?

Or, if there's kept-updated documentation from CrossRef and/or DataCite on it?

It looks like the one for CrossRef is: http://www.crosscite.org/cn/ ... if you go to the documentation for DataCite, it still has 'Beta' in the title: DataCite Content Resolver Beta http://data.datacite.org/static/index.html (note: http://data.datacite.org/ redirects here, which is linked from https://www.datacite.org/services )

From that blog post, it says rdf+xml, turtle, and atom+xml should all be supported as response formats. But atom+xml seems to not be supported -- if I try the very example from that blog post, I just get a 406 No Acceptable Resource Available. I am not sure if this is a bug, or if CrossRef at a later point than that blog post decided not to support atom+xml. Anyone know how I'd find out, or get more information?
The link I gave to CrossRef documentation has three e-mail addresses at the bottom, if you wanted to ask them if the documentation is still current*: 6 Getting Help Please contact l...@crossref.org, t...@datacite.org or medrast...@cineca.it for support. -Joe * and this is why when I used to maintain documentation, every document had both 'last revised' and 'last reviewed' date on 'em, so you had a clue how likely they were to be out of date.
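For anyone who wants to poke at the service themselves, the crosscite.org documentation describes the pattern below (a sketch: the DOI is just an illustrative published article, and which Accept types actually resolve varies by registrar, per Jonathan's 406 experience):

```shell
# Ask the DOI resolver for machine-readable metadata instead of the landing
# page; -L follows redirects into the registrar's content-negotiation service.
curl -sL -H "Accept: application/x-bibtex" "http://dx.doi.org/10.1126/science.1157784"
```

Changing the Accept header (e.g. to application/rdf+xml or text/turtle) requests the other formats, and an unsupported type is what produces the 406 described above.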
Re: [CODE4LIB] Citation Publication Tool
On Oct 22, 2014, at 4:10 PM, Bigwood, David wrote: Any suggestions for publishing citations on the Web? We have a department that has lots of publications with citations at the end of each. Keeping the citations up-to-date is a chore. Many here use EndNote, and I know that can publish to the Web. Any examples I can view? Would Libguides be something to consider? Any other suggestions for easily getting different groups of citations up in multiple places? Some examples of the pages involved: http://www.lpi.usra.edu/education/explore/LifeOnMars/resources/ http://www.lpi.usra.edu/education/explore/solar_system/resources/ http://www.lpi.usra.edu/education/explore/space_health/resources/

Based on the pages that you've linked to, I wouldn't call those 'citations'.* I've seen them called different things, depending on the reason for creating the lists, and the intended audience.

For instance, if they're lists of scholarly resources (books and journal articles, maybe presentations and theses) that make use of your group's data, then it's either an 'Observatory Bibliography' or 'Telescope Bibliography' depending on the scope, and sometimes just a 'Publication List'. Those are actually an easy case in our field, as The Astrophysics Data System indexes the main journals in our field, so you just need software that can look up metadata from bibcodes: http://adsabs.harvard.edu/cgi-bin/nph-bib_query?bibcode=2004SPIE.5493..163H&data_type=BIBTEX&db_key=AST&nocookieset=1

'Publication Lists' are a little harder, as they often include public press coverage (ie, media intended for non-scientists such as newspaper / website / tv news / magazines), which ADS doesn't index. (and you often want to grab a snapshot of them, in case it disappears)

For what you have ... although you have some links to formally published items, it looks to be more links to various websites with more information on a topic.
I've heard them informally referred to as EPO Resource Pages (EPO == Education Public Outreach) or, if specifically for teachers, 'Educator Resource Pages'. I've typically seen them organized first by intended age level, then by the type of resource. (organizing how you have it is generally for bean-counting when it comes time for senior reviews). ... As for software recommendations ... if you're already using a CMS, I'd look to see if it has any add-ons for managing either bibliographies or just lists of external links. If you're looking for stand-alone software, I'd look for 'Reference Manager' or 'Bibliography Manager' software that can generate HTML to post online. There are some that allow you to manage everything online, but then you have to worry about securing it** : http://en.wikipedia.org/wiki/Comparison_of_reference_management_software I'm not aware of any that have specifically been built for EPO purposes, but many of them have ways to add extra fields, so you could handle intended audience and your current classification that way. -Joe * There was actually an issue that came up during the work on the 'Joint Declaration of Data Citation Principles' that makes me believe that there are at least 6 different things that people may mean by 'citation', and yours would likely be a 7th. See http://docs.virtualsolar.org/wiki/CitationVocabulary ** We had to drop the one we were using after a SQL injection, and my boss decided to ban all PHP on our network, so we rolled back to using 10+ year old software that had been written for another mission.
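An editor's aside on the bibcode lookup mentioned above: the parameters in that ADS query URL can be assembled with any URL library. A minimal sketch in Python, assuming the classic ADS CGI endpoint quoted above is the target (the modern ADS has a different API, and the function name here is hypothetical):

```python
from urllib.parse import urlencode

def ads_bibtex_url(bibcode):
    """Build a BibTeX lookup URL for the classic ADS CGI endpoint.

    Just shows how the query parameters from the example above fit
    together; whether the old endpoint still answers is another matter.
    """
    params = {
        "bibcode": bibcode,       # e.g. '2004SPIE.5493..163H'
        "data_type": "BIBTEX",    # ask for BibTeX rather than HTML
        "db_key": "AST",          # the astronomy database
        "nocookieset": "1",
    }
    return "http://adsabs.harvard.edu/cgi-bin/nph-bib_query?" + urlencode(params)
```
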
Re: [CODE4LIB] Linux distro for librarians
On Oct 19, 2014, at 3:20 PM, Francis Kayiwa wrote: [trimmed] I'm willing to bet it would be much less effort to fix this Ubuntu problem dealing with the Ubuntu devs (I've found them reasonable to work with) than trying to herd the cats around yet another debian fork Another alternative would be to pick an existing OS, and make sure that all of the requisite packages are in their package manager. -Joe ps. 'OS for librarians' was never defined as being (1) for servers at libraries, (2) for librarian workstations, or (3) for public-use machines. Things that make a good client machine don't always make for a good server. And what makes a good personally managed desktop doesn't necessarily make it a good desktop when you're managing dozens or hundreds. (take MacOSX ... replacing bits to make it 'easier' for users, but harder to manage remotely in bulk)
[CODE4LIB] Citation hackathon tomorrow at PLOS
I was just looking at the PLOS website, and noticed they had a banner: PLOS is hosting a hackathon on Saturday, October 18th, 2014 at our SF office. So, if you're in the San Francisco area, and are interested in citations (the theme of the hackathon), and don't have plans for tomorrow, see their website for more details and to RSVP: http://www.ploslabs.org/citation-hackathon -Joe
Re: [CODE4LIB] Requesting a Little IE Assistance
It sounds like the issue already has a solution, but ... On Oct 13, 2014, at 10:13 PM, Matthew Sherman wrote: The DSpace angle also complicates things a bit as they do not have any built in CSS that I could edit for this purpose. I am hoping they will be amenable to the suggestions to right click and open in notepad because txt files are darn preservation friendly and readable with almost anything since they are some of the simplest files in computing. Thanks for the input folks. I'm not a DSpace user, but my understanding is that it's not a stand-alone webserver ... which means that you may still have ways to re-write what gets served out of it. For instance, if you're running Apache you can build an 'output filter'. I've only done them via mod_perl, but some quick research points to mod_ext_filter to call any command as a filter: http://httpd.apache.org/docs/2.2/mod/mod_ext_filter.html You'd then set up a 'smart filter' to trigger this when you had a text/plain response and the UserAgent is IE ... but the syntax is ... complex, to put it nicely: http://httpd.apache.org/docs/2.2/mod/mod_filter.html (I've never configured a smart filter myself, and searching for useful examples isn't really panning out for me). ... but I thought I'd mention this as an option for anyone who might have similar problems in the future, as it lets you mess with images and other types of content, too. -Joe
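As a sketch of what that filter setup might look like in httpd.conf: the ExtFilterDefine / BrowserMatch directives below are real mod_ext_filter and mod_setenvif syntax, but the /usr/local/bin/wrap_text command and the /dspace location are hypothetical, and I haven't tested this combination:

```apache
# Declare an external output filter that pipes text/plain response
# bodies through a (hypothetical) line-wrapping command.  The filter
# only runs when the 'is_msie' environment variable has been set.
ExtFilterDefine wrap-text mode=output intype=text/plain \
    enableenv=is_msie cmd="/usr/local/bin/wrap_text"

<Location "/dspace">
    # mod_setenvif: set the variable only for IE user agents.
    BrowserMatch "MSIE" is_msie
    SetOutputFilter wrap-text
</Location>
```

This sidesteps the mod_filter 'smart filter' syntax entirely, at the cost of spawning an external process per request.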
Re: [CODE4LIB] Requesting a Little IE Assistance
On Oct 13, 2014, at 9:59 AM, Matthew Sherman wrote: For anyone who knows Internet Explorer, is there a way to tell it to use word wrap when it displays txt files? This is an odd question but one of my supervisors exclusively uses IE and is going to try to force me to reupload hundreds of archived permissions e-mails as text files to a repository in a different, less preservable, file format if I cannot tell them how to turn on word wrap. Yes it is as crazy as it sounds. Any assistance is welcome. If there's a way to do it, it likely wouldn't be something that you could send from the server. Depending on the web server that you're using, you might be able to use client detection, and then pass requests from IE through a CGI (or similar) that does the line-wrapping ... or wraps it in HTML. If you go the HTML route, you might be able to just put the whole thing in a textarea element. If you *do* have to modify all of the text files, as you specifically mention that they're e-mails, I'd recommend looking at 'flowed' formatting, which keeps lines to 79 characters or less, using SP CRLF to mark 'soft' returns: https://www.ietf.org/rfc/rfc2646.txt You could also try just setting the Content-Type header to 'text/plain; format=flowed' and see if IE will handle it from there. (I'd test myself, but I don't have IE to test with) -Joe
Re: [CODE4LIB] Requesting a Little IE Assistance
On Oct 13, 2014, at 5:15 PM, Kyle Banerjee wrote: You could encode it quotable-printable or mess with content disposition http headers. Oh, please not quoted-printable. That's= the one that makes you think that something= is wrong with your mail client because= there are strange equals signs (=3D) all= over the place. -Joe
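For anyone who hasn't run into it: those stray equals signs are quoted-printable escaping, which Python's quopri module (used here purely for illustration) demonstrates nicely:

```python
import quopri

# In quoted-printable, a literal '=' must be escaped as '=3D' -- which
# is why raw QP text looks like it's sprinkled with equals signs.
encoded = quopri.encodestring(b"x = y")

# An '=' at the very end of a line is a *soft* line break, and is
# silently removed when the text is decoded:
decoded = quopri.decodestring(b"because=\n there are strange equals signs")
```
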
Re: [CODE4LIB] Informal survey regarding library website liberty
On Sep 2, 2014, at 11:39 AM, Brad Coffield wrote: Hi all, I would love to hear from people about what sort of setup they have regarding linkage/collaboration/constrictions/freedom regarding campus-wide IT practices and CMS usage and the library website. [trimmed] I'm hoping that I can get some responses from you all that way I can informally say of x libraries that responded y of them are not firmly tied to IT. (or something to that effect) I'm also very curious to read responses because I'm sure they will be educational and help me to make our site better. THE QUESTION: What kind of setup does your library have regarding servers, IT dept collaboration, CMS restrictions, anything else? I imagine that there are many unique situations. Any input you're willing to provide will be very welcome and useful. So, rather than answer the question (as I don't work for a library), I'll offer some advice from having worked in central IT for a university for ~7 years: If you're going to consider using central IT for your infrastructure, ask them what sort of service guarantees they're willing to provide. This is typically called a 'Service Level Agreement', where they spell out who's responsible for what, response times, acceptable downtime / maintenance windows, etc. It may include costs, but that may be a separate document. Typically, the hosted solutions are best when you've just got a few pages that rarely get updated (once a year or so); if you're pulling info from a database to display on a website, most shared solutions fall flat on their face. They might have a database where you could store stuff to make data-driven web pages, but they rarely are flexible enough to interface with some external server. So, anyway ... it doesn't matter what other schools do if your IT dept. can't provide the services you need. If they *can* provide it, you need to weigh costs vs. level of service ...
the cost savings may not be worth it if they regularly take the server down for maintenance at times when you need it. -Joe
Re: [CODE4LIB] Hiring strategy for a library programmer with tight budget - thoughts?
On Aug 15, 2014, at 12:44 PM, Kim, Bohyun wrote: I am in a situation in which a university has a set salary guideline for programmer position classifications and if I want to hire an entry-level dev, the salary is too low to be competitive and if I want to hire a more experienced dev in a higher classification, the competitive salary amount exceeds what my library can afford. So as a compromise I am thinking about going the route of posting a half-time position in a higher classification so that the salary would be at least competitive. It will get full-time benefits on a pro-rated basis. But I am wondering if this strategy would be viable or not. Also, does anyone have experience in hiring a developer to telework completely from another state when you do not have previous experience working with her/him? This seems a bit of a risky strategy to me but I am wondering if it may attract more candidates particularly when the position is half time. As a current/past/future library programmer or hiring manager in IT or both, if you have any thoughts, experience, or ideas, I would really appreciate it. Salary's not the only factor when it comes to hiring ... convenience and work environment are a factor, too. If I were you, I'd look to hire a half-time employee, and let them have flexible hours, so you could pick up a current student. If you can offer them reduced tuition or parking (matters at some campuses ... for College Park, just getting 'em in a lot that's closer to their classes), that might make up for a less-competitive salary. You should also check with the university's legal department, as you have a class of students who specifically *can't* work full time (foreigners on student visas), so you might be able to hire a grad student that would otherwise have problems getting hired. Especially in the D.C. area, they have a hard time finding jobs (as so many companies are tied to the federal government, they don't want to hire non-US citizens). ...
As for the telework aspect -- it's a pain to get set up from nothing. If you have someone that you're comfortable with and they move away, that's completely different from bringing in someone who doesn't have a vested relationship in the group. At the very least, I'd recommend bringing them in for an orientation period (2-8 weeks), where you can get a feel for their work ethic and such. Most of the people on the project I'm on are remote ... but we keep an IM group chat window up all the time, and we have meetings 1-3 times per year where we all get together for a week to hash out various issues and keep the relationships strong. -Joe
Re: [CODE4LIB] Hiring strategy for a library programmer with tight budget - thoughts?
On Aug 15, 2014, at 2:49 PM, BWS Johnson wrote: Salvete! My first thought was a project-based contract, too. But there are few programmer projects that would require zero maintenance once finished. As someone who has had to pick up projects completed by others, there are always bugs, gaps in documentation, and difficult upgrade paths. There could be follow up contracts for those problems, or they might be less of a hassle for in house staff to handle than trying to do absolutely errything from scratch. That actually made me think of something -- I've worked in places where we've had issues with people brought in as short-term contract developers. The problem is ... the code was crap. As they didn't have to maintain it for the long run, they wrote some really sloppy code. I know of one group who brought someone in, they poo-pooed all of the code, and insisted it had to be re-written (so they did ... in ksh ... without quoting anything ... and loading config files by sourcing them) ... but of course, he was on an hourly contract, so he had a vested interest in making more work for himself. (and for me, as I was then responsible for integrating their system w/ one that I maintain). You also get cases where every change in the specs requires new negotiation of payment. (like the whole healthcare.gov thing) ... so to sum up ... if you don't already have an established relationship with the person, I'd avoid bringing in someone to telework. -Joe So I have no solutions to offer. Enticing people with telework is a good idea. It's disappointing to see libraries (and higher ed more generally) continuing to not invest in software development. We need developers. If we cannot find the money for them, perhaps we should re-evaluate our (budgetary?) priorities. Anytime I see things which I think more than one Library would like to have I think Caw, innit that what a Consortium is for? 
One member alone might not be able to afford a swank techie, but perhaps pooling resources across Libraries would let you hire someone at an attractive salary for the long haul while getting all of the members' projects knocked out. It would also mean that you don't have to do any of those nasty follow up contracts since the person that made it would still be about. I'm pretty sure that there was someone on this list a few years back who made a comment that if every library contributed 10% of an FTE of funding, we could fund a lot of developers.
Re: [CODE4LIB] Dewey code
On Aug 8, 2014, at 10:13 PM, Riley Childs wrote: Ok, so you want to access LC data to get Dewey decimal numbers? You need to use a z39.50 client to pull the record, you can do it with marc edit but it is labor intensive. You would need to roll your own solution for this or use classify.oclc.org to get book info (this doesn't give you API access). Your best bet is classify.oclc.org. That aside: Honestly you might be better off running with something like Koha, writing a home brew library system is no cake walk, trust me I know from 2 years of experience trying to code one and ultimately moving to koha. Koha can be run on a VPS (Digital Ocean is what i would use) or on an old PC in the corner. I am in a situation similar to yours if you want to contact me off list I can give you some advice. I 100% agree -- you'd be better off going with something intended for personal libraries (e.g. Delicious Library) and giving it a dedicated machine before trying to roll your own. oss4lib hasn't been updated in a while, but Lyrasis is maintaining foss4lib.org as a catalog of free open source library software, and has an 'ILS feature comparison tool' which lists feature differences between Koha and Evergreen: http://ils.foss4lib.org/ -Joe
Re: [CODE4LIB] very large image display?
On Jul 25, 2014, at 11:36 AM, Jonathan Rochkind wrote: Does anyone have a good solution to recommend for display of very large images on the web? I'm thinking of something that supports pan and scan, as well as loading only certain tiles for the current view to avoid loading an entire giant image. A URL to more info to learn about things would be another way of answering this question, especially if it involves special server-side software. I'm not sure where to begin. Googling around I can't find any clearly good solutions. Has anyone done this before and been happy with a solution? If you store the images in JPEG2000, you can pull tiles or different resolutions out via JPIP (JPEG 2000 Interactive Protocol). Unfortunately, most web browsers don't support JPIP directly, so you have to set up a proxy for it. For an example, see Helioviewer: http://helioviewer.org/ Documentation and links to their JPIP server are available at: http://wiki.helioviewer.org/wiki/JPIP_Server -Joe
Re: [CODE4LIB] Publishing large datasets
On Jul 23, 2014, at 5:29 PM, Kyle Banerjee wrote: We've been facing increasing requests to help researchers publish datasets. There are many dimensions to this problem, but one of them is applying appropriate metadata and mounting them so they can be explored with a regular web browser or downloaded by expert users using specialized tools. Datasets often are large. One that we used for a pilot project contained well over 10,000 objects with a total size of about 1 TB. We've been asked to help with much larger and more complex datasets. The pilot was successful but our current process is neither scalable nor sustainable. We have some ideas on how to proceed, but we're mostly making things up. Are there methods/tools/etc you've found helpful? Also, where should we look for ideas? Thanks, The tools I use are too customized for our field to be of much use to anyone else, so I can't help on that part of the question. I'd really recommend trying to reach out to someone working in data informatics in the field that the data is from, as they would have recommendations on specific metadata that should be captured. For the general 'data publication' community, it's coalescing, but still a bit all over the place. Here are some of the ones that I know about: JISC has a 'Data Publication' mailing list: https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=DATA-PUBLICATION ASIST runs a 'Research Data Access & Preservation' conference and mailing list: http://www.asis.org/rdap/ http://mail.asis.org/mailman/listinfo/rdap ...
and they put most of the presentations up on slideshare: http://www.slideshare.net/asist_org/ The Research Data Alliance has two working groups on the topic, Publishing Services and Publishing Data Workflows: https://rd-alliance.org/group/rdawds-publishing-services-wg.html https://rd-alliance.org/group/rdawds-publishing-data-workflows-wg.html I'm also one of the moderators of the Open Data site on Stack Exchange, which has some questions that might be relevant: Let's suppose I have potentially interesting data. How to distribute? http://opendata.stackexchange.com/q/768/263 Benefits of using CC0 over CC-BY for data http://opendata.stackexchange.com/q/26/263 ... or just ask a new question. I'd also recommend that when you catalog your data, you also consider adding DataCite metadata, so that we can try to make it easier for others to cite your data. (specific implementation recommendations for data citation are still evolving, but general principles have been released; if you have questions, feel free to ask me, as I think we need to add some clarification to what we mean on some of the items). http://www.datacite.org/ https://www.force11.org/datacitation As I see it, you're dealing with data that's in the problem range -- if it were larger, the department collecting the data would have a system in place already; if it were smaller, it's easier to manage as a single item for deposit. -Joe
Re: [CODE4LIB] net.fun
On Jul 14, 2014, at 8:21 AM, Riley Childs wrote: My MOTDs are not as fun... RUN GET OUT OF HERE YOU ARE NOT WELCOME TODAY RESTRICTED ACCESS HERE. I would expect that in the banner, not the motd: $ more /etc/banner This US Government computer is for authorized users only. By accessing this system you are consenting to complete monitoring with no expectation of privacy. Unauthorized access or use may subject you to disciplinary action and criminal prosecution. The banner gets displayed before the login prompt, the motd gets displayed after ... there's also an assumption that the motd changes regularly, as it's 'message of the day' ... although most people have it be completely random and just call fortune or never bother changing it. -Joe
Re: [CODE4LIB] net.fun
On Jul 14, 2014, at 10:44 AM, Cary Gordon wrote: I remember when system administrators would change the MOTD daily. The '80s were so pastoral. 0 0 * * * /bin/fortune > /etc/motd or, for those running Vixie cron (which most people weren't in the 80s) : @daily /bin/fortune > /etc/motd ... but then, everyone went the way of 'web portals' and the like, rather than assuming everyone was going to be (telnet|tn3270)ing into a (unix|cms) system so they could check their e-mail, nntp, gopher, etc. -Joe ps. is it disturbing that the talk of motd is making me nostalgic for ASCII art? On Monday, July 14, 2014, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: On Jul 14, 2014, at 8:21 AM, Riley Childs wrote: My MOTDs are not as fun... RUN GET OUT OF HERE YOU ARE NOT WELCOME TODAY RESTRICTED ACCESS HERE. I would expect that in the banner, not the motd: $ more /etc/banner This US Government computer is for authorized users only. By accessing this system you are consenting to complete monitoring with no expectation of privacy. Unauthorized access or use may subject you to disciplinary action and criminal prosecution. The banner gets displayed before the login prompt, the motd gets displayed after ... there's also an assumption that the motd changes regularly, as it's 'message of the day' ... although most people have it be completely random and just call fortune or never bother changing it. -Joe -- Cary Gordon The Cherry Hill Company http://chillco.com
Re: [CODE4LIB] net.fun
On Jul 14, 2014, at 11:56 AM, Riley Childs wrote: I know I might be little youn but code4lib needs a bbs I can see it now ... someone re-writing TradeWars 2000 so you're an intergalactic bookmobile. -Joe
Re: [CODE4LIB] net.fun
On Jul 14, 2014, at 5:25 PM, Lisa Rabey wrote: The cause of the problem is: /dev/clue was linked to /dev/null Teehee. http://pages.cs.wisc.edu/~ballard/bofh/bofhserver.pl It's difficult to use the excuse 'solar flares' when your boss is (1) a solar physicist and (2) reads BOFH. http://bofh.ntk.net/BOFH//bastard06.php On Mon, Jul 14, 2014 at 1:02 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: The only problem is that some people might have difficulty obtaining audio modems that could be made to work with their cell phones... So in um ... spring of 1995, I think it was ... we managed to get a car phone from Bell Atlantic (might've been Bell Atlantic-NYNEX at that point), and the phone had an RJ-11 jack on it. ... which of course meant that we had to see if we could hold up a modem connection while on the road. Unfortunately, the best that we could manage to get was about 2400 baud for any extended periods. We had our best transfer rates (9600 baud?) up near the NSA campus along the BW Parkway. Mind you, this was in the days when modems topped out at 33.6k ('x2' and 'kFlex' didn't come along 'til a year or two later, then finally v90) ... the modem banks we were dialing into might've only been 14.4k or 28.8k. This was also in the days of analog cell service, as PCS didn't come out 'til even later ... once it did, the sysadmin for the ISP I worked at got cables so he could dial out from what today we'd call a 'netbook' (back then it was just a really tiny laptop ... this was also the days when you could keep a computer on your lap without it crushing you (the 'portable' aka 'luggable' era) and it burning your crotch (the current 'notebook' era). ... but I still think we could pull off 1200 baud w/ an acoustic coupler over a cell phone, which is about the bare minimum for MUDs in the mid 1990s, and would've been fine for BBSes, as long as you weren't dealing in warez. -Joe ps. wow, this whole conversation is making me feel old ... 
doesn't help that I blew my back out last week, so I was already feeling old before the day started.
Re: [CODE4LIB] Why not Sharepoint?
On Jul 11, 2014, at 10:33 AM, Thomas Kula wrote: On Fri, Jul 11, 2014 at 10:10:40AM -0400, Jacob Ratliff wrote: Hi Ned, The biggest case for SP is boiled down to 2 things in my mind. 1) its terrible at preservation. If you are just using it as a digital asset mgmt system its fine, but if you need the preservation component go with something else. I've never used Sharepoint, but really it boils down to coming up with a list of requirements for a digital preservation storage system:
- It must have an audit log of who did what to what when
- It must do fixity checking of digital assets
- At minimum, it must tell you when a fixity check fails
- It really should be able to recover from fixity check failures when an object is read
- Ideally it should discover these *before* an object is accessed, recover, and notify someone
- It must support rich enough metadata for your objects
- It must meet your preservation needs (N copies distributed over X distance within Y hours)
- It must be scalable to handle anticipated future growth.
I'm sure there are more, I haven't had much coffee yet this morning so I'm missing some. And honestly, you have to scale your requirements to what your specific needs are. *Only* then can you evaluate solutions. If you've got a list of requirements, you can then ask "I need this. How well does SP (or any other possible solution) meet this need?" So that it doesn't look like you're just coming up with cases that Sharepoint doesn't do, you might consider something like the TRAC checklist: 2007 version, from CRL: http://www.crl.edu/sites/default/files/attachments/pages/trac_0.pdf 2011 update from CCSDS: http://public.ccsds.org/publications/archive/652x0m1.pdf The 2011 update should mirror what's in ISO 16363. Most of the other certifications that I've seen look more at the organization, and don't have specific portions for technology. -Joe ps. A quick search for 'SharePoint' and 'OAIS' led me to : http://www.eprints.org/events/or2011/hargood.pdf ...
which as best I can tell is the abstract for a poster at OR2011.
Re: [CODE4LIB] Is ISNI / ISO 27729:2012 a name identifier or an entity identifier?
On Jun 20, 2014, at 4:30 PM, Karen Coyle wrote: On 6/20/14, 11:38 AM, Richard Wallis wrote: In what ways does ISNI support linked data? See: http://www.isni.org/how-isni-works#HowItWorks_LinkedData accessible by a persistent URI in the form isni-url.oclc.nl/isni/000134596520 (for example) and soon also in the form isni.org/isni/000134596520. Odd. I assume that whoever wrote that on their page just forgot the 'http://' part of those strings. Right? People think I'm being pedantic when I bitch about the protocol missing for printed materials (flyers, business cards, etc) ... but in this case, it's a definite violation of RFC 3986: 1.1.1. Generic Syntax Each URI begins with a scheme name, as defined in Section 3.1, that refers to a specification for assigning identifiers within that scheme. As such, the URI syntax is a federated and extensible naming system wherein each scheme's specification may further restrict the syntax and semantics of identifiers using that scheme. Now, it's possible that this whole "we don't need to bother with http://" thing has spilled into the CMS building community, and they're actively stripping it out.
From their page, I think they're using Drupal, but the horrible block of HTML that this was in is blatantly MS Word's 'save as HTML' foulness: <h2><span lang=EN-US><a name=HowItWorks_LinkedData></a>Linked Data</span></h2> <p class=MsoNormal><span lang=EN-US>Linked data is part of the ISNI-IA's strategy to make ISNIs freely available and widely diffused.&nbsp; Each assigned ISNI is accessible by a persistent URI in the form isni-url.oclc.nl/isni/000134596520 (for example) &nbsp;and soon also in the form isni.org/isni/000134596520.&nbsp;</span></p> <p class=MsoNormal><span lang=EN-US>Coming soon:&nbsp; ISNI core metadata in RDF triples.&nbsp; The RDF triples will be embedded in the public web pages and the format will be available via the persistent URI and the SRU search API.</span></p> <p class=MsoNormal><span lang=EN-US>&nbsp;</span></p> -Joe On 20 June 2014 18:57, Eric Lease Morgan emor...@nd.edu wrote: On Jun 20, 2014, at 10:56 AM, Richard Wallis richard.wal...@dataliberate.com wrote:
        | authority control | simple identifier | Linked Data
  ------+-------------------+-------------------+------------
  VIAF  |         X         |         X         |      X
  ORCID |                   |         X         |
  ISNI  |         X         |         X         |      X
Increasingly I like linked data, and consequently, here is clarification and a question. ORCID does support RDF, but only barely. It can output FOAF-like data, but not bibliographic. Moreover, it is experimental, at best: curl -L -H 'accept: application/rdf+xml' http://orcid.org/-0002-9952-7800 In what ways does ISNI support linked data? --- Eric Morgan -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
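An aside on the missing-scheme point above: without the 'http://', a generic URI parser can't even find the hostname. A quick Python illustration:

```python
from urllib.parse import urlparse

# RFC 3986: a URI begins with a scheme.  Drop the 'http://' and a
# generic parser sees no scheme and no authority -- the whole string
# is treated as a relative path.
bare = urlparse("isni.org/isni/000134596520")
full = urlparse("http://isni.org/isni/000134596520")
```

With the scheme present, `full.netloc` is the hostname and `full.path` the identifier; without it, everything lands in `bare.path`.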
Re: [CODE4LIB] College Question!
On May 28, 2014, at 11:17 PM, Riley Childs wrote: I was curious about the type of degrees people had. I am heading off to college next year (class of 2015) and am trying to figure out what to major in. I want to be a systems librarian, but I can't tell what to major in! I wanted to hear about what paths people took and how they ended up where they are now. What paths we took? Well, I'm in the mood for procrastinating, so here goes. ... Mine started well before college. My dad got our family a computer (Apple IIe) when I was in 3rd or 4th grade ... so I learned Basic back in the days when you'd copy program listings from magazines. In middle school, I learned Logo, and in 8th grade was an aide for the computer lab. One summer I went to a two week camp, and learned Pascal, and the difference between Basic and Basica. During this time, my mom worked for a computer company, and we upgraded to an Apple ][gs. My high school was a 'science and tech' school. I had 2.5 years of drafting, 2 years of commercial graphics, and by senior year I was working as a TA in the computer lab, and had an independent study in the school's print shop. Through this time, we upgraded to a Macintosh SE/30 and then a Macintosh IIci. For summers in high school, I was working as an intern for an office of the Department of Defense (my dad was military), and I learned a few other OSes, including ALIS (a window manager for Sun UNIX boxes). I was also calling into BBSes quite regularly (had started back in middle school w/ a 1200 baud modem). In college, I had planned to work towards a degree in Architectural Engineering, but my dad taught at a school that didn't offer it ... so I started a degree in Civil Engineering. After my freshman year, I started working in the university's academic computing center. (They managed the computer labs & the general use UNIX & CMS machines). I started off doing general helpdesk support, but by my junior year that whole 'world wide web' thing was getting popular.
As I had experience with computer programming, databases, desktop publishing, graphics, etc ... I ended up splitting my time between the helpdesk, and the newly formed 'web development team' ... which was two of us (both students), working half time. And I was getting to be a fairly fast typist from mudding. After my sophomore year, Tim, the other member of our 'web development team' graduated, and went to work full time, while I was half time. We grew to four people (3 half time, as we were full time students), and we did some cutting edge stuff to get all of the university's course information online (required parsing quark xpress files to generate HTML, parsing dumps from the university's course registration system, and generating HTML, etc) ... and so Tim got offered a job to go work for Harvard. Through this time, I helped out on the university's solar car team, and got distracted and never got around to switching to a school for architecture. I ended up taking over in managing the university's web server while they tried to find a new manager for our group. (this was back when 'webmaster' meant 'web server administrator' and not 'person who designs web pages') I learned Perl, to go along with the various shell scripting that I had already learned. I picked up the 'UNIX System Administration Handbook' and learned from our group's sysadmins until I was trusted to manage that server. While all of this was going on, as I had taken enough classes to be 1/2 a semester off from my classmates, I never realized that I was supposed to take the EIT (Engineer in Training test) ... so I was a bit screwed if I wanted to be an engineer. After graduation, I went to resign, as I wanted to look for a full time job, but the director said that they were putting in for a new position for me. By the middle of summer, my new manager told me that the director had told her that under no circumstances was she to hire me for the job that was being created.
He really didn't like guys with long hair. ... but through this time, I spent some of my savings to help one of the folks on the mud start an ISP (so they could host the mud, which was getting kicked out of the university it was at). I was working as their webmaster, remotely. After all of this crap went down at my university, I was offered some contract work at that ISP, so I moved out to Kentucky. The first contract fell through, but I kept doing various coding projects for them, and did tech support (phone, and this was still the days when we'd drive out to people's houses to set up their modems). I learned MySQL in the process. The contracting side of our company merged with another contracting company, but then everything fell through ... and oddly, I was the only employee that suddenly found themselves working for a different company. Through this time, I did mostly web database work ... the ISP that I worked for
Re: [CODE4LIB] jobs digest for 2014-05-16
On May 16, 2014, at 3:46 PM, Andreas Orphanides wrote: THIS IS SLIGHTLY DIFFERENT THAN WHAT WE DISCUSSED. Agreed, but there's no need for shouting. It looks to me like it's a change in the messages that 'jobs.code4lib.org' generates and sends to the list ... *not* the change that Eric made to the mailing list. (I'm basing that on what a LISTSERV(tm) digest looks like, and the fact that it's archived this as a single message). ... and whoever made the change should at the very least put 'JOBS:' in the subject, so LISTSERV(tm) assigns it to the right topic for people to then ignore it. -Joe On Fri, May 16, 2014 at 3:44 PM, j...@code4lib.org wrote: Library Electronic Resources Specialist Raritan Valley Community College Branchburg Township, New Jersey ColdFusion, EZproxy, JavaScript, Personal computer hardware http://jobs.code4lib.org/job/13115 Digital Scholarship Specialist University of Oklahoma Norman, Oklahoma Digital humanities, University of Oklahoma http://jobs.code4lib.org/job/14593 Research Data Consultant Virginia Polytechnic Institute and State University Blacksburg, Virginia Data curation, Data management, Digital library, Informatics http://jobs.code4lib.org/job/14591 Systems Librarian Central Michigan University Mount Pleasant, Michigan CONTENTdm, Ex Libris, Innovative Interfaces, MARC standards, Proxy server, Resource Description and Access, SFX http://jobs.code4lib.org/job/14590 To post a new job please visit http://jobs.code4lib.org/
Re: [CODE4LIB] separate list for jobs
On Thu, 15 May 2014, Jodi Schneider wrote: elm++ people still use elm? I'm personally using the 'patterns-filters2' rule in alpine for managing my mailing lists. I've considered switching to mutt, but I haven't used elm or its derivatives in over a decade. (elm didn't have good MIME support, and I was getting tired of jumping through hoops for every attachment... although, it was *much* better than pine if you were connecting at 1200 baud, as it didn't redraw the screen constantly) -Joe On Thu, May 15, 2014 at 6:09 PM, Eric Lease Morgan emor...@nd.edu wrote: I have done my initial best to configure the mailing list to support a jobs topic, and I've blogged about how you can turn off or turn on the jobs listings. [1] From the blog: The Code4Lib community has also spawned job postings. Sometimes these job postings flood the mailing list, and while it is entirely possible to use mail filters to exclude such postings, there is also more than one way to skin a cat. Since the mailing list uses the LISTSERV software, the mailing list has been configured to support the idea of topics, and through this feature a person can configure their subscription preferences to exclude job postings. Here's how. By default, every subscriber to the mailing list will get all postings. If you want to turn off getting the jobs postings, then email the following command to lists...@listserv.nd.edu: SET code4lib TOPICS: -JOBS If you want to turn on the jobs topic and receive the notices, then email the following command to lists...@listserv.nd.edu: SET code4lib TOPICS: +JOBS Sorry, but if you subscribe to the mailing list in digest mode, then the topics command has no effect; you will get the job postings no matter what. Special thanks go to Jodi Schneider and Joe Hourcle, who pointed me in the direction of this LISTSERV functionality. Thank you! The LISTSERV topics feature is new to me, and I hope it works as advertised. I think it will. [1] blog posting - http://bit.ly/1nSCG2u
Eric Lease Morgan, Mailing List Owner
Re: [CODE4LIB] statistics for image sharing sites?
On May 13, 2014, at 10:16 PM, Stuart Yeates wrote: On 05/14/2014 01:39 PM, Joe Hourcle wrote: On May 13, 2014, at 9:04 PM, Stuart Yeates wrote: We have been using google analytics since October 2008 and by and large we're pretty happy with it. Recently I noticed that we're getting 100 hits a day from the Pinterest/0.1 (+http://pinterest.com/) bot, which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure-jpeg, so there is no HTML and no opportunity to execute javascript, so google analytics doesn't see it. pinterest.com is absent from our referrer logs. My main question is whether anyone has an easy tool to report on this kind of use of our collections? Set your webserver logs to include user agent (I use 'combined' logs), then use: grep Pinterest /path/to/access/logs You could also use any analytic tools that work directly off of your log files. It might not have all of the info that the javascript analytics tools pull (window size, extensions installed, etc.), but it'll work for anything, not just HTML files. When I visit http://www.pinterest.com/search/pins/?q=nzetc I see a whole lot of our images, but absolutely zero traffic in my log files, because those images are cached by pinterest. You could also go the opposite route, and deny Pinterest your images, so they can't cache them. You could either use robots.txt rules, or matching rules w/in Apache to deny their agents absolutely. I have no idea if they'd then link straight to your images (so that you could get useful stats), or if they'd just not allow it to be used on their site at all. -Joe
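A hedged sketch of the Apache-side deny mentioned above (Apache 2.4 syntax; the environment-variable name and the image pattern are illustrative assumptions, not from the original thread):

```apache
# Flag requests whose User-Agent starts with "Pinterest"
SetEnvIf User-Agent "^Pinterest" pinterest_bot

# Refuse the flagged bot access to JPEG files; everyone else is unaffected
<FilesMatch "\.jpe?g$">
    <RequireAll>
        Require all granted
        Require not env pinterest_bot
    </RequireAll>
</FilesMatch>
```

A robots.txt `User-agent: Pinterest` / `Disallow:` rule is the politer variant, but only works if their crawler honors it.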
Re: [CODE4LIB] statistics for image sharing sites?
On May 13, 2014, at 9:04 PM, Stuart Yeates wrote: We have been using google analytics since October 2008 and by and large we're pretty happy with it. Recently I noticed that we're getting 100 hits a day from the Pinterest/0.1 (+http://pinterest.com/) bot, which I understand is a reasonably reliable indicator of activity from that site. Much of this activity is pure-jpeg, so there is no HTML and no opportunity to execute javascript, so google analytics doesn't see it. pinterest.com is absent from our referrer logs. My main question is whether anyone has an easy tool to report on this kind of use of our collections? Set your webserver logs to include user agent (I use 'combined' logs), then use: grep Pinterest /path/to/access/logs You could also use any analytic tools that work directly off of your log files. It might not have all of the info that the javascript analytics tools pull (window size, extensions installed, etc.), but it'll work for anything, not just HTML files. My secondary question is whether any httpd gurus have recipes for redirecting by agent string from low quality images to high quality. So when AGENT = Pinterest/0.1 (+http://pinterest.com/) and the URL matches a pattern, redirect to a different pattern. For example: http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a%28w100%29.jpg to http://nzetc.victoria.ac.nz/etexts/MakOldT/MakOldTP022a.jpg Perfectly possible w/ Apache's mod_rewrite, but you didn't say what http server you're using. If Apache, you'd do something like: RewriteCond %{HTTP_USER_AGENT} ^Pinterest RewriteRule (^/etexts/MakOldT/.*)\(.*\)\.jpg $1.jpg [L] You might need to adjust the regex to match your URLs ... I just assumed the stuff in parens got stripped out of stuff in that directory.
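As a small extension of that grep recipe, a pipeline like the following counts which URLs the Pinterest bot fetched from a 'combined'-format log. The log path and sample entries here are fabricated purely for illustration:

```shell
# Create a tiny sample 'combined' log (fabricated data, for demonstration only)
cat > /tmp/sample_access.log <<'EOF'
1.2.3.4 - - [13/May/2014:21:04:00 +0000] "GET /etexts/a.jpg HTTP/1.1" 200 512 "-" "Pinterest/0.1 (+http://pinterest.com/)"
1.2.3.4 - - [13/May/2014:21:05:00 +0000] "GET /etexts/a.jpg HTTP/1.1" 200 512 "-" "Pinterest/0.1 (+http://pinterest.com/)"
5.6.7.8 - - [13/May/2014:21:06:00 +0000] "GET /etexts/b.jpg HTTP/1.1" 200 512 "-" "Mozilla/5.0"
EOF

# The request path is whitespace field 7 in the combined log format;
# filter to the bot's hits, then count requests per URL, busiest first.
grep 'Pinterest' /tmp/sample_access.log | awk '{print $7}' | sort | uniq -c | sort -rn
```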
Re: [CODE4LIB] separate list for jobs
On May 8, 2014, at 11:35 AM, Ben Brumfield wrote: I suspect I'm not the only mostly-lurker who subscribes to CODE4LIB in digest mode, finding value in a glance over the previous day's discussions each morning, then (very) occasionally weighing in on individual threads via the web interface. I find this to be more effective and efficient than filtering-and-foldering individual messages, at least for my goal of having some idea of the content of the conversations here, although--not being a full-time library technologist--I'm really just skimming. I also suspect that I'm not the only digest-mode subscriber who would see value in a digest-mode option that excluded job postings. As this is an actual LISTSERV(tm) mailing list, it's possible for the list owner to define 'topics', and then for people to set up their subscription to exclude those they wish to ignore: http://www.lsoft.com/manuals/16.0/htmlhelp/list%20owners/ModeratingEditingLists.html#2338132 I would suspect it would be honored even in digest mode, but I've never tried it. -Joe
Re: [CODE4LIB] separate list for Jobs
On May 8, 2014, at 3:54 PM, Coral Sheldon-Hess wrote: I have another, maybe minor, point to add to this: I've posted a job to Code4Lib, and I did it wrong. I have no idea how I'm supposed to make a job show up correctly, and now that I have realized I've done it wrong, I probably won't send another job to this list. (Or maybe I'll look it up in ... where? the wiki?) A second list would make this a lot clearer, I think. So, from my 'knowing way too much about LISTSERV(tm)-brand mailing lists' perspective, having been the primary support person at a university for a couple of years: There's another feature for 'sub-lists', where you can set up parent/child relationships between lists ... so you can have a separate address to send to for job postings specifically: http://www.lsoft.com/manuals/16.0/htmlhelp/list%20owners/StartingMailingLists.html#2337469 I've never tried it, but it might be possible to set the SUBJECTHDR on the sub-list so the parent list assigns a topic for a given sub-list. -Joe
Re: [CODE4LIB] separate list for discussing a separate list for jobs
On May 6, 2014, at 12:34 PM, Dan Chudnov wrote: Is it time to reconsider: should we start a separate list for Job: postings? code4lib-jobs, perhaps? I think the real question here is if we should have a separate list for discussing if we need a separate list for jobs. I propose 'code4lib-jobs-list-discuss'. -Joe
Re: [CODE4LIB] CD auto-loader machine and/or services to rip CD's to disk
On Apr 30, 2014, at 11:31 AM, Derek Merleaux wrote: I have a few thousand CDs and DVDs of images scanned back in the days of more expensive server storage. I want the files on these transferred to a hard drive or cloud storage where I can get at them and sort out the keepers etc. I have seen a lot of great home-built auto-loader machines, but sadly do not have time/energy right now to build my own. Looking for recommendations for machines and/or for a reliable service that will take my discs and put them on a server. Summer interns. Well, I guess it depends on just how many thousands it is. I'm actually surprised that there aren't any groups renting these sorts of things out -- most efforts like this (or film scanning, book scanning, etc.) are generally an effort that might run for a year or two, and then the gear isn't needed anymore.* You'd think there'd be a market for folks to share the costs... find three groups looking to do the scanning, share the up-front costs, and then pass it from place to place. I think that IMLS has given grants for these sorts of efforts... but if they could help match up equipment to groups that needed it, they might be able to get better results for each dollar spent. -Joe * Unless some item isn't discovered 'til later.
Re: [CODE4LIB] CFP: A Librarian's Introduction to Programming Languages
On Mar 26, 2014, at 9:32 AM, Simon Spero wrote: I would structure the book by task, showing how different languages would implement the same task. For example, using a MARC parsing library in java, groovy, python, ruby, perl, c/c++/objective c, Haskell. Implementing same. Using a rest API. Implementing a rest API. Doing statistical analysis of catalog records, circulation data, etc. Doing knowledge-based analysis of same -- Treatment of each topic and language is likely to be cursory at best, and I am not sure who the audience would be. A series of 'language for librarians' books would seem more useful and easier to produce. If you tried to put it all into a book, you'd have two issues: 1. It'd be horribly long. (Anyone remember the 'Encyclopedia of Graphical File Formats'?) 2. Tools change over time, and books don't. ... so instead, perhaps the code4lib community would want to try to put some of these together on the code4lib wiki. E.g., for the MARC one: http://wiki.code4lib.org/index.php/Working_with_MARC ... people could contribute recipes of how they use the various libraries that are linked in. (Or just say: look, it's outdated, we listed it, but we recommend (x) instead.) Think of it like a code golf challenge -- someone throws out a problem, and members of the community (if they have the time) submit their various solutions in different languages or using different libraries. ... another possibility would be to organize something over on stackexchange ... if you set some 'scoring criteria', we could run them as 'code-challenges' on the codegolf site: http://codegolf.stackexchange.com/questions/tagged/code-challenge -Joe
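As a tiny example of the sort of recipe such a wiki page could collect: a from-scratch walk of the binary MARC (ISO 2709) record structure in Python. This is purely illustrative; for real work you'd use an existing library (pymarc, MARC::Record, marc4j, etc.), and the record-building helper exists only to have something to parse.

```python
# Illustrative-only sketch of the ISO 2709 (binary MARC) record layout:
# 24-byte leader, a directory of 12-byte entries, then the field data.
FT = b"\x1e"  # field terminator
RT = b"\x1d"  # record terminator

def build_marc(fields):
    """Assemble a minimal record from (tag, value) pairs -- test data only."""
    data = b""
    directory = b""
    for tag, value in fields:
        payload = value.encode("utf-8") + FT
        directory += f"{tag}{len(payload):04d}{len(data):05d}".encode("ascii")
        data += payload
    directory += FT
    base = 24 + len(directory)           # base address of data
    total = base + len(data) + 1         # +1 for the record terminator
    leader = f"{total:05d}cam a22{base:05d} a 4500"
    return leader.encode("ascii") + directory + data + RT

def parse_marc(record):
    """Return the leader and a dict of tag -> list of raw field values."""
    leader = record[:24].decode("ascii")
    base = int(leader[12:17])            # base address lives at leader[12:17]
    directory = record[24:base - 1]      # directory ends with a field terminator
    fields = {}
    for i in range(0, len(directory), 12):
        tag = directory[i:i + 3].decode("ascii")
        length = int(directory[i + 3:i + 7])
        start = int(directory[i + 7:i + 12])
        value = record[base + start: base + start + length - 1]  # drop the FT
        fields.setdefault(tag, []).append(value.decode("utf-8"))
    return leader, fields

record = build_marc([("245", "Working with MARC"), ("260", "Code4Lib")])
leader, fields = parse_marc(record)
print(fields["245"][0])   # -> Working with MARC
```

A real recipe would go on to split the `\x1f`-delimited subfields, which this sketch leaves as raw strings.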
Re: [CODE4LIB] CFP: A Librarian's Introduction to Programming Languages
On Mar 25, 2014, at 9:03 AM, Miles Fidelman wrote: Come to think of it, there's nothing there to frame the intent and scope of the book - is it aimed at librarians who write code, or at librarians who are trying to guide people to topical material? An excellent question, so I'm cc'ing the editors for the book, so maybe they can answer. (I suspect by the languages listed that it's the first one; the second would be so broad that it might not be useful ... I'm having a difficult time coming up with justifications for using Logo, IDL or Brainfuck in a library [1]. And the mention of how a specific language can be used to enhance library services and resources might be a clue, too.) Either way, it sure seems like at least three framing topics are missing: - a general overview of programming language types and characteristics (i.e., context for reading the other chapters) - a history of programming languages (the family tree, if you will) - programming environments, platforms, tools, libraries and repositories - a language's ecosystem probably influences the choice of language as much as the language itself Agreed on all three ... in some cases, the main justification for using a language is the ecosystem (e.g., CPAN for Perl). In some cases, it might be worth just assuming a library -- e.g., do you want to teach people (ECMA|J(ava)?|Live)Script, or just assume jQuery, so they can get up to speed faster? (Yes, I know, you then bring in the jQuery vs. MooTools vs. every other JS library debate, but I think it's safe to say that jQuery is a de facto standard these days.) - non-language languages - e.g., sql/nosql, spreadsheet macros and other platforms that one builds on Agreed on the need for SQL. NoSQL isn't really a language on its own; I'm not aware of any specific general API, so I'd go with XPath/XSLT for discussing non-relational data. 
Macro languages would be useful (and I'd assume the 'Basic' proposal was actually for VBA, so you could create more complex MS Access databases) -Joe [1] okay, maybe Logo in the context of MakerSpaces, but still nothing on the other two. ps. I haven't trimmed this, so the editors can see some of the other comments made. Miles Fidelman p.s. I wrote a book for ALA Editions, they were great to work with. The acquisitions editor I worked with is now a Sr. Editor, so I expect they're still good folks to work with. Jason Bengtson wrote: I'm also surprised not to see anything about the sql/nosql end of the equation. Integral to a lot of apps and tools . . . at least from a web perspective (and probably from others too). Best regards, Jason Bengtson, MLIS, MA Head of Library Computing and Information Systems Assistant Professor, Graduate College Department of Health Sciences Library and Information Management University of Oklahoma Health Sciences Center 405-271-2285, opt. 5 405-271-3297 (fax) jason-bengt...@ouhsc.edu http://library.ouhsc.edu www.jasonbengtson.com NOTICE: This e-mail is intended solely for the use of the individual to whom it is addressed and may contain information that is privileged, confidential or otherwise exempt from disclosure. If the reader of this e-mail is not the intended recipient or the employee or agent responsible for delivering the message to the intended recipient, you are hereby notified that any dissemination, distribution, or copying of this communication is strictly prohibited. If you have received this communication in error, please immediately notify us by replying to the original message at the listed email address. Thank You. 
On Mar 25, 2014, at 7:39 AM, Ian Ibbotson ian.ibbot...@k-int.com wrote: Going in the other direction from cobol and fortran - Fair warning - putting on java evangelist hat - :) I wonder if it might be worth suggesting to the authors that they change java into JVM Languages and cover off Java, Scala, Groovy, ... (others). We've had lots of success in the GoKB (http://gokb.org/) and KB+ (https://www.jisc-collections.ac.uk/News/kbplus/) Knowledge Base projects using Groovy on Grails - essentially all the pre-built libraries and enterprise gubbins of Java, but with a more Ruby-esque idiom, making it much more readable / less verbose / more expressive, and integrating nicely with all that existing enterprise infrastructure to boot. The use of embedded languages in JVMs (including JavaScript) means that the use of Domain Specific Languages is becoming more and more widespread under JVMs, and this seems (to me) an area where there is some real advantage to having practitioners with real coding skills - maybe not the hardcore systems development stuff, but certainly the ability to tune and configure software. Expressing things like business rules in DSLs (e.g. how to choose a supplier for an item, or how to deduplicate a title) gives librarians an opportunity to tune the behaviour of systems dynamically without system
Re: [CODE4LIB] Usability resources
On Mar 25, 2014, at 4:07 PM, Coral Sheldon-Hess wrote: Some things that came up in the UX discussion (well, the third of it I was in) at the breakout session, about how to get your library to be more open to UX: [trimmed, although I agree on the Steve Krug books] I apologize for the self-promotion, but not all libraries' cultures allow for the big public test approach. Mine ... might, now, but probably wouldn't have, a couple of years ago. There's been a recommendation for years that big public tests are a waste of people's time ... you don't do that until it's effectively a release candidate. Here are the problems: (1) there are going to be one or two problems that account for the majority of the problem reports. (2) once everyone's tested out the buggy version, they're tainted, so they can't be a clean slate when testing the next version. Most recommendations that I've seen call for 3-5 testers for each iteration, with 2-3 being preferred if you're doing fast cycles. [1] Yes, you can run into the one tester with completely unreasonable demands about how things should be done, but even if your programmers don't see how stupid the ideas are, they'll be shown to be horrible in the next test cycle. If you run tests that are too large, you've got to leave some long time window for people to test, someone has to correlate all of the comments ... it's just a drag. Small test groups mean you can run a day of testing once a week and keep moving forward. -Joe [1] I'll probably out myself as an old fogey here, but: http://www.useit.com/articles/why-you-only-need-to-test-with-5-users/
Re: [CODE4LIB] Job: PERL PROGRAMMER at The Center for Research Libraries
For those looking to hire a Perl programmer, two suggestions: 1. Don't put it in all caps: http://www.perl.org/about/style-guide.html 2. Make sure you post on the Perl jobs board: http://jobs.perl.org/ -Joe ps. I have no idea how the Java folks like their language capitalized, but I suspect it's similar. pps. On the plus side, it makes it really easy to weed out resumes of who's only dabbling and not active in the community. On Mar 10, 2014, at 11:35 AM, j...@code4lib.org wrote: PERL PROGRAMMER The Center for Research Libraries Chicago Center for Research Libraries (CRL) is a membership consortium consisting of the leading academic and research libraries in the U.S. and abroad, with a unique and historic collection. A recently awarded grant from the Andrew W. Mellon Foundation has enabled the CRL to continue and expand its efforts to shape a data-centered international strategy for archiving and digitizing historical journals and newspapers. We are seeking a PERL Programmer to work with our existing team of librarians to further develop and maintain data projects critical to meeting our objective. Work primarily involves analyzing and manipulating data sets from library and commercial sources to pull out needed data and transform it into additional formats for ingest into existing databases or tools used for presentation of the data. Duties and Responsibilities: • Working closely with librarians to analyze and manipulate data sets • Creating optimized scalable code • Design, build and test tools to analyze data, extract patterns, and transform data among various formats as required by project demands. • Design and build user-friendly interface for tools. Requirements: • Strong analytical skills, with experience analyzing dataflow, data patterns and work flow • Minimum of 1 year of PERL programming experience • Experience using PERL, JAVA or other programming languages to normalize text and applying API's to harvest or capture data. 
• Ability to collaborate and contribute to a team and work independently • Ability to document and explain standards • Related degree required In addition to professional challenge and the chance to make a creative contribution, the CRL offers a competitive salary and exceptional benefits package. Respond with the title of the position in the subject line to: resu...@crl.edu. You may also respond by mail or fax, indicating the position you are applying for to: Human Resources Center for Research Libraries 6050 S. Kenwood Ave. Chicago, IL 60637 Fax: 773-955-4545 An Equal Opportunity Employer m/f/d/v Brought to you by code4lib jobs: http://jobs.code4lib.org/job/12932/
Re: [CODE4LIB] Job: PERL PROGRAMMER at The Center for Research Libraries
On Mar 10, 2014, at 12:19 PM, Lisa Rabey wrote: On Mon, Mar 10, 2014 at 11:46 AM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: For those looking to hire a Perl programmer, two suggestions: 1. Don't put it in all caps: http://www.perl.org/about/style-guide.html This is a fair point if they only all-capped Perl, which they didn't; they capped the title of the job. I'm assuming they did this for formatting reasons in the email, which should have no bearing on who's dabbling in the community and who is not. And although I'm normally a fan of trimming down message text to the relevant parts, you conveniently removed the other three occurrences of 'PERL' in the posting: We are seeking a PERL Programmer to work ... Minimum of 1 year of PERL programming experience Experience using PERL, JAVA or other ... But it also raises the point that if I were a Perl programmer, someone nitpicking about email formatting is probably not someone I would want to work with. Right ... I should apologize for top-posting in my last message. I'm sorry, and I'll try not to do it again. Thank you for not continuing the top-posting in your reply. -Joe
Re: [CODE4LIB] Book scanner suggestions redux
On Mar 3, 2014, at 10:54 AM, Aaron Rubinstein wrote: Hi all, We’re looking to purchase a book scanner and I was hoping to get some recommendations from those who’ve had experience. I don't have experience, but a couple of years back, a group started selling kits to make book scanners: http://diybookscanner.myshopify.com/products/diy-book-scanner-kit It's $500+shipping, and missing some parts (glass, cameras, paint), but it means that instead of carpentry skills, you just need experience assembling things. -Joe
Re: [CODE4LIB] online book price comparison websites?
On Feb 26, 2014, at 3:14 PM, Jonathan Rochkind wrote: Anyone have any recommendations of online sites that compare online prices for purchasing books? I'm looking for recommendations of sites you've actually used and been happy with. They need to be searchable by ISBN. Bonus is if they have good clean graphic design. Extra bonus is if they manage to include shipping prices in their price comparisons. Might be too late, but: http://isbn.nu/ It doesn't include the shipping prices in its results, though. The API is just appending the ISBN to the end of the URL, in either the 10- or 13-digit form: http://isbn.nu/0060853980 http://isbn.nu/9780060853983 -Joe
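For anyone scripting against that, the lookup URL is trivial to build, and converting an ISBN-10 to its ISBN-13 form is just a '978' prefix plus a recomputed check digit. A sketch (the function names are mine; the URL scheme is as described above):

```python
def isbn10_to_isbn13(isbn10):
    """Convert an ISBN-10 to ISBN-13: prefix '978', recompute the check digit."""
    core = "978" + isbn10[:9]
    # ISBN-13 check digit: alternate weights 1 and 3, left to right
    total = sum((1 if i % 2 == 0 else 3) * int(d) for i, d in enumerate(core))
    return core + str((10 - total % 10) % 10)

def isbn_nu_url(isbn):
    """isbn.nu lookup is just the ISBN appended to the site root."""
    return "http://isbn.nu/" + isbn

print(isbn10_to_isbn13("0060853980"))   # -> 9780060853983
print(isbn_nu_url("9780060853983"))     # -> http://isbn.nu/9780060853983
```

The two example ISBNs above are the same ones from the message, so you can check that both URL forms resolve to the same book.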
Re: [CODE4LIB] Question about OAI Harvesting via Perl
On Jan 14, 2014, at 3:01 PM, Eka Grguric wrote: Hi, I am a complete newbie to Perl (and to Code4Lib) and am trying to set up a harvester to get complete metadata records from OAI-PMH repositories. My current approach is to use things already built as much as possible - specifically Net::OAI::Harvester (http://search.cpan.org/~esummers/OAI-Harvester-1.0/lib/Net/OAI/Harvester.pm). The code I'm using is located in the synopsis, and specific parts of it seem to work with some samples I've tried. For example, if I submit a request for a list of sets to the OAI URL for arXiv.org (http://arXiv.org/oai2) I get the correct list. The error I run into reads "can't call listRecords() on an undefined value in *filename* line *#*". listRecords() seems to have been an issue in past iterations, but I'm not sure how to get around it. At the moment it looks like this: ## list all the records in a repository my $list = $harvester->listRecords( metadataPrefix => 'oai_dc' ); Any help (or Perl resources) would be appreciated! The error message you're getting is a sign that '$harvester' (the item that you tried calling 'listRecords' on) hasn't been set up properly. The typical scenarios are that either the object-creation call was never made, or, when you tried to create it, the constructor returned undef (undefined value) to indicate that something had gone wrong. How did you initialize it? -Joe
Re: [CODE4LIB] The lie of the API
On Dec 2, 2013, at 1:25 PM, Kevin Ford wrote: A key (haha) thing that keys also provide is an opportunity to have a conversation with the user of your api: who are they, how could you get in touch with them, what are they doing with the API, what would they like to do with the API, what doesn’t work? These questions are difficult to ask if they are just an IP address in your access log. -- True, but, again, there are other ways to go about this. I've baulked at doing just this in the past because it reveals the raw and primary purpose behind an API key: to track individual user usage/access. I would feel a little awkward writing (and receiving, incidentally) a message that began: -- Hello, I saw you using our service. What are you doing with our data? Cordially, Data service team -- It's better than posting to a website: We can't justify keeping this API maintained / available, because we have no idea who's using it, or what they're using it for. Or: We've had to shut down the API because we'd had people abusing the API and we can't easily single them out, as it's not just coming from a single IP range. We don't require API keys here, but we *do* send out messages to our designated community every couple of years with: If you use our APIs, please send a letter of support that we can include in our upcoming Senior Review. (Senior Review is NASA's peer review of operating projects, where they bring in outsiders to judge if it's justifiable to continue funding them, and if so, at what level.) Personally, I like the idea of allowing limited use without a key (be it number of accesses per day, number of concurrent accesses, or some other rate limiting), but as someone who has been operating APIs for years and is *not* *allowed* to track users, I've seen quite a few times when it would've made my life so much easier. And, if you cringe a little at the ramifications of the above, then why do you need user-specific granularity? 
(That's really not meant to be a rhetorical question - I would genuinely be interested in whether my notions of open and free are outmoded and based too much in a theoretical purity that unnecessary tracking is a violation of privacy). You're assuming that you're actually correlating API calls to the users ... it may just be an authentication system and nothing past that. Unless the API key exists to control specific, user-level access precisely because this is a facet of the underlying service, I feel somewhere in all of this the service has violated, in some way, the notion that it is open and/or free, assuming it has billed itself as such. Otherwise, it's free and open as in Google or Facebook. You're also assuming that we've claimed that our services are 'open'. (mine are, but I know of plenty of them that have to deal with authorization, as they manage embargoed or otherwise restricted items). Of course, you can also set up some sort of 'guest' privileges for non-authenticated users so they just wouldn't see the restricted content. All that said, I think a data service can smooth things over greatly by not insisting on a developer signing a EULA (which is essentially what happens when one requests an API key) before even trying the service or desiring the most basic of data access. There are middle ground solutions. I do have problems with EULAs ... one in that we have to get things approved by our legal department, second in that they're often written completely one-sided and third in that they're often written assuming personal use. Twitter and Facebook had to make available alternate EULAs so that governments could use them ... because you can't hold the person who signed up for the account responsible for it. (and they don't want it 'owned' by that person should they be fired, etc.) ... but sometimes they're less restrictive ... more TOS than EULA. Without it, you've got absolutely no sort of SLA ... 
if they want to take down their API, or block you, you've got no recourse at all. -Joe
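The "limited use without a key" idea from this message is usually implemented as some form of rate limiting; a token-bucket sketch follows. All names and numbers here are illustrative, not any particular service's policy:

```python
import time

class TokenBucket:
    """Allow roughly `rate` requests/second with bursts up to `capacity` --
    a common way to offer limited anonymous API access without issuing keys."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # refill tokens for the elapsed time, up to the bucket's capacity
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

# e.g. one bucket per client IP: anonymous users get a small bucket,
# while keyed users could get a larger one (or bypass the limit entirely)
bucket = TokenBucket(rate=1.0, capacity=3)
print([bucket.allow() for _ in range(5)])  # burst of 3 allowed, then throttled
```

Keying the buckets by IP rather than by user is exactly the "not allowed to track users" compromise described above: abusers can be throttled without any per-user identity.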
Re: [CODE4LIB] The lie of the API
On Dec 1, 2013, at 3:51 PM, LeVan,Ralph wrote: I'm confused about the supposed distinction between content negotiation and explicit content request in a URL. The reason I'm confused is that the response to content negotiation is supposed to be a content location header with a URL that is guaranteed to return the negotiated content. In other words, there *must* be a form of the URL that bypasses content negotiation. If you can do content negotiation, then you should have a URL form that doesn't require content negotiation. There are three types of content negotiation discussed in HTTP/1.1. The one that gets used most is 'transparent negotiation', which results in there being different content served under a single URL. Transparent negotiation schemes do *not* redirect to a new URL to allow the cache or browser to identify the specific content returned. (This would require an extra round trip, as you'd have to send a Location: header to redirect, then have the browser request the new page.) So that you don't screw up web proxies, you have to specify the 'Vary' header to tell which parameters you consider significant, so that they know what is or isn't cacheable. So a server that might serve different content based on the Accept and Accept-Encoding headers would return: Vary: Accept, Accept-Encoding (Including 'User-Agent' is problematic because some browsers pack every module + its version in there, making for so many permutations that many proxies will refuse to cache it.) -Joe (who has been managing web servers since HTTP/0.9, and gets annoyed when I have to explain to our security folks each year why I don't reject pre-HTTP/1.1 requests or follow the rest of the CIS benchmark recommendations that cause our web services to fail horribly)
Re: [CODE4LIB] The lie of the API
On Dec 1, 2013, at 7:57 PM, Barnes, Hugh wrote: +1 to all of Richard's points here. Making something easier for you to develop is no justification for making it harder to consume or deviating from well supported standards. [Robert] You can't just put a file in the file system, unlike with separate URIs for distinct representations where it just works, instead you need server side processing. If we introduce languages into the negotiation, this won't scale. It depends on what you qualify as 'scaling'. You can configure Apache and some other servers so that you pre-generate files such as: index.en.html index.de.html index.es.html index.fr.html ... It's even the default for some distributions. Then, depending on the Accept-Language header sent, the server returns the appropriate response. The only issue is that the server assumes that the 'quality' of all of the translations is equivalent. You know that 'q=0.9' stuff? There's actually a scale in RFC 2295 that equates the different qualities to how much content is lost in that particular version: Servers should use the following table as a guide when assigning source quality values: 1.000 perfect representation 0.900 threshold of noticeable loss of quality 0.800 noticeable, but acceptable quality reduction 0.500 barely acceptable quality 0.300 severely degraded quality 0.000 completely degraded quality [Robert] This also makes it much harder to cache the responses, as the cache needs to determine whether or not the representation has changed -- the cache also needs to parse the headers rather than just comparing URI and content. Don't know caches intimately, but I don't see why that's algorithmically difficult. Just look at the Content-type of the response. Is it harder for caches to examine headers than content or URI? (That's an earnest, perhaps naïve, question.) See my earlier response. The problem is that without a 'Vary' header or other cache-control headers, caches may assume that a URL is a fixed resource.
If it were to assume the resource was static, then it wouldn't matter what was sent for Accept, Accept-Encoding or Accept-Language ... and so the first request proxied gets cached, and then subsequent requests get the cached copy, even if that's not what the server would have sent. If we are talking about caching on the client here (not caching proxies), I would think in most cases requests are issued with the same Accept-* headers, so caching will work as expected anyway. I assume he's talking about caching proxies, where it's a real problem. [Robert] Link headers can be added with a simple apache configuration rule, and as they're static are easy to cache. So the server side is easy, and the client side is trivial. Hadn't heard of these. (They are on Wikipedia so they must be real.) What do they offer over HTML link elements populated from the Dublin Core Element Set? Wikipedia was the first place you looked? Not IETF or W3C? No wonder people say libraries are doomed, if even people who work in libraries go straight to Wikipedia. ... oh, and I should follow up to my posting from earlier tonight -- upon re-reading the HTTP/1.1 spec, it seems that there *is* a way to specify the authoritative URL returned without an HTTP round trip, Content-Location: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.14 Of course, it doesn't look like my web browser does anything with it: http://www.w3.org/Protocols/rfc2616/rfc2616 http://www.w3.org/Protocols/rfc2616/rfc2616.html http://www.w3.org/Protocols/rfc2616/rfc2616.txt ... so you'd still have to use Location: if you wanted it to show up to the general public. -Joe
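[Editorial aside: the pre-generated-variants scheme Joe describes (index.en.html, index.de.html, ...) is easy to prototype. Below is a rough sketch, with invented function names, of parsing an Accept-Language header's q-values and picking among available translations. Real servers also handle wildcards, language subtags, and server-side source qualities, all of which this ignores.]

```python
# Hypothetical sketch of server-side language selection: parse the client's
# Accept-Language q-values and pick the best pre-generated variant.

def parse_accept_language(header):
    """Return {language: q} from e.g. 'de, en;q=0.8, fr;q=0.5'."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        if ";q=" in item:
            lang, q = item.split(";q=", 1)
            prefs[lang.strip().lower()] = float(q)
        else:
            prefs[item.lower()] = 1.0  # no q parameter means q=1.0
    return prefs

def pick_variant(header, available):
    """Choose the available language the client rates highest;
    None if the client accepts none of them. Assumes 'available'
    is non-empty (e.g. ['en', 'de', 'fr'] for index.en.html etc.)."""
    prefs = parse_accept_language(header)
    ranked = sorted(available, key=lambda lang: prefs.get(lang, 0.0),
                    reverse=True)
    return ranked[0] if prefs.get(ranked[0], 0.0) > 0 else None

# pick_variant("de, en;q=0.8", ["en", "fr", "de"]) selects "de",
# so the server would respond with the contents of index.de.html.
```

A server doing this must still emit "Vary: Accept-Language", for exactly the proxy-caching reasons discussed earlier in the thread.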
Re: [CODE4LIB] The lie of the API
On Dec 1, 2013, at 9:36 PM, Barnes, Hugh wrote: -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Joe Hourcle (They are on Wikipedia so they must be real.) Wikipedia was the first place you looked? Not IETF or W3C? No wonder people say libraries are doomed, if even people who work in libraries go straight to Wikipedia. It was a humorous aside, regrettably lacking a smiley. Yes, a smiley would have helped. It also doesn't help that there used to be a website out there named 'ScoopThis'. They started as a wrestling parody site, but my favorite part was their advice column from 'Dusty the Fat, Bitter Cat'. I bring this up because their slogan was "cuz if it's on the net, it's got to be true" ... so I twitch a little whenever someone says something similar to that phrase. (unfortunately, the site's gone, and archive.org didn't cache them, so you can't see the photoshopped pictures of Dusty at Woodstock '99 or the Rock's cooking show. They started up a separate website for Dusty, but when they closed that one down, they put up a parody of a porn site, so you probably don't want to go looking for it) I think that comment would be better saved to pitch at folks who cite and link to w3schools as if authoritative. Some of them are even in libraries. Although I wish that w3schools would stop showing up so highly in searches for javascript methods and css attributes, they did have a time when they were some of the best tutorials out there on web-related topics. I don't know if I can claim that to be true today, though. Your other comments were informative, though. Thank you :) I try ... especially when I'm procrastinating on doing posters that I need to have printed by Friday. (but if anyone has any complaints about data.gov or other federal data dissemination efforts, I'll be happy to work them in) -Joe
Re: [CODE4LIB] The lie of the API
On Dec 1, 2013, at 11:12 PM, Simon Spero wrote: On Dec 1, 2013 6:42 PM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote: So that you don't screw up web proxies, you have to specify the 'Vary' header to tell which parameters you consider significant so that it knows what is or isn't cacheable. I believe that if a Vary isn't specified, and the content is not marked as non cachable, a cache must assume Vary:*, but I might be misremembering That would be horrible for caching proxies, to assume that nothing's cacheable unless it said it was. (as typically only the really big websites or those that have seen some obvious problems bother with setting cache control headers.) I haven't done any exhaustive tests in many years, but I was noticing that proxies were starting to cache GET requests with query strings, which bothered me -- it used to be that anything that was an obvious CGI wasn't cached. (I guess that with enough sites using them, proxies have to assume that the sites aren't stateful, and that the parameters in the URL are enough information for hashing) (who has been managing web servers since HTTP/0.9, and gets annoyed when I have to explain to our security folks each year why I don't reject pre-HTTP/1.1 requests or follow the rest of the CIS benchmark recommendations that cause our web services to fail horribly) Old school represent (0.9 could outperform 1.0 if the request headers were more than 1 MTU or the first line was sent in a separate packet with nagle enabled). [Accept was a major cause of header bloat]. Don't even get me started on header bloat ... My main complaint about HTTP/1.1 is that it requires clients to support chunked encoding, and I've got to support a client that's got a buggy implementation.
(and then my CGIs that serve 2GB tarballs start failing, and it's calling a program that's not smart enough to look for SIGPIPE, so I end up with a dozen of 'em going all stupid and sucking down CPU on one of my servers) Most people don't have to support a community-written HTTP client, though. (and the one alternative HTTP client in IDL doesn't let me interact with the HTTP headers directly, so I can't put a wrapper around it to extract the tarball's filename from the Content-Disposition header) -Joe ps. yep, still having writer's block on posters.
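[Editorial aside: for readers who haven't met the chunked encoding Joe mentions — the HTTP/1.1 transfer coding that buggy clients tend to mishandle — each chunk is a hex length, CRLF, that many bytes of data, CRLF, terminated by a zero-length chunk. A minimal decoder sketch (the function name is invented; chunk extensions are dropped, trailers and malformed input are ignored for brevity):]

```python
# Minimal sketch of HTTP/1.1 chunked transfer decoding.
# Wire format per chunk: "<hex size>[;extensions]\r\n<data>\r\n",
# terminated by a zero-size chunk. No error handling here.

def decode_chunked(raw: bytes) -> bytes:
    body = b""
    pos = 0
    while True:
        line_end = raw.index(b"\r\n", pos)
        size_field = raw[pos:line_end].split(b";", 1)[0]  # drop extensions
        size = int(size_field, 16)  # chunk sizes are hexadecimal
        if size == 0:
            return body  # zero-size chunk marks the end of the body
        start = line_end + 2
        body += raw[start:start + size]
        pos = start + size + 2  # skip the CRLF after the chunk data

sample = b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"
# decode_chunked(sample) == b"Wikipedia"
```

The hex size lines are what trip up naive clients: a reader that treats the stream as a plain byte count ends up with the chunk framing embedded in the body.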
Re: [CODE4LIB] calibr: a simple opening hours calendar
On Nov 27, 2013, at 11:01 AM, Jonathan Rochkind wrote: Many of our academic libraries have very byzantine 'hours' policies. Developing UI that can express these sensibly is time-consuming and difficult; by doing a great job at it (like Sean has), you can make the byzantine hours logic a lot easier for users to understand... but you can still only do so much to make convoluted complicated library hours easy to deal with and understand for users. If libraries can instead simplify their hours, it would make things a heck of a lot easier on our users. Synchronize the hours of the different parts of the library as much as possible. If some service points aren't open the full hours of the library, if you can make all those service points open the _same_ reduced hours, not each be different. Etc. To some extent, working on hours displays to convey byzantine hours structures can turn into the familiar case of people looking for technological magic bullet solutions to what are in fact business and social problems. I agree up to a point. When I was at GWU, we were running what was the most customized version of Banner (a software system for class registration, HR, etc.) Some of the changes were to deal with rules that no one could come up with a good reason for, and they should have been simplified. Other ones were there for a legitimate reason.* You should take these sorts of opportunities to ask *why* the hours are so complicated, and either document the reason for it, or look to simplify it. Did a previous librarian have some regularly scheduled thing every Tuesday afternoon, and that's why one section closes down early on Tuesdays? If they're not there anymore, you can change that. Does one station require some sort of shutdown / closing procedure that takes a significant amount of time, so they close early to be done by closing time?
Or do they open late because they have a similar issue setting up in the morning, and it's unrealistic to have them come in earlier than everyone else? Maybe there's something else that could be done to improve and/or speed up the procedures.** Has there been historically less demand for certain types of books at different times of the day? Well, that's going to be hard to verify, as people have now adjusted to the library's hours, rather than vice versa ... but it's a legitimate reason to not keep service points open if no one's using them. ... but I would suggest that you don't use criteria like the US Postal Service's recommendation to remove postboxes -- they based it on number of pieces of mail, and ended up removing them all in some areas. ... Anyway, the point I'm making -- libraries are about service. Simplification might make it easier to keep track of things, but it doesn't necessarily make for better service. -Joe * Well, legitimate to someone, at least. For instance, the development office had a definition of alumni that included donors who might not've actually attended the university. ** When I worked for the group that ran GW's computer labs, some days I staffed a desk that we had over in the library ... but I had to clock in at the main office, then walk over to the other building, and once the shift was over, walk back to the main office to clock out. I got them to designate one of the phones in the library computer lab as being allowed to call into the time clock system, so I could stop wasting so much time ... then they decided to just stop having staff over there. On 11/27/13 9:25 AM, Sean Hannan wrote: I'd argue that library hours are nothing but edge cases. Staying open past midnight is actually a common one. But how do you deal with multiple library locations? Multiple service points at multiple library locations? Service points that are 'by appointment only' during certain days/weeks/months of the year?
Physical service points that are under renovation (and therefore closed) but their service is being carried out from another location? When you have these edge cases sorted out, how do you display it to users in a way that makes any kind of sense? How do you get beyond shoehorning this massive amount of data into outmoded visual paradigms, into something that is easily scanned and processed by users? How do you make this data visualization work on tablets and phones? The data side of calendaring is one thing (and as standard and developed as they are, iCal and Google Calendar's data formats don't get it 100% correct as far as I'm concerned). Designing the interaction is wholly another. It took me a good two or three weeks to design the interaction for our new hours page (http://www.library.jhu.edu/hours.html) over the summer. There were lots of iterations, lots of feedback, lots of user testing. "User testing? Just for an hours page?" Yes. It's one of our most highly sought pieces of information on our website (and yours too, probably). Getting it right pays off.
Re: [CODE4LIB] Faculty publication database
On Oct 25, 2013, at 11:35 AM, Alevtina Verbovetskaya wrote: Hi guys, Does your library maintain a database of faculty publications? How do you do it? Some things I've come across in my (admittedly brief) research: - RSS feeds from the major databases - RefWorks citation lists These options do not necessarily work for my university, made up of 24 colleges/institutions, 6,700+ FT faculty, and 270,000+ degree-seeking students. Does anyone have a better solution? It need not be searchable: we are just interested in pulling a periodic report of articles written by our faculty/students without relying on them self-reporting days/weeks/months/years after the fact. If you're forced to rely on self-reporting, one of the solutions that I've seen is to add a few more features and introduce it as a 'CV Builder' or some sort of 'Faculty Directory' ... so the faculty members get some benefit back out of it, and it's more public so they have an interest in keeping it updated. I'd also recommend talking to the individual colleges -- it's possible that some of them already maintain databases, either for the whole college or at the departmental level. They might be willing to keep the data populated if you provide the hosted service. (and the tenure-track folks have a vested interest in making sure their records are kept up-to-date). In looking through the other recommendations -- I didn't see ORCID or ResearcherID mentioned ... I know they're not exhaustive, but it might be possible to have a way to automate dumps from them -- so the faculty member keeps ORCID up-to-date, and you periodically generate dumps from ORCID for all of your faculty. The last time I checked it, ORCID found all of my ASIST work ... but missed all of the stuff that I've published in space physics and data informatics. (admittedly, those weren't peer-reviewed, but neither were most of the ASIST ones) -Joe
[CODE4LIB] Please use HTTP 503 (was: Library of Congress)
On Oct 1, 2013, at 9:52 AM, Nick Ruest wrote: Welp. XSDs are redirecting. See [1]. -nruest [1] http://www.loc.gov/standards/mods/v3/mods-3-4.xsd (*@#!@#% I tried telling people around here to use HTTP 503 ... but GSA sent out advice to use 302s ...) If there are any people who are still in the process of their 'orderly shutdown' ... please send HTTP 503 (Service Unavailable) for requests, so that search engines don't completely screw things up while we're shut down, or ignorant systems try to process the error page as if it were real content. -Joe Apache : http://stackoverflow.com/q/622466/143791 IIS : http://serverfault.com/q/483145/14119 Nginx : http://stackoverflow.com/q/5984270/143791 On 13-10-01 09:36 AM, John Palmer wrote: Furloughs don't officially start until noon local time Tuesday, so they may be in the process of receiving instructions for shutdown. On Tue, Oct 1, 2013 at 6:21 AM, Doran, Michael D do...@uta.edu wrote: As far as I can tell the LOC is up and the offices are closed. HORRAY!! Let's celebrate! Before we start celebrating, let's consider our friends and colleagues at the LOC (some of who are code4lib people) who aren't able to work and aren't getting paid starting today. -- Michael # Michael Doran, Systems Librarian # University of Texas at Arlington # 817-272-5326 office # 817-688-1926 mobile # do...@uta.edu # http://rocky.uta.edu/doran/ -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Riley Childs Sent: Tuesday, October 01, 2013 5:28 AM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Library of Congress As far as I can tell the LOC is up and the offices are closed. HORRAY!! Let's celebrate! Riley Childs Junior and Library Tech Manager Charlotte United Christian Academy +1 (704) 497-2086 Sent from my iPhone Please excuse mistakes
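[Editorial aside: the links above give per-server recipes; as an illustration of what "send a 503" means on the wire, here is a hypothetical stand-alone maintenance responder using Python's standard library. The handler name, message, and Retry-After value are invented for the example; a real deployment would use the web server's own mechanism, as in the Apache/IIS/Nginx links.]

```python
# Hypothetical maintenance responder: every GET receives a
# 503 Service Unavailable plus a Retry-After hint, so search engines
# treat the outage as temporary instead of indexing the error page.
from http.server import BaseHTTPRequestHandler, HTTPServer

MESSAGE = b"Service unavailable during the shutdown; please retry later.\n"

class MaintenanceHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        self.send_response(503)
        self.send_header("Retry-After", "86400")  # suggest retrying in a day
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(MESSAGE)))
        self.end_headers()
        self.wfile.write(MESSAGE)

    def log_message(self, fmt, *args):
        pass  # keep the example quiet

# To serve: HTTPServer(("0.0.0.0", 8080), MaintenanceHandler).serve_forever()
```

The Retry-After header is the piece a 302 redirect can't convey: it tells crawlers both that the content still exists and roughly when to come back.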
Re: [CODE4LIB] Way to record usage of tables/rooms/chairs in Library
On Aug 16, 2013, at 9:52 AM, Ian Walls wrote: Suma is the most practical and reliable way to do this right now, I think. I've been investigating using a sensor network, but there are a lot of limits on the accuracy of PIR, and trip-lasers are low enough and require enough power that they'd be troublesome to maintain in a busy undergraduate environment. One idea was to use an array of sensors: PIR for motion, microphone for noise level and piezo/something similar for vibration. The thought is that elevated levels of these 3 measurements should correspond to high activity. The placement and calibration of the sensors, though, would be key, and you'd need to do some thorough spot checking with Suma or something similar in order to be confident that what you're measuring (motion, noise and vibration) actually correlate to number of people. The sensors would also need to be made out of cheap enough materials and use low-congestion wireless frequencies in order to be practical. Balancing this with accuracy may never happen... but it would certainly be a fun experiment! If you're going to take the sensor approach, and it's just a matter of if there are bodies in specific places, you *might* be able to do it by modifying cheap webcams. Many are sensitive in infrared, so you take the IR filter out, and then add a visible filter. Position the cameras so that you have coverage of the area you care about, have them take a picture at whatever times you care about, and then it's just looking for hot spots. (although of course, if you do this, it'd be just as easy for someone to review security camera footage, if you have coverage in the places you care about; the IR might be easier to automate the counting, though, if you have someone who's good with automated image analysis) And if it's just a matter of activity counting -- you might be able to see if your wireless access points can tell how many items they're in contact with, and use that as a proxy. -Joe
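[Editorial aside: the automated counting Joe mentions for the IR-camera approach could be as simple as thresholding each frame and counting connected warm regions. A toy sketch — the function name, threshold, and hand-made "frame" (a plain list of pixel rows) are all invented for the example; real frames would need calibration and a minimum blob size to reject noise:]

```python
# Toy sketch of hot-spot counting: threshold a small grayscale IR frame
# and flood-fill to count connected warm blobs, each blob standing in
# for one warm body.

def count_hotspots(frame, threshold=200):
    rows, cols = len(frame), len(frame[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = 0
    for r in range(rows):
        for c in range(cols):
            if frame[r][c] >= threshold and not seen[r][c]:
                blobs += 1
                stack = [(r, c)]  # flood-fill this blob
                while stack:
                    y, x = stack.pop()
                    if 0 <= y < rows and 0 <= x < cols \
                            and frame[y][x] >= threshold and not seen[y][x]:
                        seen[y][x] = True
                        stack.extend([(y + 1, x), (y - 1, x),
                                      (y, x + 1), (y, x - 1)])
    return blobs

frame = [
    [0,   0,   0, 250, 250],
    [0, 230,   0,   0,   0],
    [0, 230,   0,   0,   0],
]
# count_hotspots(frame) == 2 (one two-pixel blob top right, one at left)
```

For headcounts rather than occupancy maps, this is the whole pipeline; the hard parts in practice are the calibration and spot-checking Ian describes.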
Re: [CODE4LIB] locking app for iPads
On Jul 25, 2013, at 3:52 PM, Cheryl Kohen wrote: Dear Fellow Techs, We're looking to create a circulation policy for iPads (gen 4) in the Learning Commons, and were wondering about an app that will lock the device after a specific amount of time (3-4 hours). The idea is if a student does, in fact, steal the device, they will be locked out of actually utilizing it. Has anyone heard of something like this? I don't know of a time-sensitive one, but Apple's Find My iPad (or iPhone), has an option to remotely lock a device: https://www.apple.com/icloud/features/find-my-iphone.html I suspect it needs a network connection to send the signal to lock. I don't know if it'll stop anyone who can jailbreak the device, but it would hopefully stop the person attempting to 'borrow' it long-term. (and you can track where it is, if it's a device with GPS) -Joe
Re: [CODE4LIB] Lightweight Autocomplete Application
On Jul 8, 2013, at 10:37 AM, Anderson, David (NIH/NLM) [E] wrote: I'm looking for a lightweight autocomplete application for data entry. Here's what I'd like to be able to do: * Import large controlled vocabularies into the app * Call up the app with a macro wherever I'm entering data * Begin typing in a term from the vocabulary, get a list of suggestions for terms * Select a term from the list and have it paste automatically into my data entry field Ideally it would load and suggest terms quickly. I've looked around, but nothing really stands out. Anyone using anything like this? Is this web-based? If not, do you have control of the software that you're entering the data into? If so, what language is it in? If not, what OS are you using? -Joe
Re: [CODE4LIB] StackExchange reboot?
On Jul 8, 2013, at 3:50 PM, Christie Peterson wrote: I agree with both Shaun and Galen's points; when you're asking a how to do X with tool Y type of question, SE is a great forum. Like Christina, I've mostly encountered SE when Googling for answers to these types of questions. However, for the reasons that Henry and Gary mentioned, I was disappointed in the Digital Preservation SE experience. At the request of one of the SE organizers, I posted a question there that I had also posted to a listserv. It was flagged for not being in the proper form, but I have no idea how I could have framed it properly for SE because it simply wasn't a question that had a single answer. I wanted discussion. Digital Preservation in particular is a developing field and I was trying to gauge opinions and currently evolving best practices. Somewhat ironically given the potential value of the commenting and upvoting mechanism, SE did not prove to be a good forum for this. There may be some value to having a code4lib SE instance that answers questions of the how to do X with tool Y type and similar for the reasons that Shaun and Galen state. But unless the community standards about what makes a good SE question change radically, I don't see it being an attractive or useful forum for the more open-ended, discussion/opinion type questions that people often post to library, digital preservation and other listservs. I actually just responded to this issue the other day on the Open Data SE site: http://meta.opendata.stackexchange.com/q/126/263 Back when Cooking SE started (~2.5 years ago), a question with multiple possible answers was considered valid. They didn't tend to like polls ('what's the best ...') but questions about possibilities of how to deal with problems were acceptable. I'd link to some of them, but there have since been a few people who go around and vote to close every question they don't like, even if they've gotten a dozen or more upvotes.
Here's one instead that's not even a question that's ranked in the top 10 'questions' on the cooking site: http://cooking.stackexchange.com/q/784/67 Personally, I'm of the opinion that there are *very* few problems that only have a single solution, or a 'best' solution. What they really tend to reward people for is coming up with a plausible, moderately detailed answer quickly enough. I've seen a number get marked as the 'best answer' within 30 min of the question being asked where the answer from my point of view was just plain wrong. I do see a use for the sort of things that might've once been considered 'community wiki' ... what books can I recommend to a 3rd grader who is interested in science fiction? (I've cheated before and worded them like 'where can I find a list of books to recommend ...') It *might* be possible to get enough like-minded people involved to ensure that if anyone attempts to close reasonable questions we can get them re-opened quickly ... but I'd like to recommend changing the scope up front to museums, libraries, and archives. I don't know that the more practical 'library' and the abstract/academic 'library science' communities really mesh all that well. And I should probably go get some sleep as I write e-mail that's even more incoherent than typical when I've only gotten ~8hrs sleep over the last 3 days. -Joe
Re: [CODE4LIB] Code4Lib 2014: Save the dates!
On Jun 29, 2013, at 7:16 AM, BWS Johnson wrote: Salvete! I am happy to announce that we have secured the venue and dates for Code4Lib 2014! The conference will be held at the Sheraton Raleigh Hotel in downtown Raleigh, NC on March 24 - 27, 2014. Preconferences will be held Monday March 24, and the main conference on Tuesday March 25 - 27. Hooray, that's sort of close. Maybe I'll be able to pit fight my own place next year. Finally, the hotel has the capacity to host all of the attendees, and we've negotiated a rate of $159/night that includes wireless access in the hotel rooms. Hotel reservations will be able to be made after you register, using the information provided in your registration confirmation. We will be publishing more details as they become available. Ruh oh. This was rather shocking. Perhaps you might wish to show them a hotels.com search, which puts your $159 just over the Hilton and about double other places in the vicinity. I'm sure it's nice and all that, but uh, perhaps they would be willing to come down seeing as how we're sending a boatload of traffic their way. The government per-diem rate for Raleigh is $91 per night : http://www.gsa.gov/portal/category/100120 I have no idea if that can be used for negotiations at all. For some reason, they're not showing any federal government rates when I searched, but they're offering state government employees rooms at $64/night. (you might have to pay extra for the wifi, though) I highly suggest that people who work for public universities or libraries inquire about getting that rate. -Joe (even though the state of maryland makes me pay into the state retirement system because I'm a municipal elected official, they won't issue me a state ID card, so I can't get the state rates when traveling, which are typically better than the federal rates ... I've actually debated about whether it makes sense to work for 3 years in a real state job, then claim a pension based on my 'top 3 years' of pay times the number of years worked)
Re: [CODE4LIB] DOI scraping
On May 21, 2013, at 9:40 PM, Fitchett, Deborah wrote: Joe and Owen-- Thanks for the ideas! It's a bit of the opposite goal to LibX, in that rather than having a title/DOI/whatever from some random site and wanting to get to the full-text article, I'm looking at the use case of academics who are already viewing the full-text article and want a link that they can share with students. Even aside from the proxy prefix, the url in their browser may include (or consist entirely of) session gunk. I'll try a regexp and see how far that gets me. I'm a bit trepidatious about the way the DOI standard allows just about any character imaginable, but at least there's the 10. prefix. Am also considering that if DOIs also appear in the article's bibliography I'll need to make sure the javascript can distinguish between them and the DOI for the article itself; but a lot of this might be 'cross that bridge if I come to it' stuff. Crap. I just remembered: http://shortdoi.org/ ... I don't know if any publishers are actually using them, or if they're just for people to use on twitter and other social media. The real problem with them is that they don't have the '10.' string in them. You can probably get away with just tracking the resolving form of them: http://doi[.]org/(\w+) And ignore the 10/(\w+) form. -Joe
Re: [CODE4LIB] Policies for 3D Printers
On May 20, 2013, at 4:47 PM, Bigwood, David wrote: That's a question every library will have to answer for themselves. For us it makes perfect sense. Our scientists are sending out files to have 3D models of craters. When the price drops enough it will become more cost effective to do that in-house. It will just be an extension of maps and remote sensing data we already have in the collection. I can see a limit being fabrication related to the mission of the Institute, same as the large-format printer. A public library might have other concerns. If it is unlimited and free, is printing out 100 Hulk statues to sell at a comic convention acceptable? How about Barbie dolls to sell at a flea market? Or maybe Barbee dolls to side-step trademarks? Lots of unanswered questions, but each library will have to decide based on local conditions. Actually, this made me think back to my undergrad, when I worked in our school's 'Academic Computing' department. We had a big problem with students printing out multiple copies of their thesis on the printers in the computer labs, because they'd: 1. tie up the printers for a rather long time. 2. burn through all of the paper The result was that one or two bad actors kept everyone else from being able to use the services, because they were taking advantage of our 'free' printing. Our typical process, when we found someone needed to print their thesis, was to print one copy from the printer in our staff offices, and they then had to go to one of the local copy shops to make the additional copies that they needed. (the policy of only one copy had been established for years, but was only really enforced when people came in and complained about people printing whole books) Although I can appreciate some of the arguments for making library services free, there needs to be some sort of a line drawn so that one or two people don't end up monopolizing a service.
Just as I left, they ended up going to a system of some number of free pages per semester per student, with them having to pay if they wanted to print more than their gratis quota. I don't know if something like that would work, but you'd have to work out how to handle it. (number of objects? time spent on the printer? amount of material used?) -Joe
Re: [CODE4LIB] DOI scraping
On May 17, 2013, at 12:32 AM, Fitchett, Deborah wrote: Kia ora koutou, I’m wanting to create a bookmarklet that will let people on a journal article webpage just click the bookmarklet and get a permalink to that article, including our proxy information so it can be accessed off-campus. Once I’ve got a DOI (or other permalink, but I’ll cross that bridge later), the rest is easy. The trouble is getting the DOI. The options seem to be: Can anyone think of anything else I should be looking at for inspiration? 4. Look for any strings that look like a DOI: \b((?:http://dx.doi.org/|doi:|)10.[\d.]+/(?:\S+)) (as it sucks to code special things for each database, in case they change or you add a new one) You can then fall back to #1 if necessary. Also on a more general matter: I have the general level of Javascript that one gets by poking at things and doing small projects and then getting distracted by other things and then coming back some months later for a different small project and having to relearn it all over again. I’ve long had jQuery on my “I guess I’m going to have to learn this someday but, um, today I just wanna stick with what I know” list. So is this the kind of thing where it’s going to be quicker to learn something about jQuery before I get started, or can I just as easily muddle along with my existing limited Javascript? (What really are the pros and cons here?) It depends on what you're going to do with the output -- I'd likely look through the a href='' values for http://dx.doi.org DOIs first, then just look at the text displaying on the page. I don't think you'd need jQuery for that. -Joe
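[Editorial aside: option 4 (scan for DOI-shaped strings) is easy to prototype. The bookmarklet itself would be JavaScript, but the pattern logic is the same; this Python sketch uses a slightly tightened variant of the regex suggested above, and like any \S+-based pattern it can over-match trailing punctuation, so a real version would want to trim that.]

```python
# Sketch of DOI scraping: find DOI-shaped strings in page text and
# return them stripped of any resolver or "doi:" prefix.
import re

DOI_RE = re.compile(r"\b(?:https?://dx\.doi\.org/|doi:)?(10\.[\d.]+/\S+)")

def find_dois(text):
    """Return the bare 10.xxxx/yyyy DOIs found in a blob of text."""
    return DOI_RE.findall(text)

sample = ("See http://dx.doi.org/10.1000/182 and also "
          "doi:10.1234/abc-def for details.")
# find_dois(sample) == ['10.1000/182', '10.1234/abc-def']
```

Since a DOI in an `<a href>` will usually carry the dx.doi.org prefix while one in the bibliography text may not, normalizing both to the bare 10.x form makes Deborah's "article DOI vs. bibliography DOI" comparison simpler: whichever DOI the citation metadata says belongs to the page can be matched against the normalized candidates.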
Re: [CODE4LIB] On-going support for DL projects
On May 17, 2013, at 9:51 AM, Tim McGeary wrote: I'm interested in starting or joining discussions about best practices for on-going support for digital library projects. In particular, I'm looking at non-repository projects, such as projects built on applications like Omeka. In the repository context, there are initiatives like APTrust and DPN that are addressing on-going and long term collaborative support. But, as far as I know, we aren't having the same types of discussions for DL projects that are application driven. If you're asking about funding issues, most of those discussions that I've seen lump it into 'governance'. There is no easy answer for this, so I'm looking for discussion. - Should we begin considering a cooperative project that focuses on emulation, where we could archive projects by emulating the system environment they were built in? I know that there are projects using emulation when it'd be too expensive to port the software (and validate / vet it). There are some that are setting up VMs for new software being written, so that they can archive the whole environment to ensure that the proper version of the OS, libraries, etc. are captured. Most of the ones that I've seen have been focusing on scientific workflows, but that's likely because that's the field I'm in, so I tend to see more of those talks at conferences than other subjects. - Do we set policy that these types of projects last for as long as they can, and once they break they are pulled down? I wouldn't recommend that directly ... like anything, the stuff being archived has a value, and if someone's willing to pay for it to be continued, then you do it. Maybe you just need to have a policy on cost-recovery for when this happens. (and then you need to look at the various 'governance' discussions.) - Do we set policy that supports these projects for a certain period of time and then deliver the application, files, and databases to the faculty member to find their own support?
The ultimate decision might be at a higher pay grade -- you may want to come up with the list of options, estimated costs, and have the provost or deans decide what makes sense for the budget. - Do we look for a solution like the Wayback Machine of the Internet Archive to try to present some static / flat presentation of these projects? Again, it likely depends on what's being archived. An online database that you can search / filter / interact with would be mostly useless as static pages. -Joe
Re: [CODE4LIB] makerspaces in libraries workshop
On May 15, 2013, at 8:30 AM, Edward Iglesias wrote: Hello All, I have the unlikely distinction of getting to offer a 1 day workshop on Makerspaces in libraries. I have a general idea of how it's going to go -- morning theory, afternoon hands-on -- but am a little overwhelmed by the possibilities. My first thought was to show them how to use a Raspberry Pi but that would require them all to buy a Raspberry Pi. I am open to suggestions on what would be worth learning that is hands on and preferably cheap for a group of around 20. What would you teach/learn in an afternoon given the chance? Edward Iglesias I'd make sure to mention that this does *not* have to be high-tech. Our library runs jewelry-making workshops, and some of the local churches have knitting circles / quilting bees, so there can be a social component of 'making'. They've never considered this to be 'makerspaces', but it fits the description. If it were me, depending on how much time you had, I'd try to come up with some sort of a project that people could build and take home with them (and so the Raspberry Pi idea is likely out). Depending on where you are, it might be a good time of year to make bird or bat houses, or maybe something decorative. Have them leave with a physical item that they can take and show off to others. Depending on how soon you'll get kicked out after your class ends, you might be able to plan for building something, and then let people stay later if they want to paint or otherwise decorate it. I'd plan on having someone cut all of the pieces in advance unless it can be done w/ hand tools and you have a sufficient number of the necessary tools ... ideally, you'd want something that could be assembled with press-fit and glue, or maybe a few nails or screws (if you had to add hinges). -Joe If you really need an idea of something to make -- I can give you plans for gift boxes that I make ...
it's a shadow box that says 'in case of emergency, break glass', and you can then put whatever you want in them. (typically, I give 'em with pacifiers to friends having their first child ... but I've done other stuff, like giving one w/ a box of kosher salt, peppercorns and whole cumin to Alton Brown when he was doing a book signing back in 2004 or so) It's simple pine, a plexiglass front, etc. You'll need a table saw, a miter box or chop saw and a label maker, and then it's just a matter of glue, a few nails, and some sanding. (you could also borrow a pneumatic brad nailer + a power sander, so that once everyone has made the item, you can show that it can all be done in 1/10th the time w/ the proper tools ... which is part of the reason for building out these spaces)
[CODE4LIB] FW: Digital Forensics Hackathon - June 3-5
I thought this was something that might interest people in code4lib. -Joe -Original Message- From: Cal Lee [mailto:cal...@email.unc.edu] Sent: Wednesday, May 01, 2013 11:36 AM Subject: Digital Forensics Hackathon - June 3-5 We'll be running a hackathon in Chapel Hill on June 3-5 that will focus on applying digital forensics methods to born-digital collections. We're running this with the Open Planets Foundation, who have done a terrific job in the past of running these events. The format is one in which people bring real technical challenges (including the associated data from their collections) to the event and pair up with developers who can provide substantive solutions to those challenges by the end of the three days. http://wiki.opf-labs.org/display/KB/2013-06-03+OPF+Hackathon+-+Tackling+Real-World+Collection+Challenges+with+Digital+Forensics+Tools+and+Methods+%28Chapel+Hill%29 I'm very excited that we're running this event in Chapel Hill. It's the first time that this very successful OPF model has made it to the US. It should be a great opportunity for all involved. I would really appreciate any efforts you could take to help us with getting the word out about it. Broadcasts through mailing lists, Twitter and such are all helpful. Even better is pointing it out to specific individuals who you think would be interested and would benefit from the event. The deadline for booking a hotel room at the block rate is May 19. But it's even better if people sign up well before then, so we can make the appropriate pairings and plan for the event. - Cal
Re: [CODE4LIB] Tool to highlight differences in two files
On Apr 23, 2013, at 4:37 PM, Alexander Duryee wrote: The absolute simplest way to do this would be to fire up a terminal (OSX/Linux) and: diff page1.html page2.html | less Unfortunately, this will also catch changes made in other markup, and may or may not be terribly readable. At the very least, I'd suggest adding a '-b' which will ignore changes to whitespace. Also see: http://www.w3.org/wiki/HtmlDiff -Joe On Tue, Apr 23, 2013 at 4:31 PM, Alevtina Verbovetskaya alevtina.verbovetsk...@mail.cuny.edu wrote: I've recently begun to use Beyond Compare: http://www.scootersoftware.com/ It's not free or OSS, though. There's also a plugin for Notepad++ that does something similar: http://sourceforge.net/projects/npp-compare/ This is free, of course. Thanks! Allie -Original Message- From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Wilhelmina Randtke Sent: Tuesday, April 23, 2013 4:24 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Tool to highlight differences in two files I would like to compare versions of a website scraped at different times to see what paragraphs on a page have changed. Does anyone here know of a tool for holding two files side by side and noting what is the same and what is different between the files? It seems like any simple script to note differences in two strings of text would work, but I don't know a tool to use. -Wilhelmina Randtke
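For anyone who'd rather script the comparison than eyeball raw diff output, Python's standard difflib module can do the same job, including the side-by-side view Wilhelmina asked about. A sketch, not part of the original thread: the function names and the whitespace-normalization trick (standing in for diff's -b) are my own.

```python
import difflib

def side_by_side(old_path, new_path, out_path="diff.html"):
    """Write an HTML table showing the two files side by side,
    with changed lines highlighted."""
    with open(old_path) as f:
        old = f.readlines()
    with open(new_path) as f:
        new = f.readlines()
    html = difflib.HtmlDiff(wrapcolumn=80).make_file(
        old, new, fromdesc=old_path, todesc=new_path,
        context=True, numlines=3)
    with open(out_path, "w") as f:
        f.write(html)

def unified(old_lines, new_lines):
    """A unified diff that ignores pure-whitespace churn,
    roughly like `diff -b`."""
    norm = lambda lines: [" ".join(l.split()) + "\n" for l in lines]
    return list(difflib.unified_diff(norm(old_lines), norm(new_lines)))
```

As with `diff` on raw HTML, this still flags markup-only changes; for paragraph-level comparison you'd want to strip tags first (e.g. with an HTML parser) before diffing.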
Re: [CODE4LIB] Tool to highlight differences in two files
On Apr 23, 2013, at 8:12 PM, Genny Engel wrote: There's a list here that may be more along the lines of what you're seeking. http://webapps.stackexchange.com/questions/11547/diff-for-websites Hmm ... I guess I should actually accept the answer as it was the only one ever given. -Joe
[CODE4LIB] password lockboxes (was: what do you do: API accounts used by library software, that assume an individual is registered)
On Mar 5, 2013, at 8:29 AM, Adam Constabaris wrote: An option is to use a password management program (KeepassX is good because it is cross platform) to store the passwords on the shared drive, although of course you need to distribute the passphrase for it around. So years ago, when I worked for a university, they wanted us to put all of the root passwords into an envelope, and give them to management to hold. (we were a Solaris shop, so there actually were root passwords on the boxes, but you had to connect from the console or su to be able to use 'em). We managed to drag our heels on it, and management forgot about it*, but I had an idea ... What if there were a way to store the passwords similar to the secret formula in Knight Rider? Yes, I know, it's an obscure geeky reference, and probably dates me. The story went that the secret bullet-proof spray-on coating wasn't held by any one person; there were three people who each knew part of the formula, and that any two of them had enough knowledge to make it. For needing 2 of 3 people, the process is simple -- divide it up into 3 parts, and each person has a different missing bit. This doesn't work for 4 people, though (either needing 2 people, or 3 people to complete it). You could probably do it for two or three classes of people (eg, you need 1 sysadmin + 1 manager to unlock it), but I'm not sure if there's some method to get an arbitrary X of Y people required to unlock. If anyone has ideas, send 'em to me off-list. (If other people want the answer, I can aggregate / summarize the results, so I don't end up starting yet another inappropriate out-of-control thread) ... Oh, and I was assuming that you'd be using PGP, using the public key to encrypt the passwords, so that anyone could insert / update a password into whatever drop box you had; it'd only be taking stuff out that would require multiple people to combine efforts. -Joe * or at least, they didn't bring it up again while I was still employed there.
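For the record, the arbitrary X-of-Y unlock being asked about does exist: it's Shamir's secret sharing (1979). Hide the secret in the constant term of a random degree X-1 polynomial over a prime field, hand each person one point on the curve, and any X points recover the constant term by Lagrange interpolation; fewer than X reveal nothing. A minimal sketch in Python -- the prime and the integer encoding of the secret are illustrative choices, and real use should reach for a vetted implementation with a cryptographic random source:

```python
import random

PRIME = 2**127 - 1  # a Mersenne prime larger than any secret we'll encode

def split(secret, need, total):
    """Split an integer secret (< PRIME) into `total` shares,
    any `need` of which can reconstruct it."""
    # random polynomial with the secret as its constant term
    coeffs = [secret] + [random.randrange(PRIME) for _ in range(need - 1)]
    def f(x):
        return sum(c * pow(x, i, PRIME) for i, c in enumerate(coeffs)) % PRIME
    return [(x, f(x)) for x in range(1, total + 1)]

def combine(shares):
    """Lagrange interpolation at x=0 recovers the constant term."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = num * (-xj) % PRIME
                den = den * (xi - xj) % PRIME
        # pow(den, PRIME - 2, PRIME) is the modular inverse (Fermat)
        secret = (secret + yi * num * pow(den, PRIME - 2, PRIME)) % PRIME
    return secret
```

With split(secret, 2, 3) you get the Knight Rider scenario exactly, and any X-of-Y mix (3 of 7 sysadmins, 2 of 4 managers, etc.) works the same way.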
Re: [CODE4LIB] what do you do: API accounts used by library software, that assume an individual is registered
On Mar 4, 2013, at 11:11 AM, Jonathan Rochkind rochk...@jhu.edu wrote: Whether it's Amazon AWS, or Yahoo BOSS, or JournalTOCs, or almost anything else -- there are a variety of API's that library software wants to use, which require registering an account to use. [trimmed] Has anyone found a way to deal with this issue, other than having each API registered to an account belonging to whatever individual staff happened to be dealing with it that day? The government actually has a program for this. http://www.howto.gov/web-content/resources/tools/terms-of-service-agreements If you work for the feds, there are alternate terms of service for various Social Media Providers (it actually covers more than what I think of as social media). So far, they've only really looked at 'free' services. It's a little bit tricky to use them, as you have to find out if your government agency has already agreed to the terms that a company is offering. If they don't have an agreement ... well, it takes some time to get the approval, as it's got to go through the agency's legal counsel. If you're with a state government (and most state universities are considered state government), then there are alternate TOSes available for Twitter, Facebook and YouTube. -Joe
Re: [CODE4LIB] A newbie seeking input/suggestions
On Feb 21, 2013, at 11:20 AM, Paul Butler (pbutler3) wrote: For something like this I would go the hardware route. A walkie-talkie on a charging stand at each service point. The walkie-talkies would always be on and tuned to the same channel. That way the staff person is not tied to the PC itself, they can grab the walkie-talkie and still do what they need to do - like head to the stacks or look for that reserve material. No phone number to remember. This solution could help with other issues, like security and system/network outages. I admit, I've never worked as a librarian, but I did work at a computer help desk during undergrad. We had a policy of trying our best *not* to go into the computer labs, because if you did, you'd get 6+ people who suddenly had questions they wanted to ask ... but couldn't have been bothered to actually go to the office to ask. When I first started, someone who went to go add paper to a printer might not come back for 30+ minutes. (I realize that this policy likely won't work for a library, though) Our follow-up policy was to not answer questions in the labs, and to make people go to the office so they didn't cut in line if there were people queued up. ... so I completely agree about needing something that's not fixed to a single location. If you can make it beep on demand, that's even better. (oops, sorry, I've got to go, I've been summoned back to the desk) If you're going to do something that's computer-based, I'd be inclined to think about some sort of phone app, or even part of a more comprehensive tool to assist in other things that you might need while you're in the stacks trying to help someone. -Joe
Re: [CODE4LIB] A newbie seeking input/suggestions
On Feb 21, 2013, at 2:28 PM, Cab Vinton wrote: This seems like a good application for text messaging -- as long as all librarians have smartphones, which they surely would at Yale :-) The problem is that you'd have to have it dynamically generate the list of who to text based on who's currently on duty. Otherwise, you have it harassing people on their days off, when they're home sick, etc. -Joe
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 18, 2013, at 11:17 AM, John Fereira wrote: I suggested PHP primarily because I find it easy to read and understand and that it's very commonly used. Both Drupal and Wordpress are written in PHP and if we're talking about building web pages there are a lot of sites that use one of those as a CMS. And if you're forced to maintain one of those, then by all means, learn PHP ... but please don't recommend that anyone learn it as a first language. ... and I'd like to say that in my mention of Perl, it was only because there's going to be the workshop ... not that I'd necessarily recommend it as a first language for all people ... I'd look at what they were interested in trying to do, and make a recommendation on what would best help them do what they're interested in. I've looked at both good and bad perl code, some written by some very accomplished software developers, and I still don't like it. I am not personally interested in learning to make web pages (I've been making them for 20 years) and have mostly dabbled in Ruby but suspect that I'll be doing a lot more programming in Ruby (and will be attending the LibDevConX workshop at Stanford next month where I'm sure we'll be discussing Hydra). I'm also somewhat familiar with Python but I just haven't found that many people using it at my institution (where I've worked for the past 15 years) to spend any time learning more about it. If you're going to suggest mainstream languages I'm not sure how you can omit Java (though just mentioning the word seems to scare people). It's *really* easy to omit Java: http://www.recursivity.com/blog/2012/10/28/ides-are-a-language-smell/ ... not to mention all of the security vulnerabilities and memory headaches associated with anything that runs in a VM. You might as well ask why I didn't suggest C or assembler for beginners.
That's not to say that I haven't learned things from programming in those languages (and I've even applied tricks from Fortran and IDL in other languages), but I wouldn't recommend any of those languages to someone who's just learning to program. -Joe (ps. I'm grumpier than usual today, as I've been trying to get HPN-patched OpenSSH to compile under CentOS 6 ... so that it can be called by a java daemon that is called by another C program that dynamically generates python and shell scripts ... and executes them but doesn't always check the exit status ... this is one of those times when I wish some people hadn't learned to program, so they'd just hire someone else to write it)
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 17, 2013, at 11:43 AM, John Fereira wrote: I have been writing software professionally since around 1980 and first encountered perl in the early 1990s or so and have *always* disliked it. Last year I had to work on a project that was mostly developed in perl and it reminded me how much I disliked it. As a utility language, and one that I think is good for beginning programmers (especially for those working in a library) I'd recommend PHP over perl every time. I'll agree that there are a few aspects of Perl that can be confusing, as some functions will change behavior depending on context, and there were a lot of bad code examples out there.* ... but I'd recommend almost any current mainstream language before recommending that someone learn PHP. If you're looking to make web pages, learn Ruby. If you're doing data cleanup, Perl if it's lots of text, Python if it's mostly numbers. I should also mention that the early 1990s would have been Perl 4 ... and unfortunately, most people who learned Perl never learned Perl 5. It's changed a lot over the years. (just like PHP isn't nearly as insecure as it used to be ... and actually supports placeholders so you don't end up with SQL injections) -Joe
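Since the placeholder aside may be unfamiliar to beginners: parameterized queries hand user input to the database driver separately from the SQL text, so the input can never be parsed as SQL. The same idea applies in any language; here's a sketch with Python's built-in sqlite3 (the table and the hostile string are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patrons (name TEXT)")

# a classic injection payload -- harmless when bound as a parameter
evil = "x'; DROP TABLE patrons; --"

# the '?' placeholder sends the value as data, not as part of the SQL text
conn.execute("INSERT INTO patrons (name) VALUES (?)", (evil,))

# the string is stored verbatim, and the table still exists
rows = conn.execute("SELECT name FROM patrons").fetchall()
```

String-concatenating `evil` into the SQL instead is exactly how the old injection bugs happened; placeholders make the safe path the easy path.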
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 15, 2013, at 8:22 AM, Kyle Banerjee wrote: On Thu, Feb 14, 2013 at 7:40 AM, Jason Griffey grif...@gmail.com wrote: The vast, vast, vast, vast majority of people have absolutely no clue how code translates into instructions for the magic glowing screen they look at all day. Even a tiny bit of empowerment in that arena can make huge differences in productivity and communication abilities This is what it boils down to. C4l is dominated by linux based web apps. For people in a typical office setting, the technologies these involve are a lousy place to start learning to program. What most of them need is very different than what is discussed here and it depends heavily on their use case and environment. A bit of VBA, vbs, or some proprietary scripting language that interfaces with an app they use all the time to help with a small problem is a more realistic entry point for most people. However, discussion of such things is practically nonexistent here. Well, as you mention that ... I'm one of the organizers of the DC-Baltimore Perl Workshop : http://dcbpw.org/dcbpw2013/ Last year, we targeted the beginner's track as a sort of 'Perl as a second language', assuming that you already knew the basic concepts of programming (what's a variable, an array, a function, etc.) Would it be worth us aiming for an even lower level of expertise? -Joe ps. Students and the unemployed are free ... $25 before March 1st, $50 after; it will be April 20th at U. Baltimore. We're also in talks with a training company to have either another track of paid training or a separate day (likely Sunday); they wouldn't necessarily be Perl-specific.
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 15, 2013, at 9:00 AM, Lin, Kun wrote: Wow, Interesting. But I am not fun of Perl. Is there other workshop? I don't know of any full workshops in the area, but there are plenty of monthly or semi-monthly meetings of different groups: Python: http://dcpython.org/ R : http://www.meetup.com/R-users-DC/ Groovy: http://www.dcgroovy.org/ Drupal: http://groups.drupal.org/washington-dc-drupalers Hadoop: http://www.meetup.com/Hadoop-DC/ Ruby: http://www.dcrug.org/ ColdFusion: http://www.cfug-md.org/ For those not in this area, see: http://www.pm.org/groups/ http://wiki.python.org/moin/LocalUserGroups http://r-users-group.meetup.com/ http://groups.drupal.org/ http://www.ruby-lang.org/en/community/user-groups/ http://www.haskell.org/haskellwiki/User_groups http://coldfusion.meetup.com/ -Joe
[CODE4LIB] Learning programming data (was: You *are* a coder. So what am I?)
On Feb 15, 2013, at 10:26 AM, Chris Gray wrote: Yes. Exactly. It's like saying you can't go to the doctor or hire a lawyer without a bit of medical or law school. Doctors and lawyers need to be able to explain what they're doing. Another skill that would be useful is understanding databases, by which I do not mean learning SQL. Too many people's idea of working with data is Excel, which provides no structure for data. Type in any data in any box. There is none of the data integrity that a database requires. Here my ideal is Database Design for Mere Mortals which teaches no SQL at all but teaches how to work from data you know and use and arrive at a structure that could easily be put into a database. It's not just data, but data structure that needs to be understood. I've seen plenty of evidence that people who build commercial database-backed software don't understand database structure. I don't know of one specifically for the library community, but there are some courses on the topic for the science community on learning how to use scientific databases, or to develop their own. Two that I know well are Kirk Borne at GMU and Peter Fox and his cohorts at RPI, and there's been an effort from the Federation of Earth Science Information Partners (ESIP) to put together short presentations on various related topics: http://classweb.gmu.edu/kborne/ http://tw.rpi.edu/wiki/Peter_Fox http://wiki.esipfed.org/index.php/Data_Management_Short_Course With the need for expertise in data management, there's also been a push to teach librarians data curation / data management at Syracuse*, UIUC and, recently started, at UNC. http://eslib.ischool.syr.edu/ http://cirss.lis.illinois.edu/CollMeta/dcep.html http://sils.unc.edu/programs/graduate/post-masters-certificates/data-curation And, another conference that I'm helping to organize, the Research Data Access and Preservation (RDAP) Summit, is also being held in Baltimore this year (April 4-5, co-located with the IA Summit)**.
It's been a place for the science, library and archives communities to discuss issues (and solutions) that we're facing; it can be an interesting overview for librarians who are starting to look into the management of data. See the 'Resources' page for links to articles summarizing past years and videos of the talks from last year.*** http://www.asis.org/rdap/ -Joe * disclaimer : I gave an invited talk to one of the Syracuse eScience classes a couple of years back. ** I know, you're thinking, 'what idiot would be involved with organizing two events being held weeks apart?' ... but I'm not ... I'm organizing three, so if you know any craft vendors who might be interested in participating in a street festival in Upper Marlboro, Maryland the day before Mother's Day : http://MarlboroughDay.org/ . (yes, it's the Marlboro of tobacco and horse fame, but we don't have cowboys) *** although, my talk's particularly bad, as I wasn't expecting to actually give it 'til two of my three speakers bowed out at the last minute. But both Peter Fox and Kirk Borne spoke in other sessions, along with lots of other interesting people. ps. and um ... the thing about people making database software who don't understand data structures ... that's also part of my complaint about that project with people writing software that they shouldn't have ... storing journaled data in the same table, and no indexes, so an RDBMS becomes a document store, as there are only two useful accessors (one of which has to be checked to see if it's been deprecated by another record because of the journaling)
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 15, 2013, at 12:27 PM, Kyle Banerjee wrote: On Fri, Feb 15, 2013 at 6:45 AM, Diane Hillmann metadata.ma...@gmail.com wrote: I'm all for people learning to code if they want to and think it will help them. But it isn't the only thing library people need to know, and in fact, the other key skill needed is far rarer: knowledge of library data... ...More useful, I think, is for each side of that skills divide to value the skills of the other, and learn to work together Well put. No amount of technical skill substitutes for understanding what people are actually doing -- it's very easy to write apps that nail any set of specifications and then some but are still totally useless. Even if you never intend to do any programming, it's still useful to know how to code because it will help you know what is feasible, what questions to ask, and how to interpret responses. That doesn't mean you need to know any particular language. It does mean you need to grok the fundamental methodologies and constraints. And the vocabulary (which Alison also mentioned; those who have read Stranger in a Strange Land know that 'grok' was also associated with understanding the language well enough to explain what something was). I've had *way* too many incidents where the problem was simply mis-communication, because one group was using a term that had a specific meaning to the other group, but with some other intended meaning.
I even gave a talk last year on the problem: http://www.igniteshow.com/videos/polysemous-terms-did-everyone-understand-your-message And one of the presenters earlier that night touched on the issue, for scientists talking to politicians and the public: http://www.igniteshow.com/videos/return-jedis-so-what-making-your-science-matter It takes more than just people skills to coordinate between the customers and the software people.* Being able to translate between the problem domain's jargon and the programmers' (possibly via some requirements language, like UML), or even just normalizing metadata between the sub-communities, is probably 25-50% of my work. As a quick example, there's 'data' ... it means something completely different if you're dealing with scientists, programmers, or information scientists. For the scientists, metadata vs. data is a legitimate distinction, as not all of what programmers would consider 'data' is considered to be 'scientific data'. -Joe * http://www.youtube.com/watch?v=mGS2tKQhdhY
Re: [CODE4LIB] editing code4lib livestream - preferred format
On Feb 15, 2013, at 2:30 PM, Matthew Sherman wrote: Not to be snarky, but wouldn't the session on HTML5 video tell you what you need to know? Code it in 3+ different formats, and stack your tags in hope that you've used enough different codecs that the browser actually supports one of them? http://caniuse.com/#feat=video,ogv,webm,mpeg4 ... then fail back to a synchronized slide show / audio: http://caniuse.com/#feat=audio,svg-smil ... then fail back to Flash or some other security risk. (or did they have some other solution?) -Joe On Fri, Feb 15, 2013 at 1:20 PM, Tara Robertson trobert...@langara.bc.ca wrote: Hi, I'm editing the video from code4lib into the session chunks. What format should I export the videos as? Anything else I should be aware of? Thanks, Tara -- Tara Robertson Accessibility Librarian, CILS http://www2.langara.bc.ca/cils/ T 604.323.5254 F 604.323.5954 trobert...@langara.bc.ca Langara. http://www.langara.bc.ca 100 West 49th Avenue, Vancouver, BC, V5Y 2Z6
Re: [CODE4LIB] You *are* a coder. So what am I?
On Feb 14, 2013, at 8:57 AM, Karen Coyle wrote: EVERYONE should know some code. see: http://laboratorium.net/archive/2013/01/16/my_career_as_a_bulk_downloader But it's hard to find the classes that teach coding for everyone. This would be a good thing for c4l'ers to do in their institutions. How to write the short script you need to do something practical. Also, how to throw a few things into a database so you can re-munge it or explore some connections. We need those classes. We need to turn a room in the library into a hacker space for the staff. A learning lab. I just realized that the e-mails from Chris Erdmann a couple of weeks back were *not* on code4lib ... he's running a class on programming for librarians (specifically for processing data), and in a couple of weeks, they're going to have a workshop on interfaces at Harvard. See below. Also, a blog post from last month arguing that all librarians should know how to program: http://altbibl.io/dst4l/109/ -Joe ps. personally, I *hate* the term coder ... one, it makes me think 'code monkey', and what I do is much more involved than that (analyst, architect, sysadmin, dba, programming, debugging, tech support, etc.). If I had a MLS, I might be a 'Systems Librarian', but I have a MIM (Info. Management ... still an LIS degree, but not the same accreditation); it's still easier to tell the library community that's what I am, and it's easier to explain what I do to the scientists by telling them I'm a 'data librarian'.* Two, 'coding' is a relatively minor skill. It's like putting 'typist' as a job title, because you use your keyboard a lot at work. Figuring out what needs to be written/typed/coded is more important than the actual writing aspect of it.
As for titles, over the years, I've had the job title of:
* Programmer/Analyst
* Systems Analyst
* Software Engineer
* UNIX Engineer
* Multimedia Applications Analyst
* Short Guy with Beard (which was only funny because there was a much shorter guy with a more impressive beard)
* Web Developer
* Webmaster (back when it meant the person who administered the service, not the person who made the website)
* System Administrator
... etc. (I've had a lot, as the university I worked at tied titles to pay rate, so every promotion required getting new business cards; right now, I work for a contractor, and the contractor gives me different titles than what NASA has me down as ... the roles that I play and the work that I do matter more than what category someone's lumped me in. If you're going to insist on it, I'd rather it be broad, like 'techie', than just 'coder'.) * and to make it more confusing, my company's title for me is 'Principal Software Engineer', but I don't meet the requirements to be an engineer. I went to an ABET accredited engineering program, but never took the EIT/FE or PE tests. So I try to avoid the 'engineer' titles, too. Begin forwarded message: From: cerdm...@cfa.harvard.edu Date: February 7, 2013 6:57:37 AM EST To: pam...@listserv.nd.edu Subject: [PAMNET] Liberact Workshop and Data Scientist Training for Librarians Reply-To: cerdm...@cfa.harvard.edu Good morning! Just a reminder to those thinking about interactive technologies in libraries, this workshop may be of interest: http://altbibl.io/liberact/ Also, we just started a course called Data Scientist Training for Librarians. Follow along here: http://altbibl.io/dst4l/blog/ Please forward to interested colleagues.
Best regards, Christopher Erdmann, Head Librarian Harvard-Smithsonian Center for Astrophysics Begin forwarded message: From: cerdm...@cfa.harvard.edu Date: January 25, 2013 5:06:58 PM EST To: pam...@listserv.nd.edu Subject: [PAMNET] Liberact Workshop Feb 28 - Mar 1 @ Harvard Reply-To: cerdm...@cfa.harvard.edu To individuals interested in interactive technologies in libraries, this event is for you. The Liberact Workshop aims to bring librarians and developers together to discuss and brainstorm interactive, gesture-based systems for library settings. An array of gesture-based technologies will be demonstrated on the first day, with presentations, brainstorming and discussions taking place on the second day. The workshop will be held at the Radcliffe Institute for Advanced Study at Harvard University in Cambridge, Massachusetts, and takes place February 28 - March 1. Visit the Liberact Workshop website to learn more: http://altbibl.io/liberact To register, visit the Eventbrite page for the workshop: https://liberact.eventbrite.com We hope you will join us! Christopher Erdmann, Martin Schreiner, Lynn Schmelz, Susan Berstler, Paul Worster, Enrique Diaz, Lynn Sayers, Michael Leach
[CODE4LIB] Comparison of JavaScript 'data grids'?
A couple of weeks ago, I posted to Stack Exchange's 'Webmasters' site, asking if there were any good feature comparisons of different JavaScript 'data grid' implementations.* The response has been ... lacking, to put it mildly:** http://webmasters.stackexchange.com/q/42847/22457 I can find all sorts of comparisons of databases, JavaScript frameworks, web browsers, etc ... but I just haven't been able to find anything on tabular data presentation other than the sort of 'top 10 list'-type stuff that doesn't go into detail about why you might select one over another. Is anyone aware of such a comparison, or should I just put something half-assed up on wikipedia in hopes that the different implementations will fill it in? -Joe * ie, the ones that let you play with tabular data ... not the 'grid' stuff that web designers use for layout, nor the 'data grid' stuff that the comp.sci / scientific computing community uses for distributed data storage. ** maybe I should've just asked on Stack Overflow, rather than post to the correct topical place
Re: [CODE4LIB] Comparison of JavaScript 'data grids'?
On Thu, 14 Feb 2013, Cary Gordon wrote: I have used Flexigrid, but there are several choices, and one of the others might better suit your needs. I have informally tiered them by my (based on very little) perception of their popularity. Flexigrid: http://flexigrid.info/ Ingrid: http://reconstrukt.com/ingrid/ jQuery Grid: http://github.com/tonytomov/jqGrid jqGridView: http://plugins.jquery.com/project/jqGridView SlickGrid: http://github.com/mleibman/SlickGrid DataTables: http://www.datatables.net/index jTable: http://www.jtable.org/ Thanks for the effort, but that's the sort of thing that I *don't* need. I'm concerned about what features they have, and which browsers they support. For instance:
* How can you feed data into it? HTML tables (progressive enhancement), XML, JSON, some other API?
* Can it cache data locally, and if so, how? localStorage, webDB, indexedDB?
* How is it licensed? Commercial, BSD, GPLv2, GPLv3, LGPL?
* Does it do sorting / filtering / pagination locally, or does it require a server component?
* Can you extend the datatypes? (to support abnormal sorting)
* Can you specify a function for rendering? (eg, show negative numbers in red, wrapped in parens; display alternate info when null)
* Does it support tree views? dynamic groupings? column re-ordering? automatic table sizing (to fill the view)? shift-clicking ranges of records? alt/ctrl-clicking multiple records? selecting checkboxes (so the table's a form input)? adding new rows? hiding columns? infinite scrolling? editing of cells? adding / deleting records?
* Does it meet Section 508 requirements?
* What's the realistic maximum number of columns, rows displayed, and records total (including those not displayed)?
... and the list goes on ... that's just some of the significant discriminators I've noticed when looking at the different implementations.
-Joe

On Thu, Feb 14, 2013 at 9:48 AM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote:

  [trimmed]

--
Cary Gordon
The Cherry Hill Company
http://chillco.com
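Two of the discriminators mentioned above, extensible datatypes (for abnormal sorting) and a user-supplied rendering function, can be sketched independently of any particular grid. The function names below (`compareVersions`, `renderAmount`) are my own invention for illustration, not any library's actual API:

```javascript
// Sketch of two grid 'hooks' -- hypothetical names, not a real grid's API.

// Custom datatype comparator: sort dotted version strings numerically,
// so '1.10' sorts after '1.9' (a plain string sort gets this wrong).
function compareVersions(a, b) {
  var pa = a.split('.').map(Number);
  var pb = b.split('.').map(Number);
  for (var i = 0; i < Math.max(pa.length, pb.length); i++) {
    var diff = (pa[i] || 0) - (pb[i] || 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Custom cell renderer: accounting style -- negative numbers in red,
// wrapped in parens; alternate text when the value is null.
function renderAmount(value) {
  if (value === null) return '<em>n/a</em>';
  if (value < 0) {
    return '<span style="color:red">(' + Math.abs(value).toFixed(2) + ')</span>';
  }
  return value.toFixed(2);
}
```

A grid that exposes hooks like these lets you sort a 'version' column correctly and hand the renderer to a column definition; a grid that doesn't forces you to pre-format everything server-side, which is exactly the kind of difference a feature comparison should capture.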
Re: [CODE4LIB] You *are* a coder. So what am I?
On Thu, 14 Feb 2013, Jason Griffey wrote:

  On Thu, Feb 14, 2013 at 10:30 AM, Joe Hourcle onei...@grace.nascom.nasa.gov wrote:

    Two, 'coding' is a relatively minor skill. It's like putting 'typist'
    as a job title, because you use your keyboard a lot at work. Figuring
    out what needs to be written/typed/coded is more important than the
    actual writing aspect of it.

  Any skill is minor if you already have it. :-) As others have pointed
  out, learning even a tiny, tiny bit of code is a huge benefit for
  librarians. The vast, vast, vast, vast majority of people have absolutely
  no clue how code translates into instructions for the magic glowing
  screen they look at all day. Even a tiny bit of empowerment in that arena
  can make huge differences in productivity and communication abilities.
  Just understanding the logic behind code means that librarians have a
  better understanding of what falls into the possible and impossible
  categories for doing stuff with a computer, and anything that grounds
  decision making in the possible is AWESOME.

It's true ... and learning lots of different programming languages makes you think about the problem in different ways.*

But equally important is knowing that it's just one tool. It's like the quote: 'when you have a hammer, everything's a nail' ... and more often than people realize, the correct answer is not to write code, or to write less of it.

I remember once, I had inherited a project where they were doing this really complex text parsing, and we'd spend a month or so of man-hours on it each year. My manager quit, so I got to meet with the 'customer'.** I told her about some of the more problematic bits, and some of them were things that she hadn't liked either, so she used that to push back and get things changed upstream. The next year, I was able to shave a week off the turn-around time.
For the last few years, I've been dealing with software that someone wrote when what they *should* have done was survey what was out there, figure out which option met their needs, and, if necessary, adapt it slightly. Instead, they wrote a massive, complex system that was unnecessary. And now we've got to support it, as there isn't the funding to convert it all over to something that has a broad community of support.

(and I guess that's one of my issues with 'coders' ... anyone who writes code should be required to support it, too ... I've done the 'developer', 'sysadmin' and 'helpdesk' roles individually ... and when some developer makes a change that causes you to get 2am wakeup calls because the server crashes every night for two weeks straight,*** they of course can't roll it back, because 'it's in production now, as it passed our testing')

-Joe

ps. I like Stuart's 'Library Systems Specialist' title for those who actually work in libraries.

pps. Yes, I should actually be writing code right now.

* procedural, functional, OO, ... I still haven't wrapped my head around this whole 'noSQL' movement, and I used to manage LDAP servers and *love* hierarchical databases. (I even tried to push for their use in our local registry ... I got shot down by the others on the project)

** we were generating an HTML version of the schedule of classes based on the export generated from QuarkXPress, which was used to typeset the book. The biggest problem was dealing with a department code that had an ampersand in it, and the hack that we did to the lexer to deal with it doubled the time of each run. (and they made enough changes year-to-year that the previous year's script never worked right off the bat, so we'd have to run it, verify, tweak the code, re-run, etc.)

*** they never actually fixed the problem. I put in (coded?) a watchdog script that'd check every 60 sec. whether ColdFusion was down, and if so, start it back up again. So I only had to manually intervene when the config got corrupted. By the time I was fired (long story, unrelated), it was crashing 5-10 times a day.
Re: [CODE4LIB] post your presentation slides before your talk, please!
On Feb 13, 2013, at 2:10 PM, Cynthia Ng wrote:

  Adding it to lanyrd is super easy too!

http://xkcd.com/949/

  On Wed, Feb 13, 2013 at 10:14 AM, James Stuart james.stu...@gmail.com wrote:

    If our entirely awesome presenters can, just drop an email on this
    thread or link into the IRC with your slides right before you go up.
    That way the talks that use code and small text can be followable
    without squinting. If you use dropbox, dropping the share link into
    IRC is super easy. Thanks!

ps. If you're using a tablet and can't see the title text (aka 'alt text') for the image, save this as a bookmark, then select it when you're on an xkcd page. (I hope it'll work on tablets ... I use it for printing out the comics)

javascript:function%20hide(item){item.style.setProperty('display','none')};hide(document.getElementById('bottom'));hide(document.getElementById('topContainer'));Array.prototype.slice.call(document.getElementById('middleContainer').getElementsByTagName('ul'),0).forEach(hide);document.getElementById('ctitle').style.fontSize='3em';img=document.getElementById('comic').getElementsByTagName('img')[0];img.insertAdjacentHTML('afterend','<p%20style=padding:0em%201em%200em%201em>'+comic.getElementsByTagName('img')[0].title+'</p>');document.getElementById('ctitle').style.fontSize='3em';
Re: [CODE4LIB] On-the-fly Closed Captioning
On Feb 6, 2013, at 4:16 PM, John Wynstra wrote:

  I have been asked to find out whether there are software or hardware
  solutions for on-the-fly closed captioning. We currently work with the
  University IT production house on campus to perform this task. I'm not
  involved in any aspect of this at this time, but have been asked to
  investigate. Workflow is like this:

  1) purchase a separate VHS copy of the movie for captioning purposes
     (license issues, I believe)
  2) view the show and write a transcript (probably time consuming)
  3) Campus IT production creates a closed-captioned digital copy using
     the transcript and movie.

  [trimmed]

  Thoughts?

It would never get you full closed captioning. It might get you subtitles, but true closed captioning also includes notes about other audio (background music, dogs barking, singing vs. mumbling vs. normal speech). Some of the more elaborate ones that I've seen will even move the text around on the screen so that it's not blocking important visual items. (that might've been on a DVD, and not standard closed captioning; much of my experience comes from hanging out with folks from Gallaudet during undergrad, and from one of my dad's ex-girlfriends who had only partial hearing, but this was all in the mid to late 1990s)

If you watch most news programs these days, they seem to use some sort of automatic closed captioning, as it's just awful: lots of homophone confusion, random phonetic misspellings, etc.

I would think that something more productive, since you're dealing with existing published media and not stuff that you're generating yourself, would be to see if there exists some cooperative library of closed captioning, and if there isn't, to make one. (so that people can submit time-tagged text to go with a given ISBN for a VHS or DVD)
... and a quick search suggests that one exists; the Alternate Media eXchange Database: http://www.amxdb.net/

It seems there's also an 'OpenSubtitles' player which isn't restricted to educational institutions, but as it's all torrent files and looks like many other torrent trackers, I'm afraid to download them (for fear the video is included).

-Joe
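For anyone wondering what 'time-tagged text' looks like in practice, the SubRip (.srt) format that most subtitle exchanges trade in is just numbered cues with start/end timestamps; the cue text here is invented for illustration:

```
1
00:00:12,500 --> 00:00:15,000
[dog barking in the distance]

2
00:00:16,250 --> 00:00:19,800
I told you we shouldn't have come here.
```

Plain subtitles carry only the dialogue; caption files add the bracketed non-speech cues, and richer caption formats can also carry positioning information so the text doesn't block important visuals.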
Re: [CODE4LIB] conf presenters: a kind request
On Feb 5, 2013, at 9:42 AM, Wilhelmina Randtke wrote:

  If your university or any local professional groups have brown bag
  lunches with presentations, or anything informal and about the same
  amount of time as the conference presentation, then you can ask the
  group if you can do a dry run there.

And if you want to get critiques on the manner of presentation, rather than the content, you might consider checking to see if there's a Toastmasters group in your area: http://www.toastmasters.org/

(there are some dues associated with the club, though ... but for those with a fear of public speaking, they can help you through it)

-Joe
Re: [CODE4LIB] conf presenters: a kind request
On Feb 4, 2013, at 11:25 AM, Bill Dueber wrote:

  [trimmed (and agreed with all of that)]

  As Jonathan said: this is a great, great audience. We're all forgiving,
  we're all interested, we're all eager to learn new things and figure out
  how to apply them to our own situations. We love to hear about your
  successes. We *love* to hear about failures that include a way for us to
  avoid them, and you're going to be well-received no matter what because
  a bunch of people voted to hear you!

I'd actually be interested in people's complaints about bad presentations; I've been keeping notes for years, with the intention of making a presentation on giving better presentations. (but it's much harder than it sounds, as I plan on making all of the mistakes during the presentation)

  On Mon, Feb 4, 2013 at 10:47 AM, Jonathan Rochkind rochk...@jhu.edu wrote:

    We are all very excited about the conference next week, to speak to
    our peers and to hear what our peers have to say! I would like to
    suggest that those presenting be considerate to your audience, and
    actually prepare your talk in advance! [trimmed] Just practice it once
    in advance (even the night before, as a last resort!), and it'll go
    great!

I did one of those 'Ignite' talks this year; because it uses auto-advancing slides, I went over it multiple times. My recommendation is that you try to get various co-workers as guinea pigs. I even subjected one of my neighbors to it, even though he wasn't necessarily part of the intended audience. They gave me a lot of feedback: asking for clarification on some bits, and we realized I could trim down a couple of slides, giving me more time to expand other bits. I still screwed up the presentation, but it would have been much worse if I hadn't practiced.

My local ASIST chapter used to run 'preview' events before the annual meeting, where the local folks presenting at annual were invited to give their talks.
If nothing else, it forced you to have the talk done a couple of weeks early, but more importantly, it gave me a chance to present to an audience similar to the one at the main meeting ... one of my talks bombed hard; it was on standards protocols for scientific data, and I hadn't considered just how badly a talk that's 50% acronyms would go over. I was able to change how I presented the material so it wasn't quite so painful the second time around.

There's only been one time when practicing in advance made for a worse presentation ... and that's because when I finished, PowerPoint asked me if I wanted to save the timings ... whatever you do, do *not* tell it yes. Because then it'll auto-advance your slides, so if you skipped over a slide during practice, it won't let you keep it up during the real talk. (There's a setting to turn off the use of timings ... and the audience laughed when I kept scolding the computer, but it still felt horrible when I was up there)

And it's important that you *must* practice in front of other people. However long you think it's going to take you, or however long it takes when you're talking to yourself, is nothing like talking in front of other people.

...

So, all of that being said, some of the things I've made note of over the years. (the list is incomplete, as I still take notes by hand, and there are more items in the back pages of the various memo books I've had over the years)

* Get there before the session, and test your presentation on the same hardware it's going to be presented from. This is especially important if you're a Mac user presenting from a PC, or vice-versa. Look for odd fonts, images that didn't load, broken videos, abnormal gamma, bad font sizes (which may result in missing text), missing characters, incorrect justification, etc.
* If you're going to be presenting from your own machine, still test it out, to make sure that you have all of the necessary adaptors, that you know what needs to be done to switch the monitor, and that the machine detects the projector at a reasonable size with the gamma adjusted correctly. (and have the presentation loaded in advance; you're wasting enough time switching machines ... start switching while the previous presenter is doing Q&A, and if you lose 5 minutes because of the switch, be prepared to cut your talk short rather than force the following presenters to lose time)

* Have a backup plan, with the presentation stashed on a website whose URL you've memorized, *and* on a USB stick. (the website is safer vs. virus transfer; only use the USB stick if there's no internet) And put the file at the top level of the USB stick, not buried 12 folders deep.

* If they have those clip-on microphones, put it on your lapel on the same side as the screen is to you. (so whenever you turn to look at the screen, it still picks up your voice)

* If you have a stationary mic, you have to actually stay near it, or it doesn't work.

* Hand-held mics suck unless you're used to them, as most of us aren't used to holding our hand up
Re: [CODE4LIB] Linked data [was: Why we need multiple discovery services engine?]
On Feb 4, 2013, at 10:34 AM, Donna Campbell wrote:

  In mentioning pushing to break down silos more, it brings to mind a
  question I've had about linked data. From what I've read thus far, the
  idea of breaking down silos of information seems like a good one in that
  it makes finding information easier, but doesn't it also remove some of
  the markers of finding credible sources? Doesn't it blend accurate
  sources and inaccurate sources?

Yes, yes it does.

The 'intelligence' community has actually been talking about this problem with RDF for years. My understanding is that they use RDF quads (not triples) so that they have an extra parameter to track the source. (it might be that they use something larger than a quad) From what I remember (the conversation was years ago), they have to be able to mark information as suspect (eg, if they find that one of the sources is unreliable, they re-run all of the analysis without that source's contribution, to determine whether they come to the same result).

I don't know enough about the implementation of linked data systems to say if there's some way to filter which sources are considered for input, or if there's any tracking of the RDF triples once they're parsed out.

-Joe
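To make the quad idea concrete, here's a sketch in N-Quads syntax (a real RDF serialization; the URIs below are invented for illustration). The fourth element names the graph, which a system can treat as the source of the statement:

```
<http://example.org/book/42> <http://purl.org/dc/terms/creator> "A. Author" <http://example.org/feeds/sourceA> .
<http://example.org/book/42> <http://purl.org/dc/terms/creator> "B. Author" <http://example.org/feeds/sourceB> .
```

If sourceB is later judged unreliable, you can drop every statement in its graph and re-run the analysis without it, which is the 'mark as suspect and recompute' workflow described above.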
Re: [CODE4LIB] Tablets to help with circulation services
On Jan 23, 2013, at 12:34 PM, Stephen Francoeur wrote:

  We're looking into ways that tablets might be used by library staff
  assisting patrons in a long line at the circ desk. With a tablet, an
  additional staff person could pick folks off the line who might have
  things that can be handled on a properly outfitted tablet. [trimmed]

I have two thoughts on the matter:

1. Trying to take a picture with a tablet is pretty awkward. It might be better on a smaller form-factor device (eg, an iPod Touch or an Android phone w/out a service plan) ... but that might be less useful for other tasks.

2. It might be worthwhile to look at which tasks can be handled by staff without a computer, or without a specially outfitted computer. (eg, can you answer reference questions using the publicly available website?)

-Joe
Re: [CODE4LIB] anti-harassment policy for code4lib?
On Nov 26, 2012, at 7:47 PM, Michael J. Giarlo wrote:

  Hi Kyle,

  IMO, this is less an instrument to keep people playing nice and more an
  instrument to point to in the event that we have to take action against
  an offender.

That was the reasoning for the DCBPW code of conduct ... covering ourselves if we had to eject someone.

And it's not just a diversity thing -- one of the concerns for the DCBPW one was that there had been a guy at some previous Perl workshop who seemed to think that the presentations were personal conversations between him and the speaker, and kept interjecting.

The sad reality is, there seems to be an abnormally high number of people in the technology fields who have gotten as far as they have with little to no understanding of social etiquette. (I've been told that I can cite myself as an example ... if you don't believe me, do a `whois annoying.org`)

-Joe