Re: [CODE4LIB] tar without compression? General comments welcome

2023-09-21 Thread Stuart A. Yeates
Depending on your platform and context, you may want to explicitly use the
--selinux --acls --xattrs
options.
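For example, with GNU tar 1.27 or later (a sketch only; support for
these flags varies by build and platform):

  tar --create --selinux --acls --xattrs --file=backup.tar /path/to/files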

cheers
stuart
--
...let us be heard from red core to black sky


On Fri, 22 Sept 2023 at 07:32, Lolis, John  wrote:

> Long live tar!  *T*ape *AR*chiving may be dead, but the command it had
> wrought shows no sign of aging.  It's part of an on demand backup approach
> for our web site.  A script kicks it off where it tars and bz2 compresses
> the entire document root to a 17GB file and then FTPs it to a test server
> where it gets extracted to the corresponding document root.
>
> I daresay that I don't ever recall running into a problem with the use of
> tar, including hitting a memory or file size limit (of course, omitting
> file system size limits which have nothing to do with a limitation of tar).
>
> John Lolis
> Coordinator of Computer Systems
>
> 100 Martine Avenue
> White Plains, NY  10601
>
> tel: 1.914.422.1497
> fax: 1.914.422.1452
>
> https://whiteplainslibrary.org/
>
> *“I would rather have questions that can’t be answered than answers that
> can’t be questioned.”*
> — Richard Feynman
> <https://en.wikipedia.org/wiki/Richard_Feynman>,
> theoretical physicist and recipient of the Nobel Prize in Physics in 1965
>
>
> On Thu, 21 Sept 2023 at 14:35, Esmé Cowles 
> wrote:
>
> > That seems like a reasonable approach to me. Aren't .docx files
> > directories of XML files in a Zip container? If so, they probably
> wouldn't
> > compress much anyway.
> >
> > I recently had to download large sets of files from two different
> > services, and one of them used Zip and the other used uncompressed Tar.
> The
> > Zip packaging was awful because it needed to be split into a lot of files
> > to avoid having any one file be too large (they were all around 2GB). But the
> > Tar worked much more smoothly, since it could just let me download a
> single
> > 50GB Tar file that worked fine.
> >
> > -Esmé
> >
> > > On Sep 21, 2023, at 2:29 PM, Amy Schuler <
> > 00088c12581f-dmarc-requ...@lists.clir.org> wrote:
> > >
> > > Hi all,
> > > does anyone use the tar command to group files anymore?  I'm looking to
> > > group some .docx files together to archive in a system that does not
> use
> > > folder hierarchies.  I'm thinking of doing this without compression.
> > > Thoughts/comments, or good alternatives?
> > > Thanks!
> > > Amy
> > >
> > > --
> > >
> > > Amy C. Schuler (she/her)
> > > Director, Information Services & Library
> > >
> > > Cary Institute of Ecosystem Studies | 2801 Sharon Turnpike | Millbrook,
> > NY
> > > www.caryinstitute.org
> >
>


Re: [CODE4LIB] [External] Re: [CODE4LIB] What manner of creature is LaTeX?

2023-07-20 Thread Stuart A. Yeates
" a title, a footnote, or a quotation block is defined separately"

Yes, but at any point when creating an instance of a title, footnote or
quotation block you can define arbitrary computations (including redefining
the current style).

Sane people don't, of course, but you can.
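For example, a minimal LaTeX sketch of redefining a style mid-document
(document structure hypothetical):

  \documentclass{article}
  \begin{document}
  \section{Before}
  Body text in the default style.

  % redefine the sectioning command itself, mid-document
  \renewcommand{\section}[1]{\par\noindent\textbf{\Huge #1}\par}

  \section{After}
  The same markup now renders completely differently.
  \end{document}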

cheers
stuart
--
...let us be heard from red core to black sky


On Fri, 21 Jul 2023 at 09:19, McDonald, Stephen 
wrote:

> I'm not sure I agree with that description of LaTeX.  LaTeX is more
> concerned with formatting than style.  LaTeX says, "this is the title",
> "this is a footnote", "this is a quotation block", "this is a chapter",
> "this is a sidebar note".  The actual style that is used for a title, a
> footnote, or a quotation block is defined separately, and layout on a page
> is done in end-processing.  I don't see any way you could separate those
> format blocks from the text to be blocked out.  Am I misunderstanding you?
>
> Steve McDonald
> steve.mcdon...@tufts.edu
>
>
> > -Original Message-
> > From: Code for Libraries  On Behalf Of Stuart
> A.
> > Yeates
> > Sent: Thursday, July 20, 2023 4:49 PM
> > To: CODE4LIB@LISTS.CLIR.ORG
> > Subject: [External] Re: [CODE4LIB] What manner of creature is LaTeX?
> >
> > LaTeX is a little like PostScript and Excel with autorun scripts: formats
> > conceived and developed prior to the software development insight that
> > content and code need to be separate.
> >
> > Nowadays it is accepted that content should be split into text and style,
> > but way back when, there wasn't even a consensus for the split between
> > content and code.
> >
> > cheers
> > stuart
>
>


Re: [CODE4LIB] What manner of creature is LaTeX?

2023-07-20 Thread Stuart A. Yeates
LaTeX is a little like PostScript and Excel with autorun scripts: formats
conceived and developed prior to the software development insight that
content and code need to be separate.

Nowadays it is accepted that content should be split into text and style,
but way back when, there wasn't even a consensus for the split between
content and code.

cheers
stuart
--
...let us be heard from red core to black sky


On Wed, 19 Jul 2023 at 03:33, Dan Johnson <
0100c29c0f99-dmarc-requ...@lists.clir.org> wrote:

> Dear List,
>
> How do you all deal with LaTeX? The LaTeX Project describes it as a
> "high-quality typesetting system," but it *looks* similar to a few
> different software paradigms, and this makes it hard to figure out who on a
> university campus should be supporting it.
>
> For example, one could make the case that it's an advanced, low-level form
> of word processing, which should therefore be supported with training and
> problem solving by central IT, who cover Microsoft Word and Google Docs.
> But it's much more than WYSIWYG word processing, and support for IT would
> be a very heavy lift.
>
> So maybe instead you think of it as a markup system. In that case, perhaps
> it's the library's digital scholarship center that should be providing
> support. Yet, it's not really used for the purposes of scholarly annotation
> and digital presentation of primary sources that TEI is.
>
> Since it's used for creating beautifully-looking articles and books,
> perhaps it's a scholarly communication tool, and hence the schol comm
> division of the library should support it. But the biggest use case may be
> dissertation formatting, in which case perhaps a university's graduate
> school or office of research should take charge (especially if they provide
> a dissertation template).
>
> But then, the software is especially good at formatting mathematical
> notations, and indeed, the vast majority of dissertations submitted with
> LaTeX formatting come from the school of science, so perhaps it is
> scientific computing software. In that case, maybe the college of science's
> departmental IT units should bear the brunt of support.
>
> The final option, it seems to me, is to call it "just one of those very
> helpful things," like regex, that you won't see in any formal or informal
> learning environment, and so you have to figure it out on your own to be in
> the know.
>
> How do you all parcel this out?
>
> Best,
> Dan
>
> --
> *Daniel Johnson, Ph.D.*
> *Interim Co-Director, Navari Family Center for Digital Scholarship *
> *English; Digital Humanities**; and Film, Television, and Theatre *
> *Librarian*
>
> *University of Notre Dame*
> 250C Hesburgh Library
> Notre Dame, IN 46556
> o: 574-631-3457
> e: djohn...@nd.edu
>


Re: [CODE4LIB] Searching Gmail by Star next to email

2022-05-08 Thread Stuart A. Yeates
label:starred appears to be your friend.

See also label:unread, etc.

cheers
stuart
--
...let us be heard from red core to black sky



On Mon, 9 May 2022 at 11:02, charles meyer  wrote:
>
> I've Googled this issue but Gmail Help support page doesn't solve the
> problem.
>
> You receive an approved email. You star it.
>
> Now you want to search for only the messages you've starred.
>
> Clicking on the empty box in the upper left above the messages which lists
> Starred emails does nothing.
>
> How have you searched gmail by only the Starred emails?
>
> Thank you.
>
> Charles.
>
> Charles Meyer
> Charlotte County Public Library
> Port Charlotte, FL


Re: [CODE4LIB] OJS cloud hosting

2022-03-10 Thread Stuart A. Yeates
https://pkpservices.sfu.ca/ are the people who developed OJS. I've not
used them for hosting but had very positive interactions with them for
debugging / fixing / patching stuff. They're also a unit of a Canadian
university rather than a for-profit company.

cheers
stuart
--
...let us be heard from red core to black sky

On Fri, 11 Mar 2022 at 10:19, Fitchett, Deborah
 wrote:
>
> Kia ora koutou,
>
> We can see there are a number of other places offering OJS hosting, but some 
> look highly dodgy and others look like, at the least, a lot more research is
> required.
>
> We're familiar with PKP services - can anyone recommend other SaaS providers 
> for OJS that are worth considering?
>
> Deborah
>
> 
>
> "The contents of this e-mail (including any attachments) may be confidential 
> and/or subject to copyright. Any unauthorised use, distribution, or copying 
> of the contents is expressly prohibited. If you have received this e-mail in 
> error, please advise the sender by return e-mail or telephone and then delete 
> this e-mail together with all attachments from your system."


Re: [CODE4LIB] It’s Official! Fedora 6.0 Production Release is Available

2021-07-08 Thread Stuart A. Yeates
Comms tip:

When writing an announcement you hope is going to be widely
circulated, make it clear, less than three paragraphs in, which of the
two confusingly-similarly-named software projects you're writing
about.

cheers
stuart
--
...let us be heard from red core to black sky

On Fri, 9 Jul 2021 at 02:00, Arran Griffith
<007087c5d6bd-dmarc-requ...@lists.clir.org> wrote:
>
> Dear Community:
>
>
> Today is the day we’ve all been waiting for. We can officially announce that 
> the Fedora 6.0 Production Release is available! After a successful release 
> candidate testing phase we are extremely happy with the state of Fedora 6.0 
> and are excited to get it in the hands of the community.
>
>
> The design and development of Fedora 6.0 was guided by three goals: improve 
> the digital preservation feature set, support migrations from all previous 
> versions of the software, and improve performance and scalability. Drawing on 
> community input at all stages, we are delighted to announce that we got there.
>
>
> Highlights of Fedora 6.0 include:
>
>
>   1.  Oxford Common File Layout (OCFL) persistence
>
>   2.  Robust migration tooling and documentation
>
>   3.  Improved performance and scale
>
>   4.  Built-in simple search
>
>   5.  Minimal API changes
>
>
> A huge effort has gone into making this happen. Here’s a list of the 
> individuals who contributed to development, testing and documentation of this 
> release [1]. We cannot express our gratitude enough to the individuals and 
> institutions who have helped us achieve this milestone. Special thanks to the 
> core development team whose outsized contributions brought Fedora 6 across 
> the finish line.
>
>
>   *   Danny Bernstein, LYRASIS
>
>   *   Ben Pennell, University of North Carolina Chapel Hill
>
>   *   Peter Winkles, University of Wisconsin Madison
>
>   *   Jared Whiklo, University of Manitoba
>
>   *   Andrew Woods, Harvard University Libraries (formerly LYRASIS)
>
>
> https://fedora-repository.atlassian.net/issues/?filter=10008
>
>
> Try it out:
>
> 1) Download the one-click-run:
>
>   *   
> https://github.com/fcrepo/fcrepo/releases/download/fcrepo-6.0.0/fcrepo-webapp-6.0.0-jetty-console.jar
>
> 2) Run in docker:
>
>   *   docker run -p8080:8080 --name=fcrepo6 fcrepo/fcrepo:6.0.0
>
> 3) Download and deploy the WAR file:
>
>   *   
> https://github.com/fcrepo/fcrepo/releases/download/fcrepo-6.0.0/fcrepo-webapp-6.0.0.war
>
>
> The following tools are also available for use:
>
>   *   Migration-utils [2]:  Migrate from Fedora 3.x to Fedora 6.0
>
>   *   Fcrepo-migration-validator [3]:  Validate your Fedora 3.x -> 6.0 
> migration
>
>   *   Fcrepo-upgrade-utils [4]: Migrate from Fedora 4.x or 5.x to Fedora 6.0
>
>   *   Fcrepo-import-export [5]:  Import and export Fedora 4.x or 5.x 
> repositories (for use in conjunction with fcrepo-upgrade-utils)
>
>
> Documentation can be enjoyed here: https://wiki.lyrasis.org/display/FEDORA6x/
>
>
> For migration instructions for Fedora 3.x, 4.x, and 5.x visit this page: 
> https://wiki.lyrasis.org/display/FEDORA6x/Migrate+to+Fedora+6
>
>
> Please use the fedora-tech mailing list [6] or the #fedora-6-testing channel 
> in the Fedora Slack [7] to provide any feedback.
>
>
> Coming up next...Fedora 6.0 Release Party!! Stay tuned for more details on 
> this special event.
>
>
> For now... it’s time to celebrate!
>
>
> Sincerely,
>
> The Fedora Team
>
>
> [1] https://wiki.lyrasis.org/display/FF/Fedora+6.0.0+Release+Notes
>
> [2] https://github.com/fcrepo-exts/migration-utils/releases
>
> [3] https://github.com/fcrepo-exts/fcrepo-migration-validator/releases
>
> [4] https://github.com/fcrepo-exts/fcrepo-upgrade-utils/releases
>
> [5] https://github.com/fcrepo-exts/fcrepo-import-export/releases
>
> [6] https://groups.google.com/g/fedora-tech
>
> [7] https://fedora-project.slack.com


Re: [CODE4LIB] Digitization Outsourcing Survey

2021-06-23 Thread Stuart A. Yeates
How are you going to present answers to questions such as "In your
opinion, what is the most interesting item or collection that your
library has digitized?" anonymously, given that these are highly
likely to be unique?

"15. Top 3: Considering vendors and other partners that you have
worked with, please rank up to three favorite digitization service
providers, where 1 is your highest preference." also seems like a
pretty dubious question...

Given that there are list subscribers in the EU on this list, how are
you complying with the General Data Protection Regulation?

cheers
stuart
--
...let us be heard from red core to black sky

On Thu, 24 Jun 2021 at 02:13, Kelly Barrall  wrote:
>
> Good morning - thank you for providing clarity, Eric.
>
> Your participation in the survey will help us identify key areas to focus on 
> as demand for digital content increases, and improve our ability to connect
> with our library partners.
>
> Kelly Barrall
>
> -Original Message-
> From: Code for Libraries  On Behalf Of Eric Lease 
> Morgan
> Sent: Wednesday, June 23, 2021 10:00 AM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] Digitization Outsourcing Survey
>
> On Jun 23, 2021, at 9:51 AM, Anonymous  wrote:
>
> > https://www.surveymonkey.com/r/5NZ823L
>
>
> To provide some context for this posting, from the introductory survey 
> question:
>
>   Thank you for participating. Your responses will remain
>   anonymous. Results from this survey will be made available
>   only in aggregate. We appreciate your willingness to
>   share your thoughts regarding library digitization.
> 
>
> Furthermore, the author of the survey is Backstage Library Works.
>
> --
> Eric Morgan


Re: [CODE4LIB] facial hair names

2021-03-30 Thread Stuart A. Yeates
DSpace does collect usage in solr, but doesn't expose those stats in
METS when exporting items.

cheers
stuart
--
...let us be heard from red core to black sky

On Wed, 31 Mar 2021 at 15:00, Fitchett, Deborah
 wrote:
>
> DSpace does all its usage stats with SOLR, at least these days, doesn’t it? 
> Ours stopped working at some point and I never understood it so even once we 
> got it working again I think all the old stats are missing.  There is [at 
> least one] standard for usage stats if you want one, though I don’t know how 
> well it’s suited beyond the ebook/journal usage data it was designed for: see 
> https://www.projectcounter.org/
>
> Ah, ethnicity. Author biographical data in general. There are *so many* cases 
> where it’d be really useful to be able to search or narrow results by 
> information about the author rather than by how many pages are in the book’s 
> preface. In an attempt to make progress on a “read a book from every country 
> in the world” challenge I coded 
> https://deborahfitchett.com/toys/aroundtheworld/ using Wikidata but… needs 
> more data.
>
> Deborah
>
> From: Code for Libraries  On Behalf Of Stuart A. 
> Yeates
> Sent: Wednesday, 31 March 2021 10:26 AM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] facial hair names
>
> It was a joke, but with serious intent.
>
> The pandemic has revealed the extent to which the "metadata" we
> consider important is insanely contingent on who's making the call,
> when they're making the call, and what the envisioned use for the data
> and the metadata.
>
> We're currently migrating some collections from one DSpace instance to
> another using METS. METS has the ball-of-mud approach to metadata (you
> can stick _anything_ in there and it's still a ball of mud) but there
> appear to be no namespaces for some metadata, like usage stats. Not
> even any non-standardised namespaces. Even the archivists don't appear
> to have namespaces / standardisation for usage data. Yet most of us
> have layers of management who drool over usage stats.
>
> Why?
>
> Ethnicity (of authors or subject matter) is another metadata field
> where we're lacking and have layers of management who (sh/w)ould love
> this information.
>
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
>
> On Wed, 31 Mar 2021 at 09:39, Fitchett, Deborah
> <deborah.fitch...@lincoln.ac.nz> wrote:
> >
> > I was assuming it was a joke just because I’m not aware of Stuart working
> > on collections where such a taxonomy would be useful (though if I’m wrong I
> > look forward to seeing a demo sometime!) but that doesn’t preclude serious
> > answers too: I can see all sorts of research applications (and various
> > surveillance applications) though admittedly I’m mostly envisaging using
> > the library discovery layer to play a game of Guess Who.
> >
> > Deborah
> >
> > From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of
> > McDonald, Stephen
> > Sent: Wednesday, 31 March 2021 2:09 AM
> > To: CODE4LIB@LISTS.CLIR.ORG
> > Subject: Re: [CODE4LIB] facial hair names
> >
> > Ah, I see. Was the original question intended as a joke? I took the
> > question seriously. There are databases out there which record facial
> > features like this and taxonomies exist for various body features. But I'm
> > not aware of a metadata standard for exchanging such information. What
> > field and taxonomy to use for facial hair is a legitimate question for
> > researchers.
> >
> > Steve McDonald
> > steve.mcdon...@tufts.edu
> >
> >
> > -Original Message-
> > From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of Fitchett, Deborah
> > Sent: Monday, March 29, 2021 7:46 PM
> > To: CODE4LIB@LISTS.CLIR.ORG
> > Subject: Re: [CODE4LIB] facial hair names
> >
> > dc.coverage.facial
> >
> > From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of Stuart A. Yeates
> > Sent: Monday, 29 March 2021 12:06 PM
> > To: CODE4LIB@LISTS.CLIR.ORG
> > Subject: [CODE4LIB] facial hair names
> >
> > The CDC has released a list of facial hair names
> > https://www.cdc.gov/niosh/npptl/pdfs/FacialHairWmask11282017-508.pdf
> >
> > If we want to use these for facetting, which metadata fields should we be
> > using?
> >
> > cheers
> > stuart
> > --
> > ...let us be heard from red core to black sky

Re: [CODE4LIB] facial hair names

2021-03-30 Thread Stuart A. Yeates
It was a joke, but with serious intent.

The pandemic has revealed the extent to which the "metadata" we
consider important is insanely contingent on who's making the call,
when they're making the call, and what the envisioned use is for the data
and the metadata.

We're currently migrating some collections from one DSpace instance to
another using METS. METS has the ball-of-mud approach to metadata (you
can stick _anything_ in there and it's still a ball of mud) but there
appear to be no namespaces for some metadata, like usage stats. Not
even any non-standardised namespaces. Even the archivists don't appear
to have namespaces / standardisation for usage data. Yet most of us
have layers of management who drool over usage stats.

Why?

Ethnicity (of authors or subject matter) is another metadata field
where we're lacking and have layers of management who (sh/w)ould love
this information.

cheers
stuart
--
...let us be heard from red core to black sky

On Wed, 31 Mar 2021 at 09:39, Fitchett, Deborah
 wrote:
>
> I was assuming it was a joke just because I’m not aware of Stuart working on 
> collections where such a taxonomy would be useful (though if I’m wrong I look 
> forward to seeing a demo sometime!) but that doesn’t preclude serious answers 
> too: I can see all sorts of research applications (and various surveillance 
> applications) though admittedly I’m mostly envisaging using the library 
> discovery layer to play a game of Guess Who.
>
> Deborah
>
> From: Code for Libraries  On Behalf Of McDonald, 
> Stephen
> Sent: Wednesday, 31 March 2021 2:09 AM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] facial hair names
>
> Ah, I see. Was the original question intended as a joke? I took the question 
> seriously. There are databases out there which record facial features like 
> this and taxonomies exist for various body features. But I'm not aware of a 
> metadata standard for exchanging such information. What field and taxonomy to 
> use for facial hair is a legitimate question for researchers.
>
> Steve McDonald
> steve.mcdon...@tufts.edu
>
>
> -Original Message-
> From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of
> Fitchett, Deborah
> Sent: Monday, March 29, 2021 7:46 PM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] facial hair names
>
> dc.coverage.facial
>
> From: Code for Libraries <CODE4LIB@LISTS.CLIR.ORG> On Behalf Of Stuart
> A. Yeates
> Sent: Monday, 29 March 2021 12:06 PM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: [CODE4LIB] facial hair names
>
> The CDC has released a list of facial hair names
> https://www.cdc.gov/niosh/npptl/pdfs/FacialHairWmask11282017-508.pdf
>
> If we want to use these for facetting, which metadata fields should we be 
> using?
>
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
>
> 
>
> "The contents of this e-mail (including any attachments) may be confidential 
> and/or subject to copyright. Any unauthorised use, distribution, or copying 
> of the contents is expressly prohibited. If you have received this e-mail in 
> error, please advise the sender by return e-mail or telephone and then delete 
> this e-mail together with all attachments from your system."


[CODE4LIB] facial hair names

2021-03-28 Thread Stuart A. Yeates
The CDC has released a list of facial hair names
https://www.cdc.gov/niosh/npptl/pdfs/FacialHairWmask11282017-508.pdf

If we want to use these for facetting, which metadata fields should we be using?

cheers
stuart
--
...let us be heard from red core to black sky


Re: [CODE4LIB] Moving away from handle service

2021-03-01 Thread Stuart A. Yeates
Does the Apache rewrite map approach work with http://hdl.handle.net/ handles?

We mint handles like: http://hdl.handle.net/10063/1710 and the
handle.net handle server listens on a custom port that helps that get
resolved to https://researcharchive.vuw.ac.nz/handle/10063/1710
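If it does, a minimal sketch of the rewrite-map idea Hardy describes
below (Apache 2.4 syntax; file paths hypothetical) might be:

  RewriteEngine on
  RewriteMap handles "txt:/etc/apache2/handle-map.txt"
  RewriteRule "^/10063/(.*)" "${handles:$1|https://researcharchive.vuw.ac.nz/}" [R=301,L]

where each line of handle-map.txt pairs a handle suffix with its
destination URL, e.g. "1710 https://researcharchive.vuw.ac.nz/handle/10063/1710".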

cheers
stuart
--
...let us be heard from red core to black sky

On Tue, 2 Mar 2021 at 12:47, Hardy Pottinger  wrote:
>
> Hello! I've worked with an institution which moved from a custom repository
> to a DSpace repository (generating handles) and then decided to move
> elsewhere (oh no, our handles need to redirect!) We ended up using an
> Apache rewrite map [1] on the DSpace server's Apache front-end proxy. It's
> a bit fiddly to set up, and you'll get to use your scripting skills to
> generate the map and config. But as soon as it's done, it'll just do its
> job.
>
> 1. https://httpd.apache.org/docs/current/rewrite/rewritemap.html
>
> On Mon, Mar 1, 2021 at 5:38 PM Kun Lin  wrote:
>
> > For handle, if you are using the hdl.handle.net domain, you have to maintain
> > the handle server for it to resolve. But if you are using your own domain
> > name, it's very easy to write a script for URL redirect. This is the reason
> > I insisted on using our domain name instead of hdl.handle.net
> > However, the handle.net organization did tell me before that they are open to
> > the idea of a customized 404 page for organizations that moved away from
> > handle and decommissioned their handle server.
> >
> > Kun Lin
> >
> > -Original Message-
> > From: Code for Libraries  On Behalf Of Peter
> > Murray
> > Sent: Monday, March 1, 2021 2:52 PM
> > To: CODE4LIB@LISTS.CLIR.ORG
> > Subject: Re: [CODE4LIB] Moving away from handle service
> >
> > If it were me, I’d write a script that fills an S3 bucket with HTML files
> > containing
> >
> >   <meta http-equiv="refresh" content="0; url=...">
> >
> > ...and nice body content that points users to the handle destination. Then
> > throw
> > a CloudFront distribution in front of it, change the DNS, and call it a
> > day.
> > You’ll spend a couple bucks a month for storage and bandwidth, but the
> > ultimate control stays with the institution.
> >
> > Peter
> > On Mar 1, 2021, 5:19 PM -0500, Stuart A. Yeates ,
> > wrote:
> > > My institution has used handles for more than a decade and would like
> > > to stop (non-standard ports, special server, etc), particularly as
> > > we're now committed to DOIs.
> > >
> > > However, we don't want to break URLs.
> > >
> > > Does anyone know of a third party service that we can hand a list of
> > > handle-to-URL-mappings and walk away? Preferably with a single upfront
> > > payment rather than on-going cost.
> > >
> > > cheers
> > > stuart
> > > --
> > > ...let us be heard from red core to black sky
> >


[CODE4LIB] Moving away from handle service

2021-03-01 Thread Stuart A. Yeates
My institution has used handles for more than a decade and would like
to stop (non-standard ports, special server, etc), particularly as
we're now committed to DOIs.

However, we don't want to break URLs.

Does anyone know of a third party service that we can hand a list of
handle-to-URL-mappings and walk away? Preferably with a single upfront
payment rather than on-going cost.

cheers
stuart
--
...let us be heard from red core to black sky


Re: [CODE4LIB] Web app to search XML files

2020-12-17 Thread Stuart A. Yeates
There's XML and XML.

I suggest that you enquire about the exact format that you're going to
be receiving and ask around for systems that support it out of the
box.

cheers
stuart


--
...let us be heard from red core to black sky

On Fri, 18 Dec 2020 at 07:37, Pennington, Buddy D.  wrote:
>
> Hi all,
>
> We're purchasing an XML dataset for the historical NY Times and I am curious 
> about any suggestions to quickly build a web app to search and display those 
> records for end users.
>
> Buddy Pennington
> Head of Electronic Resources & Systems
> University Libraries
> University of Missouri - Kansas City
> (he/him/his)


Re: [CODE4LIB] to stop word, or not to stop word. that is the question

2020-07-10 Thread Stuart A. Yeates
Sounds like a classic use for the tf–idf measure.

For those with no background in information retrieval, see
https://en.wikipedia.org/wiki/Tf%E2%80%93idf
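A minimal sketch with scikit-learn (file names hypothetical): the
max_df cutoff quietly drops corpus-wide terms like "covid-19" instead
of hand-curating a stopword list:

  from sklearn.feature_extraction.text import TfidfVectorizer
  import numpy as np

  docs = [open(p).read() for p in ("article1.txt", "article2.txt")]
  # ignore any term appearing in more than 85% of the documents
  vectorizer = TfidfVectorizer(stop_words="english", max_df=0.85)
  matrix = vectorizer.fit_transform(docs)

  # highest-weighted terms of the first document
  terms = np.array(vectorizer.get_feature_names_out())
  weights = matrix[0].toarray().ravel()
  print(terms[weights.argsort()[::-1][:10]])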

cheers
stuart

--
...let us be heard from red core to black sky

On Sat, 11 Jul 2020 at 06:58, Eric Lease Morgan  wrote:
>
> To stop word, or not to stop word? That is the question.
>
> Seriously, I am working with a team of people to index and analyze a set of 
> 65,000 - 100,000 full text scientific journal articles, and all of the 
> articles are on the topic of COVID-19. [1] We have indexed the data set and 
> we have created subsets of the data, affectionately called "study carrels". 
> Each study carrel is characterized with a short name and a few 
> bibliographic-like features. [2] Within each study carrel are a number of 
> different analyses, such as ngram frequencies, parts-of-speech enumerations, 
> and topic modeling.
>
> Each article in each carrel also has a set of "keywords" extracted from it. 
> These keywords are computed, and for all intents & purposes, the computation 
> is pretty good. For example, see a set of keywords from a particular carrel. 
> [3] Unfortunately, many of the study carrels have very very very similar sets 
> of keywords. Again, if you peruse the set of all the carrels [2] you see the 
> preponderance of keywords such as "cell", "covid-19", "SARS", and "patient". 
> These words happen so frequently that they become (almost) meaningless.
>
> My questions to y'all are, "When and where should I add something like 
> 'cell', or better yet 'covid-19', to my list of stopwords?"
>
>
> [1] data set of articles - https://www.semanticscholar.org/cord19
> [2] study carrels - https://cord.distantreader.org/carrels/INDEX.HTM
> [3] example keywords - 
> https://cord.distantreader.org/carrels/kaggle-risk-factors/index.htm#keywords
>
> --
> Eric Morgan


Re: [CODE4LIB] Robert Sandusky

2020-04-13 Thread Stuart A. Yeates
After allowing time for obits to appear, I would encourage you to
consider writing a wikipedia article about Sandusky. Most late career
academics meet the criteria. If you need help with the mechanics of
writing I'm more than willing to help, but start with compling
secondary sources.

cheers
stuart
--
...let us be heard from red core to black sky

On Tue, 14 Apr 2020 at 04:04, Goben, Abigail H  wrote:
>
> I am sorry to share that Robert Sandusky, the AUL for IT at the University of 
> Illinois at Chicago, passed away last Friday from cancer.
>
> I've included some of the note from my Dean, Mary Case, below.  We do not yet 
> have a time for a memorial service but if you'd like that information, please 
> let me know.
> Abigail
>
>
>
>
>
> Bob was the UIC Library's Associate University Librarian and Associate Dean 
> for Information Technology (AULIT). He joined UIC in this capacity in October 
> 2007. Bob was an Associate Professor having been granted tenure in 2012. 
> Prior to joining UIC, Bob spent from 2005-2007 as an Assistant Professor at 
> the University of Tennessee, Knoxville, in the School of Information 
> Sciences. He received his PhD in Library and Information Science from the 
> University of Illinois at Urbana-Champaign in 2005. Prior to his career in 
> academia, Bob spent a decade as Systems Officer of the Fednet National 
> Network Management Control Center at the Federal Reserve Bank in Chicago and 
> was a Senior Project Manager, Information Architect and Senior Web Developer 
> at ComPsych Corporation in Chicago. Bob had a BA in English and an MS in 
> Computer Science from Northern Illinois University.
>
>
>
> During his 12 years of service as AULIT, Bob built a strong team and 
> introduced project management processes for tracking and evaluating systems 
> initiatives. He led one and oversaw a second revision of our website, 
> supported the team working on the Explore Chicago Collections portal, guided 
> the implementation and upgrading of several systems that support Special 
> Collections, Digital Scholarship/Scholarly Communications, and Digital 
> Programs & Services to name just a few. Bob coordinated the internal 
> Library/IT assessment process for many years and met with students as we 
> discussed priorities and projects. In this past year, Bob also engaged deeply 
> in the Library's diversity, equity and inclusion program serving as a member 
> of the Design Team.
>
>
>
> Bob was an active member of the University community. He served on the 
> Faculty Senate and the Senate Support Services Committee which he chaired. He 
> was also appointed to the Research Committee of the IT Governance Council and 
> served as Chair of the Institutional Stewardship of Research Data 
> Subcommittee.
>
>
>
> Bob was engaged nationally in the NSF funded DataONE: Observation Network for 
> Earth project. He was a co-investigator and member of the Core 
> Cyberinfrastructure Team (2007-2014), a member of the Usability and 
> Assessment Working Group, and was co-chair of the DataONE Users Group.
>
>
>
> Bob was an active scholar, producing many peer-reviewed journal articles, 
> books chapters, workshops and conference presentations. His research focused 
> on data management and reuse, sociotechnical systems, and most recently on 
> the application of notions of computational provenance to libraries, archives 
> and museums.
>
>
>
> As a member of the senior team in the Library, Bob asked hard questions and 
> provided wise advice. He was always a gentle-man and fought cancer with 
> dignity and drive. Our hearts ache for this loss for his family and for our 
> community. We will miss him deeply.
>
>
>
> His family will be planning a memorial service to celebrate Bob's life for 
> later in the year, when we will be able to gather together again.


Re: [CODE4LIB] open journal systems and oai

2020-03-12 Thread Stuart A. Yeates
Some years ago I played around with creating an organic oai-pmh
endpoint locator. The results of my work are at
https://github.com/stuartyeates/oai-found

Unfortunately these have been poisoned by an OJS bug and list multiple
synonymous endpoints with identical data.
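A minimal sketch of spotting synonymous endpoints (assuming the Sickle
OAI-PMH client; URLs hypothetical) by comparing Identify responses:

  from sickle import Sickle

  endpoints = [
      "https://journal.example.org/index.php/a/oai",
      "https://journal.example.org/index.php/b/oai",
  ]
  seen = {}
  for url in endpoints:
      identify = Sickle(url).Identify()
      key = (identify.repositoryName, identify.earliestDatestamp)
      if key in seen:
          print(url, "looks synonymous with", seen[key])
      else:
          seen[key] = url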


--
...let us be heard from red core to black sky

On Fri, 13 Mar 2020 at 09:43, Eric Lease Morgan  wrote:
>
> On Mar 12, 2020, at 3:44 PM, Stuart A. Yeates  wrote:
>
> >> Where can I find a list of Open Journal System (OJS) journals and their 
> >> associated OAI-PMH data repository root URLs? I have all but finished 
> >> successfully using OAI-PMH to harvest and then "read" the whole of ITAL, 
> >> and I would like to apply the same process to other open access journals 
> >> supported by OJS.
> >
> > I believe that many years ago there was a comprehensive list published
> > by the PKP (who produce OJS) based on the homing signals OJS uses to
> > check for updates. That's gone now (privacy reasons?), DOAJ is a great
> > place to start, but many journals don't qualify for DOAJ.
>
>
> Yes, this process is not simplistic. In the end I found a CSV file from the 
> DOAJ. I then filtered the file for titles both in English and created
> using OJS. I then reverse engineered the resulting URLs. In the end, few of the
> titles are really in English.
>
> I suppose you could say I am doing both collection development as well as 
> acquisitions. :)
>
> --
> Eric Morgan


Re: [CODE4LIB] open journal systems and oai

2020-03-12 Thread Stuart A. Yeates
I believe that many years ago there was a comprehensive list published
by the PKP (who produce OJS) based on the homing signals OJS uses to
check for updates. That's gone now (privacy reasons?), DOAJ is a great
place to start, but many journals don't qualify for DOAJ.

cheers
stuart


--
...let us be heard from red core to black sky

On Fri, 13 Mar 2020 at 07:57, Eric Lease Morgan  wrote:
>
> Where can I find a list of Open Journal System (OJS) journals and their 
> associated OAI-PMH data repository root URLs? I have all but finished 
> successfully using OAI-PMH to harvest and then "read" the whole of ITAL, and 
> I would like to apply the same process to other open access journals 
> supported by OJS.
>
> --
> Eric Morgan
> University of Notre Dame


Re: [CODE4LIB] WARC --> static HTML?

2020-03-04 Thread Stuart A. Yeates
WARC is not an access format.

WARC is entirely optimised for crawling and the gold standard for archiving
because it's close to the 'on the wire' web experience.

BUT

There is no file index: you access every file using a linear search from
the start of the archive.
There is no guarantee that related files are stored together: an HTML page
and its CSS, images and embedded streaming video.
There is no guarantee that related pages are stored together.

If you're using WARC for access, you need something that overcomes these
limitations, and the obvious choice is CDX indexes. For an explanation of
how CDX files index WARC files, see the diagram on
https://support.archive-it.org/hc/en-us/articles/115001790023-Access-Archive-It-s-Wayback-index-with-the-CDX-C-API

---

Alternatively, use wget with the --convert-links option over your WARC /
pywb solution. This should be faster than 40 mins per page on average,
since CSS and branding images should only have to be retrieved once
(assuming sane site design).
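And if the goal really is the literal "unpack a WARC into a directory
of static files", a minimal sketch with the warcio library (filename
hypothetical; links would still need rewriting, hence the wget
suggestion above):

  import os
  from urllib.parse import urlparse
  from warcio.archiveiterator import ArchiveIterator

  with open("site.warc.gz", "rb") as stream:
      for record in ArchiveIterator(stream):
          if record.rec_type != "response":
              continue
          url = record.rec_headers.get_header("WARC-Target-URI")
          path = urlparse(url).path.lstrip("/") or "index.html"
          if path.endswith("/"):
              path += "index.html"
          dest = os.path.join("static-site", path)
          os.makedirs(os.path.dirname(dest), exist_ok=True)
          with open(dest, "wb") as out:
              out.write(record.content_stream().read())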

cheers
stuart
--
...let us be heard from red core to black sky


On Thu, 5 Mar 2020 at 04:37, Demian Katz  wrote:

> Hello, everyone –
>
> I’ve been struggling with a use case that feels like it can’t be unique to
> my situation. Wondering if anyone else has solved this!
>
> We’ve decommissioned an old dynamic site, and we still want to make the
> content available in a static form. It was a large and complex site with a
> lot of pages, and after trying a variety of solutions, we ended up
> harvesting it all into a WARC file. This is great for archival purposes,
> but we’re struggling with presentation.
>
> The problem with serving content from a WARC is that it seems to be
> unbearably slow in every solution we try. (And when I say unbearably, I
> mean “40 minutes to load one page using pywb” – not kidding).
>
> I assume that this slowness has to do with dynamically navigating around
> in a multi-gigabyte file to retrieve things… but really all we want to do
> is serve up static content.
>
> Is there some tool that can simply unpack a WARC into a directory of
> static files that can be navigated quickly? It seems like this should be
> possible, but I’m coming up empty in searching.
>
> And just to be clear: I understand that unpacking a WARC probably won’t
> retain all of the richness of detail that dynamic retrieval from the WARC
> can provide, and I certainly don’t plan to throw away the WARC… but for
> people who just want to quickly navigate content from the most
> recently-crawled version of the old site, I want a solution that will
> perform acceptably, and I haven’t found it yet.
>
> Thanks for any and all advice! 
>
> - Demian
>


Re: [CODE4LIB] Mobile access to IP-authenticated resources

2020-02-20 Thread Stuart A. Yeates
Depending on what level of control you have, you could potentially do
some DNS black magic: have a VPN that sent all client requests for
*.ebscohost.com to server.yourdomain.edu which then used httpd
redirects (or similar) to rewrite those to include the proxy prefix.
You'd need to include an appropriate certificate in browser config if
you didn't want the HTTPS handshake to fail.

cheers
stuart
--
...let us be heard from red core to black sky

On Fri, 21 Feb 2020 at 09:03, Bob Dougherty  wrote:
>
> Andreas, I think I misunderstood something you wrote: "our solution is to 
> force all connections via the VPN through the proxy."
>
> Can you "force" a connection through the proxy when it didn't start with a 
> proxied web page? Such as the scenarios I mentioned (a link to an article 
> found in an internet search or an email)?
>
> Thanks,
>
> Bob


[CODE4LIB] https://portal.issn.org/

2020-01-23 Thread Stuart A. Yeates
My understanding is that the database at https://portal.issn.org/
contains the official information for all registered ISSNs and that
while it's free to access on a record-by-record basis, they charge for
API access and don't permit the redistribution of the results (and go
to great lengths to talk about database rights they own).

I'm in a jurisdiction (New Zealand) which I don't believe has database
rights, so I'm thinking about the usefulness of compiling this into a
CSV, probably using the RDF as the base.
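A minimal sketch of that compilation step (assuming rdflib and a
record saved locally from the portal; filename hypothetical):

  import csv
  from rdflib import Graph

  g = Graph()
  g.parse("issn-record.rdf")   # one record fetched from portal.issn.org
  with open("issn.csv", "w", newline="") as out:
      writer = csv.writer(out)
      writer.writerow(["subject", "predicate", "object"])
      for s, p, o in g:
          writer.writerow([s, p, o])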

Unless there is an alternative source for this information?

cheers
stuart

--
...let us be heard from red core to black sky


Re: [CODE4LIB] ode to fred kilgour

2020-01-06 Thread Stuart A. Yeates
If you know the original photographer and are in contact with them,
would it be possible to get them to upload a copy to
https://commons.wikimedia.org/wiki/Special:UploadWizard for use in
https://en.wikipedia.org/wiki/Fred_Kilgour ?

Ideally get the original uploaded for provenance purposes, the wiki
gnomes can crop it (ping me if necessary).

You can also add links to personal anecdotes such as this email to the
'External links' or 'Further reading' sections of wikipedia
biographies.

cheers
stuart
--
...let us be heard from red core to black sky

On Tue, 7 Jan 2020 at 08:10, Eric Lease Morgan  wrote:
>
>
> Fred Kilgour was a very influential man in the world of modern librarianship. 
> As you may or may not know, he founded OCLC, and I had a few brushes with him.
>
> My first brush with Mr. Kilgour occurred in 1985 or so when I was the lending
> side of Interlibrary Loan at Drexel University where I was going to library 
> school. Everyday I would tickle the OCLC M300 terminal, and it would spit out 
> lending requests which I was expected to fill. Through the process I had to 
> learn the various ways to search OCLC, specifically with codes such as 4,4 or 
> 4,2,2,1 etc. These codes were title or author searches. Four characters of 
> the first word of the title, two characters of the second word, two more 
> characters of the third word, and a single character of the fourth word. At
> the time the whole thing was smart, but later it became passé, but in 
> retrospect (and given a knowledge of data structures of the time), it was 
> really smart. I used a similar scheme to create my first catalog.
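(A minimal sketch of that derived-key scheme in Python, simplified:
the real keys also skipped punctuation and leading articles.)

  def search_key(title, pattern=(4, 2, 2, 1)):
      words = title.lower().split()
      return ",".join(w[:n] for w, n in zip(words, pattern))

  print(search_key("Gone with the wind"))   # -> gone,wi,th,w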
>
> A bit more than a decade later I was recruited to teach Internet 101 to sets 
> of library school graduate students at UNC Chapel-Hill. During my second 
> stint at the job I shared an office with Mr. Kilgour. He had been hired as an 
> emeritus professor of the library school. His job was merely to be there. I 
> couldn't wait! I was going to discuss with him the ideas of 4,4 and 4,2,2,1. 
> I was going to discuss the role of OCLC in librarianship, and more 
> specifically discuss what it meant to be a not-for-profit organization.
>
> Well, my perspective changed. Mr. Kilgour (who was 83 years old at the time) 
> was writing a book on the history of the book, and at the same time he was 
> writing an article using bibliometrics as the foundation of his 
> investigations. We talked about MARC. "It is really stupid", he said. I said, 
> "Yes, but..." "No", he replied, "It is really stupid." He told me stories 
> about the beginnings of OCLC. One time a terminal was "broken". He entered 
> the room and noticed that it was unplugged from the power source. He said, 
> "It needs to be plugged in", and the librarian said, "Well, then plug it in. 
> I don't do such things." He plugged it in and everything worked just fine. As 
> I got to know him, I couldn't give him any grief.
>
> A few years after that, for the first time, I visited the OCLC Home Planet in 
> Dublin (Ohio). I wanted to touch the computer which housed the bibliographic 
> record of my one-and-only formally published book. Ralph LeVan (a programmer 
> at OCLC who recently retired) and I went to the space where the computers 
> were located. It was a huge space, but mostly empty. Since Mr. Kilgour's 
> time, computers had gotten a lot smaller and generated less heat than their 
> predecessors. In the past, the big bad computers were used to heat the OCLC 
> building, but times had changed (or rather evolved).
>
> Over the recent holiday, as I was doing my annual data archiving thing, I 
> uncovered a photograph of Mr. Kilgour and myself, circa 1997. When I came to 
> work today I took my autographed copy of his complete works off the shelf 
> next to my desk, and I leafed through it pages. "Where are the articles 
> describing his idea of 'personal catalogs'?" Mr. Kilgour had a direct 
> influence on my life as a librarian. I know he has had an effect on your 
> professional life as well.
>
> Mr. Kilgour died in 2006 at the age of 92.
>
> --
> Eric Lease Morgan, Librarian
> University of Notre Dame
>


Re: [CODE4LIB] Systematic / Systemic bias in bibliometics

2019-07-17 Thread Stuart A. Yeates
Thank you.

cheers
stuart
--
...let us be heard from red core to black sky

On Thu, 18 Jul 2019 at 12:10, Marijane White  wrote:
>
> The work of Cassidy Sugimoto and Vincent Larivière comes to mind, as well as 
> some of the work done at the Centre for Science and Technology Studies (CWTS) 
> in the Netherlands.
>
> Some examples:
> https://www.semanticscholar.org/paper/Bibliometrics%3A-global-gender-disparities-in-Larivi%C3%A8re-Ni/73068e44373215a447d0a646446e73b94550610c
> https://www.cwts.nl/blog?article=n-q2z294=the-end-of-gender-disparities-in-science-if-only-it-were-true
> https://www.cwts.nl/blog?article=n-r2w2c4=indicators-for-social-good
>
>
> Marijane White, M.S.L.I.S.
> Data Librarian, Assistant Professor
> Oregon Health & Science University Library
>
> Phone: 503.494.3484
> Email: whi...@ohsu.edu
> ORCiD: https://orcid.org/-0001-5059-4132
>
>
> On 2019/07/17, 1:30 PM, "Code for Libraries on behalf of Stuart A. Yeates" 
>  wrote:
>
> I'm looking for work or discussions on systematic bias in
> bibliometrics or appropriate fora where such discussions are likely to
> happen. Even critical analysis of the founding assumptions of
> bibliometrics as a field would be a good place to start
>
> I have some ideas but they seem obvious and I'm afraid I'm missing a
> community of practice because what I think of as a widget they know as
> a whatzit.
>
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
>
>


[CODE4LIB] Systematic / Systemic bias in bibliometics

2019-07-17 Thread Stuart A. Yeates
I'm looking for work or discussions on systematic bias in
bibliometrics or appropriate fora where such discussions are likely to
happen. Even critical analysis of the founding assumptions of
bibliometrics as a field would be a good place to start

I have some ideas but they seem obvious and I'm afraid I'm missing a
community of practice because what I think of as a widget they know as
a whatzit.

cheers
stuart
--
...let us be heard from red core to black sky


Re: [CODE4LIB] From the Community Support Squad wrt "Note [admiistratativia]"

2019-07-14 Thread Stuart A. Yeates
I was personally ambivalent about anonymity on the mailing list.

However, the fact that it appears to be predominantly men arguing for
banning anonymity and women arguing for allowing it is a tell that us
men folk might have our lower appendages in our orifices.

cheers
stuart
--
...let us be heard from red core to black sky

On Mon, 15 Jul 2019 at 14:12, Edward Almasy
<000e5cccdc3a-dmarc-requ...@lists.clir.org> wrote:
>
> On Jul 14, 2019, at 8:36pm, Eric Lease Morgan  wrote:
> > IMHO, the Code4Lib mailing list should not be akin to an anonymous chat 
> > room where anyone can come in and say whatever they desire under the cloak 
> > of anonymity.
> > One must be accountable for what they say, and accountability is increased 
> > by knowledge of the source. It is similar to information literacy and 
> > citing one's references so the validity of an argument can be substantiated.
>
> There is also the issue of bias.  Knowing, for example, that someone is from 
> an institution or organization that has invested heavily in a particular 
> platform or toolset can help put their views or advocacy into context.
>
> I think allowing anonymous or pseudonymous posts in this context decreases 
> the integrity and value of the discourse.
>
> Ed
>
>
> --
> Edward Almasy 
> Director  •  Internet Scout Research Group
> Computer Sciences Dept  •  U of Wisconsin - Madison
> 1210 W Dayton St  •  Madison WI 53706  •  3HCV+J6
> 608-262-6606 (voice)  •  608-265-9296 (fax)


Re: [CODE4LIB] Note [administratativia]

2019-06-30 Thread Stuart A. Yeates
In the spirit of fixing actual things that need to be fixed, I'd like
to point out the really positive news out of the ALA about them
recognising the issues with Dewey
https://www.theguardian.com/books/2019/jun/27/melvil-deweys-name-stripped-from-top-librarian-award
and AAAS about them recognising issues with current fellows
https://www.the-scientist.com/news-opinion/petition-asks-aaas-to-remove-fellows-with-sexual-harassment-records-64488
Both of these are the result of concerted campaigns over several years
by groups of named individuals with clear, concrete goals in mind and
explicit strategies of how to get there.

Without a doubt things need to improve in libraries, but I don't see
petitions such as this an effective method of improving them.

If there is anyone wanting to discuss strategies, my emails are open.
Publicising locally the two recent advances mentioned above is going
to be on my to-do list.

cheers
stuart
--
...let us be heard from red core to black sky

On Fri, 28 Jun 2019 at 23:12, Eric Lease Morgan  wrote:
>
> On Jun 27, 2019, at 11:13 PM, S B 
> <0019b805e526-dmarc-requ...@lists.clir.org> wrote:
>
> > Dear Library Community,
> >
> > I have always known the library community to be forward thinking and 
> > helpful.  Through this journey of writing about sexual harassment in 
> > libraries, I have become concerned about the lack of sexual harassment 
> > awareness trainings and open and positive dialogue on the subject.  Yes, 
> > articles have been written.  Yes, some individual libraries as well as some 
> > systems and associations are doing training and having conversation.  That 
> > is great and should not be overlooked.  In many cases, though, it is the 
> > exact opposite.
> >
> > In a profession that is so progressive, why does sexual harassment 
> > awareness seem to take a back seat?
> >
> > I created a petition about sexual harassment awareness in libraries.  Would 
> > you consider signing and sharing? You don’t have to be in the library 
> > profession to sign.
> >
> > Thank you for your consideration of this request.  Many Thanks!
> >
> > Petition Link:
> >
> > http://chng.it/PzRB4BQp5f
>
>
> If the sorts of notes above persist, then I will look into their whereabouts 
> & see if we can get rid of them. Now back to our regular programming. --Eric 
> Morgan, List Owner


Re: [CODE4LIB] code4lib journal [reader]

2019-06-13 Thread Stuart A. Yeates
> > (d) Has thought been put into making them data archive-friendly?
>
> I don't understand. In this case, what does "archive-friendly" mean?

Well there are two options here:

(a) pre-harvest archiving, maybe you push URLs into archive.org (or
similar) as you harvest them, giving you reproducibility; and

(b) post-harvest archiving probably implies changing the format of the
resulting file to a standard. Possible standards include ePub, WARC or
METS, depending on your vision for the project. Alternatively, work
with a research data archive to include some basic metadata in the
current zip in a format they can understand and unpack on ingest.

Oh, and the 'About your study carrel' needs a colophon with links to
the software, version, etc.


cheers
stuart

--
...let us be heard from red core to black sky

On Fri, 14 Jun 2019 at 04:03, Eric Lease Morgan  wrote:
>
> On Jun 12, 2019, at 8:40 PM, Stuart A. Yeates  wrote:
>
> >> The Distant Reader [0] harvests an arbitrary number of user-supplied files 
> >> or links to files, transforms them into plain text files, and performs 
> >> numerous natural language processes against them. The result is a large 
> >> set of indexes that can be used to "read" the given corpus. I have made 
> >> available the about pages of a number of such indexes:
> >>
> >>  * Code4Lib Journal - http://dh.crc.nd.edu/tmp/code4lib-journal/about.html
> >> o 1,234,348 words; 303 documents
> >> o all articles from a journal named Code4Lib Journal
> >
> > Taking a look at distant reader (which I don't believe I've looked at 
> > before):
> >
> > (a) It would be great to sanity-check the corpus by running language
> > identification on each of the files
>
> Stuart, thank you for the feedback. As of right now, the Distant Reader is 
> only designed to process English language materials. Since it (I) rely on a 
> Python module called spaCy to do the part-of-speech and named-entity 
> extraction, I ought to be able to handle other Romance languages without too 
> much difficulty. [1]
>
>
> > (b) There are a whole flotilla of technical identifiers that could
> > usefully be extracted from the text files (DOIs, ISBNs, ISSNs, etc)
>
> This is a fun idea, and I will investigate it further.
>
>
> > (c) A little webification of the texts would go a long way
>
> Hmmm... The plain text versions of the documents are necessary for the 
> natural language processing, but instead of returning links to the plain text 
> I could return links to the cached versions of the texts which are usually 
> formatted in HTML or as PDF. Thus, a part of the reading process would be 
> made easier.
>
>
> > (d) Has thought been put into making them data archive-friendly?
>
> I don't understand. In this case, what does "archive-friendly" mean?
>
>
> For a good time, I created a new data set -- 460 love stories (238 million 
> words; 460 documents; 5.94 uncompressed GB)
>
>   * about page - 
> http://dh.crc.nd.edu/sandbox/reader/hackaton/love-stories/about.html
>   * data set ("study carrel") - 
> http://dh.crc.nd.edu/sandbox/reader/hackaton/love-stories.zip
>
> Again, thank you for the feedback.
>
>
> [0] Distant Reader - https://distantreader.org
> [1] spaCy - https://spacy.io/models
>
> --
> Eric Lease Morgan
> Digital Initiatives Librarian, Navari Family Center for Digital Scholarship
> Hesburgh Libraries
>
> University of Notre Dame
> 250E Hesburgh Library
> Notre Dame, IN 46556
> o: 574-631-8604
> e: emor...@nd.edu
> w: cds.library.nd.edu


Re: [CODE4LIB] code4lib journal

2019-06-12 Thread Stuart A. Yeates
Taking a look at distant reader (which I don't believe I've looked at before):

(a) It would be great to sanity-check the corpus by running language
identification on each of the files

(b) There are a whole flotilla of technical identifiers that could
usefully be extracted from the text files (DOIs, ISBNs, ISSNs, etc; see the sketch below)

(c) A little webification of the texts would go a long way

(d) Has thought been put into making them data archive-friendly?
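A minimal sketch of (a) and (b), assuming the langdetect package and
using simplified, non-exhaustive regular expressions:

  import re
  from langdetect import detect

  text = open("document.txt").read()   # hypothetical carrel file

  # (a) sanity-check the language
  print("language:", detect(text))

  # (b) extract technical identifiers
  doi = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")
  issn = re.compile(r"\b\d{4}-\d{3}[\dXx]\b")
  print("DOIs:", doi.findall(text))
  print("ISSNs:", issn.findall(text))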

cheers
stuart

--
...let us be heard from red core to black sky

On Thu, 13 Jun 2019 at 08:51, Eric Lease Morgan  wrote:
>
> Through the use of my tool called the Distant Reader, I have refined a 
> process for indexing things like Code4Lib Journal. [1]
>
> The Distant Reader harvests an arbitrary number of user-supplied files or 
> links to files, transforms them into plain text files, and performs numerous 
> natural language processes against them. The result is a large set of indexes 
> that can be used to "read" the given corpus. I have made available the about 
> pages of a number of such indexes:
>
>   * Code4Lib Journal - http://dh.crc.nd.edu/tmp/code4lib-journal/about.html
>  o 1,234,348 words; 303 documents
>  o all articles from a journal named Code4Lib Journal
>
>   * Cultural Analytics - 
> http://dh.crc.nd.edu/tmp/cultural-analytics/about.html
>  o 318,287 words; 33 documents
>  o all articles from a journal named Cultural Analytics
>
>   * Plato - http://dh.crc.nd.edu/tmp/plato/about.html
>  o 929,704 words; 24 documents
>  o the complete works of Plato
>
>   * aesthetics - http://dh.crc.nd.edu/tmp/aesthetics/about.html
>  o 2,296,890 words; 37 documents
>  o books classified as the philosophy of art
>
> At an upcoming high performance computing conference, I -- with a number of 
> colleagues from Indiana University -- will be presenting a poster about the 
> Distant Reader, and we will be taking part in a hack-a-thon. [2, 3] If you 
> too would like hack against the output of the Distant Reader, then drop me a 
> line.
>
> [1] Distant Reader - https://distantreader.org
> [2] high performance computing conference - https://www.pearc19.pearc.org
> [3] hack-a-thon invitation - https://sites.nd.edu/emorgan/2019/06/hackathon/
>
> --
> Eric Lease Morgan
> Digital Initiatives Librarian, Navari Family Center for Digital Scholarship
> Hesburgh Libraries
>
> University of Notre Dame
> 250E Hesburgh Library
> Notre Dame, IN 46556
> o: 574-631-8604
> e: emor...@nd.edu
> w: cds.library.nd.edu


Re: [CODE4LIB] Ready for Review: OCFL 0.3 (Beta)

2019-06-04 Thread Stuart A. Yeates
Is there a clear statement of the problem OCFL is trying to solve? I'm
a third of the way through and it looks like METS with JSON replacing
XML and all references to existing metadata schemas (MARC, dublin
core, etc) stripped.

cheers
stuart
--
...let us be heard from red core to black sky

On Wed, 5 Jun 2019 at 02:25, Andrew Woods  wrote:
>
> Hello All,
>
> The Oxford Common File Layout (OCFL) specification describes an
> application-independent approach to the storage of digital information in a
> structured, transparent, and predictable manner. It is designed to promote
> standardized long-term object management practices within digital
> repositories.
>
> For those following the OCFL initiative or those generally interested in
> current community practice related to preservation persistence, you will be
> pleased to know that the OCFL 0.3 beta specification has been released and
> is now ready for your detailed review and feedback!
> - https://ocfl.io/0.3/spec/
>
> Twenty four issues [1] have been addressed since the 0.2 alpha release
> (February, 2019). Beyond editorial/clarifying updates, the more substantive
> changes in this beta release include:
> - Flexibility of directory name within version directories for holding
> content payload [2]
> - Optional “deposit” directory at top of Storage Root as draft workspace [3]
> - Expectation of case sensitivity of file paths and file names [4]
>
> Within the 90 day review period until September 2nd, please review the
> specification and implementation notes and provide your feedback either as
> discussion on the ocfl-community [5] mailing list or as GitHub issues [6].
>
> The monthly OCFL community meetings [7] are open to all (second Wednesday
> of every month @11am ET). Please join the conversation, or simply keep your
> finger on OCFL’s pulse by lurking!
>
> More detail and implementation notes can be found at https://ocfl.io.
>
> Best regards,
> Andrew Woods, on behalf of the OCFL editorial group
>
> [1]
> https://github.com/OCFL/spec/issues?utf8=%E2%9C%93&q=is%3Aissue+closed%3A2019-02-18..2019-06-03+
> [2] https://github.com/OCFL/spec/issues/341
> [3] https://github.com/OCFL/spec/issues/320
> [4] https://github.com/OCFL/spec/issues/285
> [5] ocfl-commun...@googlegroups.com
> [6] https://github.com/ocfl/spec/issues
> [7] https://github.com/OCFL/spec/wiki/Community-Meetings


Re: [CODE4LIB] altmetric

2019-04-22 Thread Stuart A. Yeates
Altmetic is great for capturing social media buzz.

Unfortunately it has a very narrow selective view of impact, which
means that many classes of publication aren't captured.

It also has no notion of conflict of interest, so institutions which
have twitter bots connected to their institutional repositories
automatically inflate all publications by one (if you're looking for
one of these on the cheap, I recommend https://ifttt.com/ which I use
for https://twitter.com/KiwiPhDs ). Institutions with real marketing
budgets are really making hay.

cheers
stuart



--
...let us be heard from red core to black sky

On Tue, 23 Apr 2019 at 07:04, Eric Lease Morgan  wrote:
>
> To what degree does anybody here have experience with the Altmetric API, and 
> if it is greater than zero, then what is your experience?
>
> Here in the Libraries I am at the very beginnings of an investigation to 
> determine the "impact" of various journal articles. In this case, "impact" is 
> multifaceted and includes things like citations counts as well as mentions in 
> the social media. API access to the Altmetric database may be just the sort 
> of thing I would find useful.
>
> What do y'all think?
>
> [1] Altmetric API - http://api.altmetric.com
>
> --
> Eric Lease Morgan
> University of Notre Dame
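
For anyone wanting to poke at it before committing, a minimal sketch against
the public endpoint (this reflects my understanding of the v1 API; the
example DOI and the two fields pulled out are illustrative):

import requests

r = requests.get("https://api.altmetric.com/v1/doi/10.1038/480426a")
if r.status_code == 200:
    data = r.json()
    print(data.get("title"), data.get("score"))
elif r.status_code == 404:
    print("no attention data recorded for this DOI")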


Re: [CODE4LIB] AuthorityBox

2019-04-16 Thread Stuart A. Yeates
That's great!

I've done a lot of work with VIAF, ORCID, etc in wikipedia and one of
the huge issues is with multiple clusters for the same person. The
wiki page for the list of VIAF errors is so long it no longer loads on
some of my browsers...

cheers
stuart
--
...let us be heard from red core to black sky

On Wed, 17 Apr 2019 at 01:57, Stefano Bargioni  wrote:
>
> Hello everyone,
> the Library of the Pontificia Università della Santa Croce, Rome, has added 
> AuthorityBox to the display of bibliographic records.
> AuthorityBox is an "accordion" composed by an infobox for each personal name 
> related to the record.
> An extra infobox is for settings, help and about.
> Each infobox may contain:
> - information from the authority record
> - links to other resources available in the library, like the "Name Cloud"
> - links to external resources, like "WorldCat Identities" and Wikipedia pages
> - a picture from Wikidata
> - the permalink of the authority record (hidden by default, use settings to 
> show)
> Examples:
> http://catalogo.pusc.it/bib/182859 (1 author)
> http://catalogo.pusc.it/bib/95161 (5 authors)
> http://catalogo.pusc.it/bib/88801 (10 authors)
>
> Some technicalities.
> AuthorityBox is based on VIAF id [1] and other data from MARC21 authority 
> records, in compliance with RDA Cataloguing Guidelines [2].
> Links are composed, directly or indirectly, on the VIAF id or the authority 
> id. For instance, the source of the picture is retrieved by the browser that 
> accesses the SPARQL endpoint query.wikidata.org. For teachers of our 
> University, without a page on Wikipedia, pictures are from a simple 
> repository.
> The ILS is the open source Koha [3].
>
> Stefano
>
> [1] https://viaf.org
> [2] https://www.oclc.org/en/rda/about.html
> [3] https://koha-community.org
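
The Wikidata picture lookup is a nice trick; a sketch of the query as I
understand it (P214 = VIAF ID, P18 = image; the VIAF id is an example value,
and the User-Agent string is a placeholder):

import requests

QUERY = """
SELECT ?image WHERE {
  ?person wdt:P214 "59077352" .  # VIAF id, example value
  ?person wdt:P18 ?image .
}
"""
r = requests.get("https://query.wikidata.org/sparql",
                 params={"query": QUERY, "format": "json"},
                 headers={"User-Agent": "authoritybox-sketch/0.1"})
for row in r.json()["results"]["bindings"]:
    print(row["image"]["value"])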


Re: [CODE4LIB] Online data transformation tools

2019-04-15 Thread Stuart A. Yeates
I would also add somewhere links to the definitions / standards for
each of these file types. Not everyone who encounters MARC can be
expected to know all the other acronyms-as-file-formats.

cheers
stuart


--
...let us be heard from red core to black sky

On Tue, 16 Apr 2019 at 08:58, Kyle Banerjee  wrote:
>
> On Mon, Apr 15, 2019 at 11:20 AM Thomas Dunbar  wrote:
>
> > Hello everyone,
> >
> > I'm working on a proof of concept web application for common library data
> > conversions with support for large files.
> > The application is built using a serverless architecture, which allows me
> > to do this at scale and at low cost.
> >
>
> Love the concept -- I tried a few conversions, including some north of
> 200MB. Overall, it worked impressively. Not having to download software is
> cool because you don't always have the ability to download software or
> might need to do something using a cell phone.
>
> For me personally, the chief needs driving conversions are: 1) To perform
> fixes in a format that's easier to work with (e.g. no one fixes in binary
> MARC) and convert back; 2) analysis -- i.e. identify records/elements that
> have or don't have X; and 3) migrations (which have required further
> manipulation in every single case). In other words, manipulations and
> partial extractions. In the context of these use cases, delimited text,
> plain text, XML, MARC, and JSON (to a lesser extent) dominate conversion
> needs.
>
> Regarding the MARC to text conversion, delimited text conversions need a
> subdelimiter for repeated fields as this is what often must be loaded into
> another system, presented in a table to someone, etc. -- the current method
> which adds more lines will cause trouble for anyone without coding skills.
> On a related note, considering the indicators part of the field makes
> philosophical sense but it creates practical problems (especially with
> nonrepeatable fields). For example, it scatters the 245 titles over as many
> indicator variations that exist making the simple task of generating a list
> of titles trickier than it should be. MARC already has a huge number of
> fields, so when the indicator permutations are combined with separate
> fielding for repeated fields, it takes no time at all to get many hundreds
> of fields with files that aren't that big -- something headache inducing
> even for those with mad skilz.
>
> One thing you'll want to think about as you develop the tool is what the
> people use it to accomplish. In my experience, conversions set you up for
> what you were really doing rather than being objectives in their own right.
>
> But again, very cool.
>
> kyle
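
The subdelimiter point is easy to see in code; a rough sketch with pymarc,
where the "|" subdelimiter, the file name, and the 245/650 field choices are
all illustrative:

from pymarc import MARCReader

SUBDELIM = "|"  # one row per record; repeated 650s joined, not extra rows
with open("records.mrc", "rb") as fh:
    for record in MARCReader(fh):
        title = record["245"].value() if record["245"] else ""
        subjects = SUBDELIM.join(f.value() for f in record.get_fields("650"))
        print("\t".join((title, subjects)))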


[CODE4LIB] Browzine / thirdiron off campus authentication issue after cookies cleared?

2018-11-05 Thread Stuart A. Yeates
Are there any other Browzine / thirdiron (https://thirdiron.com/)
users out there who are seeing an issue with browzine forgetting which
institution it's meant to be authenticating against when used by off
campus users who have not previously logged into our institutional
systems with this browser?

We are linking to browzine from PRIMO on item pages, with our
institutional code in the URL. From on campus everything works. For
off campus users who are already logged into PRIMO everything works.
For off campus users who are already logged into other institutional
systems (SAML / ADFS) everything works.  For off campus users who
logged into these systems in a previous browser session things seem to
work. But other users get sent to a WAYF (where are you from) page and
have to select our institution, despite the institution being in the
URL.

Does this sound familiar to anyone?

cheers
stuart
--
...let us be heard from red core to black sky


Re: [CODE4LIB] exact relationship between DOIs and handles?

2018-08-19 Thread Stuart A. Yeates
Someone off list notified me of at least one third party which has the same
behaviour as the official doi.org lookup, https://doi.pangaea.de/  it does
at least document it's behaviour.

cheers
stuart
--
...let us be heard from red core to black sky


On Mon, 20 Aug 2018 at 12:17, Fitchett, Deborah <
deborah.fitch...@lincoln.ac.nz> wrote:

> I tend to consider it an “unintended feature” myself. ☺ But otherwise this
> is my understanding of the situation too.
>
> As far as I’m aware DOIs proper are all in the form 10./some_more_stuff
>
> Deborah
>
> From: Code for Libraries  On Behalf Of Conal
> Tuohy
> Sent: Friday, 17 August 2018 1:26 PM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: Re: [CODE4LIB] exact relationship between DOIs and handles?
>
> Kia ora Stuart!
>
> I think the answer to your question is "no, the identifier is not a valid
> DOI".
>
> As evidence, I offer this URI which is supposed return information about
> the Registration Agency which registered that DOI:
> https://doi.org/doiRA/10063/1710
>
> As you know, DOIs are a proper subset of Handles; and functionally, the DOI
> system relies on the Handle system as its infrastructure for URI
> resolution. I believe that when you resolve the URI <
> https://doi.org/10063/1710<https://doi.org/10063/1710>>, the DOI resolver
> is simply resolving the
> identifier as a Handle, and not first validating that the Handle is
> actually a valid DOI. I'd regard that as a bug in the DOI's resolver,
> personally.
>
> Cheers!
>
> Conal
>
>
> On Fri, 17 Aug 2018 at 09:37, Stuart A. Yeates  wrote:
>
> > We have a DSpace instance that is configured to issue handle.net
> > identifiers to all items, so links such as:
> >
> > https://researcharchive.vuw.ac.nz/handle/10063/1710
> > http://researcharchive.vuw.ac.nz/handle/10063/1710
> > https://hdl.handle.net/10063/1710
> > http://hdl.handle.net/10063/1710
> >
> > all take a web browser to the same content. The following URLs also take
> > web
> > browsers to the same content:
> >
> > https://doi.org/10063/1710
> > http://doi.org/10063/1710
> > https://dx.doi.org/10063/1710
> > http://dx.doi.org/10063/1710
> >
> > The lookup at https://www.doi.org/index.html resolves the doi "10063/1710"
> > to the same content.
> >
> > I have two questions:
> >
> > (a) is 10063/1710 a valid/legal doi for this item ?
> > (b) are the doi.org URLs above valid/legal for this item?
> >
> > The documentation on the https://www.doi.org/ and https://handle.net/
> > websites is surprisingly quiet on these issues...
> >
> > [We've been assuming the answer to these questions is 'yes' but yesterday
> > this was questioned by a colleague, so I'm looking for definitive
> > answers]
> >
> > cheers
> > stuart
> > --
> > ...let us be heard from red core to black sky
> >
>
>
> --
> Conal Tuohy
> http://conaltuohy.com/
> @conal_tuohy
> +61-466-324297
>


Re: [CODE4LIB] exact relationship between DOIs and handles?

2018-08-16 Thread Stuart A. Yeates
Further exploration reveals that  http://doai.io ,
http://doi2oa.erambler.co.uk/ and https://oadoi.org/ (now
http://unpaywall.org/ ) don't resolve handles in the way that
https://doi.org/ does.

Having said that, at least some are open source (
https://github.com/jezcope/doi2oa) so it shouldn't be too hard to add.

cheers
stuart
--
...let us be heard from red core to black sky


On Fri, 17 Aug 2018 at 14:20, Stuart A. Yeates  wrote:

> Interesting insight Conal, I wasn't aware of that service.
>
> https://doi.org/10063/1710 redirects to
> http://researcharchive.vuw.ac.nz/handle/10063/1710 using a 302 redirect,
> implying that the server knows where the DOI resides by RFC 7231
> https://tools.ietf.org/html/rfc7231
>
> If  10063/1710 were not a valid DOI, the DOI server should use 303 (if it
> redirects) and  a 400 or 404 if it doesn't.
>
> cheers
> stuart
> --
> ...let us be heard from red core to black sky
>
>
> On Fri, 17 Aug 2018 at 13:27, Conal Tuohy  wrote:
>
>> Kia ora Stuart!
>>
>> I think the answer to your question is "no, the identifier is not a valid
>> DOI".
>>
>> As evidence, I offer this URI which is supposed return information about
>> the Registration Agency which registered that DOI:
>> https://doi.org/doiRA/10063/1710
>>
>> As you know, DOIs are a proper subset of Handles; and functionally, the
>> DOI
>> system relies on the Handle system as its infrastructure for URI
>> resolution. I believe that when you resolve the URI <
>> https://doi.org/10063/1710>, the DOI resolver is simply resolving the
>> identifier as a Handle, and not first validating that the Handle is
>> actually a valid DOI. I'd regard that as a bug in the DOI's resolver,
>> personally.
>>
>> Cheers!
>>
>> Conal
>>
>>
>> On Fri, 17 Aug 2018 at 09:37, Stuart A. Yeates  wrote:
>>
>> > We have a DSpace instance that is configured to issue handle.net
>> > identifiers to all items, so links such as:
>> >
>> > https://researcharchive.vuw.ac.nz/handle/10063/1710
>> > http://researcharchive.vuw.ac.nz/handle/10063/1710
>> > https://hdl.handle.net/10063/1710
>> > http://hdl.handle.net/10063/1710
>> >
>> > all take a web browser to the same content. The following URLs also take
>> > web
>> > browsers to the same content:
>> >
>> > https://doi.org/10063/1710
>> > http://doi.org/10063/1710
>> > https://dx.doi.org/10063/1710
>> > http://dx.doi.org/10063/1710
>> >
>> > The lookup at https://www.doi.org/index.html resolves the doi
>> "10063/1710"
>> > to the same content.
>> >
>> > I have two questions:
>> >
>> > (a) is 10063/1710 a valid/legal doi for this item ?
>> > (b) are the doi.org URLs above valid/legal for this item?
>> >
>> > The documentation on the https://www.doi.org/ and https://handle.net/
>> > websites is surprisingly quiet on these issues...
>> >
>> > [We've been assuming the answer to these questions is 'yes' but
>> yesterday
>> > this was questioned by a colleague, so I'm looking for definitive
>> answers]
>> >
>> > cheers
>> > stuart
>> > --
>> > ...let us be heard from red core to black sky
>> >
>>
>>
>> --
>> Conal Tuohy
>> http://conaltuohy.com/
>> @conal_tuohy
>> +61-466-324297
>>
>


Re: [CODE4LIB] exact relationship between DOIs and handles?

2018-08-16 Thread Stuart A. Yeates
Interesting insight Conal, I wasn't aware of that service.

https://doi.org/10063/1710 redirects to
http://researcharchive.vuw.ac.nz/handle/10063/1710 using a 302 redirect,
implying that the server knows where the DOI resides by RFC 7231
https://tools.ietf.org/html/rfc7231

If  10063/1710 were not a valid DOI, the DOI server should use 303 (if it
redirects) and  a 400 or 404 if it doesn't.
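
Both behaviours are easy to check mechanically; a small sketch with Python's
requests library (the handle is the one under discussion; the status-code
reading is the RFC 7231 one above):

import requests

for url in ("https://doi.org/10063/1710",
            "https://doi.org/doiRA/10063/1710"):
    r = requests.get(url, allow_redirects=False)
    # a redirect carries a Location header; the doiRA lookup answers in-body
    print(url, r.status_code, r.headers.get("Location", r.text[:80]))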

cheers
stuart
--
...let us be heard from red core to black sky


On Fri, 17 Aug 2018 at 13:27, Conal Tuohy  wrote:

> Kia ora Stuart!
>
> I think the answer to your question is "no, the identifier is not a valid
> DOI".
>
> As evidence, I offer this URI which is supposed return information about
> the Registration Agency which registered that DOI:
> https://doi.org/doiRA/10063/1710
>
> As you know, DOIs are a proper subset of Handles; and functionally, the DOI
> system relies on the Handle system as its infrastructure for URI
> resolution. I believe that when you resolve the URI <
> https://doi.org/10063/1710>, the DOI resolver is simply resolving the
> identifier as a Handle, and not first validating that the Handle is
> actually a valid DOI. I'd regard that as a bug in the DOI's resolver,
> personally.
>
> Cheers!
>
> Conal
>
>
> On Fri, 17 Aug 2018 at 09:37, Stuart A. Yeates  wrote:
>
> > We have a DSpace instance that is configured to issue handle.net
> > identifiers to all items, so links such as:
> >
> > https://researcharchive.vuw.ac.nz/handle/10063/1710
> > http://researcharchive.vuw.ac.nz/handle/10063/1710
> > https://hdl.handle.net/10063/1710
> > http://hdl.handle.net/10063/1710
> >
> > all take a web browser to the same content. The following URLs also take
> > web
> > browsers to the same content:
> >
> > https://doi.org/10063/1710
> > http://doi.org/10063/1710
> > https://dx.doi.org/10063/1710
> > http://dx.doi.org/10063/1710
> >
> > The lookup at https://www.doi.org/index.html resolves the doi
> "10063/1710"
> > to the same content.
> >
> > I have two questions:
> >
> > (a) is 10063/1710 a valid/legal doi for this item ?
> > (b) are the doi.org URLs above valid/legal for this item?
> >
> > The documentation on the https://www.doi.org/ and https://handle.net/
> > websites is surprisingly quiet on these issues...
> >
> > [We've been assuming the answer to these questions is 'yes' but yesterday
> > this was questioned by a colleague, so I'm looking for definitive
> answers]
> >
> > cheers
> > stuart
> > --
> > ...let us be heard from red core to black sky
> >
>
>
> --
> Conal Tuohy
> http://conaltuohy.com/
> @conal_tuohy
> +61-466-324297
>


[CODE4LIB] exact relationship between DOIs and handles?

2018-08-16 Thread Stuart A. Yeates
We have a DSpace instance that is configured to issue handle.net
identifiers to all items, so links such as:

https://researcharchive.vuw.ac.nz/handle/10063/1710
http://researcharchive.vuw.ac.nz/handle/10063/1710
https://hdl.handle.net/10063/1710
http://hdl.handle.net/10063/1710

all take a web browser to the same content. The following URLs also take web
browsers to the same content:

https://doi.org/10063/1710
http://doi.org/10063/1710
https://dx.doi.org/10063/1710
http://dx.doi.org/10063/1710

The lookup at https://www.doi.org/index.html resolves the doi "10063/1710"
to the same content.

I have two questions:

(a) is 10063/1710 a valid/legal doi for this item ?
(b) are the doi.org URLs above valid/legal for this item?

The documentation on the https://www.doi.org/ and https://handle.net/
websites is surprisingly quiet on these issues...

[We've been assuming the answer to these questions is 'yes' but yesterday
this was questioned by a colleague, so I'm looking for definitive answers]

cheers
stuart
--
...let us be heard from red core to black sky


Re: [CODE4LIB] BIBFRAME nesting question

2018-01-18 Thread Stuart A. Yeates
> I
> haven't thought this through but because BF combines the FRBR work and
> expression into a single entity, it may be safe to say that no BF
> instance can be an instanceOf more than one BF work.

Isn't every edition of 'Complete Works of Shakespeare' an instanceOf each
of the plays?

cheers
stuart
--
...let us be heard from red core to black sky

On 19 January 2018 at 08:49, Karen Coyle  wrote:

> Joshua,
>
> Yes, as Nate says, those examples on my site are from quite a while ago,
> and come out of an early MARC -> BFv1 converter.
>
> I don't how BF decides what gets a URI vs. what is a blank node (and I
> find it to be heavy on blank nodes, which may reflect an XML development
> environment). I do know that the FRBR model treats each bibliographic
> entity (WEMI) as a top-level "thing". FRBR also explicitly rejects the
> idea that the whole WEMI can be expressed with a single URI.[0] That
> seems extreme, but in fact in FRBR there are many-to-many relationships
> between works and expressions, so it isn't a hierarchy but a graph. I
> haven't thought this through but because BF combines the FRBR work and
> expression into a single entity, it may be safe to say that no BF
> instance can be an instanceOf more than one BF work. However, any BF
> work can have more than one instance, so the "super-set" identifier
> becomes difficult.
>
> My gut feeling is that you should analyze your own data based on your
> own use cases and then posit a model - so that your ideas are clear
> before you step into the morass of BF assumptions (many of which do not
> appear to be directly articulated in the public documentation). If you
> find that your use cases are not served by BF, PLEASE bring that to the
> attention of the community working on BF and LD4P [1]. There are aspects
> of the BF development that may meet the needs of some but not all,
> because the range of experiences is still limited. More voices are a
> Good Thing.
>
> kc
> [0] For more than you ever wanted to know, look at part II of
> http://kcoyle.net/beforeAndAfter/index.html
> [1] https://wiki.duraspace.org/display/LD4P
>
> On 1/18/18 11:26 AM, Josh Welker wrote:
> > Okay, thanks all. I will set up the code to split the entities into
> > different files. Is there a rule of thumb for when a Thing needs to be
> > split out into a different file with its own URI vs being a blank node?
> > For instance, maybe blank nodes one level deep are okay but nested ones are
> > not. But I don't see the point of making a URI for the Title of a
> > yearbook, for instance, when virtually no one is ever going to reference the Title
> > outside the context of the larger Work or Instance.
> >
> > Joshua Welker
> > Information Technology Librarian
> > James C. Kirkpatrick Library
> > University of Central Missouri
> > Warrensburg, MO 64093
> > JCKL 2260
> > 660.543.8022
> >
> >
> > On Thu, Jan 18, 2018 at 12:39 PM, Trail, Nate  wrote:
> >
> >> Just to note, that is a BIBFRAME1 vocab example. You can tell because
> >> the namespace is http://bibframe.org/vocab...
> >>
> >> You could certainly extract them and post them to their own end points,
> >> but you have to decide how to make the uris unique in your endpoint area:
> >> Karen's had a unique uri for the Work: http://id/test/C:\Users\
> >> deborah\Documents\OxygenXMLDeveloper\samples14107665 , but nothing for
> >> the Instance.
> >>
> >> If she wanted, she could have posted the Work part to
> >> http://kcoyle.net/bibframe/works/samples1410665
> >> And she could have posted the Instance part to
> >> http://kcoyle.net/bibframe/instances/samples1410665 (and changed the
> >> bf:Instance bf:instanceOf address to the new work URI).
> >>
> >> <bf:instanceOf rdf:resource="http://kcoyle.net/bibframe/works/samples1410665"/>
> >>
> >>
> >>
> >> By the way, the bf2 version is comparable here (if I'm right that the
> >> number is the LC voyager bib id):
> >>
> >> Id.loc.gov/tools/bibframe/compare-id/full-rdf?find=14107665 or
> >> Id.loc.gov/tools/bibframe/compare-id/full-ttlf?find=14107665
> >> It's also available for extraction and use  here:
> >> http://lx2.loc.gov:210/LCDB?query=rec.id=14107665&recordSchema=bibframe2a&maximumRecords=1
> >>
> >> Making things even more interesting, this one also has embedded Work
> >> descriptions :
> >> http://bibframe.example.org/14107665#Work740-46; >
> >> Blest pair of sirens.
> >>
> >> They are pretty skimpy but could be used as stub descriptions and given
> >> their own identity (uri) until such time as they can be reconciled to an
> >> existing description or be more fully cataloged to stand on their own.
> >>
> >> Nate
> >>
> >> -
> >> Nate Trail
> >> Network Development & MARC Standards Office
> >> LS/ABA/NDMSO
> >> LA308, Mail Stop 4402
> >> Library of Congress
> >> Washington DC 20540
> >>
> >>
> >>
> >>
> >> -Original Message-
> >> From: Code for Libraries [mailto:CODE4LIB@LISTS.CLIR.ORG] 

Re: [CODE4LIB] c4l Journal unreachable

2018-01-12 Thread Stuart A. Yeates
In the meantime:
http://web.archive.org/web/20171202212012/http://journal.code4lib.org/articles/5468

cheers
stuart


--
...let us be heard from red core to black sky

On 13 January 2018 at 07:05, Karen Coyle  wrote:

> The link:
>
> http://journal.code4lib.org/
>
> goes to https://www.ibiblio.org/ blog page
>
> and individual articles get a 404 at ibiblio, e.g.
>
> http://journal.code4lib.org/articles/5468
>
> Thanks to whoever can figure this out.
>
> kc
> --
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: +1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>


Re: [CODE4LIB] curating code4lib

2017-12-12 Thread Stuart A. Yeates
Someone on the list is bound to have extra megabytes left on their
archive.org sub at the end of the period. Maybe we could have a wiki page
describing the best crawl config so nothing gets left out? Remember that
re-crawling the same content doesn't incur a cost...

cheers
stuart

--
...let us be heard from red core to black sky

On 13 December 2017 at 07:48, Kyle Banerjee  wrote:

> On Tue, Dec 12, 2017 at 9:10 AM, Eric Lease Morgan  wrote:
>
> > As I sit here watching my EAD files get indexed by Solr, I ask myself,
> “To
> > what degree are we — the Code4Lib community — curating our content?”
> >
> > Seriously, our “community” generates content, and the bulk of it takes
> > three or four forms: the mailing list, the journal, the wiki, and
> > conference agendas/schedules. How “important” is this content? While it
> > may very well be backed up, and while it may very well be restorable, I
> > wonder about its intrinsic values...
> >
>
> Generally speaking, if you have to wonder about the value of something, you
> already have the answer ;)
>
> But seriously, just because a theoretical use case can be imagined is not a
> good reason to dedicate resources -- this is the very definition of a
> solution looking for a problem.
>
> Whether or not  content is formally organized, the good stuff has and will
> continue to permeate thinking/systems/processes elsewhere.
>
> kyle
>


Re: [CODE4LIB] Online suggestion form or box

2017-10-24 Thread Stuart A. Yeates
We're using https://www.zendesk.com/ for end user feedback. It's great.

cheers
stuart

--
...let us be heard from red core to black sky

On 25 October 2017 at 09:04, Jason Bengtson  wrote:

> I've used a variety of them in the past for projects . . . often through
> google forms. I haven't done it recently.
>
> Best regards,
>
> *Jason Bengtson*
>
>
> *http://www.jasonbengtson.com/ *
>
> On Tue, Oct 24, 2017 at 1:52 PM, Beth Goodwin 
> wrote:
>
> > Good Afternoon,
> >
> > Outside of the Springshare products, is anyone using a type of anonymous
> > online suggestion/feedback form? A digital question and answer board?
> >
> > Thanks!
> >
> > Beth Goodwin
> > Trinity International University
> >
>


Re: [CODE4LIB] Persistent Identifiers for organizations/institutions.

2017-10-14 Thread Stuart A. Yeates
On 14 October 2017 at 07:11, Edward Summers  wrote:

> What if we created an identifier system that organizations would pay an
> annual fee to belong to? This identifier would be guaranteed to be
> globally unique as long as the organization cared to maintain it. You could
> use this identifier with your web browser to find information about the
> organization.
>
> Yes, that’s DNS.
>
> What if we (memory institutions writ large) did something about
> remembering the history of DNS? It sounds simple, but it’s not. Is it
> possible?
>

archive.org web harvests include at least some DNS details for the content
they harvest. I'm not sure how comprehensive it is and I'm pretty sure that
there isn't a tool for easily exploring it.

Having said that, there are some significant organisations which have yet
to embrace the digital, which makes DNS tracking of their structure
challenging.

cheers
stuart

--
...let us be heard from red core to black sky


Re: [CODE4LIB] Needed: some metadata documentation examples

2017-10-11 Thread Stuart A. Yeates
Do you mean something like http://rioxx.net/ ?

cheers
stuart

--
...let us be heard from red core to black sky

On 11 October 2017 at 10:20, Karen Coyle  wrote:

> Hello, all. I'm working on some projects where we are trying to define
> formats and guidance for metadata *profiles*. You may have seen the
> profiles that were created at one point for BIBFRAME (I can't find them
> at the moment on the new site) - they were more "list-like" than the
> fancy BF-lite site, but mostly the same idea. Profiles often are a
> simple list of data elements or properties, sometimes with a bit more
> info like cardinality.[1]
>
> What I want to find are some examples of documentation aimed at those
> creating the metadata records that explain what goes into the metadata,
> and hopefully some rules like "this has to be a date in the format
> -mm-dd". I'm guessing that folks using systems like contentDM may
> have something of this nature. Obviously, the whole RDA enchilada would
> be way too much to chew on at this point. If you can point me to
> documentation that you have created or use, I would appreciate it. If I
> decide to do more with it than ruminate I will let you know.
>
> I want to note that part of the goal is to link metadata schema
> documentation and metadata user documentation with a validation language
> like ShEx.[2] If you want to know more, ping me, but we should have more
> to say after the Dublin Core meeting in D.C. later this month, where we
> are having a whole day on profiles, Oct 27, called "taming the graph".[3]
>
> Thanks,
> kc
>
> [1] You can find some profiles based on the W3C standard DCAT here:
> https://www.w3.org/2017/dxwg/wiki/Main_Page#Non-W3C_Documents, and you
> may find some other interesting links on that page.
> [2] http://shex.io/
> [3] http://dcevents.dublincore.org/index.php/IntConf/dc-2017/schedConf/
> --
> Karen Coyle
> kco...@kcoyle.net http://kcoyle.net
> m: +1-510-435-8234
> skype: kcoylenet/+1-510-984-3600
>
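
The "this has to be a date in the format yyyy-mm-dd" kind of rule is easy to
mock up; a toy sketch of profile-driven checking, with ShEx being the real
machinery this stands in for (the element names and patterns are invented):

import re

PROFILE = {  # element -> required pattern
    "title": r".+",
    "date": r"\d{4}-\d{2}-\d{2}",
}

record = {"title": "Annual report", "date": "2017-10-11"}

for element, pattern in PROFILE.items():
    value = record.get(element, "")
    ok = re.fullmatch(pattern, value) is not None
    print(element, "OK" if ok else f"FAILS {pattern!r}: {value!r}")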


Re: [CODE4LIB] SIP/AIP Content Guidelines

2017-06-02 Thread Stuart A. Yeates
We're currently migrating a number of in-house and DSpace systems into SIPs
for ingest into Rosetta.

My advice is to put as much metadata as possible into the METS. For
example, just this week we seriously discussed putting SUSHI sections in at
the item level. Previously SUSHI was done by very different systems at very
different levels, but it's emerged as the only serious standard for
encoding usage statistics.

We're also putting TEIHDRs and things like that in, simply because it
makes them much more available.

cheers
stuart

--
...let us be heard from red core to black sky

On 3 June 2017 at 01:56, Andrew Weidner  wrote:

> Hi all,
>
> Can anyone point me to guidelines or best practices documentation around
> creating SIPs for transfer to archival storage? What does an ideal AIP look
> like for digitized cultural heritage materials?
>
> I'd like to set up a pipeline that sends single object (e.g. one
> photograph, one book) SIPs from our digitization workflow to Archivematica
> for automated transfer to archival storage. Here's a brief slide deck
> outlining the approach I'm envisioning:
>
> https://docs.google.com/presentation/d/19F5seismyBdhgIWk7Kt0jmJjqis--FCpOwNr3v6Iu-w/edit?usp=sharing
>
> I welcome any thoughts that you all may have on this, especially about
> pitfalls to avoid.
>
> Thanks,
>
> Andy
>


Re: [CODE4LIB] File name extension in bitstream URL for preservation harvesting

2017-04-24 Thread Stuart A. Yeates
Quality digital preservation software will ignore file extensions and file
names for file type identification and use JHOVE or similar.

Cheers
Stuart
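
JHOVE itself is Java; as a quick stand-in for the content-sniffing idea, a
sketch using python-magic (pip install python-magic), where the file names
are placeholders:

import magic

# identification from magic bytes, ignoring whatever the file name claims
for path in ("bitstream-123456", "report.pdf.bak"):
    print(path, magic.from_file(path, mime=True))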

On Tuesday, April 25, 2017, Benedikt Kroll 
wrote:

> That would work, but I'm rather trying to find out whether digital
> preservation software has problems with service-style URLs for bitstreams
> in general. Because if that is the case, it could be relevant for the
> further development of the (open source) repository software we are using.
>
>
>
> On 24.04.2017 at 13:15, Andreas Orphanides wrote:
>
>> Here's a silly idea that maybe runs the risk of rushing to a solution
>> without actually addressing the core question... could you set up a proxy
>> that provides a URL ending in the correct filename?
>>
>> On Mon, Apr 24, 2017 at 4:28 AM, Benedikt Kroll <
>> benedikt.kr...@der-arbeitende.de> wrote:
>>
>>> So this happened when trying to ingest a file to the long-term archive from
>>> a repository:
>>>
>>> The repository's bitstream URL for a PDF file did not end in .pdf, but
>>> the mimetype was sent correctly. So the URL looked like
>>> https://test/bitstream/id/123456. The preservation system (a commercial
>>> product) did not accept this bitstream URL as part of an OAI harvesting
>>> response.
>>>
>>> The error message was that the bitstream URL must contain a valid file
>>> name and must end with an extension according to its mime type. So the
>>> expected URL would need to look like https://test/bitstream/123456.pdf
>>>
>>> This would mean that using this preservation system, we will not be able
>>> to harvest from a repository that uses service endpoints, not regular
>>> file
>>> links to deliver bitstreams.
>>>
>>> I'm trying to find out whether it is common behaviour for preservation
>>> systems to require file URLs rather than service URLs to ingest
>>> bitstreams.
>>> Any experience on how preservation software you use handles this detail
>>> would be appreciated!
>>>
>>> Thanks!
>>> Benedikt
>>>
>>>
>>>
>>>
>>> On 21.04.2017 at 18:46, Cary Gordon wrote:
>>>
>>> Could you be a bit more specific about the issue you encountered?

 Thanks,

 Cary

 Cary Gordon
 The Cherry Hill Company
 http://chillco.com

 On Apr 21, 2017, at 12:53 AM, Benedikt Kroll <

> benedikt.kr...@der-arbeitende.de> wrote:
>
> We run into a situation where this occured, and I'm trying to find out
> what other preservation software also do this – and if so, maybe also
> get
> to know why this check is done.
>
>


-- 
--
...let us be heard from red core to black sky


Re: [CODE4LIB] Google Marking Code4Lib As Spam

2017-01-24 Thread Stuart A. Yeates
Most spam filters use distributed user reports as an important feedback loop.

This means that if individuals on the list want to do something about this,
they can check their SPAM box for code4lib messages and mark them as not
spam. To find these enter "in:spam code4lib" into the gmail search box.

cheers
stuart

--
...let us be heard from red core to black sky

On 25 January 2017 at 03:17, Eric Lease Morgan  wrote:

> On Jan 24, 2017, at 8:15 AM, Regina Beach-Bertin 
> wrote:
>
> > I just took a look in my spam folder.  Google is marking some Code4Lib
> > mail as spam.  It's not every single message and seems to have been during
> > the last week.
>
>
> This is the third mailing list weirdness brought to my attention in as
> many days. I will look into things more closely. Thank you. —Eric Morgan
>


Re: [CODE4LIB] How to archive selected pages from a site requiring authentication

2017-01-18 Thread Stuart A. Yeates
https://archive-it.org/ the subscription service of https://archive.org/
does login-protected sites.

We've found them to be very helpful and the software to just work, but
we've never done any password protected sites.

cheers
stuart

--
...let us be heard from red core to black sky

On Thu, Jan 19, 2017 at 5:54 AM, Nicholas Taylor  wrote:

> Hi Alex,
>
> If you don't mind having your data in WARC format, you could use:
> * The Webrecorder web service (https://webrecorder.io/), which records
> pages that you browse into an archive. Works well if you only have a small
> number of pages to archive and has the advantage that it can archive
> whatever you can access via your browser. Just make sure to set the
> collection to private and/or download and delete it once completed.
> * The Heritrix archival crawler supports HTTP authentication (
> https://webarchive.jira.com/wiki/display/Heritrix/Credentials), much like
> HTTrack or wget, with the added advantage of storing the files in WARC.
>
> ~Nicholas
>
> -Original Message-
> From: Alex Armstrong [mailto:armstr...@amicalnet.org]
> Sent: Tuesday, January 17, 2017 7:09 AM
> Subject: Re: How to archive selected pages from a site requiring
> authentication
>
> Hi Mike & Tom,
>
> I didn’t clarify in my original question that I’m looking to access a site
> that uses form-based authentication.
>
> You’re both pointing me to the same which is to provide cookies to a CLI
> tool. You suggest wget, I began by looking at httrack and someone off-list
> suggested curl. All of these should work :)
>
> I’ve been swamped by other work to try this, but my next steps are surer
> now. Thanks folks!
>
> Alex
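
For the record, a rough sketch of the requests-session equivalent in Python
(the login URL and form field names are placeholders -- copy the real ones
from the site's login form):

import requests

s = requests.Session()
s.post("https://example.org/login",
       data={"username": "me", "password": "secret"})  # placeholder fields

for i, url in enumerate(["https://example.org/page1",
                         "https://example.org/page2"]):
    r = s.get(url)
    r.raise_for_status()
    with open(f"page{i}.html", "w", encoding="utf-8") as out:
        out.write(r.text)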
>
> On 15 January 2017 at 01:49:20, Hagedon, Mike - (mhagedon) (
> mhage...@email.arizona.edu) wrote:
>
> Hi Alex,
> It might really depend on the kind of authentication used, but a number of
> years ago I had to do something similar for a site protected by university
> (CAS) authn. If I recall correctly, I logged into the site with Firefox,
> and then told wget to use Firefox cookies. More or less like this like the
> "easy" version of the accepted answer here:
>
> http://askubuntu.com/questions/161778/how-do-i-use-wget-curl-to-download-from-a-site-i-am-logged-into
>
> Mike
>
> Mike Hagedon | Team Lead for Software & Web Development (Dev) | Technology
> Strategy & Services | University of Arizona Libraries
>
>
> -Original Message-
> From: Code for Libraries [mailto:CODE4LIB@LISTS.CLIR.ORG] On Behalf Of
> Alex Armstrong
> Sent: Friday, January 13, 2017 12:42 AM
> To: CODE4LIB@LISTS.CLIR.ORG
> Subject: [CODE4LIB] How to archive selected pages from a site requiring
> authentication
>
> Has anyone had to archive selected pages from a login-protected site? How
> did you do it?
>
> I've used the CLI tool httrack in the past for archiving sites. But in
> this case, accessing the pages require logging in. There's some vague
> documentation about how to do this with httrack, but I haven't cracked it
> yet. (The instructions are better for the Windows version of the
> application, but I only have ready access to a Mac.)
>
> Before I go on a wild goose chase, any help would be much appreciated.
>
> Alex
>
> --
> Alex Armstrong
> Web Developer & Digital Strategist, AMICAL Consortium
> armstr...@amicalnet.org
>


Re: [CODE4LIB] MARCXML help again

2017-01-10 Thread Stuart A. Yeates
That is, if the XML is completely consistent AND you're guaranteed to never
encounter MARC data with XML special characters, then Kyle's suggestion is
an excellent one.

I really need to find an excuse to publish a document with a title starting
with "<"...

Roy wrote:

> Well, I think that's a *bit* harsh. But the "YMMV" addition was
> appreciated, because it can and will. That is, if the XML is completely
> consistent, then Kyle's suggestion is an excellent one. If it isn't, then
> Kevin's link applies, IMHO. Since it appears from what we have been told
> that the records are consistent, I think Kyle's solution is not only
> workable but the most efficient. Given the caveat stated above.
> Roy
>
> > On Jan 10, 2017, at 5:57 PM, Kevin S. Clarke 
> wrote:
> >
> > On the mention of parsing XML with string operations, I'm compelled to
> > post one of my favorite StackOverflow responses:
> >
> > http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags/1732454#1732454
> >
> > YMMV of course...
> >
> > Kevin
> >
> >
> >
> > -Original message-
> >> From: Kyle Banerjee
> >> Sent: Tuesday, January 10 2017, 5:44 pm
> >> To: CODE4LIB@LISTS.CLIR.ORG
> >> Subject: Re: [CODE4LIB] MARCXML help again
> >>
> >> Howdy Julie,
> >>
> >> Depending on your specific needs, it's often easier/faster to use string
> >> rather than XML operations to work with XML.
> >>
> >> Especially if you have a large number of files and/or the files are very
> >> big, stripping the whitespace between elements and then performing a
> >> simple string substitution would be a fast low tech way to remove the
> >> unwanted fields.
> >>
> >> kyle
> >>
> >> On Tue, Jan 10, 2017 at 1:13 PM, Julie Swierczek <jswie...@swarthmore.edu> wrote:
> >>
> >>> Thanks to all who responded to my earlier plea for help.  I now have a
> >>> new problem.  I'm not sure if I can do this with find and replace in
> >>> Oxygen, or if this requires XSLT, or what.
> >>>
> >>> I have a project of MARCXML records like this:
> >>>
> >>> <record xmlns="http://www.loc.gov/MARC21/slim"
> >>>xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> >>>xsi:schemaLocation="http://www.loc.gov/MARC21/slim
> >>> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
> >>>   ...
> >>>   <datafield ...>
> >>>     <subfield ...>Faux College</subfield>
> >>>     <subfield ...>Special Collections</subfield>
> >>>   </datafield>
> >>>   ...
> >>> </record>
> >>>
> >>> I want to strip out all instances of:
> >>>
> >>> <datafield ...>
> >>>   <subfield ...>Faux College</subfield>
> >>>   <subfield ...>Special Collections</subfield>
> >>> </datafield>
> >>>
> >>> but I want to leave other <datafield>
> >>> instances intact.  I only want to delete ones with both the Faux
> >>> College and Special Collections text in the subfields.
> >>>
> >>> Where would I go from here? I thought of doing an xsl:template match
> >>> in an XSL stylesheet, and then not providing any instructions for
> >>> replacing the match, but I don't know how to select for that specific
> >>> text. My attempts to figure that out have not worked. You can only read
> >>> so much W3C documentation and Stack Overflow before you need to just
> >>> sit quietly and stare at a wall for a while.
> >>>
> >>> Thanks in advance --
> >>>
> >>> Julie
> >>>
> >>
>


Re: [CODE4LIB] MARCXML help again

2017-01-10 Thread Stuart A. Yeates
You need an identity transform + a no-op template such as:

<xsl:template match="@*|node()"><xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy></xsl:template>
<xsl:template match="marc:datafield[marc:subfield='Faux College' and marc:subfield='Special Collections']"/>

or more readably as:

<xsl:template match="@*|node()">
  <xsl:copy>
    <xsl:apply-templates select="@*|node()"/>
  </xsl:copy>
</xsl:template>

<xsl:template match="marc:datafield[marc:subfield = 'Faux College'
                                    and marc:subfield = 'Special Collections']"/>
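
Wrapped in an xsl:stylesheet element, with the MARC namespace bound to the
marc prefix, that can be run outside Oxygen too; a minimal sketch with lxml
(pip install lxml), file names being placeholders:

from lxml import etree

transform = etree.XSLT(etree.parse("strip.xsl"))  # templates above, wrapped
result = transform(etree.parse("records.xml"))
with open("cleaned.xml", "w", encoding="utf-8") as out:
    out.write(str(result))
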
cheers
stuart


--
...let us be heard from red core to black sky

On Wed, Jan 11, 2017 at 10:13 AM, Julie Swierczek 
wrote:

> Thanks to all who responded to my earlier plea for help.  I now have a new
> problem.  I'm not sure if I can do this with find and replace in Oxygen, or
> if this requires XSLT, or what.
>
> I have a project of MARCXML records like this:
>
> <record xmlns="http://www.loc.gov/MARC21/slim"
> xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
> xsi:schemaLocation="http://www.loc.gov/MARC21/slim
> http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd">
>   ...
>   <datafield ...>
>     <subfield ...>Faux College</subfield>
>     <subfield ...>Special Collections</subfield>
>   </datafield>
>   ...
> </record>
>
> I want to strip out all instances of:
> <datafield ...>
>   <subfield ...>Faux College</subfield>
>   <subfield ...>Special Collections</subfield>
> </datafield>
> but I want to leave other <datafield>
> instances intact.  I only want to delete ones with both the Faux College
> and Special Collections text in the subfields.
>
> Where would I go from here? I thought of doing an xsl:template match in an
> XSL stylesheet, and then not providing any instructions for replacing the
> match, but I don't know how to select for that specific text. My attempts
> to figure that out have not worked. You can only read so much W3C
> documentation and Stack Overflow before you need to just sit quietly and
> stare at a wall for a while.
>
> Thanks in advance --
>
> Julie
>


[CODE4LIB] Fwd: Third party use of openurl resolver?

2016-11-30 Thread Stuart A. Yeates
I'm currently trying to systematise our use of EZproxy for
non-purchased/licensed resources.

There are two classes of non-purchased/licensed resources that I'm aware of.

The first is the class of PURL-like services: http://doi.org/,
http://purl.org/, http://handle.net/ and so forth, which are used to link to
purchased/licensed resources. Working out when we need to add these is
pretty straight-forward: navigating from our search services to a resource
fails with an authentication message and someone notices, complains and we
fix things.

The second class are services such as https://scholar.google.co.nz/ which,
while we don't have a formal relationship with them, leverage our openurl
resolver to enable users to access more content than the normally would.

My question is whether any one knows of any services in the second class
other than google scholar? The tricky thing here is that they're likely to
work as apparently expected but not offer as much access as they might.

Thoughts?

Microsoft Academic Search appears not to do openurl, see
https://social.microsoft.com/Forums/windows/en-US/2d1b5774-b5f0-42d8-a020-d8b89d5dadbc/microsoft-academic-search-and-openurl-compliance
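
For anyone unfamiliar with what these services actually send, a sketch of
building an OpenURL 1.0 (KEV) link in Python; the resolver base URL and all
the metadata values are placeholders:

from urllib.parse import urlencode

BASE = "https://resolver.example.ac.nz/openurl"
params = {
    "ctx_ver": "Z39.88-2004",
    "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
    "rft.genre": "article",
    "rft.atitle": "Example article title",
    "rft.jtitle": "Example Journal",
    "rft_id": "info:doi/10.1000/example",
}
print(BASE + "?" + urlencode(params))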

cheers
stuart

--
...let us be heard from red core to black sky


Re: [CODE4LIB] PGP

2016-10-28 Thread Stuart A. Yeates
PGP has a dreadful reputation for usability, be prepared for a significant
support burden if you take that route.

You could always try omitting details from the email but providing a link:
"You have 4 books due tomorrow, click here and login to see the details"
kind of thing. That in conjunction with a local techie checking your email
sending settings.
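
If you do go down the PGP road, a minimal python-gnupg sketch (pip install
python-gnupg), assuming the patron supplied a public key at signup; the
keyring location, address, and notice text are placeholders:

import gnupg

gpg = gnupg.GPG(gnupghome="/var/lib/library-gpg")
notice = "You have 4 items due tomorrow."
enc = gpg.encrypt(notice, "patron@example.org", always_trust=True)
if enc.ok:
    print(str(enc))  # ASCII-armoured ciphertext, ready to mail out
else:
    print("encryption failed:", enc.status)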

cheers
stuart

--
...let us be heard from red core to black sky

On Sat, Oct 29, 2016 at 12:15 PM, Jim Hart  wrote:

> Depending on the client, the default security may be something other than
> PGP. Thunderbird comes to mind. I think it uses SSL. Gmail uses TLS. Yahoo!
> uses DKIM. Not that PGP can't be added as a plug-in or extension, sometimes
> (e.g. Thunderbird), but that may be beyond the capability (and willingness)
> of many people.
>
> I'd love to encrypt some of my email, but haven't been able to get
> agreement from even my most savvy acquaintances.
>
> Let us know how it goes if you decide to tackle it.
>
>
> James A. (Jim) Hart
> Board of Trustees
> Albert Church Brown Memorial Library
> China Village, Maine, USA
>
>
>
> On 10/28/2016 06:10 PM, Bigwood, David wrote:
>
>> I've been thinking about privacy lately. It seems to me much more email
>> should be encrypted. Many communications from the library might be personal
>> and potentially damaging. Email from the library showing overdues, or holds
>> might be sensitive. Would it be possible for our email systems to ask for a
>> public PGP key along with email and then use that whenever sending out
>> notices? Should my hospital, insurance company, bank, and so on be doing
>> the same? Just asking, maybe we could take the lead on privacy in this area.
>>
>> David Bigwood
>> dbigw...@hou.usra.edu
>> Public PGP Key: http://pgp.mit.edu/pks/lookup?op=vindex&search=0x52B602E601695F10
>> Lunar and Planetary institute
>>
>>