Re: [CODE4LIB] Code 4 Lib attendees, Help please

2013-02-20 Thread Peter Murray
Sorry to hear about the difficulties, Ian.  The archive of #c4l13 tweets is 
here:

  
https://docs.google.com/spreadsheet/ccc?key=0AsyivMoYhk87dFljMUZURWZMYzNzT2lwcEduUUJ6d1E#gid=82

I think there was also an archive made of the IRC channel, but there tends to 
be a lot of noise there.


Peter

On Feb 19, 2013, at 5:04 PM, Devon dec...@gmail.com wrote:
 Ian,
 
 There's some video here if you want to rewatch it.
 http://new.livestream.com/accounts/2768983/events/1865025?device_panel=true
 Also some of the entries of the schedule have video embedded in them.
 http://code4lib.org/conference/2013/schedule
 
 Cynthia Ng did some pretty good blogging of the event.
 http://cynng.wordpress.com/tag/c4l13/
 
 And also twitter, for whatever it's worth.
 https://twitter.com/search?q=%23c4l13
 
 /dev
 
 
 
 On Tue, Feb 19, 2013 at 4:54 PM, Barba, Ian ian.ba...@ttu.edu wrote:
 
 I attended last week's Code 4 Lib conference.  Unfortunately, while I was
 having a late lunch on Thursday in Chinatown, my friend's car was
 vandalized and my laptop stolen.  I had all of my conference notes on that
 laptop.
 
 Would anyone be willing to share their conference notes with me?  I would
 be particularly interested in notes from an academic librarian, but I'll
 take whatever I can get my hands on.
 
 I really enjoyed the conference, but I'm reduced to trying to piece things
 together from memory, and that's spotty at best.
 
 Ian Barba
 Research & Development Librarian
 Texas Tech University Libraries



-- 
Peter Murray
Assistant Director, Technology Services Development
LYRASIS
peter.mur...@lyrasis.org
+1 678-235-2955
 
1438 West Peachtree Street NW
Suite 200
Atlanta, GA 30309
Toll Free: 800.999.8558
Fax: 404.892.7879 
www.lyrasis.org
 
LYRASIS: Great Libraries. Strong Communities. Innovative Answers.


[CODE4LIB] SuDoc normalization for sorting

2013-02-20 Thread Tod Olson
C4L,

Does anyone have some code they'd be willing to share that normalizes SuDoc 
numbers for sorting?

Best,

-Tod


Tod Olson t...@uchicago.edu
Systems Librarian
University of Chicago Library


[CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Shaun Ellis

 (As a general rule, for every programmer who prefers tool A, and says
 that everybody should use it, there’s a programmer who disparages tool
 A, and advocates tool B. So take what we say with a grain of salt!)

It doesn't matter what tools you use, as long as you and your team are 
able to participate easily, if you want to.  But if you want to attract 
contributions from a given development community, then choices should 
be balanced between the preferences of that community and what best 
serves the project.


From what I've been hearing, I think there is a lot of confusion about 
GitHub.  Heck, I am constantly learning about new GitHub features, APIs, 
and best practices myself. But I find it to be an incredibly powerful 
platform for moving open source, distributed software development 
forward.  I am not telling anyone to use GitHub if they don't want to, 
but I want to dispel a few myths I've heard recently:




* Myth #1 : GitHub creates a barrier to entry.
* To contribute to a project on GitHub, you need to use the 
command-line. It's not for non-coders.


GitHub != git.  While GitHub was initially built for publishing and 
sharing code via integration with git, all GitHub functionality can be 
performed directly through the web gui.  In fact, GitHub can even be 
used as your sole coding environment.  There are other tools in the 
eco-system that allow non-coders to contribute documentation, issue 
reporting, and more to a project.




* Myth #2 : GitHub is for sharing/publishing code.
* It would be fun to have a wiki for more durable poetry (github 
unfortunately would be a barrier to many).


GitHub can be used to collaborate on and publish other types of content 
as well.  For example, GitHub has a great wiki component* (as well as a 
website component).  In a number of ways, it has less of a barrier to 
entry than our Code4Lib wiki.


While the path of least resistance requires a repository to have a 
wiki, public repos cost nothing and can consist of a simple README 
file.  The wiki can be locked down to a team, or it can be writable by 
anyone with a github account.  You don't need to do anything via 
command-line, don't need to understand git-flow, and you don't even 
need to learn wiki markup to write content.  All you need is an account 
and something to say, just like any wiki. Log in, go to the 
anti-harassment policy wiki, and see for yourself:

https://github.com/code4lib/antiharassment-policy/wiki

* The github wiki even has an API (via Gollum) that you can use to 
retrieve raw or formatted wiki content, write new content, and collect 
various metadata about the wiki as a whole:

https://github.com/code4lib/antiharassment-policy/wiki/_access



* Myth #3 : GitHub is person-centric.
 (And as a further aside, there’s plenty to dislike about github as
 well, from its person-centric view of projects (rather than
 team-centric)...

Untrue. GitHub is very team centered when using organizational accounts, 
which formalize authorization controls for projects, among other things: 
https://github.com/blog/674-introducing-organizations




* Myth #4 : GitHub is monopolizing open source software development.
 ... to its unfortunate centralizing of so much free/open
 source software on one platform.)

Convergence is not always a bad thing. GitHub provides a great, free 
service with lots of helpful collaboration tools beyond version control. 
 It's natural that people would flock there, despite having lots of 
other options.




-Shaun







On 2/19/13 5:35 PM, Erik Hetzner wrote:

At Sat, 16 Feb 2013 06:42:04 -0800,
Karen Coyle wrote:


gitHub may have excellent startup documentation, but that startup
documentation describes git in programming terms mainly using *nx
commands. If you have never had to use a version control system (e.g. if
you do not write code, especially in a shared environment), clone,
push, and pull are very poorly described. The documentation is all in
terms of *nx commands. Honestly, anything where this is in the
documentation:

On Windows systems, Git looks for the |.gitconfig| file in the |$HOME|
directory (|%USERPROFILE%| in Windows’ environment), which is
|C:\Documents and Settings\$USER| or |C:\Users\$USER| for most people,
depending on version (|$USER| is |%USERNAME%| in Windows’ environment).

is not going to work for anyone who doesn't work in Windows at the
command line.

No, git is NOT for non-coders.


For what it’s worth, this programmer finds git’s interface pretty
terrible. I prefer mercurial (hg), but I don’t know if it’s any better
for people who aren’t familiar with a command line.

   http://mercurial.selenic.com/guide/

(As a general rule, for every programmer who prefers tool A, and says
that everybody should use it, there’s a programmer who disparages tool
A, and advocates tool B. So take what we say with a grain of salt!)

(And as a further aside, there’s plenty to dislike about github as
well, from 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Karen Coyle
Shaun, you cannot decide whether github is a barrier to entry FOR ME (or 
anyone else), any more than you can decide whether or not my foot hurts. 
I'm telling you github is NOT what I want to use. Period.


I'm actually thinking that a blog format would be nice. It could be 
pretty (poetry and beauty go together). Poems tend to be short, so 
they'd make a nice blog post. They could appear in the Planet blog roll. 
They could be coded by author and topic. There could be comments! Even 
poems as comments! The only down-side is managing users. Anyone have 
ideas on that?


kc



Re: [CODE4LIB] SuDoc normalization for sorting

2013-02-20 Thread Schneider, Wayne
Hi, Tod. No idea how well it works, but there is a perl Text::SuDocs
module on CPAN:

http://search.cpan.org/~cfouts/Text-SuDocs-0.014/lib/Text/SuDocs.pm

Might be something you could reverse-engineer for another platform.
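
If a full port is more than you need, a rough starting point might be a sort key that zero-pads the numeric runs so plain string comparison orders them numerically. This is only a sketch in Python: it is not the Text::SuDocs algorithm, it ignores the finer SuDoc filing conventions, and the name `sudoc_sort_key` and the pad width of 6 are assumptions made for illustration.

```python
import re

# Split a SuDoc number into runs of letters, runs of digits, and the
# punctuation marks that separate stem parts (. : / -). Spaces are dropped.
_TOKEN = re.compile(r"[A-Za-z]+|\d+|[.:/\-]")


def sudoc_sort_key(sudoc, width=6):
    """Return a string key in which every numeric run is zero-padded.

    Hypothetical helper: numbers are padded to `width` digits and letters
    are upper-cased, so that "A 1.2:" sorts before "A 1.10:".
    """
    parts = []
    for token in _TOKEN.findall(sudoc):
        if token.isdigit():
            parts.append(token.zfill(width))
        elif token.isalpha():
            parts.append(token.upper())
        else:
            parts.append(token)
    return " ".join(parts)
```

Something like `sorted(call_numbers, key=sudoc_sort_key)` would then sort a list of SuDoc strings; whether the resulting order matches your shelf order for years, cutters, and slashes is something you would need to check against real data.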

wayne

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Tod Olson
Sent: Wednesday, February 20, 2013 10:09 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] SuDoc normalization for sorting

C4L,

Does anyone have some code they'd be willing to share that normalizes
SuDoc numbers for sorting?

Best,

-Tod


Tod Olson t...@uchicago.edu
Systems Librarian
University of Chicago Library


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Ethan Gruber
Wordpress?


On Wed, Feb 20, 2013 at 11:42 AM, Karen Coyle li...@kcoyle.net wrote:

 Shaun, you cannot decide whether github is a barrier to entry FOR ME (or
 anyone else), any more than you can decide whether or not my foot hurts.
 I'm telling you github is NOT what I want to use. Period.

 I'm actually thinking that a blog format would be nice. It could be pretty
 (poetry and beauty go together). Poems tend to be short, so they'd make a
 nice blog post. They could appear in the Planet blog roll. They could be
 coded by author and topic. There could be comments! Even poems as comments!
 The only down-side is managing users. Anyone have ideas on that?

 kc




Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Ross Singer
On Feb 20, 2013, at 11:42 AM, Karen Coyle li...@kcoyle.net wrote:

 Shaun, you cannot decide whether github is a barrier to entry FOR ME (or 
 anyone else), any more than you can decide whether or not my foot hurts. I'm 
 telling you github is NOT what I want to use. Period.
 
 I'm actually thinking that a blog format would be nice. It could be pretty 
 (poetry and beauty go together). Poems tend to be short, so they'd make a 
 nice blog post. They could appear in the Planet blog roll. They could be 
 coded by author and topic. There could be comments! Even poems as comments! 
 The only down-side is managing users. Anyone have ideas on that?

Of course, these aren't mutually exclusive:

http://octopress.org/

-Ross.

 
 kc
 
 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Karen Coyle
Sure. Although the question was more: how can we make it easy to have a 
bunch of accounts? Or should we have a c4l account that we share (and 
monitor for spam)? I think anything wysiwyg-y and familiar (wordpress 
certainly meets those criteria) would be fine. There does seem to be a 
lot of familiarity with Wordpress in the group.


kc


On 2/20/13 8:45 AM, Ethan Gruber wrote:

Wordpress?



[CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Nathan Tallman
My institution is looking for ways to provide search across PDFs through
our website. Specifically, PDFs linked from finding aids. Ideally searching
within a collection's PDFs or possibly across all PDFs linked from all
finding aids.

We do not have a CMS or a digital repository. A digital repository is on
the horizon, but it's a ways out and we need to offer the search sooner.
I've looked into Swish-e but haven't had much luck getting anything off the
ground.

One way we know we can do this is through our discovery layer, VuFind, using
its ability to full-text index a website based on a sitemap (which would
include PDFs linked from finding aids). Facets could be created for
collections, and we may be able to create a search box on the finding aid
nav that searches specifically that collection.

But I'm not sure how scalable that solution is. The indexing agent cannot
discern when a page was updated, so it has to re-scrape everything, every
night. The impetus collection is going to have over 1,000 PDFs, and that's
just to start. Creating the index will start to take a long, long time.
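
One possible workaround for the nightly re-scrape, sketched here in Python, is to keep a small manifest of content hashes and hand only new or changed PDFs to the indexer. This assumes the PDFs sit on a filesystem the script can read; the names `find_changed` and `file_digest` and the JSON manifest file are hypothetical, and nothing here is a VuFind feature.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file, read in 64 KB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def find_changed(pdf_dir, manifest_path):
    """List PDFs under pdf_dir that are new or changed since the last run.

    The manifest (a JSON file mapping path -> digest) is rewritten on
    every call, so the next run compares against the current state.
    """
    manifest_file = Path(manifest_path)
    old = json.loads(manifest_file.read_text()) if manifest_file.exists() else {}
    new = {}
    changed = []
    for pdf in sorted(Path(pdf_dir).rglob("*.pdf")):
        digest = file_digest(pdf)
        new[str(pdf)] = digest
        if old.get(str(pdf)) != digest:
            changed.append(str(pdf))
    manifest_file.write_text(json.dumps(new, indent=2))
    return changed
```

Hashing still reads every file each night, but that is usually far cheaper than extracting and re-indexing the text; comparing modification times instead would be faster still, at the cost of trusting the timestamps.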

Does anyone have any ideas or know of any useful tools for this project?
It doesn't have to be perfect; quick and dirty may work. (The OCR's dirty
anyway :-)

Thanks,
Nathan


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Johnston, Leslie
It's technically breaking GitHub's terms of service to have multiple 
individuals sharing a single account.

Leslie

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Karen Coyle
 Sent: Wednesday, February 20, 2013 12:07 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)
 
 Sure. Although the question was more: how can we make it easy to have a
 bunch of accounts? Or should we have a c4l account that we share (and
 monitor for spam)? I think anything wysiwyg-y and familiar (wordpress
 certainly meets those criteria) would be fine. There does seem to be a
 lot of familiarity with Wordpress in the group.
 
 kc
 
 
 On 2/20/13 8:45 AM, Ethan Gruber wrote:
  Wordpress?
 
 
  On Wed, Feb 20, 2013 at 11:42 AM, Karen Coyle li...@kcoyle.net
 wrote:
 
  Shaun, you cannot decide whether github is a barrier to entry FOR ME
  (or anyone else), any more than you can decide whether or not my
 foot hurts.
  I'm telling you github is NOT what I want to use. Period.
 
  I'm actually thinking that a blog format would be nice. It could be
  pretty (poetry and beauty go together). Poems tend to be short, so
  they'd make a nice blog post. They could appear in the Planet blog
  roll. They could be coded by author and topic. There could be
 comments! Even poems as comments!
  The only down-side is managing users. Anyone have ideas on that?
 
  kc
 
 
 
  On 2/20/13 8:20 AM, Shaun Ellis wrote:
 
  (As a general rule, for every programmer who prefers tool A, and
  says that everybody should use it, there’s a programmer who
  disparages tool A, and advocates tool B. So take what we say with
 a
  grain of salt!)
  It doesn't matter what tools you use, as long as you and your team
  are able to participate easily, if you want to.  But if you want to
 attract
contributions from a given development community, then choices
  should be balanced between the preferences of that community and
  what best serve the project.
 
   From what I've been hearing, I think there is a lot of confusion
  about GitHub.  Heck, I am constantly learning about new GitHub
  features, APIs, and best practices myself. But I find it to be an
  incredibly powerful platform for moving open source, distributed
 software development forward.
I am not telling anyone to use GitHub if they don't want to, but
 I
  want to dispel a few myths I've heard recently:
 
  
 
  * Myth #1 : GitHub creates a barrier to entry.
  * To contribute to a project on GitHub, you need to use the
  command-line. It's not for non-coders.
 
  GitHub != git.  While GitHub was initially built for publishing and
  sharing code via integration with git, all GitHub functionality can
  be performed directly through the web gui.  In fact, GitHub can even
  be used as your sole coding environment. There are other tools in
  the eco-system that allow non-coders to contribute documentation,
  issue reporting, and more to a project.
 
  
 
  * Myth #2 : GitHub is for sharing/publishing code.
  * It would be fun to have a wiki for more durable poetry (github
  unfortunately would be a barrier to many).

  GitHub can be used to collaborate on and publish other types of
  content as well.  For example, GitHub has a great wiki component*
  (as well as a website component).  In a number of ways, it has less
  of a barrier to entry than our Code4Lib wiki.

  While the path of least resistance requires a repository to have a
  wiki, public repos cost nothing and can consist of a simple README
  file.  The wiki can be locked down to a team, or it can be writable
  by anyone with a github account.  You don't need to do anything via
  command-line, don't need to understand git-flow, and you don't
  even need to learn wiki markup to write content. All you need is an
  account and something to say, just like any wiki. Log in, go to the
  anti-harassment policy wiki, and see for yourself:
  https://github.com/code4lib/antiharassment-policy/wiki

  * The github wiki even has an API (via Gollum) that you can use to
  retrieve raw or formatted wiki content, write new content, and
  collect various meta data about the wiki as a whole:
  https://github.com/code4lib/antiharassment-policy/wiki/_access
 
  
 
  * Myth #3 : GitHub is person-centric.
  (And as a further aside, there’s plenty to dislike about github as
  well, from its person-centric view of projects (rather than
  team-centric)...

  Untrue. GitHub is very team centered when using organizational
  accounts, which formalize authorization controls for projects,
  among other things:
  https://github.com/blog/674-introducing-organizations
 
  
 
  * Myth #4 : GitHub is monopolizing open source software
 

Re: [CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Jason Griffey
This might not fit your need exactly, but a Google Custom Search (
http://www.google.com/cse/) should do the job. You can have the Custom
Search only index a given directory, or only PDFs, whichever is more useful.

Jason


On Wed, Feb 20, 2013 at 12:53 PM, Nathan Tallman ntall...@gmail.com wrote:

 My institution is looking for ways to provide search across PDFs through
 our website. Specifically, PDFs linked from finding aids. Ideally searching
 within a collection's PDFs or possibly across all PDFs linked from all
 finding aids.

 We do not have a CMS or a digital repository. A digital repository is on
 the horizon, but it's a ways out and we need to offer the search sooner.
 I've looked into Swish-e but haven't had much luck getting anything off the
 ground.

 One way we know we can do this is through our discovery layer VuFind, using
 its ability to full-text index a website based on a sitemap (which would
 include PDFs linked from finding aids). Facets could be created for
 collections, and we may be able to create a search box on the finding aid
 nav that searches specifically that collection.

 But, I'm not sure how scalable that solution is. The indexing agent cannot
 discern when a page was updated, so it has to re-scrape everything every
 night. The impetus collection is going to have over 1000 PDFs. And that's
 to start. Creating the index will start to take a long, long time.

 Does anyone have any ideas or know of any useful tools for this project?
 Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty
 anyway :-)

 Thanks,
 Nathan



Re: [CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Michele R Combs
What about just a Google site search?

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Nathan 
Tallman
Sent: Wednesday, February 20, 2013 12:54 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Providing Search Across PDFs

My institution is looking for ways to provide search across PDFs through our 
website. Specifically, PDFs linked from finding aids. Ideally searching within 
a collection's PDFs or possibly across all PDFs linked from all finding aids.

We do not have a CMS or a digital repository. A digital repository is on the 
horizon, but it's a ways out and we need to offer the search sooner.
I've looked into Swish-e but haven't had much luck getting anything off the 
ground.

One way we know we can do this is through our discovery layer VuFind, using its
ability to full-text index a website based on a sitemap (which would include
PDFs linked from finding aids). Facets could be created for collections, and
we may be able to create a search box on the finding aid nav that searches
specifically that collection.

But, I'm not sure how scalable that solution is. The indexing agent cannot
discern when a page was updated, so it has to re-scrape everything every
night. The impetus collection is going to have over 1000 PDFs. And that's to
start. Creating the index will start to take a long, long time.

Does anyone have any ideas or know of any useful tools for this project?
Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty anyway 
:-)

Thanks,
Nathan


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Karen Coyle

We're talking about wordpress, not github.

kc

On 2/20/13 9:56 AM, Johnston, Leslie wrote:

It's technically breaking GitHub's terms of service to have multiple 
individuals sharing a single account.

Leslie


-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
Karen Coyle
Sent: Wednesday, February 20, 2013 12:07 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

Sure. Although the question was more: how can we make it easy to have a
bunch of accounts? Or should we have a c4l account that we share (and
monitor for spam)? I think anything wysiwyg-y and familiar (wordpress
certainly meets those criteria) would be fine. There does seem to be a
lot of familiarity with Wordpress in the group.

kc


On 2/20/13 8:45 AM, Ethan Gruber wrote:

Wordpress?


On Wed, Feb 20, 2013 at 11:42 AM, Karen Coyle li...@kcoyle.net wrote:

Shaun, you cannot decide whether github is a barrier to entry FOR ME
(or anyone else), any more than you can decide whether or not my
foot hurts. I'm telling you github is NOT what I want to use. Period.

I'm actually thinking that a blog format would be nice. It could be
pretty (poetry and beauty go together). Poems tend to be short, so
they'd make a nice blog post. They could appear in the Planet blog
roll. They could be coded by author and topic. There could be
comments! Even poems as comments! The only down-side is managing
users. Anyone have ideas on that?

kc



On 2/20/13 8:20 AM, Shaun Ellis wrote:

(As a general rule, for every programmer who prefers tool A, and
says that everybody should use it, there’s a programmer who
disparages tool A, and advocates tool B. So take what we say with
a grain of salt!)

It doesn't matter what tools you use, as long as you and your team
are able to participate easily, if you want to.  But if you want to
attract contributions from a given development community, then
choices should be balanced between the preferences of that community
and what best serves the project.

From what I've been hearing, I think there is a lot of confusion
about GitHub.  Heck, I am constantly learning about new GitHub
features, APIs, and best practices myself. But I find it to be an
incredibly powerful platform for moving open source, distributed
software development forward.  I am not telling anyone to use GitHub
if they don't want to, but I want to dispel a few myths I've heard
recently:



* Myth #1 : GitHub creates a barrier to entry.
* To contribute to a project on GitHub, you need to use the
command-line. It's not for non-coders.

GitHub != git.  While GitHub was initially built for publishing and
sharing code via integration with git, all GitHub functionality can
be performed directly through the web gui.  In fact, GitHub can even
be used as your sole coding environment. There are other tools in
the eco-system that allow non-coders to contribute documentation,
issue reporting, and more to a project.



* Myth #2 : GitHub is for sharing/publishing code.
* It would be fun to have a wiki for more durable poetry (github
unfortunately would be a barrier to many).

GitHub can be used to collaborate on and publish other types of
content as well.  For example, GitHub has a great wiki component*
(as well as a website component).  In a number of ways, it has less
of a barrier to entry than our Code4Lib wiki.

While the path of least resistance requires a repository to have a
wiki, public repos cost nothing and can consist of a simple README
file.  The wiki can be locked down to a team, or it can be writable
by anyone with a github account.  You don't need to do anything via
command-line, don't need to understand git-flow, and you don't
even need to learn wiki markup to write content. All you need is an
account and something to say, just like any wiki. Log in, go to the
anti-harassment policy wiki, and see for yourself:
https://github.com/code4lib/antiharassment-policy/wiki

* The github wiki even has an API (via Gollum) that you can use to
retrieve raw or formatted wiki content, write new content, and
collect various meta data about the wiki as a whole:
https://github.com/code4lib/antiharassment-policy/wiki/_access



* Myth #3 : GitHub is person-centric.

(And as a further aside, there’s plenty to dislike about github as
well, from its person-centric view of projects (rather than
team-centric)...

Untrue. GitHub is very team centered when using organizational
accounts, which formalize authorization controls for projects,
among other things:
https://github.com/blog/674-introducing-organizations

* Myth #4 : GitHub is monopolizing open source software development.

... to its unfortunate centralizing of so much free/open 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Jason Stirnaman
Another option might be to set it up like the Planet, where individuals just
post their poetry to their own blogs, Tumblrs, etc., tag them, and have
$PLANET_NERD_POETS aggregate them.

Git and Github are great. But while I get the argument for utility, there does
seem to be a barrier to entry there for someone just wanting to submit a poem.

Jason

Jason Stirnaman
Digital Projects Librarian
A.R. Dykes Library
University of Kansas Medical Center
913-588-7319


From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen Coyle 
[li...@kcoyle.net]
Sent: Wednesday, February 20, 2013 10:42 AM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

Shaun, you cannot decide whether github is a barrier to entry FOR ME (or
anyone else), any more than you can decide whether or not my foot hurts.
I'm telling you github is NOT what I want to use. Period.

I'm actually thinking that a blog format would be nice. It could be
pretty (poetry and beauty go together). Poems tend to be short, so
they'd make a nice blog post. They could appear in the Planet blog roll.
They could be coded by author and topic. There could be comments! Even
poems as comments! The only down-side is managing users. Anyone have
ideas on that?

kc


On 2/20/13 8:20 AM, Shaun Ellis wrote:
  (As a general rule, for every programmer who prefers tool A, and says
  that everybody should use it, there’s a programmer who disparages tool
  A, and advocates tool B. So take what we say with a grain of salt!)

 It doesn't matter what tools you use, as long as you and your team are
 able to participate easily, if you want to.  But if you want to
 attract contributions from a given development community, then
 choices should be balanced between the preferences of that community
 and what best serves the project.

 From what I've been hearing, I think there is a lot of confusion about
 GitHub.  Heck, I am constantly learning about new GitHub features,
 APIs, and best practices myself. But I find it to be an incredibly
 powerful platform for moving open source, distributed software
 development forward.  I am not telling anyone to use GitHub if they
 don't want to, but I want to dispel a few myths I've heard recently:

 

 * Myth #1 : GitHub creates a barrier to entry.
 * To contribute to a project on GitHub, you need to use the
 command-line. It's not for non-coders.

 GitHub != git.  While GitHub was initially built for publishing and
 sharing code via integration with git, all GitHub functionality can be
 performed directly through the web gui.  In fact, GitHub can even be
 used as your sole coding environment. There are other tools in the
 eco-system that allow non-coders to contribute documentation, issue
 reporting, and more to a project.

 

 * Myth #2 : GitHub is for sharing/publishing code.
 * It would be fun to have a wiki for more durable poetry (github
 unfortunately would be a barrier to many).

 GitHub can be used to collaborate on and publish other types of
 content as well.  For example, GitHub has a great wiki component* (as
 well as a website component).  In a number of ways, it has less of a
 barrier to entry than our Code4Lib wiki.

 While the path of least resistance requires a repository to have a
 wiki, public repos cost nothing and can consist of a simple README
 file.  The wiki can be locked down to a team, or it can be writable by
 anyone with a github account.  You don't need to do anything via
 command-line, don't need to understand git-flow, and you don't even
 need to learn wiki markup to write content. All you need is an account
 and something to say, just like any wiki. Log in, go to the
 anti-harassment policy wiki, and see for yourself:
 https://github.com/code4lib/antiharassment-policy/wiki

 * The github wiki even has an API (via Gollum) that you can use to
 retrieve raw or formatted wiki content, write new content, and collect
 various meta data about the wiki as a whole:
 https://github.com/code4lib/antiharassment-policy/wiki/_access

 

 * Myth #3 : GitHub is person-centric.
  (And as a further aside, there’s plenty to dislike about github as
  well, from its person-centric view of projects (rather than
  team-centric)...

 Untrue. GitHub is very team centered when using organizational
 accounts, which formalize authorization controls for projects, among
 other things: https://github.com/blog/674-introducing-organizations

 

 * Myth #4 : GitHub is monopolizing open source software development.
  ... to its unfortunate centralizing of so much free/open
  source software on one platform.)

 Convergence is not always a bad thing. GitHub provides a great, free
 service with lots of helpful collaboration tools beyond version
 control.  It's natural that people would flock there, despite having
 lots of other options.

 

 -Shaun







 On 2/19/13 5:35 PM, Erik Hetzner wrote:
 At Sat, 16 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Erik Hetzner
At Wed, 20 Feb 2013 11:20:33 -0500,
Shaun Ellis wrote:
 
   (As a general rule, for every programmer who prefers tool A, and says
   that everybody should use it, there’s a programmer who disparages tool
   A, and advocates tool B. So take what we say with a grain of salt!)
 
 It doesn't matter what tools you use, as long as you and your team are 
 able to participate easily, if you want to.  But if you want to attract 
 contributions from a given development community, then choices should
 be balanced between the preferences of that community and what best
 serves the project.

It does matter what tools you use, which is why people are so
passionate about them. But I agree completely that you need to balance
the preferences of the community.

 From what I've been hearing, I think there is a lot of confusion
 about GitHub. Heck, I am constantly learning about new GitHub
 features, APIs, and best practices myself. But I find it to be an
 incredibly powerful platform for moving open source, distributed
 software development forward. I am not telling anyone to use GitHub
 if they don't want to, but I want to dispel a few myths I've heard
 recently:

It’s not confusion; and these aren’t “myths”: they are disagreements.

best, Erik
Sent from my free software system http://fsf.org/.




[CODE4LIB] PHP YAZ

2013-02-20 Thread Brent Ferguson
Is there anyone that has experience working with PHP and YAZ on a Windows Box...

Have a few questions to help clarify what is needed to get up and running...

Brent Ferguson, MLS
Web Developer / Reference Librarian - Elkhart Public Library
http://www.myepl.org/epl


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Tom Johnson
 But while I get the argument for utility, there does seem to be
a barrier to entry there for someone just wanting to submit a poem.

The original suggestion wasn't about utility, but about modes of writing.
Git repositories would make for poems which are easily shared, copied,
forked, and merged back together. I'm interested in the relationship this
has to the idea of an oral tradition. Especially given that a git
poetry tradition would record its own history in the medium.

I agree that wordpress is much more accessible. It seems obvious to me that
we could post poems where we see fit and aggregate them. Written and oral
is even more accessible than that. It seems obvious to me that we could
write down and/or recite poems, pass them around, and commit them to
memory. I think we should do all these things--and maybe play around with
git, too.

For me, the important takeaway from this discussion is that git art
shouldn't be the dominant form of expression or the raison d'etre for the
'nerd poetry' idea.

As an aside: I share the concerns about GitHub. I resisted joining for
years because of exactly this issue. If Facebook is a man-in-the-middle
exploit on social interaction, then surely GitHub is the same on Free
Software development. I thought the FOSS community would be better served
if we all put up our git repositories in our own ways, and tried to build
tools for collaboration. As it turns out, GitHub has done wonders for code
sharing and collaborative development and the company has been good to us,
which is why I'm there now. I still worry about ways that our platform
dependence could go badly. Luckily, the risk is mitigated by git's
distributed and portable nature.

- Tom


On Wed, Feb 20, 2013 at 10:20 AM, Jason Stirnaman jstirna...@kumc.edu wrote:

 Another option might be to set it up like the Planet, where individuals
 just post their poetry to their own blogs, Tumblrs, etc., tag them, and
 have $PLANET_NERD_POETS aggregate them.

 Git and Github are great. But while I get the argument for utility, there
 does seem to be a barrier to entry there for someone just wanting to submit
 a poem.

 Jason

 Jason Stirnaman
 Digital Projects Librarian
 A.R. Dykes Library
 University of Kansas Medical Center
 913-588-7319

 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen
 Coyle [li...@kcoyle.net]
 Sent: Wednesday, February 20, 2013 10:42 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

 Shaun, you cannot decide whether github is a barrier to entry FOR ME (or
 anyone else), any more than you can decide whether or not my foot hurts.
 I'm telling you github is NOT what I want to use. Period.

 I'm actually thinking that a blog format would be nice. It could be
 pretty (poetry and beauty go together). Poems tend to be short, so
 they'd make a nice blog post. They could appear in the Planet blog roll.
 They could be coded by author and topic. There could be comments! Even
 poems as comments! The only down-side is managing users. Anyone have
 ideas on that?

 kc


 On 2/20/13 8:20 AM, Shaun Ellis wrote:
   (As a general rule, for every programmer who prefers tool A, and says
   that everybody should use it, there’s a programmer who disparages tool
   A, and advocates tool B. So take what we say with a grain of salt!)
 
  It doesn't matter what tools you use, as long as you and your team are
  able to participate easily, if you want to.  But if you want to
  attract contributions from a given development community, then
  choices should be balanced between the preferences of that community
  and what best serves the project.
 
  From what I've been hearing, I think there is a lot of confusion about
  GitHub.  Heck, I am constantly learning about new GitHub features,
  APIs, and best practices myself. But I find it to be an incredibly
  powerful platform for moving open source, distributed software
  development forward.  I am not telling anyone to use GitHub if they
  don't want to, but I want to dispel a few myths I've heard recently:
 
  
 
  * Myth #1 : GitHub creates a barrier to entry.
  * To contribute to a project on GitHub, you need to use the
  command-line. It's not for non-coders.
 
  GitHub != git.  While GitHub was initially built for publishing and
  sharing code via integration with git, all GitHub functionality can be
  performed directly through the web gui.  In fact, GitHub can even be
  used as your sole coding environment. There are other tools in the
  eco-system that allow non-coders to contribute documentation, issue
  reporting, and more to a project.
 
  
 
  * Myth #2 : GitHub is for sharing/publishing code.
  * It would be fun to have a wiki for more durable poetry (github
  unfortunately would be a barrier to many).
 
  GitHub can be used to collaborate on and publish other types of
  content as well.  For example, GitHub has a great wiki component* (as
  

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Benjamin Armintor
You are definitely insulated from loss of material by the distributed
character of git, but it would be difficult to replace the social network
around the projects. You really see this when you work with a non-Github
git repository: Getting a copy of it is trivial, but you have no mechanism
for alerting the original repository (much less its network) of potentially
valuable changes. Of course, there's the old-fashioned splash-pages and
contact emails, but the relative triviality of advertising changes to a
Github repository (and accepting them, for that matter) is pretty
groundbreaking.

- Ben


On Wed, Feb 20, 2013 at 2:04 PM, Tom Johnson johnson.tom+code4...@gmail.com
 wrote:

  But while I get the argument for utility, there does seem to be
 a barrier to entry there for someone just wanting to submit a poem.

 The original suggestion wasn't about utility, but about modes of writing.
 Git repositories would make for poems which are easily shared, copied,
 forked, and merged back together. I'm interested in the relationship this
 has to the idea of an oral tradition. Especially given that a git
 poetry tradition would record its own history in the medium.

 I agree that wordpress is much more accessible. It seems obvious to me that
 we could post poems where we see fit and aggregate them. Written and oral
 is even more accessible than that. It seems obvious to me that we could
 write down and/or recite poems, pass them around, and commit them to
 memory. I think we should do all these things--and maybe play around with
 git, too.

 For me, the important take away from this discussion is that git art
 shouldn't be the dominant form of expression or the raison d'etre for the
 'nerd poetry' idea.

 As an aside: I share the concerns about GitHub. I resisted joining for
 years because of exactly this issue. If Facebook is a man-in-the-middle
 exploit on social interaction, then surely GitHub is the same on Free
 Software development. I thought the FOSS community would be better served
 if we all put up our git repositories in our own ways, and tried to build
 tools for collaboration. As it turns out, GitHub has done wonders for code
 sharing and collaborative development and the company has been good to us,
 which is why I'm there now. I still worry about ways that our platform
 dependence could go badly. Luckily, the risk is mitigated by git's
 distributed and portable nature.

 - Tom


 On Wed, Feb 20, 2013 at 10:20 AM, Jason Stirnaman jstirna...@kumc.edu
 wrote:

  Another option might be to set it up like the Planet, where individuals
  just post their poetry to their own blogs, Tumblrs, etc., tag them, and
  have $PLANET_NERD_POETS aggregate them.

  Git and Github are great. But while I get the argument for utility, there
  does seem to be a barrier to entry there for someone just wanting to
  submit a poem.
 
  Jason
 
  Jason Stirnaman
  Digital Projects Librarian
  A.R. Dykes Library
  University of Kansas Medical Center
  913-588-7319
 
  
  From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen
  Coyle [li...@kcoyle.net]
  Sent: Wednesday, February 20, 2013 10:42 AM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)
 
  Shaun, you cannot decide whether github is a barrier to entry FOR ME (or
  anyone else), any more than you can decide whether or not my foot hurts.
  I'm telling you github is NOT what I want to use. Period.
 
  I'm actually thinking that a blog format would be nice. It could be
  pretty (poetry and beauty go together). Poems tend to be short, so
  they'd make a nice blog post. They could appear in the Planet blog roll.
  They could be coded by author and topic. There could be comments! Even
  poems as comments! The only down-side is managing users. Anyone have
  ideas on that?
 
  kc
 
 
  On 2/20/13 8:20 AM, Shaun Ellis wrote:
    (As a general rule, for every programmer who prefers tool A, and says
    that everybody should use it, there’s a programmer who disparages
    tool A, and advocates tool B. So take what we say with a grain of salt!)
  
   It doesn't matter what tools you use, as long as you and your team are
   able to participate easily, if you want to.  But if you want to
   attract contributions from a given development community, then
   choices should be balanced between the preferences of that community
   and what best serves the project.
  
   From what I've been hearing, I think there is a lot of confusion about
   GitHub.  Heck, I am constantly learning about new GitHub features,
   APIs, and best practices myself. But I find it to be an incredibly
   powerful platform for moving open source, distributed software
   development forward.  I am not telling anyone to use GitHub if they
   don't want to, but I want to dispel a few myths I've heard recently:
  
   
  
   * Myth #1 : GitHub creates a barrier to entry.
   * To contribute to a project on GitHub, 

[CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread Ethan Gruber
Hi all,

I have been playing around with Fuseki (
http://jena.apache.org/documentation/serving_data/index.html) for a few
months to get my feet wet with accessing and querying RDF.  I quite like
it. I find it well documented and easy to set up.  We will soon deploy a
SPARQL server in a production environment, and I would like to know if
others on the list have experience with Fuseki in production, or have other
recommendations.  Mulgara is off the table as it inexplicably conflicts
with other apps installed in Tomcat.

Thanks,
Ethan
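
For anyone evaluating a SPARQL server from a script, a minimal client sketch may help. It assumes the endpoint follows the standard SPARQL protocol (a GET request with a `query` parameter) and returns SPARQL JSON results. The endpoint URL in the usage note is a placeholder for a local test install, and the function names `build_query_url` and `run_select` are hypothetical, not part of Fuseki or Jena.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(endpoint, sparql):
    """Encode a SPARQL query as a GET request URL (SPARQL protocol)."""
    return endpoint + "?" + urllib.parse.urlencode({"query": sparql})


def run_select(endpoint, sparql, timeout=30):
    """Send a SELECT query and return the list of result bindings.

    Assumes the server answers with the SPARQL JSON results format.
    """
    request = urllib.request.Request(
        build_query_url(endpoint, sparql),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)["results"]["bindings"]
```

With a server running, something like `run_select("http://localhost:3030/ds/query", "SELECT * WHERE { ?s ?p ?o } LIMIT 5")` would return the first five triples as a list of dicts; the dataset name `ds` is an assumption and depends on how the server was started. Because only the standard protocol is used, the same client should work unchanged against other SPARQL servers, which makes side-by-side comparisons easier.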


Re: [CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Nathan Tallman
@Jason and @Michele: I'd rather stay away from a Google solution. The
reason being that they don't index everything. Our sitemap is submitted
nightly and out of about 6000 URLs only 1500 are indexed. I can't make sure
Google indexes the PDFs or be sure that they always will. (If I'm
misunderstanding this, please let me know.)

@Péter: The VuFind solution I mentioned is very similar to what you use
here. It uses Aperture (although soon to use Tika instead) to grab the
full-text and shoves everything inside a solr index. The import is managed
through a PHP script that crawls every URL on the sitemap. The only part I
don't have is removing deleted, adding new, and updating changed
webpages/files. I'm not sure how to rework the script to use a list of new
files rather than the sitemap, but everything is on the same server so that
should work.
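
The missing piece described above (finding deleted, new, and changed files without re-crawling everything) could be sketched roughly as follows. This is only an illustration under stated assumptions: the files sit on the same server as the indexer, file modification times are a good enough change signal, and a small JSON state file is kept between runs. All function names are hypothetical, and nothing here is VuFind's own API.

```python
import json
import os


def scan(root, suffixes=(".pdf", ".html", ".xml")):
    """Map every matching file under root to its last-modified time."""
    found = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.lower().endswith(suffixes):
                path = os.path.join(dirpath, name)
                found[path] = os.path.getmtime(path)
    return found


def load_state(state_file):
    """Return the mapping saved by the previous run, or {} on the first run."""
    try:
        with open(state_file, encoding="utf-8") as handle:
            return json.load(handle)
    except FileNotFoundError:
        return {}


def save_state(state_file, current):
    """Remember this run's file list for the next comparison."""
    with open(state_file, "w", encoding="utf-8") as handle:
        json.dump(current, handle)


def diff_runs(previous, current):
    """Split paths into new, changed, and deleted since the previous run."""
    new = sorted(p for p in current if p not in previous)
    changed = sorted(
        p for p in current if p in previous and current[p] != previous[p]
    )
    deleted = sorted(p for p in previous if p not in current)
    return new, changed, deleted
```

A nightly job would call `scan` on the finding-aid directory, compare the result with `load_state` via `diff_runs`, hand only the new and changed paths to the existing import script, remove the deleted ones from the index, and then call `save_state`. On the first run every file counts as new, so the first index build still takes as long as it does today; later runs only touch what moved.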


On Wed, Feb 20, 2013 at 12:53 PM, Nathan Tallman ntall...@gmail.com wrote:

 My institution is looking for ways to provide search across PDFs through
 our website. Specifically, PDFs linked from finding aids. Ideally searching
 within a collection's PDFs or possibly across all PDFs linked from all
 finding aids.

 We do not have a CMS or a digital repository. A digital repository is on
 the horizon, but it's a ways out and we need to offer the search sooner.
 I've looked into Swish-e but haven't had much luck getting anything off the
 ground.

 One way we know we can do this is through our discovery layer VuFind, using
 its ability to full-text index a website based on a sitemap (which would
 include PDFs linked from finding aids). Facets could be created for
 collections, and we may be able to create a search box on the finding aid
 nav that searches specifically that collection.

 But, I'm not sure how scalable that solution is. The indexing agent cannot
 discern when a page was updated, so it has to re-scrape everything every
 night. The impetus collection is going to have over 1000 PDFs. And that's
 to start. Creating the index will start to take a long, long time.

 Does anyone have any ideas or know of any useful tools for this project?
 Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty
 anyway :-)

 Thanks,
 Nathan






Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread Hugh Cayless
Hi Ethan!

We've been using Jena/Fuseki in papyri.info for about a year now, iirc. We 
started with Mulgara, but switched. It's running in its own Jetty container in 
our system, but I've had no performance issues with it whatever. 

Best,
Hugh

On Feb 20, 2013, at 14:31 , Ethan Gruber ewg4x...@gmail.com wrote:

 Hi all,
 
 I have been playing around with Fuseki (
 http://jena.apache.org/documentation/serving_data/index.html) for a few
 months to get my feet wet with accessing and querying RDF.  I quite like
 it. I find it well documented and easy to set up.  We will soon deploy a
 SPARQL server in a production environment, and I would like to know if
 others on the list have experience with Fuseki in production, or have other
 recommendations.  Mulgara is off the table as it inexplicably conflicts
 with other apps installed in Tomcat.
 
 Thanks,
 Ethan


Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread Hugh Cayless
Jetty's performance characteristics are really very good. I'd have no 
hesitation in using it.

Hugh

On Feb 20, 2013, at 14:52 , Ethan Gruber ewg4x...@gmail.com wrote:

 Hi Hugh,
 
 I have investigated the possibility of deploying Fuseki as a war in Tomcat (
 https://issues.apache.org/jira/browse/JENA-201) because I wasn't sure how
 the default Jetty container would respond in production, but since you
 aren't having any problems with that deployment, I may go ahead and do that.
 
 Ethan
 
 
 On Wed, Feb 20, 2013 at 2:39 PM, Hugh Cayless philomou...@gmail.com wrote:
 
 Hi Ethan!
 
 We've been using Jena/Fuseki in papyri.info for about a year now, iirc.
 We started with Mulgara, but switched. It's running in its own Jetty
 container in our system, but I've had no performance issues with it
 whatever.
 
 Best,
 Hugh
 
 On Feb 20, 2013, at 14:31 , Ethan Gruber ewg4x...@gmail.com wrote:
 
 Hi all,
 
 I have been playing around with Fuseki (
 http://jena.apache.org/documentation/serving_data/index.html) for a few
 months to get my feet wet with accessing and querying RDF.  I quite like
 it. I find it well documented and easy to set up.  We will soon deploy a
 SPARQL server in a production environment, and I would like to know if
 others on the list have experience with Fuseki in production, or have
 other
 recommendations.  Mulgara is off the table as it inexplicably conflicts
 with other apps installed in Tomcat.
 
 Thanks,
 Ethan
 


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Tom Johnson
 but it would be difficult to replace the social network around the
projects.

Especially difficult now that GitHub is where the community is. It's
technically possible to build a social web that works on a decentralized
basis, but it may no longer be culturally possible. Platforms are hard to
get down from.

On Wed, Feb 20, 2013 at 11:12 AM, Benjamin Armintor armin...@gmail.com wrote:

 You are definitely insulated from loss of material by the distributed
 character of git, but it would be difficult to replace the social network
 around the projects. You really see this when you work with a non-Github
 git repository: Getting a copy of it is trivial, but you have no mechanism
 for alerting the original repository (much less its network) of potentially
 valuable changes. Of course, there's the old-fashioned splash-pages and
 contact emails, but the relative triviality of advertising changes to a
 Github repository (and accepting them, for that matter) is pretty
 groundbreaking.

 - Ben


 On Wed, Feb 20, 2013 at 2:04 PM, Tom Johnson 
 johnson.tom+code4...@gmail.com
  wrote:

   But while I get the argument for utility, there does seem to be
  barrier-to-entry there for someone just wanting to submit a poem.
 
  The original suggestion wasn't about utility, but about modes of writing.
  Git repositories would make for poems which are easily shared, copied,
  forked, and merged back together. I'm interested in the relationship
 this
  has to the idea of an oral tradition. Especially given that a git
  poetry tradition would record its own history in the medium.
 
  I agree that wordpress is much more accessible. It seems obvious to me
 that
  we could post poems where we see fit and aggregate them. Written and oral
  is even more accessible than that. It seems obvious to me that we could
  write down and/or recite poems, pass them around, and commit them to
  memory. I think we should do all these things--and maybe play around with
  git, too.
 
  For me, the important take away from this discussion is that git art
  shouldn't be the dominant form of expression or the raison d'etre for the
  'nerd poetry' idea.
 
  As an aside: I share the concerns about GitHub. I resisted joining for
  years because of exactly this issue. If Facebook is a man-in-the-middle
  exploit on social interaction, then surely GitHub is the same on Free
  Software development. I thought the FOSS community would be better served
  if we all put up our git repositories in our own ways, and tried to build
  tools for collaboration. As it turns out, GitHub has done wonders for
 code
  sharing and collaborative development and the company has been good to
 us,
  which is why I'm there now. I still worry about ways our platform
  dependence could go badly. Luckily, the risk is mitigated by git's
  distributed and portable nature.
 
  - Tom
 
 
  On Wed, Feb 20, 2013 at 10:20 AM, Jason Stirnaman jstirna...@kumc.edu
  wrote:
 
   Another option might be to set it up like the Planet. Where individuals
   just post their poetry to their own blogs, Tumblrs, etc., tag them, and
   have $PLANET_NERD_POETS aggregate them.
  
   Git and Github are great. But while I get the argument for utility,
 there
   does seem to be barrier-to-entry there for someone just wanting to
  submit a
   poem.
  
   Jason
  
   Jason Stirnaman
   Digital Projects Librarian
   A.R. Dykes Library
   University of Kansas Medical Center
   913-588-7319
  
   
   From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen
   Coyle [li...@kcoyle.net]
   Sent: Wednesday, February 20, 2013 10:42 AM
   To: CODE4LIB@LISTSERV.ND.EDU
   Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)
  
   Shaun, you cannot decide whether github is a barrier to entry FOR ME
 (or
   anyone else), any more than you can decide whether or not my foot
 hurts.
   I'm telling you github is NOT what I want to use. Period.
  
   I'm actually thinking that a blog format would be nice. It could be
   pretty (poetry and beauty go together). Poems tend to be short, so
   they'd make a nice blog post. They could appear in the Planet blog
 roll.
   They could be coded by author and topic. There could be comments! Even
   poems as comments! The only down-side is managing users. Anyone have
   ideas on that?
  
   kc
  
  
   On 2/20/13 8:20 AM, Shaun Ellis wrote:
 (As a general rule, for every programmer who prefers tool A, and
 says
 that everybody should use it, there’s a programmer who disparages
  tool
 A, and advocates tool B. So take what we say with a grain of salt!)
   
It doesn't matter what tools you use, as long as you and your team
 are
able to participate easily, if you want to.  But if you want to
attract  contributions from a given development community, then
choices should be balanced between the preferences of that community
and what best serve the project.
   
From what I've been hearing, I 

Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread Ross Singer
On Feb 20, 2013, at 2:52 PM, Ethan Gruber ewg4x...@gmail.com wrote:

 Hi Hugh,
 
 I have investigated the possibility of deploying Fuseki as a war in Tomcat (
 https://issues.apache.org/jira/browse/JENA-201) because I wasn't sure how
 the default Jetty container would respond in production, but since you
 aren't having any problems with that deployment, I may go ahead and do that.

Fuseki/Jetty will have no problems scaling, it's what the Talis Platform used 
for large datasets.  I also ran a large dataset for quite a while with it. 

Which backend are you using?  TDB?  SDB?

-Ross.

 
 Ethan
 
 
 On Wed, Feb 20, 2013 at 2:39 PM, Hugh Cayless philomou...@gmail.com wrote:
 
 Hi Ethan!
 
 We've been using Jena/Fuseki in papyri.info for about a year now, iirc.
 We started with Mulgara, but switched. It's running in its own Jetty
 container in our system, but I've had no performance issues with it
 whatever.
 
 Best,
 Hugh
 
 On Feb 20, 2013, at 14:31 , Ethan Gruber ewg4x...@gmail.com wrote:
 
 Hi all,
 
 I have been playing around with Fuseki (
 http://jena.apache.org/documentation/serving_data/index.html) for a few
 months to get my feet wet with accessing and querying RDF.  I quite like
 it. I find it well documented and easy to set up.  We will soon deploy a
 SPARQL server in a production environment, and I would like to know if
 others on the list have experience with Fuseki in production, or have
 other
 recommendations.  Mulgara is off the table as it inexplicably conflicts
 with other apps installed in Tomcat.
 
 Thanks,
 Ethan
 


Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread Ethan Gruber
TDB, as per the startup instruction:

  fuseki-server --loc=DB /DatasetPathName
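
For anyone who wants to poke at a server started this way, here is a minimal sketch of building a SPARQL query URL for it. It assumes Fuseki's default port (3030) on localhost and the /DatasetPathName dataset from the command above; build_query_url is a made-up helper name, not part of Fuseki or Jena. It only builds the URL, so nothing here needs a running server.

```python
from urllib.parse import urlencode

# Assumptions: Fuseki's default port (3030) on localhost, and the dataset
# path taken from the startup command. Adjust both for a real deployment.
FUSEKI_BASE = "http://localhost:3030"
DATASET_PATH = "/DatasetPathName"


def build_query_url(sparql, base=FUSEKI_BASE, dataset=DATASET_PATH):
    """Return a SPARQL-protocol GET URL for the dataset's query service."""
    endpoint = base.rstrip("/") + "/" + dataset.strip("/") + "/query"
    # The SPARQL protocol passes the query text in the "query" parameter.
    return endpoint + "?" + urlencode({"query": sparql})


if __name__ == "__main__":
    print(build_query_url("SELECT * WHERE { ?s ?p ?o } LIMIT 10"))
```

The resulting URL can be pasted into a browser or passed to any HTTP client; requesting JSON or XML results is a matter of setting an Accept header, which this sketch leaves out.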

Ethan


On Wed, Feb 20, 2013 at 3:02 PM, Ross Singer rossfsin...@gmail.com wrote:

 On Feb 20, 2013, at 2:52 PM, Ethan Gruber ewg4x...@gmail.com wrote:

  Hi Hugh,
 
  I have investigated the possibility of deploying Fuseki as a war in
 Tomcat (
  https://issues.apache.org/jira/browse/JENA-201) because I wasn't sure
 how
  the default Jetty container would respond in production, but since you
  aren't having any problems with that deployment, I may go ahead and do
 that.

 Fuseki/Jetty will have no problems scaling, it's what the Talis Platform
 used for large datasets.  I also ran a large dataset for quite a while with
 it.

 Which backend are you using?  TDB?  SDB?

 -Ross.

 
  Ethan
 
 
  On Wed, Feb 20, 2013 at 2:39 PM, Hugh Cayless philomou...@gmail.com
 wrote:
 
  Hi Ethan!
 
  We've been using Jena/Fuseki in papyri.info for about a year now, iirc.
  We started with Mulgara, but switched. It's running in its own Jetty
  container in our system, but I've had no performance issues with it
  whatever.
 
  Best,
  Hugh
 
  On Feb 20, 2013, at 14:31 , Ethan Gruber ewg4x...@gmail.com wrote:
 
  Hi all,
 
  I have been playing around with Fuseki (
  http://jena.apache.org/documentation/serving_data/index.html) for a
 few
  months to get my feet wet with accessing and querying RDF.  I quite
 like
  it. I find it well documented and easy to set up.  We will soon deploy
 a
  SPARQL server in a production environment, and I would like to know if
  others on the list have experience with Fuseki in production, or have
  other
  recommendations.  Mulgara is off the table as it inexplicably conflicts
  with other apps installed in Tomcat.
 
  Thanks,
  Ethan
 



[CODE4LIB] Job Posting / Metadata Cataloger / Washington, DC

2013-02-20 Thread Suzanne Richards
Apologies for the cross postings . . . . . . . .



LAC Group is seeking a Metadata Cataloger for a potential 5-year contract 
position with a federal government agency located in Washington, DC.   The 
primary function of this position is to catalog and provide metadata for 
digital objects that are added to a digital repository.   If interested, please 
send us a copy of your resume immediately as this is a quick turn-around; the 
position is contingent on award.



Responsibilities/Requirements:

§  Participate in the development, maintenance, and documentation of 
transportation and library standards, such as the Agency's Research Thesaurus, 
Dublin Core schema, MARC to Dublin Core mapping, authority files, and 
digitization specifications;

§  Support Agency's digital document management functions, which include but 
are not limited to support for the Digital Repository, operation of the 
electronic journal maintenance system, and file systems;

§  Support expansion of media within the Agency as audio, video, still image, 
data series, or other content may be added;

§  Investigate  and evaluate new software applications to facilitate technical  
services functions such as processing documents into the Digital Repository, 
machine-aided indexing, metadata extraction, and digital preservation;

§  Provide publications  support, such as editors, desktop publishing 
professionals, editorial assistants, and graphic designers to assist RITA in 
the production, publication, or other services to produce reports,  
informational  materials, and other documents in print and electronic  formats 
for print publishing or publishing  to the Web;

§  Provide digitization services for identified collections, delivering 
products adhering to the accessibility standards of Section 508;

§  Continue in progress improvements to Agency's Integrated Search;

§  Complete and continue support for user interface for cataloging and 
integration of controls for controlled fields;

§  Support integration and development of Agency's web site systems 
applications  and services as these evolve, including but not limited to: 
information resource management; knowledge management; content management; 
process management; document management; and web site updates, development, and 
maintenance;

§  Provide and maintain metadata design and functionality consistent with 
national and international standards for Open Archives Initiative, Dublin Core 
Metadata Initiative, information retrieval, data visualization, and developing 
transportation standards for metadata;

§  Provide technical input and assistance for integrating Content Management 
System into Workflows;

§  Provide recommendations on how to improve and streamline the Agency's 
Technical Services processes.



To apply:  http://goo.gl/lYSGV
LAC Group is an Equal Opportunity/Affirmative Action employer and values 
diversity in the workforce.
LAC Group is a premier provider of recruiting and consultancy services for 
information professionals at U.S. and global organizations including Fortune 
100 companies, law firms, pharmaceutical companies, large academic institutions 
and prominent government agencies.


Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread John Fereira
I've been using Fuseki for a while myself and have been using it in production. 
 It can be a bit tricky to configure when you want to connect to a Jena SDB, 
but with a small jar file from one of the Jena developers that manages 
the SDB database connection, it works pretty well.

If you want to have more fun with Fuseki, check out the linked data API 
implementations called Elda (a java impl) or Puelia (PHP) and connect it to 
your Fuseki endpoint.

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ethan 
Gruber
Sent: Wednesday, February 20, 2013 2:32 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] Fuseki and other SPARQL servers

Hi all,

I have been playing around with Fuseki (
http://jena.apache.org/documentation/serving_data/index.html) for a few months 
to get my feet wet with accessing and querying RDF.  I quite like it. I find it 
well documented and easy to set up.  We will soon deploy a SPARQL server in a 
production environment, and I would like to know if others on the list have 
experience with Fuseki in production, or have other recommendations.  Mulgara 
is off the table as it inexplicably conflicts with other apps installed in 
Tomcat.

Thanks,
Ethan


Re: [CODE4LIB] Fuseki and other SPARQL servers

2013-02-20 Thread John Fereira
I forgot about that.  That issue was created quite a while ago and I hadn't 
checked on it in a long time.  I've found that Jetty has worked fine in our 
production environment so far.  As I wrote earlier, I have it connecting to a 
jena SDB that is used for a semantic web application (VIVO) that was developed 
here.  Although we have the semantic web application running on a different 
server than the SDB database I found the performance was fairly significantly 
improved by having the Fuseki server running on the same machine as the SDB.

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Ethan 
Gruber
Sent: Wednesday, February 20, 2013 2:52 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Fuseki and other SPARQL servers

Hi Hugh,

I have investigated the possibility of deploying Fuseki as a war in Tomcat (
https://issues.apache.org/jira/browse/JENA-201) because I wasn't sure how the 
default Jetty container would respond in production, but since you aren't 
having any problems with that deployment, I may go ahead and do that.

Ethan


On Wed, Feb 20, 2013 at 2:39 PM, Hugh Cayless philomou...@gmail.com wrote:

 Hi Ethan!

 We've been using Jena/Fuseki in papyri.info for about a year now, iirc.
 We started with Mulgara, but switched. It's running in its own Jetty 
 container in our system, but I've had no performance issues with it 
 whatever.

 Best,
 Hugh

 On Feb 20, 2013, at 14:31 , Ethan Gruber ewg4x...@gmail.com wrote:

  Hi all,
 
  I have been playing around with Fuseki (
  http://jena.apache.org/documentation/serving_data/index.html) for a 
  few months to get my feet wet with accessing and querying RDF.  I 
  quite like it. I find it well documented and easy to set up.  We 
  will soon deploy a SPARQL server in a production environment, and I 
  would like to know if others on the list have experience with Fuseki 
  in production, or have
 other
  recommendations.  Mulgara is off the table as it inexplicably 
  conflicts with other apps installed in Tomcat.
 
  Thanks,
  Ethan



Re: [CODE4LIB] Providing Search Across PDFs

2013-02-20 Thread Wilhelmina Randtke
Yes, Google Custom Search is not too bad, if your PDFs are sorted
meaningfully by directory, and if you submit a site map to Google for more
complete indexing.  You can use Xenu to make a site map, put the site map
online as a static XML file, and then use Google Webmaster Tools to pass
the location of the site map.  This helps Google to index your site more
completely.  Then you periodically recreate and update the site map.
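
If you would rather script the site map than rebuild it by hand, the following is a rough sketch of generating one for a list of PDFs. The example.org URLs and the build_sitemap name are placeholders I made up; the only fixed part is the sitemaps.org namespace. Including a lastmod date for each file is optional, but it gives a crawler a hint about what has changed since the last pass.

```python
import xml.etree.ElementTree as ET
from datetime import date

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, modified in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        # lastmod uses the W3C date format, e.g. 2013-02-20
        ET.SubElement(url, "lastmod").text = modified.isoformat()
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical PDF locations, for illustration only.
    pdfs = [
        ("http://example.org/findingaids/coll001/box1.pdf", date(2013, 2, 1)),
        ("http://example.org/findingaids/coll001/box2.pdf", date(2013, 2, 20)),
    ]
    print(build_sitemap(pdfs))
```

In practice the (url, date) pairs would come from walking the PDF directory and reading file modification times, and the output would be saved as the static XML file mentioned above.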

For homegrown search, I would have recommended Swish-e, if you hadn't said
it was out of reach.

-Wilhelmina Randtke


On Wed, Feb 20, 2013 at 12:07 PM, Jason Griffey grif...@gmail.com wrote:

 This might not fit your need exactly, but a Google Custom Search (
 http://www.google.com/cse/) should do the job. You can have the Custom
 Search only index a given directory, or only PDFs, whichever is more
 useful.

 Jason


 On Wed, Feb 20, 2013 at 12:53 PM, Nathan Tallman ntall...@gmail.com
 wrote:

  My institution is looking for ways to provide search across PDFs through
  our website. Specifically, PDFs linked from finding aids. Ideally
 searching
  within a collection's PDFs or possibly across all PDFs linked from all
  finding aids.
 
  We do not have a CMS or a digital repository. A digital repository is on
  the horizon, but it's a ways out and we need to offer the search sooner.
  I've looked into Swish-e but haven't had much luck getting anything off
 the
  ground.
 
  One way we know we can do this is through our discovery layer, VuFind, using
  its ability to full-text index a website based on a sitemap (which would
  include PDFs linked from finding aids). Facets could be created for
   collections, and we may be able to create a search box on the finding
 aid
  nav that searches specifically that collection.
 
  But, I'm not sure how scalable that solution is. The indexing agent
 cannot
  discern when a page was updated, so it has to re-scrape
  everything, every night. The impetus collection is going to have over
  1000 PDFs. And that's to start. Creating the index will start to take a
  long, long time.
 
  Does anyone have any ideas or know of any useful tools for this project?
  Doesn't have to be perfect, quick and dirty may work. (The OCR's dirty
  anyway :-)
 
  Thanks,
  Nathan
 



Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Mark A. Matienzo
Regarding forking and WordPress:
http://wordpress.org/extend/plugins/post-forking/

WordPress Post Forking allows users to fork or create an alternate
version of content to foster a more collaborative approach to
WordPress content curation. This can be used, for example, to allow
external users (such as visitors to your site) or internal users (such
as other authors) with the ability to submit proposed revisions. It
can even be used on smaller or single-author sites to enable post
authors to edit published posts without their changes appearing
immediately. If you're familiar with Git, or other decentralized
version control systems, you're already familiar with WordPress post
forking.

Mark


On Wed, Feb 20, 2013 at 2:51 PM, Jason Stirnaman jstirna...@kumc.edu wrote:
 Git repositories would make for poems which are easily shared, copied,
 forked, and merged back together. I'm interested in the relationship this
 has to the idea of an oral tradition.

 Point taken. That's a really interesting idea. Sorry that I jumped in at the 
 middle.

 I think we should do all these things--and maybe play around with
 git, too.

 Agreed. So, distilling some of the key ideas from the thread:
 1. Keep it simple for anyone to share a poem.
 2. Help those poems find an audience (aka, more nerds for starters).
 3. Allow the audience to comment on the poems.
 4. Help other people share, adapt, fork, the poems.
 5. Help the poems persist and record their history as they go.

 Jason

 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Tom Johnson 
 [johnson.tom+code4...@gmail.com]
 Sent: Wednesday, February 20, 2013 1:04 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

 But while I get the argument for utility, there does seem to be
 barrier-to-entry there for someone just wanting to submit a poem.

 The original suggestion wasn't about utility, but about modes of writing.
 Git repositories would make for poems which are easily shared, copied,
 forked, and merged back together. I'm interested in the relationship this
 has to the idea of an oral tradition. Especially given that a git
 poetry tradition would record its own history in the medium.

 I agree that wordpress is much more accessible. It seems obvious to me that
 we could post poems where we see fit and aggregate them. Written and oral
 is even more accessible than that. It seems obvious to me that we could
 write down and/or recite poems, pass them around, and commit them to
 memory. I think we should do all these things--and maybe play around with
 git, too.

 For me, the important take away from this discussion is that git art
 shouldn't be the dominant form of expression or the raison d'etre for the
 'nerd poetry' idea.

 As an aside: I share the concerns about GitHub. I resisted joining for
 years because of exactly this issue. If Facebook is a man-in-the-middle
 exploit on social interaction, then surely GitHub is the same on Free
 Software development. I thought the FOSS community would be better served
 if we all put up our git repositories in our own ways, and tried to build
 tools for collaboration. As it turns out, GitHub has done wonders for code
 sharing and collaborative development and the company has been good to us,
 which is why I'm there now. I still worry about ways our platform
 dependence could go badly. Luckily, the risk is mitigated by git's
 distributed and portable nature.

 - Tom


 On Wed, Feb 20, 2013 at 10:20 AM, Jason Stirnaman jstirna...@kumc.edu wrote:

 Another option might be to set it up like the Planet. Where individuals
 just post their poetry to their own blogs, Tumblrs, etc., tag them, and
 have $PLANET_NERD_POETS aggregate them.

 Git and Github are great. But while I get the argument for utility, there
 does seem to be barrier-to-entry there for someone just wanting to submit a
 poem.

 Jason

 Jason Stirnaman
 Digital Projects Librarian
 A.R. Dykes Library
 University of Kansas Medical Center
 913-588-7319

 
 From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Karen
 Coyle [li...@kcoyle.net]
 Sent: Wednesday, February 20, 2013 10:42 AM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

 Shaun, you cannot decide whether github is a barrier to entry FOR ME (or
 anyone else), any more than you can decide whether or not my foot hurts.
 I'm telling you github is NOT what I want to use. Period.

 I'm actually thinking that a blog format would be nice. It could be
 pretty (poetry and beauty go together). Poems tend to be short, so
 they'd make a nice blog post. They could appear in the Planet blog roll.
 They could be coded by author and topic. There could be comments! Even
 poems as comments! The only down-side is managing users. Anyone have
 ideas on that?

 kc


 On 2/20/13 8:20 AM, Shaun Ellis wrote:
   (As a general rule, for every 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Jonathan Rochkind
Probably a mistake for me to post at all, but I'm full of mistakes. You 
know what, if someone wants to set up a spot for nerd poetry, I think 
they should do so. If someone else wants to set up a different spot 
using different tech, I think they should do so too.


I think it's mistaken to think anyone needs to reach consensus on what 
the 'right' technology or spot for this is; and I also think it's 
mistaken to think that any one platform or technology is going to be 
thought to be the best by everyone involved; different people will 
always have different opinions.   I also think it's a mistake to get 
offended because what someone else feels like experimenting with setting 
up is not something you think is the best way to do it.


Most pieces of code4lib 'social' tech, going back to this mailing list 
itself, was created by someone who felt like creating it because they 
thought it would be fun and rewarding, and they did so.


Now, if you want to discuss the technical pros and cons of different 
technical options, or even some technical how-to's, I think that would 
be a great use of the code4lib listserv.


Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Johnston, Leslie
Ah, my bad.

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Karen Coyle
 Sent: Wednesday, February 20, 2013 1:15 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)
 
 We're talking about wordpress, not github.
 
 kc
 
 On 2/20/13 9:56 AM, Johnston, Leslie wrote:
  It's technically breaking GitHub's terms of service to have multiple
 individuals sharing a single account.
 
  Leslie
 
  -Original Message-
  From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf
  Of Karen Coyle
  Sent: Wednesday, February 20, 2013 12:07 PM
  To: CODE4LIB@LISTSERV.ND.EDU
  Subject: Re: [CODE4LIB] GitHub Myths (was thanks and poetry)
 
  Sure. Although the question was more: how can we make it easy to
 have
  a bunch of accounts? Or should we have a c4l account that we share
  (and monitor for spam)? I think anything wysiwyg-y and familiar
  (wordpress certainly meets those criteria) would be fine. There does
  seem to be a lot of familiarity with Wordpress in the group.
 
  kc
 
 
  On 2/20/13 8:45 AM, Ethan Gruber wrote:
  Wordpress?
 
 
  On Wed, Feb 20, 2013 at 11:42 AM, Karen Coyle li...@kcoyle.net
  wrote:
  Shaun, you cannot decide whether github is a barrier to entry FOR
  ME (or anyone else), any more than you can decide whether or not
 my
  foot hurts.
  I'm telling you github is NOT what I want to use. Period.
 
  I'm actually thinking that a blog format would be nice. It could
 be
  pretty (poetry and beauty go together). Poems tend to be short, so
  they'd make a nice blog post. They could appear in the Planet blog
  roll. They could be coded by author and topic. There could be
  comments! Even poems as comments!
  The only down-side is managing users. Anyone have ideas on that?
 
  kc
 
 
 
  On 2/20/13 8:20 AM, Shaun Ellis wrote:
 
  (As a general rule, for every programmer who prefers tool A, and
  says that everybody should use it, there’s a programmer who
  disparages tool A, and advocates tool B. So take what we say
 with
  a
  grain of salt!)
  It doesn't matter what tools you use, as long as you and your
 team
  are able to participate easily, if you want to.  But if you want
  to
  attract
 contributions from a given development community, then choices
  should be balanced between the preferences of that community and
  what best serve the project.
 
From what I've been hearing, I think there is a lot of
 confusion
  about GitHub.  Heck, I am constantly learning about new GitHub
  features, APIs, and best practices myself. But I find it to be an
  incredibly powerful platform for moving open source, distributed
  software development forward.
 I am not telling anyone to use GitHub if they don't want to,
  but
  I
  want to dispel a few myths I've heard recently:
 
  
 
  * Myth #1 : GitHub creates a barrier to entry.
  * To contribute to a project on GitHub, you need to use the
  command-line. It's not for non-coders.
 
  GitHub != git.  While GitHub was initially built for publishing
  and sharing code via integration with git, all GitHub
  functionality can be performed directly through the web gui.  In
  fact, GitHub can
  even
  be used as your sole coding environment. There are other tools in
  the eco-system
  that allow non-coders to contribute documentation, issue
  reporting, and more to a project.
 
  
 
  * Myth #2 : GitHub is for sharing/publishing code.
  * It would be fun to have a wiki for more durable poetry (github
  unfortunately would be a barrier to many).
 
  GitHub can be used to collaborate on and publish other types of
  content as well.  For example, GitHub has a great wiki component*
  (as well as a website component).  In a number of ways, has less
  of
  a barrier to entry
  than our Code4Lib wiki.
 
  While the path of least resistance requires a repository to
 have
  a
  wiki, public repos cost nothing and can consist of a simple
  README file.
 The wiki can be locked down to a team, or it can be writable
 by
  anyone with a github account.  You don't need to do anything via
  command-line, don't need to understand git-flow, and you don't
  even need to learn wiki markup to write content. All you need is
  an account and something to say, just like any wiki. Log in, go
 to
  the anti-harassment policy wiki, and see for yourself:
  https://github.com/code4lib/antiharassment-policy/wiki
 
  * The github wiki even has an API (via Gollum) that you can use
 to
  retrieve raw or formatted wiki content, write new content, and
  collect various meta data about the wiki as a whole:
  https://github.com/code4lib/antiharassment-policy/wiki/_access
 
  
 
  * Myth #3 : GitHub is person-centric.
  (And as a further aside, there’s plenty to dislike about github
  as
  well, from it’s 

Re: [CODE4LIB] GitHub Myths (was thanks and poetry)

2013-02-20 Thread Erik Hetzner
At Wed, 20 Feb 2013 11:50:45 -0800,
Tom Johnson wrote:
 
  but it would be difficult to replace the social network around the
 projects.
 
 Especially difficult now that GitHub is where the community is. It's
 technically possible to build a social web that works on a decentralized
 basis, but it may no longer be culturally possible. Platforms are hard to
 get down from.

Maybe. Most people today use internet email, not Compuserve email;
they use the web, not AOL keywords; and they use jabber/xmpp, not ICQ.
I don’t think it’s unreasonable to think that people will eventually
leave twitter for a status.net implementation, or github for something
else.

best, Erik
Sent from my free software system http://fsf.org/.




Re: [CODE4LIB] Question on CONTENTdm and Linked Data

2013-02-20 Thread Ahniwa Ferrari
I work right next to the CONTENTdm guys, so I suppose I could ask them, but
I also used to work at the Washington State Library, and I like what they're
doing with CONTENTdm, and they have some maps. Is this a good example of
what you're trying to do at all?

http://www.washingtonruralheritage.com/cdm/map


On Wed, Feb 20, 2013 at 2:25 PM, Matthew Sherman
matt.r.sher...@gmail.com wrote:

 Hello Code4Lib,

 I was wondering if anyone has had success in using digital data or
 resources that are stored in CONTENTdm in any linked data projects.  I have
 tried utilizing CONTENTdm data for a small Google Map in the past and found
 it quite difficult to use.  At the same time I have not used CONTENTdm in
 over a year so I do not know if they have made it easier to extract and
 utilize information from the system.  I am working on an interview
 presentation and one of the parts I am trying to tackle involves working a
 set of data into a user friendly system related to a specific
 topic, possibly using a map.  I know these folks have CONTENTdm currently
 so I was wondering if I would be able to present a way to work with the
 existing system or if I should be saying that to make this project work
 they need to put it into a different CMS.  Any insight folks have had
 working with linked data in CONTENTdm would be quite welcome.  Thanks.

 Matt Sherman