Re: [CODE4LIB] Let's implement the referrer meta tag

2015-06-12 Thread Andrew Anderson
I was not suggesting mixing HTTP/HTTPS resources, but rather wholesale 
converting to SSL and taking advantage of the fact that vendors don't seem to 
care about privacy issues and do not support SSL today. Thus when a user 
leaves the secure site for a non-secure vendor site, the Referer header will 
not be sent, thanks to RFC 2616 behavior.

Of course, SPDY^WHTTP/2.0 will make this moot, but perhaps someone can convince 
the standards group that referring URLs are not a good idea to carry forward in 
general.
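
As a rough illustration of the RFC 2616 section 15.1.3 rule referred to above, 
here is a minimal Python sketch of the decision a client is asked to make. The 
function name should_send_referer and the example.* URLs are made up for this 
example, and real browsers may layer further policy on top of this one rule.

```python
from urllib.parse import urlsplit


def should_send_referer(referring_url: str, target_url: str) -> bool:
    """Model only the RFC 2616 section 15.1.3 SHOULD NOT rule (hypothetical helper)."""
    source_scheme = urlsplit(referring_url).scheme.lower()
    target_scheme = urlsplit(target_url).scheme.lower()
    # Secure page -> non-secure request is the case where Referer is withheld.
    if source_scheme == "https" and target_scheme == "http":
        return False
    return True


# A library catalogue on HTTPS linking out to a vendor that is still on HTTP.
print(should_send_referer("https://library.example.edu/record/1",
                          "http://vendor.example.com/article"))
```

Under this rule an HTTPS-to-HTTPS request still carries the header, so the 
approach only helps for as long as the vendor stays on plain HTTP.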

-- 
Andrew Anderson, Director of Development, Library and Information Resources 
Network, Inc.
http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | 
http://www.facebook.com/LIRNnotes

On Jun 12, 2015, at 18:23, Eric Hellman  wrote:

> While going to SSL is a good thing to do, it's not a good idea to be loading 
> non-secure resources into a secure web page, because then your page is no 
> longer secure.
> 
> So for example, if you load the Google Analytics script via http from an 
> https page, a MITM attacker could just insert evil code into the script. Or 
> Verizon could insert X-UIDH headers into non-SSL cover image requests.
> 
> Eric
> 
>> On Jun 12, 2015, at 2:37 AM, Andrew Anderson  wrote:
>> 
>> Or just SSL enable your library web site.  Few vendors support SSL today, so 
>> crossing the HTTP/HTTPS barrier is supposed to automatically disable 
>> referring URL passing.
>> 
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec15.html#sec15.1.3
>> 
>> 15.1.3 Encoding Sensitive Information in URI's
>> 
>> Because the source of a link might be private information or might reveal an 
>> otherwise private information source, it is strongly recommended that the 
>> user be able to select whether or not the Referer field is sent. For 
>> example, a browser client could have a toggle switch for browsing 
>> openly/anonymously, which would respectively enable/disable the sending of 
>> Referer and From information.
>> 
>> Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP 
>> request if the referring page was transferred with a secure protocol.
>> 
>> Authors of services which use the HTTP protocol SHOULD NOT use GET based 
>> forms for the submission of sensitive data, because this will cause this 
>> data to be encoded in the Request-URI. Many existing servers, proxies, and 
>> user agents will log the request URI in some place where it might be visible 
>> to third parties. Servers can use POST-based form submission instead
>> 
>> -- 
>> Andrew Anderson, Director of Development, Library and Information Resources 
>> Network, Inc.
>> http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | 
>> http://www.facebook.com/LIRNnotes
>> 
>> On Jun 12, 2015, at 0:24, Conal Tuohy  wrote:
>> 
>>> Assuming your library web server has a front-end proxy (I guess this is
>>> pretty common) or at least runs inside Apache httpd or something, then
>>> rather than use the HTML meta tag, it might be easier to set the "referer"
>>> policy via the "Content-Security-Policy" HTTP header field.
>>> 
>>> https://w3c.github.io/webappsec/specs/content-security-policy/#content-security-policy-header-field
>>> 
>>> e.g. in Apache httpd with mod_headers:
>>> 
>>> Header set Content-Security-Policy "referrer no-referrer"
>>> 
>>> 
>>> 
>>> On 12 June 2015 at 13:55, Frumkin, Jeremy A - (frumkinj) <
>>> frumk...@email.arizona.edu> wrote:
>>> 
>>>> Eric -
>>>> 
>>>> Many thanks for raising awareness of this. It does feel like encouraging
>>>> good practice re: referrer meta tag would be a good thing, but I would not
>>>> know where to start to make something like this required practice. Did you
>>>> have some thoughts on that?
>>>> 
>>>> -- jaf
>>>> 
>>>> ---
>>>> Jeremy Frumkin
>>>> Associate Dean / Chief Technology Strategist
>>>> University of Arizona Libraries
>>>> 
>>>> +1 520.626.7296
>>>> j...@arizona.edu
>>>> --
>>>> "A person who never made a mistake never tried anything new." - Albert
>>>> Einstein
>>>> 
>>>> On 6/11/15, 8:25 AM, "Eric Hellman"  wrote:
>>>> 
>>>>> http://go-to-hellman.blogspot.com/2015/06/protect-reader-privacy-with-referrer.html
>>>>> 
>>>>> I hope this is easy to deploy on library websites, because the privacy
>>>>> enhancement is significant.
>>>>> 
>>>>> I'd be very interested to know of sites that are using it; I know Thomas
>>>>> Dowling implemented a referrer policy on http://oatd.org/
>>>>> 
>>>>> Would it be a good idea to make it a required practice for libraries?
>>>>> 
>>>>> 
>>>>> Eric Hellman
>>>>> President, Gluejar.Inc.
>>>>> Founder, Unglue.it https://unglue.it/
>>>>> http://go-to-hellman.blogspot.com/
>>>>> twitter: @gluejar


Re: [CODE4LIB] Let's implement the referrer meta tag

2015-06-12 Thread Suchy, Daniel
Not to mention that users may get that annoying and paranoia-inducing
browser warning: "These resources can be viewed by others while in
transit, and can be modified by an attacker to change the look of the
page."

It's enough to scare me off even though it's usually harmless.
-Dan


Daniel Suchy
Assistant Director
Academic Computing & Media Services
UC San Diego
858.534.9556
dsu...@ucsd.edu | acms.ucsd.edu



On 6/12/15, 3:23 PM, "Eric Hellman"  wrote:

>While going to SSL is a good thing to do, it's not a good idea to be
>loading non-secure resources into a secure web page, because then your
>page is no longer secure.
>
>So for example, if you load the Google Analytics script via http from an
>https page, a MITM attacker could just insert evil code into the
>script. Or Verizon could insert X-UIDH headers into non-SSL cover image
>requests.
>
>Eric
>
>> On Jun 12, 2015, at 2:37 AM, Andrew Anderson  wrote:
>> 
>> Or just SSL enable your library web site.  Few vendors support SSL
>>today, so crossing the HTTP/HTTPS barrier is supposed to automatically
>>disable referring URL passing.
>> 
>> http://www.w3.org/Protocols/rfc2616/rfc2616-sec15.html#sec15.1.3
>> 
>> 15.1.3 Encoding Sensitive Information in URI's
>> 
>> Because the source of a link might be private information or might
>>reveal an otherwise private information source, it is strongly
>>recommended that the user be able to select whether or not the Referer
>>field is sent. For example, a browser client could have a toggle switch
>>for browsing openly/anonymously, which would respectively enable/disable
>>the sending of Referer and From information.
>> 
>> Clients SHOULD NOT include a Referer header field in a (non-secure)
>>HTTP request if the referring page was transferred with a secure
>>protocol.
>> 
>> Authors of services which use the HTTP protocol SHOULD NOT use GET
>>based forms for the submission of sensitive data, because this will
>>cause this data to be encoded in the Request-URI. Many existing servers,
>>proxies, and user agents will log the request URI in some place where it
>>might be visible to third parties. Servers can use POST-based form
>>submission instead
>> 
>> -- 
>> Andrew Anderson, Director of Development, Library and Information
>>Resources Network, Inc.
>> http://www.lirn.net/ | http://www.twitter.com/LIRNnotes |
>>http://www.facebook.com/LIRNnotes
>> 
>> On Jun 12, 2015, at 0:24, Conal Tuohy  wrote:
>> 
>>> Assuming your library web server has a front-end proxy (I guess this is
>>> pretty common) or at least runs inside Apache httpd or something, then
>>> rather than use the HTML meta tag, it might be easier to set the "referer"
>>> policy via the "Content-Security-Policy" HTTP header field.
>>> 
>>> https://w3c.github.io/webappsec/specs/content-security-policy/#content-security-policy-header-field
>>> 
>>> e.g. in Apache httpd with mod_headers:
>>> 
>>> Header set Content-Security-Policy "referrer no-referrer"
>>> 
>>> 
>>> 
>>> On 12 June 2015 at 13:55, Frumkin, Jeremy A - (frumkinj) <
>>> frumk...@email.arizona.edu> wrote:
>>> 
>>>> Eric -
>>>> 
>>>> Many thanks for raising awareness of this. It does feel like encouraging
>>>> good practice re: referrer meta tag would be a good thing, but I would not
>>>> know where to start to make something like this required practice. Did you
>>>> have some thoughts on that?
>>>> 
>>>> -- jaf
>>>> 
>>>> ---
>>>> Jeremy Frumkin
>>>> Associate Dean / Chief Technology Strategist
>>>> University of Arizona Libraries
>>>> 
>>>> +1 520.626.7296
>>>> j...@arizona.edu
>>>> --
>>>> "A person who never made a mistake never tried anything new." - Albert
>>>> Einstein
>>>> 
>>>> On 6/11/15, 8:25 AM, "Eric Hellman"  wrote:
>>>> 
>>>>> http://go-to-hellman.blogspot.com/2015/06/protect-reader-privacy-with-referrer.html
>>>>> 
>>>>> I hope this is easy to deploy on library websites, because the privacy
>>>>> enhancement is significant.
>>>>> 
>>>>> I'd be very interested to know of sites that are using it; I know Thomas
>>>>> Dowling implemented a referrer policy on http://oatd.org/
>>>>> 
>>>>> Would it be a good idea to make it a required practice for libraries?
>>>>> 
>>>>> 
>>>>> Eric Hellman
>>>>> President, Gluejar.Inc.
>>>>> Founder, Unglue.it https://unglue.it/
>>>>> http://go-to-hellman.blogspot.com/
>>>>> twitter: @gluejar


Re: [CODE4LIB] Let's implement the referrer meta tag

2015-06-12 Thread Eric Hellman
While going to SSL is a good thing to do, it's not a good idea to be loading 
non-secure resources into a secure web page, because then your page is no 
longer secure.

So for example, if you load the Google Analytics script via http from an https 
page, a MITM attacker could just insert evil code into the script. Or Verizon 
could insert X-UIDH headers into non-SSL cover image requests.
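
One way to check a page for this kind of mixed content before switching it to 
https is to scan its HTML for subresources that are still fetched over http. 
The following is only a sketch using Python's standard html.parser; the class 
and function names are invented for this example, the tag list is deliberately 
short, and it does not see resources that scripts or stylesheets load later.

```python
from html.parser import HTMLParser


class MixedContentFinder(HTMLParser):
    """Collect http:// URLs that a page would load as subresources."""

    # (tag, attribute) pairs that normally trigger a fetch; not exhaustive.
    LOADING_ATTRS = {
        ("script", "src"),
        ("img", "src"),
        ("iframe", "src"),
        ("link", "href"),
    }

    def __init__(self):
        super().__init__()
        self.insecure_urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in self.LOADING_ATTRS and value \
                    and value.lower().startswith("http://"):
                self.insecure_urls.append(value)


def find_mixed_content(html_text):
    """Return the non-secure subresource URLs found in html_text."""
    finder = MixedContentFinder()
    finder.feed(html_text)
    return finder.insecure_urls


# Hypothetical page: one insecure script, one secure image, one plain link.
sample_page = (
    '<html><head><script src="http://analytics.example.com/tracker.js">'
    '</script></head><body><img src="https://covers.example.org/1.jpg">'
    '<a href="http://vendor.example.com/">vendor</a></body></html>'
)
print(find_mixed_content(sample_page))
```

Ordinary links (the a element) are left out on purpose: following a link is a 
navigation, not a resource loaded into the secure page.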

Eric

> On Jun 12, 2015, at 2:37 AM, Andrew Anderson  wrote:
> 
> Or just SSL enable your library web site.  Few vendors support SSL today, so 
> crossing the HTTP/HTTPS barrier is supposed to automatically disable 
> referring URL passing.
> 
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec15.html#sec15.1.3
> 
> 15.1.3 Encoding Sensitive Information in URI's
> 
> Because the source of a link might be private information or might reveal an 
> otherwise private information source, it is strongly recommended that the 
> user be able to select whether or not the Referer field is sent. For example, 
> a browser client could have a toggle switch for browsing openly/anonymously, 
> which would respectively enable/disable the sending of Referer and From 
> information.
> 
> Clients SHOULD NOT include a Referer header field in a (non-secure) HTTP 
> request if the referring page was transferred with a secure protocol.
> 
> Authors of services which use the HTTP protocol SHOULD NOT use GET based 
> forms for the submission of sensitive data, because this will cause this data 
> to be encoded in the Request-URI. Many existing servers, proxies, and user 
> agents will log the request URI in some place where it might be visible to 
> third parties. Servers can use POST-based form submission instead
> 
> -- 
> Andrew Anderson, Director of Development, Library and Information Resources 
> Network, Inc.
> http://www.lirn.net/ | http://www.twitter.com/LIRNnotes | 
> http://www.facebook.com/LIRNnotes
> 
> On Jun 12, 2015, at 0:24, Conal Tuohy  wrote:
> 
>> Assuming your library web server has a front-end proxy (I guess this is
>> pretty common) or at least runs inside Apache httpd or something, then
>> rather than use the HTML meta tag, it might be easier to set the "referer"
>> policy via the "Content-Security-Policy" HTTP header field.
>> 
>> https://w3c.github.io/webappsec/specs/content-security-policy/#content-security-policy-header-field
>> 
>> e.g. in Apache httpd with mod_headers:
>> 
>> Header set Content-Security-Policy "referrer no-referrer"
>> 
>> 
>> 
>> On 12 June 2015 at 13:55, Frumkin, Jeremy A - (frumkinj) <
>> frumk...@email.arizona.edu> wrote:
>> 
>>> Eric -
>>> 
>>> Many thanks for raising awareness of this. It does feel like encouraging
>>> good practice re: referrer meta tag would be a good thing, but I would not
>>> know where to start to make something like this required practice. Did you
>>> have some thoughts on that?
>>> 
>>> — jaf
>>> 
>>> ---
>>> Jeremy Frumkin
>>> Associate Dean / Chief Technology Strategist
>>> University of Arizona Libraries
>>> 
>>> +1 520.626.7296
>>> j...@arizona.edu
>>> ——
>>> "A person who never made a mistake never tried anything new." - Albert
>>> Einstein
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On 6/11/15, 8:25 AM, "Eric Hellman"  wrote:
>>> 
>>>> http://go-to-hellman.blogspot.com/2015/06/protect-reader-privacy-with-referrer.html
>>>> 
>>>> I hope this is easy to deploy on library websites, because the privacy
>>>> enhancement is significant.
>>>> 
>>>> I'd be very interested to know of sites that are using it; I know Thomas
>>>> Dowling implemented a referrer policy on http://oatd.org/
>>>> 
>>>> Would it be a good idea to make it a required practice for libraries?
>>>> 
>>>> 
>>>> Eric Hellman
>>>> President, Gluejar.Inc.
>>>> Founder, Unglue.it https://unglue.it/
>>>> http://go-to-hellman.blogspot.com/
>>>> twitter: @gluejar
>>> 


Re: [CODE4LIB] Let's implement the referrer meta tag

2015-06-12 Thread Eric Hellman
I'd not heard of this.

But on reading it closely, I don't think it regulates the Referer header; 
rather, it restricts the origins of resources that a page can load. So it 
doesn't work with referrer policies. But I could be wrong.

Eric

On Jun 12, 2015, at 12:24 AM, Conal Tuohy  wrote:
> 
> Assuming your library web server has a front-end proxy (I guess this is
> pretty common) or at least runs inside Apache httpd or something, then
> rather than use the HTML meta tag, it might be easier to set the "referer"
> policy via the "Content-Security-Policy" HTTP header field.
> 
> https://w3c.github.io/webappsec/specs/content-security-policy/#content-security-policy-header-field
> 
> e.g. in Apache httpd with mod_headers:
> 
> Header set Content-Security-Policy "referrer no-referrer"
> 
> 
> 
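
Where it is easier to set a response header in the application than in the 
web server, the same idea can be sketched as a small WSGI wrapper in Python. 
This is an illustration only: add_response_header and hello_app are invented 
names, and the header value simply mirrors the draft "referrer" directive 
quoted above, so whether a given browser honors it is an assumption to verify.

```python
def add_response_header(app, name, value):
    """Wrap a WSGI app so that every response carries the given header."""

    def wrapped(environ, start_response):
        def patched_start_response(status, headers, exc_info=None):
            # Drop any existing header of the same name, then append ours.
            kept = [(k, v) for (k, v) in headers if k.lower() != name.lower()]
            kept.append((name, value))
            return start_response(status, kept, exc_info)

        return app(environ, patched_start_response)

    return wrapped


def hello_app(environ, start_response):
    """Stand-in for the library web application (hypothetical)."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]


# Header name and value are taken from the thread; both are assumptions here.
app = add_response_header(
    hello_app, "Content-Security-Policy", "referrer no-referrer"
)
```

Because the header name and value are plain parameters, the wrapper can be 
pointed at a different header without other changes.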


[CODE4LIB] Job Posting: Geospatial Project Metadata Coordinator

2015-06-12 Thread Kevin Dyke
Apologies for cross posting.

The University of Minnesota Libraries are seeking a Geospatial Project
Metadata Coordinator. Please forward to those who may be interested.


University of Minnesota Libraries, Twin Cities

CIC Geospatial Project Metadata Coordinator


The University of Minnesota Libraries seek a knowledgeable and proactive
Geospatial Project Metadata Coordinator to advance the CIC Geospatial Data
Discovery Project, which focuses on developing a geospatial data discovery
portal for participating institutions in the CIC. The
Geospatial Project Metadata Coordinator works with the CIC Geospatial Data
Task Force under the management and direction of the University of
Minnesota Libraries, which holds responsibility for leading the CIC
Geospatial Data Discovery Project. As such, the work of the Geospatial
Project Metadata Coordinator will involve managing the process of metadata
creation, metadata template creation, and metadata ingest, as well as
coordinating these activities across all collaborating institutions.

Required Qualifications include a Master's degree in library/information
science from an American Library Association accredited library school,
GIS-related field, or equivalent combination of advanced degree and
relevant experience, experience with metadata creation, standards, and
management, and demonstrated project management skills.

Review of applications begins immediately and will continue until the
position is filled. For complete description and qualifications, and to
apply, go to http://z.umn.edu/ulib336.

The University of Minnesota is an equal opportunity educator and employer.

-- 
Kevin Dyke
Spatial Data Analyst/Curator
John R. Borchert Map Library, University of Minnesota Libraries
Office: 612.301.3932
Email: kevind...@umn.edu
Web: kevinrdyke.com


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread LeVan,Ralph
This is as close to an official statement as I can find:

http://www.oclc.org/developer/news/2015/dewey-down.en.html

I've asked around, but can't add anything to that.

Ralph

-Original Message-
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen 
Coyle
Sent: Friday, June 12, 2015 12:35 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: Auto discovery of Dewey, UDC

Hi. I tweeted this last month and got a reply that dewey.info is indeed 
currently down for major work, but is intended to return. That said, it was 
intended to return a month or two ago, so the usual coding project delays are 
in action here. ;-)

kc

On 6/12/15 7:08 AM, Sergio Letuche wrote:
> dewey.info
>
> seems to be dead, we have also checked this.
>
> 2015-06-12 16:57 GMT+03:00 Péter Király :
>
>> Hi Sergio,
>>
>> As part of eXtensible Catalog we developed a Dewey module for Drupal,
>> which takes a Dewey number and uses OCLC's dewey.info to fetch the
>> textual description of that number. When it was created the service
>> contained only 3 levels of the classification system; since then they
>> have gone further, and now it is deeper.
>>
>> You can find the source here:
>> http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x
>>
>> Maybe it helps you.
>>
>> Regarding UDC: it is a much harder task, and when I worked with it,
>> I ran into a blocking problem, which is that UDC was not licensed as
>> freely usable, and I was not able to get a licence to use it in an
>> open source project. There were some other problems as well: UDC
>> changed from time to time, which sometimes means that a given
>> classification code means one thing at a given point in time and
>> another thing some years later. The MARC catalog I worked with did not
>> contain any information about the UDC versions, so the accuracy of the
>> tool was not guaranteed (of course you can do some intelligent
>> guessing). And the last problem was that, in contrast to the Dewey
>> classification, UDC sometimes contains very lengthy descriptions
>> instead of one or two words. Semantically it is OK, but it makes the UI
>> design a little bit hard, and if you want to search for the textual
>> description, you'll sometimes end up with a "noisy" result set.
>> Otherwise, handling the operators, the subclasses, and all the nice
>> things UDC provides is a very interesting challenge.
>>
>> Cheers,
>> Péter
>>
>>
>> 2015-06-12 12:59 GMT+02:00 Sergio Letuche :
>>> thank you very much for your quick reply, dear Stefano,
>>>
>>> i appreciate it
>>>
>>> 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
>>>
>>>> Hi, Sergio:
>>>> maybe this article [1 abstract] [2 English text] can give you some basic
>>>> ideas. We added a lot of DDC info in our Koha catalog two years ago.
>>>> HTH. Stefano
>>>>
>>>> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
>>>> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
>>>>
>>>> On 12/giu/2015, at 12:03, Sergio Letuche wrote:
>>>>
>>>>> hello community!
>>>>>
>>>>> we are facing this challenging issue. We need to complete for a vast
>>>>> amount of records, the dewey, UDC info, has anyone had any experience
>>>>> with this? We need some way (via modeling? mahout?) to try and discover
>>>>> these values, based on some text, found in the records' metadata, and
>>>>> then auto complete these values.
>>>>>
>>>>> I would appreciate any feedback, if there is any opensource tool you
>>>>> have used for this purpose, or if you are aware of any best practice
>>>>> for doing this task.
>>>>>
>>>>> Best
>>>>
>>>> __
>>>> Your 5x1000 to the Patronato di San Girolamo della Carita' is a simple
>>>> gesture of great value.
>>>> Your signature will help priests to be closer to the needs of all of us.
>>>> Help us train priests and seminarians from the 5 continents by entering
>>>> tax code 97023980580 in your tax return.
>>>>
>>
>>
>> --
>> Péter Király
>> software developer
>> GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
>> http://linkedin.com/in/peterkiraly
>>

-- 
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600


[CODE4LIB] Job: Metadata Technologies Librarian at North Carolina State University

2015-06-12 Thread jobs
Metadata Technologies Librarian
North Carolina State University
Raleigh, North Carolina

The NCSU Libraries invites applications and nominations for the position
of Metadata Technologies Librarian in the Acquisitions and Discovery
department. Acquisitions and Discovery's seven librarians and sixteen staff
are responsible for managing the Libraries' approximately $10 million
collections budget, acquiring materials in all formats, negotiating license
agreements and contracts, and describing and maintaining access points to
facilitate discovery of the Libraries' resources. NCSU Acquisitions and
Discovery librarians have been actively involved in the development of an
e-resource management system, E-Matrix, and are bringing that expertise to
the national level as part of the Kuali OLE and GOKb projects.

Review of applications is underway; position will remain open until a suitable
candidate is found. See vacancy announcement with application instructions
at https://www.lib.ncsu.edu/jobs/epa/mtl.

AA/OEO. NC State welcomes all persons without regard to sexual orientation or
genetic information. For ADA accommodations, please call (919) 515-3148.



Brought to you by code4lib jobs: http://jobs.code4lib.org/job/21483/
To post a new job please visit http://jobs.code4lib.org/


Re: [CODE4LIB] Distribution of collections by DDC or UDC?

2015-06-12 Thread Karen Coyle
Christina, I was hoping that someone with more info would reply, but to 
my knowledge the services that do these statistical surveys are 
"pay-fer" so this might be considered proprietary info. Presumably 
libraries that have used these services received the results, but may 
not be allowed to share them. You might, however, have better luck on a 
list that has a higher percentage of collection development librarians. 
Unfortunately, I don't know what list that would be. Anyone?


kc

On 5/14/15 6:47 AM, Pikas, Christina K. wrote:

This might be a bizarre question, but can anyone point to some analysis for a 
large general library, consortium, or even like WorldCat, a distribution of 
materials by class?  So say for example 10% of the collection is in the 700s, 
and half of that is in the 741s, a quarter is in 746.432...
This table:

Table 4: Subject breakdown, nonfiction print books

Subject                                  Share
---------------------------------------  ----------
History and auxiliary sciences            8 percent
Engineering and technology                7 percent
Business and economics                    7 percent
Language, linguistics, and literature     6 percent
Philosophy and religion                   5 percent
Health and medicine                       5 percent
Art and architecture                      3 percent
Law                                       3 percent
Sociology                                 3 percent
Education                                 3 percent
Other                                    15 percent
Unknown                                  35 percent

from http://www.dlib.org/dlib/november09/lavoie/11lavoie.html isn't really
granular enough.
Thanks!
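
For a single library that can export its own call numbers, the kind of 
breakdown described above can be computed locally. The following Python 
sketch is only an illustration under stated assumptions: the input is a plain 
list of Dewey numbers as strings, the function name ddc_distribution is 
invented, and the sample values are made up rather than taken from any real 
collection.

```python
from collections import Counter


def ddc_distribution(dewey_numbers, digits=1):
    """Return {class label: share of collection} for a list of Dewey numbers.

    digits=1 groups by hundreds ("700"), 2 by tens ("740"), 3 by units ("746").
    """
    counts = Counter()
    for number in dewey_numbers:
        main = number.strip().split(".")[0]
        if len(main) != 3 or not main.isdigit():
            continue  # skip values that do not look like a Dewey number
        counts[main[:digits].ljust(3, "0")] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up sample: three items in the 700s, one in the 000s, one unusable value.
sample_numbers = ["741.5", "746.432", "025.431", "700", "n/a"]
print(ddc_distribution(sample_numbers))
print(ddc_distribution(sample_numbers, digits=3))
```

Going below the decimal point (for example 746.432) would need grouping on a 
longer prefix of the full number rather than on the three-digit main class.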
--
Christina K. Pikas
Librarian
The Johns Hopkins University Applied Physics Laboratory
Baltimore: 443.778.4812
D.C.: 240.228.4812
christina.pi...@jhuapl.edu


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Karen Coyle
Hi. I tweeted this last month and got a reply that dewey.info is indeed 
currently down for major work, but is intended to return. That said, it 
was intended to return a month or two ago, so the usual coding project 
delays are in action here. ;-)


kc

On 6/12/15 7:08 AM, Sergio Letuche wrote:
> dewey.info
>
> seems to be dead, we have also checked this.
>
> 2015-06-12 16:57 GMT+03:00 Péter Király :
>
>> Hi Sergio,
>>
>> As part of eXtensible Catalog we developed a Dewey module for Drupal,
>> which takes a Dewey number and uses OCLC's dewey.info to fetch the
>> textual description of that number. When it was created the service
>> contained only 3 levels of the classification system; since then they
>> have gone further, and now it is deeper.
>>
>> You can find the source here:
>> http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x
>>
>> Maybe it helps you.
>>
>> Regarding UDC: it is a much harder task, and when I worked with it,
>> I ran into a blocking problem, which is that UDC was not licensed as
>> freely usable, and I was not able to get a licence to use it in an
>> open source project. There were some other problems as well: UDC
>> changed from time to time, which sometimes means that a given
>> classification code means one thing at a given point in time and
>> another thing some years later. The MARC catalog I worked with did not
>> contain any information about the UDC versions, so the accuracy of the
>> tool was not guaranteed (of course you can do some intelligent
>> guessing). And the last problem was that, in contrast to the Dewey
>> classification, UDC sometimes contains very lengthy descriptions
>> instead of one or two words. Semantically it is OK, but it makes the UI
>> design a little bit hard, and if you want to search for the textual
>> description, you'll sometimes end up with a "noisy" result set.
>> Otherwise, handling the operators, the subclasses, and all the nice
>> things UDC provides is a very interesting challenge.
>>
>> Cheers,
>> Péter
>>
>>
>> 2015-06-12 12:59 GMT+02:00 Sergio Letuche :
>>> thank you very much for your quick reply, dear Stefano,
>>>
>>> i appreciate it
>>>
>>> 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
>>>
>>>> Hi, Sergio:
>>>> maybe this article [1 abstract] [2 English text] can give you some basic
>>>> ideas. We added a lot of DDC info in our Koha catalog two years ago.
>>>> HTH. Stefano
>>>>
>>>> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
>>>> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
>>>>
>>>> On 12/giu/2015, at 12:03, Sergio Letuche wrote:
>>>>
>>>>> hello community!
>>>>>
>>>>> we are facing this challenging issue. We need to complete for a vast
>>>>> amount of records, the dewey, UDC info, has anyone had any experience
>>>>> with this? We need some way (via modeling? mahout?) to try and discover
>>>>> these values, based on some text, found in the records' metadata, and
>>>>> then auto complete these values.
>>>>>
>>>>> I would appreciate any feedback, if there is any opensource tool you
>>>>> have used for this purpose, or if you are aware of any best practice
>>>>> for doing this task.
>>>>>
>>>>> Best
>>
>> --
>> Péter Király
>> software developer
>> GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
>> http://linkedin.com/in/peterkiraly


--
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
m: +1-510-435-8234
skype: kcoylenet/+1-510-984-3600


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Sergio Letuche
i am afraid we cannot wait...

Thanx again Stefano,

Cheers
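
For the original question in this thread (guessing a Dewey class from text 
found in the record metadata), a do-it-yourself baseline can be sketched in a 
few lines of Python. This is not Mahout and not a tool anyone in the thread 
reported using: it is a plain multinomial naive Bayes classifier with add-one 
smoothing, all function names are invented, and the four training titles are 
made up purely to show the shape of the data. A usable model would need a 
large set of records that already carry trusted Dewey numbers.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase, split on whitespace, keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]


def train(labelled_records):
    """Build a model from (text, dewey_class) pairs."""
    word_counts = defaultdict(Counter)  # class -> word frequencies
    class_counts = Counter()            # class -> number of training records
    for text, label in labelled_records:
        class_counts[label] += 1
        word_counts[label].update(tokenize(text))
    vocabulary = {w for counts in word_counts.values() for w in counts}
    return word_counts, class_counts, vocabulary


def classify(text, model):
    """Return the most probable class for text under multinomial naive Bayes."""
    word_counts, class_counts, vocabulary = model
    total_records = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_records)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            # Add-one smoothing avoids log(0) for words unseen in this class.
            score += math.log(
                (word_counts[label][word] + 1) / (total_words + len(vocabulary))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Made-up training data: titles paired with a three-digit Dewey class.
training = [
    ("introduction to organic chemistry", "540"),
    ("chemistry of polymers", "540"),
    ("history of medieval europe", "940"),
    ("europe in the twentieth century", "940"),
]
model = train(training)
print(classify("a short history of europe", model))
```

Predicting only the three-digit class keeps the number of labels small; finer 
Dewey numbers or UDC codes would need far more training data per class.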

2015-06-12 18:00 GMT+03:00 Stefano Bargioni :

> I found this (a bit strange):
> http://ddc.typepad.com/025431/2015/04/deweyinfo-is-coming-back.html
>
> On 12/giu/2015, at 16:08, Sergio Letuche  wrote:
>
> > dewey.info
> >
> > seems to be dead, we have also checked this.
> >
> > 2015-06-12 16:57 GMT+03:00 Péter Király :
> >
> >> Hi Sergio,
> >>
> >> As part of eXtensible Catalog we developed a Dewey module for Drupal,
> >> which takes a Dewey number and uses OCLC's dewey.info to fetch the
> >> textual description of that number. When it was created the service
> >> contained only 3 levels of the classification system; since then they
> >> have gone further, and now it is deeper.
> >>
> >> You can find the source here:
> >> http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x
> >>
> >> Maybe it helps you.
> >>
> >> Regarding UDC: it is a much harder task, and when I worked with it,
> >> I ran into a blocking problem, which is that UDC was not licensed as
> >> freely usable, and I was not able to get a licence to use it in an
> >> open source project. There were some other problems as well: UDC
> >> changed from time to time, which sometimes means that a given
> >> classification code means one thing at a given point in time and
> >> another thing some years later. The MARC catalog I worked with did not
> >> contain any information about the UDC versions, so the accuracy of the
> >> tool was not guaranteed (of course you can do some intelligent
> >> guessing). And the last problem was that, in contrast to the Dewey
> >> classification, UDC sometimes contains very lengthy descriptions
> >> instead of one or two words. Semantically it is OK, but it makes the UI
> >> design a little bit hard, and if you want to search for the textual
> >> description, you'll sometimes end up with a "noisy" result set.
> >> Otherwise, handling the operators, the subclasses, and all the nice
> >> things UDC provides is a very interesting challenge.
> >>
> >> Cheers,
> >> Péter
> >>
> >>
> >> 2015-06-12 12:59 GMT+02:00 Sergio Letuche :
> >>> thank you very much for your quick reply, dear Stefano,
> >>>
> >>> i appreciate it
> >>>
> >>> 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
> >>>
> >>>> Hi, Sergio:
> >>>> maybe this article [1 abstract] [2 English text] can give you some
> >>>> basic ideas. We added a lot of DDC info in our Koha catalog two years
> >>>> ago.
> >>>> HTH. Stefano
> >>>>
> >>>> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
> >>>> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
> >>>>
> >>>> On 12/giu/2015, at 12:03, Sergio Letuche wrote:
> >>>>
> >>>>> hello community!
> >>>>>
> >>>>> we are facing this challenging issue. We need to complete for a vast
> >>>>> amount of records, the dewey, UDC info, has anyone had any experience
> >>>>> with this? We need some way (via modeling? mahout?) to try and
> >>>>> discover these values, based on some text, found in the records'
> >>>>> metadata, and then auto complete these values.
> >>>>>
> >>>>> I would appreciate any feedback, if there is any opensource tool you
> >>>>> have used for this purpose, or if you are aware of any best practice
> >>>>> for doing this task.
> >>>>>
> >>>>> Best
> >>
> >>
> >>
> >>
> >> --
> >> Péter Király
> >> software developer
> >> GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
> >> http://linkedin.com/in/peterkiraly
> >>
> >
>
>
>


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Stefano Bargioni
I found this (a bit strange):
http://ddc.typepad.com/025431/2015/04/deweyinfo-is-coming-back.html

On 12/giu/2015, at 16:08, Sergio Letuche  wrote:

> dewey.info
> 
> seems to be dead, we have also checked this.
> 
> 2015-06-12 16:57 GMT+03:00 Péter Király :
> 
>> Hi Sergio,
>> 
>> As part of eXtensible Catalog we developed a Dewey module for Drupal,
>> which takes a Dewey number and uses OCLC's dewey.info to fetch the
>> textual description of that number. When it was created the service
>> contained only 3 levels of the classification system; since then they
>> have gone further, and now it is deeper.
>> 
>> You can find the source here:
>> http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x
>> 
>> Maybe it helps you.
>> 
>> Regarding UDC: it is a much harder task, and when I worked with it,
>> I ran into a blocking problem, which is that UDC was not licensed as
>> freely usable, and I was not able to get a licence to use it in an
>> open source project. There were some other problems as well: UDC
>> changed from time to time, which sometimes means that a given
>> classification code means one thing at a given point in time and
>> another thing some years later. The MARC catalog I worked with did not
>> contain any information about the UDC versions, so the accuracy of the
>> tool was not guaranteed (of course you can do some intelligent
>> guessing). And the last problem was that, in contrast to the Dewey
>> classification, UDC sometimes contains very lengthy descriptions
>> instead of one or two words. Semantically it is OK, but it makes the UI
>> design a little bit hard, and if you want to search for the textual
>> description, you'll sometimes end up with a "noisy" result set.
>> Otherwise, handling the operators, the subclasses, and all the nice
>> things UDC provides is a very interesting challenge.
>> 
>> Cheers,
>> Péter
>> 
>> 
>> 2015-06-12 12:59 GMT+02:00 Sergio Letuche :
>>> Thank you very much for your quick reply, dear Stefano,
>>> 
>>> I appreciate it.
>>> 
>>> 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
>>> 
>>>> Hi, Sergio:
>>>> maybe this article [1 abstract] [2 English text] can give you some basic
>>>> ideas. We added a lot of DDC info in our Koha catalog two years ago.
>>>> HTH. Stefano
>>>>
>>>> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
>>>> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
>>>>
>>>> On 12/giu/2015, at 12:03, Sergio Letuche wrote:
>>>>
>>>> > Hello community!
>>>> >
>>>> > We are facing a challenging issue. We need to fill in the Dewey and UDC
>>>> > information for a vast number of records. Has anyone had any experience
>>>> > with this? We need some way (via modeling? Mahout?) to try to discover
>>>> > these values based on text found in the records' metadata, and then
>>>> > auto-complete them.
>>>> >
>>>> > I would appreciate any feedback: whether there is any open source tool you
>>>> > have used for this purpose, or whether you are aware of any best practice
>>>> > for doing this task.
>>>> >
>>>> > Best
>>>> >
 
 
 
>> 
>> 
>> 
>> --
>> Péter Király
>> software developer
>> GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
>> http://linkedin.com/in/peterkiraly
>> 
> 


__
Your 5x1000 to the Patronato di San Girolamo della Carità is a simple gesture
of great value.
Your signature will help priests be closer to the needs of us all.
Help us train priests and seminarians from all 5 continents by entering the
tax code 97023980580 on your tax return.


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Sergio Letuche
dewey.info seems to be dead; we have also checked this.

2015-06-12 16:57 GMT+03:00 Péter Király :

> Hi Sergio,
>
> As part of eXtensible Catalog we developed a Dewey module for Drupal,
> which takes a Dewey number and uses OCLC's dewey.info to fetch the
> textual description of that class. When it was created, the service
> contained only 3 levels of the classification system; since then they
> have extended it, and now it goes deeper.
>
> You can find the source here:
> http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x
>
> Maybe it helps you.
>
> Regarding UDC: it is a much harder task, and when I worked with it
> I ran into a blocking problem, which is that UDC was not licensed as
> freely usable, and I was not able to get a licence to use it in an
> open source project. There were some other problems as well: UDC
> changes from time to time, which sometimes means that a given
> classification code means one thing at a given point in time and
> another thing some years later. The MARC catalog I worked with did not
> contain any information about the UDC versions, so the accuracy of the
> tool was not guaranteed (of course you can do some intelligent
> guessing). And the last problem was that, contrary to the Dewey
> classification, UDC sometimes contains very lengthy descriptions instead
> of one or two words. Semantically that is OK, but it makes the UI design
> a little bit hard, and if you want to search the textual descriptions,
> you'll sometimes end up with a "noisy" result set. Otherwise, handling
> the operators, the subclasses, and all the nice things UDC provides is
> a very interesting challenge.
>
> Cheers,
> Péter
>
>
> 2015-06-12 12:59 GMT+02:00 Sergio Letuche :
> > Thank you very much for your quick reply, dear Stefano,
> >
> > I appreciate it.
> >
> > 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
> >
> >> Hi, Sergio:
> >> maybe this article [1 abstract] [2 English text] can give you some basic
> >> ideas. We added a lot of DDC info in our Koha catalog two years ago.
> >> HTH. Stefano
> >>
> >> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
> >> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
> >>
> >> On 12/giu/2015, at 12:03, Sergio Letuche wrote:
> >>
> >> > Hello community!
> >> >
> >> > We are facing a challenging issue. We need to fill in the Dewey and UDC
> >> > information for a vast number of records. Has anyone had any experience
> >> > with this? We need some way (via modeling? Mahout?) to try to discover
> >> > these values based on text found in the records' metadata, and then
> >> > auto-complete them.
> >> >
> >> > I would appreciate any feedback: whether there is any open source tool you
> >> > have used for this purpose, or whether you are aware of any best practice
> >> > for doing this task.
> >> >
> >> > Best
> >> >
> >>
> >>
> >>
>
>
>
> --
> Péter Király
> software developer
> GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
> http://linkedin.com/in/peterkiraly
>


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Péter Király
Hi Sergio,

As part of eXtensible Catalog we developed a Dewey module for Drupal,
which takes a Dewey number and uses OCLC's dewey.info to fetch the
textual description of that class. When it was created, the service
contained only 3 levels of the classification system; since then they
have extended it, and now it goes deeper.

You can find the source here:
http://cgit.drupalcode.org/xc/tree/xc_dewey/xc_dewey.module?h=7.x-1.x

Maybe it helps you.
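
A minimal sketch (not the xc_dewey module's code) of how a Dewey number can be split into the hierarchy levels that such a lookup could walk through. It assumes the usual DDC notation of three digits, optionally followed by a decimal point and further digits, and it treats the top three levels as main class, division and section. The function name `dewey_ancestors` is hypothetical, and no request to dewey.info is made.

```python
import re

def dewey_ancestors(number):
    """Return the chain of broader classes for a Dewey number, most general first.

    Hypothetical helper for illustration only: "025.431" yields main class
    "000", division "020", section "025", then one more decimal digit per level.
    """
    match = re.fullmatch(r"([0-9]{3})(?:\.([0-9]+))?", number.strip())
    if match is None:
        raise ValueError(f"not a Dewey number: {number!r}")
    whole, decimals = match.group(1), match.group(2) or ""
    # The first three levels: main class, division, section.
    levels = [whole[0] + "00", whole[:2] + "0", whole]
    # Each further decimal digit narrows the class by one level.
    for i in range(1, len(decimals) + 1):
        levels.append(f"{whole}.{decimals[:i]}")
    # Drop repeats, e.g. "500" is at once main class, division and section.
    unique = []
    for level in levels:
        if level not in unique:
            unique.append(level)
    return unique

print(dewey_ancestors("025.431"))
```

Each returned notation could then be looked up in whatever source of captions is available, so that a record shows the broader headings as well as the most specific one.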

Regarding UDC: it is a much harder task, and when I worked with it
I ran into a blocking problem, which is that UDC was not licensed as
freely usable, and I was not able to get a licence to use it in an
open source project. There were some other problems as well: UDC
changes from time to time, which sometimes means that a given
classification code means one thing at a given point in time and
another thing some years later. The MARC catalog I worked with did not
contain any information about the UDC versions, so the accuracy of the
tool was not guaranteed (of course you can do some intelligent
guessing). And the last problem was that, contrary to the Dewey
classification, UDC sometimes contains very lengthy descriptions instead
of one or two words. Semantically that is OK, but it makes the UI design
a little bit hard, and if you want to search the textual descriptions,
you'll sometimes end up with a "noisy" result set. Otherwise, handling
the operators, the subclasses, and all the nice things UDC provides is
a very interesting challenge.

Cheers,
Péter


2015-06-12 12:59 GMT+02:00 Sergio Letuche :
> Thank you very much for your quick reply, dear Stefano,
>
> I appreciate it.
>
> 2015-06-12 13:47 GMT+03:00 Stefano Bargioni :
>
>> Hi, Sergio:
>> maybe this article [1 abstract] [2 English text] can give you some basic
>> ideas. We added a lot of DDC info in our Koha catalog two years ago.
>> HTH. Stefano
>>
>> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
>> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
>>
>> On 12/giu/2015, at 12:03, Sergio Letuche  wrote:
>>
>> > Hello community!
>> >
>> > We are facing a challenging issue. We need to fill in the Dewey and UDC
>> > information for a vast number of records. Has anyone had any experience
>> > with this? We need some way (via modeling? Mahout?) to try to discover
>> > these values based on text found in the records' metadata, and then
>> > auto-complete them.
>> >
>> > I would appreciate any feedback: whether there is any open source tool you
>> > have used for this purpose, or whether you are aware of any best practice
>> > for doing this task.
>> >
>> > Best
>> >
>>
>>
>>



-- 
Péter Király
software developer
GWDG, Göttingen - Europeana - eXtensible Catalog - The Code4Lib Journal
http://linkedin.com/in/peterkiraly


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Sergio Letuche
Thank you very much for your quick reply, dear Stefano,

I appreciate it.

2015-06-12 13:47 GMT+03:00 Stefano Bargioni :

> Hi, Sergio:
> maybe this article [1 abstract] [2 English text] can give you some basic
> ideas. We added a lot of DDC info in our Koha catalog two years ago.
> HTH. Stefano
>
> [1] http://leo.cineca.it/index.php/jlis/article/view/8766
> [2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060
>
> On 12/giu/2015, at 12:03, Sergio Letuche  wrote:
>
> > Hello community!
> >
> > We are facing a challenging issue. We need to fill in the Dewey and UDC
> > information for a vast number of records. Has anyone had any experience
> > with this? We need some way (via modeling? Mahout?) to try to discover
> > these values based on text found in the records' metadata, and then
> > auto-complete them.
> >
> > I would appreciate any feedback: whether there is any open source tool you
> > have used for this purpose, or whether you are aware of any best practice
> > for doing this task.
> >
> > Best
> >
>
>
>


Re: [CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Stefano Bargioni
Hi, Sergio:
maybe this article [1 abstract] [2 English text] can give you some basic ideas. 
We added a lot of DDC info in our Koha catalog two years ago.
HTH. Stefano

[1] http://leo.cineca.it/index.php/jlis/article/view/8766
[2] http://leo.cineca.it/index.php/jlis/article/view/8766/8060

On 12/giu/2015, at 12:03, Sergio Letuche  wrote:

> Hello community!
> 
> We are facing a challenging issue. We need to fill in the Dewey and UDC
> information for a vast number of records. Has anyone had any experience
> with this? We need some way (via modeling? Mahout?) to try to discover
> these values based on text found in the records' metadata, and then
> auto-complete them.
> 
> I would appreciate any feedback: whether there is any open source tool you
> have used for this purpose, or whether you are aware of any best practice
> for doing this task.
> 
> Best
> 




[CODE4LIB] Auto discovery of Dewey, UDC

2015-06-12 Thread Sergio Letuche
Hello community!

We are facing a challenging issue. We need to fill in the Dewey and UDC
information for a vast number of records. Has anyone had any experience
with this? We need some way (via modeling? Mahout?) to try to discover
these values based on text found in the records' metadata, and then
auto-complete them.

I would appreciate any feedback: whether there is any open source tool you
have used for this purpose, or whether you are aware of any best practice
for doing this task.

Best
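
One way to read the "modeling" idea in the question is supervised text classification: train on records that already carry a Dewey class, then predict a class for records that lack one. The sketch below is a minimal multinomial naive Bayes classifier in plain Python, offered as an illustration under stated assumptions and not as a method anyone in the thread used. The training titles, the class labels and the names `tokenize`, `train` and `predict` are all made up for the example; a real project would need far more data and would probably rely on an existing machine learning library.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(examples):
    """Count tokens per class from (text, dewey_class) pairs."""
    class_counts = Counter()
    token_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        class_counts[label] += 1
        for token in tokenize(text):
            token_counts[label][token] += 1
            vocabulary.add(token)
    return class_counts, token_counts, vocabulary

def predict(model, text):
    """Return the class with the highest log-probability for the text."""
    class_counts, token_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior: share of training records in this class.
        score = math.log(class_counts[label] / total_docs)
        total_tokens = sum(token_counts[label].values())
        for token in tokenize(text):
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            count = token_counts[label][token] + 1
            score += math.log(count / (total_tokens + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up training records: (title text, top-level Dewey class).
EXAMPLES = [
    ("introduction to algebra and geometry", "500"),
    ("organic chemistry laboratory manual", "500"),
    ("history of medieval europe", "900"),
    ("a history of the roman empire", "900"),
]

model = train(EXAMPLES)
print(predict(model, "roman history"))          # prints 900
print(predict(model, "chemistry and algebra"))  # prints 500
```

Predicting only the top-level class keeps the toy example small; the same counting scheme applies to finer classes, at the cost of needing many more labelled records per class.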