On 08.11.2012 09:32, Vesa Sivunen wrote:
Hi!
I would be very interested in seeing the script. The reason
being exactly "...no one really wanted to do this by mouse and
keyboard... ;)".
The second, and IMHO much more important, reason is that you never get it
consistent if you do it by hand on more than one instance, even
if you use diligent librarians for the setup ;)
For the sake of simplicity I've included the real-world configs from JuSER,
which is the Jülich instance of the ongoing joint project of DESY, GSI,
Jülich and RWTH Aachen.
CreateCollections.py is the main script; just call it with its input in
the same directory. We use _2_ config files. One is specific to a
given instance. Its name is derived in the usual Invenio style by
evaluating CFG_WEBSTYLE_TEMPLATE_SKIN. Our local instance uses
CFG_WEBSTYLE_TEMPLATE_SKIN = fzj
so our local, instance-specific file is Collectionlist_fzj.txt. The
second one, Collectionlist.txt, is a "generic" list we share between all
instances.
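As a minimal sketch of that naming rule: the helper name config_filenames is made up for illustration, and the skin is passed in as a plain string instead of being read from invenio.config, so the snippet runs without an Invenio installation.

```python
def config_filenames(skin, base='Collectionlist', ext='.txt'):
    """Return (specific, generic) file names in the order they are loaded."""
    generic = base + ext
    specific = base + '_' + skin + ext
    return specific, generic


# In a real instance the skin would be CFG_WEBSTYLE_TEMPLATE_SKIN;
# 'fzj' is hard-coded here only as an example value.
specific_name, generic_name = config_filenames('fzj')
```

Another instance would only need its own skin name here; the generic file name stays the same.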
Note that the _fzj file is processed first, as it sets up some parent
collections for the generic ones. For us these are mainly the collections
Workflow, Documenttypes, Authorities, InstColl and FullTexts, which should
be first-level children of our main instance, which is called JuSER. This
explains the first (and subsequent) lines in this file:
Place a collection named internally "Workflow" as a _v_irtual child of
JuSER, and name it "Workflow collections" in English (@en) and "Workflow
collections" in German (@de).
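The ordering constraint (a parent collection has to exist before a child is attached to it) can be checked before anything is written. The following is only a sketch: missing_parents is a hypothetical helper that is not part of CreateCollections.py, and it works on plain (name, parent) pairs instead of talking to Invenio.

```python
def missing_parents(rows, existing):
    """Return (child, parent) pairs whose parent is not known yet.

    rows     -- iterable of (name, parent) pairs in file order
    existing -- names of collections that already exist (e.g. the site root)
    """
    known = set(existing)
    problems = []
    for name, parent in rows:
        if parent not in known:
            problems.append((name, parent))
        # After this row the collection itself can serve as a parent.
        known.add(name)
    return problems


# Tiny excerpts of the two lists, reduced to (name, parent).
specific = [('Workflow', 'JuSER'), ('Documenttypes', 'JuSER')]
generic = [('Theses', 'Documenttypes'), ('PhDThesis', 'Theses')]

ok = missing_parents(specific + generic, ['JuSER'])
wrong_order = missing_parents(generic + specific, ['JuSER'])
```

With the instance-specific rows first, nothing is reported; with the order reversed, Theses is flagged because Documenttypes does not exist yet.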
For Documenttypes you see the different names, "Document types"
and "Dokumenttypen". Note also that none of the collections in the _fzj file
uses a collection query, except the special case "FullTexts". Therefore
you have _2_ tab chars after the internal name. (:set list in vi shows
them, or use some spreadsheet app like gnumeric.)
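To illustrate the two-tab case: the sketch below feeds two sample rows through Python's csv module with a tab delimiter, the same way the script reads its files. The helper names read_rows and show_tabs are invented for this example, and it is written for a current Python 3, not the Python 2 that the script itself targets.

```python
import csv
import io

# Two sample rows; "\t" stands for a real tab character in the file.
sample = (
    'Workflow\t\tv\tJuSER\tWorkflow collections@en\tWorkflow collections@de\n'
    'FullTexts\tcollection:"JUWEL"\tr\tJuSER\tJUWEL@en\tJUWEL@de\n'
)


def read_rows(text):
    """Parse tab-separated text into a list of field lists."""
    return list(csv.reader(io.StringIO(text), delimiter='\t'))


def show_tabs(line):
    """Make tab characters visible, similar in spirit to ':set list' in vi."""
    return line.replace('\t', '<tab>')


rows = read_rows(sample)
empty_query = rows[0][1]   # two consecutive tabs give an empty query field
juwel_query = rows[1][1]   # quotes inside a field are kept as they are
```

The empty string in the second column is what ends up as the (missing) collection query for Workflow.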
In Collectionlist.txt you see all the more or less regular collections we
use. Same syntax:
Internalname <tab> collection query <tab> r|v <tab> parent <tab> translations
You can specify whatever languages you want; just append the Invenio-internal
language code, preceded by an @, to each translation. (So you can have more
columns than the files show.)
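A rough sketch of how one such row breaks down into its parts. parse_row and align_translations are hypothetical names, not functions from the script; the site languages are given as plain codes here, whereas the script uses whatever get_languages() returns. Unlike the script, which splits on every @, this version splits at the last @ only.

```python
def parse_row(fields):
    """Split one tab-separated row (already a list) into named parts."""
    name, query, kind, parent = fields[:4]
    translations = {}
    for field in fields[4:]:
        # rpartition splits at the last '@': "Theses@en" -> ("Theses", "@", "en")
        text, sep, lang = field.rpartition('@')
        if sep:  # ignore columns without a language marker
            translations[lang] = text
    return name, query, kind, parent, translations


def align_translations(translations, site_langs):
    """Order translations like site_langs, using '' where one is missing."""
    return [translations.get(code, '') for code in site_langs]


row = ['Theses', '', 'r', 'Documenttypes', 'Theses@en', 'Hochschulschriften@de']
name, query, kind, parent, trans = parse_row(row)
ordered = align_translations(trans, ['de', 'en', 'fr'])
```

A language that is configured on the site but missing in the file (fr in this example) simply gets an empty name.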
If you check out CreateCollections.py in more detail you'll see that it
doesn't deserve the "rocket science" label. It's actually pretty simple,
straightforward calls against Invenio's high-level API. In a way it
mimics exactly what would happen if you hit a mouse button here and the
keyboard there in the web frontend.
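For a dry run, the per-row sequence of calls can be listed without touching Invenio at all. This is only a sketch under assumptions: plan_calls is a made-up helper, the tuples merely name the functions the script imports (delete_col, add_col, add_col_dad_son), and collection names are used where the script passes IDs obtained from get_colID.

```python
def plan_calls(rows):
    """Return the admin calls the rows would trigger, as plain tuples.

    rows -- iterable of (name, query, kind, parent) tuples in file order
    """
    calls = []
    for name, query, kind, parent in rows:
        # Drop an existing collection first, then recreate and attach it.
        calls.append(('delete_col', name))
        calls.append(('add_col', name, query))
        calls.append(('add_col_dad_son', parent, name, kind))
    return calls


plan = plan_calls([('Workflow', '', 'v', 'JuSER')])
```

Printing such a plan for both files before the real run is a cheap way to spot a wrong column.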
HTH :)
--
Kind regards,
Alexander Wagner
Subject Specialist
Central Library
52425 Juelich
mail : [email protected]
phone: +49 2461 61-1586
Fax : +49 2461 61-6103
www.fz-juelich.de/zb/DE/zb-fi
#!/usr/bin/env python
#
##
## This file is part of Invenio.
## Copyright (C) 2011, HGF
##
## CDS Invenio is free software; you can redistribute it and/or
## modify it under the terms of the GNU General Public License as
## published by the Free Software Foundation; either version 2 of the
## License, or (at your option) any later version.
##
## CDS Invenio is distributed in the hope that it will be useful, but
## WITHOUT ANY WARRANTY; without even the implied warranty of
## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
## General Public License for more details.
##
## You should have received a copy of the GNU General Public License
## along with CDS Invenio; if not, write to the Free Software Foundation, Inc.,
## 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA.
import sys
import re
import csv
from invenio.config import CFG_SITE_NAME, CFG_WEBSTYLE_TEMPLATE_SKIN
from invenio.websearchadminlib import \
    add_col, \
    add_col_dad_son, \
    delete_col
from invenio.search_engine import \
    get_colID
from invenio.bibrankadminlib import \
    get_languages, \
    modify_translations

# Input files: the generic list shared by all instances and the
# instance-specific list named after the template skin.
base = 'Collectionlist'
ext = '.txt'
generic = base + ext
specific = base + '_' + CFG_WEBSTYLE_TEMPLATE_SKIN + ext
print generic
print specific
sitelangs = get_languages()

# Instance-specific file first: it creates the parent collections that
# the generic list refers to.
CollectionReader = csv.reader(open(specific, 'rb'), delimiter='\t')
for row in CollectionReader:
    data = []
    for col in row:
        data.append(col)
    delete_col(get_colID(data[0]))
    add_col(data[0], data[1])
    add_col_dad_son(get_colID(data[3]), get_colID(data[0]), data[2])
    transdict = {}
    translist = []
    for i in range(4, len(data)):
        lang = re.split('@', data[i])
        # Skip fields that carry no @language marker.
        if len(lang) > 1:
            transdict[lang[1]] = lang[0]
    for lang in sitelangs:
        try:
            translist.append(transdict[lang[0]])
        except KeyError:
            translist.append('')
    modify_translations(get_colID(data[0]), sitelangs, 'ln', translist,
                        "collection")

# Generic file, shared between all instances.
CollectionReader = csv.reader(open(generic, 'rb'), delimiter='\t')
for row in CollectionReader:
    data = []
    for col in row:
        data.append(col)
    ## print "Collection: ", data[0]
    ## print "Query : ", data[1]
    ## print "r/v : ", data[2]
    ## print "dad : ", data[3]
    ## print "en : ", data[4]
    ## print "de : ", data[5]
    ## there could be more translations at the end of the line

    # Drop all collections first before recreating them. Just in case
    # they exist.
    delete_col(get_colID(data[0]))

    # Add the collections and handle subordination
    add_col(data[0], data[1])
    add_col_dad_son(get_colID(data[3]), get_colID(data[0]), data[2])

    # Extract available translations. Languages are marked as @en etc.
    # Build up a hash using language as key so they can easily be sorted
    # into an array for the modify_translations call.
    transdict = {}
    translist = []

    # Build a dictionary of all available translations
    for i in range(4, len(data)):
        lang = re.split('@', data[i])
        # Skip fields that carry no @language marker.
        if len(lang) > 1:
            transdict[lang[1]] = lang[0]

    # Try to add the translated value, add '' if no proper translation
    # is found
    for lang in sitelangs:
        try:
            translist.append(transdict[lang[0]])
        except KeyError:
            translist.append('')
    modify_translations(get_colID(data[0]), sitelangs, 'ln', translist,
                        "collection")
HGVVOC	collection:"HGFVOC"	r	Authorities	Controlled vocabulary@en	Kontrolliertes Vokabular@de
StatID	collection:"StatID"	r	Authorities	Statistics keys@en	Statistikschlüssel@de
PubTypes	collection:"PUB"	r	Authorities	Publication types@en	Publikationsformen@de
Periodicals	collection:"PERI"	r	Authorities	Periodicals@en	Periodika@de
People	collection:"P"	r	Authorities	People@en	Personen@de
Institutes	collection:"I"	r	Authorities	Institutes@en	Institute@de
Institution	collection:"Institution"	r	Authorities	Institutions@en	Institutionen@de
Grants	collection:"G"	r	Authorities	Grants@en	Projekte@de
Unpublished		r	Documenttypes	Unpublished@en	Unpubliziertes@de
Theses		r	Documenttypes	Theses@en	Hochschulschriften@de
Reports		r	Documenttypes	Reports@en	Berichte@de
Presentations		r	Documenttypes	Presentations@en	Präsentationen@de
Patents		r	Documenttypes	Patents@en	Patente@de
Other Resources		r	Documenttypes	Other Resources@en	Andere@de
Events		r	Documenttypes	Events@en	Ereignisse@de
Books		r	Documenttypes	Books@en	Bücher@de
Articles		r	Documenttypes	Articles@en	Aufsätze@de
EDITOR	collection:"EDITOR"	r	Workflow	In process@en	In Bearbeitung@de
LIBRARY	collection:"LIBRARY"	r	Workflow	At library@en	Bibliotheksprüfung@de
MAIL	collection:"MAIL"	r	Workflow	Mail to editor@en	Sachbearbeiter benachrichtigt@de
TEMPENTRY	collection:"TEMPENTRY"	r	Workflow	Temporary Entries@en	Temporäre Einträge@de
USER	collection:"USER"	r	Workflow	User submitted records@en	Eingereichte Dokumente@de
VDB	collection:"VDB"	r	Workflow	Publications database@en	Publikationsdatenbank@de
VDBINPRINT	collection:"VDBINPRINT"	r	Workflow	Documents in print@en	Im Druck@de
VDBRELEVANT	collection:"VDBRELEVANT"	r	Workflow	Relevant for Publication database@en	Für Publikationsdatenbank relevant@de
MIGRATION	collection:"MIGRATION"	r	Workflow	Migrated datasets (backup)@en	Migrierte Datensätze (Backup)@de
MASSMEDIA	collection:"MASSMEDIA"	r	Workflow	In the media@en	In den Medien@de
UNRESTRICTED	collection:"UNRESTRICTED"	r	Workflow	Public records@en	Öffentliche Einträge@de
Project	collection:"UNRESTRICTED" and collection:"project"	r	Unpublished	Projects@en	Projekte@de
Notes	collection:"UNRESTRICTED" and collection:"notes"	r	Unpublished	Notes@en	Notizen@de
News	collection:"UNRESTRICTED" and collection:"news"	r	Unpublished	News@en	Nachrichten@de
FormTemplate	collection:"UNRESTRICTED" and collection:"formtmp"	r	Unpublished	Forms / Templates@en	Formulare / Vorlagen@de
Communication	collection:"UNRESTRICTED" and collection:"comm"	r	Unpublished	Communication@en	Mitteilung@de
Staatsexamen	collection:"UNRESTRICTED" and collection:"exam"	r	Theses	Staatsexamen@en	Staatsexamen@de
PostdoctoralThesis	collection:"UNRESTRICTED" and collection:"habil"	r	Theses	Postdoctoral Theses@en	Habilitationen@de
PhDThesis	collection:"UNRESTRICTED" and collection:"phd"	r	Theses	Ph.D. Theses@en	Doktorarbeiten@de
MasterThesis	collection:"UNRESTRICTED" and collection:"master"	r	Theses	Master Theses@en	Masterarbeiten@de
Magisterarbeit	collection:"UNRESTRICTED" and collection:"magister"	r	Theses	Magisterarbeit@en	Magisterarbeiten@de
DiplomaThesis	collection:"UNRESTRICTED" and collection:"diploma"	r	Theses	Diploma Theses@en	Diplomarbeiten@de
Coursework	collection:"UNRESTRICTED" and collection:"course"	r	Theses	Course works@en	Kursarbeiten@de
BachelorThesis	collection:"UNRESTRICTED" and collection:"bachelor"	r	Theses	Bachelor Theses@en	Bachelorarbeiten@de
Report	collection:"UNRESTRICTED" and collection:"report"	r	Reports	Reports@en	Berichte@de
Preprint	collection:"UNRESTRICTED" and collection:"preprint"	r	Reports	Preprints@en	Vorabdrucke@de
Minutes	collection:"UNRESTRICTED" and collection:"minutes"	r	Reports	Minutes@en	Protokolle@de
InternalReport	collection:"UNRESTRICTED" and collection:"intrep"	r	Reports	Internal Reports@en	Interne Berichte@de
Talknon-conference	collection:"UNRESTRICTED" and collection:"talk"	r	Presentations	Talks (non-conference)@en	Vorträge (nicht Konferenz)@de
Poster	collection:"UNRESTRICTED" and collection:"poster"	r	Presentations	Poster@en	Poster@de
Lecture	collection:"UNRESTRICTED" and collection:"lecture"	r	Presentations	Lectures@en	Vorlesungen@de
ConferencePresentation	collection:"UNRESTRICTED" and collection:"conf"	r	Presentations	Conference Presentations@en	Konferenzvorträge@de
Abstract	collection:"UNRESTRICTED" and collection:"abstract"	r	Presentations	Abstracts@en	Zusammenfassungen@de
Patent	collection:"UNRESTRICTED" and collection:"patent"	r	Patents	Patents@en	Patente@de
Software	collection:"UNRESTRICTED" and collection:"sware"	r	Other Resources	Software@en	Software@de
PhysicalObject	collection:"UNRESTRICTED" and collection:"physobj"	r	Other Resources	Physical Objects@en	Physikalische Objekte@de
Multimedia	collection:"UNRESTRICTED" and collection:"media"	r	Other Resources	Multimedia@en	Multimedia@de
Images	collection:"UNRESTRICTED" and collection:"images"	r	Other Resources	Images@en	Bilder@de
Dataset	collection:"UNRESTRICTED" and collection:"dataset"	r	Other Resources	Datasets@en	Datensätze@de
Contribution2proceeding	collection:"UNRESTRICTED" and collection:"contrib"	r	Events	Contributions to a conference proceeding@en	Beiträge zu Proceedings@de
Conferences	collection:"UNRESTRICTED" and collection:"Conferences"	r	Events	Conferences@en	Konferenzen@de
ConferenceEvent	collection:"UNRESTRICTED" and collection:"ConferenceEvent"	r	Events	Conferences / Events@en	Konferenzen / Veranstaltungen@de
Reference	collection:"UNRESTRICTED" and collection:"refs"	r	Books	Reference@en	Referenzen@de
Proceedings	collection:"UNRESTRICTED" and collection:"proc"	r	Books	Proceedings@en	Proceedings@de
Contribution2book	collection:"UNRESTRICTED" and collection:"contb"	r	Books	Contribution to a book@en	Buchbeitrag@de
Book	collection:"UNRESTRICTED" and collection:"book"	r	Books	Books@en	Bücher@de
JournalArticle	collection:"UNRESTRICTED" and collection:"journal"	r	Articles	Journal Article@en	Zeitschriftenaufsätze@de
Workflow		v	JuSER	Workflow collections@en	Workflow collections@de
Documenttypes		v	JuSER	Document types@en	Dokumenttypen@de
Authorities		r	JuSER	Authorities@en	Normsätze@de
InstColl		r	JuSER	Institute Collections@en	Institutssammlungen@de
FullTexts	collection:"JUWEL"	r	JuSER	JUWEL@en	JUWEL@de