Hi Sigurd,

On 06/03/13 18:05, Sigurd Gran-Jansen wrote:
Hi,
We are working with Invenio and have some initial questions about setting it up correctly.
1) How do we install different modules (or are all the modules pre-installed from the Invenio installation package)?
All modules are installed when you run 'make install'. However, some extra (mostly optional) external dependencies are installed via additional "make install-<name>-plugins" targets. The following is from http://invenio-software.org/repo/invenio/tree/INSTALL?h=next:
1a. Installation
----------------
$ cd invenio
$ ./configure
$ make
$ make install
$ make install-bootstrap
$ make install-mathjax-plugin ## optional
$ make install-jquery-plugins ## optional
$ make install-jquery-tokeninput ## optional
$ make install-plupload-plugin ## optional
$ make install-ckeditor-plugin ## optional
$ make install-pdfa-helper-files ## optional
$ make install-mediaelement ## optional
$ make install-solrutils ## optional
$ make install-js-test-driver ## optional
Once installed, you need to configure Invenio, create the database tables,
etc. All details are in "1. Quick instructions for the impatient Invenio
admin" in the link above.
2) Which module harvests metadata from a document (based on how often the word is mentioned, headings, etc.)? And how do we make this work?
Please see http://invenio-demo-next.cern.ch/help/hacking/modules-overview for a general Invenio overview to understand how the different modules are connected. It's crucial that you understand the underlying data model, MARC, which you can read about here: http://invenio-demo-next.cern.ch/help/admin/howto-marc. Here's a quick example of how data is stored:
245__ $$aThis is some title <---- title stored in field 245, subfield a
260__ $$c1998-06-28 <---- publication date stored in field 260, subfield c
520__ $$aConference "Internet, Web, What's next?" on 26 June 1998 at CERN : Tim Berners-Lee, inventor of the World-Wide Web and Director of the W3C, explains how the Web came to be and give his views on the future. <---- description/abstract stored in field 520, subfield a
980__ $$aPICTURE <---- collection name stored in field 980, subfield a
Invenio comes with default assumptions of where titles, authors, etc. are stored. This is all documented in the How To MARC guide above.
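To make the tag/indicator/subfield layout above a bit more concrete, here is a minimal Python sketch that splits one such textual line into its parts. It is not part of Invenio; the function name parse_marc_line and the exact line shape (three-character tag, two indicator characters with "_" for blank, then "$$" followed by a one-character subfield code) are assumptions taken only from the example lines above.

```python
import re

# Assumed line shape, based only on the example above: three-character tag,
# two indicator characters ("_" standing for blank), whitespace, then one or
# more "$$<code><value>" subfields.
LINE_RE = re.compile(r"^(?P<tag>\w{3})(?P<ind1>.)(?P<ind2>.)\s+(?P<rest>\$\$.*)$")


def parse_marc_line(line):
    """Split one textual MARC line into tag, indicators and subfields."""
    match = LINE_RE.match(line.strip())
    if match is None:
        raise ValueError("not a MARC line: %r" % line)
    subfields = []
    # Every chunk after a "$$" starts with the one-character subfield code.
    for chunk in match.group("rest").split("$$")[1:]:
        subfields.append((chunk[0], chunk[1:].strip()))
    return {
        "tag": match.group("tag"),
        "ind1": match.group("ind1"),
        "ind2": match.group("ind2"),
        "subfields": subfields,
    }


for text in ("245__ $$aThis is some title", "260__ $$c1998-06-28", "980__ $$aPICTURE"):
    field = parse_marc_line(text)
    print(field["tag"], field["subfields"])
```

The point is only that a field is addressed by tag plus subfield code (e.g. 245 a for the title), which is how the How To MARC guide refers to them.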
Back to the question - the general way to get content into Invenio is:
Write MARCXML file to myfirstmarcfile.xml e.g.:
====
<record>
<controlfield tag="001"></controlfield>
<datafield tag="024" ind1="7" ind2=" ">
<subfield code="a">10.1234/some-preprint</subfield>
<subfield code="2">DOI</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Test, Name</subfield>
<subfield code="u">Test Affiliation</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">This is my preprint</subfield>
</datafield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2013-01-01</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">Test Description
Test Line</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">Keyword 1</subfield>
</datafield>
<datafield tag="653" ind1="1" ind2=" ">
<subfield code="a">Keyword 2</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">publication</subfield>
</datafield>
<datafield tag="FFT" ind1=" " ind2=" ">
<subfield code="a">/path/to/file/OpenAIRE/399/fFn7be/files/8l2Phf/zenodo-web2.png</subfield>
<subfield code="r"></subfield>
</datafield>
</record>
====
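If you would rather generate such a file than write it by hand, a rough Python sketch using only the standard library could look like the following. The helper names (datafield, build_record) and the choice of fields are illustrative assumptions, not an Invenio API; the tags and subfield codes are the ones used in the record above.

```python
import xml.etree.ElementTree as ET


def datafield(record, tag, subfields, ind1=" ", ind2=" "):
    """Append a <datafield> with the given (code, value) subfields to record."""
    field = ET.SubElement(record, "datafield", {"tag": tag, "ind1": ind1, "ind2": ind2})
    for code, value in subfields:
        ET.SubElement(field, "subfield", {"code": code}).text = value
    return field


def build_record(title, author, date, collection, fulltext_path=None):
    """Build a <record> element using the same tags as the example above."""
    record = ET.Element("record")
    datafield(record, "100", [("a", author)])
    datafield(record, "245", [("a", title)])
    datafield(record, "260", [("c", date)])
    datafield(record, "980", [("a", collection)])
    if fulltext_path is not None:
        # FFT carries the path of the file to attach (see the FFT docs).
        datafield(record, "FFT", [("a", fulltext_path)])
    return record


record = build_record("This is my preprint", "Test, Name", "2013-01-01", "publication")
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

ElementTree takes care of escaping characters such as & and < in titles and abstracts, which is the main reason to prefer it over string concatenation.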
Run bibupload
/opt/invenio/bin/bibupload -i /path/to/myfirstmarcfile.xml
This creates a task in the background job scheduler (BibSched) which you
now have to run:
/opt/invenio/bin/bibsched
Find the "bibupload" task and press "R" on it to run it. This will upload the record. The record above contains a special tag, FFT, which is documented in http://invenio-demo-next.cern.ch/help/admin/howto-marc as well as in http://invenio-demo-next.cern.ch/help/admin/bibupload-admin-guide (see 3.6 Uploading Fulltext Files).
Now, after the record has been uploaded, it needs to be indexed, ranked, etc. (see http://invenio-demo-next.cern.ch/help/admin/howto-run):
$ bibindex -f50000 -s5m
$ bibreformat -oHB -s5m
$ webcoll -v0 -s5m
$ bibrank -f50000 -s5m
$ bibsort -s5m
I'm not sure if I answered your question - but from the document you upload, only the metadata is indexed and searchable. Invenio has add-ons with SOLR to also search the text inside the PDF, but to get started I think it's better to leave that for later, until you understand the above.
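As a rough sketch, the five post-upload commands listed above could be wrapped in a small Python helper so they are always issued in the same order. The helper name schedule_tasks and the dry_run switch are my own assumptions, and the default bin_dir is only guessed from the bibupload path used earlier in this mail; the command names and flags are copied verbatim from the list.

```python
import subprocess

# Commands and flags copied from the list above; nothing is assumed here
# about what each flag means.
POST_UPLOAD_TASKS = [
    ["bibindex", "-f50000", "-s5m"],
    ["bibreformat", "-oHB", "-s5m"],
    ["webcoll", "-v0", "-s5m"],
    ["bibrank", "-f50000", "-s5m"],
    ["bibsort", "-s5m"],
]


def schedule_tasks(bin_dir="/opt/invenio/bin", dry_run=True):
    """Return the full command lines; also execute them unless dry_run is set."""
    prefix = bin_dir.rstrip("/") + "/"
    commands = [[prefix + argv[0]] + argv[1:] for argv in POST_UPLOAD_TASKS]
    if not dry_run:
        for argv in commands:
            # check_call raises CalledProcessError if a command exits non-zero.
            subprocess.check_call(argv)
    return commands


# Dry run: only print what would be executed.
for argv in schedule_tasks():
    print(" ".join(argv))
```

With dry_run left at its default nothing is executed, so it is safe to try on a machine without Invenio installed.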
3) How can we upload photos and videos to the repo?
As described above, just change the FFT tag to point to your file.
4) We have some problems figuring out how invenio and the apache server communicate. Is invenio an API that the apache server accesses, or is it the other way around, that Invenio hosts the website?
Invenio is a standard Python WSGI application. When you run 'make install', Invenio writes a file '/opt/invenio/var/www-wsgi/invenio.wsgi', which has a variable "application". This is the entry point for Apache. To see how WSGI works, please see https://code.google.com/p/modwsgi/wiki/QuickInstallationGuide and https://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide. Nearly all modern Python web applications use WSGI, so if you work with Flask, Django, etc., they all work by creating a WSGI application.
From the invenio.wsgi file you can track down how a request is processed, but to get started you just need to know that the Apache<->Invenio interface is via invenio.wsgi.
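To illustrate what "a variable called application" means in WSGI terms, here is a generic minimal WSGI callable in Python, together with a small helper that calls it the way a WSGI server would. This is not the content of Invenio's invenio.wsgi; the response text and the call_app helper are purely illustrative assumptions.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable; 'application' is the name mod_wsgi looks for by default."""
    body = ("Hello from %s\n" % environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(path):
    """Invoke the callable directly, without Apache, and collect the response."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the remaining required WSGI keys
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(application(environ, start_response))
    return captured["status"], body


print(call_app("/record/1"))
```

Apache with mod_wsgi plays the role of call_app here: for each HTTP request it builds the environ dictionary, calls the application object, and sends the returned body back to the browser.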
Hope this helps :-)
Cheers,
Lars
Regards, Sigurd Gran-Jansen NTNU School of Entrepreneurship Tlf: +47 48135952
-- Lars Holm Nielsen Software Engineer CERN, IT Department, Digital Library Technology Section Office 513/1-014 Tel: +41 22 76 79182 Cel: +41 76 672 8927