Hi Sigurd,

On 06/03/13 18:05, Sigurd Gran-Jansen wrote:
Hi,

We are working with Invenio, and have some initial questions about setting it up correctly.

1) How do we install different modules (or are all the modules pre-installed from the Invenio installation package)?

All modules are installed when you run 'make install'. However, some extra (mostly optional) external dependencies are installed via additional "make install-<name>-plugins" targets. The following is from http://invenio-software.org/repo/invenio/tree/INSTALL?h=next:

1a. Installation
----------------
      $ cd invenio
      $ ./configure
      $ make
      $ make install
      $ make install-bootstrap
      $ make install-mathjax-plugin    ## optional
      $ make install-jquery-plugins    ## optional
      $ make install-jquery-tokeninput ## optional
      $ make install-plupload-plugin   ## optional
      $ make install-ckeditor-plugin   ## optional
      $ make install-pdfa-helper-files ## optional
      $ make install-mediaelement      ## optional
      $ make install-solrutils         ## optional
      $ make install-js-test-driver    ## optional

Once installed, you need to configure Invenio, create the database tables, etc. All details are in "1. Quick instructions for the impatient Invenio admin" at the link above.

2) Which module harvests metadata from a document (based on how often a word is mentioned, headings, etc.)? And how do we make this work?
Please see http://invenio-demo-next.cern.ch/help/hacking/modules-overview for a general Invenio overview to understand how different modules are connected. It's crucial that you understand the underlying data model, MARC, which you can read about here http://invenio-demo-next.cern.ch/help/admin/howto-marc. Here's a quick example of how data is stored:

245__ $$aThis is some title   <----title stored in field 245, subfield a
260__ $$c1998-06-28   <----publication date stored in field 260, subfield c
520__ $$aConference "Internet, Web, What's next?" on 26 June 1998 at CERN : Tim Berners-Lee, inventor of the World-Wide Web and Director of the W3C, explains how the Web came to be and give his views on the future. <----- description/abstract stored in field 520, subfield a
980__ $$aPICTURE <---- collection name stored in field 980, subfield a

Invenio comes with default assumptions about where titles, authors, etc. are stored. This is all documented in the How To MARC guide above.
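
To make the notation above concrete, here is a small Python sketch that splits such a line into tag, indicators and subfields. It is only an illustration: parse_marc_line is a made-up helper, not an Invenio function, and it assumes the layout shown above (3-character tag, two indicator characters with "_" meaning blank, then "$$"-prefixed subfields whose values do not themselves contain "$$").

```python
def parse_marc_line(line):
    """Split one text-MARC line such as '245__ $$aSome title' into its parts.

    Hypothetical helper for illustration only; not part of Invenio.
    """
    tag = line[0:3]
    # An underscore in the indicator positions stands for a blank indicator.
    ind1 = " " if line[3] == "_" else line[3]
    ind2 = " " if line[4] == "_" else line[4]
    subfields = []
    # Everything after the indicators is a run of '$$<code><value>' chunks.
    for chunk in line[5:].split("$$")[1:]:
        if chunk:
            subfields.append((chunk[0], chunk[1:].strip()))
    return {"tag": tag, "ind1": ind1, "ind2": ind2, "subfields": subfields}


if __name__ == "__main__":
    print(parse_marc_line("245__ $$aThis is some title"))
    print(parse_marc_line("260__ $$c1998-06-28"))
```

The (code, value) pairs it returns map directly onto the subfield elements in the MARCXML format used for uploading.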

Back to the question - the general way to get content into Invenio is:

Write a MARCXML file, e.g. myfirstmarcfile.xml:
====
<record>
  <controlfield tag="001"></controlfield>
  <datafield tag="024" ind1="7" ind2=" ">
    <subfield code="a">10.1234/some-preprint</subfield>
    <subfield code="2">DOI</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Test, Name</subfield>
    <subfield code="u">Test Affiliation</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">This is my preprint</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2013-01-01</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Test Description

Test Line</subfield>
  </datafield>
  <datafield tag="653" ind1="1" ind2=" ">
    <subfield code="a">Keyword 1</subfield>
  </datafield>
  <datafield tag="653" ind1="1" ind2=" ">
    <subfield code="a">Keyword 2</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
  </datafield>
  <datafield tag="FFT" ind1=" " ind2=" ">
    <subfield code="a">/path/to/file/OpenAIRE/399/fFn7be/files/8l2Phf/zenodo-web2.png</subfield>
    <subfield code="r"></subfield>
  </datafield>
</record>
====
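
If you would rather generate such a file from a script than write it by hand, something along these lines should work. This is a minimal sketch using only the Python standard library; add_datafield and build_record are names I made up for the example (they are not Invenio functions), and it only covers a few of the fields from the record above.

```python
import xml.etree.ElementTree as ET


def add_datafield(record, tag, subfields, ind1=" ", ind2=" "):
    """Append one <datafield> holding the given (code, value) subfields."""
    field = ET.SubElement(
        record, "datafield", {"tag": tag, "ind1": ind1, "ind2": ind2}
    )
    for code, value in subfields:
        sub = ET.SubElement(field, "subfield", {"code": code})
        sub.text = value
    return field


def build_record(title, author, date, collection):
    """Build a minimal MARCXML <record> (hypothetical helper, not Invenio API)."""
    record = ET.Element("record")
    add_datafield(record, "100", [("a", author)])      # first author
    add_datafield(record, "245", [("a", title)])       # title
    add_datafield(record, "260", [("c", date)])        # publication date
    add_datafield(record, "980", [("a", collection)])  # collection name
    return record


if __name__ == "__main__":
    record = build_record(
        "This is my preprint", "Test, Name", "2013-01-01", "publication"
    )
    # Write the file that is then handed to bibupload.
    ET.ElementTree(record).write(
        "myfirstmarcfile.xml", encoding="utf-8", xml_declaration=True
    )
```

Further fields (keywords in 653, the FFT file path, etc.) can be added with more add_datafield calls in the same way.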

Run bibupload
/opt/invenio/bin/bibupload -i /path/to/myfirstmarcfile.xml

This creates a task in the background job scheduler (BibSched) which you now have to run:
/opt/invenio/bin/bibsched

Find the "bibupload" task, and press "R" on it to run it.

This will upload the record. In the record above, there's a special tag: FFT, which is also documented in http://invenio-demo-next.cern.ch/help/admin/howto-marc as well as here http://invenio-demo-next.cern.ch/help/admin/bibupload-admin-guide (see 3.6 Uploading Fulltext Files).

Now, after the record has been uploaded, it needs to be indexed, ranked, etc. (see http://invenio-demo-next.cern.ch/help/admin/howto-run):

$ bibindex -f50000 -s5m
$ bibreformat -oHB -s5m
$ webcoll -v0 -s5m
$ bibrank -f50000 -s5m
$ bibsort -s5m

I'm not sure if I answered your question, but from the document you upload, only the metadata is indexed and searchable. Invenio has add-ons with SOLR to also search the text inside the PDF, but to get started I think it's better to leave that for later, until you understand the above.

3) How can we upload photos and videos to the repo
As described above, just change the FFT tag to point to your file.


4) We have some problems figuring out how invenio and the apache server communicate. Is invenio an API that the apache server accesses, or is it the other way around, that Invenio hosts the website?

Invenio is a standard Python WSGI application. When you run 'make install', Invenio writes a file '/opt/invenio/var/www-wsgi/invenio.wsgi' which has a variable "application". This is the entry point for Apache: mod_wsgi loads this file and passes each HTTP request to "application". To see how WSGI works, please see: https://code.google.com/p/modwsgi/wiki/QuickInstallationGuide and https://code.google.com/p/modwsgi/wiki/QuickConfigurationGuide. Nearly all modern Python web applications use WSGI, so if you work with Flask, Django, etc., they all work by creating a WSGI application.

From the invenio.wsgi file, you can track down how a request is processed, but to get started you just need to know that the Apache<->Invenio interface goes through invenio.wsgi.
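
To give a feeling for what such an entry point looks like, here is a minimal, generic WSGI application. It is not the content of Invenio's invenio.wsgi, just a sketch of the same interface: a callable named "application" that receives the request environment and returns the response body. The port 8000 and the response text are arbitrary choices for the example.

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """Minimal WSGI callable; mod_wsgi looks for this name by default."""
    path = environ.get("PATH_INFO", "/")
    body = ("You requested %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # A WSGI app returns an iterable of byte strings.
    return [body]


if __name__ == "__main__":
    # For local experiments only; in production Apache/mod_wsgi plays this role.
    make_server("localhost", 8000, application).serve_forever()
```

Running the file directly starts a small test server from the Python standard library; under Apache, mod_wsgi imports the file and calls "application" itself for each request.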


Hope this helps :-)

Cheers,
Lars


Regards,
Sigurd Gran-Jansen
NTNU School of Entrepreneurship
Tlf: +47 48135952


--
Lars Holm Nielsen
Software Engineer

CERN, IT Department, Digital Library Technology Section
Office 513/1-014
Tel: +41 22 76 79182
Cel: +41 76 672 8927
