[ 
https://jira.duraspace.org/browse/DS-1226?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=26290#comment-26290
 ] 

Robin Taylor commented on DS-1226:
----------------------------------

Hi Kostas,

Sorry to be a pain but when I try and build the code Maven is giving a warning 
that concerns me a little...

"[WARNING] POM for 'gr.ekt:biblio-transformation-engine:pom:0.81:compile' is 
invalid.
Its dependencies (if any) will NOT be available to the current build."

Looking at the POM I notice that there is one odd dependency...

<systemPath>${basedir}/lib/marc4j.jar</systemPath>

Is that a true dependency or can it be removed? If it is a true dependency 
then it needs to be available from a Maven repository.

Cheers, Robin. 
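
The check described above (looking through a POM for dependencies that carry a systemPath) can be sketched in a few lines of Python. This is only an illustration: the sample POM below is hypothetical, and apart from the gr.ekt coordinates and the systemPath value quoted in the comment, every groupId, artifactId and version in it is an assumption, not taken from the real biblio-transformation-engine POM.

```python
import xml.etree.ElementTree as ET

# Namespace used by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}

# Hypothetical POM for illustration only. The marc4j and other-lib
# coordinates are made up; only the systemPath value is from the comment.
SAMPLE_POM = """<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>gr.ekt</groupId>
  <artifactId>biblio-transformation-engine</artifactId>
  <version>0.81</version>
  <dependencies>
    <dependency>
      <groupId>example.marc4j</groupId>
      <artifactId>marc4j</artifactId>
      <version>0.0</version>
      <scope>system</scope>
      <systemPath>${basedir}/lib/marc4j.jar</systemPath>
    </dependency>
    <dependency>
      <groupId>example.other</groupId>
      <artifactId>other-lib</artifactId>
      <version>1.0</version>
    </dependency>
  </dependencies>
</project>
"""


def system_path_dependencies(pom_xml):
    """Return (artifactId, systemPath) for each dependency with a systemPath."""
    root = ET.fromstring(pom_xml)
    found = []
    for dep in root.findall("m:dependencies/m:dependency", POM_NS):
        system_path = dep.findtext("m:systemPath", default=None, namespaces=POM_NS)
        if system_path is not None:
            artifact = dep.findtext("m:artifactId", default="", namespaces=POM_NS)
            found.append((artifact, system_path.strip()))
    return found


if __name__ == "__main__":
    for artifact, path in system_path_dependencies(SAMPLE_POM):
        print("%s is resolved from the local filesystem: %s" % (artifact, path))
```

Any dependency reported this way is read from a local file rather than from a repository, which is why it would need to be published to a Maven repository (or removed) before other builds can rely on it.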

                
> Batch import from basic bibliographic formats (Endnote, BibTex, RIS, TSV, CSV)
> ------------------------------------------------------------------------------
>
>                 Key: DS-1226
>                 URL: https://jira.duraspace.org/browse/DS-1226
>             Project: DSpace
>          Issue Type: New Feature
>          Components: DSpace API
>            Reporter: Kostas Stamatis
>              Labels: has-patch, has-pull-request, import
>         Attachments: biblio-transformation-engine-0.8.jar, import-patch.diff, 
> jbibtex-r45.jar, README.txt
>
>
> This proposed extension (implemented by National Documentation Centre/EKT - 
> http://www.ekt.gr) allows the batch import of metadata (and/or bitstreams) to 
> DSpace using the import script and the Biblio-Transformation-Engine tool. The 
> input format can be any bibliographic format (the specific patch includes 
> support for Endnote, RIS, BibTex, TSV and CSV formats).
> The Biblio-Transformation-Engine 
> (http://code.google.com/p/biblio-transformation-engine/) is an open source 
> Java framework developed by the Hellenic National Documentation Centre (EKT, 
> www.ekt.gr). It consists of programmatic APIs for filtering and modifying 
> records retrieved from various types of data sources (e.g. databases, 
> files, legacy data sources) and for outputting them in appropriate 
> standard formats (e.g. database files, txt, xml, Excel). The framework 
> includes independent abstract modules that are executed separately, in many 
> cases offering the user alternative choices depending on the input data 
> set, the transformation workflow to be executed and the output format to be 
> generated.
> The attached patch adds support for using the 
> Biblio-Transformation-Engine in the DSpace batch import procedure, where the 
> user only needs to specify the mapping between the input metadata and DSpace 
> metadata. Default mappings are also provided for the default DSpace Dublin 
> Core metadata schema.
> USEFULNESS
> ---------------------
> Suppose a researcher at your institute gives you a file of his/her 
> publications that you need to import into the repository. If the file is in 
> one of the following formats: CSV, TSV, Endnote, BibTex, RIS (formats that 
> are commonly used for bibliographic metadata), you can import all the 
> records into the DSpace repository with a single command. At the same time, 
> configuration files control which metadata is imported and to which DC field 
> (or field of any other schema of the DSpace repository) it maps.
> For those who know the Biblio-Transformation-Engine well, this extension is 
> powerful because they can write their own DataLoaders to support more input 
> formats. Filtering records and modifying the metadata is also possible with 
> very little effort (using the Biblio-Transformation-Engine's filters and 
> modifiers). The same applies to adding bitstreams to the records.
> CONFIGURATION FILES
> ---------------------------------------
> Since the Biblio-Transformation-Engine supports Spring, the only 
> configuration files the user must work with are the Spring XML files for 
> Dependency Injection. These files are located in the "config" directory, and 
> in them the user can specify the mapping between the input metadata and the 
> DSpace Dublin Core schema (or any other schema users have in their 
> repository).
> EXTERNAL LIBRARIES
> -----------------------------------
> This extension makes use of three external Java libraries:
> a) jbibtex, a Java library for reading BibTex files (under the BSD licence - 
> http://www.linfo.org/bsdlicense.html)
> b) opencsv, a Java library for reading CSV files (under Apache License V2.0 - 
> http://www.apache.org/licenses/LICENSE-2.0)
> c) biblio-transformation-engine, a Java library for metadata transformation, 
> filtering and modification (under the European Union Public Licence (EUPL), 
> http://www.osor.eu/eupl/european-union-public-licence-eupl-v.1.1)
> HOW TO RUN
> ----------------------
> In the import script, there is a new option (-b) to import using the 
> Biblio-Transformation-Engine and an option (-i) to declare the type of the 
> input format. All the other options are the same, except that option -s now 
> points to the input data file (and not a directory as it used to).
> To import metadata from the various input formats, use the following 
> commands:
> for BibTex input:
>   ./dspace import -b -m mapFile -e [email protected] -c 123456789/1 -s /DATA/export-bibtex -i bibtex
> for csv input:
>   ./dspace import -b -m mapFile -e [email protected] -c 123456789/1 -s /DATA/export-csv -i csv
> for tsv input:
>   ./dspace import -b -m mapFile -e [email protected] -c 123456789/1 -s /DATA/export-tsv -i tsv
> for ris input:
>   ./dspace import -b -m mapFile -e [email protected] -c 123456789/1 -s /DATA/export-ris -i ris
> for endnote input:
>   ./dspace import -b -m mapFile -e [email protected] -c 123456789/1 -s /DATA/export-endnote -i endnote
> (-e must be the valid email of a DSpace user and -c must be the handle of 
> the collection into which the items will be imported)
> Before you run the commands, feel free to change the configuration files 
> (config/spring-bibtex2dspace.xml, config/spring-csv2dspace.xml, 
> config/spring-tsv2dspace.xml, config/spring-ris2dspace.xml, 
> config/spring-endnote2dspace.xml) in order to specify the mapping of the 
> input format to the DC metadata schema of DSpace.
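
The five commands in the issue description differ only in the -s and -i values, and each format has a matching Spring file named config/spring-<format>2dspace.xml. A minimal Python sketch of that pattern follows. The function names and the user@example.org address are hypothetical placeholders introduced here; the option letters, format names and config file names are the ones listed in the description.

```python
import shlex

# Input formats named in the issue description.
SUPPORTED_FORMATS = ("bibtex", "csv", "tsv", "ris", "endnote")


def mapping_config(input_format):
    """Spring mapping file for a format, following the naming in the description."""
    return "config/spring-%s2dspace.xml" % input_format


def build_import_command(input_format, source_file, eperson_email,
                         collection_handle, map_file="mapFile"):
    """Build the argument list for one ./dspace import -b run (hypothetical helper)."""
    if input_format not in SUPPORTED_FORMATS:
        raise ValueError("unsupported input format: %s" % input_format)
    return [
        "./dspace", "import", "-b",
        "-m", map_file,
        "-e", eperson_email,      # must be the email of an existing DSpace user
        "-c", collection_handle,  # handle of the target collection
        "-s", source_file,        # a file, not a directory, when -b is used
        "-i", input_format,
    ]


if __name__ == "__main__":
    # Dry run: print each command and the mapping file to review beforehand.
    for fmt in SUPPORTED_FORMATS:
        cmd = build_import_command(fmt, "/DATA/export-" + fmt,
                                   "user@example.org", "123456789/1")
        print(" ".join(shlex.quote(part) for part in cmd))
        print("  mapping: " + mapping_config(fmt))
```

The sketch only prints the commands; running them is left to the operator, after the mapping file for the chosen format has been checked.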

