Hi Jayan (or anyone who knows how to do batch submission):
I am still unable to do batch submission. Here is what I did:

(1) Created a directory, /Users/pan/tmp, and put 3 files under it: Content (a text file, attached); Dublin_core.xml (attached); and batch_import.pdf (the doc I wanted to submit to DSpace);

(2) Ran:

    pan$ dsrun org.dspace.app.itemimport.ItemImport --add --eperson= [EMAIL PROTECTED] --collection=123456789/2 --source=/Users/pan/tmp --mapfile=/Users/pan/test_map
    Destination collections:
    Owning Collection: PODAAC collection
    Adding items from directory: /Users/pan/tmp
    Generating mapfile: /Users/pan/test_map

No error message was shown, but the PDF file was not imported. An empty test_map file was generated. I also ran filter-media and found that all bitstreams were skipped because no new document had been added.

I found out from the 1.4.1 beta 1 System Doc (p. 22) that there are batch tools and that registration is an alternate means to upload bitstreams, but no details or examples are provided. Can you provide links to more details or examples, please?

Thanks a lot for your help!

-Pan

On 2/1/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:
Have you solved your problem importing documents, or are you using the interface to upload documents into the repository?

Jayan

------------------------------
From: Pan Family [mailto:[EMAIL PROTECTED]
Sent: Friday, February 02, 2007 5:19 AM
To: Jayan Chirayath Kurian
Subject: Re: [Dspace-tech] how can I find out the collectionID?

Thanks a lot!

-Pan

On 1/31/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:

<?xml version="1.0" encoding="iso-8859-1" ?>
<!-- title of pdf AMIC_1984_10_CM_03.pdf -->
<dublin_core>
  <dcvalue element="creator" qualifier="conference">AMIC-Chiangmai University Refresher Course on Communication Research Methodology : Chiangmai, Oct 29-Nov 2, 1984.</dcvalue>
  <dcvalue element="title" qualifier="none">The Logic of Social Science Research.</dcvalue>
  <dcvalue element="contributor" qualifier="author">Atal, Yogesh.</dcvalue>
  <dcvalue element="date" qualifier="issued">1984-10-29</dcvalue>
</dublin_core>

------------------------------
From: Pan Family [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 01, 2007 3:52 AM
To: Jayan Chirayath Kurian
Cc: [email protected]
Subject: Re: [Dspace-tech] how can I find out the collectionID?

Could you please provide a sample Dublin_core.xml?

I assumed that dsrun would recursively go through the directories and index all the files under them. Apparently I was wrong. The requirement of Dublin_core.xml and the content file makes the process much less automatic. Is there a way around this?

Thanks a lot!

-Pan

On 1/30/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:

------------------------------
From: Pan Family [mailto: [EMAIL PROTECTED]
Sent: Wednesday, January 31, 2007 1:15 PM
To: Jayan Chirayath Kurian
Cc: Dorothea Salo; [email protected]
Subject: Re: [Dspace-tech] how can I find out the collectionID?

Ok. I will give this a try. Still two questions:

(1) Where can I get the file Dublin_core.XML?
Dublin_core.xml contains the metadata descriptions of the resource (e.g. title, date published, etc.). You have to create the XML file using a text editor such as Notepad.

(2) Let's say I only want to index one file named foo.pdf, and I put it under /Users/pan/tmp/foo.pdf and pass src=/Users/pan to dsrun. Is foo.pdf considered the content file or the resource? And which is the third type of file?

foo.pdf is the resource (i.e. a PDF, PPT, JPEG, etc.). The content file is a text file that just contains the name of the resource, i.e. foo.pdf.

Thanks a lot!

-Pan

On 1/30/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:

I feel the tmp directory should have (1) the Dublin_core.XML, (2) the contents file and (3) the actual resource. The tmp directory should have all these files without any more subdirectories for these files. Can you try with source=/Users/pan/, removing all subdirectories under tmp and keeping only the 3 files listed above? Hope it works.

My structure is src = C:\DSpace\bin\archive_directory
The archive_directory contains the directory Item_001.
Item_001 contains (1) Dublin_core.XML, (2) the contents file and (3) the actual resource. There are no more subdirectories under Item_001.

Thanks,
Jayan

------------------------------
From: Pan Family [mailto: [EMAIL PROTECTED]
Sent: Wednesday, January 31, 2007 4:06 AM
To: Jayan Chirayath Kurian
Cc: Dorothea Salo; [email protected]
Subject: Re: [Dspace-tech] how can I find out the collectionID?

Thanks for your help! I am working on Mac OS X. Yes, "pan" contains "tmp". It seems that for me the dir that I give to source= cannot contain any subdirs.
For example, if I give it "/Users/pan/" I get an error complaining about the missing file ".fvwm/dublin_core.xml"; .fvwm is a subdir under "/Users/pan/". If I give it "/Users/pan/tmp/", then it complains about the same missing file under the subdirs of "tmp" until I remove all the subdirs under "tmp". But I still don't get the files under "tmp" imported into my collection, even though no error shows after I removed all subdirs.

    bubba:$ dsrun org.dspace.app.itemimport.ItemImport --add --eperson= [EMAIL PROTECTED] --collection=123456789/2 --source=/Users/pan/ --mapfile=/Users/pan/test_map --test
    **Test Run** - not actually importing items.
    Destination collections:
    Owning Collection: PODAAC collection
    Adding items from directory: /Users/pan/
    Generating mapfile: /Users/pan/test_map
    Adding item from directory .fvwm
    java.io.FileNotFoundException: /Users/pan/.fvwm/dublin_core.xml (No such file or directory)
        at java.io.FileInputStream.open(Native Method)
        at java.io.FileInputStream.<init>(FileInputStream.java:106)
        at java.io.FileInputStream.<init>(FileInputStream.java:66)
        at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70)
        at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161)
        at org.apache.xerces.impl.XMLEntityManager.setupCurrentEntity(Unknown Source)
        at org.apache.xerces.impl.XMLVersionDetector.determineDocVersion(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.DOMParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.DocumentBuilderImpl.parse(Unknown Source)
        at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:172)
        at org.dspace.app.itemimport.ItemImport.loadXML(ItemImport.java:1269)
        at org.dspace.app.itemimport.ItemImport.loadDublinCore(ItemImport.java:795)
        at org.dspace.app.itemimport.ItemImport.loadMetadata(ItemImport.java:780)
        at org.dspace.app.itemimport.ItemImport.addItem(ItemImport.java:626)
        at org.dspace.app.itemimport.ItemImport.addItems(ItemImport.java:498)
        at org.dspace.app.itemimport.ItemImport.main(ItemImport.java:407)
    java.io.FileNotFoundException: /Users/pan/.fvwm/dublin_core.xml (No such file or directory)
    ***End of Test Run***

On 1/29/07, Jayan Chirayath Kurian <[EMAIL PROTECTED]> wrote:

Can you please try with source=/Users/pan/

I encountered the same problem on the Windows platform. This was rectified by giving the main folder name with the import command. I assume that "pan" contains the subfolder "tmp", which in fact contains the PDF file. I hope you will let me know if this works for you.

Thanks,
Jayan

------------------------------
From: [EMAIL PROTECTED] [mailto: [EMAIL PROTECTED] On Behalf Of Pan Family
Sent: Tuesday, January 30, 2007 8:02 AM
To: Dorothea Salo
Cc: [email protected]
Subject: Re: [Dspace-tech] how can I find out the collectionID?

Hi Dorothea:

Thanks a lot for your help! In my case, the handle is 123456789/2. So I used the following command to add a PDF file under /Users/pan/tmp, but somehow the PDF file was not added to the collection and the file test_map is empty. No error message was shown either. I wonder what I did wrong. Could you give me some ideas on how to debug?
Thanks again,

-Pan

    bubba:~/dspace-1.4.1-source/bin pan$ dsrun org.dspace.app.itemimport.ItemImport --add [EMAIL PROTECTED]/2 --source=/Users/pan/tmp/ --mapfile=/Users/pan/tmp/test_map
    Destination collections:
    Owning Collection: PODAAC collection
    Adding items from directory: /Users/pan/tmp/
    Generating mapfile: /Users/pan/tmp/test_map

On 1/29/07, Dorothea Salo <[EMAIL PROTECTED]> wrote:

Pan Family wrote:
> dsrun org.dspace.app.itemimport.ItemImport --add
> [EMAIL PROTECTED] --collection=collectionID --source=items_dir
> --mapfile=mapfile
>
> Hi,
>
> The above command for batch import requires
> the collectionID as input. I wonder how
> I can find out this ID? Is it the string
> that I used to name my collection, or an ID
> that DSpace uses internally?

You can use the collection's handle for this; go to the collection's home page and use the numbers after "handle/" in the URL.

If you should need the internal DSpace collection ID for some reason, though, log in, surf to the collection page, and then use the "Edit" button under Admin Tools. From there, choose "Collection's Authorizations," and DSpace will pop up the "DB ID" in the title of the page.

(I hope there's an easier way to do this! There certainly should be.)

Dorothea

--
Dorothea Salo, Digital Repository Services Librarian
(703)993-3742 [EMAIL PROTECTED] AIM: gmumars
MSN 2FL, Fenwick Library
George Mason University
4400 University Drive, Fairfax VA 22031

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your opinions on IT & business topics through brief surveys - and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
Content
Description: Binary data
<?xml version="1.0" encoding="iso-8859-1" ?>
<!-- title of pdf batch_import.pdf -->
<dublin_core>
  <dcvalue element="creator" qualifier="email">email on how to submit in batch mode, Feb 23, 2007.</dcvalue>
  <dcvalue element="title" qualifier="none">Google email.</dcvalue>
  <dcvalue element="contributor" qualifier="author">Pan, Lei.</dcvalue>
  <dcvalue element="date" qualifier="issued">2007-02-23</dcvalue>
</dublin_core>
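
For reference, here is a minimal sketch of the layout Jayan described earlier in the thread (archive_directory containing Item_001, which in turn holds the metadata file, the contents file and the resource). It rests on two assumptions that may be worth checking against the DSpace documentation: that the importer treats each subdirectory of --source as one item, and that it looks for the lowercase names dublin_core.xml (the name that appears in the stack trace above) and contents. The function name build_item_dir and the directory names are made up for illustration and are not part of DSpace.

```python
import shutil
import tempfile
from pathlib import Path
from xml.sax.saxutils import escape


def build_item_dir(source_dir, item_name, bitstream, title, author, issued):
    """Create source_dir/item_name holding dublin_core.xml, contents and the file.

    Hypothetical helper: the lowercase file names are an assumption about
    what the importer expects, not something this thread confirms in full.
    """
    item_dir = Path(source_dir) / item_name
    item_dir.mkdir(parents=True, exist_ok=True)

    # Copy the actual resource (e.g. the PDF) into the item directory.
    bitstream = Path(bitstream)
    shutil.copy(bitstream, item_dir / bitstream.name)

    # The contents file lists the resource's file name, one name per line.
    (item_dir / "contents").write_text(bitstream.name + "\n", encoding="utf-8")

    # Minimal metadata using the same elements as the samples in this thread.
    dc = (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        "<dublin_core>\n"
        f'  <dcvalue element="title" qualifier="none">{escape(title)}</dcvalue>\n'
        f'  <dcvalue element="contributor" qualifier="author">{escape(author)}</dcvalue>\n'
        f'  <dcvalue element="date" qualifier="issued">{escape(issued)}</dcvalue>\n'
        "</dublin_core>\n"
    )
    (item_dir / "dublin_core.xml").write_text(dc, encoding="utf-8")
    return item_dir


if __name__ == "__main__":
    # Demo in a throwaway directory with a placeholder file standing in for the PDF.
    work = Path(tempfile.mkdtemp())
    pdf = work / "batch_import.pdf"
    pdf.write_bytes(b"%PDF-1.4\n")
    item = build_item_dir(work / "archive_directory", "item_001", pdf,
                          "Google email.", "Pan, Lei.", "2007-02-23")
    print(sorted(p.name for p in item.iterdir()))
```

With a layout like this, --source would point at the parent (archive_directory here) rather than at the item directory itself, which matches Jayan's structure. If the assumptions hold, that may also explain why a source directory holding the three files directly produced an empty mapfile and no error.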

