Re: [Dspace-tech] standards to facilitate metadata extraction during text extraction
During my PhD, this was still a research subject (automatic extraction of data from the physical structure of a document). Have a look at http://www.loria.fr/equipes/read/

I don't know whether any free or proprietary systems have appeared since then. When the layout of your documents is regular, a rather simple process may be useful, but if it varies too much, it is a much more complicated task!

-- François PARMENTIER / INIST-CNRS

On Sun, Dec 14, 2008 at 12:52 AM, Andrew Marlow marlow.and...@googlemail.com wrote:

This may seem like a crazy or naive question, but is there any standard laid down by publishers or societies that authors must adhere to so that the extraction of metadata from articles can be easily automated? Having just performed a text extraction on a non-searchable PDF I see that there is no easy way to get any metadata out. But if a society had conventions for the layout of the article, specifying location and format of title, authors, abstract, bibliography etc, then it might be possible.

I have seen a very regular visual layout in the PDFs from some places. Using OCR techniques it might be possible to locate blocks of interest. It might also be possible from a text extraction but that might be harder since all visual layout information is gone (at least it was with the tool I used). I wonder if this is being considered by anyone.

I am very new to this area so please excuse me if this seems like a silly question.

-- Regards, Andrew M.

-- SF.Net email is Sponsored by MIX09, March 18-20, 2009 in Las Vegas, Nevada. The future of the web can't happen without you. Join us at MIX09 to help pave the way to the Next Web now. Learn more and register at http://ad.doubleclick.net/clk;208669438;13503038;i?http://2009.visitmix.com/
___ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech
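To illustrate the "rather simple process" that may suffice when the layout is regular, here is a minimal sketch. The layout convention it assumes (title on the first non-blank line, authors on the second separated by semicolons, then a line reading "Abstract") is purely hypothetical and not a published standard, and the function name is made up for this example.

```python
import re


def extract_front_matter(text):
    """Pull title, authors and abstract from text that follows a fixed layout.

    Hypothetical convention (an assumption, not a real standard):
    first non-blank line = title, second = authors separated by ';',
    then a line reading 'Abstract' followed by the abstract paragraph,
    which ends at the first blank line.
    """
    body = text.strip()
    non_blank = [ln.strip() for ln in body.splitlines() if ln.strip()]
    title = non_blank[0]
    authors = [a.strip() for a in non_blank[1].split(";") if a.strip()]
    match = re.search(r"^Abstract[ \t]*\n(.+?)(?:\n\s*\n|\Z)", body, re.M | re.S)
    # Collapse line breaks inside the abstract into single spaces
    abstract = " ".join(match.group(1).split()) if match else None
    return {"title": title, "authors": authors, "abstract": abstract}


sample = (
    "A Study of Layout\n"
    "A. Author; B. Writer\n"
    "\n"
    "Abstract\n"
    "We show that regular layout\n"
    "helps extraction.\n"
    "\n"
    "1 Introduction\n"
    "Text.\n"
)
result = extract_front_matter(sample)
print(result)
```

As soon as documents deviate from the assumed convention this approach breaks down, which is the point made above about varying layouts.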
Re: [Dspace-tech] Citation format
You need to add new terms to the metadata registry using the dspace-admin function, and then add those new terms to input-forms.xml to make them appear on the HTML forms. That bit is fairly easy and is a typical thing that a DSpace admin would do. The messier part is that there is no widely recognised metadata schema that will allow you to store this information, at least not in the context of DSpace. I think that most people with a similar requirement make up their own terms and possibly their own schema, e.g. mine:volume, mine:issue, etc.

The next tricky bit is that the DSpace submission process is not 'type' based, so how do you request the appropriate information depending on type? You can either ignore type and always ask the user for all citation data even when it is not appropriate (e.g. volume/issue for a book?), or you can split your collections by type, as input-forms.xml allows you to define different forms for different collections.

Cheers, Robin.

Robin Taylor
Main Library
University of Edinburgh
Tel. 0131 6515208

-----Original Message-----
From: juuventud [mailto:s.m...@ru.ac.za]
Sent: 13 December 2008 17:55
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] Citation format

Hi all

In order to have the proper citation format for different types of documents, e.g. journals, books, book chapters, etc, I need to have fields where I can enter things like Journal Name (not publisher), Volume number, Part number, Pagination (e.g. pg142 - pg163), etc. I don't know how to do this. Is there a standard set of submission pages with these fields already available? If not, where can I create such forms and how do I create the link from the form to specific fields in the database?

Any help would be GREATLY APPRECIATED. I'm using DSpace 1.4.2 with PostgreSQL 8.1 on Windows Server 2003.

Many thanks in advance

-- View this message in context: http://www.nabble.com/Citation-format-tp20992675p20992675.html Sent from the DSpace - Tech mailing list archive at Nabble.com.
-- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
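As a rough illustration of the second step Robin describes (adding the new terms to input-forms.xml), the sketch below generates `<field>` entries for the made-up terms mine:volume and mine:issue. The child element names follow the input-forms.xml shipped with DSpace 1.5.x as far as I recall; please check them against the file in your own installation (particularly on 1.4.2). The helper name `make_field` and the labels are hypothetical, and the terms must already exist in the metadata registry.

```python
import xml.etree.ElementTree as ET


def make_field(schema, element, qualifier, label, hint=""):
    """Build one <field> entry for a page of a form in input-forms.xml.

    Element names are written from memory of the DSpace 1.5.x file;
    treat them as an assumption and compare with your own copy.
    """
    field = ET.Element("field")
    for tag, value in [
        ("dc-schema", schema),
        ("dc-element", element),
        ("dc-qualifier", qualifier),
        ("repeatable", "false"),
        ("label", label),
        ("input-type", "onebox"),
        ("hint", hint),
        ("required", ""),
    ]:
        ET.SubElement(field, tag).text = value
    return field


# 'mine' is the made-up schema from the message above
fields = [
    make_field("mine", "volume", "", "Volume", "Enter the journal volume."),
    make_field("mine", "issue", "", "Issue", "Enter the journal issue."),
]
for field in fields:
    print(ET.tostring(field, encoding="unicode"))
```

The printed fragments would be pasted into the relevant `<page>` of a form; a separate form per collection is what allows journal-specific fields to be kept away from book collections.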
[Dspace-tech] HANDLE update issue
Following up on my question of last week on the failed UPDATE queries for changing handles on an Oracle database, this is what happened. Stuart Lewis suggested that the failing query was this:

    UPDATE metadatavalue SET text_value= (SELECT 'http://hdl.handle.net/' || handle FROM handle WHERE handle.resource_id=item_id AND handle.resource_type_id=2) WHERE text_value LIKE 'http://hdl.handle.net/%';

When I ran that in the Oracle SQL Developer application, I got an error something like no statement at cursor. I simply deleted the semicolon and ran:

    UPDATE metadatavalue SET text_value= (SELECT 'http://hdl.handle.net/' || handle FROM handle WHERE handle.resource_id=item_id AND handle.resource_type_id=2) WHERE text_value LIKE 'http://hdl.handle.net/%'

Which worked. Could that really be all that it is?

Tom McGee
Senior Digital Media Specialist
Seton Hall University
400 South Orange Ave., South Orange, NJ 07079
973.275.2992
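To make clear what the query above does, here is a small sketch that runs the same statement (without the trailing semicolon) against an in-memory SQLite database. SQLite stands in for Oracle only for illustration; the two-table schema is cut down to the columns the query touches, and the handle values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE handle (handle TEXT, resource_id INTEGER, resource_type_id INTEGER);
    CREATE TABLE metadatavalue (item_id INTEGER, text_value TEXT);
    -- item 1 has been given a new handle prefix (invented value)
    INSERT INTO handle VALUES ('2345/10', 1, 2);
    -- its stored URI still carries the old prefix
    INSERT INTO metadatavalue VALUES (1, 'http://hdl.handle.net/123456789/10');
    INSERT INTO metadatavalue VALUES (1, 'An unrelated title');
    """
)

UPDATE_SQL = """
    UPDATE metadatavalue
       SET text_value = (SELECT 'http://hdl.handle.net/' || handle
                           FROM handle
                          WHERE handle.resource_id = item_id
                            AND handle.resource_type_id = 2)
     WHERE text_value LIKE 'http://hdl.handle.net/%'
"""
conn.execute(UPDATE_SQL)

rows = [r[0] for r in conn.execute("SELECT text_value FROM metadatavalue ORDER BY rowid")]
print(rows)
```

Only the value that looks like a handle URI is rewritten; the unrelated metadata value is left alone because of the LIKE filter.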
Re: [Dspace-tech] HANDLE update issue
Hi Tom,

Would you mind trying an experiment for us? If you remove the semicolon from [dspace-src]/dspace-api/src/main/java/org/dspace/handle/UpdateHandlePrefix.java, then run mvn package and ant update, does the script [dspace]/bin/update-handle-prefix then run OK?

I suspect it will fix the problem. If you could confirm this, we'll get it fixed ready for the next release of DSpace.

Thanks, Stuart

From: McGee, Thomas A. [mailto:thomas.mc...@shu.edu]
Sent: 15 December 2008 14:56
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] HANDLE update issue

Following up on my question of last week on the failed UPDATE queries for changing handles on an Oracle database, this is what happened. Stuart Lewis suggested that the failing query was this:

    UPDATE metadatavalue SET text_value= (SELECT 'http://hdl.handle.net/' || handle FROM handle WHERE handle.resource_id=item_id AND handle.resource_type_id=2) WHERE text_value LIKE 'http://hdl.handle.net/%';

When I ran that in the Oracle SQL Developer application, I got an error something like no statement at cursor. I simply deleted the semicolon and ran:

    UPDATE metadatavalue SET text_value= (SELECT 'http://hdl.handle.net/' || handle FROM handle WHERE handle.resource_id=item_id AND handle.resource_type_id=2) WHERE text_value LIKE 'http://hdl.handle.net/%'

Which worked. Could that really be all that it is?

Tom McGee
Senior Digital Media Specialist
Seton Hall University
400 South Orange Ave., South Orange, NJ 07079
973.275.2992
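The experiment above amounts to sending the statement without its terminator. A semicolon separates statements in interactive SQL tools, but a driver that receives exactly one statement may reject it; as far as I know Oracle via JDBC is commonly reported to do so. A minimal sketch of a guard one might apply before handing SQL to a driver follows; the helper name is made up and this is not the change that went into DSpace.

```python
def strip_statement_terminator(sql):
    """Return sql without trailing whitespace and one trailing ';'."""
    stripped = sql.rstrip()
    if stripped.endswith(";"):
        stripped = stripped[:-1].rstrip()
    return stripped


query = "UPDATE metadatavalue SET text_value = 'x' WHERE item_id = 1; "
print(strip_statement_terminator(query))
```

Semicolons inside string literals are untouched, since only the very end of the statement is examined.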
Re: [Dspace-tech] standards to facilitate metadata extraction during text extraction
At the end of the 1990s, I used MS-Word forms and macros to allow authors to enter metadata together with their articles. Even references were structured. It seemed a good idea (normalizing upfront). It ended up very badly because:

* Macintosh MS-Word was not compatible for forms and macros;
* WordPerfect was still popular and presented as being compatible (which was not true for forms and macros). The worst was one of the reviewers, who opened most of the articles in WordPerfect and saved them after adding comments...
* Asian versions of Word introduced characters unknown to Western versions;
* About a quarter of the authors did not understand the form.

Those (technical?) problems produced a terrible mess which took very long to correct and delayed the publication of the paper.

Efficient cataloguers (possibly with the help of a submission form like the DSpace one + a better cataloguing form than the current one) will always be better than a machine at taming the authors' diversity!

Have a nice day!

Christophe Dupriez

François Parmentier wrote:

During my PhD, this was still a research subject (automatic extraction of data from the physical structure of a document). Have a look at http://www.loria.fr/equipes/read/

I don't know whether any free or proprietary systems have appeared since then. When the layout of your documents is regular, a rather simple process may be useful, but if it varies too much, it is a much more complicated task!

-- François PARMENTIER / INIST-CNRS

On Sun, Dec 14, 2008 at 12:52 AM, Andrew Marlow marlow.and...@googlemail.com wrote:

This may seem like a crazy or naive question, but is there any standard laid down by publishers or societies that authors must adhere to so that the extraction of metadata from articles can be easily automated? Having just performed a text extraction on a non-searchable PDF I see that there is no easy way to get any metadata out. But if a society had conventions for the layout of the article, specifying location and format of title, authors, abstract, bibliography etc, then it might be possible.

I have seen a very regular visual layout in the PDFs from some places. Using OCR techniques it might be possible to locate blocks of interest. It might also be possible from a text extraction but that might be harder since all visual layout information is gone (at least it was with the tool I used). I wonder if this is being considered by anyone.

I am very new to this area so please excuse me if this seems like a silly question.

-- Regards, Andrew M.
[Dspace-tech] Controlled Vocabulary Filtering
Hello,

I have recently uploaded a controlled vocabulary into DSpace and it is working with the submission forms, searching within the metadata. The problem is that when I go to subject search and try to filter my controlled vocabulary I just get a blank page. I've been researching whether there is something I have to do to set up the filter but I can't find anything. Can anyone offer assistance?

Thank you in advance,
Barbara

Barbara Yates Hilderbrand, MLS
Metadata/Digital Collections Librarian, Library Associates
NASA/GSFC Library
Code 272, Building 21
Greenbelt, MD 20771
301-286-6246
barbara.y.hilderbr...@nasa.gov
Re: [Dspace-tech] standards to facilitate metadata extraction during text extraction
I don't think it's a daft question at all, but then I am known to ask some very daft ones myself :)

I think the problem is that we wrap the data up in formats that make extraction difficult and then need to go to great lengths to try and extract that data. I don't know of any widely used, reliable methods as yet. Better to move towards formats that make extraction easy. Microsoft docx documents look like a step in the right direction to me. A docx file is a normal Word document but is stored as XML and hence is readable programmatically. In addition the author can add their own tags, so there is no reason why they should not tag the abstract, references, etc. In theory it should then be easy to extract that information.

I'm sure there are good reasons why we all favour PDFs but I think the principle still applies.

Cheers, Robin.

Robin Taylor
Main Library
University of Edinburgh
Tel. 0131 6515208

-----Original Message-----
From: Andrew Marlow [mailto:marlow.and...@googlemail.com]
Sent: 13 December 2008 23:53
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] standards to facilitate metadata extraction during text extraction

This may seem like a crazy or naive question, but is there any standard laid down by publishers or societies that authors must adhere to so that the extraction of metadata from articles can be easily automated? Having just performed a text extraction on a non-searchable PDF I see that there is no easy way to get any metadata out. But if a society had conventions for the layout of the article, specifying location and format of title, authors, abstract, bibliography etc, then it might be possible.

I have seen a very regular visual layout in the PDFs from some places. Using OCR techniques it might be possible to locate blocks of interest. It might also be possible from a text extraction but that might be harder since all visual layout information is gone (at least it was with the tool I used). I wonder if this is being considered by anyone.

I am very new to this area so please excuse me if this seems like a silly question.

-- Regards, Andrew M.
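As a small illustration of "stored as XML and hence readable programmatically": a .docx file is a zip package, and its document properties live in a part named docProps/core.xml. The sketch below reads the title and creator from that part using only the Python standard library. To stay self-contained it builds a tiny stand-in package holding just core.xml rather than a complete Word document, and the sample values are invented. It covers the built-in properties only, not the author-defined tags mentioned above.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_core_properties(docx_bytes):
    """Return title and creator from the docProps/core.xml part of a package."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as package:
        root = ET.fromstring(package.read("docProps/core.xml"))
    return {
        "title": root.findtext("dc:title", namespaces=NS),
        "creator": root.findtext("dc:creator", namespaces=NS),
    }


def make_sample_package():
    """Build a stand-in package with only core.xml (not a complete .docx)."""
    core = (
        '<cp:coreProperties xmlns:cp="%s" xmlns:dc="%s">'
        "<dc:title>Sample article</dc:title>"
        "<dc:creator>A. Author</dc:creator>"
        "</cp:coreProperties>" % (NS["cp"], NS["dc"])
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("docProps/core.xml", core)
    return buffer.getvalue()


properties = read_core_properties(make_sample_package())
print(properties)
```

With a real file, `open(path, "rb").read()` would replace the stand-in package; whether authors actually fill in these properties is a separate question.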
Re: [Dspace-tech] Citation format
Thanks a lot Robin. Question answered. ...

-- View this message in context: http://www.nabble.com/Citation-format-tp20992675p21013005.html Sent from the DSpace - Tech mailing list archive at Nabble.com.
Re: [Dspace-tech] standards to facilitate metadata extraction during text extraction
On Mon, Dec 15, 2008 at 9:36 AM, Robin Taylor robin.tay...@ed.ac.uk wrote:

I don't think it's a daft question at all, but then I am known to ask some very daft ones myself :) I think the problem is that we wrap the data up in formats that make extraction difficult and then need to go to great lengths to try and extract that data. I don't know of any widely used, reliable methods as yet. Better to move towards formats that make extraction easy. Microsoft docx documents look like a step in the right direction to me.

No, no, no, please let us not use formats invented by Microsoft. We need open formats, not closed-secret-proprietary ones. And if Microsoft claim it is open we must not believe them. Just look at their track record. I realise that PDFs are not completely open either but they are bound to be more open than anything Microsoft produce. And I was talking about PDFs. But I do not want the discussion to focus on file formats. As I said originally:

But if a society had conventions for the layout of the article, specifying location and format of title, authors, abstract, bibliography etc, then it might be possible

-----Original Message-----
From: Andrew Marlow [mailto:marlow.and...@googlemail.com]
Sent: 13 December 2008 23:53
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] standards to facilitate metadata extraction during text extraction

This may seem like a crazy or naive question, but is there any standard laid down by publishers or societies that authors must adhere to so that the extraction of metadata from articles can be easily automated? Having just performed a text extraction on a non-searchable PDF I see that there is no easy way to get any metadata out. But if a society had conventions for the layout of the article, specifying location and format of title, authors, abstract, bibliography etc, then it might be possible.

I have seen a very regular visual layout in the PDFs from some places. Using OCR techniques it might be possible to locate blocks of interest. It might also be possible from a text extraction but that might be harder since all visual layout information is gone (at least it was with the tool I used). I wonder if this is being considered by anyone.

-- Regards, Andrew M.
[Dspace-tech] Ingest information in Dspace
Hello! :-)

I need to ingest some information into my university's DSpace. According to the information provided by DSpace at http://www.dspace.org/index.php/Architecture/technology/metadata.html , all that information can be uploaded.

I am using this example to upload some information about publications (author, abstract, language): http://www.ukoln.ac.uk/repositories/sword/example.zip

The problem is that I cannot find a way to ingest other needed information such as ISBN, govdoc, ISSN, ISMN ...

Has any of you got an XML example of how to upload metadata such as:

dc.contributor.author
dc.contributor.author
dc.date.accessioned
dc.date.available
dc.date.issued
dc.identifier.citation
dc.identifier.govdoc
dc.identifier.isbn
dc.identifier.issn
dc.identifier.pmid
dc.identifier.doi
dc.identifier.other
dc.identifier.uri
dc.identifier.uri
dc.description
dc.description.abstract
dc.language.iso
dc.publisher
dc.relation.ispartofseries
dc.relation.ispartofseries
dc.relation.ispartofseries
dc.relation.ispartofseries
dc.relation.url
dc.relation.url
dc.subject
dc.subject
dc.subject
dc.subject
dc.subject.mesh
dc.subject.mesh
dc.subject.mesh
dc.subject.mesh
dc.subject.other
dc.subject.other
dc.title
dc.title.alternative
dc.title.alternative
dc.type
dc.contributor.department
dc.identifier.journal
dc.identifier.pmcid

Thanks a lot to all of you for your help,

Javier Espinosa de los Monteros
University of Wolverhampton
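One possible answer, assuming batch import is an acceptable alternative to the SWORD deposit used above: the DSpace batch item importer reads a dublin_core.xml file per item, in which every value is a `<dcvalue>` with `element` and `qualifier` attributes (unqualified fields use `qualifier="none"`). The sketch below generates such a file from dotted field names like those in the list. The helper name and all sample values are invented placeholders; this does not show how to carry the same fields in a SWORD package.

```python
import xml.etree.ElementTree as ET


def build_dublin_core(values):
    """Build a <dublin_core> element from (dotted field name, value) pairs."""
    root = ET.Element("dublin_core")
    for field, value in values:
        parts = field.split(".")
        if parts[0] != "dc" or len(parts) not in (2, 3):
            raise ValueError("expected dc.element[.qualifier], got %r" % field)
        # Unqualified fields are written with qualifier="none"
        qualifier = parts[2] if len(parts) == 3 else "none"
        dcvalue = ET.SubElement(root, "dcvalue", element=parts[1], qualifier=qualifier)
        dcvalue.text = value
    return root


# Placeholder values, for illustration only
record = build_dublin_core([
    ("dc.title", "An example title"),
    ("dc.contributor.author", "Author, Example"),
    ("dc.identifier.isbn", "0-00-000000-0"),
    ("dc.identifier.issn", "0000-0000"),
    ("dc.language.iso", "en"),
])
print(ET.tostring(record, encoding="unicode"))
```

Repeated fields such as the four dc.subject entries would simply appear as four pairs in the input list.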
Re: [Dspace-tech] standards to facilitate metadata extraction during text extraction
On Mon, Dec 15, 2008 at 09:36:11AM +, Robin Taylor wrote:

I think the problem is that we wrap the data up in formats that make extraction difficult and then need to go to great lengths to try and extract that data. I don't know of any widely used, reliable methods as yet. Better to move towards formats that make extraction easy.

Most common formats other than plain text have some sort of tagging feature. In some cases, few know about them so they aren't much used. That could be fixed easily.

Microsoft docx documents look like a step in the right direction to me. A docx file is a normal Word document but is stored as XML and hence is readable programmatically.

The older Office formats are readable programmatically too. More readable, actually, since OOXML is very new, still only partially documented, and not implemented anywhere, even at Microsoft. There's a store for document attributes inside the traditional Office format's bag. There's a nice Java library (POI) that can extract them.

But then that only works for MS Office documents. Not for OpenOffice or Symphony. Not for Acrobat. We have tens of thousands of PDFs. We have audio and video streams waiting in the wings. And we still need a system for assigning meanings to the tags.

In addition the author can add their own tags, so there is no reason why they should not tag the abstract, references, etc. In theory it should then be easy to extract that information.

See the subject line. If everybody makes up his own tags then there is no standard, and software cannot make use of the tags without being told, for each individual provider's profile, what to look for and what they mean. Bibliographic software like EndNote shows us what we wind up with: hundreds of format modules to be maintained. We can do that but I'd rather have something systematic. (BTW EndNote or one of its brethren might be able to serve the original request.)

If there is no standard now, then maybe it's up to the document repository community (that's us) to lay the groundwork for some standardization and champion the idea until it's accepted.

-- Mark H. Wood, Lead System Programmer mw...@iupui.edu
Friends don't let friends publish revisable-form documents.
[Dspace-tech] structure import problem with French and German accented characters
I have created an XML file for a structure import, based on a CSV file I have of journal titles. I am converting the CSV to XML using a bit of Perl. Everything is fine until I introduce journal titles that contain accented characters. For example, one title contains the German word 'fur' with u umlaut. I get a UTF-8 error if I leave it like that. So in my XML file I change this to &uuml; but it doesn't work. It says 'the entity uuml was referenced but not declared'.

What is going wrong please? How may titles with accented characters be imported?

-- Regards, Andrew M.
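A UTF-8 error of this kind often means the bytes written to the XML file are not actually UTF-8 (for example a Latin-1 CSV copied through unchanged), though that is a guess about this particular case. A minimal sketch of the conversion that lets the XML library handle the encoding is shown below. It assumes a one-column CSV of titles that has already been decoded to text with the correct source encoding, and it uses the `import_structure` / `community` / `collection` / `name` layout of the structure-import file; the function name, community name and sample titles are made up. `xml_declaration=True` needs Python 3.8 or later.

```python
import csv
import io
import xml.etree.ElementTree as ET


def titles_to_structure(csv_text, community_name):
    """Turn a one-column CSV of journal titles into structure-import XML bytes."""
    root = ET.Element("import_structure")
    community = ET.SubElement(root, "community")
    ET.SubElement(community, "name").text = community_name
    for row in csv.reader(io.StringIO(csv_text)):
        if row and row[0].strip():
            collection = ET.SubElement(community, "collection")
            ET.SubElement(collection, "name").text = row[0].strip()
    # Serialise as UTF-8 bytes with an explicit XML declaration
    return ET.tostring(root, encoding="utf-8", xml_declaration=True)


csv_text = "Zeitschrift f\u00fcr Physik\nAnnales de l'Universit\u00e9\n"
xml_bytes = titles_to_structure(csv_text, "Journals")
print(xml_bytes.decode("utf-8"))
```

When reading a real file, `open(path, encoding=...)` with the CSV's true encoding is the step that matters; once the text is decoded correctly, the accented characters need no entity references at all.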
[Dspace-tech] how to zap the DSpace database quickly and start again
Now that I have done several experiments with bulk imports using StructBuilder, my DSpace database is full of rubbish. Can anyone please tell me the best way to zap the database so I can start again? I don't want to do a complete reinstall of DSpace, since I would lose config info and XMLUI customisations that I want to keep.

-- Regards, Andrew M.
Re: [Dspace-tech] structure import problem with French and German accented characters
Hi Andrew,

I can't speak to the general question. But on this point . . .

&uuml;

. . . is an HTML character entity reference, and is not recognized within XML documents in general. To use this, you would need, as the XML parser here is saying, a supporting DTD entity reference declaration. Easier, I think, maybe just to reference it using its numeric value: &#252;

--Dave

==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

From: Andrew Marlow [marlow.and...@googlemail.com]
Sent: Monday, December 15, 2008 2:05 PM
To: dspace-tech@lists.sourceforge.net
Subject: [Dspace-tech] structure import problem with French and German accented characters

I have created an XML file for a structure import, based on a CSV file I have of journal titles. I am converting the CSV to XML using a bit of Perl. Everything is fine until I introduce journal titles that contain accented characters. For example, one title contains the German word 'fur' with u umlaut. I get a UTF-8 error if I leave it like that. So in my XML file I change this to &uuml; but it doesn't work. It says 'the entity uuml was referenced but not declared'.

What is going wrong please? How may titles with accented characters be imported?

-- Regards, Andrew M.
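The difference between the two references can be checked with any XML parser. The short sketch below uses Python's standard library parser (not the one DSpace uses) and an invented title: the numeric reference is accepted, while the HTML-style named entity is rejected because nothing declares it.

```python
import xml.etree.ElementTree as ET

# Numeric character reference: understood by every XML parser
numeric = ET.fromstring("<name>Zeitschrift f&#252;r Physik</name>")
print(numeric.text)

# Named HTML entity: undefined in plain XML unless a DTD declares it
named_rejected = False
try:
    ET.fromstring("<name>Zeitschrift f&uuml;r Physik</name>")
except ET.ParseError as error:
    named_rejected = True
    print("rejected:", error)
```

Only five named entities are predefined in XML itself (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`); everything else needs either a declaration or a numeric reference.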
[Dspace-tech] StructBuilder says I am not authorized
I am now having trouble running the StructBuilder. Here is the error I get:

    Using DSpace installation in: G:\mystuff\tools\dspace-1.5.1-src-release\dspace\target\dspace-1.5.1-build.dir
    Exception in thread main org.dspace.authorize.AuthorizeException: Only administrators can create communities
        at org.dspace.content.Community.create(Community.java:193)

This used to work! The only thing I have done differently is that I blew away the database because previous imports filled it with rubbish. I stopped tomcat, dropped the database, said ant fresh_install and restarted tomcat. I must have missed something, but what?

-- Regards, Andrew M.
Re: [Dspace-tech] StructBuilder says I am not authorized
On Mon, Dec 15, 2008 at 11:17:45PM +, Andrew Marlow wrote:

I am now having trouble running the StructBuilder. Here is the error I get:

    Using DSpace installation in: G:\mystuff\tools\dspace-1.5.1-src-release\dspace\target\dspace-1.5.1-build.dir
    Exception in thread main org.dspace.authorize.AuthorizeException: Only administrators can create communities
        at org.dspace.content.Community.create(Community.java:193)

This used to work! The only thing I have done different is I blew away the database because previous imports filled it with rubbish. I stopped tomcat, dropped the database, said ant fresh_install and restarted tomcat. I must have missed off something, but what?

Sounds like you forgot to re-create the administrator account (see the install instructions).

cheers, Jim

--
James Rutherford        | Hewlett-Packard Limited registered Office:
Research Engineer       | Cain Road,
HP Labs                 | Bracknell,
Bristol, UK             | Berks
+44 117 312 7066        | RG12 1HN.
james.rutherf...@hp.com | Registered No: 690597 England

The contents of this message and any attachments to it are confidential and may be legally privileged. If you have received this message in error, you should delete it from your system immediately and advise the sender. To any recipient of this message within HP, unless otherwise stated you should consider this message and attachments as HP CONFIDENTIAL.
Re: [Dspace-tech] StructBuilder says I am not authorized
Hi Andrew,

as you dropped the database and did a fresh_install, there is no administrator anymore. Run [dspace]/bin/create_administrator and try the structure-builder with this eperson.

Hope that helps

Claudia Jürgen

Andrew Marlow wrote:

I am now having trouble running the StructBuilder. Here is the error I get:

    Using DSpace installation in: G:\mystuff\tools\dspace-1.5.1-src-release\dspace\target\dspace-1.5.1-build.dir
    Exception in thread main org.dspace.authorize.AuthorizeException: Only administrators can create communities
        at org.dspace.content.Community.create(Community.java:193)

This used to work! The only thing I have done different is I blew away the database because previous imports filled it with rubbish. I stopped tomcat, dropped the database, said ant fresh_install and restarted tomcat. I must have missed off something, but what?