Hi euler,
there is a configuration setting
https://github.com/DSpace/DSpace/blob/dspace-5_x/dspace/config/modules/oai.cfg#L20
that determines the base URL for bitstreams.
Most likely it still uses http. If so, change it to https and rebuild the
OAI core.
Hope this helps
Claudia Jürgen
On 06.04.2017 at 10:02, euler wrote:
Hi helix,
I tried your suggestion and opened the corrupt PDF in a text editor. Now I
am wondering why the harvested PDF contains this HTML response with an
error message:
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a
href="https://repository.seafdec.org.ph/bitstream/10862/1483/1/aep01.pdf">here</a>.</p>
</body></html>
Could this be because I set up Apache to redirect http to https? What should
I do to resolve this issue? It seems my hunch was correct that using https
is causing it.
Thanks and regards,
euler
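The HTML body quoted above already names the location the harvester should have fetched. As a minimal sketch (the helper names `redirect_target` and `is_https_upgrade` are hypothetical and not part of DSpace), the following Python pulls the link out of such a saved response and checks whether it differs from the requested URL only by the http to https switch, which would support the Apache redirect explanation:

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.parse import urlsplit


class _FirstLinkParser(HTMLParser):
    """Remembers the href of the first <a> tag it sees."""

    def __init__(self) -> None:
        super().__init__()
        self.href: Optional[str] = None

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self.href is None:
            self.href = dict(attrs).get("href")


def redirect_target(body: str) -> Optional[str]:
    """Return the link found in a saved '301 Moved Permanently' page, if any."""
    parser = _FirstLinkParser()
    parser.feed(body)
    return parser.href


def is_https_upgrade(requested: str, target: str) -> bool:
    """True if target is the same host and path as requested, but http became https."""
    req, tgt = urlsplit(requested), urlsplit(target)
    return (
        req.scheme == "http"
        and tgt.scheme == "https"
        and req.netloc == tgt.netloc
        and req.path == tgt.path
    )


# The response body found inside the "corrupt" PDF.
SAVED_BODY = """<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a
href="https://repository.seafdec.org.ph/bitstream/10862/1483/1/aep01.pdf">here</a>.</p>
</body></html>"""

if __name__ == "__main__":
    requested = "http://repository.seafdec.org.ph/bitstream/10862/1483/1/aep01.pdf"
    target = redirect_target(SAVED_BODY)
    print(target, is_https_upgrade(requested, target))
```

If the check succeeds, the harvester was presumably handed an http bitstream URL and stored the redirect page instead of following it.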
On Thursday, April 6, 2017 at 3:46:55 PM UTC+8, euler wrote:
Hi helix,
Thanks for the response. Yes, the PDFs are normal if downloaded directly.
My issue is that when I harvest that collection with full replication
selected in the harvesting options, the PDFs are corrupt. This also happens
with other collections.
Thanks again.
Sincerely,
euler
On Thursday, April 6, 2017 at 3:37:55 PM UTC+8, helix84 wrote:
I tried to download one of the PDFs from your col_10862_1482, but it
looks normal (~4 MB):
http://repository.seafdec.org.ph/bitstream/10862/1483/1/aep01.pdf
Look at the small PDF with a text editor. My guess is that you'll find an
HTML response there with an error message.
Regards,
~~helix84
Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
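The text-editor check suggested above can also be scripted when many harvested files need inspecting. This is only a rough sketch, assuming that a real PDF starts with the bytes `%PDF-` and that an error page starts with an HTML doctype or `<html` tag. The function names are made up for illustration.

```python
PDF_MAGIC = b"%PDF-"


def classify_payload(data: bytes) -> str:
    """Guess whether the leading bytes of a file are a PDF or an HTML page."""
    if data.startswith(PDF_MAGIC):
        return "pdf"
    # Error pages are text, so compare case-insensitively and skip leading whitespace.
    head = data.lstrip()[:64].lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return "html"
    return "unknown"


def classify_file(path: str) -> str:
    """Classify a harvested file by reading only its first kilobyte."""
    with open(path, "rb") as fh:
        return classify_payload(fh.read(1024))


if __name__ == "__main__":
    import sys

    # Usage: python check_pdfs.py file1.pdf file2.pdf ...
    for name in sys.argv[1:]:
        print(name, classify_file(name))
```

Any file reported as `html` is worth opening to read the actual server response.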
On Thu, Apr 6, 2017 at 9:28 AM, euler <[email protected]> wrote:
Dear All,
I would like to know why the PDFs harvested from our repository are
corrupt; most of them are only about 274 bytes in size. I am using Apache
in front of Tomcat with https enabled. I am not sure where to look for the
cause, and I did not find any entry in the DSpace log files that could be
related to this issue. I tried harvesting our repository from the DSpace
demo and from my local test instance, but the results are the same: corrupt
PDFs. Please help me locate the cause. You can try harvesting a small
collection (with only 3 items) from our repository (set: col_10862_1482).
The OAI source is https://repository.seafdec.org.ph/oai/request. I would
also like to ask whether anybody uses a special OAI setup with https,
because I have a hunch that this could be a reason as well.
Thanks in advance.
euler
--
You received this message because you are subscribed to the Google
Groups "DSpace Technical Support" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/dspace-tech.
For more options, visit https://groups.google.com/d/optout.
--
Claudia Juergen
Eldorado
Technische Universität Dortmund
Universitätsbibliothek
Vogelpothsweg 76
44227 Dortmund
Tel.: +49 231-755 40 43
Fax: +49 231-755 40 32
[email protected]
www.ub.tu-dortmund.de
Important note: The information included in this e-mail is confidential. It is
solely intended for the recipient. If you are not the intended recipient of
this e-mail please contact the sender and delete this message. Thank you.
Without prejudice of e-mail correspondence, our statements are only legally
binding when they are made in the conventional written form (with personal
signature) or when such documents are sent by fax.