Sue,

A few comments inline...

On 5/6/2010 11:44 AM, Thornton, Susan M. (LARC-B702)[RAYTHEON TECHNICAL SERVICES COMPANY] wrote:
> I just noticed in one of our DSpace instances that _all_ of the rows
> in the bitstream table have column “bitstream_format_id” set to “1” –
> Unknown. All the documents are either .pdf files or their equivalent
> .pdf.txt files (from filter-media). The strange thing is that all the
> .pdf files are in ORIGINAL bundles and all the .pdf.txt files are in
> TEXT bundles.

The ORIGINAL bundle always contains the original files (as they were uploaded to DSpace). The TEXT bundle always contains the text-extraction files that are auto-generated by the filter-media script. More info on Bundle usage can be found in the DSpace Data Model descriptions:

http://www.dspace.org/1_6_0Documentation/ch02.html#docbook-functional.html-data_model

>
> What is the proper way to set the value of “bitstream_format_id” during
> an import? Is it a field you have to include in the Contents file? Or is
> it supposed to be set programmatically in DSpace? I guess I can write a
> query to “update” the bitstream_format_id based on the document names,
> i.e., .pdf files are bitstream_format_id = “3” and .pdf.txt files should
> be “5”.

DSpace will attempt to recognize file formats automatically on upload/ingest. It does so in a very rudimentary way, essentially by checking the file extension. If the uploaded file's extension matches a known extension in DSpace's Bitstream Format Registry, then DSpace will assume the file is of that known format.

So, each of your ".pdf" files should have been auto-recognized as PDF format, assuming your Bitstream Format Registry has an entry for ".pdf". It should, as this is a default entry -- the only way it wouldn't is if you specifically removed it, or your Format Registry was not initialized properly to begin with.

I'm at a loss as to why this doesn't seem to be working in your DSpace installation (I've never seen this before).

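
To make the extension check concrete, here is a rough sketch of that kind of lookup. This is not DSpace's actual code: the function name and the in-memory table are hypothetical, the format IDs (1 = Unknown, 3 = PDF, 5 = text) are simply the ones quoted in your message, and matching on the last extension only is an assumption.

```python
import os

# Hypothetical stand-in for the Bitstream Format Registry.
# The IDs are the ones mentioned in the question; a real registry
# is stored in the database and may use different values.
UNKNOWN_FORMAT_ID = 1
EXTENSION_TO_FORMAT_ID = {
    "pdf": 3,  # PDF
    "txt": 5,  # plain text
}


def guess_format_id(filename):
    """Return the format ID registered for the file's last extension.

    Falls back to the Unknown format when there is no extension or
    the extension is not in the registry.
    """
    _, ext = os.path.splitext(filename)
    ext = ext.lstrip(".").lower()
    return EXTENSION_TO_FORMAT_ID.get(ext, UNKNOWN_FORMAT_ID)


if __name__ == "__main__":
    for name in ("report.pdf", "report.pdf.txt", "report"):
        print(name, guess_format_id(name))
```

Under this sketch, a file only ends up as Unknown when its extension is missing from the table, which is why a registry that is missing or was never initialized is worth ruling out first.
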
Is there any custom submission/ingest code that could be affecting this? Are you ingesting this content via a UI (XMLUI or JSPUI), or is it all ingested via the command line? Either way, DSpace should be recognizing the formats properly, but knowing which one could help narrow down the problem.

- Tim

_______________________________________________
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech