Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-28 Thread Kyle Banerjee
On Sat, Apr 27, 2013 at 9:37 PM, Andrew Hankinson 
andrew.hankin...@gmail.com wrote:

 As someone who works on document recognition, I have to disagree. You
 should always keep an uncompressed original around, since you can never
 recover it without (often expensive) re-imaging. JPEG, or any other type of
 lossy compression, introduces artifacts that don't look too bad by the
 human eye, but have a significant effect on the quality of OCR. You can
 never recover this after you have discarded your originals.

 Big files are clunky to work with, which is why you should have an
 automated way of producing surrogate, compressed copies for general use,
 but like any archivist will tell you, a photocopy is not a replacement for
 the original.


All true, but keeping just-in-case copies of uncompressed files around
has significant disadvantages unless you have the resources to deal with
them. Any archivist will tell you they need the uncompressed files.
However, many of them don't have the disk space, bandwidth, staff
resources, etc. to deal with these files and wind up doing things that are
far more dangerous, like just having files sitting around on cheap external
HDs.

Every choice people make is about loss. Equipment, optics, lighting, you
name it. But for some reason, the instant we're talking about bits of data
on a disk, people plan as though capacity were unlimited when most archives
are severely underresourced.

If you only have to deal with a few small projects, keeping uncompressed
images is no big deal. But let's suppose you have a million pages or more
-- this introduces a completely different cost structure that permanently
affects what resources you'll have for other projects in the future.
Objectives and available resources need to drive decisions unless we
believe that the best plan is to do what we'd do in an ideal world until
resources run out.
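To put rough numbers on that cost structure, here is a back-of-the-envelope sketch in Python. The per-page size, the lossless ratio, and the replica count are all assumptions picked for illustration, not measurements from anyone's collection; plug in your own.

```python
# Back-of-the-envelope storage estimate. Every number here is an assumption
# chosen for illustration, not a measurement from this thread.

def storage_tb(pages: int, mb_per_page: float, copies: int = 1) -> float:
    """Total storage in decimal terabytes (1 TB = 1,000,000 MB)."""
    return pages * mb_per_page * copies / 1_000_000

PAGES = 1_000_000        # "a million pages or more"
UNCOMPRESSED_MB = 25.0   # hypothetical uncompressed master size per page
LOSSLESS_RATIO = 2.0     # hypothetical lossless compression ratio
COPIES = 3               # hypothetical number of replicas

uncompressed = storage_tb(PAGES, UNCOMPRESSED_MB, COPIES)
lossless = storage_tb(PAGES, UNCOMPRESSED_MB / LOSSLESS_RATIO, COPIES)
print(f"uncompressed masters: {uncompressed:.1f} TB")
print(f"lossless compressed:  {lossless:.1f} TB")
```

The point is only that the total scales linearly with page count, per-page size, and replica count, so a choice that is harmless for a few small projects gets multiplied a million times over.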

kyle


Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-28 Thread Simon Spero
On Sun, Apr 28, 2013 at 2:43 AM, Kyle Banerjee kyle.baner...@gmail.com wrote:

 Every choice people make is about loss. Equipment, optics, lighting, you
 name it. But for some reason, the instant we're talking about bits of data
 on a disk, people plan as though capacity were unlimited when most
 archives are severely underresourced.


 Strictly speaking, it is not correct to say that every choice is about
loss (or cost), and for once I'm saying this in a case where the difference
is actually significant. [Someone help edsu to the fainting couch.]

If a particular set of choices is below the Production Possibility
Frontier (http://en.wikipedia.org/wiki/Production%E2%80%93possibility_frontier),
then those choices are strictly inferior to those that are on the Frontier.
Why is this relevant? Because, for the situation where lossless image
storage has a very high value, TIFF is not the most space-efficient way of
storing the data.

A month or so ago I did a few measurements, using a (not necessarily
representative) color photograph (TIFF extracted from a Canon EOS-10D raw).


For lossless conversion, I used uncompressed TIFF, compressed TIFF, PNG,
and JP2 (100% quality).   Measurements using the ImageMagick compare
utility  confirmed zero signal loss:

-rw-r--r--@ 1 ses  staff    18M Mar 19 14:52 CRW_4237_tiff_8_uncompressed.tif
-rw-r--r--@ 1 ses  staff   9.4M Mar 19 14:53 CRW_4237_tiff_8_compressed.tif
-rw-r--r--  1 ses  staff   8.2M Mar 19 14:29 CRW_4237-0.png
-rw-r--r--@ 1 ses  staff   6.1M Mar 19 14:03 CRW_4237_quality_100-0.jp2
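
For anyone unfamiliar with the check, "zero signal loss" here means the pixel-by-pixel error between the original and the decoded copy is exactly zero. A minimal Python sketch of the RMSE idea follows; it is not ImageMagick's implementation, and the sample values are made up purely for illustration.

```python
import math

def rmse(a, b):
    """Root-mean-square error between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of samples")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))

original = [0, 128, 255, 64, 32, 200]     # made-up 8-bit samples
lossless_copy = list(original)            # what a lossless round trip returns
lossy_copy = [0, 130, 252, 64, 30, 201]   # made-up lossy result

print(rmse(original, lossless_copy))  # zero means no samples changed
print(rmse(original, lossy_copy))     # above zero means samples changed
```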

For lossy compression, using RMSE as the metric, we can see that JPEG at
90% quality is showing measurable signal degradation, with a compression
ratio of 4.7:1  relative to the JP2 file (vs. 14:1 relative to uncompressed
tiff, and 7.2:1 for compressed).

$ compare ... CRW_4237_jpg_90.jpg = 459.806 (0.00701619) [1.3M] (4.7:1)

JP2 at quality 75 showed slightly less signal loss by RMSE, with a
compression ratio of 5.5:1 (again relative to the lossless JP2 file):

$ compare ... CRW_4237_quality_75.jp2 = 457.959 (0.006988) [1.1M] (5.5:1)
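
The ratios above can be rechecked from the rounded sizes with a few lines of Python. The dictionary keys are labels I made up for this sketch; the sizes are the figures quoted above. I am also assuming the RMSE figures came from a 16-bit-quantum ImageMagick build, in which case the number in parentheses is the raw figure divided by 65535.

```python
# File sizes in MB, copied from the listing and the bracketed values above.
sizes_mb = {
    "tiff_uncompressed": 18.0,
    "tiff_compressed": 9.4,
    "png": 8.2,
    "jp2_q100": 6.1,
    "jpg_q90": 1.3,
    "jp2_q75": 1.1,
}

def ratio(reference: str, candidate: str) -> float:
    """Compression ratio of candidate relative to reference."""
    return sizes_mb[reference] / sizes_mb[candidate]

for ref in ("jp2_q100", "tiff_uncompressed", "tiff_compressed"):
    print(f"jpg_q90 vs {ref}: {ratio(ref, 'jpg_q90'):.1f}:1")
print(f"jp2_q75 vs jp2_q100: {ratio('jp2_q100', 'jp2_q75'):.1f}:1")

# Assumption: 16-bit quantum, so normalized RMSE = raw RMSE / 65535.
QUANTUM_MAX = 65535
for raw in (459.806, 457.959):
    print(f"{raw} / {QUANTUM_MAX} = {raw / QUANTUM_MAX:.6f}")
```

Because the sizes are rounded to two significant figures, the uncompressed-TIFF ratio comes out near 13.8, which I rounded to 14:1.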

Note that the image type was a color photograph; other image types may get
better lossless compression using PNG or TIFF. Also, some people have
expressed concern over the use of JP2 for archival purposes due to a
relatively small number of open-source libraries. On the other hand, JP2
has some potentially useful properties for distributed replicated
preservation (layers with fine levels of detail could be split off and
stored on fewer replicas).

Simon


Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-28 Thread Kyle Banerjee
Sure, but these are especially small photos -- are these born digital?

Lossless scans of pretty small photos are frequently well over 100MB, and
it takes hardly anything to get a 1GB scan. It costs a fortune to treat a
thesis (i.e. what's really being preserved is readable text rather than the
artifact itself) as if it were a historical document where the texture of
the paper and the like is actually relevant.

The problem we encounter is that people want to scan a tiny photo and blow
it up to poster or wall size. But the source material and the equipment are
typically far more limiting factors than the format unless you're doing
something that's totally nuts. You want maximum flexibility for
unanticipated future needs, but value judgments need to be made up front or
you wind up painting yourself into a corner.

kyle





Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-27 Thread Wilhelmina Randtke
Yes, exactly. You will lose some of the image quality. If you change to
a lossy compressed format and then back to TIFF, you get the format back,
but you can't get back the original file.

Stop and think:  What are your long term goals?

Big files are clunky to work with.  I'm guessing that's why you don't want
TIFF.  In my experience, files big enough to be clunky are discarded within
a few years, regardless of the intentions when they were prepped.  If you
want to avoid big files, then your best bet is to assess and test the file
you will actually keep and do the best job you can with it.  So, if you
want to rerun OCR in a few years when the recognition will be better, then
make your PDFs in such a way that you can get decent OCR out of them today,
and plan to rerun on those files, not the (discarded) originals.  Don't
think reformatting will get you any better image quality later.

-Wilhelmina Randtke




Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-27 Thread Andrew Hankinson
As someone who works on document recognition, I have to disagree. You should
always keep an uncompressed original around, since you can never recover it
without (often expensive) re-imaging. JPEG, or any other type of lossy
compression, introduces artifacts that don't look too bad to the human eye,
but have a significant effect on the quality of OCR. You can never recover this
after you have discarded your originals.

Big files are clunky to work with, which is why you should have an automated
way of producing surrogate, compressed copies for general use, but as any
archivist will tell you, a photocopy is not a replacement for the original.

-Andrew



[CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Edward M. Corrado
Hi All,

I have a need to batch convert many TIFF images to PDF. I'd then like to be
able to discard the TIFF images, but I can only do that if I can create the
original TIFF again from the PDF. Is this possible? If so, using what tools
and how?

tiff2pdf seems like a possible solution, but I can't find a corresponding
pdf2tif program that reverses the process.

Any ideas?

Edward


Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Roy
If you can stand an extra step, Ed, there are tools to convert PDF to JPG
images, and from there it shouldn't be too hard to get TIFF output. Do a
search for "convert PDF to image" to get started. There are tools that
are not online-only, which I'm pretty sure is what you're after.


Roy Zimmer
Western Michigan University




Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Friscia, Michael
ImageMagick can do it; you need Ghostscript installed, though. I've done
this with multi-layer TIFFs and multi-page PDFs.
-mike


___
Michael Friscia
Manager, Digital Library & Programming Services
 
Yale University Library
(203) 432-1856







Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread James Gilbert
I'm by no means an expert in the math behind image format conversions...
but:

When converting TIFF to JPG, TIFF is an uncompressed format and JPG is a
compressed format.

When converting back, wouldn't the original quality of the TIFF be lost,
leaving only the quality of the last JPG (with further degradation each time
this process occurs)?

James Gilbert, BS, MLIS
Systems Librarian
Whitehall Township Public Library
3700 Mechanicsville Road
Whitehall, PA 18052
610-432-4339 ext: 203



Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Aaron Addison
ImageMagick's convert will do it both ways.

convert a.tiff b.pdf
convert b.pdf a.tiff

If the pdf is more than one page, the tiff will be a multipage tiff.
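
If you want to script the round trip, a minimal Python sketch might look like the following. The function names and the a2.tiff output name are made up for illustration (writing to a new name avoids overwriting the original while you compare), and the PDF-to-TIFF direction assumes Ghostscript is installed alongside ImageMagick.

```python
import shutil
import subprocess

def roundtrip_commands(tiff_in: str, pdf: str, tiff_out: str):
    """Build the two ImageMagick invocations as argument lists."""
    return [
        ["convert", tiff_in, pdf],    # TIFF -> PDF
        ["convert", pdf, tiff_out],   # PDF -> TIFF (needs Ghostscript)
    ]

def run_roundtrip(tiff_in: str, pdf: str, tiff_out: str) -> bool:
    """Run both steps; return False if ImageMagick's convert is not found."""
    if shutil.which("convert") is None:
        return False
    for cmd in roundtrip_commands(tiff_in, pdf, tiff_out):
        subprocess.run(cmd, check=True)
    return True

# Show the commands without running them.
for cmd in roundtrip_commands("a.tiff", "b.pdf", "a2.tiff"):
    print(" ".join(cmd))
```

Getting a TIFF back this way says nothing about whether it matches the original, so compare the two before discarding anything.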

Aaron

-- 
Aaron Addison
Unix Administrator 
W. E. B. Du Bois Library UMass Amherst
413 577 2104





Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Steve Cherry
Yes, converting from JPG to TIFF would have the quality of a JPG and (I 
believe) the file size of a TIFF.
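
A toy illustration in Python, using coarse quantization as a stand-in for lossy compression. This is not how JPEG actually works, and the sample values and step size are made up; it only shows that the decoded data has the full sample count again while the discarded detail stays gone.

```python
def lossy_encode(samples, step=16):
    """Toy stand-in for lossy compression: keep only the quantization index."""
    return [s // step for s in samples]

def decode(indices, step=16):
    """Expand indices back to full-range samples, as saving to TIFF would store."""
    return [i * step + step // 2 for i in indices]

original = [3, 17, 18, 100, 101, 250]   # made-up 8-bit samples
restored = decode(lossy_encode(original))

print(restored)                          # [8, 24, 24, 104, 104, 248]
print(restored == original)              # False: the detail is gone
print(len(restored) == len(original))    # True: same number of samples to store

# A second encode/decode cycle with the same step changes nothing further.
print(decode(lossy_encode(restored)) == restored)   # True
```

In this toy, repeating the cycle with identical settings does no extra damage. Real JPEG encoders may still drift a little between generations, for example when the quality setting changes from one save to the next.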




--
Steve Cherry
Electronic Services Librarian
The Catholic University of America
202-319-6433


Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Pottinger, Hardy J.
Hi, you'll notice from the language you use to describe your use case
that you use the word "convert" to describe what you're doing to the
original TIFF images. Once you're done producing a derivative from those
TIFFs, the only way back to the original TIFFs is to go back to the
actual originals. The TIFF images are not stored in the PDF. The only way
to go back to the originals is to preserve them.
--
HARDY POTTINGER pottinge...@umsystem.edu
University of Missouri Library Systems
http://lso.umsystem.edu/~pottingerhj/
https://MOspace.umsystem.edu/
Do you love it? Do you hate it? There it is, the way you made it.
--Frank Zappa







Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Edward M. Corrado
This works sometimes. Well, it does give me a new TIFF file from the PDF
all of the time, but it is not always anywhere near the same size as the
original TIFF. My guess is that maybe there is a flag or something that
would help. Here is what I get with one file:


ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.tif A001a.pdf
ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.pdf A001b.tif
ecorrado@ecorrado:~/Desktop/test$ ls -al
total 361056
drwxrwxr-x 2 ecorrado ecorrado     4096 Apr 26 17:07 .
drwxr-xr-x 7 ecorrado ecorrado    20480 Apr 26 16:54 ..
-rw-rw-r-- 1 ecorrado ecorrado 38497046 Apr 26 17:07 A001a.pdf
-rw-r--r-- 1 ecorrado ecorrado 38178650 Apr 26 17:07 A001a.tif
-rw-rw-r-- 1 ecorrado ecorrado  5871196 Apr 26 17:07 A001b.tif


In this case, the two TIFF files should be the same size. They are not even
close. Maybe there is a flag to convert (besides -compress) that I can use.
FWIW: I tried three files; two are like this. For the other one, the
resulting TIFF is the same size as the original.
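
File size alone is a weak test, so here is a minimal Python sketch of a stricter check that compares sizes first and then SHA-256 digests. The helper names are made up for this example. Note that it tests for byte-identical files, which is a stronger condition than identical pixels: two TIFFs can hold the same image data and still differ in metadata or layout.

```python
import hashlib
import os

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large TIFFs need not fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def same_file(path_a: str, path_b: str) -> bool:
    """True only if both files are byte-for-byte identical."""
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    return sha256_of(path_a) == sha256_of(path_b)

if __name__ == "__main__":
    a, b = "A001a.tif", "A001b.tif"
    if os.path.exists(a) and os.path.exists(b):
        print(same_file(a, b))
```

One possible explanation for the much smaller file, offered only as a guess: when reading a PDF, convert rasterizes the page through Ghostscript at its default density of 72 dpi unless -density says otherwise, so the new TIFF may simply hold far fewer pixels than the original.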

Edward








Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Edward M. Corrado
Actually, I'm mistaken. It didn't ever work. :-(. I do get a tiff, but not
the original. I looked at the wrong files.






Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Ethan Gruber
What's your use case in this scenario? Do you want to provide access to the
PDFs over the web or are you using them as your archival format?  You
probably don't want to use PDF to achieve both objectives.

Ethan



Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Edward M. Corrado
Hardy,

You may very well be correct, but some programs claim to keep the original
image data unaltered [1], so I was hoping that was the case (basically, it
would put some sort of wrapper around the TIFF). tiff2pdf on my Ubuntu box
seems to keep the file sizes very close when I use it, so I'm thinking it
still might be possible. But then again, it might not be, and it might
depend on the features of the TIFF file (and which PDF version) being
used.

If I can't do it, I'll figure something else out, but it would make my life
easier to have to deal with only one file for each representation. But,
I'll live regardless :-)

Edward

[1] http://www.davince.com/docs/tiff2pdf.html is one example of a program
that says this, but it also does point out not all features of tiff are
supported in pdf. It is also old, and they don't offer a program that I can
find that does the reversal.






Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Edward M. Corrado
On Fri, Apr 26, 2013 at 5:29 PM, Ethan Gruber ewg4x...@gmail.com wrote:

 What's your use case in this scenario? Do you want to provide access to the
 PDFs over the web or are you using them as your archival format?  You
 probably don't want to use PDF to achieve both objectives.




The problem I have is that I have multipage TIFF files and I don't currently
have a good way for users to view them. I also need to preserve these
files. Ideally my use case would be to use PDF files created from the TIFFs
for both access and an archival format. But, as I said, that depends
on whether I can recreate the original TIFF. I have the option of creating a
custom viewer that can deal with the display of the TIFF files, but I'm
looking for other options.

So I have a few choices that I thought of implementing (that I haven't
ruled out):

1) This is what I asked about. Make a PDF from the TIFF files. If I could
embed the tiff into a pdf, and then at some point recreate the tiff if
needed for archival purposes, I have my solution.

2) Convert the multipage TIFF files to individual TIFF files. This would
work for my end users, but would be more clunky than a PDF for them. The new
TIFF files could be my archival copy.

3) Convert the multipage TIFF files to PDF (probably in a smaller,
compressed? state), use the PDF for display/access, save the TIFF for
archival purposes.

4) Convert the multipage TIFFs to PDF (or PDF/A?), and don’t worry about
being able to recreate the original TIFF files.

I should add, the content is what is important in these documents and they
are mostly type written or hand written text. Still, I'd like to keep them
in as high quality of a format as possible.

I'm sure there are some other possible solutions as well. I really would
like #1, but it may not be possible. If it isn't, I need to decide (with
representatives of my user community) which of the others are better. My
guess is it would be #3, but I am not positive.
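
For what it's worth, option #3 could be planned with something like the Python sketch below. The function names and directory arguments are hypothetical; the only tool-specific detail is tiff2pdf's -o flag, which names the output file. The sketch only prints the commands and never touches the master TIFFs.

```python
from pathlib import Path

def plan_derivatives(master_dir: str, access_dir: str):
    """Pair each master TIFF with the access PDF that option 3 would create."""
    masters = sorted(
        p for p in Path(master_dir).iterdir()
        if p.suffix.lower() in (".tif", ".tiff")
    )
    return [(m, Path(access_dir) / (m.stem + ".pdf")) for m in masters]

def tiff2pdf_command(master: Path, pdf: Path):
    """Argument list for libtiff's tiff2pdf; -o names the output file."""
    return ["tiff2pdf", "-o", str(pdf), str(master)]

if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): plan.py MASTER_DIR ACCESS_DIR
    if len(sys.argv) == 3:
        for master, pdf in plan_derivatives(sys.argv[1], sys.argv[2]):
            print(" ".join(tiff2pdf_command(master, pdf)))
```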

Edward
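
A minimal sketch of option #3 above (derive PDFs for access, keep the TIFFs as the archival copies), assuming libtiff's tiff2pdf is installed. The function name, the TIFF2PDF override variable, and the directory arguments are hypothetical and exist only for illustration; compression settings are left at the tool's defaults.

```shell
#!/bin/sh
# Sketch only: build one access PDF per TIFF and leave the TIFFs untouched.
# Assumes libtiff's tiff2pdf is on PATH. TIFF2PDF is a hypothetical override
# so a different binary (or a test stub) can be substituted.

make_access_copies() {
    src_dir=$1      # directory holding the archival .tif files
    out_dir=$2      # directory that receives the access PDFs
    mkdir -p "$out_dir" || return 1
    for tif in "$src_dir"/*.tif; do
        [ -e "$tif" ] || continue   # glob matched nothing, skip the literal pattern
        pdf="$out_dir/$(basename "$tif" .tif).pdf"
        # -o names the output file; the source TIFF is only read, never modified.
        "${TIFF2PDF:-tiff2pdf}" -o "$pdf" "$tif" || return 1
    done
}
```

Usage would look like `make_access_copies ./tiffs ./access`. Writing the PDFs to a separate directory keeps the access copies disposable: they can be regenerated from the TIFFs at any time.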







 Ethan
 On Apr 26, 2013 5:11 PM, Edward M. Corrado ecorr...@ecorrado.us wrote:

  This works sometimes. Well, it does give me a new TIFF file from the PDF
  every time, but it is not always anywhere near the same size as the
  original TIFF. My guess is that maybe there is a flag or something that
  would help. Here is what I get with one file:
 
 
  ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.tif A001a.pdf
  ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.pdf A001b.tif
  ecorrado@ecorrado:~/Desktop/test$ ls -al
  total 361056
  drwxrwxr-x 2 ecorrado ecorrado     4096 Apr 26 17:07 .
  drwxr-xr-x 7 ecorrado ecorrado    20480 Apr 26 16:54 ..
  -rw-rw-r-- 1 ecorrado ecorrado 38497046 Apr 26 17:07 A001a.pdf
  -rw-r--r-- 1 ecorrado ecorrado 38178650 Apr 26 17:07 A001a.tif
  -rw-rw-r-- 1 ecorrado ecorrado  5871196 Apr 26 17:07 A001b.tif
 
 
  In this case, the two TIFF files should be the same size. They are not even
  close. Maybe there is a flag to convert (besides -compress) that I can use.
  FWIW: I tried three files; two are like this. For the other one, the
  resulting TIFF is the same size as the original.
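
File size alone is a weak test of whether a round trip preserved the original. One likely cause of the mismatch is that ImageMagick rasterizes PDF input through Ghostscript, by default at 72 dpi, so the regenerated TIFF can have different pixel dimensions. A stricter check is a byte-for-byte comparison. The sketch below uses only cmp, wc, and sha256sum (GNU coreutils assumed); the function names are hypothetical. Note that two TIFFs can hold identical pixels and still differ in bytes, so a failed check here does not by itself prove image data was lost.

```shell
#!/bin/sh
# Sketch only: compare an original file with its round-tripped copy.

# Succeeds (exit 0) only when both files are byte-for-byte identical.
roundtrip_identical() {
    cmp -s "$1" "$2"
}

# Prints name, size in bytes, and SHA-256 digest for each file given.
describe_files() {
    for f in "$@"; do
        printf '%s  %s bytes  ' "$f" "$(wc -c < "$f" | tr -d ' ')"
        sha256sum "$f" | cut -d' ' -f1
    done
}
```

For the files in the listing above, this would be called as `roundtrip_identical A001a.tif A001b.tif || describe_files A001a.tif A001b.tif`, which prints the details only when the two files differ.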
 
  Edward
 
 
 
 
 
  On Fri, Apr 26, 2013 at 4:25 PM, Aaron Addison addi...@library.umass.edu wrote:
 
   Imagemagick's convert will do it both ways.
  
   convert a.tiff b.pdf
   convert b.pdf a.tiff
  
   If the pdf is more than one page, the tiff will be a multipage tiff.
  
   Aaron
  
   --
   Aaron Addison
   Unix Administrator
   W. E. B. Du Bois Library UMass Amherst
   413 577 2104
  
  
  
   On Fri, 2013-04-26 at 16:08 -0400, Edward M. Corrado wrote:
Hi All,
   
I have a need to batch convert many TIFF images to PDF. I'd then like to be
able to discard the TIFF images, but I can only do that if I can create the
original TIFF again from the PDF. Is this possible? If so, using what tools
and how?

tiff2pdf seems like a possible solution, but I can't find a corresponding
pdf2tif program that reverses the process.
   
Any ideas?
   
Edward
  
 



Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Jason Curtis
Hi, Edward:

After reading through the string of messages and the options that you list 
below, I think that #3 is your best option.  It seems to best fall in line with 
good archiving practices as I understand them (have one copy for public use and 
another for archival purposes).  If you really want to convert the TIFF to PDF 
and ditch the TIFF file, I would suggest using PDF/A, the archival version of 
PDF, if you can.  Best of luck!

Sincerely,
Jason

__
Jason Curtis
Technical Services Librarian
Legal Research Center
University of San Diego
5998 Alcalá Park
San Diego, CA 92110
Ph: (619) 260-4600, ext.2875
Fax: (619) 260-7495
cur...@sandiego.edu


Re: [CODE4LIB] tiff2pdf, then back to pdf?

2013-04-26 Thread Andrew Cunningham
Although I do find the persistent myth of PDF/A as an archival format
amusing.

Under very specific circumstances it can be, but it's rare for those
circumstances to be deliberately met.

And for many languages it is impossible to use PDF for archival purposes
ever.

It is the nature of PDF.
On 27/04/2013 8:28 AM, Jason Curtis cur...@sandiego.edu wrote:

 Hi, Edward:

 After reading through the string of messages and the options that you list
 below, I think that #3 is your best option.  It seems to best fall in line
 with good archiving practices as I understand them (have one copy for
 public use and another for archival purposes).  If you really want to
 convert the TIFF to PDF and ditch the TIFF file, I would suggest using
 PDF/A, the archival version of PDF, if you can.  Best of luck!

 Sincerely,
 Jason

 __
 Jason Curtis
 Technical Services Librarian
 Legal Research Center
 University of San Diego
 5998 Alcalá Park
 San Diego, CA 92110
 Ph: (619) 260-4600, ext.2875
 Fax: (619) 260-7495
 cur...@sandiego.edu

 -Original Message-
 From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
 Edward M. Corrado
 Sent: Friday, April 26, 2013 2:55 PM
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] tiff2pdf, then back to pdf?

 On Fri, Apr 26, 2013 at 5:29 PM, Ethan Gruber ewg4x...@gmail.com wrote:

  What's your use case in this scenario? Do you want to provide access
  to the PDFs over the web or are you using them as your archival
  format?  You probably don't want to use PDF to achieve both objectives.
 



 The problem is that I have multipage TIFF files and I don't currently
 have a good way for users to view them. I also need to preserve these
 files. Ideally my use case would be to use PDF files created from the TIFFs
 for both preservation and an archival format. But, as I said, that depends
 on whether I can recreate the original TIFF. I have the option of creating a
 custom viewer that can handle the display of the TIFF files, but I'm
 looking for other options.

 So I have a few choices that I thought of implementing (that I haven't
 ruled out):

 1) This is what I asked about. Make a PDF from the TIFF files. If I could
 embed the tiff into a pdf, and then at some point recreate the tiff if
 needed for archival purposes, I have my solution.

 2) Convert the multipage TIFF files to individual TIFF files. This would
 work for my end users, but would be clunkier than a PDF for them. The new
 TIFF files could be my archival copy.

 3) Convert the multipage TIFF files to PDF (probably in a smaller,
 compressed? state), use the PDF for display/access, save the TIFF for
 archival purposes.

 4) Convert the multipage TIFFs to PDF (or PDF/A?), and don't worry about
 being able to recreate the original TIFF files.

 I should add, the content is what is important in these documents and they
 are mostly typewritten or handwritten text. Still, I'd like to keep them
 in as high-quality a format as possible.

 I'm sure there are some other possible solutions as well. I really would
 like #1, but it may not be possible. If it isn't, I need to decide (with
 representatives of my user community) which of the others are better. My
 guess is it would be #3, but I am not positive.

 Edward






 
  Ethan
  On Apr 26, 2013 5:11 PM, Edward M. Corrado ecorr...@ecorrado.us
 wrote:
 
   This works sometimes. Well, it does give me a new TIFF file from the
   PDF every time, but it is not always anywhere near the same
   size as the original TIFF. My guess is that maybe there is a flag or
   something that would help. Here is what I get with one file:
  
  
   ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.tif A001a.pdf
   ecorrado@ecorrado:~/Desktop/test$ convert -compress none A001a.pdf A001b.tif
   ecorrado@ecorrado:~/Desktop/test$ ls -al
   total 361056
   drwxrwxr-x 2 ecorrado ecorrado     4096 Apr 26 17:07 .
   drwxr-xr-x 7 ecorrado ecorrado    20480 Apr 26 16:54 ..
   -rw-rw-r-- 1 ecorrado ecorrado 38497046 Apr 26 17:07 A001a.pdf
   -rw-r--r-- 1 ecorrado ecorrado 38178650 Apr 26 17:07 A001a.tif
   -rw-rw-r-- 1 ecorrado ecorrado  5871196 Apr 26 17:07 A001b.tif
  
  
   In this case, the two TIFF files should be the same size. They are not even
   close. Maybe there is a flag to convert (besides -compress) that I can use.
   FWIW: I tried three files; two are like this. For the other one, the
   resulting TIFF is the same size as the original.
  
   Edward
  
  
  
  
  
   On Fri, Apr 26, 2013 at 4:25 PM, Aaron Addison 
  addi...@library.umass.edu
   wrote:
  
Imagemagick's convert will do it both ways.
   
convert a.tiff b.pdf
convert b.pdf a.tiff
   
If the pdf is more than one page, the tiff will be a multipage tiff.
   
Aaron
   
--
Aaron Addison
Unix Administrator
W. E. B. Du Bois Library UMass Amherst
413 577 2104
   
   
   
On Fri, 2013-04-26 at 16:08 -0400, Edward M. Corrado