Re: [CODE4LIB] mass convert jpeg to pdf
Roy Tennant wrote: Throwing in my two cents on the IIP Image Server. I've been using it on my photos web site[0] for a while now and it works great. I was also happy to see that there is a version that supports the International Image Interoperability Framework (IIIF) API [1], which I was introduced to at DLF by Tom Cramer and company. That would make you compliant with the Mirador multi-windowing tool that he mentioned. Sounds like a win-win to me. +1. iipsrv is extremely fast and easy to deploy (a single static fcgi binary). I've only now learned about its IIIF compliance, and I've just tried the branch https://github.com/ruven/iipsrv/tree/iiif -- it works great, even if it is a bit undocumented. I looked at the code to understand the URL: http://{SERVER}/iipsrv.fcgi?iiif={IMAGE}.tif/full/full/0/native.jpg But iipsrv serves only jp2 or tiff images. Are you aware of other decoding modules? https://github.com/ruven/iipsrv/blob/master/README#L181 bye -- raffaele, @atomotic
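For anyone who wants to script against the URL pattern raffaele quotes, here is a minimal Python sketch that assembles such a URL. It assumes the path segments follow the IIIF Image API 1.1 order (region/size/rotation/quality.format); the function name, the server name and the image path are made up for illustration and are not part of iipsrv.

```python
from urllib.parse import quote


def iiif_image_url(server, image, region="full", size="full",
                   rotation=0, quality="native", fmt="jpg"):
    """Build a URL in the iipsrv IIIF pattern quoted in the message above.

    `server` and `image` are placeholders supplied by the caller; the
    defaults reproduce the full/full/0/native.jpg example.
    """
    # Percent-encode the identifier but keep directory separators readable.
    identifier = quote(image, safe="/")
    return (f"http://{server}/iipsrv.fcgi?iiif={identifier}"
            f"/{region}/{size}/{rotation}/{quality}.{fmt}")


# Whole image, as in the quoted pattern (hypothetical server and file).
print(iiif_image_url("example.org", "book/page001.tif"))

# A 1024x1024 region from the top-left corner, scaled to 512 pixels wide.
print(iiif_image_url("example.org", "book/page001.tif",
                     region="0,0,1024,1024", size="512,"))
```

Only the identifier is percent-encoded here; region and size are passed through as given, so the caller is responsible for using values the server accepts.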
Re: [CODE4LIB] mass convert jpeg to pdf
Just thought I might plug some software we're developing to solve the book image navigation misery that Kyle mentions. http://ddmal.music.mcgill.ca/diva/ and a demo: http://ddmal.music.mcgill.ca/newdiva/demo/single.html We developed it because we were frustrated with the image gallery paradigm for book image viewing, and wanted something more like Google Books' viewer, but with access to the highest resolution possible. We also were frustrated with having to download large PDFs to just view a couple pages. Diva uses IIP on the back-end to serve out image tiles, so you're only ever downloading the part of the image that's viewable -- the rest is auto-loaded as the user scrolls. We've used it to display a manuscript that's ~80GB (total), with each image around 200MB. http://coltrane.music.mcgill.ca/salzinnes/experiments/diva-cci-tif/ It's also got a couple other neat features, like in-browser brightness/contrast/rotation adjustments via canvas. (Click the little gear icon in the top left of each page image). Cheers, -Andrew On 2013-11-08, at 4:22 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download want are looking for jpegs rather than source tiffs and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case. As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs which will be made available. 
kyle
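To make the "only ever downloading the part of the image that's viewable" idea concrete, here is a rough Python sketch of how a tiling viewer can work out which tiles overlap the current viewport. This is not Diva's or IIP's actual code; the 256-pixel tile size, the function name and the pixel-based viewport are assumptions for illustration.

```python
def visible_tiles(view_x, view_y, view_w, view_h,
                  image_w, image_h, tile_size=256):
    """Return (column, row) pairs for the tiles that overlap the viewport.

    All arguments are integer pixel values at the current zoom level;
    the viewport is clamped to the image bounds first.
    """
    x0 = max(0, view_x)
    y0 = max(0, view_y)
    x1 = min(image_w, view_x + view_w)
    y1 = min(image_h, view_y + view_h)
    if x0 >= x1 or y0 >= y1:
        return []  # viewport lies entirely outside the image

    first_col, last_col = x0 // tile_size, (x1 - 1) // tile_size
    first_row, last_row = y0 // tile_size, (y1 - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 100x100 window straddling the first tile boundary touches four tiles.
print(visible_tiles(200, 200, 100, 100, image_w=1000, image_h=1000))
```

As the user scrolls, the viewer would call something like this again and request only the tiles it has not already fetched.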
Re: [CODE4LIB] mass convert jpeg to pdf
Nice. -- Al Matthews Software Developer, Digital Services Unit Atlanta University Center, Robert W. Woodruff Library email: amatth...@auctr.edu; office: 1 404 978 2057 On 11/12/13 9:59 AM, Andrew Hankinson andrew.hankin...@gmail.com wrote: Just thought I might plug some software we're developing to solve the book image navigation misery that Kyle mentions. http://ddmal.music.mcgill.ca/diva/ and a demo: http://ddmal.music.mcgill.ca/newdiva/demo/single.html We developed it because we were frustrated with the image gallery paradigm for book image viewing, and wanted something more like Google Books' viewer, but with access to the highest resolution possible. We also were frustrated with having to download large PDFs to just view a couple pages. Diva uses IIP on the back-end to serve out image tiles, so you're only ever downloading the part of the image that's viewable -- the rest is auto-loaded as the user scrolls. We've used it to display a manuscript that's ~80GB (total), with each image around 200MB. http://coltrane.music.mcgill.ca/salzinnes/experiments/diva-cci-tif/ It's also got a couple other neat features, like in-browser brightness/contrast/rotation adjustments via canvas. (Click the little gear icon in the top left of each page image). Cheers, -Andrew On 2013-11-08, at 4:22 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download want are looking for jpegs rather than source tiffs and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case. 
As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs which will be made available. kyle
Re: [CODE4LIB] mass convert jpeg to pdf
I’m similarly curious to hear if other people have done annotation with zoomable interfaces before. I have been trying OpenLayers' stock functions for this kind of thing for OurDigitalWorld; there is an example here [1]. Leaflet probably does this well too. The mapping tools do seem to have some slick drawing functions [2], though I have only used polygons. art --- 1. http://tiles.uwindsor.ca/ink/cecil/focus/swoda/pictures/assumption/09_1915/1 2. http://openlayers.org/dev/examples/draw-feature.html
Re: [CODE4LIB] mass convert jpeg to pdf
Annotorious has been integrated with OpenLayers [1] to support annotation of maps on zoomable images. Quite excellent work indeed, thanks to Rainer. As part of IIIF [2] and Shared Canvas [3] we have been targeting a similar OpenSeadragon integration with Annotorious and then making this a feature/modality in the Mirador image comparison environment [4]. Part of the roadmap for Mirador is to have annotation viewing and making integrated with OpenSeadragon (or similarly tiled) zooming. -Stu [1] http://annotorious.github.io/demos/openlayers-annotation.html [2] http://iiif.io/ [3] http://www.shared-canvas.org/ [4] http://iiif.io/mirador/ On Nov 10, 2013, at 12:34 PM, Edward Summers wrote: Annotorious [1] is a neat little JavaScript library for adding annotations to an image, and displaying them later. I might be wrong, but it doesn’t appear to support zoomable images at the moment. I do see there was some cross-project activity with OpenSeadragon [2] so maybe asking over there will yield some leads? Rainer Simon gave an excellent, brief presentation about Annotorious at iAnnotate earlier this year. [3] Leaflet [4] is widely known as a JavaScript library for doing maps; but the tiling that goes on when displaying maps is very similar to zooming on other images like in OpenSeadragon. Because it is oriented around maps, it definitely supports drawing paths, polygons, other shapes, and there are lots of plugins [5] for various things, including overlaying stuff over the image with Raphael. Another thing to look at from the digital library research angle might be the SharedCanvas work [6,7]. I’m similarly curious to hear if other people have done annotation with zoomable interfaces before. Wondering out loud a bit: don’t your archivists need to make the annotations on a zoomable interface, even if your end-users don’t? 
//Ed [1] http://annotorious.github.io/ [2] https://github.com/openseadragon/openseadragon/issues/14 [3] http://www.youtube.com/watch?v=-HgWIkBeQNM [4] http://leafletjs.com/ [5] http://leafletjs.com/plugins.html [6] http://www.shared-canvas.org/ [7] http://link.springer.com/article/10.1007%2Fs00799-012-0098-8 On Nov 10, 2013, at 9:41 AM, Ethan Gruber ewg4x...@gmail.com wrote: Does anyone have experience with an image zooming engine in conjunction with image annotation? I don't want end users to annotate things themselves, but allow them to click on annotations added by an archivist.
Re: [CODE4LIB] mass convert jpeg to pdf
Does anyone have experience with an image zooming engine in conjunction with image annotation? I don't want end users to annotate things themselves, but allow them to click on annotations added by an archivist. Thanks, Ethan On Nov 8, 2013 4:39 PM, Edward Summers e...@pobox.com wrote: I’m having trouble understanding who the user of this content you are putting into Omeka is, and what you are expecting them to do with it. But, ok … //Ed On Nov 8, 2013, at 4:22 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download want are looking for jpegs rather than source tiffs and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case. As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs which will be made available. kyle
Re: [CODE4LIB] mass convert jpeg to pdf
Annotorious [1] is a neat little JavaScript library for adding annotations to an image, and displaying them later. I might be wrong, but it doesn’t appear to support zoomable images at the moment. I do see there was some cross-project activity with OpenSeadragon [2] so maybe asking over there will yield some leads? Rainer Simon gave an excellent, brief presentation about Annotorious at iAnnotate earlier this year. [3] Leaflet [4] is widely known as a JavaScript library for doing maps; but the tiling that goes on when displaying maps is very similar to zooming on other images like in OpenSeadragon. Because it is oriented around maps, it definitely supports drawing paths, polygons, other shapes, and there are lots of plugins [5] for various things, including overlaying stuff over the image with Raphael. Another thing to look at from the digital library research angle might be the SharedCanvas work [6,7]. I’m similarly curious to hear if other people have done annotation with zoomable interfaces before. Wondering out loud a bit: don’t your archivists need to make the annotations on a zoomable interface, even if your end-users don’t? //Ed [1] http://annotorious.github.io/ [2] https://github.com/openseadragon/openseadragon/issues/14 [3] http://www.youtube.com/watch?v=-HgWIkBeQNM [4] http://leafletjs.com/ [5] http://leafletjs.com/plugins.html [6] http://www.shared-canvas.org/ [7] http://link.springer.com/article/10.1007%2Fs00799-012-0098-8 On Nov 10, 2013, at 9:41 AM, Ethan Gruber ewg4x...@gmail.com wrote: Does anyone have experience with an image zooming engine in conjunction with image annotation? I don't want end users to annotate things themselves, but allow them to click on annotations added by an archivist.
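On the "click on annotations added by an archivist" question, one common approach (sketched here in Python, and not taken from Annotorious, Leaflet or OpenSeadragon) is to store each annotation in full-resolution image coordinates and convert a screen click back into image coordinates using the current zoom and pan. The class, the function names and the zoom/pan model below are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    """A rectangular note stored in full-resolution image pixels."""
    text: str
    x: float
    y: float
    width: float
    height: float


def screen_to_image(screen_x, screen_y, scale, pan_x, pan_y):
    """Map a click in the viewport to image coordinates.

    `scale` is screen pixels per image pixel; (pan_x, pan_y) is the image
    point currently shown at the viewport's top-left corner.
    """
    return pan_x + screen_x / scale, pan_y + screen_y / scale


def annotations_at(annotations, screen_x, screen_y, scale, pan_x, pan_y):
    """Return the annotations whose rectangle contains the clicked point."""
    ix, iy = screen_to_image(screen_x, screen_y, scale, pan_x, pan_y)
    return [a for a in annotations
            if a.x <= ix <= a.x + a.width and a.y <= iy <= a.y + a.height]


notes = [Annotation("Marginal gloss", x=100, y=100, width=50, height=50)]
# Zoomed out to half size: a click at (60, 60) lands on image point (120, 120).
print(annotations_at(notes, 60, 60, scale=0.5, pan_x=0, pan_y=0))
```

Because the annotations live in image coordinates, the same stored data works at every zoom level; only the conversion step changes.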
[CODE4LIB] mass convert jpeg to pdf
We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual PDFs that can be viewed more conveniently. My game plan is to simply have a script pull the files down as jpegs which can be fed to ImageMagick, which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into PDFs that works particularly well. Thanks, kyle
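The game plan above could be sketched roughly as follows in Python. This is only an illustration, not a tested migration script: it assumes the JPEGs for one book already sit in a single directory, that ImageMagick's `convert` command is on the PATH (newer installs may call it `magick`), and that page order can be recovered from the numbers in the file names. The function names and paths are hypothetical.

```python
import re
import subprocess
from pathlib import Path


def natural_key(path):
    """Sort key that orders page2.jpg before page10.jpg."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", path.name)]


def build_convert_command(page_dir, output_pdf, executable="convert"):
    """Build the ImageMagick command line: convert p1.jpg p2.jpg ... out.pdf"""
    pages = sorted(Path(page_dir).glob("*.jpg"), key=natural_key)
    if not pages:
        raise ValueError(f"no .jpg files found in {page_dir}")
    return [executable, *(str(p) for p in pages), str(output_pdf)]


def jpegs_to_pdf(page_dir, output_pdf):
    """Run ImageMagick to stitch one book's pages into a single PDF."""
    subprocess.run(build_convert_command(page_dir, output_pdf), check=True)


# Example call with hypothetical paths:
# jpegs_to_pdf("books/book_0001", "pdfs/book_0001.pdf")
```

Building the command separately from running it makes it easy to print and inspect before processing a whole collection. For books with hundreds of large pages it is probably worth trying one first and watching memory use and the resulting file size.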
Re: [CODE4LIB] mass convert jpeg to pdf
I've done something like this in ImageMagick, and it worked quite well, so I can vouch for this workflow. But just to clarify, I presume you will be creating static PDF files to place in the filesystem--not generating a PDF dynamically through Omeka when a user clicks to download a PDF (as in, Omeka fires off an ImageMagick process). Ethan On Nov 8, 2013 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual PDFs that can be viewed more conveniently. My game plan is to simply have a script pull the files down as jpegs which can be fed to ImageMagick, which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into PDFs that works particularly well. Thanks, kyle
Re: [CODE4LIB] mass convert jpeg to pdf
It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? For an example of using Leaflet (usually used for working with maps) in this way, check out NYTimes Machine Beta: http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html //Ed On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual PDFs that can be viewed more conveniently. My game plan is to simply have a script pull the files down as jpegs which can be fed to ImageMagick, which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into PDFs that works particularly well. Thanks, kyle
Re: [CODE4LIB] mass convert jpeg to pdf
On the same note, I've had good experiences with using adore djatoka to render jpeg2000 files. Maybe something better has since come along. I'm out of touch with this type of technology. On Nov 8, 2013 2:10 PM, Edward Summers e...@pobox.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? For an example of using Leaflet (usually used for working with maps) in this way checkout NYTimes Machine Beta: http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html //Ed On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual pdf's that can be viewed more conveniently My game plan is to simply have a script pull the files down as jpegs which can be fed to imagemagick which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into pdfs that works particularly well. Thanks, kyle
Re: [CODE4LIB] mass convert jpeg to pdf
+1 for the viewer concept, and I'll add that viewing and downloading meet different needs and should both be offered if possible. (Said because I recently had to download huge PDFs just to glance at a few pages.) kc On 11/8/13 11:10 AM, Edward Summers wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? For an example of using Leaflet (usually used for working with maps) in this way, check out NYTimes Machine Beta: http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html //Ed On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual PDFs that can be viewed more conveniently. My game plan is to simply have a script pull the files down as jpegs which can be fed to ImageMagick, which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into PDFs that works particularly well. Thanks, kyle -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] mass convert jpeg to pdf
On Nov 8, 2013, at 11:14 AM, Ethan Gruber wrote: On the same note, I've had good experiences with using adore djatoka to render jpeg2000 files. Maybe something better has since come along. I'm out of touch with this type of technology. For zoomable image rendering (from JPEG2000 or TIFF), you may also want to look at IIP Image Server: http://iipimage.sourceforge.net/ Loris: https://github.com/pulibrary/loris (from Jon Stroop @ Princeton) Djatoka is still widely used, but does not enjoy a robust or active development / support community. This web page may have some useful links for the curious: http://iiif.io/apps-demos.html - Tom PS. At DLF this week, there was also a presentation on Mirador, a multi-up windowing environment for viewing and comparing images from different repositories. It might be a nice complement to an exhibits environment.
Re: [CODE4LIB] mass convert jpeg to pdf
Do you need OCR? This script ( http://bookscanner.pbworks.com/w/page/45609343/Homer%20bash%20script ) will OCR a directory of TIFFs (using Tesseract) and build a PDF using PDFBeads. It's a little old, but I still use it pretty much every day. I think you'll need to have Ruby 1.9 installed, since the PDFBeads library uses Hpricot. There are lots of document viewers/book widgets/page turners; the Internet Archive one is good. I also really like the NYTimes Document Viewer ( https://github.com/documentcloud/document-viewer ). The DocumentCloud people also have something to rip your PDFs apart and put them into the viewer ( https://github.com/documentcloud/docsplit ). On Fri, Nov 8, 2013 at 8:23 PM, Karen Coyle li...@kcoyle.net wrote: +1 for the viewer concept, and I'll add that viewing and downloading meet different needs and should both be offered if possible. (Said because I recently had to download huge PDFs just to glance at a few pages.) kc On 11/8/13 11:10 AM, Edward Summers wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? For an example of using Leaflet (usually used for working with maps) in this way, check out NYTimes Machine Beta: http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html //Ed On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. 
The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual pdf's that can be viewed more conveniently My game plan is to simply have a script pull the files down as jpegs which can be fed to imagemagick which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into pdfs that works particularly well. Thanks, kyle -- Karen Coyle kco...@kcoyle.net http://kcoyle.net m: 1-510-435-8234 skype: kcoylenet
Re: [CODE4LIB] mass convert jpeg to pdf
Throwing in my two cents on the IIP Image Server. I've been using it on my photos web site[0] for a while now and it works great. I was also happy to see that there is a version that supports the International Image Interoperability Framework (IIIF) API [1], which I was introduced to at DLF by Tom Cramer and company. That would make you compliant with the Mirador multi-windowing tool that he mentioned. Sounds like a win-win to me. Roy [0] http://FreeLargePhotos.com/ - sample: http://freelargephotos.com/photos/01/full.jp2/Roy+Tennant [1] http://www-sul.stanford.edu/iiif/image-api/1.1/ On Fri, Nov 8, 2013 at 11:27 AM, Tom Cramer tcra...@stanford.edu wrote: On Nov 8, 2013, at 11:14 AM, Ethan Gruber wrote: On the same note, I've had good experiences with using adore djatoka to render jpeg2000 files. Maybe something better has since come along. I'm out of touch with this type of technology. For zoomable image rendering (from JPEG2000 or TIFF), you may also want to look at IIP Image Server: http://iipimage.sourceforge.net/ Loris: https://github.com/pulibrary/loris (from Jon Stroop @ Princeton) Djatoka is still widely used, but does not enjoy a robust or active development / support community. This web page may have some useful links for the curious: http://iiif.io/apps-demos.html - Tom PS. At DLF this week, there was also a presentation on Mirador, a multi-up windowing environment for viewing and comparing images from different repositories. It might be a nice complement to an exhibits environment.
Re: [CODE4LIB] mass convert jpeg to pdf
Echo the above sentiments, and would also mention the Open Library/Internet Archive book reader[1]. We use it in Islandora[2] with Djatoka. -nruest [1] https://github.com/openlibrary/bookreader [2] http://sandbox.islandora.ca/islandora/object/islandora%3A40#page/1/mode/2up On 13-11-08 02:38 PM, Simeon Warner wrote: I agree with Ed that going to PDF seems unfortunate. Check out Jon Stroop's Loris [1] for a lightweight implementation of tiling using IIIF [2,3] that the Open Seadragon zoom-pan viewer works over. Cool demo at: http://libimages.princeton.edu/osd-demo/ Cheers, Simeon [1] https://github.com/pulibrary/loris [2] http://iiif.io/ [3] http://www-sul.stanford.edu/iiif/image-api/1.1/ On 11/8/13 2:14 PM, Ethan Gruber wrote: On the same note, I've had good experiences with using adore djatoka to render jpeg2000 files. Maybe something better has since come along. I'm out of touch with this type of technology. On Nov 8, 2013 2:10 PM, Edward Summers e...@pobox.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? For an example of using Leaflet (usually used for working with maps) in this way checkout NYTimes Machine Beta: http://apps.beta620.nytimes.com/timesmachine/1969/07/20/issue.html //Ed On Nov 8, 2013, at 2:00 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: We are in the process of migrating our digital collections from CONTENTdm to Omeka and are trying to figure out what to do about the compound objects -- the vast majority of which are digitized books. 
The source files are actually hi res tiffs but since ginormous objects broken into hundreds of pieces (each of which can be well over 100MB in size) aren't exactly friendly to use, we'd like to stitch them into individual pdf's that can be viewed more conveniently My game plan is to simply have a script pull the files down as jpegs which can be fed to imagemagick which can theoretically do everything I need. However, I've never actually done anything like this before, so I wanted to see if there's a method that people have used for combining lots of images into pdfs that works particularly well. Thanks, kyle
Re: [CODE4LIB] mass convert jpeg to pdf
It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download are looking for jpegs rather than source tiffs, and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case. As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs, which will be made available. kyle
Re: [CODE4LIB] mass convert jpeg to pdf
I’m having trouble understanding who the user of this content you are putting into Omeka is, and what you are expecting them to do with it. But, ok … //Ed On Nov 8, 2013, at 4:22 PM, Kyle Banerjee kyle.baner...@gmail.com wrote: It is sad to me that converting to PDF for viewing off the Web seems like the answer. Isn’t there a tiling viewer (like Leaflet) that could be used to render jpeg derivatives of the original tif files in Omeka? This should be pretty easy. But the issue with tiling is that the nav process is miserable for all but the shortest books. Most of the people who want to download want are looking for jpegs rather than source tiffs and one pdf instead of a bunch of tiffs (which is good since each one is typically over 100MB). Of course there are people who want the real deal, but that's actually a much less common use case. As Karen observes, downloading and viewing serve different use cases so of course we will provide both. IIP Image Server looks intriguing. But most of our users who want the full res stuff really just want to download the source tiffs which will be made available. kyle