Attached is an sdaps_recognize.py that I believe works correctly but still uses the add_image from sdaps.add. A diff is attached also.
Terry

Terrence Kovacs
Research Systems Engineer
Physics and Astronomy Department
Dartmouth College
Wilder 341, 603-646-9303

________________________________________
From: Terrence Kovacs <[email protected]>
Sent: Friday, October 2, 2015 7:56 AM
To: Benjamin Berg; [email protected]
Subject: Re: custom script sdap-recognize.py

Yes - I realised this too after I sent that last note. Thanks very much for the follow up.

Terry

Terrence Kovacs
Research Systems Engineer
Physics and Astronomy Department
Dartmouth College
Wilder 341, 603-646-9303

________________________________________
From: Benjamin Berg <[email protected]>
Sent: Thursday, October 1, 2015 4:55 PM
To: Terrence Kovacs; [email protected]
Subject: Re: custom script sdap-recognize.py

Ah, now I remember why the script does not use the routines from the
"add" package. Right now the add package unconditionally adds all pages
of a tiff, while the script handles the input image on a page by page
basis.

This will not make a difference for you if you are simply using single
page tiffs, but does mean that your code is not quite correct either
unfortunately :)

I'll need to properly fix this somehow so that the script works with
simplex projects with both simplex and duplex scans.

/me opens a ticket to not forget. Issue #83.

Benjamin

On Do, 2015-10-01 at 18:03 +0000, Terrence Kovacs wrote:
> Thanks for the quick response.
> Yes it is a simplex document.
> Unfortunately the patch you sent threw this:
> [tkovacs@sunspot new]$ ./sdaps-recognize.py tenq tenq_scan.tif
> TIFFOpen: tenq/../DUMMY: No such file or directory.
> Traceback (most recent call last):
>   File "./sdaps-recognize.py", line 74, in <module>
>     survey.questionnaire.recognize.recognize()
>   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 479, in recognize
>     res = self.identify(clean=False)
>   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 465, in identify
>     self.obj.sheet.recognize.recognize()
>   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 55, in recognize
>     image.surface.load()
>   File "/usr/lib64/python2.7/site-packages/sdaps/surface.py", line 51, in load
>     True if self.obj.rotated else False
> AssertionError: The image surface could not be created! Broken or non 1bit tiff file?
>
> after some tinkering I resorted to importing add_image from sdaps.add
> which does the correct magic.
>
> $ diff ../sdaps-1.1.10/custom-scripts/sdaps-recognize.py sdaps-recognize.py
> 31a32
> > from sdaps.add import add_image
> 54,62d54
> <
> < def add_image(survey, tiff, page):
> <     img = model.sheet.Image()
> <     survey.sheet.add_image(img)
> <     # SDAPS assumes a relative path from the survey directory
> <     img.filename = os.path.relpath(os.path.abspath(tiff), survey.survey_dir)
> <     img.orig_name = tiff
> <     img.tiff_page = page
> <
> 67,70c59
> <     add_image(survey, *images.pop(0))
> <
> <     if survey.defs.duplex:
> <         add_image(survey, *images.pop(0))
> ---
> >     add_image(survey, images.pop(0)[0], survey.defs.duplex)
>
>
> Terrence Kovacs
> Research Systems Engineer
> Physics and Astronomy Department
> Dartmouth College
> Wilder 341, 603-646-9303
>
> ________________________________________
> From: Benjamin Berg <[email protected]>
> Sent: Wednesday, September 30, 2015 5:04 PM
> To: Terrence Kovacs; [email protected]
> Subject: Re: custom script sdap-recognize.py
>
> Hi,
>
> hm, interesting. My guess would be that you have a simplex document?
> Probably never tested the script in that case.
>
> Short explanation. SDAPS basically assumes internally that all surveys
> are printed and scanned in duplex mode. To handle simplex mode the
> "sdaps add" command inserts a special dummy page so that exactly this
> failure does not happen.
>
> If you scanned in simplex mode then apply the attached patch. If you
> scanned in duplex mode, then just run the second add_image there
> unconditionally instead.
>
> Benjamin
>
>
> On Mi, 2015-09-30 at 20:47 +0000, Terrence Kovacs wrote:
> > I have a sdaps tenq survey which I can run:
> > [tkovacs@sunspot new]$ sdaps tenq recognize
> > ------------------------------------------------------------------------------
> > - SDAPS -- recognize
> > ------------------------------------------------------------------------------
> > 6 sheets
> > ################################################################| 100% 00:00:01
> > 0.219686 seconds per sheet
> >
> > without issue but the custom script throws an error. Am I doing
> > something wrong?
> >
> > [tkovacs@sunspot new]$ ./sdaps-recognize.py tenq tenq_scan.tif
> > Traceback (most recent call last):
> >   File "./sdaps-recognize.py", line 73, in <module>
> >     survey.questionnaire.recognize.recognize()
> >   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 479, in recognize
> >     res = self.identify(clean=False)
> >   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 465, in identify
> >     self.obj.sheet.recognize.recognize()
> >   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 85, in recognize
> >     self.duplex_copy_image_attr(failed_pages, 'rotated', _("Neither %s, %i or %s, %i has a known rotation!"))
> >   File "/usr/lib64/python2.7/site-packages/sdaps/recognize/buddies.py", line 292, in duplex_copy_image_attr
> >     second = self.obj.images[i + 1]
> > IndexError: list index out of range
> >
> > Terrence Kovacs
> > Research Systems Engineer
> > Physics and Astronomy Department
> > Dartmouth College
> > Wilder 341, 603-646-9303
> >
--
To unsubscribe, send mail to [email protected].
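
For anyone reading the archive later, here is a minimal sketch of the failure mode Benjamin describes in the quoted thread. It is not SDAPS code: plain strings stand in for image objects, and `pair_pages`, `pad_simplex` and the `DUMMY` value are hypothetical names chosen for illustration. It only shows why code that always looks for a second page hits an IndexError on a lone simplex scan, and why inserting a dummy back side avoids it.

```python
# Illustrative only: plain strings stand in for SDAPS image objects.
DUMMY = "DUMMY"  # hypothetical placeholder for the inserted dummy page


def pair_pages(images):
    """Pair each front page with the page that follows it (duplex assumption)."""
    pairs = []
    for i in range(0, len(images), 2):
        first = images[i]
        second = images[i + 1]  # IndexError when a back side is missing
        pairs.append((first, second))
    return pairs


def pad_simplex(images):
    """Insert a dummy back side after every scanned page."""
    padded = []
    for img in images:
        padded.append(img)
        padded.append(DUMMY)
    return padded


scans = ["page1"]  # one simplex scan

try:
    pair_pages(scans)
except IndexError as exc:
    print("without dummy page: IndexError: %s" % exc)

print(pair_pages(pad_simplex(scans)))
```

With the padding in place every front page has a partner, so the pairing loop never runs past the end of the list.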
31a32
> from sdaps.add import add_image
47,61d47
<     for page in range(num_pages):
<         images.append((file, page))
<
< if len(images) == 0:
<     # No images, simply exit again.
<     sys.exit(1)
<
<
< def add_image(survey, tiff, page):
<     img = model.sheet.Image()
<     survey.sheet.add_image(img)
<     # SDAPS assumes a relative path from the survey directory
<     img.filename = os.path.relpath(os.path.abspath(tiff), survey.survey_dir)
<     img.orig_name = tiff
<     img.tiff_page = page
63c49,52
< while images:
---
>     if num_pages == 0:
>         # No images, simply exit again.
>         sys.exit(1)
>
65a55,60
>     # add all the page images in the file
>     add_image(survey, file, survey.defs.duplex, copy=False)
>
>     # Run the recognition algorithm over each page image
>     for page in range(num_pages):
>         survey.index = page + 1
67,88c62,77
<     add_image(survey, *images.pop(0))
<
<     if survey.defs.duplex:
<         add_image(survey, *images.pop(0))
<
<     # Run the recognition algorithm over the given images
<     survey.questionnaire.recognize.recognize()
<
<     for qobject in survey.questionnaire.qobjects:
<         if isinstance(qobject, model.questionnaire.Question):
<             # Only print data if an image for the page has been loaded
<             if survey.sheet.get_page_image(qobject.page_number) is None:
<                 continue
<             for box in qobject.boxes:
<                 print "%s,%s,%s,%s,%i,%f" % (
<                     survey.sheet.global_id,
<                     survey.sheet.survey_id,
<                     survey.sheet.questionnaire_id,
<                     '_'.join([str(num) for num in box.id]),
<                     int(box.data.state),
<                     float(box.data.quality))
<     print
---
>         survey.questionnaire.recognize.recognize()
>
>         for qobject in survey.questionnaire.qobjects:
>             if isinstance(qobject, model.questionnaire.Question):
>                 # Only print data if an image for the page has been loaded
>                 if survey.sheet.get_page_image(qobject.page_number) is None:
>                     continue
>                 for box in qobject.boxes:
>                     print "%s,%s,%s,%s,%i,%f" % (
>                         survey.sheet.global_id,
>                         survey.sheet.survey_id,
>                         survey.sheet.questionnaire_id,
>                         '_'.join([str(num) for num in box.id]),
>                         int(box.data.state),
>                         float(box.data.quality))
>         print
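
The lines this diff removes consumed the scan one sheet at a time: one (file, page) entry per pass, or two when the survey is duplex. As a rough sketch of that grouping only, the snippet below rebuilds it with plain tuples. `sheets_from_pages` is a hypothetical helper, not part of SDAPS, and it assumes a duplex scan carries an even number of pages.

```python
def sheets_from_pages(pages, duplex):
    """Group (file, page) entries into sheets: two per sheet for duplex, else one."""
    per_sheet = 2 if duplex else 1
    return [pages[i:i + per_sheet] for i in range(0, len(pages), per_sheet)]


# Hypothetical four-page scan, listed the way the old loop built its queue.
pages = [("tenq_scan.tif", n) for n in range(4)]

print(sheets_from_pages(pages, duplex=False))  # four sheets, one page each
print(sheets_from_pages(pages, duplex=True))   # two sheets, two pages each
```

Adding every page of a file in a single call loses this per-sheet boundary, which seems to be the point Benjamin raises about multi-page tiffs.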
#!/usr/bin/env python
# -*- coding: utf8 -*-
# SDAPS - Scripts for data acquisition with paper based surveys
# Copyright (C) 2008, Christoph Simon <[email protected]>
# Copyright (C) 2012, Benjamin Berg <[email protected]>
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

import sys
import os

# Use the following and local_run=True below to run without installing SDAPS
#sys.path.append(os.path.join(os.path.dirname(sys.argv[0]), '..'))

import sdaps
#sdaps.init(local_run=True)
sdaps.init()

from sdaps import model
from sdaps import image
from sdaps.add import add_image

# Assume the first argument is a survey
survey = model.survey.Survey.load(sys.argv[1])

# We need the recognize buddies, as they are able to identify the data
from sdaps.recognize import buddies

# A sheet object to attach the images to
sheet = model.sheet.Sheet()
survey.add_sheet(sheet)

images = []

for file in sys.argv[2:]:
    num_pages = image.get_tiff_page_count(file)

    if num_pages == 0:
        # No images, simply exit again.
        sys.exit(1)

    # Simply drop the list of images again.
    sheet.images = []

    # add all the page images in the file
    add_image(survey, file, survey.defs.duplex, copy=False)

    # Run the recognition algorithm over each page image
    for page in range(num_pages):
        survey.index = page + 1

        survey.questionnaire.recognize.recognize()

        for qobject in survey.questionnaire.qobjects:
            if isinstance(qobject, model.questionnaire.Question):
                # Only print data if an image for the page has been loaded
                if survey.sheet.get_page_image(qobject.page_number) is None:
                    continue
                for box in qobject.boxes:
                    print "%s,%s,%s,%s,%i,%f" % (
                        survey.sheet.global_id,
                        survey.sheet.survey_id,
                        survey.sheet.questionnaire_id,
                        '_'.join([str(num) for num in box.id]),
                        int(box.data.state),
                        float(box.data.quality))
        print

# And, we simply quit, ie. we don't save the survey
