My apologies for the length, I wanted to make sure to provide all the necessary information.
========= Background My first Pylons project is the server portion of an automatic data collection system in a manufacturing environment. There are a number of vendor provided Windows boxes that run some custom built manufacturing systems on which I've deployed a Windows service. This service watches some number of files that the vendor dumps logging/ status information to and then posts a file in the windows configuration format to http://servername:port/department_name/tool_name/controller/data_version/year/month/day/hour/minute/second'. The server side at this point is made up of what roughly amounts to an abstract parent class that handles talking to the database and one (though more coming) child class that parses whatever the vendor code outputs and then calls the insertion method in the parent. Everything was working fantastically for the first few weeks when we had only one client but since I've deployed the code to the rest of the tools on the floor about one submission 250 or so appears in the database as if it's coming from the wrong tool. ========== The Problem: One of my users submitted a bug report that stated that a job that had run on one tool was being reported as having run on a different tool. When I checked the database, I found that there were 4 different tools associated with a job that had only run on 1. I then expanded my search and found that just about half the jobs in the database were had records from extraneous tools. For example, job_id=24 was run exclusively on the tool specified by data_source_id=3 but a total of 6 records out of 680 records appear to be coming from data_source_id=6. In one representative case, two requests from different machines, S14 (data_source_id=3) and PB24 (data_source_id=6) come in at about the same time (see logs below). The request from S14 hits an authkit error (see logs below, the IP of S14 is 10.11.4.228). One record (75198) from that batch is inserted with the correct source (S14), the remaining 3 (75176, 75172, 75174) are inserted as though they came from PB24. Below is the relevant data from the database, sorted by data_source_id to show the problem: id | value | time | measurement_type_id | data_source_id | job_id 75170 | 2008-08-14 18:08:14 | 4 | 3 | 24 75180 | 2008-08-14 18:08:16 | 1 | 3 | 24 75178 | 2008-08-14 18:08:16 | 2 | 3 | 24 75179 | 2008-08-14 18:08:16 | 3 | 3 | 24 75177 | 2008-08-14 18:08:16 | 4 | 3 | 24 75198 | 2008-08-14 18:10:14 | 4 | 3 | 24 75208 | 2008-08-14 18:10:16 | 1 | 3 | 24 75206 | 2008-08-14 18:10:16 | 2 | 3 | 24 75207 | 2008-08-14 18:10:16 | 3 | 3 | 24 75205 | 2008-08-14 18:10:16 | 4 | 3 | 24 75164 | 2008-08-14 18:08:12 | 1 | 6 | 22 75162 | 2008-08-14 18:08:12 | 2 | 6 | 22 75163 | 2008-08-14 18:08:12 | 3 | 6 | 22 75161 | 2008-08-14 18:08:12 | 4 | 6 | 22 75176 | 2008-08-14 18:09:12 | 1 | 6 | 24 75175 | 2008-08-14 18:09:12 | 1 | 6 | 22 75171 | 2008-08-14 18:09:12 | 2 | 6 | 22 75172 | 2008-08-14 18:09:12 | 2 | 6 | 24 75173 | 2008-08-14 18:09:12 | 3 | 6 | 22 75174 | 2008-08-14 18:09:12 | 3 | 6 | 24 75169 | 2008-08-14 18:09:12 | 4 | 6 | 22 75192 | 2008-08-14 18:10:12 | 1 | 6 | 22 75190 | 2008-08-14 18:10:12 | 2 | 6 | 22 75191 | 2008-08-14 18:10:12 | 3 | 6 | 22 75189 | 2008-08-14 18:10:12 | 4 | 6 | 22 75201 | 2008-08-14 18:11:12 | 1 | 6 | 24 75204 | 2008-08-14 18:11:12 | 1 | 6 | 22 75199 | 2008-08-14 18:11:12 | 2 | 6 | 24 75202 | 2008-08-14 18:11:12 | 2 | 6 | 22 75203 | 2008-08-14 18:11:12 | 3 | 6 | 22 75200 | 2008-08-14 18:11:12 | 3 | 6 | 24 75197 | 2008-08-14 18:11:12 | 4 | 6 | 22 My first thought was that s = select([model.data_source.c.id],and_(model.equipment.c.name==tool_name, model.equipment.c.id==model.data_source.c.equipment_id, model.data_source_type.c.name==data_source_type_name, model.data_source.c.type_id==model.data_source_type.c.id)) result = s.execute().fetchone() ds_id = result[0] (see full code below) was returning a different result each time due to the lack of a sort, but the table is only the following 7 rows: id | type_id | equipment_id | collector_id ----+---------+--------------+-------------- 1 | 1 | 1 | 1 2 | 1 | 2 | 2 3 | 1 | 3 | 3 4 | 1 | 4 | 4 5 | 1 | 5 | 5 6 | 1 | 6 | 6 7 | 1 | 7 | 7 (7 rows) Neither the equipment nor the collector tables have any obvious problems like two records with the same name, etc. I expected that either the 3 records that end up assigned to the wrong tool (which I'm assuming are processed after the exception is raised) would get tossed (unhandled exception) or inserted as S14 (exception is handled somewhere in the framework) but this isn't the case. In fact, I'm not even sure why the exception is generated. Obviously there is a bug somewhere, but I have no idea where. Any thoughts? ===================== The relevant code snippets: <...> class DatasourceParams(object): dataFile = None timestamp = None tool = None rawDataPath = None def __init__(self,request): self.dataFile = request.params['data_file'] self.timestamp = datetime.datetime(int(request.urlvars['year']), int(request.urlvars['month']), int(request.urlvars['day']), int(request.urlvars['hour']), int(request.urlvars['minute']), int(request.urlvars['second'])) self.data_source_type = request.urlvars['controller'] self.version = request.urlvars['version'] self.tool = request.urlvars['toolname'] self._storeRaw() def _storeRaw(self): if not os.path.exists(config['rawstore']) : os.mkdir(config['rawstore']) elif not os.path.isdir(config['rawstore']): raise Exception("Raw Storage Directory specifies a file not a directory") self.rawDataPath = config['rawstore']+self.tool +'-'+self.timestamp.strftime('%Y%m%d%H%M%S') fp = file(self.rawDataPath,'w') fp.write(self.dataFile) fp.flush() fp.close() <...> class DatasourceController(BaseController): ds_params = None __supported_versions = [] def create(self): log.debug(request) self.ds_params = h.DatasourceParams(request) if self.ds_params.version in self._supported_versions: self._parse() return "Records Inserted" else : return "Unsupported Version" def insertDatum(self, measurementTypeLabel, tool_name, data_source_type_name, timeStamp, value, job_name=None): result = model.measurement_type.select(model.measurement_type.c.label==measurementTypeLabel).execute() row = result.fetchone() mt_id = row['id'] s = select([model.data_source.c.id],and_(model.equipment.c.name==tool_name, model.equipment.c.id==model.data_source.c.equipment_id, model.data_source_type.c.name==data_source_type_name, model.data_source.c.type_id==model.data_source_type.c.id)) result = s.execute().fetchone() ds_id = result[0] if job_name != None: s = select([model.job.c.id],and_(model.job.c.name==job_name)) result = s.execute().fetchone() if result == None: result = model.job.insert().execute(name=job_name).fetchone() j_id = result[0] else : j_id = None model.measurement.insert().execute(measurement_type_id=mt_id, data_source_id = ds_id, time=timeStamp, value=value, job_id=j_id) <...> class FaststatController(DatasourceController): _supported_versions = ['1'] _datafile_section_name = "Job" _sourceToDBMapping = {'Eye 5 Good Books':"Good Books", 'Eye 5 IJ Books':"IJ Books", 'Eye 5 QC Books':"QC Books", 'Eye 5 Rejects':"Rejects", 'Print Order' : 'Job Print Order', } def _parse(self): try: # A hack to avoid having to write my own config parser massagedDataFile = "\r \n".join(self.ds_params.dataFile.split('\n\n')) fp = h.StringWithReadLine(massagedDataFile) inputFile = ConfigParser.ConfigParser() inputFile.readfp(fp) for measurementType in self._sourceToDBMapping.keys(): self.insertDatum(measurementType, self.ds_params.tool, self.ds_params.data_source_type, self.ds_params.timestamp, str(inputFile.get(self._datafile_section_name, self._sourceToDBMapping[measurementType])).replace(',',''), job_name = inputFile.get(self._datafile_section_name, 'Job Name')) except Exception, inst: log.error(inst) FaststatController = middleware(FaststatController(),HasAuthKitRole(['collector'])) paster.log ======= The server is logging at DEBUG level, so I checked the log and found this (please note, we suffer from some pretty bad clock skew problems between different servers, I did some verification and these log messages do correspond to those in the database section above): <...> 18:09:56,068 DEBUG [warehouse.controllers.dataSource] <paste.wsgiwrappers.WSGIRequest object at 0x9ad53ac method='POST', scheme='http', host='mds.lanepress.com:5000', script_name='', path_info='/bindery/S14/FASTstat/1/2008/08/14/18/08/14/', languges=['en-us'], charset='utf-8', errors='replace', GET=UnicodeMultiDict([]), POST=UnicodeMultiDict([('data_file', u'[Misc]\nLast Updated Date/Time = 08/14/08 18:08:12\nCurrent Run Speed (BPM) = 0\nMachine Name = 16 Pocket\nCurrent Operator = NONE\nLogin Level (Mode) = QC\n\n[Current Volume Status]\nName = cat34606_08_12dom\nFile Name = V1.*\nFirst Record = 1\nLast Record = 6,021\nTotal Record s = 6,021\nStop Record = 6,021\nLast Record to PLC = 200\nRemaining Records = 5,821\n\n[Job]\nJob Name = CATEGORY SOLUTIONS\nJob Print Order = 7,0 00\nSystem Cycles = 49\nGood Books = 3\nIJ Books = 0\nNon-IJ Books = 3\nQC Books = 0\nBulk Books = 0\nFixed Msg Books = 0\nSystem Stops = 0\nSlow Downs = 0\nInhibit Tests = 0\nRejects = 49\nDiverts = 0\nStacks = 0\n\n[Shift]\nSystem Cycles = 48,839\nGood Books = 47,475\nIJ Books = 42,557\nNon-IJ Books = 4,908\nQC Books = 10\nBulk Books = 0\nFixed Msg Books = 0\nSystem Stops = 64\nSlow Downs = 0\nInhibit Tests = 0\nRejects = 48,839\nDivert s = 82\n\n[ECR]\nNon-Reorder = 3\nNon- ECR Diverted = 0\nECR Diverted = 0\nNon-ECR Stacked = 0\nECR Stacked = 0\nNo n-ECR Manual = 0\nECR Manual = 0\n \n[Volume Queue]\nCurrent Volume (1) = cat34606_08_12dom\nNext Volume (2) = \nNext Volume (3) = \nNext Volume (4) = \nNex t Volume (5) = \n\n[24 Hour Production Status]\nHour 1900 = 0\nHour 2000 = 181\nHour 2100 = 2,441\nHour 2200 = 2,103\nHour 2300 = 2,138\nHour 0000 = 748\nHour 0100 = 3,157\nHou r 0200 = 1,450\nHour 0300 = 1,751\nHour 0400 = 691\nHour 0500 = 854\nHour 0600 = 0\nHour 0700 = 2,665\nHour 0800 = 5,956\nHour 0900 = 6,860\nHour 1000 = 5,478\nHour 1100 = 6,379 \nHour 1200 = 6,520\nHour 1300 = 5,844\nHour 1400 = 6,437\nHour 1500 = 1,333\nHour 1600 = 0\nHour 1700 = 0\nCurrent = 3\n')]), cookies={}> 18:09:56,076 DEBUG [paste.httpserver.ThreadPool] Added task (0 tasks queued) 18:09:56,081 DEBUG [routes.middleware] Matched POST /bindery/PB24/ FASTstat/1/2008/08/14/18/09/12/ 18:09:56,081 DEBUG [routes.middleware] Route path: ':department/:toolname/:controller/:version/:year/:month/:day/:hour/:minute/:second', defaults: {'action': u'create'} 18:09:56,082 DEBUG [routes.middleware] Match dict: {'hour': u'18', 'second': u'12', 'month': u'08', 'action': u'create', 'controller': u'FASTstat', 'version': u'1', 'year': u'2008', 'department': u'bindery', 'toolname': u'PB24', 'day': u'14', 'minute': u'09'} 18:09:56,082 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object globals 18:09:56,083 DEBUG [pylons.wsgiapp] Resolved URL to controller: u'FASTstat' 18:09:56,083 DEBUG [pylons.wsgiapp] Calling controller class with WSGI interface 18:09:56,084 DEBUG [authkit.authenticate.multi] Status: '401 Not Authenticated', Headers: [('content-type', 'text/plain')] 18:09:56,084 DEBUG [authkit.authenticate.multi] Status checker recieved status '401 Not Authenticated', headers [('content-type', 'text/plain')], intecept ['401'] 18:09:56,085 DEBUG [authkit.authenticate.multi] Status checker returns True 18:09:56,085 DEBUG [authkit.authenticate.multi] MultiMiddleware self.checker check() returning <authkit.authenticate.basic.BasicAuthHandler object at 0x96ba30c> 18:09:56,086 DEBUG [authkit.authenticate.multi] Matched binding returns status: '401 Unauthorized', headers: [('WWW-Authenticate', 'Basic realm="LanePress"'), ('content-type', 'text/plain')], exc _info: None 18:09:56,087 DEBUG [routes.middleware] Matched POST /error/document/ 18:09:56,087 DEBUG [routes.middleware] Route path: 'error/:action/:id', defaults: {'action': 'index', 'controller': u'error', 'id': None} 18:09:56,087 DEBUG [routes.middleware] Match dict: {'action': u'document', 'controller': u'error', 'id': None} 18:09:56,088 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object globals 18:09:56,088 DEBUG [pylons.wsgiapp] Resolved URL to controller: u'error' 18:09:56,089 DEBUG [pylons.wsgiapp] Controller appears to be a class, instantiating 18:09:56,089 DEBUG [pylons.wsgiapp] Calling controller class with WSGI interface 18:09:56,090 DEBUG [pylons.controllers.core] Looking for u'document' method to handle the request 18:09:56,090 DEBUG [pylons.controllers.core] Calling 'document' method with keyword args: **{} 18:09:56,091 DEBUG [pylons.controllers.core] Assuming controller returned a basestring or buffer, writing it to pylons.response 18:09:56,091 DEBUG [pylons.controllers.core] Calling Response object to return WSGI data 18:09:56,092 DEBUG [pylons.controllers.core] Merging pylons.response headers into start_response call, status: 200 OK 18:09:56,092 DEBUG [authkit.authenticate.multi] Status: '200 OK', Headers: [('content-type', 'text/html; charset=utf-8'), ('pragma', 'no- cache'), ('cache-control', 'no-cache')] 18:09:56,093 DEBUG [authkit.authenticate.multi] Status checker recieved status '200 OK', headers [('content-type', 'text/html; charset=utf-8'), ('pragma', 'no-cache'), ('cache-control', 'no-cache ')], intecept ['401'] 18:09:56,093 DEBUG [authkit.authenticate.multi] Status checker returns False 18:09:56,094 DEBUG [authkit.authenticate.multi] Multi: No binding was found for the check 18:09:56,099 DEBUG [paste.httpserver.ThreadPool] Added task (0 tasks queued) 18:09:56,101 DEBUG [authkit.authenticate] valid_password called. username: pb24 18:09:56,103 DEBUG [routes.middleware] Matched POST /bindery/PB24/ FASTstat/1/2008/08/14/18/09/12/ 18:09:56,103 DEBUG [routes.middleware] Route path: ':department/:toolname/:controller/:version/:year/:month/:day/:hour/:minute/:second', defaults: {'action': u'create'} 18:09:56,103 DEBUG [routes.middleware] Match dict: {'hour': u'18', 'second': u'12', 'month': u'08', 'action': u'create', 'controller': u'FASTstat', 'version': u'1', 'year': u'2008', 'department': u'bindery', 'toolname': u'PB24', 'day': u'14', 'minute': u'09'} 18:09:56,104 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object globals 18:09:56,105 DEBUG [pylons.wsgiapp] Resolved URL to controller: u'FASTstat' 18:09:56,105 DEBUG [pylons.wsgiapp] Calling controller class with WSGI interface 18:09:56,106 DEBUG [warehouse.controllers.dataSource] <paste.wsgiwrappers.WSGIRequest object at 0x9ad32ec method='POST', scheme='http', host='mds.lanepress.com:5000', script_name='', path_info='/bindery/PB24/FASTstat/1/2008/08/14/18/09/12/', languges=['en-us'], charset='utf-8', errors='replace', GET=UnicodeMultiDict([]), POST=UnicodeMultiDict([('data_file', u'[Misc]\nLast Updated Date/Time = 08/14/2008 18:09:08\nCurrent Run Speed (BPM) = 0\nMachine Name = RGUB\nCurrent Operator = Supervisor\nLo gin Level (Mode) = SETUP\n\n[Current Volume Status] \nName = \nFile Name = .*\nFirst Record = 0\nLast Record = 0\nTotal Records = 0\nStop Record = 0\nLast Record to PLC = 0\nRemaining Records = 0\n\n[Job]\nJob Name = ARTnews\nJob Print Order = 100,250\nSystem Cycles = 46,873\nGood Books = 20,598\nIJ Books = 0\nNon-IJ Books = 20,598\nQC Books = 0\nBulk Books = 0\nFixed Msg Books = 0\nSystem Stops = 91\nSlow Downs = 0\nInhibit Tests = 0\nRejects = 1,239\nDiverts = 13\nStacks = 0\n\n[Shift]\nSystem Cycles = 44,944\nGood Books = 20,569\nIJ Books = 0\nNon-IJ Books = 20,569\nQC Books = 0\nBulk Books = 0\nFixed Msg Books = 0\nSystem Stops = 74\nSlow Downs = 0\nInhibit Tests = 0\nRejects = 1,218\nDiverts = 12\n \n[ECR]\nNon-Reorder = 20,598\nNon-ECR Diverted = 0\nECR Diverted = 0\nNon-ECR Stacked = 0\nECR Stacked = 0\nNon-ECR Manual = 0\nECR Manual = 0\n\n[Volume Queue]\nCurrent Volume (1) = \nNext Volume (2) = \nNext Volume (3) = \nNext Volume (4) = \nNext Volume (5) = \n\n[24 Hour Producti on Status]\nHour 1900 = 3,883\nHour 2000 = 5,495\nHour 2100 = 675\nHour 2200 = 0\nHour 2300 = 12\nHour 0000 = 0\nHour 0100 = 0\nHour 0200 = 0\nHour 0300 = 0\nHou r 0400 = 0\nHour 0500 = 0\nHour 0600 = 17\nHour 0700 = 220\nHour 0800 = 235\nHour 0900 = 1,825\nHour 1000 = 3,013\nHour 1100 = 5,193\nHour 1200 = 4,393\nHour 1300 = 1,444 \nHour 1400 = 3,136\nHour 1500 = 557\nHour 1600 = 537\nHour 1700 = 1\nCurrent = 15\n')]), cookies={}> 18:09:56,224 DEBUG [authkit.authenticate.multi] Status: '200 OK', Headers: [('content-type', 'text/html; charset=utf-8'), ('pragma', 'no- cache'), ('cache-control', 'no-cache')] ---------------------------------------- Exception happened during processing of request from ('10.11.4.228', 1557) Traceback (most recent call last): File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/httpserver.py", line 1046, in process_request_in_thread self.finish_request(request, client_address) File "/usr/local/lib/python2.5/SocketServer.py", line 254, in finish_request self.RequestHandlerClass(request, client_address, self) File "/usr/local/lib/python2.5/SocketServer.py", line 522, in __init__ self.handle() File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/httpserver.py", line 425, in handle BaseHTTPRequestHandler.handle(self) File "/usr/local/lib/python2.5/BaseHTTPServer.py", line 316, in handle self.handle_one_request() File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/httpserver.py", line 420, in handle_one_request self.wsgi_execute() File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/httpserver.py", line 287, in wsgi_execute self.wsgi_start_response) File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/translogger.py", line 67, in __call__ return self.application(environ, replacement_start_response) File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/cascade.py", line 92, in __call__ return self.apps[-1](environ, start_response) File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/registry.py", line 340, in __call__ app_iter = self.application(environ, start_response) File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/recursive.py", line 80, in __call__ return self.application(environ, start_response) File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/ paste/errordocument.py", line 185, in __call__ app_iter = self.application(environ, change_response) File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/ authkit/authenticate/__init__.py", line 290, in __call__ return self.app(environ, start_response) File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/ authkit/authenticate/basic.py", line 126, in __call__ return self.application(environ, start_response) File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/ authkit/authenticate/multi.py", line 64, in __call__ raise Exception('WSGI start_response was not called before a result' Exception: WSGI start_response was not called before a result was returned ---------------------------------------- <...> --~--~---------~--~----~------------~-------~--~----~ You received this message because you are subscribed to the Google Groups "pylons-discuss" group. To post to this group, send email to [email protected] To unsubscribe from this group, send email to [EMAIL PROTECTED] For more options, visit this group at http://groups.google.com/group/pylons-discuss?hl=en -~----------~----~----~----~------~----~------~--~---
