My apologies for the length, I wanted to make sure to provide all the
necessary information.


=========
Background

My first Pylons project is the server portion of an automatic data
collection system in a manufacturing environment.  There are a number
of vendor provided Windows boxes that run some custom built
manufacturing systems on which I've deployed a Windows service.  This
service watches some number of files that the vendor dumps logging/
status information to and then posts a file in the windows
configuration format to
http://servername:port/department_name/tool_name/controller/data_version/year/month/day/hour/minute/second'.

The server side at this point is made up of what roughly amounts to an
abstract parent class that handles talking to the database and one
(though more coming) child class that parses whatever the vendor code
outputs and then calls the insertion method in the parent.  Everything
was working fantastically for the first few weeks when we had only one
client but since I've deployed the code to the rest of the tools on
the floor about one submission 250 or so appears in the database as if
it's coming from the wrong tool.

==========
The Problem:

One of my users submitted a bug report that stated that a job that had
run on one tool was being reported as having run on a different tool.
When I checked the database, I found that there were 4 different tools
associated with a job that had only run on 1.  I then expanded my
search and found that just about half the jobs in the database were
had records from extraneous tools.  For example, job_id=24 was run
exclusively on the tool specified by data_source_id=3 but a total of 6
records out of 680 records appear to be coming from data_source_id=6.

In one representative case, two requests from different machines, S14
(data_source_id=3) and  PB24 (data_source_id=6) come in at about the
same time (see logs below).  The request from S14 hits an authkit
error (see logs below, the IP of S14 is 10.11.4.228).  One record
(75198) from that batch is inserted with the correct source (S14), the
remaining 3 (75176, 75172, 75174) are inserted as though they came
from PB24.

 Below is the relevant data from the database, sorted by
data_source_id to show the problem:
  id   | value |        time         | measurement_type_id |
data_source_id | job_id
 75170 | 2008-08-14 18:08:14 |                   4 |              3
|     24
 75180 | 2008-08-14 18:08:16 |                   1 |              3
|     24
 75178 | 2008-08-14 18:08:16 |                   2 |              3
|     24
 75179 | 2008-08-14 18:08:16 |                   3 |              3
|     24
 75177 | 2008-08-14 18:08:16 |                   4 |              3
|     24
 75198 | 2008-08-14 18:10:14 |                   4 |              3
|     24
 75208 | 2008-08-14 18:10:16 |                   1 |              3
|     24
 75206 | 2008-08-14 18:10:16 |                   2 |              3
|     24
 75207 | 2008-08-14 18:10:16 |                   3 |              3
|     24
 75205 | 2008-08-14 18:10:16 |                   4 |              3
|     24
 75164 | 2008-08-14 18:08:12 |                   1 |              6
|     22
 75162 | 2008-08-14 18:08:12 |                   2 |              6
|     22
 75163 | 2008-08-14 18:08:12 |                   3 |              6
|     22
 75161 | 2008-08-14 18:08:12 |                   4 |              6
|     22
 75176 | 2008-08-14 18:09:12 |                   1 |              6
|     24
 75175 | 2008-08-14 18:09:12 |                   1 |              6
|     22
 75171 | 2008-08-14 18:09:12 |                   2 |              6
|     22
 75172 | 2008-08-14 18:09:12 |                   2 |              6
|     24
 75173 | 2008-08-14 18:09:12 |                   3 |              6
|     22
 75174 | 2008-08-14 18:09:12 |                   3 |              6
|     24
 75169 | 2008-08-14 18:09:12 |                   4 |              6
|     22
 75192 | 2008-08-14 18:10:12 |                   1 |              6
|     22
 75190 | 2008-08-14 18:10:12 |                   2 |              6
|     22
 75191 | 2008-08-14 18:10:12 |                   3 |              6
|     22
 75189 | 2008-08-14 18:10:12 |                   4 |              6
|     22
 75201 | 2008-08-14 18:11:12 |                   1 |              6
|     24
 75204 | 2008-08-14 18:11:12 |                   1 |              6
|     22
 75199 | 2008-08-14 18:11:12 |                   2 |              6
|     24
 75202 | 2008-08-14 18:11:12 |                   2 |              6
|     22
 75203 | 2008-08-14 18:11:12 |                   3 |              6
|     22
 75200 | 2008-08-14 18:11:12 |                   3 |              6
|     24
 75197 | 2008-08-14 18:11:12 |                   4 |              6
|     22

My first thought was that
        s =
select([model.data_source.c.id],and_(model.equipment.c.name==tool_name,
 
model.equipment.c.id==model.data_source.c.equipment_id,
 
model.data_source_type.c.name==data_source_type_name,
 
model.data_source.c.type_id==model.data_source_type.c.id))
        result = s.execute().fetchone()
        ds_id = result[0]

 (see full code below)
was returning a different result each time due to the lack of a sort,
but the table is only the following 7 rows:
 id | type_id | equipment_id | collector_id
----+---------+--------------+--------------
  1 |       1 |            1 |            1
  2 |       1 |            2 |            2
  3 |       1 |            3 |            3
  4 |       1 |            4 |            4
  5 |       1 |            5 |            5
  6 |       1 |            6 |            6
  7 |       1 |            7 |            7
(7 rows)

Neither the equipment nor the collector tables have any obvious
problems like two records with the same name, etc.

I expected that either the 3 records that end up assigned to the wrong
tool (which  I'm assuming are processed after the exception is raised)
would get tossed (unhandled exception) or inserted as S14 (exception
is handled somewhere in the framework) but this isn't the case.  In
fact, I'm not even sure why the exception is generated.  Obviously
there is a bug somewhere, but I have no idea where.  Any thoughts?


=====================
The relevant code snippets:

<...>

class DatasourceParams(object):
    dataFile = None
    timestamp = None
    tool = None
    rawDataPath = None

    def __init__(self,request):
        self.dataFile = request.params['data_file']
        self.timestamp =
datetime.datetime(int(request.urlvars['year']),
                             int(request.urlvars['month']),
                             int(request.urlvars['day']),
                             int(request.urlvars['hour']),
                             int(request.urlvars['minute']),
                             int(request.urlvars['second']))
        self.data_source_type =
request.urlvars['controller']
        self.version =
request.urlvars['version']
        self.tool = request.urlvars['toolname']
        self._storeRaw()

    def _storeRaw(self):
        if not os.path.exists(config['rawstore']) :
            os.mkdir(config['rawstore'])
        elif not os.path.isdir(config['rawstore']):
            raise Exception("Raw Storage Directory specifies a file
not a directory")
        self.rawDataPath = config['rawstore']+self.tool
+'-'+self.timestamp.strftime('%Y%m%d%H%M%S')
        fp = file(self.rawDataPath,'w')
        fp.write(self.dataFile)
        fp.flush()
        fp.close()


<...>

class DatasourceController(BaseController):

    ds_params = None
    __supported_versions = []

    def create(self):
        log.debug(request)
        self.ds_params = h.DatasourceParams(request)
        if self.ds_params.version in self._supported_versions:
            self._parse()
            return "Records Inserted"
        else :
            return "Unsupported Version"

    def insertDatum(self, measurementTypeLabel, tool_name,
data_source_type_name, timeStamp, value, job_name=None):
        result =
model.measurement_type.select(model.measurement_type.c.label==measurementTypeLabel).execute()
        row = result.fetchone()
        mt_id = row['id']

        s =
select([model.data_source.c.id],and_(model.equipment.c.name==tool_name,
 
model.equipment.c.id==model.data_source.c.equipment_id,
 
model.data_source_type.c.name==data_source_type_name,
 
model.data_source.c.type_id==model.data_source_type.c.id))

        result = s.execute().fetchone()
        ds_id = result[0]
        if job_name != None:
            s =
select([model.job.c.id],and_(model.job.c.name==job_name))
            result = s.execute().fetchone()
            if result == None:
                result =
model.job.insert().execute(name=job_name).fetchone()
            j_id = result[0]
        else :
            j_id = None
        model.measurement.insert().execute(measurement_type_id=mt_id,
                                           data_source_id = ds_id,
                                           time=timeStamp,
                                           value=value,
                                           job_id=j_id)

<...>

class FaststatController(DatasourceController):

    _supported_versions = ['1']
    _datafile_section_name = "Job"
    _sourceToDBMapping = {'Eye 5 Good Books':"Good Books",
                           'Eye 5 IJ Books':"IJ Books",
                           'Eye 5 QC Books':"QC Books",
                           'Eye 5 Rejects':"Rejects",
                           'Print Order' : 'Job Print Order',
                          }
    def _parse(self):
        try:
            # A hack to avoid having to write my own config parser
            massagedDataFile = "\r
\n".join(self.ds_params.dataFile.split('\n\n'))
            fp = h.StringWithReadLine(massagedDataFile)
            inputFile = ConfigParser.ConfigParser()
            inputFile.readfp(fp)

            for measurementType in self._sourceToDBMapping.keys():
                self.insertDatum(measurementType,
                                 self.ds_params.tool,
                                 self.ds_params.data_source_type,
                                 self.ds_params.timestamp,
 
str(inputFile.get(self._datafile_section_name,
 
self._sourceToDBMapping[measurementType])).replace(',',''),
                                 job_name =
inputFile.get(self._datafile_section_name, 'Job Name'))
        except Exception, inst:
            log.error(inst)

FaststatController =
middleware(FaststatController(),HasAuthKitRole(['collector']))



paster.log
=======
The server is logging at DEBUG level, so I checked the log and found
this (please note, we suffer from some pretty bad clock skew problems
between different servers, I did some verification and these log
messages do correspond to those in the database section above):

<...>
18:09:56,068 DEBUG [warehouse.controllers.dataSource]
<paste.wsgiwrappers.WSGIRequest object at 0x9ad53ac method='POST',
scheme='http', host='mds.lanepress.com:5000', script_name='',
path_info='/bindery/S14/FASTstat/1/2008/08/14/18/08/14/',
languges=['en-us'], charset='utf-8', errors='replace',
GET=UnicodeMultiDict([]),
POST=UnicodeMultiDict([('data_file', u'[Misc]\nLast Updated Date/Time
= 08/14/08 18:08:12\nCurrent Run Speed (BPM) = 0\nMachine
Name            = 16 Pocket\nCurrent Operator        = NONE\nLogin
 Level (Mode)      = QC\n\n[Current Volume Status]\nName
= cat34606_08_12dom\nFile Name          = V1.*\nFirst Record
=         1\nLast Record        =     6,021\nTotal Record
s      =     6,021\nStop Record        =     6,021\nLast Record to PLC
=       200\nRemaining Records  =     5,821\n\n[Job]\nJob
Name           = CATEGORY SOLUTIONS\nJob Print Order    =      7,0
00\nSystem Cycles      =         49\nGood Books         =
3\nIJ Books           =          0\nNon-IJ Books       =
3\nQC Books           =          0\nBulk Books         =
    0\nFixed Msg Books    =          0\nSystem Stops       =
0\nSlow Downs         =          0\nInhibit Tests      =
0\nRejects            =         49\nDiverts            =
       0\nStacks             =          0\n\n[Shift]\nSystem
Cycles      =  48,839\nGood Books         =  47,475\nIJ
Books           =  42,557\nNon-IJ Books       =   4,908\nQC
Books           =
     10\nBulk Books         =       0\nFixed Msg Books    =
0\nSystem Stops       =      64\nSlow Downs         =       0\nInhibit
Tests      =       0\nRejects            =  48,839\nDivert
s            =      82\n\n[ECR]\nNon-Reorder        =         3\nNon-
ECR Diverted   =         0\nECR Diverted       =         0\nNon-ECR
Stacked    =         0\nECR Stacked        =         0\nNo
n-ECR Manual     =         0\nECR Manual         =         0\n
\n[Volume Queue]\nCurrent Volume (1) = cat34606_08_12dom\nNext Volume
(2)    =  \nNext Volume (3)    =  \nNext Volume (4)    =  \nNex
t Volume (5)    =  \n\n[24 Hour Production Status]\nHour 1900 =
0\nHour 2000 =    181\nHour 2100 =  2,441\nHour 2200 =  2,103\nHour
2300 =  2,138\nHour 0000 =    748\nHour 0100 =  3,157\nHou
r 0200 =  1,450\nHour 0300 =  1,751\nHour 0400 =    691\nHour 0500
=    854\nHour 0600 =      0\nHour 0700 =  2,665\nHour 0800 =
5,956\nHour 0900 =  6,860\nHour 1000 =  5,478\nHour 1100 =  6,379
\nHour 1200 =  6,520\nHour 1300 =  5,844\nHour 1400 =  6,437\nHour
1500 =  1,333\nHour 1600 =      0\nHour 1700 =      0\nCurrent
=      3\n')]),
cookies={}>
18:09:56,076 DEBUG [paste.httpserver.ThreadPool] Added task (0 tasks
queued)
18:09:56,081 DEBUG [routes.middleware] Matched POST /bindery/PB24/
FASTstat/1/2008/08/14/18/09/12/
18:09:56,081 DEBUG [routes.middleware] Route path:
':department/:toolname/:controller/:version/:year/:month/:day/:hour/:minute/:second',
defaults: {'action': u'create'}
18:09:56,082 DEBUG [routes.middleware] Match dict: {'hour': u'18',
'second': u'12', 'month': u'08', 'action': u'create', 'controller':
u'FASTstat', 'version': u'1', 'year': u'2008', 'department':
 u'bindery', 'toolname': u'PB24', 'day': u'14', 'minute': u'09'}
18:09:56,082 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object
globals
18:09:56,083 DEBUG [pylons.wsgiapp] Resolved URL to controller:
u'FASTstat'
18:09:56,083 DEBUG [pylons.wsgiapp] Calling controller class with WSGI
interface
18:09:56,084 DEBUG [authkit.authenticate.multi] Status: '401 Not
Authenticated', Headers: [('content-type', 'text/plain')]
18:09:56,084 DEBUG [authkit.authenticate.multi] Status checker
recieved status '401 Not Authenticated', headers [('content-type',
'text/plain')], intecept ['401']
18:09:56,085 DEBUG [authkit.authenticate.multi] Status checker returns
True
18:09:56,085 DEBUG [authkit.authenticate.multi] MultiMiddleware
self.checker check() returning
<authkit.authenticate.basic.BasicAuthHandler object at 0x96ba30c>
18:09:56,086 DEBUG [authkit.authenticate.multi] Matched binding
returns status: '401 Unauthorized', headers: [('WWW-Authenticate',
'Basic realm="LanePress"'), ('content-type', 'text/plain')], exc
_info: None
18:09:56,087 DEBUG [routes.middleware] Matched POST /error/document/
18:09:56,087 DEBUG [routes.middleware] Route path:
'error/:action/:id', defaults: {'action': 'index', 'controller':
u'error', 'id': None}
18:09:56,087 DEBUG [routes.middleware] Match dict: {'action':
u'document', 'controller': u'error', 'id': None}
18:09:56,088 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object
globals
18:09:56,088 DEBUG [pylons.wsgiapp] Resolved URL to controller:
u'error'
18:09:56,089 DEBUG [pylons.wsgiapp] Controller appears to be a class,
instantiating
18:09:56,089 DEBUG [pylons.wsgiapp] Calling controller class with WSGI
interface
18:09:56,090 DEBUG [pylons.controllers.core] Looking for u'document'
method to handle the request
18:09:56,090 DEBUG [pylons.controllers.core] Calling 'document' method
with keyword args: **{}
18:09:56,091 DEBUG [pylons.controllers.core] Assuming controller
returned a basestring or buffer, writing it to pylons.response
18:09:56,091 DEBUG [pylons.controllers.core] Calling Response object
to return WSGI data
18:09:56,092 DEBUG [pylons.controllers.core] Merging pylons.response
headers into start_response call, status: 200 OK
18:09:56,092 DEBUG [authkit.authenticate.multi] Status: '200 OK',
Headers: [('content-type', 'text/html; charset=utf-8'), ('pragma', 'no-
cache'), ('cache-control', 'no-cache')]
18:09:56,093 DEBUG [authkit.authenticate.multi] Status checker
recieved status '200 OK', headers [('content-type', 'text/html;
charset=utf-8'), ('pragma', 'no-cache'), ('cache-control', 'no-cache
')], intecept ['401']
18:09:56,093 DEBUG [authkit.authenticate.multi] Status checker returns
False
18:09:56,094 DEBUG [authkit.authenticate.multi] Multi: No binding was
found for the check
18:09:56,099 DEBUG [paste.httpserver.ThreadPool] Added task (0 tasks
queued)
18:09:56,101 DEBUG [authkit.authenticate] valid_password called.
username: pb24
18:09:56,103 DEBUG [routes.middleware] Matched POST /bindery/PB24/
FASTstat/1/2008/08/14/18/09/12/
18:09:56,103 DEBUG [routes.middleware] Route path:
':department/:toolname/:controller/:version/:year/:month/:day/:hour/:minute/:second',
defaults: {'action': u'create'}
18:09:56,103 DEBUG [routes.middleware] Match dict: {'hour': u'18',
'second': u'12', 'month': u'08', 'action': u'create', 'controller':
u'FASTstat', 'version': u'1', 'year': u'2008', 'department':
 u'bindery', 'toolname': u'PB24', 'day': u'14', 'minute': u'09'}
18:09:56,104 DEBUG [pylons.wsgiapp] Setting up Pylons stacked object
globals
18:09:56,105 DEBUG [pylons.wsgiapp] Resolved URL to controller:
u'FASTstat'
18:09:56,105 DEBUG [pylons.wsgiapp] Calling controller class with WSGI
interface
18:09:56,106 DEBUG [warehouse.controllers.dataSource]
<paste.wsgiwrappers.WSGIRequest object at 0x9ad32ec method='POST',
scheme='http', host='mds.lanepress.com:5000', script_name='',
path_info='/bindery/PB24/FASTstat/1/2008/08/14/18/09/12/',
languges=['en-us'], charset='utf-8', errors='replace',
GET=UnicodeMultiDict([]),
POST=UnicodeMultiDict([('data_file', u'[Misc]\nLast Updated Date/Time
= 08/14/2008 18:09:08\nCurrent Run Speed (BPM) = 0\nMachine
Name            = RGUB\nCurrent Operator        = Supervisor\nLo
gin Level (Mode)      = SETUP\n\n[Current Volume Status]
\nName               = \nFile Name          = .*\nFirst Record
=         0\nLast Record        =         0\nTotal Records      =
    0\nStop Record        =         0\nLast Record to PLC =
0\nRemaining Records  =         0\n\n[Job]\nJob Name           =
ARTnews\nJob Print Order    =    100,250\nSystem Cycles      =
     46,873\nGood Books         =     20,598\nIJ Books
=          0\nNon-IJ Books       =     20,598\nQC Books
=          0\nBulk Books         =          0\nFixed Msg Books
  =          0\nSystem Stops       =         91\nSlow Downs
=          0\nInhibit Tests      =          0\nRejects
=      1,239\nDiverts            =         13\nStacks
     =          0\n\n[Shift]\nSystem Cycles      =  44,944\nGood
Books         =  20,569\nIJ Books           =       0\nNon-IJ
Books       =  20,569\nQC Books           =       0\nBulk Books
    =       0\nFixed Msg Books    =       0\nSystem Stops       =
74\nSlow Downs         =       0\nInhibit Tests      =
0\nRejects            =   1,218\nDiverts            =      12\n
\n[ECR]\nNon-Reorder        =    20,598\nNon-ECR Diverted   =
0\nECR Diverted       =         0\nNon-ECR Stacked    =         0\nECR
Stacked        =         0\nNon-ECR Manual     =
   0\nECR Manual         =         0\n\n[Volume Queue]\nCurrent Volume
(1) =  \nNext Volume (2)    =  \nNext Volume (3)    =  \nNext Volume
(4)    =  \nNext Volume (5)    =  \n\n[24 Hour Producti
on Status]\nHour 1900 =  3,883\nHour 2000 =  5,495\nHour 2100 =
675\nHour 2200 =      0\nHour 2300 =     12\nHour 0000 =      0\nHour
0100 =      0\nHour 0200 =      0\nHour 0300 =      0\nHou
r 0400 =      0\nHour 0500 =      0\nHour 0600 =     17\nHour 0700
=    220\nHour 0800 =    235\nHour 0900 =  1,825\nHour 1000 =
3,013\nHour 1100 =  5,193\nHour 1200 =  4,393\nHour 1300 =  1,444
\nHour 1400 =  3,136\nHour 1500 =    557\nHour 1600 =    537\nHour
1700 =      1\nCurrent   =     15\n')]),
cookies={}>
18:09:56,224 DEBUG [authkit.authenticate.multi] Status: '200 OK',
Headers: [('content-type', 'text/html; charset=utf-8'), ('pragma', 'no-
cache'), ('cache-control', 'no-cache')]
----------------------------------------
Exception happened during processing of request from ('10.11.4.228',
1557)
Traceback (most recent call last):
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/httpserver.py", line 1046, in process_request_in_thread
    self.finish_request(request, client_address)
  File "/usr/local/lib/python2.5/SocketServer.py", line 254, in
finish_request
    self.RequestHandlerClass(request, client_address, self)
  File "/usr/local/lib/python2.5/SocketServer.py", line 522, in
__init__
    self.handle()
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/httpserver.py", line 425, in handle
    BaseHTTPRequestHandler.handle(self)
  File "/usr/local/lib/python2.5/BaseHTTPServer.py", line 316, in
handle
    self.handle_one_request()
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/httpserver.py", line 420, in handle_one_request
    self.wsgi_execute()
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/httpserver.py", line 287, in wsgi_execute
    self.wsgi_start_response)
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/translogger.py", line 67, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/cascade.py", line 92, in __call__
    return self.apps[-1](environ, start_response)
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/registry.py", line 340, in __call__
    app_iter = self.application(environ, start_response)
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/recursive.py", line 80, in __call__
    return self.application(environ, start_response)
  File "/usr/local/lib/python2.5/site-packages/Paste-1.4.2-py2.5.egg/
paste/errordocument.py", line 185, in __call__
    app_iter = self.application(environ, change_response)
  File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/
authkit/authenticate/__init__.py", line 290, in __call__
    return self.app(environ, start_response)
  File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/
authkit/authenticate/basic.py", line 126, in __call__
    return self.application(environ, start_response)
  File "/usr/local/lib/python2.5/site-packages/AuthKit-0.4.0-py2.5.egg/
authkit/authenticate/multi.py", line 64, in __call__
    raise Exception('WSGI start_response was not called before a
result'
Exception: WSGI start_response was not called before a result was
returned
----------------------------------------
<...>

--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"pylons-discuss" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/pylons-discuss?hl=en
-~----------~----~----~----~------~----~------~--~---

Reply via email to