Following up on my original email, I have done some more tests and have some further comments.

We have now tested removing the content length from the response, by
changing the GetMethod. This has cured the corrupted file problem, but
it is clearly not an ideal fix, so I am now looking at other solutions
to the overall problem.

With the above change the same scenario and test case produce a correct
file, but another issue arises: for a moment during an update, no file
exists in the DB. This is because in the ContentStore an update does a
"delete" and then a "store" of the content. Is there any reason this was
not done as a direct DB update? Also, if you do a GET while a PUT is in
progress, then at best you are held up until the table lock is released;
at worst the GET lands in the window where no file exists and returns a
404.

I think a better solution would be to always create a new revision for
an update, and, if revisions are turned off, to delete any existing
revisions. Overall this would add some overhead, but it would at least
make a GET that runs at the same time as a PUT work.

Comments please?

Rgds
CB

> -----Original Message-----
> From: Britton, Colin
> Sent: Monday, January 07, 2002 3:22 PM
> To: [EMAIL PROTECTED]
> Subject: Problem with GET and content Length
>
>
> We have come across a problem which we think is the following....
>
> Before I start, this is using the J2EE content and descriptor
> store on win2k with Tomcat 4.0.2b1 and sql server, but I
> presume all other stores will do the same (although the
> timing of processes may make this more difficult to experience).
>
> Scenario
> ========
> If you have a small file in the store and you update it with
> a new file which is bigger, and then while this update is
> being done you do a GET on the same file.
> The GET waits for the put to finish and then GETS the new
> content from the store, but because it has already read the
> descriptors and these contain data from the old file then the
> response contains an invalid content-length header and the
> browser/client does not correctly read the file, leading to
> corruption. The other way round would cause less of a visible
> problem because the content-length header would be set to
> longer than the content, but is still I believe the case.
>
> The sequence is such that a file storage occurs before the
> descriptors are set, and during a get the descriptors are
> read before the file is retrieved from storage. During this
> crossover the problem can be experienced.
>
> Test Case
> =========
> We have done some tests to indicate this is the case, but I am
> open to comments or corrections if anyone has any other ideas
> what might cause this..
>
> You can recreate the problem by doing the following.
> 1) Create two files of a different size, but the same name
> (we used 200k and 6mb in different folders - obviously)
>
> 2) Create a web folder client access to slide
>
> 3) drop on file 1
>
> 4) browse to first file with IE and open it
>
> 5) Drop large file in web folder to overwrite first file
>
> 6) immediately refresh IE - second file is downloaded and get
> waits for this to complete, and then is viewed but only to
> the amount of content of the first file.
>
> Another note on this is, for some reason of which I am not
> yet certain this behavior remains for this file in that
> browser until the server is restarted. This could be a
> browser cache issue as I see that no headers are set (pragma,
> expires etc...). Once the Tomcat server is restarted the
> problem disappears (or if you connect fresh from another client).
>
> Solution ideas
> ==============
>
> 1) remove the content length from the response (not a good
> idea but a quick fix)
>
> 2) Change the GET method to get the content length after it
> gets the data, but before committing the response. (not good
> for performance)
>
> 3) Change the storage implementation to always use revisions
> but only keep the latest two. So you always have a matching
> content and descriptor pair. i.e. new content is not ready
> till new descriptors are written. (not sure about this, but
> the idea of a two-phase content update is nice)
>
> 4) sure there are plenty of others.....
>
> Comments, ideas, solutions?
>
> rgds
> CB
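
To make the "always create a new revision" idea concrete, here is a rough sketch of the behavior being proposed. It is not Slide code: RevisionedStore, Revision, put and get are hypothetical names, and an in-memory AtomicReference stands in for the DB, so it only illustrates the ordering, not a real store implementation. The point is that content and its descriptor (here just the length) are built together and published in one step, so a GET never sees an old length paired with new content, and never sees a moment with no file.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch only; these names are NOT part of the Slide API.
final class RevisionedStore {

    // One revision: the content and its descriptor travel together.
    static final class Revision {
        final int number;
        final byte[] content;
        final long contentLength; // descriptor derived from this content

        Revision(int number, byte[] content) {
            this.number = number;
            this.content = content;
            this.contentLength = content.length;
        }
    }

    // Assumption: a single reference stands in for the "current revision" row.
    private final AtomicReference<Revision> current = new AtomicReference<>();

    // PUT: build the complete new revision first, then publish it atomically.
    // With revisions "turned off", the previous revision is simply dropped
    // once the new one is visible; there is no delete-then-store gap.
    void put(byte[] newContent) {
        byte[] copy = newContent.clone();
        Revision previous;
        Revision next;
        do {
            previous = current.get();
            int number = (previous == null) ? 1 : previous.number + 1;
            next = new Revision(number, copy);
        } while (!current.compareAndSet(previous, next));
    }

    // GET: one read returns a matching content/descriptor pair,
    // or null if nothing has ever been stored.
    Revision get() {
        return current.get();
    }
}
```

A GET that started before the PUT keeps its old revision (old content with the old length), and a GET that starts after gets the new pair, so the content-length header would always match the body that is sent.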
