Jim

The memtest showed no errors - I can't seem to trace a hardware-related issue at all.
Our problem seems to be related to transaction files which grow quickly... If we resize them regularly, then the issue seems to go away. The hashing in some of these files may not be equally spread, making this issue worse, as most of our transaction files have an id of internaldate.sequence_no (e.g. 15848.1, 15848.2, etc.).
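A minimal sketch of how one might check whether keys of this internaldate.sequence_no shape spread evenly across a hashed file's groups. The hash used here (CRC32 modulo group count) is an assumption for illustration only - it is not jBASE's internal hashing algorithm - but the same occupancy check works with any hash you substitute:

```python
from collections import Counter
import zlib

# Illustrative only: crc32 mod group count stands in for the real
# (undocumented here) jBASE hash. Keys mimic internaldate.sequence_no.
NGROUPS = 7
keys = [f"{15848 + d}.{s}" for d in range(10) for s in range(1, 21)]

occupancy = Counter(zlib.crc32(k.encode()) % NGROUPS for k in keys)
print(sorted(occupancy.items()))
```

A heavily skewed occupancy count for structured keys like these would support the "hashing not equally spread" theory; a flat one would point elsewhere.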
Regards
Simon

===========================
Simon Verona
Director
Dealer Management Services Ltd
email: [email protected]
tel: 0845 686 2300

On 31/05/2011 02:46, Jim Idle wrote:
Try not to top post if you can avoid it.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of jaro
Sent: Monday, May 30, 2011 1:58 PM
To: jBASE
Subject: Re: File Corruption... Causes?

> Usually we had the occurrence of corrupted files in the following cases:
>
> - unexpected hardware problem

Possible, but hardware logs often tell you that something went wrong. Memory errors on PC hardware usually do not, but sometimes when you reboot, Windows can tell you that it suspects memory. Simon, what were the results of your memtests?

> - process got killed by, for example, an admin

If they are a good admin, then they will first try a normal kill before a kill -9. If you go straight to kill -9 (or equivalent) then you are a bad admin.

> - process interrupted by the user itself

This cannot cause file corruption unless the user issues kill -9 on their own process, and you should replace the standard kill command so that they cannot do that.

> - uncommon, but in the case where programs use their own way to update the data in the jBASE hash files without using standard jBASE I/O statements

Nothing should do that.

> - could the reason be some special characters passed within the record content that could cause a collision with the default jBASE delimiters?

It is not possible to corrupt a jBASE file in this way.

> In those cases there is a very high chance of file corruption if the file is updated at that point in time.

This just makes it more likely, because there is a higher chance of part of an update being in memory when a kill -9 is issued. A normal kill will not abort the memory flush to disk.

> Another very high chance of file corruption is if the file is very badly sized, and new and new records need to be written or re-organized. In this case the file is extended with overflow groups and the link between the primary and the secondary segments could be broken.

This is not a source of file corruption; it just means that there are more physical updates involved in a logical update, and so it 'enhances' the chances of outstanding physical updates being present if you issue a kill -9. Lesson: don't issue kill -9.

> Technically, during the record write jBASE assigns a binary lock on the whole block segment where the record should reside, based on the hash algorithm, so at that point in time all the records inside that segment are blocked.

This is file type dependent.

Jim

On Apr 27, 3:51 pm, Jim Idle <[email protected]> wrote:
> If they are corrupted after a hard reboot then your biggest suspect by a long way is that there was actually a process (perhaps orphaned/zombie, perhaps the user says everything is closed properly but it isn't) that was running when you went to reboot, and it/they are not getting closed in a neat way. You should develop a shutdown script that verifies that everyone is off the system and there are no jBASE processes around. You can use the Sysinternals toolset to help with that – you want to try closing the processes in the cleanest way possible, and a hard reboot without stopping the processes may well kill the processes in a drastic way.
>
> Of course, if you mean that someone is turning off the system at the mains, then the more heavily used files are pretty much bound to be corrupt. Only switching to J3 secure files would prevent that, and as you say, that is a lot slower if you want to lose as little data as possible.
>
> On the memory thing – it is worth running Memcheck on a system as part of general maintenance, but if this is the issue, it will affect the file when it is in memory, and only afterwards, when it is flushed to disk, will the disk version reflect whatever the problem is.
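[Editor's note: the "normal kill first, kill -9 only as a last resort" advice above can be sketched as an escalation routine. This is a generic POSIX illustration in Python, not anything jBASE-specific:]

```python
import subprocess

def stop_process(proc, grace=5.0):
    """Ask nicely first (SIGTERM), escalating to SIGKILL only as a last
    resort -- a hard kill can strand half-flushed file updates."""
    proc.terminate()              # normal kill: process may flush and exit
    try:
        proc.wait(timeout=grace)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()               # kill -9 equivalent: last resort only
        proc.wait()
        return "killed"

p = subprocess.Popen(["sleep", "60"])
print(stop_process(p))            # sleep honours SIGTERM -> "terminated"
```

The point of the grace period is exactly the one made above: a normal kill lets the process finish its memory flush to disk, while kill -9 can abort it mid-update.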
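[Editor's note: the shutdown script Jim recommends - verify nobody is on the system and no jBASE processes remain before rebooting - might look like the following sketch. The process names used here are assumptions, not taken from the thread; substitute the names your install actually uses:]

```python
import subprocess

# Hypothetical jBASE process names -- assumptions for illustration;
# adjust to match your own installation.
JBASE_NAMES = {"jsh", "jbase_agent", "jDLS"}

def jbase_leftovers(ps_lines):
    """Return the process names in `ps` output that look like jBASE."""
    names = (line.split()[-1] for line in ps_lines if line.split())
    return [n for n in names if n in JBASE_NAMES]

def safe_to_reboot():
    """True only when no known jBASE processes are still running."""
    out = subprocess.run(["ps", "-e", "-o", "comm="],
                         capture_output=True, text=True).stdout
    left = jbase_leftovers(out.splitlines())
    return (len(left) == 0, left)

ok, leftovers = safe_to_reboot()
print(ok, leftovers)
```

On Windows Server 2003 the equivalent check would use the Sysinternals tools Jim mentions rather than `ps`; the structure - enumerate, filter, refuse to reboot on leftovers - is the same.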
> So flushing the files from memory would not help at all, as it is the memory that is corrupted anyway.
>
> If there are no hardware faults, then I think you need to be looking at the general operations of the users, such as whether they just switch things off without a proper shutdown; make sure they have a UPS, and so on. Also, it is high time that you moved on to jBASE 5, I think. The J5 files there should in theory avoid corruption issues, though they were not very good performance-wise when I last tested them (ages ago now).
>
> Jim

*From:* [email protected] [mailto:[email protected]] *On Behalf Of* Simon Verona
*Sent:* Tuesday, April 26, 2011 10:21 PM
*To:* [email protected]
*Subject:* RE: File Corruption... Causes?

Jim

Thanks for the thought vis-à-vis memory. I had considered that as an issue - is it possible that, because these are active files, they tend to stick in memory for longer and therefore have more of a chance of getting corrupted? (I know that a hard reboot will almost certainly corrupt these files!) Is there a way (with jBASE 3.4.10/Win2003) of forcing these files to be flushed to disk more regularly? I seem to recall the ability to configure a file in jBASE so that it forces a physical disk write on update - though this would hamper performance somewhat!

I will make backups of the files before fixing them and see if devsup can advise (they tend to be large, 1GB+ files, but will zip well).

Regards
Simon

-----Original Message-----
From: Jim Idle <[email protected]>
To: [email protected]
Date: Tue, 26 Apr 2011 13:47:48 -0700
Subject: RE: File Corruption... Causes?

You need to examine hex dumps of the files and see what data is getting overwritten, and with what other data. The other thing you should do is download the latest Memcheck CD image and run a complete memory check on the system. Most PCs do not use ECC memory, and other than crashes or strange things happening, you do not realize that there is a memory issue.
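[Editor's note: the "force a physical disk write on update" option Simon half-remembers corresponds, at the OS level, to syncing after every write, and the performance cost he anticipates is easy to measure. A generic sketch - this is the OS-level analogue, not jBASE's configuration mechanism:]

```python
import os
import tempfile
import time

def write_records(n, sync_each):
    """Write n dummy 512-byte records, optionally forcing each one to
    physical disk (the analogue of 'physical disk write on update')."""
    fd, path = tempfile.mkstemp()
    t0 = time.perf_counter()
    with os.fdopen(fd, "wb") as f:
        for _ in range(n):
            f.write(b"x" * 512)
            if sync_each:
                f.flush()
                os.fsync(f.fileno())  # do not return until on disk
    os.remove(path)
    return time.perf_counter() - t0

buffered = write_records(200, sync_each=False)
synced = write_records(200, sync_each=True)
print(f"buffered={buffered:.4f}s  per-write-fsync={synced:.4f}s")
```

On rotating disks of that era, the per-write-fsync variant is typically orders of magnitude slower, which is exactly why Simon expects it to "hamper performance somewhat".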
However, you may find that this is something like someone opening the file and editing it with Notepad, or something silly like that. Anyway, you might need jBASE to help you with that, but make copies of the corrupt files before 'fixing' them; then you can look for patterns in the corruption. The fact that they are high-activity files just means that they are the most likely to exhibit the problem. You should do the Memcheck overnight as soon as possible, though. Just get the customer to stick the CD in and reboot.

Jim

*From:* [email protected] [mailto:[email protected]] *On Behalf Of* Simon Verona
*Sent:* Tuesday, April 26, 2011 1:13 PM
*To:* [email protected]
*Subject:* RE: File Corruption... Causes?

Jim

I mean physically corrupt... If you do a COUNT [Filename] you crash out with a "Readnext error 2007, file is corrupt" message (or similar). JCHECK with no options confirms the corruption (I double-check by running it multiple times). To correct it, I have to run JCHECK -MS [Filename] with all users logged out.

Typically, the files that this happens to are high-activity files with lots of smallish records in them. I suspected that the size maybe was the issue, so I converted one of the customers to a multipart file, but within a month one of the parts had corrupted. The file is normally discovered as corrupt when reading a record (either atomically, or when running a report).

The problem is that it's not a completely random event - whilst I can't predict when and where it's going to happen, I notice that some systems are more prone to the error, and that certain files are more likely than others to have the problem.

I've kind of eliminated multi-user writing as a cause - one of the files is only written to by a single program, which sets an execution lock to ensure that only one process can update the file at a time.
It is, ironically, this file that statistically corrupts the most often.

I'm sorry if I'm a little vague about the issue, but I don't really have a grip on what is going on. I don't know *when* the files are corrupting - only that they *are* corrupt.

Regards
Simon

-----Original Message-----
From: Jim Idle <[email protected]>
To: [email protected]
Date: Tue, 26 Apr 2011 12:50:26 -0700
Subject: RE: File Corruption... Causes?

Do you mean logically corrupt (your records are wrong) or physically corrupt (you have to use JCHECK)? You cannot physically corrupt a file by writing to it without taking a lock; you will just get trash results in your file. When are you discovering the data is corrupt? There are lots of things that you can do to actually corrupt it, and some things (such as running JCHECK while people are writing to the file) that might make you think it is corrupt.

Jim

*From:* [email protected] [mailto:[email protected]] *On Behalf Of* Simon Verona
*Sent:* Tuesday, April 26, 2011 12:39 PM
*To:* [email protected]
*Subject:* File Corruption... Causes?

This issue is generic, and relates to a number of similar jBASE 3.4.10-based systems running Windows Server 2003. We have an ongoing issue with file corruption in j4 format files. The problem appears somehow to be application driven - I suspect this because, across a number of systems, the files that corrupt are always the same ones... So I'm looking for inspiration at an application level as to what could cause file corruption. One thought I had was a WRITE without previously doing a READU.
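[Editor's note: the READU-before-WRITE discipline Simon is asking about can be modelled in miniature. This is a Python toy model of record-lock discipline, not jBASE's implementation; it shows why holding the record lock across the read-modify-write prevents lost updates, which (as Jim notes below) corrupt data logically rather than physically:]

```python
import threading

# Toy model: READU acquires the record lock, WRITE releases it.
store = {"15848.1": 0}
record_locks = {}
registry_lock = threading.Lock()

def readu(key):
    with registry_lock:
        lock = record_locks.setdefault(key, threading.Lock())
    lock.acquire()                 # READU blocks until we own the record
    return store[key]

def write(key, value):
    store[key] = value
    record_locks[key].release()    # WRITE releases the record lock

def bump(key, times):
    for _ in range(times):
        write(key, readu(key) + 1)

threads = [threading.Thread(target=bump, args=("15848.1", 1000))
           for _ in range(4)]
for t in threads: t.start()
for t in threads: t.join()
print(store["15848.1"])            # 4000: no lost updates with the lock held
```

Drop the lock (WRITE without READU) and concurrent writers can interleave their read-modify-write steps, silently losing updates - trashed record contents, as Jim puts it, though not physical file corruption.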
I've not managed to duplicate the issue doing this, but it's difficult to simulate a multi-user test that replicates what the application might be doing.

Does anybody know if this *could* be the cause, or know of some other application (data-basic) issue that could cause a J4 file to be corrupted?

Thanks in advance

Simon Verona

--
Please read the posting guidelines at: http://groups.google.com/group/jBASE/web/Posting%20Guidelines
IMPORTANT: Type T24: at the start of the subject line for questions specific to Globus/T24
To post, send email to [email protected]
To unsubscribe, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/jBASE?hl=en
