Jim

The memtest showed no errors - I can't seem to trace a hardware-related issue at all.

Our problem seems to be related to transaction files that grow quickly. If we resize them regularly, the issue seems to go away. The hashing in some of these files may not be evenly spread, which makes the issue worse, as most of our transaction files have ids of the form internaldate.sequence_no (e.g. 15848.1, 15848.2, etc.).
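For what it's worth, the unevenness is easy to demonstrate with a toy model. This is not jBASE's actual hash; the group count and hash function below are made up purely for illustration:

```python
from collections import Counter

# Toy illustration only: jBASE j4 files do NOT use this hash. The point is
# that sequential ids like 15848.1, 15848.2, ... can cluster badly under a
# weak hash, leaving some groups overfull while others stay near-empty.
GROUPS = 31  # hypothetical number of groups in the file

def naive_hash(record_id: str) -> int:
    """A deliberately weak character-sum hash, mod the group count."""
    return sum(ord(c) for c in record_id) % GROUPS

keys = [f"15848.{seq}" for seq in range(1, 501)]
counts = Counter(naive_hash(k) for k in keys)

print(f"fullest group holds {max(counts.values())} records, "
      f"emptiest holds {min(counts.values())}")
```

Under a weak hash the sequential ids concentrate in a few groups, which is exactly the situation where overflow groups multiply and regular resizing helps.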

Regards
Simon

===========================
Simon Verona
Director
Dealer Management Services Ltd

email: [email protected]
tel: 0845 686 2300


On 31/05/2011 02:46, Jim Idle wrote:
Try not to top post if you can avoid it.

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf
Of jaro
Sent: Monday, May 30, 2011 1:58 PM
To: jBASE
Subject: Re: File Corruption... Causes?

We usually saw the occurrence of corrupted files in the following
cases:
- unexpected hardware problem
Possible but hardware logs often tell you that something went wrong.
Memory errors on PC hardware usually do not, but sometimes when you
reboot, Windows can tell you that it suspects memory. Simon, what were the
results of your memtests?

- process got killed by, for example, an admin
If they are a good admin, then they will first try a normal kill before a
kill -9. If you go straight to kill -9 (or equivalent), then you are a bad
admin.


- process interrupted by the user themselves
This cannot cause file corruption unless the user issues kill -9 on their
own process, and you should replace the standard kill command so that they
cannot do that.

- uncommon, but possible in the case where programs use their own way to
update the data in the jBASE hashed files without using the standard jBASE
I/O statements
Nothing should do that.

- could the reason be some special characters passed within the record
content causing a collision with the default jBASE delimiters?
It is not possible to corrupt a jBASE file in this way.

In those cases there is a very high chance of file corruption if the
file is being updated at that point in time.
This just makes it more likely because there is a higher chance of part of
an update being in memory when a kill -9 is issued. A normal kill will not
abort the memory flush to disk.


Another very high chance of
file corruption arises if the file is badly sized and more and more
records need to be written or reorganized. In this case the file is
extended with overflow groups, and the link between the primary and
secondary segments could be broken.
This is not a source of file corruption; it just means that there are more
physical updates involved in a logical update, and so it 'enhances' the
chances of outstanding physical updates being present if you issue a kill
-9.

Lesson: don't issue kill -9
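The polite-then-forceful order can be sketched in a few lines (POSIX flavour; the child process and timeout below are illustrative only):

```python
import signal
import subprocess

def stop_cleanly(proc: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Send SIGTERM first (a plain `kill`), giving the process a chance to
    finish flushing pending writes; escalate to SIGKILL (`kill -9`) only if
    it ignores the polite request."""
    proc.terminate()                 # plain kill: catchable, flushes complete
    try:
        proc.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        proc.kill()                  # kill -9: last resort, cannot be caught
        proc.wait()
    return proc.returncode

# A child that honours SIGTERM never needs SIGKILL.
child = subprocess.Popen(["sleep", "60"])
print(stop_cleanly(child))  # prints -15 on POSIX (terminated by SIGTERM)
```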

Technically, during a record write jBASE assigns a binary lock
on the whole block segment where the record should reside, based on the
hashing algorithm, so at that point in time all the records inside that
segment are blocked.
This is file type dependent.

Jim

On Apr 27, 3:51 pm, Jim Idle <[email protected]> wrote:
If they are corrupted after a hard reboot, then your biggest suspect by
a long way is that there was actually a process (perhaps
orphaned/zombie, perhaps the user says everything is closed properly
but it isn’t) still running when you went to reboot, and it is not
getting closed in a neat way. You should develop a shutdown script
that verifies that everyone is off the system and there are no jBASE
processes around. You can use the Sysinternals toolset to help with
that – you want to try closing the processes in the cleanest way
possible, and a hard reboot without stopping the processes first may
well kill them in a drastic way.
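A sketch of that verification step (POSIX `ps` for illustration; on the Windows 2003 systems in this thread the same check would drive Sysinternals pslist or `tasklist` instead, and the "jbase" fragment is a hypothetical process-name match):

```python
import subprocess

def processes_matching(fragment: str):
    """Return (pid, command) pairs whose command name contains `fragment`.
    POSIX sketch using plain `ps`; adapt the listing command per platform."""
    out = subprocess.run(["ps", "-eo", "pid=,comm="],
                         capture_output=True, text=True, check=True).stdout
    hits = []
    for line in out.splitlines():
        line = line.strip()
        if not line:
            continue
        pid, _, comm = line.partition(" ")
        if fragment.lower() in comm.lower():
            hits.append((int(pid), comm.strip()))
    return hits

# Refuse to continue the shutdown while anything jBASE-like survives.
hits = processes_matching("jbase")
if hits:
    print(f"WARNING: {len(hits)} jBASE process(es) still running - do not reboot")
else:
    print("no jBASE processes found - safe to continue shutdown")
```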
Of course if you mean that someone is turning off the system at the
mains, then the more heavily used files are pretty much bound to be
corrupt. Only switching to J3 secure files would prevent that and as
you say, that is a lot slower if you want to lose as little data as
possible.
On the memory thing – it is worth running the memcheck on a system as
part of general maintenance, but if this is the issue, it will affect
the file while it is in memory, and only afterwards, when it is flushed
to disk, will the disk version reflect whatever the problem is. So
flushing the files from memory would not help at all, as it is the
memory that is corrupted anyway.
If there are no hardware faults, then I think you need to be looking
at the general operations of the users, such as whether they just
switch things off without a proper shutdown; make sure they have a UPS
and so on. Also, it is high time that you moved on to jBASE 5, I
think.
The J5 files should in theory avoid corruption issues, though
they were not very good performance-wise when I last tested them
(ages ago now).
Jim

*From:* [email protected] [mailto:[email protected]] *On
Behalf Of *Simon Verona
*Sent:* Tuesday, April 26, 2011 10:21 PM
*To:* [email protected]
*Subject:* RE: File Corruption... Causes?

Jim

Thanks for the thought vis-à-vis memory. I had considered that as an
issue - is it possible that, because these are active files, they tend
to stick in memory for longer and therefore have more of a chance of
getting corrupted? (I know that a hard reboot will almost certainly
corrupt these files!)

Is there a way (with jbase 3.4.10/win2003) of forcing these files to
be flushed to disk more regularly?

I seem to recall the ability to configure a file in jbase so that it
forces a physical disk write on update - though this would hamper
performance somewhat!

I will make backups of the files before fixing them and see if devsup
can advise (they tend to be large 1GB+ files but will zip well).

Regards

Simon



-----Original Message-----
From: Jim Idle <[email protected]>
To: [email protected]
Date: Tue, 26 Apr 2011 13:47:48 -0700
Subject: RE: File Corruption... Causes?

You need to examine hex dumps of the files and see what data is
getting overwritten, and with what other data. The other thing you
should do is download the latest Memcheck CD image and run a complete
memory check on the system. Most PCs do not use ECC memory, and other
than crashes or strange things happening, you do not realize that there
is a memory issue. However, you may find that this is something like
opening the file and editing it with Notepad, or something silly like
that.
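A minimal sketch of such a hex view in Python (not a jBASE tool; the sample record and the 0xFE field-mark byte are just illustrative):

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Classic offset / hex / ASCII view - the format you want when
    diffing a corrupted file against a known-good backup copy."""
    rows = []
    for off in range(0, len(data), width):
        chunk = data[off:off + width]
        hexes = " ".join(f"{b:02x}" for b in chunk)
        text = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        rows.append(f"{off:08x}  {hexes:<{width * 3}} {text}")
    return "\n".join(rows)

# 0xFE stands in for a field mark, as found in MultiValue record data.
print(hexdump(b"CUSTOMER\xfe15848.1\xfeSome data here"))
```

Comparing dumps of the corrupt file and a backup makes overwrite patterns (repeated blocks, shifted regions, foreign data) stand out visually.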
Anyway, you might need jBASE to help you with that, but make copies of
the corrupt files before ‘fixing’ them; then you can look for patterns
in the corruption. The fact that they are high-activity files just
means that they are the most likely to exhibit the problem. You should
do the memcheck overnight as soon as possible, though. Just get the
customer to stick the CD in and reboot.

Jim

*From:* [email protected] [mailto:[email protected]] *On
Behalf Of *Simon Verona
*Sent:* Tuesday, April 26, 2011 1:13 PM
*To:* [email protected]
*Subject:* RE: File Corruption... Causes?

Jim

I mean physically corrupt...

If you do a COUNT [Filename] you crash out with a "Readnext error
2007, file is corrupt" message (or similar).

JCHECK with no options confirms the corruption (I double check by
running it multiple times).

To correct, I have to run JCHECK -MS [Filename] with all users logged
out.
Typically, the files that this happens to are high-activity files
with lots of smallish records in them. I suspected that size might
be the issue, so I converted one of the customers to a multipart
file, but within a month one of the parts had corrupted.

The file is normally discovered as corrupt when reading a record
(either atomically, or when running a report).

The problem is that it's not a completely random event - whilst I
can't predict when and where it's going to happen, I notice that some
systems are more prone to the error and that certain files are more
likely than others to have the problem.

I've kind of eliminated multi-user writing as a cause - one of
the files is only written to by a single program, which sets an
execution lock to ensure that only one process can update the file at
a time. It is, ironically, this file that statistically corrupts the
most often.
I'm sorry if I'm a little vague about the issue, but I don't really
have a grip on what is going on. I don't know *when* the files are
corrupting - only that they *are* corrupt.

Regards

Simon

-----Original Message-----
From: Jim Idle <[email protected]>
To: [email protected]
Date: Tue, 26 Apr 2011 12:50:26 -0700
Subject: RE: File Corruption... Causes?

Do you mean logically corrupt (your records are wrong) or physically
corrupt (you have to use jcheck)? You cannot physically corrupt a file
by writing to it without taking a lock; you will just get trash
results in your file. When are you discovering the data is corrupt?
There are lots of things that you can do to actually corrupt it, and
some things (such as running jcheck while people are writing to the
file) that might make you think it is corrupt.
Jim

*From:* [email protected] [mailto:[email protected]] *On
Behalf Of *Simon Verona
*Sent:* Tuesday, April 26, 2011 12:39 PM
*To:* [email protected]
*Subject:* File Corruption... Causes?

This issue is generic, and relates to a number of similar jBASE
3.4.10-based systems running Windows Server 2003.

We have an ongoing issue with file corruptions in j4 format files.

The problem appears somehow to be application driven - I suspect this
because across a number of systems, the files that corrupt are always
the same ones...

So, I'm looking for inspiration at an application level as to what
could cause file corruptions.

One thought I had was a WRITE without previously doing a READU. I've
not managed to duplicate the issue doing this, but it's difficult to
simulate a multi-user test that replicates what the application might
be doing.
Does anybody know if this *could* be the cause, or know of some other
application (DataBASIC) issue that could cause a J4 file to be
corrupted?
Thanks in advance

Simon Verona

--
Please read the posting guidelines at:
http://groups.google.com/group/jBASE/web/Posting%20Guidelines

IMPORTANT: Type T24: at the start of the subject line for questions
specific to Globus/T24

To post, send email to [email protected]
To unsubscribe, send email to [email protected]
For more options, visit this group at
http://groups.google.com/group/jBASE?hl=en
