Re: [GENERAL] Corrupted Data ?

Ioana Danes Fri, 12 Aug 2016 08:31:25 -0700

On Fri, Aug 12, 2016 at 11:26 AM, Adrian Klaver <[email protected]>
wrote:


> On 08/12/2016 08:10 AM, Ioana Danes wrote:
>
>>
>>
>> On Fri, Aug 12, 2016 at 10:47 AM, Francisco Olarte
>> <[email protected] <mailto:[email protected]>> wrote:
>>
>>     CCing to the list...
>>
>> Thanks
>>
>>
>>     On Fri, Aug 12, 2016 at 4:10 PM, Ioana Danes <[email protected]
>>     <mailto:[email protected]>> wrote:
>>     >> given 318220 and 318216 are just a bit away ( 4db08/4db0c ), and it
>>     >> repeats sporadically, have you ruled out ( by having page
>>     checksums or
>>     >> other mechanism ) a potential disk read/write error ?
>>     >>
>>     >>
>>     >> > Also the index is correct on db3 as the record in case (with
>>     drawid =
>>     >> > 318216) is retrieved if I filter by drawid = 318220
>>     >>
>>     >> Especially if this happens, you may have some slightly bad
>>     >> disks/RAM leading to this kind of problem.
>>     >>
>>     >
>>     > Could be. I also had some issues with an rsync between db3 and
>>     drdb a week
>>     > ago that did not complete for bigger files (> 200MB) and gave me
>> some
>>     > corruption messages. Then the system was rebooted and everything
>>     seemed
>>     > fine but apparently it is not.
>>     > I am planning to drop & create the table from a good backup and if
>>     that does
>>     > not fix the issue then I will rebuild the server.
>>
>>     I would check whatever logs you can ( syslog or eventlog, smart log,
>>     etc.. ) hunting for disk errors ( sometimes they are reported ). This
>>     kind of problems, with programs as tested as postgres and rsync, tend
>>     to indicate controller/RAM/disk going bad ( in your case it could be
>>     caused by a single bit getting flipped in a sector for the data
>>     portion of the table, and not being propagated either because it
>>     happened after your sync of drdb or because it was synced from the WAL
>>     and not the table, or because it was read from the disk cache ).
>>
>> I agree, unfortunately I did not find any clues about corruption or any
>> anomalies in the logs.
>> I will work tonight to rebuild that table and see where I go from there.
>>
>
> The db3 database is on a different machine from all the other databases
> you set up, correct?
>
Yes, they are all different VMs. The first 3 dbs are on the same cluster, but
drdb is a remote machine.

Thank you


>
>> Thanks,
>> ioana
>>
>>     Francisco Olarte.
>>
>>
>>
>
> --
> Adrian Klaver
> [email protected]
>
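
Editor's note: the single-bit-flip hypothesis raised in the quoted text can be checked numerically. The sketch below is only an illustration, not something posted in the thread; the helper name `flipped_bits` is hypothetical. It XORs the two drawid values mentioned above (318220 and 318216) and lists the bit positions where they differ.

```python
def flipped_bits(a, b):
    """Return the bit positions (0 = least significant) where a and b differ."""
    diff = a ^ b  # XOR leaves a 1 wherever the two values disagree
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


# The two drawid values from the thread.
value_a = 318220
value_b = 318216

print(hex(value_a), hex(value_b))      # 0x4db0c 0x4db08
print(flipped_bits(value_a, value_b))  # [2]
```

The values differ in exactly one bit (bit 2), which is consistent with, though not proof of, a single flipped bit in storage or memory.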
