On Mon, Mar 7, 2011 at 10:22 PM, Ben Skeggs <[email protected]> wrote:
> On Mon, 2011-03-07 at 21:51 +0000, Maarten Maathuis wrote:
>> On Sun, Mar 6, 2011 at 2:24 PM, Ben Skeggs <[email protected]> wrote:
>> >
>> > On 07/03/2011, at 0:03, Maarten Maathuis <[email protected]> wrote:
>> >
>> >> On Sun, Mar 6, 2011 at 1:44 PM, Ben Skeggs <[email protected]> wrote:
>> >>> Sorry for the top posting, it's late and typing from my phone in bed lol.
>> >>>
>> >>> Just wanted to see if you had an update? And, this is NV86 I guess?
>> >>>
>> >>> Ben.
>> >>>
>> >>> On 02/03/2011, at 8:20, Maarten Maathuis <[email protected]> wrote:
>> >>>
>> >>>> On Tue, Mar 1, 2011 at 9:51 PM, Ben Skeggs <[email protected]> wrote:
>> >>>>> On Tue, 2011-03-01 at 21:08 +0000, Maarten Maathuis wrote:
>> >>>>>
>> >>>>>> Those come after 15-30 minutes of running warzone2100. I haven't
>> >>>>>> played any games for a while, so no idea how long this has been going
>> >>>>>> on.
>> >>>>>> I also got a TRAP_CCACHE on channel 2 a little while ago; it takes
>> >>>>>> much longer to trigger (a few hours). I'm using today's "nouveau
>> >>>>>> kernel" git.
>> >>>>> You're not the first person to have reported this fwiw, personally, I
>> >>>>> haven't seen it yet..
>> >>>>>
>> >>>>>>
>> >>>>>> I'm guessing something is being unmapped too early or without reason,
>> >>>>>> or some cache is stale. But it isn't obvious what exactly it is.
>> >>>>>>
>> >>>>>> Because I don't remember having these lockups before, I'm inclined to
>> >>>>>> guess that this commit is involved:
>> >>>>>> http://cgit.freedesktop.org/nouveau/linux-2.6/commit/?id=6330d8f5ecc4a19fd2ad3c7fa128b2f4c2ce3360
>> >>>>>>
>> >>>>>> Any ideas?
>> >>>>> Not really.  If this commit *is* the cause, the problem is still
>> >>>>> somewhere else.  That commit just makes sure PTEs are marked invalid,
>> >>>>> so if it's causing your faults, then previously the GPU would still have
>> >>>>> been reading/writing invalid data.
>> >>>>>
>> >>>>> Plus, I expect you should probably have seen a VM fault..
>> >>>>
>> >>>> So these faults are just generic errors? Unrelated to page faults?
>> >>>>
>> >>>>>
>> >>>>> Ben.
>> >>>>>>
>> >>>>>> Maarten.
>> >>>>>>
>> >>>> --
>> >>>> Far away from the primal instinct, the song seems to fade away, the
>> >>>> river get wider between your thoughts and the things we do and say.
>> >>>> _______________________________________________
>> >>>> Nouveau mailing list
>> >>>> [email protected]
>> >>>> http://lists.freedesktop.org/mailman/listinfo/nouveau
>> >>>
>> >>
>> >> No, this is NV96. The revert definitely helps, but no luck so far in
>> >> finding a plausible cause for the problem.
>> > Hey,
>> >
>> > Ok. Hmm. I thought you had NV86 for some reason! It's a long shot and I'm
>> > not entirely convinced it'll help at all, but can you switch the
>> > graph.tlb_flush pointer to the nv86 version and see if anything changes?
>>
>> I used to have an NV86, but it died more than a year ago in the typical
>> way for that generation of card, due to thermal issues I guess (it was
>> a passively cooled card). I haven't tried using the nv86 tlb flush.
>> Out of curiosity, is this something nvidia does (a lot) on nv86?
> Yes, NVIDIA do it on pretty much every card I've looked at traces for;
> we've never seen any need for it on other chipsets as of yet, however.
> Originally, it looked like NVIDIA did this on all pre-NVA3 cards, but a
> trace of my T510 with recent drivers shows that they do it on NVA3+ now
> too.
>
>>
>> >
>> > The *other* possible thing is that the ttm delayed delete queue is causing 
>> > multiple tlb flushes to happen at the same time.  I'll add locking for 
>> > that in the morning, that was a complete oversight.
>>
>> I've had no lockups since you added the spinlocks, so maybe that was
>> it. Time will tell.
> *crosses fingers*
>
> Ben.
>>
>> >
>> > Ben.
>> >

It went alright for quite some time (much longer than before), but I
got another lockup. I should note this happened at the exact moment X
rendered something over my fullscreen OpenGL app, so it does smell a
bit fishy. I'll have a look myself at possible causes again.

Mar  8 23:30:58 madman kernel: [25325.644794] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP_CCACHE FAULT
Mar  8 23:30:58 madman kernel: [25325.644815] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP_CCACHE 00000080 00000000 00000000 00000000 00000000 00000004 00000000
Mar  8 23:30:58 madman kernel: [25325.644829] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP_MP - TP1: Unhandled ustatus 0x00020000
Mar  8 23:30:58 madman kernel: [25325.644836] [drm] nouveau 0000:01:00.0: PGRAPH - TRAP
Mar  8 23:30:58 madman kernel: [25325.644848] [drm] nouveau 0000:01:00.0: PGRAPH - ch 2 (0x0000840000) subc 5 class 0x8297 mthd 0x0f04 data 0x00000000
Mar  8 23:30:58 madman kernel: [25325.644865] [drm] nouveau 0000:01:00.0: VM: trapped read at 0x002000f000 on ch 2 [0x00000840] PFIFO/PFIFO_READ/SEMAPHORE reason: DMAOBJ_LIMIT
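
In case anyone wants to compare the VM trap line above across several
logs, here is a rough sketch of how its fields could be pulled out. The
regex is only inferred from the single line in this mail, so treat the
pattern, the "write" alternative and all field names (access, addr,
channel, tag, unit, reason) as assumptions and not as the driver's
actual format.

```python
import re

# Pattern guessed from the one "VM: trapped ..." line in this mail.
# The group names are labels invented for this sketch, and "write" is
# an assumed counterpart to "read".
VM_TRAP_RE = re.compile(
    r"VM: trapped (?P<access>read|write) at 0x(?P<addr>[0-9a-fA-F]+) "
    r"on ch (?P<channel>\d+) \[0x(?P<tag>[0-9a-fA-F]+)\] "
    r"(?P<unit>\S+) reason: (?P<reason>\S+)"
)


def parse_vm_trap(text):
    """Return the fields of the first VM trap line in text, or None."""
    # Collapse all whitespace so entries wrapped by a mail client still match.
    flat = " ".join(text.split())
    match = VM_TRAP_RE.search(flat)
    if match is None:
        return None
    return {
        "access": match.group("access"),
        "addr": int(match.group("addr"), 16),
        "channel": int(match.group("channel")),
        "tag": int(match.group("tag"), 16),  # bracketed value, meaning not assumed
        "unit": match.group("unit"),
        "reason": match.group("reason"),
    }


# The trap line from this mail, wrapped the way it arrived.
SAMPLE = """Mar  8 23:30:58 madman kernel: [25325.644865] [drm] nouveau
0000:01:00.0: VM: trapped read at 0x002000f000 on ch 2 [0x00000840]
PFIFO/PFIFO_READ/SEMAPHORE reason: DMAOBJ_LIMIT"""

if __name__ == "__main__":
    print(parse_vm_trap(SAMPLE))
```

Collapsing whitespace first means the same function works on the
wrapped text from a mail and on the raw lines from the kernel log.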

-- 
Far away from the primal instinct, the song seems to fade away, the
river get wider between your thoughts and the things we do and say.