On 2022-05-16 11:08, Christian König wrote:
Am 16.05.22 um 16:12 schrieb Andrey Grodzovsky:
Ping
Ah, yes sorry.
Andrey
On 2022-05-13 11:41, Andrey Grodzovsky wrote:
Yes, exactly that's the idea.
Basically the reset domain knows which amdgpu devices it needs to
reset together.
If
Am 16.05.22 um 16:12 schrieb Andrey Grodzovsky:
Ping
Ah, yes sorry.
Andrey
On 2022-05-13 11:41, Andrey Grodzovsky wrote:
Yes, exactly that's the idea.
Basically the reset domain knows which amdgpu devices it needs to
reset together.
If you then represent that so that you always have
Ping
Andrey
On 2022-05-13 11:41, Andrey Grodzovsky wrote:
Yes, exactly that's the idea.
Basically the reset domain knows which amdgpu devices it needs to
reset together.
If you then represent that so that you always have a hive even when
you only have one device in it, or if you put an
On 2022-05-12 09:15, Christian König wrote:
Am 12.05.22 um 15:07 schrieb Andrey Grodzovsky:
On 2022-05-12 02:06, Christian König wrote:
Am 11.05.22 um 22:27 schrieb Andrey Grodzovsky:
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On
Sure, I will investigate that. What about the ticket which Lijo raised,
which was basically doing 8 resets instead of one? Lijo, can this
ticket wait until I come up with this new design for the amdgpu reset
function, or do you need a quick solution now, in which case we can use the
already existing
Am 12.05.22 um 15:07 schrieb Andrey Grodzovsky:
On 2022-05-12 02:06, Christian König wrote:
Am 11.05.22 um 22:27 schrieb Andrey Grodzovsky:
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022
On 2022-05-12 02:06, Christian König wrote:
Am 11.05.22 um 22:27 schrieb Andrey Grodzovsky:
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um
On 2022-05-12 02:03, Christian König wrote:
Am 11.05.22 um 17:57 schrieb Andrey Grodzovsky:
[SNIP]
How about we do it like this then:
struct amdgpu_reset_domain {
union {
struct {
struct work_item debugfs;
struct work_item ras;
On 5/12/2022 11:36 AM, Christian König wrote:
Am 11.05.22 um 22:27 schrieb Andrey Grodzovsky:
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um
Am 11.05.22 um 22:27 schrieb Andrey Grodzovsky:
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11
Am 11.05.22 um 17:57 schrieb Andrey Grodzovsky:
[SNIP]
How about we do it like this then:
struct amdgpu_reset_domain {
union {
struct {
struct work_item debugfs;
struct work_item ras;
};
struct work_item
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um
On 2022-05-11 11:39, Christian König wrote:
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um
On 2022-05-11 11:46, Lazar, Lijo wrote:
On 5/11/2022 9:13 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:37, Lazar, Lijo wrote:
On 5/11/2022 9:05 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43
On 5/11/2022 9:13 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:37, Lazar, Lijo wrote:
On 5/11/2022 9:05 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11
On 2022-05-11 11:37, Lazar, Lijo wrote:
On 5/11/2022 9:05 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53
Am 11.05.22 um 17:35 schrieb Andrey Grodzovsky:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
[SNIP]
E.g. in
On 5/11/2022 9:05 PM, Andrey Grodzovsky wrote:
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
[SNIP]
E.g.
On 2022-05-11 11:20, Lazar, Lijo wrote:
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
[SNIP]
E.g. in the reset code (either before or after the
On 5/11/2022 7:28 PM, Christian König wrote:
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
[SNIP]
E.g. in the reset code (either before or after the reset, that's
debatable) you do something like
Am 11.05.22 um 15:43 schrieb Andrey Grodzovsky:
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
[SNIP]
E.g. in the reset code (either before or after the reset, that's
debatable) you do something like this:
for (i = 0; i < num_ring; ++i)
On 2022-05-11 03:38, Christian König wrote:
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
On 2022-05-10 13:19, Christian König wrote:
Am 10.05.22 um 19:01 schrieb Andrey Grodzovsky:
On 2022-05-10 12:17, Christian König wrote:
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
Am 10.05.22 um 20:53 schrieb Andrey Grodzovsky:
On 2022-05-10 13:19, Christian König wrote:
Am 10.05.22 um 19:01 schrieb Andrey Grodzovsky:
On 2022-05-10 12:17, Christian König wrote:
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
That's one of the reasons why we should have
On 2022-05-10 13:19, Christian König wrote:
Am 10.05.22 um 19:01 schrieb Andrey Grodzovsky:
On 2022-05-10 12:17, Christian König wrote:
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
That's one of the reasons why we should have multiple work items
for job based reset and other
Am 10.05.22 um 19:01 schrieb Andrey Grodzovsky:
On 2022-05-10 12:17, Christian König wrote:
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
That's one of the reasons why we should have multiple work items
for job based reset and other reset sources.
See the whole idea is the
On 2022-05-10 12:17, Christian König wrote:
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
That's one of the reasons why we should have multiple work items for
job based reset and other reset sources.
See the whole idea is the following:
1. We have one single queued work queue for
Am 10.05.22 um 18:00 schrieb Andrey Grodzovsky:
[SNIP]
That's one of the reasons why we should have multiple work items for
job based reset and other reset sources.
See the whole idea is the following:
1. We have one single queued work queue for each reset domain which
makes sure that all
On 2022-05-06 04:56, Christian König wrote:
Am 06.05.22 um 08:02 schrieb Lazar, Lijo:
On 5/6/2022 3:17 AM, Andrey Grodzovsky wrote:
On 2022-05-05 15:49, Felix Kuehling wrote:
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um
Am 06.05.22 um 08:02 schrieb Lazar, Lijo:
On 5/6/2022 3:17 AM, Andrey Grodzovsky wrote:
On 2022-05-05 15:49, Felix Kuehling wrote:
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05
On 5/6/2022 3:17 AM, Andrey Grodzovsky wrote:
On 2022-05-05 15:49, Felix Kuehling wrote:
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23, Christian König wrote:
Am
On 2022-05-05 17:47, Andrey Grodzovsky wrote:
On 2022-05-05 15:49, Felix Kuehling wrote:
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23,
On 2022-05-05 15:49, Felix Kuehling wrote:
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23, Christian König wrote:
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On
Am 2022-05-05 um 14:57 schrieb Andrey Grodzovsky:
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23, Christian König wrote:
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On 2022-05-05 06:09, Christian König wrote:
Am
On 2022-05-05 11:06, Christian König wrote:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23, Christian König wrote:
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On 2022-05-05 06:09, Christian König wrote:
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
Am 05.05.22 um 15:54 schrieb Andrey Grodzovsky:
On 2022-05-05 09:23, Christian König wrote:
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On 2022-05-05 06:09, Christian König wrote:
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
During hive reset caused by command timing out on
On 2022-05-05 09:23, Christian König wrote:
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On 2022-05-05 06:09, Christian König wrote:
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
During hive reset caused by command timing out on a ring
extra resets are triggered
Am 05.05.22 um 15:15 schrieb Andrey Grodzovsky:
On 2022-05-05 06:09, Christian König wrote:
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
During hive reset caused by command timing out on a ring
extra resets are triggered by KFD, which is
unable to access registers on
On 2022-05-05 06:09, Christian König wrote:
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
During hive reset caused by command timing out on a ring
extra resets are triggered by KFD, which is
unable to access registers on the resetting ASIC.
Fix: Rework GPU reset to
Am 04.05.22 um 18:18 schrieb Andrey Grodzovsky:
Problem:
During hive reset caused by command timing out on a ring
extra resets are triggered by KFD, which is
unable to access registers on the resetting ASIC.
Fix: Rework GPU reset to use a list of pending reset jobs
such that the
Problem:
During hive reset caused by command timing out on a ring
extra resets are triggered by KFD, which is
unable to access registers on the resetting ASIC.
Fix: Rework GPU reset to use a list of pending reset jobs
such that the first reset job that actually resets the entire
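
The fix description above is cut off in this listing, so the following is only a rough userspace sketch of the general idea, not the patch itself. It assumes that the first pending reset request performs the reset and that every other request pending at that moment is dropped as redundant. All names here (reset_domain, reset_source, reset_domain_queue, reset_domain_process) are hypothetical and are not amdgpu identifiers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical reset sources; illustrative names, not amdgpu identifiers. */
enum reset_source {
    RESET_SRC_JOB_TIMEOUT,
    RESET_SRC_KFD,
    RESET_SRC_RAS,
    RESET_SRC_COUNT
};

/* One pending-reset slot per source, all owned by one reset domain. */
struct reset_domain {
    bool pending[RESET_SRC_COUNT];
    int hw_resets; /* how many full resets were actually performed */
};

void reset_domain_init(struct reset_domain *d)
{
    assert(d != NULL);
    for (int i = 0; i < RESET_SRC_COUNT; ++i)
        d->pending[i] = false;
    d->hw_resets = 0;
}

/* Queue a reset request. Returns false if this source is already pending. */
bool reset_domain_queue(struct reset_domain *d, enum reset_source src)
{
    assert(d != NULL && src < RESET_SRC_COUNT);
    if (d->pending[src])
        return false;
    d->pending[src] = true;
    return true;
}

/*
 * Walk the slots once. ASSUMPTION: the first pending request performs the
 * reset and every other request pending at that moment is dropped as
 * redundant. Returns the number of requests dropped.
 */
int reset_domain_process(struct reset_domain *d)
{
    int dropped = 0;
    bool reset_done = false;

    assert(d != NULL);
    for (int i = 0; i < RESET_SRC_COUNT; ++i) {
        if (!d->pending[i])
            continue;
        d->pending[i] = false;
        if (!reset_done) {
            d->hw_resets++; /* stand-in for the real hardware reset */
            reset_done = true;
        } else {
            dropped++;
        }
    }
    return dropped;
}
```

With three sources queued at once, a single call to reset_domain_process() performs one reset and drops the other two requests. In the kernel this would presumably run from the reset domain's single work queue rather than from a direct call, and the locking needed there is left out of this sketch.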