Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Mark Lohry
You have my condolences if you have to support all those things
simultaneously.

On Wed, Jan 4, 2023, 7:27 PM Matthew Knepley  wrote:

> On Wed, Jan 4, 2023 at 7:22 PM Junchao Zhang 
> wrote:
>
>> We don't have a machine on which we can test with both "--with-cuda --with-hip".
>>
>
> Yes, but your answer suggested that the structure of the code prevented
> this combination.
>
>   Thanks,
>
>  Matt
>
>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 4, 2023 at 6:17 PM Matthew Knepley  wrote:
>>
>>> On Wed, Jan 4, 2023 at 7:09 PM Junchao Zhang 
>>> wrote:
>>>
 On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley 
 wrote:

> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
> wrote:
>
>>
>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:
>>
>>> Oh, is the device backend not known at compile time?
>>>
>> Currently it is known at compile time.
>>
>
> Are you sure? I don't think it is known at compile time.
>
 We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP, or neither, but never both.

>>>
>>> Where is the logic for that in the code? This seems like a crazy design.
>>>
>>>   Thanks,
>>>
>>> Matt
>>>
>>>
   Thanks,
>
>  Matt
>
>
>> Or multiple backends can be alive at once?
>>>
>>
>> Some petsc developers (Jed and Barry) want to support this, but we
>> are not able to do that yet.
>>
>>
>>>
>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
>>> wrote:
>>>


 On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:

> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
>> we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE 
>> on AMD
>> GPUs, ...
>
>
> Wouldn't one function suffice? Assuming these are contiguous
> arrays in CSR format, they're just raw device pointers in all cases.
>
 But we need to know what device it is (to dispatch to either
 petsc-CUDA or petsc-HIP backend)


>
> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <
> junchao.zh...@gmail.com> wrote:
>
>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays()
>> for GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), 
>> but
>> then we would need another for MATMPIAIJCUSPARSE, and then for 
>> HIPSPARSE on
>> AMD GPUs, ...
>>
>> The real problem I think is to deal with multiple MPI ranks.
>> Providing the split arrays for petsc MATMPIAIJ is not easy, so users are
>> discouraged from doing so.
>>
>> A workaround is to let petsc build the matrix and allocate the
>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array 
>> and fill
>> it up.
>>
>> We recently added routines to support matrix assembly on GPUs,
>> see if MatSetValuesCOO helps.
>>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry 
>> wrote:
>>
>>> I have a sparse matrix constructed in non-petsc code using a
>>> standard CSR representation where I compute the Jacobian to be used 
>>> in an
>>> implicit TS context. In the CPU world I call
>>>
>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
>>> rowidxptr, colidxptr, valptr, Jac);
>>>
>>> which as I understand it -- (1) never copies/allocates that
>>> information, and the matrix Jac is just a non-owning view into the 
>>> already
>>> allocated CSR, (2) I can write directly into the original data 
>>> structures
>>> and the Mat just "knows" about it, although it still needs a call to
>>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far 
>>> this
>>> works great with GAMG.
>>>
>>> I have the same CSR representation filled in GPU data allocated
>>> with cudaMalloc and filled on-device. Is there an equivalent Mat
>>> constructor for GPU arrays, or some other way to avoid unnecessary 
>>> copies?
>>>
>>> Thanks,
>>> Mark
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>

>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> 

Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Matthew Knepley
On Wed, Jan 4, 2023 at 7:22 PM Junchao Zhang 
wrote:

> We don't have a machine on which we can test with both "--with-cuda --with-hip".
>

Yes, but your answer suggested that the structure of the code prevented
this combination.

  Thanks,

 Matt


> --Junchao Zhang
>
>
> On Wed, Jan 4, 2023 at 6:17 PM Matthew Knepley  wrote:
>
>> On Wed, Jan 4, 2023 at 7:09 PM Junchao Zhang 
>> wrote:
>>
>>> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley 
>>> wrote:
>>>
 On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
 wrote:

>
> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:
>
>> Oh, is the device backend not known at compile time?
>>
> Currently it is known at compile time.
>

 Are you sure? I don't think it is known at compile time.

>>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP, or neither, but never both.
>>>
>>
>> Where is the logic for that in the code? This seems like a crazy design.
>>
>>   Thanks,
>>
>> Matt
>>
>>
>>>   Thanks,

  Matt


> Or multiple backends can be alive at once?
>>
>
> Some petsc developers (Jed and Barry) want to support this, but we are
> not able to do that yet.
>
>
>>
>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
>> wrote:
>>
>>>
>>>
>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:
>>>
 Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
> we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE 
> on AMD
> GPUs, ...


 Wouldn't one function suffice? Assuming these are contiguous arrays
 in CSR format, they're just raw device pointers in all cases.

>>> But we need to know what device it is (to dispatch to either
>>> petsc-CUDA or petsc-HIP backend)
>>>
>>>

 On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <
 junchao.zh...@gmail.com> wrote:

> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but 
> then we
> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on 
> AMD
> GPUs, ...
>
> The real problem I think is to deal with multiple MPI ranks.
> Providing the split arrays for petsc MATMPIAIJ is not easy, so users are
> discouraged from doing so.
>
> A workaround is to let petsc build the matrix and allocate the
> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array 
> and fill
> it up.
>
> We recently added routines to support matrix assembly on GPUs, see
> if MatSetValuesCOO helps.
>
> --Junchao Zhang
>
>
> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry 
> wrote:
>
>> I have a sparse matrix constructed in non-petsc code using a
>> standard CSR representation where I compute the Jacobian to be used 
>> in an
>> implicit TS context. In the CPU world I call
>>
>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
>> rowidxptr, colidxptr, valptr, Jac);
>>
>> which as I understand it -- (1) never copies/allocates that
>> information, and the matrix Jac is just a non-owning view into the 
>> already
>> allocated CSR, (2) I can write directly into the original data 
>> structures
>> and the Mat just "knows" about it, although it still needs a call to
>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far 
>> this
>> works great with GAMG.
>>
>> I have the same CSR representation filled in GPU data allocated
>> with cudaMalloc and filled on-device. Is there an equivalent Mat
>> constructor for GPU arrays, or some other way to avoid unnecessary 
>> copies?
>>
>> Thanks,
>> Mark
>>
>

 --
 What most experimenters take for granted before they begin their
 experiments is infinitely more interesting than any results to which their
 experiments lead.
 -- Norbert Wiener

 https://www.cse.buffalo.edu/~knepley/
 

>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 

Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Junchao Zhang
We don't have a machine on which we can test with both "--with-cuda --with-hip".

--Junchao Zhang


On Wed, Jan 4, 2023 at 6:17 PM Matthew Knepley  wrote:

> On Wed, Jan 4, 2023 at 7:09 PM Junchao Zhang 
> wrote:
>
>> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley  wrote:
>>
>>> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
>>> wrote:
>>>

 On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:

> Oh, is the device backend not known at compile time?
>
 Currently it is known at compile time.

>>>
>>> Are you sure? I don't think it is known at compile time.
>>>
>> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP, or neither, but never both.
>>
>
> Where is the logic for that in the code? This seems like a crazy design.
>
>   Thanks,
>
> Matt
>
>
>>   Thanks,
>>>
>>>  Matt
>>>
>>>
 Or multiple backends can be alive at once?
>

 Some petsc developers (Jed and Barry) want to support this, but we are
 not able to do that yet.


>
> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
> wrote:
>
>>
>>
>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:
>>
>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then
 we would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on 
 AMD
 GPUs, ...
>>>
>>>
>>> Wouldn't one function suffice? Assuming these are contiguous arrays
>>> in CSR format, they're just raw device pointers in all cases.
>>>
>> But we need to know what device it is (to dispatch to either
>> petsc-CUDA or petsc-HIP backend)
>>
>>
>>>
>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang <
>>> junchao.zh...@gmail.com> wrote:
>>>
 No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
 GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but 
 then we
 would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
 GPUs, ...

 The real problem I think is to deal with multiple MPI ranks.
 Providing the split arrays for petsc MATMPIAIJ is not easy, so users are
 discouraged from doing so.

 A workaround is to let petsc build the matrix and allocate the
 memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array and 
 fill
 it up.

 We recently added routines to support matrix assembly on GPUs, see
 if MatSetValuesCOO helps.

 --Junchao Zhang


 On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:

> I have a sparse matrix constructed in non-petsc code using a
> standard CSR representation where I compute the Jacobian to be used 
> in an
> implicit TS context. In the CPU world I call
>
> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
> rowidxptr, colidxptr, valptr, Jac);
>
> which as I understand it -- (1) never copies/allocates that
> information, and the matrix Jac is just a non-owning view into the 
> already
> allocated CSR, (2) I can write directly into the original data 
> structures
> and the Mat just "knows" about it, although it still needs a call to
> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far 
> this
> works great with GAMG.
>
> I have the same CSR representation filled in GPU data allocated
> with cudaMalloc and filled on-device. Is there an equivalent Mat
> constructor for GPU arrays, or some other way to avoid unnecessary 
> copies?
>
> Thanks,
> Mark
>

>>>
>>> --
>>> What most experimenters take for granted before they begin their
>>> experiments is infinitely more interesting than any results to which their
>>> experiments lead.
>>> -- Norbert Wiener
>>>
>>> https://www.cse.buffalo.edu/~knepley/
>>> 
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Matthew Knepley
On Wed, Jan 4, 2023 at 7:09 PM Junchao Zhang 
wrote:

> On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley  wrote:
>
>> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
>> wrote:
>>
>>>
>>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:
>>>
 Oh, is the device backend not known at compile time?

>>> Currently it is known at compile time.
>>>
>>
>> Are you sure? I don't think it is known at compile time.
>>
> We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP, or neither, but never both.
>

Where is the logic for that in the code? This seems like a crazy design.

  Thanks,

Matt


>   Thanks,
>>
>>  Matt
>>
>>
>>> Or multiple backends can be alive at once?

>>>
>>> Some petsc developers (Jed and Barry) want to support this, but we are
>>> not able to do that yet.
>>>
>>>

 On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
 wrote:

>
>
> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:
>
>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>>> GPUs, ...
>>
>>
>> Wouldn't one function suffice? Assuming these are contiguous arrays
>> in CSR format, they're just raw device pointers in all cases.
>>
> But we need to know what device it is (to dispatch to either
> petsc-CUDA or petsc-HIP backend)
>
>
>>
>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
>> wrote:
>>
>>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>>> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but 
>>> then we
>>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>>> GPUs, ...
>>>
>>> The real problem I think is to deal with multiple MPI ranks.
>>> Providing the split arrays for petsc MATMPIAIJ is not easy, so users are
>>> discouraged from doing so.
>>>
>>> A workaround is to let petsc build the matrix and allocate the
>>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array and 
>>> fill
>>> it up.
>>>
>>> We recently added routines to support matrix assembly on GPUs, see if
>>> MatSetValuesCOO helps.
>>>
>>> --Junchao Zhang
>>>
>>>
>>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:
>>>
 I have a sparse matrix constructed in non-petsc code using a
 standard CSR representation where I compute the Jacobian to be used in 
 an
 implicit TS context. In the CPU world I call

 MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols,
 rowidxptr, colidxptr, valptr, Jac);

 which as I understand it -- (1) never copies/allocates that
 information, and the matrix Jac is just a non-owning view into the 
 already
 allocated CSR, (2) I can write directly into the original data 
 structures
 and the Mat just "knows" about it, although it still needs a call to
 MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
 works great with GAMG.

 I have the same CSR representation filled in GPU data allocated
 with cudaMalloc and filled on-device. Is there an equivalent Mat
 constructor for GPU arrays, or some other way to avoid unnecessary 
 copies?

 Thanks,
 Mark

>>>
>>
>> --
>> What most experimenters take for granted before they begin their
>> experiments is infinitely more interesting than any results to which their
>> experiments lead.
>> -- Norbert Wiener
>>
>> https://www.cse.buffalo.edu/~knepley/
>> 
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Junchao Zhang
On Wed, Jan 4, 2023 at 6:02 PM Matthew Knepley  wrote:

> On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
> wrote:
>
>>
>> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:
>>
>>> Oh, is the device backend not known at compile time?
>>>
>> Currently it is known at compile time.
>>
>
> Are you sure? I don't think it is known at compile time.
>
We define either PETSC_HAVE_CUDA or PETSC_HAVE_HIP, or neither, but never both.
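
For illustration only (not from this thread, and not PETSc source): a minimal
sketch of what the compile-time choice looks like from user code, using the
PETSC_HAVE_CUDA / PETSC_HAVE_HIP macros mentioned above. Everything else in the
snippet is made up for the example.

#include <petscsys.h>

/* Report which GPU backend this PETSc build was configured with. */
static const char *BackendName(void)
{
#if defined(PETSC_HAVE_CUDA)
  return "CUDA";
#elif defined(PETSC_HAVE_HIP)
  return "HIP";
#else
  return "none";
#endif
}

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "GPU backend selected at compile time: %s\n", BackendName()));
  PetscCall(PetscFinalize());
  return 0;
}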


>
>   Thanks,
>
>  Matt
>
>
>> Or multiple backends can be alive at once?
>>>
>>
>> Some petsc developers (Jed and Barry) want to support this, but we are
>> not able to do that yet.
>>
>>
>>>
>>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
>>> wrote:
>>>


 On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:

> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>> GPUs, ...
>
>
> Wouldn't one function suffice? Assuming these are contiguous arrays in
> CSR format, they're just raw device pointers in all cases.
>
 But we need to know what device it is (to dispatch to either petsc-CUDA
 or petsc-HIP backend)


>
> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
> wrote:
>
>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
>> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then 
>> we
>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>> GPUs, ...
>>
>> The real problem I think is to deal with multiple MPI ranks.
>> Providing the split arrays for petsc MATMPIAIJ is not easy, so users are
>> discouraged from doing so.
>>
>> A workaround is to let petsc build the matrix and allocate the
>> memory, then you call MatSeqAIJCUSPARSEGetArray() to get the array and 
>> fill
>> it up.
>>
>> We recently added routines to support matrix assembly on GPUs, see if
>> MatSetValuesCOO helps.
>>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:
>>
>>> I have a sparse matrix constructed in non-petsc code using a
>>> standard CSR representation where I compute the Jacobian to be used in 
>>> an
>>> implicit TS context. In the CPU world I call
>>>
>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>> colidxptr, valptr, Jac);
>>>
>>> which as I understand it -- (1) never copies/allocates that
>>> information, and the matrix Jac is just a non-owning view into the 
>>> already
>>> allocated CSR, (2) I can write directly into the original data 
>>> structures
>>> and the Mat just "knows" about it, although it still needs a call to
>>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>>> works great with GAMG.
>>>
>>> I have the same CSR representation filled in GPU data allocated with
>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor 
>>> for
>>> GPU arrays, or some other way to avoid unnecessary copies?
>>>
>>> Thanks,
>>> Mark
>>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
> 
>


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Matthew Knepley
On Wed, Jan 4, 2023 at 6:49 PM Junchao Zhang 
wrote:

>
> On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:
>
>> Oh, is the device backend not known at compile time?
>>
> Currently it is known at compile time.
>

Are you sure? I don't think it is known at compile time.

  Thanks,

 Matt


> Or multiple backends can be alive at once?
>>
>
> Some petsc developers (Jed and Barry) want to support this, but we are
> not able to do that yet.
>
>
>>
>> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
>> wrote:
>>
>>>
>>>
>>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:
>>>
 Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
> GPUs, ...


 Wouldn't one function suffice? Assuming these are contiguous arrays in
 CSR format, they're just raw device pointers in all cases.

>>> But we need to know what device it is (to dispatch to either petsc-CUDA
>>> or petsc-HIP backend)
>>>
>>>

 On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
 wrote:

> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
> GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then 
> we
> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
> GPUs, ...
>
> The real problem I think is to deal with multiple MPI ranks. Providing
> the split arrays for petsc MATMPIAIJ is not easy, so users are discouraged
> from doing so.
>
> A workaround is to let petsc build the matrix and allocate the memory,
> then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
>
> We recently added routines to support matrix assembly on GPUs, see if
> MatSetValuesCOO helps.
>
> --Junchao Zhang
>
>
> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:
>
>> I have a sparse matrix constructed in non-petsc code using a standard
>> CSR representation where I compute the Jacobian to be used in an implicit
>> TS context. In the CPU world I call
>>
>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>> colidxptr, valptr, Jac);
>>
>> which as I understand it -- (1) never copies/allocates that
>> information, and the matrix Jac is just a non-owning view into the 
>> already
>> allocated CSR, (2) I can write directly into the original data structures
>> and the Mat just "knows" about it, although it still needs a call to
>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>> works great with GAMG.
>>
>> I have the same CSR representation filled in GPU data allocated with
>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor 
>> for
>> GPU arrays, or some other way to avoid unnecessary copies?
>>
>> Thanks,
>> Mark
>>
>

-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Junchao Zhang
On Wed, Jan 4, 2023 at 5:40 PM Mark Lohry  wrote:

> Oh, is the device backend not known at compile time?
>
Currently it is known at compile time.

Or multiple backends can be alive at once?
>

Some petsc developers (Jed and Barry) want to support this, but we are
not able to do that yet.


>
> On Wed, Jan 4, 2023, 6:27 PM Junchao Zhang 
> wrote:
>
>>
>>
>> On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:
>>
>>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
 would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
 GPUs, ...
>>>
>>>
>>> Wouldn't one function suffice? Assuming these are contiguous arrays in
>>> CSR format, they're just raw device pointers in all cases.
>>>
>> But we need to know what device it is (to dispatch to either petsc-CUDA
>> or petsc-HIP backend)
>>
>>
>>>
>>> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
>>> wrote:
>>>
 No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for
 GPUs. Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
 would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
 GPUs, ...

 The real problem I think is to deal with multiple MPI ranks. Providing
 the split arrays for petsc MATMPIAIJ is not easy, so users are discouraged
 from doing so.

 A workaround is to let petsc build the matrix and allocate the memory,
 then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.

 We recently added routines to support matrix assembly on GPUs, see if
 MatSetValuesCOO helps.

 --Junchao Zhang


 On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:

> I have a sparse matrix constructed in non-petsc code using a standard
> CSR representation where I compute the Jacobian to be used in an implicit
> TS context. In the CPU world I call
>
> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
> colidxptr, valptr, Jac);
>
> which as I understand it -- (1) never copies/allocates that
> information, and the matrix Jac is just a non-owning view into the already
> allocated CSR, (2) I can write directly into the original data structures
> and the Mat just "knows" about it, although it still needs a call to
> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
> works great with GAMG.
>
> I have the same CSR representation filled in GPU data allocated with
> cudaMalloc and filled on-device. Is there an equivalent Mat constructor 
> for
> GPU arrays, or some other way to avoid unnecessary copies?
>
> Thanks,
> Mark
>



Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Junchao Zhang
On Wed, Jan 4, 2023 at 5:19 PM Mark Lohry  wrote:

> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
>> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
>> GPUs, ...
>
>
> Wouldn't one function suffice? Assuming these are contiguous arrays in CSR
> format, they're just raw device pointers in all cases.
>
But we need to know what device it is (to dispatch to either petsc-CUDA or
petsc-HIP backend)


>
> On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
> wrote:
>
>> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs.
>> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would
>> need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...
>>
>> The real problem I think is to deal with multiple MPI ranks. Providing
>> the split arrays for petsc MATMPIAIJ is not easy, so users are discouraged
>> from doing so.
>>
>> A workaround is to let petsc build the matrix and allocate the memory,
>> then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
>>
>> We recently added routines to support matrix assembly on GPUs, see if
>> MatSetValuesCOO helps.
>>
>> --Junchao Zhang
>>
>>
>> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:
>>
>>> I have a sparse matrix constructed in non-petsc code using a standard
>>> CSR representation where I compute the Jacobian to be used in an implicit
>>> TS context. In the CPU world I call
>>>
>>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>>> colidxptr, valptr, Jac);
>>>
>>> which as I understand it -- (1) never copies/allocates that information,
>>> and the matrix Jac is just a non-owning view into the already allocated
>>> CSR, (2) I can write directly into the original data structures and the Mat
>>> just "knows" about it, although it still needs a call to
>>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>>> works great with GAMG.
>>>
>>> I have the same CSR representation filled in GPU data allocated with
>>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
>>> GPU arrays, or some other way to avoid unnecessary copies?
>>>
>>> Thanks,
>>> Mark
>>>
>>


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Mark Lohry
>
> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we
> would need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD
> GPUs, ...


Wouldn't one function suffice? Assuming these are contiguous arrays in CSR
format, they're just raw device pointers in all cases.

On Wed, Jan 4, 2023 at 6:02 PM Junchao Zhang 
wrote:

> No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs.
> Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would
> need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...
>
> The real problem I think is to deal with multiple MPI ranks. Providing the
> split arrays for petsc MATMPIAIJ is not easy, so users are discouraged from
> doing so.
>
> A workaround is to let petsc build the matrix and allocate the memory,
> then you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.
>
> We recently added routines to support matrix assembly on GPUs, see if
> MatSetValuesCOO helps.
>
> --Junchao Zhang
>
>
> On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:
>
>> I have a sparse matrix constructed in non-petsc code using a standard CSR
>> representation where I compute the Jacobian to be used in an implicit TS
>> context. In the CPU world I call
>>
>> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
>> colidxptr, valptr, Jac);
>>
>> which as I understand it -- (1) never copies/allocates that information,
>> and the matrix Jac is just a non-owning view into the already allocated
>> CSR, (2) I can write directly into the original data structures and the Mat
>> just "knows" about it, although it still needs a call to
>> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
>> works great with GAMG.
>>
>> I have the same CSR representation filled in GPU data allocated with
>> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
>> GPU arrays, or some other way to avoid unnecessary copies?
>>
>> Thanks,
>> Mark
>>
>


Re: [petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Junchao Zhang
No, we don't have a counterpart of MatCreateSeqAIJWithArrays() for GPUs.
Maybe we could add a MatCreateSeqAIJCUSPARSEWithArrays(), but then we would
need another for MATMPIAIJCUSPARSE, and then for HIPSPARSE on AMD GPUs, ...

The real problem I think is to deal with multiple MPI ranks. Providing the
split arrays for petsc MATMPIAIJ is not easy, so users are discouraged from
doing so.

A workaround is to let petsc build the matrix and allocate the memory, then
you call MatSeqAIJCUSPARSEGetArray() to get the array and fill it up.

We recently added routines to support matrix assembly on GPUs, see if
MatSetValuesCOO helps.
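
As an illustration of that COO path (a sketch by way of example, not code from
this thread): let PETSc own the matrix storage and fill it through
MatSetPreallocationCOO()/MatSetValuesCOO(). The sizes and the coo_i/coo_j/vals
arrays below are placeholder host data; whether the value array may live on the
device depends on the matrix type and PETSc version.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat         A;
  PetscInt    n       = 3;
  PetscInt    coo_i[] = {0, 0, 1, 2};                 /* row indices of the nonzeros    */
  PetscInt    coo_j[] = {0, 1, 1, 2};                 /* column indices of the nonzeros */
  PetscScalar vals[]  = {1.0, 2.0, 3.0, 4.0};         /* values in the same order       */

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetType(A, MATAIJ));                   /* MATAIJCUSPARSE in a CUDA build */
  PetscCall(MatSetPreallocationCOO(A, 4, coo_i, coo_j));
  PetscCall(MatSetValuesCOO(A, vals, INSERT_VALUES)); /* no MatAssemblyBegin/End needed */
  PetscCall(MatView(A, PETSC_VIEWER_STDOUT_WORLD));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}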

--Junchao Zhang


On Wed, Jan 4, 2023 at 2:22 PM Mark Lohry  wrote:

> I have a sparse matrix constructed in non-petsc code using a standard CSR
> representation where I compute the Jacobian to be used in an implicit TS
> context. In the CPU world I call
>
> MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
> colidxptr, valptr, Jac);
>
> which as I understand it -- (1) never copies/allocates that information,
> and the matrix Jac is just a non-owning view into the already allocated
> CSR, (2) I can write directly into the original data structures and the Mat
> just "knows" about it, although it still needs a call to
> MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
> works great with GAMG.
>
> I have the same CSR representation filled in GPU data allocated with
> cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
> GPU arrays, or some other way to avoid unnecessary copies?
>
> Thanks,
> Mark
>


[petsc-users] MatCreateSeqAIJWithArrays for GPU / cusparse

2023-01-04 Thread Mark Lohry
I have a sparse matrix constructed in non-petsc code using a standard CSR
representation where I compute the Jacobian to be used in an implicit TS
context. In the CPU world I call

MatCreateSeqAIJWithArrays(PETSC_COMM_WORLD, nrows, ncols, rowidxptr,
colidxptr, valptr, Jac);

which as I understand it -- (1) never copies/allocates that information,
and the matrix Jac is just a non-owning view into the already allocated
CSR, (2) I can write directly into the original data structures and the Mat
just "knows" about it, although it still needs a call to
MatAssemblyBegin/MatAssemblyEnd after modifying the values. So far this
works great with GAMG.

I have the same CSR representation filled in GPU data allocated with
cudaMalloc and filled on-device. Is there an equivalent Mat constructor for
GPU arrays, or some other way to avoid unnecessary copies?

Thanks,
Mark
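
For reference, a minimal sketch of the CPU-side pattern described above
(illustrative only, not from the original message): a tiny made-up CSR stands in
for the real Jacobian, and note that MatCreateSeqAIJWithArrays() takes a
one-rank communicator and the address of the Mat.

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat         Jac;
  PetscInt    nrows = 2, ncols = 2;
  PetscInt    rowidxptr[] = {0, 2, 3};   /* CSR row offsets (nrows+1 entries) */
  PetscInt    colidxptr[] = {0, 1, 1};   /* CSR column indices                */
  PetscScalar valptr[]    = {1.0, 2.0, 3.0};

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Non-owning view: the Mat uses the caller's CSR arrays directly, no copy. */
  PetscCall(MatCreateSeqAIJWithArrays(PETSC_COMM_SELF, nrows, ncols, rowidxptr, colidxptr, valptr, &Jac));

  valptr[0] = 10.0;                                     /* write straight into the original storage */
  PetscCall(MatAssemblyBegin(Jac, MAT_FINAL_ASSEMBLY)); /* then flag assembly again                  */
  PetscCall(MatAssemblyEnd(Jac, MAT_FINAL_ASSEMBLY));

  PetscCall(MatView(Jac, PETSC_VIEWER_STDOUT_SELF));
  PetscCall(MatDestroy(&Jac));
  PetscCall(PetscFinalize());
  return 0;
}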


Re: [petsc-users] puzzling arkimex logic

2023-01-04 Thread Mark Adams
Thanks, it is working fine.
Mark

On Wed, Jan 4, 2023 at 1:12 PM Jed Brown  wrote:

> This default probably shouldn't be zero, and probably lengthening steps
> should be more gentle after a recent failure. But Mark, please let us know
> if what's there works for you.
>
> "Zhang, Hong via petsc-users"  writes:
>
> > Hi Mark,
> >
> > You might want to try -ts_adapt_time_step_increase_delay to delay
> increasing the time step after it has been decreased due to a failed solve.
> >
> > Hong (Mr.)
> >
> >> On Jan 2, 2023, at 12:17 PM, Mark Adams  wrote:
> >>
> >> I am using arkimex and the logic with a failed KSP solve is puzzling.
> This step starts with a dt of ~.005, the linear solver fails and cuts the
> time step by 1/4. So far, so good. The step then works, but on the next step
> the time step goes to ~0.006.
> >> TS seems to have forgotten that it had to cut the time step back.
> >> Perhaps that logic is missing or my parameters need work?
> >>
> >> Thanks,
> >> Mark
> >>
> >> -ts_adapt_dt_max 0.01 # (source: command line)
> >> -ts_adapt_monitor # (source: file)
> >> -ts_arkimex_type 1bee # (source: file)
> >> -ts_dt .001 # (source: command line)
> >> -ts_max_reject 10 # (source: file)
> >> -ts_max_snes_failures -1 # (source: file)
> >> -ts_max_steps 8000 # (source: command line)
> >> -ts_max_time 14 # (source: command line)
> >> -ts_monitor # (source: file)
> >> -ts_rtol 1e-6 # (source: command line)
> >> -ts_type arkimex # (source: file)
> >>
> >>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> >>   TSAdapt basic arkimex 0:1bee step   1 accepted t=0.001  +
> 2.497e-03 dt=5.404e-03  wlte=0.173  wltea=   -1 wlter=
> >>  -1
> >> 2 TS dt 0.00540401 time 0.00349731
> >> 0 SNES Function norm 1.358886930084e-05
> >> Linear solve did not converge due to DIVERGED_ITS iterations 100
> >>   Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE
> iterations 0
> >>   TSAdapt basic step   2 stage rejected (DIVERGED_LINEAR_SOLVE)
> t=0.00349731 + 5.404e-03 retrying with dt=1.351e-03
> >> 0 SNES Function norm 1.358886930084e-05
> >> Linear solve converged due to CONVERGED_RTOL iterations 19
> >> 1 SNES Function norm 4.412110425362e-10
> >> Linear solve converged due to CONVERGED_RTOL iterations 6
> >> 2 SNES Function norm 4.978968053066e-13
> >>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> >> 0 SNES Function norm 8.549322067920e-06
> >> Linear solve converged due to CONVERGED_RTOL iterations 14
> >> 1 SNES Function norm 8.357075378456e-11
> >> Linear solve converged due to CONVERGED_RTOL iterations 4
> >> 2 SNES Function norm 4.983138402512e-13
> >>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> >> 0 SNES Function norm 1.044832467924e-05
> >> Linear solve converged due to CONVERGED_RTOL iterations 13
> >> 1 SNES Function norm 1.036101875301e-10
> >> Linear solve converged due to CONVERGED_RTOL iterations 4
> >> 2 SNES Function norm 4.984888077288e-13
> >>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> >>   TSAdapt basic arkimex 0:1bee step   2 accepted t=0.00349731 +
> 1.351e-03 dt=6.305e-03  wlte=0.0372  wltea=   -1 wlter=
> >>   -1
> >> 3 TS dt 0.00630456 time 0.00484832
> >> 0 SNES Function norm 8.116559104264e-06
> >> Linear solve did not converge due to DIVERGED_ITS iterations 100
>


Re: [petsc-users] puzzling arkimex logic

2023-01-04 Thread Jed Brown
This default probably shouldn't be zero, and probably lengthening steps should 
be more gentle after a recent failure. But Mark, please let us know if what's 
there works for you.

"Zhang, Hong via petsc-users"  writes:

> Hi Mark,
>
> You might want to try -ts_adapt_time_step_increase_delay to delay increasing 
> the time step after it has been decreased due to a failed solve.
>
> Hong (Mr.)
>
>> On Jan 2, 2023, at 12:17 PM, Mark Adams  wrote:
>> 
>> I am using arkimex and the logic with a failed KSP solve is puzzling. This 
>> step starts with a dt of ~.005, the linear solver fails and cuts the time 
>> step by 1/4. So far, so good. The step then works, but on the next step the
>> time step goes to ~0.006.
>> TS seems to have forgotten that it had to cut the time step back.
>> Perhaps that logic is missing or my parameters need work?
>> 
>> Thanks,
>> Mark
>> 
>> -ts_adapt_dt_max 0.01 # (source: command line)
>> -ts_adapt_monitor # (source: file)
>> -ts_arkimex_type 1bee # (source: file)
>> -ts_dt .001 # (source: command line)
>> -ts_max_reject 10 # (source: file)
>> -ts_max_snes_failures -1 # (source: file)
>> -ts_max_steps 8000 # (source: command line)
>> -ts_max_time 14 # (source: command line)
>> -ts_monitor # (source: file)
>> -ts_rtol 1e-6 # (source: command line)
>> -ts_type arkimex # (source: file)
>> 
>>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>>   TSAdapt basic arkimex 0:1bee step   1 accepted t=0.001  + 
>> 2.497e-03 dt=5.404e-03  wlte=0.173  wltea=   -1 wlter=  
>>  -1
>> 2 TS dt 0.00540401 time 0.00349731
>> 0 SNES Function norm 1.358886930084e-05 
>> Linear solve did not converge due to DIVERGED_ITS iterations 100
>>   Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0
>>   TSAdapt basic step   2 stage rejected (DIVERGED_LINEAR_SOLVE) 
>> t=0.00349731 + 5.404e-03 retrying with dt=1.351e-03 
>> 0 SNES Function norm 1.358886930084e-05 
>> Linear solve converged due to CONVERGED_RTOL iterations 19
>> 1 SNES Function norm 4.412110425362e-10 
>> Linear solve converged due to CONVERGED_RTOL iterations 6
>> 2 SNES Function norm 4.978968053066e-13 
>>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>> 0 SNES Function norm 8.549322067920e-06 
>> Linear solve converged due to CONVERGED_RTOL iterations 14
>> 1 SNES Function norm 8.357075378456e-11 
>> Linear solve converged due to CONVERGED_RTOL iterations 4
>> 2 SNES Function norm 4.983138402512e-13 
>>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>> 0 SNES Function norm 1.044832467924e-05 
>> Linear solve converged due to CONVERGED_RTOL iterations 13
>> 1 SNES Function norm 1.036101875301e-10 
>> Linear solve converged due to CONVERGED_RTOL iterations 4
>> 2 SNES Function norm 4.984888077288e-13 
>>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>>   TSAdapt basic arkimex 0:1bee step   2 accepted t=0.00349731 + 
>> 1.351e-03 dt=6.305e-03  wlte=0.0372  wltea=   -1 wlter= 
>>   -1
>> 3 TS dt 0.00630456 time 0.00484832
>> 0 SNES Function norm 8.116559104264e-06 
>> Linear solve did not converge due to DIVERGED_ITS iterations 100


Re: [petsc-users] puzzling arkimex logic

2023-01-04 Thread Zhang, Hong via petsc-users
Hi Mark,

You might want to try -ts_adapt_time_step_increase_delay to delay increasing 
the time step after it has been decreased due to a failed solve.

Hong (Mr.)
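
A minimal sketch of wiring that option up (only the option name comes from this
message; the bare TS below has no problem attached and is purely illustrative):

#include <petscts.h>

int main(int argc, char **argv)
{
  TS ts;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Same effect as passing -ts_adapt_time_step_increase_delay 5 on the command
     line: delay growth of the time step for 5 steps after it has been cut. */
  PetscCall(PetscOptionsSetValue(NULL, "-ts_adapt_time_step_increase_delay", "5"));

  PetscCall(TSCreate(PETSC_COMM_WORLD, &ts));
  PetscCall(TSSetType(ts, TSARKIMEX));
  PetscCall(TSSetFromOptions(ts));       /* picks up the adapt option set above */

  PetscCall(TSDestroy(&ts));
  PetscCall(PetscFinalize());
  return 0;
}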

> On Jan 2, 2023, at 12:17 PM, Mark Adams  wrote:
> 
> I am using arkimex and the logic with a failed KSP solve is puzzling. This 
> step starts with a dt of ~.005, the linear solver fails and cuts the time 
> step by 1/4. So far, so good. The step then works, but on the next step the
> time step goes to ~0.006.
> TS seems to have forgotten that it had to cut the time step back.
> Perhaps that logic is missing or my parameters need work?
> 
> Thanks,
> Mark
> 
> -ts_adapt_dt_max 0.01 # (source: command line)
> -ts_adapt_monitor # (source: file)
> -ts_arkimex_type 1bee # (source: file)
> -ts_dt .001 # (source: command line)
> -ts_max_reject 10 # (source: file)
> -ts_max_snes_failures -1 # (source: file)
> -ts_max_steps 8000 # (source: command line)
> -ts_max_time 14 # (source: command line)
> -ts_monitor # (source: file)
> -ts_rtol 1e-6 # (source: command line)
> -ts_type arkimex # (source: file)
> 
>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>   TSAdapt basic arkimex 0:1bee step   1 accepted t=0.001  + 2.497e-03 
> dt=5.404e-03  wlte=0.173  wltea=   -1 wlter=  
>  -1
> 2 TS dt 0.00540401 time 0.00349731
> 0 SNES Function norm 1.358886930084e-05 
> Linear solve did not converge due to DIVERGED_ITS iterations 100
>   Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0
>   TSAdapt basic step   2 stage rejected (DIVERGED_LINEAR_SOLVE) 
> t=0.00349731 + 5.404e-03 retrying with dt=1.351e-03 
> 0 SNES Function norm 1.358886930084e-05 
> Linear solve converged due to CONVERGED_RTOL iterations 19
> 1 SNES Function norm 4.412110425362e-10 
> Linear solve converged due to CONVERGED_RTOL iterations 6
> 2 SNES Function norm 4.978968053066e-13 
>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> 0 SNES Function norm 8.549322067920e-06 
> Linear solve converged due to CONVERGED_RTOL iterations 14
> 1 SNES Function norm 8.357075378456e-11 
> Linear solve converged due to CONVERGED_RTOL iterations 4
> 2 SNES Function norm 4.983138402512e-13 
>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
> 0 SNES Function norm 1.044832467924e-05 
> Linear solve converged due to CONVERGED_RTOL iterations 13
> 1 SNES Function norm 1.036101875301e-10 
> Linear solve converged due to CONVERGED_RTOL iterations 4
> 2 SNES Function norm 4.984888077288e-13 
>   Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE iterations 2
>   TSAdapt basic arkimex 0:1bee step   2 accepted t=0.00349731 + 1.351e-03 
> dt=6.305e-03  wlte=0.0372  wltea=   -1 wlter= 
>   -1
> 3 TS dt 0.00630456 time 0.00484832
> 0 SNES Function norm 8.116559104264e-06 
> Linear solve did not converge due to DIVERGED_ITS iterations 100



Re: [petsc-users] Getting global indices of vector distributed among different processes.

2023-01-04 Thread Matthew Knepley
On Wed, Jan 4, 2023 at 10:48 AM Venugopal, Vysakh (venugovh) via
petsc-users  wrote:

> Hello,
>
>
>
> Is there a way to get the global indices from a vector created from
> DMCreateGlobalVector? Example:
>
>
>
> If a global vector V (of size 10) has indices {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
> divided between 2 processes, is there a way to get information
> such as (process 1: {0,1,2,3,4}, process 2: {5,6,7,8,9})?
>

https://petsc.org/main/docs/manualpages/Vec/VecGetOwnershipRange/

  Thanks,

 Matt
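
A minimal sketch of that pattern (illustrative, not from the original message;
it assumes Q and V share the same parallel layout, e.g. both created from the
same DM):

#include <petscvec.h>

int main(int argc, char **argv)
{
  Vec                Q, V;
  PetscInt           lo, hi, i;
  const PetscScalar *q;
  PetscScalar       *v;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* Two vectors with the same parallel layout (10 global entries). */
  PetscCall(VecCreateMPI(PETSC_COMM_WORLD, PETSC_DECIDE, 10, &Q));
  PetscCall(VecDuplicate(Q, &V));
  PetscCall(VecSet(Q, 42.0));

  /* Which global indices does this rank own? */
  PetscCall(VecGetOwnershipRange(V, &lo, &hi));
  PetscCall(PetscSynchronizedPrintf(PETSC_COMM_WORLD, "this rank owns global indices %" PetscInt_FMT " .. %" PetscInt_FMT "\n", lo, hi - 1));
  PetscCall(PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT));

  /* Copy the locally owned entries of Q into V (local index i <-> global index lo+i). */
  PetscCall(VecGetArrayRead(Q, &q));
  PetscCall(VecGetArray(V, &v));
  for (i = 0; i < hi - lo; i++) v[i] = q[i];
  PetscCall(VecRestoreArray(V, &v));
  PetscCall(VecRestoreArrayRead(Q, &q));

  PetscCall(VecDestroy(&Q));
  PetscCall(VecDestroy(&V));
  PetscCall(PetscFinalize());
  return 0;
}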


> The reason I need this information is that I need to query the values of a
> different vector Q of size 10 and place those values in V. Example: Q(1)
> -> V(1) @ process 1, Q(7) -> V(7) @ process 2, etc. If there are smarter
> ways to do this, I am happy to pursue that.
>
>
>
> Thank you,
>
>
>
> Vysakh V.
>
>
>
> ---
>
> Vysakh Venugopal
>
> Ph.D. Candidate
>
> Department of Mechanical Engineering
>
> University of Cincinnati, Cincinnati, OH 45221-0072
>
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/ 


[petsc-users] Getting global indices of vector distributed among different processes.

2023-01-04 Thread Venugopal, Vysakh (venugovh) via petsc-users
Hello,

Is there a way to get the global indices from a vector created from 
DMCreateGlobalVector? Example:

If a global vector V (of size 10) has indices {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
divided between 2 processes, is there a way to get information such as
(process 1: {0,1,2,3,4}, process 2: {5,6,7,8,9})?

The reason I need this information is that I need to query the values of a 
different vector Q of size 10 and place those values in V. Example: Q(1) ->
V(1) @ process 1, Q(7) -> V(7) @ process 2, etc. If there are smarter ways to
do this, I am happy to pursue that.

Thank you,

Vysakh V.

---
Vysakh Venugopal
Ph.D. Candidate
Department of Mechanical Engineering
University of Cincinnati, Cincinnati, OH 45221-0072