Hey David,

I don't think Table is designed so that you can "populate" it with a 2D
tensor. It is meant to be populated with a collection of equal-length
arrays.
A sparse CSR tensor, on the other hand, is composed of three arrays (indices,
indptr, values), and manipulating those takes somewhat more involved logic
than regular arrays do. See [1] for the memory layout definition.

What are you looking to accomplish? What access patterns are you expecting?

Rok

[1] https://github.com/apache/arrow/blob/master/format/SparseTensor.fbs

On Wed, Jul 6, 2022 at 10:48 PM dl <[email protected]> wrote:

> Hi Rok,
>
> What data type would I use for a pyarrow SparseCSRMatrix in a schema? I
> need to build a table with rows which include a field of this type. I don't
> see a related example in the test module. I'm doing something like:
>
> schema = pyarrow.schema(fields, metadata=metadata)
> table = pyarrow.Table.from_arrays(table_data, schema=schema)
>
> where fields is a list of tuples of the form (field_name, pyarrow_type),
> e.g. ('field1', pyarrow.string()). What should pyarrow_type be for a
> SparseCSRMatrix field? Or will this not work?
>
> Thanks,
> David
>
>
> On 7/1/2022 9:18 AM, Rok Mihevc wrote:
>
> We lack pyarrow sparse tensor documentation (PRs welcome), so the tests are
> perhaps the most extensive description of what is doable:
> https://github.com/apache/arrow/blob/master/python/pyarrow/tests/test_sparse_tensor.py
>
> Rok
>
> On Fri, Jul 1, 2022 at 5:38 PM dl via user <[email protected]> wrote:
>
>> So, I guess this is supported in 8.0.0. I can do this:
>>
>> import numpy as np
>> import pyarrow as pa
>> from scipy.sparse import csr_matrix
>>
>> a = np.random.rand(100)
>> a[a < .9] = 0.0
>> s = csr_matrix(a)
>> arrow_sparse_csr_matrix = pa.SparseCSRMatrix.from_scipy(s)
>>
>> Now, how do I use that to build a pyarrow table? Stay tuned...
>>
>> On 7/1/2022 8:19 AM, dl wrote:
>>
>> I find pyarrow.SparseCSRMatrix mentioned here
>> <https://arrow.apache.org/docs/python/integration/extending.html?highlight=sparse#pyarrow.pyarrow_wrap_sparse_csr_matrix>.
>> But how do I use that? Is there documentation for that class?
>>
>> On 7/1/2022 7:47 AM, dl wrote:
>>
>>
>> Hi,
>>
>> I'm trying to understand support for sparse tensors in Arrow. It looks
>> like there is "experimental" support using the C++ API
>> <https://arrow.apache.org/docs/cpp/api/tensor.html?highlight=sparse#sparse-tensors>.
>> When was this introduced? I see in the code base here
>> <https://github.com/apache/arrow/blob/master/python/pyarrow/tensor.pxi>
>> Cython sparse array classes. Can these be accessed using the Python API?
>> Are they included in the 8.0.0 release? Is there any other support for
>> sparse arrays/tensors in the Python API? Are there good examples for any of
>> this, in particular for using the 8.0.0 Python API to create sparse tensors?
>>
>> Thanks,
>> David
>>
>>
>>
>>
>>
>
