Re: convert pandas image column to Spark DataFrame

2023-08-03 Thread Adrian Pop-Tifrea
Hello,

can you also please show us how you created the pandas DataFrame? I mean,
how you added the actual data into the DataFrame. It would help us
reproduce the error.
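
For reference, this is roughly what I am guessing your setup looks like. It is only a sketch built from your description (10 rows, one (500, 333, 3) array per row); the random pixel values, the uint8 dtype, and the name pdf are my placeholders, not your code.

```python
import numpy as np
import pandas as pd

# Assumption: 10 images, each stored as a (500, 333, 3) integer ndarray.
# Random pixels stand in for the real image data.
rng = np.random.default_rng(0)
images = [
    rng.integers(0, 256, size=(500, 333, 3), dtype=np.uint8)
    for _ in range(10)
]

# One ndarray per cell, so the column has object dtype.
pdf = pd.DataFrame({"image": images})

print(pdf.shape)                    # (rows, columns) of the DataFrame itself
print(type(pdf["image"].iloc[0]))   # type of a single cell
print(pdf["image"].iloc[0].shape)   # shape of a single image
```

If your DataFrame was built differently (for example, one 4-D array split across rows), please post that code instead.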

Thank you,
Pop-Tifrea Adrian

On Mon, Jul 31, 2023 at 5:03 AM second_co...@yahoo.com <
second_co...@yahoo.com> wrote:

> I changed it to
>
> ArrayType(ArrayType(ArrayType(IntegerType()))) but still get the same error.
>
> Thank you for responding
>
> On Thursday, July 27, 2023 at 06:58:09 PM GMT+8, Adrian Pop-Tifrea <
> poptifreaadr...@gmail.com> wrote:
>
>
> Hello,
>
> when you said your pandas DataFrame has 10 rows, does that mean it
> contains 10 images? Because if that's the case, then you'd want to use only
> 3 layers of ArrayType when you define the schema.
>
> Best regards,
> Adrian
>
>
>
> On Thu, Jul 27, 2023, 11:04 second_co...@yahoo.com.INVALID
>  wrote:
>
> I have a pandas DataFrame with a column 'image' holding numpy.ndarray
> values. The shape is (500, 333, 3) per image. My pandas DataFrame has 10
> rows, so the overall shape is (10, 500, 333, 3).
>
> When using spark.createDataFrame(panda_dataframe, schema), I need to
> specify the schema:
>
> schema = StructType([
>     StructField("image",
>         ArrayType(ArrayType(ArrayType(ArrayType(IntegerType())))), nullable=False)
> ])
>
>
> I get this error:
>
> raise TypeError(
> , TypeError: field image: 
> ArrayType(ArrayType(ArrayType(ArrayType(IntegerType(), True), True), True), 
> True) can not accept object array([[[14, 14, 14],
>
> ...
>
> Can you advise how to set the schema for an image stored as a numpy.ndarray?
>
>
>
>


Re: convert pandas image column to Spark DataFrame

2023-07-27 Thread Adrian Pop-Tifrea
Hello,

when you said your pandas DataFrame has 10 rows, does that mean it contains
10 images? Because if that's the case, then you'd want to use only 3 layers
of ArrayType when you define the schema.
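
A rough sketch of what I mean, not tested against your Spark version. It assumes each cell holds one (height, width, channels) array. It also converts the ndarray cells to nested Python lists with .tolist() first, since the error suggests the schema check is rejecting the numpy object itself. The names build_schema and to_spark and the tiny 4x3x3 placeholder images are mine, chosen only for illustration.

```python
import numpy as np
import pandas as pd


def build_schema():
    # Imported here so the pandas part of the sketch runs without pyspark.
    from pyspark.sql.types import (
        ArrayType, IntegerType, StructField, StructType,
    )
    # Three ArrayType layers: height -> width -> channels.
    # The 10 rows are DataFrame rows, not a fourth array dimension.
    return StructType([
        StructField(
            "image",
            ArrayType(ArrayType(ArrayType(IntegerType()))),
            nullable=False,
        )
    ])


# Placeholder data: 2 small images instead of 10 images of (500, 333, 3).
pdf = pd.DataFrame(
    {"image": [np.zeros((4, 3, 3), dtype=np.uint8) for _ in range(2)]}
)

# Turn each ndarray cell into nested Python lists of plain ints.
pdf["image"] = pdf["image"].apply(lambda arr: arr.tolist())


def to_spark(spark, pandas_df):
    # `spark` is assumed to be an existing SparkSession.
    return spark.createDataFrame(pandas_df, schema=build_schema())
```

With an active session, to_spark(spark, pdf) would be the call to try. If it still fails, the full traceback would help.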

Best regards,
Adrian



On Thu, Jul 27, 2023, 11:04 second_co...@yahoo.com.INVALID
 wrote:

> I have a pandas DataFrame with a column 'image' holding numpy.ndarray
> values. The shape is (500, 333, 3) per image. My pandas DataFrame has 10
> rows, so the overall shape is (10, 500, 333, 3).
>
> When using spark.createDataFrame(panda_dataframe, schema), I need to
> specify the schema:
>
> schema = StructType([
>     StructField("image",
>         ArrayType(ArrayType(ArrayType(ArrayType(IntegerType())))), nullable=False)
> ])
>
>
> I get this error:
>
> raise TypeError(
> , TypeError: field image: 
> ArrayType(ArrayType(ArrayType(ArrayType(IntegerType(), True), True), True), 
> True) can not accept object array([[[14, 14, 14],
>
> ...
>
> Can you advise how to set the schema for an image stored as a numpy.ndarray?
>
>
>
>