[jira] [Assigned] (SPARK-42021) createDataFrame with array.array

2023-01-16 Thread Hyukjin Kwon (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-42021:


Assignee: Hyukjin Kwon

> createDataFrame with array.array
> 
>
> Key: SPARK-42021
> URL: https://issues.apache.org/jira/browse/SPARK-42021
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Hyukjin Kwon
>Priority: Major
> Fix For: 3.4.0
>
>
> {code}
> pyspark/sql/tests/test_types.py:964 (TypesParityTests.test_array_types)
> self = <TypesParityTests testMethod=test_array_types>
>
>     def test_array_types(self):
>         # This test need to make sure that the Scala type selected is at least
>         # as large as the python's types. This is necessary because python's
>         # array types depend on C implementation on the machine. Therefore there
>         # is no machine independent correspondence between python's array types
>         # and Scala types.
>         # See: https://docs.python.org/2/library/array.html
>
>         def assertCollectSuccess(typecode, value):
>             row = Row(myarray=array.array(typecode, [value]))
>             df = self.spark.createDataFrame([row])
>             self.assertEqual(df.first()["myarray"][0], value)
>
>         # supported string types
>         #
>         # String types in python's array are "u" for Py_UNICODE and "c" for char.
>         # "u" will be removed in python 4, and "c" is not supported in python 3.
>         supported_string_types = []
>         if sys.version_info[0] < 4:
>             supported_string_types += ["u"]
>             # test unicode
> >           assertCollectSuccess("u", "a")
>
> ../test_types.py:986: 
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> ../test_types.py:975: in assertCollectSuccess
>     df = self.spark.createDataFrame([row])
> ../../connect/session.py:278: in createDataFrame
>     _table = pa.Table.from_pylist([row.asDict(recursive=True) for row in _data])
> pyarrow/table.pxi:3700: in pyarrow.lib.Table.from_pylist
>     ???
> pyarrow/table.pxi:5221: in pyarrow.lib._from_pylist
>     ???
> pyarrow/table.pxi:3575: in pyarrow.lib.Table.from_arrays
>     ???
> pyarrow/table.pxi:1383: in pyarrow.lib._sanitize_arrays
>     ???
> pyarrow/table.pxi:1364: in pyarrow.lib._schema_from_arrays
>     ???
> pyarrow/array.pxi:320: in pyarrow.lib.array
>     ???
> pyarrow/array.pxi:39: in pyarrow.lib._sequence_to_array
>     ???
> pyarrow/error.pxi:144: in pyarrow.lib.pyarrow_internal_check_status
>     ???
> _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
> >   ???
> E   pyarrow.lib.ArrowInvalid: Could not convert array('u', 'a') with type array.array: did not recognize Python value type when inferring an Arrow data type
> pyarrow/error.pxi:100: ArrowInvalid
> {code}
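
For readers without the full Spark test environment, the failure reduces to PyArrow's type inference not recognizing array.array values nested inside the row dictionaries that connect/session.py hands to pa.Table.from_pylist. Below is a minimal sketch of that inference failure together with an illustrative list() conversion; the dictionary key and the workaround are assumptions for demonstration only, not code taken from Spark or from the eventual fix.

{code}
import array

import pyarrow as pa

# A row shaped like the one the failing test builds; the plain dict mirrors
# what row.asDict(recursive=True) produces for Row(myarray=array.array("u", ["a"])).
row = {"myarray": array.array("u", "a")}

try:
    # connect/session.py feeds such dicts straight to pa.Table.from_pylist.
    # PyArrow's type inference does not recognize array.array values nested in
    # the dict, so this raises ArrowInvalid on the PyArrow releases current
    # when the issue was filed.
    pa.Table.from_pylist([row])
except pa.ArrowInvalid as exc:
    print("inference failed:", exc)

# A client-side conversion for illustration only (not necessarily what the
# eventual Spark fix does): turn array.array values into plain lists, which
# PyArrow's inference does understand.
converted = {key: list(value) if isinstance(value, array.array) else value
             for key, value in row.items()}
table = pa.Table.from_pylist([converted])
print(table.schema)  # e.g. myarray: list<item: string> for typecode "u"
{code}

The list() conversion above is only meant to show where the inference breaks down; the proper resolution belongs in Spark Connect's createDataFrame conversion path, as tracked by this issue.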



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-42021) createDataFrame with array.array

2023-01-16 Thread Hyukjin Kwon (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon reassigned SPARK-42021:


Assignee: Ruifeng Zheng  (was: Hyukjin Kwon)

> createDataFrame with array.array
> 
>
> Key: SPARK-42021
> URL: https://issues.apache.org/jira/browse/SPARK-42021
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Ruifeng Zheng
>Priority: Major
> Fix For: 3.4.0
>
>






[jira] [Assigned] (SPARK-42021) createDataFrame with array.array

2023-01-15 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42021:


Assignee: (was: Apache Spark)

> createDataFrame with array.array
> 
>
> Key: SPARK-42021
> URL: https://issues.apache.org/jira/browse/SPARK-42021
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Priority: Major
>






[jira] [Assigned] (SPARK-42021) createDataFrame with array.array

2023-01-15 Thread Apache Spark (Jira)


 [ https://issues.apache.org/jira/browse/SPARK-42021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-42021:


Assignee: Apache Spark

> createDataFrame with array.array
> 
>
> Key: SPARK-42021
> URL: https://issues.apache.org/jira/browse/SPARK-42021
> Project: Spark
>  Issue Type: Sub-task
>  Components: Connect
>Affects Versions: 3.4.0
>Reporter: Hyukjin Kwon
>Assignee: Apache Spark
>Priority: Major
>


