Re: issue on define a dataframe

2021-12-14 Thread Sean Owen
Is this Python? Just try passing [("apple",), ("orange",), ...]
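
For context on why the trailing comma matters: in Python, parentheses alone do not create a tuple, so ("apple") is just the string "apple", and schema inference then sees a bare str instead of a row. A minimal sketch of the difference follows; the helper name make_name_df is hypothetical, and it assumes an existing SparkSession is passed in as `spark`.

```python
# Parentheses alone do not make a tuple; the trailing comma does.
not_a_tuple = ("apple")   # identical to "apple", a plain str
one_tuple = ("apple",)    # a tuple with exactly one element

print(type(not_a_tuple).__name__)  # str
print(type(one_tuple).__name__)    # tuple

# Rows for a one-column DataFrame: wrap each value in a 1-tuple.
names = ["apple", "orange"]
rows = [(name,) for name in names]
print(rows)  # [('apple',), ('orange',)]


def make_name_df(spark):
    """Hypothetical helper: build the one-column DataFrame.

    Assumes `spark` is an already-created SparkSession.
    """
    return spark.createDataFrame(rows, ["name"])
```

In the pyspark shell, calling make_name_df(spark) should then behave like the two-column example from the question, only with a single "name" column.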

On Tue, Dec 14, 2021 at 7:18 PM  wrote:

> Hello,
>
> Spark newbie here :)
>
> Why can't I create a DataFrame with just one column?
>
> For instance, this works:
>
> >>> df=spark.createDataFrame([("apple",2),("orange",3)],["name","count"])
> >>>
>
> But this doesn't work:
>
> >>> df=spark.createDataFrame([("apple"),("orange")],["name"])
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/opt/spark/python/pyspark/sql/session.py", line 675, in createDataFrame
>     return self._create_dataframe(data, schema, samplingRatio, verifySchema)
>   File "/opt/spark/python/pyspark/sql/session.py", line 700, in _create_dataframe
>     rdd, schema = self._createFromLocal(map(prepare, data), schema)
>   File "/opt/spark/python/pyspark/sql/session.py", line 512, in _createFromLocal
>     struct = self._inferSchemaFromList(data, names=schema)
>   File "/opt/spark/python/pyspark/sql/session.py", line 439, in _inferSchemaFromList
>     schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
>   File "/opt/spark/python/pyspark/sql/session.py", line 439, in <genexpr>
>     schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
>   File "/opt/spark/python/pyspark/sql/types.py", line 1067, in _infer_schema
>     raise TypeError("Can not infer schema for type: %s" % type(row))
> TypeError: Can not infer schema for type: <class 'str'>
>
>
> How can I fix it?
>
> Thanks
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


issue on define a dataframe

2021-12-14 Thread bitfox

Hello,

Spark newbie here :)

Why can't I create a DataFrame with just one column?

For instance, this works:

>>> df=spark.createDataFrame([("apple",2),("orange",3)],["name","count"])
>>>

But this doesn't work:

>>> df=spark.createDataFrame([("apple"),("orange")],["name"])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/spark/python/pyspark/sql/session.py", line 675, in createDataFrame
    return self._create_dataframe(data, schema, samplingRatio, verifySchema)
  File "/opt/spark/python/pyspark/sql/session.py", line 700, in _create_dataframe
    rdd, schema = self._createFromLocal(map(prepare, data), schema)
  File "/opt/spark/python/pyspark/sql/session.py", line 512, in _createFromLocal
    struct = self._inferSchemaFromList(data, names=schema)
  File "/opt/spark/python/pyspark/sql/session.py", line 439, in _inferSchemaFromList
    schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
  File "/opt/spark/python/pyspark/sql/session.py", line 439, in <genexpr>
    schema = reduce(_merge_type, (_infer_schema(row, names) for row in data))
  File "/opt/spark/python/pyspark/sql/types.py", line 1067, in _infer_schema
    raise TypeError("Can not infer schema for type: %s" % type(row))
TypeError: Can not infer schema for type: <class 'str'>


How can I fix it?

Thanks
