Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22568#discussion_r220948302

    --- Diff: python/pyspark/sql/tests.py ---
    @@ -5525,32 +5525,73 @@ def data(self):
                 .withColumn("v", explode(col('vs'))).drop('vs')

         def test_supported_types(self):
    -        from pyspark.sql.functions import pandas_udf, PandasUDFType, array, col
    -        df = self.data.withColumn("arr", array(col("id")))
    +        from decimal import Decimal
    +        from distutils.version import LooseVersion
    +        import pyarrow as pa
    +        from pyspark.sql.functions import pandas_udf, PandasUDFType

    -        # Different forms of group map pandas UDF, results of these are the same
    +        input_values_with_schema = [
    +            (1, StructField('id', IntegerType())),
    +            (2, StructField('byte', ByteType())),
    +            (3, StructField('short', ShortType())),
    +            (4, StructField('int', IntegerType())),
    +            (5, StructField('long', LongType())),
    +            (1.1, StructField('float', FloatType())),
    +            (2.2, StructField('double', DoubleType())),
    +            (Decimal(1.123), StructField('decim', DecimalType(10, 3))),
    +            ([1, 2, 3], StructField('array', ArrayType(IntegerType()))),
    +            (True, StructField('bool', BooleanType())),
    +            ('hello', StructField('str', StringType())),
    +        ]
    --- End diff --

I understand why you did this, but I think we can just do something like:

```python
values = [
    1, 2, 3, 4, 5,
    1.1,
    ...
]

output_schema = StructType([
    StructField('id', IntegerType()),
    StructField('byte', ByteType()),
    StructField('short', ShortType()),
    StructField('int', IntegerType()),
    StructField('long', LongType()),
    StructField('float', FloatType()),
    ...
])
```

Let's just keep the original way and make it simple.
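For illustration, here is a minimal stdlib-only sketch of the restructuring the review suggests: keep a flat list of test values and a separate schema, and pair them with `zip` only where a test needs both. Plain `(name, type_name)` tuples stand in for pyspark's `StructField` objects here, so the names and shapes below are assumptions, not the PR's actual code.

```python
# Original style from the PR: each value travels together with its field.
# (Tuples stand in for pyspark StructField instances.)
input_values_with_schema = [
    (1, ('id', 'int')),
    (1.1, ('float', 'float')),
    ('hello', ('str', 'string')),
]

# Suggested style: a parallel value list plus one schema, kept in the
# same order so the pairing is positional.
values = [1, 1.1, 'hello']
output_schema = [('id', 'int'), ('float', 'float'), ('str', 'string')]

# zip() recovers the paired form on demand, so nothing is lost by
# splitting the two lists apart.
assert list(zip(values, output_schema)) == input_values_with_schema
```

The trade-off is that the split form relies on the two lists staying in sync by position, which is exactly why the original patch bundled them; the review's point is that for a test fixture the simpler layout is easier to read.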