[jira] [Commented] (SPARK-28482) Data incomplete when using pandas udf in Python 3

2019-08-22 Thread jiangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913879#comment-16913879 ] jiangyu commented on SPARK-28482: Hi [~bryanc], I have tested toPandas(), and it is okay. Row numbers is

[jira] [Comment Edited] (SPARK-28482) Data incomplete when using pandas udf in Python 3

2019-08-22 Thread jiangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913183#comment-16913183 ] jiangyu edited comment on SPARK-28482 at 8/22/19 9:37 AM: Hi [~bryanc],

[jira] [Commented] (SPARK-28482) Data incomplete when using pandas udf in Python 3

2019-08-22 Thread jiangyu (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16913183#comment-16913183 ] jiangyu commented on SPARK-28482: Hi [~bryanc], maybe you should produce more data, like 100,000

[jira] [Commented] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-24 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16891638#comment-16891638 ] jiangyu commented on SPARK-28482: Hi [~dongjoon], I have tested this in Spark 2.3.3, 2.4.2 and

[jira] [Updated] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiangyu updated SPARK-28482: Description: Hi, since Spark 2.3.x, pandas udf has been introduced as the default ser/des method when

[jira] [Commented] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16890793#comment-16890793 ] jiangyu commented on SPARK-28482: Also submitted a PR in the Arrow community

[jira] [Updated] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiangyu updated SPARK-28482: Description: Hi, since Spark 2.3.x, pandas udf has been introduced as the default ser/des method when

[jira] [Updated] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiangyu updated SPARK-28482: Attachment: test.py test.csv > Data incomplete when using pandas udf in pyspark >

[jira] [Updated] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] jiangyu updated SPARK-28482: Attachment: py3.6.png worker.png py2.7.png > Data incomplete when using

[jira] [Created] (SPARK-28482) Data incomplete when using pandas udf in pyspark

2019-07-23 Thread jiangyu (JIRA)
jiangyu created SPARK-28482: --- Summary: Data incomplete when using pandas udf in pyspark Key: SPARK-28482 URL: https://issues.apache.org/jira/browse/SPARK-28482 Project: Spark Issue Type: Bug