[jira] [Commented] (SPARK-39457) Support pure IPV6 environment without IPV4

2022-06-13 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-39457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17553901#comment-17553901 ] Ruslan Dautkhanov commented on SPARK-39457: --- Is there a dependency on Hadoop to support IPv6

[jira] [Commented] (SPARK-26413) SPIP: RDD Arrow Support in Spark Core and PySpark

2021-11-08 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-26413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17440731#comment-17440731 ] Ruslan Dautkhanov commented on SPARK-26413: --- [https://github.com/apache/spark/pull/34505] is

[jira] [Commented] (SPARK-32399) Support full outer join in shuffled hash join

2020-10-14 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214440#comment-17214440 ] Ruslan Dautkhanov commented on SPARK-32399: --- Here's another view if that's helpful that shows

[jira] [Updated] (SPARK-32399) Support full outer join in shuffled hash join

2020-10-14 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-32399: -- Attachment: Screen Shot 2020-10-14 at 11.08.37 PM.png > Support full outer join in

[jira] [Commented] (SPARK-32399) Support full outer join in shuffled hash join

2020-10-14 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17214426#comment-17214426 ] Ruslan Dautkhanov commented on SPARK-32399: --- [~chengsu] thank you for all the Shuffled Hash

[jira] [Updated] (SPARK-32399) Support full outer join in shuffled hash join

2020-10-14 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-32399: -- Attachment: Screen Shot 2020-10-14 at 12.30.07 PM.png > Support full outer join in

[jira] [Comment Edited] (SPARK-32760) Support for INET data type

2020-09-01 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188612#comment-17188612 ] Ruslan Dautkhanov edited comment on SPARK-32760 at 9/1/20, 4:29 PM:

[jira] [Commented] (SPARK-32760) Support for INET data type

2020-09-01 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32760?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17188612#comment-17188612 ] Ruslan Dautkhanov commented on SPARK-32760: --- [~smilegator] understood. Would be great to

[jira] [Resolved] (SPARK-32759) Support for INET data type

2020-08-31 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov resolved SPARK-32759. --- Resolution: Duplicate > Support for INET data type > -- > >

[jira] [Created] (SPARK-32760) Support for INET data type

2020-08-31 Thread Ruslan Dautkhanov (Jira)
Ruslan Dautkhanov created SPARK-32760: - Summary: Support for INET data type Key: SPARK-32760 URL: https://issues.apache.org/jira/browse/SPARK-32760 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-32759) Support for INET data type

2020-08-31 Thread Ruslan Dautkhanov (Jira)
Ruslan Dautkhanov created SPARK-32759: - Summary: Support for INET data type Key: SPARK-32759 URL: https://issues.apache.org/jira/browse/SPARK-32759 Project: Spark Issue Type: New Feature

[jira] [Commented] (SPARK-28367) Kafka connector infinite wait because metadata never updated

2020-08-13 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17177202#comment-17177202 ] Ruslan Dautkhanov commented on SPARK-28367: --- [~gsomogyi] thanks! yep would be great to learn

[jira] [Updated] (SPARK-32294) GroupedData Pandas UDF 2Gb limit

2020-07-13 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-32294?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-32294: -- Description: `spark.sql.execution.arrow.maxRecordsPerBatch` is not respected for

[jira] [Created] (SPARK-32294) GroupedData Pandas UDF 2Gb limit

2020-07-13 Thread Ruslan Dautkhanov (Jira)
Ruslan Dautkhanov created SPARK-32294: - Summary: GroupedData Pandas UDF 2Gb limit Key: SPARK-32294 URL: https://issues.apache.org/jira/browse/SPARK-32294 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-28266) data duplication when `path` serde property is present

2020-01-13 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014467#comment-17014467 ] Ruslan Dautkhanov edited comment on SPARK-28266 at 1/13/20 4:46 PM:

[jira] [Commented] (SPARK-28266) data duplication when `path` serde property is present

2020-01-13 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014467#comment-17014467 ] Ruslan Dautkhanov commented on SPARK-28266: --- Thank you for checking [~dongjoon] That may have

[jira] [Commented] (SPARK-29224) Implement Factorization Machines as a ml-pipeline component

2019-12-23 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002402#comment-17002402 ] Ruslan Dautkhanov commented on SPARK-29224: --- E.g. would this work with 0.1m or 1m sparse

[jira] [Commented] (SPARK-29224) Implement Factorization Machines as a ml-pipeline component

2019-12-23 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17002398#comment-17002398 ] Ruslan Dautkhanov commented on SPARK-29224: --- That's great. Out of curiosity - what's largest

[jira] [Reopened] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2019-12-04 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov reopened SPARK-21488: --- > Make saveAsTable() and createOrReplaceTempView() return dataframe of created >

[jira] [Commented] (SPARK-21488) Make saveAsTable() and createOrReplaceTempView() return dataframe of created table/ created view

2019-12-04 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-21488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16988011#comment-16988011 ] Ruslan Dautkhanov commented on SPARK-21488: --- [~zsxwing] any chance this can be added to Spark

[jira] [Commented] (SPARK-19842) Informational Referential Integrity Constraints Support in Spark

2019-12-04 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-19842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16987999#comment-16987999 ] Ruslan Dautkhanov commented on SPARK-19842: --- From the design document """ This alternative

[jira] [Commented] (SPARK-22340) pyspark setJobGroup doesn't match java threads

2019-11-11 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-22340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16971723#comment-16971723 ] Ruslan Dautkhanov commented on SPARK-22340: --- Glad to see this is solved.  A nice side-effect

[jira] [Commented] (SPARK-29041) Allow createDataFrame to accept bytes as binary type

2019-10-16 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953163#comment-16953163 ] Ruslan Dautkhanov commented on SPARK-29041: --- [~hyukjin.kwon] thanks for getting back on this

[jira] [Commented] (SPARK-29041) Allow createDataFrame to accept bytes as binary type

2019-09-30 Thread Ruslan Dautkhanov (Jira)
[ https://issues.apache.org/jira/browse/SPARK-29041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16941289#comment-16941289 ] Ruslan Dautkhanov commented on SPARK-29041: --- Thank you [~hyukjin.kwon] Our users say this

[jira] [Updated] (SPARK-28266) data duplication when `path` serde property is present

2019-07-12 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-28266: -- Summary: data duplication when `path` serde property is present (was: data

[jira] [Commented] (SPARK-28266) data correctness issue: data duplication when `path` serde property is present

2019-07-11 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16883222#comment-16883222 ] Ruslan Dautkhanov commented on SPARK-28266: --- Another interesting side Spark bug found while

[jira] [Commented] (SPARK-28266) data correctness issue: data duplication when `path` serde property is present

2019-07-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882381#comment-16882381 ] Ruslan Dautkhanov commented on SPARK-28266: --- This issue happens `spark.sql.sources.provider`

[jira] [Commented] (SPARK-28266) data correctness issue: data duplication when `path` serde property is present

2019-07-10 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16882200#comment-16882200 ] Ruslan Dautkhanov commented on SPARK-28266: --- Suspecting change in SPARK-22158 causes this  >

[jira] [Commented] (SPARK-22158) convertMetastore should not ignore storage properties

2019-07-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881602#comment-16881602 ] Ruslan Dautkhanov commented on SPARK-22158: --- [~dongjoon] I may have misreported it - sorry. 

[jira] [Comment Edited] (SPARK-22158) convertMetastore should not ignore storage properties

2019-07-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881437#comment-16881437 ] Ruslan Dautkhanov edited comment on SPARK-22158 at 7/9/19 6:57 PM: ---

[jira] [Commented] (SPARK-22158) convertMetastore should not ignore storage properties

2019-07-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16881437#comment-16881437 ] Ruslan Dautkhanov commented on SPARK-22158: --- [~dongjoon] can you please check if this causes

[jira] [Updated] (SPARK-28266) data correctness issue: data duplication when `path` serde property is present

2019-07-08 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-28266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-28266: -- Summary: data correctness issue: data duplication when `path` serde property is

[jira] [Created] (SPARK-28266) data correctness issue: data duplication when `path` serde property is present

2019-07-05 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-28266: - Summary: data correctness issue: data duplication when `path` serde property is present Key: SPARK-28266 URL: https://issues.apache.org/jira/browse/SPARK-28266

[jira] [Issue Comment Deleted] (SPARK-22151) PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly

2019-05-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-22151: -- Comment: was deleted (was: Is there a workaround for this in Apache Livy? We're

[jira] [Commented] (SPARK-22151) PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly

2019-05-30 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16852381#comment-16852381 ] Ruslan Dautkhanov commented on SPARK-22151: --- Is there a workaround for this in Apache Livy?

[jira] [Comment Edited] (SPARK-15463) Support for creating a dataframe from CSV in Dataset[String]

2019-05-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840018#comment-16840018 ] Ruslan Dautkhanov edited comment on SPARK-15463 at 5/15/19 4:00 PM:

[jira] [Commented] (SPARK-15463) Support for creating a dataframe from CSV in Dataset[String]

2019-05-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16840018#comment-16840018 ] Ruslan Dautkhanov commented on SPARK-15463: --- [~hyukjin.kwon] would it be possible to make

[jira] [Commented] (SPARK-15719) Disable writing Parquet summary files by default

2019-04-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814061#comment-16814061 ] Ruslan Dautkhanov commented on SPARK-15719: --- [~lian cheng] quick question on this part from

[jira] [Commented] (SPARK-21784) Add ALTER TABLE ADD CONSTRANT DDL to support defining primary key and foreign keys

2019-04-04 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16810124#comment-16810124 ] Ruslan Dautkhanov commented on SPARK-21784: --- Any chance this can be part of Spark 3.0 release?

[jira] [Commented] (SPARK-26764) [SPIP] Spark Relational Cache

2019-02-25 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26764?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16777060#comment-16777060 ] Ruslan Dautkhanov commented on SPARK-26764: --- That seems to be closely related to Hive

[jira] [Comment Edited] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695427#comment-16695427 ] Ruslan Dautkhanov edited comment on SPARK-26019 at 11/22/18 12:42 AM:

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16695427#comment-16695427 ] Ruslan Dautkhanov commented on SPARK-26019: --- Thank you [~irashid] I confirm that swapping

[jira] [Reopened] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov reopened SPARK-26019: --- > pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" > in

[jira] [Reopened] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov reopened SPARK-26019: --- > pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" > in

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692716#comment-16692716 ] Ruslan Dautkhanov commented on SPARK-26019: --- [~viirya] exception stack reads that error

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692377#comment-16692377 ] Ruslan Dautkhanov commented on SPARK-26019: --- [~irashid] thanks a lot for looking at this ! It

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692237#comment-16692237 ] Ruslan Dautkhanov commented on SPARK-26019: --- cc [~lucacanali] > pyspark/accumulators.py:

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692233#comment-16692233 ] Ruslan Dautkhanov commented on SPARK-26019: --- Sorry, nope it was broken by this change -

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692226#comment-16692226 ] Ruslan Dautkhanov commented on SPARK-26019: --- Might be broken by

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16692219#comment-16692219 ] Ruslan Dautkhanov commented on SPARK-26019: --- [~hyukjin.kwon] today I reproduced this first

[jira] [Reopened] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-19 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov reopened SPARK-26019: --- Reproduced myself > pyspark/accumulators.py: "TypeError: object of type 'NoneType'

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16690123#comment-16690123 ] Ruslan Dautkhanov commented on SPARK-26019: --- That user said he has seen this error 4-5 times,

[jira] [Commented] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689056#comment-16689056 ] Ruslan Dautkhanov commented on SPARK-26041: --- [~hyukjin.kwon] I didn't request investigation. I

[jira] [Commented] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16689052#comment-16689052 ] Ruslan Dautkhanov commented on SPARK-26019: --- No, it was the only instance I had for this

[jira] [Resolved] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-15 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov resolved SPARK-26019. --- Resolution: Cannot Reproduce > pyspark/accumulators.py: "TypeError: object of type

[jira] [Updated] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-26041: -- Environment: Spark 2.3.2  Hadoop 2.6 When we materialize one of intermediate

[jira] [Comment Edited] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686907#comment-16686907 ] Ruslan Dautkhanov edited comment on SPARK-26041 at 11/14/18 5:45 PM: -

[jira] [Comment Edited] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686907#comment-16686907 ] Ruslan Dautkhanov edited comment on SPARK-26041 at 11/14/18 5:45 PM: -

[jira] [Commented] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686907#comment-16686907 ] Ruslan Dautkhanov commented on SPARK-26041: --- thanks for checking this [~mgaido] just attached

[jira] [Updated] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-26041: -- Attachment: SPARK-26041.txt > catalyst cuts out some columns from dataframes: >

[jira] [Commented] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16686851#comment-16686851 ] Ruslan Dautkhanov commented on SPARK-26041: --- Thanks for referencing that jira [~mgaido]

[jira] [Updated] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-14 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-26041: -- Affects Version/s: 2.3.0 2.3.1 > catalyst cuts out some

[jira] [Commented] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685592#comment-16685592 ] Ruslan Dautkhanov commented on SPARK-26041: --- There are a couple of related jiras that were

[jira] [Commented] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-26041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16685589#comment-16685589 ] Ruslan Dautkhanov commented on SPARK-26041: --- Issue might be introduced by SPARK-9830  Comment

[jira] [Created] (SPARK-26041) catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute

2018-11-13 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-26041: - Summary: catalyst cuts out some columns from dataframes: org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Binding attribute Key: SPARK-26041 URL:

[jira] [Created] (SPARK-26019) pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates()

2018-11-12 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-26019: - Summary: pyspark/accumulators.py: "TypeError: object of type 'NoneType' has no len()" in authenticate_and_accum_updates() Key: SPARK-26019 URL:

[jira] [Commented] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681892#comment-16681892 ] Ruslan Dautkhanov commented on SPARK-25958: --- Yep, the pyspark job completes fine after we

[jira] [Resolved] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov resolved SPARK-25958. --- Resolution: Not A Problem > error: [Errno 97] Address family not supported by

[jira] [Commented] (SPARK-24244) Parse only required columns of CSV file

2018-11-09 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16681735#comment-16681735 ] Ruslan Dautkhanov commented on SPARK-24244: --- [~maxgekk] great improvement  is this new option

[jira] [Commented] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-08 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679261#comment-16679261 ] Ruslan Dautkhanov commented on SPARK-25958: --- [~XuanYuan] interesting.. here's our /etc/hosts:

[jira] [Commented] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-08 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16679263#comment-16679263 ] Ruslan Dautkhanov commented on SPARK-25958: --- I just removed ipv6 reference ::1 in /etc/hosts

[jira] [Comment Edited] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678894#comment-16678894 ] Ruslan Dautkhanov edited comment on SPARK-25958 at 11/7/18 10:35 PM: -

[jira] [Commented] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16678894#comment-16678894 ] Ruslan Dautkhanov commented on SPARK-25958: --- We do have ipv6 disabled on our hadoop servers,

[jira] [Updated] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25958: -- Issue Type: Bug (was: New Feature) > error: [Errno 97] Address family not supported

[jira] [Updated] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-07 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25958: -- Description: Following error happens on a heavy Spark job after 4 hours of runtime..

[jira] [Created] (SPARK-25958) error: [Errno 97] Address family not supported by protocol in dataframe.take()

2018-11-06 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-25958: - Summary: error: [Errno 97] Address family not supported by protocol in dataframe.take() Key: SPARK-25958 URL: https://issues.apache.org/jira/browse/SPARK-25958

[jira] [Comment Edited] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.

2018-10-29 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667721#comment-16667721 ] Ruslan Dautkhanov edited comment on SPARK-25863 at 10/29/18 8:37 PM: -

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-29 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667721#comment-16667721 ] Ruslan Dautkhanov commented on SPARK-25863: --- [~mgaido], I will try to get a reproducer, but it

[jira] [Commented] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2018-10-29 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16667478#comment-16667478 ] Ruslan Dautkhanov commented on SPARK-22505: --- Thank you! That worked > toDF() /

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1736#comment-1736 ] Ruslan Dautkhanov commented on SPARK-25863: --- It seems error happens here

[jira] [Updated] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1

2018-10-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25863: -- Affects Version/s: 2.3.1 > java.lang.UnsupportedOperationException: empty.max at >

[jira] [Commented] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala

2018-10-28 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1726#comment-1726 ] Ruslan Dautkhanov commented on SPARK-25863: --- This happens only on one of our heaviest Spark

[jira] [Created] (SPARK-25863) java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1

2018-10-28 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-25863: - Summary: java.lang.UnsupportedOperationException: empty.max at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator$.updateAndGetCompilationStats(CodeGenerator.scala:1475)

[jira] [Commented] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16661282#comment-16661282 ] Ruslan Dautkhanov commented on SPARK-25814: --- thank you [~vanzin] ! I will try to tune those

[jira] [Updated] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25814: -- Priority: Major (was: Critical) > spark driver runs out of memory on

[jira] [Updated] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25814: -- Description:  We're looking into issue when even huge spark driver memory gets

[jira] [Updated] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25814: -- Description:  We're looking into issue when even huge spark driver memory gets

[jira] [Updated] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25814: -- Description:  We're looking into issue when even huge spark driver memory gets

[jira] [Updated] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25814?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ruslan Dautkhanov updated SPARK-25814: -- Attachment: image-2018-10-23-14-06-53-722.png > spark driver runs out of memory on

[jira] [Created] (SPARK-25814) spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore

2018-10-23 Thread Ruslan Dautkhanov (JIRA)
Ruslan Dautkhanov created SPARK-25814: - Summary: spark driver runs out of memory on org.apache.spark.util.kvstore.InMemoryStore Key: SPARK-25814 URL: https://issues.apache.org/jira/browse/SPARK-25814

[jira] [Commented] (SPARK-13587) Support virtualenv in PySpark

2018-10-22 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16659374#comment-16659374 ] Ruslan Dautkhanov commented on SPARK-13587: --- We're using conda environments shared across

[jira] [Commented] (SPARK-22947) SPIP: as-of join in Spark SQL

2018-10-21 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16658401#comment-16658401 ] Ruslan Dautkhanov commented on SPARK-22947: --- Perhaps at least part of implementation can be

[jira] [Commented] (SPARK-25643) Performance issues querying wide rows

2018-10-16 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16652039#comment-16652039 ] Ruslan Dautkhanov commented on SPARK-25643: --- [~viirya] we confirm this problem on our

[jira] [Comment Edited] (SPARK-22505) toDF() / createDataFrame() type inference doesn't work as expected

2018-10-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22505?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16263733#comment-16263733 ] Ruslan Dautkhanov edited comment on SPARK-22505 at 10/5/18 8:26 PM:

[jira] [Commented] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-10-05 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16640045#comment-16640045 ] Ruslan Dautkhanov commented on SPARK-25164: --- Thank you [~bersprockets] - SPARK-25643 would be

[jira] [Commented] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614020#comment-16614020 ] Ruslan Dautkhanov commented on SPARK-25164: --- Hi [~bersprockets]   Thanks a lot for the

[jira] [Comment Edited] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-13 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16614020#comment-16614020 ] Ruslan Dautkhanov edited comment on SPARK-25164 at 9/13/18 8:19 PM:

[jira] [Comment Edited] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-11 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611089#comment-16611089 ] Ruslan Dautkhanov edited comment on SPARK-25164 at 9/11/18 7:00 PM:

[jira] [Commented] (SPARK-25164) Parquet reader builds entire list of columns once for each column

2018-09-11 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-25164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16611089#comment-16611089 ] Ruslan Dautkhanov commented on SPARK-25164: --- Thanks [~bersprockets] Very good find ! Thanks.

[jira] [Commented] (SPARK-24316) Spark sql queries stall for column width more than 6k for parquet based table

2018-09-04 Thread Ruslan Dautkhanov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16603590#comment-16603590 ] Ruslan Dautkhanov commented on SPARK-24316: --- Thanks [~bersprockets]  Is cloudera
