[jira] [Commented] (SPARK-16371) IS NOT NULL clause gives false for nested not empty column

2016-07-05 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363757#comment-15363757 ] Hyukjin Kwon commented on SPARK-16371: -- Here is shorter code {code} from pyspark.sql.functions

[jira] [Commented] (SPARK-16316) dataframe except API returning wrong result in spark 1.5.0

2016-07-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365811#comment-15365811 ] Hyukjin Kwon commented on SPARK-16316: -- also confirm it works in 1.6.1 {code} scala> val dfa =

[jira] [Commented] (SPARK-16316) dataframe except API returning wrong result in spark 1.5.0

2016-07-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15365810#comment-15365810 ] Hyukjin Kwon commented on SPARK-16316: -- I confirm this works fine in current master {code} scala>

[jira] [Created] (SPARK-16461) Support partition batch pruning with `<=>` (EqualNullSafe) predicate in InMemoryTableScanExec

2016-07-09 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16461: Summary: Support partition batch pruning with `<=>` (EqualNullSafe) predicate in InMemoryTableScanExec Key: SPARK-16461 URL: https://issues.apache.org/jira/browse/SPARK-16461

[jira] [Commented] (SPARK-15144) option nullValue for CSV data source not working for several types.

2016-07-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-15144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15369124#comment-15369124 ] Hyukjin Kwon commented on SPARK-15144: -- Sorry, I missed this comment. It seems SPARK-16460 has a

[jira] [Created] (SPARK-16472) Inconsistent nullability in schema after being read in SQL API.

2016-07-10 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16472: Summary: Inconsistent nullability in schema after being read in SQL API. Key: SPARK-16472 URL: https://issues.apache.org/jira/browse/SPARK-16472 Project: Spark

[jira] [Updated] (SPARK-16472) Inconsistent nullability in schema after being read in SQL API.

2016-07-10 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16472: - Priority: Minor (was: Major) > Inconsistent nullability in schema after being read in SQL API.

[jira] [Updated] (SPARK-16472) Inconsistent nullability in schema after being read in SQL API.

2016-07-10 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16472: - Description: It seems the data sources implementing {{FileFormat}} load the data by

[jira] [Created] (SPARK-16434) Avoid record-per type dispatch in JSON when reading

2016-07-07 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16434: Summary: Avoid record-per type dispatch in JSON when reading Key: SPARK-16434 URL: https://issues.apache.org/jira/browse/SPARK-16434 Project: Spark Issue

[jira] [Commented] (SPARK-16371) IS NOT NULL clause gives false for nested not empty column

2016-07-05 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15363768#comment-15363768 ] Hyukjin Kwon commented on SPARK-16371: -- Sorry for being noisy, here is the Scala version {code}

[jira] [Commented] (SPARK-16371) IS NOT NULL clause gives false for nested not empty column

2016-07-06 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15364049#comment-15364049 ] Hyukjin Kwon commented on SPARK-16371: -- Right, it seems https://github.com/apache/spark/pull/9940

[jira] [Updated] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16896: - Component/s: SQL > Loading csv with duplicate column names >

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408715#comment-15408715 ] Hyukjin Kwon commented on SPARK-16896: -- I agree with appending a number to the duplicated column

[jira] [Created] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16908: Summary: Java code style guideline documentation Key: SPARK-16908 URL: https://issues.apache.org/jira/browse/SPARK-16908 Project: Spark Issue Type:

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408735#comment-15408735 ] Hyukjin Kwon commented on SPARK-16903: -- Oh, I thought we should apply {{nullValue}} to all types

[jira] [Comment Edited] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408704#comment-15408704 ] Hyukjin Kwon edited comment on SPARK-16903 at 8/5/16 12:44 AM: --- Hi

[jira] [Commented] (SPARK-16903) nullValue in first field is not respected by CSV source when read

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408704#comment-15408704 ] Hyukjin Kwon commented on SPARK-16903: -- Hi [~falaki], is this about SPARK-16462, SPARK-16460 and

[jira] [Comment Edited] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408772#comment-15408772 ] Hyukjin Kwon edited comment on SPARK-16908 at 8/5/16 2:12 AM: -- Could you

[jira] [Commented] (SPARK-16908) Java code style guideline documentation

2016-08-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15408772#comment-15408772 ] Hyukjin Kwon commented on SPARK-16908: -- Could you please take a look [~srowen]? > Java code style

[jira] [Commented] (SPARK-16908) Java code style guideline documentation

2016-08-05 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16908?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15409159#comment-15409159 ] Hyukjin Kwon commented on SPARK-16908: -- Thanks! Yes. I just want to be sure on Java style. > Java

[jira] [Created] (SPARK-16960) Deprecate approxCountDistinct, toDegrees and toRadians according to FunctionRegistry in Scala and Python

2016-08-08 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16960: Summary: Deprecate approxCountDistinct, toDegrees and toRadians according to FunctionRegistry in Scala and Python Key: SPARK-16960 URL:

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413310#comment-15413310 ] Hyukjin Kwon commented on SPARK-16896: -- [~nlauchande] Just FYI, the actual code that needs to be

[jira] [Comment Edited] (SPARK-16896) Loading csv with duplicate column names

2016-08-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413310#comment-15413310 ] Hyukjin Kwon edited comment on SPARK-16896 at 8/9/16 9:57 AM: -- [~nlauchande]

[jira] [Comment Edited] (SPARK-16896) Loading csv with duplicate column names

2016-08-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413271#comment-15413271 ] Hyukjin Kwon edited comment on SPARK-16896 at 8/9/16 9:47 AM: -- [~srowen] Oh,

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413038#comment-15413038 ] Hyukjin Kwon commented on SPARK-16896: -- I don't mind if you go ahead (I was looking at this problem

[jira] [Created] (SPARK-16971) Strip trailing zeros for decimals when using show() API in Dataset

2016-08-09 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16971: Summary: Strip trailing zeros for decimals when using show() API in Dataset Key: SPARK-16971 URL: https://issues.apache.org/jira/browse/SPARK-16971 Project: Spark

[jira] [Commented] (SPARK-16896) Loading csv with duplicate column names

2016-08-09 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15413271#comment-15413271 ] Hyukjin Kwon commented on SPARK-16896: -- Sean, If my understanding is correct, we have tried to

[jira] [Commented] (SPARK-16918) Weird error when selecting more than 100 spark udf columns

2016-08-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411282#comment-15411282 ] Hyukjin Kwon commented on SPARK-16918: -- FYI, it seems fine in current master fortunately. {code}

[jira] [Comment Edited] (SPARK-16918) Weird error when selecting more than 100 spark udf columns

2016-08-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411282#comment-15411282 ] Hyukjin Kwon edited comment on SPARK-16918 at 8/8/16 5:14 AM: -- FYI, it seems

[jira] [Commented] (SPARK-16918) Weird error when selecting more than 100 spark udf columns

2016-08-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16918?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15411300#comment-15411300 ] Hyukjin Kwon commented on SPARK-16918: -- Oh, thanks for pointing this out. I just did this with 101.

[jira] [Comment Edited] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403376#comment-15403376 ] Hyukjin Kwon edited comment on SPARK-16842 at 8/2/16 5:21 AM: -- hm.. don't we

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403376#comment-15403376 ] Hyukjin Kwon commented on SPARK-16842: -- hm.. don't we make a connection and then run a query to

[jira] [Commented] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-02 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15403573#comment-15403573 ] Hyukjin Kwon commented on SPARK-16842: -- Thank you both for the great explanations. I will close this as

[jira] [Closed] (SPARK-16842) Concern about disallowing user-given schema for Parquet and ORC

2016-08-02 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16842?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon closed SPARK-16842. Resolution: Not A Problem > Concern about disallowing user-given schema for Parquet and ORC >

[jira] [Commented] (SPARK-16646) LEAST doesn't accept numeric arguments with different data types

2016-07-22 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389273#comment-15389273 ] Hyukjin Kwon commented on SPARK-16646: -- It seems to be basically a comparison between numbers and decimals,

[jira] [Commented] (SPARK-16646) LEAST doesn't accept numeric arguments with different data types

2016-07-22 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15389276#comment-15389276 ] Hyukjin Kwon commented on SPARK-16646: -- [~lian cheng] Should we also follow this? I will follow your

[jira] [Commented] (SPARK-16877) Add a rule for preventing use Java's Override annotation

2016-08-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15405860#comment-15405860 ] Hyukjin Kwon commented on SPARK-16877: -- I will work on this but take a look into this with some

[jira] [Updated] (SPARK-16877) Add a rule for preventing use Java's Override annotation

2016-08-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-16877: - Priority: Minor (was: Major) > Add a rule for preventing use Java's Override annotation >

[jira] [Created] (SPARK-16351) Avoid record-per type dispatch in JSON when writing

2016-07-02 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16351: Summary: Avoid record-per type dispatch in JSON when writing Key: SPARK-16351 URL: https://issues.apache.org/jira/browse/SPARK-16351 Project: Spark Issue

[jira] [Created] (SPARK-16356) Add testImplicits for ML unit tests and promote toDF()

2016-07-03 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-16356: Summary: Add testImplicits for ML unit tests and promote toDF() Key: SPARK-16356 URL: https://issues.apache.org/jira/browse/SPARK-16356 Project: Spark Issue

[jira] [Created] (SPARK-17071) Fetch Parquet schema within driver-side when there is single file to touch without another Spark job

2016-08-15 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-17071: Summary: Fetch Parquet schema within driver-side when there is single file to touch without another Spark job Key: SPARK-17071 URL:

[jira] [Created] (SPARK-19435) Type coercion between ArrayTypes

2017-02-02 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-19435: Summary: Type coercion between ArrayTypes Key: SPARK-19435 URL: https://issues.apache.org/jira/browse/SPARK-19435 Project: Spark Issue Type: Improvement

[jira] (SPARK-6307) Executers fetches the same rdd-block 100's or 1000's of times

2017-01-30 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon resolved as Not A Problem

[jira] [Resolved] (SPARK-7051) Support Compression write for Parquet

2017-02-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-7051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-7051. - Resolution: Duplicate I am resolving it because it seems the JIRA itself is a

[jira] [Resolved] (SPARK-8676) After TGT expires, Thrift Server get "No valid credentials provided" exception

2017-02-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-8676. - Resolution: Later I am resolving this JIRA per

[jira] [Commented] (SPARK-11072) simplify self join handling

2017-02-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848329#comment-15848329 ] Hyukjin Kwon commented on SPARK-11072: -- Ah, [~cloud_fan], I just wonder if this JIRA is resolvable

[jira] [Commented] (SPARK-13678) transformExpressions should only apply on QueryPlan.expressions

2017-02-01 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-13678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15848345#comment-15848345 ] Hyukjin Kwon commented on SPARK-13678: -- [~cloud_fan], then, do SPARK-13694 and SPARK-13651 fix this

[jira] [Resolved] (SPARK-19442) Unable to add column to the dataset using Dataset.WithColumn() api

2017-02-06 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19442. -- Resolution: Cannot Reproduce I am resolving this as I can't reproduce in the current master as

[jira] [Commented] (SPARK-19439) PySpark's registerJavaFunction Should Support UDAFs

2017-02-06 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854034#comment-15854034 ] Hyukjin Kwon commented on SPARK-19439: -- So, as you said, is this a duplicate of SPARK-10915? If so,

[jira] [Resolved] (SPARK-19440) Window in pyspark doesn't have attributes unboundedPreceding, unboundedFollowing and currentRow

2017-02-06 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19440?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19440. -- Resolution: Invalid It seems they are there, as shown below: {code} >>> from pyspark.sql import Window

[jira] [Commented] (SPARK-19474) SparkSQL unsupports to change hive table's name\dataType

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855541#comment-15855541 ] Hyukjin Kwon commented on SPARK-19474: -- It seems a duplicate of SPARK-18893. Please reopen this if I

[jira] [Resolved] (SPARK-19474) SparkSQL unsupports to change hive table's name\dataType

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-19474. -- Resolution: Duplicate > SparkSQL unsupports to change hive table's name\dataType >

[jira] [Commented] (SPARK-14352) approxQuantile should support multi columns

2017-02-02 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-14352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15849889#comment-15849889 ] Hyukjin Kwon commented on SPARK-14352: -- Oh, [~holdenk], it seems this JIRA is resolvable as we have

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857554#comment-15857554 ] Hyukjin Kwon commented on SPARK-19496: -- Yea, thank you for the explanation. I was just curious so tested

[jira] [Commented] (SPARK-19474) SparkSQL unsupports to change hive table's name\dataType

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857382#comment-15857382 ] Hyukjin Kwon commented on SPARK-19474: -- I have no information about this but I am less sure given

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon commented on SPARK-19496: -- - Hive {code} hive> SELECT to_date('2014-31-12');

[jira] [Commented] (SPARK-19507) pyspark.sql.types._verify_type() exceptions too broad to debug collections or nested data

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857423#comment-15857423 ] Hyukjin Kwon commented on SPARK-19507: -- That is, as you said, private with the underscore prefix

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:12 AM: -- - Hive

[jira] [Commented] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857473#comment-15857473 ] Hyukjin Kwon commented on SPARK-19496: -- Oh, yes. I just found and updated my comment. > to_date

[jira] [Comment Edited] (SPARK-19496) to_date with format has weird behavior

2017-02-07 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857459#comment-15857459 ] Hyukjin Kwon edited comment on SPARK-19496 at 2/8/17 6:16 AM: -- - Hive

[jira] [Commented] (SPARK-19121) No need to refresh metadata cache for non-partitioned Hive tables

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852868#comment-15852868 ] Hyukjin Kwon commented on SPARK-19121: -- (Just as a kind reminder purely just to inform, maybe it

[jira] [Commented] (SPARK-17208) Build Outer Join Test Cases in File-based Testing Framework

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852857#comment-15852857 ] Hyukjin Kwon commented on SPARK-17208: -- [~smilegator], would this JIRA maybe be resolvable per

[jira] [Commented] (SPARK-12256) [SQL] Code refactoring: naming boolean variables

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-12256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852863#comment-15852863 ] Hyukjin Kwon commented on SPARK-12256: -- (maybe this one might be resolvable per the discussion in

[jira] [Resolved] (SPARK-17477) SparkSQL cannot handle schema evolution from Int -> Long when parquet files have Int as its type while hive metastore has Long as its type

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17477. -- Resolution: Duplicate > SparkSQL cannot handle schema evolution from Int -> Long when parquet

[jira] [Commented] (SPARK-17581) Invalidate Statistics After Some ALTER TABLE Commands

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852860#comment-15852860 ] Hyukjin Kwon commented on SPARK-17581: -- (maybe it is worth double-checking whether it is resolvable per

[jira] [Commented] (SPARK-19428) Ability to select first row of groupby

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852893#comment-15852893 ] Hyukjin Kwon commented on SPARK-19428: -- [~lminer], another workaround might be (with {{from

[jira] [Commented] (SPARK-19428) Ability to select first row of groupby

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852895#comment-15852895 ] Hyukjin Kwon commented on SPARK-19428: -- Oh, the other comments show up later in my browser. Please

[jira] [Comment Edited] (SPARK-19428) Ability to select first row of groupby

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852895#comment-15852895 ] Hyukjin Kwon edited comment on SPARK-19428 at 2/4/17 6:38 PM: -- Oh, the other

[jira] [Commented] (SPARK-19428) Ability to select first row of groupby

2017-02-04 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15852894#comment-15852894 ] Hyukjin Kwon commented on SPARK-19428: -- (FWIW, I think he meant... {code} >>> df =

[jira] [Commented] (SPARK-16200) Rename AggregateFunction#supportsPartial

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851474#comment-15851474 ] Hyukjin Kwon commented on SPARK-16200: -- (maybe it seems good to double-check this one too per

[jira] [Created] (SPARK-19446) Remove unused findTightestCommonType in TypeCoercion

2017-02-03 Thread Hyukjin Kwon (JIRA)

Hyukjin Kwon created SPARK-19446: Summary: Remove unused findTightestCommonType in TypeCoercion Key: SPARK-19446 URL: https://issues.apache.org/jira/browse/SPARK-19446 Project: Spark Issue

[jira] [Commented] (SPARK-15180) Support subexpression elimination in Fliter

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-15180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851447#comment-15851447 ] Hyukjin Kwon commented on SPARK-15180: -- Ah, [~viirya], I just happened to see this JIRA. Should we

[jira] [Commented] (SPARK-16041) Disallow Duplicate Columns in `partitionBy`, `bucketBy` and `sortBy`

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851468#comment-15851468 ] Hyukjin Kwon commented on SPARK-16041: -- [~smilegator], I just happened to see this JIRA while

[jira] [Commented] (SPARK-16043) Prepare GenericArrayData implementation specialized for a primitive array

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16043?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851471#comment-15851471 ] Hyukjin Kwon commented on SPARK-16043: -- (maybe it would be great if this one is checked too per

[jira] [Resolved] (SPARK-15911) Remove additional Project to be consistent with SQL when insert into table

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-15911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-15911. -- Resolution: Duplicate I am resolving this per

[jira] [Commented] (SPARK-16042) Eliminate nullcheck code at projection for an array type

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851469#comment-15851469 ] Hyukjin Kwon commented on SPARK-16042: -- [~kiszk], would this JIRA maybe be resolvable per

[jira] [Commented] (SPARK-16094) Support HashAggregateExec for non-partial aggregates

2017-02-03 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15851473#comment-15851473 ] Hyukjin Kwon commented on SPARK-16094: -- [~maropu], I just happened to see this JIRA. Maybe would

[jira] [Commented] (SPARK-19109) ORC metadata section can sometimes exceed protobuf message size limit

2017-01-22 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15833870#comment-15833870 ] Hyukjin Kwon commented on SPARK-19109: -- It seems this JIRA describes upgrading the version of Hive

[jira] [Commented] (SPARK-10924) Failed to update accumulators for ShuffleMapTask: Broken pipe

2017-01-24 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-10924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15837249#comment-15837249 ] Hyukjin Kwon commented on SPARK-10924: -- [~ptallada], Would it be possible to provide a

[jira] [Resolved] (SPARK-11087) spark.sql.orc.filterPushdown does not work, No ORC pushdown predicate

2017-01-25 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-11087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-11087. -- Resolution: Cannot Reproduce {code} hive> create table people(name string, address string,

[jira] [Commented] (SPARK-5786) Documentation of Narrow Dependencies

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-5786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839788#comment-15839788 ] Hyukjin Kwon commented on SPARK-5786: - It seems they are documented, at least, in API docs, e.g.,

[jira] [Resolved] (SPARK-2687) after receving allocated containers,amClient should remove ContainerRequest.

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-2687. - Resolution: Duplicate I am resolving this per

[jira] [Resolved] (SPARK-17734) inner equi-join shorthand that returns Datasets, like DataFrame already has

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-17734?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-17734. -- Resolution: Won't Fix We have {{joinWith}} to return {{Dataset}}. Also, we have {{join}} and

[jira] [Commented] (SPARK-18839) Executor is active on web, but actually is dead

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-18839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839719#comment-15839719 ] Hyukjin Kwon commented on SPARK-18839: -- [~uncleGen] Would you mind elaborating why you think it is

[jira] [Commented] (SPARK-18579) spark-csv strips whitespace (pyspark)

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-18579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15839704#comment-15839704 ] Hyukjin Kwon commented on SPARK-18579: -- Can we just strip them within the dataframe/dataset?

[jira] [Commented] (SPARK-14480) Remove meaningless StringIteratorReader for CSV data source for better performance

2017-01-26 Thread Hyukjin Kwon (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-14480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15840872#comment-15840872 ] Hyukjin Kwon commented on SPARK-14480: -- The removed `StringIteratorReader` concatenates the lines in