[jira] [Commented] (SPARK-13719) Bad JSON record raises java.lang.ClassCastException

2016-03-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198537#comment-15198537 ] Hyukjin Kwon commented on SPARK-13719: -- Sorry, I will remove the duplicate link because it is a

[jira] [Commented] (SPARK-13953) Support for specifying the field name for corrupted record at JSON datasource.

2016-03-20 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15198636#comment-15198636 ] Hyukjin Kwon commented on SPARK-13953: -- Let me work on this as soon as some PRs that would cause

[jira] [Updated] (SPARK-13953) Support for specifying the field name for corrupted record at JSON datasource.

2016-03-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13953: - Description: It would be great if we maybe set {{spark.sql.columnNameOfCorruptRecord}} via

[jira] [Created] (SPARK-13953) Support for specifying the field name for corrupted record at JSON datasource.

2016-03-19 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-13953: Summary: Support for specifying the field name for corrupted record at JSON datasource. Key: SPARK-13953 URL: https://issues.apache.org/jira/browse/SPARK-13953

[jira] [Comment Edited] (SPARK-13728) Fix ORC PPD

2016-03-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184742#comment-15184742 ] Hyukjin Kwon edited comment on SPARK-13728 at 3/8/16 10:17 AM: --- I see. I

[jira] [Commented] (SPARK-13728) Fix ORC PPD

2016-03-08 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184742#comment-15184742 ] Hyukjin Kwon commented on SPARK-13728: -- I see. I found some clues. It looks

[jira] [Created] (SPARK-13667) Support for specifying custom date format for date and timestamp types

2016-03-03 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-13667: Summary: Support for specifying custom date format for date and timestamp types Key: SPARK-13667 URL: https://issues.apache.org/jira/browse/SPARK-13667 Project:

[jira] [Updated] (SPARK-13638) Support for saving with a quote mode

2016-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13638: - Description: https://github.com/databricks/spark-csv/pull/254 tobithiel reported this. {quote}

[jira] [Created] (SPARK-13638) Support for saving with a quote mode

2016-03-02 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-13638: Summary: Support for saving with a quote mode Key: SPARK-13638 URL: https://issues.apache.org/jira/browse/SPARK-13638 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-13638) Support for saving with a quote mode

2016-03-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13638: - Description: https://github.com/databricks/spark-csv/pull/254 tobithiel reported this. {quote}

[jira] [Updated] (SPARK-13638) Support for saving with a quote mode

2016-03-07 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-13638: - Description: https://github.com/databricks/spark-csv/pull/254 tobithiel reported this. {quote}

[jira] [Comment Edited] (SPARK-13766) Inconsistent file extensions and omitted file extensions written by CSV, TEXT and JSON data sources

2016-03-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186872#comment-15186872 ] Hyukjin Kwon edited comment on SPARK-13766 at 3/9/16 9:59 AM: -- Firstly,

[jira] [Commented] (SPARK-13766) Inconsistent file extensions and omitted file extensions written by CSV, TEXT and JSON data sources

2016-03-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186872#comment-15186872 ] Hyukjin Kwon commented on SPARK-13766: -- Firstly, sorry, I just checked this after creating a PR. I

[jira] [Commented] (SPARK-13766) Inconsistent file extensions and omitted file extensions written by CSV, TEXT and JSON data sources

2016-03-09 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15186890#comment-15186890 ] Hyukjin Kwon commented on SPARK-13766: -- Is there any possibility that the "part-*" files could be

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-03 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223650#comment-15223650 ] Hyukjin Kwon commented on SPARK-14103: -- This issue in Univocity is fixed and they will release

[jira] [Commented] (SPARK-14189) JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType.

2016-03-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15213817#comment-15213817 ] Hyukjin Kwon commented on SPARK-14189: -- I can work on this. > JSON data source infers a field type

[jira] [Updated] (SPARK-14189) JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType.

2016-03-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14189?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14189: - Description: When inferred types in the same field during finding compatible {{DataType}} are

[jira] [Created] (SPARK-14189) JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType.

2016-03-27 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14189: Summary: JSON data source infers a field type as StringType when some are inferred as DecimalType not capable of IntegralType. Key: SPARK-14189 URL:

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225263#comment-15225263 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/5/16 12:45 AM: --- Oh, sorry I

[jira] [Commented] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-03-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215486#comment-15215486 ] Hyukjin Kwon commented on SPARK-14231: -- [~rxin] I can work on this but would you maybe confirm

[jira] [Created] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-03-28 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14231: Summary: JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision. Key: SPARK-14231 URL:

[jira] [Updated] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-03-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14231: - Description: Currently, JSON data source supports {{floatAsBigDecimal}} option, which reads

[jira] [Commented] (SPARK-14260) Increase default value for maxCharsPerColumn

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217785#comment-15217785 ] Hyukjin Kwon commented on SPARK-14260: -- For me, yes, I think the error message should be more

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217853#comment-15217853 ] Hyukjin Kwon commented on SPARK-14103: -- [~shubhanshumis...@gmail.com] I just wonder if I could have

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217850#comment-15217850 ] Hyukjin Kwon commented on SPARK-14103: -- Thank you so much for cutting it short. Currently I'm not too

[jira] [Commented] (SPARK-14260) Increase default value for maxCharsPerColumn

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217870#comment-15217870 ] Hyukjin Kwon commented on SPARK-14260: -- Hm.. wouldn't users set the {{maxCharsPerColumn}} if they

[jira] [Comment Edited] (SPARK-14260) Increase default value for maxCharsPerColumn

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217785#comment-15217785 ] Hyukjin Kwon edited comment on SPARK-14260 at 3/30/16 10:40 AM: For me,

[jira] [Updated] (SPARK-14271) Exposed option but not implemented rowSeparator option.

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14271: - Description: I realised that {{rowSeparator}} option is exposed but it does not actually work

[jira] [Created] (SPARK-14271) Exposed option but not implemented rowSeparator option.

2016-03-30 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14271: Summary: Exposed option but not implemented rowSeparator option. Key: SPARK-14271 URL: https://issues.apache.org/jira/browse/SPARK-14271 Project: Spark

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15218001#comment-15218001 ] Hyukjin Kwon commented on SPARK-14103: -- Thanks for the detailed directions. Fortunately, I think I found

[jira] [Updated] (SPARK-14271) rowSeparator does not work for both reading and writing

2016-03-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14271: - Summary: rowSeparator does not work for both reading and writing (was: Exposed option but not

[jira] [Resolved] (SPARK-14271) rowSeparator does not work for both reading and writing

2016-04-01 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon resolved SPARK-14271. -- Resolution: Invalid Sorry, this option is not exposed but just exists in {{CSVOptions}}. >

[jira] [Commented] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215577#comment-15215577 ] Hyukjin Kwon commented on SPARK-14231: -- Sorry for adding many comments but maybe would this better

[jira] [Commented] (SPARK-14231) JSON data source fails to infer floats as decimal when precision is bigger than 38 or scale is bigger than precision.

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215560#comment-15215560 ] Hyukjin Kwon commented on SPARK-14231: -- I will change the name when I open a PR and yes I think we

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222792#comment-15222792 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/2/16 8:34 AM: --

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222792#comment-15222792 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/2/16 8:35 AM: --

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222792#comment-15222792 ] Hyukjin Kwon commented on SPARK-14103: -- [~shubhanshumis...@gmail.com] Right, it looks like an issue

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222879#comment-15222879 ] Hyukjin Kwon commented on SPARK-14103: -- cc [~falaki] [~r...@databricks.com] > Python DataFrame CSV

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222792#comment-15222792 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/2/16 1:51 PM: --

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222931#comment-15222931 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/2/16 4:22 PM: -- After

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222931#comment-15222931 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/2/16 4:23 PM: -- After

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15222931#comment-15222931 ] Hyukjin Kwon commented on SPARK-14103: -- After thinking further, I realised that this might be a

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225263#comment-15225263 ] Hyukjin Kwon commented on SPARK-14103: -- Oh, sorry I should have mentioned that it reads all the data

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225263#comment-15225263 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/4/16 11:34 PM: --- Oh, sorry I

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225263#comment-15225263 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/4/16 11:37 PM: --- Oh, sorry I

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225305#comment-15225305 ] Hyukjin Kwon commented on SPARK-14103: -- Just to cut it short, the input is being read as a byte

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-04 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225305#comment-15225305 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/4/16 11:52 PM: --- Just to cut

[jira] [Commented] (SPARK-14194) spark csv reader not working properly if CSV content contains CRLF character (newline) in the intermediate cell

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217194#comment-15217194 ] Hyukjin Kwon commented on SPARK-14194: -- Yes. CSV data source internally uses {{TextInputFormat}} and

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217200#comment-15217200 ] Hyukjin Kwon commented on SPARK-14103: -- As [~sowen] said, CRLF is dealt with in {{TextInputFormat}}

[jira] [Issue Comment Deleted] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14103: - Comment: was deleted (was: As [~sowen] said, CRLF is dealt with in {{TextInputFormat}} which

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217198#comment-15217198 ] Hyukjin Kwon commented on SPARK-14103: -- As [~sowen] said, CRLF is dealt with in {{TextInputFormat}}

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217198#comment-15217198 ] Hyukjin Kwon edited comment on SPARK-14103 at 3/30/16 1:30 AM: --- As [~sowen]

[jira] [Issue Comment Deleted] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14103: - Comment: was deleted (was: As [~sowen] said, CRLF is dealt with in {{TextInputFormat}} which

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217199#comment-15217199 ] Hyukjin Kwon commented on SPARK-14103: -- As [~sowen] said, CRLF is dealt with in {{TextInputFormat}}

[jira] [Commented] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217212#comment-15217212 ] Hyukjin Kwon commented on SPARK-14103: -- For long messages, there is a JIRA opened already here,

[jira] [Created] (SPARK-14260) Increase default value for maxCharsPerColumn

2016-03-29 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14260: Summary: Increase default value for maxCharsPerColumn Key: SPARK-14260 URL: https://issues.apache.org/jira/browse/SPARK-14260 Project: Spark Issue Type:

[jira] [Commented] (SPARK-14260) Increase default value for maxCharsPerColumn

2016-03-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15217262#comment-15217262 ] Hyukjin Kwon commented on SPARK-14260: -- I am currently not sure if this value does not affect

[jira] [Updated] (SPARK-14839) Support for other types as option in OPTIONS clause

2016-04-21 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14839: - Description: This was found in https://github.com/apache/spark/pull/12494. Currently, Spark SQL

[jira] [Created] (SPARK-14839) Support for other types as option in OPTIONS clause

2016-04-21 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14839: Summary: Support for other types as option in OPTIONS clause Key: SPARK-14839 URL: https://issues.apache.org/jira/browse/SPARK-14839 Project: Spark Issue

[jira] [Commented] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-29 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263759#comment-15263759 ] Hyukjin Kwon commented on SPARK-14962: -- I see. This was because ORC tries to apply a filter on the

[jira] [Created] (SPARK-14917) Enable some ORC compressions tests for writing

2016-04-26 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-14917: Summary: Enable some ORC compressions tests for writing Key: SPARK-14917 URL: https://issues.apache.org/jira/browse/SPARK-14917 Project: Spark Issue Type:

[jira] [Commented] (SPARK-12143) When column type is binary, select occurs ClassCastExcption in Beeline.

2016-04-27 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-12143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15261366#comment-15261366 ] Hyukjin Kwon commented on SPARK-12143: -- [~srowen] Can I close this? This was resolved by my PR

[jira] [Commented] (SPARK-13425) Documentation for CSV datasource options

2016-04-30 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15265572#comment-15265572 ] Hyukjin Kwon commented on SPARK-13425: -- [~rxin] I will. Thanks! (I believe R one is not yet,

[jira] [Commented] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263560#comment-15263560 ] Hyukjin Kwon commented on SPARK-14962: -- FWIW, I could not reproduce this in master branch. Let me

[jira] [Comment Edited] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263560#comment-15263560 ] Hyukjin Kwon edited comment on SPARK-14962 at 4/29/16 5:27 AM: --- ~~FWIW, I

[jira] [Comment Edited] (SPARK-14103) Python DataFrame CSV load on large file is writing to console in Ipython

2016-04-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15225263#comment-15225263 ] Hyukjin Kwon edited comment on SPARK-14103 at 4/29/16 5:28 AM: --- Oh, sorry I

[jira] [Comment Edited] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15263560#comment-15263560 ] Hyukjin Kwon edited comment on SPARK-14962 at 4/29/16 5:29 AM: --- -FWIW, I

[jira] [Updated] (SPARK-14962) spark.sql.orc.filterPushdown=true breaks DataFrame where functionality

2016-04-28 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-14962: - Affects Version/s: 2.0.0 > spark.sql.orc.filterPushdown=true breaks DataFrame where

[jira] [Comment Edited] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292750#comment-15292750 ] Hyukjin Kwon edited comment on SPARK-15393 at 5/20/16 5:51 AM: ---

[jira] [Comment Edited] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292750#comment-15292750 ] Hyukjin Kwon edited comment on SPARK-15393 at 5/20/16 5:36 AM: ---

[jira] [Commented] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-19 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15292750#comment-15292750 ] Hyukjin Kwon commented on SPARK-15393: -- Hm.. I am trying to reproduce this exception. I added a

[jira] [Commented] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-18 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15290088#comment-15290088 ] Hyukjin Kwon commented on SPARK-15393: -- Oh yes. It seems serious.. it seems that PR is going to be

[jira] [Reopened] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2016-05-18 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-10216: -- I am reopening this because this causes a problem, SPARK-15393. > Avoid creating empty files

[jira] [Created] (SPARK-15475) Add tests for writing and reading back empty data for Parquet, Json and Text data sources

2016-05-22 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15475: Summary: Add tests for writing and reading back empty data for Parquet, Json and Text data sources Key: SPARK-15475 URL: https://issues.apache.org/jira/browse/SPARK-15475

[jira] [Created] (SPARK-15474) ORC data source fails to write and read back empty dataframe

2016-05-22 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15474: Summary: ORC data source fails to write and read back empty dataframe Key: SPARK-15474 URL: https://issues.apache.org/jira/browse/SPARK-15474 Project: Spark

[jira] [Created] (SPARK-15476) Support for reading text data source without specifying schema

2016-05-22 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15476: Summary: Support for reading text data source without specifying schema Key: SPARK-15476 URL: https://issues.apache.org/jira/browse/SPARK-15476 Project: Spark

[jira] [Created] (SPARK-15473) CSV fails to write empty dataframe

2016-05-22 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15473: Summary: CSV fails to write empty dataframe Key: SPARK-15473 URL: https://issues.apache.org/jira/browse/SPARK-15473 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-15473) CSV fails to write empty dataframe

2016-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295509#comment-15295509 ] Hyukjin Kwon commented on SPARK-15473: -- I will work on this. > CSV fails to write empty dataframe >

[jira] [Updated] (SPARK-15473) CSV fails to write and read back empty dataframe

2016-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-15473: - Description: Currently CSV data source fails to write and read empty data. The code below:

[jira] [Closed] (SPARK-15476) Support for reading text data source without specifying schema

2016-05-24 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon closed SPARK-15476. Resolution: Not A Problem I was totally stupid. I was testing {{test}} for {{text}}. Closing this.

[jira] [Commented] (SPARK-15393) Writing empty Dataframes doesn't save any _metadata files

2016-05-22 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295484#comment-15295484 ] Hyukjin Kwon commented on SPARK-15393: -- Ah.. thank you! I think the examples you wrote and I wrote

[jira] [Closed] (SPARK-15325) Replace the usage of deprecated DataSet API in tests

2016-05-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon closed SPARK-15325. Resolution: Duplicate > Replace the usage of deprecated DataSet API in tests >

[jira] [Closed] (SPARK-15325) Replace the usage of deprecated DataSet API in tests

2016-05-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon closed SPARK-15325. Resolution: Fixed It seems {{unionAll()}} is cleaned up and it seems this duplicates another. So

[jira] [Reopened] (SPARK-15325) Replace the usage of deprecated DataSet API in tests

2016-05-15 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon reopened SPARK-15325: -- > Replace the usage of deprecated DataSet API in tests >

[jira] [Created] (SPARK-15325) Replace the usage of deprecated DataSet API in tests

2016-05-14 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15325: Summary: Replace the usage of deprecated DataSet API in tests Key: SPARK-15325 URL: https://issues.apache.org/jira/browse/SPARK-15325 Project: Spark Issue

[jira] [Commented] (SPARK-15325) Replace the usage of deprecated DataSet API in tests

2016-05-14 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15283507#comment-15283507 ] Hyukjin Kwon commented on SPARK-15325: -- I will work on this. > Replace the usage of deprecated

[jira] [Commented] (SPARK-15266) Use SparkSession instead of SQLContext in Python tests

2016-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279575#comment-15279575 ] Hyukjin Kwon commented on SPARK-15266: -- I will work on this. > Use SparkSession instead of

[jira] [Updated] (SPARK-15266) Use SparkSession instead of SQLContext in Python tests

2016-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-15266: - Affects Version/s: 2.0.0 > Use SparkSession instead of SQLContext in Python tests >

[jira] [Created] (SPARK-15266) Use SparkSession instead of SQLContext in Python tests

2016-05-10 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15266: Summary: Use SparkSession instead of SQLContext in Python tests Key: SPARK-15266 URL: https://issues.apache.org/jira/browse/SPARK-15266 Project: Spark Issue

[jira] [Closed] (SPARK-15266) Use SparkSession instead of SQLContext in Python tests

2016-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon closed SPARK-15266. Resolution: Duplicate > Use SparkSession instead of SQLContext in Python tests >

[jira] [Commented] (SPARK-15266) Use SparkSession instead of SQLContext in Python tests

2016-05-10 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279591#comment-15279591 ] Hyukjin Kwon commented on SPARK-15266: -- Oh, sorry. I could not find. Thank you for correcting me. >

[jira] [Commented] (SPARK-15267) Refactor and add some classes for options in datasources like CSVOptions or JSONOptions

2016-05-11 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15279635#comment-15279635 ] Hyukjin Kwon commented on SPARK-15267: -- I will work on this. > Refactor and add some classes for

[jira] [Created] (SPARK-15267) Refactor and add some classes for options in datasources like CSVOptions or JSONOptions

2016-05-11 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15267: Summary: Refactor and add some classes for options in datasources like CSVOptions or JSONOptions Key: SPARK-15267 URL: https://issues.apache.org/jira/browse/SPARK-15267

[jira] [Updated] (SPARK-10216) Avoid creating empty files during overwrite into Hive table with group by query

2016-05-02 Thread Hyukjin Kwon (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-10216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-10216: - Affects Version/s: 2.0.0 > Avoid creating empty files during overwrite into Hive table with

[jira] [Created] (SPARK-15198) Support for filter push down for boolean types in ORC

2016-05-06 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15198: Summary: Support for filter push down for boolean types in ORC Key: SPARK-15198 URL: https://issues.apache.org/jira/browse/SPARK-15198 Project: Spark Issue

[jira] [Created] (SPARK-15144) option nullValue for CSV data source not working for several types.

2016-05-04 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15144: Summary: option nullValue for CSV data source not working for several types. Key: SPARK-15144 URL: https://issues.apache.org/jira/browse/SPARK-15144 Project: Spark

[jira] [Created] (SPARK-15143) CSV data source is not being tested as HadoopFsRelation

2016-05-04 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15143: Summary: CSV data source is not being tested as HadoopFsRelation Key: SPARK-15143 URL: https://issues.apache.org/jira/browse/SPARK-15143 Project: Spark

[jira] [Created] (SPARK-15148) Upgrade Univocity library from 2.0.2 to 2.1.0

2016-05-05 Thread Hyukjin Kwon (JIRA)
Hyukjin Kwon created SPARK-15148: Summary: Upgrade Univocity library from 2.0.2 to 2.1.0 Key: SPARK-15148 URL: https://issues.apache.org/jira/browse/SPARK-15148 Project: Spark Issue Type:
