[GitHub] spark pull request: [SPARK-13528][SQL] Make the short names of com...

2016-03-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11408#issuecomment-191177389 If the consistent short names infer only lower-case names, then I think we can just leave them as corrected here in a way. --- If your project is set up

[GitHub] spark pull request: [SPARK-13528][SQL] Make the short names of com...

2016-03-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11408#issuecomment-191167940 One more thing: I am not sure whether we then need `uncompressed` or `none` as short names for the JSON, CSV and TEXT data sources to explicitly set no compression

[GitHub] spark pull request: [SPARK-13528][SQL] Make the short names of com...

2016-03-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11408#issuecomment-191157997 One thing I am not sure about here: `DeflateCodec` is basically zlib as far as I remember, but `.deflate` is Hadoop's convention. For example, for ORC, it uses `ZLIB
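The remark that `DeflateCodec` is basically zlib can be checked at the format level. The sketch below is only an illustration using Python's standard `zlib` module, not Hadoop or Spark code: it assumes nothing beyond the zlib container format, in which a raw DEFLATE stream is wrapped in a 2-byte header and a 4-byte Adler-32 trailer. The `payload` value is made up for the example.

```python
import zlib

# Arbitrary sample data (hypothetical, for illustration only).
payload = b"spark " * 100

# zlib.compress emits a zlib stream:
# 2-byte header + raw DEFLATE data + 4-byte Adler-32 trailer.
zlib_stream = zlib.compress(payload)

# Strip the wrapper and decode the middle part as raw DEFLATE.
# A negative wbits value tells zlib to expect no header or trailer.
raw_deflate = zlib_stream[2:-4]
restored = zlib.decompress(raw_deflate, wbits=-15)

# The trailer is the big-endian Adler-32 checksum of the uncompressed data.
trailer_checksum = int.from_bytes(zlib_stream[-4:], "big")
```

Since `restored` equals `payload`, the "zlib" and "deflate" names here describe the same compressed data with and without the wrapper, which is why the naming question in the comment is about convention rather than about a different algorithm.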

[GitHub] spark pull request: [SPARK-13766][SQL] Consistent file extensions ...

2016-03-09 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11604#issuecomment-194212135 cc @rxin

[GitHub] spark pull request: [SPARK-13766][SQL] Consistent file extensions ...

2016-03-09 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/11604 [SPARK-13766][SQL] Consistent file extensions for files written by internal data sources ## What changes were proposed in this pull request? This PR makes the file extensions (written

[GitHub] spark pull request: [SPARK-14189][SQL] JSON data sources find comp...

2016-04-03 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11993#issuecomment-204897624 @davies Could you maybe take a look at this one please? It is similar to https://github.com/apache/spark/pull/12030.

[GitHub] spark pull request: [SPARK-14231][SQL] JSON data source infers flo...

2016-04-02 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12030#issuecomment-204864387 @davies Actually, I am not sure if I understood the last comment correctly. Would you check the tests please? I added some tests for both merged types

[GitHub] spark pull request: [SPARK-14189][SQL] JSON data sources find comp...

2016-03-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11993#issuecomment-202239591 cc @rxin

[GitHub] spark pull request: [SPARK-13866][SQL] Handle decimal type in CSV ...

2016-03-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11724#issuecomment-202219010 @falaki Could you take a look at this please?

[GitHub] spark pull request: [SPARK-14189][SQL] JSON data sources find comp...

2016-03-27 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/11993 [SPARK-14189][SQL] JSON data sources find compatible types even if inferred decimal type is not capable of the others ## What changes were proposed in this pull request? When inferred

[GitHub] spark pull request: [SPARK-14143] Options for parsing NaNs, Infini...

2016-03-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11947#issuecomment-202682552 As for the code, overall it looks good to me. However, I am not used to, and do not have a lot of experience with, dealing with `NaN`, `Inf` or `-Inf`. If the values can

[GitHub] spark pull request: [SPARK-14143] Options for parsing NaNs, Infini...

2016-03-28 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11947#issuecomment-202711498 I found that both `NaN` and `Infinity` are handled in the JSON data source, and it was fixed in this PR: https://github.com/apache/spark/commit

[GitHub] spark pull request: [SPARK-14231][SQL] JSON data source infers flo...

2016-03-30 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12030#issuecomment-203325658 @rxin Could you maybe merge this if everything is okay?

[GitHub] spark pull request: [SPARK-14143] Options for parsing NaNs, Infini...

2016-03-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/11947#discussion_r57813530 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala --- @@ -177,35 +177,57 @@ private[csv] object

[GitHub] spark pull request: [MINOR][SQL] Fix comments style and corrects s...

2016-04-01 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12109 [MINOR][SQL] Fix comments style and corrects several styles in CSV data source ## What changes were proposed in this pull request? While trying to create a PR (which was not an issue

[GitHub] spark pull request: [SPARK-13108][SQL] Support for ascii compatibl...

2016-04-01 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11016#issuecomment-204346308 Sorry, this will not work for Windows (`\r\n`). I am closing this. If you intend to just block non-ascii compatible encodings, then I will create a new PR
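The PR's code is not shown in this snippet, so the following only sketches the general pitfall the comment names, under the assumption that records are separated by looking for `\n` alone. Both helper functions and the sample bytes are hypothetical.

```python
def split_on_lf(data):
    """Split records on LF only (hypothetical helper): CR stays attached."""
    return data.split(b"\n")


def split_on_any_newline(data):
    """Split records on LF, CR, or CRLF via bytes.splitlines()."""
    return data.splitlines()


# A small CSV file with Windows line endings (made-up sample).
windows_file = b"a,b\r\n1,2\r\n"

lf_only = split_on_lf(windows_file)
any_newline = split_on_any_newline(windows_file)
```

With LF-only splitting, every record keeps a trailing `\r` and an empty record appears after the final newline, so the last field of each row would no longer compare equal to the expected value.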

[GitHub] spark pull request: [SPARK-13108][SQL] Support for ascii compatibl...

2016-04-01 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/11016

[GitHub] spark pull request: [SPARK-14231][SQL] JSON data source infers flo...

2016-03-29 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12030 [SPARK-14231][SQL] JSON data source infers floating-point values as a double when they do not fit in a decimal ## What changes were proposed in this pull request? https
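The behaviour named in the title above can be sketched roughly as follows. This is an illustration in Python rather than the PR's Scala code: the helper `infer_float_type` and its string return values are hypothetical, and the only outside fact assumed is that Spark's `DecimalType` supports at most 38 digits of precision.

```python
from decimal import Decimal

# Spark's DecimalType allows at most 38 digits of precision.
MAX_PRECISION = 38


def infer_float_type(literal):
    """Pick a type name for a finite JSON floating-point literal.

    Hypothetical helper: returns "decimal(p,s)" when the value fits in
    MAX_PRECISION digits, and "double" otherwise.
    """
    _sign, digits, exponent = Decimal(literal).as_tuple()
    if exponent >= 0:
        # Values such as 1e3 have no fractional digits.
        scale = 0
        precision = len(digits) + exponent
    else:
        scale = -exponent
        # A value such as 0.01 has one digit but needs precision >= scale.
        precision = max(len(digits), scale)
    if precision <= MAX_PRECISION:
        return f"decimal({precision},{scale})"
    return "double"
```

For example, `infer_float_type("10.1")` yields `decimal(3,1)`, while a literal with 41 significant digits falls back to `double`.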

[GitHub] spark pull request: [SPARK-14231][SQL] JSON data source infers flo...

2016-03-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12030#issuecomment-202775452 cc @rxin

[GitHub] spark pull request: [SPARK-13866][SQL] Handle decimal type in CSV ...

2016-03-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11724#issuecomment-202220872 The commits I just added include the behaviour below: Inferring Types - `DecimalType` is tried first. So, `10.1` (scale < precis

[GitHub] spark pull request: [MINOR][SQL] Fix comments styl and correct sev...

2016-04-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12109#discussion_r58288525 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -42,12 +42,12 @@ object CSVRelation extends

[GitHub] spark pull request: [MINOR][SQL] Fix comments styl and correct sev...

2016-04-01 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12109#discussion_r58288165 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -42,12 +42,12 @@ object CSVRelation extends

[GitHub] spark pull request: [SPARK-14231][SQL] JSON data source infers flo...

2016-03-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12030#issuecomment-203175499 @rxin I just replaced the use of `Try` with `try/catch` and added the case you mentioned.

[GitHub] spark pull request: [SPARK-14596][SQL] Remove not used SqlNewHadoo...

2016-04-13 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12354#issuecomment-209730735 retest this please

[GitHub] spark pull request: [MINOR][SQL] Remove extra anonymous closure wi...

2016-04-14 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12382#issuecomment-209779860 retest this please

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12629#issuecomment-213656738 cc @davies @viirya

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-04-22 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12629 [SPARK-14800][SQL] Dealing with null as a value in options for each internal data source ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-213658002 @rxin Could you please review this?

[GitHub] spark pull request: [SPARK-8603] [sparkR] In windows, Incorrect fi...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/7025#issuecomment-213659474 @JoshRosen I can submit a PR based on this if you think this PR is abandoned.

[GitHub] spark pull request: [SPARK-8603] [sparkR] In windows, Incorrect fi...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/7025#issuecomment-213659754 ping @prakashpc

[GitHub] spark pull request: [SPARK-9314] [EC2] add root EBS config options...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/7647#issuecomment-213659877 ping @kmaehashi

[GitHub] spark pull request: [SPARK-11102] [SQL] Uninformative exception wh...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/9490#issuecomment-213668158 ping @zjffdu

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-04-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12629#issuecomment-214554156 Could you review this please? @davies and @viirya

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-25 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-214554406 ping @rxin

[GitHub] spark pull request: [DOCS][MINOR] Accumulators

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12569#issuecomment-213280716 @jaceklaskowski Because I thought it is obviously not clear. To me it sounds like adding whole documents for Accumulators. As you just said, I think "

[GitHub] spark pull request: [SPARK-12355][SQL] Implement unhandledFilter i...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10502#issuecomment-213244434 Let me close this for now. Please let me know although this is closed. I will reopen this when I start to work on this again.

[GitHub] spark pull request: [MINOR][MLLIB] Public visibility for eval metr...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11219#issuecomment-213243550 @tanwanirahul Could you fill in the PR description with the purpose please?

[GitHub] spark pull request: [SPARK-14521][SQL]StackOverflowError in Kryo w...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12598#discussion_r60693434 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -324,8 +324,8 @@ private[joins] object

[GitHub] spark pull request: [SPARK-14521][SQL]StackOverflowError in Kryo w...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12598#discussion_r60693402 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashedRelation.scala --- @@ -325,7 +325,7 @@ private[joins] object

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213297906 @JustinPihony I think we haven't reached a conclusion yet and haven't got any feedback from committers on whether we should deprecate `read.jdbc()` or support

[GitHub] spark pull request: [SPARK-13266][SQL] None read/writer options we...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12494#issuecomment-213264603 Filed in [SPARK-14839](https://issues.apache.org/jira/browse/SPARK-14839). I might be able to give it a shot **if** nobody else gives it a try.

[GitHub] spark pull request: [DOCS][MINOR] Accumulators

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12569#issuecomment-213282258 @jaceklaskowski Would you accept my PR if I fix a bug in datasource in Spark SQL and I name it as "datasource"?

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213298866 I think `Additional details` could go in comments rather than in the PR description, because the PR description describes what the PR is. Maybe `Additional details

[GitHub] spark pull request: [SPARK-12355][SQL] Implement unhandledFilter i...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon closed the pull request at: https://github.com/apache/spark/pull/10502

[GitHub] spark pull request: [STREAMING][DOCS] Fixes and code improvements ...

2016-04-21 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11201#issuecomment-213243682 ping @jaceklaskowski

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213397222 Question: I found a [PPT](http://www.slideshare.net/SparkSummit/structuring-spark-dataframes-datasets-and-streaming-by-michael-armbrust) which I think is used

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213382065 BTW, it looks like it is pretty common that `Properties` just works like `HashMap[String, String]` in most cases. Firstly, I just checked [java.sql.Driver API

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12601#issuecomment-213382156 I think I can rework based on this because it is already opened anyway. Excuse my ping @rxin

[GitHub] spark pull request: [SPARK-14525][SQL] Make DataFrameWrite.save wo...

2016-04-22 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12601#discussion_r60721164 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala --- @@ -244,7 +244,11 @@ final class DataFrameWriter private[sql](df

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-214925695 @rxin No problem. Let me just rebase it if it has conflicts anyway. It is easier to track the changes.

[GitHub] spark pull request: [SPARK-14962][SQL] Do not push down isnotnull/...

2016-04-29 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12777 [SPARK-14962][SQL] Do not push down isnotnull/isnull on unsupported types. ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14962

[GitHub] spark pull request: [SPARK-14962][SQL] Do not push down isnotnull/...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12777#issuecomment-215692521 @liancheng @yhuai Could you please take a look?

[GitHub] spark pull request: [SPARK-14962][SQL] Do not push down isnotnull/...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12777#discussion_r61565675 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcFilters.scala --- @@ -56,29 +55,35 @@ import org.apache.spark.sql.sources

[GitHub] spark pull request: [SPARK-14962][SQL] Do not push down isnotnull/...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12777#issuecomment-215697713 BTW, while doing this, I realised there are an unused class and functions due to the change of `HadoopFsRelation`. - The class `OrcTableScan` is not used

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12699#issuecomment-214705599 BTW, `OrcHadoopFsRelationSuite` has a slightly inappropriate test, because the default codec is `ZLIB` but the test `SPARK-13543: Support for specifying

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12699#issuecomment-214708459 @rxin Could you please take a look?

[GitHub] spark pull request: [SPARK-14754][SPARK CORE] Metrics as logs are ...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12697#discussion_r61051439 --- Diff: core/src/main/scala/org/apache/spark/metrics/sink/Slf4jSink.scala --- @@ -25,6 +25,9 @@ import com.codahale.metrics.{MetricRegistry

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-26 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12699 [SPARK-14917][SQL] Enable some ORC compressions tests for writing ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-14917

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12699#issuecomment-214705675 Could I maybe cc @srowen please?

[GitHub] spark pull request: [Spark-14314][SparkR] Add model persistence to...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12680#discussion_r61342771 --- Diff: mllib/src/main/scala/org/apache/spark/ml/r/KMeansWrapper.scala --- @@ -17,14 +17,21 @@ package org.apache.spark.ml.r

[GitHub] spark pull request: [SPARK-14913][SQL] Simplify configuration API

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12689#issuecomment-215239332 Yes. That's why I just left a short note!

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61358822 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -17,152 +17,162 @@ package

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61359353 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -0,0 +1,189 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61359253 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -0,0 +1,189 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61359944 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -0,0 +1,189 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-215275474 @hvanhovell If you think it makes sense I will change the title of this PR and JIRA, and will add some more commits to deal with minor things (code style and etc

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61358445 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/InferSchema.scala --- @@ -30,22 +30,37 @@ import

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61358501 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61358548 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61358639 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -0,0 +1,189 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61359080 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityGenerator.scala --- @@ -0,0 +1,80 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14969][MLLib] Remove duplicate implemen...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12747#issuecomment-215280483 (@dding3 I think maybe you can remove the default lines such as "(Please fill in changes proposed in this fix)".)

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r6135 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala --- @@ -17,152 +17,162 @@ package

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-215274872 @hvanhovell Thank you for the close look! I think I need to change the title of this issue and JIRA because "better performance" might be

[GitHub] spark pull request: [SPARK-14952][CORE][ML] Remove methods that we...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12732#issuecomment-215281017 (I think `cc @srowen` should be removed because the PR description explains the PR itself and the names of reviewers might not be related to the PR itself

[GitHub] spark pull request: [SPARK-13590] [ML] [Doc] Document spark.ml LiR...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12731#issuecomment-215281321 (This is super minor, but I think `cc @mengxr` should be removed because the PR description explains the PR itself and the names of reviewers might not be related

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12268#discussion_r61366691 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala --- @@ -0,0 +1,189 @@ +/* + * Licensed

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12629#issuecomment-214975784 It looks like the Hadoop configuration can be set via `option()` now. It looks like when `null` is set for Hadoop configurations, default values are used, which is consistent

[GitHub] spark pull request: [SPARK-14913][SQL] Simplify configuration API

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12689#issuecomment-214977910 (Please allow me to just leave a short note about some files having unused imports in this PR) ```bash sql/core/src/main/scala/org/apache/spark/sql

[GitHub] spark pull request: Unintentional white spaces in kryo classes con...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12701#discussion_r61186521 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -71,10 +71,10 @@ class KryoSerializer(conf: SparkConf

[GitHub] spark pull request: Unintentional white spaces in kryo classes con...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12701#discussion_r61186530 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -71,10 +71,10 @@ class KryoSerializer(conf: SparkConf

[GitHub] spark pull request: Unintentional white spaces in kryo classes con...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12701#discussion_r61186507 --- Diff: core/src/main/scala/org/apache/spark/SparkConf.scala --- @@ -191,7 +191,7 @@ class SparkConf(loadDefaults: Boolean) extends Cloneable

[GitHub] spark pull request: [SPARK-14939][SQL] Eliminate No-op Alias SortO...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12719#discussion_r61186903 --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/EliminateSortsSuite.scala --- @@ -69,4 +69,17 @@ class

[GitHub] spark pull request: [SPARK-14914] Normalize Paths/URIs for windows...

2016-04-26 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12695#issuecomment-214676529 (@taoli91 FYI, it would be great if you run `sbt scalastyle` or `./dev/run-tests` (this triggers style checking first) for style corrections before adding some

[GitHub] spark pull request: [SPARK-12143][SQL] Binary type support for Hiv...

2016-04-27 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12733 [SPARK-12143][SQL] Binary type support for Hive thrift server ## What changes were proposed in this pull request? https://issues.apache.org/jira/browse/SPARK-12143 This PR

[GitHub] spark pull request: [SPARK-12143][SQL] Binary type support for Hiv...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12733#issuecomment-215071496 cc @JoshRosen Could you please take a look?

[GitHub] spark pull request: [SPARK-12143][SQL] Binary type support for Hiv...

2016-04-27 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12733#issuecomment-215071439 Because I assume https://github.com/apache/spark/pull/10139 is abandoned, I submitted a PR for this. I can close this if this is problematic in any way

[GitHub] spark pull request: [SPARK-12270][SQL]remove empty space after get...

2016-04-24 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/10262#issuecomment-214143879 @huaxingao Do you mind if I submit a PR based on this if you are not working on this? (It looks like it has been inactive for a few months!)

[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/11550#discussion_r61663895 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala --- @@ -177,7 +191,8 @@ private[csv] object

[GitHub] spark pull request: [SPARK-15024] NoClassDefFoundError in spark-ex...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12800#issuecomment-215932736 (It looks like the last part of the title is truncated)

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12699#issuecomment-215933632 Thank you @yhuai! Actually, do you mind if I try to submit a PR after walking through sql/hive tests and removing the class imports where possible? I

[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/11550#issuecomment-215925235 @rxin Thank you. I will change the name to the original.

[GitHub] spark pull request: [SPARK-14480][SQL] Simplify CSV parsing proces...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12268#issuecomment-215907300 @rxin, @hvanhovell Do you mind if I ask your thoughts on this please?

[GitHub] spark pull request: [SPARK-14800][SQL] Dealing with null as a valu...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12629#issuecomment-215911793 Ping @davies and @viirya. It sets a default value for `null` to the options that throw a `NullPointerException` and just passes `null` for some options

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12699#discussion_r61662731 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/orc/OrcQuerySuite.scala --- @@ -169,39 +169,42 @@ class OrcQuerySuite extends QueryTest

[GitHub] spark pull request: [SPARK-13667][SQL] Support for specifying cust...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/11550#discussion_r61664186 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala --- @@ -78,7 +79,63 @@ private[csv] object

[GitHub] spark pull request: [SPARK-14917][SQL] Enable some ORC compression...

2016-04-29 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12699#issuecomment-215933980 @yhuai Sure. Thank you.

[GitHub] spark pull request: [SPARK-13425][SQL] Documentation for CSV datas...

2016-04-30 Thread HyukjinKwon
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/12817 [SPARK-13425][SQL] Documentation for CSV datasource options ## What changes were proposed in this pull request? This PR adds the explanation and documentation for CSV options

[GitHub] spark pull request: [SPARK-13425][SQL] Documentation for CSV datas...

2016-04-30 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/12817#discussion_r61678681 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -393,6 +393,45 @@ class DataFrameReader private[sql](sparkSession

[GitHub] spark pull request: [SPARK-13425][SQL] Documentation for CSV datas...

2016-04-30 Thread HyukjinKwon
Github user HyukjinKwon commented on the pull request: https://github.com/apache/spark/pull/12817#issuecomment-216013835 Hm.. It passes a local style test for Python and I did not edit `./python/pyspark/sql/__init__.py`.
