[GitHub] spark pull request: [SPARK-5212][SQL] Add support of schema-less t...

2015-01-12 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4014 [SPARK-5212][SQL] Add support of schema-less transformation According to Hive's language manual, the AS clause should be optional in [transform](https://cwiki.apache.org/confluence/display/Hive

[GitHub] spark pull request: [SPARK-5714][Mllib] Refactor initial step of L...

2015-02-10 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4501 [SPARK-5714][Mllib] Refactor initial step of LDA to remove redundant operations The `initialState` of LDA performs several RDD operations that look redundant. This pr tries to simplify

[GitHub] spark pull request: [SPARK-5714][Mllib] Refactor initial step of L...

2015-02-10 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4501#issuecomment-73700334 an unrelated failure... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have

[GitHub] spark pull request: [SPARK-5681][Streaming] Add tracker status and...

2015-02-13 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4467#issuecomment-74280513 @tdas, do you have time to take a look at the analysis and the current implementation?

[GitHub] spark pull request: [SPARK-5799][SQL] Compute aggregation function...

2015-02-13 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4592 [SPARK-5799][SQL] Compute aggregation function on specified numeric columns Compute aggregation function on specified numeric columns. For example: val df = Seq((a, 1, 0, b), (b, 2, 4, c

[GitHub] spark pull request: [SPARK-5799][SQL] Compute aggregation function...

2015-02-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4592#discussion_r24713647 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala --- @@ -93,7 +93,14 @@ private[sql] class DataFrameImpl protected[sql

[GitHub] spark pull request: [SPARK-5799][SQL] Compute aggregation function...

2015-02-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4592#discussion_r24713599 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala --- @@ -93,7 +93,14 @@ private[sql] class DataFrameImpl protected[sql

[GitHub] spark pull request: [SPARK-5793][SQL] Add explode to Column

2015-02-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4585#discussion_r24712170 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Column.scala --- @@ -576,6 +578,27 @@ trait Column extends DataFrame { override def as(alias

[GitHub] spark pull request: [SPARK-5799][SQL] Compute aggregation function...

2015-02-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4592#discussion_r24713729 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameImpl.scala --- @@ -93,7 +93,14 @@ private[sql] class DataFrameImpl protected[sql

[GitHub] spark pull request: [SPARK-5793][SQL] Add explode to Column

2015-02-14 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4585#discussion_r24714073 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Column.scala --- @@ -576,6 +578,25 @@ trait Column extends DataFrame { override def as(alias

[GitHub] spark pull request: [SPARK-5615] Let the test stop gracefully and ...

2015-02-08 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4364#issuecomment-73456104 Okay, I will submit a patch for that later.

[GitHub] spark pull request: [SPARK-5773][Mllib] Further optimize sparse sy...

2015-02-12 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4568#discussion_r24585271 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/BLAS.scala --- @@ -276,11 +276,13 @@ private[spark] object BLAS extends Serializable with Logging

[GitHub] spark pull request: [SPARK-5773][Mllib] Further optimize sparse sy...

2015-02-12 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4568 [SPARK-5773][Mllib] Further optimize sparse syr The current sparse syr in BLAS can be further optimized. You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-5773][Mllib] Further optimize sparse sy...

2015-02-12 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4568#issuecomment-74083253 Okay, I will do it later. Roughly speaking, half of the computation can be saved.

[GitHub] spark pull request: [SPARK-5773][Mllib] Further optimize sparse sy...

2015-02-12 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4568#issuecomment-74104915 Performance is not improved. Close this pr.

[GitHub] spark pull request: [SPARK-5773][Mllib] Further optimize sparse sy...

2015-02-12 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4568

[GitHub] spark pull request: [SPARK-5799][SQL] Compute aggregation function...

2015-02-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4592#discussion_r24719554 --- Diff: python/pyspark/sql/dataframe.py --- @@ -714,30 +714,45 @@ def count(self): [Row(age=2, count=1), Row(age=5, count=1

[GitHub] spark pull request: [SPARK-5793][SQL] Add explode to Column

2015-02-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4585#discussion_r24719184 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/Column.scala --- @@ -576,6 +578,25 @@ trait Column extends DataFrame { override def as(alias

[GitHub] spark pull request: [SPARK-5832][Mllib] Add Affinity Propagation c...

2015-02-16 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4622 [SPARK-5832][Mllib] Add Affinity Propagation clustering algorithm Affinity Propagation (AP), a graph clustering algorithm based on the concept of message passing between data points. Unlike

[GitHub] spark pull request: Add BLAS.dsyr and use it in GaussianMixtureEM

2015-01-08 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/3949 Add BLAS.dsyr and use it in GaussianMixtureEM This pr uses BLAS.dsyr to replace a few implementations in GaussianMixtureEM. You can merge this pull request into a Git repository by running: $ git

[GitHub] spark pull request: [SPARK-6303][SQL] Add Average in canBeCodeGene...

2015-03-18 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4996#issuecomment-83038283 /cc @marmbrus This should be straightforward.

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5014#issuecomment-82730064 @yhuai @chenghao-intel I updated it. Please take a look when you have time. Thanks!

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5017#discussion_r26448827 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -394,10 +394,16 @@ case class Cast(child: Expression

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4940#issuecomment-81130668 I think it is because the code you refer to accesses the element directly by array index. If the ordinal is not valid, a runtime exception will be thrown

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5017#issuecomment-81129086 Unrelated failure. Retest it please.

[GitHub] spark pull request: [SPARK-6302][SQL] GeneratedAggregate uses wron...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4994#issuecomment-81108144 Because `updateProjection` is the projection between `child.output` and the aggregations `updateExpressions`. Its `inputSchema` should be just `child.output

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5017#issuecomment-81112499 With new commit. Conducting 100 the `struct casting` in `ExpressionEvaluationSuite`: before pr: 59.149s after pr: 47.243s Conducting

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-15 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5017#discussion_r26448902 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Cast.scala --- @@ -394,10 +394,17 @@ case class Cast(child: Expression

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5017#issuecomment-80935393 Simple benchmark conducting 100 the `struct casting` in `ExpressionEvaluationSuite`: before pr: 59.149s after pr: 53.869s Conducting 100

[GitHub] spark pull request: [SPARK-6354][SQL] Replace the plan which is pa...

2015-03-16 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5044 [SPARK-6354][SQL] Replace the plan which is part of cached query Currently we only replace a plan that is equal to the cached query. This approach can be extended to replace the plan which is part

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-15 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4940#issuecomment-81366888 I think it is not a bug; it is reasonable behavior.

[GitHub] spark pull request: [SPARK-5681][Streaming] Add tracker status and...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4467#issuecomment-82210437 @tdas Any updated ideas or comments?

[GitHub] spark pull request: [SPARK-5793][SQL] Add explode to Column

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4585#issuecomment-82210952 Sure.

[GitHub] spark pull request: [SPARK-5332][Core] Efficient way to deal with ...

2015-03-17 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4118

[GitHub] spark pull request: [SPARK-5938][SQL] Generate Row from JSON strin...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4712#issuecomment-82218632 @marmbrus @liancheng Please take time to review this. Thanks!

[GitHub] spark pull request: [SPARK-6184][SQL] Relocate logDebug to correct...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4905#issuecomment-82219585 /cc @marmbrus

[GitHub] spark pull request: [SPARK-5908][SQL] Resolve UdtfsAlias when only...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4692#issuecomment-82218247 Hi @marmbrus, Are you available to go with this pr now? Thanks!

[GitHub] spark pull request: [SPARK-6041][GraphX] Compute shortest path for...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4790#issuecomment-8532 @ankurdave @srowen Can you help review this pr? Thanks!

[GitHub] spark pull request: [SPARK-6184][SQL] Relocate logDebug to correct...

2015-03-17 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4905

[GitHub] spark pull request: [SPARK-6184][SQL] Relocate logDebug to correct...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4905#issuecomment-82673965 ok. let's close this.

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-17 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4940

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4940#issuecomment-82687260 ok.

[GitHub] spark pull request: [SPARK-6204][SQL] GenerateProjection's equals ...

2015-03-17 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4931

[GitHub] spark pull request: [SPARK-6204][SQL] GenerateProjection's equals ...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4931#issuecomment-82695242 It would not cause a failure. For now, the `equals` function falls back to calling `super.equals` to test equality if two rows have different lengths.

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-17 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26634556 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6354][SQL] Replace the plan which is pa...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5044#issuecomment-82698690 I will update the design on the JIRA and add comments later.

[GitHub] spark pull request: [SPARK-6204][SQL] GenerateProjection's equals ...

2015-03-17 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4931#issuecomment-82695310 If you think this is not worth fixing, let's close this pr.

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-17 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26635204 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-17 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26635455 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6354][SQL] Replace the plan which is pa...

2015-03-19 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5044#issuecomment-83384314 @marmbrus I have updated the design on the JIRA.

[GitHub] spark pull request: [SPARK-6354][SQL] Replace the plan which is pa...

2015-03-19 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5044#issuecomment-83436851 More comments and tests are added now.

[GitHub] spark pull request: [SPARK-6354][SQL] Replace the plan which is pa...

2015-03-19 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5044#issuecomment-83498356 Looks like an unrelated failure.

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-20 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5017#issuecomment-84087399 /cc @marmbrus Any other concerns?

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-20 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5014#issuecomment-84087755 @yhuai Any other comments?

[GitHub] spark pull request: [SPARK-6302][SQL] GeneratedAggregate uses wron...

2015-03-20 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4994#issuecomment-84088043 /cc @yhuai @marmbrus

[GitHub] spark pull request: [SPARK-6224][SQL] Also collect NamedExpression...

2015-03-09 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4949

[GitHub] spark pull request: [SPARK-6224][SQL] Also collect NamedExpression...

2015-03-09 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4949 [SPARK-6224][SQL] Also collect NamedExpressions in PhysicalOperation Currently in `PhysicalOperation`, only `Alias` expressions are collected. Similarly, `NamedExpression` can be collected

[GitHub] spark pull request: [SPARK-6087][CORE] Provide actionable exceptio...

2015-03-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4947#discussion_r26033358 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -158,7 +158,13 @@ private[spark] class KryoSerializerInstance(ks

[GitHub] spark pull request: [SPARK-6087][CORE] Provide actionable exceptio...

2015-03-09 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4947#discussion_r26041118 --- Diff: core/src/main/scala/org/apache/spark/serializer/KryoSerializer.scala --- @@ -158,7 +158,13 @@ private[spark] class KryoSerializerInstance(ks

[GitHub] spark pull request: [SPARK-6302][SQL] GeneratedAggregate uses wron...

2015-03-12 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4994 [SPARK-6302][SQL] GeneratedAggregate uses wrong schema on updateProjection The `updateProjection` in `GeneratedAggregate` now uses the `updateSchema` as its input schema. In fact, the schema should

[GitHub] spark pull request: [SPARK-6204][SQL] GenerateProjection's equals ...

2015-03-06 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4931 [SPARK-6204][SQL] GenerateProjection's equals should check length equality `GenerateProjection`'s `equals` now only checks column equality. It should also check length equality. You can merge

[GitHub] spark pull request: [SPARK-6159][Core] Distinguish between inprogr...

2015-03-06 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/4891

[GitHub] spark pull request: [SPARK-6159][Core] Distinguish between inprogr...

2015-03-06 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4891#issuecomment-77554262 @srowen I think that is what @andrewor14 meant. It is reasonable so I close this pr. Thanks!

[GitHub] spark pull request: [SPARK-6184][SQL] Relocate logDebug to correct...

2015-03-06 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4905#issuecomment-77554431 CC @marmbrus.

[GitHub] spark pull request: [SPARK-6197][CORE] handle json exception when ...

2015-03-06 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4927#issuecomment-77597792 Hmm, just being curious, `replay` originally has a `try...catch` block outside the log reading, and this pr adds another one. Is it better to combine two `try...catch

[GitHub] spark pull request: [SPARK-5908][SQL] Resolve UdtfsAlias when only...

2015-03-05 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4692#issuecomment-77513338 @marmbrus, @liancheng Could you take a look at this pr? It should be straightforward. Thanks!

[GitHub] spark pull request: [SPARK-6197][CORE] handle json exception when ...

2015-03-06 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4927#issuecomment-77601804 Ah, I did not notice that. :-)

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-08 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4940 [SPARK-6215][SQL] Shorten apply and update funcs in GenerateProjection Some code in `GenerateProjection` looks redundant and can be shortened. You can merge this pull request into a Git repository

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-08 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4940#issuecomment-77793875 @chenghao-intel These are the random accessors for the row (`SpecificRow`) objects produced by the projection `GenerateProjection`. So I think they are not resolved

[GitHub] spark pull request: [SPARK-6215][SQL] Shorten apply and update fun...

2015-03-08 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/4940#discussion_r26013756 --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GenerateProjection.scala --- @@ -84,9 +84,15 @@ object

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26407018 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -585,34 +589,6 @@ private[hive] class HiveMetastoreCatalog(hive

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26409195 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala --- @@ -585,34 +589,6 @@ private[hive] class HiveMetastoreCatalog(hive

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26406953 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26408928 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6326][SQL] Improve castStruct to be fas...

2015-03-13 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5017 [SPARK-6326][SQL] Improve castStruct to be faster The current `castStruct` is likely very slow. This pr slightly improves it. You can merge this pull request into a Git repository by running: $ git

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26407579 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26408121 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5014#discussion_r26409694 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveQl.scala --- @@ -557,8 +557,11 @@ https://cwiki.apache.org/confluence/display/Hive/Enhanced

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-13 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5014 [SPARK-6322][SQL] CTAS should consider the case where no file format or storage handler is given When creating `CreateTableAsSelect` in `HiveQl`, it doesn't consider the case where no file format

[GitHub] spark pull request: [SPARK-6303][SQL] Add Average in canBeCodeGene...

2015-03-12 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/4996 [SPARK-6303][SQL] Add Average in canBeCodeGened lists of HashAggregation Currently `canBeCodeGened` in `HashAggregation` only checks `Sum`, `Count`, `Max`, `CombineSetsAndCount`, `CollectHashSet

[GitHub] spark pull request: [SPARK-6550][SQL] Add PreAnalyzer to keep logi...

2015-03-26 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5203 [SPARK-6550][SQL] Add PreAnalyzer to keep logical plan consistent across DataFrame ## Problems In some cases, the expressions in a logical plan will be modified to new ones during analysis

[GitHub] spark pull request: [SPARK-6550][SQL] Add PreAnalyzer to keep logi...

2015-03-26 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5203#issuecomment-86417077 The test failure is caused by another commit and just fixed in #5198. Please test it again.

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-25 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5014#issuecomment-86259141 @yhuai please take a look, thanks.

[GitHub] spark pull request: [SPARK-6586][SQL] Add the capability of retrie...

2015-03-29 Thread viirya
Github user viirya closed the pull request at: https://github.com/apache/spark/pull/5241

[GitHub] spark pull request: [SPARK-6586][SQL] Add the capability of retrie...

2015-03-29 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5241#issuecomment-87557379 Currently the opinion on JIRA is that we don't need this capability to retrieve the original plan for an analyzed plan. So I am closing this.

[GitHub] spark pull request: [SPARK-6607][SQL] Aggregation attribute name i...

2015-03-30 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5263 [SPARK-6607][SQL] Aggregation attribute name including special chars '(' and ')' should be replaced before generating Parquet schema '(' and ')' are special characters used in Parquet schema

[GitHub] spark pull request: [SPARK-6607][SQL] Aggregation attribute name i...

2015-03-30 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5263#issuecomment-87722138 @liancheng your suggestion is okay for me. If others have no opinion for that, I will send updates later for your suggestion.

[GitHub] spark pull request: [SPARK-6322][SQL] CTAS should consider the cas...

2015-03-31 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5014#issuecomment-88032626 @yhuai Can you check if this is ready to merge?

[GitHub] spark pull request: [SPARK-6303][SQL] Add Average in canBeCodeGene...

2015-03-31 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4996#issuecomment-88033238 @liancheng @yhuai @marmbrus Please take a look, thanks.

[GitHub] spark pull request: [SPARK-6647][SQL] Make trait StringComparison ...

2015-04-01 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5309 [SPARK-6647][SQL] Make trait StringComparison as BinaryPredicate and throw error when Predicate can't translate to data source Filter Now trait `StringComparison` is a `BinaryExpression`. In fact

[GitHub] spark pull request: [SPARK-6640][Core] Fix the race condition of c...

2015-04-01 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5306#discussion_r27573736 --- Diff: core/src/main/scala/org/apache/spark/HeartbeatReceiver.scala --- @@ -71,12 +75,22 @@ private[spark] class HeartbeatReceiver(sc: SparkContext

[GitHub] spark pull request: [SPARK-6643][MLLIB] Implement StandardScalerMo...

2015-04-01 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5310#discussion_r27574890 --- Diff: mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala --- @@ -434,8 +434,39 @@ private[python] class PythonMLLibAPI extends

[GitHub] spark pull request: [SPARK-6643][MLLIB] Implement StandardScalerMo...

2015-04-01 Thread viirya
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/5310#discussion_r27575052 --- Diff: python/pyspark/mllib/feature.py --- @@ -132,6 +132,22 @@ def transform(self, vector): return

[GitHub] spark pull request: [SPARK-6633][SQL] Should be Contains instead...

2015-03-31 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5299 [SPARK-6633][SQL] Should be Contains instead of EndsWith when constructing sources.StringContains You can merge this pull request into a Git repository by running: $ git pull https

[GitHub] spark pull request: [SPARK-6633][SQL] Should be Contains instead...

2015-03-31 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5299#issuecomment-88267703 okay. i would check it later. thanks.

[GitHub] spark pull request: [SPARK-6586][SQL] Add the capability of retrie...

2015-03-28 Thread viirya
GitHub user viirya opened a pull request: https://github.com/apache/spark/pull/5241 [SPARK-6586][SQL] Add the capability of retrieving original logical plan of DataFrame In order to solve a bug, since #5217, `DataFrame` now uses analyzed plan instead of logical plan. However

[GitHub] spark pull request: [SPARK-6466][SQL] Remove unnecessary attribute...

2015-03-23 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/5134#issuecomment-85009015 Good suggestion. I will do that later. Thanks.

[GitHub] spark pull request: [SPARK-5938][SQL] Generate Row from JSON strin...

2015-02-27 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4712#issuecomment-76509265 @liancheng Description is updated. Please take a look when you have time. Thanks!

[GitHub] spark pull request: [SPARK-5950][SQL]Insert array into a metastore...

2015-02-28 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4826#issuecomment-76519402 I wouldn't like to say this. But, @liancheng, @yhuai I think you should more respect other's contribution... You in #4782 made changes to `ParquetConversions

[GitHub] spark pull request: [SPARK-6107][CORE] Display inprogress applicat...

2015-03-02 Thread viirya
Github user viirya commented on the pull request: https://github.com/apache/spark/pull/4848#issuecomment-76887182 Looks like there are conflicts for merging, merge the updates first?
