[GitHub] spark pull request: [SPARK-5040][SQL] Support expressing unresolve...

2014-12-31 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3862#issuecomment-68471873
  
Hi @rxin I think you meant to name the class 
StringToAttributeConversionHelper.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-4804] StringContext method to allow Str...

2014-12-09 Thread culler
GitHub user culler opened a pull request:

https://github.com/apache/spark/pull/3649

[SPARK-4804] StringContext method to allow String references to column 
names in Catalyst DSL 



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/culler/spark dslstringcontext

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3649.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3649


commit d6460f6396e1a5cb3f356236cf22a1ef1aad3147
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:11:09Z

Adds RichDate and RichTimestamp classes with comparison operators,
allowing them to be used in DSL expressions.  These classes provide
initializers which accept string representations of dates or times.
They are renamed as Date and Timestamp when the members of an
SQLContext are in scope.

commit fa073246a392a1234141bceb61a6a71780fd3b23
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:17:33Z

Adds new implicit conversions which allow DSL expressions to start
with a literal, e.g. 0 < 'x.
These conversions expose a conflict with the scalatest === operator
if assert(X === Y) is used when the conversions are in scope.  To
fix this, several tests are modified, as recommended in the scalatest
documentation, by making the change:
assert(X === Y) --> assert(convertToEqualizer(X).===(Y))

commit 492feefaea6531b3374bf3e753e61234b79e1184
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:38:18Z

Clarification of one comment.

commit 53aced3bd6f7dba5d124a7070283e95e05b61914
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-08T19:46:10Z

Fixed implicit conversions for RichDate and RichTimestamp and added
a test which would have detected the problem.

commit bd539b6ce8a45e9a6d32b77fe55296a7a960d4bf
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-08T21:00:14Z

Rebased and fixed the assert commands in ParquetQuerySuite.scala.

commit 353d12ca82eea8b097076adcf5b315641bbb81b9
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-09T16:54:10Z

Add unapply methods to RichDate and RichTimestamp and make their types
available after importing the members of an SQLContext.

commit eff68fe04ac699ad180fd7a96b71d6b14159aba0
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-10T03:28:25Z

Modified RichTimestamp.toString to use the Hive format.

commit 759e883ed1eab286ef6b437ec8c194bc8cced50b
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-10T13:56:16Z

Rebasing and making comments consistent with ScalaDocs.

commit 0c86b93e350bb3927f07907b256418036789c038
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-10T15:05:41Z

Removing all changes to SpecificMutableRow.scala.  Mutable Dates and
Timestamps are not useful.

commit 1414f302fe5587dba5c2dec4f33f7001223584ce
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-10T23:31:19Z

New DSL equals operator -=-.  Restored all test files as this does not
collide with scalatest's === operator.

commit dabcaf262d68565ecae0092e70fe81f0be6307bd
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-10T23:44:42Z

Cleaning up more comments and one more test.

commit ad2940dad053089182dd65c2d9d0db6c4fedc0cf
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-11T04:57:08Z

Removed LhsLiterals for another PR.  Cleaned up.

commit f917e06db9925dc02bc98df3e25ba60642d007a3
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-11T05:02:22Z

Removed test which is no longer part of this PR.

commit 4c32c0416093fedcb70acdcbfbeb112ded4406eb
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-11T05:25:55Z

Restored lost import of scalatest.Assertions.convertToEqualizer.

commit 4cfb8640f3b185de1e2ffacaf30f6d72a8c1d9dc
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-12-09T20:51:32Z

Merge branch 'master' of https://github.com/apache/spark

commit b3a432d94bfd6856653c4019e0da4f9949582619
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-12-09T21:31:23Z

Clean copy of upstream master

commit a6e7a307b97cb951dcbb20aa52e08ac1c24d43da
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-12-09T21:47:19Z

Merge branch 'master' of https://github.com/apache/spark into clean

commit 26aa58f951df85c279da844775dbee9d7a0953ff
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-12-09T22:34:59Z

Added StringContext method to convert String to UnresolvedAttribute.
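
As a rough illustration of the commit above, a string interpolator defined on StringContext can turn an interpolated string into an attribute object. The sketch below is hypothetical: `Attr` is a toy stand-in for Catalyst's UnresolvedAttribute, and the interpolator name `attr` is invented for illustration.

```scala
// Toy stand-in for Catalyst's UnresolvedAttribute.
case class Attr(name: String)

// Hypothetical interpolator: attr"..." builds an Attr from the
// fully interpolated string.
implicit class AttrInterpolator(sc: StringContext) {
  def attr(args: Any*): Attr = Attr(sc.s(args: _*))
}

val column = "age"
assert(attr"$column" == Attr("age"))
```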





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-14 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-63154275
  
Maybe next week will be better.  The PR is now much smaller, with the
changes to test files removed and the DSL part moved into PR #3210.

On Fri, Nov 7, 2014 at 3:02 PM, Michael Armbrust notificati...@github.com
wrote:

 Thanks for working on this. This is a pretty big change so I'll review it next
 week after we get some critical fixes in for 1.2.

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3158#issuecomment-62211082.






[GitHub] spark pull request: [SPARK-4205][SQL] DSL expressions with a liter...

2014-11-11 Thread culler
GitHub user culler opened a pull request:

https://github.com/apache/spark/pull/3210

[SPARK-4205][SQL] DSL expressions with a literal on the left hand side

Currently the DSL expression parser does not recognize expressions which 
begin with a literal, such as:
`0 < 'x`.

This PR adds a new implicit class which allows such expressions to be
included in the DSL.  Due to conflicts with scalatest, which converts `===`
and `!==` in its `assert` macro, these two operators cannot be used by the new
class.  It adds two new operators, `-=-` and `!=-`, which behave identically to
`===` and `!==` within the DSL but can also be used in expressions which start
with a literal.

The new class replaces the comment: `// TODO more implicit class for 
literal?`
I think this may be what the author had in mind.
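
To make the idea concrete, here is a minimal self-contained sketch under stated assumptions: `Lit`, `Col`, `LessThan` and `EqualTo` are toy stand-ins for the Catalyst expression classes, and an implicit class wraps the literal so a DSL expression can begin with an Int.

```scala
// Toy expression tree standing in for Catalyst's (illustrative names,
// not the real Catalyst classes).
sealed trait Expr
case class Lit(value: Int) extends Expr
case class Col(name: Symbol) extends Expr
case class LessThan(left: Expr, right: Expr) extends Expr
case class EqualTo(left: Expr, right: Expr) extends Expr

// Implicit class that lets a DSL expression begin with a literal.
implicit class LhsLiteral(value: Int) {
  def <   (other: Symbol): Expr = LessThan(Lit(value), Col(other))
  def -=- (other: Symbol): Expr = EqualTo(Lit(value), Col(other))
}

val x = Symbol("x") // written 'x in the PR-era Symbol-literal syntax
assert((0 < x) == LessThan(Lit(0), Col(x)))
assert((0 -=- x) == EqualTo(Lit(0), Col(x)))
```

Because Int has no built-in `<` or `-=-` accepting a Symbol, the compiler applies the implicit conversion, which is exactly what lets the literal appear on the left.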

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/culler/spark lhs

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3210.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3210


commit 1d42fff2f5d77df8b9d0c0fcbcabf0d46d01c174
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-11T20:34:33Z

Adding an implicit class which enables DSL expressions which begin with a
literal to be recognized.







[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-11 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62618892
  
Thanks!

On Mon, Nov 10, 2014 at 6:17 PM, Michael Armbrust notificati...@github.com
wrote:

 Quick note: you don't need a fork per PR. You can have multiple branches
 in a single fork, each of them with a corresponding PR.

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3158#issuecomment-62480969.






[GitHub] spark pull request: [SPARK-4205][SQL] DSL expressions with a liter...

2014-11-11 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3210#issuecomment-62619183
  
@marmbrus and @liancheng, here is the other half of PR #3158.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-10 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3158#discussion_r20080979
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificMutableRow.scala ---
@@ -169,6 +170,35 @@ final class MutableByte extends MutableValue {
     newCopy.asInstanceOf[this.type]
   }
 }
+final class MutableDate extends MutableValue {
--- End diff --

Thanks @liancheng,  I was wondering about that.  I will remove them.

On Mon, Nov 10, 2014 at 2:09 AM, Cheng Lian notificati...@github.com
wrote:

 In
 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/SpecificMutableRow.scala:

  @@ -169,6 +170,35 @@ final class MutableByte extends MutableValue {
   newCopy.asInstanceOf[this.type]
 }
   }
  +final class MutableDate extends MutableValue {

 According to our discussion here
 https://github.com/apache/spark/pull/3066#discussion_r19784664, these
 two classes are not necessary.

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3158/files#r20070128.






[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-10 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3158#discussion_r20082032
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/timetypes.scala ---
@@ -0,0 +1,115 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.sql.{Date, Timestamp}
+import scala.language.implicitConversions
+
+/* *
+ * Subclass of java.sql.Date which provides the usual comparison
+ * operators (as required for catalyst expressions) and which can
+ * be constructed from a string.
+ *
+ * scala> val d1 = Date("2014-02-01")
+ * d1: Date = 2014-02-01
+ *
+ * scala> val d2 = Date("2014-02-02")
+ * d2: Date = 2014-02-02
+ *
+ * scala> d1 < d2
+ * res1: Boolean = true
+ */
+
+class RichDate(milliseconds: Long) extends Date(milliseconds) {
+  def < (that: Date): Boolean = this.before(that)
+  def > (that: Date): Boolean = this.after(that)
+  def <= (that: Date): Boolean = (this.before(that) || this.equals(that))
+  def >= (that: Date): Boolean = (this.after(that) || this.equals(that))
+  def === (that: Date): Boolean = this.equals(that)
+}
+
+object RichDate {
+  def apply(init: String) = new RichDate(Date.valueOf(init).getTime)
+
+  def unapply(richdate: RichDate): Option[Date] = Some(new Date(richdate.getTime))
+}
+
+/* *
+ * Analogous subclass of java.sql.Timestamp.
+ *
+ * scala> val ts1 = Timestamp("2014-03-04 12:34:56.12")
+ * ts1: Timestamp = 2014-03-04 12:34:56.12
+ *
+ * scala> val ts2 = Timestamp("2014-03-04 12:34:56.13")
+ * ts2: Timestamp = 2014-03-04 12:34:56.13
+ *
+ * scala> ts1 < ts2
+ * res13: Boolean = true
--- End diff --

Thanks @liancheng.  I am sorry I did not notice I had done that.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-10 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62395022
  
   Hey @culler, I noticed that a significant part of this PR is about 
changing a === b to
   EQ(a).===(b). I understand that this is used to avoid function name 
collision between
   ScalaTest === and the newly introduced === DSL operator for Date and 
Timestamp,

Yes, I am very unhappy about that.  I will study further how to work
around it some other way.

However, you have not understood the problem.  The problem does not have 
anything to do with my tests.  Even if I completely remove my tests, the 
problem will still arise.  Also, the changes I am proposing do not change the 
DSL at all.  The ugly `EQ` construction *only* appears in scala source files 
which import *both* the scalatest FunSuite *and* the SQLContext conversions.  
That ugly stuff *never* appears in a DSL expression.

Let me try to explain.  I have added some new implicit conversions.  When 
these conversions are in scope they add `===` operators to some basic types, 
e.g Int.  The definition of this `===` operator is:
  `def === (other: Symbol) = EqualTo(literal, other)`
so this implicit conversion would not be applied when evaluating 
expressions such as the ones in test files which import scalatest.  E.g. it 
does not apply in
   `assert(literals.size === 4)`
because the right hand side (4) is not a Symbol.  So there really is no 
conflict here.

The problem is that even though the implicit conversion does not apply, the 
scalatest `assert` **macro** is nonetheless deciding that the `literals.size` 
already has an `===` operator, and so it refuses to apply its own implicit 
conversion which would add a different `===` operator.
Note that this situation is different from the one which arises with the 
DSL `===` operator.  In that case the operator is simply a method of 
Expression. If a DSL expression begins with a Symbol the implict conversions 
convert the Symbol to an Expression which then has an `===` operator.  But this 
does not work when the DSL expression begins with a literal. In fact, this is 
exactly the reason why the DSL could not recognize an expression like `0 < 
'x`.  This is the problem I am solving.
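
A stripped-down sketch of the conversion described above, assuming toy `Literal`/`EqualTo` stand-ins for the Catalyst classes, shows why it only fires when the right-hand side is a Symbol:

```scala
// Toy stand-ins for Catalyst's Literal and EqualTo.
case class Literal(value: Any)
case class EqualTo(left: Literal, right: Symbol)

implicit class IntLhs(value: Int) {
  // Applicable only when the right-hand side is a Symbol, so it cannot
  // fire inside assert(literals.size === 4), where the RHS is an Int.
  def === (other: Symbol): EqualTo = EqualTo(Literal(value), other)
}

val col = Symbol("x")
assert((0 === col) == EqualTo(Literal(0), col))
```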

  EQ(a).===(b) looks too cumbersome, and loses all the meaning about DSL 
operators --
  conciseness and readability.

Please note, this construction would never be used in a DSL
expression.

 The === name collision problem had already been solved once when
 org.apache.spark.sql.catalyst.dsl.=== was introduced, and it has been 
proven to be easy to work around without such a significant change.

The situation was different, since that `===` operator is only a method of
Expression.

 So my suggestion is that, to work around this issue, you may add your 
tests in DslQuerySuite
 with the help of checkAnswer, and simply don't use both === within the same 
scope.

Understood.  But that will not solve the problem.  The other tests will
still fail to compile.
 
Given that the `EQ` construction is not acceptable for use in a test, I 
will look into the possibility of requiring a separate import for these left 
hand side conversions, so that they would not need to be in scope for any of 
the tests that are not testing this specific feature.  Maybe that is the 
cleanest workaround.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-10 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62480405
  
Hi @marmbrus and @liancheng, I have now tried what seems to be a very 
simple and direct solution to the `===` problem.  My patch no longer uses 
`===`.  I added a new equality operator to the DSL called `-=-`.  It is 
identical to the `===` operator (which is still in place) with the exception 
that it also works for DSL expressions which begin with a literal.  And it does 
not cause any conflicts with scalatest.

I reverted all of the changes to the various test files that were intended 
to work around the `===` issue.  That brings this PR down to 5 files.  All of 
the changes related to the DSL are in one file: 
sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala.  All 
of the changes related to RichDate and RichTimestamp are in the other 4 files.

This means that the two different changes have effectively been isolated 
and the structure of the PR should be more transparent now.  It would not be 
hard to remove the one file and start another PR for it, except for the 
technical detail that github only allows one fork at a time.  I suppose I could 
fork myself, so to speak, and open another github account for my alter ego.

PS The `===` issue is not just about having too many implicit conversions 
in scope -- it is also about combining those conversions with macros.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-10 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62508858
  
OK @marmbrus and @liancheng, this PR is now fit and slim.  The LhsLiterals 
have been removed and will appear in a new PR. This one passes all tests and I 
think it is ready for review, finally.

Thank you, @liancheng, for your patience, thoughtful comments and careful
reading.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-09 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62339566
  
Tests are failing when comparing Hive results to Catalyst results due to 
different string conversions of Timestamps: 2011-01-01 01:01:01 (Hive) vs 
2011-01-01 01:01:01.0 (Catalyst).
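
The Catalyst side of the mismatch can be reproduced with plain java.sql.Timestamp, whose toString always prints fractional seconds even when they are zero:

```scala
import java.sql.Timestamp

// java.sql.Timestamp.toString keeps at least one fractional-second
// digit, producing the ".0" suffix mentioned above, while Hive's
// textual output omits zero fractional seconds.
val ts = Timestamp.valueOf("2011-01-01 01:01:01")
assert(ts.toString == "2011-01-01 01:01:01.0")
```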





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-08 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-62257326
  
Thanks!  I'm proud to be a member of the club. :^)

On Fri, Nov 7, 2014 at 11:16 PM, Cheng Lian notificati...@github.com
wrote:

 @culler https://github.com/culler Don't worry, I guess every
 contributor had screwed up rebasing at least once :)

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3066#issuecomment-62246351.






[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-07 Thread culler
Github user culler closed the pull request at:

https://github.com/apache/spark/pull/3066





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-07 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-62158264
  
Hi @liancheng , now that I have completely screwed up this PR by attempting 
to rebase the repository, I will close it and open a new one which will 
hopefully be clean.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-07 Thread culler
GitHub user culler opened a pull request:

https://github.com/apache/spark/pull/3158

[SPARK-4205][SQL] Timestamp and Date with comparisons / DSL literals

This is the same as pull request #3066, which I closed due to corruption of 
the repository after I tried to rebase so as to include modifications to a test 
file added after the original pull request was issued.

There are two parts:
(1) new RichDate and RichTimestamp classes provide comparison operators, 
which allows them to be used in DSL expressions, and initializers which accept 
string representations of dates or times;
(2) new implicit conversions are added which allow recognition of DSL 
expressions which have a literal on the left, e.g. 0 < 'x.
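
Part (1) can be sketched in a few lines of plain Scala, using only java.sql.Date from the JDK; the class below follows the naming in the PR but is a simplified sketch, not the actual patch:

```scala
import java.sql.Date

// Sketch of part (1): a Date subclass with comparison operators and a
// factory that accepts a string representation ("yyyy-mm-dd").
class RichDate(ms: Long) extends Date(ms) {
  def < (that: Date): Boolean = this.before(that)
  def > (that: Date): Boolean = this.after(that)
}

object RichDate {
  def apply(init: String): RichDate = new RichDate(Date.valueOf(init).getTime)
}

val d1 = RichDate("2014-02-01")
val d2 = RichDate("2014-02-02")
assert(d1 < d2)
assert(d2 > d1)
```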

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/culler/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3158.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3158


commit 7180345ba7ce255a2fc389ae8c55998ed20b9a82
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:11:09Z

Adds RichDate and RichTimestamp classes with comparison operators,
allowing them to be used in DSL expressions.  These classes provide
initializers which accept string representations of dates or times.
They are renamed as Date and Timestamp when the members of an
SQLContext are in scope.

commit bcf6e6bb143f8e4a5f22356fadae54fce4f57041
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:17:33Z

Adds new implicit conversions which allow DSL expressions to start
with a literal, e.g. 0 < 'x.
These conversions expose a conflict with the scalatest === operator
if assert(X === Y) is used when the conversions are in scope.  To
fix this, several tests are modified, as recommended in the scalatest
documentation, by making the change:
assert(X === Y) --> assert(convertToEqualizer(X).===(Y))

commit ef5e4a4230d671ed2ae19f74c280f5e8c44f41aa
Author: Marc Culler marc.cul...@gmail.com
Date:   2014-11-07T16:38:18Z

Clarification of one comment.







[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-07 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-62175638
  
@liancheng and @rxin, I am reopening pull request #3066 as #3158 so it can 
be based on a current commit of the spark source.  I messed up #3066 by trying 
to rebase it after a new test file was added which required minor changes to 
compile.  Sorry for any confusion this causes.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date with comp...

2014-11-07 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3158#issuecomment-6677
  
Thanks @marmbrus. That sounds good. If it isn't too much trouble, I'd 
appreciate seeing what jenkins has to say before you review it next week.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-06 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-62036504
  
The file 
sql/core/src/test/scala/org/apache/spark/sql/UserDefinedTypeSuite.scala was 
added after this PR and caused a test to fail due to the collision between === 
operators.  I am adding the whole file to this PR (with explicit conversions 
for === added to it).  I hope that is the right way to handle this.





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-05 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-61830155
  
Thanks @liancheng.  I deleted the comment to avoid wasting your time since
I figured out what was going on.

Actually it was not my test suite that I was worried about.  Many existing
tests would not compile because my new conversions add an === operator to
types such as Int when the new conversions are in scope, as they are when
testing the DSL.  I followed scalatest's advice for dealing with cases
where an === operator already exists.  Namely, to apply their conversion
explicitly:

assert(X === Y)  --> assert(convertToEqualizer(X).===(Y))

Will that be OK?  (I import convertToEqualizer as EQ for brevity.)

I wanted to pass the existing tests before adding any new ones.  I am
checking that now.  It looks good so far, but unfortunately, 11 test files
had to be edited as above.
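
The workaround can be illustrated without ScalaTest itself; the Equalizer below is a simplified stand-in for the one org.scalatest provides, not its real implementation:

```scala
// Simplified stand-in for ScalaTest's Equalizer / convertToEqualizer.
class Equalizer[T](left: T) {
  def === (right: Any): Boolean = left == right
}
def convertToEqualizer[T](left: T): Equalizer[T] = new Equalizer(left)

val literals = Seq(1, 2, 3, 4)
// assert(literals.size === 4) rewritten per the scalatest advice:
assert(convertToEqualizer(literals.size).===(4))
```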



On Tue, Nov 4, 2014 at 11:31 PM, Cheng Lian notificati...@github.com
wrote:

 Hey @culler https://github.com/culler, saw your comment about testing
 yesterday (gone now?) but didn't have time to respond. Yes, the ===
 operator is used by both Spark SQL DSL and ScalaTest. To work around
 this, you can either add your test case in DslQuerySuite and use
 checkAnswer for assertion, or simply avoid using ===; there are some
 other assertion functions in ScalaTest, for example, assertResult is one
 of my favorites :)

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3066#issuecomment-61761929.






[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-05 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-61870151
  
Hi @liancheng.  Tests have been added.  The PR builds, satisfies lint, and 
passes the tests on my system.  (Except there were two failures in the 
SparkSubmitSuite, which I think are not related to this PR.)





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19730958
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/dsl/package.scala ---
@@ -146,6 +146,31 @@ package object dsl {
 def upper(e: Expression) = Upper(e)
 def lower(e: Expression) = Lower(e)
 
+/*
+ * Conversions to provide the standard operators in the special case
+ * where a literal is being combined with a symbol. Without these an
+ * expression such as 0 < 'x is not recognized.
+ */
+implicit class InitialLiteral(x: Any) {
+  val literal = Literal(x)
+  def + (other: Symbol): Expression = { literal + other }
+  def - (other: Symbol): Expression = { literal - other }
+  def * (other: Symbol): Expression = { literal * other }
+  def / (other: Symbol): Expression = { literal / other }
+  def % (other: Symbol): Expression = { literal % other }
+
+  def && (other: Symbol): Expression = { literal && other }
+  def || (other: Symbol): Expression = { literal || other }
+
+  def <  (other: Symbol): Expression = { literal < other }
+  def <= (other: Symbol): Expression = { literal <= other }
+  def >  (other: Symbol): Expression = { literal > other }
+  def >= (other: Symbol): Expression = { literal >= other }
+  def === (other: Symbol): Expression = { literal === other }
+  def <=> (other: Symbol): Expression = { literal <=> other }
+  def !== (other: Symbol): Expression = { literal !== other }
--- End diff --

Thanks.
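
For context, these conversions are what let a literal appear on the left of a DSL operator; a hypothetical session (assuming this PR's changes are in scope) might look like:

```scala
import org.apache.spark.sql.catalyst.dsl.expressions._

// Without InitialLiteral, 0 < 'x does not compile because Int carries no
// DSL operators; the conversion lifts 0 to Literal(0) first.
val e1 = 0 < 'x     // LessThan(Literal(0), attribute for 'x)
val e2 = 'x > 0     // already supported: Symbol converts to an attribute
```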





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19731906
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Projection.scala
 ---
@@ -139,6 +140,12 @@ class JoinedRow extends Row {
   def getString(i: Int): String =
     if (i < row1.size) row1.getString(i) else row2.getString(i - row1.size)
 
+  def getDate(i: Int): Date =
+    if (i < row1.size) row1.getDate(i) else row2.getDate(i - row1.size)
+
+  def getTimestamp(i: Int): Timestamp =
+    if (i < row1.size) row1.getTimestamp(i) else row2.getTimestamp(i - row1.size)
+
--- End diff --

I see.  I am removing them.





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19733162
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/Row.scala 
---
@@ -42,6 +44,31 @@ object Row {
    * This method can be used to construct a [[Row]] from a [[Seq]] of values.
    */
   def fromSeq(values: Seq[Any]): Row = new GenericRow(values.toArray)
+
+  /**
+   * This method can be used to construct a [[Row]] from a [[Seq]] of Strings,
+   * converting each item to the type specified in a [[StructType]] schema.
+   * Only primitive types can be used.
+   */
+  def fromStringsBySchema(strings: Seq[String], schema: StructType): Row = {
+    val values = for {
+      (field, str) <- schema.fields zip strings
+      item = field.dataType match {
+        case IntegerType    => str.toInt
+        case LongType       => str.toLong
+        case DoubleType     => str.toDouble
+        case FloatType      => str.toFloat
+        case ByteType       => str.toByte
+        case ShortType      => str.toShort
+        case StringType     => str
+        case BooleanType    => (str != "")
+        case DateType       => Date.valueOf(str)
+        case TimestampType  => Timestamp.valueOf(str)
+        case DecimalType()  => new BigDecimal(str)
+      }
+    } yield item
+    new GenericRow(values.toArray)
+  }
--- End diff --

It doesn't need to be a Row method.  I will move it outside (and worry 
about the Boolean case).

There is an issue lurking here, however.  A SchemaRDD should have Rows 
which match its schema.  But mismatches generate no errors until the last 
stage, e.g. when one collects the results of a query.  My thought was that it 
would be helpful to have a safe Row constructor which would check its data 
against a schema and raise an error if they do not match (after applying 
whatever implicit conversions are available).  (Yes, this is a separate issue 
and does not belong here.)
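
A sketch of how the proposed helper would be called (schema and values invented for illustration):

```scala
import org.apache.spark.sql.catalyst.types._

// Each string is parsed according to the declared type of the
// corresponding schema field.
val schema = StructType(Seq(
  StructField("id",   IntegerType, nullable = false),
  StructField("name", StringType,  nullable = true),
  StructField("born", DateType,    nullable = true)))

val row = Row.fromStringsBySchema(Seq("7", "Ada", "1815-12-10"), schema)
// row(0) is the Int 7, row(1) the String "Ada", row(2) a java.sql.Date
```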






[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19733534
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/dataTypes.scala 
---
@@ -187,7 +187,8 @@ case object NullType extends DataType
 
 object NativeType {
   val all = Seq(
-    IntegerType, BooleanType, LongType, DoubleType, FloatType, ShortType, ByteType, StringType)
+    IntegerType, BooleanType, LongType, DoubleType, FloatType, ShortType,
+    ByteType, StringType, DateType, TimestampType)
--- End diff --

Yes, but the line would exceed 100 columns with the two new items, 
DateType and TimestampType, added to it.  (Or so the compiler tells me.)

Perhaps there is a reason why these types should not be regarded as 
Native, but I could not see any such reason.





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19733740
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/types/timetypes.scala 
---
@@ -0,0 +1,97 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.expressions
+
+import java.sql.{Date => JDate, Timestamp => JTimestamp}
+import scala.language.implicitConversions
+
+/*
+ * Subclass of java.sql.Date which provides the usual comparison
+ * operators (as required for catalyst expressions) and which can
+ * be constructed from a string.
+ *
+ * scala> val d1 = Date("2014-02-01")
+ * d1: Date = 2014-02-01
+ *
+ * scala> val d2 = Date("2014-02-02")
+ * d2: Date = 2014-02-02
+ *
+ * scala> d1 < d2
+ * res1: Boolean = true
+ */
+
+class Date(milliseconds: Long) extends JDate(milliseconds) {
--- End diff --

OK.  That makes sense and avoids problems.

But how about if they are still imported as org.apache.spark.sql.Date and 
org.apache.spark.sql.Timestamp at the top level?  My reasoning is based on 
providing consistency for a user.  A user is not required to use RichInts or 
RichStrings as literals in a catalyst expression.  So why would they expect to 
have to use RichDates or RichTimestamps?





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19735638
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/package.scala ---
@@ -451,4 +451,33 @@ package object sql {
* Builder for [[Metadata]]. If there is a key collision, the latter 
will overwrite the former.
*/
   type MetadataBuilder = catalyst.util.MetadataBuilder
+
+  /**
+   * :: DeveloperApi ::
+   *
+   * A Date class which supports the standard comparison operators, for
+   * use in DSL expressions.  Implicit conversions to java.sql.Date
+   * are provided.  The class initializer accepts a String, e.g.
+   *
+   * val ts = Date("2014-01-01")
--- End diff --

Thanks!





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19736442
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/package.scala ---
@@ -451,4 +451,33 @@ package object sql {
* Builder for [[Metadata]]. If there is a key collision, the latter 
will overwrite the former.
*/
   type MetadataBuilder = catalyst.util.MetadataBuilder
+
+  /**
+   * :: DeveloperApi ::
+   *
+   * A Date class which supports the standard comparison operators, for
+   * use in DSL expressions.  Implicit conversions to java.sql.Date
+   * are provided.  The class initializer accepts a String, e.g.
+   *
+   * val ts = Date("2014-01-01")
--- End diff --

And more thanks for reading the code so carefully and for all of the 
thoughtful comments and helpful advice.  I am pushing fixes.





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-61488635
  
I have corrected the issues pointed out in the comments.  Please retest.





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on a diff in the pull request:

https://github.com/apache/spark/pull/3066#discussion_r19737251
  
--- Diff: 
sql/core/src/main/scala/org/apache/spark/sql/columnar/ColumnType.scala ---
@@ -372,6 +372,11 @@ private[sql] object TIMESTAMP extends NativeColumnType(TimestampType, 9, 12) {
   override def setField(row: MutableRow, ordinal: Int, value: Timestamp): Unit = {
     row(ordinal) = value
   }
+
+  def append(v: Date, buffer: ByteBuffer) {
+buffer.putLong(v.getTime)
+  }
+
--- End diff --

This won't compile without an implementation of the append method, because 
it is declared in the abstract ColumnType class.





[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-03 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-61498140
  
Hmmm.  The GitHub docs seem to suggest that I am supposed to mention 
@liancheng and @rxin in my comments.  So I am doing that.  My apologies if this 
is superfluous or annoying.  - @culler





[GitHub] spark pull request: [SPARK-4205][SQL] Timestamp and Date classes w...

2014-11-03 Thread culler
Github user culler commented on the pull request:

https://github.com/apache/spark/pull/3066#issuecomment-61599608
  
Hi Cheng,

I will happily add tests.

There is one thing I could use some help with, though.  It would be very
helpful if I could run the existing test suite.  But I am currently unable
to compile several of the tests involving catalyst expressions.

The problem is that the compiler objects whenever a triple-equals ( === ) is
used inside an assert command.  I realize that this is a feature of
FunSuite.  But with the current spark source distribution and the FunSuite
jar that it installs, this feature does not work for me.  The compiler says
that the === operator expects a Symbol argument, which suggests that it is
somehow being redefined by the catalyst libraries.  (It works if I just import
FunSuite alone.)  Can you tell me the trick for getting FunSuite working
without having to change all of those test suite files?  Thanks, and sorry
to have to bother you with this.

- Marc

On Mon, Nov 3, 2014 at 10:31 PM, Cheng Lian notificati...@github.com
wrote:

 Hey @culler https://github.com/culler, would you please add tests for
 all the new features you introduced in this PR?

 —
 Reply to this email directly or view it on GitHub
 https://github.com/apache/spark/pull/3066#issuecomment-61592236.






[GitHub] spark pull request: Timestamp and Date classes which work in the c...

2014-11-02 Thread culler
GitHub user culler opened a pull request:

https://github.com/apache/spark/pull/3066

Timestamp and Date classes which work in the catalyst DSL.

These Timestamp and Date classes support the usual comparison operators, so 
they can be used in catalyst Expressions.  They come with implicit conversions 
to and from the underlying Java classes.  They also have a convenient 
initializer which accepts a String.

Additional implicit conversions fix an issue with the DSL where expressions 
like 0 < 'x are not recognized.

An independent change adds a method to Row for creating a row from a 
sequence of Strings and a schema.
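
A brief sketch of the intended usage (names assumed from this description, not verbatim from the patch):

```scala
import org.apache.spark.sql.catalyst.dsl._

// The String initializer and comparison operators let dates appear
// directly in catalyst expressions:
val cutoff = Date("2014-01-01")
val recent = 'created >= cutoff

// The additional implicit conversions allow a literal on the left:
val positive = 0 < 'x
```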

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/culler/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/3066.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #3066





