Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
Thanks Sean. LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so,
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65821/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65821 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65821/consoleFull)**
for PR 15179 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65821 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65821/consoleFull)**
for PR 15179 at commit
Github user jkbradley commented on the issue:
https://github.com/apache/spark/pull/15179
Accepting more input types SGTM too (with unit tests). The PR title and
description (and perhaps the JIRA too) should be updated. Thanks!
---
If your project is set up for it, you can reply to
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
Sorry, I meant unit test
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/15179
Sorry, what's a UT here? user defined type?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
LGTM. How about adding an UT for the input type new ArrayType(StringType,
false)?
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65769/
Test PASSed.
---
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65769 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65769/consoleFull)**
for PR 15179 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65769 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65769/consoleFull)**
for PR 15179 at commit
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/15179
Yes that's better IMHO. It makes this more consistent. Word2Vec was the
only case that looked for an exact match on nullability when it was expecting
nullable input (i.e. pointlessly disallowed
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
`CountVectorizer` can accept both `Array(String, true)` and `Array(String,
false)`. Perhaps we can just change the validation code.
---
If your project is set up for it, you can reply to this email
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/15179
I think the problem is that it won't accept anything that outputs
`Array(String, true)`? it seems like it's looking for exactly the same type.
Elsewhere I see a similar check written so as to not
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
Checked locally and Word2Vec can work with `Array(String, false)`. We can
loosen the constraint. Maybe a new UT can be added.
---
If your project is set up for it, you can reply to this email and
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
I'm not sure what will happen if we actually send an `Array(String, false)`
column to Word2Vec. If nothing goes wrong, I think we can loosen the constraint.
I'll try something locally.
---
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/15179
Or really to have it accept an array of strings, whether nullable or not.
Right now it can accept a nullable string array type, but will pointlessly
reject input that is _not_ nullable. The check is
Github user hhbyyh commented on the issue:
https://github.com/apache/spark/pull/15179
Hi @srowen. By fixing `Word2Vec`, do you mean to change its input type to
`Array(String, false)`?
I cannot recall all the details. Right now I think the primary reason is to
be consistent
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Merged build finished. Test PASSed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/15179
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/65711/
Test PASSed.
---
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65711 has
finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65711/consoleFull)**
for PR 15179 at commit
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/15179
**[Test build #65711 has
started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/65711/consoleFull)**
for PR 15179 at commit
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/15179
This change merely parallels https://github.com/apache/spark/pull/7414 from
@hhbyyh (what do you think?) I'd still sort of like to understand why we
didn't 'fix' Word2Vec instead in that change?
25 matches
Mail list logo