[jira] [Updated] (SPARK-6191) Generalize spark-ec2's ability to download libraries from PyPI

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6191: - Assignee: Nicholas Chammas Generalize spark-ec2's ability to download libraries from PyPI

[jira] [Resolved] (SPARK-5221) FileInputDStream remember window in certain situations causes files to be ignored

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5221. -- Resolution: Not a Problem Hm, if I understand this correctly, you're saying that a file appears with

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354797#comment-14354797 ] Apache Spark commented on SPARK-6244: - User 'catap' has created a pull request for

[jira] [Comment Edited] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354575#comment-14354575 ] Platon Potapov edited comment on SPARK-6232 at 3/10/15 10:35 AM:

[jira] [Resolved] (SPARK-6234) 10% Performance regression with Breeze upgrade

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6234. -- Resolution: Not a Problem The question is, is there a performance regression in real Spark? The speed

[jira] [Commented] (SPARK-5528) Support schema merging while reading Parquet files

2015-03-10 Thread chirag aggarwal (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354731#comment-14354731 ] chirag aggarwal commented on SPARK-5528: This feature shall have severe

[jira] [Resolved] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6177. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4899

[jira] [Updated] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6177: - Priority: Trivial (was: Minor) Assignee: yuhao yang Add note in LDA example to remind possible

[jira] [Updated] (SPARK-6186) make tachyon version configurable in the ec2 script

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6186: - Assignee: cheng chang make tachyon version configurable in the ec2 script

[jira] [Updated] (SPARK-1457) Change APIs for training algorithms to take optimizer as parameter

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-1457: - Priority: Minor (was: Major) In this case, LogisticRegressionWithSGD hard-codes a gradient descent

[jira] [Created] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
Kirill A. Korinskiy created SPARK-6244: -- Summary: Implement VectorSpace to easy create a complicated feature vector Key: SPARK-6244 URL: https://issues.apache.org/jira/browse/SPARK-6244 Project:

[jira] [Resolved] (SPARK-6240) Spark MLlib fpm#FPGrowth genFreqItems use Array[Item] may outOfMemory for Large Sets

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6240. -- Resolution: Not a Problem (This would be better to start as a question on the mailing list.) It

[jira] [Updated] (SPARK-6087) Provide actionable exception if Kryo buffer is not large enough

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6087: - Priority: Major (was: Critical) Fix Version/s: 1.4.0 Assignee: Lev Khomich

[jira] [Commented] (SPARK-5142) Possibly data may be ruined in Spark Streaming's WAL mechanism.

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354737#comment-14354737 ] Sean Owen commented on SPARK-5142: -- Is this really the same issue described by

[jira] [Resolved] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-4325. -- Resolution: Fixed Looks like sub-tasks are all resolved Improve spark-ec2 cluster launch times

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354801#comment-14354801 ] Sean Owen commented on SPARK-6244: -- As described, this does not sound much like a vector

[jira] [Resolved] (SPARK-6239) Spark MLlib fpm#FPGrowth minSupport should use long instead

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6239. -- Resolution: Invalid No, that's definitely wrong. minSupport is not an integer, but a fraction. You

[jira] [Resolved] (SPARK-6191) Generalize spark-ec2's ability to download libraries from PyPI

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6191. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4919

[jira] [Resolved] (SPARK-6186) make tachyon version configurable in the ec2 script

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6186. -- Resolution: Fixed Fix Version/s: 1.4.0 Issue resolved by pull request 4901

[jira] [Commented] (SPARK-5313) Create simple framework for highlighting changes introduced in a PR

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354740#comment-14354740 ] Sean Owen commented on SPARK-5313: -- Is this just an umbrella on the one remaining issue

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354799#comment-14354799 ] Sean Owen commented on SPARK-6232: -- Yes I can reproduce this on 1.2.1. Eventually I stop

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354640#comment-14354640 ] Sean Owen commented on SPARK-5986: -- Do I have this right that: - the models are

[jira] [Resolved] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-5312. -- Resolution: Won't Fix Sounds like a WontFix after investigation into SBT, and we do have a rudimentary

[jira] [Comment Edited] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354575#comment-14354575 ] Platon Potapov edited comment on SPARK-6232 at 3/10/15 10:35 AM:

[jira] [Commented] (SPARK-6240) Spark MLlib fpm#FPGrowth genFreqItems use Array[Item] may outOfMemory for Large Sets

2015-03-10 Thread Littlestar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354890#comment-14354890 ] Littlestar commented on SPARK-6240: --- OK, I know, thanks. I just only notice that

[jira] [Updated] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6232: - Fix Version/s: 1.3.0 Well it's OK in 1.3.0, it seems, according to all of our tests. So I think it's

[jira] [Commented] (SPARK-6239) Spark MLlib fpm#FPGrowth minSupport should use long instead

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354837#comment-14354837 ] Sean Owen commented on SPARK-6239: -- Conversely, you have to know the size of the input if

[jira] [Commented] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354956#comment-14354956 ] Nicholas Chammas commented on SPARK-4325: - At this point it's more an umbrella

[jira] [Commented] (SPARK-4496) smallint (16 bit value) is being send as a 32 bit value in the thrift interface.

2015-03-10 Thread Chip Sands (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354978#comment-14354978 ] Chip Sands commented on SPARK-4496: --- getSchema() is describing a result column as a

[jira] [Commented] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354939#comment-14354939 ] Nicholas Chammas commented on SPARK-4325: - [~srowen] - I should perhaps change the

[jira] [Created] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes

2015-03-10 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-6246: --- Summary: spark-ec2 can't handle clusters with 100 nodes Key: SPARK-6246 URL: https://issues.apache.org/jira/browse/SPARK-6246 Project: Spark Issue

[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354969#comment-14354969 ] Nicholas Chammas commented on SPARK-6246: - FYI [~shivaram]. spark-ec2 can't

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354819#comment-14354819 ] Kirill A. Korinskiy commented on SPARK-6244: Yes, I agree with you that name

[jira] [Issue Comment Deleted] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill A. Korinskiy updated SPARK-6244: --- Comment: was deleted (was: Yes, I agree with you that name Vector Space mayn't

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354820#comment-14354820 ] Kirill A. Korinskiy commented on SPARK-6244: Yes, I agree with you that name

[jira] [Reopened] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas reopened SPARK-4325: - Reopening after updating contains issue links. Improve spark-ec2 cluster launch times

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354891#comment-14354891 ] Platon Potapov commented on SPARK-6232: --- could you tell whether (and when) the fix

[jira] [Commented] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354948#comment-14354948 ] Sean Owen commented on SPARK-4325: -- What does this JIRA represent then that's left to be

[jira] [Commented] (SPARK-6239) Spark MLlib fpm#FPGrowth minSupport should use long instead

2015-03-10 Thread Littlestar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354829#comment-14354829 ] Littlestar commented on SPARK-6239: --- When use FPGrowthModel, the numbers of input

[jira] [Commented] (SPARK-5987) Model import/export for GaussianMixtureModel

2015-03-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356028#comment-14356028 ] Joseph K. Bradley commented on SPARK-5987: -- This is because Spark SQL's

[jira] [Commented] (SPARK-6269) Using a different implementation of java array reflection for size estimation

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356073#comment-14356073 ] Apache Spark commented on SPARK-6269: - User 'mccheah' has created a pull request for

[jira] [Created] (SPARK-6275) Miss toDF() function in docs/sql-programming-guide.md

2015-03-10 Thread zzc (JIRA)
zzc created SPARK-6275: -- Summary: Miss toDF() function in docs/sql-programming-guide.md Key: SPARK-6275 URL: https://issues.apache.org/jira/browse/SPARK-6275 Project: Spark Issue Type: Documentation

[jira] [Commented] (SPARK-6245) jsonRDD() of empty RDD results in exception

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6245?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356030#comment-14356030 ] Apache Spark commented on SPARK-6245: - User 'srowen' has created a pull request for

[jira] [Updated] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-6268: - Assignee: yuhao yang KMeans parameter getter methods ---

[jira] [Commented] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356125#comment-14356125 ] yuhao yang commented on SPARK-6268: --- Sure, I'll propose a PR very soon. Thanks! KMeans

[jira] [Created] (SPARK-6273) Got error when do join

2015-03-10 Thread Jeff (JIRA)
Jeff created SPARK-6273: --- Summary: Got error when do join Key: SPARK-6273 URL: https://issues.apache.org/jira/browse/SPARK-6273 Project: Spark Issue Type: Bug Affects Versions: 1.2.1

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355196#comment-14355196 ] Sean Owen commented on SPARK-5312: -- Oh OK, I resolved given the discussion above, but if

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355345#comment-14355345 ] Tathagata Das commented on SPARK-6232: -- Yeah, I saw the comment :) I just wanted the

[jira] [Resolved] (SPARK-4921) TaskSetManager mistakenly returns PROCESS_LOCAL for NO_PREF tasks

2015-03-10 Thread Sandy Ryza (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4921?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza resolved SPARK-4921. --- Resolution: Won't Fix TaskSetManager mistakenly returns PROCESS_LOCAL for NO_PREF tasks

[jira] [Created] (SPARK-6247) Certain self joins cannot be analyzed

2015-03-10 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6247: --- Summary: Certain self joins cannot be analyzed Key: SPARK-6247 URL: https://issues.apache.org/jira/browse/SPARK-6247 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-6248) LocalRelation needs to implement statistics

2015-03-10 Thread Yin Huai (JIRA)
Yin Huai created SPARK-6248: --- Summary: LocalRelation needs to implement statistics Key: SPARK-6248 URL: https://issues.apache.org/jira/browse/SPARK-6248 Project: Spark Issue Type: Bug

[jira] [Comment Edited] (SPARK-5987) Model import/export for GaussianMixtureModel

2015-03-10 Thread Manoj Kumar (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355274#comment-14355274 ] Manoj Kumar edited comment on SPARK-5987 at 3/10/15 5:32 PM: -

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355321#comment-14355321 ] Sean Owen commented on SPARK-6232: -- Yeah I believe he updated the comment above

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Xiangrui Meng (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355337#comment-14355337 ] Xiangrui Meng commented on SPARK-6244: -- Agree with Sean that this is not a vector

[jira] [Updated] (SPARK-6216) Check Python version in worker before run PySpark job

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6216: - Component/s: PySpark Check Python version in worker before run PySpark job

[jira] [Comment Edited] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356106#comment-14356106 ] yuhao yang edited comment on SPARK-6268 at 3/11/15 2:14 AM: Hi

[jira] [Commented] (SPARK-3438) Support for accessing secured HDFS in Standalone Mode

2015-03-10 Thread Tao Wang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356114#comment-14356114 ] Tao Wang commented on SPARK-3438: - Looks like this issue is the same as that in SPARK-5158.

[jira] [Commented] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356123#comment-14356123 ] Joseph K. Bradley commented on SPARK-6268: -- It's not rude at all! I made a bunch

[jira] [Created] (SPARK-6270) Standalone Master hangs when streaming job completes

2015-03-10 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-6270: Summary: Standalone Master hangs when streaming job completes Key: SPARK-6270 URL: https://issues.apache.org/jira/browse/SPARK-6270 Project: Spark Issue

[jira] [Closed] (SPARK-6272) Sort these tokens in alphabetic order to avoid further duplicate in HiveQl

2015-03-10 Thread DoingDone9 (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 closed SPARK-6272. - Resolution: Duplicate Sort these tokens in alphabetic order to avoid further duplicate in HiveQl

[jira] [Created] (SPARK-6274) Add streaming examples showing integration with DataFrames and SQL

2015-03-10 Thread Tathagata Das (JIRA)
Tathagata Das created SPARK-6274: Summary: Add streaming examples showing integration with DataFrames and SQL Key: SPARK-6274 URL: https://issues.apache.org/jira/browse/SPARK-6274 Project: Spark

[jira] [Commented] (SPARK-6274) Add streaming examples showing integration with DataFrames and SQL

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6274?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356328#comment-14356328 ] Apache Spark commented on SPARK-6274: - User 'tdas' has created a pull request for this

[jira] [Updated] (SPARK-6270) Standalone Master hangs when streaming job completes

2015-03-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-6270: - Description: If the event logging is enabled, the Spark Standalone Master tries to recreate the

[jira] [Created] (SPARK-6272) Sort these tokens in alphabetic order to avoid further duplicate in HiveQl

2015-03-10 Thread DoingDone9 (JIRA)
DoingDone9 created SPARK-6272: - Summary: Sort these tokens in alphabetic order to avoid further duplicate in HiveQl Key: SPARK-6272 URL: https://issues.apache.org/jira/browse/SPARK-6272 Project: Spark

[jira] [Updated] (SPARK-6270) Standalone Master hangs when streaming job completes

2015-03-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-6270: - Description: If the event logging is enabled, the Spark Standalone Master tries to recreate the

[jira] [Commented] (SPARK-6270) Standalone Master hangs when streaming job completes

2015-03-10 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356142#comment-14356142 ] Tathagata Das commented on SPARK-6270: -- [~joshrosen] Another user other than Netflix

[jira] [Commented] (SPARK-6268) KMeans parameter getter methods

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356270#comment-14356270 ] Apache Spark commented on SPARK-6268: - User 'hhbyyh' has created a pull request for

[jira] [Commented] (SPARK-6271) Sort these tokens in alphabetic order to avoid further duplicate in HiveQl

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356152#comment-14356152 ] Apache Spark commented on SPARK-6271: - User 'DoingDone9' has created a pull request

[jira] [Commented] (SPARK-4852) Hive query plan deserialization failure caused by shaded hive-exec jar file when generating golden answers

2015-03-10 Thread Kannan Rajah (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356173#comment-14356173 ] Kannan Rajah commented on SPARK-4852: - We are hitting this issue in a production case,

[jira] [Closed] (SPARK-6177) Add note in LDA example to remind possible coalesce

2015-03-10 Thread yuhao yang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6177?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] yuhao yang closed SPARK-6177. - Fixed and merged, thanks. Add note in LDA example to remind possible coalesce

[jira] [Created] (SPARK-6271) Sort these tokens in alphabetic order to avoid further duplicate in HiveQl

2015-03-10 Thread DoingDone9 (JIRA)
DoingDone9 created SPARK-6271: - Summary: Sort these tokens in alphabetic order to avoid further duplicate in HiveQl Key: SPARK-6271 URL: https://issues.apache.org/jira/browse/SPARK-6271 Project: Spark

[jira] [Issue Comment Deleted] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kirill A. Korinskiy updated SPARK-6244: --- Comment: was deleted (was: Yes, this way sounds good. I can use same issue and pull

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356164#comment-14356164 ] Kirill A. Korinskiy commented on SPARK-6244: Yes, this way sounds good. I can

[jira] [Commented] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Kirill A. Korinskiy (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14356165#comment-14356165 ] Kirill A. Korinskiy commented on SPARK-6244: Yes, this way sounds good. I can

[jira] [Updated] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6244: - Component/s: MLlib Implement VectorSpace to easy create a complicated feature vector

[jira] [Updated] (SPARK-6242) Support replace (drop) column for parquet table

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6242: - Component/s: SQL Support replace (drop) column for parquet table

[jira] [Commented] (SPARK-3878) Benchmarks and common tests for mllib algorithm

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355005#comment-14355005 ] Apache Spark commented on SPARK-3878: - User 'epahomov' has created a pull request for

[jira] [Commented] (SPARK-6220) Allow extended EC2 options to be passed through spark-ec2

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14354991#comment-14354991 ] Nicholas Chammas commented on SPARK-6220: - Another thought to add, there are

[jira] [Commented] (SPARK-4325) Improve spark-ec2 cluster launch times

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355000#comment-14355000 ] Sean Owen commented on SPARK-4325: -- Cool, if it's just a grouping issue with no 'content'

[jira] [Commented] (SPARK-4496) smallint (16 bit value) is being send as a 32 bit value in the thrift interface.

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355001#comment-14355001 ] Sean Owen commented on SPARK-4496: -- Where does this happen in the code? is it obvious or

[jira] [Updated] (SPARK-6244) Implement VectorSpace to easy create a complicated feature vector

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6244?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-6244: - Priority: Minor (was: Major) Implement VectorSpace to easy create a complicated feature vector

[jira] [Commented] (SPARK-4496) smallint (16 bit value) is being send as a 32 bit value in the thrift interface.

2015-03-10 Thread Chip Sands (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355084#comment-14355084 ] Chip Sands commented on SPARK-4496: --- I have not looked at the Spark code. But it would be

[jira] [Updated] (SPARK-4496) smallint (16 bit value) is being send as a 32 bit value in the thrift interface.

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4496: - Component/s: (was: Input/Output) SQL smallint (16 bit value) is being send as a

[jira] [Updated] (SPARK-4122) Add library to write data back to Kafka

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4122: - Issue Type: Improvement (was: Bug) Add library to write data back to Kafka

[jira] [Commented] (SPARK-3278) Isotonic regression

2015-03-10 Thread Vladimir Vladimirov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355119#comment-14355119 ] Vladimir Vladimirov commented on SPARK-3278: Martin. This would be really

[jira] [Commented] (SPARK-5986) Model import/export for KMeansModel

2015-03-10 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355134#comment-14355134 ] Joseph K. Bradley commented on SPARK-5986: -- Yes, that's correct. It doesn't

[jira] [Commented] (SPARK-5273) Improve documentation examples for LinearRegression

2015-03-10 Thread Sandeep Narayanaswami (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5273?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355058#comment-14355058 ] Sandeep Narayanaswami commented on SPARK-5273: -- I can pick this up.

[jira] [Commented] (SPARK-4012) Uncaught OOM in ContextCleaner

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14355080#comment-14355080 ] Sean Owen commented on SPARK-4012: -- Andrew suggested this is maybe WontFix in his review,

[jira] [Updated] (SPARK-6241) hiveql ANALYZE TABLE doesn't work for external tables

2015-03-10 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-6241: Target Version/s: 1.4.0 hiveql ANALYZE TABLE doesn't work for external tables

[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355004#comment-14355004 ] Sean Owen commented on SPARK-6246: -- The funny thing is, the typo in that error message

[jira] [Commented] (SPARK-3106) Fix the race condition issue about Connection and ConnectionManager

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3106?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355038#comment-14355038 ] Sean Owen commented on SPARK-3106: -- Given the discussion, I'm not clear if this is a

[jira] [Resolved] (SPARK-2765) While running Spark through YARN Control Containers logging.

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2765. -- Resolution: Not a Problem SPARK_LOG4J_CONF has been deprecated for some time, so I do not think this is

[jira] [Commented] (SPARK-4012) Uncaught OOM in ContextCleaner

2015-03-10 Thread Nan Zhu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355151#comment-14355151 ] Nan Zhu commented on SPARK-4012: [~srowen], actually I got more understanding on the

[jira] [Updated] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5312: Description: We currently use an [unwieldy grep/sed

[jira] [Commented] (SPARK-5312) Use sbt to detect new or changed public classes in PRs

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355622#comment-14355622 ] Nicholas Chammas commented on SPARK-5312: - Thanks for looking into this [~boyork].

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-10 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355632#comment-14355632 ] Apache Spark commented on SPARK-6222: - User 'harishreedharan' has created a pull

[jira] [Commented] (SPARK-6246) spark-ec2 can't handle clusters with 100 nodes

2015-03-10 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355642#comment-14355642 ] Nicholas Chammas commented on SPARK-6246: - I dunno, I haven't looked into the

[jira] [Commented] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Platon Potapov (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355692#comment-14355692 ] Platon Potapov commented on SPARK-6232: --- * i've tried 1.3.0 and it seemed to work.

[jira] [Resolved] (SPARK-6232) Spark Streaming: simple application stalls processing

2015-03-10 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-6232. -- Resolution: Duplicate Fix Version/s: (was: 1.3.0) OK, resolving as a Duplicate then. I mean

[jira] [Commented] (SPARK-6222) [STREAMING] All data may not be recovered from WAL when driver is killed

2015-03-10 Thread Hari Shreedharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14355731#comment-14355731 ] Hari Shreedharan commented on SPARK-6222: - In the direct connector for Kafka, we
