Re: [VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-06 Thread Patrick Wendell
I'll kick it off with a +1. On Thu, Mar 5, 2015 at 6:52 PM, Patrick Wendell wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.3.0! > > The tag to be voted on is v1.3.0-rc3 (commit 4aaf48d4): > https://git-wip-us.apache.org/repos/asf?p=

[jira] [Updated] (SPARK-5345) Flaky test: o.a.s.deploy.history.FsHistoryProviderSuite

2015-03-06 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5345: --- Fix Version/s: 1.3.0 > Flaky test: o.a.s.deploy.history.FsHistoryProviderSu

[jira] [Updated] (SPARK-6141) Upgrade Breeze to 0.11 to fix convergence bug

2015-03-05 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6141: --- Fix Version/s: (was: 1.3.1) 1.3.0 > Upgrade Breeze to 0.11 to

[RESULT] [VOTE] Release Apache Spark 1.3.0 (RC2)

2015-03-05 Thread Patrick Wendell
from that I ran a set of tests on top of standalone and yarn >> and things look good. >> >> On Tue, Mar 3, 2015 at 8:19 PM, Patrick Wendell wrote: >>> Please vote on releasing the following candidate as Apache Spark version >>> 1.3.0! >>> >>

[VOTE] Release Apache Spark 1.3.0 (RC3)

2015-03-05 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.3.0! The tag to be voted on is v1.3.0-rc3 (commit 4aaf48d4): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=4aaf48d46d13129f0f9bdafd771dd80fe568a7dc The release files, including signatures, digests, etc. ca

Re: Spark v1.2.1 failing under BigTop build in External Flume Sink (due to missing Netty library)

2015-03-05 Thread Patrick Wendell
You may need to add the -Phadoop-2.4 profile. When building our release packages for Hadoop 2.4 we use the following flags: -Phadoop-2.4 -Phive -Phive-thriftserver -Pyarn - Patrick On Thu, Mar 5, 2015 at 12:47 PM, Kelly, Jonathan wrote: > I confirmed that this has nothing to do with BigTop by ru

[jira] [Updated] (SPARK-6175) Executor log links are using internal addresses in EC2; display `:0` when ephemeral ports are used

2015-03-05 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6175: --- Priority: Blocker (was: Major) > Executor log links are using internal addresses in

[jira] [Resolved] (SPARK-6182) spark-parent pom needs to be published for both 2.10 and 2.11

2015-03-05 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6182. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Sean Owen > spark-par

Re: enum-like types in Spark

2015-03-05 Thread Patrick Wendell
not be a trait > >> > >> object StorageLevel { > >> private[this] case object _MemoryOnly extends StorageLevel > >> final val MemoryOnly: StorageLevel = _MemoryOnly > >> > >> private[this] case object _DiskOnly extends StorageLevel >
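The quoted fragment above is cut off mid-definition. As a rough, self-contained sketch of the pattern it appears to show: `StorageLevel` below is a stand-in sealed abstract class, not Spark's real `org.apache.spark.storage.StorageLevel`, and `values`, `describe`, and `EnumSketch` are hypothetical names added only for illustration.

```scala
// Stand-in type for illustration; an abstract class rather than a trait,
// as the quoted message suggests.
sealed abstract class StorageLevel

object StorageLevel {
  // The case objects stay private; callers only see the typed vals.
  private[this] case object _MemoryOnly extends StorageLevel
  final val MemoryOnly: StorageLevel = _MemoryOnly

  private[this] case object _DiskOnly extends StorageLevel
  final val DiskOnly: StorageLevel = _DiskOnly

  // Hypothetical helper (not in the quoted fragment): list every value.
  val values: Seq[StorageLevel] = Seq(MemoryOnly, DiskOnly)
}

object EnumSketch {
  // The case objects are private, so callers compare with == instead of
  // pattern matching on them.
  def describe(level: StorageLevel): String =
    if (level == StorageLevel.MemoryOnly) "memory" else "disk"

  def main(args: Array[String]): Unit =
    StorageLevel.values.foreach(level => println(describe(level)))
}
```

Keeping the values inside an object named after the type means call sites read as `StorageLevel.MemoryOnly`, and the declared type of each value is `StorageLevel` rather than a singleton type.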

[jira] [Created] (SPARK-6182) spark-parent pom needs to be published for both 2.10 and 2.11

2015-03-04 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-6182: -- Summary: spark-parent pom needs to be published for both 2.10 and 2.11 Key: SPARK-6182 URL: https://issues.apache.org/jira/browse/SPARK-6182 Project: Spark

[jira] [Resolved] (SPARK-5143) spark-network-yarn 2.11 depends on spark-network-shuffle 2.10

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5143. Resolution: Fixed Fix Version/s: 1.3.0 > spark-network-yarn 2.11 depends on sp

[jira] [Resolved] (SPARK-6149) Spark SQL CLI doesn't work when compiled against Hive 12 with SBT because of runtime incompatibility issues caused by Guava 15

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6149. Resolution: Fixed Fix Version/s: 1.3.0 > Spark SQL CLI doesn't work when

Re: enum-like types in Spark

2015-03-04 Thread Patrick Wendell
I like #4 as well and agree with Aaron's suggestion. - Patrick On Wed, Mar 4, 2015 at 6:07 PM, Aaron Davidson wrote: > I'm cool with #4 as well, but make sure we dictate that the values should > be defined within an object with the same name as the enumeration (like we > do for StorageLevel). Ot

Re: Task result is serialized twice by serializer and closure serializer

2015-03-04 Thread Patrick Wendell
ince the byte array for the serialized task result > shouldn't account for the majority of memory footprint anyways, I'm okay > with leaving it as is, then. > > Thanks, > Mingyu > > > > > > On 3/4/15, 5:07 PM, "Patrick Wendell" wrote: > >>Hey Min

Re: Task result is serialized twice by serializer and closure serializer

2015-03-04 Thread Patrick Wendell
Hey Mingyu, I think it's broken out separately so we can record the time taken to serialize the result. Once we serialize it once, the second serialization should be really simple since it's just wrapping something that has already been turned into a byte buffer. Do you see a specific issue with
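The two-step idea described above can be sketched roughly as follows. This is not Spark's actual code: `WrappedResult`, `serialize`, and `DoubleSerializationSketch` are hypothetical names, and plain Java serialization stands in for whichever serializers Spark uses.

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

// Hypothetical wrapper: carries the already-serialized result plus the
// time the first serialization took.
case class WrappedResult(bytes: Array[Byte], serializeMillis: Long)

object DoubleSerializationSketch {
  // Serialize any object to a byte array with Java serialization.
  def serialize(value: AnyRef): Array[Byte] = {
    val buffer = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(buffer)
    out.writeObject(value)
    out.close()
    buffer.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val result: Array[Int] = Array.tabulate(100000)(identity)

    // First pass: the expensive step, timed on its own.
    val start = System.nanoTime()
    val resultBytes = serialize(result)
    val elapsedMillis = (System.nanoTime() - start) / 1000000

    // Second pass: only wraps a byte array and a number, so it is
    // essentially a copy of bytes that already exist.
    val wrappedBytes = serialize(WrappedResult(resultBytes, elapsedMillis))

    println(s"first pass: ${resultBytes.length} bytes, " +
      s"second pass: ${wrappedBytes.length} bytes")
  }
}
```

Splitting the passes is what makes the timing possible: the elapsed time of the first pass can only be stored in the wrapper once that pass has finished.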

[jira] [Commented] (SPARK-5143) spark-network-yarn 2.11 depends on spark-network-shuffle 2.10

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347567#comment-14347567 ] Patrick Wendell commented on SPARK-5143: Yes - good catch Sean. Curious that

[jira] [Updated] (SPARK-6144) When in cluster mode using ADD JAR with a hdfs:// sourced jar will fail

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6144: --- Component/s: Spark Core > When in cluster mode using ADD JAR with a hdfs:// sourced jar w

Re: [VOTE] Release Apache Spark 1.3.0 (RC2)

2015-03-04 Thread Patrick Wendell
sider > https://issues.apache.org/jira/browse/SPARK-6144 a serious regression > from 1.2 (since it affects existing "addFile()" functionality if the > URL is "hdfs:..."). > > Will test other parts separately. > > On Tue, Mar 3, 2015 at 8:19 PM, Patrick Wen

[jira] [Commented] (SPARK-6149) Spark SQL CLI doesn't work when compiled against Hive 12 with SBT because of runtime incompatibility issues caused by Guava 15

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14347210#comment-14347210 ] Patrick Wendell commented on SPARK-6149: Since this only affects the sbt b

[jira] [Updated] (SPARK-6149) Spark SQL CLI doesn't work when compiled against Hive 12 with SBT because of runtime incompatibility issues caused by Guava 15

2015-03-04 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6149: --- Priority: Critical (was: Blocker) > Spark SQL CLI doesn't work when compiled against

[jira] [Commented] (SPARK-6149) Spark SQL CLI doesn't work when compiled against Hive 12 with SBT because of runtime incompatibility issues caused by Guava 15

2015-03-03 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346519#comment-14346519 ] Patrick Wendell commented on SPARK-6149: To be more specific, I am sugges

[jira] [Commented] (SPARK-6149) Spark SQL CLI doesn't work when compiled against Hive 12 with SBT because of runtime incompatibility issues caused by Guava 15

2015-03-03 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14346515#comment-14346515 ] Patrick Wendell commented on SPARK-6149: Yes - because of this I think si

[VOTE] Release Apache Spark 1.3.0 (RC2)

2015-03-03 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.3.0! The tag to be voted on is v1.3.0-rc2 (commit 3af2687): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=3af26870e5163438868c4eb2df88380a533bb232 The release files, including signatures, digests, etc. can

[RESULT] [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-03-03 Thread Patrick Wendell
This vote is cancelled in favor of RC2. On Thu, Feb 26, 2015 at 9:50 AM, Sandor Van Wassenhove wrote: > FWIW, I tested the first rc and saw no regressions. I ran our benchmarks > built against spark 1.3 and saw results consistent with spark 1.2/1.2.1. > > On 2/25/15, 5:51 PM, "

[jira] [Updated] (SPARK-6144) When in cluster mode using ADD JAR with a hdfs:// sourced jar will fail

2015-03-03 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6144: --- Target Version/s: 1.3.0 > When in cluster mode using ADD JAR with a hdfs:// sourced jar w

[jira] [Updated] (SPARK-6144) When in cluster mode using ADD JAR with a hdfs:// sourced jar will fail

2015-03-03 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6144: --- Priority: Blocker (was: Major) > When in cluster mode using ADD JAR with a hdfs:// sour

[jira] [Updated] (SPARK-6122) Upgrade Tachyon dependency to 0.6.0

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6122: --- Assignee: Calvin Jia > Upgrade Tachyon dependency to 0.

[jira] [Updated] (SPARK-6122) Upgrade Tachyon dependency to 0.6.0

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6122: --- Fix Version/s: (was: 1.3.0) > Upgrade Tachyon dependency to 0.

[jira] [Updated] (SPARK-6122) Upgrade Tachyon dependency to 0.6.0

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6122: --- Assignee: Patrick Wendell > Upgrade Tachyon dependency to 0.

[jira] [Updated] (SPARK-6122) Upgrade Tachyon dependency to 0.6.0

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6122: --- Target Version/s: 1.4.0 > Upgrade Tachyon dependency to 0.

[jira] [Updated] (SPARK-6122) Upgrade Tachyon dependency to 0.6.0

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6122: --- Assignee: (was: Patrick Wendell) > Upgrade Tachyon dependency to 0.

[jira] [Resolved] (SPARK-6048) SparkConf.translateConfKey should not translate on set

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6048. Resolution: Fixed Fix Version/s: 1.3.0 > SparkConf.translateConfKey should

[jira] [Resolved] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log

2015-03-02 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6066. Resolution: Fixed Fix Version/s: 1.3.0 Thanks Andrew and Marcelo for your work on

Re: spark-ec2 default to Hadoop 2

2015-03-01 Thread Patrick Wendell
Yeah calling it Hadoop 2 was a very bad naming choice (of mine!), this was back when CDH4 was the only real distribution available with some of the newer Hadoop APIs and packaging. I think to not surprise people using this, it's best to keep v1 as the default. Overall, we try not to change defaul

[jira] [Updated] (SPARK-6087) Provide actionable exception if Kryo buffer is not large enough

2015-03-01 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6087: --- Labels: starter (was: ) > Provide actionable exception if Kryo buffer is not large eno

[jira] [Updated] (SPARK-6086) Exceptions in DAGScheduler.updateAccumulators

2015-02-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6086: --- Component/s: SQL > Exceptions in DAGScheduler.updateAccumulat

[jira] [Updated] (SPARK-6086) Exceptions in DAGScheduler.updateAccumulators

2015-02-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6086: --- Component/s: Spark Core > Exceptions in DAGScheduler.updateAccumulat

[jira] [Updated] (SPARK-6086) Exceptions in DAGScheduler.updateAccumulators

2015-02-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6086: --- Description: Class Cast Exceptions in DAGScheduler.updateAccumulators, when DAGScheduler is

[jira] [Commented] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log

2015-02-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341881#comment-14341881 ] Patrick Wendell commented on SPARK-6066: [~vanzin] - yes you are right (an e

[jira] [Created] (SPARK-6087) Provide actionable exception if Kryo buffer is not large enough

2015-02-28 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-6087: -- Summary: Provide actionable exception if Kryo buffer is not large enough Key: SPARK-6087 URL: https://issues.apache.org/jira/browse/SPARK-6087 Project: Spark

[jira] [Updated] (SPARK-6087) Provide actionable exception if Kryo buffer is not large enough

2015-02-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6087: --- Description: Right now if you don't have a large enough Kryo buffer, you get a r

[jira] [Resolved] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5979. Resolution: Fixed Fix Version/s: 1.3.0 > `--packages` should not exclude sp

[jira] [Resolved] (SPARK-6032) Move ivy logging to System.err in --packages

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6032. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Burak Yavuz > Move

[jira] [Updated] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5979?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5979: --- Assignee: Burak Yavuz > `--packages` should not exclude spark streaming assembly jars

[jira] [Resolved] (SPARK-6070) Yarn Shuffle Service jar packages too many dependencies

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-6070. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Marcelo Vanzin > Y

[jira] [Updated] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6050: --- Assignee: Marcelo Vanzin > Spark on YARN does not work --executor-cores is specif

[jira] [Updated] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6066: --- Component/s: Spark Core > Metadata in event log makes it very difficult for exter

[jira] [Commented] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14341015#comment-14341015 ] Patrick Wendell commented on SPARK-6066: Hey Marcelo, I agree having a pu

[jira] [Commented] (SPARK-6066) Metadata in event log makes it very difficult for external libraries to parse event log

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340882#comment-14340882 ] Patrick Wendell commented on SPARK-6066: What if as a simple fix we do t

[jira] [Commented] (SPARK-6048) SparkConf.translateConfKey should translate on get, not set

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14340740#comment-14340740 ] Patrick Wendell commented on SPARK-6048: Okay I just talked to [~vanzin] off

[jira] [Updated] (SPARK-6055) Memory leak in pyspark sql due to incorrect equality check

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-6055: --- Summary: Memory leak in pyspark sql due to incorrect equality check (was: memory leak in

Re: Is SPARK_CLASSPATH really deprecated?

2015-02-27 Thread Patrick Wendell
I think we need to just update the docs, it is a bit unclear right now. At the time, we worded it fairly sternly because we really wanted people to use --jars when we deprecated SPARK_CLASSPATH. But there are other types of deployments where there is a legitimate need to augment the classpath

[jira] [Commented] (SPARK-6050) Spark on YARN does not work --executor-cores is specified

2015-02-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339857#comment-14339857 ] Patrick Wendell commented on SPARK-6050: [~mrid...@yahoo-inc.com] thanks

[jira] [Comment Edited] (SPARK-6048) SparkConf.translateConfKey should translate on get, not set

2015-02-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339629#comment-14339629 ] Patrick Wendell edited comment on SPARK-6048 at 2/27/15 2:3

[jira] [Commented] (SPARK-6048) SparkConf.translateConfKey should translate on get, not set

2015-02-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6048?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14339629#comment-14339629 ] Patrick Wendell commented on SPARK-6048: Hey All, No options on which desig

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-25 Thread Patrick Wendell
org.apache.spark.scheduler.Task.run(Task.scala:64) at >> > org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:197) >> > at >> > >> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) >> > at >> >

Re: UnusedStubClass in 1.3.0-rc1

2015-02-25 Thread Patrick Wendell
This has been around for multiple versions of Spark, so I am a bit surprised to see it not working in your build. - Patrick On Wed, Feb 25, 2015 at 9:41 AM, Patrick Wendell wrote: > Hey Cody, > > What build command are you using? In any case, we can actually comment > out the "

Re: UnusedStubClass in 1.3.0-rc1

2015-02-25 Thread Patrick Wendell
Hey Cody, What build command are you using? In any case, we can actually comment out the "unused" thing now in the root pom.xml. It existed just to ensure that at least one dependency was listed in the shade plugin configuration (otherwise, some work we do that requires the shade plugin does not h

[jira] [Updated] (SPARK-3851) Support for reading parquet files with different but compatible schema

2015-02-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3851: --- Fix Version/s: 1.3.0 > Support for reading parquet files with different but compatible sch

Re: Add PredictionIO to Powered by Spark

2015-02-24 Thread Patrick Wendell
Added - thanks! I trimmed it down a bit to fit our normal description length. On Mon, Jan 5, 2015 at 8:24 AM, Thomas Stone wrote: > Please can we add PredictionIO to > https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark > > PredictionIO > http://prediction.io/ > > PredictionIO is a

[jira] [Commented] (SPARK-5845) Time to cleanup intermediate shuffle files not included in shuffle write time

2015-02-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14335667#comment-14335667 ] Patrick Wendell commented on SPARK-5845: [~kayousterhout] did you mean the

Re: Can you add Big Industries to the Powered by Spark page?

2015-02-24 Thread Patrick Wendell
I've added it, thanks! On Fri, Feb 20, 2015 at 12:22 AM, Emre Sevinc wrote: > > Hello, > > Could you please add Big Industries to the Powered by Spark page at > https://cwiki.apache.org/confluence/display/SPARK/Powered+By+Spark ? > > > Company Name: Big Industries > > URL: http://www.bigi

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-23 Thread Patrick Wendell
It's only been reported on this thread by Tom, so far. On Mon, Feb 23, 2015 at 10:29 AM, Marcelo Vanzin wrote: > Hey Patrick, > > Do you have a link to the bug related to Python and Yarn? I looked at > the blockers in Jira but couldn't find it. > > On Mon, Feb 2

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-23 Thread Patrick Wendell
So actually, the list of blockers on JIRA is a bit outdated. These days I won't cut RC1 unless there are no known issues that I'm aware of that would actually block the release (that's what the snapshot ones are for). I'm going to clean those up and push others to do so also. The main issues I'm a

[jira] [Commented] (SPARK-5463) Fix Parquet filter push-down

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14333608#comment-14333608 ] Patrick Wendell commented on SPARK-5463: Bumping to critical. Per our off

[jira] [Resolved] (SPARK-5904) DataFrame methods with varargs do not work in Java

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5904?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5904. Resolution: Fixed Fix Version/s: 1.3.0 I think rxin just forgot to close this. It

[jira] [Updated] (SPARK-3650) Triangle Count handles reverse edges incorrectly

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3650: --- Priority: Critical (was: Blocker) > Triangle Count handles reverse edges incorrec

[jira] [Updated] (SPARK-5463) Fix Parquet filter push-down

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5463: --- Priority: Critical (was: Blocker) > Fix Parquet filter push-d

[jira] [Resolved] (SPARK-3511) Create a RELEASE-NOTES.txt file in the repo

2015-02-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3511. Resolution: Won't Fix Never ended up doing this. It's stale so I'm just

Re: FW: Submitting jobs to Spark EC2 cluster remotely

2015-02-23 Thread Patrick Wendell
. > > I think that what you are saying is exactly the issue: on my master node UI > at the bottom I can see the list of "Completed Drivers" all with ERROR > state... > > Thanks, > Oleg > > -Original Message- > From: Patrick Wendell [mailto:pwend

Re: Submitting jobs to Spark EC2 cluster remotely

2015-02-23 Thread Patrick Wendell
ontext$.blockOn(BlockContext.scala:53) > at scala.concurrent.Await$.result(package.scala:107) > at > org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:127) > at > org.apache.spark.deploy.Sp

[jira] [Commented] (SPARK-5916) $SPARK_HOME/bin/beeline conflicts with $HIVE_HOME/bin/beeline

2015-02-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332001#comment-14332001 ] Patrick Wendell commented on SPARK-5916: The naming conflict is unfortu

[jira] [Commented] (SPARK-5920) Use a BufferedInputStream to read local shuffle data

2015-02-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14331986#comment-14331986 ] Patrick Wendell commented on SPARK-5920: We should definitely do this. >

[jira] [Updated] (SPARK-5920) Use a BufferedInputStream to read local shuffle data

2015-02-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5920: --- Priority: Blocker (was: Critical) > Use a BufferedInputStream to read local shuffle d

[jira] [Updated] (SPARK-5920) Use a BufferedInputStream to read local shuffle data

2015-02-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5920: --- Priority: Critical (was: Major) > Use a BufferedInputStream to read local shuffle d

[jira] [Commented] (SPARK-2389) globally shared SparkContext / shared Spark "application"

2015-02-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14327903#comment-14327903 ] Patrick Wendell commented on SPARK-2389: I've seen some variants of this

[jira] [Resolved] (SPARK-5887) Class not found exception com.datastax.spark.connector.rdd.partitioner.CassandraPartition

2015-02-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5887. Resolution: Invalid The Datastax connector is not part of the Apache Spark distribution

[jira] [Updated] (SPARK-5863) Performance regression in Spark SQL/Parquet due to ScalaReflection.convertRowToScala

2015-02-19 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-5863: --- Priority: Critical (was: Major) > Performance regression in Spark SQL/Parquet due

Re: [Performance] Possible regression in rdd.take()?

2015-02-18 Thread Patrick Wendell
I believe the heuristic governing the way that take() decides to fetch partitions changed between these versions. It could be that in certain cases the new heuristic is worse, but it might be good to just look at the source code and see, for your number of elements taken and number of partitions, i
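To see why such a heuristic can matter, here is a generic sketch of scanning partitions in growing batches until enough elements are found. It is not Spark's `RDD.take` implementation in any version: `TakeSketch`, the in-memory `partitions` model, and the growth factor of 4 are all assumptions made for illustration.

```scala
object TakeSketch {
  // Collect up to `num` elements, scanning partitions in batches whose
  // size is multiplied by `growth` after each round. Returns the
  // elements taken and the number of partitions scanned.
  def take[T](partitions: IndexedSeq[Seq[T]], num: Int, growth: Int = 4): (Seq[T], Int) = {
    val buf = scala.collection.mutable.ArrayBuffer.empty[T]
    var scanned = 0
    var batch = 1
    while (buf.size < num && scanned < partitions.length) {
      val upTo = math.min(scanned + batch, partitions.length)
      // Every partition in the batch counts as scanned, even if the
      // target is reached partway through the batch.
      for (p <- scanned until upTo) {
        buf ++= partitions(p).take(num - buf.size)
      }
      scanned = upTo
      batch *= growth
    }
    (buf.toList, scanned)
  }

  def main(args: Array[String]): Unit = {
    val parts = IndexedSeq(Seq.empty[Int], Seq(1), Seq(2, 3), Seq(4), Seq(5))
    val (taken, scanned) = take(parts, 2)
    println(s"took $taken after scanning $scanned partitions")
  }
}
```

In this toy model, an empty first partition pushes the scan into a second, larger batch, so the cost of a small `take` depends heavily on how elements are spread across partitions and on how fast the batch size grows.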

Re: [VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-18 Thread Patrick Wendell
> UISeleniumSuite: > *** RUN ABORTED *** > java.lang.NoClassDefFoundError: org/w3c/dom/ElementTraversal > ... This is a newer test suite. There is something flaky about it, we should definitely fix it, IMO it's not a blocker though. > > Patrick this link gives a 404: > https://people.apache.org

[jira] [Resolved] (SPARK-5856) In Maven build script, launch Zinc with more memory

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5856. Resolution: Fixed Fix Version/s: 1.3.0 > In Maven build script, launch Zinc w

[jira] [Resolved] (SPARK-5864) support .jar as python package

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5864. Resolution: Fixed Fix Version/s: 1.3.0 Assignee: Davies Liu > support .

[jira] [Resolved] (SPARK-5850) Remove experimental label for Scala 2.11 and FlumePollingStream

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5850. Resolution: Fixed Fix Version/s: 1.3.0 > Remove experimental label for Scala 2

[jira] [Commented] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325622#comment-14325622 ] Patrick Wendell commented on SPARK-4579: [~andrewor14] Can you take a loo

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Labels: (was: starter) > Scheduling Delay appears negat

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Assignee: Andrew Or > Scheduling Delay appears negat

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Labels: starter (was: ) > Scheduling Delay appears negat

[jira] [Updated] (SPARK-4579) Scheduling Delay appears negative

2015-02-18 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4579: --- Priority: Critical (was: Minor) > Scheduling Delay appears negat

Merging code into branch 1.3

2015-02-18 Thread Patrick Wendell
Hey Committers, Now that Spark 1.3 rc1 is cut, please restrict branch-1.3 merges to the following: 1. Fixes for issues blocking the 1.3 release (i.e. 1.2.X regressions) 2. Documentation and tests. 3. Fixes for non-blocker issues that are surgical, low-risk, and/or outside of the core. If there i

[VOTE] Release Apache Spark 1.3.0 (RC1)

2015-02-18 Thread Patrick Wendell
Please vote on releasing the following candidate as Apache Spark version 1.3.0! The tag to be voted on is v1.3.0-rc1 (commit f97b0d4a): https://git-wip-us.apache.org/repos/asf?p=spark.git;a=commit;h=f97b0d4a6b26504916816d7aefcf3132cd1da6c2 The release files, including signatures, digests, etc. ca

Re: Replacing Jetty with TomCat

2015-02-17 Thread Patrick Wendell
Hey Niranda, It seems to me a lot of effort to support multiple libraries inside of Spark like this, so I'm not sure that's a great solution. If you are building an application that embeds Spark, is it not possible for you to continue to use Jetty for Spark's internal servers and use Tomcat for y

[jira] [Updated] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4454: --- Labels: backport-needed (was: ) > Race condition in DAGSchedu

[jira] [Updated] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4454: --- Target Version/s: 1.3.0, 1.2.2 (was: 1.3.0) > Race condition in DAGSchedu

[jira] [Reopened] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-4454: Actually, re-opening this since we need to back port it. > Race condition in DAGSchedu

[jira] [Resolved] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4454. Resolution: Fixed Fix Version/s: 1.3.0 We can't be 100% sure this is fixed be

[jira] [Resolved] (SPARK-5811) Documentation for --packages and --repositories on Spark Shell

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-5811. Resolution: Fixed Assignee: Burak Yavuz > Documentation for --packages

[jira] [Commented] (SPARK-5864) support .jar as python package

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324787#comment-14324787 ] Patrick Wendell commented on SPARK-5864: I merged Davies' PR, but per Bur

[jira] [Updated] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4454: --- Priority: Critical (was: Minor) > Race condition in DAGSchedu

[jira] [Updated] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4454: --- Target Version/s: 1.3.0 > Race condition in DAGSchedu

[jira] [Commented] (SPARK-4454) Race condition in DAGScheduler

2015-02-17 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324747#comment-14324747 ] Patrick Wendell commented on SPARK-4454: [~srowen] yeah I meant the particula
