[jira] [Reopened] (SPARK-4183) Enable Netty-based BlockTransferService by default

2014-11-01 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-4183: Reverted this due to a test issue - seems like some state is not getting cleaned up. Enable

[jira] [Resolved] (SPARK-4121) Master build failures after shading commons-math3

2014-11-01 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4121. Resolution: Fixed Okay I merged this. Let's see how it goes. Master build failures after

Changes to Spark's networking subsystem

2014-11-01 Thread Patrick Wendell
== Short version == A recent commit replaces Spark's networking subsystem with one based on Netty rather than raw sockets. Users running off of master can disable this change by setting spark.shuffle.blockTransferService=nio. We will be testing with this during the QA period for Spark 1.2. The new
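As a rough illustration of the override mentioned above, the sketch below assembles a spark-submit command line that passes spark.shuffle.blockTransferService=nio through the --conf flag. The property name and value come from the message; the application file name my_app.py and the helper spark_submit_command are hypothetical, and this is only one way the setting might be supplied.

```python
import shlex


def spark_submit_command(app, conf):
    """Build a spark-submit argument list with one --conf flag per setting.

    `app` is a placeholder application file; `conf` maps property names
    to values. Options are placed before the application, as spark-submit expects.
    """
    args = ["spark-submit"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    args.append(app)
    return args


# Fall back to the older NIO-based transfer service (setting taken from the message).
cmd = spark_submit_command(
    "my_app.py",  # hypothetical application file
    {"spark.shuffle.blockTransferService": "nio"},
)
print(shlex.join(cmd))
```

The same key/value pair could presumably also be set in code on a SparkConf or in a defaults file; the command-line form is shown only because it needs no changes to the application.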

[jira] [Commented] (SPARK-4079) Snappy bundled with Spark does not work on older Linux distributions

2014-10-31 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191457#comment-14191457 ] Patrick Wendell commented on SPARK-4079: Yeah that sounds like a good call. Did

Re: Surprising Spark SQL benchmark

2014-10-31 Thread Patrick Wendell
Hey Nick, Unfortunately Citus Data didn't contact any of the Spark or Spark SQL developers when running this. It is really easy to make one system look better than others when you are running a benchmark yourself because tuning and sizing can lead to a 10X performance improvement. This benchmark

Re: How to run tests properly?

2014-10-30 Thread Patrick Wendell
packaging would only build the jar and place it in the target folder. How does that affect the tests? If tests depend on the assembly a mvn install would be more sensible to me. Probably I misunderstand the maven build life-cycle. Thanks, Niklas On 29.10.2014 19:01, Patrick Wendell wrote: One

Re: How to run tests properly?

2014-10-29 Thread Patrick Wendell
One thing is you need to do a maven package before you run tests. The local-cluster tests depend on Spark already being packaged. - Patrick On Wed, Oct 29, 2014 at 10:02 AM, Niklas Wilcke 1wil...@informatik.uni-hamburg.de wrote: Hi Sean, thanks for your reply. The tests still don't work. I
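The ordering described in this reply (package first, then test) can be sketched as a small Python wrapper around Maven. The -DskipTests flag during packaging is an assumption added here to avoid running the tests twice; the helper run_steps is hypothetical and not part of Spark's build tooling.

```python
import subprocess

# Package first so the local-cluster tests can find a built Spark,
# then run the tests. -DskipTests on the first step is an assumption.
STEPS = [
    ["mvn", "-DskipTests", "package"],
    ["mvn", "test"],
]


def run_steps(steps, runner=subprocess.check_call):
    """Run each command in order; check_call raises on the first failure."""
    for step in steps:
        runner(step)


# To execute the build from the Spark source root, call: run_steps(STEPS)
```

Passing a different runner (for example, a function that only records the commands) makes the sequence easy to inspect without invoking Maven.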

[jira] [Created] (SPARK-4114) Use stable Hive API (if one exists) for communication with Metastore

2014-10-28 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4114: -- Summary: Use stable Hive API (if one exists) for communication with Metastore Key: SPARK-4114 URL: https://issues.apache.org/jira/browse/SPARK-4114 Project

[jira] [Commented] (SPARK-4121) Master build failures after shading commons-math3

2014-10-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4121?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187453#comment-14187453 ] Patrick Wendell commented on SPARK-4121: [~srowen] - can you help

[jira] [Created] (SPARK-4123) Show new dependencies added in pull requests

2014-10-28 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4123: -- Summary: Show new dependencies added in pull requests Key: SPARK-4123 URL: https://issues.apache.org/jira/browse/SPARK-4123 Project: Spark Issue Type

[jira] [Updated] (SPARK-4123) Show new dependencies added in pull requests

2014-10-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4123: --- Description: We should inspect the classpath of Spark's assembly jar for every pull request

[jira] [Commented] (SPARK-4123) Show new dependencies added in pull requests

2014-10-28 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14187590#comment-14187590 ] Patrick Wendell commented on SPARK-4123: [~nchammas] - do you have any interest

[jira] [Created] (SPARK-4128) Create instructions on fully building Spark in Intellij

2014-10-28 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4128: -- Summary: Create instructions on fully building Spark in Intellij Key: SPARK-4128 URL: https://issues.apache.org/jira/browse/SPARK-4128 Project: Spark

Re: Support Hive 0.13 .1 in Spark SQL

2014-10-28 Thread Patrick Wendell
Hey Cheng, Right now we aren't using stable API's to communicate with the Hive Metastore. We didn't want to drop support for Hive 0.12 so right now we are using a shim layer to support compiling for 0.12 and 0.13. This is very costly to maintain. If Hive has a stable meta-data API for talking to

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Hey Stephen, In some cases in the maven build we now have pluggable source directories based on profiles using the maven build helper plug-in. This is necessary to support cross building against different Hive versions, and there will be additional instances of this due to supporting scala 2.11

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
. Anyone who knows how to handle this - a quick note here would be appreciated. 2014-10-28 20:20 GMT-07:00 Patrick Wendell pwend...@gmail.com: Hey Stephen, In some cases in the maven build we now have pluggable source directories based on profiles using the maven build helper plug

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Patrick Wendell pwend...@gmail.com: Hey Stephen, In some cases in the maven build we now have pluggable source directories based on profiles using the maven build helper plug-in. This is necessary to support cross building against different Hive versions, and there will be additional

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
to do this? On Tue, Oct 28, 2014 at 9:57 PM, Patrick Wendell pwend...@gmail.com wrote: I just started a totally fresh IntelliJ project importing from our root pom. I used all the default options and I added hadoop-2.4, hive, hive-0.13.1 profiles. I was able to run spark core tests from within

Re: HiveShim not found when building in Intellij

2014-10-28 Thread Patrick Wendell
Oops - I actually should have added v0.13.0 (i.e. to match whatever I did in the profile). On Tue, Oct 28, 2014 at 10:05 PM, Patrick Wendell pwend...@gmail.com wrote: Cheng - to make it recognize the new HiveShim for 0.12 I had to click on spark-hive under packages in the left pane, then go

Re: Support Hive 0.13 .1 in Spark SQL

2014-10-28 Thread Patrick Wendell
Hey Cheng, Right now we aren't using stable API's to communicate with the Hive Metastore. We didn't want to drop support for Hive 0.12 so right now we are using a shim layer to support compiling for 0.12 and 0.13. This is very costly to maintain. If Hive has a stable meta-data API for talking to

Re: Ending a job early

2014-10-28 Thread Patrick Wendell
Hey Jim, There are some experimental (unstable) API's that support running jobs which might short-circuit: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkContext.scala#L1126 This can be used for doing online aggregations like you are describing. And in one

[jira] [Commented] (SPARK-4049) Storage web UI fraction cached shows as 100%

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184889#comment-14184889 ] Patrick Wendell commented on SPARK-4049: This actually seems alright to me

[jira] [Resolved] (SPARK-4032) Deprecate YARN alpha support in Spark 1.2

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4032. Resolution: Fixed Fix Version/s: 1.2.0 Deprecate YARN alpha support in Spark 1.2

[jira] [Resolved] (SPARK-2621) Update task InputMetrics incrementally

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-2621. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Sandy Ryza https

[jira] [Updated] (SPARK-4092) Input metrics don't work for coalesce()'d RDD's

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4092: --- Assignee: Kostas Sakellis Input metrics don't work for coalesce()'d RDD's

[jira] [Updated] (SPARK-3655) Support sorting of values in addition to keys (i.e. secondary sort)

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3655: --- Assignee: Koert Kuipers Support sorting of values in addition to keys (i.e. secondary sort

[jira] [Updated] (SPARK-4106) Shuffle write and spill to disk metrics are incorrect

2014-10-27 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4106: --- Summary: Shuffle write and spill to disk metrics are incorrect (was: Shuffle write and spill

[jira] [Updated] (SPARK-4064) NioBlockTransferService should deal with empty messages correctly

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4064: --- Summary: NioBlockTransferService should deal with empty messages correctly (was: If we

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184565#comment-14184565 ] Patrick Wendell commented on SPARK-3655: Okay, sounds good. Secondary sort

[jira] [Updated] (SPARK-4056) Upgrade snappy-java to 1.1.1.5

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4056: --- Component/s: Spark Core Upgrade snappy-java to 1.1.1.5

[jira] [Updated] (SPARK-4085) Job will fail if a shuffle file that's read locally gets deleted

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4085: --- Component/s: Spark Core Job will fail if a shuffle file that's read locally gets deleted

[jira] [Updated] (SPARK-2760) Caching tables from multiple databases does not work

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2760: --- Component/s: SQL Caching tables from multiple databases does not work

[jira] [Updated] (SPARK-3917) Compress data before network transfer

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3917: --- Priority: Major (was: Critical) Compress data before network transfer

[jira] [Commented] (SPARK-2532) Fix issues with consolidated shuffle

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184576#comment-14184576 ] Patrick Wendell commented on SPARK-2532: Hey [~matei] - you created some sub-tasks

[jira] [Comment Edited] (SPARK-3962) Mark spark dependency as provided in external libraries

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184580#comment-14184580 ] Patrick Wendell edited comment on SPARK-3962 at 10/26/14 6:11 PM

[jira] [Updated] (SPARK-3962) Mark spark dependency as provided in external libraries

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3962: --- Assignee: Prashant Sharma Mark spark dependency as provided in external libraries

[jira] [Commented] (SPARK-3962) Mark spark dependency as provided in external libraries

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184580#comment-14184580 ] Patrick Wendell commented on SPARK-3962: [~prashant_] can you take a crack

[jira] [Created] (SPARK-4092) Input metrics don't work for coalesce()'d RDD's

2014-10-26 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4092: -- Summary: Input metrics don't work for coalesce()'d RDD's Key: SPARK-4092 URL: https://issues.apache.org/jira/browse/SPARK-4092 Project: Spark Issue Type

[jira] [Commented] (SPARK-3266) JavaDoubleRDD doesn't contain max()

2014-10-26 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14184704#comment-14184704 ] Patrick Wendell commented on SPARK-3266: I think it sort of depends how many

[jira] [Resolved] (SPARK-3812) Adapt maven build to publish effective pom.

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3812. Resolution: Fixed Fix Version/s: 1.2.0 Okay, let's try again. Adapt maven build

[jira] [Commented] (SPARK-3561) Allow for pluggable execution contexts in Spark

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182511#comment-14182511 ] Patrick Wendell commented on SPARK-3561: Hey [~ozhurakousky] - adding

[jira] [Commented] (SPARK-3561) Allow for pluggable execution contexts in Spark

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182520#comment-14182520 ] Patrick Wendell commented on SPARK-3561: One other thing - if projects really do

[jira] [Commented] (SPARK-4066) Make whether maven builds fails on scalastyle violation configurable

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183235#comment-14183235 ] Patrick Wendell commented on SPARK-4066: [~srowen] I don't see a good argument

[jira] [Commented] (SPARK-4079) Snappy bundled with Spark does not work on older Linux distributions

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183317#comment-14183317 ] Patrick Wendell commented on SPARK-4079: What about just catching the exception

[jira] [Commented] (SPARK-4066) Make whether maven builds fails on scalastyle violation configurable

2014-10-24 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14183399#comment-14183399 ] Patrick Wendell commented on SPARK-4066: That was actually my thought originally

Re: your weekly git timeout update! TL;DR: i'm now almost certain we're not hitting rate limits.

2014-10-24 Thread Patrick Wendell
Thanks for the update Shane. As a point of process, for things like this where we're debugging specific issues - can we use JIRA instead of notifying everyone on the spark-dev list? I'd prefer if ops/infra announcements on the dev list are restricted to things that are widely applicable to

Re: Moving PR Builder to mvn

2014-10-24 Thread Patrick Wendell
compilation to be much better than before I had it running. Is the sbt build still faster (sorry, long time since I did a build with sbt). Thanks, Hari On Fri, Oct 24, 2014 at 1:46 PM, Patrick Wendell pwend...@gmail.com wrote: Overall I think this would be a good idea. The main blocker

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181066#comment-14181066 ] Patrick Wendell commented on SPARK-3655: Hey [~koertkuipers] - i'm not an expert

[jira] [Updated] (SPARK-4020) Failed executor not properly removed if it has not run tasks

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4020: --- Component/s: Spark Core Failed executor not properly removed if it has not run tasks

[jira] [Commented] (SPARK-3655) Secondary sort

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181792#comment-14181792 ] Patrick Wendell commented on SPARK-3655: Yeah so to be clear here is what I meant

[jira] [Commented] (SPARK-1239) Don't fetch all map output statuses at each reducer during shuffles

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182122#comment-14182122 ] Patrick Wendell commented on SPARK-1239: Hey Kostas - there are a few other bugs

[jira] [Reopened] (SPARK-3812) Adapt maven build to publish effective pom.

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-3812: It appeared that this was creating an issue with the maven tests. I am reverting this to see

[jira] [Commented] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14182175#comment-14182175 ] Patrick Wendell commented on SPARK-4030: I'm fine to open it up. I do think

[jira] [Resolved] (SPARK-4019) Shuffling with more than 2000 reducers may drop all data when partitions are mostly empty or cause deserialization errors if at least one partition is empty

2014-10-23 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-4019. Resolution: Fixed Fix Version/s: 1.2.0 Fixed by Josh's patch: https://github.com

[jira] [Created] (SPARK-4073) Parquet+Snappy can cause significant off-heap memory usage

2014-10-23 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4073: -- Summary: Parquet+Snappy can cause significant off-heap memory usage Key: SPARK-4073 URL: https://issues.apache.org/jira/browse/SPARK-4073 Project: Spark

Re: scalastyle annoys me a little bit

2014-10-23 Thread Patrick Wendell
Hey Koert, I think disabling the style checks in maven package could be a good idea for the reason you point out. I was sort of mixed on that when it was proposed for this exact reason. It's just annoying to developers. In terms of changing the global limit, this is more religion than anything

Spark 1.2 feature freeze on November 1

2014-10-23 Thread Patrick Wendell
Hey All, Just a reminder that as planned [1] we'll go into a feature freeze on November 1. On that date I'll cut a 1.2 release branch and make the up-or-down call on any patches that go into that branch, along with individual committers. It is common for us to receive a very large volume of

Re: scalac crash when compiling DataTypeConversions.scala

2014-10-23 Thread Patrick Wendell
Hey Ryan, I've found that filing issues with the Scala/Typesafe JIRA is pretty helpful if the issue can be fully reproduced, and even sometimes helpful if it can't. You can file bugs here: https://issues.scala-lang.org/secure/Dashboard.jspa The Spark SQL code in particular is typically the

Re: About Memory usage in the Spark UI

2014-10-23 Thread Patrick Wendell
It shows the amount of memory used to store RDD blocks, which are created when you run .cache()/.persist() on an RDD. On Wed, Oct 22, 2014 at 10:07 PM, Haopu Wang hw...@qilinsoft.com wrote: Hi, please take a look at the attached screen-shot. I wonder what the Memory Used column means. I

[jira] [Resolved] (SPARK-3812) Adapt maven build to publish effective pom.

2014-10-22 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-3812. Resolution: Fixed Assignee: Prashant Sharma Fixed by: https://github.com/apache/spark

Re: Which part of the code deals with communication?

2014-10-22 Thread Patrick Wendell
The best documentation about communication interfaces is the SecurityManager doc written by Tom Graves. With this as a starting point I'd recommend digging through the code for each component.

[jira] [Updated] (SPARK-4021) Issues observed after upgrading Jenkins to JDK7u71

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4021: --- Assignee: shane knapp Issues observed after upgrading Jenkins to JDK7u71

[jira] [Resolved] (SPARK-1042) spark cleans all java broadcast variables when it hits the spark.cleaner.ttl

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1042?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1042. Resolution: Fixed Fix Version/s: 0.9.2 I think this was fixed back in 0.9.2 spark

[jira] [Updated] (SPARK-4003) Add {Big Decimal, Timestamp, Date} types to Java SqlContext

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4003: --- Summary: Add {Big Decimal, Timestamp, Date} types to Java SqlContext (was: Add 3 types

[jira] [Updated] (SPARK-4006) Spark Driver crashes whenever an Executor is registered twice

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4006: --- Priority: Critical (was: Blocker) Target Version/s: 1.2.0 Spark Driver crashes

[jira] [Updated] (SPARK-4006) Spark Driver crashes whenever an Executor is registered twice

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4006: --- Priority: Blocker (was: Critical) Spark Driver crashes whenever an Executor is registered

[jira] [Updated] (SPARK-3948) Sort-based shuffle can lead to assorted stream-corruption exceptions

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3948?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3948: --- Priority: Blocker (was: Major) Sort-based shuffle can lead to assorted stream-corruption

[jira] [Commented] (SPARK-4014) TaskContext.attemptId returns taskId

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178053#comment-14178053 ] Patrick Wendell commented on SPARK-4014: [~joshrosen] what do you think about

[jira] [Commented] (SPARK-4019) Repartitioning with more than 2000 partitions may drop all data when partitions are mostly empty.

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178056#comment-14178056 ] Patrick Wendell commented on SPARK-4019: Great work getting to the root cause

[jira] [Commented] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178068#comment-14178068 ] Patrick Wendell commented on SPARK-4030: Hey Shivaram - IIRC we made this private

[jira] [Updated] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4030: --- Issue Type: Improvement (was: Bug) `destroy` method in Broadcast should be public

[jira] [Comment Edited] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178068#comment-14178068 ] Patrick Wendell edited comment on SPARK-4030 at 10/21/14 7:17 AM

[jira] [Comment Edited] (SPARK-4030) `destroy` method in Broadcast should be public

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178068#comment-14178068 ] Patrick Wendell edited comment on SPARK-4030 at 10/21/14 7:17 AM

[jira] [Updated] (SPARK-3945) Write properties of hive-site.xml to HiveContext when initilize session state In SparkSQLEnv.scala

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3945: --- Assignee: luogankun Write properties of hive-site.xml to HiveContext when initilize session

[jira] [Commented] (SPARK-3945) Write properties of hive-site.xml to HiveContext when initilize session state In SparkSQLEnv.scala

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3945?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178070#comment-14178070 ] Patrick Wendell commented on SPARK-3945: Hey @luogankun would you mind adding

[jira] [Updated] (SPARK-3940) SQL console prints error messages three times

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3940: --- Assignee: wangxj SQL console prints error messages three times

[jira] [Commented] (SPARK-3940) SQL console prints error messages three times

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178073#comment-14178073 ] Patrick Wendell commented on SPARK-3940: Hey [~wangxj8] can you add a first

[jira] [Updated] (SPARK-4032) Deprecate YARN support in Spark 1.2

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4032: --- Description: When someone builds for yarn alpha, we should just display a warning like {code

[jira] [Created] (SPARK-4032) Deprecate YARN support in Spark 1.2

2014-10-21 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4032: -- Summary: Deprecate YARN support in Spark 1.2 Key: SPARK-4032 URL: https://issues.apache.org/jira/browse/SPARK-4032 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-4032) Deprecate YARN support in Spark 1.2

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4032: --- Priority: Blocker (was: Major) Deprecate YARN support in Spark 1.2

[jira] [Resolved] (SPARK-547) Provide a means to package Spark's executor into a tgz

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-547. --- Resolution: Fixed This was fixed a long time ago. Provide a means to package Spark's

[jira] [Resolved] (SPARK-773) Add fair scheduler pool information UI similar with hadoop

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-773. --- Resolution: Fixed This has existed in Spark for a while - it is a stale issue. Add fair

[jira] [Resolved] (SPARK-890) Allow multiple parallel commands in spark-shell

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-890. --- Resolution: Fixed Spark context was made thread-safe a long time ago. Allow multiple

[jira] [Resolved] (SPARK-916) Better Support for Flat/Tabular RDD's

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-916. --- Resolution: Fixed Turned out [~marmbrus] did all of this and more in SparkSQL (which btw also

[jira] [Resolved] (SPARK-735) memory leak in KryoSerializer

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-735. --- Resolution: Fixed I think this was fixed a long time ago. memory leak in KryoSerializer

[jira] [Reopened] (SPARK-566) Replace polling+sleeping with semaphores in broadcast and shuffle

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell reopened SPARK-566: --- Replace polling+sleeping with semaphores in broadcast and shuffle

[jira] [Resolved] (SPARK-566) Replace polling+sleeping with semaphores in broadcast and shuffle

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-566. --- Resolution: Not a Problem Replace polling+sleeping with semaphores in broadcast and shuffle

[jira] [Resolved] (SPARK-566) Replace polling+sleeping with semaphores in broadcast and shuffle

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-566. --- Resolution: Fixed The shuffle and broadcast implementations have been re-written at least

[jira] [Resolved] (SPARK-909) add task serialization footprint (time and size) into TaskMetrics

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-909. --- Resolution: Fixed This has already been fixed. add task serialization footprint (time

[jira] [Updated] (SPARK-3466) Limit size of results that a driver collects for each action

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3466: --- Assignee: Davies Liu (was: Matthew Cheah) Limit size of results that a driver collects

[jira] [Commented] (SPARK-3466) Limit size of results that a driver collects for each action

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14178857#comment-14178857 ] Patrick Wendell commented on SPARK-3466: I spoke with Matt today and I'm re

[jira] [Updated] (SPARK-4040) calling count() on RDD's emitted from a DStream blocks forEachRDD progress.

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4040: --- Component/s: Streaming calling count() on RDD's emitted from a DStream blocks forEachRDD

[jira] [Resolved] (SPARK-1813) Add a utility to SparkConf that makes using Kryo really easy

2014-10-21 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell resolved SPARK-1813. Resolution: Fixed Fix Version/s: 1.2.0 Assignee: Sandy Ryza Fixed in https

Re: something wrong with Jenkins or something untested merged?

2014-10-21 Thread Patrick Wendell
at 5:16 PM, Patrick Wendell pwend...@gmail.com (mailto:pwend...@gmail.com) wrote: The failure is in the Kinesis component, can you reproduce this if you build with -Pkinesis-asl? - Patrick On Mon, Oct 20, 2014 at 5:08 PM, shane knapp skn...@berkeley.edu

[jira] [Created] (SPARK-4021) Kinesis code can cause compile failures with newer JVM's

2014-10-20 Thread Patrick Wendell (JIRA)
Patrick Wendell created SPARK-4021: -- Summary: Kinesis code can cause compile failures with newer JVM's Key: SPARK-4021 URL: https://issues.apache.org/jira/browse/SPARK-4021 Project: Spark

[jira] [Updated] (SPARK-4021) Kinesis code can cause compile failures with newer JDK's

2014-10-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4021: --- Summary: Kinesis code can cause compile failures with newer JDK's (was: Kinesis code can

[jira] [Commented] (SPARK-4021) Kinesis code can cause compile failures with newer JDK's

2014-10-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14177729#comment-14177729 ] Patrick Wendell commented on SPARK-4021: Hey [~srowen] - I am getting this report

[jira] [Updated] (SPARK-4021) Issues observed after upgrading Jenkins to JDK7u71

2014-10-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4021: --- Summary: Issues observed after upgrading Jenkins to JDK7u71 (was: Kinesis code can cause

[jira] [Comment Edited] (SPARK-4021) Kinesis code can cause compile failures with newer JDK's

2014-10-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14177729#comment-14177729 ] Patrick Wendell edited comment on SPARK-4021 at 10/21/14 12:47 AM

[jira] [Updated] (SPARK-4021) Issues observed after upgrading Jenkins to JDK7u71

2014-10-20 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-4021: --- Component/s: (was: Streaming) Project Infra Issues observed after
