Matei Zaharia created SPARK-2793:
Summary: Correctly lock directory creation in
DiskBlockManager.getFile
Key: SPARK-2793
URL: https://issues.apache.org/jira/browse/SPARK-2793
Project: Spark
Matei Zaharia created SPARK-2792:
Summary: Fix reading too much or too little data from each stream
in ExternalMap / Sorter
Key: SPARK-2792
URL: https://issues.apache.org/jira/browse/SPARK-2792
Matei Zaharia created SPARK-2791:
Summary: Fix committing, reverting and state tracking in shuffle
file consolidation
Key: SPARK-2791
URL: https://issues.apache.org/jira/browse/SPARK-2791
Project
[
https://issues.apache.org/jira/browse/SPARK-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14082878#comment-14082878
]
Matei Zaharia commented on SPARK-2532:
--
I'm going to create a few sub-task
This should be okay, but make sure that your cluster also has the right code
deployed. Maybe you have the wrong one.
If you built Spark from source multiple times, you may also want to try sbt
clean before sbt assembly.
Matei
On August 1, 2014 at 12:00:07 PM, SK (skrishna...@gmail.com) wrote:
[
https://issues.apache.org/jira/browse/SPARK-2490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2490.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> StackOverflowError when RDD dependencies
[
https://issues.apache.org/jira/browse/SPARK-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-695.
-
Resolution: Fixed
Fix Version/s: 1.1.0
> Exponential recursion in getPreferredLocati
[
https://issues.apache.org/jira/browse/SPARK-695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-695:
Assignee: Aaron Staple
> Exponential recursion in getPreferredLocati
[
https://issues.apache.org/jira/browse/SPARK-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2134.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Report metrics before application finis
Matei Zaharia created SPARK-2787:
Summary: Make sort-based shuffle write files directly when there
is no sorting / aggregation and # of partitions is small
Key: SPARK-2787
URL: https://issues.apache.org/jira
[
https://issues.apache.org/jira/browse/SPARK-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-983.
-
Resolution: Fixed
Fix Version/s: 1.1.0
> Support external sorting for RDD#sortBy
[
https://issues.apache.org/jira/browse/SPARK-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2670:
-
Priority: Major (was: Critical)
> FetchFailedException should be thrown when local fetch
[
https://issues.apache.org/jira/browse/SPARK-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2670.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> FetchFailedException should be thrown w
[
https://issues.apache.org/jira/browse/SPARK-2670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2670:
-
Assignee: Kousuke Saruta
> FetchFailedException should be thrown when local fetch has fai
[
https://issues.apache.org/jira/browse/SPARK-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2711:
-
Target Version/s: 1.1.0
> Create a ShuffleMemoryManager that allocates across spill
[
https://issues.apache.org/jira/browse/SPARK-2028?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2028.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Let users of HadoopRDD access the partit
[
https://issues.apache.org/jira/browse/SPARK-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14081154#comment-14081154
]
Matei Zaharia commented on SPARK-2762:
--
PR: https://github.com/apache/spark/
[
https://issues.apache.org/jira/browse/SPARK-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2762.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> SparkILoop leaks memory in multi-r
[
https://issues.apache.org/jira/browse/SPARK-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2762:
-
Assignee: Timothy Hunter
> SparkILoop leaks memory in multi-repl configurati
[
https://issues.apache.org/jira/browse/SPARK-983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia reassigned SPARK-983:
---
Assignee: Matei Zaharia
> Support external sorting for RDD#sortBy
[
https://issues.apache.org/jira/browse/SPARK-983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080385#comment-14080385
]
Matei Zaharia commented on SPARK-983:
-
Now that an ExternalSorter class from S
[
https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080105#comment-14080105
]
Matei Zaharia commented on SPARK-2447:
--
Hey Ted, thanks for putting this toge
[
https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2447:
-
Target Version/s: 1.2.0 (was: 1.1.0)
> Add common solution for sending upsert actions to HB
[
https://issues.apache.org/jira/browse/SPARK-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2711:
-
Priority: Critical (was: Major)
> Create a ShuffleMemoryManager that allocates across spill
Java is very close to Scala across the board; the only thing missing in it
right now is GraphX (which is still alpha). Python is missing GraphX, streaming
and a few of the ML algorithms, though most of them are there. So it should be
fine to start with any of them. See
http://spark.apache.org/
[
https://issues.apache.org/jira/browse/SPARK-2305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2305.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> pyspark - depend on py4j >
I agree as well. FWIW sometimes I've seen this happen due to language barriers,
i.e. contributors whose primary language is not English, but we need more
motivation for each change.
On July 29, 2014 at 5:12:01 PM, Nicholas Chammas (nicholas.cham...@gmail.com)
wrote:
+1 on using JIRA workflows
Hi Martin,
Job ads are actually not allowed on the list, but thanks for asking. Just
posting this for others' future reference.
Matei
On July 29, 2014 at 8:34:59 AM, Martin Goodson (mar...@skimlinks.com) wrote:
I'm not sure if job adverts are allowed on here - please let me know if not.
Othe
Is data being cached? It might be that those two nodes started first and did
the first pass of the data, so it's all on them. It's kind of ugly but you can
add a Thread.sleep when your program starts to wait for nodes to come up.
Also, have you checked the application web UI at http://:4040 while
[
https://issues.apache.org/jira/browse/SPARK-1981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14077473#comment-14077473
]
Matei Zaharia commented on SPARK-1981:
--
The EC2 scripts actually fetch a pac
[
https://issues.apache.org/jira/browse/SPARK-2134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2134:
-
Assignee: Rahul Singhal
> Report metrics before application finis
Matei Zaharia created SPARK-2725:
Summary: Add instructions about how to build with Hive to
building-with-maven.md
Key: SPARK-2725
URL: https://issues.apache.org/jira/browse/SPARK-2725
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-1550.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Successive creation of spark context fails
[
https://issues.apache.org/jira/browse/SPARK-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2711:
-
Description: Right now if there are two ExternalAppendOnlyMaps, they don't
compete correctl
Matei Zaharia created SPARK-2711:
Summary: Create a ShuffleMemoryManager that allocates across
spilling collections in the same task
Key: SPARK-2711
URL: https://issues.apache.org/jira/browse/SPARK-2711
"A" or "A+X" or somesuch, but
> testing for "A" will give an incorrect answer, and the code can't be
> expected to look for everyone's "A+X" versions. Actually inspecting
> the code is more robust if a bit messier.
>
> On Sun, Jul 27,
+1
Tested this on Mac OS X.
Matei
On Jul 25, 2014, at 4:08 PM, Tathagata Das wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 1.0.2.
>
> This release fixes a number of bugs in Spark 1.0.1.
> Some of the notable ones are
> - SPARK-2452: Known issue is Spark
[
https://issues.apache.org/jira/browse/SPARK-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-1777.
--
Resolution: Fixed
> Pass "cached" blocks directly to disk if memory is not
[
https://issues.apache.org/jira/browse/SPARK-1777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-1777:
-
Priority: Critical (was: Major)
> Pass "cached" blocks directly to disk if memory
For this particular issue, it would be good to know if Hadoop provides an API
to determine the Hadoop version. If not, maybe that can be added to Hadoop in
its next release, and we can check for it with reflection. We recently added a
SparkContext.version() method in Spark to let you tell the ve
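
The reflection-based check suggested above can be sketched in plain Python. This is only an analogy, not Hadoop or Spark code: the classes NewLibrary and OldLibrary and the helper detect_version are hypothetical names made up for illustration, and getattr stands in for Java reflection.

```python
def detect_version(obj, method_name="version"):
    """Return obj.<method_name>() if such a callable exists, else None."""
    method = getattr(obj, method_name, None)
    if callable(method):
        return method()
    return None


class NewLibrary:
    """Hypothetical library release that exposes a version() method."""

    def version(self):
        return "2.4.0"


class OldLibrary:
    """Hypothetical older release with no version() method."""


new_result = detect_version(NewLibrary())
old_result = detect_version(OldLibrary())
print(new_result, old_result)
```

The caller probes for the method instead of matching on a version string, so an older release without the method yields None and can be handled by a fallback.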
primitives. We don't
do a lot of this yet, but there was a project in the AMPLab to do more of it.
Multiple models can also be trained simultaneously with this approach.
On July 26, 2014 at 11:21:17 PM, Matei Zaharia (matei.zaha...@gmail.com) wrote:
These numbers are from GPUs and Intel M
These numbers are from GPUs and Intel MKL (a closed-source math library for
Intel processors), where for CPU-bound algorithms you are going to get faster
speeds than MLlib's JBLAS. However, there's in theory nothing preventing the
use of these in MLlib (e.g. if you have a faster BLAS locally; ad
[
https://issues.apache.org/jira/browse/SPARK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia reassigned SPARK-2680:
Assignee: Matei Zaharia
> Lower spark.shuffle.memoryFraction to 0.2 by defa
[
https://issues.apache.org/jira/browse/SPARK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2680.
--
Resolution: Fixed
> Lower spark.shuffle.memoryFraction to 0.2 by defa
[
https://issues.apache.org/jira/browse/SPARK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia reassigned SPARK-2684:
Assignee: Matei Zaharia
> Update ExternalAppendOnlyMap to take an iterator as in
Even in local mode, Spark serializes data that would be sent across the
network, e.g. in a reduce operation, so that you can catch errors that would
happen in distributed mode. You can make serialization much faster by using the
Kryo serializer; see http://spark.apache.org/docs/latest/tuning.htm
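
As a rough plain-Python analogy of why serializing in local mode is useful (this uses the standard pickle module, not Spark's serializers, and the roundtrip helper is a hypothetical name), a local round trip surfaces records that could never be shipped to another machine:

```python
import pickle
import threading


def roundtrip(record):
    """Serialize and deserialize a record, standing in for a local shuffle."""
    return pickle.loads(pickle.dumps(record))


# A plain record survives the round trip unchanged.
ok = roundtrip({"word": "spark", "count": 3})

# A record holding a lock cannot be serialized; the error shows up locally
# instead of only after deploying to a cluster.
try:
    roundtrip({"lock": threading.Lock()})
    failed = False
except TypeError:
    failed = True

print(ok, failed)
```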
[
https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2601.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> py4j.Py4JException on sc.pickleF
[
https://issues.apache.org/jira/browse/SPARK-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2704.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> ConnectionManager threads should be named
These messages are actually not about spilling the RDD, they're about spilling
intermediate state in a reduceByKey, groupBy or other operation whose state
doesn't fit in memory. We have to do that in these cases to avoid going out of
memory. You can minimize spilling by having more reduce tasks
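
A minimal sketch of why more reduce tasks helps, assuming simple hash partitioning over hypothetical integer keys (plain Python, not Spark code; partition_sizes is an illustrative helper): the more partitions there are, the fewer distinct keys each task has to hold in memory at once.

```python
from collections import defaultdict


def partition_sizes(keys, num_partitions):
    """Count the distinct keys that land in each reduce partition."""
    buckets = defaultdict(set)
    for key in keys:
        buckets[hash(key) % num_partitions].add(key)
    return [len(buckets[i]) for i in range(num_partitions)]


keys = range(1000)  # hypothetical key space
few = partition_sizes(keys, 4)
many = partition_sizes(keys, 40)
print(max(few), max(many))  # largest per-task key count in each setup
```

In PySpark the number of reduce tasks is usually set through the optional numPartitions argument of operations such as reduceByKey.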
[
https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2279:
-
Priority: Minor (was: Major)
> JavaSparkContext should allow creation of Empty
[
https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2279:
-
Assignee: Bob Paulin
> JavaSparkContext should allow creation of Empty
[
https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2279.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> JavaSparkContext should allow creation
[
https://issues.apache.org/jira/browse/SPARK-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2652.
--
Resolution: Fixed
> Turning default configurations for PySp
[
https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2696.
--
Resolution: Fixed
Fix Version/s: 1.0.3
Target Version/s: 1.0.3 (was: 1.0.0
[
https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2696:
-
Assignee: Hossein Falaki
> Reduce default spark.serializer.objectStreamRe
[
https://issues.apache.org/jira/browse/SPARK-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-1458.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Expose sc.version in PySp
[
https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2567.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Resubmitted stage sometimes remains as act
[
https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2567:
-
Assignee: Kay Ousterhout
> Resubmitted stage sometimes remains as active stage in the web
[
https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075035#comment-14075035
]
Matei Zaharia commented on SPARK-2567:
--
I've merged this into 1.1 because
[
https://issues.apache.org/jira/browse/SPARK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-1726.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Tasks that fail to serialize remain in act
[
https://issues.apache.org/jira/browse/SPARK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2125.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Add sorting flag to ShuffleManager,
[
https://issues.apache.org/jira/browse/SPARK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2682.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Javadoc generated from Scala source code
[
https://issues.apache.org/jira/browse/SPARK-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2683.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> unidoc failed beca
[
https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074673#comment-14074673
]
Matei Zaharia commented on SPARK-2620:
--
The problem is that case class is comp
[
https://issues.apache.org/jira/browse/SPARK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2689.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Remove use of println in ActorHel
Matei Zaharia created SPARK-2689:
Summary: Remove use of println in ActorHelper
Key: SPARK-2689
URL: https://issues.apache.org/jira/browse/SPARK-2689
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074664#comment-14074664
]
Matei Zaharia commented on SPARK-2689:
--
Pull request: https://github.com/ap
[
https://issues.apache.org/jira/browse/SPARK-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2657.
--
Resolution: Fixed
> Use more compact data structures than ArrayBuffer in groupBy and cogr
[
https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2574.
--
Resolution: Fixed
> Avoid allocating new ArrayBuffer in groupByKey's merge
[
https://issues.apache.org/jira/browse/SPARK-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-993:
Summary: Don't reuse Writable objects in HadoopRDDs by default (was: Don't
reuse Writab
Matei Zaharia created SPARK-2685:
Summary: Update ExternalAppendOnlyMap to avoid buffer.remove()
Key: SPARK-2685
URL: https://issues.apache.org/jira/browse/SPARK-2685
Project: Spark
Issue
Matei Zaharia created SPARK-2684:
Summary: Update ExternalAppendOnlyMap to take an iterator as input
Key: SPARK-2684
URL: https://issues.apache.org/jira/browse/SPARK-2684
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2538.
--
Resolution: Fixed
Fix Version/s: (was: 1.0.1)
(was: 1.0.0
The Pair ones return a JavaPairRDD, which has additional operations on
key-value pairs. Take a look at
http://spark.apache.org/docs/latest/programming-guide.html#working-with-key-value-pairs
for details.
Matei
On Jul 24, 2014, at 3:41 PM, abhiguruvayya wrote:
> Can any one help me understand
Matei Zaharia created SPARK-2680:
Summary: Lower spark.shuffle.memoryFraction to 0.2 by default
Key: SPARK-2680
URL: https://issues.apache.org/jira/browse/SPARK-2680
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-2014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2014.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Make PySpark store RDDs in MEMORY_ONLY_
[
https://issues.apache.org/jira/browse/SPARK-2538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2538:
-
Priority: Critical (was: Major)
> External aggregation in Pyt
This is being tracked here: https://issues.apache.org/jira/browse/SPARK-1812,
since it will also be needed for cross-building with Scala 2.11. Maybe we can
do it before that. Probably too late for 1.1, but you should open an issue for
1.2.
In that JIRA I linked, there's a pull request from a mo
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2661.
--
Resolution: Fixed
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2661:
-
Fix Version/s: 1.1.0
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2661:
-
Assignee: Adrian Wang
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2661:
-
Affects Version/s: (was: 1.0.1)
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2661:
-
Affects Version/s: (was: 1.0.0)
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2662.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> Fix NPE for JsonProto
[
https://issues.apache.org/jira/browse/SPARK-2661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2661:
-
Affects Version/s: 1.0.0
> Unpersist last RDD in bagel iterat
[
https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2574:
-
Priority: Trivial (was: Major)
> Avoid allocating new ArrayBuffer in groupByKey's merge
[
https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14072605#comment-14072605
]
Matei Zaharia commented on SPARK-2574:
--
I implemented this as part of h
[
https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia reassigned SPARK-2574:
Assignee: Matei Zaharia
> Avoid allocating new ArrayBuffer in groupByKey's merge
Matei Zaharia created SPARK-2657:
Summary: Use more compact data structures than ArrayBuffer in
groupBy and cogroup
Key: SPARK-2657
URL: https://issues.apache.org/jira/browse/SPARK-2657
Project
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2277.
--
Resolution: Fixed
> Make TaskScheduler track whether there's host o
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2277:
-
Fix Version/s: 1.1.0
> Make TaskScheduler track whether there's host o
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2277:
-
Assignee: Rui Li
> Make TaskScheduler track whether there's host o
[
https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2640:
-
Assignee: woshilaiceshide
> In "local[N]", free cores of the only executor should
[
https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2640:
-
Priority: Minor (was: Major)
> In "local[N]", free cores of the only executor shou
[
https://issues.apache.org/jira/browse/SPARK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2640.
--
Resolution: Fixed
Fix Version/s: 1.1.0
> In "local[N]", free cores of the
[
https://issues.apache.org/jira/browse/SPARK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2609:
-
Assignee: Andrew Or
> Log thread ID when spilling ExternalAppendOnly
[
https://issues.apache.org/jira/browse/SPARK-2609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia resolved SPARK-2609.
--
Resolution: Fixed
> Log thread ID when spilling ExternalAppendOnly
Is the first() being computed locally on the driver program? Maybe it's too hard
to compute with the memory, etc. available there. Take a look at the driver's
log and see whether it has the message "Computing the requested partition
locally".
Matei
On Jul 22, 2014, at 12:04 PM, Nathan Kronenfel
[
https://issues.apache.org/jira/browse/SPARK-2047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matei Zaharia updated SPARK-2047:
-
Assignee: Aaron Davidson
> Use less memory in AppendOnlyMap.destructiveSortedItera