[jira] [Commented] (SPARK-1001) Memory leak when reading sequence file and then sorting

2015-02-22 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14332444#comment-14332444 ] Nicholas Chammas commented on SPARK-1001: - It's probably tough given how long ago

Re: textFile() ordering and header rows

2015-02-22 Thread Nicholas Chammas
I guess on a technicality the docs just say first item in this RDD, not first line in the source text file. AFAIK there is no way apart from filtering to remove header lines http://stackoverflow.com/a/24734612/877069. As long as first() always returns the same value for a given RDD, I think it's

[jira] [Updated] (SPARK-1050) Investigate AnyRefMap

2015-02-22 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1050?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-1050: Target Version/s: 1.4.0 Since Spark can now be built with Scala 2.11, I believe this issue

Re: Improving metadata in Spark JIRA

2015-02-21 Thread Nicholas Chammas
for the cleanup! Nick On Sat Feb 07 2015 at 8:29:42 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: Oh derp, missed the YARN component. JIRA does allow admins to make fields mandatory: https://confluence.atlassian.com/display/JIRA/Specifying+Field

[jira] [Updated] (SPARK-5923) Very slow query when using Oracle hive metastore and table has lots of partitions

2015-02-21 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5923: Component/s: SQL Very slow query when using Oracle hive metastore and table has lots

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-20 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14329771#comment-14329771 ] Nicholas Chammas commented on SPARK-5629: - [~florianverhein] - Hmm... Thinking

[jira] [Commented] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325317#comment-14325317 ] Nicholas Chammas commented on SPARK-925: I would prefer a format that is more human

[jira] [Commented] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325334#comment-14325334 ] Nicholas Chammas commented on SPARK-925: Here's an example of what a spark-ec2

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325372#comment-14325372 ] Nicholas Chammas commented on SPARK-5629: - YAML is not part of the Python standard

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14325346#comment-14325346 ] Nicholas Chammas commented on SPARK-5629: - For example, you run: {code} $ spark

[jira] [Updated] (SPARK-5627) Enhance spark-ec2 for some programmatic use cases

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5627: Description: There are some cases where users may want to programmatically invoke {{spark

[jira] [Updated] (SPARK-5627) Enhance spark-ec2 to return machine-readable output

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5627: Summary: Enhance spark-ec2 to return machine-readable output (was: Enhance spark-ec2

[jira] [Updated] (SPARK-5627) Enhance spark-ec2 for some programmatic use cases

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5627: Description: There are some cases where users may want to programmatically invoke {{spark

[jira] [Updated] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5629: Description: You can launch multiple clusters using spark-ec2. At some point, you might

[jira] [Updated] (SPARK-5865) Add doc warnings for methods that return local data structures

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5865: Description: We should include a note in the doc string for any method that collects an RDD

[jira] [Updated] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5629: Description: You can launch multiple clusters using spark-ec2. At some point, you might

[jira] [Updated] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5629: Description: You can launch multiple clusters using spark-ec2. At some point, you might

[jira] [Comment Edited] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321810#comment-14321810 ] Nicholas Chammas edited comment on SPARK-925 at 2/17/15 6:53 PM

[jira] [Updated] (SPARK-5711) Sort Shuffle performance issues about using AppendOnlyMap for large data sets

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5711: Component/s: Shuffle Sort Shuffle performance issues about using AppendOnlyMap for large

[jira] [Updated] (SPARK-5851) spark_ec2.py ssh failure retry handling not always appropriate

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5851: Description: The following function doesn't distinguish between the ssh failing (e.g

[jira] [Updated] (SPARK-5851) spark_ec2.py ssh failure retry handling not always appropriate

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5851: Description: The following function doesn't distinguish between the ssh failing (e.g

[jira] [Resolved] (SPARK-5749) Fix Bash word splitting bugs in compute-classpath.sh

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas resolved SPARK-5749. - Resolution: Fixed Fixed by: https://github.com/apache/spark/pull/4561 [~andrewor14

[jira] [Updated] (SPARK-5851) spark_ec2.py ssh failure retry handling not always appropriate

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5851: Description: The following function doesn't distinguish between the ssh failing (e.g

[jira] [Updated] (SPARK-5865) Add doc warnings for methods that return local data structures

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5865: Summary: Add doc warnings for methods that return local data structures (was: Add doc

[jira] [Created] (SPARK-5865) Add doc warnings for methods that collect an RDD to the driver

2015-02-17 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5865: --- Summary: Add doc warnings for methods that collect an RDD to the driver Key: SPARK-5865 URL: https://issues.apache.org/jira/browse/SPARK-5865 Project: Spark

[jira] [Updated] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5629: Summary: Add spark-ec2 action to return info about an existing cluster (was: Add spark-ec2

[jira] [Updated] (SPARK-5708) Add Slf4jSink to Spark Metrics Sink

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5708: Component/s: Spark Core Add Slf4jSink to Spark Metrics Sink

[jira] [Commented] (SPARK-5851) spark_ec2.py ssh failure retry handling not always appropriate

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324694#comment-14324694 ] Nicholas Chammas commented on SPARK-5851: - Yeah, that's a good catch. Have you run

[jira] [Commented] (SPARK-5627) Enhance spark-ec2 for some programmatic use cases

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324716#comment-14324716 ] Nicholas Chammas commented on SPARK-5627: - I think a good way to offer this might

[jira] [Commented] (SPARK-5627) Enhance spark-ec2 for some programmatic use cases

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324717#comment-14324717 ] Nicholas Chammas commented on SPARK-5627: - cc [~joshrosen] / [~shivaram] Enhance

[jira] [Updated] (SPARK-5763) Sort-based Groupby and Join to resolve skewed data

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5763: Component/s: Spark Core Shuffle Sort-based Groupby and Join to resolve

[jira] [Updated] (SPARK-5628) Add option to return spark-ec2 version

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5628: Target Version/s: 1.2.2 Fix Version/s: (was: 1.2.2) Add option to return spark

[jira] [Commented] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324708#comment-14324708 ] Nicholas Chammas commented on SPARK-925: Formatting side comment: You can surround

[jira] [Updated] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5629: Description: You can launch multiple clusters using spark-ec2. At some point, you might

[jira] [Commented] (SPARK-5629) Add spark-ec2 action to return info about an existing cluster

2015-02-17 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324888#comment-14324888 ] Nicholas Chammas commented on SPARK-5629: - cc [~joshrosen] / [~shivaram] I see

[jira] [Commented] (SPARK-925) Allow ec2 scripts to load default options from a json file

2015-02-15 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14321810#comment-14321810 ] Nicholas Chammas commented on SPARK-925: Loading config from a file seems like

Re: SQLContext.applySchema strictness

2015-02-14 Thread Nicholas Chammas
Would it make sense to add an optional validate parameter to applySchema() which defaults to False, both to give users the option to check the schema immediately and to make the default behavior clearer? On Sat Feb 14 2015 at 9:18:59 AM Michael Armbrust mich...@databricks.com wrote: Doing

Re: Building Spark with Pants

2015-02-14 Thread Nicholas Chammas
FYI: Here is the matching discussion over on the Pants dev list. https://groups.google.com/forum/#!topic/pants-devel/rTaU-iIOIFE On Mon Feb 02 2015 at 4:50:33 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: To reiterate, I'm asking from

[jira] [Commented] (SPARK-3821) Develop an automated way of creating Spark images (AMI, Docker, and others)

2015-02-13 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14320905#comment-14320905 ] Nicholas Chammas commented on SPARK-3821: - If you want Java 8 alongside 7, you can

[jira] [Commented] (SPARK-5765) word split problem in run-example and compute-classpath

2015-02-12 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318869#comment-14318869 ] Nicholas Chammas commented on SPARK-5765: - FWIW [~srowen], the last time I had

numpy on PyPy - potential benefit to PySpark

2015-02-11 Thread Nicholas Chammas
Random question for the PySpark and Python experts/enthusiasts on here: How big of a deal would it be for PySpark and PySpark users if you could run numpy on PyPy? PySpark already supports running on PyPy https://github.com/apache/spark/pull/2144, but libraries like MLlib that use numpy are not

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
Found it: https://github.com/apache/spark/compare/v1.2.0...v1.2.1#diff-73058f8e51951ec0b4cb3d48ade91a1fR73 GRRR BASH WORD SPLITTING My path has a space in it... Nick On Wed Feb 11 2015 at 2:37:39 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: This is what I get: spark-1.2.1-bin
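The failure mode named here is easy to reproduce outside Spark. A minimal sketch, assuming a hypothetical helper `count_args` and a made-up path (neither comes from the Spark scripts): an unquoted expansion of a path containing a space is split into two words, while a quoted expansion stays whole.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from Spark): prints how many arguments it received.
count_args() {
  echo "$#"
}

# Made-up install path containing a space; illustrative only.
spark_home="/tmp/my apps/spark"

unquoted=$(count_args $spark_home)    # unquoted: split on the space into 2 words
quoted=$(count_args "$spark_home")    # quoted: passed through as 1 word

echo "unquoted: $unquoted, quoted: $quoted"
```

Any script that passes such a path unquoted to a test or a command sees two arguments instead of one, which may be enough to make a directory lookup fail.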

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
lol yeah, I changed the path for the email... turned out to be the issue itself. On Wed Feb 11 2015 at 2:43:09 PM Ted Yu yuzhih...@gmail.com wrote: I see. '/path/to/spark-1.2.1-bin-hadoop2.4' didn't contain space :-) On Wed, Feb 11, 2015 at 2:41 PM, Nicholas Chammas nicholas.cham

[jira] [Created] (SPARK-5747) Review all Bash scripts for word splitting bugs

2015-02-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5747: --- Summary: Review all Bash scripts for word splitting bugs Key: SPARK-5747 URL: https://issues.apache.org/jira/browse/SPARK-5747 Project: Spark Issue

[jira] [Created] (SPARK-5749) Fix Bash word splitting bugs in compute-classpath.sh

2015-02-11 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5749: --- Summary: Fix Bash word splitting bugs in compute-classpath.sh Key: SPARK-5749 URL: https://issues.apache.org/jira/browse/SPARK-5749 Project: Spark

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
The tragic thing here is that I was asked to review the patch that introduced this https://github.com/apache/spark/pull/3377#issuecomment-68077315, and totally missed it... :( On Wed Feb 11 2015 at 2:46:35 PM Nicholas Chammas nicholas.cham...@gmail.com wrote: lol yeah, I changed the path

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
got the following working (against a directory with space in its name):
#!/usr/bin/env bash
OLDIFS=$IFS # save it
IFS= # don't split on any white space
dir=$1/*
for f in $dir; do
cat $f
done
IFS=$OLDIFS # restore IFS
Cheers On Wed, Feb 11, 2015 at 2:47 PM, Nicholas Chammas
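An alternative to saving and restoring IFS is to quote the expansions and leave only the glob outside the quotes. The following is a sketch under that assumption; the function name `cat_dir` is hypothetical and does not appear in the thread or in Spark.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the thread): concatenates every regular
# file directly inside the directory given as "$1".
cat_dir() {
  local dir="$1"
  local f
  # Quote the directory so spaces survive; keep the * outside the quotes
  # so the shell still expands it as a glob.
  for f in "$dir"/*; do
    # Skip anything that is not a regular file, including the unexpanded
    # pattern itself when the directory is empty.
    [ -f "$f" ] || continue
    cat "$f"
  done
}
```

Calling `cat_dir "/some/dir with space"` then behaves the same whether or not the path contains whitespace, and IFS is left untouched for the rest of the script.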

[jira] [Updated] (SPARK-5747) Review all Bash scripts for word splitting bugs

2015-02-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5747: Description: Triggered by [this discussion|http://apache-spark-developers-list.1001551.n3

Re: 1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
-hadoop2.4.0.jar FYI On Wed, Feb 11, 2015 at 2:27 PM, Nicholas Chammas nicholas.cham...@gmail.com wrote: I just downloaded 1.2.1 pre-built for Hadoop 2.4+ and ran sbin/start-all.sh on my OS X. Failed to find Spark assembly in /path/to/spark-1.2.1-bin-hadoop2.4/lib You need to build Spark

1.2.1 start-all.sh broken?

2015-02-11 Thread Nicholas Chammas
I just downloaded 1.2.1 pre-built for Hadoop 2.4+ and ran sbin/start-all.sh on my OS X. Failed to find Spark assembly in /path/to/spark-1.2.1-bin-hadoop2.4/lib You need to build Spark before running this program. Did the same for 1.2.0 and it worked fine. Nick

[jira] [Updated] (SPARK-5749) Fix Bash word splitting bugs in compute-classpath.sh

2015-02-11 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5749: Issue Type: Bug (was: Sub-task) Parent: (was: SPARK-5747) Fix Bash word

[jira] [Commented] (SPARK-5685) Show warning when users open text files compressed with non-splittable algorithms like gzip

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311931#comment-14311931 ] Nicholas Chammas commented on SPARK-5685: - [~joshrosen] - What do you think

Re: How to create spark AMI in AWS

2015-02-09 Thread Nicholas Chammas
at 3:59 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Guodong, spark-ec2 does not currently support the cn-north-1 region, but you can follow [SPARK-4241](https://issues.apache.org/jira/browse/SPARK-4241) to find out when it does. The base AMI used to generate the current Spark AMIs

Re: How to create spark AMI in AWS

2015-02-09 Thread Nicholas Chammas
Guodong, spark-ec2 does not currently support the cn-north-1 region, but you can follow [SPARK-4241](https://issues.apache.org/jira/browse/SPARK-4241) to find out when it does. The base AMI used to generate the current Spark AMIs is very old. I'm not sure anyone knows what it is anymore. What I

[jira] [Created] (SPARK-5685) Show warning when users open text files compressed with non-splittable algorithms like gzip

2015-02-09 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-5685: --- Summary: Show warning when users open text files compressed with non-splittable algorithms like gzip Key: SPARK-5685 URL: https://issues.apache.org/jira/browse/SPARK-5685

Re: Keep or remove Debian packaging in Spark?

2015-02-09 Thread Nicholas Chammas
+1 to an official deprecation + redirecting users to some other project that will or already is taking this on. Nate? On Mon Feb 09 2015 at 10:08:27 AM Patrick Wendell pwend...@gmail.com wrote: I have wondered whether we should sort of deprecate it more officially, since otherwise I think

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313254#comment-14313254 ] Nicholas Chammas commented on SPARK-5676: - It ended up in Mesos because [Spark

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313271#comment-14313271 ] Nicholas Chammas commented on SPARK-5676: - Yeah, AFAIK it has nothing to do

[jira] [Commented] (SPARK-3044) Create RSS feed for Spark News

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312878#comment-14312878 ] Nicholas Chammas commented on SPARK-3044: - I use RSS plenty to track companies

[jira] [Commented] (SPARK-3044) Create RSS feed for Spark News

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3044?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14312854#comment-14312854 ] Nicholas Chammas commented on SPARK-3044: - [~rxin] / [~pwendell] Is there anyone

[jira] [Commented] (SPARK-5676) License missing from spark-ec2 repo

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313214#comment-14313214 ] Nicholas Chammas commented on SPARK-5676: - [~srowen] - I don't think it's

[jira] [Commented] (SPARK-1805) Error launching cluster when master and slave machines are of different virtualization types

2015-02-09 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14313079#comment-14313079 ] Nicholas Chammas commented on SPARK-1805: - I've created a PR to catch this error

Re: Improving metadata in Spark JIRA

2015-02-08 Thread Nicholas Chammas
: I think we already have a YARN component. https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20component%20%3D%20YARN I don't think JIRA allows it to be mandatory, but if it does, that would be useful. On Sat, Feb 7, 2015 at 5:08 PM, Nicholas Chammas nicholas.cham

Re: Improving metadata in Spark JIRA

2015-02-08 Thread Nicholas Chammas
at 11:53 AM, Nicholas Chammas nicholas.cham...@gmail.com wrote: Do we need some new components to be added to the JIRA project? Like: - scheduler - YARN - spark-submit - ...? Nick On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas nicholas.cham

Re: Using CUDA within Spark / boosting linear algebra

2015-02-08 Thread Nicholas Chammas
Lemme butt in randomly here and say there is an interesting discussion on this Spark PR https://github.com/apache/spark/pull/4448 about netlib-java, JBLAS, Breeze, and other things I know nothing of, that y'all may find interesting. Among the participants is the author of netlib-java. On Sun Feb

[jira] [Updated] (SPARK-1061) allow Hadoop RDDs to be read w/ a partitioner

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-1061: Component/s: Spark Core allow Hadoop RDDs to be read w/ a partitioner

[jira] [Updated] (SPARK-5664) Restore stty settings when exiting for launching spark-shell from SBT

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5664: Component/s: Build Restore stty settings when exiting for launching spark-shell from SBT

[jira] [Updated] (SPARK-5080) Expose more cluster resource information to user

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5080: Component/s: Spark Core Expose more cluster resource information to user

[jira] [Updated] (SPARK-5524) Remove messy dependencies to log4j

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5524: Component/s: Build Remove messy dependencies to log4j

[jira] [Updated] (SPARK-5156) Priority queue for cross application scheduling

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5156: Component/s: Scheduler Priority queue for cross application scheduling

[jira] [Commented] (SPARK-5524) Remove messy dependencies to log4j

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311122#comment-14311122 ] Nicholas Chammas commented on SPARK-5524: - Oh my bad. Thanks for the correction

[jira] [Commented] (SPARK-5668) spark_ec2.py region parameter could be either mandatory or its value displayed

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311054#comment-14311054 ] Nicholas Chammas commented on SPARK-5668: - This sounds good to me, Miguel. I've

[jira] [Updated] (SPARK-1142) Allow adding jars on app submission, outside of code

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-1142: Component/s: Spark Submit Allow adding jars on app submission, outside of code

[jira] [Updated] (SPARK-4383) Delay scheduling doesn't work right when jobs have tasks with different locality levels

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-4383: Component/s: Scheduler Delay scheduling doesn't work right when jobs have tasks

[jira] [Commented] (SPARK-5524) Remove messy dependencies to log4j

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311121#comment-14311121 ] Nicholas Chammas commented on SPARK-5524: - Oh my bad. Thanks for the correction

[jira] [Updated] (SPARK-4808) Spark fails to spill with small number of large objects

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-4808: Component/s: Spark Core Spark fails to spill with small number of large objects

[jira] [Commented] (SPARK-5363) Spark 1.2 freeze without error notification

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311050#comment-14311050 ] Nicholas Chammas commented on SPARK-5363: - [~TJKlein] - Can you provide more

[jira] [Commented] (SPARK-5628) Add option to return spark-ec2 version

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311055#comment-14311055 ] Nicholas Chammas commented on SPARK-5628: - We still need a backport to 1.2.2

[jira] [Commented] (SPARK-3431) Parallelize Scala/Java test execution

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311135#comment-14311135 ] Nicholas Chammas commented on SPARK-3431: - [~srowen] - Have you tried anything

[jira] [Updated] (SPARK-5668) spark_ec2.py region parameter could be either mandatory or its value displayed

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5668?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5668: Labels: starter (was: ) spark_ec2.py region parameter could be either mandatory or its

[jira] [Updated] (SPARK-5363) Spark 1.2 freeze without error notification

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5363: Component/s: PySpark Spark 1.2 freeze without error notification

[jira] [Updated] (SPARK-5175) bug in updating counters when starting multiple workers/supervisors in actor-based receiver

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5175: Component/s: (was: Spark Core) Streaming bug in updating counters

[jira] [Updated] (SPARK-5259) Fix endless retry stage by add task equal() and hashcode() to avoid stage.pendingTasks not empty while stage map output is available

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5259?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5259: Component/s: Spark Core Fix endless retry stage by add task equal() and hashcode

[jira] [Updated] (SPARK-5175) bug in updating counters when starting multiple workers/supervisors in actor-based receiver

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5175?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5175: Component/s: Spark Core bug in updating counters when starting multiple workers

[jira] [Issue Comment Deleted] (SPARK-5524) Remove messy dependencies to log4j

2015-02-07 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5524: Comment: was deleted (was: Oh my bad. Thanks for the correction.) Remove messy

[jira] [Updated] (SPARK-540) Add API to customize in-memory representation of RDDs

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-540: --- Component/s: Spark Core Add API to customize in-memory representation of RDDs

Re: Improving metadata in Spark JIRA

2015-02-06 Thread Nicholas Chammas
Do we need some new components to be added to the JIRA project? Like: - scheduler - YARN - spark-submit - …? Nick On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas nicholas.cham...@gmail.com wrote: +9000 on cleaning up JIRA. Thank you Sean for laying out some

[jira] [Updated] (SPARK-3956) Python API for Distributed Matrix

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3956: Component/s: PySpark Python API for Distributed Matrix

[jira] [Commented] (SPARK-1799) Add init script to the debian packaging

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309753#comment-14309753 ] Nicholas Chammas commented on SPARK-1799: - cc [~markhamstra], [~srowen

[jira] [Updated] (SPARK-560) Specialize RDDs / iterators

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-560: --- Component/s: Spark Core Specialize RDDs / iterators

[jira] [Updated] (SPARK-5628) Add option to return spark-ec2 version

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5628: Fix Version/s: 1.2.2 Add option to return spark-ec2 version

[jira] [Updated] (SPARK-5628) Add option to return spark-ec2 version

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5628?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-5628: Labels: backport-needed (was: ) Add option to return spark-ec2 version

[jira] [Updated] (SPARK-706) Failures in block manager put leads to task hanging

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-706: --- Component/s: Block Manager Failures in block manager put leads to task hanging

[jira] [Updated] (SPARK-3600) RDD[Double] doesn't use primitive arrays for caching

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3600?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-3600: Component/s: Spark Core RDD[Double] doesn't use primitive arrays for caching

[jira] [Updated] (SPARK-4024) Remember user preferences for metrics to show in the UI

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-4024: Component/s: Web UI Remember user preferences for metrics to show in the UI

[jira] [Updated] (SPARK-2654) Leveled logging in PySpark

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-2654: Component/s: PySpark Leveled logging in PySpark

[jira] [Updated] (SPARK-1346) Backport SPARK-1210 into 0.9 branch

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-1346: Labels: backport-needed (was: ) Backport SPARK-1210 into 0.9 branch

[jira] [Updated] (SPARK-2064) web ui should not remove executors if they are dead

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-2064: Component/s: Web UI web ui should not remove executors if they are dead

[jira] [Updated] (SPARK-1927) Implicits declared in companion objects not found in Spark shell

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nicholas Chammas updated SPARK-1927: Component/s: Spark Shell Implicits declared in companion objects not found in Spark shell

[jira] [Commented] (SPARK-1927) Implicits declared in companion objects not found in Spark shell

2015-02-06 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1927?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14309755#comment-14309755 ] Nicholas Chammas commented on SPARK-1927: - cc [~tobias.schlatter] Implicits
