Hi
I am currently implementing an algorithm involving matrix multiplication.
Basically, I have matrices represented as RDD[Array[Double]]. For example,
if I have A: RDD[Array[Double]] and B: RDD[Array[Double]], what would be
the most efficient way to get C = A * B?
Both A and B are large, so it
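A minimal sketch of one common way to do this with plain RDD operations, assuming both
matrices are first turned into coordinate entries (row, col, value); the names aCoord and
bCoord below are placeholders, not anything from MLlib:

import org.apache.spark.SparkContext._   // pair-RDD operations (join, reduceByKey) on older Spark

// aCoord: RDD[(Int, Int, Double)] with entries (i, j, A(i,j)) of an m x n matrix A
// bCoord: RDD[(Int, Int, Double)] with entries (j, k, B(j,k)) of an n x p matrix B
val aByJ = aCoord.map { case (i, j, v) => (j, (i, v)) }
val bByJ = bCoord.map { case (j, k, v) => (j, (k, v)) }

val cCoord = aByJ.join(bByJ)                                    // pair entries sharing index j
  .map { case (_, ((i, av), (k, bv))) => ((i, k), av * bv) }    // partial products
  .reduceByKey(_ + _)                                           // C(i,k) = sum over j of A(i,j) * B(j,k)

This shuffles every pairing of matching entries, so for large dense matrices a blocked
layout (multiplying sub-matrix tiles rather than single entries) is usually needed to keep
the shuffle manageable.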
On 18-May-2014 5:05 am, Mark Hamstra m...@clearstorydata.com wrote:
I don't understand. We never said that interfaces wouldn't change from
0.9 to 1.0.
Agreed.
What we are committing to is stability going forward from the
1.0.0 baseline. Nobody is disputing that backward-incompatible
So I think I need to clarify a few things here - particularly since
this mail went to the wrong mailing list and a much wider audience
than I intended it for :-)
Most of the issues I mentioned are internal implementation details of
Spark core, which means we can enhance them in the future without
I created a JIRA: https://issues.apache.org/jira/browse/SPARK-1870
DB, could you add more info to that JIRA? Thanks!
-Xiangrui
On Sun, May 18, 2014 at 9:46 AM, Xiangrui Meng men...@gmail.com wrote:
Btw, I tried
rdd.map { i =>
  System.getProperty("java.class.path")
}.collect()
but didn't
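One rough way to probe what the executors actually see, a sketch assuming the task's
context classloader is a URLClassLoader (often the case for Spark executors, but not
guaranteed):

rdd.map { _ =>
  Thread.currentThread.getContextClassLoader match {
    case u: java.net.URLClassLoader => u.getURLs.map(_.toString).mkString(", ")
    case other                      => "not a URLClassLoader: " + other.getClass.getName
  }
}.distinct().collect().foreach(println)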
Hi,
I'm curious whether it's a common approach to have discussions in JIRA rather than here.
I don't think it's the ASF way.
Regards,
Jacek Laskowski
http://blog.japila.pl
On 17 May 2014 at 23:55, Matei Zaharia matei.zaha...@gmail.com wrote:
We do actually have replicated StorageLevels in Spark. You can
The nice thing about putting discussion on the Jira is that everything
about the bug is in one place. So people looking to understand the
discussion a few years from now only have to look on the jira ticket rather
than also search the mailing list archives and hope commenters all put the
string
@db - it's possible that you aren't including the jar in the classpath
of your driver program (I think this is what mridul was suggesting).
It would be helpful to see the stack trace of the CNFE.
- Patrick
On Sun, May 18, 2014 at 11:54 AM, Patrick Wendell pwend...@gmail.com wrote:
@xiangrui - we don't expect these to be present on the system
classpath, because they get dynamically added by Spark (e.g. your
application can call sc.addJar well after the JVMs have started).
@db - I'm pretty surprised to see that behavior. It's definitely not
intended that users need
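For reference, a minimal sketch of the dynamic case described above; the app name and jar
path are made-up placeholders:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("addJar-example")    // placeholder app name
val sc   = new SparkContext(conf)

// Well after the executor JVMs have started, ship an extra jar to them.
// It is fetched by the executors and exposed through the task's context
// classloader rather than being prepended to the system classpath.
sc.addJar("/local/path/to/extra-classes.jar")              // placeholder path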
I took on the always-fun task of testing it on Windows, and unfortunately I found
some small problems with the prebuilt packages due to recent changes to the
launch scripts: bin/spark-class2.cmd looks in ./jars instead of ./lib for the
assembly JAR, and bin/run-example2.cmd doesn’t quite match
Hi Liquan,
There is some work being done on implementing linear algebra algorithms
on Spark for use in higher-level machine learning algorithms. That work is
happening in the MLlib project, which has a
org.apache.spark.mllib.linalg package you may find useful.
See
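For the case where one operand is small enough to hold on the driver, a sketch using that
package; the names n, p, and bValues are placeholders, and the exact classes available
depend on your MLlib version:

import org.apache.spark.mllib.linalg.{Matrices, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// A: RDD[Array[Double]] holding the rows of an m x n matrix
val rowMat = new RowMatrix(A.map(row => Vectors.dense(row)))
// Local n x p matrix B, given as a column-major Array[Double] of length n * p
val bLocal = Matrices.dense(n, p, bValues)
val c = rowMat.multiply(bLocal)   // distributed m x p RowMatrix holding A * B

When both matrices are too large to hold locally, a blocked or join-based scheme over RDD
entries (like the coordinate sketch earlier in the thread) is the usual fallback.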
JIRA comments are mirrored to the iss...@spark.apache.org list, so people who
want to get them by email can do so. In theory one should also be able to reply
to one of those emails and have the message show up in JIRA, but I don’t think
ours is configured that way. I’m not sure why it wouldn’t
On Sun, May 18, 2014 at 8:28 PM, Andrew Ash and...@andrewash.com wrote:
The nice thing about putting discussion on the Jira is that everything
about the bug is in one place. So people looking to understand the
discussion a few years from now only have to look on the jira ticket rather
than
Hi Patrick,
If spark-submit works correctly, a user only needs to specify runtime
jars via `--jars` instead of using `sc.addJar`. Is that correct? I
checked SparkSubmit and yarn.Client but didn't find any code to handle
`args.jars` for YARN mode. So I don't know where in the code the jars
in the
Hi Sandy,
It is hard to imagine that a user needs to create an object in that
way. Since the jars are already in distributed cache before the
executor starts, is there any reason we cannot add the locally cached
jars to classpath directly?
Best,
Xiangrui
On Sun, May 18, 2014 at 4:00 PM, Sandy
BTW in Spark the consensus so far was that we’d use the dev@ list for
high-level discussions (e.g. change in the development process, major features,
proposals of new components, release votes) and keep lower-level issue tracking
in JIRA. This is just how the project operated before so it was
Hey Xiangrui,
If the jars are placed in the distributed cache and loaded statically, as
the primary app jar is in YARN, then it shouldn't be an issue. Other jars,
however, including additional jars that are sc.addJar'd and jars specified
with the spark-submit --jars argument, are loaded
The reflection actually works. But you need to get the loader by `val
loader = Thread.currentThread.getContextClassLoader` which is set by Spark
executor. Our team verified this and uses it as a workaround.
Sincerely,
DB Tsai
---
My Blog:
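A minimal sketch of that workaround; the class name is a placeholder and is assumed to
come from a jar shipped via sc.addJar or --jars:

// Inside a task, resolve a dynamically added class through the context
// classloader that the Spark executor sets on the task's thread.
rdd.map { x =>
  val loader = Thread.currentThread.getContextClassLoader
  val clazz  = Class.forName("com.example.MyAlgorithm", true, loader)   // placeholder class
  val algo   = clazz.newInstance()
  // ... call into `algo` via reflection or a shared interface on the driver classpath ...
  x
}.count()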
The jars are included in my driver, and I can successfully use them in the
driver. I'm working on a patch, and it's almost working. Will submit a PR
soon.
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn:
Since the additional jars added by sc.addJar are served through an HTTP server, even
if it works we still want a better approach for scalability (imagine
thousands of workers downloading jars from the driver).
If we ignore the fundamental scalability issue, this can be fixed by using
the
Alright, I’ve opened https://github.com/apache/spark/pull/819 with the Windows
fixes. I also found one other likely bug,
https://issues.apache.org/jira/browse/SPARK-1875, in the binary packages for
Hadoop 1 built in this RC. I think this is due to Hadoop 1’s security code
depending on a
Thanks for sharing. I am using Tachyon to store RDDs now.
2014-05-18 12:02 GMT+08:00 Christopher Nguyen c...@adatao.com:
Qing Yang, Andy is correct in answering your direct question.
At the same time, depending on your context, you may be able to apply a
pattern where you turn the single
No ideas offhand; I'll take a look tomorrow.
Tom
On Sunday, May 18, 2014 7:28 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
Alright, I’ve opened https://github.com/apache/spark/pull/819 with the Windows
fixes. I also found one other likely bug,
Hey Matei - the issue you found is not related to security. This patch
a few days ago broke builds for Hadoop 1 with YARN support enabled.
The patch directly altered the way we deal with the commons-lang
dependency, which is at the base of this stack trace.
My bad ... I was replying via mobile, and I did not realize responses
to JIRA mails were not mirrored back to JIRA - unlike PR responses!
Regards,
Mridul
On Sun, May 18, 2014 at 2:50 AM, Matei Zaharia matei.zaha...@gmail.com wrote:
We do actually have replicated StorageLevels in Spark. You can use
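For example, a minimal sketch (the input path is a placeholder):

import org.apache.spark.storage.StorageLevel

// The _2 levels keep two replicas of each cached partition on different
// executors, so losing one executor does not force recomputation.
val data = sc.textFile("hdfs:///some/input")        // placeholder path
data.persist(StorageLevel.MEMORY_AND_DISK_2)
data.count()                                        // materialize the replicated cache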