Does anyone know if there are Spark assemblies available for
download that have been built for CDH5 and YARN?
Thanks,
Philip
`(myquery))
I'm sure it won't take much imagination to figure out how to do the
matching in a batch way.
If anyone has done anything along these lines I'd love to have some
feedback.
Thanks,
Philip
On 08/04/2014 09:46 AM, Philip Ogren wrote:
This looks like a really cool feature and it seems
It is really nice that Spark RDDs provide functions that are often
equivalent to functions found in Scala collections. For example, I can
call:
myArray.map(myFx)
and equivalently
myRdd.map(myFx)
Awesome!
My question is this. Is it possible to write code that works on either
an RDD or
-parameter-forwarding-possible-in-scala
I'm not seeing a way to utilize implicit conversions in this case. Since Scala
is statically (albeit inferred) typed, I don't see a way around having a common
supertype.
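
One possible alternative to a common supertype, not discussed in this thread, is a type class resolved through implicit parameters (rather than implicit conversions). The sketch below is only an illustration under that assumption: `Mappable`, `Example`, and `lengths` are hypothetical names, and the RDD instance is shown only as a comment so that the block compiles without Spark on the classpath.

```scala
import scala.reflect.ClassTag

// Hypothetical type class: any container F that supports an element-wise map.
trait Mappable[F[_]] {
  def map[A, B: ClassTag](fa: F[A])(f: A => B): F[B]
}

object Mappable {
  // Instance for plain Scala arrays (Array.map needs the ClassTag).
  implicit val arrayMappable: Mappable[Array] = new Mappable[Array] {
    def map[A, B: ClassTag](fa: Array[A])(f: A => B): Array[B] = fa.map(f)
  }

  // Instance for immutable lists (the ClassTag is simply unused here).
  implicit val listMappable: Mappable[List] = new Mappable[List] {
    def map[A, B: ClassTag](fa: List[A])(f: A => B): List[B] = fa.map(f)
  }

  // Assumption: with Spark available, an RDD instance would follow the same shape:
  //   implicit val rddMappable: Mappable[RDD] = new Mappable[RDD] {
  //     def map[A, B: ClassTag](fa: RDD[A])(f: A => B): RDD[B] = fa.map(f)
  //   }
}

object Example {
  // Written once; usable with any container that has a Mappable instance.
  def lengths[F[_]](words: F[String])(implicit m: Mappable[F]): F[Int] =
    m.map(words)(_.length)
}
```

With this in place, `Example.lengths(Array("a", "bbb"))` and `Example.lengths(List("spark", "rdd"))` both compile against the same function body. The static typing concern still holds: the compiler has to find a `Mappable` instance for each container type at the call site.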
On Monday, July 21, 2014 11:01 AM, Philip Ogren philip.og...@oracle.com wrote:
It is really
Hi Patrick,
This is great news, but I nearly missed the announcement because it had
scrolled off the folder view where my Spark user list messages go.
There have been 40+ new threads since you sent the email out on Friday evening.
You might consider having someone on your team create a
In various previous versions of Spark (and I believe the current
version, 1.0.0, as well) we have noticed that it does not seem possible
to have a local SparkContext and a SparkContext connected to a cluster
via either a Spark Cluster (i.e. using the Spark resource manager) or a
YARN cluster.
In my unit tests, all of my tests extend a base class and inherit its
setup and teardown methods. They look something like this:
var spark: SparkContext = _

@Before
def setUp() {
  Thread.sleep(100L) // this seems to give spark more time to reset from the
I asked a question related to Marcelo's answer a few months ago. The
discussion there may be useful:
http://apache-spark-user-list.1001560.n3.nabble.com/RDD-URI-td1054.html
On 06/02/2014 06:09 PM, Marcelo Vanzin wrote:
Hi Jamal,
If what you want is to process lots of files in parallel, the
Hi Pierre,
I asked a similar question on this list about 6 weeks ago. Here is one
answer I got that is of particular note:
http://mail-archives.apache.org/mod_mbox/spark-user/201404.mbox/%3ccamjob8n3foaxd-dc5j57-n1oocwxefcg5chljwnut7qnreq...@mail.gmail.com%3E
In the upcoming release of
Have you actually found this to be true? I have found Spark local mode
to be quite good about blowing up if there is something non-serializable
and so my unit tests have been great for detecting this. I have never
seen something that worked in local mode that didn't work on the cluster
Great reference! I just skimmed through the results without reading
much of the methodology - but it looks like Spark outperforms
Stratosphere fairly consistently in the experiments. It's too bad the
data sources only range from 2GB to 8GB. Who knows if the apparent
pattern would extend out
Has there been any thought to adding a tail() method to RDD? It would
be really handy to skip over the first item in an RDD when it contains
header information. Even better would be a drop(int) function that
would allow you to skip over several lines of header information. Our
attempts to
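
For reference, one commonly used way to get the effect of drop(int) is to drop elements from the first partition only. The sketch below is an illustration, not something taken from this thread: `HeaderSkip` and `dropInFirstPartition` are hypothetical names, and it assumes that all header lines sit in partition 0. The helper is kept free of Spark types so it can be compiled on its own; the Spark call is shown as a comment.

```scala
object HeaderSkip {
  // Hypothetical helper: drops the first n elements of partition 0 and
  // leaves every other partition untouched.
  // Assumption: all n header lines are located in partition 0.
  def dropInFirstPartition[T](n: Int)(index: Int, rows: Iterator[T]): Iterator[T] =
    if (index == 0) rows.drop(n) else rows
}

// With Spark on the classpath and `lines` being an RDD[String],
// usage would look roughly like this:
//   val body = lines.mapPartitionsWithIndex(
//     (i, it) => HeaderSkip.dropInFirstPartition(1)(i, it))
```

Because only partition 0 is touched, this avoids scanning the whole RDD, but it silently keeps header lines if they spill into a second partition.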
arbitrary
format and will be deprecated soon. If you find this feature useful,
you can test it out by building the master branch of Spark yourself,
following the instructions in https://github.com/apache/spark/pull/42.
Andrew
On Wed, Apr 2, 2014 at 3:39 PM, Philip Ogren philip.og...@oracle.com
directly - I think
it's been factored nicely so it's fairly decoupled from the UI. The
concern is that this is a semi-internal piece of functionality whose
API we might, for example, want to change over time.
- Patrick
On Wed, Apr 2, 2014 at 3:39 PM, Philip Ogren philip.og...@oracle.com
to figure out
how to do this or if it is possible.
Any advice is appreciated.
Thanks,
Philip
On 04/01/2014 09:43 AM, Philip Ogren wrote:
Hi DB,
Just wondering if you ever got an answer to your question about
monitoring progress - either offline or through your own
investigation. Any findings