cc'ing dev list
Ok, looks like when the KCL version was updated in
https://github.com/apache/spark/pull/8957, the AWS SDK version was not,
probably leading to a dependency conflict, though as Burak mentions it's
hard to debug as no exceptions seem to get thrown... I've tested 1.5.2 locally
and on my
Is that PR against the master branch?
S3 reads come from Hadoop / jets3t afaik
On Fri, Dec 11, 2015 at 5:38 PM, Brian London
wrote:
> That's good news. I've got a PR in to bump the SDK version to 1.10.40 and
> the KCL to 1.6.1, which I'm running tests
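For anyone who wants to try the bumped pair locally before that PR lands,
here is a minimal sbt sketch for pinning the two together (the coordinates
are the stock AWS artifacts; treat the exact versions as placeholders for
whatever the PR settles on):

    // build.sbt -- keep the KCL and the AWS SDK moving in lockstep
    libraryDependencies += "com.amazonaws" % "amazon-kinesis-client" % "1.6.1"
    // force any transitively pulled SDK to the matching release
    dependencyOverrides += "com.amazonaws" % "aws-java-sdk" % "1.10.40"

The point is just that the two artifacts have to be upgraded together;
mixing an old SDK with a new KCL is exactly the kind of mismatch that fails
silently, as described above.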
That's a good point. I assume there's always a small risk, but it's at least
the documented way from Atlassian to change the creation date, so I'd hope
it would be okay. I'd build the minimal CSV file.
I agree that probably not a lot of people are going to search across
projects, but on the other
I noticed that it is configurable at the job level via spark.task.cpus. Is
there any way to support it at the task level?
Thanks.
Zhan Zhang
On Dec 11, 2015, at 10:46 AM, Zhan Zhang wrote:
> Hi Folks,
>
> Is it possible to assign multiple cores per task, and how? Suppose we have some
>
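For reference, the job-level knob looks like this in code (a minimal
sketch; the app name is made up, and as far as I know there is no per-task
override in current Spark):

    import org.apache.spark.{SparkConf, SparkContext}

    // Reserve 4 CPU slots for every task in the application.
    // The setting is application-wide, not per-stage or per-task.
    val conf = new SparkConf()
      .setAppName("multi-threaded-tasks") // hypothetical name
      .set("spark.task.cpus", "4")
    val sc = new SparkContext(conf)

Because it applies to every task, mixed workloads seem to require splitting
the heavy stage out into its own job or application.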
Thanks for looking at this. Is it worth fixing? Is there a risk (albeit a
small one) that the re-import would break other things?
Most of those are done, and I don't know how often people search JIRAs by
date across projects.
On Fri, Dec 11, 2015 at 3:40 PM, Lars Francke wrote:
Hi,
I found a very minor typo in:
http://www.cs.berkeley.edu/~matei/papers/2012/nsdi_spark.pdf
Page 4:
We complement the data mining example in Section 2.2.1 with two iterative
applications: logistic regression and PageRank.
I went back to Section 2.2.1, but these two examples are not there.
I am not sure if we need it. The RDD API has way too many methods and
parameters. As you said, it is simply "repartition".
On Fri, Dec 11, 2015 at 2:56 PM, Hyukjin Kwon wrote:
> Hi all,
>
> I accidentally came across the coalesce() function and found it taking arguments
> different
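For anyone following along, a quick sketch of the overlap being discussed
(assumes a live SparkContext named sc; the partition counts are arbitrary):

    // coalesce(n) narrows to fewer partitions without a shuffle,
    // so it is only a cheap way to *shrink* the partition count.
    val rdd = sc.parallelize(1 to 1000, 100)
    val narrowed = rdd.coalesce(10)

    // repartition(n) is defined as coalesce(n, shuffle = true),
    // which is why the two look redundant at the API surface.
    val viaRepartition = rdd.repartition(200)
    val viaCoalesce = rdd.coalesce(200, shuffle = true) // equivalent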
Hi Folks,
Is it possible to assign multiple cores per task, and how? Suppose we have a
scenario in which some tasks do really heavy processing on each record and
require multi-threading, and we want to avoid having similar tasks assigned
to the same executors/hosts.
If it is not supported, does it
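(Not an answer on the scheduler side, but for the multi-threading half of
the question: one common workaround is to fan out inside mapPartitions. A
rough sketch, assuming an existing rdd of Strings and a hypothetical
heavyProcess function; sizing the pool to match spark.task.cpus keeps the
scheduler's CPU accounting honest.)

    import java.util.concurrent.Executors
    import scala.concurrent.duration.Duration
    import scala.concurrent.{Await, ExecutionContext, Future}

    // Stand-in for the expensive per-record work.
    def heavyProcess(record: String): String = record.reverse

    // Fan out across a small pool inside each task; 4 threads here
    // to match a spark.task.cpus setting of 4.
    val processed = rdd.mapPartitions { records =>
      val pool = Executors.newFixedThreadPool(4)
      implicit val ec = ExecutionContext.fromExecutorService(pool)
      val futures = records.map(r => Future(heavyProcess(r))).toList
      val results = futures.map(f => Await.result(f, Duration.Inf))
      pool.shutdown()
      results.iterator
    }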
Hi,
You may have noticed that the Maven build against Hadoop 2.4 times out on Jenkins.
The last module to run is spark-hive-thriftserver.
This seems to have started with build #4440.
FYI