Thanks Shane!
From: shane knapp
Sent: Tuesday, June 20, 2017 9:23:57 PM
To: dev
Subject: Re: [build system] rolling back R to working version
this is done... i backported R to 3.1.1 and reinstalled all the R
packages so we're starting w/a clean slate. the workers are all
restarted, and i re-triggered as many PRBs as i could find.
i'll check in first thing in the morning (PDT) and see how things are going.
shane
On Tue, Jun 20, 2017
i accidentally updated R during the system update, and will be rolling
everything back to the known working versions.
again, i'm really sorry about this. our jenkins is old, and the new
ubuntu one is almost ready to go. i really can't wait to shut down
the centos boxes... they're old and crusty
you should make HBase a data source (it seems we already have an HBase
connector?), create a DataFrame from HBase, and do the join in Spark SQL.
> On 21 Jun 2017, at 10:17 AM, sunerhan1...@sina.com wrote:
>
> Hello,
> My scenario is like this:
> 1. val df = hivecontext/carboncontext.sql("sql")
>
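A minimal sketch of the suggestion above, assuming the shc (Spark-HBase) connector is on the classpath; the data source format string, catalog JSON, and all table/column names here are hypothetical illustrations, not from this thread:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hbase-join").getOrCreate()

// Hypothetical catalog mapping an HBase table to DataFrame columns.
val catalog = """{
  "table": {"namespace": "default", "name": "user_profiles"},
  "rowkey": "key",
  "columns": {
    "id":   {"cf": "rowkey", "col": "key",  "type": "string"},
    "city": {"cf": "info",   "col": "city", "type": "string"}
  }
}"""

// Read HBase as a DataFrame through the connector's data source.
val hbaseDf = spark.read
  .format("org.apache.spark.sql.execution.datasources.hbase") // shc connector
  .option("catalog", catalog)
  .load()

// DataFrame from the Hive/Carbon side, as in the quoted scenario.
val hiveDf = spark.sql("SELECT id, amount FROM some_hive_table")

// Let Spark SQL plan the join instead of dropping down to RDDs.
val joined = hiveDf.join(hbaseDf, Seq("id"))
```

With both sides exposed as DataFrames, filter pushdown and join planning stay inside the SQL optimizer.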
I will kick off the voting with a +1.
On Tue, Jun 20, 2017 at 4:49 PM, Michael Armbrust wrote:
Please vote on releasing the following candidate as Apache Spark version
2.2.0. The vote is open until Friday, June 23rd, 2017 at 18:00 PST and
passes if a majority of at least 3 +1 PMC votes are cast.
[ ] +1 Release this package as Apache Spark 2.2.0
[ ] -1 Do not release this package because ...
https://issues.apache.org/jira/browse/SPARK-21157
Hi - oftentimes, Spark applications are killed by YARN, Mesos, or the OS for
overrunning available memory. In SPARK-21157, I propose a design for
grabbing and reporting "total memory" usage for Spark executors - that is,
memory usage as visible fr
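To illustrate the kind of OS-visible metric the proposal is after (assuming "total memory" means something like resident set size, which includes off-heap and native allocations the JVM heap counters miss), here is a hedged sketch for Linux; it reads `/proc/self/status` and is not the actual SPARK-21157 implementation:

```scala
import scala.io.Source

// Resident set size of the current process in kB, as the OS sees it.
// Linux-only: parses the VmRSS line of /proc/self/status.
def residentSetSizeKb(): Option[Long] =
  Source.fromFile("/proc/self/status").getLines()
    .find(_.startsWith("VmRSS:"))
    .map(_.split("\\s+")(1).toLong)
```

A metric like this catches native and off-heap usage that JVM-internal accounting misses, which is exactly what gets executors killed by the resource manager.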
this is currently fixed, but did cause PRB failures this afternoon.
i'll go retrigger as many as i can as penance. :\
> On 20 Jun 2017, at 07:49, sririshindra wrote:
>
> Is there anything similar to s3 connector for Google cloud storage?
> Since Google cloud Storage is also an object store rather than a file
> system, I imagine the same problem that the s3 connector is trying to solve
> arises with google cloud
On 19 Jun 2017, at 16:55, Ryan Blue <rb...@netflix.com.INVALID> wrote:
I agree, the problem is that Spark is trying to be safe and avoid the direct
committer. We also modify Spark to avoid its logic. We added a property that
causes Spark to always use the output committer if the destina
(hopefully this is my last email on this subject...)
jenkins is back up. the ray and alluxio-master builds have been
de-zombified and are happily building (as well as everything else).
:)
shane
i have to apologize in advance, but it looks like we're going to have
to do an emergency restart of jenkins -- we have two zombie jobs that
aren't timing out and they're blocking new builds for those projects
from starting.
i've put jenkins in to quiet mode, and will do a restart in ~30 mins
to al
and we're back up and building!
ok, the centos packages have been released. i've put jenkins in to
quiet mode, and will be updating rpms and rebooting ASAP.
updates as they come.
shane
On Mon, Jun 19, 2017 at 2:43 PM, shane knapp wrote:
> i've updated the two ubuntu workers (amp-jenkins-staging-01 and -02),
> and am still tw
`Dataset.mapPartitions` takes `func: Iterator[T] => Iterator[U]`, which means
Spark needs to deserialize the internal binary format to type `T`, and this
deserialization is costly.
If you do need to do some hacking, you can use the internal API
`Dataset.queryExecution.toRdd.mapPartitions`, which
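The two paths above can be sketched as follows (Spark 2.x Scala API; `spark` is an existing `SparkSession`). Note that `queryExecution.toRdd` is an internal, unstable API, so this is a hack sketch, not a recommended pattern:

```scala
import org.apache.spark.sql.catalyst.InternalRow

import spark.implicits._

val ds = spark.range(0, 1000000L) // Dataset[java.lang.Long]

// Typed path: each internal row is deserialized to a Long before func sees it.
val typed = ds.mapPartitions { iter => iter.map(_ * 2) }

// Internal path: operate on InternalRow directly, skipping deserialization.
// Rows stay in Spark's binary format; you must decode fields yourself.
val raw = ds.queryExecution.toRdd.mapPartitions { iter: Iterator[InternalRow] =>
  iter
}
```

The internal path avoids the per-row decode cost, at the price of working against unsupported internals that can change between releases.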