Re: [VOTE] SPIP ML Pipelines in R

2018-06-01 Thread Hossein
Hi Shivaram,

We converged on a CRAN release process that seems identical to the current
SparkR one.

--Hossein

On Thu, May 31, 2018 at 9:10 AM, Shivaram Venkataraman <
shiva...@eecs.berkeley.edu> wrote:

> Hossein -- Can you clarify what the resolution was on the repository /
> release issue discussed in the SPIP?
>
> Shivaram
>
> On Thu, May 31, 2018 at 9:06 AM, Felix Cheung 
> wrote:
> > +1
> > With my concerns in the SPIP discussion.
> >
> > 
> > From: Hossein 
> > Sent: Wednesday, May 30, 2018 2:03:03 PM
> > To: dev@spark.apache.org
> > Subject: [VOTE] SPIP ML Pipelines in R
> >
> > Hi,
> >
> > I started a discussion thread about a new R package to expose MLlib
> > pipelines in R.
> >
> > To summarize, we will work on utilities to generate R wrappers for the
> > MLlib pipeline API in a new R package. This will lower the burden of
> > exposing new APIs in the future.
> >
> > Following the SPIP process, I am proposing the SPIP for a vote.
> >
> > +1: Let's go ahead and implement the SPIP.
> > +0: Don't really care.
> > -1: I do not think this is a good idea for the following reasons.
> >
> > Thanks,
> > --Hossein
>


Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Nicholas Chammas
pyspark --packages org.apache.hadoop:hadoop-aws:2.7.3 didn’t work for me
either (even when building with -Phadoop-2.7). I guess I’ve been relying on
an unsupported pattern and will need to figure out another way to use
s3a:// going forward.
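For reference, the pattern under discussion — pinning hadoop-aws to the Hadoop version the Spark build bundles, per Marcelo's note below — can be sketched as follows. The 2.7.3 version is an assumption from this thread, and as reported above it may still fail to resolve cleanly:

```shell
# Hedged sketch: match hadoop-aws to the Hadoop version bundled by the
# stock -Phadoop-2.7 Spark build (2.7.3 here, an assumption from this thread).
HADOOP_VERSION="2.7.3"
PYSPARK_CMD="pyspark --packages org.apache.hadoop:hadoop-aws:${HADOOP_VERSION}"
echo "$PYSPARK_CMD"
```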

On Fri, Jun 1, 2018 at 9:09 PM Marcelo Vanzin  wrote:

> I have personally never tried to include hadoop-aws that way. But at
> the very least, I'd try to use the same version of Hadoop as the Spark
> build (2.7.3 IIRC). I don't really expect a different version to work,
> and if it did in the past it definitely was not by design.
>
> On Fri, Jun 1, 2018 at 5:50 PM, Nicholas Chammas
>  wrote:
> > Building with -Phadoop-2.7 didn’t help, and if I remember correctly,
> > building with -Phadoop-2.8 worked with hadoop-aws in the 2.3.0 release,
> so
> > it appears something has changed since then.
> >
> > I wasn’t familiar with -Phadoop-cloud, but I can try that.
> >
> > My goal here is simply to confirm that this release of Spark works with
> > hadoop-aws like past releases did, particularly for Flintrock users who
> use
> > Spark with S3A.
> >
> > We currently provide -hadoop2.6, -hadoop2.7, and -without-hadoop builds
> with
> > every Spark release. If the -hadoop2.7 release build won’t work with
> > hadoop-aws anymore, are there plans to provide a new build type that
> will?
> >
> > Apologies if the question is poorly formed. I’m batting a bit outside my
> > league here. Again, my goal is simply to confirm that I/my users still
> have
> > a way to use s3a://. In the past, that way was simply to call pyspark
> > --packages org.apache.hadoop:hadoop-aws:2.8.4 or something very similar.
> If
> > that will no longer work, I’m trying to confirm that the change of
> behavior
> > is intentional or acceptable (as a review for the Spark project) and
> figure
> > out what I need to change (as due diligence for Flintrock’s users).
> >
> > Nick
> >
> >
> > On Fri, Jun 1, 2018 at 8:21 PM Marcelo Vanzin 
> wrote:
> >>
> >> Using the hadoop-aws package is probably going to be a little more
> >> complicated than that. The best bet is to use a custom build of Spark
> >> that includes it (use -Phadoop-cloud). Otherwise you're probably
> >> looking at some nasty dependency issues, especially if you end up
> >> mixing different versions of Hadoop.
> >>
> >> On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
> >>  wrote:
> >> > I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4
> >> > using
> >> > Flintrock. However, trying to load the hadoop-aws package gave me some
> >> > errors.
> >> >
> >> > $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
> >> >
> >> > 
> >> >
> >> > :: problems summary ::
> >> >  WARNINGS
> >> > [NOT FOUND  ]
> >> > com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
> >> >  local-m2-cache: tried
> >> >
> >> >
> >> >
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
> >> > [NOT FOUND  ]
> >> > com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
> >> >  local-m2-cache: tried
> >> >
> >> >
> >> >
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
> >> > [NOT FOUND  ]
> >> > org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
> >> >  local-m2-cache: tried
> >> >
> >> >
> >> >
> file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
> >> > [NOT FOUND  ]
> >> > com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
> >> >  local-m2-cache: tried
> >> >
> >> >
> >> >
> file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
> >> >
> >> > I’d guess I’m probably using the wrong version of hadoop-aws, but I
> >> > called
> >> > make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to
> try.
> >> >
> >> > Any quick pointers?
> >> >
> >> > Nick
> >> >
> >> >
> >> > On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin 
> >> > wrote:
> >> >>
> >> >> Starting with my own +1 (binding).
> >> >>
> >> >> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
> >> >> wrote:
> >> >> > Please vote on releasing the following candidate as Apache Spark
> >> >> > version
> >> >> > 2.3.1.
> >> >> >
> >> >> > Given that I expect at least a few people to be busy with Spark
> >> >> > Summit
> >> >> > next
> >> >> > week, I'm taking the liberty of setting an extended voting period.
> >> >> > The
> >> >> > vote
> >> >> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00
> PDT).
> >> >> >
> >> >> > It passes with a majority of +1 votes, which must include at least
> 3
> >> >> > +1
> >> >> > votes
> >> >> > from the PMC.
> >> >> >
> >> >> > [ ] +1 Release this package as Apache Spark 2.3.1
> >> >> > [ ] -1 Do not release this package because ...
> >> >> >
> >> >> > To learn more about Apache Spark, please see
> http://spark.apache.org/
> >> >> >
> >> >> > The tag to be 

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
I have personally never tried to include hadoop-aws that way. But at
the very least, I'd try to use the same version of Hadoop as the Spark
build (2.7.3 IIRC). I don't really expect a different version to work,
and if it did in the past it definitely was not by design.

On Fri, Jun 1, 2018 at 5:50 PM, Nicholas Chammas
 wrote:
> Building with -Phadoop-2.7 didn’t help, and if I remember correctly,
> building with -Phadoop-2.8 worked with hadoop-aws in the 2.3.0 release, so
> it appears something has changed since then.
>
> I wasn’t familiar with -Phadoop-cloud, but I can try that.
>
> My goal here is simply to confirm that this release of Spark works with
> hadoop-aws like past releases did, particularly for Flintrock users who use
> Spark with S3A.
>
> We currently provide -hadoop2.6, -hadoop2.7, and -without-hadoop builds with
> every Spark release. If the -hadoop2.7 release build won’t work with
> hadoop-aws anymore, are there plans to provide a new build type that will?
>
> Apologies if the question is poorly formed. I’m batting a bit outside my
> league here. Again, my goal is simply to confirm that I/my users still have
> a way to use s3a://. In the past, that way was simply to call pyspark
> --packages org.apache.hadoop:hadoop-aws:2.8.4 or something very similar. If
> that will no longer work, I’m trying to confirm that the change of behavior
> is intentional or acceptable (as a review for the Spark project) and figure
> out what I need to change (as due diligence for Flintrock’s users).
>
> Nick
>
>
> On Fri, Jun 1, 2018 at 8:21 PM Marcelo Vanzin  wrote:
>>
>> Using the hadoop-aws package is probably going to be a little more
>> complicated than that. The best bet is to use a custom build of Spark
>> that includes it (use -Phadoop-cloud). Otherwise you're probably
>> looking at some nasty dependency issues, especially if you end up
>> mixing different versions of Hadoop.
>>
>> On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
>>  wrote:
>> > I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4
>> > using
>> > Flintrock. However, trying to load the hadoop-aws package gave me some
>> > errors.
>> >
>> > $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
>> >
>> > 
>> >
>> > :: problems summary ::
>> >  WARNINGS
>> > [NOT FOUND  ]
>> > com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
>> >  local-m2-cache: tried
>> >
>> >
>> > file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
>> > [NOT FOUND  ]
>> > com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>> >  local-m2-cache: tried
>> >
>> >
>> > file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
>> > [NOT FOUND  ]
>> > org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>> >  local-m2-cache: tried
>> >
>> >
>> > file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
>> > [NOT FOUND  ]
>> > com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>> >  local-m2-cache: tried
>> >
>> >
>> > file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>> >
>> > I’d guess I’m probably using the wrong version of hadoop-aws, but I
>> > called
>> > make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try.
>> >
>> > Any quick pointers?
>> >
>> > Nick
>> >
>> >
>> > On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin 
>> > wrote:
>> >>
>> >> Starting with my own +1 (binding).
>> >>
>> >> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
>> >> wrote:
>> >> > Please vote on releasing the following candidate as Apache Spark
>> >> > version
>> >> > 2.3.1.
>> >> >
>> >> > Given that I expect at least a few people to be busy with Spark
>> >> > Summit
>> >> > next
>> >> > week, I'm taking the liberty of setting an extended voting period.
>> >> > The
>> >> > vote
>> >> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>> >> >
>> >> > It passes with a majority of +1 votes, which must include at least 3
>> >> > +1
>> >> > votes
>> >> > from the PMC.
>> >> >
>> >> > [ ] +1 Release this package as Apache Spark 2.3.1
>> >> > [ ] -1 Do not release this package because ...
>> >> >
>> >> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >> >
>> >> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
>> >> > https://github.com/apache/spark/tree/v2.3.1-rc4
>> >> >
>> >> > The release files, including signatures, digests, etc. can be found
>> >> > at:
>> >> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>> >> >
>> >> > Signatures used for Spark RCs can be found in this file:
>> >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >> >
>> >> > The staging repository for this release can be found at:
>> >> >
>> >> > https://repository.apache.org/content/repositories/orgapachespark-1272/
>> >> >
>> >> > The 

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Reynold Xin
+1

On Fri, Jun 1, 2018 at 3:29 PM Marcelo Vanzin  wrote:

> Please vote on releasing the following candidate as Apache Spark version
> 2.3.1.
>
> Given that I expect at least a few people to be busy with Spark Summit next
> week, I'm taking the liberty of setting an extended voting period. The vote
> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>
> It passes with a majority of +1 votes, which must include at least 3 +1
> votes
> from the PMC.
>
> [ ] +1 Release this package as Apache Spark 2.3.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
> https://github.com/apache/spark/tree/v2.3.1-rc4
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1272/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>
> The list of bug fixes going into 2.3.1 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks; in Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.1?
> ===
>
> The current list of open tickets targeted at 2.3.1 can be found at:
> https://s.apache.org/Q3Uo
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>
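The passing rule quoted above (a majority of +1 votes, including at least three PMC +1s) can be sketched as a small check. This is just an illustration of the stated rule, not an official ASF tallying tool:

```python
def release_vote_passes(plus_ones: int, minus_ones: int, pmc_plus_ones: int) -> bool:
    """True when +1 votes outnumber -1 votes and at least 3 PMC members voted +1."""
    return plus_ones > minus_ones and pmc_plus_ones >= 3

# 5 votes in favor, 1 against, 3 of the +1s cast by PMC members: passes.
print(release_vote_passes(plus_ones=5, minus_ones=1, pmc_plus_ones=3))  # True
```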


Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Nicholas Chammas
Building with -Phadoop-2.7 didn’t help, and if I remember correctly,
building with -Phadoop-2.8 worked with hadoop-aws in the 2.3.0 release, so
it appears something has changed since then.

I wasn’t familiar with -Phadoop-cloud, but I can try that.

My goal here is simply to confirm that this release of Spark works with
hadoop-aws like past releases did, particularly for Flintrock users who use
Spark with S3A.

We currently provide -hadoop2.6, -hadoop2.7, and -without-hadoop builds
with every Spark release. If the -hadoop2.7 release build won’t work with
hadoop-aws anymore, are there plans to provide a new build type that will?

Apologies if the question is poorly formed. I’m batting a bit outside my
league here. Again, my goal is simply to confirm that I/my users still have
a way to use s3a://. In the past, that way was simply to call pyspark
--packages org.apache.hadoop:hadoop-aws:2.8.4 or something very similar. If
that will no longer work, I’m trying to confirm that the change of behavior
is intentional or acceptable (as a review for the Spark project) and figure
out what I need to change (as due diligence for Flintrock’s users).

Nick

On Fri, Jun 1, 2018 at 8:21 PM Marcelo Vanzin  wrote:

> Using the hadoop-aws package is probably going to be a little more
> complicated than that. The best bet is to use a custom build of Spark
> that includes it (use -Phadoop-cloud). Otherwise you're probably
> looking at some nasty dependency issues, especially if you end up
> mixing different versions of Hadoop.
>
> On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
>  wrote:
> > I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4
> using
> > Flintrock. However, trying to load the hadoop-aws package gave me some
> > errors.
> >
> > $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
> >
> > 
> >
> > :: problems summary ::
> >  WARNINGS
> > [NOT FOUND  ]
> > com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
> >  local-m2-cache: tried
> >
> >
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
> > [NOT FOUND  ]
> > com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
> >  local-m2-cache: tried
> >
> >
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
> > [NOT FOUND  ]
> > org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
> >  local-m2-cache: tried
> >
> >
> file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
> > [NOT FOUND  ]
> > com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
> >  local-m2-cache: tried
> >
> >
> file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
> >
> > I’d guess I’m probably using the wrong version of hadoop-aws, but I
> called
> > make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try.
> >
> > Any quick pointers?
> >
> > Nick
> >
> >
> > On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin 
> wrote:
> >>
> >> Starting with my own +1 (binding).
> >>
> >> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
> >> wrote:
> >> > Please vote on releasing the following candidate as Apache Spark
> version
> >> > 2.3.1.
> >> >
> >> > Given that I expect at least a few people to be busy with Spark Summit
> >> > next
> >> > week, I'm taking the liberty of setting an extended voting period. The
> >> > vote
> >> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
> >> >
> >> > It passes with a majority of +1 votes, which must include at least 3
> +1
> >> > votes
> >> > from the PMC.
> >> >
> >> > [ ] +1 Release this package as Apache Spark 2.3.1
> >> > [ ] -1 Do not release this package because ...
> >> >
> >> > To learn more about Apache Spark, please see http://spark.apache.org/
> >> >
> >> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
> >> > https://github.com/apache/spark/tree/v2.3.1-rc4
> >> >
> >> > The release files, including signatures, digests, etc. can be found
> at:
> >> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
> >> >
> >> > Signatures used for Spark RCs can be found in this file:
> >> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >> >
> >> > The staging repository for this release can be found at:
> >> >
> https://repository.apache.org/content/repositories/orgapachespark-1272/
> >> >
> >> > The documentation corresponding to this release can be found at:
> >> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
> >> >
> >> > The list of bug fixes going into 2.3.1 can be found at the following
> >> > URL:
> >> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
> >> >
> >> > FAQ
> >> >
> >> > =
> >> > How can I help test this release?
> >> > =
> >> >
> >> > If you are a Spark user, you can help us test this release by taking
> >> > 

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
Using the hadoop-aws package is probably going to be a little more
complicated than that. The best bet is to use a custom build of Spark
that includes it (use -Phadoop-cloud). Otherwise you're probably
looking at some nasty dependency issues, especially if you end up
mixing different versions of Hadoop.
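A minimal sketch of the custom build Marcelo suggests, run from a Spark source checkout. The flags are assumptions based on the standard Spark build tooling; with -Phadoop-cloud the resulting distribution bundles the S3A connector, so no --packages flag is needed at runtime:

```shell
# Hedged sketch: build a Spark distribution that includes the cloud
# connectors (hadoop-aws among them) via the hadoop-cloud profile.
BUILD_CMD="./dev/make-distribution.sh --name hadoop-cloud --tgz -Phadoop-2.7 -Phadoop-cloud -DskipTests"
# Print the command; run it from the root of a Spark source checkout.
echo "$BUILD_CMD"
```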

On Fri, Jun 1, 2018 at 4:01 PM, Nicholas Chammas
 wrote:
> I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4 using
> Flintrock. However, trying to load the hadoop-aws package gave me some
> errors.
>
> $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
>
> 
>
> :: problems summary ::
>  WARNINGS
> [NOT FOUND  ]
> com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
>  local-m2-cache: tried
>
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
> [NOT FOUND  ]
> com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>  local-m2-cache: tried
>
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
> [NOT FOUND  ]
> org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>  local-m2-cache: tried
>
> file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
> [NOT FOUND  ]
> com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>  local-m2-cache: tried
>
> file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>
> I’d guess I’m probably using the wrong version of hadoop-aws, but I called
> make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try.
>
> Any quick pointers?
>
> Nick
>
>
> On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin  wrote:
>>
>> Starting with my own +1 (binding).
>>
>> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
>> wrote:
>> > Please vote on releasing the following candidate as Apache Spark version
>> > 2.3.1.
>> >
>> > Given that I expect at least a few people to be busy with Spark Summit
>> > next
>> > week, I'm taking the liberty of setting an extended voting period. The
>> > vote
>> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>> >
>> > It passes with a majority of +1 votes, which must include at least 3 +1
>> > votes
>> > from the PMC.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.3.1
>> > [ ] -1 Do not release this package because ...
>> >
>> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >
>> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
>> > https://github.com/apache/spark/tree/v2.3.1-rc4
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>> >
>> > Signatures used for Spark RCs can be found in this file:
>> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1272/
>> >
>> > The documentation corresponding to this release can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>> >
>> > The list of bug fixes going into 2.3.1 can be found at the following
>> > URL:
>> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
>> >
>> > FAQ
>> >
>> > =
>> > How can I help test this release?
>> > =
>> >
>> > If you are a Spark user, you can help us test this release by taking
>> > an existing Spark workload and running on this release candidate, then
>> > reporting any regressions.
>> >
>> > If you're working in PySpark, you can set up a virtual env, install
>> > the current RC, and see if anything important breaks; in Java/Scala,
>> > you can add the staging repository to your project's resolvers and test
>> > with the RC (make sure to clean up the artifact cache before/after so
>> > you don't end up building with an out-of-date RC going forward).
>> >
>> > ===
>> > What should happen to JIRA tickets still targeting 2.3.1?
>> > ===
>> >
>> > The current list of open tickets targeted at 2.3.1 can be found at:
>> > https://s.apache.org/Q3Uo
>> >
>> > Committers should look at those and triage. Extremely important bug
>> > fixes, documentation, and API tweaks that impact compatibility should
>> > be worked on immediately. Everything else please retarget to an
>> > appropriate release.
>> >
>> > ==
>> > But my bug isn't fixed?
>> > ==
>> >
>> > In order to make timely releases, we will typically not hold the
>> > release unless the bug in question is a regression from the previous
>> > release. That being said, if there is something which is a regression
>> > that has not been correctly targeted please ping me or a committer to
>> > help target the issue.
>> >
>> >
>> > --
>> > 

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Mark Hamstra
There is no hadoop-2.8 profile. Use hadoop-2.7, which is effectively
hadoop-2.7+
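Mark's point suggests a simple rule of thumb: derive the hadoop-aws coordinate from a build profile that actually exists, rather than guessing at newer versions. A hedged sketch, where the profile-to-version map is an assumption pieced together from this thread:

```python
# Assumed mapping from Spark 2.3.x build profiles to the Hadoop version
# they bundle (2.7.3 per Marcelo's message; 2.6.5 is an assumption).
BUNDLED_HADOOP = {
    "hadoop-2.6": "2.6.5",
    "hadoop-2.7": "2.7.3",
}

def hadoop_aws_package(profile: str) -> str:
    """Return the --packages coordinate matching a Spark build profile."""
    if profile not in BUNDLED_HADOOP:
        # e.g. "hadoop-2.8" -- no such profile exists, per Mark's message.
        raise ValueError(f"no such Spark build profile: {profile}")
    return f"org.apache.hadoop:hadoop-aws:{BUNDLED_HADOOP[profile]}"

print(hadoop_aws_package("hadoop-2.7"))  # org.apache.hadoop:hadoop-aws:2.7.3
```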

On Fri, Jun 1, 2018 at 4:01 PM Nicholas Chammas 
wrote:

> I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4
> using Flintrock . However, trying
> to load the hadoop-aws package gave me some errors.
>
> $ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4
>
> 
>
> :: problems summary ::
>  WARNINGS
> [NOT FOUND  ] 
> com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
>  local-m2-cache: tried
>   
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
> [NOT FOUND  ] 
> com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
>  local-m2-cache: tried
>   
> file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
> [NOT FOUND  ] 
> org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
>  local-m2-cache: tried
>   
> file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
> [NOT FOUND  ] 
> com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
>  local-m2-cache: tried
>   
> file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar
>
> I’d guess I’m probably using the wrong version of hadoop-aws, but I
> called make-distribution.sh with -Phadoop-2.8 so I’m not sure what else
> to try.
>
> Any quick pointers?
>
> Nick
>
> On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin  wrote:
>
>> Starting with my own +1 (binding).
>>
>> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
>> wrote:
>> > Please vote on releasing the following candidate as Apache Spark
>> version 2.3.1.
>> >
>> > Given that I expect at least a few people to be busy with Spark Summit
>> next
>> > week, I'm taking the liberty of setting an extended voting period. The
>> vote
>> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>> >
>> > It passes with a majority of +1 votes, which must include at least 3 +1
>> votes
>> > from the PMC.
>> >
>> > [ ] +1 Release this package as Apache Spark 2.3.1
>> > [ ] -1 Do not release this package because ...
>> >
>> > To learn more about Apache Spark, please see http://spark.apache.org/
>> >
>> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
>> > https://github.com/apache/spark/tree/v2.3.1-rc4
>> >
>> > The release files, including signatures, digests, etc. can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>> >
>> > Signatures used for Spark RCs can be found in this file:
>> > https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >
>> > The staging repository for this release can be found at:
>> > https://repository.apache.org/content/repositories/orgapachespark-1272/
>> >
>> > The documentation corresponding to this release can be found at:
>> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>> >
>> > The list of bug fixes going into 2.3.1 can be found at the following
>> URL:
>> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
>> >
>> > FAQ
>> >
>> > =
>> > How can I help test this release?
>> > =
>> >
>> > If you are a Spark user, you can help us test this release by taking
>> > an existing Spark workload and running on this release candidate, then
>> > reporting any regressions.
>> >
>> > If you're working in PySpark, you can set up a virtual env, install
>> > the current RC, and see if anything important breaks; in Java/Scala,
>> > you can add the staging repository to your project's resolvers and test
>> > with the RC (make sure to clean up the artifact cache before/after so
>> > you don't end up building with an out-of-date RC going forward).
>> >
>> > ===
>> > What should happen to JIRA tickets still targeting 2.3.1?
>> > ===
>> >
>> > The current list of open tickets targeted at 2.3.1 can be found at:
>> > https://s.apache.org/Q3Uo
>> >
>> > Committers should look at those and triage. Extremely important bug
>> > fixes, documentation, and API tweaks that impact compatibility should
>> > be worked on immediately. Everything else please retarget to an
>> > appropriate release.
>> >
>> > ==
>> > But my bug isn't fixed?
>> > ==
>> >
>> > In order to make timely releases, we will typically not hold the
>> > release unless the bug in question is a regression from the previous
>> > release. That being said, if there is something which is a regression
>> > that has not been correctly targeted please ping me or a committer to
>> > help target the issue.
>> >
>> >
>> > --
>> > Marcelo
>>
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: 

Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Nicholas Chammas
I was able to successfully launch a Spark cluster on EC2 at 2.3.1 RC4 using
Flintrock . However, trying to load
the hadoop-aws package gave me some errors.

$ pyspark --packages org.apache.hadoop:hadoop-aws:2.8.4



:: problems summary ::
 WARNINGS
[NOT FOUND  ]
com.sun.jersey#jersey-json;1.9!jersey-json.jar(bundle) (2ms)
 local-m2-cache: tried
  
file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-json/1.9/jersey-json-1.9.jar
[NOT FOUND  ]
com.sun.jersey#jersey-server;1.9!jersey-server.jar(bundle) (0ms)
 local-m2-cache: tried
  
file:/home/ec2-user/.m2/repository/com/sun/jersey/jersey-server/1.9/jersey-server-1.9.jar
[NOT FOUND  ]
org.codehaus.jettison#jettison;1.1!jettison.jar(bundle) (1ms)
 local-m2-cache: tried
  
file:/home/ec2-user/.m2/repository/org/codehaus/jettison/jettison/1.1/jettison-1.1.jar
[NOT FOUND  ]
com.sun.xml.bind#jaxb-impl;2.2.3-1!jaxb-impl.jar (0ms)
 local-m2-cache: tried
  
file:/home/ec2-user/.m2/repository/com/sun/xml/bind/jaxb-impl/2.2.3-1/jaxb-impl-2.2.3-1.jar

I’d guess I’m probably using the wrong version of hadoop-aws, but I called
make-distribution.sh with -Phadoop-2.8 so I’m not sure what else to try.

Any quick pointers?

Nick

On Fri, Jun 1, 2018 at 6:29 PM Marcelo Vanzin  wrote:

> Starting with my own +1 (binding).
>
> On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin 
> wrote:
> > Please vote on releasing the following candidate as Apache Spark version
> 2.3.1.
> >
> > Given that I expect at least a few people to be busy with Spark Summit
> next
> > week, I'm taking the liberty of setting an extended voting period. The
> vote
> > will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
> >
> > It passes with a majority of +1 votes, which must include at least 3 +1
> votes
> > from the PMC.
> >
> > [ ] +1 Release this package as Apache Spark 2.3.1
> > [ ] -1 Do not release this package because ...
> >
> > To learn more about Apache Spark, please see http://spark.apache.org/
> >
> > The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
> > https://github.com/apache/spark/tree/v2.3.1-rc4
> >
> > The release files, including signatures, digests, etc. can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
> >
> > Signatures used for Spark RCs can be found in this file:
> > https://dist.apache.org/repos/dist/dev/spark/KEYS
> >
> > The staging repository for this release can be found at:
> > https://repository.apache.org/content/repositories/orgapachespark-1272/
> >
> > The documentation corresponding to this release can be found at:
> > https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
> >
> > The list of bug fixes going into 2.3.1 can be found at the following URL:
> > https://issues.apache.org/jira/projects/SPARK/versions/12342432
> >
> > FAQ
> >
> > =
> > How can I help test this release?
> > =
> >
> > If you are a Spark user, you can help us test this release by taking
> > an existing Spark workload and running on this release candidate, then
> > reporting any regressions.
> >
> > If you're working in PySpark, you can set up a virtual env, install
> > the current RC, and see if anything important breaks; in Java/Scala,
> > you can add the staging repository to your project's resolvers and test
> > with the RC (make sure to clean up the artifact cache before/after so
> > you don't end up building with an out-of-date RC going forward).
> >
> > ===
> > What should happen to JIRA tickets still targeting 2.3.1?
> > ===
> >
> > The current list of open tickets targeted at 2.3.1 can be found at:
> > https://s.apache.org/Q3Uo
> >
> > Committers should look at those and triage. Extremely important bug
> > fixes, documentation, and API tweaks that impact compatibility should
> > be worked on immediately. Everything else please retarget to an
> > appropriate release.
> >
> > ==
> > But my bug isn't fixed?
> > ==
> >
> > In order to make timely releases, we will typically not hold the
> > release unless the bug in question is a regression from the previous
> > release. That being said, if there is something which is a regression
> > that has not been correctly targeted please ping me or a committer to
> > help target the issue.
> >
> >
> > --
> > Marcelo
>
>
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE] [SPARK-24374] SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Xiao Li
+1

2018-06-01 15:41 GMT-07:00 Xingbo Jiang :

> +1
>
> 2018-06-01 9:21 GMT-07:00 Xiangrui Meng :
>
>> Hi all,
>>
>> I want to call for a vote of SPARK-24374. It introduces a new
>> execution mode to Spark, which would help both integration with external
>> DL/AI frameworks and MLlib algorithm performance. This is one of the
>> follow-ups from a previous discussion on dev@.
>>
>> The vote will be up for the next 72 hours. Please reply with your vote:
>>
>> +1: Yeah, let's go forward and implement the SPIP.
>> +0: Don't really care.
>> -1: I don't think this is a good idea because of the following technical
>> reasons.
>>
>> Best,
>> Xiangrui
>> --
>>
>> Xiangrui Meng
>>
>> Software Engineer
>>
>> Databricks Inc. (http://databricks.com)
>>
>
>


Re: [VOTE] [SPARK-24374] SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Xingbo Jiang
+1

2018-06-01 9:21 GMT-07:00 Xiangrui Meng :

> Hi all,
>
> I want to call for a vote of SPARK-24374. It introduces a new
> execution mode to Spark, which would help both integration with external
> DL/AI frameworks and MLlib algorithm performance. This is one of the
> follow-ups from a previous discussion on dev@.
>
> The vote will be up for the next 72 hours. Please reply with your vote:
>
> +1: Yeah, let's go forward and implement the SPIP.
> +0: Don't really care.
> -1: I don't think this is a good idea because of the following technical
> reasons.
>
> Best,
> Xiangrui
> --
>
> Xiangrui Meng
>
> Software Engineer
>
> Databricks Inc. (http://databricks.com)
>


Re: [VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
Starting with my own +1 (binding).

On Fri, Jun 1, 2018 at 3:28 PM, Marcelo Vanzin  wrote:
> Please vote on releasing the following candidate as Apache Spark version 
> 2.3.1.
>
> Given that I expect at least a few people to be busy with Spark Summit next
> week, I'm taking the liberty of setting an extended voting period. The vote
> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>
> It passes with a majority of +1 votes, which must include at least 3 +1 votes
> from the PMC.
>
> [ ] +1 Release this package as Apache Spark 2.3.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
> https://github.com/apache/spark/tree/v2.3.1-rc4
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1272/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/
>
> The list of bug fixes going into 2.3.1 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.1?
> ===
>
> The current list of open tickets targeted at 2.3.1 can be found at:
> https://s.apache.org/Q3Uo
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
>
> --
> Marcelo



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



[VOTE] Spark 2.3.1 (RC4)

2018-06-01 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1.

Given that I expect at least a few people to be busy with Spark Summit next
week, I'm taking the liberty of setting an extended voting period. The vote
will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).

It passes with a majority of +1 votes, which must include at least 3 +1 votes
from the PMC.

[ ] +1 Release this package as Apache Spark 2.3.1
[ ] -1 Do not release this package because ...
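The pass rule above is simple enough to check mechanically. A hypothetical tally helper, illustrative only and not part of any release tooling (the sample votes are made up):

```python
# Illustrative-only tally of the pass rule quoted above: a majority of
# +1 votes overall, which must include at least three +1 votes from PMC
# members. This helper and its sample data are hypothetical.

def vote_passes(votes):
    """votes: list of (vote, is_pmc) pairs, with vote in {1, 0, -1}."""
    plus = sum(1 for v, _ in votes if v == 1)
    minus = sum(1 for v, _ in votes if v == -1)
    pmc_plus = sum(1 for v, pmc in votes if v == 1 and pmc)
    return plus > minus and pmc_plus >= 3

sample = [(1, True), (1, True), (1, True), (1, False), (-1, False)]
print(vote_passes(sample))  # True: four +1s vs one -1, three PMC +1s
```

Note that +0 votes count toward neither side, matching the "don't really care" option in the ballots on this list.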

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.3.1-rc4 (commit 30aaa5a3):
https://github.com/apache/spark/tree/v2.3.1-rc4

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1272/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-docs/

The list of bug fixes going into 2.3.1 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12342432

FAQ

=
How can I help test this release?
=

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark, you can set up a virtual env, install
the current RC, and see if anything important breaks. In Java/Scala,
you can add the staging repository to your project's resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).
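The PySpark test setup described above can be sketched as a dry run. The commands are echoed rather than executed, and the pyspark tarball name under the rc4 bin directory is an assumption, not a verified path; drop the echo wrapper to run it for real:

```shell
# Dry-run sketch of testing the RC in a virtual env. The artifact name
# below is an assumption about the rc4 staging layout, not a verified path.
RC_URL="https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc4-bin"
PKG="pyspark-2.3.1.tar.gz"

run() { echo "+ $*"; }   # replace 'echo' with direct execution to actually test

run python3 -m venv spark-rc-test
run . spark-rc-test/bin/activate
run pip install "$RC_URL/$PKG"
run python -c 'import pyspark; print(pyspark.__version__)'
```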

===
What should happen to JIRA tickets still targeting 2.3.1?
===

The current list of open tickets targeted at 2.3.1 can be found at:
https://s.apache.org/Q3Uo

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==
But my bug isn't fixed?
==

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.


-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Xiao Li
Sure, I will send the RM an email next time I make a release-blocker change.

Xiao

2018-06-01 13:32 GMT-07:00 Reynold Xin :

> Yes, everybody please cc the release manager on changes that merit a -1.
> It's high overhead; let's make this smoother.
>
>
> On Fri, Jun 1, 2018 at 1:28 PM Marcelo Vanzin  wrote:
>
>> Xiao,
>>
>> This is the third time in this release cycle that this is happening.
>> Sorry to single out you guys, but can you please do three things:
>>
>> - do not merge things in 2.3 you're not absolutely sure about
>> - make sure that things you backport to 2.3 are not causing problems
>> - let the RM know about these things as soon as you discover them, not
>> when they send the next RC for voting.
>>
>> Even though I was in the middle of preparing the rc, I could have
>> easily aborted that and skipped this whole thread.
>>
>> This vote is canceled. I'll prepare a new RC right away. I hope this
>> does not happen again.
>>
>>
>> On Fri, Jun 1, 2018 at 1:20 PM, Xiao Li  wrote:
>> > Sorry, I need to say -1
>> >
>> > This morning, just found a regression in 2.3.1 and reverted
>> > https://github.com/apache/spark/pull/21443
>> >
>> > Xiao
>> >
>> > 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :
>> >>
>> >> Please vote on releasing the following candidate as Apache Spark
>> version
>> >> 2.3.1.
>> >>
>> >> Given that I expect at least a few people to be busy with Spark Summit
>> >> next
>> >> week, I'm taking the liberty of setting an extended voting period. The
>> >> vote
>> >> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>> >>
>> >> It passes with a majority of +1 votes, which must include at least 3 +1
>> >> votes
>> >> from the PMC.
>> >>
>> >> [ ] +1 Release this package as Apache Spark 2.3.1
>> >> [ ] -1 Do not release this package because ...
>> >>
>> >> To learn more about Apache Spark, please see http://spark.apache.org/
>> >>
>> >> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
>> >> https://github.com/apache/spark/tree/v2.3.1-rc3
>> >>
>> >> The release files, including signatures, digests, etc. can be found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>> >>
>> >> Signatures used for Spark RCs can be found in this file:
>> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
>> >>
>> >> The staging repository for this release can be found at:
>> >> https://repository.apache.org/content/repositories/
>> orgapachespark-1271/
>> >>
>> >> The documentation corresponding to this release can be found at:
>> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>> >>
>> >> The list of bug fixes going into 2.3.1 can be found at the following
>> URL:
>> >> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>> >>
>> >> FAQ
>> >>
>> >> =
>> >> How can I help test this release?
>> >> =
>> >>
>> >> If you are a Spark user, you can help us test this release by taking
>> >> an existing Spark workload and running on this release candidate, then
>> >> reporting any regressions.
>> >>
>> >> If you're working in PySpark, you can set up a virtual env, install
>> >> the current RC, and see if anything important breaks. In Java/Scala,
>> >> you can add the staging repository to your project's resolvers and test
>> >> with the RC (make sure to clean up the artifact cache before/after so
>> >> you don't end up building with an out-of-date RC going forward).
>> >>
>> >> ===
>> >> What should happen to JIRA tickets still targeting 2.3.1?
>> >> ===
>> >>
>> >> The current list of open tickets targeted at 2.3.1 can be found at:
>> >> https://s.apache.org/Q3Uo
>> >>
>> >> Committers should look at those and triage. Extremely important bug
>> >> fixes, documentation, and API tweaks that impact compatibility should
>> >> be worked on immediately. Everything else please retarget to an
>> >> appropriate release.
>> >>
>> >> ==
>> >> But my bug isn't fixed?
>> >> ==
>> >>
>> >> In order to make timely releases, we will typically not hold the
>> >> release unless the bug in question is a regression from the previous
>> >> release. That being said, if there is something which is a regression
>> >> that has not been correctly targeted please ping me or a committer to
>> >> help target the issue.
>> >>
>> >>
>> >> --
>> >> Marcelo
>> >>
>> >> -
>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> >>
>> >
>>
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>


Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Reynold Xin
Yes, everybody please cc the release manager on changes that merit a -1.
It's high overhead; let's make this smoother.


On Fri, Jun 1, 2018 at 1:28 PM Marcelo Vanzin  wrote:

> Xiao,
>
> This is the third time in this release cycle that this is happening.
> Sorry to single out you guys, but can you please do three things:
>
> - do not merge things in 2.3 you're not absolutely sure about
> - make sure that things you backport to 2.3 are not causing problems
> - let the RM know about these things as soon as you discover them, not
> when they send the next RC for voting.
>
> Even though I was in the middle of preparing the rc, I could have
> easily aborted that and skipped this whole thread.
>
> This vote is canceled. I'll prepare a new RC right away. I hope this
> does not happen again.
>
>
> On Fri, Jun 1, 2018 at 1:20 PM, Xiao Li  wrote:
> > Sorry, I need to say -1
> >
> > This morning, just found a regression in 2.3.1 and reverted
> > https://github.com/apache/spark/pull/21443
> >
> > Xiao
> >
> > 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :
> >>
> >> Please vote on releasing the following candidate as Apache Spark version
> >> 2.3.1.
> >>
> >> Given that I expect at least a few people to be busy with Spark Summit
> >> next
> >> week, I'm taking the liberty of setting an extended voting period. The
> >> vote
> >> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
> >>
> >> It passes with a majority of +1 votes, which must include at least 3 +1
> >> votes
> >> from the PMC.
> >>
> >> [ ] +1 Release this package as Apache Spark 2.3.1
> >> [ ] -1 Do not release this package because ...
> >>
> >> To learn more about Apache Spark, please see http://spark.apache.org/
> >>
> >> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
> >> https://github.com/apache/spark/tree/v2.3.1-rc3
> >>
> >> The release files, including signatures, digests, etc. can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
> >>
> >> Signatures used for Spark RCs can be found in this file:
> >> https://dist.apache.org/repos/dist/dev/spark/KEYS
> >>
> >> The staging repository for this release can be found at:
> >> https://repository.apache.org/content/repositories/orgapachespark-1271/
> >>
> >> The documentation corresponding to this release can be found at:
> >> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
> >>
> >> The list of bug fixes going into 2.3.1 can be found at the following
> URL:
> >> https://issues.apache.org/jira/projects/SPARK/versions/12342432
> >>
> >> FAQ
> >>
> >> =
> >> How can I help test this release?
> >> =
> >>
> >> If you are a Spark user, you can help us test this release by taking
> >> an existing Spark workload and running on this release candidate, then
> >> reporting any regressions.
> >>
> >> If you're working in PySpark, you can set up a virtual env, install
> >> the current RC, and see if anything important breaks. In Java/Scala,
> >> you can add the staging repository to your project's resolvers and test
> >> with the RC (make sure to clean up the artifact cache before/after so
> >> you don't end up building with an out-of-date RC going forward).
> >>
> >> ===
> >> What should happen to JIRA tickets still targeting 2.3.1?
> >> ===
> >>
> >> The current list of open tickets targeted at 2.3.1 can be found at:
> >> https://s.apache.org/Q3Uo
> >>
> >> Committers should look at those and triage. Extremely important bug
> >> fixes, documentation, and API tweaks that impact compatibility should
> >> be worked on immediately. Everything else please retarget to an
> >> appropriate release.
> >>
> >> ==
> >> But my bug isn't fixed?
> >> ==
> >>
> >> In order to make timely releases, we will typically not hold the
> >> release unless the bug in question is a regression from the previous
> >> release. That being said, if there is something which is a regression
> >> that has not been correctly targeted please ping me or a committer to
> >> help target the issue.
> >>
> >>
> >> --
> >> Marcelo
> >>
> >> -
> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
> >>
> >
>
>
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Marcelo Vanzin
Xiao,

This is the third time in this release cycle that this is happening.
Sorry to single out you guys, but can you please do three things:

- do not merge things in 2.3 you're not absolutely sure about
- make sure that things you backport to 2.3 are not causing problems
- let the RM know about these things as soon as you discover them, not
when they send the next RC for voting.

Even though I was in the middle of preparing the rc, I could have
easily aborted that and skipped this whole thread.

This vote is canceled. I'll prepare a new RC right away. I hope this
does not happen again.


On Fri, Jun 1, 2018 at 1:20 PM, Xiao Li  wrote:
> Sorry, I need to say -1
>
> This morning, just found a regression in 2.3.1 and reverted
> https://github.com/apache/spark/pull/21443
>
> Xiao
>
> 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :
>>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.3.1.
>>
>> Given that I expect at least a few people to be busy with Spark Summit
>> next
>> week, I'm taking the liberty of setting an extended voting period. The
>> vote
>> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>>
>> It passes with a majority of +1 votes, which must include at least 3 +1
>> votes
>> from the PMC.
>>
>> [ ] +1 Release this package as Apache Spark 2.3.1
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
>> https://github.com/apache/spark/tree/v2.3.1-rc3
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1271/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>>
>> The list of bug fixes going into 2.3.1 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark, you can set up a virtual env, install
>> the current RC, and see if anything important breaks. In Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 2.3.1?
>> ===
>>
>> The current list of open tickets targeted at 2.3.1 can be found at:
>> https://s.apache.org/Q3Uo
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>



-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Xiao Li
Based on the plan, the changes in that PR added extra Aggregate and
Expand operators for common queries such as:

SELECT sum(DISTINCT x), avg(DISTINCT x) FROM tab

Both Aggregate and Expand are expensive operators.
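For readers following along: a DISTINCT aggregate forces an extra de-duplicating Aggregate before the final one, and mixing DISTINCT aggregates over different columns additionally requires an Expand that replicates each input row once per aggregate group. A pure-Python sketch of the idea follows; this is illustrative only, not Spark's planner code:

```python
# Illustrative sketch, not Spark code: why sum(DISTINCT x) / avg(DISTINCT x)
# need an extra Aggregate, and what Expand-style row replication looks like.

rows = [3, 3, 5, 7, 7, 7]

# Extra Aggregate: de-duplicate x first (one full grouping pass).
distinct_x = set(rows)

# Final Aggregate over the de-duplicated values.
sum_distinct = sum(distinct_x)
avg_distinct = sum_distinct / len(distinct_x)
print(sum_distinct, avg_distinct)  # 15 5.0

# Expand (sketch): with DISTINCT aggregates over *different* columns,
# each input row is replicated once per aggregate group, tagged with a gid.
pairs = [(1, 10), (1, 20)]
expanded = [(gid, x if gid == 0 else None, y if gid == 1 else None)
            for (x, y) in pairs for gid in (0, 1)]
print(len(expanded))  # 4: the input is doubled, which is why Expand is costly
```

In Spark's physical plans these correspond to the Aggregate and Expand operators Xiao mentions, hence the concern about adding them to common queries.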



2018-06-01 13:24 GMT-07:00 Sean Owen :

> Hm, that was merged two days ago, and you decided to revert it two hours ago.
>
> It sounds like this was maybe risky to put into 2.3.x during the RC phase,
> at least.
> You also don't seem certain whether there's a performance problem; how
> sure are you?
>
> These may all have been the right thing to do given available info, but
> this does seem like too much rapid change at this stage of an RC.
>
> On Fri, Jun 1, 2018 at 3:20 PM Xiao Li  wrote:
>
>> Sorry, I need to say -1
>>
>> This morning, just found a regression in 2.3.1 and reverted
>> https://github.com/apache/spark/pull/21443
>>
>> Xiao
>>
>> 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :
>>
>>> Please vote on releasing the following candidate as Apache Spark version
>>> 2.3.1.
>>>
>>> Given that I expect at least a few people to be busy with Spark Summit
>>> next
>>> week, I'm taking the liberty of setting an extended voting period. The
>>> vote
>>> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>>>
>>> It passes with a majority of +1 votes, which must include at least 3 +1
>>> votes
>>> from the PMC.
>>>
>>> [ ] +1 Release this package as Apache Spark 2.3.1
>>> [ ] -1 Do not release this package because ...
>>>
>>> To learn more about Apache Spark, please see http://spark.apache.org/
>>>
>>> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
>>> https://github.com/apache/spark/tree/v2.3.1-rc3
>>>
>>> The release files, including signatures, digests, etc. can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>>>
>>> Signatures used for Spark RCs can be found in this file:
>>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>>
>>> The staging repository for this release can be found at:
>>> https://repository.apache.org/content/repositories/orgapachespark-1271/
>>>
>>> The documentation corresponding to this release can be found at:
>>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>>>
>>> The list of bug fixes going into 2.3.1 can be found at the following URL:
>>> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>>>
>>> FAQ
>>>
>>> =
>>> How can I help test this release?
>>> =
>>>
>>> If you are a Spark user, you can help us test this release by taking
>>> an existing Spark workload and running on this release candidate, then
>>> reporting any regressions.
>>>
>>> If you're working in PySpark, you can set up a virtual env, install
>>> the current RC, and see if anything important breaks. In Java/Scala,
>>> you can add the staging repository to your project's resolvers and test
>>> with the RC (make sure to clean up the artifact cache before/after so
>>> you don't end up building with an out-of-date RC going forward).
>>>
>>> ===
>>> What should happen to JIRA tickets still targeting 2.3.1?
>>> ===
>>>
>>> The current list of open tickets targeted at 2.3.1 can be found at:
>>> https://s.apache.org/Q3Uo
>>>
>>> Committers should look at those and triage. Extremely important bug
>>> fixes, documentation, and API tweaks that impact compatibility should
>>> be worked on immediately. Everything else please retarget to an
>>> appropriate release.
>>>
>>> ==
>>> But my bug isn't fixed?
>>> ==
>>>
>>> In order to make timely releases, we will typically not hold the
>>> release unless the bug in question is a regression from the previous
>>> release. That being said, if there is something which is a regression
>>> that has not been correctly targeted please ping me or a committer to
>>> help target the issue.
>>>
>>>
>>> --
>>> Marcelo
>>>
>>> -
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>


Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Sean Owen
Hm, that was merged two days ago, and you decided to revert it two hours ago.

It sounds like this was maybe risky to put into 2.3.x during the RC phase,
at least.
You also don't seem certain whether there's a performance problem; how sure
are you?

These may all have been the right thing to do given available info, but
this does seem like too much rapid change at this stage of an RC.

On Fri, Jun 1, 2018 at 3:20 PM Xiao Li  wrote:

> Sorry, I need to say -1
>
> This morning, just found a regression in 2.3.1 and reverted
> https://github.com/apache/spark/pull/21443
>
> Xiao
>
> 2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :
>
>> Please vote on releasing the following candidate as Apache Spark version
>> 2.3.1.
>>
>> Given that I expect at least a few people to be busy with Spark Summit
>> next
>> week, I'm taking the liberty of setting an extended voting period. The
>> vote
>> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>>
>> It passes with a majority of +1 votes, which must include at least 3 +1
>> votes
>> from the PMC.
>>
>> [ ] +1 Release this package as Apache Spark 2.3.1
>> [ ] -1 Do not release this package because ...
>>
>> To learn more about Apache Spark, please see http://spark.apache.org/
>>
>> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
>> https://github.com/apache/spark/tree/v2.3.1-rc3
>>
>> The release files, including signatures, digests, etc. can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>>
>> Signatures used for Spark RCs can be found in this file:
>> https://dist.apache.org/repos/dist/dev/spark/KEYS
>>
>> The staging repository for this release can be found at:
>> https://repository.apache.org/content/repositories/orgapachespark-1271/
>>
>> The documentation corresponding to this release can be found at:
>> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>>
>> The list of bug fixes going into 2.3.1 can be found at the following URL:
>> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>>
>> FAQ
>>
>> =
>> How can I help test this release?
>> =
>>
>> If you are a Spark user, you can help us test this release by taking
>> an existing Spark workload and running on this release candidate, then
>> reporting any regressions.
>>
>> If you're working in PySpark, you can set up a virtual env, install
>> the current RC, and see if anything important breaks. In Java/Scala,
>> you can add the staging repository to your project's resolvers and test
>> with the RC (make sure to clean up the artifact cache before/after so
>> you don't end up building with an out-of-date RC going forward).
>>
>> ===
>> What should happen to JIRA tickets still targeting 2.3.1?
>> ===
>>
>> The current list of open tickets targeted at 2.3.1 can be found at:
>> https://s.apache.org/Q3Uo
>>
>> Committers should look at those and triage. Extremely important bug
>> fixes, documentation, and API tweaks that impact compatibility should
>> be worked on immediately. Everything else please retarget to an
>> appropriate release.
>>
>> ==
>> But my bug isn't fixed?
>> ==
>>
>> In order to make timely releases, we will typically not hold the
>> release unless the bug in question is a regression from the previous
>> release. That being said, if there is something which is a regression
>> that has not been correctly targeted please ping me or a committer to
>> help target the issue.
>>
>>
>> --
>> Marcelo
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>


Re: [VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Xiao Li
Sorry, I need to say -1

This morning, just found a regression in 2.3.1 and reverted
https://github.com/apache/spark/pull/21443

Xiao

2018-06-01 13:09 GMT-07:00 Marcelo Vanzin :

> Please vote on releasing the following candidate as Apache Spark version
> 2.3.1.
>
> Given that I expect at least a few people to be busy with Spark Summit next
> week, I'm taking the liberty of setting an extended voting period. The vote
> will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).
>
> It passes with a majority of +1 votes, which must include at least 3 +1
> votes
> from the PMC.
>
> [ ] +1 Release this package as Apache Spark 2.3.1
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
> https://github.com/apache/spark/tree/v2.3.1-rc3
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1271/
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/
>
> The list of bug fixes going into 2.3.1 can be found at the following URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12342432
>
> FAQ
>
> =
> How can I help test this release?
> =
>
> If you are a Spark user, you can help us test this release by taking
> an existing Spark workload and running on this release candidate, then
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install
> the current RC, and see if anything important breaks. In Java/Scala,
> you can add the staging repository to your project's resolvers and test
> with the RC (make sure to clean up the artifact cache before/after so
> you don't end up building with an out-of-date RC going forward).
>
> ===
> What should happen to JIRA tickets still targeting 2.3.1?
> ===
>
> The current list of open tickets targeted at 2.3.1 can be found at:
> https://s.apache.org/Q3Uo
>
> Committers should look at those and triage. Extremely important bug
> fixes, documentation, and API tweaks that impact compatibility should
> be worked on immediately. Everything else please retarget to an
> appropriate release.
>
> ==
> But my bug isn't fixed?
> ==
>
> In order to make timely releases, we will typically not hold the
> release unless the bug in question is a regression from the previous
> release. That being said, if there is something which is a regression
> that has not been correctly targeted please ping me or a committer to
> help target the issue.
>
>
> --
> Marcelo
>
> -
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


[VOTE] Spark 2.3.1 (RC3)

2018-06-01 Thread Marcelo Vanzin
Please vote on releasing the following candidate as Apache Spark version 2.3.1.

Given that I expect at least a few people to be busy with Spark Summit next
week, I'm taking the liberty of setting an extended voting period. The vote
will be open until Friday, June 8th, at 19:00 UTC (that's 12:00 PDT).

It passes with a majority of +1 votes, which must include at least 3 +1 votes
from the PMC.

[ ] +1 Release this package as Apache Spark 2.3.1
[ ] -1 Do not release this package because ...

To learn more about Apache Spark, please see http://spark.apache.org/

The tag to be voted on is v2.3.1-rc3 (commit 1cc5f68b):
https://github.com/apache/spark/tree/v2.3.1-rc3

The release files, including signatures, digests, etc. can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-bin/

Signatures used for Spark RCs can be found in this file:
https://dist.apache.org/repos/dist/dev/spark/KEYS
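For anyone verifying the downloaded release files, the digest check can be
sketched with Python's standard library (the exact digest-file naming under
the RC directory above is an assumption; check the listing for the published
digest files):

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file, streaming in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


# Compare the result against the published digest, e.g.:
# sha512_of("spark-2.3.1-bin-hadoop2.7.tgz") == expected_hex_digest
```

Streaming in chunks keeps memory flat even for multi-hundred-MB release
tarballs.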

The staging repository for this release can be found at:
https://repository.apache.org/content/repositories/orgapachespark-1271/

The documentation corresponding to this release can be found at:
https://dist.apache.org/repos/dist/dev/spark/v2.3.1-rc3-docs/

The list of bug fixes going into 2.3.1 can be found at the following URL:
https://issues.apache.org/jira/projects/SPARK/versions/12342432

FAQ

=
How can I help test this release?
=

If you are a Spark user, you can help us test this release by taking
an existing Spark workload and running on this release candidate, then
reporting any regressions.

If you're working in PySpark, you can set up a virtual env and install
the current RC to see if anything important breaks. In Java/Scala, you
can add the staging repository to your project's resolvers and test
with the RC (make sure to clean up the artifact cache before/after so
you don't end up building with an out-of-date RC going forward).

===
What should happen to JIRA tickets still targeting 2.3.1?
===

The current list of open tickets targeted at 2.3.1 can be found at:
https://s.apache.org/Q3Uo

Committers should look at those and triage. Extremely important bug
fixes, documentation, and API tweaks that impact compatibility should
be worked on immediately. Everything else please retarget to an
appropriate release.

==
But my bug isn't fixed?
==

In order to make timely releases, we will typically not hold the
release unless the bug in question is a regression from the previous
release. That being said, if there is something which is a regression
that has not been correctly targeted please ping me or a committer to
help target the issue.


-- 
Marcelo

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] SPIP ML Pipelines in R

2018-06-01 Thread Xiangrui Meng
+1.

On Thu, May 31, 2018 at 2:28 PM Joseph Bradley 
wrote:

> Hossein might be slow to respond (OOO), but I just commented on the JIRA.
> I'd recommend we follow the same process as the SparkR package.
>
> +1 on this from me (and I'll be happy to help shepherd it, though Felix
> and Shivaram are the experts in this area).  CRAN presents challenges, but
> this is a good step towards making R a first-class citizen for ML use cases
> of Spark.
>
> On Thu, May 31, 2018 at 9:10 AM, Shivaram Venkataraman <
> shiva...@eecs.berkeley.edu> wrote:
>
>> Hossein -- Can you clarify what the resolution was on the repository /
>> release issue discussed in the SPIP?
>>
>> Shivaram
>>
>> On Thu, May 31, 2018 at 9:06 AM, Felix Cheung 
>> wrote:
>> > +1
>> > With my concerns in the SPIP discussion.
>> >
>> > 
>> > From: Hossein 
>> > Sent: Wednesday, May 30, 2018 2:03:03 PM
>> > To: dev@spark.apache.org
>> > Subject: [VOTE] SPIP ML Pipelines in R
>> >
>> > Hi,
>> >
>> > I started a discussion thread for a new R package to expose MLlib
>> > pipelines in R.
>> >
>> > To summarize, we will work on utilities to generate R wrappers for the
>> > MLlib pipeline API for a new R package. This will lower the burden of
>> > exposing new APIs in the future.
>> >
>> > Following the SPIP process, I am proposing the SPIP for a vote.
>> >
>> > +1: Let's go ahead and implement the SPIP.
>> > +0: Don't really care.
>> > -1: I do not think this is a good idea for the following reasons.
>> >
>> > Thanks,
>> > --Hossein
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>
>
> --
>
> Joseph Bradley
>
> Software Engineer - Machine Learning
>
> Databricks, Inc.
>
>
-- 

Xiangrui Meng

Software Engineer

Databricks Inc.


[VOTE] [SPARK-24374] SPIP: Support Barrier Scheduling in Apache Spark

2018-06-01 Thread Xiangrui Meng
Hi all,

I want to call for a vote on SPARK-24374. It introduces a new execution
mode to Spark, which would help both integration with external DL/AI
frameworks and MLlib algorithm performance. This is one of the follow-ups
from a previous discussion on dev@.
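For intuition, the core primitive here is gang-style synchronization: every
task in a stage blocks until all tasks in that stage have reached the
barrier. A minimal single-machine analogue using Python's
`threading.Barrier` (purely illustrative — this is not Spark's API, whose
final shape is exactly what the SPIP is deciding):

```python
import threading


def run_barrier_stage(num_tasks):
    """Run num_tasks 'tasks' that all rendezvous at a barrier before finishing."""
    barrier = threading.Barrier(num_tasks)
    results, lock = [], threading.Lock()

    def task(task_id):
        # ...local setup would happen here (e.g. launching a DL worker)...
        barrier.wait()  # no task proceeds until every task has arrived
        with lock:
            results.append(task_id)

    threads = [threading.Thread(target=task, args=(i,)) for i in range(num_tasks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

The distributed version adds the hard parts — gang scheduling the tasks
together and failing/retrying the whole stage as a unit — which is what the
new execution mode is for.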

The vote will be up for the next 72 hours. Please reply with your vote:

+1: Yeah, let's go forward and implement the SPIP.
+0: Don't really care.
-1: I don't think this is a good idea because of the following technical
reasons.

Best,
Xiangrui
-- 

Xiangrui Meng

Software Engineer

Databricks Inc.


Re: Revisiting Online serving of Spark models?

2018-06-01 Thread Saikat Kanjilal
@Chris This sounds fantastic, please send summary notes for Seattle folks

@Felix I work in downtown Seattle; I'm wondering if we should host a tech
meetup around model serving in Spark at my workplace or somewhere close by.
Thoughts? I'm actually in the midst of building microservices to manage
models, and when I say models I mean much more than machine learning models
(think OR and process models as well).

Regards

Sent from my iPhone

On May 31, 2018, at 10:32 PM, Chris Fregly <ch...@fregly.com> wrote:

Hey everyone!

@Felix:  thanks for putting this together.  i sent some of you a quick calendar 
event - mostly for me, so i don’t forget!  :)

Coincidentally, this is the focus of the June 6th Advanced Spark and
TensorFlow Meetup @ 5:30pm (same night) here in SF!

Everybody is welcome to come.  Here’s the link to the meetup that includes the 
signup link:  
https://www.meetup.com/Advanced-Spark-and-TensorFlow-Meetup/events/250924195/

We have an awesome lineup of speakers covering a lot of deep technical ground.

For those who can’t attend in person, we’ll be broadcasting live - and posting 
the recording afterward.

All details are in the meetup link above…

@holden/felix/nick/joseph/maximiliano/saikat/leif:  you’re more than welcome to 
give a talk. I can move things around to make room.

@joseph:  I’d personally like an update on the direction of the Databricks 
proprietary ML Serving export format which is similar to PMML but not a 
standard in any way.

Also, the Databricks ML Serving Runtime is only available to Databricks 
customers.  This seems in conflict with the community efforts described here.  
Can you comment on behalf of Databricks?

Look forward to your response, joseph.

See you all soon!

—

Chris Fregly
Founder @ PipelineAI (100,000 Users)
Organizer @ Advanced Spark and TensorFlow Meetup (85,000 Global Members)

San Francisco - Chicago - Austin -
Washington DC - London - Dusseldorf

Try our PipelineAI Community Edition with GPUs and TPUs!!


On May 30, 2018, at 9:32 AM, Felix Cheung <felixcheun...@hotmail.com> wrote:

Hi!

Thank you! Let’s meet then

June 6 4pm

Moscone West Convention Center
800 Howard Street, San Francisco, CA 94103

Ground floor (outside of conference area - should be available for all) - we 
will meet and decide where to go

(Would not send invite because that would be too much noise for dev@)

To paraphrase Joseph, we will use this to kick off the discussion and post
notes after and follow up online. As for Seattle, I would be very interested
to meet in person later and discuss ;)


_
From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Tuesday, May 29, 2018 11:46 AM
Subject: Re: Revisiting Online serving of Spark models?
To: Maximiliano Felice <maximilianofel...@gmail.com>
Cc: Felix Cheung <felixcheun...@hotmail.com>, Holden Karau
<hol...@pigscanfly.ca>, Joseph Bradley <jos...@databricks.com>, Leif Walsh
<leif.wa...@gmail.com>, dev <dev@spark.apache.org>


Would love to join but am in Seattle, thoughts on how to make this work?

Regards

Sent from my iPhone

On May 29, 2018, at 10:35 AM, Maximiliano Felice
<maximilianofel...@gmail.com> wrote:

Big +1 to a meeting with fresh air.

Could anyone send the invites? I don't really know the place Holden is
talking about.

2018-05-29 14:27 GMT-03:00 Felix Cheung <felixcheun...@hotmail.com>:
You had me at blue bottle!

_
From: Holden Karau <hol...@pigscanfly.ca>
Sent: Tuesday, May 29, 2018 9:47 AM
Subject: Re: Revisiting Online serving of Spark models?
To: Felix Cheung <felixcheun...@hotmail.com>
Cc: Saikat Kanjilal <sxk1...@hotmail.com>, Maximiliano Felice
<maximilianofel...@gmail.com>, Joseph Bradley <jos...@databricks.com>,
Leif Walsh <leif.wa...@gmail.com>, dev <dev@spark.apache.org>



I'm down for that; we could all go for a walk, maybe to the Mint Plaza Blue
Bottle, and grab coffee (if the weather holds, have our design meeting
outside :p)?

On Tue, May 29, 2018 at 9:37 AM, Felix Cheung
<felixcheun...@hotmail.com> wrote:
Bump.


From: Felix Cheung <felixcheun...@hotmail.com>
Sent: Saturday, May 26, 2018 1:05:29 PM
To: Saikat Kanjilal; Maximiliano Felice; Joseph Bradley
Cc: Leif Walsh; Holden Karau; dev

Subject: Re: Revisiting Online serving of Spark models?

Hi! How about we meet with the community and discuss on June 6 at 4pm at
(near) the Summit?

(I propose we meet at the venue entrance so we can accommodate people who
might not be in the conference.)


From: Saikat Kanjilal <sxk1...@hotmail.com>
Sent: Tuesday, May 22, 2018 7:47:07 AM
To: