Re: Signing releases with pwendell or release manager's key?

2017-09-19 Thread Holden Karau
Another option is that I can just run the build locally. That might be the
better approach, since it will help make sure we have the dependencies
documented for the eventual transition to dockerized builds.
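
For reference, the manual signing could look roughly like the following (a
minimal sketch, assuming GPG is already set up with the RM's key; the key ID
and artifact names are illustrative, not the actual release tooling):

# Sign each artifact with the release manager's own key (a detached,
# ASCII-armored .asc signature, per Apache release policy).
gpg --local-user "$RM_KEY_ID" --armor --detach-sign spark-2.1.2-bin.tgz

# Generate a checksum to publish next to the artifact and in the VOTE thread.
sha512sum spark-2.1.2-bin.tgz > spark-2.1.2-bin.tgz.sha512

If the release scripts later take the RM's key as a parameter, the same
--local-user selection could simply be passed through to gpg.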

On Tue, Sep 19, 2017 at 9:53 AM, Holden Karau  wrote:

> Thanks for the reminder :)
>
> On Tue, Sep 19, 2017 at 9:02 AM Luciano Resende 
> wrote:
>
>> Manually signing seems like a good compromise for now, but note that this
>> needs to happen in two places: the artifacts that go to dist.a.o as well
>> as the ones that are published to Maven.
>>
>> On Tue, Sep 19, 2017 at 8:53 AM, Ryan Blue 
>> wrote:
>>
>>> +1. Thanks for coming up with a solution, everyone! I think the manually
>>> signed RC will work well as a workaround, and updating the rest will be
>>> an improvement.
>>>
>>> On Mon, Sep 18, 2017 at 8:25 PM, Patrick Wendell wrote:
>>>
 Sounds good - thanks Holden!

 On Mon, Sep 18, 2017 at 8:21 PM, Holden Karau 
 wrote:

> That sounds like a pretty good temporary workaround. If folks agree, I'll
> cancel the release vote for 2.1.2 and work on getting an RC2 out later
> this week, manually signed. I've filed JIRA SPARK-22055 & SPARK-22054 to
> port the release scripts and allow injecting the RM's key.
>
> On Mon, Sep 18, 2017 at 8:11 PM, Patrick Wendell <patr...@databricks.com>
> wrote:
>
>> For the current release - maybe Holden could just sign the artifacts
>> with her own key manually, if this is a concern. I don't think that would
>> require modifying the release pipeline, except to just remove/ignore the
>> existing signatures.
>>
>> - Patrick
>>
>> On Mon, Sep 18, 2017 at 7:56 PM, Reynold Xin 
>> wrote:
>>
>>> Does anybody know whether this is a hard blocker? If it is not, we
>>> should probably push 2.1.2 forward quickly and do the infrastructure
>>> improvement in parallel.
>>>
>>> On Mon, Sep 18, 2017 at 7:49 PM, Holden Karau 
>>> wrote:
>>>
 I'm more than willing to help migrate the scripts as part of either
 this release or the next.

 It sounds like there is a consensus developing around changing the
 process -- should we hold off on the 2.1.2 release or roll this into the
 next one?

 On Mon, Sep 18, 2017 at 7:37 PM, Marcelo Vanzin <van...@cloudera.com>
 wrote:

> +1 to this. There should be a script in the Spark repo that has all
> the logic needed for a release. That script should take the RM's key
> as a parameter.
>
> If there's a desire to keep the current Jenkins job to create the
> release, it should be based on that script. But from what I'm seeing
> there are currently too many unknowns in the release process.
>
> On Mon, Sep 18, 2017 at 4:55 PM, Ryan Blue
>  wrote:
> > I don't understand why it is necessary to share a release key. If this is
> > something that can be automated in a Jenkins job, then can it be a script
> > with a reasonable set of build requirements for Mac and Ubuntu? That's the
> > approach I've seen the most in other projects.
> >
> > I'm also not just concerned about release managers. Having a key stored
> > persistently on outside infrastructure adds the most risk, as Luciano noted
> > as well. We should also start publishing checksums in the Spark VOTE thread,
> > which are currently missing. The risk I'm concerned about is that if the key
> > were compromised, it would be possible to replace binaries with perfectly
> > valid ones, at least on some mirrors. If the Apache copy were replaced, then
> > we wouldn't even be able to catch that it had happened. Given the high
> > profile of Spark and the number of companies that run it, I think we need to
> > take extra care to make sure that can't happen, even if it is an annoyance
> > for the release managers.
>
> --
> Marcelo
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>


 --
 Twitter: https://twitter.com/holdenkarau

>>>
>>>
>>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>


>>>
>>>
>>> --
>>> Ryan Blue
>>> Software Engineer
>>> Netflix
>>>
>>
>>
>>
>> --
>> Luciano 

Re: A little Scala 2.12 help

2017-09-19 Thread Jacek Laskowski
Hi,

Nice catch, Sean! Learnt this today. They did say you could learn a lot
with Spark! :)

Pozdrawiam,
Jacek Laskowski

https://about.me/JacekLaskowski
Spark Structured Streaming (Apache Spark 2.2+)
https://bit.ly/spark-structured-streaming
Mastering Apache Spark 2 https://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

On Tue, Sep 19, 2017 at 4:23 PM, Sean Owen  wrote:

> I figured this out. It's another effect of a new behavior in
> 2.12: Eta-expansion of zero-argument method values is deprecated
> Imagine:
>
> def f(): String = "foo"
> def g(fn: () => String) = ???
>
> g(f) works in 2.11 without warning. It generates a warning in 2.12,
> because it wants you to explicitly make a function from the method
> reference: g(() => f). It may become an error in 2.13.
>
> But, this affects implicit resolution. Some of the implicits that power
> SparkContext.sequenceFile() need to change to be vals of type () =>
> WritableConverter[T], not methods that return WritableConverter[T].
>
> I'm working through this and other deprecated items in 2.12 and preparing
> more 2.11-compatible changes that allow these to work cleanly in 2.12.
>
> On Fri, Sep 15, 2017 at 11:21 AM Sean Owen  wrote:
>
>> I'm working on updating to Scala 2.12, and have hit a compile error in
>> Scala 2.12 that I'm struggling to design a fix for (one that doesn't modify the
>> API significantly). If you "./dev/change-scala-version.sh 2.12" and
>> compile, you'll see errors like...
>>
>> [error] /Users/srowen/Documents/Cloudera/spark/core/src/test/scala/org/apache/spark/FileSuite.scala:100:
>> could not find implicit value for parameter kcf: () =>
>> org.apache.spark.WritableConverter[org.apache.hadoop.io.IntWritable]
>> [error] Error occurred in an application involving default arguments.
>> [error] val output = sc.sequenceFile[IntWritable, Text](outputDir)
>>
>> Clearly implicit resolution changed a little bit in 2.12 somehow. I
>> actually don't recall seeing this error before, so it might be somehow related
>> to 2.12.3, but not sure.
>>
>> As you can see the implicits that have always existed and been imported
>> and should apply here don't seem to be found.
>>
>> If anyone is a Scala expert and could glance at this, you might help save
>> me a lot of puzzling.
>>
>


Re: Signing releases with pwendell or release manager's key?

2017-09-19 Thread Holden Karau
Thanks for the reminder :)

On Tue, Sep 19, 2017 at 9:02 AM Luciano Resende 
wrote:

> Manually signing seems like a good compromise for now, but note that this
> needs to happen in two places: the artifacts that go to dist.a.o as well
> as the ones that are published to Maven.
>
> On Tue, Sep 19, 2017 at 8:53 AM, Ryan Blue 
> wrote:
>
>> +1. Thanks for coming up with a solution, everyone! I think the manually
>> signed RC will work well as a workaround, and updating the rest will be
>> an improvement.
>>
>> On Mon, Sep 18, 2017 at 8:25 PM, Patrick Wendell 
>> wrote:
>>
>>> Sounds good - thanks Holden!
>>>
>>> On Mon, Sep 18, 2017 at 8:21 PM, Holden Karau 
>>> wrote:
>>>
 That sounds like a pretty good temporary workaround. If folks agree, I'll
 cancel the release vote for 2.1.2 and work on getting an RC2 out later
 this week, manually signed. I've filed JIRA SPARK-22055 & SPARK-22054 to
 port the release scripts and allow injecting the RM's key.

 On Mon, Sep 18, 2017 at 8:11 PM, Patrick Wendell <patr...@databricks.com>
 wrote:

> For the current release - maybe Holden could just sign the artifacts
> with her own key manually, if this is a concern. I don't think that would
> require modifying the release pipeline, except to just remove/ignore the
> existing signatures.
>
> - Patrick
>
> On Mon, Sep 18, 2017 at 7:56 PM, Reynold Xin 
> wrote:
>
>> Does anybody know whether this is a hard blocker? If it is not, we
>> should probably push 2.1.2 forward quickly and do the infrastructure
>> improvement in parallel.
>>
>> On Mon, Sep 18, 2017 at 7:49 PM, Holden Karau 
>> wrote:
>>
>>> I'm more than willing to help migrate the scripts as part of either
>>> this release or the next.
>>>
>>> It sounds like there is a consensus developing around changing the
>>> process -- should we hold off on the 2.1.2 release or roll this into the
>>> next one?
>>>
>>> On Mon, Sep 18, 2017 at 7:37 PM, Marcelo Vanzin wrote:
>>>
 +1 to this. There should be a script in the Spark repo that has all
 the logic needed for a release. That script should take the RM's key
 as a parameter.

 If there's a desire to keep the current Jenkins job to create the
 release, it should be based on that script. But from what I'm seeing
 there are currently too many unknowns in the release process.

 On Mon, Sep 18, 2017 at 4:55 PM, Ryan Blue
  wrote:
 > I don't understand why it is necessary to share a release key. If this is
 > something that can be automated in a Jenkins job, then can it be a script
 > with a reasonable set of build requirements for Mac and Ubuntu? That's the
 > approach I've seen the most in other projects.
 >
 > I'm also not just concerned about release managers. Having a key stored
 > persistently on outside infrastructure adds the most risk, as Luciano noted
 > as well. We should also start publishing checksums in the Spark VOTE thread,
 > which are currently missing. The risk I'm concerned about is that if the key
 > were compromised, it would be possible to replace binaries with perfectly
 > valid ones, at least on some mirrors. If the Apache copy were replaced, then
 > we wouldn't even be able to catch that it had happened. Given the high
 > profile of Spark and the number of companies that run it, I think we need to
 > take extra care to make sure that can't happen, even if it is an annoyance
 > for the release managers.

 --
 Marcelo


 ---------------------------------------------------------------------
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>>>
>>>
>>> --
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>


 --
 Cell : 425-233-8271
 Twitter: https://twitter.com/holdenkarau

>>>
>>>
>>
>>
>> --
>> Ryan Blue
>> Software Engineer
>> Netflix
>>
>
>
>
> --
> Luciano Resende
> http://twitter.com/lresende1975
> http://lresende.blogspot.com/
>
-- 
Twitter: https://twitter.com/holdenkarau


Re: Signing releases with pwendell or release manager's key?

2017-09-19 Thread Luciano Resende
Manually signing seems like a good compromise for now, but note that this
needs to happen in two places: the artifacts that go to dist.a.o as well as
the ones that are published to Maven.
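
For the Maven side, one possible way to select the RM's key at publish time
is the gpg plugin's key property. A sketch only, since the exact invocation
depends on how the publish step is wired up:

# Illustrative: maven-gpg-plugin signs the deployed artifacts with the key
# named by gpg.keyname instead of the default secret key.
mvn clean deploy -DskipTests -Dgpg.keyname="$RM_KEY_ID"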

On Tue, Sep 19, 2017 at 8:53 AM, Ryan Blue 
wrote:

> +1. Thanks for coming up with a solution, everyone! I think the manually
> signed RC will work well as a workaround, and updating the rest will be
> an improvement.
>
> On Mon, Sep 18, 2017 at 8:25 PM, Patrick Wendell 
> wrote:
>
>> Sounds good - thanks Holden!
>>
>> On Mon, Sep 18, 2017 at 8:21 PM, Holden Karau 
>> wrote:
>>
>>> That sounds like a pretty good temporary workaround. If folks agree, I'll
>>> cancel the release vote for 2.1.2 and work on getting an RC2 out later this
>>> week, manually signed. I've filed JIRA SPARK-22055 & SPARK-22054 to port the
>>> release scripts and allow injecting the RM's key.
>>>
>>> On Mon, Sep 18, 2017 at 8:11 PM, Patrick Wendell wrote:
>>>
 For the current release - maybe Holden could just sign the artifacts
 with her own key manually, if this is a concern. I don't think that would
 require modifying the release pipeline, except to just remove/ignore the
 existing signatures.

 - Patrick

 On Mon, Sep 18, 2017 at 7:56 PM, Reynold Xin 
 wrote:

> Does anybody know whether this is a hard blocker? If it is not, we
> should probably push 2.1.2 forward quickly and do the infrastructure
> improvement in parallel.
>
> On Mon, Sep 18, 2017 at 7:49 PM, Holden Karau 
> wrote:
>
>> I'm more than willing to help migrate the scripts as part of either
>> this release or the next.
>>
>> It sounds like there is a consensus developing around changing the
>> process -- should we hold off on the 2.1.2 release or roll this into the
>> next one?
>>
>> On Mon, Sep 18, 2017 at 7:37 PM, Marcelo Vanzin 
>> wrote:
>>
>>> +1 to this. There should be a script in the Spark repo that has all
>>> the logic needed for a release. That script should take the RM's key
>>> as a parameter.
>>>
>>> If there's a desire to keep the current Jenkins job to create the
>>> release, it should be based on that script. But from what I'm seeing
>>> there are currently too many unknowns in the release process.
>>>
>>> On Mon, Sep 18, 2017 at 4:55 PM, Ryan Blue 
>>> wrote:
>>> > I don't understand why it is necessary to share a release key. If this is
>>> > something that can be automated in a Jenkins job, then can it be a script
>>> > with a reasonable set of build requirements for Mac and Ubuntu? That's the
>>> > approach I've seen the most in other projects.
>>> >
>>> > I'm also not just concerned about release managers. Having a key stored
>>> > persistently on outside infrastructure adds the most risk, as Luciano noted
>>> > as well. We should also start publishing checksums in the Spark VOTE thread,
>>> > which are currently missing. The risk I'm concerned about is that if the key
>>> > were compromised, it would be possible to replace binaries with perfectly
>>> > valid ones, at least on some mirrors. If the Apache copy were replaced, then
>>> > we wouldn't even be able to catch that it had happened. Given the high
>>> > profile of Spark and the number of companies that run it, I think we need to
>>> > take extra care to make sure that can't happen, even if it is an annoyance
>>> > for the release managers.
>>>
>>> --
>>> Marcelo
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>>
>>>
>>
>>
>> --
>> Twitter: https://twitter.com/holdenkarau
>>
>
>

>>>
>>>
>>> --
>>> Cell : 425-233-8271
>>> Twitter: https://twitter.com/holdenkarau
>>>
>>
>>
>
>
> --
> Ryan Blue
> Software Engineer
> Netflix
>



-- 
Luciano Resende
http://twitter.com/lresende1975
http://lresende.blogspot.com/


Re: Signing releases with pwendell or release manager's key?

2017-09-19 Thread Ryan Blue
+1. Thanks for coming up with a solution, everyone! I think the manually
signed RC will work well as a workaround, and updating the rest will be an
improvement.
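
On the checksums mentioned in my earlier mail (quoted below): once they are
published with the RC, anyone verifying a download would run something like
this (file names illustrative):

gpg --verify spark-2.1.2-bin.tgz.asc spark-2.1.2-bin.tgz
sha512sum -c spark-2.1.2-bin.tgz.sha512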

On Mon, Sep 18, 2017 at 8:25 PM, Patrick Wendell 
wrote:

> Sounds good - thanks Holden!
>
> On Mon, Sep 18, 2017 at 8:21 PM, Holden Karau 
> wrote:
>
>> That sounds like a pretty good temporary workaround. If folks agree, I'll
>> cancel the release vote for 2.1.2 and work on getting an RC2 out later this
>> week, manually signed. I've filed JIRA SPARK-22055 & SPARK-22054 to port the
>> release scripts and allow injecting the RM's key.
>>
>> On Mon, Sep 18, 2017 at 8:11 PM, Patrick Wendell 
>> wrote:
>>
>>> For the current release - maybe Holden could just sign the artifacts
>>> with her own key manually, if this is a concern. I don't think that would
>>> require modifying the release pipeline, except to just remove/ignore the
>>> existing signatures.
>>>
>>> - Patrick
>>>
>>> On Mon, Sep 18, 2017 at 7:56 PM, Reynold Xin 
>>> wrote:
>>>
 Does anybody know whether this is a hard blocker? If it is not, we
 should probably push 2.1.2 forward quickly and do the infrastructure
 improvement in parallel.

 On Mon, Sep 18, 2017 at 7:49 PM, Holden Karau 
 wrote:

> I'm more than willing to help migrate the scripts as part of either
> this release or the next.
>
> It sounds like there is a consensus developing around changing the
> process -- should we hold off on the 2.1.2 release or roll this into the
> next one?
>
> On Mon, Sep 18, 2017 at 7:37 PM, Marcelo Vanzin 
> wrote:
>
>> +1 to this. There should be a script in the Spark repo that has all
>> the logic needed for a release. That script should take the RM's key
>> as a parameter.
>>
>> If there's a desire to keep the current Jenkins job to create the
>> release, it should be based on that script. But from what I'm seeing
>> there are currently too many unknowns in the release process.
>>
>> On Mon, Sep 18, 2017 at 4:55 PM, Ryan Blue 
>> wrote:
>> > I don't understand why it is necessary to share a release key. If this is
>> > something that can be automated in a Jenkins job, then can it be a script
>> > with a reasonable set of build requirements for Mac and Ubuntu? That's the
>> > approach I've seen the most in other projects.
>> >
>> > I'm also not just concerned about release managers. Having a key stored
>> > persistently on outside infrastructure adds the most risk, as Luciano noted
>> > as well. We should also start publishing checksums in the Spark VOTE thread,
>> > which are currently missing. The risk I'm concerned about is that if the key
>> > were compromised, it would be possible to replace binaries with perfectly
>> > valid ones, at least on some mirrors. If the Apache copy were replaced, then
>> > we wouldn't even be able to catch that it had happened. Given the high
>> > profile of Spark and the number of companies that run it, I think we need to
>> > take extra care to make sure that can't happen, even if it is an annoyance
>> > for the release managers.
>>
>> --
>> Marcelo
>>
>> ---------------------------------------------------------------------
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
>


>>>
>>
>>
>> --
>> Cell : 425-233-8271
>> Twitter: https://twitter.com/holdenkarau
>>
>
>


-- 
Ryan Blue
Software Engineer
Netflix


Re: A little Scala 2.12 help

2017-09-19 Thread Sean Owen
I figured this out. It's another effect of a new behavior in
2.12: Eta-expansion of zero-argument method values is deprecated
Imagine:

def f(): String = "foo"
def g(fn: () => String) = ???

g(f) works in 2.11 without warning. It generates a warning in 2.12, because
it wants you to explicitly make a function from the method reference: g(()
=> f). It may become an error in 2.13.

But, this affects implicit resolution. Some of the implicits that power
SparkContext.sequenceFile() need to change to be vals of type () =>
WritableConverter[T], not methods that return WritableConverter[T].

I'm working through this and other deprecated items in 2.12 and preparing
more 2.11-compatible changes that allow these to work cleanly in 2.12.
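
Reduced to a self-contained example, the shape of the change looks like this
(names simplified; this WritableConverter is a stand-in for the real Spark
type, not the actual patch):

// Stand-in for org.apache.spark.WritableConverter.
class WritableConverter[T]

// 2.11-style: a zero-arg implicit method. 2.11 eta-expands it to satisfy an
// implicit parameter of type () => WritableConverter[Int]; 2.12 deprecates
// that expansion, so implicit search no longer supplies it.
// implicit def intWritableConverter(): WritableConverter[Int] =
//   new WritableConverter[Int]

// 2.12-friendly: make the implicit itself a function value.
implicit val intWritableConverter: () => WritableConverter[Int] =
  () => new WritableConverter[Int]

// Mirrors the implicit parameter on SparkContext.sequenceFile().
def sequenceFileLike[T]()(implicit kcf: () => WritableConverter[T]) = kcf()

sequenceFileLike[Int]()   // resolves in both 2.11 and 2.12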

On Fri, Sep 15, 2017 at 11:21 AM Sean Owen  wrote:

> I'm working on updating to Scala 2.12, and have hit a compile error in
> Scala 2.12 that I'm struggling to design a fix for (one that doesn't modify the
> API significantly). If you "./dev/change-scala-version.sh 2.12" and
> compile, you'll see errors like...
>
> [error]
> /Users/srowen/Documents/Cloudera/spark/core/src/test/scala/org/apache/spark/FileSuite.scala:100:
> could not find implicit value for parameter kcf: () =>
> org.apache.spark.WritableConverter[org.apache.hadoop.io.IntWritable]
> [error] Error occurred in an application involving default arguments.
> [error] val output = sc.sequenceFile[IntWritable, Text](outputDir)
>
> Clearly implicit resolution changed a little bit in 2.12 somehow. I
> actually don't recall seeing this error before, so it might be somehow related
> to 2.12.3, but not sure.
>
> As you can see the implicits that have always existed and been imported
> and should apply here don't seem to be found.
>
> If anyone is a Scala expert and could glance at this, you might help save
> me a lot of puzzling.
>


Re: [Spark Core] Custom Catalog. Integration between Apache Ignite and Apache Spark

2017-09-19 Thread Nikolay Izhikov

Guys,

Has anyone had a chance to look at my message?

On 15.09.2017 15:50, Nikolay Izhikov wrote:

Hello, guys.

I'm a contributor to the Apache Ignite project, which is self-described as
an in-memory computing platform.

It has Data Grid features: a distributed, transactional key-value store [1],
distributed SQL support [2], etc. [3]

Currently, I'm working on integration between Ignite and Spark [4].
I want to add support for the Spark Data Frame API to Ignite.

Since Ignite is a distributed store, it would be useful to create an
implementation of Catalog [5] for Apache Ignite.

I see two ways to implement this feature:

     1. Spark can provide an API for any custom catalog implementation. As
far as I can see, there is a ticket for it [6]. It was closed with
resolution “Later”. Is it a suitable time to continue working on the
ticket? How can I help with it?

     2. I can provide an implementation of Catalog and other required API
in the form of a pull request to Spark, as was done for Hive [7]. Would
such a pull request be acceptable? (A rough skeleton follows after the
links below.)

Which way is more convenient for the Spark community?

[1] https://ignite.apache.org/features/datagrid.html
[2] https://ignite.apache.org/features/sql.html
[3] https://ignite.apache.org/features.html
[4] https://issues.apache.org/jira/browse/IGNITE-3084
[5] https://github.com/apache/spark/blob/master/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/ExternalCatalog.scala
[6] https://issues.apache.org/jira/browse/SPARK-17767
[7] https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
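
For option 2, a skeleton of the Ignite side could look roughly like this
(purely a sketch against the ExternalCatalog class linked in [5]; the class
name and Ignite-specific behavior are hypothetical, and a real
implementation would need to override many more members):

import org.apache.spark.sql.catalyst.catalog.{CatalogDatabase, ExternalCatalog}

// Hypothetical skeleton that routes catalog lookups to Ignite's own
// metadata instead of the Hive metastore or the in-memory catalog.
// Left abstract because ExternalCatalog declares many more operations.
abstract class IgniteExternalCatalog extends ExternalCatalog {

  override def databaseExists(db: String): Boolean = {
    // Would ask Ignite whether a schema/cache with this name exists.
    ???
  }

  override def getDatabase(db: String): CatalogDatabase = {
    // Would translate Ignite schema metadata into a CatalogDatabase.
    ???
  }

  override def listDatabases(): Seq[String] = {
    // Would enumerate the schemas Ignite knows about.
    ???
  }
}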



---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



About 2.2.1 release

2017-09-19 Thread sujith71955
Hi Folks,

Just wanted to ask about the Spark 2.2.1 release. Please let me know the
expected release date for this version.


Thanks,
Sujith



--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

---------------------------------------------------------------------
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org