Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-25 Thread Yi Wu
Hi Gengliang,

I found another ticket:
SPARK-36509: Executors don't get rescheduled in standalone mode when the
worker dies

And it already has the fix: https://github.com/apache/spark/pull/33818

Bests,
Yi

On Wed, Aug 25, 2021 at 9:49 PM Gengliang Wang  wrote:

> Hi all,
>
> So, RC1 failed.
> After RC1 cut, we have merged the following bug fixes to branch-3.2:
>
>    - Updates AuthEngine to pass the correct SecretKeySpec format
>    - Fix NullPointerException in LiveRDDDistribution.toAPI
>    - Revert "[SPARK-34415][ML] Randomization in hyperparameter optimization"
>    - Redact sensitive information in Spark Thrift Server
>
> I will cut RC2 after the following issues are resolved:
>
>    - Add back transformAllExpressions to AnalysisHelper (SPARK-36581)
>    - Review and fix issues in API docs (SPARK-36457)
>    - Support setting "since" version in FunctionRegistry (SPARK-36585)
>    - pushDownPredicate=false fails to prevent pushing filters down to the
>    JDBC data source (SPARK-36574)
>
> Please let me know if you know of any other new bugs/blockers for the
> 3.2.0 release.
>
> Thanks,
> Gengliang
>
> On Wed, Aug 25, 2021 at 2:50 AM Sean Owen  wrote:
>
>> I think we'll need this revert:
>> https://github.com/apache/spark/pull/33819
>>
>> Between that and a few other minor but important issues I think I'd say
>> -1 myself and ask for another RC.
>>
>> On Tue, Aug 24, 2021 at 1:01 PM Jacek Laskowski  wrote:
>>
>>> Hi Yi Wu,
>>>
>>> Looks like the issue has been resolved as Won't Fix. How about your -1?
>>>
>>> Pozdrawiam,
>>> Jacek Laskowski
>>> 
>>> https://about.me/JacekLaskowski
>>> "The Internals Of" Online Books 
>>> Follow me on https://twitter.com/jaceklaskowski
>>>
>>> 
>>>
>>>
>>> On Mon, Aug 23, 2021 at 4:58 AM Yi Wu  wrote:
>>>
 -1. I found a bug (https://issues.apache.org/jira/browse/SPARK-36558)
 in the push-based shuffle, which could lead to a job hang.

 Bests,
 Yi

 On Sat, Aug 21, 2021 at 1:05 AM Gengliang Wang 
 wrote:

> Please vote on releasing the following candidate as Apache Spark
>  version 3.2.0.
>
> The vote is open until 11:59pm Pacific time Aug 25 and passes if a
> majority +1 PMC votes are cast, with a minimum of 3 +1 votes.
>
> [ ] +1 Release this package as Apache Spark 3.2.0
> [ ] -1 Do not release this package because ...
>
> To learn more about Apache Spark, please see http://spark.apache.org/
>
> The tag to be voted on is v3.2.0-rc1 (commit
> 6bb3523d8e838bd2082fb90d7f3741339245c044):
> https://github.com/apache/spark/tree/v3.2.0-rc1
>
> The release files, including signatures, digests, etc. can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-bin/
>
> Signatures used for Spark RCs can be found in this file:
> https://dist.apache.org/repos/dist/dev/spark/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapachespark-1388
>
> The documentation corresponding to this release can be found at:
> https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-docs/
>
> The list of bug fixes going into 3.2.0 can be found at the following
> URL:
> https://issues.apache.org/jira/projects/SPARK/versions/12349407
>
> This release uses the release script from the v3.2.0-rc1 tag.
>
>
> FAQ
>
> =
> How can I help test this release?
> =
> If you are a Spark user, you can help us test this release by taking an
> existing Spark workload, running it on this release candidate, and
> reporting any regressions.
>
> If you're working in PySpark, you can set up a virtual env, install the
> current RC, and see if anything important breaks. In Java/Scala, you can
> add the staging repository to your project's resolvers and test with the
> RC (make sure to clean up the artifact cache before/after so you don't
> end up building with an out-of-date RC going forward).
>
> 
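
For the Java/Scala testing step quoted above, a minimal sbt sketch (assuming
sbt as the build tool; the resolver name is arbitrary and spark-sql is just
an example module) would be to point an extra resolver at the staging
repository from the vote email:

    // build.sbt -- build a test workload against the 3.2.0 RC1 staging artifacts
    resolvers += "Apache Spark 3.2.0 RC1 staging" at "https://repository.apache.org/content/repositories/orgapachespark-1388"

    // add whichever Spark modules your workload actually uses
    libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.2.0"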

Re: Add option to Spark UI to proxy to the executors?

2021-08-25 Thread Ankit Gupta
Hey Holden

I can probably help you with that. We should probably pass the same configs
to the application as well, for the reverse proxy to work.

Try setting spark.ui.reverseProxy=true while running your application.
I hope this solves your issue.
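
For reference, a minimal sketch of what that would look like when the
application builds its own session (the app name and the commented-out
front-end URL below are just placeholders; passing --conf to spark-submit
works the same way):

    import org.apache.spark.sql.SparkSession

    // Ask the standalone Master to reverse proxy the worker/application UIs,
    // so they are reachable through the Master UI instead of directly.
    val spark = SparkSession.builder()
      .appName("reverse-proxy-test")                 // placeholder app name
      .config("spark.ui.reverseProxy", "true")
      // If the Master UI itself sits behind another front-end proxy:
      // .config("spark.ui.reverseProxyUrl", "https://gateway.example.com/spark")
      .getOrCreate()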

Best Regards
Ankit Prakash Gupta

On Thu, 26 Aug, 2021, 2:48 am Holden Karau,  wrote:

> So I tried turning on the Spark exec UI proxy, but it broke the Spark UI
> (in 3.1.2): regardless of what URL I requested, everything came back as the
> text/html of the jobs page. Is anyone actively using this feature in prod?
>
> On Sun, Aug 22, 2021 at 5:58 PM Holden Karau  wrote:
>
>> Oh cool. I’ll have to dig down into why that’s not working with my K8s
>> deployment then.
>>
>> On Sat, Aug 21, 2021 at 11:54 PM Gengliang Wang  wrote:
>>
>>> Hi Holden,
>>>
>>> FYI there are already some related features in Spark:
>>>
>>>    - Spark Master UI to reverse proxy Application and Workers UI
>>>    - Support Spark UI behind front-end reverse proxy using a path
>>>      prefix (reverse proxy URL)
>>>
>>> Not sure if they are helpful to you.
>>>
>>> On Sat, Aug 21, 2021 at 3:16 PM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
 Yes I can see your point.

 Will that work in a Kubernetes deployment?

 Mich


view my Linkedin profile
 



 *Disclaimer:* Use it at your own risk. Any and all responsibility for
 any loss, damage or destruction of data or any other property which may
 arise from relying on this email's technical content is explicitly
 disclaimed. The author will in no case be liable for any monetary damages
 arising from such loss, damage or destruction.




 On Sat, 21 Aug 2021 at 00:02, Holden Karau 
 wrote:

> Hi Folks,
>
> I'm wondering what people think about the idea of having the Spark UI
> (optionally) act as a proxy to the executors? This could help with exec UI
> access in some deployment environments.
>
> Cheers,
>
> Holden :)
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>
 --
>> Twitter: https://twitter.com/holdenkarau
>> Books (Learning Spark, High Performance Spark, etc.):
>> https://amzn.to/2MaRAG9  
>> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>>
>
>
> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


Re: Add option to Spark UI to proxy to the executors?

2021-08-25 Thread Holden Karau
So I tried turning on the Spark exec UI proxy, but it broke the Spark UI (in
3.1.2): regardless of what URL I requested, everything came back as the
text/html of the jobs page. Is anyone actively using this feature in prod?

On Sun, Aug 22, 2021 at 5:58 PM Holden Karau  wrote:

> Oh cool. I’ll have to dig down into why that’s not working with my K8s
> deployment then.
>
> On Sat, Aug 21, 2021 at 11:54 PM Gengliang Wang  wrote:
>
>> Hi Holden,
>>
>> FYI there are already some related features in Spark:
>>
>>    - Spark Master UI to reverse proxy Application and Workers UI
>>    - Support Spark UI behind front-end reverse proxy using a path prefix
>>      (reverse proxy URL)
>>
>> Not sure if they are helpful to you.
>>
>> On Sat, Aug 21, 2021 at 3:16 PM Mich Talebzadeh <
>> mich.talebza...@gmail.com> wrote:
>>
>>> Yes I can see your point.
>>>
>>> Will that work in a Kubernetes deployment?
>>>
>>> Mich
>>>
>>>
>>>view my Linkedin profile
>>> 
>>>
>>>
>>>
>>> *Disclaimer:* Use it at your own risk. Any and all responsibility for
>>> any loss, damage or destruction of data or any other property which may
>>> arise from relying on this email's technical content is explicitly
>>> disclaimed. The author will in no case be liable for any monetary damages
>>> arising from such loss, damage or destruction.
>>>
>>>
>>>
>>>
>>> On Sat, 21 Aug 2021 at 00:02, Holden Karau  wrote:
>>>
 Hi Folks,

 I'm wondering what people think about the idea of having the Spark UI
 (optionally) act as a proxy to the executors? This could help with exec UI
 access in some deployment environments.

 Cheers,

 Holden :)

 --
 Twitter: https://twitter.com/holdenkarau
 Books (Learning Spark, High Performance Spark, etc.):
 https://amzn.to/2MaRAG9  
 YouTube Live Streams: https://www.youtube.com/user/holdenkarau

>>> --
> Twitter: https://twitter.com/holdenkarau
> Books (Learning Spark, High Performance Spark, etc.):
> https://amzn.to/2MaRAG9  
> YouTube Live Streams: https://www.youtube.com/user/holdenkarau
>


-- 
Twitter: https://twitter.com/holdenkarau
Books (Learning Spark, High Performance Spark, etc.):
https://amzn.to/2MaRAG9  
YouTube Live Streams: https://www.youtube.com/user/holdenkarau


Re: [build system] quick jenkins restart

2021-08-25 Thread shane knapp ☠
aaand we're back!

On Wed, Aug 25, 2021 at 9:24 AM shane knapp ☠  wrote:

> i'll be:
> - upgrading jenkins to the latest LTS
> - moving jenkins to java 11 (from java 8)
> - rebooting everything
>
> sorry for the disruption...  there aren't many builds running right now so
> i'll just get this sorted.
>
> shane
> --
> Shane Knapp
> Computer Guy / Voice of Reason
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


[build system] quick jenkins restart

2021-08-25 Thread shane knapp ☠
i'll be:
- upgrading jenkins to the latest LTS
- moving jenkins to java 11 (from java 8)
- rebooting everything

sorry for the disruption...  there aren't many builds running right now so
i'll just get this sorted.

shane
-- 
Shane Knapp
Computer Guy / Voice of Reason
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: [VOTE] Release Spark 3.2.0 (RC1)

2021-08-25 Thread Gengliang Wang
Hi all,

So, RC1 failed.
After RC1 cut, we have merged the following bug fixes to branch-3.2:

   - Updates AuthEngine to pass the correct SecretKeySpec format
   - Fix NullPointerException in LiveRDDDistribution.toAPI
   - Revert "[SPARK-34415][ML] Randomization in hyperparameter optimization"
   - Redact sensitive information in Spark Thrift Server


I will cut RC2 after the following issues are resolved:

   - Add back transformAllExpressions to AnalysisHelper (SPARK-36581)
   - Review and fix issues in API docs (SPARK-36457)
   - Support setting "since" version in FunctionRegistry (SPARK-36585)
   - pushDownPredicate=false fails to prevent pushing filters down to the
   JDBC data source (SPARK-36574; see the sketch below)
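
For the last item, a sketch of the kind of read SPARK-36574 is about (the
connection details and table name are hypothetical, and spark is assumed to
be an existing SparkSession):

    // With pushDownPredicate=false, the filter below should be applied by Spark
    // rather than pushed into the database; per SPARK-36574, the option currently
    // fails to prevent the pushdown.
    val df = spark.read
      .format("jdbc")
      .option("url", "jdbc:postgresql://db.example.com:5432/testdb")
      .option("dbtable", "events")
      .option("pushDownPredicate", "false")
      .load()
      .filter("event_date > '2021-08-01'")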

Please let me know if you know of any other new bugs/blockers for the 3.2.0
release.

Thanks,
Gengliang

On Wed, Aug 25, 2021 at 2:50 AM Sean Owen  wrote:

> I think we'll need this revert:
> https://github.com/apache/spark/pull/33819
>
> Between that and a few other minor but important issues I think I'd say -1
> myself and ask for another RC.
>
> On Tue, Aug 24, 2021 at 1:01 PM Jacek Laskowski  wrote:
>
>> Hi Yi Wu,
>>
>> Looks like the issue has been resolved as Won't Fix. How about your -1?
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://about.me/JacekLaskowski
>> "The Internals Of" Online Books 
>> Follow me on https://twitter.com/jaceklaskowski
>>
>> 
>>
>>
>> On Mon, Aug 23, 2021 at 4:58 AM Yi Wu  wrote:
>>
>>> -1. I found a bug (https://issues.apache.org/jira/browse/SPARK-36558)
>>> in the push-based shuffle, which could lead to a job hang.
>>>
>>> Bests,
>>> Yi
>>>
>>> On Sat, Aug 21, 2021 at 1:05 AM Gengliang Wang  wrote:
>>>
 Please vote on releasing the following candidate as Apache Spark
  version 3.2.0.

 The vote is open until 11:59pm Pacific time Aug 25 and passes if a
 majority +1 PMC votes are cast, with a minimum of 3 +1 votes.

 [ ] +1 Release this package as Apache Spark 3.2.0
 [ ] -1 Do not release this package because ...

 To learn more about Apache Spark, please see http://spark.apache.org/

 The tag to be voted on is v3.2.0-rc1 (commit
 6bb3523d8e838bd2082fb90d7f3741339245c044):
 https://github.com/apache/spark/tree/v3.2.0-rc1

 The release files, including signatures, digests, etc. can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-bin/

 Signatures used for Spark RCs can be found in this file:
 https://dist.apache.org/repos/dist/dev/spark/KEYS

 The staging repository for this release can be found at:
 https://repository.apache.org/content/repositories/orgapachespark-1388

 The documentation corresponding to this release can be found at:
 https://dist.apache.org/repos/dist/dev/spark/v3.2.0-rc1-docs/

 The list of bug fixes going into 3.2.0 can be found at the following
 URL:
 https://issues.apache.org/jira/projects/SPARK/versions/12349407

 This release uses the release script from the v3.2.0-rc1 tag.


 FAQ

 =
 How can I help test this release?
 =
 If you are a Spark user, you can help us test this release by taking an
 existing Spark workload, running it on this release candidate, and
 reporting any regressions.

 If you're working in PySpark, you can set up a virtual env, install the
 current RC, and see if anything important breaks. In Java/Scala, you can
 add the staging repository to your project's resolvers and test with the
 RC (make sure to clean up the artifact cache before/after so you don't
 end up building with an out-of-date RC going forward).

 ===
 What should happen to JIRA tickets still targeting 3.2.0?
 ===
 The current list of open tickets targeted at 3.2.0 can be found at:
 https://issues.apache.org/jira/projects/SPARK and search for "Target
 Version/s" = 3.2.0

 Committers should look at those and triage. Extremely important bug
 fixes, documentation, and API tweaks that impact compatibility should
 be worked on