Re: [build system] rebooting firewall, access to jenkins will return shortly

2018-01-26 Thread shane knapp
and we're back!

On Fri, Jan 26, 2018 at 2:32 PM, shane knapp  wrote:

> our firewall was running a bit...  slowly...  and needed a reboot.  this
> means access to jenkins will be gone for ~10 mins.
>
> i'll send out an all-clear when we're back up and running.
>


[build system] rebooting firewall, access to jenkins will return shortly

2018-01-26 Thread shane knapp
our firewall was running a bit...  slowly...  and needed a reboot.  this
means access to jenkins will be gone for ~10 mins.

i'll send out an all-clear when we're back up and running.


Re: What is "*** UNCHECKED ***"?

2018-01-26 Thread Sean Owen
Yeah sounds like some JIRA 'feature' or issue. It's not any particular
bother. If it persists I'll ask INFRA, sure.

On Fri, Jan 26, 2018 at 12:00 PM Reynold Xin  wrote:

> Examples?
>
>
> On Fri, Jan 26, 2018 at 9:56 AM, Sean Owen  wrote:
>
>> I probably missed this, but what is the new "*** UNCHECKED ***" message
>> in the subject line of some JIRAs?
>>
>
>


Re: [VOTE] Spark 2.3.0 (RC2)

2018-01-26 Thread Sameer Agarwal
This vote has failed due to a number of aforementioned blockers. I'll
follow up with RC3 as soon as the 2 remaining (non-QA) blockers are
resolved: https://s.apache.org/oXKi


On 25 January 2018 at 12:59, Sameer Agarwal  wrote:

>
>> Most tests pass on RC2, except I'm still seeing the timeout caused by
>> https://issues.apache.org/jira/browse/SPARK-23055 ; the tests never
>> finish. I followed the thread a bit further and wasn't clear whether it was
>> subsequently re-fixed for 2.3.0 or not. It says it's resolved along with
>> https://issues.apache.org/jira/browse/SPARK-22908 for 2.3.0 though I am
>> still seeing these tests fail or hang:
>>
>> - subscribing topic by name from earliest offsets (failOnDataLoss: false)
>> - subscribing topic by name from earliest offsets (failOnDataLoss: true)
>>
>
> Sean, while some of these tests were timing out on RC1, we're not aware of
> any known issues in RC2. Both maven
> (https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-maven-hadoop-2.6/146/testReport/org.apache.spark.sql.kafka010/history/)
> and sbt
> (https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-branch-2.3-test-sbt-hadoop-2.6/123/testReport/org.apache.spark.sql.kafka010/history/)
> historical builds on jenkins for org.apache.spark.sql.kafka010 look fairly
> healthy. If you're still seeing timeouts in RC2, can you create a JIRA with
> any applicable build/env info?
>
>
>
>> On Tue, Jan 23, 2018 at 9:01 AM Sean Owen  wrote:
>>
>>> I'm not seeing that same problem on OS X and /usr/bin/tar. I tried
>>> unpacking it with 'xvzf' and also unzipping it first, and it untarred
>>> without warnings in either case.
>>>
>>> I am encountering errors while running the tests, different ones each
>>> time, so am still figuring out whether there is a real problem or just
>>> flaky tests.
>>>
>>> These issues look like blockers, as they are inherently to be completed
>>> before the 2.3 release. They are mostly not done. I suppose I'd -1 on
>>> behalf of those who say this needs to be done first, though, we can keep
>>> testing.
>>>
>>> SPARK-23105 Spark MLlib, GraphX 2.3 QA umbrella
>>> SPARK-23114 Spark R 2.3 QA umbrella
>>>
>>> Here are the remaining items targeted for 2.3:
>>>
>>> SPARK-15689 Data source API v2
>>> SPARK-20928 SPIP: Continuous Processing Mode for Structured Streaming
>>> SPARK-21646 Add new type coercion rules to compatible with Hive
>>> SPARK-22386 Data Source V2 improvements
>>> SPARK-22731 Add a test for ROWID type to OracleIntegrationSuite
>>> SPARK-22735 Add VectorSizeHint to ML features documentation
>>> SPARK-22739 Additional Expression Support for Objects
>>> SPARK-22809 pyspark is sensitive to imports with dots
>>> SPARK-22820 Spark 2.3 SQL API audit
>>>
>>>
>>> On Mon, Jan 22, 2018 at 7:09 PM Marcelo Vanzin wrote:
>>>
 +0

 Signatures check out. Code compiles, although I see the errors in [1]
 when untarring the source archive; perhaps we should add "use GNU tar"
 to the RM checklist?
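
(Editor's sketch, not part of the original message.) One way to check whether a source archive carries the vendor-specific pax header keywords that typically trigger untar warnings on another tar implementation. This assumes the errors in [1] are of the "unknown extended header keyword" kind. The helper names `vendor_pax_keywords` and `build_sample_archive`, the sample path, and the sample header value are all hypothetical; only Python's standard `tarfile` module is used.

```python
import io
import tarfile


def vendor_pax_keywords(fileobj):
    """Map each member name to its vendor-style pax keywords (those containing a dot)."""
    found = {}
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive:
            # Standard pax keywords (path, mtime, ...) have no dot;
            # vendor extensions look like "VENDOR.keyword".
            vendor = sorted(key for key in member.pax_headers if "." in key)
            if vendor:
                found[member.name] = vendor
    return found


def build_sample_archive():
    """Build an in-memory pax archive with one vendor keyword, for demonstration only."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w", format=tarfile.PAX_FORMAT) as archive:
        payload = b"hello\n"
        info = tarfile.TarInfo(name="spark-src/README.md")  # hypothetical path
        info.size = len(payload)
        info.pax_headers = {"SCHILY.dev": "16777220"}  # hypothetical value
        archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(vendor_pax_keywords(build_sample_archive()))
```

Run against a real release tarball (opened in binary mode), an empty result would suggest the archive has no such keywords; it does not prove the archive untars cleanly everywhere.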

 Also ran our internal tests and they seem happy.

 My concern is the list of open bugs targeted at 2.3.0 (ignoring the
 documentation ones). It is not long, but it seems some of those need
 to be looked at. It would be nice for the committers who are involved
 in those bugs to take a look.

 [1] https://superuser.com/questions/318809/linux-os-x-tar-incompatibility-tarballs-created-on-os-x-give-errors-when-unt


 On Mon, Jan 22, 2018 at 1:36 PM, Sameer Agarwal wrote:
 > Please vote on releasing the following candidate as Apache Spark version
 > 2.3.0. The vote is open until Friday January 26, 2018 at 8:00:00 am UTC and
 > passes if a majority of at least 3 PMC +1 votes are cast.
 >
 >
 > [ ] +1 Release this package as Apache Spark 2.3.0
 >
 > [ ] -1 Do not release this package because ...
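
(Editor's sketch, not part of the original message.) The passing rule above can be read as: at least three binding (PMC) +1 votes, and more binding +1 votes than binding -1 votes. That reading is an assumption, and the `Vote` type, the `vote_passes` function, and the voter names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Vote:
    voter: str
    value: int     # +1, 0, or -1
    binding: bool  # True when cast by a PMC member


def vote_passes(votes, minimum_binding_plus_ones=3):
    """Return True if binding +1 votes reach the minimum and outnumber binding -1 votes."""
    binding = [vote for vote in votes if vote.binding]
    plus_ones = sum(1 for vote in binding if vote.value > 0)
    minus_ones = sum(1 for vote in binding if vote.value < 0)
    return plus_ones >= minimum_binding_plus_ones and plus_ones > minus_ones


if __name__ == "__main__":
    sample = [
        Vote("pmc-a", +1, True),
        Vote("pmc-b", +1, True),
        Vote("contributor-c", +1, False),  # non-binding, not counted
    ]
    print(vote_passes(sample))
```

A tally is only one input: a vote can also be withdrawn when blockers surface, regardless of the count.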
 >
 >
 > To learn more about Apache Spark, please see https://spark.apache.org/
 >
 > The tag to be voted on is v2.3.0-rc2:
 > https://github.com/apache/spark/tree/v2.3.0-rc2
 > (489ecb0ef23e5d9b705e5e5bae4fa3d871bdac91)
 >
 > List of JIRA tickets resolved in this release can be found here:
 > https://issues.apache.org/jira/projects/SPARK/versions/12339551
 >
 > The release files, including signatures, digests, etc. can be found at:
 > https://dist.apache.org/repos/dist/dev/spark/v2.3.0-rc2-bin/
 >
 > Release artifacts are signed with the following key:
 > https://dist.apache.org/repos/dist/dev/spark/KEYS
 >
 > The staging repository for this release can be found at:
 > https://repository.apache.org/content/repositories/orgapachespark-1262/
 >
 > The documentation corresponding to this release can be found at:
 > 

Re: ***UNCHECKED*** [jira] [Resolved] (SPARK-23218) simplify ColumnVector.getArray

2018-01-26 Thread Reynold Xin
I have no idea. Some JIRA update? Might want to file an INFRA ticket.


On Fri, Jan 26, 2018 at 10:04 AM, Sean Owen  wrote:

> This is an example of the "*** UNCHECKED ***" message I was talking about
> -- it's part of the email subject rather than JIRA.
>
> -- Forwarded message -
> From: Xiao Li (JIRA) 
> Date: Fri, Jan 26, 2018 at 11:18 AM
> Subject: ***UNCHECKED*** [jira] [Resolved] (SPARK-23218) simplify
> ColumnVector.getArray
> To: 
>
>
>
>  [ https://issues.apache.org/jira/browse/SPARK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
>
> Xiao Li resolved SPARK-23218.
> -
>Resolution: Fixed
> Fix Version/s: 2.3.0
>
> > simplify ColumnVector.getArray
> > --
> >
> > Key: SPARK-23218
> > URL: https://issues.apache.org/jira/browse/SPARK-23218
> > Project: Spark
> >  Issue Type: Sub-task
> >  Components: SQL
> >Affects Versions: 2.3.0
> >Reporter: Wenchen Fan
> >Assignee: Wenchen Fan
> >Priority: Major
> > Fix For: 2.3.0
> >
> >
>
>
>
>
> --
> This message was sent by Atlassian JIRA
> (v7.6.3#76005)
>
> -
> To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
> For additional commands, e-mail: issues-h...@spark.apache.org
>
>


Fwd: ***UNCHECKED*** [jira] [Resolved] (SPARK-23218) simplify ColumnVector.getArray

2018-01-26 Thread Sean Owen
This is an example of the "*** UNCHECKED ***" message I was talking about
-- it's part of the email subject rather than JIRA.

-- Forwarded message -
From: Xiao Li (JIRA) 
Date: Fri, Jan 26, 2018 at 11:18 AM
Subject: ***UNCHECKED*** [jira] [Resolved] (SPARK-23218) simplify
ColumnVector.getArray
To: 



 [ https://issues.apache.org/jira/browse/SPARK-23218?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li resolved SPARK-23218.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> simplify ColumnVector.getArray
> --
>
> Key: SPARK-23218
> URL: https://issues.apache.org/jira/browse/SPARK-23218
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Wenchen Fan
>Assignee: Wenchen Fan
>Priority: Major
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


Re: What is "*** UNCHECKED ***"?

2018-01-26 Thread Reynold Xin
Examples?


On Fri, Jan 26, 2018 at 9:56 AM, Sean Owen  wrote:

> I probably missed this, but what is the new "*** UNCHECKED ***" message in
> the subject line of some JIRAs?
>


What is "*** UNCHECKED ***"?

2018-01-26 Thread Sean Owen
I probably missed this, but what is the new "*** UNCHECKED ***" message in
the subject line of some JIRAs?


Re: Why Dataset.hint uses logicalPlan (= analyzed not planWithBarrier)?

2018-01-26 Thread Jacek Laskowski
Thanks Wenchen --> https://github.com/apache/spark/pull/20405

I'd also like to write a new test where the broadcast hint is specified
with table identifiers, and improve the scaladoc for Dataset.hint to note
that the hint does not have to be called on the Dataset being hinted but can
be called on any Dataset (as long as the table identifier is resolvable).
That would help me understand that part of Spark SQL a little better (i.e.
writing a unit test with logical rules and such).

Should I file an issue in JIRA for this? Any suggestions on how to do it the
right way?

Pozdrawiam,
Jacek Laskowski

https://about.me/JacekLaskowski
Mastering Spark SQL https://bit.ly/mastering-spark-sql
Spark Structured Streaming https://bit.ly/spark-structured-streaming
Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
Follow me at https://twitter.com/jaceklaskowski

On Fri, Jan 26, 2018 at 9:08 AM, Wenchen Fan  wrote:

> Looks like we missed this one, feel free to submit a patch, thanks for
> your finding!
>
> On Fri, Jan 26, 2018 at 3:39 PM, Jacek Laskowski  wrote:
>
>> Hi,
>>
>> I've just noticed that every time Dataset.hint is used it triggers
>> execution of logical commands, their unions and hint resolution (among
>> other things that analyzer does).
>>
>> Why?
>>
>> Why does hint trigger hint resolution (through QueryExecution.analyzed)?
>> [1]
>>
>> And moreover, why not use planWithBarrier instead? [2] Looks like an
>> oversight, doesn't it?
>>
>> [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L1219
>>
>> [2] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L195
>>
>> Pozdrawiam,
>> Jacek Laskowski
>> 
>> https://about.me/JacekLaskowski
>> Mastering Spark SQL https://bit.ly/mastering-spark-sql
>> Spark Structured Streaming https://bit.ly/spark-structured-streaming
>> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
>> Follow me at https://twitter.com/jaceklaskowski
>>
>
>


Re: Why Dataset.hint uses logicalPlan (= analyzed not planWithBarrier)?

2018-01-26 Thread Wenchen Fan
Looks like we missed this one, feel free to submit a patch, thanks for your
finding!

On Fri, Jan 26, 2018 at 3:39 PM, Jacek Laskowski  wrote:

> Hi,
>
> I've just noticed that every time Dataset.hint is used it triggers
> execution of logical commands, their unions and hint resolution (among
> other things that analyzer does).
>
> Why?
>
> Why does hint trigger hint resolution (through QueryExecution.analyzed)?
> [1]
>
> And moreover, why not use planWithBarrier instead? [2] Looks like an
> oversight, doesn't it?
>
> [1] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L1219
>
> [2] https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala#L195
>
> Pozdrawiam,
> Jacek Laskowski
> 
> https://about.me/JacekLaskowski
> Mastering Spark SQL https://bit.ly/mastering-spark-sql
> Spark Structured Streaming https://bit.ly/spark-structured-streaming
> Mastering Kafka Streams https://bit.ly/mastering-kafka-streams
> Follow me at https://twitter.com/jaceklaskowski
>