Support ZOrder in OSS

2021-01-09 Thread Chang Chen
Hi All

I found that Impala has already implemented Z-order (
https://issues.apache.org/jira/browse/IMPALA-8755).

I used to think that supporting Z-order required file format support, but
judging from the Impala implementation, it only requires a new
RecordComparator, which is independent of the file format.
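
As a rough illustration of why no file-format support is needed: a Z-order
sort key can be computed purely from the column values of a row, so the
ordering can be applied before the writer ever sees the data. The sketch
below is a generic bit-interleaving (Morton code) example, not the Impala or
DBR implementation. The function names are hypothetical, and it assumes all
columns are non-negative integers that fit in `bits` bits.

```python
def interleave_bits(values, bits=32):
    """Build a Z-order (Morton) key by interleaving the bits of each column.

    Assumption: every value is a non-negative int smaller than 2**bits.
    """
    key = 0
    n = len(values)
    for bit in range(bits):
        for i, v in enumerate(values):
            # Bit `bit` of column i goes into group `bit`; within a group,
            # column 0 takes the most significant position.
            key |= ((v >> bit) & 1) << (bit * n + (n - 1 - i))
    return key


def z_order_sort(rows, bits=32):
    """Sort rows (tuples of ints) by their interleaved key."""
    return sorted(rows, key=lambda row: interleave_bits(row, bits))


rows = [(0, 0), (1, 1), (0, 1), (1, 0), (2, 0)]
ordered = z_order_sort(rows)
```

Because the key depends only on the row values, the same ordering can be fed
to any writer (Parquet, ORC, and so on) without changing the file format.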

I know that DBR has supported it since 2018, and I believe this feature is
useful for improving query performance in certain cases.


Would it be possible to port the Z-order implementation from DBR to open
source Spark?

Thanks
Chang


Re: [FYI] CI Infra issues (in both GitHub Action and Jenkins)

2021-01-09 Thread Hyukjin Kwon
To share the update about GitHub Actions:
- I have been informed that getting more resources is now being discussed at
the org level. Hopefully the situation will improve.
- Duplicated workflows will be canceled to save resources, see
https://github.com/apache/spark/pull/31104

cc @Holden Karau and @Xiao Li too, who were involved

On Sat, Jan 9, 2021 at 8:34 AM, Hyukjin Kwon wrote:

> For the GitHub resources of the ASF repo, I contacted GitHub a few days ago
> to address the issue. This is not a repo-level problem, cc @Sean Owen.
>
> The ASF organisation on GitHub already has too many repos, and we should
> have a way to increase the limit, or to set a separate limit specifically
> for the repo.
>
> On Sat, 9 Jan 2021, 07:34 shane knapp ☠ wrote:
>
>> no, i don't think that'd be a good idea...  adding additional
>> dependencies to our cluster won't scale one bit.
>>
>> On Fri, Jan 8, 2021 at 2:16 PM Dongjoon Hyun 
>> wrote:
>>
>>> BTW, Shane, do you think we can utilize some of UCB machines as GitHub
>>> Action runners?
>>>
>>> Bests,
>>> Dongjoon.
>>>
>>> On Fri, Jan 8, 2021 at 2:14 PM Dongjoon Hyun 
>>> wrote:
>>>
>>>> The followings?
>>>>
>>>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-3.2/1836/console
>>>>
>>>> https://amplab.cs.berkeley.edu/jenkins/view/Spark%20QA%20Test%20(Dashboard)/job/spark-master-test-sbt-hadoop-2.7/1887/console
>>>>
>>>> On Fri, Jan 8, 2021 at 2:13 PM shane knapp ☠ wrote:
>>>>
>>>>> 1. Jenkins machines start to fail with the following recently.
>>>>>> (master branch)
>>>>>>
>>>>>> Python versions prior to 3.6 are not supported.
>>>>>> Build step 'Execute shell' marked build as failure
>>>>>>
>>>>>> examples please?
>>>>>
>>>>> --
>>>>> Shane Knapp
>>>>> Computer Guy / Voice of Reason
>>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>>> https://rise.cs.berkeley.edu
>>>>>
>>>>
>>
>> --
>> Shane Knapp
>> Computer Guy / Voice of Reason
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>


Re: When is the Spark 3.1 release date?

2021-01-09 Thread Takeshi Yamamuro
Hi,

We've already started a vote for the v3.1 release:
https://www.mail-archive.com/dev@spark.apache.org/msg27133.html
However, I think we need more time before the official release.
Please keep watching the vote threads on the spark-dev mailing list if you're
interested.

Bests,
Takeshi

On Sat, Jan 9, 2021 at 3:02 PM Vivek Bhaskar wrote:

> I see early Jan for voting date?
> https://spark.apache.org/versioning-policy.html
>
> Regards,
> Vivek
>


-- 
---
Takeshi Yamamuro


[no subject]

2021-01-09 Thread Christos Ziakas
Unsubscribe