Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

2019-02-19 Thread JackyLee
+1



--
Sent from: http://apache-spark-developers-list.1001551.n3.nabble.com/

-
To unsubscribe e-mail: dev-unsubscr...@spark.apache.org



Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

2019-02-19 Thread Felix Cheung
+1



From: Ryan Blue 
Sent: Tuesday, February 19, 2019 9:34 AM
To: Jamison Bennett
Cc: dev
Subject: Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

+1

On Tue, Feb 19, 2019 at 8:41 AM Jamison Bennett 
 wrote:
+1 (non-binding)


Jamison Bennett

Cloudera Software Engineer

jamison.benn...@cloudera.com

515 Congress Ave, Suite 1212   |   Austin, TX   |   78701


On Tue, Feb 19, 2019 at 10:33 AM Maryann Xue <maryann@databricks.com> wrote:
+1

On Mon, Feb 18, 2019 at 10:46 PM John Zhuge <jzh...@apache.org> wrote:
+1

On Mon, Feb 18, 2019 at 8:43 PM Dongjoon Hyun <dongjoon.h...@gmail.com> wrote:
+1

Dongjoon.

On 2019/02/19 04:12:23, Wenchen Fan <cloud0...@gmail.com> wrote:
> +1
>
> On Tue, Feb 19, 2019 at 10:50 AM Ryan Blue 
> wrote:
>
> > Hi everyone,
> >
> > It looks like there is consensus on the proposal, so I'd like to start a
> > vote thread on the SPIP for identifiers in multi-catalog Spark.
> >
> > The doc is available here:
> > https://docs.google.com/document/d/1jEcvomPiTc5GtB9F7d2RTVVpMY64Qy7INCA_rFEd9HQ/edit?usp=sharing
> >
> > Please vote in the next 3 days.
> >
> > [ ] +1: Accept the proposal as an official SPIP
> > [ ] +0
> > [ ] -1: I don't think this is a good idea because ...
> >
> >
> > Thanks!
> >
> > rb
> >
> > --
> > Ryan Blue
> > Software Engineer
> > Netflix
> >
>




--
John Zhuge


--
Ryan Blue
Software Engineer
Netflix


Re: Missing SparkR in CRAN

2019-02-19 Thread Felix Cheung
We are waiting for an update from CRAN. Please hold on.



From: Takeshi Yamamuro 
Sent: Tuesday, February 19, 2019 2:53 PM
To: dev
Subject: Re: Missing SparkR in CRAN

Hi, guys

It seems SparkR is still not available on CRAN. Was there any problem
when resubmitting it?


On Fri, Jan 25, 2019 at 1:41 AM Felix Cheung <felixche...@apache.org> wrote:
Yes, it was discussed on dev@. We are waiting for 2.3.3 to be released before resubmitting.


On Thu, Jan 24, 2019 at 5:33 AM Hyukjin Kwon <gurwls...@gmail.com> wrote:
Hi all,

I happened to notice that SparkR is missing from CRAN. See
https://cran.r-project.org/web/packages/SparkR/index.html

I remember seeing some threads about this on the spark-dev mailing list a long
time ago, IIRC. Is a fix in progress somewhere, or did I misunderstand
something?


--
---
Takeshi Yamamuro


Re: [build system] Jenkins stopped working

2019-02-19 Thread Hyukjin Kwon
Thanks Shane!! <3

On Wed, Feb 20, 2019 at 10:13 AM, Wenchen Fan wrote:

> Thanks Shane!
>
> On Wed, Feb 20, 2019 at 6:48 AM shane knapp  wrote:
>
>> alright, i increased the httpd and proxy timeouts and kicked apache.
>> i'll keep an eye on things, but as of right now we're happily building.
>>
>> On Tue, Feb 19, 2019 at 2:25 PM shane knapp  wrote:
>>
>>> aand i had to issue another restart.  it's the ever annoying, and
>>> never quite clear as to why it's happening proxy/502 error.
>>>
>>> currently investigating.
>>>
>>> On Tue, Feb 19, 2019 at 9:21 AM shane knapp  wrote:
>>>
>>>> forgot to hit send before i went in to the office:  we're back up and
>>>> building!
>>>>
>>>> On Tue, Feb 19, 2019 at 8:06 AM shane knapp
>>>> wrote:
>>>>
>>>>> yep, it got wedged.  issued a restart and it should be back up in a
>>>>> few minutes.
>>>>>
>>>>> On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi
>>>>> wrote:
>>>>>
>>>>>> Yes, it seems to be down. The unit tests are not getting kicked off.
>>>>>>
>>>>>> Regards,
>>>>>> Parth Kamlesh Gandhi
>>>>>>
>>>>>>
>>>>>> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon
>>>>>> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>>>>>>> nobody reported this yet?
>>>>>>>
>>>>>>> Thanks!
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>
>>>>> --
>>>>> Shane Knapp
>>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>>> https://rise.cs.berkeley.edu
>>>>>
>>>>
>>>>
>>>> --
>>>> Shane Knapp
>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>> https://rise.cs.berkeley.edu
>>>>
>>>
>>>
>>> --
>>> Shane Knapp
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>>
>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>


Re: [build system] Jenkins stopped working

2019-02-19 Thread Wenchen Fan
Thanks Shane!

On Wed, Feb 20, 2019 at 6:48 AM shane knapp  wrote:

> alright, i increased the httpd and proxy timeouts and kicked apache.  i'll
> keep an eye on things, but as of right now we're happily building.
>
> On Tue, Feb 19, 2019 at 2:25 PM shane knapp  wrote:
>
>> aand i had to issue another restart.  it's the ever annoying, and
>> never quite clear as to why it's happening proxy/502 error.
>>
>> currently investigating.
>>
>> On Tue, Feb 19, 2019 at 9:21 AM shane knapp  wrote:
>>
>>> forgot to hit send before i went in to the office:  we're back up and
>>> building!
>>>
>>> On Tue, Feb 19, 2019 at 8:06 AM shane knapp  wrote:
>>>
>>>> yep, it got wedged.  issued a restart and it should be back up in a few
>>>> minutes.
>>>>
>>>> On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi
>>>> wrote:
>>>>
>>>>> Yes, it seems to be down. The unit tests are not getting kicked off.
>>>>>
>>>>> Regards,
>>>>> Parth Kamlesh Gandhi
>>>>>
>>>>>
>>>>> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon
>>>>> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>>>>>> nobody reported this yet?
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>>
>>>>>>
>>>>
>>>> --
>>>> Shane Knapp
>>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>>> https://rise.cs.berkeley.edu
>>>>
>>>
>>>
>>> --
>>> Shane Knapp
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>>
>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


Re: Missing SparkR in CRAN

2019-02-19 Thread Takeshi Yamamuro
Hi, guys

It seems SparkR is still not available on CRAN. Was there any problem
when resubmitting it?


On Fri, Jan 25, 2019 at 1:41 AM Felix Cheung  wrote:

> Yes, it was discussed on dev@. We are waiting for 2.3.3 to be released
> before resubmitting.
>
>
> On Thu, Jan 24, 2019 at 5:33 AM Hyukjin Kwon  wrote:
>
>> Hi all,
>>
>> I happened to notice that SparkR is missing from CRAN. See
>> https://cran.r-project.org/web/packages/SparkR/index.html
>>
>> I remember seeing some threads about this on the spark-dev mailing list a
>> long time ago, IIRC. Is a fix in progress somewhere, or did I misunderstand
>> something?
>>
>

-- 
---
Takeshi Yamamuro


Re: [build system] Jenkins stopped working

2019-02-19 Thread shane knapp
alright, i increased the httpd and proxy timeouts and kicked apache.  i'll
keep an eye on things, but as of right now we're happily building.

On Tue, Feb 19, 2019 at 2:25 PM shane knapp  wrote:

> aand i had to issue another restart.  it's the ever annoying, and
> never quite clear as to why it's happening proxy/502 error.
>
> currently investigating.
>
> On Tue, Feb 19, 2019 at 9:21 AM shane knapp  wrote:
>
>> forgot to hit send before i went in to the office:  we're back up and
>> building!
>>
>> On Tue, Feb 19, 2019 at 8:06 AM shane knapp  wrote:
>>
>>> yep, it got wedged.  issued a restart and it should be back up in a few
>>> minutes.
>>>
>>> On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi 
>>> wrote:
>>>
>>>> Yes, it seems to be down. The unit tests are not getting kicked off.
>>>>
>>>> Regards,
>>>> Parth Kamlesh Gandhi
>>>>
>>>>
>>>> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>>>>> nobody reported this yet?
>>>>>
>>>>> Thanks!
>>>>>
>>>>>
>>>>>
>>>
>>> --
>>> Shane Knapp
>>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>>> https://rise.cs.berkeley.edu
>>>
>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: [build system] Jenkins stopped working

2019-02-19 Thread shane knapp
aand i had to issue another restart.  it's the ever annoying, and never
quite clear as to why it's happening proxy/502 error.

currently investigating.

On Tue, Feb 19, 2019 at 9:21 AM shane knapp  wrote:

> forgot to hit send before i went in to the office:  we're back up and
> building!
>
> On Tue, Feb 19, 2019 at 8:06 AM shane knapp  wrote:
>
>> yep, it got wedged.  issued a restart and it should be back up in a few
>> minutes.
>>
>> On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi 
>> wrote:
>>
>>> Yes, it seems to be down. The unit tests are not getting kicked off.
>>>
>>> Regards,
>>> Parth Kamlesh Gandhi
>>>
>>>
>>> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon 
>>> wrote:
>>>
>>>> Hi all,
>>>>
>>>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>>>> nobody reported this yet?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>
>> --
>> Shane Knapp
>> UC Berkeley EECS Research / RISELab Staff Technical Lead
>> https://rise.cs.berkeley.edu
>>
>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Thoughts on dataframe cogroup?

2019-02-19 Thread Li Jin
Hi,

We have been using PySpark's groupby().apply() quite a bit and it has been
very helpful in integrating Spark with our existing pandas-heavy libraries.

Recently, we have found more and more cases where groupby().apply() is not
sufficient. In some cases, we want to group two dataframes by the same
key and apply a function that takes two pd.DataFrames (and returns a
pd.DataFrame) for each key. This feels very much like the "cogroup"
operation in the RDD API.

It would be great to be able to do something like this (not an actual API,
just to explain the use case):

@pandas_udf(return_schema, ...)
def my_udf(pdf1, pdf2):
    # pdf1 and pdf2 are the subsets of the original dataframes that are
    # associated with a particular key
    result = ...  # some code that uses pdf1 and pdf2
    return result

df3 = cogroup(df1, df2, key='some_key').apply(my_udf)
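
To make the intended per-key semantics concrete, here is a minimal sketch in
plain Python, with no Spark or pandas involved. Lists of dict rows stand in
for the per-key pd.DataFrames, and the names cogroup_apply and my_func are
hypothetical, chosen only for illustration. It assumes the keys are sortable
and that a key missing from one side should yield an empty group, as RDD
cogroup does.

```python
from collections import defaultdict


def cogroup_apply(rows1, rows2, key, func):
    """Group two lists of dict rows by `key` and call func once per key.

    Hypothetical helper: a stand-in for the proposed cogroup(...).apply(...).
    """
    groups1, groups2 = defaultdict(list), defaultdict(list)
    for row in rows1:
        groups1[row[key]].append(row)
    for row in rows2:
        groups2[row[key]].append(row)

    out = []
    # Visit every key present on either side; a side without the key
    # contributes an empty list (assumed to mirror RDD cogroup behavior).
    for k in sorted(set(groups1) | set(groups2)):
        out.extend(func(k, groups1.get(k, []), groups2.get(k, [])))
    return out


def my_func(k, part1, part2):
    # Illustrative per-key logic only: count the rows on each side.
    return [{"some_key": k, "n1": len(part1), "n2": len(part2)}]


rows1 = [
    {"some_key": "a", "x": 1},
    {"some_key": "a", "x": 2},
    {"some_key": "b", "x": 3},
]
rows2 = [
    {"some_key": "b", "y": 10.0},
    {"some_key": "c", "y": 20.0},
]
result = cogroup_apply(rows1, rows2, "some_key", my_func)
```

Note that keys "a" and "c" each appear on only one side, yet the function is
still called for them. That is the part that is awkward to express with an
inner join followed by groupby().apply().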

I have searched around the problem, and some people have suggested joining
the tables first. However, it's often not the same pattern, and it is hard
to get it to work using joins.

I wonder what people's thoughts are on this?

Li


Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

2019-02-19 Thread Ryan Blue
+1

On Tue, Feb 19, 2019 at 8:41 AM Jamison Bennett
 wrote:

> +1 (non-binding)
>
> Jamison Bennett
>
> Cloudera Software Engineer
>
> jamison.benn...@cloudera.com
>
> 515 Congress Ave, Suite 1212   |   Austin, TX   |   78701
>
>
> On Tue, Feb 19, 2019 at 10:33 AM Maryann Xue 
> wrote:
>
>> +1
>>
>> On Mon, Feb 18, 2019 at 10:46 PM John Zhuge  wrote:
>>
>>> +1
>>>
>>> On Mon, Feb 18, 2019 at 8:43 PM Dongjoon Hyun 
>>> wrote:
>>>
>>>> +1
>>>>
>>>> Dongjoon.
>>>>
>>>> On 2019/02/19 04:12:23, Wenchen Fan wrote:
>>>> > +1
>>>> >
>>>> > On Tue, Feb 19, 2019 at 10:50 AM Ryan Blue
>>>> > wrote:
>>>> >
>>>> > > Hi everyone,
>>>> > >
>>>> > > It looks like there is consensus on the proposal, so I'd like to start a
>>>> > > vote thread on the SPIP for identifiers in multi-catalog Spark.
>>>> > >
>>>> > > The doc is available here:
>>>> > > https://docs.google.com/document/d/1jEcvomPiTc5GtB9F7d2RTVVpMY64Qy7INCA_rFEd9HQ/edit?usp=sharing
>>>> > >
>>>> > > Please vote in the next 3 days.
>>>> > >
>>>> > > [ ] +1: Accept the proposal as an official SPIP
>>>> > > [ ] +0
>>>> > > [ ] -1: I don't think this is a good idea because ...
>>>> > >
>>>> > >
>>>> > > Thanks!
>>>> > >
>>>> > > rb
>>>> > >
>>>> > > --
>>>> > > Ryan Blue
>>>> > > Software Engineer
>>>> > > Netflix
>>>> > >
>>>> >
>>>>
>>>
>>> --
>>> John Zhuge
>>>
>>

-- 
Ryan Blue
Software Engineer
Netflix


Re: [build system] Jenkins stopped working

2019-02-19 Thread shane knapp
forgot to hit send before i went in to the office:  we're back up and
building!

On Tue, Feb 19, 2019 at 8:06 AM shane knapp  wrote:

> yep, it got wedged.  issued a restart and it should be back up in a few
> minutes.
>
> On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi 
> wrote:
>
>> Yes, it seems to be down. The unit tests are not getting kicked off.
>>
>> Regards,
>> Parth Kamlesh Gandhi
>>
>>
>> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon  wrote:
>>
>>> Hi all,
>>>
>>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>>> nobody reported this yet?
>>>
>>> Thanks!
>>>
>>>
>>>
>
> --
> Shane Knapp
> UC Berkeley EECS Research / RISELab Staff Technical Lead
> https://rise.cs.berkeley.edu
>


-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: [VOTE] SPIP: Identifiers for multi-catalog Spark

2019-02-19 Thread Jamison Bennett
+1 (non-binding)

Jamison Bennett

Cloudera Software Engineer

jamison.benn...@cloudera.com

515 Congress Ave, Suite 1212   |   Austin, TX   |   78701


On Tue, Feb 19, 2019 at 10:33 AM Maryann Xue 
wrote:

> +1
>
> On Mon, Feb 18, 2019 at 10:46 PM John Zhuge  wrote:
>
>> +1
>>
>> On Mon, Feb 18, 2019 at 8:43 PM Dongjoon Hyun 
>> wrote:
>>
>>> +1
>>>
>>> Dongjoon.
>>>
>>> On 2019/02/19 04:12:23, Wenchen Fan  wrote:
>>> > +1
>>> >
>>> > On Tue, Feb 19, 2019 at 10:50 AM Ryan Blue 
>>> > wrote:
>>> >
>>> > > Hi everyone,
>>> > >
>>> > > It looks like there is consensus on the proposal, so I'd like to start a
>>> > > vote thread on the SPIP for identifiers in multi-catalog Spark.
>>> > >
>>> > > The doc is available here:
>>> > > https://docs.google.com/document/d/1jEcvomPiTc5GtB9F7d2RTVVpMY64Qy7INCA_rFEd9HQ/edit?usp=sharing
>>> > >
>>> > > Please vote in the next 3 days.
>>> > >
>>> > > [ ] +1: Accept the proposal as an official SPIP
>>> > > [ ] +0
>>> > > [ ] -1: I don't think this is a good idea because ...
>>> > >
>>> > >
>>> > > Thanks!
>>> > >
>>> > > rb
>>> > >
>>> > > --
>>> > > Ryan Blue
>>> > > Software Engineer
>>> > > Netflix
>>> > >
>>> >
>>>
>>>
>>>
>>
>> --
>> John Zhuge
>>
>


Re: [build system] Jenkins stopped working

2019-02-19 Thread shane knapp
yep, it got wedged.  issued a restart and it should be back up in a few
minutes.

On Tue, Feb 19, 2019 at 7:32 AM Parth Gandhi 
wrote:

> Yes, it seems to be down. The unit tests are not getting kicked off.
>
> Regards,
> Parth Kamlesh Gandhi
>
>
> On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon  wrote:
>
>> Hi all,
>>
>> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
>> nobody reported this yet?
>>
>> Thanks!
>>
>>
>>

-- 
Shane Knapp
UC Berkeley EECS Research / RISELab Staff Technical Lead
https://rise.cs.berkeley.edu


Re: [build system] Jenkins stopped working

2019-02-19 Thread Parth Gandhi
Yes, it seems to be down. The unit tests are not getting kicked off.

Regards,
Parth Kamlesh Gandhi


On Tue, Feb 19, 2019 at 8:29 AM Hyukjin Kwon  wrote:

> Hi all,
>
> It looks like Jenkins stopped working. Did I maybe miss a thread, or has
> nobody reported this yet?
>
> Thanks!
>
>
>


[build system] Jenkins stopped working

2019-02-19 Thread Hyukjin Kwon
Hi all,

It looks like Jenkins stopped working. Did I maybe miss a thread, or has
nobody reported this yet?

Thanks!