Re: Random sampling in tests

2018-10-09 Thread Steve Loughran
Randomized testing can, in theory, help you explore a far larger area of an app's environment than you could explore explicitly, such as "does everything work in the Turkish locale, where "I".toLower() != "i"?", etc. Good: faster tests, especially on an essentially non-finite set of options bad
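The locale point above can be sketched in a few lines of Scala. This is only an illustration, not code from the thread: the object name `LocaleSampling` and both helpers are hypothetical, and the sketch assumes the sampled dimension is the JVM locale. It picks one locale per run from a seeded generator, so a failing run can be replayed from its logged seed, and checks the lower-casing property mentioned in the message.

```scala
import java.util.Locale
import scala.util.Random

// Hypothetical sketch: sample one locale per test run instead of testing all of them.
object LocaleSampling {
  // Pick a locale deterministically from a seed so a failing run can be replayed.
  // Sorting first makes the choice independent of the JVM's array ordering.
  def sampleLocale(seed: Long): Locale = {
    val locales = Locale.getAvailableLocales.sortBy(_.toString)
    locales(new Random(seed).nextInt(locales.length))
  }

  // The property under test: lower-casing "I" yields ASCII "i" in this locale.
  def lowerCasesAsAscii(locale: Locale): Boolean =
    "I".toLowerCase(locale) == "i"

  def main(args: Array[String]): Unit = {
    val seed = System.nanoTime()
    val locale = sampleLocale(seed)
    // Always log the seed; without it a random failure cannot be reproduced.
    println(s"seed=$seed locale=$locale ok=${lowerCasesAsAscii(locale)}")
    // The Turkish locale lower-cases "I" to the dotless "ı", so the property fails here.
    println(s"tr_TR ok=${lowerCasesAsAscii(new Locale("tr", "TR"))}")
  }
}
```

Logging the seed is the design choice that matters here: a sampled test that fails once and cannot be rerun with the same inputs is hard to act on.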

[build system] jenkins wedged, not accepting new PRBs

2018-10-09 Thread shane knapp
i just restarted jenkins... hopefully this will fix the issue. of course, nothing in the logs to show why this happened (nor is there ever). shane -- Shane Knapp UC Berkeley EECS Research / RISELab Staff Technical Lead https://rise.cs.berkeley.edu

Re: [build system] jenkins wedged, not accepting new PRBs

2018-10-09 Thread shane knapp
...and we're back and happily building.

On Tue, Oct 9, 2018 at 3:12 PM shane knapp wrote:
> i just restarted jenkins... hopefully this will fix the issue.
>
> of course, nothing in the logs to show why this happened (nor is there
> ever).
>
> shane
> --
> Shane Knapp
> UC Berkeley EECS Research

Re: DataSourceV2 APIs creating multiple instances of DataSourceReader and hence not preserving the state

2018-10-09 Thread Hyukjin Kwon
I took a look at the code.

    val source = classOf[MyDataSource].getCanonicalName
    spark.read.format(source).load().collect()

It does indeed look like the reader is created twice. First call: it looks like it is created first to read the schema for the logical plan test.org.apache.spark.sql.sources.v2.MyDataSourceReader.(MyDataSour

Re: Coalesce behaviour

2018-10-09 Thread Sergey Zhemzhitsky
Well, it seems that I can still extend the CoalesceRDD to make it preserve the total number of partitions from the parent RDD, reduce some partitions in the same way as the original coalesce does for map-only jobs and fill the gaps (partitions which should reside on the positions of the coalesced on