Re: Revisiting Online serving of Spark models?

2018-06-12 Thread Vadim Chelyshov
I've almost completed a library for speeding up serving of current Spark models - https://github.com/Hydrospheredata/fastserving. It depends on Spark, but it provides a way to turn the Spark logical plan of a sample DataFrame that was passed into a pipeline/transformer into an alternative transformer

Re: [CRAN-pretest-archived] CRAN submission SparkR 2.3.1

2018-06-12 Thread Shivaram Venkataraman
#1 - Yes. It doesn't look like that is being honored. This is something we should follow up with CRAN about.
#2 - Looking at it more closely, I'm not sure what the problem is. If the version string is 1.8.0_144 then our parsing code does work correctly. We might need to add more debug logging or
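
To illustrate the version-string point in #2, here is a minimal sketch in Python of extracting the major Java version from strings such as 1.8.0_144. This is not SparkR's parsing code, which is not shown in this thread; the function name `java_major_version` and the handling of Java 9+ style strings such as 10.0.1 are assumptions made for illustration only.

```python
import re


def java_major_version(version: str) -> int:
    """Return the major Java version from a version string.

    Hypothetical helper for illustration. It handles the legacy
    "1.x.y_z" scheme (Java 8 and earlier) and the "x.y.z" scheme
    used from Java 9 onward.
    """
    match = re.match(r"^(\d+)(?:\.(\d+))?", version.strip())
    if match is None:
        raise ValueError(f"Unrecognized Java version string: {version!r}")
    first = int(match.group(1))
    second = match.group(2)
    # Legacy scheme: "1.8.0_144" means Java 8, so use the second field.
    if first == 1 and second is not None:
        return int(second)
    # Newer scheme: "10.0.1" means Java 10, so use the first field.
    return first


if __name__ == "__main__":
    for v in ("1.8.0_144", "10.0.1", "9"):
        print(v, "->", java_major_version(v))
```

A check like this could be paired with debug logging of the raw version string, so that an unexpected format shows up in the build output instead of failing silently.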

Time for 2.1.3

2018-06-12 Thread Marcelo Vanzin
Hey all, There are some fixes that went into 2.1.3 recently that probably deserve a release. So as usual, please take a look if there's anything else you'd like on that release, otherwise I'd like to start with the process by early next week. I'll go through JIRA to see what's the status of

Re: Very slow complex type column reads from parquet

2018-06-12 Thread Ryan Blue
Jakub, You're right that Spark currently doesn't use the vectorized read path for nested data, but I'm not sure that's the problem here. With 50k elements in the f1 array, it could easily be that you're getting the significant speed-up from not reading or materializing that column. The

Fwd: [CRAN-pretest-archived] CRAN submission SparkR 2.3.1

2018-06-12 Thread Shivaram Venkataraman
Corresponding to the Spark 2.3.1 release, I submitted the SparkR build to CRAN yesterday. Unfortunately it looks like there are a couple of issues (full message from CRAN is forwarded below):
1. There are some builds started with Java 10

Re: [CRAN-pretest-archived] CRAN submission SparkR 2.3.1

2018-06-12 Thread Felix Cheung
For #1, are the system requirements not honored?
For #2, it looks like Oracle JDK?

From: Shivaram Venkataraman
Sent: Tuesday, June 12, 2018 3:17:52 PM
To: dev
Cc: Felix Cheung
Subject: Fwd: [CRAN-pretest-archived] CRAN submission SparkR 2.3.1

Corresponding to the