[SparkR] Compare datetime with Sys.time() throws error in R (>= 4.2.0)

2023-01-03 Thread Vivek Atal
'class' names as a character vector (R: Object Classes); hence this type of check itself was not a good idea in the first place. t <- Sys.time(); sdf <- SparkR::createDataFrame(data.frame(xx = t + c(-1,1,-1,1,-1))); SparkR::collect(SparkR::filter(sdf, SparkR::column("
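The R behavior behind the fix, as a minimal sketch (plain R, no Spark needed):

```r
t <- Sys.time()
class(t)                # c("POSIXct", "POSIXt") -- a character vector
class(t) == "POSIXct"   # TRUE FALSE -- a length-2 logical
# In R >= 4.2.0, if (class(t) == "POSIXct") errors ("condition has length > 1"),
# which is why equality checks on class() had to go.
inherits(t, "POSIXct")  # TRUE -- the robust way to test for a class
```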

Re: [R] SparkR on conda-forge

2021-12-19 Thread Hyukjin Kwon
Awesome! On Mon, 20 Dec 2021 at 09:43, yonghua wrote: > Nice release. thanks for sharing. > > On 2021/12/20 3:55, Maciej wrote: > > FYI ‒ thanks to good folks from conda-forge we have now these:

Re: [R] SparkR on conda-forge

2021-12-19 Thread yonghua
Nice release. thanks for sharing. On 2021/12/20 3:55, Maciej wrote: FYI ‒ thanks to good folks from conda-forge we have now these:

[R] SparkR on conda-forge

2021-12-19 Thread Maciej
Hi everyone, FYI ‒ thanks to good folks from conda-forge we have now these: * https://github.com/conda-forge/r-sparkr-feedstock * https://anaconda.org/conda-forge/r-sparkr -- Best regards, Maciej Szymkiewicz Web: https://zero323.net PGP: A30CEF0C31A501EC OpenPGP_signature Description

Re: [SparkR] gapply with strings with arrow

2020-10-10 Thread Hyukjin Kwon
ction.apply(Unknown > Source) > ... > > When I looked at the source code there - it is all stubs. > > Is there a proper way to use arrow in gapply in SparkR? > > BR, > > Jacek

[SparkR] gapply with strings with arrow

2020-10-07 Thread Jacek Pliszka
afeProjection.apply(Unknown Source) ... When I looked at the source code there - it is all stubs. Is there a proper way to use arrow in gapply in SparkR? BR, Jacek

Re: Fail to use SparkR of 3.0 preview 2

2020-01-07 Thread Xiao Li
eInfo-env-quot-S3methods-quot-td4755490.html > ). > Yes, seems we should make sure we build SparkR in an old version. > Since that support for R prior to version 3.4 is deprecated as of Spark > 3.0.0, we could use either R 3.4 or matching to Jenkins's (R 3.1 IIRC) for > Spark 3.0

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Hyukjin Kwon
I was randomly googling out of curiosity, and seems indeed that's the problem ( https://r.789695.n4.nabble.com/Error-in-rbind-info-getNamespaceInfo-env-quot-S3methods-quot-td4755490.html ). Yes, seems we should make sure we build SparkR in an old version. Since that support for R prior to ve

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
Yes, I guess so. But R 3.6.2 was just released this month; I think we should use an older version to build SparkR. Felix Cheung wrote on Fri, Dec 27, 2019 at 10:43 AM: > Maybe it’s the reverse - the package is built to run in latest but not > compatible with slightly older (3.5.2 was De

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Felix Cheung
Maybe it’s the reverse - the package is built to run in latest but not compatible with slightly older (3.5.2 was Dec 2018) From: Jeff Zhang Sent: Thursday, December 26, 2019 5:36:50 PM To: Felix Cheung Cc: user.spark Subject: Re: Fail to use SparkR of 3.0

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
AM > *To:* user.spark > *Subject:* Fail to use SparkR of 3.0 preview 2 > > I tried SparkR of spark 3.0 preview 2, but hit the following issue. > > Error in rbind(info, getNamespaceInfo(env, "S3methods")) : > number of columns of matrices must match (see arg 2) > Error:

Re: Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Felix Cheung
It looks like a change in the method signature in R base packages. Which version of R are you running on? From: Jeff Zhang Sent: Thursday, December 26, 2019 12:46:12 AM To: user.spark Subject: Fail to use SparkR of 3.0 preview 2 I tried SparkR of spark 3.0

Fail to use SparkR of 3.0 preview 2

2019-12-26 Thread Jeff Zhang
I tried SparkR of spark 3.0 preview 2, but hit the following issue. Error in rbind(info, getNamespaceInfo(env, "S3methods")) : number of columns of matrices must match (see arg 2) Error: package or namespace load failed for ‘SparkR’ in rbind(info, getNamespaceInfo(env, "S3methods

Re: SparkR integration with Hive 3 spark-r

2019-11-24 Thread Felix Cheung
I think you will get more answers if you ask without SparkR. Your question is independent of SparkR. Spark support for Hive 3.x (3.1.2) was added here https://github.com/apache/spark/commit/1b404b9b9928144e9f527ac7b1caa15f932c2649 You should be able to connect Spark to Hive metastore
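A sketch of wiring Spark to a Hive 3 metastore via the configs that commit enables; the version string and jar path below are assumptions, not taken from the thread:

```r
library(SparkR)
# Point the session at an external Hive 3.1.2 metastore (paths assumed).
sparkR.session(sparkConfig = list(
  spark.sql.catalogImplementation  = "hive",
  spark.sql.hive.metastore.version = "3.1.2",
  spark.sql.hive.metastore.jars    = "/opt/hive/lib/*"
))
```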

Re: SparkR integration with Hive 3 spark-r

2019-11-22 Thread Alfredo Marquez
>> hive metastore 2.3.5 - no mention of hive 3 metastore. I made several >> tests on this in the past[1] and it seems to handle any hive metastore >> version. >> >> However spark cannot read hive managed table AKA transactional tables. >> So I would say you should

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
nsactional tables. > So I would say you should be able to read any hive 3 regular table with > any of spark, pyspark or sparkR. > > > [1] > https://parisni.frama.io/posts/playing-with-hive-spark-metastore-versions/ > > On Mon, Nov 18, 2019 at 11:23:50AM -0600, Alfredo Marquez wr

Re: SparkR integration with Hive 3 spark-r

2019-11-18 Thread Nicolas Paris
transactional tables. So I would say you should be able to read any hive 3 regular table with any of spark, pyspark or sparkR. [1] https://parisni.frama.io/posts/playing-with-hive-spark-metastore-versions/ On Mon, Nov 18, 2019 at 11:23:50AM -0600, Alfredo Marquez wrote: > Hello, > > Our c

SparkR integration with Hive 3 spark-r

2019-11-18 Thread Alfredo Marquez
Hello, Our company is moving to Hive 3, and they are saying that there is no SparkR implementation in Spark 2.3.x + that will connect to Hive 3. Is this true? If it is true, will this be addressed in the Spark 3 release? I don't use python, so losing SparkR to get work done on Hadoop is a

Re: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?

2019-07-16 Thread Felix Cheung
: Monday, July 15, 2019 6:58:32 AM To: user@spark.apache.org Subject: [PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame? Hi all, Forgive this naïveté, I’m looking for reassurance from some experts! In the past we created a tailored Spark library for our

[PySpark] [SparkR] Is it possible to invoke a PySpark function with a SparkR DataFrame?

2019-07-15 Thread Fiske, Danny
tors. We'd ideally write our functions with PySpark and potentially create a SparkR "wrapper" over the top, leading to the question: Given a function written with PySpark that accepts a DataFrame parameter, is there a way to invoke this function using a SparkR DataFrame? Is the

Re: sparksql in sparkR?

2019-06-07 Thread Felix Cheung
This seems to be more a question about the spark-sql shell? I may suggest you change the email title to get more attention. From: ya Sent: Wednesday, June 5, 2019 11:48:17 PM To: user@spark.apache.org Subject: sparksql in sparkR? Dear list, I am trying to use sparksql

sparksql in sparkR?

2019-06-05 Thread ya
Dear list, I am trying to use SparkSQL within my R; I have the following questions, could you give me some advice please? Thank you very much. 1. I connect my R and Spark using the library SparkR; probably some of the members here are also R users? Do I understand correctly that SparkSQL

Re: SparkR + binary type + how to get value

2019-02-19 Thread Felix Cheung
From the second image it looks like there is a protocol mismatch. I’d check if the SparkR package running there on the Livy machine matches the Spark Java release. But in any case this seems more an issue with Livy config. I’d suggest checking with the community there

Re: SparkR + binary type + how to get value

2019-02-19 Thread Thijs Haarhuis
for it at: https://jira.apache.org/jira/browse/LIVY-558 When I call the spark.lapply function it reports that SparkR is not initialized. I have looked into the spark.lapply function and it seems there is no spark context. Any idea how I can debug this? I hope you can help. Regards, Thijs

Re: SparkR + binary type + how to get value

2019-02-17 Thread Felix Cheung
: Thijs Haarhuis Sent: Thursday, February 14, 2019 4:01 AM To: Felix Cheung; user@spark.apache.org Subject: Re: SparkR + binary type + how to get value Hi Felix, Sure.. I have the following code: printSchema(results) cat("\n\n\n") firstRow <- first(results

Re: SparkR + binary type + how to get value

2019-02-14 Thread Thijs Haarhuis
s it is a list. Any idea how to get the actual value, or how to process the individual bytes? Thanks Thijs From: Felix Cheung Sent: Thursday, February 14, 2019 5:31 AM To: Thijs Haarhuis; user@spark.apache.org Subject: Re: SparkR + binary type + how to get

Re: SparkR + binary type + how to get value

2019-02-13 Thread Felix Cheung
Please share your code From: Thijs Haarhuis Sent: Wednesday, February 13, 2019 6:09 AM To: user@spark.apache.org Subject: SparkR + binary type + how to get value Hi all, Does anybody have any experience in accessing the data from a column which has a binary

SparkR + binary type + how to get value

2019-02-13 Thread Thijs Haarhuis
Hi all, Does anybody have any experience in accessing the data from a column which has a binary type in a Spark Data Frame in R? I have a Spark Data Frame which has a column which is of a binary type. I want to access this data and process it. In my case I collect the spark data frame to an R dat
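A sketch of what the replies converge on, under assumed names (a SparkDataFrame `results` with a binary column `payload`): after collect(), each cell of a binary column is an R raw vector inside a list.

```r
library(SparkR)
localDf <- collect(results)             # binary column arrives as a list column
bytes <- localDf$payload[[1]]           # one cell: a raw vector
rawToChar(bytes)                        # decode if the bytes are text
readBin(bytes, what = "double", n = 1)  # or reinterpret the raw bytes
```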

Re: SparkR issue

2018-10-14 Thread Felix Cheung
1. Seems like it's spending a lot of time in R (slicing the data I guess?) and not with Spark. 2. Could you write it into a csv file locally and then read it from Spark? From: ayan guha Sent: Monday, October 8, 2018 11:21 PM To: user Subject: SparkR issue Hi We
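A sketch of that workaround, with an assumed path that both R and Spark can read:

```r
library(SparkR)
write.csv(rdf, "/tmp/rdf.csv", row.names = FALSE)    # serialize once from R
df <- read.df("/tmp/rdf.csv", source = "csv",
              header = "true", inferSchema = "true")  # let Spark do the parsing
```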

SparkR issue

2018-10-08 Thread ayan guha
Hi We are seeing some weird behaviour in Spark R. We created a R Dataframe with 600K records and 29 columns. Then we tried to convert R DF to SparkDF using df <- SparkR::createDataFrame(rdf) from RStudio. It hanged, we had to kill the process after 1-2 hours. We also tried following:

Any good book recommendations for SparkR

2018-04-30 Thread @Nandan@
Hi Team, Any good book recommendations for getting in-depth knowledge, from zero to production? Let me know. Thanks.

package reload in dapply SparkR

2018-04-10 Thread Deepansh Goyal
I have a native R model and am doing structured streaming on it. Data comes from Kafka and goes into the dapply method, where my model does prediction and data is written to the sink. Problem: my model requires the caret package. Inside the dapply function, for every stream job, the caret package is loaded again, which ad

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
follow up for a fix. > > _ > From: Hyukjin Kwon > Sent: Wednesday, February 14, 2018 6:49 PM > Subject: Re: SparkR test script issue: unable to run run-tests.h on spark > 2.2 > To: chandan prakash > Cc: user @spark > > > > From a

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread Felix Cheung
Yes, it is an issue with the newer release of testthat. To work around it, could you install an earlier version with devtools? Will follow up for a fix. _ From: Hyukjin Kwon Sent: Wednesday, February 14, 2018 6:49 PM Subject: Re: SparkR test script issue: unable to run run
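A minimal sketch of that workaround; the exact version and mirror are assumptions, since the thread only says "a 1.x release":

```r
# Pin testthat to a 1.x release so SparkR's run-tests.sh can load it.
devtools::install_version("testthat", version = "1.0.2",
                          repos = "https://cloud.r-project.org")
```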

Re: SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread Hyukjin Kwon
From a very quick look, I think it is a testthat version issue with SparkR. I had to fix that version to 1.x before in AppVeyor. There are a few details in https://github.com/apache/spark/pull/20003 Can you check and lower the testthat version? On 14 Feb 2018 6:09 pm, "chandan prakash" wro

SparkR test script issue: unable to run run-tests.h on spark 2.2

2018-02-14 Thread chandan prakash
Hi All, I am trying to run the R test script under ./R/run-tests.sh but hit the same ERROR every time. I tried running on a Mac as well as a CentOS machine; the same issue comes up. I am using Spark 2.2 (branch-2.2). I followed the steps from the Apache doc: 1. installed R 2. installed packages like

Re: sparkR 3rd library

2017-09-05 Thread Yanbo Liang
I guess you didn't install the R package `genalg` on all worker nodes. This is not a built-in package for basic R, so you need to install it on all worker nodes manually, or run `install.packages` inside of your SparkR UDF. Regarding how to download third-party packages and install them insi
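A sketch of the second option, installing from inside the function shipped to workers; the mirror and the toy payload are assumptions:

```r
library(SparkR)
res <- spark.lapply(seq_len(4), function(i) {
  # Make the package available on whichever worker runs this task.
  if (!requireNamespace("genalg", quietly = TRUE)) {
    install.packages("genalg", repos = "https://cloud.r-project.org")
  }
  library(genalg)
  i  # placeholder for the real rbga() call from the thread
})
```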

Re: sparkR 3rd library

2017-09-04 Thread Felix Cheung
Can you include the code you call spark.lapply? From: patcharee Sent: Sunday, September 3, 2017 11:46:40 PM To: spar >> user@spark.apache.org Subject: sparkR 3rd library Hi, I am using spark.lapply to execute an existing R script in standalone mode

sparkR 3rd library

2017-09-03 Thread patcharee
Hi, I am using spark.lapply to execute an existing R script in standalone mode. This script calls a function 'rbga' from a 3rd library 'genalg'. This rbga function works fine in sparkR env when I call it directly, but when I apply this to spark.lapply I get the error cou

Re: Update MySQL table via Spark/SparkR?

2017-08-22 Thread Pierce Lamb
Russ > *Cc: *"user@spark.apache.org" > *Subject: *Re: Update MySQL table via Spark/SparkR? > > > > Hi Jake, > > This is an issue across all RDBMs including Oracle etc. When you are > updating you have to commit or roll back in RDBMS itself and I am not aware

Re: Update MySQL table via Spark/SparkR?

2017-08-22 Thread Jake Russ
;user@spark.apache.org" Subject: Re: Update MySQL table via Spark/SparkR? Hi Jake, This is an issue across all RDBMs including Oracle etc. When you are updating you have to commit or roll back in RDBMS itself and I am not aware of Spark doing that. The staging table is a safer method as it follow

Re: Update MySQL table via Spark/SparkR?

2017-08-21 Thread ayan guha
On 21 August 2017 at 15:50, Jake Russ wrote: >> Hi everyone, >> >> I’m currently using SparkR to read data fr

Re: Update MySQL table via Spark/SparkR?

2017-08-21 Thread Mich Talebzadeh
On 21 August 2017 at 15:50, Jake Russ wrote: > Hi everyone, > > I’m currently using SparkR to read data from a MySQL database, perform > some

Update MySQL table via Spark/SparkR?

2017-08-21 Thread Jake Russ
Hi everyone, I’m currently using SparkR to read data from a MySQL database, perform some calculations, and then write the results back to MySQL. Is it still true that Spark does not support UPDATE queries via JDBC? I’ve seen many posts on the internet that Spark’s DataFrameWriter does not
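A sketch of the staging-table pattern the replies recommend: Spark's JDBC writer appends or overwrites but does not UPDATE, so land the results in a staging table and merge inside MySQL. URL, table names, and credentials below are assumptions.

```r
library(SparkR)
write.jdbc(resultsDF, url = "jdbc:mysql://dbhost:3306/mydb",
           tableName = "results_staging", mode = "overwrite",
           user = "user", password = "secret")
# Then run the merge in MySQL itself, e.g.:
#   UPDATE target t JOIN results_staging s ON t.id = s.id SET t.val = s.val;
```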

Re: [sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib ?

2017-04-21 Thread Felix Cheung
Not currently - how are you planning to use the output from word2vec? From: Radhwane Chebaane Sent: Thursday, April 20, 2017 4:30:14 AM To: user@spark.apache.org Subject: [sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib ? Hi, I've been experime

[sparkR] [MLlib] : Is word2vec implemented in SparkR MLlib ?

2017-04-20 Thread Radhwane Chebaane
Hi, I've been experimenting with the Spark *Word2vec* implementation in the MLLib package with Scala and it was very nice. I need to use the same algorithm in R leveraging the power of spark distribution with SparkR. I have been looking on the mailing list and Stackoverflow for any *Word2vec

Re: Issue with SparkR setup on RStudio

2017-01-04 Thread Md. Rezaul Karim
he exception stack is fairly far away from the actual > error, but from the top of my head spark.sql.warehouse.dir and HADOOP_HOME > are the two different pieces that is not set in the Windows tests. > > > _ > From: Md. Rezaul Karim > Sent: Monday

RBackendHandler Error while running ML algorithms with SparkR on RStudio

2017-01-03 Thread Md. Rezaul Karim
, ...) : java.io.IOException: Class not found Here's my source code: Sys.setenv(SPARK_HOME = "spark-2.1.0-bin-hadoop2.7") .libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths())) library(SparkR) sparkR.session(appName = "S
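The same snippet, completed as a sketch; the app name and warehouse path are assumptions (the replies single out spark.sql.warehouse.dir and HADOOP_HOME as the pieces that are unset on Windows):

```r
Sys.setenv(SPARK_HOME = "spark-2.1.0-bin-hadoop2.7")
.libPaths(c(file.path(Sys.getenv("SPARK_HOME"), "R", "lib"), .libPaths()))
library(SparkR)
sparkR.session(appName = "SparkR-RStudio",  # name assumed; original truncated
               sparkConfig = list(
                 spark.sql.warehouse.dir = "file:///tmp/spark-warehouse"))
```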

Re: Issue with SparkR setup on RStudio

2017-01-02 Thread Felix Cheung
set in the Windows tests. _ From: Md. Rezaul Karim Sent: Monday, January 2, 2017 7:58 AM Subject: Re: Issue with SparkR setup on RStudio To: Felix Cheung Cc: spark users

Re: Issue with SparkR setup on RStudio

2017-01-02 Thread Md. Rezaul Karim
issue with Hive config likely > with trying to load hive-site.xml. Could you try not setting HADOOP_HOME > > > -- > *From:* Md. Rezaul Karim > *Sent:* Thursday, December 29, 2016 10:24:57 AM > *To:* spark users > *Subject:* Issue with SparkR setup o

Re: Issue with SparkR setup on RStudio

2016-12-29 Thread Felix Cheung
:57 AM To: spark users Subject: Issue with SparkR setup on RStudio Dear Spark users, I am trying to set up SparkR on RStudio to perform some basic data manipulations and ML modeling. However, I am getting a strange error while creating a SparkR session or DataFrame that says: java.lang.IllegalArgumentExc

Issue with SparkR setup on RStudio

2016-12-29 Thread Md. Rezaul Karim
Dear Spark users, I am trying to set up SparkR on RStudio to perform some basic data manipulations and ML modeling. However, I am getting a strange error while creating a SparkR session or DataFrame that says: java.lang.IllegalArgumentException Error while instantiating

Re: Does SparkR or SparkMLib support nonlinear optimization with non linear constraints

2016-11-25 Thread Robineast
lable, do you plan to have it in your roadmap > anytime. > > TIA > Jyoti

Re: How to propagate R_LIBS to sparkr executors

2016-11-17 Thread Felix Cheung
Have you tried spark.executorEnv.R_LIBS? spark.apache.org/docs/latest/configuration.html#runtime-environment _ From: Rodrick Brown Sent: Wednesday, November 16, 2016 1:01 PM Subject: How to propagate R_LIBS to sparkr execut
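A sketch of that setting; the library path is an assumption:

```r
library(SparkR)
# Propagate the R library location to every executor's environment.
sparkR.session(sparkConfig = list(
  spark.executorEnv.R_LIBS = "/opt/r-libs"
))
```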

How to propagate R_LIBS to sparkr executors

2016-11-16 Thread Rodrick Brown
in sparkr? I’m using Mesos 1.0.1 and Spark 2.0.1 Thanks. -- Rodrick Brown / Site Reliability Engineer +1 917 445 6839 / rodr...@orchardplatform.com Orchard Platform 101 5th Avenue, 4th Floor, New York, NY

Re: Issue Running sparkR on YARN

2016-11-09 Thread Felix Cheung
It may be that the Spark executor is running as a different user and it can't see where Rscript is? You might want to try putting the Rscript path on PATH. Also please see this for the config property to set for the R command to use: https://spark.apache.org/docs/latest/configuration.html#s
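A sketch using the spark.r.command property from the linked configuration page; the binary path is an assumption, and whether the 1.5.2 build in this thread honors the property is also an assumption:

```r
library(SparkR)
# Point driver and workers at an explicit R binary (path assumed).
sc <- sparkR.init(sparkEnvir = list(spark.r.command = "/usr/bin/Rscript"))
```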

Issue Running sparkR on YARN

2016-11-09 Thread Ian.Maloney
Hi, I’m trying to run sparkR (1.5.2) on YARN and I get: java.io.IOException: Cannot run program "Rscript": error=2, No such file or directory This strikes me as odd, because I can go to each node and various users and type Rscript and it works. I’ve done this on each node and spark

Re: Substitute Certain Rows a data Frame using SparkR

2016-10-19 Thread Felix Cheung
hink your example could be something we support though. Please feel free to open a JIRA for that. _ From: shilp Sent: Monday, October 17, 2016 7:38 AM Subject: Substitute Certain Rows a data Frame using SparkR To: user@spark.apa

Substitute Certain Rows a data Frame using SparkR

2016-10-17 Thread shilp
I have a SparkR data frame and I want to replace certain rows of a column which satisfy a certain condition with some value. If it was a simple R data frame then I would do something as follows: df$Column1[df$Column1 == "Value"] = "NewValue" How would I perform a similar operation on
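A sketch of one SparkR equivalent, using the column-wise ifelse() (column name and values taken from the question):

```r
library(SparkR)
# Replace matching rows, keep all others unchanged.
df$Column1 <- ifelse(df$Column1 == "Value", "NewValue", df$Column1)
```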

Re: SparkR execution hang on when handle a RDD which is converted from DataFrame

2016-10-14 Thread Lantao Jin
> key_id, > rtl_week_beg_dt rawdate, > gmv_plan_rate_amt value > FROM > metrics_moveing_detection_cube > " > df <- sql(sqlString) > rdd<-SparkR:::toRDD(df) > > #hang on case one: take from rdd > #take(rdd,3) > > #hang on case two: convert back to data

Re: SparkR execution hang on when handle a RDD which is converted from DataFrame

2016-10-13 Thread Felix Cheung
rics_moveing_detection_cube " df <- sql(sqlString) rdd<-SparkR:::toRDD(df) #hang on case one: take from rdd #take(rdd,3) #hang on case two: convert back to dataframe #df1<-createDataFrame(rdd) #head(df1) #not hang case: direct handle on dataframe is ok head(df,3) Code above is spark2.

SparkR execution hang on when handle a RDD which is converted from DataFrame

2016-10-13 Thread Lantao Jin
sqlContext <- sparkRHive.init(sc) sqlString<- "SELECT key_id, rtl_week_beg_dt rawdate, gmv_plan_rate_amt value FROM metrics_moveing_detection_cube " df <- sql(sqlString) rdd<-SparkR:::toRDD(df) #hang on case one: take from rdd #take(rdd,3) #hang on case two: convert

SparkR 2.0 glm prediction confidences

2016-10-05 Thread Zsolt Tóth
Hi, in Spark 1.6 the glm's predict() method returned a DataFrame with 0/1 prediction values. In 2.0 however, the same code returns confidence-like values, e.g. 0.5320209312. Can anyone tell me, what caused the change here? Is it possible to get the old, binary values with Spark 2.0? Regards, Zsol
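One way to recover binary labels from the new probability-like output, sketched with an assumed 0.5 cutoff and an assumed binomial spark.glm model:

```r
library(SparkR)
pred <- predict(model, df)
pred$label <- ifelse(pred$prediction > 0.5, 1, 0)  # threshold back to 0/1
```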

Re: Filtering in SparkR

2016-10-03 Thread Deepak Sharma
Hi Yogesh You can try registering these 2 DFs as temporary table and then execute the sql query. df1.registerTempTable("df1") df2.registerTempTable("df2") val rs = sqlContext.sql("SELECT a.* FROM df1 a, df2 b where a.id != b.id) Thanks Deepak On Mon, Oct 3, 2016 at 12:38 PM, Yogesh Vyas wrote:

Filtering in SparkR

2016-10-03 Thread Yogesh Vyas
Hi, I have two SparkDataFrames, df1 and df2. Their schemas are as follows: df1=>SparkDataFrame[id:double, c1:string, c2:string] df2=>SparkDataFrame[id:double, c3:string, c4:string] I want to filter out rows from df1 where df1$id does not match df2$id I tried some expression: filter(df1,!(df1$id

Fwd: filtering in SparkR

2016-10-02 Thread Yogesh Vyas
Hi, I have two SparkDataFrames, df1 and df2. Their schemas are as follows: df1=>SparkDataFrame[id:double, c1:string, c2:string] df2=>SparkDataFrame[id:double, c3:string, c4:string] I want to filter out rows from df1 where df1$id does not match df2$id I tried some expression: filter(df1,!(df1$id

filtering in SparkR

2016-10-02 Thread Yogesh Vyas
Hi, I have two SparkDataFrames, df1 and df2. Their schemas are as follows: df1=>SparkDataFrame[id:double, c1:string, c2:string] df2=>SparkDataFrame[id:double, c3:string, c4:string] I want to filter out rows from df1 where df1$id does not match df2$id I tried some expression: filter(df1,!(df1$id
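A sketch of an anti-join formulation of the same filter (schemas from the question): keep df1 rows whose id has no match in df2 via a left outer join plus an isNull filter.

```r
library(SparkR)
joined <- join(df1, df2, df1$id == df2$id, "left_outer")
noMatch <- select(filter(joined, isNull(df2$id)),
                  df1$id, df1$c1, df1$c2)
```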

RE: as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Update: the job can finish, but takes a long time on 10M rows of data. Is there a better solution? From: xing_ma...@hotmail.com To: user@spark.apache.org Subject: as.Date can't be applied to Spark data frame in SparkR Date: Tue, 20 Sep 2016 10:22:17 +0800 Hi, all I've noticed that as.

as.Date can't be applied to Spark data frame in SparkR

2016-09-19 Thread xingye
Hi, all I've noticed that as.Date can't be applied to a Spark data frame. I've created the following UDF and used dapply to change an integer column "aa" to a date with origin 1960-01-01. change_date <- function(df){ df <- as.POSIXlt(as.Date(df$aa, origin = "1960-01-01", tz = "UTC")) } customSc
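A sketch of a built-in alternative that keeps the 10M rows inside Spark instead of shipping them through an R UDF; `sdf` and the column name `aa` are taken from the thread, the output column name is an assumption:

```r
library(SparkR)
# Days-since-origin to date, computed by Spark SQL functions.
sdf <- withColumn(sdf, "aa_date",
                  expr("date_add(to_date('1960-01-01'), aa)"))
```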

Re: SparkR API problem with subsetting distributed data frame

2016-09-11 Thread Bene

Re: SparkR error: reference is ambiguous.

2016-09-10 Thread Felix Cheung
t:double] > head(c) speed dist 1 0 2 2 0 10 3 0 4 4 0 22 5 0 16 6 0 10 _ From: Bedrytski Aliaksandr Sent: Friday, September 9, 2016 9:13 PM Subject: Re: SparkR error: reference is ambiguous. To: xingye

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Felix Cheung
How are you calling dirs()? What would be x? Is dat a SparkDataFrame? With SparkR, i in dat[i, 4] should be a logical expression for rows, e.g. df[df$age %in% c(19, 30), 1:2] On Sat, Sep 10, 2016 at 11:02 AM -0700, "Bene" wrote: Her

Re: Assign values to existing column in SparkR

2016-09-10 Thread Felix Cheung
h Species 1 5.1 3.5 0 0.2 setosa 2 4.9 3.0 0 0.2 setosa 3 4.7 3.2 0 0.2 setosa 4 4.6 3.1 0 0.2 setosa 5 5.0 3.6 0 0.2 setosa 6 5.4 3.9 0 0.4 setosa _ From: Deepak Sharma Sent: Friday, September 9, 2016 12:29 PM Subject: Re: Assig

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Bene

Re: SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Felix Cheung
Could you include code snippets you are running? On Sat, Sep 10, 2016 at 1:44 AM -0700, "Bene" wrote: Hi, I am having a problem with the SparkR API. I need to subset a distributed data so I can extract single values from it on whi

SparkR API problem with subsetting distributed data frame

2016-09-10 Thread Bene
Hi, I am having a problem with the SparkR API. I need to subset a distributed data frame so I can extract single values from it, on which I can then do calculations. Each row of my df has two integer values; I am creating a vector of new values calculated as a series of sin, cos, tan functions on these
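A sketch of the filter-then-collect pattern the replies point toward; the column names are assumptions, since row-wise positional indexing is not available on a distributed frame:

```r
library(SparkR)
localDf <- collect(df[df$v1 > 0, c("v1", "v2")])  # filter first, then collect
x <- localDf$v2[1]                                # single values come from the
                                                  # small local data.frame
```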

Re: SparkR error: reference is ambiguous.

2016-09-09 Thread Bedrytski Aliaksandr
Hi, Can you use full-string queries in SparkR? Like (in Scala): df1.registerTempTable("df1") df2.registerTempTable("df2") val df3 = sparkContext.sql("SELECT * FROM df1 JOIN df2 ON df1.ra = df2.ra") explicitly mentioning table names in the query often solve
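The same pattern in SparkR's 2.x API, as a sketch (table and column names from the Scala snippet above):

```r
library(SparkR)
createOrReplaceTempView(df1, "df1")
createOrReplaceTempView(df2, "df2")
df3 <- sql("SELECT * FROM df1 JOIN df2 ON df1.ra = df2.ra")
```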

Re: Assign values to existing column in SparkR

2016-09-09 Thread Deepak Sharma
Data frames are immutable in nature, so I don't think you can directly assign or change values on the column. Thanks Deepak On Fri, Sep 9, 2016 at 10:59 PM, xingye wrote: > I have some questions about assigning values to a spark dataframe. I want to > assign values to an existing column of a spar

SparkR error: reference is ambiguous.

2016-09-09 Thread xingye
Not sure whether this is the right distribution list that I can ask questions on. If not, can someone give a distribution list that can find someone to help? I kept getting the error "reference is ambiguous" when implementing some SparkR code. 1. When I tried to assign values to a column using the

Assign values to existing column in SparkR

2016-09-09 Thread xingye
I have some questions about assigning values to a spark dataframe. I want to assign values to an existing column of a spark dataframe, but if I assign the value directly I get the following error. df$c_mon <- 0 Error: class(value) == "Column" || is.null(value) is not TRUE Is there a way to solve this?
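A sketch of the usual fix for this error: the assignment expects a Column object, so wrap the constant in lit().

```r
library(SparkR)
df$c_mon <- lit(0)  # a literal Column, not a bare R scalar
```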

Re: No SparkR on Mesos?

2016-09-07 Thread ray
Hi, Rodrick, Interesting. SparkR is expected not to work with Mesos due to lack of support for Mesos in some places, and it has not been tested yet. Have you modified the Spark source code yourself? Have you deployed the Spark binary distribution on all slave nodes, and set

Re: No SparkR on Mesos?

2016-09-07 Thread Rodrick Brown
We've been using SparkR on Mesos for quite sometime with no issues. [fedora@prod-rstudio-1 ~]$ /opt/spark-1.6.1/bin/sparkR R version 3.3.0 (2016-05-03) -- "Supposedly Educational" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-redhat-linux-gnu (64-

Re: No SparkR on Mesos?

2016-09-07 Thread Timothy Chen
as a silly conditional that would fail > the submission, even though all the support was there. Could be the same for > R. Can you submit a JIRA? > >> On Wed, Sep 7, 2016 at 5:02 AM, Peter Griessl wrote: >> Hello, >> >> >> >> does SparkR really n

Re: No SparkR on Mesos?

2016-09-07 Thread Felix Cheung
This is correct - SparkR is not quite working completely on Mesos. JIRAs and contributions welcome! On Wed, Sep 7, 2016 at 10:21 AM -0700, "Michael Gummelt" wrote: Quite possibly. I've never used it. I know Python was "unsupp

Re: No SparkR on Mesos?

2016-09-07 Thread Michael Gummelt
5:02 AM, Peter Griessl wrote: > Hello, > > > > does SparkR really not work (yet?) on Mesos (Spark 2.0 on Mesos 1.0)? > > > > $ /opt/spark/bin/sparkR > > > > R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" > > Copyright (C) 2016 The R Foun

No SparkR on Mesos?

2016-09-07 Thread Peter Griessl
Hello, does SparkR really not work (yet?) on Mesos (Spark 2.0 on Mesos 1.0)? $ /opt/spark/bin/sparkR R version 3.3.1 (2016-06-21) -- "Bug in Your Hair" Copyright (C) 2016 The R Foundation for Statistical Computing Platform: x86_64-pc-linux-gnu (64-bit) Launching java with spark-subm

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
The reason your second example works is because of a closure capture behavior. It should be ok for a small amount of data. You could also use SparkR:::broadcast but please keep in mind that is not public API we actively support. Thank you for the information on formula - I will test that out

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Cinquegrana, Piero
I tested both in local and cluster mode and the '<<-' seemed to work at least for small data. Or am I missing something? Is there a way for me to test? If that does not work, can I use something like this? sc <- SparkR:::getSparkContext() bcStack <- SparkR:::broadcast(sc,s

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-25 Thread Felix Cheung
rom: Cinquegrana, Piero Sent: Wednesday, August 24, 2016 10:37 AM Subject: RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") To: Cinquegrana, Piero, Felix Cheung

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-24 Thread Cinquegrana, Piero
r.biz] Sent: Tuesday, August 23, 2016 2:39 PM To: Felix Cheung ; user@spark.apache.org Subject: RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") The output from score() is very small, just a float. The input, however, could be as big as several hundred MBs. I woul

RE: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-23 Thread Cinquegrana, Piero
, Piero ; user@spark.apache.org Subject: Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big") How big is the output from score()? Also could you elaborate on what you want to broadcast? On Mon, Aug 22, 2016 at 11:58 AM -0700, "Cinquegrana, Piero" m

Re: spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Felix Cheung
How big is the output from score()? Also could you elaborate on what you want to broadcast? On Mon, Aug 22, 2016 at 11:58 AM -0700, "Cinquegrana, Piero" wrote: Hello, I am using the new R API in SparkR spark.lapply (spark 2.0).

spark.lapply in SparkR: Error in writeBin(batch, con, endian = "big")

2016-08-22 Thread Cinquegrana, Piero
Hello, I am using the new R API in SparkR, spark.lapply (Spark 2.0). I am defining a complex function to be run across executors and I have to send the entire dataset, but there is no way (that I could find) to broadcast the variable in SparkR. I am thus reading the dataset in each executor
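A sketch of the closure-capture route discussed downthread: objects the function references are serialized and shipped with it, which stands in for an explicit broadcast when the data is modest. The input path and the score() call are assumptions based on the thread.

```r
library(SparkR)
dataset <- read.csv("input.csv")  # read once on the driver
results <- spark.lapply(1:10, function(i) {
  score(dataset, i)               # 'dataset' travels inside the closure
})
```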

Re: Disable logger in SparkR

2016-08-22 Thread Felix Cheung
Monday, August 22, 2016 6:12 AM Subject: Disable logger in SparkR To: user Hi, Is there any way of disabling the logging on console in SparkR? Regards, Yogesh

Disable logger in SparkR

2016-08-22 Thread Yogesh Vyas
Hi, Is there any way of disabling the logging on console in SparkR? Regards, Yogesh
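A sketch for the 2.x API (this thread is from the Spark 2.0 era):

```r
library(SparkR)
sparkR.session()
setLogLevel("ERROR")  # suppress INFO/WARN console output for this session
```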

Re: UDF in SparkR

2016-08-17 Thread Yann-Aël Le Borgne
> https://spark.apache.org/docs/2.0.0/api/R/ > > Feedback welcome and appreciated! > > > _ > From: Yogesh Vyas > Sent: Tuesday, August 16, 2016 11:39 PM > Subject: UDF in SparkR > To: user > > > > Hi, > > Is there is

Re: UDF in SparkR

2016-08-17 Thread Felix Cheung
UDF in SparkR To: user Hi, Is there any way of using a UDF in SparkR? Regards, Yogesh

UDF in SparkR

2016-08-16 Thread Yogesh Vyas
Hi, Is there any way of using a UDF in SparkR? Regards, Yogesh
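A sketch of the dapply() mechanism the replies point to (added in Spark 2.0); the schema and toy data are assumptions:

```r
library(SparkR)
sparkR.session()
df <- createDataFrame(data.frame(a = c(1, 2, 3)))
schema <- structType(structField("a", "double"),
                     structField("a2", "double"))
# Run an R function over each partition, returning the declared schema.
out <- dapply(df, function(pdf) cbind(pdf, a2 = pdf$a * 2), schema)
head(collect(out))
```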

Re: SparkR error when repartition is called

2016-08-09 Thread Felix Cheung
, August 9, 2016 12:19 AM Subject: Re: SparkR error when repartition is called To: Sun Rui Cc: User Sun, I am using spark in yarn client mode in a 2-node cluster with hadoop-2.7.2. My R version is 3.3.1. I have the followi

Re: SparkR error when repartition is called

2016-08-09 Thread Shane Lee
nvironment information? On Aug 9, 2016, at 11:35, Shane Lee wrote: Hi All, I am trying out SparkR 2.0 and have run into an issue with repartition.  Here is the R code (essentially a port of the pi-calculating scala example in the spark package) that can reproduce the behavior: schema <- structTyp
