Hi Todd, the "--driver-memory" option specifies the maximum heap size of the JVM backend for SparkR. The error you hit is a memory allocation failure inside your local R process, which is a different thing: read.transactions() comes from the arules package and runs entirely in the R interpreter, so the JVM driver memory setting does not apply to it. I guess the 2 GB bound is a limitation of the R interpreter on the size of a single string buffer. That is exactly why we use SparkR for big-data processing. You could try copying your CSV to HDFS and having it partitioned across the cluster, but I am not sure whether Spark supports distributed CSV files out of the box.
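If it helps, here is a rough, untested sketch of what that could look like with the Spark 1.5-era SparkR API plus the third-party spark-csv package (the app name and HDFS path below are made up for illustration):

```r
# Sketch only, assuming Spark 1.5-era SparkR and the com.databricks:spark-csv
# package. Launch SparkR with the package on the classpath, e.g.:
#   ./sparkR --packages com.databricks:spark-csv_2.10:1.2.0

sc <- sparkR.init(appName = "csv-load")        # app name is arbitrary
sqlContext <- sparkRSQL.init(sc)

# Hypothetical HDFS path; replace with wherever you copied the file.
df <- read.df(sqlContext,
              "hdfs:///user/admin/datamining/data.csv",
              source    = "com.databricks.spark.csv",
              delimiter = "\t",
              header    = "false")

head(df)  # work on the distributed DataFrame instead of a local R object
```

This keeps the data partitioned across the cluster instead of pulling the whole file into a single R process, so you never hit R's per-string allocation limit.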
From: Todd [mailto:bit1...@163.com]
Sent: Friday, November 6, 2015 5:00 PM
To: user@spark.apache.org
Subject: [Spark R] could not allocate memory (2048 Mb) in C function 'R_AllocStringBuffer'

I am launching SparkR with the following script:

./sparkR --driver-memory 12G

and I try to load a local 3 GB CSV file with the following code:

> a=read.transactions("/home/admin/datamining/data.csv",sep="\t",format="single",cols=c(1,2))

but I encounter an error:

could not allocate memory (2048 Mb) in C function 'R_AllocStringBuffer'

I have allocated 12G of memory, so I am not sure why it complains that it could not allocate 2G. Could someone help me? Thanks!