Hi! Is there any place where I can find information on how to use gapply with Arrow?
I've tried something very simple:

    collect(gapply(
      df,
      c("ColumnA"),
      function(key, x) {
        data.frame(out = c("dfs"), stringAsFactors = FALSE)
      },
      "out String"
    ))

It fails, although similar code that returns integers or doubles works fine:

    [Fetched stdout timeout]
    Error in readBin(con, raw(), as.integer(dataLen), endian = "big") :
      invalid 'n' argument
    java.lang.UnsupportedOperationException
        at org.apache.spark.sql.vectorized.ArrowColumnVector$ArrowVectorAccessor.getUTF8String(ArrowColumnVector.java:233)
        at org.apache.spark.sql.vectorized.ArrowColumnVector.getUTF8String(ArrowColumnVector.java:109)
        at org.apache.spark.sql.vectorized.ColumnarBatchRow.getUTF8String(ColumnarBatch.java:220)
        at org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection.apply(Unknown Source)
        ...

When I looked at the source code at that point in the trace, it is all stubs. Is there a proper way to use Arrow with gapply in SparkR?

BR,
Jacel