Sure, you can create custom RDDs. Haven’t done so in Java, but in Scala absolutely.
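
For what it's worth, you may not need a custom RDD just to avoid holding a whole table in memory. As far as I know, in Spark 1.x the Java flatMap takes a FlatMapFunction whose call() returns an Iterable, and that Iterable is consumed lazily, so returning one that pulls rows on demand should let saveAsTextFile write them out as they arrive. Below is a rough, Spark-free sketch of just the lazy Iterable part. RowCursor, lazyRows and cursorOver are hypothetical names made up for illustration; RowCursor stands in for a forward-only cursor such as a JDBC ResultSet.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyTableRows {

    /** Hypothetical stand-in for a forward-only cursor such as a JDBC ResultSet. */
    interface RowCursor {
        boolean next();        // advance to the next row; false once exhausted
        String currentRow();   // the row the cursor is currently positioned on
    }

    /**
     * Wraps a cursor in an Iterable that fetches one row at a time.
     * The cursor is shared, so the returned Iterable can only be walked once.
     */
    static Iterable<String> lazyRows(final RowCursor cursor) {
        return new Iterable<String>() {
            @Override
            public Iterator<String> iterator() {
                return new Iterator<String>() {
                    private boolean fetched = false; // cursor already advanced for the pending row?
                    private boolean hasRow = false;  // result of that advance

                    @Override
                    public boolean hasNext() {
                        if (!fetched) {
                            hasRow = cursor.next();
                            fetched = true;
                        }
                        return hasRow;
                    }

                    @Override
                    public String next() {
                        if (!hasNext()) {
                            throw new NoSuchElementException();
                        }
                        fetched = false; // consume the pending row
                        return cursor.currentRow();
                    }

                    @Override
                    public void remove() {
                        throw new UnsupportedOperationException();
                    }
                };
            }
        };
    }

    /** In-memory cursor used only to exercise the sketch; real code would read from the database. */
    static RowCursor cursorOver(final List<String> rows) {
        return new RowCursor() {
            private int pos = -1;
            @Override public boolean next() { return ++pos < rows.size(); }
            @Override public String currentRow() { return rows.get(pos); }
        };
    }

    public static void main(String[] args) {
        // Rows are pulled from the cursor one by one as the loop advances.
        for (String row : lazyRows(cursorOver(Arrays.asList("1,alice", "2,bob")))) {
            System.out.println(row);
        }
    }
}
```

In a Spark job the idea would be to open the connection inside call() and return something like lazyRows(...) from it. Connection handling and closing the cursor when the iterator is exhausted are left out here, and I have not tried this against SQL Server.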

From: Shushant Arora
Date: Wednesday, July 1, 2015 at 1:44 PM
To: Silvio Fiorito
Cc: user
Subject: Re: custom RDD in java

OK, I will evaluate these options, but is it possible to create an RDD in Java?

On Wed, Jul 1, 2015 at 8:29 PM, Silvio Fiorito <silvio.fior...@granturing.com> wrote:

If all you’re doing is just dumping tables from SQLServer to HDFS, have you looked at Sqoop?

Otherwise, if you need to run this in Spark, could you just use the existing JdbcRDD?

From: Shushant Arora
Date: Wednesday, July 1, 2015 at 10:19 AM
To: user
Subject: custom RDD in java

Hi,

Is it possible to write a custom RDD in Java?

The requirement is that I have a list of SQL Server tables that need to be dumped to HDFS.

So I have a

    List<String> tables = {dbname.tablename, dbname.tablename2, ......};

then

    JavaRDD<String> rdd = javasparkcontext.parallelize(tables);
    JavaRDD<String> tablecontent = rdd.map(new Function<String, Iterable<String>>() { fetch table and return populated iterable });
    tablecontent.saveAsTextFile("hdfs path");

In rdd.map(new Function<String,>), I cannot keep the complete table content in memory, so I want to create my own RDD to handle it.

Thanks,
Shushant