Thanks, my issue was exactly that: the function that extracts the objects from the file reused the same object, only mutating it for each record. Creating a new object for each item solved the issue. Thank you very much for your reply. Best regards.
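For anyone hitting the same thing: the pitfall can be reproduced without Spark at all. Below is a minimal Java sketch (the names `MutableRecord` and `read` are hypothetical, made up for illustration) that simulates a reader which reuses one mutable holder object per record, the way Hadoop's RecordReader reuses a Writable. Storing the shared reference keeps only the last value written; copying each record preserves them all.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseDemo {
    // Simulates a Hadoop-style record: one mutable holder, like a Writable.
    static class MutableRecord {
        int value;
        MutableRecord() {}
        MutableRecord(MutableRecord other) { this.value = other.value; }
    }

    // Reads n "records" through a single reused object. If copy is false,
    // every stored element is the same shared object (the bug); if true,
    // each element is a fresh copy (the fix suggested in the thread).
    static List<MutableRecord> read(int n, boolean copy) {
        MutableRecord shared = new MutableRecord();   // reused for every record
        List<MutableRecord> out = new ArrayList<>();
        for (int v = 1; v <= n; v++) {
            shared.value = v;                         // reader overwrites in place
            out.add(copy ? new MutableRecord(shared) : shared);
        }
        return out;
    }

    public static void main(String[] args) {
        List<MutableRecord> refs = read(3, false);    // caching the reused object
        List<MutableRecord> copies = read(3, true);   // copying per record

        // All shared references show the last value written:
        System.out.println(refs.get(0).value + " "
                + refs.get(1).value + " " + refs.get(2).value);   // prints "3 3 3"
        // The copies keep each record's own value:
        System.out.println(copies.get(0).value + " "
                + copies.get(1).value + " " + copies.get(2).value); // prints "1 2 3"
    }
}
```

This is the same reason the docs recommend copying Hadoop writables with a `map` before caching, sorting, or aggregating: any operation that holds on to the records after the reader has moved on will otherwise see many references to one object.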
> On 26 Feb 2015, at 22:25, Imran Rashid <iras...@cloudera.com> wrote:
>
> any chance your input RDD is being read from HDFS, and you are running into
> this issue (in the docs on SparkContext#hadoopFile):
>
> * '''Note:''' Because Hadoop's RecordReader class re-uses the same Writable
>   object for each record, directly caching the returned RDD or directly
>   passing it to an aggregation or shuffle operation will create many
>   references to the same object.
> * If you plan to directly cache, sort, or aggregate Hadoop writable objects,
>   you should first copy them using a `map` function.
>
> On Thu, Feb 26, 2015 at 10:38 AM, mrk91 <marcogaid...@gmail.com
> <mailto:marcogaid...@gmail.com>> wrote:
>
> Hello,
>
> I have an issue with the cartesian method. When I use it with the Java types
> everything is OK, but when I use it with an RDD made of objects I defined
> myself, it behaves very strangely depending on whether the RDD is cached or
> not (you can see here
> <http://stackoverflow.com/questions/28727823/creating-a-matrix-of-neighbors-with-spark-cartesian-issue>
> what happens).
>
> Is this due to a bug in its implementation, or are there any requirements on
> the objects passed to it?
> Thanks.
> Best regards.
> Marco
>
> View this message in context: Cartesian issue with user defined objects
> <http://apache-spark-user-list.1001560.n3.nabble.com/Cartesian-issue-with-user-defined-objects-tp21826.html>
> Sent from the Apache Spark User List mailing list archive
> <http://apache-spark-user-list.1001560.n3.nabble.com/> at Nabble.com.