Hi Sunitha,
Make the class that contains the common function you are calling
serializable.
Thank you,
Naresh
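To make that concrete, here is a minimal JDK-only sketch of what "serializable" buys you (`CommonService` and its method are hypothetical stand-ins, not names from Sunitha's code): before Spark can ship a function or any object it references to the executors, the whole object graph must survive a Java serialization round trip.

```java
import java.io.*;

public class SerializableCheck {
    // Hypothetical holder of the "common function". Implementing Serializable
    // (with serializable fields) is what allows Spark to ship it from the
    // driver JVM to the executor JVMs.
    static class CommonService implements Serializable {
        private static final long serialVersionUID = 1L;
        String greet(long id) { return "processed " + id; }
    }

    // Round-trip an object through Java serialization; this throws
    // NotSerializableException if any part of the object graph is not
    // serializable -- the same failure Spark reports as "Task not serializable".
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(o);
        oos.flush();
        ObjectInputStream in =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return in.readObject();
    }

    public static void main(String[] args) throws Exception {
        CommonService copy = (CommonService) roundTrip(new CommonService());
        System.out.println(copy.greet(42)); // prints "processed 42"
    }
}
```

Note that an anonymous inner class silently captures its enclosing instance, so if the enclosing class is not serializable, the round trip fails even when the function itself looks harmless.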
On Wed, Dec 20, 2017 at 9:58 PM Sunitha Chennareddy <
chennareddysuni...@gmail.com> wrote:
> Hi,
>
> Thank You All..
>
> Here is my requirement: I have a dataframe containing a list of rows
> retrieved from an Oracle table.
> I need to iterate over the dataframe, fetch each record, and call a common
> function, passing a few parameters.
>
> The issue I am facing is that I am not able to call the common function.
>
> JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(new
> Function<Row, Person>() {
>   @Override
>   public Person call(Row row) throws Exception {
>     Person person = new Person();
>     person.setId(row.getDecimal(0).longValue());
>     person.setName(row.getString(1));
>
>     personLst.add(person);
>     return person;
>   }
> });
>
> personRDD.foreach(new VoidFunction<Person>() {
>   private static final long serialVersionUID = 1123456L;
>
>   @Override
>   public void call(Person person) throws Exception {
>     System.out.println(person.getId());
>     // Here I tried to call the common function
>   }
> });
>
> I am able to print the data in the foreach loop; however, when I try to
> call the common function I get the error below.
> Error Message: org.apache.spark.SparkException: Task not serializable
>
> I kindly request you to share some ideas (sample code / a link to refer to)
> on how to call a common function/interface method, passing values from each
> record of the dataframe.
>
> Regards,
> Sunitha
>
>
> On Tue, Dec 19, 2017 at 1:20 PM, Weichen Xu
> wrote:
>
>> Hi Sunitha,
>>
>> In the mapper function you cannot update outer variables: a call such as
>> `personLst.add(person)` has no effect on the driver's list, which is why
>> you got an empty list.
>>
>> You can use `rdd.collect()` to get a local list of `Person` objects
>> first, then you can safely iterate on the local list and do any update you
>> want.
>>
>> Thanks.
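Spelled out as code (driver-side only; the actual Spark call is shown in a comment, and `commonFunction` / this `Person` class are simplified stand-ins): once `collect()` has returned, you hold an ordinary `java.util.List` on the driver, so you can call any method on each element with no closure serialization involved.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;

public class DriverSideLoop {
    static class Person implements Serializable {
        private static final long serialVersionUID = 1L;
        private final long id;
        Person(long id) { this.id = id; }
        long getId() { return id; }
    }

    // Stand-in for the "common function" to call per record.
    static String commonFunction(long id) { return "handled " + id; }

    public static void main(String[] args) {
        // In Spark this list would come from:
        //   List<Person> people = personRDD.collect();
        // Here we build it locally so the example runs without Spark.
        List<Person> people = new ArrayList<>();
        people.add(new Person(1));
        people.add(new Person(2));

        // Driver-side loop: nothing here is shipped to executors,
        // so calling any method (serializable or not) is safe.
        for (Person p : people) {
            System.out.println(commonFunction(p.getId()));
        }
    }
}
```

The trade-off is that `collect()` pulls every row into driver memory, so this only suits result sets that fit on one machine.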
>>
>> On Tue, Dec 19, 2017 at 2:16 PM, Sunitha Chennareddy <
>> chennareddysuni...@gmail.com> wrote:
>>
>>> Hi Deepak,
>>>
>>> I am able to map a row to the Person class; the issue is that I want to
>>> call another method.
>>> I tried converting to a list, and it does not work without using collect.
>>>
>>> Regards
>>> Sunitha
>>> On Tuesday, December 19, 2017, Deepak Sharma
>>> wrote:
>>>
I am not sure about Java, but in Scala it would be something like
df.rdd.map{ x => MyClass(x.getString(0), ...) }
HTH
--Deepak
On Dec 19, 2017 09:25, "Sunitha Chennareddy" wrote:
Hi All,
I am new to Spark, and I want to convert a DataFrame to a List without
using collect().
My main requirement is to iterate through the rows of the dataframe and
call another function, passing a column value from each row (person.getId()).
Here is the snippet I have tried; kindly help me resolve the issue.
personLst is returning 0:
List<Person> personLst = new ArrayList<>();
JavaRDD<Person> personRDD = person_dataframe.toJavaRDD().map(new
Function<Row, Person>() {
  public Person call(Row row) throws Exception {
    Person person = new Person();
    person.setId(row.getDecimal(0).longValue());
    person.setName(row.getString(1));
    personLst.add(person);
    // here I tried to call another function but control never passed
    return person;
  }
});
logger.info("personLst size == " + personLst.size());
logger.info("personRDD count === " + personRDD.count());
// output:
// personLst size == 0
// personRDD count === 3
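For the archives, the empty-list symptom can be reproduced without Spark at all. Spark serializes the map function and runs a deserialized copy of it on the executor, so `personLst.add(...)` mutates a copy of the captured list, never the driver's original. A JDK-only sketch (`ListAdder` is a hypothetical stand-in for the map function):

```java
import java.io.*;
import java.util.ArrayList;
import java.util.List;

public class ClosureCopy {
    // Stand-in for the map function: it captures a list and mutates it.
    static class ListAdder implements Serializable {
        private static final long serialVersionUID = 1L;
        final List<String> captured;
        ListAdder(List<String> captured) { this.captured = captured; }
        void call() { captured.add("person"); }
    }

    // Serialize and deserialize, the way Spark ships a task to an executor.
    static Object roundTrip(Object o) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(o);
        oos.flush();
        return new ObjectInputStream(
            new ByteArrayInputStream(bos.toByteArray())).readObject();
    }

    public static void main(String[] args) throws Exception {
        List<String> driverList = new ArrayList<>();
        // Simulate what Spark does: serialize the function, run the copy.
        ListAdder shipped = (ListAdder) roundTrip(new ListAdder(driverList));
        shipped.call();
        // The deserialized copy's list grew; the driver's list did not.
        System.out.println("driver list size = " + driverList.size());        // 0
        System.out.println("executor copy size = " + shipped.captured.size()); // 1
    }
}
```

This is why the driver-side `collect()`-then-iterate pattern suggested earlier in the thread is the usual fix when each record must be fed to driver-local code.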