Cannot access data after a join (error: value _1 is not a member of Product with Serializable)

2014-11-19 Thread YaoPau
I joined two datasets together, and my resulting logs look like this:

(975894369,((72364,20141112T170627,web,MEMPHIS,AR,US,Central),(Male,John,Smith)))
(253142991,((30058,20141112T171246,web,ATLANTA16,GA,US,Southeast),(Male,Bob,Jones)))
(295305425,((28110,20141112T170454,iph,CHARLOTTE2,NC,US,Southeast),(Female,Mary,Williams)))

When I try to access the newly-joined data with JoinedInv.map(line =>
line._2._2._1) I get the following error:

[ERROR] 
error: value _1 is not a member of Product with Serializable
[INFO]   val getOne = JoinedInv.map(line => line._2._2._1)
[INFO] ^
[ERROR] error: value foreach is not a member of Array[Nothing]
[INFO]   getOne.take(10).foreach(println)
[INFO]^

It looks like there are some rows where a JOIN did not occur (no key match
in the joined dataset), but because I can't access line._2._2._1 I don't
know of a way to check for that.  I can access line._2._2 but line._2._2
does not have the length attribute.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Cannot-access-data-after-a-join-error-value-1-is-not-a-member-of-Product-with-Serializable-tp19272.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Cannot access data after a join (error: value _1 is not a member of Product with Serializable)

2014-11-19 Thread Olivier Girardot
Can you please post the full source of your code and some sample data to
run it on?



Re: Cannot access data after a join (error: value _1 is not a member of Product with Serializable)

2014-11-19 Thread Tobias Pfeiffer
Hi,

It looks like the compiler cannot infer that the value you are trying to
use as a Tuple is actually a Tuple. Try adding type declarations and you
may see where things fail.

Tobias
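
To illustrate the suggestion above, here is a minimal sketch. It assumes Scala 2 and that some upstream step returns tuples of different arity from different branches. The thread never shows the actual code, so this cause is an assumption, and the names TupleInferenceSketch, Person, parseLoose and parseStrict are hypothetical. It uses plain collections rather than an RDD to stay self-contained.

```scala
object TupleInferenceSketch {
  // Hypothetical record shape, modeled on the log lines in the question.
  type Person = (String, String, String) // (gender, firstName, lastName)

  // No declared return type: the branches return tuples of different arity,
  // so under Scala 2 the compiler infers their common supertype
  // (Product with Serializable), which has no _1 member.
  def parseLoose(s: String) = {
    val f = s.split(",")
    if (f.length == 3) (f(0), f(1), f(2)) else ("unknown", s)
  }

  // Declared return type: a branch of the wrong arity would now be rejected
  // right here, at the definition, instead of at a later use site.
  def parseStrict(s: String): Person = {
    val f = s.split(",")
    if (f.length == 3) (f(0), f(1), f(2)) else ("unknown", "unknown", s)
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq("Male,John,Smith", "Female,Mary,Williams")
    // rows.map(parseLoose).map(_._1) // does not compile: value _1 is not a member
    rows.map(parseStrict).map(_._1).foreach(println)
  }
}
```

Annotating the functions that build the two datasets before the join, in the same way as parseStrict, should move the compile error to the place where the element type first stops being a tuple.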