Running FPGrowth over a JavaPairRDD?

2015-10-29 Thread Fernando Paladini
aRDD().collect()) { System.out.println("[" + itemset.javaItems() + "], " + itemset.freq());} But then I got: *The method run(JavaRDD) in the type FPGrowth is not applicable for the arguments (JavaPairRDD<Long,List>)* *What can I do to solve my problem (running FPGrowth over a JavaPairRDD)?* I can provide more information; just tell me exactly what you need. Thank you! Fernando Paladini

Re: "Method json([class java.util.HashMap]) does not exist" when reading JSON on PySpark

2015-10-05 Thread Fernando Paladini
ll most often
>> fail.
>>
>> Thanks
>> Best Regards
>>
>> On Tue, Sep 29, 2015 at 11:07 AM, Fernando Paladini <fnpalad...@gmail.com> wrote:
>>
>>> Hello guys,
>>>
>>> I'm very new to Spark and I'm having

Re: "Method json([class java.util.HashMap]) does not exist" when reading JSON on PySpark

2015-10-05 Thread Fernando Paladini
other useless debug information): Is that correct for the given JSON input <https://gist.github.com/paladini/27bb5636d91dec79bd56> (gist link above)? How can I test whether Spark understands this DataFrame and can perform complex manipulations on it? Thank you! Hope you can help me soon :3 Fern

"Method json([class java.util.HashMap]) does not exist" when reading JSON

2015-09-29 Thread Fernando Paladini
Hello guys, I'm very new to Spark and I'm having some trouble reading JSON into a DataFrame in PySpark. I'm getting a JSON object from an API response and I would like to store it in Spark as a DataFrame (I've read that DataFrame is better than RDD; is that accurate?). From what I've read

Re: "Method json([class java.util.HashMap]) does not exist" when reading JSON

2015-09-29 Thread Fernando Paladini
!

2015-09-29 17:14 GMT-03:00 Fernando Paladini <fnpalad...@gmail.com>:
> Of course, I didn't see that Gmail was only sending it to you. Sorry :/
>
> 2015-09-29 17:13 GMT-03:00 Ted Yu <yuzhih...@gmail.com>:
>
>> For further analysis, can you post your most recent qu

Fwd: "Method json([class java.util.HashMap]) does not exist" when reading JSON on PySpark

2015-09-28 Thread Fernando Paladini
4j.commands.CallCommand.execute(CallCommand.java:79) at py4j.GatewayConnection.run(GatewayConnection.java:207) at java.lang.Thread.run(Thread.java:745) *What am I doing wrong?* Check out this gist <https://gist.github.com/paladini/2e2ea913d545a407b842> to see the JSON I'm trying to load. Thanks! Fernando Paladini
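
Editor's note: the error in the thread titles above, "Method json([class java.util.HashMap]) does not exist", suggests that a Python dict (which py4j converts to a java.util.HashMap) was passed to the JSON reader. The snippets are truncated, so this is an assumption about the cause, not a confirmed diagnosis. Under that assumption, a minimal sketch of one workaround is to serialize the parsed response back into JSON strings first. The helper name to_json_records and the sample payload are hypothetical, and the Spark calls are left as comments because they need a running SparkContext.

```python
import json


def to_json_records(payload):
    """Serialize a parsed API response into JSON strings, one per record.

    A single dict becomes a one-element list; a list of dicts yields
    one JSON string per element.
    """
    records = payload if isinstance(payload, list) else [payload]
    return [json.dumps(record) for record in records]


# Hypothetical stand-in for a parsed API response (not the data in the gist).
response = {"id": 1, "tags": ["a", "b"]}
json_records = to_json_records(response)
print(json_records)

# Assumption: with a live SparkContext `sc` and SQLContext `sqlContext`,
# the strings can be distributed and parsed instead of passing the dict:
#   rdd = sc.parallelize(json_records)
#   df = sqlContext.jsonRDD(rdd)      # Spark 1.x API
#   df = sqlContext.read.json(rdd)    # later releases accept an RDD here
#   df.printSchema()
```

Serializing one JSON string per record matches the reader's expectation of one JSON object per line or per RDD element, which is why a list response is split instead of being dumped as a single array.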