Re: RDD to DataFrame question with JsValue in the mix

2016-07-01 Thread Dood
On 7/1/2016 6:42 AM, Akhil Das wrote: case class Holder(str: String, js:JsValue) Hello, Thanks! I tried that before posting the question to the list but I keep getting an error such as this even after the map() operation to convert (String,JsValue) -> Holder and then toDF(). I am simply

RDD to DataFrame question with JsValue in the mix

2016-06-30 Thread Dood
Hello, I have an RDD[(String,JsValue)] that I want to convert into a DataFrame and then run SQL on. What is the easiest way to get the JSON (in the form of JsValue) "understood" by the process? Thanks!

Re: Silly Question on my part...

2016-05-17 Thread Dood
On 5/16/2016 12:12 PM, Michael Segel wrote: For one use case, we were considering using the Thrift server as a way to allow multiple clients to access shared RDDs. Within the Thrift context, we create an RDD and expose it as a Hive table. The question is… where does the RDD exist? On the

Re: Structured Streaming in Spark 2.0 and DStreams

2016-05-16 Thread Dood
On 5/16/2016 9:53 AM, Yuval Itzchakov wrote: AFAIK, the underlying data represented under the Dataset[T] abstraction will be formatted in Tachyon under the hood, but as with RDDs, it will be spilled to local disk on the worker if needed. There is another option in case of

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:52 AM, Xinh Huynh wrote: I just went to IRC. It looks like the correct channel is #apache-spark. So, is this an "official" chat room for Spark? Ah yes, my apologies, it is #apache-spark indeed. Not sure if there is an official channel on IRC for spark :-)

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:30 AM, Paweł Szulc wrote: Just realized that people have to be invited to this thing. You see, that's why Gitter is just simpler. I will try to figure it out ASAP. You don't need invitations to IRC and it has been around for decades. You can just go to webchat.freenode.net

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 6:40 AM, Paweł Szulc wrote: I've just created this https://apache-spark.slack.com for ad-hoc communications within the community. Everybody's welcome! Why not just IRC? Slack is yet another place to create an account etc. - IRC is much easier. What does Slack give you that's so

Re: Tracking / estimating job progress

2016-05-13 Thread Dood
On 5/13/2016 10:39 AM, Anthony May wrote: It looks like it might only be available via REST, http://spark.apache.org/docs/latest/monitoring.html#rest-api Nice, thanks! On Fri, 13 May 2016 at 11:24 Dood@ODDO <oddodao...@gmail.com> wrote: On
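For readers following this thread, here is a minimal sketch of how the monitoring REST API mentioned above could be used to estimate progress. It is not taken from the thread. It assumes the job entries returned by /api/v1/applications/[app-id]/jobs carry the numTasks and numCompletedTasks counters, and that the driver UI is reachable at a base URL such as http://localhost:4040 (an assumption; adjust for your deployment). The helper names and the sample payload are hypothetical. Counting finished tasks is only a rough proxy: tasks differ in duration, and jobs that have not been submitted yet are not visible.

```python
import json
import urllib.request


def fetch_jobs(base_url, app_id):
    """Fetch the job list for one application from the monitoring REST API.

    base_url is an assumption, e.g. "http://localhost:4040" for a running driver.
    """
    url = "{}/api/v1/applications/{}/jobs".format(base_url, app_id)
    with urllib.request.urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def job_progress(job):
    """Fraction of a single job's tasks that have completed (0.0 to 1.0)."""
    total = job.get("numTasks", 0)
    if total <= 0:
        return 0.0
    return min(job.get("numCompletedTasks", 0) / total, 1.0)


def application_progress(jobs):
    """Task-weighted completion fraction over all jobs seen so far."""
    total = sum(j.get("numTasks", 0) for j in jobs)
    if total <= 0:
        return 0.0
    done = sum(j.get("numCompletedTasks", 0) for j in jobs)
    return min(done / total, 1.0)


# Hypothetical payload, shaped like a trimmed-down /jobs response.
SAMPLE_JOBS = json.loads("""
[
  {"jobId": 0, "status": "SUCCEEDED", "numTasks": 8,  "numCompletedTasks": 8},
  {"jobId": 1, "status": "RUNNING",   "numTasks": 24, "numCompletedTasks": 6}
]
""")

if __name__ == "__main__":
    for job in SAMPLE_JOBS:
        print(job["jobId"], job["status"], job_progress(job))
    print("application:", application_progress(SAMPLE_JOBS))
```

A service that launches jobs could poll fetch_jobs on a timer and report application_progress alongside the job statuses, treating the number as an estimate, not a guarantee.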

Re: Tracking / estimating job progress

2016-05-13 Thread Dood
hih...@gmail.com> wrote: Have you looked at core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala ? Cheers On Fri, May 13, 2016 at 10:05 AM, Dood@ODDO <oddodao...@gmail.com> w

Tracking / estimating job progress

2016-05-13 Thread Dood
I provide a RESTful API interface from scalatra for launching Spark jobs - part of the functionality is tracking these jobs. What API is available to track the progress of a particular spark application? How about estimating where in the total job progress the job is? Thanks!

Re: Confused - returning RDDs from functions

2016-05-13 Thread Dood
RDD, I get an empty Map(). If I copy/paste this code into the caller, I get the properly filled in Map. I am fairly new to Spark and Scala so excuse any inefficiencies - my priority was to be able to solve the problem in an obvious and correct way and worry about making it p

Confused - returning RDDs from functions

2016-05-12 Thread Dood
Hello all, I have been programming for years but this has me baffled. I have an RDD[(String,Int)] that I return from a function after extensive manipulation of an initial RDD of a different type. When I return this RDD and initiate the .collectAsMap() on it from the caller, I get an empty