Hi,
I have a simple Java program to read data from Kafka using Spark Streaming.
When I run it from Eclipse on my Mac, it connects to ZooKeeper and the
bootstrap nodes, but it does not display any data. It does not give any
error; it just shows
18/01/16 20:49:15 INFO Executor: Finished task
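A hedged PySpark sketch of the usual checklist for this symptom (the original program is Java, but the shape is the same): make sure the stream has an output operation, and start from the earliest offset so an idle topic still yields data. The broker address and topic name below are placeholders, not taken from the original program.

```python
def extract_value(kv):
    # Kafka records arrive as (key, value) pairs; usually we want the value.
    return kv[1]

def main():
    # PySpark imports are inside main() so the sketch can be read without a
    # Spark installation.
    from pyspark import SparkContext
    from pyspark.streaming import StreamingContext
    from pyspark.streaming.kafka import KafkaUtils

    sc = SparkContext(appName="KafkaDebug")
    ssc = StreamingContext(sc, 5)  # 5-second micro-batches

    kafkaParams = {
        "metadata.broker.list": "broker1:9092",  # assumed broker address
        "auto.offset.reset": "smallest",         # read existing messages too
    }
    stream = KafkaUtils.createDirectStream(ssc, ["mytopic"], kafkaParams)

    # Without an output operation such as pprint(), tasks finish but nothing
    # is ever displayed -- which matches the "Finished task" log above.
    stream.map(extract_value).pprint()

    ssc.start()
    ssc.awaitTermination()

if __name__ == "__main__":
    main()
```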
Hi, the source file I have is on my local machine and it is pretty huge, about
150 GB. How do I go about it?
On Sun, Nov 20, 2016 at 8:52 AM, Steve Loughran <ste...@hortonworks.com>
wrote:
>
> On 19 Nov 2016, at 17:21, vr spark <vrspark...@gmail.com> wrote:
>
> Hi,
> I am
Hi,
I am looking for Scala or Python code samples to convert a local TSV file to
an ORC file and store it on distributed cloud storage (OpenStack).
So I need these 3 samples. Please suggest.
1. read TSV
2. convert to ORC
3. store on distributed cloud storage
thanks
VR
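A minimal PySpark sketch of the three steps, assuming Spark 2.x and a Hadoop Swift connector configured for the OpenStack store; the container, provider, and file paths are placeholders.

```python
def swift_path(container, provider, path):
    # Build a swift:// URI of the form swift://CONTAINER.PROVIDER/path,
    # as used by the Hadoop OpenStack Swift connector.
    return "swift://%s.%s/%s" % (container, provider, path.lstrip("/"))

def main():
    # PySpark import is local so the sketch can be read without Spark installed.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("TsvToOrc").getOrCreate()

    # 1. read TSV (a header row is assumed; drop header=True if there is none)
    df = spark.read.csv("file:///data/input.tsv", sep="\t", header=True)

    # 2 + 3. convert to ORC and store on the Swift object store in one write
    df.write.orc(swift_path("mycontainer", "myprovider", "output/orc"))

if __name__ == "__main__":
    main()
```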
Hi,
I have a continuous REST API stream which keeps spitting out data in the form
of JSON.
I access the stream using Python requests.get(url, stream=True,
headers=headers).
I want to receive it in Spark and do further processing. I am not sure
which is the best way to receive it in Spark.
What are
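PySpark has no Python custom receiver, so one common workaround is a small bridge process that forwards each JSON line from the HTTP stream to a local TCP socket, which Spark Streaming then consumes with socketTextStream. This is a sketch under that assumption; the port and the Spark-side wiring are placeholders.

```python
import socket

def frame(line):
    # socketTextStream splits records on newlines, so ensure each JSON
    # record is newline-terminated.
    return line if line.endswith(b"\n") else line + b"\n"

def forward_stream(url, headers, port=9999):
    """Bridge the chunked HTTP stream to a TCP socket for Spark to read."""
    import requests  # the library the original post already uses

    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("localhost", port))
    server.listen(1)
    conn, _ = server.accept()  # wait for the Spark receiver to connect
    with requests.get(url, stream=True, headers=headers) as resp:
        for line in resp.iter_lines():
            if line:
                conn.sendall(frame(line))

# Spark side (separate process), sketched:
#   ssc = StreamingContext(sc, 5)
#   lines = ssc.socketTextStream("localhost", 9999)
#   lines.map(json.loads).pprint()
```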
g-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Sun, Sep 25, 2016 at 4:32 PM, vr spark <vrspark...@gmail.com> wrote:
> > Yes, I have both Spark 1.6 and Spark 2.0.
> > I unset the SPARK_HOME environment variable and pointed spark-submit to
&
Hi,
I use the Scala IDE for Eclipse. I usually run jobs against my local Spark
installation on my Mac, then export the jars, copy them to my company's
Spark cluster, and run spark-submit there.
This works fine.
But I want to run the jobs from the Scala IDE directly against my company's
Spark cluster.
1.
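Conceptually (sketched here in Python; the Scala SparkConf calls are analogous), running from the IDE means the driver starts inside Eclipse, so the configuration has to point at the cluster master and ship the application jar itself, since no spark-submit is involved. All values below are placeholders.

```python
def cluster_conf(master_url, jar_path, driver_host):
    # Settings a driver launched from an IDE typically needs; with
    # spark-submit these would come from the command line instead.
    return {
        "spark.master": master_url,       # e.g. spark://master-host:7077
        "spark.jars": jar_path,           # exported app jar, shipped to executors
        "spark.driver.host": driver_host, # an address the cluster can reach back to
    }

# conf = cluster_conf("spark://master:7077", "target/myapp.jar", "10.0.0.5")
# In Scala: new SparkConf().setMaster(...).setJars(Seq(...))
#                          .set("spark.driver.host", ...)
```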
>
> You've got two Spark runtimes up that may or may not contribute to the
> issue.
>
> Pozdrawiam,
> Jacek Laskowski
>
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark 2.0 http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/ja
Hi,
I have this simple Scala app which works fine when I run it as a Scala
application from the Scala IDE for Eclipse.
But when I export it as a jar and run it from spark-submit, I get the below
error. Please suggest.
*bin/spark-submit --class com.x.y.vr.spark.first.SimpleApp test.jar*
16/09/24
o
raise AnalysisException(s.split(': ', 1)[1], stackTrace)
AnalysisException: u'undefined function json_array_to_map; line 28 pos 73'
On Wed, Aug 17, 2016 at 8:59 AM, vr spark <vrspark...@gmail.com> wrote:
> spark 1.6.1
> python
>
> I0817 08:51:59.099356 15189 detect
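`json_array_to_map` is not a built-in Spark SQL function, so the AnalysisException above usually means a custom UDF (often a Hive UDF registered in the production metastore) is missing in the environment where the query runs. A stand-in Python UDF, with an assumed contract (JSON array of {key, value} objects to a map), might look like:

```python
import json

def json_array_to_map(s):
    # Assumed behavior -- adjust to the real UDF's contract. Parses a JSON
    # array like '[{"key": "a", "value": "1"}]' into {"a": "1"}.
    if s is None:
        return None
    return {d["key"]: d["value"] for d in json.loads(s)}

# Registration in PySpark 1.6, sketched:
#   from pyspark.sql.types import MapType, StringType
#   sqlContext.registerFunction("json_array_to_map", json_array_to_map,
#                               MapType(StringType(), StringType()))
```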
sql ?
>
> On Wed, Aug 17, 2016 at 9:04 AM, vr spark <vrspark...@gmail.com> wrote:
>
>> spark 1.6.1
>> mesos
>> The job runs for about 10-15 minutes, gives this message, and I
>> killed it.
>>
>> In this job, I am creating a data frame from a Hive SQL
W0816 23:17:01.984846 16360 sched.cpp:1195] Attempting to accept an unknown
offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910492
W0816 23:17:01.984987 16360 sched.cpp:1195] Attempting to accept an unknown
offer b859f2f3-7484-482d-8c0d-35bd91c1ad0a-O162910493
W0816 23:17:01.985124 16360
Hi,
I am getting an error in the below scenario. Please suggest.
I have a virtual view in Hive.
view name: log_data
It has 2 columns:
query_map map
parti_date int
Here is the snippet for my Spark data frame:
res=sqlcont.sql("select parti_date FROM
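For a map-typed Hive column, individual entries can be selected with bracket syntax in Spark SQL. A sketch of such a query, where the key name 'event' and the partition value are placeholders (the view and column names follow the post):

```python
# Assumed key name and partition value; only the view/column names come
# from the original post.
QUERY = (
    "SELECT query_map['event'] AS event, parti_date "
    "FROM log_data "
    "WHERE parti_date = 20160816"
)

# res = sqlcont.sql(QUERY)   # sqlcont as in the original snippet
```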
Hi Experts,
Please suggest
On Thu, Aug 11, 2016 at 7:54 AM, vr spark <vrspark...@gmail.com> wrote:
>
> I have data which is json in this format
>
> myList: array
>  |-- elem: struct
>  |    |-- nm: string (nullable = true)
>  |    |-- vL
I have data which is JSON in this format:
myList: array
 |-- elem: struct
 |    |-- nm: string (nullable = true)
 |    |-- vList: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
From my Kafka stream, I created a DataFrame.
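Given the schema above, flattening the nested arrays takes two explodes: one over the outer array of structs, one over the inner vList. A hedged sketch, with a pure-Python picture of the flattening for illustration; 'df' stands for the DataFrame built from the stream.

```python
def flatten(my_list):
    # Pure-Python picture of what the two explodes produce:
    # one (nm, value) row per element of each struct's vList.
    return [(e["nm"], v) for e in my_list for v in e["vList"]]

# DataFrame version, sketched:
#   from pyspark.sql.functions import explode, col
#   flat = (df.select(explode(col("myList")).alias("e"))
#             .select(col("e.nm").alias("nm"),
#                     explode(col("e.vList")).alias("v")))
```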
, 2016 at 12:05 PM, Cody Koeninger <c...@koeninger.org> wrote:
> Have you tried filtering out corrupt records with something along the
> lines of
>
> df.filter(df("_corrupt_record").isNull)
>
> On Tue, Jul 26, 2016 at 1:53 PM, vr spark <vrspark...@gmail.com>
I am reading data from Kafka using Spark Streaming.
I am reading JSON and creating a dataframe.
I am using pyspark.
kvs = KafkaUtils.createDirectStream(ssc, kafkaTopic1, kafkaParams)
lines = kvs.map(lambda x: x[1])
lines.foreachRDD(mReport)
def mReport(clickRDD):
    clickDF =
I am reading data from Kafka using Spark Streaming.
I am reading JSON and creating a dataframe.
kvs = KafkaUtils.createDirectStream(ssc, kafkaTopic1, kafkaParams)
lines = kvs.map(lambda x: x[1])
lines.foreachRDD(mReport)
def mReport(clickRDD):
    clickDF = sqlContext.jsonRDD(clickRDD)
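A hedged sketch of a complete mReport, adding a guard for empty micro-batches (calling jsonRDD on an empty RDD is a common source of errors in this pattern). The aggregation shown is a placeholder for the real processing; sqlContext is the one created in the streaming app.

```python
def mReport(clickRDD):
    if clickRDD.isEmpty():
        # Streaming delivers an (often empty) RDD every batch interval;
        # skip the empty ones.
        return
    clickDF = sqlContext.jsonRDD(clickRDD)  # Spark 1.6 API; sqlContext.read.json(rdd) in 2.x
    clickDF.registerTempTable("clicks")
    sqlContext.sql("SELECT COUNT(*) AS n FROM clicks").show()
```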