> Jacek Laskowski
>
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
>> On Tue, Jul 26, 2016 at 2:39 PM, Mail.com <pradeep.mi...@mail.com> wrote:
>> Mor
>> On Tue, Jul 26, 2016 at 2:18 AM, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi All,
>>
>> I
>>> On Sat, Jul 23, 2016 at 3:11 PM, Mail.com <pradeep.mi...@mail.com> wrote:
>>> Hi All,
>>>
>>> Where should we us spark context stop vs clos
Hi All,
I have a directory with 12 files. I want to read each file whole, so I am
reading the directory with wholeTextFiles(dirpath, numPartitions).
I run spark-submit with --num-executors 12 --executor-cores 1,
and numPartitions set to 12.
However, when I run the job I see that the stage which reads the
Hi All,
Where should we use SparkContext stop() vs. close()? Should we stop the
context first and then close it?
Are there general guidelines around this? When I stop and later try to close,
I get an "RPC already closed" error.
Thanks,
Pradeep
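For context on the stop-vs-close question above, a minimal hedged sketch (the function name `run_job` is mine, not from the thread). In the Java API, `JavaSparkContext` implements `Closeable` and its `close()` simply delegates to `stop()`, so one shutdown call suffices; the PySpark pattern below mirrors that and is guarded so it degrades gracefully where Spark is unavailable.

```python
# Hedged sketch: one shutdown call in a finally block; guarded so the
# snippet still runs where pyspark (or a JVM) is not installed.
try:
    from pyspark import SparkContext
except ImportError:
    SparkContext = None

def run_job():
    if SparkContext is None:
        return "pyspark not available"
    try:
        sc = SparkContext(master="local[1]", appName="stop-vs-close-demo")
    except Exception:
        return "pyspark not available"
    try:
        return sc.parallelize(range(10)).sum()
    finally:
        sc.stop()  # single shutdown call; no separate close() needed
```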
The HBase Spark module will be available with HBase 2.0. Is that out yet?
> On Jul 22, 2016, at 8:50 PM, Def_Os wrote:
>
> So it appears it should be possible to use HBase's new hbase-spark module, if
> you follow this pattern:
>
Hi All,
Can someone please confirm whether the streaming direct approach for reading
Kafka is still experimental, or whether it can be used in production.
I have seen the documentation and the talk from TD suggesting the advantages
of the approach, but the docs state it is an "experimental" feature.
Please suggest.
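For reference on the direct approach discussed above, a hedged sketch of how it is wired up in Spark 1.x (broker and topic names are made up; guarded so it runs without pyspark). The direct, receiver-less approach needs only the broker list, since Spark tracks offsets itself rather than via ZooKeeper.

```python
# Hedged sketch: parameters for the Spark 1.x direct Kafka stream.
def direct_stream_params(brokers, topics):
    # Only the broker list is required; offsets are managed by Spark.
    return {"metadata.broker.list": ",".join(brokers)}, list(topics)

kafka_params, topic_list = direct_stream_params(["broker1:9092"], ["events"])

try:
    from pyspark.streaming.kafka import KafkaUtils  # Spark 1.x API
except ImportError:
    KafkaUtils = None
# With a StreamingContext ssc, one would then call (Spark 1.x):
#   KafkaUtils.createDirectStream(ssc, topic_list, kafka_params)
```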
Hi All,
Can you please advise on best practices for running streaming jobs in
production that read from Kafka?
How do we trigger them (through a start script?), and what are the best ways
to monitor that the application is running and to send an alert when it is
down, etc.?
Thanks,
Pradeep
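On the start-script question above, a hedged sketch of a minimal launcher (class name, jar path, and config values are all made up for illustration). With `dry_run=True` it only returns the command; in production you would execute it and alert on a non-zero exit.

```python
# Hedged sketch: build (and optionally run) the spark-submit command.
import subprocess

def launch(dry_run=True):
    cmd = [
        "spark-submit",
        "--master", "yarn", "--deploy-mode", "cluster",
        "--class", "com.example.StreamingJob",    # hypothetical main class
        "--conf", "spark.yarn.maxAppAttempts=4",  # let YARN retry the driver
        "/apps/streaming-job.jar",                # hypothetical jar path
    ]
    if dry_run:
        return " ".join(cmd)
    return subprocess.call(cmd)  # non-zero exit -> send an alert
```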
beta
> consumer for kafka 0.10
>
>> On Wed, May 25, 2016 at 9:41 PM, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi All,
>>
>> I am connecting Spark 1.6 streaming to Kafka 0.8.2 with Kerberos. I ran
>> spark streaming in debug mode, but do not see any log sa
Hi All,
I am connecting Spark 1.6 streaming to Kafka 0.8.2 with Kerberos. I ran Spark
streaming in debug mode, but do not see any log saying it connected to Kafka,
the topic, etc. How can I enable that?
My Spark streaming job runs, but no messages are fetched from the RDD. Please
suggest.
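One hedged way to surface the Kafka connection attempts asked about above is a log4j override (Spark 1.x ships log4j 1.2; logger names below are the usual Kafka/Spark-Kafka packages, offered as an assumption rather than a verified recipe):

```properties
# Pass via: --driver-java-options "-Dlog4j.configuration=file:log4j.properties"
log4j.logger.org.apache.spark.streaming.kafka=DEBUG
log4j.logger.kafka=DEBUG
log4j.logger.kafka.consumer=DEBUG
```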
Yes.
Sent from my iPhone
> On May 20, 2016, at 10:11 AM, Sahil Sareen <sareen...@gmail.com> wrote:
>
> I'm not sure if this happens on small files or big ones as I have a mix of
> them always.
> Did you see this only for big files?
>
>> On Fri, May 20, 2016 at
Hi Sahil,
I have seen this with high GC time. Do you ever get this error with
small-volume files?
Pradeep
> On May 20, 2016, at 9:32 AM, Sahil Sareen wrote:
>
> Hey all
>
> I'm using Spark-1.6.1 and occasionally seeing executors lost and hurting my
> application
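For the lost-executors/high-GC discussion above, a hedged sketch of the kinds of settings commonly adjusted on Spark 1.x (values are illustrative, not tuned for any particular cluster):

```properties
# spark-defaults.conf style fragment; all values are examples only.
spark.executor.memory                6g
spark.yarn.executor.memoryOverhead   1024   # Spark 1.x off-heap headroom
spark.executor.extraJavaOptions      -XX:+UseG1GC -XX:+PrintGCDetails
```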
I noticed when you specify invalid topic name, KafkaUtils doesn't
> fetch any messages. So, check you have specified the topic name correctly.
>
> ~Muthu
> ____
> From: Mail.com [pradeep.mi...@mail.com]
> Sent: Monday, May 16, 2016 9:33 PM
>
Hi Yogesh,
Can you try a map operation with whatever parser you are using to get what you
need? You could also look at the spark-xml package.
Thanks,
Pradeep
> On May 19, 2016, at 4:39 AM, Yogesh Vyas wrote:
>
> Hi,
> I had xml files which I am reading through textFileStream,
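The "map with whatever parser" suggestion above can be sketched with the stdlib parser on plain strings; in Spark the same function would be passed to `rdd.map()`. The record layout is made up for illustration.

```python
# Hedged sketch: per-record XML parsing, usable as a map function.
import xml.etree.ElementTree as ET

def parse_record(xml_string):
    root = ET.fromstring(xml_string)
    return (root.findtext("id"), root.findtext("value"))

records = ["<event><id>1</id><value>a</value></event>",
           "<event><id>2</id><value>b</value></event>"]
parsed = list(map(parse_record, records))  # rdd.map(parse_record) in Spark
```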
Adding back users.
> On May 18, 2016, at 11:49 AM, Mail.com <pradeep.mi...@mail.com> wrote:
>
> Hi Uladzimir,
>
> I run is as below.
>
> Spark-submit --class com.test --num-executors 4 --executor-cores 5 --queue
> Dev --master yarn-client --driver-memory 512M -
Hi Muthu,
Are you on Spark 1.4.1 and Kafka 0.8.2? I have a similar issue even for simple
string messages.
The console producer and consumer work fine, but Spark always returns an empty
RDD. I am using the receiver-based approach.
Thanks,
Pradeep
> On May 16, 2016, at 8:19 PM, Ramaswamy, Muthuraman
>
>
> http://talebzadehmich.wordpress.com
>
>
>> On 15 May 2016 at 13:19, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi ,
>>
>> I have seen multiple videos on spark tuning which shows how to determine #
>> cores, #executors and memory size of
Hi,
I have seen multiple videos on Spark tuning that show how to determine the
number of cores, the number of executors, and the memory size of the job.
In all that I have seen, it seems each job has to be given the max resources
allowed in the cluster.
How do we factor in the input size as well? I am processing a 1 GB
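One hedged way to factor input size into the sizing question above is to start from one task per HDFS block (the 128 MB block size is an illustrative assumption, not a recommendation):

```python
# Hedged sketch: derive a starting partition count from input size.
def suggest_partitions(input_bytes, block_bytes=128 * 1024 * 1024):
    # one task per HDFS block is a common starting point
    return max(1, -(-input_bytes // block_bytes))  # ceiling division

one_gb = 1024 * 1024 * 1024
parts = suggest_partitions(one_gb)  # 1 GB / 128 MB -> 8 tasks
```

Executor count and cores can then be chosen so that total cores is a small multiple of this task count, rather than defaulting to the cluster maximum.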
Hi All,
I am trying to get Spark 1.4.1 (Java) to work with Kafka 0.8.2 in a
Kerberos-enabled cluster (HDP 2.3.2).
Is there any document I can refer to?
Thanks,
Pradeep
Hi Arun,
Could you try using StAX or JAXB?
Thanks,
Pradeep
> On May 12, 2016, at 8:35 PM, Hyukjin Kwon wrote:
>
> Hi Arunkumar,
>
>
> I guess your records are self-closing ones.
>
> There is an issue open here, https://github.com/databricks/spark-xml/issues/92
>
>
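The thread above suggests StAX/JAXB for Java; as a language-neutral illustration of the same streaming idea, `iterparse` pulls events incrementally instead of loading the whole document, which also handles self-closing records. Element names are made up.

```python
# Hedged sketch: StAX-style incremental parse with the stdlib.
import io
import xml.etree.ElementTree as ET

doc = b"<items><item id='1'/><item id='2'/></items>"  # self-closing records

ids = []
for event, elem in ET.iterparse(io.BytesIO(doc), events=("end",)):
    if elem.tag == "item":
        ids.append(elem.get("id"))
        elem.clear()  # free memory as we go, as a StAX cursor would
```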
t; its supported, try to use coalesce(1) (the spelling is wrong) and after that
> do the partitions.
>
> Regards,
> Gourav
>
>> On Mon, May 9, 2016 at 7:12 PM, Mail.com <pradeep.mi...@mail.com> wrote:
>> Hi,
>>
>> I have to write tab delimited file an
Hi,
I have to write a tab-delimited file and need one directory for each unique
value of a column.
I tried spark-csv with partitionBy, and it seems that is not supported. Is
there any other option for doing this?
Regards,
Pradeep
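A hedged sketch of the fallback when the writer doesn't support partitionBy: group rows by the column value yourself and write one tab-delimited file per directory. In Spark, the analogous trick is to filter per key and save each subset separately; the directory naming here is my own convention.

```python
# Hedged sketch: manual "partitionBy" for tab-delimited output.
import os
import tempfile

def write_partitioned_tsv(rows, key_index, out_dir):
    groups = {}
    for row in rows:
        groups.setdefault(row[key_index], []).append(row)
    for key, group in groups.items():
        part_dir = os.path.join(out_dir, "col=%s" % key)  # one dir per value
        os.makedirs(part_dir)
        with open(os.path.join(part_dir, "part-00000.tsv"), "w") as f:
            for row in group:
                f.write("\t".join(row) + "\n")
    return sorted(groups)

out = tempfile.mkdtemp()
keys = write_partitioned_tsv([("a", "1"), ("b", "2"), ("a", "3")], 0, out)
```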
Can you try creating your own schema and using it to read the XML?
I had a similar issue but resolved it with a custom schema that specifies each
attribute.
Pradeep
> On May 1, 2016, at 9:45 AM, Hyukjin Kwon wrote:
>
> To be more clear,
>
> If
hat. I don’t think
> wholeTextFile is designed for that.
>
> - Harjit
>> On Apr 26, 2016, at 7:19 PM, Mail.com <pradeep.mi...@mail.com> wrote:
>>
>>
>> Hi All,
>> I am reading entire directory of gz XML files with wholeTextFiles.
>>
>
Hi All,
I am reading an entire directory of gzipped XML files with wholeTextFiles.
I understand that, since they are gzipped and read with wholeTextFiles, the
individual files are not splittable, but why is the entire directory read by
one executor in a single task? I have provided the number of executors as the
number of files in that
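Related to the wholeTextFiles question above, a hedged sketch: wholeTextFiles packs whole files into partitions by total size, so many small gz files can land in few partitions, and an explicit repartition after the read spreads the downstream parse work. Guarded so it still runs without pyspark.

```python
# Hedged sketch: spread wholeTextFiles output across n partitions.
import os
import tempfile

try:
    from pyspark import SparkContext
except ImportError:
    SparkContext = None

def read_spread(dirpath, n):
    if SparkContext is None:
        return "pyspark not available"
    try:
        sc = SparkContext(master="local[2]", appName="wholeTextFiles-demo")
    except Exception:
        return "pyspark not available"
    try:
        # one (filename, content) pair per file; then force n partitions
        rdd = sc.wholeTextFiles(dirpath, minPartitions=n).repartition(n)
        return rdd.getNumPartitions()
    finally:
        sc.stop()

d = tempfile.mkdtemp()
for i in range(3):
    with open(os.path.join(d, "f%d.txt" % i), "w") as f:
        f.write("x")
result = read_spread(d, 3)
```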
I have a DataFrame and need to write it to a tab-separated file using Spark
1.4 and Java.
Can someone please suggest?
Thanks,
Pradeep
I get an error with a message that states the max number of cores allowed.
> On Apr 20, 2016, at 11:21 AM, Shushant Arora
> wrote:
>
> I am running a spark application on yarn cluster.
>
> say I have available vcors in cluster as 100.And I start spark
You might look at using JAXB or StAX. If it is simple enough, use DataFrames'
auto-generated schema.
Pradeep
> On Apr 18, 2016, at 6:37 PM, Jinan Alhajjaj wrote:
>
> Thank you for your help.
> I would like to parse the XML file using Java not scala . Can you please