Re: Apache Spark Slack

2016-05-16 Thread Matei Zaharia
happen on the dev mailing list and JIRA so that they can easily be archived and found afterward. Matei > On May 16, 2016, at 1:06 PM, Dood@ODDO wrote: > > On 5/16/2016 9:52 AM, Xinh Huynh wrote: >> I just went to IRC. It looks like the correct channel is #apache-spark. >> So, is th

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:52 AM, Xinh Huynh wrote: I just went to IRC. It looks like the correct channel is #apache-spark. So, is this an "official" chat room for Spark? Ah yes, my apologies, it is #apache-spark indeed. Not sure if there is an official channel on IRC

Re: Apache Spark Slack

2016-05-16 Thread Xinh Huynh
I just went to IRC. It looks like the correct channel is #apache-spark. So, is this an "official" chat room for Spark? Xinh On Mon, May 16, 2016 at 9:35 AM, Dood@ODDO wrote: > On 5/16/2016 9:30 AM, Paweł Szulc wrote: > >> >> Just realized that people have to be in

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 9:30 AM, Paweł Szulc wrote: Just realized that people have to be invited to this thing. You see, that's why Gitter is just simpler. I will try to figure it out ASAP You don't need invitations to IRC and it has been around for decades. You can just go to webchat.freenode.net

Re: Apache Spark Slack

2016-05-16 Thread Paweł Szulc
Just realized that people have to be invited to this thing. You see, that's why Gitter is just simpler. I will try to figure it out ASAP On May 16, 2016, at 15:40, "Paweł Szulc" wrote: > I've just created this https://apache-spark.slack.com for ad-hoc > communications within the community. > > Everyb

Re: Apache Spark Slack

2016-05-16 Thread Dood
On 5/16/2016 6:40 AM, Paweł Szulc wrote: I've just created this https://apache-spark.slack.com for ad-hoc communications within the community. Everybody's welcome! Why not just IRC? Slack is yet another place to create an account etc. - IRC is much easier. What does Slack give you that's so v

Re: apache spark on gitter?

2016-05-16 Thread Sean Owen
to Apache itself, even as it is possible to make it clear it does refer to Apache Spark. Since this has come up in recent memory, I have a good link handy for the interested: http://www.apache.org/foundation/marks/ On Mon, May 16, 2016 at 2:41 PM, Paweł Szulc wrote: > I've just

Apache Spark Slack

2016-05-16 Thread Paweł Szulc
I've just created this https://apache-spark.slack.com for ad-hoc communications within the community. Everybody's welcome! -- Regards, Paul Szulc twitter: @rabbitonweb blog: www.rabbitonweb.com

Re: apache spark on gitter?

2016-05-16 Thread Paweł Szulc
I've just created https://apache-spark.slack.com On Thu, May 12, 2016 at 9:28 AM, Paweł Szulc wrote: > Hi, > > well I guess the advantage of gitter over a mailing list is the same as with > IRC. It's not actually a replacement, because the mailing list is also important. > But it is a lot easier to build a c

Re: apache spark on gitter?

2016-05-12 Thread Xinh Huynh
I agree that it can help build a community and be a place for real-time conversations. Xinh On Thu, May 12, 2016 at 12:28 AM, Paweł Szulc wrote: > Hi, > > well I guess the advantage of gitter over a mailing list is the same as with > IRC. It's not actually a replacement, because the mailing list is also i

Re: apache spark on gitter?

2016-05-12 Thread Paweł Szulc
Hi, well I guess the advantage of gitter over a mailing list is the same as with IRC. It's not actually a replacement, because the mailing list is also important. But it is a lot easier to build a community around a tool with the ad-hoc ability to connect with each other. I have gitter running constantly, I visi

Re: apache spark on gitter?

2016-05-11 Thread Xinh Huynh
Hi Pawel, I'd like to hear more about your idea. Could you explain more why you would like to have a gitter channel? What are the advantages over a mailing list (like this one)? Have you had good experiences using gitter on other open source projects? Xinh On Wed, May 11, 2016 at 11:10 AM, Sean

Re: apache spark on gitter?

2016-05-11 Thread Sean Owen
I don't know of a gitter channel and I don't use it myself, FWIW. I think anyone's welcome to start one. I hesitate to recommend this, simply because it's preferable to have one place for discussion rather than split it over several, and, we have to keep the @spark.apache.org mailing lists as the

Re: apache spark on gitter?

2016-05-11 Thread Paweł Szulc
no answer, but maybe one more time, a gitter channel for spark users would be a good idea! On Mon, May 9, 2016 at 1:45 PM, Paweł Szulc wrote: > Hi, > > I was wondering - why Spark does not have a gitter channel? > > -- > Regards, > Paul Szulc > > twitter: @rabbitonweb > blog: www.rabbitonweb.com

apache spark on gitter?

2016-05-09 Thread Paweł Szulc
Hi, I was wondering - why Spark does not have a gitter channel? -- Regards, Paul Szulc twitter: @rabbitonweb blog: www.rabbitonweb.com

Re: Kafka exception in Apache Spark

2016-04-26 Thread Cody Koeninger
That error indicates a message bigger than the buffer's capacity https://issues.apache.org/jira/browse/KAFKA-1196 On Tue, Apr 26, 2016 at 3:07 AM, Michel Hubert wrote: > Hi, > > > > > > I use a Kafka direct stream approach. > > My Spark application was running ok. > > This morning we upgraded t
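Cody's pointer above usually translates into raising the consumer's fetch buffer. A minimal sketch of the relevant setting, assuming the 0.8-era Kafka consumer configs used by the direct stream approach (broker addresses below are hypothetical):

```python
# Sketch only: for KAFKA-1196-style failures, a message exceeded the consumer
# fetch buffer, so raise fetch.message.max.bytes above the largest message the
# topic can contain (and keep it <= the broker-side message.max.bytes).
kafka_params = {
    "metadata.broker.list": "broker1:9092,broker2:9092",  # hypothetical brokers
    # Default is 1 MiB; a message larger than this triggers the buffer error.
    "fetch.message.max.bytes": str(10 * 1024 * 1024),  # allow up to 10 MiB
}

# In PySpark this dict would be handed to the direct stream, e.g.
# KafkaUtils.createDirectStream(ssc, ["mytopic"], kafka_params)
```

The value has to cover the biggest message the topic can hold; setting it per-stream avoids touching broker configuration.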

RE: Kafka exception in Apache Spark

2016-04-26 Thread Michel Hubert
This is production. From: Mich Talebzadeh [mailto:mich.talebza...@gmail.com] Sent: Tuesday, April 26, 2016 12:01 To: Michel Hubert CC: user@spark.apache.org Subject: Re: Kafka exception in Apache Spark Hi Michael, Is this production or test? Dr Mich Talebzadeh LinkedIn https

Re: Kafka exception in Apache Spark

2016-04-26 Thread Mich Talebzadeh
Hi Michael, Is this production or test? Dr Mich Talebzadeh LinkedIn https://www.linkedin.com/profile/view?id=AAEWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw http://talebzadehmich.wordpress.com On 26 April 2016

Kafka exception in Apache Spark

2016-04-26 Thread Michel Hubert
Hi, I use a Kafka direct stream approach. My Spark application was running OK. This morning we upgraded to CDH 5.7.0, and when I re-started my Spark application I get exceptions. It seems to be a problem with the direct stream approach. Any ideas how to fix this? User class threw exception: org.apac

Run Apache Spark on EMR

2016-04-22 Thread Jinan Alhajjaj
Hi All, I would like to ask two things and I would really appreciate an answer ASAP. 1. How do I implement parallelism in an Apache Spark Java application? 2. How do I run the Spark application on Amazon EMR?

Apache Spark-Get All Field Names From Nested Arbitrary JSON Files

2016-03-31 Thread John Radin
and nested) eventually to Parquet in a pipeline. I am wondering if there is a way to get the superset of field names I need for this use case staying in Apache Spark instead of Hadoop MR in a reasonable fashion? I think Apache Arrow under development might be able to help avoid this by treating

Re: apache spark errors

2016-03-24 Thread Max Schmidt
d.take(top)) { jedis …} > > > > May this have resulted in a memory leak? > > > > *From:* Ted Yu [mailto:yuzhih...@gmail.com] > *Sent:* Thursday, March 24, 2016 15:15 > *To:* Michel Hubert > *CC:* user@spark.apache.org > *Subject:* Re

Re: apache spark errors

2016-03-24 Thread Ted Yu
> } > > > > May this have resulted in a memory leak? > > > > *From:* Ted Yu [mailto:yuzhih...@gmail.com] > *Sent:* Thursday, March 24, 2016 15:15 > > *To:* Michel Hubert > *CC:* user@spark.apache.org > *Subject:* Re: apache spark errors > > > >

RE: apache spark errors

2016-03-24 Thread Michel Hubert
ert CC: user@spark.apache.org Subject: Re: apache spark errors Do you have the history server enabled? Posting your code snippet would help us understand your use case (and reproduce the leak). Thanks On Thu, Mar 24, 2016 at 6:40 AM, Michel Hubert mailto:mich...@pha

Re: apache spark errors

2016-03-24 Thread Ted Yu
2.8.0 > jar > compile > > > > > > > > How can I look at those tasks? > > > > *From:* Ted Yu [mailto:yuzhih...@gmail.com] > *Sent:* Thursday, March 24, 2016 14:33 > *To:* Michel Hubert > *CC:* user@spark.apache.org > *

RE: apache spark errors

2016-03-24 Thread Michel Hubert
Yu [mailto:yuzhih...@gmail.com] Sent: Thursday, March 24, 2016 14:33 To: Michel Hubert CC: user@spark.apache.org Subject: Re: apache spark errors Which release of Spark are you using? Have you looked at the tasks whose IDs were printed to see if there was more of a clue? Thanks On Thu, Mar 24

Re: apache spark errors

2016-03-24 Thread Ted Yu
Which release of Spark are you using? Have you looked at the tasks whose IDs were printed to see if there was more of a clue? Thanks On Thu, Mar 24, 2016 at 6:12 AM, Michel Hubert wrote: > Hi, > > > > I constantly get these errors: > > > > 0 [Executor task launch worker-15] ERROR > org.apache.spa

apache spark errors

2016-03-24 Thread Michel Hubert
Hi, I constantly get these errors: 0 [Executor task launch worker-15] ERROR org.apache.spark.executor.Executor - Managed memory leak detected; size = 6564500 bytes, TID = 38969 310002 [Executor task launch worker-12] ERROR org.apache.spark.executor.Executor - Managed memory leak detected;

Re: Apache Spark Exception in thread “main” java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class

2016-03-19 Thread Josh Rosen
See the instructions in the Spark documentation: https://spark.apache.org/docs/latest/building-spark.html#building-for-scala-211 On Wed, Mar 16, 2016 at 7:05 PM satyajit vegesna wrote: > > > Hi, > > Scala version:2.11.7(had to upgrade the scala verison to enable case > clasess to accept more tha

Fwd: Apache Spark Exception in thread “main” java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class

2016-03-19 Thread satyajit vegesna
Hi, Scala version: 2.11.7 (had to upgrade the Scala version to enable case classes to accept more than 22 parameters.) Spark version: 1.6.1. PFB pom.xml. Getting the error below when trying to set up Spark in the IntelliJ IDE, 16/03/16 18:36:44 INFO spark.SparkContext: Running Spark version 1.6.1 Exception

Re: Apache Spark Exception in thread “main” java.lang.NoClassDefFoundError: scala/collection/GenTraversableOnce$class

2016-03-19 Thread Josh Rosen
Err, whoops, looks like this is a user app and not building Spark itself, so you'll have to change your deps to use the 2.11 versions of Spark. e.g. spark-streaming_2.10 -> spark-streaming_2.11. On Wed, Mar 16, 2016 at 7:07 PM Josh Rosen wrote: > See the instructions in the Spark documentation:
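As a hedged illustration of Josh's suggestion, a Maven dependency for a Scala 2.11 build of a Spark artifact might look like this (version taken from the thread; adjust to your cluster):

```xml
<!-- Sketch: when the cluster runs a Scala 2.11 build of Spark, every Spark
     artifact needs the _2.11 suffix; mixing _2.10 jars with 2.11 Scala is a
     typical cause of NoClassDefFoundError: scala/collection/GenTraversableOnce$class. -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-streaming_2.11</artifactId>
  <version>1.6.1</version>
</dependency>
```

The same suffix change applies to spark-core, spark-sql, and any other Spark module in the pom.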

Re: please add Christchurch Apache Spark Meetup Group

2016-03-02 Thread Sean Owen
(I have the site's svn repo handy, so I just added it.) On Wed, Mar 2, 2016 at 5:16 PM, Raazesh Sainudiin wrote: > Hi, > > Please add Christchurch Apache Spark Meetup Group to the community list > here: > http://spark.apache.org/community.html > > Our Meetup URI i

please add Christchurch Apache Spark Meetup Group

2016-03-02 Thread Raazesh Sainudiin
Hi, Please add Christchurch Apache Spark Meetup Group to the community list here: http://spark.apache.org/community.html Our Meetup URI is: http://www.meetup.com/Christchurch-Apache-Spark-Meetup/ Thanks, Raaz

Re: Calculation of histogram bins and frequency in Apache spark 1.6

2016-02-25 Thread Yanbo Liang
Actually Spark SQL `groupBy` with `count` can get frequency in each bin. You can also try with DataFrameStatFunctions.freqItems() to get the frequent items for columns. Thanks Yanbo 2016-02-24 1:21 GMT+08:00 Burak Yavuz : > You could use the Bucketizer transformer in Spark ML. > > Best, > Burak

Re: Calculation of histogram bins and frequency in Apache spark 1.6

2016-02-23 Thread Burak Yavuz
You could use the Bucketizer transformer in Spark ML. Best, Burak On Tue, Feb 23, 2016 at 9:13 AM, Arunkumar Pillai wrote: > Hi > Is there any predefined method to calculate histogram bins and frequency > in spark. Currently I take range and find bins then count frequency using > SQL query. > >

Calculation of histogram bins and frequency in Apache spark 1.6

2016-02-23 Thread Arunkumar Pillai
Hi Is there any predefined method to calculate histogram bins and frequency in spark. Currently I take range and find bins then count frequency using SQL query. Is there any better way
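For what it's worth, Spark's RDD API does ship a histogram helper (`DoubleRDDFunctions.histogram` on the Scala side, `RDD.histogram` in PySpark). A pure-Python sketch of the same even-width binning, for illustration only (not Spark's implementation):

```python
# Sketch of RDD.histogram(numBins): evenly spaced bucket edges between min
# and max, then one count per bucket; the last bucket is closed on both ends.
def histogram(values, num_bins):
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    edges = [lo + i * width for i in range(num_bins)] + [hi]
    counts = [0] * num_bins
    for v in values:
        # clamp so hi itself falls into the last bucket, as Spark does
        i = min(int((v - lo) / width), num_bins - 1) if width else 0
        counts[i] += 1
    return edges, counts

edges, counts = histogram([1, 2, 2, 3, 9], 2)
```

On an RDD, `sc.parallelize(values).histogram(num_bins)` returns the same (edges, counts) pair without a hand-written SQL query.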

[OT] Apache Spark Jobs in Kochi, India

2016-02-11 Thread Andrew Holway
Hello, I'm not sure how appropriate job postings are to a user group. We're getting deep into spark and are looking for some talent in our Kochi office. http://bit.ly/Spark-Eng - Apache Spark Engineer / Architect - Kochi http://bit.ly/Spark-Dev - Lead Apache Spark Developer - Kochi

Re: Apache Spark data locality when integrating with Kafka

2016-02-07 Thread Diwakar Dhanuskodi
الليثي Date:08/02/2016 02:07 (GMT+05:30) To: Diwakar Dhanuskodi Cc: "Yuval.Itzchakov" , user Subject: Re: Apache Spark data locality when integrating with Kafka Diwakar We have our own servers. We will not use any cloud service like Amazon's On 7 February 2016 at 18:24, Diw

Re: Apache Spark data locality when integrating with Kafka

2016-02-07 Thread أنس الليثي
iwakar . > > > > Sent from Samsung Mobile. > > > Original message > From: "Yuval.Itzchakov" > Date:07/02/2016 19:38 (GMT+05:30) > To: user@spark.apache.org > Cc: > Subject: Re: Apache Spark data locality when integrating with Kafka

Re: Apache Spark data locality when integrating with Kafka

2016-02-07 Thread Diwakar Dhanuskodi
Fanoos, Where do you want the solution to be deployed? On premise or in the cloud? Regards Diwakar. Sent from Samsung Mobile. Original message From: "Yuval.Itzchakov" Date: 07/02/2016 19:38 (GMT+05:30) To: user@spark.apache.org Cc: Subject: Re: Apache Spark dat

Re: Apache Spark data locality when integrating with Kafka

2016-02-07 Thread Yuval.Itzchakov
two clusters so you can, again, benefit from low IO latency and high throughput. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-data-locality-when-integrating-with-Kafka-tp26165p26170.html Sent from the Apache Spark User List mail

Re: Apache Spark data locality when integrating with Kafka

2016-02-06 Thread Koert Kuipers
Yes. To reduce network latency. > > > Sent from Samsung Mobile. > > > Original message > From: fanooos > Date: 07/02/2016 09:24 (GMT+05:30) > To: user@spark.apache.org > Cc: > Subject: Apache Spark data locality when integrating with Ka

RE: Apache Spark data locality when integrating with Kafka

2016-02-06 Thread Diwakar Dhanuskodi
Yes. To reduce network latency. Sent from Samsung Mobile. Original message From: fanooos Date: 07/02/2016 09:24 (GMT+05:30) To: user@spark.apache.org Cc: Subject: Apache Spark data locality when integrating with Kafka Dears If I will use Kafka as a streaming source

Apache Spark data locality when integrating with Kafka

2016-02-06 Thread fanooos
Dears, If I use Kafka as a streaming source for some Spark jobs, is it advisable to install Spark on the same nodes as the Kafka cluster? What are the benefits and drawbacks of such a decision? regards -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com

Re: How to edit/delete a message posted in Apache Spark User List?

2016-02-05 Thread Luciano Resende
Please see http://www.apache.org/foundation/public-archives.html On Fri, Feb 5, 2016 at 9:35 AM, SRK wrote: > Hi, > > How do I edit/delete a message posted in Apache Spark User List? > > Thanks! > > > > -- > View this message in context: > http://apache-spar

How to edit/delete a message posted in Apache Spark User List?

2016-02-05 Thread SRK
Hi, How do I edit/delete a message posted in Apache Spark User List? Thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/How-to-edit-delete-a-message-posted-in-Apache-Spark-User-List-tp26160.html Sent from the Apache Spark User List mailing list

[Spark 1.6] Univariate Stats using apache spark

2016-02-04 Thread Arunkumar Pillai
Hi, Currently, after creating a dataframe I'm querying it to get max, min, and mean: sqlContext.sql("SELECT MAX(variablesArray) FROM " + tableName) Is this an optimized way? I'm not able to find all the stats like min, max, mean, variance, skewness, and kurtosis directly from a dataframe. Please help
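As a rough illustration (pure Python, not the DataFrame API), the moment-based stats the post asks about reduce to a few sums over the column; treat this as a sketch of the formulas rather than Spark's implementation:

```python
import math

# Sketch of the univariate stats in question. In Spark these map to SQL
# aggregates (MIN, MAX, AVG, VARIANCE) plus moment-based skewness/kurtosis
# computed from the same central moments shown here.
def univariate_stats(xs):
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n   # population variance
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    sd = math.sqrt(m2)
    return {
        "min": min(xs), "max": max(xs), "mean": mean,
        "variance": m2,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / m2 ** 2 - 3,  # excess kurtosis
    }

stats = univariate_stats([1.0, 2.0, 3.0, 4.0])
```

Because every term is a sum, each statistic is a single aggregation pass over the dataframe, so computing them all in one job is cheap.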

Re: Apache spark certification pass percentage ?

2015-12-22 Thread Yash Sharma
, "kali.tumm...@gmail.com" wrote: > Hi All, > > Does anyone know pass percentage for Apache spark certification exam ? > > Thanks > Sri > > > > -- > View this message in context: > http://apache-spark-user-list.1001560.n3.nabble.com/Apache-spark-cer

Apache spark certification pass percentage ?

2015-12-22 Thread kali.tumm...@gmail.com
Hi All, Does anyone know the pass percentage for the Apache Spark certification exam? Thanks Sri -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-spark-certification-pass-percentage-tp25761.html Sent from the Apache Spark User List mailing list archive

How to implement statemachine functionality in apache-spark by python

2015-12-21 Thread Esa Heikkinen
Hi, I am a newbie with Apache Spark and I would like to find good example Python code showing how to implement (finite) state machine functionality in Spark. I am trying to read many different log files to find certain events in a specific order. Is this possible or even impossible? Or is that only
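A minimal sketch of the idea (pure Python, no Spark): a finite state machine that advances through an expected sequence of events as log lines stream past. The event strings and log lines below are hypothetical; in Spark this could run per file or per partition (e.g. via mapPartitions) as long as line order is preserved within each file:

```python
# Sketch: advance the state index only when the current line contains the
# event we are waiting for; the sequence matches iff we reach the final state.
def matches_sequence(lines, events):
    state = 0  # index of the next event we expect
    for line in lines:
        if state < len(events) and events[state] in line:
            state += 1
    return state == len(events)

log = ["boot ok", "user login", "query start", "query done"]
found = matches_sequence(log, ["login", "query start", "query done"])
```

Lines that don't match the awaited event are simply skipped, so interleaved noise between the events of interest is fine; out-of-order events are not.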

Re: seriazable error in apache spark job

2015-12-18 Thread Shixiong Zhu
:1508) > ~[na:1.7.0_79] > at > > java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431) > ~[na:1.7.0_79] > at > java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177) > ~[na:1.7.0_79] > > > > -- > View this mess

seriazable error in apache spark job

2015-12-17 Thread Pankaj Narang
(ObjectOutputStream.java:1508) ~[na:1.7.0_79] at java.io.ObjectOutputStream.writeOrdinaryObject(ObjectOutputStream.java:1431) ~[na:1.7.0_79] at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1177) ~[na:1.7.0_79] -- View this message in context: http://apache-spark-user-list

Re: GLM in apache spark in MLlib

2015-12-10 Thread Yanbo Liang
continue to improve GLMs. Yanbo 2015-12-10 14:54 GMT+08:00 Arunkumar Pillai : > Hi > > I've started using Apache Spark version 1.5.2. I'm able to see GLM using > SparkR but it is not there in MLlib. Are there any plans or a road map for > that > > > > -- > Thanks and Regards > Arun >

Apache spark Web UI on Amazon EMR not working

2015-12-10 Thread sonal sharma
Hi, We are using Spark on Amazon EMR 4.1. To access the Spark web UI, we are using the link in the YARN resource manager, but we are seeing a blank page. Further, using Firefox debugging we noticed that we got an HTTP 500 error in response. We have tried configuring proxy settings for AWS and also r

GLM in apache spark in MLlib

2015-12-09 Thread Arunkumar Pillai
Hi, I've started using Apache Spark version 1.5.2. I'm able to see GLM using SparkR but it is not there in MLlib. Are there any plans or a road map for that? -- Thanks and Regards Arun

Re: Reading from RabbitMq via Apache Spark Streaming

2015-11-19 Thread Sabarish Sasidharan
current is 'false', class-id=50, method-id=10) Regards Sab On 19-Nov-2015 2:32 pm, "D" wrote: > I am trying to write a simple "Hello World" kind of application using > spark streaming and RabbitMq, in which Apache Spark Streaming will read > message from Ra

Re: Reading from RabbitMq via Apache Spark Streaming

2015-11-19 Thread Daniel Carroza
GMT+01:00 D : > I am trying to write a simple "Hello World" kind of application using > spark streaming and RabbitMq, in which Apache Spark Streaming will read > message from RabbitMq via the RabbitMqReceiver > <https://github.com/Stratio/rabbitmq-receiver> and print it i

Reading from RabbitMq via Apache Spark Streaming

2015-11-19 Thread D
I am trying to write a simple "Hello World" kind of application using Spark streaming and RabbitMQ, in which Apache Spark Streaming will read messages from RabbitMQ via the RabbitMqReceiver <https://github.com/Stratio/rabbitmq-receiver> and print them in the console. But somehow

Re: visualizations using the apache spark

2015-11-08 Thread Hitoshi Ozawa
Another choice is to use something like Apache Zeppelin (https://zeppelin.incubator.apache.org/) -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/visualizations-using-the-apache-spark-tp25246p25323.html Sent from the Apache Spark User List mailing list archiv

Broadcast Variables not showing inside Partitions Apache Spark

2015-11-08 Thread prajwol sangat
Hi All, I am facing a weird situation which is explained below. Scenario and Problem: I want to add two attributes to a JSON object based on the lookup table values and insert the JSON into MongoDB. I have a broadcast variable which holds the lookup table. However, I am not able to access it ins

Running Apache Spark 1.5.1 on console2

2015-11-04 Thread Hitoshi Ozawa
er.Main "$@"): ambiguous redirect -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Running-Apache-Spark-1-5-1-on-console2-tp25271.html Sent from the Apache Spark User List mailing list archive at Nabble.com. -

New Apache Spark Meetup NRW, Germany

2015-11-03 Thread pchundi
Hi, After attending the Spark Summit Europe 2015, I have started a Spark meetup group for the German state of Nordrhein-Westfalen. It would be great if you could add it to the list of meetups on the Apache Spark page. http://www.meetup.com/spark-users-NRW/

Re: Apache Spark on Raspberry Pi Cluster with Docker

2015-11-03 Thread Akhil Das
Thanks Best Regards On Wed, Oct 28, 2015 at 7:53 PM, Mark Bonnekessel wrote: > Hi, > > we are trying to set up Apache Spark on a Raspberry Pi cluster for > educational use. > Spark is installed in a docker container and all necessary ports are > exposed. > > After we start

Apache Spark on Raspberry Pi Cluster with Docker

2015-10-28 Thread Mark Bonnekessel
Hi, we are trying to set up Apache Spark on a Raspberry Pi cluster for educational use. Spark is installed in a docker container and all necessary ports are exposed. After we start the master and workers, all workers are listed as alive in the master web UI (http://master:8080)

Contributing Receiver based Low Level Kafka Consumer from Spark-Packages to Apache Spark Project

2015-10-24 Thread Dibyendu Bhattacharya
I am now thinking of contributing it back to the Apache Spark core project so that it can get better support, visibility, and adoption. A few points about this consumer: *Why this is needed:* This consumer is NOT a replacement for the existing DirectStream API. DirectStream solves the problem around "Ex

Mapping to multiple groups in Apache Spark

2015-10-21 Thread jeffrichley
I am in a situation where I am using Apache Spark and its map/reduce functionality. I am now at a stage where I have been able to map to a data set that conceptually has many "rows" of data. Now what I need is to do a reduce, which usually is a straightforward thing. My real need

RE: Hive with apache spark

2015-10-11 Thread Cheng, Hao
Hive Server does, and you can load the Hive table as needed. -Original Message- From: Hafiz Mujadid [mailto:hafizmujadi...@gmail.com] Sent: Monday, October 12, 2015 1:43 AM To: user@spark.apache.org Subject: Hive with apache spark Hi how can we read data from external hive server.

Hive with apache spark

2015-10-11 Thread Hafiz Mujadid
Hi, how can we read data from an external Hive server? The Hive server is running and I want to read data remotely using Spark. Is there any example? thanks -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Hive-with-apache-spark-tp25020.html Sent from the Apache

Re: Querying on multiple Hive stores using Apache Spark

2015-09-24 Thread Michael Armbrust
This is not supported yet, though we laid a lot of the groundwork for doing this in Spark 1.4. On Wed, Sep 23, 2015 at 11:17 PM, Karthik wrote: > Any ideas or suggestions? > > Thanks, > Karthik. > > > > -- > View this message in context: > http://apache-spark-u

Re: Querying on multiple Hive stores using Apache Spark

2015-09-23 Thread Karthik
Any ideas or suggestions? Thanks, Karthik. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Querying-on-multiple-Hive-stores-using-Apache-Spark-tp24765p24797.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Re: Apache Spark job in local[*] is slower than regular 1-thread Python program

2015-09-22 Thread Jonathan Coveney
> my > doing! > > Can someone point out what I am doing wrong? That would be greatly > appreciated :) I am new with all this big data stuff. > > > > *Here is the code for the Spark app:* > > > > > > *And the python code:* > > > > Thank you

Re: Apache Spark job in local[*] is slower than regular 1-thread Python program

2015-09-22 Thread Richard Eggert
would be greatly > appreciated :) I am new with all this big data stuff. > > > > *Here is the code for the Spark app:* > > > > > > *And the python code:* > > > > Thank you for reading up to this point :) > > Have a nice day! > > > - Julien > >

Apache Spark job in local[*] is slower than regular 1-thread Python program

2015-09-22 Thread juljoin
//apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-job-in-local-is-slower-than-regular-1-thread-Python-program-tp24771.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-ma

Querying on multiple Hive stores using Apache Spark

2015-09-22 Thread Karthik
s? Thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Querying-on-multiple-Hive-stores-using-Apache-Spark-tp24765.html Sent from the Apache Spark User List mailing list archive at Nabble.com. --

Re: Slow Performance with Apache Spark Gradient Boosted Tree training runs

2015-09-22 Thread Yashwanth Kumar
performance tuning: http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-1/ http://blog.cloudera.com/blog/2015/03/how-to-tune-your-apache-spark-jobs-part-2/ -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Slow-Performance-with-Apache-Spar

Slow Performance with Apache Spark Gradient Boosted Tree training runs

2015-09-21 Thread vkutsenko
Use caution when giving the driver process so much memory, in order to ensure that it isn't memory-starved for any intermediate result-aggregation operations. I'm trying to keep the number of cores per executor down to 5, as per suggestions in Cloudera's How To Tune Your Spark Jobs series <

Twitter Streming using Twitter Public Streaming API and Apache Spark

2015-09-14 Thread Sadaf
rmination() } But most of the time it doesn't fetch tweets; it shows an empty RDD as the output. Is there anything wrong? Can anyone point out the mistake? Thanks in anticipation. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Twitter-Streming-

Re: Apache Spark Suitable JDBC Driver not found

2015-08-30 Thread shawon
Could you please elaborate? The Spark classpath in the spark-env.sh file? -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Suitable-JDBC-Driver-not-found-tp24505p24511.html Sent from the Apache Spark User List mailing list archive at Nabble.com

Apache Spark Suitable JDBC Driver not found

2015-08-29 Thread shawon
I am using Apache Spark for analyzing query logs. I already faced some difficulties setting up Spark. Now I am using a standalone cluster to process queries. First I used the example code in Java to count words, which worked fine. But when I try to connect it to a MySQL

What's the logic in RangePartitioner.rangeBounds method of Apache Spark

2015-08-17 Thread ihainan
*Firstly, so sorry for my poor English.* I was reading the source code of Apache Spark 1.4.1 and I really got stuck on the logic of the RangePartitioner.rangeBounds method. The code is shown below. So can anyone please explain to me: 1. What is the "3.0 *" for in the code li
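For readers who land here: as a hedged sketch, the "3.0 *" is an oversampling factor. Partition sizes are unknown up front, so each partition is sampled at roughly three times the average rate (skewed partitions would otherwise be under-sampled), and the split points are then taken as equally spaced keys from the sorted, weighted sample. A simplified pure-Python illustration, with the weighting and re-sampling of oversized partitions omitted:

```python
import math

def sample_size_per_partition(total_sample_size, num_input_partitions):
    # The oversampling step: ceil(3.0 * sampleSize / numPartitions),
    # mirroring the expression in Spark 1.4.1's rangeBounds.
    return int(math.ceil(3.0 * total_sample_size / num_input_partitions))

def range_bounds(sample, num_partitions):
    # Pick numPartitions - 1 equally spaced keys from the sorted sample
    # as the split points between output partitions.
    s = sorted(sample)
    step = len(s) / num_partitions
    return [s[min(int(i * step), len(s) - 1)] for i in range(1, num_partitions)]

bounds = range_bounds(list(range(100)), 4)
```

With a uniform sample the bounds land on the quartiles, which is exactly what balances the output partitions.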

Re: Apache Spark - Parallel Processing of messages from Kafka - Java

2015-08-17 Thread unk1102
org/docs/latest/streaming-programming-guide.html#level-of-parallelism-in-data-receiving Hope it helps! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Parallel-Processing-of-messages-from-Kafka-Java-tp24284p24297.html Sent from the Apache Spark U

Re: Apache Spark - Parallel Processing of messages from Kafka - Java

2015-08-16 Thread Hemant Bhanawat
can > be calculated in parallel. However, when I look at the logs, I see that > calculations are happening sequentially. How can I make them run > in parallel? Any suggestion would be helpful > > > > -- > View this message in context: > http://apache-spark-user-list.10015

Apache Spark - Parallel Processing of messages from Kafka - Java

2015-08-16 Thread mohanaugust
that calculations are happening sequentially. How can I make them run in parallel? Any suggestion would be helpful -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Apache-Spark-Parallel-Processing-of-messages-from-Kafka-Java-tp24284.html Sent from the Apache

Re: Encryption on RDDs or in-memory/cache on Apache Spark

2015-08-02 Thread Jörn Franke
, Matthew O'Reilly wrote: > Hi, > > I am currently working on the latest version of Apache Spark (1.4.1), > pre-built package for Hadoop 2.6+. > > Is there any feature in Spark/Hadoop to encrypt RDDs or in-memory/cache > (something similar is Altibase's HDB: > http

Re: Encryption on RDDs or in-memory/cache on Apache Spark

2015-08-02 Thread Akhil Das
Currently RDDs are not encrypted. I think you can go ahead and open a JIRA to add this feature, and maybe in a future release it could be added. Thanks Best Regards On Fri, Jul 31, 2015 at 1:47 PM, Matthew O'Reilly wrote: > Hi, > > I am currently working on the latest version o

Encryption on RDDs or in-memory/cache on Apache Spark

2015-07-31 Thread Matthew O'Reilly
Hi, I am currently working on the latest version of Apache Spark (1.4.1), pre-built package for Hadoop 2.6+. Is there any feature in Spark/Hadoop to encrypt RDDs or the in-memory cache (something similar to Altibase's HDB:  http://altibase.com/in-memory-database-computing-solutions/sec

Re: apache-spark 1.3.0 and yarn integration and spring-boot as a container

2015-07-30 Thread Steve Loughran
you need to fix your configuration so that the resource manager hostname/URL is set...that address there is the "listen on any port" path On 30 Jul 2015, at 10:47, Nirav Patel mailto:npa...@xactlycorp.com>> wrote: 15/07/29 11:19:26 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:
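As a hedged sketch of Steve's point, the client-side yarn-site.xml needs the real ResourceManager host; the 0.0.0.0:8032 in the log is the default address that gets used when this is missing. The hostname below is hypothetical:

```xml
<!-- Sketch (yarn-site.xml on the client's classpath): point the client at the
     actual ResourceManager instead of the 0.0.0.0 "listen on any" default. -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>rm-host.example.com</value>
</property>
```

With an embedded container like Spring Boot, the gotcha is usually that the Hadoop conf directory (HADOOP_CONF_DIR/YARN_CONF_DIR) never makes it onto the application's classpath, so the defaults win.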

apache-spark 1.3.0 and yarn integration and spring-boot as a container

2015-07-30 Thread Nirav Patel
Hi, I was running a Spark application as a query service (much like spark-shell, but within my servlet container provided by Spring Boot) with Spark 1.0.2 in standalone mode. Now after upgrading to Spark 1.3.1 and trying to use YARN instead of a standalone cluster, things are going south for me. I created u

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-29 Thread Enno Shioji
the tweets posted with specific hashtag using >>>>> approach that I posted in previous email, so I guess this approach would >>>>> not work for me. The other problem is that filtering has a limit of 400 >>>>> hashtags (https://goo.gl/BywrAk), so in order

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-29 Thread Zoran Jeremic
> This brings me back to my previous question (https://goo.gl/bVDkHx). >>>> In my application I need to follow more than 400 hashtags, and I need to >>>> collect each tweet having one of these hashtags. Another complication is >>>> that users could add new hasht

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-29 Thread Peyman Mohajerian
ed more parallel streams. >>> >>> This brings me back to my previous question (https://goo.gl/bVDkHx). In >>> my application I need to follow more than 400 hashtags, and I need to >>> collect each tweet having one of these hashtags. Another complication is

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-29 Thread Zoran Jeremic
>> This brings me back to my previous question (https://goo.gl/bVDkHx). In >> my application I need to follow more than 400 hashtags, and I need to >> collect each tweet having one of these hashtags. Another complication is >> that users could add new hashtags or remove ol

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-29 Thread Peyman Mohajerian
> that users could add new hashtags or remove old hashtags, so I have to > update stream in the real-time. > My earlier approach without Apache Spark was to create twitter4j user > stream with initial filter, and each time new hashtag has to be added, stop > stream, add new hashtag and run i

Re: Encryption on RDDs or in-memory on Apache Spark

2015-07-27 Thread Akhil Das
Fri, Jul 24, 2015 at 2:12 PM, IASIB1 wrote: > I am currently working on the latest version of Apache Spark (1.4.1), > pre-built package for Hadoop 2.6+. > > Is there any feature in Spark/Hadoop to encrypt RDDs or in-memory > (similarly > to Altibase's HDB: > http://al

Re: Download Apache Spark on Windows 7 for a Proof of Concept installation

2015-07-26 Thread Peter Leventis
the first time in 15 years that I have struggled so much to download and install Open Source software from Apache. I managed to download and install Apache Drill in minutes. Apache Spark is just so awkward! Please help. Any version would do for the required proof of concept. -- View this

Re: Download Apache Spark on Windows 7 for a Proof of Concept installation

2015-07-26 Thread Jörn Franke
ver to download for a Proof of Concept installation of Apache Spark on > Windows 7. I have spent quite some time following a number of different > recipes to no avail. I have tried about 10 different permutations to date. > > I prefer the easiest approach, e.g. download Pre-build

Re: Twitter streaming with apache spark stream only a small amount of tweets

2015-07-25 Thread Zoran Jeremic
and I need to collect each tweet having one of these hashtags. Another complication is that users could add new hashtags or remove old hashtags, so I have to update the stream in real time. My earlier approach without Apache Spark was to create a twitter4j user stream with an initial filter, and eac

Twitter streaming with apache spark stream only a small amount of tweets

2015-07-25 Thread Zoran Jeremic
Hi, I've implemented Twitter streaming as in the code given at the bottom of the email. It finds some tweets based on the hashtags I'm following. However, it seems that a large number of tweets is missing. I've tried to post some tweets that I'm following in the application, and none of them was recei
