Hi,
I am new to Scala and Spark and trying to find the relevant API in
DataFrame to solve my problem, as described in the title. However, I have only
found the API DataFrame.col(colName: String): Column, which returns a Column
object, not the content. If only DataFrame supported such an API which
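For what it's worth, a minimal sketch of one way to get at the actual values
(Spark 1.x API; the file name, DataFrame, and column name are placeholders, not
from the original question):

val df = sqlContext.read.json("people.json")   // any DataFrame; sqlContext assumed to exist

// df.col("name") only builds a Column expression. To get the content,
// select the column and collect the rows back to the driver:
val names: Array[String] = df.select("name").collect().map(_.getString(0))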
Hi folks,
I wrote some Spark jobs, and these jobs ran successfully when I ran them
one by one. But when I ran them concurrently, for example 12 jobs running in
parallel, I got the following error. Could anybody tell me what causes this? How
do I solve it? Many thanks!
Exception in thread "main"
You can use DF.groupBy(upper(col("a"))).agg(sum(col("b"))).
The DataFrame API provides an "upper" function to convert a column to uppercase.
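A minimal self-contained sketch of that approach (Spark 1.x; the column names
"a" and "b" and the sample data are illustrative):

import org.apache.spark.sql.functions.{col, sum, upper}
import sqlContext.implicits._   // assumes a Spark 1.x SQLContext named sqlContext

val df = sc.parallelize(Seq(("US", 1), ("us", 2), ("FR", 3))).toDF("a", "b")

// upper() normalizes the grouping key, so "US" and "us" land in the same group:
df.groupBy(upper(col("a"))).agg(sum(col("b"))).show()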
2015-12-24 20:47 GMT+08:00 Eran Witkon :
> Use DF.withColumn("upper-code", upper(df("countrycode")))
> or just run a map function that does the same
>
> On Thu, Dec 24, 20
Hello,
I have a batch driver and a streaming driver using the same functions (Scala). I
use accumulators (passed to the functions' constructors) to count stuff.
In the batch driver, doing so at the right point in the pipeline, I'm able
to retrieve the accumulator value and print it as a log4j log.
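For reference, a minimal sketch of that batch pattern, with made-up names
(sc is an existing SparkContext; the input file is a placeholder):

import org.apache.spark.Accumulator

// A helper that receives the accumulator through its constructor:
class RecordCounter(counter: Accumulator[Long]) extends Serializable {
  def process(line: String): String = {
    counter += 1L   // runs on the executors
    line
  }
}

val counter = sc.accumulator(0L, "records")
val worker = new RecordCounter(counter)
sc.textFile("input.txt").map(worker.process).count()   // an action forces evaluation

// In the batch driver the value is now stable and can be logged via log4j:
println(s"processed ${counter.value} records")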
In the streaming
No luck.
But two updates:
1. I have downloaded spark-1.4.1 and everything works fine; I don't see any
error.
2. I have added the following XML file to Spark 1.5.2's conf directory and
now I get the following error:
Caused by: java.lang.RuntimeException: The root scratch dir:
c:/Users/marco/tmp on HDF
Problem must be with how I am converting a JavaRDD<Tuple2<Long, LabeledPoint>>
to a DataFrame.
Any suggestions? Most of my work has been done using PySpark. Tuples are a
lot harder to work with in Java.
JavaRDD<Tuple2<Long, LabeledPoint>> predictions =
    idLabeledPoingRDD.map((Tuple2<Long, LabeledPoint> t2) -> {
        Long id = t2._1();
        LabeledPoint
Hi,
Any idea how I can debug this problem? I suspect the problem has to do with
how I am converting a JavaRDD<Tuple2<Long, LabeledPoint>> to a DataFrame.
Is it a boxing problem? I tried to use long and double instead of Long and
Double whenever possible.
Thanks in advance, Happy Holidays.
Andy
allData.printSchema()
root
We are using the older receiver-based approach; the number of partitions is 1
(we have a single-node Kafka) and we use a single thread per topic, yet we still
have the problem. Please see the API we use. All 8 Spark jobs use the same group
name – is that a problem?
val topicMap = topics.split(",").map((_, 1)).toMap
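For reference, a sketch of that receiver-based call (Spark 1.x; the ZooKeeper
address and group id are placeholders, and ssc is an existing StreamingContext).
Regarding the group name question: Kafka's high-level consumer divides each
topic's partitions among all consumers sharing a group id, so with a
single-partition topic and 8 jobs in one group, only one job would receive the
data; giving each job its own group id is worth trying.

import org.apache.spark.streaming.kafka.KafkaUtils

// topicMap as built above; "zkhost:2181" and the group id are placeholders:
val stream = KafkaUtils.createStream(ssc, "zkhost:2181", "shared-group", topicMap)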
Hi,
To add to it, you can read about the native libs in
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html.
Regards,
Jacek
Jacek Laskowski | https://medium.com/@jaceklaskowski/
Mastering Apache Spark
==> https://jaceklaskowski.gitbooks.io/mastering-a
You can safely ignore it. Native libs aren't set with HADOOP_HOME. See the
Hadoop docs on how to configure this if you're curious, but you really
don't need to.
On Thu, Dec 24, 2015 at 12:19 PM, Bilinmek Istemiyor
wrote:
> Hello,
>
> I have apache spark 1.5.1 installed with the help of this user gro
Could anyone help?
On Wed, Dec 23, 2015 at 1:40 PM, Li Li wrote:
> I ran my LDA example on a YARN 2.6.2 cluster with Spark 1.5.2.
> It throws an exception at the line: Matrix topics = ldaModel.topicsMatrix();
> But in the YARN job history UI it shows as successful. What's wrong with it?
> I submit job with
> .b
Use DF.withColumn("upper-code", upper(df("countrycode")))
or just run a map function that does the same
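A hedged sketch putting that together (Spark 1.x; df and the column names come
from the thread, the rest is illustrative). Column has no toUpper method, which
is why upper() from org.apache.spark.sql.functions is used above:

import org.apache.spark.sql.functions.upper

val withUpper = df.withColumn("upper-code", upper(df("countrycode")))
withUpper.groupBy(withUpper("upper-code")).count().show()   // one row per country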
On Thu, Dec 24, 2015 at 2:05 PM Bharathi Raja
wrote:
> Hi,
> Values in a dataframe column named countrycode are in different cases. Eg:
> (US, us). groupBy & count gives two rows but the requ
Hello,
I have Apache Spark 1.5.1 installed with the help of this user group. I
receive the following error when I start the PySpark shell:
WARN NativeCodeLoader: Unable to load native-hadoop library for your
platform... using builtin-java classes where applicable
Later I downloaded the native binary from had
Thanks Eran, I'll check the solution.
Regards,
Raja
-Original Message-
From: "Eran Witkon"
Sent: 12/24/2015 4:07 PM
To: "Bharathi Raja" ; "Gokula Krishnan D"
Cc: "user@spark.apache.org"
Subject: Re: How to Parse & flatten JSON object in a text file using
Spark&Scala into Dataframe
Hi,
Values in a dataframe column named countrycode are in different cases, e.g.
(US, us). groupBy & count gives two rows, but the requirement is to ignore case
for this operation.
1) Is there a way to ignore case in groupBy? Or
2) Is there a way to update the dataframe column countrycode to uppercase?
Are you using a direct stream consumer, or the older receiver-based consumer?
If the latter, do the number of partitions you've specified for your topic
match the number of partitions in the topic on Kafka?
That would be a possible cause – as you might receive all data from a given
partition
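For comparison, a minimal sketch of the direct consumer (Spark 1.3+), whose RDD
partitions map 1:1 to the Kafka topic's partitions; the broker address and
topic are placeholders, and ssc is an existing StreamingContext:

import kafka.serializer.StringDecoder
import org.apache.spark.streaming.kafka.KafkaUtils

val kafkaParams = Map("metadata.broker.list" -> "broker1:9092")
val stream = KafkaUtils.createDirectStream[String, String, StringDecoder, StringDecoder](
  ssc, kafkaParams, Set("mytopic"))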
You can send out a pull request for the JIRA you're interested in.
Start the title of the pull request with:
[SPARK-XYZ] ...
where XYZ is the JIRA number.
The pull request will be posted on the JIRA.
After the pull request is reviewed, tested by QA, and merged, the committer
will assign your name to the
Answered on StackOverflow. If you are looking for the solution, this is the trick:
val jsonNested = sqlContext.read.json(jsonUnGzip.map {
  case Row(cty: String, json: String, nm: String, yrs: String) =>
    s"""{"cty": "$cty", "extractedJson": $json, "nm": "$nm", "yrs": "$yrs"}"""
})
See this link
Hi,
From the "how to contribute" page of the Spark JIRA project I came to know that
I can start by picking up bugs with the "starter" label.
But who will assign these bugs to me? Or should I just fix them and create a
pull request?
I will be glad to help the project.
Mind providing a bit more detail?
- release of Spark
- version of the Cassandra connector
- how the job was submitted
- the complete stack trace
Thanks
On Thu, Dec 24, 2015 at 2:06 AM, Vijay Kandiboyina wrote:
> java.lang.NoClassDefFoundError:
> com/datastax/spark/connector/rdd/CassandraTableScanRDD
>
>
Raja! I found the answer to your question!
Look at
http://stackoverflow.com/questions/34069282/how-to-query-json-data-column-using-spark-dataframes
This is what you (and I) were looking for.
The general idea: you read the list as text, where project Details is just a
string field, and then you build the
Hi All,
We are using Bitnami Kafka 0.8.2 + Spark 1.5.2 on Google Cloud Platform. Our
Spark streaming job (consumer) is not receiving all the messages sent to the
specific topic. It receives 1 out of ~50 messages (we added a log in the job
stream and identified this). We are not seeing any errors in the kaf
java.lang.NoClassDefFoundError:
com/datastax/spark/connector/rdd/CassandraTableScanRDD
Hi,
I have a JSON file with the following row format:
{"cty":"United
Kingdom","gzip":"H4sIAKtWystVslJQcs4rLVHSUUouqQTxQvMyS1JTFLwz89JT8nOB4hnFqSBxj/zS4lSF/DQFl9S83MSibKBMZVExSMbQwNBM19DA2FSpFgDvJUGVUw==","nm":"Edmund
Ironside","yrs":"1016"}
The gzip field is a compressed JSON by itself
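A hedged sketch of one way to inflate that field: the value looks like
Base64-encoded gzip, so on Java 8 something along these lines should recover
the inner JSON string (the function name is made up):

import java.io.ByteArrayInputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import scala.io.Source

// Base64-decode the field, then gunzip the bytes back into a JSON string:
def unGzip(b64: String): String = {
  val in = new GZIPInputStream(new ByteArrayInputStream(Base64.getDecoder.decode(b64)))
  try Source.fromInputStream(in, "UTF-8").mkString finally in.close()
}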
You forgot a return statement in the 'else' clause, which is what the
compiler is telling you. There's nothing more to it here. Your
function is much simpler, however, as:
Function<String, Boolean> checkHeaders2 =
    x -> x.startsWith("npi") || x.startsWith("CPT");
On Thu, Dec 24, 2015 at 1:13 AM, rdpratti wrote:
> I
Would you mind posting the relevant code snippet?
Thanks
Best Regards
On Wed, Dec 23, 2015 at 7:33 PM, Vyacheslav Yanuk
wrote:
> Hi.
> I have a very strange situation with direct reading from Kafka.
> For example:
> I have 1000 messages in Kafka.
> After submitting my application, I read this data